summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-12-12xfrm: put policies when reusing pcpu xdst entryFlorian Westphal
We need to put the policies when re-using the pcpu xdst entry, else this leaks the reference. Fixes: ec30d78c14a813db39a647b6a348b428 ("xfrm: add xdst pcpu cache") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-12-11scsi: core: Fix a scsi_show_rq() NULL pointer dereferenceBart Van Assche
Avoid that scsi_show_rq() triggers a NULL pointer dereference if called after sd_uninit_command(). Swap the NULL pointer assignment and the mempool_free() call in sd_uninit_command() to make it less likely that scsi_show_rq() triggers a use-after-free. Note: even with these changes scsi_show_rq() can trigger a use-after-free but that's a lesser evil than e.g. suppressing debug information for T10 PI Type 2 commands completely. This patch fixes the following oops: BUG: unable to handle kernel NULL pointer dereference at (null) IP: scsi_format_opcode_name+0x1a/0x1c0 CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0-rc2.blk_mq_io_hang+ #516 Call Trace: __scsi_format_command+0x27/0xc0 scsi_show_rq+0x5c/0xc0 __blk_mq_debugfs_rq_show+0x116/0x130 blk_mq_debugfs_rq_show+0xe/0x10 seq_read+0xfe/0x3b0 full_proxy_read+0x54/0x90 __vfs_read+0x37/0x160 vfs_read+0x96/0x130 SyS_read+0x55/0xc0 entry_SYSCALL_64_fastpath+0x1a/0xa5 [mkp: added Type 2] Fixes: 0eebd005dd07 ("scsi: Implement blk_mq_ops.show_rq()") Reported-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-11scsi: MAINTAINERS: change FCoE list to linux-scsiJohannes Thumshirn
fcoe-devel@open-fcoe.org is defunct and all patches are routed via the SCSI tree anyways. So update MAINTAINERS accordingly. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-11scsi: libsas: fix length error in sas_smp_handler()Jason Yan
The return value of smp_execute_task_sg() is the untransferred residual, but bsg_job_done() requires the length of payload received. This makes SMP passthrough commands from userland by sg ioctl to libsas get a wrong response. The userland tools such as smp_utils failed because of these wrong responses: ~#smp_discover /dev/bsg/expander-2\:13 response too short, len=0 ~#smp_discover /dev/bsg/expander-2\:134 response too short, len=0 Fix this by passing the actual received length to bsg_job_done(). And if smp_execute_task_sg() returns 0, this means received length is exactly the buffer length. [mkp: typo] Fixes: 651a01364994 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough") Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Jason Yan <yanaijie@huawei.com> Reported-by: chenqilin <chenqilin2@huawei.com> Tested-by: chenqilin <chenqilin2@huawei.com> CC: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-12selftests: bpf: Adding config fragment CONFIG_CGROUP_BPF=yNaresh Kamboju
CONFIG_CGROUP_BPF=y is required for test_dev_cgroup test case. Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-11platform/x86: dell-wmi: check for kmalloc() errorsDan Carpenter
This allocation won't fail in the current kernel because it's small but not checking for kmalloc() failures introduces static checker warnings so let's fix it. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-12-11platform/x86: asus-wireless: send an EV_SYN/SYN_REPORT between state changesPeter Hutterer
Sending the switch state change twice within the same frame is invalid evdev protocol and only works if the client handles keys immediately as well. Processing events immediately is incorrect, it forces a fake order of events that does not exist on the device. Recent versions of libinput changed to only process the device state and SYN_REPORT time, so now the key event is lost. https://bugs.freedesktop.org/show_bug.cgi?id=104041 Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-12-11platform/x86: dell-laptop: Fix keyboard max lighting for Dell Latitude E6410Pali Rohár
This machine reports number of keyboard backlight led levels, instead of value of the last led level index. Therefore max_brightness properly needs to be subtracted by 1 to match led max_brightness API. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Reported-by: Gabriel M. Elder <gabriel@tekgnowsys.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=196913 Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-12-11Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu fix from Tejun Heo: "Just one patch to work around CRIS boot problem caused by a recent change which freed a temporary boot data structure. The root cause is on CRIS side but it doesn't seem trivial to fix. For now, work around by skipping freeing on CRIS" * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: hack to let the CRIS architecture to boot until they clean up
2017-12-11Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Prateek posted a couple patches to fix a deadlock involving cpuset and workqueue. It unfortunately caused a different deadlock and the recent workqueue hotplug simplification removed the original deadlock, so Prateek's two patches are reverted for now. - The new stat code was missing u64_stats initialization. Fixed. - Doc and other misc changes * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: add warning about RT not being supported on cgroup2 Revert "cgroup/cpuset: remove circular dependency deadlock" Revert "cpuset: Make cpuset hotplug synchronous" cgroup: properly init u64_stats debug cgroup: use task_css_set instead of rcu_dereference cpuset: Make cpuset hotplug synchronous cgroup/cpuset: remove circular dependency deadlock
2017-12-11Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: - Lai's hotplug simplifications inadvertently fix a possible deadlock involving cpuset and workqueue - CPU isolation fix which was reverted due to the changes in the housekeeping code resurrected - A trivial unused include removal * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: remove unneeded kallsyms include workqueue/hotplug: remove the workaround in rebind_workers() workqueue/hotplug: simplify workqueue_offline_cpu() workqueue: respect isolated cpus when queueing an unbound work main: kernel_start: move housekeeping_init() before workqueue_init_early()
2017-12-11Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Nothing too interesting. David Milburn improved a corner case misbehavior during hotplug. Other than that, minor driver-specific fixes" * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: sata_down_spd_limit should return if driver has not recorded sstatus speed ahci: mtk: Change driver name to ahci-mtk ahci: qoriq: refine port register configuration pata_pdc2027x : make pdc2027x_*_timing structures const pata_pdc2027x: Remove unnecessary error check ata: mediatek: Fix typo in module description
2017-12-11Merge tag 'for-linus-4.15-2' of git://github.com/cminyard/linux-ipmiLinus Torvalds
Pull IPMI fixes from Corey Minyard. * tag 'for-linus-4.15-2' of git://github.com/cminyard/linux-ipmi: ipmi_si: fix crash on parisc ipmi_si: Fix oops with PCI devices ipmi: Stop timers before cleaning up the module
2017-12-11Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This push fixes the following issues: - buffer overread in RSA - potential use after free in algif_aead. - error path null pointer dereference in af_alg - forbid combinations such as hmac(hmac(sha3)) which may crash - crash in salsa20 due to incorrect API usage" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: salsa20 - fix blkcipher_walk API usage crypto: hmac - require that the underlying hash algorithm is unkeyed crypto: af_alg - fix NULL pointer dereference in crypto: algif_aead - fix reference counting of null skcipher crypto: rsa - fix buffer overread when stripping leading zeroes
2017-12-11iw_cxgb4: only insert drain cqes if wq is flushedSteve Wise
Only insert our special drain CQEs to support ib_drain_sq/rq() after the wq is flushed. Otherwise, existing but not yet polled CQEs can be returned out of order to the user application. This can happen when the QP has exited RTS but not yet flushed the QP, which can happen during a normal close (vs abortive close). In addition never count the drain CQEs when determining how many CQEs need to be synthesized during the flush operation. This latter issue should never happen if the QP is properly flushed before inserting the drain CQE, but I wanted to avoid corrupting the CQ state. So we handle it and log a warning once. Fixes: 4fe7c2962e11 ("iw_cxgb4: refactor sq/rq drain logic") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11ext4: fix crash when a directory's i_size is too smallChandan Rajendra
On a ppc64 machine, when mounting a fuzzed ext2 image (generated by fsfuzzer) the following call trace is seen, VFS: brelse: Trying to free free buffer WARNING: CPU: 1 PID: 6913 at /root/repos/linux/fs/buffer.c:1165 .__brelse.part.6+0x24/0x40 .__brelse.part.6+0x20/0x40 (unreliable) .ext4_find_entry+0x384/0x4f0 .ext4_lookup+0x84/0x250 .lookup_slow+0xdc/0x230 .walk_component+0x268/0x400 .path_lookupat+0xec/0x2d0 .filename_lookup+0x9c/0x1d0 .vfs_statx+0x98/0x140 .SyS_newfstatat+0x48/0x80 system_call+0x58/0x6c This happens because the directory that ext4_find_entry() looks up has inode->i_size that is less than the block size of the filesystem. This causes 'nblocks' to have a value of zero. ext4_bread_batch() ends up not reading any of the directory file's blocks. This renders the entries in bh_use[] array to continue to have garbage data. buffer_uptodate() on bh_use[0] can then return a zero value upon which brelse() function is invoked. This commit fixes the bug by returning -ENOENT when the directory file has no associated blocks. Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Cc: stable@vger.kernel.org
2017-12-11fou: fix some member types in guehdrXin Long
guehdr struct is used to build or parse gue packets, which are always in big endian. It's better to define all guehdr members as __beXX types. Also, in validate_gue_flags it's not good to use a __be32 variable for both Standard flags(__be16) and Private flags (__be32), and pass it to other funcions. This patch could fix a bunch of sparse warnings from fou. Fixes: 5024c33ac354 ("gue: Add infrastructure for flags and options") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: make sure stream nums can match optlen in sctp_setsockopt_reset_streamsXin Long
Now in sctp_setsockopt_reset_streams, it only does the check optlen < sizeof(*params) for optlen. But it's not enough, as params->srs_number_streams should also match optlen. If the streams in params->srs_stream_list are less than stream nums in params->srs_number_streams, later when dereferencing the stream list, it could cause a slab-out-of-bounds crash, as reported by syzbot. This patch is to fix it by also checking the stream numbers in sctp_setsockopt_reset_streams to make sure at least it's not greater than the streams in the list. Fixes: 7f9d68ac944e ("sctp: implement sender-side procedures for SSN Reset Request Parameter") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11net: ipv4: fix for a race condition in raw_sendmsgMohamed Ghannam
inet->hdrincl is racy, and could lead to uninitialized stack pointer usage, so its value should be read only once. Fixes: c008ba5bdc9f ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt") Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11forcedeth: remove unnecessary structure memberZhu Yanjun
Since both tx_ring and first_tx are the head of tx ring, it not necessary to use two structure members to statically indicate the head of tx ring. So first_tx is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11x86/unwinder/guess: Prevent using CONFIG_UNWINDER_GUESS=y with ↵Andrey Ryabinin
CONFIG_STACKDEPOT=y Stackdepot doesn't work well with CONFIG_UNWINDER_GUESS=y. The 'guess' unwinder generate awfully large and inaccurate stacktraces, thus stackdepot can't deduplicate stacktraces because they all look like unique. Eventually stackdepot reaches its capacity limit: WARNING: CPU: 0 PID: 545 at lib/stackdepot.c:119 depot_save_stack+0x28e/0x550 Call Trace: ? kasan_kmalloc+0x144/0x160 ? depot_save_stack+0x1f5/0x550 ? do_raw_spin_unlock+0xda/0xf0 ? preempt_count_sub+0x13/0xc0 <...90 lines...> ? do_raw_spin_unlock+0xda/0xf0 Add a STACKDEPOT=n dependency to UNWINDER_GUESS to avoid the problem. Reported-by: kernel test robot <xiaolong.ye@intel.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171130123554.4330-1-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11x86/build: Don't verify mtools configuration file for isoimageChangbin Du
If mtools.conf is not generated before, 'make isoimage' could complain: Kernel: arch/x86/boot/bzImage is ready (#597) GENIMAGE arch/x86/boot/image.iso *** Missing file: arch/x86/boot/mtools.conf arch/x86/boot/Makefile:144: recipe for target 'isoimage' failed mtools.conf is not used for isoimage generation, so do not check it. Signed-off-by: Changbin Du <changbin.du@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 4366d57af1 ("x86/build: Factor out fdimage/isoimage generation commands to standalone script") Link: http://lkml.kernel.org/r/1512053480-8083-1-git-send-email-changbin.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11Merge branch 'nfp-dead-code-clean-ups-and-slight-improvements'David S. Miller
Jakub Kicinski says: ==================== nfp: dead code, clean ups and slight improvements This series contains small clean ups from John and Carl, and brings no functional changes. John's improvements target the flower code. First he makes sure we don't allocate space in FW request messages for MAC matches if the TC rule does not contain any. The remaining two patches remove some dead code and unused defines. Carl follows up with a slight optimization to his recent ethtool FW state dumps, byte swapping input parameters once instead of the data for every dumped item. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11nfp: debug dump - decrease endian conversionsCarl Heymann
Convert the requested dump level parameter to big-endian at the start of nfp_net_dump_calculate_size() and nfp_net_dump_populate_buffer(), then compare and assign it directly where needed in the traversal and prolog code. This decreases the total number of conversions used. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11nfp: flower: remove unused definesJohn Hurley
Delete match field defines that are not supported at this time. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11nfp: flower: remove dead code pathsJohn Hurley
Port matching is selected by default on every rule so remove check for it and delete 'else' side of the statement. Remove nfp_flower_meta_one as now it will not feature in the code. Rename nfp_flower_meta_two given that one has been removed. 'Additional metadata' if statement can never be true so remove it as well. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11nfp: flower: do not assume mac/mpls matchesJohn Hurley
Remove the matching of mac/mpls as a default selection. These are not necessarily set by a TC rule (unlike the port). Previously a mac/mpls field would exist in every match and be masked out if not used. This patch has no impact on functionality but removes unnessary memory assignment in the match cmsg. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11netlink: Add netns check on tapsKevin Cernekee
Currently, a nlmon link inside a child namespace can observe systemwide netlink activity. Filter the traffic so that nlmon can only sniff netlink messages from its own netns. Test case: vpnns -- bash -c "ip link add nlmon0 type nlmon; \ ip link set nlmon0 up; \ tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" & sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \ spi 0x1 mode transport \ auth sha1 0x6162633132330000000000000000000000000000 \ enc aes 0x00000000000000000000000000000000 grep --binary abc123 /tmp/nlmon.pcap Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11net: sh_eth: do not advertise Gigabit capabilities when not availableThomas Petazzoni
Not all variants of the sh_eth hardware have Gigabit support. Unfortunately, the current driver doesn't tell the PHY about the limited MAC capabilities. Due to this, if you have a Gigabit capable PHY, the PHY will advertise its Gigabit capability and establish a link at 1Gbit/s, even though the MAC doesn't support it. In order to avoid this, we use the recently introduced phy_set_max_speed() to tell the PHY to not advertise speed higher than 100 MBit/s. Tested on a SH7786 platform, with a Gigabit PHY. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11dt-bindings: fec: Make the phy-reset-gpio polarity explicitFabio Estevam
The GPIO polarity passed to phy-reset-gpio is ignored by the FEC driver and it is assumed to be active low. It can be active high only when the 'phy-reset-active-high' property is present. The current examples pass active high polarity and work fine, but in order to improve the documentation make it explicit what the real polarity is. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11Merge branch 'sctp-stream-interleave-part-1'David S. Miller
Xin Long says: ==================== sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving Stream Interleave would be Implemented in two Parts: 1. The I-DATA Chunk Supporting User Message Interleaving 2. Interaction with Other SCTP Extensions Overview in section 1.1 of RFC8260 for Part 1: This document describes a new chunk carrying payload data called I-DATA. This chunk incorporates the properties of the current SCTP DATA chunk, all the flags and fields except the Stream Sequence Number (SSN), and also adds two new fields in its chunk header -- the Fragment Sequence Number (FSN) and the Message Identifier (MID). The FSN is only used for reassembling all fragments that have the same MID and the same ordering property. The TSN is only used for the reliable transfer in combination with Selective Acknowledgment (SACK) chunks. In addition, the MID is also used for ensuring ordered delivery instead of using the stream sequence number (the I-DATA chunk omits an SSN). As the 1st part of Stream Interleave Implementation, this patchset adds an ops framework named sctp_stream_interleave with a bunch of stuff that does lots of things needed somewhere. Then it defines sctp_stream_interleave_0 to work for normal DATA chunks and sctp_stream_interleave_1 for I-DATA chunks. With these functions, hundreds of if-else checks for the different process on I-DATA chunks would be avoided. Besides, very few codes could be shared in these two function sets. In this patchset, it adds some basic variables, structures and socket options firstly, then implement these functions one by one to add the procedures for ordered idata gradually, at last adjusts some codes to make them work for unordered idata. To make it safe to be implemented and also not break the normal data chunk process, this feature can't be enabled to use until all stream interleave codes are completely accomplished. v1 -> v2: - fixed a checkpatch warning that a blank line was missed. - avoided a kbuild warning reported from gcc-4.9. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: add support for the process of unordered idataXin Long
Unordered idata process is more complicated than unordered data: - It has to add mid into sctp_stream_out to save the next mid value, which is separated from ordered idata's. - To support pd for unordered idata, another mid and pd_mode need to be added to save the message id and pd state in sctp_stream_in. - To make unordered idata reasm easier, it adds a new event queue to save frags for idata. The patch mostly adds the samilar reasm functions for unordered idata as ordered idata's, and also adjusts some other codes on assign_mid, abort_pd and ulpevent_data for idata. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: implement abort_pd for sctp_stream_interleaveXin Long
abort_pd is added as a member of sctp_stream_interleave, used to abort partial delivery for data or idata, called in sctp_cmd_assoc_failed. Since stream interleave allows to do partial delivery for each stream at the same time, sctp_intl_abort_pd for idata would be very different from the old function sctp_ulpq_abort_pd for data. Note that sctp_ulpevent_make_pdapi will support per stream in this patch by adding pdapi_stream and pdapi_seq in sctp_pdapi_event, as described in section 6.1.7 of RFC6458. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: implement start_pd for sctp_stream_interleaveXin Long
start_pd is added as a member of sctp_stream_interleave, used to do partial_delivery for data or idata when datalen >= asoc->rwnd in sctp_eat_data. The codes have been done in last patches, but they need to be extracted into start_pd, so that it could be used for SCTP_CMD_PART_DELIVER cmd as well. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: implement renege_events for sctp_stream_interleaveXin Long
renege_events is added as a member of sctp_stream_interleave, used to renege some old data or idata in reasm or lobby queue properly to free some memory for the new data when there's memory stress. It defines sctp_renege_events for idata, and leaves sctp_ulpq_renege as it is for data. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: implement enqueue_event for sctp_stream_interleaveXin Long
enqueue_event is added as a member of sctp_stream_interleave, used to enqueue either data, idata or notification events into user socket rx queue. It replaces sctp_ulpq_tail_event used in the other places with enqueue_event. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: implement ulpevent_data for sctp_stream_interleaveXin Long
ulpevent_data is added as a member of sctp_stream_interleave, used to do the most process in ulpq, including to convert data or idata chunk to event, reasm them in reasm queue and put them in lobby queue in right order, and deliver them up to user sk rx queue. This procedure is described in section 2.2.3 of RFC8260. It adds most functions for idata here to do the similar process as the old functions for data. But since the details are very different between them, the old functions can not be reused for idata. event->ssn and event->ppid settings are moved to ulpevent_data from sctp_ulpevent_make_rcvmsg, so that sctp_ulpevent_make_rcvmsg could work for both data and idata. Note that mid is added in sctp_ulpevent for idata, __packed has to be used for defining sctp_ulpevent, or it would exceeds the skb cb that saves a sctp_ulpevent variable for ulp layer process. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: implement validate_data for sctp_stream_interleaveXin Long
validate_data is added as a member of sctp_stream_interleave, used to validate ssn/chunk type for data or mid (message id)/chunk type for idata, called in sctp_eat_data. If this check fails, an abort packet will be sent, as said in section 2.2.3 of RFC8260. It also adds the process for idata in rx path. As Marcelo pointed out, there's no need to add event table for idata, but just share chunk_event_table with data's. It would drop data chunk for idata and drop idata chunk for data by calling validate_data in sctp_eat_data. As last patch did, it also replaces sizeof(struct sctp_data_chunk) with sctp_datachk_len for rx path. After this patch, the idata can be accepted and delivered to ulp layer. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: implement assign_number for sctp_stream_interleaveXin Long
assign_number is added as a member of sctp_stream_interleave, used to assign ssn for data or mid (message id) for idata, called in sctp_packet_append_data. sctp_chunk_assign_ssn is left as it is, and sctp_chunk_assign_mid is added for sctp_stream_interleave_1. This procedure is described in section 2.2.2 of RFC8260. All sizeof(struct sctp_data_chunk) in tx path is replaced with sctp_datachk_len, to make it right for idata as well. And also adjust sctp_chunk_is_data for SCTP_CID_I_DATA. After this patch, idata can be built and sent in tx path. Note that if sp strm_interleave is set, it has to wait_connect in sctp_sendmsg, as asoc intl_enable need to be known after 4 shake- hands, to decide if it should use data or idata later. data and idata can't be mixed to send in one asoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: implement make_datafrag for sctp_stream_interleaveXin Long
To avoid hundreds of checks for the different process on I-DATA chunk, struct sctp_stream_interleave is defined as a group of functions used to replace the codes in some place where it needs to do different job according to if the asoc intl_enabled is set. With these ops, it only needs to initialize asoc->stream.si with sctp_stream_interleave_0 for normal data if asoc intl_enable is 0, or sctp_stream_interleave_1 for idata if asoc intl_enable is set in sctp_stream_init. After that, the members in asoc->stream.si can be used directly in some special places without checking asoc intl_enable. make_datafrag is the first member for sctp_stream_interleave, it's used to make data or idata frags, called in sctp_datamsg_from_user. The old function sctp_make_datafrag_empty needs to be adjust some to fit in this ops. Note that as idata and data chunks have different length, it also defines data_chunk_len for sctp_stream_interleave to describe the chunk size. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: add basic structures and make chunk function for idataXin Long
sctp_idatahdr and sctp_idata_chunk are used to define and parse I-DATA chunk format, and sctp_make_idata is a function to build the chunk. The I-DATA Chunk Format is defined in section 2.1 of RFC8260. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: add asoc intl_enable negotiation during 4 shakehandsXin Long
asoc intl_enable will be set when local sp strm_interleave is set and there's I-DATA chunk in init and init_ack extensions, as said in section 2.2.1 of RFC8260. asoc intl_enable indicates all data will be sent as I-DATA chunks. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11sctp: add stream interleave enable members and sockoptXin Long
This patch adds intl_enable in asoc and netns, and strm_interleave in sctp_sock to indicate if stream interleave is enabled and supported. netns intl_enable would be set via procfs, but that is not added yet until all stream interleave codes are completely implemented; asoc intl_enable will be set when doing 4-shakehands. sp strm_interleave can be set by sockopt SCTP_INTERLEAVING_SUPPORTED which is also added in this patch. This socket option is defined in section 4.3.1 of RFC8260. Note that strm_interleave can only be set by sockopt when both netns intl_enable and sp frag_interleave are set. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11net: phy: meson-gxl: detect LPA corruptionJerome Brunet
The purpose of this change is to fix the incorrect detection of the link partner (LP) advertised capabilities which sometimes happens with this PHY (roughly 1 time in a dozen) This issue may cause the link to be negotiated at 10Mbps/Full or 10Mbps/Half when 100MBps/Full is actually possible. In some case, the link is even completely broken and no communication is possible. To detect the corruption, we must look for a magic undocumented bit in the WOL bank (hint given by the SoC vendor kernel) but this is not enough to cover all cases. We also have to look at the LPA ack. If the LP supports Aneg but did not ack our base code when aneg is completed, we assume something went wrong. The detection of a corrupted LPA triggers a restart of the aneg process. This solves the problem but may take up to 6 retries to complete. Fixes: 7334b3e47aee ("net: phy: Add Meson GXL Internal PHY driver") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11ipvlan: add L2 check for packets arriving via virtual devicesMahesh Bandewar
Packets that don't have dest mac as the mac of the master device should not be entertained by the IPvlan rx-handler. This is mostly true as the packet path mostly takes care of that, except when the master device is a virtual device. As demonstrated in the following case - ip netns add ns1 ip link add ve1 type veth peer name ve2 ip link add link ve2 name iv1 type ipvlan mode l2 ip link set dev iv1 netns ns1 ip link set ve1 up ip link set ve2 up ip -n ns1 link set iv1 up ip addr add 192.168.10.1/24 dev ve1 ip -n ns1 addr 192.168.10.2/24 dev iv1 ping -c2 192.168.10.2 <Works!> ip neigh show dev ve1 ip neigh show 192.168.10.2 lladdr <random> dev ve1 ping -c2 192.168.10.2 <Still works! Wrong!!> This patch adds that missing check in the IPvlan rx-handler. Reported-by: Amit Sikka <amit.sikka@ericsson.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11arm64: mm: Fix pte_mkclean, pte_mkdirty semanticsSteve Capper
On systems with hardware dirty bit management, the ltp madvise09 unit test fails due to dirty bit information being lost and pages being incorrectly freed. This was bisected to: arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect() Reverting this commit leads to a separate problem, that the unit test retains pages that should have been dropped due to the function madvise_free_pte_range(.) not cleaning pte's properly. Currently pte_mkclean only clears the software dirty bit, thus the following code sequence can appear: pte = pte_mkclean(pte); if (pte_dirty(pte)) // this condition can return true with HW DBM! This patch also adjusts pte_mkclean to set PTE_RDONLY thus effectively clearing both the SW and HW dirty information. In order for this to function on systems without HW DBM, we need to also adjust pte_mkdirty to remove the read only bit from writable pte's to avoid infinite fault loops. Cc: <stable@vger.kernel.org> Fixes: 64c26841b349 ("arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()") Reported-by: Bhupinder Thakur <bhupinder.thakur@linaro.org> Tested-by: Bhupinder Thakur <bhupinder.thakur@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11arm64: Initialise high_memory global variable earlierSteve Capper
The high_memory global variable is used by cma_declare_contiguous(.) before it is defined. We don't notice this as we compute __pa(high_memory - 1), and it looks like we're processing a VA from the direct linear map. This problem becomes apparent when we flip the kernel virtual address space and the linear map is moved to the bottom of the kernel VA space. This patch moves the initialisation of high_memory before it used. Cc: <stable@vger.kernel.org> Fixes: f7426b983a6a ("mm: cma: adjust address limit to avoid hitting low/high memory boundary") Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11netfilter: ip6t_MASQUERADE: add dependency on conntrack moduleKonstantin Khlebnikov
After commit 4d3a57f23dec ("netfilter: conntrack: do not enable connection tracking unless needed") conntrack is disabled by default unless some module explicitly declares dependency in particular network namespace. Fixes: a357b3f80bc8 ("netfilter: nat: add dependencies on conntrack module") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-12-11netlink: convert netlink tap spinlock to mutexCong Wang
Both netlink_add_tap() and netlink_remove_tap() are called in process context, no need to bother spinlock. Note, in fact, currently we always hold RTNL when calling these two functions, so we don't need any other lock at all, but keeping this lock doesn't harm anything. Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-11netlink: make netlink tap per netnsCong Wang
nlmon device is not supposed to capture netlink events from other netns, so instead of filtering events, we can simply make netlink tap itself per netns. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>