summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-08-26sfc: fix potential stack corruption from running past stat bitmaskAndrew Rybchenko
On 32-bit systems, mask is only an array of 3 longs, not 4, so don't try to write to mask[3]. Also include build-time checks in case the size of the bitmask changes. Fixes: 3c36a2aded8c ("sfc: display vadaptor statistics for all interfaces") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26Merge branch 'tipc-udp-replicast'David S. Miller
Richard Alpe says: ==================== tipc: introduce UDP replicast This series introduces UDP replicast. A concept where we emulate multicast by sending multiple unicast messages to configured peers. This allows TIPC to be used in environments where IP multicast is disabled. There is a corresponding patch series for the tipc user space tool that allows a user to add remote addresses to the replicast list. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26tipc: add UDP remoteip dump to netlink APIRichard Alpe
When using replicast a UDP bearer can have an arbitrary amount of remote ip addresses associated with it. This means we cannot simply add all remote ip addresses to an existing bearer data message as it might fill the message, leaving us with a truncated message that we can't safely resume. To handle this we introduce the new netlink command TIPC_NL_UDP_GET_REMOTEIP. This command is intended to be called when the bearer data message has the TIPC_NLA_UDP_MULTI_REMOTEIP flag set, indicating there are more than one remote ip (replicast). Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26tipc: add the ability to get UDP options via netlinkRichard Alpe
Add UDP bearer options to netlink bearer get message. This is used by the tipc user space tool to display UDP options. The UDP bearer information is passed using either a sockaddr_in or sockaddr_in6 structs. This means the user space receiver should intermediately store the retrieved data in a large enough struct (sockaddr_strage) before casting to the proper IP version type. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26tipc: add replicast peer discoveryRichard Alpe
Automatically learn UDP remote IP addresses of communicating peers by looking at the source IP address of incoming TIPC link configuration messages (neighbor discovery). This makes configuration slightly easier and removes the problematic scenario where a node receives directly addressed neighbor discovery messages sent using replicast which the node cannot "reply" to using mutlicast, leaving the link FSM in a limbo state. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26tipc: introduce UDP replicastRichard Alpe
This patch introduces UDP replicast. A concept where we emulate multicast by sending multiple unicast messages to configured peers. The purpose of replicast is mainly to be able to use TIPC in cloud environments where IP multicast is disabled. Using replicas to unicast multicast messages is costly as we have to copy each skb and send the copies individually. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26tipc: refactor multicast ip checkRichard Alpe
Add a function to check if a tipc UDP media address is a multicast address or not. This is a purely cosmetic change. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26tipc: split UDP send functionRichard Alpe
Split the UDP send function into two. One callback that prepares the skb and one transmit function that sends the skb. This will come in handy in later patches, when we introduce UDP replicast. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26tipc: split UDP nl address parsingRichard Alpe
Split the UDP netlink parse function so that it only parses one netlink attribute at the time. This makes the parse function more generic and allow future UDP API functions to use it for parsing. Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: dsa: bcm_sf2: Utilize mask clear/set helpers in bcm_sf2_intr_disableFlorian Fainelli
And while at it, remove the unecessary writing of zeroes to the CPU_MASK_CLEAR register since it has no functional use. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2016-08-25 Here are a couple of important Bluetooth fixes for the 4.8 kernel: - Memory leak fix for HCI requests - Fix sk_filter handling with L2CAP - Fix sock_recvmsg behavior when MSG_TRUNC is not set Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26alx: add tso supportTobias Regnery
Add tso/tso6 support to the alx driver. Based on information from the downstream driver at github.com/qca/alx Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26Merge branch 'mediatek-pdma-rx'David S. Miller
Nelson Chang says: ==================== net: ethernet: mediatek: modify to use the PDMA for Ethernet RX This series have some modifications and refines to support Ethernet RX by the PDMA. changes since v4: - Remove the redundant OR operation in mtk_hw_init() changes since v3: - Add GDM hardware settings to send packets to PDMA for RX changes since v2: - Fix the bugs of PDMA cpu index and interrupt settings in mtk_poll_rx() changes since v1: - Modify to use the PDMA instead of the QDMA for Ethernet RX ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: ethernet: mediatek: modify GDM to send packets to the PDMA for RXNelson Chang
Because we change to use the PDMA as the Ethernet RX DMA engine, the patch modifies to set GDM to send packets to PDMA for RX. Acked-by: John Crispin <john@phrozen.org> Signed-off-by: Nelson Chang <nelson.chang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: ethernet: mediatek: modify to use the PDMA instead of the QDMA for ↵Nelson Chang
Ethernet RX Because the PDMA has richer features than the QDMA for Ethernet RX (such as multiple RX rings, HW LRO, etc.), the patch modifies to use the PDMA to handle Ethernet RX. Acked-by: John Crispin <john@phrozen.org> Signed-off-by: Nelson Chang <nelson.chang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26Merge branch 'for-linus-4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "We've queued up a few different fixes in here. These range from enospc corners to fsync and quota fixes, and a few targeted at error handling for corrupt metadata/fuzzing" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix lockdep warning on deadlock against an inode's log mutex Btrfs: detect corruption when non-root leaf has zero item Btrfs: check btree node's nritems btrfs: don't create or leak aliased root while cleaning up orphans Btrfs: fix em leak in find_first_block_group btrfs: do not background blkdev_put() Btrfs: clarify do_chunk_alloc()'s return value btrfs: fix fsfreeze hang caused by delayed iputs deal btrfs: update btrfs_space_info's bytes_may_use timely btrfs: divide btrfs_update_reserved_bytes() into two functions btrfs: use correct offset for reloc_inode in prealloc_file_extent_cluster() btrfs: qgroup: Fix qgroup incorrectness caused by log replay btrfs: relocation: Fix leaking qgroups numbers on data extents btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent() btrfs: waiting on qgroup rescan should not always be interruptible btrfs: properly track when rescan worker is running btrfs: flush_space: treat return value of do_chunk_alloc properly Btrfs: add ASSERT for block group's memory leak btrfs: backref: Fix soft lockup in __merge_refs function Btrfs: fix memory leak of reloc_root
2016-08-26Merge tag 'dlm-4.8-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm fix from David Teigland: "This fixes a bug introduced by recent debugfs cleanup" * tag 'dlm-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: fix malfunction of dlm_tool caused by debugfs changes
2016-08-26Merge tag 'dm-4.8-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - another stable fix for DM flakey (that tweaks the previous fix that didn't factor in expected 'drop_writes' behavior for read IO). - a dm-log bio operation flags fix for the broader block changes that were merged during the 4.8 merge window. * tag 'dm-4.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm log: fix unitialized bio operation flags dm flakey: fix reads to be issued if drop_writes configured
2016-08-26Merge tag 'iommu-fixes-v4.8-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "Fixes from Will Deacon: - fix a couple of thinkos in the CMDQ error handling and short-descriptor page table code that have been there since day one - disable stalling faults, since they may result in hardware deadlock - fix an accidental BUG() when passing disable_bypass=1 on the cmdline" * tag 'iommu-fixes-v4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/arm-smmu: Don't BUG() if we find aborting STEs with disable_bypass iommu/arm-smmu: Disable stalling faults for all endpoints iommu/arm-smmu: Fix CMDQ error handling iommu/io-pgtable-arm-v7s: Fix attributes when splitting blocks
2016-08-26Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Here's a set of block fixes for the current 4.8-rc release. This contains: - a fix for a secure erase regression, from Adrian. - a fix for an mmc use-after-free bug regression, also from Adrian. - potential zero pointer deference in bdev freezing, from Andrey. - a race fix for blk_set_queue_dying() from Bart. - a set of xen blkfront fixes from Bob Liu. - three small fixes for bcache, from Eric and Kent. - a fix for a potential invalid NVMe state transition, from Gabriel. - blk-mq CPU offline fix, preventing us from issuing and completing a request on the wrong queue. From me. - revert two previous floppy changes, since they caused a user visibile regression. A better fix is in the works. - ensure that we don't send down bios that have more than 256 elements in them. Fixes a crash with bcache, for example. From Ming. - a fix for deferencing an error pointer with cgroup writeback. Fixes a regression. From Vegard" * 'for-linus' of git://git.kernel.dk/linux-block: mmc: fix use-after-free of struct request Revert "floppy: refactor open() flags handling" Revert "floppy: fix open(O_ACCMODE) for ioctl-only open" fs/block_dev: fix potential NULL ptr deref in freeze_bdev() blk-mq: improve warning for running a queue on the wrong CPU blk-mq: don't overwrite rq->mq_ctx block: make sure a big bio is split into at most 256 bvecs nvme: Fix nvme_get/set_features() with a NULL result pointer bdev: fix NULL pointer dereference xen-blkfront: free resources if xlvbd_alloc_gendisk fails xen-blkfront: introduce blkif_set_queue_limits() xen-blkfront: fix places not updated after introducing 64KB page granularity bcache: pr_err: more meaningful error message when nr_stripes is invalid bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two. bcache: register_bcache(): call blkdev_put() when cache_alloc() fails block: Fix race triggered by blk_set_queue_dying() block: Fix secure erase nvme: Prevent controller state invalid transition
2016-08-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input subsystem fixes from Dmitry Torokhov: "Simply small driver fixups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ads7846 - remove redundant regulator_disable call Input: synaptics-rmi4 - fix register descriptor subpacket map construction Input: tegra-kbc - fix inverted reset logic Input: silead - use devm_gpiod_get Input: i8042 - set up shared ps2_cmd_mutex for AUX ports
2016-08-26Merge tag 'pci-v4.8-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "Resource management: - Update "pci=resource_alignment" documentation (Mathias Koehrer) MSI: - Use positive flags in pci_alloc_irq_vectors() (Christoph Hellwig) - Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors() (Christoph Hellwig) Intel VMD host bridge driver: - Fix infinite loop executing irq's (Keith Busch)" * tag 'pci-v4.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: x86/PCI: VMD: Fix infinite loop executing irq's PCI: Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors() PCI: Use positive flags in pci_alloc_irq_vectors() PCI: Update "pci=resource_alignment" documentation
2016-08-26mm: silently skip readahead for DAX inodesRoss Zwisler
For DAX inodes we need to be careful to never have page cache pages in the mapping->page_tree. This radix tree should be composed only of DAX exceptional entries and zero pages. ltp's readahead02 test was triggering a warning because we were trying to insert a DAX exceptional entry but found that a page cache page had already been inserted into the tree. This page was being inserted into the radix tree in response to a readahead(2) call. Readahead doesn't make sense for DAX inodes, but we don't want it to report a failure either. Instead, we just return success and don't do any work. Link: http://lkml.kernel.org/r/20160824221429.21158-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Jeff Moyer <jmoyer@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jan Kara <jack@suse.com> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26dax: fix device-dax region baseDan Williams
The data offset for a dax region needs to account for a reservation in the resource range. Otherwise, device-dax is allowing mappings directly into the memmap or device-info-block area with crash signatures like the following: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: get_zone_device_page+0x11/0x30 Call Trace: follow_devmap_pmd+0x298/0x2c0 follow_page_mask+0x275/0x530 __get_user_pages+0xe3/0x750 __gfn_to_pfn_memslot+0x1b2/0x450 [kvm] tdp_page_fault+0x130/0x280 [kvm] kvm_mmu_page_fault+0x5f/0xf0 [kvm] handle_ept_violation+0x94/0x180 [kvm_intel] vmx_handle_exit+0x1d3/0x1440 [kvm_intel] kvm_arch_vcpu_ioctl_run+0x81d/0x16a0 [kvm] kvm_vcpu_ioctl+0x33c/0x620 [kvm] do_vfs_ioctl+0xa2/0x5d0 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x1a/0xa4 Fixes: ab68f2622136 ("/dev/dax, pmem: direct access to persistent memory") Link: http://lkml.kernel.org/r/147205536732.1606.8994275381938837346.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Abhilash Kumar Mulumudi <m.abhilash-kumar@hpe.com> Reported-by: Toshi Kani <toshi.kani@hpe.com> Tested-by: Toshi Kani <toshi.kani@hpe.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26fs/seq_file: fix out-of-bounds readVegard Nossum
seq_read() is a nasty piece of work, not to mention buggy. It has (I think) an old bug which allows unprivileged userspace to read beyond the end of m->buf. I was getting these: BUG: KASAN: slab-out-of-bounds in seq_read+0xcd2/0x1480 at addr ffff880116889880 Read of size 2713 by task trinity-c2/1329 CPU: 2 PID: 1329 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #96 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 Call Trace: kasan_object_err+0x1c/0x80 kasan_report_error+0x2cb/0x7e0 kasan_report+0x4e/0x80 check_memory_region+0x13e/0x1a0 kasan_check_read+0x11/0x20 seq_read+0xcd2/0x1480 proc_reg_read+0x10b/0x260 do_loop_readv_writev.part.5+0x140/0x2c0 do_readv_writev+0x589/0x860 vfs_readv+0x7b/0xd0 do_readv+0xd8/0x2c0 SyS_readv+0xb/0x10 do_syscall_64+0x1b3/0x4b0 entry_SYSCALL64_slow_path+0x25/0x25 Object at ffff880116889100, in cache kmalloc-4096 size: 4096 Allocated: PID = 1329 save_stack_trace+0x26/0x80 save_stack+0x46/0xd0 kasan_kmalloc+0xad/0xe0 __kmalloc+0x1aa/0x4a0 seq_buf_alloc+0x35/0x40 seq_read+0x7d8/0x1480 proc_reg_read+0x10b/0x260 do_loop_readv_writev.part.5+0x140/0x2c0 do_readv_writev+0x589/0x860 vfs_readv+0x7b/0xd0 do_readv+0xd8/0x2c0 SyS_readv+0xb/0x10 do_syscall_64+0x1b3/0x4b0 return_from_SYSCALL_64+0x0/0x6a Freed: PID = 0 (stack is not available) Memory state around the buggy address: ffff88011688a000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88011688a080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff88011688a100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88011688a180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88011688a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Disabling lock debugging due to kernel taint This seems to be the same thing that Dave Jones was seeing here: https://lkml.org/lkml/2016/8/12/334 There are multiple issues here: 1) If we enter the function with a non-empty buffer, there is an attempt to flush it. But it was not clearing m->from after doing so, which means that if we try to do this flush twice in a row without any call to traverse() in between, we are going to be reading from the wrong place -- the splat above, fixed by this patch. 2) If there's a short write to userspace because of page faults, the buffer may already contain multiple lines (i.e. pos has advanced by more than 1), but we don't save the progress that was made so the next call will output what we've already returned previously. Since that is a much less serious issue (and I have a headache after staring at seq_read() for the past 8 hours), I'll leave that for now. Link: http://lkml.kernel.org/r/1471447270-32093-1-git-send-email-vegard.nossum@oracle.com Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Reported-by: Dave Jones <davej@codemonkey.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26mm: memcontrol: avoid unused function warningArnd Bergmann
A bugfix in v4.8-rc2 introduced a harmless warning when CONFIG_MEMCG_SWAP is disabled but CONFIG_MEMCG is enabled: mm/memcontrol.c:4085:27: error: 'mem_cgroup_id_get_online' defined but not used [-Werror=unused-function] static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg) This moves the function inside of the #ifdef block that hides the calling function, to avoid the warning. Fixes: 1f47b61fb407 ("mm: memcontrol: fix swap counter leak on swapout from offline cgroup") Link: http://lkml.kernel.org/r/20160824113733.2776701-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26mm: clarify COMPACTION Kconfig textMichal Hocko
The current wording of the COMPACTION Kconfig help text doesn't emphasise that disabling COMPACTION might cripple the page allocator which relies on the compaction quite heavily for high order requests and an unexpected OOM can happen with the lack of compaction. Make sure we are vocal about that. Link: http://lkml.kernel.org/r/20160823091726.GK23577@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Mel Gorman <mgorman@suse.de> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26treewide: replace config_enabled() with IS_ENABLED() (2nd round)Masahiro Yamada
Commit 97f2645f358b ("tree-wide: replace config_enabled() with IS_ENABLED()") mostly killed config_enabled(), but some new users have appeared for v4.8-rc1. They are all used for a boolean option, so can be replaced with IS_ENABLED() safely. Link: http://lkml.kernel.org/r/1471970749-24867-1-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26printk: fix parsing of "brl=" optionNicolas Iooss
Commit bbeddf52adc1 ("printk: move braille console support into separate braille.[ch] files") moved the parsing of braille-related options into _braille_console_setup(), changing the type of variable str from char* to char**. In this commit, memcmp(str, "brl,", 4) was correctly updated to memcmp(*str, "brl,", 4) but not memcmp(str, "brl=", 4). Update the code to make "brl=" option work again and replace memcmp() with strncmp() to make the compiler able to detect such an issue. Fixes: bbeddf52adc1 ("printk: move braille console support into separate braille.[ch] files") Link: http://lkml.kernel.org/r/20160823165700.28952-1-nicolas.iooss_linux@m4x.org Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26soft_dirty: fix soft_dirty during THP splitAndrea Arcangeli
While adding proper userfaultfd_wp support with bits in pagetable and swap entry to avoid false positives WP userfaults through swap/fork/ KSM/etc, I've been adding a framework that mostly mirrors soft dirty. So I noticed in one place I had to add uffd_wp support to the pagetables that wasn't covered by soft_dirty and I think it should have. Example: in the THP migration code migrate_misplaced_transhuge_page() pmd_mkdirty is called unconditionally after mk_huge_pmd. entry = mk_huge_pmd(new_page, vma->vm_page_prot); entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma); That sets soft dirty too (it's a false positive for soft dirty, the soft dirty bit could be more finegrained and transfer the bit like uffd_wp will do.. pmd/pte_uffd_wp() enforces the invariant that when it's set pmd/pte_write is not set). However in the THP split there's no unconditional pmd_mkdirty after mk_huge_pmd and pte_swp_mksoft_dirty isn't called after the migration entry is created. The code sets the dirty bit in the struct page instead of setting it in the pagetable (which is fully equivalent as far as the real dirty bit is concerned, as the whole point of pagetable bits is to be eventually flushed out of to the page, but that is not equivalent for the soft-dirty bit that gets lost in translation). This was found by code review only and totally untested as I'm working to actually replace soft dirty and I don't have time to test potential soft dirty bugfixes as well :). Transfer the soft_dirty from pmd to pte during THP splits. This fix avoids losing the soft_dirty bit and avoids userland memory corruption in the checkpoint. Fixes: eef1b3ba053aa6 ("thp: implement split_huge_pmd()") Link: http://lkml.kernel.org/r/1471610515-30229-2-git-send-email-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26sysctl: handle error writing UINT_MAX to u32 fieldsSubash Abhinov Kasiviswanathan
We have scripts which write to certain fields on 3.18 kernels but this seems to be failing on 4.4 kernels. An entry which we write to here is xfrm_aevent_rseqth which is u32. echo 4294967295 > /proc/sys/net/core/xfrm_aevent_rseqth Commit 230633d109e3 ("kernel/sysctl.c: detect overflows when converting to int") prevented writing to sysctl entries when integer overflow occurs. However, this does not apply to unsigned integers. Heinrich suggested that we introduce a new option to handle 64 bit limits and set min as 0 and max as UINT_MAX. This might not work as it leads to issues similar to __do_proc_doulongvec_minmax. Alternatively, we would need to change the datatype of the entry to 64 bit. static int __do_proc_doulongvec_minmax(void *data, struct ctl_table { i = (unsigned long *) data; //This cast is causing to read beyond the size of data (u32) vleft = table->maxlen / sizeof(unsigned long); //vleft is 0 because maxlen is sizeof(u32) which is lesser than sizeof(unsigned long) on x86_64. Introduce a new proc handler proc_douintvec. Individual proc entries will need to be updated to use the new handler. [akpm@linux-foundation.org: coding-style fixes] Fixes: 230633d109e3 ("kernel/sysctl.c:detect overflows when converting to int") Link: http://lkml.kernel.org/r/1471479806-5252-1-git-send-email-subashab@codeaurora.org Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Kees Cook <keescook@chromium.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26get_maintainer: quiet noisy implicit -f vcs_file_exists checkingJoe Perches
Checking command line filenames that are outside the git tree can emit a noisy and confusing message. Quiet that message by redirecting stderr. Verify that the command was executed successfully. Fixes: 4cad35a7ca69 ("get_maintainer.pl: reduce need for command-line option -f") Link: http://lkml.kernel.org/r/1970a1d2fecb258e384e2e4fdaacdc9ccf3e30a4.1470955439.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Wolfram Sang <wsa@the-dreams.de> Tested-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26byteswap: don't use __builtin_bswap*() with sparseJohannes Berg
Although sparse declares __builtin_bswap*(), it can't actually do constant folding inside them (yet). As such, things like switch (protocol) { case htons(ETH_P_IP): break; } which we do all over the place cause sparse to warn that it expects a constant instead of a function call. Disable __HAVE_BUILTIN_BSWAP*__ if __CHECKER__ is defined to avoid this. Fixes: 7322dd755e7d ("byteswap: try to avoid __builtin_constant_p gcc bug") Link: http://lkml.kernel.org/r/1470914102-26389-1-git-send-email-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-26Merge branch 'bcm_sf2-utilize-b53_common'David S. Miller
Florian Fainelli says: ==================== net: dsa: Make bcm_sf2 utilize b53_common This patch series makes the bcm_sf2 driver utilize a large number of the core functions offered by the b53_common driver since the SWITCH_CORE registers are mostly register compatible with the switches driven by b53_common. In order to accomplish that, we just override the dsa_driver_ops callbacks that we need to. There are still integration specific logic from the bcm_sf2 that we cannot absorb into b53_common because it is just not there, mostly in the area of link management and power management, but most of the features are within b53_common now: VLAN, FDB, bridge Along the process, we also improve support for the BCM58xx SoCs, since those also have the same version of the switching IP that 7445 has (for which bcm_sf2 was developed). Changes in v3: - rebase against 145dd5f9c88f6ee645662df0be003e8f04bdae93 ("net: flush the softnet backlog in process context") Changes in v2: - rebased against "net: dsa: rename switch operations structure" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: dsa: bcm_sf2: Remove duplicate codeFlorian Fainelli
Now that we are using b53_common for most VLAN, FDB and bridge operations, delete all the redundant code that we had in bcm_sf2.c to keep only the integration specific logic that we have to deal with: power management, link management and the external interfaces (RGMII, MDIO). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: dsa: bcm_sf2: Utilize core B53 driver when possibleFlorian Fainelli
The Broadcom Starfighter2 is almost entirely register compatible with B53, yet for historical reasons came up first in the tree and is now being updated to utilize b53_common.c to the fullest extent possible. A few things need to be adjusted to allow that: - the switch "core" registers currently operate on a 32-bit address, whereas b53 passes a page + reg pair to offset from, so we need to convert that, thankfully there is a generic formula to do that - the link managemenent is not self contained with the B53/CORE register set, but instead is in the SWITCH_REG block which is part of the integration glue logic, so we keep that entirely custom here because this really is part of the existing bcm_sf2 implementation - there are additional power management constraints on the port's memories that make us keep the port_enable/disable callbacks custom for now, also, we support tagging whereas b53_common does not support that yet All the VLAN and bridge code is entirely identical though so, avoid duplicating it. Other things will be migrated in the future like EEE and possibly Wake-on-LAN. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: dsa: b53: Add JOIN_ALL_VLAN supportFlorian Fainelli
In order to migrate the bcm_sf2 driver over to the b53 driver for most VLAN/FDB/bridge operations, we need to add support for the "join all VLANs" register and behavior which allows us to make a given port join all VLANs and avoid setting specific VLAN entries when it is leaving the bridge. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: dsa: b53: Define SF2 MIB layoutFlorian Fainelli
The 58xx and 7445 chips use the Starfighter2 code, define its MIB layout and introduce a helper function: is58xx() which checks for both of these IDs for now. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: dsa: b53: Prepare to support 7445 switchFlorian Fainelli
Allocate a device entry for the Broadcom BCM7445 integrated switch currently backed by bcm_sf2.c. Since this is the latest generation, it has 4 ARL entries, 4K VLANs and uses Port 8 for the CPU/IMP port. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: dsa: b53: Initialize ds->ops in b53_switch_allocFlorian Fainelli
In order to allow drivers to override specific dsa_switch_driver callbacks, initialize ds->ops to b53_switch_ops earlier, which avoids having to expose this structure to glue drivers. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26Merge branch 'mlxsw-fw-mark-offload'David S. Miller
Jiri Pirko says: ==================== mlxsw: Introduce support for offload forward mark Ido says: This patchset enables the forwarding of certain control packets by the device instead of relying on the CPU to do the forwarding. The first two patches simplify the current switchdev offload forward infrastructure and make it usable for stacked devices. This is done by moving the packet and port marking to the bridge driver instead of the switch driver. Patches 3-5 add the mlxsw specific bits to support the forward mark. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26mlxsw: spectrum: Mirror certain packets to CPUIdo Schimmel
Instead of trapping certain packets to the CPU and then relying on it to flood them we can instead make the device mirror them. The following packet types are mirrored: * DHCP: Broadcast packets that should be flooded by the device, but also trapped in case CPU is running the DHCP server. * IGMP query: Multicast packets that need to be forwarded to other bridge ports, but also trapped so that receiving netdev will be marked as a router port by the bridge driver. * ARP request: Broadcast packets that should be forwarded to other bridge ports, but also trapped in case requested IP is of the local machine. * ARP response: Unicast packets that should be forwarded by the bridge but also trapped in case response is directed at us. Set the trap action of such packets to mirror and mark them using 'offload_fwd_mark' to prevent the bridge driver from forwarding them itself. Note that OSPF packets are also marked despite their action being trap. The reason for this is that the device traps such packets in the pipeline after they were already flooded. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26mlxsw: spectrum: Allow different traps to have different actionsIdo Schimmel
Up until now we only trapped packets to CPU, but we are going to allow some packets to be mirrored (trap & forward) to CPU. Extend the Rx listener with 'action' member. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26mlxsw: spectrum: Simplify traps definitionIdo Schimmel
Instead of copying & pasting the same struct initialization for every Rx listener, just use a macro. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26bridge: switchdev: Add forward mark support for stacked devicesIdo Schimmel
switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of port netdevs so that packets being flooded by the device won't be flooded twice. It works by assigning a unique identifier (the ifindex of the first bridge port) to bridge ports sharing the same parent ID. This prevents packets from being flooded twice by the same switch, but will flood packets through bridge ports belonging to a different switch. This method is problematic when stacked devices are taken into account, such as VLANs. In such cases, a physical port netdev can have upper devices being members in two different bridges, thus requiring two different 'offload_fwd_mark's to be configured on the port netdev, which is impossible. The main problem is that packet and netdev marking is performed at the physical netdev level, whereas flooding occurs between bridge ports, which are not necessarily port netdevs. Instead, packet and netdev marking should really be done in the bridge driver with the switch driver only telling it which packets it already forwarded. The bridge driver will mark such packets using the mark assigned to the ingress bridge port and will prevent the packet from being forwarded through any bridge port sharing the same mark (i.e. having the same parent ID). Remove the current switchdev 'offload_fwd_mark' implementation and instead implement the proposed method. In addition, make rocker - the sole user of the mark - use the proposed method. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26switchdev: Support parent ID comparison for stacked devicesIdo Schimmel
switchdev_port_same_parent_id() currently expects port netdevs, but we need it to support stacked devices in the next patch, so drop the NO_RECURSE flag. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26team: loadbalance: push lacpdus to exact deliveryJiri Pirko
When team is in bridge and LACP is utilized, LACPDU packets are pushed to userspace using raw socket and there they are processed. However, since 8626c56c8279b, LACPDU skbs are dropped by bridge rx_handler so they never reach packet handlers in rx path. Fix this by explicity treat LACPDUs to be pushed to exact delivery in team rx_handler. Reported-by: Ido Schimmel <idosch@mellanox.com> Fixes: 8626c56c8279b ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26devlink: remove unused priv_sizeIvan Vecera
Remove unused and useless priv_size member from struct devlink_ops. Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: flush the softnet backlog in process contextPaolo Abeni
Currently in process_backlog(), the process_queue dequeuing is performed with local IRQ disabled, to protect against flush_backlog(), which runs in hard IRQ context. This patch moves the flush operation to a work queue and runs the callback with bottom half disabled to protect the process_queue against dequeuing. Since process_queue is now always manipulated in bottom half context, the irq disable/enable pair around the dequeue operation are removed. To keep the flush time as low as possible, the flush works are scheduled on all online cpu simultaneously, using the high priority work-queue and statically allocated, per cpu, work structs. Overall this change increases the time required to destroy a device to improve slightly the packets reinjection performances. Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26net: bridge: export also pvid flag in the xstats flagsNikolay Aleksandrov
When I added support to export the vlan entry flags via xstats I forgot to add support for the pvid since it is manually matched, so check if the entry matches the vlan_group's pvid and set the flag appropriately. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>