summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-01-19vhost/scsi: silence uninitialized variable warningDan Carpenter
This is to silence an uninitialized variable warning in debug output. The problem is this line: pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n", head, out, in); If "head == vq->num" is true on the first iteration then "out" and "in" aren't initialized. We handle that a few lines after the printk. I was tempted to just delete the pr_debug() but I decided to just initialize them to zero instead. Also checkpatch.pl complains if variables are declared as just "unsigned" without the "int". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-19vhost: scsi: constify target_core_fabric_ops structuresBhumika Goyal
Declare target_core_fabric_ops strucrues as const as they are only passed as an argument to the functions target_register_template and target_unregister_template. The arguments are of type const struct target_core_fabric_ops *, so target_core_fabric_ops structures having this property can be declared const. Done using Coccinelle: @r disable optional_qualifier@ identifier i; position p; @@ static struct target_core_fabric_ops i@p={...}; @ok@ position p; identifier r.i; @@ ( target_register_template(&i@p) | target_unregister_template(&i@p) ) @bad@ position p!={r.p,ok.p}; identifier r.i; @@ i@p @depends on !bad disable optional_qualifier@ identifier r.i; @@ +const struct target_core_fabric_ops i; File size before: drivers/vhost/scsi.o text data bss dec hex filename 18063 2985 40 21088 5260 drivers/vhost/scsi.o File size after: drivers/vhost/scsi.o text data bss dec hex filename 18479 2601 40 21120 5280 drivers/vhost/scsi.o Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2017-01-19nbd: only set MSG_MORE when we have more to sendJosef Bacik
A user noticed that write performance was horrible over loopback and we traced it to an inversion of when we need to set MSG_MORE. It should be set when we have more bvec's to send, not when we are on the last bvec. This patch made the test go from 20 iops to 78k iops. Signed-off-by: Josef Bacik <jbacik@fb.com> Fixes: 429a787be679 ("nbd: fix use-after-free of rq/bio in the xmit path") Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-19Merge tag 'pci-v4.10-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - recognize that a PCI-to-PCIe bridge originates a PCIe hierarchy, so we enumerate that hierarchy correctly - X-Gene: fix a change merged for v4.10 that broke MSI - Keystone: avoid reading undefined registers, which can cause asynchronous external aborts - Supermicro X8DTH-i/6/iF/6F: ignore broken _CRS that caused us to change (and break) existing I/O port assignments * tag 'pci-v4.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/MSI: pci-xgene-msi: Fix CPU hotplug registration handling PCI: Enumerate switches below PCI-to-PCIe bridges x86/PCI: Ignore _CRS on Supermicro X8DTH-i/6/iF/6F PCI: designware: Check for iATU unroll only on platforms that use ATU
2017-01-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID fixes from Jiri Kosina: - regression fix for generic Wacom devices, from Jason Gerecke - DMA-on-stack fixes for hid-corsair driver, from Johan Hovold * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: wacom: Fix sibling detection regression HID: corsair: fix control-transfer error handling HID: corsair: fix DMA buffers on stack
2017-01-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull two s390 bug fixes from Martin Schwidefsky: "Two changes, the first is a fix to add a missing memory clobber to the inline assembly to load control registers. This has not caused any issues so far, but who knows what code gcc will generate in future versions. The second change is an update for the default configurations. This includes CONFIG_BUG_ON_DATA_CORRUPTION=y, we want this to be enabled for s390. The usual approach to debug problems on production systems is to use crash on a system dump and for us avoiding data corruptions is priority one" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: update defconfigs s390/ctl_reg: make __ctl_load a full memory barrier
2017-01-19Merge tag 'for-linus-4.10-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A fix for Xen running in nested virtualization environment" * tag 'for-linus-4.10-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: partially revert "xen: Remove event channel notification through Xen PCI platform device"
2017-01-19Btrfs: fix truncate down when no_holes feature is enabledLiu Bo
For such a file mapping, [0-4k][hole][8k-12k] In NO_HOLES mode, we don't have the [hole] extent any more. Commit c1aa45759e90 ("Btrfs: fix shrinking truncate when the no_holes feature is enabled") fixed disk isize not being updated in NO_HOLES mode when data is not flushed. However, even if data has been flushed, we can still have trouble in updating disk isize since we updated disk isize to 'start' of the last evicted extent. Reviewed-by: Chris Mason <clm@fb.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-01-19Btrfs: Fix deadlock between direct IO and fast fsyncChandan Rajendra
The following deadlock is seen when executing generic/113 test, ---------------------------------------------------------+---------------------------------------------------- Direct I/O task Fast fsync task ---------------------------------------------------------+---------------------------------------------------- btrfs_direct_IO __blockdev_direct_IO do_blockdev_direct_IO do_direct_IO btrfs_get_blocks_direct while (blocks needs to written) get_more_blocks (first iteration) btrfs_get_blocks_direct btrfs_create_dio_extent down_read(&BTRFS_I(inode) >dio_sem) Create and add extent map and ordered extent up_read(&BTRFS_I(inode) >dio_sem) btrfs_sync_file btrfs_log_dentry_safe btrfs_log_inode_parent btrfs_log_inode btrfs_log_changed_extents down_write(&BTRFS_I(inode) >dio_sem) Collect new extent maps and ordered extents wait for ordered extent completion get_more_blocks (second iteration) btrfs_get_blocks_direct btrfs_create_dio_extent down_read(&BTRFS_I(inode) >dio_sem) -------------------------------------------------------------------------------------------------------------- In the above description, Btrfs direct I/O code path has not yet started submitting bios for file range covered by the initial ordered extent. Meanwhile, The fast fsync task obtains the write semaphore and waits for I/O on the ordered extent to get completed. However, the Direct I/O task is now blocked on obtaining the read semaphore. To resolve the deadlock, this commit modifies the Direct I/O code path to obtain the read semaphore before invoking __blockdev_direct_IO(). The semaphore is then given up after __blockdev_direct_IO() returns. This allows the Direct I/O code to complete I/O on all the ordered extents it creates. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-01-19btrfs: fix false enospc error when truncating heavily reflinked fileWang Xiaoguang
Below test script can reveal this bug: dd if=/dev/zero of=fs.img bs=$((1024*1024)) count=100 dev=$(losetup --show -f fs.img) mkdir -p /mnt/mntpoint mkfs.btrfs -f $dev mount $dev /mnt/mntpoint cd /mnt/mntpoint echo "workdir is: /mnt/mntpoint" blocksize=$((128 * 1024)) dd if=/dev/zero of=testfile bs=$blocksize count=1 sync count=$((17*1024*1024*1024/blocksize)) echo "file size is:" $((count*blocksize)) for ((i = 1; i <= $count; i++)); do dst_offset=$((blocksize * i)) xfs_io -f -c "reflink testfile 0 $dst_offset $blocksize"\ testfile > /dev/null done sync truncate --size 0 testfile The last truncate operation will fail for ENOSPC reason, but indeed it should not fail. In btrfs_truncate(), we use a temporary block_rsv to do truncate operation. With every btrfs_truncate_inode_items() call, we migrate space to this block_rsv, but forget to cleanup previous reservation, which will make this block_rsv's reserved bytes keep growing, and this reserved space will only be released in the end of btrfs_truncate(), this metadata leak will impact other's metadata reservation. In this case, it's "btrfs_start_transaction(root, 2);" fails for enospc error, which make this truncate operation fail. Call btrfs_block_rsv_release() to fix this bug. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-01-19gianfar: Do not reuse pages from emergency reserveEric Dumazet
A driver using dev_alloc_page() must not reuse a page that had to use emergency memory reserve. Otherwise all packets using this page will be immediately dropped, unless for very specific sockets having SOCK_MEMALLOC bit set. This issue might be hard to debug, because only a fraction of the RX ring buffer would suffer from drops. Fixes: 75354148ce69 ("gianfar: Add paged allocation and Rx S/G") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Claudiu Manoil <claudiu.manoil@freescale.com> Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19tcp: initialize max window for a new fastopen socketAlexey Kodanev
Found that if we run LTP netstress test with large MSS (65K), the first attempt from server to send data comparable to this MSS on fastopen connection will be delayed by the probe timer. Here is an example: < S seq 0:0 win 43690 options [mss 65495 wscale 7 tfo cookie] length 32 > S. seq 0:0 ack 1 win 43690 options [mss 65495 wscale 7] length 0 < . ack 1 win 342 length 0 Inside tcp_sendmsg(), tcp_send_mss() returns max MSS in 'mss_now', as well as in 'size_goal'. This results the segment not queued for transmition until all the data copied from user buffer. Then, inside __tcp_push_pending_frames(), it breaks on send window test and continues with the check probe timer. Fragmentation occurs in tcp_write_wakeup()... +0.2 > P. seq 1:43777 ack 1 win 342 length 43776 < . ack 43777, win 1365 length 0 > P. seq 43777:65001 ack 1 win 342 options [...] length 21224 ... This also contradicts with the fact that we should bound to the half of the window if it is large. Fix this flaw by correctly initializing max_window. Before that, it could have large values that affect further calculations of 'size_goal'. Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19net/mlx5e: Remove unused variableArnd Bergmann
A cleanup removed the only user of this variable mlx5/core/en_ethtool.c: In function 'mlx5e_set_channels': mlx5/core/en_ethtool.c:546:6: error: unused variable 'ncv' [-Werror=unused-variable] Let's remove the declaration as well. Fixes: 639e9e94160e ("net/mlx5e: Remove unnecessary checks when setting num channels") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lockKefeng Wang
Just like commit 4acd4945cd1e ("ipv6: addrconf: Avoid calling netdevice notifiers with RCU read-side lock"), it is unnecessary to make addrconf_disable_change() use RCU iteration over the netdev list, since it already holds the RTNL lock, or we may meet Illegal context switch in RCU read-side critical section. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19MAINTAINERS: update cxgb4 maintainerHariprasad Shenai
Ganesg will be taking over as maintainer from now Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19arm64: avoid returning from bad_modeMark Rutland
Generally, taking an unexpected exception should be a fatal event, and bad_mode is intended to cater for this. However, it should be possible to contain unexpected synchronous exceptions from EL0 without bringing the kernel down, by sending a SIGILL to the task. We tried to apply this approach in commit 9955ac47f4ba1c95 ("arm64: don't kill the kernel on a bad esr from el0"), by sending a signal for any bad_mode call resulting from an EL0 exception. However, this also applies to other unexpected exceptions, such as SError and FIQ. The entry paths for these exceptions branch to bad_mode without configuring the link register, and have no kernel_exit. Thus, if we take one of these exceptions from EL0, bad_mode will eventually return to the original user link register value. This patch fixes this by introducing a new bad_el0_sync handler to cater for the recoverable case, and restoring bad_mode to its original state, whereby it calls panic() and never returns. The recoverable case branches to bad_el0_sync with a bl, and returns to userspace via the usual ret_to_user mechanism. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: 9955ac47f4ba1c95 ("arm64: don't kill the kernel on a bad esr from el0") Reported-by: Mark Salter <msalter@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-19netfilter: conntrack: refine gc worker heuristics, reduxFlorian Westphal
This further refines the changes made to conntrack gc_worker in commit e0df8cae6c16 ("netfilter: conntrack: refine gc worker heuristics"). The main idea of that change was to reduce the scan interval when evictions take place. However, on the reporters' setup, there are 1-2 million conntrack entries in total and roughly 8k new (and closing) connections per second. In this case we'll always evict at least one entry per gc cycle and scan interval is always at 1 jiffy because of this test: } else if (expired_count) { gc_work->next_gc_run /= 2U; next_run = msecs_to_jiffies(1); being true almost all the time. Given we scan ~10k entries per run its clearly wrong to reduce interval based on nonzero eviction count, it will only waste cpu cycles since a vast majorities of conntracks are not timed out. Thus only look at the ratio (scanned entries vs. evicted entries) to make a decision on whether to reduce or not. Because evictor is supposed to only kick in when system turns idle after a busy period, pick a high ratio -- this makes it 50%. We thus keep the idea of increasing scan rate when its likely that table contains many expired entries. In order to not let timed-out entries hang around for too long (important when using event logging, in which case we want to timely destroy events), we now scan the full table within at most GC_MAX_SCAN_JIFFIES (16 seconds) even in worst-case scenario where all timed-out entries sit in same slot. I tested this with a vm under synflood (with sysctl net.netfilter.nf_conntrack_tcp_timeout_syn_recv=3). While flood is ongoing, interval now stays at its max rate (GC_MAX_SCAN_JIFFIES / GC_MAX_BUCKETS_DIV -> 125ms). With feedback from Nicolas Dichtel. Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Fixes: b87a2f9199ea82eaadc ("netfilter: conntrack: add gc worker to remove timed-out entries") Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-19netfilter: conntrack: remove GC_MAX_EVICTS breakFlorian Westphal
Instead of breaking loop and instant resched, don't bother checking this in first place (the loop calls cond_resched for every bucket anyway). Suggested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-19HID: wacom: Fix sibling detection regressionJason Gerecke
Commit 345857b ("HID: wacom: generic: Add support for sensor offsets") included a change to the operation and location of the call to 'wacom_add_shared_data' in 'wacom_parse_and_register'. The modifications included moving it higher up so that it would occur before the call to 'wacom_retrieve_hid_descriptor'. This was done to prevent a crash that would have occured when the report containing tablet offsets was fed into the driver with 'wacom_hid_report_raw_event' (specifically: the various 'wacom_wac_*_report' functions were written with the assumption that they would only be called once tablet setup had completed; 'wacom_wac_pen_report' in particular dereferences 'shared' which wasn't yet allocated). Moving the call to 'wacom_add_shared_data' effectively prevented the crash but also broke the sibiling detection code which assumes that the HID descriptor has been read and the various device_type flags set. To fix this situation, we restore the original 'wacom_add_shared_data' operation and location and instead implement an alternative change that can also prevent the crash. Specifically, we notice that the report functions mentioned above expect to be called only for input reports. By adding a check, we can prevent feature reports (such as the offset report) from causing trouble. Fixes: 345857bb49 ("HID: wacom: generic: Add support for sensor offsets") Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> Tested-by: Ping Cheng <pingc@wacom.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-01-19pinctrl: uniphier: fix Ethernet (RMII) pin-mux setting for LD20Masahiro Yamada
Fix the pin-mux values for the MDC, MDIO, MDIO_INTL, PHYRSTL pins. Fixes: 1e359ab1285e ("pinctrl: uniphier: add Ethernet pin-mux settings") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-01-19pinctrl: meson: fix uart_ao_b for GXBB and GXL/GXMMartin Blumenstingl
The GXBB and GXL/GXM pinctrl drivers had a configuration which conflicts with uart_ao_a. According to the GXBB ("S905") datasheet the AO UART functions are: - GPIOAO_0: Func1 = UART_TX_AO_A (bit 12), Func2 = UART_TX_AO_B (bit 26) - GPIOAO_1: Func1 = UART_RX_AO_A (bit 11), Func2 = UART_RX_AO_B (bit 25) - GPIOAO_4: Func2 = UART_TX_AO_B (bit 24) - GPIOAO_5: Func2 = UART_RX_AO_B (bit 25) The existing definition for uart_AO_A already uses GPIOAO_0 and GPIOAO_1. The old definition of uart_AO_B however was broken, as it used GPIOAO_0 for TX (which would be fine) and two pins (GPIOAO_1 and GPIOAO_5) for RX (which does not make any sense). This fixes the uart_AO_B configuration by moving it to GPIOAO_4 and GPIOAO_5 (it would be possible to use GPIOAO_0 and GPIOAO_1 in theory, but all existing hardware uses uart_AO_A there). The fix for GXBB and GXL/GXM is identical since it seems that these specific pins are identical on both SoC variants. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-01-19gpio: provide lockdep keys for nested/unnested irqchipsLinus Walleij
The helper function for adding a GPIO chip compiles in a lockdep key for debugging, the same key is needed for nested chips as well. The macro construction is unreadable, replace this with two static inlines instead. The _gpiochip_irqchip_add prefixed function is not helpful, rename it with gpiochip_irqchip_add_key() that tell us what the function is actually doing. Fixes: d245b3f9bd36 ("gpio: simplify adding threaded interrupts") Cc: Roger Quadros <rogerq@ti.com> Reported-by: Clemens Gruber <clemens.gruber@pqgruber.com> Reported-by: Roger Quadros <rogerq@ti.com> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-01-18ARC: Revert "ARC: mm: IOC: Don't enable IOC by default"Vineet Gupta
The programming model has been fixed with prev patches so re-enable it by default This reverts commit 23cb1f644019bac49d87b4dd7c1eac0569cc4f53. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18ARC: mm: split arc_cache_init to allow __init reaping of bulkVineet Gupta
arc_cache_init() is called for each core so can't be tagged __init. However bulk of it is only executed by master core and thus is candidate for __init reaping. So split it up to allow that. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18Merge tag 'omap-for-v4.10/fixes-rc4' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Fixes for omaps for v4.10-rc cycle. Mostly a DMA regression fix for omap1, and then a handful of trivial fixes for boards and devices to work: - Fixes TI wilink bluetooth strange platform data baud rate - Remove duplicate pinmux line for am335x-icev2 - Fix omap1 dma regression - Fix uninitialized return value for wkup_m3_ipc_probe() - Fix Ethernet PHY binding typo for dra72-evm - Fix init for omap5 and dra7 sata ports - Fix mmc card detect pin for Logic PD SOM-LV * tag 'omap-for-v4.10/fixes-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: omap3: Fix Card Detect and Write Protect on Logic PD SOM-LV ARM: dts: OMAP5 / DRA7: indicate that SATA port 0 is available. ARM: dts: dra72-evm-revc: fix typo in ethernet-phy node soc: ti: wkup_m3_ipc: Fix error return code in wkup_m3_ipc_probe() ARM: OMAP1: DMA: Correct the number of logical channels ARM: dts: am335x-icev2: Remove the duplicated pinmux setting ARM: OMAP2+: Fix WL1283 Bluetooth Baud Rate Signed-off-by: Olof Johansson <olof@lixom.net>
2017-01-18ARCv2: IOC: Use actual memory size to setup aperture sizeVineet Gupta
vs. fixed 512M before. But this still assumes that all of memory is under IOC which may not be true for the SoC. Improve that later when this becomes a real issue, by specifying this from DT. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18ARCv2: IOC: Adhere to progamming model guidelines to avoid DMA corruptionVineet Gupta
On AXS103 release bitfiles, DMA data corruptions were seen because IOC setup was not following the recommended way in documentation. Flipping IOC on when caches are enabled or coherency transactions are in flight, might cause some of the memory operations to not observe coherency as expected. So strictly follow the programming model recommendations as documented in comment header above arc_ioc_setup() Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18ARCv2: IOC: refactor the IOC and SLC operations into own functionsVineet Gupta
- Move IOC setup into arc_ioc_setup() - Move SLC disabling into arc_slc_disable() Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18blk-mq: Remove unused variableKeith Busch
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-18bpf: don't trigger OOM killer under pressure with map allocDaniel Borkmann
This patch adds two helpers, bpf_map_area_alloc() and bpf_map_area_free(), that are to be used for map allocations. Using kmalloc() for very large allocations can cause excessive work within the page allocator, so i) fall back earlier to vmalloc() when the attempt is considered costly anyway, and even more importantly ii) don't trigger OOM killer with any of the allocators. Since this is based on a user space request, for example, when creating maps with element pre-allocation, we really want such requests to fail instead of killing other user space processes. Also, don't spam the kernel log with warnings should any of the allocations fail under pressure. Given that, we can make backend selection in bpf_map_area_alloc() generic, and convert all maps over to use this API for spots with potentially large allocation requests. Note, replacing the one kmalloc_array() is fine as overflow checks happen earlier in htab_map_alloc(), since it must also protect the multiplication for vmalloc() should kmalloc_array() fail. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18lwtunnel: fix autoload of lwt modulesDavid Ahern
Trying to add an mpls encap route when the MPLS modules are not loaded hangs. For example: CONFIG_MPLS=y CONFIG_NET_MPLS_GSO=m CONFIG_MPLS_ROUTING=m CONFIG_MPLS_IPTUNNEL=m $ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2 The ip command hangs: root 880 826 0 21:25 pts/0 00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2 $ cat /proc/880/stack [<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134 [<ffffffff81065efc>] __request_module+0x27b/0x30a [<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178 [<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4 [<ffffffff814ae451>] fib_table_insert+0x90/0x41f [<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52 ... modprobe is trying to load rtnl-lwt-MPLS: root 881 5 0 21:25 ? 00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS and it hangs after loading mpls_router: $ cat /proc/881/stack [<ffffffff81441537>] rtnl_lock+0x12/0x14 [<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179 [<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router] [<ffffffff81000471>] do_one_initcall+0x8e/0x13f [<ffffffff81119961>] do_init_module+0x5a/0x1e5 [<ffffffff810bd070>] load_module+0x13bd/0x17d6 ... The problem is that lwtunnel_build_state is called with rtnl lock held preventing mpls_init from registering. Given the potential references held by the time lwtunnel_build_state it can not drop the rtnl lock to the load module. So, extract the module loading code from lwtunnel_build_state into a new function to validate the encap type. The new function is called while converting the user request into a fib_config which is well before any table, device or fib entries are examined. Fixes: 745041e2aaf1 ("lwtunnel: autoload of lwt modules") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18bnxt_en: Fix "uninitialized variable" bug in TPA code path.Michael Chan
In the TPA GRO code path, initialize the tcp_opt_len variable to 0 so that it will be correct for packets without TCP timestamps. The bug caused the SKB fields to be incorrectly set up for packets without TCP timestamps, leading to these packets being rejected by the stack. Reported-by: Andy Gospodarek <andrew.gospodarek@broadocm.com> Acked-by: Andy Gospodarek <andrew.gospodarek@broadocm.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18Merge tag 'upstream-4.10-rc5' of git://git.infradead.org/linux-ubifsLinus Torvalds
Pull UBIFS fixes from Richard Weinberger: "This contains fixes for UBIFS: - a long standing issue in UBIFS journal replay code - fallout from the merge window" * tag 'upstream-4.10-rc5' of git://git.infradead.org/linux-ubifs: ubifs: Fix journal replay wrt. xattr nodes ubifs: remove redundant checks for encryption key ubifs: allow encryption ioctls in compat mode ubifs: add CONFIG_BLOCK dependency for encryption ubifs: fix unencrypted journal write ubifs: ensure zero err is returned on successful return
2017-01-18net: phy: bcm63xx: Utilize correct config_intr functionDaniel Gonzalez Cabanelas
Commit a1cba5613edf ("net: phy: Add Broadcom phy library for common interfaces") make the BCM63xx PHY driver utilize bcm_phy_config_intr() which would appear to do the right thing, except that it does not write to the MII_BCM63XX_IR register but to MII_BCM54XX_ECR which is different. This would be causing invalid link parameters and events from being generated by the PHY interrupt. Fixes: a1cba5613edf ("net: phy: Add Broadcom phy library for common interfaces") Signed-off-by: Daniel Gonzalez Cabanelas <dgcbueu@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18xfs: fix xfs_mode_to_ftype() prototypeArnd Bergmann
A harmless warning just got introduced: fs/xfs/libxfs/xfs_dir2.h:40:8: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers] Removing the 'const' modifier avoids the warning and has no other effect. Fixes: 1fc4d33fed12 ("xfs: replace xfs_mode_to_ftype table with switch statement") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-18net: fix harmonize_features() vs NETIF_F_HIGHDMAEric Dumazet
Ashizuka reported a highmem oddity and sent a patch for freescale fec driver. But the problem root cause is that core networking stack must ensure no skb with highmem fragment is ever sent through a device that does not assert NETIF_F_HIGHDMA in its features. We need to call illegal_highdma() from harmonize_features() regardless of CSUM checks. Fixes: ec5f06156423 ("net: Kill link between CSUM and SG features.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin Shelar <pshelar@ovn.org> Reported-by: "Ashizuka, Yuusuke" <ashiduka@jp.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18Merge branch 'xen-netback-leaks'David S. Miller
Igor Druzhinin says: ==================== xen-netback: fix memory leaks on XenBus disconnect Just split the initial patch in two as proposed by Wei. Since the approach for locking netdev statistics is inconsistent (tends not to have any locking at all) accross the kernel we'd better to rely on our internal lock for this purpose. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18xen-netback: protect resource cleaning on XenBus disconnectIgor Druzhinin
vif->lock is used to protect statistics gathering agents from using the queue structure during cleaning. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18xen-netback: fix memory leaks on XenBus disconnectIgor Druzhinin
Eliminate memory leaks introduced several years ago by cleaning the queue resources which are allocated on XenBus connection event. Namely, queue structure array and pages used for IO rings. Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18netfilter: ipt_CLUSTERIP: fix build error without procfsArnd Bergmann
We can't access c->pde if CONFIG_PROC_FS is disabled: net/ipv4/netfilter/ipt_CLUSTERIP.c: In function 'clusterip_config_find_get': net/ipv4/netfilter/ipt_CLUSTERIP.c:147:9: error: 'struct clusterip_config' has no member named 'pde' This moves the check inside of another #ifdef. Fixes: 6c5d5cfbe3c5 ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-01-18Merge branch 'ethtool-set-channels-fix'David S. Miller
Tariq Toukan says: ==================== ethtool fix This patchset from Eran contains a fix to ethtool set_channels, where the call to get_channels with an uninitialized parameter might result in garbage fields. It also contains two followup changes in our mlx4/mlx5 Eth drivers. Series generated against net commit: 0faa9cb5b383 net sched actions: fix refcnt when GETing of action after bind ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18net/mlx5e: Remove unnecessary checks when setting num channelsEran Ben Elisha
Boundaries checks for the number of RX and TX should be checked by the caller and not in the driver. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18net/mlx4_en: Remove unnecessary checks when setting num channelsEran Ben Elisha
Boundaries checks for the number of RX, TX, other and combined channels should be checked by the caller and not in the driver. In addition, remove wrong memset on get channels as it overrides the cmd field in the requester struct. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18net: ethtool: Initialize buffer when querying device channel settingsEran Ben Elisha
Ethtool channels respond struct was uninitialized when querying device channel boundaries settings. As a result, unreported fields by the driver hold garbage. This may cause sending unsupported params to driver. Fixes: 8bf368620486 ('ethtool: ensure channel counts are within bounds ...') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> CC: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "A few ARM fixes: - fix a crash while performing TLB maintanence on early ARM SMP cores - blacklist Scorpion CPUs for hardware breakpoints - ARMs asm/types.h has been included as part of the UAPI due to the way the makefiles work, move it to uapi/asm/types.h to make it official - fix up ftrace syscall name matching" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8613/1: Fix the uaccess crash on PB11MPCore MAINTAINERS: update rmk's entries ARM: put types.h in uapi ARM: 8634/1: hw_breakpoint: blacklist Scorpion CPUs ARM: 8632/1: ftrace: fix syscall name matching
2017-01-18ARC: module: Fix !CONFIG_ARC_DW2_UNWIND buildsVineet Gupta
commit d65283f7b695b5 added mod->arch.secstr under CONFIG_ARC_DW2_UNWIND, but used it unconditionally which broke builds when the option was disabled. Fix that by adjusting the #ifdef guard. And while at it add a missing guard (for unwinder) in module.c as well Reported-by: Waldemar Brodkorb <wbx@openadk.org> Cc: stable@vger.kernel.org #4.9 Fixes: d65283f7b695b5 ("ARC: module: elide loop to save reference to .eh_frame") Tested-by: Anton Kolesov <akolesov@synopsys.com> Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com> [abrodkin: provided fixlet to Kconfig per failure in allnoconfig build] Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-18Merge branch 'smp-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP hotplug update from Thomas Gleixner: "This contains a trivial typo fix and an extension to the core code for dynamically allocating states in the prepare stage. The extension is necessary right now because we need a proper way to unbreak LTTNG, which iscurrently non functional due to the removal of the notifiers. Surely it's out of tree, but it's widely used by distros. The simple solution would have been to reserve a state for LTTNG, but I'm not fond about unused crap in the kernel and the dynamic range, which we admittedly should have done right away, allows us to remove quite some of the hardcoded states, i.e. those which have no ordering requirements. So doing the right thing now is better than having an smaller intermediate solution which needs to be reworked anyway" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Provide dynamic range for prepare stage perf/x86/amd/ibs: Fix typo after cleanup state names in cpu/hotplug
2017-01-18Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix a crash in the ARM-Exynos clocksource driver, triggered by CPU hotplug operations" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/exynos_mct: Clear interrupt when cpu is shut down
2017-01-18Merge branch 'rcu-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fixes from Ingo Molnar: "This fixes sporadic ACPI related hangs in synchronize_rcu() that were caused by the ACPI code mistakenly relying on an aspect of RCU that was neither promised to work nor reliable but which happened to work - until in v4.9 we changed the RCU implementation, which made the hangs more prominent. Since the mis-use of the RCU facility wasn't properly detected and prevented either, these fixes make the RCU side work reliably instead of working around the problem in the ACPI code. Hence the slightly larger diffstat that goes beyond the normal scope of RCU fixes in -rc kernels" * 'rcu-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Narrow early boot window of illegal synchronous grace periods rcu: Remove cond_resched() from Tiny synchronize_sched()
2017-01-18Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "An Intel PMU driver hotplug fix and three 'perf probe' tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Handle exclusive threadid correctly on CPU hotplug perf probe: Fix to probe on gcc generated functions in modules perf probe: Add error checks to offline probe post-processing perf probe: Fix to show correct locations for events on modules