summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-05-08mm, compaction: reorder fields in struct compact_controlVlastimil Babka
Patch series "try to reduce fragmenting fallbacks", v3. Last year, Johannes Weiner has reported a regression in page mobility grouping [1] and while the exact cause was not found, I've come up with some ways to improve it by reducing the number of allocations falling back to different migratetype and causing permanent fragmentation. The series was tested with mmtests stress-highalloc modified to do GFP_KERNEL order-4 allocations, on 4.9 with "mm, vmscan: fix zone balance check in prepare_kswapd_sleep" (without that, kcompactd indeed wasn't woken up) on UMA machine with 4GB memory. There were 5 repeats of each run, as the extfrag stats are quite volatile (note the stats below are sums, not averages, as it was less perl hacking for me). Success rate are the same, already high due to the low allocation order used, so I'm not including them. Compaction stats: (the patches are stacked, and I haven't measured the non-functional-changes patches separately) patch 1 patch 2 patch 3 patch 4 patch 7 patch 8 Compaction stalls 22449 24680 24846 19765 22059 17480 Compaction success 12971 14836 14608 10475 11632 8757 Compaction failures 9477 9843 10238 9290 10426 8722 Page migrate success 3109022 3370438 3312164 1695105 1608435 2111379 Page migrate failure 911588 1149065 1028264 1112675 1077251 1026367 Compaction pages isolated 7242983 8015530 7782467 4629063 4402787 5377665 Compaction migrate scanned 980838938 987367943 957690188 917647238 947155598 1018922197 Compaction free scanned 557926893 598946443 602236894 594024490 541169699 763651731 Compaction cost 10243 10578 10304 8286 8398 9440 Compaction stats are mostly within noise until patch 4, which decreases the number of compactions, and migrations. Part of that could be due to more pageblocks marked as unmovable, and async compaction skipping those. This changes a bit with patch 7, but not so much. Patch 8 increases free scanner stats and migrations, which comes from the changed termination criteria. Interestingly number of compactions decreases - probably the fully compacted pageblock satisfies multiple subsequent allocations, so it amortizes. Next comes the extfrag tracepoint, where "fragmenting" means that an allocation had to fallback to a pageblock of another migratetype which wasn't fully free (which is almost all of the fallbacks). I have locally added another tracepoint for "Page steal" into steal_suitable_fallback() which triggers in situations where we are allowed to do move_freepages_block(). If we decide to also do set_pageblock_migratetype(), it's "Pages steal with pageblock" with break down for which allocation migratetype we are stealing and from which fallback migratetype. The last part "due to counting" comes from patch 4 and counts the events where the counting of movable pages allowed us to change pageblock's migratetype, while the number of free pages alone wouldn't be enough to cross the threshold. patch 1 patch 2 patch 3 patch 4 patch 7 patch 8 Page alloc extfrag event 10155066 8522968 10164959 15622080 13727068 13140319 Extfrag fragmenting 10149231 8517025 10159040 15616925 13721391 13134792 Extfrag fragmenting for unmovable 159504 168500 184177 97835 70625 56948 Extfrag fragmenting unmovable placed with movable 153613 163549 172693 91740 64099 50917 Extfrag fragmenting unmovable placed with reclaim. 5891 4951 11484 6095 6526 6031 Extfrag fragmenting for reclaimable 4738 4829 6345 4822 5640 5378 Extfrag fragmenting reclaimable placed with movable 1836 1902 1851 1579 1739 1760 Extfrag fragmenting reclaimable placed with unmov. 2902 2927 4494 3243 3901 3618 Extfrag fragmenting for movable 9984989 8343696 9968518 15514268 13645126 13072466 Pages steal 179954 192291 210880 123254 94545 81486 Pages steal with pageblock 22153 18943 20154 33562 29969 33444 Pages steal with pageblock for unmovable 14350 12858 13256 20660 19003 20852 Pages steal with pageblock for unmovable from mov. 12812 11402 11683 19072 17467 19298 Pages steal with pageblock for unmovable from recl. 1538 1456 1573 1588 1536 1554 Pages steal with pageblock for movable 7114 5489 5965 11787 10012 11493 Pages steal with pageblock for movable from unmov. 6885 5291 5541 11179 9525 10885 Pages steal with pageblock for movable from recl. 229 198 424 608 487 608 Pages steal with pageblock for reclaimable 689 596 933 1115 954 1099 Pages steal with pageblock for reclaimable from unmov. 273 219 537 658 547 667 Pages steal with pageblock for reclaimable from mov. 416 377 396 457 407 432 Pages steal with pageblock due to counting 11834 10075 7530 ... for unmovable 8993 7381 4616 ... for movable 2792 2653 2851 ... for reclaimable 49 41 63 What we can see is that "Extfrag fragmenting for unmovable" and "... placed with movable" drops with almost each patch, which is good as we are polluting less movable pageblocks with unmovable pages. The most significant change is patch 4 with movable page counting. On the other hand it increases "Extfrag fragmenting for movable" by 50%. "Pages steal" drops though, so these movable allocation fallbacks find only small free pages and are not allowed to steal whole pageblocks back. "Pages steal with pageblock" raises, because the patch increases the chances of pageblock migratetype changes to happen. This affects all migratetypes. The summary is that patch 4 is not a clear win wrt these stats, but I believe that the tradeoff it makes is a good one. There's less pollution of movable pageblocks by unmovable allocations. There's less stealing between pageblock, and those that remain have higher chance of changing migratetype also the pageblock itself, so it should more faithfully reflect the migratetype of the pages within the pageblock. The increase of movable allocations falling back to unmovable pageblock might look dramatic, but those allocations can be migrated by compaction when needed, and other patches in the series (7-9) improve that aspect. Patches 7 and 8 continue the trend of reduced unmovable fallbacks and also reduce the impact on movable fallbacks from patch 4. [1] https://www.spinics.net/lists/linux-mm/msg114237.html This patch (of 8): While currently there are (mostly by accident) no holes in struct compact_control (on x86_64), but we are going to add more bool flags, so place them all together to the end of the structure. While at it, just order all fields from largest to smallest. Link: http://lkml.kernel.org/r/20170307131545.28577-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08net: mdio-mux: bcm-iproc: call mdiobus_free() in error pathJon Mason
If an error is encountered in mdio_mux_init(), the error path will call mdiobus_free(). Since mdiobus_register() has been called prior to mdio_mux_init(), the bus->state will not be MDIOBUS_UNREGISTERED. This causes a BUG_ON() in mdiobus_free(). To correct this issue, add an error path for mdio_mux_init() which calls mdiobus_unregister() prior to mdiobus_free(). Signed-off-by: Jon Mason <jon.mason@broadcom.com> Fixes: 98bc865a1ec8 ("net: mdio-mux: Add MDIO mux driver for iProc SoCs") Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08ide: don't call memcpy with the same source and destinationMikulas Patocka
The parisc architecture recently reimplemented the memcpy function and their reimplementation crashed when source and destination overlapped. The crash happened in the function ide_complete_cmd where memcpy is called with the same source and destination pointer. According to the C specification, memcpy behavior is undefined if the source and destination range overlaps. This patches fixes the undefined behavior. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08ide: use setup_timerGeliang Tang
Use setup_timer() instead of init_timer() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow controlGrygorii Strashko
When users set flow control using ethtool the bits are set properly in the CPGMAC_SL MACCONTROL register, but the FIFO depth in the respective Port n Maximum FIFO Blocks (Pn_MAX_BLKS) registers remains set to the minimum size reset value. When receive flow control is enabled on a port, the port's associated FIFO block allocation must be adjusted. The port RX allocation must increase to accommodate the flow control runout. The TRM recommends numbers of 5 or 6. Hence, apply required Port FIFO configuration to Pn_MAX_BLKS.Pn_TX_MAX_BLKS=0xF and Pn_MAX_BLKS.Pn_RX_MAX_BLKS=0x5 during interface initialization. Cc: Schuyler Patton <spatton@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notfWANG Cong
For each netns (except init_net), we initialize its null entry in 3 places: 1) The template itself, as we use kmemdup() 2) Code around dst_init_metrics() in ip6_route_net_init() 3) ip6_route_dev_notify(), which is supposed to initialize it after loopback registers Unfortunately the last one still happens in a wrong order because we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to net->loopback_dev's idev, thus we have to do that after we add idev to loopback. However, this notifier has priority == 0 same as ipv6_dev_notf, and ipv6_dev_notf is registered after ip6_route_dev_notifier so it is called actually after ip6_route_dev_notifier. This is similar to commit 2f460933f58e ("ipv6: initialize route null entry in addrconf_init()") which fixes init_net. Fix it by picking a smaller priority for ip6_route_dev_notifier. Also, we have to release the refcnt accordingly when unregistering loopback_dev because device exit functions are called before subsys exit functions. Acked-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08Merge tag 'mac80211-for-davem-2017-05-08' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A couple more fixes: * don't try to authenticate during reconfiguration, which causes drivers to get confused * fix a kernel-doc warning for a recently merged change * fix MU-MIMO group configuration (relevant only for monitor mode) * more rate flags fix: remove stray RX_ENC_FLAG_40MHZ * fix IBSS probe response allocation size ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08net: cdc_ncm: Fix TX zero paddingJim Baxter
The zero padding that is added to NTB's does not zero the memory correctly. This is because the skb_put modifies the value of skb_out->len which results in the memset command not setting any memory to zero as (ctx->tx_max - skb_out->len) == 0. I have resolved this by storing the size of the memory to be zeroed before the skb_put and using this in the memset call. Signed-off-by: Jim Baxter <jim_baxter@mentor.com> Reviewed-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - HYP mode stub supports kexec/kdump on 32-bit - improved PMU support - virtual interrupt controller performance improvements - support for userspace virtual interrupt controller (slower, but necessary for KVM on the weird Broadcom SoCs used by the Raspberry Pi 3) MIPS: - basic support for hardware virtualization (ImgTec P5600/P6600/I6400 and Cavium Octeon III) PPC: - in-kernel acceleration for VFIO s390: - support for guests without storage keys - adapter interruption suppression x86: - usual range of nVMX improvements, notably nested EPT support for accessed and dirty bits - emulation of CPL3 CPUID faulting generic: - first part of VCPU thread request API - kvm_stat improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) kvm: nVMX: Don't validate disabled secondary controls KVM: put back #ifndef CONFIG_S390 around kvm_vcpu_kick Revert "KVM: Support vCPU-based gfn->hva cache" tools/kvm: fix top level makefile KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING KVM: Documentation: remove VM mmap documentation kvm: nVMX: Remove superfluous VMX instruction fault checks KVM: x86: fix emulation of RSM and IRET instructions KVM: mark requests that need synchronization KVM: return if kvm_vcpu_wake_up() did wake up the VCPU KVM: add explicit barrier to kvm_vcpu_kick KVM: perform a wake_up in kvm_make_all_cpus_request KVM: mark requests that do not need a wakeup KVM: remove #ifndef CONFIG_S390 around kvm_vcpu_wake_up KVM: x86: always use kvm_make_request instead of set_bit KVM: add kvm_{test,clear}_request to replace {test,clear}_bit s390: kvm: Cpu model support for msa6, msa7 and msa8 KVM: x86: remove irq disablement around KVM_SET_CLOCK/KVM_GET_CLOCK kvm: better MWAIT emulation for guests KVM: x86: virtualize cpuid faulting ...
2017-05-08Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Lots of little things this time: - allow modules to be autoloaded according to the HWCAP feature bits (used primarily for crypto modules) - split module core and init PLT sections, since the core code and init code could be placed far apart, and the PLT sections need to be local to the code block. - three patches from Chris Brandt to allow Cortex-A9 L2 cache optimisations to be disabled where a SoC didn't wire up the out of band signals. - NoMMU compliance fixes, avoiding corruption of vector table which is not being used at this point, and avoiding possible register state corruption when switching mode. - fixmap memory attribute compliance update. - remove unnecessary locking from update_sections_early() - ftrace fix for DEBUG_RODATA with !FRAME_POINTER" * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8672/1: mm: remove tasklist locking from update_sections_early() ARM: 8671/1: V7M: Preserve registers across switch from Thread to Handler mode ARM: 8670/1: V7M: Do not corrupt vector table around v7m_invalidate_l1 call ARM: 8668/1: ftrace: Fix dynamic ftrace with DEBUG_RODATA and !FRAME_POINTER ARM: 8667/3: Fix memory attribute inconsistencies when using fixmap ARM: 8663/1: wire up HWCAP/HWCAP2 feature bits to the CPU modalias ARM: 8666/1: mm: dump: Add domain to output ARM: 8662/1: module: split core and init PLT sections ARM: 8661/1: dts: r7s72100: add l2 cache ARM: 8660/1: shmobile: r7s72100: Enable L2 cache ARM: 8659/1: l2c: allow CA9 optimizations to be disabled
2017-05-08Merge tag 'xtensa-20170507' of git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds
Pull Xtensa updates from Max Filippov: - clearly mark references to spilled register locations with SPILL_SLOT macros - clean up xtensa ptrace: use generic tracehooks, move internal kernel definitions from uapi/asm to asm, make locally-used functions static, fix code style and alignment - use command line parameters passed to ISS as kernel command line. * tag 'xtensa-20170507' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: clean up access to spilled registers locations xtensa: use generic tracehooks xtensa: move internal ptrace definitions from uapi/asm to asm xtensa: clean up xtensa/kernel/ptrace.c xtensa: drop unused fast_io_protect function xtensa: use ITLB_HIT_BIT instead of hardcoded number xtensa: ISS: update kernel command line in platform_setup xtensa: ISS: add argc/argv simcall definitions xtensa: ISS: cleanup setup.c
2017-05-08Merge tag 'for-f2fs-4.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've focused on enhancing performance with regards to block allocation, GC, and discard/in-place-update IO controls. There are a bunch of clean-ups as well as minor bug fixes. Enhancements: - disable heap-based allocation by default - issue small-sized discard commands by default - change the policy of data hotness for logging - distinguish IOs in terms of size and wbc type - start SSR earlier to avoid foreground GC - enhance data structures managing discard commands - enhance in-place update flow - add some more fault injection routines - secure one more xattr entry Bug fixes: - calculate victim cost for GC correctly - remain correct victim segment number for GC - race condition in nid allocator and initializer - stale pointer produced by atomic_writes - fix missing REQ_SYNC for flush commands - handle missing errors in more corner cases" * tag 'for-f2fs-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (111 commits) f2fs: fix a mount fail for wrong next_scan_nid f2fs: enhance scalability of trace macro f2fs: relocate inode_{,un}lock in F2FS_IOC_SETFLAGS f2fs: Make flush bios explicitely sync f2fs: show available_nids in f2fs/status f2fs: flush dirty nats periodically f2fs: introduce CP_TRIMMED_FLAG to avoid unneeded discard f2fs: allow cpc->reason to indicate more than one reason f2fs: release cp and dnode lock before IPU f2fs: shrink size of struct discard_cmd f2fs: don't hold cmd_lock during waiting discard command f2fs: nullify fio->encrypted_page for each writes f2fs: sanity check segment count f2fs: introduce valid_ipu_blkaddr to clean up f2fs: lookup extent cache first under IPU scenario f2fs: reconstruct code to write a data page f2fs: introduce __wait_discard_cmd f2fs: introduce __issue_discard_cmd f2fs: enable small discard by default f2fs: delay awaking discard thread ...
2017-05-08Merge branch 'stmmac-pci-fix-crash-on-Intel-Galileo-Gen2'David S. Miller
Andy Shevchenko says: ==================== stmmac: pci: Fix crash on Intel Galileo Gen2 Due to misconfiguration of PCI driver for Intel Quark the user will get a kernel crash: udhcpc: started, v1.26.2 stmmaceth 0000:00:14.6 eth0: device MAC address 98:4f:ee:05:ac:47 Generic PHY stmmac-a6:01: attached PHY driver [Generic PHY] (mii_bus:phy_addr=stmmac-a6:01, irq=-1) stmmaceth 0000:00:14.6 eth0: IEEE 1588-2008 Advanced Timestamp supported stmmaceth 0000:00:14.6 eth0: registered PTP clock IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready udhcpc: sending discover stmmaceth 0000:00:14.6 eth0: Link is Up - 100Mbps/Full - flow control off IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready BUG: unable to handle kernel NULL pointer dereference at (null) IP: stmmac_xmit+0xf1/0x1080 Fix this by adding necessary settings. P.S. I split fix to three patches according to what each of them adds. ==================== Tested-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08stmmac: pci: split out common_default_data() helperAndy Shevchenko
New helper is added in order to prevent misconfiguration happened for one of the platforms when configuration data is expanded. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08stmmac: pci: RX queue routing configurationAndy Shevchenko
The commit abe80fdc6ee6 ("net: stmmac: RX queue routing configuration") missed Intel Quark configuration. Append it here. Fixes: abe80fdc6ee6 ("net: stmmac: RX queue routing configuration") Cc: Joao Pinto <Joao.Pinto@synopsys.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08stmmac: pci: TX and RX queue priority configurationAndy Shevchenko
The commit a8f5102af2a7 ("net: stmmac: TX and RX queue priority configuration") missed Intel Quark configuration. Append it here. Fixes: a8f5102af2a7 ("net: stmmac: TX and RX queue priority configuration") Cc: Joao Pinto <Joao.Pinto@synopsys.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08stmmac: pci: set default number of rx and tx queuesAndy Shevchenko
The commit 26d6851fd24e ("net: stmmac: set default number of rx and tx queues in stmmac_pci") missed Intel Quark configuration. Append it here. Fixes: 26d6851fd24e ("net: stmmac: set default number of rx and tx queues in stmmac_pci") Cc: Joao Pinto <Joao.Pinto@synopsys.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08vti: check nla_put_* return valueHangbin Liu
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08bpf: don't let ldimm64 leak map addresses on unprivilegedDaniel Borkmann
The patch fixes two things at once: 1) It checks the env->allow_ptr_leaks and only prints the map address to the log if we have the privileges to do so, otherwise it just dumps 0 as we would when kptr_restrict is enabled on %pK. Given the latter is off by default and not every distro sets it, I don't want to rely on this, hence the 0 by default for unprivileged. 2) Printing of ldimm64 in the verifier log is currently broken in that we don't print the full immediate, but only the 32 bit part of the first insn part for ldimm64. Thus, fix this up as well; it's okay to access, since we verified all ldimm64 earlier already (including just constants) through replace_map_fd_with_map_ptr(). Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs") Fixes: cbd357008604 ("bpf: verifier (add ability to receive verification log)") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08yam: use memdup_userGeliang Tang
Use memdup_user() helper instead of open-coding to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08net/hippi/rrunner: use memdup_userGeliang Tang
Use memdup_user() helper instead of open-coding to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08cxgb4: avoid disabling FEC by defaultGanesh Goudar
Recent Chelsio firmware started using few port capablity bits to manage FEC and as driver was not aware of FEC changes those bits were zeroed, consequently disabling FEC. Avoid zeroing those bits and default to whatever the firmware tells us the Link is currently advertising. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08net: dsa: loop: Check for memory allocation failureChristophe Jaillet
If 'devm_kzalloc' fails, a NULL pointer will be dereferenced. Return -ENOMEM instead, as done for some other memory allocation just a few lines above. Fixes: 98cd1552ea27 ("net: dsa: Mock-up driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08bonding: check nla_put_be32 return valueHangbin Liu
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08bnxt_en: allocate enough space for ->ntp_fltr_bmapDan Carpenter
We have the number of longs, but we need to calculate the number of bytes required. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08qlge: Avoid reading past end of bufferKees Cook
Using memcpy() from a string that is shorter than the length copied means the destination buffer is being filled with arbitrary data from the kernel rodata segment. Instead, use strncpy() which will fill the trailing bytes with zeros. This was found with the future CONFIG_FORTIFY_SOURCE feature. Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08bna: ethtool: Avoid reading past end of bufferKees Cook
Using memcpy() from a string that is shorter than the length copied means the destination buffer is being filled with arbitrary data from the kernel rodata segment. Instead, use strncpy() which will fill the trailing bytes with zeros. This was found with the future CONFIG_FORTIFY_SOURCE feature. Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08bna: Avoid reading past end of bufferKees Cook
Using memcpy() from a string that is shorter than the length copied means the destination buffer is being filled with arbitrary data from the kernel rodata segment. Instead, use strncpy() which will fill the trailing bytes with zeros. This was found with the future CONFIG_FORTIFY_SOURCE feature. Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08Merge tag 'fscrypt_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt Pull fscrypt updates from Ted Ts'o: "Only bug fixes and cleanups for this merge window" * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: fscrypt: correct collision claim for digested names MAINTAINERS: fscrypt: update mailing list, patchwork, and git ext4: clean up ext4_match() and callers f2fs: switch to using fscrypt_match_name() ext4: switch to using fscrypt_match_name() fscrypt: introduce helper function for filename matching fscrypt: avoid collisions when presenting long encrypted filenames f2fs: check entire encrypted bigname when finding a dentry ubifs: check for consistent encryption contexts in ubifs_lookup() f2fs: sync f2fs_lookup() with ext4_lookup() ext4: remove "nokey" check from ext4_lookup() fscrypt: fix context consistency check when key(s) unavailable fscrypt: Remove __packed from fscrypt_policy fscrypt: Move key structure and constants to uapi fscrypt: remove fscrypt_symlink_data_len() fscrypt: remove unnecessary checks for NULL operations
2017-05-08vlan: Keep NETIF_F_HW_CSUM similar to other software devicesVlad Yasevich
Vlan devices, like all other software devices, enable NETIF_F_HW_CSUM feature. However, unlike all the othe other software devices, vlans will switch to using IP|IPV6_CSUM features, if the underlying devices uses them. In these situations, checksum offload features on the vlan device can't be controlled via ethtool. This patch makes vlans keep HW_CSUM feature if the underlying device supports checksum offloading. This makes vlan devices behave like other software devices, and restores control to the user. A side-effect is that some offload settings (typically UFO) may be enabled on the vlan device while being disabled on the HW. However, the GSO code will correctly process the packets. This actually results in slightly better raw throughput. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08tcp: make congestion control optionally skip slow start after idleWei Wang
Congestion control modules that want full control over congestion control behavior do not want the cwnd modifications controlled by the sysctl_tcp_slow_start_after_idle code path. So skip those code paths for CC modules that use the cong_control() API. As an example, those cwnd effects are not desired for the BBR congestion control algorithm. Fixes: c0402760f565 ("tcp: new CC hook to set sending rate with rate_sample in any CA state") Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08ipv4: restore rt->fi for reference countingWANG Cong
IPv4 dst could use fi->fib_metrics to store metrics but fib_info itself is refcnt'ed, so without taking a refcnt fi and fi->fib_metrics could be freed while dst metrics still points to it. This triggers use-after-free as reported by Andrey twice. This patch reverts commit 2860583fe840 ("ipv4: Kill rt->fi") to restore this reference counting. It is a quick fix for -net and -stable, for -net-next, as Eric suggested, we can consider doing reference counting for metrics itself instead of relying on fib_info. IPv6 is very different, it copies or steals the metrics from mx6_config in fib6_commit_metrics() so probably doesn't need a refcnt. Decnet has already done the refcnt'ing, see dn_fib_semantic_match(). Fixes: 2860583fe840 ("ipv4: Kill rt->fi") Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: - add GETFSMAP support - some performance improvements for very large file systems and for random write workloads into a preallocated file - bug fixes and cleanups. * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: cleanup write flags handling from jbd2_write_superblock() ext4: mark superblock writes synchronous for nobarrier mounts ext4: inherit encryption xattr before other xattrs ext4: replace BUG_ON with WARN_ONCE in ext4_end_bio() ext4: avoid unnecessary transaction stalls during writeback ext4: preload block group descriptors ext4: make ext4_shutdown() static ext4: support GETFSMAP ioctls vfs: add common GETFSMAP ioctl definitions ext4: evict inline data when writing to memory map ext4: remove ext4_xattr_check_entry() ext4: rename ext4_xattr_check_names() to ext4_xattr_check_entries() ext4: merge ext4_xattr_list() into ext4_listxattr() ext4: constify static data that is never modified ext4: trim return value and 'dir' argument from ext4_insert_dentry() jbd2: fix dbench4 performance regression for 'nobarrier' mounts jbd2: Fix lockdep splat with generic/270 test mm: retry writepages() on ENOMEM when doing an data integrity writeback
2017-05-08fix braino in generic_file_read_iter()Al Viro
Wrong sign of iov_iter_revert() argument. Unfortunately, slipped through the testing, since most of the time we don't do anything to the iterator afterwards and potential oops on walking the iter->iov too far backwards is too infrequent to be easily triggered. Add a sanity check in iov_iter_revert() to catch bugs like this one; fortunately, the same braino hadn't happened in other callers, but we'd better have a warning if such thing crops up. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-08aquantia: Fix "ethtool -S" crash when adapter down.Pavel Belous
This patch fixes the crash that happens when driver tries to collect statistics from already released "aq_vec" object. If adapter is in "down" state we still allow user to see statistics from HW. V2: fixed braces around "aq_vec_free". Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code") Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Tested-by: David Arcari <darcari@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08cfg80211: fix multi scheduled scan kernel-docJohannes Berg
Replace @results_wk with @report_results, which was missed in an earlier patch between revisions thereof. Fixes: b34939b98369 ("cfg80211: add request id to cfg80211_sched_scan_*() api") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-05-08mac80211: fix IBSS presp allocation sizeJohannes Berg
When VHT IBSS support was added, the size of the extra elements wasn't considered in ieee80211_ibss_build_presp(), which makes it possible that it would overrun the allocated buffer. Fix it by allocating the necessary space. Fixes: abcff6ef01f9 ("mac80211: add VHT support for IBSS") Reported-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-05-08nl80211: correctly validate MU-MIMO groupsJohannes Berg
Since groups 0 and 63 are invalid, we should check for those bits. Note that the 802.11 spec specifies the *bit* order, but the CPU doesn't care about bit order since it can't address bits, so it's always treating BIT(0) as the lowest bit within a byte. Reported-by: Jan Fuchs <jan.fuchs@lancom.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-05-08mac80211: bail out from prep_connection() if a reconfig is ongoingLuca Coelho
If ieee80211_hw_restart() is called during authentication, the authentication process will continue, causing the driver to be called in a wrong state. This ultimately causes an oops in the iwlwifi driver (at least). This fixes bugzilla 195299 partly. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195299 Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-05-08mac80211: properly remove RX_ENC_FLAG_40MHZJohannes Berg
Somehow I missed this in my RX rate cleanup series, causing some drivers to not report correct bandwidth since this flag isn't used by mac80211 anymore. Fix this, and make hwsim also report higher bandwidths appropriately. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-05-07Merge branch 'xtensa-sim-params' into xtensa-fixesMax Filippov
2017-05-06docs: complete bumping minimal GNU Make version to 3.81Max Filippov
Commit 37d69ee30808 ("docs: bump minimal GNU Make version to 3.81") changes one entry of GNU make version in the changes.rst, there's still one more entry saying that one need version 3.80. Fix that. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-06Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Various fixes for stable for CIFS/SMB3 especially for better interoperability for SMB3 to Macs. It also includes Pavel's improvements to SMB3 async i/o support (which is much faster now)" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: CIFS: add misssing SFM mapping for doublequote SMB3: Work around mount failure when using SMB3 dialect to Macs cifs: fix CIFS_IOC_GET_MNT_INFO oops CIFS: fix mapping of SFM_SPACE and SFM_PERIOD CIFS: fix oplock break deadlocks cifs: fix CIFS_ENUMERATE_SNAPSHOTS oops cifs: fix leak in FSCTL_ENUM_SNAPS response handling Set unicode flag on cifs echo request to avoid Mac error CIFS: Add asynchronous write support through kernel AIO CIFS: Add asynchronous read support through kernel AIO CIFS: Add asynchronous context to support kernel AIO cifs: fix IPv6 link local, with scope id, address parsing cifs: small underflow in cnvrtDosUnixTm()
2017-05-06Merge tag 'xfs-4.12-merge-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs updates from Darrick Wong: "Here are the XFS changes for 4.12. The big new feature for this release is the new space mapping ioctl that we've been discussing since LSF2016, but other than that most of the patches are larger bug fixes, memory corruption prevention, and other cleanups. Summary: - various code cleanups - introduce GETFSMAP ioctl - various refactoring - avoid dio reads past eof - fix memory corruption and other errors with fragmented directory blocks - fix accidental userspace memory corruptions - publish fs uuid in superblock - make fstrim terminatable - fix race between quotaoff and in-core inode creation - avoid use-after-free when finishing up w/ buffer heads - reserve enough space to handle bmap tree resizing during cow remap" * tag 'xfs-4.12-merge-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (53 commits) xfs: fix use-after-free in xfs_finish_page_writeback xfs: reserve enough blocks to handle btree splits when remapping xfs: wait on new inodes during quotaoff dquot release xfs: update ag iterator to support wait on new inodes xfs: support ability to wait on new inodes xfs: publish UUID in struct super_block xfs: Allow user to kill fstrim process xfs: better log intent item refcount checking xfs: fix up quotacheck buffer list error handling xfs: remove xfs_trans_ail_delete_bulk xfs: don't use bool values in trace buffers xfs: fix getfsmap userspace memory corruption while setting OF_LAST xfs: fix __user annotations for xfs_ioc_getfsmap xfs: corruption needs to respect endianess too! xfs: use NULL instead of 0 to initialize a pointer in xfs_ioc_getfsmap xfs: use NULL instead of 0 to initialize a pointer in xfs_getfsmap xfs: simplify validation of the unwritten extent bit xfs: remove unused values from xfs_exntst_t xfs: remove the unused XFS_MAXLINK_1 define xfs: more do_div cleanups ...
2017-05-06Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes and updates from Jens Axboe: "Some fixes and followup features/changes that should go in, in this merge window. This contains: - Two fixes for lightnvm from Javier, fixing problems in the new code merge previously in this merge window. - A fix from Jan for the backing device changes, fixing an issue in NFS that causes a failure to mount on certain setups. - A change from Christoph, cleaning up the blk-mq init and exit request paths. - Remove elevator_change(), which is now unused. From Bart. - A fix for queue operation invocation on a dead queue, from Bart. - A series fixing up mtip32xx for blk-mq scheduling, removing a bandaid we previously had in place for this. From me. - A regression fix for this series, fixing a case where we wait on workqueue flushing from an invalid (non-blocking) context. From me. - A fix/optimization from Ming, ensuring that we don't both quiesce and freeze a queue at the same time. - A fix from Peter on lock ordering for CPU hotplug. Not a real problem right now, but will be once the CPU hotplug rework goes in. - A series from Omar, cleaning up out blk-mq debugfs support, and adding support for exporting info from schedulers in debugfs as well. This is really useful in debugging stalls or livelocks. From Omar" * 'for-linus' of git://git.kernel.dk/linux-block: (28 commits) mq-deadline: add debugfs attributes kyber: add debugfs attributes blk-mq-debugfs: allow schedulers to register debugfs attributes blk-mq: untangle debugfs and sysfs blk-mq: move debugfs declarations to a separate header file blk-mq: Do not invoke queue operations on a dead queue blk-mq-debugfs: get rid of a bunch of boilerplate blk-mq-debugfs: rename hw queue directories from <n> to hctx<n> blk-mq-debugfs: don't open code strstrip() blk-mq-debugfs: error on long write to queue "state" file blk-mq-debugfs: clean up flag definitions blk-mq-debugfs: separate flags with | nfs: Fix bdi handling for cloned superblocks block/mq: Cure cpu hotplug lock inversion lightnvm: fix bad back free on error path lightnvm: create cmd before allocating request blk-mq: don't use sync workqueue flushing from drivers mtip32xx: convert internal commands to regular block infrastructure mtip32xx: cleanup internal tag assumptions block: don't call blk_mq_quiesce_queue() after queue is frozen ...
2017-05-06refcount: change EXPORT_SYMBOL markingsGreg Kroah-Hartman
Now that kref is using the refcount apis, the _GPL markings are getting exported to places that it previously wasn't. Now kref.h is GPLv2 licensed, so any non-GPL code using it better be talking to some lawyers, but changing api markings isn't considered "nice", so let's fix this up. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-06docs: bump minimal GNU Make version to 3.81Masahiro Yamada
Since 2014, you can't successfully build kernels with GNU Make version 3.80. Example errors: $ git describe v4.11 $ make --version | head -1 GNU Make 3.80 $ make defconfig HOSTCC scripts/basic/fixdep scripts/Makefile.host:135: *** missing separator. Stop. make: *** [defconfig] Error 2 $ make ARCH=arm64 help arch/arm64/Makefile:43: *** unterminated call to function `warning': missing `)'. Stop. $ make help >/dev/null ./Documentation/Makefile.sphinx:25: Extraneous text after `else' directive ./Documentation/Makefile.sphinx:31: *** only one `else' per conditional. Stop. make: *** [help] Error 2 The first breakage was introduced by commit c8589d1e9e01 ("kbuild: handle multi-objs dependency appropriately"). Since then (i.e. v3.18), GNU Make 3.80 has not been able to compile the kernel, but nobody has ever complained aboutt (or noticed) it. Even GNU Make 3.81 is more than 10 years old. It would not hurt to match the documentation with reality instead of fixing makefiles. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-06initramfs: avoid "label at end of compound statement" errorLinus Torvalds
Commit 17a9be317475 ("initramfs: Always do fput() and load modules after rootfs populate") introduced an error for the CONFIG_BLK_DEV_RAM=y case, because even though the code looks fine, the compiler really wants a statement after a label, or you'll get complaints: init/initramfs.c: In function 'populate_rootfs': init/initramfs.c:644:2: error: label at end of compound statement That commit moved the subsequent statements to outside the compound statement, leaving the label without any associated statements. Reported-by: Jörg Otte <jrg.otte@gmail.com> Fixes: 17a9be317475 ("initramfs: Always do fput() and load modules after rootfs populate") Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: stable@vger.kernel.org # if 17a9be317475 gets backported Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-05Merge tag 'devicetree-for-4.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DeviceTree updates from Rob Herring: - fix sparse warnings in drivers/of/ - add more overlay unittests - update dtc to v1.4.4-8-g756ffc4f52f6. This adds more checks on dts files such as unit-address formatting and stricter character sets for node and property names - add a common DT modalias function - move trivial-devices.txt up and out of i2c dir - ARM NVIC interrupt controller binding - vendor prefixes for Sensirion, Dioo, Nordic, ROHM - correct some binding file locations * tag 'devicetree-for-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (24 commits) of: fix sparse warnings in fdt, irq, reserved mem, and resolver code of: fix sparse warning in of_pci_range_parser_one of: fix sparse warnings in of_find_next_cache_node of/unittest: Missing unlocks on error of: fix uninitialized variable warning for overlay test of: fix unittest build without CONFIG_OF_OVERLAY of: Add unit tests for applying overlays of: per-file dtc compiler flags fpga: region: add missing DT documentation for config complete timeout of: Add vendor prefix for ROHM Semiconductor of: fix "/cpus" reference leak in of_numa_parse_cpu_nodes() of: Add vendor prefix for Nordic Semiconductor dt-bindings: arm,nvic: Binding for ARM NVIC interrupt controller on Cortex-M dtc: update warning settings for new bus and node/property name checks scripts/dtc: Update to upstream version v1.4.4-8-g756ffc4f52f6 scripts/dtc: automate getting dtc version and log in update script of: Add function for generating a DT modalias with a newline of: fix of_device_get_modalias returned length when truncating buffers Documentation: devicetree: move trivial-devices out of I2C realm dt-bindings: add vendor prefix for Dioo ..
2017-05-05Merge tag 'for-4.12/dm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - DM cache metadata fixes to short-circuit operations that require the metadata not be in 'fail_io' mode. Otherwise crashes are possible. - a DM cache fix to address the inability to adapt to continuous IO that happened to also reflect a changing working set (which required old blocks be demoted before the new working set could be promoted) - a DM cache smq policy cleanup that fell out from reviewing the above - fix the Kconfig help text for CONFIG_DM_INTEGRITY * tag 'for-4.12/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm cache metadata: fail operations if fail_io mode has been established dm integrity: improve the Kconfig help text for DM_INTEGRITY dm cache policy smq: cleanup free_target_met() and clean_target_met() dm cache policy smq: allow demotions to happen even during continuous IO