summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2017-10-20Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "Here is another set of bugfixes for ARM SoCs, mostly harmless: - a boot regression fix on ux500 - PCIe interrupts on NXP i.MX7 and on Marvell Armada 7K/8K were wired up wrong, in different ways - Armada XP support for large memory never worked - the socfpga reset controller now builds on 64-bit - minor device tree corrections on gemini, mvebu, r-pi 3, rockchip and at91" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: ux500: Fix regression while init PM domains ARM: dts: fix PCLK name on Gemini and MOXA ART arm64: dts: rockchip: fix typo in iommu nodes arm64: dts: rockchip: correct vqmmc voltage for rk3399 platforms ARM: dts: imx7d: Invert legacy PCI irq mapping bus: mbus: fix window size calculation for 4GB windows ARM: dts: at91: sama5d2: add ADC hw trigger edge type ARM: dts: at91: sama5d2_xplained: enable ADTRG pin ARM: dts: at91: at91-sama5d27_som1: fix PHY ID ARM: dts: bcm283x: Fix console path on RPi3 reset: socfpga: fix for 64-bit compilation ARM: dts: Fix I2C repeated start issue on Armada-38x arm64: dts: marvell: fix interrupt-map property for Armada CP110 PCIe controller arm64: dts: salvator-common: add 12V regulator to backlight ARM: dts: sun6i: Fix endpoint IDs in second display pipeline arm64: allwinner: a64: pine64: Use dcdc1 regulator for mmc0
2017-10-20bpf: avoid preempt enable/disable in sockmap using tcp_skb_cb regionJohn Fastabend
SK_SKB BPF programs are run from the socket/tcp context but early in the stack before much of the TCP metadata is needed in tcp_skb_cb. So we can use some unused fields to place BPF metadata needed for SK_SKB programs when implementing the redirect function. This allows us to drop the preempt disable logic. It does however require an API change so sk_redirect_map() has been updated to additionally provide ctx_ptr to skb. Note, we do however continue to disable/enable preemption around actual BPF program running to account for map updates. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20Merge branch 'fixes-v4.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull key handling fixes from James Morris: "This includes a fix for the capabilities code from Colin King, and a set of further fixes for the keys subsystem. From David: - Fix a bunch of places where kernel drivers may access revoked user-type keys and don't do it correctly. - Fix some ecryptfs bits. - Fix big_key to require CONFIG_CRYPTO. - Fix a couple of bugs in the asymmetric key type. - Fix a race between updating and finding negative keys. - Prevent add_key() from updating uninstantiated keys. - Make loading of key flags and expiry time atomic when not holding locks" * 'fixes-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: commoncap: move assignment of fs_ns to avoid null pointer dereference pkcs7: Prevent NULL pointer dereference, since sinfo is not always set. KEYS: load key flags and expiry time atomically in proc_keys_show() KEYS: Load key expiry time atomically in keyring_search_iterator() KEYS: load key flags and expiry time atomically in key_validate() KEYS: don't let add_key() update an uninstantiated key KEYS: Fix race between updating and finding a negative key KEYS: checking the input id parameters before finding asymmetric key KEYS: Fix the wrong index when checking the existence of second id security/keys: BIG_KEY requires CONFIG_CRYPTO ecryptfs: fix dereference of NULL user_key_payload fscrypt: fix dereference of NULL user_key_payload lib/digsig: fix dereference of NULL user_key_payload FS-Cache: fix dereference of NULL user_key_payload KEYS: encrypted: fix dereference of NULL user_key_payload
2017-10-19doc: Fix various RCU docbook comment-header problemsPaul E. McKenney
Because many of RCU's files have not been included into docbook, a number of errors have accumulated. This commit fixes them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-19membarrier: Provide register expedited private commandMathieu Desnoyers
This introduces a "register private expedited" membarrier command which allows eventual removal of important memory barrier constraints on the scheduler fast-paths. It changes how the "private expedited" membarrier command (new to 4.14) is used from user-space. This new command allows processes to register their intent to use the private expedited command. This affects how the expedited private command introduced in 4.14-rc is meant to be used, and should be merged before 4.14 final. Processes are now required to register before using MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM. This fixes a problem that arose when designing requested extensions to sys_membarrier() to allow JITs to efficiently flush old code from instruction caches. Several potential algorithms are much less painful if the user register intent to use this functionality early on, for example, before the process spawns the second thread. Registering at this time removes the need to interrupt each and every thread in that process at the first expedited sys_membarrier() system call. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-19Input: allow matching device IDs on property bitsDmitry Torokhov
Let's allow matching input devices on their property bits, both in-kernel and when generating module aliases. Tested-by: Roderick Colenbrander <roderick.colenbrander@sony.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-10-19Input: factor out and export input_device_id matching codeDmitry Torokhov
Factor out and export input_match_device_id() so that modules may use it. It will be needed by joydev to blacklist accelerometers in composite devices. Tested-by: Roderick Colenbrander <roderick.colenbrander@sony.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-10-19Merge tag 'sound-4.14-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "We've got slightly more fixes than wished, but heading to a good shape. Most of changes are about HD-audio fixes, one for a buggy code that went into 4.13, and another for avoiding a crash due to buggy BIOS. Apart from HD-audio, a sequencer core change that is only for UP config (which must be pretty rare nowadays), and a USB-audio quirk as usual" * tag 'sound-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Fix incorrect TLV callback check introduced during set_fs() removal ALSA: hda: Remove superfluous '-' added by printk conversion ALSA: hda: Abort capability probe at invalid register read ALSA: seq: Enable 'use' locking in all configurations ALSA: usb-audio: Add native DSD support for Pro-Ject Pre Box S2 Digital
2017-10-19Merge tag 'mvebu-fixes-4.14-2' of git://git.infradead.org/linux-mvebu into fixesArnd Bergmann
Pull "mvebu fixes for 4.14 (part 2)" from Gregory CLEMENT Two device tree related fixes: - One on Armada 38x using a other compatible string for I2C in order to cover an errata. - One for Armada 7K/8K fixing a typo on interrupt-map property for PCIe leading to fail PME and AER root port service initialization And the last one for the mbus fixing the window size calculation when it exceed 32bits * tag 'mvebu-fixes-4.14-2' of git://git.infradead.org/linux-mvebu: bus: mbus: fix window size calculation for 4GB windows ARM: dts: Fix I2C repeated start issue on Armada-38x arm64: dts: marvell: fix interrupt-map property for Armada CP110 PCIe controller
2017-10-19Merge commit 'tags/keys-fixes-20171018' into fixes-v4.14-rc5James Morris
2017-10-18ALSA: hda - Fix incorrect TLV callback check introduced during set_fs() removalTakashi Iwai
The commit 99b5c5bb9a54 ("ALSA: hda - Remove the use of set_fs()") converted the get_kctl_0dB_offset() call for killing set_fs() usage in HD-audio codec code. The conversion assumed that the TLV callback used in HD-audio code is only snd_hda_mixer_amp() and applies the TLV calculation locally. Although this assumption is correct, and all slave kctls are actually with that callback, the current code is still utterly buggy; it doesn't hit this condition and falls back to the next check. It's because the function gets called after adding slave kctls to vmaster. By assigning a slave kctl, the slave kctl object is faked inside vmaster code, and the whole kctl ops are overridden. Thus the callback op points to a different value from what we've assumed. More badly, as reported by the KERNEXEC and UDEREF features of PaX, the code flow turns into the unexpected pitfall. The next fallback check is SNDRV_CTL_ELEM_ACCESS_TLV_READ access bit, and this always hits for each kctl with TLV. Then it evaluates the callback function pointer wrongly as if it were a TLV array. Although currently its side-effect is fairly limited, this incorrect reference may lead to an unpleasant result. For addressing the regression, this patch introduces a new helper to vmaster code, snd_ctl_apply_vmaster_slaves(). This works similarly like the existing map_slaves() in hda_codec.c: it loops over the slave list of the given master, and applies the given function to each slave. Then the initializer function receives the right kctl object and we can compare the correct pointer instead of the faked one. Also, for catching the similar breakage in future, give an error message when the unexpected TLV callback is found and bail out immediately. Fixes: 99b5c5bb9a54 ("ALSA: hda - Remove the use of set_fs()") Reported-by: PaX Team <pageexec@freemail.hu> Cc: <stable@vger.kernel.org> # v4.13 Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-10-18KEYS: Fix race between updating and finding a negative keyDavid Howells
Consolidate KEY_FLAG_INSTANTIATED, KEY_FLAG_NEGATIVE and the rejection error into one field such that: (1) The instantiation state can be modified/read atomically. (2) The error can be accessed atomically with the state. (3) The error isn't stored unioned with the payload pointers. This deals with the problem that the state is spread over three different objects (two bits and a separate variable) and reading or updating them atomically isn't practical, given that not only can uninstantiated keys change into instantiated or rejected keys, but rejected keys can also turn into instantiated keys - and someone accessing the key might not be using any locking. The main side effect of this problem is that what was held in the payload may change, depending on the state. For instance, you might observe the key to be in the rejected state. You then read the cached error, but if the key semaphore wasn't locked, the key might've become instantiated between the two reads - and you might now have something in hand that isn't actually an error code. The state is now KEY_IS_UNINSTANTIATED, KEY_IS_POSITIVE or a negative error code if the key is negatively instantiated. The key_is_instantiated() function is replaced with key_is_positive() to avoid confusion as negative keys are also 'instantiated'. Additionally, barriering is included: (1) Order payload-set before state-set during instantiation. (2) Order state-read before payload-read when using the key. Further separate barriering is necessary if RCU is being used to access the payload content after reading the payload pointers. Fixes: 146aa8b1453b ("KEYS: Merge the type-specific data with the payload data") Cc: stable@vger.kernel.org # v4.4+ Reported-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Eric Biggers <ebiggers@google.com>
2017-10-18fq_impl: Properly enforce memory limitToke Høiland-Jørgensen
The fq structure would fail to properly enforce the memory limit in the case where the packet being enqueued was bigger than the packet being removed to bring the memory usage down. So keep dropping packets until the memory usage is back below the limit. Also, fix the statistics for memory limit violations. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-10-16tun: call dev_get_valid_name() before register_netdevice()Cong Wang
register_netdevice() could fail early when we have an invalid dev name, in which case ->ndo_uninit() is not called. For tun device, this is a problem because a timer etc. are already initialized and it expects ->ndo_uninit() to clean them up. We could move these initializations into a ->ndo_init() so that register_netdevice() knows better, however this is still complicated due to the logic in tun_detach(). Therefore, I choose to just call dev_get_valid_name() before register_netdevice(), which is quicker and much easier to audit. And for this specific case, it is already enough. Fixes: 96442e42429e ("tuntap: choose the txq based on rxq") Reported-by: Dmitry Alexeev <avekceeb@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-16Merge tag 'irqchip-4.14-3' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip updates for 4.14-rc5 from Marc Zyngier: - Fix unfortunate mistake in the GICv3 ITS binding example - Two fixes for the recently merged GICv4 support - GICv3 ITS 52bit PA fixes - Generic irqchip mask-ack fix, and its application to the tango irqchip
2017-10-15Merge tag 'char-misc-4.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are 4 patches to resolve some char/misc driver issues found these past weeks. One of them is a mei bugfix and another is a new mei device id. There is also a hyper-v fix for a reported issue, and a binder issue fix for a problem reported by a few people. All of these have been in my tree for a while, I don't know if linux-next is really testing much this month. But 0-day is happy with them :)" * tag 'char-misc-4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: binder: fix use-after-free in binder_transaction() Drivers: hv: vmbus: Fix bugs in rescind handling mei: me: add gemini lake devices id mei: always use domain runtime pm callbacks.
2017-10-14Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three fixes that address an SMP balancing performance regression" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Ensure load_balance() respects the active_mask sched/core: Address more wake_affine() regressions sched/core: Fix wake_affine() performance regression
2017-10-13kmemleak: clear stale pointers from task stacksKonstantin Khlebnikov
Kmemleak considers any pointers on task stacks as references. This patch clears newly allocated and reused vmap stacks. Link: http://lkml.kernel.org/r/150728990124.744199.8403409836394318684.stgit@buzz Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13fs/mpage.c: fix mpage_writepage() for pages with buffersMatthew Wilcox
When using FAT on a block device which supports rw_page, we can hit BUG_ON(!PageLocked(page)) in try_to_free_buffers(). This is because we call clean_buffers() after unlocking the page we've written. Introduce a new clean_page_buffers() which cleans all buffers associated with a page and call it from within bdev_write_page(). [akpm@linux-foundation.org: s/PAGE_SIZE/~0U/ per Linus and Matthew] Link: http://lkml.kernel.org/r/20171006211541.GA7409@bombadil.infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reported-by: Toshi Kani <toshi.kani@hpe.com> Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Tested-by: Toshi Kani <toshi.kani@hpe.com> Acked-by: Johannes Thumshirn <jthumshirn@suse.de> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13linux/kernel.h: add/correct kernel-doc notationRandy Dunlap
Add kernel-doc notation for some macros. Correct kernel-doc comments & typos for a few macros. Link: http://lkml.kernel.org/r/76fa1403-1511-be4c-e9c4-456b43edfad3@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13include/linux/of.h: provide of_n_{addr,size}_cells wrappers for !CONFIG_OFArnd Bergmann
The pci-rcar driver is enabled for compile tests, and this has shown that the driver cannot build without CONFIG_OF, following the inclusion of commit f8f2fe7355fb ("PCI: rcar: Use new OF interrupt mapping when possible"): drivers/pci/host/pcie-rcar.c: In function 'pci_dma_range_parser_init': drivers/pci/host/pcie-rcar.c:1039:2: error: implicit declaration of function 'of_n_addr_cells' [-Werror=implicit-function-declaration] parser->pna = of_n_addr_cells(node); ^ As pointed out by Ben Dooks and Geert Uytterhoeven, this is actually supposed to build fine, which we can achieve if we make the declaration of of_irq_parse_and_map_pci conditional on CONFIG_OF and provide an empty inline function otherwise, as we do for a lot of other of interfaces. This lets us build the rcar_pci driver again without CONFIG_OF for build testing. All platforms using this driver select OF, so this doesn't change anything for the users. [akpm@linux-foundation.org: be consistent with surrounding code] Link: http://lkml.kernel.org/r/20170911200805.3363318-1-arnd@arndb.de Fixes: c25da4778803 ("PCI: rcar: Add Renesas R-Car PCIe driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Magnus Damm <damm@opensource.se> Cc: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-13genirq: generic chip: remove irq_gc_mask_disable_reg_and_ack()Doug Berger
Any usage of the irq_gc_mask_disable_reg_and_ack() function has been replaced with the desired functionality. The incorrect and ambiguously named function is removed here to prevent accidental misuse. Signed-off-by: Doug Berger <opendmb@gmail.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-10-13genirq: generic chip: Add irq_gc_mask_disable_and_ack_set()Doug Berger
The irq_gc_mask_disable_reg_and_ack() function name implies that it provides the combined functions of irq_gc_mask_disable_reg() and irq_gc_ack(). However, the implementation does not actually do that since it writes the mask instead of the disable register. It also does not maintain the mask cache which makes it inappropriate to use with other masking functions. In addition, commit 659fb32d1b67 ("genirq: replace irq_gc_ack() with {set,clr}_bit variants (fwd)") effectively renamed irq_gc_ack() to irq_gc_ack_set_bit() so this function probably should have also been renamed at that time. The generic chip code currently provides three functions for use with the irq_mask member of the irq_chip structure and two functions for use with the irq_ack member of the irq_chip structure. These functions could be combined into six functions for use with the irq_mask_ack member of the irq_chip structure. However, since only one of the combinations is currently used, only the function irq_gc_mask_disable_and_ack_set() is added by this commit. The '_reg' and '_bit' portions of the base function name were left out of the new combined function name in an attempt to keep the function name length manageable with the 80 character source code line length while still allowing the distinct aspects of each combination to be captured by the name. If other combinations are desired in the future please add them to the irq generic chip library at that time. Signed-off-by: Doug Berger <opendmb@gmail.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-10-13irqchip/gic-v3-its: Add missing changes to support 52bit physical addressShanker Donthineni
The current ITS driver works fine as long as normal memory and GICR regions are located within the lower 48bit (>=0 && <2^48) physical address space. Some of the registers GICR_PEND/PROP, GICR_VPEND/VPROP and GITS_CBASER are handled properly but not all when configuring the hardware with 52bit physical address. This patch does the following changes to support 52bit PA. -Handle 52bit PA in GITS_BASERn. -Fix ITT_addr width to 52bits, bits[51:8]. -Fix RDbase width to 52bits, bits[51:16]. -Fix VPT_addr width to 52bits, bits[51:16]. Definition of the GITS_BASERn register when ITS PageSize is 64KB: -Bits[47:16] of the register provide bits[47:16] of the table PA. -Bits[15:12] of the register provide bits[51:48] of the table PA. -Bits[15:00] of the base physical address are 0. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-10-12Merge tag 'sound-4.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "It's been a busy week for defending the attacks from fuzzer people. This contains various USB-audio driver fixes and sequencer core fixes spotted by syzkaller and other fuzzer, as well as one quirk for a Plantronics USB audio device" * tag 'sound-4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: caiaq: Fix stray URB at probe error path ALSA: seq: Fix use-after-free at creating a port ALSA: usb-audio: Kill stray URB at exiting ALSA: line6: Fix leftover URB at error-path during probe ALSA: line6: Fix NULL dereference at podhd_disconnect() ALSA: line6: Fix missing initialization before error path ALSA: seq: Fix copy_from_user() call inside lock ALSA: usb-audio: Add sample rate quirk for Plantronics P610
2017-10-12bus: mbus: fix window size calculation for 4GB windowsJan Luebbe
At least the Armada XP SoC supports 4GB on a single DRAM window. Because the size register values contain the actual size - 1, the MSB is set in that case. For example, the SDRAM window's control register's value is 0xffffffe1 for 4GB (bits 31 to 24 contain the size). The MBUS driver reads back each window's size from registers and calculates the actual size as (control_reg | ~DDR_SIZE_MASK) + 1, which overflows for 32 bit values, resulting in other miscalculations further on (a bad RAM window for the CESA crypto engine calculated by mvebu_mbus_setup_cpu_target_nooverlap() in my case). This patch changes the type in 'struct mbus_dram_window' from u32 to u64, which allows us to keep using the same register calculation code in most MBUS-using drivers (which calculate ->size - 1 again). Fixes: fddddb52a6c4 ("bus: introduce an Marvell EBU MBus driver") CC: stable@vger.kernel.org Signed-off-by: Jan Luebbe <jlu@pengutronix.de> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2017-10-10sched/core: Fix wake_affine() performance regressionPeter Zijlstra
Eric reported a sysbench regression against commit: 3fed382b46ba ("sched/numa: Implement NUMA node level wake_affine()") Similarly, Rik was looking at the NAS-lu.C benchmark, which regressed against his v3.10 enterprise kernel. PRE (current tip/master): ivb-ep sysbench: 2: [30 secs] transactions: 64110 (2136.94 per sec.) 5: [30 secs] transactions: 143644 (4787.99 per sec.) 10: [30 secs] transactions: 274298 (9142.93 per sec.) 20: [30 secs] transactions: 418683 (13955.45 per sec.) 40: [30 secs] transactions: 320731 (10690.15 per sec.) 80: [30 secs] transactions: 355096 (11834.28 per sec.) hsw-ex NAS: OMP_PROC_BIND/lu.C.x_threads_144_run_1.log: Time in seconds = 18.01 OMP_PROC_BIND/lu.C.x_threads_144_run_2.log: Time in seconds = 17.89 OMP_PROC_BIND/lu.C.x_threads_144_run_3.log: Time in seconds = 17.93 lu.C.x_threads_144_run_1.log: Time in seconds = 434.68 lu.C.x_threads_144_run_2.log: Time in seconds = 405.36 lu.C.x_threads_144_run_3.log: Time in seconds = 433.83 POST (+patch): ivb-ep sysbench: 2: [30 secs] transactions: 64494 (2149.75 per sec.) 5: [30 secs] transactions: 145114 (4836.99 per sec.) 10: [30 secs] transactions: 278311 (9276.69 per sec.) 20: [30 secs] transactions: 437169 (14571.60 per sec.) 40: [30 secs] transactions: 669837 (22326.73 per sec.) 80: [30 secs] transactions: 631739 (21055.88 per sec.) hsw-ex NAS: lu.C.x_threads_144_run_1.log: Time in seconds = 23.36 lu.C.x_threads_144_run_2.log: Time in seconds = 22.96 lu.C.x_threads_144_run_3.log: Time in seconds = 22.52 This patch takes out all the shiny wake_affine() stuff and goes back to utter basics. Between the two CPUs involved with the wakeup (the CPU doing the wakeup and the CPU we ran on previously) pick the CPU we can run on _now_. This restores much of the regressions against the older kernels, but leaves some ground in the overloaded case. The default-enabled WA_WEIGHT (which will be introduced in the next patch) is an attempt to address the overloaded situation. Reported-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jinpuwang@gmail.com Cc: vcaputo@pengaru.com Fixes: 3fed382b46ba ("sched/numa: Implement NUMA node level wake_affine()") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix object leak on IPSEC offload failure, from Steffen Klassert. 2) Fix range checks in ipset address range addition operations, from Jozsef Kadlecsik. 3) Fix pernet ops unregistration order in ipset, from Florian Westphal. 4) Add missing netlink attribute policy for nl80211 packet pattern attrs, from Peng Xu. 5) Fix PPP device destruction race, from Guillaume Nault. 6) Write marks get lost when BPF verifier processes R1=R2 register assignments, causing incorrect liveness information and less state pruning. Fix from Alexei Starovoitov. 7) Fix blockhole routes so that they are marked dead and therefore not cached in sockets, otherwise IPSEC stops working. From Steffen Klassert. 8) Fix broadcast handling of UDP socket early demux, from Paolo Abeni. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits) cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan net: thunderx: mark expected switch fall-throughs in nicvf_main() udp: fix bcast packet reception netlink: do not set cb_running if dump's start() errs ipv4: Fix traffic triggered IPsec connections. ipv6: Fix traffic triggered IPsec connections. ixgbe: incorrect XDP ring accounting in ethtool tx_frame param net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag Revert commit 1a8b6d76dc5b ("net:add one common config...") ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register ixgbe: Return error when getting PHY address if PHY access is not supported netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1' netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook tipc: Unclone message at secondary destination lookup tipc: correct initialization of skb list gso: fix payload length when gso_size is zero mlxsw: spectrum_router: Avoid expensive lookup during route removal bpf: fix liveness marking doc: Fix typo "8023.ad" in bonding documentation ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real ...
2017-10-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree, they are: 1) Fix packet drops due to incorrect ECN handling in IPVS, from Vadim Fedorenko. 2) Fix splat with mark restoration in xt_socket with non-full-sock, patch from Subash Abhinov Kasiviswanathan. 3) ipset bogusly bails out when adding IPv4 range containing more than 2^31 addresses, from Jozsef Kadlecsik. 4) Incorrect pernet unregistration order in ipset, from Florian Westphal. 5) Races between dump and swap in ipset results in BUG_ON splats, from Ross Lagerwall. 6) Fix chain renames in nf_tables, from JingPiao Chen. 7) Fix race in pernet codepath with ebtables table registration, from Artem Savkov. 8) Memory leak in error path in set name allocation in nf_tables, patch from Arvind Yadav. 9) Don't dump chain counters if they are not available, this fixes a crash when listing the ruleset. 10) Fix out of bound memory read in strlcpy() in x_tables compat code, from Eric Dumazet. 11) Make sure we only process TCP packets in SYNPROXY hooks, patch from Lin Zhang. 12) Cannot load rules incrementally anymore after xt_bpf with pinned objects, added in revision 1. From Shmulik Ladkani. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'Shmulik Ladkani
Commit 2c16d6033264 ("netfilter: xt_bpf: support ebpf") introduced support for attaching an eBPF object by an fd, with the 'bpf_mt_check_v1' ABI expecting the '.fd' to be specified upon each IPT_SO_SET_REPLACE call. However this breaks subsequent iptables calls: # iptables -A INPUT -m bpf --object-pinned /sys/fs/bpf/xxx -j ACCEPT # iptables -A INPUT -s 5.6.7.8 -j ACCEPT iptables: Invalid argument. Run `dmesg' for more information. That's because iptables works by loading existing rules using IPT_SO_GET_ENTRIES to userspace, then issuing IPT_SO_SET_REPLACE with the replacement set. However, the loaded 'xt_bpf_info_v1' has an arbitrary '.fd' number (from the initial "iptables -m bpf" invocation) - so when 2nd invocation occurs, userspace passes a bogus fd number, which leads to 'bpf_mt_check_v1' to fail. One suggested solution [1] was to hack iptables userspace, to perform a "entries fixup" immediatley after IPT_SO_GET_ENTRIES, by opening a new, process-local fd per every 'xt_bpf_info_v1' entry seen. However, in [2] both Pablo Neira Ayuso and Willem de Bruijn suggested to depricate the xt_bpf_info_v1 ABI dealing with pinned ebpf objects. This fix changes the XT_BPF_MODE_FD_PINNED behavior to ignore the given '.fd' and instead perform an in-kernel lookup for the bpf object given the provided '.path'. It also defines an alias for the XT_BPF_MODE_FD_PINNED mode, named XT_BPF_MODE_PATH_PINNED, to better reflect the fact that the user is expected to provide the path of the pinned object. Existing XT_BPF_MODE_FD_ELF behavior (non-pinned fd mode) is preserved. References: [1] https://marc.info/?l=netfilter-devel&m=150564724607440&w=2 [2] https://marc.info/?l=netfilter-devel&m=150575727129880&w=2 Reported-by: Rafael Buchbinder <rafi@rbk.ms> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-10-09ALSA: seq: Fix copy_from_user() call inside lockTakashi Iwai
The event handler in the virmidi sequencer code takes a read-lock for the linked list traverse, while it's calling snd_seq_dump_var_event() in the loop. The latter function may expand the user-space data depending on the event type. It eventually invokes copy_from_user(), which might be a potential dead-lock. The sequencer core guarantees that the user-space data is passed only with atomic=0 argument, but snd_virmidi_dev_receive_event() ignores it and always takes read-lock(). For avoiding the problem above, this patch introduces rwsem for non-atomic case, while keeping rwlock for atomic case. Also while we're at it: the superfluous irq flags is dropped in snd_virmidi_input_open(). Reported-by: Jia-Ju Bai <baijiaju1990@163.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-10-07Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: - a couple of serious fixes: use after free and blacklist for WRITE SAME - one error leg fix: write_pending failure - one user experience problem: do not override max_sectors_kb - one minor unused function removal * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ibmvscsis: Fix write_pending failure path scsi: libiscsi: Remove iscsi_destroy_session scsi: libiscsi: Fix use-after-free race during iscsi_session_teardown scsi: sd: Do not override max_sectors_kb sysfs setting scsi: sd: Implement blacklist option for WRITE SAME w/ UNMAP
2017-10-07Merge tag 'mmc-v4.14-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Fix driver strength selection when selecting hs400es - Delete bounce buffer handling: This change fixes a problem related to how bounce buffers are being allocated. However, instead of trying to fix that, let's just remove the mmc bounce buffer code altogether, as it has practically no use. MMC host: - meson-gx: A couple of fixes related to clock/phase/tuning - sdhci-xenon: Fix clock resource by adding an optional bus clock" * tag 'mmc-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-xenon: Fix clock resource by adding an optional bus clock mmc: meson-gx: include tx phase in the tuning process mmc: meson-gx: fix rx phase reset mmc: meson-gx: make sure the clock is rounded down mmc: Delete bounce buffer handling mmc: core: add driver strength selection when selecting hs400es
2017-10-06Merge branch 'core-watchdog-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull watchddog clean-up and fixes from Thomas Gleixner: "The watchdog (hard/softlockup detector) code is pretty much broken in its current state. The patch series addresses this by removing all duct tape and refactoring it into a workable state. The reasons why I ask for inclusion that late in the cycle are: 1) The code causes lockdep splats vs. hotplug locking which get reported over and over. Unfortunately there is no easy fix. 2) The risk of breakage is minimal because it's already broken 3) As 4.14 is a long term stable kernel, I prefer to have working watchdog code in that and the lockdep issues resolved. I wouldn't ask you to pull if 4.14 wouldn't be a LTS kernel or if the solution would be easy to backport. 4) The series was around before the merge window opened, but then got delayed due to the UP failure caused by the for_each_cpu() surprise which we discussed recently. Changes vs. V1: - Addressed your review points - Addressed the warning in the powerpc code which was discovered late - Changed two function names which made sense up to a certain point in the series. Now they match what they do in the end. - Fixed a 'unused variable' warning, which got not detected by the intel robot. I triggered it when trying all possible related config combinations manually. Randconfig testing seems not random enough. The changes have been tested by and reviewed by Don Zickus and tested and acked by Micheal Ellerman for powerpc" * 'core-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) watchdog/core: Put softlockup_threads_initialized under ifdef guard watchdog/core: Rename some softlockup_* functions powerpc/watchdog: Make use of watchdog_nmi_probe() watchdog/core, powerpc: Lock cpus across reconfiguration watchdog/core, powerpc: Replace watchdog_nmi_reconfigure() watchdog/hardlockup/perf: Fix spelling mistake: "permanetely" -> "permanently" watchdog/hardlockup/perf: Cure UP damage watchdog/hardlockup: Clean up hotplug locking mess watchdog/hardlockup/perf: Simplify deferred event destroy watchdog/hardlockup/perf: Use new perf CPU enable mechanism watchdog/hardlockup/perf: Implement CPU enable replacement watchdog/hardlockup/perf: Implement init time detection of perf watchdog/hardlockup/perf: Implement init time perf validation watchdog/core: Get rid of the racy update loop watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage watchdog/sysctl: Clean up sysctl variable name space watchdog/sysctl: Get rid of the #ifdeffery watchdog/core: Clean up header mess watchdog/core: Further simplify sysctl handling watchdog/core: Get rid of the thread teardown/setup dance ...
2017-10-05Merge tag 'for-4.14/dm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - a stable fix for the alignment of the event number reported at the end of the 'DM_LIST_DEVICES' ioctl. - a couple stable fixes for the DM crypt target. - a DM raid health status reporting fix. * tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm raid: fix incorrect status output at the end of a "recover" process dm crypt: reject sector_size feature if device length is not aligned to it dm crypt: fix memory leak in crypt_ctr_cipher_old() dm ioctl: fix alignment of event number in the device list
2017-10-05Merge tag 'sound-4.14-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes, mostly with stable ones: - X32 ABI fix for PCM; likely not so many people suffer from it, but still better to fix - Two minor kernel warning fixes on USB audio devices spotted by syzkaller - Regression fix of echoaudio due to its inconsistent dimension - Fix for HBR support on Intel DP audio, on some recent chips - USB-audio quirk for yet another Plantronics devices - Fix for potential double-fetch in ASIHPI FIFO queue" * tag 'sound-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usx2y: Suppress kernel warning at page allocation failures Revert "ALSA: echoaudio: purge contradictions between dimension matrix members and total number of members" ALSA: usb-audio: Check out-of-bounds access by corrupted buffer descriptor ALSA: pcm: Fix structure definition for X32 ABI ALSA: usb-audio: Add sample rate quirk for Plantronics C310/C520-M ALSA: hda - program ICT bits to support HBR audio ALSA: asihpi: fix a potential double-fetch bug when copying puhm ALSA: compress: Remove unused variable
2017-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Check iwlwifi 9000 reorder buffer out-of-space condition properly, from Sara Sharon. 2) Fix RCU splat in qualcomm rmnet driver, from Subash Abhinov Kasiviswanathan. 3) Fix session and tunnel release races in l2tp, from Guillaume Nault and Sabrina Dubroca. 4) Fix endian bug in sctp_diag_dump(), from Dan Carpenter. 5) Several mlx5 driver fixes from the Mellanox folks (max flow counters cap check, invalid memory access in IPoIB support, etc.) 6) tun_get_user() should bail if skb->len is zero, from Alexander Potapenko. 7) Fix RCU lookups in inetpeer, from Eric Dumazet. 8) Fix locking in packet_do_bund(). 9) Handle cb->start() error properly in netlink dump code, from Jason A. Donenfeld. 10) Handle multicast properly in UDP socket early demux code. From Paolo Abeni. 11) Several erspan bug fixes in ip_gre, from Xin Long. 12) Fix use-after-free in socket filter code, in order to handle the fact that listener lock is no longer taken during the three-way TCP handshake. From Eric Dumazet. 13) Fix infoleak in RTM_GETSTATS, from Nikolay Aleksandrov. 14) Fix tail call generation in x86-64 BPF JIT, from Alexei Starovoitov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits) net: 8021q: skip packets if the vlan is down bpf: fix bpf_tail_call() x64 JIT net: stmmac: dwmac-rk: Add RK3128 GMAC support rndis_host: support Novatel Verizon USB730L net: rtnetlink: fix info leak in RTM_GETSTATS call socket, bpf: fix possible use after free mlxsw: spectrum_router: Track RIF of IPIP next hops mlxsw: spectrum_router: Move VRF refcounting net: hns3: Fix an error handling path in 'hclge_rss_init_hw()' net: mvpp2: Fix clock resource by adding an optional bus clock r8152: add Linksys USB3GIGV1 id l2tp: fix l2tp_eth module loading ip_gre: erspan device should keep dst ip_gre: set tunnel hlen properly in erspan_tunnel_init ip_gre: check packet length and mtu correctly in erspan_xmit ip_gre: get key from session_id correctly in erspan_rcv tipc: use only positive error codes in messages ppp: fix __percpu annotation udp: perform source validation for mcast early demux IPv4: early demux can return an error code ...
2017-10-04Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Our first batch of fixes this release cycle, unfortunately a bit noisier than usual. Two major groups stand out: - Some pinctril dts/dtsi changes for stm32 due to a new driver being merged during the merge window, and this aligns the DT contents between the old format and the new. This could arguably be moved to the next merge window but it also seemed relatively harmless to include now. - Amlogic/meson had driver changes merged that required devicetree changes to avoid functional/performance regressions. I've already asked them to be more careful about this going forward, and making sure drivers are compatible with older DTs when they make these kind of changes. The platform is actively being upstreamed so there's a few things in flight, we've seen this happen before and sometimes it's hard to catch in time. Besides that there is the usual mix of minor fixes" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (33 commits) ARM: dts: stm32: use right pinctrl compatible for stm32f469 ARM: dts: stm32: Fix STMPE1600 binding on stm32429i-eval board ARM: defconfig: update Gemini defconfig ARM: defconfig: FRAMEBUFFER_CONSOLE can no longer be =m arm64: dts: rockchip: add the grf clk for dw-mipi-dsi on rk3399 reset: Restrict RESET_HSDK to ARC_SOC_HSDK or COMPILE_TEST ARM: dts: da850-evm: add serial and ethernet aliases ARM: dts: am43xx-epos-evm: Remove extra CPSW EMAC entry ARM: dts: am33xx: Add spi alias to match SOC schematics ARM: OMAP2+: hsmmc: fix logic to call either omap_hsmmc_init or omap_hsmmc_late_init but not both ARM: dts: dra7: Set a default parent to mcasp3_ahclkx_mux ARM: OMAP2+: dra7xx: Set OPT_CLKS_IN_RESET flag for gpio1 ARM: dts: nokia n900: drop unneeded/undocumented parts of the dts arm64: dts: rockchip: Correct MIPI DPHY PLL clock on rk3399 arm64: dt marvell: Fix AP806 system controller size MAINTAINERS: add Macchiatobin maintainers entry ARC: reset: remove the misleading v1 suffix all over ARC: reset: add missing DT binding documentation for HSDKv1 reset driver ARC: reset: Only build on archs that have IOMEM ARM: at91: Replace uses of virt_to_phys with __pa_symbol ...
2017-10-04Drivers: hv: vmbus: Fix bugs in rescind handlingK. Y. Srinivasan
This patch addresses the following bugs in the current rescind handling code: 1. Fixes a race condition where we may be invoking hv_process_channel_removal() on an already freed channel. 2. Prevents indefinite wait when rescinding sub-channels by correctly setting the probe_complete state. I would like to thank Dexuan for patiently reviewing earlier versions of this patch and identifying many of the issues fixed here. Greg, please apply this to 4.14-final. Fixes: '54a66265d675 ("Drivers: hv: vmbus: Fix rescind handling")' Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Cc: stable@vger.kernel.org # (4.13 and above) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-04powerpc/watchdog: Make use of watchdog_nmi_probe()Thomas Gleixner
The rework of the core hotplug code triggers the WARN_ON in start_wd_cpu() on powerpc because it is called multiple times for the boot CPU. The first call is via: start_wd_on_cpu+0x80/0x2f0 watchdog_nmi_reconfigure+0x124/0x170 softlockup_reconfigure_threads+0x110/0x130 lockup_detector_init+0xbc/0xe0 kernel_init_freeable+0x18c/0x37c kernel_init+0x2c/0x160 ret_from_kernel_thread+0x5c/0xbc And then again via the CPU hotplug registration: start_wd_on_cpu+0x80/0x2f0 cpuhp_invoke_callback+0x194/0x620 cpuhp_thread_fun+0x7c/0x1b0 smpboot_thread_fn+0x290/0x2a0 kthread+0x168/0x1b0 ret_from_kernel_thread+0x5c/0xbc This can be avoided by setting up the cpu hotplug state with nocalls and move the initialization to the watchdog_nmi_probe() function. That initializes the hotplug callbacks without invoking the callback and the following core initialization function then configures the watchdog for the online CPUs (in this case CPU0) via softlockup_reconfigure_threads(). Reported-and-tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org
2017-10-04watchdog/core, powerpc: Replace watchdog_nmi_reconfigure()Thomas Gleixner
The recent cleanup of the watchdog code split watchdog_nmi_reconfigure() into two stages. One to stop the NMI and one to restart it after reconfiguration. That was done by adding a boolean 'run' argument to the code, which is functionally correct but not necessarily a piece of art. Replace it by two explicit functions: watchdog_nmi_stop() and watchdog_nmi_start(). Fixes: 6592ad2fcc8f ("watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage") Requested-by: Linus 'Nursing his pet-peeve' Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Thomas 'Mopping up garbage' Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1710021957480.2114@nanos
2017-10-04mmc: Delete bounce buffer handlingLinus Walleij
In may, Steven sent a patch deleting the bounce buffer handling and the CONFIG_MMC_BLOCK_BOUNCE option. I chose the less invasive path of making it a runtime config option, and we merged that successfully for kernel v4.12. The code is however just standing in the way and taking up space for seemingly no gain on any systems in wide use today. Pierre says the code was there to improve speed on TI SDHCI controllers on certain HP laptops and possibly some Ricoh controllers as well. Early SDHCI controllers lacked the scatter-gather feature, which made software bounce buffers a significant speed boost. We are clearly talking about the list of SDHCI PCI-based MMC/SD card readers found in the pci_ids[] list in drivers/mmc/host/sdhci-pci-core.c. The TI SDHCI derivative is not supported by the upstream kernel. This leaves the Ricoh. What we can however notice is that the x86 defconfigs in the kernel did not enable CONFIG_MMC_BLOCK_BOUNCE option, which means that any such laptop would have to have a custom configured kernel to actually take advantage of this bounce buffer speed-up. It simply seems like there was a speed optimization for the Ricoh controllers that noone was using. (I have not checked the distro defconfigs but I am pretty sure the situation is the same there.) Bounce buffers increased performance on the OMAP HSMMC at one point, and was part of the original submission in commit a45c6cb81647 ("[ARM] 5369/1: omap mmc: Add new omap hsmmc controller for 2430 and 34xx, v3") This optimization was removed in commit 0ccd76d4c236 ("omap_hsmmc: Implement scatter-gather emulation") which found that scatter-gather emulation provided even better performance. The same was introduced for SDHCI in commit 2134a922c6e7 ("sdhci: scatter-gather (ADMA) support") I am pretty positively convinced that software scatter-gather emulation will do for any host controller what the bounce buffers were doing. Essentially, the bounce buffer was a reimplementation of software scatter-gather-emulation in the MMC subsystem, and it should be done away with. Cc: Pierre Ossman <pierre@ossman.eu> Cc: Juha Yrjola <juha.yrjola@solidboot.com> Cc: Steven J. Hill <Steven.Hill@cavium.com> Cc: Shawn Lin <shawn.lin@rock-chips.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Suggested-by: Steven J. Hill <Steven.Hill@cavium.com> Suggested-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-10-03Merge tag 'reset-fixes-for-4.14' of git://git.pengutronix.de/git/pza/linux ↵Olof Johansson
into fixes Reset controller fixes for v4.14 - Remove misleading HSDK v1 suffix, as there is no v2 planned - Add missing DT binding documentation for HSDK reset driver - Fix HSDK reset driver dependencies * tag 'reset-fixes-for-4.14' of git://git.pengutronix.de/git/pza/linux: reset: Restrict RESET_HSDK to ARC_SOC_HSDK or COMPILE_TEST ARC: reset: remove the misleading v1 suffix all over ARC: reset: add missing DT binding documentation for HSDKv1 reset driver ARC: reset: Only build on archs that have IOMEM Signed-off-by: Olof Johansson <olof@lixom.net>
2017-10-03include/linux/fs.h: fix comment about struct address_spaceMike Rapoport
Before commit 9c5d760b8d22 ("mm: split gfp_mask and mapping flags into separate fields") the private_* fields of struct adrress_space were grouped together and using "ditto" in comments describing the last fields was correct. With introduction of gpf_mask between private_lock and private_list "ditto" references the wrong description. Fix it by using the elaborate description. Link: http://lkml.kernel.org/r/1507009987-8746-1-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03mm/memory_hotplug: change pfn_to_section_nr/section_nr_to_pfn macro to ↵YASUAKI ISHIMATSU
inline function pfn_to_section_nr() and section_nr_to_pfn() are defined as macro. pfn_to_section_nr() has no issue even if it is defined as macro. But section_nr_to_pfn() has overflow issue if sec is defined as int. section_nr_to_pfn() just shifts sec by PFN_SECTION_SHIFT. If sec is defined as unsigned long, section_nr_to_pfn() returns pfn as 64 bit value. But if sec is defined as int, section_nr_to_pfn() returns pfn as 32 bit value. __remove_section() calculates start_pfn using section_nr_to_pfn() and scn_nr defined as int. So if hot-removed memory address is over 16TB, overflow issue occurs and section_nr_to_pfn() does not calculate correct pfn. To make callers use proper arg, the patch changes the macros to inline functions. Fixes: 815121d2b5cd ("memory_hotplug: clear zone when removing the memory") Link: http://lkml.kernel.org/r/e643a387-e573-6bbf-d418-c60c8ee3d15e@gmail.com Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Reza Arbab <arbab@linux.vnet.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03include/linux/bitfield.h: remove 32bit from FIELD_GET comment blockMasahiro Yamada
I do not see anything that restricts this macro to 32 bit width. Link: http://lkml.kernel.org/r/1505921975-23379-1-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03exec: load_script: kill the onstack interp[BINPRM_BUF_SIZE] arrayOleg Nesterov
Patch series "exec: binfmt_misc: fix use-after-free, kill iname[BINPRM_BUF_SIZE]". It looks like this code was always wrong, then commit 948b701a607f ("binfmt_misc: add persistent opened binary handler for containers") added more problems. This patch (of 6): load_script() can simply use i_name instead, it points into bprm->buf[] and nobody can change this memory until we call prepare_binprm(). The only complication is that we need to also change the signature of bprm_change_interp() but this change looks good too. While at it, do whitespace/style cleanups. NOTE: the real motivation for this change is that people want to increase BINPRM_BUF_SIZE, we need to change load_misc_binary() too but this looks more complicated because afaics it is very buggy. Link: http://lkml.kernel.org/r/20170918163446.GA26793@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Travis Gummels <tgummels@redhat.com> Cc: Ben Woodard <woodard@redhat.com> Cc: Jim Foraker <foraker1@llnl.gov> Cc: <tdhooge@llnl.gov> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03android: binder: drop lru lock in isolate callbackSherry Yang
Drop the global lru lock in isolate callback before calling zap_page_range which calls cond_resched, and re-acquire the global lru lock before returning. Also change return code to LRU_REMOVED_RETRY. Use mmput_async when fail to acquire mmap sem in an atomic context. Fix "BUG: sleeping function called from invalid context" errors when CONFIG_DEBUG_ATOMIC_SLEEP is enabled. Also restore mmput_async, which was initially introduced in commit ec8d7c14ea14 ("mm, oom_reaper: do not mmput synchronously from the oom reaper context"), and was removed in commit 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently"). Link: http://lkml.kernel.org/r/20170914182231.90908-1-sherryy@android.com Fixes: f2517eb76f1f2 ("android: binder: Add global lru shrinker to binder") Signed-off-by: Sherry Yang <sherryy@android.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reported-by: Kyle Yan <kyan@codeaurora.org> Acked-by: Arve Hjønnevåg <arve@android.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Martijn Coenen <maco@google.com> Cc: Todd Kjos <tkjos@google.com> Cc: Riley Andrews <riandrews@android.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Hillf Danton <hdanton@sina.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hoeun Ryu <hoeun.ryu@gmail.com> Cc: Christopher Lameter <cl@linux.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03mm, oom_reaper: skip mm structs with mmu notifiersMichal Hocko
Andrea has noticed that the oom_reaper doesn't invalidate the range via mmu notifiers (mmu_notifier_invalidate_range_start/end) and that can corrupt the memory of the kvm guest for example. tlb_flush_mmu_tlbonly already invokes mmu notifiers but that is not sufficient as per Andrea: "mmu_notifier_invalidate_range cannot be used in replacement of mmu_notifier_invalidate_range_start/end. For KVM mmu_notifier_invalidate_range is a noop and rightfully so. A MMU notifier implementation has to implement either ->invalidate_range method or the invalidate_range_start/end methods, not both. And if you implement invalidate_range_start/end like KVM is forced to do, calling mmu_notifier_invalidate_range in common code is a noop for KVM. For those MMU notifiers that can get away only implementing ->invalidate_range, the ->invalidate_range is implicitly called by mmu_notifier_invalidate_range_end(). And only those secondary MMUs that share the same pagetable with the primary MMU (like AMD iommuv2) can get away only implementing ->invalidate_range" As the callback is allowed to sleep and the implementation is out of hand of the MM it is safer to simply bail out if there is an mmu notifier registered. In order to not fail too early make the mm_has_notifiers check under the oom_lock and have a little nap before failing to give the current oom victim some more time to exit. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170913113427.2291-1-mhocko@kernel.org Fixes: aac453635549 ("mm, oom: introduce oom reaper") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03include/linux/mm.h: fix typo in VM_MPX definitionKirill A. Shutemov
There's a typo in recent change of VM_MPX definition. We want it to be VM_HIGH_ARCH_4, not VM_HIGH_ARCH_BIT_4. This bug does cause visible regressions. In arch_vma_name the vmflags are tested against VM_MPX. With the incorrect value of VM_MPX, a number of vmas (such as the stack) test positive and end up being marked as "[mpx]" in /proc/N/maps instead of their correct names. This confuses tools like rr which expect to be able to find familiar vmas. Fixes: df3735c5b40f ("x86,mpx: make mpx depend on x86-64 to free up VMA flag") Link: http://lkml.kernel.org/r/20170918140253.36856-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kyle Huey <me@kylehuey.com> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>