path: root/include/linux
Age | Commit message | Author
2018-05-01 | net: Add TLS TX offload features | Ilya Lesokhin
This patch adds a netdev feature to configure TLS TX offloads. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-01 | net: Add TLS offload netdev ops | Ilya Lesokhin
Add new netdev ops to add and delete tls context Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-01 | net: Rename and export copy_skb_header | Ilya Lesokhin
copy_skb_header is renamed to skb_copy_header and exported. Exposing this function gives more flexibility in copying SKBs. skb_copy and skb_copy_expand do not give enough control over which parts are copied. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-01 | fasync: Fix deadlock between task-context and interrupt-context kill_fasync() | Kirill Tkhai
I observed the following deadlock between them:

  [task 1]                           [task 2]                          [task 3]
  kill_fasync()                      mm_update_next_owner()            copy_process()
   spin_lock_irqsave(&fa->fa_lock)    read_lock(&tasklist_lock)         write_lock_irq(&tasklist_lock)
    send_sigio()                     <IRQ>                             ...
     read_lock(&fown->lock)          kill_fasync()                     ...
      read_lock(&tasklist_lock)       spin_lock_irqsave(&fa->fa_lock)  ...

Task 1 can't take tasklist_lock for reading, since task 3 has already expressed its wish to take the lock exclusively. Task 2 holds the lock for reading, but it can't take the spin lock.

Another deadlock is also possible (I haven't observed it):

  [task 1]                            [task 2]
  f_getown()                          kill_fasync()
   read_lock(&f_own->lock)             spin_lock_irqsave(&fa->fa_lock,)
   <IRQ>                               send_sigio()                      write_lock_irq(&f_own->lock)
    kill_fasync()                       read_lock(&fown->lock)
     spin_lock_irqsave(&fa->fa_lock,)

Actually, we do not need exclusive fa->fa_lock in kill_fasync_rcu(), as it guarantees fa->fa_file->f_owner integrity only. It may seem that the exclusive lock used to give a task a small chance of receiving two sequential signals, if there are two parallel kill_fasync() callers and the task handles the first signal quickly, but the behaviour won't change, since there is an exclusive sighand lock in do_send_sig_info().

The patch converts fa_lock into rwlock_t, and this fixes the two deadlocks above, as rwlock is allowed to be taken from an interrupt handler by qrwlock design.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Jeff Layton <jlayton@redhat.com>
2018-04-30 | net: bridge: Publish bridge accessor functions | Petr Machata
Add a couple new functions to allow querying FDB and vlan settings of a bridge. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-30 | Merge tag 'fixes-for-v4.17-rc3' of ↵ | Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus Felipe writes: usb: fixes for v4.17-rc3 Not much this time around: A list_del corruption on dwc3_ep_dequeue(), sparse warning fix also on dwc3, build issues with f_phonet. Apart from these three, some other minor fixes. Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2018-04-30 | Merge 4.17-rc3 into tty-next | Greg Kroah-Hartman
We want the tty and serial driver fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-30 | Merge 4.17-rc3 into usb-next | Greg Kroah-Hartman
This resolves the merge issue with drivers/usb/core/hcd.c Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-30 | bpf: remove tracepoints from bpf core | Alexei Starovoitov
tracepoints to bpf core were added as a way to provide introspection to bpf programs and maps, but after some time it became clear that this approach is inadequate, so prog_id, map_id and corresponding get_next_id, get_fd_by_id, get_info_by_fd, prog_query APIs were introduced and fully adopted by bpftool and other applications. The tracepoints in bpf core started to rot and causing syzbot warnings: WARNING: CPU: 0 PID: 3008 at kernel/trace/trace_event_perf.c:274 Kernel panic - not syncing: panic_on_warn set ... perf_trace_bpf_map_keyval+0x260/0xbd0 include/trace/events/bpf.h:228 trace_bpf_map_update_elem include/trace/events/bpf.h:274 [inline] map_update_elem kernel/bpf/syscall.c:597 [inline] SYSC_bpf kernel/bpf/syscall.c:1478 [inline] Hence this patch deletes tracepoints in bpf core. Reported-by: Eric Biggers <ebiggers3@gmail.com> Reported-by: syzbot <bot+a9dbb3c3e64b62536a4bc5ee7bbd4ca627566188@syzkaller.appspotmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-04-29 | net: core: Assert the size of netdev_features_t | Florian Fainelli
We have about 53 netdev_features_t bits defined and counting. Add a build-time check to catch when a u64 type will no longer be enough and we have to convert it to a bitmap. This is done in register_netdevice() for convenience. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
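The build-time check described above can be sketched in plain C11. This is only an illustration under stated assumptions: `features_t`, the `FEATURE_*` names, and `feature_mask()` are hypothetical stand-ins, and `_Static_assert` is used here in place of whatever macro the kernel patch actually uses.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's 64-bit feature mask type. */
typedef uint64_t features_t;

/* Hypothetical feature bit indices; the last entry counts the bits in use. */
enum feature_bit {
    FEATURE_SG_BIT,
    FEATURE_CSUM_BIT,
    FEATURE_TLS_TX_BIT,
    FEATURE_COUNT /* must stay last */
};

/* Fail the build as soon as the bit count no longer fits in the mask type. */
_Static_assert(FEATURE_COUNT <= sizeof(features_t) * CHAR_BIT,
               "feature bits exceed the mask width; convert to a bitmap");

/* Return the mask that corresponds to a single feature bit. */
static inline features_t feature_mask(enum feature_bit bit)
{
    return (features_t)1 << bit;
}
```

Adding a 65th entry to the enum would make compilation fail at the assertion instead of silently producing a shift that is wider than the type.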
2018-04-29 | net: Revoke export for __skb_tx_hash, update it to just be static skb_tx_hash | Alexander Duyck
I am dropping the export of __skb_tx_hash as after my patches nobody is using it outside of the net/core/dev.c file. In addition I am renaming and repurposing it to just be a static declaration of skb_tx_hash since that was the only user for it at this point. By doing this the compiler can inline it into __netdev_pick_tx as that will improve performance. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-29 | Merge branch 'timers-urgent-for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Two fixes from the timer departement: - Fix a long standing issue in the NOHZ tick code which causes RB tree corruption, delayed timers and other malfunctions. The cause for this is code which modifies the expiry time of an enqueued hrtimer. - Revert the CLOCK_MONOTONIC/CLOCK_BOOTTIME unification due to regression reports. Seems userspace _is_ relying on the documented behaviour despite our hope that it wont" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME tick/sched: Do not mess with an enqueued hrtimer
2018-04-29 | bpf/verifier: improve register value range tracking with ARSH | Yonghong Song
When helpers like bpf_get_stack return an int value that is later used for arithmetic computation, the LSH and ARSH operations are often required to get proper sign extension into 64-bit. For example, without this patch:

  54: R0=inv(id=0,umax_value=800)
  54: (bf) r8 = r0
  55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800)
  55: (67) r8 <<= 32
  56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000))
  56: (c7) r8 s>>= 32
  57: R8=inv(id=0)

With this patch:

  54: R0=inv(id=0,umax_value=800)
  54: (bf) r8 = r0
  55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800)
  55: (67) r8 <<= 32
  56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000))
  56: (c7) r8 s>>= 32
  57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff))

With a better range for "R8", when "R8" is later added to another register, e.g., a map pointer or scalar-value register, a better register range can be derived and verifier failure may be avoided. In our later example,

  ......
  usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
  if (usize < 0)
          return 0;
  ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
  ......

Without improving ARSH value range tracking, the register representing "max_len - usize" will have smin_value equal to S64_MIN and will be rejected by the verifier.

Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
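The shift pair in the verifier log (`r8 <<= 32` followed by `r8 s>>= 32`) is the usual way to sign-extend a 32-bit value into a 64-bit register. A minimal user-space sketch of the same arithmetic follows; `sext32()` is a hypothetical helper name, and the code assumes a compiler (such as GCC or Clang) where right-shifting a negative signed value is an arithmetic shift.

```c
#include <stdint.h>

/*
 * Sign-extend the low 32 bits of a 64-bit register value, mirroring the
 * "<<= 32" then "s>>= 32" instruction pair from the log above.
 * The left shift is done on an unsigned value to avoid signed overflow;
 * the right shift relies on the compiler using an arithmetic shift for
 * signed operands, which is implementation-defined in ISO C.
 */
static inline int64_t sext32(uint64_t reg)
{
    return (int64_t)(reg << 32) >> 32;
}
```

For a value bounded by 800 the result stays within 0..800, which is the tighter range the patched verifier reports; 800 shifted left by 32 is 3435973836800, matching the intermediate `umax_value` in the log.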
2018-04-29 | bpf: add bpf_get_stack helper | Yonghong Song
Currently, stackmap and the bpf_get_stackid helper are provided for bpf programs to get the stack trace. This approach has a limitation though. If two stack traces have the same hash, only one will get stored in the stackmap table, so some stack traces are missing from the user's perspective. This patch implements a new helper, bpf_get_stack, which sends stack traces directly to the bpf program. The bpf program is able to see all stack traces, and then can do in-kernel processing or send stack traces to user space through a shared map or bpf_perf_event_output. Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-04-29 | mtd: rawnand: add a way to pass an ID table with nand_scan() | Miquel Raynal
As part of the work of migrating all the drivers to nand_scan(), and because nand_scan() does not provide a way to pass an ID table, rename the function to nand_scan_with_ids() and add a third parameter to give a flash ID table (like what was done with nand_scan_ident()). Create a nand_scan() helper that is just a wrapper of nand_scan_with_ids(), passing NULL as the ID table. This way controller drivers can continue using nand_scan() transparently. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
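The wrapper pattern described above can be sketched as follows. All types and names here (`struct flash_id`, `struct chip`, `scan_with_ids()`, `scan()`, `default_ids`) are hypothetical stand-ins and not the real rawnand API; the sketch only shows how a thin inline wrapper passing NULL keeps existing callers unchanged.

```c
#include <stddef.h>

/* Hypothetical flash ID table entry. */
struct flash_id {
    const char *name;
    int id;
};

/* Hypothetical chip state recording which ID table the scan used. */
struct chip {
    const struct flash_id *table_used;
    int maxchips;
};

/* Hypothetical built-in table used when the caller supplies none. */
static const struct flash_id default_ids[] = { { "generic", 0 } };

/* Extended entry point: the third parameter optionally overrides the ID table. */
static int scan_with_ids(struct chip *c, int maxchips, const struct flash_id *ids)
{
    c->maxchips = maxchips;
    c->table_used = ids ? ids : default_ids;
    return 0;
}

/* Original two-argument entry point, now a thin wrapper that passes NULL. */
static inline int scan(struct chip *c, int maxchips)
{
    return scan_with_ids(c, maxchips, NULL);
}
```

Callers that never cared about a custom table keep calling the two-argument form, while drivers that need one switch to the three-argument form.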
2018-04-28 | <linux/stringhash.h>: fix end_name_hash() for 64bit long | Amir Goldstein
The comment claims that this helper will try not to lose bits, but for 64bit long it loses the high bits before hashing 64bit long into 32bit int. Use the helper hash_long() to do the right thing for 64bit long. For 32bit long, there is no change. All the callers of end_name_hash() either assign the result to qstr->hash, which is u32 or return the result as an int value (e.g. full_name_hash()). Change the helper return type to int to conform to its users. [ It took me a while to apply this, because my initial reaction to it was - incorrectly - that it could make for slower code. After having looked more at it, I take back all my complaints about the patch, Amir was right and I was mis-reading things or just being stupid. I also don't worry too much about the possible performance impact of this on 64-bit, since most architectures that actually care about performance end up not using this very much (the dcache code is the most performance-critical, but the word-at-a-time case uses its own hashing anyway). So this ends up being mostly used for filesystems that do their own degraded hashing (usually because they want a case-insensitive comparison function). A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS, and then this potentially makes things more expensive on 64-bit architectures with slow or lacking multipliers even for the normal case. That said, realistically the only such architecture I can think of is PA-RISC. Nobody really cares about performance on that, it's more of a "look ma, I've got warts^W an odd machine" platform. So the patch is fine, and all my initial worries were just misplaced from not looking at this properly. - Linus ] Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
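To illustrate why plain truncation loses information, here is a small user-space sketch that contrasts it with a multiplicative fold. The function names are hypothetical, and the multiplier is an assumption: it is the 64-bit golden-ratio constant as I recall it from the kernel's hash helpers, so treat the exact value as illustrative rather than authoritative.

```c
#include <stdint.h>

/* Assumed 64-bit golden-ratio multiplier (illustrative value). */
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/* Truncation: the high 32 bits of the hash never influence the result. */
static inline uint32_t fold_truncate(uint64_t hash)
{
    return (uint32_t)hash;
}

/*
 * Multiplicative fold: multiply, then keep the top 32 bits of the product,
 * so every input bit can affect the output.
 */
static inline uint32_t fold_multiply(uint64_t hash)
{
    return (uint32_t)((hash * GOLDEN_RATIO_64) >> 32);
}
```

Two inputs that differ only in their upper half, such as 0x100000001 and 0x200000001, collide under `fold_truncate()` but produce different values under `fold_multiply()`.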
2018-04-28 | net: phy: Fix modular PHYLIB build | Florian Fainelli
After commit c59530d0d5dc ("net: Move PHY statistics code into PHY library helpers") we made net/core/ethtool.c reference symbols which are part of the library which can be modular. David introduced a temporary fix with 1ecd6e8ad996 ("phy: Temporary build fix after phylib changes.") which would prevent such modularity. This is not desirable of course, so instead, just inline the functions into include/linux/phy.h to keep both options available. Fixes: c59530d0d5dc ("net: Move PHY statistics code into PHY library helpers") Fixes: 1ecd6e8ad996 ("phy: Temporary build fix after phylib changes.") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 | Merge tag 'v4.17-rc2' into docs-next | Jonathan Corbet
Merge -rc2 to pick up the changes to Documentation/core-api/kernel-api.rst that hit mainline via the networking tree. In their absence, subsequent patches cannot be applied.
2018-04-27 | Merge tag 'char-misc-4.17-rc3' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char and misc driver fixes for 4.17-rc3 A variety of small things that have fallen out after 4.17-rc1 was out. Some vboxguest fixes for systems with lots of memory, amba bus fixes, some MAINTAINERS updates, uio_hv_generic driver fixes, and a few other minor things that resolve problems that people reported. The amba bus fixes took twice to get right, the first time I messed up applying the patches in the wrong order, hence the revert and later addition again with the correct fix, sorry about that. All of these have been in linux-next with no reported issues" * tag 'char-misc-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: ARM: amba: Fix race condition with driver_override ARM: amba: Make driver_override output consistent with other buses Revert "ARM: amba: Fix race condition with driver_override" ARM: amba: Don't read past the end of sysfs "driver_override" buffer ARM: amba: Fix race condition with driver_override virt: vbox: Log an error when we fail to get the host version virt: vbox: Use __get_free_pages instead of kmalloc for DMA32 memory virt: vbox: Add vbg_req_free() helper function virt: vbox: Move declarations of vboxguest private functions to private header slimbus: Fix out-of-bounds access in slim_slicesize() MAINTAINERS: add dri-devel&linaro-mm for Android ION fpga-manager: altera-ps-spi: preserve nCONFIG state MAINTAINERS: update my email address uio_hv_generic: fix subchannel ring mmap uio_hv_generic: use correct channel in isr uio_hv_generic: make ring buffer attribute for primary channel uio_hv_generic: set size of ring buffer attribute ANDROID: binder: prevent transactions into own process.
2018-04-27 | Merge tag 'driver-core-4.17-rc3' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg Kroah-Hartman: "Here are some small driver core and firmware fixes for 4.17-rc3 There's a kobject WARN() removal to make syzkaller a lot happier about some "normal" error paths that it keeps hitting, which should reduce the number of false-positives we have been getting recently. There's also some fimware test and documentation fixes, and the coredump() function signature change that needed to happen after -rc1 before drivers started to take advantage of it. All of these have been in linux-next with no reported issues" * tag 'driver-core-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: firmware: some documentation fixes selftests:firmware: fixes a call to a wrong function name kobject: don't use WARN for registration failures firmware: Fix firmware documentation for recent file renames test_firmware: fix setting old custom fw path back on exit, second try test_firmware: Install all scripts drivers: change struct device_driver::coredump() return type to void
2018-04-27 | Merge tag 'tty-4.17-rc3' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some tty and serial driver fixes for reported issues for 4.17-rc3. Nothing major, but a number of small things: - device tree fixes/updates for serial ports - earlycon fixes - n_gsm fixes - tty core change reverted to help resolve syszkaller reports - other serial driver small fixes All of these have been in linux-next with no reported issues" * tag 'tty-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: Use __GFP_NOFAIL for tty_ldisc_get() tty: serial: xuartps: Setup early console when uartclk is also passed tty: Don't call panic() at tty_ldisc_init() tty: Avoid possible error pointer dereference at tty_ldisc_restore(). dt-bindings: mvebu-uart: DT fix s/interrupts-names/interrupt-names/ tty: serial: qcom_geni_serial: Use signed variable to get IRQ earlycon: Use a pointer table to fix __earlycon_table stride serial: sh-sci: Document r8a77470 bindings dt-bindings: meson-uart: DT fix s/clocks-names/clock-names/ serial: imx: fix cached UCR2 read on software reset serial: imx: warn user when using unsupported configuration serial: mvebu-uart: Fix local flags handling on termios update tty: n_gsm: Fix DLCI handling for ADM mode if debug & 2 is not set tty: n_gsm: Fix long delays with control frame timeouts in ADM mode
2018-04-27 | Merge tag 'armsoc-fixes' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "This round of fixes has two larger changes that came in last week: - a couple of patches all intended to finally turn on USB support on various Amlogic SoC based boards. The respective driver were not finalized until very late before the merge window and the DT portion is the last bit now. - a defconfig update for gemini that had repeatedly missed the cut but that is required to actually boot any real machines with the default build. The rest are the usual small changes: - a fix for a nasty build regression on the OMAP memory drivers - a fix for a boot problem on Intel/Altera SocFPGA - a MAINTAINER file update - a couple of fixes for issues found by automated testing (kernelci, coverity, sparse, ...) - a few incorrect DT entries are updated to match the hardware" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: defconfig: Update Gemini defconfig ARM: s3c24xx: jive: Fix some GPIO names HISI LPC: Add Kconfig MFD_CORE dependency ARM: dts: Fix NAS4220B pin config MAINTAINERS: Remove myself as maintainer arm64: dts: correct SATA addresses for Stingray ARM64: dts: meson-gxm-khadas-vim2: enable the USB controller ARM64: dts: meson-gxl-nexbox-a95x: enable the USB controller ARM64: dts: meson-gxl-s905x-libretech-cc: enable the USB controller ARM64: dts: meson-gx-p23x-q20x: enable the USB controller ARM64: dts: meson-gxl-s905x-p212: enable the USB controller ARM64: dts: meson-gxm: add GXM specific USB host configuration ARM64: dts: meson-gxl: add USB host support ARM: OMAP2+: Fix build when using split object directories soc: bcm2835: Make !RASPBERRYPI_FIRMWARE dummies return failure soc: bcm: raspberrypi-power: Fix use of __packed ARM: dts: Fix cm2 and prm sizes for omap4 ARM: socfpga_defconfig: Remove QSPI Sector 4K size force firmware: arm_scmi: remove redundant null check on array arm64: dts: juno: drop unnecessary address-cells and 
size-cells properties
2018-04-27 | Merge tag 'mtd/fixes-for-4.17-rc3' of git://git.infradead.org/linux-mtd | Linus Torvalds
Pull mtd fixes from Boris Brezillon: - Fix nanddev_mtd_erase() function to match the changes done in e7bfb3fdbde3 ("mtd: Stop updating erase_info->state and calling mtd_erase_callback()") - Fix a memory leak in the Tango NAND controller driver - Fix read/write to a suspended erase block in the CFI driver - Fix the DT parsing logic in the Marvell NAND controller driver * tag 'mtd/fixes-for-4.17-rc3' of git://git.infradead.org/linux-mtd: mtd: rawnand: marvell: fix the chip-select DT parsing logic mtd: cfi: cmdset_0002: Do not allow read/write to suspend erase block. mtd: cfi: cmdset_0001: Workaround Micron Erase suspend bug. mtd: cfi: cmdset_0001: Do not allow read/write to suspend erase block. mtd: spi-nor: cadence-quadspi: Fix page fault kernel panic mtd: nand: Fix nanddev_mtd_erase() mtd: rawnand: tango: Fix struct clk memory leak
2018-04-27 | net: Allow network devices to have PHY statistics | Florian Fainelli
Add a new callback: get_ethtool_phy_stats() which allows network device drivers not making use of the PHY library to return PHY statistics. Update ethtool_get_phy_stats(), __ethtool_get_sset_count() and __ethtool_get_strings() accordingly to interrogate the network device about ETH_SS_PHY_STATS. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 | net: Move PHY statistics code into PHY library helpers | Florian Fainelli
In order to make it possible for network device drivers that do not necessarily have a phy_device attached, but still report PHY statistics, have a preliminary refactoring consisting in creating helper functions that encapsulate the PHY device driver knowledge within PHYLIB. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 | delayacct: Use raw_spinlocks | Sebastian Andrzej Siewior
try_to_wake_up() might invoke delayacct_blkio_end() while holding the pi_lock (which is a raw_spinlock_t). delayacct_blkio_end() acquires task_delay_info.lock which is a spinlock_t. This causes a might sleep splat on -RT where non raw spinlocks are converted to 'sleeping' spinlocks. task_delay_info.lock is only held for a short amount of time so it's not a problem latency-wise to convert it to a raw spinlock. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Balbir Singh <bsingharora@gmail.com> Link: https://lkml.kernel.org/r/20180423161024.6710-1-bigeasy@linutronix.de
2018-04-27 | locking/barriers: Introduce smp_cond_load_relaxed() and ↵ | Will Deacon
atomic_cond_read_relaxed() Whilst we currently provide smp_cond_load_acquire() and atomic_cond_read_acquire(), there are cases where the ACQUIRE semantics are not required because of a subsequent fence or release operation once the conditional loop has exited. This patch adds relaxed versions of the conditional spinning primitives to avoid unnecessary barrier overhead on architectures such as arm64. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-2-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
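The difference between the relaxed and acquire flavours of a conditional spin can be sketched with C11 atomics in user space. This is an analogy rather than the kernel implementation: `cond_load_relaxed()` and `cond_load_acquire()` are hypothetical names, and building the acquire variant from the relaxed loop plus a fence is one plausible construction, assumed here for illustration.

```c
#include <stdatomic.h>

/*
 * Spin with relaxed loads until *flag becomes non-zero and return the value
 * observed. No ordering is imposed on later memory accesses, which is
 * sufficient when the caller follows up with its own fence or release.
 */
static inline int cond_load_relaxed(atomic_int *flag)
{
    int v;

    while ((v = atomic_load_explicit(flag, memory_order_relaxed)) == 0)
        ; /* busy-wait; real code would add a CPU relax hint here */
    return v;
}

/*
 * Acquire flavour: the same loop, followed by a single acquire fence once
 * the condition holds, so later accesses cannot be reordered before it.
 */
static inline int cond_load_acquire(atomic_int *flag)
{
    int v = cond_load_relaxed(flag);

    atomic_thread_fence(memory_order_acquire);
    return v;
}
```

A caller that is about to perform a release operation anyway can use the relaxed form and skip the extra barrier, which is the saving the commit message describes for architectures such as arm64.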
2018-04-27 | usb: gadget: composite Allow for larger configuration descriptors | Joel Pepper
The composite framework allows us to create gadgets composed from many different functions, which need to fit into a single configuration descriptor. Some functions (like uvc) can produce configuration descriptors upwards of 2500 bytes on their own. This patch increases the limit from 1024 bytes to 4096. Signed-off-by: Joel Pepper <joel.pepper@rwth-aachen.de> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2018-04-26 | signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR} | Eric W. Biederman
Update the siginfo_layout function and enum siginfo_layout to represent all of the possible field layouts of struct siginfo. This allows the uses of siginfo_layout in um and arm64 where they are testing for SIL_FAULT to be more accurate as this rules out the other cases. Further, this allows the switch statements on siginfo_layout to be simpler, if perhaps a little more wordy, making it easier to understand what is actually going on. As SIL_FAULT_BNDERR and SIL_FAULT_PKUERR are never expected to appear in signalfd just treat them as SIL_FAULT. To include them would take 20 extra bytes and pretty much fill up what is left of signalfd_siginfo. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-04-26 | Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost | Linus Torvalds
Pull virtio fixups from Michael Tsirkin: - Latest header update will break QEMU (if it's rebuilt with the new header) - and it seems that the code there is so fragile that any change in this header will break it. Add a better interface so users do not need to change their code every time that header changes. - Fix virtio console for spec compliance. * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_console: reset on out of memory virtio_console: move removal code virtio_console: drop custom control queue cleanup virtio_console: free buffers after reset virtio: add ability to iterate over vqs virtio_console: don't tie bufs to a vq virtio_balloon: add array of stat names
2018-04-26 | cgroup: Add cgroup_subsys->css_rstat_flush() | Tejun Heo
This patch adds cgroup_subsys->css_rstat_flush(). If a subsystem has this callback, its csses are linked on cgrp->css_rstat_list and rstat will call the function whenever the associated cgroup is flushed. Flush is also performed when such csses are released so that residual counts aren't lost. Combined with the rstat API previous patches factored out, this allows controllers to plug into rstat to manage their statistics in a scalable way. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26 | cgroup: Replace cgroup_rstat_mutex with a spinlock | Tejun Heo
Currently, rstat flush path is protected with a mutex which is fine as all the existing users are from interface file show path. However, rstat is being generalized for use by controllers and flushing from atomic contexts will be necessary. This patch replaces cgroup_rstat_mutex with a spinlock and adds a irq-safe flush function - cgroup_rstat_flush_irqsafe(). Explicit yield handling is added to the flush path so that other flush functions can yield to other threads and flushers. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26 | cgroup: Factor out and expose cgroup_rstat_*() interface functions | Tejun Heo
cgroup_rstat is being generalized so that controllers can use it too. This patch factors out and exposes the following interface functions. * cgroup_rstat_updated(): Renamed from cgroup_rstat_cpu_updated() for consistency. * cgroup_rstat_flush_hold/release(): Factored out from base stat implementation. * cgroup_rstat_flush(): Verbatim expose. While at it, drop assert on cgroup_rstat_mutex in cgroup_base_stat_flush() as it crosses layers and make a minor comment update. v2: Added EXPORT_SYMBOL_GPL(cgroup_rstat_updated) to fix a build bug. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26 | cgroup: Distinguish base resource stat implementation from rstat | Tejun Heo
Base resource stat accounts universal (not specific to any controller) resource consumptions on top of rstat. Currently, its implementation is intermixed with rstat implementation making the code confusing to follow. This patch clarifies the distinction by doing the following.
* Encapsulate base resource stat counters, currently only cputime, in struct cgroup_base_stat.
* Move prev_cputime into struct cgroup and initialize it with cgroup.
* Rename the related functions so that they start with cgroup_base_stat.
* Prefix the related variables and field names with b.
This patch doesn't make any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26 | cgroup: Rename stat to rstat | Tejun Heo
stat is too generic a name and ends up causing subtle confusions. It'll be made generic so that controllers can plug into it, which will make the problem worse. Let's rename it to something more specific - cgroup_rstat for cgroup recursive stat. This patch does the following renames. No other changes. * cpu_stat -> rstat_cpu * stat -> rstat * ?cstat -> ?rstatc Note that the renames are selective. The unrenamed are the ones which implement basic resource statistics on top of rstat. This will be further cleaned up in the following patches. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26 | cgroup: Limit event generation frequency | Tejun Heo
".events" files generate file modified event to notify userland of possible new events. Some of the events can be quite bursty (e.g. memory high event) and generating notification each time is costly and pointless. This patch implements a event rate limit mechanism. If a new notification is requested before 10ms has passed since the previous notification, the new notification is delayed till then. As this only delays from the second notification on in a given close cluster of notifications, userland reactions to notifications shouldn't be delayed at all in most cases while avoiding notification storms. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26 | genirq/irq_sim: Use the SPDX license identifier in the header | Bartosz Golaszewski
Use C-style comment for the identifier as per Documentation/process/license-rules.rst and remove the license boilerplate. Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20180426200747.8344-2-brgl@bgdev.pl
2018-04-26 | net/mlx5: Fix mlx5_get_vector_affinity function | Israel Rukshin
Adding the vector offset when calling to mlx5_vector2eqn() is wrong. This is because mlx5_vector2eqn() checks if EQ index is equal to vector number and the fact that the internal completion vectors that mlx5 allocates don't get an EQ index. The second problem here is that using effective_affinity_mask gives the same CPU for different vectors. This leads to unmapped queues when calling it from blk_mq_rdma_map_queues(). This doesn't happen when using affinity_hint mask. Fixes: 2572cf57d75a ("mlx5: fix mlx5_get_vector_affinity to start from completion vector 0") Fixes: 05e0cc84e00c ("net/mlx5: Fix get vector affinity helper function") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-04-26 | udp: add gso support to virtual devices | Willem de Bruijn
Virtual devices such as tunnels and bonding can handle large packets. Only segment packets when reaching a physical or loopback device. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-26 | udp: generate gso with UDP_SEGMENT | Willem de Bruijn
Support generic segmentation offload for udp datagrams. Callers can concatenate and send at once the payload of multiple datagrams with the same destination. To set segment size, the caller sets socket option UDP_SEGMENT to the length of each discrete payload. This value must be smaller than or equal to the relevant MTU. A follow-up patch adds cmsg UDP_SEGMENT to specify segment size on a per send call basis. Total byte length may then exceed MTU. If not an exact multiple of segment size, the last segment will be shorter. The implementation adds a gso_size field to the udp socket, ip(v6) cmsg cookie and inet_cork structure to be able to set the value at setsockopt or cmsg time and to work with both lockless and corked paths.

Initial benchmark numbers show UDP GSO about as expensive as TCP GSO.

  case                     MB/s    msg/s   calls/s   cycles
  tcp tso                  3197    54232     54232    6,457,754,262
  tcp gso                  1765    29939     29939   11,203,021,806
  tcp without tso/gso *     739    12548     12548   11,205,483,630
  udp                       876    14873    624666   11,205,777,429
  udp gso                  2139    36282     36282   11,204,374,561

[*] after reverting commit 0a6b2a1dc2a2 ("tcp: switch to GSO being always on")

Measured total system cycles ('-a') for one core while pinning both the network receive path and benchmark process to that core:

  perf stat -a -C 12 -e cycles \
    ./udpgso_bench_tx -C 12 -4 -D "$DST" -l 4

Note the reduction in calls/s with GSO. Bytes per syscall increases from 1470 to 61818.

Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
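From user space, the socket option described above might be used roughly as sketched below. The helper names (`enable_udp_gso()`, `gso_segment_count()`, `gso_last_segment_len()`) are hypothetical. The fallback definition of `UDP_SEGMENT` as 103 is the value I believe the kernel UAPI header uses and is included only in case the libc headers do not provide it; `IPPROTO_UDP` is used as the option level on the assumption that it has the same numeric value as `SOL_UDP`.

```c
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103 /* assumed UAPI value; defined only if headers lack it */
#endif

/*
 * Ask the kernel to split every send on this UDP socket into datagrams of
 * gso_size payload bytes. Returns 0 on success and -1 with errno set on
 * failure, for example on kernels without UDP GSO support.
 */
static int enable_udp_gso(int fd, int gso_size)
{
    return setsockopt(fd, IPPROTO_UDP, UDP_SEGMENT, &gso_size, sizeof(gso_size));
}

/* Number of datagrams one send of total bytes is split into (total > 0). */
static size_t gso_segment_count(size_t total, size_t gso_size)
{
    return (total + gso_size - 1) / gso_size;
}

/* Payload length of the final datagram; shorter unless total is a multiple. */
static size_t gso_last_segment_len(size_t total, size_t gso_size)
{
    size_t rem = total % gso_size;

    return rem ? rem : gso_size;
}
```

For instance, a single 4000-byte send with a 1472-byte segment size would leave the host as three datagrams, the last carrying 1056 bytes, which is the "last segment will be shorter" case from the commit message.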
2018-04-26 | udp: add udp gso | Willem de Bruijn
Implement generic segmentation offload support for udp datagrams. A follow-up patch adds support to the protocol stack to generate such packets. UDP GSO is not UFO. UFO fragments a single large datagram. GSO splits a large payload into a number of discrete UDP datagrams. The implementation adds a GSO type SKB_UDP_GSO_L4 to differentiate it from UFO (SKB_UDP_GSO). IPPROTO_UDPLITE is excluded, as that protocol has no gso handler registered. [ Export __udp_gso_segment for ipv6. -DaveM ] Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-26blk-mq: fix sysfs inflight counterOmar Sandoval
When the blk-mq inflight implementation was added, /proc/diskstats was converted to use it, but /sys/block/$dev/inflight was not. Fix it by adding another helper to count in-flight requests by data direction. Fixes: f299b7c7a9de ("blk-mq: provide internal in-flight variant") Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
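For readers who consume the fixed file from userspace, here is a minimal sketch of reading the per-direction counts. It assumes the sysfs file holds two whitespace-separated integers, reads in flight followed by writes in flight. The helper names and the returned dictionary layout are illustrative only.

```python
from pathlib import Path


def parse_inflight(text):
    """Parse the two counters of an inflight file: reads, then writes."""
    reads, writes = (int(field) for field in text.split())
    return {"read": reads, "write": writes}


def read_inflight(dev):
    """Read /sys/block/<dev>/inflight for a block device name such as 'sda'."""
    return parse_inflight(Path(f"/sys/block/{dev}/inflight").read_text())
```

Keeping the parsing separate from the file access makes the format assumption easy to check without a real block device.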
2018-04-26Merge tag 'omap-for-v4.17/fixes-signed' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Pull "Two fixes for v4.17-rc cycle" from Tony Lindgren: Fix a build regression with split object directories reported by Russell and fix range sizes for omap4 cm2 and prm modules. * tag 'omap-for-v4.17/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP2+: Fix build when using split object directories ARM: dts: Fix cm2 and prm sizes for omap4
2018-04-26Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIMEThomas Gleixner
Revert commits

    92af4dcb4e1c ("tracing: Unify the "boot" and "mono" tracing clocks")
    127bfa5f4342 ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
    7250a4047aa6 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
    d6c7270e913d ("timekeeping: Remove boot time specific code")
    f2d6fdbfd238 ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
    d6ed449afdb3 ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
    72199320d49d ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")

As stated in the pull request for the unification of CLOCK_MONOTONIC and CLOCK_BOOTTIME, it was clear that we might have to revert the change.

As reported by several folks, systemd and other applications rely on the documented behaviour of CLOCK_MONOTONIC on Linux and break with the above changes. After resume, daemons time out and other timeout related issues are observed. Rafael compiled this list:

* systemd kills daemons on resume, after >WatchdogSec seconds of suspending (Genki Sky). [Verified that that's because systemd uses CLOCK_MONOTONIC and expects it to not include the suspend time.]

* systemd-journald misbehaves after resume: systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal corrupted or uncleanly shut down, renaming and replacing. (Mike Galbraith).

* NetworkManager reports "networking disabled" and networking is broken after resume 50% of the time (Pavel). [May be because of systemd.]

* MATE desktop dims the display and starts the screensaver right after system resume (Pavel).

* Full system hang during resume (me). [May be due to systemd or NM or both.]

That happens on debian and open suse systems.

It's sad that these problems were neither caught in -next nor by those folks who expressed interest in this change.

Reported-by: Rafael J. Wysocki <rjw@rjwysocki.net> Reported-by: Genki Sky <sky@genki.is> Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kevin Easton <kevin@guarana.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
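The behaviour userspace depends on here can be sketched from Python, which exposes both clocks on Linux as `time.CLOCK_MONOTONIC` and `time.CLOCK_BOOTTIME`. The helper names below are hypothetical. The sketch assumes the documented semantics that this revert restores: CLOCK_MONOTONIC excludes time spent suspended and CLOCK_BOOTTIME includes it.

```python
import time


def suspended_seconds(monotonic_now, boottime_now):
    """Estimate total suspend time as the gap between the two clocks."""
    return max(0.0, boottime_now - monotonic_now)


def watchdog_expired(start, now, timeout):
    """Timeout check in the style of a watchdog, on readings of one clock.

    Fed with CLOCK_MONOTONIC readings, time spent suspended does not count
    towards the timeout, which is what the affected daemons expect.
    """
    return now - start > timeout


if __name__ == "__main__" and hasattr(time, "CLOCK_BOOTTIME"):
    mono = time.clock_gettime(time.CLOCK_MONOTONIC)
    boot = time.clock_gettime(time.CLOCK_BOOTTIME)
    print(f"suspended for about {suspended_seconds(mono, boot):.1f} s since boot")
```

If CLOCK_MONOTONIC also advanced during suspend, a check like `watchdog_expired` would fire immediately after a long suspend, which matches the systemd symptom in the list above.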
2018-04-26HID: multitouch: implement precision touchpad latency and switchesBenjamin Tissoires
The Win 8.1 precision touchpad spec introduces new modes for touchpads that can come in handy[1]. Implement the settings of these modes, so we are not taken off-guard if a firmware decides to enforce them. [1] https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/windows-precision-touchpad-required-hid-top-level-collections Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-04-26HID: input: append a suffix matching the applicationBenjamin Tissoires
Given that we create one input node per application, we should name the input node accordingly to not lose userspace. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-04-26HID: generic: create one input report per application typeBenjamin Tissoires
It is not a good idea to try to fit all types of applications in the same input report. Many devices need the quirk HID_MULTI_INPUT, but this quirk doesn't match the actual HID description as it is based on the report ID. Given that most devices with MULTI_INPUT I can think of split the device's inputs nicely by application, it is a good thing to split the devices by default based on this assumption. Also make hid-multitouch follow this rule, to avoid having to deal with too many inputs being created. While we are at it, fix some checkpatch complaints about converting 'unsigned' to 'unsigned int'. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
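The grouping change described in this commit message can be pictured with a toy model. The report list is invented for illustration, the function names are hypothetical, and the code only contrasts the two grouping keys; it does not reflect the driver's data structures. The application values are standard HID usages (Touch Screen 0x000D0004, Mouse 0x00010002, Keyboard 0x00010006).

```python
from collections import defaultdict

# (report ID, application usage) pairs for an imaginary device.
REPORTS = [
    (1, 0x000D0004),  # Touch Screen
    (2, 0x000D0004),  # Touch Screen, second report
    (3, 0x00010002),  # Mouse
    (4, 0x00010006),  # Keyboard
]


def inputs_per_report_id(reports):
    """Report-ID based split: one input per report ID."""
    return {report_id: [report_id] for report_id, _ in reports}


def inputs_per_application(reports):
    """Application based split: reports sharing an application share an input."""
    groups = defaultdict(list)
    for report_id, application in reports:
        groups[application].append(report_id)
    return dict(groups)
```

For this made-up device the report-ID split yields four inputs, while the application split yields three, with both touch screen reports attached to the same input.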
2018-04-26HID: store the full list of reports in the hidinputBenjamin Tissoires
We were only storing the report in case of QUIRK_MULTI_INPUT. It is interesting for the upcoming HID_QUIRK_INPUT_PER_APP to also store the full list of reports that are attached to it. We need the full list because a device (Advanced Silicon has some) might want to use a different report ID for the Input reports and the Output reports. Storing the full list allows the drivers to have all the data. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-04-25Merge tag 'for_v4.17-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fix from Jan Kara: "A fix of a fsnotify race causing panics / softlockups" * tag 'for_v4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: Fix fsnotify_mark_connector race
2018-04-25Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Eight bug fixes, one spelling update and one tracepoint addition. The most serious is probably the mptsas write same fix because it means anyone using these controllers sees errors when modern filesystems try to issue discards" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: fix crash with iscsi target and dvd scsi: sd_zbc: Avoid that resetting a zone fails sporadically scsi: sd: Defer spinning up drive while SANITIZE is in progress scsi: megaraid_sas: Do not log an error if FW successfully initializes. scsi: ufs: add trace event for ufs upiu scsi: core: remove reference to scsi_show_extd_sense() scsi: mptsas: Disable WRITE SAME scsi: fnic: fix spelling mistake in fnic stats "Abord" -> "Abort" scsi: scsi_debug: IMMED related delay adjustments scsi: iscsi: respond to netlink with unicast when appropriate