summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-24ovl: factor out ovl_check_origin_fh()Amir Goldstein
Re-factor ovl_check_origin() and ovl_get_origin(), so origin fh xattr is read from upper inode only once during lookup with multiple lower layers and only once when verifying index entry origin. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: store layer index in ovl_layerAmir Goldstein
Store the fs root layer index inside ovl_layer struct, so we can get the root fs layer index from merge dir lower layer instead of find it with ovl_find_layer() helper. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: force r/o mount when index dir creation failsAmir Goldstein
When work dir creation fails, a warning is emitted and overlay is mounted r/o. Trying to remount r/w will fail with no work dir. When index dir creation fails, the same warning is emitted and overlay is mounted r/o, but trying to remount r/w will succeed. This may cause unintentional corruption of filesystem consistency. Adjust the behavior of index dir creation failure to that of work dir creation failure and do not allow to remount r/w. User needs to state an explicitly intention to work without an index by mounting with option 'index=off' to allow r/w mount with no index dir. When mounting with option 'index=on' and no 'upperdir', index is implicitly disabled, so do not warn about no file handle support. The issue was introduced with inodes index feature in v4.13, but this patch will not apply cleanly before ovl_fill_super() re-factoring in v4.15. Fixes: 02bcd1577400 ("ovl: introduce the inodes index dir feature") Cc: <stable@vger.kernel.org> #v4.13 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: disable index when no xattr supportAmir Goldstein
Overlayfs falls back to index=off if lower/upper fs does not support file handles. Do the same if upper fs does not support xattr. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: fix inconsistent d_ino for legacy merge dirAmir Goldstein
For a merge dir that was copied up before v4.12 or that was hand crafted offline (e.g. mkdir {upper/lower}/dir), upper dir does not contain the 'trusted.overlay.origin' xattr. In that case, stat(2) on the merge dir returns the lower dir st_ino, but getdents(2) returns the upper dir d_ino. After this change, on merge dir lookup, missing origin xattr on upper dir will be fixed and 'impure' xattr will be fixed on parent of the legacy merge dir. Suggested-by: zhangyi (F) <yi.zhang@huawei.com> Reviewed-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24sched/core: Fix cpu.max vs. cpuhotplug deadlockPeter Zijlstra
Tejun reported the following cpu-hotplug lock (percpu-rwsem) read recursion: tg_set_cfs_bandwidth() get_online_cpus() cpus_read_lock() cfs_bandwidth_usage_inc() static_key_slow_inc() cpus_read_lock() Reported-by: Tejun Heo <tj@kernel.org> Tested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180122215328.GP3397@worktop Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-24locking/lockdep: Avoid triggering hardlockup from debug_show_all_locks()Tejun Heo
debug_show_all_locks() iterates all tasks and print held locks whole holding tasklist_lock. This can take a while on a slow console device and may end up triggering NMI hardlockup detector if someone else ends up waiting for tasklist_lock. Touch the NMI watchdog while printing the held locks to avoid spuriously triggering the hardlockup detector. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20180122220055.GB1771050@devbig577.frc2.facebook.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-24futex: Fix OWNER_DEAD fixupPeter Zijlstra
Both Geert and DaveJ reported that the recent futex commit: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") introduced a problem with setting OWNER_DEAD. We set the bit on an uninitialized variable and then entirely optimize it away as a dead-store. Move the setting of the bit to where it is more useful. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") Link: http://lkml.kernel.org/r/20180122103947.GD2228@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-24USB: misc: fix up some remaining DEVICE_ATTR() usagesGreg Kroah-Hartman
For all of these, a simple DEVICE_ATTR_*() macro should be used instead, so convert the drivers to use them. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-24USB: musb: fix up one odd DEVICE_ATTR() usageGreg Kroah-Hartman
It really should be DEVICE_ATTR_WO(), no need to "open code" it. Acked-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-24USB: atm: fix up some remaining DEVICE_ATTR() usageGreg Kroah-Hartman
There's no need to have DEVICE_ATTR() in these crazy macros, so use the proper DEVICE_ATTR_*() versions intead. Cc: Matthieu CASTET <castet.matthieu@free.fr> Cc: Stanislaw Gruszka <stf_xl@wp.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-24USB: move many drivers to use DEVICE_ATTR_WOGreg Kroah-Hartman
Instead of "open coding" a DEVICE_ATTR() define, use the DEVICE_ATTR_WO() macro instead, which does everything properly instead. This does require a few static functions to be renamed to work properly, but thanks to a script from Joe Perches, this was easily done. Reported-by: Joe Perches <joe@perches.com> Cc: Peter Chen <Peter.Chen@nxp.com> Cc: Valentina Manea <valentina.manea.m@gmail.com> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Acked-by: Johan Hovold <johan@kernel.org> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-24USB: move many drivers to use DEVICE_ATTR_ROGreg Kroah-Hartman
Instead of "open coding" a DEVICE_ATTR() define, use the DEVICE_ATTR_RO() macro instead, which does everything properly instead. This does require a few static functions to be renamed to work properly, but thanks to a script from Joe Perches, this was easily done. Reported-by: Joe Perches <joe@perches.com> Cc: Matthieu CASTET <castet.matthieu@free.fr> Cc: Stanislaw Gruszka <stf_xl@wp.pl> Cc: Oliver Neukum <oneukum@suse.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Acked-by: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-24USB: move many drivers to use DEVICE_ATTR_RWGreg Kroah-Hartman
Instead of "open coding" a DEVICE_ATTR() define, use the DEVICE_ATTR_RW() macro instead, which does everything properly instead. This does require a few static functions to be renamed to work properly, but thanks to a script from Joe Perches, this was easily done. Reported-by: Joe Perches <joe@perches.com> Cc: Matthieu CASTET <castet.matthieu@free.fr> Cc: Stanislaw Gruszka <stf_xl@wp.pl> Cc: Peter Chen <Peter.Chen@nxp.com> Cc: Mathias Nyman <mathias.nyman@intel.com> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-24USB: misc: chaoskey: Use true and false for boolean valuesGustavo A. R. Silva
Assign true or false to boolean variables instead of an integer value. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-24platform/x86: thinkpad_acpi: suppress warning about palm detectionDavid Herrmann
This patch prevents the thinkpad_acpi driver from warning about 2 event codes returned for keyboard palm-detection. No behavioral changes, other than suppressing the warning in the kernel log. The events are still forwarded via acpi-netlink channels. We could, optionally, decide to forward the event through a input-switch on the tpacpi input device. However, so far no suitable input-code exists, and no similar drivers report such events. Hence, leave it an acpi event for now. Note that the event-codes are named based on empirical studies. On the ThinkPad X1 5th Gen the sensor can be found underneath the arrow key. Cc: Matthew Thode <mthode@mthode.org> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-01-24i2c: meson: add configurable divider factorsJian Hu
This patch try to add support for I2C controller in Meson-AXG SoC, Due to the IP changes between I2C controller, we need to introduce a compatible data to make the divider factor configurable. Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Jian Hu <jian.hu@amlogic.com> Signed-off-by: Yixun Lan <yixun.lan@amlogic.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-01-24dt-bindings: i2c: update documentation for the Meson-AXGJian Hu
Update the doc to explicitly add Meson-AXG to support list Signed-off-by: Jian Hu <jian.hu@amlogic.com> Signed-off-by: Yixun Lan <yixun.lan@amlogic.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-01-24i2c: imx-lpi2c: add runtime pm supportFugang Duan
Add runtime pm support to dynamically manage the clock to avoid enable/disable clock in frequently that can improve the i2c bus transfer performance. And use pm_runtime_force_suspend/resume() instead of lpi2c_imx_suspend/resume(). Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-01-24i2c: rcar: fix some trivial typos in commentsWolfram Sang
Nothing big, but they get annoying after a while ;) Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-01-24i2c: davinci: fix the cpufreq transitionBartosz Golaszewski
i2c_davinci_cpufreq_transition() is implemented in a way that will block if it ever gets called while no transfer is in progress. Not only that, but reinit_completion() is never called for xfr_complete. Use the fact that cpufreq uses an srcu_notifier (running in process context) for transitions and that the bus_lock is taken during the call to master_xfer() and simplify the code by removing the transfer completion entirely and protecting i2c_davinci_cpufreq_transition() with i2c_lock/unlock_adapter(). Reported-by: David Lechner <david@lechnology.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Sekhar Nori <nsekhar@ti.com> Tested-by: David Lechner <david@lechnology.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-01-23Merge branch 'bpf-and-netdevsim-test-updates'David S. Miller
Jakub Kicinski says: ==================== bpf and netdevsim test updates A number of test improvements (delayed by merges). Quentin provides patches for checking printing to the verifier log from the drivers and validating extack messages are propagated. There is also a test for replacing TC filters to avoid adding back the bug Daniel recently fixed in net and stable. ==================== Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23selftests/bpf: validate replace of TC filters is workingJakub Kicinski
Daniel discovered recently I broke TC filter replace (and fixed it in commit ad9294dbc227 ("bpf: fix cls_bpf on filter replace")). Add a test to make sure it never happens again. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23selftests/bpf: check bpf verifier log buffer usage works for HW offloadQuentin Monnet
Make netdevsim print a message to the BPF verifier log buffer when a program is offloaded. Then use this message in hardware offload selftests to make sure that using this buffer actually prints the message to the console for eBPF hardware offload. The message is appended after the last instruction is processed with the verifying function from netdevsim. Output looks like the following: $ tc filter add dev foo ingress bpf obj sample_ret0.o \ sec .text verbose skip_sw Prog section '.text' loaded (5)! - Type: 3 - Instructions: 2 (0 over limit) - License: Verifier analysis: 0: (b7) r0 = 0 1: (95) exit [netdevsim] Hello from netdevsim! processed 2 insns, stack depth 0 "verbose" flag is required to see it in the console since netdevsim does not throw an error after printing the message. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23netdevsim: don't compile BPF code if syscall not enabledJakub Kicinski
We should not compile netdevsim/bpf.c if BPF syscall is not enabled. Otherwise bpf core would have to provide wrappers for all functions offload drivers may call, even though system will never see a BPF object. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23selftests/bpf: add checks on extack messages for eBPF hw offload testsQuentin Monnet
Add checks to test that netlink extack messages are correctly displayed in some expected error cases for eBPF offload to netdevsim with TC and XDP. iproute2 may be built without libmnl support, in which case the extack messages will not be reported. Try to detect this condition, and when enountered print a mild warning to the user and skip the extack validation. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23netdevsim: add extack support for TC eBPF offloadQuentin Monnet
Use the recently added extack support for TC eBPF filters in netdevsim. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23Merge branch '40GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2018-01-23 This series contains updates to i40e and i40evf only. Pawel enables FlatNVM support on x722 devices by allowing nvmupdate tool to configure the preservation flags in the AdminQ command. Mitch fixes a potential divide by zero error when DCB is enabled and the firmware fails to configure the VSI, so check for this state. Fixed a bug where the driver could fail to adhere to ETS bandwidth allocations if 8 traffic classes were configured on the switch. Sudheer fixes a potential deadlock by avoiding to call flush_schedule_work() in i40evf_remove(), since cancel_work_sync() and cancel_delayed_work_sync() already cleans up necessary work items. Fixed an issue with the problematic detection and recovery from hung queues in the PF which was causing lost interrupts. This is done by triggering a software interrupt so that interrupts are forced on and if we are already in napi_poll and an interrupt fires, napi_poll will not be rescheduled and the interrupt is lost. Avinash fixes an issue in the VF where is was possible to issue a reset_task while the device is currently being removed. Michal fixes an issue occurring while calling i40e_led_set() with the blink parameter set to true, which was causing the activity LED instead of the link LED to blink for port identification. Shiraz changes the client interface to not call client close/open on netdev down/up events, since this causes a lot of thrash that is not needed. Instead, disable the PE TCP-ENA flag during a netdev down event and re-enable on a netdev up event, since this blocks all TCP traffic to the RDMA protocol engine. Alan fixes an issue which was causing a potential transmit hang by ignoring the PF link up message if the VF state is not yet in the RUNNING state. Amritha fixes the channel VSI recreation during the reset flow to reconfigure the transmit rings and the queue context associated with the channel VSI. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23vmxnet3: repair memory leakNeil Horman
with the introduction of commit b0eb57cb97e7837ebb746404c2c58c6f536f23fa, it appears that rq->buf_info is improperly handled. While it is heap allocated when an rx queue is setup, and freed when torn down, an old line of code in vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0] being set to NULL prior to its being freed, causing a memory leak, which eventually exhausts the system on repeated create/destroy operations (for example, when the mtu of a vmxnet3 interface is changed frequently. Fix is pretty straight forward, just move the NULL set to after the free. Tested by myself with successful results Applies to net, and should likely be queued for stable, please Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-By: boyang@redhat.com CC: boyang@redhat.com CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: David S. Miller <davem@davemloft.net> Acked-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABELBen Hutchings
Commit 513674b5a2c9 ("net: reevalulate autoflowlabel setting after sysctl setting") removed the initialisation of ipv6_pinfo::autoflowlabel and added a second flag to indicate whether this field or the net namespace default should be used. The getsockopt() handling for this case was not updated, so it currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is not explicitly enabled. Fix it to return the effective value, whether that has been set at the socket or net namespace level. Fixes: 513674b5a2c9 ("net: reevalulate autoflowlabel setting after sysctl ...") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23Merge branch 'act_csum-spinlock-remove'David S. Miller
Davide Caratti says: ==================== net/sched: remove spinlock from 'csum' action Similarly to what has been done earlier with other actions [1][2], this series tries to improve the performance of 'csum' tc action, removing a spinlock in the data path. Patch 1 lets act_csum use per-CPU counters; patch 2 removes spin_{,un}lock_bh() calls from the act() method. test procedure (using pktgen from https://github.com/netoptimizer): # ip link add name eth1 type dummy # ip link set dev eth1 up # tc qdisc add dev eth1 root handle 1: prio # for a in pass drop; do > tc filter del dev eth1 parent 1: pref 10 matchall action csum udp > tc filter add dev eth1 parent 1: pref 10 matchall action csum udp $a > for n in 2 4; do > ./pktgen_bench_xmit_mode_queue_xmit.sh -v -s 64 -t $n -n 1000000 -i eth1 > done > done test results: | | before patch | after patch $a | $n | avg. pps/thread | avg. pps/thread -----+----+-----------------+---------------- pass | 2 | 1671463 ± 4% | 1920789 ± 3% pass | 4 | 648797 ± 1% | 738190 ± 1% drop | 2 | 3212692 ± 2% | 3719811 ± 2% drop | 4 | 1078824 ± 1% | 1328099 ± 1% references: [1] https://www.spinics.net/lists/netdev/msg334760.html [2] https://www.spinics.net/lists/netdev/msg465862.html v3 changes: - use rtnl_dereference() in place of rcu_dereference() in tcf_csum_dump() v2 changes: - add 'drop' test, it produces more contentions - use RCU-protected struct to store 'action' and 'update_flags', to avoid reading the values from subsequent configurations ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23net/sched: act_csum: don't use spinlock in the fast pathDavide Caratti
use RCU instead of spin_{,unlock}_bh() to protect concurrent read/write on act_csum configuration, to reduce the effects of contention in the data path when multiple readers are present. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23net/sched: act_csum: use per-core statisticsDavide Caratti
use per-CPU counters, like other TC actions do, instead of maintaining one set of stats across all cores. This allows updating act_csum stats without the need of protecting them using spin_{,un}lock_bh() invocations. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23pppoe: take ->needed_headroom of lower device into account on xmitGuillaume Nault
In pppoe_sendmsg(), reserving dev->hard_header_len bytes of headroom was probably fine before the introduction of ->needed_headroom in commit f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom"). But now, virtual devices typically advertise the size of their overhead in dev->needed_headroom, so we must also take it into account in skb_reserve(). Allocation size of skb is also updated to take dev->needed_tailroom into account and replace the arbitrary 32 bytes with the real size of a PPPoE header. This issue was discovered by syzbot, who connected a pppoe socket to a gre device which had dev->header_ops->create == ipgre_header and dev->hard_header_len == 0. Therefore, PPPoE didn't reserve any headroom, and dev_hard_header() crashed when ipgre_header() tried to prepend its header to skb->data. skbuff: skb_under_panic: text:000000001d390b3a len:31 put:24 head:00000000d8ed776f data:000000008150e823 tail:0x7 end:0xc0 dev:gre0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:104! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 3670 Comm: syzkaller801466 Not tainted 4.15.0-rc7-next-20180115+ #97 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_panic+0x162/0x1f0 net/core/skbuff.c:100 RSP: 0018:ffff8801d9bd7840 EFLAGS: 00010282 RAX: 0000000000000083 RBX: ffff8801d4f083c0 RCX: 0000000000000000 RDX: 0000000000000083 RSI: 1ffff1003b37ae92 RDI: ffffed003b37aefc RBP: ffff8801d9bd78a8 R08: 1ffff1003b37ae8a R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff86200de0 R13: ffffffff84a981ad R14: 0000000000000018 R15: ffff8801d2d34180 FS: 00000000019c4880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000208bc000 CR3: 00000001d9111001 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: skb_under_panic net/core/skbuff.c:114 [inline] skb_push+0xce/0xf0 net/core/skbuff.c:1714 ipgre_header+0x6d/0x4e0 net/ipv4/ip_gre.c:879 dev_hard_header include/linux/netdevice.h:2723 [inline] pppoe_sendmsg+0x58e/0x8b0 drivers/net/ppp/pppoe.c:890 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg+0xca/0x110 net/socket.c:640 sock_write_iter+0x31a/0x5d0 net/socket.c:909 call_write_iter include/linux/fs.h:1775 [inline] do_iter_readv_writev+0x525/0x7f0 fs/read_write.c:653 do_iter_write+0x154/0x540 fs/read_write.c:932 vfs_writev+0x18a/0x340 fs/read_write.c:977 do_writev+0xfc/0x2a0 fs/read_write.c:1012 SYSC_writev fs/read_write.c:1085 [inline] SyS_writev+0x27/0x30 fs/read_write.c:1082 entry_SYSCALL_64_fastpath+0x29/0xa0 Admittedly PPPoE shouldn't be allowed to run on non Ethernet-like interfaces, but reserving space for ->needed_headroom is a more fundamental issue that needs to be addressed first. Same problem exists for __pppoe_xmit(), which also needs to take dev->needed_headroom into account in skb_cow_head(). Fixes: f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom") Reported-by: syzbot+ed0838d0fa4c4f2b528e20286e6dc63effc7c14d@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23net: link_watch: mark bonding link events urgentRoopa Prabhu
It takes 1sec for bond link down notification to hit user-space when all slaves of the bond go down. 1sec is too long for protocol daemons in user-space relying on bond notification to recover (eg: multichassis lag implementations in user-space). Since the link event code already marks team device port link events as urgent, this patch moves the code to cover all lag ports and master. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24cxl: Remove support for "Processing accelerators" classFrederic Barrat
The cxl driver currently declares in its table of supported PCI devices the class "Processing accelerators". Therefore it may be called to probe for opencapi devices, which generates errors, as the config space of a cxl device is not compatible with opencapi. So remove support for the generic class, as we now have (at least) two drivers for devices of the same class. Most cxl devices are FPGAs with a PSL which will show a known device ID of 0x477. Other devices are really supported by the cxlflash driver and are already listed in the table. So removing the class is expected to go unnoticed. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24ocxl: Add Makefile and KconfigFrederic Barrat
OCXL_BASE triggers the platform support needed by the driver. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24ocxl: Add trace pointsFrederic Barrat
Define a few trace points so that we can use the standard tracing mechanism for debug and/or monitoring. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24ocxl: Add a kernel API for other opencapi driversFrederic Barrat
Some of the functions done by the generic driver should also be needed by other opencapi drivers: attaching a context to an adapter, translation fault handling, AFU interrupt allocation... So to avoid code duplication, the driver provides a kernel API that other drivers can use, similar to calling a in-kernel library. It is still a bit theoretical, for lack of real hardware, and will likely need adjustements down the road. But we used the cxlflash driver as a guinea pig. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24ocxl: Add AFU interrupt supportFrederic Barrat
Add user APIs through ioctl to allocate, free, and be notified of an AFU interrupt. For opencapi, an AFU can trigger an interrupt on the host by sending a specific command targeting a 64-bit object handle. On POWER9, this is implemented by mapping a special page in the address space of a process and a write to that page will trigger an interrupt. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24ocxl: Driver code for 'generic' opencapi devicesFrederic Barrat
Add an ocxl driver to handle generic opencapi devices. Of course, it's not meant to be the only opencapi driver, any device is free to implement its own. But if a host application only needs basic services like attaching to an opencapi adapter, have translation faults handled or allocate AFU interrupts, it should suffice. The AFU config space must follow the opencapi specification and use the expected vendor/device ID to be seen by the generic driver. The driver exposes the device AFUs as a char device in /dev/ocxl/ Note that the driver currently doesn't handle memory attached to the opencapi device. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24powerpc/powernv: Capture actag information for the deviceFrederic Barrat
In the opencapi protocol, host memory contexts are referenced by a 'actag'. During setup, a driver must tell the device how many actags it can used, and what values are acceptable. On POWER9, the NPU can handle 64 actags per link, so they must be shared between all the PCI functions of the link. To get a global picture of how many actags are used by each AFU of every function, we capture some data at the end of PCI enumeration, so that actags can be shared fairly if needed. This is not powernv specific per say, but rather a consequence of the opencapi configuration specification being quite general. The number of available actags on POWER9 makes it more likely to be hit. This is somewhat mitigated by the fact that existing AFUs are coded by requesting a reasonable count of actags and existing devices carry only one AFU. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24powerpc/powernv: Add platform-specific services for opencapiFrederic Barrat
Implement a few platform-specific calls which can be used by drivers: - provide the Transaction Layer capabilities of the host, so that the driver can find some common ground and configure the device and host appropriately. - provide the hw interrupt to be used for translation faults raised by the NPU - map/unmap some NPU mmio registers to get the fault context when the NPU raises an address translation fault The rest are wrappers around the previously-introduced opal calls. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24powerpc/powernv: Add opal calls for opencapiFrederic Barrat
Add opal calls to interact with the NPU: OPAL_NPU_SPA_SETUP: set the Shared Process Area (SPA) The SPA is a table containing one entry (Process Element) per memory context which can be accessed by the opencapi device. OPAL_NPU_SPA_CLEAR_CACHE: clear the context cache The NPU keeps a cache of recently accessed memory contexts. When a Process Element is removed from the SPA, the cache for the link must be cleared. OPAL_NPU_TL_SET: configure the Transaction Layer The Transaction Layer specification defines several templates for messages to be exchanged on the link. During link setup, the host and device must negotiate what templates are supported on both sides and at what rates those messages can be sent. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24powerpc/powernv: Set correct configuration space size for opencapi devicesAndrew Donnellan
The configuration space for opencapi devices doesn't have a PCI Express capability, therefore confusing linux in thinking it's of an old PCI type with a 256-byte configuration space size, instead of the desired 4k. So add a PCI fixup to declare the correct size. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-24powerpc/powernv: Introduce new PHB type for opencapi linksFrederic Barrat
The NPU was already abstracted by opal as a virtual PHB for nvlink, but it helps to be able to differentiate between a nvlink or opencapi PHB, as it's not completely transparent to linux. In particular, PE assignment differs and we'll also need the information in later patches. So rename existing PNV_PHB_NPU type to PNV_PHB_NPU_NVLINK and add a new type PNV_PHB_NPU_OCAPI. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-23Merge branch '10GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2018-01-23 This series contains updates to ixgbe only. Shannon Nelson provides an implementation of the ipsec hardware offload feature for the ixgbe driver for these devices: x540, x550, 82599. The ixgbe NICs support ipsec offload for 1024 Rx and 1024 Tx Security Associations (SAs), using up to 128 inbound IP addresses, and using the rfc4106(gcm(aes)) encryption. This code does not yet support checksum offload, or TSO in conjunction with the ipsec offload - those will be added in the future. This code shows improvements in both packet throughput and CPU utilization. For example, here are some quicky numbers that show the magnitude of the performance gain on a single run of "iperf -c <dest>" with the ipsec offload on both ends of a point-to-point connection: 9.4 Gbps - normal case 7.6 Gbps - ipsec with offload 343 Mbps - ipsec no offload ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23Merge branch 'GEHC-Bx50-Switch-Support'David S. Miller
Sebastian Reichel says: ==================== GEHC Bx50 Switch Support This adds support for the internal switch found in GE Healthcare B450v3, B650v3 and B850v3. All devices use a GPIO bitbanged MDIO bus to communicate with the switch and a PCIe based network card for exchanging network data. The cpu network data link requires, that the switch's internal phy interface is enabled, so support for that is added by the first patch in this series. The patch series is based on v4.15-rc8. Changes since PATCHv4: * Introduce dsa_port_link_(un)register_of and mark the fixed variant static. * Update patch description to describe the phy<->phy connection from i210 to the Marvell switch Changes since PATCHv3: * Enable the phy in dsa_port_setup() instead of abusing the fixed link setup function Changes since PATCHv2: * Add phy nodes to switch in bx50.dtsi and reference them from switch ports * Enable cpu-port's phy based on 'phy-handle' instead of 'phy-mode' Changes since PATCHv1: * Use 'marvell,mv88e6085' instead of introducing compatible string for mv88e6240. * Fix indention of DT nodes * Only enable 'cpu' phy, if explicitly set to "internal". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23ARM: dts: imx6q-b450v3: Add switch port configurationSebastian Reichel
This adds support for the Marvell switch and names the network ports according to the labels, that can be found next to the connectors. The switch is connected to the host system using a PCI based network card. The PCI bus configuration has been written using the following information: root@b450v3# lspci -tv -[0000:00]---00.0-[01]----00.0 Intel Corporation I210 Gigabit Network Connection root@b450v3# lspci -nn 00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01) 01:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03) Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23ARM: dts: imx6q-b650v3: Add switch port configurationSebastian Reichel
This adds support for the Marvell switch and names the network ports according to the labels, that can be found next to the connectors. The switch is connected to the host system using a PCI based network card. The PCI bus configuration has been written using the following information: root@b650v3# lspci -tv -[0000:00]---00.0-[01]----00.0 Intel Corporation I210 Gigabit Network Connection root@b650v3# lspci -nn 00:00.0 PCI bridge [0604]: Synopsys, Inc. Device [16c3:abcd] (rev 01) 01:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03) Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>