summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-11-25mlxsw: spectrum: Remove unused trapsNogah Frankel
Since commit 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces") we no longer rely on flooding traffic to the CPU in order to trap packets intended for the host itself. Therefore, the FDB MC trap can be removed. Remove traps for protocols that are not supported yet. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25mvpp2: use correct size for memsetArnd Bergmann
gcc-7 detects a short memset in mvpp2, introduced in the original merge of the driver: drivers/net/ethernet/marvell/mvpp2.c: In function 'mvpp2_cls_init': drivers/net/ethernet/marvell/mvpp2.c:3296:2: error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size] The result seems to be that we write uninitialized data into the flow table registers, although we did not get any warning about that uninitialized data usage. Using sizeof() lets us initialize then entire array instead. Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net/mlx5: drop duplicate header delay.hGeliang Tang
Drop duplicate header delay.h from mlx5/core/main.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Acked-by: Matan Barak <matanb@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: ieee802154: drop duplicate header delay.hGeliang Tang
Drop duplicate header delay.h from adf7242.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Acked-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25ibmvnic: drop duplicate header seq_file.hGeliang Tang
Drop duplicate header seq_file.h from ibmvnic.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25fsl/fman: fix a leak in tgec_free()Dan Carpenter
We set "tgec->cfg" to NULL before passing it to kfree(). There is no need to set it to NULL at all. Let's just delete it. Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net/mlx5: remove a duplicate conditionDan Carpenter
We verified that MLX5_FLOW_CONTEXT_ACTION_COUNT was set on the first line of the function so we don't need to check again here. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGSMiroslav Lichvar
The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET command and likewise it shouldn't require the CAP_NET_ADMIN capability. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25Merge branch 'thunderx-new-features'David S. Miller
Sunil Goutham says: ==================== net: thunderx: Support for 80xx, RED, PFC e.t.c This patch series adds support for SLM modules present on 80xx silicon, enables ramdom early discard, backpressure generation, PFC and some ethtool changes to display supported link modes e.t.c. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: thunderx: Pause frame supportSunil Goutham
Enable pause frames on both Rx and Tx side, configure pause interval e.t.c. Also support for enable/disable pause frames on Rx/Tx via ethtool has been added. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: thunderx: Configure RED and backpressure levelsSunil Goutham
This patch enables moving average calculation of Rx pkt's resources and configures RED and backpressure levels for both CQ and RBDR. Also initialize SQ's CQ_LIMIT properly. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: thunderx: Add ethtool support for supported ports and link modes.Thanneeru Srinivasulu
Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: thunderx: 80xx BGX0 configuration changesSunil Goutham
On 80xx only one lane of DLM0 and DLM1 (of BGX0) can be used , so even though lmac count may be 2 but LMAC1 should use serdes lane of DLM1. Since it's not possible to distinguish 80xx from 81xx as PCI devid are same, this patch adds this config support by replying on what firmware configures the lmacs with. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25tipc: improve sanity check for received domain recordsJon Paul Maloy
In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we added a data area to the link monitor STATE messages under the assumption that previous versions did not use any such data area. For versions older than Linux 4.3 this assumption is not correct. In those version, all STATE messages sent out from a node inadvertently contain a 16 byte data area containing a string; -a leftover from previous RESET messages which were using this during the setup phase. This string serves no purpose in STATE messages, and should no be there. Unfortunately, this data area is delivered to the link monitor framework, where a sanity check catches that it is not a correct domain record, and drops it. It also issues a rate limited warning about the event. Since such events occur much more frequently than anticipated, we now choose to remove the warning in order to not fill the kernel log with useless contents. We also make the sanity check stricter, to further reduce the risk that such data is inavertently admitted. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25tipc: fix compatibility bug in link monitoringJon Paul Maloy
commit 817298102b0b ("tipc: fix link priority propagation") introduced a compatibility problem between TIPC versions newer than Linux 4.6 and those older than Linux 4.4. In versions later than 4.4, link STATE messages only contain a non-zero link priority value when the sender wants the receiver to change its priority. This has the effect that the receiver resets itself in order to apply the new priority. This works well, and is consistent with the said commit. However, in versions older than 4.4 a valid link priority is present in all sent link STATE messages, leading to cyclic link establishment and reset on the 4.6+ node. We fix this by adding a test that the received value should not only be valid, but also differ from the current value in order to cause the receiving link endpoint to reset. Reported-by: Amar Nv <amar.nv005@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25phy: fix error case of phy_led_triggers_(un)registerWoojung Huh
When phy_init_hw() fails at phy_attach_direct(); - phy_detach() calls phy_led_triggers_unregister() without previous call of phy_led_triggers_register(). - still call phy_led_triggers_register() and cause memory leak. Fixes: 2e0bc452f472 ("net: phy: leds: add support for led triggers on phy link state change") Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implementedAndrew Lunn
The mvneta driver advertises it supports IFF_UNICAST_FLT. However, it actually does not. The hardware probably does support it, but there is no code to configure the filter. As a quick and simple fix, remove the flag. This will cause the core to fall back to promiscuous mode. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Fixes: b50b72de2f2f ("net: mvneta: enable features before registering the driver") Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25Merge branch 'parisc-4.9-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "On parisc we were still seeing occasional random segmentation faults and memory corruption on SMP machines. Dave Anglin then looked again at the TLB related code and found two issues in the PCI DMA and generic TLB flush functions. Then, in our startup code we had some timing of the cache and TLB functions to calculate a threshold when to use a complete TLB/cache flush or just to flush a specific range. This code produced a race with newly started CPUs and thus lead to occasional kernel crashes (due to stale TLB/cache entries). The patch by Dave fixes this issue by flushing the local caches before starting secondary CPUs and by removing the race. The last problem fixed by this series is that we quite often suffered from hung tasks and self-detected stalls on the CPUs. It was somehow clear that this was related to the (in v4.7) newly introduced cr16 clocksource and the own implementation of sched_clock(). I replaced the open-coded sched_clock() function and switched to the generic sched_clock() implementation which seems to have fixed this isse as well. All patches have been sucessfully tested on a variety of machines, including our debian buildd servers. All patches (beside the small pr_cont fix) are tagged for stable releases" * 'parisc-4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Also flush data TLB in flush_icache_page_asm parisc: Fix race in pci-dma.c parisc: Switch to generic sched_clock implementation parisc: Fix races in parisc_setup_cache_timing() parisc: Fix printk continuations in system detection
2016-11-25net: properly flush delay-freed skbsEric Dumazet
Typical NAPI drivers use napi_consume_skb(skb) at TX completion time. This put skb in a percpu special queue, napi_alloc_cache, to get bulk frees. It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE limit quite often, with skbs that were queued hundreds of usec earlier. I measured this can take ~6000 nsec to perform one flush. __kfree_skb_flush() can be called from two points right now : 1) From net_tx_action(), but only for skbs that were queued to sd->completion_queue. -> Irrelevant for NAPI drivers in normal operation. 2) From net_rx_action(), but only under high stress or if RPS/RFS has a pending action. This patch changes net_rx_action() to perform the flush in all cases and after more urgent operations happened (like kicking remote CPUS for RPS/RFS). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull keys fixes from James Morris: "From David: - Fix mpi_powm()'s handling of a number with a zero exponent [CVE-2016-8650]. Integrate my and Andrey's patches for mpi_powm() and use mpi_resize() instead of RESIZE_IF_NEEDED() - the latter adds a duplicate check into the execution path of a trivial case we don't normally expect to be taken. - Fix double free in X.509 error handling" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: mpi: Fix NULL ptr dereference in mpi_powm() [ver #3] X.509: Fix double free in x509_cert_parse() [ver #3]
2016-11-25Fix subtle CONFIG_MODVERSIONS problemsLinus Torvalds
CONFIG_MODVERSIONS has been broken for pretty much the whole 4.9 series, and quite frankly, nobody has cared very deeply. We absolutely know how to fix it, and it's not _complicated_, but it's not exactly pretty either. This oneliner fixes it without the ugliness, and allows for further future cleanups. "We've secretly replaced their regular MODVERSIONS with nothing at all, let's see if they notice" Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-25Merge tag 'acpi-4.9-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "Two ACPI fixes for 4.9-rc7. One of them reverts a recent ACPI commit that attempted to improve reboot/power-off on some systems, but introduced problems elsewhere, and the other one fixes kernel builds with the new WDAT watchdog driver enabled in some configurations. Specifics: - Revert the recent commit that caused the ACPI _PTS method to be executed in the power-off/reboot code path (as per the specification) in an attempt to improve things on some systems (apparently expecting _PTS to be executed in that code path), but broke power-off/reboot on at least one other machine (Rafael Wysocki). - Fix kernel builds with the new WDAT watchdog driver enabled in some configurations by explicitly selecting WATCHDOG_CORE when enabling the WDAT watchdog driver (Mika Westerberg)" * tag 'acpi-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: watchdog: wdat_wdt: Select WATCHDOG_CORE Revert "ACPI: Execute _PTS before system reboot"
2016-11-25MAINTAINERS: Add bug tracking system location entry typeRafael J. Wysocki
Following the kernel Bugzilla discussion during the Kernel Summit (https://lwn.net/Articles/705245/), add bug tracking system location entry type (B) to MAINTAINERS and populate it for several subsystems known to be using the kernel BZ actively (and add the upstream BZ for ACPICA too). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-25Revert "i2c: designware: do not disable adapter after transfer"Jarkko Nikula
This reverts commit 0317e6c0f1dc1ba86b8d9dccc010c5e77b8355fa. Srinivas reported recently touchscreen and touchpad stopped working in Haswell based machine in Linux 4.9-rc series with timeout errors from i2c_designware: [ 16.508013] i2c_designware INT33C3:00: controller timed out [ 16.508302] i2c_hid i2c-MSFT0001:02: failed to change power setting. [ 17.532016] i2c_designware INT33C3:00: controller timed out [ 18.556022] i2c_designware INT33C3:00: controller timed out [ 18.556315] i2c_hid i2c-ATML1000:00: failed to retrieve report from device. I managed to reproduce similar errors on another Haswell based machine where touchscreen initialization fails maybe in every 1/5 - 1/2 boots. Since root cause for these errors is not clear yet and debugging is ongoing it's better to revert this commit as we are near to release. Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-11-25Merge tag 'sti-dt-for-v4.9-rc-round2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/pchotard/sti into fixes Pull "STi DT fix" from Patrice Chotard: The I2C nodes are missing #address-cells and #size-cells. This is causing warning at device tree compilation when some I2C device sub-nodes are defined. * tag 'sti-dt-for-v4.9-rc-round2' of git://git.kernel.org/pub/scm/linux/kernel/git/pchotard/sti: ARM: dts: STiH407-family: fix i2c nodes
2016-11-25Merge tag 'sunxi-fixes-for-4.9-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes Pull "Allwinner fixes for 4.9, second iteration" from Maxime Ripard: A renaming of the GR8 DTSI and DTS to make it explicitly part of the sun5i family. * tag 'sunxi-fixes-for-4.9-2' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux: ARM: gr8: Rename the DTSI and relevant DTS
2016-11-25Merge branch 'cgroup-bpf'David S. Miller
Daniel Mack says: ==================== Add eBPF hooks for cgroups This is v9 of the patch set to allow eBPF programs for network filtering and accounting to be attached to cgroups, so that they apply to all sockets of all tasks placed in that cgroup. The logic also allows to be extendeded for other cgroup based eBPF logic. Again, only minor details are updated in this version. Changes from v8: * Move the egress hooks into ip_finish_output() and ip6_finish_output() so they run after the netfilter hooks. For IPv4 multicast, add a new ip_mc_finish_output() callback that is invoked on success by netfilter, and call the eBPF program from there. Changes from v7: * Replace the static inline function cgroup_bpf_run_filter() with two specific macros for ingress and egress. This addresses David Miller's concern regarding skb->sk vs. sk in the egress path. Thanks a lot to Daniel Borkmann and Alexei Starovoitov for the suggestions. Changes from v6: * Rebased to 4.9-rc2 * Add EXPORT_SYMBOL(__cgroup_bpf_run_filter). The kbuild test robot now succeeds in building this version of the patch set. * Switch from bpf_prog_run_save_cb() to bpf_prog_run_clear_cb() to not tamper with the contents of skb->cb[]. Pointed out by Daniel Borkmann. * Use sk_to_full_sk() in the egress path, as suggested by Daniel Borkmann. * Renamed BPF_PROG_TYPE_CGROUP_SOCKET to BPF_PROG_TYPE_CGROUP_SKB, as requested by David Ahern. * Added Alexei's Acked-by tags. Changes from v5: * The eBPF programs now operate on L3 rather than on L2 of the packets, and the egress hooks were moved from __dev_queue_xmit() to ip*_output(). * For BPF_PROG_TYPE_CGROUP_SOCKET, disallow direct access to the skb through BPF_LD_[ABS|IND] instructions, but hook up the bpf_skb_load_bytes() access helper instead. Thanks to Daniel Borkmann for the help. Changes from v4: * Plug an skb leak when dropping packets due to eBPF verdicts in __dev_queue_xmit(). Spotted by Daniel Borkmann. * Check for sk_fullsock(sk) in __cgroup_bpf_run_filter() so we don't operate on timewait or request sockets. Suggested by Daniel Borkmann. * Add missing @parent parameter in kerneldoc of __cgroup_bpf_update(). Spotted by Rami Rosen. * Include linux/jump_label.h from bpf-cgroup.h to fix a kbuild error. Changes from v3: * Dropped the _FILTER suffix from BPF_PROG_TYPE_CGROUP_SOCKET_FILTER, renamed BPF_ATTACH_TYPE_CGROUP_INET_{E,IN}GRESS to BPF_CGROUP_INET_{IN,E}GRESS and alias BPF_MAX_ATTACH_TYPE to __BPF_MAX_ATTACH_TYPE, as suggested by Daniel Borkmann. * Dropped the attach_flags member from the anonymous struct for BPF attach operations in union bpf_attr. They can be added later on via CHECK_ATTR. Requested by Daniel Borkmann and Alexei. * Release old_prog at the end of __cgroup_bpf_update rather that at the beginning to fix a race gap between program updates and their users. Spotted by Daniel Borkmann. * Plugged an skb leak when dropping packets on the egress path. Spotted by Daniel Borkmann. * Add cgroups@vger.kernel.org to the loop, as suggested by Rami Rosen. * Some minor coding style adoptions not worth mentioning in particular. Changes from v2: * Fixed the RCU locking details Tejun pointed out. * Assert bpf_attr.flags == 0 in BPF_PROG_DETACH syscall handler. Changes from v1: * Moved all bpf specific cgroup code into its own file, and stub out related functions for !CONFIG_CGROUP_BPF as static inline nops. This way, the call sites are not cluttered with #ifdef guards while the feature remains compile-time configurable. * Implemented the new scheme proposed by Tejun. Per cgroup, store one set of pointers that are pinned to the cgroup, and one for the programs that are effective. When a program is attached or detached, the change is propagated to all the cgroup's descendants. If a subcgroup has its own pinned program, skip the whole subbranch in order to allow delegation models. * The hookup for egress packets is now done from __dev_queue_xmit(). * A static key is now used in both the ingress and egress fast paths to keep performance penalties close to zero if the feature is not in use. * Overall cleanup to make the accessors use the program arrays. This should make it much easier to add new program types, which will then automatically follow the pinned vs. effective logic. * Fixed locking issues, as pointed out by Eric Dumazet and Alexei Starovoitov. Changes to the program array are now done with xchg() and are protected by cgroup_mutex. * eBPF programs are now expected to return 1 to let the packet pass, not >= 0. Pointed out by Alexei. * Operation is now limited to INET sockets, so local AF_UNIX sockets are not affected. The enum members are renamed accordingly. In case other socket families should be supported, this can be extended in the future. * The sample program learned to support both ingress and egress, and can now optionally make the eBPF program drop packets by making it return 0. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25samples: bpf: add userspace example for attaching eBPF programs to cgroupsDaniel Mack
Add a simple userpace program to demonstrate the new API to attach eBPF programs to cgroups. This is what it does: * Create arraymap in kernel with 4 byte keys and 8 byte values * Load eBPF program The eBPF program accesses the map passed in to store two pieces of information. The number of invocations of the program, which maps to the number of packets received, is stored to key 0. Key 1 is incremented on each iteration by the number of bytes stored in the skb. * Detach any eBPF program previously attached to the cgroup * Attach the new program to the cgroup using BPF_PROG_ATTACH * Once a second, read map[0] and map[1] to see how many bytes and packets were seen on any socket of tasks in the given cgroup. The program takes a cgroup path as 1st argument, and either "ingress" or "egress" as 2nd. Optionally, "drop" can be passed as 3rd argument, which will make the generated eBPF program return 0 instead of 1, so the kernel will drop the packet. libbpf gained two new wrappers for the new syscall commands. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: ipv4, ipv6: run cgroup eBPF egress programsDaniel Mack
If the cgroup associated with the receiving socket has an eBPF programs installed, run them from ip_output(), ip6_output() and ip_mc_output(). From mentioned functions we have two socket contexts as per 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn()."). We explicitly need to use sk instead of skb->sk here, since otherwise the same program would run multiple times on egress when encap devices are involved, which is not desired in our case. eBPF programs used in this context are expected to either return 1 to let the packet pass, or != 1 to drop them. The programs have access to the skb through bpf_skb_load_bytes(), and the payload starts at the network headers (L3). Note that cgroup_bpf_run_filter() is stubbed out as static inline nop for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if the feature is unused. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25net: filter: run cgroup eBPF ingress programsDaniel Mack
If the cgroup associated with the receiving socket has an eBPF programs installed, run them from sk_filter_trim_cap(). eBPF programs used in this context are expected to either return 1 to let the packet pass, or != 1 to drop them. The programs have access to the skb through bpf_skb_load_bytes(), and the payload starts at the network headers (L3). Note that cgroup_bpf_run_filter() is stubbed out as static inline nop for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if the feature is unused. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commandsDaniel Mack
Extend the bpf(2) syscall by two new commands, BPF_PROG_ATTACH and BPF_PROG_DETACH which allow attaching and detaching eBPF programs to a target. On the API level, the target could be anything that has an fd in userspace, hence the name of the field in union bpf_attr is called 'target_fd'. When called with BPF_ATTACH_TYPE_CGROUP_INET_{E,IN}GRESS, the target is expected to be a valid file descriptor of a cgroup v2 directory which has the bpf controller enabled. These are the only use-cases implemented by this patch at this point, but more can be added. If a program of the given type already exists in the given cgroup, the program is swapped automically, so userspace does not have to drop an existing program first before installing a new one, which would otherwise leave a gap in which no program is attached. For more information on the propagation logic to subcgroups, please refer to the bpf cgroup controller implementation. The API is guarded by CAP_NET_ADMIN. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25cgroup: add support for eBPF programsDaniel Mack
This patch adds two sets of eBPF program pointers to struct cgroup. One for such that are directly pinned to a cgroup, and one for such that are effective for it. To illustrate the logic behind that, assume the following example cgroup hierarchy. A - B - C \ D - E If only B has a program attached, it will be effective for B, C, D and E. If D then attaches a program itself, that will be effective for both D and E, and the program in B will only affect B and C. Only one program of a given type is effective for a cgroup. Attaching and detaching programs will be done through the bpf(2) syscall. For now, ingress and egress inet socket filtering are the only supported use-cases. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25bpf: add new prog type for cgroup socket filteringDaniel Mack
This program type is similar to BPF_PROG_TYPE_SOCKET_FILTER, except that it does not allow BPF_LD_[ABS|IND] instructions and hooks up the bpf_skb_load_bytes() helper. Programs of this type will be attached to cgroups for network filtering and accounting. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25Merge branches 'acpi-sleep-fixes' and 'acpi-wdat-fixes'Rafael J. Wysocki
* acpi-sleep-fixes: Revert "ACPI: Execute _PTS before system reboot" * acpi-wdat-fixes: watchdog: wdat_wdt: Select WATCHDOG_CORE
2016-11-25Merge tag 'linux-can-fixes-for-4.9-20161123' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-11-23 this is a pull request for net/master. The patch by Oliver Hartkopp for the broadcast manager (bcm) fixes the CAN-FD support, which may cause an out-of-bounds access otherwise. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25dwc_eth_qos: drop duplicate headersGeliang Tang
Drop duplicate headers types.h and delay.h from dwc_eth_qos.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25cxgb4: fix memory leak on txq_infoColin Ian King
Currently if txq_info->uldtxq cannot be allocated then txq_info->txq is being kfree'd (which is redundant because it is NULL) instead of txq_info. Fix this by instead kfree'ing txq_info. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-25Merge tag 'mfd-fixes-4.9.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD fixes from Lee Jones: "Received a copule of last minute fixes for v4.9. The patches from Viresh are fixing issues displayed in KernelCI" * tag 'mfd-fixes-4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: wm8994-core: Don't use managed regulator bulk get API mfd: wm8994-core: Disable regulators before removing them mfd: syscon: Support native-endian regmaps
2016-11-25Merge tag 'media/v4.9-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fix from Mauro Carvalho Chehab: "Fix for the firmware load logic of the tuner-xc2028 driver" * tag 'media/v4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: xc2028: Fix use-after-free bug properly
2016-11-25perf trace: Update tid/pid filtering option to leverage symbol_confDavid Ahern
Leverage pid/tid filtering done by symbol_conf hooks. Signed-off-by: David Ahern <dsahern@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1480091392-35645-1-git-send-email-dsa@cumulusnetworks.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25perf sched timehist: Handle cpu migration eventsDavid Ahern
Add handlers for sched:sched_migrate_task event. Total number of migrations is added to summary display and -M/--migrations can be used to show migration events. Signed-off-by: David Ahern <dsahern@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1480091321-35591-1-git-send-email-dsa@cumulusnetworks.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25perf annotate: Show invalid jump offset in error messageArnaldo Carvalho de Melo
To help in debugging when the wrong offset is being used, like in: │13d98: ↓ jne 13dd1 <lzma_lzma_preset@@XZ_5.0+0x28e1> That is the full line from objdump, and it seems what should be used is 13dd1, not 28e1. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-4nc0marsgst1ft6inmvqber7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25Merge tag 'drm-fixes-for-v4.9-rc7' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Seems to be quietening down nicely, a few mediatek, one exynos and one hdlcd fix, along with two amd fixes" * tag 'drm-fixes-for-v4.9-rc7' of git://people.freedesktop.org/~airlied/linux: gpu/drm/exynos/exynos_hdmi - Unmap region obtained by of_iomap drm/mediatek: fix null pointer dereference drm/mediatek: fixed the calc method of data rate per lane drm/mediatek: fix a typo of DISP_OD_CFG to OD_RELAYMODE drm/radeon: fix power state when port pm is unavailable (v2) drm/amdgpu: fix power state when port pm is unavailable drm/arm: hdlcd: fix plane base address update drm/amd/powerplay: avoid out of bounds access on array ps.
2016-11-25perf ui helpline: Provide a printf variantArnaldo Carvalho de Melo
To print some values, like in the annotation code with invalid jump offsets. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-1vk0g5twas2ioswn1mmvnvwq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25Merge tag 'perf-core-for-mingo-20161125' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: New features: - Improve ARM support in the annotation code, affecting 'perf annotate', 'perf report' and live annotation in 'perf top' (Kim Phillips) - Initial support for PowerPC in the annotation code (Ravi Bangoria) - Skip repetitive scheduler function on the top of the stack in 'perf sched timehist' (Namhyung Kim) Fixes: - Fix maps resolution in libbpf (Eric Leblond) - Get the kernel signature via /proc/version_signature, available on Ubuntu systems, to make sure BPF proggies works, as the one provided via 'uname -r' doesn't (Wang Nan) - Fix segfault in 'perf record' when running with suid and kptr_restrict is 1 (Wang Nan) Infrastructure changes: - Support per-arch instruction tables, kept via a static or dynamic table (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-25drm: hdlcd: Fix cleanup orderRobin Murphy
If hdlcd_drm_bind() fails at drm_fbdev_cma_init(), its cleanup will call drm_mode_config_cleanup() as if to balance drm_mode_config_reset(). The net result is that drm_connector_cleanup() will clean up the active connectors long before component_unbind_all() gets called, so when the connector later tries to clean up itself after being unbound, Bad Things can happen: [ 4.121888] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 4.129951] pgd = ffffff80091e0000 [ 4.133345] [00000000] *pgd=00000009ffffe003, *pud=00000009ffffe003, *pmd=0000000000000000 [ 4.141613] Internal error: Oops: 96000005 [#1] PREEMPT SMP [ 4.147144] Modules linked in: [ 4.150188] CPU: 0 PID: 122 Comm: kworker/u12:2 Not tainted 4.8.0-rc2+ #989 [ 4.157097] Hardware name: ARM Juno development board (r1) (DT) [ 4.162981] Workqueue: deferwq deferred_probe_work_func [ 4.168173] task: ffffffc975d93200 task.stack: ffffffc975dac000 [ 4.174055] PC is at drm_connector_cleanup+0x58/0x1c0 [ 4.179074] LR is at tda998x_unbind+0x24/0x40 [ 4.183401] pc : [<ffffff80084c46f0>] lr : [<ffffff800850414c>] pstate: 00000045 [ 4.190750] sp : ffffffc975dafa10 [ 4.194041] x29: ffffffc975dafa10 x28: ffffffc9768152a8 [ 4.199325] x27: ffffffc97ff46450 x26: ffffff8008d99000 [ 4.204608] x25: dead000000000100 x24: dead000000000200 [ 4.209891] x23: ffffffc976bf91e8 x22: 0000000000000000 [ 4.215172] x21: ffffffc976bf9170 x20: ffffffc976bf9170 [ 4.220454] x19: ffffffc976bf9018 x18: 0000000000000000 [ 4.225737] x17: 0000000074ce71ee x16: 000000008ff5d35f [ 4.231019] x15: ffffffc97681e91c x14: ffffffffffffffff [ 4.236301] x13: ffffffc97681e185 x12: 0000000000000038 [ 4.241583] x11: 0101010101010101 x10: 0000000000000000 [ 4.246866] x9 : 0000000040000000 x8 : 0000000000210d00 [ 4.252148] x7 : ffffffc97fea8c00 x6 : 000000000000001b [ 4.257430] x5 : ffffff80084b7b8c x4 : 0000000000000080 [ 4.262712] x3 : ffffff8008504128 x2 : ffffffc975df3800 [ 4.267993] x1 : 0000000000000000 x0 : 0000000000000000 ... [ 4.750937] [<ffffff80084c46f0>] drm_connector_cleanup+0x58/0x1c0 [ 4.756990] [<ffffff800850414c>] tda998x_unbind+0x24/0x40 [ 4.762354] [<ffffff8008507918>] component_unbind.isra.4+0x28/0x50 [ 4.768492] [<ffffff8008507a0c>] component_unbind_all+0xcc/0xd8 [ 4.774373] [<ffffff80084d5adc>] hdlcd_drm_bind+0x234/0x418 [ 4.779909] [<ffffff8008507b58>] try_to_bring_up_master+0x140/0x1a0 [ 4.786133] [<ffffff8008507c50>] component_add+0x98/0x170 [ 4.791496] [<ffffff8008504b90>] tda998x_probe+0x18/0x20 [ 4.796774] [<ffffff80086bf914>] i2c_device_probe+0x164/0x258 [ 4.802481] [<ffffff800850d094>] driver_probe_device+0x204/0x2b0 [ 4.808447] [<ffffff800850d28c>] __device_attach_driver+0x9c/0xf8 [ 4.814498] [<ffffff800850b108>] bus_for_each_drv+0x58/0x98 [ 4.820033] [<ffffff800850cd64>] __device_attach+0xc4/0x138 [ 4.825567] [<ffffff800850d338>] device_initial_probe+0x10/0x18 [ 4.831446] [<ffffff800850c124>] bus_probe_device+0x94/0xa0 [ 4.836981] [<ffffff800850c5b0>] deferred_probe_work_func+0x78/0xb0 [ 4.843207] [<ffffff80080d2998>] process_one_work+0x118/0x378 [ 4.848914] [<ffffff80080d2c40>] worker_thread+0x48/0x498 [ 4.854276] [<ffffff80080d8918>] kthread+0xd0/0xe8 [ 4.859036] [<ffffff8008082e90>] ret_from_fork+0x10/0x40 [ 4.864314] Code: f2fbd5b9 f2fbd5b8 f8478ee0 eb17001f (f9400013) [ 4.870472] ---[ end trace a643cfe4ce1d838b ]--- Fix this by moving the drm_mode_config_cleanup() much later such that it correctly balances drm_mode_config_init(). Suggested-by: Russell King <linux@armlinux.org.uk> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
2016-11-25scsi: lpfc: fix oops/BUG in lpfc_sli_ringtxcmpl_put()Mauricio Faria de Oliveira
The BUG_ON() recently introduced in lpfc_sli_ringtxcmpl_put() is hit in the lpfc_els_abort() > lpfc_sli_issue_abort_iotag() > lpfc_sli_abort_iotag_issue() function path [similar names], due to 'piocb->vport == NULL': BUG_ON(!piocb || !piocb->vport); This happens because lpfc_sli_abort_iotag_issue() doesn't set the 'abtsiocbp->vport' pointer -- but this is not the problem. Previously, lpfc_sli_ringtxcmpl_put() accessed 'piocb->vport' only if 'piocb->iocb.ulpCommand' is neither CMD_ABORT_XRI_CN nor CMD_CLOSE_XRI_CN, which are the only possible values for lpfc_sli_abort_iotag_issue(): lpfc_sli_ringtxcmpl_put(): if ((unlikely(pring->ringno == LPFC_ELS_RING)) && (piocb->iocb.ulpCommand != CMD_ABORT_XRI_CN) && (piocb->iocb.ulpCommand != CMD_CLOSE_XRI_CN) && (!(piocb->vport->load_flag & FC_UNLOADING))) lpfc_sli_abort_iotag_issue(): if (phba->link_state >= LPFC_LINK_UP) iabt->ulpCommand = CMD_ABORT_XRI_CN; else iabt->ulpCommand = CMD_CLOSE_XRI_CN; So, this function path would not have hit this possible NULL pointer dereference before. In order to fix this regression, move the second part of the BUG_ON() check prior to the pointer dereference that it does check for. For reference, this is the stack trace observed. The problem happened because an unsolicited event was received - a PLOGI was received after our PLOGI was issued but not yet complete, so the discovery state machine goes on to sw-abort our PLOGI. kernel BUG at drivers/scsi/lpfc/lpfc_sli.c:1326! Oops: Exception in kernel mode, sig: 5 [#1] <...> NIP [...] lpfc_sli_ringtxcmpl_put+0x1c/0xf0 [lpfc] LR [...] __lpfc_sli_issue_iocb_s4+0x188/0x200 [lpfc] Call Trace: [...] [...] __lpfc_sli_issue_iocb_s4+0xb0/0x200 [lpfc] (unreliable) [...] [...] lpfc_sli_issue_abort_iotag+0x2b4/0x350 [lpfc] [...] [...] lpfc_els_abort+0x1a8/0x4a0 [lpfc] [...] [...] lpfc_rcv_plogi+0x6d4/0x700 [lpfc] [...] [...] lpfc_rcv_plogi_plogi_issue+0xd8/0x1d0 [lpfc] [...] [...] lpfc_disc_state_machine+0xc0/0x2b0 [lpfc] [...] [...] lpfc_els_unsol_buffer+0xcc0/0x26c0 [lpfc] [...] [...] lpfc_els_unsol_event+0xa8/0x220 [lpfc] [...] [...] lpfc_complete_unsol_iocb+0xb8/0x138 [lpfc] [...] [...] lpfc_sli4_handle_received_buffer+0x6a0/0xec0 [lpfc] [...] [...] lpfc_sli_handle_slow_ring_event_s4+0x1c4/0x240 [lpfc] [...] [...] lpfc_sli_handle_slow_ring_event+0x24/0x40 [lpfc] [...] [...] lpfc_do_work+0xd88/0x1970 [lpfc] [...] [...] kthread+0x108/0x130 [...] [...] ret_from_kernel_thread+0x5c/0xbc <...> Cc: stable@vger.kernel.org # v4.8 Fixes: 22466da5b4b7 ("lpfc: Fix possible NULL pointer dereference") Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-25tools lib bpf: Fix maps resolutionEric Leblond
It is not correct to assimilate the elf data of the maps section to an array of map definition. In fact the sizes differ. The offset provided in the symbol section has to be used instead. This patch fixes a bug causing a elf with two maps not to load correctly. Wang Nan added: This patch requires a name for each BPF map, so array of BPF maps is not allowed. This restriction is reasonable, because kernel verifier forbid indexing BPF map from such array unless the index is a fixed value, but if the index is fixed why not merging it into name? For example: Program like this: ... unsigned long cpu = get_smp_processor_id(); int *pval = map_lookup_elem(&map_array[cpu], &key); ... Generates bytecode like this: 0: (b7) r1 = 0 1: (63) *(u32 *)(r10 -4) = r1 2: (b7) r1 = 680997 3: (63) *(u32 *)(r10 -8) = r1 4: (85) call 8 5: (67) r0 <<= 4 6: (18) r1 = 0x112dd000 8: (0f) r0 += r1 9: (bf) r2 = r10 10: (07) r2 += -4 11: (bf) r1 = r0 12: (85) call 1 Where instruction 8 is the computation, 8 and 11 render r1 to an invalid value for function map_lookup_elem, causes verifier report error. Signed-off-by: Eric Leblond <eric@regit.org> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Wang Nan <wangnan0@huawei.com> [ Merge bpf_object__init_maps_name into bpf_object__init_maps. Fix segfault for buggy BPF script Validate obj->maps ] Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161115040617.69788-5-wangnan0@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25perf tools: Add missing struct definition in probe_event.hWang Nan
Commit 0b3c2264ae30 ("perf symbols: Fix kallsyms perf test on ppc64le") refers struct symbol in probe_event.h, but forgets to include its definition. Gcc will complain about it when that definition is not added, by sheer luck, by some other header included before probe_event.h. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161115040617.69788-4-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25perf record: Fix segfault when running with suid and kptr_restrict is 1Wang Nan
Before this patch perf panics if kptr_restrict is set to 1 and perf is owned by root with suid set: $ whoami wangnan $ ls -l ./perf -rwsr-xr-x 1 root root 19781908 Sep 21 19:29 /home/wangnan/perf $ cat /proc/sys/kernel/kptr_restrict 1 $ cat /proc/sys/kernel/perf_event_paranoid -1 $ ./perf record -a Segmentation fault (core dumped) $ The reason is that perf assumes it is allowed to read kptr from /proc/kallsyms when euid is root, but in fact the kernel doesn't allow reading kptr when euid and uid do not match with each other: $ cp /bin/cat . $ sudo chown root:root ./cat $ sudo chmod u+s ./cat $ cat /proc/kallsyms | grep do_fork 0000000000000000 T _do_fork <--- kptr is hidden even euid is root $ sudo cat /proc/kallsyms | grep do_fork ffffffff81080230 T _do_fork See lib/vsprintf.c for kernel side code. This patch fixes this problem by checking both uid and euid. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161115040617.69788-3-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>