summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2016-07-04NFC: digital: Add a delay between poll cyclesThierry Escande
This replaces the polling work struct with a delayed work struct and add a 10 ms delay between 2 poll cycles. This avoids to flood the device with 'switch off'/'switch on' commands. Signed-off-by: Thierry Escande <thierry.escande@collabora.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-04Merge branch 'irq/for-block' into irq/coreThomas Gleixner
Pull the irq affinity managing code which is in a seperate branch for block developers to pull.
2016-07-04genirq: Add a helper to spread an affinity mask for MSI/MSI-X vectorsChristoph Hellwig
This is lifted from the blk-mq code and adopted to use the affinity mask concept just introduced in the irq handling code. It tries to keep the algorithm the same as the one current used by blk-mq, but improvements like assining vectors on a per-node basis instead of just per sibling are possible with this simple move and refactoring. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: linux-block@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: axboe@fb.com Cc: agordeev@redhat.com Link: http://lkml.kernel.org/r/1467621574-8277-7-git-send-email-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04genirq/msi: Make use of affinity aware allocationsThomas Gleixner
Allow the MSI code to provide affinity hints per MSI descriptor. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: linux-block@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: axboe@fb.com Cc: agordeev@redhat.com Link: http://lkml.kernel.org/r/1467621574-8277-6-git-send-email-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04genirq: Add affinity hint to irq allocationThomas Gleixner
Add an extra argument to the irq(domain) allocation functions, so we can hand down affinity hints to the allocator. Thats necessary to implement proper support for multiqueue devices. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: linux-block@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: axboe@fb.com Cc: agordeev@redhat.com Link: http://lkml.kernel.org/r/1467621574-8277-4-git-send-email-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04genirq: Introduce IRQD_AFFINITY_MANAGED flagThomas Gleixner
Interupts marked with this flag are excluded from user space interrupt affinity changes. Contrary to the IRQ_NO_BALANCING flag, the kernel internal affinity mechanism is not blocked. This flag will be used for multi-queue device interrupts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: linux-block@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: axboe@fb.com Cc: agordeev@redhat.com Link: http://lkml.kernel.org/r/1467621574-8277-3-git-send-email-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04genirq/msi: Remove unused MSI_FLAG_IDENTITY_MAPThomas Gleixner
No user and we definitely don't want to grow one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: linux-block@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: axboe@fb.com Cc: agordeev@redhat.com Link: http://lkml.kernel.org/r/1467621574-8277-2-git-send-email-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-07-04NFC: hci: delete unused nfc_llc_get_rx_head_tail_room()Denys Vlasenko
It used to be EXPORTed, but then EXPORT usage was cleaned up (in 2012), without noticing that the function has no users at all (and curiously, never had any users). Delete it. While at it, remove non-static "inline" hints on nearby functions: these hints don't work across compilation units anyway, and these functions are not used in their .c file, thus they are never inlined. IOW: "inline" here does not help in any way. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: Samuel Ortiz <sameo@linux.intel.com> CC: Christophe Ricard <christophe.ricard@gmail.com> CC: linux-wireless@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-04drm/i915: Allow userspace to request no-error-capture upon GPU hangsChris Wilson
igt likes to inject GPU hangs into its command streams. However, as we expect these hangs, we don't actually want them recorded in the dmesg output or stored in the i915_error_state (usually). To accommodate this allow userspace to set a flag on the context that any hang emanating from that context will not be recorded. We still do the error capture (otherwise how do we find the guilty context and know its intent?) as part of the reason for random GPU hang injection is to exercise the race conditions between the error capture and normal execution. v2: Split out the request->ringbuf error capture changes. v3: Move the flag defines next to the intel_context->flags definition Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-9-git-send-email-chris@chris-wilson.co.uk
2016-07-03KVM: arm/arm64: The GIC is dead, long live the GICMarc Zyngier
I don't think any single piece of the KVM/ARM code ever generated as much hatred as the GIC emulation. It was written by someone who had zero experience in modeling hardware (me), was riddled with design flaws, should have been scrapped and rewritten from scratch long before having a remote chance of reaching mainline, and yet we supported it for a good three years. No need to mention the names of those who suffered, the git log is singing their praises. Thankfully, we now have a much more maintainable implementation, and we can safely put the grumpy old GIC to rest. Fellow hackers, please raise your glass in memory of the GIC: The GIC is dead, long live the GIC! Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-07-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fix from Miklos Szeredi: "This makes sure userspace filesystems are not broken by the parallel lookups and readdir feature" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: serialize dirops by default
2016-07-03netfilter: Convert FWINV<[foo]> macros and uses to NF_INVFJoe Perches
netfilter uses multiple FWINV #defines with identical form that hide a specific structure variable and dereference it with a invflags member. $ git grep "#define FWINV" include/linux/netfilter_bridge/ebtables.h:#define FWINV(bool,invflg) ((bool) ^ !!(info->invflags & invflg)) net/bridge/netfilter/ebtables.c:#define FWINV2(bool, invflg) ((bool) ^ !!(e->invflags & invflg)) net/ipv4/netfilter/arp_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(arpinfo->invflags & (invflg))) net/ipv4/netfilter/ip_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ipinfo->invflags & (invflg))) net/ipv6/netfilter/ip6_tables.c:#define FWINV(bool, invflg) ((bool) ^ !!(ip6info->invflags & (invflg))) net/netfilter/xt_tcpudp.c:#define FWINVTCP(bool, invflg) ((bool) ^ !!(tcpinfo->invflags & (invflg))) Consolidate these macros into a single NF_INVF macro. Miscellanea: o Neaten the alignment around these uses o A few lines are > 80 columns for intelligibility Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-03random: replace non-blocking pool with a Chacha20-based CRNGTheodore Ts'o
The CRNG is faster, and we don't pretend to track entropy usage in the CRNG any more. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-07-02iio: st_sensors: harden interrupt handlingLinus Walleij
Leonard Crestez observed the following phenomenon: when using hard interrupt triggers (the DRDY line coming out of an ST sensor) sometimes a new value would arrive while reading the previous value, due to latencies in the system. We discovered that the ST hardware as far as can be observed is designed for level interrupts: the DRDY line will be held asserted as long as there are new values coming. The interrupt handler should be re-entered until we're out of values to handle from the sensor. If interrupts were handled as occurring on the edges (usually low-to-high) new values could appear and the line be held asserted after that, and these values would be missed, the interrupt handler would also lock up as new data was available, but as no new edges occurs on the DRDY signal, nothing happens: the edge detector only detects edges. To counter this, do the following: - Accept interrupt lines to be flagged as level interrupts using IRQF_TRIGGER_HIGH and IRQF_TRIGGER_LOW. If the line is marked like this (in the device tree node or ACPI table or similar) it will be utilized as a level IRQ. We mark the line with IRQF_ONESHOT and mask the IRQ while processing a sample, then the top half will be entered again if new values are available. - If we are flagged as using edge interrupts with IRQF_TRIGGER_RISING or IRQF_TRIGGER_FALLING: remove IRQF_ONESHOT so that the interrupt line is not masked while running the thread part of the interrupt. This way we will never miss an interrupt, then introduce a loop that polls the data ready registers repeatedly until no new samples are available, then exit the interrupt handler. This way we know no new values are available when the interrupt handler exits and new (edge) interrupts will be triggered when data arrives. Take some extra care to update the timestamp in the poll loop if this happens. The timestamp will not be 100% perfect, but it will at least be closer to the actual events. Usually the extra poll loop will handle the new samples, but once in a blue moon, we get a new IRQ while exiting the loop, before returning from the thread IRQ bottom half with IRQ_HANDLED. On these rare occasions, the removal of IRQF_ONESHOT means the interrupt will immediately fire again. - If no interrupt type is indicated from the DT/ACPI, choose IRQF_TRIGGER_RISING as default, as this is necessary for legacy boards. Tested successfully on the LIS331DL and L3G4200D by setting sampling frequency to 400Hz/800Hz and stressing the system: extra reads in the threaded interrupt handler occurs. Cc: Giuseppe Barba <giuseppe.barba@st.com> Cc: Denis Ciocca <denis.ciocca@st.com> Tested-by: Crestez Dan Leonard <cdleonard@gmail.com> Reported-by: Crestez Dan Leonard <cdleonard@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2016-07-02net/mlx5e: Create NIC global resources only onceHadar Hen Zion
To allow creating more than one netdev over the same PCI function, we change the driver such that global NIC resources are created once and later be shared amongst all the mlx5e netdevs running over that port. Move the CQ UAR, PD (pdn), Transport Domain (tdn), MKey resources from being kept in the mlx5e priv part to a new resources structure (mlx5e_resources) placed under the mlx5_core device. This patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02net/devlink: Add E-Switch mode controlOr Gerlitz
Add the commands to set and show the mode of SRIOV E-Switch, two modes are supported: * legacy: operating in the "old" L2 based mode (DMAC --> VF vport) * switchdev: the E-Switch is referred to as whitebox switch configured using standard tools such as tc, bridge, openvswitch etc. To allow working with the tools, for each VF, a VF representor netdevice is created by the E-Switch manager vendor device driver instance (e.g PF). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02net/mlx5: Introduce offloads steering namespaceOr Gerlitz
Add a new namespace (MLX5_FLOW_NAMESPACE_OFFLOADS) to be populated with flow steering rules that deal with rules that have have to be executed before the EN NIC steering rules are matched. The namespace is located after the bypass name-space and before the kernel name-space. Therefore, it precedes the HW processing done for rules set for the kernel NIC name-space. Under SRIOV, it would allow us to match on e-switch missed packet and forward them to the relevant VF representor TIR. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02Merge branch 'for-next' of http://git.agner.ch/git/linux-drm-fsl-dcu into ↵Dave Airlie
drm-next The patchset contains a new helper in drm_fb_cma_helper.c for suspend/ resume when using cma backed framebuffers. * 'for-next' of http://git.agner.ch/git/linux-drm-fsl-dcu: drm/fsl-dcu: disable vblank events on CRTC disable drm/fsl-dcu: implement suspend/resume using atomic helpers drm/fsl-dcu: use clk helpers drm/fsl-dcu: move layer initialization to plane file drm/fsl-dcu: store layer registers in soc_data drm/fb_cma_helper: add suspend helper
2016-07-02Back-merge tag 'v4.7-rc5' into drm-nextDave Airlie
Linux 4.7-rc5 The fsl-dcu pull needs -rc3 so go to -rc5 for now.
2016-07-02Merge tag 'drm-intel-fixes-2016-06-30' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-fixes here's a batch of i915 fixes for 4.7. * tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel: drm/i915: Fix missing unlock on error in i915_ppgtt_info() drm/i915: Removing PCI IDs that are no longer listed as Kabylake. drm/i915: Add more Kabylake PCI IDs. drm/i915: Avoid early timeout during AUX transfers drm/i915/hsw: Avoid early timeout during LCPLL disable/restore drm/i915/lpt: Avoid early timeout during FDI PHY reset drm/i915/bxt: Avoid early timeout during PLL enable drm/i915: Refresh cached DP port register value on resume
2016-07-02extcon: adc-jack: add suspend/resume supportVenkat Reddy Talla
adding suspend and resume funtionality for extcon-adc-jack driver to configure system wake up for extcon events, also adding support to enable/disable system wakeup through flag wakeup_source based on platform requirement. Signed-off-by: Venkat Reddy Talla <vreddytalla@nvidia.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2016-07-01clk: core: support clocks which requires parents enable (part 1)Dong Aisheng
On Freescale i.MX7D platform, all clocks operations, including enable/disable, rate change and re-parent, requires its parent clock enable. Current clock core can not support it well. This patch introduce a new flag CLK_OPS_PARENT_ENABLE to handle this special case in clock core that enable its parent clock firstly for each operation and disable it later after operation complete. The patch part 1 fixes the possible disabling clocks while its parent is off during kernel booting phase in clk_disable_unused_subtree(). Before the completion of kernel booting, clock tree is still not built completely, there may be a case that the child clock is on but its parent is off which could be caused by either HW initial reset state or bootloader initialization. Taking bootloader as an example, we may enable all clocks in HW by default. And during kernel booting time, the parent clock could be disabled in its driver probe due to calling clk_prepare_enable and clk_disable_unprepare. Because it's child clock is only enabled in HW while its SW usecount in clock tree is still 0, so clk_disable of parent clock will gate the parent clock in both HW and SW usecount ultimately. Then there will be a child clock is still on in HW but its parent is already off. Later in clk_disable_unused(), this clock disable accessing while its parent off will cause system hang due to the limitation of HW which must require its parent on. This patch simply enables the parent clock first before disabling if flag CLK_OPS_PARENT_ENABLE is set in clk_disable_unused_subtree(). This is a simple solution and only affects booting time. After kernel booting up the clock tree is already created, there will be no case that child is off but its parent is off. So no need do this checking for normal clk_disable() later. Cc: Michael Turquette <mturquette@baylibre.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-07-01Merge tag 'v4.8-rockchip-clk1' of ↵Stephen Boyd
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-next Pull rockchip clk driver updates from Heiko Stuebner: Placeholder for the rk3399 watchdog pclk, some newly exported rk3228 clockids and a small fix for the not yet used spdif to displayport clock on the rk3399. * tag 'v4.8-rockchip-clk1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: clk: rockchip: fix incorrect rk3399 spdif-DPTX divider bits clk: rockchip: export rk3228 MAC clocks clk: rockchip: rename rk3228 sclk_macphy_50m to sclk_mac_extclk clk: rockchip: export rk3228 audio clocks clk: rockchip: include rk3228 downstream muxes into fractional dividers clk: rockchip: fix incorrect rk3228 clock registers clk: rockchip: add clock-ids for rk3228 MAC clocks clk: rockchip: add clock-ids for rk3228 audio clocks clk: rockchip: add a dummy clock for the watchdog pclk on rk3399
2016-07-01Merge tag 'tegra-for-4.8-clk' of ↵Stephen Boyd
git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into clk-next Pull tegra clk driver updates from Thierry Reding: Fixes and enhancements mostly for Tegra210 clocks that allow DSI and HDMI to work on Tegra X1. There's also a refactoring, including fixes, the USB PLL. * tag 'tegra-for-4.8-clk' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: clk: tegra: Initialize UTMI PLL when enabling PLLU clk: tegra: Micro-optimize Tegra210 clock setup clk: tegra: Make sor_safe the parent of dpaux and dpaux1 clk: tegra: Mark timer clock as critical clk: tegra: Enable sor1 and sor1_src on Tegra210 clk: tegra: Squash sor1 safe/brick/src into a single mux clk: tegra: Disable spread spectrum on pll_d2 clk: tegra: Fixup post dividers on Tegra210
2016-07-02Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()"Sinan Kaya
Trying to make the ISA and PCI init functionality common turned out to be a bad idea, because the ISA path depends on external functionality. Restore the previous behavior and limit the refactoring to PCI interrupts only. Fixes: 1fcb6a813c4f "ACPI,PCI,IRQ: remove redundant code in acpi_irq_penalty_init()" Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Tested-by: Wim Osterholt <wim@djo.tudelft.nl> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-01power_supply: fix return value of get_propertyRhyland Klein
power_supply_get_property() should ideally return -EAGAIN if it is called while the power_supply is being registered. There was no way previously to determine if use_cnt == 0 meant that the power_supply wasn't fully registered yet, or if it had already been unregistered. Add a new boolean to the power_supply struct to simply show if registration is completed. Lastly, modify the check in power_supply_show_property() to also ignore -EAGAIN when so it doesn't complain about not returning the property. Signed-off-by: Rhyland Klein <rklein@nvidia.com> Signed-off-by: Sebastian Reichel <sre@kernel.org>
2016-07-01cgroup: bpf: Add bpf_skb_in_cgroup_protoMartin KaFai Lau
Adds a bpf helper, bpf_skb_in_cgroup, to decide if a skb->sk belongs to a descendant of a cgroup2. It is similar to the feature added in netfilter: commit c38c4597e4bf ("netfilter: implement xt_cgroup cgroup2 path match") The user is expected to populate a BPF_MAP_TYPE_CGROUP_ARRAY which will be used by the bpf_skb_in_cgroup. Modifications to the bpf verifier is to ensure BPF_MAP_TYPE_CGROUP_ARRAY and bpf_skb_in_cgroup() are always used together. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Tejun Heo <tj@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01cgroup: bpf: Add BPF_MAP_TYPE_CGROUP_ARRAYMartin KaFai Lau
Add a BPF_MAP_TYPE_CGROUP_ARRAY and its bpf_map_ops's implementations. To update an element, the caller is expected to obtain a cgroup2 backed fd by open(cgroup2_dir) and then update the array with that fd. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Tejun Heo <tj@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01cgroup: Add cgroup_get_from_fdMartin KaFai Lau
Add a helper function to get a cgroup2 from a fd. It will be stored in a bpf array (BPF_MAP_TYPE_CGROUP_ARRAY) which will be introduced in the later patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net_sched: fix mirrored packets checksumWANG Cong
Similar to commit 9b368814b336 ("net: fix bridge multicast packet checksum validation") we need to fixup the checksum for CHECKSUM_COMPLETE when pushing skb on RX path. Otherwise we get similar splats. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01packet: Use symmetric hash for PACKET_FANOUT_HASH.David S. Miller
People who use PACKET_FANOUT_HASH want a symmetric hash, meaning that they want packets going in both directions on a flow to hash to the same bucket. The core kernel SKB hash became non-symmetric when the ipv6 flow label and other entities were incorporated into the standard flow hash order to increase entropy. But there are no users of PACKET_FANOUT_HASH who want an assymetric hash, they all want a symmetric one. Therefore, use the flow dissector to compute a flat symmetric hash over only the protocol, addresses and ports. This hash does not get installed into and override the normal skb hash, so this change has no effect whatsoever on the rest of the stack. Reported-by: Eric Leblond <eric@regit.org> Tested-by: Eric Leblond <eric@regit.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01bpf: refactor bpf_prog_get and type check into helperDaniel Borkmann
Since bpf_prog_get() and program type check is used in a couple of places, refactor this into a small helper function that we can make use of. Since the non RO prog->aux part is not used in performance critical paths and a program destruction via RCU is rather very unlikley when doing the put, we shouldn't have an issue just doing the bpf_prog_get() + prog->type != type check, but actually not taking the ref at all (due to being in fdget() / fdput() section of the bpf fd) is even cleaner and makes the diff smaller as well, so just go for that. Callsites are changed to make use of the new helper where possible. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01bpf: generally move prog destruction to RCU deferralDaniel Borkmann
Jann Horn reported following analysis that could potentially result in a very hard to trigger (if not impossible) UAF race, to quote his event timeline: - Set up a process with threads T1, T2 and T3 - Let T1 set up a socket filter F1 that invokes another filter F2 through a BPF map [tail call] - Let T1 trigger the socket filter via a unix domain socket write, don't wait for completion - Let T2 call PERF_EVENT_IOC_SET_BPF with F2, don't wait for completion - Now T2 should be behind bpf_prog_get(), but before bpf_prog_put() - Let T3 close the file descriptor for F2, dropping the reference count of F2 to 2 - At this point, T1 should have looked up F2 from the map, but not finished executing it - Let T3 remove F2 from the BPF map, dropping the reference count of F2 to 1 - Now T2 should call bpf_prog_put() (wrong BPF program type), dropping the reference count of F2 to 0 and scheduling bpf_prog_free_deferred() via schedule_work() - At this point, the BPF program could be freed - BPF execution is still running in a freed BPF program While at PERF_EVENT_IOC_SET_BPF time it's only guaranteed that the perf event fd we're doing the syscall on doesn't disappear from underneath us for whole syscall time, it may not be the case for the bpf fd used as an argument only after we did the put. It needs to be a valid fd pointing to a BPF program at the time of the call to make the bpf_prog_get() and while T2 gets preempted, F2 must have dropped reference to 1 on the other CPU. The fput() from the close() in T3 should also add additionally delay to the reference drop via exit_task_work() when bpf_prog_release() gets called as well as scheduling bpf_prog_free_deferred(). That said, it makes nevertheless sense to move the BPF prog destruction generally after RCU grace period to guarantee that such scenario above, but also others as recently fixed in ceb56070359b ("bpf, perf: delay release of BPF prog after grace period") with regards to tail calls won't happen. Integrating bpf_prog_free_deferred() directly into the RCU callback is not allowed since the invocation might happen from either softirq or process context, so we're not permitted to block. Reviewing all bpf_prog_put() invocations from eBPF side (note, cBPF -> eBPF progs don't use this for their destruction) with call_rcu() look good to me. Since we don't know whether at the time of attaching the program, we're already part of a tail call map, we need to use RCU variant. However, due to this, there won't be severely more stress on the RCU callback queue: situations with above bpf_prog_get() and bpf_prog_put() combo in practice normally won't lead to releases, but even if they would, enough effort/ cycles have to be put into loading a BPF program into the kernel already. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01drm/ttm: Make ttm_bo_mem_compat availableSinclair Yeh
There are cases where it is desired to see if a proposed placement is compatible with a buffer object before calling ttm_bo_validate(). Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: <stable@vger.kernel.org> --- This is the first of a 3-patch series to fix a black screen issue observed on Ubuntu 16.04 server.
2016-07-01Merge tag 'usb-4.7-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB and PHY fixes from Greg KH: "Here are a number of small USB and PHY driver fixes for 4.7-rc6. Nothing major here, all are described in the shortlog below. All have been in linux-next with no reported issues" * tag 'usb-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: don't free bandwidth_mutex too early USB: EHCI: declare hostpc register as zero-length array phy-sun4i-usb: Fix irq free conditions to match request conditions phy: bcm-ns-usb2: checking the wrong variable phy-sun4i-usb: fix missing __iomem * phy: phy-sun4i-usb: Fix optional gpios failing probe phy: rockchip-dp: fix return value check in rockchip_dp_phy_probe() phy: rcar-gen3-usb2: fix unexpected repeat interrupts of VBUS change usb: common: otg-fsm: add license to usb-otg-fsm
2016-07-01crypto: rsa - Generate fixed-length outputHerbert Xu
Every implementation of RSA that we have naturally generates output with leading zeroes. The one and only user of RSA, pkcs1pad wants to have those leading zeroes in place, in fact because they are currently absent it has to write those zeroes itself. So we shouldn't be stripping leading zeroes in the first place. In fact this patch makes rsa-generic produce output with fixed length so that pkcs1pad does not need to do any extra work. This patch also changes DH to use the new interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-01crypto: api - Add crypto_inst_setnameHerbert Xu
This patch adds the helper crypto_inst_setname because the current helper crypto_alloc_instance2 is no longer useful given that we now look up the algorithm after we allocate the instance object. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-07-01ASoC: simple-card: use asoc_simple_card_parse_daifmt()Kuninori Morimoto
We can use simpel utils asoc_simple_card_parse_daifmt(). Let's use it Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-07-01etherdevice.h & bridge: netfilter: Add and use ether_addr_equal_maskedJoe Perches
There are code duplications of a masked ethernet address comparison here so make it a separate function instead. Miscellanea: o Neaten alignment of FWINV macro uses to make it clearer for the reader Signed-off-by: Joe Perches <joe@perches.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01drm/i915/bxt: Export pooled eu info to userspacearun.siluvery@linux.intel.com
Pooled EU is a bxt only feature and kernel changes are already merged. This feature is not yet exposed to userspace as the support was not yet available. Beignet team expressed interest and added patches to use this. Since we now have a user and patches to use them, expose them from the kernel side as well. v2: fix compile error [1] https://lists.freedesktop.org/archives/beignet/2016-June/007698.html [2] https://lists.freedesktop.org/archives/beignet/2016-June/007699.html Cc: Winiarski, Michal <michal.winiarski@intel.com> Cc: Zou, Nanhai <nanhai.zou@intel.com> Cc: Yang, Rong R <rong.r.yang@intel.com> Cc: Tim Gore <tim.gore@intel.com> Cc: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467369782-25992-1-git-send-email-arun.siluvery@linux.intel.com Acked-by: Jani Nikula <jani.nikula@intel.com>
2016-07-01net/mlx5: Add timeout handle to commands with callbackMohamad Haj Yahia
The current implementation does not handle timeout in case of command with callback request, and this can lead to deadlock if the command doesn't get fw response. Add delayed callback timeout work before posting the command to fw. In case of real fw command completion we will cancel the delayed work. In case of fw command timeout the callback timeout handler will be called and it will simulate fw completion with timeout error. Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01bonding: prevent out of bound accessesEric Dumazet
ether_addr_equal_64bits() requires some care about its arguments, namely that 8 bytes might be read, even if last 2 byte values are not used. KASan detected a violation with null_mac_addr and lacpdu_mcast_addr in bond_3ad.c Same problem with mac_bcast[] and mac_v6_allmcast[] in bond_alb.c : Although the 8-byte alignment was there, KASan would detect out of bound accesses. Fixes: 815117adaf5b ("bonding: use ether_addr_equal_unaligned for bond addr compare") Fixes: bb54e58929f3 ("bonding: Verify RX LACPDU has proper dest mac-addr") Fixes: 885a136c52a8 ("bonding: use compare_ether_addr_64bits() in ALB") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01tun: switch to use skb array for txJason Wang
We used to queue tx packets in sk_receive_queue, this is less efficient since it requires spinlocks to synchronize between producer and consumer. This patch tries to address this by: - switch from sk_receive_queue to a skb_array, and resize it when tx_queue_len was changed. - introduce a new proto_ops peek_len which was used for peeking the skb length. - implement a tun version of peek_len for vhost_net to use and convert vhost_net to use peek_len if possible. Pktgen test shows about 15.3% improvement on guest receiving pps for small buffers: Before: ~1300000pps After : ~1500000pps Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01net: introduce NETDEV_CHANGE_TX_QUEUE_LENJason Wang
This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this will be triggered when tx_queue_len. It could be used by net device who want to do some processing at that time. An example is tun who may want to resize tx array when tx_queue_len is changed. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01skb_array: add wrappers for resizingJason Wang
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01ptr_ring: support resizing multiple queuesMichael S. Tsirkin
Sometimes, we need support resizing multiple queues at once. This is because it was not easy to recover to recover from a partial failure of multiple queues resizing. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01skb_array: minor tweakJason Wang
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01ptr_ring: support zero length ringJason Wang
Sometimes, we need zero length ring. But current code will crash since we don't do any check before accessing the ring. This patch fixes this. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01KVM: remove kvm_guest_enter/exit wrappersPaolo Bonzini
Use the functions from context_tracking.h directly. Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-01tcp: md5: use kmalloc() backed scratch areasEric Dumazet
Some arches have virtually mapped kernel stacks, or will soon have. tcp_md5_hash_header() uses an automatic variable to copy tcp header before mangling th->check and calling crypto function, which might be problematic on such arches. David says that using percpu storage is also problematic on non SMP builds. Just use kmalloc() to allocate scratch areas. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: David S. Miller <davem@davemloft.net>