summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2017-11-07x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pagesTom Lendacky
In order for memory pages to be properly mapped when SEV is active, it's necessary to use the PAGE_KERNEL protection attribute as the base protection. This ensures that memory mapping of, e.g. ACPI tables, receives the proper mapping attributes. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Laura Abbott <labbott@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: kvm@vger.kernel.org Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: https://lkml.kernel.org/r/20171020143059.3291-11-brijesh.singh@amd.com
2017-11-07resource: Provide resource struct in resource walk callbackTom Lendacky
In preperation for a new function that will need additional resource information during the resource walk, update the resource walk callback to pass the resource structure. Since the current callback start and end arguments are pulled from the resource structure, the callback functions can obtain them from the resource structure directly. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lkml.kernel.org/r/20171020143059.3291-10-brijesh.singh@amd.com
2017-11-07x86/mm: Add Secure Encrypted Virtualization (SEV) supportTom Lendacky
Provide support for Secure Encrypted Virtualization (SEV). This initial support defines a flag that is used by the kernel to determine if it is running with SEV active. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: kvm@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20171020143059.3291-3-brijesh.singh@amd.com
2017-11-07locking/rwlocks: Fix commentsCheng Jian
- fix the list of locking API headers in kernel/locking/spinlock.c - fix an #endif comment Signed-off-by: Cheng Jian <cj.chengjian@huawei.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: huawei.libin@huawei.com Cc: xiexiuqi@huawei.com Link: http://lkml.kernel.org/r/1509706788-152547-1-git-send-email-cj.chengjian@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07Merge branch 'linus' into x86/asm, to pick up fixes and resolve conflictsIngo Molnar
Conflicts: arch/x86/kernel/cpu/Makefile Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07Merge branch 'linus' into locking/core, to resolve conflictsIngo Molnar
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07Merge branch 'linus' into perf/core, to fix conflictsIngo Molnar
Conflicts: tools/perf/arch/arm/annotate/instructions.c tools/perf/arch/arm64/annotate/instructions.c tools/perf/arch/powerpc/annotate/instructions.c tools/perf/arch/s390/annotate/instructions.c tools/perf/arch/x86/tests/intel-cqm.c tools/perf/ui/tui/progress.c tools/perf/util/zlib.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-06PCI: Add for_each_pci_bridge() helperAndy Shevchenko
The following pattern is often used: list_for_each_entry(dev, &bus->devices, bus_list) { if (pci_is_bridge(dev)) { ... } } Add a for_each_pci_bridge() helper to make that code easier to write and read by reducing indentation level. It also saves one or few lines of code in each occurrence. Convert PCI core parts here at the same time. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> [bhelgaas: fold in http://lkml.kernel.org/r/20171013165352.25550-1-andriy.shevchenko@linux.intel.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-11-06ide: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-ide@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-06ACPI / PM: Take SMART_SUSPEND driver flag into accountRafael J. Wysocki
Make the ACPI PM domain take DPM_FLAG_SMART_SUSPEND into account in its system suspend callbacks. [Note that the pm_runtime_suspended() check in acpi_dev_needs_resume() is an optimization, because if is not passed, all of the subsequent checks may be skipped and some of them are much more overhead in general.] Also use the observation that if the device is in runtime suspend at the beginning of the "late" phase of a system-wide suspend-like transition, its state cannot change going forward (runtime PM is disabled for it at that time) until the transition is over and the subsequent system-wide PM callbacks should be skipped for it (as they generally assume the device to not be suspended), so add checks for that in acpi_subsys_suspend_late/noirq() and acpi_subsys_freeze_late/noirq(). Moreover, if acpi_subsys_resume_noirq() is called during the subsequent system-wide resume transition and if the device was left in runtime suspend previously, its runtime PM status needs to be changed to "active" as it is going to be put into the full-power state going forward, so add a check for that too in there. In turn, if acpi_subsys_thaw_noirq() runs after the device has been left in runtime suspend, the subsequent "thaw" callbacks need to be skipped for it (as they may not work correctly with a suspended device), so set the power.direct_complete flag for the device then to make the PM core skip those callbacks. On top of the above, make the analogous changes in the acpi_lpss driver that uses the ACPI PM domain callbacks. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-06PCI / PM: Take SMART_SUSPEND driver flag into accountRafael J. Wysocki
Make the PCI bus type take DPM_FLAG_SMART_SUSPEND into account in its system-wide PM callbacks and make sure that all code that should not run in parallel with pci_pm_runtime_resume() is executed in the "late" phases of system suspend, freeze and poweroff transitions. [Note that the pm_runtime_suspended() check in pci_dev_keep_suspended() is an optimization, because if is not passed, all of the subsequent checks may be skipped and some of them are much more overhead in general.] Also use the observation that if the device is in runtime suspend at the beginning of the "late" phase of a system-wide suspend-like transition, its state cannot change going forward (runtime PM is disabled for it at that time) until the transition is over and the subsequent system-wide PM callbacks should be skipped for it (as they generally assume the device to not be suspended), so add checks for that in pci_pm_suspend_late/noirq(), pci_pm_freeze_late/noirq() and pci_pm_poweroff_late/noirq(). Moreover, if pci_pm_resume_noirq() or pci_pm_restore_noirq() is called during the subsequent system-wide resume transition and if the device was left in runtime suspend previously, its runtime PM status needs to be changed to "active" as it is going to be put into the full-power state, so add checks for that too to these functions. In turn, if pci_pm_thaw_noirq() runs after the device has been left in runtime suspend, the subsequent "thaw" callbacks need to be skipped for it (as they may not work correctly with a suspended device), so set the power.direct_complete flag for the device then to make the PM core skip those callbacks. In addition to the above add a core helper for checking if DPM_FLAG_SMART_SUSPEND is set and the device runtime PM status is "suspended" at the same time, which is done quite often in the new code (and will be done elsewhere going forward too). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-11-06PM / core: Add SMART_SUSPEND driver flagRafael J. Wysocki
Define and document a SMART_SUSPEND flag to instruct bus types and PM domains that the system suspend callbacks provided by the driver can cope with runtime-suspended devices, so from the driver's perspective it should be safe to leave devices in runtime suspend during system suspend. Setting that flag may also cause middle-layer code (bus types, PM domains etc.) to skip invocations of the ->suspend_late and ->suspend_noirq callbacks provided by the driver if the device is in runtime suspend at the beginning of the "late" phase of the system-wide suspend transition, in which case the driver's system-wide resume callbacks may be invoked back-to-back with its ->runtime_suspend callback, so the driver has to be able to cope with that too. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-06PCI / PM: Use the NEVER_SKIP driver flagRafael J. Wysocki
Replace the PCI-specific flag PCI_DEV_FLAGS_NEEDS_RESUME with the PM core's DPM_FLAG_NEVER_SKIP one everywhere and drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-06PM / core: Add NEVER_SKIP and SMART_PREPARE driver flagsRafael J. Wysocki
The motivation for this change is to provide a way to work around a problem with the direct-complete mechanism used for avoiding system suspend/resume handling for devices in runtime suspend. The problem is that some middle layer code (the PCI bus type and the ACPI PM domain in particular) returns positive values from its system suspend ->prepare callbacks regardless of whether the driver's ->prepare returns a positive value or 0, which effectively prevents drivers from being able to control the direct-complete feature. Some drivers need that control, however, and the PCI bus type has grown its own flag to deal with this issue, but since it is not limited to PCI, it is better to address it by adding driver flags at the core level. To that end, add a driver_flags field to struct dev_pm_info for flags that can be set by device drivers at the probe time to inform the PM core and/or bus types, PM domains and so on on the capabilities and/or preferences of device drivers. Also add two static inline helpers for setting that field and testing it against a given set of flags and make the driver core clear it automatically on driver remove and probe failures. Define and document two PM driver flags related to the direct- complete feature: NEVER_SKIP and SMART_PREPARE that can be used, respectively, to indicate to the PM core that the direct-complete mechanism should never be used for the device and to inform the middle layer code (bus types, PM domains etc) that it can only request the PM core to use the direct-complete mechanism for the device (by returning a positive value from its ->prepare callback) if it also has been requested by the driver. While at it, make the core check pm_runtime_suspended() when setting power.direct_complete so that it doesn't need to be checked by ->prepare callbacks. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-06Merge branch 'acpi-pm' into pm-coreRafael J. Wysocki
2017-11-06Merge remote-tracking branches 'regmap/topic/const' and ↵Mark Brown
'regmap/topic/hwspinlock' into regmap-next
2017-11-06Merge remote-tracking branch 'regmap/topic/core' into regmap-nextMark Brown
2017-11-06Merge branch 'x86/mm' into x86/asm, to pick up pending changesIngo Molnar
Concentrate x86 MM and asm related changes into a single super-topic, in preparation for larger changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-06Merge branch 'x86/fpu' into x86/asm, to pick up fixIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-05Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Ingo Molnar: - workaround for gcc asm handling - futex race fixes - objtool build warning fix - two watchdog fixes: a crash fix (revert) and a bug fix for /proc/sys/kernel/watchdog_thresh handling. * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Prevent GCC from merging annotate_unreachable(), take 2 objtool: Resync objtool's instruction decoder source code copy with the kernel's latest version watchdog/hardlockup/perf: Use atomics to track in-use cpu counter watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy") futex: Fix more put_pi_state() vs. exit_pi_state_list() races
2017-11-05bpf, cgroup: implement eBPF-based device controller for cgroup v2Roman Gushchin
Cgroup v2 lacks the device controller, provided by cgroup v1. This patch adds a new eBPF program type, which in combination of previously added ability to attach multiple eBPF programs to a cgroup, will provide a similar functionality, but with some additional flexibility. This patch introduces a BPF_PROG_TYPE_CGROUP_DEVICE program type. A program takes major and minor device numbers, device type (block/character) and access type (mknod/read/write) as parameters and returns an integer which defines if the operation should be allowed or terminated with -EPERM. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05device_cgroup: prepare code for bpf-based device controllerRoman Gushchin
This is non-functional change to prepare the device cgroup code for adding eBPF-based controller for cgroups v2. The patch performs the following changes: 1) __devcgroup_inode_permission() and devcgroup_inode_mknod() are moving to the device-cgroup.h and converting into static inline. 2) __devcgroup_check_permission() is exported. 3) devcgroup_check_permission() wrapper is introduced to be used by both existing and new bpf-based implementations. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05Merge tag 'mlx5-updates-2017-11-04' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-11-04 This series includes: From Huy: dscp to priority mapping for Ethernet packet. =================================================== First six patches enable differentiated services code point (dscp) to priority mapping for Ethernet packet. Once this feature is enabled, the packet is routed to the corresponding priority based on its dscp. User can combine this feature with priority flow control (pfc) feature to have priority flow control based on the dscp. Firmware interface: Mellanox firmware provides two control knobs for this feature: QPTS register allow changing the trust state between dscp and pcp mode. The default is pcp mode. Once in dscp mode, firmware will route the packet based on its dscp value if the dscp field exists. QPDPM register allow mapping a specific dscp (0 to 63) to a specific priority (0 to 7). By default, all the dscps are mapped to priority zero. Software interface: This feature is controlled via application priority TLV. IEEE specification P802.1Qcd/D2.1 defines priority selector id 5 for application priority TLV. This APP TLV selector defines DSCP to priority map. This APP TLV can be sent by the switch or can be set locally using software such as lldptool. In mlx5 drivers, we add the support for net dcb's getapp and setapp call back. Mlx5 driver only handles the selector id 5 application entry (dscp application priority application entry). If user sends multiple dscp to priority APP TLV entries on the same dscp, the last sent one will take effect. All the previous sent will be deleted. The firmware trust state (in QPTS register) is changed based on the number of dscp to priority application entries. When the first dscp to priority application entry is added by the user, the trust state is changed to dscp. When the last dscp to priority application entry is deleted by the user, the trust state is changed to pcp. When the port is in DSCP trust state, the transmit queue is selected based on the dscp of the skb. When the port is in DSCP trust state and vport inline mode is not NONE, firmware requires mlx5 driver to copy the IP header to the wqe ethernet segment inline header if the skb has it. This is done by changing the transmit queue sq's min inline mode to L3. Note that the min inline mode of sqs that belong to other features such as xdpsq, icosq are not modified. =================================================== Plus to the dscp series, some small misc changes are include as well: From Inbar, Ethtool msglvl support and some debug prints in DCBNL logic From Or Gerlitz, Enlarge the NIC TC offload table size From Rabie, Initialize destination_flow struct to 0 From Feras, Add inner TTC table to IPoIB flow steering From Tal, Enable CQE based moderation on TX CQ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05tcp: higher throughput under reordering with adaptive RACK reordering wndPriyaranjan Jha
Currently TCP RACK loss detection does not work well if packets are being reordered beyond its static reordering window (min_rtt/4).Under such reordering it may falsely trigger loss recoveries and reduce TCP throughput significantly. This patch improves that by increasing and reducing the reordering window based on DSACK, which is now supported in major TCP implementations. It makes RACK's reo_wnd adaptive based on DSACK and no. of recoveries. - If DSACK is received, increment reo_wnd by min_rtt/4 (upper bounded by srtt), since there is possibility that spurious retransmission was due to reordering delay longer than reo_wnd. - Persist the current reo_wnd value for TCP_RACK_RECOVERY_THRESH (16) no. of successful recoveries (accounts for full DSACK-based loss recovery undo). After that, reset it to default (min_rtt/4). - At max, reo_wnd is incremented only once per rtt. So that the new DSACK on which we are reacting, is due to the spurious retx (approx) after the reo_wnd has been updated last time. - reo_wnd is tracked in terms of steps (of min_rtt/4), rather than absolute value to account for change in rtt. In our internal testing, we observed significant increase in throughput, in scenarios where reordering exceeds min_rtt/4 (previous static value). Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05bpf: remove old offload/analyzerJakub Kicinski
Thanks to the ability to load a program for a specific device, running verifier twice is no longer needed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05xdp: allow attaching programs loaded for specific deviceJakub Kicinski
Pass the netdev pointer to bpf_prog_get_type(). This way BPF code can decide whether the device matches what the code was loaded/translated for. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05bpf: report offload info to user spaceJakub Kicinski
Extend struct bpf_prog_info to contain information about program being bound to a device. Since the netdev may get destroyed while program still exists we need a flag to indicate the program is loaded for a device, even if the device is gone. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05bpf: offload: add infrastructure for loading programs for a specific netdevJakub Kicinski
The fact that we don't know which device the program is going to be used on is quite limiting in current eBPF infrastructure. We have to reverse or limit the changes which kernel makes to the loaded bytecode if we want it to be offloaded to a networking device. We also have to invent new APIs for debugging and troubleshooting support. Make it possible to load programs for a specific netdev. This helps us to bring the debug information closer to the core eBPF infrastructure (e.g. we will be able to reuse the verifer log in device JIT). It allows device JITs to perform translation on the original bytecode. __bpf_prog_get() when called to get a reference for an attachment point will now refuse to give it if program has a device assigned. Following patches will add a version of that function which passes the expected netdev in. @type argument in __bpf_prog_get() is renamed to attach_type to make it clearer that it's only set on attachment. All calls to ndo_bpf are protected by rtnl, only verifier callbacks are not. We need a wait queue to make sure netdev doesn't get destroyed while verifier is still running and calling its driver. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05net: bpf: rename ndo_xdp to ndo_bpfJakub Kicinski
ndo_xdp is a control path callback for setting up XDP in the driver. We can reuse it for other forms of communication between the eBPF stack and the drivers. Rename the callback and associated structures and definitions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05i2c: nuc900: remove platform_data, tooWolfram Sang
Commit 7da62cb1853025 ("i2c: nuc900: remove driver") removed the driver, we should remove the platform_data as well. Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-11-04net/mlx5: QPTS and QPDPM register firmware command supportHuy Nguyen
The QPTS register allows changing the priority trust state between pcp and dscp. Add support to get/set trust state from device. When the port is in pcp/dscp trust state, packet is routed by hardware to matching priority based on its pcp/dscp value respectively. The QPDPM register allow channing the dscp to priority mapping. Add support to get/set dscp to priority mapping from device. Note that to change a dscp mapping, the "e" bit of this dscp structure must be set in the QPDPM firmware command. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-04net/mlx5: Add MLX5_SET16 and MLX5_GET16Huy Nguyen
Add MLX5_SET16 and MLX5_GET16 for 16bit structure field in firmware command. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-04net/mlx5: QCAM register firmware command supportHuy Nguyen
The QCAM register provides capability bit for all the QoS registers using ACCESS_REG command. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-04blk-mq: don't handle failure in .get_budgetMing Lei
It is enough to just check if we can get the budget via .get_budget(). And we don't need to deal with device state change in .get_budget(). For SCSI, one issue to be fixed is that we have to call scsi_mq_uninit_cmd() to free allocated ressources if SCSI device fails to handle the request. And it isn't enough to simply call blk_mq_end_request() to do that if this request is marked as RQF_DONTPREP. Fixes: 0df21c86bdbf(scsi: implement .get_budget and .put_budget for blk-mq) Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-04objtool: Prevent GCC from merging annotate_unreachable(), take 2Josh Poimboeuf
This fixes the following warning with GCC 4.6: mm/migrate.o: warning: objtool: migrate_misplaced_transhuge_page()+0x71: unreachable instruction The problem is that the compiler merged identical annotate_unreachable() inline asm blocks, resulting in a missing 'unreachable' annotation. This problem happened before, and was partially fixed with: 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()") That commit tried to ensure that each instance of the annotate_unreachable() inline asm statement has a unique label. It used the __LINE__ macro to generate the label number. However, even the line number isn't necessarily unique when used in an inline function with multiple callers (in this case, __alloc_pages_node()'s use of VM_BUG_ON). Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kbuild-all@01.org Cc: tipbuild@zytor.com Fixes: 3d1e236022cc ("objtool: Prevent GCC from merging annotate_unreachable()") Link: http://lkml.kernel.org/r/20171103221941.cajpwszir7ujxyc4@treble Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-04netfilter/ipvs: clear ipvs_property flag when SKB net namespace changedYe Yin
When run ipvs in two different network namespace at the same host, and one ipvs transport network traffic to the other network namespace ipvs. 'ipvs_property' flag will make the second ipvs take no effect. So we should clear 'ipvs_property' when SKB network namespace changed. Fixes: 621e84d6f373 ("dev: introduce skb_scrub_packet()") Signed-off-by: Ye Yin <hustcat@gmail.com> Signed-off-by: Wei Zhou <chouryzhou@gmail.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-04tty: cyclades: Convert timers to use timer_setup()Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Moves timer structures from global to attached to struct cyclades_port. Cc: Jiri Slaby <jslaby@suse.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-04USB: add SPDX identifiers to all remaining files in drivers/usb/Greg Kroah-Hartman
It's good to have SPDX identifiers in all files to make it easier to audit the kernel tree for correct licenses. Update the drivers/usb/ and include/linux/usb* files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This work is based on a script and data from Thomas Gleixner, Philippe Ombredanne, and Kate Stewart. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Acked-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Files removed in 'net-next' had their license header updated in 'net'. We take the remove from 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03iommu/vt-d: Clear Page Request Overflow fault bitLu Baolu
Currently Page Request Overflow bit in IOMMU Fault Status register is not cleared. Not clearing this bit would mean that any future page-request is going to be automatically dropped by IOMMU. Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-11-03block: add a poll_fn callback to struct request_queueChristoph Hellwig
That we we can also poll non blk-mq queues. Mostly needed for the NVMe multipath code, but could also be useful elsewhere. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-03block: introduce GENHD_FL_HIDDENChristoph Hellwig
With this flag a driver can create a gendisk that can be used for I/O submission inside the kernel, but which is not registered as user facing block device. This will be useful for the NVMe multipath implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-03block: don't look at the struct device dev_t in disk_devtChristoph Hellwig
The hidden gendisks introduced in the next patch need to keep the dev field in their struct device empty so that udev won't try to create block device nodes for them. To support that rewrite disk_devt to look at the major and first_minor fields in the gendisk itself instead of looking into the struct device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-03block: add a blk_steal_bios helperChristoph Hellwig
This helpers allows to bounce steal the uncompleted bios from a request so that they can be reissued on another path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-03block: provide a direct_make_request helperChristoph Hellwig
This helper allows reinserting a bio into a new queue without much overhead, but requires all queue limits to be the same for the upper and lower queues, and it does not provide any recursion preventions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Javier González <javier@cnexlabs.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-03block: add REQ_DRV bitChristoph Hellwig
Set aside a bit in the request/bio flags for driver use. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-03block: move REQ_NOWAITChristoph Hellwig
This flag should be before the operation-specific REQ_NOUNMAP bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-03Merge branch 'nvme-4.15' of git://git.infradead.org/nvme into for-4.15/blockJens Axboe
Pull NVMe changes from Christoph: "Below are the currently queue nvme updates for Linux 4.15. There are a few more things that could make it for this merge window, but I'd like to get things into linux-next, especially for the unlikely case that Linus decided to cut -rc8. Highlights: - support for SGLs in the PCIe driver (Chaitanya Kulkarni) - disable I/O schedulers for the admin queue (Israel Rukshin) - various Fibre Channel fixes and enhancements (James Smart) - various refactoring for better code sharing between transports (Sagi Grimberg and me) as well as lots of little bits from various contributors."
2017-11-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Hopefully this is the last batch of networking fixes for 4.14 Fingers crossed... 1) Fix stmmac to use the proper sized OF property read, from Bhadram Varka. 2) Fix use after free in net scheduler tc action code, from Cong Wang. 3) Fix SKB control block mangling in tcp_make_synack(). 4) Use proper locking in fib_dump_info(), from Florian Westphal. 5) Fix IPG encodings in systemport driver, from Florian Fainelli. 6) Fix division by zero in NV TCP congestion control module, from Konstantin Khlebnikov. 7) Fix use after free in nf_reject_ipv4, from Tejaswi Tanikella" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: systemport: Correct IPG length settings tcp: do not mangle skb->cb[] in tcp_make_synack() fib: fib_dump_info can no longer use __in_dev_get_rtnl stmmac: use of_property_read_u32 instead of read_u8 net_sched: hold netns refcnt for each action net_sched: acquire RTNL in tc_action_net_exit() net: vrf: correct FRA_L3MDEV encode type tcp_nv: fix division by zero in tcpnv_acked() netfilter: nf_reject_ipv4: Fix use-after-free in send_reset netfilter: nft_set_hash: disable fast_ops for 2-len keys
2017-11-03regset: Add support for dynamically sized regsetsDave Martin
Currently the regset API doesn't allow for the possibility that regsets (or at least, the amount of meaningful data in a regset) may change in size. In particular, this results in useless padding being added to coredumps if a regset's current size is smaller than its theoretical maximum size. This patch adds a get_size() function to struct user_regset. Individual regset implementations can implement this function to return the current size of the regset data. A regset_size() function is added to provide callers with an abstract interface for determining the size of a regset without needing to know whether the regset is dynamically sized or not. The only affected user of this interface is the ELF coredump code: This patch ports ELF coredump to dump regsets with their actual size in the coredump. This has no effect except for new regsets that are dynamically sized and provide a get_size() implementation. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: H. J. Lu <hjl.tools@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>