summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-08-04Merge tag 'docs-5.9' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation updates from Jonathan Corbet: "It's been a busy cycle for documentation - hopefully the busiest for a while to come. Changes include: - Some new Chinese translations - Progress on the battle against double words words and non-HTTPS URLs - Some block-mq documentation - More RST conversions from Mauro. At this point, that task is essentially complete, so we shouldn't see this kind of churn again for a while. Unless we decide to switch to asciidoc or something...:) - Lots of typo fixes, warning fixes, and more" * tag 'docs-5.9' of git://git.lwn.net/linux: (195 commits) scripts/kernel-doc: optionally treat warnings as errors docs: ia64: correct typo mailmap: add entry for <alobakin@marvell.com> doc/zh_CN: add cpu-load Chinese version Documentation/admin-guide: tainted-kernels: fix spelling mistake MAINTAINERS: adjust kprobes.rst entry to new location devices.txt: document rfkill allocation PCI: correct flag name docs: filesystems: vfs: correct flag name docs: filesystems: vfs: correct sync_mode flag names docs: path-lookup: markup fixes for emphasis docs: path-lookup: more markup fixes docs: path-lookup: fix HTML entity mojibake CREDITS: Replace HTTP links with HTTPS ones docs: process: Add an example for creating a fixes tag doc/zh_CN: add Chinese translation prefer section doc/zh_CN: add clearing-warn-once Chinese version doc/zh_CN: add admin-guide index doc:it_IT: process: coding-style.rst: Correct __maybe_unused compiler label futex: MAINTAINERS: Re-add selftests directory ...
2020-08-04Merge tag 'printk-for-5.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Herbert Xu made printk header file self-contained. - Andy Shevchenko and Sergey Senozhatsky cleaned up console->setup() error handling. - Andy Shevchenko did some cleanups (e.g. sparse warning) in vsprintf code. - Minor documentation updates. * tag 'printk-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: lib/vsprintf: Force type of flags value for gfp_t lib/vsprintf: Replace custom spec to print decimals with generic one lib/vsprintf: Replace hidden BUILD_BUG_ON() with static_assert() printk: Make linux/printk.h self-contained doc:kmsg: explicitly state the return value in case of SEEK_CUR Replace HTTP links with HTTPS ones: vsprintf hvc: unify console setup naming console: Fix trivia typo 'change' -> 'chance' console: Propagate error code from console ->setup() tty: hvc: Return proper error code from console ->setup() hook serial: sunzilog: Return proper error code from console ->setup() hook serial: sunsab: Return proper error code from console ->setup() hook mips: Return proper error code from console ->setup() hook
2020-08-04Merge branch 'parisc-5.9-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: "The majority of the patches are reverts of previous commits regarding the parisc-specific low level spinlocking code and barrier handling, with which we tried to fix CPU stalls on our build servers. In the end John David Anglin found the culprit: We missed a define for atomic64_set_release(). This seems to have fixed our issues, so now it's good to remove the unnecessary code again. Other than that it's trivial stuff: Spelling fixes, constifications and such" * 'parisc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: make the log level string for register dumps const parisc: Do not use an ordered store in pa_tlb_lock() Revert "parisc: Revert "Release spinlocks using ordered store"" Revert "parisc: Use ldcw instruction for SMP spinlock release barrier" Revert "parisc: Drop LDCW barrier in CAS code when running UP" Revert "parisc: Improve interrupt handling in arch_spin_lock_flags()" parisc: Replace HTTP links with HTTPS ones parisc: elf.h: delete a duplicated word parisc: Report bad pages as HardwareCorrupted parisc: Convert to BIT_MASK() and BIT_WORD()
2020-08-04Merge tag 'x86-fsgsbase-2020-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fsgsbase from Thomas Gleixner: "Support for FSGSBASE. Almost 5 years after the first RFC to support it, this has been brought into a shape which is maintainable and actually works. This final version was done by Sasha Levin who took it up after Intel dropped the ball. Sasha discovered that the SGX (sic!) offerings out there ship rogue kernel modules enabling FSGSBASE behind the kernels back which opens an instantanious unpriviledged root hole. The FSGSBASE instructions provide a considerable speedup of the context switch path and enable user space to write GSBASE without kernel interaction. This enablement requires careful handling of the exception entries which go through the paranoid entry path as they can no longer rely on the assumption that user GSBASE is positive (as enforced via prctl() on non FSGSBASE enabled systemn). All other entries (syscalls, interrupts and exceptions) can still just utilize SWAPGS unconditionally when the entry comes from user space. Converting these entries to use FSGSBASE has no benefit as SWAPGS is only marginally slower than WRGSBASE and locating and retrieving the kernel GSBASE value is not a free operation either. The real benefit of RD/WRGSBASE is the avoidance of the MSR reads and writes. The changes come with appropriate selftests and have held up in field testing against the (sanitized) Graphene-SGX driver" * tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/fsgsbase: Fix Xen PV support x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase selftests/x86/fsgsbase: Add a missing memory constraint selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase test selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASE selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Documentation/x86/64: Add documentation for GS/FS addressing mode x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit x86/entry/64: Introduce the FIND_PERCPU_BASE macro x86/entry/64: Switch CR3 before SWAPGS in paranoid entry x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation x86/process/64: Use FSGSBASE instructions on thread copy and ptrace x86/process/64: Use FSBSBASE in switch_to() if available x86/process/64: Make save_fsgs_for_kvm() ready for FSGSBASE x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE ...
2020-08-04Merge tag 'x86-entry-2020-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 conversion to generic entry code from Thomas Gleixner: "The conversion of X86 syscall, interrupt and exception entry/exit handling to the generic code. Pretty much a straight-forward 1:1 conversion plus the consolidation of the KVM handling of pending work before entering guest mode" * tag 'x86-entry-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kvm: Use __xfer_to_guest_mode_work_pending() in kvm_run_vcpu() x86/kvm: Use generic xfer to guest work function x86/entry: Cleanup idtentry_enter/exit x86/entry: Use generic interrupt entry/exit code x86/entry: Cleanup idtentry_entry/exit_user x86/entry: Use generic syscall exit functionality x86/entry: Use generic syscall entry function x86/ptrace: Provide pt_regs helper for entry/exit x86/entry: Move user return notifier out of loop x86/entry: Consolidate 32/64 bit syscall entry x86/entry: Consolidate check_user_regs() x86: Correct noinstr qualifiers x86/idtentry: Remove stale comment
2020-08-04Merge tag 'core-entry-2020-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull generic kernel entry/exit code from Thomas Gleixner: "Generic implementation of common syscall, interrupt and exception entry/exit functionality based on the recent X86 effort to ensure correctness of entry/exit vs RCU and instrumentation. As this functionality and the required entry/exit sequences are not architecture specific, sharing them allows other architectures to benefit instead of copying the same code over and over again. This branch was kept standalone to allow others to work on it. The conversion of x86 comes in a seperate pull request which obviously is based on this branch" * tag 'core-entry-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Correct __secure_computing() stub entry: Correct 'noinstr' attributes entry: Provide infrastructure for work before transitioning to guest mode entry: Provide generic interrupt entry/exit code entry: Provide generic syscall exit function entry: Provide generic syscall entry functionality seccomp: Provide stub for __secure_computing()
2020-08-04dt-bindings: hwlock: qcom: Remove invalid bindingBjorn Andersson
The Qualcomm hwlock is described in DeviceTree either directly on the mmio bus or split between a syscon and a mutex node, but as noted in [1] the latter is not valid DT, so remove any traces of this from the binding. [1] https://lore.kernel.org/r/CAL_JsqLa9GBtbgN6aL7AQ=A6V-YRtPgYqh6XgM2kpx532+r4Gg@mail.gmail.com/ Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20200729004757.1901107-1-bjorn.andersson@linaro.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2020-08-04SUNRPC dont update timeout value on connection resetOlga Kornievskaia
Current behaviour: every time a v3 operation is re-sent to the server we update (double) the timeout. There is no distinction between whether or not the previous timer had expired before the re-sent happened. Here's the scenario: 1. Client sends a v3 operation 2. Server RST-s the connection (prior to the timeout) (eg., connection is immediately reset) 3. Client re-sends a v3 operation but the timeout is now 120sec. As a result, an application sees 2mins pause before a retry in case server again does not reply. Instead, this patch proposes to keep track off when the minor timeout should happen and if it didn't, then don't update the new timeout. Value is updated based on the previous value to make timeouts predictable. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-08-04remoteproc: core: Register the character device interfaceSiddharth Gupta
Add the character device during rproc_add. This would create a character device node at /dev/remoteproc<index>. Userspace applications can interact with the remote processor using this interface. Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org> Signed-off-by: Siddharth Gupta <sidgup@codeaurora.org> Link: https://lore.kernel.org/r/1596044401-22083-3-git-send-email-sidgup@codeaurora.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2020-08-04remoteproc: Add remoteproc character device interfaceSiddharth Gupta
Add the character device interface into remoteproc framework. This interface can be used in order to boot/shutdown remote subsystems and provides a basic ioctl based interface to implement supplementary functionality. An ioctl call is implemented to enable the shutdown on release feature which will allow remote processors to be shutdown when the controlling userspace application crashes or hangs. Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org> Signed-off-by: Siddharth Gupta <sidgup@codeaurora.org> Link: https://lore.kernel.org/r/1596044401-22083-2-git-send-email-sidgup@codeaurora.org [bjorn: s/int32_t/s32/ per checkpatch] Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2020-08-04nfs: nfs_file_write() should check for writeback errorsScott Mayhew
The NFS_CONTEXT_ERROR_WRITE flag (as well as the check of said flag) was removed by commit 6fbda89b257f. The absence of an error check allows writes to be continually queued up for a server that may no longer be able to handle them. Fix it by adding an error check using the generic error reporting functions. Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one") Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-08-04Merge tag 'timers-core-2020-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Time, timers and related driver updates: - Prevent unnecessary timer softirq invocations by extending the tracking of the next expiring timer in the timer wheel beyond the existing NOHZ functionality. The tracking overhead at enqueue time is within the noise, but on sensitive workloads the avoidance of the soft interrupt invocation is a measurable improvement. - The obligatory new clocksource driver for Ingenic X100 OST - The usual fixes, improvements, cleanups and extensions for newer chip variants all over the driver space" * tag 'timers-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) timers: Recalculate next timer interrupt only when necessary clocksource/drivers/ingenic: Add support for the Ingenic X1000 OST. dt-bindings: timer: Add Ingenic X1000 OST bindings. clocksource/drivers: Replace HTTP links with HTTPS ones clocksource/drivers/nomadik-mtu: Handle 32kHz clock clocksource/drivers/sh_cmt: Use "kHz" for kilohertz clocksource/drivers/imx: Add support for i.MX TPM driver with ARM64 clocksource/drivers/ingenic: Add high resolution timer support for SMP/SMT. timers: Lower base clock forwarding threshold timers: Remove must_forward_clk timers: Spare timer softirq until next expiry timers: Expand clk forward logic beyond nohz timers: Reuse next expiry cache after nohz exit timers: Always keep track of next expiry timers: Optimize _next_timer_interrupt() level iteration timers: Add comments about calc_index() ceiling work timers: Move trigger_dyntick_cpu() to enqueue_timer() timers: Use only bucket expiry for base->next_expiry value timers: Preserve higher bits of expiration on index calculation clocksource/drivers/timer-atmel-tcb: Add sama5d2 support ...
2020-08-04Merge tag 'irq-core-2020-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The usual boring updates from the interrupt subsystem: - Infrastructure to allow building irqchip drivers as modules - Consolidation of irqchip ACPI probing - Removal of the EOI-preflow interrupt handler which was required for SPARC support and became obsolete after SPARC was converted to use sparse interrupts. - Cleanups, fixes and improvements all over the place" * tag 'irq-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) irqchip/loongson-pch-pic: Fix the misused irq flow handler irqchip/loongson-htvec: Support 8 groups of HT vectors irqchip/loongson-liointc: Fix misuse of gc->mask_cache dt-bindings: interrupt-controller: Update Loongson HTVEC description irqchip/imx-intmux: Fix irqdata regs save in imx_intmux_runtime_suspend() irqchip/imx-intmux: Implement intmux runtime power management irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table() irqchip: Fix IRQCHIP_PLATFORM_DRIVER_* compilation by including module.h irqchip/stm32-exti: Map direct event to irq parent irqchip/mtk-cirq: Convert to a platform driver irqchip/mtk-sysirq: Convert to a platform driver irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros irqchip: irq-bcm2836.h: drop a duplicated word irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLR irqchip/irq-bcm7038-l1: Guard uses of cpu_logical_map irqchip/gic-v3: Remove unused register definition irqchip/qcom-pdc: Allow QCOM_PDC to be loadable as a permanent module genirq: Export irq_chip_retrigger_hierarchy and irq_chip_set_vcpu_affinity_parent irqdomain: Export irq_domain_update_bus_token ...
2020-08-04init: add an init_dup helperChristoph Hellwig
Add a simple helper to grab a reference to a file and install it at the next available fd, and switch the early init code over to it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-04scsi: lpfc: Update lpfc version to 12.8.0.3Dick Kennedy
Update lpfc version to 12.8.0.3 Link: https://lore.kernel.org/r/20200803210229.23063-9-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: Fix LUN loss after cable pullDick Kennedy
On devices that support FCP sequence error recovery, which attempts to preserve the devices login across link bounce, adisc is used for device validation. Turns out the device fc4 type is cleared as part of the link bounce, but the ADISC handling doesn't restore the FC4 support as it normally would with a PRLI. This caused situations where the device wasn't reregistered with the transport thus scan logic and LUN discovery never kicked in. In the ADISC completion handling, reset the fc4 type so that transport port reregistration occurs with the remote port. Link: https://lore.kernel.org/r/20200803210229.23063-8-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: Fix validation of bsg reply lengthsDick Kennedy
There are a couple of code areas which validate sufficient reply buffer length, but the checks are using the request elements rather than the reply elements. Rework to validate using the reply structures. Link: https://lore.kernel.org/r/20200803210229.23063-7-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: Fix retry of PRLI when status indicates its unsupportedDick Kennedy
With port bounce/address swaps and timing between initiator GID queries vs remote port FC4 support registrations, the driver may be in a situation where it sends PRLIs for both FCP and NVME even though the target may not support one of the protocols. In this case, the remote port will reject the PRLI and usually indicate it does not support the request. However, the driver currently ignores the status of the failure and immediately retries the PRLI, which is pointless. In the case of this one remote port, the reception of the PRLI retry caused it to decide to send a LOGO. The LOGO restarted the process and the same results happened. It made the remote port undiscoverable to either protocol. Add logic to detect the non-support status and not attempt the retry of the PRLI. Link: https://lore.kernel.org/r/20200803210229.23063-6-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: Fix oops when unloading driver while running mds diagsDick Kennedy
While mds diagnostic tests are running, if the driver is requested to be unloaded, oops or hangs are observed. The driver doesn't terminate the processing of diag frames when the unload is started. As such: oops may be seen for __lpfc_sli_release_iocbq_s4 because ring memory is referenced that was already freed; or hangs see in lpfc_nvme_wait_for_io_drain as ios no longer complete. If unloading, don't process diag frames. Just clean them up. Link: https://lore.kernel.org/r/20200803210229.23063-5-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: Fix RSCN timeout due to incorrect gidft counterDick Kennedy
In configs with a large number of initiators in the same zone (>250), RSCN timeouts are seen when creating or deleting vports: lpfc 0000:07:00.1: 5:(0):0231 RSCN timeout Data: x0 x3 During RSCN processing driver issues GID_FT command to nameserver. A counter for number of simultaneous GID_FT commands is maintained (an unsigned value). The counter is incremented when the GID_FT is issued. If the GID_FT command fails for some reason the driver retries the GID_FT from the completion call back. But the counter was decremented before the retry was issued. When the second GID_FT completes, the callback again tries to decrement the counter, possibly wrapping to a very large non-zero value, which causes the RSCN cleanup code to not execute. Thus the RSCN timeout failure. Do not decrement the counter on a retry. Also add defensive checks to ensure the counter is not decremented if already zero. Link: https://lore.kernel.org/r/20200803210229.23063-4-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: Fix no message shown for lpfc_hdw_queue out of range valueDick Kennedy
If module parameters override the default configuration settings for hardware queues or irqs, the driver was not notifying the change from defaults. Revise such that any changes will result in a kernel log message. Link: https://lore.kernel.org/r/20200803210229.23063-3-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: Fix FCoE speed reportingDick Kennedy
Current Link speed was shown as "unknown" in sysfs for FCoE ports. In this scenario, the port was working in 20G speed, which happens to not be a speed handled by the driver. Add support for all possible link speeds that could get reported from port_speed field in link state ACQE. Additionally, as supported_speeds can't be manipulated via the FCoE driver on a converged ethernet port (it must be managed by the nic function), don't fill out the supported_speeds field for the fc host object in sysfs. Revise debug logging to report Link speed mgmt valuess. Link: https://lore.kernel.org/r/20200803210229.23063-2-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: Add missing misc_deregister() for lpfc_init()Jing Xiangfeng
lpfc_init() misses a call misc_deregister() in an error path. Add a label 'unregister' to fix it. Link: https://lore.kernel.org/r/20200731065639.190646-1-jingxiangfeng@huawei.com Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: lpfc: nvmet: Avoid hang / use-after-free again when destroying targetportEwan D. Milne
We cannot wait on a completion object in the lpfc_nvme_targetport structure in the _destroy_targetport() code path because the NVMe/fc transport will free that structure immediately after the .targetport_delete() callback. This results in a use-after-free, and a crash if slub_debug=FZPU is enabled. An earlier fix put put the completion on the stack, but commit 2a0fb340fcc8 ("scsi: lpfc: Correct localport timeout duration error") subsequently changed the code to reference the completion through a pointer in the object rather than the local stack variable. Fix this by using the stack variable directly. Link: https://lore.kernel.org/r/20200729231011.13240-1-emilne@redhat.com Fixes: 2a0fb340fcc8 ("scsi: lpfc: Correct localport timeout duration error") Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: scsi_transport_sas: Add spaces around binary operator "|"Xiang Chen
According to the kernel coding style, use one space around the binary "|" operator. Add spaces around it. Link: https://lore.kernel.org/r/1596454442-220565-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: sd_zbc: Improve zone revalidationDamien Le Moal
Currently, for zoned disks, since blk_revalidate_disk_zones() requires the disk capacity to be set already to operate correctly, zones revalidation can only be done on the second revalidate scan once the gendisk capacity is set at the end of the first scan. As a result, if zone revalidation fails, there is no second chance to recover from the failure and the disk capacity is changed to 0, with the disk left unusable. This can be improved by shuffling around code, specifically, by moving the call to sd_zbc_revalidate_zones() from sd_zbc_read_zones() to the end of sd_revalidate_disk(), after set_capacity_revalidate_and_notify() is called to set the gendisk capacity. With this change, if sd_zbc_revalidate_zones() fails on the first scan, the second scan will call it again to recover, if possible. Using the new struct scsi_disk fields rev_nr_zones and rev_zone_blocks, sd_zbc_revalidate_zones() does actual work only if it detects a change with the disk zone configuration. This means that for a successful zones revalidation on the first scan, the second scan will not cause another heavy full check. While at it, remove the unecesary "extern" declaration of sd_zbc_read_zones(). Link: https://lore.kernel.org/r/20200731054928.668547-1-damien.lemoal@wdc.com Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: libfc: Free skb in fc_disc_gpn_id_resp() for valid casesJaved Hasan
In fc_disc_gpn_id_resp(), skb is supposed to get freed in all cases except for PTR_ERR. However, in some cases it didn't. This fix is to call fc_frame_free(fp) before function returns. Link: https://lore.kernel.org/r/20200729081824.30996-2-jhasan@marvell.com Reviewed-by: Girish Basrur <gbasrur@marvell.com> Reviewed-by: Santosh Vernekar <svernekar@marvell.com> Reviewed-by: Saurav Kashyap <skashyap@marvell.com> Reviewed-by: Shyam Sundar <ssundar@marvell.com> Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: fcoe: Memory leak fix in fcoe_sysfs_fcf_del()Javed Hasan
In fcoe_sysfs_fcf_del(), we first deleted the fcf from the list and then freed it if ctlr_dev was not NULL. This was causing a memory leak. Free the fcf even if ctlr_dev is NULL. Link: https://lore.kernel.org/r/20200729081824.30996-3-jhasan@marvell.com Reviewed-by: Girish Basrur <gbasrur@marvell.com> Reviewed-by: Santosh Vernekar <svernekar@marvell.com> Reviewed-by: Saurav Kashyap <skashyap@marvell.com> Reviewed-by: Shyam Sundar <ssundar@marvell.com> Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04scsi: target: Make iscsit_register_transport() return voidMax Gurtovoy
This function always returns 0. We can make it return void to simplify the code. Also, no caller ever checks the return value of this function. Link: https://lore.kernel.org/r/20200803150008.83920-1-maxg@mellanox.com Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-04Merge tag 'dma-mapping-5.9' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - make support for dma_ops optional - move more code out of line - add generic support for a dma_ops bypass mode - misc cleanups * tag 'dma-mapping-5.9' of git://git.infradead.org/users/hch/dma-mapping: dma-contiguous: cleanup dma_alloc_contiguous dma-debug: use named initializers for dir2name powerpc: use the generic dma_ops_bypass mode dma-mapping: add a dma_ops_bypass flag to struct device dma-mapping: make support for dma ops optional dma-mapping: inline the fast path dma-direct calls dma-mapping: move the remaining DMA API calls out of line
2020-08-04tracing: Use trace_sched_process_free() instead of exit() for pid tracingSteven Rostedt (VMware)
On exit, if a process is preempted after the trace_sched_process_exit() tracepoint but before the process is done exiting, then when it gets scheduled in, the function tracers will not filter it properly against the function tracing pid filters. That is because the function tracing pid filters hooks to the sched_process_exit() tracepoint to remove the exiting task's pid from the filter list. Because the filtering happens at the sched_switch tracepoint, when the exiting task schedules back in to finish up the exit, it will no longer be in the function pid filtering tables. This was noticeable in the notrace self tests on a preemptable kernel, as the tests would fail as it exits and preempted after being taken off the notrace filter table and on scheduling back in it would not be in the notrace list, and then the ending of the exit function would trace. The test detected this and would fail. Cc: stable@vger.kernel.org Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 1e10486ffee0a ("ftrace: Add 'function-fork' trace option") Fixes: c37775d57830a ("tracing: Add infrastructure to allow set_event_pid to follow children" Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-08-05selftests/powerpc: Fix pkey syscall redefinitionsSandipan Das
On distros using older glibc versions, the pkey tests encounter build failures due to redefinition of the pkey syscall numbers. For compatibility, commit 743f3544fffb added a wrapper for the gettid() syscall and included syscall.h if the version of glibc used is older than 2.30. This leads to different definitions of SYS_pkey_* as the ones in the pkey test header set numeric constants where as the ones from syscall.h reuse __NR_pkey_*. The compiler complains about redefinitions since they are different. This replaces SYS_pkey_* definitions with __NR_pkey_* such that the definitions in both syscall.h and pkeys.h are alike. This way, if syscall.h has to be included for compatibility reasons, builds will still succeed. Fixes: 743f3544fffb ("selftests/powerpc: Add wrapper for gettid") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Suggested-by: David Laight <david.laight@aculab.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a4956d838bf59b0a71a2553c5ca81131ea8b49b9.1596561758.git.sandipan@linux.ibm.com
2020-08-04Merge tag 'uuid-for-5.9' of git://git.infradead.org/users/hch/uuidLinus Torvalds
Pull uuid update from Christoph Hellwig: "Remove a now unused helper (Andy Shevchenko)" * tag 'uuid-for-5.9' of git://git.infradead.org/users/hch/uuid: uuid: remove unused uuid_le_to_bin() definition
2020-08-04farsync: switch from 'pci_' to 'dma_' APIChristophe JAILLET
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below and has been hand modified to replace GFP_ with a correct flag. It has been compile tested. When memory is allocated in 'fst_add_one()', GFP_KERNEL can be used because it is a probe function and no lock is acquired. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04wan: wanxl: switch from 'pci_' to 'dma_' APIChristophe JAILLET
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below and has been hand modified to replace GFP_ with a correct flag. It has been compile tested. When memory is allocated in 'wanxl_pci_init_one()', GFP_KERNEL can be used because it is a probe function and no lock is acquired. Moreover, just a few lines above, GFP_KERNEL is already used. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04hv_netvsc: do not use VF device if link is downStephen Hemminger
If the accelerated networking SRIOV VF device has lost carrier use the synthetic network device which is available as backup path. This is a rare case since if VF link goes down, normally the VMBus device will also loose external connectivity as well. But if the communication is between two VM's on the same host the VMBus device will still work. Reported-by: "Shah, Ashish N" <ashish.n.shah@intel.com> Fixes: 0c195567a8f6 ("netvsc: transparent VF management") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04dpaa2-eth: Fix passing zero to 'PTR_ERR' warningYueHaibing
Fix smatch warning: drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:2419 alloc_channel() warn: passing zero to 'ERR_PTR' setup_dpcon() should return ERR_PTR(err) instead of zero in error handling case. Fixes: d7f5a9d89a55 ("dpaa2-eth: defer probe on object allocate") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-05rtc: ds1307: provide an indication that the watchdog has firedChris Packham
There's not much feedback when the ds1388 watchdog fires. Generally it yanks on the reset line and the board reboots. Capture the fact that the watchdog has fired in the past so that userspace can retrieve it via WDIOC_GETBOOTSTATUS. This should help distinguish a watchdog triggered reset from a power interruption. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200727034615.19755-1-chris.packham@alliedtelesis.co.nz
2020-08-04net: macb: Properly handle phylink on at91sam9xStefan Roese
I just recently noticed that ethernet does not work anymore since v5.5 on the GARDENA smart Gateway, which is based on the AT91SAM9G25. Debugging showed that the "GEM bits" in the NCFGR register are now unconditionally accessed, which is incorrect for the !macb_is_gem() case. This patch adds the macb_is_gem() checks back to the code (in macb_mac_config() & macb_mac_link_up()), so that the GEM register bits are not accessed in this case any more. Fixes: 7897b071ac3b ("net: macb: convert to phylink") Signed-off-by: Stefan Roese <sr@denx.de> Cc: Reto Schneider <reto.schneider@husqvarnagroup.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-04Merge tag 'close-range-v5.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull close_range() implementation from Christian Brauner: "This adds the close_range() syscall. It allows to efficiently close a range of file descriptors up to all file descriptors of a calling task. This is coordinated with the FreeBSD folks which have copied our version of this syscall and in the meantime have already merged it in April 2019: https://reviews.freebsd.org/D21627 https://svnweb.freebsd.org/base?view=revision&revision=359836 The syscall originally came up in a discussion around the new mount API and making new file descriptor types cloexec by default. During this discussion, Al suggested the close_range() syscall. First, it helps to close all file descriptors of an exec()ing task. This can be done safely via (quoting Al's example from [1] verbatim): /* that exec is sensitive */ unshare(CLONE_FILES); /* we don't want anything past stderr here */ close_range(3, ~0U); execve(....); The code snippet above is one way of working around the problem that file descriptors are not cloexec by default. This is aggravated by the fact that we can't just switch them over without massively regressing userspace. For a whole class of programs having an in-kernel method of closing all file descriptors is very helpful (e.g. demons, service managers, programming language standard libraries, container managers etc.). Second, it allows userspace to avoid implementing closing all file descriptors by parsing through /proc/<pid>/fd/* and calling close() on each file descriptor and other hacks. From looking at various large(ish) userspace code bases this or similar patterns are very common in service managers, container runtimes, and programming language runtimes/standard libraries such as Python or Rust. In addition, the syscall will also work for tasks that do not have procfs mounted and on kernels that do not have procfs support compiled in. In such situations the only way to make sure that all file descriptors are closed is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery. Based on Linus' suggestion close_range() also comes with a new flag CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping right before exec. This would usually be expressed in the sequence: unshare(CLONE_FILES); close_range(3, ~0U); as pointed out by Linus it might be desirable to have this be a part of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which gets especially handy when we're closing all file descriptors above a certain threshold. Test-suite as always included" * tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLOSE_RANGE_UNSHARE tests close_range: add CLOSE_RANGE_UNSHARE tests: add close_range() tests arch: wire-up close_range() open: add close_range()
2020-08-05Merge tag 'drm-msm-next-2020-07-30' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-next Take 2 of msm-next pull, this version drops the OPP patch due to [1], so I'll send the gpu opp/bw scaling patch after the OPP patch lands. Since I had to force-push I took the opportunity to rebase on drm-next, and since you already merged in 5.8-rc6 a few fixes from the last cycle dropped out. This time around: * A bunch more a650/a640 (sm8150/sm8250) display and GPU enablement and fixes * Enable dpu dither block for 6bpc panels * dpu suspend fixes * dpu fix for cursor on 2nd display * dsi/mdp5 enablement for sdm630/sdm636/sdm660 I also regenerated the register headers, which accounts for a good bit of the size this time, because we hadn't re-synced the register headers since the early days of a6xx bringup. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/ <CAF6AEGs_eswoX-E0Ddg5DoEQy35x3GG+6SDXUAjPMrtAWFkqng@mail.gmail.com
2020-08-04riscv: disable stack-protector for vDSOTobias Klauser
Currently, building the vDSO with clang leads assembler errors like the following: /tmp/vgettimeofday-1ae0d2.s: Assembler messages: /tmp/vgettimeofday-1ae0d2.s:28: Error: bad expression /tmp/vgettimeofday-1ae0d2.s:28: Error: illegal operands `auipc a2,%got_pcrel_hi(__stack_chk_guard)' Disable the stack-protector for vDSO to fix these. Link: https://github.com/ClangBuiltLinux/linux/issues/1112 Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-08-04RISC-V: Fix build warning for smpboot.cAtish Patra
The following warnings are reported by kbuild with W=1. >> arch/riscv/kernel/smpboot.c:109:5: warning: no previous prototype for 'start_secondary_cpu' [-Wmissing-prototypes] 109 | int start_secondary_cpu(int cpu, struct task_struct *tidle) | ^~~~~~~~~~~~~~~~~~~ arch/riscv/kernel/smpboot.c:146:34: warning: no previous prototype for 'smp_callin' [-Wmissing-prototypes] 146 | asmlinkage __visible void __init smp_callin(void) | ^~~~~~~~~~ Fix the warnings by marking the local functions static and adding the prototype for the global function. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-08-04Merge tag 'cap-checkpoint-restore-v5.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull checkpoint-restore updates from Christian Brauner: "This enables unprivileged checkpoint/restore of processes. Given that this work has been going on for quite some time the first sentence in this summary is hopefully more exciting than the actual final code changes required. Unprivileged checkpoint/restore has seen a frequent increase in interest over the last two years and has thus been one of the main topics for the combined containers & checkpoint/restore microconference since at least 2018 (cf. [1]). Here are just the three most frequent use-cases that were brought forward: - The JVM developers are integrating checkpoint/restore into a Java VM to significantly decrease the startup time. - In high-performance computing environment a resource manager will typically be distributing jobs where users are always running as non-root. Long-running and "large" processes with significant startup times are supposed to be checkpointed and restored with CRIU. - Container migration as a non-root user. In all of these scenarios it is either desirable or required to run without CAP_SYS_ADMIN. The userspace implementation of checkpoint/restore CRIU already has the pull request for supporting unprivileged checkpoint/restore up (cf. [2]). To enable unprivileged checkpoint/restore a new dedicated capability CAP_CHECKPOINT_RESTORE is introduced. This solution has last been discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1] "Update on Task Migration at Google Using CRIU") with Adrian and Nicolas providing the implementation now over the last months. In essence, this allows the CRIU binary to be installed with the CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling unprivileged users to restore processes. To make this possible the following permissions are altered: - Selecting a specific PID via clone3() set_tid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Accessing /proc/pid/map_files relaxed from init userns CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE. - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. Of these four changes the /proc/self/exe change deserves a few words because the reasoning behind even restricting /proc/self/exe changes in the first place is just full of historical quirks and tracking this down was a questionable version of fun that I'd like to spare others. In short, it is trivial to change /proc/self/exe as an unprivileged user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace() or by simply intercepting the elf loader in userspace during exec. Nicolas was nice enough to even provide a POC for the latter (cf. [3]) to illustrate this fact. The original patchset which introduced PR_SET_MM_MAP had no permissions around changing the exe link. They too argued that it is trivial to spoof the exe link already which is true. The argument brought up against this was that the Tomoyo LSM uses the exe link in tomoyo_manager() to detect whether the calling process is a policy manager. This caused changing the exe links to be guarded by userns CAP_SYS_ADMIN. All in all this rather seems like a "better guard it with something rather than nothing" argument which imho doesn't qualify as a great security policy. Again, because spoofing the exe link is possible for the calling process so even if this were security relevant it was broken back then and would be broken today. So technically, dropping all permissions around changing the exe link would probably be possible and would send a clearer message to any userspace that relies on /proc/self/exe for security reasons that they should stop doing this but for now we're only relaxing the exe link permissions from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. There's a final uapi change in here. Changing the exe link used to accidently return EINVAL when the caller lacked the necessary permissions instead of the more correct EPERM. This pr contains a commit fixing this. I assume that userspace won't notice or care and if they do I will revert this commit. But since we are changing the permissions anyway it seems like a good opportunity to try this fix. With these changes merged unprivileged checkpoint/restore will be possible and has already been tested by various users" [1] LPC 2018 1. "Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095 2. "Securely Migrating Untrusted Workloads with CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400 LPC 2019 1. "CRIU and the PID dance" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s 2. "Update on Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s [2] https://github.com/checkpoint-restore/criu/pull/1155 [3] https://github.com/nviennot/run_as_exe * tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add clone3() CAP_CHECKPOINT_RESTORE test prctl: exe link permission error changed from -EINVAL to -EPERM prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid pid: use checkpoint_restore_ns_capable() for set_tid capabilities: Introduce CAP_CHECKPOINT_RESTORE
2020-08-04Merge tag 'fork-v5.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull fork cleanups from Christian Brauner: "This is cleanup series from when we reworked a chunk of the process creation paths in the kernel and switched to struct {kernel_}clone_args. High-level this does two main things: - Remove the double export of both do_fork() and _do_fork() where do_fork() used the incosistent legacy clone calling convention. Now we only export _do_fork() which is based on struct kernel_clone_args. - Remove the copy_thread_tls()/copy_thread() split making the architecture specific HAVE_COYP_THREAD_TLS config option obsolete. This switches all remaining architectures to select HAVE_COPY_THREAD_TLS and thus to the copy_thread_tls() calling convention. The current split makes the process creation codepaths more convoluted than they need to be. Each architecture has their own copy_thread() function unless it selects HAVE_COPY_THREAD_TLS then it has a copy_thread_tls() function. The split is not needed anymore nowadays, all architectures support CLONE_SETTLS but quite a few of them never bothered to select HAVE_COPY_THREAD_TLS and instead simply continued to use copy_thread() and use the old calling convention. Removing this split cleans up the process creation codepaths and paves the way for implementing clone3() on such architectures since it requires the copy_thread_tls() calling convention. After having made each architectures support copy_thread_tls() this series simply renames that function back to copy_thread(). It also switches all architectures that call do_fork() directly over to _do_fork() and the struct kernel_clone_args calling convention. This is a corollary of switching the architectures that did not yet support it over to copy_thread_tls() since do_fork() is conditional on not supporting copy_thread_tls() (Mostly because it lacks a separate argument for tls which is trivial to fix but there's no need for this function to exist.). The do_fork() removal is in itself already useful as it allows to to remove the export of both do_fork() and _do_fork() we currently have in favor of only _do_fork(). This has already been discussed back when we added clone3(). The legacy clone() calling convention is - as is probably well-known - somewhat odd: # # ABI hall of shame # config CLONE_BACKWARDS config CLONE_BACKWARDS2 config CLONE_BACKWARDS3 that is aggravated by the fact that some architectures such as sparc follow the CLONE_BACKWARDSx calling convention but don't really select the corresponding config option since they call do_fork() directly. So do_fork() enforces a somewhat arbitrary calling convention in the first place that doesn't really help the individual architectures that deviate from it. They can thus simply be switched to _do_fork() enforcing a single calling convention. (I really hope that any new architectures will __not__ try to implement their own calling conventions...) Most architectures already have made a similar switch (m68k comes to mind). Overall this removes more code than it adds even with a good portion of added comments. It simplifies a chunk of arch specific assembly either by moving the code into C or by simply rewriting the assembly. Architectures that have been touched in non-trivial ways have all been actually boot and stress tested: sparc and ia64 have been tested with Debian 9 images. They are the two architectures which have been touched the most. All non-trivial changes to architectures have seen acks from the relevant maintainers. nios2 with a custom built buildroot image. h8300 I couldn't get something bootable to test on but the changes have been fairly automatic and I'm sure we'll hear people yell if I broke something there. All other architectures that have been touched in trivial ways have been compile tested for each single patch of the series via git rebase -x "make ..." v5.8-rc2. arm{64} and x86{_64} have been boot tested even though they have just been trivially touched (removal of the HAVE_COPY_THREAD_TLS macro from their Kconfig) because well they are basically "core architectures" and since it is trivial to get your hands on a useable image" * tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: arch: rename copy_thread_tls() back to copy_thread() arch: remove HAVE_COPY_THREAD_TLS unicore: switch to copy_thread_tls() sh: switch to copy_thread_tls() nds32: switch to copy_thread_tls() microblaze: switch to copy_thread_tls() hexagon: switch to copy_thread_tls() c6x: switch to copy_thread_tls() alpha: switch to copy_thread_tls() fork: remove do_fork() h8300: select HAVE_COPY_THREAD_TLS, switch to kernel_clone_args nios2: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args ia64: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args sparc: unconditionally enable HAVE_COPY_THREAD_TLS sparc: share process creation helpers between sparc and sparc64 sparc64: enable HAVE_COPY_THREAD_TLS fork: fold legacy_clone_args_valid() into _do_fork()
2020-08-04Merge tag 'threads-v5.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread updates from Christian Brauner: "This contains the changes to add the missing support for attaching to time namespaces via pidfds. Last cycle setns() was changed to support attaching to multiple namespaces atomically. This requires all namespaces to have a point of no return where they can't fail anymore. Specifically, <namespace-type>_install() is allowed to perform permission checks and install the namespace into the new struct nsset that it has been given but it is not allowed to make visible changes to the affected task. Once <namespace-type>_install() returns, anything that the given namespace type additionally requires to be setup needs to ideally be done in a function that can't fail or if it fails the failure must be non-fatal. For time namespaces the relevant functions that fell into this category were timens_set_vvar_page() and vdso_join_timens(). The latter could still fail although it didn't need to. This function is only implemented for vdso_join_timens() in current mainline. As discussed on-list (cf. [1]), in order to make setns() support time namespaces when attaching to multiple namespaces at once properly we changed vdso_join_timens() to always succeed. So vdso_join_timens() replaces the mmap_write_lock_killable() with mmap_read_lock(). Please note that arm is about to grow vdso support for time namespaces (possibly this merge window). We've synced on this change and arm64 also uses mmap_read_lock(), i.e. makes vdso_join_timens() a function that can't fail. Once the changes here and the arm64 changes have landed, vdso_join_timens() should be turned into a void function so it's obvious to callers and implementers on other architectures that the expectation is that it can't fail. We didn't do this right away because it would've introduced unnecessary merge conflicts between the two trees for no major gain. As always, tests included" [1]: https://lore.kernel.org/lkml/20200611110221.pgd3r5qkjrjmfqa2@wittgenstein * tag 'threads-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLONE_NEWTIME setns tests nsproxy: support CLONE_NEWTIME with setns() timens: add timens_commit() helper timens: make vdso_join_timens() always succeed
2020-08-04hwmon: (adc128d818) Fix advanced configuration register initRoy van Doormaal
If the operation mode is non-zero and an external reference voltage is set, first the operation mode is written to the advanced configuration register, followed by the externel reference enable bit, resetting the configuration mode to 0. To fix this, first compose the value of the advanced configuration register based on the configuration mode and the external reference voltage. The advanced configuration register is then written to the device, if it is different from the default register value (0x0). Signed-off-by: Roy van Doormaal <roy.van.doormaal@prodrive-technologies.com> Link: https://lore.kernel.org/r/20200728151846.231785-1-roy.van.doormaal@prodrive-technologies.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-08-04Merge branch 'exec-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull execve updates from Eric Biederman: "During the development of v5.7 I ran into bugs and quality of implementation issues related to exec that could not be easily fixed because of the way exec is implemented. So I have been diggin into exec and cleaning up what I can. This cycle I have been looking at different ideas and different implementations to see what is possible to improve exec, and cleaning the way exec interfaces with in kernel users. Only cleaning up the interfaces of exec with rest of the kernel has managed to stabalize and make it through review in time for v5.9-rc1 resulting in 2 sets of changes this cycle. - Implement kernel_execve - Make the user mode driver code a better citizen With kernel_execve the code size got a little larger as the copying of parameters from userspace and copying of parameters from userspace is now separate. The good news is kernel threads no longer need to play games with set_fs to use exec. Which when combined with the rest of Christophs set_fs changes should security bugs with set_fs much more difficult" * 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits) exec: Implement kernel_execve exec: Factor bprm_stack_limits out of prepare_arg_pages exec: Factor bprm_execve out of do_execve_common exec: Move bprm_mm_init into alloc_bprm exec: Move initialization of bprm->filename into alloc_bprm exec: Factor out alloc_bprm exec: Remove unnecessary spaces from binfmts.h umd: Stop using split_argv umd: Remove exit_umh bpfilter: Take advantage of the facilities of struct pid exit: Factor thread_group_exited out of pidfd_poll umd: Track user space drivers with struct pid bpfilter: Move bpfilter_umh back into init data exec: Remove do_execve_file umh: Stop calling do_execve_file umd: Transform fork_usermode_blob into fork_usermode_driver umd: Rename umd_info.cmdline umd_info.driver_name umd: For clarity rename umh_info umd_info umh: Separate the user mode driver and the user mode helper support umh: Remove call_usermodehelper_setup_file. ...
2020-08-04hwmon: (axi-fan-control) remove duplicate macrosAlexandru Ardelean
These macros are also present in the "include/linux/fpga/adi-axi-common.h" file which is included in this driver. This patch removes them from the AXI Fan Control driver. No sense in having them in 2 places. Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Link: https://lore.kernel.org/r/20200803054311.98174-1-alexandru.ardelean@analog.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-08-04hwmon: (i5k_amb, vt8231) Drop uses of pci_read_config_*() return valueSaheed O. Bolarinwa
The return value of pci_read_config_*() may not indicate a device error. However, the value read by these functions is more likely to indicate this kind of error. This presents two overlapping ways of reporting errors and complicates error checking. It is possible to move to one single way of checking for error if the dependency on the return value of these functions is removed, then it can later be made to return void. Remove all uses of the return value of pci_read_config_*(). Check the actual value read for ~0. In this case, ~0 is an invalid value thus it indicates some kind of error. Suggested-by: Bjorn Helgaas <bjorn@helgaas.com> Signed-off-by: Saheed O. Bolarinwa <refactormyself@gmail.com> Link: https://lore.kernel.org/r/20200801112446.149549-11-refactormyself@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>