summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2021-12-10EDAC: Add RDDR5 and LRDDR5 memory typesYazen Ghannam
Include Registered-DDR5 and Load-Reduced DDR5 in the list of memory types. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211208174356.1997855-2-yazen.ghannam@amd.com
2021-12-09pktdvd: stop using bdi congestion framework.NeilBrown
The bdi congestion framework isn't widely used and should be deprecated. pktdvd makes use of it to track congestion, but this can be done entirely internally to pktdvd, so it doesn't need to use the framework. So introduce a "congested" flag. When waiting for bio_queue_size to drop, set this flag and a var_waitqueue() to wait for it. When bio_queue_size does drop and this flag is set, clear the flag and call wake_up_var(). We don't use a wait_var_event macro for the waiting as we need to set the flag and drop the spinlock before calling schedule() and while that is possible with __wait_var_event(), result is not easy to read. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/163910843527.9928.857338663717630212@noble.neil.brown.name Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-12-09kcsan: Turn barrier instrumentation into macrosMarco Elver
Some architectures use barriers in 'extern inline' functions, from which we should not refer to static inline functions. For example, building Alpha with gcc and W=1 shows: ./include/asm-generic/barrier.h:70:30: warning: 'kcsan_rmb' is static but used in inline function 'pmd_offset' which is not static 70 | #define smp_rmb() do { kcsan_rmb(); __smp_rmb(); } while (0) | ^~~~~~~~~ ./arch/alpha/include/asm/pgtable.h:293:9: note: in expansion of macro 'smp_rmb' 293 | smp_rmb(); /* see above */ | ^~~~~~~ Which seems to warn about 6.7.4#3 of the C standard: "An inline definition of a function with external linkage shall not contain a definition of a modifiable object with static or thread storage duration, and shall not contain a reference to an identifier with internal linkage." Fix it by turning barrier instrumentation into macros, which matches definitions in <asm/barrier.h>. Perhaps we can revert this change in future, when there are no more 'extern inline' users left. Link: https://lkml.kernel.org/r/202112041334.X44uWZXf-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09kcsan: Support WEAK_MEMORY with Clang where no objtool support existsMarco Elver
Clang and GCC behave a little differently when it comes to the __no_sanitize_thread attribute, which has valid reasons, and depending on context either one could be right. Traditionally, user space ThreadSanitizer [1] still expects instrumented builtin atomics (to avoid false positives) and __tsan_func_{entry,exit} (to generate meaningful stack traces), even if the function has the attribute no_sanitize("thread"). [1] https://clang.llvm.org/docs/ThreadSanitizer.html#attribute-no-sanitize-thread GCC doesn't follow the same policy (for better or worse), and removes all kinds of instrumentation if no_sanitize is added. Arguably, since this may be a problem for user space ThreadSanitizer, we expect this may change in future. Since KCSAN != ThreadSanitizer, the likelihood of false positives even without barrier instrumentation everywhere, is much lower by design. At least for Clang, however, to fully remove all sanitizer instrumentation, we must add the disable_sanitizer_instrumentation attribute, which is available since Clang 14.0. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09compiler_attributes.h: Add __disable_sanitizer_instrumentationAlexander Potapenko
The new attribute maps to __attribute__((disable_sanitizer_instrumentation)), which will be supported by Clang >= 14.0. Future support in GCC is also possible. This attribute disables compiler instrumentation for kernel sanitizer tools, making it easier to implement noinstr. It is different from the existing __no_sanitize* attributes, which may still allow certain types of instrumentation to prevent false positives. Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09locking/atomics, kcsan: Add instrumentation for barriersMarco Elver
Adds the required KCSAN instrumentation for barriers of atomics. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09locking/barriers, kcsan: Add instrumentation for barriersMarco Elver
Adds the required KCSAN instrumentation for barriers if CONFIG_SMP. KCSAN supports modeling the effects of: smp_mb() smp_rmb() smp_wmb() smp_store_release() Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09kcsan: Add core memory barrier instrumentation functionsMarco Elver
Add the core memory barrier instrumentation functions. These invalidate the current in-flight reordered access based on the rules for the respective barrier types and in-flight access type. To obtain barrier instrumentation that can be disabled via __no_kcsan with appropriate compiler-support (and not just with objtool help), barrier instrumentation repurposes __atomic_signal_fence(), instead of inserting explicit calls. Crucially, __atomic_signal_fence() normally does not map to any real instructions, but is still intercepted by fsanitize=thread. As a result, like any other instrumentation done by the compiler, barrier instrumentation can be disabled with __no_kcsan. Unfortunately Clang and GCC currently differ in their __no_kcsan aka __no_sanitize_thread behaviour with respect to builtin atomics (and __tsan_func_{entry,exit}) instrumentation. This is already reflected in Kconfig.kcsan's dependencies for KCSAN_WEAK_MEMORY. A later change will introduce support for newer versions of Clang that can implement __no_kcsan to also remove the additional instrumentation introduced by KCSAN_WEAK_MEMORY. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09kcsan: Add core support for a subset of weak memory modelingMarco Elver
Add support for modeling a subset of weak memory, which will enable detection of a subset of data races due to missing memory barriers. KCSAN's approach to detecting missing memory barriers is based on modeling access reordering, and enabled if `CONFIG_KCSAN_WEAK_MEMORY=y`, which depends on `CONFIG_KCSAN_STRICT=y`. The feature can be enabled or disabled at boot and runtime via the `kcsan.weak_memory` boot parameter. Each memory access for which a watchpoint is set up, is also selected for simulated reordering within the scope of its function (at most 1 in-flight access). We are limited to modeling the effects of "buffering" (delaying the access), since the runtime cannot "prefetch" accesses (therefore no acquire modeling). Once an access has been selected for reordering, it is checked along every other access until the end of the function scope. If an appropriate memory barrier is encountered, the access will no longer be considered for reordering. When the result of a memory operation should be ordered by a barrier, KCSAN can then detect data races where the conflict only occurs as a result of a missing barrier due to reordering accesses. Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09kcsan: Avoid checking scoped accesses from nested contextsMarco Elver
Avoid checking scoped accesses from nested contexts (such as nested interrupts or in scheduler code) which share the same kcsan_ctx. This is to avoid detecting false positive races of accesses in the same thread with currently scoped accesses: consider setting up a watchpoint for a non-scoped (normal) access that also "conflicts" with a current scoped access. In a nested interrupt (or in the scheduler), which shares the same kcsan_ctx, we cannot check scoped accesses set up in the parent context -- simply ignore them in this case. With the introduction of kcsan_ctx::disable_scoped, we can also clean up kcsan_check_scoped_accesses()'s recursion guard, and do not need to modify the list's prev pointer. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09percpu_ref: Replace kernel.h with the necessary inclusionsAndy Shevchenko
When kernel.h is used in the headers it adds a lot into dependency hell, especially when there are circular dependencies are involved. Replace kernel.h inclusion with the list of what is really being used. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dennis Zhou <dennis@kernel.org>
2021-12-09skbuff: Extract list pointers to silence compiler warningsKees Cook
Under both -Warray-bounds and the object_size sanitizer, the compiler is upset about accessing prev/next of sk_buff when the object it thinks it is coming from is sk_buff_head. The warning is a false positive due to the compiler taking a conservative approach, opting to warn at casting time rather than access time. However, in support of enabling -Warray-bounds globally (which has found many real bugs), arrange things for sk_buff so that the compiler can unambiguously see that there is no intention to access anything except prev/next. Introduce and cast to a separate struct sk_buff_list, which contains _only_ the first two fields, silencing the warnings: In file included from ./include/net/net_namespace.h:39, from ./include/linux/netdevice.h:37, from net/core/netpoll.c:17: net/core/netpoll.c: In function 'refill_skbs': ./include/linux/skbuff.h:2086:9: warning: array subscript 'struct sk_buff[0]' is partly outside array bounds of 'struct sk_buff_head[1]' [-Warray-bounds] 2086 | __skb_insert(newsk, next->prev, next, list); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/core/netpoll.c:49:28: note: while referencing 'skb_pool' 49 | static struct sk_buff_head skb_pool; | ^~~~~~~~ This change results in no executable instruction differences. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20211207062758.2324338-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09Merge branches 'doc.2021.11.30c', 'exp.2021.12.07a', 'fastnohz.2021.11.30c', ↵Paul E. McKenney
'fixes.2021.11.30c', 'nocb.2021.12.09a', 'nolibc.2021.11.30c', 'tasks.2021.12.09a', 'torture.2021.12.07a' and 'torturescript.2021.11.30c' into HEAD doc.2021.11.30c: Documentation updates. exp.2021.12.07a: Expedited-grace-period fixes. fastnohz.2021.11.30c: Remove CONFIG_RCU_FAST_NO_HZ. fixes.2021.11.30c: Miscellaneous fixes. nocb.2021.12.09a: No-CB CPU updates. nolibc.2021.11.30c: Tiny in-kernel library updates. tasks.2021.12.09a: RCU-tasks updates, including update-side scalability. torture.2021.12.07a: Torture-test in-kernel module updates. torturescript.2021.11.30c: Torture-test scripting updates.
2021-12-09Merge tag 'net-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - bpf, sockmap: re-evaluate proto ops when psock is removed from sockmap Current release - new code bugs: - bpf: fix bpf_check_mod_kfunc_call for built-in modules - ice: fixes for TC classifier offloads - vrf: don't run conntrack on vrf with !dflt qdisc Previous releases - regressions: - bpf: fix the off-by-two error in range markings - seg6: fix the iif in the IPv6 socket control block - devlink: fix netns refcount leak in devlink_nl_cmd_reload() - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports Previous releases - always broken: - ethtool: do not perform operations on net devices being unregistered - udp: use datalen to cap max gso segments - ice: fix races in stats collection - fec: only clear interrupt of handling queue in fec_enet_rx_queue() - m_can: pci: fix incorrect reference clock rate - m_can: disable and ignore ELO interrupt - mvpp2: fix XDP rx queues registering Misc: - treewide: add missing includes masked by cgroup -> bpf.h dependency" * tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports net: wwan: iosm: fixes unable to send AT command during mbim tx net: wwan: iosm: fixes net interface nonfunctional after fw flash net: wwan: iosm: fixes unnecessary doorbell send net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering MAINTAINERS: s390/net: remove myself as maintainer net/sched: fq_pie: prevent dismantle issue net: mana: Fix memory leak in mana_hwc_create_wq seg6: fix the iif in the IPv6 socket control block nfp: Fix memory leak in nfp_cpp_area_cache_add() nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done nfc: fix segfault in nfc_genl_dump_devices_done udp: using datalen to cap max gso segments net: dsa: mv88e6xxx: error handling for serdes_power functions can: kvaser_usb: get CAN clock frequency from device can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter net: mvpp2: fix XDP rx queues registering vmxnet3: fix minimum vectors alloc issue net, neigh: clear whole pneigh_entry at alloc time net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" ...
2021-12-09net: phylink: use legacy_pre_march2020Russell King (Oracle)
Use the legacy flag to indicate whether we should operate in legacy mode. This allows us to stop using the presence of a PCS as an indicator to the age of the phylink user, and make PCS presence optional. Legacy mode involves: 1) calling mac_config() whenever the link comes up 2) calling mac_config() whenever the inband advertisement changes, possibly followed by a call to mac_an_restart() 3) making use of mac_an_restart() 4) making use of mac_pcs_get_state() All the above functionality was moved to a seperate "PCS" block of operations in March 2020. Update the documents to indicate that the differences that this flag makes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net: phylink: add legacy_pre_march2020 indicatorRussell King (Oracle)
Add a boolean to phylink_config to indicate whether a driver has not been updated for the changes in commit 7cceb599d15d ("net: phylink: avoid mac_config calls"), and thus are reliant on the old behaviour. We were currently keying the phylink behaviour on the presence of a PCS, but this is sub-optimal for modern drivers that may not have a PCS. This commit merely introduces the new flag, but does not add any use, since we need all legacy drivers to set this flag before it can be used. Once these legacy drivers have been updated, we can remove this flag. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09fs_parse: allow parameter value to be emptyLukas Czerner
Allow parameter value to be empty by specifying fs_param_can_be_empty flag. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Link: https://lore.kernel.org/r/20211027141857.33657-2-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-12-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fixes for various drivers which assume that a HID device is on USB transport, but that might not necessarily be the case, as the device can be faked by uhid. (Greg, Benjamin Tissoires) - fix for spurious wakeups on certain Lenovo notebooks (Thomas Weißschuh) - a few other device-specific quirks * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for Elan touchscreen on Asus UX550VE HID: intel-ish-hid: ipc: only enable IRQ wakeup when requested HID: google: add eel USB id HID: add USB_HID dependancy to hid-prodikeys HID: add USB_HID dependancy to hid-chicony HID: bigbenff: prevent null pointer dereference HID: sony: fix error path in probe HID: add USB_HID dependancy on some USB HID drivers HID: check for valid USB device for many HID drivers HID: wacom: fix problems when device is not a valid USB device HID: add hid_is_usb() function to make it simpler for USB detection HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
2021-12-09wait: add wake_up_pollfree()Eric Biggers
Several ->poll() implementations are special in that they use a waitqueue whose lifetime is the current task, rather than the struct file as is normally the case. This is okay for blocking polls, since a blocking poll occurs within one task; however, non-blocking polls require another solution. This solution is for the queue to be cleared before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'. However, that has a bug: wake_up_poll() calls __wake_up() with nr_exclusive=1. Therefore, if there are multiple "exclusive" waiters, and the wakeup function for the first one returns a positive value, only that one will be called. That's *not* what's needed for POLLFREE; POLLFREE is special in that it really needs to wake up everyone. Considering the three non-blocking poll systems: - io_uring poll doesn't handle POLLFREE at all, so it is broken anyway. - aio poll is unaffected, since it doesn't support exclusive waits. However, that's fragile, as someone could add this feature later. - epoll doesn't appear to be broken by this, since its wakeup function returns 0 when it sees POLLFREE. But this is fragile. Although there is a workaround (see epoll), it's better to define a function which always sends POLLFREE to all waiters. Add such a function. Also make it verify that the queue really becomes empty after all waiters have been woken up. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211209010455.42744-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-12-09KVM: Introduce CONFIG_HAVE_KVM_DIRTY_RINGDavid Woodhouse
I'd like to make the build include dirty_ring.c based on whether the arch wants it or not. That's a whole lot simpler if there's a config symbol instead of doing it implicitly on KVM_DIRTY_LOG_PAGE_OFFSET being set to something non-zero. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211121125451.9489-2-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-09bus: mhi: core: Add support for forced PM resumeLoic Poulain
For whatever reason, some devices like QCA6390, WCN6855 using ath11k are not in M3 state during PM resume, but still functional. The mhi_pm_resume should then not fail in those cases, and let the higher level device specific stack continue resuming process. Add an API mhi_pm_resume_force(), to force resuming irrespective of the current MHI state. This fixes a regression with non functional ath11k WiFi after suspend/resume cycle on some machines. Bug report: https://bugzilla.kernel.org/show_bug.cgi?id=214179 Link: https://lore.kernel.org/regressions/871r5p0x2u.fsf@codeaurora.org/ Fixes: 020d3b26c07a ("bus: mhi: Early MHI resume failure in non M3 state") Cc: stable@vger.kernel.org #5.13 Reported-by: Kalle Valo <kvalo@codeaurora.org> Reported-by: Pengyu Ma <mapengyu@gmail.com> Tested-by: Kalle Valo <kvalo@kernel.org> Acked-by: Kalle Valo <kvalo@kernel.org> Signed-off-by: Loic Poulain <loic.poulain@linaro.org> [mani: Switched to API, added bug report, reported-by tags and CCed stable] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20211209131633.4168-1-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-09mtd: Introduce an expert mode for forensics and debugging purposesMiquel Raynal
When developping NAND controller drivers or when debugging filesystem corruptions, it is quite common to need hacking locally into the MTD/NAND core in order to get access to the content of the bad blocks. Instead of having multiple implementations out there let's provide a simple yet effective specific MTD-wide debugfs entry to fully disable these checks on purpose. A warning is added to inform the user when this mode gets enabled. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20211118114659.1282855-1-miquel.raynal@bootlin.com
2021-12-09x86/sgx: Add an attribute for the amount of SGX memory in a NUMA nodeJarkko Sakkinen
== Problem == The amount of SGX memory on a system is determined by the BIOS and it varies wildly between systems. It can be as small as dozens of MB's and as large as many GB's on servers. Just like how applications need to know how much regular RAM is available, enclave builders need to know how much SGX memory an enclave can consume. == Solution == Introduce a new sysfs file: /sys/devices/system/node/nodeX/x86/sgx_total_bytes to enumerate the amount of SGX memory available in each NUMA node. This serves the same function for SGX as /proc/meminfo or /sys/devices/system/node/nodeX/meminfo does for normal RAM. 'sgx_total_bytes' is needed today to help drive the SGX selftests. SGX-specific swap code is exercised by creating overcommitted enclaves which are larger than the physical SGX memory on the system. They currently use a CPUID-based approach which can diverge from the actual amount of SGX memory available. 'sgx_total_bytes' ensures that the selftests can work efficiently and do not attempt stupid things like creating a 100,000 MB enclave on a system with 128 MB of SGX memory. == Implementation Details == Introduce CONFIG_HAVE_ARCH_NODE_DEV_GROUP opt-in flag to expose an arch specific attribute group, and add an attribute for the amount of SGX memory in bytes to each NUMA node: == ABI Design Discussion == As opposed to the per-node ABI, a single, global ABI was considered. However, this would prevent enclaves from being able to size themselves so that they fit on a single NUMA node. Essentially, a single value would rule out NUMA optimizations for enclaves. Create a new "x86/" directory inside each "nodeX/" sysfs directory. 'sgx_total_bytes' is expected to be the first of at least a few sgx-specific files to be placed in the new directory. Just scanning /proc/meminfo, these are the no-brainers that we have for RAM, but we need for SGX: MemTotal: xxxx kB // sgx_total_bytes (implemented here) MemFree: yyyy kB // sgx_free_bytes SwapTotal: zzzz kB // sgx_swapped_bytes So, at *least* three. I think we will eventually end up needing something more along the lines of a dozen. A new directory (as opposed to being in the nodeX/ "root") directory avoids cluttering the root with several "sgx_*" files. Place the new file in a new "nodeX/x86/" directory because SGX is highly x86-specific. It is very unlikely that any other architecture (or even non-Intel x86 vendor) will ever implement SGX. Using "sgx/" as opposed to "x86/" was also considered. But, there is a real chance this can get used for other arch-specific purposes. [ dhansen: rewrite changelog ] Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211116162116.93081-2-jarkko@kernel.org
2021-12-09genirq/msi: Handle PCI/MSI allocation fail in core codeThomas Gleixner
Get rid of yet another irqdomain callback and let the core code return the already available information of how many descriptors could be allocated. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI Link: https://lore.kernel.org/r/20211206210225.046615302@linutronix.de
2021-12-09PCI/MSI: Make pci_msi_domain_check_cap() staticThomas Gleixner
No users outside of that file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20211206210224.980989243@linutronix.de
2021-12-09PCI/MSI: Move msi_lock to struct pci_devThomas Gleixner
It's only required for PCI/MSI. So no point in having it in every struct device. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20211206210224.925241961@linutronix.de
2021-12-09PCI/MSI: Sanitize MSI-X table map handlingThomas Gleixner
Unmapping the MSI-X base mapping in the loops which allocate/free MSI descriptors is daft and in the way of allowing runtime expansion of MSI-X descriptors. Store the mapping in struct pci_dev and free it after freeing the MSI-X descriptors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20211206210224.871651518@linutronix.de
2021-12-09PCI/MSI: Split out irqdomain codeThomas Gleixner
Move the irqdomain specific code into its own file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20211206210224.817754783@linutronix.de
2021-12-09PCI/MSI: Make arch_restore_msi_irqs() less horrible.Thomas Gleixner
Make arch_restore_msi_irqs() return a boolean which indicates whether the core code should restore the MSI message or not. Get rid of the indirection in x86. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI Link: https://lore.kernel.org/r/20211206210224.485668098@linutronix.de
2021-12-09genirq/msi, treewide: Use a named struct for PCI/MSI attributesThomas Gleixner
The unnamed struct sucks and is in the way of further cleanups. Stick the PCI related MSI data into a real data structure and cleanup all users. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211206210224.374863119@linutronix.de
2021-12-09PCI/MSI: Remove msi_desc_to_pci_sysdata()Thomas Gleixner
Last user is gone long ago. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20211206210224.210768199@linutronix.de
2021-12-09PCI/MSI: Make pci_msi_domain_write_msg() staticThomas Gleixner
There is no point to have this function public as it is set by the PCI core anyway when a PCI/MSI irqdomain is created. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI Link: https://lore.kernel.org/r/20211206210224.157070464@linutronix.de
2021-12-09genirq/msi: Fixup includesThomas Gleixner
Remove the kobject.h include from msi.h as it's not required and add a sysfs.h include to the core code instead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20211206210224.103502021@linutronix.de
2021-12-09genirq/msi: Remove unused domain callbacksThomas Gleixner
No users and there is no need to grow them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20211126223824.322987915@linutronix.de Link: https://lore.kernel.org/r/20211206210224.041777889@linutronix.de
2021-12-09genirq/msi: Guard sysfs codeThomas Gleixner
No point in building unused code when CONFIG_SYSFS=n. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20211206210223.985907940@linutronix.de
2021-12-09Merge tag 'drm-misc-next-2021-11-29' of ↵Daniel Vetter
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.17: UAPI Changes: Cross-subsystem Changes: * Move 'nomodeset' kernel boot option into DRM subsystem Core Changes: * Replace several DRM_*() logging macros with drm_*() equivalents * panel: Add quirk for Lenovo Yoga Book X91F/L * ttm: Documentation fixes Driver Changes: * Cleanup nomodeset handling in drivers * Fixes * bridge/anx7625: Fix reading EDID; Fix error code * bridge/megachips: Probe both bridges before registering * vboxvideo: Fix ERR_PTR usage Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YaSVz15Q7dAlEevU@linux-uq9g.fritz.box
2021-12-08net: wwan: make debugfs optionalSergey Ryazanov
Debugfs interface is optional for the regular modem use. Some distros and users will want to disable this feature for security or kernel size reasons. So add a configuration option that allows to completely disable the debugfs interface of the WWAN devices. A primary considered use case for this option was embedded firmwares. For example, in OpenWrt, you can not completely disable debugfs, as a lot of wireless stuff can only be configured and monitored with the debugfs knobs. At the same time, reducing the size of a kernel and modules is an essential task in the world of embedded software. Disabling the WWAN and IOSM debugfs interfaces allows us to save 50K (x86-64 build) of space for module storage. Not much, but already considerable when you only have 16MB of storage. So it is hard to just disable whole debugfs. Users need some fine grained set of options to control which debugfs interface is important and should be available and which is not. The new configuration symbol is enabled by default and is hidden under the EXPERT option. So a regular user would not be bothered by another one configuration question. While an embedded distro maintainer will be able to a little more reduce the final image size. Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge tag 'linux-can-next-for-5.17-20211208' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== can-next 2021-12-08 The first patch is by Vincent Mailhol and replaces the custom CAN units with generic one form linux/units.h. The next 3 patches are by Evgeny Boger and add Allwinner R40 support to the sun4i CAN driver. Andy Shevchenko contributes 4 patches to the hi311x CAN driver, consisting of cleanups and converting the driver to the device property API. * tag 'linux-can-next-for-5.17-20211208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: can: hi311x: hi3110_can_probe(): convert to use dev_err_probe() can: hi311x: hi3110_can_probe(): make use of device property API can: hi311x: hi3110_can_probe(): try to get crystal clock rate from property can: hi311x: hi3110_can_probe(): use devm_clk_get_optional() to get the input clock ARM: dts: sun8i: r40: add node for CAN controller can: sun4i_can: add support for R40 CAN controller dt-bindings: net: can: add support for Allwinner R40 CAN controller can: bittiming: replace CAN units with the generic ones from linux/units.h ==================== Link: https://lore.kernel.org/r/20211208125055.223141-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Daniel Borkmann says: ==================== bpf 2021-12-08 We've added 12 non-merge commits during the last 22 day(s) which contain a total of 29 files changed, 659 insertions(+), 80 deletions(-). The main changes are: 1) Fix an off-by-two error in packet range markings and also add a batch of new tests for coverage of these corner cases, from Maxim Mikityanskiy. 2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh. 3) Fix two functional regressions and a build warning related to BTF kfunc for modules, from Kumar Kartikeya Dwivedi. 4) Fix outdated code and docs regarding BPF's migrate_disable() use on non- PREEMPT_RT kernels, from Sebastian Andrzej Siewior. 5) Add missing includes in order to be able to detangle cgroup vs bpf header dependencies, from Jakub Kicinski. 6) Fix regression in BPF sockmap tests caused by missing detachment of progs from sockets when they are removed from the map, from John Fastabend. 7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF dispatcher, from Björn Töpel. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add selftests to cover packet access corner cases bpf: Fix the off-by-two error in range markings treewide: Add missing includes masked by cgroup -> bpf dependency tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets bpf: Fix bpf_check_mod_kfunc_call for built-in modules bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL mips, bpf: Fix reference to non-existing Kconfig symbol bpf: Make sure bpf_disable_instrumentation() is safe vs preemption. Documentation/locking/locktypes: Update migrate_disable() bits. bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap bpf, sockmap: Attach map progs to psock early for feature probes bpf, x86: Fix "no previous prototype" warning ==================== Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: keep the bridge_dev and bridge_num as part of the same structureVladimir Oltean
The main desire behind this is to provide coherent bridge information to the fast path without locking. For example, right now we set dp->bridge_dev and dp->bridge_num from separate code paths, it is theoretically possible for a packet transmission to read these two port properties consecutively and find a bridge number which does not correspond with the bridge device. Another desire is to start passing more complex bridge information to dsa_switch_ops functions. For example, with FDB isolation, it is expected that drivers will need to be passed the bridge which requested an FDB/MDB entry to be offloaded, and along with that bridge_dev, the associated bridge_num should be passed too, in case the driver might want to implement an isolation scheme based on that number. We already pass the {bridge_dev, bridge_num} pair to the TX forwarding offload switch API, however we'd like to remove that and squash it into the basic bridge join/leave API. So that means we need to pass this pair to the bridge join/leave API. During dsa_port_bridge_leave, first we unset dp->bridge_dev, then we call the driver's .port_bridge_leave with what used to be our dp->bridge_dev, but provided as an argument. When bridge_dev and bridge_num get folded into a single structure, we need to preserve this behavior in dsa_port_bridge_leave: we need a copy of what used to be in dp->bridge. Switch drivers check bridge membership by comparing dp->bridge_dev with the provided bridge_dev, but now, if we provide the struct dsa_bridge as a pointer, they cannot keep comparing dp->bridge to the provided pointer, since this only points to an on-stack copy. To make this obvious and prevent driver writers from forgetting and doing stupid things, in this new API, the struct dsa_bridge is provided as a full structure (not very large, contains an int and a pointer) instead of a pointer. An explicit comparison function needs to be used to determine bridge membership: dsa_port_offloads_bridge(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: make dp->bridge_num one-basedVladimir Oltean
I have seen too many bugs already due to the fact that we must encode an invalid dp->bridge_num as a negative value, because the natural tendency is to check that invalid value using (!dp->bridge_num). Latest example can be seen in commit 1bec0f05062c ("net: dsa: fix bridge_num not getting cleared after ports leaving the bridge"). Convert the existing users to assume that dp->bridge_num == 0 is the encoding for invalid, and valid bridge numbers start from 1. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08x86/sev: Use CC_ATTR attribute to generalize string I/O unrollKuppuswamy Sathyanarayanan
INS/OUTS are not supported in TDX guests and cause #UD. Kernel has to avoid them when running in TDX guest. To support existing usage, string I/O operations are unrolled using IN/OUT instructions. AMD SEV platform implements this support by adding unroll logic in ins#bwl()/outs#bwl() macros with SEV-specific checks. Since TDX VM guests will also need similar support, use CC_ATTR_GUEST_UNROLL_STRING_IO and generic cc_platform_has() API to implement it. String I/O helpers were the last users of sev_key_active() interface and sev_enable_key static key. Remove them. [ bp: Move comment too and do not delete it. ] Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lkml.kernel.org/r/20211206135505.75045-2-kirill.shutemov@linux.intel.com
2021-12-08PM: hibernate: Allow ACPI hardware signature to be honouredDavid Woodhouse
Theoretically, when the hardware signature in FACS changes, the OS is supposed to gracefully decline to attempt to resume from S4: "If the signature has changed, OSPM will not restore the system context and can boot from scratch" In practice, Windows doesn't do this and many laptop vendors do allow the signature to change especially when docking/undocking, so it would be a bad idea to simply comply with the specification by default in the general case. However, there are use cases where we do want the compliant behaviour and we know it's safe. Specifically, when resuming virtual machines where we know the hypervisor has changed sufficiently that resume will fail. We really want to be able to *tell* the guest kernel not to try, so it boots cleanly and doesn't just crash. This patch provides a way to opt in to the spec-compliant behaviour on the command line. A follow-up patch may do this automatically for certain "known good" machines based on a DMI match, or perhaps just for all hypervisor guests since there's no good reason a hypervisor would change the hardware_signature that it exposes to guests *unless* it wants them to obey the ACPI specification. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-08PM: runtime: Fix pm_runtime_active() kerneldoc commentRafael J. Wysocki
The kerneldoc comment of pm_runtime_active() does not reflect the behavior of the function, so update it accordingly. Fixes: 403d2d116ec0 ("PM: runtime: Add kerneldoc comments to multiple helpers") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-12-08Merge branch 'kvm-on-hv-msrbm-fix' into HEADPaolo Bonzini
Merge bugfix for enlightened MSR Bitmap, before adding support to KVM for exposing the feature to nested guests.
2021-12-08clk: gate: Add devm_clk_hw_register_gate()Horatiu Vultur
Add devm_clk_hw_register_gate() - devres-managed version of clk_hw_register_gate() Suggested-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20211103085102.1656081-2-horatiu.vultur@microchip.com
2021-12-08KVM: Add helpers to wake/query blocking vCPUSean Christopherson
Add helpers to wake and query a blocking vCPU. In addition to providing nice names, the helpers reduce the probability of KVM neglecting to use kvm_arch_vcpu_get_wait(). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-20-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08KVM: stats: Add stat to detect if vcpu is currently blockingJing Zhang
Add a "blocking" stat that userspace can use to detect the case where a vCPU is not being run because of an vCPU/guest action, e.g. HLT or WFS on x86, WFI on arm64, etc... Current guest/host/halt stats don't show this well, e.g. if a guest halts for a long period of time then the vCPU could could appear pathologically blocked due to a host condition, when in reality the vCPU has been put into a not-runnable state by the guest. Originally-by: Cannon Matthews <cannonmatthews@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> [sean: renamed stat to "blocking", massaged changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt()Sean Christopherson
Factor out the "block" part of kvm_vcpu_halt() so that x86 can emulate non-halt wait/sleep/block conditions that should not be subjected to halt-polling. No functional change intended. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-15-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>