summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-03-17usb-storage: Add quirk to defeat Kindle's automatic unloadAlan Stern
Matthias reports that the Amazon Kindle automatically removes its emulated media if it doesn't receive another SCSI command within about one second after a SYNCHRONIZE CACHE. It does so even when the host has sent a PREVENT MEDIUM REMOVAL command. The reason for this behavior isn't clear, although it's not hard to make some guesses. At any rate, the results can be unexpected for anyone who tries to access the Kindle in an unusual fashion, and in theory they can lead to data loss (for example, if one file is closed and synchronized while other files are still in the middle of being written). To avoid such problems, this patch creates a new usb-storage quirks flag telling the driver always to issue a REQUEST SENSE following a SYNCHRONIZE CACHE command, and adds an unusual_devs entry for the Kindle with the flag set. This is sufficient to prevent the Kindle from doing its automatic unload, without interfering with proper operation. Another possible way to deal with this would be to increase the frequency of TEST UNIT READY polling that the kernel normally carries out for removable-media storage devices. However that would increase the overall load on the system and it is not as reliable, because the user can override the polling interval. Changing the driver's behavior is safer and has minimal overhead. CC: <stable@vger.kernel.org> Reported-and-tested-by: Matthias Schwarzott <zzam@gentoo.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20210317190654.GA497856@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-17module: remove never implemented MODULE_SUPPORTED_DEVICELeon Romanovsky
MODULE_SUPPORTED_DEVICE was added in pre-git era and never was implemented. We can safely remove it, because the kernel has grown to have many more reliable mechanisms to determine if device is supported or not. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-17platform/surface: Add DTX driverMaximilian Luz
The Microsoft Surface Book series devices consist of a so-called clipboard part (containing the CPU, touchscreen, and primary battery) and a base part (containing keyboard, secondary battery, and optional discrete GPU). These parts can be separated, i.e. the clipboard can be detached and used as tablet. This detachment process is initiated by pressing a button. On the Surface Book 2 and 3 (targeted with this commit), the Surface Aggregator Module (i.e. the embedded controller on those devices) attempts to send a notification to any listening client driver and waits for further instructions (i.e. whether the detachment process should continue or be aborted). If it does not receive a response in a certain time-frame, the detachment process (by default) continues and the clipboard can be physically separated. In other words, (by default and) without a driver, the detachment process takes about 10 seconds to complete. This commit introduces a driver for this detachment system (called DTX). This driver allows a user-space daemon to control and influence the detachment behavior. Specifically, it forwards any detachment requests to user-space, allows user-space to make such requests itself, and allows handling of those requests. Requests can be handled by either aborting, continuing/allowing, or delaying (i.e. resetting the timeout via a heartbeat commend). The user-space API is implemented via the /dev/surface/dtx miscdevice. In addition, user-space can change the default behavior on timeout from allowing detachment to disallowing it, which is useful if the (optional) discrete GPU is in use. Furthermore, this driver allows user-space to receive notifications about the state of the base, specifically when it is physically removed (as opposed to detachment requested), in what manner it is connected (i.e. in reverse-/tent-/studio- or laptop-mode), and what type of base is connected. Based on this information, the driver also provides a simple tablet-mode switch (aliasing all modes without keyboard access, i.e. tablet-mode and studio-mode to its reported tablet-mode). An implementation of such a user-space daemon, allowing configuration of detachment behavior via scripts (e.g. safely unmounting USB devices connected to the base before continuing) can be found at [1]. [1]: https://github.com/linux-surface/surface-dtx-daemon Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20210308184819.437438-2-luzmaximilian@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-03-17rcu: Prevent false positive softirq warning on RTThomas Gleixner
Soft interrupt disabled sections can legitimately be preempted or schedule out when blocking on a lock on RT enabled kernels so the RCU preempt check warning has to be disabled for RT kernels. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309085727.626304079@linutronix.de
2021-03-17tick/sched: Prevent false positive softirq pending warnings on RTThomas Gleixner
On RT a task which has soft interrupts disabled can block on a lock and schedule out to idle while soft interrupts are pending. This triggers the warning in the NOHZ idle code which complains about going idle with pending soft interrupts. But as the task is blocked soft interrupt processing is temporarily blocked as well which means that such a warning is a false positive. To prevent that check the per CPU state which indicates that a scheduled out task has soft interrupts disabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309085727.527563866@linutronix.de
2021-03-17softirq: Make softirq control and processing RT awareThomas Gleixner
Provide a local lock based serialization for soft interrupts on RT which allows the local_bh_disabled() sections and servicing soft interrupts to be preemptible. Provide the necessary inline helpers which allow to reuse the bulk of the softirq processing code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309085727.426370483@linutronix.de
2021-03-17softirq: Add RT specific softirq accountingThomas Gleixner
RT requires the softirq processing and local bottomhalf disabled regions to be preemptible. Using the normal preempt count based serialization is therefore not possible because this implicitely disables preemption. RT kernels use a per CPU local lock to serialize bottomhalfs. As local_bh_disable() can nest the lock can only be acquired on the outermost invocation of local_bh_disable() and released when the nest count becomes zero. Tasks which hold the local lock can be preempted so its required to keep track of the nest count per task. Add a RT only counter to task struct and adjust the relevant macros in preempt.h. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309085726.983627589@linutronix.de
2021-03-17tasklets: Switch tasklet_disable() to the sleep wait variantThomas Gleixner
-- NOT FOR IMMEDIATE MERGING -- Now that all users of tasklet_disable() are invoked from sleepable context, convert it to use tasklet_unlock_wait() which might sleep. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309084242.726452321@linutronix.de
2021-03-17tasklets: Prevent tasklet_unlock_spin_wait() deadlock on RTThomas Gleixner
tasklet_unlock_spin_wait() spin waits for the TASKLET_STATE_SCHED bit in the tasklet state to be cleared. This works on !RT nicely because the corresponding execution can only happen on a different CPU. On RT softirq processing is preemptible, therefore a task preempting the softirq processing thread can spin forever. Prevent this by invoking local_bh_disable()/enable() inside the loop. In case that the softirq processing thread was preempted by the current task, current will block on the local lock which yields the CPU to the preempted softirq processing thread. If the tasklet is processed on a different CPU then the local_bh_disable()/enable() pair is just a waste of processor cycles. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309084241.988908275@linutronix.de
2021-03-17tasklets: Replace spin wait in tasklet_unlock_wait()Peter Zijlstra
tasklet_unlock_wait() spin waits for TASKLET_STATE_RUN to be cleared. This is wasting CPU cycles in a tight loop which is especially painful in a guest when the CPU running the tasklet is scheduled out. tasklet_unlock_wait() is invoked from tasklet_kill() which is used in teardown paths and not performance critical at all. Replace the spin wait with wait_var_event(). There are no users of tasklet_unlock_wait() which are invoked from atomic contexts. The usage in tasklet_disable() has been replaced temporarily with the spin waiting variant until the atomic users are fixed up and will be converted to the sleep wait variant later. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309084241.783936921@linutronix.de
2021-03-17tasklets: Use spin wait in tasklet_disable() temporarilyThomas Gleixner
To ease the transition use spin waiting in tasklet_disable() until all usage sites from atomic context have been cleaned up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309084241.685352806@linutronix.de
2021-03-17tasklets: Provide tasklet_disable_in_atomic()Thomas Gleixner
Replacing the spin wait loops in tasklet_unlock_wait() with wait_var_event() is not possible as a handful of tasklet_disable() invocations are happening in atomic context. All other invocations are in teardown paths which can sleep. Provide tasklet_disable_in_atomic() and tasklet_unlock_spin_wait() to convert the few atomic use cases over, which allows to change tasklet_disable() and tasklet_unlock_wait() in a later step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309084241.563164193@linutronix.de
2021-03-17tasklets: Use static inlines for stub implementationsThomas Gleixner
Inlines exist for a reason. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309084241.407702697@linutronix.de
2021-03-17tasklets: Replace barrier() with cpu_relax() in tasklet_unlock_wait()Thomas Gleixner
A barrier() in a tight loop which waits for something to happen on a remote CPU is a pointless exercise. Replace it with cpu_relax() which allows HT siblings to make progress. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210309084241.249343366@linutronix.de
2021-03-17rseq, ptrace: Add PTRACE_GET_RSEQ_CONFIGURATION requestPiotr Figiel
For userspace checkpoint and restore (C/R) a way of getting process state containing RSEQ configuration is needed. There are two ways this information is going to be used: - to re-enable RSEQ for threads which had it enabled before C/R - to detect if a thread was in a critical section during C/R Since C/R preserves TLS memory and addresses RSEQ ABI will be restored using the address registered before C/R. Detection whether the thread is in a critical section during C/R is needed to enforce behavior of RSEQ abort during C/R. Attaching with ptrace() before registers are dumped itself doesn't cause RSEQ abort. Restoring the instruction pointer within the critical section is problematic because rseq_cs may get cleared before the control is passed to the migrated application code leading to RSEQ invariants not being preserved. C/R code will use RSEQ ABI address to find the abort handler to which the instruction pointer needs to be set. To achieve above goals expose the RSEQ ABI address and the signature value with the new ptrace request PTRACE_GET_RSEQ_CONFIGURATION. This new ptrace request can also be used by debuggers so they are aware of stops within restartable sequences in progress. Signed-off-by: Piotr Figiel <figiel@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michal Miroslaw <emmir@google.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20210226135156.1081606-1-figiel@google.com
2021-03-17quota: wire up quotactl_pathSascha Hauer
Wire up the quotactl_path syscall added in the previous patch. Link: https://lore.kernel.org/r/20210304123541.30749-3-s.hauer@pengutronix.de Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2021-03-17locking/ww_mutex: Fix acquire/release imbalance in ↵Waiman Long
ww_acquire_init()/ww_acquire_fini() In ww_acquire_init(), mutex_acquire() is gated by CONFIG_DEBUG_LOCK_ALLOC. The dep_map in the ww_acquire_ctx structure is also gated by the same config. However mutex_release() in ww_acquire_fini() is gated by CONFIG_DEBUG_MUTEXES. It is possible to set CONFIG_DEBUG_MUTEXES without setting CONFIG_DEBUG_LOCK_ALLOC though it is an unlikely configuration. That may cause a compilation error as dep_map isn't defined in this case. Fix this potential problem by enclosing mutex_release() inside CONFIG_DEBUG_LOCK_ALLOC. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210316153119.13802-3-longman@redhat.com
2021-03-17phy: Add media type and speed serdes configuration interfacesSteen Hegelund
Provide new phy configuration interfaces for media type and speed that allows e.g. PHYs used for ethernet to be configured with this information. Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-By: Kishon Vijay Abraham I <kishon@ti.com> Link: https://lore.kernel.org/r/20210218161451.3489955-3-steen.hegelund@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-03-17dt-bindings: ti-serdes-mux: Add defines for AM64 SoCKishon Vijay Abraham I
AM64 has a single lane SERDES which can be configured to be used with either PCIe or USB. Define the possilbe values for the SERDES function in AM64 SoC here. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Acked-by: Peter Rosin <peda@axentia.se> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210310112745.3445-4-kishon@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-03-17dt-bindings: phy: cadence-torrent: Add binding for refclk driverKishon Vijay Abraham I
Add binding for refclk driver used to route the refclk out of torrent SERDES. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210310112745.3445-3-kishon@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-03-17dt-bindings: phy: ti,phy-j721e-wiz: Add bindings for AM64 SERDES WrapperKishon Vijay Abraham I
Add bindings for AM64 SERDES Wrapper. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210310112745.3445-2-kishon@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-03-17scsi: storvsc: Enable scatterlist entry lengths > 4KbytesMichael Kelley
storvsc currently sets .dma_boundary to limit scatterlist entries to 4 Kbytes, which is less efficient with huge pages that offer large chunks of contiguous physical memory. Improve the algorithm for creating the Hyper-V guest physical address PFN array so that scatterlist entries with lengths > 4Kbytes are handled. As a result, remove the .dma_boundary setting. The improved algorithm also adds support for scatterlist entries with offsets >= 4Kbytes, which is supported by many other SCSI low-level drivers. And it retains support for architectures where possibly PAGE_SIZE != HV_HYP_PAGE_SIZE (such as ARM64). Link: https://lore.kernel.org/r/1614120294-1930-1-git-send-email-mikelley@microsoft.com Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-16net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ctwenxu
When openvswitch conntrack offload with act_ct action. The first rule do conntrack in the act_ct in tc subsystem. And miss the next rule in the tc and fallback to the ovs datapath but miss set post_ct flag which will lead the ct_state_key with -trk flag. Fixes: 7baf2429a1a9 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-16x86: Introduce restart_block->arch_data to remove TS_COMPAT_RESTARTOleg Nesterov
Save the current_thread_info()->status of X86 in the new restart_block->arch_data field so TS_COMPAT_RESTART can be removed again. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210201174716.GA17898@redhat.com
2021-03-16kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()Oleg Nesterov
Preparation for fixing get_nr_restart_syscall() on X86 for COMPAT. Add a new helper which sets restart_block->fn and calls a dummy arch_set_restart_data() helper. Fixes: 609c19a385c8 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210201174641.GA17871@redhat.com
2021-03-16Merge series "spi: Adding support for software nodes" from Heikki Krogerus ↵Mark Brown
<heikki.krogerus@linux.intel.com>: Hi, The older API used to supply additional device properties for the devices - so mainly the function device_add_properties() - is going to be removed. The reason why the API will be removed is because it gives false impression that the properties are assigned directly to the devices, which has actually never been the case - the properties have always been assigned to a software fwnode which was then just directly linked with the device when the old API was used. By only accepting device properties instead of complete software nodes, the subsystems remove any change of taking advantage of the other features the software nodes have. The change that is required from the spi subsystem and the drivers is trivial. Basically only the "properties" member in struct spi_board_info, which was a pointer to struct property_entry, is replaced with a pointer to a complete software node. thanks, Heikki Krogerus (4): spi: Add support for software nodes ARM: pxa: icontrol: Constify the software node ARM: pxa: zeus: Constify the software node spi: Remove support for dangling device properties arch/arm/mach-pxa/icontrol.c | 12 ++++++++---- arch/arm/mach-pxa/zeus.c | 6 +++++- drivers/spi/spi.c | 21 ++++++--------------- include/linux/spi/spi.h | 7 +++---- 4 files changed, 22 insertions(+), 24 deletions(-) -- 2.30.1 base-commit: a38fd8748464831584a19438cbb3082b5a2dab15
2021-03-16Merge tag 'fuse-fixes-5.12-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "Fix a deadlock and a couple of other bugs" * tag 'fuse-fixes-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: 32-bit user space ioctl compat for fuse device virtiofs: Fail dax mount if device does not support it fuse: fix live lock in fuse_iget()
2021-03-16Merge tag 'nfsd-5.12-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Miscellaneous NFSD fixes for v5.12-rc" * tag 'nfsd-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: svcrdma: Revert "svcrdma: Reduce Receive doorbell rate" NFSD: fix error handling in NFSv4.0 callbacks NFSD: fix dest to src mount in inter-server COPY Revert "nfsd4: a client's own opens needn't prevent delegations" Revert "nfsd4: remove check_conflicting_opens warning" rpc: fix NULL dereference on kmalloc failure sunrpc: fix refcount leak for rpc auth modules NFSD: Repair misuse of sv_lock in 5.10.16-rt30. nfsd: don't abort copies early fs: nfsd: fix kconfig dependency warning for NFSD_V4 svcrdma: disable timeouts on rdma backchannel nfsd: Don't keep looking up unhashed files in the nfsd file cache
2021-03-16ARM: amba: Allow some ARM_AMBA users to compile with COMPILE_TESTJason Gunthorpe
CONFIG_VFIO_AMBA has a light use of AMBA, adding some inline fallbacks when AMBA is disabled will allow it to be compiled under COMPILE_TEST and make VFIO easier to maintain. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <3-v1-df057e0f92c3+91-vfio_arm_compile_test_jgg@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-03-16fanotify: support limited functionality for unprivileged usersAmir Goldstein
Add limited support for unprivileged fanotify groups. An unprivileged users is not allowed to get an open file descriptor in the event nor the process pid of another process. An unprivileged user cannot request permission events, cannot set mount/filesystem marks and cannot request unlimited queue/marks. This enables the limited functionality similar to inotify when watching a set of files and directories for OPEN/ACCESS/MODIFY/CLOSE events, without requiring SYS_CAP_ADMIN privileges. The FAN_REPORT_DFID_NAME init flag, provide a method for an unprivileged listener watching a set of directories (with FAN_EVENT_ON_CHILD) to monitor all changes inside those directories. This typically requires that the listener keeps a map of watched directory fid to dirfd (O_PATH), where fid is obtained with name_to_handle_at() before starting to watch for changes. When getting an event, the reported fid of the parent should be resolved to dirfd and fstatsat(2) with dirfd and name should be used to query the state of the filesystem entry. Link: https://lore.kernel.org/r/20210304112921.3996419-3-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-03-16fanotify: configurable limits via sysfsAmir Goldstein
fanotify has some hardcoded limits. The only APIs to escape those limits are FAN_UNLIMITED_QUEUE and FAN_UNLIMITED_MARKS. Allow finer grained tuning of the system limits via sysfs tunables under /proc/sys/fs/fanotify, similar to tunables under /proc/sys/fs/inotify, with some minor differences. - max_queued_events - global system tunable for group queue size limit. Like the inotify tunable with the same name, it defaults to 16384 and applies on initialization of a new group. - max_user_marks - user ns tunable for marks limit per user. Like the inotify tunable named max_user_watches, on a machine with sufficient RAM and it defaults to 1048576 in init userns and can be further limited per containing user ns. - max_user_groups - user ns tunable for number of groups per user. Like the inotify tunable named max_user_instances, it defaults to 128 in init userns and can be further limited per containing user ns. The slightly different tunable names used for fanotify are derived from the "group" and "mark" terminology used in the fanotify man pages and throughout the code. Considering the fact that the default value for max_user_instances was increased in kernel v5.10 from 8192 to 1048576, leaving the legacy fanotify limit of 8192 marks per group in addition to the max_user_marks limit makes little sense, so the per group marks limit has been removed. Note that when a group is initialized with FAN_UNLIMITED_MARKS, its own marks are not accounted in the per user marks account, so in effect the limit of max_user_marks is only for the collection of groups that are not initialized with FAN_UNLIMITED_MARKS. Link: https://lore.kernel.org/r/20210304112921.3996419-2-amir73il@gmail.com Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-03-16fsnotify: use hash table for faster events mergeAmir Goldstein
In order to improve event merge performance, hash events in a 128 size hash table by the event merge key. The fanotify_event size grows by two pointers, but we just reduced its size by removing the objectid member, so overall its size is increased by one pointer. Permission events and overflow event are not merged so they are also not hashed. Link: https://lore.kernel.org/r/20210304104826.3993892-5-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-03-16fanotify: reduce event objectid to 29-bit hashAmir Goldstein
objectid is only used by fanotify backend and it is just an optimization for event merge before comparing all fields in event. Move the objectid member from common struct fsnotify_event into struct fanotify_event and reduce it to 29-bit hash to cram it together with the 3-bit event type. Events of different types are never merged, so the combination of event type and hash form a 32-bit key for fast compare of events. This reduces the size of events by one pointer and paves the way for adding hashed queue support for fanotify. Link: https://lore.kernel.org/r/20210304104826.3993892-3-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-03-16fsnotify: allow fsnotify_{peek,remove}_first_event with empty queueAmir Goldstein
Current code has an assumtion that fsnotify_notify_queue_is_empty() is called to verify that queue is not empty before trying to peek or remove an event from queue. Remove this assumption by moving the fsnotify_notify_queue_is_empty() into the functions, allow them to return NULL value and check return value by all callers. This is a prep patch for multi event queues. Link: https://lore.kernel.org/r/20210304104826.3993892-2-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-03-16fuse: 32-bit user space ioctl compat for fuse deviceAlessio Balsini
With a 64-bit kernel build the FUSE device cannot handle ioctl requests coming from 32-bit user space. This is due to the ioctl command translation that generates different command identifiers that thus cannot be used for direct comparisons without proper manipulation. Explicitly extract type and number from the ioctl command to enable 32-bit user space compatibility on 64-bit kernel builds. Signed-off-by: Alessio Balsini <balsini@android.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-03-16tasklet: Remove tasklet_kill_immediateDavidlohr Bueso
Ever since RCU was converted to softirq, it has no users. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20210306213658.12862-1-dave@stgolabs.net
2021-03-16spi: Remove support for dangling device propertiesHeikki Krogerus
>From now on only accepting complete software nodes. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20210303152814.35070-5-heikki.krogerus@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-03-16spi: Add support for software nodesHeikki Krogerus
Making it possible for the drivers to assign complete software fwnodes to the devices instead of only the device properties in those nodes. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20210303152814.35070-2-heikki.krogerus@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-03-16drm: Add GUD USB Display driverNoralf Trønnes
This adds a USB display driver with the intention that it can be used with future USB interfaced low end displays/adapters. The Linux gadget device driver will serve as the canonical device implementation. The following DRM properties are supported: - Plane rotation - Connector TV properties There is also support for backlight brightness exposed as a backlight device. Display modes can be made available to the host driver either as DRM display modes or through EDID. If both are present, EDID is just passed on to userspace. Performance is preferred over color depth, so if the device supports RGB565, DRM_CAP_DUMB_PREFERRED_DEPTH will return 16. If the device transfer buffer can't fit an uncompressed framebuffer update, the update is split up into parts that do fit. Optimal user experience is achieved by providing damage reports either by setting FB_DAMAGE_CLIPS on pageflips or calling DRM_IOCTL_MODE_DIRTYFB. LZ4 compression is used if the device supports it. The driver supports a one bit monochrome transfer format: R1. This is not implemented in the gadget driver. It is added in preparation for future monochrome e-ink displays. The driver is MIT licensed to smooth the path for any BSD port of the driver. v2: - Use devm_drm_dev_alloc() and drmm_mode_config_init() - drm_fbdev_generic_setup: Use preferred_bpp=0, 16 was a copy paste error - The drm_backlight_helper is dropped, copy in the code - Support protocol version backwards compatibility for device v3: - Use donated Openmoko USB pid - Use direct compression from framebuffer when pitch matches, not only on full frames, so split updates can benefit - Use __le16 in struct gud_drm_req_get_connector_status - Set edid property when the device only provides edid - Clear compression fields in struct gud_drm_req_set_buffer - Fix protocol version negotiation - Remove mode->vrefresh, it's calculated v4: - Drop the status req polling which was a workaround for something that turned out to be a dwc2 udc driver problem - Add a flag for the Linux gadget to require a status request on SET operations. Other devices will only get status req on STALL errors - Use protocol specific error codes (Peter) - Add a flag for devices that want to receive the entire framebuffer on each flush (Lubomir) - Retry a failed framebuffer flush - If mode has changed wait for worker and clear pending damage before queuing up new damage, fb width/height might have changed - Increase error counter on bulk transfer failures - Use DRM_MODE_CONNECTOR_USB - Handle R1 kmalloc error (Peter) - Don't try and replicate the USB get descriptor request standard for the display descriptor (Peter) - Make max_buffer_size optional (Peter), drop the pow2 requirement since it's not necessary anymore. - Don't pre-alloc a control request buffer, it was only 4k - Let gud.h describe the whole protocol explicitly and don't let DRM leak into it (Peter) - Drop display mode .hskew and .vscan from the protocol - Shorten names: s/GUD_DRM_/GUD_/ s/gud_drm_/gud_/ (Peter) - Fix gud_pipe_check() connector picking when switching connector - Drop gud_drm_driver_gem_create_object() cached is default now - Retrieve USB device from struct drm_device.dev instead of keeping a pointer - Honour fb->offsets[0] - Fix mode fetching when connector status is forced - Check EDID length reported by the device - Use drm_do_get_edid() so userspace can overrride EDID - Set epoch counter to signal connector status change - gud_drm_driver can be const now v5: - GUD_DRM_FORMAT_R1: Use non-human ascii values (Daniel) - Change name to: GUD USB Display (Thomas, Simon) - Change one __u32 -> __le32 in protocol header - Always log fb flush errors, unless the previous one failed - Run backlight update in a worker to avoid upsetting lockdep (Daniel) - Drop backlight_ops.get_brightness, there's no readback from the device so it doesn't really add anything. - Set dma mask, needed by dma-buf importers v6: - Use obj-y in Makefile (Peter) - Fix missing le32_to_cpu() when using GUD_DISPLAY_MAGIC (Peter) - Set initial brightness on backlight device v7: - LZ4_compress_default() can return zero, check for that - Fix memory leak in gud_pipe_check() error path (Peter) - Improve debug and error messages (Peter) - Don't pass length in protocol structs (Peter) - Pass USB interface to gud_usb_control_msg() et al. (Peter) - Improve gud_connector_fill_properties() (Peter) - Add GUD_PIXEL_FORMAT_RGB111 (Peter) - Remove GUD_REQ_SET_VERSION (Peter) - Fix DRM_IOCTL_MODE_OBJ_SETPROPERTY and the rotation property - Fix dma-buf import (Thomas) v8: - Forgot to filter RGB111 from reaching userspace - Handle a device that only returns unknown device properties (Peter) - s/GUD_PIXEL_FORMAT_RGB111/GUD_PIXEL_FORMAT_XRGB1111/ (Peter) - Fix R1 and XRGB1111 format conversion - Add FIXME about Big Endian being broken (Peter, Ilia) Cc: Lubomir Rintel <lkundrak@v3.sk> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Peter Stuge <peter@stuge.se> Tested-by: Peter Stuge <peter@stuge.se> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210313112545.37527-4-noralf@tronnes.org
2021-03-16drm/uapi: Add USB connector typeNoralf Trønnes
Add a connector type for USB connected display panels. Some examples of what current userspace will name the connector: - Weston: "UNNAMED-%d" - Mutter: "Unknown20-%d" - X: "Unknown20-%d" v2: - Update drm_connector_enum_list - Add examples to commit message Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210313112545.37527-2-noralf@tronnes.org
2021-03-16Merge drm/drm-next into drm-misc-nextMaxime Ripard
Noralf needs some patches in 5.12-rc3, and we've been delaying the 5.12 merge due to the swap issue so it looks like a good time. Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-03-16can: dev: Move device back to init netns on owning netns deleteMartin Willi
When a non-initial netns is destroyed, the usual policy is to delete all virtual network interfaces contained, but move physical interfaces back to the initial netns. This keeps the physical interface visible on the system. CAN devices are somewhat special, as they define rtnl_link_ops even if they are physical devices. If a CAN interface is moved into a non-initial netns, destroying that netns lets the interface vanish instead of moving it back to the initial netns. default_device_exit() skips CAN interfaces due to having rtnl_link_ops set. Reproducer: ip netns add foo ip link set can0 netns foo ip netns delete foo WARNING: CPU: 1 PID: 84 at net/core/dev.c:11030 ops_exit_list+0x38/0x60 CPU: 1 PID: 84 Comm: kworker/u4:2 Not tainted 5.10.19 #1 Workqueue: netns cleanup_net [<c010e700>] (unwind_backtrace) from [<c010a1d8>] (show_stack+0x10/0x14) [<c010a1d8>] (show_stack) from [<c086dc10>] (dump_stack+0x94/0xa8) [<c086dc10>] (dump_stack) from [<c086b938>] (__warn+0xb8/0x114) [<c086b938>] (__warn) from [<c086ba10>] (warn_slowpath_fmt+0x7c/0xac) [<c086ba10>] (warn_slowpath_fmt) from [<c0629f20>] (ops_exit_list+0x38/0x60) [<c0629f20>] (ops_exit_list) from [<c062a5c4>] (cleanup_net+0x230/0x380) [<c062a5c4>] (cleanup_net) from [<c0142c20>] (process_one_work+0x1d8/0x438) [<c0142c20>] (process_one_work) from [<c0142ee4>] (worker_thread+0x64/0x5a8) [<c0142ee4>] (worker_thread) from [<c0148a98>] (kthread+0x148/0x14c) [<c0148a98>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c) To properly restore physical CAN devices to the initial netns on owning netns exit, introduce a flag on rtnl_link_ops that can be set by drivers. For CAN devices setting this flag, default_device_exit() considers them non-virtual, applying the usual namespace move. The issue was introduced in the commit mentioned below, as at that time CAN devices did not have a dellink() operation. Fixes: e008b5fc8dc7 ("net: Simplfy default_device_exit and improve batching.") Link: https://lore.kernel.org/r/20210302122423.872326-1-martin@strongswan.org Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-03-16Merge tag 'drm-misc-next-2021-03-03' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.13: UAPI Changes: Cross-subsystem Changes: Core Changes: - %p4cc printk format modifier - atomic: introduce drm_crtc_commit_wait, rework atomic plane state helpers to take the drm_commit_state structure - dma-buf: heaps rework to return a struct dma_buf - simple-kms: Add plate state helpers - ttm: debugfs support, removal of sysfs Driver Changes: - Convert drivers to shadow plane helpers - arc: Move to drm/tiny - ast: cursor plane reworks - gma500: Remove TTM and medfield support - mxsfb: imx8mm support - panfrost: MMU IRQ handling rework - qxl: rework to better handle resources deallocation, locking - sun4i: Add alpha properties for UI and VI layers - vc4: RPi4 CEC support - vmwgfx: doc cleanup Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210303100600.dgnkadonzuvfnu22@gilmour
2021-03-16spi: spi-geni-qcom: Convert to use resource-managed OPP APIYangtao Li
Use resource-managed OPP API to simplify code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Acked-by: Mark Brown <broonie@kernel.org> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-16opp: Change return type of devm_pm_opp_attach_genpd()Dmitry Osipenko
Make devm_pm_opp_attach_genpd() to return error code instead of opp_table pointer in order to have return type consistent with the other resource-managed OPP helpers. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-16opp: Change return type of devm_pm_opp_register_set_opp_helper()Dmitry Osipenko
Make devm_pm_opp_register_set_opp_helper() to return error code instead of opp_table pointer in order to have return type consistent with the other resource-managed OPP helpers. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-15tcp: relookup sock for RST+ACK packets handled by obsolete req sockAlexander Ovechkin
Currently tcp_check_req can be called with obsolete req socket for which big socket have been already created (because of CPU race or early demux assigning req socket to multiple packets in gro batch). Commit e0f9759f530bf789e984 ("tcp: try to keep packet if SYN_RCV race is lost") added retry in case when tcp_check_req is called for PSH|ACK packet. But if client sends RST+ACK immediatly after connection being established (it is performing healthcheck, for example) retry does not occur. In that case tcp_check_req tries to close req socket, leaving big socket active. Fixes: e0f9759f530 ("tcp: try to keep packet if SYN_RCV race is lost") Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> Reported-by: Oleg Senin <olegsenin@yandex-team.ru> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-15rcu/nocb: Disable bypass when CPU isn't completely offloadedFrederic Weisbecker
Currently, the bypass is flushed at the very last moment in the deoffloading procedure. However, this approach leads to a larger state space than would be preferred. This commit therefore disables the bypass at soon as the deoffloading procedure begins, then flushes it. This guarantees that the bypass remains empty and thus out of the way of the deoffloading procedure. Symmetrically, this commit waits to enable the bypass until the offloading procedure has completed. Reported-by: Paul E. McKenney <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-15rcu/tree: Add a trace event for RCU CPU stall warningsSangmoon Kim
This commit adds a trace event which allows tracing the beginnings of RCU CPU stall warnings on systems where sysctl_panic_on_rcu_stall is disabled. The first parameter is the name of RCU flavor like other trace events. The second parameter indicates whether this is a stall of an expedited grace period, a self-detected stall of a normal grace period, or a stall of a normal grace period detected by some CPU other than the one that is stalled. RCU CPU stall warnings are often caused by external-to-RCU issues, for example, in interrupt handling or task scheduling. Therefore, this event uses TRACE_EVENT, not TRACE_EVENT_RCU, to avoid requiring those interested in tracing RCU CPU stalls to rebuild their kernels with CONFIG_RCU_TRACE=y. Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-15netfilter: x_tables: Use correct memory barriers.Mark Tomlinson
When a new table value was assigned, it was followed by a write memory barrier. This ensured that all writes before this point would complete before any writes after this point. However, to determine whether the rules are unused, the sequence counter is read. To ensure that all writes have been done before these reads, a full memory barrier is needed, not just a write memory barrier. The same argument applies when incrementing the counter, before the rules are read. Changing to using smp_mb() instead of smp_wmb() fixes the kernel panic reported in cc00bcaa5899 (which is still present), while still maintaining the same speed of replacing tables. The smb_mb() barriers potentially slow the packet path, however testing has shown no measurable change in performance on a 4-core MIPS64 platform. Fixes: 7f5c6d4f665b ("netfilter: get rid of atomic ops in fast path") Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>