Age  Commit message  Author
2021-02-23  io-wq: remove nr_process accounting  Jens Axboe
We're now just using fork like we would from userspace, so there's no need to try and impose extra restrictions or accounting on the user side of things. That's already being done for us. That also means we don't have to pass in the user_struct anymore, that's correctly inherited through ->creds on fork. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  io_uring: flag new native workers with IORING_FEAT_NATIVE_WORKERS  Jens Axboe
A few reasons to do this: - The naming of the manager and worker have changed. That's a user visible change, so makes sense to flag it. - Opening certain files that use ->signal (like /proc/self or /dev/tty) now works, and the flag tells the application upfront that this is the case. - Related to the above, using signalfd will now work as well. Signed-off-by: Jens Axboe <axboe@kernel.dk>
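An application can test for this flag after ring setup. The following is a minimal sketch, not code from the commit. It assumes the flag occupies bit 9, as in the v5.12-era uapi header (verify against your installed <linux/io_uring.h>); ring_has_native_workers() is a hypothetical helper name, and features stands for the bitmask the kernel writes into struct io_uring_params during io_uring_setup(2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit position as in the v5.12-era <linux/io_uring.h>.
 * Defined here only so the sketch builds without kernel headers. */
#ifndef IORING_FEAT_NATIVE_WORKERS
#define IORING_FEAT_NATIVE_WORKERS (1U << 9)
#endif

/* Hypothetical helper: 'features' is the value of io_uring_params.features
 * after a successful io_uring_setup(2) call. */
static bool ring_has_native_workers(uint32_t features)
{
    return (features & IORING_FEAT_NATIVE_WORKERS) != 0;
}
```

When the helper returns true, the application may rely on the behaviours listed above (for example, opening /proc/self or using signalfd from ring requests); otherwise it should keep its fallback path.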
2021-02-23  net: remove cmsg restriction from io_uring based send/recvmsg calls  Jens Axboe
No need to restrict these anymore, as the worker threads are direct clones of the original task. Hence we know for a fact that we can support anything that the regular task can. Since the only user of proto_ops->flags was to flag PROTO_CMSG_DATA_ONLY, kill the member and the flag definition too. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  Revert "proc: don't allow async path resolution of /proc/self components"  Jens Axboe
This reverts commit 8d4c3e76e3be11a64df95ddee52e99092d42fc19. No longer needed, as the io-wq worker threads have the right identity. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  Revert "proc: don't allow async path resolution of /proc/thread-self components"  Jens Axboe
This reverts commit 0d4370cfe36b7f1719123b621a4ec4d9c7a25f89. No longer needed, as the io-wq worker threads have the right identity. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  block: fix logging on capacity change  Ming Lei
The local variable 'capacity' stores the previous disk capacity, and the 'size' variable records the latest disk capacity, so swap them to fix the logging on capacity change. Cc: Christoph Hellwig <hch@lst.de> Fixes: a782483cc1f8 ("block: remove the nr_sects field in struct hd_struct") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  blk-settings: align max_sectors on "logical_block_size" boundary  Mikulas Patocka
We get I/O errors when we run md-raid1 on the top of dm-integrity on the top of ramdisk. device-mapper: integrity: Bio not aligned on 8 sectors: 0xff00, 0xff device-mapper: integrity: Bio not aligned on 8 sectors: 0xff00, 0xff device-mapper: integrity: Bio not aligned on 8 sectors: 0xffff, 0x1 device-mapper: integrity: Bio not aligned on 8 sectors: 0xffff, 0x1 device-mapper: integrity: Bio not aligned on 8 sectors: 0x8048, 0xff device-mapper: integrity: Bio not aligned on 8 sectors: 0x8147, 0xff device-mapper: integrity: Bio not aligned on 8 sectors: 0x8246, 0xff device-mapper: integrity: Bio not aligned on 8 sectors: 0x8345, 0xbb The ramdisk device has logical_block_size 512 and max_sectors 255. The dm-integrity device uses logical_block_size 4096 and it doesn't affect the "max_sectors" value - thus, it inherits 255 from the ramdisk. So, we have a device with max_sectors not aligned on logical_block_size. The md-raid device sees that the underlying leg has max_sectors 255 and it will split the bios on 255-sector boundary, making the bios unaligned on logical_block_size. In order to fix the bug, we round down max_sectors to logical_block_size. Cc: stable@vger.kernel.org Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
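The rounding described above can be illustrated with a few lines of arithmetic. This is a sketch of the idea only, not the kernel's implementation: align_max_sectors() is a hypothetical name, sectors are assumed to be 512 bytes, and the logical block size is assumed to be a power of two that is at least 512.

```c
#include <assert.h>

/* Hypothetical helper: round max_sectors (counted in 512-byte sectors)
 * down to a multiple of the logical block size. */
static unsigned int align_max_sectors(unsigned int max_sectors,
                                      unsigned int logical_block_size)
{
    unsigned int sectors_per_block = logical_block_size >> 9;

    /* Assumption: block size is a power of two and at least one sector. */
    assert(sectors_per_block != 0 &&
           (sectors_per_block & (sectors_per_block - 1)) == 0);

    return max_sectors & ~(sectors_per_block - 1);
}
```

With the values from the report, a limit of 255 sectors on a device with 4096-byte logical blocks (8 sectors per block) becomes 248, so a bio split at that limit stays block-aligned. With 512-byte blocks the value is left at 255.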
2021-02-23  block: reopen the device in blkdev_reread_part  Christoph Hellwig
Historically the BLKRRPART ioctls called into the now defunct ->revalidate method, which caused the sd driver to check if any media is present. When the ->revalidate method was removed this revalidation was lost, leading to lots of I/O errors when using the eject command. Fix this by reopening the device to rescan the partitions, and thus calling the revalidation logic in the sd driver. Fixes: 471bd0af544b ("sd: use bdev_check_media_change") Reported-by: Tom Seewald <tseewald@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Tom Seewald <tseewald@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  io_uring: fix locked_free_list caches_free()  Pavel Begunkov
Don't forget to zero locked_free_nr. Leaving it set is not a disaster, but it causes attempts to flush the list with extra locking when there is nothing in it. Also, don't traverse a potentially long list freeing requests under a spinlock; splice the list and free the requests afterwards. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  io_uring: don't attempt IO reissue from the ring exit path  Jens Axboe
If we're exiting the ring, just let the IO fail with -EAGAIN as nobody will care anyway. It's not the right context to reissue from. Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  Merge branch 'for-5.12/dax' into for-5.12/libnvdimm  Dan Williams
Pick up device-dax updates to merge with libnvdimm device updates for 5.12. * Fix the polarity of EINVAL in a sysfs return code * Drop the unused return code for driver remove() callbacks
2021-02-23  Merge branch '100GbE' of ↵  Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-02-22 Dave corrects reporting of max TCs to use the value from hardware capabilities and setting of DCBx capability bits when changing between SW and FW LLDP. Brett fixes trusted VF multicast promiscuous not receiving expected packets and corrects VF max packet size when a port VLAN is configured. Henry updates available RSS queues following a change in channel count with a user defined LUT. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: update the number of available RSS queues ice: Fix state bits on LLDP mode switch ice: Account for port VLAN in VF max packet size calculation ice: Set trusted VF as default VSI when setting allmulti on ice: report correct max number of TCs ==================== Link: https://lore.kernel.org/r/20210222235814.834282-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23  Merge tag 'keys-misc-20210126' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull keyring updates from David Howells: "Here's a set of minor keyrings fixes/cleanups that I've collected from various people for the upcoming merge window. A couple of them might, in theory, be visible to userspace: - Make blacklist_vet_description() reject uppercase letters as they don't match the all-lowercase hex string generated for a blacklist search. This may want reconsideration in the future, but, currently, you can't add to the blacklist keyring from userspace and the only source of blacklist keys generates lowercase descriptions. - Fix blacklist_init() to use a new KEY_ALLOC_* flag to indicate that it wants KEY_FLAG_KEEP to be set rather than passing KEY_FLAG_KEEP into keyring_alloc() as KEY_FLAG_KEEP isn't a valid alloc flag. This isn't currently a problem as the blacklist keyring isn't currently writable by userspace. The rest of the patches are cleanups and I don't think they should have any visible effect" * tag 'keys-misc-20210126' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: watch_queue: rectify kernel-doc for init_watch() certs: Replace K{U,G}IDT_INIT() with GLOBAL_ROOT_{U,G}ID certs: Fix blacklist flag type confusion PKCS#7: Fix missing include certs: Fix blacklisted hexadecimal hash string check certs/blacklist: fix kernel doc interface issue crypto: public_key: Remove redundant header file from public_key.h keys: remove trailing semicolon in macro definition crypto: pkcs7: Use match_string() helper to simplify the code PKCS#7: drop function from kernel-doc pkcs7_validate_trust_one encrypted-keys: Replace HTTP links with HTTPS ones crypto: asymmetric_keys: fix some comments in pkcs7_parser.h KEYS: remove redundant memset security: keys: delete repeated words in comments KEYS: asymmetric: Fix kerneldoc security/keys: use kvfree_sensitive() watch_queue: Drop references to /dev/watch_queue keys: Remove outdated __user annotations security: keys: Fix fall-through warnings for Clang
2021-02-23  Merge branch 'wireguard-fixes-for-5-12-rc1'  Jakub Kicinski
Jason Donenfeld says: ==================== wireguard fixes for 5.12-rc1 This series has a collection of fixes that have piled up for a little while now, that I unfortunately didn't get a chance to send out earlier. 1) Removes unlikely() from IS_ERR(), since it's already implied. 2) Remove a bogus sparse annotation that hasn't been needed for years. 3) Additional test in the test suite for stressing parallel ndo_start_xmit. 4) Slight struct reordering in preparation for subsequent fix. 5) If skb->protocol is bogus, we no longer attempt to send icmp messages. 6) Massive memory usage fix, hit by larger deployments. 7) Fix typo in kconfig dependency logic. (1) and (2) are tiny cleanups, and (3) is just a test, so if you're trying to reduce churn, you could skip backporting these. But (4), (5), (6), and (7) fix problems and should be applied to stable. IMO, it's probably easiest to just apply them all to stable. ==================== Link: https://lore.kernel.org/r/20210222162549.3252778-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23  wireguard: kconfig: use arm chacha even with no neon  Jason A. Donenfeld
The condition here was incorrect: a non-neon fallback implementation is available on arm32 when NEON is not supported. Reported-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23  wireguard: queueing: get rid of per-peer ring buffers  Jason A. Donenfeld
Having two ring buffers per-peer means that every peer results in two massive ring allocations. On an 8-core x86_64 machine, this commit reduces the per-peer allocation from 18,688 bytes to 1,856 bytes, which is a 90% reduction. Ninety percent! With some single-machine deployments approaching 500,000 peers, we're talking about a reduction from 7 gigs of memory down to 700 megs of memory. In order to get rid of these per-peer allocations, this commit switches to using a list-based queueing approach. Currently GSO fragments are chained together using the skb->next pointer (the skb_list_* singly linked list approach), so we form the per-peer queue around the unused skb->prev pointer (which sort of makes sense because the links are pointing backwards). Use of skb_queue_* is not possible here, because that is based on doubly linked lists and spinlocks. Multiple cores can write into the queue at any given time, because its writes occur in the start_xmit path or in the udp_recv path. But reads happen in a single workqueue item per-peer, amounting to a multi-producer, single-consumer paradigm. The MPSC queue is implemented locklessly and never blocks. However, it is not linearizable (though it is serializable), with a very tight and unlikely race on writes, which, when hit (some tiny fraction of the 0.15% of partial adds on a fully loaded 16-core x86_64 system), causes the queue reader to terminate early. However, because every packet sent queues up the same workqueue item after it is fully added, the worker resumes again, and stopping early isn't actually a problem, since at that point the packet wouldn't have yet been added to the encryption queue. These properties allow us to avoid disabling interrupts or spinning. The design is based on Dmitry Vyukov's algorithm [1]. Performance-wise, ordinarily list-based queues aren't preferable to ringbuffers, because of cache misses when following pointers around. 
However, we *already* have to follow the adjacent pointers when working through fragments, so there shouldn't actually be any change there. A potential downside is that dequeueing is a bit more complicated, but the ptr_ring structure used prior had a spinlock when dequeueing, so all in all the difference appears to be a wash. Actually, from profiling, the biggest performance hit, by far, of this commit winds up being atomic_add_unless(count, 1, max) and atomic_dec(count), which account for the majority of CPU time, according to perf. In that sense, the previous ring buffer was superior in that it could check if it was full by head==tail, which the list-based approach cannot do. But all in all, this enables us to get massive memory savings, allowing WireGuard to scale for real world deployments, without taking much of a performance hit. [1] http://www.1024cores.net/home/lock-free-algorithms/queues/intrusive-mpsc-node-based-queue Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
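The multi-producer, single-consumer design referenced in [1] can be sketched in portable C11. This is an illustration of the general intrusive MPSC technique, not the WireGuard code: the struct and function names are hypothetical, the link field is a plain next pointer rather than skb->prev, the chosen memory orderings are one reasonable option, and the element counter mentioned above is omitted.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
    _Atomic(struct node *) next;
    int value;                      /* payload; stands in for a packet */
};

struct mpsc_queue {
    _Atomic(struct node *) head;    /* producers swap new nodes in here */
    struct node *tail;              /* touched only by the single consumer */
    struct node stub;               /* placeholder so the list is never empty */
};

/* The queue must not be moved after init, since it points at its own stub. */
static void mpsc_init(struct mpsc_queue *q)
{
    atomic_store(&q->stub.next, NULL);
    atomic_store(&q->head, &q->stub);
    q->tail = &q->stub;
}

/* Safe to call from many threads concurrently; never blocks. */
static void mpsc_push(struct mpsc_queue *q, struct node *n)
{
    struct node *prev;

    atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
    prev = atomic_exchange_explicit(&q->head, n, memory_order_acq_rel);
    /* Until the store below lands, the consumer sees the list end at prev.
     * This is the short window in which a reader may stop early. */
    atomic_store_explicit(&prev->next, n, memory_order_release);
}

/* Single consumer only. Returns NULL when empty or when a push is in flight. */
static struct node *mpsc_pop(struct mpsc_queue *q)
{
    struct node *tail = q->tail;
    struct node *next = atomic_load_explicit(&tail->next, memory_order_acquire);

    if (tail == &q->stub) {         /* skip over the placeholder */
        if (!next)
            return NULL;
        q->tail = next;
        tail = next;
        next = atomic_load_explicit(&tail->next, memory_order_acquire);
    }
    if (next) {
        q->tail = next;
        return tail;
    }
    if (tail != atomic_load_explicit(&q->head, memory_order_acquire))
        return NULL;                /* a producer is mid-push: stop early */

    mpsc_push(q, &q->stub);         /* re-add the stub so tail can be released */
    next = atomic_load_explicit(&tail->next, memory_order_acquire);
    if (next) {
        q->tail = next;
        return tail;
    }
    return NULL;
}
```

A consumer that receives NULL while a push is in flight simply retries later, which mirrors the commit's observation that the worker is queued again once the packet is fully added.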
2021-02-23  wireguard: device: do not generate ICMP for non-IP packets  Jason A. Donenfeld
If skb->protocol doesn't match the actual skb->data header, it's probably not a good idea to pass it off to icmp{,v6}_ndo_send, which is expecting to reply to a valid IP packet. So this commit has that early mismatch case jump to a later error label. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23  wireguard: peer: put frequently used members above cache lines  Jason A. Donenfeld
The is_dead boolean is checked for every single packet, while the internal_id member is used basically only for pr_debug messages. So it makes sense to hoist up is_dead into some space formerly unused by a struct hole, while demoting internal_id to below the lowest struct cache line. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23  wireguard: selftests: test multiple parallel streams  Jason A. Donenfeld
In order to test ndo_start_xmit being called in parallel, explicitly add separate tests, which should all run on different cores. This should help tease out bugs associated with queueing up packets from different cores in parallel. Currently, it hasn't found those types of bugs, but given future planned work, this is a useful regression to avoid. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23  wireguard: socket: remove bogus __be32 annotation  Jann Horn
The endpoint->src_if4 has nothing to do with fixed-endian numbers; remove the bogus annotation. This was introduced in https://git.zx2c4.com/wireguard-monolithic-historical/commit?id=14e7d0a499a676ec55176c0de2f9fcbd34074a82 in the historical WireGuard repo because the old code used to zero-initialize multiple members as follows: endpoint->src4.s_addr = endpoint->src_if4 = fl.saddr = 0; Because fl.saddr is fixed-endian and an assignment returns a value with the type of its left operand, this meant that sparse detected an assignment between values of different endianness. Since then, this assignment was already split up into separate statements; just the cast survived. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-23  wireguard: avoid double unlikely() notation when using IS_ERR()  Antonio Quartulli
The definition of IS_ERR() already applies the unlikely() notation when checking the error status of the passed pointer. For this reason there is no need to have the same notation outside of IS_ERR() itself. Clean up code by removing redundant notation. Signed-off-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
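The redundancy is easier to see with the macro written out. The following is a userspace approximation of the kernel's error-pointer helpers, reconstructed from memory of include/linux/err.h and include/linux/compiler.h rather than copied from them; it assumes a GCC- or Clang-compatible compiler for __builtin_expect.

```c
#include <stdbool.h>

#define MAX_ERRNO 4095

/* Branch-prediction hint, approximating the kernel's unlikely(). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* The hint lives here, so every IS_ERR() caller already gets it. */
#define IS_ERR_VALUE(x) \
    unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)

static inline void *ERR_PTR(long error)
{
    return (void *)error;
}

static inline bool IS_ERR(const void *ptr)
{
    return IS_ERR_VALUE((unsigned long)ptr);
}
```

Given this shape, writing if (unlikely(IS_ERR(p))) applies the hint twice, and if (IS_ERR(p)) expresses the same thing.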
2021-02-23  io_uring: move SQPOLL thread io-wq forked worker  Jens Axboe
Don't use a kthread for SQPOLL, use a forked worker just like the io-wq workers. With that done, we can drop the various context grabbing we do for SQPOLL, it already has everything it needs. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  net: qrtr: Fix memory leak in qrtr_tun_open  Takeshi Misawa
If qrtr_endpoint_register() failed, tun is leaked. Fix this, by freeing tun in error path. syzbot report: BUG: memory leak unreferenced object 0xffff88811848d680 (size 64): comm "syz-executor684", pid 10171, jiffies 4294951561 (age 26.070s) hex dump (first 32 bytes): 80 dd 0a 84 ff ff ff ff 00 00 00 00 00 00 00 00 ................ 90 d6 48 18 81 88 ff ff 90 d6 48 18 81 88 ff ff ..H.......H..... backtrace: [<0000000018992a50>] kmalloc include/linux/slab.h:552 [inline] [<0000000018992a50>] kzalloc include/linux/slab.h:682 [inline] [<0000000018992a50>] qrtr_tun_open+0x22/0x90 net/qrtr/tun.c:35 [<0000000003a453ef>] misc_open+0x19c/0x1e0 drivers/char/misc.c:141 [<00000000dec38ac8>] chrdev_open+0x10d/0x340 fs/char_dev.c:414 [<0000000079094996>] do_dentry_open+0x1e6/0x620 fs/open.c:817 [<000000004096d290>] do_open fs/namei.c:3252 [inline] [<000000004096d290>] path_openat+0x74a/0x1b00 fs/namei.c:3369 [<00000000b8e64241>] do_filp_open+0xa0/0x190 fs/namei.c:3396 [<00000000a3299422>] do_sys_openat2+0xed/0x230 fs/open.c:1172 [<000000002c1bdcef>] do_sys_open fs/open.c:1188 [inline] [<000000002c1bdcef>] __do_sys_openat fs/open.c:1204 [inline] [<000000002c1bdcef>] __se_sys_openat fs/open.c:1199 [inline] [<000000002c1bdcef>] __x64_sys_openat+0x7f/0xe0 fs/open.c:1199 [<00000000f3a5728f>] do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 [<000000004b38b7ec>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 28fb4e59a47d ("net: qrtr: Expose tunneling endpoint to user space") Reported-by: syzbot+5d6e4af21385f5cfc56a@syzkaller.appspotmail.com Signed-off-by: Takeshi Misawa <jeliantsurux@gmail.com> Link: https://lore.kernel.org/r/20210221234427.GA2140@DESKTOP Signed-off-by: Jakub Kicinski <kuba@kernel.org>
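The shape of this fix is the usual free-on-error-path pattern. The sketch below is a userspace illustration under stated assumptions, not the driver code: struct tun, register_endpoint() and tun_open() are hypothetical stand-ins, calloc()/free() replace kzalloc()/kfree(), and live_allocations exists only to make the leak observable.

```c
#include <stdlib.h>

struct tun {
    int registered;
};

static int live_allocations;    /* bookkeeping for the illustration only */

/* Hypothetical stand-in for the registration step; fails on request. */
static int register_endpoint(struct tun *tun, int should_fail)
{
    if (should_fail)
        return -1;
    tun->registered = 1;
    return 0;
}

/* Allocate, then register; release the allocation if registration fails. */
static struct tun *tun_open(int should_fail)
{
    struct tun *tun = calloc(1, sizeof(*tun));

    if (!tun)
        return NULL;
    live_allocations++;

    if (register_endpoint(tun, should_fail) < 0) {
        free(tun);              /* without this, the object would be leaked */
        live_allocations--;
        return NULL;
    }
    return tun;
}
```

Calling tun_open(1) exercises the failure path and leaves live_allocations at zero, which is the property the leak report shows was violated before the fix.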
2021-02-24  s390/cpumf: Add support for complete counter set extraction  Thomas Richter
Add support to the CPU Measurement counter facility device driver to extract complete counter sets per CPU and per counter set from user space. This includes a new device named /dev/hwctr and support for the device driver functions open, close and ioctl. Other functions are not supported. The ioctl command supports 3 subcommands:
S390_HWCTR_START: enables counter sets on a list of CPUs.
S390_HWCTR_STOP: disables counter sets on a list of CPUs.
S390_HWCTR_READ: reads counter sets on a list of CPUs.
The ioctl(..., S390_HWCTR_READ, ...) is the only subcommand which returns data. It requires member data_bytes to be positive; the member indicates the maximum amount of space available to store counter set data. The other ioctl() subcommands do not use this member and it should be set to zero. The S390_HWCTR_READ subcommand returns the following data: The cpuset data is flattened using the following scheme, stored in member data:

0x0       0x8   0xc       0x10  0x10      0x18  0x20  0x28      0xU-1
+---------+-----+---------+-----+---------+-----+-----+------+------+
| no_cpus | cpu | no_sets | set | no_cnts | cv1 | cv2 | .... | cv_n |
+---------+-----+---------+-----+---------+-----+-----+------+------+

0xU   0xU+4     0xU+8 0xU+10          0xV-1
+-----+---------+-----+-----+------+------+
| set | no_cnts | cv1 | cv2 | .... | cv_n |
+-----+---------+-----+-----+------+------+

0xV   0xV+4     0xV+8 0xV+c
+-----+---------+-----+---------+-----+-----+------+------+
| cpu | no_sets | set | no_cnts | cv1 | cv2 | .... | cv_n |
+-----+---------+-----+---------+-----+-----+------+------+

U and V denote arbitrary hexadecimal addresses. The first integer represents the number of CPUs data was extracted from. This is followed by CPU number and number of counter sets extracted. These are two integer values. This is followed by the set identifier and number of counters extracted. These are two integer values. This is followed by the counter values, each element is eight bytes in size.
The S390_HWCTR_READ ioctl subcommand is also limited to one call per minute. This ensures that an application does not read out the counter sets too often and thereby reduce overall CPU performance, since complete counter set extraction is an expensive operation. Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
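A reader of the flattened buffer has to walk it record by record. The sketch below shows one way to do that. It rests on assumptions inferred from the offsets in the commit message, not on the uapi header: no_cpus is taken to be 8 bytes, cpu, no_sets, set and no_cnts 4 bytes each, counter values 8 bytes, all in host byte order. hwctr_walk() is a hypothetical name, and the sketch only counts and sums the values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical walker over the flattened layout described above.
 * Returns 0 on success, -1 if the buffer ends before the data does. */
static int hwctr_walk(const unsigned char *buf, size_t len,
                      uint64_t *n_values, uint64_t *sum)
{
    size_t off = 0;
    uint64_t no_cpus;

    *n_values = 0;
    *sum = 0;

    if (len - off < sizeof(no_cpus))
        return -1;
    memcpy(&no_cpus, buf + off, sizeof(no_cpus));   /* assumed 8 bytes */
    off += sizeof(no_cpus);

    for (uint64_t c = 0; c < no_cpus; c++) {
        uint32_t cpu_hdr[2];                        /* cpu, no_sets */

        if (len - off < sizeof(cpu_hdr))
            return -1;
        memcpy(cpu_hdr, buf + off, sizeof(cpu_hdr));
        off += sizeof(cpu_hdr);

        for (uint32_t s = 0; s < cpu_hdr[1]; s++) {
            uint32_t set_hdr[2];                    /* set, no_cnts */

            if (len - off < sizeof(set_hdr))
                return -1;
            memcpy(set_hdr, buf + off, sizeof(set_hdr));
            off += sizeof(set_hdr);

            for (uint32_t i = 0; i < set_hdr[1]; i++) {
                uint64_t cv;                        /* one counter value */

                if (len - off < sizeof(cv))
                    return -1;
                memcpy(&cv, buf + off, sizeof(cv));
                off += sizeof(cv);
                (*n_values)++;
                *sum += cv;
            }
        }
    }
    return 0;
}
```

Every read is bounds-checked against len, so a truncated buffer is reported instead of being read past its end. A real consumer would replace the count-and-sum step with whatever it needs to do per CPU, set and counter.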
2021-02-24  virtio/s390: implement virtio-ccw revision 2 correctly  Cornelia Huck
CCW_CMD_READ_STATUS was introduced with revision 2 of virtio-ccw, and drivers should only rely on it being implemented when they negotiated at least that revision with the device. However, virtio_ccw_get_status() issued READ_STATUS for any device operating at least at revision 1. If the device accepts READ_STATUS regardless of the negotiated revision (which some implementations like QEMU do, even though the spec currently does not allow it), everything works as intended. While a device rejecting the command should also be handled gracefully, we will not be able to see any changes the device makes to the status, such as setting NEEDS_RESET or setting the status to zero after a completed reset. We negotiated the revision to at most 1, as we never bumped the maximum revision; let's do that now and properly send READ_STATUS only if we are operating at least at revision 2. Cc: stable@vger.kernel.org Fixes: 7d3ce5ab9430 ("virtio/s390: support READ_STATUS command for virtio-ccw") Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20210216110645.1087321-1-cohuck@redhat.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/smp: implement arch_irq_work_raise()  Ilya Leoshkevich
The immediate need to have this is to have bpf_send_signal() send the signal ASAP instead of during the next hrtimer interrupt. However, it should also improve irq_work_queue() latencies in general, as well as get s390 out of the lame architectures list [1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/irq_work.c?h=v5.11#n45 Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/topology: move cpumasks away from stack  Heiko Carstens
Make cpumasks static variables to avoid potential large stack frames. There shouldn't be any concurrent callers since all current callers are serialized with the cpu hotplug lock. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/smp: smp_emergency_stop() - move cpumask away from stack  Heiko Carstens
Make "cpumask_t cpumask" a static variable to avoid a potential large stack frame. Also protect against potential concurrent callers by introducing a local lock. Note: smp_emergency_stop() gets only called with irqs and machine checks disabled, therefore a cpu local deadlock is not possible. For concurrent callers the first cpu which enters the critical section wins and will stop all other cpus. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/smp: __smp_rescan_cpus() - move cpumask away from stack  Heiko Carstens
Avoid a potentially large stack frame and overflow by making "cpumask_t avail" a static variable. There is no concurrent access due to the existing locking. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/smp: consolidate locking for smp_rescan()  Heiko Carstens
Move locking to __smp_rescan() instead of duplicating it to all call sites. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/mm: fix phys vs virt confusion in vmem_*() functions family  Alexander Gordeev
For historical reasons, the vmem_*() functions misuse or ignore the difference between physical and virtual addresses. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/mm: fix phys vs virt confusion in pgtable allocation routines  Alexander Gordeev
The physical address of page tables is passed around and used as virtual address in various locations. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/mm: fix invalid __pa() usage in pfn_pXd() macros  Alexander Gordeev
There is little sense in applying __pa() to a physical address, but that is what the pfn_pXd() macros do. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/mm: make pXd_deref() macros return a pointer  Alexander Gordeev
This update fixes semantics of pXd_deref macros which are expected to return a CPU-addressable pointer. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-24  s390/opcodes: rename selhhhr to selfhr  Heiko Carstens
Provide correct mnemonic for selfhr. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-23  Merge tag 'clang-lto-v5.12-rc1-part2' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull more clang LTO updates from Kees Cook: "Clang LTO x86 enablement. Full disclosure: while this has _not_ been in linux-next (since it initially looked like the objtool dependencies weren't going to make v5.12), it has been under daily build and runtime testing by Sami for quite some time. These x86 portions have been discussed on lkml, with Peter, Josh, and others helping nail things down. The bulk of the changes are to get objtool working happily. The rest of the x86 enablement is very small. Summary: - Generate __mcount_loc in objtool (Peter Zijlstra) - Support running objtool against vmlinux.o (Sami Tolvanen) - Clang LTO enablement for x86 (Sami Tolvanen)" Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/ Link: https://lore.kernel.org/lkml/cover.1611263461.git.jpoimboe@redhat.com/ * tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: kbuild: lto: force rebuilds when switching CONFIG_LTO x86, build: allow LTO to be selected x86, cpu: disable LTO for cpu.c x86, vdso: disable LTO only for vDSO kbuild: lto: postpone objtool objtool: Split noinstr validation from --vmlinux x86, build: use objtool mcount tracing: add support for objtool mcount objtool: Don't autodetect vmlinux.o objtool: Fix __mcount_loc generation with Clang's assembler objtool: Add a pass for generating __mcount_loc
2021-02-23  PCI/portdrv: Report reset for frozen channel  Keith Busch
The PCI error recovery always resets the link for a frozen state, so the port driver should return that a reset is required for its result. This will get the .slot_reset() callback invoked, which is necessary to restore the port's config space. Without this, the driver had been relying on downstream drivers to return this status. Link: https://lore.kernel.org/r/20210104230300.1277180-6-kbusch@kernel.org Tested-by: Hedi Berriche <hedi.berriche@hpe.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23  PCI/AER: Specify the type of Port that was reset  Keith Busch
The AER driver may be called upon to reset either a Downstream or a Root Port. Check which type it is to properly identify it when logging that the reset occurred. Link: https://lore.kernel.org/r/20210104230300.1277180-5-kbusch@kernel.org Tested-by: Hedi Berriche <hedi.berriche@hpe.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23  PCI/ERR: Retain status from error notification  Keith Busch
Overwriting the frozen detected status with the result of the link reset loses the NEED_RESET result that drivers are depending on for error handling to report the .slot_reset() callback. Retain this status so that subsequent error handling has the correct flow. Link: https://lore.kernel.org/r/20210104230300.1277180-4-kbusch@kernel.org Reported-by: Hinko Kocevar <hinko.kocevar@ess.eu> Tested-by: Hedi Berriche <hedi.berriche@hpe.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Sean V Kelley <sean.v.kelley@intel.com> Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23  PCI/AER: Clear AER status from Root Port when resetting Downstream Port  Keith Busch
The pci_dev parameter given to aer_root_reset() may be a Downstream Port rather than the Root Port. Get the Root Port from the provided device in order to clear the root's AER status. Link: https://lore.kernel.org/r/20210104230300.1277180-3-kbusch@kernel.org Tested-by: Hedi Berriche <hedi.berriche@hpe.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Sean V Kelley <sean.v.kelley@intel.com> Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23  PCI/ERR: Clear status of the reporting device  Keith Busch
Error handling operates on the first Downstream Port above the detected error, but the error may have been reported by a downstream device. Clear the AER status of the device that reported the error rather than the first Downstream Port. Link: https://lore.kernel.org/r/20210104230300.1277180-2-kbusch@kernel.org Tested-by: Hedi Berriche <hedi.berriche@hpe.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Sean V Kelley <sean.v.kelley@intel.com> Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
2021-02-23  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc  Linus Torvalds
Pull sparc updates from David Miller: "A host of small cleanups and adjustments that have accumulated while I was away, nothing major" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: (26 commits) sparc: make xchg() into a statement expression sparc64: Use arch_validate_flags() to validate ADI flag sparc32: Fix comparing pointer to 0 coccicheck warning sparc: fix led.c driver when PROC_FS is not enabled sparc: Fix handling of page table constructor failure sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is set tty: hvcs: Drop unnecessary if block tty: vcc: Drop unnecessary if block tty: vcc: Drop impossible to hit WARN_ON sparc: sparc64_defconfig: add necessary configs for qemu sparc64: switch defconfig from the legacy ide driver to libata sparc32: Preserve clone syscall flags argument for restarts due to signals sparc32: Limit memblock allocation to low memory sparc: Replace test_ti_thread_flag() with test_tsk_thread_flag() sbus: char: Remove meaningless jump label out_free sparc32: signal: Fix stack trampoline for RT signals sparc: remove SA_STATIC_ALLOC macro definition sparc: use for_each_child_of_node() macro sparc: Use fallthrough pseudo-keyword sparc32: srmmu: improve type safety of __nocache_fix() ...
2021-02-23  Merge tag 'dmaengine-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine  Linus Torvalds
    Pull dmaengine updates from Vinod Koul:
     "We have a couple of drivers removed, a new driver, a bunch of new device support, and a few updates to drivers for this round.

      New drivers/devices:
       - Intel LGM SoC DMA driver
       - Actions Semi S500 DMA controller
       - Renesas r8a779a0 dma controller
       - Ingenic JZ4760(B) dma controller
       - Intel KeemBay AxiDMA controller

      Removed:
       - Coh901318 dma driver
       - Zte zx dma driver
       - Sirfsoc dma driver

      Updates:
       - mmp_pdma, mmp_tdma gained module support
       - imx-sdma became modern and dropped platform data support
       - dw-axi driver gained slave and cyclic dma support"

    * tag 'dmaengine-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (58 commits)
      dmaengine: dw-axi-dmac: remove redundant null check on desc
      dmaengine: xilinx_dma: Alloc tx descriptors GFP_NOWAIT
      dmaengine: dw-axi-dmac: Virtually split the linked-list
      dmaengine: dw-axi-dmac: Set constraint to the Max segment size
      dmaengine: dw-axi-dmac: Add Intel KeemBay AxiDMA BYTE and HALFWORD registers
      dmaengine: dw-axi-dmac: Add Intel KeemBay AxiDMA handshake
      dmaengine: dw-axi-dmac: Add Intel KeemBay AxiDMA support
      dmaengine: drivers: Kconfig: add HAS_IOMEM dependency to DW_AXI_DMAC
      dmaengine: dw-axi-dmac: Add Intel KeemBay DMA register fields
      dt-binding: dma: dw-axi-dmac: Add support for Intel KeemBay AxiDMA
      dmaengine: dw-axi-dmac: Support burst residue granularity
      dmaengine: dw-axi-dmac: Support of_dma_controller_register()
      dmaegine: dw-axi-dmac: Support device_prep_dma_cyclic()
      dmaengine: dw-axi-dmac: Support device_prep_slave_sg
      dmaengine: dw-axi-dmac: Add device_config operation
      dmaengine: dw-axi-dmac: Add device_synchronize() callback
      dmaengine: dw-axi-dmac: move dma_pool_create() to alloc_chan_resources()
      dmaengine: dw-axi-dmac: simplify descriptor management
      dt-bindings: dma: Add YAML schemas for dw-axi-dmac
      dmaengine: ti: k3-psil: optimize struct psil_endpoint_config for size
      ...
2021-02-23  Merge tag 'acpi-5.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm  Linus Torvalds
    Pull more ACPI updates from Rafael Wysocki:
     "Fix race condition in generic_serial_bus (I2C) and GPIO Operation Region handling in ACPICA and reduce some related code duplication (Hans de Goede)"

    * tag 'acpi-5.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      ACPICA: Remove some code duplication from acpi_ev_address_space_dispatch
      ACPICA: Fix race in generic_serial_bus (I2C) and GPIO op_region parameter handling
2021-02-23  Merge tag 'pm-5.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm  Linus Torvalds
    Pull more power management updates from Rafael Wysocki:
     "These are fixes and cleanups on top of the power management material for 5.12-rc1 merged previously.

      Specifics:
       - Address cpufreq regression introduced in 5.11 that causes CPU frequency reporting to be distorted on systems with CPPC that use acpi-cpufreq as the scaling driver (Rafael Wysocki).
       - Fix regression introduced during the 5.10 development cycle related to CPU hotplug and policy recreation in the qcom-cpufreq-hw driver (Shawn Guo).
       - Fix recent regression in the operating performance points (OPP) framework that may cause frequency updates to be skipped by mistake in some cases (Jonathan Marek).
       - Simplify schedutil governor code and remove a misleading comment from it (Yue Hu).
       - Fix kerneldoc comment typo in the cpufreq core (Yue Hu)"

    * tag 'pm-5.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      cpufreq: Fix typo in kerneldoc comment
      cpufreq: schedutil: Remove update_lock comment from struct sugov_policy definition
      cpufreq: schedutil: Remove needless sg_policy parameter from ignore_dl_rate_limit()
      cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is known
      cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks
      opp: Don't skip freq update for different frequency
2021-02-23  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input  Linus Torvalds
    Pull input updates from Dmitry Torokhov:
     "Mostly existing driver fixes plus a new driver for game controllers directly connected to Nintendo 64, and an enhancement for keyboards driven by Chrome OS EC to communicate layout of the top row to userspace"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (47 commits)
      Input: st1232 - fix NORMAL vs. IDLE state handling
      Input: aiptek - convert sysfs sprintf/snprintf family to sysfs_emit
      Input: alps - fix spelling of "positive"
      ARM: dts: cros-ec-keyboard: Use keymap macros
      dt-bindings: input: Fix the keymap for LOCK key
      dt-bindings: input: Create macros for cros-ec keymap
      Input: cros-ec-keyb - expose function row physical map to userspace
      dt-bindings: input: cros-ec-keyb: Add a new property describing top row
      Input: applespi - fix occasional crc errors under load.
      Input: applespi - don't wait for responses to commands indefinitely.
      Input: st1232 - add IDLE state as ready condition
      Input: zinitix - fix return type of zinitix_init_touch()
      Input: i8042 - add ASUS Zenbook Flip to noselftest list
      Input: add missing dependencies on CONFIG_HAS_IOMEM
      Input: joydev - prevent potential read overflow in ioctl
      Input: elo - fix an error code in elo_connect()
      Input: xpad - add support for PowerA Enhanced Wired Controller for Xbox Series X|S
      Input: sur40 - fix an error code in sur40_probe()
      Input: elants_i2c - detect enum overflow
      Input: zinitix - remove unneeded semicolon
      ...
2021-02-23  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid  Linus Torvalds
    Pull HID updates from Jiri Kosina:

     - support for "Unified Battery" feature on Logitech devices from Filipe Laíns
     - power management improvements for intel-ish driver from Zhang Lixu
     - support for Goodix devices from Douglas Anderson
     - improved handling of generic HID keyboard in order to make it easier for userspace to figure out the details of the device, from Dmitry Torokhov
     - Playstation DualSense support from Roderick Colenbrander
     - other assorted small fixes and device ID additions.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (49 commits)
      HID: playstation: add DualSense player LED support.
      HID: playstation: add microphone mute support for DualSense.
      HID: playstation: add initial DualSense lightbar support.
      HID: wacom: Ignore attempts to overwrite the touch_max value from HID
      HID: playstation: fix array size comparison (off-by-one)
      HID: playstation: fix unused variable in ps_battery_get_property.
      HID: playstation: report DualSense hardware and firmware version.
      HID: playstation: add DualSense classic rumble support.
      HID: playstation: add DualSense Bluetooth support.
      HID: playstation: track devices in list.
      HID: playstation: add DualSense accelerometer and gyroscope support.
      HID: playstation: add DualSense touchpad support.
      HID: playstation: add DualSense battery support.
      HID: playstation: use DualSense MAC address as unique identifier.
      HID: playstation: initial DualSense USB support.
      HID: ite: Enable QUIRK_TOUCHPAD_ON_OFF_REPORT on Acer Aspire Switch 10E
      HID: Ignore battery for Elan touchscreen on HP Spectre X360 15-df0xxx
      HID: logitech-dj: add support for the new lightspeed connection iteration
      HID: intel-ish-hid: ipc: Add Tiger Lake H PCI device ID
      HID: logitech-dj: add support for keyboard events in eQUAD step 4 Gaming
      ...
2021-02-23  scripts/dtc: Add missing fdtoverlay to gitignore  Rob Herring
    Commit 0da6bcd9fcc0 ("scripts: dtc: Build fdtoverlay tool") enabled building fdtoverlay, but failed to add it to .gitignore. Also add a note to keep hostprogs in sync with .gitignore.

    Fixes: 0da6bcd9fcc0 ("scripts: dtc: Build fdtoverlay tool")
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rob Herring <robh@kernel.org>
2021-02-23  block: don't skip empty device in in disk_uevent  Christoph Hellwig
    Restore the previous behavior by using the correct flag for the whole device ("part0").

    Fixes: 99dfc43ecbf6 ("block: use ->bi_bdev for bio based I/O accounting")
    Reported-by: John Stultz <john.stultz@linaro.org>
    Tested-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  kbuild: lto: force rebuilds when switching CONFIG_LTO  Sami Tolvanen
    When doing non-clean builds and switching between CONFIG_LTO=n and CONFIG_LTO=y, the build system (correctly) didn't notice that assembly and LTO-excluded C object files were rewritten in place by objtool (to add the .orc_unwind* sections), since their build command lines were the same between CONFIG_LTO=y and CONFIG_LTO=n. The objtool step would fail:

      vmlinux.o: warning: objtool: file already has .orc_unwind section, skipping
      make: *** [Makefile:1194: vmlinux] Error 255

    Avoid this by making sure the build will see a difference between an LTO and non-LTO build (by including "-fno-lto" in KBUILD_*FLAGS). This will get ignored when CC_FLAGS_LTO is present, and will not be included at all when CONFIG_LTO=n.

    Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
    Signed-off-by: Kees Cook <keescook@chromium.org>
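The rebuild decision described in this commit message hinges on whether an object's build command line changes between configurations. Below is a minimal sketch of that idea, not Kbuild's actual logic. The function names and the `with_fix` switch are hypothetical, `-O2` is a placeholder for the common flags, and `-flto` merely stands in for whatever CC_FLAGS_LTO really contains.

```python
def compile_flags(config_lto: bool, lto_excluded: bool, with_fix: bool) -> list:
    """Return the toy command-line flags for one object file."""
    flags = ["-O2"]  # placeholder for the flags common to every build
    if config_lto:
        if with_fix:
            # The commit's change: mark every LTO-configured build in the base flags.
            flags.append("-fno-lto")
        if not lto_excluded:
            # Stand-in for CC_FLAGS_LTO; per the commit message, -fno-lto is
            # ignored whenever these flags are present.
            flags.append("-flto")
    return flags


def needs_rebuild(saved_cmd: list, new_cmd: list) -> bool:
    """Rebuild only when the saved command line differs from the new one."""
    return saved_cmd != new_cmd


# An LTO-excluded object, first built with CONFIG_LTO=n, then with CONFIG_LTO=y.
before_fix = needs_rebuild(compile_flags(False, True, with_fix=False),
                           compile_flags(True, True, with_fix=False))
after_fix = needs_rebuild(compile_flags(False, True, with_fix=True),
                          compile_flags(True, True, with_fix=True))
print(before_fix, after_fix)
```

Without the extra flag, the two command lines are identical for an LTO-excluded object, so the stale, objtool-modified file is reused. With it, the command lines differ and the object is rebuilt.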