summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-12-02Merge tag 'sound-5.10-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Here are the pending sound fixes for 5.10: all small device-specific fixes, and nothing particular stands out, so far" * tag 'sound-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Add mute LED quirk to yet another HP x360 model ALSA: hda/realtek: Fix bass speaker DAC assignment on Asus Zephyrus G14 ALSA: hda/generic: Add option to enforce preferred_dacs pairs ALSA: usb-audio: US16x08: fix value count for level meters ALSA: hda/realtek - Add new codec supported for ALC897 ASoC: rt5682: change SAR voltage threshold ASoC: wm_adsp: fix error return code in wm_adsp_load() ALSA: hda/realtek: Enable headset of ASUS UX482EG & B9400CEA with ALC294 ASoC: qcom: Fix enabling BCLK and LRCLK in LPAIF invalid state ALSA: hda/realtek - Fixed Dell AIO wrong sound tone ASoC: Intel: bytcr_rt5640: Fix HP Pavilion x2 Detachable quirks
2020-12-02Merge tag 'trace-v5.10-rc6-bootconfig' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull bootconfig fixes from Steven Rostedt: "Have bootconfig size and checksum be little endian In case the bootconfig is created on one kind of endian machine, and then read on the other kind of endian kernel, the size and checksum will be incorrect. Instead, have both the size and checksum always be little endian and have the tool and the kernel convert it from little endian to or from the host endian" * tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: docs: bootconfig: Add the endianness of fields tools/bootconfig: Store size and checksum in footer as le32 bootconfig: Load size and checksum in the footer as le32
2020-12-02dpaa_eth: copy timestamp fields to new skb in A-050385 workaroundYangbo Lu
The timestamp fields should be copied to new skb too in A-050385 workaround for later TX timestamping handling. Fixes: 3c68b8fffb48 ("dpaa_eth: FMan erratum A050385 workaround") Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Link: https://lore.kernel.org/r/20201201075258.1875-1-yangbo.lu@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02net: ip6_gre: set dev->hard_header_len when using header_opsAntoine Tenart
syzkaller managed to crash the kernel using an NBMA ip6gre interface. I could reproduce it creating an NBMA ip6gre interface and forwarding traffic to it: skbuff: skb_under_panic: text:ffffffff8250e927 len:148 put:44 head:ffff8c03c7a33 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:109! Call Trace: skb_push+0x10/0x10 ip6gre_header+0x47/0x1b0 neigh_connected_output+0xae/0xf0 ip6gre tunnel provides its own header_ops->create, and sets it conditionally when initializing the tunnel in NBMA mode. When header_ops->create is used, dev->hard_header_len should reflect the length of the header created. Otherwise, when not used, dev->needed_headroom should be used. Fixes: eb95f52fc72d ("net: ipv6_gre: Fix GRO to work on IPv6 over GRE tap") Cc: Maria Pasechnik <mariap@mellanox.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Link: https://lore.kernel.org/r/20201130161911.464106-1-atenart@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02mt76: usb: fix crash on device removalStanislaw Gruszka
Currently 'while (q->queued > 0)' loop was removed from mt76u_stop_tx() code. This causes crash on device removal as we try to cleanup empty queue: [ 96.495571] kernel BUG at include/linux/skbuff.h:2297! [ 96.498983] invalid opcode: 0000 [#1] SMP PTI [ 96.501162] CPU: 3 PID: 27 Comm: kworker/3:0 Not tainted 5.10.0-rc5+ #11 [ 96.502754] Hardware name: LENOVO 20DGS08H00/20DGS08H00, BIOS J5ET48WW (1.19 ) 08/27/2015 [ 96.504378] Workqueue: usb_hub_wq hub_event [ 96.505983] RIP: 0010:skb_pull+0x2d/0x30 [ 96.507576] Code: 00 00 8b 47 70 39 c6 77 1e 29 f0 89 47 70 3b 47 74 72 17 48 8b 87 c8 00 00 00 89 f6 48 01 f0 48 89 87 c8 00 00 00 c3 31 c0 c3 <0f> 0b 90 0f 1f 44 00 00 53 48 89 fb 48 8b bf c8 00 00 00 8b 43 70 [ 96.509296] RSP: 0018:ffffb11b801639b8 EFLAGS: 00010287 [ 96.511038] RAX: 000000001c6939ed RBX: ffffb11b801639f8 RCX: 0000000000000000 [ 96.512964] RDX: ffffb11b801639f8 RSI: 0000000000000018 RDI: ffff90c64e4fb800 [ 96.514710] RBP: ffff90c654551ee0 R08: ffff90c652bce7a8 R09: ffffb11b80163728 [ 96.516450] R10: 0000000000000001 R11: 0000000000000001 R12: ffff90c64e4fb800 [ 96.519749] R13: 0000000000000010 R14: 0000000000000020 R15: ffff90c64e352ce8 [ 96.523455] FS: 0000000000000000(0000) GS:ffff90c96eec0000(0000) knlGS:0000000000000000 [ 96.527171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 96.530900] CR2: 0000242556f18288 CR3: 0000000146a10002 CR4: 00000000003706e0 [ 96.534678] Call Trace: [ 96.538418] mt76x02u_tx_complete_skb+0x1f/0x50 [mt76x02_usb] [ 96.542231] mt76_queue_tx_complete+0x23/0x50 [mt76] [ 96.546028] mt76u_stop_tx.cold+0x71/0xa2 [mt76_usb] [ 96.549797] mt76x0u_stop+0x2f/0x90 [mt76x0u] [ 96.553638] drv_stop+0x33/0xd0 [mac80211] [ 96.557449] ieee80211_do_stop+0x558/0x860 [mac80211] [ 96.561262] ? dev_deactivate_many+0x298/0x2d0 [ 96.565101] ieee80211_stop+0x16/0x20 [mac80211] Fix that by adding while loop again. We need loop, not just single check, to clean all pending entries. Additionally move mt76_worker_disable/enable after !mt76_has_tx_pending() as we want to tx_worker to run to process tx queues, while we wait for exactly that. I was a bit worried about accessing q->queued without lock, but mt76_worker_disable() -> kthread_park() should assure this value will be seen updated on other cpus. Fixes: fe5b5ab52e9d ("mt76: unify queue tx cleanup code") Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl> Acked-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20201126125520.72912-1-stf_xl@wp.pl
2020-12-02iwlwifi: pcie: add some missing entries for AX210Golan Ben Ami
Some subsytem device IDs were missing from the list, so some AX210 devices were not recognized. Add them. Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/iwlwifi.20201202143859.a06ba7540449.I7390305d088a49c1043c9b489154fe057989c18f@changeid Link: https://lore.kernel.org/r/20201121003411.9450-1-ikegami.t@gmail.com
2020-12-02iwlwifi: pcie: invert values of NO_160 device config entriesLuca Coelho
The NO_160 flag specifies if the device doesn't have 160 MHz support, but we errorneously assumed the opposite. If the flag was set, we were considering that 160 MHz was supported, but it's actually the opposite. Fix it by inverting the bits, i.e. NO_160 is 0x1 and 160 is 0x0. Fixes: d6f2134a3831 ("iwlwifi: add mac/rf types and 160MHz to the device tables") Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/iwlwifi.20201202143859.375bec857ccb.I83884286b688965293e9810381808039bd7eedae@changeid
2020-12-02iwlwifi: pcie: add one missing entry for AX210Luca Coelho
The 0x0024 subsytem device ID was missing from the list, so some AX210 devices were not recognized. Add it. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/iwlwifi.20201202143859.308eab4db42c.I3763196cd3f7bb36f3dcabf02ec4e7c4fe859c0f@changeid
2020-12-02iwlwifi: update MAINTAINERS entryJohannes Berg
Reflect the fact that the linuxwifi@intel.com address will disappear, and that neither Emmanuel nor myself are really much involved with the maintenance these days. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/iwlwifi.20201129151117.a25afe6d2c7f.I8f13a5689dd353825fb2b9bd5b6f0fbce92cb12b@changeid
2020-12-02s390: fix irq state tracingHeiko Carstens
With commit 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing") common code calls arch_cpu_idle() with a lockdep state that tells irqs are on. This doesn't work very well for s390: psw_idle() will enable interrupts to wait for an interrupt. As soon as an interrupt occurs the interrupt handler will verify if the old context was psw_idle(). If that is the case the interrupt enablement bits in the old program status word will be cleared. A subsequent test in both the external as well as the io interrupt handler checks if in the old context interrupts were enabled. Due to the above patching of the old program status word it is assumed the old context had interrupts disabled, and therefore a call to TRACE_IRQS_OFF (aka trace_hardirqs_off_caller) is skipped. Which in turn makes lockdep incorrectly "think" that interrupts are enabled within the interrupt handler. Fix this by unconditionally calling TRACE_IRQS_OFF when entering interrupt handlers. Also call unconditionally TRACE_IRQS_ON when leaving interrupts handlers. This leaves the special psw_idle() case, which now returns with interrupts disabled, but has an "irqs on" lockdep state. So callers of psw_idle() must adjust the state on their own, if required. This is currently only __udelay_disabled(). Fixes: 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing") Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-02s390/pci: fix CPU address in MSI for directed IRQAlexander Gordeev
The directed MSIs are delivered to CPUs whose address is written to the MSI message address. The current code assumes that a CPU logical number (as it is seen by the kernel) is also the CPU address. The above assumption is not correct, as the CPU address is rather the value returned by STAP instruction. That value does not necessarily match the kernel logical CPU number. Fixes: e979ce7bced2 ("s390/pci: provide support for CPU directed interrupts") Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-12-02vhost_vdpa: return -EFAULT if copy_to_user() failsDan Carpenter
The copy_to_user() function returns the number of bytes remaining to be copied but this should return -EFAULT to the user. Fixes: 1b48dc03e575 ("vhost: vdpa: report iova range") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8c32z5EtDsMyyIL@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2020-12-02vdpa: mlx5: fix vdpa/vhost dependenciesRandy Dunlap
drivers/vdpa/mlx5/ uses vhost_iotlb*() interfaces, so select VHOST_IOTLB to make them be built. However, if VHOST_IOTLB is the only VHOST symbol that is set/enabled, the object file still won't be built because drivers/Makefile won't descend into drivers/vhost/ to build it, so make drivers/Makefile build the needed binary whenever VHOST_IOTLB is set, like it does for VHOST_RING. Fixes these build errors: ERROR: modpost: "vhost_iotlb_itree_next" [drivers/vdpa/mlx5/mlx5_vdpa.ko] undefined! ERROR: modpost: "vhost_iotlb_itree_first" [drivers/vdpa/mlx5/mlx5_vdpa.ko] undefined! Fixes: 29064bfdabd5 ("vdpa/mlx5: Add support library for mlx5 VDPA implementation") Fixes: aff90770e54c ("vdpa/mlx5: Fix dependency on MLX5_CORE") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Eli Cohen <eli@mellanox.com> Cc: Parav Pandit <parav@mellanox.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: virtualization@lists.linux-foundation.org Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20201128213905.27409-1-rdunlap@infradead.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-12-01geneve: pull IP header before ECN decapsulationEric Dumazet
IP_ECN_decapsulate() and IP6_ECN_decapsulate() assume IP header is already pulled. geneve does not ensure this yet. Fixing this generically in IP_ECN_decapsulate() and IP6_ECN_decapsulate() is not possible, since callers pass a pointer that might be freed by pskb_may_pull() syzbot reported : BUG: KMSAN: uninit-value in __INET_ECN_decapsulate include/net/inet_ecn.h:238 [inline] BUG: KMSAN: uninit-value in INET_ECN_decapsulate+0x345/0x1db0 include/net/inet_ecn.h:260 CPU: 1 PID: 8941 Comm: syz-executor.0 Not tainted 5.10.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197 __INET_ECN_decapsulate include/net/inet_ecn.h:238 [inline] INET_ECN_decapsulate+0x345/0x1db0 include/net/inet_ecn.h:260 geneve_rx+0x2103/0x2980 include/net/inet_ecn.h:306 geneve_udp_encap_recv+0x105c/0x1340 drivers/net/geneve.c:377 udp_queue_rcv_one_skb+0x193a/0x1af0 net/ipv4/udp.c:2093 udp_queue_rcv_skb+0x282/0x1050 net/ipv4/udp.c:2167 udp_unicast_rcv_skb net/ipv4/udp.c:2325 [inline] __udp4_lib_rcv+0x399d/0x5880 net/ipv4/udp.c:2394 udp_rcv+0x5c/0x70 net/ipv4/udp.c:2564 ip_protocol_deliver_rcu+0x572/0xc50 net/ipv4/ip_input.c:204 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip_local_deliver+0x583/0x8d0 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:449 [inline] ip_rcv_finish net/ipv4/ip_input.c:428 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip_rcv+0x5c3/0x840 net/ipv4/ip_input.c:539 __netif_receive_skb_one_core net/core/dev.c:5315 [inline] __netif_receive_skb+0x1ec/0x640 net/core/dev.c:5429 process_backlog+0x523/0xc10 net/core/dev.c:6319 napi_poll+0x420/0x1010 net/core/dev.c:6763 net_rx_action+0x35c/0xd40 net/core/dev.c:6833 __do_softirq+0x1a9/0x6fa kernel/softirq.c:298 asm_call_irq_on_stack+0xf/0x20 </IRQ> __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline] run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline] do_softirq_own_stack+0x6e/0x90 arch/x86/kernel/irq_64.c:77 do_softirq kernel/softirq.c:343 [inline] __local_bh_enable_ip+0x184/0x1d0 kernel/softirq.c:195 local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32 rcu_read_unlock_bh include/linux/rcupdate.h:730 [inline] __dev_queue_xmit+0x3a9b/0x4520 net/core/dev.c:4167 dev_queue_xmit+0x4b/0x60 net/core/dev.c:4173 packet_snd net/packet/af_packet.c:2992 [inline] packet_sendmsg+0x86f9/0x99d0 net/packet/af_packet.c:3017 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] __sys_sendto+0x9dc/0xc80 net/socket.c:1992 __do_sys_sendto net/socket.c:2004 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:2000 __x64_sys_sendto+0x6e/0x90 net/socket.c:2000 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 2d07dc79fe04 ("geneve: add initial netdev driver for GENEVE tunnels") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20201201090507.4137906-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01inet_ecn: Fix endianness of checksum update when setting ECT(1)Toke Høiland-Jørgensen
When adding support for propagating ECT(1) marking in IP headers it seems I suffered from endianness-confusion in the checksum update calculation: In fact the ECN field is in the *lower* bits of the first 16-bit word of the IP header when calculating in network byte order. This means that the addition performed to update the checksum field was wrong; let's fix that. Fixes: b723748750ec ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040") Reported-by: Jonathan Morton <chromatix99@gmail.com> Tested-by: Pete Heist <pete@heistp.net> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20201130183705.17540-1-toke@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01Merge tag '5.10-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Two smb3 fixes for stable" * tag '5.10-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix potential use-after-free in cifs_echo_request() cifs: allow syscalls to be restarted in __smb_send_rqst()
2020-12-01Merge tag 'trace-v5.10-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Use correct timestamp variable for ring buffer write stamp update - Fix up before stamp and write stamp when crossing ring buffer sub buffers - Keep a zero delta in ring buffer in slow path if cmpxchg fails - Fix trace_printk static buffer for archs that care - Fix ftrace record accounting for ftrace ops with trampolines - Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency - Remove WARN_ON in hwlat tracer that triggers on something that is OK - Make "my_tramp" trampoline in ftrace direct sample code global - Fixes in the bootconfig tool for better alignment management * tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ring-buffer: Always check to put back before stamp when crossing pages ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency ftrace: Fix updating FTRACE_FL_TRAMP tracing: Fix alignment of static buffer tracing: Remove WARN_ON in start_thread() samples/ftrace: Mark my_tramp[12]? global ring-buffer: Set the right timestamp in the slow path of __rb_reserve_next() ring-buffer: Update write stamp with the correct ts docs: bootconfig: Update file format on initrd image tools/bootconfig: Align the bootconfig applied initrd image size to 4 tools/bootconfig: Fix to check the write failure correctly tools/bootconfig: Fix errno reference after printf()
2020-12-01tipc: fix incompatible mtu of transmissionHoang Le
In commit 682cd3cf946b6 ("tipc: confgiure and apply UDP bearer MTU on running links"), we introduced a function to change UDP bearer MTU and applied this new value across existing per-link. However, we did not apply this new MTU value at node level. This lead to packet dropped at link level if its size is greater than new MTU value. To fix this issue, we also apply this new MTU value for node level. Fixes: 682cd3cf946b6 ("tipc: confgiure and apply UDP bearer MTU on running links") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Link: https://lore.kernel.org/r/20201130025544.3602-1-hoang.h.le@dektech.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01fs: 9p: add generic splice_write file operationDominique Martinet
The default splice operations got removed recently, add it back to 9p with iter_file_splice_write like many other filesystems do. Link: http://lkml.kernel.org/r/1606837496-21717-1-git-send-email-asmadeus@codewreck.org Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-12-01Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull vhost fixes from Michael Tsirkin: "A couple of minor fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost-vdpa: fix page pinning leakage in error path (rework) vringh: fix vringh_iov_push_*() documentation vhost scsi: fix lun reset completion handling
2020-12-01net: broadcom CNIC: requires MMURandy Dunlap
The CNIC kconfig symbol selects UIO and UIO depends on MMU. Since 'select' does not follow dependency chains, add the same MMU dependency to CNIC. Quietens this kconfig warning: WARNING: unmet direct dependencies detected for UIO Depends on [n]: MMU [=n] Selected by [m]: - CNIC [=m] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_BROADCOM [=y] && PCI [=y] && (IPV6 [=m] || IPV6 [=m]=n) Fixes: adfc5217e9db ("broadcom: Move the Broadcom drivers") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Rasesh Mody <rmody@marvell.com> Cc: GR-Linux-NIC-Dev@marvell.com Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20201129070843.3859-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01Merge branch 'ibmvnic-Bug-fixes-for-queue-descriptor-processing'David S. Miller
Thomas Falcon says: ==================== ibmvnic: Bug fixes for queue descriptor processing This series resolves a few issues in the ibmvnic driver's RX buffer and TX completion processing. The first patch includes memory barriers to synchronize queue descriptor reads. The second patch fixes a memory leak that could occur if the device returns a TX completion with an error code in the descriptor, in which case the respective socket buffer and other relevant data structures may not be freed or updated properly. v3: Correct length of Fixes tags, requested by Jakub Kicinski v2: Provide more detailed comments explaining specifically what reads are being ordered, suggested by Michael Ellerman ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-01ibmvnic: Fix TX completion error handlingThomas Falcon
TX completions received with an error return code are not being processed properly. When an error code is seen, do not proceed to the next completion before cleaning up the existing entry's data structures. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-01ibmvnic: Ensure that SCRQ entry reads are correctly orderedThomas Falcon
Ensure that received Subordinate Command-Response Queue (SCRQ) entries are properly read in order by the driver. These queues are used in the ibmvnic device to process RX buffer and TX completion descriptors. dma_rmb barriers have been added after checking for a pending descriptor to ensure the correct descriptor entry is checked and after reading the SCRQ descriptor to ensure the entire descriptor is read before processing. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-01fs: 9p: add generic splice_read file operationsToke Høiland-Jørgensen
The v9fs file operations were missing the splice_read operations, which breaks sendfile() of files on such a filesystem. I discovered this while trying to load an eBPF program using iproute2 inside a 'virtme' environment which uses 9pfs for the virtual file system. iproute2 relies on sendfile() with an AF_ALG socket to hash files, which was erroring out in the virtual environment. Since generic_file_splice_read() seems to just implement splice_read in terms of the read_iter operation, I simply added the generic implementation to the file operations, which fixed the error I was seeing. A quick grep indicates that this is what most other file systems do as well. Link: http://lkml.kernel.org/r/20201201135409.55510-1-toke@redhat.com Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2020-11-30docs: bootconfig: Add the endianness of fieldsMasami Hiramatsu
Add a description about the endianness of the size and the checksum fields. Those must be stored as le32 instead of u32. This will allow us to apply bootconfig to the cross build initrd without caring the endianness. Link: https://lkml.kernel.org/r/160583936246.547349.10964204130590955409.stgit@devnote2 Reported-by: Steven Rostedt <rostedt@goodmis.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30tools/bootconfig: Store size and checksum in footer as le32Masami Hiramatsu
Store the size and the checksum fields in the footer as le32 instead of u32. This will allow us to apply bootconfig to the cross build initrd without caring the endianness. Link: https://lkml.kernel.org/r/160583935332.547349.5897811300636587426.stgit@devnote2 Reported-by: Steven Rostedt <rostedt@goodmis.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30bootconfig: Load size and checksum in the footer as le32Masami Hiramatsu
Load the size and the checksum fields in the footer as le32 instead of u32. This will allow us to apply bootconfig to the cross build initrd without caring the endianness. Link: https://lkml.kernel.org/r/160583934457.547349.10504070298990791074.stgit@devnote2 Reported-by: Steven Rostedt <rostedt@goodmis.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30ring-buffer: Always check to put back before stamp when crossing pagesSteven Rostedt (VMware)
The current ring buffer logic checks to see if the updating of the event buffer was interrupted, and if it is, it will try to fix up the before stamp with the write stamp to make them equal again. This logic is flawed, because if it is not interrupted, the two are guaranteed to be different, as the current event just updated the before stamp before allocation. This guarantees that the next event (this one or another interrupting one) will think it interrupted the time updates of a previous event and inject an absolute time stamp to compensate. The correct logic is to always update the timestamps when traversing to a new sub buffer. Cc: stable@vger.kernel.org Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30Merge tag 'linux-can-fixes-for-5.10-20201130' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2020-11-30 The first patch is by me an target the tcan4x5x bindings for the m_can driver. It fixes the error path in the tcan4x5x_can_probe() function. The next two patches are by Jeroen Hofstee and makes the lost of arbitration error counters of sja1000 and the sun4i drivers consistent with the other drivers. Zhang Qilong contributes two patch that clean up the error path in the c_can and kvaser_pciefd drivers. * tag 'linux-can-fixes-for-5.10-20201130' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: kvaser_pciefd: kvaser_pciefd_open(): fix error handling can: c_can: c_can_power_up(): fix error handling can: sun4i_can: sun4i_can_err(): don't count arbitration lose as an error can: sja1000: sja1000_err(): don't count arbitration lose as an error can: m_can: tcan4x5x_can_probe(): fix error path: remove erroneous clk_disable_unprepare() ==================== Link: https://lore.kernel.org/r/20201130125307.218258-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-30ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependencyNaveen N. Rao
DYNAMIC_FTRACE_WITH_DIRECT_CALLS should depend on DYNAMIC_FTRACE_WITH_REGS since we need ftrace_regs_caller(). Link: https://lkml.kernel.org/r/fc4b257ea8689a36f086d2389a9ed989496ca63a.1606412433.git.naveen.n.rao@linux.vnet.ibm.com Cc: stable@vger.kernel.org Fixes: 763e34e74bb7d5c ("ftrace: Add register_ftrace_direct()") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30ftrace: Fix updating FTRACE_FL_TRAMPNaveen N. Rao
On powerpc, kprobe-direct.tc triggered FTRACE_WARN_ON() in ftrace_get_addr_new() followed by the below message: Bad trampoline accounting at: 000000004222522f (wake_up_process+0xc/0x20) (f0000001) The set of steps leading to this involved: - modprobe ftrace-direct-too - enable_probe - modprobe ftrace-direct - rmmod ftrace-direct <-- trigger The problem turned out to be that we were not updating flags in the ftrace record properly. From the above message about the trampoline accounting being bad, it can be seen that the ftrace record still has FTRACE_FL_TRAMP set though ftrace-direct module is going away. This happens because we are checking if any ftrace_ops has the FTRACE_FL_TRAMP flag set _before_ updating the filter hash. The fix for this is to look for any _other_ ftrace_ops that also needs FTRACE_FL_TRAMP. Link: https://lkml.kernel.org/r/56c113aa9c3e10c19144a36d9684c7882bf09af5.1606412433.git.naveen.n.rao@linux.vnet.ibm.com Cc: stable@vger.kernel.org Fixes: a124692b698b0 ("ftrace: Enable trampoline when rec count returns back to one") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30tracing: Fix alignment of static bufferMinchan Kim
With 5.9 kernel on ARM64, I found ftrace_dump output was broken but it had no problem with normal output "cat /sys/kernel/debug/tracing/trace". With investigation, it seems coping the data into temporal buffer seems to break the align binary printf expects if the static buffer is not aligned with 4-byte. IIUC, get_arg in bstr_printf expects that args has already right align to be decoded and seq_buf_bprintf says ``the arguments are saved in a 32bit word array that is defined by the format string constraints``. So if we don't keep the align under copy to temporal buffer, the output will be broken by shifting some bytes. This patch fixes it. Link: https://lkml.kernel.org/r/20201125225654.1618966-1-minchan@kernel.org Cc: <stable@vger.kernel.org> Fixes: 8e99cf91b99bb ("tracing: Do not allocate buffer in trace_find_next_entry() in atomic") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30tracing: Remove WARN_ON in start_thread()Vasily Averin
This patch reverts commit 978defee11a5 ("tracing: Do a WARN_ON() if start_thread() in hwlat is called when thread exists") .start hook can be legally called several times if according tracer is stopped screen window 1 [root@localhost ~]# echo 1 > /sys/kernel/tracing/events/kmem/kfree/enable [root@localhost ~]# echo 1 > /sys/kernel/tracing/options/pause-on-trace [root@localhost ~]# less -F /sys/kernel/tracing/trace screen window 2 [root@localhost ~]# cat /sys/kernel/debug/tracing/tracing_on 0 [root@localhost ~]# echo hwlat > /sys/kernel/debug/tracing/current_tracer [root@localhost ~]# echo 1 > /sys/kernel/debug/tracing/tracing_on [root@localhost ~]# cat /sys/kernel/debug/tracing/tracing_on 0 [root@localhost ~]# echo 2 > /sys/kernel/debug/tracing/tracing_on triggers warning in dmesg: WARNING: CPU: 3 PID: 1403 at kernel/trace/trace_hwlat.c:371 hwlat_tracer_start+0xc9/0xd0 Link: https://lkml.kernel.org/r/bd4d3e70-400d-9c82-7b73-a2d695e86b58@virtuozzo.com Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 978defee11a5 ("tracing: Do a WARN_ON() if start_thread() in hwlat is called when thread exists") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30samples/ftrace: Mark my_tramp[12]? globalSami Tolvanen
my_tramp[12]? are declared as global functions in C, but they are not marked global in the inline assembly definition. This mismatch confuses Clang's Control-Flow Integrity checking. Fix the definitions by adding .globl. Link: https://lkml.kernel.org/r/20201113183414.1446671-1-samitolvanen@google.com Fixes: 9d907f1ae80b8 ("ftrace/samples: Add a sample module that implements modify_ftrace_direct()") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30vxlan: Copy needed_tailroom from lowerdevSven Eckelmann
While vxlan doesn't need any extra tailroom, the lowerdev might need it. In that case, copy it over to reduce the chance for additional (re)allocations in the transmit path. Signed-off-by: Sven Eckelmann <sven@narfation.org> Link: https://lore.kernel.org/r/20201126125247.1047977-2-sven@narfation.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-30vxlan: Add needed_headroom for lower deviceSven Eckelmann
It was observed that sending data via batadv over vxlan (on top of wireguard) reduced the performance massively compared to raw ethernet or batadv on raw ethernet. A check of perf data showed that the vxlan_build_skb was calling all the time pskb_expand_head to allocate enough headroom for: min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len + VXLAN_HLEN + iphdr_len; But the vxlan_config_apply only requested needed headroom for: lowerdev->hard_header_len + VXLAN6_HEADROOM or VXLAN_HEADROOM So it completely ignored the needed_headroom of the lower device. The first caller of net_dev_xmit could therefore never make sure that enough headroom was allocated for the rest of the transmit path. Cc: Annika Wickert <annika.wickert@exaring.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Tested-by: Annika Wickert <aw@awlnx.space> Link: https://lore.kernel.org/r/20201126125247.1047977-1-sven@narfation.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-30chelsio/chtls: fix panic during unload reload chtlsVinay Kumar Yadav
there is kernel panic in inet_twsk_free() while chtls module unload when socket is in TIME_WAIT state because sk_prot_creator was not preserved on connection socket. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Udai Sharma <udai.sharma@chelsio.com> Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Link: https://lore.kernel.org/r/20201125214913.16938-1-vinay.yadav@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01gfs2: Fix deadlock between gfs2_{create_inode,inode_lookup} and delete_work_funcAndreas Gruenbacher
In gfs2_create_inode and gfs2_inode_lookup, make sure to cancel any pending delete work before taking the inode glock. Otherwise, gfs2_cancel_delete_work may block waiting for delete_work_func to complete, and delete_work_func may block trying to acquire the inode glock in gfs2_inode_lookup. Reported-by: Alexander Aring <aahringo@redhat.com> Fixes: a0e3cc65fa29 ("gfs2: Turn gl_delete into a delayed work") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-11-30cifs: fix potential use-after-free in cifs_echo_request()Paulo Alcantara
This patch fixes a potential use-after-free bug in cifs_echo_request(). For instance, thread 1 -------- cifs_demultiplex_thread() clean_demultiplex_info() kfree(server) thread 2 (workqueue) -------- apic_timer_interrupt() smp_apic_timer_interrupt() irq_exit() __do_softirq() run_timer_softirq() call_timer_fn() cifs_echo_request() <- use-after-free in server ptr Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2020-11-30cifs: allow syscalls to be restarted in __smb_send_rqst()Paulo Alcantara
A customer has reported that several files in their multi-threaded app were left with size of 0 because most of the read(2) calls returned -EINTR and they assumed no bytes were read. Obviously, they could have fixed it by simply retrying on -EINTR. We noticed that most of the -EINTR on read(2) were due to real-time signals sent by glibc to process wide credential changes (SIGRT_1), and its signal handler had been established with SA_RESTART, in which case those calls could have been automatically restarted by the kernel. Let the kernel decide to whether or not restart the syscalls when there is a signal pending in __smb_send_rqst() by returning -ERESTARTSYS. If it can't, it will return -EINTR anyway. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2020-11-30ring-buffer: Set the right timestamp in the slow path of __rb_reserve_next()Andrea Righi
In the slow path of __rb_reserve_next() a nested event(s) can happen between evaluating the timestamp delta of the current event and updating write_stamp via local_cmpxchg(); in this case the delta is not valid anymore and it should be set to 0 (same timestamp as the interrupting event), since the event that we are currently processing is not the last event in the buffer. Link: https://lkml.kernel.org/r/X8IVJcp1gRE+FJCJ@xps-13-7390 Cc: Ingo Molnar <mingo@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Link: https://lwn.net/Articles/831207 Fixes: a389d86f7fd0 ("ring-buffer: Have nested events still record running time stamp") Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30ring-buffer: Update write stamp with the correct tsSteven Rostedt (VMware)
The write stamp, used to calculate deltas between events, was updated with the stale "ts" value in the "info" structure, and not with the updated "ts" variable. This caused the deltas between events to be inaccurate, and when crossing into a new sub buffer, had time go backwards. Link: https://lkml.kernel.org/r/20201124223917.795844-1-elavila@google.com Cc: stable@vger.kernel.org Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp") Reported-by: "J. Avila" <elavila@google.com> Tested-by: Daniel Mentz <danielmentz@google.com> Tested-by: Will McVicker <willmcvicker@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-11-30arm64: mte: Fix typo in macro definitionVincenzo Frascino
UL in the definition of SYS_TFSR_EL1_TF1 was misspelled causing compilation issues when trying to implement in kernel MTE async mode. Fix the macro correcting the typo. Note: MTE async mode will be introduced with a future series. Fixes: c058b1c4a5ea ("arm64: mte: system register definitions") Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20201130170709.22309-1-vincenzo.frascino@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: fix EL1 debug transitionsMark Rutland
In debug_exception_enter() and debug_exception_exit() we trace hardirqs on/off while RCU isn't guaranteed to be watching, and we don't save and restore the hardirq state, and so may return with this having changed. Handle this appropriately with new entry/exit helpers which do the bare minimum to ensure this is appropriately maintained, without marking debug exceptions as NMIs. These are placed in entry-common.c with the other entry/exit helpers. In future we'll want to reconsider whether some debug exceptions should be NMIs, but this will require a significant refactoring, and for now this should prevent issues with lockdep and RCU. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marins <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-12-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: fix NMI {user, kernel}->kernel transitionsMark Rutland
Exceptions which can be taken at (almost) any time are consdiered to be NMIs. On arm64 that includes: * SDEI events * GICv3 Pseudo-NMIs * Kernel stack overflows * Unexpected/unhandled exceptions ... but currently debug exceptions (BRKs, breakpoints, watchpoints, single-step) are not considered NMIs. As these can be taken at any time, kernel features (lockdep, RCU, ftrace) may not be in a consistent kernel state. For example, we may take an NMI from the idle code or partway through an entry/exit path. While nmi_enter() and nmi_exit() handle most of this state, notably they don't save/restore the lockdep state across an NMI being taken and handled. When interrupts are enabled and an NMI is taken, lockdep may see interrupts become disabled within the NMI code, but not see interrupts become enabled when returning from the NMI, leaving lockdep believing interrupts are disabled when they are actually disabled. The x86 code handles this in idtentry_{enter,exit}_nmi(), which will shortly be moved to the generic entry code. As we can't use either yet, we copy the x86 approach in arm64-specific helpers. All the NMI entrypoints are marked as noinstr to prevent any instrumentation handling code being invoked before the state has been corrected. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-11-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: fix non-NMI kernel<->kernel transitionsMark Rutland
There are periods in kernel mode when RCU is not watching and/or the scheduler tick is disabled, but we can still take exceptions such as interrupts. The arm64 exception handlers do not account for this, and it's possible that RCU is not watching while an exception handler runs. The x86/generic entry code handles this by ensuring that all (non-NMI) kernel exception handlers call irqentry_enter() and irqentry_exit(), which handle RCU, lockdep, and IRQ flag tracing. We can't yet move to the generic entry code, and already hadnle the user<->kernel transitions elsewhere, so we add new kernel<->kernel transition helpers alog the lines of the generic entry code. Since we now track interrupts becoming masked when an exception is taken, local_daif_inherit() is modified to track interrupts becoming re-enabled when the original context is inherited. To balance the entry/exit paths, each handler masks all DAIF exceptions before exit_to_kernel_mode(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-10-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: ptrace: prepare for EL1 irq/rcu trackingMark Rutland
Exceptions from EL1 may be taken when RCU isn't watching (e.g. in idle sequences), or when the lockdep hardirqs transiently out-of-sync with the hardware state (e.g. in the middle of local_irq_enable()). To correctly handle these cases, we'll need to save/restore this state across some exceptions taken from EL1. A series of subsequent patches will update EL1 exception handlers to handle this. In preparation for this, and to avoid dependencies between those patches, this patch adds two new fields to struct pt_regs so that exception handlers can track this state. Note that this is placed in pt_regs as some entry/exit sequences such as el1_irq are invoked from assembly, which makes it very difficult to add a separate structure as with the irqentry_state used by x86. We can separate this once more of the exception logic is moved to C. While the fields only need to be bool, they are both made u64 to keep pt_regs 16-byte aligned. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-9-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: fix non-NMI user<->kernel transitionsMark Rutland
When built with PROVE_LOCKING, NO_HZ_FULL, and CONTEXT_TRACKING_FORCE will WARN() at boot time that interrupts are enabled when we call context_tracking_user_enter(), despite the DAIF flags indicating that IRQs are masked. The problem is that we're not tracking IRQ flag changes accurately, and so lockdep believes interrupts are enabled when they are not (and vice-versa). We can shuffle things so to make this more accurate. For kernel->user transitions there are a number of constraints we need to consider: 1) When we call __context_tracking_user_enter() HW IRQs must be disabled and lockdep must be up-to-date with this. 2) Userspace should be treated as having IRQs enabled from the PoV of both lockdep and tracing. 3) As context_tracking_user_enter() stops RCU from watching, we cannot use RCU after calling it. 4) IRQ flag tracing and lockdep have state that must be manipulated before RCU is disabled. ... with similar constraints applying for user->kernel transitions, with the ordering reversed. The generic entry code has enter_from_user_mode() and exit_to_user_mode() helpers to handle this. We can't use those directly, so we add arm64 copies for now (without the instrumentation markers which aren't used on arm64). These replace the existing user_exit() and user_exit_irqoff() calls spread throughout handlers, and the exception unmasking is left as-is. Note that: * The accounting for debug exceptions from userspace now happens in el0_dbg() and ret_to_user(), so this is removed from debug_exception_enter() and debug_exception_exit(). As user_exit_irqoff() wakes RCU, the userspace-specific check is removed. * The accounting for syscalls now happens in el0_svc(), el0_svc_compat(), and ret_to_user(), so this is removed from el0_svc_common(). This does not adversely affect the workaround for erratum 1463225, as this does not depend on any of the state tracking. * In ret_to_user() we mask interrupts with local_daif_mask(), and so we need to inform lockdep and tracing. Here a trace_hardirqs_off() is sufficient and safe as we have not yet exited kernel context and RCU is usable. * As PROVE_LOCKING selects TRACE_IRQFLAGS, the ifdeferry in entry.S only needs to check for the latter. * EL0 SError handling will be dealt with in a subsequent patch, as this needs to be treated as an NMI. Prior to this patch, booting an appropriately-configured kernel would result in spats as below: | DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled()) | WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5280 check_flags.part.54+0x1dc/0x1f0 | Modules linked in: | CPU: 2 PID: 1 Comm: init Not tainted 5.10.0-rc3 #3 | Hardware name: linux,dummy-virt (DT) | pstate: 804003c5 (Nzcv DAIF +PAN -UAO -TCO BTYPE=--) | pc : check_flags.part.54+0x1dc/0x1f0 | lr : check_flags.part.54+0x1dc/0x1f0 | sp : ffff80001003bd80 | x29: ffff80001003bd80 x28: ffff66ce801e0000 | x27: 00000000ffffffff x26: 00000000000003c0 | x25: 0000000000000000 x24: ffffc31842527258 | x23: ffffc31842491368 x22: ffffc3184282d000 | x21: 0000000000000000 x20: 0000000000000001 | x19: ffffc318432ce000 x18: 0080000000000000 | x17: 0000000000000000 x16: ffffc31840f18a78 | x15: 0000000000000001 x14: ffffc3184285c810 | x13: 0000000000000001 x12: 0000000000000000 | x11: ffffc318415857a0 x10: ffffc318406614c0 | x9 : ffffc318415857a0 x8 : ffffc31841f1d000 | x7 : 647261685f706564 x6 : ffffc3183ff7c66c | x5 : ffff66ce801e0000 x4 : 0000000000000000 | x3 : ffffc3183fe00000 x2 : ffffc31841500000 | x1 : e956dc24146b3500 x0 : 0000000000000000 | Call trace: | check_flags.part.54+0x1dc/0x1f0 | lock_is_held_type+0x10c/0x188 | rcu_read_lock_sched_held+0x70/0x98 | __context_tracking_enter+0x310/0x350 | context_tracking_enter.part.3+0x5c/0xc8 | context_tracking_user_enter+0x6c/0x80 | finish_ret_to_user+0x2c/0x13cr Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: move el1 irq/nmi logic to CMark Rutland
In preparation for reworking the EL1 irq/nmi entry code, move the existing logic to C. We no longer need the asm_nmi_enter() and asm_nmi_exit() wrappers, so these are removed. The new C functions are marked noinstr, which prevents compiler instrumentation and runtime probing. In subsequent patches we'll want the new C helpers to be called in all cases, so we don't bother wrapping the calls with ifdeferry. Even when the new C functions are stubs the trivial calls are unlikely to have a measurable impact on the IRQ or NMI paths anyway. Prototypes are added to <asm/exception.h> as otherwise (in some configurations) GCC will complain about the lack of a forward declaration. We already do this for existing function, e.g. enter_from_user_mode(). The new helpers are marked as noinstr (which prevents all instrumentation, tracing, and kprobes). Otherwise, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>