linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2024-11-07	hrtimers: Introduce hrtimer_setup_sleeper_on_stack()	Nam Cao
	The hrtimer_init() API is replaced by hrtimer_setup() variants to initialize the timer including the callback function at once. hrtimer_init_sleeper_on_stack() does not need user to setup the callback function separately, so a new variant would not be strictly necessary. Nonetheless, to keep the naming convention consistent, introduce hrtimer_setup_sleeper_on_stack(). hrtimer_init_on_stack() will be removed once all users are converted. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/7b5e18e6dd0ace9eaa211201528cb9dc23752454.1730386209.git.namcao@linutronix.de
2024-11-07	hrtimers: Introduce hrtimer_setup_on_stack()	Nam Cao
	To initialize hrtimer on stack, hrtimer_init_on_stack() needs to be called and also hrtimer::function must be set. This is error-prone and awkward to use. Introduce hrtimer_setup_on_stack() which does both of these things, so that users of hrtimer can be simplified. The new setup function also has a sanity check for the provided function pointer. If NULL, a warning is emitted and a dummy callback installed. hrtimer_init_on_stack() will be removed as soon as all of its users have been converted to the new function. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/4b05e2ab3a82c517adf67fabc0f0cd8fe118b97c.1730386209.git.namcao@linutronix.de
2024-11-07	hrtimers: Introduce hrtimer_setup() to replace hrtimer_init()	Nam Cao
	To initialize hrtimer, hrtimer_init() needs to be called and also hrtimer::function must be set. This is error-prone and awkward to use. Introduce hrtimer_setup() which does both of these things, so that users of hrtimer can be simplified. The new setup function also has a sanity check for the provided function pointer. If NULL, a warning is emitted and a dummy callback installed. hrtimer_init() will be removed as soon as all of its users have been converted to the new function. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/5057c1ddbfd4b92033cd93d37fe38e6b069d5ba6.1730386209.git.namcao@linutronix.de
2024-11-07	io_uring: Remove redundant hrtimer's callback function setup	Nam Cao
	The IORING_OP_TIMEOUT command uses hrtimer underneath. The timer's callback function is setup in io_timeout(), and then the callback function is setup again when the timer is rearmed. Since the callback function is the same for both cases, the latter setup is redundant, therefore remove it. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jens Axboe <axboe@kernel.dk: Link: https://lore.kernel.org/all/07b28dfd5691478a2d250f379c8b90dd37f9bb9a.1730386209.git.namcao@linutronix.de
2024-11-07	_RESEND_PATCH_v2_04_19_wifi_rt2x00_Remove_redundant_hrtimer_init_	Nam Cao
	rt2x00usb_probe() executes a hrtimer_init() for txstatus_timer. Afterwards, rt2x00lib_probe_dev() is called which also initializes this txstatus_timer with the same settings. Remove the redundant hrtimer_init() call in rt2x00usb_probe(). Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/all/66116057f788e18a6603d50a554417eee459e02c.1730386209.git.namcao@linutronix.de
2024-11-07	KVM: x86/xen: Initialize hrtimer in kvm_xen_init_vcpu()	Nam Cao
	The hrtimer is initialized in the KVM_XEN_VCPU_SET_ATTR ioctl. That caused problem in the past, because the hrtimer can be initialized multiple times, which was fixed by commit af735db31285 ("KVM: x86/xen: Initialize Xen timer only once"). This commit avoids initializing the timer multiple times by checking the field 'function' of struct hrtimer to determine if it has already been initialized. This is not required and in the way to make the function field private. Move the hrtimer initialization into kvm_xen_init_vcpu() so that it will only be initialized once. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/all/9c33c7224d97d08f4fa30d3cc8687981c1d3e953.1730386209.git.namcao@linutronix.de
2024-11-07	drm/i915/request: Remove unnecessary modification of hrtimer:: Function	Nam Cao
	When a request is created, the hrtimer is not initialized and only its 'function' field is set to NULL. The hrtimer is only initialized when the request is enqueued. The point of setting 'function' to NULL is that, it can be used to check whether hrtimer_try_to_cancel() should be called while retiring the request. This "trick" is unnecessary, because hrtimer_try_to_cancel() already does its own check whether the timer is armed. If the timer is not armed, hrtimer_try_to_cancel() returns 0. Fully initialize the timer when the request is created, which allows to make the hrtimer::function field private once all users of hrtimer_init() are converted to hrtimer_setup(), which requires a valid callback function to be set. Because hrtimer_try_to_cancel() returns 0 if the timer is not armed, the logic to check whether to call i915_request_put() remains equivalent. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/50f865045aa672a9730343ad131543da332b1d8d.1730386209.git.namcao@linutronix.de
2024-11-07	hrtimers: Add missing hrtimer_init() trace points	Nam Cao
	hrtimer_init*_on_stack() is not covered by tracing when CONFIG_DEBUG_OBJECTS_TIMERS=y. Rework the functions similar to hrtimer_init() and hrtimer_init_sleeper() so that the hrtimer_init() tracepoint is unconditionally available. The rework makes hrtimer_init_sleeper() unused. Delete it. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/74528e8abf2bb96e8bee85ffacbf14e15cf89f0d.1730386209.git.namcao@linutronix.de
2024-11-07	softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.	Sebastian Andrzej Siewior
	The timer and hrtimer soft interrupts are raised in hard interrupt context. With threaded interrupts force enabled or on PREEMPT_RT this leads to waking the ksoftirqd for the processing of the soft interrupt. ksoftirqd runs as SCHED_OTHER task which means it will compete with other tasks for CPU resources. This can introduce long delays for timer processing on heavy loaded systems and is not desired. Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated timers thread and let it run at the lowest SCHED_FIFO priority. Wake-ups for RT tasks happen from hardirq context so only timer_list timers and hrtimers for "regular" tasks are processed here. The higher priority ensures that wakeups are performed before scheduling SCHED_OTHER tasks. Using a dedicated variable to store the pending softirq bits values ensure that the timer are not accidentally picked up by ksoftirqd and other threaded interrupts. It shouldn't be picked up by ksoftirqd since it runs at lower priority. However if ksoftirqd is already running while a timer fires, then ksoftird will be PI-boosted due to the BH-lock to ktimer's priority. The timer thread can pick up pending softirqs from ksoftirqd but only if the softirq load is high. It is not be desired that the picked up softirqs are processed at SCHED_FIFO priority under high softirq load but this can already happen by a PI-boost by a force-threaded interrupt. [ frederic@kernel.org: rcutorture.c fixes, storm fix by introduction of local_timers_pending() for tick_nohz_next_event() ] [ junxiao.chang@intel.com: Ensure ktimersd gets woken up even if a softirq is currently served. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> [rcutorture] Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241106150419.2593080-4-bigeasy@linutronix.de
2024-11-07	timers: Use __raise_softirq_irqoff() to raise the softirq.	Sebastian Andrzej Siewior
	Raising the timer soft interrupt is always done from hard interrupt context, so it can be reduced to just setting the TIMER soft interrupt flag. The soft interrupt will be invoked on return from interrupt. Use therefore __raise_softirq_irqoff() to raise the TIMER soft interrupt, which is a trivial optimization. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241106150419.2593080-3-bigeasy@linutronix.de
2024-11-07	hrtimer: Use __raise_softirq_irqoff() to raise the softirq	Sebastian Andrzej Siewior
	Raising the hrtimer soft interrupt is always done from hard interrupt context, so it can be reduced to just setting the HRTIMER soft interrupt flag. The soft interrupt will be invoked on return from interrupt. Use therefore __raise_softirq_irqoff() to raise the HRTIMER soft interrupt, which is a trivial optimization. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241106150419.2593080-2-bigeasy@linutronix.de
2024-11-06	Merge branch 'bnxt_en-ethtool-improve-wildcard-l4proto-on-ip4-ip6-ntuple-rules'	Jakub Kicinski
	Daniel Xu says: ==================== bnxt_en: ethtool: Improve wildcard l4proto on ip4/ip6 ntuple rules This patchset improves wildcarding over l4proto on ip4 and ip6 nutple rules. Previous support required setting l4proto explicitly to 0xFF if you wanted wildcard, which ethtool (naturally) did not do. For example, this command would fail with -EOPNOSUPP: ethtool -N eth0 flow-type ip6 dst-ip $IP6 context 1 After this patchset, only TCP, UDP, ICMP, and unset will be supported for l4proto. ==================== Link: https://patch.msgid.link/cover.1730778566.git.dxu@dxuuu.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	bnxt_en: ethtool: Support unset l4proto on ip4/ip6 ntuple rules	Daniel Xu
	Previously, trying to insert an ip4/ip6 ntuple rule with an unset l4proto would get rejected with -EOPNOTSUPP. For example, the following would fail: ethtool -N eth0 flow-type ip6 dst-ip $IP6 context 1 The reason was that all the l4proto validation was being run despite the l4proto mask being set to 0x0. Fix by respecting the mask on l4proto and treating a mask of 0x0 as wildcard l4proto. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/1ac93a2836b25f79e7045f8874d9a17875229ffc.1730778566.git.dxu@dxuuu.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	bnxt_en: ethtool: Remove ip4/ip6 ntuple support for IPPROTO_RAW	Daniel Xu
	Commit 9ba0e56199e3 ("bnxt_en: Enhance ethtool ntuple support for ip flows besides TCP/UDP") added support for ip4/ip6 ntuple rules. However, if you wanted to wildcard over l4proto, you had to provide 0xFF. The choice of 0xFF is non-standard and non-intuitive. Delete support for it in this commit. Next commit we will introduce a cleaner way to wildcard l4proto. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/a5ba0d3bd926d27977c317efa7fdfbc8a704d2b8.1730778566.git.dxu@dxuuu.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: enetc: Fix spelling mistake "referencce" -> "reference"	Colin Ian King
	There is a spelling mistake in a dev_err message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20241105093125.1087202-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: phy: respect cached advertising when re-enabling EEE	Heiner Kallweit
	If we remove modes from EEE advertisement and disable / re-enable EEE, then advertisement is set to all supported modes. I don't think this is what the user expects. So respect the cached advertisement and just fall back to all supported modes if cached advertisement is empty. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/c75f7f8b-5571-429f-abd3-ce682d178a4b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	Merge branch 'net-add-debug-checks-to-skb_reset_xxx_header'	Jakub Kicinski
	Eric Dumazet says: ==================== net: add debug checks to skb_reset_xxx_header() Add debug checks (only enabled for CONFIG_DEBUG_NET=y builds), to catch bugs earlier. ==================== Link: https://patch.msgid.link/20241105174403.850330-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: add debug check in skb_reset_mac_header()	Eric Dumazet
	Make sure (skb->data - skb->head) can fit in skb->mac_header This needs CONFIG_DEBUG_NET=y. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20241105174403.850330-8-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: add debug check in skb_reset_network_header()	Eric Dumazet
	Make sure (skb->data - skb->head) can fit in skb->network_header This needs CONFIG_DEBUG_NET=y. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20241105174403.850330-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: add debug check in skb_reset_transport_header()	Eric Dumazet
	Make sure (skb->data - skb->head) can fit in skb->transport_header This needs CONFIG_DEBUG_NET=y. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20241105174403.850330-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: add debug check in skb_reset_inner_mac_header()	Eric Dumazet
	Make sure (skb->data - skb->head) can fit in skb->inner_mac_header This needs CONFIG_DEBUG_NET=y. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20241105174403.850330-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: add debug check in skb_reset_inner_network_header()	Eric Dumazet
	Make sure (skb->data - skb->head) can fit in skb->inner_network_header This needs CONFIG_DEBUG_NET=y. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20241105174403.850330-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: add debug check in skb_reset_inner_transport_header()	Eric Dumazet
	Make sure (skb->data - skb->head) can fit in skb->inner_transport_header This needs CONFIG_DEBUG_NET=y. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20241105174403.850330-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	net: skb_reset_mac_len() must check if mac_header was set	Eric Dumazet
	Recent discussions show that skb_reset_mac_len() should be more careful. We expect the MAC header being set. If not, clear skb->mac_len and fire a warning for CONFIG_DEBUG_NET=y builds. If after investigations we find that not having a MAC header was okay, we can remove the warning. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/netdev/CANn89iJZGH+yEfJxfPWa3Hm7jxb-aeY2Up4HufmLMnVuQXt38A@mail.gmail.com/T/ Cc: En-Wei Wu <en-wei.wu@canonical.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://patch.msgid.link/20241105174403.850330-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	Merge branch 'ipv6-fix-hangup-on-device-removal'	Jakub Kicinski
	Paolo Abeni says: ==================== ipv6: fix hangup on device removal This addresses the infamous unregister_netdevice splat in net selftests; the actual fix is carried by the first patch, while the 2nd one addresses a related problem in the relevant test that was patially hiding the problem. Targeting net-next as the issue is quite old and I feel a little lost in the fib info/nh jungle. ==================== Link: https://patch.msgid.link/cover.1730828007.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	selftests: net: really check for bg process completion	Paolo Abeni
	A recent refactor transformed the check for process completion in a true statement, due to a typo. As a result, the relevant test-case is unable to catch the regression it was supposed to detect. Restore the correct condition. Fixes: 691bb4e49c98 ("selftests: net: avoid just another constant wait") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/0e6f213811f8e93a235307e683af8225cc6277ae.1730828007.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06	ipv6: release nexthop on device removal	Paolo Abeni
	The CI is hitting some aperiodic hangup at device removal time in the pmtu.sh self-test: unregister_netdevice: waiting for veth_A-R1 to become free. Usage count = 6 ref_tracker: veth_A-R1@ffff888013df15d8 has 1/5 users at dst_init+0x84/0x4a0 dst_alloc+0x97/0x150 ip6_dst_alloc+0x23/0x90 ip6_rt_pcpu_alloc+0x1e6/0x520 ip6_pol_route+0x56f/0x840 fib6_rule_lookup+0x334/0x630 ip6_route_output_flags+0x259/0x480 ip6_dst_lookup_tail.constprop.0+0x5c2/0x940 ip6_dst_lookup_flow+0x88/0x190 udp_tunnel6_dst_lookup+0x2a7/0x4c0 vxlan_xmit_one+0xbde/0x4a50 [vxlan] vxlan_xmit+0x9ad/0xf20 [vxlan] dev_hard_start_xmit+0x10e/0x360 __dev_queue_xmit+0xf95/0x18c0 arp_solicit+0x4a2/0xe00 neigh_probe+0xaa/0xf0 While the first suspect is the dst_cache, explicitly tracking the dst owing the last device reference via probes proved such dst is held by the nexthop in the originating fib6_info. Similar to commit f5b51fe804ec ("ipv6: route: purge exception on removal"), we need to explicitly release the originating fib info when disconnecting a to-be-removed device from a live ipv6 dst: move the fib6_info cleanup into ip6_dst_ifdown(). Tested running: ./pmtu.sh cleanup_ipv6_exception in a tight loop for more than 400 iterations with no spat, running an unpatched kernel I observed a splat every ~10 iterations. Fixes: f88d8ea67fbd ("ipv6: Plumb support for nexthop object in a fib6_info") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/604c45c188c609b732286b47ac2a451a40f6cf6d.1730828007.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-07	alarmtimers: Remove return value from alarm functions	Thomas Gleixner
	Now that the SIG_IGN problem is solved in the core code, the alarmtimer callbacks do not require a return value anymore. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241105064214.318837272@linutronix.de
2024-11-07	alarmtimers: Remove the throttle mechanism from alarm_forward_now()	Thomas Gleixner
	Now that ignored posix timer signals are requeued and the timers are rearmed on signal delivery the workaround to keep such timers alive and self rearm them is not longer required. Remove the unused alarm timer parts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064214.252443020@linutronix.de
2024-11-07	posix-timers: Cleanup SIG_IGN workaround leftovers	Thomas Gleixner
	Now that ignored posix timer signals are requeued and the timers are rearmed on signal delivery the workaround to keep such timers alive and self rearm them is not longer required. Remove the relevant hacks and the not longer required return values from the related functions. The alarm timer workarounds will be cleaned up in a separate step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064214.187239060@linutronix.de
2024-11-07	signal: Queue ignored posixtimers on ignore list	Thomas Gleixner
	Queue posixtimers which have their signal ignored on the ignored list: 1) When the timer fires and the signal has SIG_IGN set 2) When SIG_IGN is installed via sigaction() and a timer signal is already queued This only happens when the signal is for a valid timer, which delivered the signal in periodic mode. One-shot timer signals are correctly dropped. Due to the lock order constraints (sighand::siglock nests inside timer::lock) the signal code cannot access any of the timer fields which are relevant to make this decision, e.g. timer::it_status. This is addressed by establishing a protection scheme which requires to lock both locks on the timer side for modifying decision fields in the timer struct and therefore makes it possible for the signal delivery to evaluate with only sighand:siglock being held: 1) Move the NULLification of timer->it_signal into the sighand::siglock protected section of timer_delete() and check timer::it_signal in the code path which determines whether the signal is dropped or queued on the ignore list. This ensures that a deleted timer cannot be moved onto the ignore list, which would prevent it from being freed on exit() as it is not longer in the process' posix timer list. If the timer got moved to the ignored list before deletion then it is removed from the ignored list under sighand lock in timer_delete(). 2) Provide a new timer::it_sig_periodic flag, which gets set in the signal queue path with both timer and sighand locks held if the timer is actually in periodic mode at expiry time. The ignore list code checks this flag under sighand::siglock and drops the signal when it is not set. If it is set, then the signal is moved to the ignored list independent of the actual state of the timer. When the signal is un-ignored later then the signal is moved back to the signal queue. On signal delivery the posix timer side decides about dropping the signal if the timer was re-armed, dis-armed or deleted based on the signal sequence counter check. If the thread/process exits then not yet delivered signals are discarded which means the reference of the timer containing the sigqueue is dropped and frees the timer. This is way cheaper than requiring all code paths to lock sighand::siglock of the target thread/process on any modification of timer::it_status or going all the way and removing pending signals from the signal queues on every rearm, disarm or delete operation. So the protection scheme here is that on the timer side both timer::lock and sighand::siglock have to be held for modifying timer::it_signal timer::it_sig_periodic which means that on the signal side holding sighand::siglock is enough to evaluate these fields. In posixtimer_deliver_signal() holding timer::lock is sufficient to do the sequence validation against timer::it_signal_seq because a concurrent expiry is waiting on timer::lock to be released. This completes the SIG_IGN handling and such timers are not longer self rearmed which avoids pointless wakeups. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064214.120756416@linutronix.de
2024-11-07	signal: Handle ignored signals in do_sigaction(action != SIG_IGN)	Thomas Gleixner
	When a real handler (including SIG_DFL) is installed for a signal, which had previously SIG_IGN set, then the list of ignored posix timers has to be checked for timers which are affected by this change. Add a list walk function which checks for the matching signal number and if found requeues the timers signal, so the timer is rearmed on signal delivery. Rearming the timer right away is not possible because that requires to drop sighand lock. No functional change as the counter part which queues the timers on the ignored list is still missing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064214.054091076@linutronix.de
2024-11-07	posix-timers: Handle ignored list on delete and exit	Thomas Gleixner
	To handle posix timer signals on sigaction(SIG_IGN) properly, the timers will be queued on a separate ignored list. Add the necessary cleanup code for timer_delete() and exit_itimers(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.987530588@linutronix.de
2024-11-07	signal: Provide ignored_posix_timers list	Thomas Gleixner
	To prepare for handling posix timer signals on sigaction(SIG_IGN) properly, add a list to task::signal. This list will be used to queue posix timers so their signal can be requeued when SIG_IGN is lifted later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.920101900@linutronix.de
2024-11-07	posix-timers: Move sequence logic into struct k_itimer	Thomas Gleixner
	The posix timer signal handling uses siginfo::si_sys_private for handling the sequence counter check. That indirection is not longer required and the sequence count value at signal queueing time can be stored in struct k_itimer itself. This removes the requirement of treating siginfo::si_sys_private special as it's now always zero as the kernel does not touch it anymore. Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lore.kernel.org/all/20241105064213.852619866@linutronix.de
2024-11-07	signal: Cleanup unused posix-timer leftovers	Thomas Gleixner
	Remove the leftovers of sigqueue preallocation as it's not longer used. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.786506636@linutronix.de
2024-11-07	posix-timers: Embed sigqueue in struct k_itimer	Thomas Gleixner
	To cure the SIG_IGN handling for posix interval timers, the preallocated sigqueue needs to be embedded into struct k_itimer to prevent life time races of all sorts. Now that the prerequisites are in place, embed the sigqueue into struct k_itimer and fixup the relevant usage sites. Aside of preparing for proper SIG_IGN handling, this spares an extra allocation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.719695194@linutronix.de
2024-11-07	signal: Replace resched_timer logic	Thomas Gleixner
	In preparation for handling ignored posix timer signals correctly and embedding the sigqueue struct into struct k_itimer, hand down a pointer to the sigqueue struct into posix_timer_deliver_signal() instead of just having a boolean flag. No functional change. Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lore.kernel.org/all/20241105064213.652658158@linutronix.de
2024-11-07	signal: Refactor send_sigqueue()	Thomas Gleixner
	To handle posix timers which have their signal ignored via SIG_IGN properly it is required to requeue a ignored signal for delivery when SIG_IGN is lifted so the timer gets rearmed. Split the required code out of send_sigqueue() so it can be reused in context of sigaction(). While at it rename send_sigqueue() to posixtimer_send_sigqueue() so its clear what this is about. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.586453412@linutronix.de
2024-11-07	posix-timers: Store PID type in the timer	Thomas Gleixner
	instead of re-evaluating the signal delivery mode everywhere. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.519086500@linutronix.de
2024-11-07	signal: Provide posixtimer_sigqueue_init()	Thomas Gleixner
	To cure the SIG_IGN handling for posix interval timers, the preallocated sigqueue needs to be embedded into struct k_itimer to prevent life time races of all sorts. Provide a new function to initialize the embedded sigqueue to prepare for that. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.450427515@linutronix.de
2024-11-07	signal: Split up __sigqueue_alloc()	Thomas Gleixner
	To cure the SIG_IGN handling for posix interval timers, the preallocated sigqueue needs to be embedded into struct k_itimer to prevent life time races of all sorts. Reorganize __sigqueue_alloc() so the ucounts retrieval and the initialization can be used independently. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.371410037@linutronix.de
2024-11-07	posix-timers: Add a refcount to struct k_itimer	Thomas Gleixner
	To cure the SIG_IGN handling for posix interval timers, the preallocated sigqueue needs to be embedded into struct k_itimer to prevent life time races of all sorts. To make that work correctly it needs reference counting so that timer deletion does not free the timer prematuraly when there is a signal queued or delivered concurrently. Add a rcuref to the posix timer part. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.304756440@linutronix.de
2024-11-07	posix-cpu-timers: Use dedicated flag for CPU timer nanosleep	Thomas Gleixner
	POSIX CPU timer nanosleep creates a k_itimer on stack and uses the sigq pointer to detect the nanosleep case in the expiry function. Prepare for embedding sigqueue into struct k_itimer by using a dedicated flag for nanosleep. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.238550394@linutronix.de
2024-11-07	posix-cpu-timers: Cleanup the firing logic	Thomas Gleixner
	The firing flag of a posix CPU timer is tristate: 0: when the timer is not about to deliver a signal 1: when the timer has expired, but the signal has not been delivered yet -1: when the timer was queued for signal delivery and a rearm operation raced against it and supressed the signal delivery. This is a pointless exercise as this can be simply expressed with a boolean. Only if set, the signal is delivered. This makes delete and rearm consistent with the rest of the posix timers. Convert firing to bool and fixup the usage sites accordingly and add comments why the timer cannot be dequeued right away. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241105064213.172848618@linutronix.de
2024-11-07	posix-timers: Make signal overrun accounting sensible	Thomas Gleixner
	The handling of the timer overrun in the signal code is inconsistent as it takes previous overruns into account. This is just wrong as after the reprogramming of a timer the overrun count starts over from a clean state, i.e. 0. Don't touch info::si_overrun in send_sigqueue() and only store the overrun value at signal delivery time, which is computed from the timer itself relative to the expiry time. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20241105064213.106738193@linutronix.de
2024-11-07	posix-timers: Make signal delivery consistent	Thomas Gleixner
	Signals of timers which are reprogammed, disarmed or deleted can deliver signals related to the past. The POSIX spec is blury about this: - "The effect of disarming or resetting a timer with pending expiration notifications is unspecified." - "The disposition of pending signals for the deleted timer is unspecified." In both cases it is reasonable to expect that pending signals are discarded. Especially in the reprogramming case it does not make sense to account for previous overruns or to deliver a signal for a timer which has been disarmed. This makes the behaviour consistent and understandable. Remove the si_sys_private check from the signal delivery code and invoke posix_timer_deliver_signal() unconditionally for posix timer related signals. Change posix_timer_deliver_signal() so it controls the actual signal delivery via the return value. It now instructs the signal code to drop the signal when: 1) The timer does not longer exist in the hash table 2) The timer signal_seq value is not the same as the si_sys_private value which was set when the signal was queued. This is also a preparatory change to embed the sigqueue into the k_itimer structure, which in turn allows to remove the si_sys_private magic. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241105064213.040348644@linutronix.de
2024-11-07	posix-cpu-timers: Correctly update timer status in posix_cpu_timer_del()	Thomas Gleixner
	If posix_cpu_timer_del() exits early due to task not found or sighand invalid, it fails to clear the state of the timer. That's harmless but inconsistent. These early exits are accounted as successful delete. Move the update of the timer state into the success return path, so all "successful" deletions are handled. Reported-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20241105064212.974053438@linutronix.de
2024-11-07	btrfs: fix the length of reserved qgroup to free	Haisu Wang
	The dealloc flag may be cleared and the extent won't reach the disk in cow_file_range when errors path. The reserved qgroup space is freed in commit 30479f31d44d ("btrfs: fix qgroup reserve leaks in cow_file_range"). However, the length of untouched region to free needs to be adjusted with the correct remaining region size. Fixes: 30479f31d44d ("btrfs: fix qgroup reserve leaks in cow_file_range") CC: stable@vger.kernel.org # 6.11+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-07	btrfs: reinitialize delayed ref list after deleting it from the list	Filipe Manana
	At insert_delayed_ref() if we need to update the action of an existing ref to BTRFS_DROP_DELAYED_REF, we delete the ref from its ref head's ref_add_list using list_del(), which leaves the ref's add_list member not reinitialized, as list_del() sets the next and prev members of the list to LIST_POISON1 and LIST_POISON2, respectively. If later we end up calling drop_delayed_ref() against the ref, which can happen during merging or when destroying delayed refs due to a transaction abort, we can trigger a crash since at drop_delayed_ref() we call list_empty() against the ref's add_list, which returns false since the list was not reinitialized after the list_del() and as a consequence we call list_del() again at drop_delayed_ref(). This results in an invalid list access since the next and prev members are set to poison pointers, resulting in a splat if CONFIG_LIST_HARDENED and CONFIG_DEBUG_LIST are set or invalid poison pointer dereferences otherwise. So fix this by deleting from the list with list_del_init() instead. Fixes: 1d57ee941692 ("btrfs: improve delayed refs iterations") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>