Document support for the Timer Unit (TMU) on the R-Mobile APE6 (R8A73A4)
SoC.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/17e3fd5f27ab540c8611545ad3dc5a697ca66c58.1716985096.git.geert+renesas@glider.be
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
Counter width of GIC is configurable and can be read from a
register.
Use width value from the register for sched_clock.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://lore.kernel.org/r/20240612-mips-clks-v2-7-a57e6f49f3db@flygoat.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
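For illustration only, a minimal sketch of deriving a counter width from a
configuration register and building the matching mask. The field position
(bits 27:24) and the "32 + 4 * field" encoding are assumptions made for
this example, and every EX_/ex_ name is hypothetical rather than taken
from the driver.

```c
#include <stdint.h>

/* Assumed layout: a 4-bit COUNTBITS field at bits 27:24. */
#define EX_CONFIG_COUNTBITS_SHIFT 24
#define EX_CONFIG_COUNTBITS_MASK  (0xfu << EX_CONFIG_COUNTBITS_SHIFT)

/* Decode the counter width in bits, assuming width = 32 + 4 * field. */
static unsigned int ex_counter_width(uint32_t config)
{
	unsigned int field = (config & EX_CONFIG_COUNTBITS_MASK) >>
			     EX_CONFIG_COUNTBITS_SHIFT;

	return 32 + 4 * field;
}

/* Mask that keeps only the implemented counter bits. */
static uint64_t ex_counter_mask(unsigned int width)
{
	return width >= 64 ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}
```

A sched_clock registration would then use the decoded width instead of a
hard-coded one, so wrap-around is handled at the real counter size.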
|
|
It is a good clocksource which usually runs as fast as the CPU core
and has a low access latency, so raise the base rating from Good to
Desired when we know that it has a stable frequency.
Increase the frequency addend dividend to 10000000 (10 MHz) to
reasonably accommodate multi-GHz clocks, and cap the rating
within the current level.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://lore.kernel.org/r/20240612-mips-clks-v2-6-a57e6f49f3db@flygoat.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
There is a race condition in the CMT interrupt handler. In the interrupt
handler the driver sets a driver private flag, FLAG_IRQCONTEXT. This
flag is used to indicate any call to set_next_event() should not be
directly propagated to the device, but instead cached. This is done as
the interrupt handler itself reprograms the device when needed before it
completes, which avoids performing this operation twice.
It is unclear why this design was chosen; my suspicion is that it allows
the struct clock_event_device.event_handler callback, which is called
while FLAG_IRQCONTEXT is set, to update the next event without having to
write to the device twice.
Unfortunately there is a race between when the FLAG_IRQCONTEXT flag is
set and later cleared, where the interrupt handler has already started to
write the next event to the device. If set_next_event() is called in
this window the value is only cached in the driver but not written. This
causes the board to misbehave or, worse, lock up and produce a splat.
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 0-...!: (0 ticks this GP) idle=f5e0/0/0x0 softirq=519/519 fqs=0 (false positive?)
rcu: (detected by 1, t=6502 jiffies, g=-595, q=77 ncpus=2)
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0-rc5-arm64-renesas-00019-g74a6f86eaf1c-dirty #20
Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : tick_check_broadcast_expired+0xc/0x40
lr : cpu_idle_poll.isra.0+0x8c/0x168
sp : ffff800081c63d70
x29: ffff800081c63d70 x28: 00000000580000c8 x27: 00000000bfee5610
x26: 0000000000000027 x25: 0000000000000000 x24: 0000000000000000
x23: ffff00007fbb9100 x22: ffff8000818f1008 x21: ffff8000800ef07c
x20: ffff800081c79ec0 x19: ffff800081c70c28 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffc2c717d8
x14: 0000000000000000 x13: ffff000009c18080 x12: ffff8000825f7fc0
x11: 0000000000000000 x10: ffff8000818f3cd4 x9 : 0000000000000028
x8 : ffff800081c79ec0 x7 : ffff800081c73000 x6 : 0000000000000000
x5 : 0000000000000000 x4 : ffff7ffffe286000 x3 : 0000000000000000
x2 : ffff7ffffe286000 x1 : ffff800082972900 x0 : ffff8000818f1008
Call trace:
tick_check_broadcast_expired+0xc/0x40
do_idle+0x9c/0x280
cpu_startup_entry+0x34/0x40
kernel_init+0x0/0x11c
do_one_initcall+0x0/0x260
__primary_switched+0x80/0x88
rcu: rcu_preempt kthread timer wakeup didn't happen for 6501 jiffies! g-595 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
rcu: Possible timer handling issue on cpu=0 timer-softirq=262
rcu: rcu_preempt kthread starved for 6502 jiffies! g-595 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:I stack:0 pid:15 tgid:15 ppid:2 flags:0x00000008
Call trace:
__switch_to+0xbc/0x100
__schedule+0x358/0xbe0
schedule+0x48/0x148
schedule_timeout+0xc4/0x138
rcu_gp_fqs_loop+0x12c/0x764
rcu_gp_kthread+0x208/0x298
kthread+0x10c/0x110
ret_from_fork+0x10/0x20
The design has been part of the driver since it was first merged in
early 2009. The issue becomes increasingly harder to trigger the older
the kernel version is. It only takes a few boots on v6.10-rc5,
while hundreds of boots are needed to trigger it on v5.10.
Close the race condition by using the CMT channel lock for the two
competing sections. The channel lock was added to the driver after its
initial design.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20240702190230.3825292-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
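A simplified user-space model of the locking scheme described above, not
the driver code: a pthread mutex stands in for the channel lock, a plain
struct field stands in for the device register, and all ex_/EX_ names and
flag values are hypothetical. Both the "cache or write through" decision
and the "flush cache and leave IRQ context" step run under the same lock,
so an update cannot slip in between the flush and the flag being cleared.

```c
#include <pthread.h>
#include <stdint.h>

#define EX_FLAG_IRQCONTEXT 0x1u   /* hypothetical flag values */
#define EX_FLAG_REPROGRAM  0x2u

struct ex_channel {
	pthread_mutex_t lock;    /* stands in for the channel lock */
	unsigned int flags;
	uint32_t next_match;     /* value cached while in IRQ context */
	uint32_t hw_match;       /* stands in for the device register */
};

/* Called on entry to the (modelled) interrupt handler. */
static void ex_irq_enter(struct ex_channel *ch)
{
	pthread_mutex_lock(&ch->lock);
	ch->flags |= EX_FLAG_IRQCONTEXT;
	pthread_mutex_unlock(&ch->lock);
}

/* Cache the value in IRQ context, otherwise write it straight through. */
static void ex_set_next_event(struct ex_channel *ch, uint32_t delta)
{
	pthread_mutex_lock(&ch->lock);
	if (ch->flags & EX_FLAG_IRQCONTEXT) {
		ch->next_match = delta;
		ch->flags |= EX_FLAG_REPROGRAM;
	} else {
		ch->hw_match = delta;
	}
	pthread_mutex_unlock(&ch->lock);
}

/* Tail of the handler: flush the cached value and leave IRQ context
 * in one critical section, so no update can fall between the two. */
static void ex_irq_exit(struct ex_channel *ch)
{
	pthread_mutex_lock(&ch->lock);
	if (ch->flags & EX_FLAG_REPROGRAM) {
		ch->hw_match = ch->next_match;
		ch->flags &= ~EX_FLAG_REPROGRAM;
	}
	ch->flags &= ~EX_FLAG_IRQCONTEXT;
	pthread_mutex_unlock(&ch->lock);
}
```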
|
|
The 'err' variable is assigned on every code path, so it is
pointless to initialize it when it is declared.
Signed-off-by: Li kunyu <kunyu@nfschina.com>
Link: https://lore.kernel.org/r/20240705052159.22235-1-kunyu@nfschina.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
The irq variable is assigned on every code path, so it is pointless
to initialize it when declaring it.
Signed-off-by: Li kunyu <kunyu@nfschina.com>
Link: https://lore.kernel.org/r/20240705040729.21961-1-kunyu@nfschina.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
Srujana Challa says:
====================
Fixes for CPT and RSS configuration
This series of patches fixes various issues related to CPT
configuration and RSS configuration.
v1->v2:
- Excluded the patch "octeontx2-af: reduce cpt flt interrupt vectors for
cn10kb" to submit it to net-next.
- Addressed the review comments.
Kiran Kumar K (1):
octeontx2-af: Fix issue with IPv6 ext match for RSS
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While performing RSS based on IPv4, packets with
IPv4 options are not being considered. Adding changes
to match both plain IPv4 and IPv4 with option header.
Fixes: 41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS")
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While performing RSS based on IPv6, extension ltype
is not being considered. This will be a problem for
fragmented packets or packets with an extension header.
Adding changes to match IPv6 ext header along with IPv6
ltype.
Fixes: 41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS")
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Checksum and length checks are not enabled for IPv4 header with
options and IPv6 with extension headers.
To fix this a change in enum npc_kpu_lc_ltype is required which will
allow adjustment of LTYPE_MASK to detect all types of IP headers.
Fixes: 21e6699e5cd6 ("octeontx2-af: Add NPC KPU profile")
Signed-off-by: Michal Mazur <mmazur2@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes CPT_LF_ALLOC mailbox error due to
incompatible mailbox message format. Specifically, it
corrects the `blkaddr` field type from `int` to `u8`.
Fixes: de2854c87c64 ("octeontx2-af: Mailbox changes for 98xx CPT block")
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
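To see why a field type mismatch breaks a shared message format, here is
a small sketch with two hypothetical layouts that differ only in the type
of blkaddr. The surrounding fields (hdr, flags) are invented for the
example and are not the real mailbox message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with a one-byte block address. */
struct ex_msg_u8 {
	uint16_t hdr;
	uint8_t blkaddr;
	uint8_t flags;
};

/* Same fields, but blkaddr declared as int. */
struct ex_msg_int {
	uint16_t hdr;
	int blkaddr;
	uint8_t flags;
};

/* Returns 1 only if both sides agree on size and field offsets. */
static int ex_layouts_match(void)
{
	return sizeof(struct ex_msg_u8) == sizeof(struct ex_msg_int) &&
	       offsetof(struct ex_msg_u8, flags) ==
	       offsetof(struct ex_msg_int, flags);
}
```

If the two sides of a mailbox disagree like this, every field after
blkaddr is read from the wrong offset, which is the kind of mismatch that
surfaces as a failed mailbox message.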
|
|
Replace slot id with global CPT lf id on reg read/write as
CPTPF/VF driver would send slot number instead of global
lf id in the reg offset. And also update the mailbox response
with the global lf's register offset.
Fixes: ae454086e3c2 ("octeontx2-af: add mailbox interface for CPT")
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
FREESCALE SOC DRIVERS has been orphaned since
commit eaac25d026a1 ("MAINTAINERS: Drop Li Yang as their email address
stopped working")
QUICC ENGINE LIBRARY has Qiang Zhao as maintainer but he hasn't
responded for years and when Li Yang was still maintaining FREESCALE
SOC DRIVERS he was also handling QUICC ENGINE LIBRARY directly.
As a maintainer of LINUX FOR POWERPC EMBEDDED PPC8XX AND PPC83XX, I
also need FREESCALE SOC DRIVERS to be actively maintained, so add
myself as maintainer of FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY.
See below link for more context.
Link: https://lore.kernel.org/linuxppc-dev/20240219153016.ntltc76bphwrv6hn@skbuf/T/#mf6d4a5eef79e8eae7ae0456a2794c01e630a6756
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
There are many generations of omaps to maintain, and I will be only active
as a hobbyist with time permitting. Let's add more maintainers to ensure
continued Linux support.
TI is interested in maintaining the active SoCs such as am3, am4 and
dra7. And the hobbyists are interested in maintaining some of the older
devices, mainly based on omap3 and 4 SoCs.
Kevin and Roger have agreed to maintain the active TI parts. Both Kevin
and Roger have been working on the omap variants for a long time, and
have a good understanding of the hardware.
Aaro and Andreas have agreed to maintain the community devices. Both Aaro
and Andreas have long experience on working with the earlier TI SoCs.
While at it, let's also change me to be a reviewer for the omap1, and
drop the link to my old omap web page.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The to-be-fixed commit rightfully prevented the registers from being
cleared. However, the index must be cleared. Otherwise a read message
will re-issue the last work. Fix it and add a comment describing the
situation.
Fixes: c422b6a63024 ("i2c: testunit: don't erase registers after STOP")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.10-2024-07-11:
amdgpu:
- PSR-SU fix
- Reserved VMID fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240712005534.803064-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
- Use write-back caching mode for system memory on DGFX (Thomas)
Driver Changes:
- Do not leak object when finalizing hdcp gsc (Nirmoy)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/vgqz35btnxdddko3byrgww5ii36wig2tvondg2p3j3b3ourj4i@rqgolll3wwkh
|
|
This change caused PSR SU panels to not read from their remote fb,
preventing us from entering self-refresh. It is a regression.
This reverts commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b.
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove wrong EIO to EAGAIN conversion and pass all errors as is.
After commit 230f3d53a547 ("i40e: remove i40e_status"), which should only
have replaced F/W specific error codes with generic Linux kernel ones, all
EIO errors suddenly started to be converted into EAGAIN, which leads
nvmupdate to retry until it times out; it sometimes fails after more than
20 minutes in the middle of an NVM update, so the NVM becomes corrupted.
The bug affects users only at the time when they try to update NVM, and
only F/W versions that generate errors during nvmupdate. For example, X710DA2
with 0x8000ECB7 F/W is affected, but there are probably more...
Command for reproduction is just NVM update:
./nvmupdate64
In the log instead of:
i40e_nvmupd_exec_aq err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOMEM)
appears:
i40e_nvmupd_exec_aq err -EIO aq_err I40E_AQ_RC_ENOMEM
i40e: eeprom check failed (-5), Tx/Rx traffic disabled
The problematic code silently converted EIO into EAGAIN, which forced
nvmupdate to ignore the EAGAIN error and retry the same operation until timeout.
That's why the NVM update takes 20+ minutes and fails in the end.
Fixes: 230f3d53a547 ("i40e: remove i40e_status")
Co-developed-by: Kelvin Kang <kelvin.kang@intel.com>
Signed-off-by: Kelvin Kang <kelvin.kang@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240710224455.188502-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
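A toy model of the failure mode, assuming a caller that retries on
-EAGAIN as the description says nvmupdate does. The function names and
the retry limit are hypothetical; the point is only that remapping a
permanent error to a "try again" error makes the caller spin until it
gives up.

```c
#include <errno.h>

/* Stand-in for a firmware call that always fails permanently. */
static int ex_firmware_call(void)
{
	return -EIO;
}

/* Fixed behaviour: pass the error through unchanged. */
static int ex_op_passthrough(void)
{
	return ex_firmware_call();
}

/* Buggy behaviour: silently turn -EIO into -EAGAIN. */
static int ex_op_remapped(void)
{
	int err = ex_firmware_call();

	return err == -EIO ? -EAGAIN : err;
}

/* Caller that retries while it sees -EAGAIN, up to max_tries. */
static int ex_retry(int (*op)(void), int max_tries, int *tries)
{
	int err = 0;

	*tries = 0;
	while (*tries < max_tries) {
		err = op();
		(*tries)++;
		if (err != -EAGAIN)
			break;
	}
	return err;
}
```

With the pass-through variant the caller stops after one attempt; with
the remapped variant it burns through every retry before failing.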
|
|
When user submits a rxfh set command without touching XFRM_SYM_XOR,
rxfh.input_xfrm is set to RXH_XFRM_NO_CHANGE, which is equal to 0xff.
Testing if (rxfh.input_xfrm & RXH_XFRM_SYM_XOR &&
!ops->cap_rss_sym_xor_supported)
return -EOPNOTSUPP;
Will always be true on devices that don't set cap_rss_sym_xor_supported,
since rxfh.input_xfrm & RXH_XFRM_SYM_XOR is always true if input_xfrm
was not set, i.e. RXH_XFRM_NO_CHANGE=0xff, which will result in failure
of any command that doesn't require any change of XFRM, e.g. RSS context
or hash function changes.
To avoid this breakage, test if rxfh.input_xfrm != RXH_XFRM_NO_CHANGE
before testing other conditions. Note that the problem will only trigger
with XFRM-aware userspace, old ethtool CLI would continue to work.
Fixes: 0dd415d15505 ("net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://patch.msgid.link/20240710225538.43368-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
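The bitmask problem can be reproduced in isolation. In this sketch the
EX_ constants mirror the values mentioned above (a SYM_XOR bit and a
NO_CHANGE sentinel of 0xff); the bit position of SYM_XOR and the helper
names are assumptions for the example, not the ethtool code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_XFRM_SYM_XOR   (1u << 0)   /* assumed bit position */
#define EX_XFRM_NO_CHANGE 0xffu       /* sentinel: "leave as is" */

/* Old check: the sentinel has every bit set, so it matches SYM_XOR. */
static int ex_check_old(uint8_t input_xfrm, bool sym_xor_supported)
{
	if ((input_xfrm & EX_XFRM_SYM_XOR) && !sym_xor_supported)
		return -EOPNOTSUPP;
	return 0;
}

/* Fixed check: rule out the sentinel before testing individual bits. */
static int ex_check_fixed(uint8_t input_xfrm, bool sym_xor_supported)
{
	if (input_xfrm != EX_XFRM_NO_CHANGE &&
	    (input_xfrm & EX_XFRM_SYM_XOR) && !sym_xor_supported)
		return -EOPNOTSUPP;
	return 0;
}
```

The general lesson is that an all-ones sentinel must be compared for
equality first, since it passes any bit test by construction.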
|
|
btree_root_lock is for the root keys in btree_root, not the pointers to
the nodes themselves; this fixes a lock ordering issue between
btree_root_lock and btree node locks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
proper lock ordering is: fs_reclaim -> btree node locks
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
not using unlock_long() blocks key cache reclaim, and the allocator may
take a while
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This reverts commit 86d81ec5f5f05846c7c6e48ffb964b24cba2e669.
This wasn't tested with memcg enabled, it immediately hits a null ptr
deref in list_lru_add().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
R-Car Gen3+ needs a reset before every controller transfer. That erases
configuration of a potentially in parallel running local target
instance. To avoid this disruption, avoid controller transfers if a
local target is running. Also, disable SMBusHostNotify because it
requires being a controller and local target at the same time.
Fixes: 3b770017b03a ("i2c: rcar: handle RXDMA HW behaviour on Gen3")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mikulas Patocka:
- Fix broken discard for device mapper VDO target
* tag 'for-6.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm vdo: replace max_discard_sectors with max_hw_discard_sectors
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.10:
- EDID irq fix for bridge/adv7511.
- gma500 null mode fixes.
- Cleanup meson binding.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8abff46f-eae6-4521-8434-7c6240f9091c@linux.intel.com
|
|
Commit 4f563a64732d ("block: add a max_user_discard_sectors queue
limit") changed block core to set max_discard_sectors to:
min(lim->max_hw_discard_sectors, lim->max_user_discard_sectors)
Commit 825d8bbd2f32 ("dm: always manage discard support in terms
of max_hw_discard_sectors") fixed most dm targets to deal with
this, by replacing max_discard_sectors with max_hw_discard_sectors.
Unfortunately, dm-vdo did not get fixed at that time.
Fixes: 825d8bbd2f32 ("dm: always manage discard support in terms of max_hw_discard_sectors")
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com>
Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"This fixes two regressions that have been bubbling along for a large
part of this release.
One is a revert of the multi mode support for the OMAP SPI controller,
this introduced regressions on a number of systems and while there has
been progress on fixing those we've not got something that works for
everyone yet so let's just drop the change for now.
The other is a series of fixes from David Lechner for his recent
message optimisation work, this interacted badly with spi-mux which
is altogether too clever with recursive use of the bus and creates
situations that hadn't been considered.
There are also a couple of small driver specific fixes, including one
more patch from David for sleep duration calculations in the AXI
driver"
* tag 'spi-fix-v6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: mux: set ctlr->bits_per_word_mask
spi: add defer_optimize_message controller flag
spi: don't unoptimize message in spi_async()
spi: omap2-mcspi: Revert multi mode support
spi: davinci: Unset POWERDOWN bit when releasing resources
spi: axi-spi-engine: fix sleep calculation
spi: imx: Don't expect DMA for i.MX{25,35,50,51,53} cspi devices
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.
Current release - regressions:
- core: fix rc7's __skb_datagram_iter() regression
Current release - new code bugs:
- eth: bnxt: fix crashes when reducing ring count with active RSS
contexts
Previous releases - regressions:
- sched: fix UAF when resolving a clash
- skmsg: skip zero length skb in sk_msg_recvmsg2
- sunrpc: fix kernel free on connection failure in
xs_tcp_setup_socket
- tcp: avoid too many retransmit packets
- tcp: fix incorrect undo caused by DSACK of TLP retransmit
- udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
- eth: ks8851: fix deadlock with the SPI chip variant
- eth: i40e: fix XDP program unloading while removing the driver
Previous releases - always broken:
- bpf:
- fix too early release of tcx_entry
- fail bpf_timer_cancel when callback is being cancelled
- bpf: fix order of args in call to bpf_map_kvcalloc
- netfilter: nf_tables: prefer nft_chain_validate
- ppp: reject claimed-as-LCP but actually malformed packets
- wireguard: avoid unaligned 64-bit memory accesses"
* tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket
net/sched: Fix UAF when resolving a clash
net: ks8851: Fix potential TX stall after interface reopen
udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
netfilter: nf_tables: prefer nft_chain_validate
netfilter: nfnetlink_queue: drop bogus WARN_ON
ethtool: netlink: do not return SQI value if link is down
ppp: reject claimed-as-LCP but actually malformed packets
selftests/bpf: Add timer lockup selftest
net: ethernet: mtk-star-emac: set mac_managed_pm when probing
e1000e: fix force smbus during suspend flow
tcp: avoid too many retransmit packets
bpf: Defer work in bpf_timer_cancel_and_free
bpf: Fail bpf_timer_cancel when callback is being cancelled
bpf: fix order of args in call to bpf_map_kvcalloc
net: ethernet: lantiq_etop: fix double free in detach
i40e: Fix XDP program unloading while removing the driver
net: fix rc7's __skb_datagram_iter()
net: ks8851: Fix deadlock with the SPI chip variant
octeontx2-af: Fix incorrect value output on error path in rvu_check_rsrc_availability()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"cachefiles:
- Export an existing and add a new cachefile helper to be used in
filesystems to fix reference count bugs
- Use the newly added fscache_try_get_volume() helper to get a
reference count on an fscache_volume to handle volumes that are
about to be removed cleanly
- After withdrawing a fscache_cache via FSCACHE_CACHE_IS_WITHDRAWN
wait for all ongoing cookie lookups to complete and for the object
count to reach zero
- Propagate errors from vfs_getxattr() to avoid an infinite loop in
cachefiles_check_volume_xattr() because it keeps seeing ESTALE
- Don't send new requests when an object is dropped by raising
CACHEFILES_ONDEMAND_OBJSTATE_DROPPING
- Cancel all requests for an object that is about to be dropped
- Wait for the ondemand_object_worker to finish before dropping a
cachefiles object to prevent use-after-free
- Use cyclic allocation for message ids to better handle id recycling
- Add missing lock protection when iterating through the xarray when
polling
netfs:
- Use standard logging helpers for debug logging
VFS:
- Fix potential use-after-free in file locks during
trace_posix_lock_inode(). The tracepoint could fire while another
task raced it and freed the lock that was requested to be traced
- Only increment the nr_dentry_negative counter for dentries that are
present on the superblock LRU. Currently, DCACHE_LRU_LIST list is
used to detect this case. However, the flag is also raised in
combination with DCACHE_SHRINK_LIST to indicate that dentry->d_lru
is used. So checking only DCACHE_LRU_LIST will lead to wrong
nr_dentry_negative count. Fix the check to not count dentries that
are on a shrink related list
Misc:
- hfsplus: fix an uninitialized value issue in copy_name
- minix: fix minixfs_rename with HIGHMEM. It still uses kunmap() even
though we switched it to kmap_local_page() a while ago"
* tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
minixfs: Fix minixfs_rename with HIGHMEM
hfsplus: fix uninit-value in copy_name
vfs: don't mod negative dentry count when on shrinker list
filelock: fix potential use-after-free in posix_lock_inode
cachefiles: add missing lock protection when polling
cachefiles: cyclic allocation of msg_id to avoid reuse
cachefiles: wait for ondemand_object_worker to finish when dropping object
cachefiles: cancel all requests for the object that is being dropped
cachefiles: stop sending new request when dropping object
cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop
cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie()
cachefiles: fix slab-use-after-free in fscache_withdraw_volume()
netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume()
netfs: Switch debug logging to pr_debug()
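The nr_dentry_negative item in the pull message above comes down to
testing two flags together rather than one. A minimal sketch, with
hypothetical flag values standing in for DCACHE_LRU_LIST and
DCACHE_SHRINK_LIST:

```c
#include <stdbool.h>

#define EX_LRU_LIST    0x1u   /* hypothetical flag values */
#define EX_SHRINK_LIST 0x2u

/* Old test: also true for entries that sit on a shrink list. */
static bool ex_on_sb_lru_old(unsigned int flags)
{
	return (flags & EX_LRU_LIST) != 0;
}

/* Fixed test: LRU flag set and shrink flag clear. */
static bool ex_on_sb_lru_fixed(unsigned int flags)
{
	return (flags & (EX_LRU_LIST | EX_SHRINK_LIST)) == EX_LRU_LIST;
}
```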
|
|
Running the LTP hotplug stress test on an aarch64 machine results in
rcu_sched stall warnings when the broadcast hrtimer was owned by the
un-plugged CPU. The issue is the following:
CPU1 (owns the broadcast hrtimer)       CPU2
                                        tick_broadcast_enter()
                                          // shutdown local timer device
                                          broadcast_shutdown_local()
                                        ...
                                        tick_broadcast_exit()
                                          clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT)
                                          // timer device is not programmed
                                          cpumask_set_cpu(cpu, tick_broadcast_force_mask)
                                        initiates offlining of CPU1
take_cpu_down()
/*
 * CPU1 shuts down and does not
 * send broadcast IPI anymore
 */
                                        takedown_cpu()
                                          hotplug_cpu__broadcast_tick_pull()
                                            // move broadcast hrtimer to this CPU
                                            clockevents_program_event()
                                              bc_set_next()
                                                hrtimer_start()
                                                /*
                                                 * timer device is not programmed
                                                 * because only the first expiring
                                                 * timer will trigger clockevent
                                                 * device reprogramming
                                                 */
What happens is that CPU2 exits broadcast mode with force bit set, then the
local timer device is not reprogrammed and CPU2 expects to receive the
expired event by the broadcast IPI. But this does not happen because CPU1
is offlined by CPU2. CPU2 switches the clockevent device to ONESHOT state,
but does not reprogram the device.
The subsequent reprogramming of the hrtimer broadcast device does not
program the clockevent device of CPU2 either because the pending expiry
time is already in the past and the CPU expects the event to be delivered.
As a consequence all CPUs which wait for a broadcast event to be delivered
are stuck forever.
Fix this issue by reprogramming the local timer device if the broadcast
force bit of the CPU is set so that the broadcast hrtimer is delivered.
[ tglx: Massage comment and change log. Add Fixes tag ]
Fixes: 989dcb645ca7 ("tick: Handle broadcast wakeup of multiple cpus")
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240711124843.64167-1-liaoyu15@huawei.com
|
|
No check is done on the size of the data to be transmitted. This causes
a kernel panic when this size exceeds the sg_miter's length.
Limit the number of transmitted bytes to sgm->length.
Cc: stable@vger.kernel.org
Fixes: ed01d210fd91 ("mmc: davinci_mmc: Use sg_miter for PIO")
Signed-off-by: Bastien Curutchet <bastien.curutchet@bootlin.com>
Link: https://lore.kernel.org/r/20240711081838.47256-2-bastien.curutchet@bootlin.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
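The bound described above amounts to clamping the requested byte count to
the current segment's length before touching the buffer. A generic
sketch, with a hypothetical helper that copies out of a single segment
(it does not model the sg_miter API itself):

```c
#include <stddef.h>
#include <string.h>

/* Copy at most seg_len bytes, however many were requested.
 * Returns the number of bytes actually copied. */
static size_t ex_copy_from_segment(unsigned char *dst,
				   const unsigned char *seg,
				   size_t seg_len, size_t requested)
{
	size_t n = requested < seg_len ? requested : seg_len;

	memcpy(dst, seg, n);
	return n;
}
```

The caller can then loop, advancing to the next segment, until the full
request has been transferred.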
|
|
blk_queue_max_segment_size() ensured:
if (max_size < PAGE_SIZE)
max_size = PAGE_SIZE;
whereas:
blk_validate_limits() makes it an error:
if (WARN_ON_ONCE(lim->max_segment_size < PAGE_SIZE))
return -EINVAL;
The change from one to the other exposed sdhci, which was setting the maximum
segment size too low in some circumstances.
Fix the maximum segment size when it is too low.
Fixes: 616f87661792 ("mmc: pass queue_limits to blk_mq_alloc_disk")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20240710180737.142504-1-adrian.hunter@intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
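The difference between the two behaviours quoted above, reduced to two
standalone functions. EX_PAGE_SIZE is an assumed 4096 for the example and
both helper names are hypothetical; the second function mirrors only the
return value of the validation, not the WARN.

```c
#include <errno.h>

#define EX_PAGE_SIZE 4096u   /* assumed page size for the example */

/* Old setter semantics: silently round a small value up. */
static unsigned int ex_clamp_seg_size(unsigned int max_size)
{
	if (max_size < EX_PAGE_SIZE)
		max_size = EX_PAGE_SIZE;
	return max_size;
}

/* New validation semantics: a small value is an error. */
static int ex_validate_seg_size(unsigned int max_size)
{
	if (max_size < EX_PAGE_SIZE)
		return -EINVAL;
	return 0;
}
```

Once the core only validates, a driver that may compute a too-small value
has to apply the clamp itself before handing the limit over.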
|
|
Free arbiter allocated in intel_hdcp_gsc_init().
Fixes: 152f2df954d8 ("drm/xe/hdcp: Enable HDCP for XE")
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Arun R Murthy <arun.r.murthy@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708125918.23573-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit 33891539f9d6f245e93a76e3fb5791338180374f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.
However, write-combined system memory is expensive to allocate and
even though it is pooled, the pool is expensive to shrink, since
it involves global CPU TLB flushes.
Moreover write-combined system memory from TTM is only reliably
available on x86 and DGFX doesn't have an x86 restriction.
So regardless of the cpu caching mode selected for a bo,
internally use write-back caching mode for system memory on DGFX.
Coherency is maintained, but user-space clients may perceive a
difference in cpu access speeds.
v2:
- Update RB- and Ack tags.
- Rephrase wording in xe_drm.h (Matt Roper)
v3:
- Really rephrase wording.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode")
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Acked-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode")
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Effie Yu <effie.yu@intel.com> #On chat
Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 01e0cfc994be484ddcb9e121e353e51d8bb837c0)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.10
A few fairly small fixes for ASoC, there's a relatively large set of
hardening changes for the cs_dsp firmware file parsing and a couple of
other small device specific fixes.
|
|
We store the progress (root and inode numbers) of the extent map shrinker
in fs_info without any synchronization but we can have multiple tasks
calling into the shrinker during memory allocations when there's enough
memory pressure for example.
This can result in a task A reading fs_info->extent_map_shrinker_last_ino
after another task B updates it, and task A reading
fs_info->extent_map_shrinker_last_root before task B updates it, making
task A see an odd state that isn't necessarily harmful but may make it
skip certain inode ranges or do more work than necessary by going over
the same inodes again. These unprotected accesses would also trigger
warnings from tools like KCSAN.
So add a lock to protect access to these progress fields.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
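As a user-space illustration of the idea (not the btrfs code), the two
progress fields can be read and written only as a pair under one lock, so
no task observes a new inode number combined with a stale root. A pthread
mutex stands in for the kernel lock and all ex_ names are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

struct ex_progress {
	pthread_mutex_t lock;   /* protects both fields below */
	uint64_t last_root;
	uint64_t last_ino;
};

/* Publish both fields in one critical section. */
static void ex_progress_save(struct ex_progress *p,
			     uint64_t root, uint64_t ino)
{
	pthread_mutex_lock(&p->lock);
	p->last_root = root;
	p->last_ino = ino;
	pthread_mutex_unlock(&p->lock);
}

/* Read a consistent snapshot of both fields. */
static void ex_progress_load(struct ex_progress *p,
			     uint64_t *root, uint64_t *ino)
{
	pthread_mutex_lock(&p->lock);
	*root = p->last_root;
	*ino = p->last_ino;
	pthread_mutex_unlock(&p->lock);
}
```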
|
|
The extent map shrinker can be called in a variety of contexts where we
are under memory pressure, and one of them is when a task is trying to
allocate memory. For this reason the shrinker is typically called with a
value of struct shrink_control::nr_to_scan that is much smaller than what
we return in the nr_cached_objects callback of struct super_operations
(fs/btrfs/super.c:btrfs_nr_cached_objects()), so that the shrinker does
not take a long time and cause high latencies. However we can still take
a lot of time in the shrinker even for a limited amount of nr_to_scan:
1) When traversing the red black tree that tracks open inodes in a root,
as for example with millions of open inodes we get a deep tree which
takes time searching for an inode;
2) Iterating over the extent map tree, which is a red black tree, of an
inode when doing the rb_next() calls and when removing an extent map
from the tree, since often that requires rebalancing the red black
tree;
3) When trying to write lock an inode's extent map tree we may wait for a
significant amount of time, because there's either another task about
to do IO and searching for an extent map in the tree or inserting an
extent map in the tree, and we can have thousands or even millions of
extent maps for an inode. Furthermore, there can be concurrent calls
to the shrinker so the lock might be busy simply because there is
already another task shrinking extent maps for the same inode;
4) We often reschedule if we need to, which further increases latency.
So improve on this by stopping the extent map shrinking code whenever we
need to reschedule and by making it skip an inode if we can't immediately
lock its extent map tree.
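The two rules can be sketched in user space as follows. This is only an illustration under stated assumptions: struct em_inode, tree_trylock() and the should_stop() callback (standing in for a need-to-reschedule check) are hypothetical names, and a C11 atomic_flag stands in for the extent map tree lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inode stand-in: a busy flag plus a count of cached extent maps. */
struct em_inode {
	atomic_flag tree_busy;	/* set while some task holds the tree lock */
	long nr_extent_maps;
};

/* Try to take the tree lock without waiting; true on success. */
bool tree_trylock(struct em_inode *inode)
{
	return !atomic_flag_test_and_set(&inode->tree_busy);
}

void tree_unlock(struct em_inode *inode)
{
	atomic_flag_clear(&inode->tree_busy);
}

/*
 * Drop up to nr_to_scan extent maps. Inodes whose tree lock is busy are
 * skipped instead of waited on, and the scan stops as soon as
 * should_stop() returns true. Returns the number of extent maps dropped.
 */
long shrink_extent_maps(struct em_inode *inodes, size_t n, long nr_to_scan,
			bool (*should_stop)(void))
{
	long dropped = 0;

	assert(nr_to_scan >= 0);
	for (size_t i = 0; i < n && dropped < nr_to_scan; i++) {
		if (should_stop())
			break;		/* give up the CPU instead of continuing */
		if (!tree_trylock(&inodes[i]))
			continue;	/* lock is contended: skip this inode */
		long take = inodes[i].nr_extent_maps;
		if (take > nr_to_scan - dropped)
			take = nr_to_scan - dropped;
		inodes[i].nr_extent_maps -= take;
		dropped += take;
		tree_unlock(&inodes[i]);
	}
	return dropped;
}

bool never_stop(void) { return false; }
bool always_stop(void) { return true; }
```

Skipped inodes are simply left for a later shrinker run, which trades some reclaim efficiency for bounded latency.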
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reported-by: Andrea Gelmini <andrea.gelmini@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CABXGCsMmmb36ym8hVNGTiU8yfUS_cGvoUmGCcBrGWq9OxTrs+A@mail.gmail.com/
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When putting an inode during extent map shrinking we're doing a standard
iput() but that may take a long time in case the inode is dirty and we are
doing the final iput that triggers eviction - the VFS will have to wait
for writeback before calling the btrfs evict callback (see
fs/inode.c:evict()).
This slows down the task running the shrinker which may have been
triggered while updating some tree for example, meaning locks are held
as well as an open transaction handle.
Also if the iput() ends up triggering eviction and the inode has no links
anymore, then we trigger item truncation which requires flushing delayed
items, space reservation to start a transaction and that may trigger the
space reclaim task and wait for it, resulting in deadlocks in case the
reclaim task needs for example to commit a transaction and the shrinker
is being triggered from a path holding a transaction handle.
Syzbot reported such a case with the following stack traces:
======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc2-syzkaller-00010-g2ab795141095 #0 Not tainted
------------------------------------------------------
kswapd0/111 is trying to acquire lock:
ffff88801eae4610 (sb_internal#3){.+.+}-{0:0}, at: btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275
but task is already holding lock:
ffffffff8dd3a9a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xa88/0x1970 mm/vmscan.c:6924
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire mm/page_alloc.c:3783 [inline]
fs_reclaim_acquire+0x102/0x160 mm/page_alloc.c:3797
might_alloc include/linux/sched/mm.h:334 [inline]
slab_pre_alloc_hook mm/slub.c:3890 [inline]
slab_alloc_node mm/slub.c:3980 [inline]
kmem_cache_alloc_lru_noprof+0x58/0x2f0 mm/slub.c:4019
btrfs_alloc_inode+0x118/0xb20 fs/btrfs/inode.c:8411
alloc_inode+0x5d/0x230 fs/inode.c:261
iget5_locked fs/inode.c:1235 [inline]
iget5_locked+0x1c9/0x2c0 fs/inode.c:1228
btrfs_iget_locked fs/btrfs/inode.c:5590 [inline]
btrfs_iget_path fs/btrfs/inode.c:5607 [inline]
btrfs_iget+0xfb/0x230 fs/btrfs/inode.c:5636
create_reloc_inode+0x403/0x820 fs/btrfs/relocation.c:3911
btrfs_relocate_block_group+0x471/0xe60 fs/btrfs/relocation.c:4114
btrfs_relocate_chunk+0x143/0x450 fs/btrfs/volumes.c:3373
__btrfs_balance fs/btrfs/volumes.c:4157 [inline]
btrfs_balance+0x211a/0x3f00 fs/btrfs/volumes.c:4534
btrfs_ioctl_balance fs/btrfs/ioctl.c:3675 [inline]
btrfs_ioctl+0x12ed/0x8290 fs/btrfs/ioctl.c:4742
__do_compat_sys_ioctl+0x2c3/0x330 fs/ioctl.c:1007
do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
__do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
entry_SYSENTER_compat_after_hwframe+0x84/0x8e
-> #2 (btrfs_trans_num_extwriters){++++}-{0:0}:
join_transaction+0x164/0xf40 fs/btrfs/transaction.c:315
start_transaction+0x427/0x1a70 fs/btrfs/transaction.c:700
btrfs_rebuild_free_space_tree+0xaa/0x480 fs/btrfs/free-space-tree.c:1323
btrfs_start_pre_rw_mount+0x218/0xf60 fs/btrfs/disk-io.c:2999
open_ctree+0x41ab/0x52e0 fs/btrfs/disk-io.c:3554
btrfs_fill_super fs/btrfs/super.c:946 [inline]
btrfs_get_tree_super fs/btrfs/super.c:1863 [inline]
btrfs_get_tree+0x11e9/0x1b90 fs/btrfs/super.c:2089
vfs_get_tree+0x8f/0x380 fs/super.c:1780
fc_mount+0x16/0xc0 fs/namespace.c:1125
btrfs_get_tree_subvol fs/btrfs/super.c:2052 [inline]
btrfs_get_tree+0xa53/0x1b90 fs/btrfs/super.c:2090
vfs_get_tree+0x8f/0x380 fs/super.c:1780
do_new_mount fs/namespace.c:3352 [inline]
path_mount+0x6e1/0x1f10 fs/namespace.c:3679
do_mount fs/namespace.c:3692 [inline]
__do_sys_mount fs/namespace.c:3898 [inline]
__se_sys_mount fs/namespace.c:3875 [inline]
__ia32_sys_mount+0x295/0x320 fs/namespace.c:3875
do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
__do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
entry_SYSENTER_compat_after_hwframe+0x84/0x8e
-> #1 (btrfs_trans_num_writers){++++}-{0:0}:
join_transaction+0x148/0xf40 fs/btrfs/transaction.c:314
start_transaction+0x427/0x1a70 fs/btrfs/transaction.c:700
btrfs_rebuild_free_space_tree+0xaa/0x480 fs/btrfs/free-space-tree.c:1323
btrfs_start_pre_rw_mount+0x218/0xf60 fs/btrfs/disk-io.c:2999
open_ctree+0x41ab/0x52e0 fs/btrfs/disk-io.c:3554
btrfs_fill_super fs/btrfs/super.c:946 [inline]
btrfs_get_tree_super fs/btrfs/super.c:1863 [inline]
btrfs_get_tree+0x11e9/0x1b90 fs/btrfs/super.c:2089
vfs_get_tree+0x8f/0x380 fs/super.c:1780
fc_mount+0x16/0xc0 fs/namespace.c:1125
btrfs_get_tree_subvol fs/btrfs/super.c:2052 [inline]
btrfs_get_tree+0xa53/0x1b90 fs/btrfs/super.c:2090
vfs_get_tree+0x8f/0x380 fs/super.c:1780
do_new_mount fs/namespace.c:3352 [inline]
path_mount+0x6e1/0x1f10 fs/namespace.c:3679
do_mount fs/namespace.c:3692 [inline]
__do_sys_mount fs/namespace.c:3898 [inline]
__se_sys_mount fs/namespace.c:3875 [inline]
__ia32_sys_mount+0x295/0x320 fs/namespace.c:3875
do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
__do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
entry_SYSENTER_compat_after_hwframe+0x84/0x8e
-> #0 (sb_internal#3){.+.+}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719
percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
__sb_start_write include/linux/fs.h:1655 [inline]
sb_start_intwrite include/linux/fs.h:1838 [inline]
start_transaction+0xbc1/0x1a70 fs/btrfs/transaction.c:694
btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275
btrfs_evict_inode+0x960/0xe80 fs/btrfs/inode.c:5291
evict+0x2ed/0x6c0 fs/inode.c:667
iput_final fs/inode.c:1741 [inline]
iput.part.0+0x5a8/0x7f0 fs/inode.c:1767
iput+0x5c/0x80 fs/inode.c:1757
btrfs_scan_root fs/btrfs/extent_map.c:1118 [inline]
btrfs_free_extent_maps+0xbd3/0x1320 fs/btrfs/extent_map.c:1189
super_cache_scan+0x409/0x550 fs/super.c:227
do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
shrink_slab+0x18a/0x1310 mm/shrinker.c:662
shrink_one+0x493/0x7c0 mm/vmscan.c:4790
shrink_many mm/vmscan.c:4851 [inline]
lru_gen_shrink_node+0x89f/0x1750 mm/vmscan.c:4951
shrink_node mm/vmscan.c:5910 [inline]
kswapd_shrink_node mm/vmscan.c:6720 [inline]
balance_pgdat+0x1105/0x1970 mm/vmscan.c:6911
kswapd+0x5ea/0xbf0 mm/vmscan.c:7180
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
other info that might help us debug this:
Chain exists of:
sb_internal#3 --> btrfs_trans_num_extwriters --> fs_reclaim
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(btrfs_trans_num_extwriters);
lock(fs_reclaim);
rlock(sb_internal#3);
*** DEADLOCK ***
2 locks held by kswapd0/111:
#0: ffffffff8dd3a9a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xa88/0x1970 mm/vmscan.c:6924
#1: ffff88801eae40e0 (&type->s_umount_key#62){++++}-{3:3}, at: super_trylock_shared fs/super.c:562 [inline]
#1: ffff88801eae40e0 (&type->s_umount_key#62){++++}-{3:3}, at: super_cache_scan+0x96/0x550 fs/super.c:196
stack backtrace:
CPU: 0 PID: 111 Comm: kswapd0 Not tainted 6.10.0-rc2-syzkaller-00010-g2ab795141095 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114
check_noncircular+0x31a/0x400 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719
percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
__sb_start_write include/linux/fs.h:1655 [inline]
sb_start_intwrite include/linux/fs.h:1838 [inline]
start_transaction+0xbc1/0x1a70 fs/btrfs/transaction.c:694
btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275
btrfs_evict_inode+0x960/0xe80 fs/btrfs/inode.c:5291
evict+0x2ed/0x6c0 fs/inode.c:667
iput_final fs/inode.c:1741 [inline]
iput.part.0+0x5a8/0x7f0 fs/inode.c:1767
iput+0x5c/0x80 fs/inode.c:1757
btrfs_scan_root fs/btrfs/extent_map.c:1118 [inline]
btrfs_free_extent_maps+0xbd3/0x1320 fs/btrfs/extent_map.c:1189
super_cache_scan+0x409/0x550 fs/super.c:227
do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
shrink_slab+0x18a/0x1310 mm/shrinker.c:662
shrink_one+0x493/0x7c0 mm/vmscan.c:4790
shrink_many mm/vmscan.c:4851 [inline]
lru_gen_shrink_node+0x89f/0x1750 mm/vmscan.c:4951
shrink_node mm/vmscan.c:5910 [inline]
kswapd_shrink_node mm/vmscan.c:6720 [inline]
balance_pgdat+0x1105/0x1970 mm/vmscan.c:6911
kswapd+0x5ea/0xbf0 mm/vmscan.c:7180
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
So fix this by using btrfs_add_delayed_iput() so that the final iput is
delegated to the cleaner kthread.
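The deferral pattern can be illustrated with a small user-space sketch. All names here (struct pinned_inode, add_delayed_put(), run_delayed_puts()) are hypothetical and the model is single-threaded; it only shows that the expensive final put moves from the shrinker context to a later worker context, not how btrfs_add_delayed_iput() is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inode stand-in with a reference count. */
struct pinned_inode {
	int refs;
	int evicted;				/* set when the last reference is dropped */
	struct pinned_inode *next_delayed;	/* link in the delayed-put list */
};

static struct pinned_inode *delayed_list;

/* Dropping the last reference does the expensive eviction work. */
void put_inode(struct pinned_inode *inode)
{
	assert(inode->refs > 0);
	if (--inode->refs == 0)
		inode->evicted = 1;
}

/* Called from the shrinker: only queue the inode, never evict here. */
void add_delayed_put(struct pinned_inode *inode)
{
	inode->next_delayed = delayed_list;
	delayed_list = inode;
}

/* Called later from a separate worker context that holds no other locks. */
void run_delayed_puts(void)
{
	while (delayed_list) {
		struct pinned_inode *inode = delayed_list;

		delayed_list = inode->next_delayed;
		inode->next_delayed = NULL;
		put_inode(inode);
	}
}
```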
Link: https://lore.kernel.org/linux-btrfs/000000000000892280061a344581@google.com/
Reported-by: syzbot+3dad89b3993a4b275e72@syzkaller.appspotmail.com
Fixes: 956a17d9d050 ("btrfs: add a shrinker for extent maps")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently, when built with "make W=1", the following warnings are
generated:
net/ceph/crush/mapper.c:466: warning: Function parameter or struct member 'work' not described in 'crush_choose_firstn'
net/ceph/crush/mapper.c:466: warning: Function parameter or struct member 'weight' not described in 'crush_choose_firstn'
net/ceph/crush/mapper.c:466: warning: Function parameter or struct member 'weight_max' not described in 'crush_choose_firstn'
net/ceph/crush/mapper.c:466: warning: Function parameter or struct member 'choose_args' not described in 'crush_choose_firstn'
Update the crush_choose_firstn() kernel-doc to document these
parameters.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Currently, when built with "make W=1", the following warnings are
generated:
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'map' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'work' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'bucket' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'weight' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'weight_max' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'x' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'left' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'numrep' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'type' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'out' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'outpos' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'tries' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'recurse_tries' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'recurse_to_leaf' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'out2' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'parent_r' not described in 'crush_choose_indep'
net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'choose_args' not described in 'crush_choose_indep'
These warnings are generated because the prologue comment for
crush_choose_indep() uses the kernel-doc prefix, but the actual
comment is a very brief description that is not in kernel-doc
format. Since this is a static function there is no need to fully
document the function, so replace the kernel-doc comment prefix with a
standard comment prefix to remove these warnings.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains Netfilter fixes for net:
Patch #1 fixes a bogus WARN_ON splat in nfnetlink_queue.
Patch #2 fixes a crash due to stack overflow in chain loop detection
by using the existing chain validation routines
Both patches from Florian Westphal.
netfilter pull request 24-07-11
* tag 'nf-24-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: prefer nft_chain_validate
netfilter: nfnetlink_queue: drop bogus WARN_ON
====================
Link: https://patch.msgid.link/20240711093948.3816-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-07-11
The following pull-request contains BPF updates for your *net* tree.
We've added 4 non-merge commits during the last 2 day(s) which contain
a total of 4 files changed, 262 insertions(+), 19 deletions(-).
The main changes are:
1) Fixes for a BPF timer lockup and a use-after-free scenario when timers
are used concurrently, from Kumar Kartikeya Dwivedi.
2) Fix the argument order in the call to bpf_map_kvcalloc() which could
otherwise lead to a compilation error, from Mohammad Shehar Yaar Tausif.
bpf-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add timer lockup selftest
bpf: Defer work in bpf_timer_cancel_and_free
bpf: Fail bpf_timer_cancel when callback is being cancelled
bpf: fix order of args in call to bpf_map_kvcalloc
====================
Link: https://patch.msgid.link/20240711084016.25757-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When using a BPF program on kernel_connect(), the call can return -EPERM. This
causes xs_tcp_setup_socket() to loop forever, filling up the syslog and causing
the kernel to potentially freeze up.
Neil suggested:
This will propagate -EPERM up into other layers which might not be ready
to handle it. It might be safer to map EPERM to an error we would be more
likely to expect from the network system - such as ECONNREFUSED or ENETDOWN.
ECONNREFUSED seems a reasonable error. For BPF programs, setting a
different error can be out of reach (see the handling in 4fbac77d2d09), in
particular on kernels which do not have f10d05966196 ("bpf: Make
BPF_PROG_RUN_ARRAY return -err instead of allow boolean"), so it is better
to simply remap the error for consistent behavior. UDP does handle EPERM in
xs_udp_send_request().
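The remapping itself is tiny. A sketch, assuming kernel-style negative errno return values; remap_connect_error() is a hypothetical helper name, not a function from the patch:

```c
#include <assert.h>
#include <errno.h>

/*
 * Translate the result of a connect attempt before it reaches retry
 * logic that does not expect -EPERM. Other values pass through unchanged.
 */
int remap_connect_error(int err)
{
	assert(err <= 0);	/* 0 or a negative errno, as in the kernel */
	if (err == -EPERM)
		return -ECONNREFUSED;
	return err;
}
```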
Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect")
Fixes: 4fbac77d2d09 ("bpf: Hooks for sys_bind")
Co-developed-by: Lex Siegel <usiegl00@gmail.com>
Signed-off-by: Lex Siegel <usiegl00@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trondmy@kernel.org>
Cc: Anna Schumaker <anna@kernel.org>
Link: https://github.com/cilium/cilium/issues/33395
Link: https://lore.kernel.org/bpf/171374175513.12877.8993642908082014881@noble.neil.brown.name
Link: https://patch.msgid.link/9069ec1d59e4b2129fc23433349fd5580ad43921.1720075070.git.daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
KASAN reports the following UAF:
BUG: KASAN: slab-use-after-free in tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
Read of size 1 at addr ffff888c07603600 by task handler130/6469
Call Trace:
<IRQ>
dump_stack_lvl+0x48/0x70
print_address_description.constprop.0+0x33/0x3d0
print_report+0xc0/0x2b0
kasan_report+0xd0/0x120
__asan_load1+0x6c/0x80
tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
tcf_ct_act+0x886/0x1350 [act_ct]
tcf_action_exec+0xf8/0x1f0
fl_classify+0x355/0x360 [cls_flower]
__tcf_classify+0x1fd/0x330
tcf_classify+0x21c/0x3c0
sch_handle_ingress.constprop.0+0x2c5/0x500
__netif_receive_skb_core.constprop.0+0xb25/0x1510
__netif_receive_skb_list_core+0x220/0x4c0
netif_receive_skb_list_internal+0x446/0x620
napi_complete_done+0x157/0x3d0
gro_cell_poll+0xcf/0x100
__napi_poll+0x65/0x310
net_rx_action+0x30c/0x5c0
__do_softirq+0x14f/0x491
__irq_exit_rcu+0x82/0xc0
irq_exit_rcu+0xe/0x20
common_interrupt+0xa1/0xb0
</IRQ>
<TASK>
asm_common_interrupt+0x27/0x40
Allocated by task 6469:
kasan_save_stack+0x38/0x70
kasan_set_track+0x25/0x40
kasan_save_alloc_info+0x1e/0x40
__kasan_krealloc+0x133/0x190
krealloc+0xaa/0x130
nf_ct_ext_add+0xed/0x230 [nf_conntrack]
tcf_ct_act+0x1095/0x1350 [act_ct]
tcf_action_exec+0xf8/0x1f0
fl_classify+0x355/0x360 [cls_flower]
__tcf_classify+0x1fd/0x330
tcf_classify+0x21c/0x3c0
sch_handle_ingress.constprop.0+0x2c5/0x500
__netif_receive_skb_core.constprop.0+0xb25/0x1510
__netif_receive_skb_list_core+0x220/0x4c0
netif_receive_skb_list_internal+0x446/0x620
napi_complete_done+0x157/0x3d0
gro_cell_poll+0xcf/0x100
__napi_poll+0x65/0x310
net_rx_action+0x30c/0x5c0
__do_softirq+0x14f/0x491
Freed by task 6469:
kasan_save_stack+0x38/0x70
kasan_set_track+0x25/0x40
kasan_save_free_info+0x2b/0x60
____kasan_slab_free+0x180/0x1f0
__kasan_slab_free+0x12/0x30
slab_free_freelist_hook+0xd2/0x1a0
__kmem_cache_free+0x1a2/0x2f0
kfree+0x78/0x120
nf_conntrack_free+0x74/0x130 [nf_conntrack]
nf_ct_destroy+0xb2/0x140 [nf_conntrack]
__nf_ct_resolve_clash+0x529/0x5d0 [nf_conntrack]
nf_ct_resolve_clash+0xf6/0x490 [nf_conntrack]
__nf_conntrack_confirm+0x2c6/0x770 [nf_conntrack]
tcf_ct_act+0x12ad/0x1350 [act_ct]
tcf_action_exec+0xf8/0x1f0
fl_classify+0x355/0x360 [cls_flower]
__tcf_classify+0x1fd/0x330
tcf_classify+0x21c/0x3c0
sch_handle_ingress.constprop.0+0x2c5/0x500
__netif_receive_skb_core.constprop.0+0xb25/0x1510
__netif_receive_skb_list_core+0x220/0x4c0
netif_receive_skb_list_internal+0x446/0x620
napi_complete_done+0x157/0x3d0
gro_cell_poll+0xcf/0x100
__napi_poll+0x65/0x310
net_rx_action+0x30c/0x5c0
__do_softirq+0x14f/0x491
The ct may be dropped if a clash has been resolved but is still passed to
the tcf_ct_flow_table_process_conn function for further usage. This issue
can be fixed by retrieving ct from skb again after confirming conntrack.
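A user-space sketch of the pattern, with hypothetical types and function names (struct conn, struct packet, confirm(), process()) that only mimic the ownership change: the confirm step may free the entry the packet pointed to and attach a different one, so the pointer has to be reloaded from the packet afterwards.

```c
#include <assert.h>
#include <stdlib.h>

struct conn { int id; };
struct packet { struct conn *ct; };

/*
 * Hypothetical confirm step. On a clash with an existing entry, the
 * packet's own entry is freed and the existing one is attached instead.
 */
void confirm(struct packet *pkt, struct conn *existing)
{
	if (existing && existing != pkt->ct) {
		free(pkt->ct);
		pkt->ct = existing;
	}
}

int process(struct packet *pkt, struct conn *existing)
{
	assert(pkt->ct != NULL);
	confirm(pkt, existing);
	/* Reload the pointer: one read before confirm() may have been freed. */
	struct conn *ct = pkt->ct;
	return ct->id;
}
```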
Fixes: 0cc254e5aa37 ("net/sched: act_ct: Offload connections with commit action")
Co-developed-by: Gerald Yang <gerald.yang@canonical.com>
Signed-off-by: Gerald Yang <gerald.yang@canonical.com>
Signed-off-by: Chengen Du <chengen.du@canonical.com>
Link: https://patch.msgid.link/20240710053747.13223-1-chengen.du@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The amount of TX space in the hardware buffer is tracked in the tx_space
variable. The initial value is currently only set during driver probing.
After closing the interface and reopening it the tx_space variable has
the last value it had before close. If it is smaller than the size of
the first send packet after reopening the interface the queue will be
stopped. The queue is woken up after receiving a TX interrupt but this
will never happen since we did not send anything.
This commit moves the initialization of the tx_space variable to the
ks8851_net_open function right before starting the TX queue. Also query
the value from the hardware instead of using a hard coded value.
Only the SPI chip variant is affected by this issue because only this
driver variant actually depends on the tx_space variable in the xmit
function.
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240709195845.9089-1-rwahl@gmx.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
syzkaller triggered the warning [0] in udp_v4_early_demux().
In udp_v[46]_early_demux() and sk_lookup(), we do not touch the refcount
of the looked-up sk and use sock_pfree() as skb->destructor, so we check
SOCK_RCU_FREE to ensure that the sk is safe to access during the RCU grace
period.
Currently, SOCK_RCU_FREE is flagged for a bound socket after being put
into the hash table. Moreover, the SOCK_RCU_FREE check is done too early
in udp_v[46]_early_demux() and sk_lookup(), so there could be a small race
window:
CPU1 CPU2
---- ----
udp_v4_early_demux() udp_lib_get_port()
| |- hlist_add_head_rcu()
|- sk = __udp4_lib_demux_lookup() |
|- DEBUG_NET_WARN_ON_ONCE(sk_is_refcounted(sk));
`- sock_set_flag(sk, SOCK_RCU_FREE)
We had the same bug in TCP and fixed it in commit 871019b22d1b ("net:
set SOCK_RCU_FREE before inserting socket into hashtable").
Let's apply the same fix for UDP.
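The ordering rule can be sketched in user space with C11 atomics. The names (struct fake_sock, bind_sock(), lookup_sock()) are hypothetical and a single atomic pointer stands in for the hash table; the point is only that the flag is set before the socket becomes visible to lookups.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_sock {
	bool rcu_free;	/* stands in for SOCK_RCU_FREE */
};

/* One-slot "hash table": a pointer that lookups may read concurrently. */
static _Atomic(struct fake_sock *) table_slot;

/* Bind: fully initialise the socket, then publish it. */
void bind_sock(struct fake_sock *sk)
{
	assert(sk != NULL);
	sk->rcu_free = true;				/* set the flag first */
	atomic_store_explicit(&table_slot, sk,
			      memory_order_release);	/* then make it visible */
}

/* Lookup: any socket found here already has the flag set. */
struct fake_sock *lookup_sock(void)
{
	return atomic_load_explicit(&table_slot, memory_order_acquire);
}
```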
[0]:
WARNING: CPU: 0 PID: 11198 at net/ipv4/udp.c:2599 udp_v4_early_demux+0x481/0xb70 net/ipv4/udp.c:2599
Modules linked in:
CPU: 0 PID: 11198 Comm: syz-executor.1 Not tainted 6.9.0-g93bda33046e7 #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:udp_v4_early_demux+0x481/0xb70 net/ipv4/udp.c:2599
Code: c5 7a 15 fe bb 01 00 00 00 44 89 e9 31 ff d3 e3 81 e3 bf ef ff ff 89 de e8 2c 74 15 fe 85 db 0f 85 02 06 00 00 e8 9f 7a 15 fe <0f> 0b e8 98 7a 15 fe 49 8d 7e 60 e8 4f 39 2f fe 49 c7 46 60 20 52
RSP: 0018:ffffc9000ce3fa58 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8318c92c
RDX: ffff888036ccde00 RSI: ffffffff8318c2f1 RDI: 0000000000000001
RBP: ffff88805a2dd6e0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0001ffffffffffff R12: ffff88805a2dd680
R13: 0000000000000007 R14: ffff88800923f900 R15: ffff88805456004e
FS: 00007fc449127640(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc449126e38 CR3: 000000003de4b002 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
PKRU: 55555554
Call Trace:
<TASK>
ip_rcv_finish_core.constprop.0+0xbdd/0xd20 net/ipv4/ip_input.c:349
ip_rcv_finish+0xda/0x150 net/ipv4/ip_input.c:447
NF_HOOK include/linux/netfilter.h:314 [inline]
NF_HOOK include/linux/netfilter.h:308 [inline]
ip_rcv+0x16c/0x180 net/ipv4/ip_input.c:569
__netif_receive_skb_one_core+0xb3/0xe0 net/core/dev.c:5624
__netif_receive_skb+0x21/0xd0 net/core/dev.c:5738
netif_receive_skb_internal net/core/dev.c:5824 [inline]
netif_receive_skb+0x271/0x300 net/core/dev.c:5884
tun_rx_batched drivers/net/tun.c:1549 [inline]
tun_get_user+0x24db/0x2c50 drivers/net/tun.c:2002
tun_chr_write_iter+0x107/0x1a0 drivers/net/tun.c:2048
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0x76f/0x8d0 fs/read_write.c:590
ksys_write+0xbf/0x190 fs/read_write.c:643
__do_sys_write fs/read_write.c:655 [inline]
__se_sys_write fs/read_write.c:652 [inline]
__x64_sys_write+0x41/0x50 fs/read_write.c:652
x64_sys_call+0xe66/0x1990 arch/x86/include/generated/asm/syscalls_64.h:2
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x4b/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fc44a68bc1f
Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 e9 cf f5 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 3c d0 f5 ff 48
RSP: 002b:00007fc449126c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00000000004bc050 RCX: 00007fc44a68bc1f
RDX: 0000000000000032 RSI: 00000000200000c0 RDI: 00000000000000c8
RBP: 00000000004bc050 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000032 R11: 0000000000000293 R12: 0000000000000000
R13: 000000000000000b R14: 00007fc44a5ec530 R15: 0000000000000000
</TASK>
Fixes: 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240709191356.24010-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
I2C core handles the local target for receiving HostNotify alerts. There
is no separate driver bound to that address. That means userspace can
access it if desired, leading to further complications if controllers
are not capable of reading their own local target. Bind the local target
to the dummy driver so it will be marked as "handled by the kernel" if
the HostNotify feature is used. That protects against userspace access
and prevents other drivers binding to it.
Fixes: 2a71593da34d ("i2c: smbus: add core function handling SMBus host-notify")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
The testunit has nothing to do with 'eeprom', remove that term. It was a
copy&paste leftover.
Fixes: a8335c64c5f0 ("i2c: add slave testunit driver")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|