Age | Commit message (Collapse) | Author |
|
git://git.infradead.org/users/hch/configfs
Pull configfs updates from Christoph Hellwig:
- remove unused code (Dr. David Alan Gilbert)
- improve item creation performance (Seamus Connor)
* tag 'configfs-6.13-2024-11-19' of git://git.infradead.org/users/hch/configfs:
configfs: improve item creation performance
configfs: remove unused configfs_hash_and_remove
|
|
Pull jfs updates from Dave Kleikamp:
"A few more patches to add sanity checks in jfs"
* tag 'jfs-6.13' of github.com:kleikamp/linux-shaggy:
jfs: add a check to prevent array-index-out-of-bounds in dbAdjTree
jfs: xattr: check invalid xattr size more strictly
jfs: fix array-index-out-of-bounds in jfs_readdir
jfs: fix shift-out-of-bounds in dbSplit
jfs: array-index-out-of-bounds fix in dtReadFirst
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
- Fix recovery of locks that are being converted between PR/CW modes
- Fix cleanup of rsb list if recovery is interrupted during
recover_members
- Fix null dereference in debug code if dlm api is called improperly
- Fix wrong args passed to trace function
- Move error checks out of add_to_waiters so the function can't fail
- Clean up some code for configfs
* tag 'dlm-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: fix dlm_recover_members refcount on error
dlm: fix recovery of middle conversions
dlm: make add_to_waiters() that it can't fail
dlm: dlm_config_info config fields to unsigned int
dlm: use dlm_config as only cluster configuration
dlm: handle port as __be16 network byte order
dlm: disallow different configs nodeid storages
dlm: fix possible lkb_resource null dereference
dlm: fix swapped args sb_flags vs sb_status
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
"A couple of smaller random fsnotify fixes"
* tag 'fsnotify_for_v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: Fix ordering of iput() and watched_objects decrement
fsnotify: fix sending inotify event with unexpected filename
fanotify: allow reporting errors on failure to open fd
fsnotify, lsm: Decouple fsnotify from lsm
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull reiserfs removal from Jan Kara:
"The deprecation period of reiserfs is ending at the end of this year
so it is time to remove it"
* tag 'reiserfs_delete' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
reiserfs: The last commit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and isofs updates from Jan Kara:
"Fix a memory leak in isofs and a cleanup of includes in quota"
* tag 'for_v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
dquot.c: get rid of include ../internal.h
isofs: avoid memory leak in iocharset
|
|
Pull xfs updates from Carlos Maiolino:
"The bulk of this pull request is a major rework that Darrick and
Christoph have been doing on XFS's real-time volume, coupled with a
few features to support this rework. It does also includes some bug
fixes.
- convert perag to use xarrays
- create a new generic allocation group structure
- add metadata inode dir trees
- create in-core rt allocation groups
- shard the RT section into allocation groups
- persist quota options with the enw metadata dir tree
- enable quota for RT volumes
- enable metadata directory trees
- some bugfixes"
* tag 'xfs-6.13-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (146 commits)
xfs: port ondisk structure checks from xfs/122 to the kernel
xfs: separate space btree structures in xfs_ondisk.h
xfs: convert struct typedefs in xfs_ondisk.h
xfs: enable metadata directory feature
xfs: enable realtime quota again
xfs: update sb field checks when metadir is turned on
xfs: reserve quota for realtime files correctly
xfs: create quota preallocation watermarks for realtime quota
xfs: report realtime block quota limits on realtime directories
xfs: persist quota flags with metadir
xfs: advertise realtime quota support in the xqm stat files
xfs: scrub quota file metapaths
xfs: fix chown with rt quota
xfs: use metadir for quota inodes
xfs: refactor xfs_qm_destroy_quotainos
xfs: use rtgroup busy extent list for FITRIM
xfs: implement busy extent tracking for rtgroups
xfs: port the perag discard code to handle generic groups
xfs: move the min and max group block numbers to xfs_group
xfs: adjust min_block usage in xfs_verify_agbno
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"There is no outstanding feature for this cycle. The most useful
changes are SEEK_{DATA,HOLE} support and some decompression
micro-optimization. Other than those, there are some bugfixes and
cleanups as usual:
- Add SEEK_{DATA,HOLE} support
- Free redundant pclusters if no cached compressed data is valid
- Add sysfs entry to drop internal caches
- Several bugfixes & cleanups"
* tag 'erofs-for-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: handle NONHEAD !delta[1] lclusters gracefully
erofs: clarify direct I/O support
erofs: fix blksize < PAGE_SIZE for file-backed mounts
erofs: get rid of `buf->kmap_type`
erofs: fix file-backed mounts over FUSE
erofs: simplify definition of the log functions
erofs: add sysfs node to drop internal caches
erofs: free pclusters if no cached folio is attached
erofs: sunset `struct erofs_workgroup`
erofs: move erofs_workgroup operations into zdata.c
erofs: get rid of erofs_{find,insert}_workgroup
erofs: add SEEK_{DATA,HOLE} support
|
|
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.
Update open_cached_dir() to drop refs rather than directly freeing the
cfid.
Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().
Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):
==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65
CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
<TASK>
dump_stack_lvl+0x77/0xb0
print_report+0xce/0x660
kasan_report+0xd3/0x110
smb2_cached_lease_break+0x27/0xb0
process_one_work+0x50a/0xc50
worker_thread+0x2ba/0x530
kthread+0x17c/0x1c0
ret_from_fork+0x34/0x60
ret_from_fork_asm+0x1a/0x30
</TASK>
Allocated by task 2464:
kasan_save_stack+0x33/0x60
kasan_save_track+0x14/0x30
__kasan_kmalloc+0xaa/0xb0
open_cached_dir+0xa7d/0x1fb0
smb2_query_path_info+0x43c/0x6e0
cifs_get_fattr+0x346/0xf10
cifs_get_inode_info+0x157/0x210
cifs_revalidate_dentry_attr+0x2d1/0x460
cifs_getattr+0x173/0x470
vfs_statx_path+0x10f/0x160
vfs_statx+0xe9/0x150
vfs_fstatat+0x5e/0xc0
__do_sys_newfstatat+0x91/0xf0
do_syscall_64+0x95/0x1a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 2464:
kasan_save_stack+0x33/0x60
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x51/0x70
kfree+0x174/0x520
open_cached_dir+0x97f/0x1fb0
smb2_query_path_info+0x43c/0x6e0
cifs_get_fattr+0x346/0xf10
cifs_get_inode_info+0x157/0x210
cifs_revalidate_dentry_attr+0x2d1/0x460
cifs_getattr+0x173/0x470
vfs_statx_path+0x10f/0x160
vfs_statx+0xe9/0x150
vfs_fstatat+0x5e/0xc0
__do_sys_newfstatat+0x91/0xf0
do_syscall_64+0x95/0x1a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Last potentially related work creation:
kasan_save_stack+0x33/0x60
__kasan_record_aux_stack+0xad/0xc0
insert_work+0x32/0x100
__queue_work+0x5c9/0x870
queue_work_on+0x82/0x90
open_cached_dir+0x1369/0x1fb0
smb2_query_path_info+0x43c/0x6e0
cifs_get_fattr+0x346/0xf10
cifs_get_inode_info+0x157/0x210
cifs_revalidate_dentry_attr+0x2d1/0x460
cifs_getattr+0x173/0x470
vfs_statx_path+0x10f/0x160
vfs_statx+0xe9/0x150
vfs_fstatat+0x5e/0xc0
__do_sys_newfstatat+0x91/0xf0
do_syscall_64+0x95/0x1a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The buggy address belongs to the object at ffff88811cc24c00
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)
Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
open_cached_dir() may either race with the tcon reconnection even before
compound_send_recv() or directly trigger a reconnection via
SMB2_open_init() or SMB_query_info_init().
The reconnection process invokes invalidate_all_cached_dirs() via
cifs_mark_open_files_invalid(), which removes all cfids from the
cfids->entries list but doesn't drop a ref if has_lease isn't true. This
results in the currently-being-constructed cfid not being on the list,
but still having a refcount of 2. It leaks if returned from
open_cached_dir().
Fix this by setting cfid->has_lease when the ref is actually taken; the
cfid will not be used by other threads until it has a valid time.
Addresses these kmemleaks:
unreferenced object 0xffff8881090c4000 (size 1024):
comm "bash", pid 1860, jiffies 4295126592
hex dump (first 32 bytes):
00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de ........".......
00 ca 45 22 81 88 ff ff f8 dc 4f 04 81 88 ff ff ..E"......O.....
backtrace (crc 6f58c20f):
[<ffffffff8b895a1e>] __kmalloc_cache_noprof+0x2be/0x350
[<ffffffff8bda06e3>] open_cached_dir+0x993/0x1fb0
[<ffffffff8bdaa750>] cifs_readdir+0x15a0/0x1d50
[<ffffffff8b9a853f>] iterate_dir+0x28f/0x4b0
[<ffffffff8b9a9aed>] __x64_sys_getdents64+0xfd/0x200
[<ffffffff8cf6da05>] do_syscall_64+0x95/0x1a0
[<ffffffff8d00012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
unreferenced object 0xffff8881044fdcf8 (size 8):
comm "bash", pid 1860, jiffies 4295126592
hex dump (first 8 bytes):
00 cc cc cc cc cc cc cc ........
backtrace (crc 10c106a9):
[<ffffffff8b89a3d3>] __kmalloc_node_track_caller_noprof+0x363/0x480
[<ffffffff8b7d7256>] kstrdup+0x36/0x60
[<ffffffff8bda0700>] open_cached_dir+0x9b0/0x1fb0
[<ffffffff8bdaa750>] cifs_readdir+0x15a0/0x1d50
[<ffffffff8b9a853f>] iterate_dir+0x28f/0x4b0
[<ffffffff8b9a9aed>] __x64_sys_getdents64+0xfd/0x200
[<ffffffff8cf6da05>] do_syscall_64+0x95/0x1a0
[<ffffffff8d00012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
And addresses these BUG splats when unmounting the SMB filesystem:
BUG: Dentry ffff888140590ba0{i=1000000000080,n=/} still in use (2) [unmount of cifs cifs]
WARNING: CPU: 3 PID: 3433 at fs/dcache.c:1536 umount_check+0xd0/0x100
Modules linked in:
CPU: 3 UID: 0 PID: 3433 Comm: bash Not tainted 6.12.0-rc4-g850925a8133c-dirty #49
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:umount_check+0xd0/0x100
Code: 8d 7c 24 40 e8 31 5a f4 ff 49 8b 54 24 40 41 56 49 89 e9 45 89 e8 48 89 d9 41 57 48 89 de 48 c7 c7 80 e7 db ac e8 f0 72 9a ff <0f> 0b 58 31 c0 5a 5b 5d 41 5c 41 5d 41 5e 41 5f e9 2b e5 5d 01 41
RSP: 0018:ffff88811cc27978 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff888140590ba0 RCX: ffffffffaaf20bae
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff8881f6fb6f40
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed1023984ee3
R10: ffff88811cc2771f R11: 00000000016cfcc0 R12: ffff888134383e08
R13: 0000000000000002 R14: ffff8881462ec668 R15: ffffffffaceab4c0
FS: 00007f23bfa98740(0000) GS:ffff8881f6f80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556de4a6f808 CR3: 0000000123c80000 CR4: 0000000000350ef0
Call Trace:
<TASK>
d_walk+0x6a/0x530
shrink_dcache_for_umount+0x6a/0x200
generic_shutdown_super+0x52/0x2a0
kill_anon_super+0x22/0x40
cifs_kill_sb+0x159/0x1e0
deactivate_locked_super+0x66/0xe0
cleanup_mnt+0x140/0x210
task_work_run+0xfb/0x170
syscall_exit_to_user_mode+0x29f/0x2b0
do_syscall_64+0xa1/0x1a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f23bfb93ae7
Code: ff ff ff ff c3 66 0f 1f 44 00 00 48 8b 0d 11 93 0d 00 f7 d8 64 89 01 b8 ff ff ff ff eb bf 0f 1f 44 00 00 b8 50 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 92 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffee9138598 EFLAGS: 00000246 ORIG_RAX: 0000000000000050
RAX: 0000000000000000 RBX: 0000558f1803e9a0 RCX: 00007f23bfb93ae7
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000558f1803e9a0
RBP: 0000558f1803e600 R08: 0000000000000007 R09: 0000558f17fab610
R10: d91d5ec34ab757b0 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000015 R15: 0000000000000000
</TASK>
irq event stamp: 1163486
hardirqs last enabled at (1163485): [<ffffffffac98d344>] _raw_spin_unlock_irqrestore+0x34/0x60
hardirqs last disabled at (1163486): [<ffffffffac97dcfc>] __schedule+0xc7c/0x19a0
softirqs last enabled at (1163482): [<ffffffffab79a3ee>] __smb_send_rqst+0x3de/0x990
softirqs last disabled at (1163480): [<ffffffffac2314f1>] release_sock+0x21/0xf0
---[ end trace 0000000000000000 ]---
VFS: Busy inodes after unmount of cifs (cifs)
------------[ cut here ]------------
kernel BUG at fs/super.c:661!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 3433 Comm: bash Tainted: G W 6.12.0-rc4-g850925a8133c-dirty #49
Tainted: [W]=WARN
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:generic_shutdown_super+0x290/0x2a0
Code: e8 15 7c f7 ff 48 8b 5d 28 48 89 df e8 09 7c f7 ff 48 8b 0b 48 89 ee 48 8d 95 68 06 00 00 48 c7 c7 80 7f db ac e8 00 69 af ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90
RSP: 0018:ffff88811cc27a50 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffffffae994420 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffffffffab06180e RDI: ffff8881f6eb18c8
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed103edd6319
R10: ffff8881f6eb18cb R11: 00000000016d3158 R12: ffff8881462ec9c0
R13: ffff8881462ec050 R14: 0000000000000001 R15: 0000000000000000
FS: 00007f23bfa98740(0000) GS:ffff8881f6e80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8364005d68 CR3: 0000000123c80000 CR4: 0000000000350ef0
Call Trace:
<TASK>
kill_anon_super+0x22/0x40
cifs_kill_sb+0x159/0x1e0
deactivate_locked_super+0x66/0xe0
cleanup_mnt+0x140/0x210
task_work_run+0xfb/0x170
syscall_exit_to_user_mode+0x29f/0x2b0
do_syscall_64+0xa1/0x1a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f23bfb93ae7
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:generic_shutdown_super+0x290/0x2a0
Code: e8 15 7c f7 ff 48 8b 5d 28 48 89 df e8 09 7c f7 ff 48 8b 0b 48 89 ee 48 8d 95 68 06 00 00 48 c7 c7 80 7f db ac e8 00 69 af ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90
RSP: 0018:ffff88811cc27a50 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffffffae994420 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffffffffab06180e RDI: ffff8881f6eb18c8
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed103edd6319
R10: ffff8881f6eb18cb R11: 00000000016d3158 R12: ffff8881462ec9c0
R13: ffff8881462ec050 R14: 0000000000000001 R15: 0000000000000000
FS: 00007f23bfa98740(0000) GS:ffff8881f6e80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8364005d68 CR3: 0000000123c80000 CR4: 0000000000350ef0
This reproduces eventually with an SMB mount and two shells running
these loops concurrently
- while true; do
cd ~; sleep 1;
for i in {1..3}; do cd /mnt/test/subdir;
echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1;
done;
echo ...;
done
- while true; do
iptables -F OUTPUT; mount -t cifs -a;
for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done;
iptables -I OUTPUT -p tcp --dport 445 -j DROP;
sleep 10
echo "unmounting"; umount -l -t cifs -a; echo "done unmounting";
sleep 20
echo "recovering"; iptables -F OUTPUT;
sleep 10;
done
Fixes: ebe98f1447bb ("cifs: enable caching of directories for which a lease is held")
Fixes: 5c86919455c1 ("smb: client: fix use-after-free in smb2_query_info_compound()")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
We can't use PATH_MAX for SMB symlinks because
(1) Windows Server will fail FSCTL_SET_REPARSE_POINT with
STATUS_IO_REPARSE_DATA_INVALID when input buffer is larger than
16K, as specified in MS-FSA 2.1.5.10.37.
(2) The client won't be able to parse large SMB responses that
includes SMB symlink path within SMB2_CREATE or SMB2_IOCTL
responses.
Fix this by defining a maximum length value (4060) for SMB symlinks
that both client and server can handle.
Cc: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
smb2_set_next_command() no longer squashes request iovs into a single
iov, so the bounds check can be dropped.
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
After commit f7f291e14dde ("cifs: fix oops during encryption"), the
encryption layer can handle vmalloc'd buffers as well as kmalloc'd
buffers, so there is no need to inefficiently squash request iovs
into a single one to handle padding in compound requests.
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This client was only requesting READ caching, not READ and HANDLE caching
in the LeaseState on the open requests we send for directories. To
delay closing a handle (e.g. for caching directory contents) we should
be requesting HANDLE as well as READ (as we already do for deferred
close of files). See MS-SMB2 3.3.1.4 e.g.
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Windows Server <<2012
Windows NFS server versions on Windows Server older than 2012 release use
for storing char and block devices modified SFU format, not compatible with
the original SFU. Windows NFS server on Windows Server 2012 and new
versions use different format (reparse points), not related to SFU-style.
SFU / SUA / Interix subsystem stores the major and major numbers as pair of
64-bit integer, but Windows NFS server stores as pair of 32-bit integers.
Which makes char and block devices between Windows NFS server <<2012 and
Windows SFU/SUA/Interix subsytem incompatible.
So improve Linux SMB client.
When SFU mode is enabled (mount option -o sfu is specified) then recognize
also these kind of char and block devices and its major and minor numbers,
which are used by Windows Server versions older than 2012.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In the current implementation, the SMB filesystem on a mount point can
trigger upcalls from the kernel to the userspace to enable certain
functionalities like spnego, dns_resolution, amongst others. These upcalls
usually either happen in the context of the mount or in the context of an
application/user. The upcall handler for cifs, cifs.upcall already has
existing code which switches the namespaces to the caller's namespace
before handling the upcall. This behaviour is expected for scenarios like
multiuser mounts, but might not cover all single user scenario with
services such as Kubernetes, where the mount can happen from different
locations such as on the host, from an app container, or a driver pod
which does the mount on behalf of a different pod.
This patch introduces a new mount option called upcall_target, to
customise the upcall behaviour. upcall_target can take 'mount' and 'app'
as possible values. This aids use cases like Kubernetes where the mount
happens on behalf of the application in another container altogether.
Having this new mount option allows the mount command to specify where the
upcall should happen: 'mount' for resolving the upcall to the host
namespace, and 'app' for resolving the upcall to the ns of the calling
thread. This will enable both the scenarios where the Kerberos credentials
can be found on the application namespace or the host namespace to which
just the mount operation is "delegated".
Reviewed-by: Shyam Prasad <shyam.prasad@microsoft.com>
Reviewed-by: Bharath S M <bharathsm@microsoft.com>
Reviewed-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Ritvik Budhiraja <rbudhiraja@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The cifs_sb_tlink() function can return error pointers, but this code
dereferences it before checking for error pointers. Re-order the code
to fix that.
Fixes: 0f9b6b045bb2 ("fs/smb/client: implement chmod() for SMB3 POSIX Extensions")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The NT ACL format for an SMB3 POSIX Extensions chmod() is a single ACE with the
magic S-1-5-88-3-mode SID:
NT Security Descriptor
Revision: 1
Type: 0x8004, Self Relative, DACL Present
Offset to owner SID: 56
Offset to group SID: 124
Offset to SACL: 0
Offset to DACL: 20
Owner: S-1-5-21-3177838999-3893657415-1037673384-1000
Group: S-1-22-2-1000
NT User (DACL) ACL
Revision: NT4 (2)
Size: 36
Num ACEs: 1
NT ACE: S-1-5-88-3-438, flags 0x00, Access Allowed, mask 0x00000000
Type: Access Allowed
NT ACE Flags: 0x00
Size: 28
Access required: 0x00000000
SID: S-1-5-88-3-438
Owner and Group should be NULL, but the server is not required to fail the
request if they are present.
Signed-off-by: Ralph Boehme <slow@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Update this log message since cached fids may represent things other
than the root of a mount.
Fixes: e4029e072673 ("cifs: find and use the dentry for cached non-root directories also")
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"The most significant set of changes is the per netns RTNL. The new
behavior is disabled by default, regression risk should be contained.
Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its
default value from PTP_1588_CLOCK_KVM, as the first is intended to be
a more reliable replacement for the latter.
Core:
- Started a very large, in-progress, effort to make the RTNL lock
scope per network-namespace, thus reducing the lock contention
significantly in the containerized use-case, comprising:
- RCU-ified some relevant slices of the FIB control path
- introduce basic per netns locking helpers
- namespacified the IPv4 address hash table
- remove rtnl_register{,_module}() in favour of
rtnl_register_many()
- refactor rtnl_{new,del,set}link() moving as much validation as
possible out of RTNL lock
- convert all phonet doit() and dumpit() handlers to RCU
- convert IPv4 addresses manipulation to per-netns RTNL
- convert virtual interface creation to per-netns RTNL
the per-netns lock infrastructure is guarded by the
CONFIG_DEBUG_NET_SMALL_RTNL knob, disabled by default ad interim.
- Introduce NAPI suspension, to efficiently switching between busy
polling (NAPI processing suspended) and normal processing.
- Migrate the IPv4 routing input, output and control path from direct
ToS usage to DSCP macros. This is a work in progress to make ECN
handling consistent and reliable.
- Add drop reasons support to the IPv4 rotue input path, allowing
better introspection in case of packets drop.
- Make FIB seqnum lockless, dropping RTNL protection for read access.
- Make inet{,v6} addresses hashing less predicable.
- Allow providing timestamp OPT_ID via cmsg, to correlate TX packets
and timestamps
Things we sprinkled into general kernel code:
- Add small file operations for debugfs, to reduce the struct ops
size.
- Refactoring and optimization for the implementation of page_frag
API, This is a preparatory work to consolidate the page_frag
implementation.
Netfilter:
- Optimize set element transactions to reduce memory consumption
- Extended netlink error reporting for attribute parser failure.
- Make legacy xtables configs user selectable, giving users the
option to configure iptables without enabling any other config.
- Address a lot of false-positive RCU issues, pointed by recent CI
improvements.
BPF:
- Put xsk sockets on a struct diet and add various cleanups. Overall,
this helps to bump performance by 12% for some workloads.
- Extend BPF selftests to increase coverage of XDP features in
combination with BPF cpumap.
- Optimize and homogenize bpf_csum_diff helper for all archs and also
add a batch of new BPF selftests for it.
- Extend netkit with an option to delegate skb->{mark,priority}
scrubbing to its BPF program.
- Make the bpf_get_netns_cookie() helper available also to tc(x) BPF
programs.
Protocols:
- Introduces 4-tuple hash for connected udp sockets, speeding-up
significantly connected sockets lookup.
- Add a fastpath for some TCP timers that usually expires after
close, the socket lock contention.
- Add inbound and outbound xfrm state caches to speed up state
lookups.
- Avoid sending MPTCP advertisements on stale subflows, reducing
risks on loosing them.
- Make neighbours table flushing more scalable, maintaining per
device neigh lists.
Driver API:
- Introduce a unified interface to configure transmission H/W
shaping, and expose it to user-space via generic-netlink.
- Add support for per-NAPI config via netlink. This makes napi
configuration persistent across queues removal and re-creation.
Requires driver updates, currently supported drivers are:
nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice.
- Add ethtool support for writing SFP / PHY firmware blocks.
- Track RSS context allocation from ethtool core.
- Implement support for mirroring to DSA CPU port, via TC mirror
offload.
- Consolidate FDB updates notification, to avoid duplicates on
device-specific entries.
- Expose DPLL clock quality level to the user-space.
- Support master-slave PHY config via device tree.
Tests and tooling:
- forwarding: introduce deferred commands, to simplify the cleanup
phase
Drivers:
- Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic,
Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the
IRQs and queues to NAPI IDs, allowing busy polling and better
introspection.
- Ethernet high-speed NICs:
- nVidia/Mellanox:
- mlx5:
- a large refactor to implement support for cross E-Switch
scheduling
- refactor H/W conter management to let it scale better
- H/W GRO cleanups
- Intel (100G, ice)::
- add support for ethtool reset
- implement support for per TX queue H/W shaping
- AMD/Solarflare:
- implement per device queue stats support
- Broadcom (bnxt):
- improve wildcard l4proto on IPv4/IPv6 ntuple rules
- Marvell Octeon:
- Add representor support for each Resource Virtualization Unit
(RVU) device.
- Hisilicon:
- add support for the BMC Gigabit Ethernet
- IBM (EMAC):
- driver cleanup and modernization
- Cisco (VIC):
- raise the queues number limit to 256
- Ethernet virtual:
- Google vNIC:
- implement page pool support
- macsec:
- inherit lower device's features and TSO limits when
offloading
- virtio_net:
- enable premapped mode by default
- support for XDP socket(AF_XDP) zerocopy TX
- wireguard:
- set the TSO max size to be GSO_MAX_SIZE, to aggregate larger
packets.
- Ethernet NICs embedded and virtual:
- Broadcom ASP:
- enable software timestamping
- Freescale:
- add enetc4 PF driver
- MediaTek: Airoha SoC:
- implement BQL support
- RealTek r8169:
- enable TSO by default on r8168/r8125
- implement extended ethtool stats
- Renesas AVB:
- enable TX checksum offload
- Synopsys (stmmac):
- support header splitting for vlan tagged packets
- move common code for DWMAC4 and DWXGMAC into a separate FPE
module.
- add dwmac driver support for T-HEAD TH1520 SoC
- Synopsys (xpcs):
- driver refactor and cleanup
- TI:
- icssg_prueth: add VLAN offload support
- Xilinx emaclite:
- add clock support
- Ethernet switches:
- Microchip:
- implement support for the lan969x Ethernet switch family
- add LAN9646 switch support to KSZ DSA driver
- Ethernet PHYs:
- Marvel: 88q2x: enable auto negotiation
- Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2
- PTP:
- Add support for the Amazon virtual clock device
- Add PtP driver for s390 clocks
- WiFi:
- mac80211
- EHT 1024 aggregation size for transmissions
- new operation to indicate that a new interface is to be added
- support radio separation of multi-band devices
- move wireless extension spy implementation to libiw
- Broadcom:
- brcmfmac: optional LPO clock support
- Microchip:
- add support for Atmel WILC3000
- Qualcomm (ath12k):
- firmware coredump collection support
- add debugfs support for a multitude of statistics
- Qualcomm (ath5k):
- Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support
- Realtek:
- rtw88: 8821au and 8812au USB adapters support
- rtw89: add thermal protection
- rtw89: fine tune BT-coexsitence to improve user experience
- rtw89: firmware secure boot for WiFi 6 chip
- Bluetooth
- add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and
0x13d3:0x3623
- add Realtek RTL8852BE support for id Foxconn 0xe123
- add MediaTek MT7920 support for wireless module ids
- btintel_pcie: add handshake between driver and firmware
- btintel_pcie: add recovery mechanism
- btnxpuart: add GPIO support to power save feature"
* tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1475 commits)
mm: page_frag: fix a compile error when kernel is not compiled
Documentation: tipc: fix formatting issue in tipc.rst
selftests: nic_performance: Add selftest for performance of NIC driver
selftests: nic_link_layer: Add selftest case for speed and duplex states
selftests: nic_link_layer: Add link layer selftest for NIC driver
bnxt_en: Add FW trace coredump segments to the coredump
bnxt_en: Add a new ethtool -W dump flag
bnxt_en: Add 2 parameters to bnxt_fill_coredump_seg_hdr()
bnxt_en: Add functions to copy host context memory
bnxt_en: Do not free FW log context memory
bnxt_en: Manage the FW trace context memory
bnxt_en: Allocate backing store memory for FW trace logs
bnxt_en: Add a 'force' parameter to bnxt_free_ctx_mem()
bnxt_en: Refactor bnxt_free_ctx_mem()
bnxt_en: Add mem_valid bit to struct bnxt_ctx_mem_type
bnxt_en: Update firmware interface spec to 1.10.3.85
selftests/bpf: Add some tests with sockmap SK_PASS
bpf: fix recursive lock when verdict program return SK_PASS
wireguard: device: support big tcp GSO
wireguard: selftests: load nf_conntrack if not present
...
|
|
We use rwlock to protect core structure data of extent tree during
its shrink, however, if there is a huge number of extent nodes in
extent tree, during shrink of extent tree, it may hold rwlock for
a very long time, which may trigger kernel hang issue.
This patch fixes to shrink read extent node in batches, so that,
critical region of the rwlock can be shrunk to avoid its extreme
long time hold.
Reported-by: Xiuhong Wang <xiuhong.wang@unisoc.com>
Closes: https://lore.kernel.org/linux-f2fs-devel/20241112110627.1314632-1-xiuhong.wang@unisoc.com/
Signed-off-by: Xiuhong Wang <xiuhong.wang@unisoc.com>
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If fs corruption occurs in f2fs_new_node_page(), let's print
more information about corrupted metadata into kernel log.
Meanwhile, it updates to record ERROR_INCONSISTENT_NAT instead
of ERROR_INVALID_BLKADDR if blkaddr in nat entry is not
NULL_ADDR which means nat bitmap and nat entry is inconsistent.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
SBI_POR_DOING can be cleared after recovery is completed, so that
changes made before recovery can be persistent, and subsequent
errors can be recorded into cp/sb.
Signed-off-by: Song Feng <songfeng@oppo.com>
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Signed-off-by: Sheng Yong <shengyong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Fsync data recovery attempts to check and fix write pointer consistency
of cursegs and all other zones. If the write pointers of cursegs are
unaligned, cursegs are changed to new sections.
If recovery fails, zone write pointers are still checked and fixed,
but the latest checkpoint cannot be written back. Additionally, retry-
mount skips recovery and rolls back to reuse the old cursegs whose
zones are already finished. This can lead to unaligned write later.
This patch addresses the issue by leaving writer pointers untouched if
recovery fails. When retry-mount is performed, cursegs and other zones
are checked and fixed after skipping recovery.
Signed-off-by: Song Feng <songfeng@oppo.com>
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Signed-off-by: Sheng Yong <shengyong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
The unusable cap value must be adjusted before checking whether
checkpoint=disable is feasible.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
dd if=/dev/zero of=file bs=4k count=5
xfs_io file -c "fiemap -v 2 16384"
file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..31]: 139272..139303 32 0x1000
1: [32..39]: 139304..139311 8 0x1001
xfs_io file -c "fiemap -v 0 16384"
file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..31]: 139272..139303 32 0x1000
xfs_io file -c "fiemap -v 0 16385"
file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..39]: 139272..139311 40 0x1001
There are two problems:
- continuous extent is split to two
- FIEMAP_EXTENT_LAST is missing in last extent
The root cause is: if upper boundary of inquiry crosses extent,
f2fs_map_blocks() will truncate length of returned extent to
F2FS_BYTES_TO_BLK(len), and also, it will stop to query latter
extent or hole to make sure current extent is last or not.
In order to fix this issue, once we found an extent locates
in the end of inquiry range by f2fs_map_blocks(), we need to
expand inquiry range to requiry.
Cc: stable@vger.kernel.org
Fixes: 7f63eb77af7b ("f2fs: report unwritten area in f2fs_fiemap")
Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If user give a file size as "length" parameter for fiemap
operations, but if this size is non-block size aligned,
it will show 2 segments fiemap results even this whole file
is contiguous on disk, such as the following results:
./f2fs_io fiemap 0 19034 ylog/analyzer.py
Fiemap: offset = 0 len = 19034
logical addr. physical addr. length flags
0 0000000000000000 0000000020baa000 0000000000004000 00001000
1 0000000000004000 0000000020bae000 0000000000001000 00001001
after this patch:
./f2fs_io fiemap 0 19034 ylog/analyzer.py
Fiemap: offset = 0 len = 19034
logical addr. physical addr. length flags
0 0000000000000000 00000000315f3000 0000000000005000 00001001
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
f2fs doesn't support different blksize in one instance, so
bytes_to_blks() and blks_to_bytes() are equal to F2FS_BYTES_TO_BLK
and F2FS_BLK_TO_BYTES, let's use F2FS_BYTES_TO_BLK/F2FS_BLK_TO_BYTES
instead for cleanup.
Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
It missed to cast variable to unsigned long long type before
bit shift, which will cause overflow, fix it.
Fixes: f7ef9b83b583 ("f2fs: introduce macros to convert bytes and blocks in f2fs")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
strcpy is deprecated. Kernel docs recommend replacing strcpy with
strscpy. The function strcpy() return value isn't used so there
shouldn't be an issue replacing with the safer alternative strscpy.
Signed-off-by: Daniel Yang <danielyangkang@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This reverts commit 54f43a10fa257ad4af02a1d157fefef6ebcfa7dc.
The above commit broke the lazytime mount, given
mount("/dev/vdb", "/mnt/test", "f2fs", 0, "lazytime");
CC: stable@vger.kernel.org # 6.11+
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Add BPF uprobe session support (Jiri Olsa)
- Optimize uprobe performance (Andrii Nakryiko)
- Add bpf_fastcall support to helpers and kfuncs (Eduard Zingerman)
- Avoid calling free_htab_elem() under hash map bucket lock (Hou Tao)
- Prevent tailcall infinite loop caused by freplace (Leon Hwang)
- Mark raw_tracepoint arguments as nullable (Kumar Kartikeya Dwivedi)
- Introduce uptr support in the task local storage map (Martin KaFai
Lau)
- Stringify errno log messages in libbpf (Mykyta Yatsenko)
- Add kmem_cache BPF iterator for perf's lock profiling (Namhyung Kim)
- Support BPF objects of either endianness in libbpf (Tony Ambardar)
- Add ksym to struct_ops trampoline to fix stack trace (Xu Kuohai)
- Introduce private stack for eligible BPF programs (Yonghong Song)
- Migrate samples/bpf tests to selftests/bpf test_progs (Daniel T. Lee)
- Migrate test_sock to selftests/bpf test_progs (Jordan Rife)
* tag 'bpf-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (152 commits)
libbpf: Change hash_combine parameters from long to unsigned long
selftests/bpf: Fix build error with llvm 19
libbpf: Fix memory leak in bpf_program__attach_uprobe_multi
bpf: use common instruction history across all states
bpf: Add necessary migrate_disable to range_tree.
bpf: Do not alloc arena on unsupported arches
selftests/bpf: Set test path for token/obj_priv_implicit_token_envvar
selftests/bpf: Add a test for arena range tree algorithm
bpf: Introduce range_tree data structure and use it in bpf arena
samples/bpf: Remove unused variable in xdp2skb_meta_kern.c
samples/bpf: Remove unused variables in tc_l2_redirect_kern.c
bpftool: Cast variable `var` to long long
bpf, x86: Propagate tailcall info only for subprogs
bpf: Add kernel symbol for struct_ops trampoline
bpf: Use function pointers count as struct_ops links count
bpf: Remove unused member rcu from bpf_struct_ops_map
selftests/bpf: Add struct_ops prog private stack tests
bpf: Support private stack for struct_ops progs
selftests/bpf: Add tracing prog private stack tests
bpf, x86: Support private stack in jit
...
|
|
SPI-NAND changes:
A load of fixes to Winbond manufacturer driver have been done, plus a
structure constification.
Raw NAND changes:
The GPMI driver has been improved on the power management side.
The Davinci driver has been cleaned up.
A leak in the Atmel driver plus some typos in the core have been fixed.
|
|
Rename pwrctrl functions and structures from "pwrctl" to "pwrctrl" to match
the similar file renames.
Link: https://lore.kernel.org/r/20241115214428.2061153-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
|
|
To slightly reduce confusion between "pwrctl" (the power controller and
power sequencing framework) and "bwctrl" (the bandwidth controller),
rename "pwrctl" to "pwrctrl" so they use the same "ctrl" suffix.
Rename drivers/pci/pwrctl/ to drivers/pci/pwrctrl/, including the related
MAINTAINERS, include file (include/linux/pci-pwrctl.h), Makefile, and
Kconfig changes.
This is the minimal rename of files only. A subsequent commit will rename
functions and data structures.
Link: https://lore.kernel.org/r/20241115214428.2061153-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
|
|
pwrctl parent
There is no need to iterate over all children of the pwrctl device parent
to remove the pwrctl device. Since the pwrctl device associated with the
PCI device can be found using of_find_device_by_node() API, use it directly
instead.
Any pwrctl devices lying around without getting associated with the PCI
devices will be removed once their parent device gets removed.
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-5-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
As per the kernel device driver model, a pwrctl device is the supplier for
the PCI device, but the device link that enforces the supplier-consumer
relationship was previously created by the pwrctl driver. Therefore, the
driver model didn't prevent probing PCI client drivers before probing the
corresponding pwrctl drivers. This may lead to a race condition if the PCI
device was already powered on by the bootloader (before the pwrctl driver).
If the bootloader did not power on the PCI device, this wouldn't create any
problem as the pwrctl driver will be the one powering on the device, so the
PCI client driver always gets probed afterward. But if the device was
already powered on, then the device will be seen by the PCI core and the
PCI client driver may get probed before its pwrctl driver. This creates a
race condition as the pwrctl driver may change the device power state while
the device is being accessed by the client driver.
One such issue was already reported on the Qcom X13s platform with the WLAN
device and fixed with a hack in the WCN pwrseq driver by a9aaf1ff88a8
("power: sequencing: request the WLAN enable GPIO as-is").
A cleaner way to fix the above mentioned race condition is to ensure that
the pwrctl drivers are always probed before the client drivers.
If the PCI device is associated with a pwrctl platform device with a power
supply, add a device link between the PCI device and the pwrctl device
before device_attach() in pci_bus_add_device().
Note that there is no need to explicitly remove the device link as that
will be taken care of by the driver core when the PCI device gets removed.
Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
Fixes: 8fb18619d910 ("PCI/pwrctl: Create platform devices for child OF nodes of the port node")
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-3-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: squash fix from
https://lore.kernel.org/r/20241120062459.6371-1-manivannan.sadhasivam@linaro.org
for SPARCv9 issue reported by Jonathan Currier <dullfire@yahoo.com>]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: wrap code to 80 columns]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: stable+noautosel@kernel.org # Depends on power supply check
|
|
Currently, pwrctl devices are created if the corresponding PCI nodes are
defined in devicetree. But this is not correct, because not all PCI nodes
require pwrctl support. Pwrctl comes into the picture only when the device
requires kernel to manage its power state. This can be determined using the
power supply properties present in the devicetree node of the device.
Add of_pci_supply_present() to check whether the devicetree contains at
least one power supply property for a device. If one is present, create a
pwrctl device for that PCI node.
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: 8fb18619d910 ("PCI/pwrctl: Create platform devices for child OF nodes of the port node")
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-2-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: rename of_pci_is_supply_present() to of_pci_supply_present() for
readability]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: stable+noautosel@kernel.org # Depends on of_platform_device_create() rework
|
|
The of_platform_populate() API creates platform devices by descending
through the children of the parent node. But it provides no control over
the child nodes, which makes it difficult to add checks for the child
nodes in the future.
Use of_platform_device_create() and for_each_child_of_node_scoped() to make
it possible to add checks for each node before creating the platform
device.
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-1-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Fix the inverted check for security_sb_show_options().
Link: https://lore.kernel.org/r/c8eaa647-5d67-49b6-9401-705afcb7e4d7@stanley.mountain
Link: https://lore.kernel.org/r/20241120-verehren-rhabarber-83a11b297bcc@brauner
Fixes: aefff51e1c29 ("statmount: retrieve security mount options")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: stable@vger.kernel.org # mainline only
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Move common code from opt_array/opt_sec_array to helper. This helper
does more than just unescape options, so rename to
statmount_opt_process().
Handle corner case of just a single character in options.
Rename some local variables to better describe their function.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://lore.kernel.org/r/20241120142732.55210-1-mszeredi@redhat.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Brian Foster <bfoster@redhat.com> says:
Here's v4 of the zero range flush improvements. No real major changes
here, mostly minor whitespace, naming issues, etc.
* patches from https://lore.kernel.org/r/20241115200155.593665-1-bfoster@redhat.com:
iomap: elide flush from partial eof zero range
iomap: lift zeroed mapping handling into iomap_zero_range()
iomap: reset per-iter state on non-error iter advances
Link: https://lore.kernel.org/r/20241115200155.593665-1-bfoster@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Remove duplicate included header file linux/uio.h
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Link: https://lore.kernel.org/r/20240628062329.321162-2-thorsten.blum@toblux.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
iomap zero range flushes pagecache in certain situations to
determine which parts of the range might require zeroing if dirty
data is present in pagecache. The kernel robot recently reported a
regression associated with this flushing in the following stress-ng
workload on XFS:
stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --metamix 64
This workload involves repeated small, strided, extending writes. On
XFS, this produces a pattern of post-eof speculative preallocation,
conversion of preallocation from delalloc to unwritten, dirtying
pagecache over newly unwritten blocks, and then rinse and repeat
from the new EOF. This leads to repetitive flushing of the EOF folio
via the zero range call XFS uses for writes that start beyond
current EOF.
To mitigate this problem, special case EOF block zeroing to prefer
zeroing the folio over a flush when the EOF folio is already dirty.
To do this, split out and open code handling of an unaligned start
offset. This brings most of the performance back by avoiding flushes
on zero range calls via write and truncate extension operations. The
flush doesn't occur in these situations because the entire range is
post-eof and therefore the folio that overlaps EOF is the only one
in the range.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20241115200155.593665-4-bfoster@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
In preparation for special handling of subranges, lift the zeroed
mapping logic from the iterator into the caller. Since this puts the
pagecache dirty check and flushing in the same place, streamline the
comments a bit as well.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20241115200155.593665-3-bfoster@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
iomap_iter_advance() zeroes the processed and mapping fields on
every non-error iteration except for the last expected iteration
(i.e. return 0 expected to terminate the iteration loop). This
appears to be circumstantial as nothing currently relies on these
fields after the final iteration.
Therefore to better faciliate iomap_iter reuse in subsequent
patches, update iomap_iter_advance() to always reset per-iteration
state on successful completion.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20241115200155.593665-2-bfoster@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
iomap_zero_range() uses buffered writes for manual zeroing, no
longer updates i_size for such writes, but is still explicitly
called for post-eof ranges. The historical use case for this is
zeroing post-eof speculative preallocation on extending writes from
XFS. However, XFS also recently changed to convert all post-eof
delalloc mappings to unwritten in the iomap_begin() handler, which
means it now never expects manual zeroing of post-eof mappings. In
other words, all post-eof mappings should be reported as holes or
unwritten.
This is a subtle dependency that can be hard to detect if violated
because associated codepaths are likely to update i_size after folio
locks are dropped, but before writeback happens to occur. For
example, if XFS reverts back to some form of manual zeroing of
post-eof blocks on write extension, writeback of those zeroed folios
will now race with the presumed i_size update from the subsequent
buffered write.
Since iomap_zero_range() can't correctly zero post-eof mappings
beyond EOF without updating i_size, warn if this ever occurs. This
serves as minimal indication that if this use case is reintroduced
by a filesystem, iomap_zero_range() might need to reconsider i_size
updates for write extending use cases.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20241115145931.535207-1-bfoster@redhat.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Setting GPIO direction = high, sometimes results in GPIO value = 0.
If a GPIO is pulled high, the following construction results in the
value being 0 when the desired value is 1:
$ echo "high" > /sys/class/gpio/gpio336/direction
$ cat /sys/class/gpio/gpio336/value
0
Before the GPIO direction is changed from an input to an output,
exar_set_value() is called with value = 1, but since the GPIO is an
input when exar_set_value() is called, _regmap_update_bits() reads a 1
due to an external pull-up. regmap_set_bits() sets force_write =
false, so the value (1) is not written. When the direction is then
changed, the GPIO becomes an output with the value of 0 (the hardware
default).
regmap_write_bits() sets force_write = true, so the value is always
written by exar_set_value() and an external pull-up doesn't affect the
outcome of setting direction = high.
The same can happen when a GPIO is pulled low, but the scenario is a
little more complicated.
$ echo high > /sys/class/gpio/gpio351/direction
$ cat /sys/class/gpio/gpio351/value
1
$ echo in > /sys/class/gpio/gpio351/direction
$ cat /sys/class/gpio/gpio351/value
0
$ echo low > /sys/class/gpio/gpio351/direction
$ cat /sys/class/gpio/gpio351/value
1
Fixes: 36fb7218e878 ("gpio: exar: switch to using regmap")
Co-developed-by: Matthew McClain <mmcclain@noprivs.com>
Signed-off-by: Matthew McClain <mmcclain@noprivs.com>
Signed-off-by: Sai Kumar Cholleti <skmr537@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20241105071523.2372032-1-skmr537@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
During conversion driver to modern APIs the base field initial value
of the GPIO chip was moved from -1 to 0, which triggers a warning.
Add missed base initialisation as it was in the original code.
Initialise the GPIO chip label correctly as it was done by
of_mm_gpiochip_add_data() before the below mentioned change.
Fixes: 50dded8d9d62 ("gpio: altera: Drop legacy-of-mm-gpiochip.h header")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20241118095402.516989-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Initialise the GPIO chip label correctly as it was done by
of_mm_gpiochip_add_data() before the below mentioned change.
Fixes: cf8f4462e5fa ("gpio: zevio: drop of_gpio.h header")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20241118092729.516736-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|