summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-11-06gfs2: don't withdraw if init_threads() got interruptedAndreas Gruenbacher
In gfs2_fill_super(), when mounting a gfs2 filesystem is interrupted, kthread_create() can return -EINTR. When that happens, we roll back what has already been done and abort the mount. Since commit 62dd0f98a0e5 ("gfs2: Flag a withdraw if init_threads() fails), we are calling gfs2_withdraw_delayed() in gfs2_fill_super(); first via gfs2_make_fs_rw(), then directly. But gfs2_withdraw_delayed() only marks the filesystem as withdrawing and relies on a caller further up the stack to do the actual withdraw, which doesn't exist in the gfs2_fill_super() case. Because the filesystem is marked as withdrawing / withdrawn, function gfs2_lm_unmount() doesn't release the dlm lockspace, so when we try to mount that filesystem again, we get: gfs2: fsid=gohan:gohan0: Trying to join cluster "lock_dlm", "gohan:gohan0" gfs2: fsid=gohan:gohan0: dlm_new_lockspace error -17 Since commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"), the deadlock this gfs2_withdraw_delayed() call was supposed to work around cannot occur anymore because freeze_go_callback() won't take the sb->s_umount semaphore unconditionally anymore, so we can get rid of the gfs2_withdraw_delayed() in gfs2_fill_super() entirely. Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: stable@vger.kernel.org # v6.5+
2023-11-06gfs2: remove dead code in add_to_queueSu Hui
clang static analyzer complains that value stored to 'gh' is never read. The code of this line is useless after commit 0b93bac2271e ("gfs2: Remove LM_FLAG_PRIORITY flag"). Remove this code to save space. Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: Fix slab-use-after-free in gfs2_qd_deallocJuntong Deng
In gfs2_put_super(), whether withdrawn or not, the quota should be cleaned up by gfs2_quota_cleanup(). Otherwise, struct gfs2_sbd will be freed before gfs2_qd_dealloc (rcu callback) has run for all gfs2_quota_data objects, resulting in use-after-free. Also, gfs2_destroy_threads() and gfs2_quota_cleanup() is already called by gfs2_make_fs_ro(), so in gfs2_put_super(), after calling gfs2_make_fs_ro(), there is no need to call them again. Reported-by: syzbot+29c47e9e51895928698c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=29c47e9e51895928698c Signed-off-by: Juntong Deng <juntong.deng@outlook.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: Silence "suspicious RCU usage in gfs2_permission" warningAndreas Gruenbacher
Commit 0abd1557e21c added rcu_dereference() for dereferencing ip->i_gl in gfs2_permission. This now causes lockdep to complain when gfs2_permission is called in non-RCU context: WARNING: suspicious RCU usage in gfs2_permission Switch to rcu_dereference_check() and check for the MAY_NOT_BLOCK flag to shut up lockdep when we know that dereferencing ip->i_gl is safe. Fixes: 0abd1557e21c ("gfs2: fix an oops in gfs2_permission") Reported-by: syzbot+3e5130844b0c0e2b4948@syzkaller.appspotmail.com Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: fs: derive f_fsid from s_uuidAmir Goldstein
gfs2 already has optional persistent uuid. Use that uuid to report f_fsid in statfs(2), same as ext2/ext4/zonefs. This allows gfs2 to be monitored by fanotify filesystem watch. for example, with inotify-tools 4.23.8.0, the following command can be used to watch changes over entire filesystem: fsnotifywatch --filesystem /mnt/gfs2 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: No longer use 'extern' in function declarationsAndreas Gruenbacher
For non-static function declarations, external linkage is implied and the 'extern' keyword isn't needed. Some static checkers complain about the overuse of 'extern', so clean up all the function declarations. In addition, remove 'extern' from the definition of free_local_statfs_inodes(); it isn't needed there, either. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: Rename gfs2_lookup_{ simple => meta }Andreas Gruenbacher
Function gfs2_lookup_simple() is used for looking up inodes in the metadata directory tree, so rename it to gfs2_lookup_meta() to closer match its purpose. Clean the function up a little on the way. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: Convert gfs2_internal_read to foliosAndreas Gruenbacher
Change gfs2_internal_read() to use folios. Convert sizes to size_t. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: Convert stuffed_readpage to foliosAndreas Gruenbacher
Change stuffed_readpage() to take a folio instead of a page. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: Minor gfs2_write_jdata_batch PAGE_SIZE cleanupAndreas Gruenbacher
In gfs2_write_jdata_batch(), to compute the number of blocks, compute the total size of the folio batch instead of the number of pages it contains. Not a functional change. Note that we don't currently allow mounting filesystems with a block size bigger than the page size. We could change that after converting the page cache to folios. The page cache would then only contain block-size or bigger folios, so rounding wouldn't become an issue here. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-06gfs2: Get rid of gfs2_alloc_blocks generation parameterAndreas Gruenbacher
Get rid of the generation parameter of gfs2_alloc_blocks(): we only ever set the generation of the current inode while creating it, so do so directly. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-02gfs2: Add metapath_dibh helperAndreas Gruenbacher
Add a metapath_dibh() helper for extracting the inode's buffer head from a metapath. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-02gfs2: Clean up quota.c:print_messageAndreas Gruenbacher
Function print_message() in quota.c doesn't return a meaningful return value. Turn it into a void function and stop abusing it for setting variable error to 0 in gfs2_quota_check(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-02gfs2: Clean up gfs2_alloc_parms initializersAndreas Gruenbacher
When intializing a struct, all fields that are not explicitly mentioned are zeroed out already. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-11-02gfs2: Two quota=account mode fixesAndreas Gruenbacher
Make sure we don't skip accounting for quota changes with the quota=account mount option. Reviewed-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-10-23gfs2: Stop using GFS2_BASIC_BLOCK and GFS2_BASIC_BLOCK_SHIFTAndreas Gruenbacher
Header gfs2_ondisk.h defines GFS2_BASIC_BLOCK and GFS2_BASIC_BLOCK_SHIFT in a misguided attempt to abstract away the fact that sectors on block devices are 512 or (1 << 9) bytes in size. Stop using those definitions. I would be inclinded to remove those definitions altogether, but the gfs2 user-space tools are using them. In addition, instead of GFS2_SB(inode)->sd_sb.sb_bsize_shift, simply use inode->i_blkbits. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Andrew Price <anprice@redhat.com>
2023-10-23gfs2: setattr_chown: Add missing initializationAndreas Gruenbacher
Add a missing initialization of variable ap in setattr_chown(). Without, chown() may be able to bypass quotas. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-10-03gfs2: fix an oops in gfs2_permissionAl Viro
In RCU mode, we might race with gfs2_evict_inode(), which zeroes ->i_gl. Freeing of the object it points to is RCU-delayed, so if we manage to fetch the pointer before it's been replaced with NULL, we are fine. Check if we'd fetched NULL and treat that as "bail out and tell the caller to get out of RCU mode". Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-10-03gfs2: ignore negated quota changesBob Peterson
When lots of quota changes are made, there may be cases in which an inode's quota information is increased and then decreased, such as when blocks are added to a file, then deleted from it. If the timing is right, function do_qc can add pending quota changes to a transaction, then later, another call to do_qc can negate those changes, resulting in a net gain of 0. The quota_change information is recorded in the qc buffer (and qd element of the inode as well). The buffer is added to the transaction by the first call to do_qc, but a subsequent call changes the value from non-zero back to zero. At that point it's too late to remove the buffer_head from the transaction. Later, when the quota sync code is called, the zero-change qd element is discovered and flagged as an assert warning. If the fs is mounted with errors=panic, the kernel will panic. This is usually seen when files are truncated and the quota changes are negated by punch_hole/truncate which uses gfs2_quota_hold and gfs2_quota_unhold rather than block allocations that use gfs2_quota_lock and gfs2_quota_unlock which automatically do quota sync. This patch solves the problem by adding a check to qd_check_sync such that net-zero quota changes already added to the transaction are no longer deemed necessary to be synced, and skipped. In this case references are taken for the qd and the slot from do_qc so those need to be put. The normal sequence of events for a normal non-zero quota change is as follows: gfs2_quota_change do_qc qd_hold slot_hold Later, when the changes are to be synced: gfs2_quota_sync qd_fish qd_check_sync gets qd ref via lockref_get_not_dead do_sync do_qc(QC_SYNC) qd_put lockref_put_or_lock qd_unlock qd_put lockref_put_or_lock In the net-zero change case, we add a check to qd_check_sync so it puts the qd and slot references acquired in gfs2_quota_change and skip the unneeded sync. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-22gfs2: Don't update inode timestamps for direct writesAndreas Gruenbacher
During direct reads and writes, the caller is holding the inode glock in deferred mode, which doesn't allow metadata updates. However, a previous change caused callers to update the inode modification time before carrying out direct writes, which caused the inode glock to be converted to exclusive mode for the timestamp update, only to be immediately converted back to deferred mode for the direct write. This locks out other direct readers and writers and wreaks havoc on performance. Fix that by reverting to not updating the inode modification time for direct writes. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-22gfs2: Get rid of the gfs2_glock_is_held_* helpersAndreas Gruenbacher
Those helpers don't add any clarity and are easy to use wrong. Spell them out to make more obvious what's happening. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-18gfs2: Remove unused gfs2_extent_length argumentAndreas Gruenbacher
The limit argument of gfs2_extent_length() is unused. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-18gfs2: Remove freeze_go_demote_okAndreas Gruenbacher
Before commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"), the freeze glock was kept around in the glock cache in shared mode without being actively held while a filesystem is in thawed state. In that state, memory pressure could have eventually evicted the freeze glock, and the freeze_go_demote_ok callback was needed to prevent that from happening. With the freeze / thaw rework, the freeze glock is now always actively held in shared mode while a filesystem is thawed, and the freeze_go_demote_ok hack is no longer needed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-18gfs2: Simplify function gfs2_upgrade_iopen_glockAndreas Gruenbacher
When trying to upgrade the iopen glock, gfs2_upgrade_iopen_glock() tries to take the iopen glock with the LM_FLAG_TRY_1CB flag set before trying to take it without the LM_FLAG_TRY or LM_FLAG_TRY_1CB flags set. Both calls will cause the lock contention bast callbacks to be invoked throughout the cluster, and we really don't need them to be invoked twice. Remove the first LM_FLAG_TRY_1CB call to eliminate unnecessary dlm traffic. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-18Merge tag 'gfs2-v6.6-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: - Fix another freeze/thaw hang - Fix glock cache shrinking - Fix the quota=quiet mount option * tag 'gfs2-v6.6-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix quota=quiet oversight gfs2: fix glock shrinker ref issues gfs2: Fix another freeze/thaw hang
2023-09-18gfs2: Fix quota=quiet oversightBob Peterson
Patch eef46ab713f7 introduced a new gfs2 quota=quiet mount option. Checks for the new option were added to quota.c, but a check in gfs2_quota_lock_check() was overlooked. This patch adds the missing check. Fixes: eef46ab713f7 ("gfs2: Introduce new quota=quiet mount option") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-18gfs2: fix glock shrinker ref issuesBob Peterson
Before this patch, function gfs2_scan_glock_lru would only try to free glocks that had a reference count of 0. But if the reference count ever got to 0, the glock should have already been freed. Shrinker function gfs2_dispose_glock_lru checks whether glocks on the LRU are demote_ok, and if so, tries to demote them. But that's only possible if the reference count is at least 1. This patch changes gfs2_scan_glock_lru so it will try to demote and/or dispose of glocks that have a reference count of 1 and which are either demotable, or are already unlocked. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-18gfs2: Fix another freeze/thaw hangAndreas Gruenbacher
On a thawed filesystem, the freeze glock is held in shared mode. In order to initiate a cluster-wide freeze, the node initiating the freeze drops the freeze glock and grabs it in exclusive mode. The other nodes recognize this as contention on the freeze glock; function freeze_go_callback is invoked. This indicates to them that they must freeze the filesystem locally, drop the freeze glock, and then re-acquire it in shared mode before being able to unfreeze the filesystem locally. While a node is trying to re-acquire the freeze glock in shared mode, additional contention can occur. In that case, the node must behave in the same way as above. Unfortunately, freeze_go_callback() contains a check that causes it to bail out when the freeze glock isn't held in shared mode. Fix that to allow the glock to be unlocked or held in shared mode. In addition, update a reference to trylock_super() which has been renamed to super_trylock_shared() in the meantime. Fixes: b77b4a4815a9 ("gfs2: Rework freeze / thaw logic") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-09-17Linux 6.6-rc2Linus Torvalds
2023-09-17Merge tag 'x86-urgent-2023-09-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - Fix an UV boot crash - Skip spurious ENDBR generation on _THIS_IP_ - Fix ENDBR use in putuser() asm methods - Fix corner case boot crashes on 5-level paging - and fix a false positive WARNING on LTO kernels" * tag 'x86-urgent-2023-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/purgatory: Remove LTO flags x86/boot/compressed: Reserve more memory for page tables x86/ibt: Avoid duplicate ENDBR in __put_user_nocheck*() x86/ibt: Suppress spurious ENDBR x86/platform/uv: Use alternate source for socket to node data
2023-09-17Merge tag 'sched-urgent-2023-09-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Fix a performance regression on large SMT systems, an Intel SMT4 balancing bug, and a topology setup bug on (Intel) hybrid processors" * tag 'sched-urgent-2023-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sched: Restore the SD_ASYM_PACKING flag in the DIE domain sched/fair: Fix SMT4 group_smt_balance handling sched/fair: Optimize should_we_balance() for large SMT systems
2023-09-17Merge tag 'objtool-urgent-2023-09-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Ingo Molnar: "Fix a cold functions related false-positive objtool warning that triggers on Clang" * tag 'objtool-urgent-2023-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix _THIS_IP_ detection for cold functions
2023-09-17Merge tag 'core-urgent-2023-09-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull WARN fix from Ingo Molnar: "Fix a missing preempt-enable in the WARN() slowpath" * tag 'core-urgent-2023-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: panic: Reenable preemption in WARN slowpath
2023-09-17stat: remove no-longer-used helper macrosLinus Torvalds
The choose_32_64() macros were added to deal with an odd inconsistency between the 32-bit and 64-bit layout of 'struct stat' way back when in commit a52dd971f947 ("vfs: de-crapify "cp_new_stat()" function"). Then a decade later Mikulas noticed that said inconsistency had been a mistake in the early x86-64 port, and shouldn't have existed in the first place. So commit 932aba1e1690 ("stat: fix inconsistency between struct stat and struct compat_stat") removed the uses of the helpers. But the helpers remained around, unused. Get rid of them. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-17Merge tag '6.6-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: "Three small SMB3 client fixes, one to improve a null check and two minor cleanups" * tag '6.6-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb3: fix some minor typos and repeated words smb3: correct places where ENOTSUPP is used instead of preferred EOPNOTSUPP smb3: move server check earlier when setting channel sequence number
2023-09-17Merge tag '6.6-rc1-ksmbd' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: "Two ksmbd server fixes" * tag '6.6-rc1-ksmbd' of git://git.samba.org/ksmbd: ksmbd: fix passing freed memory 'aux_payload_buf' ksmbd: remove unneeded mark_inode_dirty in set_info_sec()
2023-09-17Merge tag 'ext4_for_linus-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Regression and bug fixes for ext4" * tag 'ext4_for_linus-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix rec_len verify error ext4: do not let fstrim block system suspend ext4: move setting of trimmed bit into ext4_try_to_trim_range() jbd2: Fix memory leak in journal_init_common() jbd2: Remove page size assumptions buffer: Make bh_offset() work for compound pages
2023-09-17x86/purgatory: Remove LTO flagsSong Liu
-flto* implies -ffunction-sections. With LTO enabled, ld.lld generates multiple .text sections for purgatory.ro: $ readelf -S purgatory.ro | grep " .text" [ 1] .text PROGBITS 0000000000000000 00000040 [ 7] .text.purgatory PROGBITS 0000000000000000 000020e0 [ 9] .text.warn PROGBITS 0000000000000000 000021c0 [13] .text.sha256_upda PROGBITS 0000000000000000 000022f0 [15] .text.sha224_upda PROGBITS 0000000000000000 00002be0 [17] .text.sha256_fina PROGBITS 0000000000000000 00002bf0 [19] .text.sha224_fina PROGBITS 0000000000000000 00002cc0 This causes WARNING from kexec_purgatory_setup_sechdrs(): WARNING: CPU: 26 PID: 110894 at kernel/kexec_file.c:919 kexec_load_purgatory+0x37f/0x390 Fix this by disabling LTO for purgatory. [ AFAICT, x86 is the only arch that supports LTO and purgatory. ] We could also fix this with an explicit linker script to rejoin .text.* sections back into .text. However, given the benefit of LTOing purgatory is small, simply disable the production of more .text.* sections for now. Fixes: b33fff07e3e3 ("x86, build: allow LTO to be selected") Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20230914170138.995606-1-song@kernel.org
2023-09-17x86/boot/compressed: Reserve more memory for page tablesKirill A. Shutemov
The decompressor has a hard limit on the number of page tables it can allocate. This limit is defined at compile-time and will cause boot failure if it is reached. The kernel is very strict and calculates the limit precisely for the worst-case scenario based on the current configuration. However, it is easy to forget to adjust the limit when a new use-case arises. The worst-case scenario is rarely encountered during sanity checks. In the case of enabling 5-level paging, a use-case was overlooked. The limit needs to be increased by one to accommodate the additional level. This oversight went unnoticed until Aaron attempted to run the kernel via kexec with 5-level paging and unaccepted memory enabled. Update wost-case calculations to include 5-level paging. To address this issue, let's allocate some extra space for page tables. 128K should be sufficient for any use-case. The logic can be simplified by using a single value for all kernel configurations. [ Also add a warning, should this memory run low - by Dave Hansen. ] Fixes: 34bbb0009f3b ("x86/boot/compressed: Enable 5-level paging during decompression stage") Reported-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230915070221.10266-1-kirill.shutemov@linux.intel.com
2023-09-16Merge tag 'kbuild-fixes-v6.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix kernel-devel RPM and linux-headers Deb package - Fix too long argument list error in 'make modules_install' * tag 'kbuild-fixes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: avoid long argument lists in make modules_install kbuild: fix kernel-devel RPM package and linux-headers Deb package
2023-09-16vm: fix move_vma() memory accounting being offLinus Torvalds
Commit 408579cd627a ("mm: Update do_vmi_align_munmap() return semantics") seems to have updated one of the callers of do_vmi_munmap() incorrectly: it used to check for the error case (which didn't change: negative means error). That commit changed the check to the success case (which did change: before that commit, 0 was success, and 1 was "success and lock downgraded". After the change, it's always 0 for success, and the lock will have been released if requested). This didn't change any actual VM behavior _except_ for memory accounting when 'VM_ACCOUNT' was set on the vma. Which made the wrong return value test fairly subtle, since everything continues to work. Or rather - it continues to work but the "Committed memory" accounting goes all wonky (Committed_AS value in /proc/meminfo), and depending on settings that then causes problems much much later as the VM relies on bogus statistics for its heuristics. Revert that one line of the change back to the original logic. Fixes: 408579cd627a ("mm: Update do_vmi_align_munmap() return semantics") Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Reported-bisected-and-tested-by: Michael Labiuk <michael.labiuk@virtuozzo.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Link: https://lore.kernel.org/all/1694366957@msgid.manchmal.in-ulm.de/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-16Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "16 small(ish) fixes all in drivers. The major fixes are in pm8001 (fixes MSI-X issue going back to its origin), the qla2xxx endianness fix, which fixes a bug on big endian and the lpfc ones which can cause an oops on module removal without them" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: lpfc: Prevent use-after-free during rmmod with mapped NVMe rports scsi: lpfc: Early return after marking final NLP_DROPPED flag in dev_loss_tmo scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file() scsi: target: core: Fix target_cmd_counter leak scsi: pm8001: Setup IRQs on resume scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command scsi: ufs: core: Poll HCS.UCRDY before issuing a UIC command scsi: ufs: core: Move __ufshcd_send_uic_cmd() outside host_lock scsi: qedf: Add synchronization between I/O completions and abort scsi: target: Replace strlcpy() with strscpy() scsi: qla2xxx: Fix NULL vs IS_ERR() bug for debugfs_create_dir() scsi: qla2xxx: Use raw_smp_processor_id() instead of smp_processor_id() scsi: qla2xxx: Correct endianness for rqstlen and rsplen scsi: ppa: Fix accidentally reversed conditions for 16-bit and 32-bit EPP scsi: megaraid_sas: Fix deadlock on firmware crashdump
2023-09-16Merge tag 'ata-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: - Fix link power management transitions to disallow unsupported states (Niklas) - A small string handling fix for the sata_mv driver (Christophe) - Clear port pending interrupts before reset, as per AHCI specifications (Szuying). Followup fixes for this one are to not clear ATA_PFLAG_EH_PENDING in ata_eh_reset() to allow EH to continue on with other actions recorded with error interrupts triggered before EH completes. And an additional fix to avoid thawing a port twice in EH (Niklas) - Small code style fixes in the pata_parport driver to silence the build bot as it keeps complaining about bad indentation (me) - A fix for the recent CDL code to avoid fetching sense data for successful commands when not necessary for correct operation (Niklas) * tag 'ata-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-core: fetch sense data for successful commands iff CDL enabled ata: libata-eh: do not thaw the port twice in ata_eh_reset() ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset() ata: pata_parport: Fix code style issues ata: libahci: clear pending interrupt status ata: sata_mv: Fix incorrect string length computation in mv_dump_mem() ata: libata: disallow dev-initiated LPM transitions to unsupported states
2023-09-16Merge tag 'usb-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fix from Greg KH: "Here is a single USB fix for a much-reported regression for 6.6-rc1. It resolves a crash in the typec debugfs code for many systems. It's been in linux-next with no reported issues, and many people have reported it resolving their problem with 6.6-rc1" * tag 'usb-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: ucsi: Fix NULL pointer dereference
2023-09-16Merge tag 'driver-core-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here is a single driver core fix for a much-reported-by-sysbot issue that showed up in 6.6-rc1. It's been submitted by many people, all in the same way, so it obviously fixes things for them all. Also in here is a single documentation update adding riscv to the embargoed hardware document in case there are any future issues with that processor family. Both of these have been in linux-next with no reported problems" * tag 'driver-core-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Documentation: embargoed-hardware-issues.rst: Add myself for RISC-V driver core: return an error when dev_set_name() hasn't happened
2023-09-16Merge tag 'char-misc-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fix from Greg KH: "Here is a single patch for 6.6-rc2 that reverts a 6.5 change for the comedi subsystem that has ended up being incorrect and caused drivers that were working for people to be unable to be able to be selected to build at all. To fix this, the Kconfig change needs to be reverted and a future set of fixes for the ioport dependancies will show up in 6.7-rc1 (there's no rush for them.) This has been in linux-next with no reported issues" * tag 'char-misc-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Revert "comedi: add HAS_IOPORT dependencies"
2023-09-16Merge tag 'i2c-for-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "The main thing is the removal of 'probe_new' because all i2c client drivers are converted now. Thanks Uwe, this marks the end of a long conversion process. Other than that, we have a few Kconfig updates and driver bugfixes" * tag 'i2c-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: cadence: Fix the kernel-doc warnings i2c: aspeed: Reset the i2c controller when timeout occurs i2c: I2C_MLXCPLD on ARM64 should depend on ACPI i2c: Make I2C_ATR invisible i2c: Drop legacy callback .probe_new() w1: ds2482: Switch back to use struct i2c_driver's .probe()
2023-09-16ata: libata-core: fetch sense data for successful commands iff CDL enabledNiklas Cassel
Currently, we fetch sense data for a _successful_ command if either: 1) Command was NCQ and ATA_DFLAG_CDL_ENABLED flag set (flag ATA_DFLAG_CDL_ENABLED will only be set if the Successful NCQ command sense data supported bit is set); or 2) Command was non-NCQ and regular sense data reporting is enabled. This means that case 2) will trigger for a non-NCQ command which has ATA_SENSE bit set, regardless if CDL is enabled or not. This decision was by design. If the device reports that it has sense data available, it makes sense to fetch that sense data, since the sk/asc/ascq could be important information regardless if CDL is enabled or not. However, the fetching of sense data for a successful command is done via ATA EH. Considering how intricate the ATA EH is, we really do not want to invoke ATA EH unless absolutely needed. Before commit 18bd7718b5c4 ("scsi: ata: libata: Handle completion of CDL commands using policy 0xD") we never fetched sense data for successful commands. In order to not invoke the ATA EH unless absolutely necessary, even if the device claims support for sense data reporting, only fetch sense data for successful (NCQ and non-NCQ commands) commands that are using CDL. [Damien] Modified the check to test the qc flag ATA_QCFLAG_HAS_CDL instead of the device support for CDL, which is implied for commands using CDL. Fixes: 3ac873c76d79 ("ata: libata-core: fix when to fetch sense data for successful commands") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-16ata: libata-eh: do not thaw the port twice in ata_eh_reset()Niklas Cassel
commit 1e641060c4b5 ("libata: clear eh_info on reset completion") added a workaround that broke the retry mechanism in ATA EH. Tejun himself suggested to remove this workaround when it was identified to cause additional problems: https://lore.kernel.org/linux-ide/20110426135027.GI878@htj.dyndns.org/ He even said: "Hmm... it seems I wasn't thinking straight when I added that work around." https://lore.kernel.org/linux-ide/20110426155229.GM878@htj.dyndns.org/ While removing the workaround solved the issue, however, the workaround was kept to avoid "spurious hotplug events during reset", and instead another workaround was added on top of the existing workaround in commit 8c56cacc724c ("libata: fix unexpectedly frozen port after ata_eh_reset()"). Because these IRQs happened when the port was frozen, we know that they were actually a side effect of PxIS and IS.IPS(x) not being cleared before the COMRESET. This is now done in commit 94152042eaa9 ("ata: libahci: clear pending interrupt status"), so these workarounds can now be removed. Since commit 1e641060c4b5 ("libata: clear eh_info on reset completion") has now been reverted, the ATA EH retry mechanism is functional again, so there is once again no need to thaw the port more than once in ata_eh_reset(). This reverts "the workaround on top of the workaround" introduced in commit 8c56cacc724c ("libata: fix unexpectedly frozen port after ata_eh_reset()"). Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-16ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()Niklas Cassel
ata_scsi_port_error_handler() starts off by clearing ATA_PFLAG_EH_PENDING, before calling ap->ops->error_handler() (without holding the ap->lock). If an error IRQ is received while ap->ops->error_handler() is running, the irq handler will set ATA_PFLAG_EH_PENDING. Once ap->ops->error_handler() returns, ata_scsi_port_error_handler() checks if ATA_PFLAG_EH_PENDING is set, and if it is, another iteration of ATA EH is performed. The problem is that ATA_PFLAG_EH_PENDING is not only cleared by ata_scsi_port_error_handler(), it is also cleared by ata_eh_reset(). ata_eh_reset() is called by ap->ops->error_handler(). This additional clearing done by ata_eh_reset() breaks the whole retry logic in ata_scsi_port_error_handler(). Thus, if an error IRQ is received while ap->ops->error_handler() is running, the port will currently remain frozen and will never get re-enabled. The additional clearing in ata_eh_reset() was introduced in commit 1e641060c4b5 ("libata: clear eh_info on reset completion"). Looking at the original error report: https://marc.info/?l=linux-ide&m=124765325828495&w=2 We can see the following happening: [ 1.074659] ata3: XXX port freeze [ 1.074700] ata3: XXX hardresetting link, stopping engine [ 1.074746] ata3: XXX flipping SControl [ 1.411471] ata3: XXX irq_stat=400040 CONN|PHY [ 1.411475] ata3: XXX port freeze [ 1.420049] ata3: XXX starting engine [ 1.420096] ata3: XXX rc=0, class=1 [ 1.420142] ata3: XXX clearing IRQs for thawing [ 1.420188] ata3: XXX port thawed [ 1.420234] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) We are not supposed to be able to receive an error IRQ while the port is frozen (PxIE is set to 0, i.e. all IRQs for the port are disabled). AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) states: "Each bit location can be thought of as reporting a '1' if the virtual "interrupt line" for that port is indicating it wishes to generate an interrupt. That is, if a port has one or more interrupt status bit set, and the enables for those status bits are set, then this bit shall be set." Additionally, AHCI state P:ComInit clearly shows that the state machine will only jump to P:ComInitSetIS (which sets IS.IPS(x) to '1'), if PxIE.PCE is set to '1'. In our case, PxIE is set to 0, so IS.IPS(x) won't get set. So IS.IPS(x) only gets set if PxIS and PxIE is set. AHCI 1.3.1 section 10.7.1.1 First Tier (IS Register) also states: "The bits in this register are read/write clear. It is set by the level of the virtual interrupt line being a set, and cleared by a write of '1' from the software." So if IS.IPS(x) is set, you need to explicitly clear it by writing a 1 to IS.IPS(x) for that port. Since PxIE is cleared, the only way to get an interrupt while the port is frozen, is if IS.IPS(x) is set, and the only way IS.IPS(x) can be set when the port is frozen, is if it was set before the port was frozen. However, since commit 737dd811a3db ("ata: libahci: clear pending interrupt status"), we clear both PxIS and IS.IPS(x) after freezing the port, but before the COMRESET, so the problem that commit 1e641060c4b5 ("libata: clear eh_info on reset completion") fixed can no longer happen. Thus, revert commit 1e641060c4b5 ("libata: clear eh_info on reset completion"), so that the retry logic in ata_scsi_port_error_handler() works once again. (The retry logic is still needed, since we can still get an error IRQ _after_ the port has been thawed, but before ata_scsi_port_error_handler() takes the ap->lock in order to check if ATA_PFLAG_EH_PENDING is set.) Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>