summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-07-30f2fs: fix to do sanity check on segment type in build_sit_entries()Chao Yu
As Wenqing Liu <wenqingliu0120@gmail.com> reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216285 RIP: 0010:memcpy_erms+0x6/0x10 f2fs_update_meta_page+0x84/0x570 [f2fs] change_curseg.constprop.0+0x159/0xbd0 [f2fs] f2fs_do_replace_block+0x5c7/0x18a0 [f2fs] f2fs_replace_block+0xeb/0x180 [f2fs] recover_data+0x1abd/0x6f50 [f2fs] f2fs_recover_fsync_data+0x12ce/0x3250 [f2fs] f2fs_fill_super+0x4459/0x6190 [f2fs] mount_bdev+0x2cf/0x3b0 legacy_get_tree+0xed/0x1d0 vfs_get_tree+0x81/0x2b0 path_mount+0x47e/0x19d0 do_mount+0xce/0xf0 __x64_sys_mount+0x12c/0x1a0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd The root cause is segment type is invalid, so in f2fs_do_replace_block(), f2fs accesses f2fs_sm_info::curseg_array with out-of-range segment type, result in accessing invalid curseg->sum_blk during memcpy in f2fs_update_meta_page(). Fix this by adding sanity check on segment type in build_sit_entries(). Reported-by: Wenqing Liu <wenqingliu0120@gmail.com> Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: obsolete unused MAX_DISCARD_BLOCKSChao Yu
After commit a7eeb823854c ("f2fs: use bitmap in discard_entry"), MAX_DISCARD_BLOCKS became obsolete, remove it. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: fix to avoid use f2fs_bug_on() in f2fs_new_node_page()Chao Yu
As Dipanjan Das <mail.dipanjan.das@gmail.com> reported, syzkaller found a f2fs bug as below: RIP: 0010:f2fs_new_node_page+0x19ac/0x1fc0 fs/f2fs/node.c:1295 Call Trace: write_all_xattrs fs/f2fs/xattr.c:487 [inline] __f2fs_setxattr+0xe76/0x2e10 fs/f2fs/xattr.c:743 f2fs_setxattr+0x233/0xab0 fs/f2fs/xattr.c:790 f2fs_xattr_generic_set+0x133/0x170 fs/f2fs/xattr.c:86 __vfs_setxattr+0x115/0x180 fs/xattr.c:182 __vfs_setxattr_noperm+0x125/0x5f0 fs/xattr.c:216 __vfs_setxattr_locked+0x1cf/0x260 fs/xattr.c:277 vfs_setxattr+0x13f/0x330 fs/xattr.c:303 setxattr+0x146/0x160 fs/xattr.c:611 path_setxattr+0x1a7/0x1d0 fs/xattr.c:630 __do_sys_lsetxattr fs/xattr.c:653 [inline] __se_sys_lsetxattr fs/xattr.c:649 [inline] __x64_sys_lsetxattr+0xbd/0x150 fs/xattr.c:649 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 NAT entry and nat bitmap can be inconsistent, e.g. one nid is free in nat bitmap, and blkaddr in its NAT entry is not NULL_ADDR, it may trigger BUG_ON() in f2fs_new_node_page(), fix it. Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com> Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: fix to remove F2FS_COMPR_FL and tag F2FS_NOCOMP_FL at the same timeChao Liu
If the inode has the compress flag, it will fail to use 'chattr -c +m' to remove its compress flag and tag no compress flag. However, the same command will be successful when executed again, as shown below: $ touch foo.txt $ chattr +c foo.txt $ chattr -c +m foo.txt chattr: Invalid argument while setting flags on foo.txt $ chattr -c +m foo.txt $ f2fs_io getflags foo.txt get a flag on foo.txt ret=0, flags=nocompression,inline_data Fix this by removing some checks in f2fs_setflags_common() that do not affect the original logic. I go through all the possible scenarios, and the results are as follows. Bold is the only thing that has changed. +---------------+-----------+-----------+----------+ | | file flags | + command +-----------+-----------+----------+ | | no flag | compr | nocompr | +---------------+-----------+-----------+----------+ | chattr +c | compr | compr | -EINVAL | | chattr -c | no flag | no flag | nocompr | | chattr +m | nocompr | -EINVAL | nocompr | | chattr -m | no flag | compr | no flag | | chattr +c +m | -EINVAL | -EINVAL | -EINVAL | | chattr +c -m | compr | compr | compr | | chattr -c +m | nocompr | *nocompr* | nocompr | | chattr -c -m | no flag | no flag | no flag | +---------------+-----------+-----------+----------+ Link: https://lore.kernel.org/linux-f2fs-devel/20220621064833.1079383-1-chaoliu719@gmail.com/ Fixes: 4c8ff7095bef ("f2fs: support data compression") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Chao Liu <liuchao@coolpad.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: introduce sysfs atomic write statisticsDaeho Jeong
introduce the below 4 new sysfs node for atomic write statistics. - current_atomic_write: the total current atomic write block count, which is not committed yet. - peak_atomic_write: the peak value of total current atomic write block count after boot. - committed_atomic_block: the accumulated total committed atomic write block count after boot. - revoked_atomic_block: the accumulated total revoked atomic write block count after boot. Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: don't bother wait_ms by foreground gcqixiaoyu1
f2fs_gc returns -EINVAL via f2fs_balance_fs when there is enough free secs after write checkpoint, but with gc_merge enabled, it will cause the sleep time of gc thread to be set to no_gc_sleep_time even if there are many dirty segments can be selected. Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: invalidate meta pages only for post_read required inodeChao Yu
After commit e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write"), invalidate_mapping_pages() will be called to avoid race condition in between IPU/DIO and readahead for GC. However, readahead flow is only used for post_read required inode, so this patch adds check condition to avoids unnecessary page cache invalidating for non-post_read inode. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: allow compression of files without blocksChao Liu
Files created by truncate(1) have a size but no blocks, so they can be allowed to enable compression. Signed-off-by: Chao Liu <liuchao@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: fix to check inline_data during compressed inode conversionChao Yu
When converting inode to compressed one via ioctl, it needs to check inline_data, since inline_data flag and compressed flag are incompatible. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: Delete f2fs_copy_page() and replace with memcpy_page()Fabio M. De Francesco
f2fs_copy_page() is a wrapper around two kmap() + one memcpy() from/to the mapped pages. It unnecessarily duplicates a kernel API and it makes use of kmap(), which is being deprecated in favor of kmap_local_page(). Two main problems with kmap(): (1) It comes with an overhead as mapping space is restricted and protected by a global lock for synchronization and (2) it also requires global TLB invalidation when the kmap’s pool wraps and it might block when the mapping space is fully utilized until a slot becomes available. With kmap_local_page() the mappings are per thread, CPU local, can take page faults, and can be called from any context (including interrupts). It is faster than kmap() in kernels with HIGHMEM enabled. Therefore, its use in __clone_blkaddrs() is safe and should be preferred. Delete f2fs_copy_page() and use a plain memcpy_page() in the only one site calling the removed function. memcpy_page() avoids open coding two kmap_local_page() + one memcpy() between the two kernel virtual addresses. Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: fix to invalidate META_MAPPING before DIO writeChao Yu
Quoted from commit e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write") " Encrypted pages during GC are read and cached in META_MAPPING. However, due to cached pages in META_MAPPING, there is an issue where newly written pages are lost by IPU or DIO writes. Thread A - f2fs_gc() Thread B /* phase 3 */ down_write(i_gc_rwsem) ra_data_block() ---- (a) up_write(i_gc_rwsem) f2fs_direct_IO() : - down_read(i_gc_rwsem) - __blockdev_direct_io() - get_data_block_dio_write() - f2fs_dio_submit_bio() ---- (b) - up_read(i_gc_rwsem) /* phase 4 */ down_write(i_gc_rwsem) move_data_block() ---- (c) up_write(i_gc_rwsem) (a) In phase 3 of f2fs_gc(), up-to-date page is read from storage and cached in META_MAPPING. (b) In thread B, writing new data by IPU or DIO write on same blkaddr as read in (a). cached page in META_MAPPING become out-dated. (c) In phase 4 of f2fs_gc(), out-dated page in META_MAPPING is copied to new blkaddr. In conclusion, the newly written data in (b) is lost. To address this issue, invalidating pages in META_MAPPING before IPU or DIO write. " In previous commit, we missed to cover extent cache hit case, and passed wrong value for parameter @end of invalidate_mapping_pages(), fix both issues. Fixes: 6aa58d8ad20a ("f2fs: readahead encrypted block during GC") Fixes: e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write") Cc: Hyeong-Jun Kim <hj514.kim@samsung.com> Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: add a sysfs entry to show zone capacityJaegeuk Kim
This patch adds a sysfs entry showing the unusable space in a section made by zone capacity. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: adjust zone capacity when considering valid block countJaegeuk Kim
This patch fixes counting unusable blocks set by zone capacity when checking the valid block count in a section. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: enforce single zone capacityJaegeuk Kim
In order to simplify the complicated per-zone capacity, let's support only one capacity for entire zoned device. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: remove redundant code for gc conditionduguowei
Remove the redundant code and use local variant as the argument directly. Make it more human-readable. Signed-off-by: duguowei <duguowei@xiaomi.com> [Jaegeuk Kim: make code neat] Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30f2fs: introduce memory modeDaeho Jeong
Introduce memory mode to supports "normal" and "low" memory modes. "low" mode is to support low memory devices. Because of the nature of low memory devices, in this mode, f2fs will try to save memory sometimes by sacrificing performance. "normal" mode is the default mode and same as before. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Last set of ARM fixes for 5.19: - fix for MAX_DMA_ADDRESS overflow - fix for find_*_bit performing an out of bounds memory access" * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: findbit: fix overflowing offset ARM: 9216/1: Fix MAX_DMA_ADDRESS overflow
2022-07-30dt-bindings: leds: pwm-multicolor: document max-brigthnessKrzysztof Kozlowski
The Multicolor PWM LED uses max-brigthness property (in the example and in the driver), so document it to fixi dt_binding_check warning like: leds/leds-pwm-multicolor.example.dtb: led-controller: multi-led: Unevaluated properties are not allowed ('max-brightness' was unexpected) Reported-by: Rob Herring <robh@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Pavel Machek <pavel@ucw.cz>
2022-07-30leds: turris-omnia: convert to use dev_groupsGreg Kroah-Hartman
The driver core supports the ability to handle the creation and removal of device-specific sysfs files in a race-free manner. Take advantage of that by converting this driver to use this by moving the sysfs attributes into a group and assigning the dev_groups pointer to it. Cc: Pavel Machek <pavel@ucw.cz> Cc: linux-leds@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Pavel Machek <pavel@ucw.cz>
2022-07-30rv/reactor: Add the panic reactorDaniel Bristot de Oliveira
Sample reactor that panics the system when an exception is found. This is useful both to capture a vmcore, or to fail-safe a critical system. Link: https://lkml.kernel.org/r/729aae3aba95f35738b8f8180e626d747d1d9da2.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv/reactor: Add the printk reactorDaniel Bristot de Oliveira
A reactor that printks the reaction message. Link: https://lkml.kernel.org/r/b65f18a7fd6dc6659a3008fd7b7392de3465d47b.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv/monitor: Add the wwnr monitorDaniel Bristot de Oliveira
Per task wakeup while not running (wwnr) monitor. This model is broken, the reason is that a task can be running in the processor without being set as RUNNABLE. Think about a task about to sleep: 1: set_current_state(TASK_UNINTERRUPTIBLE); 2: schedule(); And then imagine an IRQ happening in between the lines one and two, waking the task up. BOOM, the wakeup will happen while the task is running. Q: Why do we need this model, so? A: To test the reactors. Link: https://lkml.kernel.org/r/473c0fc39967250fdebcff8b620311c11dccad30.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv/monitor: Add the wip monitorDaniel Bristot de Oliveira
The wakeup in preemptive (wip) monitor verifies if the wakeup events always take place with preemption disabled: | | v #==================# H preemptive H <+ #==================# | | | | preempt_disable | preempt_enable v | sched_waking +------------------+ | +--------------- | | | | | non_preemptive | | +--------------> | | -+ +------------------+ The wakeup event always takes place with preemption disabled because of the scheduler synchronization. However, because the preempt_count and its trace event are not atomic with regard to interrupts, some inconsistencies might happen. The documentation illustrates one of these cases. Link: https://lkml.kernel.org/r/c98ca678df81115fddc04921b3c79720c836b18f.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv/monitor: Add the wip monitor skeleton created by dot2kDaniel Bristot de Oliveira
THIS CODE IS NOT LINKED TO THE MAKEFILE. This model does not compile because it lacks the instrumentation part, which will be added next. In the typical case, there will be only one patch, but it was split into two patches for educational purposes. This is the direct output this command line: $ dot2k -d tools/verification/models/wip.dot -t per_cpu Link: https://lkml.kernel.org/r/5eb7a9118917e8a814c5e49853a72fc62be0a101.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30Documentation/rv: Add deterministic automata instrumentation documentationDaniel Bristot de Oliveira
Add the da_monitor_instrumentation.rst. It describes the basics of RV monitor instrumentation. Link: https://lkml.kernel.org/r/0557d5c68e2fc252f2643c2cc5295a67e2b73277.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30Documentation/rv: Add deterministic automata monitor synthesis documentationDaniel Bristot de Oliveira
Add the da_monitor_synthesis.rst introduces some concepts behind the Deterministic Automata (DA) monitor synthesis and interface. Link: https://lkml.kernel.org/r/7873bdb7b2e5d2bc0b2eb6ca0b324af9a0ba27a0.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30tools/rv: Add dot2kDaniel Bristot de Oliveira
transform .dot file into kernel rv monitor usage: dot2k [-h] -d DOT_FILE -t MONITOR_TYPE [-n MODEL_NAME] [-D DESCRIPTION] optional arguments: -h, --help show this help message and exit -d DOT_FILE, --dot DOT_FILE -t MONITOR_TYPE, --monitor_type MONITOR_TYPE -n MODEL_NAME, --model_name MODEL_NAME -D DESCRIPTION, --description DESCRIPTION Link: https://lkml.kernel.org/r/083b3ae61e5a62c1e2e5d08009baa91f82181618.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30Documentation/rv: Add deterministic automaton documentationDaniel Bristot de Oliveira
Add documentation about deterministic automaton and its possible representations (formal, graphic, .dot and C). Link: https://lkml.kernel.org/r/387edaed87630bd5eb37c4275045dfd229700aa6.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30tools/rv: Add dot2cDaniel Bristot de Oliveira
dot2c is a tool that transforms an automata in the graphiviz .dot file into an C representation of the automata. usage: dot2c [-h] dot_file dot2c: converts a .dot file into a C structure positional arguments: dot_file The dot file to be converted optional arguments: -h, --help show this help message and exit Link: https://lkml.kernel.org/r/b26204ba9509c80bcda31b76cdea31ddb188cd24.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30Documentation/rv: Add a basic documentationDaniel Bristot de Oliveira
Add the runtime-verification.rst document, explaining the basics of RV and how to use the interface. Link: https://lkml.kernel.org/r/4be7d1a88ab1e2eb0767521e1ab52a149a154bc4.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv/include: Add instrumentation helper functionsDaniel Bristot de Oliveira
Instrumentation helper functions to facilitate the instrumentation of auto-generated RV monitors create by dot2k. Link: https://lkml.kernel.org/r/3b36c9435f9d9299beb84e5c7c46920e205bedec.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv/include: Add deterministic automata monitor definition via C macrosDaniel Bristot de Oliveira
In Linux terms, the runtime verification monitors are encapsulated inside the "RV monitor" abstraction. The "RV monitor" includes a set of instances of the monitor (per-cpu monitor, per-task monitor, and so on), the helper functions that glue the monitor to the system reference model, and the trace output as a reaction for event parsing and exceptions, as depicted below: Linux +----- RV Monitor ----------------------------------+ Formal Realm | | Realm +-------------------+ +----------------+ +-----------------+ | Linux kernel | | Monitor | | Reference | | Tracing | -> | Instance(s) | <- | Model | | (instrumentation) | | (verification) | | (specification) | +-------------------+ +----------------+ +-----------------+ | | | | V | | +----------+ | | | Reaction | | | +--+--+--+-+ | | | | | | | | | +-> trace output ? | +------------------------|--|----------------------+ | +----> panic ? +-------> <user-specified> Add the rv/da_monitor.h, enabling automatic code generation for the *Monitor Instance(s)* using C macros, and code to support it. The benefits of the usage of macro for monitor synthesis are 3-fold as it: - Reduces the code duplication; - Facilitates the bug fix/improvement; - Avoids the case of developers changing the core of the monitor code to manipulate the model in a (let's say) non-standard way. This initial implementation presents three different types of monitor instances: - DECLARE_DA_MON_GLOBAL(name, type) - DECLARE_DA_MON_PER_CPU(name, type) - DECLARE_DA_MON_PER_TASK(name, type) The first declares the functions for a global deterministic automata monitor, the second for monitors with per-cpu instances, and the third with per-task instances. Link: https://lkml.kernel.org/r/51b0bf425a281e226dfeba7401d2115d6091f84e.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv/include: Add helper functions for deterministic automataDaniel Bristot de Oliveira
Formally, a deterministic automaton, denoted by G, is defined as a quintuple: G = { X, E, f, x_0, X_m } where: - X is the set of states; - E is the finite set of events; - x_0 is the initial state; - X_m (subset of X) is the set of marked states. - f : X x E -> X $ is the transition function. It defines the state transition in the occurrence of a event from E in the state X. In the special case of deterministic automata, the occurrence of the event in E in a state in X has a deterministic next state from X. An automaton can also be represented using a graphical format of vertices (nodes) and edges. The open-source tool Graphviz can produce this graphic format using the (textual) DOT language as the source code. The dot2c tool presented in this paper: De Oliveira, Daniel Bristot; Cucinotta, Tommaso; De Oliveira, Romulo Silva. Efficient formal verification for the Linux kernel. In: International Conference on Software Engineering and Formal Methods. Springer, Cham, 2019. p. 315-332. Translates a deterministic automaton in the DOT format into a C source code representation that to be used for monitoring. This header file implements helper functions to facilitate the usage of the C output from dot2c/k for monitoring. Link: https://lkml.kernel.org/r/563234f2bfa84b540f60cf9e39c2d9f0eea95a55.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv: Add runtime reactors interfaceDaniel Bristot de Oliveira
A runtime monitor can cause a reaction to the detection of an exception on the model's execution. By default, the monitors have tracing reactions, printing the monitor output via tracepoints. But other reactions can be added (on-demand) via this interface. The user interface resembles the kernel tracing interface and presents these files: "available_reactors" - Reading shows the available reactors, one per line. For example: # cat available_reactors nop panic printk "reacting_on" - It is an on/off general switch for reactors, disabling all reactions. "monitors/MONITOR/reactors" - List available reactors, with the select reaction for the given MONITOR inside []. The default one is the nop (no operation) reactor. - Writing the name of a reactor enables it to the given MONITOR. For example: # cat monitors/wip/reactors [nop] panic printk # echo panic > monitors/wip/reactors # cat monitors/wip/reactors nop [panic] printk Link: https://lkml.kernel.org/r/1794eb994637457bdeaa6bad0b8263d2f7eece0c.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30rv: Add Runtime Verification (RV) interfaceDaniel Bristot de Oliveira
RV is a lightweight (yet rigorous) method that complements classical exhaustive verification techniques (such as model checking and theorem proving) with a more practical approach to complex systems. RV works by analyzing the trace of the system's actual execution, comparing it against a formal specification of the system behavior. RV can give precise information on the runtime behavior of the monitored system while enabling the reaction for unexpected events, avoiding, for example, the propagation of a failure on safety-critical systems. The development of this interface roots in the development of the paper: De Oliveira, Daniel Bristot; Cucinotta, Tommaso; De Oliveira, Romulo Silva. Efficient formal verification for the Linux kernel. In: International Conference on Software Engineering and Formal Methods. Springer, Cham, 2019. p. 315-332. And: De Oliveira, Daniel Bristot. Automata-based formal analysis and verification of the real-time Linux kernel. PhD Thesis, 2020. The RV interface resembles the tracing/ interface on purpose. The current path for the RV interface is /sys/kernel/tracing/rv/. It presents these files: "available_monitors" - List the available monitors, one per line. For example: # cat available_monitors wip wwnr "enabled_monitors" - Lists the enabled monitors, one per line; - Writing to it enables a given monitor; - Writing a monitor name with a '!' prefix disables it; - Truncating the file disables all enabled monitors. For example: # cat enabled_monitors # echo wip > enabled_monitors # echo wwnr >> enabled_monitors # cat enabled_monitors wip wwnr # echo '!wip' >> enabled_monitors # cat enabled_monitors wwnr # echo > enabled_monitors # cat enabled_monitors # Note that more than one monitor can be enabled concurrently. "monitoring_on" - It is an on/off general switcher for monitoring. Note that it does not disable enabled monitors or detach events, but stop the per-entity monitors of monitoring the events received from the system. It resembles the "tracing_on" switcher. "monitors/" Each monitor will have its one directory inside "monitors/". There the monitor specific files will be presented. The "monitors/" directory resembles the "events" directory on tracefs. For example: # cd monitors/wip/ # ls desc enable # cat desc wakeup in preemptive per-cpu testing monitor. # cat enable 0 For further information, see the comments in the header of kernel/trace/rv/rv.c from this patch. Link: https://lkml.kernel.org/r/a4bfe038f50cb047bfb343ad0e12b0e646ab308b.1659052063.git.bristot@kernel.org Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Tao Zhou <tao.zhou@linux.dev> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30ftrace/x86: Add back ftrace_expected assignmentSteven Rostedt (Google)
When a ftrace_bug happens (where ftrace fails to modify a location) it is helpful to have what was at that location as well as what was expected to be there. But with the conversion to text_poke() the variable that assigns the expected for debugging was dropped. Unfortunately, I noticed this when I needed it. Add it back. Link: https://lkml.kernel.org/r/20220726101851.069d2e70@gandalf.local.home Cc: "x86@kernel.org" <x86@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: 768ae4406a5c ("x86/ftrace: Use text_poke()") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30tracing: Use a copy of the va_list for __assign_vstr()Steven Rostedt (Google)
If an instance of tracing enables the same trace event as another instance, or the top level instance, or even perf, then the va_list passed into some tracepoints can be used more than once. As va_list can only be traversed once, this can cause issues: # cat /sys/kernel/tracing/instances/qla2xxx/trace cat-56106 [012] ..... 2419873.470098: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered (null). cat-56106 [012] ..... 2419873.470101: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered ×+<96>²Ü<98>^H. cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0xde589000. # cat /sys/kernel/tracing/trace cat-56106 [012] ..... 2419873.470097: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered qla2x00_get_firmware_state. cat-56106 [012] ..... 2419873.470100: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered qla2x00_mailbox_command. cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0x69. The instance version is corrupted because the top level instance iterated the va_list first. Use va_copy() in the __assign_vstr() macro to make sure that each trace event for each use case gets a fresh va_list. Link: https://lore.kernel.org/all/259d53a5-958e-6508-4e45-74dba2821242@marvell.com/ Link: https://lkml.kernel.org/r/20220719182004.21daa83e@gandalf.local.home Fixes: 0563231f93c6d ("tracing/events: Add __vstring() and __assign_vstr() helper macros") Reported-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30batman-adv: tracing: Use the new __vstring() helperSteven Rostedt (Google)
Instead of open coding a __dynamic_array() with a fixed length (which defeats the purpose of the dynamic array in the first place). Use the new __vstring() helper that will use a va_list and only write enough of the string into the ring buffer that is needed. Link: https://lkml.kernel.org/r/20220724191650.236b1355@rorschach.local.home Cc: Marek Lindner <mareklindner@neomailbox.ch> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Simon Wunderlich <sw@simonwunderlich.de> Cc: Antonio Quartulli <a@unstable.cc> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: b.a.t.m.a.n@lists.open-mesh.org Cc: netdev@vger.kernel.org Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-30csky: Add jump-label implementationGuo Ren
Add jump-label implementation for static branch Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org>
2022-07-30Revert "MIPS: octeon: Remove vestiges of CONFIG_CAVIUM_RESERVE32"Alexander Sverdlin
This reverts commit e98b461bb057aaea6fa766260788c08825213837. We actually have been using the CONFIG_CAVIUM_RESERVE32 and previous patch defined it in the corresponding Kconfig. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-07-30MIPS: Introduce CAVIUM_RESERVE32 Kconfig optionAlexander Sverdlin
This options is used to reserve a shared memory region for user processes to use for hardware memory buffers. The actual code to support the option comes in the following patch. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-07-30locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by ↵Waiman Long
first waiter With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent"), the writer that sets the handoff bit can be interrupted out without clearing the bit if the wait queue isn't empty. This disables reader and writer optimistic lock spinning and stealing. Now if a non-first writer in the queue is somehow woken up or a new waiter enters the slowpath, it can't acquire the lock. This is not the case before commit d257cc8cb8d5 as the writer that set the handoff bit will clear it when exiting out via the out_nolock path. This is less efficient as the busy rwsem stays in an unlock state for a longer time. In some cases, this new behavior may cause lockups as shown in [1] and [2]. This patch allows a non-first writer to ignore the handoff bit if it is not originally set or initiated by the first waiter. This patch is shown to be effective in fixing the lockup problem reported in [1]. [1] https://lore.kernel.org/lkml/20220617134325.GC30825@techsingularity.net/ [2] https://lore.kernel.org/lkml/3f02975c-1a9d-be20-32cf-f1d8e3dfafcc@oracle.com/ Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Donnelly <john.p.donnelly@oracle.com> Tested-by: Mel Gorman <mgorman@techsingularity.net> Link: https://lore.kernel.org/r/20220622200419.778799-1-longman@redhat.com
2022-07-30MIPS: msi-octeon: eliminate kernel-doc warningsRandy Dunlap
Rearrange kernel-doc notation for 2 functions to eliminate kernel-doc warnings. Use Return: notation for the function return value description. Add function short descriptions for both functions. Correct 2 typos. Fixes these kernel-doc warnings: msi-octeon.c:49: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Called when a driver request MSI interrupts instead of the msi-octeon.c:49: warning: missing initial short description on line: * Called when a driver request MSI interrupts instead of the msi-octeon.c:62: warning: No description found for return value of 'arch_setup_msi_irq' msi-octeon.c:189: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Called when a device no longer needs its MSI interrupts. All msi-octeon.c:189: warning: missing initial short description on line: * Called when a device no longer needs its MSI interrupts. All Fixes: e8635b484f64 ("MIPS: Add Cavium OCTEON PCI support.") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Aditya Srivastava <yashsri421@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-07-30MIPS: Fix comment typoJason Wang
Fix the typo `s/that that/than that/' in line 72. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-07-30ALSA: hda/realtek: Add quirk for Lenovo Yoga9 14IAP7Philipp Jungkamp
The Lenovo Yoga 9 14IAP7 is set up similarly to the Thinkpad X1 7th and 8th Gen. It also has the speakers attached to NID 0x14 and the bass speakers to NID 0x17, but here the codec misreports the NID 0x17 as unconnected. The pincfg and hda verbs connect and activate the bass speaker amplifiers, but the generic driver will connect them to NID 0x06 which has no volume control. Set connection list/preferred connections is required to gain volume control. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208555 Signed-off-by: Philipp Jungkamp <p.jungkamp@gmx.net> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220729162103.6062-1-p.jungkamp@gmx.net Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-07-29Merge tag 'mlx5-updates-2022-07-28' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-07-28 Misc updates to mlx5 driver: 1) Gal corrects to use skb_tcp_all_headers on encapsulated skbs. 2) Roi Adds the support for offloading standalone police actions. 3) lama, did some refactoring to minimize code coupling with mlx5e_priv "god object" in some of the follows, and converts some of the objects to pointers to preserve on memory when these objects aren't needed. This is part one of two parts series. * tag 'mlx5-updates-2022-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Move mlx5e_init_l2_addr to en_main net/mlx5e: Split en_fs ndo's and move to en_main net/mlx5e: Separate mlx5e_set_rx_mode_work and move caller to en_main net/mlx5e: Add mdev to flow_steering struct net/mlx5e: Report flow steering errors with mdev err report API net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer net/mlx5e: Allocate VLAN and TC for featured profiles only net/mlx5e: Make mlx5e_tc_table private net/mlx5e: Convert mlx5e_tc_table member of mlx5e_flow_steering to pointer net/mlx5e: TC, Support tc action api for police net/mlx5e: TC, Separate get/update/replace meter functions net/mlx5e: Add red and green counters for metering net/mlx5e: TC, Allocate post meter ft per rule net/mlx5: DR, Add support for flow metering ASO net/mlx5e: Fix wrong use of skb_tcp_all_headers() with encapsulation ==================== Link: https://lore.kernel.org/r/20220728205728.143074-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-30fs/dcache: Move wakeup out of i_seq_dir write held region.Sebastian Andrzej Siewior
__d_add() and __d_move() wake up waiters on dentry::d_wait from within the i_seq_dir write held region. This violates the PREEMPT_RT constraints as the wake up acquires wait_queue_head::lock which is a "sleeping" spinlock on RT. There is no requirement to do so. __d_lookup_unhash() has cleared DCACHE_PAR_LOOKUP and dentry::d_wait and returned the now unreachable wait queue head pointer to the caller, so the actual wake up can be postponed until the i_dir_seq write side critical section is left. The only requirement is that dentry::lock is held across the whole sequence including the wake up. The previous commit includes an analysis why this is considered safe. Move the wake up past end_dir_add() which leaves the i_dir_seq write side critical section and enables preemption. For non RT kernels there is no difference because preemption is still disabled due to dentry::lock being held, but it shortens the time between wake up and unlocking dentry::lock, which reduces the contention for the woken up waiter. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-07-30fs/dcache: Move the wakeup from __d_lookup_done() to the caller.Sebastian Andrzej Siewior
__d_lookup_done() wakes waiters on dentry->d_wait. On PREEMPT_RT we are not allowed to do that with preemption disabled, since the wakeup acquired wait_queue_head::lock, which is a "sleeping" spinlock on RT. Calling it under dentry->d_lock is not a problem, since that is also a "sleeping" spinlock on the same configs. Unfortunately, two of its callers (__d_add() and __d_move()) are holding more than just ->d_lock and that needs to be dealt with. The key observation is that wakeup can be moved to any point before dropping ->d_lock. As a first step to solve this, move the wake up outside of the hlist_bl_lock() held section. This is safe because: Waiters get inserted into ->d_wait only after they'd taken ->d_lock and observed DCACHE_PAR_LOOKUP in flags. As long as they are woken up (and evicted from the queue) between the moment __d_lookup_done() has removed DCACHE_PAR_LOOKUP and dropping ->d_lock, we are safe, since the waitqueue ->d_wait points to won't get destroyed without having __d_lookup_done(dentry) called (under ->d_lock). ->d_wait is set only by d_alloc_parallel() and only in case when it returns a freshly allocated in-lookup dentry. Whenever that happens, we are guaranteed that __d_lookup_done() will be called for resulting dentry (under ->d_lock) before the wq in question gets destroyed. With two exceptions wq lives in call frame of the caller of d_alloc_parallel() and we have an explicit d_lookup_done() on the resulting in-lookup dentry before we leave that frame. One of those exceptions is nfs_call_unlink(), where wq is embedded into (dynamically allocated) struct nfs_unlinkdata. It is destroyed in nfs_async_unlink_release() after an explicit d_lookup_done() on the dentry wq went into. Remaining exception is d_add_ci(). There wq is what we'd found in ->d_wait of d_add_ci() argument. Callers of d_add_ci() are two instances of ->d_lookup() and they must have been given an in-lookup dentry. Which means that they'd been called by __lookup_slow() or lookup_open(), with wq in the call frame of one of those. Result of d_alloc_parallel() in d_add_ci() is fed to d_splice_alias(), which either returns non-NULL (and d_add_ci() does d_lookup_done()) or feeds dentry to __d_add() that will do __d_lookup_done() under ->d_lock. That concludes the analysis. Let __d_lookup_unhash(): 1) Lock the lookup hash and clear DCACHE_PAR_LOOKUP 2) Unhash the dentry 3) Retrieve and clear dentry::d_wait 4) Unlock the hash and return the retrieved waitqueue head pointer 5) Let the caller handle the wake up. 6) Rename __d_lookup_done() to __d_lookup_unhash_wake() to enforce build failures for OOT code that used __d_lookup_done() and is not aware of the new return value. This does not yet solve the PREEMPT_RT problem completely because preemption is still disabled due to i_dir_seq being held for write. This will be addressed in subsequent steps. An alternative solution would be to switch the waitqueue to a simple waitqueue, but aside of Linus not being a fan of them, moving the wake up closer to the place where dentry::lock is unlocked reduces lock contention time for the woken up waiter. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lkml.kernel.org/r/20220613140712.77932-3-bigeasy@linutronix.de Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-07-30fs/dcache: Disable preemption on i_dir_seq write side on PREEMPT_RTSebastian Andrzej Siewior
i_dir_seq is a sequence counter with a lock which is represented by the lowest bit. The writer atomically updates the counter which ensures that it can be modified by only one writer at a time. This requires preemption to be disabled across the write side critical section. On !PREEMPT_RT kernels this is implicit by the caller acquiring dentry::lock. On PREEMPT_RT kernels spin_lock() does not disable preemption which means that a preempting writer or reader would live lock. It's therefore required to disable preemption explicitly. An alternative solution would be to replace i_dir_seq with a seqlock_t for PREEMPT_RT, but that comes with its own set of problems due to arbitrary lock nesting. A pure sequence count with an associated spinlock is not possible because the locks held by the caller are not necessarily related. As the critical section is small, disabling preemption is a sensible solution. Reported-by: Oleg.Karfich@wago.com Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lkml.kernel.org/r/20220613140712.77932-2-bigeasy@linutronix.de Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-07-30d_add_ci(): make sure we don't miss d_lookup_done()Al Viro
All callers of d_alloc_parallel() must make sure that resulting in-lookup dentry (if any) will encounter __d_lookup_done() before the final dput(). d_add_ci() might end up creating in-lookup dentries; they are fed to d_splice_alias(), which will normally make sure they meet __d_lookup_done(). However, it is possible to end up with d_splice_alias() failing with ERR_PTR(-ELOOP) without having done so. It takes a corrupted ntfs or case-insensitive xfs image, but neither should end up with memory corruption... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>