linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-10-29	perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE	Kan Liang
	Current perf can report both virtual addresses and physical addresses, but not the MMU page size. Without the MMU page size information of the utilized page, users cannot decide whether to promote/demote large pages to optimize memory usage. Add a new sample type for the data MMU page size. Current perf already has a facility to collect data virtual addresses. A page walker is required to walk the pages tables and calculate the MMU page size from a given virtual address. On some platforms, e.g., X86, the page walker is invoked in an NMI handler. So the page walker must be NMI-safe and low overhead. Besides, the page walker should work for both user and kernel virtual address. The existing generic page walker, e.g., walk_page_range_novma(), is a little bit complex and doesn't guarantee the NMI-safe. The follow_page() is only for user-virtual address. Add a new function perf_get_page_size() to walk the page tables and calculate the MMU page size. In the function: - Interrupts have to be disabled to prevent any teardown of the page tables. - For user space threads, the current->mm is used for the page walker. For kernel threads and the like, the current->mm is NULL. The init_mm is used for the page walker. The active_mm is not used here, because it can be NULL. Quote from Peter Zijlstra, "context_switch() can set prev->active_mm to NULL when it transfers it to @next. It does this before @current is updated. So an NMI that comes in between this active_mm swizzling and updating @current will see !active_mm." - The MMU page size is calculated from the page table level. The method should work for all architectures, but it has only been verified on X86. Should there be some architectures, which support perf, where the method doesn't work, it can be fixed later separately. Reporting the wrong page size would not be fatal for the architecture. Some under discussion features may impact the method in the future. Quote from Dave Hansen, "There are lots of weird things folks are trying to do with the page tables, like Address Space Isolation. For instance, if you get a perf NMI when running userspace, current->mm->pgd is different than the PGD that was in use when userspace was running. It's close enough today, but it might not stay that way." If the case happens later, lots of consecutive page walk errors will happen. The worst case is that lots of page-size '0' are returned, which would not be fatal. In the perf tool, a check is implemented to detect this case. Once it happens, a kernel patch could be implemented accordingly then. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201001135749.2804-2-kan.liang@linux.intel.com
2020-10-29	sched: fix exit_mm vs membarrier (v4)	Mathieu Desnoyers
	exit_mm should issue memory barriers after user-space memory accesses, before clearing current->mm, to order user-space memory accesses performed prior to exit_mm before clearing tsk->mm, which has the effect of skipping the membarrier private expedited IPIs. exit_mm should also update the runqueue's membarrier_state so membarrier global expedited IPIs are not sent when they are not needed. The membarrier system call can be issued concurrently with do_exit if we have thread groups created with CLONE_VM but not CLONE_THREAD. Here is the scenario I have in mind: Two thread groups are created, A and B. Thread group B is created by issuing clone from group A with flag CLONE_VM set, but not CLONE_THREAD. Let's assume we have a single thread within each thread group (Thread A and Thread B). The AFAIU we can have: Userspace variables: int x = 0, y = 0; CPU 0 CPU 1 Thread A Thread B (in thread group A) (in thread group B) x = 1 barrier() y = 1 exit() exit_mm() current->mm = NULL; r1 = load y membarrier() skips CPU 0 (no IPI) because its current mm is NULL r2 = load x BUG_ON(r1 == 1 && r2 == 0) Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201020134715.13909-2-mathieu.desnoyers@efficios.com
2020-10-29	entry: Add support for TIF_NOTIFY_SIGNAL	Jens Axboe
	Add TIF_NOTIFY_SIGNAL handling in the generic entry code, which if set, will return true if signal_pending() is used in a wait loop. That causes an exit of the loop so that notify_signal tracehooks can be run. If the wait loop is currently inside a system call, the system call is restarted once task_work has been processed. In preparation for only having arch_do_signal() handle syscall restarts if _TIF_SIGPENDING isn't set, rename it to arch_do_signal_or_restart(). Pass in a boolean that tells the architecture specific signal handler if it should attempt to get a signal, or just process a potential syscall restart. For !CONFIG_GENERIC_ENTRY archs, add the TIF_NOTIFY_SIGNAL handling to get_signal(). This is done to minimize the needed architecture changes to support this feature. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20201026203230.386348-3-axboe@kernel.dk
2020-10-29	signal: Add task_sigpending() helper	Jens Axboe
	This is in preparation for maintaining signal_pending() as the decider of whether or not a schedule() loop should be broken, or continue sleeping. This is different than the core signal use cases, which really need to know whether an actual signal is pending or not. task_sigpending() returns non-zero if TIF_SIGPENDING is set. Only core kernel use cases should care about the distinction between the two, make sure those use the task_sigpending() helper. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20201026203230.386348-2-axboe@kernel.dk
2020-10-28	bpf: Permit cond_resched for some iterators	Yonghong Song
	Commit e679654a704e ("bpf: Fix a rcu_sched stall issue with bpf task/task_file iterator") tries to fix rcu stalls warning which is caused by bpf task_file iterator when running "bpftool prog". rcu: INFO: rcu_sched self-detected stall on CPU rcu: \x097-....: (20999 ticks this GP) idle=302/1/0x4000000000000000 softirq=1508852/1508852 fqs=4913 \x09(t=21031 jiffies g=2534773 q=179750) NMI backtrace for cpu 7 CPU: 7 PID: 184195 Comm: bpftool Kdump: loaded Tainted: G W 5.8.0-00004-g68bfc7f8c1b4 #6 Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A17 05/03/2019 Call Trace: <IRQ> dump_stack+0x57/0x70 nmi_cpu_backtrace.cold+0x14/0x53 ? lapic_can_unplug_cpu.cold+0x39/0x39 nmi_trigger_cpumask_backtrace+0xb7/0xc7 rcu_dump_cpu_stacks+0xa2/0xd0 rcu_sched_clock_irq.cold+0x1ff/0x3d9 ? tick_nohz_handler+0x100/0x100 update_process_times+0x5b/0x90 tick_sched_timer+0x5e/0xf0 __hrtimer_run_queues+0x12a/0x2a0 hrtimer_interrupt+0x10e/0x280 __sysvec_apic_timer_interrupt+0x51/0xe0 asm_call_on_stack+0xf/0x20 </IRQ> sysvec_apic_timer_interrupt+0x6f/0x80 ... task_file_seq_next+0x52/0xa0 bpf_seq_read+0xb9/0x320 vfs_read+0x9d/0x180 ksys_read+0x5f/0xe0 do_syscall_64+0x38/0x60 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The fix is to limit the number of bpf program runs to be one million. This fixed the program in most cases. But we also found under heavy load, which can increase the wallclock time for bpf_seq_read(), the warning may still be possible. For example, calling bpf_delay() in the "while" loop of bpf_seq_read(), which will introduce artificial delay, the warning will show up in my qemu run. static unsigned q; volatile unsigned p = &q; volatile unsigned long long ll; static void bpf_delay(void) { int i, j; for (i = 0; i < 10000; i++) for (j = 0; j < 10000; j++) ll += p; } There are two ways to fix this issue. One is to reduce the above one million threshold to say 100,000 and hopefully rcu warning will not show up any more. Another is to introduce a target feature which enables bpf_seq_read() calling cond_resched(). This patch took second approach as the first approach may cause more -EAGAIN failures for read() syscalls. Note that not all bpf_iter targets can permit cond_resched() in bpf_seq_read() as some, e.g., netlink seq iterator, rcu read lock critical section spans through seq_ops->next() -> seq_ops->show() -> seq_ops->next(). For the kernel code with the above hack, "bpftool p" roughly takes 38 seconds to finish on my VM with 184 bpf program runs. Using the following command, I am able to collect the number of context switches: perf stat -e context-switches -- ./bpftool p >& log Without this patch, 69 context-switches With this patch, 75 context-switches This patch added additional 6 context switches, roughly every 6 seconds to reschedule, to avoid lengthy no-rescheduling which may cause the above RCU warnings. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201028061054.1411116-1-yhs@fb.com
2020-10-28	genirq/msi: Allow shadow declarations of msi_msg:: $member	Thomas Gleixner
	Architectures like x86 have their MSI messages in various bits of the data, address_lo and address_hi field. Composing or decomposing these messages with bitmasks and shifts is possible, but unreadable gunk. Allow architectures to provide an architecture specific representation for each member of msi_msg. Provide empty defaults for each and stick them into an union. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201024213535.443185-12-dwmw2@infradead.org
2020-10-28	Fonts: Make font size unsigned in font_desc	Peilin Ye
	`width` and `height` are defined as unsigned in our UAPI font descriptor `struct console_font`. Make them unsigned in our kernel font descriptor `struct font_desc`, too. Also, change the corresponding printk() format identifiers from `%d` to `%u`, in sti_select_fbfont(). Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20201028105647.1210161-1-yepeilin.cs@gmail.com
2020-10-28	misc: mic: remove the MIC drivers	Sudeep Dutt
	This patch removes the MIC drivers from the kernel tree since the corresponding devices have been discontinued. Removing the dma and char-misc changes in one patch and merging via the char-misc tree is best to avoid any potential build breakage. Cc: Nikhil Rao <nikhil.rao@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Acked-By: Vinod Koul <vkoul@kernel.org> Reviewed-by: Sherry Sun <sherry.sun@nxp.com> Link: https://lore.kernel.org/r/8c1443136563de34699d2c084df478181c205db4.1603854416.git.sudeep.dutt@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-28	jbd2: fix a kernel-doc markup	Mauro Carvalho Chehab
	The kernel-doc markup that documents _fc_replay_callback is missing an asterisk, causing this warning: ../include/linux/jbd2.h:1271: warning: Function parameter or member 'j_fc_replay_callback' not described in 'journal_s' When building the docs. Fixes: 609f928af48f ("jbd2: fast commit recovery path") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/6055927ada2015b55b413cdd2670533bdc9a8da2.1603791716.git.mchehab+huawei@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	ext4: make num of fast commit blocks configurable	Harshad Shirwadkar
	This patch reserves a field in the jbd2 superblock for number of fast commit blocks. When this value is non-zero, Ext4 uses this field to set the number of fast commit blocks. Fixes: 6866d7b3f2bb ("ext4/jbd2: add fast commit initialization") Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20201027044915.2553163-2-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-28	locking/refcount: move kernel-doc markups to the proper place	Mauro Carvalho Chehab
	Changeset a435b9a14356 ("locking/refcount: Provide __refcount API to obtain the old value") added a set of functions starting with __ that have a new parameter, adding a series of new warnings: $ ./scripts/kernel-doc -none include/linux/refcount.h include/linux/refcount.h:169: warning: Function parameter or member 'oldp' not described in '__refcount_add_not_zero' include/linux/refcount.h:208: warning: Function parameter or member 'oldp' not described in '__refcount_add' include/linux/refcount.h:239: warning: Function parameter or member 'oldp' not described in '__refcount_inc_not_zero' include/linux/refcount.h:261: warning: Function parameter or member 'oldp' not described in '__refcount_inc' include/linux/refcount.h:291: warning: Function parameter or member 'oldp' not described in '__refcount_sub_and_test' include/linux/refcount.h:327: warning: Function parameter or member 'oldp' not described in '__refcount_dec_and_test' include/linux/refcount.h:347: warning: Function parameter or member 'oldp' not described in '__refcount_dec' The issue is that the kernel-doc markups are now misplaced, as they should be added just before the functions. So, move the kernel-doc markups to the proper places, in order to drop the warnings. It should be noticed that git show produces a crappy output, for this patch without "--patience" flag. Fixes: a435b9a14356 ("locking/refcount: Provide __refcount API to obtain the old value") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/7985c31d1ace591bc5e1faa05c367f1295b78afd.1603791716.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-28	net: phy: remove kernel-doc duplication	Mauro Carvalho Chehab
	Sphinx 3 now checks for duplicated function declarations: .../Documentation/networking/kapi:143: ../include/linux/phy.h:163: WARNING: Duplicate C declaration, also defined in 'networking/kapi'. Declaration is 'unsigned int phy_supported_speeds (struct phy_device phy, unsigned int speeds, unsigned int size)'. .../Documentation/networking/kapi:143: ../include/linux/phy.h:1034: WARNING: Duplicate C declaration, also defined in 'networking/kapi'. Declaration is 'int phy_read_mmd (struct phy_device phydev, int devad, u32 regnum)'. .../Documentation/networking/kapi:143: ../include/linux/phy.h:1076: WARNING: Duplicate C declaration, also defined in 'networking/kapi'. Declaration is 'int __phy_read_mmd (struct phy_device phydev, int devad, u32 regnum)'. .../Documentation/networking/kapi:143: ../include/linux/phy.h:1088: WARNING: Duplicate C declaration, also defined in 'networking/kapi'. Declaration is 'int phy_write_mmd (struct phy_device phydev, int devad, u32 regnum, u16 val)'. .../Documentation/networking/kapi:143: ../include/linux/phy.h:1100: WARNING: Duplicate C declaration, also defined in 'networking/kapi'. Declaration is 'int __phy_write_mmd (struct phy_device phydev, int devad, u32 regnum, u16 val)'. It turns that both the C and the H files have the same kernel-doc markup for the same functions. Let's drop the at the header file, keeping the one closer to the code. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/75e9a357f9a716833d2094b04898754876365e68.1603791716.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-28	mm: pagemap.h: fix two kernel-doc markups	Mauro Carvalho Chehab
	Changeset a8cf7f272b5a ("mm: add find_lock_head") renamed the index parameter, but forgot to update the kernel-doc markups accordingly. Fixes: a8cf7f272b5a ("mm: add find_lock_head") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/dce89b296a4f5f9f8f798d5e76b6736c14a916ac.1603791716.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-28	blk-mq: docs: add kernel-doc description for a new struct member	Mauro Carvalho Chehab
	As reported by kernel-doc: ./include/linux/blk-mq.h:267: warning: Function parameter or member 'active_queues_shared_sbitmap' not described in 'blk_mq_tag_set' There is now a new member for struct blk_mq_tag_set. Add a description for it, based on the commit that introduced it. Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap") Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/8e513153b83eefc05e358f51f2632b592c3f6772.1603791716.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-28	gpio: Retire the explicit gpio irqchip code	Linus Walleij
	Now that all gpiolib irqchip users have been over to use the irqchip template, we can finally retire the old code path and leave just one way in to the irqchip: set up the template when registering the gpio_chip. For a while we had two code paths for this which was a bit confusing. This brings this work to a conclusion, there is now one way of doing this. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Thierry Reding <thierry.reding@gmail.com> Link: https://lore.kernel.org/r/20201019134046.65101-1-linus.walleij@linaro.org
2020-10-28	qcom-geni-se: remove has_opp_table	Viresh Kumar
	has_opp_table isn't used anymore, remove it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/08ec1ee1d4252a266956abb5f1e0e0026d753564.1603867487.git.viresh.kumar@linaro.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2020-10-28	module: use hidden visibility for weak symbol references	Ard Biesheuvel
	Geert reports that commit be2881824ae9eb92 ("arm64/build: Assert for unwanted sections") results in build errors on arm64 for configurations that have CONFIG_MODULES disabled. The commit in question added ASSERT()s to the arm64 linker script to ensure that linker generated sections such as .got.plt etc are empty, but as it turns out, there are corner cases where the linker does emit content into those sections. More specifically, weak references to function symbols (which can remain unsatisfied, and can therefore not be emitted as relative references) will be emitted as GOT and PLT entries when linking the kernel in PIE mode (which is the case when CONFIG_RELOCATABLE is enabled, which is on by default). What happens is that code such as struct device (fn)(struct device dev); struct device iommu_device; fn = symbol_get(mdev_get_iommu_device); if (fn) { iommu_device = fn(dev); essentially gets converted into the following when CONFIG_MODULES is off: struct device *iommu_device; if (&mdev_get_iommu_device) { iommu_device = mdev_get_iommu_device(dev); where mdev_get_iommu_device is emitted as a weak symbol reference into the object file. The first reference is decorated with an ordinary ABS64 data relocation (which yields 0x0 if the reference remains unsatisfied). However, the indirect call is turned into a direct call covered by a R_AARCH64_CALL26 relocation, which is converted into a call via a PLT entry taking the target address from the associated GOT entry. Given that such GOT and PLT entries are unnecessary for fully linked binaries such as the kernel, let's give these weak symbol references hidden visibility, so that the linker knows that the weak reference via R_AARCH64_CALL26 can simply remain unsatisfied. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Fangrui Song <maskray@google.com> Acked-by: Jessica Yu <jeyu@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20201027151132.14066-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28	ctype.h: remove duplicate isdigit() helper	Arnd Bergmann
	gcc warns a few thousand times about the isdigit() shadow: include/linux/ctype.h:26:19: warning: declaration of 'isdigit' shadows a built-in function [-Wshadow] As there is already a compiler builtin, just use that, and make it clear we do that by defining a macro. Unfortunately, clang does not have the isdigit() builtin, so this has to be conditional. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-10-28	tty: goldfish: use __raw_writel()/__raw_readl()	Laurent Vivier
	gf_early_console_putchar() uses __raw_writel() but the standard driver uses writel()/readl(). This means we can't use both on the same machine as the device is either big-endian, little-endian or native-endian. As android implementation defines the endianness of the device is the one of the architecture replace all writel()/readl() by __raw_writel()/__raw_readl() https://android.googlesource.com/platform/external/qemu/+/refs/heads/emu-master-dev/hw/char/goldfish_tty.c#222 Signed-off-by: Laurent Vivier <laurent@vivier.eu> Link: https://lore.kernel.org/r/20201010004749.1201695-1-laurent@vivier.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-28	usb: fix kernel-doc markups	Mauro Carvalho Chehab
	There is a common comment marked, instead, with kernel-doc notation. Also, some identifiers have different names between their prototypes and the kernel-doc markup. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Felipe Balbi <balbi@kernel.org> Link: https://lore.kernel.org/r/0b964be3884def04fcd20ea5c12cb90d0014871c.1603469755.git.mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-28	platform/x86: asus-wmi: Add support for SW_TABLET_MODE on UX360	Samuel Čavoj
	The UX360CA has a WMI device id 0x00060062, which reports whether the lid is flipped in tablet mode (1) or in normal laptop mode (0). Add a quirk (quirk_asus_use_lid_flip_devid) for devices on which this WMI device should be used to figure out the SW_TABLET_MODE state, as opposed to the quirk_asus_use_kbd_dock_devid. Additionally, the device needs to be queried on resume and restore because the firmware does not generate an event if the laptop is put to sleep while in tablet mode, flipped to normal mode, and later awoken. It is assumed other UX360* models have the same WMI device. As such, the quirk is applied to devices with DMI_MATCH(DMI_PRODUCT_NAME, "UX360"). More devices with this feature need to be tested and added accordingly. The reason for using an allowlist via the quirk mechanism is that the new WMI device (0x00060062) is also present on some models which do not have a 360 degree hinge (at least FX503VD and GL503VD from Hans' DSTS collection) and therefore its presence cannot be relied on. Signed-off-by: Samuel Čavoj <samuel@cavoj.net> Cc: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201020220944.1075530-1-samuel@cavoj.net Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2020-10-28	KVM: arm64: ARM_SMCCC_ARCH_WORKAROUND_1 doesn't return SMCCC_RET_NOT_REQUIRED	Stephen Boyd
	According to the SMCCC spec[1](7.5.2 Discovery) the ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and SMCCC_RET_NOT_SUPPORTED. 0 is "workaround required and safe to call this function" 1 is "workaround not required but safe to call this function" SMCCC_RET_NOT_SUPPORTED is "might be vulnerable or might not be, who knows, I give up!" SMCCC_RET_NOT_SUPPORTED might as well mean "workaround required, except calling this function may not work because it isn't implemented in some cases". Wonderful. We map this SMC call to 0 is SPECTRE_MITIGATED 1 is SPECTRE_UNAFFECTED SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE For KVM hypercalls (hvc), we've implemented this function id to return SMCCC_RET_NOT_SUPPORTED, 0, and SMCCC_RET_NOT_REQUIRED. One of those isn't supposed to be there. Per the code we call arm64_get_spectre_v2_state() to figure out what to return for this feature discovery call. 0 is SPECTRE_MITIGATED SMCCC_RET_NOT_REQUIRED is SPECTRE_UNAFFECTED SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE Let's clean this up so that KVM tells the guest this mapping: 0 is SPECTRE_MITIGATED 1 is SPECTRE_UNAFFECTED SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE Note: SMCCC_RET_NOT_AFFECTED is 1 but isn't part of the SMCCC spec Fixes: c118bbb52743 ("arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Steven Price <steven.price@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://developer.arm.com/documentation/den0028/latest [1] Link: https://lore.kernel.org/r/20201023154751.1973872-1-swboyd@chromium.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28	firmware: imx: add dummy functions	Peng Fan
	add dummy functions to avoid build failure when header files are included, but drivers are not built. Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-10-27	audit: trigger accompanying records when no rules present	Richard Guy Briggs
	When there are no audit rules registered, mandatory records (config, etc.) are missing their accompanying records (syscall, proctitle, etc.). This is due to audit context dummy set on syscall entry based on absence of rules that signals that no other records are to be printed. Clear the dummy bit if any record is generated, open coding this in audit_log_start(). The proctitle context and dummy checks are pointless since the proctitle record will not be printed if no syscall records are printed. The fds array is reset to -1 after the first syscall to indicate it isn't valid any more, but was never set to -1 when the context was allocated to indicate it wasn't yet valid. Check ctx->pwd in audit_log_name(). The audit_inode* functions can be called without going through getname_flags() or getname_kernel() that sets audit_names and cwd, so set the cwd in audit_alloc_name() if it has not already been done so due to audit_names being valid and purge all other audit_getcwd() calls. Revert the LSM dump_common_audit_data() LSM_AUDIT_DATA_* cases from the ghak96 patch since they are no longer necessary due to cwd coverage in audit_alloc_name(). Thanks to bauen1 <j2468h@googlemail.com> for reporting LSM situations in which context->cwd is not valid, inadvertantly fixed by the ghak96 patch. Please see upstream github issue https://github.com/linux-audit/audit-kernel/issues/120 This is also related to upstream github issue https://github.com/linux-audit/audit-kernel/issues/96 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-10-27	cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag	Rafael J. Wysocki
	Generally, a cpufreq driver may need to update some internal upper and lower frequency boundaries on policy max and min changes, respectively, but currently this does not work if the target frequency does not change along with the policy limit. Namely, if the target frequency does not change along with the policy min or max, the "target_freq == policy->cur" check in __cpufreq_driver_target() prevents driver callbacks from being invoked and they do not even have a chance to update the corresponding internal boundary. This particularly affects the "powersave" and "performance" governors that always set the target frequency to one of the policy limits and it never changes when the other limit is updated. To allow cpufreq the drivers needing to update internal frequency boundaries on policy limits changes to avoid this issue, introduce a new driver flag, CPUFREQ_NEED_UPDATE_LIMITS, that (when set) will neutralize the check mentioned above. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-10-27	bpf: Fix -Wshadow warnings	Arnd Bergmann
	There are thousands of warnings about one macro in a W=2 build: include/linux/filter.h:561:6: warning: declaration of 'ret' shadows a previous local [-Wshadow] Prefix all the locals in that macro with __ to avoid most of these warnings. Fixes: 492ecee892c2 ("bpf: enable program stats") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201026162110.3710415-1-arnd@kernel.org
2020-10-27	MIPS: ingenic: remove unused platform_data header file	Sam Ravnborg
	There are no users of this headers file in the kernel. All users are likely migrated to device tree which is a good thing. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Harvey Hunt <harveyhuntnexus@gmail.com> Cc: linux-mips@vger.kernel.org Reviewed-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-10-26	RDMA/mlx5: Fix devlink deadlock on net namespace deletion	Parav Pandit
	When a mlx5 core devlink instance is reloaded in different net namespace, its associated IB device is deleted and recreated. Example sequence is: $ ip netns add foo $ devlink dev reload pci/0000:00:08.0 netns foo $ ip netns del foo mlx5 IB device needs to attach and detach the netdevice to it through the netdev notifier chain during load and unload sequence. A below call graph of the unload flow. cleanup_net() down_read(&pernet_ops_rwsem); <- first sem acquired ops_pre_exit_list() pre_exit() devlink_pernet_pre_exit() devlink_reload() mlx5_devlink_reload_down() mlx5_unload_one() [...] mlx5_ib_remove() mlx5_ib_unbind_slave_port() mlx5_remove_netdev_notifier() unregister_netdevice_notifier() down_write(&pernet_ops_rwsem);<- recurrsive lock Hence, when net namespace is deleted, mlx5 reload results in deadlock. When deadlock occurs, devlink mutex is also held. This not only deadlocks the mlx5 device under reload, but all the processes which attempt to access unrelated devlink devices are deadlocked. Hence, fix this by mlx5 ib driver to register for per net netdev notifier instead of global one, which operats on the net namespace without holding the pernet_ops_rwsem. Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload") Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26	regmap: mmio: add config option to allow relaxed MMIO accesses	Adrian Ratiu
	On some platforms (eg armv7 due to the CONFIG_ARM_DMA_MEM_BUFFERABLE) MMIO R/W operations always add memory barriers which can increase load, decrease battery life or in general reduce performance unnecessarily on devices which access a lot of configuration registers and where ordering does not matter (eg. media accelerators like the Verisilicon / Hantro video decoders). Drivers used to call the relaxed MMIO variants directly but since they are now accessing the MMIO registers via regmaps (to compensate for different VPU HW reg layouts via regmap fields), there is a need for a relaxed API / config to preserve existing behaviour. Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Link: https://lore.kernel.org/r/20201014203024.954369-1-adrian.ratiu@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-10-26	ASoC: adau1977: remove platform data and move micbias bindings include	Alexandru Ardelean
	The change removes the platform_data include/definition. It only contains some values for the MICBIAS. These are moved into 'dt-bindings/sound/adi,adau1977.h' so that they can be used inside device-trees. When moving then, they need to be converted to pre-compiler defines, so that the DT compiler can understand them. The driver then, also needs to include the new 'dt-bindings/sound/adi,adau1977.h' file. Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Link: https://lore.kernel.org/r/20201019105313.24862-1-alexandru.ardelean@analog.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-10-26	elf: Expose ELF header on arch_setup_additional_pages()	Gabriel Krisman Bertazi
	Like it is done for SET_PERSONALITY with ARM, which requires the ELF header to select correct personality parameters, x86 requires the headers when selecting which VDSO to load, instead of relying on the going-away TIF_IA32/X32 flags. Add an indirection macro to arch_setup_additional_pages(), that x86 can reimplement to receive the extra parameter just for ELF files. This requires no changes to other architectures, who can continue to use the original arch_setup_additional_pages for ELF and non-ELF binaries. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201004032536.1229030-8-krisman@collabora.com
2020-10-26	elf: Expose ELF header in compat_start_thread()	Gabriel Krisman Bertazi
	Like it is done for SET_PERSONALITY with x86, which requires the ELF header to select correct personality parameters, x86 requires the headers on compat_start_thread() to choose starting CS for ELF32 binaries, instead of relying on the going-away TIF_IA32/X32 flags. Add an indirection macro to ELF invocations of START_THREAD, that x86 can reimplement to receive the extra parameter just for ELF files. This requires no changes to other architectures who don't need the header information, they can continue to use the original start_thread for ELF and non-ELF binaries, and it prevents affecting non-ELF code paths for x86. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201004032536.1229030-6-krisman@collabora.com
2020-10-26	time: Prevent undefined behaviour in timespec64_to_ns()	Zeng Tao
	UBSAN reports: Undefined behaviour in ./include/linux/time64.h:127:27 signed integer overflow: 17179869187 * 1000000000 cannot be represented in type 'long long int' Call Trace: timespec64_to_ns include/linux/time64.h:127 [inline] set_cpu_itimer+0x65c/0x880 kernel/time/itimer.c:180 do_setitimer+0x8e/0x740 kernel/time/itimer.c:245 __x64_sys_setitimer+0x14c/0x2c0 kernel/time/itimer.c:336 do_syscall_64+0xa1/0x540 arch/x86/entry/common.c:295 Commit bd40a175769d ("y2038: itimer: change implementation to timespec64") replaced the original conversion which handled time clamping correctly with timespec64_to_ns() which has no overflow protection. Fix it in timespec64_to_ns() as this is not necessarily limited to the usage in itimers. [ tglx: Added comment and adjusted the fixes tag ] Fixes: 361a3bf00582 ("time64: Add time64.h header and define struct timespec64") Signed-off-by: Zeng Tao <prime.zeng@hisilicon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1598952616-6416-1-git-send-email-prime.zeng@hisilicon.com
2020-10-26	bus: ti-sysc: Fix reset status check for modules with quirks	Tony Lindgren
	Commit d46f9fbec719 ("bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit") started showing a "OCP softreset timed out" warning on enable if the interconnect target module is not out of reset. This caused the warning to be often triggered for i2c and hdq while the devices are working properly. Turns out that some interconnect target modules seem to have an unusable reset status bits unless the module specific reset quirks are activated. Let's just skip the reset status check for those modules as we only want to activate the reset quirks when doing a reset, and not on enable. This way we don't see the bogus "OCP softreset timed out" warnings during boot. Fixes: d46f9fbec719 ("bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit") Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-10-26	PM / devfreq: Remove redundant governor_name from struct devfreq	Chanwoo Choi
	The devfreq structure instance contains the governor_name and a governor instance. When need to show the governor name, better to use the name of devfreq_governor structure. So, governor_name variable in struct devfreq is a redundant and unneeded variable. Remove the redundant governor_name of struct devfreq and then use the name of devfreq_governor instance. Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2020-10-25	epoll: take epitem list out of struct file	Al Viro
	Move the head of epitem list out of struct file; for epoll ones it's moved into struct eventpoll (->refs there), for non-epoll - into the new object (struct epitem_head). In place of ->f_ep_links we leave a pointer to the list head (->f_ep). ->f_ep is protected by ->f_lock and it's zeroed as soon as the list of epitems becomes empty (that can happen only in ep_remove() by now). The list of files for reverse path check is not going through struct file now - it's a single-linked list going through epitem_head instances. It's terminated by ERR_PTR(-1) (== EP_UNACTIVE_POINTER), so the elements of list can be distinguished by head->next != NULL. epitem_head instances are allocated at ep_insert() time (by attach_epitem()) and freed either by ep_remove() (if it empties the set of epitems and epitem_head does not belong to the reverse path check list) or by clear_tfile_check_list() when the list is emptied (if the set of epitems is empty by that point). Allocations are done from a separate slab - minimal kmalloc() size is too large on some architectures. As the result, we trim struct file _and_ get rid of the games with temporary file references. Locking and barriers are interesting (aren't they always); see unlist_file() and ep_remove() for details. The non-obvious part is that ep_remove() needs to decide if it will be the one to free the damn thing before actually storing NULL to head->epitems.first - that's what smp_load_acquire is for in there. unlist_file() lockless path is safe, since we hit it only if we observe NULL in head->epitems.first and whoever had done that store is guaranteed to have observed non-NULL in head->next. IOW, their last access had been the store of NULL into ->epitems.first and we can safely free the sucker. OTOH, we are under rcu_read_lock() and both epitem and epitem->file have their freeing RCU-delayed. So if we see non-NULL ->epitems.first, we can grab ->f_lock (all epitems in there share the same struct file) and safely recheck the emptiness of ->epitems; again, ->next is still non-NULL, so ep_remove() couldn't have freed head yet. ->f_lock serializes us wrt ep_remove(); the rest is trivial. Note that once head->epitems becomes NULL, nothing can get inserted into it - the only remaining reference to head after that point is from the reverse path check list. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-25	convert ->f_ep_links/->fllink to hlist	Al Viro
	we don't care about the order of elements there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-25	treewide: Convert macro and uses of __section(foo) to __section("foo")	Joe Perches
	Use a more generic form for __section that requires quotes to avoid complications with clang and gcc differences. Remove the quote operator # from compiler_attributes.h __section macro. Convert all unquoted __section(foo) uses to quoted __section("foo"). Also convert __attribute__((section("foo"))) uses to __section("foo") even if the __attribute__ has multiple list entry forms. Conversion done using the script at: https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25	mm: remove kzfree() compatibility definition	Eric Biggers
	Commit 453431a54934 ("mm, treewide: rename kzfree() to kfree_sensitive()") renamed kzfree() to kfree_sensitive(), but it left a compatibility definition of kzfree() to avoid being too disruptive. Since then a few more instances of kzfree() have slipped in. Just get rid of them and remove the compatibility definition once and for all. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25	Merge tag 'locking-urgent-2020-10-25' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Thomas Gleixner: "Just a trivial fix for kernel-doc warnings" * tag 'locking-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/seqlocks: Fix kernel-doc warnings
2020-10-25	Merge branch 'parisc-5.10-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull more parisc updates from Helge Deller: - During this merge window O_NONBLOCK was changed to become 000200000, but we missed that the syscalls timerfd_create(), signalfd4(), eventfd2(), pipe2(), inotify_init1() and userfaultfd() do a strict bit-wise check of the flags parameter. To provide backward compatibility with existing userspace we introduce parisc specific wrappers for those syscalls which filter out the old O_NONBLOCK value and replaces it with the new one. - Prevent HIL bus driver to get stuck when keyboard or mouse isn't attached - Improve error return codes when setting rtc time - Minor documentation fix in pata_ns87415.c * 'parisc-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: ata: pata_ns87415.c: Document support on parisc with superio chip parisc: Add wrapper syscalls to fix O_NONBLOCK flag usage hil/parisc: Disable HIL driver when it gets stuck parisc: Improve error return codes when setting rtc time
2020-10-25	Merge tag '20201024-v4-5.10' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom Pull random32 updates from Willy Tarreau: "Make prandom_u32() less predictable. This is the cleanup of the latest series of prandom_u32 experimentations consisting in using SipHash instead of Tausworthe to produce the randoms used by the network stack. The changes to the files were kept minimal, and the controversial commit that used to take noise from the fast_pool (f227e3ec3b5c) was reverted. Instead, a dedicated "net_rand_noise" per_cpu variable is fed from various sources of activities (networking, scheduling) to perturb the SipHash state using fast, non-trivially predictable data, instead of keeping it fully deterministic. The goal is essentially to make any occasional memory leakage or brute-force attempt useless. The resulting code was verified to be very slightly faster on x86_64 than what is was with the controversial commit above, though this remains barely above measurement noise. It was also tested on i386 and arm, and build- tested only on arm64" Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/ * tag '20201024-v4-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom: random32: add a selftest for the prandom32 code random32: add noise from network and scheduling activity random32: make prandom_u32() output unpredictable
2020-10-24	Merge tag 'block-5.10-2020-10-24' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull block fixes from Jens Axboe: - NVMe pull request from Christoph - rdma error handling fixes (Chao Leng) - fc error handling and reconnect fixes (James Smart) - fix the qid displace when tracing ioctl command (Keith Busch) - don't use BLK_MQ_REQ_NOWAIT for passthru (Chaitanya Kulkarni) - fix MTDT for passthru (Logan Gunthorpe) - blacklist Write Same on more devices (Kai-Heng Feng) - fix an uninitialized work struct (zhenwei pi)" - lightnvm out-of-bounds fix (Colin) - SG allocation leak fix (Doug) - rnbd fixes (Gioh, Guoqing, Jack) - zone error translation fixes (Keith) - kerneldoc markup fix (Mauro) - zram lockdep fix (Peter) - Kill unused io_context members (Yufen) - NUMA memory allocation cleanup (Xianting) - NBD config wakeup fix (Xiubo) * tag 'block-5.10-2020-10-24' of git://git.kernel.dk/linux-block: (27 commits) block: blk-mq: fix a kernel-doc markup nvme-fc: shorten reconnect delay if possible for FC nvme-fc: wait for queues to freeze before calling update_hr_hw_queues nvme-fc: fix error loop in create_hw_io_queues nvme-fc: fix io timeout to abort I/O null_blk: use zone status for max active/open nvmet: don't use BLK_MQ_REQ_NOWAIT for passthru nvmet: cleanup nvmet_passthru_map_sg() nvmet: limit passthru MTDS by BIO_MAX_PAGES nvmet: fix uninitialized work for zero kato nvme-pci: disable Write Zeroes on Sandisk Skyhawk nvme: use queuedata for nvme_req_qid nvme-rdma: fix crash due to incorrect cqe nvme-rdma: fix crash when connect rejected block: remove unused members for io_context blk-mq: remove the calling of local_memory_node() zram: Fix __zram_bvec_{read,write}() locking order skd_main: remove unused including <linux/version.h> sgl_alloc_order: fix memory leak lightnvm: fix out-of-bounds write to array devices->info[] ...
2020-10-24	Merge tag 'io_uring-5.10-2020-10-24' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull io_uring fixes from Jens Axboe: - fsize was missed in previous unification of work flags - Few fixes cleaning up the flags unification creds cases (Pavel) - Fix NUMA affinities for completely unplugged/replugged node for io-wq - Two fallout fixes from the set_fs changes. One local to io_uring, one for the splice entry point that io_uring uses. - Linked timeout fixes (Pavel) - Removal of ->flush() ->files work-around that we don't need anymore with referenced files (Pavel) - Various cleanups (Pavel) * tag 'io_uring-5.10-2020-10-24' of git://git.kernel.dk/linux-block: splice: change exported internal do_splice() helper to take kernel offset io_uring: make loop_rw_iter() use original user supplied pointers io_uring: remove req cancel in ->flush() io-wq: re-set NUMA node affinities if CPUs come online io_uring: don't reuse linked_timeout io_uring: unify fsize with def->work_flags io_uring: fix racy REQ_F_LINK_TIMEOUT clearing io_uring: do poll's hash_node init in common code io_uring: inline io_poll_task_handler() io_uring: remove extra ->file check in poll prep io_uring: make cached_cq_overflow non atomic_t io_uring: inline io_fail_links() io_uring: kill ref get/drop in personality init io_uring: flags-based creds init in queue
2020-10-24	Merge branch 'work.misc' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted stuff all over the place (the largest group here is Christoph's stat cleanups)" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: remove KSTAT_QUERY_FLAGS fs: remove vfs_stat_set_lookup_flags fs: move vfs_fstatat out of line fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat fs: remove vfs_statx_fd fs: omfs: use kmemdup() rather than kmalloc+memcpy [PATCH] reduce boilerplate in fsid handling fs: Remove duplicated flag O_NDELAY occurring twice in VALID_OPEN_FLAGS selftests: mount: add nosymfollow tests Add a "nosymfollow" mount option.
2020-10-24	Merge tag 'dma-mapping-5.10-1' of git://git.infradead.org/users/hch/dma-mapping	Linus Torvalds
	Pull dma-mapping fixes from Christoph Hellwig: - document the new dma_{alloc,free}_pages() API - two fixups for the dma-mapping.h split * tag 'dma-mapping-5.10-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: document dma_{alloc,free}_pages dma-mapping: move more functions to dma-map-ops.h ARM/sa1111: add a missing include of dma-map-ops.h
2020-10-24	random32: add noise from network and scheduling activity	Willy Tarreau
	With the removal of the interrupt perturbations in previous random32 change (random32: make prandom_u32() output unpredictable), the PRNG has become 100% deterministic again. While SipHash is expected to be way more robust against brute force than the previous Tausworthe LFSR, there's still the risk that whoever has even one temporary access to the PRNG's internal state is able to predict all subsequent draws till the next reseed (roughly every minute). This may happen through a side channel attack or any data leak. This patch restores the spirit of commit f227e3ec3b5c ("random32: update the net random state on interrupt and activity") in that it will perturb the internal PRNG's statee using externally collected noise, except that it will not pick that noise from the random pool's bits nor upon interrupt, but will rather combine a few elements along the Tx path that are collectively hard to predict, such as dev, skb and txq pointers, packet length and jiffies values. These ones are combined using a single round of SipHash into a single long variable that is mixed with the net_rand_state upon each invocation. The operation was inlined because it produces very small and efficient code, typically 3 xor, 2 add and 2 rol. The performance was measured to be the same (even very slightly better) than before the switch to SipHash; on a 6-core 12-thread Core i7-8700k equipped with a 40G NIC (i40e), the connection rate dropped from 556k/s to 555k/s while the SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps. Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/ Cc: George Spelvin <lkml@sdf.org> Cc: Amit Klein <aksecurity@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: tytso@mit.edu Cc: Florian Westphal <fw@strlen.de> Cc: Marc Plumb <lkml.mplumb@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2020-10-24	random32: make prandom_u32() output unpredictable	George Spelvin
	Non-cryptographic PRNGs may have great statistical properties, but are usually trivially predictable to someone who knows the algorithm, given a small sample of their output. An LFSR like prandom_u32() is particularly simple, even if the sample is widely scattered bits. It turns out the network stack uses prandom_u32() for some things like random port numbers which it would prefer are not trivially predictable. Predictability led to a practical DNS spoofing attack. Oops. This patch replaces the LFSR with a homebrew cryptographic PRNG based on the SipHash round function, which is in turn seeded with 128 bits of strong random key. (The authors of SipHash have not been consulted about this abuse of their algorithm.) Speed is prioritized over security; attacks are rare, while performance is always wanted. Replacing all callers of prandom_u32() is the quick fix. Whether to reinstate a weaker PRNG for uses which can tolerate it is an open question. Commit f227e3ec3b5c ("random32: update the net random state on interrupt and activity") was an earlier attempt at a solution. This patch replaces it. Reported-by: Amit Klein <aksecurity@gmail.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Eric Dumazet <edumazet@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: tytso@mit.edu Cc: Florian Westphal <fw@strlen.de> Cc: Marc Plumb <lkml.mplumb@gmail.com> Fixes: f227e3ec3b5c ("random32: update the net random state on interrupt and activity") Signed-off-by: George Spelvin <lkml@sdf.org> Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/ [ willy: partial reversal of f227e3ec3b5c; moved SIPROUND definitions to prandom.h for later use; merged George's prandom_seed() proposal; inlined siprand_u32(); replaced the net_rand_state[] array with 4 members to fix a build issue; cosmetic cleanups to make checkpatch happy; fixed RANDOM32_SELFTEST build ] Signed-off-by: Willy Tarreau <w@1wt.eu>
2020-10-24	Merge tag 'riscv-for-linus-5.10-mw1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: "Just a single patch set: the remainder of Christoph's work to remove set_fs, including the RISC-V portion" * tag 'riscv-for-linus-5.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: remove address space overrides using set_fs() riscv: implement __get_kernel_nofault and __put_user_nofault riscv: refactor __get_user and __put_user riscv: use memcpy based uaccess for nommu again asm-generic: make the set_fs implementation optional asm-generic: add nommu implementations of __{get,put}_kernel_nofault asm-generic: improve the nommu {get,put}_user handling uaccess: provide a generic TASK_SIZE_MAX definition
2020-10-24	Merge tag 'armsoc-drivers' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC-related driver updates from Olof Johansson: "Various driver updates for platforms. A bulk of this is smaller fixes or cleanups, but some of the new material this time around is: - Support for Nvidia Tegra234 SoC - Ring accelerator support for TI AM65x - PRUSS driver for TI platforms - Renesas support for R-Car V3U SoC - Reset support for Cortex-M4 processor on i.MX8MQ There are also new socinfo entries for a handful of different SoCs and platforms" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (131 commits) drm/mediatek: reduce clear event soc: mediatek: cmdq: add clear option in cmdq_pkt_wfe api soc: mediatek: cmdq: add jump function soc: mediatek: cmdq: add write_s_mask value function soc: mediatek: cmdq: add write_s value function soc: mediatek: cmdq: add read_s function soc: mediatek: cmdq: add write_s_mask function soc: mediatek: cmdq: add write_s function soc: mediatek: cmdq: add address shift in jump soc: mediatek: mtk-infracfg: Fix kerneldoc soc: amlogic: pm-domains: use always-on flag reset: sti: reset-syscfg: fix struct description warnings reset: imx7: add the cm4 reset for i.MX8MQ dt-bindings: reset: imx8mq: add m4 reset reset: Fix and extend kerneldoc reset: reset-zynqmp: Added support for Versal platform dt-bindings: reset: Updated binding for Versal reset driver reset: imx7: Support module build soc: fsl: qe: Remove unnessesary check in ucc_set_tdm_rxtx_clk soc: fsl: qman: convert to use be32_add_cpu() ...