linux.git - Linus' kernel tree

Age	Commit message	Author
2021-05-12	f2fs: return EINVAL for hole cases in swap file	Jaegeuk Kim
	This tries to fix xfstests/generic/495. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-12	ACPI: PM: Add ACPI ID of Alder Lake Fan	Sumeet Pawnikar
	Add a new unique fan ACPI device ID for Alder Lake to support it in acpi_dev_pm_attach() function. Fixes: 38748bcb940e ("ACPI: DPTF: Support Alder Lake") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Cc: 5.10+ <stable@vger.kernel.org> # 5.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-05-12	blkdev.h: remove unused codes blk_account_rq	Lin Feng
	The last users of blk_account_rq went away with commit a1ce35fa49852db ("block: remove dead elevator code"). It now has no callers, so it can be safely removed. Signed-off-by: Lin Feng <linf@wangsu.com> Link: https://lore.kernel.org/r/20210512100124.173769-1-linf@wangsu.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-12	block, bfq: avoid circular stable merges	Paolo Valente
	BFQ may merge a new bfq_queue, stably, with the last bfq_queue created. In particular, BFQ first waits a little bit for some I/O to flow inside the new queue, say Q2, if this is needed to understand whether it is better or worse to merge Q2 with the last queue created, say Q1. This delayed stable merge is performed by assigning bic->stable_merge_bfqq = Q1, for the bic associated with Q1. Yet, while waiting for some I/O to flow in Q2, a non-stable queue merge of Q2 with Q1 may happen, causing the bic previously associated with Q2 to be associated with exactly Q1 (bic->bfqq = Q1). After that, Q2 and Q1 may happen to be split, and, in the split, Q1 may happen to be recycled as a non-shared bfq_queue. In that case, Q1 may then happen to undergo a stable merge with the bfq_queue pointed by bic->stable_merge_bfqq. Yet bic->stable_merge_bfqq still points to Q1. So Q1 would be merged with itself. This commit fixes this error by intercepting this situation, and canceling the schedule of the stable merge. Fixes: 430a67f9d616 ("block, bfq: merge bursts of newly-created queues") Signed-off-by: Pietro Pedroni <pedroni.pietro.96@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20210512094352.85545-2-paolo.valente@linaro.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
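The self-merge described above can be modelled in a few lines. This is a toy sketch in Python, not the kernel C code: `Queue`, `Bic` and `pick_stable_merge_target()` are hypothetical names, only the `bfqq` and `stable_merge_bfqq` fields come from the commit message, and the sketch assumes the bic in question is the one that started out using Q2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Queue:
    """Stand-in for a bfq_queue; compared by identity only."""
    name: str


@dataclass
class Bic:
    """Stand-in for the per-process I/O context (bic)."""
    bfqq: Optional[Queue] = None               # queue currently in use
    stable_merge_bfqq: Optional[Queue] = None  # pending delayed stable merge


def pick_stable_merge_target(bic: Bic) -> Optional[Queue]:
    """Return the queue to stably merge with, or None if there is nothing to do.

    Hypothetical guard: if the scheduled target is the very queue the bic
    already uses, cancel the schedule instead of merging a queue with itself.
    """
    target = bic.stable_merge_bfqq
    if target is None:
        return None
    if target is bic.bfqq:
        bic.stable_merge_bfqq = None  # cancel the circular merge
        return None
    return target


q1, q2 = Queue("Q1"), Queue("Q2")
bic = Bic(bfqq=q2, stable_merge_bfqq=q1)  # delayed stable merge with Q1 scheduled
bic.bfqq = q1                             # a non-stable merge re-points the bic at Q1
result = pick_stable_merge_target(bic)    # without the guard: Q1 merged with Q1
```

With the guard in place the circular case yields no merge target and the pending schedule is dropped, while an ordinary pending merge is still returned unchanged.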
2021-05-12	fs/mount_setattr: tighten permission checks	Christian Brauner
	We currently don't have any filesystems that support idmapped mounts which are mountable inside a user namespace. That was a deliberate decision for now as a userns root can just mount the filesystem themselves. So enforce this restriction explicitly until there's a real use-case for this. This way we can notice it and will have a chance to adapt and audit our translation helpers and fstests appropriately if we need to support such filesystems. Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org CC: linux-fsdevel@vger.kernel.org Suggested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-05-12	spi: Assume GPIO CS active high in ACPI case	Andy Shevchenko
	Currently GPIO CS handling, when descriptors are in use, doesn't take into consideration that in ACPI case the default polarity is Active High and can't be altered. Instead we have to use the per-chip definition provided by SPISerialBus() resource. Fixes: 766c6b63aa04 ("spi: fix client driver breakages when using GPIO descriptors") Cc: Liguang Zhang <zhangliguang@linux.alibaba.com> Cc: Jay Fang <f.fangjian@huawei.com> Cc: Sven Van Asbroeck <thesven73@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Link: https://lore.kernel.org/r/20210511140912.30757-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	spi: sprd: Add missing MODULE_DEVICE_TABLE	Chunyan Zhang
	MODULE_DEVICE_TABLE is used to extract the device information out of the driver and builds a table when being compiled. If using this macro, kernel can find the driver if available when the device is plugged in, and then loads that driver and initializes the device. Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com> Link: https://lore.kernel.org/r/20210512093534.243040-1-zhang.lyra@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	ASoC: cs42l52: Minor tidy up of error paths	Charles Keepax
	Fixup a needlessly initialised variable and an unchecked return value. Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20210511175718.15416-5-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	ASoC: cs35l32: Add missing regmap use_single config	Charles Keepax
	This device requires single register transactions. This will definitely cause problems with the new device ID parsing, which uses regmap_bulk_read, but might also show up in the cache sync sometimes. Add the missing flags to the regmap_config. Fixes: 283160f1419d ("ASoC: cs35l32: Minor error paths fixups") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20210511175718.15416-4-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	ASoC: cs35l34: Add missing regmap use_single config	Charles Keepax
	This device requires single register transactions. This will definitely cause problems with the new device ID parsing, which uses regmap_bulk_read, but might also show up in the cache sync sometimes. Add the missing flags to the regmap_config. Fixes: 8cb9b001635c ("ASoC: cs35l34: Minor error paths fixups") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20210511175718.15416-3-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	ASoC: cs42l73: Add missing regmap use_single config	Charles Keepax
	This device requires single register transactions. This will definitely cause problems with the new device ID parsing, which uses regmap_bulk_read, but might also show up in the cache sync sometimes. Add the missing flags to the regmap_config. Fixes: 26495252fe0d ("ASoC: cs42l73: Minor error paths fixups") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20210511175718.15416-2-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	ASoC: cs53l30: Add missing regmap use_single config	Charles Keepax
	This device requires single register transactions. This will definitely cause problems with the new device ID parsing, which uses regmap_bulk_read, but might also show up in the cache sync sometimes. Add the missing flags to the regmap_config. Fixes: 4fc81bc88ad9 ("ASoC: cs53l30: Minor error paths fixups") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20210511175718.15416-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	ASoC: sti-sas: add missing MODULE_DEVICE_TABLE	Zou Wei
	This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Link: https://lore.kernel.org/r/1620789145-14936-1-git-send-email-zou_wei@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	ASoC: soc-dai.h: Align the word of comment for SND_SOC_DAIFMT_CBC_CFC	Kuninori Morimoto
	Let's use "consumer" instead of "follower". Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/8735usc1gr.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-05-12	gpio: tegra186: Don't set parent IRQ affinity	Jon Hunter
	When hotplugging CPUs on Tegra186 and Tegra194 errors such as the following are seen ... IRQ63: set affinity failed(-22). IRQ65: set affinity failed(-22). IRQ66: set affinity failed(-22). IRQ67: set affinity failed(-22). Looking at the /proc/interrupts the above are all interrupts associated with GPIOs. The reason why these error messages occur is because there is no 'parent_data' associated with any of the GPIO interrupts and so tegra186_irq_set_affinity() simply returns -EINVAL. To understand why there is no 'parent_data' it is first necessary to understand that in addition to the GPIO interrupts being routed to the interrupt controller (GIC), the interrupts for some GPIOs are also routed to the Tegra Power Management Controller (PMC) to wake up the system from low power states. In order to configure GPIO events as wake events in the PMC, the PMC is configured as IRQ parent domain for the GPIO IRQ domain. Originally the GIC was the IRQ parent domain of the PMC and although this was working, this started causing issues once commit 64a267e9a41c ("irqchip/gic: Configure SGIs as standard interrupts") was added, because technically, the GIC is not a parent of the PMC. Commit c351ab7bf2a5 ("soc/tegra: pmc: Don't create fake interrupt hierarchy levels") fixed this by severing the IRQ domain hierarchy for the Tegra GPIOs and hence, there may be no IRQ parent domain for the GPIOs. The GPIO controllers on Tegra186 and Tegra194 have either one or six interrupt lines to the interrupt controller. For GPIO controllers with six interrupts, the mapping of the GPIO interrupt to the controller interrupt is configurable within the GPIO controller. Currently a default mapping is used, however, it could be possible to use the set affinity callback for the Tegra186 GPIO driver to do something a bit more interesting. 
Because interrupts for all GPIOs currently have the same mapping, and any attempt to configure the affinity for a given GPIO can conflict with another that shares the same IRQ, it is simpler for now to just remove set affinity support, which also avoids the above warnings being seen. Cc: <stable@vger.kernel.org> Fixes: c4e1f7d92cd6 ("gpio: tegra186: Set affinity callback to parent") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-05-12	gpio: xilinx: Correct kernel doc for xgpio_probe()	Andy Shevchenko
	Kernel doc validator complains: .../gpio-xilinx.c:556: warning: expecting prototype for xgpio_of_probe(). Prototype was for xgpio_probe() instead. Correct as suggested by changing the name of the function in the doc. Fixes: 749564ffd52d ("gpio/xilinx: Convert the driver to platform device interface") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Neeli Srinivas <sneeli@xilinx.com> Reviewed-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-05-12	gpio: cadence: Add missing MODULE_DEVICE_TABLE	Zou Wei
	This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-05-12	vfio-ccw: Serialize FSM IDLE state with I/O completion	Eric Farman
	Today, the stacked call to vfio_ccw_sch_io_todo() does three things:
	1) Update a solicited IRB with CP information, and release the CP if the interrupt was the end of a START operation.
	2) Copy the IRB data into the io_region, under the protection of the io_mutex.
	3) Reset the vfio-ccw FSM state to IDLE to acknowledge that vfio-ccw can accept more work.
	The trouble is that step 3 is (A) invoked for both solicited and unsolicited interrupts, and (B) sitting after the mutex for step 2. This second piece becomes a problem if it processes an interrupt for a CLEAR SUBCHANNEL while another thread initiates a START, thus allowing the CP and FSM states to get out of sync. That is:
	    CPU 1                           CPU 2
	fsm_do_clear()
	fsm_irq()
	                                fsm_io_request()
	vfio_ccw_sch_io_todo()
	                                fsm_io_helper()
	Since the FSM state and CP should be kept in sync, let's make a note when the CP is released, and rely on that as an indication that the FSM should also be reset at the end of this routine and open up the device for more work. Signed-off-by: Eric Farman <farman@linux.ibm.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210511195631.3995081-4-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
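The idea of tying the FSM reset to the CP release can be sketched as a small state model. This is an illustrative Python toy, not the driver code: `State`, `io_todo()` and its parameters are hypothetical names, and the local `cp_is_finished` flag is an assumption about how "make a note when the CP is released" might look.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    CP_PENDING = auto()


def io_todo(state, cp_initialized, irb_is_final_for_start):
    """Model of the interrupt bottom half; returns (new_state, cp_initialized)."""
    cp_is_finished = False
    if irb_is_final_for_start and cp_initialized:
        cp_initialized = False   # step 1: release the channel program
        cp_is_finished = True    # remember that we did so
    # Step 2 (copying the IRB under io_mutex) is omitted from this sketch.
    if cp_is_finished:
        state = State.IDLE       # step 3: reset only when the CP went away
    return state, cp_initialized


# Interrupt for a CLEAR arrives while another thread has a START in flight:
# no CP was released here, so the FSM state is left alone.
after_clear = io_todo(State.CP_PENDING, cp_initialized=True,
                      irb_is_final_for_start=False)
# Final interrupt for the START itself: CP released, FSM back to IDLE.
after_start = io_todo(State.CP_PENDING, cp_initialized=True,
                      irb_is_final_for_start=True)
```

In this model the FSM state and the CP can only change together, which is the invariant the commit message asks for.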
2021-05-12	vfio-ccw: Reset FSM state to IDLE inside FSM	Eric Farman
	When an I/O request is made, the fsm_io_request() routine moves the FSM state from IDLE to CP_PROCESSING, and then fsm_io_helper() moves it to CP_PENDING if the START SUBCHANNEL received a cc0. Yet, the error case to go from CP_PROCESSING back to IDLE is done after the FSM call returns. Let's move this up into the FSM proper, to provide some better symmetry when unwinding in this case. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <20210511195631.3995081-3-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2021-05-12	vfio-ccw: Check initialized flag in cp_init()	Eric Farman
	We have a really nice flag in the channel_program struct that indicates if it had been initialized by cp_init(), and use it as a guard in the other cp accessor routines, but not for a duplicate call into cp_init(). The possibility of this occurring is low, because that flow is protected by the private->io_mutex and FSM CP_PROCESSING state. But then why bother checking it in (for example) cp_prefetch() then? Let's just be consistent and check for that in cp_init() too. Fixes: 71189f263f8a3 ("vfio-ccw: make it safe to access channel programs") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <20210511195631.3995081-2-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
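A minimal sketch of the consistency argument, in Python rather than C. The class and method names are hypothetical, and the choice of error codes (`EBUSY` for a duplicate init, `EINVAL` for use before init) is an assumption made for the example.

```python
import errno


class ChannelProgram:
    """Toy stand-in for struct channel_program."""

    def __init__(self):
        self.initialized = False

    def init(self):
        # Same guard the accessors use, applied to init itself:
        # refuse to initialize a channel program twice.
        if self.initialized:
            return -errno.EBUSY
        self.initialized = True
        return 0

    def prefetch(self):
        # Accessor-style guard: refuse to touch an uninitialized program.
        if not self.initialized:
            return -errno.EINVAL
        return 0


cp = ChannelProgram()
before_init = cp.prefetch()
first, second = cp.init(), cp.init()
after_init = cp.prefetch()
```

Every entry point now checks the same flag, so a duplicate init fails loudly instead of silently overwriting state.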
2021-05-12	sched/fair: Fix clearing of has_idle_cores flag in select_idle_cpu()	Gautham R. Shenoy
	In commit: 9fe1f127b913 ("sched/fair: Merge select_idle_core/cpu()") in select_idle_cpu(), we check if an idle core is present in the LLC of the target CPU via the flag "has_idle_cores". We look for the idle core in select_idle_cores(). If select_idle_cores() isn't able to find an idle core/CPU, we need to unset the has_idle_cores flag in the LLC of the target to prevent other CPUs from going down this route. However, the current code is unsetting it in the LLC of the current CPU instead of the target CPU. This patch fixes this issue. Fixes: 9fe1f127b913 ("sched/fair: Merge select_idle_core/cpu()") Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Link: https://lore.kernel.org/r/1620746169-13996-1-git-send-email-ego@linux.vnet.ibm.com
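The difference between clearing the flag for the current CPU's LLC and for the target's LLC shows up in a toy model. The Python below is only a sketch: the four-CPUs-per-LLC topology, `llc_of()` and the dictionary of flags are assumptions made for the example, not scheduler data structures.

```python
CPUS_PER_LLC = 4  # assumed topology for the sketch


def llc_of(cpu):
    """Map a CPU number to the index of its last-level-cache domain."""
    return cpu // CPUS_PER_LLC


def select_idle_cpu(current_cpu, target_cpu, idle_cores, has_idle_cores):
    """Look for an idle core in the target's LLC; clear the hint on failure.

    current_cpu is accepted only to make the point that it must NOT be
    used when deciding which LLC's hint to clear.
    """
    llc = llc_of(target_cpu)
    if not has_idle_cores[llc]:
        return None
    for core in sorted(idle_cores):
        if llc_of(core) == llc:
            return core
    # Search failed: clear the hint for the target's LLC,
    # not for llc_of(current_cpu).
    has_idle_cores[llc] = False
    return None


flags = {0: True, 1: True}
# The waker runs on CPU 1 (LLC 0); the target is CPU 6 (LLC 1).
# The only idle core, CPU 2, sits in LLC 0.
picked = select_idle_cpu(current_cpu=1, target_cpu=6,
                         idle_cores={2}, has_idle_cores=flags)
```

After the failed search only LLC 1 loses its hint; LLC 0, which really does have an idle core, keeps it.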
2021-05-12	can: isotp: prevent race between isotp_bind() and isotp_setsockopt()	Norbert Slusarek
	A race condition was found in isotp_setsockopt() which allows socket options to be changed after the socket was bound. For the specific case of SF_BROADCAST support, this might lead to a possible use-after-free because can_rx_unregister() is not called. Checking for the flag under the socket lock in isotp_bind() and taking the lock in isotp_setsockopt() fixes the issue. Fixes: 921ca574cd38 ("can: isotp: add SF_BROADCAST support for functional addressing") Link: https://lore.kernel.org/r/trinity-e6ae9efa-9afb-4326-84c0-f3609b9b8168-1620773528307@3c-app-gmx-bs06 Reported-by: Norbert Slusarek <nslusarek@gmx.net> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Norbert Slusarek <nslusarek@gmx.net> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
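The locking pattern can be illustrated with a small Python model. Everything here is hypothetical (the `ToySocket` class, the flag value, the boolean return codes); it only shows why the option change and the bind-time decision need the same lock, assuming that bind registers a receiver unless the broadcast flag is set and that release mirrors that decision.

```python
import threading

SF_BROADCAST = 0x1  # placeholder value for the sketch, not the kernel constant


class ToySocket:
    def __init__(self):
        self._lock = threading.Lock()
        self.flags = 0
        self.bound = False
        self.rx_registered = False

    def setsockopt(self, flags):
        with self._lock:          # same lock as bind()
            if self.bound:
                return False      # options are frozen once bound
            self.flags = flags
            return True

    def bind(self):
        with self._lock:
            if self.bound:
                return False
            # The flag is read under the lock, so it cannot change
            # between this decision and the socket becoming bound.
            self.rx_registered = not (self.flags & SF_BROADCAST)
            self.bound = True
            return True

    def release(self):
        with self._lock:
            # Mirrors bind(): only unregister what was registered.
            if self.bound and not (self.flags & SF_BROADCAST):
                self.rx_registered = False
            self.bound = False


sock = ToySocket()
sock.bind()
changed_after_bind = sock.setsockopt(SF_BROADCAST)
sock.release()
```

Because the flag can no longer flip after bind, release always sees the same flag value that bind acted on, so the registered receiver is always torn down.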
2021-05-11	f2fs: avoid swapon failure by giving a warning first	Jaegeuk Kim
	The final solution can be migrating blocks to form a section-aligned file internally. Meanwhile, let's ask users to do that when preparing the swap file initially like: 1) create() 2) ioctl(F2FS_IOC_SET_PIN_FILE) 3) fallocate() Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: 36e4d95891ed ("f2fs: check if swapfile is section-alligned") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11	blk-iocost: fix weight updates of inner active iocgs	Tejun Heo
	When the weight of an active iocg is updated, weight_updated() is called which in turn calls __propagate_weights() to update the active and inuse weights so that the effective hierarchical weights are updated accordingly. The current implementation is incorrect for inner active nodes. For an active leaf iocg, inuse can be any value between 1 and active and the difference represents how much the iocg is donating. When weight is updated, as long as inuse is clamped between 1 and the new weight, we're alright and this is what __propagate_weights() currently implements. However, that's not how an active inner node's inuse is set. An inner node's inuse is solely determined by the ratio between the sums of inuse's and active's of its children - i.e. they're results of propagating the leaves' active and inuse weights upwards. __propagate_weights() incorrectly applies the same clamping as for a leaf when an active inner node's weight is updated. Consider a hierarchy which looks like the following with saturating workloads in AA and BB.
	     R
	   /   \
	  A     B
	  |     |
	 AA     BB
	1. For both A and B, active=100, inuse=100, hwa=0.5, hwi=0.5.
	2. echo 200 > A/io.weight
	3. __propagate_weights() updates A's active to 200 and leaves inuse at 100 as it's already between 1 and the new active, making A:active=200, A:inuse=100. As R's active_sum is updated along with A's active, A:hwa=2/3, B:hwa=1/3. However, because the inuses didn't change, the hwi's remain unchanged at 0.5.
	4. The weight of A is now twice that of B but AA and BB still have the same hwi of 0.5 and thus are doing the same amount of IOs.
	Fix it by making __propagate_weights() always calculate the inuse of an active inner iocg based on the ratio of child_inuse_sum to child_active_sum.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dan Schatzberg <dschatzberg@fb.com> Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Cc: stable@vger.kernel.org # v5.4+ Link: https://lore.kernel.org/r/YJsxnLZV1MnBcqjj@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
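The arithmetic in the example above can be checked with exact fractions. The Python below is a worked sketch, not the blk-iocost code: the function names are hypothetical, and it assumes the inner node's inuse is computed as active * child_inuse_sum / child_active_sum, which is one reading of "based on the ratio of child_inuse_sum to child_active_sum". AA is assumed to have active=100 and inuse=100.

```python
from fractions import Fraction


def leaf_style_inuse(old_inuse, new_active):
    """Clamp inuse into [1, new_active]; correct for leaves only."""
    return max(1, min(old_inuse, new_active))


def inner_inuse(new_active, child_inuse_sum, child_active_sum):
    """Assumed rule for inner nodes: scale active by the children's ratio."""
    return Fraction(new_active * child_inuse_sum, child_active_sum)


def share(weight, sibling_weights):
    """Fraction of the parent's total that this node gets."""
    return Fraction(weight, sum(sibling_weights))


b_active = b_inuse = 100
a_active = 200                          # after: echo 200 > A/io.weight

# Clamping A like a leaf leaves inuse at 100.
a_inuse_buggy = leaf_style_inuse(100, a_active)
# Deriving A's inuse from its only child AA (active=100, inuse=100).
a_inuse_fixed = inner_inuse(a_active, child_inuse_sum=100, child_active_sum=100)

hwa_a = share(a_active, [a_active, b_active])
hwi_a_buggy = share(a_inuse_buggy, [a_inuse_buggy, b_inuse])
hwi_a_fixed = share(a_inuse_fixed, [a_inuse_fixed, b_inuse])
```

Under these assumptions the clamped variant keeps A's hwi at 1/2 even though its hwa is 2/3, while the ratio-based variant brings hwi up to 2/3 to match.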
2021-05-12	KVM: PPC: Book3S HV: Fix kvm_unmap_gfn_range_hv() for Hash MMU	Michael Ellerman
	Commit 32b48bf8514c ("KVM: PPC: Book3S HV: Fix conversion to gfn-based MMU notifier callbacks") fixed kvm_unmap_gfn_range_hv() by adding a for loop over each gfn in the range. But for the Hash MMU it repeatedly calls kvm_unmap_rmapp() with the first gfn of the range, rather than iterating through the range. This exhibits as strange guest behaviour, sometimes crashing in firmware, or booting and then guest userspace crashing unexpectedly. Fix it by passing the iterator, gfn, to kvm_unmap_rmapp(). Fixes: 32b48bf8514c ("KVM: PPC: Book3S HV: Fix conversion to gfn-based MMU notifier callbacks") Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210511105459.800788-1-mpe@ellerman.id.au
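The bug pattern is easy to reproduce in miniature. The Python below is a toy, with hypothetical function names and a set standing in for the mapped guest frames; it assumes the range is start-inclusive and end-exclusive.

```python
def unmap_range_buggy(start, end, unmap_one):
    for gfn in range(start, end):
        unmap_one(start)   # bug: always passes the first gfn of the range


def unmap_range_fixed(start, end, unmap_one):
    for gfn in range(start, end):
        unmap_one(gfn)     # fix: pass the loop iterator


mapped_buggy = set(range(8))
unmap_range_buggy(2, 5, mapped_buggy.discard)

mapped_fixed = set(range(8))
unmap_range_fixed(2, 5, mapped_fixed.discard)
```

The buggy loop runs the right number of times but only ever unmaps frame 2, leaving frames 3 and 4 mapped, which is the kind of stale mapping that would explain the erratic guest behaviour described above.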
2021-05-12	powerpc/legacy_serial: Fix UBSAN: array-index-out-of-bounds	Christophe Leroy
	UBSAN complains when a pointer is calculated with an invalid 'legacy_serial_console' index, although the index is verified before dereferencing the pointer. Fix it by checking 'legacy_serial_console' validity before calculating pointers. Fixes: 0bd3f9e953bd ("powerpc/legacy_serial: Use early_ioremap()") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210511010712.750096-1-mpe@ellerman.id.au
2021-05-12	powerpc/signal: Fix possible build failure with unsafe_copy_fpr_{to/from}_user	Christophe Leroy
	When neither CONFIG_VSX nor CONFIG_PPC_FPU_REGS are selected, unsafe_copy_fpr_to_user() and unsafe_copy_fpr_from_user() are doing nothing. Then, unless the 'label' operand is used elsewhere, GCC complains about it being defined but not used. To fix that, add an impossible 'goto label'. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cadc0a328bc8e6c5bf133193e7547d5c10ae7895.1620465920.git.christophe.leroy@csgroup.eu
2021-05-12	powerpc/uaccess: Fix __get_user() with CONFIG_CC_HAS_ASM_GOTO_OUTPUT	Christophe Leroy
	Building kernel mainline with GCC 11 leads to following failure when starting 'init': init[1]: bad frame in sys_sigreturn: 7ff5a900 nip 001083cc lr 001083c4 Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b This is an issue due to a segfault happening in __unsafe_restore_general_regs() in a loop copying registers from user to kernel: 10: 7d 09 03 a6 mtctr r8 14: 80 ca 00 00 lwz r6,0(r10) 18: 80 ea 00 04 lwz r7,4(r10) 1c: 90 c9 00 08 stw r6,8(r9) 20: 90 e9 00 0c stw r7,12(r9) 24: 39 0a 00 08 addi r8,r10,8 28: 39 29 00 08 addi r9,r9,8 2c: 81 4a 00 08 lwz r10,8(r10) <== r10 is clobbered here 30: 81 6a 00 0c lwz r11,12(r10) 34: 91 49 00 08 stw r10,8(r9) 38: 91 69 00 0c stw r11,12(r9) 3c: 39 48 00 08 addi r10,r8,8 40: 39 29 00 08 addi r9,r9,8 44: 42 00 ff d0 bdnz 14 <__unsafe_restore_general_regs+0x14> As shown above, this is due to r10 being re-used by GCC. This didn't happen with CLANG. This is fixed by tagging 'x' output as an earlyclobber operand in __get_user_asm2_goto(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cf0a050d124d4f426cdc7a74009d17b01d8d8969.1620465917.git.christophe.leroy@csgroup.eu
2021-05-12	powerpc/pseries: warn if recursing into the hcall tracing code	Nicholas Piggin
	The hcall tracing code has a recursion check built in, which skips tracing if we are already tracing an hcall. However if the tracing code has problems with recursion, this check may not catch all cases because the tracing code could be invoked from a different tracepoint first, then make an hcall that gets traced, then recurse. Add an explicit warning if recursion is detected here, which might help to notice tracing code making hcalls. Really the core trace code should have its own recursion checking and warnings though. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210508101455.1578318-5-npiggin@gmail.com
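A recursion check that also leaves a visible trace can be sketched as follows. This is a Python toy with hypothetical names: a module-level flag stands in for what would be per-CPU state in the kernel, and a list stands in for the kernel's warning mechanism.

```python
recursion_warnings = []   # stand-in for a kernel warning
_in_hcall_trace = False   # the kernel would keep this per CPU


def trace_hcall(name, emit):
    """Trace one hcall; skip and record a warning if already tracing."""
    global _in_hcall_trace
    if _in_hcall_trace:
        recursion_warnings.append(name)  # make the recursion noticeable
        return False
    _in_hcall_trace = True
    try:
        emit(name)
    finally:
        _in_hcall_trace = False
    return True


events = []


def emit_that_makes_an_hcall(name):
    events.append(name)
    # The tracing code itself ends up making another traced hcall.
    trace_hcall("H_CONFER", events.append)


ok = trace_hcall("H_CEDE", emit_that_makes_an_hcall)
```

The nested call is skipped as before, but it is no longer silent: the warning list records which hcall was attempted from inside the tracing path.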
2021-05-12	powerpc/pseries: use notrace hcall variant for H_CEDE idle	Nicholas Piggin
	Rather than special-case H_CEDE in the hcall trace wrappers, make the idle H_CEDE call use plpar_hcall_norets_notrace(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210508101455.1578318-4-npiggin@gmail.com
2021-05-12	powerpc/pseries: Don't trace hcall tracing wrapper	Nicholas Piggin
	This doesn't seem very useful to trace before the recursion check, even if the ftrace code has any recursion checks of its own. Be on the safe side and don't trace the hcall trace wrappers. Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210508101455.1578318-3-npiggin@gmail.com
2021-05-12	powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks	Nicholas Piggin
	The paravirt queued spinlock slow path adds itself to the queue then calls pv_wait to wait for the lock to become free. This is implemented by calling H_CONFER to donate cycles. When hcall tracing is enabled, this H_CONFER call can lead to a spin lock being taken in the tracing code, which will result in the lock being taken again, which will also go to the slow path because it queues behind itself and so won't ever make progress. An example trace of a deadlock: __pv_queued_spin_lock_slowpath trace_clock_global ring_buffer_lock_reserve trace_event_buffer_lock_reserve trace_event_buffer_reserve trace_event_raw_event_hcall_exit __trace_hcall_exit plpar_hcall_norets_trace __pv_queued_spin_lock_slowpath trace_clock_global ring_buffer_lock_reserve trace_event_buffer_lock_reserve trace_event_buffer_reserve trace_event_raw_event_rcu_dyntick rcu_irq_exit irq_exit __do_irq call_do_irq do_IRQ hardware_interrupt_common_virt Fix this by introducing plpar_hcall_norets_notrace(), and using that to make SPLPAR virtual processor dispatching hcalls by the paravirt spinlock code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210508101455.1578318-2-npiggin@gmail.com
2021-05-12	powerpc/syscall: Calling kuap_save_and_lock() is wrong	Christophe Leroy
	kuap_save_and_lock() is only for interrupts inside the kernel. System calls only come from user, so calling kuap_save_and_lock() is wrong. Fixes: c16728835eec ("powerpc/32: Manage KUAP in C") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/332773775cf24a422105dee2d383fb8f04589045.1620302182.git.christophe.leroy@csgroup.eu
2021-05-12	powerpc/interrupts: Fix kuep_unlock() call	Christophe Leroy
	Same as kuap_user_restore(), kuep_unlock() has to be called when really returning to user, that is in interrupt_exit_user_prepare(), not in interrupt_exit_prepare(). Fixes: b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b831e54a2579db24fbef836ed415588ce2b3e825.1620312573.git.christophe.leroy@csgroup.eu
2021-05-11	net: ipa: memory region array is variable size	Alex Elder
	IPA configuration data includes an array of memory region descriptors. That was a fixed-size array at one time, but at some point we started defining it such that it was only as big as required for a given platform. The actual number of entries in the array is recorded in the configuration data along with the array. A loop in ipa_mem_config() still assumes the array has entries for all defined memory region IDs. As a result, this loop can go past the end of the actual array and attempt to write "canary" values based on nonsensical data. Fix this, by stashing the number of entries in the array, and using that rather than IPA_MEM_COUNT in the initialization loop found in ipa_mem_config(). The only remaining use of IPA_MEM_COUNT is in a validation check to ensure configuration data doesn't have too many entries. That's fine for now. Fixes: 3128aae8c439a ("net: ipa: redefine struct ipa_mem_data") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
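The loop-bound change can be sketched in Python. The names and values are hypothetical apart from IPA_MEM_COUNT, the dictionaries stand in for memory region descriptors, and the canary placement (4-byte slots immediately before each region) is an assumption made for the example. Note that Python raises IndexError on an out-of-range index, whereas C would read whatever follows the array.

```python
IPA_MEM_COUNT = 8  # hypothetical: number of region IDs the driver knows about


def canary_offsets(mem, mem_count):
    """Return the offsets where canaries would be written.

    Iterates over mem_count entries (the stashed array size), never over
    IPA_MEM_COUNT, which is only used as an upper bound for validation.
    """
    if mem_count > IPA_MEM_COUNT:
        raise ValueError("too many memory regions in configuration data")
    offsets = []
    for i in range(mem_count):  # not range(IPA_MEM_COUNT)
        region = mem[i]
        for k in range(region["canary_count"]):
            offsets.append(region["offset"] - 4 * (k + 1))
    return offsets


# A platform that defines only three of the eight possible regions.
platform_mem = [
    {"offset": 0x0008, "canary_count": 2},
    {"offset": 0x0100, "canary_count": 0},
    {"offset": 0x0200, "canary_count": 1},
]
offsets = canary_offsets(platform_mem, len(platform_mem))
```

Looping to IPA_MEM_COUNT here would index past the three defined entries; bounding the loop by the recorded count keeps it inside the array regardless of how many region IDs exist.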
2021-05-11	ionic: fix ptp support config breakage	Shannon Nelson
	When IONIC=y and PTP_1588_CLOCK=m were set in the .config file the driver link failed with undefined references. We add the dependency "depends on PTP_1588_CLOCK || !PTP_1588_CLOCK" to clear this up. If PTP_1588_CLOCK=m, the depends limits IONIC to =m (or disabled). If PTP_1588_CLOCK is disabled, IONIC can be any of y/m/n. Fixes: 61db421da31b ("ionic: link in the new hw timestamp code") Reported-by: kernel test robot <lkp@intel.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Allen Hubbe <allenbh@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
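Why "X || !X" is not a no-op follows from Kconfig's tristate arithmetic, where n, m and y are ordered values, "||" takes the maximum and "!" maps a value to its mirror image. The Python below evaluates the dependency for each setting; the function names are hypothetical, the arithmetic is the one described in the kernel's Kconfig language documentation.

```python
N, M, Y = 0, 1, 2  # tristate values: n < m < y


def t_not(x):
    """Kconfig '!': y becomes n, n becomes y, m stays m."""
    return Y - x


def t_or(a, b):
    """Kconfig '||': the larger of the two tristate values."""
    return max(a, b)


def ionic_limit(ptp):
    """Upper bound placed on IONIC by: PTP_1588_CLOCK || !PTP_1588_CLOCK."""
    return t_or(ptp, t_not(ptp))


limits = {name: ionic_limit(value)
          for name, value in (("n", N), ("m", M), ("y", Y))}
```

The expression evaluates to y when PTP_1588_CLOCK is n or y, and to m only when PTP_1588_CLOCK is m, which caps IONIC at m in exactly the configuration that failed to link.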
2021-05-11	mptcp: fix data stream corruption	Paolo Abeni
	Maxim reported several issues when forcing a TCP transparent proxy to use the MPTCP protocol for the inbound connections. He also provided a clean reproducer. The problem boils down to 'mptcp_frag_can_collapse_to()' assuming that only MPTCP will use the given page_frag. If others - e.g. the plain TCP protocol - allocate page fragments, we can end-up re-using already allocated memory for mptcp_data_frag. Fix the issue ensuring that the to-be-expanded data fragment is located at the current page frag end. v1 -> v2: - added missing fixes tag (Mat) Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/178 Reported-and-tested-by: Maxim Galaganov <max@internet.ru> Fixes: 18b683bff89d ("mptcp: queue data for mptcp level retransmission") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
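The extra condition can be illustrated with a toy model of a shared page fragment. The Python below is a sketch: the class and field names are hypothetical stand-ins, and the check shown (the data fragment must end exactly where the page frag's free space begins) is one way to express "located at the current page frag end".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageFrag:
    page: int     # identifies the backing page
    offset: int   # first free byte in the page
    size: int     # total bytes in the page


@dataclass
class DataFrag:
    page: int
    offset: int   # where this fragment's data starts
    data_len: int


def can_collapse(pfrag: PageFrag, dfrag: Optional[DataFrag]) -> bool:
    """True if new data may be appended in place to dfrag."""
    return (dfrag is not None
            and pfrag.page == dfrag.page
            and pfrag.size - pfrag.offset > 0
            # Added condition: nobody else allocated after dfrag.
            and pfrag.offset == dfrag.offset + dfrag.data_len)


pfrag = PageFrag(page=1, offset=100, size=4096)
dfrag = DataFrag(page=1, offset=0, data_len=100)
adjacent = can_collapse(pfrag, dfrag)

pfrag.offset += 50   # another user, e.g. plain TCP, takes 50 bytes
after_other_user = can_collapse(pfrag, dfrag)
```

Without the last condition the second check would still pass, and growing the fragment would overwrite the 50 bytes the other user had just been given.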
2021-05-11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf 2021-05-11 The following pull-request contains BPF updates for your net tree. We've added 13 non-merge commits during the last 8 day(s) which contain a total of 21 files changed, 817 insertions(+), 382 deletions(-). The main changes are: 1) Fix multiple ringbuf bugs in particular to prevent writable mmap of read-only pages, from Andrii Nakryiko & Thadeu Lima de Souza Cascardo. 2) Fix verifier alu32 known-const subregister bound tracking for bitwise operations and/or/xor, from Daniel Borkmann. 3) Reject trampoline attachment for functions with variable arguments, and also add a deny list of other forbidden functions, from Jiri Olsa. 4) Fix nested bpf_bprintf_prepare() calls used by various helpers by switching to per-CPU buffers, from Florent Revest. 5) Fix kernel compilation with BTF debug info on ppc64 due to pahole missing TCP-CC functions like cubictcp_init, from Martin KaFai Lau. 6) Add a kconfig entry to provide an option to disallow unprivileged BPF by default, from Daniel Borkmann. 7) Fix libbpf compilation for older libelf when GELF_ST_VISIBILITY() macro is not available, from Arnaldo Carvalho de Melo. 8) Migrate test_tc_redirect to test_progs framework as prep work for upcoming skb_change_head() fix & selftest, from Jussi Maki. 9) Fix a libbpf segfault in add_dummy_ksym_var() if BTF is not present, from Ian Rogers. 10) Fix tx_only micro-benchmark in xdpsock BPF sample with proper frame size, from Magnus Karlsson. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-11	Merge tag 'mac80211-for-net-2021-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	David S. Miller
	Johannes Berg says: ==================== pull-request: mac80211 2021-05-11 So exciting times, for the first pull request for fixes I have a bunch of security things that have been under embargo for a while - see more details in the tag below, and at the patch posting message I linked to. I organized with Kalle to just have a single set of fixes for mac80211 and ath10k/ath11k, we don't know about any of the other vendors (the mac80211 + already released firmware is sufficient to fix iwlwifi.) Please pull and let me know if there's any problem. Several security issues in the 802.11 implementations were found by Mathy Vanhoef (New York University Abu Dhabi), and this contains the fixes developed for mac80211 and specifically Qualcomm drivers, I'm sending this together (as agreed with Kalle) to have just a single set of patches for now. We don't know about other vendors though. More details in the patch posting: https://lore.kernel.org/r/20210511180259.159598-1-johannes@sipsolutions.net ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-11	net: stmmac: Fix MAC WoL not working if PHY does not support WoL	Joakim Zhang
	Both get and set WoL check device_can_wakeup(); if the MAC supports PMT, the driver sets the device wakeup capability. After commit 1d8e5b0f3f2c ("net: stmmac: Support WOL with phy"), the device wakeup capability is overwritten in stmmac_init_phy() according to the PHY's WoL feature. If the PHY doesn't support WoL, the MAC loses its wakeup capability. To fix this issue, only overwrite the device wakeup capability when the MAC doesn't support PMT. With this change, the driver checks the MAC's WoL capability if the MAC supports PMT; otherwise it checks the PHY's WoL capability. Fixes: 1d8e5b0f3f2c ("net: stmmac: Support WOL with phy") Reviewed-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
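The wakeup-capability decision described in the stmmac fix above can be sketched as a small standalone function. This is a user-space illustration only: `struct demo_dev` and `demo_wakeup_capable()` are hypothetical names, not the driver's types or API, and the two boolean fields merely stand in for "MAC supports PMT" and "PHY supports WoL".

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant device state; not the driver's types. */
struct demo_dev {
	bool mac_has_pmt;      /* MAC supports PMT, its own wakeup mechanism */
	bool phy_supports_wol; /* PHY advertises at least one WoL mode */
};

/*
 * Decide whether the device should be marked wakeup capable.
 * A MAC with PMT keeps its capability regardless of the PHY;
 * only a MAC without PMT falls back to the PHY's WoL support.
 */
static bool demo_wakeup_capable(const struct demo_dev *dev)
{
	if (dev->mac_has_pmt)
		return true;
	return dev->phy_supports_wol;
}
```

The case the commit message calls out is `mac_has_pmt = true` with `phy_supports_wol = false`: under the rule sketched here the device remains wakeup capable.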
2021-05-11	f2fs: compress: fix to assign cc.cluster_idx correctly	Chao Yu
	In f2fs_destroy_compress_ctx(), after f2fs_destroy_compress_ctx(), cc.cluster_idx will be reset to NULL_CLUSTER, so f2fs_cluster_blocks() may check the wrong cluster's metadata; fix it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11	f2fs: compress: fix race condition of overwrite vs truncate	Chao Yu
	The pos_fsstress testcase reports a panic as below: ------------[ cut here ]------------ kernel BUG at fs/f2fs/compress.c:1082! invalid opcode: 0000 [#1] SMP PTI CPU: 4 PID: 2753477 Comm: kworker/u16:2 Tainted: G OE 5.12.0-rc1-custom #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Workqueue: writeback wb_workfn (flush-252:16) RIP: 0010:prepare_compress_overwrite+0x4c0/0x760 [f2fs] Call Trace: f2fs_prepare_compress_overwrite+0x5f/0x80 [f2fs] f2fs_write_cache_pages+0x468/0x8a0 [f2fs] f2fs_write_data_pages+0x2a4/0x2f0 [f2fs] do_writepages+0x38/0xc0 __writeback_single_inode+0x44/0x2a0 writeback_sb_inodes+0x223/0x4d0 __writeback_inodes_wb+0x56/0xf0 wb_writeback+0x1dd/0x290 wb_workfn+0x309/0x500 process_one_work+0x220/0x3c0 worker_thread+0x53/0x420 kthread+0x12f/0x150 ret_from_fork+0x22/0x30 The root cause is that truncate() may race with overwrite as shown below, so the one reference count left on the page cannot guarantee that the page stays attached to the mapping tree; after truncation, a later find_lock_page() may return a NULL pointer. - prepare_compress_overwrite - f2fs_pagecache_get_page - unlock_page - f2fs_setattr - truncate_setsize - truncate_inode_page - delete_from_page_cache - find_lock_page Fix this by avoiding referencing updated page. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11	f2fs: compress: fix to free compress page correctly	Chao Yu
	In error path of f2fs_write_compressed_pages(), it needs to call f2fs_compress_free_page() to release temporary page. Fixes: 5e6bbde95982 ("f2fs: introduce mempool for {,de}compress intermediate page allocation") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11	f2fs: support iflag change given the mask	Jaegeuk Kim
	In f2fs_fileattr_set(), if (!fa->flags_valid) mask &= FS_COMMON_FL; In this case, we can set supported flags by mask only instead of BUG_ON. /* Flags shared between flags/xflags */ (FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | FS_NODUMP_FL | FS_NOATIME_FL | FS_DAX_FL | FS_PROJINHERIT_FL) Fixes: 9b1bb01c8ae7 ("f2fs: convert to fileattr") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
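Setting flags "by mask only" can be illustrated with a short standalone sketch. The `DEMO_` flag names, their values, and `demo_apply_flags()` are assumptions made for this example and are not taken from f2fs; the point is only that bits outside the mask are preserved while bits inside it take the requested value.

```c
#include <stdint.h>

/* Illustrative flag bits; the DEMO_ names and values are assumptions for this sketch. */
#define DEMO_COMPR_FL     0x00000004u /* a flag outside the common set */
#define DEMO_SYNC_FL      0x00000008u
#define DEMO_IMMUTABLE_FL 0x00000010u
#define DEMO_APPEND_FL    0x00000020u
#define DEMO_NOATIME_FL   0x00000080u

/* Flags that may be changed when only the shared subset is valid. */
#define DEMO_COMMON_FL \
	(DEMO_SYNC_FL | DEMO_IMMUTABLE_FL | DEMO_APPEND_FL | DEMO_NOATIME_FL)

/* Replace only the bits selected by mask; keep every other bit of old_flags. */
static uint32_t demo_apply_flags(uint32_t old_flags, uint32_t new_flags,
				 uint32_t mask)
{
	return (old_flags & ~mask) | (new_flags & mask);
}
```

For instance, with `old_flags = DEMO_COMPR_FL | DEMO_SYNC_FL`, `new_flags = DEMO_NOATIME_FL` and `mask = DEMO_COMMON_FL`, the result keeps `DEMO_COMPR_FL` (outside the mask), clears `DEMO_SYNC_FL`, and sets `DEMO_NOATIME_FL`.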
2021-05-11	f2fs: avoid null pointer access when handling IPU error	Jaegeuk Kim
	Unable to handle kernel NULL pointer dereference at virtual address 000000000000001a pc : f2fs_inplace_write_data+0x144/0x208 lr : f2fs_inplace_write_data+0x134/0x208 Call trace: f2fs_inplace_write_data+0x144/0x208 f2fs_do_write_data_page+0x270/0x770 f2fs_write_single_data_page+0x47c/0x830 __f2fs_write_data_pages+0x444/0x98c f2fs_write_data_pages.llvm.16514453770497736882+0x2c/0x38 do_writepages+0x58/0x118 __writeback_single_inode+0x44/0x300 writeback_sb_inodes+0x4b8/0x9c8 wb_writeback+0x148/0x42c wb_do_writeback+0xc8/0x390 wb_workfn+0xb0/0x2f4 process_one_work+0x1fc/0x444 worker_thread+0x268/0x4b4 kthread+0x13c/0x158 ret_from_fork+0x10/0x18 Fixes: 955772787667 ("f2fs: drop inplace IO if fs status is abnormal") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11	bpf: Limit static tcp-cc functions in the .BTF_ids list to x86	Martin KaFai Lau
	During the discussion in [0], it was pointed out that static functions in ppc64 are prefixed with ".". For example, the 'readelf -s vmlinux.ppc': 89326: c000000001383280 24 NOTYPE LOCAL DEFAULT 31 cubictcp_init 89327: c000000000c97c50 168 FUNC LOCAL DEFAULT 2 .cubictcp_init The one with FUNC type is ".cubictcp_init" instead of "cubictcp_init". The "." seems to be done by arch/powerpc/include/asm/ppc_asm.h. As a result, pahole cannot generate the BTF for these tcp-cc kernel functions, because pahole only captures the FUNC type and "cubictcp_init" is not of that type. This then made the kernel compilation fail on ppc64. This behavior is only reported in ppc64 so far. I tried arm64, s390, and sparc64 and did not observe this "." prefix and NOTYPE behavior. Since the kfunc call is only supported in the x86_64 and x86_32 JIT, this patch limits those tcp-cc functions to x86 only to avoid unnecessary compilation issues in other ARCHs. In the future, we can examine if it is better to change all those functions from static to extern. [0] https://lore.kernel.org/bpf/4e051459-8532-7b61-c815-f3435767f8a0@kernel.org/ Fixes: e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Michal Suchánek <msuchanek@suse.de> Cc: Jiri Slaby <jslaby@suse.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/bpf/20210508005011.3863757-1-kafai@fb.com
2021-05-11	selftests/bpf: Rewrite test_tc_redirect.sh as prog_tests/tc_redirect.c	Jussi Maki
	As discussed in [0], this ports test_tc_redirect.sh to the test_progs framework and removes the old test. This makes it more in line with the rest of the tests and makes it possible to run this test case with vmtest.sh and under the bpf CI. The upcoming skb_change_head() helper fix in [0] depends on it and extends the test case to redirect a packet from an L3 device to veth. [0] https://lore.kernel.org/bpf/20210427135550.807355-1-joamaki@gmail.com Signed-off-by: Jussi Maki <joamaki@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210505085925.783985-1-joamaki@gmail.com
2021-05-11	libbpf: Provide GELF_ST_VISIBILITY() define for older libelf	Arnaldo Carvalho de Melo
	Where that macro isn't available. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/YJaspEh0qZr4LYOc@kernel.org
2021-05-11	bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers	Florent Revest
	The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one per-cpu buffer that they use to store temporary data (arguments to bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it by the end of their scope with bpf_bprintf_cleanup. If one of these helpers gets called within the scope of one of these helpers, for example: a first bpf program gets called, uses bpf_trace_printk which calls raw_spin_lock_irqsave which is traced by another bpf program that calls bpf_snprintf, then the second "get" fails. Essentially, these helpers are not re-entrant. They would return -EBUSY and print a warning message once. This patch triples the number of bprintf buffers to allow three levels of nesting. This is very similar to what was done for tracepoints in "9594dc3c7e7 bpf: fix nested bpf tracepoints with per-cpu data" Fixes: d9c9e4db186a ("bpf: Factorize bpf_trace_printk and bpf_seq_printf") Reported-by: syzbot+63122d0bc347f18c1884@syzkaller.appspotmail.com Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210511081054.2125874-1-revest@chromium.org
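The nesting scheme described above, a small stack of buffers indexed by a nesting counter, can be sketched in user space as follows. This is a simplified, single-threaded illustration: the `demo_` names and the buffer length are assumptions, there is only one buffer set rather than one per CPU, and nothing here handles preemption or interrupts as kernel code would have to.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_MAX_NEST 3   /* three levels of nesting, as in the commit message */
#define DEMO_BUF_LEN  512 /* arbitrary size chosen for this sketch */

/* One set of buffers; the kernel would keep one such set per CPU. */
static char demo_bufs[DEMO_MAX_NEST][DEMO_BUF_LEN];
static int demo_nest_level;

/* "Get": claim the buffer for the next nesting level, or fail with -EBUSY. */
static int demo_get_buf(char **buf)
{
	if (demo_nest_level >= DEMO_MAX_NEST)
		return -EBUSY;
	*buf = demo_bufs[demo_nest_level++];
	return 0;
}

/* "Put": release the most recently claimed buffer. */
static void demo_put_buf(void)
{
	assert(demo_nest_level > 0);
	demo_nest_level--;
}
```

With a single buffer, the second nested "get" already fails; with three, a fourth nested "get" is the first to return -EBUSY, and a matching "put" makes the slot available again.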
2021-05-11	bpf: Add deny list of btf ids check for tracing programs	Jiri Olsa
	The recursion check in __bpf_prog_enter and __bpf_prog_exit leaves some (not inlined) functions unprotected: In __bpf_prog_enter: - migrate_disable is called before prog->active is checked In __bpf_prog_exit: - migrate_enable,rcu_read_unlock_strict are called after prog->active is decreased When attaching a trampoline to them we get a panic like: traps: PANIC: double fault, error_code: 0x0 double fault: 0000 [#1] SMP PTI RIP: 0010:__bpf_prog_enter+0x4/0x50 ... Call Trace: <IRQ> bpf_trampoline_6442466513_0+0x18/0x1000 migrate_disable+0x5/0x50 __bpf_prog_enter+0x9/0x50 bpf_trampoline_6442466513_0+0x18/0x1000 migrate_disable+0x5/0x50 __bpf_prog_enter+0x9/0x50 bpf_trampoline_6442466513_0+0x18/0x1000 migrate_disable+0x5/0x50 __bpf_prog_enter+0x9/0x50 bpf_trampoline_6442466513_0+0x18/0x1000 migrate_disable+0x5/0x50 ... Fix this by adding a deny list of btf ids for tracing programs and checking the btf id during program verification. Add the above functions to this list. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210429114712.43783-1-jolsa@kernel.org
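The idea of a deny list consulted at verification time can be sketched as a plain membership check. Everything named here is hypothetical: the numeric ids are made up, `demo_id_denied()` and `demo_check_attach()` are not kernel functions, and the choice of -EINVAL as the error is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up ids standing in for the BTF ids of functions that must not be traced. */
static const uint32_t demo_deny_ids[] = { 101, 205, 317 };

/* Return true if id appears on the deny list. */
static bool demo_id_denied(uint32_t id)
{
	size_t n = sizeof(demo_deny_ids) / sizeof(demo_deny_ids[0]);

	for (size_t i = 0; i < n; i++)
		if (demo_deny_ids[i] == id)
			return true;
	return false;
}

/* Verification-time check: refuse to attach to a denied target. */
static int demo_check_attach(uint32_t target_id)
{
	return demo_id_denied(target_id) ? -EINVAL : 0;
}
```

Rejecting the attachment up front means the recursive trampoline entry shown in the call trace can never be set up for a listed function.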