summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-03-19pcpcntr: remove percpu_counter_sum_all()Dave Chinner
percpu_counter_sum_all() is now redundant as the race condition it was invented to handle is now dealt with by percpu_counter_sum() directly and all users of percpu_counter_sum_all() have been removed. Remove it. This effectively reverts the changes made in f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") except for the cpumask iteration that fixes percpu_counter_sum() made earlier in this series. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-03-19fork: remove use of percpu_counter_sum_allDave Chinner
This effectively reverts the change made in commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") as the race condition percpu_counter_sum_all() was invented to avoid is now handled directly in percpu_counter_sum() and nobody needs to care about summing racing with cpu unplug anymore. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-03-19pcpcntrs: fix dying cpu summation raceDave Chinner
In commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") a race condition between a cpu dying and percpu_counter_sum() iterating online CPUs was identified. The solution was to iterate all possible CPUs for summation via percpu_counter_sum_all(). We recently had a percpu_counter_sum() call in XFS trip over this same race condition and it fired a debug assert because the filesystem was unmounting and the counter *should* be zero just before we destroy it. That was reported here: https://lore.kernel.org/linux-kernel/20230314090649.326642-1-yebin@huaweicloud.com/ likely as a result of running generic/648 which exercises filesystems in the presence of CPU online/offline events. The solution to use percpu_counter_sum_all() is an awful one. We use percpu counters and percpu_counter_sum() for accurate and reliable threshold detection for space management, so a summation race condition during these operations can result in overcommit of available space and that may result in filesystem shutdowns. As percpu_counter_sum_all() iterates all possible CPUs rather than just those online or even those present, the mask can include CPUs that aren't even installed in the machine, or in the case of machines that can hot-plug CPU capable nodes, even have physical sockets present in the machine. Fundamentally, this race condition is caused by the CPU being offlined being removed from the cpu_online_mask before the notifier that cleans up per-cpu state is run. Hence percpu_counter_sum() will not sum the count for a cpu currently being taken offline, regardless of whether the notifier has run or not. This is the root cause of the bug. The percpu counter notifier iterates all the registered counters, locks the counter and moves the percpu count to the global sum. This is serialised against other operations that move the percpu counter to the global sum as well as percpu_counter_sum() operations that sum the percpu counts while holding the counter lock. Hence the notifier is safe to run concurrently with sum operations, and the only thing we actually need to care about is that percpu_counter_sum() iterates dying CPUs. That's trivial to do, and when there are no CPUs dying, it has no addition overhead except for a cpumask_or() operation. This change makes percpu_counter_sum() always do the right thing in the presence of CPU hot unplug events and makes percpu_counter_sum_all() unnecessary. This, in turn, means that filesystems like XFS, ext4, and btrfs don't have to work out when they should use percpu_counter_sum() vs percpu_counter_sum_all() in their space accounting algorithms Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-03-19cpumask: introduce for_each_cpu_orDave Chinner
Equivalent of for_each_cpu_and, except it ORs the two masks together so it iterates all the CPUs present in either mask. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-03-19Merge tag 'ras_urgent_for_v6.3_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS fix from Borislav Petkov: - Flush out logged errors immediately after MCA banks configuration changes over sysfs have been done instead of waiting until something else triggers the workqueue later - another error or the polling interval cycle is reached * tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Make sure logged MCEs are processed after sysfs update
2023-03-19xfs: test dir/attr hash when loading moduleDarrick J. Wong
Back in the 6.2-rc1 days, Eric Whitney reported a fstests regression in ext4 against generic/454. The cause of this test failure was the unfortunate combination of setting an xattr name containing UTF8 encoded emoji, an xattr hash function that accepted a char pointer with no explicit signedness, signed type extension of those chars to an int, and the 6.2 build tools maintainers deciding to mandate -funsigned-char across the board. As a result, the ondisk extended attribute structure written out by 6.1 and 6.2 were not the same. This discrepancy, in fact, had been noticeable if a filesystem with such an xattr were moved between any two architectures that don't employ the same signedness of a raw "char" declaration. The only reason anyone noticed is that x86 gcc defaults to signed, and no such -funsigned-char update was made to e2fsprogs, so e2fsck immediately started reporting data corruption. After a day and a half of discussing how to handle this use case (xattrs with bit 7 set anywhere in the name) without breaking existing users, Linus merged his own patch and didn't tell the maintainer. None of the ext4 developers realized this until AUTOSEL announced that the commit had been backported to stable. In the end, this problem could have been detected much earlier if there had been any useful tests of hash function(s) in use inside ext4 to make sure that they always produce the same outputs given the same inputs. The XFS dirent/xattr name hash takes a uint8_t*, so I don't think it's vulnerable to this problem. However, let's avoid all this drama by adding our own self test to check that the da hash produces the same outputs for a static pile of inputs on various platforms. This enables us to fix any breakage that may result in a controlled fashion. The buffer and test data are identical to the patches submitted to xfsprogs. Link: https://lore.kernel.org/linux-ext4/Y8bpkm3jA3bDm3eL@debian-BULLSEYE-live-builder-AMD64/ Link: https://lore.kernel.org/linux-xfs/ZBUKCRR7xvIqPrpX@destitution/T/#md38272cc684e2c0d61494435ccbb91f022e8dee4 Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-03-19xfs: add tracepoints for each of the externally visible allocatorsDarrick J. Wong
There are now five separate space allocator interfaces exposed to the rest of XFS for five different strategies to find space. Add tracepoints for each of them so that I can tell from a trace dump exactly which ones got called and what happened underneath them. Add a sixth so it's more obvious if an allocation actually happened. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-03-19xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_agsDarrick J. Wong
Callers of xfs_alloc_vextent_iterate_ags that pass in the TRYLOCK flag want us to perform a non-blocking scan of the AGs for free space. There are no ordering constraints for non-blocking AGF lock acquisition, so the scan can freely start over at AG 0 even when minimum_agno > 0. This manifests fairly reliably on xfs/294 on 6.3-rc2 with the parent pointer patchset applied and the realtime volume enabled. I observed the following sequence as part of an xfs_dir_createname call: 0. Fragment the free space, then allocate nearly all the free space in all AGs except AG 0. 1. Create a directory in AG 2 and let it grow for a while. 2. Try to allocate 2 blocks to expand the dirent part of a directory. The space will be allocated out of AG 0, but the allocation will not be contiguous. This (I think) activates the LOWMODE allocator. 3. The bmapi call decides to convert from extents to bmbt format and tries to allocate 1 block. This allocation request calls xfs_alloc_vextent_start_ag with the inode number, which starts the scan at AG 2. We ignore AG 0 (with all its free space) and instead scrape AG 2 and 3 for more space. We find one block, but this now kicks t_highest_agno to 3. 4. The createname call decides it needs to split the dabtree. It tries to allocate even more space with xfs_alloc_vextent_start_ag, but now we're constrained to AG 3, and we don't find the space. The createname returns ENOSPC and the filesystem shuts down. This change fixes the problem by making the trylock scan wrap around to AG 0 if it doesn't like the AGs that it finds. Since the current transaction itself holds AGF 0, the trylock of AGF 0 will succeed, and we take space from the AG that has plenty. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-03-19Merge tag 'perf_urgent_for_v6.3_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Check whether sibling events have been deactivated before adding them to groups - Update the proper event time tracking variable depending on the event type - Fix a memory overwrite issue due to using the wrong function argument when outputting perf events * tag 'perf_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix check before add_event_to_groups() in perf_group_detach() perf: fix perf_event_context->time perf/core: Fix perf_output_begin parameter is incorrectly invoked in perf_event_bpf_output
2023-03-19Merge tag 'x86_urgent_for_v6.3_rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "There's a little bit more 'movement' in there for my taste but it needs to happen and should make the code better after it. - Check cmdline_find_option()'s return value before further processing - Clear temporary storage in the resctrl code to prevent access to an unexistent MSR - Add a simple throttling mechanism to protect the hypervisor from potentially malicious SEV guests issuing requests in rapid succession. In order to not jeopardize the sanity of everyone involved in maintaining this code, the request issuing side has received a cleanup, split in more or less trivial, small and digestible pieces. Otherwise, the code was threatening to become an unmaintainable mess. Therefore, that cleanup is marked indirectly also for stable so that there's no differences between the upstream code and the stable variant when it comes down to backporting more there" * tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix use of uninitialized buffer in sme_enable() x86/resctrl: Clear staged_config[] before and after it is used virt/coco/sev-guest: Add throttling awareness virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case virt/coco/sev-guest: Do some code style cleanups virt/coco/sev-guest: Carve out the request issuing logic into a helper virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request() virt/coco/sev-guest: Simplify extended guest request handling virt/coco/sev-guest: Check SEV_SNP attribute at probe time
2023-03-19Merge tag 'ext4_for_linus_urgent' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fix from Ted Ts'o: "Fix a double unlock bug on an error path in ext4, found by smatch and syzkaller" * tag 'ext4_for_linus_urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix possible double unlock when moving a directory
2023-03-19ftrace: Set direct_ops storage-class-specifier to staticTom Rix
smatch reports this warning kernel/trace/ftrace.c:2594:19: warning: symbol 'direct_ops' was not declared. Should it be static? The variable direct_ops is only used in ftrace.c, so it should be static Link: https://lore.kernel.org/linux-trace-kernel/20230311135113.711824-1-trix@redhat.com Signed-off-by: Tom Rix <trix@redhat.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-19trace/hwlat: Do not start per-cpu thread if it is already runningTero Kristo
The hwlatd tracer will end up starting multiple per-cpu threads with the following script: #!/bin/sh cd /sys/kernel/debug/tracing echo 0 > tracing_on echo hwlat > current_tracer echo per-cpu > hwlat_detector/mode echo 100000 > hwlat_detector/width echo 200000 > hwlat_detector/window echo 1 > tracing_on To fix the issue, check if the hwlatd thread for the cpu is already running, before starting a new one. Along with the previous patch, this avoids running multiple instances of the same CPU thread on the system. Link: https://lore.kernel.org/all/20230302113654.2984709-1-tero.kristo@linux.intel.com/ Link: https://lkml.kernel.org/r/20230310100451.3948583-3-tero.kristo@linux.intel.com Cc: stable@vger.kernel.org Fixes: f46b16520a087 ("trace/hwlat: Implement the per-cpu mode") Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-19trace/hwlat: Do not wipe the contents of per-cpu thread dataTero Kristo
Do not wipe the contents of the per-cpu kthread data when starting the tracer, as this will completely forget about already running instances and can later start new additional per-cpu threads. Link: https://lore.kernel.org/all/20230302113654.2984709-1-tero.kristo@linux.intel.com/ Link: https://lkml.kernel.org/r/20230310100451.3948583-2-tero.kristo@linux.intel.com Cc: stable@vger.kernel.org Fixes: f46b16520a087 ("trace/hwlat: Implement the per-cpu mode") Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-19tracing/osnoise: set several trace_osnoise.c variables ↵Tom Rix
storage-class-specifier to static smatch reports several similar warnings kernel/trace/trace_osnoise.c:220:1: warning: symbol '__pcpu_scope_per_cpu_osnoise_var' was not declared. Should it be static? kernel/trace/trace_osnoise.c:243:1: warning: symbol '__pcpu_scope_per_cpu_timerlat_var' was not declared. Should it be static? kernel/trace/trace_osnoise.c:335:14: warning: symbol 'interface_lock' was not declared. Should it be static? kernel/trace/trace_osnoise.c:2242:5: warning: symbol 'timerlat_min_period' was not declared. Should it be static? kernel/trace/trace_osnoise.c:2243:5: warning: symbol 'timerlat_max_period' was not declared. Should it be static? These variables are only used in trace_osnoise.c, so it should be static Link: https://lore.kernel.org/linux-trace-kernel/20230309150414.4036764-1-trix@redhat.com Signed-off-by: Tom Rix <trix@redhat.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-19tracing: Fix wrong return in kprobe_event_gen_test.cAnton Gusev
Overwriting the error code with the deletion result may cause the function to return 0 despite encountering an error. Commit b111545d26c0 ("tracing: Remove the useless value assignment in test_create_synth_event()") solves a similar issue by returning the original error code, so this patch does the same. Found by Linux Verification Center (linuxtesting.org) with SVACE. Link: https://lore.kernel.org/linux-trace-kernel/20230131075818.5322-1-aagusev@ispras.ru Signed-off-by: Anton Gusev <aagusev@ispras.ru> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-19mlxsw: core_thermal: Fix fan speed in maximum cooling stateIdo Schimmel
The cooling levels array is supposed to prevent the system fans from being configured below a 20% duty cycle as otherwise some of them get stuck at 0 RPM. Due to an off-by-one error, the last element in the array was not initialized, causing it to be set to zero, which in turn lead to fans being configured with a 0% duty cycle in maximum cooling state. Since commit 332fdf951df8 ("mlxsw: thermal: Fix out-of-bounds memory accesses") the contents of the array are static. Therefore, instead of fixing the initialization of the array, simply remove it and adjust thermal_cooling_device_ops::set_cur_state() so that the configured duty cycle is never set below 20%. Before: # cat /sys/class/thermal/thermal_zone0/cdev0/type mlxsw_fan # echo 10 > /sys/class/thermal/thermal_zone0/cdev0/cur_state # cat /sys/class/hwmon/hwmon0/name mlxsw # cat /sys/class/hwmon/hwmon0/pwm1 0 After: # cat /sys/class/thermal/thermal_zone0/cdev0/type mlxsw_fan # echo 10 > /sys/class/thermal/thermal_zone0/cdev0/cur_state # cat /sys/class/hwmon/hwmon0/name mlxsw # cat /sys/class/hwmon/hwmon0/pwm1 255 This bug was uncovered when the thermal subsystem repeatedly tried to configure the cooling devices to their maximum state due to another issue [1]. This resulted in the fans being stuck at 0 RPM, which eventually lead to the system undergoing thermal shutdown. [1] https://lore.kernel.org/netdev/ZA3CFNhU4AbtsP4G@shredder/ Fixes: a421ce088ac8 ("mlxsw: core: Extend cooling device with cooling levels") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19Merge branch 'lan966x-tx-rx-improve'David S. Miller
Horatiu Vultur says: ==================== net: lan966x: Improve TX/RX of frames from/to CPU The first patch of this series improves the RX side. As it seems to be an expensive operation to read the RX timestamp for every frame, then read it only if it is required. This will give an improvement of ~70mbit on the RX side. The second patch stops using the packing library. This improves mostly the TX side as this library is used to set diffent bits in the IFH. If this library is replaced with a more simple/shorter implementation, this gives an improvement of more than 100mbit on TX side. All the measurements were done using iperf3. v1->v2: - update lan966x_ifh_set to set the bytes and not each bit individually ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: lan966x: Stop using packing libraryHoratiu Vultur
When a frame is injected from CPU, it is required to create an IFH(Inter frame header) which sits in front of the frame that is transmitted. This IFH, contains different fields like destination port, to bypass the analyzer, priotity, etc. Lan966x it is using packing library to set and get the fields of this IFH. But this seems to be an expensive operations. If this is changed with a simpler implementation, the RX will be improved with ~5Mbit while on the TX is a much bigger improvement as it is required to set more fields. Below are the numbers for TX. Before: [ 5] 0.00-10.02 sec 439 MBytes 367 Mbits/sec 0 sender After: [ 5] 0.00-10.00 sec 578 MBytes 485 Mbits/sec 0 sender Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: lan966x: Don't read RX timestamp if not neededHoratiu Vultur
Whenever a frame was received to the CPU, the HW is timestamping the frame. In the IFH(Inter Frame Header) it is found the nanosecond part of the timestamps the SW is required to read from HW the second part. But reading the second part it seems to be a expensive operations, so so change this such to read the second part only when rx filter is enabled. Doing this change gives the RX a performance boost of ~70mbit. before: [ 5] 0.00-10.01 sec 546 MBytes 457 Mbits/sec 0 sender now: [ 5] 0.00-10.01 sec 652 MBytes 530 Mbits/sec 0 sender Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: sfp: fix state loss when updating state_hw_maskRussell King (Oracle)
Andrew reports that the SFF modules on one of the ZII platforms do not indicate link up due to the SFP code believing that LOS indicating that there is no signal being received from the remote end, but in fact the LOS signal is showing that there is signal. What makes SFF modules different from SFPs is they typically have an inverted LOS, which uncovered this issue. When we read the hardware state, we mask it with state_hw_mask so we ignore anything we're not interested in. However, we don't re-read when state_hw_mask changes, leading to sfp->state being stale. Arrange for a software poll of the module state after we have parsed the EEPROM in sfp_sm_mod_probe() and updated state_*_mask. This will generate any necessary events for signal changes for the state machine as well as updating sfp->state. Reported-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Fixes: 8475c4b70b04 ("net: sfp: re-implement soft state polling setup") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net/packet: remove po->xmitEric Dumazet
Use PACKET_SOCK_QDISC_BYPASS atomic bit instead of a pointer. This removes one indirect call in fast path, and READ_ONCE()/WRITE_ONCE() annotations as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Willem de Bruijn <willemb@google.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: stmmac: Fix for mismatched host/device DMA address widthJochen Henneberg
Currently DMA address width is either read from a RO device register or force set from the platform data. This breaks DMA when the host DMA address width is <=32it but the device is >32bit. Right now the driver may decide to use a 2nd DMA descriptor for another buffer (happens in case of TSO xmit) assuming that 32bit addressing is used due to platform configuration but the device will still use both descriptor addresses as one address. This can be observed with the Intel EHL platform driver that sets 32bit for addr64 but the MAC reports 40bit. The TX queue gets stuck in case of TCP with iptables NAT configuration on TSO packets. The logic should be like this: Whatever we do on the host side (memory allocation GFP flags) should happen with the host DMA width, whenever we decide how to set addresses on the device registers we must use the device DMA address width. This patch renames the platform address width field from addr64 (term used in device datasheet) to host_addr and uses this value exclusively for host side operations while all chip operations consider the device DMA width as read from the device register. Fixes: 7cfc4486e7ea ("stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing") Signed-off-by: Jochen Henneberg <jh@henneberg-systemdesign.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: macb: Reset TX when TX halt times outHarini Katakam
Reset TX when halt times out i.e. disable TX, clean up TX BDs, interrupts (already done) and enable TX. This addresses the issue observed when iperf is run at 10Mps Half duplex where, after multiple collisions and retries, TX halts. Signed-off-by: Harini Katakam <harini.katakam@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19ixgb: Remove ixgb driverTony Nguyen
There are likely no users of this driver as the hardware has been discontinued since 2010. Remove the driver and all references to it in documentation. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19Merge branch 'mdiobus-module-owner'David S. Miller
Florian Fainelli says: ==================== ACPI/DT mdiobus module owner fixes This patch series fixes wrong mdiobus module ownership for MDIO buses registered from DT or ACPI. Thanks Maxime for providing the first patch and making me see that ACPI also had the same issue. Changes in v2: - fixed missing kdoc in the first patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: mdio: fix owner field for mdio buses registered using ACPIFlorian Fainelli
Bus ownership is wrong when using acpi_mdiobus_register() to register an mdio bus. That function is not inline, so when it calls mdiobus_register() the wrong THIS_MODULE value is captured. CC: Maxime Bizon <mbizon@freebox.fr> Fixes: 803ca24d2f92 ("net: mdio: Add ACPI support code for mdio") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: mdio: fix owner field for mdio buses registered using device-treeMaxime Bizon
Bus ownership is wrong when using of_mdiobus_register() to register an mdio bus. That function is not inline, so when it calls mdiobus_register() the wrong THIS_MODULE value is captured. Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Fixes: 90eff9096c01 ("net: phy: Allow splitting MDIO bus/device support from PHYs") [florian: fix kdoc, added Fixes tag] Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: phy: Ensure state transitions are processed from phy_stop()Florian Fainelli
In the phy_disconnect() -> phy_stop() path, we will be forcibly setting the PHY state machine to PHY_HALTED. This invalidates the old_state != phydev->state condition in phy_state_machine() such that we will neither display the state change for debugging, nor will we invoke the link_change_notify() callback. Factor the code by introducing phy_process_state_change(), and ensure that we process the state change from phy_stop() as well. Fixes: 5c5f626bcace ("net: phy: improve handling link_change_notify callback") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-03-16 (igb, igbvf, igc) This series contains updates to igb, igbvf, and igc drivers. Lin Ma removes rtnl_lock() when disabling SRIOV on remove which was causing deadlock on igb. Akihiko Odaki delays enabling of SRIOV on igb to prevent early messages that could get ignored and clears MAC address when PF returns nack on reset; indicating no MAC address was assigned for igbvf. Gaosheng Cui frees IRQs in error path for igbvf. Akashi Takahiro fixes logic on checking TAPRIO gate support for igc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19xirc2ps_cs: Fix use after free bug in xirc2ps_detachZheng Wang
In xirc2ps_probe, the local->tx_timeout_task was bounded with xirc2ps_tx_timeout_task. When timeout occurs, it will call xirc_tx_timeout->schedule_work to start the work. When we call xirc2ps_detach to remove the driver, there may be a sequence as follows: Stop responding to timeout tasks and complete scheduled tasks before cleanup in xirc2ps_detach, which will fix the problem. CPU0 CPU1 |xirc2ps_tx_timeout_task xirc2ps_detach | free_netdev | kfree(dev); | | | do_reset | //use dev Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Zheng Wang <zyytlz.wz@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: phy: at803x: Replace of_gpio.h with what indeed is usedAndy Shevchenko
of_gpio.h in this driver is solely used as a proxy to other headers. This is incorrect usage of the of_gpio.h. Replace it .h with what indeed is used in the code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: smc91x: Replace of_gpio.h with what indeed is usedAndy Shevchenko
of_gpio.h in this driver is solely used as a proxy to other headers. This is incorrect usage of the of_gpio.h. Replace it .h with what indeed is used in the code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19RDMA/irdma: Add ipv4 check to irdma_find_listener()Tatyana Nikolova
Add ipv4 check to irdma_find_listener(). Otherwise the function incorrectly finds and returns a listener with a different addr family for the zero IP addr, if a listener with a zero IP addr and the same port as the one searched for has already been created. Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-5-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Increase iWARP CM default rexmit countMustafa Ismail
When running perftest with large number of connections in iWARP mode, the passive side could be slow to respond. Increase the rexmit counter default to allow scaling connections. Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Fix memory leak of PBLE objectsMustafa Ismail
On rmmod of irdma, the PBLE object memory is not being freed. PBLE object memory are not statically pre-allocated at function initialization time unlike other HMC objects. PBLEs objects and the Segment Descriptors (SD) for it can be dynamically allocated during scale up and SD's remain allocated till function deinitialization. Fix this leak by adding IRDMA_HMC_IW_PBLE to the iw_hmc_obj_types[] table and skip pbles in irdma_create_hmc_obj but not in irdma_del_hmc_objects(). Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19RDMA/irdma: Do not generate SW completions for NOPsMustafa Ismail
Currently, artificial SW completions are generated for NOP wqes which can generate unexpected completions with wr_id = 0. Skip the generation of artificial completions for NOPs. Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19qed/qed_sriov: guard against NULL derefs from qed_iov_get_vf_infoDaniil Tatianin
We have to make sure that the info returned by the helper is valid before using it. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: f990c82c385b ("qed*: Add support for ndo_set_vf_trust") Fixes: 733def6a04bf ("qed*: IOV link control") Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: macb: Set MDIO clock divisor for pclk higher than 160MHzBartosz Wawrzyniak
Currently macb sets clock divisor for pclk up to 160 MHz. Function gem_mdc_clk_div was updated to enable divisor for higher values of pclk. Signed-off-by: Bartosz Wawrzyniak <bwawrzyn@cisco.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19ALSA: hda/realtek: Add quirks for some Clevo laptopsTim Crawford
Add the audio quirk for some of Clevo's latest RPL laptops: - NP50RNJS (ALC256) - NP70SNE (ALC256) - PD50SNE (ALC1220) - PE60RNE (ALC1220) Co-authored-by: Jeremy Soller <jeremy@system76.com> Signed-off-by: Tim Crawford <tcrawford@system76.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230317141825.11807-1-tcrawford@system76.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-03-18fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref()Eric Biggers
It is a bug for fscrypt_put_master_key_activeref() to see a NULL keyring. But it used to be possible due to the bug, now fixed, where fscrypt_destroy_keyring() was called before security_sb_delete(). To be consistent with how fscrypt_destroy_keyring() uses WARN_ON for the same issue, WARN and leak the fscrypt_master_key if the keyring is NULL instead of dereferencing the NULL pointer. This is a robustness improvement, not a fix. Link: https://lore.kernel.org/r/20230313221231.272498-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-03-18fscrypt: improve fscrypt_destroy_keyring() documentationEric Biggers
Document that fscrypt_destroy_keyring() must be called after all potentially-encrypted inodes have been evicted. Link: https://lore.kernel.org/r/20230313221231.272498-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2023-03-18Merge tag 'fbdev-for-6.3-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes from Helge Deller: "The majority of lines changed is due to a code style cleanup in the pnmtologo helper program. Arnd removed the omap1 osk driver and the SIS fb driver is now orphaned. Other than that it's the usual bunch of small fixes and cleanups, e.g. prevent possible divide-by-zero in various fb drivers if the pixclock is zero and various conversions to devm_platform*() and of_property*() functions: - Drop omap1 osk driver - Various potential divide by zero pixclock fixes - Add pixelclock and fb_check_var() to stifb - Code style cleanups and indenting fixes" * tag 'fbdev-for-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: Use of_property_present() for testing DT property presence fbdev: au1200fb: Fix potential divide by zero fbdev: lxfb: Fix potential divide by zero fbdev: intelfb: Fix potential divide by zero fbdev: nvidia: Fix potential divide by zero fbdev: stifb: Provide valid pixelclock and add fb_check_var() checks fbdev: omapfb: remove omap1 osk driver fbdev: xilinxfb: Use devm_platform_get_and_ioremap_resource() fbdev: wm8505fb: Use devm_platform_ioremap_resource() fbdev: pxa3xx-gcu: Use devm_platform_get_and_ioremap_resource() fbdev: Use of_property_read_bool() for boolean properties fbdev: clps711x-fb: Use devm_platform_get_and_ioremap_resource() fbdev: tgafb: Fix potential divide by zero MAINTAINERS: orphan SIS FRAMEBUFFER DRIVER fbdev: omapfb: cleanup inconsistent indentation drivers: video: logo: add SPDX comment, remove GPL notice in pnmtologo.c drivers: video: logo: fix code style issues in pnmtologo.c
2023-03-18Merge tag 'kbuild-fixes-v6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Exclude kallsyms_seqs_of_names from kallsyms to fix build error - Fix 'make kernelrelease' for external module builds - Get the Debian source package compilable again - Fix the wrong uname when Debian packages are built with the KDEB_PKGVERSION option - Fix superfluous CROSS_COMPILE when building Debian packages - Fix RPM package build error when KCONFIG_CONFIG is set - Use 'git archive' for creating source tarballs - Remove the scripts/list-gitignored tool * tag 'kbuild-fixes-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: use git-archive for source package creation kbuild: rpm-pkg: move source components to rpmbuild/SOURCES kbuild: deb-pkg: use dh_listpackages to know enabled packages kbuild: deb-pkg: split image and debug objects staging out into functions kbuild: deb-pkg: set CROSS_COMPILE only when undefined kbuild: deb-pkg: do not take KERNELRELEASE from the source version kbuild: deb-pkg: make debian source package working again Makefile: Make kernelrelease target work with M= kconfig: Update config changed flag before calling callback kallsyms: add kallsyms_seqs_of_names to list of special symbols
2023-03-18Merge tag 'hwmon-for-v6.3-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - ltc2992, adm1266: Set missing can_sleep flag - tmp512/tmp513: Drop of_match_ptr for ID table to fix build with !CONFIG_OF - ucd90320: Fix back-to-back access problem - ina3221: Fix bad error return from probe function - xgene: Fix use-after-free bug in remove function - adt7475: Fix hysteresis register bit masks, and fix association of 'smoothing' attributes * tag 'hwmon-for-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (ltc2992) Set `can_sleep` flag for GPIO chip hwmon: (adm1266) Set `can_sleep` flag for GPIO chip hwmon: tmp512: drop of_match_ptr for ID table hwmon: (ucd90320) Add minimum delay between bus accesses hwmon: (ina3221) return prober error code hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race condition hwmon: (adt7475) Fix masking of hysteresis registers hwmon: (adt7475) Display smoothing attributes in correct order
2023-03-18Merge tag 'ata-6.3-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: - Two fixes from Ondrej for the pata_parport driver to address an issue with error handling during drive connection and to fix memory leaks in case of errors during initialization and when disconnecting a device. * tag 'ata-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: pata_parport: fix memory leaks ata: pata_parport: fix parport release without claim
2023-03-18media: m5mols: fix off-by-one loop termination errorLinus Torvalds
The __find_restype() function loops over the m5mols_default_ffmt[] array, and the termination condition ends up being wrong: instead of stopping when the iterator becomes the size of the array it traverses, it stops after it has already overshot the array. Now, in practice this doesn't likely matter, because the code will always find the entry it looks for, and will thus return early and never hit that last extra iteration. But it turns out that clang will unroll the loop fully, because it has only two iterations (well, three due to the off-by-one bug), and then clang will end up just giving up in the middle of the loop unrolling when it notices that the code walks past the end of the array. And that made 'objtool' very unhappy indeed, because the generated code just falls off the edge of the universe, and ends up falling through to the next function, causing this warning: drivers/media/i2c/m5mols/m5mols.o: warning: objtool: m5mols_set_fmt() falls through to next function m5mols_get_frame_desc() Fix the loop ending condition. Reported-by: Jens Axboe <axboe@kernel.dk> Analyzed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Analyzed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/linux-block/CAHk-=wgTSdKYbmB1JYM5vmHMcD9J9UZr0mn7BOYM_LudrP+Xvw@mail.gmail.com/ Fixes: bc125106f8af ("[media] Add support for M-5MOLS 8 Mega Pixel camera ISP") Cc: HeungJun, Kim <riverful.kim@samsung.com> Cc: Sylwester Nawrocki <s.nawrocki@samsung.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-18bpf, docs: Libbpf overview documentationSreevani Sreejith
This patch documents overview of libbpf, including its features for developing BPF programs. Signed-off-by: Sreevani Sreejith <ssreevani@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230315195405.2051559-1-ssreevani@meta.com
2023-03-18iio: adc: ti-ads7950: Set `can_sleep` flag for GPIO chipLars-Peter Clausen
The ads7950 uses a mutex as well as SPI transfers in its GPIO callbacks. This means these callbacks can sleep and the `can_sleep` flag should be set. Having the flag set will make sure that warnings are generated when calling any of the callbacks from a potentially non-sleeping context. Fixes: c97dce792dc8 ("iio: adc: ti-ads7950: add GPIO support") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: David Lechner <david@lechnology.com> Link: https://lore.kernel.org/r/20230312210933.2275376-1-lars@metafoo.de Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2023-03-18iio: adc: palmas_gpadc: fix NULL dereference on rmmodPatrik Dahlström
Calling dev_to_iio_dev() on a platform device pointer is undefined and will make adc NULL. Signed-off-by: Patrik Dahlström <risca@dalakolonin.se> Link: https://lore.kernel.org/r/20230313205029.1881745-1-risca@dalakolonin.se Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>