AgeCommit message (Collapse)Author
2022-10-28efi: efivars: Fix variable writes with unsupported query_variable_store()Ard Biesheuvel
Commit 8a254d90a775 ("efi: efivars: Fix variable writes without query_variable_store()") addressed an issue that was introduced during the EFI variable store refactor, where alternative implementations of the efivars layer that lacked query_variable_store() would no longer work. Unfortunately, there is another case to consider here, which was missed: if the efivars layer is backed by the EFI runtime services as usual, but the EFI implementation predates the introduction of QueryVariableInfo(), we will return EFI_UNSUPPORTED, and this is no longer being dealt with correctly. So let's fix this, and while at it, clean up the code a bit, by merging the check_var_size() routines as well as their callers. Cc: <stable@vger.kernel.org> # v6.0 Fixes: bbc6d2c6ef22 ("efi: vars: Switch to new wrapper layer") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Aditya Garg <gargaditya08@live.com>
2022-10-28RDMA/qedr: clean up work queue on failure in qedr_alloc_resources()Dan Carpenter
Add a check for create_singlethread_workqueue() failing, and also destroy the work queue on failure paths. Fixes: e411e0587e0d ("RDMA/qedr: Add iWARP connection management functions") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/Y1gBkDucQhhWj5YM@kili Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28RDMA/core: Fix null-ptr-deref in ib_core_cleanup()Chen Zhongjin
KASAN reported a null-ptr-deref error: KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f] CPU: 1 PID: 379 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:destroy_workqueue+0x2f/0x740 RSP: 0018:ffff888016137df8 EFLAGS: 00000202 ... Call Trace: ib_core_cleanup+0xa/0xa1 [ib_core] __do_sys_delete_module.constprop.0+0x34f/0x5b0 do_syscall_64+0x3a/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fa1a0d221b7 ... This happens because the failure of roce_gid_mgmt_init() is ignored: ib_core_init() roce_gid_mgmt_init() gid_cache_wq = alloc_ordered_workqueue # fail ... ib_core_cleanup() roce_gid_mgmt_cleanup() destroy_workqueue(gid_cache_wq) # destroy an unallocated wq Fix this by catching the failure of roce_gid_mgmt_init() in ib_core_init(). Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221025024146.109137-1-chenzhongjin@huawei.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
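The failure mode described in this message can be illustrated with a small userspace sketch. It is not the kernel code: the struct, the `simulate_alloc_failure` knob and every `*_sketch` name are hypothetical stand-ins, and only `gid_cache_wq` borrows its name from the commit message. The point is that the top-level init propagates the sub-init error, so the cleanup path never runs against an unallocated object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel workqueue. */
struct workqueue_sketch { int unused; };

static struct workqueue_sketch *gid_cache_wq; /* name taken from the commit message */
static int simulate_alloc_failure;            /* test knob, not a kernel facility */
static int core_ready;

static int gid_mgmt_init_sketch(void)
{
	if (simulate_alloc_failure)
		return -1; /* stands in for -ENOMEM */
	gid_cache_wq = malloc(sizeof(*gid_cache_wq));
	return gid_cache_wq ? 0 : -1;
}

static void gid_mgmt_cleanup_sketch(void)
{
	free(gid_cache_wq);
	gid_cache_wq = NULL;
}

/* Propagate the sub-init failure instead of ignoring it. */
static int core_init_sketch(void)
{
	int ret = gid_mgmt_init_sketch();

	if (ret)
		return ret; /* load fails, so cleanup is never called */
	core_ready = 1;
	return 0;
}

static void core_cleanup_sketch(void)
{
	assert(core_ready); /* cleanup is only legal after a successful init */
	gid_mgmt_cleanup_sketch();
	core_ready = 0;
}
```

If `core_init_sketch()` returned 0 despite the allocation failure, a later `core_cleanup_sketch()` would operate on a queue that was never created, which is the pattern the KASAN report points at.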
2022-10-28ACPI: x86: Add another system to quirk list for forcing StorageD3EnableMario Limonciello
Commit 018d6711c26e4 ("ACPI: x86: Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable") introduced a quirk to allow a system with ambiguous use of _ADR 0 to force StorageD3Enable. Julius Brockmann reports that the Inspiron 16 5625 suffers from the same symptoms. Add this other system to the list as well. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216440 Reported-and-tested-by: Julius Brockmann <mail@juliusbrockmann.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-28MAINTAINERS: Change myself to a maintainerMatti Vaittinen
After some off-list discussion with Marek Vasut and Geert Uytterhoeven and finally a kx022a driver related discussion with Joe Perches https://lore.kernel.org/lkml/92c3f72e60bc99bf4a21da259b4d78c1bdca447d.camel@perches.com/ it seems that my status as a reviewer has been wrong. I do look after the ROHM/Kionix drivers I've authored and currently I am also paid to do so, as is reflected by the 'S: Supported'. According to Joe, the reviewer entry in MAINTAINERS does not indicate such a level of support, and having a reviewer supporting an IC is a contradiction. Switch the undersigned from a reviewer to a maintainer for the IC drivers I am taking care of. Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2022-10-28Merge branches 'acpi-resource', 'acpi-pcc' and 'devprop'Rafael J. Wysocki
Merge an IRQ override quirk, an ACPI PCC code fix and a device properties documentation update for 6.1-rc3: - Make the ACPI device resources code skip IRQ override on Asus Vivobook S5602ZA (Tamim Khan). - Fix a possible integer overflow during multiplication in the ACPI PCC code (Manank Patel). - Fix the documentation of the *_match_string() family of functions to properly cover the return value (Andy Shevchenko). * acpi-resource: ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA * acpi-pcc: ACPI: PCC: Fix unintentional integer overflow * devprop: device property: Fix documentation for *_match_string() APIs
2022-10-28Merge branches 'pm-sleep', 'pm-domains' and 'pm-tools'Rafael J. Wysocki
Merge a hibernation-related fix, a generic power domains code fix and a pm-graph update for 6.1-rc1: - Allow hybrid sleep to use suspend-to-idle as a system suspend method if it is the current suspend method of choice (Mario Limonciello). - Fix handling of unavailable/disabled idle states in the generic power domains code (Sudeep Holla). - Update the pm-graph suite of utilities to version 5.10 which is fixes-mostly and does not add any new features (Todd Brandt). * pm-sleep: PM: hibernate: Allow hybrid sleep to work with s2idle * pm-domains: PM: domains: Fix handling of unavailable/disabled idle states * pm-tools: pm-graph v5.10
2022-10-28blk-mq: Properly init requests from blk_mq_alloc_request_hctx()John Garry
Function blk_mq_alloc_request_hctx() is missing zeroing/init of rq->bio, biotail, __sector, and __data_len members, which blk_mq_alloc_request() has, so duplicate what we do in blk_mq_alloc_request(). Fixes: 1f5bd336b9150 ("blk-mq: add blk_mq_alloc_request_hctx") Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/1666780513-121650-1-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-28wifi: ath11k: fix monitor vdev creation with firmware recoveryNagarajan Maran
During firmware recovery, the monitor interface is not getting created in the driver and firmware since the respective flags are not updated properly. So after firmware recovery is successful, when monitor interface is brought down manually, firmware assertion is observed, since we are trying to bring down the interface which is not yet created in the firmware. Fix this by updating the monitor flags properly per phy#, during firmware recovery. Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1 Signed-off-by: Nagarajan Maran <quic_nmaran@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20221014155054.11471-1-quic_nmaran@quicinc.com
2022-10-28ALSA: hda/realtek: Add quirk for ASUS Zenbook using CS35L41Stefan Binding
This Asus Zenbook laptop uses a Realtek HDA codec combined with 2xCS35L41 Amplifiers using I2C with Internal Boost. Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20221028102742.2588687-1-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-10-28phy: qcom-qmp-combo: fix NULL-deref on runtime resumeJohan Hovold
Commit fc64623637da ("phy: qcom-qmp-combo,usb: add support for separate PCS_USB region") started treating the PCS_USB registers as potentially separate from the PCS registers but used the wrong base when no PCS_USB offset has been provided. Fix the PCS_USB base used at runtime resume to prevent dereferencing a NULL pointer on platforms that do not provide a PCS_USB offset (e.g. SC7180). Fixes: fc64623637da ("phy: qcom-qmp-combo,usb: add support for separate PCS_USB region") Cc: stable@vger.kernel.org # 5.20 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20221026162116.26462-1-johan+linaro@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-10-28fuse: add file_modified() to fallocateMiklos Szeredi
Add the missing file_modified() call to fuse_file_fallocate(). Without this, fallocate on fuse failed to clear privileges. Fixes: 05ba1f082300 ("fuse: add FALLOCATE operation") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-10-28MAINTAINERS: Update HiSilicon SFC Driver maintainerJay Fang
Add Jay Fang as the maintainer of the HiSilicon SFC Driver, replacing John Garry. Signed-off-by: Jay Fang <f.fangjian@huawei.com> Acked-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20221028023739.4113998-1-f.fangjian@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-10-28soundwire: qcom: check for outstanding writes before doing a readSrinivas Kandagatla
Reading will increase the fifo count, so check for outstanding cmd wrt. write fifo depth to avoid overflow as read will also increase write fifo cnt. Fixes: a661308c34de ("soundwire: qcom: wait for fifo space to be available before read/write") Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20221026110210.6575-3-srinivas.kandagatla@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-10-28soundwire: qcom: reinit broadcast completionSrinivas Kandagatla
For some reason we never reinit the broadcast completion, so there is a danger that broadcast commands could be treated as completed by the driver based on a previous completion status. Fix this by reinitializing the completion before sending a broadcast command. Fixes: ddea6cf7b619 ("soundwire: qcom: update register read/write routine") Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20221026110210.6575-2-srinivas.kandagatla@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-10-28soundwire: intel: Initialize clock stop timeoutSjoerd Simons
The bus->clk_stop_timeout member is only initialized to a non-zero value during the codec driver probe. This can lead to corner cases where this value remains pegged at zero when the bus suspends, which results in an endless loop in sdw_bus_wait_for_clk_prep_deprep(). Corner cases include configurations with no codecs described in the firmware, or delays in probing codec drivers. Initializing the default timeout to the smallest non-zero value avoids this problem and allows for the existing logic to be preserved: the bus->clk_stop_timeout is set as the maximum required by all codecs connected on the bus. Fixes: 1f2dcf3a154ac ("soundwire: intel: set dev_num_ida_min") Signed-off-by: Sjoerd Simons <sjoerd@collabora.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Chao Song <chao.song@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221020015624.1703950-1-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
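The "default plus maximum over all codecs" rule from this message can be sketched in plain C. This is an illustration only: the constant, its value of 1 and the function name are assumptions, and the real driver keeps this state in its bus structure rather than computing it from an array.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 1 stands for "the smallest non-zero value". */
#define DEFAULT_CLK_STOP_TIMEOUT_SKETCH 1u

/*
 * Start from a non-zero default, then raise it to the largest timeout
 * any codec asks for. With no codecs the result is still non-zero, so a
 * wait loop bounded by this value cannot start from zero.
 */
static unsigned int bus_clk_stop_timeout_sketch(const unsigned int *codec_timeouts,
						size_t n)
{
	unsigned int timeout = DEFAULT_CLK_STOP_TIMEOUT_SKETCH;

	assert(codec_timeouts != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (codec_timeouts[i] > timeout)
			timeout = codec_timeouts[i];
	return timeout;
}
```

Because the default is the minimum possible non-zero value, any codec that reports a real requirement still wins the maximum, which is how the existing logic is preserved.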
2022-10-28KVM: x86/xen: Fix eventfd error handling in kvm_xen_eventfd_assign()Eiichi Tsukata
Should not call eventfd_ctx_put() in case of error. Fixes: 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests") Reported-by: syzbot+6f0c896c5a9449a10ded@syzkaller.appspotmail.com Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com> Message-Id: <20221028092631.117438-1-eiichi.tsukata@nutanix.com> [Introduce new goto target instead. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-28capabilities: fix potential memleak on error path from vfs_getxattr_alloc()Gaosheng Cui
In cap_inode_getsecurity(), we use vfs_getxattr_alloc() to allocate memory for tmpbuf. If the allocation of tmpbuf has completed but the call to handler->get(...) fails, there will be a memleak in the logic below: |-- ret = (int)vfs_getxattr_alloc(mnt_userns, ...) | /* ^^^ alloc for tmpbuf */ |-- value = krealloc(*xattr_value, error + 1, flags) | /* ^^^ alloc memory */ |-- error = handler->get(handler, ...) | /* error! */ |-- *xattr_value = value | /* xattr_value is &tmpbuf (memory leak!) */ So we will try to free(tmpbuf) after vfs_getxattr_alloc() fails to fix it. Cc: stable@vger.kernel.org Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Acked-by: Serge Hallyn <serge@hallyn.com> [PM: subject line and backtrace tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
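The leak pattern in this message is a helper that hands an allocation back through an out-parameter and can still return an error. A minimal userspace sketch follows, with hypothetical names throughout (`alloc_value_sketch`, `get_security_sketch`, the `live_buffers` counter); it models the shape of the bug and the fix, not the actual VFS code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_buffers; /* bookkeeping for the sketch only */

/*
 * Hypothetical helper: stores the buffer in *out first, then may fail.
 * On failure the caller's pointer is already non-NULL.
 */
static int alloc_value_sketch(char **out, int fail_after_alloc)
{
	char *buf = malloc(16);

	if (!buf)
		return -1;
	live_buffers++;
	*out = buf;
	if (fail_after_alloc)
		return -2; /* stands in for the handler->get() error */
	memcpy(buf, "cap", 4);
	return 3; /* length of the value */
}

static int get_security_sketch(int fail_after_alloc)
{
	char *tmpbuf = NULL;
	int ret = alloc_value_sketch(&tmpbuf, fail_after_alloc);

	assert(ret != -1); /* the sketch does not model allocation failure */
	if (ret < 0) {
		/* The fix: free tmpbuf even though the helper reported an error. */
		if (tmpbuf) {
			free(tmpbuf);
			live_buffers--;
		}
		return ret;
	}
	free(tmpbuf);
	live_buffers--;
	return ret;
}
```

Without the `free()` in the error branch, the `fail_after_alloc` case would return with `live_buffers` still at 1.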
2022-10-28net: emaclite: update reset_lock member documentationRadhey Shyam Pandey
Instead of a generic description, mention what reset_lock actually protects, i.e. it is a lock to serialize xmit and tx_timeout execution. Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28nfc: s3fwrn5: use devm_clk_get_optional_enabled() helperDmitry Torokhov
Because we enable the clock immediately after acquiring it in probe, we can combine the 2 operations and use devm_clk_get_optional_enabled() helper. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28Merge branch 'txgbe'David S. Miller
Jiawen Wu says: ==================== net: WangXun txgbe ethernet driver This patch series adds support for WangXun 10 gigabit NIC, to initialize hardware, set mac address, and register netdev. Change log: v6: address comments: Jakub Kicinski: check with scripts/kernel-doc v5: address comments: Jakub Kicinski: clean build with W=1 C=1 v4: address comments: Andrew Lunn: https://lore.kernel.org/all/YzXROBtztWopeeaA@lunn.ch/ v3: address comments: Andrew Lunn: remove hw function ops, reorder functions, use BIT(n) for register bit offset, move the same code of txgbe and ngbe to libwx v2: address comments: Andrew Lunn: https://lore.kernel.org/netdev/YvRhld5rD%2FxgITEg@lunn.ch/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: txgbe: Set MAC address and register netdevJiawen Wu
Add MAC address related operations, and register netdev. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: txgbe: Reset hardwareJiawen Wu
Reset and initialize the hardware by configuring the MAC layer. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: txgbe: Store PCI infoJiawen Wu
Get PCI config space info, set LAN id and check flash status. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28KVM: x86: smm: number of GPRs in the SMRAM image depends on the image formatMaxim Levitsky
On a 64-bit host, if the guest doesn't have X86_FEATURE_LM, KVM will access 16 GPRs in the 32-bit SMRAM image, causing an out-of-bounds RAM access. On a 32-bit host, rsm_load_state_64/enter_smm_save_state_64 is compiled out, thus the access overflow can't happen. Fixes: b443183a25ab61 ("KVM: x86: Reduce the number of emulator GPRs to '8' for 32-bit KVM") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221025124741.228045-15-mlevitsk@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-28KVM: x86: emulator: update the emulation mode after CR0 writeMaxim Levitsky
Update the emulation mode when handling writes to CR0, because toggling CR0.PE switches between Real and Protected Mode, and toggling CR0.PG when EFER.LME=1 switches between Long and Protected Mode. This is likely a benign bug because there is no writeback of state, other than the RIP increment, and when toggling CR0.PE, the CPU has to execute code from a very low memory address. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20221025124741.228045-14-mlevitsk@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
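The two transitions named in this message (CR0.PE for Real versus Protected Mode, CR0.PG with EFER.LME=1 for Long versus Protected Mode) can be written as a small decision function. This is a deliberately simplified sketch with hypothetical names; it ignores CS.L/CS.D, virtual-8086 mode and everything else the real emulator has to consider.

```c
#include <assert.h>
#include <stdbool.h>

enum emul_mode_sketch { MODE_REAL_SKETCH, MODE_PROT_SKETCH, MODE_LONG_SKETCH };

/* Recompute the coarse CPU mode from the control bits after a CR0 write. */
static enum emul_mode_sketch recalc_mode_sketch(bool cr0_pe, bool cr0_pg,
						bool efer_lme)
{
	/* Paging without protection is not a valid CR0 combination. */
	assert(cr0_pe || !cr0_pg);

	if (!cr0_pe)
		return MODE_REAL_SKETCH;
	if (cr0_pg && efer_lme)
		return MODE_LONG_SKETCH;
	return MODE_PROT_SKETCH;
}
```

An emulator that cached the mode from before the CR0 write would keep using the old value in exactly the cases where this function's result changes.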
2022-10-28KVM: x86: emulator: update the emulation mode after rsmMaxim Levitsky
Update the emulation mode after RSM so that RIP will be correctly written back, because the RSM instruction can switch the CPU mode from 32 bit (or less) to 64 bit. This fixes a guest crash in case the #SMI is received while the guest runs code from an address > 32 bit. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20221025124741.228045-13-mlevitsk@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-28KVM: x86: emulator: introduce emulator_recalc_and_set_modeMaxim Levitsky
Some instructions update the cpu execution mode, which needs to update the emulation mode. Extract this code, and make assign_eip_far use it. assign_eip_far now reads CS, instead of getting it via a parameter, which is ok, because callers always assign CS to the same value before calling this function. No functional change is intended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20221025124741.228045-12-mlevitsk@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-28KVM: x86: emulator: em_sysexit should update ctxt->modeMaxim Levitsky
SYSEXIT is one of the instructions that can change the processor mode, thus ctxt->mode should be updated after it. Note that this is likely a benign bug, because the only problematic mode change is from 32 bit to 64 bit which can lead to truncation of RIP, and it is not possible to do with sysexit, since sysexit running in 32 bit mode will be limited to 32 bit version. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20221025124741.228045-11-mlevitsk@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-28KVM: selftests: Mark "guest_saw_irq" as volatile in xen_shinfo_testSean Christopherson
Tag "guest_saw_irq" as "volatile" to ensure that the compiler will never optimize away lookups. Relying on the compiler thinking that the flag is global and thus might change also works, but it's subtle, less robust, and looks like a bug at first glance, e.g. risks being "fixed" and breaking the test. Make the flag "static" as well since convincing the compiler it's global is no longer necessary. Alternatively, the flag could be accessed with {READ,WRITE}_ONCE(), but literally every access would need the wrappers, and eking out performance isn't exactly top priority for selftests. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221013211234.1318131-17-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
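A rough userspace analogue of the flag described here: a `static volatile` variable written from an asynchronous context and polled in a loop. The signal handler stands in for the guest IRQ handler, and all names are hypothetical. Without `volatile`, a compiler would be allowed to hoist the load out of the polling loop.

```c
#include <assert.h>
#include <signal.h>

/* static + volatile: file-local, yet every read in the loop is a real load. */
static volatile sig_atomic_t saw_irq_sketch;

static void irq_handler_sketch(int sig)
{
	(void)sig;
	saw_irq_sketch = 1;
}

static int wait_for_irq_sketch(void)
{
	void (*prev)(int) = signal(SIGINT, irq_handler_sketch);

	assert(prev != SIG_ERR);
	raise(SIGINT);              /* stands in for the injected interrupt */
	while (!saw_irq_sketch)
		;                   /* poll until the handler has run */
	signal(SIGINT, prev);       /* restore the previous disposition */
	return (int)saw_irq_sketch;
}
```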
2022-10-28KVM: selftests: Add tests in xen_shinfo_test to detect lock racesMichal Luczaj
Tests for races between shinfo_cache (de)activation and hypercall+ioctl() processing. KVM has had bugs where activating the shared info cache multiple times and/or with concurrent users results in lock corruption, NULL pointer dereferences, and other fun. For the timer injection testcase (#22), re-arm the timer until the IRQ is successfully injected. If the timer expires while the shared info is deactivated (invalid), KVM will drop the event. Signed-off-by: Michal Luczaj <mhal@rbox.co> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221013211234.1318131-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-28Merge branch 'tcp-plb'David S. Miller
Mubashir Adnan Qureshi says: ==================== net: Add PLB functionality to TCP This patch series adds PLB (Protective Load Balancing) to TCP and hooks it up to DCTCP. PLB is disabled by default and can be enabled using relevant sysctls and support from underlying CC. PLB (Protective Load Balancing) is a host-based mechanism for load balancing across switch links. It leverages congestion signals (e.g. ECN) from the transport layer to randomly change the path of the connection experiencing congestion. PLB changes the path of the connection by changing the outgoing IPv6 flow label for IPv6 connections (implemented in Linux by calling sk_rethink_txhash()). Because of this implementation mechanism, PLB can currently only work for IPv6 traffic. For more information, see the SIGCOMM 2022 paper: https://doi.org/10.1145/3544216.3544226 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add rcv_wnd and plb_rehash to TCP_INFOMubashir Adnan Qureshi
rcv_wnd can be useful to diagnose TCP performance where receiver window becomes the bottleneck. rehash reports the PLB and timeout triggered rehash attempts by the TCP connection. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add u32 counter in tcp_sock and an SNMP counter for PLBMubashir Adnan Qureshi
A u32 counter is added to tcp_sock for counting the number of PLB triggered rehashes for a TCP connection. An SNMP counter is also added to count overall PLB triggered rehash events for a host. These counters are hooked up to PLB implementation for DCTCP. TCP_NLA_REHASH is added to SCM_TIMESTAMPING_OPT_STATS that reports the rehash attempts triggered due to PLB or timeouts. This gives a historical view of sustained congestion or timeouts experienced by the TCP connection. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add support for PLB in DCTCPMubashir Adnan Qureshi
PLB support is added to TCP DCTCP code. As DCTCP uses ECN as the congestion signal, PLB also uses ECN to make decisions whether to change the path or not upon sustained congestion. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add PLB functionality for TCPMubashir Adnan Qureshi
Congestion control algorithms track PLB state and cause the connection to trigger a path change when either of the 2 conditions is satisfied:
- No packets are in flight and (# consecutive congested rounds >= sysctl_tcp_plb_idle_rehash_rounds)
- (# consecutive congested rounds >= sysctl_tcp_plb_rehash_rounds)
A round (RTT) is marked as congested when congestion signal (ECN ce_ratio) over an RTT is greater than sysctl_tcp_plb_cong_thresh. In the event of RTO, PLB (via tcp_write_timeout()) triggers a path change and disables congestion-triggered path changes for random time between (sysctl_tcp_plb_suspend_rto_sec, 2*sysctl_tcp_plb_suspend_rto_sec) to avoid hopping onto the "connectivity blackhole". RTO-triggered path changes can still happen during this cool-off period. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
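The two rehash conditions and the per-round congestion test from this message can be sketched as follows. The struct and function names are hypothetical, the fields merely mirror the sysctls named above, and two things are assumptions rather than facts from the message: that the streak of congested rounds resets on an uncongested round, and the numeric scale of `ce_ratio`. The RTO-triggered suspension window is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirrors of the sysctls named in the commit message. */
struct plb_params_sketch {
	unsigned int idle_rehash_rounds; /* sysctl_tcp_plb_idle_rehash_rounds */
	unsigned int rehash_rounds;      /* sysctl_tcp_plb_rehash_rounds */
	unsigned int cong_thresh;        /* sysctl_tcp_plb_cong_thresh */
};

struct plb_state_sketch {
	unsigned int consec_cong_rounds;
};

/* Called once per round (RTT) with that round's ECN CE ratio. */
static void plb_update_round_sketch(struct plb_state_sketch *st,
				    const struct plb_params_sketch *p,
				    unsigned int ce_ratio)
{
	assert(st && p);
	if (ce_ratio > p->cong_thresh)
		st->consec_cong_rounds++;
	else
		st->consec_cong_rounds = 0; /* assumption: the streak resets */
}

/* True when either condition from the commit message holds. */
static bool plb_should_rehash_sketch(const struct plb_state_sketch *st,
				     const struct plb_params_sketch *p,
				     unsigned int packets_in_flight)
{
	if (packets_in_flight == 0 &&
	    st->consec_cong_rounds >= p->idle_rehash_rounds)
		return true;
	return st->consec_cong_rounds >= p->rehash_rounds;
}
```

Keeping the idle threshold lower than the regular one means a path change is attempted earlier when nothing is in flight, which is when switching paths cannot reorder packets.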
2022-10-28tcp: add sysctls for TCP PLB parametersMubashir Adnan Qureshi
PLB (Protective Load Balancing) is a host-based mechanism for load balancing across switch links. It leverages congestion signals (e.g. ECN) from the transport layer to randomly change the path of the connection experiencing congestion. PLB changes the path of the connection by changing the outgoing IPv6 flow label for IPv6 connections (implemented in Linux by calling sk_rethink_txhash()). Because of this implementation mechanism, PLB can currently only work for IPv6 traffic. For more information, see the SIGCOMM 2022 paper: https://doi.org/10.1145/3544216.3544226 This commit adds new sysctl knobs and sets their default values for TCP PLB. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28Merge branch 'mxl-gpy-MDI-X'David S. Miller
Raju Lakkaraju says: ==================== net: phy: mxl-gpy: Add MDI-X This patch series adds the MDI-X feature to GPY211 PHYs and also changes the return type of the gpy_update_interface() function ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: phy: mxl-gpy: Add PHY Auto/MDI/MDI-X set driver for GPY211 chipsRaju Lakkaraju
Add support for MDI-X status and configuration for GPY211 chips Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: phy: mxl-gpy: Change gpy_update_interface() function return typeRaju Lakkaraju
gpy_update_interface() is called from gpy_read_status() which does return error codes. gpy_read_status() would benefit from returning -EINVAL, etc. Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: dsa: Fix possible memory leaks in dsa_loop_init()Chen Zhongjin
kmemleak reported memory leaks in dsa_loop_init(): kmemleak: 12 new suspected memory leaks unreferenced object 0xffff8880138ce000 (size 2048): comm "modprobe", pid 390, jiffies 4295040478 (age 238.976s) backtrace: [<000000006a94f1d5>] kmalloc_trace+0x26/0x60 [<00000000a9c44622>] phy_device_create+0x5d/0x970 [<00000000d0ee2afc>] get_phy_device+0xf3/0x2b0 [<00000000dca0c71f>] __fixed_phy_register.part.0+0x92/0x4e0 [<000000008a834798>] fixed_phy_register+0x84/0xb0 [<0000000055223fcb>] dsa_loop_init+0xa9/0x116 [dsa_loop] ... There are two reasons for the memleak in dsa_loop_init(). First, fixed_phy_register() creates and registers a phy_device: fixed_phy_register() get_phy_device() phy_device_create() # freed by phy_device_free() phy_device_register() # freed by phy_device_remove() But fixed_phy_unregister() only calls phy_device_remove(). So the memory allocated in phy_device_create() is leaked. Second, when mdio_driver_register() fails in dsa_loop_init(), it just returns and there is no cleanup for phydevs. Fix the problems by catching the error of mdio_driver_register() in dsa_loop_init(), then calling both fixed_phy_unregister() and phy_device_free() to release phydevs. Also add a function for phydevs cleanup to avoid duplication. Fixes: 98cd1552ea27 ("net: dsa: Mock-up driver") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
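Both leaks described in this message come down to one shared cleanup routine that is missing a step and is not called on the error path. The sketch below models that in userspace with hypothetical names (`phy_sketch`, `loop_init_sketch`, a `live_allocs` counter); it mirrors the structure of the fix, not the real phylib calls.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_FIXED_PHYS_SKETCH 3

struct phy_sketch { int registered; };

static struct phy_sketch *phydevs[NUM_FIXED_PHYS_SKETCH];
static int live_allocs; /* bookkeeping for the sketch only */

/* Create and register in one step, like a combined register helper. */
static struct phy_sketch *phy_register_sketch(void)
{
	struct phy_sketch *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	live_allocs++;
	p->registered = 1;
	return p;
}

/* Unregistering alone does NOT release the memory. */
static void phy_unregister_sketch(struct phy_sketch *p)
{
	p->registered = 0;
}

static void phy_free_sketch(struct phy_sketch *p)
{
	assert(!p->registered); /* must be unregistered before freeing */
	free(p);
	live_allocs--;
}

/* One cleanup helper shared by the error path and the exit path. */
static void phydevs_unregister_all_sketch(void)
{
	for (int i = 0; i < NUM_FIXED_PHYS_SKETCH; i++) {
		if (!phydevs[i])
			continue;
		phy_unregister_sketch(phydevs[i]);
		phy_free_sketch(phydevs[i]);
		phydevs[i] = NULL;
	}
}

static int loop_init_sketch(int driver_register_fails)
{
	for (int i = 0; i < NUM_FIXED_PHYS_SKETCH; i++)
		phydevs[i] = phy_register_sketch();
	if (driver_register_fails) {
		phydevs_unregister_all_sketch(); /* previously: plain return */
		return -1;
	}
	return 0;
}

static void loop_exit_sketch(void)
{
	phydevs_unregister_all_sketch();
}
```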
2022-10-28kbuild: fix SIGPIPE error message for AR=gcc-ar and AR=llvm-arMasahiro Yamada
Jiri Slaby reported that building the kernel with AR=gcc-ar shows: /usr/bin/ar terminated with signal 13 [Broken pipe] Nathan Chancellor reported the latest AR=llvm-ar shows: error: write on a pipe with no reader The latter occurs since LLVM commit 51b557adc131 ("Add an error message to the default SIGPIPE handler"). The resulting vmlinux is correct, but it is better to silence it. 'head -n1' exits after reading the first line, so the pipe is closed. Use 'sed -n 1p' to eat the stream till the end. Fixes: 321648455061 ("kbuild: use obj-y instead extra-y for objects placed at the head") Link: https://github.com/ClangBuiltLinux/linux/issues/1651 Reported-by: Jiri Slaby <jirislaby@kernel.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <nathan@kernel.org>
2022-10-27cifs: fix use-after-free caused by invalid pointer `hostname`Zeng Heng
`hostname` needs to be set to NULL after it is freed in the `cifs_put_tcp_session` function; otherwise, when the `cifsd` thread attempts to resolve the hostname and reconnect to the host, it dereferences the invalid pointer. Here is one practical backtrace example for reference: Task 477 --------------------------- do_mount path_mount do_new_mount vfs_get_tree smb3_get_tree smb3_get_tree_common cifs_smb3_do_mount cifs_mount mount_put_conns cifs_put_tcp_session --> kfree(server->hostname) cifsd --------------------------- kthread cifs_demultiplex_thread cifs_reconnect reconn_set_ipaddr_from_hostname --> if (!server->hostname) --> if (server->hostname[0] == '\0') // !! UAF fault here CIFS: VFS: cifs_mount failed w/return code = -112 mount error(112): Host is down BUG: KASAN: use-after-free in reconn_set_ipaddr_from_hostname+0x2ba/0x310 Read of size 1 at addr ffff888108f35380 by task cifsd/480 CPU: 2 PID: 480 Comm: cifsd Not tainted 6.1.0-rc2-00106-gf705792f89dd-dirty #25 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x68/0x85 print_report+0x16c/0x4a3 kasan_report+0x95/0x190 reconn_set_ipaddr_from_hostname+0x2ba/0x310 __cifs_reconnect.part.0+0x241/0x800 cifs_reconnect+0x65f/0xb60 cifs_demultiplex_thread+0x1570/0x2570 kthread+0x2c5/0x380 ret_from_fork+0x22/0x30 </TASK> Allocated by task 477: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_kmalloc+0x7e/0x90 __kmalloc_node_track_caller+0x52/0x1b0 kstrdup+0x3b/0x70 cifs_get_tcp_session+0xbc/0x19b0 mount_get_conns+0xa9/0x10c0 cifs_mount+0xdf/0x1970 cifs_smb3_do_mount+0x295/0x1660 smb3_get_tree+0x352/0x5e0 vfs_get_tree+0x8e/0x2e0 path_mount+0xf8c/0x1990 do_mount+0xee/0x110 __x64_sys_mount+0x14b/0x1f0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Freed by task 477: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x50 __kasan_slab_free+0x10a/0x190 __kmem_cache_free+0xca/0x3f0 
cifs_put_tcp_session+0x30c/0x450 cifs_mount+0xf95/0x1970 cifs_smb3_do_mount+0x295/0x1660 smb3_get_tree+0x352/0x5e0 vfs_get_tree+0x8e/0x2e0 path_mount+0xf8c/0x1990 do_mount+0xee/0x110 __x64_sys_mount+0x14b/0x1f0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd The buggy address belongs to the object at ffff888108f35380 which belongs to the cache kmalloc-16 of size 16 The buggy address is located 0 bytes inside of 16-byte region [ffff888108f35380, ffff888108f35390) The buggy address belongs to the physical page: page:00000000333f8e58 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888108f350e0 pfn:0x108f35 flags: 0x200000000000200(slab|node=0|zone=2) raw: 0200000000000200 0000000000000000 dead000000000122 ffff8881000423c0 raw: ffff888108f350e0 000000008080007a 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888108f35280: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc ffff888108f35300: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc >ffff888108f35380: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc ^ ffff888108f35400: fa fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888108f35480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Fixes: 7be3248f3139 ("cifs: To match file servers, make sure the server hostname matches") Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
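The pointer handling at the heart of this fix can be shown in a few lines of userspace C. All names here are hypothetical, and the sketch is single-threaded: it ignores the locking and thread lifetime questions that the real `cifsd` reconnect path involves, and only shows why the NULL check in the reader is useless unless the pointer is cleared after the free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct server_sketch { char *hostname; };

static int server_set_hostname_sketch(struct server_sketch *s, const char *name)
{
	char *copy = malloc(strlen(name) + 1);

	if (!copy)
		return -1;
	strcpy(copy, name);
	s->hostname = copy;
	return 0;
}

/* Free the name and clear the pointer so later checks see NULL. */
static void server_put_sketch(struct server_sketch *s)
{
	assert(s);
	free(s->hostname);
	s->hostname = NULL; /* without this line the pointer dangles */
}

/* Mirrors the two checks quoted in the backtrace diagram. */
static int server_can_resolve_sketch(const struct server_sketch *s)
{
	if (!s->hostname)
		return 0;
	return s->hostname[0] != '\0';
}
```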
2022-10-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next 1) Move struct nft_payload_set definition to .c file where it is only used. 2) Shrink transport and inner header offset fields in the nft_pktinfo structure to 16 bits, from Florian Westphal. 3) Get rid of nft_objref Kbuild toggle, make it built-in into nf_tables. This expression is used to instantiate conntrack helpers in nftables. After removing the conntrack helper auto-assignment toggle, this feature became more important, so move it to the nf_tables core module. Also from Florian. 4) Extend the existing function to calculate payload inner header offset to deal with the GRE and IPIP transport protocols. 6) Add inner expression support for nf_tables. This new expression provides a packet parser for tunneled packets which uses a userspace description of the expected inner headers. The inner expression invokes the payload expression (via direct call) to match on the inner header protocol fields using the inner link, network and transport header offsets. An example of the bytecode generated from userspace to match on IP source encapsulated in a VxLAN packet: # nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4 netdev x y [ meta load l4proto => reg 1 ] [ cmp eq reg 1 0x00000011 ] [ payload load 2b @ transport header + 2 => reg 1 ] [ cmp eq reg 1 0x0000b512 ] [ inner type vxlan hdrsize 8 flags f [ meta load protocol => reg 1 ] ] [ cmp eq reg 1 0x00000008 ] [ inner type vxlan hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ] [ cmp eq reg 1 0x04030201 ] 7) Store inner link, network and transport header offsets in percpu area to parse inner packet header once only. Matching on a different tunnel type invalidates existing offsets in the percpu area and it invokes the inner tunnel parser again. 8) Add support for inner meta matching. 
This supports NFT_META_PROTOCOL, which specifies the inner ethertype, and NFT_META_L4PROTO, which specifies the inner transport protocol. 9) Extend nft_inner to parse GENEVE optional fields to calculate the link layer offset. 10) Update inner expression so tunnel offset points to GRE header to normalize tunnel header handling. This also allows performing different interpretations of the GRE header from userspace. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nft_inner: set tunnel offset to GRE header offset netfilter: nft_inner: add geneve support netfilter: nft_meta: add inner match support netfilter: nft_inner: add percpu inner context netfilter: nft_inner: support for inner tunnel header matching netfilter: nft_payload: access ipip payload for inner offset netfilter: nft_payload: access GRE payload via inner offset netfilter: nft_objref: make it builtin netfilter: nf_tables: reduce nft_pktinfo by 8 bytes netfilter: nft_payload: move struct nft_payload_set definition where it belongs ==================== Link: https://lore.kernel.org/r/20221026132227.3287-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
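The cmp constants in the bytecode quoted in this merge message can look odd at first: the UDP port 4789 appears as 0x0000b512 and the address 1.2.3.4 as 0x04030201. A minimal sketch of the arithmetic follows, assuming the debug output was produced on a little-endian host and that register data is printed as host-order words; both points are assumptions, not something the message states. The helper name reg_value is hypothetical and exists only for this illustration.

```python
import socket
import struct


def reg_value(wire_bytes: bytes) -> int:
    """Read bytes taken from the packet (network order) as one
    little-endian word, mimicking how a little-endian host would
    print the register contents (an assumption of this sketch)."""
    return int.from_bytes(wire_bytes, "little")


# Values as they appear on the wire (big-endian / network byte order).
l4proto_udp = bytes([socket.IPPROTO_UDP])   # 1 byte: 17
udp_dport = struct.pack("!H", 4789)         # 2 bytes: 12 b5
ethertype_ip = struct.pack("!H", 0x0800)    # 2 bytes: 08 00
ip_saddr = socket.inet_aton("1.2.3.4")      # 4 bytes: 01 02 03 04

for label, raw in [
    ("l4proto", l4proto_udp),
    ("udp dport", udp_dport),
    ("inner protocol", ethertype_ip),
    ("inner ip saddr", ip_saddr),
]:
    print(f"{label:15s} cmp eq reg 1 {reg_value(raw):#010x}")
```

Under those assumptions the four printed constants line up with the four cmp operands in the rule: the first two match on the outer UDP header, and the last two are evaluated inside the vxlan inner expression.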
2022-10-27net: dpaa2-eth: Simplify bool conversionYang Li
./drivers/net/ethernet/freescale/dpaa2/dpaa2-xsk.c:453:42-47: WARNING: conversion to bool not needed here Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2577 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20221026051824.38730-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27Merge branch 'ionic-vf-attr-replay-and-other-updates'Jakub Kicinski
Shannon Nelson says: ==================== ionic: VF attr replay and other updates For better VF management when a FW update restart or a FW crash recovery is detected, the PF now will replay any user specified VF attributes to be sure the FW hasn't lost them in the restart. Newer FW offers more packet processing offloads, so we now support them in the driver. A small refactor of the Rx buffer fill cleans up a bit of code and will help future work on buffer caching. ==================== Link: https://lore.kernel.org/r/20221026143744.11598-1-snelson@pensando.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27ionic: refactor use of ionic_rx_fill()Neel Patel
The same pre-work code is used before each call to ionic_rx_fill(), so bring it in and make it a part of the routine. Signed-off-by: Neel Patel <neel@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27ionic: enable tunnel offloadsNeel Patel
Support stateless offloads for GRE, VXLAN, GENEVE, IPXIP4 and IPXIP6 when the FW supports them. Signed-off-by: Neel Patel <neel@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27ionic: new ionic device identity level and VF start controlShannon Nelson
A new ionic dev_cmd is added to the interface in ionic_if.h, with a new capabilities field in the ionic device identity to signal its availability in the FW. The identity level code is incremented to '2' to show support for this new capabilities bitfield. If the driver has indicated with the new identity level that it has the VF_CTRL command, newer FW will wait for the start command before starting the VFs after a FW update or crash recovery. This patch updates the driver to make use of the new VF start control in fw_up path to be sure that the PF has set the user attributes on the VF before the FW allows the VFs to restart. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27ionic: only save the user set VF attributesShannon Nelson
Report the current FW values for the VF attributes, but don't save the FW values locally; only save the VF attributes that are given to us from the user. This allows us to replay user data, and doesn't end up confusing things like "who set the mac address". Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
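The VF attribute handling described in the ionic commits above can be pictured with a small toy model. This is only a sketch of the idea, not the driver code: the classes Firmware and PF, their methods, and the attribute names are all hypothetical, and the real driver talks to the FW through dev_cmds rather than Python dictionaries.

```python
class Firmware:
    """Toy stand-in for the device FW: holds per-VF attributes and
    forgets them on restart (hypothetical model, not the real FW)."""

    def __init__(self):
        self.vf_attrs = {}
        self.vfs_started = False

    def set_attr(self, vf, name, value):
        self.vf_attrs.setdefault(vf, {})[name] = value

    def get_attr(self, vf, name):
        return self.vf_attrs.get(vf, {}).get(name)

    def restart(self):
        # A FW update or crash recovery loses VF state and stops the VFs.
        self.vf_attrs.clear()
        self.vfs_started = False

    def start_vfs(self):
        self.vfs_started = True


class PF:
    """Toy PF driver: remembers only what the user asked for."""

    def __init__(self, fw):
        self.fw = fw
        self.user_attrs = {}  # user-set values only, never FW-reported ones

    def user_set(self, vf, name, value):
        self.user_attrs.setdefault(vf, {})[name] = value
        self.fw.set_attr(vf, name, value)

    def report(self, vf, name):
        # Report the FW's current value without caching it locally.
        return self.fw.get_attr(vf, name)

    def fw_up(self):
        # Replay user-set attributes first, then let the VFs start.
        for vf, attrs in self.user_attrs.items():
            for name, value in attrs.items():
                self.fw.set_attr(vf, name, value)
        self.fw.start_vfs()


fw = Firmware()
pf = PF(fw)
fw.set_attr(0, "mac", "fw-assigned")   # value chosen by the FW, not the user
pf.user_set(0, "vlan", 100)            # value chosen by the user
mac_before = pf.report(0, "mac")       # reported, but not saved by the PF
fw.restart()
pf.fw_up()
```

In this model only the VLAN survives the restart, because it is the only value the user set; the FW-assigned MAC is reported while it exists but is never replayed, which mirrors the "who set the mac address" concern in the commit message. The VFs are started only after the replay, as in the VF start control patch.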