summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-03-27net: sunhme: Consolidate mac address initializationSean Anderson
The mac address initialization is braodly the same between PCI and SBUS, and one was clearly copied from the other. Consolidate them. We still have to have some ifdefs because pci_(un)map_rom is only implemented for PCI, and idprom is only implemented for SPARC. Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: sunhme: Switch SBUS to devresSean Anderson
The PCI half of this driver was converted in commit 914d9b2711dd ("sunhme: switch to devres"). Do the same for the SBUS half. Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: sunhme: Alphabetize includesSean Anderson
Alphabetize includes to make it clearer where to add new ones. Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: sunhme: Unify IRQ requestingSean Anderson
Instead of registering one interrupt handler for all four SBUS Quattro HMEs, let each HME register its own handler. To make this work, we don't handle the IRQ if none of the status bits are set. This reduces the complexity of the driver, and makes it easier to ensure things happen before/after enabling IRQs. I'm not really sure why we request IRQs in two different places (and leave them running after removing the driver!). A lot of things in this driver seem to just be crusty, and not necessarily intentional. I'm assuming that's the case here as well. This really needs to be tested by someone with an SBUS Quattro card. Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: sunhme: Remove residual polling codeSean Anderson
The sunhme driver never used the hardware MII polling feature. Even the if-def'd out happy_meal_poll_start was removed by 2002 [1]. Remove the various places in the driver which needlessly guard against MII interrupts which will never be enabled. [1] https://lwn.net/2002/0411/a/2.5.8-pre3.php3 Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: sunhme: Just restart autonegotiation if we can't bring the link upSean Anderson
If we've tried regular autonegotiation and forcing the link mode, just restart autonegotiation instead of reinitializing the whole NIC. Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: sunhme: Fix uninitialized return codeSean Anderson
Fix an uninitialized return code if we never found a qfe slot. It would be a bug if we ever got into this situation, but it's good to return something tracable. Fixes: acb3f35f920b ("sunhme: forward the error code from pci_enable_device()") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27Merge branch 'octeon_ep-deferred-probe-and-mailbox'David S. Miller
Veerasenareddy Burru says: ==================== octeon_ep: deferred probe and mailbox Implement Deferred probe, mailbox enhancements and heartbeat monitor. v4 -> v5: - addressed review comments https://lore.kernel.org/all/20230323104703.GD36557@unreal/ replaced atomic_inc() + atomic_read() with atomic_inc_return(). v3 -> v4: - addressed review comments on v3 https://lore.kernel.org/all/20230214051422.13705-1-vburru@marvell.com/ - 0004-xxx.patch v3 is split into 0004-xxx.patch and 0005-xxx.patch in v4. - API changes to accept function ID are moved to 0005-xxx.patch. - fixed rct violations. - reverted newly added changes that do not yet have use cases. v2 -> v3: - removed SRIOV VF support changes from v2, as new drivers which use ndo_get_vf_xxx() and ndo_set_vf_xxx() are not accepted. https://lore.kernel.org/all/20221207200204.6819575a@kernel.org/ Will implement VF representors and submit again. - 0007-xxx.patch and 0008-xxx.patch from v2 are removed and 0009-xxx.patch in v2 is now 0007-xxx.patch in v3. - accordingly, changed title for cover letter. v1 -> v2: - remove separate workqueue task to wait for firmware ready. instead defer probe when firmware is not ready. Reported-by: Leon Romanovsky <leon@kernel.org> - This change has resulted in update of 0001-xxx.patch and all other patches in the patchset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27octeon_ep: add heartbeat monitorVeerasenareddy Burru
Monitor periodic heartbeat messages from device firmware. Presence of heartbeat indicates the device is active and running. If the heartbeat is missed for configured interval indicates firmware has crashed and device is unusable; in this case, PF driver stops and uninitialize the device. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27octeon_ep: function id in link info and stats mailbox commandsVeerasenareddy Burru
Update control mailbox API to include function id in get stats and link info. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27octeon_ep: support asynchronous notificationsVeerasenareddy Burru
Add asynchronous notification support to the control mailbox. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27octeon_ep: include function id in mailbox commandsVeerasenareddy Burru
Extend control command structure to include vfid and update APIs to accept VF ID. Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27octeon_ep: add separate mailbox command and response queuesVeerasenareddy Burru
Enhance control mailbox protocol to support following - separate command and response queues * command queue to send control commands to firmware. * response queue to receive responses and notifications from firmware. - variable size messages using scatter/gather Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27octeon_ep: control mailbox for multiple PFsVeerasenareddy Burru
Add control mailbox support for multiple PFs. Update control mbox base address calculation based on PF function link. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27octeon_ep: poll for control messagesVeerasenareddy Burru
Poll for control messages until interrupts are enabled. All the interrupts are enabled in ndo_open(). Add ability to listen for notifications from firmware before ndo_open(). Once interrupts are enabled, this polling is disabled and all the messages are processed by bottom half of interrupt handler. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27octeon_ep: defer probe if firmware not readyVeerasenareddy Burru
Defer probe if firmware is not ready for device usage. Signed-off-by: Veerasenareddy Burru <vburru@marvell.com> Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com> Signed-off-by: Satananda Burla <sburla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: dsa: b53: mmap: add phy opsÁlvaro Fernández Rojas
Implement phy_read16() and phy_write16() ops for B53 MMAP to avoid accessing B53_PORT_MII_PAGE registers which hangs the device. This access should be done through the MDIO Mux bus controller. Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27Merge branch 'bcm53134-support'David S. Miller
Álvaro Fernández Rojas says: ==================== net: dsa: b53: mdio: add support for BCM53134 This is based on the initial work from Paul Geurts that was sent to the incorrect linux development lists and recipients. I've modified it by removing BCM53134_DEVICE_ID from is531x5() and therefore adding is53134() where needed. I also added a separate RGMII handling block for is53134() since according to Paul, BCM53134 doesn't support RGMII_CTRL_TIMING_SEL as opposed to is531x5(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: dsa: b53: mdio: add support for BCM53134Paul Geurts
Add support for the BCM53134 Ethernet switch in the existing b53 dsa driver. BCM53134 is very similar to the BCM58XX series. Signed-off-by: Paul Geurts <paul.geurts@prodrive-technologies.com> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27dt-bindings: net: dsa: b53: add BCM53134 supportÁlvaro Fernández Rojas
BCM53134 are B53 switches connected by MDIO. Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: phy: micrel: correct KSZ9131RNX EEE capabilities and advertisementOleksij Rempel
The KSZ9131RNX incorrectly shows EEE capabilities in its registers. Although the "EEE control and capability 1" (Register 3.20) is set to 0, indicating no EEE support, the "EEE advertisement 1" (Register 7.60) is set to 0x6, advertising EEE support for 1000BaseT/Full and 100BaseT/Full. This inconsistency causes PHYlib to assume there is no EEE support, preventing control over EEE advertisement, which is enabled by default. This patch resolves the issue by utilizing the ksz9477_get_features() function to correctly set the EEE capabilities for the KSZ9131RNX. This adjustment allows proper control over EEE advertisement and ensures accurate representation of the device's capabilities. Fixes: 8b68710a3121 ("net: phy: start using genphy_c45_ethtool_get/set_eee()") Reported-by: Marek Vasut <marex@denx.de> Tested-by: Marek Vasut <marex@denx.de> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27vsock/loopback: use only sk_buff_head.lock to protect the packet queueStefano Garzarella
pkt_list_lock was used before commit 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff") to protect the packet queue. After that commit we switched to sk_buff and we are using sk_buff_head.lock in almost every place to protect the packet queue except in vsock_loopback_work() when we call skb_queue_splice_init(). As reported by syzbot, this caused unlocked concurrent access to the packet queue between vsock_loopback_work() and vsock_loopback_cancel_pkt() since it is not holding pkt_list_lock. With the introduction of sk_buff_head, pkt_list_lock is redundant and can cause confusion, so let's remove it and use sk_buff_head.lock everywhere to protect the packet queue access. Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff") Cc: bobby.eshleman@bytedance.com Reported-and-tested-by: syzbot+befff0a9536049e7902e@syzkaller.appspotmail.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Bobby Eshleman <bobby.eshleman@bytedance.com> Reviewed-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27Merge branch 'constify-sfp-phy-nodes'David S. Miller
Russell King says: ==================== Constify a few sfp/phy fwnodes This series constifies a bunch of fwnode_handle pointers that are only used to refer to but not modify the contents of the fwnode structures. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: phy: constify fwnode_get_phy_node() fwnode argumentRussell King (Oracle)
fwnode_get_phy_node() does not motify the fwnode structure, so make the argument const, Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: sfp: constify sfp-bus internal fwnode usesRussell King (Oracle)
Constify sfp-bus internal fwnode uses, since we do not modify the fwnode structures. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net: sfp: make sfp_bus_find_fwnode() take a const fwnodeRussell King (Oracle)
sfp_bus_find_fwnode() does not write to the fwnode, so let's make it const. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27net/net_failover: fix txq exceeding warningFaicker Mo
The failover txq is inited as 16 queues. when a packet is transmitted from the failover device firstly, the failover device will select the queue which is returned from the primary device if the primary device is UP and running. If the primary device txq is bigger than the default 16, it can lead to the following warning: eth0 selects TX queue 18, but real number of TX queues is 16 The warning backtrace is: [ 32.146376] CPU: 18 PID: 9134 Comm: chronyd Tainted: G E 6.2.8-1.el7.centos.x86_64 #1 [ 32.147175] Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014 [ 32.147730] Call Trace: [ 32.147971] <TASK> [ 32.148183] dump_stack_lvl+0x48/0x70 [ 32.148514] dump_stack+0x10/0x20 [ 32.148820] netdev_core_pick_tx+0xb1/0xe0 [ 32.149180] __dev_queue_xmit+0x529/0xcf0 [ 32.149533] ? __check_object_size.part.0+0x21c/0x2c0 [ 32.149967] ip_finish_output2+0x278/0x560 [ 32.150327] __ip_finish_output+0x1fe/0x2f0 [ 32.150690] ip_finish_output+0x2a/0xd0 [ 32.151032] ip_output+0x7a/0x110 [ 32.151337] ? __pfx_ip_finish_output+0x10/0x10 [ 32.151733] ip_local_out+0x5e/0x70 [ 32.152054] ip_send_skb+0x19/0x50 [ 32.152366] udp_send_skb.isra.0+0x163/0x3a0 [ 32.152736] udp_sendmsg+0xba8/0xec0 [ 32.153060] ? __folio_memcg_unlock+0x25/0x60 [ 32.153445] ? __pfx_ip_generic_getfrag+0x10/0x10 [ 32.153854] ? sock_has_perm+0x85/0xa0 [ 32.154190] inet_sendmsg+0x6d/0x80 [ 32.154508] ? inet_sendmsg+0x6d/0x80 [ 32.154838] sock_sendmsg+0x62/0x70 [ 32.155152] ____sys_sendmsg+0x134/0x290 [ 32.155499] ___sys_sendmsg+0x81/0xc0 [ 32.155828] ? _get_random_bytes.part.0+0x79/0x1a0 [ 32.156240] ? ip4_datagram_release_cb+0x5f/0x1e0 [ 32.156649] ? get_random_u16+0x69/0xf0 [ 32.156989] ? __fget_light+0xcf/0x110 [ 32.157326] __sys_sendmmsg+0xc4/0x210 [ 32.157657] ? __sys_connect+0xb7/0xe0 [ 32.157995] ? __audit_syscall_entry+0xce/0x140 [ 32.158388] ? syscall_trace_enter.isra.0+0x12c/0x1a0 [ 32.158820] __x64_sys_sendmmsg+0x24/0x30 [ 32.159171] do_syscall_64+0x38/0x90 [ 32.159493] entry_SYSCALL_64_after_hwframe+0x72/0xdc Fix that by reducing txq number as the non-existent primary-dev does. Fixes: cfc80d9a1163 ("net: Introduce net_failover driver") Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-27regulator: Handle deferred clkChristophe JAILLET
devm_clk_get() can return -EPROBE_DEFER. So it is better to return the error code from devm_clk_get(), instead of a hard coded -ENOENT. This gives more opportunities to successfully probe the driver. Fixes: 8959e5324485 ("regulator: fixed: add possibility to enable by clock") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/18459fae3d017a66313699c7c8456b28158b2dd0.1679819354.git.christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown <broonie@kernel.org>
2023-03-279P FS: Fix wild-memory-access write in v9fs_get_aclIvan Orlov
KASAN reported the following issue: [ 36.825817][ T5923] BUG: KASAN: wild-memory-access in v9fs_get_acl+0x1a4/0x390 [ 36.827479][ T5923] Write of size 4 at addr 9fffeb37f97f1c00 by task syz-executor798/5923 [ 36.829303][ T5923] [ 36.829846][ T5923] CPU: 0 PID: 5923 Comm: syz-executor798 Not tainted 6.2.0-syzkaller-18302-g596b6b709632 #0 [ 36.832110][ T5923] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 [ 36.834464][ T5923] Call trace: [ 36.835196][ T5923] dump_backtrace+0x1c8/0x1f4 [ 36.836229][ T5923] show_stack+0x2c/0x3c [ 36.837100][ T5923] dump_stack_lvl+0xd0/0x124 [ 36.838103][ T5923] print_report+0xe4/0x4c0 [ 36.839068][ T5923] kasan_report+0xd4/0x130 [ 36.840052][ T5923] kasan_check_range+0x264/0x2a4 [ 36.841199][ T5923] __kasan_check_write+0x2c/0x3c [ 36.842216][ T5923] v9fs_get_acl+0x1a4/0x390 [ 36.843232][ T5923] v9fs_mount+0x77c/0xa5c [ 36.844163][ T5923] legacy_get_tree+0xd4/0x16c [ 36.845173][ T5923] vfs_get_tree+0x90/0x274 [ 36.846137][ T5923] do_new_mount+0x25c/0x8c8 [ 36.847066][ T5923] path_mount+0x590/0xe58 [ 36.848147][ T5923] __arm64_sys_mount+0x45c/0x594 [ 36.849273][ T5923] invoke_syscall+0x98/0x2c0 [ 36.850421][ T5923] el0_svc_common+0x138/0x258 [ 36.851397][ T5923] do_el0_svc+0x64/0x198 [ 36.852398][ T5923] el0_svc+0x58/0x168 [ 36.853224][ T5923] el0t_64_sync_handler+0x84/0xf0 [ 36.854293][ T5923] el0t_64_sync+0x190/0x194 Calling '__v9fs_get_acl' method in 'v9fs_get_acl' creates the following chain of function calls: __v9fs_get_acl v9fs_fid_get_acl v9fs_fid_xattr_get p9_client_xattrwalk Function p9_client_xattrwalk accepts a pointer to u64-typed variable attr_size and puts some u64 value into it. However, after the executing the p9_client_xattrwalk, in some circumstances we assign the value of u64-typed variable 'attr_size' to the variable 'retval', which we will return. However, the type of 'retval' is ssize_t, and if the value of attr_size is larger than SSIZE_MAX, we will face the signed type overflow. If the overflow occurs, the result of v9fs_fid_xattr_get may be negative, but not classified as an error. When we try to allocate an acl with 'broken' size we receive an error, but don't process it. When we try to free this acl, we face the 'wild-memory-access' error (because it wasn't allocated). This patch will add new condition to the 'v9fs_fid_xattr_get' function, so it will return an EOVERFLOW error if the 'attr_size' is larger than SSIZE_MAX. In this version of the patch I simplified the condition. In previous (v2) version of the patch I removed explicit type conversion and added separate condition to check the possible overflow and return an error (in v1 version I've just modified the existing condition). Tested via syzkaller. Suggested-by: Christian Schoenebeck <linux_oss@crudebyte.com> Reported-by: syzbot+cb1d16facb3cc90de5fb@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=fbbef66d9e4d096242f3617de5d14d12705b4659 Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
2023-03-26Linux 6.3-rc4v6.3-rc4Linus Torvalds
2023-03-26Merge tag 'usb-6.3-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt driver fixes from Greg KH: "Here are a small set of USB and Thunderbolt driver fixes for reported problems and a documentation update, for 6.3-rc4. Included in here are: - documentation update for uvc gadget driver - small thunderbolt driver fixes - cdns3 driver fixes - dwc3 driver fixes - dwc2 driver fixes - chipidea driver fixes - typec driver fixes - onboard_usb_hub device id updates - quirk updates All of these have been in linux-next with no reported problems" * tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits) usb: dwc2: fix a race, don't power off/on phy for dual-role mode usb: dwc2: fix a devres leak in hw_enable upon suspend resume usb: chipidea: core: fix possible concurrent when switch role usb: chipdea: core: fix return -EINVAL if request role is the same with current role thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit thunderbolt: Disable interrupt auto clear for rings thunderbolt: Use const qualifier for `ring_interrupt_index` usb: gadget: Use correct endianness of the wLength field for WebUSB uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2 usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver usb: cdns3: Fix issue with using incorrect PCI device function usb: cdnsp: Fixes issue with redundant Status Stage MAINTAINERS: make me a reviewer of USB/IP thunderbolt: Use scale field when allocating USB3 bandwidth thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers thunderbolt: Call tb_check_quirks() after initializing adapters thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access thunderbolt: Fix memory leak in margining usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host" docs: usb: Add documentation for the UVC Gadget ...
2023-03-26Merge tag 'sched_urgent_for_v6.3_rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Borislav Petkov: - Fix a corner case where vruntime of a task is not being sanitized * tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Sanitize vruntime of entity being migrated
2023-03-26Merge tag 'perf_urgent_for_v6.3_rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Borislav Petkov: - Properly clear perf event status tracking in the AMD perf event overflow handler * tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/amd/core: Always clear status for idx
2023-03-26Merge tag 'core_urgent_for_v6.3_rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Borislav Petkov: - Do the delayed RCU wakeup for kthreads in the proper order so that former doesn't get ignored - A noinstr warning fix * tag 'core_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up entry: Fix noinstr warning in __enter_from_user_mode()
2023-03-26Merge tag 'x86_urgent_for_v6.3_rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a AMX ptrace self test - Prevent a false-positive warning when retrieving the (invalid) address of dynamic FPU features in their init state which are not saved in init_fpstate at all - Randomize per-CPU entry areas only when KASLR is enabled * tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/x86/amx: Add a ptrace test x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf() x86/mm: Do not shuffle CPU entry areas without KASLR
2023-03-26Merge tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs client fixes from Steve French: "Twelve cifs/smb3 client fixes (most also for stable) - forced umount fix - fix for two perf regressions - reconnect fixes - small debugging improvements - multichannel fixes" * tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6: smb3: fix unusable share after force unmount failure cifs: fix dentry lookups in directory handle cache smb3: lower default deferred close timeout to address perf regression cifs: fix missing unload_nls() in smb2_reconnect() cifs: avoid race conditions with parallel reconnects cifs: append path to open_enter trace event cifs: print session id while listing open files cifs: dump pending mids for all channels in DebugData cifs: empty interface list when server doesn't support query interfaces cifs: do not poll server interfaces too regularly cifs: lock chan_lock outside match_session cifs: check only tcon status on tcon related functions
2023-03-25xsk: allow remap of fill and/or completion ringsNuno Gonçalves
The remap of fill and completion rings was frowned upon as they control the usage of UMEM which does not support concurrent use. At the same time this would disallow the remap of these rings into another process. A possible use case is that the user wants to transfer the socket/ UMEM ownership to another process (via SYS_pidfd_getfd) and so would need to also remap these rings. This will have no impact on current usages and just relaxes the remap limitation. Signed-off-by: Nuno Gonçalves <nunog@fr24.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230324100222.13434-1-nunog@fr24.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25bpf, docs: Add extended call instructionsDave Thaler
Add extended call instructions. Uses the term "program-local" for call by offset. And there are instructions for calling helper functions by "address" (the old way of using integer values), and for calling helper functions by BTF ID (for kfuncs). V1 -> V2: addressed comments from David Vernet V2 -> V3: make descriptions in table consistent with updated names V3 -> V4: addressed comments from Alexei V4 -> V5: fixed alignment Signed-off-by: Dave Thaler <dthaler@microsoft.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230326033117.1075-1-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25Merge branch 'bpf: Use bpf_mem_cache_alloc/free in bpf_local_storage'Alexei Starovoitov
Martin KaFai Lau says: ==================== From: Martin KaFai Lau <martin.lau@kernel.org> This set is a continuation of the effort in using bpf_mem_cache_alloc/free in bpf_local_storage [1] Major change is only using bpf_mem_alloc for task and cgrp storage while sk and inode stay with kzalloc/kfree. The details is in patch 2. [1]: https://lore.kernel.org/bpf/20230308065936.1550103-1-martin.lau@linux.dev/ v3: - Only use bpf_mem_alloc for task and cgrp storage. - sk and inode storage stay with kzalloc/kfree. - Check NULL and add comments in bpf_mem_cache_raw_free() in patch 1. - Added test and benchmark for task storage. v2: - Added bpf_mem_cache_alloc_flags() and bpf_mem_cache_raw_free() to hide the internal data structure of the bpf allocator. - Fixed a typo bug in bpf_selem_free() - Simplified the test_local_storage test by directly using err returned from libbpf ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25selftests/bpf: Add bench for task storage creationMartin KaFai Lau
This patch adds a task storage benchmark to the existing local-storage-create benchmark. For task storage, ./bench --storage-type task --batch-size 32: bpf_ma: Summary: creates 30.456 ± 0.507k/s ( 30.456k/prod), 6.08 kmallocs/create no bpf_ma: Summary: creates 31.962 ± 0.486k/s ( 31.962k/prod), 6.13 kmallocs/create ./bench --storage-type task --batch-size 64: bpf_ma: Summary: creates 30.197 ± 1.476k/s ( 30.197k/prod), 6.08 kmallocs/create no bpf_ma: Summary: creates 31.103 ± 0.297k/s ( 31.103k/prod), 6.13 kmallocs/create Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20230322215246.1675516-6-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25selftests/bpf: Test task storage when local_storage->smap is NULLMartin KaFai Lau
The current sk storage test ensures the memory free works when the local_storage->smap is NULL. This patch adds a task storage test to ensure the memory free code path works when local_storage->smap is NULL. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20230322215246.1675516-5-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25bpf: Use bpf_mem_cache_alloc/free for bpf_local_storageMartin KaFai Lau
This patch uses bpf_mem_cache_alloc/free for allocating and freeing bpf_local_storage for task and cgroup storage. The changes are similar to the previous patch. A few things that worth to mention for bpf_local_storage: The local_storage is freed when the last selem is deleted. Before deleting a selem from local_storage, it needs to retrieve the local_storage->smap because the bpf_selem_unlink_storage_nolock() may have set it to NULL. Note that local_storage->smap may have already been NULL when the selem created this local_storage has been removed. In this case, call_rcu will be used to free the local_storage. Also, the bpf_ma (true or false) value is needed before calling bpf_local_storage_free(). The bpf_ma can either be obtained from the local_storage->smap (if available) or any of its selem's smap. A new helper check_storage_bpf_ma() is added to obtain bpf_ma for a deleting bpf_local_storage. When bpf_local_storage_alloc getting a reused memory, all fields are either in the correct values or will be initialized. 'cache[]' must already be all NULLs. 'list' must be empty. Others will be initialized. Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20230322215246.1675516-4-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25bpf: Use bpf_mem_cache_alloc/free in bpf_local_storage_elemMartin KaFai Lau
This patch uses bpf_mem_alloc for the task and cgroup local storage that the bpf prog can easily get a hold of the storage owner's PTR_TO_BTF_ID. eg. bpf_get_current_task_btf() can be used in some of the kmalloc code path which will cause deadlock/recursion. bpf_mem_cache_alloc is deadlock free and will solve a legit use case in [1]. For sk storage, its batch creation benchmark shows a few percent regression when the sk create/destroy batch size is larger than 32. The sk creation/destruction happens much more often and depends on external traffic. Considering it is hypothetical to be able to cause deadlock with sk storage, it can cross the bridge to use bpf_mem_alloc till a legit (ie. useful) use case comes up. For inode storage, bpf_local_storage_destroy() is called before waiting for a rcu gp and its memory cannot be reused immediately. inode stays with kmalloc/kfree after the rcu [or tasks_trace] gp. A 'bool bpf_ma' argument is added to bpf_local_storage_map_alloc(). Only task and cgroup storage have 'bpf_ma == true' which means to use bpf_mem_cache_alloc/free(). This patch only changes selem to use bpf_mem_alloc for task and cgroup. The next patch will change the local_storage to use bpf_mem_alloc also for task and cgroup. Here is some more details on the changes: * memory allocation: After bpf_mem_cache_alloc(), the SDATA(selem)->data is zero-ed because bpf_mem_cache_alloc() could return a reused selem. It is to keep the existing bpf_map_kzalloc() behavior. Only SDATA(selem)->data is zero-ed. SDATA(selem)->data is the visible part to the bpf prog. No need to use zero_map_value() to do the zeroing because bpf_selem_free(..., reuse_now = true) ensures no bpf prog is using the selem before returning the selem through bpf_mem_cache_free(). For the internal fields of selem, they will be initialized when linking to the new smap and the new local_storage. When 'bpf_ma == false', nothing changes in this patch. It will stay with the bpf_map_kzalloc(). * memory free: The bpf_selem_free() and bpf_selem_free_rcu() are modified to handle the bpf_ma == true case. For the common selem free path where its owner is also being destroyed, the mem is freed in bpf_local_storage_destroy(), the owner (task and cgroup) has gone through a rcu gp. The memory can be reused immediately, so bpf_local_storage_destroy() will call bpf_selem_free(..., reuse_now = true) which will do bpf_mem_cache_free() for immediate reuse consideration. An exception is the delete elem code path. The delete elem code path is called from the helper bpf_*_storage_delete() and the syscall bpf_map_delete_elem(). This path is an unusual case for local storage because the common use case is to have the local storage staying with its owner life time so that the bpf prog and the user space does not have to monitor the owner's destruction. For the delete elem path, the selem cannot be reused immediately because there could be bpf prog using it. It will call bpf_selem_free(..., reuse_now = false) and it will wait for a rcu tasks trace gp before freeing the elem. The rcu callback is changed to do bpf_mem_cache_raw_free() instead of kfree(). When 'bpf_ma == false', it should be the same as before. __bpf_selem_free() is added to do the kfree_rcu and call_tasks_trace_rcu(). A few words on the 'reuse_now == true'. When 'reuse_now == true', it is still racing with bpf_local_storage_map_free which is under rcu protection, so it still needs to wait for a rcu gp instead of kfree(). Otherwise, the selem may be reused by slab for a totally different struct while the bpf_local_storage_map_free() is still using it (as a rcu reader). For the inode case, there may be other rcu readers also. In short, when bpf_ma == false and reuse_now == true => vanilla rcu. [1]: https://lore.kernel.org/bpf/20221118190109.1512674-1-namhyung@kernel.org/ Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20230322215246.1675516-3-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25bpf: Add a few bpf mem allocator functionsMartin KaFai Lau
This patch adds a few bpf mem allocator functions which will be used in the bpf_local_storage in a later patch. bpf_mem_cache_alloc_flags(..., gfp_t flags) is added. When the flags == GFP_KERNEL, it will fallback to __alloc(..., GFP_KERNEL). bpf_local_storage knows its running context is sleepable (GFP_KERNEL) and provides a better guarantee on memory allocation. bpf_local_storage has some uncommon cases that its selem cannot be reused immediately. It handles its own rcu_head and goes through a rcu_trace gp and then free it. bpf_mem_cache_raw_free() is added for direct free purpose without leaking the LLIST_NODE_SZ internal knowledge. During free time, the 'struct bpf_mem_alloc *ma' is no longer available. However, the caller should know if it is percpu memory or not and it can call different raw_free functions. bpf_local_storage does not support percpu value, so only the non-percpu 'bpf_mem_cache_raw_free()' is added in this patch. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20230322215246.1675516-2-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25Merge branch 'First set of verifier/*.c migrated to inline assembly'Alexei Starovoitov
Eduard Zingerman says: ==================== This is a follow up for RFC [1]. It migrates a first batch of 38 verifier/*.c tests to inline assembly and use of ./test_progs for actual execution. The migration is done by a python script (see [2]). Each migrated verifier/xxx.c file is mapped to progs/verifier_xxx.c plus an entry in the prog_tests/verifier.c. One patch per each file. A few patches at the beginning of the patch-set extend test_loader with necessary functionality, mainly: - support for tests execution in unprivileged mode; - support for test runs for test programs. Migrated tests could be selected for execution using the following filter: ./test_progs -a verifier_* An example of the migrated test: SEC("xdp") __description("XDP pkt read, pkt_data' > pkt_end, corner case, good access") __success __retval(0) __flag(BPF_F_ANY_ALIGNMENT) __naked void end_corner_case_good_access_1(void) { asm volatile (" \ r2 = *(u32*)(r1 + %[xdp_md_data]); \ r3 = *(u32*)(r1 + %[xdp_md_data_end]); \ r1 = r2; \ r1 += 8; \ if r1 > r3 goto l0_%=; \ r0 = *(u64*)(r1 - 8); \ l0_%=: r0 = 0; \ exit; \ " : : __imm_const(xdp_md_data, offsetof(struct xdp_md, data)), __imm_const(xdp_md_data_end, offsetof(struct xdp_md, data_end)) : __clobber_all); } Changes compared to RFC: - test_loader.c is extended to support test program runs; - capabilities handling now matches behavior of test_verifier; - BPF_ST_MEM instructions are automatically replaced by BPF_STX_MEM instructions to overcome current clang limitations; - tests styling updates according to RFC feedback; - 38 migrated files are included instead of 1. I used the following means for testing: - migration tool itself has a set of self-tests; - migrated tests are passing; - manually compared each old/new file side-by-side. While doing side-by-side comparison I've noted a few defects in the original tests: - and.c: - One of the jump targets is off by one; - BPF_ST_MEM wrong OFF/IMM ordering; - array_access.c: - BPF_ST_MEM wrong OFF/IMM ordering; - value_or_null.c: - BPF_ST_MEM wrong OFF/IMM ordering. These defects would be addressed separately. [1] RFC https://lore.kernel.org/bpf/20230123145148.2791939-1-eddyz87@gmail.com/ [2] Migration tool https://github.com/eddyz87/verifier-tests-migrator ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25selftests/bpf: verifier/xdp.c converted to inline assemblyEduard Zingerman
Test verifier/xdp.c automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230325025524.144043-43-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25selftests/bpf: verifier/xadd.c converted to inline assemblyEduard Zingerman
Test verifier/xadd.c automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230325025524.144043-42-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25selftests/bpf: verifier/var_off.c converted to inline assemblyEduard Zingerman
Test verifier/var_off.c automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230325025524.144043-41-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25selftests/bpf: verifier/value_or_null.c converted to inline assemblyEduard Zingerman
Test verifier/value_or_null.c automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230325025524.144043-40-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-25selftests/bpf: verifier/value.c converted to inline assemblyEduard Zingerman
Test verifier/value.c automatically converted to use inline assembly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230325025524.144043-39-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>