Age | Commit message (Collapse) | Author |
|
The mac address initialization is braodly the same between PCI and SBUS,
and one was clearly copied from the other. Consolidate them. We still have
to have some ifdefs because pci_(un)map_rom is only implemented for PCI,
and idprom is only implemented for SPARC.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The PCI half of this driver was converted in commit 914d9b2711dd ("sunhme:
switch to devres"). Do the same for the SBUS half.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alphabetize includes to make it clearer where to add new ones.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of registering one interrupt handler for all four SBUS Quattro
HMEs, let each HME register its own handler. To make this work, we don't
handle the IRQ if none of the status bits are set. This reduces the
complexity of the driver, and makes it easier to ensure things happen
before/after enabling IRQs.
I'm not really sure why we request IRQs in two different places (and leave
them running after removing the driver!). A lot of things in this driver
seem to just be crusty, and not necessarily intentional. I'm assuming
that's the case here as well.
This really needs to be tested by someone with an SBUS Quattro card.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The sunhme driver never used the hardware MII polling feature. Even the
if-def'd out happy_meal_poll_start was removed by 2002 [1]. Remove the
various places in the driver which needlessly guard against MII interrupts
which will never be enabled.
[1] https://lwn.net/2002/0411/a/2.5.8-pre3.php3
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If we've tried regular autonegotiation and forcing the link mode, just
restart autonegotiation instead of reinitializing the whole NIC.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix an uninitialized return code if we never found a qfe slot. It would be
a bug if we ever got into this situation, but it's good to return something
tracable.
Fixes: acb3f35f920b ("sunhme: forward the error code from pci_enable_device()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Veerasenareddy Burru says:
====================
octeon_ep: deferred probe and mailbox
Implement Deferred probe, mailbox enhancements and heartbeat monitor.
v4 -> v5:
- addressed review comments
https://lore.kernel.org/all/20230323104703.GD36557@unreal/
replaced atomic_inc() + atomic_read() with atomic_inc_return().
v3 -> v4:
- addressed review comments on v3
https://lore.kernel.org/all/20230214051422.13705-1-vburru@marvell.com/
- 0004-xxx.patch v3 is split into 0004-xxx.patch and 0005-xxx.patch
in v4.
- API changes to accept function ID are moved to 0005-xxx.patch.
- fixed rct violations.
- reverted newly added changes that do not yet have use cases.
v2 -> v3:
- removed SRIOV VF support changes from v2, as new drivers which use
ndo_get_vf_xxx() and ndo_set_vf_xxx() are not accepted.
https://lore.kernel.org/all/20221207200204.6819575a@kernel.org/
Will implement VF representors and submit again.
- 0007-xxx.patch and 0008-xxx.patch from v2 are removed and
0009-xxx.patch in v2 is now 0007-xxx.patch in v3.
- accordingly, changed title for cover letter.
v1 -> v2:
- remove separate workqueue task to wait for firmware ready.
instead defer probe when firmware is not ready.
Reported-by: Leon Romanovsky <leon@kernel.org>
- This change has resulted in update of 0001-xxx.patch and
all other patches in the patchset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Monitor periodic heartbeat messages from device firmware.
Presence of heartbeat indicates the device is active and running.
If the heartbeat is missed for configured interval indicates
firmware has crashed and device is unusable; in this case, PF driver
stops and uninitialize the device.
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Update control mailbox API to include function id in get stats and
link info.
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add asynchronous notification support to the control mailbox.
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend control command structure to include vfid and
update APIs to accept VF ID.
Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enhance control mailbox protocol to support following
- separate command and response queues
* command queue to send control commands to firmware.
* response queue to receive responses and notifications from
firmware.
- variable size messages using scatter/gather
Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add control mailbox support for multiple PFs.
Update control mbox base address calculation based on PF function link.
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Poll for control messages until interrupts are enabled.
All the interrupts are enabled in ndo_open().
Add ability to listen for notifications from firmware before ndo_open().
Once interrupts are enabled, this polling is disabled and all the
messages are processed by bottom half of interrupt handler.
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defer probe if firmware is not ready for device usage.
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Signed-off-by: Abhijit Ayarekar <aayarekar@marvell.com>
Signed-off-by: Satananda Burla <sburla@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement phy_read16() and phy_write16() ops for B53 MMAP to avoid accessing
B53_PORT_MII_PAGE registers which hangs the device.
This access should be done through the MDIO Mux bus controller.
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Álvaro Fernández Rojas says:
====================
net: dsa: b53: mdio: add support for BCM53134
This is based on the initial work from Paul Geurts that was sent to the
incorrect linux development lists and recipients.
I've modified it by removing BCM53134_DEVICE_ID from is531x5() and therefore
adding is53134() where needed.
I also added a separate RGMII handling block for is53134() since according to
Paul, BCM53134 doesn't support RGMII_CTRL_TIMING_SEL as opposed to is531x5().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for the BCM53134 Ethernet switch in the existing b53 dsa driver.
BCM53134 is very similar to the BCM58XX series.
Signed-off-by: Paul Geurts <paul.geurts@prodrive-technologies.com>
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
BCM53134 are B53 switches connected by MDIO.
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The KSZ9131RNX incorrectly shows EEE capabilities in its registers.
Although the "EEE control and capability 1" (Register 3.20) is set to 0,
indicating no EEE support, the "EEE advertisement 1" (Register 7.60) is
set to 0x6, advertising EEE support for 1000BaseT/Full and
100BaseT/Full.
This inconsistency causes PHYlib to assume there is no EEE support,
preventing control over EEE advertisement, which is enabled by default.
This patch resolves the issue by utilizing the ksz9477_get_features()
function to correctly set the EEE capabilities for the KSZ9131RNX. This
adjustment allows proper control over EEE advertisement and ensures
accurate representation of the device's capabilities.
Fixes: 8b68710a3121 ("net: phy: start using genphy_c45_ethtool_get/set_eee()")
Reported-by: Marek Vasut <marex@denx.de>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
pkt_list_lock was used before commit 71dc9ec9ac7d ("virtio/vsock:
replace virtio_vsock_pkt with sk_buff") to protect the packet queue.
After that commit we switched to sk_buff and we are using
sk_buff_head.lock in almost every place to protect the packet queue
except in vsock_loopback_work() when we call skb_queue_splice_init().
As reported by syzbot, this caused unlocked concurrent access to the
packet queue between vsock_loopback_work() and
vsock_loopback_cancel_pkt() since it is not holding pkt_list_lock.
With the introduction of sk_buff_head, pkt_list_lock is redundant and
can cause confusion, so let's remove it and use sk_buff_head.lock
everywhere to protect the packet queue access.
Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
Cc: bobby.eshleman@bytedance.com
Reported-and-tested-by: syzbot+befff0a9536049e7902e@syzkaller.appspotmail.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
Reviewed-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Russell King says:
====================
Constify a few sfp/phy fwnodes
This series constifies a bunch of fwnode_handle pointers that are only
used to refer to but not modify the contents of the fwnode structures.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
fwnode_get_phy_node() does not motify the fwnode structure, so make
the argument const,
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Constify sfp-bus internal fwnode uses, since we do not modify the
fwnode structures.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sfp_bus_find_fwnode() does not write to the fwnode, so let's make it
const.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The failover txq is inited as 16 queues.
when a packet is transmitted from the failover device firstly,
the failover device will select the queue which is returned from
the primary device if the primary device is UP and running.
If the primary device txq is bigger than the default 16,
it can lead to the following warning:
eth0 selects TX queue 18, but real number of TX queues is 16
The warning backtrace is:
[ 32.146376] CPU: 18 PID: 9134 Comm: chronyd Tainted: G E 6.2.8-1.el7.centos.x86_64 #1
[ 32.147175] Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014
[ 32.147730] Call Trace:
[ 32.147971] <TASK>
[ 32.148183] dump_stack_lvl+0x48/0x70
[ 32.148514] dump_stack+0x10/0x20
[ 32.148820] netdev_core_pick_tx+0xb1/0xe0
[ 32.149180] __dev_queue_xmit+0x529/0xcf0
[ 32.149533] ? __check_object_size.part.0+0x21c/0x2c0
[ 32.149967] ip_finish_output2+0x278/0x560
[ 32.150327] __ip_finish_output+0x1fe/0x2f0
[ 32.150690] ip_finish_output+0x2a/0xd0
[ 32.151032] ip_output+0x7a/0x110
[ 32.151337] ? __pfx_ip_finish_output+0x10/0x10
[ 32.151733] ip_local_out+0x5e/0x70
[ 32.152054] ip_send_skb+0x19/0x50
[ 32.152366] udp_send_skb.isra.0+0x163/0x3a0
[ 32.152736] udp_sendmsg+0xba8/0xec0
[ 32.153060] ? __folio_memcg_unlock+0x25/0x60
[ 32.153445] ? __pfx_ip_generic_getfrag+0x10/0x10
[ 32.153854] ? sock_has_perm+0x85/0xa0
[ 32.154190] inet_sendmsg+0x6d/0x80
[ 32.154508] ? inet_sendmsg+0x6d/0x80
[ 32.154838] sock_sendmsg+0x62/0x70
[ 32.155152] ____sys_sendmsg+0x134/0x290
[ 32.155499] ___sys_sendmsg+0x81/0xc0
[ 32.155828] ? _get_random_bytes.part.0+0x79/0x1a0
[ 32.156240] ? ip4_datagram_release_cb+0x5f/0x1e0
[ 32.156649] ? get_random_u16+0x69/0xf0
[ 32.156989] ? __fget_light+0xcf/0x110
[ 32.157326] __sys_sendmmsg+0xc4/0x210
[ 32.157657] ? __sys_connect+0xb7/0xe0
[ 32.157995] ? __audit_syscall_entry+0xce/0x140
[ 32.158388] ? syscall_trace_enter.isra.0+0x12c/0x1a0
[ 32.158820] __x64_sys_sendmmsg+0x24/0x30
[ 32.159171] do_syscall_64+0x38/0x90
[ 32.159493] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Fix that by reducing txq number as the non-existent primary-dev does.
Fixes: cfc80d9a1163 ("net: Introduce net_failover driver")
Signed-off-by: Faicker Mo <faicker.mo@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
devm_clk_get() can return -EPROBE_DEFER. So it is better to return the
error code from devm_clk_get(), instead of a hard coded -ENOENT.
This gives more opportunities to successfully probe the driver.
Fixes: 8959e5324485 ("regulator: fixed: add possibility to enable by clock")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/18459fae3d017a66313699c7c8456b28158b2dd0.1679819354.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
KASAN reported the following issue:
[ 36.825817][ T5923] BUG: KASAN: wild-memory-access in v9fs_get_acl+0x1a4/0x390
[ 36.827479][ T5923] Write of size 4 at addr 9fffeb37f97f1c00 by task syz-executor798/5923
[ 36.829303][ T5923]
[ 36.829846][ T5923] CPU: 0 PID: 5923 Comm: syz-executor798 Not tainted 6.2.0-syzkaller-18302-g596b6b709632 #0
[ 36.832110][ T5923] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
[ 36.834464][ T5923] Call trace:
[ 36.835196][ T5923] dump_backtrace+0x1c8/0x1f4
[ 36.836229][ T5923] show_stack+0x2c/0x3c
[ 36.837100][ T5923] dump_stack_lvl+0xd0/0x124
[ 36.838103][ T5923] print_report+0xe4/0x4c0
[ 36.839068][ T5923] kasan_report+0xd4/0x130
[ 36.840052][ T5923] kasan_check_range+0x264/0x2a4
[ 36.841199][ T5923] __kasan_check_write+0x2c/0x3c
[ 36.842216][ T5923] v9fs_get_acl+0x1a4/0x390
[ 36.843232][ T5923] v9fs_mount+0x77c/0xa5c
[ 36.844163][ T5923] legacy_get_tree+0xd4/0x16c
[ 36.845173][ T5923] vfs_get_tree+0x90/0x274
[ 36.846137][ T5923] do_new_mount+0x25c/0x8c8
[ 36.847066][ T5923] path_mount+0x590/0xe58
[ 36.848147][ T5923] __arm64_sys_mount+0x45c/0x594
[ 36.849273][ T5923] invoke_syscall+0x98/0x2c0
[ 36.850421][ T5923] el0_svc_common+0x138/0x258
[ 36.851397][ T5923] do_el0_svc+0x64/0x198
[ 36.852398][ T5923] el0_svc+0x58/0x168
[ 36.853224][ T5923] el0t_64_sync_handler+0x84/0xf0
[ 36.854293][ T5923] el0t_64_sync+0x190/0x194
Calling '__v9fs_get_acl' method in 'v9fs_get_acl' creates the
following chain of function calls:
__v9fs_get_acl
v9fs_fid_get_acl
v9fs_fid_xattr_get
p9_client_xattrwalk
Function p9_client_xattrwalk accepts a pointer to u64-typed
variable attr_size and puts some u64 value into it. However,
after the executing the p9_client_xattrwalk, in some circumstances
we assign the value of u64-typed variable 'attr_size' to the
variable 'retval', which we will return. However, the type of
'retval' is ssize_t, and if the value of attr_size is larger
than SSIZE_MAX, we will face the signed type overflow. If the
overflow occurs, the result of v9fs_fid_xattr_get may be
negative, but not classified as an error. When we try to allocate
an acl with 'broken' size we receive an error, but don't process
it. When we try to free this acl, we face the 'wild-memory-access'
error (because it wasn't allocated).
This patch will add new condition to the 'v9fs_fid_xattr_get'
function, so it will return an EOVERFLOW error if the 'attr_size'
is larger than SSIZE_MAX.
In this version of the patch I simplified the condition.
In previous (v2) version of the patch I removed explicit type conversion
and added separate condition to check the possible overflow and return
an error (in v1 version I've just modified the existing condition).
Tested via syzkaller.
Suggested-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Reported-by: syzbot+cb1d16facb3cc90de5fb@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=fbbef66d9e4d096242f3617de5d14d12705b4659
Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt driver fixes from Greg KH:
"Here are a small set of USB and Thunderbolt driver fixes for reported
problems and a documentation update, for 6.3-rc4.
Included in here are:
- documentation update for uvc gadget driver
- small thunderbolt driver fixes
- cdns3 driver fixes
- dwc3 driver fixes
- dwc2 driver fixes
- chipidea driver fixes
- typec driver fixes
- onboard_usb_hub device id updates
- quirk updates
All of these have been in linux-next with no reported problems"
* tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
usb: dwc2: fix a race, don't power off/on phy for dual-role mode
usb: dwc2: fix a devres leak in hw_enable upon suspend resume
usb: chipidea: core: fix possible concurrent when switch role
usb: chipdea: core: fix return -EINVAL if request role is the same with current role
thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
thunderbolt: Disable interrupt auto clear for rings
thunderbolt: Use const qualifier for `ring_interrupt_index`
usb: gadget: Use correct endianness of the wLength field for WebUSB
uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver
usb: cdns3: Fix issue with using incorrect PCI device function
usb: cdnsp: Fixes issue with redundant Status Stage
MAINTAINERS: make me a reviewer of USB/IP
thunderbolt: Use scale field when allocating USB3 bandwidth
thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
thunderbolt: Call tb_check_quirks() after initializing adapters
thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
thunderbolt: Fix memory leak in margining
usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host"
docs: usb: Add documentation for the UVC Gadget
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:
- Fix a corner case where vruntime of a task is not being sanitized
* tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Sanitize vruntime of entity being migrated
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:
- Properly clear perf event status tracking in the AMD perf event
overflow handler
* tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/core: Always clear status for idx
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Borislav Petkov:
- Do the delayed RCU wakeup for kthreads in the proper order so that
former doesn't get ignored
- A noinstr warning fix
* tag 'core_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up
entry: Fix noinstr warning in __enter_from_user_mode()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Add a AMX ptrace self test
- Prevent a false-positive warning when retrieving the (invalid)
address of dynamic FPU features in their init state which are not
saved in init_fpstate at all
- Randomize per-CPU entry areas only when KASLR is enabled
* tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/x86/amx: Add a ptrace test
x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
x86/mm: Do not shuffle CPU entry areas without KASLR
|
|
Pull cifs client fixes from Steve French:
"Twelve cifs/smb3 client fixes (most also for stable)
- forced umount fix
- fix for two perf regressions
- reconnect fixes
- small debugging improvements
- multichannel fixes"
* tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6:
smb3: fix unusable share after force unmount failure
cifs: fix dentry lookups in directory handle cache
smb3: lower default deferred close timeout to address perf regression
cifs: fix missing unload_nls() in smb2_reconnect()
cifs: avoid race conditions with parallel reconnects
cifs: append path to open_enter trace event
cifs: print session id while listing open files
cifs: dump pending mids for all channels in DebugData
cifs: empty interface list when server doesn't support query interfaces
cifs: do not poll server interfaces too regularly
cifs: lock chan_lock outside match_session
cifs: check only tcon status on tcon related functions
|
|
The remap of fill and completion rings was frowned upon as they
control the usage of UMEM which does not support concurrent use.
At the same time this would disallow the remap of these rings
into another process.
A possible use case is that the user wants to transfer the socket/
UMEM ownership to another process (via SYS_pidfd_getfd) and so
would need to also remap these rings.
This will have no impact on current usages and just relaxes the
remap limitation.
Signed-off-by: Nuno Gonçalves <nunog@fr24.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20230324100222.13434-1-nunog@fr24.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add extended call instructions. Uses the term "program-local" for
call by offset. And there are instructions for calling helper functions
by "address" (the old way of using integer values), and for calling
helper functions by BTF ID (for kfuncs).
V1 -> V2: addressed comments from David Vernet
V2 -> V3: make descriptions in table consistent with updated names
V3 -> V4: addressed comments from Alexei
V4 -> V5: fixed alignment
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230326033117.1075-1-dthaler1968@googlemail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Martin KaFai Lau says:
====================
From: Martin KaFai Lau <martin.lau@kernel.org>
This set is a continuation of the effort in using
bpf_mem_cache_alloc/free in bpf_local_storage [1]
Major change is only using bpf_mem_alloc for task and cgrp storage
while sk and inode stay with kzalloc/kfree. The details is
in patch 2.
[1]: https://lore.kernel.org/bpf/20230308065936.1550103-1-martin.lau@linux.dev/
v3:
- Only use bpf_mem_alloc for task and cgrp storage.
- sk and inode storage stay with kzalloc/kfree.
- Check NULL and add comments in bpf_mem_cache_raw_free() in patch 1.
- Added test and benchmark for task storage.
v2:
- Added bpf_mem_cache_alloc_flags() and bpf_mem_cache_raw_free()
to hide the internal data structure of the bpf allocator.
- Fixed a typo bug in bpf_selem_free()
- Simplified the test_local_storage test by directly using
err returned from libbpf
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds a task storage benchmark to the existing
local-storage-create benchmark.
For task storage,
./bench --storage-type task --batch-size 32:
bpf_ma: Summary: creates 30.456 ± 0.507k/s ( 30.456k/prod), 6.08 kmallocs/create
no bpf_ma: Summary: creates 31.962 ± 0.486k/s ( 31.962k/prod), 6.13 kmallocs/create
./bench --storage-type task --batch-size 64:
bpf_ma: Summary: creates 30.197 ± 1.476k/s ( 30.197k/prod), 6.08 kmallocs/create
no bpf_ma: Summary: creates 31.103 ± 0.297k/s ( 31.103k/prod), 6.13 kmallocs/create
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-6-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The current sk storage test ensures the memory free works when
the local_storage->smap is NULL.
This patch adds a task storage test to ensure the memory free
code path works when local_storage->smap is NULL.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-5-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch uses bpf_mem_cache_alloc/free for allocating and freeing
bpf_local_storage for task and cgroup storage.
The changes are similar to the previous patch. A few things that
worth to mention for bpf_local_storage:
The local_storage is freed when the last selem is deleted.
Before deleting a selem from local_storage, it needs to retrieve the
local_storage->smap because the bpf_selem_unlink_storage_nolock()
may have set it to NULL. Note that local_storage->smap may have
already been NULL when the selem created this local_storage has
been removed. In this case, call_rcu will be used to free the
local_storage.
Also, the bpf_ma (true or false) value is needed before calling
bpf_local_storage_free(). The bpf_ma can either be obtained from
the local_storage->smap (if available) or any of its selem's smap.
A new helper check_storage_bpf_ma() is added to obtain
bpf_ma for a deleting bpf_local_storage.
When bpf_local_storage_alloc getting a reused memory, all
fields are either in the correct values or will be initialized.
'cache[]' must already be all NULLs. 'list' must be empty.
Others will be initialized.
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-4-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch uses bpf_mem_alloc for the task and cgroup local storage that
the bpf prog can easily get a hold of the storage owner's PTR_TO_BTF_ID.
eg. bpf_get_current_task_btf() can be used in some of the kmalloc code
path which will cause deadlock/recursion. bpf_mem_cache_alloc is
deadlock free and will solve a legit use case in [1].
For sk storage, its batch creation benchmark shows a few percent
regression when the sk create/destroy batch size is larger than 32.
The sk creation/destruction happens much more often and
depends on external traffic. Considering it is hypothetical
to be able to cause deadlock with sk storage, it can cross
the bridge to use bpf_mem_alloc till a legit (ie. useful)
use case comes up.
For inode storage, bpf_local_storage_destroy() is called before
waiting for a rcu gp and its memory cannot be reused immediately.
inode stays with kmalloc/kfree after the rcu [or tasks_trace] gp.
A 'bool bpf_ma' argument is added to bpf_local_storage_map_alloc().
Only task and cgroup storage have 'bpf_ma == true' which
means to use bpf_mem_cache_alloc/free(). This patch only changes
selem to use bpf_mem_alloc for task and cgroup. The next patch
will change the local_storage to use bpf_mem_alloc also for
task and cgroup.
Here is some more details on the changes:
* memory allocation:
After bpf_mem_cache_alloc(), the SDATA(selem)->data is zero-ed because
bpf_mem_cache_alloc() could return a reused selem. It is to keep
the existing bpf_map_kzalloc() behavior. Only SDATA(selem)->data
is zero-ed. SDATA(selem)->data is the visible part to the bpf prog.
No need to use zero_map_value() to do the zeroing because
bpf_selem_free(..., reuse_now = true) ensures no bpf prog is using
the selem before returning the selem through bpf_mem_cache_free().
For the internal fields of selem, they will be initialized when
linking to the new smap and the new local_storage.
When 'bpf_ma == false', nothing changes in this patch. It will
stay with the bpf_map_kzalloc().
* memory free:
The bpf_selem_free() and bpf_selem_free_rcu() are modified to handle
the bpf_ma == true case.
For the common selem free path where its owner is also being destroyed,
the mem is freed in bpf_local_storage_destroy(), the owner (task
and cgroup) has gone through a rcu gp. The memory can be reused
immediately, so bpf_local_storage_destroy() will call
bpf_selem_free(..., reuse_now = true) which will do
bpf_mem_cache_free() for immediate reuse consideration.
An exception is the delete elem code path. The delete elem code path
is called from the helper bpf_*_storage_delete() and the syscall
bpf_map_delete_elem(). This path is an unusual case for local
storage because the common use case is to have the local storage
staying with its owner life time so that the bpf prog and the user
space does not have to monitor the owner's destruction. For the delete
elem path, the selem cannot be reused immediately because there could
be bpf prog using it. It will call bpf_selem_free(..., reuse_now = false)
and it will wait for a rcu tasks trace gp before freeing the elem. The
rcu callback is changed to do bpf_mem_cache_raw_free() instead of kfree().
When 'bpf_ma == false', it should be the same as before.
__bpf_selem_free() is added to do the kfree_rcu and call_tasks_trace_rcu().
A few words on the 'reuse_now == true'. When 'reuse_now == true',
it is still racing with bpf_local_storage_map_free which is under rcu
protection, so it still needs to wait for a rcu gp instead of kfree().
Otherwise, the selem may be reused by slab for a totally different struct
while the bpf_local_storage_map_free() is still using it (as a
rcu reader). For the inode case, there may be other rcu readers also.
In short, when bpf_ma == false and reuse_now == true => vanilla rcu.
[1]: https://lore.kernel.org/bpf/20221118190109.1512674-1-namhyung@kernel.org/
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-3-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds a few bpf mem allocator functions which will
be used in the bpf_local_storage in a later patch.
bpf_mem_cache_alloc_flags(..., gfp_t flags) is added. When the
flags == GFP_KERNEL, it will fallback to __alloc(..., GFP_KERNEL).
bpf_local_storage knows its running context is sleepable (GFP_KERNEL)
and provides a better guarantee on memory allocation.
bpf_local_storage has some uncommon cases that its selem
cannot be reused immediately. It handles its own
rcu_head and goes through a rcu_trace gp and then free it.
bpf_mem_cache_raw_free() is added for direct free purpose
without leaking the LLIST_NODE_SZ internal knowledge.
During free time, the 'struct bpf_mem_alloc *ma' is no longer
available. However, the caller should know if it is
percpu memory or not and it can call different raw_free functions.
bpf_local_storage does not support percpu value, so only
the non-percpu 'bpf_mem_cache_raw_free()' is added in
this patch.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230322215246.1675516-2-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Eduard Zingerman says:
====================
This is a follow up for RFC [1]. It migrates a first batch of 38
verifier/*.c tests to inline assembly and use of ./test_progs for
actual execution. The migration is done by a python script (see [2]).
Each migrated verifier/xxx.c file is mapped to progs/verifier_xxx.c
plus an entry in the prog_tests/verifier.c. One patch per each file.
A few patches at the beginning of the patch-set extend test_loader
with necessary functionality, mainly:
- support for tests execution in unprivileged mode;
- support for test runs for test programs.
Migrated tests could be selected for execution using the following filter:
./test_progs -a verifier_*
An example of the migrated test:
SEC("xdp")
__description("XDP pkt read, pkt_data' > pkt_end, corner case, good access")
__success __retval(0) __flag(BPF_F_ANY_ALIGNMENT)
__naked void end_corner_case_good_access_1(void)
{
asm volatile (" \
r2 = *(u32*)(r1 + %[xdp_md_data]); \
r3 = *(u32*)(r1 + %[xdp_md_data_end]); \
r1 = r2; \
r1 += 8; \
if r1 > r3 goto l0_%=; \
r0 = *(u64*)(r1 - 8); \
l0_%=: r0 = 0; \
exit; \
" :
: __imm_const(xdp_md_data, offsetof(struct xdp_md, data)),
__imm_const(xdp_md_data_end, offsetof(struct xdp_md, data_end))
: __clobber_all);
}
Changes compared to RFC:
- test_loader.c is extended to support test program runs;
- capabilities handling now matches behavior of test_verifier;
- BPF_ST_MEM instructions are automatically replaced by BPF_STX_MEM
instructions to overcome current clang limitations;
- tests styling updates according to RFC feedback;
- 38 migrated files are included instead of 1.
I used the following means for testing:
- migration tool itself has a set of self-tests;
- migrated tests are passing;
- manually compared each old/new file side-by-side.
While doing side-by-side comparison I've noted a few defects in the
original tests:
- and.c:
- One of the jump targets is off by one;
- BPF_ST_MEM wrong OFF/IMM ordering;
- array_access.c:
- BPF_ST_MEM wrong OFF/IMM ordering;
- value_or_null.c:
- BPF_ST_MEM wrong OFF/IMM ordering.
These defects would be addressed separately.
[1] RFC
https://lore.kernel.org/bpf/20230123145148.2791939-1-eddyz87@gmail.com/
[2] Migration tool
https://github.com/eddyz87/verifier-tests-migrator
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test verifier/xdp.c automatically converted to use inline assembly.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-43-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test verifier/xadd.c automatically converted to use inline assembly.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-42-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test verifier/var_off.c automatically converted to use inline assembly.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-41-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test verifier/value_or_null.c automatically converted to use inline assembly.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-40-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test verifier/value.c automatically converted to use inline assembly.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230325025524.144043-39-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|