Age | Commit message (Collapse) | Author |
|
Recent changes to count number of matching symbols when creating
a kprobe event failed to take into account kernel modules. As such, it
breaks kprobes on kernel module symbols, by assuming there is no match.
Fix this my calling module_kallsyms_on_each_symbol() in addition to
kallsyms_on_each_match_symbol() to perform a proper counting.
Link: https://lore.kernel.org/all/20231027233126.2073148-1-andrii@kernel.org/
Cc: Francis Laniel <flaniel@linux.microsoft.com>
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Fixes: b022f0c7e404 ("tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Use of dget() after we'd dropped ->d_lock is too late - dentry might
be gone by that point.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
failed
->ki_pos value is unreliable in such cases. For an obvious example,
consider O_DSYNC write - we feed the data to page cache and start IO,
then we make sure it's completed. Update of ->ki_pos is dealt with
by the first part; failure in the second ends up with negative value
returned _and_ ->ki_pos left advanced as if sync had been successful.
In the same situation write(2) does not advance the file position
at all.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull io_uring fixes from Jens Axboe:
"Fix for an issue reported where reading fdinfo could find a NULL
thread as we didn't properly synchronize, and then a disable for the
IOCB_DIO_CALLER_COMP optimization as a recent reported highlighted how
that could lead to deadlocks if the task issued async O_DIRECT writes
and then proceeded to do sync fallocate() calls"
* tag 'io_uring-6.6-2023-10-27' of git://git.kernel.dk/linux:
io_uring/rw: disable IOCB_DIO_CALLER_COMP
io_uring/fdinfo: lock SQ thread while retrieving thread cpu/pid
|
|
Fault handler used to make non-trivial calls, so it needed
to set a stack frame up. Used to be
save ... - grab a stack frame, old %o... become %i...
....
ret - go back to address originally in %o7, currently %i7
restore - switch to previous stack frame, in delay slot
Non-trivial calls had been gone since ab5e8b331244 and that code should
have become
retl - go back to address in %o7
clr %o0 - have return value set to 0
What it had become instead was
ret - go back to address in %i7 - return address of *caller*
clr %o0 - have return value set to 0
which is not good, to put it mildly - we forcibly return 0 from
csum_and_copy_{from,to}_iter() (which is what the call of that
thing had been inlined into) and do that without dropping the
stack frame of said csum_and_copy_..._iter(). Confuses the
hell out of the caller of csum_and_copy_..._iter(), obviously...
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Fixes: ab5e8b331244 "sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull block fix from Jens Axboe:
"Just a single fix for a potential divide-by-zero, introduced in this
cycle"
* tag 'block-6.6-2023-10-27' of git://git.kernel.dk/linux:
blk-throttle: check for overflow in calculate_bytes_allowed
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA fix from Damien Le Moal:
"A single patch to fix a regression introduced by the recent
suspend/resume fixes.
The regression is that ATA disks are not stopped on system shutdown,
which is not recommended and increases the disks SMART counters for
unclean power off events.
This patch fixes this by refining the recent rework of the scsi device
manage_xxx flags"
* tag 'ata-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
scsi: sd: Introduce manage_shutdown device flag
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fix from Hans de Goede:
"A single patch to extend the AMD PMC driver DMI quirk list
for laptops which need special handling to avoid NVME s2idle
suspend/resume errors"
* tag 'platform-drivers-x86-v6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: Add s2idle quirk for more Lenovo laptops
|
|
Add DW_2500BASEX case in xpcs_get_state( ) to update speed, duplex and pause
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20231027044306.291250-1-Raju.Lakkaraju@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With latest sync from net-next tree, bpf-next has a bpf selftest failure:
[root@arch-fb-vm1 bpf]# ./test_progs -t setget_sockopt
...
[ 76.194349] ============================================
[ 76.194682] WARNING: possible recursive locking detected
[ 76.195039] 6.6.0-rc7-g37884503df08-dirty #67 Tainted: G W OE
[ 76.195518] --------------------------------------------
[ 76.195852] new_name/154 is trying to acquire lock:
[ 76.196159] ffff8c3e06ad8d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: ip_sock_set_tos+0x19/0x30
[ 76.196669]
[ 76.196669] but task is already holding lock:
[ 76.197028] ffff8c3e06ad8d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_listen+0x21/0x70
[ 76.197517]
[ 76.197517] other info that might help us debug this:
[ 76.197919] Possible unsafe locking scenario:
[ 76.197919]
[ 76.198287] CPU0
[ 76.198444] ----
[ 76.198600] lock(sk_lock-AF_INET);
[ 76.198831] lock(sk_lock-AF_INET);
[ 76.199062]
[ 76.199062] *** DEADLOCK ***
[ 76.199062]
[ 76.199420] May be due to missing lock nesting notation
[ 76.199420]
[ 76.199879] 2 locks held by new_name/154:
[ 76.200131] #0: ffff8c3e06ad8d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_listen+0x21/0x70
[ 76.200644] #1: ffffffff90f96a40 (rcu_read_lock){....}-{1:2}, at: __cgroup_bpf_run_filter_sock_ops+0x55/0x290
[ 76.201268]
[ 76.201268] stack backtrace:
[ 76.201538] CPU: 4 PID: 154 Comm: new_name Tainted: G W OE 6.6.0-rc7-g37884503df08-dirty #67
[ 76.202134] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 76.202699] Call Trace:
[ 76.202858] <TASK>
[ 76.203002] dump_stack_lvl+0x4b/0x80
[ 76.203239] __lock_acquire+0x740/0x1ec0
[ 76.203503] lock_acquire+0xc1/0x2a0
[ 76.203766] ? ip_sock_set_tos+0x19/0x30
[ 76.204050] ? sk_stream_write_space+0x12a/0x230
[ 76.204389] ? lock_release+0xbe/0x260
[ 76.204661] lock_sock_nested+0x32/0x80
[ 76.204942] ? ip_sock_set_tos+0x19/0x30
[ 76.205208] ip_sock_set_tos+0x19/0x30
[ 76.205452] do_ip_setsockopt+0x4b3/0x1580
[ 76.205719] __bpf_setsockopt+0x62/0xa0
[ 76.205963] bpf_sock_ops_setsockopt+0x11/0x20
[ 76.206247] bpf_prog_630217292049c96e_bpf_test_sockopt_int+0xbc/0x123
[ 76.206660] bpf_prog_493685a3bae00bbd_bpf_test_ip_sockopt+0x49/0x4b
[ 76.207055] bpf_prog_b0bcd27f269aeea0_skops_sockopt+0x44c/0xec7
[ 76.207437] __cgroup_bpf_run_filter_sock_ops+0xda/0x290
[ 76.207829] __inet_listen_sk+0x108/0x1b0
[ 76.208122] inet_listen+0x48/0x70
[ 76.208373] __sys_listen+0x74/0xb0
[ 76.208630] __x64_sys_listen+0x16/0x20
[ 76.208911] do_syscall_64+0x3f/0x90
[ 76.209174] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
...
Both ip_sock_set_tos() and inet_listen() calls lock_sock(sk) which
caused a dead lock.
To fix the issue, use sockopt_lock_sock() in ip_sock_set_tos()
instead. sockopt_lock_sock() will avoid lock_sock() if it is in bpf
context.
Fixes: 878d951c6712 ("inet: lock the socket in ip_sock_set_tos()")
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231027182424.1444845-1-yonghong.song@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch uses a helper function for assignment of xdp_features.
This change simplifies backports.
Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1698430011-21562-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch is basically a followup to commit 4e4b1798cc90 ("vxlan: Add
missing entries to vxlan_get_size()"). All of the attributes in
vxlan_get_size() appear in the same order that they are filled in
vxlan_fill_info() except for IFLA_VXLAN_PORT_RANGE. For consistency, move
that entry to match its order and add a comment, like for all other
entries.
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20231027184410.236671-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jacob Keller says:
====================
Intel Wired LAN Driver Updates for 2023-10-23 (iavf)
This series includes iAVF driver cleanups from Michal Schmidt.
Michal removes and updates stale comments, fixes some locking anti-patterns,
improves handling of resets when the PF is slow, avoids unnecessary
duplication of netdev state, refactors away some duplicate code, and finally
removes the never-actually-used client interface.
Changes since v1:
* Dropped patch ("iavf: in iavf_down, disable queues when removing the
driver") which was applied directly to net.
* Fixed a merge conflict due to 7db311104388 ("iavf: initialize waitqueues
before starting watchdog_task").
V1 was originally posted at:
https://lore.kernel.org/netdev/20231027104109.4f536f51@kernel.org/T/#mfadbdb39313eeccc616fdee80a4fdd6bda7e2822
====================
Link: https://lore.kernel.org/r/20231027175941.1340255-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The iavf client interface was added in 2017 by commit ed0e894de7c1
("i40evf: add client interface"), but there have never been any in-tree
callers.
It's not useful for future development either. The Intel out-of-tree
iavf and irdma drivers instead use an auxiliary bus, which is a better
solution.
Remove the iavf client interface code. Also gone are the client_task
work and the client_lock mutex.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231027175941.1340255-9-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new function iavf_free_interrupt_scheme that does the inverse of
iavf_init_interrupt_scheme. Symmetry is nice. And there will be three
callers already.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231027175941.1340255-8-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use unregister_netdev, which takes rtnl_lock for us. We don't have to
check the reg_state under rtnl_lock. There's nothing to race with. We
have just cancelled the finish_config work.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231027175941.1340255-7-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The information whether a netdev has been registered is already present
in the netdev itself. There's no need for a driver flag with the same
meaning.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231027175941.1340255-6-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Every time I create VFs on ice, I receive at least one "Device is still
in reset (-16), retrying" message per VF. It recovers fine, but typical
usecases should not trigger scary-looking messages.
The waiting for reset is too short. It makes no sense to check every 10
microseconds. Typical reset waiting times are at least tens of
milliseconds and can be several seconds. I suspect the polling interval
was meant to be 10 milliseconds all along.
IAVF_RESET_WAIT_COMPLETE_COUNT is defined as 2000, so the total waiting
time could be over 20 seconds. I have seen resets take 5 seconds (with
128 VFs on ice).
The added benefit of not triggering the "Device is still in reset" path
is that we avoid going through the __IAVF_INIT_FAILED state, which would
take a full second before retrying.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231027175941.1340255-5-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The reason for queueing watchdog_task is to have it process the
aq_required flags that are being set here. If comms failed, there's
nothing to do, so return early.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231027175941.1340255-4-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This pattern appears in two places in the iavf source code:
while (!mutex_trylock(...))
usleep_range(...);
That's just mutex_lock with extra steps.
The pattern is a leftover from when iavf used bit flags instead of
mutexes for locking. Commit 5ac49f3c2702 ("iavf: use mutexes for locking
of critical sections") replaced test_and_set_bit with !mutex_trylock,
preserving the pattern.
Simplify it to mutex_lock.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231027175941.1340255-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Bit lock __IAVF_IN_CRITICAL_TASK does not exist anymore since commit
5ac49f3c2702 ("iavf: use mutexes for locking of critical sections").
Adjust the comments accordingly.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231027175941.1340255-2-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
allow specifying cmd-cnt-name and cmd-max-name in netlink specs, in
accordance with Documentation/userspace-api/netlink/c-code-gen.rst.
Use cmd-cnt-name and attr-cnt-name in the mptcp yaml spec and in the
corresponding uAPI headers, to preserve the #defines we had in the past
and avoid adding new ones.
v2:
- squash modification in mptcp.yaml and MPTCP uAPI headers
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/12d4ed0116d8883cf4b533b856f3125a34e56749.1698415310.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In case the kernel sends message back containing attribute not defined
in family spec, following exception is raised to the user:
$ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/devlink.yaml --do trap-get --json '{"bus-name": "netdevsim", "dev-name": "netdevsim1", "trap-name": "source_mac_is_multicast"}'
Traceback (most recent call last):
File "/home/jiri/work/linux/tools/net/ynl/lib/ynl.py", line 521, in _decode
attr_spec = attr_space.attrs_by_val[attr.type]
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
KeyError: 132
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/jiri/work/linux/./tools/net/ynl/cli.py", line 61, in <module>
main()
File "/home/jiri/work/linux/./tools/net/ynl/cli.py", line 49, in main
reply = ynl.do(args.do, attrs, args.flags)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jiri/work/linux/tools/net/ynl/lib/ynl.py", line 731, in do
return self._op(method, vals, flags)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jiri/work/linux/tools/net/ynl/lib/ynl.py", line 719, in _op
rsp_msg = self._decode(decoded.raw_attrs, op.attr_set.name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jiri/work/linux/tools/net/ynl/lib/ynl.py", line 525, in _decode
raise Exception(f"Space '{space}' has no attribute with value '{attr.type}'")
Exception: Space 'devlink' has no attribute with value '132'
Introduce a command line option "process-unknown" and pass it down to
YnlFamily class constructor to allow user to process unknown
attributes and types and print them as binaries.
$ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/devlink.yaml --do trap-get --json '{"bus-name": "netdevsim", "dev-name": "netdevsim1", "trap-name": "source_mac_is_multicast"}' --process-unknown
{'UnknownAttr(129)': {'UnknownAttr(0)': b'\x00\x00\x00\x00\x00\x00\x00\x00',
'UnknownAttr(1)': b'\x00\x00\x00\x00\x00\x00\x00\x00',
'UnknownAttr(2)': b'\x0e\x00\x00\x00\x00\x00\x00\x00'},
'UnknownAttr(132)': b'\x00',
'UnknownAttr(133)': b'',
'UnknownAttr(134)': {'UnknownAttr(0)': b''},
'bus-name': 'netdevsim',
'dev-name': 'netdevsim1',
'trap-action': 'drop',
'trap-group-name': 'l2_drops',
'trap-name': 'source_mac_is_multicast'}
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231027092525.956172-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Both ipvlan_process_v4_outbound() and ipvlan_process_v6_outbound()
increment dev->stats.tx_errors in case of errors.
Unfortunately there are two issues :
1) ipvlan_get_stats64() does not propagate dev->stats.tx_errors to user.
2) Increments are not atomic. KCSAN would complain eventually.
Use DEV_STATS_INC() to not miss an update, and change ipvlan_get_stats64()
to copy the value back to user.
Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Link: https://lore.kernel.org/r/20231026131446.3933175-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Like other buses, devices on the netdevsim bus have a release callback
that is invoked when the reference count of the device drops to zero.
However, unlike other buses such as PCI, the release callback is not
necessarily built into the kernel, as netdevsim can be built as a
module.
The above is problematic as nothing prevents the module from being
unloaded before the release callback has been invoked, which can happen
asynchronously. One such example can be found in commit a380687200e0
("devlink: take device reference for devlink object") where devlink
calls put_device() from an RCU callback.
The issue is not theoretical and the reproducer in [1] can reliably
crash the kernel. The conclusion of this discussion was that the issue
should be solved in netdevsim, which is what this patch is trying to do.
Add a reference count that is increased when a device is added to the
bus and decreased when a device is released. Signal a completion when
the reference count drops to zero and wait for the completion when
unloading the module so that the module will not be unloaded before all
the devices were released. The reference count is initialized to one so
that completion is only signaled when unloading the module.
With this patch, the reproducer in [1] no longer crashes the kernel.
[1] https://lore.kernel.org/netdev/20230619125015.1541143-2-idosch@nvidia.com/
Fixes: a380687200e0 ("devlink: take device reference for devlink object")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231026083343.890689-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The napi_build_skb() can reuse the skb in skb cache per CPU or
can allocate skbs in bulk, which helps improve the performance.
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://lore.kernel.org/r/20231026080058.22810-1-louis.peens@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a spelling mistake in a dev_dbg message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/all/20231026065408.1087824-1-colin.i.king@gmail.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Oleksij Rempel says:
====================
net: dsa: microchip: provide Wake on LAN support (part 2)
This patch series introduces extensive Wake on LAN (WoL) support for the
Microchip KSZ9477 family of switches, coupled with some code refactoring
and error handling enhancements. The principal aim is to enable and
manage Wake on Magic Packet and other PHY event triggers for waking up
the system, whilst ensuring that the switch isn't reset during a
shutdown if WoL is active.
The Wake on LAN functionality is optional and is particularly beneficial
if the PME pins are connected to the SoC as a wake source or to a PMIC
that can enable or wake the SoC.
====================
Link: https://lore.kernel.org/r/20231026051051.2316937-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ensures a stable PME (Power Management Event) pin state by disabling PME
on system start and enabling it on shutdown only if WoL (Wake-on-LAN) is
configured. This is needed to avoid issues with some PMICs (Power
Management ICs).
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231026051051.2316937-6-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Centralize the switch shutdown routine in a dedicated function,
ksz_switch_shutdown(), to enhance code maintainability and reduce
redundancy. This change abstracts the common shutdown operations
previously duplicated in ksz9477_i2c_shutdown() and ksz_spi_shutdown().
This refactoring is a preparatory step for an upcoming patch to avoid
reset on shutdown if Wake-on-LAN (WoL) is enabled.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231026051051.2316937-5-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Enhance the ksz_switch_macaddr_get() function to handle errors that may
occur during the call to ksz_write8(). Specifically, this update checks
the return value of ksz_write8(), which may fail if regmap ranges
validation is not passed and returns the error code.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231026051051.2316937-4-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update the comment to follow kernel-doc format.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231026051051.2316937-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce Wake on Magic Packet (WoL) functionality to the ksz9477
driver.
Major changes include:
1. Extending the `ksz9477_handle_wake_reason` function to identify Magic
Packet wake events alongside existing wake reasons.
2. Updating the `ksz9477_get_wol` and `ksz9477_set_wol` functions to
handle WAKE_MAGIC alongside the existing WAKE_PHY option, and to
program the switch's MAC address register accordingly when Magic
Packet wake-up is enabled. This change will prevent WAKE_MAGIC
activation if the related port has a different MAC address compared
to a MAC address already used by HSR or an already active WAKE_MAGIC
on another port.
3. Adding a restriction in `ksz_port_set_mac_address` to prevent MAC
address changes on ports with active Wake on Magic Packet, as the
switch's MAC address register is utilized for this feature.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20231026051051.2316937-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tetsuo reported the following lockdep splat when the TSC synchronization
fails during CPU hotplug:
tsc: Marking TSC unstable due to check_tsc_sync_source failed
WARNING: inconsistent lock state
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
ffffffff8cfa1c78 (watchdog_lock){?.-.}-{2:2}, at: clocksource_watchdog+0x23/0x5a0
{IN-HARDIRQ-W} state was registered at:
_raw_spin_lock_irqsave+0x3f/0x60
clocksource_mark_unstable+0x1b/0x90
mark_tsc_unstable+0x41/0x50
check_tsc_sync_source+0x14f/0x180
sysvec_call_function_single+0x69/0x90
Possible unsafe locking scenario:
lock(watchdog_lock);
<Interrupt>
lock(watchdog_lock);
stack backtrace:
_raw_spin_lock+0x30/0x40
clocksource_watchdog+0x23/0x5a0
run_timer_softirq+0x2a/0x50
sysvec_apic_timer_interrupt+0x6e/0x90
The reason is the recent conversion of the TSC synchronization function
during CPU hotplug on the control CPU to a SMP function call. In case
that the synchronization with the upcoming CPU fails, the TSC has to be
marked unstable via clocksource_mark_unstable().
clocksource_mark_unstable() acquires 'watchdog_lock', but that lock is
taken with interrupts enabled in the watchdog timer callback to minimize
interrupt disabled time. That's obviously a possible deadlock scenario,
Before that change the synchronization function was invoked in thread
context so this could not happen.
As it is not crucical whether the unstable marking happens slightly
delayed, defer the call to a worker thread which avoids the lock context
problem.
Fixes: 9d349d47f0e3 ("x86/smpboot: Make TSC synchronization function call based")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87zg064ceg.ffs@tglx
|
|
David and a few others reported that on certain newer systems some legacy
interrupts fail to work correctly.
Debugging revealed that the BIOS of these systems leaves the legacy PIC in
uninitialized state which makes the PIC detection fail and the kernel
switches to a dummy implementation.
Unfortunately this fallback causes quite some code to fail as it depends on
checks for the number of legacy PIC interrupts or the availability of the
real PIC.
In theory there is no reason to use the PIC on any modern system when
IO/APIC is available, but the dependencies on the related checks cannot be
resolved trivially and on short notice. This needs lots of analysis and
rework.
The PIC detection has been added to avoid quirky checks and force selection
of the dummy implementation all over the place, especially in VM guest
scenarios. So it's not an option to revert the relevant commit as that
would break a lot of other scenarios.
One solution would be to try to initialize the PIC on detection fail and
retry the detection, but that puts the burden on everything which does not
have a PIC.
Fortunately the ACPI/MADT table header has a flag field, which advertises
in bit 0 that the system is PCAT compatible, which means it has a legacy
8259 PIC.
Evaluate that bit and if set avoid the detection routine and keep the real
PIC installed, which then gets initialized (for nothing) and makes the rest
of the code with all the dependencies work again.
Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
Reported-by: David Lazar <dlazar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: David Lazar <dlazar@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218003
Link: https://lore.kernel.org/r/875y2u5s8g.ffs@tglx
|
|
https://git.linaro.org/people/daniel.lezcano/linux into timers/core
Pull timer driver updates from Daniel Lezcano:
- Fix DT bindings typos, readability and wrong information about
underflow and overflow interrupts for the RZ/G2L MTU3a driver (Biju
Das)
- Fix a memory leak in the error path when probing the i.MX GPT timer
(Jacky Bai)
- Don't use clk_get_rate() in atomic context as the function might
sleep. Store the clock and use notifiers to receive a clocke rate
change notification (Ivaylo Dimitrov)
- Remove superfluous error message when platform_get_irq() fails
because the underlying function already prints one (Yang Li)
- Add wakeup capability flag for the risc-V ACPI timer (Sunil V L)
- Fix initialization of the TCB timers which are in cascade as the
second timer is reset after the first wraps up leading to
inconsistent scheduler behavior (Ronald Wahl)
- Add DT bindings and driver for Cirrus Logic EP93xx (Nikita Shubin)
|
|
For "reasons" Intel has code-named this CPU with a "_H" suffix.
[ dhansen: As usual, apply this and send it upstream quickly to
make it easier for anyone who is doing work that
consumes this. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20231025202513.12358-1-tony.luck%40intel.com
|
|
Since commit 97154bcf4d1b ("af_unix: Kconfig: make CONFIG_UNIX bool"),
af_unix.c is no longer built as module.
Let's remove unnecessary #if condition, exitcall, and module macros.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20231026212305.45545-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Mat Martineau says:
====================
mptcp: Fixes and cleanup for v6.7
This series includes three initial patches that we had queued in our
mptcp-net branch, but given the likely timing of net/net-next syncs this
week, the need to avoid introducing branch conflicts, and another batch
of net-next patches pending in the mptcp tree, the most practical route
is to send everything for net-next.
Patches 1 & 2 fix some intermittent selftest failures by adjusting timing.
Patch 3 removes an unneccessary userspace path manager restriction on
the removal of subflows with subflow ID 0.
The remainder of the patches are all cleanup or selftest changes:
Patches 4-8 clean up kernel code by removing unused parameters, making
more consistent use of existing helper functions, and reducing extra
casting of socket pointers.
Patch 9 removes an unused variable in a selftest script.
Patch 10 adds a little more detail to some mptcp_join test output.
====================
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-0-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Just like displaying "invert" after "Info: ", "simult" should be
displayed too when rm_subflow_nr doesn't match the expect value in
chk_rm_nr():
syn [ ok ]
synack [ ok ]
ack [ ok ]
add [ ok ]
echo [ ok ]
rm [ ok ]
rmsf [ ok ] 3 in [2:4]
Info: invert simult
syn [ ok ]
synack [ ok ]
ack [ ok ]
add [ ok ]
echo [ ok ]
rm [ ok ]
rmsf [ ok ]
Info: invert
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-10-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Global var mptcp_connect defined at the front of mptcp_sockopt.sh is
duplicate with local var mptcp_connect defined in do_transfer(), drop
this useless global one.
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-9-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'(struct sock *)msk' is used several times in mptcp_nl_cmd_announce(),
mptcp_nl_cmd_remove() or mptcp_userspace_pm_set_flags() in pm_userspace.c,
it's worth adding a local variable sk to point it.
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-8-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If we move the sk assignment statement ahead in mptcp_nl_cmd_sf_create()
or mptcp_nl_cmd_sf_destroy(), right after the msk null-check statements,
sk can be used after the create_err or destroy_err labels instead of
open-coding it again.
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-7-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use mptcp_get_ext() helper defined in protocol.h instead of open-coding
it in mptcp_sendmsg_frag().
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-6-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use __mptcp_check_fallback() helper defined in net/mptcp/protocol.h,
instead of open-coding it in both __mptcp_do_fallback() and
mptcp_diag_fill_info().
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-5-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The code using 'ssk' parameter of mptcp_pm_subflow_check_next() has been
dropped in commit "95d686517884 (mptcp: fix subflow accounting on close)".
So drop this useless parameter ssk.
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-4-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds the ability to send RM_ADDR for local ID 0. Check
whether id 0 address is removed, if not, put id 0 into a removing
list, pass it to mptcp_pm_remove_addr() to remove id 0 address.
There is no reason not to allow the userspace to remove the initial
address (ID 0). This special case was not taken into account not
letting the userspace to delete all addresses as announced.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/379
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-3-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The second input parameter of 'wait_rm_addr/sf $1 1' is misused. If it's
1, wait_rm_addr/sf will never break, and will loop ten times, then
'wait_rm_addr/sf' equals to 'sleep 1'. This delay time is too long,
which can sometimes make the tests fail.
A better way to use wait_rm_addr/sf is to use rm_addr/sf_count to obtain
the current value, and then pass into wait_rm_addr/sf.
Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-2-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some userspace pm tests failed are reported by CI:
112 userspace pm add & remove address
syn [ ok ]
synack [ ok ]
ack [ ok ]
add [ ok ]
echo [ ok ]
mptcp_info subflows=1:1 [ ok ]
subflows_total 2:2 [ ok ]
mptcp_info add_addr_signal=1:1 [ ok ]
rm [ ok ]
rmsf [ ok ]
Info: invert
mptcp_info subflows=0:0 [ ok ]
subflows_total 1:1 [fail]
got subflows 0:0 expected 1:1
Server ns stats
TcpPassiveOpens 2 0.0
TcpInSegs 118 0.0
This patch fixes them by changing 'speed' to 5 to run the tests much more
slowly.
Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-1-db8f25f798eb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fix from Joerg Roedel:
- Fix boot regression for Sapphire Rapids with Intel VT-d driver
* tag 'iommu-fix-v6.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Avoid unnecessary cache invalidations
|