summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-16sock: support SO_PRIORITY cmsgAnna Emese Nyiri
The Linux socket API currently allows setting SO_PRIORITY at the socket level, applying a uniform priority to all packets sent through that socket. The exception to this is IP_TOS, when the priority value is calculated during the handling of ancillary data, as implemented in commit f02db315b8d8 ("ipv4: IP_TOS and IP_TTL can be specified as ancillary data"). However, this is a computed value, and there is currently no mechanism to set a custom priority via control messages prior to this patch. According to this patch, if SO_PRIORITY is specified as ancillary data, the packet is sent with the priority value set through sockc->priority, overriding the socket-level values set via the traditional setsockopt() method. This is analogous to the existing support for SO_MARK, as implemented in commit c6af0c227a22 ("ip: support SO_MARK cmsg"). If both cmsg SO_PRIORITY and IP_TOS are passed, then the one that takes precedence is the last one in the cmsg list. This patch has the side effect that raw_send_hdrinc now interprets cmsg IP_TOS. Reviewed-by: Willem de Bruijn <willemb@google.com> Suggested-by: Ferenc Fejes <fejes@inf.elte.hu> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> Link: https://patch.msgid.link/20241213084457.45120-3-annaemesenyiri@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16sock: Introduce sk_set_prio_allowed helper functionAnna Emese Nyiri
Simplify priority setting permissions with the 'sk_set_prio_allowed' function, centralizing the validation logic. This change is made in anticipation of a second caller in a following patch. No functional changes. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Suggested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> Link: https://patch.msgid.link/20241213084457.45120-2-annaemesenyiri@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16netlink: specs: add phys-binding attr to rt_link specDonald Hunter
Add the missing phys-binding attr to the mctp-attrs in the rt_link spec. This fixes commit 580db513b4a9 ("net: mctp: Expose transport binding identifier via IFLA attribute"). Note that enum mctp_phys_binding is not currently uapi, but perhaps it should be? Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20241213112551.33557-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16chelsio/chtls: prevent potential integer overflow on 32bitDan Carpenter
The "gl->tot_len" variable is controlled by the user. It comes from process_responses(). On 32bit systems, the "gl->tot_len + sizeof(struct cpl_pass_accept_req) + sizeof(struct rss_header)" addition could have an integer wrapping bug. Use size_add() to prevent this. Fixes: a08943947873 ("crypto: chtls - Register chtls with net tls") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/c6bfb23c-2db2-4e1b-b8ab-ba3925c82ef5@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16rxrpc: Fix ability to add more data to a call once MSG_MORE deassertedDavid Howells
When userspace is adding data to an RPC call for transmission, it must pass MSG_MORE to sendmsg() if it intends to add more data in future calls to sendmsg(). Calling sendmsg() without MSG_MORE being asserted closes the transmission phase of the call (assuming sendmsg() adds all the data presented) and further attempts to add more data should be rejected. However, this is no longer the case. The change of call state that was previously the guard got bumped over to the I/O thread, which leaves a window for a repeat sendmsg() to insert more data. This previously went unnoticed, but the more recent patch that changed the structures behind the Tx queue added a warning: WARNING: CPU: 3 PID: 6639 at net/rxrpc/sendmsg.c:296 rxrpc_send_data+0x3f2/0x860 and rejected the additional data, returning error EPROTO. Fix this by adding a guard flag to the call, setting the flag when we queue the final packet and then rejecting further attempts to add data with EPROTO. Fixes: 2d689424b618 ("rxrpc: Move call state changes from sendmsg to I/O thread") Reported-by: syzbot+ff11be94dfcd7a5af8da@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/6757fb68.050a0220.2477f.005f.GAE@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: syzbot+ff11be94dfcd7a5af8da@syzkaller.appspotmail.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/2870480.1734037462@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16rxrpc: Disable IRQ, not BH, to take the lock for ->attend_linkDavid Howells
Use spin_lock_irq(), not spin_lock_bh() to take the lock when accessing the ->attend_link() to stop a delay in the I/O thread due to an interrupt being taken in the app thread whilst that holds the lock and vice versa. Fixes: a2ea9a907260 ("rxrpc: Use irq-disabling spinlocks between app and I/O thread") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/2870146.1734037095@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16Merge branch 'netdev-fix-repeated-netlink-messages-in-queue-dumps'Jakub Kicinski
Jakub Kicinski says: ==================== netdev: fix repeated netlink messages in queue dumps Fix dump continuation for queues and queue stats in the netdev family. Because we used post-increment when saving id of dumped queue next skb would re-dump the already dumped queue. ==================== Link: https://patch.msgid.link/20241213152244.3080955-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16selftests: net-drv: stats: sanity check netlink dumpsJakub Kicinski
Sanity check netlink dumps, to make sure dumps don't have repeated entries or gaps in IDs. Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20241213152244.3080955-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16selftests: net-drv: queues: sanity check netlink dumpsJakub Kicinski
This test already catches a netlink bug fixed by this series, but only when running on HW with many queues. Make sure the netdevsim instance created has a lot of queues, and constrain the size of the recv_buffer used by netlink. While at it test both rx and tx queues. Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20241213152244.3080955-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16selftests: net: support setting recv_size in YNLJakub Kicinski
recv_size parameter allows constraining the buffer size for dumps. It's useful in testing kernel handling of dump continuation, IOW testing dumps which span multiple skbs. Let the tests set this parameter when initializing the YNL family. Keep the normal default, we don't want tests to unintentionally behave very differently than normal code. Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20241213152244.3080955-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16netdev: fix repeated netlink messages in queue statsJakub Kicinski
The context is supposed to record the next queue to dump, not last dumped. If the dump doesn't fit we will restart from the already-dumped queue, duplicating the message. Before this fix and with the selftest improvements later in this series we see: # ./run_kselftest.sh -t drivers/net:stats.py timeout set to 45 selftests: drivers/net: stats.py KTAP version 1 1..5 ok 1 stats.check_pause ok 2 stats.check_fec ok 3 stats.pkt_byte_sum # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 45 != 44 repeated queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 45 != 44 missing queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 45 != 44 repeated queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 45 != 44 missing queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 103 != 100 repeated queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 103 != 100 missing queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 102 != 100 repeated queue keys # Check| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 102 != 100 missing queue keys not ok 4 stats.qstat_by_ifindex ok 5 stats.check_down # Totals: pass:4 fail:1 xfail:0 xpass:0 skip:0 error:0 With the fix: # ./ksft-net-drv/run_kselftest.sh -t drivers/net:stats.py timeout set to 45 selftests: drivers/net: stats.py KTAP version 1 1..5 ok 1 stats.check_pause ok 2 stats.check_fec ok 3 stats.pkt_byte_sum ok 4 stats.qstat_by_ifindex ok 5 stats.check_down # Totals: pass:5 fail:0 xfail:0 xpass:0 skip:0 error:0 Fixes: ab63a2387cb9 ("netdev: add per-queue statistics") Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241213152244.3080955-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16netdev: fix repeated netlink messages in queue dumpJakub Kicinski
The context is supposed to record the next queue to dump, not last dumped. If the dump doesn't fit we will restart from the already-dumped queue, duplicating the message. Before this fix and with the selftest improvements later in this series we see: # ./run_kselftest.sh -t drivers/net:queues.py timeout set to 45 selftests: drivers/net: queues.py KTAP version 1 1..2 # Check| At /root/ksft-net-drv/drivers/net/./queues.py, line 32, in get_queues: # Check| ksft_eq(queues, expected) # Check failed 102 != 100 # Check| At /root/ksft-net-drv/drivers/net/./queues.py, line 32, in get_queues: # Check| ksft_eq(queues, expected) # Check failed 101 != 100 not ok 1 queues.get_queues ok 2 queues.addremove_queues # Totals: pass:1 fail:1 xfail:0 xpass:0 skip:0 error:0 not ok 1 selftests: drivers/net: queues.py # exit=1 With the fix: # ./ksft-net-drv/run_kselftest.sh -t drivers/net:queues.py timeout set to 45 selftests: drivers/net: queues.py KTAP version 1 1..2 ok 1 queues.get_queues ok 2 queues.addremove_queues # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0 Fixes: 6b6171db7fc8 ("netdev-genl: Add netlink framework functions for queue") Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241213152244.3080955-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16Merge branch 'mlx5-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next 2024-12-16 The following pull-request contains mlx5 IFC updates. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add device cap abs_native_port_num net/mlx5: qos: Add ifc support for cross-esw scheduling net/mlx5: Add support for new scheduling elements net/mlx5: Add ConnectX-8 device to ifc net/mlx5: ifc: Reorganize mlx5_ifc_flow_table_context_bits ==================== Link: https://patch.msgid.link/20241216124028.973763-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16nios2: Use str_yes_no() helper in show_cpuinfo()Thorsten Blum
Remove hard-coded strings by using the str_yes_no() helper function. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2024-12-16fortify: Hide run-time copy size from value range trackingKees Cook
GCC performs value range tracking for variables as a way to provide better diagnostics. One place this is regularly seen is with warnings associated with bounds-checking, e.g. -Wstringop-overflow, -Wstringop-overread, -Warray-bounds, etc. In order to keep the signal-to-noise ratio high, warnings aren't emitted when a value range spans the entire value range representable by a given variable. For example: unsigned int len; char dst[8]; ... memcpy(dst, src, len); If len's value is unknown, it has the full "unsigned int" range of [0, UINT_MAX], and GCC's compile-time bounds checks against memcpy() will be ignored. However, when a code path has been able to narrow the range: if (len > 16) return; memcpy(dst, src, len); Then the range will be updated for the execution path. Above, len is now [0, 16] when reading memcpy(), so depending on other optimizations, we might see a -Wstringop-overflow warning like: error: '__builtin_memcpy' writing between 9 and 16 bytes into region of size 8 [-Werror=stringop-overflow] When building with CONFIG_FORTIFY_SOURCE, the fortified run-time bounds checking can appear to narrow value ranges of lengths for memcpy(), depending on how the compiler constructs the execution paths during optimization passes, due to the checks against the field sizes. For example: if (p_size_field != SIZE_MAX && p_size != p_size_field && p_size_field < size) As intentionally designed, these checks only affect the kernel warnings emitted at run-time and do not block the potentially overflowing memcpy(), so GCC thinks it needs to produce a warning about the resulting value range that might be reaching the memcpy(). We have seen this manifest a few times now, with the most recent being with cpumasks: In function ‘bitmap_copy’, inlined from ‘cpumask_copy’ at ./include/linux/cpumask.h:839:2, inlined from ‘__padata_set_cpumasks’ at kernel/padata.c:730:2: ./include/linux/fortify-string.h:114:33: error: ‘__builtin_memcpy’ reading between 257 and 536870904 bytes from a region of size 256 [-Werror=stringop-overread] 114 | #define __underlying_memcpy __builtin_memcpy | ^ ./include/linux/fortify-string.h:633:9: note: in expansion of macro ‘__underlying_memcpy’ 633 | __underlying_##op(p, q, __fortify_size); \ | ^~~~~~~~~~~~~ ./include/linux/fortify-string.h:678:26: note: in expansion of macro ‘__fortify_memcpy_chk’ 678 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ | ^~~~~~~~~~~~~~~~~~~~ ./include/linux/bitmap.h:259:17: note: in expansion of macro ‘memcpy’ 259 | memcpy(dst, src, len); | ^~~~~~ kernel/padata.c: In function ‘__padata_set_cpumasks’: kernel/padata.c:713:48: note: source object ‘pcpumask’ of size [0, 256] 713 | cpumask_var_t pcpumask, | ~~~~~~~~~~~~~~^~~~~~~~ This warning is _not_ emitted when CONFIG_FORTIFY_SOURCE is disabled, and with the recent -fdiagnostics-details we can confirm the origin of the warning is due to FORTIFY's bounds checking: ../include/linux/bitmap.h:259:17: note: in expansion of macro 'memcpy' 259 | memcpy(dst, src, len); | ^~~~~~ '__padata_set_cpumasks': events 1-2 ../include/linux/fortify-string.h:613:36: 612 | if (p_size_field != SIZE_MAX && | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 613 | p_size != p_size_field && p_size_field < size) | ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~ | | | (1) when the condition is evaluated to false | (2) when the condition is evaluated to true '__padata_set_cpumasks': event 3 114 | #define __underlying_memcpy __builtin_memcpy | ^ | | | (3) out of array bounds here Note that the cpumask warning started appearing since bitmap functions were recently marked __always_inline in commit ed8cd2b3bd9f ("bitmap: Switch from inline to __always_inline"), which allowed GCC to gain visibility into the variables as they passed through the FORTIFY implementation. In order to silence these false positives but keep otherwise deterministic compile-time warnings intact, hide the length variable from GCC with OPTIMIZE_HIDE_VAR() before calling the builtin memcpy. Additionally add a comment about why all the macro args have copies with const storage. Reported-by: "Thomas Weißschuh" <linux@weissschuh.net> Closes: https://lore.kernel.org/all/db7190c8-d17f-4a0d-bc2f-5903c79f36c2@t-8ch.de/ Reported-by: Nilay Shroff <nilay@linux.ibm.com> Closes: https://lore.kernel.org/all/20241112124127.1666300-1-nilay@linux.ibm.com/ Tested-by: Nilay Shroff <nilay@linux.ibm.com> Acked-by: Yury Norov <yury.norov@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-16hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit ↵Murad Masimov
Registers The values returned by the driver after processing the contents of the Temperature Result and the Temperature Limit Registers do not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. According to the TMP512 and TMP513 datasheets, the Temperature Result (08h to 0Bh) and Limit (11h to 14h) Registers are 13-bit two's complement integer values, shifted left by 3 bits. The value is scaled by 0.0625 degrees Celsius per bit. E.g., if regval = 1 1110 0111 0000 000, the output should be -25 degrees, but the driver will return +487 degrees. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov <m.masimov@maxima.ru> Link: https://lore.kernel.org/r/20241216173648.526-4-m.masimov@maxima.ru [groeck: fixed description line length] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-12-16hwmon: (tmp513) Fix Current Register value interpretationMurad Masimov
The value returned by the driver after processing the contents of the Current Register does not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. Moreover, negative values will be reported as large positive due to missing sign extension from u32 to long. According to the TMP512 and TMP513 datasheets, the Current Register (07h) is a 16-bit two's complement integer value. E.g., if regval = 1000 0011 0000 0000, then the value must be (-32000 * lsb), but the driver will return (33536 * lsb). Fix off-by-one bug, and also cast data->curr_lsb_ua (which is of type u32) to long to prevent incorrect cast for negative values. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov <m.masimov@maxima.ru> Link: https://lore.kernel.org/r/20241216173648.526-3-m.masimov@maxima.ru [groeck: Fixed description line length] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-12-16hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit ↵Murad Masimov
Registers The values returned by the driver after processing the contents of the Shunt Voltage Register and the Shunt Limit Registers do not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. Moreover, the PGA shift calculated with the tmp51x_get_pga_shift function is relevant only to the Shunt Voltage Register, but is also applied to the Shunt Limit Registers. According to the TMP512 and TMP513 datasheets, the Shunt Voltage Register (04h) is 13 to 16 bit two's complement integer value, depending on the PGA setting. The Shunt Positive (0Ch) and Negative (0Dh) Limit Registers are 16-bit two's complement integer values. Below are some examples: * Shunt Voltage Register If PGA = 8, and regval = 1000 0011 0000 0000, then the decimal value must be -32000, but the value calculated by the driver will be 33536. * Shunt Limit Register If regval = 1000 0011 0000 0000, then the decimal value must be -32000, but the value calculated by the driver will be 768, if PGA = 1. Fix sign bit index, and also correct misleading comment describing the tmp51x_get_pga_shift function. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov <m.masimov@maxima.ru> Link: https://lore.kernel.org/r/20241216173648.526-2-m.masimov@maxima.ru [groeck: Fixed description and multi-line alignments] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-12-16ceph: allocate sparse_ext map only for sparse readsIlya Dryomov
If mounted with sparseread option, ceph_direct_read_write() ends up making an unnecessarily allocation for O_DIRECT writes. Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com>
2024-12-16ceph: fix memory leak in ceph_direct_read_write()Ilya Dryomov
The bvecs array which is allocated in iter_get_bvecs_alloc() is leaked and pages remain pinned if ceph_alloc_sparse_ext_map() fails. There is no need to delay the allocation of sparse_ext map until after the bvecs array is set up, so fix this by moving sparse_ext allocation a bit earlier. Also, make a similar adjustment in __ceph_sync_read() for consistency (a leak of the same kind in __ceph_sync_read() has been addressed differently). Cc: stable@vger.kernel.org Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com>
2024-12-16ceph: improve error handling and short/overflow-read logic in __ceph_sync_read()Alex Markuze
This patch refines the read logic in __ceph_sync_read() to ensure more predictable and efficient behavior in various edge cases. - Return early if the requested read length is zero or if the file size (`i_size`) is zero. - Initialize the index variable (`idx`) where needed and reorder some code to ensure it is always set before use. - Improve error handling by checking for negative return values earlier. - Remove redundant encrypted file checks after failures. Only attempt filesystem-level decryption if the read succeeded. - Simplify leftover calculations to correctly handle cases where the read extends beyond the end of the file or stops short. This can be hit by continuously reading a file while, on another client, we keep truncating and writing new data into it. - This resolves multiple issues caused by integer and consequent buffer overflow (`pages` array being accessed beyond `num_pages`): - https://tracker.ceph.com/issues/67524 - https://tracker.ceph.com/issues/68980 - https://tracker.ceph.com/issues/68981 Cc: stable@vger.kernel.org Fixes: 1065da21e5df ("ceph: stop copying to iter at EOF on sync reads") Reported-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Signed-off-by: Alex Markuze <amarkuze@redhat.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-12-16ceph: validate snapdirname option length when mountingIlya Dryomov
It becomes a path component, so it shouldn't exceed NAME_MAX characters. This was hardened in commit c152737be22b ("ceph: Use strscpy() instead of strcpy() in __get_snap_name()"), but no actual check was put in place. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com>
2024-12-16ceph: give up on paths longer than PATH_MAXMax Kellermann
If the full path to be built by ceph_mdsc_build_path() happens to be longer than PATH_MAX, then this function will enter an endless (retry) loop, effectively blocking the whole task. Most of the machine becomes unusable, making this a very simple and effective DoS vulnerability. I cannot imagine why this retry was ever implemented, but it seems rather useless and harmful to me. Let's remove it and fail with ENAMETOOLONG instead. Cc: stable@vger.kernel.org Reported-by: Dario Weißer <dario@cure53.de> Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-12-16ceph: fix memory leaks in __ceph_sync_read()Max Kellermann
In two `break` statements, the call to ceph_release_page_vector() was missing, leaking the allocation from ceph_alloc_page_vector(). Instead of adding the missing ceph_release_page_vector() calls, the Ceph maintainers preferred to transfer page ownership to the `ceph_osd_request` by passing `own_pages=true` to osd_req_op_extent_osd_data_pages(). This requires postponing the ceph_osdc_put_request() call until after the block that accesses the `pages`. Cc: stable@vger.kernel.org Fixes: 03bc06c7b0bd ("ceph: add new mount option to enable sparse reads") Fixes: f0fe1e54cfcf ("ceph: plumb in decryption during reads") Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-12-16ftrace: Do not find "true_parent" if HAVE_DYNAMIC_FTRACE_WITH_ARGS is not setSteven Rostedt
When function tracing and function graph tracing are both enabled (in different instances) the "parent" of some of the function tracing events is "return_to_handler" which is the trampoline used by function graph tracing. To fix this, ftrace_get_true_parent_ip() was introduced that returns the "true" parent ip instead of the trampoline. To do this, the ftrace_regs_get_stack_pointer() is used, which uses kernel_stack_pointer(). The problem is that microblaze does not implement kerenl_stack_pointer() so when function graph tracing is enabled, the build fails. But microblaze also does not enabled HAVE_DYNAMIC_FTRACE_WITH_ARGS. That option has to be enabled by the architecture to reliably get the values from the fregs parameter passed in. When that config is not set, the architecture can also pass in NULL, which is not tested for in that function and could cause the kernel to crash. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Jeff Xie <jeff.xie@linux.dev> Link: https://lore.kernel.org/20241216164633.6df18e87@gandalf.local.home Fixes: 60b1f578b578 ("ftrace: Get the true parent ip for function tracer") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-16Merge branch 'selftests-bpf-migrate-test_xdp_meta-sh-to-test_progs'Martin KaFai Lau
Bastien Curutchet says: ==================== This patch series continues the work to migrate the script tests into prog_tests. test_xdp_meta.sh uses the BPF programs defined in progs/test_xdp_meta.c to do a simple XDP/TC functional test that checks the metadata allocation performed by the bpf_xdp_adjust_meta() helper. This is already partly covered by two tests under prog_tests/: - xdp_context_test_run.c uses bpf_prog_test_run_opts() to verify the validity of the xdp_md context after a call to bpf_xdp_adjust_meta() - xdp_metadata.c ensures that these meta-data can be exchanged through an AF_XDP socket. However test_xdp_meta.sh also verifies that the meta-data initialized in the struct xdp_md is forwarded to the struct __sk_buff used by BPF programs at 'TC level'. To cover this, I add a test case in xdp_context_test_run.c that uses the same BPF programs from progs/test_xdp_meta.c. Changes in v2: - Add missing close_netns() - Use a unique 'close' label - Link to v1: https://lore.kernel.org/r/20241206-xdp_meta-v1-0-5c150618f6e9@bootlin.com ==================== Link: https://patch.msgid.link/20241213-xdp_meta-v2-0-634582725b90@bootlin.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-12-16selftests/bpf: Migrate test_xdp_meta.sh into xdp_context_test_run.cBastien Curutchet
test_xdp_meta.sh can't be used by the BPF CI. Migrate test_xdp_meta.sh in a new test case in xdp_context_test_run.c. It uses the same BPF programs located in progs/test_xdp_meta.c and the same network topology. Remove test_xdp_meta.sh and its Makefile entry. Signed-off-by: Bastien Curutchet <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20241213-xdp_meta-v2-2-634582725b90@bootlin.com
2024-12-16selftests/bpf: test_xdp_meta: Rename BPF sectionsBastien Curutchet
SEC("t") and SEC("x") can't be loaded by the __load() helper. Rename these sections SEC("tc") and SEC("xdp") so they can be interpreted by the __load() helper in upcoming patch. Update the test_xdp_meta.sh to fit these new names. Signed-off-by: Bastien Curutchet <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20241213-xdp_meta-v2-1-634582725b90@bootlin.com
2024-12-16fgraph: Still initialize idle shadow stacks when startingSteven Rostedt
A bug was discovered where the idle shadow stacks were not initialized for offline CPUs when starting function graph tracer, and when they came online they were not traced due to the missing shadow stack. To fix this, the idle task shadow stack initialization was moved to using the CPU hotplug callbacks. But it removed the initialization when the function graph was enabled. The problem here is that the hotplug callbacks are called when the CPUs come online, but the idle shadow stack initialization only happens if function graph is currently active. This caused the online CPUs to not get their shadow stack initialized. The idle shadow stack initialization still needs to be done when the function graph is registered, as they will not be allocated if function graph is not registered. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241211135335.094ba282@batman.local.home Fixes: 2c02f7375e65 ("fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks") Reported-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Linus Walleij <linus.walleij@linaro.org> Closes: https://lore.kernel.org/all/CACRpkdaTBrHwRbbrphVy-=SeDz6MSsXhTKypOtLrTQ+DgGAOcQ@mail.gmail.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-16thermal/thresholds: Fix uapi header macros leading to a compilation errorDaniel Lezcano
The macros giving the direction of the crossing thresholds use the BIT macro which is not exported to the userspace. Consequently when an userspace program includes the header, it fails to compile. Replace the macros by their litteral to allow the compilation of userspace program using this header. Fixes: 445936f9e258 ("thermal: core: Add user thresholds support") Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20241212201311.4143196-1-daniel.lezcano@linaro.org [ rjw: Add Fixes: ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-12-16Merge tag 'soc-fixes-6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Three small fixes for the soc tree: - devicetee fix for the Arm Juno reference machine, to allow more interesting PCI configurations - build fix for SCMI firmware on the NXP i.MX platform - fix for a race condition in Arm FF-A firmware" * tag 'soc-fixes-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: fvp: Update PCIe bus-range property firmware: arm_ffa: Fix the race around setting ffa_dev->properties firmware: arm_scmi: Fix i.MX build dependency
2024-12-16Merge tag 'platform-drivers-x86-v6.13-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: - alienware-wmi: - Add support for Alienware m16 R1 AMD - Do not setup legacy LED control with X and G Series - intel/ifs: Clearwater Forest support - intel/vsec: Panther Lake support - p2sb: Do not hide the device if BIOS left it unhidden - touchscreen_dmi: Add SARY Tab 3 tablet information * tag 'platform-drivers-x86-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel/vsec: Add support for Panther Lake platform/x86/intel/ifs: Add Clearwater Forest to CPU support list platform/x86: touchscreen_dmi: Add info for SARY Tab 3 tablet p2sb: Do not scan and remove the P2SB device when it is unhidden p2sb: Move P2SB hide and unhide code to p2sb_scan_and_cache() p2sb: Introduce the global flag p2sb_hidden_by_bios p2sb: Factor out p2sb_read_from_cache() alienware-wmi: Adds support to Alienware m16 R1 AMD alienware-wmi: Fix X Series and G Series quirks
2024-12-16ASoC: Intel: sof_sdw: Update DMI matches for LenovoMark Brown
Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>: The DMI match information for these models has changed so the match entries need updates.
2024-12-16Merge tag 'usb-serial-6.13-rc3' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial device ids for 6.13-rc3 Here are some new modem device ids. All have been in linux-next with no reported issues. * tag 'usb-serial-6.13-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: add Telit FE910C04 rmnet compositions USB: serial: option: add MediaTek T7XX compositions USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready USB: serial: option: add MeiG Smart SLM770A USB: serial: option: add TCL IK512 MBIM & ECM
2024-12-16ASoC: dt-bindings: realtek,rt5645: Fix CPVDD voltage commentChen-Yu Tsai
Both the ALC5645 and ALC5650 datasheets specify a recommended voltage of 1.8V for CPVDD, not 3.5V. Fix the comment. Cc: Matthias Brugger <matthias.bgg@gmail.com> Fixes: 26aa19174f0d ("ASoC: dt-bindings: rt5645: add suppliers") Fixes: 83d43ab0a1cb ("ASoC: dt-bindings: realtek,rt5645: Convert to dtschema") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241211035403.4157760-1-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-16ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21QA and 21QBRichard Fitzgerald
Update the DMI match for a Lenovo laptop to the new DMI identifier. This laptop ships with a different DMI identifier to what was expected, and now has two identifiers. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Fixes: ea657f6b24e1 ("ASoC: Intel: sof_sdw: Add quirk for cs42l43 system using host DMICs") Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241216140821.153670-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-16ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21Q6 and 21Q7Richard Fitzgerald
Update the DMI match for a Lenovo laptop to the new DMI identifier. This laptop ships with a different DMI identifier to what was expected, and now has two identifiers. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Fixes: 83c062ae81e8 ("ASoC: Intel: sof_sdw: Add quirks for some new Lenovo laptops") Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241216140821.153670-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-16RDMA/bnxt_re: Fix reporting hw_ver in query_deviceKalesh AP
Driver currently populates subsystem_device id in the "hw_ver" field of ib_attr structure in query_device. Updated to populate PCI revision ID. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Preethi G <preethi.gurusiddalingeswaraswamy@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-6-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-16RDMA/bnxt_re: Fix to export port num to ib_query_qpHongguang Gao
Current driver implementation doesn't populate the port_num field in query_qp. Adding the code to convert internal firmware port id to ibv defined port number and export it. Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-5-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-16RDMA/bnxt_re: Fix setting mandatory attributes for modify_qpDamodharam Ammepalli
Firmware expects "min_rnr_timer" as a mandatory attribute in MODIFY_QP command during the RTR-RTS transition. This needs to be enforced by the driver which is missing while setting bnxt_set_mandatory_attributes that sends these flags as part of modify_qp optimization. Fixes: 82c32d219272 ("RDMA/bnxt_re: Add support for optimized modify QP") Reviewed-by: Rukhsana Ansari <rukhsana.ansari@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-4-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-16RDMA/bnxt_re: Add check for path mtu in modify_qpSaravanan Vajravel
When RDMA app configures path MTU, add a check in modify_qp verb to make sure that it doesn't go beyond interface MTU. If this check fails, driver will fail the modify_qp verb. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-3-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-16RDMA/bnxt_re: Fix the check for 9060 conditionKalesh AP
The check for 9060 condition should only be made for legacy chips. Fixes: 9152e0b722b2 ("RDMA/bnxt_re: HW workarounds for handling specific conditions") Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-2-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-16erofs: use buffered I/O for file-backed mounts by defaultGao Xiang
For many use cases (e.g. container images are just fetched from remote), performance will be impacted if underlay page cache is up-to-date but direct i/o flushes dirty pages first. Instead, let's use buffered I/O by default to keep in sync with loop devices and add a (re)mount option to explicitly give a try to use direct I/O if supported by the underlying files. The container startup time is improved as below: [workload] docker.io/library/workpress:latest unpack 1st run non-1st runs EROFS snapshotter buffered I/O file 4.586404265s 0.308s 0.198s EROFS snapshotter direct I/O file 4.581742849s 2.238s 0.222s EROFS snapshotter loop 4.596023152s 0.346s 0.201s Overlayfs snapshotter 5.382851037s 0.206s 0.214s Fixes: fb176750266a ("erofs: add file-backed mount support") Cc: Derek McGowan <derek@mcg.dev> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241212134336.2059899-1-hsiangkao@linux.alibaba.com
2024-12-16erofs: reference `struct erofs_device_info` for erofs_map_devGao Xiang
Record `m_sb` and `m_dif` to replace `m_fscache`, `m_daxdev`, `m_fp` and `m_dax_part_off` in order to simplify the codebase. Note that `m_bdev` is still left since it can be assigned from `sb->s_bdev` directly. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241212235401.2857246-1-hsiangkao@linux.alibaba.com
2024-12-16erofs: use `struct erofs_device_info` for the primary deviceGao Xiang
Instead of just listing each one directly in `struct erofs_sb_info` except that we still use `sb->s_bdev` for the primary block device. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241216125310.930933-2-hsiangkao@linux.alibaba.com
2024-12-16Merge branch 'net-timestamp-selectable'David S. Miller
Kory Maincent says: ==================== net: Make timestamping selectable Up until now, there was no way to let the user select the hardware PTP provider at which time stamping occurs. The stack assumed that PHY time stamping is always preferred, but some MAC/PHY combinations were buggy. This series updates the default MAC/PHY default timestamping and aims to allow the user to select the desired hwtstamp provider administratively. Here is few netlink spec usage examples: ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --dump tsinfo-get --json '{"header":{"dev-name":"eth0"}}' [{'header': {'dev-index': 3, 'dev-name': 'eth0'}, 'hwtst-provider': {'index': 0, 'qualifier': 0}, 'phc-index': 0, 'rx-filters': {'bits': {'bit': [{'index': 0, 'name': 'none'}, {'index': 2, 'name': 'some'}]}, 'nomask': True, 'size': 16}, 'timestamping': {'bits': {'bit': [{'index': 0, 'name': 'hardware-transmit'}, {'index': 2, 'name': 'hardware-receive'}, {'index': 6, 'name': 'hardware-raw-clock'}]}, 'nomask': True, 'size': 17}, 'tx-types': {'bits': {'bit': [{'index': 0, 'name': 'off'}, {'index': 1, 'name': 'on'}]}, 'nomask': True, 'size': 4}}, {'header': {'dev-index': 3, 'dev-name': 'eth0'}, 'hwtst-provider': {'index': 2, 'qualifier': 0}, 'phc-index': 2, 'rx-filters': {'bits': {'bit': [{'index': 0, 'name': 'none'}, {'index': 1, 'name': 'all'}]}, 'nomask': True, 'size': 16}, 'timestamping': {'bits': {'bit': [{'index': 0, 'name': 'hardware-transmit'}, {'index': 1, 'name': 'software-transmit'}, {'index': 2, 'name': 'hardware-receive'}, {'index': 3, 'name': 'software-receive'}, {'index': 4, 'name': 'software-system-clock'}, {'index': 6, 'name': 'hardware-raw-clock'}]}, 'nomask': True, 'size': 17}, 'tx-types': {'bits': {'bit': [{'index': 0, 'name': 'off'}, {'index': 1, 'name': 'on'}, {'index': 2, 'name': 'onestep-sync'}]}, 'nomask': True, 'size': 4}}] ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do tsinfo-get --json '{"header":{"dev-name":"eth0"}, "hwtst-provider":{"index":0, "qualifier":0 } }' {'header': {'dev-index': 3, 'dev-name': 'eth0'}, 'hwtst-provider': {'index': 0, 'qualifier': 0}, 'phc-index': 0, 'rx-filters': {'bits': {'bit': [{'index': 0, 'name': 'none'}, {'index': 2, 'name': 'some'}]}, 'nomask': True, 'size': 16}, 'timestamping': {'bits': {'bit': [{'index': 0, 'name': 'hardware-transmit'}, {'index': 2, 'name': 'hardware-receive'}, {'index': 6, 'name': 'hardware-raw-clock'}]}, 'nomask': True, 'size': 17}, 'tx-types': {'bits': {'bit': [{'index': 0, 'name': 'off'}, {'index': 1, 'name': 'on'}]}, 'nomask': True, 'size': 4}} ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do tsinfo-set --json '{"header":{"dev-name":"eth0"}, "hwtst-provider":{"index":2, "qualifier":0}}' None ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do tsconfig-get --json '{"header":{"dev-name":"eth0"}}' {'header': {'dev-index': 3, 'dev-name': 'eth0'}, 'hwtstamp-flags': 1, 'hwtstamp-provider': {'index': 1, 'qualifier': 0}, 'rx-filters': {'bits': {'bit': [{'index': 12, 'name': 'ptpv2-event'}]}, 'nomask': True, 'size': 16}, 'tx-types': {'bits': {'bit': [{'index': 1, 'name': 'on'}]}, 'nomask': True, 'size': 4}} ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do tsconfig-set --json '{"header":{"dev-name":"eth0"}, "hwtstamp-provider":{"index":1, "qualifier":0 }, "rx-filters":{"bits": {"bit": {"name":"ptpv2-l4-event"}}, "nomask": 1}, "tx-types":{"bits": {"bit": {"name":"on"}}, "nomask": 1}}' {'header': {'dev-index': 3, 'dev-name': 'eth0'}, 'hwtstamp-flags': 1, 'hwtstamp-provider': {'index': 1, 'qualifier': 0}, 'rx-filters': {'bits': {'bit': [{'index': 12, 'name': 'ptpv2-event'}]}, 'nomask': True, 'size': 16}, 'tx-types': {'bits': {'bit': [{'index': 1, 'name': 'on'}]}, 'nomask': True, 'size': 4}} Changes in v21: - NIT fixes. - Link to v20: https://lore.kernel.org/r/20241204-feature_ptp_netnext-v20-0-9bd99dc8a867@bootlin.com Changes in v20: - Change hwtstamp provider design to avoid saving "user" (phy or net) in the ptp clock structure. - Link to v19: https://lore.kernel.org/r/20241030-feature_ptp_netnext-v19-0-94f8aadc9d5c@bootlin.com Changes in v19: - Rebase on net-next - Link to v18: https://lore.kernel.org/r/20241023-feature_ptp_netnext-v18-0-ed948f3b6887@bootlin.com Changes in v18: - Few changes in the tsconfig-set ethtool command. - Add tsconfig-set-reply ethtool netlink socket. - Add missing netlink tsconfig documentation - Link to v17: https://lore.kernel.org/r/20240709-feature_ptp_netnext-v17-0-b5317f50df2a@bootlin.com Changes in v17: - Fix a documentation nit. - Add a missing kernel_ethtool_tsinfo update from a new MAC driver. - Link to v16: https://lore.kernel.org/r/20240705-feature_ptp_netnext-v16-0-5d7153914052@bootlin.com Changes in v16: - Add a new patch to separate tsinfo into a new tsconfig command to get and set the hwtstamp config. - Used call_rcu() instead of synchronize_rcu() to free the hwtstamp_provider - Moved net core changes of patch 12 directly to patch 8. - Link to v15: https://lore.kernel.org/r/20240612-feature_ptp_netnext-v15-0-b2a086257b63@bootlin.com Changes in v15: - Fix uninitialized ethtool_ts_info structure. - Link to v14: https://lore.kernel.org/r/20240604-feature_ptp_netnext-v14-0-77b6f6efea40@bootlin.com Changes in v14: - Add back an EXPORT_SYMBOL() missing. - Link to v13: https://lore.kernel.org/r/20240529-feature_ptp_netnext-v13-0-6eda4d40fa4f@bootlin.com Changes in v13: - Add PTP builtin code to fix build errors when building PTP as a module. - Fix error spotted by smatch and sparse. - Link to v12: https://lore.kernel.org/r/20240430-feature_ptp_netnext-v12-0-2c5f24b6a914@bootlin.com Changes in v12: - Add missing return description in the kdoc. - Fix few nit. - Link to v11: https://lore.kernel.org/r/20240422-feature_ptp_netnext-v11-0-f14441f2a1d8@bootlin.com Changes in v11: - Add netlink examples. - Remove a change of my out of tree marvell_ptp patch in the patch series. - Remove useless extern. - Link to v10: https://lore.kernel.org/r/20240409-feature_ptp_netnext-v10-0-0fa2ea5c89a9@bootlin.com Changes in v10: - Move declarations to net/core/dev.h instead of netdevice.h - Add netlink documentation. - Add ETHTOOL_A_TSINFO_GHWTSTAMP netlink attributes instead of a bit in ETHTOOL_A_TSINFO_TIMESTAMPING bitset. - Send "Move from simple ida to xarray" patch standalone. - Add tsinfo ntf command. - Add rcu_lock protection mechanism to avoid memory leak. - Fixed doc and kdoc issue. - Link to v9: https://lore.kernel.org/r/20240226-feature_ptp_netnext-v9-0-455611549f21@bootlin.com Changes in v9: - Remove the RFC prefix. - Correct few NIT fixes. - Link to v8: https://lore.kernel.org/r/20240216-feature_ptp_netnext-v8-0-510f42f444fb@bootlin.com Changes in v8: - Drop the 6 first patch as they are now merged. - Change the full implementation to not be based on the hwtstamp layer (MAC/PHY) but on the hwtstamp provider which mean a ptp clock and a phc qualifier. - Made some patch to prepare the new implementation. - Expand netlink tsinfo instead of a new ts command for new hwtstamp configuration uAPI and for dumping tsinfo of specific hwtstamp provider. - Link to v7: https://lore.kernel.org/r/20231114-feature_ptp_netnext-v7-0-472e77951e40@bootlin.com Changes in v7: - Fix a temporary build error. - Link to v6: https://lore.kernel.org/r/20231019-feature_ptp_netnext-v6-0-71affc27b0e5@bootlin.com Changes in v6: - Few fixes from the reviews. - Replace the allowlist to default_timestamp flag to know which phy is using old API behavior. - Rename the timestamping layer enum values. - Move to a simple enum instead of the mix between enum and bitfield. - Update ts_info and ts-set in software timestamping case. Changes in v5: - Update to ndo_hwstamp_get/set. This bring several new patches. - Add few patches to make the glue. - Convert macb to ndo_hwstamp_get/set. - Add netlink specs description of new ethtool commands. - Removed netdev notifier. - Split the patches that expose the timestamping to userspace to separate the core and ethtool development. - Add description of software timestamping. - Convert PHYs hwtstamp callback to use kernel_hwtstamp_config. Changes in v4: - Move on to ethtool netlink instead of ioctl. - Add a netdev notifier to allow packet trapping by the MAC in case of PHY time stamping. - Add a PHY whitelist to not break the old PHY default time-stamping preference API. Changes in v3: - Expose the PTP choice to ethtool instead of sysfs. You can test it with the ethtool source on branch feature_ptp of: https://github.com/kmaincent/ethtool - Added a devicetree binding to select the preferred timestamp. Changes in v2: - Move selected_timestamping_layer variable of the concerned patch. - Use sysfs_streq instead of strmcmp. - Use the PHY timestamp only if available. ==================== Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16net: ethtool: Add support for tsconfig command to get/set hwtstamp configKory Maincent
Introduce support for ETHTOOL_MSG_TSCONFIG_GET/SET ethtool netlink socket to read and configure hwtstamp configuration of a PHC provider. Note that simultaneous hwtstamp isn't supported; configuring a new one disables the previous setting. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16net: ethtool: tsinfo: Enhance tsinfo to support several hwtstamp by net topologyKory Maincent
Either the MAC or the PHY can provide hwtstamp, so we should be able to read the tsinfo for any hwtstamp provider. Enhance 'get' command to retrieve tsinfo of hwtstamp providers within a network topology. Add support for a specific dump command to retrieve all hwtstamp providers within the network topology, with added functionality for filtered dump to target a single interface. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16net: Add the possibility to support a selected hwtstamp in netdeviceKory Maincent
Introduce the description of a hwtstamp provider, mainly defined with a the hwtstamp source and the phydev pointer. Add a hwtstamp provider description within the netdev structure to allow saving the hwtstamp we want to use. This prepares for future support of an ethtool netlink command to select the desired hwtstamp provider. By default, the old API that does not support hwtstamp selectability is used, meaning the hwtstamp provider pointer is unset. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16net: Make net_hwtstamp_validate accessibleKory Maincent
Make the net_hwtstamp_validate function accessible in prevision to use it from ethtool to validate the hwtstamp configuration before setting it. Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>