Age | Commit message (Collapse) | Author |
|
Add support for a bitmap for phy interface modes, which includes:
- a macro to declare the interface bitmap
- an inline helper to zero the interface bitmap
- an inline helper to detect an empty interface bitmap
- inline helpers to do a bitwise AND and OR operations on two interface
bitmaps
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change allows an extended address struct - struct sockaddr_mctp_ext
- to be passed to sendmsg/recvmsg. This allows userspace to specify
output ifindex and physical address information (for sendmsg) or receive
the input ifindex/physaddr for incoming messages (for recvmsg). This is
typically used by userspace for MCTP address discovery and assignment
operations.
The extended addressing facility is conditional on a new sockopt:
MCTP_OPT_ADDR_EXT; userspace must explicitly enable addressing before
the kernel will consume/populate the extended address data.
Includes a fix for an uninitialised var:
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sk_stream_alloc_skb() is only used by TCP.
Rename it to make this clear, and move its declaration
to include/net/tcp.h
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
neigh_output() reads n->nud_state and hh->hh_len locklessly.
This is fine, but we need to add annotations and document this.
We evaluate skip_cache first to avoid reading these fields
if the cache has to by bypassed.
syzbot report:
BUG: KCSAN: data-race in __neigh_event_send / ip_finish_output2
write to 0xffff88810798a885 of 1 bytes by interrupt on cpu 1:
__neigh_event_send+0x40d/0xac0 net/core/neighbour.c:1128
neigh_event_send include/net/neighbour.h:444 [inline]
neigh_resolve_output+0x104/0x410 net/core/neighbour.c:1476
neigh_output include/net/neighbour.h:510 [inline]
ip_finish_output2+0x80a/0xaa0 net/ipv4/ip_output.c:221
ip_finish_output+0x3b5/0x510 net/ipv4/ip_output.c:309
NF_HOOK_COND include/linux/netfilter.h:296 [inline]
ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:423
dst_output include/net/dst.h:450 [inline]
ip_local_out+0x164/0x220 net/ipv4/ip_output.c:126
__ip_queue_xmit+0x9d3/0xa20 net/ipv4/ip_output.c:525
ip_queue_xmit+0x34/0x40 net/ipv4/ip_output.c:539
__tcp_transmit_skb+0x142a/0x1a00 net/ipv4/tcp_output.c:1405
tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline]
tcp_xmit_probe_skb net/ipv4/tcp_output.c:4011 [inline]
tcp_write_wakeup+0x4a9/0x810 net/ipv4/tcp_output.c:4064
tcp_send_probe0+0x2c/0x2b0 net/ipv4/tcp_output.c:4079
tcp_probe_timer net/ipv4/tcp_timer.c:398 [inline]
tcp_write_timer_handler+0x394/0x520 net/ipv4/tcp_timer.c:626
tcp_write_timer+0xb9/0x180 net/ipv4/tcp_timer.c:642
call_timer_fn+0x2e/0x1d0 kernel/time/timer.c:1421
expire_timers+0x135/0x240 kernel/time/timer.c:1466
__run_timers+0x368/0x430 kernel/time/timer.c:1734
run_timer_softirq+0x19/0x30 kernel/time/timer.c:1747
__do_softirq+0x12c/0x26e kernel/softirq.c:558
invoke_softirq kernel/softirq.c:432 [inline]
__irq_exit_rcu kernel/softirq.c:636 [inline]
irq_exit_rcu+0x4e/0xa0 kernel/softirq.c:648
sysvec_apic_timer_interrupt+0x69/0x80 arch/x86/kernel/apic/apic.c:1097
asm_sysvec_apic_timer_interrupt+0x12/0x20
native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline]
arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline]
acpi_safe_halt drivers/acpi/processor_idle.c:109 [inline]
acpi_idle_do_entry drivers/acpi/processor_idle.c:553 [inline]
acpi_idle_enter+0x258/0x2e0 drivers/acpi/processor_idle.c:688
cpuidle_enter_state+0x2b4/0x760 drivers/cpuidle/cpuidle.c:237
cpuidle_enter+0x3c/0x60 drivers/cpuidle/cpuidle.c:351
call_cpuidle kernel/sched/idle.c:158 [inline]
cpuidle_idle_call kernel/sched/idle.c:239 [inline]
do_idle+0x1a3/0x250 kernel/sched/idle.c:306
cpu_startup_entry+0x15/0x20 kernel/sched/idle.c:403
secondary_startup_64_no_verify+0xb1/0xbb
read to 0xffff88810798a885 of 1 bytes by interrupt on cpu 0:
neigh_output include/net/neighbour.h:507 [inline]
ip_finish_output2+0x79a/0xaa0 net/ipv4/ip_output.c:221
ip_finish_output+0x3b5/0x510 net/ipv4/ip_output.c:309
NF_HOOK_COND include/linux/netfilter.h:296 [inline]
ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:423
dst_output include/net/dst.h:450 [inline]
ip_local_out+0x164/0x220 net/ipv4/ip_output.c:126
__ip_queue_xmit+0x9d3/0xa20 net/ipv4/ip_output.c:525
ip_queue_xmit+0x34/0x40 net/ipv4/ip_output.c:539
__tcp_transmit_skb+0x142a/0x1a00 net/ipv4/tcp_output.c:1405
tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline]
tcp_xmit_probe_skb net/ipv4/tcp_output.c:4011 [inline]
tcp_write_wakeup+0x4a9/0x810 net/ipv4/tcp_output.c:4064
tcp_send_probe0+0x2c/0x2b0 net/ipv4/tcp_output.c:4079
tcp_probe_timer net/ipv4/tcp_timer.c:398 [inline]
tcp_write_timer_handler+0x394/0x520 net/ipv4/tcp_timer.c:626
tcp_write_timer+0xb9/0x180 net/ipv4/tcp_timer.c:642
call_timer_fn+0x2e/0x1d0 kernel/time/timer.c:1421
expire_timers+0x135/0x240 kernel/time/timer.c:1466
__run_timers+0x368/0x430 kernel/time/timer.c:1734
run_timer_softirq+0x19/0x30 kernel/time/timer.c:1747
__do_softirq+0x12c/0x26e kernel/softirq.c:558
invoke_softirq kernel/softirq.c:432 [inline]
__irq_exit_rcu kernel/softirq.c:636 [inline]
irq_exit_rcu+0x4e/0xa0 kernel/softirq.c:648
sysvec_apic_timer_interrupt+0x69/0x80 arch/x86/kernel/apic/apic.c:1097
asm_sysvec_apic_timer_interrupt+0x12/0x20
native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline]
arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline]
acpi_safe_halt drivers/acpi/processor_idle.c:109 [inline]
acpi_idle_do_entry drivers/acpi/processor_idle.c:553 [inline]
acpi_idle_enter+0x258/0x2e0 drivers/acpi/processor_idle.c:688
cpuidle_enter_state+0x2b4/0x760 drivers/cpuidle/cpuidle.c:237
cpuidle_enter+0x3c/0x60 drivers/cpuidle/cpuidle.c:351
call_cpuidle kernel/sched/idle.c:158 [inline]
cpuidle_idle_call kernel/sched/idle.c:239 [inline]
do_idle+0x1a3/0x250 kernel/sched/idle.c:306
cpu_startup_entry+0x15/0x20 kernel/sched/idle.c:403
rest_init+0xee/0x100 init/main.c:734
arch_call_rest_init+0xa/0xb
start_kernel+0x5e4/0x669 init/main.c:1142
secondary_startup_64_no_verify+0xb1/0xbb
value changed: 0x20 -> 0x01
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-10-25
Misc updates for mlx5 driver:
1) Misc updates and cleanups:
- Don't write directly to netdev->dev_addr, From Jakub Kicinski
- Remove unnecessary checks for slow path flag in tc module
- Fix unused function warning of mlx5i_flow_type_mask
- Bridge, support replacing existing FDB entry
2) Sub Functions, Reduction in memory usage:
- Reduce flow counters bulk query buffer size
- Implement max_macs devlink parameter
- Add devlink vendor params to control Event Queue sizes
- Added SF life cycle trace points by Parav/
3) From Aya, Firmware health buffer reporting improvements
- Print health buffer by log level and more missing information
- Periodic update of host time to firmware
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During a testing of an user-space application which transmits UDP
multicast datagrams and utilizes multicast routing to send the UDP
datagrams out of defined network interfaces, I've found a multicast
router does not fill-in UDP checksum into locally produced, looped-back
and forwarded UDP datagrams, if an original output NIC the datagrams
are sent to has UDP TX checksum offload enabled.
The datagrams are sent malformed out of the NIC the datagrams have been
forwarded to.
It is because:
1. If TX checksum offload is enabled on the output NIC, UDP checksum
is not calculated by kernel and is not filled into skb data.
2. dev_loopback_xmit(), which is called solely by
ip_mc_finish_output(), sets skb->ip_summed = CHECKSUM_UNNECESSARY
unconditionally.
3. Since 35fc92a9 ("[NET]: Allow forwarding of ip_summed except
CHECKSUM_COMPLETE"), the ip_summed value is preserved during
forwarding.
4. If ip_summed != CHECKSUM_PARTIAL, checksum is not calculated during
a packet egress.
The minimum fix in dev_loopback_xmit():
1. Preserves skb->ip_summed CHECKSUM_PARTIAL. This is the
case when the original output NIC has TX checksum offload enabled.
The effects are:
a) If the forwarding destination interface supports TX checksum
offloading, the NIC driver is responsible to fill-in the
checksum.
b) If the forwarding destination interface does NOT support TX
checksum offloading, checksums are filled-in by kernel before
skb is submitted to the NIC driver.
c) For local delivery, checksum validation is skipped as in the
case of CHECKSUM_UNNECESSARY, thanks to skb_csum_unnecessary().
2. Translates ip_summed CHECKSUM_NONE to CHECKSUM_UNNECESSARY. It
means, for CHECKSUM_NONE, the behavior is unmodified and is there
to skip a looped-back packet local delivery checksum validation.
Signed-off-by: Cyril Strejc <cyril.strejc@skoda.cz>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
irq_cpu_{on,off}line() are now only used by the Octeon platform.
Make their use conditional on this plaform being enabled, and
otherwise hidden away.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20211021170414.3341522-4-maz@kernel.org
|
|
Now that entry code handles IRQ entry (including setting the IRQ regs)
before calling irqchip code, irqchip code can safely call
generic_handle_domain_irq(), and there's no functional reason for it to
call handle_domain_irq().
Let's cement this split of responsibility and remove handle_domain_irq()
entirely, updating irqchip drivers to call generic_handle_domain_irq().
For consistency, handle_domain_nmi() is similarly removed and replaced
with a generic_handle_domain_nmi() function which also does not perform
any entry logic.
Previously handle_domain_{irq,nmi}() had a WARN_ON() which would fire
when they were called in an inappropriate context. So that we can
identify similar issues going forward, similar WARN_ON_ONCE() logic is
added to the generic_handle_*() functions, and comments are updated for
clarity and consistency.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
New x86 FPU features will be very large, requiring ~10k of stack in
signal handlers. These new features require a new approach called
"dynamic features".
The kernel currently tries to ensure that altstacks are reasonably
sized. Right now, on x86, sys_sigaltstack() requires a size of >=2k.
However, that 2k is a constant. Simply raising that 2k requirement
to >10k for the new features would break existing apps which have a
compiled-in size of 2k.
Instead of universally enforcing a larger stack, prohibit a process from
using dynamic features without properly-sized altstacks. This must be
enforced in two places:
* A dynamic feature can not be enabled without an large-enough altstack
for each process thread.
* Once a dynamic feature is enabled, any request to install a too-small
altstack will be rejected
The dynamic feature enabling code must examine each thread in a
process to ensure that the altstacks are large enough. Add a new lock
(sigaltstack_lock()) to ensure that threads can not race and change
their altstack after being examined.
Add the infrastructure in form of a config option and provide empty
stubs for architectures which do not need dynamic altstack size checks.
This implementation will be fleshed out for x86 in a future patch called
x86/arch_prctl: Add controls for dynamic XSTATE components
[dhansen: commit message. ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211021225527.10184-2-chang.seok.bae@intel.com
|
|
The Atmel TPM 1.2 chips crash with error
`tpm_try_transmit: send(): error -62` since kernel 4.14.
It is observed from the kernel log after running `tpm_sealdata -z`.
The error thrown from the command is as follows
```
$ tpm_sealdata -z
Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
code=0087 (135), I/O error
```
The issue was reproduced with the following Atmel TPM chip:
```
$ tpm_version
T0 TPM 1.2 Version Info:
Chip Version: 1.2.66.1
Spec Level: 2
Errata Revision: 3
TPM Vendor ID: ATML
TPM Version: 01010000
Manufacturer Info: 41544d4c
```
The root cause of the issue is due to the TPM calls to msleep()
were replaced with usleep_range() [1], which reduces
the actual timeout. Via experiments, it is observed that
the original msleep(5) actually sleeps for 15ms.
Because of a known timeout issue in Atmel TPM 1.2 chip,
the shorter timeout than 15ms can cause the error described above.
A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
reduced the timeout to less than 1ms. With experiments,
the problematic timeout in the latest kernel is the one
for `wait_for_tpm_stat`.
To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
for Ateml TPM 2.0 chip, and chips from other vendors.
As explained above, the chosen 15ms timeout is
the actual timeout before this issue introduced,
thus the old value is used here.
Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
the existing TPM_TIMEOUT_RANGE_US (300us).
The fixed has been tested in the system with the affected Atmel chip
with no issues observed after boot up.
References:
[1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
1.2/2.0 generic drivers
[2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
[3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
[4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
granularity
Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
Signed-off-by: Hao Wu <hao.wu@rubrik.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
RFC 5082 IP_MINTTL option is rarely used on hosts.
Add a static key to remove from TCP fast path useless code,
and potential cache line miss to fetch inet_sk(sk)->min_ttl
Note that once ip4_min_ttl static key has been enabled,
it stays enabled until next boot.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
RFC 5082 IPV6_MINHOPCOUNT is rarely used on hosts.
Add a static key to remove from TCP fast path useless code,
and potential cache line miss to fetch tcp_inet6_sk(sk)->min_hopcount
Note that once ip6_min_hopcount static key has been enabled,
it stays enabled until next boot.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk->sk_rx_queue_mapping can be modified locklessly,
add a couple of READ_ONCE()/WRITE_ONCE() to document this fact.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk_rx_queue_mapping is located in a cache line that should be kept read mostly.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk_napi_id is located in a cache line that can be kept read mostly.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Increase cache locality by moving rx_dst_coookie next to sk->sk_rx_dst
This removes one or two cache line misses in IPv6 early demux (TCP/UDP)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Increase cache locality by moving rx_dst_ifindex next to sk->sk_rx_dst
This is part of an effort to reduce cache line misses in TCP fast path.
This removes one cache line miss in early demux.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The __compiletime_strlen() macro expansion will shadow p_size and p_len
local variables. No callers currently use any of the shadowed names
for their "p" variable, so there are no code generation problems.
Add "__" prefixes to variable definitions __compiletime_strlen() to
avoid new W=2 warnings:
./include/linux/fortify-string.h: In function 'strnlen':
./include/linux/fortify-string.h:17:9: warning: declaration of 'p_size' shadows a previous local [-Wshadow]
17 | size_t p_size = __builtin_object_size(p, 1); \
| ^~~~~~
./include/linux/fortify-string.h:77:17: note: in expansion of macro '__compiletime_strlen'
77 | size_t p_len = __compiletime_strlen(p);
| ^~~~~~~~~~~~~~~~~~~~
./include/linux/fortify-string.h:76:9: note: shadowed declaration is here
76 | size_t p_size = __builtin_object_size(p, 1);
| ^~~~~~
Signed-off-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211025210528.261643-1-quic_qiancai@quicinc.com
|
|
Currently, max_macs is taking 70Kbytes of memory per function. This
size is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the number of max_macs.
For example, to reduce the number of max_macs to 1, execute::
$ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \
cmode driverinit
$ devlink dev reload pci/0000:00:0b.0
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Event EQ is an EQ which received the notification of almost all the
events generated by the NIC.
Currently, each event EQ is taking 512KB of memory. This size is not
needed in most use cases, and is critical with large scale. Hence,
allow user to configure the size of the event EQ.
For example to reduce event EQ size to 64, execute::
$ devlink resource set pci/0000:00:0b.0 path /event_eq_size/ size 64
$ devlink dev reload pci/0000:00:0b.0
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Currently, each I/O EQ is taking 128KB of memory. This size
is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the size of I/O EQs.
For example, to reduce I/O EQ size to 64, execute:
$ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
$ devlink dev reload pci/0000:00:0b.0
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Firmware logs its asserts also to non-volatile memory. In order to
reduce drift between the NIC and the host, the driver sets the host
epoch-time to the firmware every hour.
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Enhance health buffer to include:
- assert_var5: expose the 6'th assert variable.
- time: error's time-stamp in seconds (epoch time).
- rfr: Recovery Flow Requiered. When set, indicates that the error
cannot be recovered without flow involving reset.
- severity: error's severity value, ranging from emergency to debug.
Expose them in the health buffer dump (dmesg and devlink fw reporter).
Health buffer in dmesg:
mlx5_core 0000:08:00.0: print_health_info:425:(pid 912): Health issue observed, firmware internal error, severity(3) ERROR:
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[0] 0x08040700
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[1] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[2] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[3] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[4] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[5] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:432:(pid 912): assert_exit_ptr 0x00aaf800
mlx5_core 0000:08:00.0: print_health_info:434:(pid 912): assert_callra 0x00aaf70c
mlx5_core 0000:08:00.0: print_health_info:436:(pid 912): fw_ver 16.32.492
mlx5_core 0000:08:00.0: print_health_info:437:(pid 912): time 1634819758
mlx5_core 0000:08:00.0: print_health_info:438:(pid 912): hw_id 0x0000020d
mlx5_core 0000:08:00.0: print_health_info:439:(pid 912): rfr 0
mlx5_core 0000:08:00.0: print_health_info:440:(pid 912): severity 3 (ERROR)
mlx5_core 0000:08:00.0: print_health_info:441:(pid 912): irisc_index 9
mlx5_core 0000:08:00.0: print_health_info:442:(pid 912): synd 0x1: firmware internal error
mlx5_core 0000:08:00.0: print_health_info:444:(pid 912): ext_synd 0x802b
mlx5_core 0000:08:00.0: print_health_info:445:(pid 912): raw fw_ver 0x102001ec
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The second argument was only used by the USB gadget code, yet everyone
pays the overhead of passing a zero to be passed into aio, where it
ends up being part of the aio res2 value.
Now that everybody is passing in zero, kill off the extra argument.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
tls already supports the SM4 GCM/CCM algorithms. It is also necessary
to add support for these two algorithms in tls_crypto_context to avoid
potential issues caused by forced type conversion.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The management registrations locking was broken, the list was
locked for each wdev, but cfg80211_mgmt_registrations_update()
iterated it without holding all the correct spinlocks, causing
list corruption.
Rather than trying to fix it with fine-grained locking, just
move the lock to the wiphy/rdev (still need the list on each
wdev), we already need to hold the wdev lock to change it, so
there's no contention on the lock in any case. This trivially
fixes the bug since we hold one wdev's lock already, and now
will hold the lock that protects all lists.
Cc: stable@vger.kernel.org
Reported-by: Jouni Malinen <j@w1.fi>
Fixes: 6cd536fe62ef ("cfg80211: change internal management frame registration API")
Link: https://lore.kernel.org/r/20211025133111.5cf733eab0f4.I7b0abb0494ab712f74e2efcd24bb31ac33f7eee9@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Add generic fast retrain auto-negotiation function for C45 PHYs.
Signed-off-by: Luo Jie <luoj@codeaurora.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the constants for 2.5G fast retrain capability
in 10G AN control register, fast retrain status and
control register and THP bypass register into mdio.h.
Signed-off-by: Luo Jie <luoj@codeaurora.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that the rtnl_mutex is going away for dsa_port_{host_,}fdb_{add,del},
no one is serializing access to the address lists that DSA keeps for the
purpose of reference counting on shared ports (CPU and cascade ports).
It can happen for one dsa_switch_do_fdb_del to do list_del on a dp->fdbs
element while another dsa_switch_do_fdb_{add,del} is traversing dp->fdbs.
We need to avoid that.
Currently dp->mdbs is not at risk, because dsa_switch_do_mdb_{add,del}
still runs under the rtnl_mutex. But it would be nice if it would not
depend on that being the case. So let's introduce a mutex per port (the
address lists are per port too) and share it between dp->mdbs and
dp->fdbs.
The place where we put the locking is interesting. It could be tempting
to put a DSA-level lock which still serializes calls to
.port_fdb_{add,del}, but it would still not avoid concurrency with other
driver code paths that are currently under rtnl_mutex (.port_fdb_dump,
.port_fast_age). So it would add a very false sense of security (and
adding a global switch-wide lock in DSA to resynchronize with the
rtnl_lock is also counterproductive and hard).
So the locking is intentionally done only where the dp->fdbs and dp->mdbs
lists are traversed. That means, from a driver perspective, that
.port_fdb_add will be called with the dp->addr_lists_lock mutex held on
the CPU port, but not held on user ports. This is done so that driver
writers are not encouraged to rely on any guarantee offered by
dp->addr_lists_lock.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DSA would like to remove the rtnl_lock from its
SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handlers, and the felix driver uses
the same MAC table functions as ocelot.
This means that the MAC table functions will no longer be implicitly
serialized with respect to each other by the rtnl_mutex, we need to add
a dedicated lock in ocelot for the non-atomic operations of selecting a
MAC table row, reading/writing what we want and polling for completion.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 965e6b262f48257dbdb51b565ecfd84877a0ab5f, reversing
changes made to 4d98bb0d7ec2d0b417df6207b0bafe1868bad9f8.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2021-10-24
this is a pull request of 15 patches for net-next/master.
The first patch is by Thomas Gleixner and makes use of
hrtimer_forward_now() in the CAN broad cast manager (bcm).
The next patch is by me and changes the type of the variables used in
the CAN bit timing calculation can_fixup_bittiming() to unsigned int.
Vincent Mailhol provides 6 patches targeting the CAN device
infrastructure. The CAN-FD specific Transmitter Delay Compensation
(TDC) is updated and configuration via the CAN netlink interface is
added.
Qing Wang's patch updates the at91 and janz-ican3 drivers to use
sysfs_emit() instead of snprintf() in the sysfs show functions.
Geert Uytterhoeven's patch drops the unneeded ARM dependency from the
rar Kconfig.
Cai Huoqing's patch converts the mscan driver to make use of the
dev_err_probe() helper function.
A patch by me against the gsusb driver changes the printf format
strings to use %u to print unsigned values.
Stephane Grosjean's patch updates the peak_usb CAN-FD driver to use
the 64 bit timestamps provided by the hardware.
The last 2 patches target the xilinx_can driver. Michal Simek provides
a patch that removes repeated word from the kernel-doc and Dongliang
Mu's patch removes a redundant netif_napi_del() from the xcan_remove()
function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Several architectures select GENERIC_IRQ_MULTI_HANDLER and branch to
handle_arch_irq() without performing any entry accounting.
Add a generic wrapper to handle the common irqentry work when invoking
handle_arch_irq(). Where an architecture needs to perform some entry
accounting itself, it will need to invoke handle_arch_irq() itself.
In subsequent patches it will become the responsibilty of the entry code
to set the irq regs when entering an IRQ (rather than deferring this to
an irqchip handler), so generic_handle_arch_irq() is made to set the irq
regs now. This can be redundant in some cases, but is never harmful as
saving/restoring the old regs nests safely.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
There are no callers for early_init_dt_scan_chosen_arch(), so remove it.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Link: https://lkml.kernel.org/r/20211022164642.2815706-1-robh@kernel.org
|
|
Drivers using the new IRQCHIP_PLATFORM_DRIVER_BEGIN helper
fail to link when compile-testing without CONFIG_OF,
as that means CONFIG_IRQCHIP is disabled as well:
ld.lld: error: undefined symbol: platform_irqchip_probe
>>> referenced by irq-meson-gpio.c
>>> irqchip/irq-meson-gpio.o:(meson_gpio_intc_driver) in archive drivers/built-in.a
>>> referenced by irq-mchp-eic.c
>>> irqchip/irq-mchp-eic.o:(mchp_eic_driver) in archive drivers/built-in.a
As the drivers are not actually used in this case, just
making the reference to this symbol conditional helps
avoid the link failure.
Fixes: f8410e626569 ("irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211022154927.920491-1-arnd@kernel.org
|
|
struct can_tdc::tdco represents the absolute offset from TDCV. Some
controllers use instead an offset relative to the Sample Point (SP)
such that:
| SSP = TDCV + absolute TDCO
| = TDCV + SP + relative TDCO
Consequently:
| relative TDCO = absolute TDCO - SP
The function can_tdc_get_relative_tdco() allow to retrieve this
relative TDCO value.
Link: https://lore.kernel.org/all/20210918095637.20108-7-mailhol.vincent@wanadoo.fr
CC: Stefan Mätje <Stefan.Maetje@esd.eu>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Some CAN device can measure the TDCV (Transmission Delay Compensation
Value) automatically for each transmitted CAN frames.
A callback function do_get_auto_tdcv() is added to retrieve that
value. This function is used only if CAN_CTRLMODE_TDC_AUTO is enabled
(if CAN_CTRLMODE_TDC_MANUAL is selected, the TDCV value is provided by
the user).
If the device does not support reporting of TDCV, do_get_auto_tdcv()
should be set to NULL and TDCV will not be reported by the netlink
interface.
On success, do_get_auto_tdcv() shall return 0. If the value can not be
measured by the device, for example because network is down or because
no frames were transmitted yet, can_priv::do_get_auto_tdcv() shall
return a negative error code (e.g. -EINVAL) to signify that the value
is not yet available. In such cases, TDCV is not reported by the
netlink interface.
Link: https://lore.kernel.org/all/20210918095637.20108-6-mailhol.vincent@wanadoo.fr
CC: Stefan Mätje <stefan.maetje@esd.eu>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Add the netlink interface for TDC parameters of struct can_tdc_const
and can_tdc.
Contrary to the can_bittiming(_const) structures for which there is
just a single IFLA_CAN(_DATA)_BITTMING(_CONST) entry per structure,
here, we create a nested entry IFLA_CAN_TDC. Within this nested entry,
additional IFLA_CAN_TDC_TDC* entries are added for each of the TDC
parameters of the newly introduced struct can_tdc_const and struct
can_tdc.
For struct can_tdc_const, these are:
IFLA_CAN_TDC_TDCV_MIN
IFLA_CAN_TDC_TDCV_MAX
IFLA_CAN_TDC_TDCO_MIN
IFLA_CAN_TDC_TDCO_MAX
IFLA_CAN_TDC_TDCF_MIN
IFLA_CAN_TDC_TDCF_MAX
For struct can_tdc, these are:
IFLA_CAN_TDC_TDCV
IFLA_CAN_TDC_TDCO
IFLA_CAN_TDC_TDCF
This is done so that changes can be applied in the future to the
structures without breaking the netlink interface.
The TDC netlink logic works as follow:
* CAN_CTRLMODE_FD is not provided:
- if any TDC parameters are provided: error.
- TDC parameters not provided: TDC parameters unchanged.
* CAN_CTRLMODE_FD is provided and is false:
- TDC is deactivated: both the structure and the
CAN_CTRLMODE_TDC_{AUTO,MANUAL} flags are flushed.
* CAN_CTRLMODE_FD provided and is true:
- CAN_CTRLMODE_TDC_{AUTO,MANUAL} and tdc{v,o,f} not provided: call
can_calc_tdco() to automatically decide whether TDC should be
activated and, if so, set CAN_CTRLMODE_TDC_AUTO and uses the
calculated tdco value.
- CAN_CTRLMODE_TDC_AUTO and tdco provided: set
CAN_CTRLMODE_TDC_AUTO and use the provided tdco value. Here,
tdcv is illegal and tdcf is optional.
- CAN_CTRLMODE_TDC_MANUAL and both of tdcv and tdco provided: set
CAN_CTRLMODE_TDC_MANUAL and use the provided tdcv and tdco
value. Here, tdcf is optional.
- CAN_CTRLMODE_TDC_{AUTO,MANUAL} are mutually exclusive. Whenever
one flag is turned on, the other will automatically be turned
off. Providing both returns an error.
- Combination other than the one listed above are illegal and will
return an error.
N.B. above rules mean that whenever CAN_CTRLMODE_FD is provided, the
previous TDC values will be overwritten. The only option to reuse
previous TDC value is to not provide CAN_CTRLMODE_FD.
All the new parameters are defined as u32. This arbitrary choice is
done to mimic the other bittiming values with are also all of type
u32. An u16 would have been sufficient to hold the TDC values.
This patch completes below series (c.f. [1]):
- commit 289ea9e4ae59 ("can: add new CAN FD bittiming parameters:
Transmitter Delay Compensation (TDC)")
- commit c25cc7993243 ("can: bittiming: add calculation for CAN FD
Transmitter Delay Compensation (TDC)")
[1] https://lore.kernel.org/linux-can/20210224002008.4158-1-mailhol.vincent@wanadoo.fr/T/#t
Link: https://lore.kernel.org/all/20210918095637.20108-5-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The function can_calc_tdco() directly retrieves can_priv from the
net_device and directly modifies it.
This is annoying for the upcoming patch. In
drivers/net/can/dev/netlink.c:can_changelink(), the data bittiming are
written to a temporary structure and memcpyed to can_priv only after
everything succeeded. In the next patch, where we will introduce the
netlink interface for TDC parameters, we will add a new TDC block
which can potentially fail. For this reason, the data bittiming
temporary structure has to be copied after that to-be-introduced TDC
block. However, TDC also needs to access data bittiming information.
We change the prototype so that the data bittiming structure is passed
to can_calc_tdco() as an argument instead of retrieving it from
priv. This way can_calc_tdco() can access the data bittiming before it
gets memcpyed to priv.
Link: https://lore.kernel.org/all/20210918095637.20108-4-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
In the current implementation, all Transmission Delay Compensation
(TDC) parameters are expressed in time quantum. However, ISO 11898-1
actually specifies that these should be expressed in *minimum* time
quantum.
Furthermore, the minimum time quantum is specified to be "one node
clock period long" (c.f. paragraph 11.3.1.1 "Bit time"). For sake of
simplicity, we prefer to use the "clock period" term instead of
"minimum time quantum" because we believe that it is more broadly
understood.
This patch fixes that discrepancy by updating the documentation and
the formula for TDCO calculation.
N.B. In can_calc_tdco(), the sample point (in time quantum) was
calculated using a division, thus introducing a risk of rounding and
truncation errors. On top of changing the unit to clock period, we
also modified the formula to use only additions.
Link: https://lore.kernel.org/all/20210918095637.20108-3-mailhol.vincent@wanadoo.fr
Suggested-by: Stefan Mätje <Stefan.Maetje@esd.eu>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
ISO 11898-1 specifies in section 11.3.3 "Transmitter delay
compensation" that "the configuration range for [the] SSP position
shall be at least 0 to 63 minimum time quanta."
Because SSP = TDCV + TDCO, it means that we should allow both TDCV and
TDCO to hold zero value in order to honor SSP's minimum possible
value.
However, current implementation assigned special meaning to TDCV and
TDCO's zero values:
* TDCV = 0 -> TDCV is automatically measured by the transceiver.
* TDCO = 0 -> TDC is off.
In order to allow for those values to really be zero and to maintain
current features, we introduce two new flags:
* CAN_CTRLMODE_TDC_AUTO indicates that the controller support
automatic measurement of TDCV.
* CAN_CTRLMODE_TDC_MANUAL indicates that the controller support
manual configuration of TDCV. N.B.: current implementation failed
to provide an option for the driver to indicate that only manual
mode was supported.
TDC is disabled if both CAN_CTRLMODE_TDC_AUTO and
CAN_CTRLMODE_TDC_MANUAL flags are off, c.f. the helper function
can_tdc_is_enabled() which is also introduced in this patch.
Also, this patch adds three fields: tdcv_min, tdco_min and tdcf_min to
struct can_tdc_const. While we are not convinced that those three
fields could be anything else than zero, we can imagine that some
controllers might specify a lower bound on these. Thus, those minimums
are really added "just in case".
Comments of struct can_tdc and can_tdc_const are updated accordingly.
Finally, the changes are applied to the etas_es58x driver.
Link: https://lore.kernel.org/all/20210918095637.20108-2-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Now that the rtnl_mutex is going away for dsa_port_{host_,}fdb_{add,del},
no one is serializing access to the address lists that DSA keeps for the
purpose of reference counting on shared ports (CPU and cascade ports).
It can happen for one dsa_switch_do_fdb_del to do list_del on a dp->fdbs
element while another dsa_switch_do_fdb_{add,del} is traversing dp->fdbs.
We need to avoid that.
Currently dp->mdbs is not at risk, because dsa_switch_do_mdb_{add,del}
still runs under the rtnl_mutex. But it would be nice if it would not
depend on that being the case. So let's introduce a mutex per port (the
address lists are per port too) and share it between dp->mdbs and
dp->fdbs.
The place where we put the locking is interesting. It could be tempting
to put a DSA-level lock which still serializes calls to
.port_fdb_{add,del}, but it would still not avoid concurrency with other
driver code paths that are currently under rtnl_mutex (.port_fdb_dump,
.port_fast_age). So it would add a very false sense of security (and
adding a global switch-wide lock in DSA to resynchronize with the
rtnl_lock is also counterproductive and hard).
So the locking is intentionally done only where the dp->fdbs and dp->mdbs
lists are traversed. That means, from a driver perspective, that
.port_fdb_add will be called with the dp->addr_lists_lock mutex held on
the CPU port, but not held on user ports. This is done so that driver
writers are not encouraged to rely on any guarantee offered by
dp->addr_lists_lock.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DSA would like to remove the rtnl_lock from its
SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handlers, and the felix driver uses
the same MAC table functions as ocelot.
This means that the MAC table functions will no longer be implicitly
serialized with respect to each other by the rtnl_mutex, we need to add
a dedicated lock in ocelot for the non-atomic operations of selecting a
MAC table row, reading/writing what we want and polling for completion.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
7712 is a 16nm process SoC with a 10/100 integrated Ethernet PHY,
utilize the recently defined 16nm EPHY macro to configure that PHY.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds some helpers for accessing non-phy MDIO devices. They are
analogous to phy_(read|write|modify), except that they take an mdio_device
and not a phy_device.
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If CONFIG_BLOCK isn't set, then it's an empty struct anyway. Just make
it generally available, so we don't break the compile:
kernel/sched/core.c: In function ‘sched_submit_work’:
kernel/sched/core.c:6346:35: error: ‘struct task_struct’ has no member named ‘plug’
6346 | blk_flush_plug(tsk->plug, true);
| ^~
kernel/sched/core.c: In function ‘io_schedule_prepare’:
kernel/sched/core.c:8357:20: error: ‘struct task_struct’ has no member named ‘plug’
8357 | if (current->plug)
| ^~
kernel/sched/core.c:8358:39: error: ‘struct task_struct’ has no member named ‘plug’
8358 | blk_flush_plug(current->plug, true);
| ^~
Reported-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 008f75a20e70 ("block: cleanup the flush plug helpers")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Restrict bpf_jit_limit to the maximum supported by the arch's JIT.
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211014142554.53120-4-lmb@cloudflare.com
|
|
The change of devlink_register() to be last devlink command together
with delayed notification logic made the publish API to be obsolete.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix two regressions, one related to ACPI power resources
management and one that broke ACPI tools compilation.
Specifics:
- Stop turning off unused ACPI power resources in an unknown state to
address a regression introduced during the 5.14 cycle (Rafael
Wysocki).
- Fix an ACPI tools build issue introduced recently when the minimal
stdarg.h was added (Miguel Bernal Marin)"
* tag 'acpi-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Do not turn off power resources in unknown state
ACPI: tools: fix compilation error
|
|
Merge a fix for a recent ACPI tools bild regresson.
* acpi-tools:
ACPI: tools: fix compilation error
|