Age | Commit message (Collapse) | Author |
|
Use kfree() instead of kvfree() on ft->g in arfs_create_groups() because
the memory is allocated with kcalloc().
Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Missing space at the end of a comment line, add it.
Signed-off-by: Hu Haowen <xianfengting221@163.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Flow steering core layer is independent of the eswitch layer.
Hence avoid fs_core dependency on eswitch.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The eth_addr member is passed to ether_addr functions that require
2-byte alignment, therefore the member must be properly aligned
to avoid unaligned accesses.
The problem is in place since the initial merge of multicast to unicast:
commit 6db6f0eae6052b70885562e1733896647ec1d807 bridge: multicast to unicast
Fixes: 6db6f0eae605 ("bridge: multicast to unicast")
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Thomas Martitz <t.martitz@avm.de>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull rdma fixes from Jason Gunthorpe:
"Several regression fixes from work that landed in the merge window,
particularly in the mlx5 driver:
- Various static checker and warning fixes
- General bug fixes in rvt, qedr, hns, mlx5 and hfi1
- Several regression fixes related to the ECE and QP changes in last
cycle
- Fixes for a few long standing crashers in CMA, uverbs ioctl, and
xrc"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (25 commits)
IB/hfi1: Add atomic triggered sleep/wakeup
IB/hfi1: Correct -EBUSY handling in tx code
IB/hfi1: Fix module use count flaw due to leftover module put calls
IB/hfi1: Restore kfree in dummy_netdev cleanup
IB/mad: Fix use after free when destroying MAD agent
RDMA/mlx5: Protect from kernel crash if XRC_TGT doesn't have udata
RDMA/counter: Query a counter before release
RDMA/mad: Fix possible memory leak in ib_mad_post_receive_mads()
RDMA/mlx5: Fix integrity enabled QP creation
RDMA/mlx5: Remove ECE limitation from the RAW_PACKET QPs
RDMA/mlx5: Fix remote gid value in query QP
RDMA/mlx5: Don't access ib_qp fields in internal destroy QP path
RDMA/core: Check that type_attrs is not NULL prior access
RDMA/hns: Fix an cmd queue issue when resetting
RDMA/hns: Fix a calltrace when registering MR from userspace
RDMA/mlx5: Add missed RST2INIT and INIT2INIT steps during ECE handshake
RDMA/cma: Protect bind_list and listen_list while finding matching cm id
RDMA/qedr: Fix KASAN: use-after-free in ucma_event_handler+0x532
RDMA/efa: Set maximum pkeys device attribute
RDMA/rvt: Fix potential memory leak caused by rvt_alloc_rq
...
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and takes care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
In the case of ptp_pch, after removing PCI helper functions, .suspend()
and .resume() became empty-body functions. Hence, define them NULL and
use dev_pm_ops.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
there is a problem with the CWR flag set in an incoming ACK segment
and it leads to the situation when the ECE flag is latched forever
the following packetdrill script shows what happens:
// Stack receives incoming segments with CE set
+0.1 <[ect0] . 11001:12001(1000) ack 1001 win 65535
+0.0 <[ce] . 12001:13001(1000) ack 1001 win 65535
+0.0 <[ect0] P. 13001:14001(1000) ack 1001 win 65535
// Stack repsonds with ECN ECHO
+0.0 >[noecn] . 1001:1001(0) ack 12001
+0.0 >[noecn] E. 1001:1001(0) ack 13001
+0.0 >[noecn] E. 1001:1001(0) ack 14001
// Write a packet
+0.1 write(3, ..., 1000) = 1000
+0.0 >[ect0] PE. 1001:2001(1000) ack 14001
// Pure ACK received
+0.01 <[noecn] W. 14001:14001(0) ack 2001 win 65535
// Since CWR was sent, this packet should NOT have ECE set
+0.1 write(3, ..., 1000) = 1000
+0.0 >[ect0] P. 2001:3001(1000) ack 14001
// but Linux will still keep ECE latched here, with packetdrill
// flagging a missing ECE flag, expecting
// >[ect0] PE. 2001:3001(1000) ack 14001
// in the script
In the situation above we will continue to send ECN ECHO packets
and trigger the peer to reduce the congestion window. To avoid that
we can check CWR on pure ACKs received.
v3:
- Add a sequence check to avoid sending an ACK to an ACK
v2:
- Adjusted the comment
- move CWR check before checking for unacknowledged packets
Signed-off-by: Denis Kirjanov <denis.kirjanov@suse.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The skcipher API dynamically instantiates the transformation object
on request that implements the requested algorithm optimally on the
given platform. This notion of optimality only matters for cases like
bulk network or disk encryption, where performance can be a bottleneck,
or in cases where the algorithm itself is not known at compile time.
In the mscc case, we are dealing with AES encryption of a single
block, and so neither concern applies, and we are better off using
the AES library interface, which is lightweight and safe for this
kind of use.
Note that the scatterlist API does not permit references to buffers
that are located on the stack, so the existing code is incorrect in
any case, but avoiding the skcipher and scatterlist APIs entirely is
the most straight-forward approach to fixing this.
Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Fixes: 28c5107aa904e ("net: phy: mscc: macsec support")
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Fix kernel crash on system call single stepping.
- Make sure early program check handler is executed with DAT on to
avoid an endless program check loop.
- Add __GFP_NOWARN flag to debug feature to avoid user triggerable
allocation failure messages.
* tag 's390-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/debug: avoid kernel warning on too large number of pages
s390/kasan: fix early pgm check handler execution
s390: fix system call single stepping
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes gathered in the last two weeks.
The major changes here are fixes for the recent DPCM regressions found
on i.MX and Qualcomm platforms and fixes for resource leaks in ASoC
DAI registrations.
Other than those are mostly device-specific fixes including the usual
USB- and HD-audio quirks, and a fix for syzkaller case and ID updates
for new Intel platforms"
* tag 'sound-5.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (32 commits)
ALSA: usb-audio: Fix OOB access of mixer element list
ALSA: usb-audio: add quirk for Samsung USBC Headset (AKG)
ALSA: usb-audio: Add registration quirk for Kingston HyperX Cloud Flight S
ASoC: rockchip: Fix a reference count leak.
ASoC: amd: closing specific instance.
ALSA: hda: Intel: add missing PCI IDs for ICL-H, TGL-H and EKL
ASoC: hdac_hda: fix memleak with regmap not freed on remove
ASoC: SOF: Intel: add PCI IDs for ICL-H and TGL-H
ASoC: SOF: Intel: add PCI ID for CometLake-S
ASoC: Intel: SOF: merge COMETLAKE_LP and COMETLAKE_H
ALSA: hda/realtek: Add mute LED and micmute LED support for HP systems
ALSA: usb-audio: Fix potential use-after-free of streams
ALSA: hda/realtek - Add quirk for MSI GE63 laptop
ASoC: fsl_ssi: Fix bclk calculation for mono channel
ASoC: SOF: Intel: hda: Clear RIRB status before reading WP
ASoC: rt1015: Update rt1015 default register value according to spec modification.
ASoC: qcom: common: set correct directions for dailinks
ASoc: q6afe: add support to get port direction
ASoC: soc-pcm: fix checks for multi-cpu FE dailinks
ASoC: rt5682: Let dai clks be registered whether mclk exists or not
...
|
|
Define attach_type_name in common.c instead of main.h so it is only
defined once. This leads to a slight decrease in the binary size of
bpftool.
Before:
text data bss dec hex filename
399024 11168 1573160 1983352 1e4378 bpftool
After:
text data bss dec hex filename
398256 10880 1573160 1982296 1e3f58 bpftool
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200624143154.13145-1-tklauser@distanz.ch
|
|
Define prog_type_name in prog.c instead of main.h so it is only defined
once. This leads to a slight decrease in the binary size of bpftool.
Before:
text data bss dec hex filename
401032 11936 1573160 1986128 1e4e50 bpftool
After:
text data bss dec hex filename
399024 11168 1573160 1983352 1e4378 bpftool
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200624143124.12914-1-tklauser@distanz.ch
|
|
Flow metering entries in IEEE 802.1Qci is an optional function for a
flow filtering module. Flow metering is two rates two buckets and three
color marker to policing the frames. This patch only enable one rate one
bucket and in color blind mode. Flow metering instance are as
specified in the algorithm in MEF 10.3 and in Bandwidth Profile
Parameters. They are:
a) Flow meter instance identifier. An integer value identifying the flow
meter instance. The patch use the police 'index' as thin value.
b) Committed Information Rate (CIR), in bits per second. This patch use
the 'rate_bytes_ps' represent this value.
c) Committed Burst Size (CBS), in octets. This patch use the 'burst'
represent this value.
d) Excess Information Rate (EIR), in bits per second.
e) Excess Burst Size per Bandwidth Profile Flow (EBS), in octets.
And plus some other parameters. This patch set EIR/EBS default disable
and color blind mode.
v1->v2 changes:
- Use div_u64() as division replace the '/' report:
All errors (new ones prefixed by >>):
ld: drivers/net/ethernet/freescale/enetc/enetc_qos.o: in function `enetc_flowmeter_hw_set':
>> enetc_qos.c:(.text+0x66): undefined reference to `__udivdi3'
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Hardware device may include more than one police entry. Specifying the
action's index make it possible for several tc filters to share the same
police action when installing the filters.
Propagate this index to device drivers through the flow offload
intermediate representation, so that drivers could share a single
hardware policer between multiple filters.
v1->v2 changes:
- Update the commit message suggest by Ido Schimmel <idosch@idosch.org>
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Base on the tc flower offload police action add max frame size by the
parameter 'mtu'. Tc flower device driver working by the IEEE 802.1Qci
stream filter can implement the max frame size filtering. Add it to the
current hardware tc flower stearm filter driver.
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Current police offloading support the 'burst'' and 'rate_bytes_ps'. Some
hardware own the capability to limit the frame size. If the frame size
larger than the setting, the frame would be dropped. For the police
action itself already accept the 'mtu' parameter in tc command. But not
extend to tc flower offloading. So extend 'mtu' to tc flower offloading.
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Doug Berger says:
====================
net: bcmgenet: use hardware padding of runt frames
Now that scatter-gather and tx-checksumming are enabled by default
it revealed a packet corruption issue that can occur for very short
fragmented packets.
When padding these frames to the minimum length it is possible for
the non-linear (fragment) data to be added to the end of the linear
header in an SKB. Since the number of fragments is read before the
padding and used afterward without reloading, the fragment that
should have been consumed can be tacked on in place of part of the
padding.
The third commit in this set corrects this by removing the software
padding and allowing the hardware to add the pad bytes if necessary.
The first two commits resolve warnings observed by the kbuild test
robot and are included here for simplicity of application.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When commit 474ea9cafc45 ("net: bcmgenet: correctly pad short
packets") added the call to skb_padto() it should have been
located before the nr_frags parameter was read since that value
could be changed when padding packets with lengths between 55
and 59 bytes (inclusive).
The use of a stale nr_frags value can cause corruption of the
pad data when tx-scatter-gather is enabled. This corruption of
the pad can cause invalid checksum computation when hardware
offload of tx-checksum is also enabled.
Since the original reason for the padding was corrected by
commit 7dd399130efb ("net: bcmgenet: fix skb_len in
bcmgenet_xmit_single()") we can remove the software padding all
together and make use of hardware padding of short frames as
long as the hardware also always appends the FCS value to the
frame.
Fixes: 474ea9cafc45 ("net: bcmgenet: correctly pad short packets")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The 16-bit value that holds a short in network byte order should
be declared as a restricted big endian type to allow type checks
to succeed during assignment.
Fixes: 3e370952287c ("net: bcmgenet: add support for ethtool rxnfc flows")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This function was originally removed by Baoyou Xie in
commit e2072600a241 ("net: bcmgenet: remove unused function in
bcmgenet.c") to prevent a build warning.
Some of the functions removed by Baoyou Xie are now used for
WAKE_FILTER support so his commit was reverted, but this function
is still unused and the kbuild test robot dutifully reported the
warning.
This commit once again removes the remaining unused hfb functions.
Fixes: 14da1510fedc ("Revert "net: bcmgenet: remove unused function in bcmgenet.c"")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Yonghong Song says:
====================
bpf iterator implments traversal of kernel data structures and these
data structures are passed to a bpf program for processing.
This gives great flexibility for users to examine kernel data
structure without using e.g. /proc/net which has limited and
fixed format.
Commit 138d0be35b14 ("net: bpf: Add netlink and ipv6_route bpf_iter targets")
implemented bpf iterators for netlink and ipv6_route.
This patch set intends to implement bpf iterators for tcp and udp.
Currently, /proc/net/tcp is used to print tcp4 stats and /proc/net/tcp6
is used to print tcp6 stats. /proc/net/udp[6] have similar usage model.
In contrast, only one tcp iterator is implemented and it is bpf program
resposibility to filter based on socket family. The same is for udp.
This will avoid another unnecessary traversal pass if users want
to check both tcp4 and tcp6.
Several helpers are also implemented in this patch
bpf_skc_to_{tcp, tcp6, tcp_timewait, tcp_request, udp6}_sock
The argument for these helpers is not a fixed btf_id. For example,
bpf_skc_to_tcp(struct sock_common *), or
bpf_skc_to_tcp(struct sock *), or
bpf_skc_to_tcp(struct inet_sock *), ...
are all valid. At runtime, the helper will check whether pointer cast
is legal or not. Please see Patch #5 for details.
Since btf_id's for both arguments and return value are known at
build time, the btf_id's are pre-computed once vmlinux btf becomes
valid. Jiri's "adding d_path helper" patch set
https://lore.kernel.org/bpf/20200616100512.2168860-1-jolsa@kernel.org/T/
provides a way to pre-compute btf id during vmlinux build time.
This can be applied here as well. A followup patch can convert
to build time btf id computation after Jiri's patch landed.
Changelogs:
v4 -> v5:
- fix bpf_skc_to_udp6_sock helper as besides sk_protocol, sk_family,
sk_type == SOCK_DGRAM is also needed to differentiate from
SOCK_RAW (Eric)
v3 -> v4:
- fix bpf_skc_to_{tcp_timewait, tcp_request}_sock helper implementation
as just checking sk->sk_state is not enough (Martin)
- fix a few kernel test robot reported failures
- move bpf_tracing_net.h from libbpf to selftests (Andrii)
- remove __weak attribute from selftests CONFIG_HZ variables (Andrii)
v2 -> v3:
- change sock_cast*/SOCK_CAST* names to btf_sock* names for generality (Martin)
- change gpl_license to false (Martin)
- fix helper to cast to tcp timewait/request socket. (Martin)
v1 -> v2:
- guard init_sock_cast_types() defination properly with CONFIG_NET (Martin)
- reuse the btf_ids, computed for new helper argument, for return
values (Martin)
- using BTF_TYPE_EMIT to express intent of btf type generation (Andrii)
- abstract out common net macros into bpf_tracing_net.h (Andrii)
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Added tcp{4,6} and udp{4,6} bpf programs into test_progs
selftest so that they at least can load successfully.
$ ./test_progs -n 3
...
#3/7 tcp4:OK
#3/8 tcp6:OK
#3/9 udp4:OK
#3/10 udp6:OK
...
#3 bpf_iter:OK
Summary: 1/16 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230823.3989372-1-yhs@fb.com
|
|
On my VM, I got identical results between /proc/net/udp[6] and
the udp{4,6} bpf iterator.
For udp6:
$ cat /sys/fs/bpf/p1
sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops
1405: 000080FE00000000FF7CC4D0D9EFE4FE:0222 00000000000000000000000000000000:0000 07 00000000:00000000 00:00000000 00000000 193 0 19183 2 0000000029eab111 0
$ cat /proc/net/udp6
sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops
1405: 000080FE00000000FF7CC4D0D9EFE4FE:0222 00000000000000000000000000000000:0000 07 00000000:00000000 00:00000000 00000000 193 0 19183 2 0000000029eab111 0
For udp4:
$ cat /sys/fs/bpf/p4
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops
2007: 00000000:1F90 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 72540 2 000000004ede477a 0
$ cat /proc/net/udp
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops
2007: 00000000:1F90 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 72540 2 000000004ede477a 0
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230822.3989299-1-yhs@fb.com
|
|
In my VM, I got identical result compared to /proc/net/{tcp,tcp6}.
For tcp6:
$ cat /proc/net/tcp6
sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 00000000000000000000000000000000:0016 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000001 00000000 0 0 17955 1 000000003eb3102e 100 0 0 10 0
$ cat /sys/fs/bpf/p1
sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 00000000000000000000000000000000:0016 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 17955 1 000000003eb3102e 100 0 0 10 0
For tcp:
$ cat /proc/net/tcp
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 2666 1 000000007152e43f 100 0 0 10 0
$ cat /sys/fs/bpf/p2
sl local_address remote_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
1: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 2666 1 000000007152e43f 100 0 0 10 0
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230820.3989165-1-yhs@fb.com
|
|
These newly added macros will be used in subsequent bpf iterator
tcp{4,6} and udp{4,6} programs.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230819.3989050-1-yhs@fb.com
|
|
Refactor bpf_iter_ipv6_route.c and bpf_iter_netlink.c
so net macros, originally from various include/linux header
files, are moved to a new header file
bpf_tracing_net.h. The goal is to improve reuse so
networking tracing programs do not need to
copy these macros every time they use them.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230817.3988962-1-yhs@fb.com
|
|
Commit b9f4c01f3e0b ("selftest/bpf: Make bpf_iter selftest
compilable against old vmlinux.h") and Commit dda18a5c0b75
("selftests/bpf: Convert bpf_iter_test_kern{3, 4}.c to define
own bpf_iter_meta") redefined newly introduced types
in bpf programs so the bpf program can still compile
properly with old kernels although loading may fail.
Since this patch set introduced new types and the same
workaround is needed, so let us move the workaround
to a separate header file so they do not clutter
bpf programs.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230816.3988656-1-yhs@fb.com
|
|
The helper is used in tracing programs to cast a socket
pointer to a udp6_sock pointer.
The return value could be NULL if the casting is illegal.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20200623230815.3988481-1-yhs@fb.com
|
|
The bpf iterator for udp is implemented. Both udp4 and udp6
sockets will be traversed. It is up to bpf program to
filter for udp4 or udp6 only, or both families of sockets.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230813.3988404-1-yhs@fb.com
|
|
Similar to tcp_iter_state, a new field bpf_seq_afinfo is
added to udp_iter_state to provide bpf udp iterator
afinfo.
This does not change /proc/net/{udp, udp6} behavior. But
it enables bpf iterator to avoid get afinfo from PDE_DATA
and iterate through all udp and udp6 sockets in one pass.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230812.3988347-1-yhs@fb.com
|
|
Three more helpers are added to cast a sock_common pointer to
an tcp_sock, tcp_timewait_sock or a tcp_request_sock for
tracing programs.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230811.3988277-1-yhs@fb.com
|
|
The helper is used in tracing programs to cast a socket
pointer to a tcp6_sock pointer.
The return value could be NULL if the casting is illegal.
A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added
so the verifier is able to deduce proper return types for the helper.
Different from the previous BTF_ID based helpers,
the bpf_skc_to_tcp6_sock() argument can be several possible
btf_ids. More specifically, all possible socket data structures
with sock_common appearing in the first in the memory layout.
This patch only added socket types related to tcp and udp.
All possible argument btf_id and return value btf_id
for helper bpf_skc_to_tcp6_sock() are pre-calculcated and
cached. In the future, it is even possible to precompute
these btf_id's at kernel build time.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com
|
|
/proc/net/tcp{4,6} uses jiffies for various computations.
Let us add bpf_jiffies64() helper to tracing program
so bpf_iter and other programs can use it.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230808.3988073-1-yhs@fb.com
|
|
'X' tells kernel to print hex with upper case letters.
/proc/net/tcp{4,6} seq_file show() used this, and
supports it in bpf_seq_printf() helper too.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230807.3988014-1-yhs@fb.com
|
|
The bpf iterator for tcp is implemented. Both tcp4 and tcp6
sockets will be traversed. It is up to bpf program to
filter for tcp4 or tcp6 only, or both families of sockets.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230805.3987959-1-yhs@fb.com
|
|
A new field bpf_seq_afinfo is added to tcp_iter_state
to provide bpf tcp iterator afinfo. There are two
reasons on why we did this.
First, the current way to get afinfo from PDE_DATA
does not work for bpf iterator as its seq_file
inode does not conform to /proc/net/{tcp,tcp6}
inode structures. More specifically, anonymous
bpf iterator will use an anonymous inode which
is shared in the system and we cannot change inode
private data structure at all.
Second, bpf iterator for tcp/tcp6 wants to
traverse all tcp and tcp6 sockets in one pass
and bpf program can control whether they want
to skip one sk_family or not. Having a different
afinfo with family AF_UNSPEC make it easier
to understand in the code.
This patch does not change /proc/net/{tcp,tcp6} behavior
as the bpf_seq_afinfo will be NULL for these two proc files.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200623230804.3987829-1-yhs@fb.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fix from Gao Xiang:
"Fix a regression which uses potential uninitialized high 32-bit value
unexpectedly recently observed with specific compiler options"
* tag 'erofs-for-5.8-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix partially uninitialized misuse in z_erofs_onlinepage_fixup
|
|
check that 'nft ... ct helper set <foo>' works:
1. configure ftp helper via nft and assign it to
connections on port 2121
2. check with 'conntrack -L' that the next connection
has the ftp helper attached to it.
Also add a test for auto-assign (old behaviour).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Using new helpers ip6t_unregister_table_pre_exit() and
ip6t_unregister_table_exit().
Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
helpers.
The pre_exit will un-register the underlying hook and .exit will do
the table freeing. The netns core does an unconditional synchronize_rcu
after the pre_exit hooks insuring no packets are in flight that have
picked up the pointer before completing the un-register.
Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Using new helpers ipt_unregister_table_pre_exit() and
ipt_unregister_table_exit().
Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
helpers.
The pre_exit will un-register the underlying hook and .exit will do the
table freeing. The netns core does an unconditional synchronize_rcu after
the pre_exit hooks insuring no packets are in flight that have picked up
the pointer before completing the un-register.
Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default")
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The user tool modinfo is used to get information on kernel modules, including a
description where it is available.
This patch adds a brief MODULE_DESCRIPTION to netfilter kernel modules
(descriptions taken from Kconfig file or code comments)
Signed-off-by: Rob Gill <rrobgill@protonmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When using ip_set with counters and comment, traffic causes the kernel
to panic on 32-bit ARM:
Alignment trap: not handling instruction e1b82f9f at [<bf01b0dc>]
Unhandled fault: alignment exception (0x221) at 0xea08133c
PC is at ip_set_match_extensions+0xe0/0x224 [ip_set]
The problem occurs when we try to update the 64-bit counters - the
faulting address above is not 64-bit aligned. The problem occurs
due to the way elements are allocated, for example:
set->dsize = ip_set_elem_len(set, tb, 0, 0);
map = ip_set_alloc(sizeof(*map) + elements * set->dsize);
If the element has a requirement for a member to be 64-bit aligned,
and set->dsize is not a multiple of 8, but is a multiple of four,
then every odd numbered elements will be misaligned - and hitting
an atomic64_add() on that element will cause the kernel to panic.
ip_set_elem_len() must return a size that is rounded to the maximum
alignment of any extension field stored in the element. This change
ensures that is the case.
Fixes: 95ad1f4a9358 ("netfilter: ipset: Fix extension alignment")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The error DBG_STATUS_NO_MATCHING_FRAMING_MODE was added to the enum
enum dbg_status however there is a missing corresponding entry for
this in the array s_status_str. This causes an out-of-bounds read when
indexing into the last entry of s_status_str. Fix this by adding in
the missing entry.
Addresses-Coverity: ("Out-of-bounds read").
Fixes: 2d22bc8354b1 ("qed: FW 8.42.2.0 debug features")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jisheng Zhang says:
====================
net: phy: call phy_disable_interrupts() in phy_init_hw()
We face an issue with rtl8211f, a pin is shared between INTB and PMEB,
and the PHY Register Accessible Interrupt is enabled by default, so
the INTB/PMEB pin is always active in polling mode case.
As Heiner pointed out "I was thinking about calling
phy_disable_interrupts() in phy_init_hw(), to have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state."
patch1 makes phy_disable_interrupts() non-static so that it could be used
in phy_init_hw() to have a defined init state.
patch2 calls phy_disable_interrupts() in phy_init_hw() to have a
defined init state.
Since v3:
- call phy_disable_interrupts() have interrupts disabled first then
config_init, thank Florian
Since v2:
- Don't export phy_disable_interrupts() but just make it non-static
Since v1:
- EXPORT the correct symbol
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Call phy_disable_interrupts() in phy_init_hw() to "have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state." as pointed
out by Heiner.
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We face an issue with rtl8211f, a pin is shared between INTB and PMEB,
and the PHY Register Accessible Interrupt is enabled by default, so
the INTB/PMEB pin is always active in polling mode case.
As Heiner pointed out "I was thinking about calling
phy_disable_interrupts() in phy_init_hw(), to have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state."
Make phy_disable_interrupts() non-static so that it could be used in
phy_init_hw() to have a defined init state.
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When writing the serdes configuration register was moved to
mvneta_config_interface() the whole code block was removed from
mvneta_port_power_up() in the assumption that its only purpose was to
write the serdes configuration register. As mentioned by Russell King
its purpose was also to check for valid interface modes early so that
later in the driver we do not have to care for unexpected interface
modes.
Add back the test to let the driver bail out early on unhandled
interface modes.
Fixes: b4748553f53f ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In mvneta_config_interface() the RGMII modes are catched by the default
case which is an error return. The RGMII modes are valid modes for the
driver, so instead of returning an error add a break statement to return
successfully.
This avoids this warning for non comphy SoCs which use RGMII, like
SolidRun Clearfog:
WARNING: CPU: 0 PID: 268 at drivers/net/ethernet/marvell/mvneta.c:3512 mvneta_start_dev+0x220/0x23c
Fixes: b4748553f53f ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|