Age | Commit message (Collapse) | Author |
|
Pull xfs fixes from Darrick Wong:
"Here are a few fixes for some corruption bugs and uninitialized
variable problems. The few patches here have gone through a few days
worth of fstest runs with no new problems observed.
Changes since last update:
- Fix a bunch of static checker complaints about uninitialized
variables and insufficient range checks.
- Avoid a crash when incore extent map data are corrupt.
- Disallow FITRIM when we haven't recovered the log and know the
metadata are stale.
- Fix a data corruption when doing unaligned overlapping dio writes"
* tag 'xfs-5.1-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: serialize unaligned dio writes against all other dio writes
xfs: prohibit fstrim in norecovery mode
xfs: always init bma in xfs_bmapi_write
xfs: fix btree scrub checking with regards to root-in-inode
xfs: dabtree scrub needs to range-check level
xfs: don't trip over uninitialized buffer on extent read of corrupted inode
|
|
Commit 70b62c25665f636c ("LoadPin: Initialize as ordered LSM") removed
CONFIG_DEFAULT_SECURITY_{SELINUX,SMACK,TOMOYO,APPARMOR,DAC} from
security/Kconfig and changed CONFIG_LSM to provide a fixed ordering as a
default value. That commit expected that existing users (upgrading from
Linux 5.0 and earlier) will edit CONFIG_LSM value in accordance with
their CONFIG_DEFAULT_SECURITY_* choice in their old kernel configs. But
since users might forget to edit CONFIG_LSM value, this patch revives
the choice (only for providing the default value for CONFIG_LSM) in order
to make sure that CONFIG_LSM reflects CONFIG_DEFAULT_SECURITY_* from their
old kernel configs.
Note that since TOMOYO can be fully stacked against the other legacy
major LSMs, when it is selected, it explicitly disables the other LSMs
to avoid them also initializing since TOMOYO does not expect this
currently.
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: 70b62c25665f636c ("LoadPin: Initialize as ordered LSM")
Co-developed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
|
|
Replace the br_port_exists() macro with its twin from netdevice.h
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace the team_port_exists() macro with its twin from netdevice.h
CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch advertises Forward Error Correction in ethtool
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change t4fw_version.h to update latest firmware version
number to 1.23.3.0.
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 4d31c4fa3f9ef7b7e2e79fd57d21290f64c938f5.
Accidently applied this to the wrong tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change t4fw_version.h to update latest firmware version
number to 1.23.3.0.
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
NULL or ZERO_SIZE_PTR will be returned for zero sized memory
request, and derefencing them will lead to a segfault
so it is unnecessory to call vzalloc for zero sized memory
request and not call functions which maybe derefence the
NULL allocated memory
this also fixes a possible memory leak if phy_ethtool_get_stats
returns error, memory should be freed before exit
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Wang Li <wangli39@baidu.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Only decrypt_internal() performs zero copy on rx, all paths
which don't hit decrypt_internal() must set zc to false,
otherwise tls_sw_recvmsg() may return 0 causing the application
to believe that that connection got closed.
Currently this happens with device offload when new record
is first read from.
Fixes: d069b780e367 ("tls: Fix tls_device receive")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After queue stopped, the wakeup mechanism may wake it up again
when ring buffer usage is lower than a threshold. This may cause
send path panic on NULL pointer when we stopped all tx queues in
netvsc_detach and start removing the netvsc device.
This patch fix it by adding a tx_disable flag to prevent unwanted
queue wakeup.
Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
Reported-by: Mohammed Gamal <mgamal@redhat.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Bond expects ethernet hwaddr for its slave, but it can be longer than 6
bytes - infiniband interface for example.
# cat /sys/devices/<skipped>/net/ib0/address
80:00:02:08:fe:80:00:00:00:00:00:00:7c:fe:90:03:00:be:5d:e1
# cat /sys/devices/<skipped>/net/ib0/bonding_slave/perm_hwaddr
80:00:02:08:fe:80
So print full hwaddr in sysfs "bonding_slave/perm_hwaddr" as well.
Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is currently no support for the multicast/broadcast aspects
of VXLAN in ovs. In the datapath flow the tun_dst must specific.
But in the IP_TUNNEL_INFO_BRIDGE mode the tun_dst can not be specific.
And the packet can forward through the fdb table of vxlan devcice. In
this mode the broadcast/multicast packet can be sent through the
following ways in ovs.
ovs-vsctl add-port br0 vxlan -- set in vxlan type=vxlan \
options:key=1000 options:remote_ip=flow
ovs-ofctl add-flow br0 in_port=LOCAL,dl_dst=ff:ff:ff:ff:ff:ff, \
action=output:vxlan
bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.1 \
src_vni 1000 vni 1000 self
bridge fdb append ff:ff:ff:ff:ff:ff dev vxlan_sys_4789 dst 172.168.0.2 \
src_vni 1000 vni 1000 self
Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo:
Core libraries:
Jiri Olsa:
- Fix max perf_event_attr.precise_ip detection.
Kan Liang:
- Fix parser error for uncore event alias
Wei Lin:
- Fixup ordering of kernel maps after obtaining the main kernel map address.
Intel PT:
Adrian Hunter:
- Fix TSC slip where A TSC packet can slip past MTC packets so that the
timestamp appears to go backwards.
- Fixes for exported-sql-viewer GUI conversion to python3.
ARM coresight:
Solomon Tan:
- Fix the build by adding a missing case value for enumeration value introduced
in newer library, that now is the required one.
tool headers:
Arnaldo Carvalho de Melo:
- Syncronize kernel headers with the kernel, getting new io_uring and
pidfd_send_signal syscalls so that 'perf trace' can handle them.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
TCP stack relies on the fact that a freshly allocated skb
has skb->cb[] and skb_shinfo(skb)->tx_flags cleared.
When recycling tx skb, we must ensure these fields are cleared.
Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Innova IPsec driver is part of all Innova drivers, and its
maintainenece is covered by an existing entry in this file.
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver allocates an encap context based on the tunnel properties,
and reuse that context for all flows using the same tunnel properties.
Commit df2ef3bff193 ("net/mlx5e: Add GRE protocol offloading")
introduced another tunnel protocol other than the single VXLAN
previously supported. A flow that uses a tunnel with the same tunnel
properties but with a different tunnel type (GRE vs VXLAN for example)
would mistakenly reuse the previous alocated context, causing the
traffic to be sent with the wrong encapsulation. Fix that by
considering the tunnel type for encap contexts.
Fixes: df2ef3bff193 ("net/mlx5e: Add GRE protocol offloading")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Set xon = xoff - netdev's max_mtu.
netdev's max_mtu will give enough time for the pause frame to
arrive at the sender.
Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Set minimum speed in xoff threshold formula to 40Gbps
Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Make sure the struct mlx5_flow_destination is zero before
filling in the field.
Fixes: 8da202b24913 ("net/mlx5: E-Switch, Add support for VEPA in legacy mode.")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Traditionally, the PF (Physical Function) which resides on vport 0 was
the E-switch manager. Since the ECPF (Embedded CPU Physical Function),
which resides on vport 0xfffe, was introduced as the E-Switch manager,
the assumption that the E-switch manager is on vport 0 is incorrect.
Since the eswitch code already uses the actual vport value, all we
need is to always set other_vport=1.
Signed-off-by: Omri Kahalon <omrik@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The esw offloads structures share a union with the legacy mode structs.
Reset the offloads struct to zero in init to protect from null
assumptions made by the legacy mode code.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The capacity of FDB offloading and NIC offloading table are
different, and when allocating the pedit actions, we should
use the correct namespace type.
Fixes: c500c86b0c75d ("net/mlx5e: support for two independent packet edit actions")
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The esw fdb table has a union of legacy and offloads members.
So if we were in a certain esw mode we could set some memebers and not
set null which is fine as on destroy path and don't care.
But then moving from legacy to switchdev a second time, the cleanup flow
of legacy mode checks if a struct member was in use if it's not null so
we need to make sure to reset the code to null when we init legacy mode.
Fixes: 8da202b24913 ("net/mlx5: E-Switch, Add support for VEPA in legacy mode.")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Allow configuration of legacy link-modes even when extended link-modes
are supported. This requires reading of legacy advertisement even when
extended link-modes are supported. Since legacy and extended
advertisement are mutually excluded, wait for empty reply from extended
advertisement before reading legacy advertisement.
Fixes: 6a897372417e ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Ethtool option set_link_ksettings allows setting of legacy link-modes
or extended link-modes. Refine the decision of which type of link-modes
is set.
Fixes: 6a897372417e ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Refresh tirs is looping over a global list of tirs while netdevs are
adding and removing tirs from that list. That is why a lock is
required.
Fixes: 724b2aa15126 ("net/mlx5e: TIRs management refactoring")
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
idr_find() can return a NULL value to 'flow' which is used without a
check. The patch adds a check to avoid potential NULL pointer dereference.
In case of mlx5_fpga_sbu_conn_sendmsg() failure, free buf allocated
using kzalloc.
Fixes: ab412e1dd7db ("net/mlx5: Accel, add TLS rx offload routines")
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
For some protocols we are not allowing IP header rewrite offload, since
the HW is not capable to properly adjust the l4 checksum. However, TTL
& HOPLIMIT modification can be done for all IP protocols, because they
are not part of the pseudo header taken into account for checksum.
Fixes: 738678817573 ("drivers: net: use flow action infrastructure")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Previously, a false positive would be caught if the TIRs list is
empty, since the err value was initialized to -ENOMEM, and was only
updated if a TIR is refreshed. This is resolved by initializing the
err value to zero.
Fixes: b676f653896a ("net/mlx5e: Refactor refresh TIRs")
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Delete initialization of high order entries in mr cache to decrease initial
memory footprint. When required, the administrator can populate the
entries with memory keys via the /sys interface.
This approach is very helpful to significantly reduce the per HW function
memory footprint in virtualization environments such as SRIOV.
Fixes: 9603b61de1ee ("mlx5: Move pci device handling from mlx5_ib to mlx5_core")
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reported-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
We want to avoid leaking pointer info from xdp_frame (that is placed in
top of frame) like commit 6dfb970d3dbd ("xdp: avoid leaking info stored in
frame data on page reuse"), and followup commit 97e19cce05e5 ("bpf:
reserve xdp_frame size in xdp headroom") that reserve this headroom.
These changes also affected how cpumap constructed SKBs, as xdpf->headroom
size changed, the skb data starting point were in-effect shifted with 32
bytes (sizeof xdp_frame). This was still okay, as the cpumap frame_size
calculation also included xdpf->headroom which were reduced by same amount.
A bug was introduced in commit 77ea5f4cbe20 ("bpf/cpumap: make sure
frame_size for build_skb is aligned if headroom isn't"), where the
xdpf->headroom became part of the SKB_DATA_ALIGN rounding up. This
round-up to find the frame_size is in principle still correct as it does
not exceed the 2048 bytes frame_size (which is max for ixgbe and i40e),
but the 32 bytes offset of pkt_data_start puts this over the 2048 bytes
limit. This cause skb_shared_info to spill into next frame. It is a little
hard to trigger, as the SKB need to use above 15 skb_shinfo->frags[] as
far as I calculate. This does happen in practise for TCP streams when
skb_try_coalesce() kicks in.
KASAN can be used to detect these wrong memory accesses, I've seen:
BUG: KASAN: use-after-free in skb_try_coalesce+0x3cb/0x760
BUG: KASAN: wild-memory-access in skb_release_data+0xe2/0x250
Driver veth also construct a SKB from xdp_frame in this way, but is not
affected, as it doesn't reserve/deduct the room (used by xdp_frame) from
the SKB headroom. Instead is clears the pointers via xdp_scrub_frame(),
and allows SKB to use this area.
The fix in this patch is to do like veth and instead allow SKB to (re)use
the area occupied by xdp_frame, by clearing via xdp_scrub_frame(). (This
does kill the idea of the SKB being able to access (mem) info from this
area, but I guess it was a bad idea anyhow, and it was already killed by
the veth changes.)
Fixes: 77ea5f4cbe20 ("bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add 36 pedit action tests to check pedit options described in tc-pedit(8)
man page. Test cases can be specified by categories: actions, pedit,
raw_op, layered_op. RAW_OP cases check offset option for u8, u16 and u32
offset size. LAYERED_OP cases check fields option for eth, ip, ip6,
tcp and udp headers.
Include following tests:
377e - Add pedit action with RAW_OP offset u32
a0ca - Add pedit action with RAW_OP offset u32 (INVALID)
dd8a - Add pedit action with RAW_OP offset u16 u16
53db - Add pedit action with RAW_OP offset u16 (INVALID)
5c7e - Add pedit action with RAW_OP offset u8 add value
2893 - Add pedit action with RAW_OP offset u8 quad
3a07 - Add pedit action with RAW_OP offset u8-u16-u8
ab0f - Add pedit action with RAW_OP offset u16-u8-u8
9d12 - Add pedit action with RAW_OP offset u32 set u16 clear u8 invert
ebfa - Add pedit action with RAW_OP offset overflow u32 (INVALID)
f512 - Add pedit action with RAW_OP offset u16 at offmask shift set
c2cb - Add pedit action with RAW_OP offset u32 retain value
86d4 - Add pedit action with LAYERED_OP eth set src & dst
c715 - Add pedit action with LAYERED_OP eth set src (INVALID)
ba22 - Add pedit action with LAYERED_OP eth type set/clear sequence
5810 - Add pedit action with LAYERED_OP ip set src & dst
1092 - Add pedit action with LAYERED_OP ip set ihl & dsfield
02d8 - Add pedit action with LAYERED_OP ip set ttl & protocol
3e2d - Add pedit action with LAYERED_OP ip set ttl (INVALID)
31ae - Add pedit action with LAYERED_OP ip ttl clear/set
486f - Add pedit action with LAYERED_OP ip set duplicate fields
e790 - Add pedit action with LAYERED_OP ip set ce, df, mf, firstfrag,
nofrag fields
6829 - Add pedit action with LAYERED_OP beyond ip set dport & sport
afd8 - Add pedit action with LAYERED_OP beyond ip set icmp_type &
icmp_code
3143 - Add pedit action with LAYERED_OP beyond ip set dport (INVALID)
fc1f - Add pedit action with LAYERED_OP ip6 set src & dst
6d34 - Add pedit action with LAYERED_OP ip6 dst retain value (INVALID)
6f5e - Add pedit action with LAYERED_OP ip6 flow_lbl
6795 - Add pedit action with LAYERED_OP ip6 set payload_len, nexthdr,
hoplimit
1442 - Add pedit action with LAYERED_OP tcp set dport & sport
b7ac - Add pedit action with LAYERED_OP tcp sport set (INVALID)
cfcc - Add pedit action with LAYERED_OP tcp flags set
3bc4 - Add pedit action with LAYERED_OP tcp set dport, sport & flags
fields
f1c8 - Add pedit action with LAYERED_OP udp set dport & sport
d784 - Add pedit action with mixed RAW/LAYERED_OP #1
70ca - Add pedit action with mixed RAW/LAYERED_OP #2
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull drm fixes from Dave Airlie:
"Weekly fixes roundup, nothing two serious, some usb device regressions
are fixed, and i915 GVT has a bigger fix but otherwise not really much
happening here.
core:
- fb bpp check regression fix
- release/unplug fix
- use after free fixes
i915:
- fix mmap range checks
- fix gvt ppgtt mm LRU list access races
- fix selftest error pointer check
- fix a macro definition (pre-emptive for potential further backports)
- fix one AML SKU ULX status
amdgpu:
- one variable refresh rate fix
udl:
- fix EDID reading
tegra:
- build/warning fixes
meson:
- cleanup path fixes
- TMDS clock filter fix
rockchip:
- NV12 buffers and scalar fix"
* tag 'drm-fixes-2019-03-29' of git://anongit.freedesktop.org/drm/drm: (22 commits)
drm/i915/icl: Fix VEBOX mismatch BUG_ON()
drm/i915/selftests: Fix an IS_ERR() vs NULL check
drm/i915: Mark AML 0x87CA as ULX
drm/meson: fix TMDS clock filtering for DMT monitors
drm/meson: Uninstall IRQ handler
drm/meson: Fix invalid pointer in meson_drv_unbind()
drm/udl: Refactor edid retrieving in UDL driver (v2)
drm: Fix drm_release() and device unplug
drm/fb: avoid setting 0 depth.
drm/tegra: vic: Fix implicit function declaration warning
drm/tegra: hub: Fix dereference before check
drm/i915/icl: Fix the TRANS_DDI_FUNC_CTL2 bitfield macro
drm/amd/display: Only allow VRR when vrefresh is within supported range
drm/rockchip: vop: reset scale mode when win is disabled
drm/vkms: fix use-after-free when drm_gem_handle_create() fails
drm/vgem: fix use-after-free when drm_gem_handle_create() fails
drm/i915/gvt: Add mutual lock for ppgtt mm LRU list
drm/i915/gvt: Only assign ppgtt root at dispatch time
drm/i915/gvt: Don't submit request for error workload dispatch
drm/i915/gvt: stop scheduling workload when vgpu is inactive
...
|
|
The number of stubs is growing and has nothing to do with addrconf.
Move the definition of the stubs to a separate header file and update
users. In the move, drop the vxlan specific comment before ipv6_stub.
Code move only; no functional change intended.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
David Ahern says:
====================
net: Move fib_nh and fib6_nh to a common struct
First set of three with the end goal of enabling IPv6 gateways with IPv4
routes.
This set refactors ipv4 and ipv6 code to create init and release
helpers for each protocol and moving common elements to a fib_nh_common
struct.
v3
- split the reject setting into 2 with helper to the checks. This
avoids changing cfg->fc_flags in fib6_nh_init
v2
- addressed Ido's comments: cleanup on failure path in nh_init helpers,
ordering in fib6_nh_release, and removal of RTF_GATEWAY from fib6_info
uses in mlxsw
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With fib_nh_common in place, move common initialization and release
code into helpers used by both ipv4 and ipv6. For the moment, the init
is just the lwt encap and the release is both the netdev reference and
the the lwt state reference. More will be added later.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add fib_nh_common struct with common nexthop attributes. Convert
fib_nh and fib6_nh to use it. Use macros to move existing
fib_nh_* references to the new nh_common.nhc_*.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rename fib6_nh entries that will be moved to a fib_nh_common struct.
Specifically, the device, gateway, flags, and lwtstate are common
with all nexthop definitions. In some places new temporary variables
are declared or local variables renamed to maintain line lengths.
Rename only; no functional change intended.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rename fib_nh entries that will be moved to a fib_nh_common struct.
Specifically, the device, oif, gateway, flags, scope, lwtstate,
nh_weight and nh_upper_bound are common with all nexthop definitions.
In the process shorten fib_nh_lwtstate to fib_nh_lws to avoid really
long lines.
Rename only; no functional change intended.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rt6_add_nexthop and rt6_nexthop_info only need the fib6_info for the
gateway flag and the nexthop weight, and the presence of a gateway is now
per-nexthop. Update the signatures to take a fib6_nh and nexthop weight
and better align with the ipv4 versions.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
fib6_ignore_linkdown takes a fib6_info but only looks at the net_device
and its IPv6 config. Change it to take a net_device over a fib6_info as
its input argument.
In addition, move it to a header file to make the check inline and usable
later with IPv4 code without going through the ipv6 stub, and rename to
ip6_ignore_linkdown since it is only checking the setting based on the
ipv6 struct on a device.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The gateway setting is not per fib6_info entry but per-fib6_nh. Add a new
fib_nh_has_gw flag to fib6_nh and convert references to RTF_GATEWAY to
the new flag. For IPv6 address the flag is cheaper than checking that
nh_gw is non-0 like IPv4 does.
While this increases fib6_nh by 8-bytes, the effective allocation size of
a fib6_info is unchanged. The 8 bytes is recovered later with a
fib_nh_common change.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move the fib6_nh cleanup code to a new helper, fib6_nh_release.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similar to IPv4, consolidate the fib6_nh initialization into a helper.
As a new standalone function, add a cleanup path to put lwtstate on
error.
To avoid modifying fib6_config flags, move the reject check to a helper
that is invoked once by fib6_nh_init to reset the device and then
again in ip6_route_info_create to set the fib6_flags.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move the fib_nh cleanup code from free_fib_info_rcu into a new helper,
fib_nh_release. Move classid accounting into fib_nh_release which is
called per fib_nh to make accounting symmetrical with fib_nh_init.
Export the helper to allow for use with nexthop objects in the
future.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Consolidate the fib_nh initialization which is duplicated between
fib_create_info for single path and fib_get_nhs for multipath.
Export the helper to allow for use with nexthop objects in the
future.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
in_dev lookup followed by IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN check
is called in several places, some with the rcu lock and others with the
rtnl held.
Move the check to a helper similar to what IPv6 has. Since the helper
can be invoked from either context use rcu_dereference_rtnl to
dereference ip_ptr.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Define fib_get_nhs to return EINVAL when CONFIG_IP_ROUTE_MULTIPATH is
not enabled and remove the ifdef check for CONFIG_IP_ROUTE_MULTIPATH
in fib_create_info.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Syzkaller reports:
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 5373 Comm: syz-executor.0 Not tainted 5.0.0-rc8+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:put_links+0x101/0x440 fs/proc/proc_sysctl.c:1599
Code: 00 0f 85 3a 03 00 00 48 8b 43 38 48 89 44 24 20 48 83 c0 38 48 89 c2 48 89 44 24 28 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 fe 02 00 00 48 8b 74 24 20 48 c7 c7 60 2a 9d 91
RSP: 0018:ffff8881d828f238 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff8881e01b1140 RCX: ffffffff8ee98267
RDX: 0000000000000007 RSI: ffffc90001479000 RDI: ffff8881e01b1178
RBP: dffffc0000000000 R08: ffffed103ee27259 R09: ffffed103ee27259
R10: 0000000000000001 R11: ffffed103ee27258 R12: fffffffffffffff4
R13: 0000000000000006 R14: ffff8881f59838c0 R15: dffffc0000000000
FS: 00007f072254f700(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff8b286668 CR3: 00000001f0542002 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
drop_sysctl_table+0x152/0x9f0 fs/proc/proc_sysctl.c:1629
get_subdir fs/proc/proc_sysctl.c:1022 [inline]
__register_sysctl_table+0xd65/0x1090 fs/proc/proc_sysctl.c:1335
br_netfilter_init+0xbc/0x1000 [br_netfilter]
do_one_initcall+0xfa/0x5ca init/main.c:887
do_init_module+0x204/0x5f6 kernel/module.c:3460
load_module+0x66b2/0x8570 kernel/module.c:3808
__do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x462e99
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f072254ec58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000003
RBP: 00007f072254ec70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f072254f6bc
R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004
Modules linked in: br_netfilter(+) dvb_usb_dibusb_mc_common dib3000mc dibx000_common dvb_usb_dibusb_common dvb_usb_dw2102 dvb_usb classmate_laptop palmas_regulator cn videobuf2_v4l2 v4l2_common snd_soc_bd28623 mptbase snd_usb_usx2y snd_usbmidi_lib snd_rawmidi wmi libnvdimm lockd sunrpc grace rc_kworld_pc150u rc_core rtc_da9063 sha1_ssse3 i2c_cros_ec_tunnel adxl34x_spi adxl34x nfnetlink lib80211 i5500_temp dvb_as102 dvb_core videobuf2_common videodev media videobuf2_vmalloc videobuf2_memops udc_core lnbp22 leds_lp3952 hid_roccat_ryos s1d13xxxfb mtd vport_geneve openvswitch nf_conncount nf_nat_ipv6 nsh geneve udp_tunnel ip6_udp_tunnel snd_soc_mt6351 sis_agp phylink snd_soc_adau1761_spi snd_soc_adau1761 snd_soc_adau17x1 snd_soc_core snd_pcm_dmaengine ac97_bus snd_compress snd_soc_adau_utils snd_soc_sigmadsp_regmap snd_soc_sigmadsp raid_class hid_roccat_konepure hid_roccat_common hid_roccat c2port_duramar2150 core mdio_bcm_unimac iptable_security iptable_raw iptable_mangle
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel hsr veth netdevsim devlink vxcan batman_adv cfg80211 rfkill chnl_net caif nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel joydev mousedev ide_pci_generic piix aesni_intel aes_x86_64 ide_core crypto_simd atkbd cryptd glue_helper serio_raw ata_generic pata_acpi i2c_piix4 floppy sch_fq_codel ip_tables x_tables ipv6 [last unloaded: lm73]
Dumping ftrace buffer:
(ftrace buffer empty)
---[ end trace 770020de38961fd0 ]---
A new dir entry can be created in get_subdir and its 'header->parent' is
set to NULL. Only after insert_header success, it will be set to 'dir',
otherwise 'header->parent' is set to NULL and drop_sysctl_table is called.
However in err handling path of get_subdir, drop_sysctl_table also be
called on 'new->header' regardless its value of parent pointer. Then
put_links is called, which triggers NULL-ptr deref when access member of
header->parent.
In fact we have multiple error paths which call drop_sysctl_table() there,
upon failure on insert_links() we also call drop_sysctl_table().And even
in the successful case on __register_sysctl_table() we still always call
drop_sysctl_table().This patch fix it.
Link: http://lkml.kernel.org/r/20190314085527.13244-1-yuehaibing@huawei.com
Fixes: 0e47c99d7fe25 ("sysctl: Replace root_list with links between sysctl_table_sets")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org> [3.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|