Age | Commit message (Collapse) | Author |
|
Since capi_ioctl() copies 64 bytes after calling
capi20_get_manufacturer() we need to ensure to not leak
information to user.
BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
CPU: 0 PID: 11245 Comm: syz-executor633 Not tainted 4.20.0-rc7+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x173/0x1d0 lib/dump_stack.c:113
kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
kmsan_internal_check_memory+0x9d4/0xb00 mm/kmsan/kmsan.c:704
kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
_copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
capi_ioctl include/linux/uaccess.h:177 [inline]
capi_unlocked_ioctl+0x1a0b/0x1bf0 drivers/isdn/capi/capi.c:939
do_vfs_ioctl+0xebd/0x2bf0 fs/ioctl.c:46
ksys_ioctl fs/ioctl.c:713 [inline]
__do_sys_ioctl fs/ioctl.c:720 [inline]
__se_sys_ioctl+0x1da/0x270 fs/ioctl.c:718
__x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:718
do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x440019
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffdd4659fb8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440019
RDX: 0000000020000080 RSI: 00000000c0044306 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 00000000004018a0
R13: 0000000000401930 R14: 0000000000000000 R15: 0000000000000000
Local variable description: ----data.i@capi_unlocked_ioctl
Variable was created at:
capi_ioctl drivers/isdn/capi/capi.c:747 [inline]
capi_unlocked_ioctl+0x82/0x1bf0 drivers/isdn/capi/capi.c:939
do_vfs_ioctl+0xebd/0x2bf0 fs/ioctl.c:46
Bytes 12-63 of 64 are uninitialized
Memory access of size 64 starts at ffff88807ac5fce8
Data copied to user address 0000000020000080
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In ip6_neigh_lookup(), we must not return errors coming from
neigh_create(): if creation of a neighbour entry fails, the lookup should
return NULL, in the same way as it's done in __neigh_lookup().
Otherwise, callers legitimately checking for a non-NULL return value of
the lookup function might dereference an invalid pointer.
For instance, on neighbour table overflow, ndisc_router_discovery()
crashes ndisc_update() by passing ERR_PTR(-ENOBUFS) as 'neigh' argument.
Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: f8a1b43b709d ("net/ipv6: Create a neigh_lookup for FIB entries")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Using del_timer() + add_timer() is generally unsafe on SMP,
as noticed by syzbot. Use mod_timer() instead.
kernel BUG at kernel/time/timer.c:1136!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 1026 Comm: kworker/u4:4 Not tainted 4.20.0+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound flush_to_ldisc
RIP: 0010:add_timer kernel/time/timer.c:1136 [inline]
RIP: 0010:add_timer+0xa81/0x1470 kernel/time/timer.c:1134
Code: 4d 89 7d 40 48 c7 85 70 fe ff ff 00 00 00 00 c7 85 7c fe ff ff ff ff ff ff 48 89 85 90 fe ff ff e9 e6 f7 ff ff e8 cf 42 12 00 <0f> 0b e8 c8 42 12 00 0f 0b e8 c1 42 12 00 4c 89 bd 60 fe ff ff e9
RSP: 0018:ffff8880a7fdf5a8 EFLAGS: 00010293
RAX: ffff8880a7846340 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff816f3ee1 RDI: ffff88808a514ff8
RBP: ffff8880a7fdf760 R08: 0000000000000007 R09: ffff8880a7846c58
R10: ffff8880a7846340 R11: 0000000000000000 R12: ffff88808a514ff8
R13: ffff88808a514ff8 R14: ffff88808a514dc0 R15: 0000000000000030
FS: 0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000061c500 CR3: 00000000994d9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
decode_prio_command drivers/net/hamradio/6pack.c:903 [inline]
sixpack_decode drivers/net/hamradio/6pack.c:971 [inline]
sixpack_receive_buf drivers/net/hamradio/6pack.c:457 [inline]
sixpack_receive_buf+0xf9c/0x1470 drivers/net/hamradio/6pack.c:434
tty_ldisc_receive_buf+0x164/0x1c0 drivers/tty/tty_buffer.c:465
tty_port_default_receive_buf+0x114/0x190 drivers/tty/tty_port.c:38
receive_buf drivers/tty/tty_buffer.c:481 [inline]
flush_to_ldisc+0x3b2/0x590 drivers/tty/tty_buffer.c:533
process_one_work+0xd0c/0x1ce0 kernel/workqueue.c:2153
worker_thread+0x143/0x14a0 kernel/workqueue.c:2296
kthread+0x357/0x430 kernel/kthread.c:246
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Andreas Koensgen <ajk@comnets.uni-bremen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If there is no shutdown callback, our board will report pcie UNF errors
after restarting. This patch add shutdown callback for hinic.
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For HNAE3_DOWN_CLIENT calling hns3_nic_net_stop(), HNAE3_UP_CLIENT
should call hns3_nic_net_open(), since if the number of queue or
the map of TC has is changed before HHAE3_UP_CLIENT is called,
it will cause problem.
Also the HNS3_NIC_STATE_RESETTING flag needs to be cleared before
hns3_nic_net_open() called, and set it back while hns3_nic_net_open()
failed.
Fixes: bb6b94a896d4 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
KMSAN detected read beyond end of buffer in vti and sit devices when
passing truncated packets with PF_PACKET. The issue affects additional
ip tunnel devices.
Extend commit 76c0ddd8c3a6 ("ip6_tunnel: be careful when accessing the
inner header") and commit ccfec9e5cb2d ("ip_tunnel: be careful when
accessing the inner header").
Move the check to a separate helper and call at the start of each
ndo_start_xmit function in net/ipv4 and net/ipv6.
Minor changes:
- convert dev_kfree_skb to kfree_skb on error path,
as dev_kfree_skb calls consume_skb which is not for error paths.
- use pskb_network_may_pull even though that is pedantic here,
as the same as pskb_may_pull for devices without llheaders.
- do not cache ipv6 hdrs if used only once
(unsafe across pskb_may_pull, was more relevant to earlier patch)
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The BPF flow dissector expects either skb->sk or skb->dev set on
all skbs. Delay flow dissection until after skb->dev is set.
This requires calling from within an rcu read-side critical section.
That is fine, see also the call from tun_xdp_one.
Fixes: d0e13a1488ad ("flow_dissector: lookup netns by skb->sk if skb->dev is NULL")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__ptr_ring_swap_queue() tries to move pointers from the old
ring to the new one, but it forgets to check if ->producer
is beyond the new size at the end of the operation. This leads
to an out-of-bound access in __ptr_ring_produce() as reported
by syzbot.
Reported-by: syzbot+8993c0fa96d57c399735@syzkaller.appspotmail.com
Fixes: 5d49de532002 ("ptr_ring: resize support")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In kfree, the NULL check is done.
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
add document and examples for below counters:
TcpExtTCPOFOQueue
TcpExtTCPOFODrop
TcpExtTCPOFOMerge
TcpExtPAWSActive
TcpExtPAWSEstab
TcpExtTCPACKSkippedSynRecv
TcpExtTCPACKSkippedPAWS
TcpExtTCPACKSkippedSeq
TcpExtTCPACKSkippedFinWait2
TcpExtTCPACKSkippedTimeWait
TcpExtTCPACKSkippedChallenge
Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Al Viro mentioned (Message-ID
<20170626041334.GZ10672@ZenIV.linux.org.uk>)
that there is probably a race condition
lurking in accesses of sk_stamp on 32-bit machines.
sock->sk_stamp is of type ktime_t which is always an s64.
On a 32 bit architecture, we might run into situations of
unsafe access as the access to the field becomes non atomic.
Use seqlocks for synchronization.
This allows us to avoid using spinlocks for readers as
readers do not need mutual exclusion.
Another approach to solve this is to require sk_lock for all
modifications of the timestamps. The current approach allows
for timestamps to have their own lock: sk_stamp_lock.
This allows for the patch to not compete with already
existing critical sections, and side effects are limited
to the paths in the patch.
The addition of the new field maintains the data locality
optimizations from
commit 9115e8cd2a0c ("net: reorganize struct sock for better data
locality")
Note that all the instances of the sk_stamp accesses
are either through the ioctl or the syscall recvmsg.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 33a48ab105a7 ("ibmveth: Fix DMA unmap error") fixed an issue in the
normal code path of ibmveth_xmit_start() that was originally introduced by
Commit 6e8ab30ec677 ("ibmveth: Add scatter-gather support"). This original
fix missed the error path where dma_unmap_page is wrongly called on the
header portion in descs[0] which was mapped with dma_map_single. As a
result a failure to DMA map any of the frags results in a dmesg warning
when CONFIG_DMA_API_DEBUG is enabled.
------------[ cut here ]------------
DMA-API: ibmveth 30000002: device driver frees DMA memory with wrong function
[device address=0x000000000a430000] [size=172 bytes] [mapped as page] [unmapped as single]
WARNING: CPU: 1 PID: 8426 at kernel/dma/debug.c:1085 check_unmap+0x4fc/0xe10
...
<snip>
...
DMA-API: Mapped at:
ibmveth_start_xmit+0x30c/0xb60
dev_hard_start_xmit+0x100/0x450
sch_direct_xmit+0x224/0x490
__qdisc_run+0x20c/0x980
__dev_queue_xmit+0x1bc/0xf20
This fixes the API misuse by unampping descs[0] with dma_unmap_single.
Fixes: 6e8ab30ec677 ("ibmveth: Add scatter-gather support")
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In rtl8169_runtime_resume() we configure WoL but don't set the device
to wakeup-enabled. This prevents PME generation once the cable is
re-plugged. Fix this by moving the call to device_set_wakeup_enable()
to __rtl8169_set_wol().
Fixes: 433f9d0ddcc6 ("r8169: improve saved_wolopts handling")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
nr_find_socket(), nr_find_peer() and nr_find_listener() lock the
sock after finding it in the global list. However, the call path
requires BH disabled for the sock lock consistently.
Actually the locking is unnecessary at this point, we can just hold
the sock refcnt to make sure it is not gone after we unlock the global
list, and lock it later only when needed.
Reported-and-tested-by: syzbot+f621cda8b7e598908efa@syzkaller.appspotmail.com
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When x25_asy_open() fails, it already cleans up by itself,
so its caller doesn't need to free the memory again.
It seems we still have to call x25_asy_free() to clear the SLF_INUSE
bit, so just set these pointers to NULL after kfree().
Reported-and-tested-by: syzbot+5e5e969e525129229052@syzkaller.appspotmail.com
Fixes: 3b780bed3138 ("x25_asy: Free x25_asy on x25_asy_open() failure.")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are multiple issues here:
1. After freeing dev->ax25_ptr, we need to set it to NULL otherwise
we may use a dangling pointer.
2. There is a race between ax25_setsockopt() and device notifier as
reported by syzbot. Close it by holding RTNL lock.
3. We need to test if dev->ax25_ptr is NULL before using it.
Reported-and-tested-by: syzbot+ae6bb869cbed29b29040@syzkaller.appspotmail.com
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
gcc warn this:
net/ipv4/fib_rules.c:203 fib_empty_table() warn:
always true condition '(id <= 4294967295) => (0-u32max <= u32max)'
'id' is u32, which always not greater than RT_TABLE_MAX
(0xFFFFFFFF), So add a check to break while wrap around.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'ipv6_find_idev()' returns NULL on error, not an error pointer.
Update the test accordingly and return -ENOBUFS, as already done in
'addrconf_add_dev()', if NULL is returned.
Fixes: ("ipv6: allow userspace to add IFA_F_OPTIMISTIC addresses")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We must have an address to lookup otherwise we'll derefence a null
pointer in the ndo_fdb_get callbacks.
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: David Ahern <dsa@cumulusnetworks.com>
Reported-by: syzbot+017b1f61c82a1c3e7efd@syzkaller.appspotmail.com
Fixes: 5b2f94b27622 ("net: rtnetlink: support for fdb get")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net, specifically
fixes for the nf_conncount infrastructure which is causing troubles
since 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc
worker, and RCU for init tree search"). Patches aim to simplify this
infrastructure while fixing up the problems:
1) Use fixed size CONNCOUNT_SLOTS in nf_conncount, from Shawn Bohrer.
2) Incorrect signedness in age calculation from find_or_evict(),
from Florian Westphal.
3) Proper locking for the garbage collector workqueue callback,
first make a patch to count how many nodes can be collected
without holding locks, then grab lock and release them. Also
from Florian.
4) Restart node lookup from the insertion path, after releasing nodes
via packet path garbage collection. Shawn Bohrer described a scenario
that may result in inserting a connection in an already dead list
node. Patch from Florian.
5) Merge lookup and add function to avoid a hold release and re-grab.
From Florian.
6) Be safe and iterate over the node lists under the spinlock.
7) Speculative list nodes removal via garbage collection, check if
list node got a connection while it was scheduled for deletion
via gc.
8) Accidental argument swap in find_next_bit() that leads to more
frequent scheduling of the workqueue. From Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These functions are called from atomic context:
[ 9.150239] BUG: sleeping function called from invalid context at /home/scott/git/linux/mm/slab.h:421
[ 9.158159] in_atomic(): 1, irqs_disabled(): 0, pid: 4432, name: ip
[ 9.163128] CPU: 8 PID: 4432 Comm: ip Not tainted 4.20.0-rc2-00169-g63d86876f324 #29
[ 9.163130] Call Trace:
[ 9.170701] [c0000002e899a980] [c0000000009c1068] .dump_stack+0xa8/0xec (unreliable)
[ 9.177140] [c0000002e899aa10] [c00000000007a7b4] .___might_sleep+0x138/0x164
[ 9.184440] [c0000002e899aa80] [c0000000001d5bac] .kmem_cache_alloc_trace+0x238/0x30c
[ 9.191216] [c0000002e899ab40] [c00000000065ea1c] .memac_add_hash_mac_address+0x104/0x198
[ 9.199464] [c0000002e899abd0] [c00000000065a788] .set_multi+0x1c8/0x218
[ 9.206242] [c0000002e899ac80] [c0000000006615ec] .dpaa_set_rx_mode+0xdc/0x17c
[ 9.213544] [c0000002e899ad00] [c00000000083d2b0] .__dev_set_rx_mode+0x80/0xd4
[ 9.219535] [c0000002e899ad90] [c00000000083d334] .dev_set_rx_mode+0x30/0x54
[ 9.225271] [c0000002e899ae10] [c00000000083d4a0] .__dev_open+0x148/0x1c8
[ 9.230751] [c0000002e899aeb0] [c00000000083d934] .__dev_change_flags+0x19c/0x1e0
[ 9.230755] [c0000002e899af60] [c00000000083d9a4] .dev_change_flags+0x2c/0x80
[ 9.242752] [c0000002e899aff0] [c0000000008554ec] .do_setlink+0x350/0xf08
[ 9.248228] [c0000002e899b170] [c000000000857ad0] .rtnl_newlink+0x588/0x7e0
[ 9.253965] [c0000002e899b740] [c000000000852424] .rtnetlink_rcv_msg+0x3e0/0x498
[ 9.261440] [c0000002e899b820] [c000000000884790] .netlink_rcv_skb+0x134/0x14c
[ 9.267607] [c0000002e899b8e0] [c000000000851840] .rtnetlink_rcv+0x18/0x2c
[ 9.274558] [c0000002e899b950] [c000000000883c8c] .netlink_unicast+0x214/0x318
[ 9.281163] [c0000002e899ba00] [c000000000884220] .netlink_sendmsg+0x348/0x444
[ 9.287076] [c0000002e899bae0] [c00000000080d13c] .sock_sendmsg+0x2c/0x54
[ 9.287080] [c0000002e899bb50] [c0000000008106c0] .___sys_sendmsg+0x2d0/0x2d8
[ 9.298375] [c0000002e899bd30] [c000000000811a80] .__sys_sendmsg+0x5c/0xb0
[ 9.303939] [c0000002e899be20] [c0000000000006b0] system_call+0x60/0x6c
Signed-off-by: Scott Wood <oss@buserror.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
HFCPCI_l1hw()
In drivers/isdn/hisax/hfc_pci.c, the functions hfcpci_interrupt() and
HFCPCI_l1hw() may be concurrently executed.
HFCPCI_l1hw()
line 1173: if (!cs->tx_skb)
hfcpci_interrupt()
line 942: spin_lock_irqsave();
line 1066: dev_kfree_skb_irq(cs->tx_skb);
Thus, a possible concurrency use-after-free bug may occur
in HFCPCI_l1hw().
To fix these bugs, the calls to spin_lock_irqsave() and
spin_unlock_irqrestore() are added in HFCPCI_l1hw(), to protect the
access to cs->tx_skb.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The return type for get_regs_len in struct ethtool_ops is int,
the hns3 driver may return error when failing to get the regs
len by sending cmd to firmware.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Size and 'next bit' were swapped, this bug could cause worker to
reschedule itself even if system was idle.
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Instead of removing a empty list node that might be reintroduced soon
thereafter, tentatively place the empty list node on the list passed to
tree_nodes_free(), then re-check if the list is empty again before erasing
it from the tree.
[ Florian: rebase on top of pending nf_conncount fixes ]
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Two CPUs may race to remove a connection from the list, the existing
conn->dead will result in a use-after-free. Use the per-list spinlock to
protect list iterations.
As all accesses to the list now happen while holding the per-list lock,
we no longer need to delay free operations with rcu.
Joint work with Florian.
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
'lookup' is always followed by 'add'.
Merge both and make the list-walk part of nf_conncount_add().
This also avoids one unneeded unlock/re-lock pair.
Extra care needs to be taken in count_tree, as we only hold rcu
read lock, i.e. we can only insert to an existing tree node after
acquiring its lock and making sure it has a nonzero count.
As a zero count should be rare, just fall back to insert_tree()
(which acquires tree lock).
This issue and its solution were pointed out by Shawn Bohrer
during patch review.
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Shawn Bohrer reported a following crash:
|RIP: 0010:rb_erase+0xae/0x360
[..]
Call Trace:
nf_conncount_destroy+0x59/0xc0 [nf_conncount]
cleanup_match+0x45/0x70 [ip_tables]
...
Shawn tracked this down to bogus 'parent' pointer:
Problem is that when we insert a new node, then there is a chance that
the 'parent' that we found was also passed to tree_nodes_free() (because
that node was empty) for erase+free.
Instead of trying to be clever and detect when this happens, restart
the search if we have evicted one or more nodes. To prevent frequent
restarts, do not perform gc on the second round.
Also, unconditionally schedule the gc worker.
The condition
gc_count > ARRAY_SIZE(gc_nodes))
cannot be true unless tree grows very large, as the height of the tree
will be low even with hundreds of nodes present.
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reported-by: Shawn Bohrer <sbohrer@cloudflare.com>
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The lockless workqueue garbage collector can race with packet path
garbage collector to delete list nodes, as it calls tree_nodes_free()
with the addresses of nodes that might have been free'd already from
another cpu.
To fix this, split gc into two phases.
One phase to perform gc on the connections: From a locking perspective,
this is the same as count_tree(): we hold rcu lock, but we do not
change the tree, we only change the nodes' contents.
The second phase acquires the tree lock and reaps empty nodes.
This avoids a race condition of the garbage collection vs. packet path:
If a node has been free'd already, the second phase won't find it anymore.
This second phase is, from locking perspective, same as insert_tree().
The former only modifies nodes (list content, count), latter modifies
the tree itself (rb_erase or rb_insert).
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
age is signed integer, so result can be negative when the timestamps
have a large delta. In this case we want to discard the entry.
Instead of using age >= 2 || age < 0, just make it unsigned.
Fixes: b36e4523d4d56 ("netfilter: nf_conncount: fix garbage collection confirm race")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Most of the time these were the same value anyway, but when
CONFIG_LOCKDEP was enabled we would use a smaller number of locks to
reduce overhead. Unfortunately having two values is confusing and not
worth the complexity.
This fixes a bug where tree_gc_worker() would only GC up to
CONNCOUNT_LOCK_SLOTS trees which meant when CONFIG_LOCKDEP was enabled
not all trees would be GCed by tree_gc_worker().
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
If nla_nest_start() may fail. The fix checks its return value and goes
to nla_put_failure if it fails.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Correct two minor kerneldoc errors:
1) missing reference to @mode in struct phy_ops
2) obsolete reference to @init_data in struct_phy_attrs,
removed in dbc98635e0d42f0e62ea92813df1e0e4c90f8375
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
1) note that gianfar_phy.c was removed years ago
2) fix obvious copy and paste error in regular doc
3) change regular doc into kerneldoc for phy_modes()
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes potential double frees if register_hdlc_device() fails.
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Reviewed-by: Peng Hao <peng.hao2@zte.com.cn>
CC: Zhao Qiang <qiang.zhao@nxp.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When acpi_match_device fails, its return value is NULL. Directly using
the return value without a check may result in a NULL-pointer
dereference. The fix checks if acpi_match_device fails, and if so,
returns -EINVAL.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
genlmsg_put could fail. The fix inserts a check of its return value, and
if it fails, returns -EMSGSIZE.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
atl1e_write_phy_reg() could fail. The fix issues an error message when
it fails.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both bcm_sf2_sw_indir_rw and mdiobus_write_nested could fail, so let's
return their error codes upstream.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
clk_prepare() could fail, so let's check its status, and if it fails,
return its error code upstream.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
clk_prepare() could fail, so let's check its status, and if it fails,
return its error code upstream.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
niu_pci_eeprom_read() may fail, so we should check its return value
before using the read data.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
cudbg_collect_hw_sched() could fail when the function cudg_get_buffer()
returns an error. The fix adds a check to the latter function returning
error on failure
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While flushing the cache via ipv6_sysctl_rtcache_flush(), the call
to proc_dointvec() may fail. The fix adds a check that returns the
error, on failure.
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bearer_disable() already calls kfree_rcu() to free struct tipc_bearer,
we don't need to call kfree() again.
Fixes: cb30a63384bc ("tipc: refactor function tipc_enable_bearer()")
Reported-by: syzbot+b981acf1fb240c0c128b@syzkaller.appspotmail.com
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Add 1472-byte test to tcrypt for IPsec
- Reintroduced crypto stats interface with numerous changes
- Support incremental algorithm dumps
Algorithms:
- Add xchacha12/20
- Add nhpoly1305
- Add adiantum
- Add streebog hash
- Mark cts(cbc(aes)) as FIPS allowed
Drivers:
- Improve performance of arm64/chacha20
- Improve performance of x86/chacha20
- Add NEON-accelerated nhpoly1305
- Add SSE2 accelerated nhpoly1305
- Add AVX2 accelerated nhpoly1305
- Add support for 192/256-bit keys in gcmaes AVX
- Add SG support in gcmaes AVX
- ESN for inline IPsec tx in chcr
- Add support for CryptoCell 703 in ccree
- Add support for CryptoCell 713 in ccree
- Add SM4 support in ccree
- Add SM3 support in ccree
- Add support for chacha20 in caam/qi2
- Add support for chacha20 + poly1305 in caam/jr
- Add support for chacha20 + poly1305 in caam/qi2
- Add AEAD cipher support in cavium/nitrox"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (130 commits)
crypto: skcipher - remove remnants of internal IV generators
crypto: cavium/nitrox - Fix build with !CONFIG_DEBUG_FS
crypto: salsa20-generic - don't unnecessarily use atomic walk
crypto: skcipher - add might_sleep() to skcipher_walk_virt()
crypto: x86/chacha - avoid sleeping under kernel_fpu_begin()
crypto: cavium/nitrox - Added AEAD cipher support
crypto: mxc-scc - fix build warnings on ARM64
crypto: api - document missing stats member
crypto: user - remove unused dump functions
crypto: chelsio - Fix wrong error counter increments
crypto: chelsio - Reset counters on cxgb4 Detach
crypto: chelsio - Handle PCI shutdown event
crypto: chelsio - cleanup:send addr as value in function argument
crypto: chelsio - Use same value for both channel in single WR
crypto: chelsio - Swap location of AAD and IV sent in WR
crypto: chelsio - remove set but not used variable 'kctx_len'
crypto: ux500 - Use proper enum in hash_set_dma_transfer
crypto: ux500 - Use proper enum in cryp_set_dma_transfer
crypto: aesni - Add scatter/gather avx stubs, and use them in C
crypto: aesni - Introduce partial block macro
..
|
|
Pull networking updates from David Miller:
1) New ipset extensions for matching on destination MAC addresses, from
Stefano Brivio.
2) Add ipv4 ttl and tos, plus ipv6 flow label and hop limit offloads to
nfp driver. From Stefano Brivio.
3) Implement GRO for plain UDP sockets, from Paolo Abeni.
4) Lots of work from Michał Mirosław to eliminate the VLAN_TAG_PRESENT
bit so that we could support the entire vlan_tci value.
5) Rework the IPSEC policy lookups to better optimize more usecases,
from Florian Westphal.
6) Infrastructure changes eliminating direct manipulation of SKB lists
wherever possible, and to always use the appropriate SKB list
helpers. This work is still ongoing...
7) Lots of PHY driver and state machine improvements and
simplifications, from Heiner Kallweit.
8) Various TSO deferral refinements, from Eric Dumazet.
9) Add ntuple filter support to aquantia driver, from Dmitry Bogdanov.
10) Batch dropping of XDP packets in tuntap, from Jason Wang.
11) Lots of cleanups and improvements to the r8169 driver from Heiner
Kallweit, including support for ->xmit_more. This driver has been
getting some much needed love since he started working on it.
12) Lots of new forwarding selftests from Petr Machata.
13) Enable VXLAN learning in mlxsw driver, from Ido Schimmel.
14) Packed ring support for virtio, from Tiwei Bie.
15) Add new Aquantia AQtion USB driver, from Dmitry Bezrukov.
16) Add XDP support to dpaa2-eth driver, from Ioana Ciocoi Radulescu.
17) Implement coalescing on TCP backlog queue, from Eric Dumazet.
18) Implement carrier change in tun driver, from Nicolas Dichtel.
19) Support msg_zerocopy in UDP, from Willem de Bruijn.
20) Significantly improve garbage collection of neighbor objects when
the table has many PERMANENT entries, from David Ahern.
21) Remove egdev usage from nfp and mlx5, and remove the facility
completely from the tree as it no longer has any users. From Oz
Shlomo and others.
22) Add a NETDEV_PRE_CHANGEADDR so that drivers can veto the change and
therefore abort the operation before the commit phase (which is the
NETDEV_CHANGEADDR event). From Petr Machata.
23) Add indirect call wrappers to avoid retpoline overhead, and use them
in the GRO code paths. From Paolo Abeni.
24) Add support for netlink FDB get operations, from Roopa Prabhu.
25) Support bloom filter in mlxsw driver, from Nir Dotan.
26) Add SKB extension infrastructure. This consolidates the handling of
the auxiliary SKB data used by IPSEC and bridge netfilter, and is
designed to support the needs to MPTCP which could be integrated in
the future.
27) Lots of XDP TX optimizations in mlx5 from Tariq Toukan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1845 commits)
net: dccp: fix kernel crash on module load
drivers/net: appletalk/cops: remove redundant if statement and mask
bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw
net/net_namespace: Check the return value of register_pernet_subsys()
net/netlink_compat: Fix a missing check of nla_parse_nested
ieee802154: lowpan_header_create check must check daddr
net/mlx4_core: drop useless LIST_HEAD
mlxsw: spectrum: drop useless LIST_HEAD
net/mlx5e: drop useless LIST_HEAD
iptunnel: Set tun_flags in the iptunnel_metadata_reply from src
net/mlx5e: fix semicolon.cocci warnings
staging: octeon: fix build failure with XFRM enabled
net: Revert recent Spectre-v1 patches.
can: af_can: Fix Spectre v1 vulnerability
packet: validate address length if non-zero
nfc: af_nfc: Fix Spectre v1 vulnerability
phonet: af_phonet: Fix Spectre v1 vulnerability
net: core: Fix Spectre v1 vulnerability
net: minor cleanup in skb_ext_add()
net: drop the unused helper skb_ext_get()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull modules updates from Jessica Yu:
- Some modules-related kallsyms cleanups and a kallsyms fix for ARM.
- Include keys from the secondary keyring in module signature
verification.
* tag 'modules-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
ARM: module: Fix function kallsyms on Thumb-2
module: Overwrite st_size instead of st_info
module: make it clearer when we're handling kallsyms symbols vs exported symbols
modsign: use all trusted keys to verify module signature
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull general security subsystem updates from James Morris:
"The main changes here are Paul Gortmaker's removal of unneccesary
module.h infrastructure"
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: integrity: partial revert of make ima_main explicitly non-modular
security: fs: make inode explicitly non-modular
security: audit and remove any unnecessary uses of module.h
security: integrity: make evm_main explicitly non-modular
keys: remove needless modular infrastructure from ecryptfs_format
security: integrity: make ima_main explicitly non-modular
tomoyo: fix small typo
|