summaryrefslogtreecommitdiff
path: root/drivers/net/ppp/ppp_generic.c
AgeCommit message (Collapse)Author
2017-10-22drivers, net, ppp: convert ppp_file.refcnt from atomic_t to refcount_tElena Reshetova
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable ppp_file.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-06ppp: fix race in ppp device destructionGuillaume Nault
ppp_release() tries to ensure that netdevices are unregistered before decrementing the unit refcount and running ppp_destroy_interface(). This is all fine as long as the the device is unregistered by ppp_release(): the unregister_netdevice() call, followed by rtnl_unlock(), guarantee that the unregistration process completes before rtnl_unlock() returns. However, the device may be unregistered by other means (like ppp_nl_dellink()). If this happens right before ppp_release() calling rtnl_lock(), then ppp_release() has to wait for the concurrent unregistration code to release the lock. But rtnl_unlock() releases the lock before completing the device unregistration process. This allows ppp_release() to proceed and eventually call ppp_destroy_interface() before the unregistration process completes. Calling free_netdev() on this partially unregistered device will BUG(): ------------[ cut here ]------------ kernel BUG at net/core/dev.c:8141! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 1557 Comm: pppd Not tainted 4.14.0-rc2+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ppp_destroy_interface+0xd8/0xe0 [ppp_generic] ppp_disconnect_channel+0xda/0x110 [ppp_generic] ppp_unregister_channel+0x5e/0x110 [ppp_generic] pppox_unbind_sock+0x23/0x30 [pppox] pppoe_connect+0x130/0x440 [pppoe] SYSC_connect+0x98/0x110 ? do_fcntl+0x2c0/0x5d0 SyS_connect+0xe/0x10 entry_SYSCALL_64_fastpath+0x1a/0xa5 RIP: free_netdev+0x107/0x110 RSP: ffffc28a40573d88 ---[ end trace ed294ff0cc40eeff ]--- We could set the ->needs_free_netdev flag on PPP devices and move the ppp_destroy_interface() logic in the ->priv_destructor() callback. But that'd be quite intrusive as we'd first need to unlink from the other channels and units that depend on the device (the ones that used the PPPIOCCONNECT and PPPIOCATTACH ioctls). Instead, we can just let the netdevice hold a reference on its ppp_file. This reference is dropped in ->priv_destructor(), at the very end of the unregistration process, so that neither ppp_release() nor ppp_disconnect_channel() can call ppp_destroy_interface() in the interim. Reported-by: Beniamino Galvani <bgalvani@redhat.com> Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-01ppp: fix __percpu annotationGuillaume Nault
Move sparse annotation right after pointer type. Fixes sparse warning: drivers/net/ppp/ppp_generic.c:1422:13: warning: incorrect type in initializer (different address spaces) drivers/net/ppp/ppp_generic.c:1422:13: expected void const [noderef] <asn:3>*__vpp_verify drivers/net/ppp/ppp_generic.c:1422:13: got int *<noident> ... Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08ppp: fix xmit recursion detection on ppp channelsGuillaume Nault
Commit e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") dropped the xmit_recursion counter incrementation in ppp_channel_push() and relied on ppp_xmit_process() for this task. But __ppp_channel_push() can also send packets directly (using the .start_xmit() channel callback), in which case the xmit_recursion counter isn't incremented anymore. If such packets get routed back to the parent ppp unit, ppp_xmit_process() won't notice the recursion and will call ppp_channel_push() on the same channel, effectively creating the deadlock situation that the xmit_recursion mechanism was supposed to prevent. This patch re-introduces the xmit_recursion counter incrementation in ppp_channel_push(). Since the xmit_recursion variable is now part of the parent ppp unit, incrementation is skipped if the channel doesn't have any. This is fine because only packets routed through the parent unit may enter the channel recursively. Finally, we have to ensure that pch->ppp is not going to be modified while executing ppp_channel_push(). Instead of taking this lock only while calling ppp_xmit_process(), we now have to hold it for the full ppp_channel_push() execution. This respects the ppp locks ordering which requires locking ->upl before ->downl. Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18ppp: Fix false xmit recursion detect with two ppp devicesGao Feng
The global percpu variable ppp_xmit_recursion is used to detect the ppp xmit recursion to avoid the deadlock, which is caused by one CPU tries to lock the xmit lock twice. But it would report false recursion when one CPU wants to send the skb from two different PPP devices, like one L2TP on the PPPoE. It is a normal case actually. Now use one percpu member of struct ppp instead of the gloable variable to detect the xmit recursion of one ppp device. Fixes: 55454a565836 ("ppp: avoid dealock on recursive xmit") Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: Liu Jianying <jianying.liu@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.validateMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.newlinkMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16networking: make skb_push & __skb_push return void pointersJohannes Berg
It seems like a historic accident that these return unsigned char *, and in many places that means casts are required, more often than not. Make these functions return void * and remove all the casts across the tree, adding a (u8 *) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - *(fn(SKB, LEN)) + *(u8 *)fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; type T; @@ - E = ((T *)(fn(SKB, LEN))) + E = fn(SKB, LEN) @@ expression SKB, LEN; identifier fn = { skb_push, __skb_push, skb_push_rcsum }; @@ - fn(SKB, LEN)[0] + *(u8 *)fn(SKB, LEN) Note that the last part there converts from push(...)[0] to the more idiomatic *(u8 *)push(...). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01ppp: remove unnecessary bh disable in xmit pathGao Feng
Since the commit 55454a565836 ("ppp: avoid dealock on recursive xmit"), the PPP xmit path is protected by wrapper functions which disable the bh already. So it is unnecessary to disable the bh again in the real xmit path. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-02sched/headers: Prepare to move signal wakeup & sigpending methods from ↵Ingo Molnar
<linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-08net: make ndo_get_stats64 a void functionstephen hemminger
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18netns: make struct pernet_operations::id unsigned intAlexey Dobriyan
Make struct pernet_operations::id unsigned. There are 2 reasons to do so: 1) This field is really an index into an zero based array and thus is unsigned entity. Using negative value is out-of-bound access by definition. 2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preffered to signed 32-bit data. "int" being used as an array index needs to be sign-extended to 64-bit before being used. void f(long *p, int i) { g(p[i]); } roughly translates to movsx rsi, esi mov rdi, [rsi+...] call g MOVSX is 3 byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default. Now, there is net_generic() function which, you guessed it right, uses "int" as an array index: static inline void *net_generic(const struct net *net, int id) { ... ptr = ng->ptr[id - 1]; ... } And this function is used a lot, so those sign extensions add up. Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation): add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) Unfortunately some functions actually grow bigger. This is a semmingly random artefact of code generation with register allocator being used differently. gcc decides that some variable needs to live in new r8+ registers and every access now requires REX prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be used which is longer than [r8] However, overall balance is in negative direction: add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) function old new delta nfsd4_lock 3886 3959 +73 tipc_link_build_proto_msg 1096 1140 +44 mac80211_hwsim_new_radio 2776 2808 +32 tipc_mon_rcv 1032 1058 +26 svcauth_gss_legacy_init 1413 1429 +16 tipc_bcbase_select_primary 379 392 +13 nfsd4_exchange_id 1247 1260 +13 nfsd4_setclientid_confirm 782 793 +11 ... put_client_renew_locked 494 480 -14 ip_set_sockfn_get 730 716 -14 geneve_sock_add 829 813 -16 nfsd4_sequence_done 721 703 -18 nlmclnt_lookup_host 708 686 -22 nfsd4_lockt 1085 1063 -22 nfs_get_client 1077 1050 -27 tcf_bpf_init 1106 1076 -30 nfsd4_encode_fattr 5997 5930 -67 Total: Before=154856051, After=154854321, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31ppp: declare PPP devices as LLTXGuillaume Nault
ppp_xmit_process() already locks the xmit path. If HARD_TX_LOCK() tries to hold the _xmit_lock we can get lock inversion. [ 973.726130] ====================================================== [ 973.727311] [ INFO: possible circular locking dependency detected ] [ 973.728546] 4.8.0-rc2 #1 Tainted: G O [ 973.728986] ------------------------------------------------------- [ 973.728986] accel-pppd/1806 is trying to acquire lock: [ 973.728986] (&qdisc_xmit_lock_key){+.-...}, at: [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [ 973.728986] but task is already holding lock: [ 973.728986] (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] [ 973.728986] which lock already depends on the new lock. [ 973.728986] [ 973.728986] [ 973.728986] the existing dependency chain (in reverse order) is: [ 973.728986] -> #3 (l2tp_sock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] -> #2 (&(&pch->downl)->rlock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40 [ 973.728986] [<ffffffffa01808e2>] ppp_push+0xa7/0x82d [ppp_generic] [ 973.728986] [<ffffffffa0184675>] __ppp_xmit_process+0x48/0x877 [ppp_generic] [ 973.728986] [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic] [ 973.728986] [<ffffffffa01853f7>] ppp_write+0x10e/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] -> #1 (&(&ppp->wlock)->rlock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40 [ 973.728986] [<ffffffffa0184654>] __ppp_xmit_process+0x27/0x877 [ppp_generic] [ 973.728986] [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic] [ 973.728986] [<ffffffffa01852da>] ppp_start_xmit+0x21b/0x22a [ppp_generic] [ 973.728986] [<ffffffff8143f767>] dev_hard_start_xmit+0x1a9/0x43d [ 973.728986] [<ffffffff8146f747>] sch_direct_xmit+0xd6/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff8150e62b>] ip6_finish_output2+0x5a9/0x623 [ 973.728986] [<ffffffff81512128>] ip6_output+0x15e/0x16a [ 973.728986] [<ffffffff8153ef86>] dst_output+0x76/0x7f [ 973.728986] [<ffffffff8153f737>] mld_sendpack+0x335/0x404 [ 973.728986] [<ffffffff81541c61>] mld_send_initial_cr.part.21+0x99/0xa2 [ 973.728986] [<ffffffff8154441d>] ipv6_mc_dad_complete+0x42/0x71 [ 973.728986] [<ffffffff8151c4bd>] addrconf_dad_completed+0x1cf/0x2ea [ 973.728986] [<ffffffff8151e4fa>] addrconf_dad_work+0x453/0x520 [ 973.728986] [<ffffffff8107a393>] process_one_work+0x365/0x6f0 [ 973.728986] [<ffffffff8107aecd>] worker_thread+0x2de/0x421 [ 973.728986] [<ffffffff810816fb>] kthread+0x121/0x130 [ 973.728986] [<ffffffff81575dbf>] ret_from_fork+0x1f/0x40 [ 973.728986] -> #0 (&qdisc_xmit_lock_key){+.-...}: [ 973.728986] [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483 [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff81487811>] ip_finish_output2+0x5db/0x609 [ 973.728986] [<ffffffff81489590>] ip_finish_output+0x152/0x15e [ 973.728986] [<ffffffff8148a0d4>] ip_output+0x8c/0x96 [ 973.728986] [<ffffffff81489652>] ip_local_out+0x41/0x4a [ 973.728986] [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609 [ 973.728986] [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] [ 973.728986] other info that might help us debug this: [ 973.728986] [ 973.728986] Chain exists of: &qdisc_xmit_lock_key --> &(&pch->downl)->rlock --> l2tp_sock [ 973.728986] Possible unsafe locking scenario: [ 973.728986] [ 973.728986] CPU0 CPU1 [ 973.728986] ---- ---- [ 973.728986] lock(l2tp_sock); [ 973.728986] lock(&(&pch->downl)->rlock); [ 973.728986] lock(l2tp_sock); [ 973.728986] lock(&qdisc_xmit_lock_key); [ 973.728986] [ 973.728986] *** DEADLOCK *** [ 973.728986] [ 973.728986] 6 locks held by accel-pppd/1806: [ 973.728986] #0: (&(&pch->downl)->rlock){+.-...}, at: [<ffffffffa0184efa>] ppp_channel_push+0x56/0x14a [ppp_generic] [ 973.728986] #1: (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] #2: (rcu_read_lock){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #3: (rcu_read_lock_bh){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #4: (rcu_read_lock_bh){......}, at: [<ffffffff814340e3>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #5: (dev->qdisc_running_key ?: &qdisc_running_key#2){+.....}, at: [<ffffffff8144011e>] __dev_queue_xmit+0x564/0x912 [ 973.728986] [ 973.728986] stack backtrace: [ 973.728986] CPU: 2 PID: 1806 Comm: accel-pppd Tainted: G O 4.8.0-rc2 #1 [ 973.728986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 973.728986] ffff7fffffffffff ffff88003436f850 ffffffff812a20f4 ffffffff82156e30 [ 973.728986] ffffffff82156920 ffff88003436f890 ffffffff8115c759 ffff88003344ae00 [ 973.728986] ffff88003344b5c0 0000000000000002 0000000000000006 ffff88003344b5e8 [ 973.728986] Call Trace: [ 973.728986] [<ffffffff812a20f4>] dump_stack+0x67/0x90 [ 973.728986] [<ffffffff8115c759>] print_circular_bug+0x22e/0x23c [ 973.728986] [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483 [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff810b3130>] ? lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff81487811>] ip_finish_output2+0x5db/0x609 [ 973.728986] [<ffffffff81486853>] ? dst_mtu+0x29/0x2e [ 973.728986] [<ffffffff81489590>] ip_finish_output+0x152/0x15e [ 973.728986] [<ffffffff8148a0bc>] ? ip_output+0x74/0x96 [ 973.728986] [<ffffffff8148a0d4>] ip_output+0x8c/0x96 [ 973.728986] [<ffffffff81489652>] ip_local_out+0x41/0x4a [ 973.728986] [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609 [ 973.728986] [<ffffffff814c559e>] ? udp_set_csum+0x207/0x21e [ 973.728986] [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff8124c11d>] ? fsnotify_perm+0x27/0x95 [ 973.728986] [<ffffffff8124d41d>] ? security_file_permission+0x4d/0x54 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] [<ffffffff810ae0fa>] ? trace_hardirqs_off_caller+0x121/0x12f Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31ppp: avoid dealock on recursive xmitGuillaume Nault
In case of misconfiguration, a virtual PPP channel might send packets back to their parent PPP interface. This typically happens in misconfigured L2TP setups, where PPP's peer IP address is set with the IP of the L2TP peer. When that happens the system hangs due to PPP trying to recursively lock its xmit path. [ 243.332155] BUG: spinlock recursion on CPU#1, accel-pppd/926 [ 243.333272] lock: 0xffff880033d90f18, .magic: dead4ead, .owner: accel-pppd/926, .owner_cpu: 1 [ 243.334859] CPU: 1 PID: 926 Comm: accel-pppd Not tainted 4.8.0-rc2 #1 [ 243.336010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 243.336018] ffff7fffffffffff ffff8800319a77a0 ffffffff8128de85 ffff880033d90f18 [ 243.336018] ffff880033ad8000 ffff8800319a77d8 ffffffff810ad7c0 ffffffff0000039e [ 243.336018] ffff880033d90f18 ffff880033d90f60 ffff880033d90f18 ffff880033d90f28 [ 243.336018] Call Trace: [ 243.336018] [<ffffffff8128de85>] dump_stack+0x4f/0x65 [ 243.336018] [<ffffffff810ad7c0>] spin_dump+0xe1/0xeb [ 243.336018] [<ffffffff810ad7f0>] spin_bug+0x26/0x28 [ 243.336018] [<ffffffff810ad8b9>] do_raw_spin_lock+0x5c/0x160 [ 243.336018] [<ffffffff815522aa>] _raw_spin_lock_bh+0x35/0x3c [ 243.336018] [<ffffffffa01a88e2>] ? ppp_push+0xa7/0x82d [ppp_generic] [ 243.336018] [<ffffffffa01a88e2>] ppp_push+0xa7/0x82d [ppp_generic] [ 243.336018] [<ffffffff810adada>] ? do_raw_spin_unlock+0xc2/0xcc [ 243.336018] [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7 [ 243.336018] [<ffffffff81552438>] ? _raw_spin_unlock_irqrestore+0x34/0x49 [ 243.336018] [<ffffffffa01ac657>] ppp_xmit_process+0x48/0x877 [ppp_generic] [ 243.336018] [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7 [ 243.336018] [<ffffffff81408cd3>] ? skb_queue_tail+0x71/0x7c [ 243.336018] [<ffffffffa01ad1c5>] ppp_start_xmit+0x21b/0x22a [ppp_generic] [ 243.336018] [<ffffffff81426af1>] dev_hard_start_xmit+0x15e/0x32c [ 243.336018] [<ffffffff81454ed7>] sch_direct_xmit+0xd6/0x221 [ 243.336018] [<ffffffff814273a8>] __dev_queue_xmit+0x52a/0x820 [ 243.336018] [<ffffffff814276a9>] dev_queue_xmit+0xb/0xd [ 243.336018] [<ffffffff81430a3c>] neigh_direct_output+0xc/0xe [ 243.336018] [<ffffffff8146b5d7>] ip_finish_output2+0x4d2/0x548 [ 243.336018] [<ffffffff8146a8e6>] ? dst_mtu+0x29/0x2e [ 243.336018] [<ffffffff8146d49c>] ip_finish_output+0x152/0x15e [ 243.336018] [<ffffffff8146df84>] ? ip_output+0x74/0x96 [ 243.336018] [<ffffffff8146df9c>] ip_output+0x8c/0x96 [ 243.336018] [<ffffffff8146d55e>] ip_local_out+0x41/0x4a [ 243.336018] [<ffffffff8146dd15>] ip_queue_xmit+0x531/0x5c5 [ 243.336018] [<ffffffff814a82cd>] ? udp_set_csum+0x207/0x21e [ 243.336018] [<ffffffffa01f2f04>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 243.336018] [<ffffffffa01ea458>] pppol2tp_xmit+0x1eb/0x257 [l2tp_ppp] [ 243.336018] [<ffffffffa01acf17>] ppp_channel_push+0x91/0x102 [ppp_generic] [ 243.336018] [<ffffffffa01ad2d8>] ppp_write+0x104/0x11c [ppp_generic] [ 243.336018] [<ffffffff811a3c1e>] __vfs_write+0x56/0x120 [ 243.336018] [<ffffffff81239801>] ? fsnotify_perm+0x27/0x95 [ 243.336018] [<ffffffff8123ab01>] ? security_file_permission+0x4d/0x54 [ 243.336018] [<ffffffff811a4ca4>] vfs_write+0xbd/0x11b [ 243.336018] [<ffffffff811a5a0a>] SyS_write+0x5e/0x96 [ 243.336018] [<ffffffff81552a1b>] entry_SYSCALL_64_fastpath+0x13/0x94 The main entry points for sending packets over a PPP unit are the .write() and .ndo_start_xmit() callbacks (simplified view): .write(unit fd) or .ndo_start_xmit() \ CALL ppp_xmit_process() \ LOCK unit's xmit path (ppp->wlock) | CALL ppp_push() \ LOCK channel's xmit path (chan->downl) | CALL lower layer's .start_xmit() callback \ ... might recursively call .ndo_start_xmit() ... / RETURN from .start_xmit() | UNLOCK channel's xmit path / RETURN from ppp_push() | UNLOCK unit's xmit path / RETURN from ppp_xmit_process() Packets can also be directly sent on channels (e.g. LCP packets): .write(channel fd) or ppp_output_wakeup() \ CALL ppp_channel_push() \ LOCK channel's xmit path (chan->downl) | CALL lower layer's .start_xmit() callback \ ... might call .ndo_start_xmit() ... / RETURN from .start_xmit() | UNLOCK channel's xmit path / RETURN from ppp_channel_push() Key points about the lower layer's .start_xmit() callback: * It can be called directly by a channel fd .write() or by ppp_output_wakeup() or indirectly by a unit fd .write() or by .ndo_start_xmit(). * In any case, it's always called with chan->downl held. * It might route the packet back to its parent unit using .ndo_start_xmit() as entry point. This patch detects and breaks recursion in ppp_xmit_process(). This function is a good candidate for the task because it's called early enough after .ndo_start_xmit(), it's always part of the recursion loop and it's on the path of whatever entry point is used to send a packet on a PPP unit. Recursion detection is done using the per-cpu ppp_xmit_recursion variable. Since ppp_channel_push() too locks the channel's xmit path and calls the lower layer's .start_xmit() callback, we need to also increment ppp_xmit_recursion there. However there's no need to check for recursion, as it's out of the recursion loop. Reported-by: Feng Gao <gfree.wind@gmail.com> Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-09ppp: build ifname using unit identifier for rtnl based devicesGuillaume Nault
Userspace programs generally need to know the name of the ppp devices they create. Both ioctl and rtnl interfaces use the ppp<suffix> sheme to name them. But although the suffix used by the ioctl interface can be known by userspace (it's the PPP unit identifier returned by the PPPIOCGUNIT ioctl), the one used by the rtnl is only known by the kernel. This patch brings more consistency between ioctl and rtnl based ppp devices by generating device names using the PPP unit identifer as suffix in both cases. This way, userspace can always infer the name of the devices they create. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Just several instances of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-08ppp: defer netns reference release for ppp channelWANG Cong
Matt reported that we have a NULL pointer dereference in ppp_pernet() from ppp_connect_channel(), i.e. pch->chan_net is NULL. This is due to that a parallel ppp_unregister_channel() could happen while we are in ppp_connect_channel(), during which pch->chan_net set to NULL. Since we need a reference to net per channel, it makes sense to sync the refcnt with the life time of the channel, therefore we should release this reference when we destroy it. Fixes: 1f461dcdd296 ("ppp: take reference on channels netns") Reported-by: Matt Bennett <Matt.Bennett@alliedtelesis.co.nz> Cc: Paul Mackerras <paulus@samba.org> Cc: linux-ppp@vger.kernel.org Cc: Guillaume Nault <g.nault@alphalink.fr> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09net: add netdev_lockdep_set_classes() helperEric Dumazet
It is time to add netdev_lockdep_set_classes() helper so that lockdep annotations per device type are easier to manage. This removes a lot of copies and missing annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07net_sched: transform qdisc running bit into a seqcountEric Dumazet
Instead of using a single bit (__QDISC___STATE_RUNNING) in sch->__state, use a seqcount. This adds lockdep support, but more importantly it will allow us to sample qdisc/class statistics without having to grab qdisc root lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29ppp: add rtnetlink device creation supportGuillaume Nault
Define PPP device handler for use with rtnetlink. The only PPP specific attribute is IFLA_PPP_DEV_FD. It is mandatory and contains the file descriptor of the associated /dev/ppp instance (the file descriptor which would have been used for ioctl(PPPIOCNEWUNIT) in the ioctl-based API). The PPP device is removed when this file descriptor is released (same behaviour as with ioctl based PPP devices). PPP devices created with the rtnetlink API behave like the ones created with ioctl(PPPIOCNEWUNIT). In particular existing ioctls work the same way, no matter how the PPP device was created. The rtnl callbacks are also assigned to ioctl based PPP devices. This way, rtnl messages have the same effect on any PPP devices. The immediate effect is that all PPP devices, even ioctl-based ones, can now be removed with "ip link del". A minor difference still exists between ioctl and rtnl based PPP interfaces: in the device name, the number following the "ppp" prefix corresponds to the PPP unit number for ioctl based devices, while it is just an unrelated incrementing index for rtnl ones. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29ppp: define reusable device creation functionsGuillaume Nault
Move PPP device initialisation and registration out of ppp_create_interface(). This prepares code for device registration with rtnetlink. While there, simplify the prototype of ppp_create_interface(): * Since ppp_dev_configure() takes care of setting file->private_data, there's no need to return a ppp structure to ppp_unattached_ioctl() anymore. * The unit parameter is made read/write so that ppp_create_interface() can tell which unit number has been assigned. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-23ppp: take reference on channels netnsGuillaume Nault
Let channels hold a reference on their network namespace. Some channel types, like ppp_async and ppp_synctty, can have their userspace controller running in a different namespace. Therefore they can't rely on them to preclude their netns from being removed from under them. ================================================================== BUG: KASAN: use-after-free in ppp_unregister_channel+0x372/0x3a0 at addr ffff880064e217e0 Read of size 8 by task syz-executor/11581 ============================================================================= BUG net_namespace (Not tainted): kasan: bad access detected ----------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Allocated in copy_net_ns+0x6b/0x1a0 age=92569 cpu=3 pid=6906 [< none >] ___slab_alloc+0x4c7/0x500 kernel/mm/slub.c:2440 [< none >] __slab_alloc+0x4c/0x90 kernel/mm/slub.c:2469 [< inline >] slab_alloc_node kernel/mm/slub.c:2532 [< inline >] slab_alloc kernel/mm/slub.c:2574 [< none >] kmem_cache_alloc+0x23a/0x2b0 kernel/mm/slub.c:2579 [< inline >] kmem_cache_zalloc kernel/include/linux/slab.h:597 [< inline >] net_alloc kernel/net/core/net_namespace.c:325 [< none >] copy_net_ns+0x6b/0x1a0 kernel/net/core/net_namespace.c:360 [< none >] create_new_namespaces+0x2f6/0x610 kernel/kernel/nsproxy.c:95 [< none >] copy_namespaces+0x297/0x320 kernel/kernel/nsproxy.c:150 [< none >] copy_process.part.35+0x1bf4/0x5760 kernel/kernel/fork.c:1451 [< inline >] copy_process kernel/kernel/fork.c:1274 [< none >] _do_fork+0x1bc/0xcb0 kernel/kernel/fork.c:1723 [< inline >] SYSC_clone kernel/kernel/fork.c:1832 [< none >] SyS_clone+0x37/0x50 kernel/kernel/fork.c:1826 [< none >] entry_SYSCALL_64_fastpath+0x16/0x7a kernel/arch/x86/entry/entry_64.S:185 INFO: Freed in net_drop_ns+0x67/0x80 age=575 cpu=2 pid=2631 [< none >] __slab_free+0x1fc/0x320 kernel/mm/slub.c:2650 [< inline >] slab_free kernel/mm/slub.c:2805 [< none >] kmem_cache_free+0x2a0/0x330 kernel/mm/slub.c:2814 [< inline >] net_free kernel/net/core/net_namespace.c:341 [< none >] net_drop_ns+0x67/0x80 kernel/net/core/net_namespace.c:348 [< none >] cleanup_net+0x4e5/0x600 kernel/net/core/net_namespace.c:448 [< none >] process_one_work+0x794/0x1440 kernel/kernel/workqueue.c:2036 [< none >] worker_thread+0xdb/0xfc0 kernel/kernel/workqueue.c:2170 [< none >] kthread+0x23f/0x2d0 kernel/drivers/block/aoe/aoecmd.c:1303 [< none >] ret_from_fork+0x3f/0x70 kernel/arch/x86/entry/entry_64.S:468 INFO: Slab 0xffffea0001938800 objects=3 used=0 fp=0xffff880064e20000 flags=0x5fffc0000004080 INFO: Object 0xffff880064e20000 @offset=0 fp=0xffff880064e24200 CPU: 1 PID: 11581 Comm: syz-executor Tainted: G B 4.4.0+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 00000000ffffffff ffff8800662c7790 ffffffff8292049d ffff88003e36a300 ffff880064e20000 ffff880064e20000 ffff8800662c77c0 ffffffff816f2054 ffff88003e36a300 ffffea0001938800 ffff880064e20000 0000000000000000 Call Trace: [< inline >] __dump_stack kernel/lib/dump_stack.c:15 [<ffffffff8292049d>] dump_stack+0x6f/0xa2 kernel/lib/dump_stack.c:50 [<ffffffff816f2054>] print_trailer+0xf4/0x150 kernel/mm/slub.c:654 [<ffffffff816f875f>] object_err+0x2f/0x40 kernel/mm/slub.c:661 [< inline >] print_address_description kernel/mm/kasan/report.c:138 [<ffffffff816fb0c5>] kasan_report_error+0x215/0x530 kernel/mm/kasan/report.c:236 [< inline >] kasan_report kernel/mm/kasan/report.c:259 [<ffffffff816fb4de>] __asan_report_load8_noabort+0x3e/0x40 kernel/mm/kasan/report.c:280 [< inline >] ? ppp_pernet kernel/include/linux/compiler.h:218 [<ffffffff83ad71b2>] ? ppp_unregister_channel+0x372/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392 [< inline >] ppp_pernet kernel/include/linux/compiler.h:218 [<ffffffff83ad71b2>] ppp_unregister_channel+0x372/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392 [< inline >] ? ppp_pernet kernel/drivers/net/ppp/ppp_generic.c:293 [<ffffffff83ad6f26>] ? ppp_unregister_channel+0xe6/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392 [<ffffffff83ae18f3>] ppp_asynctty_close+0xa3/0x130 kernel/drivers/net/ppp/ppp_async.c:241 [<ffffffff83ae1850>] ? async_lcp_peek+0x5b0/0x5b0 kernel/drivers/net/ppp/ppp_async.c:1000 [<ffffffff82c33239>] tty_ldisc_close.isra.1+0x99/0xe0 kernel/drivers/tty/tty_ldisc.c:478 [<ffffffff82c332c0>] tty_ldisc_kill+0x40/0x170 kernel/drivers/tty/tty_ldisc.c:744 [<ffffffff82c34943>] tty_ldisc_release+0x1b3/0x260 kernel/drivers/tty/tty_ldisc.c:772 [<ffffffff82c1ef21>] tty_release+0xac1/0x13e0 kernel/drivers/tty/tty_io.c:1901 [<ffffffff82c1e460>] ? release_tty+0x320/0x320 kernel/drivers/tty/tty_io.c:1688 [<ffffffff8174de36>] __fput+0x236/0x780 kernel/fs/file_table.c:208 [<ffffffff8174e405>] ____fput+0x15/0x20 kernel/fs/file_table.c:244 [<ffffffff813595ab>] task_work_run+0x16b/0x200 kernel/kernel/task_work.c:115 [< inline >] exit_task_work kernel/include/linux/task_work.h:21 [<ffffffff81307105>] do_exit+0x8b5/0x2c60 kernel/kernel/exit.c:750 [<ffffffff813fdd20>] ? debug_check_no_locks_freed+0x290/0x290 kernel/kernel/locking/lockdep.c:4123 [<ffffffff81306850>] ? mm_update_next_owner+0x6f0/0x6f0 kernel/kernel/exit.c:357 [<ffffffff813215e6>] ? __dequeue_signal+0x136/0x470 kernel/kernel/signal.c:550 [<ffffffff8132067b>] ? recalc_sigpending_tsk+0x13b/0x180 kernel/kernel/signal.c:145 [<ffffffff81309628>] do_group_exit+0x108/0x330 kernel/kernel/exit.c:880 [<ffffffff8132b9d4>] get_signal+0x5e4/0x14f0 kernel/kernel/signal.c:2307 [< inline >] ? kretprobe_table_lock kernel/kernel/kprobes.c:1113 [<ffffffff8151d355>] ? kprobe_flush_task+0xb5/0x450 kernel/kernel/kprobes.c:1158 [<ffffffff8115f7d3>] do_signal+0x83/0x1c90 kernel/arch/x86/kernel/signal.c:712 [<ffffffff8151d2a0>] ? recycle_rp_inst+0x310/0x310 kernel/include/linux/list.h:655 [<ffffffff8115f750>] ? setup_sigcontext+0x780/0x780 kernel/arch/x86/kernel/signal.c:165 [<ffffffff81380864>] ? finish_task_switch+0x424/0x5f0 kernel/kernel/sched/core.c:2692 [< inline >] ? finish_lock_switch kernel/kernel/sched/sched.h:1099 [<ffffffff81380560>] ? finish_task_switch+0x120/0x5f0 kernel/kernel/sched/core.c:2678 [< inline >] ? context_switch kernel/kernel/sched/core.c:2807 [<ffffffff85d794e9>] ? __schedule+0x919/0x1bd0 kernel/kernel/sched/core.c:3283 [<ffffffff81003901>] exit_to_usermode_loop+0xf1/0x1a0 kernel/arch/x86/entry/common.c:247 [< inline >] prepare_exit_to_usermode kernel/arch/x86/entry/common.c:282 [<ffffffff810062ef>] syscall_return_slowpath+0x19f/0x210 kernel/arch/x86/entry/common.c:344 [<ffffffff85d88022>] int_ret_from_sys_call+0x25/0x9f kernel/arch/x86/entry/entry_64.S:281 Memory state around the buggy address: ffff880064e21680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff880064e21700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff880064e21780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff880064e21800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff880064e21880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fixes: 273ec51dd7ce ("net: ppp_generic - introduce net-namespace functionality v2") Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-16ppp: ensure file->private_data can't be overriddenGuillaume Nault
Locking ppp_mutex must be done before dereferencing file->private_data, otherwise it could be modified before ppp_unattached_ioctl() takes the lock. This could lead ppp_unattached_ioctl() to override ->private_data, thus leaking reference to the ppp_file previously pointed to. v2: lock all ppp_ioctl() instead of just checking private_data in ppp_unattached_ioctl(), to avoid ambiguous behaviour. Fixes: f3ff8a4d80e8 ("ppp: push BKL down into the driver") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07ppp: release rtnl mutex when interface creation failsGuillaume Nault
Add missing rtnl_unlock() in the error path of ppp_create_interface(). Fixes: 58a89ecaca53 ("ppp: fix lockdep splat in ppp_dev_uninit()") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01ppp: lock ppp->flags in ppp_read() and ppp_poll()Guillaume Nault
ppp_read() and ppp_poll() can be called concurrently with ppp_ioctl(). In this case, ppp_ioctl() might call ppp_ccp_closed(), which may update ppp->flags while ppp_read() or ppp_poll() is reading it. The update done by ppp_ccp_closed() isn't atomic due to the bit mask operation ('ppp->flags &= ~(SC_CCP_OPEN | SC_CCP_UP)'), so concurrent readers might get transient values. Reading incorrect ppp->flags may disturb the 'ppp->flags & SC_LOOP_TRAFFIC' test in ppp_read() and ppp_poll(), which in turn can lead to improper decision on whether the PPP unit file is ready for reading or not. Since ppp_ccp_closed() is protected by the Rx and Tx locks (with ppp_lock()), taking the Rx lock is enough for ppp_read() and ppp_poll() to guarantee that ppp_ccp_closed() won't update ppp->flags concurrently. The same reasoning applies to ppp->n_channels. The 'n_channels' field can also be written to concurrently by ppp_ioctl() (through ppp_connect_channel() or ppp_disconnect_channel()). These writes aren't atomic (simple increment/decrement), but are protected by both the Rx and Tx locks (like in the ppp->flags case). So holding the Rx lock before reading ppp->n_channels also prevents concurrent writes. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24ppp: clarify parsing of user supplied data in ppp_set_compress()Guillaume Nault
* Split big conditional statement. * Check (data.length <= CCP_MAX_OPTION_LENGTH) only once. * Don't read ccp_option[1] if not initialised. Reading uninitialised ccp_option[1] was harmless, because this could only happen when data.length was 0 or 1. So even then, we couldn't pass the (ccp_option[1] < 2 || ccp_option[1] > data.length) test anyway. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14ppp: declare ppp devices as enumerated interfacesGuillaume Nault
Let user space be aware of the naming scheme used by ppp interfaces (visible in /sys/class/net/<iface>/name_assign_type). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14ppp: define "ppp" device typeGuillaume Nault
Let PPP devices be identified as such in /sys/class/net/<iface>/uevent. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-02ppp, slip: Validate VJ compression slot parameters completelyBen Hutchings
Currently slhc_init() treats out-of-range values of rslots and tslots as equivalent to 0, except that if tslots is too large it will dereference a null pointer (CVE-2015-7799). Add a range-check at the top of the function and make it return an ERR_PTR() on error instead of NULL. Change the callers accordingly. Compile-tested only. Reported-by: 郭永刚 <guoyonggang@360.cn> References: http://article.gmane.org/gmane.comp.security.oss.general/17908 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-25ppp: fix lockdep splat in ppp_dev_uninit()Guillaume Nault
ppp_dev_uninit() locks all_ppp_mutex while under rtnl mutex protection. ppp_create_interface() must then lock these mutexes in that same order to avoid possible deadlock. [ 120.880011] ====================================================== [ 120.880011] [ INFO: possible circular locking dependency detected ] [ 120.880011] 4.2.0 #1 Not tainted [ 120.880011] ------------------------------------------------------- [ 120.880011] ppp-apitest/15827 is trying to acquire lock: [ 120.880011] (&pn->all_ppp_mutex){+.+.+.}, at: [<ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic] [ 120.880011] [ 120.880011] but task is already holding lock: [ 120.880011] (rtnl_mutex){+.+.+.}, at: [<ffffffff812e4255>] rtnl_lock+0x12/0x14 [ 120.880011] [ 120.880011] which lock already depends on the new lock. [ 120.880011] [ 120.880011] [ 120.880011] the existing dependency chain (in reverse order) is: [ 120.880011] [ 120.880011] -> #1 (rtnl_mutex){+.+.+.}: [ 120.880011] [<ffffffff81073a6f>] lock_acquire+0xcf/0x10e [ 120.880011] [<ffffffff813ab18a>] mutex_lock_nested+0x56/0x341 [ 120.880011] [<ffffffff812e4255>] rtnl_lock+0x12/0x14 [ 120.880011] [<ffffffff812d9d94>] register_netdev+0x11/0x27 [ 120.880011] [<ffffffffa0147b17>] ppp_ioctl+0x289/0xc98 [ppp_generic] [ 120.880011] [<ffffffff8113b367>] do_vfs_ioctl+0x4ea/0x532 [ 120.880011] [<ffffffff8113b3fd>] SyS_ioctl+0x4e/0x7d [ 120.880011] [<ffffffff813ad7d7>] entry_SYSCALL_64_fastpath+0x12/0x6f [ 120.880011] [ 120.880011] -> #0 (&pn->all_ppp_mutex){+.+.+.}: [ 120.880011] [<ffffffff8107334e>] __lock_acquire+0xb07/0xe76 [ 120.880011] [<ffffffff81073a6f>] lock_acquire+0xcf/0x10e [ 120.880011] [<ffffffff813ab18a>] mutex_lock_nested+0x56/0x341 [ 120.880011] [<ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic] [ 120.880011] [<ffffffff812d5263>] rollback_registered_many+0x19e/0x252 [ 120.880011] [<ffffffff812d5381>] rollback_registered+0x29/0x38 [ 120.880011] [<ffffffff812d53fa>] unregister_netdevice_queue+0x6a/0x77 [ 120.880011] [<ffffffffa0146a94>] ppp_release+0x42/0x79 [ppp_generic] [ 120.880011] [<ffffffff8112d9f6>] __fput+0xec/0x192 [ 120.880011] [<ffffffff8112dacc>] ____fput+0x9/0xb [ 120.880011] [<ffffffff8105447a>] task_work_run+0x66/0x80 [ 120.880011] [<ffffffff81001801>] prepare_exit_to_usermode+0x8c/0xa7 [ 120.880011] [<ffffffff81001900>] syscall_return_slowpath+0xe4/0x104 [ 120.880011] [<ffffffff813ad931>] int_ret_from_sys_call+0x25/0x9f [ 120.880011] [ 120.880011] other info that might help us debug this: [ 120.880011] [ 120.880011] Possible unsafe locking scenario: [ 120.880011] [ 120.880011] CPU0 CPU1 [ 120.880011] ---- ---- [ 120.880011] lock(rtnl_mutex); [ 120.880011] lock(&pn->all_ppp_mutex); [ 120.880011] lock(rtnl_mutex); [ 120.880011] lock(&pn->all_ppp_mutex); [ 120.880011] [ 120.880011] *** DEADLOCK *** Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion") Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-25ppp: implement x-netns supportGuillaume Nault
Let packets move from one netns to the other at PPP encapsulation and decapsulation time. PPP units and channels remain in the netns in which they were originally created. Only the net_device may move to a different namespace. Cross netns handling is thus transparent to lower PPP layers (PPPoE, L2TP, etc.). PPP devices are automatically unregistered when their netns gets removed. So read() and poll() on the unit file descriptor will respectively receive EOF and POLLHUP. Channels aren't affected. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17ppp: fix device unregistration upon netns deletionGuillaume Nault
PPP devices may get automatically unregistered when their network namespace is getting removed. This happens if the ppp control plane daemon (e.g. pppd) exits while it is the last user of this namespace. This leads to several races: * ppp_exit_net() may destroy the per namespace idr (pn->units_idr) before all file descriptors were released. Successive ppp_release() calls may then cleanup PPP devices with ppp_shutdown_interface() and try to use the already destroyed idr. * Automatic device unregistration may also happen before the ppp_release() call for that device gets executed. Once called on the file owning the device, ppp_release() will then clean it up and try to unregister it a second time. To fix these issues, operations defined in ppp_shutdown_interface() are moved to the PPP device's ndo_uninit() callback. This allows PPP devices to be properly cleaned up by unregister_netdev() and friends. So checking for ppp->owner is now an accurate test to decide if a PPP device should be unregistered. Setting ppp->owner is done in ppp_create_interface(), before device registration, in order to avoid unprotected modification of this field. Finally, ppp_exit_net() now starts by unregistering all remaining PPP devices to ensure that none will get unregistered after the call to idr_destroy(). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-20ppp: call skb_checksum_complete_unset in ppp_receive_frameTom Herbert
Call checksum_complete_unset in PPP receive to discard checksum-complete value. PPP does not pull checksum for headers and also modifies packet as in VJ compression. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09ppp_read(): switch to skb_copy_datagram_iter()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-11net: ppp: Don't call bpf_prog_create() in ppp_lockTakashi Iwai
In ppp_ioctl(), bpf_prog_create() is called inside ppp_lock, which eventually calls vmalloc() and hits BUG_ON() in vmalloc.c. This patch works around the problem by moving the allocation outside the lock. The bug was revealed by the recent change in net/core/filter.c, as it allocates via vmalloc() instead of kmalloc() now. Reported-and-tested-by: Stefan Seyfried <stefan.seyfried@googlemail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "The big thing in this pile is Eric's unmount-on-rmdir series; we finally have everything we need for that. The final piece of prereqs is delayed mntput() - now filesystem shutdown always happens on shallow stack. Other than that, we have several new primitives for iov_iter (Matt Wilcox, culled from his XIP-related series) pushing the conversion to ->read_iter()/ ->write_iter() a bit more, a bunch of fs/dcache.c cleanups and fixes (including the external name refcounting, which gives consistent behaviour of d_move() wrt procfs symlinks for long and short names alike) and assorted cleanups and fixes all over the place. This is just the first pile; there's a lot of stuff from various people that ought to go in this window. Starting with unionmount/overlayfs mess... ;-/" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (60 commits) fs/file_table.c: Update alloc_file() comment vfs: Deduplicate code shared by xattr system calls operating on paths reiserfs: remove pointless forward declaration of struct nameidata don't need that forward declaration of struct nameidata in dcache.h anymore take dname_external() into fs/dcache.c let path_init() failures treated the same way as subsequent link_path_walk() fix misuses of f_count() in ppp and netlink ncpfs: use list_for_each_entry() for d_subdirs walk vfs: move getname() from callers to do_mount() gfs2_atomic_open(): skip lookups on hashed dentry [infiniband] remove pointless assignments gadgetfs: saner API for gadgetfs_create_file() f_fs: saner API for ffs_sb_create_file() jfs: don't hash direct inode [s390] remove pointless assignment of ->f_op in vmlogrdr ->open() ecryptfs: ->f_op is never NULL android: ->f_op is never NULL nouveau: __iomem misannotations missing annotation in fs/file.c fs: namespace: suppress 'may be used uninitialized' warnings ...
2014-10-09fix misuses of f_count() in ppp and netlinkAl Viro
we used to check for "nobody else could start doing anything with that opened file" by checking that refcount was 2 or less - one for descriptor table and one we'd acquired in fget() on the way to wherever we are. That was race-prone (somebody else might have had a reference to descriptor table and do fget() just as we'd been checking) and it had become flat-out incorrect back when we switched to fget_light() on those codepaths - unlike fget(), it doesn't grab an extra reference unless the descriptor table is shared. The same change allowed a race-free check, though - we are safe exactly when refcount is less than 2. It was a long time ago; pre-2.6.12 for ioctl() (the codepath leading to ppp one) and 2.6.17 for sendmsg() (netlink one). OTOH, netlink hadn't grown that check until 3.9 and ppp used to live in drivers/net, not drivers/net/ppp until 3.1. The bug existed well before that, though, and the same fix used to apply in old location of file. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-07net: better IFF_XMIT_DST_RELEASE supportEric Dumazet
Testing xmit_more support with netperf and connected UDP sockets, I found strange dst refcount false sharing. Current handling of IFF_XMIT_DST_RELEASE is not optimal. Dropping dst in validate_xmit_skb() is certainly too late in case packet was queued by cpu X but dequeued by cpu Y The logical point to take care of drop/force is in __dev_queue_xmit() before even taking qdisc lock. As Julian Anastasov pointed out, need for skb_dst() might come from some packet schedulers or classifiers. This patch adds new helper to cleanly express needs of various drivers or qdiscs/classifiers. Drivers that need skb_dst() in their ndo_start_xmit() should call following helper in their setup instead of the prior : dev->priv_flags &= ~IFF_XMIT_DST_RELEASE; -> netif_keep_dst(dev); Instead of using a single bit, we use two bits, one being eventually rebuilt in bonding/team drivers. The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being rebuilt in bonding/team. Eventually, we could add something smarter later. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-02net: filter: split 'struct sk_filter' into socket and bpf partsAlexei Starovoitov
clean up names related to socket filtering and bpf in the following way: - everything that deals with sockets keeps 'sk_*' prefix - everything that is pure BPF is changed to 'bpf_*' prefix split 'struct sk_filter' into struct sk_filter { atomic_t refcnt; struct rcu_head rcu; struct bpf_prog *prog; }; and struct bpf_prog { u32 jited:1, len:31; struct sock_fprog_kern *orig_prog; unsigned int (*bpf_func)(const struct sk_buff *skb, const struct bpf_insn *filter); union { struct sock_filter insns[0]; struct bpf_insn insnsi[0]; struct work_struct work; }; }; so that 'struct bpf_prog' can be used independent of sockets and cleans up 'unattached' bpf use cases split SK_RUN_FILTER macro into: SK_RUN_FILTER to be used with 'struct sk_filter *' and BPF_PROG_RUN to be used with 'struct bpf_prog *' __sk_filter_release(struct sk_filter *) gains __bpf_prog_release(struct bpf_prog *) helper function also perform related renames for the functions that work with 'struct bpf_prog *', since they're on the same lines: sk_filter_size -> bpf_prog_size sk_filter_select_runtime -> bpf_prog_select_runtime sk_filter_free -> bpf_prog_free sk_unattached_filter_create -> bpf_prog_create sk_unattached_filter_destroy -> bpf_prog_destroy sk_store_orig_filter -> bpf_prog_store_orig_filter sk_release_orig_filter -> bpf_release_orig_filter __sk_migrate_filter -> bpf_migrate_filter __sk_prepare_filter -> bpf_prepare_filter API for attaching classic BPF to a socket stays the same: sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *) and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program which is used by sockets, tun, af_packet API for 'unattached' BPF programs becomes: bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *) and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/infiniband/hw/cxgb4/device.c The cxgb4 conflict was simply overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16net: ppp: fix creating PPP pass and active filtersChristoph Schulz
Commit 568f194e8bd16c353ad50f9ab95d98b20578a39d ("net: ppp: use sk_unattached_filter api") inadvertently changed the logic when setting PPP pass and active filters. This applies to both the generic PPP subsystem implemented by drivers/net/ppp/ppp_generic.c and the ISDN PPP subsystem implemented by drivers/isdn/i4l/isdn_ppp.c. The original code in ppp_ioctl() (or isdn_ppp_ioctl(), resp.) handling PPPIOCSPASS and PPPIOCSACTIVE allowed to remove a pass/active filter previously set by using a filter of length zero. However, with the new code this is not possible anymore as this case is not explicitly checked for, which leads to passing NULL as a filter to sk_unattached_filter_create(). This results in returning EINVAL to the caller. Additionally, the variables ppp->pass_filter and ppp->active_filter (or is->pass_filter and is->active_filter, resp.) are not reset to NULL, although the filters they point to may have been destroyed by sk_unattached_filter_destroy(), so in this EINVAL case dangling pointers are left behind (provided the pointers were previously non-NULL). This patch corrects both problems by checking whether the filter passed is empty or non-empty, and prevents sk_unattached_filter_create() from being called in the first case. Moreover, the pointers are always reset to NULL as soon as sk_unattached_filter_destroy() returns. Signed-off-by: Christoph Schulz <develop@kristov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16net: ppp: access ppp->nextseq only if CONFIG_PPP_MULTILINK is definedChristoph Schulz
Commit d762d038497c9df51c19fcbe69b094b3bf8e5568 resets the counter holding the next sequence number for multilink PPP fragments to zero whenever the SC_MULTILINK flag is set. However, this counter only exists if CONFIG_PPP_MULTILINK is defined. Consequently, the new code has to be enclosed within #ifdef CONFIG_PPP_MULTILINK ... #endif. Signed-off-by: Christoph Schulz <develop@kristov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15net: ppp: reset nextseq counter when enabling SC_MULTILINKChristoph Schulz
If using a demand-dialled PPP unit for a PPP multilink master, the pppd daemon needs to reset the sequence counter between two connections. This allows the daemon to reuse the PPP unit instead of destroying and recreating it. As there is no API to reset the counter, this patch resets the counter whenever the SC_MULTILINK flag is set. Signed-off-by: Christoph Schulz <develop@kristov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15net: set name_assign_type in alloc_netdev()Tom Gundersen
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert all users to pass NET_NAME_UNKNOWN. Coccinelle patch: @@ expression sizeof_priv, name, setup, txqs, rxqs, count; @@ ( -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs) +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs) | -alloc_netdev_mq(sizeof_priv, name, setup, count) +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count) | -alloc_netdev(sizeof_priv, name, setup) +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup) ) v9: move comments here from the wrong commit Signed-off-by: Tom Gundersen <teg@jklm.no> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-14net: ppp: don't call sk_chk_filter twiceChristoph Schulz
Commit 568f194e8bd16c353ad50f9ab95d98b20578a39d ("net: ppp: use sk_unattached_filter api") causes sk_chk_filter() to be called twice when setting a PPP pass or active filter. This applies to both the generic PPP subsystem implemented by drivers/net/ppp/ppp_generic.c and the ISDN PPP subsystem implemented by drivers/isdn/i4l/isdn_ppp.c. The first call is from within get_filter(). The second one is through the call chain ppp_ioctl() or isdn_ppp_ioctl() --> sk_unattached_filter_create() --> __sk_prepare_filter() --> sk_chk_filter() The first call from within get_filter() should be deleted as get_filter() is called just before calling sk_unattached_filter_create() later on, which eventually calls sk_chk_filter() anyway. For 3.15.x, this proposed change is a bugfix rather than a pure optimization as in that branch, sk_chk_filter() may replace filter codes by other codes which are not recognized when executing sk_chk_filter() a second time. So with 3.15.x, if sk_chk_filter() is called twice, the second invocation may yield EINVAL (this depends on the filter codes found in the filter to be set, but because the replacement is done for frequently used codes, this is almost always the case). The net effect is that setting pass and/or active PPP filters does not work anymore, since sk_unattached_filter_create() always returns EINVAL due to the second call to sk_chk_filter(), regardless whether the filter was originally sane or not. Signed-off-by: Christoph Schulz <develop@kristov.de> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23net: filter: let unattached filters use sock_fprog_kernDaniel Borkmann
The sk_unattached_filter_create() API is used by BPF filters that are not directly attached or related to sockets, and are used in team, ptp, xt_bpf, cls_bpf, etc. As such all users do their own internal managment of obtaining filter blocks and thus already have them in kernel memory and set up before calling into sk_unattached_filter_create(). As a result, due to __user annotation in sock_fprog, sparse triggers false positives (incorrect type in assignment [different address space]) when filters are set up before passing them to sk_unattached_filter_create(). Therefore, let sk_unattached_filter_create() API use sock_fprog_kern to overcome this issue. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-31net: ppp: use sk_unattached_filter apiDaniel Borkmann
For the ppp driver, there are currently two open-coded BPF filters in use, that is, pass_filter and active_filter. Migrate both to make proper use of sk_unattached_filter_{create,destroy} API so that the actual BPF code is decoupled from direct access, and filters can be jited as a side-effect by the internal filter compiler. Joint work with Alexei Starovoitov. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Paul Mackerras <paulus@samba.org> Cc: linux-ppp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-27ppp: convert to idr_alloc()Tejun Heo
Convert to the much saner new idr interface. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>