summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2018-04-04rxrpc: Fix undefined packet handlingDavid Howells
By analogy with other Rx implementations, RxRPC packet types 9, 10 and 11 should just be discarded rather than being aborted like other undefined packet types. Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Support offloading wireless authentication to userspace via NL80211_CMD_EXTERNAL_AUTH, from Srinivas Dasari. 2) A lot of work on network namespace setup/teardown from Kirill Tkhai. Setup and cleanup of namespaces now all run asynchronously and thus performance is significantly increased. 3) Add rx/tx timestamping support to mv88e6xxx driver, from Brandon Streiff. 4) Support zerocopy on RDS sockets, from Sowmini Varadhan. 5) Use denser instruction encoding in x86 eBPF JIT, from Daniel Borkmann. 6) Support hw offload of vlan filtering in mvpp2 dreiver, from Maxime Chevallier. 7) Support grafting of child qdiscs in mlxsw driver, from Nogah Frankel. 8) Add packet forwarding tests to selftests, from Ido Schimmel. 9) Deal with sub-optimal GSO packets better in BBR congestion control, from Eric Dumazet. 10) Support 5-tuple hashing in ipv6 multipath routing, from David Ahern. 11) Add path MTU tests to selftests, from Stefano Brivio. 12) Various bits of IPSEC offloading support for mlx5, from Aviad Yehezkel, Yossi Kuperman, and Saeed Mahameed. 13) Support RSS spreading on ntuple filters in SFC driver, from Edward Cree. 14) Lots of sockmap work from John Fastabend. Applications can use eBPF to filter sendmsg and sendpage operations. 15) In-kernel receive TLS support, from Dave Watson. 16) Add XDP support to ixgbevf, this is significant because it should allow optimized XDP usage in various cloud environments. From Tony Nguyen. 17) Add new Intel E800 series "ice" ethernet driver, from Anirudh Venkataramanan et al. 18) IP fragmentation match offload support in nfp driver, from Pieter Jansen van Vuuren. 19) Support XDP redirect in i40e driver, from Björn Töpel. 20) Add BPF_RAW_TRACEPOINT program type for accessing the arguments of tracepoints in their raw form, from Alexei Starovoitov. 21) Lots of striding RQ improvements to mlx5 driver with many performance improvements, from Tariq Toukan. 22) Use rhashtable for inet frag reassembly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1678 commits) net: mvneta: improve suspend/resume net: mvneta: split rxq/txq init and txq deinit into SW and HW parts ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_thresh net: bgmac: Fix endian access in bgmac_dma_tx_ring_free() net: bgmac: Correctly annotate register space route: check sysctl_fib_multipath_use_neigh earlier than hash fix typo in command value in drivers/net/phy/mdio-bitbang. sky2: Increase D3 delay to sky2 stops working after suspend net/mlx5e: Set EQE based as default TX interrupt moderation mode ibmvnic: Disable irqs before exiting reset from closed state net: sched: do not emit messages while holding spinlock vlan: also check phy_driver ts_info for vlan's real device Bluetooth: Mark expected switch fall-throughs Bluetooth: Set HCI_QUIRK_SIMULTANEOUS_DISCOVERY for BTUSB_QCA_ROME Bluetooth: btrsi: remove unused including <linux/version.h> Bluetooth: hci_bcm: Remove DMI quirk for the MINIX Z83-4 sh_eth: kill useless check in __sh_eth_get_regs() sh_eth: add sh_eth_cpu_data::no_xdfar flag ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data() ipv4: factorize sk_wmem_alloc updates done by __ip_append_data() ...
2018-04-03sunrpc: remove incorrect HMAC request initializationEric Biggers
make_checksum_hmac_md5() is allocating an HMAC transform and doing crypto API calls in the following order: crypto_ahash_init() crypto_ahash_setkey() crypto_ahash_digest() This is wrong because it makes no sense to init() the request before a key has been set, given that the initial state depends on the key. And digest() is short for init() + update() + final(), so in this case there's no need to explicitly call init() at all. Before commit 9fa68f620041 ("crypto: hash - prevent using keyed hashes without setting key") the extra init() had no real effect, at least for the software HMAC implementation. (There are also hardware drivers that implement HMAC-MD5, and it's not immediately obvious how gracefully they handle init() before setkey().) But now the crypto API detects this incorrect initialization and returns -ENOKEY. This is breaking NFS mounts in some cases. Fix it by removing the incorrect call to crypto_ahash_init(). Reported-by: Michael Young <m.a.young@durham.ac.uk> Fixes: 9fa68f620041 ("crypto: hash - prevent using keyed hashes without setting key") Fixes: fffdaef2eb4a ("gss_krb5: Add support for rc4-hmac encryption") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03NFSD: Clean up legacy NFS SYMLINK argument XDR decodersChuck Lever
Move common code in NFSD's legacy SYMLINK decoders into a helper. The immediate benefits include: - one fewer data copies on transports that support DDP - consistent error checking across all versions - reduction of code duplication - support for both legal forms of SYMLINK requests on RDMA transports for all versions of NFS (in particular, NFSv2, for completeness) In the long term, this helper is an appropriate spot to perform a per-transport call-out to fill the pathname argument using, say, RDMA Reads. Filling the pathname in the proc function also means that eventually the incoming filehandle can be interpreted so that filesystem- specific memory can be allocated as a sink for the pathname argument, rather than using anonymous pages. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03NFSD: Clean up legacy NFS WRITE argument XDR decodersChuck Lever
Move common code in NFSD's legacy NFS WRITE decoders into a helper. The immediate benefit is reduction of code duplication and some nice micro-optimizations (see below). In the long term, this helper can perform a per-transport call-out to fill the rq_vec (say, using RDMA Reads). The legacy WRITE decoders and procs are changed to work like NFSv4, which constructs the rq_vec just before it is about to call vfs_writev. Why? Calling a transport call-out from the proc instead of the XDR decoder means that the incoming FH can be resolved to a particular filesystem and file. This would allow pages from the backing file to be presented to the transport to be filled, rather than presenting anonymous pages and copying or flipping them into the file's page cache later. I also prefer using the pages in rq_arg.pages, instead of pulling the data pages directly out of the rqstp::rq_pages array. This is currently the way the NFSv3 write decoder works, but the other two do not seem to take this approach. Fixing this removes the only reference to rq_pages found in NFSD, eliminating an NFSD assumption about how transports use the pages in rq_pages. Lastly, avoid setting up the first element of rq_vec as a zero- length buffer. This happens with an RDMA transport when a normal Read chunk is present because the data payload is in rq_arg's page list (none of it is in the head buffer). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03svc: Report xprt dequeue latencyChuck Lever
Record the time between when a rqstp is enqueued on a transport and when it is dequeued. This includes how long the rqstp waits on the queue and how long it takes the kernel scheduler to wake a nfsd thread to service it. The svc_xprt_dequeue trace point is altered to include the number of microseconds between xprt_enqueue and xprt_dequeue. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03sunrpc: Report per-RPC execution statsChuck Lever
Introduce a mechanism to report the server-side execution latency of each RPC. The goal is to enable user space to filter the trace record for latency outliers, build histograms, etc. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03sunrpc: Re-purpose trace_svc_processChuck Lever
Currently, trace_svc_process has two call sites: 1. Just after a call to svc_send. svc_send already invokes trace_svc_send with the same arguments just before returning 2. Just before a call to svc_drop. svc_drop already invokes trace_svc_drop with the same arguments just after it is called Therefore trace_svc_process does not provide any additional information not already provided by these other trace points. However, it would be useful to record the incoming RPC procedure. So reuse trace_svc_process for this purpose. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03sunrpc: Save remote presentation address in svc_xprt for trace eventsChuck Lever
TP_printk defines a format string that is passed to user space for converting raw trace event records to something human-readable. My user space's printf (Oracle Linux 7), however, does not have a %pI format specifier. The result is that what is supposed to be an IP address in the output of "trace-cmd report" is just a string that says the field couldn't be displayed. To fix this, adopt the same approach as the client: maintain a pre- formated presentation address for occasions when %pI is not available. The location of the trace_svc_send trace point is adjusted so that rqst->rq_xprt is not NULL when the trace event is recorded. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03sunrpc: Simplify trace_svc_recvChuck Lever
There doesn't seem to be a lot of value in calling trace_svc_recv in the failing case. 1. There are two very common cases: one is the transport is not ready, and the other is shutdown. Neither is terribly interesting. 2. The trace record for the failing case contains nothing but the status code. Therefore the trace point call site in the error exit is removed. Since the trace point is now recording a length instead of a status, rename the status field and remove the case that records a zero XID. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03sunrpc: Simplify do_enqueue tracingChuck Lever
There are three cases where svc_xprt_do_enqueue() returns without waking an nfsd thread: 1. There is no work to do 2. The transport is already busy 3. There are no available nfsd threads Only 3. is truly interesting. Move the trace point so it records that there was work to do and either an nfsd thread was awoken, or a free one could not found. As an additional clean up, remove a redundant comment and a couple of dprintk call sites. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03sunrpc: Move trace_svc_xprt_dequeue()Chuck Lever
Reduce the amount of noise generated by trace_svc_xprt_dequeue by moving it to the end of svc_get_next_xprt. This generates exactly one trace event when a ready xprt is found, rather than spurious events when there is no work to do. The empty events contain no information that can't be obtained simply by tracing function calls to svc_xprt_dequeue. A small additional benefit is simplification of the svc_xprt_event trace class, which no longer has to handle the case when the @xprt parameter is NULL. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03svc: Simplify ->xpo_secure_portChuck Lever
Clean up: Instead of returning a value that is used to set or clear a bit, just make ->xpo_secure_port mangle that bit, and return void. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03sunrpc: Remove unneeded pointer dereferenceChuck Lever
Clean up: Noticed during code inspection that there is already a local automatic variable "xprt" so dereferencing rqst->rq_xprt again is unnecessary. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03Bluetooth: Fix connection if directed advertising and privacy is usedSzymon Janc
Local random address needs to be updated before creating connection if RPA from LE Direct Advertising Report was resolved in host. Otherwise remote device might ignore connection request due to address mismatch. This was affecting following qualification test cases: GAP/CONN/SCEP/BV-03-C, GAP/CONN/GCEP/BV-05-C, GAP/CONN/DCEP/BV-05-C Before patch: < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #11350 [hci0] 84680.231216 Address: 56:BC:E8:24:11:68 (Resolvable) Identity type: Random (0x01) Identity: F2:F1:06:3D:9C:42 (Static) > HCI Event: Command Complete (0x0e) plen 4 #11351 [hci0] 84680.246022 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #11352 [hci0] 84680.246417 Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Own address type: Random (0x01) Filter policy: Accept all advertisement, inc. directed unresolved RPA (0x02) > HCI Event: Command Complete (0x0e) plen 4 #11353 [hci0] 84680.248854 LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #11354 [hci0] 84680.249466 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 #11355 [hci0] 84680.253222 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 18 #11356 [hci0] 84680.458387 LE Direct Advertising Report (0x0b) Num reports: 1 Event type: Connectable directed - ADV_DIRECT_IND (0x01) Address type: Random (0x01) Address: 53:38:DA:46:8C:45 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Direct address type: Random (0x01) Direct address: 7C:D6:76:8C:DF:82 (Resolvable) Identity type: Random (0x01) Identity: F2:F1:06:3D:9C:42 (Static) RSSI: -74 dBm (0xb6) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #11357 [hci0] 84680.458737 Scanning: Disabled (0x00) Filter duplicates: Disabled (0x00) > HCI Event: Command Complete (0x0e) plen 4 #11358 [hci0] 84680.469982 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #11359 [hci0] 84680.470444 Scan interval: 60.000 msec (0x0060) Scan window: 60.000 msec (0x0060) Filter policy: White list is not used (0x00) Peer address type: Random (0x01) Peer address: 53:38:DA:46:8C:45 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Own address type: Random (0x01) Min connection interval: 30.00 msec (0x0018) Max connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Min connection length: 0.000 msec (0x0000) Max connection length: 0.000 msec (0x0000) > HCI Event: Command Status (0x0f) plen 4 #11360 [hci0] 84680.474971 LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection Cancel (0x08|0x000e) plen 0 #11361 [hci0] 84682.545385 > HCI Event: Command Complete (0x0e) plen 4 #11362 [hci0] 84682.551014 LE Create Connection Cancel (0x08|0x000e) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #11363 [hci0] 84682.551074 LE Connection Complete (0x01) Status: Unknown Connection Identifier (0x02) Handle: 0 Role: Master (0x00) Peer address type: Public (0x00) Peer address: 00:00:00:00:00:00 (OUI 00-00-00) Connection interval: 0.00 msec (0x0000) Connection latency: 0 (0x0000) Supervision timeout: 0 msec (0x0000) Master clock accuracy: 0x00 After patch: < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #210 [hci0] 667.152459 Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Own address type: Random (0x01) Filter policy: Accept all advertisement, inc. directed unresolved RPA (0x02) > HCI Event: Command Complete (0x0e) plen 4 #211 [hci0] 667.153613 LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #212 [hci0] 667.153704 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 #213 [hci0] 667.154584 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 18 #214 [hci0] 667.182619 LE Direct Advertising Report (0x0b) Num reports: 1 Event type: Connectable directed - ADV_DIRECT_IND (0x01) Address type: Random (0x01) Address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Direct address type: Random (0x01) Direct address: 7C:C1:57:A5:B7:A8 (Resolvable) Identity type: Random (0x01) Identity: F4:28:73:5D:38:B0 (Static) RSSI: -70 dBm (0xba) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #215 [hci0] 667.182704 Scanning: Disabled (0x00) Filter duplicates: Disabled (0x00) > HCI Event: Command Complete (0x0e) plen 4 #216 [hci0] 667.183599 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #217 [hci0] 667.183645 Address: 7C:C1:57:A5:B7:A8 (Resolvable) Identity type: Random (0x01) Identity: F4:28:73:5D:38:B0 (Static) > HCI Event: Command Complete (0x0e) plen 4 #218 [hci0] 667.184590 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Create Connection (0x08|0x000d) plen 25 #219 [hci0] 667.184613 Scan interval: 60.000 msec (0x0060) Scan window: 60.000 msec (0x0060) Filter policy: White list is not used (0x00) Peer address type: Random (0x01) Peer address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Own address type: Random (0x01) Min connection interval: 30.00 msec (0x0018) Max connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Min connection length: 0.000 msec (0x0000) Max connection length: 0.000 msec (0x0000) > HCI Event: Command Status (0x0f) plen 4 #220 [hci0] 667.186558 LE Create Connection (0x08|0x000d) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 19 #221 [hci0] 667.485824 LE Connection Complete (0x01) Status: Success (0x00) Handle: 0 Role: Master (0x00) Peer address type: Random (0x01) Peer address: 50:52:D9:A6:48:A0 (Resolvable) Identity type: Public (0x00) Identity: 11:22:33:44:55:66 (OUI 11-22-33) Connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Master clock accuracy: 0x07 @ MGMT Event: Device Connected (0x000b) plen 13 {0x0002} [hci0] 667.485996 LE Address: 11:22:33:44:55:66 (OUI 11-22-33) Flags: 0x00000000 Data length: 0 Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org
2018-04-02Merge branch 'syscalls-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux Pull removal of in-kernel calls to syscalls from Dominik Brodowski: "System calls are interaction points between userspace and the kernel. Therefore, system call functions such as sys_xyzzy() or compat_sys_xyzzy() should only be called from userspace via the syscall table, but not from elsewhere in the kernel. At least on 64-bit x86, it will likely be a hard requirement from v4.17 onwards to not call system call functions in the kernel: It is better to use use a different calling convention for system calls there, where struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands processing over to the actual syscall function. This means that only those parameters which are actually needed for a specific syscall are passed on during syscall entry, instead of filling in six CPU registers with random user space content all the time (which may cause serious trouble down the call chain). Those x86-specific patches will be pushed through the x86 tree in the near future. Moreover, rules on how data may be accessed may differ between kernel data and user data. This is another reason why calling sys_xyzzy() is generally a bad idea, and -- at most -- acceptable in arch-specific code. This patchset removes all in-kernel calls to syscall functions in the kernel with the exception of arch/. On top of this, it cleans up the three places where many syscalls are referenced or prototyped, namely kernel/sys_ni.c, include/linux/syscalls.h and include/linux/compat.h" * 'syscalls-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux: (109 commits) bpf: whitelist all syscalls for error injection kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitions kernel/sys_ni: sort cond_syscall() entries syscalls/x86: auto-create compat_sys_*() prototypes syscalls: sort syscall prototypes in include/linux/compat.h net: remove compat_sys_*() prototypes from net/compat.h syscalls: sort syscall prototypes in include/linux/syscalls.h kexec: move sys_kexec_load() prototype to syscalls.h x86/sigreturn: use SYSCALL_DEFINE0 x86: fix sys_sigreturn() return type to be long, not unsigned long x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm() mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead() mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff() mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64() fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate() fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate() fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscall kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid() kernel: add ksys_unshare() helper; remove in-kernel calls to sys_unshare() ...
2018-04-02net: socket: add __compat_sys_...msg() helpers; remove in-kernel calls to ↵Dominik Brodowski
compat syscalls Using the net-internal helpers __compat_sys_...msg() allows us to avoid the internal calls to the compat_sys_...msg() syscalls. compat_sys_recvmmsg() is handled in a different patch. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __compat_sys_recvmmsg() helper; remove in-kernel call to ↵Dominik Brodowski
compat syscall Using the net-internal helper __compat_sys_recvmmsg() allows us to avoid the internal calls to the compat_sys_recvmmsg() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __compat_sys_getsockopt() helper; remove in-kernel call to ↵Dominik Brodowski
compat syscall Using the net-internal helper __compat_sys_getsockopt() allows us to avoid the internal calls to the compat_sys_getsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __compat_sys_setsockopt() helper; remove in-kernel call to ↵Dominik Brodowski
compat syscall Using the net-internal helper __compat_sys_setsockopt() allows us to avoid the internal calls to the compat_sys_setsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __compat_sys_recvfrom() helper; remove in-kernel call to ↵Dominik Brodowski
compat syscall Using the net-internal helper __compat_sys_recvfrom() allows us to avoid the internal calls to the compat_sys_recvfrom() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: replace call to sys_recv() with __sys_recvfrom()Dominik Brodowski
sys_recv() merely expands the parameters to __sys_recvfrom() by NULL and NULL. Open-code this in the two places which used sys_recv() as a wrapper to __sys_recvfrom(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: replace calls to sys_send() with __sys_sendto()Dominik Brodowski
sys_send() merely expands the parameters to __sys_sendto() by NULL and 0. Open-code this in the two places which used sys_send() as a wrapper to __sys_sendto(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: move check for forbid_cmsg_compat to __sys_...msg()Dominik Brodowski
The non-compat codepaths for sys_...msg() verify that MSG_CMSG_COMPAT is not set. By moving this check to the __sys_...msg() functions (and making it dependent on a static flag passed to this function), we can call the __sys...msg() functions instead of the syscall functions in all cases. __sys_recvmmsg() does not need this trickery, as the check is handled within the do_sys_recvmmsg() function internal to net/socket.c. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add do_sys_recvmmsg() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper do_sys_recvmmsg() allows us to avoid the internal calls to the sys_getsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_getsockopt() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_getsockopt() allows us to avoid the internal calls to the sys_getsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_setsockopt() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_setsockopt() allows us to avoid the internal calls to the sys_setsockopt() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_shutdown() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_shutdown() allows us to avoid the internal calls to the sys_shutdown() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_socketpair() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_socketpair() allows us to avoid the internal calls to the sys_socketpair() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_getpeername() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_getpeername() allows us to avoid the internal calls to the sys_getpeername() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_getsockname() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_getsockname() allows us to avoid the internal calls to the sys_getsockname() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_listen() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_listen() allows us to avoid the internal calls to the sys_listen() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_connect() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_connect() allows us to avoid the internal calls to the sys_connect() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_bind() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_bind() allows us to avoid the internal calls to the sys_bind() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_socket() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_socket() allows us to avoid the internal calls to the sys_socket() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_accept4() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_accept4() allows us to avoid the internal calls to the sys_accept4() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_sendto() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_sendto() allows us to avoid the internal calls to the sys_sendto() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02net: socket: add __sys_recvfrom() helper; remove in-kernel call to syscallDominik Brodowski
Using the net-internal helper __sys_recvfrom() allows us to avoid the internal calls to the sys_recvfrom() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_threshEric Dumazet
I forgot to change ip6frag_low_thresh proc_handler from proc_dointvec_minmax to proc_doulongvec_minmax Fixes: 3e67f106f619 ("inet: frags: break the 2GB limit for frags storage") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02ceph: quota: add initial infrastructure to support cephfs quotasLuis Henriques
This patch adds the infrastructure required to support cephfs quotas as it is currently implemented in the ceph fuse client. Cephfs quotas can be set on any directory, and can restrict the number of bytes or the number of files stored beneath that point in the directory hierarchy. Quotas are set using the extended attributes 'ceph.quota.max_files' and 'ceph.quota.max_bytes', and can be removed by setting these attributes to '0'. Link: http://tracker.ceph.com/issues/22372 Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph, ceph: add __init attribution to init funcitonsChengguang Xu
Add __init attribution to the functions which are called only once during initiating/registering operations and deleting unnecessary symbol exports. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph: adding missing message types to ceph_msg_type_name()Chengguang Xu
Some of message types are missing in ceph_msg_type_name(), so just adding them for better understanding of output information. Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph: fix misjudgement of maximum monitor numberChengguang Xu
num_mon should allow up to CEPH_MAX_MON in ceph_monmap_decode(). Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph, ceph: change permission for readonly debugfs entriesChengguang Xu
Remove write permission for debugfs entries which only have readonly function. Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02ceph: add newline to end of debug message formatChengguang Xu
Some of dout format do not include newline in the end, fix for the files which are in fs/ceph and net/ceph directories, and changing printk to dout for printing debug info in super.c Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph, ceph: move ceph_calc_file_object_mapping() to striper.cIlya Dryomov
ceph_calc_file_object_mapping() has nothing to do with osdmaps. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph: striping framework implementationIlya Dryomov
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph: handle zero-length data itemsIlya Dryomov
rbd needs this for null copyups -- if copyup data is all zeroes, we want to save some I/O and network bandwidth. See rbd_obj_issue_copyup() in the next commit. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02libceph: introduce BVECS data typeIlya Dryomov
In preparation for rbd "fancy" striping, introduce ceph_bvec_iter for working with bio_vec array data buffers. The wrappers are trivial, but make it look similar to ceph_bio_iter. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02libceph, rbd: new bio handling code (aka don't clone bios)Ilya Dryomov
The reason we clone bios is to be able to give each object request (and consequently each ceph_osd_data/ceph_msg_data item) its own pointer to a (list of) bio(s). The messenger then initializes its cursor with cloned bio's ->bi_iter, so it knows where to start reading from/writing to. That's all the cloned bios are used for: to determine each object request's starting position in the provided data buffer. Introduce ceph_bio_iter to do exactly that -- store position within bio list (i.e. pointer to bio) + position within that bio (i.e. bvec_iter). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>