summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2019-02-13netfilter: reject: skip csum verification for protocols that don't support itAlin Nastac
Some protocols have other means to verify the payload integrity (AH, ESP, SCTP) while others are incompatible with nf_ip(6)_checksum implementation because checksum is either optional or might be partial (UDPLITE, DCCP, GRE). Because nf_ip(6)_checksum was used to validate the packets, ip(6)tables REJECT rules were not capable to generate ICMP(v6) errors for the protocols mentioned above. This commit also fixes the incorrect pseudo-header protocol used for IPv4 packets that carry other transport protocols than TCP or UDP (pseudo-header used protocol 0 iso the proper value). Signed-off-by: Alin Nastac <alin.nastac@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-12Rename include/{uapi => }/asm-generic/shmparam.h reallyMasahiro Yamada
Commit 36c0f7f0f899 ("arch: unexport asm/shmparam.h for all architectures") is different from the patch I submitted. My patch is this: https://lore.kernel.org/lkml/1546904307-11124-1-git-send-email-yamada.masahiro@socionext.com/T/#u The file renaming part: rename include/{uapi => }/asm-generic/shmparam.h (100%) was lost when it was picked up. I think it was an accident because Andrew did not say anything. Link: http://lkml.kernel.org/r/1549158277-24558-1-git-send-email-yamada.masahiro@socionext.com Fixes: 36c0f7f0f899 ("arch: unexport asm/shmparam.h for all architectures") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-12net: sched: add flags to Qdisc class ops structVlad Buslov
Extend Qdisc_class_ops with flags. Create enum to hold possible class ops flag values. Add first class ops flags value QDISC_CLASS_OPS_DOIT_UNLOCKED to indicate that class ops functions can be called without taking rtnl lock. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: extend proto ops to support unlocked classifiersVlad Buslov
Add 'rtnl_held' flag to tcf proto change, delete, destroy, dump, walk functions to track rtnl lock status. Extend users of these function in cls API to propagate rtnl lock status to them. This allows classifiers to obtain rtnl lock when necessary and to pass rtnl lock status to extensions and driver offload callbacks. Add flags field to tcf proto ops. Add flag value to indicate that classifier doesn't require rtnl lock. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: extend proto ops with 'put' callbackVlad Buslov
Add optional tp->ops->put() API to be implemented for filter reference counting. This new function is called by cls API to release filter reference for filters returned by tp->ops->change() or tp->ops->get() functions. Implement tfilter_put() helper to call tp->ops->put() only for classifiers that implement it. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: track rtnl lock status when validating extensionsVlad Buslov
Actions API is already updated to not rely on rtnl lock for synchronization. However, it need to be provided with rtnl status when called from classifiers API in order to be able to correctly release the lock when loading kernel module. Extend extension validation function with 'rtnl_held' flag which is passed to actions API. Add new 'rtnl_held' parameter to tcf_exts_validate() in cls API. No classifier is currently updated to support unlocked execution, so pass hardcoded 'true' flag parameter value. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: prevent insertion of new classifiers during chain flushVlad Buslov
Extend tcf_chain with 'flushing' flag. Use the flag to prevent insertion of new classifier instances when chain flushing is in progress in order to prevent resource leak when tcf_proto is created by unlocked users concurrently. Return EAGAIN error from tcf_chain_tp_insert_unique() to restart tc_new_tfilter() and lookup the chain/proto again. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: refactor tp insert/delete for concurrent executionVlad Buslov
Implement unique insertion function to atomically attach tcf_proto to chain after verifying that no other tcf proto with specified priority exists. Implement delete function that verifies that tp is actually empty before deleting it. Use these functions to refactor cls API to account for concurrent tp and rule update instead of relying on rtnl lock. Add new 'deleting' flag to tcf proto. Use it to restart search when iterating over tp's on chain to prevent accessing potentially inval tp->next pointer. Extend tcf proto with spinlock that is intended to be used to protect its data from concurrent modification instead of relying on rtnl mutex. Use it to protect 'deleting' flag. Add lockdep macros to validate that lock is held when accessing protected fields. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: traverse classifiers in chain with tcf_get_next_proto()Vlad Buslov
All users of chain->filters_chain rely on rtnl lock and assume that no new classifier instances are added when traversing the list. Use tcf_get_next_proto() to traverse filters list without relying on rtnl mutex. This function iterates over classifiers by taking reference to current iterator classifier only and doesn't assume external synchronization of filters list. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: introduce reference counting for tcf_protoVlad Buslov
In order to remove dependency on rtnl lock and allow concurrent tcf_proto modification, extend tcf_proto with reference counter. Implement helper get/put functions for tcf proto and use them to modify cls API to always take reference to tcf_proto while using it. Only release reference to parent chain after releasing last reference to tp. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: protect filter_chain list with filter_chain_lock mutexVlad Buslov
Extend tcf_chain with new filter_chain_lock mutex. Always lock the chain when accessing filter_chain list, instead of relying on rtnl lock. Dereference filter_chain with tcf_chain_dereference() lockdep macro to verify that all users of chain_list have the lock taken. Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to execute all necessary code while holding chain lock in order to prevent invalidation of chain_info structure by potential concurrent change. This also serializes calls to tcf_chain0_head_change(), which allows head change callbacks to rely on filter_chain_lock for synchronization instead of rtnl mutex. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: traverse chains in block with tcf_get_next_chain()Vlad Buslov
All users of block->chain_list rely on rtnl lock and assume that no new chains are added when traversing the list. Use tcf_get_next_chain() to traverse chain list without relying on rtnl mutex. This function iterates over chains by taking reference to current iterator chain only and doesn't assume external synchronization of chain list. Don't take reference to all chains in block when flushing and use tcf_get_next_chain() to safely iterate over chain list instead. Remove tcf_block_put_all_chains() that is no longer used. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: sched: protect block state with mutexVlad Buslov
Currently, tcf_block doesn't use any synchronization mechanisms to protect critical sections that manage lifetime of its chains. block->chain_list and multiple variables in tcf_chain that control its lifetime assume external synchronization provided by global rtnl lock. Converting chain reference counting to atomic reference counters is not possible because cls API uses multiple counters and flags to control chain lifetime, so all of them must be synchronized in chain get/put code. Use single per-block lock to protect block data and manage lifetime of all chains on the block. Always take block->lock when accessing chain_list. Chain get and put modify chain lifetime-management data and parent block's chain_list, so take the lock in these functions. Verify block->lock state with assertions in functions that expect to be called with the lock taken and are called from multiple places. Take block->lock when accessing filter_chain_list. In order to allow parallel update of rules on single block, move all calls to classifiers outside of critical sections protected by new block->lock. Rearrange chain get and put functions code to only access protected chain data while holding block lock: - Rearrange code to only access chain reference counter and chain action reference counter while holding block lock. - Extract code that requires block->lock from tcf_chain_destroy() into standalone tcf_chain_destroy() function that is called by __tcf_chain_put() in same critical section that changes chain reference counters. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12inet_diag: fix reporting cgroup classid and fallback to priorityKonstantin Khlebnikov
Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested extensions has only 8 bits. Thus extensions starting from DCTCPINFO cannot be requested directly. Some of them included into response unconditionally or hook into some of lower 8 bits. Extension INET_DIAG_CLASS_ID has not way to request from the beginning. This patch bundle it with INET_DIAG_TCLASS (ipv6 tos), fixes space reservation, and documents behavior for other extensions. Also this patch adds fallback to reporting socket priority. This filed is more widely used for traffic classification because ipv4 sockets automatically maps TOS to priority and default qdisc pfifo_fast knows about that. But priority could be changed via setsockopt SO_PRIORITY so INET_DIAG_TOS isn't enough for predicting class. Also cgroup2 obsoletes net_cls classid (it always zero), but we cannot reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen). So, after this patch INET_DIAG_CLASS_ID will report socket priority for most common setup when net_cls isn't set and/or cgroup2 in use. Fixes: 0888e372c37f ("net: inet: diag: expose sockets cgroup classid") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12ptp_qoriq: fix register memory mapYangbo Lu
The 1588 timer on eTSEC Ethernet controller uses different register memory map with DPAA Ethernet controller. Now the new ENETC Ethernet controller uses same reigster memory map with DPAA. To support ENETC, let's use register memory map of DPAA/ENETC in default. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12ptp_qoriq: add little enadian supportYangbo Lu
There is QorIQ 1588 timer IP block on the new ENETC Ethernet controller. However it uses little endian mode which is different with before. This patch is to add little endian support for the driver by using "little-endian" dts node property. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12ptp_qoriq: convert to use ptp_qoriq_init/freeYangbo Lu
Moved QorIQ PTP clock initialization/free into new functions ptp_qoriq_init()/ptp_qoriq_free(). These functions could also be reused by ENETC PTP drvier which is a PCI driver for same 1588 timer IP block. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12ptp_qoriq: make ptp operations globalYangbo Lu
This patch is to make functions of ptp operations global, so that ENETC PTP driver which is a PCI driver for same 1588 timer IP block could reuse them. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12ptp_qoriq: make structure/function names more consistentYangbo Lu
Strings containing "ptp_qoriq" or "qoriq_ptp" which were used for structure/function names were complained by users. Let's just use the unique "ptp_qoriq" to make these names more consistent. This patch is just to unify the names using "ptp_qoriq". It hasn't changed any functions. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net/skbuff: fix up kernel-doc placementBrian Norris
There are several skb_* functions where the locked and unlocked functions are confusingly documented. For several of them, the kernel-doc for the unlocked version is placed above the locked version, which to the casual reader makes it seems like the locked version "takes no locks and you must therefore hold required locks before calling it." One can see, for example, that this link claims to document skb_queue_head(), while instead describing __skb_queue_head(). https://www.kernel.org/doc/html/latest/networking/kapi.html#c.skb_queue_head The correct documentation for skb_queue_head() is also included further down the page. This diff tested via: $ scripts/kernel-doc -rst include/linux/skbuff.h net/core/skbuff.c No new warnings were seen, and the output makes a little more sense. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12net: phylink: add phylink_init_eee() helperRussell King
Provide phylink_init_eee() to allow MAC drivers to initialise PHY EEE from within the ethtool set_eee() method. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12Revert "devlink: Add a generic wake_on_lan port parameter"Vasundhara Volam
This reverts commit b639583f9e36d044ac1b13090ae812266992cbac. As per discussion with Jakub Kicinski and Michal Kubecek, this will be better addressed by soon-too-come ethtool netlink API with additional indication that given configuration request is supposed to be persisted. Also, remove the parameter support from bnxt_en driver. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Michal Kubecek <mkubecek@suse.cz> Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12bpf: offload: add priv field for driversJakub Kicinski
Currently bpf_offload_dev does not have any priv pointer, forcing the drivers to work backwards from the netdev in program metadata. This is not great given programs are conceptually associated with the offload device, and it means one or two unnecessary deferences. Add a priv pointer to bpf_offload_dev. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-11flow_offload: Fix flow action infrastructureEli Britstein
Implementation of macro "flow_action_for_each" introduced in commit e3ab786b42535 ("flow_offload: add flow action infrastructure") and used in commit 738678817573c ("drivers: net: use flow action infrastructure") iterated the first item twice and did not reach the last one. Fix it. Fixes: e3ab786b42535 ("flow_offload: add flow action infrastructure") Fixes: 738678817573c ("drivers: net: use flow action infrastructure") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11devlink: add a generic board.manufacture version nameJakub Kicinski
At Jiri's suggestion add a generic "board.manufacture" version identifier. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10bpf: Add struct bpf_tcp_sock and BPF_FUNC_tcp_sockMartin KaFai Lau
This patch adds a helper function BPF_FUNC_tcp_sock and it is currently available for cg_skb and sched_(cls|act): struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk); int cg_skb_foo(struct __sk_buff *skb) { struct bpf_tcp_sock *tp; struct bpf_sock *sk; __u32 snd_cwnd; sk = skb->sk; if (!sk) return 1; tp = bpf_tcp_sock(sk); if (!tp) return 1; snd_cwnd = tp->snd_cwnd; /* ... */ return 1; } A 'struct bpf_tcp_sock' is also added to the uapi bpf.h to provide read-only access. bpf_tcp_sock has all the existing tcp_sock's fields that has already been exposed by the bpf_sock_ops. i.e. no new tcp_sock's fields are exposed in bpf.h. This helper returns a pointer to the tcp_sock. If it is not a tcp_sock or it cannot be traced back to a tcp_sock by sk_to_full_sk(), it returns NULL. Hence, the caller needs to check for NULL before accessing it. The current use case is to expose members from tcp_sock to allow a cg_skb_bpf_prog to provide per cgroup traffic policing/shaping. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10bpf: Add state, dst_ip4, dst_ip6 and dst_port to bpf_sockMartin KaFai Lau
This patch adds "state", "dst_ip4", "dst_ip6" and "dst_port" to the bpf_sock. The userspace has already been using "state", e.g. inet_diag (ss -t) and getsockopt(TCP_INFO). This patch also allows narrow load on the following existing fields: "family", "type", "protocol" and "src_port". Unlike IP address, the load offset is resticted to the first byte for them but it can be relaxed later if there is a use case. This patch also folds __sock_filter_check_size() into bpf_sock_is_valid_access() since it is not called by any where else. All bpf_sock checking is in one place. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10bpf: Add a bpf_sock pointer to __sk_buff and a bpf_sk_fullsock helperMartin KaFai Lau
In kernel, it is common to check "skb->sk && sk_fullsock(skb->sk)" before accessing the fields in sock. For example, in __netdev_pick_tx: static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev) { /* ... */ struct sock *sk = skb->sk; if (queue_index != new_index && sk && sk_fullsock(sk) && rcu_access_pointer(sk->sk_dst_cache)) sk_tx_queue_set(sk, new_index); /* ... */ return queue_index; } This patch adds a "struct bpf_sock *sk" pointer to the "struct __sk_buff" where a few of the convert_ctx_access() in filter.c has already been accessing the skb->sk sock_common's fields, e.g. sock_ops_convert_ctx_access(). "__sk_buff->sk" is a PTR_TO_SOCK_COMMON_OR_NULL in the verifier. Some of the fileds in "bpf_sock" will not be directly accessible through the "__sk_buff->sk" pointer. It is limited by the new "bpf_sock_common_is_valid_access()". e.g. The existing "type", "protocol", "mark" and "priority" in bpf_sock are not allowed. The newly added "struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)" can be used to get a sk with all accessible fields in "bpf_sock". This helper is added to both cg_skb and sched_(cls|act). int cg_skb_foo(struct __sk_buff *skb) { struct bpf_sock *sk; sk = skb->sk; if (!sk) return 1; sk = bpf_sk_fullsock(sk); if (!sk) return 1; if (sk->family != AF_INET6 || sk->protocol != IPPROTO_TCP) return 1; /* some_traffic_shaping(); */ return 1; } (1) The sk is read only (2) There is no new "struct bpf_sock_common" introduced. (3) Future kernel sock's members could be added to bpf_sock only instead of repeatedly adding at multiple places like currently in bpf_sock_ops_md, bpf_sock_addr_md, sk_reuseport_md...etc. (4) After "sk = skb->sk", the reg holding sk is in type PTR_TO_SOCK_COMMON_OR_NULL. (5) After bpf_sk_fullsock(), the return type will be in type PTR_TO_SOCKET_OR_NULL which is the same as the return type of bpf_sk_lookup_xxx(). However, bpf_sk_fullsock() does not take refcnt. The acquire_reference_state() is only depending on the return type now. To avoid it, a new is_acquire_function() is checked before calling acquire_reference_state(). (6) The WARN_ON in "release_reference_state()" is no longer an internal verifier bug. When reg->id is not found in state->refs[], it means the bpf_prog does something wrong like "bpf_sk_release(bpf_sk_fullsock(skb->sk))" where reference has never been acquired by calling "bpf_sk_fullsock(skb->sk)". A -EINVAL and a verbose are done instead of WARN_ON. A test is added to the test_verifier in a later patch. Since the WARN_ON in "release_reference_state()" is no longer needed, "__release_reference_state()" is folded into "release_reference_state()" also. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-10net: phy: add register modifying helpers returning 1 on changeHeiner Kallweit
When modifying registers there are scenarios where we need to know whether the register content actually changed. This patch adds new helpers to not break users of the current ones, phy_modify() etc. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "irqchip driver fixes: most of them are race fixes for ARM GIC (General Interrupt Controller) variants, but also a fix for the ARM MMP (Marvell PXA168 et al) irqchip affecting OLPC keyboards" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Fix ITT_entry_size accessor irqchip/mmp: Only touch the PJ4 IRQ & FIQ bits on enable/disable irqchip/gic-v3-its: Gracefully fail on LPI exhaustion irqchip/gic-v3-its: Plug allocation race for devices sharing a DevID irqchip/gic-v4: Fix occasional VLPI drop
2019-02-10net: Change TCA_ACT_* to TCA_ID_* to match that of TCA_ID_POLICEEli Cohen
Modify the kernel users of the TCA_ACT_* macros to use TCA_ID_*. For example, use TCA_ID_GACT instead of TCA_ACT_GACT. This will align with TCA_ID_POLICE and also differentiates these identifier, used in struct tc_action_ops type field, from other macros starting with TCA_ACT_. To make things clearer, we name the enum defining the TCA_ID_* identifiers and also change the "type" field of struct tc_action to id. Signed-off-by: Eli Cohen <eli@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10net: Move all TC actions identifiers to one placeEli Cohen
Move all the TC identifiers to one place, to the same enum that defines the identifier of police action. This makes it easier choose numbers for new actions since they are now defined in one place. We preserve the original values for binary compatibility. New IDs should be added inside the enum. Signed-off-by: Eli Cohen <eli@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-09Merge tag 'for-linus-20190209' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request from Christoph, fixing namespace locking when dealing with the effects log, and a rapid add/remove issue (Keith) - blktrace tweak, ensuring requests with -1 sectors are shown (Jan) - link power management quirk for a Smasung SSD (Hans) - m68k nfblock dynamic major number fix (Chengguang) - series fixing blk-iolatency inflight counter issue (Liu) - ensure that we clear ->private when setting up the aio kiocb (Mike) - __find_get_block_slow() rate limit print (Tetsuo) * tag 'for-linus-20190209' of git://git.kernel.dk/linux-block: blk-mq: remove duplicated definition of blk_mq_freeze_queue Blk-iolatency: warn on negative inflight IO counter blk-iolatency: fix IO hang due to negative inflight counter blktrace: Show requests without sector fs: ratelimit __find_get_block_slow() failure message. m68k: set proper major_num when specifying module param major_num libata: Add NOLPM quirk for SAMSUNG MZ7TE512HMHP-000L1 SSD nvme-pci: fix rapid add remove sequence nvme: lock NS list changes while handling command effects aio: initialize kiocb private in case any filesystems expect it.
2019-02-09net: phy: Add support for asking the PHY its abilitiesAndrew Lunn
Add support for runtime determination of what the PHY supports, by adding a new function to the phy driver. The get_features call should set the phydev->supported member with the features the PHY supports. It is only called if phydrv->features is NULL. This requires minor changes to pause. The PHY driver should not set pause abilities, except for when it has odd cause capabilities, e.g. pause cannot be disabled. With this change, phydev->supported already contains the drivers abilities, including pause. So rather than considering phydrv->features, look at the phydev->supported, and enable pause if neither of the pause bits are already set. Signed-off-by: Andrew Lunn <andrew@lunn.ch> [hkallweit1@gmail.com: fixed small checkpatch complaint in one comment] Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-09batman-adv: Add throughput_override hardif genl configurationSven Eckelmann
The B.A.T.M.A.N. V implementation tries to estimate the link throughput of an interface to an originator using different automatic methods. It is still possible to overwrite it the link throughput for all reachable originators via this interface. The BATADV_CMD_SET_HARDIF/BATADV_CMD_GET_HARDIF commands allow to set/get the configuration of this feature using the u32 BATADV_ATTR_THROUGHPUT_OVERRIDE attribute. The used unit is in 100 Kbit/s. If the value is set to 0 then batman-adv will try to estimate the throughput by itself. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add elp_interval hardif genl configurationSven Eckelmann
The ELP packets are transmitted every elp_interval milliseconds on an slave/hard-interface. This value can be changed using the configuration interface. The BATADV_CMD_SET_HARDIF/BATADV_CMD_GET_HARDIF commands allow to set/get the configuration of this feature using the u32 BATADV_ATTR_ELP_INTERVAL attribute. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add orig_interval mesh genl configurationSven Eckelmann
The OGM packets are transmitted every orig_interval milliseconds. This value can be changed using the configuration interface. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the u32 BATADV_ATTR_ORIG_INTERVAL attribute. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add network_coding mesh genl configurationSven Eckelmann
The mesh interface can use (in an homogeneous mesh) network coding, a mechanism that aims to increase the overall network throughput by fusing multiple packets in one transmission. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the BATADV_ATTR_NETWORK_CODING_ENABLED attribute. Setting the u8 to zero will disable this feature and setting it to something else is enabling this feature. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add multicast forceflood mesh genl configurationSven Eckelmann
The mesh interface can optimize the flooding of multicast packets based on the content of the global translation tables. To disable this behavior and use the broadcast-like flooding of the packets, forceflood has to be enabled. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the BATADV_ATTR_MULTICAST_FORCEFLOOD_ENABLED attribute. Setting the u8 to zero will disable this feature (allowing multicast optimizations) and setting it to something else is enabling this feature (forcing simple flooding). Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add log_level mesh genl configurationSven Eckelmann
In contrast to other modules, batman-adv allows to set the debug message verbosity per mesh/soft-interface and not per module (via modparam). The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the u32 (bitmask) BATADV_ATTR_LOG_LEVEL attribute. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add hop_penalty mesh genl configurationSven Eckelmann
The TQ (B.A.T.M.A.N. IV) and throughput values (B.A.T.M.A.N. V) are reduced when they are forwarded. One of the reductions is the penalty for traversing an additional hop. This hop_penalty (0-255) defines the percentage of reduction (0-100%). The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the u8 BATADV_ATTR_HOP_PENALTY attribute. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add gateway mesh genl configurationSven Eckelmann
The mesh/soft-interface can optimize the handling of DHCP packets. Instead of flooding them through the whole mesh, it can be forwarded as unicast to a specific gateway server. The originator which injects the packets in the mesh has to select (based on sel_class thresholds) a responsible gateway server. This is done by switching this originator to the gw_mode client. The servers announce their forwarding bandwidth (download/upload) when the gw_mode server was selected. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the attributes: * u8 BATADV_ATTR_GW_MODE (0 == off, 1 == client, 2 == server) * u32 BATADV_ATTR_GW_BANDWIDTH_DOWN (in 100 kbit/s steps) * u32 BATADV_ATTR_GW_BANDWIDTH_UP (in 100 kbit/s steps) * u32 BATADV_ATTR_GW_SEL_CLASS Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add fragmentation mesh genl configurationSven Eckelmann
The mesh interface can fragment unicast packets when the packet size exceeds the outgoing slave/hard-interface MTU. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the BATADV_ATTR_FRAGMENTATION_ENABLED attribute. Setting the u8 to zero will disable this feature and setting it to something else is enabling this feature. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add distributed_arp_table mesh genl configurationSven Eckelmann
The mesh interface can use a distributed hash table to answer ARP requests without flooding the request through the whole mesh. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the BATADV_ATTR_DISTRIBUTED_ARP_TABLE_ENABLED attribute. Setting the u8 to zero will disable this feature and setting it to something else is enabling this feature. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add bridge_loop_avoidance mesh genl configurationSven Eckelmann
The mesh interface can try to detect loops in the same mesh caused by (indirectly) bridged mesh/soft-interfaces of different nodes. Some of the loops can also be resolved without breaking the mesh. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the BATADV_ATTR_BRIDGE_LOOP_AVOIDANCE_ENABLED attribute. Setting the u8 to zero will disable this feature and setting it to something else is enabling this feature. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add bonding mesh genl configurationSven Eckelmann
The mesh interface can use multiple slave/hard-interface ports at the same time to transport the traffic to other nodes. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the BATADV_ATTR_BONDING_ENABLED attribute. Setting the u8 to zero will disable this feature and setting it to something else is enabling this feature. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add ap_isolation mesh/vlan genl configurationSven Eckelmann
The mesh interface can drop messages between clients to implement a mesh-wide AP isolation. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH and BATADV_CMD_SET_VLAN/BATADV_CMD_GET_VLAN commands allow to set/get the configuration of this feature using the BATADV_ATTR_AP_ISOLATION_ENABLED attribute. Setting the u8 to zero will disable this feature and setting it to something else is enabling this feature. This feature also requires that skbuff which should be handled as isolated are marked. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the mark/mask using the u32 attributes BATADV_ATTR_ISOLATION_MARK and BATADV_ATTR_ISOLATION_MASK. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Add aggregated_ogms mesh genl configurationSven Eckelmann
The mesh interface can delay OGM messages to aggregate different ogms together in a single OGM packet. The BATADV_CMD_SET_MESH/BATADV_CMD_GET_MESH commands allow to set/get the configuration of this feature using the BATADV_ATTR_AGGREGATED_OGMS_ENABLED attribute. Setting the u8 to zero will disable this feature and setting it to something else is enabling this feature. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Prepare framework for vlan genl configSven Eckelmann
The batman-adv configuration interface was implemented solely using sysfs. This approach was condemned by non-batadv developers as "huge mistake". Instead a netlink/genl based implementation was suggested. Beside the mesh/soft-interface specific configuration, the VLANs on top of the mesh/soft-interface have configuration settings. The genl interface reflects this by allowing to get/set it using the vlan specific commands BATADV_CMD_GET_VLAN/BATADV_CMD_SET_VLAN. The set command BATADV_CMD_SET_MESH will also notify interested userspace listeners of the "config" mcast group using the BATADV_CMD_SET_VLAN command message type that settings might have been changed and what the current values are. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-02-09batman-adv: Prepare framework for hardif genl configSven Eckelmann
The batman-adv configuration interface was implemented solely using sysfs. This approach was condemned by non-batadv developers as "huge mistake". Instead a netlink/genl based implementation was suggested. Beside the mesh/soft-interface specific configuration, the slave/hard-interface have B.A.T.M.A.N. V specific configuration settings. The genl interface reflects this by allowing to get/set it using the hard-interface specific commands. The BATADV_CMD_GET_HARDIFS (or short version BATADV_CMD_GET_HARDIF) is reused as get command because it already allow sto dump the content of other information from the slave/hard-interface which are not yet configuration specific. The set command BATADV_CMD_SET_HARDIF will also notify interested userspace listeners of the "config" mcast group using the BATADV_CMD_SET_HARDIF command message type that settings might have been changed and what the current values are. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>