summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-06-27nfp: cast sizeof() to int when comparing with error codeChengguang Xu
sizeof() will return unsigned value so in the error check negative error code will be always larger than sizeof(). Fixes: a0d8e02c35ff ("nfp: add support for reading nffw info") Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27powerpc/powermac: Fix rtc read/write functionsArnd Bergmann
As Mathieu pointed out, my conversion to time64_t was incorrect and resulted in negative times to be read from the RTC. The problem is that during the conversion from a byte array to a time64_t, the 'unsigned char' variable holding the top byte gets turned into a negative signed 32-bit integer before being assigned to the 64-bit variable for any times after 1972. This changes the logic to cast to an unsigned 32-bit number first for the Macintosh time and then convert that to the Unix time, which then gives us a time in the documented 1904..2040 year range. I decided not to use the longer 1970..2106 range that other drivers use, for consistency with the literal interpretation of the register, but that could be easily changed if we decide we want to support any Mac after 2040. Just to be on the safe side, I'm also adding a WARN_ON that will trigger if either the year 2040 has come and is observed by this driver, or we run into an RTC that got set back to a pre-1970 date for some reason (the two are indistinguishable). For the RTC write functions, Andreas found another problem: both pmu_request() and cuda_request() are varargs functions, so changing the type of the arguments passed into them from 32 bit to 64 bit breaks the API for the set_rtc_time functions. This changes it back to 32 bits. The same code exists in arch/m68k/ and is patched in an identical way now in a separate patch. Fixes: 5bfd643583b2 ("powerpc: use time64_t in read_persistent_clock") Reported-by: Mathieu Malaterre <malat@debian.org> Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-27Merge branch 'nfp-MPLS-and-shared-blocks-TC-offload-fixes'David S. Miller
Jakub Kicinski says: ==================== nfp: MPLS and shared blocks TC offload fixes This series brings two fixes to TC filter/action offload code. Pieter fixes matching MPLS packets when the match is purely on the MPLS ethertype and none of the MPLS fields are used. John provides a fix for offload of shared blocks. Unfortunately, with shared blocks there is currently no guarantee that filters which were added by the core will be removed before block unbind. Our simple fix is to not support offload of rules on shared blocks at all, a revert of this fix will be send for -next once the reoffload infrastructure lands. The shared blocks became important as we are trying to use them for bonding offload (managed from user space) and lack of remove calls leads to resource leaks. v2: - fix build error reported by kbuild bot due to missing tcf_block_shared() helper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27nfp: reject binding to shared blocksJohn Hurley
TC shared blocks allow multiple qdiscs to be grouped together and filters shared between them. Currently the chains of filters attached to a block are only flushed when the block is removed. If a qdisc is removed from a block but the block still exists, flow del messages are not passed to the callback registered for that qdisc. For the NFP, this presents the possibility of rules still existing in hw when they should be removed. Prevent binding to shared blocks until the kernel can send per qdisc del messages when block unbinds occur. tcf_block_shared() was not used outside of the core until now, so also add an empty implementation for builds with CONFIG_NET_CLS=n. Fixes: 4861738775d7 ("net: sched: introduce shared filter blocks infrastructure") Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27nfp: flower: fix mpls ether type detectionPieter Jansen van Vuuren
Previously it was not possible to distinguish between mpls ether types and other ether types. This leads to incorrect classification of offloaded filters that match on mpls ether type. For example the following two filters overlap: # tc filter add dev eth0 parent ffff: \ protocol 0x8847 flower \ action mirred egress redirect dev eth1 # tc filter add dev eth0 parent ffff: \ protocol 0x0800 flower \ action mirred egress redirect dev eth2 The driver now correctly includes the mac_mpls layer where HW stores mpls fields, when it detects an mpls ether type. It also sets the MPLS_Q bit to indicate that the filter should match mpls packets. Fixes: bb055c198d9b ("nfp: add mpls match offloading support") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27Merge branch 'Multipath-tests-for-tunnel-devices'David S. Miller
Petr Machata says: ==================== Multipath tests for tunnel devices This patchset adds a test for ECMP and weighted ECMP between two GRE tunnels. In patches #1 and #2, the function multipath_eval() is first moved from router_multipath.sh to lib.sh for ease of reuse, and then fixed up. In patch #3, the function tc_rule_stats_get() is parameterized to be useful for egress rules as well. In patch #4, a new function __simple_if_init() is extracted from simple_if_init(). This covers the logic that needs to be done for the usual interface: VRF migration, upping and installation of IP addresses. Patch #5 then adds the test itself. Additionally in patch #6, a requirement to add diagrams to selftests is documented. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27selftests: forwarding: README: Require diagramsPetr Machata
ASCII art diagrams are well suited for presenting the topology that a test uses while being easy to embed directly in the test file iteslf. They make the information very easy to grasp even for simple topologies, and for more complex ones they are almost essential, as figuring out the interconnects from the script itself proves to be difficult. Therefore state the requirement for topology ASCII art in README. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27selftests: forwarding: Test multipath tunnelingPetr Machata
Add a GRE-tunneling test such that there are two tunnels involved, with a multipath route listing both as next hops. Similarly to router_multipath.sh, test that the distribution of traffic to the tunnels honors the configured weights. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27selftests: forwarding: lib: Extract interface-init functionsPetr Machata
The function simple_if_init() does two things: it creates a VRF, then moves an interface into this VRF and configures addresses. The latter comes in handy when adding more interfaces into a VRF later on. The situation is similar for simple_if_fini(). Therefore split the interface remastering and address de/initialization logic to a new pair of helpers __simple_if_init() / __simple_if_fini(), and defer to these helpers from simple_if_init() and simple_if_fini(). Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27selftests: forwarding: tc_rule_stats_get: Parameterize directionPetr Machata
The GRE multipath tests need stats on an egress counter. Change tc_rule_stats_get() to take direction as an optional argument, with default of ingress. Take the opportunity to change line continuation character from | to \. Move the | to the next line, which indent. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27selftests: forwarding: multipath_eval(): Improve stylePetr Machata
- Change the indentation of the function body from 7 spaces to one tab. - Move initialization of weights_ratio up so that it can be referenced from the error message about packet difference being zero. - Move |'s consistently to continuation line, which reindent. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27selftests: forwarding: Move multipath_eval() to lib.shPetr Machata
This function will be useful for the GRE multipath test that is coming later. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27net/tls: Remove VLA usage on nonceKees Cook
It looks like the prior VLA removal, commit b16520f7493d ("net/tls: Remove VLA usage"), and a new VLA addition, commit c46234ebb4d1e ("tls: RX path for ktls"), passed in the night. This removes the newly added VLA, which happens to have its bounds based on the same max value. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27fib_rules: match rules based on suppress_* properties tooJason A. Donenfeld
Two rules with different values of suppress_prefix or suppress_ifgroup are not the same. This fixes an -EEXIST when running: $ ip -4 rule add table main suppress_prefixlength 0 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Fixes: f9d4b0c1e969 ("fib_rules: move common handling of newrule delrule msgs into fib_nl2rule") Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-27rds: clean up loopback rds_connections on netns deletionSowmini Varadhan
The RDS core module creates rds_connections based on callbacks from rds_loop_transport when sending/receiving packets to local addresses. These connections will need to be cleaned up when they are created from a netns that is not init_net, and that netns is deleted. Add the changes aligned with the changes from commit ebeeb1ad9b8a ("rds: tcp: use rds_destroy_pending() to synchronize netns/module teardown and rds connection/workq management") for rds_loop_transport Reported-and-tested-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26Input: psmouse - fix button reporting for basic protocolsDmitry Torokhov
The commit ba667650c568 ("Input: psmouse - clean up code") was pretty brain-dead and broke extra buttons reporting for variety of PS/2 mice: Genius, Thinkmouse and Intellimouse Explorer. We need to actually inspect the data coming from the device when reporting events. Fixes: ba667650c568 ("Input: psmouse - clean up code") Reported-by: Jiri Slaby <jslaby@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-06-26net/mlx5: Fix command interface race in polling modeAlex Vesker
The command interface can work in two modes: Events and Polling. In the general case, each time we invoke a command, a work is queued to handle it. When working in events, the interrupt handler completes the command execution. On the other hand, when working in polling mode, the work itself completes it. Due to a bug in the work handler, a command could have been completed by the interrupt handler, while the work handler hasn't finished yet, causing the it to complete once again if the command interface mode was changed from Events to polling after the interrupt handler was called. mlx5_unload_one() mlx5_stop_eqs() // Destroy the EQ before cmd EQ ...cmd_work_handler() write_doorbell() --> EVENT_TYPE_CMD mlx5_cmd_comp_handler() // First free free_ent(cmd, ent->idx) complete(&ent->done) <-- mlx5_stop_eqs //cmd was complete // move to polling before destroying the last cmd EQ mlx5_cmd_use_polling() cmd->mode = POLL; --> cmd_work_handler (continues) if (cmd->mode == POLL) mlx5_cmd_comp_handler() // Double free The solution is to store the cmd->mode before writing the doorbell. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26net/mlx5: Fix incorrect raw command length parsingAlex Vesker
The NULL character was not set correctly for the string containing the command length, this caused failures reading the output of the command due to a random length. The fix is to initialize the output length string. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26net/mlx5: Fix wrong size allocation for QoS ETC TC regitsterShay Agroskin
The driver allocates wrong size (due to wrong struct name) when issuing a query/set request to NIC's register. Fixes: d8880795dabf ("net/mlx5e: Implement DCBNL IEEE max rate") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26net/mlx5: Fix required capability for manipulating MPFSEli Cohen
Manipulating of the MPFS requires eswitch manager capabilities. Fixes: eeb66cdb6826 ('net/mlx5: Separate between E-Switch and MPFS') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26net/mlx5: E-Switch, Disallow vlan/spoofcheck setup if not being esw managerEli Cohen
In smartnic env, if the host (PF) driver is not an e-switch manager, we are not allowed to apply eswitch ports setups such as vlan (VST), spoof-checks, min/max rate or state. Make sure we are eswitch manager when coming to issue these callbacks and err otherwise. Also fix the definition of ESW_ALLOWED to rely on eswitch_manager capability and on the vport_group_manger. Operations on the VF nic vport context, such as setting a mac or reading the vport counters are allowed to the PF in this scheme. The modify nic vport guid code was modified to omit checking the nic_vport_node_guid_modify eswitch capability. The reason for doing so is that modifying node guid requires vport group manager capability, and there's no need to check further capabilities. 1. set_vf_vlan - disallowed 2. set_vf_spoofchk - disallowed 3. set_vf_mac - allowed 4. get_vf_config - allowed 5. set_vf_trust - disallowed 6. set_vf_rate - disallowed 7. get_vf_stat - allowed 8. set_vf_link_state - disallowed Fixes: f942380c1239 ('net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-06-26IB/mlx5: Avoid dealing with vport representors if not being e-switch managerOr Gerlitz
In smartnic env, the host (PF) driver might not be an e-switch manager, hence the switchdev mode representors are running on the embedded cpu (EC) and not at the host. As such, we should avoid dealing with vport representors if not being esw manager. Fixes: b5ca15ad7e61 ('IB/mlx5: Add proper representors support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26net/mlx5e: Avoid dealing with vport representors if not being e-switch managerOr Gerlitz
In smartnic env, the host (PF) driver might not be an e-switch manager, hence the switchdev mode representors are running on the embedded cpu (EC) and not at the host. As such, we should avoid dealing with vport representors if not being esw manager. While here, make sure to disallow eswitch switchdev related setups through devlink if we are not esw managers. Fixes: cb67b832921c ('net/mlx5e: Introduce SRIOV VF representors') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26net/mlx5: E-Switch, Avoid setup attempt if not being e-switch managerOr Gerlitz
In smartnic env, the host (PF) driver might not be an e-switch manager, hence the FW will err on driver attempts to deal with setting/unsetting the eswitch and as a result the overall setup of sriov will fail. Fix that by avoiding the operation if e-switch management is not allowed for this driver instance. While here, move to use the correct name for the esw manager capability name. Fixes: 81848731ff40 ('net/mlx5: E-Switch, Add SR-IOV (FDB) support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Guy Kushnir <guyk@mellanox.com> Reviewed-by: Eli Cohen <eli@melloanox.com> Tested-by: Eli Cohen <eli@melloanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26net/mlx5e: Don't attempt to dereference the ppriv struct if not being ↵Or Gerlitz
eswitch manager The check for cpu hit statistics was not returning immediate false for any non vport rep netdev and hence we crashed (say on mlx5 probed VFs) if user-space tool was calling into any possible netdev in the system. Fix that by doing a proper check before dereferencing. Fixes: 1d447a39142e ('net/mlx5e: Extendable vport representor netdev private data') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Eli Cohen <eli@melloanox.com> Reviewed-by: Eli Cohen <eli@melloanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-26PCI: controller: Move PCI_DOMAINS selection to arch KconfigLorenzo Pieralisi
Commit 51bc085d6454 ("PCI: Improve host drivers compile test coverage") added configuration options to allow PCI host controller drivers to be compile tested on all architectures. Some host controller drivers (eg PCIE_ALTERA) config entries select the PCI_DOMAINS config option to enable PCI domains management in the kernel. Now that host controller drivers can be compiled on all architectures, this triggers build regressions on arches that do not implement the PCI_DOMAINS required API (ie pci_domain_nr()): drivers/ata/pata_ali.c: In function 'ali_init_chipset': drivers/ata/pata_ali.c:469:38: error: implicit declaration of function 'pci_domain_nr'; did you mean 'pci_iomap_wc'? Furthemore, some software configurations (ie Jailhouse) require a PCI_DOMAINS enabled kernel to configure multiple host controllers without having an explicit dependency on the ARM platform on which they run. Make PCI_DOMAINS a visible configuration option on ARM so that software configurations that need it can manually select it and move the PCI_DOMAINS selection from PCI controllers configuration file to ARM sub-arch config entries that currently require it, fixing the issue. Fixes: 51bc085d6454 ("PCI: Improve host drivers compile test coverage") Link: https://lkml.kernel.org/r/20180612170229.GA10141@roeck-us.net Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <helgaas@kernel.org> Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Ley Foon Tan <ley.foon.tan@intel.com> Acked-by: Rob Herring <robh@kernel.org> Cc: Scott Branden <scott.branden@broadcom.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Guenter Roeck <linux@roeck-us.net>
2018-06-26PCI: Initialize endpoint library before controllersAlan Douglas
The endpoint library must be initialized before its users, which are in drivers/pci/controllers. The endpoint initialization currently depends on link order. This corrects a kernel crash when loading the Cadence EP driver, since it calls devm_pci_epc_create() and this is only valid once the endpoint library has been initialized. Fixes: 6e0832fa432e ("PCI: Collect all native drivers under drivers/pci/controller/") Signed-off-by: Alan Douglas <adouglas@cadence.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-26block: Fix transfer when chunk sectors exceeds maxKeith Busch
A device may have boundary restrictions where the number of sectors between boundaries exceeds its max transfer size. In this case, we need to cap the max size to the smaller of the two limits. Reported-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Tested-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Cc: <stable@vger.kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-26Merge tag 'qcom-fixes-for-4.18-rc2' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into fixes Qualcomm Fixes for v4.18-rc2 * Fix compiler warnings for cmd-db driver * tag 'qcom-fixes-for-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux: qcom: cmd-db: enforce CONFIG_OF_RESERVED_MEM dependency Signed-off-by: Olof Johansson <olof@lixom.net>
2018-06-26ARM: dts: Fix SPI node for Arria10Thor Thayer
Remove the unused bus-num node and change num-chipselect to num-cs to match SPI bindings. Cc: stable@vger.kernel.org Fixes: f2d6f8f817814 ("ARM: dts: socfpga: Add SPI Master1 for Arria10 SR chip") Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2018-06-26arm64: dts: stratix10: Fix SPI nodes for Stratix10Thor Thayer
Remove the unused bus-num node and change num-chipselect to num-cs to match SPI bindings. Cc: stable@vger.kernel.org Fixes: 78cd6a9d8e154 ("arm64: dts: Add base stratix 10 dtsi") Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2018-06-26dh key: fix rounding up KDF output lengthEric Biggers
Commit 383203eff718 ("dh key: get rid of stack allocated array") changed kdf_ctr() to assume that the length of key material to derive is a multiple of the digest size. The length was supposed to be rounded up accordingly. However, the round_up() macro was used which only gives the correct result on power-of-2 arguments, whereas not all hash algorithms have power-of-2 digest sizes. In some cases this resulted in a write past the end of the 'outbuf' buffer. Fix it by switching to roundup(), which works for non-power-of-2 inputs. Reported-by: syzbot+486f97f892efeb2075a3@syzkaller.appspotmail.com Reported-by: syzbot+29d17b7898b41ee120a5@syzkaller.appspotmail.com Reported-by: syzbot+8a608baf8751184ec727@syzkaller.appspotmail.com Reported-by: syzbot+d04e58bd384f1fe0b112@syzkaller.appspotmail.com Fixes: 383203eff718 ("dh key: get rid of stack allocated array") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: James Morris <james.morris@microsoft.com>
2018-06-26certs/blacklist: fix const confusionNick Desaulniers
Fixes commit 2be04df5668d ("certs/blacklist_nohashes.c: fix const confusion in certs blacklist") Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2018-06-26ceph: fix dentry leak in splice_dentry()Yan, Zheng
In any case, d_splice_alias() does not drop reference of original dentry. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-26netfilter: nf_conncount: fix garbage collection confirm raceFlorian Westphal
Yi-Hung Wei and Justin Pettit found a race in the garbage collection scheme used by nf_conncount. When doing list walk, we lookup the tuple in the conntrack table. If the lookup fails we remove this tuple from our list because the conntrack entry is gone. This is the common cause, but turns out its not the only one. The list entry could have been created just before by another cpu, i.e. the conntrack entry might not yet have been inserted into the global hash. The avoid this, we introduce a timestamp and the owning cpu. If the entry appears to be stale, evict only if: 1. The current cpu is the one that added the entry, or, 2. The timestamp is older than two jiffies The second constraint allows GC to be taken over by other cpu too (e.g. because a cpu was offlined or napi got moved to another cpu). We can't pretend the 'doubtful' entry wasn't in our list. Instead, when we don't find an entry indicate via IS_ERR that entry was removed ('did not exist' or withheld ('might-be-unconfirmed'). This most likely also fixes a xt_connlimit imbalance earlier reported by Dmitry Andrianov. Cc: Dmitry Andrianov <dmitry.andrianov@alertme.com> Reported-by: Justin Pettit <jpettit@vmware.com> Reported-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-26Merge branch 'fixes-v4.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem fixes from James Morris: - Smack: fix a regression caused by 1bbc55131e5 - X.509: fix a (usually un-seen) bug in RSA signature parsing * 'fixes-v4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: X.509: unpack RSA signatureValue field from BIT STRING Smack: Mark inode instant in smack_task_to_inode
2018-06-26Merge tag 'for-4.18-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Two regression fixes and an incorrect error value propagation fix from 'rename exchange'" * tag 'for-4.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: fix return value on rename exchange failure btrfs: fix invalid-free in btrfs_extent_same Btrfs: fix physical offset reported by fiemap for inline extents
2018-06-26netfilter: nf_log: don't hold nf_log_mutex during user accessJann Horn
The old code would indefinitely block other users of nf_log_mutex if a userspace access in proc_dostring() blocked e.g. due to a userfaultfd region. Fix it by moving proc_dostring() out of the locked region. This is a followup to commit 266d07cb1c9a ("netfilter: nf_log: fix sleeping function called from invalid context"), which changed this code from using rcu_read_lock() to taking nf_log_mutex. Fixes: 266d07cb1c9a ("netfilter: nf_log: fix sleeping function calle[...]") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-26netfilter: nf_log: fix uninit read in nf_log_proc_dostringJann Horn
When proc_dostring() is called with a non-zero offset in strict mode, it doesn't just write to the ->data buffer, it also reads. Make sure it doesn't read uninitialized data. Fixes: c6ac37d8d884 ("netfilter: nf_log: fix error on write NONE to [...]") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-26ARM: davinci: board-da850-evm: fix WP pin polarity for MMC/SDAdam Ford
When booting from MMC/SD in rw mode on DA850 EVM, the system crashes because the write protect pin should be active high and not active low. This patch fixes the polarity of the WP pin. Fixes: bdf0e8364fd3 ("ARM: davinci: da850-evm: use gpio descriptor for mmc pins") Signed-off-by: Adam Ford <aford173@gmail.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2018-06-26selftests: forwarding: mirror_gre_vlan_bridge_1q: Unset rp_filterPetr Machata
The IP addresses of tunnel endpoint at H3 are set at the VLAN device $h3.555. Therefore when test_gretap_untagged_egress() sets vlan 555 to egress untagged at $swp3, $h3's rp_filter rejects these packets. The test then spuriously fails. Therefore turn off net.ipv4.conf.{all, $h3}.rp_filter. Fixes: 9c7c8a82442c ("selftests: forwarding: mirror_gre_vlan_bridge_1q: Add more tests") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26mdio-mux-gpio: Remove VLA usageKees Cook
In the quest to remove all stack VLA usage from the kernel[1], this allocates the values buffer during the callback instead of putting it on the stack. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26Merge branch 'net-sched-support-replay-of-filter-offload-when-binding-to-block'David S. Miller
Jakub Kicinski says: ==================== net: sched: support replay of filter offload when binding to block This series from John adds the ability to replay filter offload requests when new offload callback is being registered on a TC block. This is most likely to take place for shared blocks today, when a block which already has rules is bound to another interface. Prior to this patch set if any of the rules were offloaded the block bind would fail. A new tcf_proto_op is added to generate a filter-specific offload request. The new 'offload' op is supporting extack from day 0, hence we need to propagate extack to .ndo_setup_tc TC_BLOCK_BIND/TC_BLOCK_UNBIND and through tcf_block_cb_register() to tcf_block_playback_offloads(). The immediate use of this patch set is to simplify life of drivers which require duplicating rules when sharing blocks. Switch drivers (mlxsw) can bind ports to rule lists dynamically, NIC drivers generally don't have that ability and need the rules to be duplicated for each ingress they match on. In code terms this means that switch drivers don't register multiple callbacks for each port. NIC drivers do, and get a separate request and hance rule per-port, as if the block was not shared. The registration fails today, however, if some rules were already present. As John notes in description of patch 7, drivers which register multiple callbacks to shared blocks will likely need to flush the rules on block unbind. This set makes the core not only replay the the offload add requests but also offload remove requests when callback is unregistered. v2: - name parameters in patch 2; - use unsigned int instead of u32 for in_hw_coun; - improve extack message in patch 7. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26net: sched: call reoffload op on block callback regJohn Hurley
Call the reoffload tcf_proto_op on all tcf_proto nodes in all chains of a block when a callback tries to register to a block that already has offloaded rules. If all existing rules cannot be offloaded then the registration is rejected. This replaces the previous policy of rejecting such callback registration outright. On unregistration of a callback, the rules are flushed for that given cb. The implementation of block sharing in the NFP driver, for example, duplicates shared rules to all devs bound to a block. This meant that rules could still exist in hw even after a device is unbound from a block (assuming the block still remains active). Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26net: sched: cls_bpf: implement offload tcf_proto_opJohn Hurley
Add the offload tcf_proto_op in cls_bpf to generate an offload message for each bpf prog in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' prog. A prog contains a flag to indicate if it is in hardware or not. To ensure the offload function properly maintains this flag, keep a reference counter for the number of instances of the prog that are in hardware. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26net: sched: cls_u32: implement offload tcf_proto_opJohn Hurley
Add the offload tcf_proto_op in cls_u32 to generate an offload message for each filter and the hashtable in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. A filter contains a flag to indicate if it is in hardware or not. To ensure the offload function properly maintains this flag, keep a reference counter for the number of instances of the filter that are in hardware. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26net: sched: cls_matchall: implement offload tcf_proto_opJohn Hurley
Add the reoffload tcf_proto_op in matchall to generate an offload message for each filter in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. Ensure matchall flags correctly report if the rule is in hw by keeping a reference counter for the number of instances of the rule offloaded. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26net: sched: cls_flower: implement offload tcf_proto_opJohn Hurley
Add the reoffload tcf_proto_op in flower to generate an offload message for each filter in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. A filter contains a flag to indicate if it is in hardware or not. To ensure the reoffload function properly maintains this flag, keep a reference counter for the number of instances of the filter that are in hardware. Only update the flag when this counter changes from or to 0. Add a generic helper function to implement this behaviour. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26net: sched: add tcf_proto_op to offload a ruleJohn Hurley
Create a new tcf_proto_op called 'reoffload' that generates a new offload message for each node in a tcf_proto. Pointers to the tcf_proto and whether the offload request is to add or delete the node are included. Also included is a callback function to send the offload message to and the option of priv data to go with the cb. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-26net: sched: pass extack pointer to block binds and cb registrationJohn Hurley
Pass the extact struct from a tc qdisc add to the block bind function and, in turn, to the setup_tc ndo of binding device via the tc_block_offload struct. Pass this back to any block callback registrations to allow netlink logging of fails in the bind process. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>