summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-08-06btrfs: Fix comment in lookup_inline_extent_backrefNikolay Borisov
The comment wrongfully states that the owner parameter is the level of the parent block. In fact owner is the level of the current block and by adding 1 to it we can eventually get to the parent/root. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Document __btrfs_inc_extent_refNikolay Borisov
Here is a doc-only patch which tires to deobfuscate the terra-incognita that arguments for delayed refs are. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: scrub: Remove unused copy_nocow_pages and its callchainQu Wenruo
Since commit ac0b4145d662a3b9e340 ("btrfs: scrub: Don't use inode pages for device replace") the function is not used and we can remove all functions down the call chain. There was an optimization that reused inode pages to speed up device replace, but broke when there was nodatasum and compressed page. The potential performance gain is small so we don't loose much by removing it and using scrub_pages same as the other pages. Signed-off-by: Qu Wenruo <wqu@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: replace get_seconds with new 64bit time APIAllen Pais
The get_seconds() function is deprecated as it truncates the timestamp to 32 bits. Change it to or ktime_get_real_seconds(). Signed-off-by: Allen Pais <allen.lkml@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06pinctrl: ocelot: add support for interrupt controllerQuentin Schulz
This GPIO controller can serve as an interrupt controller as well on the GPIOs it handles. An interrupt is generated whenever a GPIO line changes and the interrupt for this GPIO line is enabled. This means that both the changes from low to high and high to low generate an interrupt. For some use cases, it makes sense to ignore the high to low change and not generate an interrupt. Such a use case is a line that is hold in a level high/low manner until the event holding the line gets acked. This can be achieved by making sure the interrupt on the GPIO controller side gets acked and masked only after the line gets hold in its default state, this is what's done with the fasteoi functions. Only IRQ_TYPE_EDGE_BOTH and IRQ_TYPE_LEVEL_HIGH are supported for now. Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-08-06Merge tag 'irqchip-4.19' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - GICv3 ITS LPI allocation revamp - GICv3 support for hypervisor-enforced LPI range - GICv3 ITS conversion to raw spinlock
2018-08-06PM / reboot: Eliminate race between reboot and suspendPingfan Liu
At present, "systemctl suspend" and "shutdown" can run in parrallel. A system can suspend after devices_shutdown(), and resume. Then the shutdown task goes on to power off. This causes many devices are not really shut off. Hence replacing reboot_mutex with system_transition_mutex (renamed from pm_mutex) to achieve the exclusion. The renaming of pm_mutex as system_transition_mutex can be better to reflect the purpose of the mutex. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-06Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.gitKalle Valo
ath.git patches for 4.19. Major changes: ath10k * add debugfs file warm_hw_reset wil6210 * add debugfs files tx_latency, link_stats and link_stats_global * add 3-MSI support * allow scan on AP interface * support max aggregation window size 64
2018-08-06ieee802154: fakelb: add deprecated msg while probeAlexander Aring
Since mac802154_hwsim the fakelb driver will get deprecated. This patch will notifier all users of fakelb to switch to the new mac802154_hwsim driver. Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-08-06ieee802154: hwsim: add replacement for fakelbAlexander Aring
This patch adds a new virtual driver mac802154_hwsim which is based on the fakelb driver. The fakelb driver will get deprecated and hopefully removed someday. The main reason for doing this step is to rename the driver to mac802154_hwsim to have a similar naming scheme as mac80211_hwsim, which is more popular in the 802.11 wireless word and the idea is the same behind this driver. The new features of this driver are to have knowledge about connected edges, which can be changed during runtime. This offers a testing environment for routing protocols e.g. RPL. The default behaviour is still as fakelb: two radios connected to each other. New added radios during runtime will not be connected to other wpan_hwsim instances. The netlink api is not namespace aware on purpose, only the registered wpan_phy's can be moved to namespaces. The physical layer according to wiresless "air" communication can be handled across namespaces. Furthermore the edges can be weighted with the LQI value according IEEE 802.15.4 which offers additional handling to mark bad or good connection indicators to other connected virtual phys. Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-08-06net: ieee802154: 6lowpan: remove redundant pointers 'fq' and 'net'Colin Ian King
Pointers fq and net are being assigned but are never used hence they are redundant and can be removed. Cleans up clang warnings: warning: variable 'fq' set but not used [-Wunused-but-set-variable] warning: variable 'net' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-08-06net: mac802154: tx: expand tailroom if necessaryAlexander Aring
This patch is necessary if case of AF_PACKET or other socket interface which I am aware of it and didn't allocated the necessary room. Reported-by: David Palma <david.palma@ntnu.no> Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-08-06net: 6lowpan: fix reserved space for single framesAlexander Aring
This patch fixes patch add handling to take care tail and headroom for single 6lowpan frames. We need to be sure we have a skb with the right head and tailroom for single frames. This patch do it by using skb_copy_expand() if head and tailroom is not enough allocated by upper layer. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195059 Reported-by: David Palma <david.palma@ntnu.no> Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-08-06PM / hibernate: Mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This addresses Coverity-ID: 114713 ("Missing break in switch"). Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-06aio: allow direct aio poll comletions for keyed wakeupsChristoph Hellwig
If we get a keyed wakeup for a aio poll waitqueue and wake can acquire the ctx_lock without spinning we can just complete the iocb straight from the wakeup callback to avoid a context switch. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Avi Kivity <avi@scylladb.com>
2018-08-06aio: implement IOCB_CMD_POLLChristoph Hellwig
Simple one-shot poll through the io_submit() interface. To poll for a file descriptor the application should submit an iocb of type IOCB_CMD_POLL. It will poll the fd for the events specified in the the first 32 bits of the aio_buf field of the iocb. Unlike poll or epoll without EPOLLONESHOT this interface always works in one shot mode, that is once the iocb is completed, it will have to be resubmitted. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Avi Kivity <avi@scylladb.com>
2018-08-06aio: add a iocb refcountChristoph Hellwig
This is needed to prevent races caused by the way the ->poll API works. To avoid introducing overhead for other users of the iocbs we initialize it to zero and only do refcount operations if it is non-zero in the completion path. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Avi Kivity <avi@scylladb.com>
2018-08-06timerfd: add support for keyed wakeupsChristoph Hellwig
This prepares timerfd for use with aio poll. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Avi Kivity <avi@scylladb.com>
2018-08-06cpufreq: intel_pstate: Ignore turbo active ratio in HWPSrinivas Pandruvada
When HWP is active turbo active ratio is not used, so we should allow policy max frequency above turbo activation ratio to be set. When HWP is not active, then any policy max frequency above turbo activation ratio can result upto max one-core turbo frequency. This fix helps better thermal control in turbo region when other methods like "Running Average Power Limit" is not available to use. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-06irqchip/gic-v3-its: Make its_lock a raw_spin_lock_tSebastian Andrzej Siewior
The its_lock lock is held while a new device is added to the list and during setup while the CPU is booted. Even on -RT the CPU-bootup is performed with disabled interrupts. Make its_lock a raw_spin_lock_t. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-08-06ACPI / EC: Use ec_no_wakeup on ThinkPad X1 Yoga 3rdAaron Ma
Like on X1C6, on X1Y3 EC interrupts constantly wake up system from s2idle, the power consumption is extremely high. So make ec_no_wakeup be true as default to keep system in s2idle mode and reduce power consumption. Power button works when ec_no_wakeup=true. Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-06Merge back cpufreq changes for 4.19.Rafael J. Wysocki
2018-08-06Merge back ACPICA material for 4.19.Rafael J. Wysocki
2018-08-06ALSA: dice: fix wrong copy to rx parameters for Alesis iO26Takashi Sakamoto
A commit 28b208f600a3 ('ALSA: dice: add parameters of stream formats for models produced by Alesis') adds wrong copy to rx parameters instead of tx parameters for Alesis iO26. This commit fixes the bug for v4.18-rc8. Fixes: 28b208f600a3 ('ALSA: dice: add parameters of stream formats for models produced by Alesis') Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Cc: <stable@vger.kernel.org> # v4.18 Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-06ALSA: echoaudio: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 115156 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-06ALSA: emu10k1: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Notice that in this particular case, I replaced the code comment with a proper "fall through" annotation, which is what GCC is expecting to find. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-06ALSA: mixart: Mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-06Merge remote-tracking branch 'net-next/master'Stefan Schmidt
2018-08-06bpf: btf: Change tools/lib/bpf/btf to LGPLMartin KaFai Lau
This patch changes the tools/lib/bpf/btf.[ch] to LGPL which is inline with libbpf also. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-05lightnvm: remove minor version check for 2.0Matias Bjørling
A minor version number increase should not break backwards compatibility. Fixes: 3cb98f84d368b ("lightnvm: add minor version to generic geometry") Reviewed-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-05Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-4.19/block2Jens Axboe
Pull NVMe changes from Christoph: "This contains the support for TP4004, Asymmetric Namespace Access, which makes NVMe multipathing usable in practice." * 'nvme-4.19' of git://git.infradead.org/nvme: nvmet: use Retain Async Event bit to clear AEN nvmet: support configuring ANA groups nvmet: add minimal ANA support nvmet: track and limit the number of namespaces per subsystem nvmet: keep a port pointer in nvmet_ctrl nvme: add ANA support nvme: remove nvme_req_needs_failover nvme: simplify the API for getting log pages nvme.h: add ANA definitions nvme.h: add support for the log specific field Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-05Merge tag 'v4.18-rc6' into for-4.19/block2Jens Axboe
Pull in 4.18-rc6 to get the NVMe core AEN change to avoid a merge conflict down the line. Signed-of-by: Jens Axboe <axboe@kernel.dk>
2018-08-05tc-testing: remove duplicate spaces in skbedit match patternsVlad Buslov
Match patterns for some skbedit tests contain duplicate whitespace that is not present in actual tc output. This causes tests to fail because they can't match required action, even when it was successfully created. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05tc-testing: remove duplicate spaces in connmark match patternsVlad Buslov
Match patterns for some connmark tests contain duplicate whitespace that is not present in actual tc output. This causes tests to fail because they can't match required action, even when it was successfully created. Fixes: 1dad0f9ffff7 ("tc-testing: add connmark action tests") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05tc-testing: flush gact actions on test teardownVlad Buslov
Test 6fb4 creates one mirred and one pipe action, but only flushes mirred on teardown. Leaking pipe action causes failures in other tests. Add additional teardown command to also flush gact actions. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05tc-testing: fix ip address in u32 testVlad Buslov
Fix expected ip address to actually match configured ip address. Fix test to expect single matched filter. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05Merge tag 'wireless-drivers-next-for-davem-2018-08-05' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.19 This time a bigger pull request as we have two new Mediatek drivers MT76x2u (CONFIG_MT76x2U) and MT76x0U (CONFIG_MT76x0U). Also iwlwifi got support for the new IEEE 802.11ax standard, the successor for 802.11ac. And naturally smaller new features and bugfixes all over. Major changes: wcn36xx * fix WEP in client mode wil6210 * add support for Talyn-MB (Talyn ver 2.0) device * add support for enhanced DMA firmware feature iwlwifi * implement 802.11ax D2.0 * support for the new 22560 device family * new PCI IDs for 22000 and 22560 qtnfmac * implement cfg80211 power management callback * enable multiple SSIDs scan support * qtnfmac: implement basic WoWLAN support mt7601u * fall back to software encryption for hw unsupported ciphers * enable 802.11 Management Frame Protection (MFP) mt76 * support setting RTS threshold * add USB support * add support for MT76x2u devices * add support for MT76x0U devices mwifiex * allow user space to set all other IEs except WMM IE rsi * add firmware support for AP+BT dual mode ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05ip6_tunnel: use the right value for ipv4 min mtu check in ip6_tnl_xmitXin Long
According to RFC791, 68 bytes is the minimum size of IPv4 datagram every device must be able to forward without further fragmentation while 576 bytes is the minimum size of IPv4 datagram every device has to be able to receive, so in ip6_tnl_xmit(), 68(IPV4_MIN_MTU) should be the right value for the ipv4 min mtu check in ip6_tnl_xmit. While at it, change to use max() instead of if statement. Fixes: c9fefa08190f ("ip6_tunnel: get the min mtu properly in ip6_tnl_xmit") Reported-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2018-08-05 Here's the main bluetooth-next pull request for the 4.19 kernel. - Added support for Bluetooth Advertising Extensions - Added vendor driver support to hci_h5 HCI driver - Added serdev support to hci_h5 driver - Added support for Qualcomm wcn3990 controller - Added support for RTL8723BS and RTL8723DS controllers - btusb: Added new ID for Realtek 8723DE - Several other smaller fixes & cleanups Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05Merge branch 'mlxsw-Enable-MC-aware-mode-for-mlxsw-ports'David S. Miller
Ido Schimmel says: ==================== mlxsw: Enable MC-aware mode for mlxsw ports Petr says: Due to an issue in Spectrum chips, when unicast traffic shares the same queue as BUM traffic, and there is a congestion, the BUM traffic is admitted to the queue anyway, thus pushing out all UC traffic. In order to give unicast traffic precedence over BUM traffic, configure multicast-aware mode on all ports. Under multicast-aware regime, when assigning traffic class to a packet, the switch doesn't merely take the value prescribed by the QTCT register. For BUM traffic, it instead assigns that value plus 8. That limits the number of available TCs, but since mlxsw currently only uses the lower eight anyway, it is no real loss. The two TCs (UC and MC one) are then mapped to the same subgroup and strictly prioritized so that UC traffic is preferred in case of congestion. In patch #1, introduce a new register, QTCTM, which enables the multicast-aware mode. In patch #2, fix a typo in related code. In patch #3, set up TCs and QTCTM to enable multicast-aware mode. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05mlxsw: spectrum: Configure MC-aware mode on mlxsw portsPetr Machata
In order to give unicast traffic precedence over BUM traffic, configure multicast-aware mode on all ports. Under multicast-aware regime, when assigning traffic class to a packet, the switch doesn't merely take the value prescribed by the QTCT register. For BUM traffic, it instead assigns that value plus 8. ETS elements for TCs 8..15 thus need to be configured as well. Extend mlxsw_sp_port_ets_init() so that it maps each of them to the same subgroup as their corresponding TC from the range 0..7, such that TCs X and X+8 map to the same subgroup. The existing code configures TCs with strict priority. So far this was immaterial, because each TC had its own subgroup. Now that two TCs share a subgroup it becomes important. TCs are prioritized in order of 7, 6, ..., 0, 15, 14, ..., 8: the higher TCs used for BUM traffic end up being deprioritized. Since that's what's needed, keep that configuration as it is, and configure the new TCs likewise. Finally in mlxsw_sp_port_create(), invoke configuration of QTCTM to enable MC-aware mode on each port. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05mlxsw: spectrum: Fix a typoPetr Machata
Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05mlxsw: reg: Add QoS Switch Traffic Class Table is Multicast-Aware RegisterPetr Machata
This register configures if the Switch Priority to Traffic Class mapping is based on Multicast packet indication. If so, then multicast packets will get a Traffic Class that is plus (cap_max_tclass_data/2) the value configured by QTCT. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05virtio-net: mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1402059 ("Missing break in switch") Addresses-Coverity-ID: 1402060 ("Missing break in switch") Addresses-Coverity-ID: 1402061 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05net: sched: cls_flower: Fix an error code in fl_tmplt_create()Dan Carpenter
We forgot to set the error code on this path, so we return NULL instead of an error pointer. In the current code kzalloc() won't fail for small allocations so this doesn't really affect runtime. Fixes: b95ec7eb3b4d ("net: sched: cls_flower: implement chain templates") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05net: check extack._msg before printLi RongQing
dev_set_mtu_ext is able to fail with a valid mtu value, at that condition, extack._msg is not set and random since it is in stack, then kernel will crash when print it. Fixes: 7a4c53bee3324a ("net: report invalid mtu value via netlink extack") Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05ipv6: fix double refcount of fib6_metricsCong Wang
All the callers of ip6_rt_copy_init()/rt6_set_from() hold refcnt of the "from" fib6_info, so there is no need to hold fib6_metrics refcnt again, because fib6_metrics refcnt is only released when fib6_info is gone, that is, they have the same life time, so the whole fib6_metrics refcnt can be removed actually. This fixes a kmemleak warning reported by Sabrina. Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes") Reported-by: Sabrina Dubroca <sd@queasysnail.net> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05ipv6: defrag: drop non-last frags smaller than min mtuFlorian Westphal
don't bother with pathological cases, they only waste cycles. IPv6 requires a minimum MTU of 1280 so we should never see fragments smaller than this (except last frag). v3: don't use awkward "-offset + len" v2: drop IPv4 part, which added same check w. IPV4_MIN_MTU (68). There were concerns that there could be even smaller frags generated by intermediate nodes, e.g. on radio networks. Cc: Peter Oskolkov <posk@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05Merge branch 'ip-Use-rb-trees-for-IP-frag-queue'David S. Miller
Peter Oskolkov says: ==================== ip: Use rb trees for IP frag queue. This patchset * changes IPv4 defrag behavior to match that of IPv6: overlapping fragments now cause the whole IP datagram to be discarded (suggested by David Miller): there are no legitimate use cases for overlapping fragments; * changes IPv4 defrag queue from a list to a rb tree (suggested by Eric Dumazet): this change removes a potential attach vector. Upcoming patches will contain similar changes for IPv6 frag queue, as well as a comprehensive IP defrag self-test (temporarily delayed). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05ip: use rb trees for IP frag queue.Peter Oskolkov
Similar to TCP OOO RX queue, it makes sense to use rb trees to store IP fragments, so that OOO fragments are inserted faster. Tested: - a follow-up patch contains a rather comprehensive ip defrag self-test (functional) - ran neper `udp_stream -c -H <host> -F 100 -l 300 -T 20`: netstat --statistics Ip: 282078937 total packets received 0 forwarded 0 incoming packets discarded 946760 incoming packets delivered 18743456 requests sent out 101 fragments dropped after timeout 282077129 reassemblies required 944952 packets reassembled ok 262734239 packet reassembles failed (The numbers/stats above are somewhat better re: reassemblies vs a kernel without this patchset. More comprehensive performance testing TBD). Reported-by: Jann Horn <jannh@google.com> Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>