summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-05-16scsi: lpfc: Fix used-RPI accounting problem.James Smart
With 255 vports created a link trasition can casue a crash. When going through discovery after a link bounce the driver is using rpis before the cmd FCOE_POST_HDR_TEMPLATES completes. By doing that the next rpi bumps the rpi range out of the boundary. The fix it to increment the next_rpi only when the FCOE_POST_HDR_TEMPLATE succeeds. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16scsi: libfc: fix incorrect variable assignmentGustavo A. R. Silva
Previous assignment was causing the use of the uninitialized variable _explan_ inside fc_seq_ls_rjt() function, which in this particular case is being called by fc_seq_els_rsp_send(). [mkp: fixed typo] Addresses-Coverity-ID: 1398125 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16scsi: sd: Ignore sync cache failures when not supportedDerek Basehore
Some external hard drives don't support the sync command even though the hard drive has write cache enabled. In this case, upon suspend request, sync cache failures are ignored if the error code in the sense header is ILLEGAL_REQUEST. There's not much we can do for these drives, so we shouldn't fail to suspend for this error case. The drive may stay powered if that's the setup for the port it's plugged into. Signed-off-by: Derek Basehore <dbasehore@chromium.org> Signed-off-by: Thierry Escande <thierry.escande@collabora.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-17drm/nouveau/fifo/gk104-: Silence a locking warningDan Carpenter
Presumably we can never actually hit this return, but static checkers complain that we should unlock before we return. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-05-17drm/nouveau/secboot: plug memory leak in ls_ucode_img_load_gr() error pathChristophe JAILLET
The last goto looks spurious because it releases less resources than the previous one. Also free 'img->sig' if 'ls_ucode_img_build()' fails. Fixes: 9d896f3e41a6 ("drm/nouveau/secboot: abstract LS firmware loading functions") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-05-17drm/nouveau: Fix drm poll_helper handlingPeter Ujfalusi
Commit cae9ff036eea effectively disabled the drm poll_helper by checking the wrong flag to see if the driver should enable the poll or not: mode_config.poll_enabled is only set to true by poll_init and it is not indicating if the poll is enabled or not. nouveau_display_create() will initialize the poll and going to disable it right away. After poll_init() the mode_config.poll_enabled will be true, but the poll itself is disabled. To avoid the race caused by calling the poll_enable() from different paths, this patch will enable the poll from one place, in the nouveau_display_hpd_work(). In case the pm_runtime is disabled we will enable the poll in nouveau_drm_load() once. Fixes: cae9ff036eea ("drm/nouveau: Don't enabling polling twice on runtime resume") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Reviewed-by: Lyude <lyude@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-05-16i2c: mv64xxx: don't override deferred probing when getting irqThomas Petazzoni
There is no reason to use platform_get_irq() for non-DT probing and irq_of_parse_and_map() for DT probing. Indeed, platform_get_irq() works fine for both. In addition, using platform_get_irq() properly returns -EPROBE_DEFER when the interrupt controller is not yet available, so instead of inventing our own error code (-ENXIO), return the one provided by platform_get_irq(). Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-05-16uio: fix incorrect memory leak cleanupSuman Anna
Commit 75f0aef6220d ("uio: fix memory leak") has fixed up some memory leaks during the failure paths of the addition of uio attributes, but still is not correct entirely. A kobject_uevent() failure still needs a kobject_put() and the kobject container structure allocation failure before the kobject_init() doesn't need a kobject_put(). Fix this properly. Fixes: 75f0aef6220d ("uio: fix memory leak") Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-16misc: pci_endpoint_test: select CRC32Tobias Regnery
There is the following link error with CONFIG_PCI_ENDPOINT_TEST=y and CONFIG_CRC32=m: drivers/built-in.o: In function 'pci_endpoint_test_ioctl': pci_endpoint_test.c:(.text+0xf1251): undefined reference to 'crc32_le' pci_endpoint_test.c:(.text+0xf1322): undefined reference to 'crc32_le' pci_endpoint_test.c:(.text+0xf13b2): undefined reference to 'crc32_le' pci_endpoint_test.c:(.text+0xf141e): undefined reference to 'crc32_le' Fix this by selecting CRC32 in the PCI_ENDPOINT_TEST kconfig entry. Fixes: 2c156ac71c6b ("misc: Add host side PCI driver for PCI test function device") Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-16char: lp: fix possible integer overflow in lp_setup()Willy Tarreau
The lp_setup() code doesn't apply any bounds checking when passing "lp=none", and only in this case, resulting in an overflow of the parport_nr[] array. All versions in Git history are affected. Reported-By: Roee Hay <roee.hay@hcl.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-16Merge tag 'pstore-v4.12-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore fix from Kees Cook: "Fix bad EFI vars iterator usage" * tag 'pstore-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: efi-pstore: Fix read iter after pstore API refactor
2017-05-16liquidio: use pcie_flr instead of duplicating itChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16net: phy: Remove residual magic from PHY driversAndrew Lunn
commit fa8cddaf903c ("net phylib: Remove unnecessary condition check in phy") removed the only place where the PHY flag PHY_HAS_MAGICANEG was checked. But it left the flag being set in the drivers. Remove the flag. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16bnx2x: Remove open coded carrier checkLeon Romanovsky
There is inline function to test if carrier present, so it makes open-coded solution redundant. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16mlx5e: add CONFIG_INET dependencyArnd Bergmann
We now reference the arp_tbl, which requires IPv4 support to be enabled in the kernel, otherwise we get a link error: drivers/net/built-in.o: In function `mlx5e_tc_update_neigh_used_value': (.text+0x16afec): undefined reference to `arp_tbl' drivers/net/built-in.o: In function `mlx5e_rep_neigh_init': en_rep.c:(.text+0x16c16d): undefined reference to `arp_tbl' drivers/net/built-in.o: In function `mlx5e_rep_netevent_event': en_rep.c:(.text+0x16cbb5): undefined reference to `arp_tbl' This adds a Kconfig dependency for it. Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16tcp: internal implementation for pacingEric Dumazet
BBR congestion control depends on pacing, and pacing is currently handled by sch_fq packet scheduler for performance reasons, and also because implemening pacing with FQ was convenient to truly avoid bursts. However there are many cases where this packet scheduler constraint is not practical. - Many linux hosts are not focusing on handling thousands of TCP flows in the most efficient way. - Some routers use fq_codel or other AQM, but still would like to use BBR for the few TCP flows they initiate/terminate. This patch implements an automatic fallback to internal pacing. Pacing is requested either by BBR or use of SO_MAX_PACING_RATE option. If sch_fq happens to be in the egress path, pacing is delegated to the qdisc, otherwise pacing is done by TCP itself. One advantage of pacing from TCP stack is to get more precise rtt estimations, and less work done from TX completion, since TCP Small queue limits are not generally hit. Setups with single TX queue but many cpus might even benefit from this. Note that unlike sch_fq, we do not take into account header sizes. Taking care of these headers would add additional complexity for no practical differences in behavior. Some performance numbers using 800 TCP_STREAM flows rate limited to ~48 Mbit per second on 40Gbit NIC. If MQ+pfifo_fast is used on the NIC : $ sar -n DEV 1 5 | grep eth 14:48:44 eth0 725743.00 2932134.00 46776.76 4335184.68 0.00 0.00 1.00 14:48:45 eth0 725349.00 2932112.00 46751.86 4335158.90 0.00 0.00 0.00 14:48:46 eth0 725101.00 2931153.00 46735.07 4333748.63 0.00 0.00 0.00 14:48:47 eth0 725099.00 2931161.00 46735.11 4333760.44 0.00 0.00 1.00 14:48:48 eth0 725160.00 2931731.00 46738.88 4334606.07 0.00 0.00 0.00 Average: eth0 725290.40 2931658.20 46747.54 4334491.74 0.00 0.00 0.40 $ vmstat 1 5 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 4 0 0 259825920 45644 2708324 0 0 21 2 247 98 0 0 100 0 0 4 0 0 259823744 45644 2708356 0 0 0 0 2400825 159843 0 19 81 0 0 0 0 0 259824208 45644 2708072 0 0 0 0 2407351 159929 0 19 81 0 0 1 0 0 259824592 45644 2708128 0 0 0 0 2405183 160386 0 19 80 0 0 1 0 0 259824272 45644 2707868 0 0 0 32 2396361 158037 0 19 81 0 0 Now use MQ+FQ : lpaa23:~# echo fq >/proc/sys/net/core/default_qdisc lpaa23:~# tc qdisc replace dev eth0 root mq $ sar -n DEV 1 5 | grep eth 14:49:57 eth0 678614.00 2727930.00 43739.13 4033279.14 0.00 0.00 0.00 14:49:58 eth0 677620.00 2723971.00 43674.69 4027429.62 0.00 0.00 1.00 14:49:59 eth0 676396.00 2719050.00 43596.83 4020125.02 0.00 0.00 0.00 14:50:00 eth0 675197.00 2714173.00 43518.62 4012938.90 0.00 0.00 1.00 14:50:01 eth0 676388.00 2719063.00 43595.47 4020171.64 0.00 0.00 0.00 Average: eth0 676843.00 2720837.40 43624.95 4022788.86 0.00 0.00 0.40 $ vmstat 1 5 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 2 0 0 259832240 46008 2710912 0 0 21 2 223 192 0 1 99 0 0 1 0 0 259832896 46008 2710744 0 0 0 0 1702206 198078 0 17 82 0 0 0 0 0 259830272 46008 2710596 0 0 0 0 1696340 197756 1 17 83 0 0 4 0 0 259829168 46024 2710584 0 0 16 0 1688472 197158 1 17 82 0 0 3 0 0 259830224 46024 2710408 0 0 0 0 1692450 197212 0 18 82 0 0 As expected, number of interrupts per second is very different. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Van Jacobson <vanj@google.com> Cc: Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16Merge branch 'udp-scalability-improvements'David S. Miller
Paolo Abeni says: ==================== udp: scalability improvements This patch series implement an idea suggested by Eric Dumazet to reduce the contention of the udp sk_receive_queue lock when the socket is under flood. An ancillary queue is added to the udp socket, and the socket always tries first to read packets from such queue. If it's empty, we splice the content from sk_receive_queue into the ancillary queue. The first patch introduces some helpers to keep the udp code small, and the following two implement the ancillary queue strategy. The code is split to hopefully help the reviewing process. The measured overall gain under udp flood is up to the 30% depending on the numa layout and the number of ingress queue used by the relevant nic. The performance numbers have been gathered using pktgen as sender, with 64 bytes packets, random src port on a host b2b connected via a 10Gbs link with the dut. The receiver used the udp_sink program by Jesper [1] and an h/w l4 rx hash on the ingress nic, so that the number of ingress nic rx queues hit by the udp traffic could be controlled via ethtool -L. The udp_sink program was bound to the first idle cpu, to get more stable numbers. On a single numa node receiver: nic rx queues vanilla patched kernel 1 1820 kpps 1900 kpps 2 1950 kpps 2500 kpps 16 1670 kpps 2120 kpps When using a single nic rx queue, busy polling was also enabled, elsewhere, in the above scenario, the bh processing becomes the bottle-neck and this produces large artifacts in the measured performances (e.g. improving the udp sink run time, decreases the overall tput, since more action from the scheduler comes into play). [1] https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c v1 -> v2: Patches 1/3 and 2/3 are unchanged, in patch 3/3 the rx_queue_lock_held param of udp_rmem_release() is now a bool. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16udp: keep the sk_receive_queue held when splicingPaolo Abeni
On packet reception, when we are forced to splice the sk_receive_queue, we can keep the related lock held, so that we can avoid re-acquiring it, if fwd memory scheduling is required. v1 -> v2: the rx_queue_lock_held param in udp_rmem_release() is now a bool Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16udp: use a separate rx queue for packet receptionPaolo Abeni
under udp flood the sk_receive_queue spinlock is heavily contended. This patch try to reduce the contention on such lock adding a second receive queue to the udp sockets; recvmsg() looks first in such queue and, only if empty, tries to fetch the data from sk_receive_queue. The latter is spliced into the newly added queue every time the receive path has to acquire the sk_receive_queue lock. The accounting of forward allocated memory is still protected with the sk_receive_queue lock, so udp_rmem_release() needs to acquire both locks when the forward deficit is flushed. On specific scenarios we can end up acquiring and releasing the sk_receive_queue lock multiple times; that will be covered by the next patch Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16net/sock: factor out dequeue/peek with offset codePaolo Abeni
And update __sk_queue_drop_skb() to work on the specified queue. This will help the udp protocol to use an additional private rx queue in a later patch. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16xfs: only return detailed fsmap info if the caller has CAP_SYS_ADMINDarrick J. Wong
There were a number of handwaving complaints that one could "possibly" use inode numbers and extent maps to fingerprint a filesystem hosting multiple containers and somehow use the information to guess at the contents of other containers and attack them. Despite the total lack of any demonstration that this is actually possible, it's easier to restrict access now and broaden it later, so use the rmapbt fsmap backends only if the caller has CAP_SYS_ADMIN. Unprivileged users will just have to make do with only getting the free space and static metadata placement information. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2017-05-16KVM: x86: lower default for halt_poll_nsPaolo Bonzini
In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to increase heavily even in cases where the performance improvement was small. In particular, bandwidth divided by CPU usage was as much as 60% lower. To some extent this is the expected effect of the patch, and the additional CPU utilization is only visible when running the benchmarks. However, halving the threshold also halves the extra CPU utilization (from +30-130% to +20-70%) and has no negative effect on performance. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-05-16dm bufio: make the parameter "retain_bytes" unsigned longMikulas Patocka
Change the type of the parameter "retain_bytes" from unsigned to unsigned long, so that on 64-bit machines the user can set more than 4GiB of data to be retained. Also, change the type of the variable "count" in the function "__evict_old_buffers" to unsigned long. The assignment "count = c->n_buffers[LIST_CLEAN] + c->n_buffers[LIST_DIRTY];" could result in unsigned long to unsigned overflow and that could result in buffers not being freed when they should. While at it, avoid division in get_retain_buffers(). Division is slow, we can change it to shift because we have precalculated the log2 of block size. Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-05-16net: Improve handling of failures on link and route dumpsDavid Ahern
In general, rtnetlink dumps do not anticipate failure to dump a single object (e.g., link or route) on a single pass. As both route and link objects have grown via more attributes, that is no longer a given. netlink dumps can handle a failure if the dump function returns an error; specifically, netlink_dump adds the return code to the response if it is <= 0 so userspace is notified of the failure. The missing piece is the rtnetlink dump functions returning the error. Fix route and link dump functions to return the errors if no object is added to an skb (detected by skb->len != 0). IPv6 route dumps (rt6_dump_route) already return the error; this patch updates IPv4 and link dumps. Other dump functions may need to be ajusted as well. Reported-by: Jan Moskyto Matejka <mq@ucw.cz> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16net/smc: Add warning about remote memory exposureChristoph Hellwig
The driver explicitly bypasses APIs to register all memory once a connection is made, and thus allows remote access to memory. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16smc: switch to usage of IB_PD_UNSAFE_GLOBAL_RKEYUrsula Braun
Currently, SMC enables remote access to physical memory when a user has successfully configured and established an SMC-connection until ten minutes after the last SMC connection is closed. Because this is considered a security risk, drivers are supposed to use IB_PD_UNSAFE_GLOBAL_RKEY in such a case. This patch changes the current SMC code to use IB_PD_UNSAFE_GLOBAL_RKEY. This improves user awareness, but does not remove the security risk itself. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16efi-pstore: Fix read iter after pstore API refactorKees Cook
During the internal pstore API refactoring, the EFI vars read entry was accidentally made to update a stack variable instead of the pstore private data pointer. This corrects the problem (and removes the now needless argument). Fixes: 125cc42baf8a ("pstore: Replace arguments for read() API") Signed-off-by: Kees Cook <keescook@chromium.org>
2017-05-16Merge branch 'nfp-LSO-checksum-and-XDP-datapath-updates'David S. Miller
Jakub Kicinski says: ==================== nfp: LSO, checksum and XDP datapath updates This series introduces a number of refinements to standard features like LSO and checksum offload. Three major features are support for CHECKSUM_COMPLETE, refinement of TSO handling and another small speed up for XDP TX. This series also switches from depending on some app FW<>driver ABI versions to heavier use of capabilities. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: eliminate an if statement in calculation of completed framesJakub Kicinski
Given that our rings are always a power of 2, we can simplify the calculation of number of completed TX descriptors by using masking instead of if statement based on whether the index have wrapped or not. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: add a helper for wrapping descriptor indexJakub Kicinski
We have a number of places where we calculate the descriptor index based on a value which may have overflown. Create a macro for masking with the ring size. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: complete the XDP TX ring only when it's fullJakub Kicinski
Since XDP TX ring holds "spare" RX buffers anyway, we don't have to rush the completion. We can wait until ring fills up completely before trying to reclaim buffers. If RX poll has ended an no buffer has been queued for XDP TX we have no guarantee we will see another interrupt, so run the reclaim there as well, to make sure TX statistics won't become stale. This should help us reclaim more buffers per single queue controller register read. Note that the XDP completion is very trivial, it only adds up the sizes of transmitted frames for statistics so the latency spike should be acceptable. In case user sets the ring sizes to something crazy, limit the completion to 2k entries. The check if the ring is empty at the beginning of xdp_complete() is no longer needed - the callers will perform it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: add CHECKSUM_COMPLETE supportJakub Kicinski
Introduce NFP_NET_CFG_CTRL_CSUM_COMPLETE capability and implement parsing of CHECKSUM_COMPLETE metadata. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: version independent support for chained RSS metadataEdwin Peer
ABI version 4 introduced metadata chaining. Using the ABI version to signal metadata chaining precludes firmware that advertises new capabilities which rely on prepended metadata from working on older kernels. Capability bits are thus better suited to signalling the chained metadata format. A new version of the RSS capability is introduced to distinguish between the differing metadata formats for ABI versions other than 4. Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: don't assume RSS and IRQ moderation are always enabledJakub Kicinski
Even if capability for RSS and IRQ moderation are present we may have not initialized them for control vNIC. Depend on selected features mask (ctrl) rather than capabilities (cap) to determine which features should be enabled. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: support LSO2 capabilityEdwin Peer
Firmware advertising the LSO2 capability exploits driver provided L3 and L4 offsets in order to avoid parsing packet headers in the TX path. The vlan field in struct nfp_net_tx_desc is repurposed, making TXVLAN a mutually exclusive configuration to LSO2. Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: rename l4_offset in struct nfp_net_tx_desc to lso_hdrlenEdwin Peer
The l4_offset field referred to by NFD is confusingly named. It is not the offset of the L4 transport header, but rather the L4 payload. The LSO2 capability supported by alternative device firmware requires the actual L4 offset, thus the rename seems prudent. Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16nfp: don't enable TSO on the device when disabledJakub Kicinski
We advertise TSO to the stack but leave it disabled by default. Make sure it's not only disabled in the netdev features but also on the device itself. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16Merge branch 'i2c-mux/for-current' of https://github.com/peda-r/i2c-mux into ↵Wolfram Sang
i2c/for-current Pull bugfixes from the i2c mux subsubsystem: This fixes an old bug in resource cleanup on failure in i2c-mux-reg and a new log spamming bug from this merge window in the i2c-mux core.
2017-05-16ipmr: vrf: Find VIFs using the actual deviceThomas Winter
The skb->dev that is passed into ip_mr_input is the loX device for VRFs. When we lookup a vif for this dev, none is found as we do not create vifs for loopbacks. Instead lookup a vif for the actual device that the packet was received on, eg the vlan. Signed-off-by: Thomas Winter <Thomas.Winter@alliedtelesis.co.nz> cc: David Ahern <dsa@cumulusnetworks.com> cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> cc: roopa <roopa@cumulusnetworks.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16tcp: eliminate negative reordering in tcp_clean_rtx_queueSoheil Hassas Yeganeh
tcp_ack() can call tcp_fragment() which may dededuct the value tp->fackets_out when MSS changes. When prior_fackets is larger than tp->fackets_out, tcp_clean_rtx_queue() can invoke tcp_update_reordering() with negative values. This results in absurd tp->reodering values higher than sysctl_tcp_max_reordering. Note that tcp_update_reordering indeeds sets tp->reordering to min(sysctl_tcp_max_reordering, metric), but because the comparison is signed, a negative metric always wins. Fixes: c7caf8d3ed7a ("[TCP]: Fix reord detection due to snd_una covered holes") Reported-by: Rebecca Isaacs <risaacs@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: - convert the debug feature to refcount_t - reduce the copy size for strncpy_from_user - 8 bug fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/virtio: change virtio_feature_desc:features type to __le32 s390: convert debug_info.ref_count from atomic_t to refcount_t s390: move _text symbol to address higher than zero s390/qdio: increase string buffer size s390/ccwgroup: increase string buffer size s390/topology: let topology_mnest_limit() return unsigned char s390/uaccess: use sane length for __strncpy_from_user() s390/uprobes: fix compile for !KPROBES s390/ftrace: fix compile for !MODULES s390/cputime: fix incorrect system time
2017-05-16xfs: bad assertion for delalloc an extent that start at i_sizeZorro Lang
By run fsstress long enough time enough in RHEL-7, I find an assertion failure (harder to reproduce on linux-4.11, but problem is still there): XFS: Assertion failed: (iflags & BMV_IF_DELALLOC) != 0, file: fs/xfs/xfs_bmap_util.c The assertion is in xfs_getbmap() funciton: if (map[i].br_startblock == DELAYSTARTBLOCK && --> map[i].br_startoff <= XFS_B_TO_FSB(mp, XFS_ISIZE(ip))) ASSERT((iflags & BMV_IF_DELALLOC) != 0); When map[i].br_startoff == XFS_B_TO_FSB(mp, XFS_ISIZE(ip)), the startoff is just at EOF. But we only need to make sure delalloc extents that are within EOF, not include EOF. Signed-off-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-05-16xfs: fix warnings about unused stack variablesDarrick J. Wong
Reduce stack usage and get rid of compiler warnings by eliminating unused variables. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2017-05-16xfs: BMAPX shouldn't barf on inline-format directoriesDarrick J. Wong
When we're fulfilling a BMAPX request, jump out early if the data fork is in local format. This prevents us from hitting a debugging check in bmapi_read and barfing errors back to userspace. The on-disk extent count check later isn't sufficient for IF_DELALLOC mode because da extents are in memory and not on disk. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-05-16xfs: fix indlen accounting error on partial delalloc conversionBrian Foster
The delalloc -> real block conversion path uses an incorrect calculation in the case where the middle part of a delalloc extent is being converted. This is documented as a rare situation because XFS generally attempts to maximize contiguity by converting as much of a delalloc extent as possible. If this situation does occur, the indlen reservation for the two new delalloc extents left behind by the conversion of the middle range is calculated and compared with the original reservation. If more blocks are required, the delta is allocated from the global block pool. This delta value can be characterized as the difference between the new total requirement (temp + temp2) and the currently available reservation minus those blocks that have already been allocated (startblockval(PREV.br_startblock) - allocated). The problem is that the current code does not account for previously allocated blocks correctly. It subtracts the current allocation count from the (new - old) delta rather than the old indlen reservation. This means that more indlen blocks than have been allocated end up stashed in the remaining extents and free space accounting is broken as a result. Fix up the calculation to subtract the allocated block count from the original extent indlen and thus correctly allocate the reservation delta based on the difference between the new total requirement and the unused blocks from the original reservation. Also remove a bogus assert that contradicts the fact that the new indlen reservation can be larger than the original indlen reservation. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-05-16Merge tag 'edac_fix_for_4.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC fix from Borislav Petkov: "A single amd64_edac fix correcting chip select sizes reporting on F17h" * tag 'edac_fix_for_4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, amd64: Fix reporting of Chip Select sizes on Fam17h
2017-05-16net: socket: mark socket protocol handler structs as constlinzhang
Signed-off-by: linzhang <xiaolou4617@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16tools: hv: Add clean up for included files in Ubuntu net configHaiyang Zhang
The clean up function is updated to cover duplicate config info in files included by "source" key word in Ubuntu network config. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16bnxt: add dma mapping attributesShannon Nelson
On the SPARC platform we need to use the DMA_ATTR_WEAK_ORDERING attribute in our Rx path dma mapping in order to get the expected performance out of the receive path. Adding it to the Tx path has little effect, so that's not a part of this patch. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Reviewed-by: Tushar Dave <tushar.n.dave@oracle.com> Reviewed-by: Tom Saeger <tom.saeger@oracle.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16Merge branch 'xgene-Add-ethtool-stats-and-bug-fixes'David S. Miller
Iyappan Subramanian says: ==================== drivers: net: xgene: Add ethtool stats and bug fixes This patch set, - adds ethtool extended statistics support - addresses errata workarounds - fixes bugs related to statistics v2: Address review comments from v1 - Adds lock to protect mdio-xgene indirect MAC access - Refactors xgene-enet indirect MAC read/write functions - Uses mdio-xgene MAC access routines, if xgene-enet port use the same HW. v1: - Initial version Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Quan Nguyen <qnguyen@apm.com> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>