Age | Commit message (Collapse) | Author |
|
This patch adds the multiqueue (VIRTIO_NET_F_MQ) support to virtio_net
driver. VIRTIO_NET_F_MQ capable device could allow the driver to do packet
transmission and reception through multiple queue pairs and does the packet
steering to get better performance. By default, one one queue pair is used, user
could change the number of queue pairs by ethtool in the next patch.
When multiple queue pairs is used and the number of queue pairs is equal to the
number of vcpus. Driver does the following optimizations to implement per-cpu
virt queue pairs:
- select the txq based on the smp processor id.
- smp affinity hint to the cpu that owns the queue pairs.
This could be used with the flow steering support of the device to guarantee the
packets of a single flow is handled by the same cpu.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To support multiqueue transmitq/receiveq, the first step is to separate queue
related structure from virtnet_info. This patch introduce send_queue and
receive_queue structure and use the pointer to them as the parameter in
functions handling sending/receiving.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds capability in vxlan to identify received
checksummed inner packets and signal them to the upper layers of
the stack. The driver needs to set the skb->encapsulation bit
and also set the skb->ip_summed to CHECKSUM_UNNECESSARY.
Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Allow VXLAN to make use of Tx checksum offloading and Tx scatter-gather.
The advantage to these two changes is that it also allows the VXLAN to
make use of GSO.
Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change allows the VXLAN to enable Tx checksum offloading even on
devices that do not support encapsulated checksum offloads. The
advantage to this is that it allows for the lower device to change due
to routing table changes without impacting features on the VXLAN itself.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds support in the kernel for offloading in the NIC Tx and Rx
checksumming for encapsulated packets (such as VXLAN and IP GRE).
For Tx encapsulation offload, the driver will need to set the right bits
in netdev->hw_enc_features. The protocol driver will have to set the
skb->encapsulation bit and populate the inner headers, so the NIC driver will
use those inner headers to calculate the csum in hardware.
For Rx encapsulation offload, the driver will need to set again the
skb->encapsulation flag and the skb->ip_csum to CHECKSUM_UNNECESSARY.
In that case the protocol driver should push the decapsulated packet up
to the stack, again with CHECKSUM_UNNECESSARY. In ether case, the protocol
driver should set the skb->encapsulation flag back to zero. Finally the
protocol driver should have NETIF_F_RXCSUM flag set in its features.
Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A SID could potentially be embedded inside of payload.value if there are
no subauthorities, and the arch has 8 byte pointers. Allow for that
possibility there.
While we're at it, rephrase the "embedding" check in terms of
key->payload to allow for the possibility that the union might change
size in the future.
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
It was hardcoded to 192 bytes, which was not enough when the max number
of subauthorities went to 15. Redefine this constant in terms of sizeof
the structs involved, and rename it for better clarity.
While we're at it, remove a couple more unused constants from cifsacl.h.
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
Now that we aren't so rigid about the length of the key being passed
in, we need to be a bit more rigorous about checking the length of
the actual data against the claimed length (a'la num_subauths field).
Check for the case where userspace sends us a seemingly valid key
with a num_subauths field that goes beyond the end of the array. If
that happens, return -EIO and invalidate the key.
Also change the other places where we check for malformed keys in this
code to invalidate the key as well.
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
The cifs.idmap keytype always allocates memory to hold the payload from
userspace. In the common case where we're translating a SID to a UID or
GID, we're allocating memory to hold something that's less than or equal
to the size of a pointer.
When the payload is the same size as a pointer or smaller, just store
it in the payload.value union member instead. That saves us an extra
allocation on the sid_to_id upcall.
Note that we have to take extra care to check the datalen when we
go to dereference the .data pointer in the union, but the callers
now check that as a matter of course anyway.
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
The cifs.idmap handling code currently causes the kernel to cache the
data from userspace twice. It first looks in a rbtree to see if there is
a matching entry for the given id. If there isn't then it calls
request_key which then checks its cache and then calls out to userland
if it doesn't have one. If the userland program establishes a mapping
and downcalls with that info, it then gets cached in the keyring and in
this rbtree.
Aside from the double memory usage and the performance penalty in doing
all of these extra copies, there are some nasty bugs in here too. The
code declares four rbtrees and spinlocks to protect them, but only seems
to use two of them. The upshot is that the same tree is used to hold
(eg) uid:sid and sid:uid mappings. The comparitors aren't equipped to
deal with that.
I think we'd be best off to remove a layer of caching in this code. If
this was originally done for performance reasons, then that really seems
like a premature optimization.
This patch does that -- it removes the rbtrees and the locks that
protect them and simply has the code do a request_key call on each call
into sid_to_id and id_to_sid. This greatly simplifies this code and
should roughly halve the memory utilization from using the idmapping
code.
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
GigaMAC registers have been reported left unitialized in several
situations:
- after cold boot from power-off state
- after S3 resume
Tweaking rtl_hw_phy_config takes care of both.
This patch removes an excess entry (",") at the end of the exgmac_reg
array as well.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Paul Gortmaker says:
====================
Changes since v1:
-get rid of essentially unused variable spotted by
Neil Horman (patch #2)
-drop patch #3; defer it for 3.9 content, so Neil,
Jon and Ying can discuss its specifics at their
leisure while net-next is closed. (It had no
direct dependencies to the rest of the series, and
was just an optimization)
-fix indentation of accept() code directly in place
vs. forking it out to a separate function (was patch
#10, now patch #9).
Rebuilt and re-ran tests just to ensure nothing odd happened.
Original v1 text follows, updated pull information follows that.
---------
Here is another batch of TIPC changes. The most interesting
thing is probably the non-blocking socket connect - I'm told
there were several users looking forward to seeing this.
Also there were some resource limitation changes that had
the right intent back in 2005, but were now apparently causing
needless limitations to people's real use cases; those have
been relaxed/removed.
There is a lockdep splat fix, but no need for a stable backport,
since it is virtually impossible to trigger in mainline; you
have to essentially modify code to force the probabilities
in your favour to see it.
The rest can largely be categorized as general cleanup of things
seen in the process of getting the above changes done.
Tested between 64 and 32 bit nodes with the test suite. I've
also compile tested all the individual commits on the chain.
I'd originally figured on this queue not being ready for 3.8, but
the extended stabilization window of 3.7 has changed that. On
the other hand, this can still be 3.9 material, if that simply
works better for folks - no problem for me to defer it to 2013.
If anyone spots any problems then I'll definitely defer it,
rather than rush a last minute respin.
===================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit c702418f8a2f ("mm: vmscan: do not keep kswapd looping forever due
to individual uncompactable zones") removed zone watermark checks from
the compaction code in kswapd but left in the zone congestion clearing,
which now happens unconditionally on higher order reclaim.
This messes up the reclaim throttling logic for zones with
dirty/writeback pages, where zones should only lose their congestion
status when their watermarks have been restored.
Remove the clearing from the zone compaction section entirely. The
preliminary zone check and the reclaim loop in kswapd will clear it if
the zone is considered balanced.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The direct-IO write path already had the i_size checks in mm/filemap.c,
but it turns out the read path did not, and removing the block size
checks in fs/block_dev.c (commit bbec0270bdd8: "blkdev_max_block: make
private to fs/buffer.c") removed the magic "shrink IO to past the end of
the device" code there.
Fix it by truncating the IO to the size of the block device, like the
write path already does.
NOTE! I suspect the write path would be *much* better off doing it this
way in fs/block_dev.c, rather than hidden deep in mm/filemap.c. The
mm/filemap.c code is extremely hard to follow, and has various
conditionals on the target being a block device (ie the flag passed in
to 'generic_write_checks()', along with a conditional update of the
inode timestamp etc).
It is also quite possible that we should treat this whole block device
size as a "s_maxbytes" issue, and try to make the logic even more
generic. However, in the meantime this is the fairly minimal targeted
fix.
Noted by Milan Broz thanks to a regression test for the cryptsetup
reencrypt tool.
Reported-and-tested-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
Pull ftrace updates from Steve Rostedt.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
Pull uprobes fixes, cleanups and preparation for the ARM port from Oleg Nesterov.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/core
Pull more cputime cleanups from Frederic Weisbecker:
* Get rid of underscores polluting the vtime namespace
* Consolidate context switch and tick handling
* Improve debuggability by detecting irq unsafe callers
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
Pull ftrace fixes from Steve Rostedt.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/core
Pull cputime cleanups from Frederic Weisbecker:
* Improve naming and code location
* Consolidate adjustment code
* Comment the adjustement code
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Conflicts:
tools/perf/Makefile
tools/perf/builtin-test.c
tools/perf/perf.h
tools/perf/tests/parse-events.c
tools/perf/util/evsel.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- UAPI fixes, from David Howels
- Separate perf tests into multiple objects, one per test, from Jiri Olsa.
- Fixes to /proc/pid/maps parsing, preparatory to supporting data maps,
from Namhyung Kim
- Fix compile error for non-NEWT builds, from Namhyung Kim
- Implement ui_progress for GTK, from Namhyung Kim
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pull networking fixes from David Miller:
"Two stragglers:
1) The new code that adds new flushing semantics to GRO can cause SKB
pointer list corruption, manage the lists differently to avoid the
OOPS. Fix from Eric Dumazet.
2) When TCP fast open does a retransmit of data in a SYN-ACK or
similar, we update retransmit state that we shouldn't triggering a
WARN_ON later. Fix from Yuchung Cheng."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: gro: fix possible panic in skb_gro_receive()
tcp: bug fix Fast Open client retransmission
|
|
As the current LEDs code breaks other platform, remove it.
It shall be replaced by a generic "MMIO LEDs" driver.
Reported-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
From Maxime Ripard:
Fixes in sunXi related drivers for 3.8
* tag 'sunxi-fixes-for-3.8' of git://github.com/mripard/linux:
irqchip: irq-sunxi: Add terminating entry for sunxi_irq_dt_ids
clocksource: sunxi_timer: Add terminating entry for sunxi_timer_dt_ids
|
|
In TIPC's accept() routine, there is a large block of code relating
to initialization of a new socket, all within an if condition checking
if the allocation succeeded.
Here, we simply flip the check of the if, so that the main execution
path stays at the same indentation level, which improves readability.
If the allocation fails, we jump to an already existing exit label.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
TIPC accept() call grabs the socket lock on a newly allocated
socket while holding the socket lock on an old socket. But lockdep
worries that this might be a recursive lock attempt:
[ INFO: possible recursive locking detected ]
---------------------------------------------
kworker/u:0/6 is trying to acquire lock:
(sk_lock-AF_TIPC){+.+.+.}, at: [<c8c1226c>] accept+0x15c/0x310 [tipc]
but task is already holding lock:
(sk_lock-AF_TIPC){+.+.+.}, at: [<c8c12138>] accept+0x28/0x310 [tipc]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(sk_lock-AF_TIPC);
lock(sk_lock-AF_TIPC);
*** DEADLOCK ***
May be due to missing lock nesting notation
[...]
Tell lockdep that this locking is safe by using lock_sock_nested().
This is similar to what was done in commit 5131a184a3458d9 for
SCTP code ("SCTP: lock_sock_nested in sctp_sock_migrate").
Also note that this is isn't something that is seen normally,
as it was uncovered with some experimental work-in-progress
code not yet ready for mainline. So no need for stable
backports or similar of this commit.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
As connection setup is now completed asynchronously in BH context,
in the function filter_connect(), the corresponding code in recv_msg()
becomes redundant.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
TIPC has so far only supported blocking connect(), meaning that a call
to connect() doesn't return until either the connection is fully
established, or an error occurs. This has proved insufficient for many
users, so we now introduce non-blocking connect(), analogous to how
this is done in TCP and other protocols.
With this feature, if a connection cannot be established instantly,
connect() will return the error code "-EINPROGRESS".
If the user later calls connect() again, he will either have the
return code "-EALREADY" or "-EISCONN", depending on whether the
connection has been established or not.
The user must have explicitly set the socket to be non-blocking
(SOCK_NONBLOCK or O_NONBLOCK, depending on method used), so unless
for some reason they had set this already (the socket would anyway
remain blocking in current TIPC) this change should be completely
backwards compatible.
It is also now possible to call select() or poll() to wait for the
completion of a connection.
An effect of the above is that the actual completion of a connection
may now be performed asynchronously, independent of the calls from
user space. Therefore, we now execute this code in BH context, in
the function filter_rcv(), which is executed upon reception of
messages in the socket.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: minor refactoring for improved connect/disconnect function names]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
Handling of connection-related message reception is currently scattered
around at different places in the code. This makes it harder to verify
that things are handled correctly in all possible scenarios.
So we consolidate the existing processing of connection-oriented
message reception in a single routine. In the process, we convert the
chain of if/else into a switch/case for improved readability.
A cast on the socket_state in the switch is needed to avoid compile
warnings on 32 bit, like "net/tipc/socket.c:1252:2: warning: case value
‘4294967295’ not in enumerated type". This happens because existing
tipc code pseudo extends the default linux socket state values with:
#define SS_LISTENING -1 /* socket is listening */
#define SS_READY -2 /* socket is connectionless */
It may make sense to add these as _positive_ values to the existing
socket state enum list someday, vs. these already existing defines.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: add cast to fix warning; remove returns from middle of switch]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
Currently we have tipc_disconnect and tipc_disconnect_port. It is
not clear from the names alone, what they do or how they differ.
It turns out that tipc_disconnect just deals with the port locking
and then calls tipc_disconnect_port which does all the work.
If we rename as follows: tipc_disconnect_port --> __tipc_disconnect
then we will be following typical linux convention, where:
__tipc_disconnect: "raw" function that does all the work.
tipc_disconnect: wrapper that deals with locking and then calls
the real core __tipc_disconnect function
With this, the difference is immediately evident, and locking
violations are more apt to be spotted by chance while working on,
or even just while reading the code.
On the connect side of things, we currently only have the single
"tipc_connect2port" function. It does both the locking at enter/exit,
and the core of the work. Pending changes will make it desireable to
have the connect be a two part locking wrapper + worker function,
just like the disconnect is already.
Here, we make the connect look just like the updated disconnect case,
for the above reason, and for consistency. In the process, we also
get rid of the "2port" suffix that was on the original name, since
it adds no descriptive value.
On close examination, one might notice that the above connect
changes implicitly move the call to tipc_link_get_max_pkt() to be
within the scope of tipc_port_lock() protected region; when it was
not previously. We don't see any issues with this, and it is in
keeping with __tipc_connect doing the work and tipc_connect just
handling the locking.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
The sk_recv_queue upper limit for connectionless sockets has empirically
turned out to be too low. When we double the current limit we get much
fewer rejected messages and no noticable negative side-effects.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
* acpi-general:
pnpacpi: fix incorrect TEST_ALPHA() test
ACPI / video: ignore BIOS initial backlight value for HP Folio 13-2000
ACPI : do not use Lid and Sleep button for S5 wakeup
|
|
* acpi-enumeration:
ACPI: add Haswell LPSS devices to acpi_platform_device_ids list
ACPI: add documentation about ACPI 5 enumeration
|
|
* acpi-dev-pm:
ACPI / PM: Fix header of acpi_dev_pm_detach() in acpi.h
|
|
* pm-devfreq: (23 commits)
PM / devfreq: remove compiler error with module governors (2)
PM / devfreq: Fix return value in devfreq_remove_governor()
PM / devfreq: Fix incorrect argument in error message
PM / devfreq: missing rcu_read_lock() added for find_device_opp()
PM / devfreq: remove compiler error when a governor is module
PM / devfreq: exynos4_bus.c: Fixed an alignment of the func call args.
PM / devfreq: Add sysfs node to expose available governors
PM / devfreq: allow sysfs governor node to switch governor
PM / devfreq: governors: add GPL module license and allow module build
PM / devfreq: map devfreq drivers to governor using name
PM / devfreq: register governors with devfreq framework
PM / devfreq: provide hooks for governors to be registered
PM / devfreq: export update_devfreq
PM / devfreq: Add sysfs node for representing frequency transition information.
PM / devfreq: Add sysfs node to expose available frequencies
PM / devfreq: documentation cleanups for devfreq header
PM / devfreq: Use devm_* functions in exynos4_bus.c
PM / devfreq: make devfreq_class static
PM / devfreq: fix sscanf handling for writable sysfs entries
PM / devfreq: kernel-doc typo corrections
...
|
|
All devices behind Haswell LPSS (Low Power Subsystem) should be represented
as platform devices so add them to the acpi_platform_device_ids list.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add a document that describes how to take advantage of ACPI enumeration for
buses like platform, I2C and SPI. In addition to that we document how to
translate ACPI GpioIo and GpioInt resources to be useful in Linux device
drivers.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
TEST_ALPHA() is broken and always returns 0.
[akpm@linux-foundation.org: return false for '@' as well, per Bjorn]
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
* pci/daniel-numachip:
x86/PCI: Add NumaChip remote PCI support
|
|
Add NumaChip-specific PCI access mechanism via MMCONFIG cycles, but
preventing access to AMD Northbridges which shouldn't respond.
Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
To handle error trigger table correctly, memory region must be
removed from request region. We had a series of patches to do this
culminating in:
commit b4e008dc5
ACPI, APEI, EINJ, Refine the fix of resource conflict
but when ACPI5 support was added, we missed updating this area. So
when using EINJ table on an ACPI5 enabled machine, we get following error:
APEI: Can not request [mem 0x526b80000-0x526b80007] for APEI EINJ
Trigger registers
Fix this by checking for the acpi5 case and using the same code
that was added earlier.
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
|
|
commit 2e71a6f8084e (net: gro: selective flush of packets) added
a bug for skbs using frag_list. This part of the GRO stack is rarely
used, as it needs skb not using a page fragment for their skb->head.
Most drivers do use a page fragment, but some of them use GFP_KERNEL
allocations for the initial fill of their RX ring buffer.
napi_gro_flush() overwrite skb->prev that was used for these skb to
point to the last skb in frag_list.
Fix this using a separate field in struct napi_gro_cb to point to the
last fragment.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If SYN-ACK partially acks SYN-data, the client retransmits the
remaining data by tcp_retransmit_skb(). This increments lost recovery
state variables like tp->retrans_out in Open state. If loss recovery
happens before the retransmission is acked, it triggers the WARN_ON
check in tcp_fastretrans_alert(). For example: the client sends
SYN-data, gets SYN-ACK acking only ISN, retransmits data, sends
another 4 data packets and get 3 dupacks.
Since the retransmission is not caused by network drop it should not
update the recovery state variables. Further the server may return a
smaller MSS than the cached MSS used for SYN-data, so the retranmission
needs a loop. Otherwise some data will not be retransmitted until timeout
or other loss recovery events.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since commit 2c60db037034 ('net: provide a default dev->ethtool_ops')
all devices have a non-null ethtool_ops. Test only
dev->ethtool_ops->get_link in both places where we care.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
V5: fix two bugs pointed out by Thomas
remove seq check for now, mark it as TODO
V4: remove some useless #include
some coding style fix
V3: drop debugging printk's
update selinux perm table as well
V2: drop patch 1/2, export ifindex directly
Redesign netlink attributes
Improve netlink seq check
Handle IPv6 addr as well
This patch exports bridge multicast database via netlink
message type RTM_GETMDB. Similar to fdb, but currently bridge-specific.
We may need to support modify multicast database too (RTM_{ADD,DEL}MDB).
(Thanks to Thomas for patient reviews)
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'secluded' is used to describe places, not suitable here.
Suggested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|