Age | Commit message (Collapse) | Author |
|
Hariprasad Shenai says:
====================
cxgb4/cxgb4vf: T5 BAR2 and ethtool related fixes
This series adds new interface to calculate BAR2 SGE queue register address for
cxgb4 and cxgb4vf driver and some more sge related fixes for T5. Also adds a
patch which updates the FW version displayed by ethtool after firmware flash.
The patches series is created against 'net-next' tree.
And includes patches on cxgb4 and cxgb4vf driver.
We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After successfully loading new firmware, reload the new firmware's version
number information so "ethtool -i", etc. will report the right value
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use BAR2 Going To Sleep (GTS) for T5 and later. Use new BAR2 User Doorbells for
T5 for both cxgb4 and cxgb4vf driver.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add new Common Code facilities for calculating T5 BAR2 Offsets for SGE Queue
Registers. This new code can handle situations where
Queues Per Page * SGE BAR2 Queue Register Area Size > Page Size
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add sge_vf_eq_qpp and sge_vf_iq_qpp to (struct sge_params), initialize
sge_queues_per_page and sge_vf_qpp in t4vf_get_sge_params(), add new
t4vf_prep_adapter() which initializes basic adapter parameters.
Grab both SGE_EGRESS_QUEUES_PER_PAGE_VF and SGE_INGRESS_QUEUES_PER_PAGE_VF
for VF Drivers since we need both to calculate the User Doorbell area
offsets for Egress and Ingress Queues.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Part of the old remote management feature is a piece of code
that checked permissions on the local system to see if a certain
operation was permitted, and if so pass the command to a remote
node. This serves no purpose after the removal of remote management
with commit 5902385a2440 ("tipc: obsolete the remote management
feature") so we remove it.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This kills the sparse warnings.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change 1G-SFP module detection by verifying not only that it's not
compliant with 10G-Ethernet, but also that it's 1G-ethernet compliant.
Signed-off-by: Yaniv Rosner <Yaniv.Rosner@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to fix the max coalesce timer setting that can be provided
by ethtool.
The default value (STMMAC_COAL_TX_TIMER) was used in the set_coalesce helper
instead of the max one (STMMAC_MAX_COAL_TX_TICK, so defined but not used).
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To accomodate for enough headroom for tunnels, use MAX_HEADER instead
of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs
of trinity an skb_under_panic() via SCTP output path (see reference).
I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere
in other protocols might be one possible cause for this.
In any case, it looks like accounting on chunks themself seems to look
good as the skb already passed the SCTP output path and did not hit
any skb_over_panic(). Given tunneling was enabled in his .config, the
headroom would have been expanded by MAX_HEADER in this case.
Reported-by: Robert Święcki <robert@swiecki.net>
Reference: https://lkml.org/lkml/2014/12/1/507
Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Don't let T4 firmware flash on a T5 adapter and vice-versa
using ethtool
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
xchg is atomic, so there is no necessary to use spin_lock/spin_unlock
to protect it. At last, remove the redundant
opt = xchg(&inet6_sk(sk)->opt, opt); statement.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tom Lendacky says:
====================
amd-xgbe: AMD XGBE driver fixes 2014-12-02
The following series of patches includes two bug fixes. Unfortunately,
the first patch will create a conflict when eventually merged into
net-next but should be very easy to resolve.
- Do not clear the interrupt bit in the xgbe_ring_data structure
- Associate a Tx SKB with the proper xgbe_ring_data structure
This patch series is based on net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The SKB for a Tx packet is associated with an xgbe_ring_data structure
in the xgbe_map_tx_skb function. However, it is being saved in the
structure after the last structure used when the SKB is mapped. Use
the last used structure to save the SKB value.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The interrupt value within the xgbe_ring_data structure is used as an
indicator of which Rx descriptor should have the INTE bit set to
generate an interrupt when that Rx descriptor is used. This bit was
mistakenly cleared in the xgbe_unmap_rdata function, effectively
nullifying the ethtool rx-frames support.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When requesting an irq, the name passed in must be (part of) allocated
memory. The irq name was a local variable and resulted in random
characters when listing /proc/interrupts. Add a character field to the
xgbe_channel structure to hold the irq name and use that.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is nice kernel helper to escape a given strings by provided rules. Let's
use it instead of custom approach.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[bfields@redhat.com: fix length calculation]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
...move the WARN_ON_ONCE inside the following if block since they use
the same condition.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Bugs similar to the one in acbbe6fbb240 (kcmp: fix standard comparison
bug) are in rich supply.
In this variant, the problem is that struct xdr_netobj::len has type
unsigned int, so the expression o1->len - o2->len _also_ has type
unsigned int; it has completely well-defined semantics, and the result
is some non-negative integer, which is always representable in a long
long. But this means that if the conditional triggers, we are
guaranteed to return a positive value from compare_blob.
In this case it could be fixed by
- res = o1->len - o2->len;
+ res = (long long)o1->len - (long long)o2->len;
but I'd rather eliminate the usually broken 'return a - b;' idiom.
Reviewed-by: Jeff Layton <jlayton@primarydata.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
These were useful when I was tracking down a race condition between
svc_xprt_do_enqueue and svc_get_next_xprt.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Testing has shown that the pool->sp_lock can be a bottleneck on a busy
server. Every time data is received on a socket, the server must take
that lock in order to dequeue a thread from the sp_threads list.
Address this problem by eliminating the sp_threads list (which contains
threads that are currently idle) and replacing it with a RQ_BUSY flag in
svc_rqst. This allows us to walk the sp_all_threads list under the
rcu_read_lock and find a suitable thread for the xprt by doing a
test_and_set_bit.
Note that we do still have a potential atomicity problem however with
this approach. We don't want svc_xprt_do_enqueue to set the
rqst->rq_xprt pointer unless a test_and_set_bit of RQ_BUSY returned
zero (which indicates that the thread was idle). But, by the time we
check that, the bit could be flipped by a waking thread.
To address this, we acquire a new per-rqst spinlock (rq_lock) and take
that before doing the test_and_set_bit. If that returns false, then we
can set rq_xprt and drop the spinlock. Then, when the thread wakes up,
it must set the bit under the same spinlock and can trust that if it was
already set then the rq_xprt is also properly set.
With this scheme, the case where we have an idle thread no longer needs
to take the highly contended pool->sp_lock at all, and that removes the
bottleneck.
That still leaves one issue: What of the case where we walk the whole
sp_all_threads list and don't find an idle thread? Because the search is
lockess, it's possible for the queueing to race with a thread that is
going to sleep. To address that, we queue the xprt and then search again.
If we find an idle thread at that point, we can't attach the xprt to it
directly since that might race with a different thread waking up and
finding it. All we can do is wake the idle thread back up and let it
attempt to find the now-queued xprt.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Chris Worley <chris.worley@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
In a later patch, we'll be removing some spinlocking around the socket
and thread queueing code in order to fix some contention problems. At
that point, the stats counters will no longer be protected by the
sp_lock.
Change the counters to atomic_long_t fields, except for the
"sockets_queued" counter which will still be manipulated under a
spinlock.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Chris Worley <chris.worley@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
...also make the manipulation of sp_all_threads list use RCU-friendly
functions.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Chris Worley <chris.worley@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Currently all svc_create callers pass in NULL for the shutdown parm,
which then gets fixed up to be svc_rpcb_cleanup if the service uses
rpcbind.
Simplify this by instead having the the only caller that requires it
(lockd) pass in svc_rpcb_cleanup and get rid of the special casing.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The way that svc_wake_up works is a bit inefficient. It walks all of the
available pools for a service and either wakes up a task in each one or
sets the SP_TASK_PENDING flag in each one.
When svc_wake_up is called, there is no need to wake up more than one
thread to do this work. In practice, only lockd currently uses this
function and it's single threaded anyway. Thus, this just boils down to
doing a wake up of a thread in pool 0 or setting a single flag.
Eliminate the for loop in this function and change it to just operate on
pool 0. Also update the comments that sit above it and get rid of some
code that has been commented out for years now.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
In a later patch, we'll want to be able to handle this flag without
holding the sp_lock. Change this field to an unsigned long flags
field, and declare a new flag in it that can be managed with atomic
bitops.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
There are a couple of holes in the svc_rqst field on x86_64. Move the
rq_cachetype to a different location to eliminate both of them.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
In a later patch, we're going to need some atomic bit flags. Since that
field will need to be an unsigned long, we mitigate that space
consumption by migrating some other bitflags to the new field. Start
with the rq_secure flag.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Mainly what I need is 860a0d9e511f "sunrpc: add some tracepoints in
svc_rqst handling functions", which subsequent server rpc patches from
jlayton depend on. I'm merging this later tag on the assumption that's
more likely to be a tested and stable point.
|
|
Suppose that a system has two CPU sockets, three cores per socket,
that it does not support hyperthreading and that four hardware
queues are provided by a block driver. With the current algorithm
this will lead to the following assignment of CPU cores to hardware
queues:
HWQ 0: 0 1
HWQ 1: 2 3
HWQ 2: 4 5
HWQ 3: (none)
This patch changes the queue assignment into:
HWQ 0: 0 1
HWQ 1: 2
HWQ 2: 3 4
HWQ 3: 5
In other words, this patch has the following three effects:
- All four hardware queues are used instead of only three.
- CPU cores are spread more evenly over hardware queues. For the
above example the range of the number of CPU cores associated
with a single HWQ is reduced from [0..2] to [1..2].
- If the number of HWQ's is a multiple of the number of CPU sockets
it is now guaranteed that all CPU cores associated with a single
HWQ reside on the same CPU socket.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Remove a superfluous finish_wait() call. Convert the two bt_wait_ptr()
calls into a single call.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
What we need is the following two guarantees:
* Any thread that observes the effect of the test_and_set_bit() by
__bt_get_word() also observes the preceding addition of 'current'
to the appropriate wait list. This is guaranteed by the semantics
of the spin_unlock() operation performed by prepare_and_wait().
Hence the conversion of test_and_set_bit_lock() into
test_and_set_bit().
* The wait lists are examined by bt_clear() after the tag bit has
been cleared. clear_bit_unlock() guarantees that any thread that
observes that the bit has been cleared also observes the store
operations preceding clear_bit_unlock(). However,
clear_bit_unlock() does not prevent that the wait lists are examined
before that the tag bit is cleared. Hence the addition of a memory
barrier between clear_bit() and the wait list examination.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
If __bt_get_word() is called with last_tag != 0, if the first
find_next_zero_bit() fails, if after wrap-around the
test_and_set_bit() call fails and find_next_zero_bit() succeeds,
if the next test_and_set_bit() call fails and subsequently
find_next_zero_bit() does not find a zero bit, then another
wrap-around will occur. Avoid this by introducing an additional
local variable.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
blk-mq users are allowed to free the memory request_queue.tag_set
points at after blk_cleanup_queue() has finished but before
blk_release_queue() has started. This can happen e.g. in the SCSI
core. The SCSI core namely embeds the tag_set structure in a SCSI
host structure. The SCSI host structure is freed by
scsi_host_dev_release(). This function is called after
blk_cleanup_queue() finished but can be called before
blk_release_queue().
This means that it is not safe to access request_queue.tag_set from
inside blk_release_queue(). Hence remove the blk_sync_queue() call
from blk_release_queue(). This call is not necessary - outstanding
requests must have finished before blk_release_queue() is
called. Additionally, move the blk_mq_free_queue() call from
blk_release_queue() to blk_cleanup_queue() to avoid that struct
request_queue.tag_set gets accessed after it has been freed.
This patch avoids that the following kernel oops can be triggered
when deleting a SCSI host for which scsi-mq was enabled:
Call Trace:
[<ffffffff8109a7c4>] lock_acquire+0xc4/0x270
[<ffffffff814ce111>] mutex_lock_nested+0x61/0x380
[<ffffffff812575f0>] blk_mq_free_queue+0x30/0x180
[<ffffffff8124d654>] blk_release_queue+0x84/0xd0
[<ffffffff8126c29b>] kobject_cleanup+0x7b/0x1a0
[<ffffffff8126c140>] kobject_put+0x30/0x70
[<ffffffff81245895>] blk_put_queue+0x15/0x20
[<ffffffff8125c409>] disk_release+0x99/0xd0
[<ffffffff8133d056>] device_release+0x36/0xb0
[<ffffffff8126c29b>] kobject_cleanup+0x7b/0x1a0
[<ffffffff8126c140>] kobject_put+0x30/0x70
[<ffffffff8125a78a>] put_disk+0x1a/0x20
[<ffffffff811d4cb5>] __blkdev_put+0x135/0x1b0
[<ffffffff811d56a0>] blkdev_put+0x50/0x160
[<ffffffff81199eb4>] kill_block_super+0x44/0x70
[<ffffffff8119a2a4>] deactivate_locked_super+0x44/0x60
[<ffffffff8119a87e>] deactivate_super+0x4e/0x70
[<ffffffff811b9833>] cleanup_mnt+0x43/0x90
[<ffffffff811b98d2>] __cleanup_mnt+0x12/0x20
[<ffffffff8107252c>] task_work_run+0xac/0xe0
[<ffffffff81002c01>] do_notify_resume+0x61/0xa0
[<ffffffff814d2c58>] int_signal+0x12/0x17
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robert Elliott <elliott@hp.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This will make it easy for transports to validate features and return
failure.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Legacy balloon device doesn't pretend to support revision 1 or 64 bit
features.
But just in case someone implements a broken one that does, let's not
even try to drive legacy only devices using revision 1, and let's not
give them a chance to say they support VIRTIO_F_VERSION_1 by not reading
or writing high feature bits.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
|
|
transports need to be able to detect legacy-only
devices (ATM balloon only) to use legacy path
to drive them.
Add a core API to do just that.
The implementation just blacklists balloon:
not too pretty, but let's not over-engineer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
|
|
CHECK drivers/char/virtio_console.c
drivers/char/virtio_console.c:687:36: warning: incorrect type in
argument 1 (different address spaces)
drivers/char/virtio_console.c:687:36: expected void [noderef]
<asn:1>*to
drivers/char/virtio_console.c:687:36: got char *out_buf
drivers/char/virtio_console.c:790:35: warning: incorrect type in
argument 2 (different address spaces)
drivers/char/virtio_console.c:790:35: expected char *out_buf
drivers/char/virtio_console.c:790:35: got char [noderef]
<asn:1>*ubuf
fill_readbuf is reused with both kernel and userspace pointers,
depending on value of to_user flag.
Tag address parameter as __user, and cast to/from regular pointer type
when we know it's safe.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Core activates this bit automatically now,
drop it from drivers that set it explicitly.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Activate VIRTIO_F_VERSION_1 automatically unless legacy_only
is set.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
We have no plans to support virtio 1.0 in balloon driver. Add an
explicit flag to mark it legacy only.
This will be used by follow-up patches.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Pretty straight-forward, just use accessors for all fields.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This merely fixes sparse warnings, without actually
adding support for the new APIs.
Still working out the best way to enable the new
functionality.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|