Age | Commit message (Collapse) | Author |
|
Wei Liu says:
====================
This is rebased version of Andrew's V8 patch series. The original cover letter:
--------------------
xen-net{back, front}: Multiple transmit and receive queues
This patch series implements multiple transmit and receive queues (i.e.
multiple shared rings) for the xen virtual network interfaces.
The series is split up as follows:
- Patch 1 brings the 'grant_copy_op' array back into struct xenvif, in
preparation for multi-queue support. See the patch itself for more details.
- Patches 2 and 4 factor out the queue-specific data for netback and
netfront respectively, and modify the rest of the code to use these
as appropriate.
- Patches 3 and 5 introduce new XenStore keys to negotiate and use
multiple shared rings and event channels, and code to connect these
as appropriate.
- Patch 6 documents the XenStore keys required for the new feature
in include/xen/interface/io/netif.h
All other transmit and receive processing remains unchanged, i.e. there
is a kthread per queue and a NAPI context per queue.
The performance of these patches has been analysed in detail, with
results available at:
http://wiki.xenproject.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing
To summarise:
* Using multiple queues allows a VM to transmit at line rate on a 10
Gbit/s NIC, compared with a maximum aggregate throughput of 6 Gbit/s
with a single queue.
* For intra-host VM--VM traffic, eight queues provide 171% of the
throughput of a single queue; almost 12 Gbit/s instead of 6 Gbit/s.
* There is a corresponding increase in total CPU usage, i.e. this is a
scaling out over available resources, not an efficiency improvement.
* Results depend on the availability of sufficient CPUs, as well as the
distribution of interrupts and the distribution of TCP streams across
the queues.
Queue selection is currently achieved via an L4 hash on the packet (i.e.
TCP src/dst port, IP src/dst address) and is not negotiated between the
frontend and backend, since only one option exists. Future patches to
support other frontends (particularly Windows) will need to add some
capability to negotiate not only the hash algorithm selection, but also
allow the frontend to specify some parameters to this.
Note that queue selection is a decision by the transmitting system about
which queue to use for a particular packet. In general, the algorithm
may differ between the frontend and the backend with no adverse effects.
Queue-specific XenStore entries for ring references and event channels
are stored hierarchically, i.e. under .../queue-N/... where N varies
from 0 to one less than the requested number of queues (inclusive). If
only one queue is requested, it falls back to the flat structure where
the ring references and event channels are written at the same level as
other vif information.
V8:
- Squash the queue error handling code into patch 3.
- Update the documentation (patch 6) according to comments on the
equivalent patch to Xen.
V7:
- Rebase on latest net-next, which includes the netback grant mapping
patch series from Zoltan Kiss
- Reduce QUEUE_NAME_SIZE by 1 to avoid double-counting the trailing '\0'
- Simplify the queue hashing by using (hash % num_queues) instead of
multiply & shift.
- Add ratelimited warning for invalid queue selection.
- Fix error handling to correctly tear down already setup queues.
- Use dev->real_num_tx_queues instead of separately maintaining a
count of the number of queues.
V6:
- Use 'max_queues' as the module param. name for both netback and netfront.
V5:
- Fix bug in xenvif_free() that could lead to an attempt to transmit an
skb after the queue structures had been freed.
- Improve the XenStore protocol documentation in netif.h.
- Fix IRQ_NAME_SIZE double-accounting for null terminator.
- Move rx_gso_checksum_fixup stat into struct xenvif_stats (per-queue).
- Don't initialise a local variable that is set in both branches (xspath).
V4:
- Add MODULE_PARM_DESC() for the multi-queue parameters for netback
and netfront modules.
- Move del_timer_sync() in netfront to after unregister_netdev, which
restores the order in which these functions were called before applying
these patches.
V3:
- Further indentation and style fixups.
V2:
- Rebase onto net-next.
- Change queue->number to queue->id.
- Add atomic operations around the small number of stats variables that
are not queue-specific or per-cpu.
- Fixup formatting and style issues.
- XenStore protocol changes documented in netif.h.
- Default max. number of queues to num_online_cpus().
- Check requested number of queues does not exceed maximum.
--------------------
I rebased this on top of net-next. No functional change is introduced. The
patch that needed some extra care was "xen-netback: Factor queue-specific data
into queue struct" because it clashed with a fix introduced in net. A simple
test of creating guest, iperf, then shutting down guest worked as expected.
The last patch fixes a minor problem that queue name is not initialised in
xen-netfront, resulting in names like "-tx" "-rx" in /proc/interrupt.
Changes since v9 (no functional change introduced):
* include commit summary in the commit message of first patch
* fold David Vrabel's Reviewed-by into last patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Document the multi-queue feature in terms of XenStore keys to be written
by the backend and by the frontend.
Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Build on the refactoring of the previous patch to implement multiple
queues between xen-netfront and xen-netback.
Check XenStore for multi-queue support, and set up the rings and event
channels accordingly.
Write ring references and event channels to XenStore in a queue
hierarchy if appropriate, or flat when using only one queue.
Update the xennet_select_queue() function to choose the queue on which
to transmit a packet based on the skb hash result.
Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In preparation for multi-queue support in xen-netfront, move the
queue-specific data from struct netfront_info to struct netfront_queue,
and update the rest of the code to use this.
Also adds loops over queues where appropriate, even though only one is
configured at this point, and uses alloc_etherdev_mq() and the
corresponding multi-queue netif wake/start/stop functions in preparation
for multiple active queues.
Finally, implements a trivial queue selection function suitable for
ndo_select_queue, which simply returns 0, selecting the first (and
only) queue.
Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Builds on the refactoring of the previous patch to implement multiple
queues between xen-netfront and xen-netback.
Writes the maximum supported number of queues into XenStore, and reads
the values written by the frontend to determine how many queues to use.
Ring references and event channels are read from XenStore on a per-queue
basis and rings are connected accordingly.
Also adds code to handle the cleanup of any already initialised queues
if the initialisation of a subsequent queue fails.
Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In preparation for multi-queue support in xen-netback, move the
queue-specific data from struct xenvif into struct xenvif_queue, and
update the rest of the code to use this.
Also adds loops over queues where appropriate, even though only one is
configured at this point, and uses alloc_netdev_mq() and the
corresponding multi-queue netif wake/start/stop functions in preparation
for multiple active queues.
Finally, implements a trivial queue selection function suitable for
ndo_select_queue, which simply returns 0 for a single queue and uses
skb_get_hash() to compute the queue index otherwise.
Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This array was allocated separately in commit ac3d5ac2 ("xen-netback:
fix guest-receive-side array sizes") due to it being very large, and a
struct xenvif is allocated as the netdev_priv part of a struct
net_device, i.e. via kmalloc() but falling back to vmalloc() if the
initial alloc. fails.
In preparation for the multi-queue patches, where this array becomes
part of struct xenvif_queue and is always allocated through vzalloc(),
move this back into the struct xenvif.
Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates
This series contains updates to e1000, igb and ixgbe.
Emil provides his version 2 fix for the detection of SFP+ capable interfaces.
In cases where the driver is loaded while there are no SFP+ modules in cage,
the interface was not being detected as SFP capable. Resolve the issue by
identifying interfaces with no PHY type set as SFP capable which allows the
driver to detect the SFP module when the interface is brought up. In this
version 2 of the patch, the 82599 specific check was removed since we only
have 82598 devices that are SFP capable.
Jacob removes the including of the export header in the ixgbe PTP core, since
it is not needed. Renames igb_ptp_enable() to igb_ptp_feature_enable() to
better reflect the actual functions purpose.
Todd fixes the ethtool loopback test for i354 backplane devices since we
do not know what PHY is to be used for the devices, use MAC loopback for
ethtool tests. Todd also sets the packet buffer size register defaults for
i210 devices.
Yongjian Xu removes the check for skb->len being negative or zero since there
is never a case where it would be zero or negative for e1000.
Manuel Schölling updates e1000 to use the time_after() helper function.
v2: Fix indentation on wrapped line in patch 3 of the series based on
feedback from David Miller
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull block follow-up bits from Jens Axboe:
"A few minor (but important) fixes for blk-mq for the -rc1 window.
- Hot removal potential oops fix for single queue devices. From me.
- Two merged patches in late May meant that we accidentally lost a
fix for freeing an active queue. Fix that up. From me.
- A change of the blk_mq_tag_to_rq() API, passing in blk_mq_tags, to
make life considerably easier for scsi-mq. From me.
- A schedule-while-atomic fix from Ming Lei, which would hit if the
tag space was exhausted.
- Missing __percpu annotation in one place in blk-mq. Found by the
magic Wu compile bot due to code being moved around by the previous
patch, but it's actually an older issue. From Ming Lei.
- Clearing of tag of a flush request at end_io time. From Ming Lei"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: mq flush: clear flush_rq's tag in flush_end_io()
blk-mq: let blk_mq_tag_to_rq() take blk_mq_tags as the main parameter
blk-mq: fix regression from commit 624dbe475416
blk-mq: handle NULL req return from blk_map_request in single queue mode
blk-mq: fix sparse warning on missed __percpu annotation
blk-mq: fix schedule from atomic context
blk-mq: move blk_mq_get_ctx/blk_mq_put_ctx to mq private header
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media into next
Pull media updates from Mauro Carvalho Chehab:
"This contains:
- a new frontend/tuner driver set for si2168 and sa2157
- Videobuf 2 core now supports DVB too
- A new gspca sub-driver (dtcs033)
- saa7134 is now converted to use videobuf2
- add support for 4K timings
- several other driver fixes and improvements
PS. This pull request is shorter than usual, partly because I have
some other patches on topic branches that I'll be sending you later
this week"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (286 commits)
[media] au0828-dvb: restore its permission to 644
[media] xc5000: delay tuner sleep to 5 seconds
[media] xc5000: Don't use whitespace before tabs
[media] xc5000: fix CamelCase
[media] xc5000: Don't wrap msleep()
[media] xc5000: get rid of positive error codes
[media] au0828: reset streaming when a new frequency is set
[media] au0828: Improve debug messages for urb_completion
[media] au0828: Cancel stream-restart operation if frontend is disconnected
[media] dib0700: fix RC support on Hauppauge Nova-TD
[media] USB: as102_usb_drv.c: Remove useless return variables
[media] v4l: Fix documentation of V4L2_PIX_FMT_H264_MVC and VP8 pixel formats
[media] m5mols: Replace missing header
[media] staging: lirc: Fix sparse warnings
[media] fix mceusb endpoint type identification/handling
[media] az6027: Added the PID for a new revision of the Elgato EyeTV Sat DVB-S Tuner
[media] DocBook media: fix typo
[media] adv7604: Add missing include to linux/types.h
[media] v4l: Validate fields in the core code for subdev EDID ioctls
[media] v4l: Add support for DV timings ioctls on subdev nodes
...
|
|
|
|
- added interrupt support for GIO devices
- improved detection of GIO cards on Indigo2
- added more known GIO cards
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7055/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
In octeon_3xxx.dts file, there is a definiton for twsi/twsi2 interrupts.
But there is no code for initialization of this interrupts. This patch adds
code for initialization of twsi interrupts.
Signed-off-by: Eunbong Song <eunb.song@samsung.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6816/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
There's an alternative 5-line mini interrupt controller in the R4k MB ASIC
used on the KN04 and KN05 CPU daughtercards. The controller is cascaded
from the CPU interrupt input that would be used for the Halt button on the
corresponding R3k systems. This change documents the findings so far.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/6706/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/6707/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
This patch adds support for the microMIPS implementation of the MSA
instructions.
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Reviewed-by: Paul Burton <Paul.Burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6763/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Daniel Walter <dwalter@google.com>
Cc: linux-kernel@vger.kernel.org
Cc: richard@nod.at
Cc: akpm@linux-foundation.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Signed-off-by: Daniel Walter <dwalter@google.com>
Cc: linux-kernel@vger.kernel.org
Cc: richard@nod.at
Cc: akpm@linux-foundation.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
This keeps the if condition slightly simpler - it's going to become ore
complication.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
We don't want the stateid to be found in the hash table before the delegation
is granted.
Currently this is protected by the client_mutex, but we want to break that
up and this is a necessary step toward that goal.
Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
...as the name is a bit more descriptive and we've started using it for
other purposes.
Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The memset of resp in svc_process_common should ensure that these are
already zeroed by the time they get here.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
In the NFS4_OPEN_CLAIM_PREVIOUS case, we should only mark it confirmed
if the nfs4_check_open_reclaim check succeeds.
In the NFS4_OPEN_CLAIM_DELEG_PREV_FH and NFS4_OPEN_CLAIM_DELEGATE_PREV
cases, I see no point in declaring the openowner confirmed when the
operation is going to fail anyway, and doing so might allow the client
to game things such that it wouldn't need to confirm a subsequent open
with the same owner.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
This fixes a bug in the handling of the fi_delegations list.
nfs4_setlease does not hold the recall_lock when adding to it. The
client_mutex is held, which prevents against concurrent list changes,
but nfsd_break_deleg_cb does not hold while walking it. New delegations
could theoretically creep onto the list while we're walking it there.
Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
If a trace event uses a dynamic array for something other than a string
then there's currently no way the TP_printk() can figure out what size
it is. A __get_dynamic_array_len() is required to know the length.
This also simplifies the __get_bitmask() macro which required it as well,
but instead just hardcoded it.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
A previous patch mistakenly changed the file permission to 755.
Restore it to 644.
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
|
|
Somehow this unused variable warning sneaked past my warnings check
(probably due to it depending on a new config).
kernel/trace/trace_benchmark.c: In function 'trace_do_benchmark':
kernel/trace/trace_benchmark.c:38:6: warning: unused variable 'seedsq' [-Wunused-variable]
u64 seedsq;
^
Link: http://lkml.kernel.org/r/20140604160921.4f4e69c4@canb.auug.org.au
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Otherwise sparse gives a bunch of warnings like
drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: sparse: incorrect type in argument 4 (different base types)
drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: expected int [signed] gfp
drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: got restricted gfp_t
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
When encountering an error during the add_port function, adding a port
to sysfs, the port kobject is freed without being deleted from sysfs.
Instead of freeing it directly, the patch uses kobject_put to release
the kobject and delete it.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
The ib_core module will call kobject_get on the parent object of each
kobject it creates. This is redundant since kobject_add does that
anyway.
As a side effect, this patch should fix leaking the ports kobject and
the device kobject during unregister flow, since the previous code
didn't seem to take into account the kobject_get calls on behalf of
the child kobjects.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux into next
Pull DeviceTree updates from Rob Herring:
- Another round of clean-up of FDT related code in architecture code.
This removes knowledge of internal FDT details from most
architectures except powerpc.
- Conversion of kernel's custom FDT parsing code to use libfdt.
- DT based initialization for generic serial earlycon. The
introduction of generic serial earlycon support went in through the
tty tree.
- Improve the platform device naming for DT probed devices to ensure
unique naming and use parent names instead of a global index.
- Fix a race condition in of_update_property.
- Unify the various linker section OF match tables and fix several
function prototype errors.
- Update platform_get_irq_byname to work in deferred probe cases.
- 2 binding doc updates
* tag 'devicetree-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (58 commits)
of: handle NULL node in next_child iterators
of/irq: provide more wrappers for !CONFIG_OF
devicetree: bindings: Document micrel vendor prefix
dt: bindings: dwc2: fix required value for the phy-names property
of_pci_irq: kill useless variable in of_irq_parse_pci()
of/irq: do irq resolution in platform_get_irq_byname()
of: Add a testcase for of_find_node_by_path()
of: Make of_find_node_by_path() handle /aliases
of: Create unlocked version of for_each_child_of_node()
lib: add glibc style strchrnul() variant
of: Handle memory@0 node on PPC32 only
pci/of: Remove dead code
of: fix race between search and remove in of_update_property()
of: Use NULL for pointers
of: Stop naming platform_device using dcr address
of: Ensure unique names without sacrificing determinism
tty/serial: pl011: add DT based earlycon support
of/fdt: add FDT serial scanning for earlycon
of/fdt: add FDT address translation support
serial: earlycon: add DT support
...
|
|
Fix a few functions that are declared with __attribute_const__ in the
ib_verbs.h header file but defined without it in verbs.c. This gets rid
of the following sparse warnings:
drivers/infiniband/core/verbs.c:51:5: error: symbol 'ib_rate_to_mult' redeclared with different type (originally declared at include/rdma/ib_verbs.h:469) - different modifiers
drivers/infiniband/core/verbs.c:68:14: error: symbol 'mult_to_ib_rate' redeclared with different type (originally declared at include/rdma/ib_verbs.h:607) - different modifiers
drivers/infiniband/core/verbs.c:85:5: error: symbol 'ib_rate_to_mbps' redeclared with different type (originally declared at include/rdma/ib_verbs.h:476) - different modifiers
drivers/infiniband/core/verbs.c:111:1: error: symbol 'rdma_node_get_transport' redeclared with different type (originally declared at include/rdma/ib_verbs.h:84) - different modifiers
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fix from Tejun Heo:
"It is very late but this is an important percpu-refcount fix from
Sebastian Ott.
The problem is that percpu_ref_*() used __this_cpu_*() instead of
this_cpu_*(). The difference between the two is that the latter is
atomic on the local cpu while the former is not. this_cpu_inc() is
guaranteed to increment the percpu counter on the cpu that the
operation is executed on without any synchronization; however,
__this_cpu_inc() doesn't and if the local cpu invokes the function
from different contexts (e.g. process and irq) of the same CPU, it's
not guaranteed to actually increment as it may be implemented as rmw.
This bug existed from the get-go but it hasn't been noticed earlier
probably because on x86 __this_cpu_inc() is equivalent to
this_cpu_inc() as both get translated into single instruction;
however, s390 uses the generic rmw implementation and gets affected by
the bug. Kudos to Sebastian and Heiko for diagnosing it.
The change is very low risk and fixes a critical issue on the affected
architectures, so I think it's a good candidate for inclusion although
it's very late in the devel cycle. On the other hand, this has been
broken since v3.11, so backporting it through -stable post -rc1 won't
be the end of the world.
I'll ping Christoph whether __this_cpu_*() ops can be better annotated
so that it can trigger lockdep warning when used from multiple
contexts"
* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu-refcount: fix usage of this_cpu_ops
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git into for-3.16
Pull percpu/for-3.15-fixes into percpu/for-3.16 to receive
0c36b390a546 ("percpu-refcount: fix usage of this_cpu_ops").
The merge doesn't produce any conflict but the automatic merge is
still incorrect because 4fb6e25049cb ("percpu-refcount: implement
percpu_ref_tryget()") added another use of __this_cpu_inc() which
should also be converted to this_cpu_ince().
This commit pulls in percpu/for-3.15-fixes and converts the newly
added __this_cpu_inc() to this_cpu_inc().
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
blk_mq_tag_to_rq() needs to be able to tell if it should return
the original request, or the flush request if we are doing a flush
sequence. Clear the flush tag when IO completes for a flush, since
that is what we are comparing against.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
We currently pass in the hardware queue, and get the tags from there.
But from scsi-mq, with a shared tag space, it's a lot more convenient
to pass in the blk_mq_tags instead as the hardware queue isn't always
directly available. So instead of having to re-map to a given
hardware queue from rq->mq_ctx, just pass in the tags structure.
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The percpu-refcount infrastructure uses the underscore variants of
this_cpu_ops in order to modify percpu reference counters.
(e.g. __this_cpu_inc()).
However the underscore variants do not atomically update the percpu
variable, instead they may be implemented using read-modify-write
semantics (more than one instruction). Therefore it is only safe to
use the underscore variant if the context is always the same (process,
softirq, or hardirq). Otherwise it is possible to lose updates.
This problem is something that Sebastian has seen within the aio
subsystem which uses percpu refcounters both in process and softirq
context leading to reference counts that never dropped to zeroes; even
though the number of "get" and "put" calls matched.
Fix this by using the non-underscore this_cpu_ops variant which
provides correct per cpu atomic semantics and fixes the corrupted
reference counts.
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: <stable@vger.kernel.org> # v3.11+
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
References: http://lkml.kernel.org/g/alpine.LFD.2.11.1406041540520.21183@denkbrett
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound into next
Pull sound updates from Takashi Iwai:
"At this time, majority of changes come from ASoC world while we got a
few new drivers in other places for FireWire and USB. There have been
lots of ASoC core cleanups / refactoring, but very little visible to
external users.
ASoC:
- Support for specifying aux CODECs in DT
- Removal of the deprecated mux and enum macros
- More moves towards full componentisation
- Removal of some unused I/O code
- Lots of cleanups, fixes and enhancements to the davinci, Freescale,
Haswell and Realtek drivers
- Several drivers exposed directly in Kconfig for use with
simple-card
- GPIO descriptor support for jacks
- More updates and fixes to the Freescale SSI, Intel and rsnd drivers
- New drivers for Cirrus CS42L56, Realtek RT5639, RT5642 and RT5651
and ST STA350, Analog Devices ADAU1361, ADAU1381, ADAU1761 and
ADAU1781, and Realtek RT5677
HD-audio:
- Clean up Dell headset quirks
- Noise fixes for Dell and Sony laptops
- Thinkpad T440 dock fix
- Realtek codec updates (ALC293,ALC233,ALC3235)
- Tegra HD-audio HDMI support
FireWire-audio:
- FireWire audio stack enhancement (AMDTP, MIDI), support for
incoming isochronous stream and duplex streams with timestamp
synchronization
- BeBoB-based devices support
- Fireworks-based device support
USB-audio:
- Behringer BCD2000 USB device support
Misc:
- Clean up of a few old drivers, atmel, fm801, etc"
* tag 'sound-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (480 commits)
ASoC: Fix wrong argument for card remove callbacks
ASoC: free jack GPIOs before the sound card is freed
ALSA: firewire-lib: Remove a comment about restriction of asynchronous operation
ASoC: cache: Fix error code when not using ASoC level cache
ALSA: hda/realtek - Fix COEF widget NID for ALC260 replacer fixup
ALSA: hda/realtek - Correction of fixup codes for PB V7900 laptop
ALSA: firewire-lib: Use IEC 61883-6 compliant labels for Raw Audio data
ASoC: add RT5677 CODEC driver
ASoC: intel: The Baytrail/MAX98090 driver depends on I2C
ASoC: rt5640: Add the function "get_clk_info" to RL6231 shared support
ASoC: rt5640: Add the function of the PLL clock calculation to RL6231 shared support
ASoC: rt5640: Add RL6231 class device shared support for RT5640, RT5645 and RT5651
ASoC: cache: Fix possible ZERO_SIZE_PTR pointer dereferencing error.
ASoC: Add helper functions to cast from DAPM context to CODEC/platform
ALSA: bebob: sizeof() vs ARRAY_SIZE() typo
ASoC: wm9713: correct mono out PGA sources
ALSA: synth: emux: soundfont.c: Cleaning up memory leak
ASoC: fsl: Remove dependencies of boards for SND_SOC_EUKREA_TLV320
ASoC: fsl-ssi: Use regmap
ASoC: fsl-ssi: reorder and document fsl_ssi_private
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into next
Pull omap fbdev changes from Tomi Valkeinen:
- DT support for the panel drivers that were still missing it
- TI AM43xx support
- TI OMAP5 support
* tag 'fbdev-omap-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (46 commits)
OMAPDSS: move 'compatible' converter to omapdss driver
OMAPDSS: HDMI: fix devm_ioremap_resource error checks
OMAPDSS: HDMI: remove unused defines
OMAPDSS: HDMI: cleanup WP ioremaps
OMAPDSS: panel NEC-NL8048HL11 DT support
Doc/DT: Add DT binding documentation for TPO td043mtea1 panel
OMAPDSS: Panel TPO-TD043MTEA1 DT support
Doc/DT: Add DT binding documentation for SHARP LS037V7DW01
OMAPDSS: panel sharp-ls037v7dw01 DT support
OMAPDSS: panel-sharp-ls037v7dw01: update to use gpiod
Doc/DT: Add binding doc for lgphilips,lb035q02.txt
OMAPDSS: panel-lgphilips-lb035q02: Add DT support
OMAPDSS: panel-lgphilips-lb035q02: use gpiod for enable gpio
OMAPDSS: hdmi5_core: Fix compilation with OMAP5_DSS_HDMI_AUDIO
OMAPDSS: panel-dpi: enable-gpio
OMAPDSS: Fix writes to DISPC_POL_FREQ
Doc/DT: Add OMAP5 DSS DT bindings
OMAPDSS: HDMI: cleanup ioremaps
OMAPDSS: HDMI: Add OMAP5 HDMI support
OMAPDSS: HDMI: PLL changes for OMAP5
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into next
Pull main fbdev changes from Tomi Valkeinen:
"Mainly fixes and small improvements. The biggest change seems to be
backlight control support for mx3fb"
* tag 'fbdev-main-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (31 commits)
drivers/video/fbdev/fb-puv3.c: Add header files for function unifb_mmap
video: fbdev: s3fb.c: Fix for possible null pointer dereference
video: fbdev: grvga.c: Fix for possible null pointer dereference
matroxfb: perform a dummy read of M_STATUS
video: of: display_timing: fix default native-mode setting
video: delete unneeded call to platform_get_drvdata
video: mx3fb: Add backlight control support
video: omap: delete support for early fbmem allocation
video: of: display_timing: remove two unsafe error messages
fbdev: fbmem: remove positive test on unsigned values
fbcon: Fix memory leak in con2fb_release_oldinfo()
video: Kconfig: Add a dependency to the Goldfish framebuffer driver
video: exynos: Add a dependency to the menu
video: mx3fb: Use devm_kzalloc
video/nuc900: allow modular build
video: atmel needs FB_BACKLIGHT
video: export fb_prepare_logo
video/mbx: fix building debugfs support
video/omap: fix modular build
video: clarify I2C dependencies
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm into next
Pull ACPI and power management updates from Rafael Wysocki:
"ACPICA is the leader this time (63 commits), followed by cpufreq (28
commits), devfreq (15 commits), system suspend/hibernation (12
commits), ACPI video and ACPI device enumeration (10 commits each).
We have no major new features this time, but there are a few
significant changes of how things work. The most visible one will
probably be that we are now going to create platform devices rather
than PNP devices by default for ACPI device objects with _HID. That
was long overdue and will be really necessary to be able to use the
same drivers for the same hardware blocks on ACPI and DT-based systems
going forward. We're not expecting fallout from this one (as usual),
but it's something to watch nevertheless.
The second change having a chance to be visible is that ACPI video
will now default to using native backlight rather than the ACPI
backlight interface which should generally help systems with broken
Win8 BIOSes. We're hoping that all problems with the native backlight
handling that we had previously have been addressed and we are in a
good enough shape to flip the default, but this change should be easy
enough to revert if need be.
In addition to that, the system suspend core has a new mechanism to
allow runtime-suspended devices to stay suspended throughout system
suspend/resume transitions if some extra conditions are met
(generally, they are related to coordination within device hierarchy).
However, enabling this feature requires cooperation from the bus type
layer and for now it has only been implemented for the ACPI PM domain
(used by ACPI-enumerated platform devices mostly today).
Also, the acpidump utility that was previously shipped as a separate
tool will now be provided by the upstream ACPICA along with the rest
of ACPICA code, which will allow it to be more up to date and better
supported, and we have one new cpuidle driver (ARM clps711x).
The rest is improvements related to certain specific use cases,
cleanups and fixes all over the place.
Specifics:
- ACPICA update to upstream version 20140424. That includes a number
of fixes and improvements related to things like GPE handling,
table loading, headers, memory mapping and unmapping, DSDT/SSDT
overriding, and the Unload() operator. The acpidump utility from
upstream ACPICA is included too. From Bob Moore, Lv Zheng, David
Box, David Binderman, and Colin Ian King.
- Fixes and cleanups related to ACPI video and backlight interfaces
from Hans de Goede. That includes blacklist entries for some new
machines and using native backlight by default.
- ACPI device enumeration changes to create platform devices rather
than PNP devices for ACPI device objects with _HID by default. PNP
devices will still be created for the ACPI device object with
device IDs corresponding to real PNP devices, so that change should
not break things left and right, and we're expecting to see more
and more ACPI-enumerated platform devices in the future. From
Zhang Rui and Rafael J Wysocki.
- Updates for the ACPI LPSS (Low-Power Subsystem) driver allowing it
to handle system suspend/resume on Asus T100 correctly. From
Heikki Krogerus and Rafael J Wysocki.
- PM core update introducing a mechanism to allow runtime-suspended
devices to stay suspended over system suspend/resume transitions if
certain additional conditions related to coordination within device
hierarchy are met. Related PM documentation update and ACPI PM
domain support for the new feature. From Rafael J Wysocki.
- Fixes and improvements related to the "freeze" sleep state. They
affect several places including cpuidle, PM core, ACPI core, and
the ACPI battery driver. From Rafael J Wysocki and Zhang Rui.
- Miscellaneous fixes and updates of the ACPI core from Aaron Lu,
Bjørn Mork, Hanjun Guo, Lan Tianyu, and Rafael J Wysocki.
- Fixes and cleanups for the ACPI processor and ACPI PAD (Processor
Aggregator Device) drivers from Baoquan He, Manuel Schölling, Tony
Camuso, and Toshi Kani.
- System suspend/resume optimization in the ACPI battery driver from
Lan Tianyu.
- OPP (Operating Performance Points) subsystem updates from Chander
Kashyap, Mark Brown, and Nishanth Menon.
- cpufreq core fixes, updates and cleanups from Srivatsa S Bhat,
Stratos Karafotis, and Viresh Kumar.
- Updates, fixes and cleanups for the Tegra, powernow-k8, imx6q,
s5pv210, nforce2, and powernv cpufreq drivers from Brian Norris,
Jingoo Han, Paul Bolle, Philipp Zabel, Stratos Karafotis, and
Viresh Kumar.
- intel_pstate driver fixes and cleanups from Dirk Brandewie, Doug
Smythies, and Stratos Karafotis.
- Enabling the big.LITTLE cpufreq driver on arm64 from Mark Brown.
- Fix for the cpuidle menu governor from Chander Kashyap.
- New ARM clps711x cpuidle driver from Alexander Shiyan.
- Hibernate core fixes and cleanups from Chen Gang, Dan Carpenter,
Fabian Frederick, Pali Rohár, and Sebastian Capella.
- Intel RAPL (Running Average Power Limit) driver updates from Jacob
Pan.
- PNP subsystem updates from Bjorn Helgaas and Fabian Frederick.
- devfreq core updates from Chanwoo Choi and Paul Bolle.
- devfreq updates for exynos4 and exynos5 from Chanwoo Choi and
Bartlomiej Zolnierkiewicz.
- turbostat tool fix from Jean Delvare.
- cpupower tool updates from Prarit Bhargava, Ramkumar Ramachandra
and Thomas Renninger.
- New ACPI ec_access.c tool for poking at the EC in a safe way from
Thomas Renninger"
* tag 'pm+acpi-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (187 commits)
ACPICA: Namespace: Remove _PRP method support.
intel_pstate: Improve initial busy calculation
intel_pstate: add sample time scaling
intel_pstate: Correct rounding in busy calculation
intel_pstate: Remove C0 tracking
PM / hibernate: fixed typo in comment
ACPI: Fix x86 regression related to early mapping size limitation
ACPICA: Tables: Add mechanism to control early table checksum verification.
ACPI / scan: use platform bus type by default for _HID enumeration
ACPI / scan: always register ACPI LPSS scan handler
ACPI / scan: always register memory hotplug scan handler
ACPI / scan: always register container scan handler
ACPI / scan: Change the meaning of missing .attach() in scan handlers
ACPI / scan: introduce platform_id device PNP type flag
ACPI / scan: drop unsupported serial IDs from PNP ACPI scan handler ID list
ACPI / scan: drop IDs that do not comply with the ACPI PNP ID rule
ACPI / PNP: use device ID list for PNPACPI device enumeration
ACPI / scan: .match() callback for ACPI scan handlers
ACPI / battery: wakeup the system only when necessary
power_supply: allow power supply devices registered w/o wakeup source
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid into next
Pull HID patches from Jiri Kosina:
- RMI driver for Synaptics touchpads, by Benjamin Tissoires, Andrew
Duggan and Jiri Kosina
- cleanup of hid-sony driver and improved support for Sixaxis and
Dualshock 4, by Frank Praznik
- other usual small fixes and support for new device IDs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (29 commits)
HID: thingm: thingm_fwinfo[] doesn't need to be global
HID: core: add two new usages for digitizer
HID: hid-sensor-hub: new device id and quirk for STM Sensor hub
HID: usbhid: enable NO_INIT_REPORTS quirk for Semico USB Keykoard
HID: hid-sensor-hub: Set report quirk for Microsoft Surface
HID: debug: add labels for HID Sensor Usages
HID: uhid: Use kmemdup instead of kmalloc + memcpy
HID: rmi: do not handle touchscreens through hid-rmi
HID: quirk for Saitek RAT7 and MMO7 mices' mode button
HID: core: fix validation of report id 0
HID: rmi: fix masks for x and w_x data
HID: rmi: fix wrong struct field name
HID: rmi: do not fetch more than 16 bytes in a query
HID: rmi: check for the existence of some optional queries before reading query 12
HID: i2c-hid: hid report descriptor retrieval changes
HID: add missing hid usages
HID: hid-sony - allow 3rd party INTEC controller to turn off all leds
HID: sony: Add blink support to the Sixaxis and DualShock 4 LEDs
HID: sony: Initialize the controller LEDs with a device ID value
HID: sony: Use the controller Bluetooth MAC address as the unique value in the battery name string
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial into next
Pull trivial tree changes from Jiri Kosina:
"Usual pile of patches from trivial tree that make the world go round"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
staging: go7007: remove reference to CONFIG_KMOD
aic7xxx: Remove obsolete preprocessor define
of: dma: doc fixes
doc: fix incorrect formula to calculate CommitLimit value
doc: Note need of bc in the kernel build from 3.10 onwards
mm: Fix printk typo in dmapool.c
modpost: Fix comment typo "Modules.symvers"
Kconfig.debug: Grammar s/addition/additional/
wimax: Spelling s/than/that/, wording s/destinatary/recipient/
aic7xxx: Spelling s/termnation/termination/
arm64: mm: Remove superfluous "the" in comment
of: Spelling s/anonymouns/anonymous/
dma: imx-sdma: Spelling s/determnine/determine/
ath10k: Improve grammar in comments
ath6kl: Spelling s/determnine/determine/
of: Improve grammar for of_alias_get_id() documentation
drm/exynos: Spelling s/contro/control/
radio-bcm2048.c: fix wrong overflow check
doc: printk-formats: do not mention casts for u64/s64
doc: spelling error changes
...
|
|
Pull KVM updates from Paolo Bonzini:
"At over 200 commits, covering almost all supported architectures, this
was a pretty active cycle for KVM. Changes include:
- a lot of s390 changes: optimizations, support for migration, GDB
support and more
- ARM changes are pretty small: support for the PSCI 0.2 hypercall
interface on both the guest and the host (the latter acked by
Catalin)
- initial POWER8 and little-endian host support
- support for running u-boot on embedded POWER targets
- pretty large changes to MIPS too, completing the userspace
interface and improving the handling of virtualized timer hardware
- for x86, a larger set of changes is scheduled for 3.17. Still, we
have a few emulator bugfixes and support for running nested
fully-virtualized Xen guests (para-virtualized Xen guests have
always worked). And some optimizations too.
The only missing architecture here is ia64. It's not a coincidence
that support for KVM on ia64 is scheduled for removal in 3.17"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits)
KVM: add missing cleanup_srcu_struct
KVM: PPC: Book3S PR: Rework SLB switching code
KVM: PPC: Book3S PR: Use SLB entry 0
KVM: PPC: Book3S HV: Fix machine check delivery to guest
KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs
KVM: PPC: Book3S HV: Make sure we don't miss dirty pages
KVM: PPC: Book3S HV: Fix dirty map for hugepages
KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address
KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates()
KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number
KVM: PPC: Book3S: Add ONE_REG register names that were missed
KVM: PPC: Add CAP to indicate hcall fixes
KVM: PPC: MPIC: Reset IRQ source private members
KVM: PPC: Graciously fail broken LE hypercalls
PPC: ePAPR: Fix hypercall on LE guest
KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler
KVM: PPC: BOOK3S: Always use the saved DAR value
PPC: KVM: Make NX bit available with magic page
KVM: PPC: Disable NX for old magic page using guests
KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest
...
|
|
Pull jfs changes from Dave Kleikamp.
* tag 'jfs-3.16' of git://github.com/kleikamp/linux-shaggy:
fs/jfs/super.c: convert simple_str to kstr
fs/jfs/jfs_dmap.c: replace min/casting by min_t
fs/jfs/super.c: remove 0 assignment to static + code clean-up
fs/jfs/jfs_logmgr.c: remove NULL assignment on static
JFS: Check for NULL before calling posix_acl_equiv_mode()
fs/jfs/jfs_inode.c: atomically set inode->i_flags
|
|
Sometimes PORT_EXIT messages are lost when a process is exiting.
This happens if you subscribe to the announce port with client A,
then subscribe to the announce port with client B, then kill client A.
Client B will not see the PORT_EXIT message because client A's port is
closing and is earlier in the announce port subscription list. The
for each loop will try to send the announcement to client A and fail,
then will stop trying to broadcast to other ports. Killing B works fine
since the announcement will already have gone to A. The CLIENT_EXIT
message does not get lost.
How to reproduce problem:
*** termA
$ aseqdump -p 0:1
0:1 Port subscribed 0:1 -> 128:0
*** termB
$ aseqdump -p 0:1
*** termA
0:1 Client start client 129
0:1 Port start 129:0
0:1 Port subscribed 0:1 -> 129:0
*** termB
0:1 Port subscribed 0:1 -> 129:0
*** termA
^C
*** termB
0:1 Client exit client 128
<--- expected Port exit as well (before client exit)
Signed-off-by: Adam Goode <agoode@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw into next
Pull gfs2 updates from Steven Whitehouse:
"This must be about the smallest merge window patch set ever for GFS2.
It is probably also the first one without a single patch from me.
That is down to a combination of factors, and I have some things in
the works that are not quite ready yet, that I hope to put in next
time around.
Returning to what is here this time... we have 3 patches which fix
various warnings. Two are bug fixes (for quotas and also a rare
recovery race condition). The final patch, from Ben Marzinski, is an
important change in the freeze code which has been in progress for
some time. This removes the need to take and drop the transaction
lock for every single transaction, when the only time it was used, was
at file system freeze time. Ben's patch integrates the freeze
operation into the journal flush code as an alternative with lower
overheads and also lands up resolving some difficult to fix races at
the same time"
* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
GFS2: Prevent recovery before the local journal is set
GFS2: fs/gfs2/file.c: kernel-doc warning fixes
GFS2: fs/gfs2/bmap.c: kernel-doc warning fixes
GFS2: remove transaction glock
GFS2: lops.c: replace 0 by NULL for pointers
GFS2: quotas not being refreshed in gfs2_adjust_quota
|
|
Pull file locking changes from Jeff Layton:
"Pretty quiet on the file-locking related front this cycle. Just some
small cleanups and the addition of some tracepoints in the lease
handling code"
* tag 'locks-v3.16' of git://git.samba.org/jlayton/linux:
locks: add some tracepoints in the lease handling code
fs/locks.c: replace seq_printf by seq_puts
locks: ensure that fl_owner is always initialized properly in flock and lease codepaths
|
|
When the code was collapsed to avoid duplication, the recent patch
for ensuring that a queue is idled before free was dropped, which was
added by commit 19c5d84f14d2.
Add back the blk_mq_tag_idle(), to ensure we don't leak a reference
to an active queue when it is freed.
Signed-off-by: Jens Axboe <axboe@fb.com>
|