Age | Commit message (Collapse) | Author |
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
This inverts also the return codes of force_sigsegv_info()
and setup_frame() to follow the kernel convention.
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Tested-by: Mark Salter <msalter@redhat.com>
Acked-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Steven Miao <realmz6@gmail.com>
|
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
|
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
|
|
commit 4badad352a6bb202ec68afa7a574c0bb961e5ebc (locking/mutex: Disable
optimistic spinning on some architectures) fenced spinning for
architectures without proper cmpxchg.
There is no need to disable mutex spinning on s390, though:
The instructions CS,CSG and friends provide the proper guarantees.
(We dont implement cmpxchg with locks).
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Everyone agrees we should do this,
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
'for-3.17/hyperv', 'for-3.17/i2c', 'for-3.17/lenovo', 'for-3.17/rmi' and 'for-3.17/sony' into for-linus
|
|
drm-next
bunch of cleanups
* 'drm-next' of git://people.freedesktop.org/~dvdhrm/linux:
drm: mark drm_context support as legacy
drm: make sysfs device always available for minors
drm: make minor->index available early
drm: merge drm_drv.c into drm_ioctl.c
drm: move module initialization to drm_stub.c
drm: don't de-authenticate clients on master-close
drm: drop redundant drm_file->is_master
drm: extract legacy ctxbitmap flushing
|
|
Eliminate boilerplate code by using module_platform_driver().
Signed-off-by: Christoph Jaeger <christophjaeger@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicts:
arch/sparc/mm/init_64.c
Conflict was simple non-overlapping additions.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicts:
drivers/net/Makefile
net/ipv6/sysctl_net_ipv6.c
Two ipv6_table_template[] additions overlap, so the index
of the ipv6_table[x] assignments needed to be adjusted.
In the drivers/net/Makefile case, we've gotten rid of the
garbage whereby we had to list every single USB networking
driver in the top-level Makefile, there is just one
"USB_NETWORKING" that guards everything.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer and time updates from Thomas Gleixner:
"A rather large update of timers, timekeeping & co
- Core timekeeping code is year-2038 safe now for 32bit machines.
Now we just need to fix all in kernel users and the gazillion of
user space interfaces which rely on timespec/timeval :)
- Better cache layout for the timekeeping internal data structures.
- Proper nanosecond based interfaces for in kernel users.
- Tree wide cleanup of code which wants nanoseconds but does hoops
and loops to convert back and forth from timespecs. Some of it
definitely belongs into the ugly code museum.
- Consolidation of the timekeeping interface zoo.
- A fast NMI safe accessor to clock monotonic for tracing. This is a
long standing request to support correlated user/kernel space
traces. With proper NTP frequency correction it's also suitable
for correlation of traces accross separate machines.
- Checkpoint/restart support for timerfd.
- A few NOHZ[_FULL] improvements in the [hr]timer code.
- Code move from kernel to kernel/time of all time* related code.
- New clocksource/event drivers from the ARM universe. I'm really
impressed that despite an architected timer in the newer chips SoC
manufacturers insist on inventing new and differently broken SoC
specific timers.
[ Ed. "Impressed"? I don't think that word means what you think it means ]
- Another round of code move from arch to drivers. Looks like most
of the legacy mess in ARM regarding timers is sorted out except for
a few obnoxious strongholds.
- The usual updates and fixlets all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
timekeeping: Fixup typo in update_vsyscall_old definition
clocksource: document some basic timekeeping concepts
timekeeping: Use cached ntp_tick_length when accumulating error
timekeeping: Rework frequency adjustments to work better w/ nohz
timekeeping: Minor fixup for timespec64->timespec assignment
ftrace: Provide trace clocks monotonic
timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC
seqcount: Add raw_write_seqcount_latch()
seqcount: Provide raw_read_seqcount()
timekeeping: Use tk_read_base as argument for timekeeping_get_ns()
timekeeping: Create struct tk_read_base and use it in struct timekeeper
timekeeping: Restructure the timekeeper some more
clocksource: Get rid of cycle_last
clocksource: Move cycle_last validation to core code
clocksource: Make delta calculation a function
wireless: ath9k: Get rid of timespec conversions
drm: vmwgfx: Use nsec based interfaces
drm: i915: Use nsec based interfaces
timekeeping: Provide ktime_get_raw()
hangcheck-timer: Use ktime_get_ns()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Nothing spectacular from the irq department this time:
- overhaul of the crossbar chip driver
- overhaul of the spear shirq chip driver
- support for the atmel-aic chip
- code move from arch to drivers
- the usual tiny fixlets
- two reverts worth to mention which undo the too simple attempt of
supporting wakeup interrupts on shared interrupt lines"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
Revert "irq: Warn when shared interrupts do not match on NO_SUSPEND"
Revert "PM / sleep / irq: Do not suspend wakeup interrupts"
irq: Warn when shared interrupts do not match on NO_SUSPEND
irqchip: atmel-aic: Define irq fixups for atmel SoCs
irqchip: atmel-aic: Implement RTC irq fixup
irqchip: atmel-aic: Add irq fixup infrastructure
irqchip: atmel-aic: Add atmel AIC/AIC5 drivers
irqchip: atmel-aic: Move binding doc to interrupt-controller directory
genirq: generic chip: Export irq_map_generic_chip function
PM / sleep / irq: Do not suspend wakeup interrupts
irqchip: or1k-pic: Migrate from arch/openrisc/
irqchip: crossbar: Allow for quirky hardware with direct hardwiring of GIC
documentation: dt: omap: crossbar: Add description for interrupt consumer
irqchip: crossbar: Introduce centralized check for crossbar write
irqchip: crossbar: Introduce ti, max-crossbar-sources to identify valid crossbar mapping
irqchip: crossbar: Add kerneldoc for crossbar_domain_unmap callback
irqchip: crossbar: Set cb pointer to null in case of error
irqchip: crossbar: Change the goto naming
irqchip: crossbar: Return proper error value
irqchip: crossbar: Fix kerneldoc warning
...
|
|
Commit ea431643d6c3 ("x86/mce: Fix CMCI preemption bugs") breaks RT by
the completely unrelated conversion of the cmci_discover_lock to a
regular (non raw) spinlock. This lock was annotated in commit
59d958d2c7de ("locking, x86: mce: Annotate cmci_discover_lock as raw")
with a proper explanation why.
The argument for converting the lock back to a regular spinlock was:
- it does percpu ops without disabling preemption. Preemption is not
disabled due to the mistaken use of a raw spinlock.
Which is complete nonsense. The raw_spinlock is disabling preemption in
the same way as a regular spinlock. In mainline spinlock maps to
raw_spinlock, in RT spinlock becomes a "sleeping" lock.
raw_spinlock has on RT exactly the same semantics as in mainline. And
because this lock is taken in non preemptible context it must be raw on
RT.
Undo the locking brainfart.
Reported-by: Clark Williams <williams@redhat.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We need to take the connection mutex around the link status
check for non-MST case, but also around the MST link training
on short HPDs.
I suspect we actually should have a dpcd lock in the future as
well, that just lock the local copies of dpcd and flags stored
from that.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Enabling a Virtual Interface can result in an interrupt during the processing
of the VI Enable command and, in some paths, result in an attempt to issue
another command in the interrupt context, eventually crashing the system. Thus,
we disable interrupts during the course of the VI Enable command and ensure
enable doesn't sleep.
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
USB network drivers are already handled in drivers/net/usb/Kconfig.
Let's save the maintenance burden of dependencies in drivers/net/Makefile.
The newly introduced USB_NET_DRIVERS umbrella config option defaults
to 'y' so as to minimize the changes of behavior.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tg3_tso_bug() was originally designed to handle only HW TX ring 0, Commit
d3f6f3a1d818410c17445bce4f4caab52eb102f1 ("tg3: Prevent page allocation failure
during TSO workaround") changed the driver logic to use tg3_tso_bug() for all
HW TX rings that are enabled. This patch fixes the regression by modifying
tg3_tso_bug() to handle multiple HW TX rings.
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tom Lendacky says:
====================
amd-xgbe: AMD XGBE driver update 2014-08-05
The following series of patches includes fixes/updates to the driver.
- Use dma_set_mask_and_coherent to set the DMA mask
- Move the phy connect/disconnect logic to allow for module unloading
Changes in V2:
- Check the return value of the dma_set_mask_and_coherent call
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A change added to the mdiobus/phy api added a module_get/module_put
during phy connect/disconnect processing. Currently, the driver
performs a phy connect during module probe and a phy disconnect during
module remove. With the addition of the module_get during phy connect
the amd-xgbe module use count is incremented and can no longer be
unloaded.
Move the phy connect/disconnect from the driver probe/remove functions
to the net_device_ops ndo_open/ndo_stop functions. This allows the
module use count to be decremented when the device(s) are brought down
and allows the module to be unloaded.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the dma_set_mask_and_coherent function to set the DMA mask rather
than setting the DMA mask fields directly. This was originally done
to work around a bug in the arm64 DMA support when RAM started above
the 4GB boundary which has since been fixed.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Upon reception of a new frame, the emac driver checks for a number
of error conditions, and flag the packet as "bad" if any of these
are present. It then allocates a skb unconditionally, but only uses
it if the packet is "good". On the error path, the skb is just forgotten,
and the system leaks memory.
The piece of junk I have on my desk seems to encounter such error
frequently enough so that the box goes OOM after a couple of days,
which makes me grumpy.
Fix this by moving the allocation on the "good_packet" path (and
convert it to netdev_alloc_skb while we're at it).
Tested on a random Allwinner A20 board.
Cc: Stefan Roese <sr@denx.de>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: <stable@vger.kernel.org> # 3.11+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Dave reported following splat, caused by improper use of
IP_INC_STATS_BH() in process context.
BUG: using __this_cpu_add() in preemptible [00000000] code: trinity-c117/14551
caller is __this_cpu_preempt_check+0x13/0x20
CPU: 3 PID: 14551 Comm: trinity-c117 Not tainted 3.16.0+ #33
ffffffff9ec898f0 0000000047ea7e23 ffff88022d32f7f0 ffffffff9e7ee207
0000000000000003 ffff88022d32f818 ffffffff9e397eaa ffff88023ee70b40
ffff88022d32f970 ffff8801c026d580 ffff88022d32f828 ffffffff9e397ee3
Call Trace:
[<ffffffff9e7ee207>] dump_stack+0x4e/0x7a
[<ffffffff9e397eaa>] check_preemption_disabled+0xfa/0x100
[<ffffffff9e397ee3>] __this_cpu_preempt_check+0x13/0x20
[<ffffffffc0839872>] sctp_packet_transmit+0x692/0x710 [sctp]
[<ffffffffc082a7f2>] sctp_outq_flush+0x2a2/0xc30 [sctp]
[<ffffffff9e0d985c>] ? mark_held_locks+0x7c/0xb0
[<ffffffff9e7f8c6d>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
[<ffffffffc082b99a>] sctp_outq_uncork+0x1a/0x20 [sctp]
[<ffffffffc081e112>] sctp_cmd_interpreter.isra.23+0x1142/0x13f0 [sctp]
[<ffffffffc081c86b>] sctp_do_sm+0xdb/0x330 [sctp]
[<ffffffff9e0b8f1b>] ? preempt_count_sub+0xab/0x100
[<ffffffffc083b350>] ? sctp_cname+0x70/0x70 [sctp]
[<ffffffffc08389ca>] sctp_primitive_ASSOCIATE+0x3a/0x50 [sctp]
[<ffffffffc083358f>] sctp_sendmsg+0x88f/0xe30 [sctp]
[<ffffffff9e0d673a>] ? lock_release_holdtime.part.28+0x9a/0x160
[<ffffffff9e0d62ce>] ? put_lock_stats.isra.27+0xe/0x30
[<ffffffff9e73b624>] inet_sendmsg+0x104/0x220
[<ffffffff9e73b525>] ? inet_sendmsg+0x5/0x220
[<ffffffff9e68ac4e>] sock_sendmsg+0x9e/0xe0
[<ffffffff9e1c0c09>] ? might_fault+0xb9/0xc0
[<ffffffff9e1c0bae>] ? might_fault+0x5e/0xc0
[<ffffffff9e68b234>] SYSC_sendto+0x124/0x1c0
[<ffffffff9e0136b0>] ? syscall_trace_enter+0x250/0x330
[<ffffffff9e68c3ce>] SyS_sendto+0xe/0x10
[<ffffffff9e7f9be4>] tracesys+0xdd/0xe2
This is a followup of commits f1d8cba61c3c4b ("inet: fix possible
seqlock deadlocks") and 7f88c6b23afbd315 ("ipv6: fix possible seqlock
deadlock in ip6_finish_output2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit a71e3c37960ce5f9 ("net: phy: Set the driver when registering an MDIO bus
device") caused the following regression on the fec driver:
root@imx6qsabresd:~# echo mem > /sys/power/state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.003 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
Unable to handle kernel NULL pointer dereference at virtual address 0000002c
pgd = bcd14000
[0000002c] *pgd=4d9e0831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 617 Comm: sh Not tainted 3.16.0 #17
task: bc0c4e00 ti: bceb6000 task.ti: bceb6000
PC is at fec_suspend+0x10/0x70
LR is at dpm_run_callback.isra.7+0x34/0x6c
pc : [<803f8a98>] lr : [<80361f44>] psr: 600f0013
sp : bceb7d70 ip : bceb7d88 fp : bceb7d84
r10: 8091523c r9 : 00000000 r8 : bd88f478
r7 : 803f8a88 r6 : 81165988 r5 : 00000000 r4 : 00000000
r3 : 00000000 r2 : 00000000 r1 : bd88f478 r0 : bd88f478
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c5387d Table: 4cd1404a DAC: 00000015
Process sh (pid: 617, stack limit = 0xbceb6240)
Stack: (0xbceb7d70 to 0xbceb8000)
....
The problem with the original commit is explained by Russell King:
"It has the effect (as can be seen from the oops) of attaching the MDIO bus
device (itself is a bus-less device) to the platform driver, which means
that if the platform driver supports power management, it will be called
to power manage the MDIO bus device.
Moreover, drivers do not expect to be called for power management
operations for devices which they haven't probed, and certainly not for
devices which aren't part of the same bus that the driver is registered
against."
This reverts commit a71e3c37960ce5f9c6a519bc1215e3ba9fa83e75.
Cc: <stable@vger.kernel.org> #3.16
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
routine
Need to turn off SGE RX/TX Callback Timers & interrupt in cxgb4vf PCI Shutdown
routine in order to prevent crashes during reboot/poweroff when traffic is
running.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Antonio Quartulli says:
====================
pull request: batman-adv 2014-08-05
this is a pull request intended for net-next/linux-3.17 (yeah..it's really
late).
Patches 1, 2 and 4 are really minor changes:
- kmalloc_array is substituted to kmalloc when possible (as suggested by
checkpatch);
- net_ratelimited() is now used properly and the "suppressed" message is not
printed anymore if not needed;
- the internal version number has been increased to reflect our current version.
Patch 3 instead is introducing a change in the metric computation function
by changing the penalty applied at each mesh hop from 15/255 (~6%) to
30/255 (~11%). This change is introduced by Simon Wunderlich after having
observed a performance improvement in several networks when using the new value.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The variable "err" is not necessary.
Return register_netdevice() directly.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now bridge ports can be non-promiscuous, vlan_vid_add() is no longer an
unnecessary operation.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- removal of sn9c102. This device driver was replaced a long time ago
by gspca
- solo6x10 and go7007 webcam drivers moved from staging into
mainstream. They were waiting for an API to allow setting the image
detection matrix
- SDR drivers moved from staging into mainstream: sdr-msi3101 (renamed
as msi2500) and rtl2832
- added SDR driver for airspy
- added demux driver: si2165
- rework at several RC subsystem, making the code for RC-5 SZ variant
to be added at the standard RC5 decoder
- added decoder for the XMP IR protocol
- tuner driver moved from staging into mainstream: msi3101 (renamed as
msi001)
- added documentation for some additional SDR pixfmt
- some device tree bindings documented
- added support for exynos3250 at s5p-jpeg
- remove the obsolete, unmaintained and broken mx1_camera driver
- added support for remote controllers at au0828 driver
- added a RC driver: sunxi-cir
- several driver fixes, enhancements and cleanups.
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (455 commits)
[media] cx23885: fix UNSET/TUNER_ABSENT confusion
[media] coda: fix build error by making reset control optional
[media] radio-miropcm20: fix sparse NULL pointer warning
[media] MAINTAINERS: Update go7007 pattern
[media] MAINTAINERS: Update solo6x10 patterns
[media] media: atmel-isi: add primary DT support
[media] media: atmel-isi: convert the pdata from pointer to structure
[media] media: atmel-isi: add v4l2 async probe support
[media] rcar_vin: add devicetree support
[media] media: pxa_camera device-tree support
[media] media: mt9m111: add device-tree suppport
[media] soc_camera: add support for dt binding soc_camera drivers
[media] media: soc_camera: pxa_camera documentation device-tree support
[media] media: mt9m111: add device-tree documentation
[media] s5p-mfc: remove unnecessary calling to function video_devdata()
[media] s5p-jpeg: add chroma subsampling adjustment for Exynos3250
[media] s5p-jpeg: Prevent erroneous downscaling for Exynos3250 SoC
[media] s5p-jpeg: Assure proper crop rectangle initialization
[media] s5p-jpeg: fix g_selection op
[media] s5p-jpeg: Adjust jpeg_bound_align_image to Exynos3250 needs
...
|
|
Willem de Bruijn says:
====================
net-timestamp: new tx tstamps and tcp
Extend socket tx timestamping:
- allow multiple types of software timestamps aside from send (1)
- add software timestamp on enter packet scheduling (4)
- add software timestamp for TCP (5)
- add software timestamp for TCP on ACK (6)
The sk_flags option space is nearly exhausted. Also move the
many timestamp options to a new sk->sk_tstamps (2).
To disambiguate data when tstamps may arrive out of order,
optionally return a sequential ID assigned at send (3).
Extend Linux tx timestamping to monitoring of latency
incurred within the kernel stack and to protocols embedded in TCP.
Complex kernel setups may have multiple layers of queueing, including
multiple instances of packet scheduling, and many classes per layer.
Many applications embed discrete payloads into TCP bytestreams for
reliability, flow control, etcetera. Detecting application tail
latency in such scenarios relies on identifying the exact queue
responsible if on the host, or the network latency if otherwise.
Changelog:
v4->v5
- define SCM_TSTAMP_SND == 0, for legacy behavior
- add TCP tstamps without changing the generated byte stream
- modify GSO and ACK to find offset: slightly more complex
than previous invariant that it is the last byte
- consistent naming of packet scheduling
- rename SCM_TSTAMP_ENQ to SCM_TSTAMP_SCHED
- add unique key in ee_data
- add id field in ee_info to disambiguate tstamps
- optional, only on new flag SOF_TIMESTAMPING_OPT_ID
- for bytestream, in bytes
v3->v4
- (v3 review comment) removed skb->mark packet identification (*A)
- (v3 review comment) fixed indentation
- tcp: fixed poll() to return POLLERR on non-zero queue
- rebased to work without syststamp
- comments: removed all traces of MSG_TSTAMP_.. (*B)
v2->v3
- extend the SO_TIMESTAMPING API, instead of defining a new one.
- add protocol independent support to correlate tstamps with data,
based on returning skb->mark.
- removed no-payload optimization and documentation (for now):
I have a follow-on patch that reintroduces MSG_TSTAMP along with a
new socket option SOF_TIMESTAMPING_OPT_ONFLAG. This is equivalent
to sequence setsockopt(<enable>); send(..); setsockopt(<disable>),
but avoids the need to define a MSG_TSTAMP_<TYPE> for each type.
I will leave these three patches as follow-on, as this patchset is
large enough as is.
v1->v2
- expand timestamping (existing and new) to SOCK_RAW and ping sockets
- rename sock_errqueue_timestamping to scm_timestamping
- change timestamp data format: do not add fields to scm_timestamping.
Doing so could break legacy applications. Instead, communicate
through an existing, but unused, field in the error message.
- rename SOF_.._OPT_TX_NO_PAYLOAD to shorter SOF_.._OPT_TSONLY
- move msg_tstamp test app out of patchset and to github
git://github.com/wdebruij/kerneltools.git
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add SOF_TIMESTAMPING_TX_ACK, a request for a tstamp when the last byte
in the send() call is acknowledged. It implements the feature for TCP.
The timestamp is generated when the TCP socket cumulative ACK is moved
beyond the tracked seqno for the first time. The feature ignores SACK
and FACK, because those acknowledge the specific byte, but not
necessarily the entire contents of the buffer up to that byte.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
TCP timestamping extends SO_TIMESTAMPING to bytestreams.
Bytestreams do not have a 1:1 relationship between send() buffers and
network packets. The feature interprets a send call on a bytestream as
a request for a timestamp for the last byte in that send() buffer.
The choice corresponds to a request for a timestamp when all bytes in
the buffer have been sent. That assumption depends on in-order kernel
transmission. This is the common case. That said, it is possible to
construct a traffic shaping tree that would result in reordering.
The guarantee is strong, then, but not ironclad.
This implementation supports send and sendpages (splice). GSO replaces
one large packet with multiple smaller packets. This patch also copies
the option into the correct smaller packet.
This patch does not yet support timestamping on data in an initial TCP
Fast Open SYN, because that takes a very different data path.
If ID generation in ee_data is enabled, bytestream timestamps return a
byte offset, instead of the packet counter for datagrams.
The implementation supports a single timestamp per packet. It silenty
replaces requests for previous timestamps. To avoid missing tstamps,
flush the tcp queue by disabling Nagle, cork and autocork. Missing
tstamps can be detected by offset when the ee_data ID is enabled.
Implementation details:
- On GSO, the timestamping code can be included in the main loop. I
moved it into its own loop to reduce the impact on the common case
to a single branch.
- To avoid leaking the absolute seqno to userspace, the offset
returned in ee_data must always be relative. It is an offset between
an skb and sk field. The first is always set (also for GSO & ACK).
The second must also never be uninitialized. Only allow the ID
option on sockets in the ESTABLISHED state, for which the seqno
is available. Never reset it to zero (instead, move it to the
current seqno when reenabling the option).
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Kernel transmit latency is often incurred in the packet scheduler.
Introduce a new timestamp on transmission just before entering the
scheduler. When data travels through multiple devices (bonding,
tunneling, ...) each device will export an individual timestamp.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Datagrams timestamped on transmission can coexist in the kernel stack
and be reordered in packet scheduling. When reading looped datagrams
from the socket error queue it is not always possible to unique
correlate looped data with original send() call (for application
level retransmits). Even if possible, it may be expensive and complex,
requiring packet inspection.
Introduce a data-independent ID mechanism to associate timestamps with
send calls. Pass an ID alongside the timestamp in field ee_data of
sock_extended_err.
The ID is a simple 32 bit unsigned int that is associated with the
socket and incremented on each send() call for which software tx
timestamp generation is enabled.
The feature is enabled only if SOF_TIMESTAMPING_OPT_ID is set, to
avoid changing ee_data for existing applications that expect it 0.
The counter is reset each time the flag is reenabled. Reenabling
does not change the ID of already submitted data. It is possible
to receive out of order IDs if the timestamp stream is not quiesced
first.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sk_flags is reaching its limit. New timestamping options will not fit.
Move all of them into a new field sk->sk_tsflags.
Added benefit is that this removes boilerplate code to convert between
SOF_TIMESTAMPING_.. and SOCK_TIMESTAMPING_.. in getsockopt/setsockopt.
SOCK_TIMESTAMPING_RX_SOFTWARE is also used to toggle the receive
timestamp logic (netstamp_needed). That can be simplified and this
last key removed, but will leave that for a separate patch.
Signed-off-by: Willem de Bruijn <willemb@google.com>
----
The u16 in sock can be moved into a 16-bit hole below sk_gso_max_segs,
though that scatters tstamp fields throughout the struct.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Applications that request kernel tx timestamps with SO_TIMESTAMPING
read timestamps as recvmsg() ancillary data. The response is defined
implicitly as timespec[3].
1) define struct scm_timestamping explicitly and
2) add support for new tstamp types. On tx, scm_timestamping always
accompanies a sock_extended_err. Define previously unused field
ee_info to signal the type of ts[0]. Introduce SCM_TSTAMP_SND to
define the existing behavior.
The reception path is not modified. On rx, no struct similar to
sock_extended_err is passed along with SCM_TIMESTAMPING.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These belong to the t4 msg header, will ensure there is no accidental code
duplication in the future
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit reduces spurious retransmits due to apparent SACK reneging
by only reacting to SACK reneging that persists for a short delay.
When a sequence space hole at snd_una is filled, some TCP receivers
send a series of ACKs as they apparently scan their out-of-order queue
and cumulatively ACK all the packets that have now been consecutiveyly
received. This is essentially misbehavior B in "Misbehaviors in TCP
SACK generation" ACM SIGCOMM Computer Communication Review, April
2011, so we suspect that this is from several common OSes (Windows
2000, Windows Server 2003, Windows XP). However, this issue has also
been seen in other cases, e.g. the netdev thread "TCP being hoodwinked
into spurious retransmissions by lack of timestamps?" from March 2014,
where the receiver was thought to be a BSD box.
Since snd_una would temporarily be adjacent to a previously SACKed
range in these scenarios, this receiver behavior triggered the Linux
SACK reneging code path in the sender. This led the sender to clear
the SACK scoreboard, enter CA_Loss, and spuriously retransmit
(potentially) every packet from the entire write queue at line rate
just a few milliseconds before the ACK for each packet arrives at the
sender.
To avoid such situations, now when a sender sees apparent reneging it
does not yet retransmit, but rather adjusts the RTO timer to give the
receiver a little time (max(RTT/2, 10ms)) to send us some more ACKs
that will restore sanity to the SACK scoreboard. If the reneging
persists until this RTO then, as before, we clear the SACK scoreboard
and enter CA_Loss.
A 10ms delay tolerates a receiver sending such a stream of ACKs at
56Kbit/sec. And to allow for receivers with slower or more congested
paths, we wait for at least RTT/2.
We validated the resulting max(RTT/2, 10ms) delay formula with a mix
of North American and South American Google web server traffic, and
found that for ACKs displaying transient reneging:
(1) 90% of inter-ACK delays were less than 10ms
(2) 99% of inter-ACK delays were less than RTT/2
In tests on Google web servers this commit reduced reneging events by
75%-90% (as measured by the TcpExtTCPSACKReneging counter), without
any measurable impact on latency for user HTTP and SPDY requests.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rajesh Borundia says:
====================
qlcnic: Bug fixes
The patch series contains following bug fixes.
* Aggregating tx stats in adapter variable was resulting
in increase of stats when user runs ifconfig command
and no traffic is running. Instead aggregate tx stats
in local variable and then assign it to adapter struct
variable.
* Set_driver_version was called after registering netdev
which was resulting in a race between FLR in open
handler and set_driver_version command as open handler
can be called simulatneously on another cpu even if probe
is not complete. So call this command before registering
netdev.
* dcbnl_ops should be initialized before registering netdev
as they are referenced in open handler.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
o Initialization of dcbnl_ops after register netdev may result in
dcbnl_ops not getting set before it is being accessed from open.
So, moving it before register_netdev.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
o Earlier, set_drv_version was getting called after register_netdev.
This was resulting in a race between set_drv_version and FLR called
from open(). Moving set_drv_version before register_netdev avoids
the race.
o Log response code in error message on CDRP failure.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
o Aggregating tx stats in adapter variable was resulting in
an increase in stats even after no traffic was run and
user runs ifconfig/ethtool command.
o qlcnic_update_stats used to accumulate stats in adapter
struct at each function call, instead accumulate tx stats
in local variable and then assign it to adapter structure.
Reported-by: Holger Kiehl <holger.kiehl@dwd.de>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Antonio Quartulli says:
====================
pull request net: batman-adv 2014-08-04
this is a pull request intended for net.
It just contains a patch by Sven Eckelmann that fixes the
reception of out-of-order fragments. As explained in the
commit message, the issue was due to a wrong assumption
about hlist_for_each_entry() in batadv_frag_insert_packet().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"A couple of nice new features this month, the ability to map
regulators in order to allow voltage control by external coprocessors
is something people have been asking for for a long time.
- improved support for switch only "regulators", allowing current
state to be read from the parent regulator but no setting.
- support for obtaining the register access method used to set
voltages, for use in systems which can offload control of this to a
coprocessor (typically for DVFS).
- support for Active-Semi AC8846, Dialog DA9211 and Texas Instruments
TPS65917"
* tag 'regulator-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (58 commits)
regulator: act8865: fix build when OF is not enabled
regulator: act8865: add act8846 to DT binding documentation
regulator: act8865: add support for act8846
regulator: act8865: prepare support for other act88xx devices
regulator: act8865: set correct number of regulators in pdata
regulator: act8865: Remove error variable in act8865_pmic_probe
regulator: act8865: fix parsing of platform data
regulator: tps65090: Set voltage for fixed regulators
regulator: core: Allow to get voltage count and list from parent
regulator: core: Get voltage from parent if not available
regulator: Add missing statics and inlines for stub functions
regulator: lp872x: Don't set constraints within the regulator driver
regmap: Fix return code for stub regmap_get_device()
regulator: s2mps11: Update module description and Kconfig to add S2MPU02 support
regulator: Add helpers for low-level register access
regmap: Allow regmap_get_device() to be used by modules
regmap: Add regmap_get_device
regulator: da9211: Remove unnecessary devm_regulator_unregister() calls
regulator: Add DT bindings for tps65218 PMIC regulators.
regulator: da9211: new regulator driver
...
|