Age | Commit message (Collapse) | Author |
|
minimum_to_wake is unique to N_TTY processing, and belongs in
per-ldisc data.
Add the ldisc method, ldisc_ops::fasync(), to notify line disciplines
when signal-driven I/O is enabled or disabled. When enabled for N_TTY
(by fcntl(F_SETFL, O_ASYNC)), blocking reader/polls will be woken
for any readable input. When disabled, blocking reader/polls are not
woken until the read buffer is full.
Canonical mode (L_ICANON(tty), n_tty_data::icanon) is not affected by
the minimum_to_wake setting.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We want the fixes in this branch as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We want the changes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We want the fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We want the fixes in here.
|
|
This data allow writing for example MTD driver.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
|
|
|
|
|
|
It is counter-intuitive to have "0" mean disable in a boolean
manner for electronic properties of pins such as pull-up and
pull-down. Therefore, define that a pull-up/pull-down argument
of 0 to such a generic option means that the pin is
short-circuited to VDD or GROUND. Pull disablement shall be
done using PIN_CONFIG_BIAS_DISABLE.
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by Heiko Stuebner <heiko@sntech.de>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The comment introduced with the recently added pinctrl_gpio_range.pins
element was wrong. This corrects it.
Thanks to Patrice Chotard for pointing this out.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The stubs for the !PINCTRL case were placed in the wrong
part of the file, causing breakage in linux-next when compiling
SH without pinctrl. Fix it up by moving the stubs to the right
place.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Traditionally, GPIO ranges are based on consecutive ranges of both GPIO
and pin numbers. This patch allows for GPIO ranges with arbitrary lists
of pin numbers.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
This deletes the dependency on any platform data for
the COH901 pin controller. There is only one user in the
kernel, and if we at some point want to support more
variants, they shall provide their variant info through
the device tree.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Adds support for "High Speed Serial Communications Interface with FIFO",
essentially a SCIF with 128-byte FIFOs and more accurate baud rate
generator.
Signed-off-by: Ulrich Hecht <ulrich.hecht@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
|
|
percpu-refcount was incorrectly using preempt_disable/enable() for RCU
critical sections against call_rcu(). 6a24474da8 ("percpu-refcount:
consistently use plain (non-sched) RCU") fixed it by converting the
preepmtion operations with rcu_read_[un]lock() citing that there isn't
any advantage in using sched-RCU over using the usual one; however,
rcu_read_[un]lock() for the preemptible RCU implementation -
CONFIG_TREE_PREEMPT_RCU, chosen when CONFIG_PREEMPT - are slightly
more expensive than preempt_disable/enable().
In a contrived microbench which repeats the followings,
- percpu_ref_get()
- copy 32 bytes of data into percpu buffer
- percpu_put_get()
- copy 32 bytes of data into percpu buffer
rcu_read_[un]lock() used in percpu_ref_get/put() makes it go slower by
about 15% when compared to using sched-RCU.
As the RCU critical sections are extremely short, using sched-RCU
shouldn't have any latency implications. Convert to RCU-sched.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Kent Overstreet <koverstreet@google.com>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
|
|
If a device have sleep and idle states in addition to the
default state, look up these in the core and stash them in
the pinctrl state container.
Add accessor functions for pinctrl consumers to put the pins
into "default", "sleep" and "idle" states passing nothing but
the struct device * affected.
Solution suggested by Kevin Hilman, Mark Brown and Dmitry
Torokhov in response to a patch series from Hebbar
Gururaja.
Cc: Hebbar Gururaja <gururaja.hebbar@ti.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
There exist controllers that don't support to set the pull to up or down
separately but instead automatically set the pull direction based on
embedded knowledge inside the controller, for example depending on the
selected mux function of the pin.
Therefore this patch adds another config option to use this default
pull-state for a pin where it is not possible to know or decide if the
pin will be pulled up or down.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Add a new PIN_CONFIG_BIAS_BUS_HOLD pin configuration for a bus holder
pin mode (also known as bus keeper, or repeater). This is a weak latch
which drives the last value on a tristate bus. Another device on the bus
can drive the bus high or low before going tristate to change the value
driven by the pin.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
In Rockchip Cortex-A9 based chips, they don't use paradigm of
reading-changing-writing the register contents. Instead they
use a hiword mask to indicate the changed bits.
When b1 should be set as gate, it also needs to indicate the change
by setting hiword mask (b1 << 16).
The patch adds gate flag for this usage.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
|
|
In both Hisilicon & Rockchip Cortex-A9 based chips, they don't use the
paradigm of reading-changing-writing the register contents.
Instead they use a hiword mask to indicate the changed bits.
When b01 should be set as setting divider, it also needs to indicate
the change by setting hiword mask (b11 << 16).
The patch adds divider flag for this usage.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
|
|
In both Hisilicon & Rockchip Cortex-A9 based chips, they don't use the
paradigm of reading-changing-writing the register contents.
Instead they use a hiword mask to indicate the changed bits.
When b01 should be set as switching mux, it also needs to indicate
the change by setting hiword mask (b11 << 16).
The patch adds mux flag for this usage.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Haojian Zhuang <haojian.zhuang@linaro.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm into pm-omap
Pull PM / AVS: SmartReflex: misc. cleanups for 3.11 from Kevin Hilman.
|
|
Thanks to commit f91eb62f71b3 ("init: scream bloody murder if interrupts
are enabled too early"), "bloody murder" is now being screamed.
With a MIPS OCTEON config, we use on_each_cpu() in our
irq_chip.irq_bus_sync_unlock() function. This gets called in early as a
result of the time_init() call. Because the !SMP version of
on_each_cpu() unconditionally enables irqs, we get:
WARNING: at init/main.c:560 start_kernel+0x250/0x410()
Interrupts were enabled early
CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0-rc5-Cavium-Octeon+ #801
Call Trace:
show_stack+0x68/0x80
warn_slowpath_common+0x78/0xb0
warn_slowpath_fmt+0x38/0x48
start_kernel+0x250/0x410
Suggested fix: Do what we already do in the SMP version of
on_each_cpu(), and use local_irq_save/local_irq_restore. Because we
need a flags variable, make it a static inline to avoid name space
issues.
[ Change from v1: Convert on_each_cpu to a static inline function, add
#include <linux/irqflags.h> to avoid build breakage on some files.
on_each_cpu_mask() and on_each_cpu_cond() suffer the same problem as
on_each_cpu(), but they are not causing !SMP bugs for me, so I will
defer changing them to a less urgent patch. ]
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/drivers
From Simon Horman:
Renesas ARM based SoC GPIO R-Car updates for v3.11
DT support to GPIO R-Car driver by Laurent Pinchart.
* tag 'renesas-gpio-rcar-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: (131 commits)
gpio-rcar: Add DT support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/soc
From Simon Horman:
Renesas USB updates for v3.11
These updates are by Sergei Shtylyov to clean-up USB support
present for R8A7779/Marzen and then extend USB support coverage to
R8A7778/BOCK-W.
* tag 'renesas-phy-rcar-usb-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: BOCK-W: add USB support
ARM: shmobile: r8a7778: add USB support
phy-rcar-usb: add R8A7778 support
phy-rcar-usb: handle platform data
ARM: shmobile: Marzen: pass platform data to USB PHY device
phy-rcar-usb: add platform data
phy-rcar-usb: correct base address
ARM: shmobile: r8a7779: remove USB PHY 2nd memory resource
phy-rcar-usb: remove EHCI internal buffer setup
ARM: shmobile: r8a7779: setup EHCI internal buffer
ehci-platform: add pre_setup() method to platform data
ARM: shmobile: Marzen: move USB EHCI, OHCI, and PHY devices to R8A7779 code
Conflicts:
arch/arm/mach-shmobile/board-marzen.c
arch/arm/mach-shmobile/setup-r8a7778.c
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson into next/drivers
From Linus Walleij:
Second set of DMA40 changes: refactorings and device tree
support for the DMA40. Now with MUSB and some platform
data removal.
* tag 'ux500-dma40-for-arm-soc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson:
dmaengine: ste_dma40: Fetch disabled channels from DT
dmaengine: ste_dma40: Fetch the number of physical channels from DT
ARM: ux500: Stop passing DMA platform data though AUXDATA
dmaengine: ste_dma40: Allow memcpy channels to be configured from DT
dmaengine: ste_dma40_ll: Replace meaningless register set with comment
dmaengine: ste_dma40: Convert data_width from register bit format to value
dmaengine: ste_dma40_ll: Use the BIT macro to replace ugly '(1 << x)'s
ARM: ux500: Remove recently unused stedma40_xfer_dir enums
dmaengine: ste_dma40: Replace ST-E's home-brew DMA direction defs with generic ones
ARM: ux500: Replace ST-E's home-brew DMA direction definition with the generic one
dmaengine: ste_dma40: Use the BIT macro to replace ugly '(1 << x)'s
ARM: ux500: Remove empty function u8500_of_init_devices()
ARM: ux500: Remove ux500-musb platform registation when booting with DT
usb: musb: ux500: add device tree probing support
usb: musb: ux500: attempt to find channels by name before using pdata
usb: musb: ux500: harden checks for platform data
usb: musb: ux500: take the dma_mask from coherent_dma_mask
usb: musb: ux500: move the MUSB HDRC configuration into the driver
usb: musb: ux500: move channel number knowledge into the driver
|
|
* pci/jiang-bus-lock-v3:
PCI: Return early on allocation failures to unindent mainline code
PCI: Simplify IOV implementation and fix reference count races
PCI: Drop redundant setting of bus->is_added in virtfn_add_bus()
unicore32/PCI: Remove redundant call of pci_bus_add_devices()
m68k/PCI: Remove redundant call of pci_bus_add_devices()
PCI: Rename pci_release_bus_bridge_dev() to pci_release_host_bridge_dev()
PCI: Fix refcount issue in pci_create_root_bus() error recovery path
ia64/PCI: Clean up pci_scan_root_bus() usage
PCI: Convert alloc_pci_dev(void) to pci_alloc_dev(bus)
PCI: Introduce pci_alloc_dev(struct pci_bus*) to replace alloc_pci_dev()
PCI: Introduce pci_bus_{get|put}() to manage PCI bus reference count
Conflicts:
drivers/pci/probe.c
|
|
* pci/misc:
PCI / ACPI / PM: Use correct power state strings in messages
PCI: Fix comment typo for pcie_pme_remove()
PCI: Add pcibios_release_device()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/soc
From Tony Lindgren:
Omap SoC changes. Mostly improves am33xx support, and adds
minimal support for am43x SoCs.
* tag 'omap-for-v3.11/soc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: AM43x: SRAM base and size
ARM: OMAP2+: AM43x: GP or HS ?
ARM: OMAP2+: AM43x: early init
ARM: OMAP2+: AM43x: static mapping
ARM: OMAP2+: AM437x: SoC revision detection
ARM: OMAP2+: AM43x: soc_is support
ARM: OMAP2+: AM43x: kbuild
ARM: OMAP2+: AM43x: Kconfig
ARM: OMAP2+: separate out OMAP4 restart
ARM: AM33XX: clk: Add clock node for EHRPWM TBCLK
ARM: OMAP3: clock data: get rid of unused USB host clock aliases and dummies
ARM: OMAP2+: AM33xx: Add missing reset status info to GFX hwmod
+ Linux 3.10-rc5
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/cleanup
From Tony Lindgren:
Move omap4 over to device tree based booting. This allows us to get rid
a big pile of platform init code for things that are already handled by
device tree related code. As am33xx is already device tree based, we
can also remove the same data for am33xx.
* tag 'omap-for-v3.11/cleanup-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP4: hwmod data: Remove irq entries from mcspi, mmc hwmods
ARM: OMAP4: hwmod data: add DSS data back
ARM: OMAP4: hwmod data: Clean up the data file
ARM: AM33XX: hwmod data: irq, dma and addr info clean up
ARM: OMAP2+: Remove omap4 ocp2scp pdata
ARM: OMAP2+: Remove omap4 pdata for USB
ARM: OMAP2+: Remove omap4 pdata from hsmmc.c
ARM: OMAP2+: Remove legacy mux data for omap4
ARM: OMAP2+: Remove board-omap4panda.c
ARM: OMAP2+: Remove board-4430sdp.c
ARM: OMAP2+: Legacy support for wl12xx when booted with devicetree
Resolved merge conflict due to a fix for 3.10 (the fix is removed since
the code is no longer used -- data comes from device tree).
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/cleanup
Pulling in a set of fixes from Tony Lindgren to resolve conflicts with later
cleanup branch:
A set of small fixes for omaps for the -rc cycle:
- am7303 iva2 reset PM regression fix
- am33xx uart2 dma channel fix
- am33xx gpmc properties fix
- omap44xx rtc wake-up mux fix for nirq pins
- omap36xx clock divider restore fix
There's also one tiny non-critical .dts fix for omap5
timer pwm properties.
* tag 'omap-for-v3.10/fixes-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: omap3: clock: fix wrong container_of in clock36xx.c
ARM: dts: OMAP5: Fix missing PWM capability to timer nodes
ARM: dts: omap4-panda|sdp: Fix mux for twl6030 IRQ pin and msecure line
ARM: dts: AM33xx: Fix properties on gpmc node
arm: omap2: fix AM33xx hwmod infos for UART2
+ Linux 3.10-rc4
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/fixes-non-critical
From Tony Lindgren:
Non-critical fixes for omaps for v3.11 merge window.
* tag 'omap-for-v3.11/fixes-non-critical-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: omap-usb-host: Fix memory leaks
ARM: OMAP2+: Fix serial init for device tree based booting
arm/omap: use const char properly
ARM: OMAP2+: devices: Do not print error when dss_hdmi hwmod lookup fails
ARM: OMAP2+: devices: Do not print error when DMIC hwmod lookup fails
ARM: OMAP2+: devices: Do not print error when McPDM hwmod lookup fails
ARM: OMAP: add vdds_sdi supply for omapdss_sdi.0
ARM: OMAP: add vdds_dsi supply for omapdss_dpi.0
ARM: OMAP: fix dsi regulator names
+ Linux 3.10-rc5
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
All Transparent Huge Pages are allocated by the buddy allocator.
A compile time check is in place that fails when the order of a
transparent huge page is too large to be allocated by the buddy
allocator. Unfortunately that compile time check passes when:
HPAGE_PMD_ORDER == MAX_ORDER
( which is incorrect as the buddy allocator can only allocate
memory of order strictly less than MAX_ORDER. )
This patch updates the compile time check to fail in the above
case.
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Under x86, multiple puds can be made to reference the same bank of
huge pmds provided that they represent a full PUD_SIZE of shared
huge memory that is aligned to a PUD_SIZE boundary.
The code to share pmds does not require any architecture specific
knowledge other than the fact that pmds can be indexed, thus can
be beneficial to some other architectures.
This patch copies the huge pmd sharing (and unsharing) logic from
x86/ to mm/ and introduces a new config option to activate it:
CONFIG_ARCH_WANTS_HUGE_PMD_SHARE
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
|
|
f12dc02014 ("cgroup: mark "tasks" cgroup file as insane") and
cc5943a781 ("cgroup: mark "notify_on_release" and "release_agent"
cgroup files insane") forgot to update the changed behavior
documentation in cgroup.h. Update it.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
A css (cgroup_subsys_state) is how each cgroup is represented to a
controller. As such, it can be used in hot paths across the various
subsystems different controllers are associated with.
One of the common operations is reference counting, which up until now
has been implemented using a global atomic counter and can have
significant adverse impact on scalability. For example, css refcnt
can be gotten and put multiple times by blkcg for each IO request.
For highops configurations which try to do as much per-cpu as
possible, the global frequent refcnting can be very expensive.
In general, given the various and hugely diverse paths css's end up
being used from, we need to make it cheap and highly scalable. In its
usage, css refcnting isn't very different from module refcnting.
This patch converts css refcnting to use the recently added
percpu_ref. css_get/tryget/put() directly maps to the matching
percpu_ref operations and the deactivation logic is no longer
necessary as percpu_ref already has refcnt killing.
The only complication is that as the refcnt is per-cpu,
percpu_ref_kill() in itself doesn't ensure that further tryget
operations will fail, which we need to guarantee before invoking
->css_offline()'s. This is resolved collecting kill confirmation
using percpu_ref_kill_and_confirm() and initiating the offline phase
of destruction after all css refcnt's are confirmed to be seen as
killed on all CPUs. The previous patches already splitted destruction
into two phases, so percpu_ref_kill_and_confirm() can be hooked up
easily.
This patch removes css_refcnt() which is used for rcu dereference
sanity check in css_id(). While we can add a percpu refcnt API to ask
the same question, css_id() itself is scheduled to be removed fairly
soon, so let's not bother with it. Just drop the sanity check and use
rcu_dereference_raw() instead.
v2: - init_cgroup_css() was calling percpu_ref_init() without checking
the return value. This causes two problems - the obvious lack
of error handling and percpu_ref_init() being called from
cgroup_init_subsys() before the allocators are up, which
triggers warnings but doesn't cause actual problems as the
refcnt isn't used for roots anyway. Fix both by moving
percpu_ref_init() to cgroup_create().
- The base references were put too early by
percpu_ref_kill_and_confirm() and cgroup_offline_fn() put the
refs one extra time. This wasn't noticeable because css's go
through another RCU grace period before being freed. Update
cgroup_destroy_locked() to grab an extra reference before
killing the refcnts. This problem was noticed by Kent.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Kent Overstreet <koverstreet@google.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Alasdair G. Kergon" <agk@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Glauber Costa <glommer@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu into for-3.11
This is to receive percpu_refcount which will replace atomic_t
reference count in cgroup_subsys_state.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Split cgroup_destroy_locked() into two steps and put the latter half
into cgroup_offline_fn() which is executed from a work item. The
latter half is responsible for offlining the css's, removing the
cgroup from internal lists, and propagating release notification to
the parent. The separation is to allow using percpu refcnt for css.
Note that this allows for other cgroup operations to happen between
the first and second halves of destruction, including creating a new
cgroup with the same name. As the target cgroup is marked DEAD in the
first half and cgroup internals don't care about the names of cgroups,
this should be fine. A comment explaining this will be added by the
next patch which implements the actual percpu refcnting.
As RCU freeing is guaranteed to happen after the second step of
destruction, we can use the same work item for both. This patch
renames cgroup->free_work to ->destroy_work and uses it for both
purposes. INIT_WORK() is now performed right before queueing the work
item.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
|
|
percpu_ref_kill_and_confirm()
Implement percpu_tryget() which stops giving out references once the
percpu_ref is visible as killed. Because the refcnt is per-cpu,
different CPUs will start to see a refcnt as killed at different
points in time and tryget() may continue to succeed on subset of cpus
for a while after percpu_ref_kill() returns.
For use cases where it's necessary to know when all CPUs start to see
the refcnt as dead, percpu_ref_kill_and_confirm() is added. The new
function takes an extra argument @confirm_kill which is invoked when
the refcnt is guaranteed to be viewed as killed on all CPUs.
While this isn't the prettiest interface, it doesn't force synchronous
wait and is much safer than requiring the caller to do its own
call_rcu().
v2: Patch description rephrased to emphasize that tryget() may
continue to succeed on some CPUs after kill() returns as suggested
by Kent.
v3: Function comment in percpu_ref_kill_and_confirm() updated warning
people to not depend on the implied RCU grace period from the
confirm callback as it's an implementation detail.
Signed-off-by: Tejun Heo <tj@kernel.org>
Slightly-Grumpily-Acked-by: Kent Overstreet <koverstreet@google.com>
|
|
Add support to change the link state of VF (vPort)
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add netlink directives and ndo entry to allow for controling
VF link, which can be in one of three states:
Auto - VF link state reflects the PF link state (default)
Up - VF link state is up, traffic from VF to VF works even if
the actual PF link is down
Down - VF link state is down, no traffic from/to this VF, can be of
use while configuring the VF
Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Caught by sparse:
- __rcu: missing annotation to sd->flow_limit
- __user: direct access in cpumask_scnprintf
Also
- add endline character when printing bitmap if room in buffer
- avoid bucket overflow by reducing FLOW_LIMIT_HISTORY
The last item warrants some explanation. The hashtable buckets are
subject to overflow if FLOW_LIMIT_HISTORY is larger than or equal
to bucket size, since all packets may end up in a single bucket. The
current (rather arbitrary) history value of 256 happens to match the
buffer size (u8).
As a result, with a single flow, the first 128 packets are accepted
(correct), the second 128 packets dropped (correct) and then the
history[] array has filled, so that each subsequent new packet
causes an increment in the bucket for new_flow plus a decrement
for old_flow: a steady state.
This is fine if packets are dropped, as the steady state goes away
as soon as a mix of traffic reappears. But, because the 256th packet
overflowed the bucket to 0: no packets are dropped.
Instead of explicitly adding an overflow check, this patch changes
FLOW_LIMIT_HISTORY to never be able to overflow a single bucket.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
(first item)
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fixes from Paul McKenney:
"I must confess that this past merge window was not RCU's best showing.
This series contains three more fixes for RCU regressions:
1. A fix to __DECLARE_TRACE_RCU() that causes it to act as an
interrupt from idle rather than as a task switch from idle.
This change is needed due to the recent use of _rcuidle()
tracepoints that can be invoked from interrupt handlers as well
as from idle. Without this fix, invoking _rcuidle() tracepoints
from interrupt handlers results in splats and (more seriously)
confusion on RCU's part as to whether a given CPU is idle or not.
This confusion can in turn result in too-short grace periods and
therefore random memory corruption.
2. A fix to a subtle deadlock that could result due to RCU doing
a wakeup while holding one of its rcu_node structure's locks.
Although the probability of occurrence is low, it really
does happen. The fix, courtesy of Steven Rostedt, uses
irq_work_queue() to avoid the deadlock.
3. A fix to a silent deadlock (invisible to lockdep) due to the
interaction of timeouts posted by RCU debug code enabled by
CONFIG_PROVE_RCU_DELAY=y, grace-period initialization, and CPU
hotplug operations. This will not occur in production kernels,
but really does occur in randconfig testing. Diagnosis courtesy
of Steven Rostedt"
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
rcu: Fix deadlock with CPU hotplug, RCU GP init, and timer migration
rcu: Don't call wakeup() with rcu_node structure ->lock held
trace: Allow idle-safe tracepoints to be called from irq
|
|
Normally, percpu_ref_init() initializes and percpu_ref_kill()
initiates destruction which completes asynchronously. The
asynchronous destruction can be problematic in init failure path where
the caller wants to destroy half-constructed object - distinguishing
half-constructed objects from the usual release method can be painful
for complex objects.
This patch implements percpu_ref_cancel_init() which synchronously
destroys the percpu_ref without invoking release. To avoid
unintentional misuses, the function requires the ref to have finished
percpu_ref_init() but never used and triggers WARN otherwise.
v2: Explain the weird name and usage restriction in the function
comment.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Kent Overstreet <koverstreet@google.com>
|
|
ACCESS_ONCE() in percpu_ref_kill_rcu()
Two small changes.
* Unlike most init functions, percpu_ref_init() allocates memory and
may fail. Let's mark it with __must_check in case the caller
forgets.
* percpu_ref_kill_rcu() is unnecessarily using ACCESS_ONCE() to
dereference @ref->pcpu_count, which can be misleading. The pointer
is guaranteed to be valid and visible and can't change underneath
the function. Drop ACCESS_ONCE().
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
cgroup->count tracks the number of css_sets associated with the cgroup
and used only to verify that no css_set is associated when the cgroup
is being destroyed. It's superflous as the destruction path can
simply check whether cgroup->cset_links is empty instead.
Drop cgroup->count and check ->cset_links directly from
cgroup_destroy_locked().
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
|
|
We will add another flag indicating that the cgroup is in the process
of being killed. REMOVING / REMOVED is more difficult to distinguish
and cgroup_is_removing()/cgroup_is_removed() are a bit awkward. Also,
later percpu_ref usage will involve "kill"ing the refcnt.
s/CGRP_REMOVED/CGRP_DEAD/
s/cgroup_is_removed()/cgroup_is_dead()
This patch is purely cosmetic.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
|
|
* __css_get() isn't used by anyone. Fold it into css_get().
* Add proper function comments to all css reference functions.
This patch is purely cosmetic.
v2: Typo fix as per Li.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
|
|
cgroups and css_sets are mapped M:N and this M:N mapping is
represented by struct cg_cgroup_link which forms linked lists on both
sides. The naming around this mapping is already confusing and struct
cg_cgroup_link exacerbates the situation quite a bit.
>From cgroup side, it starts off ->css_sets and runs through
->cgrp_link_list. From css_set side, it starts off ->cg_links and
runs through ->cg_link_list. This is rather reversed as
cgrp_link_list is used to iterate css_sets and cg_link_list cgroups.
Also, this is the only place which is still using the confusing "cg"
for css_sets. This patch cleans it up a bit.
* s/cgroup->css_sets/cgroup->cset_links/
s/css_set->cg_links/css_set->cgrp_links/
s/cgroup_iter->cg_link/cgroup_iter->cset_link/
* s/cg_cgroup_link/cgrp_cset_link/
* s/cgrp_cset_link->cg/cgrp_cset_link->cset/
s/cgrp_cset_link->cgrp_link_list/cgrp_cset_link->cset_link/
s/cgrp_cset_link->cg_link_list/cgrp_cset_link->cgrp_link/
* s/init_css_set_link/init_cgrp_cset_link/
s/free_cg_links/free_cgrp_cset_links/
s/allocate_cg_links/allocate_cgrp_cset_links/
* s/cgl[12]/link[12]/ in compare_css_sets()
* s/saved_link/tmp_link/ s/tmp/tmp_links/ and a couple similar
adustments.
* Comment and whiteline adjustments.
After the changes, we have
list_for_each_entry(link, &cont->cset_links, cset_link) {
struct css_set *cset = link->cset;
instead of
list_for_each_entry(link, &cont->css_sets, cgrp_link_list) {
struct css_set *cset = link->cg;
This patch is purely cosmetic.
v2: Fix broken sentences in the patch description.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
|