|
For lockless accesses to dentries we don't have pinned we rely
(among other things) upon having an RCU delay between dropping
the last reference and actually freeing the memory.
On the other hand, for things like pipes and sockets we neither
do that kind of lockless access, nor want to deal with the
overhead of an RCU delay every time a socket gets closed.
So delay was made optional - setting DCACHE_RCUACCESS in ->d_flags
made sure it would happen. We tried to avoid setting it unless
we knew we needed it. Unfortunately, that has led to a recurring
class of bugs in which we missed the need to set it.
We only really need to avoid the delay for dentries created by
d_alloc_pseudo(), so let's not bother with trying to be smart -
just make having an RCU delay the default. The ones that do
*not* get it set the replacement flag (DCACHE_NORCU) instead,
and we'd better use that sparingly. d_alloc_pseudo() is the only
such user right now.
FWIW, the race that finally prompted this switch was
__lock_parent() of an immediate subdirectory of what's currently
the root of a disconnected tree (e.g. from an open-by-handle in
progress) racing with d_splice_alias() elsewhere picking another
alias for the same inode, either on an outright corrupted fs
image, or (in case of open-by-handle on NFS) with that
subdirectory having just been moved on the server.
It's not easy to hit, so the sky is not falling, but it's not
the first race on similar missed cases, and the logic for
setting DCACHE_RCUACCESS had gotten ridiculously convoluted.
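For illustration, a sketch of how the freeing path reads after this
change (close to the mainline code, with unrelated details elided):
static void dentry_free(struct dentry *dentry)
{
	/* ... external-name refcounting elided ... */

	/* if dentry was never visible to RCU, immediate free is OK */
	if (dentry->d_flags & DCACHE_NORCU)
		__d_free(&dentry->d_u.d_rcu);
	else
		call_rcu(&dentry->d_u.d_rcu, __d_free);
}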
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
To be able to predict the sleep duration for a CPU entering idle, it
is essential to know the expiration time of the next timer. Both the
teo and the menu cpuidle governors already use this information for
CPU idle state selection.
Moving forward, a similar prediction needs to be made for a group of
idle CPUs rather than for a single one, and the following changes
implement a new genpd governor for that purpose.
In order to support that feature, add a new function called
tick_nohz_get_next_hrtimer() that returns the next hrtimer
expiration time of a given CPU and that is to be invoked after
deciding whether or not to stop the scheduler tick on that CPU.
Make the cpuidle core call tick_nohz_get_next_hrtimer() right
before invoking the ->enter() callback provided by the cpuidle
driver for the given state and store its return value in the
per-CPU struct cpuidle_device, so as to make it available to code
outside of cpuidle.
Note that at the point when cpuidle calls tick_nohz_get_next_hrtimer(),
the governor's ->select() callback has already returned and indicated
whether or not the tick should be stopped, so in fact the value
returned by tick_nohz_get_next_hrtimer() is always the next hrtimer
expiration time for the given CPU, possibly including the tick (if
it hasn't been stopped).
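For illustration, the call site in cpuidle_enter() ends up looking
roughly like this (simplified sketch; the coupled-state path is
omitted and the field name next_hrtimer follows the description
above):
int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
		  int index)
{
	int ret;

	/*
	 * ->select() has already run, so the tick decision is final and
	 * this value is the next hrtimer expiration for this CPU.
	 */
	WRITE_ONCE(dev->next_hrtimer, tick_nohz_get_next_hrtimer());

	ret = cpuidle_enter_state(dev, drv, index);

	/* the value is stale once the CPU is back from idle */
	WRITE_ONCE(dev->next_hrtimer, 0);
	return ret;
}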
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
To enable a CPU device to be attached to a PM domain managed by genpd,
make a few changes to genpd for convenience.
To be able to quickly find out what CPUs are attached to a genpd,
which typically becomes useful from a genpd governor as subsequent
changes are about to show, add a cpumask to struct generic_pm_domain
to be updated when a CPU device gets attached to the genpd containing
that cpumask. Also, propagate the cpumask changes upwards in the
domain hierarchy to the master PM domains. This way, the cpumask for
a genpd hierarchically reflects all CPUs attached to the topology
below it.
Finally, make this an opt-in feature, to avoid having to manage CPUs
and the cpumask for a genpd that don't need it. To that end, add
a new genpd configuration bit, GENPD_FLAG_CPU_DOMAIN.
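For illustration, a genpd provider would opt in along these lines
(hedged sketch; the domain itself is hypothetical):
static struct generic_pm_domain cluster_pd = {
	.name = "cpu-cluster",			/* hypothetical domain */
	.flags = GENPD_FLAG_CPU_DOMAIN,		/* maintain the cpumask */
};

static int cluster_pd_setup(void)
{
	/* genpd manages the CPU cpumask only when the flag is set */
	return pm_genpd_init(&cluster_pd, NULL, false);
}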
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add a data pointer to the genpd_power_state struct, to allow a genpd
backend driver to store per-state specific data. To introduce the
pointer, change the way genpd deals with freeing of the corresponding
allocated data.
More precisely, clarify who is responsible for freeing the data
by adding a ->free_states() callback to the generic_pm_domain
structure. Whoever allocates the data is expected to set the
callback, to allow genpd to invoke it from genpd_remove().
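As a hedged sketch (sizes and names are hypothetical, assuming the
callback receives the states array and its length), a backend might
wire this up as follows:
static void my_free_states(struct genpd_power_state *states,
			   unsigned int state_count)
{
	/* free per-state ->data here too, if it was allocated */
	kfree(states);
}

static int my_pd_setup(struct generic_pm_domain *genpd)
{
	genpd->states = kcalloc(2, sizeof(*genpd->states), GFP_KERNEL);
	if (!genpd->states)
		return -ENOMEM;

	genpd->state_count = 2;
	/* genpd will call this from genpd_remove() */
	genpd->free_states = my_free_states;
	return 0;
}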
Co-developed-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Sphinx emits various warnings, all caused by a missing colon before a
code block:
WARNING: Block quote ends without a blank line; unexpected unindent.
ERROR: Unexpected indentation.
WARNING: Block quote ends without a blank line; unexpected unindent.
Add the colon, clearing the Sphinx warnings.
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
In certain situations, such as when returning from low power modes, the
EMIF must re-run hardware leveling to properly restore DDR3 access.
This is accomplished by introducing a new ti-emif-sram-pm call,
ti_emif_run_hw_leveling, to check if DDR3 is in use and if so, trigger
the full write and read leveling processes.
Suggested-by: Brad Griffis <bgriffis@ti.com>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
Merge branches 'consolidate.2019.04.09a', 'doc.2019.03.26b', 'fixes.2019.03.26b', 'srcu.2019.03.26b', 'stall.2019.03.26b' and 'torture.2019.03.26b' into HEAD
consolidate.2019.04.09a: Lingering RCU flavor consolidation cleanups.
doc.2019.03.26b: Documentation updates.
fixes.2019.03.26b: Miscellaneous fixes.
srcu.2019.03.26b: SRCU updates.
stall.2019.03.26b: RCU CPU stall warning updates.
torture.2019.03.26b: Torture-test updates.
|
|
We have a function to copy strings safely and we have a function to copy
strings and zero the tail of the destination (if source string is
shorter than destination buffer) but we do not have a function to do
both at once. This means developers must write this themselves if they
desire this functionality. This is a chore, and also leaves us open to
off-by-one errors unnecessarily.
Add a function that calls strscpy() then memset()s the tail to zero if
the source string is shorter than the destination buffer.
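The intended semantics can be sketched like this (hedged; modelled on
strscpy(), whose return value is the number of characters copied
excluding the NUL, or -E2BIG on truncation):
ssize_t strscpy_pad(char *dest, const char *src, size_t count)
{
	ssize_t written;

	written = strscpy(dest, src, count);
	if (written < 0 || written == count - 1)
		return written;	/* error, or no tail left to pad */

	/* zero everything after the NUL terminator */
	memset(dest + written + 1, 0, count - written - 1);

	return written;
}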
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
|
|
genphy_read_status() so far checks phydev->supported, not the actual
PHY capabilities. This can make a difference if the supported speeds
have been limited by of_set_phy_supported() or phy_set_max_speed().
It seems that this issue only affects the link partner advertisements
as displayed by ethtool. Also, this patch wouldn't apply to older
kernels because linkmode bitmaps were introduced only recently;
therefore it targets net-next.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-04-02
This series provides misc updates to the mlx5 driver.
1) Aya Levin (1): Handle event of power detection in the PCIE slot
2) Eli Britstein (6):
Some TC VLAN related updates and fixes to the previous VLAN modify action
support patchset.
Offload TC e-switch rules with egress/ingress VLAN devices
3) Max Gurtovoy (1): Fix double mutex initialization in eswitch.c
4) Tariq Toukan (3): Misc small updates
A write memory barrier is sufficient in EQ ci update
Obsolete param field holding a constant value
Unify logic of MTU boundaries
5) Tonghao Zhang (4): Misc updates to en_tc.c
Make the log friendly when decapsulation offload is not supported
Remove 'parse_attr' argument in parse_tc_fdb_actions()
Delete unnecessary setting of esw_attr->parse_attr
Return -EOPNOTSUPP when attempting to offload an unsupported action
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
They have no external callers anymore.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add version option support to the nftables "osf" expression.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
vring_create_virtqueue() allows the caller to specify via the
may_reduce_num parameter whether the vring code is allowed to
allocate a smaller ring than specified.
However, the split ring allocation code tries to allocate a
smaller ring on allocation failure regardless of what the
caller specified. This may cause trouble for e.g. virtio-pci
in legacy mode, which does not support ring resizing. (The
packed ring code does not resize in any case.)
Let's fix this by bailing out immediately in the split ring code
if the requested size cannot be allocated and may_reduce_num has
not been specified.
While at it, fix a typo in the usage instructions.
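The fix amounts to bailing out of the shrink-and-retry loop in the
split ring allocation, roughly (sketch of the relevant hunk):
	for (; num && vring_size(num, vring_align) > PAGE_SIZE; num /= 2) {
		queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
					  &dma_addr,
					  GFP_KERNEL | __GFP_NOWARN | __GFP_ZERO);
		if (queue)
			break;
		if (!may_reduce_num)
			return NULL;	/* don't silently shrink the ring */
	}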
Fixes: 2a2d1382fe9d ("virtio: Add improved queue allocation API")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
|
|
NF_NAT_NEEDED is true whenever nat support for either ipv4 or ipv6 is
enabled. Now that the af-specific nat configuration switches have been
removed, IS_ENABLED(CONFIG_NF_NAT) has the same effect.
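The conversion is mechanical, e.g.:
-#ifdef CONFIG_NF_NAT_NEEDED
+#if IS_ENABLED(CONFIG_NF_NAT)
 	/* nat-dependent code */
 #endif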
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
very little code, so it really doesn't make sense to have extra
modules or even a kconfig knob for this.
Merge them and make functionality available unconditionally.
The merge makes inet family route support trivial, so add it
here as well.
Before:
text data bss dec hex filename
835 832 0 1667 683 nft_chain_route_ipv4.ko
870 832 0 1702 6a6 nft_chain_route_ipv6.ko
111568 2556 529 114653 1bfdd nf_tables.ko
After:
text data bss dec hex filename
113133 2556 529 116218 1c5fa nf_tables.ko
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Optimize nf_inet_addr_cmp() by using a 64-bit XOR computation,
similar to ipv6_addr_equal().
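The resulting comparison reads roughly as follows (sketch; the
64-bit path mirrors ipv6_addr_equal()):
static inline bool nf_inet_addr_cmp(const union nf_inet_addr *a1,
				    const union nf_inet_addr *a2)
{
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
	const unsigned long *ul1 = (const unsigned long *)a1;
	const unsigned long *ul2 = (const unsigned long *)a2;

	/* one branch instead of four 32-bit compares */
	return ((ul1[0] ^ ul2[0]) | (ul1[1] ^ ul2[1])) == 0UL;
#else
	return a1->all[0] == a2->all[0] &&
	       a1->all[1] == a2->all[1] &&
	       a1->all[2] == a2->all[2] &&
	       a1->all[3] == a2->all[3];
#endif
}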
Signed-off-by: Yuan Linsi <yuanlinsi01@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
There is a similar helper in net/netfilter/nf_tables_api.c;
this may become a common requirement someday, so move it to
time.c.
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add support for rtc+ddr in self-refresh mode. Add additional
pm hooks for save/restore and rtc suspend/resume.
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
Prepare the rtc driver for rtc-only mode with DDR in self-refresh.
omap_rtc_power_off now has to cater to two features:
1) RTC plus DDR in self-refresh is a power-saving mode wherein the
entire system, including the different voltage rails from the PMIC,
is shut down except for the ones feeding the RTC and DDR. DDR is
kept in self-refresh, hence its contents are preserved. RTC ALARM2
is connected to the PMIC_EN line; once ALARM2 is triggered, we enter
the mode with DDR in self-refresh and the RTC ticking. After a
predetermined time, an RTC ALARM1 triggers, waking up the system[1],
and control goes to the bootloader. The bootloader then checks the
RTC scratchpad registers to confirm it was an rtc_only wakeup and
follows a different path: it configures bare minimal clocks for DDR,
jumps to the resume address held in another RTC scratchpad register,
and transfers control to the kernel. The kernel then restores the
saved context. omap_rtc_power_off_program does the ALARM2
programming part.
[1] http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf Page 2884
2) Power-off: this is the usual poweroff mode. omap_rtc_power_off
calls the above omap_rtc_power_off_program function and, in addition,
programs the OMAP_RTC_PMIC_REG for any external wakeups for the
PMIC, like the pushbutton, and shuts off the PMIC.
Hence the split in omap_rtc_power_off.
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
[tony@atomide.com: folded in a fix for compile warning]
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
After commit a297569fe00a ("net/udp: do not touch skb->peeked unless
really needed") the 'peeked' argument of __skb_try_recv_datagram()
and friends is always equal to !!'flags & MSG_PEEK'.
Since this argument really carries boolean information, and the
callers already have 'flags & MSG_PEEK' handy, we can remove it and
clean up the code a bit.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Most dma_map_ops implementations already had some issues with a NULL
device, or simply crashed if one was fed to them. Now that we have
cleaned up all the obvious offenders, we can stop pretending that we
support this mode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Commit 6dc4f100c175 ("block: allow bio_for_each_segment_all() to
iterate over multi-page bvec") changes bio_for_each_segment_all()
to use for-inside-for.
This breaks every bio_for_each_segment_all() call that errors out
of the loop via 'break', since 'break' can now only exit the inner
loop.
Fix this issue by implementing bio_for_each_segment_all() via a
single 'for' loop; the logic is now very similar to the normal
bvec iterator.
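To see why 'break' misbehaved, here is a self-contained illustration
(a toy macro, not the kernel one) of a for-inside-for iterator:
#include <stdio.h>

/* 'break' in the body only terminates the inner loop. */
#define for_each_cell(v, grid, rows, cols)				\
	for (int r = 0; r < (rows); r++)				\
		for (int c = 0; c < (cols) && ((v) = (grid)[r][c], 1); c++)

int main(void)
{
	int grid[2][2] = { { 1, 2 }, { 3, 4 } };
	int v;

	for_each_cell(v, grid, 2, 2) {
		if (v == 2)
			break;		/* meant to stop the whole walk */
		printf("%d ", v);
	}
	printf("\n");			/* prints "1 3 4", not "1" */
	return 0;
}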
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: linux-btrfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: Omar Sandoval <osandov@fb.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-and-Tested-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Fixes: 6dc4f100c175 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Sometimes we want to define a tracepoint as a do-nothing function.
So I introduce TRACE_EVENT_NOP, DECLARE_EVENT_CLASS_NOP and
DEFINE_EVENT_NOP for this kind of usage.
Link: http://lkml.kernel.org/r/1553602391-11926-2-git-send-email-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Linux 5.1-rc3
|
|
mmiowb() is now implied by spin_unlock() on architectures that require
it, so there is no reason to call it from driver code. This patch was
generated using coccinelle:
@mmiowb@
@@
- mmiowb();
and invoked as:
$ for d in drivers include/linux/qed sound; do \
spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done
NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
spin_unlock(). However, pairing each mmiowb() removal in this patch with
the corresponding call to spin_unlock() is not at all trivial, so there
is a small chance that this change may regress any drivers incorrectly
relying on mmiowb() to order MMIO writes between CPUs using lock-free
synchronisation. If you've ended up bisecting to this commit, you can
reintroduce the mmiowb() calls using wmb() instead, which should restore
the old behaviour on all architectures other than some esoteric ia64
systems.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Removing explicit calls to mmiowb() from driver code means that we must
now call into the generic mmiowb_spin_{lock,unlock}() functions from the
core spinlock code. In order to elide barriers following critical
sections without any I/O writes, we also hook into the asm-generic I/O
routines.
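A simplified sketch of the asm-generic hooks (hedged; the real
helpers carry a few more details):
static inline void mmiowb_set_pending(void)
{
	/* called from the I/O write accessors */
	struct mmiowb_state *ms = __mmiowb_state();

	ms->mmiowb_pending = ms->nesting_count;
}

static inline void mmiowb_spin_lock(void)
{
	__mmiowb_state()->nesting_count++;
}

static inline void mmiowb_spin_unlock(void)
{
	struct mmiowb_state *ms = __mmiowb_state();

	/* barrier only if an MMIO write happened inside the section */
	if (unlikely(ms->mmiowb_pending)) {
		ms->mmiowb_pending = 0;
		mmiowb();
	}

	ms->nesting_count--;
}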
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
While the cpuinfo.max_freq value doesn't really matter for
intel_pstate in the active mode, in the passive mode it is used by
governors as the maximum physical frequency of the CPU and the
results of governor computations generally depend on it. Also it
is made available to user space via sysfs and it should match the
current HW configuration.
For this reason, make intel_pstate update cpuinfo.max_freq for all
CPUs if it detects a global change of turbo frequency settings from
"disable" to "enable" or the other way associated with a _PPC change
notification from the platform firmware.
Note that policy_is_inactive(), cpufreq_cpu_acquire(),
cpufreq_cpu_release(), and cpufreq_set_policy() need to be made
available to it for this purpose.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759
Reported-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
The direction inversion semantics are now handled by simply
using the available in/out registers; there is no need to keep
track of inversion semantics explicitly anymore.
Reviewed-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Jan Kotas <jank@cadence.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The symbol that is being used in the #if/#endif block is not the one
which is mentioned at the bottom.
Fixes: 93af53b8633c ("nand: omap2: Remove horrible ifdefs to fix module probe")
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
nand_device embeds a nand_ecc_req object which contains the minimum
strength and step-size required by the NAND device.
Drop the chip->ecc_{strength,step}_ds fields and use
chip->base.eccreq.{strength,step_size} instead.
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
The same information is provided by nanddev_ntargets().
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
The target size can now be returned by nanddev_get_targetsize(). Get
rid of the chip->chipsize field and use this helper instead.
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
Now that we inherit from nand_device, we can use
nand_device->memorg.bits_per_cell instead of having our own field at
the nand_chip level.
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
nanddev_mtd_max_bad_blocks() is implemented by the generic NAND layer
and is already doing what we need. Reuse this function instead of
having our own implementation.
While at it, get rid of the ->max_bb_per_die and ->blocks_per_die
fields which are now unused.
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
Looking at the field names it's hard to tell what ->data_buf, ->pagebuf
and ->pagebuf_bitflips are for. Clarify that by moving those fields
into a sub-struct named pagecache.
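The resulting layout reads roughly like this (sketch; the exact
placement of ->data_buf may differ):
struct nand_chip {
	/* ... */
	u8 *data_buf;

	struct {
		unsigned int bitflips;	/* was ->pagebuf_bitflips */
		int page;		/* was ->pagebuf, -1 = invalid */
	} pagecache;
	/* ... */
};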
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
We plan to move cache-related fields to a pagecache struct in
nand_chip, but some drivers access ->pagebuf directly to invalidate
the cache before they start using ->data_buf.
Let's provide a helper that returns a pointer to ->data_buf after
invalidating the cache.
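A hedged sketch of such a helper (assuming it is called
nand_get_data_buf()):
static inline u8 *nand_get_data_buf(struct nand_chip *chip)
{
	/* invalidate the page cache before handing out the buffer */
	chip->pagebuf = -1;

	return chip->data_buf;
}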
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
The generic NAND layer provides an abstraction of NAND devices no
matter the bus that is used to communicate with the chip. Basing the
raw NAND core on this generic layer should avoid duplication of
common operations, like iterating over all pages/blocks for MTD
IO/erase operations.
In order to re-use this layer, we must first inherit from nand_device
and then initialize the nand_device struct appropriately. This patch
takes care of the former.
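Concretely, the inheritance is just embedding the base object
(sketch):
struct nand_chip {
	struct nand_device base;	/* first, for container_of() */
	/* ... raw NAND specific fields ... */
};

static inline struct nand_chip *mtd_to_nand(struct mtd_info *mtd)
{
	return container_of(mtd_to_nanddev(mtd), struct nand_chip, base);
}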
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
Use the nand_to_mtd() helper to access chip->mtd as done everywhere
else.
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
Will be used by the raw NAND framework.
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
Some drivers in the raw NAND framework seem to need this helper, so
let's just add it instead of open-coding the logic.
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
NAND datasheets usually give the maximum number of bad blocks per LUN,
and this number can be used to help upper layers decide how many
blocks they should reserve for bad block handling.
Add a max_bad_eraseblocks_per_lun field to the nand_memory_organization
struct and update the NAND_MEMORG() macro (and its users) accordingly.
We also provide a default mtd->_max_bad_blocks() implementation.
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
|
|
This patch creates an optional set_cs_timing method for SPI master
controllers to implement configuring CS timing, if applicable.
It also creates a spi_cs_timing accessor for SPI clients to use for
requesting that the SPI master controller configure the
device-requested CS setup time, hold time and inactive delay.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Move all of the code doing struct spi_bitbang initialization into its
own function, so that it can be paired with devm_spi_register_master()
in order to avoid having to call spi_bitbang_stop() explicitly.
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Chris Healy <cphealy@gmail.com>
Cc: linux-spi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The current definition and implementation of the SEV_GET_ID command
does not provide the length of the unique ID returned by the firmware.
As per the firmware specification, the firmware may return an ID
length that is not restricted to 64 bytes as assumed by the SEV_GET_ID
command.
Introduce the SEV_GET_ID2 command to overcome the SEV_GET_ID
limitations. Deprecate SEV_GET_ID in favor of SEV_GET_ID2.
At the same time, update the SEV API web link.
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Gary Hook <gary.hook@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Nathaniel McCallum <npmccallum@redhat.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Native bit_spin_locks are not tracked by lockdep.
The bit_spin_locks used for rhashtable buckets are local
to the rhashtable implementation, so there is little opportunity
for the sort of misuse that lockdep might detect.
However, locks are held while a hash function or compare
function is called, and if one of these took a lock,
misbehaviour would be possible.
As it is quite easy to add lockdep support, this unlikely
possibility seems to be justification enough.
So create a lockdep class for bucket bit_spin_lock and attach
through a lockdep_map in each bucket_table.
Without the 'nested' annotation in rhashtable_rehash_one(), lockdep
correctly reports a possible problem as this lock is taken
while another bucket lock (in another table) is held. This
confirms that the added support works.
With the correct nested annotation in place, lockdep reports
no problems.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch changes rhashtables to use a bit_spin_lock on BIT(1) of the
bucket pointer to lock the hash chain for that bucket.
The benefits of a bit spin_lock are:
- no need to allocate a separate array of locks.
- no need to have a configuration option to guide the
choice of the size of this array
- locking cost is often a single test-and-set in a cache line
that will have to be loaded anyway. When inserting at, or removing
from, the head of the chain, the unlock is free - writing the new
address in the bucket head implicitly clears the lock bit.
For __rhashtable_insert_fast() we ensure this always happens
when adding a new key.
- even when locking costs two updates (lock and unlock), they are
in a cacheline that needs to be read anyway.
The cost of using a bit spin_lock is a little bit of code complexity,
which I think is quite manageable.
Bit spin_locks are sometimes inappropriate because they are not fair -
if multiple CPUs repeatedly contend on the same lock, one CPU can
easily be starved. This is not a credible situation with rhashtable.
Multiple CPUs may want to repeatedly add or remove objects, but they
will typically do so at different buckets, so they will attempt to
acquire different locks.
As we have more bit-locks than we previously had spinlocks (by at
least a factor of two) we can expect slightly less contention to
go with the slightly better cache behavior and reduced memory
consumption.
To enhance type checking, a new struct is introduced to represent the
pointer plus lock-bit
that is stored in the bucket-table. This is "struct rhash_lock_head"
and is empty. A pointer to this needs to be cast to either an
unsigned long, or a "struct rhash_head *" to be useful.
Variables of this type are most often called "bkt".
Previously "pprev" would sometimes point to a bucket, and sometimes a
->next pointer in an rhash_head. As these are now different types,
pprev is NULL when it would have pointed to the bucket. In that case,
'blk' is used, together with correct locking protocol.
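The locking itself can be sketched as follows (simplified; the real
helpers take additional arguments):
static inline void rht_lock(struct rhash_lock_head **bkt)
{
	local_bh_disable();
	bit_spin_lock(1, (unsigned long *)bkt);	/* BIT(1) of the pointer */
}

static inline void rht_unlock(struct rhash_lock_head **bkt)
{
	bit_spin_unlock(1, (unsigned long *)bkt);
	local_bh_enable();
}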
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rather than returning a pointer to a static nulls, rht_bucket_var()
now returns NULL if the bucket doesn't exist.
This will make the next patch, which stores a bitlock in the
bucket pointer, somewhat cleaner.
This change involves introducing __rht_bucket_nested(), which is
like rht_bucket_nested() but doesn't provide the static nulls,
and changing rht_bucket_nested() to call this and possibly
provide a static nulls - as is still needed for the non-var case.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current handling of MSR_IA32_ENERGY_PERF_BIAS in the kernel is
problematic, because it may cause changes made by user space to that
MSR (with the help of the x86_energy_perf_policy tool, for example)
to be lost every time a CPU goes offline and then back online as well
as during system-wide power management transitions into sleep states
and back into the working state.
The first problem is that if the current EPB value for a CPU going
online is 0 ('performance'), the kernel will change it to 6 ('normal')
regardless of whether or not this is the first bring-up of that CPU.
That also happens during system-wide resume from sleep states
(including, but not limited to, hibernation). However, the EPB may
have been adjusted by user space this way and the kernel should not
blindly override that setting.
The second problem is that if the platform firmware resets the EPB
values for any CPUs during system-wide resume from a sleep state,
the kernel will not restore their previous EPB values that may
have been set by user space before the preceding system-wide
suspend transition. Again, that behavior may at least be confusing
from the user space perspective.
In order to address these issues, rework the handling of
MSR_IA32_ENERGY_PERF_BIAS so that the EPB value is saved on CPU
offline and restored on CPU online as well as (for the boot CPU)
during the syscore stages of system-wide suspend and resume
transitions, respectively.
However, retain the policy by which the EPB is set to 6 ('normal')
on the first bring-up of each CPU if its initial value is 0, based
on the observation that 0 may mean 'not initialized' just as well as
'performance' in that case.
While at it, move the MSR_IA32_ENERGY_PERF_BIAS handling code into
a separate file and document it in Documentation/admin-guide.
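A hedged sketch of the save/restore flow (simplified; constants and
field names are approximate):
static DEFINE_PER_CPU(u8, saved_epb);

#define EPB_MASK	0x0fULL
#define EPB_SAVED	0x10ULL		/* marks a valid saved value */

static void intel_epb_save(void)
{
	u64 epb;

	rdmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb);
	/* preserve the (possibly user-set) value across offline/suspend */
	this_cpu_write(saved_epb, (epb & EPB_MASK) | EPB_SAVED);
}

static void intel_epb_restore(void)
{
	u64 val = this_cpu_read(saved_epb), epb;

	rdmsrl(MSR_IA32_ENERGY_PERF_BIAS, epb);
	if (val & EPB_SAVED) {
		val &= EPB_MASK;
	} else if (!(epb & EPB_MASK)) {
		/* first bring-up and EPB is 0: assume 'not initialized' */
		val = 6;	/* 'normal' */
	} else {
		return;		/* nothing saved, EPB non-zero: keep it */
	}
	wrmsrl(MSR_IA32_ENERGY_PERF_BIAS, (epb & ~EPB_MASK) | val);
}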
Fixes: abe48b108247 (x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS)
Fixes: b51ef52df71c (x86/cpu: Restore MSR_IA32_ENERGY_PERF_BIAS after resume)
Reported-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Remove the LM3532 backlight driver references from the ti-lmu
code as dedicated driver support is available.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Acked-for-MFD-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
|