Age | Commit message (Collapse) | Author |
|
handle_percpu_devid_fasteoi_ipi() has no more users, and
handle_percpu_devid_irq() can do all that it was supposed to do. Get rid of
it.
This reverts commit c5e5ec033c4ab25c53f1fd217849e75deb0bf7bf.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201109094121.29975-6-valentin.schneider@arm.com
|
|
Since 5.10-rc1 i.MX is a devicetree-only platform, so simplify the code
by removing the unused non-DT support.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20201210212748.5849-1-festevam@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Currently there is no way to the sensors to directly call an ops in
interrupt mode without calling thermal_zone_device_update assuming all
the trip points are defined.
A sensor may want to do something special if a trip point is hot or
critical.
This patch adds the critical and hot ops to the thermal zone device,
so a sensor can directly invoke them or let the thermal framework to
call the sensor specific ones.
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20201210121514.25760-2-daniel.lezcano@linaro.org
|
|
Remove old power model and use new Energy Model to calculate the power
budget. It drops static + dynamic power calculations and power table
in order to use Energy Model performance domain data. This model
should be easy to use and could find more users. It is also less
complicated to setup the needed structures.
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201210143014.24685-5-lukasz.luba@arm.com
|
|
The Energy Model (EM) framework supports devices such as Devfreq. Create
new registration function which automatically register EM for the thermal
devfreq_cooling devices. This patch prepares the code for coming changes
which are going to replace old power model with the new EM.
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201210143014.24685-4-lukasz.luba@arm.com
|
|
A new field (20MHz PSD - 1 byte) was added to the RNR TBTT info field.
Adjust the expected TBTT info length accordingly.
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20201129172929.b503adccce6a.Ie684e1d3039c111bf2d521bf762aaec3f7a24d2e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This extends the support for drivers that rebuild IEs in the
FW (same as with HT/VHT/HE).
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20201129172929.4012647275f3.I1a93ae71c57ef0b6f58f99d47fce919d19d65ff0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The WLAN device may exist yet not be usable. This can happen
when the WLAN device is controllable by both the host and
some platform internal component.
We need some arbritration that is vendor specific, but when
the device is not available for the host, we need to reflect
this state towards the user space.
Add a reason field to the rfkill object (and event) so that
userspace can know why the device is in rfkill: because some
other platform component currently owns the device, or
because the actual hw rfkill signal is asserted.
Capable userspace can now determine the reason for the rfkill
and possibly do some negotiation on a side band channel using
a proprietary protocol to gain ownership on the device in case
the device is owned by some other component. When the host
gains ownership on the device, the kernel can remove the
RFKILL_HARD_BLOCK_NOT_OWNER reason and the hw rfkill state
will be off. Then, the userspace can bring the device up and
start normal operation.
The rfkill_event structure is enlarged to include the additional
byte, it is now 9 bytes long. Old user space will ask to read
only 8 bytes so that the kernel can know not to feed them with
more data. When the user space writes 8 bytes, new kernels will
just read what is present in the file descriptor. This new byte
is read only from the userspace standpoint anyway.
If a new user space uses an old kernel, it'll ask to read 9 bytes
but will get only 8, and it'll know that it didn't get the new
state. When it'll write 9 bytes, the kernel will again ignore
this new byte which is read only from the userspace standpoint.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://lore.kernel.org/r/20201104134641.28816-1-emmanuel.grumbach@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
fsnotify_parent() used to send two separate events to backends when a
parent inode is watching children and the child inode is also watching.
In an attempt to avoid duplicate events in fanotify, we unified the two
backend callbacks to a single callback and handled the reporting of the
two separate events for the relevant backends (inotify and dnotify).
However the handling is buggy and can result in inotify and dnotify
listeners receiving events of the type they never asked for or spurious
events.
The problem is the unified event callback with two inode marks (parent and
child) is called when any of the parent and child inodes are watched and
interested in the event, but the parent inode's mark that is interested
in the event on the child is not necessarily the one we are currently
reporting to (it could belong to a different group).
So before reporting the parent or child event flavor to backend we need
to check that the mark is really interested in that event flavor.
The semantics of INODE and CHILD marks were hard to follow and made the
logic more complicated than it should have been. Replace it with INODE
and PARENT marks semantics to hopefully make the logic more clear.
Thanks to Hugh Dickins for spotting a bug in the earlier version of this
patch.
Fixes: 497b0c5a7c06 ("fsnotify: send event to parent and child with single callback")
CC: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201202120713.702387-4-amir73il@gmail.com
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
TCPM state machine needs 20-25ms to enter the ErrorRecovery state after
tPSSourceOn timer timeouts. Change the timer from max 480ms to 450ms to
ensure that the timer complies with the Spec. In order to keep the
flexibility for other usecases using tPSSourceOn, add another timer only
for PR_SWAP.
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Kyle Tso <kyletso@google.com>
Signed-off-by: Will McVicker <willmcvicker@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20201210160521.3417426-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The current RTC set_offset_nsec value is not really intuitive to
understand.
tsched twrite(t2.tv_sec - 1) t2 (seconds increment)
The offset is calculated from twrite based on the assumption that t2 -
twrite == 1s. That means for the MC146818 RTC the offset needs to be
negative so that the write happens 500ms before t2.
It's easier to understand when the whole calculation is based on t2. That
avoids negative offsets and the meaning is obvious:
t2 - twrite: The time defined by the chip when seconds increment
after the write.
twrite - tsched: The time for the transport to the point where the chip
is updated.
==> set_offset_nsec = t2 - tsched
ttransport = twrite - tsched
tRTCinc = t2 - twrite
==> set_offset_nsec = ttransport + tRTCinc
tRTCinc is a chip property and can be obtained from the data sheet.
ttransport depends on how the RTC is connected. It is close to 0 for
directly accessible RTCs. For RTCs behind a slow bus, e.g. i2c, it's the
time required to send the update over the bus. This can be estimated or
even calibrated, but that's a different problem.
Adjust the implementation and update comments accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.263204937@linutronix.de
|
|
rtc_set_ntp_time() is not really RTC functionality as the code is just a
user of RTC. Move it into the NTP code which allows further cleanups.
Requested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.166871172@linutronix.de
|
|
Miroslav reported that the periodic RTC synchronization in the NTP code
fails more often than not to hit the specified update window.
The reason is that the code uses delayed_work to schedule the update which
needs to be in thread context as the underlying RTC might be connected via
a slow bus, e.g. I2C. In the update function it verifies whether the
current time is correct vs. the requirements of the underlying RTC.
But delayed_work is using the timer wheel for scheduling which is
inaccurate by design. Depending on the distance to the expiry the wheel
gets less granular to allow batching and to avoid the cascading of the
original timer wheel. See 500462a9de65 ("timers: Switch to a non-cascading
wheel") and the code for further details.
The code already deals with this by splitting the 660 seconds period into a
long 659 seconds timer and then retrying with a smaller delta.
But looking at the actual granularities of the timer wheel (which depend on
the HZ configuration) the 659 seconds timer ends up in an outer wheel level
and is affected by a worst case granularity of:
HZ Granularity
1000 32s
250 16s
100 40s
So the initial timer can be already off by max 12.5% which is not a big
issue as the period of the sync is defined as ~11 minutes.
The fine grained second attempt schedules to the desired update point with
a timer expiring less than a second from now. Depending on the actual delta
and the HZ setting even the second attempt can end up in outer wheel levels
which have a large enough granularity to make the correctness check fail.
As this is a fundamental property of the timer wheel there is no way to
make this more accurate short of iterating in one jiffies steps towards the
update point.
Switch it to an hrtimer instead which schedules the actual update work. The
hrtimer will expire precisely (max 1 jiffie delay when high resolution
timers are not available). The actual scheduling delay of the work is the
same as before.
The update is triggered from do_adjtimex() which is a bit racy but not much
more racy than it was before:
if (ntp_synced())
queue_delayed_work(system_power_efficient_wq, &sync_work, 0);
which is racy when the work is currently executed and has not managed to
reschedule itself.
This becomes now:
if (ntp_synced() && !hrtimer_is_queued(&sync_hrtimer))
queue_work(system_power_efficient_wq, &sync_work, 0);
which is racy when the hrtimer has expired and the work is currently
executed and has not yet managed to rearm the hrtimer.
Not a big problem as it just schedules work for nothing.
The new implementation has a safe guard in place to catch the case where
the hrtimer is queued on entry to the work function and avoids an extra
update attempt of the RTC that way.
Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Miroslav Lichvar <mlichvar@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20201206220542.062910520@linutronix.de
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull namespaced fscaps fix from James Morris:
"Fix namespaced fscaps when !CONFIG_SECURITY (Serge Hallyn)"
* tag 'fixes-v5.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
[SECURITY] fix namespaced fscaps when !CONFIG_SECURITY
|
|
Pull NFS client fixes from Anna Schumaker:
"Here are a handful more bugfixes for 5.10.
Unfortunately, we found some problems with the new READ_PLUS operation
that aren't easy to fix. We've decided to disable this codepath
through a Kconfig option for now, but a series of patches going into
5.11 will clean up the code and fix the issues at the same time. This
seemed like the best way to go about it.
Summary:
- Fix array overflow when flexfiles mirroring is enabled
- Fix rpcrdma_inline_fixup() crash with new LISTXATTRS
- Fix 5 second delay when doing inter-server copy
- Disable READ_PLUS by default"
* tag 'nfs-for-5.10-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Disable READ_PLUS by default
NFSv4.2: Fix 5 seconds delay when doing inter server copy
NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS operation
pNFS/flexfiles: Fix array overflow when flexfiles mirroring is enabled
|
|
Pull networking fixes from David Miller:
1) IPsec compat fixes, from Dmitry Safonov.
2) Fix memory leak in xfrm_user_policy(). Fix from Yu Kuai.
3) Fix polling in xsk sockets by using sk_poll_wait() instead of
datagram_poll() which keys off of sk_wmem_alloc and such which xsk
sockets do not update. From Xuan Zhuo.
4) Missing init of rekey_data in cfgh80211, from Sara Sharon.
5) Fix destroy of timer before init, from Davide Caratti.
6) Missing CRYPTO_CRC32 selects in ethernet driver Kconfigs, from Arnd
Bergmann.
7) Missing error return in rtm_to_fib_config() switch case, from Zhang
Changzhong.
8) Fix some src/dest address handling in vrf and add a testcase. From
Stephen Suryaputra.
9) Fix multicast handling in Seville switches driven by mscc-ocelot
driver. From Vladimir Oltean.
10) Fix proto value passed to skb delivery demux in udp, from Xin Long.
11) HW pkt counters not reported correctly in enetc driver, from Claudiu
Manoil.
12) Fix deadlock in bridge, from Joseph Huang.
13) Missing of_node_pur() in dpaa2 driver, fromn Christophe JAILLET.
14) Fix pid fetching in bpftool when there are a lot of results, from
Andrii Nakryiko.
15) Fix long timeouts in nft_dynset, from Pablo Neira Ayuso.
16) Various stymmac fixes, from Fugang Duan.
17) Fix null deref in tipc, from Cengiz Can.
18) When mss is biog, coose more resonable rcvq_space in tcp, fromn Eric
Dumazet.
19) Revert a geneve change that likely isnt necessary, from Jakub
Kicinski.
20) Avoid premature rx buffer reuse in various Intel driversm from Björn
Töpel.
21) retain EcT bits during TIS reflection in tcp, from Wei Wang.
22) Fix Tso deferral wrt. cwnd limiting in tcp, from Neal Cardwell.
23) MPLS_OPT_LSE_LABEL attribute is 342 ot 8 bits, from Guillaume Nault
24) Fix propagation of 32-bit signed bounds in bpf verifier and add test
cases, from Alexei Starovoitov.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
selftests: fix poll error in udpgro.sh
selftests/bpf: Fix "dubious pointer arithmetic" test
selftests/bpf: Fix array access with signed variable test
selftests/bpf: Add test for signed 32-bit bound check bug
bpf: Fix propagation of 32-bit signed bounds from 64-bit bounds.
MAINTAINERS: Add entry for Marvell Prestera Ethernet Switch driver
net: sched: Fix dump of MPLS_OPT_LSE_LABEL attribute in cls_flower
net/mlx4_en: Handle TX error CQE
net/mlx4_en: Avoid scheduling restart task if it is already running
tcp: fix cwnd-limited bug for TSO deferral where we send nothing
net: flow_offload: Fix memory leak for indirect flow block
tcp: Retain ECT bits for tos reflection
ethtool: fix stack overflow in ethnl_parse_bitset()
e1000e: fix S0ix flow to allow S0i3.2 subset entry
ice: avoid premature Rx buffer reuse
ixgbe: avoid premature Rx buffer reuse
i40e: avoid premature Rx buffer reuse
igb: avoid transmit queue timeout in xdp path
igb: use xdp_do_flush
igb: skb add metasize for xdp
...
|
|
If generic_drop_inode() returns true, it means iput_final() can evict
this inode regardless of whether it is dirty or not. If we check
I_DONTCACHE in generic_drop_inode(), any inode with this bit set will be
evicted unconditionally. This is not the desired behavior because
I_DONTCACHE only means the inode shouldn't be cached on the LRU list.
As for whether we need to evict this inode, this is what
generic_drop_inode() should do. This patch corrects the usage of
I_DONTCACHE.
This patch was proposed in [1].
[1]: https://lore.kernel.org/linux-fsdevel/20200831003407.GE12096@dread.disaster.area/
Fixes: dae2f8ed7992 ("fs: Lift XFS_IDONTCACHE to the VFS layer")
Signed-off-by: Hao Li <lihao2018.fnst@cn.fujitsu.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add the API for getting the domain from a vfio group. This could be used
by the physical device drivers which rely on the vfio/mdev framework for
mediated device user level access. The typical use case like below:
unsigned int pasid;
struct vfio_group *vfio_group;
struct iommu_domain *iommu_domain;
struct device *dev = mdev_dev(mdev);
struct device *iommu_device = mdev_get_iommu_device(dev);
if (!iommu_device ||
!iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
return -EINVAL;
vfio_group = vfio_group_get_external_user_from_dev(dev);
if (IS_ERR_OR_NULL(vfio_group))
return -EFAULT;
iommu_domain = vfio_group_iommu_domain(vfio_group);
if (IS_ERR_OR_NULL(iommu_domain)) {
vfio_group_put_external_user(vfio_group);
return -EFAULT;
}
pasid = iommu_aux_get_pasid(iommu_domain, iommu_device);
if (pasid < 0) {
vfio_group_put_external_user(vfio_group);
return -EFAULT;
}
/* Program device context with pasid value. */
...
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
|
|
i.MX is a devicetree-only platform now and the existing platform data
support in this driver was only useful for old non-devicetree platforms.
Get rid of the platform data support since it is no longer used.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20201110121908.19400-1-festevam@gmail.com
|
|
Some identifiers have different names between their prototypes
and the kernel-doc markup.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/9ed47a57d12c40e73a9b01612ee119d39baa6236.1603469755.git.mchehab+huawei@kernel.org
|
|
Add the logic in the NAND core to find the right ECC engine depending
on the NAND chip requirements and the user desires. Right now, the
choice may be made between (more will come):
* software Hamming
* software BCH
* on-die (SPI-NAND devices only)
Once the ECC engine has been found, the ECC engine must be
configured.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20201001102014.20100-2-miquel.raynal@bootlin.com
|
|
Before making use of the ECC engines, we must retrieve them. Add the
necessary boilerplate.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200930154109.3922-5-miquel.raynal@bootlin.com
|
|
Make use of the existing functions taken from the SPI-NAND core to
instantiate an on-die ECC engine specific to the SPI-NAND core. The
next step will be to tweak the core to use this object instead of
calling the helpers directly.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200930154109.3922-4-miquel.raynal@bootlin.com
|
|
Before making use of the ECC engines, we must retrieve them. Add the
boilerplate for the ones already available: software engines (Hamming
and BCH).
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-21-miquel.raynal@bootlin.com
|
|
Let's continue introducing the generic ECC engine abstraction in the
NAND subsystem by instantiating a second ECC engine: software
Hamming.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-20-miquel.raynal@bootlin.com
|
|
There is no reason to always embed the software Hamming ECC engine
implementation. By default it is (with raw NAND), but we can let the
user decide.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-19-miquel.raynal@bootlin.com
|
|
Most of the includes are simply useless, drop them.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-18-miquel.raynal@bootlin.com
|
|
This code is meant to be reused by the SPI-NAND core. Now that the
driver has been cleaned and reorganized, use a generic ECC engine
object to store the driver's data instead of accessing members of the
nand_chip structure. This means adding proper init/cleanup helpers.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-17-miquel.raynal@bootlin.com
|
|
Prefix by ecc_sw_hamming_ the functions which should be internal only
but are exported for "raw" operations.
Prefix by nand_ecc_sw_hamming_ the other functions which will be used
in the context of the declaration of an Hamming proper ECC engine
object.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-16-miquel.raynal@bootlin.com
|
|
The include file pretends being the header for "ECC algorithm", while
it is just the header for the Hamming implementation. Make this clear
by rewording the sentence.
Do the same with the module description.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-13-miquel.raynal@bootlin.com
|
|
Hamming ECC code might be later re-used by the SPI NAND layer.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-12-miquel.raynal@bootlin.com
|
|
nand_ecc_ctrl embeds a private pointer which only has a meaning in the
sunxi driver. This structure will soon be deprecated, but as this
field is actually not needed, let's just drop it.
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-11-miquel.raynal@bootlin.com
|
|
Let's continue introducing the generic ECC engine abstraction in the
NAND subsystem by instantiating a first ECC engine: the software
BCH one.
While at it, make a very tidy ecc_sw_bch_init() function and move all
the sanity checks and user input management in
nand_ecc_sw_bch_init_ctx(). This second helper will be called from the
raw RAND core.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200929230124.31491-10-miquel.raynal@bootlin.com
|
|
Add ECAM-related constants to provide a set of standard constants
defining memory address shift values to the byte-level address that can
be used to access the PCI Express Configuration Space, and then move
native PCI Express controller drivers to use the newly introduced
definitions retiring driver-specific ones.
Refactor pci_ecam_map_bus() function to use newly added constants so
that limits to the bus, device function and offset (now limited to 4K as
per the specification) are in place to prevent the defective or
malicious caller from supplying incorrect configuration offset and thus
targeting the wrong device when accessing extended configuration space.
This refactor also allows for the ".bus_shift" initialisers to be
dropped when the user is not using a custom value as a default value
will be used as per the PCI Express Specification.
Thanks to Qian Cai <qcai@redhat.com>, Michael Walle <michael@walle.cc>,
and Vladimir Oltean <olteanv@gmail.com> for reporting a pci_ecam_create()
issue with .bus_shift and to Vladimir for proposing the fix.
[bhelgaas: incorporate Vladimir's fix, update commit log]
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20201129230743.3006978-2-kw@linux.com
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
This change adds a new kind of core dump mechanism which instead of dumping
entire program segments of the firmware, dumps sections of the remoteproc
memory which are sufficient to allow debugging the firmware. This function
thus uses section headers instead of program headers during creation of the
core dump elf.
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Co-developed-by: Rishabh Bhatnagar <rishabhb@codeaurora.org>
Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org>
Signed-off-by: Siddharth Gupta <sidgup@codeaurora.org>
Link: https://lore.kernel.org/r/1605819935-10726-3-git-send-email-sidgup@codeaurora.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
Each remoteproc might have different requirements for coredumps and might
want to choose the type of dumps it wants to collect. This change allows
remoteproc drivers to specify their own custom dump function to be executed
in place of rproc_coredump. If the coredump op is not specified by the
remoteproc driver it will be set to rproc_coredump by default.
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Siddharth Gupta <sidgup@codeaurora.org>
Link: https://lore.kernel.org/r/1605819935-10726-2-git-send-email-sidgup@codeaurora.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
Recently syzbot reported[0] that there is a deadlock amongst the users
of exec_update_mutex. The problematic lock ordering found by lockdep
was:
perf_event_open (exec_update_mutex -> ovl_i_mutex)
chown (ovl_i_mutex -> sb_writes)
sendfile (sb_writes -> p->lock)
by reading from a proc file and writing to overlayfs
proc_pid_syscall (p->lock -> exec_update_mutex)
While looking at possible solutions it occured to me that all of the
users and possible users involved only wanted to state of the given
process to remain the same. They are all readers. The only writer is
exec.
There is no reason for readers to block on each other. So fix
this deadlock by transforming exec_update_mutex into a rw_semaphore
named exec_update_lock that only exec takes for writing.
Cc: Jann Horn <jannh@google.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christopher Yeoh <cyeoh@au1.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Fixes: eea9673250db ("exec: Add exec_update_mutex to replace cred_guard_mutex")
[0] https://lkml.kernel.org/r/00000000000063640c05ade8e3de@google.com
Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/87ft4mbqen.fsf@x220.int.ebiederm.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
When discussing[1] exec and posix file locks it was realized that none
of the callers of get_files_struct fundamentally needed to call
get_files_struct, and that by switching them to helper functions
instead it will both simplify their code and remove unnecessary
increments of files_struct.count. Those unnecessary increments can
result in exec unnecessarily unsharing files_struct which breaking
posix locks, and it can result in fget_light having to fallback to
fget reducing system performance.
Now that get_files_struct has no more users and can not cause the
problems for posix file locking and fget_light remove get_files_struct
so that it does not gain any new users.
[1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1: https://lkml.kernel.org/r/20200817220425.9389-13-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20201120231441.29911-24-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
The function close_fd_get_file is explicitly a variant of
__close_fd[1]. Now that __close_fd has been renamed close_fd, rename
close_fd_get_file to be consistent with close_fd.
When __alloc_fd, __close_fd and __fd_install were introduced the
double underscore indicated that the function took a struct
files_struct parameter. The function __close_fd_get_file never has so
the naming has always been inconsistent. This just cleans things up
so there are not any lingering mentions or references __close_fd left
in the code.
[1] 80cd795630d6 ("binder: fix use-after-free due to ksys_close() during fdget()")
Link: https://lkml.kernel.org/r/20201120231441.29911-23-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Now that ksys_close is exactly identical to close_fd replace
the one caller of ksys_close with close_fd.
[1] https://lkml.kernel.org/r/20200818112020.GA17080@infradead.org
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.kernel.org/r/20201120231441.29911-22-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
The function __close_fd was added to support binder[1]. Now that
binder has been fixed to no longer need __close_fd[2] all calls
to __close_fd pass current->files.
Therefore transform the files parameter into a local variable
initialized to current->files, and rename __close_fd to close_fd to
reflect this change, and keep it in sync with the similar changes to
__alloc_fd, and __fd_install.
This removes the need for callers to care about the extra care that
needs to be take if anything except current->files is passed, by
limiting the callers to only operation on current->files.
[1] 483ce1d4b8c3 ("take descriptor-related part of close() to file.c")
[2] 44d8047f1d87 ("binder: use standard functions to allocate fds")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1: https://lkml.kernel.org/r/20200817220425.9389-17-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20201120231441.29911-21-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
The function __alloc_fd was added to support binder[1]. With binder
fixed[2] there are no more users.
As alloc_fd just calls __alloc_fd with "files=current->files",
merge them together by transforming the files parameter into a
local variable initialized to current->files.
[1] dcfadfa4ec5a ("new helper: __alloc_fd()")
[2] 44d8047f1d87 ("binder: use standard functions to allocate fds")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1: https://lkml.kernel.org/r/20200817220425.9389-16-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20201120231441.29911-20-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
The function __fd_install was added to support binder[1]. With binder
fixed[2] there are no more users.
As fd_install just calls __fd_install with "files=current->files",
merge them together by transforming the files parameter into a
local variable initialized to current->files.
[1] f869e8a7f753 ("expose a low-level variant of fd_install() for binder")
[2] 44d8047f1d87 ("binder: use standard functions to allocate fds")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1:https://lkml.kernel.org/r/20200817220425.9389-14-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20201120231441.29911-18-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
As a companion to fget_task and task_lookup_fd_rcu implement
task_lookup_next_fd_rcu that will return the struct file for the first
file descriptor number that is equal or greater than the fd argument
value, or NULL if there is no such struct file.
This allows file descriptors of foreign processes to be iterated
through safely, without needed to increment the count on files_struct.
Some concern[1] has been expressed that this function takes the task_lock
for each iteration and thus for each file descriptor. This place
where this function will be called in a commonly used code path is for
listing /proc/<pid>/fd. I did some small benchmarks and did not see
any measurable performance differences. For ordinary users ls is
likely to stat each of the directory entries and tid_fd_mode called
from tid_fd_revalidae has always taken the task lock for each file
descriptor. So this does not look like it will be a big change in
practice.
At some point is will probably be worth changing put_files_struct to
free files_struct after an rcu grace period so that task_lock won't be
needed at all.
[1] https://lkml.kernel.org/r/20200817220425.9389-10-ebiederm@xmission.com
v1: https://lkml.kernel.org/r/20200817220425.9389-9-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20201120231441.29911-14-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
As a companion to lookup_fd_rcu implement task_lookup_fd_rcu for
querying an arbitrary process about a specific file.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1: https://lkml.kernel.org/r/20200818103713.aw46m7vprsy4vlve@wittgenstein
Link: https://lkml.kernel.org/r/20201120231441.29911-11-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Also remove the confusing comment about checking if a fd exists. I
could not find one instance in the entire kernel that still matches
the description or the reason for the name fcheck.
The need for better names became apparent in the last round of
discussion of this set of changes[1].
[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-10-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
This change renames fcheck_files to files_lookup_fd_rcu. All of the
remaining callers take the rcu_read_lock before calling this function
so the _rcu suffix is appropriate. This change also tightens up the
debug check to verify that all callers hold the rcu_read_lock.
All callers that used to call files_check with the files->file_lock
held have now been changed to call files_lookup_fd_locked.
This change of name has helped remind me of which locks and which
guarantees are in place helping me to catch bugs later in the
patchset.
The need for better names became apparent in the last round of
discussion of this set of changes[1].
[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-9-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
To make it easy to tell where files->file_lock protection is being
used when looking up a file create files_lookup_fd_locked. Only allow
this function to be called with the file_lock held.
Update the callers of fcheck and fcheck_files that are called with the
files->file_lock held to call files_lookup_fd_locked instead.
Hopefully this makes it easier to quickly understand what is going on.
The need for better names became apparent in the last round of
discussion of this set of changes[1].
[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-8-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
The function fcheck despite it's comment is poorly named
as it has no callers that only check it's return value.
All of fcheck's callers use the returned file descriptor.
The same is true for fcheck_files and __fcheck_files.
A new less confusing name is needed. In addition the names
of these functions are confusing as they do not report
the kind of locks that are needed to be held when these
functions are called making error prone to use them.
To remedy this I am making the base functio name lookup_fd
and will and prefixes and sufficies to indicate the rest
of the context.
Name the function (previously called __fcheck_files) that proceeds
from a struct files_struct, looks up the struct file of a file
descriptor, and requires it's callers to verify all of the appropriate
locks are held files_lookup_fd_raw.
The need for better names became apparent in the last round of
discussion of this set of changes[1].
[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-7-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Now that exec no longer needs to restore the previous value of current->files
on error there are no more callers of reset_files_struct so remove it.
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1: https://lkml.kernel.org/r/20200817220425.9389-3-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20201120231441.29911-3-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|