Age | Commit message (Collapse) | Author |
|
Add an entry for Synopsys DesignWare HDMI Receiver Controller
Driver.
Reviewed-by: Christopher Obbard <chris.obbard@collabora.com>
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
|
|
Currently, we are unnecessarily holding a regulator_ww_class_mutex lock
when creating debugfs entries for a newly created regulator. This was
brought up as a concern in the discussion in commit cba6cfdc7c3f
("regulator: core: Avoid lockdep reports when resolving supplies").
This causes the following lockdep splat after executing
`ls /sys/kernel/debug` on my platform:
======================================================
WARNING: possible circular locking dependency detected
5.15.167-axis9-devel #1 Tainted: G O
------------------------------------------------------
ls/2146 is trying to acquire lock:
ffffff803a562918 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x40/0x88
but task is already holding lock:
ffffff80014497f8 (&sb->s_type->i_mutex_key#3){++++}-{3:3}, at: iterate_dir+0x50/0x1f4
which lock already depends on the new lock.
[...]
Chain exists of:
&mm->mmap_lock --> regulator_ww_class_mutex --> &sb->s_type->i_mutex_key#3
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&sb->s_type->i_mutex_key#3);
lock(regulator_ww_class_mutex);
lock(&sb->s_type->i_mutex_key#3);
lock(&mm->mmap_lock);
*** DEADLOCK ***
This lock dependency still exists on the latest kernel and using a newer
non-tainted kernel would still cause this problem.
Fix by moving sysfs symlinking and creation of debugfs entries to after
the release of the regulator lock.
Fixes: cba6cfdc7c3f ("regulator: core: Avoid lockdep reports when resolving supplies")
Fixes: eaa7995c529b ("regulator: core: avoid regulator_resolve_supply() race condition")
Signed-off-by: Ludvig Pärsson <ludvig.parsson@axis.com>
Link: https://patch.msgid.link/20250305-regulator_lockdep_fix-v1-1-ab938b12e790@axis.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
child_cfs_rq_on_list attempts to convert a 'prev' pointer to a cfs_rq.
This 'prev' pointer can originate from struct rq's leaf_cfs_rq_list,
making the conversion invalid and potentially leading to memory
corruption. Depending on the relative positions of leaf_cfs_rq_list and
the task group (tg) pointer within the struct, this can cause a memory
fault or access garbage data.
The issue arises in list_add_leaf_cfs_rq, where both
cfs_rq->leaf_cfs_rq_list and rq->leaf_cfs_rq_list are added to the same
leaf list. Also, rq->tmp_alone_branch can be set to rq->leaf_cfs_rq_list.
This adds a check `if (prev == &rq->leaf_cfs_rq_list)` after the main
conditional in child_cfs_rq_on_list. This ensures that the container_of
operation will convert a correct cfs_rq struct.
This check is sufficient because only cfs_rqs on the same CPU are added
to the list, so verifying the 'prev' pointer against the current rq's list
head is enough.
Fixes a potential memory corruption issue that due to current struct
layout might not be manifesting as a crash but could lead to unpredictable
behavior when the layout changes.
Fixes: fdaba61ef8a2 ("sched/fair: Ensure that the CFS parent is added after unthrottling")
Signed-off-by: Zecheng Li <zecheng@google.com>
Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20250304214031.2882646-1-zecheng@google.com
|
|
After switchdev is enabled and disabled later, LLDP packets sending stops,
despite working perfectly fine before and during switchdev state.
To reproduce (creating/destroying VF is what triggers the reconfiguration):
devlink dev eswitch set pci/<address> mode switchdev
echo '2' > /sys/class/net/<ifname>/device/sriov_numvfs
echo '0' > /sys/class/net/<ifname>/device/sriov_numvfs
This happens because LLDP relies on the destination override functionality.
It needs to 1) set a flag in the descriptor, 2) set the VSI permission to
make it valid. The permissions are set when the PF VSI is first configured,
but switchdev then enables it for the uplink VSI (which is always the PF)
once more when configured and disables when deconfigured, which leads to
software-generated LLDP packets being blocked.
Do not modify the destination override permissions when configuring
switchdev, as the enabled state is the default configuration that is never
modified.
Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Merge the pmdomain fixes for v6.14-rc[n] into the next branch, to allow them
to get tested together with the new changes that are targeted for v6.15.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
ISP and MIPI_ISP, these two have a parent-child relationship,
ISP depends on MIPI_ISP.
Fixes: ca75e4b214c6 ("pmdomain: amlogic: Add support for T7 power domains controller")
Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250303-fix-t7-pwrc-v1-1-b563612bcd86@amlogic.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
When CONFIG_REGULATOR is disabled, this pmdomain driver fails to build:
drivers/pmdomain/rockchip/pm-domains.c:653:30: error: implicit declaration of function 'devm_of_regulator_get'; did you mean 'devm_regulator_get'? [-Wimplicit-function-declaration]
653 | pd->supply = devm_of_regulator_get(pmu->dev, pd->node, "domain");
| ^~~~~~~~~~~~~~~~~~~~~
| devm_regulator_get
drivers/pmdomain/rockchip/pm-domains.c:653:28: error: assignment to 'struct regulator *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
653 | pd->supply = devm_of_regulator_get(pmu->dev, pd->node, "domain");
| ^
Add a Kconfig dependency.
Fixes: db6df2e3fc16 ("pmdomain: rockchip: add regulator support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250304142803.689201-1-arnd@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
The utf16_le_to_7bit function claims to, naively, convert a UTF-16
string to a 7-bit ASCII string. By naively, we mean that it:
* drops the first byte of every character in the original UTF-16 string
* checks if all characters are printable, and otherwise replaces them
by exclamation mark "!".
This means that theoretically, all characters outside the 7-bit ASCII
range should be replaced by another character. Examples:
* lower-case alpha (ɒ) 0x0252 becomes 0x52 (R)
* ligature OE (œ) 0x0153 becomes 0x53 (S)
* hangul letter pieup (ㅂ) 0x3142 becomes 0x42 (B)
* upper-case gamma (Ɣ) 0x0194 becomes 0x94 (not printable) so gets
replaced by "!"
The result of this conversion for the GPT partition name is passed to
user-space as PARTNAME via udev, which is confusing and feels questionable.
However, there is a flaw in the conversion function itself. By dropping
one byte of each character and using isprint() to check if the remaining
byte corresponds to a printable character, we do not actually guarantee
that the resulting character is 7-bit ASCII.
This happens because we pass 8-bit characters to isprint(), which
in the kernel returns 1 for many values > 0x7f - as defined in ctype.c.
This results in many values which should be replaced by "!" to be kept
as-is, despite not being valid 7-bit ASCII. Examples:
* e with acute accent (é) 0x00E9 becomes 0xE9 - kept as-is because
isprint(0xE9) returns 1.
* euro sign (€) 0x20AC becomes 0xAC - kept as-is because isprint(0xAC)
returns 1.
This way has broken pyudev utility[1], fixes it by using a mask of 7 bits
instead of 8 bits before calling isprint.
Link: https://github.com/pyudev/pyudev/issues/490#issuecomment-2685794648 [1]
Link: https://lore.kernel.org/linux-block/4cac90c2-e414-4ebb-ae62-2a4589d9dc6e@canonical.com/
Cc: Mulhern <amulhern@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: stable@vger.kernel.org
Signed-off-by: Olivier Gayot <olivier.gayot@canonical.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250305022154.3903128-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a helper function io_cache_free() that returns an allocation to a
io_alloc_cache, falling back on kfree() if the io_alloc_cache is full.
This is the inverse of io_cache_alloc(), which takes an allocation from
an io_alloc_cache and falls back on kmalloc() if the cache is empty.
Convert 4 callers to use the helper.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Suggested-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/r/20250304194814.2346705-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The parameters set by the set_params call are only applied to the block
device in the start_dev call. So if a device has already been started, a
subsequently issued set_params on that device will not have the desired
effect, and should return an error. There is an existing check for this
- set_params fails on devices in the LIVE state. But this check is not
sufficient to cover the recovery case. In this case, the device will be
in the QUIESCED or FAIL_IO states, so set_params will succeed. But this
success is misleading, because the parameters will not be applied, since
the device has already been started (by a previous ublk server). The bit
UB_STATE_USED is set on completion of the start_dev; use it to detect
and fail set_params commands which arrive too late to be applied (after
start_dev).
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Fixes: 0aa73170eba5 ("ublk_drv: add SET_PARAMS/GET_PARAMS control command")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250304-set_params-v1-1-17b5e0887606@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This add a match for cs42l43 on sdw link 2, 3 cs35l56 on sdw link 1 and
3. All speakers are aggregated.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Link: https://patch.msgid.link/20250305135443.201884-10-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
This add a match for cs42l43 on sdw link 2, 3 cs35l56 on sdw link 1 and
3. All speakers are aggregated.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Link: https://patch.msgid.link/20250305135443.201884-9-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add a SoundWire multi function mockup codec support.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20250305135443.201884-8-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add a SoundWire multi function mockup codec support.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20250305135443.201884-7-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
SoundWire multi function codecs are common used in recent platforms.
Add a multi function mockup codec to test without real codec.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20250305135443.201884-6-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
part id 0xaaaa codec is a mockup codec. Make it be a multi function
codec. So that we can test multi function with a mockup codec.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20250305135443.201884-5-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add rt713_vb on SDW link 3 and 2 rt1320 on SDW link 1 and 2 configuration
support.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20250305135443.201884-4-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add rt712_vb on SDW link 3 and 1 rt1320 on SDW link 2 configuration
support.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/20250305135443.201884-3-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
That is a speaker only configuration and 2 rt1316 are on link 0 and 2.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20250305135443.201884-2-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
"generic_new_peripheral_assigned: invalid dev_num 1, wake supported 1"
is reported by our internal CI test.
Rt1320's wake feature is not used in Linux and that's why it is not in
the wake_capable_list[] list in intel_auxdevice.c.
However, BIOS may set it as wake-capable. Overwrite wake_capable to 0
in the codec driver to align with wake_capable_list[].
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Acked-by: Shuming Fan <shumingf@realtek.com>
Link: https://patch.msgid.link/20250305134113.201326-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
format() and subformat() should be handled in the same time, no need to
have each functions. Let's merge these
soc_pcm_hw_update_format()
soc_pcm_hw_update_subformat()
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Cezary Rojewski <cezary.rojewski@intel.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/87v7soz664.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The valid_mask member of the struct gpio_chip is unconditionally written
by the GPIO core at driver registration. Current documentation does not
mention this but just says the valid_mask is used if it's not NULL. This
lured me to try populating it directly in the GPIO driver probe instead
of using the init_valid_mask() callback. It took some retries with
different bitmaps and eventually a bit of code-reading to understand why
the valid_mask was not obeyed. I could've avoided this trial and error if
the valid_mask was hidden in the struct gpio_device instead of being a
visible member of the struct gpio_chip.
Help the next developer who decides to directly populate the valid_mask
in struct gpio_chip by hiding the valid_mask in struct gpio_device and
keep it internal to the GPIO core.
Suggested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/4547ca90d910d60cab3d56d864d59ddde47a5e93.1741180097.git.mazziesaccount@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
The valid_mask member of the struct gpio_chip is unconditionally written
by the GPIO core at driver registration. It should not be directly
populated by the drivers. Hiding the valid_mask in struct gpio_device
makes it clear it is not meant to be directly populated by drivers. This
means drivers should not access it directly from the struct gpio_chip.
The gpio-rcar checks the valid mask in set/get_multiple() operations.
This is no longer needed [1]. Drop these checks.
Additionally, the valid_mask is needed for enabling the GPIO inputs at
probe time. Use the new valid_mask -getter function instead of accessing
it directly from the struct gpio_chip.
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/e46441ba8a2840e6b48ec8d2ecd5919995a5675f.1741180097.git.mazziesaccount@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
The valid_mask member of the struct gpio_chip is unconditionally written
by the GPIO core at driver registration. It shouldn't be directly
populated by drivers. This can be prevented by moving it from the struct
gpio_chip to struct gpio_device, which is internal to the GPIO core.
As a preparatory step, provide a getter function which can be used by
those drivers which need the valid_mask information.
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/026f9d78502eca883bfe3faeb684e23d5d6c5e84.1741180097.git.mazziesaccount@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
When GPIOs were requested the validity of GPIOs were checked only when
the GPIO-chip had the request -callback populated. This made using
masked GPIOs possible.
The GPIO chip driver authors may find it difficult to understand the
relation of enforsing the GPIO validity and the 'request' -callback
because the current documentation for the 'request' callback does not
mention this. It only states:
* @request: optional hook for chip-specific activation, such as
* enabling module power and clock; may sleep
The validity of the GPIO line should be checked whether the driver
provides the 'request' callback or not.
Unconditionally check the GPIO validity when GPIO is being requested.
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/cd5e067b80e1bb590027bc3bfa817e7f794f21c3.1741180097.git.mazziesaccount@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
When I read through the TSO codes, I found out that we probably
miss initializing the tx_flags of last seg when TSO is turned
off, which means at the following points no more timestamp
(for this last one) will be generated. There are three flags
to be handled in this patch:
1. SKBTX_HW_TSTAMP
2. SKBTX_BPF
3. SKBTX_SCHED_TSTAMP
Note that SKBTX_BPF[1] was added in 6.14.0-rc2 by commit
6b98ec7e882af ("bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback")
and only belongs to net-next branch material for now. The common
issue of the above three flags can be fixed by this single patch.
This patch initializes the tx_flags to SKBTX_ANY_TSTAMP like what
the UDP GSO does to make the newly segmented last skb inherit the
tx_flags so that requested timestamp will be generated in each
certain layer, or else that last one has zero value of tx_flags
which leads to no timestamp at all.
Fixes: 4ed2d765dfacc ("net-timestamp: TCP timestamping")
Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a check for invalid data size to avoid corrupted filesystem
from being further corrupted.
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
When generic_write_checks() returns zero, it means that
iov_iter_count() is zero, and there is no work to do.
Simply return success like all other filesystems do, rather than
proceeding down the write path, which today yields an -EFAULT in
generic_perform_write() via the
(fault_in_iov_iter_readable(i, bytes) == bytes) check when bytes
== 0.
Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength")
Reported-by: Noah <kernel-org-10@maxgrass.eu>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
bitmap clear loop will take long time in __exfat_free_cluster()
if data size of file/dir enty is invalid.
If cluster bit in bitmap is already clear, stop clearing bitmap go to
out of loop.
Fixes: 31023864e67a ("exfat: add fat entry operations")
Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
This commit fixes the condition for allocating cluster to parent
directory to avoid allocating new cluster to parent directory when
there are just enough empty directory entries at the end of the
parent directory.
Fixes: af02c72d0b62 ("exfat: convert exfat_find_empty_entry() to use dentry cache")
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Wojtek Wasko says:
====================
Permission checks for dynamic POSIX clocks
Dynamic clocks - such as PTP clocks - extend beyond the standard POSIX
clock API by using ioctl calls. While file permissions are enforced for
standard POSIX operations, they are not implemented for ioctl calls,
since the POSIX layer cannot differentiate between calls which modify
the clock's state (like enabling PPS output generation) and those that
don't (such as retrieving the clock's PPS capabilities).
On the other hand, drivers implementing the dynamic clocks lack the
necessary information context to enforce permission checks themselves.
Additionally, POSIX clock layer requires the WRITE permission even for
readonly adjtime() operations before invoking the callback.
Add a struct file pointer to the POSIX clock context and use it to
implement the appropriate permission checks on PTP chardevs. Permit
readonly adjtime() for dynamic clocks. Add a readonly option to testptp.
Changes in v4:
- Allow readonly adjtime() for dynamic clocks, as suggested by Thomas
Changes in v3:
- Reword the log message for commit against posix-clock and fix
documentation of struct posix_clock_context, as suggested by Thomas
Changes in v2:
- Store file pointer in POSIX clock context rather than fmode in the PTP
clock's private data, as suggested by Richard.
- Move testptp.c changes into separate patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PTP Hardware Clocks no longer require WRITE permission to perform
readonly operations, such as listing device capabilities or listening to
EXTTS events once they have been enabled by a process with WRITE
permissions.
Add '-r' option to testptp to open the PHC in readonly mode instead of
the default read-write mode. Skip enabling EXTTS if readonly mode is
requested.
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Wojtek Wasko <wwasko@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Many devices implement highly accurate clocks, which the kernel manages
as PTP Hardware Clocks (PHCs). Userspace applications rely on these
clocks to timestamp events, trace workload execution, correlate
timescales across devices, and keep various clocks in sync.
The kernel’s current implementation of PTP clocks does not enforce file
permissions checks for most device operations except for POSIX clock
operations, where file mode is verified in the POSIX layer before
forwarding the call to the PTP subsystem. Consequently, it is common
practice to not give unprivileged userspace applications any access to
PTP clocks whatsoever by giving the PTP chardevs 600 permissions. An
example of users running into this limitation is documented in [1].
Additionally, POSIX layer requires WRITE permission even for readonly
adjtime() calls which are used in PTP layer to return current frequency
offset applied to the PHC.
Add permission checks for functions that modify the state of a PTP
device. Continue enforcing permission checks for POSIX clock operations
(settime, adjtime) in the POSIX layer. Only require WRITE access for
dynamic clocks adjtime() if any flags are set in the modes field.
[1] https://lists.nwtime.org/sympa/arc/linuxptp-users/2024-01/msg00036.html
Changes in v4:
- Require FMODE_WRITE in ajtime() only for calls modifying the clock in
any way.
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Wojtek Wasko <wwasko@nvidia.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
File descriptor based pc_clock_*() operations of dynamic posix clocks
have access to the file pointer and implement permission checks in the
generic code before invoking the relevant dynamic clock callback.
Character device operations (open, read, poll, ioctl) do not implement a
generic permission control and the dynamic clock callbacks have no
access to the file pointer to implement them.
Extend struct posix_clock_context with a struct file pointer and
initialize it in posix_clock_open(), so that all dynamic clock callbacks
can access it.
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Wojtek Wasko <wwasko@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Christian Brauner <brauner@kernel.org> says:
Various tools need access to information about a process/task even after
it has already been reaped. For example, systemd's journal logs and uses
such information as the cgroup id and exit status to deal with processes
that have been sent via SCM_PIDFD or SCM_PEERPIDFD. By the time the
pidfd is received the process might have already been reaped.
This series aims to provide information by extending the PIDFD_GET_INFO
ioctl to retrieve the exit code and cgroup id. There might be other
stuff that we would want in the future.
Pidfd polling allows waiting on either task exit or for a task to have
been reaped. The contract for PIDFD_INFO_EXIT is simply that EPOLLHUP
must be observed before exit information can be retrieved, i.e., exit
information is only provided once the task has been reaped.
Note, that if a thread-group leader exits before other threads in the
thread-group then exit information will only be available once the
thread-group is empty. This aligns with wait() as well, where reaping of
a thread-group leader that exited before the thread-group was empty is
delayed until the thread-group is empty.
With PIDFD_INFO_EXIT autoreaping might actually become usable because it
means a parent can ignore SIGCHLD or set SA_NOCLDWAIT and simply use
pidfd polling and PIDFD_INFO_EXIT to get get status information for its
children. The kernel will autocleanup right away instead of delaying.
This includes expansive selftests including for thread-group behior and
multi-threaded exec by a non-thread-group leader thread.
* patches from https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-0-c8c3d8361705@kernel.org:
selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest
selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest
selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest
selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest
selftests/pidfd: add third PIDFD_INFO_EXIT selftest
selftests/pidfd: add second PIDFD_INFO_EXIT selftest
selftests/pidfd: add first PIDFD_INFO_EXIT selftest
selftests/pidfd: expand common pidfd header
pidfs/selftests: ensure correct headers for ioctl handling
selftests/pidfd: fix header inclusion
pidfs: allow to retrieve exit information
pidfs: record exit code and cgroupid at exit
pidfs: use private inode slab cache
pidfs: move setting flags into pidfs_alloc_file()
pidfd: rely on automatic cleanup in __pidfd_prepare()
pidfs: switch to copy_struct_to_user()
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-0-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a selftest for PIDFD_INFO_EXIT behavior.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-16-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a selftest for PIDFD_INFO_EXIT behavior.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-15-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a selftest for PIDFD_INFO_EXIT behavior.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-14-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a selftest for PIDFD_INFO_EXIT behavior.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-13-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a selftest for PIDFD_INFO_EXIT behavior.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-12-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a selftest for PIDFD_INFO_EXIT behavior.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-11-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a selftest for PIDFD_INFO_EXIT behavior.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-10-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Move more infrastructure to the pidfd header.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-9-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Ensure that necessary ioctl infrastructure is available.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-8-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Ensure that necessary defines are present.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-7-c8c3d8361705@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Some tools like systemd's jounral need to retrieve the exit and cgroup
information after a process has already been reaped. This can e.g.,
happen when retrieving a pidfd via SCM_PIDFD or SCM_PEERPIDFD.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-6-c8c3d8361705@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Record the exit code and cgroupid in release_task() and stash in struct
pidfs_exit_info so it can be retrieved even after the task has been
reaped.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-5-c8c3d8361705@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Introduce a private inode slab cache for pidfs. In follow-up patches
pidfs will gain the ability to provide exit information to userspace
after the task has been reaped. This means storing exit information even
after the task has already been released and struct pid's task linkage
is gone. Store that information alongside the inode.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-4-c8c3d8361705@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Instead od adding it into __pidfd_prepare() place it where the actual
file allocation happens and update the outdated comment.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-3-c8c3d8361705@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Rely on scope-based cleanup for the allocated file descriptor.
Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-2-c8c3d8361705@kernel.org
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|