Age | Commit message (Collapse) | Author |
|
different nodes Date: Thu, 22 Feb 2024 22:04:17 +0800
When a group of tasks that access different nodes are scheduled on the
same node, they may encounter bandwidth bottlenecks and access latency.
Thus, numa_aware flag is introduced here, allowing tasks to be distributed
across different nodes to fully utilize the advantage of multi-node
systems.
Link: https://lkml.kernel.org/r/20240222140422.393911-5-gang.li@linux.dev
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
Tested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
mac802154_llsec_key_del() can free resources of a key directly without
following the RCU rules for waiting before the end of a grace period. This
may lead to use-after-free in case llsec_lookup_key() is traversing the
list of keys in parallel with a key deletion:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 4 PID: 16000 at lib/refcount.c:25 refcount_warn_saturate+0x162/0x2a0
Modules linked in:
CPU: 4 PID: 16000 Comm: wpan-ping Not tainted 6.7.0 #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:refcount_warn_saturate+0x162/0x2a0
Call Trace:
<TASK>
llsec_lookup_key.isra.0+0x890/0x9e0
mac802154_llsec_encrypt+0x30c/0x9c0
ieee802154_subif_start_xmit+0x24/0x1e0
dev_hard_start_xmit+0x13e/0x690
sch_direct_xmit+0x2ae/0xbc0
__dev_queue_xmit+0x11dd/0x3c20
dgram_sendmsg+0x90b/0xd60
__sys_sendto+0x466/0x4c0
__x64_sys_sendto+0xe0/0x1c0
do_syscall_64+0x45/0xf0
entry_SYSCALL_64_after_hwframe+0x6e/0x76
Also, ieee802154_llsec_key_entry structures are not freed by
mac802154_llsec_key_del():
unreferenced object 0xffff8880613b6980 (size 64):
comm "iwpan", pid 2176, jiffies 4294761134 (age 60.475s)
hex dump (first 32 bytes):
78 0d 8f 18 80 88 ff ff 22 01 00 00 00 00 ad de x.......".......
00 00 00 00 00 00 00 00 03 00 cd ab 00 00 00 00 ................
backtrace:
[<ffffffff81dcfa62>] __kmem_cache_alloc_node+0x1e2/0x2d0
[<ffffffff81c43865>] kmalloc_trace+0x25/0xc0
[<ffffffff88968b09>] mac802154_llsec_key_add+0xac9/0xcf0
[<ffffffff8896e41a>] ieee802154_add_llsec_key+0x5a/0x80
[<ffffffff8892adc6>] nl802154_add_llsec_key+0x426/0x5b0
[<ffffffff86ff293e>] genl_family_rcv_msg_doit+0x1fe/0x2f0
[<ffffffff86ff46d1>] genl_rcv_msg+0x531/0x7d0
[<ffffffff86fee7a9>] netlink_rcv_skb+0x169/0x440
[<ffffffff86ff1d88>] genl_rcv+0x28/0x40
[<ffffffff86fec15c>] netlink_unicast+0x53c/0x820
[<ffffffff86fecd8b>] netlink_sendmsg+0x93b/0xe60
[<ffffffff86b91b35>] ____sys_sendmsg+0xac5/0xca0
[<ffffffff86b9c3dd>] ___sys_sendmsg+0x11d/0x1c0
[<ffffffff86b9c65a>] __sys_sendmsg+0xfa/0x1d0
[<ffffffff88eadbf5>] do_syscall_64+0x45/0xf0
[<ffffffff890000ea>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
Handle the proper resource release in the RCU callback function
mac802154_llsec_key_del_rcu().
Note that if llsec_lookup_key() finds a key, it gets a refcount via
llsec_key_get() and locally copies key id from key_entry (which is a
list element). So it's safe to call llsec_key_put() and free the list
entry after the RCU grace period elapses.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 5d637d5aabd8 ("mac802154: add llsec structures and mutators")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Acked-by: Alexander Aring <aahringo@redhat.com>
Message-ID: <20240228163840.6667-1-pchelkin@ispras.ru>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
PAGE_SIZE
The trace_seq buffer is used to print out entire events. It's typically
set to PAGE_SIZE * 2 as there's some events that can be quite large.
As a side effect, writes to trace_marker is limited by both the size of the
trace_seq buffer as well as the ring buffer's sub-buffer size (which is a
power of PAGE_SIZE). By limiting the trace_seq size, it also limits the
size of the largest string written to trace_marker.
trace_seq does not need to be dependent on PAGE_SIZE like the ring buffer
sub-buffers need to be. Hard code it to 8K which is PAGE_SIZE * 2 on most
architectures. This will also limit the size of trace_marker on those
architectures with greater than 4K PAGE_SIZE.
Link: https://lore.kernel.org/all/20240302111244.3a1674be@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20240304191342.56fb1087@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.9/block
Pull MD atomic queue limits changes from Song.
* tag 'md-6.9-20240306' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
block: remove disk_stack_limits
md: remove mddev->queue
md: don't initialize queue limits
md/raid10: use the atomic queue limit update APIs
md/raid5: use the atomic queue limit update APIs
md/raid1: use the atomic queue limit update APIs
md/raid0: use the atomic queue limit update APIs
md: add queue limit helpers
md: add a mddev_is_dm helper
md: add a mddev_add_trace_msg helper
md: add a mddev_trace_remap helper
|
|
This is a patch that enables RSS functionality for GTP packets using ethtool.
A user can include TEID and make RSS work for GTP-U over IPv4 by doing the
following:`ethtool -N ens3 rx-flow-hash gtpu4 sde`
In addition to gtpu(4|6), we now support gtpc(4|6),gtpc(4|6)t,gtpu(4|6)e,
gtpu(4|6)u, and gtpu(4|6)d.
gtpc(4|6): Used for GTP-C in IPv4 and IPv6, where the GTP header format does
not include a TEID.
gtpc(4|6)t: Used for GTP-C in IPv4 and IPv6, with a GTP header format that
includes a TEID.
gtpu(4|6): Used for GTP-U in both IPv4 and IPv6 scenarios.
gtpu(4|6)e: Used for GTP-U with extended headers in both IPv4 and IPv6.
gtpu(4|6)u: Used when the PSC (PDU session container) in the GTP-U extended
header includes Uplink, applicable to both IPv4 and IPv6.
gtpu(4|6)d: Used when the PSC in the GTP-U extended header includes Downlink,
for both IPv4 and IPv6.
GTP generates a flow that includes an ID called TEID to identify the tunnel.
This tunnel is created for each UE (User Equipment).By performing RSS based on
this flow, it is possible to apply RSS for each communication unit from the UE.
Without this, RSS would only be effective within the range of IP addresses. For
instance, the PGW can only perform RSS within the IP range of the SGW.
Problematic from a load distribution perspective, especially if there's a bias
in the terminals connected to a particular base station.This case can be
solved by using this patch.
Signed-off-by: Takeru Hayasaka <hayatake396@gmail.com>
Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
disk_stack_limits is unused now, remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed--by: Song Liu <song@kernel.org>
Tested-by: Song Liu <song@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240303140150.5435-12-hch@lst.de
|
|
The current device_release callback for individual iommu drivers does the
following:
1) Silent IOMMU DMA translation: It detaches any existing domain from the
device and puts it into a blocking state (some drivers might use the
identity state).
2) Resource release: It releases resources allocated during the
device_probe callback and restores the device to its pre-probe state.
Step 1 is challenging for individual iommu drivers because each must check
if a domain is already attached to the device. Additionally, if a deferred
attach never occurred, the device_release should avoid modifying hardware
configuration regardless of the reason for its call.
To simplify this process, introduce a static release_domain within the
iommu_ops structure. It can be either a blocking or identity domain
depending on the iommu hardware. The iommu core will decide whether to
attach this domain before the device_release callback, eliminating the
need for repetitive code in various drivers.
Consequently, the device_release callback can focus solely on the opposite
operations of device_probe, including releasing all resources allocated
during that callback.
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240305013305.204605-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Make pci_dev_is_disconnected() public so that it can be called from
Intel VT-d driver to quickly fix/workaround the surprise removal
unplug hang issue for those ATS capable devices on PCIe switch downstream
hotplug capable ports.
Beside pci_device_is_present() function, this one has no config space
space access, so is light enough to optimize the normal pure surprise
removal and safe removal flow.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Tested-by: Haorong Ye <yehaorong@bytedance.com>
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
Link: https://lore.kernel.org/r/20240301080727.3529832-2-haifeng.zhao@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Get rid of copy_mc flag in iov_iter which really only makes sense for
the core dumping code so move it out of the generic iov iter code and
make it coredump's problem. See the detailed commit description.
- Revert fs/aio: Make io_cancel() generate completions again
The initial fix here was predicated on the assumption that calling
ki_cancel() didn't complete aio requests. However, that turned out to
be wrong since the two drivers that actually make use of this set a
cancellation function that performs the cancellation correctly. So
revert this change.
- Ensure that the test for IOCB_AIO_RW always happens before the read
from ki_ctx.
* tag 'vfs-6.8-release.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
iov_iter: get rid of 'copy_mc' flag
fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion
Revert "fs/aio: Make io_cancel() generate completions again"
|
|
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the block_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240305-class_cleanup-block-v1-1-130bb27b9c72@marliere.net
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the
greybus_hd_type, greybus_module_type, greybus_interface_type,
greybus_control_type, greybus_bundle_type and greybus_svc_type variables to
be constant structures as well, placing it into read-only memory which can
not be modified at runtime.
Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net>
Reviewed-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240219-device_cleanup-greybus-v1-1-babb3f65e8cc@marliere.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the greybus_bus_type variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Cc: Alex Elder <elder@kernel.org>
Cc: greybus-dev@lists.linaro.org
Reviewed-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/2024010517-handgun-scoreless-05e7@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next
Georgi writes:
interconnect changes for 6.9
This pull request contains the interconnect changes for the 6.9-rc1 merge
window. The highlights are below:
Core changes:
- Constify the of_phandle_args in xlate functions.
Driver changes:
- New interconnect driver for the MSM8909 platform.
- New interconnect driver for the SM7150 platform.
- Clean-up and removal of unused resources in drivers.
- Constify some pointers to structs.
Signed-off-by: Georgi Djakov <djakov@kernel.org>
* tag 'icc-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc:
interconnect: qcom: Add SM7150 driver support
dt-bindings: interconnect: Add Qualcomm SM7150 DT bindings
interconnect: constify of_phandle_args in xlate
dt-bindings: interconnect: qcom,rpmh: Fix bouncing @codeaurora address
interconnect: qcom: x1e80100: constify pointer to qcom_icc_bcm
interconnect: qcom: sa8775p: constify pointer to qcom_icc_bcm
interconnect: qcom: sm6115: constify pointer to qcom_icc_node
interconnect: qcom: sm8250: constify pointer to qcom_icc_node
interconnect: qcom: sa8775p: constify pointer to qcom_icc_node
interconnect: qcom: msm8909: constify pointer to qcom_icc_node
interconnect: qcom: x1e80100: Remove bogus per-RSC BCMs and nodes
dt-bindings: interconnect: Remove bogus interconnect nodes
interconnect: qcom: sm8550: Remove bogus per-RSC BCMs and nodes
interconnect: qcom: Add MSM8909 interconnect provider driver
dt-bindings: interconnect: Add Qualcomm MSM8909 DT bindings
|
|
Add the event value to the snd_soc_dapm_start and snd_soc_dapm_done trace
events to make them more informative.
Trace before:
aplay-229 [000] 250.140309: snd_soc_dapm_start: card=vscn-2046
aplay-229 [000] 250.167531: snd_soc_dapm_done: card=vscn-2046
aplay-229 [000] 251.169588: snd_soc_dapm_start: card=vscn-2046
aplay-229 [000] 251.195245: snd_soc_dapm_done: card=vscn-2046
Trace after:
aplay-214 [000] 693.290612: snd_soc_dapm_start: card=vscn-2046 event=1
aplay-214 [000] 693.315508: snd_soc_dapm_done: card=vscn-2046 event=1
aplay-214 [000] 694.537349: snd_soc_dapm_start: card=vscn-2046 event=2
aplay-214 [000] 694.563241: snd_soc_dapm_done: card=vscn-2046 event=2
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Link: https://msgid.link/r/20240306-improve-asoc-trace-events-v1-2-edb252bbeb10@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The snd_soc_bias_level_start and snd_soc_bias_level_done trace events
currently look like:
aplay-229 [000] 1250.140778: snd_soc_bias_level_start: card=vscn-2046 val=1
aplay-229 [000] 1250.140784: snd_soc_bias_level_done: card=vscn-2046 val=1
aplay-229 [000] 1250.140786: snd_soc_bias_level_start: card=vscn-2046 val=2
aplay-229 [000] 1250.140788: snd_soc_bias_level_done: card=vscn-2046 val=2
kworker/u8:1-21 [000] 1250.140871: snd_soc_bias_level_start: card=vscn-2046 val=1
kworker/u8:0-11 [000] 1250.140951: snd_soc_bias_level_start: card=vscn-2046 val=1
kworker/u8:0-11 [000] 1250.140956: snd_soc_bias_level_done: card=vscn-2046 val=1
kworker/u8:0-11 [000] 1250.140959: snd_soc_bias_level_start: card=vscn-2046 val=2
kworker/u8:0-11 [000] 1250.140961: snd_soc_bias_level_done: card=vscn-2046 val=2
kworker/u8:1-21 [000] 1250.167219: snd_soc_bias_level_done: card=vscn-2046 val=1
kworker/u8:1-21 [000] 1250.167222: snd_soc_bias_level_start: card=vscn-2046 val=2
kworker/u8:1-21 [000] 1250.167232: snd_soc_bias_level_done: card=vscn-2046 val=2
kworker/u8:0-11 [000] 1250.167440: snd_soc_bias_level_start: card=vscn-2046 val=3
kworker/u8:0-11 [000] 1250.167444: snd_soc_bias_level_done: card=vscn-2046 val=3
kworker/u8:1-21 [000] 1250.167497: snd_soc_bias_level_start: card=vscn-2046 val=3
kworker/u8:1-21 [000] 1250.167506: snd_soc_bias_level_done: card=vscn-2046 val=3
There are clearly multiple calls, one per component, but they cannot be
discriminated from each other.
Change the ftrace events to also print the component name, to make it clear
which part of the code is involved. This requires changing the passed value
from a struct snd_soc_card, where the DAPM context is not kwown, to a
struct snd_soc_dapm_context where it is obviously known but the a card
pointer is also available.
With this change, the resulting trace becomes:
aplay-247 [000] 1436.357332: snd_soc_bias_level_start: card=vscn-2046 component=(none) val=1
aplay-247 [000] 1436.357338: snd_soc_bias_level_done: card=vscn-2046 component=(none) val=1
aplay-247 [000] 1436.357340: snd_soc_bias_level_start: card=vscn-2046 component=(none) val=2
aplay-247 [000] 1436.357343: snd_soc_bias_level_done: card=vscn-2046 component=(none) val=2
kworker/u8:4-215 [000] 1436.357437: snd_soc_bias_level_start: card=vscn-2046 component=ff560000.codec val=1
kworker/u8:5-231 [000] 1436.357518: snd_soc_bias_level_start: card=vscn-2046 component=ff320000.i2s val=1
kworker/u8:5-231 [000] 1436.357523: snd_soc_bias_level_done: card=vscn-2046 component=ff320000.i2s val=1
kworker/u8:5-231 [000] 1436.357526: snd_soc_bias_level_start: card=vscn-2046 component=ff320000.i2s val=2
kworker/u8:5-231 [000] 1436.357528: snd_soc_bias_level_done: card=vscn-2046 component=ff320000.i2s val=2
kworker/u8:4-215 [000] 1436.383217: snd_soc_bias_level_done: card=vscn-2046 component=ff560000.codec val=1
kworker/u8:4-215 [000] 1436.383221: snd_soc_bias_level_start: card=vscn-2046 component=ff560000.codec val=2
kworker/u8:4-215 [000] 1436.383231: snd_soc_bias_level_done: card=vscn-2046 component=ff560000.codec val=2
kworker/u8:5-231 [000] 1436.383468: snd_soc_bias_level_start: card=vscn-2046 component=ff320000.i2s val=3
kworker/u8:5-231 [000] 1436.383472: snd_soc_bias_level_done: card=vscn-2046 component=ff320000.i2s val=3
kworker/u8:4-215 [000] 1436.383503: snd_soc_bias_level_start: card=vscn-2046 component=ff560000.codec val=3
kworker/u8:4-215 [000] 1436.383513: snd_soc_bias_level_done: card=vscn-2046 component=ff560000.codec val=3
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Link: https://msgid.link/r/20240306-improve-asoc-trace-events-v1-1-edb252bbeb10@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Currently getsockopt does not support IP_ROUTER_ALERT and
IPV6_ROUTER_ALERT, and we are unable to get the values of these two
socket options through getsockopt.
This patch adds getsockopt support for IP_ROUTER_ALERT and
IPV6_ROUTER_ALERT.
Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Several file system notification system headers have "writable"
misspelled as "writtable" in the comments. This patch fixes it in the
fanotify header.
Signed-off-by: Vicki Pfau <vi@endrift.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20240306020831.1404033-3-vi@endrift.com>
|
|
Several file system notification system headers have "writable"
misspelled as "writtable" in the comments. This patch fixes it in the
fsnotify header.
Signed-off-by: Vicki Pfau <vi@endrift.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20240306020831.1404033-2-vi@endrift.com>
|
|
Several file system notification system headers have "writable"
misspelled as "writtable" in the comments. This patch fixes it in the
inotify header.
Signed-off-by: Vicki Pfau <vi@endrift.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20240306020831.1404033-1-vi@endrift.com>
|
|
This flag is only set by one single user: the magical core dumping code
that looks up user pages one by one, and then writes them out using
their kernel addresses (by using a BVEC_ITER).
That actually ends up being a huge problem, because while we do use
copy_mc_to_kernel() for this case and it is able to handle the possible
machine checks involved, nothing else is really ready to handle the
failures caused by the machine check.
In particular, as reported by Tong Tiangen, we don't actually support
fault_in_iov_iter_readable() on a machine check area.
As a result, the usual logic for writing things to a file under a
filesystem lock, which involves doing a copy with page faults disabled
and then if that fails trying to fault pages in without holding the
locks with fault_in_iov_iter_readable() does not work at all.
We could decide to always just make the MC copy "succeed" (and filling
the destination with zeroes), and that would then create a core dump
file that just ignores any machine checks.
But honestly, this single special case has been problematic before, and
means that all the normal iov_iter code ends up slightly more complex
and slower.
See for example commit c9eec08bac96 ("iov_iter: Don't deal with
iter->copy_mc in memcpy_from_iter_mc()") where David Howells
re-organized the code just to avoid having to check the 'copy_mc' flags
inside the inner iov_iter loops.
So considering that we have exactly one user, and that one user is a
non-critical special case that doesn't actually ever trigger in real
life (Tong found this with manual error injection), the sane solution is
to just decide that the onus on handling the machine check lines on that
user instead.
Ergo, do the copy_mc_to_kernel() in the core dump logic itself, copying
the user data to a stable kernel page before writing it out.
Fixes: f1982740f5e7 ("iov_iter: Convert iterate*() to inline funcs")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Link: https://lore.kernel.org/r/20240305133336.3804360-1-tongtiangen@huawei.com
Link: https://lore.kernel.org/all/4e80924d-9c85-f13a-722a-6a5d2b1c225a@huawei.com/
Tested-by: David Howells <dhowells@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reported-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Some FUSE daemons want to know if the received request is a resend
request. The high bit of the fuse request ID is utilized for indicating
this, enabling the receiver to perform appropriate handling.
The init flag "FUSE_HAS_RESEND" is added to indicate this feature.
Signed-off-by: Zhao Chen <winters.zc@antgroup.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
When a FUSE daemon panics and failover, we aim to minimize the impact on
applications by reusing the existing FUSE connection. During this process,
another daemon is employed to preserve the FUSE connection's file
descriptor. The new started FUSE Daemon will takeover the fd and continue
to provide service.
However, it is possible for some inflight requests to be lost and never
returned. As a result, applications awaiting replies would become stuck
forever. To address this, we can resend these pending requests to the
new started FUSE daemon.
This patch introduces a new notification type "FUSE_NOTIFY_RESEND", which
can trigger resending of the pending requests, ensuring they are properly
processed again.
Signed-off-by: Zhao Chen <winters.zc@antgroup.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
open_by_handle_at(2) can fail with -ESTALE with a valid handle returned
by a previous name_to_handle_at(2) for evicted fuse inodes, which is
especially common when entry_valid_timeout is 0, e.g. when the fuse
daemon is in "cache=none" mode.
The time sequence is like:
name_to_handle_at(2) # succeed
evict fuse inode
open_by_handle_at(2) # fail
The root cause is that, with 0 entry_valid_timeout, the dput() called in
name_to_handle_at(2) will trigger iput -> evict(), which will send
FUSE_FORGET to the daemon. The following open_by_handle_at(2) will send
a new FUSE_LOOKUP request upon inode cache miss since the previous inode
eviction. Then the fuse daemon may fail the FUSE_LOOKUP request with
-ENOENT as the cached metadata of the requested inode has already been
cleaned up during the previous FUSE_FORGET. The returned -ENOENT is
treated as -ESTALE when open_by_handle_at(2) returns.
This confuses the application somehow, as open_by_handle_at(2) fails
when the previous name_to_handle_at(2) succeeds. The returned errno is
also confusing as the requested file is not deleted and already there.
It is reasonable to fail name_to_handle_at(2) early in this case, after
which the application can fallback to open(2) to access files.
Since this issue typically appears when entry_valid_timeout is 0 which
is configured by the fuse daemon, the fuse daemon is the right person to
explicitly disable the export when required.
Also considering FUSE_EXPORT_SUPPORT actually indicates the support for
lookups of "." and "..", and there are existing fuse daemons supporting
export without FUSE_EXPORT_SUPPORT set, for compatibility, we add a new
INIT flag for such purpose.
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Arm SCMI spec. v3.2, s4.5.3.12 PERFORMANCE_DESCRIBE_FASTCHANNEL
defines a per-domain rate_limit for performance requests:
"""
Rate Limit in microseconds, indicating the minimum time
required between successive requests. A value of 0
indicates that this field is not applicable or supported
on the platform.
""""
The field is first defined in SCMI v2.0.
Add support to fetch this value and advertise it through
a fast_switch_rate_limit() callback.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
Arm SCMI spec. v3.2, s4.5.3.4 PERFORMANCE_DOMAIN_ATTRIBUTES
defines a per-domain rate_limit for performance requests:
"""
Rate Limit in microseconds, indicating the minimum time
required between successive requests. A value of 0
indicates that this field is not supported by the
platform. This field does not apply to FastChannels.
""""
The field is first defined in SCMI v1.0.
Add support to fetch this value and advertise it through
a rate_limit_get() callback.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
In order for EEE to operate, both the MAC and the PHY need to support
it, similar to how pause works. With some exception - a number of PHYs
have SmartEEE or AutoGrEEEn support in order to provide some EEE-like
power savings with non-EEE capable MACs.
Copy the pause concept and add the call phy_support_eee() which the MAC
makes after connecting the PHY to indicate it supports EEE. phylib will
then advertise EEE when auto-neg is performed.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240302195306.3207716-6-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Have phylib keep track of the EEE configuration. This simplifies the
MAC drivers, in that they don't need to store it.
Future patches to phylib will also make use of this information to
further simplify the MAC drivers.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240302195306.3207716-4-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MAC drivers which support EEE need to know the results of the EEE
auto-neg in order to program the hardware to perform EEE or not. The
oddly named phy_init_eee() can be used to determine this, it returns 0
if EEE should be used, or a negative error code,
e.g. -EOPPROTONOTSUPPORT if the PHY does not support EEE or negotiate
resulted in it not being used.
However, many MAC drivers get this wrong. Add phydev->enable_tx_lpi
which indicates the result of the autoneg for EEE, including if EEE is
administratively disabled with ethtool. The MAC driver can then access
this in the same way as link speed and duplex in the adjust link
callback. If enable_tx_lpi is true, the MAC should send low power
indications and does not need to consider anything else with respect
to EEE.
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240302195306.3207716-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add helpers that phylib and phylink can use to manage EEE configuration
and determine whether the MAC should be permitted to use LPI based on
that configuration.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240302195306.3207716-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Older versions of GCC really want to know the full definition
of the type involved in rcu_assign_pointer().
struct dpll_pin is defined in a local header, net/core can't
reach it. Move all the netdev <> dpll code into dpll, where
the type is known. Otherwise we'd need multiple function calls
to jump between the compilation units.
This is the same problem the commit under fixes was trying to address,
but with rcu_assign_pointer() not rcu_dereference().
Some of the exports are not needed, networking core can't
be a module, we only need exports for the helpers used by
drivers.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/all/35a869c8-52e8-177-1d4d-e57578b99b6@linux-m68k.org/
Fixes: 640f41ed33b5 ("dpll: fix build failure due to rcu_dereference_check() on unknown type")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240305013532.694866-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ARL-H uses the same media and display IP as MTL, and a version 12.74
graphics IP (referred to as Xe_LPG+). From a driver point of view, we
should be able to just treat the whole platform as MTL and rely on
GRAPHICS_VERx100 checks to handle any spots where ARL's Xe_LPG+ needs
different handling from MTL's Xe_LPG (i.e., workarounds).
v2: Resolve conflict and Reorder PCI ids in sorted order
v3: Append signed-off-by commiter to this commit
Bspec: 55420
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229070806.3402641-4-dnyaneshwar.bhadane@intel.com
|
|
This property is documented to have a special format which exposes all
available behaviours and the currently active one at the same time. For
this special format some helpers are provided.
When the charge_behaviour property was added in
1b0b6cc8030d ("power: supply: add charge_behaviour attributes"), it did
not update the default logic in in power_supply_sysfs.c to use the
format helpers. Thus by default only the currently active behaviour
is printed. This fixes the default logic to follow the documented
format.
There is currently only one in-tree drivers exposing charge behaviours -
thinkpad_acpi, which is not affected by the change, as it directly uses
the helpers and does not use the power_supply_sysfs.c logic.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20240303-power_supply-charge_behaviour_prop-v2-3-8ebb0a7c2409@weissschuh.net
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
Track the call timeouts as ktimes rather than jiffies as the latter's
granularity is too high and only set the timer at the end of the event
handling function.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
|
|
There are three points that transmit PING ACKs and all of them use the same
trace string. Change two of them to use different strings.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-afs@lists.infradead.org
cc: netdev@vger.kernel.org
|
|
Introduce power_supply_for_each_device(), which is a wrapper
for class_for_each_device() using the power_supply_class and
going through all devices.
This allows making the power_supply_class itself a local
variable, so that drivers cannot mess with it and simplifies
the code slightly.
Reviewed-by: Ricardo B. Marliere <ricardo@marliere.net>
Link: https://lore.kernel.org/r/20240301-psy-class-cleanup-v1-1-aebe8c4b6b08@collabora.com
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Multiple fixes, cleanups and documentations for Hyper-V core code and
drivers
* tag 'hyperv-fixes-signed-20240303' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: make hv_bus const
x86/hyperv: Allow 15-bit APIC IDs for VTL platforms
x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad()
x86/mm: Regularize set_memory_p() parameters and make non-static
x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callback
Documentation: hyperv: Add overview of PCI pass-thru device support
Drivers: hv: vmbus: Update indentation in create_gpadl_header()
Drivers: hv: vmbus: Remove duplication and cleanup code in create_gpadl_header()
fbdev/hyperv_fb: Fix logic error for Gen2 VMs in hvfb_getmem()
Drivers: hv: vmbus: Calculate ring buffer size for more efficient use of memory
hv_utils: Allow implicit ICTIMESYNCFLAG_SYNC
|
|
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the nfc_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-6-8fa378595b93@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Linux 6.8-rc7
|
|
[BUG]
Currently btrfs can create subvolume with an invalid qgroup inherit
without triggering any error:
# mkfs.btrfs -O quota -f $dev
# mount $dev $mnt
# btrfs subvolume create -i 2/0 $mnt/subv1
# btrfs qgroup show -prce --sync $mnt
Qgroupid Referenced Exclusive Path
-------- ---------- --------- ----
0/5 16.00KiB 16.00KiB <toplevel>
0/256 16.00KiB 16.00KiB subv1
[CAUSE]
We only do a very basic size check for btrfs_qgroup_inherit structure,
but never really verify if the values are correct.
Thus in btrfs_qgroup_inherit() function, we have to skip non-existing
qgroups, and never return any error.
[FIX]
Fix the behavior and introduce extra checks:
- Introduce early check for btrfs_qgroup_inherit structure
Not only the size, but also all the qgroup ids would be verified.
And the timing is very early, so we can return error early.
This early check is very important for snapshot creation, as snapshot
is delayed to transaction commit.
- Drop support for btrfs_qgroup_inherit::num_ref_copies and
num_excl_copies
Those two members are used to specify to copy refr/excl numbers from
other qgroups.
This would definitely mark qgroup inconsistent, and btrfs-progs has
dropped the support for them for a long time.
It's time to drop the support for kernel.
- Verify the supported btrfs_qgroup_inherit::flags
Just in case we want to add extra flags for btrfs_qgroup_inherit.
Now above subvolume creation would fail with -ENOENT other than silently
ignore the non-existing qgroup.
CC: stable@vger.kernel.org # 6.7+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Those are already de-facto UAPI, so let's just move it into the uapi
header.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240305133853.2214268-2-kherbst@redhat.com
|
|
FORTIFY_SOURCE has been ignoring 0-sized destinations while the kernel
code base has been converted to flexible arrays. In order to enforce
the 0-sized destinations (e.g. with __counted_by), the remaining 0-sized
destinations need to be handled. Instead of converting an empty struct
into using a flexible array, just directly use a pointer without any
additional indirection. Remove struct gb_bootrom_get_firmware_response
and struct gb_fw_download_fetch_firmware_response.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240304211940.it.083-kees@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Several serial drivers want to read the same or similar set of
the port properties. Make a common helper for them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240304123035.758700-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In some APIs we would like to assign the special value to iotype
and compare against it in another places. Introduce UPIO_UNKNOWN
for this purpose.
Note, we can't use 0, because it's a valid value for IO port access.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240304123035.758700-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Currently it's not crystal clear what UPIO_* and UPQ_* definitions
belong to. Reindent the code, so it will be easy to read and understand.
No functional changes intended.
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240304123035.758700-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If the circular buffer is empty, it just means we fit all characters to
send into the HW fifo, but not that the hardware finished transmitting
them.
So if we immediately call stop_tx() after that, this may abort any
pending characters in the HW fifo, and cause dropped characters on the
console.
Fix this by only stopping tx when the tx HW fifo is actually empty.
Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers")
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Link: https://lore.kernel.org/r/20240303150807.68117-1-jonas.gorski@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This adds the support to set the connector orientation value
accordingly. This is part of the optional CONFIG_STANDARD_OUTPUT
register 0x18, specified within the USB port controller spsicification
rev. 2.0 [1].
[1] https://www.usb.org/sites/default/files/documents/usb-port_controller_specification_rev2.0_v1.0_0.pdf
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20240222210903.208901-4-m.felsch@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When a USB hub is described in DT, such as any device that matches the
onboard-hub driver, the connect_type is set to "unknown" or
USB_PORT_CONNECT_TYPE_UNKNOWN. This makes any device plugged into that
USB port report their 'removable' device attribute as "unknown".
ChromeOS userspace would like to know if the USB device is actually
removable or not so that security policies can be applied. Improve the
connect_type attribute for ports, and in turn the removable attribute
for USB devices, by looking for child devices with a reg property or an
OF graph when the device is described in DT.
If the graph exists, endpoints that are connected to a remote node must
be something like a usb-{a,b,c}-connector compatible node, or an
intermediate node like a redriver, and not a hardwired USB device on the
board. Set the connect_type to USB_PORT_CONNECT_TYPE_HOT_PLUG in this
case because the device is going to be plugged in. Set the connect_type
to USB_PORT_CONNECT_TYPE_HARD_WIRED if there's a child node for the port
like 'device@2' for port2. Set the connect_type to USB_PORT_NOT_USED if
there isn't an endpoint or child node corresponding to the port number.
To make sure things don't change, only set the port to not used if
there are child nodes. This way an onboard hub connect_type doesn't
change until ports are added or child nodes are added to describe
hardwired devices. It's assumed that all ports or no ports will be
described for a device.
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: linux-usb@vger.kernel.org
Cc: devicetree@vger.kernel.org
Cc: Pin-yen Lin <treapking@chromium.org>
Cc: maciek swiech <drmasquatch@google.com>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20240223005823.3074029-3-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If we have a neat macro, at least new code should
use it.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20240229131851.16148-2-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Bridge driver today has no support to forward the userspace timestamp
packets and ends up resetting the timestamp. ETF qdisc checks the
packet coming from userspace and encounters to be 0 thereby dropping
time sensitive packets. These changes will allow userspace timestamps
packets to be forwarded from the bridge to NIC drivers.
Setting the same bit (mono_delivery_time) to avoid dropping of
userspace tstamp packets in the forwarding path.
Existing functionality of mono_delivery_time remains unaltered here,
instead just extended with userspace tstamp support for bridge
forwarding path.
Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240301201348.2815102-1-quic_abchauha@quicinc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
FUSE server calls the FUSE_DEV_IOC_BACKING_OPEN ioctl with a backing file
descriptor. If the call succeeds, a backing file identifier is returned.
A later change will be using this backing file id in a reply to OPEN
request with the flag FOPEN_PASSTHROUGH to setup passthrough of file
operations on the open FUSE file to the backing file.
The FUSE server should call FUSE_DEV_IOC_BACKING_CLOSE ioctl to close the
backing file by its id.
This can be done at any time, but if an open reply with FOPEN_PASSTHROUGH
flag is still in progress, the open may fail if the backing file is
closed before the fuse file was opened.
Setting up backing files requires a server with CAP_SYS_ADMIN privileges.
For the backing file to be successfully setup, the backing file must
implement both read_iter and write_iter file operations.
The limitation on the level of filesystem stacking allowed for the
backing file is enforced before setting up the backing file.
Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|