Age | Commit message (Collapse) | Author |
|
linux-next hit the following build error:
net/ipv4/tcp_ao.c: In function 'tcp_ao_key_alloc':
net/ipv4/tcp_ao.c:1536:13: error: implicit declaration of function 'crypto_ahash_alignmask'; did you mean 'crypto_ahash_alg_name'? [-Werror=implicit-function-declaration]
1536 | if (crypto_ahash_alignmask(tfm) > TCP_AO_KEY_ALIGN) {
| ^~~~~~~~~~~~~~~~~~~~~~
| crypto_ahash_alg_name
Caused by commit from the crypto tree
0f8660c82b79 ("crypto: ahash - remove crypto_ahash_alignmask")
interacting with commit
4954f17ddefc ("net/tcp: Introduce TCP_AO setsockopt()s")
from networking. crypto_ahash_alignmask() has been phased out
by the former commit, drop the call in networking.
Eric confirms that the check is safe to remove and was questionable
here in the first place.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The commit 5054e778fcd9c ("dm crypt: allocate compound pages if
possible") changed dm-crypt to use compound pages to improve
performance. Unfortunately, there was an oversight: the allocation of
compound pages was not accounted at all. Normal pages are accounted in
a percpu counter cc->n_allocated_pages and dm-crypt is limited to
allocate at most 2% of memory. Because compound pages were not
accounted at all, dm-crypt could allocate memory over the 2% limit.
Fix this by adding the accounting of compound pages, so that memory
consumption of dm-crypt is properly limited.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 5054e778fcd9c ("dm crypt: allocate compound pages if possible")
Cc: stable@vger.kernel.org # v6.5+
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
Add the committed decoder sysfs attribute for v6.7.
|
|
Pickup some misc. CXL updates for v6.7.
|
|
Merge some prep-work for CXL QOS class support. This cycle saw large
collisions with mm on this topic, so the bulk of this topic needs to
wait.
|
|
Restricted CXL Host (RCH) Error Handling undoes the topology munging of
CXL 1.1 to enabled some AER recovery, and lands some base infrastructure
for handling Root-Complex-Event-Collectors (RCECs) with CXL. Include
this long running series finally for v6.7.
|
|
We were passing 0 as the xid for the call to query
server interfaces. This is not great for debugging.
This change adds a real xid.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In the output of /proc/fs/cifs/DebugData, we do not
print the server->capabilities field today.
With this change, we will do that.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Simplify symlink_hash() by using crypto_shash_digest() instead of an
init+update+final sequence. This should also improve performance.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Skip SMB sessions that are being teared down
(e.g. @ses->ses_status == SES_EXITING) in cifs_debug_data_proc_show()
to avoid use-after-free in @ses.
This fixes the following GPF when reading from /proc/fs/cifs/DebugData
while mounting and umounting
[ 816.251274] general protection fault, probably for non-canonical
address 0x6b6b6b6b6b6b6d81: 0000 [#1] PREEMPT SMP NOPTI
...
[ 816.260138] Call Trace:
[ 816.260329] <TASK>
[ 816.260499] ? die_addr+0x36/0x90
[ 816.260762] ? exc_general_protection+0x1b3/0x410
[ 816.261126] ? asm_exc_general_protection+0x26/0x30
[ 816.261502] ? cifs_debug_tcon+0xbd/0x240 [cifs]
[ 816.261878] ? cifs_debug_tcon+0xab/0x240 [cifs]
[ 816.262249] cifs_debug_data_proc_show+0x516/0xdb0 [cifs]
[ 816.262689] ? seq_read_iter+0x379/0x470
[ 816.262995] seq_read_iter+0x118/0x470
[ 816.263291] proc_reg_read_iter+0x53/0x90
[ 816.263596] ? srso_alias_return_thunk+0x5/0x7f
[ 816.263945] vfs_read+0x201/0x350
[ 816.264211] ksys_read+0x75/0x100
[ 816.264472] do_syscall_64+0x3f/0x90
[ 816.264750] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 816.265135] RIP: 0033:0x7fd5e669d381
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
All release_mid() callers seem to hold a reference of @mid so there is
no need to call kref_put(&mid->refcount, __release_mid) under
@server->mid_lock spinlock. If they don't, then an use-after-free bug
would have occurred anyways.
By getting rid of such spinlock also fixes a potential deadlock as
shown below
CPU 0 CPU 1
------------------------------------------------------------------
cifs_demultiplex_thread() cifs_debug_data_proc_show()
release_mid()
spin_lock(&server->mid_lock);
spin_lock(&cifs_tcp_ses_lock)
spin_lock(&server->mid_lock)
__release_mid()
smb2_find_smb_tcon()
spin_lock(&cifs_tcp_ses_lock) *deadlock*
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When firmware status is invalid, assign -EINVAL to ret as ret is '0' at
that point and returning success is incorrect when firmware status is
invalid.
Fixes: a51d8ba03a4f ("ALSA: hda: cs35l41: Check CSPL state after loading firmware")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Link: https://lore.kernel.org/r/20231030070836.3234385-1-harshit.m.mogalapalli@oracle.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Fixes some xfstests including generic/564 and generic/157
The "sfu" mount option can be useful for creating special files (character
and block devices in particular) but could not create FIFOs. It did
recognize existing empty files with the "system" attribute flag as FIFOs
but this is too general, so to support creating FIFOs more safely use a new
tag (but the same length as those for char and block devices ie "IntxLNK"
and "IntxBLK") "LnxFIFO" to indicate that the file should be treated as a
FIFO (when mounted with the "sfu"). For some additional context note that
"sfu" followed the way that "Services for Unix" on Windows handled these
special files (at least for character and block devices and symlinks),
which is different than newer Windows which can handle special files
as reparse points (which isn't an option to many servers).
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The acpi_thermal_unregister_thermal_zone() is paired with
acpi_thermal_register_thermal_zone() so it should mirror it. It should
clean up all the resources that the register function allocated and
leave the stuff that was allocated elsewhere.
Unfortunately, it doesn't call thermal_zone_device_disable(). Also it
calls kfree(tz->trip_table) when it shouldn't. That was allocated in
acpi_thermal_add(). Putting the kfree() here leads to a double free
in the acpi_thermal_add() clean up function.
Likewise, the acpi_thermal_remove() should mirror acpi_thermal_add() so
it should have an explicit kfree(tz->trip_table) as well.
Fixes: ec23c1c462de ("ACPI: thermal: Use trip point table to register thermal zones")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
New helper for new rebalance code
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
data_progress_list is gone - it was redundant with moving_context_list
The upcoming rebalance rewrite is going to have it using two different
move_stats objects with the same moving_context, depending on whether
it's scanning or using the rebalance_work btree - this patch plumbs
stats around a bit differently so that will work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
btree_trans and moving_context are used together, and having the
moving_context owns the transaction object reduces some plumbing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Prep work for the new rebalance code - we need a few helpers exported.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Since compression options now include compression level, proper
validation is a bit more involved.
This adds bch2_compression_opt_valid(), and plumbs it around
appropriately.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes an infinite loop when there's a set bit at position >= 32.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We don't yet repair (split) them, just check.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The write path may (rarely) see an encoded (checksummed) extent that
exceeds encoded_extent_max - this can happen when we're moving an
existing extent that was not checksummed, but was given a checksum by
bch2_write_rechecksum().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We're going to be using bch2_target_to_text() ->
bch2_disk_path_to_text() from bch2_bkey_ptrs_to_text() and
bch2_bkey_ptrs_invalid(), which can be called in any context.
This patch adds the actual label to bch_disk_group_cpu so that it can be
used by bch2_disk_path_to_text, and splits out bch2_disk_path_to_text()
into two variants - like the previous patch, one for when we have a
running filesystem and another for when we only have a superblock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Previously we just had bch2_opt_target_to_text() which could be passed
either a filesystem object or just a superblock - depending on if we
have a running filesystem or not.
Split these into two functions for clarity.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Upcoming rebalance_work btree will require extent triggers to be
BTREE_TRIGGER_WANTS_OLD_AND_NEW - so to reduce potential confusion,
let's just make all triggers BTREE_TRIGGER_WANTS_OLD_AND_NEW.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The data move path now correctly picks IO options when inodes in
different snapshots have different options applied.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We can't mark device superblocks or allocate journal on a device that
isn't online.
That means we may need to do this on every mount, because we may have
formatted a new filesystem and then done the first mount
(bch2_fs_initialize()) in degraded mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This code duplicated initialization already done in
bch2_fs_btree_iter_init().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The ca->oldest_gen array needs to be the same size as the bucket_gens
array; ca->mi.nbuckets is updated with only state_lock held, not
gc_lock, so bch2_gc_gens() could race with device resize and allocate
too small of an oldest_gens array.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Shrinkers are now exported to debugfs, so the names can't have slashes
in them.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More forwards compatibility fixups: having BKEY_TYPE_btree at the end of
the enum conflicts with unnkown btree IDs, this shifts BKEY_TYPE_btree
to slot 0 and fixes things up accordingly.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Since we can run with unknown btree IDs, we can't directly index btree
IDs into fixed size arrays.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Be a bit more careful about when bch2_delete_dead_snapshots needs to
run: it only needs to run synchronously if we're running fsck, and it
only needs to run at all if we have snapshot nodes to delete or if fsck
has noticed that it needs to run.
Also:
Rename BCH_FS_HAVE_DELETED_SNAPSHOTS -> BCH_FS_NEED_DELETE_DEAD_SNAPSHOTS
Kill bch2_delete_dead_snapshots_hook(), move functionality to
bch2_mark_snapshot()
Factor out bch2_check_snapshot_needs_deletion(), to explicitly check
if we need to be running snapshot deletion.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We must not hold btree locks while taking snapshot_create_lock - this
fixes a lockdep splat.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Support usec resolution of TCP timestamps, enabled selectively by a
route attribute.
- Defer regular TCP ACK while processing socket backlog, try to send
a cumulative ACK at the end. Increase single TCP flow performance
on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).
- The Fair Queuing (FQ) packet scheduler:
- add built-in 3 band prio / WRR scheduling
- support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
- improve inactive flow reporting
- optimize the layout of structures for better cache locality
- Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
replacement for the old MD5 option.
- Add more retransmission timeout (RTO) related statistics to
TCP_INFO.
- Support sending fragmented skbs over vsock sockets.
- Make sure we send SIGPIPE for vsock sockets if socket was
shutdown().
- Add sysctl for ignoring lower limit on lifetime in Router
Advertisement PIO, based on an in-progress IETF draft.
- Add sysctl to control activation of TCP ping-pong mode.
- Add sysctl to make connection timeout in MPTCP configurable.
- Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
limit the number of wakeups.
- Support netlink GET for MDB (multicast forwarding), allowing user
space to request a single MDB entry instead of dumping the entire
table.
- Support selective FDB flushing in the VXLAN tunnel driver.
- Allow limiting learned FDB entries in bridges, prevent OOM attacks.
- Allow controlling via configfs netconsole targets which were
created via the kernel cmdline at boot, rather than via configfs at
runtime.
- Support multiple PTP timestamp event queue readers with different
filters.
- MCTP over I3C.
BPF:
- Add new veth-like netdevice where BPF program defines the logic of
the xmit routine. It can operate in L3 and L2 mode.
- Support exceptions - allow asserting conditions which should never
be true but are hard for the verifier to infer. With some extra
flexibility around handling of the exit / failure:
https://lwn.net/Articles/938435/
- Add support for local per-cpu kptr, allow allocating and storing
per-cpu objects in maps. Access to those objects operates on the
value for the current CPU.
This allows to deprecate local one-off implementations of per-CPU
storage like BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.
- Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
for systemd to re-implement the LogNamespace feature which allows
running multiple instances of systemd-journald to process the logs
of different services.
- Enable open-coded task_vma iteration, after maple tree conversion
made it hard to directly walk VMAs in tracing programs.
- Add open-coded task, css_task and css iterator support. One of the
use cases is customizable OOM victim selection via BPF.
- Allow source address selection with bpf_*_fib_lookup().
- Add ability to pin BPF timer to the current CPU.
- Prevent creation of infinite loops by combining tail calls and
fentry/fexit programs.
- Add missed stats for kprobes to retrieve the number of missed
kprobe executions and subsequent executions of BPF programs.
- Inherit system settings for CPU security mitigations.
- Add BPF v4 CPU instruction support for arm32 and s390x.
Changes to common code:
- overflow: add DEFINE_FLEX() for on-stack definition of structs with
flexible array members.
- Process doc update with more guidance for reviewers.
Driver API:
- Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
mutex in most places and remove a lot of smaller locks.
- Create a common DPLL configuration API. Allow configuring and
querying state of PLL circuits used for clock syntonization, in
network time distribution.
- Unify fragmented and full page allocation APIs in page pool code.
Let drivers be ignorant of PAGE_SIZE.
- Rework PHY state machine to avoid races with calls to phy_stop().
- Notify DSA drivers of MAC address changes on user ports, improve
correctness of offloads which depend on matching port MAC
addresses.
- Allow antenna control on injected WiFi frames.
- Reduce the number of variants of napi_schedule().
- Simplify error handling when composing devlink health messages.
Misc:
- A lot of KCSAN data race "fixes", from Eric.
- A lot of __counted_by() annotations, from Kees.
- A lot of strncpy -> strscpy and printf format fixes.
- Replace master/slave terminology with conduit/user in DSA drivers.
- Handful of KUnit tests for netdev and WiFi core.
Removed:
- AppleTalk COPS.
- AppleTalk ipddp.
- TI AR7 CPMAC Ethernet driver.
Drivers:
- Ethernet high-speed NICs:
- Intel (100G, ice, idpf):
- add a driver for the Intel E2000 IPUs
- make CRC/FCS stripping configurable
- cross-timestamping for E823 devices
- basic support for E830 devices
- use aux-bus for managing client drivers
- i40e: report firmware versions via devlink
- nVidia/Mellanox:
- support 4-port NICs
- increase max number of channels to 256
- optimize / parallelize SF creation flow
- Broadcom (bnxt):
- enhance NIC temperature reporting
- support PAM4 speeds and lane configuration
- Marvell OcteonTX2:
- PTP pulse-per-second output support
- enable hardware timestamping for VFs
- Solarflare/AMD:
- conntrack NAT offload and offload for tunnels
- Wangxun (ngbe/txgbe):
- expose HW statistics
- Pensando/AMD:
- support PCI level reset
- narrow down the condition under which skbs are linearized
- Netronome/Corigine (nfp):
- support CHACHA20-POLY1305 crypto in IPsec offload
- Ethernet NICs embedded, slower, virtual:
- Synopsys (stmmac):
- add Loongson-1 SoC support
- enable use of HW queues with no offload capabilities
- enable PPS input support on all 5 channels
- increase TX coalesce timer to 5ms
- RealTek USB (r8152): improve efficiency of Rx by using GRO frags
- xen: support SW packet timestamping
- add drivers for implementations based on TI's PRUSS (AM64x EVM)
- nVidia/Mellanox Ethernet datacenter switches:
- avoid poor HW resource use on Spectrum-4 by better block
selection for IPv6 multicast forwarding and ordering of blocks
in ACL region
- Ethernet embedded switches:
- Microchip:
- support configuring the drive strength for EMI compliance
- ksz9477: partial ACL support
- ksz9477: HSR offload
- ksz9477: Wake on LAN
- Realtek:
- rtl8366rb: respect device tree config of the CPU port
- Ethernet PHYs:
- support Broadcom BCM5221 PHYs
- TI dp83867: support hardware LED blinking
- CAN:
- add support for Linux-PHY based CAN transceivers
- at91_can: clean up and use rx-offload helpers
- WiFi:
- MediaTek (mt76):
- new sub-driver for mt7925 USB/PCIe devices
- HW wireless <> Ethernet bridging in MT7988 chips
- mt7603/mt7628 stability improvements
- Qualcomm (ath12k):
- WCN7850:
- enable 320 MHz channels in 6 GHz band
- hardware rfkill support
- enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to
make scan faster
- read board data variant name from SMBIOS
- QCN9274: mesh support
- RealTek (rtw89):
- TDMA-based multi-channel concurrency (MCC)
- Silicon Labs (wfx):
- Remain-On-Channel (ROC) support
- Bluetooth:
- ISO: many improvements for broadcast support
- mark BCM4378/BCM4387 as BROKEN_LE_CODED
- add support for QCA2066
- btmtksdio: enable Bluetooth wakeup from suspend"
* tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1816 commits)
net: pcs: xpcs: Add 2500BASE-X case in get state for XPCS drivers
net: bpf: Use sockopt_lock_sock() in ip_sock_set_tos()
net: mana: Use xdp_set_features_flag instead of direct assignment
vxlan: Cleanup IFLA_VXLAN_PORT_RANGE entry in vxlan_get_size()
iavf: delete the iavf client interface
iavf: add a common function for undoing the interrupt scheme
iavf: use unregister_netdev
iavf: rely on netdev's own registered state
iavf: fix the waiting time for initial reset
iavf: in iavf_down, don't queue watchdog_task if comms failed
iavf: simplify mutex_trylock+sleep loops
iavf: fix comments about old bit locks
doc/netlink: Update schema to support cmd-cnt-name and cmd-max-name
tools: ynl: introduce option to process unknown attributes or types
ipvlan: properly track tx_errors
netdevsim: Block until all devices are released
nfp: using napi_build_skb() to replace build_skb()
net: dsa: microchip: ksz9477: Fix spelling mistake "Enery" -> "Energy"
net: dsa: microchip: Ensure Stable PME Pin State for Wake-on-LAN
net: dsa: microchip: Refactor switch shutdown routine for WoL preparation
...
|
|
Simplify sb_mac() by using crypto_shash_digest() instead of an
init+update+final sequence. This should also improve performance.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
Simplify crypt_iv_tcw_whitening() by using crypto_shash_digest() instead
of an init+update+final sequence. This should also improve performance.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
dm-error is used in several test cases in the xfstests test suite to
check the handling of IO errors in file systems. However, with several
file systems getting native support for zoned block devices (e.g.
btrfs and f2fs), dm-error's lack of zoned block device support creates
problems as the file system attempts executing zone commands (e.g. a
zone append operation) against a dm-error non-zoned block device,
which causes various issues in the block layer (e.g. WARN_ON
triggers).
This commit adds supports for zoned block devices to dm-error, allowing
a DM device table containing an error target to be exposed as a zoned
block device (if all targets have a compatible zoned model support and
mapping). This is done as follows:
1) Allow passing 2 arguments to an error target, similar to dm-linear:
a backing device and a start sector. These arguments are optional and
dm-error retains its characteristics if the arguments are not
specified.
2) Implement the iterate_devices method so that dm-core can normally
check the zone support and restrictions (e.g. zone alignment of the
targets). When the backing device arguments are not specified, the
iterate_devices method never calls the fn() argument.
When no backing device is specified, as before, we assume that the DM
device is not zoned. When the backing device arguments are specified,
the zoned model of the DM device will depend on the backing device
type:
- If the backing device is zoned and its model and mapping is
compatible with other targets of the device, the resulting device
will be zoned, with the dm-error mapped portion always returning
errors (similar to the default non-zoned case).
- If the backing device is not zoned, then the DM device will not be
either.
This zone support for dm-error requires the definition of a functional
report_zones operation so that dm_revalidate_zones() can operate
correctly and resources for emulating zone append operations
initialized. This is necessary for cases where dm-error is used to
partially map a device and have an overall correct handling of zone
append. This means that dm-error does not fail report zones operations.
Two changes that are not obvious are included to avoid issues:
1) dm_table_supports_zoned_model() is changed to directly check if
the backing device of a wildcard target (= dm-error target) is
zoned. Otherwise, we wouldn't be able to catch the invalid setup of
dm-error without a backing device (non zoned case) being combined
with zoned targets.
2) dm_table_supports_dax() is modified to return false if the wildcard
target is found. Otherwise, when dm-error is set without a backing
device, we end up with a NULL pointer dereference in
set_dax_synchronous (dax_dev is NULL). This is consistent with the
current behavior because dm_table_supports_dax() always returned
false for targets that do not define the iterate_devices method.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
DM delay's current design of using timers and wq to realize the delays
is insufficient for delays below ~50ms.
This commit enhances the design to use a kthread to flush the expired
delays, trading some CPU time (in some cases) for better delay
accuracy and delays closer to what the user requested for smaller
delays. The new design is chosen as long as all the delays are below
50ms.
Since bios can't be completed in interrupt context using a kthread
is probably the most reasonable way to approach this.
Testing with
echo "0 2097152 zero" | dmsetup create dm-zeros
for i in $(seq 0 20);
do
echo "0 2097152 delay /dev/mapper/dm-zeros 0 $i" | dmsetup create dm-delay-${i}ms;
done
Some performance numbers for comparison, on beaglebone black (single
core) CONFIG_HZ_1000=y:
fio --name=1msread --rw=randread --bs=4k --runtime=60 --time_based \
--filename=/dev/mapper/dm-delay-1ms
Theoretical maximum: 1000 IOPS
Previous: 250 IOPS
Kthread: 500 IOPS
fio --name=10msread --rw=randread --bs=4k --runtime=60 --time_based \
--filename=/dev/mapper/dm-delay-10ms
Theoretical maximum: 100 IOPS
Previous: 45 IOPS
Kthread: 50 IOPS
fio --name=1mswrite --rw=randwrite --direct=1 --bs=4k --runtime=60 \
--time_based --filename=/dev/mapper/dm-delay-1ms
Theoretical maximum: 1000 IOPS
Previous: 498 IOPS
Kthread: 1000 IOPS
fio --name=10mswrite --rw=randwrite --direct=1 --bs=4k --runtime=60 \
--time_based --filename=/dev/mapper/dm-delay-10ms
Theoretical maximum: 100 IOPS
Previous: 90 IOPS
Kthread: 100 IOPS
(This one is just to prove the new design isn't impacting throughput,
not really about delays):
fio --name=10mswriteasync --rw=randwrite --direct=1 --bs=4k \
--runtime=60 --time_based --filename=/dev/mapper/dm-delay-10ms \
--numjobs=32 --iodepth=64 --ioengine=libaio --group_reporting
Previous: 13.3k IOPS
Kthread: 13.3k IOPS
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
[Harshit: kthread_create error handling fix in delay_ctr]
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
I2S mode register field will be set to 1 when tdm mode is enabled.
Update the I2S mode field based on tdm_mode flag check.
This will fix below smatch checker warning.
sound/soc/amd/acp/acp-i2s.c:59 acp_set_i2s_clk()
warn: odd binop '0x0 & 0x2'
Fixes: 40f74d5f09d7 ("ASoC: amd: acp: refactor acp i2s clock
generation code")
Reported-By: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Syed Saba Kareem <Syed.SabaKareem@amd.com>
Link: https://lore.kernel.org/r/20231031135949.1064581-3-Syed.SabaKareem@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
KVM SVM changes for 6.7:
- Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts SHUTDOWN while
running an SEV-ES guest.
- Clean up handling "failures" when KVM detects it can't emulate the "skip"
action for an instruction that has already been partially emulated. Drop a
hack in the SVM code that was fudging around the emulator code not giving
SVM enough information to do the right thing.
|
|
KVM PMU change for 6.7:
- Handle NMI/SMI requests after PMU/PMI requests so that a PMI=>NMI doesn't
require redoing the entire run loop due to the NMI not being detected until
the final kvm_vcpu_exit_request() check before entering the guest.
|
|
KVM x86 Xen changes for 6.7:
- Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.
- Use the fast path directly from the timer callback when delivering Xen timer
events. Avoid the problematic races with using the fast path by ensuring
the hrtimer isn't running when (re)starting the timer or saving the timer
information (for userspace).
- Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future flag.
|
|
KVM x86 MMU changes for 6.7:
- Clean up code that deals with honoring guest MTRRs when the VM has
non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.
- Zap EPT entries when non-coherent DMA assignment stops/start to prevent
using stale entries with the wrong memtype.
- Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y, as
there's zero reason to ignore guest PAT if the effective MTRR memtype is WB.
This will also allow for future optimizations of handling guest MTRR updates
for VMs with non-coherent DMA and the quirk enabled.
- Harden the fast page fault path to guard against encountering an invalid
root when walking SPTEs.
|
|
In the unlikely event that workqueue allocation fails and returns NULL in
mlx5_mkey_cache_init(), delete the call to
mlx5r_umr_resource_cleanup() (which frees the QP) in
mlx5_ib_stage_post_ib_reg_umr_init(). This will avoid attempted double
free of the same QP when __mlx5_ib_add() does its cleanup.
Resolves a splat:
Syzkaller reported a UAF in ib_destroy_qp_user
workqueue: Failed to create a rescuer kthread for wq "mkey_cache": -EINTR
infiniband mlx5_0: mlx5_mkey_cache_init:981:(pid 1642):
failed to create work queue
infiniband mlx5_0: mlx5_ib_stage_post_ib_reg_umr_init:4075:(pid 1642):
mr cache init failed -12
==================================================================
BUG: KASAN: slab-use-after-free in ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073)
Read of size 8 at addr ffff88810da310a8 by task repro_upstream/1642
Call Trace:
<TASK>
kasan_report (mm/kasan/report.c:590)
ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073)
mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4178)
mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
...
</TASK>
Allocated by task 1642:
__kmalloc (./include/linux/kasan.h:198 mm/slab_common.c:1026
mm/slab_common.c:1039)
create_qp (./include/linux/slab.h:603 ./include/linux/slab.h:720
./include/rdma/ib_verbs.h:2795 drivers/infiniband/core/verbs.c:1209)
ib_create_qp_kernel (drivers/infiniband/core/verbs.c:1347)
mlx5r_umr_resource_init (drivers/infiniband/hw/mlx5/umr.c:164)
mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4070)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168)
mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
...
Freed by task 1642:
__kmem_cache_free (mm/slub.c:1826 mm/slub.c:3809 mm/slub.c:3822)
ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2112)
mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198)
mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4076
drivers/infiniband/hw/mlx5/main.c:4065)
__mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168)
mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
...
Fixes: 04876c12c19e ("RDMA/mlx5: Move init and cleanup of UMR to umr.c")
Link: https://lore.kernel.org/r/1698170518-4006-1-git-send-email-george.kennedy@oracle.com
Suggested-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
KVM x86 misc changes for 6.7:
- Add CONFIG_KVM_MAX_NR_VCPUS to allow supporting up to 4096 vCPUs without
forcing more common use cases to eat the extra memory overhead.
- Add IBPB and SBPB virtualization support.
- Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
creating the original vCPU would cause KVM to try to synchronize the vCPU's
TSC and thus clobber the correct TSC being set by userspace.
- Compute guest wall clock using a single TSC read to avoid generating an
inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
- "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
- Don't apply side effects to Hyper-V's synthetic timer on writes from
userspace to fix an issue where the auto-enable behavior can trigger
spurious interrupts, i.e. do auto-enabling only for guest writes.
- Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
without PML enabled.
- Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
- Use octal notation for file permissions through KVM x86.
- Fix a handful of typo fixes and warts.
|