Age | Commit message (Collapse) | Author |
|
The check for 'sym1 == sym2' is redundant here because it has already
been done a few lines above:
if (sym1 != sym2)
return NULL;
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Currently, comparisons to 'm' or 'n' result in incorrect output.
[Test Code]
config MODULES
def_bool y
modules
config A
def_tristate m
config B
def_bool A > n
CONFIG_B is unset, while CONFIG_B=y is expected.
The reason for the issue is because Kconfig compares the tristate values
as strings.
Currently, the .type fields in the constant symbol definitions,
symbol_{yes,mod,no} are unspecified, i.e., S_UNKNOWN.
When expr_calc_value() evaluates 'A > n', it checks the types of 'A' and
'n' to determine how to compare them.
The left-hand side, 'A', is a tristate symbol with a value of 'm', which
corresponds to a numeric value of 1. (Internally, 'y', 'm', and 'n' are
represented as 2, 1, and 0, respectively.)
The right-hand side, 'n', has an unknown type, so it is treated as the
string "n" during the comparison.
expr_calc_value() compares two values numerically only when both can
have numeric values. Otherwise, they are compared as strings.
symbol numeric value ASCII code
-------------------------------------
y 2 0x79
m 1 0x6d
n 0 0x6e
'm' is greater than 'n' if compared numerically (since 1 is greater
than 0), but smaller than 'n' if compared as strings (since the ASCII
code 0x6d is smaller than 0x6e).
Specifying .type=S_TRISTATE for symbol_{yes,mod,no} fixes the above
test code.
Doing so, however, would cause a regression to the following test code.
[Test Code 2]
config MODULES
def_bool n
modules
config A
def_tristate n
config B
def_bool A = m
You would get CONFIG_B=y, while CONFIG_B should not be set.
The reason is because sym_get_string_value() turns 'm' into 'n' when the
module feature is disabled. Consequently, expr_calc_value() evaluates
'A = n' instead of 'A = m'. This oddity has been hidden because the type
of 'm' was previously S_UNKNOWN instead of S_TRISTATE.
sym_get_string_value() should not tweak the string because the tristate
value has already been correctly calculated. There is no reason to
return the string "n" where its tristate value is mod.
Fixes: 31847b67bec0 ("kconfig: allow use of relations other than (in)equality")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
This has not been used since commit e911503085ae ("Kconfig: Remove
bad inference rules expr_eliminate_dups2()").
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Commit ec144244a43f ("drm/gem-shmem: Acquire reservation lock in GEM
pin/unpin callbacks") moved locking DRM object's dma reservation to
drm_gem_shmem_object_pin, and made drm_gem_shmem_pin_locked public, so
we need to make sure the not-imported check warning is also added to
the latter.
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: a78027847226 ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()")
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240523113236.432585-4-adrian.larumbe@collabora.com
|
|
Commit a78027847226 ("drm/gem: Acquire reservation lock in
drm_gem_{pin/unpin}()") moved locking the DRM object's dma reservation to
drm_gem_pin(), but Lima's pin callback kept calling drm_gem_shmem_pin,
which also tries to lock the same dma_resv, leading to a double lock
situation.
As was already done for Panfrost in the previous commit, fix it by
replacing drm_gem_shmem_pin() with its locked variant.
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Fixes: a78027847226 ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()")
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Val Packett <val@packett.cool>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240523113236.432585-3-adrian.larumbe@collabora.com
|
|
When Panfrost must pin an object that is being prepared a dma-buf
attachment for on behalf of another driver, the core drm gem object pinning
code already takes a lock on the object's dma reservation.
However, Panfrost GEM object's pinning callback would eventually try taking
the lock on the same dma reservation when delegating pinning of the object
onto the shmem subsystem, which led to a deadlock.
This can be shown by enabling CONFIG_DEBUG_WW_MUTEX_SLOWPATH, which throws
the following recursive locking situation:
weston/3440 is trying to acquire lock:
ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_pin+0x34/0xb8 [drm_shmem_helper]
but task is already holding lock:
ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_pin+0x2c/0x80 [drm]
Fix it by replacing drm_gem_shmem_pin with its locked version, as the lock
had already been taken by drm_gem_pin().
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Fixes: a78027847226 ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()")
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240523113236.432585-2-adrian.larumbe@collabora.com
|
|
It is possible for syzbot to side-step the restriction imposed by the
blamed commit in the Fixes: tag, because the taprio UAPI permits a
cycle-time different from (and potentially shorter than) the sum of
entry intervals.
We need one more restriction, which is that the cycle time itself must
be larger than N * ETH_ZLEN bit times, where N is the number of schedule
entries. This restriction needs to apply regardless of whether the cycle
time came from the user or was the implicit, auto-calculated value, so
we move the existing "cycle == 0" check outside the "if "(!new->cycle_time)"
branch. This way covers both conditions and scenarios.
Add a selftest which illustrates the issue triggered by syzbot.
Fixes: b5b73b26b3ca ("taprio: Fix allowing too small intervals")
Reported-by: syzbot+a7d2b1d5d1af83035567@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/0000000000007d66bc06196e7c66@google.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20240527153955.553333-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In commit b5b73b26b3ca ("taprio: Fix allowing too small intervals"), a
comparison of user input against length_to_duration(q, ETH_ZLEN) was
introduced, to avoid RCU stalls due to frequent hrtimers.
The implementation of length_to_duration() depends on q->picos_per_byte
being set for the link speed. The blamed commit in the Fixes: tag has
moved this too late, so the checks introduced above are ineffective.
The q->picos_per_byte is zero at parse_taprio_schedule() ->
parse_sched_list() -> parse_sched_entry() -> fill_sched_entry() time.
Move the taprio_set_picos_per_byte() call as one of the first things in
taprio_change(), before the bulk of the netlink attribute parsing is
done. That's because it is needed there.
Add a selftest to make sure the issue doesn't get reintroduced.
Fixes: 09dbdf28f9f9 ("net/sched: taprio: fix calculation of maximum gate durations")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240527153955.553333-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull in remaining commits from 6.10/scsi-queue.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
We were accidentally returning -EROFS during recovery on filesystem
inconsistency - since this is what the journal returns on emergency
shutdown.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Move the mode verification to __create_region() before allocating the
memregion to avoid the memregion leaks.
Fixes: 6e099264185d ("cxl/region: Add volatile region creation support")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240507053421.456439-1-lizhijian@fujitsu.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
tools/testing/cxl/test/mem.c uses vmalloc() and vfree() but does not
include linux/vmalloc.h. Kernel v6.10 made changes that causes the
currently included headers not depend on vmalloc.h and therefore
mem.c can no longer compile. Add linux/vmalloc.h to fix compile
issue.
CC [M] tools/testing/cxl/test/mem.o
tools/testing/cxl/test/mem.c: In function ‘label_area_release’:
tools/testing/cxl/test/mem.c:1428:9: error: implicit declaration of function ‘vfree’; did you mean ‘kvfree’? [-Werror=implicit-function-declaration]
1428 | vfree(lsa);
| ^~~~~
| kvfree
tools/testing/cxl/test/mem.c: In function ‘cxl_mock_mem_probe’:
tools/testing/cxl/test/mem.c:1466:22: error: implicit declaration of function ‘vmalloc’; did you mean ‘kmalloc’? [-Werror=implicit-function-declaration]
1466 | mdata->lsa = vmalloc(LSA_SIZE);
| ^~~~~~~
| kmalloc
Fixes: 7d3eb23c4ccf ("tools/testing/cxl: Introduce a mock memory device + driver")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/20240528225551.1025977-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
This removes the restriction of needing iif selector in the
forward/input hooks for fib lookups when requested result is
oif/oifname.
Removing this restriction allows "loose" lookups from the forward hooks.
Fixes: be8be04e5ddb ("netfilter: nft_fib: reverse path filter for policy-based routing on iif")
Signed-off-by: Eric Garver <eric@garver.life>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
syzbot reports:
general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
[..]
RIP: 0010:nf_tproxy_laddr4+0xb7/0x340 net/ipv4/netfilter/nf_tproxy_ipv4.c:62
Call Trace:
nft_tproxy_eval_v4 net/netfilter/nft_tproxy.c:56 [inline]
nft_tproxy_eval+0xa9a/0x1a00 net/netfilter/nft_tproxy.c:168
__in_dev_get_rcu() can return NULL, so check for this.
Reported-and-tested-by: syzbot+b94a6818504ea90d7661@syzkaller.appspotmail.com
Fixes: cc6eb4338569 ("tproxy: use the interface primary IP address as a default value for --on-ip")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Userspace assumes vlan header is present at a given offset, but vlan
offload allows to store this in metadata fields of the skbuff. Hence
mangling vlan results in a garbled packet. Handle this transparently by
adding a parser to the kernel.
If vlan metadata is present and payload offset is over 12 bytes (source
and destination mac address fields), then subtract vlan header present
in vlan metadata, otherwise mangle vlan metadata based on offset and
length, extracting data from the source register.
This is similar to:
8cfd23e67401 ("netfilter: nft_payload: work around vlan header stripping")
to deal with vlan payload mangling.
Fixes: 7ec3f7b47b8d ("netfilter: nft_payload: add packet mangling support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Can't actually be used uninitialized, but gcc was being silly.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
v6.10-rc1 is released, forward from v6.9
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux into pm-tools
Merge cpupower utility fix for 6.10-rc2 from Shuah Khan:
"This cpupower fixes update for Linux 6.10-rc2 consists of one single
fix to cpupower's P-State frequency calculation and reporting with
AMD Family 1Ah+ processors, when using the acpi-cpufreq driver."
* tag 'linux-cpupower-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs
|
|
The nominal frequency in cpudata is maintained in MHz whereas all other
frequencies are in KHz. This means we have to convert nominal frequency
value to KHz before we do any interaction with other frequency values.
In amd_pstate_set_boost(), this conversion from MHz to KHz is missed,
fix that.
Tested on a AMD Zen4 EPYC server
Before:
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq
2151
$ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq
400000
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq
2151
409422
After:
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq | uniq
2151000
$ cat /sys/devices/system/cpu/cpufreq/policy*/cpuinfo_min_freq | uniq
400000
$ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_cur_freq | uniq
2151000
1799527
Fixes: ec437d71db77 ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors")
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Peter Jung <ptr1337@cachyos.org>
Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When extra warnings are enabled, gcc points out a global variable
definition in a header:
In file included from drivers/cpufreq/amd-pstate-ut.c:29:
include/linux/amd-pstate.h:123:27: error: 'amd_pstate_mode_string' defined but not used [-Werror=unused-const-variable=]
123 | static const char * const amd_pstate_mode_string[] = {
| ^~~~~~~~~~~~~~~~~~~~~~
This header is only included from two files in the same directory,
and one of them uses only a single definition from it, so clean it
up by moving most of the contents into the driver that uses them,
and making shared bits a local header file.
Fixes: 36c5014e5460 ("cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The powermanagement core does various actions when a powersupply changes.
It calls into notifiers, LED triggers, other power supplies and emits an uevent.
To make sure that all these actions happen properly call power_supply_changed().
Reported-by: Rajas Paranjpe <paranjperajas@gmail.com>
Closes: https://github.com/MrChromebox/firmware/issues/420#issuecomment-2132251318
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The pnp_bus_type is defined only when CONFIG_PNP=y, while being
not guarded by ifdeffery in the header. Moreover, it's not used
outside of the PNP code. Move it to the internal header to make
sure no-one will try to (ab)use it.
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Since we have a dev_is_pnp() macro that utilises the address of the
pnp_bus_type variable, the users, which can be compiled as modules,
will fail to build. Convert the macro to be a function and export it
to the modules to prevent build breakage.
Reported-by: Woody Suwalski <terraluna977@gmail.com>
Closes: https://lore.kernel.org/r/cc8a93b2-2504-9754-e26c-5d5c3bd1265c@gmail.com
Fixes: 2a49b45cd0e7 ("PNP: Add dev_is_pnp() macro")
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
To pick the changes in:
4af663c2f64a8d25 ("KVM: SEV: Allow per-guest configuration of GHCB protocol version")
4f5defae708992dd ("KVM: SEV: introduce KVM_SEV_INIT2 operation")
26c44aa9e076ed83 ("KVM: SEV: define VM types for SEV and SEV-ES")
ac5c48027bacb1b5 ("KVM: SEV: publish supported VMSA features")
651d61bc8b7d8bb6 ("KVM: PPC: Fix documentation for ppc mmu caps")
That don't change functionality in tools/perf, as no new ioctl is added
for the 'perf trace' scripts to harvest.
This addresses these perf build warnings:
Warning: Kernel ABI header differences:
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/lkml/ZlYxAdHjyAkvGtMW@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up the changes from these csets:
53bc516ade85a764 ("x86/msr: Move ARCH_CAP_XAPIC_DISABLE bit definition to its rightful place")
That patch just move definitions around, so this just silences this perf
build warning:
Warning: Kernel ABI header differences:
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Link: https://lore.kernel.org/lkml/ZlYe8jOzd1_DyA7X@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm fixes from Jarkko Sakkinen:
"This fixes two unaddressed review comments for the HMAC encryption
patch set. They are cosmetic but we are better off, if such
unnecessary glitches do not exist in the release.
The important part is enabling the HMAC encryption by default only on
x86-64 because that is the only sufficiently tested arch.
Finally, there is a bug fix for SPI transfer buffer allocation, which
did not take into account the SPI header size"
* tag 'tpmdd-next-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: Enable TCG_TPM2_HMAC by default only for X86_64
tpm: Rename TPM2_OA_TMPL to TPM2_OA_NULL_KEY and make it local
tpm: Open code tpm_buf_parameters()
tpm_tis_spi: Account for SPI header when allocating TPM SPI xfer buffer
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- uprobes: prevent mutex_lock() under rcu_read_lock().
Recent changes moved uprobe_cpu_buffer preparation which involves
mutex_lock(), under __uprobe_trace_func() which is called inside
rcu_read_lock().
Fix it by moving uprobe_cpu_buffer preparation outside of
__uprobe_trace_func()
- kprobe-events: handle the error case of btf_find_struct_member()
* tag 'probes-fixes-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: fix error check in parse_btf_field()
uprobes: prevent mutex_lock() under rcu_read_lock()
|
|
Fix the 'make W=1' warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/of/of_test.o
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240524-md-of-of_test-v1-1-6ebd078d620f@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
In nvmet_sq_destroy we capture sq->ctrl early and if it is non-NULL we
know that a ctrl was allocated (in the admin connect request handler)
and we need to release pending AERs, clear ctrl->sqs and sq->ctrl
(for nvme-loop primarily), and drop the final reference on the ctrl.
However, a small window is possible where nvmet_sq_destroy starts (as
a result of the client giving up and disconnecting) concurrently with
the nvme admin connect cmd (which may be in an early stage). But *before*
kill_and_confirm of sq->ref (i.e. the admin connect managed to get an sq
live reference). In this case, sq->ctrl was allocated however after it was
captured in a local variable in nvmet_sq_destroy.
This prevented the final reference drop on the ctrl.
Solve this by re-capturing the sq->ctrl after all inflight request has
completed, where for sure sq->ctrl reference is final, and move forward
based on that.
This issue was observed in an environment with many hosts connecting
multiple ctrls simoutanuosly, creating a delay in allocating a ctrl
leading up to this race window.
Reported-by: Alex Turin <alex@vastdata.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
The nvme pci driver synchronizes with all the namespace queues during a
reset to ensure that there's no pending timeout work.
Meanwhile the timeout work potentially iterates those same namespaces to
freeze their queues.
Each of those namespace iterations use the same read lock. If a write
lock should somehow get between the synchronize and freeze steps, then
forward progress is deadlocked.
We had been relying on the nvme controller state machine to ensure the
reset work wouldn't conflict with timeout work. That guarantee may be a
bit fragile to rely on, so iterate the namespace lists without taking
potentially circular locks, as reported by lockdep.
Link: https://lore.kernel.org/all/20220930001943.zdbvolc3gkekfmcv@shindev/
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Compatibility fix - we no longer have a separate table for which order
gc walks btrees in, and special case the stripes btree directly.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix the 'make W=1' warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/bcachefs/mean_and_variance_test.o
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_check_version_downgrade() was setting c->sb.version, which
bch2_sb_set_downgrade() expects to be at the previous version; and it
shouldn't even have been set directly because c->sb.version is updated
by write_super().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
delete_dead_snapshots now runs before the main fsck.c passes which check
for keys for invalid snapshots; thus, it needs those checks as well.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Consolidate per-key work into delete_dead_snapshots_process_key(), so we
now walk all keys once, not twice.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We now track whether a transaction is locked, and verify that we don't
have nodes locked when the transaction isn't locked; reorder relocks to
not pop the new assert.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This function is used for finding the hash seed (which is the same in
all versions of an inode in different snapshots): ff an inode has been
deleted in a child snapshot we need to iterate until we find a live
version.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It can be useful to know the exact byte offset within a btree node where
an error occured.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Update cpupower's P-State frequency calculation and reporting with AMD
Family 1Ah+ processors, when using the acpi-cpufreq driver. This is due
to a change in the PStateDef MSR layout in AMD Family 1Ah+.
Tested on 4th and 5th Gen AMD EPYC system
Signed-off-by: Ananth Narayan <Ananth.Narayan@amd.com>
Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
If a write path in COW mode fails, either before submitting a bio for the
new extents or an actual IO error happens, we can end up allowing a fast
fsync to log file extent items that point to unwritten extents.
This is because dropping the extent maps happens when completing ordered
extents, at btrfs_finish_one_ordered(), and the completion of an ordered
extent is executed in a work queue.
This can result in a fast fsync to start logging file extent items based
on existing extent maps before the ordered extents complete, therefore
resulting in a log that has file extent items that point to unwritten
extents, resulting in a corrupt file if a crash happens after and the log
tree is replayed the next time the fs is mounted.
This can happen for both direct IO writes and buffered writes.
For example consider a direct IO write, in COW mode, that fails at
btrfs_dio_submit_io() because btrfs_extract_ordered_extent() returned an
error:
1) We call btrfs_finish_ordered_extent() with the 'uptodate' parameter
set to false, meaning an error happened;
2) That results in marking the ordered extent with the BTRFS_ORDERED_IOERR
flag;
3) btrfs_finish_ordered_extent() queues the completion of the ordered
extent - so that btrfs_finish_one_ordered() will be executed later in
a work queue. That function will drop extent maps in the range when
it's executed, since the extent maps point to unwritten locations
(signaled by the BTRFS_ORDERED_IOERR flag);
4) After calling btrfs_finish_ordered_extent() we keep going down the
write path and unlock the inode;
5) After that a fast fsync starts and locks the inode;
6) Before the work queue executes btrfs_finish_one_ordered(), the fsync
task sees the extent maps that point to the unwritten locations and
logs file extent items based on them - it does not know they are
unwritten, and the fast fsync path does not wait for ordered extents
to complete, which is an intentional behaviour in order to reduce
latency.
For the buffered write case, here's one example:
1) A fast fsync begins, and it starts by flushing delalloc and waiting for
the writeback to complete by calling filemap_fdatawait_range();
2) Flushing the dellaloc created a new extent map X;
3) During the writeback some IO error happened, and at the end io callback
(end_bbio_data_write()) we call btrfs_finish_ordered_extent(), which
sets the BTRFS_ORDERED_IOERR flag in the ordered extent and queues its
completion;
4) After queuing the ordered extent completion, the end io callback clears
the writeback flag from all pages (or folios), and from that moment the
fast fsync can proceed;
5) The fast fsync proceeds sees extent map X and logs a file extent item
based on extent map X, resulting in a log that points to an unwritten
data extent - because the ordered extent completion hasn't run yet, it
happens only after the logging.
To fix this make btrfs_finish_ordered_extent() set the inode flag
BTRFS_INODE_NEEDS_FULL_SYNC in case an error happened for a COW write,
so that a fast fsync will wait for ordered extent completion.
Note that this issues of using extent maps that point to unwritten
locations can not happen for reads, because in read paths we start by
locking the extent range and wait for any ordered extents in the range
to complete before looking for extent maps.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
new 'mseal' syscall
But also to wire up shadow stacks on 32-bit x86, picking up those
changes from these csets:
ff388fe5c481d39c ("mseal: wire up mseal syscall")
2883f01ec37dd866 ("x86/shstk: Enable shadow stacks for x32")
This makes 'perf trace' support it, now its possible, for instance to
do:
# perf trace -e mseal --max-stack=16
Here is an example with the 'sendmmsg' syscall:
root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1
0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT|NOSIGNAL) = 1
syscall_exit_to_user_mode_prepare ([kernel.kallsyms])
syscall_exit_to_user_mode_prepare ([kernel.kallsyms])
syscall_exit_to_user_mode ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64 ([kernel.kallsyms])
[0x117ce7] (/usr/lib64/libc.so.6 (deleted))
root@x1:~#
To do a system wide tracing of the new 'mseal' syscall with a backtrace
of at most 16 entries.
This addresses these perf tools build warnings:
Warning: Kernel ABI header differences:
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H J Lu <hjl.tools@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZlXlo4TNcba4wnVZ@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The start counter for FT1 filter is wrongly set to 0 in the driver.
FT1 is used for source address violation (SAV) check and source address
starts at Byte 6 not Byte 0. Fix this by changing start counter to
ETH_ALEN in icssg_ft1_set_mac_addr().
Fixes: e9b4ece7d74b ("net: ti: icssg-prueth: Add Firmware config and classification APIs.")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Link: https://lore.kernel.org/r/20240527063015.263748-1-danishanwar@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|