Age | Commit message (Collapse) | Author |
|
Use firmware properties (eg: OF) to get the device specific
configuration. This change just replaces the platform data since there
was no platform using it and so, it makes no sense having both.
Special note to the PULL-UP disable setting that is now supported as
part of the gpio subsystem (using 'set_config()' callback).
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220829131553.690063-5-nuno.sa@analog.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
This change replaces the support for GPIs as key event generators.
Instead of reporting the events directly, we add a gpio based irqchip
so that these events can be consumed by keys defined in the gpio-keys
driver (as it's goal is indeed for keys on GPIOs capable of generating
interrupts). With this, the gpio-adp5588 driver can also be dropped.
The basic idea is that all the pins that are not being used as part of
the keymap matrix can be possibly requested as GPIOs by gpio-keys
(it's also fine to use these pins as plain interrupts though that's not
really the point).
Since the gpiochip now also has irqchip capabilities, we should only
remove it after we free the device interrupt (otherwise we could, in
theory, be handling GPIs interrupts while the gpiochip is concurrently
removed). Thus the call 'adp5588_gpio_add()' is moved and since the
setup phase also needs to come before making the gpios visible, we also
need to move 'adp5588_setup()'.
While at it, always select GPIOLIB so that we don't need to use #ifdef
guards.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220829131553.690063-2-nuno.sa@analog.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
For concerns about security, the register to enable/disable IOMMU of
SMI LARB should only be configured in secure world. Thus, we add some
SMC command for multimedia master to enable/disable MM IOMMU in ATF by
setting the register of SMI LARB. This function is prepared for MT8188.
Signed-off-by: Chengci.Xu <chengci.xu@mediatek.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220817124608.10062-4-chengci.xu@mediatek.com
|
|
Count the pages used by KVM mmu on x86 in memory stats under secondary
pagetable stats (e.g. "SecPageTables" in /proc/meminfo) to give better
visibility into the memory consumption of KVM mmu in a similar way to
how normal user page tables are accounted.
Add the inner helper in common KVM, ARM will also use it to count stats
in a future commit.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Acked-by: Marc Zyngier <maz@kernel.org> # generic KVM changes
Link: https://lore.kernel.org/r/20220823004639.2387269-3-yosryahmed@google.com
Link: https://lore.kernel.org/r/20220823004639.2387269-4-yosryahmed@google.com
[sean: squash x86 usage to workaround modpost issues]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Automatic kernel fuzzing revealed a recursive locking violation in
usb-storage:
============================================
WARNING: possible recursive locking detected
5.18.0 #3 Not tainted
--------------------------------------------
kworker/1:3/1205 is trying to acquire lock:
ffff888018638db8 (&us_interface_key[i]){+.+.}-{3:3}, at:
usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230
but task is already holding lock:
ffff888018638db8 (&us_interface_key[i]){+.+.}-{3:3}, at:
usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230
...
stack backtrace:
CPU: 1 PID: 1205 Comm: kworker/1:3 Not tainted 5.18.0 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Workqueue: usb_hub_wq hub_event
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_deadlock_bug kernel/locking/lockdep.c:2988 [inline]
check_deadlock kernel/locking/lockdep.c:3031 [inline]
validate_chain kernel/locking/lockdep.c:3816 [inline]
__lock_acquire.cold+0x152/0x3ca kernel/locking/lockdep.c:5053
lock_acquire kernel/locking/lockdep.c:5665 [inline]
lock_acquire+0x1ab/0x520 kernel/locking/lockdep.c:5630
__mutex_lock_common kernel/locking/mutex.c:603 [inline]
__mutex_lock+0x14f/0x1610 kernel/locking/mutex.c:747
usb_stor_pre_reset+0x35/0x40 drivers/usb/storage/usb.c:230
usb_reset_device+0x37d/0x9a0 drivers/usb/core/hub.c:6109
r871xu_dev_remove+0x21a/0x270 drivers/staging/rtl8712/usb_intf.c:622
usb_unbind_interface+0x1bd/0x890 drivers/usb/core/driver.c:458
device_remove drivers/base/dd.c:545 [inline]
device_remove+0x11f/0x170 drivers/base/dd.c:537
__device_release_driver drivers/base/dd.c:1222 [inline]
device_release_driver_internal+0x1a7/0x2f0 drivers/base/dd.c:1248
usb_driver_release_interface+0x102/0x180 drivers/usb/core/driver.c:627
usb_forced_unbind_intf+0x4d/0xa0 drivers/usb/core/driver.c:1118
usb_reset_device+0x39b/0x9a0 drivers/usb/core/hub.c:6114
This turned out not to be an error in usb-storage but rather a nested
device reset attempt. That is, as the rtl8712 driver was being
unbound from a composite device in preparation for an unrelated USB
reset (that driver does not have pre_reset or post_reset callbacks),
its ->remove routine called usb_reset_device() -- thus nesting one
reset call within another.
Performing a reset as part of disconnect processing is a questionable
practice at best. However, the bug report points out that the USB
core does not have any protection against nested resets. Adding a
reset_in_progress flag and testing it will prevent such errors in the
future.
Link: https://lore.kernel.org/all/CAB7eexKUpvX-JNiLzhXBDWgfg2T9e9_0Tw4HQ6keN==voRbP0g@mail.gmail.com/
Cc: stable@vger.kernel.org
Reported-and-tested-by: Rondreis <linhaoguo86@gmail.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/YwkflDxvg0KWqyZK@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There should be no reason to adjust old ktermios which is going to get
discarded anyway.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220816115739.10928-9-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There should be no reason to adjust old ktermios which is going to get
discarded anyway.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220816115739.10928-8-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There should be no reason to adjust old ktermios which is going to get
discarded anyway.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220816115739.10928-7-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There should be no reason to adjust old ktermios which is going to get
discarded anyway.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220816115739.10928-6-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There should be no reason to adjust old ktermios which is going to get
discarded anyway.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220816115739.10928-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
With the architectures currently in-tree, either:
1) CBAUDEX is zero
2) The earlier BOTHER if check covers cbaud < 1 case
3) All CBAUD bits are covered by the baud_table
Thus, the check for cbaud being out-of-range for CBAUDEX case cannot
ever be true.
The ktermios parameters can now be made const.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220816115739.10928-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Being able to check attribute presence and set extack
if not on one line is handy, add helpers.
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There is currently no way to report via extack in a structured way
that an attribute is missing. This leads to families resorting to
string messages.
Add a pair of attributes - @offset and @type for machine-readable
way of reporting missing attributes. The @offset points to the
nest which should have contained the attribute, @type is the
expected nla_type. The offset will be skipped if the attribute
is missing at the message level rather than inside a nest.
User space should be able to figure out which attribute enum
(AKA attribute space AKA attribute set) the nest pointed to by
@offset is using.
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
After commit f2d3b9a46e0e ("ARM: 9220/1: amba: Remove deferred device
addition"), it became possible for amba_read_periphid() to be invoked
concurrently from two threads for a particular AMBA device.
Consider the case where a thread (T0) is registering an AMBA driver, and
searching for all of the devices it can match with on the AMBA bus.
Suppose that another thread (T1) is executing the deferred probe work,
and is searching through all of the AMBA drivers on the bus for a driver
that matches a particular AMBA device. Assume that both threads begin
operating on the same AMBA device and the device's peripheral ID is
still unknown.
In this scenario, the amba_match() function will be invoked for the
same AMBA device by both threads, which means amba_read_periphid()
can also be invoked by both threads, and both threads will be able
to manipulate the AMBA device's pclk pointer without any synchronization.
It's possible that one thread will initialize the pclk pointer, then the
other thread will re-initialize it, overwriting the previous value, and
both will race to free the same pclk, resulting in a use-after-free for
whichever thread frees the pclk last.
Add a lock per AMBA device to synchronize the handling with detecting the
peripheral ID to avoid the use-after-free scenario.
The following KFENCE bug report helped detect this problem:
==================================================================
BUG: KFENCE: use-after-free read in clk_disable+0x14/0x34
Use-after-free read at 0x(ptrval) (in kfence-#19):
clk_disable+0x14/0x34
amba_read_periphid+0xdc/0x134
amba_match+0x3c/0x84
__driver_attach+0x20/0x158
bus_for_each_dev+0x74/0xc0
bus_add_driver+0x154/0x1e8
driver_register+0x88/0x11c
do_one_initcall+0x8c/0x2fc
kernel_init_freeable+0x190/0x220
kernel_init+0x10/0x108
ret_from_fork+0x14/0x3c
0x0
kfence-#19: 0x(ptrval)-0x(ptrval), size=36, cache=kmalloc-64
allocated by task 8 on cpu 0 at 11.629931s:
clk_hw_create_clk+0x38/0x134
amba_get_enable_pclk+0x10/0x68
amba_read_periphid+0x28/0x134
amba_match+0x3c/0x84
__device_attach_driver+0x2c/0xc4
bus_for_each_drv+0x80/0xd0
__device_attach+0xb0/0x1f0
bus_probe_device+0x88/0x90
deferred_probe_work_func+0x8c/0xc0
process_one_work+0x23c/0x690
worker_thread+0x34/0x488
kthread+0xd4/0xfc
ret_from_fork+0x14/0x3c
0x0
freed by task 8 on cpu 0 at 11.630095s:
amba_read_periphid+0xec/0x134
amba_match+0x3c/0x84
__device_attach_driver+0x2c/0xc4
bus_for_each_drv+0x80/0xd0
__device_attach+0xb0/0x1f0
bus_probe_device+0x88/0x90
deferred_probe_work_func+0x8c/0xc0
process_one_work+0x23c/0x690
worker_thread+0x34/0x488
kthread+0xd4/0xfc
ret_from_fork+0x14/0x3c
0x0
Cc: Saravana Kannan <saravanak@google.com>
Cc: patches@armlinux.org.uk
Fixes: f2d3b9a46e0e ("ARM: 9220/1: amba: Remove deferred device addition")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Implement simple accessors to probe percpu-rwsem's locked state:
percpu_is_write_locked(), percpu_is_read_locked().
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20220829124719.675715-11-elver@google.com
|
|
Due to being a __weak function, hw_breakpoint_weight() will cause the
compiler to always emit a call to it. This generates unnecessarily bad
code (register spills etc.) for no good reason; in fact it appears in
profiles of `perf bench -r 100 breakpoint thread -b 4 -p 128 -t 512`:
...
0.70% [kernel] [k] hw_breakpoint_weight
...
While a small percentage, no architecture defines its own
hw_breakpoint_weight() nor are there users outside hw_breakpoint.c,
which makes the fact it is currently __weak a poor choice.
Change hw_breakpoint_weight()'s definition to follow a similar protocol
to hw_breakpoint_slots(), such that if <asm/hw_breakpoint.h> defines
hw_breakpoint_weight(), we'll use it instead.
The result is that it is inlined and no longer shows up in profiles.
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20220829124719.675715-8-elver@google.com
|
|
On a machine with 256 CPUs, running the recently added perf breakpoint
benchmark results in:
| $> perf bench -r 30 breakpoint thread -b 4 -p 64 -t 64
| # Running 'breakpoint/thread' benchmark:
| # Created/joined 30 threads with 4 breakpoints and 64 parallelism
| Total time: 236.418 [sec]
|
| 123134.794271 usecs/op
| 7880626.833333 usecs/op/cpu
The benchmark tests inherited breakpoint perf events across many
threads.
Looking at a perf profile, we can see that the majority of the time is
spent in various hw_breakpoint.c functions, which execute within the
'nr_bp_mutex' critical sections which then results in contention on that
mutex as well:
37.27% [kernel] [k] osq_lock
34.92% [kernel] [k] mutex_spin_on_owner
12.15% [kernel] [k] toggle_bp_slot
11.90% [kernel] [k] __reserve_bp_slot
The culprit here is task_bp_pinned(), which has a runtime complexity of
O(#tasks) due to storing all task breakpoints in the same list and
iterating through that list looking for a matching task. Clearly, this
does not scale to thousands of tasks.
Instead, make use of the "rhashtable" variant "rhltable" which stores
multiple items with the same key in a list. This results in average
runtime complexity of O(1) for task_bp_pinned().
With the optimization, the benchmark shows:
| $> perf bench -r 30 breakpoint thread -b 4 -p 64 -t 64
| # Running 'breakpoint/thread' benchmark:
| # Created/joined 30 threads with 4 breakpoints and 64 parallelism
| Total time: 0.208 [sec]
|
| 108.422396 usecs/op
| 6939.033333 usecs/op/cpu
On this particular setup that's a speedup of ~1135x.
While one option would be to make task_struct a breakpoint list node,
this would only further bloat task_struct for infrequently used data.
Furthermore, after all optimizations in this series, there's no evidence
it would result in better performance: later optimizations make the time
spent looking up entries in the hash table negligible (we'll reach the
theoretical ideal performance i.e. no constraints).
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20220829124719.675715-5-elver@google.com
|
|
Provide hw_breakpoint_is_used() to check if breakpoints are in use on
the system.
Use it in the KUnit test to verify the global state before and after a
test case.
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20220829124719.675715-3-elver@google.com
|
|
The SoundWire specification allows the device number to be allocated
at will. When a system includes multiple SoundWire links, the device
number scope is limited to the link to which the device is attached.
However, for integration/debug it can be convenient to have a unique
device number across the system. This patch adds a 'dev_num_ida_min'
field at the bus level, which when set will be used to allocate an
IDA.
The allocation happens when a hardware device reports as ATTACHED. If
any error happens during the enumeration, the allocated IDA is not
freed - the device number will be reused if/when the device re-joins
the bus. The IDA is only freed when the Linux device is unregistered.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Rander Wang <rander.wang@intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://lore.kernel.org/r/20220823045004.2670658-3-yung-chuan.liao@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
drivers-for-6.1
v6.0-rc1 +
20220825043859.30066-2-manivannan.sadhasivam@linaro.org +
20220825043859.30066-3-manivannan.sadhasivam@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
The LLCC EDAC register offsets varies between each SoCs. Until now, the
EDAC driver used the hardcoded register offsets. But this caused crash
on SM8450 SoC where the register offsets has been changed.
So to avoid this crash and also to make it easy to accommodate changes for
new SoCs, let's pass the LLCC version specific register offsets to the
EDAC driver.
Currently, two set of offsets are used. One is starting from LLCC version
v1.0.0 used by all SoCs other than SM8450. For SM8450, LLCC version
starting from v2.1.0 is used.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20220825043859.30066-3-manivannan.sadhasivam@linaro.org
|
|
Currently all usage of struct qmi_elem_info, which is used to define
the QMI message encoding/decoding rules, does not use const. This
prevents clients from registering const arrays. Since these arrays are
always pre-defined, they should be const, so add the const qualifier
to all places in the QMI interface where struct qmi_elem_info is used.
Once this patch is in place, clients can independently update their
pre-defined arrays to be const, as demonstrated in the QMI sample
code.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20220822153435.7856-1-quic_jjohnson@quicinc.com
|
|
There are two definitions of the is_signed_type() macro: one in
<linux/overflow.h> and a second definition in <linux/trace_events.h>.
As suggested by Linus, move the definition of the is_signed_type() macro
into the <linux/compiler.h> header file. Change the definition of the
is_signed_type() macro to make sure that it does not trigger any sparse
warnings with future versions of sparse for bitwise types.
Link: https://lore.kernel.org/all/CAHk-=whjH6p+qzwUdx5SOVVHjS3WvzJQr6mDUwhEyTf6pJWzaQ@mail.gmail.com/
Link: https://lore.kernel.org/all/CAHk-=wjQGnVfb4jehFR0XyZikdQvCZouE96xR_nnf5kqaM5qqQ@mail.gmail.com/
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
IP multicast must sometimes be discriminated from non-IP multicast,
e.g. when determining the forwarding behavior of a given group in the
presence of multicast router ports on an offloaded bridge. Therefore,
provide helpers to identify these groups.
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We had historically not checked that genlmsghdr.reserved
is 0 on input which prevents us from using those precious
bytes in the future.
One use case would be to extend the cmd field, which is
currently just 8 bits wide and 256 is not a lot of commands
for some core families.
To make sure that new families do the right thing by default
put the onus of opting out of validation on existing families.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While working on a GRUB patch to support PCI-serial, a number of
cleanups were suggested that apply to the code I took inspiration from.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_ids.h
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lkml.kernel.org/r/YwdeyCEtW+wa+QhH@worktop.programming.kicks-ass.net
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more hotfixes from Andrew Morton:
"Seventeen hotfixes. Mostly memory management things.
Ten patches are cc:stable, addressing pre-6.0 issues"
* tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
.mailmap: update Luca Ceresoli's e-mail address
mm/mprotect: only reference swap pfn page if type match
squashfs: don't call kmalloc in decompressors
mm/damon/dbgfs: avoid duplicate context directory creation
mailmap: update email address for Colin King
asm-generic: sections: refactor memory_intersects
bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem
ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown
Revert "memcg: cleanup racy sum avoidance code"
mm/zsmalloc: do not attempt to free IS_ERR handle
binder_alloc: add missing mmap_lock calls when using the VMA
mm: re-allow pinning of zero pfns (again)
vmcoreinfo: add kallsyms_num_syms symbol
mailmap: update Guilherme G. Piccoli's email addresses
writeback: avoid use-after-free after removing device
shmem: update folio if shmem_replace_page() updates the page
mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
|
|
This reverts commit 96e51ccf1af33e82f429a0d6baebba29c6448d0f.
Recently we started running the kernel with rstat infrastructure on
production traffic and begin to see negative memcg stats values.
Particularly the 'sock' stat is the one which we observed having negative
value.
$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 18446744073708724224
Re-run after couple of seconds
$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 53248
For now we are only seeing this issue on large machines (256 CPUs) and
only with 'sock' stat. I think the networking stack increase the stat on
one cpu and decrease it on another cpu much more often. So, this negative
sock is due to rstat flusher flushing the stats on the CPU that has seen
the decrement of sock but missed the CPU that has increments. A typical
race condition.
For easy stable backport, revert is the most simple solution. For long
term solution, I am thinking of two directions. First is just reduce the
race window by optimizing the rstat flusher. Second is if the reader sees
a negative stat value, force flush and restart the stat collection.
Basically retry but limited.
Link: https://lkml.kernel.org/r/20220817172139.3141101-1-shakeelb@google.com
Fixes: 96e51ccf1af33e8 ("memcg: cleanup racy sum avoidance code")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org> [5.15]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The below referenced commit makes the same error as 1c563432588d ("mm: fix
is_pinnable_page against a cma page"), re-interpreting the logic to
exclude pinning of the zero page, which breaks device assignment with
vfio.
To avoid further subtle mistakes, split the logic into discrete tests.
[akpm@linux-foundation.org: simplify comment, per John]
Link: https://lkml.kernel.org/r/166015037385.760108.16881097713975517242.stgit@omen
Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Tested-by: Slawomir Laba <slawomirx.laba@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add str_read_write() helper to return 'read' or 'write' string literal.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Link: https://lore.kernel.org/r/20220822175011.2886-2-ddrokosov@sberdevices.ru
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Currently, Hz units do not have milli, micro and nano Hz coefficients.
Some drivers (IIO especially) use their analogues to calculate
appropriate Hz values. This patch includes them to units.h definitions,
so they can be used from different kernel places.
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Link: https://lore.kernel.org/r/20220812165243.22177-3-ddrokosov@sberdevices.ru
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Add a new "spec" bitfield to branch entries for providing speculation
information. This will be populated using hints provided by branch sampling
features on supported hardware. The following cases are covered:
* No branch speculation information is available
* Branch is speculative but taken on the wrong path
* Branch is non-speculative but taken on the correct path
* Branch is speculative and taken on the correct path
Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/834088c302faf21c7b665031dd111f424e509a64.1660211399.git.sandipan.das@amd.com
|
|
Pulling to receive 43626dade36f ("group: Add missing cpus_read_lock() to
cgroup_attach_task_all()") for a follow-up patch.
|
|
Cgroup id is user provided datum hence extend its return domain to
include possible error reason (similar to cgroup_get_from_fd()).
This change also fixes commit d4ccaf58a847 ("bpf: Introduce cgroup
iter") that would use NULL instead of proper error handling in
d4ccaf58a847 ("bpf: Introduce cgroup iter").
Additionally, neither of: fc_appid_store, bpf_iter_attach_cgroup,
mem_cgroup_get_from_ino (callers of cgroup_get_from_fd) is built without
CONFIG_CGROUPS (depends via CONFIG_BLK_CGROUP, direct, transitive
CONFIG_MEMCG respectively) transitive, so drop the singular definition
not needed with !CONFIG_CGROUPS.
Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
There are several places in the kernel where wait_on_bit is not followed
by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read).
On architectures with weak memory ordering, it may happen that memory
accesses that follow wait_on_bit are reordered before wait_on_bit and
they may return invalid data.
Fix this class of bugs by introducing a new function "test_bit_acquire"
that works like test_bit, but has acquire memory ordering semantics.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
io-uring cmd support was added through ee692a21e9bf ("fs,io_uring:
add infrastructure for uring-cmd"), this extended the struct
file_operations to allow a new command which each subsystem can use
to enable command passthrough. Add an LSM specific for the command
passthrough which enables LSMs to inspect the command details.
This was discussed long ago without no clear pointer for something
conclusive, so this enables LSMs to at least reject this new file
operation.
[0] https://lkml.kernel.org/r/8adf55db-7bab-f59d-d612-ed906b948d19@schaufler-ca.com
Cc: stable@vger.kernel.org
Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd")
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes berg says:
====================
Various updates:
* rtw88: operation, locking, warning, and code style fixes
* rtw89: small updates
* cfg80211/mac80211: more EHT/MLO (802.11be, WiFi 7) work
* brcmfmac: a couple of fixes
* misc cleanups etc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adds support for setting various power states of TUF keyboards.
These states are combinations of:
- boot, set if a boot animation is shown on keyboard
- awake, set if the keyboard LEDs are visible while laptop is on
- sleep, set if an animation is displayed while the laptop is suspended
- keyboard (unknown effect)
Adds two sysfs attributes to asus::kbd_backlight:
- kbd_rgb_state
- kbd_rgb_state_index
Signed-off-by: Luke D. Jones <luke@ljones.dev>
Link: https://lore.kernel.org/r/20220825232251.345893-3-luke@ljones.dev
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Adds support for changing the laptop keyboard LED mode and colour.
The modes are visible effects such as static, rainbow, pulsing,
colour cycles.
These sysfs attributes are added to asus::kbd_backlight:
- kbd_rgb_mode
- kbd_rgb_mode_index
Signed-off-by: Luke D. Jones <luke@ljones.dev>
Link: https://lore.kernel.org/r/20220825232251.345893-2-luke@ljones.dev
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Add support for TUF laptops which have the ability to control
the GPU fan. This will show as a second fan in hwmon, and has
the ability to run as boost (fullspeed), or auto.
Signed-off-by: Luke D. Jones <luke@ljones.dev>
Link: https://lore.kernel.org/r/20220826004210.356534-3-luke@ljones.dev
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
21234e3a84c7 ("net/mlx5e: Fix use after free in mlx5e_fs_init()")
c7eafc5ed068 ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer")
https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/
https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rename ATA_DFLAG_NCQ_PRIO_ENABLE to ATA_DFLAG_NCQ_PRIO_ENABLED to match
the fact that this flags indicates if NCQ priority use is enabled by the
user.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
napi_if_scheduled_mark_missed. x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg
(and related move instruction in front of cmpxchg).
Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20220822143243.2798-1-ubizjak@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from ipsec and netfilter (with one broken Fixes tag).
Current release - new code bugs:
- dsa: don't dereference NULL extack in dsa_slave_changeupper()
- dpaa: fix <1G ethernet on LS1046ARDB
- neigh: don't call kfree_skb() under spin_lock_irqsave()
Previous releases - regressions:
- r8152: fix the RX FIFO settings when suspending
- dsa: microchip: keep compatibility with device tree blobs with no
phy-mode
- Revert "net: macsec: update SCI upon MAC address change."
- Revert "xfrm: update SA curlft.use_time", comply with RFC 2367
Previous releases - always broken:
- netfilter: conntrack: work around exceeded TCP receive window
- ipsec: fix a null pointer dereference of dst->dev on a metadata dst
in xfrm_lookup_with_ifid
- moxa: get rid of asymmetry in DMA mapping/unmapping
- dsa: microchip: make learning configurable and keep it off while
standalone
- ice: xsk: prohibit usage of non-balanced queue id
- rxrpc: fix locking in rxrpc's sendmsg
Misc:
- another chunk of sysctl data race silencing"
* tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
net: lantiq_xrx200: restore buffer if memory allocation failed
net: lantiq_xrx200: fix lock under memory pressure
net: lantiq_xrx200: confirm skb is allocated before using
net: stmmac: work around sporadic tx issue on link-up
ionic: VF initial random MAC address if no assigned mac
ionic: fix up issues with handling EAGAIN on FW cmds
ionic: clear broken state on generation change
rxrpc: Fix locking in rxrpc's sendmsg
net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2
MAINTAINERS: rectify file entry in BONDING DRIVER
i40e: Fix incorrect address type for IPv6 flow rules
ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
net: Fix a data-race around sysctl_somaxconn.
net: Fix a data-race around netdev_unregister_timeout_secs.
net: Fix a data-race around gro_normal_batch.
net: Fix data-races around sysctl_devconf_inherit_init_net.
net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
net: Fix a data-race around netdev_budget_usecs.
net: Fix data-races around sysctl_max_skb_frags.
net: Fix a data-race around netdev_budget.
...
|
|
Since the stub version of of_dma_configure_id() was added in commit
a081bd4af4ce ("of/device: Add input id to of_dma_configure()"), it has
not matched the signature of the full function, leading to build failure
reports when code using this function is built on !OF configurations.
Fixes: a081bd4af4ce ("of/device: Add input id to of_dma_configure()")
Cc: stable@vger.kernel.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Link: https://lore.kernel.org/r/20220824153256.1437483-1-thierry.reding@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes:
- walking a cgroup's descendants in pre-order.
- walking a cgroup's descendants in post-order.
- walking a cgroup's ancestors.
- process only the given cgroup.
When attaching cgroup_iter, one can set a cgroup to the iter_link
created from attaching. This cgroup is passed as a file descriptor
or cgroup id and serves as the starting point of the walk. If no
cgroup is specified, the starting point will be the root cgroup v2.
For walking descendants, one can specify the order: either pre-order or
post-order. For walking ancestors, the walk starts at the specified
cgroup and ends at the root.
One can also terminate the walk early by returning 1 from the iter
program.
Note that because walking cgroup hierarchy holds cgroup_mutex, the iter
program is called with cgroup_mutex held.
Currently only one session is supported, which means, depending on the
volume of data bpf program intends to send to user space, the number
of cgroups that can be walked is limited. For example, given the current
buffer size is 8 * PAGE_SIZE, if the program sends 64B data for each
cgroup, assuming PAGE_SIZE is 4kb, the total number of cgroups that can
be walked is 512. This is a limitation of cgroup_iter. If the output
data is larger than the kernel buffer size, after all data in the
kernel buffer is consumed by user space, the subsequent read() syscall
will signal EOPNOTSUPP. In order to work around, the user may have to
update their program to reduce the volume of data sent to output. For
example, skip some uninteresting cgroups. In future, we may extend
bpf_iter flags to allow customizing buffer size.
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/r/20220824233117.1312810-2-haoluo@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add quirk for ASUS ROG X13 Flow 2-in-1 to enable tablet mode with
lid flip (all screen rotations).
Signed-off-by: Luke D. Jones <luke@ljones.dev>
Link: https://lore.kernel.org/r/20220813092753.6635-2-luke@ljones.dev
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Support the hardware GPU MUX switch available on some models. This
switch can toggle the MUX between:
- 0, Dedicated mode
- 1, Optimus mode
Optimus mode is the regular iGPU + dGPU available, while dedicated
mode switches the system to have only the dGPU available.
Signed-off-by: Luke D. Jones <luke@ljones.dev>
Link: https://lore.kernel.org/r/20220813092624.6228-1-luke@ljones.dev
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
The style of the comments is not uniform, make it so and fix
a few grammar issues. While at it, update Copyright years.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220801113734.36131-4-andriy.shevchenko@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
On Intel hardware the SLP_TYPx bitfield occupies bits 10-12 as per ACPI
specification (see Table 4.13 "PM1 Control Registers Fixed Hardware
Feature Control Bits" for the details).
Fix the mask and other related definitions accordingly.
Fixes: 93e5eadd1f6e ("x86/platform: New Intel Atom SOC power management controller driver")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220801113734.36131-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|