Age | Commit message (Collapse) | Author |
|
vhca state events indicates change in the state of the vhca that may
occur due to a SF allocation, deallocation or enabling/disabling the
SF HCA.
Introduce vhca state event handler which will be used by SF devlink
port manager and SF hardware id allocator in subsequent patches
to act on the event.
This enables single entity to subscribe, query and rearm the event
for a function.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Currently, a PM domain's idle state is determined based on whether the
QoS requirements are met. However, even entering an idle state may waste
power if the minimum residency requirements aren't fulfilled.
CPU PM domains use the next timer wakeup for the CPUs in the domain to
determine the sleep duration of the domain. This is compared with the
idle state residencies to determine the optimal idle state. For other PM
domains, determining the sleep length is not that straight forward. But
if the device's next_event is available, we can use that to determine
the sleep duration of the PM domain.
Let's update the domain governor logic to check for idle state residency
based on the next wakeup of devices as well as QoS constraints. But
since, not all domains may contain devices capable of specifying the
next wakeup, let's enable this additional check only if specified by the
domain's flags when initializing the domain.
Signed-off-by: Lina Iyer <ilina@codeaurora.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Some devices may have a predictable interrupt pattern while executing
usecases. An example would be the VSYNC interrupt associated with
display devices. A 60 Hz display could cause a interrupt every 16 ms. If
the device were in a PM domain, the domain would need to be powered up
for device to resume and handle the interrupt.
Entering a domain idle state saves power, only if the residency of the
idle state is met. Without knowing the idle duration of the domain, the
governor would just choose the deepest idle state that matches the QoS
requirements. The domain might be powered off just as the device is
expecting to wake up. If devices could inform PM frameworks of their
next event, the parent PM domain's idle duration can be determined.
So let's add the dev_pm_genpd_set_next_wakeup() API for the device to
inform PM domains of the impending wakeup. This information will be the
domain governor to determine the best idle state given the wakeup.
Signed-off-by: Lina Iyer <ilina@codeaurora.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There is a need to distinguish geniune per-cpu kthreads from kthreads
that happen to have a single CPU affinity.
Geniune per-cpu kthreads are kthreads that are CPU affine for
correctness, these will obviously have PF_KTHREAD set, but must also
have PF_NO_SETAFFINITY set, lest userspace modify their affinity and
ruins things.
However, these two things are not sufficient, PF_NO_SETAFFINITY is
also set on other tasks that have their affinities controlled through
other means, like for instance workqueues.
Therefore another bit is needed; it turns out kthread_create_per_cpu()
already has such a bit: KTHREAD_IS_PER_CPU, which is used to make
kthread_park()/kthread_unpark() work correctly.
Expose this flag and remove the implicit setting of it from
kthread_create_on_cpu(); the io_uring usage of it seems dubious at
best.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210121103506.557620262@infradead.org
|
|
We generally expect local_irq_save() and local_irq_restore() to be
paired and sanely nested, and so local_irq_restore() expects to be
called with irqs disabled. Thus, within local_irq_restore() we only
trace irq flag changes when unmasking irqs.
This means that a sequence such as:
| local_irq_disable();
| local_irq_save(flags);
| local_irq_enable();
| local_irq_restore(flags);
... is liable to break things, as the local_irq_restore() would mask
irqs without tracing this change. Similar problems may exist for
architectures whose arch_irq_restore() function depends on being called
with irqs disabled.
We don't consider such sequences to be a good idea, so let's define
those as forbidden, and add tooling to detect such broken cases.
This patch adds debug code to WARN() when raw_local_irq_restore() is
called with irqs enabled. As raw_local_irq_restore() is expected to pair
with raw_local_irq_save(), it should never be called with irqs enabled.
To avoid the possibility of circular header dependencies between
irqflags.h and bug.h, the warning is handled in a separate C file.
The new code is all conditional on a new CONFIG_DEBUG_IRQFLAGS symbol
which is independent of CONFIG_TRACE_IRQFLAGS. As noted above such cases
will confuse lockdep, so CONFIG_DEBUG_LOCKDEP now selects
CONFIG_DEBUG_IRQFLAGS.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210111153707.10071-1-mark.rutland@arm.com
|
|
While running my branch profiler that checks for incorrect "likely" and
"unlikely"s around the kernel, there's a large number of them that are
incorrect due to being "static_branches".
As static_branches are rather special, as they are likely or unlikely for
other reasons than normal annotations are used for, there's no reason to
have them be profiled.
Expose the "unlikely_notrace" and "likely_notrace" so that the
static_branch can use them, and have them be ignored by the branch
profilers.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201211163754.585174b9@gandalf.local.home
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:
First set of IIO new device support, cleanups etc for 5.12
Includes one immutable branch, to support some qcom-vadc patches
going through IIO and thermal.
Late rebase to drop a patch that should go through the hid tree.
New device support:
* adi,ad5766
- New driver supporting AD5766 and AD5767 16 channel DACs.
* adi,ad7476
- Support for LTC2314-14 14 bit ADC (trivial to add)
* hid-sensors-hinge
- New driver including HID custom sensor support.
* invensense,mpu6050
- Add support for the MPU-6880 (chip info all that is needed)
* memsic,ms5637
- Add support for ms5803 device after a bunch of rework.
* xilinx-xadc
- Add support for Ultrascale System Monitor.
* yamaha,yas530
- New driver for this magnetometer supporting YAS530, YAS532 adn YAS 533.
Dt-binding conversions to yaml
* invensense,mpu3050
* invensense,mpu6050
Cleanups and minor features
* core
- Copy iio_info.attrs->is_visible along with the attrs themselves.
- Handle enumerate properties with gaps (i.e. reserved values in
the middle of otherwise used values).
- Add an of_iio_channel_get_by_name() function.
* adi,adf4350
- Drop an unnecessary NULL check.
* amstaos,tsl2583
- Use DIV_ROUND_CLOSEST in place of open coding.
* avago,apds9960
- Add MSHW0184 ACPI id seen in the Microsoft Surface Book 3 and Surface
Pro 7.
* bosch,bmc150_magn
- Basic regulator support.
* bosch,bme680
- Use DIV_ROUND_CLOSEST in place of opencoding.
* bosch,bmg160
- Basic regulator support.
* hid-sensors
- Add timestamp channels to all sensors types.
* kionix,kxcjk1013
- Basic regulator support.
* memsic
- Fix ordering in trivial-device.yaml
* microchip,mcp4725
- More flexible restrictions in DT binding.
* plantower,pms7003
- Fix comma that should be semicolon.
* qcom-vadc
- Refactors to support addition of ADC-TM5 driver
- Addition of a fixp_linear_interpolate function to support this common
operation.
* sprd,sc27xx_adc
- Use DIV_ROUND_CLOSEST in place of opencoding.
* st,ab8500-adc
- Enable non-hw-conversion as AB505 doesn't support it.
* st,stm32-adc
- Drop unneeded NULL check.
* st,stm32-dfsdm
- Drop unneeded NULL check.
* st,vl6180
- Use DIV_ROUND_CLOSEST in place of opencoding.
* xilinx-xadc
- Local var for &pdev->dev to avoid excessive repetition.
- devm_ throughout and drop remove()
* tag 'iio-for-5.12a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (59 commits)
iio: adc: stm32-dfsdm: Remove redundant null check before clk_disable_unprepare
iio:pressure:ms5637: add ms5803 support
iio:common:ms_sensors:ms_sensors_i2c: add support for alternative PROM layout
iio:common:ms_sensors:ms_sensors_i2c: rework CRC calculation helper
iio:pressure:ms5637: limit available sample frequencies
iio:pressure:ms5637: introduce hardware differentiation
dt-bindings: trivial-devices: reorder memsic devices
iio: dac: ad5766: add driver support for AD5766
Documentation/ABI/testing: Add documentation for AD5766 new ABI
dt-bindings: iio: dac: AD5766 yaml documentation
iio: hid-sensor-rotation: Add timestamp channel
iio: hid-sensor-incl-3d: Add timestamp channel
iio: hid-sensor-magn-3d: Add timestamp channel
iio: hid-sensor-als: Add timestamp channel
iio: hid-sensor-gyro-3d: Add timestamp channel
iio: hid-sensor-accel-3d: Add timestamp channel for gravity sensor
iio: magnetometer: bmc150: Add rudimentary regulator support
dt-bindings: iio: magnetometer: bmc150: Document regulator supplies
iio: Handle enumerated properties with gaps
iio:Documentation: Add documentation for hinge sensor channels
...
|
|
Immutable branch to allow for additional patches to thermal that may
be applied in this cycle.
|
|
Currently custom sensors properties are not decoded and it is up to
user space to interpret.
Some manufacturers already standardized the meaning of some custom sensors.
They can be presented as a proper IIO sensor. We can identify these sensors
based on manufacturer and serial number property in the report.
This change is identifying hinge sensor when the manufacturer is "INTEL".
This creates a platform device so that a sensor driver can be loaded to
process these sensors.
Signed-off-by: Ye Xiang <xiang.ye@intel.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Link: https://lore.kernel.org/r/20201215054444.9324-2-xiang.ye@intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Unlike many other structure types defined in the crypto API, the
'shash_desc' structure is permitted to live on the stack, which
implies its contents may not be accessed by DMA masters. (This is
due to the fact that the stack may be located in the vmalloc area,
which requires a different virtual-to-physical translation than the
one implemented by the DMA subsystem)
Our definition of CRYPTO_MINALIGN_ATTR is based on ARCH_KMALLOC_MINALIGN,
which may take DMA constraints into account on architectures that support
non-cache coherent DMA such as ARM and arm64. In this case, the value is
chosen to reflect the largest cacheline size in the system, in order to
ensure that explicit cache maintenance as required by non-coherent DMA
masters does not affect adjacent, unrelated slab allocations. On arm64,
this value is currently set at 128 bytes.
This means that applying CRYPTO_MINALIGN_ATTR to struct shash_desc is both
unnecessary (as it is never used for DMA), and undesirable, given that it
wastes stack space (on arm64, performing the alignment costs 112 bytes in
the worst case, and the hole between the 'tfm' and '__ctx' members takes
up another 120 bytes, resulting in an increased stack footprint of up to
232 bytes.) So instead, let's switch to the minimum SLAB alignment, which
does not take DMA constraints into account.
Note that this is a no-op for x86.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A function has a different name between their prototype
and its kernel-doc markup.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/2dc136ff6290d7c8919599d21bee244f31647c8c.1610610937.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Some identifiers have different names between their prototypes
and the kernel-doc markup.
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/f3c65f61367993a607f9daf9dc1a3bdab1f0a040.1610610937.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
A function has a different name between their prototype
and its kernel-doc markup.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/889cfb141a98ae06d5bc79b744786ec2e8f92d93.1610610937.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
There are some common comments marked, instead, with kernel-doc
notation, which won't work.
While here, rename an identifier, in order to match the
function prototype below kernel-doc markup.
Acked-by: Richard Gong <richard.gong@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/02a1eb47767e01e875d8840805b8b2d4f3c6bdee.1610610937.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
The kernel-doc markup inside share.c is actually for
__parport_register_driver. The actual goal seems to be
to document parport_register_driver().
So, fix the existing markup and add a new one.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/dc0778af8c466cc667409ead05876a5cfd3cbece.1610610937.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
The device link device's name was of the form:
<supplier-dev-name>--<consumer-dev-name>
This can cause name collision as reported here [1] as device names are
not globally unique. Since device names have to be unique within the
bus/class, add the bus/class name as a prefix to the device names used to
construct the device link device name.
So the devuce link device's name will be of the form:
<supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>
[1] - https://lore.kernel.org/lkml/20201229033440.32142-1-michael@walle.cc/
Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs")
Cc: stable@vger.kernel.org
Reported-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210110175408.1465657-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
KEY_FLAG_KEEP is not meant to be passed to keyring_alloc() or key_alloc(),
as these only take KEY_ALLOC_* flags. KEY_FLAG_KEEP has the same value as
KEY_ALLOC_BYPASS_RESTRICTION, but fortunately only key_create_or_update()
uses it. LSMs using the key_alloc hook don't check that flag.
KEY_FLAG_KEEP is then ignored but fortunately (again) the root user cannot
write to the blacklist keyring, so it is not possible to remove a key/hash
from it.
Fix this by adding a KEY_ALLOC_SET_KEEP flag that tells key_alloc() to set
KEY_FLAG_KEEP on the new key. blacklist_init() can then, correctly, pass
this to keyring_alloc().
We can also use this in ima_mok_init() rather than setting the flag
manually.
Note that this doesn't fix an observable bug with the current
implementation but it is required to allow addition of new hashes to the
blacklist in the future without making it possible for them to be removed.
Fixes: 734114f8782f ("KEYS: Add a system blacklist keyring")
Reported-by: Mickaël Salaün <mic@linux.microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Mickaël Salaün <mic@linux.microsoft.com>
cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
|
|
Add missing linux/types.h for size_t.
[DH: Changed from stddef.h]
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
|
|
The macro use will already have a semicolon.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
|
|
The fields of this struct are only ever read after being initialised, so
mark it 'const' before somebody tries to modify it again. GCC will then
complain (with an error) about modification of these fields after they
have been initialised, although LLVM currently allows them without even
a warning:
https://bugs.llvm.org/show_bug.cgi?id=48755
Hopefully, future versions of LLVM will emit a warning.
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Rather than modifying the 'address' field of the 'struct vm_fault'
passed to do_set_pte(), leave that to identify the real faulting address
and pass in the virtual address to be mapped by the new pte as a
separate argument.
This makes FAULT_FLAG_PREFAULT redundant, as a prefault entry can be
identified simply by comparing the new address parameter with the
faulting address, so remove the redundant flag at the same time.
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
'struct vm_fault' contains both information about the fault being
serviced alongside mutable fields contributing to the state of the
fault-handling logic. Unfortunately, the distinction between the two is
not clear-cut, and a number of callers end up manipulating the structure
temporarily before restoring it when returning.
Try to clean this up by moving the immutable fault information into an
anonymous struct, which will later be marked as 'const'. Ideally, the
'flags' field would be part of the new structure too, but it seems as
though the ->page_mkwrite() path is not ready for this yet.
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=whYs9XsO88iqJzN6NC=D-dp2m0oYXuOoZ=eWnvv=5OA+w@mail.gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add missing stubs to regulator/consumer.h in order to fix COMPILE_TEST
of the kernel. In particular this should fix compile-testing of OPP core
because of a missing stub for regulator_sync_voltage().
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20210120205844.12658-1-digetx@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
PD3.0 Spec 6.8.1 describes how to handle Protocol Error. There are
general rules defined in Table 6-61 which regulate incoming Message
handling. If the incoming Message is unexpected, unsupported, or
unrecognized, Protocol Error occurs. Follow the rules to handle these
situations. Also consider PD2.0 connection (PD2.0 Spec Table 6-36) for
backward compatibilities.
To know the types of AMS in all the recipient's states, identify those
AMS who are initiated by the port partner but not yet recorded in the
current code.
Besides, introduce a new state CHUNK_NOT_SUPP to delay the NOT_SUPPORTED
message after receiving a chunked message.
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210114145053.1952756-3-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch provides the implementation of Collision Avoidance introduced
in PD3.0. The start of each Atomic Message Sequence (AMS) initiated by
the port will be denied if the current AMS is not interruptible. The
Source port will set the CC to SinkTxNG if it is going to initiate an
AMS, and SinkTxOk otherwise. Meanwhile, any AMS initiated by a Sink port
will be denied in TCPM if the port partner (Source) sets SinkTxNG except
for HARD_RESET and SOFT_RESET.
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210114145053.1952756-2-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into tty-next
Fixes both the "splice/sendfile to a tty" and "splice/sendfile from a
tty" regression from 5.10.
* 'tty-splice' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux:
tty: teach the n_tty ICANON case about the new "cookie continuations" too
tty: teach n_tty line discipline about the new "cookie continuations"
tty: clean up legacy leftovers from n_tty line discipline
tty: implement read_iter
tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer
tty: implement write_iter
|
|
|
|
If controller driver has specified the irq_flags, mhi uses this specified
irq_flags. Otherwise, mhi uses default irq_flags.
The purpose of this change is to support one MSI vector for QCA6390.
MHI will use one same MSI vector too in this scenario.
In case of one MSI vector, IRQ_NO_BALANCING is needed when irq handler
is requested. The reason is if irq migration happens, the msi_data may
change too. However, the msi_data is already programmed to QCA6390
hardware during initialization phase. This msi_data inconsistence will
result in crash in kernel.
Another issue is in case of one MSI vector, IRQF_NO_SUSPEND will trigger
WARNINGS because QCA6390 wants to disable the IRQ during the suspend.
To avoid above two issues, QCA6390 driver specifies the irq_flags in case
of one MSI vector when mhi_register_controller is called.
Signed-off-by: Carl Huang <cjhuang@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
Some parameters may have to be determined at runtime.
It is the case for the event ring MSI vector.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
This patch add the TCA_FLOWER_KEY_CT_FLAGS_INVALID flag to
match the ct_state with invalid for conntrack.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/1611045110-682-1-git-send-email-wenxu@ucloud.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The tty line discipline .read() function was passed the final user
pointer destination as an argument, which doesn't match the 'write()'
function, and makes it very inconvenient to do a splice method for
ttys.
This is a conversion to use a kernel buffer instead.
NOTE! It does this by passing the tty line discipline ->read() function
an additional "cookie" to fill in, and an offset into the cookie data.
The line discipline can fill in the cookie data with its own private
information, and then the reader will repeat the read until either the
cookie is cleared or it runs out of data.
The only real user of this is N_HDLC, which can use this to handle big
packets, even if the kernel buffer is smaller than the whole packet.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When we attach any cgroup hook, the rest (even if unused/unattached) start
to contribute small overhead. In particular, the one we want to avoid is
__cgroup_bpf_run_filter_skb which does two redirections to get to
the cgroup and pushes/pulls skb.
Let's split cgroup_bpf_enabled to be per-attach to make sure
only used attach types trigger.
I've dropped some existing high-level cgroup_bpf_enabled in some
places because BPF_PROG_CGROUP_XXX_RUN macros usually have another
cgroup_bpf_enabled check.
I also had to copy-paste BPF_CGROUP_RUN_SA_PROG_LOCK for
GETPEERNAME/GETSOCKNAME because type for cgroup_bpf_enabled[type]
has to be constant and known at compile time.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210115163501.805133-4-sdf@google.com
|
|
When we attach a bpf program to cgroup/getsockopt any other getsockopt()
syscall starts incurring kzalloc/kfree cost.
Let add a small buffer on the stack and use it for small (majority)
{s,g}etsockopt values. The buffer is small enough to fit into
the cache line and cover the majority of simple options (most
of them are 4 byte ints).
It seems natural to do the same for setsockopt, but it's a bit more
involved when the BPF program modifies the data (where we have to
kmalloc). The assumption is that for the majority of setsockopt
calls (which are doing pure BPF options or apply policy) this
will bring some benefit as well.
Without this patch (we remove about 1% __kmalloc):
3.38% 0.07% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt
|
--3.30%--__cgroup_bpf_run_filter_getsockopt
|
--0.81%--__kmalloc
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210115163501.805133-3-sdf@google.com
|
|
Add custom implementation of getsockopt hook for TCP_ZEROCOPY_RECEIVE.
We skip generic hooks for TCP_ZEROCOPY_RECEIVE and have a custom
call in do_tcp_getsockopt using the on-stack data. This removes
3% overhead for locking/unlocking the socket.
Without this patch:
3.38% 0.07% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt
|
--3.30%--__cgroup_bpf_run_filter_getsockopt
|
--0.81%--__kmalloc
With the patch applied:
0.52% 0.12% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt_kern
Note, exporting uapi/tcp.h requires removing netinet/tcp.h
from test_progs.h because those headers have confliciting
definitions.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210115163501.805133-2-sdf@google.com
|
|
Conflicts:
drivers/net/can/dev.c
commit 03f16c5075b2 ("can: dev: can_restart: fix use after free bug")
commit 3e77f70e7345 ("can: dev: move driver related infrastructure into separate subdir")
Code move.
drivers/net/dsa/b53/b53_common.c
commit 8e4052c32d6b ("net: dsa: b53: fix an off by one in checking "vlan->vid"")
commit b7a9e0da2d1c ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects")
Field rename.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some brcmstb drivers may be shared with other SoC families. E.g. the
same USB PHY block is shared by brcmstb and BCM4908.
To avoid building brcmstb common code on non-brcmstb platforms we need
stubs for:
1. brcmstb_get_family_id()
2. brcmstb_get_product_id()
(to avoid "undefined reference to" errors).
With this change PHY_BRCM_USB will not have to unconditionally select
SOC_BRCMSTB anymore.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.11-rc5, including fixes from bpf, wireless, and
can trees.
Current release - regressions:
- nfc: nci: fix the wrong NCI_CORE_INIT parameters
Current release - new code bugs:
- bpf: allow empty module BTFs
Previous releases - regressions:
- bpf: fix signed_{sub,add32}_overflows type handling
- tcp: do not mess with cloned skbs in tcp_add_backlog()
- bpf: prevent double bpf_prog_put call from bpf_tracing_prog_attach
- bpf: don't leak memory in bpf getsockopt when optlen == 0
- tcp: fix potential use-after-free due to double kfree()
- mac80211: fix encryption issues with WEP
- devlink: use right genl user_ptr when handling port param get/set
- ipv6: set multicast flag on the multicast route
- tcp: fix TCP_USER_TIMEOUT with zero window
Previous releases - always broken:
- bpf: local storage helpers should check nullness of owner ptr passed
- mac80211: fix incorrect strlen of .write in debugfs
- cls_flower: call nla_ok() before nla_next()
- skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too"
* tag 'net-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
net: systemport: free dev before on error path
net: usb: cdc_ncm: don't spew notifications
net: mscc: ocelot: Fix multicast to the CPU port
tcp: Fix potential use-after-free due to double kfree()
bpf: Fix signed_{sub,add32}_overflows type handling
can: peak_usb: fix use after free bugs
can: vxcan: vxcan_xmit: fix use after free bug
can: dev: can_restart: fix use after free bug
tcp: fix TCP socket rehash stats mis-accounting
net: dsa: b53: fix an off by one in checking "vlan->vid"
tcp: do not mess with cloned skbs in tcp_add_backlog()
selftests: net: fib_tests: remove duplicate log test
net: nfc: nci: fix the wrong NCI_CORE_INIT parameters
sh_eth: Fix power down vs. is_opened flag ordering
net: Disable NETIF_F_HW_TLS_RX when RXCSUM is disabled
netfilter: rpfilter: mask ecn bits before fib lookup
udp: mask TOS bits in udp_v4_early_demux()
xsk: Clear pool even for inactive queues
bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback
sh_eth: Make PHY access aware of Runtime PM to fix reboot crash
...
|
|
A dummy zero bit is sent preceding the data during a read transfer by the
Microchip 93LC46B eeprom (section 2.7 of[1]). This results in right shift
of data during a read. In order to ignore this bit a quirk can be added to
send an extra zero bit after the read address.
Add a quirk to ignore the zero bit sent before data by adding a zero bit
after the read address.
[1] - https://www.mouser.com/datasheet/2/268/20001749K-277859.pdf
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20210105105817.17644-3-a-govindraju@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
RTL8156 sends notifications about every 32ms.
Only display/log notifications when something changes.
This issue has been reported by others:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832472
https://lkml.org/lkml/2020/8/27/1083
...
[785962.779840] usb 1-1: new high-speed USB device number 5 using xhci_hcd
[785962.929944] usb 1-1: New USB device found, idVendor=0bda, idProduct=8156, bcdDevice=30.00
[785962.929949] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6
[785962.929952] usb 1-1: Product: USB 10/100/1G/2.5G LAN
[785962.929954] usb 1-1: Manufacturer: Realtek
[785962.929956] usb 1-1: SerialNumber: 000000001
[785962.991755] usbcore: registered new interface driver cdc_ether
[785963.017068] cdc_ncm 1-1:2.0: MAC-Address: 00:24:27:88:08:15
[785963.017072] cdc_ncm 1-1:2.0: setting rx_max = 16384
[785963.017169] cdc_ncm 1-1:2.0: setting tx_max = 16384
[785963.017682] cdc_ncm 1-1:2.0 usb0: register 'cdc_ncm' at usb-0000:00:14.0-1, CDC NCM, 00:24:27:88:08:15
[785963.019211] usbcore: registered new interface driver cdc_ncm
[785963.023856] usbcore: registered new interface driver cdc_wdm
[785963.025461] usbcore: registered new interface driver cdc_mbim
[785963.038824] cdc_ncm 1-1:2.0 enx002427880815: renamed from usb0
[785963.089586] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
[785963.121673] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
[785963.153682] cdc_ncm 1-1:2.0 enx002427880815: network connection: disconnected
...
This is about 2KB per second and will overwrite all contents of a 1MB
dmesg buffer in under 10 minutes rendering them useless for debugging
many kernel problems.
This is also an extra 180 MB/day in /var/logs (or 1GB per week) rendering
the majority of those logs useless too.
When the link is up (expected state), spew amount is >2x higher:
...
[786139.600992] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.632997] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
[786139.665097] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.697100] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
[786139.729094] cdc_ncm 2-1:2.0 enx002427880815: network connection: connected
[786139.761108] cdc_ncm 2-1:2.0 enx002427880815: 2500 mbit/s downlink 2500 mbit/s uplink
...
Chrome OS cannot support RTL8156 until this is fixed.
Signed-off-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20210120011208.3768105-1-grundler@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
reset.
Add command to EC_CMD_REBOOT_EC to reset EC but don't boot AP.
Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Link: https://lore.kernel.org/r/20201221041231.14516-1-pihsun@chromium.org
|
|
Commit 5c0a85fad949 ("mm: make faultaround produce old ptes") changed
the "faultaround" behaviour to initialise prefaulted PTEs as 'old',
since this avoids vmscan wrongly assuming that they are hot, despite
having never been explicitly accessed by userspace. The change has been
shown to benefit numerous arm64 micro-architectures (with hardware
access flag) running Android, where both application launch latency and
direct reclaim time are significantly reduced (by 10%+ and ~80%
respectively).
Unfortunately, commit 315d09bf30c2 ("Revert "mm: make faultaround
produce old ptes"") reverted the change due to it being identified as
the cause of a ~6% regression in unixbench on x86. Experiments on a
variety of recent arm64 micro-architectures indicate that unixbench is
not affected by the original commit, which appears to yield a 0-1%
performance improvement.
Since one size does not fit all for the initial state of prefaulted
PTEs, introduce arch_wants_old_prefaulted_pte(), which allows an
architecture to opt-in to 'old' prefaulted PTEs at runtime based on
whatever criteria it may have.
Cc: Jan Kara <jack@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
alloc_set_pte() has two users with different requirements: in the
faultaround code, it called from an atomic context and PTE page table
has to be preallocated. finish_fault() can sleep and allocate page table
as needed.
PTL locking rules are also strange, hard to follow and overkill for
finish_fault().
Let's untangle the mess. alloc_set_pte() has gone now. All locking is
explicit.
The price is some code duplication to handle huge pages in faultaround
path, but it should be fine, having overall improvement in readability.
Link: https://lore.kernel.org/r/20201229132819.najtavneutnf7ajp@box
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
[will: s/from from/from/ in comment; spotted by willy]
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This converts the lms283gf05 backlight driver to use GPIO
descriptors and switches the single PXA Palm Z2 device
over to defining these.
Since the platform data was only used to convey GPIO
information we can delete the platform data header.
Notice that we define the proper active low semantics in
the board file GPIO descriptor table (active low) and
assert the reset line by bringing it to "1" (asserted).
Cc: Marek Vasut <marex@denx.de>
Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Daniel Mack <daniel@zonque.org>
Acked-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
|
The ARM architected TRNG firmware interface, described in ARM spec
DEN0098, define an ARM SMCCC based interface to a true random number
generator, provided by firmware.
Add the definitions of the SMCCC functions as defined by the spec.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20210106103453.152275-2-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This comes from an end-user request, where they're running multiple VMs on
hosts with bonded interfaces connected to some interest switch topologies,
where 802.3ad isn't an option. They're currently running a proprietary
solution that effectively achieves load-balancing of VMs and bandwidth
utilization improvements with a similar form of transmission algorithm.
Basically, each VM has it's own vlan, so it always sends its traffic out
the same interface, unless that interface fails. Traffic gets split
between the interfaces, maintaining a consistent path, with failover still
available if an interface goes down.
Unlike bond_eth_hash(), this hash function is using the full source MAC
address instead of just the last byte, as there are so few components to
the hash, and in the no-vlan case, we would be returning just the last
byte of the source MAC as the hash value. It's entirely possible to have
two NICs in a bond with the same last byte of their MAC, but not the same
MAC, so this adjustment should guarantee distinct hashes in all cases.
This has been rudimetarily tested to provide similar results to the
proprietary solution it is aiming to replace. A patch for iproute2 is also
posted, to properly support the new mode there as well.
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Thomas Davis <tadavis@lbl.gov>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Link: https://lore.kernel.org/r/20210119010927.1191922-1-jarod@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit fbdd0049d98d44914fc57d4b91f867f4996c787b.
Due to commit in fixes tag, netdevice events were received only in one net
namespace of mlx5_core_dev. Due to this when netdevice events arrive in
net namespace other than net namespace of mlx5_core_dev, they are missed.
This results in empty GID table due to RDMA device being detached from its
net device.
Hence, revert back to receive netdevice events in all net namespaces to
restore back RDMA functionality in non init_net net namespace. The
deadlock will have to be addressed in another patch.
Fixes: fbdd0049d98d ("RDMA/mlx5: Fix devlink deadlock on net namespace deletion")
Link: https://lore.kernel.org/r/20210117092633.10690-1-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This patch is to define a inline function skb_csum_is_sctp(), and
also replace all places where it checks if it's a SCTP CSUM skb.
This function would be used later in many networking drivers in
the following patches.
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The change of the cooling device state should be used by the governor
or at least by the core code, not by the drivers themselves.
Remove the API usage and move the function declaration to the internal
headers.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20210118173824.9970-1-daniel.lezcano@linaro.org
|
|
The code does no longer use the ms unit based fields to set the
delays as they are replaced by the jiffies.
Remove them and replace their user to use the jiffies version instead.
Cc: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Peter Kästle <peter@piie.net>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20201216220337.839878-3-daniel.lezcano@linaro.org
|
|
The delays are stored in ms units and when the polling function is
called this delay is converted into jiffies at each call.
Instead of doing the conversion again and again, compute the jiffies
at init time and use the value directly when setting the polling.
Cc: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201216220337.839878-1-daniel.lezcano@linaro.org
|