Age | Commit message (Collapse) | Author |
|
`pcc_mailbox_probe` doesn't initialize all memory that has been allocated
before the first time that one of it's members `txdone_irq` may be
accessed.
This leads to a an invalid load any time that this member is accessed:
[ 2.429769] UBSAN: invalid-load in drivers/mailbox/pcc.c:684:22
[ 2.430324] UBSAN: invalid-load in drivers/mailbox/mailbox.c:486:12
[ 4.276782] UBSAN: invalid-load in drivers/acpi/cppc_acpi.c:314:45
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215587
Fixes: ce028702ddbc ("mailbox: pcc: Move bulk of PCCT parsing into pcc_mbox_probe")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
The stm32 ipcc mailbox driver supports only two interrupts (rx and tx), so
remove the unsupported "wakeup" one.
Signed-off-by: Fabien Dessenne <fabien.dessenne@foss.st.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Correct kerneldoc warnings like:
drivers/mailbox/arm_mhu_db.c:47:
warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
drivers/mailbox/qcom-ipcc.c:58:
warning: Function parameter or member 'num_chans' not described in 'qcom_ipcc'
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Using pm_runtime_resume_and_get() to replace pm_runtime_get_sync and
pm_runtime_put_noidle. This change is just to simplify the code, no
actual functional changes.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ran jianping <ran.jianping@zte.com.cn>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Using pm_runtime_resume_and_get() to replace pm_runtime_get_sync and
pm_runtime_put_noidle. This change is just to simplify the code, no
actual functional changes.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ran jianping <ran.jianping@zte.com.cn>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Add support of mt8186 adsp mailbox.
Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Add compatible name for MediaTek MT8186 SoC ADSP mailbox.
Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Add support for 128-bit shared mailboxes found on Tegra234 chips.
Signed-off-by: Kartik <kkartik@nvidia.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Tegra234 supports sending/receiving 32-bit and 128-bit data over
a shared mailbox. Based on the data size to be used, clients need
to specify the type of shared mailbox in the device tree.
Add a macro for 128-bit shared mailbox. Mailbox clients can use this
macro as a flag in device tree to enable 128-bit data support for a
shared mailbox.
Signed-off-by: Kartik <kkartik@nvidia.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
This patch introduces tegra_hsp_sm_ops to abstract send & receive
API's for shared mailboxes.
Signed-off-by: Kartik <kkartik@nvidia.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Add the GCE header file to define GCE subsys ids, hardware event ids
and constants for MT8186.
Signed-off-by: Yongqiang Niu <yongqiang.niu@mediatek.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
The list iterator is always non-NULL so it doesn't need to be checked.
Thus just remove the unnecessary NULL check.
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Smatch reports this issue
imx-mailbox.c:887:10: warning: Initializer entry defined twice
imx-mailbox.c:889:10: also defined here
.rxdb = imx_mu_generic_rxdb,
Is listed twice, so remove one.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Consumer examples in the bindings of resource providers are trivial,
useless and duplicating code. Additionally the incomplete qcom,smp2p
example triggers DT schema warnings.
Cleanup the example by removing the consumer part and fixing the
indentation to DT schema convention.
Reported-by: Rob Herring <robh@kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Commit 6eaf08770ee8 ("ACPICA: executer/exsystem: Warn about sleeps
greater than 10 ms") made acpi_ex_system_do_sleep() log a warning for
sleep times greater than 10 ms, but such sleep times are used in
power management AML because of the PCI specification requirements.
This results with logging warnings that cannot really be acted on in
any useful way which is annoying and these warnings show up in the logs
on many production systems, so revert commit 6eaf08770ee8.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Rather than pass in a bool for whether or not this work item needs to go
into the priority list or not, provide separate helpers for it. For most
use cases, this also then gets rid of the branch for non-priority task
work.
While at it, rename the prior_task_list to prio_task_list. Prior is
a confusing name for it, as it would seem to indicate that this is the
previous task_work list. prio makes it clear that this is a priority
task_work list.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Spelling mistake (triple letters) in comment. Detected with the help of
Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220521111145.81697-39-Julia.Lawall@inria.fr
|
|
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220521111145.81697-28-Julia.Lawall@inria.fr
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220521111145.81697-29-Julia.Lawall@inria.fr
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Due to i2c->adap.dev.fwnode not being set, ACPI_COMPANION() wasn't properly
found for TWSI controllers.
Signed-off-by: Szymon Balcerak <sbalcerak@marvell.com>
Signed-off-by: Piyush Malgujar <pmalgujar@marvell.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Before sending a MSI the hardware writes information pertinent to the
interrupt cause to a memory location pointed by SMTICL register. This
memory holds three double words where the least significant bit tells
whether the interrupt cause of master/target/error is valid. The driver
does not use this but we need to set it up because otherwise it will
perform DMA write to the default address (0) and this will cause an
IOMMU fault such as below:
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Write] Request device [00:12.0] PASID ffffffff fault addr 0
[fault reason 05] PTE Write access is not set
To prevent this from happening, provide a proper DMA buffer for this
that then gets mapped by the IOMMU accordingly.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Fix the missing clk_disable_unprepare() before return
from mtk_i2c_probe() in the error handling case.
Fixes: d04913ec5f89 ("i2c: mt7621: Add MediaTek MT7621/7628/7688 I2C driver")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
The quirk entry for Focusrite Saffire 6 had no proper ep_idx for the
capture endpoint, and this confused the driver, resulting in the
broken sound. This patch adds the missing ep_idx in the entry.
While we are at it, a couple of other entries (for Digidesign MBox and
MOTU MicroBook II) seem to have the same problem, and those are
covered as well.
Fixes: bf6313a0ff76 ("ALSA: usb-audio: Refactor endpoint management")
Reported-by: André Kapelrud <a.kapelrud@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220521065325.426-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Maris reported that TEAC UD-501 (0644:8043) doesn't work with the
typical "clock source 41 is not valid, cannot use" errors on the
recent kernels. The currently known workaround so far is to restore
(partially) what we've done unconditionally at the clock setup;
namely, re-setup the USB interface immediately after the clock is
changed. This patch re-introduces the behavior conditionally for TEAC
devices.
Further notes:
- The USB interface shall be set later in
snd_usb_endpoint_configure(), but this seems to be too late.
- Even calling usb_set_interface() right after
sne_usb_init_sample_rate() doesn't help; so this must be related
with the clock validation, too.
- The device may still spew the "clock source 41 is not valid" error
at the first clock setup. This seems happening at the very first
try of clock setup, but it disappears at later attempts.
The error is likely harmless because the driver retries the clock
setup (such an error is more or less expected on some devices).
Fixes: bf6313a0ff76 ("ALSA: usb-audio: Refactor endpoint management")
Reported-and-tested-by: Maris Abele <maris7abele@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220521064627.29292-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
GCC 12 sees that it's technically possible for num_streams to be larger
than ARRAY_SIZE(pcm->streams). Bounds-check the iterator.
../sound/pci/lola/lola_pcm.c: In function 'lola_pcm_update':
../sound/pci/lola/lola_pcm.c:567:64: warning: array subscript [0, 31] is outside array bounds of 'struct lola_stream[16]' [-Warray-bounds]
567 | struct lola_stream *str = &pcm->streams[i];
| ~~~~~~~~~~~~^~~
In file included from ../sound/pci/lola/lola_pcm.c:15:
../sound/pci/lola/lola.h:307:28: note: while referencing 'streams'
307 | struct lola_stream streams[MAX_STREAM_COUNT];
| ^~~~~~~
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220520165537.2139826-1-keescook@chromium.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fix for v5.17
This is rather late and at this point I'm expecting it to get merged in
the merge window rather than as a fix but if we get a -rc8 it's a small,
driver specific fix which should be fine to send.
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Correctly expose GICv3 support even if no irqchip is created so
that userspace doesn't observe it changing pointlessly (fixing a
regression with QEMU)
- Don't issue a hypercall to set the id-mapped vectors when protected
mode is enabled (fix for pKVM in combination with CPUs affected by
Spectre-v3a)
x86 (five oneliners, of which the most interesting two are):
- a NULL pointer dereference on INVPCID executed with paging
disabled, but only if KVM is using shadow paging
- an incorrect bsearch comparison function which could truncate the
result and apply PMU event filtering incorrectly. This one comes
with a selftests update too"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
KVM: x86: hyper-v: fix type of valid_bank_mask
KVM: Free new dirty bitmap if creating a new memslot fails
KVM: eventfd: Fix false positive RCU usage warning
selftests: kvm/x86: Verify the pmu event filter matches the correct event
selftests: kvm/x86: Add the helper function create_pmu_event_filter
kvm: x86/pmu: Fix the compare function used by the pmu event filter
KVM: arm64: Don't hypercall before EL2 init
KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC
KVM: x86/mmu: Update number of zapped pages even if page list is stable
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Three clk driver fixes to close out the release
- Fix a divider calculation breaking boot on Broadcom bcm2835
- Fix HDMI output on Tanix TX6 mini board by reverting a patch
- Fix clk_set_rate_range() calls on at91 by considering the range
while calculating the divisor"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: at91: generated: consider range when calculating best rate
Revert "clk: sunxi-ng: sun6i-rtc: Add support for H6"
clk: bcm2835: fix bcm2835_clock_choose_div
|
|
Pull drm fixes from Dave Airlie:
"Few final fixes for 5.18, one amdgpu, core dp mst leak fix, dma-buf
two fixes, and i915 has a few fixes, one for a regression on older
GM45 chipsets,
dma-buf:
- ioctl userspace use fix
- fix dma-buf sysfs name generation
core:
- dp/mst leak fix
amdgpu:
- suspend/resume regression fix
i915:
- fix for #5806: GPU hangs and display artifacts on Intel GM45
- reject DMC with out-of-spec MMIO
- correctly mark guilty contexts on GuC reset"
* tag 'drm-fixes-2022-05-21' of git://anongit.freedesktop.org/drm/drm:
drm/i915: Use i915_gem_object_ggtt_pin_ww for reloc_iomap
drm/amd: Don't reset dGPUs if the system is going to s2idle
drm/dp/mst: fix a possible memory leak in fetch_monitor_name()
dma-buf: fix use of DMA_BUF_SET_NAME_{A,B} in userspace
i915/guc/reset: Make __guc_reset_context aware of guilty engines
drm/i915/dmc: Add MMIO range restrictions
dma-buf: ensure unique directory name for dmabuf stats
|
|
We should have 'n', then 'size', not the opposite.
This is harmless because the 2 values are just multiplied, but having
the correct order silence a (unpublished yet) smatch warning.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/49d726d11964ca0e3757bdb5659e3b3eaa1572b5.1653081643.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Some muxes need to set a the safe position when clock is off.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-12-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Some RCC muxes can manages two output clocks with same register.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-11-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Complete all kernel clocks of stm32mp13.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-10-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
All peripheral clocks are mainly based on stm32_gate clock.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-9-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Don't register a clock if this clock is secured.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-8-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Just to introduce management of stm32 composite clock.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-7-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Just to introduce management of a stm32 divider clock
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-6-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Just to introduce management of a stm32 gate clock.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-5-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Just to introduce management of a stm32 mux clock.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-4-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
This driver manages Reset and Clock of STM32MP13 soc.
It uses a clk-stm32-core module to manage stm32 gate, mux and divider
for STM32MP13 and for new future soc.
All gates, muxes, dividers are identify by an index and information
are stored in array (register address, shift, with, flags...)
This is useful when we have two clocks with the same gate or
when one mux manages two output clocks.
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-3-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
New compatible to manage clock and reset of STM32MP13 SoC.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@foss.st.com>
Link: https://lore.kernel.org/r/20220516070600.7692-2-gabriel.fernandez@foss.st.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
To move the list iterator variable into the list_for_each_entry_*()
macro in the future it should be avoided to use the list iterator
variable after the loop body.
To *never* use the list iterator variable after the loop it was
concluded to use a separate iterator variable instead of a
found boolean [1].
This removes the need to use a found variable and simply checking if
the variable was set, can determine if the break/goto was hit.
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Link: https://lore.kernel.org/r/20220324071019.59483-1-jakobkoschel@gmail.com
Tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.
So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.
[1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
Signed-off-by: Len Baker <len.baker@gmx.com>
Link: https://lore.kernel.org/r/20210904131714.2312-1-len.baker@gmx.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
_NR_CLKS which can be used to register clocks via nr_clk_ids. The clock
IDs are started from 1. So, _NR_CLKS should be defined to "the last
clock id + 1"
Fixes: 680e1c8370a2 ("dt-bindings: clock: add clock binding definitions for Exynos Auto v9")
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Link: https://lore.kernel.org/r/20220520030625.145324-1-chanho61.park@samsung.com
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Alan Maguire says:
====================
Unprivileged BPF disabled (kernel.unprivileged_bpf_disabled >= 1)
is the default in most cases now; when set, the BPF system call is
blocked for users without CAP_BPF/CAP_SYS_ADMIN. In some cases
however, it makes sense to split activities between capability-requiring
ones - such as program load/attach - and those that might not require
capabilities such as reading perf/ringbuf events, reading or
updating BPF map configuration etc. One example of this sort of
approach is a service that loads a BPF program, and a user-space
program that interacts with it.
Here - rather than blocking all BPF syscall commands - unprivileged
BPF disabled blocks the key object-creating commands (prog load,
map load). Discussion has alluded to this idea in the past [1],
and Alexei mentioned it was also discussed at LSF/MM/BPF this year.
Changes since v3 [2]:
- added acks to patch 1
- CI was failing on Ubuntu; I suspect the issue was an old capability.h
file which specified CAP_LAST_CAP as < CAP_BPF, leading to the logic
disabling all caps not disabling CAP_BPF. Use CAP_BPF as basis for
"all caps" bitmap instead as we explicitly define it in cap_helpers.h
if not already found in capabilities.h
- made global variables arguments to subtests instead (Andrii, patch 2)
Changes since v2 [3]:
- added acks from Yonghong
- clang compilation issue in selftest with bpf_prog_query()
(Alexei, patch 2)
- disable all capabilities for test (Yonghong, patch 2)
- add assertions that size of perf/ringbuf data matches expectations
(Yonghong, patch 2)
- add map array size definition, remove unneeded whitespace (Yonghong, patch 2)
Changes since RFC [4]:
- widened scope of commands unprivileged BPF disabled allows
(Alexei, patch 1)
- removed restrictions on map types for lookup, update, delete
(Alexei, patch 1)
- removed kernel CONFIG parameter controlling unprivileged bpf disabled
change (Alexei, patch 1)
- widened test scope to cover most BPF syscall commands, with positive
and negative subtests
[1] https://lore.kernel.org/bpf/CAADnVQLTBhCTAx1a_nev7CgMZxv1Bb7ecz1AFRin8tHmjPREJA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/1652880861-27373-1-git-send-email-alan.maguire@oracle.com/T/
[3] https://lore.kernel.org/bpf/1652788780-25520-1-git-send-email-alan.maguire@oracle.com/T/#t
[4] https://lore.kernel.org/bpf/20220511163604.5kuczj6jx3ec5qv6@MBP-98dd607d3435.dhcp.thefacebook.com/T/#mae65f35a193279e718f37686da636094d69b96ee
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
tests load/attach bpf prog with maps, perfbuf and ringbuf, pinning
them. Then effective caps are dropped and we verify we can
- pick up the pin
- create ringbuf/perfbuf
- get ringbuf/perfbuf events, carry out map update, lookup and delete
- create a link
Negative testing also ensures
- BPF prog load fails
- BPF map create fails
- get fd by id fails
- get next id fails
- query fails
- BTF load fails
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/1652970334-30510-3-git-send-email-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
With unprivileged BPF disabled, all cmds associated with the BPF syscall
are blocked to users without CAP_BPF/CAP_SYS_ADMIN. However there are
use cases where we may wish to allow interactions with BPF programs
without being able to load and attach them. So for example, a process
with required capabilities loads/attaches a BPF program, and a process
with less capabilities interacts with it; retrieving perf/ring buffer
events, modifying map-specified config etc. With all BPF syscall
commands blocked as a result of unprivileged BPF being disabled,
this mode of interaction becomes impossible for processes without
CAP_BPF.
As Alexei notes
"The bpf ACL model is the same as traditional file's ACL.
The creds and ACLs are checked at open(). Then during file's write/read
additional checks might be performed. BPF has such functionality already.
Different map_creates have capability checks while map_lookup has:
map_get_sys_perms(map, f) & FMODE_CAN_READ.
In other words it's enough to gate FD-receiving parts of bpf
with unprivileged_bpf_disabled sysctl.
The rest is handled by availability of FD and access to files in bpffs."
So key fd creation syscall commands BPF_PROG_LOAD and BPF_MAP_CREATE
are blocked with unprivileged BPF disabled and no CAP_BPF.
And as Alexei notes, map creation with unprivileged BPF disabled off
blocks creation of maps aside from array, hash and ringbuf maps.
Programs responsible for loading and attaching the BPF program
can still control access to its pinned representation by restricting
permissions on the pin path, as with normal files.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/r/1652970334-30510-2-git-send-email-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Tracing and syscall BPF program types are very convenient to add BPF
capabilities to subsystem otherwise not BPF capable.
When we add kfuncs capabilities to those program types, we can add
BPF features to subsystems without having to touch BPF core.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20220518205924.399291-2-benjamin.tissoires@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Joanne Koong says:
====================
Add a bhash2 table hashed by port + address
This patchset proposes adding a bhash2 table that hashes by port and address.
The motivation behind bhash2 is to expedite bind requests in situations where
the port has many sockets in its bhash table entry, which makes checking bind
conflicts costly especially given that we acquire the table entry spinlock
while doing so, which can cause softirq cpu lockups and can prevent new tcp
connections.
We ran into this problem at Meta where the traffic team binds a large number
of IPs to port 443 and the bind() call took a significant amount of time
which led to cpu softirq lockups, which caused packet drops and other failures
on the machine
The patches are as follows:
1/2 - Adds a second bhash table (bhash2) hashed by port and address
2/2 - Adds a test for timing how long an additional bind request takes when
the bhash entry is populated
When experimentally testing this on a local server for ~24k sockets bound to
the port, the results seen were:
ipv4:
before - 0.002317 seconds
with bhash2 - 0.000018 seconds
ipv6:
before - 0.002431 seconds
with bhash2 - 0.000021 seconds
====================
Link: https://lore.kernel.org/r/20220520001834.2247810-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
bhash entry
This test populates the bhash table for a given port with
MAX_THREADS * MAX_CONNECTIONS sockets, and then times how long
a bind request on the port takes.
When populating the bhash table, we create the sockets and then bind
the sockets to the same address and port (SO_REUSEADDR and SO_REUSEPORT
are set). When timing how long a bind on the port takes, we bind on a
different address without SO_REUSEPORT set. We do not set SO_REUSEPORT
because we are interested in the case where the bind request does not
go through the tb->fastreuseport path, which is fragile (eg
tb->fastreuseport path does not work if binding with a different uid).
To run the test locally, I did:
* ulimit -n 65535000
* ip addr add 2001:0db8:0:f101::1 dev eth0
* ./bind_bhash_test 443
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|