Age | Commit message (Collapse) | Author |
|
Currently blktests nvme/002 trips up debugobjects if CONFIG_NVME_AUTH is
enabled, but authentication is not on a queue. This is because
nvmet_auth_sq_free cancels sq->auth_expired_work unconditionaly, while
auth_expired_work is only ever initialized if authentication is enabled
for a given controller.
Fix this by calling most of what is nvmet_init_auth unconditionally
when initializing the SQ, and just do the setting of the result
field in the connect command handler.
Fixes: db1312dd9548 ("nvmet: implement basic In-Band Authentication")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
|
|
There is only a single call-site of nvmet_tcp_finish_cmd(), this
becomes redundant. Remove nvmet_tcp_finish_cmd() and use the original
function body instead.
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
ttag is used as an index to get cmd in nvmet_tcp_handle_h2c_data_pdu(),
add a bounds check to avoid out-of-bounds access.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
As per NVMe/TCP transport specification ICReq PDU is the first PDU received
by the controller and controller should receive only one ICReq PDU.
If controller receives more than one ICReq PDU then this can be considered
as fatal error.
nvmet-tcp driver does not check for ICReq PDU opcode if queue state is
NVMET_TCP_Q_LIVE. In LIVE state ICReq PDU is treated as CapsuleCmd PDU,
this can result in abnormal behavior.
Add a check for ICReq PDU in nvmet_tcp_done_recv_pdu() to fix this issue.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
nvmet-tcp frees CMD buffers in nvmet_tcp_uninit_data_in_cmds(),
and waits the inflight IO requests in nvmet_sq_destroy(). During wait
the inflight IO requests, the callback nvmet_tcp_queue_response()
is called from backend after IO complete, this leads a typical
Use-After-Free issue like this:
BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 107f80067 P4D 107f80067 PUD 10789e067 PMD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 123 Comm: kworker/1:1H Kdump: loaded Tainted: G E 6.0.0-rc2.bm.1-amd64 #15
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: nvmet_tcp_wq nvmet_tcp_io_work [nvmet_tcp]
RIP: 0010:shash_ahash_digest+0x2b/0x110
Code: 1f 44 00 00 41 57 41 56 41 55 41 54 55 48 89 fd 53 48 89 f3 48 83 ec 08 44 8b 67 30 45 85 e4 74 1c 48 8b 57 38 b8 00 10 00 00 <44> 8b 7a 08 44 29 f8 39 42 0c 0f 46 42 0c 41 39 c4 76 43 48 8b 03
RSP: 0018:ffffc9000051bdd8 EFLAGS: 00010206
RAX: 0000000000001000 RBX: ffff888100ab5470 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff888100ab5470 RDI: ffff888100ab5420
RBP: ffff888100ab5420 R08: ffff8881024d08c8 R09: ffff888103e1b4b8
R10: 8080808080808080 R11: 0000000000000000 R12: 0000000000001000
R13: 0000000000000000 R14: ffff88813412bd4c R15: ffff8881024d0800
FS: 0000000000000000(0000) GS:ffff88883fa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000000104b48000 CR4: 0000000000350ee0
Call Trace:
<TASK>
nvmet_tcp_io_work+0xa52/0xb52 [nvmet_tcp]
? __switch_to+0x106/0x420
process_one_work+0x1ae/0x380
? process_one_work+0x380/0x380
worker_thread+0x30/0x360
? process_one_work+0x380/0x380
kthread+0xe6/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
Separate nvmet_tcp_uninit_data_in_cmds() into two steps:
uninit data in cmds <- new step 1
nvmet_sq_destroy();
cancel_work_sync(&queue->io_work);
free CMD buffers <- new step 2
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
We've been reporting 2 maps regardless of whether the module parameter
asked for anything beyond the default queues. A consequence of this
means that blk-mq will reinitialize the all the hardware contexts and io
schedulers on every controller reset when the mapping is exactly the
same as before. This unnecessary overhead is adding several milliseconds
on a reset for environments that don't need it. Report the actual number
of mappings in use.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
If swiotlb is force enabled dma_max_mapping_size ends up calling
swiotlb_max_mapping_size which takes into account the min align mask for
the device. Set the min align mask for nvme driver before calling
dma_max_mapping_size while calculating max hw sectors.
Signed-off-by: Rishabh Bhatnagar <risbhat@amazon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
When a discovery controller is disconnected, no AENs will arrive to
notify the host about discovery log change events.
In order to solve this, send a uevent notification when a
persistent discovery controller reconnects. We add a new ctrl
flag NVME_CTRL_STARTED_ONCE that will be set on the first
start, and consecutive calls will find it set, and send the
event to userspace if the controller is a discovery controller.
Upon the event reception, userspace will re-read the discovery
log page and will act upon changes as it sees fit.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
We expect to grow a few of these flags for various purposes
so make them a proper enumeration.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
E3C/E4C SSDs do support the Write Zeroes command in theory, but have very
bad performance when using it. As the firmware has been frozen for these
products we can not expect firmware improvements for it, so disable
Write Zeroes.
Signed-off-by: Tina Hsu <tina_hsu@phison.corp-partner.google.com>
[hch: update the commit message]
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The IOC_PR_CLEAR and IOC_PR_RELEASE ioctls are
non-functional on NVMe devices because the nvme_pr_clear()
and nvme_pr_release() functions set the IEKEY field incorrectly.
The IEKEY field should be set only when the key is zero (i.e,
not specified). The current code does it backwards.
Furthermore, the NVMe spec describes the persistent
reservation "clear" function as an option on the reservation
release command. The current implementation of nvme_pr_clear()
erroneously uses the reservation register command.
Fix these errors. Note that NVMe version 1.3 and later specify
that setting the IEKEY field will return an error of Invalid
Field in Command. The fix will set IEKEY when the key is zero,
which is appropriate as these ioctls consider a zero key to
be "unspecified", and the intention of the spec change is
to require a valid key.
Tested on a version 1.4 PCI NVMe device in an Azure VM.
Fixes: 1673f1f08c88 ("nvme: move block_device_operations and ns/ctrl freeing to common code")
Fixes: 1d277a637a71 ("NVMe: Add persistent reservation ops")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
It's part of the line protocol, same as in commit 82805818898d
("Documentation: NBD_REPLY_MAGIC isn't a magic number")
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Link: https://lore.kernel.org/r/20220927003727.slf4ofb7dgum6apt@tarta.nabijaczleweli.xyz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The subsystem reset writes to a register, so we have to ensure the
device state is capable of handling that otherwise the driver may access
unmapped registers. Use the state machine to ensure the subsystem reset
doesn't try to write registers on a device already undergoing this type
of reset.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214771
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The passthrough commands already have this restriction, but the other
operations do not. Require the same capabilities for all users as all of
these operations, which include resets and rescans, can be disruptive.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The firmware revision can change on after a reset so copy the most
recent info each time instead of just the first time, otherwise the
sysfs firmware_rev entry may contain stale data.
Reported-by: Jeff Lien <jeff.lien@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
If a reset occurs after the scan work attempts to issue a command, the
reset may quisce the admin queue, which blocks the scan work's command
from dispatching. The scan work will not be able to complete while the
queue is quiesced.
Meanwhile, the reset work will cancel all outstanding admin tags and
wait until all requests have transitioned to idle, which includes the
passthrough request. But the passthrough request won't be set to idle
until after the scan_work flushes, so we're deadlocked.
Fix this by handling the end effects after the request has been freed.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216354
Reported-by: Jonathan Derrick <Jonathan.Derrick@solidigm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Add support for Dell 5811e (EM7455) with USB-id 0x413c:0x81c2.
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
|
|
Setting pointer and afterwards checking for wraparound leads
to the possibility of returning the inconsistent pointer position.
This patch increments buffer pointer atomically to avoid this issue.
Fixes: e7f73a1613567a ("ASoC: Add dmaengine PCM helper functions")
Signed-off-by: Andreas Pape <apape@de.adit-jv.com>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Link: https://lore.kernel.org/r/1664211493-11789-1-git-send-email-erosca@de.adit-jv.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220926135558.26580-12-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220926135558.26580-11-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220926135558.26580-10-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220926135558.26580-9-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220926135558.26580-8-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220926135558.26580-7-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220926135558.26580-6-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220926135558.26580-5-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The recent change in ALSA core allows drivers to get the current PCM
state directly from runtime object. Replace the calls accordingly.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220926135558.26580-4-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The mmap status record should be read-only. Modifying it from
user-space may screw up things unexpectedly, so let's clear the write
bits at exposing it.
Note that alsa-lib and other known user-space apps access the mmapped
status only as read-only, hence this change shouldn't break the
existing applications.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220926135558.26580-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
In the PCM core and driver code, there are lots place referring to the
current PCM state via runtime->status->state. This patch introduced a
local PCM state in runtime itself and replaces those references with
runtime->state. It has improvements in two aspects:
- The reduction of a indirect access leads to more code optimization
- It avoids a possible (unexpected) modification of the state via mmap
of the status record
The status->state is updated together with runtime->state, so that
user-space can still read the current state via mmap like before,
too.
This patch touches only the ALSA core code. The changes in each
driver will follow in later patches.
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220926135558.26580-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
There is already a SPDX-License-Identifier tag, so the corresponding
license text can be removed.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/203c1db92c470925f31e361f6e7d180812501f2e.1664112023.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
There is already a SPDX-License-Identifier tag, so the corresponding license
text can be removed.
While at it, be more consistent and:
- add a missing .c (ff-protocol-latter)
- remove an empty line (motu-protocol-v1)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/2bfe76c7eeb0f5205a1427e280bf8d9da0354a62.1664110649.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The name of this function looks not very accurate compared to it's
implementation and it's only a wrapper to erofs_read_metabuf(). So,
let's fold it directly instead.
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220927032518.25266-1-zbestahu@gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
ath.git patches for v6.1. Major changes:
ath11k
* cold boot calibration support on WCN6750
* Target Wake Time (TWT) debugfs support for STA interface
* support to connect to a non-transmit MBSSID AP profile
* enable remain-on-channel support on WCN6750
* implement SRAM dump debugfs interface
* enable threaded NAPI on all hardware
* WoW support for WCN6750
* support to provide transmit power from firmware via nl80211
* support to get power save duration for each client
* spectral scan support for 160 MHz
wcn36xx
* add SNR from a received frame as a source of system entropy
|
|
As noted by Arnd Bergmann, "we used to have three drivers for the same
hardware (pcmcia, pata and ide), and only the pcmcia driver remained
in the tree after drivers/ide/ was removed and pata_at91 did not get
converted to DT". "There is no dts file in tree that actually declares
either of them, so chances are that nobody is actually using the CF
slot on at91 any more."[1]
On this rationale, remove the AT91RM9200 Compact Flash driver, which
also assists in reaching "the goal of stopping exporting OF-specific
APIs of gpiolib".[2]
[1] https://lore.kernel.org/lkml/68c63077-848b-45f5-8aca-ed995391f2b6@www.fastmail.com/
[2] https://lore.kernel.org/lkml/Yy6d7TjqzUwGQnQa@penguin/
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
|
|
When STA roams from one AP to another, after roam is complete, host
driver tries to get TIM information from firmware. This is no longer
supported in the firmware & hence, this call will always fail.
This failure results in the below message being displayed on the
console all the time when roam is done.
ieee80211 phy0: brcmf_update_bss_info: wl dtim_assoc failed (-52)
Changes ensure that the host driver will no longer try to get TIM
information from firmware.
Signed-off-by: Ramesh Rangavittal <ramesh.rangavittal@infineon.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@infineon.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220922104140.11889-5-ian.lin@infineon.com
|
|
Increase dcmd maximum buffer size to match firmware
configuration for new chips.
Signed-off-by: Lo(Double)Hsiang Lo <double.lo@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220922104140.11889-4-ian.lin@infineon.com
|
|
Adds support of 89459 chip pcie device and save restore support.
Signed-off-by: Alexander Prutskov <alep@cypress.com>
Signed-off-by: Joseph chuang <jiac@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220922104140.11889-3-ian.lin@infineon.com
|
|
4373 has support of 16 WOWL patterns thus increasing the default value
Signed-off-by: Ryohei Kondo <ryohei.kondo@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220922104140.11889-2-ian.lin@infineon.com
|
|
The bug is here: "} else if (item) {".
The list iterator value will *always* be set and non-NULL by
list_for_each_entry(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element is found in list.
Use a new value 'iter' as the list iterator, while use the old value
'item' as a dedicated pointer to point to the found element, which
1. can fix this bug, due to now 'item' is NULL only if it's not found.
2. do not need to change all the uses of 'item' after the loop.
3. can also limit the scope of the list iterator 'iter' *only inside*
the traversal loop by simply declaring 'iter' inside the loop in the
future, as usage of the iterator outside of the list_for_each_entry
is considered harmful. https://lkml.org/lkml/2022/2/17/1032
Fixes: a910e4a94f692 ("cw1200: add driver for the ST-E CW1100 & CW1200 WLAN chipsets")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220413091723.17596-1-xiam0nd.tong@gmail.com
|
|
When running rootless with special capabilities like:
FOWNER / DAC_OVERRIDE / DAC_READ_SEARCH
The "access" API will not make the proper check if there is really
access to a file or not.
>From the access man page:
"
The check is done using the calling process's real UID and GID, rather
than the effective IDs as is done when actually attempting an operation
(e.g., open(2)) on the file. Similarly, for the root user, the check
uses the set of permitted capabilities rather than the set of effective
capabilities; ***and for non-root users, the check uses an empty set of
capabilities.***
"
What that means is that for non-root user the access API will not do the
proper validation if the process really has permission to a file or not.
To resolve this this patch replaces all the access API calls with
faccessat with AT_EACCESS flag.
Signed-off-by: Jon Doron <jond@wiz.io>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220925070431.1313680-1-arilou@gmail.com
|
|
Song Liu says:
====================
Changes v1 => v2:
1. Update arch_prepare_bpf_dispatcher to use a RO image and a RW buffer.
(Alexei) Note: I haven't found an existing test to cover this part, so
this part was tested manually (comparing the generated dispatcher is
the same).
Jeff Layton reported CPA W^X warning linux-next [1]. It turns out to be
W^X issue with bpf trampoline and bpf dispatcher. Fix these by:
1. Use bpf_prog_pack for bpf_dispatcher;
2. Set memory permission properly with bpf trampoline.
[1] https://lore.kernel.org/lkml/c84cc27c1a5031a003039748c3c099732a718aec.camel@kernel.org/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Mark the trampoline as RO+X after arch_prepare_bpf_trampoline, so that
the trampoine follows W^X rule strictly. This will turn off warnings like
CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: ...
Also remove bpf_jit_alloc_exec_page(), since it is not used any more.
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20220926184739.3512547-3-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Allocate bpf_dispatcher with bpf_prog_pack_alloc so that bpf_dispatcher
can share pages with bpf programs.
arch_prepare_bpf_dispatcher() is updated to provide a RW buffer as working
area for arch code to write to.
This also fixes CPA W^X warnning like:
CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: ...
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20220926184739.3512547-2-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Jiri Olsa says:
====================
Martynas reported bpf_get_func_ip returning +4 address when
CONFIG_X86_KERNEL_IBT option is enabled and I found there are
some failing bpf tests when this option is enabled.
The CONFIG_X86_KERNEL_IBT option adds endbr instruction at the
function entry, so the idea is to 'fix' entry ip for kprobe_multi
and trampoline probes, because they are placed on the function
entry.
v5 changes:
- updated uapi/linux/bpf.h headers with comment for
bpf_get_func_ip returning 0 [Andrii]
- added acks
v4 changes:
- used get_kernel_nofault to read previous instruction [Peter]
- used movabs instruction in trampoline comment [Peter]
- renamed fentry_ip argument in kprobe_multi_link_handler [Peter]
v3 changes:
- using 'unused' bpf function to get IBT config option
into selftest skeleton
- rebased to current bpf-next/master
- added ack/review from Masami
v2 changes:
- change kprobes get_func_ip to return zero for kprobes
attached within the function body [Andrii]
- detect IBT config and properly test kprobe with offset
[Andrii]
v1 changes:
- read previous instruction in kprobe_multi link handler
and adjust entry_ip for CONFIG_X86_KERNEL_IBT option
- split first patch into 2 separate changes
- update changelogs
====================
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
With CONFIG_X86_KERNEL_IBT enabled the test for kprobe with offset
won't work because of the extra endbr instruction.
As suggested by Andrii adding CONFIG_X86_KERNEL_IBT detection
and using appropriate offset value based on that.
Also removing test7 program, because it does the same as test6.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220926153340.1621984-7-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Changing return value of kprobe's version of bpf_get_func_ip
to return zero if the attach address is not on the function's
entry point.
For kprobes attached in the middle of the function we can't easily
get to the function address especially now with the CONFIG_X86_KERNEL_IBT
support.
If user cares about current IP for kprobes attached within the
function body, they can get it with PT_REGS_IP(ctx).
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220926153340.1621984-6-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Martynas reported bpf_get_func_ip returning +4 address when
CONFIG_X86_KERNEL_IBT option is enabled.
When CONFIG_X86_KERNEL_IBT is enabled we'll have endbr instruction
at the function entry, which screws return value of bpf_get_func_ip()
helper that should return the function address.
There's short term workaround for kprobe_multi bpf program made by
Alexei [1], but we need this fixup also for bpf_get_attach_cookie,
that returns cookie based on the entry_ip value.
Moving the fixup in the fprobe handler, so both bpf_get_func_ip
and bpf_get_attach_cookie get expected function address when
CONFIG_X86_KERNEL_IBT option is enabled.
Also renaming kprobe_multi_link_handler entry_ip argument to fentry_ip
so it's clearer this is an ftrace __fentry__ ip.
[1] commit 7f0059b58f02 ("selftests/bpf: Fix kprobe_multi test.")
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Martynas Pumputis <m@lambda.lt>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220926153340.1621984-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Using function address given at the generation time as the trampoline
ip argument. This way we get directly the function address that we
need, so we don't need to:
- read the ip from the stack
- subtract X86_PATCH_SIZE
- subtract ENDBR_INSN_SIZE if CONFIG_X86_KERNEL_IBT is enabled
which is not even implemented yet ;-)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220926153340.1621984-4-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Keeping the resolved 'addr' in kallsyms_callback, instead of taking
ftrace_location value, because we depend on symbol address in the
cookie related code.
With CONFIG_X86_KERNEL_IBT option the ftrace_location value differs
from symbol address, which screwes the symbol address cookies matching.
There are 2 users of this function:
- bpf_kprobe_multi_link_attach
for which this fix is for
- get_ftrace_locations
which is used by register_fprobe_syms
this function needs to get symbols resolved to addresses,
but does not need 'ftrace location addresses' at this point
there's another ftrace location translation in the path done
by ftrace_set_filter_ips call:
register_fprobe_syms
addrs = get_ftrace_locations
register_fprobe_ips(addrs)
...
ftrace_set_filter_ips
...
__ftrace_match_addr
ip = ftrace_location(ip);
...
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220926153340.1621984-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Adding KPROBE_FLAG_ON_FUNC_ENTRY kprobe flag to indicate that
attach address is on function entry. This is used in following
changes in get_func_ip helper to return correct function address.
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220926153340.1621984-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|