Age | Commit message (Collapse) | Author |
|
The whitelist has been refactored away with a0fb3fc8af01 ("mmc:
renesas_sdhi: remove whitelist for internal DMAC") so the comment
doesn't make any sense anymore.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220320123016.57991-5-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
When I read 'no_fallback', I forgot what fallback even though I was the
author of this change. Name it better to make the code easier to
understand.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220320123016.57991-4-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
It is not explicitly expressed in the docs, but the needed data strobe
pin is indeed missing for D3. The BSP disables HS400 as well. This means
a little refactoring to reuse an already existing setup.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220320123016.57991-3-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
We moved quirk handling out of the SDHI core to the individual drivers.
So, no need to include headers needed for soc_device_match et al.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220320123016.57991-2-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Way back in commit 4f25580fb84d ("mmc: core: changes frequency to
hs_max_dtr when selecting hs400es"), Rockchip engineers noticed that
some eMMC don't respond to SEND_STATUS commands very reliably if they're
still running at a low initial frequency. As mentioned in that commit,
JESD84-B51 P49 suggests a sequence in which the host:
1. sets HS_TIMING
2. bumps the clock ("<= 52 MHz")
3. sends further commands
It doesn't exactly require that we don't use a lower-than-52MHz
frequency, but in practice, these eMMC don't like it.
The aforementioned commit tried to get that right for HS400ES, although
it's unclear whether this ever truly worked as committed into mainline,
as other changes/refactoring adjusted the sequence in conflicting ways:
08573eaf1a70 ("mmc: mmc: do not use CMD13 to get status after speed mode
switch")
53e60650f74e ("mmc: core: Allow CMD13 polling when switching to HS mode
for mmc")
In any case, today we do step 3 before step 2. Let's fix that, and also
apply the same logic to HS200/400, where this eMMC has problems too.
Resolves errors like this seen when booting some RK3399 Gru/Scarlet
systems:
[ 2.058881] mmc1: CQHCI version 5.10
[ 2.097545] mmc1: SDHCI controller on fe330000.mmc [fe330000.mmc] using ADMA
[ 2.209804] mmc1: mmc_select_hs400es failed, error -84
[ 2.215597] mmc1: error -84 whilst initialising MMC card
[ 2.417514] mmc1: mmc_select_hs400es failed, error -110
[ 2.423373] mmc1: error -110 whilst initialising MMC card
[ 2.605052] mmc1: mmc_select_hs400es failed, error -110
[ 2.617944] mmc1: error -110 whilst initialising MMC card
[ 2.835884] mmc1: mmc_select_hs400es failed, error -110
[ 2.841751] mmc1: error -110 whilst initialising MMC card
Ealier versions of this patch bumped to 200MHz/HS200 speeds too early,
which caused issues on, e.g., qcom-msm8974-fairphone-fp2. (Thanks for
the report Luca!) After a second look, it appears that aligns with
JESD84 / page 45 / table 28, so we need to keep to lower (HS / 52 MHz)
rates first.
Fixes: 08573eaf1a70 ("mmc: mmc: do not use CMD13 to get status after speed mode switch")
Fixes: 53e60650f74e ("mmc: core: Allow CMD13 polling when switching to HS mode for mmc")
Fixes: 4f25580fb84d ("mmc: core: changes frequency to hs_max_dtr when selecting hs400es")
Cc: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://lore.kernel.org/linux-mmc/11962455.O9o76ZdvQC@g550jk/
Reported-by: Luca Weiss <luca@z3ntu.xyz>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Tested-by: Luca Weiss <luca@z3ntu.xyz>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220422100824.v4.1.I484f4ee35609f78b932bd50feed639c29e64997e@changeid
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
This patch adds the necessary PCI IDs for Intel Meteor Lake-P
devices.
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20220425103518.44028-1-heikki.krogerus@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We received a report[1] of kernel crashes when Cilium is used in XDP
mode with virtio_net after updating to newer kernels. After
investigating the reason it turned out that when using mergeable bufs
with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf()
calculates the build_skb address wrong because the offset can become less
than the headroom so it gets the address of the previous page (-X bytes
depending on how lower offset is):
page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256
This is a pr_err() I added in the beginning of page_to_skb which clearly
shows offset that is less than headroom by adding 4 bytes of metadata
via an xdp prog. The calculations done are:
receive_mergeable():
headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
offset = xdp.data - page_address(xdp_page) -
vi->hdr_len - metasize;
page_to_skb():
p = page_address(page) + offset;
...
buf = p - headroom;
Now buf goes -4 bytes from the page's starting address as can be seen
above which is set as skb->head and skb->data by build_skb later. Depending
on what's done with the skb (when it's freed most often) we get all kinds
of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate
the new headroom after the xdp program has run, similar to how offset
and len are recalculated. Headroom is directly related to
data_hard_start, data and data_meta, so we use them to get the new size.
The result is correct (similar pr_err() in page_to_skb, one case of
xdp_page and one case of virtnet buf):
a) Case with 4 bytes of metadata
[ 115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
[ 121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
b) Case of pushing data +32 bytes
[ 153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
[ 158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
c) Case of pushing data -33 bytes
[ 835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
[ 840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223
Offset and headroom are equal because offset points to the start of
reserved bytes for the virtio_net header which are at buf start +
headroom, while data points at buf start + vnet hdr size + headroom so
when data or data_meta are adjusted by the xdp prog both the headroom size
and the offset change equally. We can use data_hard_start to compute the
new headroom after the xdp prog (linearized / page start case, the
virtnet buf case is similar just with bigger base offset):
xdp.data_hard_start = page_address + vnet_hdr
xdp.data = page_address + vnet_hdr + headroom
new headroom after xdp prog = xdp.data - xdp.data_hard_start - metasize
An example reproducer xdp prog[3] is below.
[1] https://github.com/cilium/cilium/issues/19453
[2] Two of the many traces:
[ 40.437400] BUG: Bad page state in process swapper/0 pfn:14940
[ 40.916726] BUG: Bad page state in process systemd-resolve pfn:053b7
[ 41.300891] kernel BUG at include/linux/mm.h:720!
[ 41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G B W 5.18.0-rc1+ #37
[ 41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[ 41.306018] RIP: 0010:page_frag_free+0x79/0xe0
[ 41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
[ 41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
[ 41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
[ 41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
[ 41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
[ 41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
[ 41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
[ 41.317700] FS: 00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
[ 41.319150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
[ 41.321387] Call Trace:
[ 41.321819] <TASK>
[ 41.322193] skb_release_data+0x13f/0x1c0
[ 41.322902] __kfree_skb+0x20/0x30
[ 41.343870] tcp_recvmsg_locked+0x671/0x880
[ 41.363764] tcp_recvmsg+0x5e/0x1c0
[ 41.384102] inet_recvmsg+0x42/0x100
[ 41.406783] ? sock_recvmsg+0x1d/0x70
[ 41.428201] sock_read_iter+0x84/0xd0
[ 41.445592] ? 0xffffffffa3000000
[ 41.462442] new_sync_read+0x148/0x160
[ 41.479314] ? 0xffffffffa3000000
[ 41.496937] vfs_read+0x138/0x190
[ 41.517198] ksys_read+0x87/0xc0
[ 41.535336] do_syscall_64+0x3b/0x90
[ 41.551637] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 41.568050] RIP: 0033:0x48765b
[ 41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[ 41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
[ 41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
[ 41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
[ 41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
[ 41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
[ 41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
[ 41.744254] </TASK>
[ 41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net
and
[ 33.524802] BUG: Bad page state in process systemd-network pfn:11e60
[ 33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
[ 33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
[ 33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
[ 33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
[ 33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
[ 33.532471] page dumped because: nonzero mapcount
[ 33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
[ 33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
[ 33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[ 33.532484] Call Trace:
[ 33.532496] <TASK>
[ 33.532500] dump_stack_lvl+0x45/0x5a
[ 33.532506] bad_page.cold+0x63/0x94
[ 33.532510] free_pcp_prepare+0x290/0x420
[ 33.532515] free_unref_page+0x1b/0x100
[ 33.532518] skb_release_data+0x13f/0x1c0
[ 33.532524] kfree_skb_reason+0x3e/0xc0
[ 33.532527] ip6_mc_input+0x23c/0x2b0
[ 33.532531] ip6_sublist_rcv_finish+0x83/0x90
[ 33.532534] ip6_sublist_rcv+0x22b/0x2b0
[3] XDP program to reproduce(xdp_pass.c):
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
SEC("xdp_pass")
int xdp_pkt_pass(struct xdp_md *ctx)
{
bpf_xdp_adjust_head(ctx, -(int)32);
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";
compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass
CC: stable@vger.kernel.org
CC: Jason Wang <jasowang@redhat.com>
CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
CC: Daniel Borkmann <daniel@iogearbox.net>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: virtualization@lists.linux-foundation.org
Fixes: 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220425103703.3067292-1-razor@blackwall.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The other port_hidden functions rely on the port_read/port_write
functions to access the hidden control port. These functions apply the
offset for port_base_addr where applicable. Update port_hidden_wait to
use the port_wait_bit so that port_base_addr offsets are accounted for
when waiting for the busy bit to change.
Without the offset the port_hidden_wait function would timeout on
devices that have a non-zero port_base_addr (e.g. MV88E6141), however
devices that have a zero port_base_addr would operate correctly (e.g.
MV88E6390).
Fixes: 609070133aff ("net: dsa: mv88e6xxx: update code operating on hidden registers")
Signed-off-by: Nathan Rossi <nathan@nathanrossi.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220425070454.348584-1-nathan@nathanrossi.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Return back the error value that we get from phy_read_mmd().
Fixes: c84786fa8f91 ("net: phy: marvell10g: read copper results from CSSR1")
Signed-off-by: Baruch Siach <baruch.siach@siklu.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/f47cb031aeae873bb008ba35001607304a171a20.1650868058.git.baruch@tkos.co.il
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The __warn() prototype is declared in CONFIG_BUG scope but the function
definition in panic.c is unconditional. The IBT enablement started using
it unconditionally but a CONFIG_X86_KERNEL_IBT=y, CONFIG_BUG=n .config
will trigger a
arch/x86/kernel/traps.c: In function ‘__exc_control_protection’:
arch/x86/kernel/traps.c:249:17: error: implicit declaration of function \
‘__warn’; did you mean ‘pr_warn’? [-Werror=implicit-function-declaration]
Pull up the declarations so that they're unconditionally visible too.
[ bp: Rewrite commit message. ]
Fixes: 991625f3dd2c ("x86/ibt: Add IBT feature, MSR and #CP handling")
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220426032007.510245-1-starzhangzsd@gmail.com
|
|
The hardware checksum offloading requires use of a transmit
status block inserted before the outgoing frame data, this was
updated in '9a9ba2a4aaaa ("net: bcmgenet: always enable status blocks")'
However, skb_tx_timestamp() assumes that it is passed a raw frame
and PTP parsing chokes on this status block.
Fix this by calling __skb_pull(), which hides the TSB before calling
skb_tx_timestamp(), so an outgoing PTP packet is parsed correctly.
As the data in the skb has already been set up for DMA, and the
dma_unmap_* calls use a separately stored address, there is no
no effective change in the data transmission.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220424165307.591145-1-jonathan.lemon@gmail.com
Fixes: d03825fba459 ("net: bcmgenet: add skb_tx_timestamp call")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The codeaurora.org domain is no longer valid, remove my id
from the maintainers for OMAP PM frameworks.
I haven't contributed to them in years, neither do I plan to
in the future so not updating this with my new quicinc id.
Signed-off-by: Rajendra Nayak <quic_rjendra@quicinc.com>
Message-Id: <1649824983-29400-1-git-send-email-quic_rjendra@quicinc.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
On some systems, the oops only has relative time from boot.
Signed-off-by: Jean-Marc Eurin <jmeurin@google.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220425160927.3823016-1-jmeurin@google.com
|
|
Create a dump header to enable the addition of fields without having
to modify the rest of the code.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jean-Marc Eurin <jmeurin@google.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220421234244.2172003-3-jmeurin@google.com
|
|
The read buffer size depends on the MTDOOPS_HEADER_SIZE.
Tested: Changed the header size, it doesn't panic, header is still
read/written correctly.
Signed-off-by: Jean-Marc Eurin <jmeurin@google.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20220421234244.2172003-2-jmeurin@google.com
|
|
The function mctp_unregister() reclaims the device's relevant resource
when a netcard detaches. However, a running routine may be unaware of
this and cause the use-after-free of the mdev->addrs object.
The race condition can be demonstrated below
cleanup thread another thread
|
unregister_netdev() | mctp_sendmsg()
... | ...
mctp_unregister() | rt = mctp_route_lookup()
... | mctl_local_output()
kfree(mdev->addrs) | ...
| saddr = rt->dev->addrs[0];
|
An attacker can adopt the (recent provided) mtcpserial driver with pty
to fake the device detaching and use the userfaultfd to increase the
race success chance (in mctp_sendmsg). The KASan report for such a POC
is shown below:
[ 86.051955] ==================================================================
[ 86.051955] BUG: KASAN: use-after-free in mctp_local_output+0x4e9/0xb7d
[ 86.051955] Read of size 1 at addr ffff888005f298c0 by task poc/295
[ 86.051955]
[ 86.051955] Call Trace:
[ 86.051955] <TASK>
[ 86.051955] dump_stack_lvl+0x33/0x42
[ 86.051955] print_report.cold.13+0xb2/0x6b3
[ 86.051955] ? preempt_schedule_irq+0x57/0x80
[ 86.051955] ? mctp_local_output+0x4e9/0xb7d
[ 86.051955] kasan_report+0xa5/0x120
[ 86.051955] ? mctp_local_output+0x4e9/0xb7d
[ 86.051955] mctp_local_output+0x4e9/0xb7d
[ 86.051955] ? mctp_dev_set_key+0x79/0x79
[ 86.051955] ? copyin+0x38/0x50
[ 86.051955] ? _copy_from_iter+0x1b6/0xf20
[ 86.051955] ? sysvec_apic_timer_interrupt+0x97/0xb0
[ 86.051955] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 86.051955] ? mctp_local_output+0x1/0xb7d
[ 86.051955] mctp_sendmsg+0x64d/0xdb0
[ 86.051955] ? mctp_sk_close+0x20/0x20
[ 86.051955] ? __fget_light+0x2fd/0x4f0
[ 86.051955] ? mctp_sk_close+0x20/0x20
[ 86.051955] sock_sendmsg+0xdd/0x110
[ 86.051955] __sys_sendto+0x1cc/0x2a0
[ 86.051955] ? __ia32_sys_getpeername+0xa0/0xa0
[ 86.051955] ? new_sync_write+0x335/0x550
[ 86.051955] ? alloc_file+0x22f/0x500
[ 86.051955] ? __ip_do_redirect+0x820/0x1820
[ 86.051955] ? vfs_write+0x44d/0x7b0
[ 86.051955] ? vfs_write+0x44d/0x7b0
[ 86.051955] ? fput_many+0x15/0x120
[ 86.051955] ? ksys_write+0x155/0x1b0
[ 86.051955] ? __ia32_sys_read+0xa0/0xa0
[ 86.051955] __x64_sys_sendto+0xd8/0x1b0
[ 86.051955] ? exit_to_user_mode_prepare+0x2f/0x120
[ 86.051955] ? syscall_exit_to_user_mode+0x12/0x20
[ 86.051955] do_syscall_64+0x3a/0x80
[ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 86.051955] RIP: 0033:0x7f82118a56b3
[ 86.051955] RSP: 002b:00007ffdb154b110 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
[ 86.051955] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f82118a56b3
[ 86.051955] RDX: 0000000000000010 RSI: 00007f8211cd4000 RDI: 0000000000000007
[ 86.051955] RBP: 00007ffdb154c1d0 R08: 00007ffdb154b164 R09: 000000000000000c
[ 86.051955] R10: 0000000000000000 R11: 0000000000000293 R12: 000055d779800db0
[ 86.051955] R13: 00007ffdb154c2b0 R14: 0000000000000000 R15: 0000000000000000
[ 86.051955] </TASK>
[ 86.051955]
[ 86.051955] Allocated by task 295:
[ 86.051955] kasan_save_stack+0x1c/0x40
[ 86.051955] __kasan_kmalloc+0x84/0xa0
[ 86.051955] mctp_rtm_newaddr+0x242/0x610
[ 86.051955] rtnetlink_rcv_msg+0x2fd/0x8b0
[ 86.051955] netlink_rcv_skb+0x11c/0x340
[ 86.051955] netlink_unicast+0x439/0x630
[ 86.051955] netlink_sendmsg+0x752/0xc00
[ 86.051955] sock_sendmsg+0xdd/0x110
[ 86.051955] __sys_sendto+0x1cc/0x2a0
[ 86.051955] __x64_sys_sendto+0xd8/0x1b0
[ 86.051955] do_syscall_64+0x3a/0x80
[ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 86.051955]
[ 86.051955] Freed by task 301:
[ 86.051955] kasan_save_stack+0x1c/0x40
[ 86.051955] kasan_set_track+0x21/0x30
[ 86.051955] kasan_set_free_info+0x20/0x30
[ 86.051955] __kasan_slab_free+0x104/0x170
[ 86.051955] kfree+0x8c/0x290
[ 86.051955] mctp_dev_notify+0x161/0x2c0
[ 86.051955] raw_notifier_call_chain+0x8b/0xc0
[ 86.051955] unregister_netdevice_many+0x299/0x1180
[ 86.051955] unregister_netdevice_queue+0x210/0x2f0
[ 86.051955] unregister_netdev+0x13/0x20
[ 86.051955] mctp_serial_close+0x6d/0xa0
[ 86.051955] tty_ldisc_kill+0x31/0xa0
[ 86.051955] tty_ldisc_hangup+0x24f/0x560
[ 86.051955] __tty_hangup.part.28+0x2ce/0x6b0
[ 86.051955] tty_release+0x327/0xc70
[ 86.051955] __fput+0x1df/0x8b0
[ 86.051955] task_work_run+0xca/0x150
[ 86.051955] exit_to_user_mode_prepare+0x114/0x120
[ 86.051955] syscall_exit_to_user_mode+0x12/0x20
[ 86.051955] do_syscall_64+0x46/0x80
[ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 86.051955]
[ 86.051955] The buggy address belongs to the object at ffff888005f298c0
[ 86.051955] which belongs to the cache kmalloc-8 of size 8
[ 86.051955] The buggy address is located 0 bytes inside of
[ 86.051955] 8-byte region [ffff888005f298c0, ffff888005f298c8)
[ 86.051955]
[ 86.051955] The buggy address belongs to the physical page:
[ 86.051955] flags: 0x100000000000200(slab|node=0|zone=1)
[ 86.051955] raw: 0100000000000200 dead000000000100 dead000000000122 ffff888005c42280
[ 86.051955] raw: 0000000000000000 0000000080660066 00000001ffffffff 0000000000000000
[ 86.051955] page dumped because: kasan: bad access detected
[ 86.051955]
[ 86.051955] Memory state around the buggy address:
[ 86.051955] ffff888005f29780: 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00
[ 86.051955] ffff888005f29800: fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc
[ 86.051955] >ffff888005f29880: fc fc fc fb fc fc fc fc fa fc fc fc fc fa fc fc
[ 86.051955] ^
[ 86.051955] ffff888005f29900: fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc
[ 86.051955] ffff888005f29980: fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc
[ 86.051955] ==================================================================
To this end, just like the commit e04480920d1e ("Bluetooth: defer
cleanup of resources in hci_unregister_dev()") this patch defers the
destructive kfree(mdev->addrs) in mctp_unregister to the mctp_dev_put,
where the refcount of mdev is zero and the entire device is reclaimed.
This prevents the use-after-free because the sendmsg thread holds the
reference of mdev in the mctp_route object.
Fixes: 583be982d934 (mctp: Add device handling and netlink interface)
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Acked-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20220422114340.32346-1-linma@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
plane_state->uapi.crtc is not what we want to be looking at.
If bigjoiner is used hw.crtc is what tells us what crtc the plane
is supposedly using.
Not an actual problem on current hardware as the only FBC capable
pipe (A) can't be a bigjoiner slave and thus uapi.crtc==hw.crtc
always here. But when we get more FBC instances this will become
actually important.
Fixes: 2e6c99f88679 ("drm/i915/fbc: Nuke lots of crap from intel_fbc_state_cache")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220413152852.7336-1-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
(cherry picked from commit 3e1faae3398789abe8d4797255bfe28d95d81308)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Fix typo in the _SEL_FETCH_PLANE_BASE_1_B register base address.
Fixes: a5523e2ff074a5 ("drm/i915: Add PSR2 selective fetch registers")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/5400
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: <stable@vger.kernel.org> # v5.9+
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220421162221.2261895-1-imre.deak@intel.com
(cherry picked from commit af2cbc6ef967f61711a3c40fca5366ea0bc7fecc)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
It's noted that dcvs interrupts are not self-clearing, thus an interrupt
handler runs constantly, which leads to a severe regression in runtime.
To fix the problem an explicit write to clear interrupt register is
required, note that on OSM platforms the register may not be present.
Fixes: 275157b367f4 ("cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support")
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
'size' may be used uninitialized in gsm_dlci_modem_output() if called with
an adaption that is neither 1 nor 2. The function is currently only called
by gsm_modem_upd_via_data() and only for adaption 2.
Properly handle every invalid case by returning -EINVAL to silence the
compiler warning and avoid future regressions.
Fixes: c19ffe00fed6 ("tty: n_gsm: fix invalid use of MSC in advanced option")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20220425104726.7986-1-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Document the max_wro_seq_files, nr_wro_seq_files, max_active_seq_files
and nr_active_seq_files sysfs attributes.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
|
|
Total 16 bytes can be saved in two ways:
1) The field 'bio' will only be used in bio based mode, and the field
'rq' will only be used in mq mode. Since they won't be used in the
same time, declare a union for them.
2) The field 'bool fake_timeout' can be placed in the hole after the
field 'error'.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20220426022133.3999006-1-yukuai3@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is no need to declare attributes such as the ctime, mtime and
block size invalid when we're just returning a delegation, so it is
inappropriate to call nfs_post_op_update_inode_force_wcc().
Instead, just call nfs_refresh_inode() after faking up the change
attribute. We know that the GETATTR op occurs before the DELEGRETURN, so
we are safe when doing this.
Fixes: 0bc2c9b4dca9 ("NFSv4: Don't discard the attributes returned by asynchronous DELEGRETURN")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-fixes
Pull Allwinner clk fixes from Jernej Skrabec:
- Add missing sentinel
- check return value for platform_get_resource()
- mark rtc-32k as critical
* tag 'sunxi-clk-fixes-for-5.18-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
clk: sunxi: sun9i-mmc: check return value after calling platform_get_resource()
clk: sunxi-ng: sun6i-rtc: Mark rtc-32k as critical
clk: sunxi-ng: fix not NULL terminated coccicheck error
|
|
If IO_URING_SCM_ALL isn't set, as it would not be on 32-bit builds,
then we trigger a warning:
fs/io_uring.c: In function '__io_sqe_files_unregister':
fs/io_uring.c:8992:13: warning: unused variable 'i' [-Wunused-variable]
8992 | int i;
| ^
Move the ifdef up to include the 'i' variable declaration.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 5e45690a1cb8 ("io_uring: store SCM state in io_fixed_file->file_ptr")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There are several instances where magic numbers are used in md.c instead
of the defined constants in md_p.h. This patch set improves code
readability by replacing all occurrences of 0xffff, 0xfffe, and 0xfffd when
relating to md roles with their equivalent defined constant.
Signed-off-by: David Sloan <david.sloan@eideticom.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>
|
|
The RAID0 layout is irrelevant if all members have the same size so the
array has only one zone. It is *also* irrelevant if the array has two
zones and the second zone has only one device, for example if the array
has two members of different sizes.
So in that case it makes sense to allow assembly even when the layout is
undefined, like what is done when the array has only one zone.
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Signed-off-by: Song Liu <song@kernel.org>
|
|
A handful of functions note the device_lock must be held with a comment
but this is not comprehensive. Many other functions hold the lock when
taken so add an __must_hold() to each call to annotate when the lock is
held.
This makes it a bit easier to analyse device_lock.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
|
|
To suppress the last remaining sparse warnings about accessing
rdev, add rcu_dereference_protected calls to a couple places
in raid5-ppl. All of these places are called under raid5_run and
therefore are occurring before the array has started and is thus
safe.
There's no sensible check to do for the second argument of
rcu_dereference_protected() so a comment is added instead.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
|
|
The mddev_lock should be held during raid5_remove_disk() which is when
the rdev/replacement pointers are modified. So any access to these
pointers marked __rcu should be safe whenever the mddev_lock is held.
There are numerous such access that currently produce sparse warnings.
Add a helper function, rdev_mdlock_deref() that wraps
rcu_dereference_protected() in all these instances.
This annotation fixes a number of sparse warnings.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
|
|
There are a number of accesses to __rcu variables that should be safe
because nr_pending in the disk is known to be elevated.
Create a wrapper around rcu_dereference_protected() to annotate these
accesses and verify that nr_pending is non-zero.
This fixes a number of sparse warnings.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
|
|
rdev and replacement are protected in some circumstances with
rcu_dereference and synchronize_rcu (in raid5_remove_disk()). However,
they were not annotated with __rcu so a sparse warning is emitted for
every rcu_dereference() call.
Add the __rcu annotation and fix up the initialization with
RCU_INIT_POINTER, all pointer modifications with rcu_assign_pointer(),
a few cases where the pointer value is tested with rcu_access_pointer()
and one case where READ_ONCE() is used instead of rcu_dereference(),
a case in print_raid5_conf() that should have rcu_dereference() and
rcu_read_[un]lock() calls.
Additional sparse issues will be fixed up in further commits.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
|
|
Sparse reports many warnings of the form:
drivers/md/raid5.c:1476:16: warning: dereference of noderef expression
This is because all struct raid5_percpu definitions get marked as
__percpu when really only the pointer in r5conf should have that
annotation.
Fix this by moving the defnition of raid5_precpu out of the definition
of struct r5conf.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
|
|
Be more careful about the error returns. Most errors in this function
are actually ENOMEM, but it forcibly returns EIO if conf has been
allocated.
Instead return ret and ensure it is set appropriately before each goto
abort.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>
|
|
This commit includes two topics:
1> replace deprecated strlcpy
change strlcpy to strscpy for strlcpy is marked as deprecated in
Documentation/process/deprecated.rst
2> remove duplicated strlcpy line
in md_bitmap_read_sb@md-bitmap.c there are two duplicated strlcpy(), the
history:
- commit cf921cc19cf7 ("Add node recovery callbacks") introduced the first
usage of strlcpy().
- commit b97e92574c0b ("Use separate bitmaps for each nodes in the cluster")
introduced the second strlcpy(). this time, the two strlcpy() are same,
we can remove anyone safely.
- commit d3b178adb3a3 ("md: Skip cluster setup for dm-raid") added dm-raid
special handling. And the "nodes" value is the key of this patch. but
from this patch, strlcpy() which was introduced by b97e92574c0bf
become necessary.
- commit 3c462c880b52 ("md: Increment version for clustered bitmaps") used
clustered major version to only handle in clustered env. this patch
could look a polishment for clustered code logic.
So cf921cc19cf7 became useless after d3b178adb3a3a, we could remove it
safely.
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Song Liu <song@kernel.org>
|
|
If bitmap area contains invalid data, kernel will crash then mdadm
triggers "Segmentation fault".
This is cluster-md speical bug. In non-clustered env, mdadm will
handle broken metadata case. In clustered array, only kernel space
handles bitmap slot info. But even this bug only happened in clustered
env, current sanity check is wrong, the code should be changed.
How to trigger: (faulty injection)
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sda
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdb
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb
mdadm -Ss
echo aaa > magic.txt
== below modifying slot 2 bitmap data ==
dd if=magic.txt of=/dev/sda seek=16384 bs=1 count=3 <== destroy magic
dd if=/dev/zero of=/dev/sda seek=16436 bs=1 count=4 <== ZERO chunksize
mdadm -A /dev/md0 /dev/sda /dev/sdb
== kernel crashes. mdadm outputs "Segmentation fault" ==
Reason of kernel crash:
In md_bitmap_read_sb (called by md_bitmap_create), bad bitmap magic didn't
block chunksize assignment, and zero value made DIV_ROUND_UP_SECTOR_T()
trigger "divide error".
Crash log:
kernel: md: md0 stopped.
kernel: md/raid1:md0: not clean -- starting background reconstruction
kernel: md/raid1:md0: active with 2 out of 2 mirrors
kernel: dlm: ... ...
kernel: md-cluster: Joined cluster 44810aba-38bb-e6b8-daca-bc97a0b254aa slot 1
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2
kernel: md-cluster: Could not gather bitmaps from slot 2
kernel: divide error: 0000 [#1] SMP NOPTI
kernel: CPU: 0 PID: 1603 Comm: mdadm Not tainted 5.14.6-1-default
kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]
kernel: RSP: 0018:ffffc22ac0843ba0 EFLAGS: 00010246
kernel: ... ...
kernel: Call Trace:
kernel: ? dlm_lock_sync+0xd0/0xd0 [md_cluster 77fe..7a0]
kernel: md_bitmap_copy_from_slot+0x2c/0x290 [md_mod 24ea..d3a]
kernel: load_bitmaps+0xec/0x210 [md_cluster 77fe..7a0]
kernel: md_bitmap_load+0x81/0x1e0 [md_mod 24ea..d3a]
kernel: do_md_run+0x30/0x100 [md_mod 24ea..d3a]
kernel: md_ioctl+0x1290/0x15a0 [md_mod 24ea....d3a]
kernel: ? mddev_unlock+0xaa/0x130 [md_mod 24ea..d3a]
kernel: ? blkdev_ioctl+0xb1/0x2b0
kernel: block_ioctl+0x3b/0x40
kernel: __x64_sys_ioctl+0x7f/0xb0
kernel: do_syscall_64+0x59/0x80
kernel: ? exit_to_user_mode_prepare+0x1ab/0x230
kernel: ? syscall_exit_to_user_mode+0x18/0x40
kernel: ? do_syscall_64+0x69/0x80
kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f4a15fa722b
kernel: ... ...
kernel: ---[ end trace 8afa7612f559c868 ]---
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Song Liu <song@kernel.org>
|
|
The bug is here:
if (!rdev || rdev->desc_nr != nr) {
The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each_rcu(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
found (In fact, it will be a bogus pointer to an invalid struct
object containing the HEAD). Otherwise it will bypass the check
and lead to invalid memory access passing the check.
To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'pdev' as a dedicated pointer to
point to the found element.
Cc: stable@vger.kernel.org
Fixes: 70bcecdb1534 ("md-cluster: Improve md_reload_sb to be less error prone")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Signed-off-by: Song Liu <song@kernel.org>
|
|
The bug is here:
if (!rdev)
The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element found.
Otherwise it will bypass the NULL check and lead to invalid memory
access passing the check.
To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'rdev' as a dedicated pointer to
point to the found element.
Cc: stable@vger.kernel.org
Fixes: 2aa82191ac36 ("md-cluster: Perform a lazy update")
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Acked-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Song Liu <song@kernel.org>
|
|
Raid456 module had allowed to achieve failed state. It was fixed by
fb73b357fb9 ("raid5: block failing device if raid will be failed").
This fix introduces a bug, now if raid5 fails during IO, it may result
with a hung task without completion. Faulty flag on the device is
necessary to process all requests and is checked many times, mainly in
analyze_stripe().
Allow to set faulty on drive again and set MD_BROKEN if raid is failed.
As a result, this level is allowed to achieve failed state again, but
communication with userspace (via -EBUSY status) will be preserved.
This restores possibility to fail array via #mdadm --set-faulty command
and will be fixed by additional verification on mdadm side.
Reproduction steps:
mdadm -CR imsm -e imsm -n 3 /dev/nvme[0-2]n1
mdadm -CR r5 -e imsm -l5 -n3 /dev/nvme[0-2]n1 --assume-clean
mkfs.xfs /dev/md126 -f
mount /dev/md126 /mnt/root/
fio --filename=/mnt/root/file --size=5GB --direct=1 --rw=randrw
--bs=64k --ioengine=libaio --iodepth=64 --runtime=240 --numjobs=4
--time_based --group_reporting --name=throughput-test-job
--eta-newline=1 &
echo 1 > /sys/block/nvme2n1/device/device/remove
echo 1 > /sys/block/nvme1n1/device/device/remove
[ 1475.787779] Call Trace:
[ 1475.793111] __schedule+0x2a6/0x700
[ 1475.799460] schedule+0x38/0xa0
[ 1475.805454] raid5_get_active_stripe+0x469/0x5f0 [raid456]
[ 1475.813856] ? finish_wait+0x80/0x80
[ 1475.820332] raid5_make_request+0x180/0xb40 [raid456]
[ 1475.828281] ? finish_wait+0x80/0x80
[ 1475.834727] ? finish_wait+0x80/0x80
[ 1475.841127] ? finish_wait+0x80/0x80
[ 1475.847480] md_handle_request+0x119/0x190
[ 1475.854390] md_make_request+0x8a/0x190
[ 1475.861041] generic_make_request+0xcf/0x310
[ 1475.868145] submit_bio+0x3c/0x160
[ 1475.874355] iomap_dio_submit_bio.isra.20+0x51/0x60
[ 1475.882070] iomap_dio_bio_actor+0x175/0x390
[ 1475.889149] iomap_apply+0xff/0x310
[ 1475.895447] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.902736] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.909974] iomap_dio_rw+0x2f2/0x490
[ 1475.916415] ? iomap_dio_bio_actor+0x390/0x390
[ 1475.923680] ? atime_needs_update+0x77/0xe0
[ 1475.930674] ? xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
[ 1475.938455] xfs_file_dio_aio_read+0x6b/0xe0 [xfs]
[ 1475.946084] xfs_file_read_iter+0xba/0xd0 [xfs]
[ 1475.953403] aio_read+0xd5/0x180
[ 1475.959395] ? _cond_resched+0x15/0x30
[ 1475.965907] io_submit_one+0x20b/0x3c0
[ 1475.972398] __x64_sys_io_submit+0xa2/0x180
[ 1475.979335] ? do_io_getevents+0x7c/0xc0
[ 1475.986009] do_syscall_64+0x5b/0x1a0
[ 1475.992419] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 1476.000255] RIP: 0033:0x7f11fc27978d
[ 1476.006631] Code: Bad RIP value.
[ 1476.073251] INFO: task fio:3877 blocked for more than 120 seconds.
Cc: stable@vger.kernel.org
Fixes: fb73b357fb9 ("raid5: block failing device if raid will be failed")
Reviewd-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Song Liu <song@kernel.org>
|
|
There is no direct mechanism to determine raid failure outside
personality. It is done by checking rdev->flags after executing
md_error(). If "faulty" flag is not set then -EBUSY is returned to
userspace. -EBUSY means that array will be failed after drive removal.
Mdadm has special routine to handle the array failure and it is executed
if -EBUSY is returned by md.
There are at least two known reasons to not consider this mechanism
as correct:
1. drive can be removed even if array will be failed[1].
2. -EBUSY seems to be wrong status. Array is not busy, but removal
process cannot proceed safe.
-EBUSY expectation cannot be removed without breaking compatibility
with userspace. In this patch first issue is resolved by adding support
for MD_BROKEN flag for RAID1 and RAID10. Support for RAID456 is added in
next commit.
The idea is to set the MD_BROKEN if we are sure that raid is in failed
state now. This is done in each error_handler(). In md_error() MD_BROKEN
flag is checked. If is set, then -EBUSY is returned to userspace.
As in previous commit, it causes that #mdadm --set-faulty is able to
fail array. Previously proposed workaround is valid if optional
functionality[1] is disabled.
[1] commit 9a567843f7ce("md: allow last device to be forcibly removed from
RAID1/RAID10.")
Reviewd-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Song Liu <song@kernel.org>
|
|
Since version 5.13, the standard syscon bindings have been added
to all clps711x DT nodes, so we can now use the more general
syscon_regmap_lookup_by_phandle function to get the syscon pointer.
Signed-off-by: Alexander Shiyan <eagle.alexander923@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Wen Gu says:
====================
net/smc: Two fixes for smc fallback
This patch set includes two fixes for smc fallback:
Patch 1/2 introduces some simple helpers to wrap the replacement
and restore of clcsock's callback functions. Make sure that only
the original callbacks will be saved and not overwritten.
Patch 2/2 fixes a syzbot reporting slab-out-of-bound issue where
smc_fback_error_report() accesses the already freed smc sock (see
https://lore.kernel.org/r/00000000000013ca8105d7ae3ada@google.com/).
The patch fixes it by resetting sk_user_data and restoring clcsock
callback functions timely in fallback situation.
But it should be noted that although patch 2/2 can fix the issue
of 'slab-out-of-bounds/use-after-free in smc_fback_error_report',
it can't pass the syzbot reproducer test. Because after applying
these two patches in upstream, syzbot reproducer triggered another
known issue like this:
==================================================================
BUG: KASAN: use-after-free in tcp_retransmit_timer+0x2ef3/0x3360 net/ipv4/tcp_timer.c:511
Read of size 8 at addr ffff888020328380 by task udevd/4158
CPU: 1 PID: 4158 Comm: udevd Not tainted 5.18.0-rc3-syzkaller-00074-gb05a5683eba6-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description.constprop.0.cold+0xeb/0x467 mm/kasan/report.c:313
print_report mm/kasan/report.c:429 [inline]
kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
tcp_retransmit_timer+0x2ef3/0x3360 net/ipv4/tcp_timer.c:511
tcp_write_timer_handler+0x5e6/0xbc0 net/ipv4/tcp_timer.c:622
tcp_write_timer+0xa2/0x2b0 net/ipv4/tcp_timer.c:642
call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1421
expire_timers kernel/time/timer.c:1466 [inline]
__run_timers.part.0+0x679/0xa80 kernel/time/timer.c:1737
__run_timers kernel/time/timer.c:1715 [inline]
run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1750
__do_softirq+0x29b/0x9c2 kernel/softirq.c:558
invoke_softirq kernel/softirq.c:432 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097
</IRQ>
...
(detail report can be found in https://syzkaller.appspot.com/text?tag=CrashReport&x=15406b44f00000)
IMHO, the above issue is the same as this known one: https://syzkaller.appspot.com/bug?extid=694120e1002c117747ed,
and it doesn't seem to be related with SMC. The discussion about this known issue is ongoing and can be found in
https://lore.kernel.org/bpf/000000000000f75af905d3ba0716@google.com/T/.
And I added the temporary solution mentioned in the above discussion on
top of my two patches, the syzbot reproducer of 'slab-out-of-bounds/
use-after-free in smc_fback_error_report' no longer triggers any issue.
====================
Link: https://lore.kernel.org/r/1650614179-11529-1-git-send-email-guwen@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported a slab-out-of-bounds/use-after-free issue,
which was caused by accessing an already freed smc sock in
fallback-specific callback functions of clcsock.
This patch fixes the issue by restoring fallback-specific
callback functions to original ones and resetting clcsock
sk_user_data to NULL before freeing smc sock.
Meanwhile, this patch introduces sk_callback_lock to make
the access and assignment to sk_user_data mutually exclusive.
Reported-by: syzbot+b425899ed22c6943e00b@syzkaller.appspotmail.com
Fixes: 341adeec9ada ("net/smc: Forward wakeup to smc socket waitqueue after fallback")
Link: https://lore.kernel.org/r/00000000000013ca8105d7ae3ada@google.com/
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Both listen and fallback process will save the current clcsock
callback functions and establish new ones. But if both of them
happen, the saved callback functions will be overwritten.
So this patch introduces some helpers to ensure that only save
the original callback functions of clcsock.
Fixes: 341adeec9ada ("net/smc: Forward wakeup to smc socket waitqueue after fallback")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs fixes from Jaegeuk Kim:
"This includes major bug fixes introduced in 5.18-rc1 and 5.17+:
- Remove obsolete whint_mode (5.18-rc1)
- Fix IO split issue caused by op_flags change in f2fs (5.18-rc1)
- Fix a wrong condition check to detect IO failure loop (5.18-rc1)
- Fix wrong data truncation during roll-forward (5.17+)"
* tag 'f2fs-fix-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: should not truncate blocks during roll-forward recovery
f2fs: fix wrong condition check when failing metapage read
f2fs: keep io_flags to avoid IO split due to different op_flags in two fio holders
f2fs: remove obsolete whint_mode
|
|
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Fixes: 7a6fca879f59 ("clk: sunxi: Add driver for A80 MMC config clocks/resets")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220421134308.2885094-1-yangyingliang@huawei.com
|
|
This code is really spurious.
It always returns an ERR_PTR, even when err is known to be 0 and calls
put_device() after a successful device_register() call.
It is likely that the return statement in the normal path is missing.
Add 'return rdev;' to fix it.
Fixes: d787dcdb9c8f ("bus: sunxi-rsb: Add driver for Allwinner Reduced Serial Bus")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Samuel Holland <samuel@sholland.org>
Tested-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/ef2b9576350bba4c8e05e669e9535e9e2a415763.1650551719.git.christophe.jaillet@wanadoo.fr
|
|
Merge series from 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>:
1.Add support for using GPIOs as chip select lines on Ingenic SoCs.
2.Add support for probing the spi-ingenic driver on the JZ4775 SoC,
the X1000 SoC, and the X2000 SoC.
3.Modify annotation texts to be more in line with the current state.
|
|
It turns out that for the CONFIG_MMU=n builds, vmalloc_huge() was never
defined, since it's defined in mm/vmalloc.c, which doesn't get built for
the no-MMU configurations.
Just implement the trivial wrapper for the no-MMU case too. In fact,
just make it an alias to the existing __vmalloc() function that has the
same signature.
Link: https://lore.kernel.org/all/CAMuHMdVdx2V1uhv_152Sw3_z2xE0spiaWp1d6Ko8-rYmAxUBAg@mail.gmail.com/
Link: https://lore.kernel.org/all/CA+G9fYscb1y4a17Sf5G_Aibt+WuSf-ks_Qjw9tYFy=A4sjCEug@mail.gmail.com/
Link: https://lore.kernel.org/all/20220425150356.GA4138752@roeck-us.net/
Reported-and-tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Explicitly disable PEC when the client does not support it.
The problematic scenario is the following. A device with enabled PEC
support is up and running and a kernel driver is loaded.
Then the driver is unloaded (or device unbound), the HW device
is reconfigured externally (e.g. by i2cset) to advertise itself as not
supporting PEC. Without a new code, at the second load of the driver
(or bind) the "flags" variable is not updated to avoid PEC usage. As a
consequence the further communication with the device is done with
the PEC enabled, which is wrong and may fail.
The implementation first disable the I2C_CLIENT_PEC flag, then the old
code enable it if needed.
Fixes: 4e5418f787ec ("hwmon: (pmbus_core) Check adapter PEC support")
Signed-off-by: Adam Wujek <dev_public@wujek.eu>
Link: https://lore.kernel.org/r/20220420145059.431061-1-dev_public@wujek.eu
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|