Age | Commit message (Collapse) | Author |
|
GCM detection logic has to change for two reasons:
-some CAAM instantiations with Era < 10, even though they have AES LP,
they now support GCM mode
-Era 10 upwards, there is a dedicated bit in AESA_VERSION[AESA_MISC]
field for GCM support
For Era 9 and earlier, all AES accelerator versions support GCM,
except for AES LP (CHAVID_LS[AESVID]=3) with revision CRNR[AESRN] < 8.
For Era 10 and later, bit 9 of the AESA_VERSION register should be used
to detect GCM support in AES accelerator.
Note: caam/qi and caam/qi2 are drivers for QI (Queue Interface), which
is used in DPAA-based SoCs; for now, we rely on CAAM having an AES HP
and this AES accelerator having support for GCM.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Reviewed-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
There is a spelling mistake in an error message in the qi_error_list
array. Fix this.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The flags field in 'struct shash_desc' never actually does anything.
The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP.
However, no shash algorithm ever sleeps, making this flag a no-op.
With this being the case, inevitably some users who can't sleep wrongly
pass MAY_SLEEP. These would all need to be fixed if any shash algorithm
actually started sleeping. For example, the shash_ahash_*() functions,
which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP
from the ahash API to the shash API. However, the shash functions are
called under kmap_atomic(), so actually they're assumed to never sleep.
Even if it turns out that some users do need preemption points while
hashing large buffers, we could easily provide a helper function
crypto_shash_update_large() which divides the data into smaller chunks
and calls crypto_shash_update() and cond_resched() for each chunk. It's
not necessary to have a flag in 'struct shash_desc', nor is it necessary
to make individual shash algorithms aware of this at all.
Therefore, remove shash_desc::flags, and document that the
crypto_shash_*() functions can be called from any context.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The nx driver uses the MAY_SLEEP flag in shash_desc::flags as an
indicator to not retry sending the operation to the hardware as many
times before returning -EBUSY. This is bogus because (1) that's not
what the MAY_SLEEP flag is for, and (2) the shash API doesn't allow
failing if the hardware is busy anyway.
For now, just make it always retry the larger number of times. This
doesn't actually fix this driver, but it at least makes it not use the
shash_desc::flags field anymore. Then this field can be removed, as no
other drivers use it.
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The crypto_yield() in shash_ahash_digest() occurs after the entire
digest operation already happened, so there's no real point. Remove it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A genpd provider that uses the ->attach_dev() callback to look up
resources for a device fails to do so when the device has multiple
PM domains attached. That is because when genpd invokes the
->attach_dev() callback, it passes the allocated virtual device as
the in-parameter.
To address this problem, simply assign the dev->of_node for the
virtual device, based upon the original device's OF node.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
It's not correct to call kfree(dev) when device_register(dev) has failed.
Fix this by calling put_device(dev) instead.
Fixes: 3c095f32a92b ("PM / Domains: Add support for multi PM domains per device to genpd")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The err_remove_chip block is too coarse, and may perform cleanup that
must not be done. E.g. if of_gpiochip_add() fails, of_gpiochip_remove()
is still called, causing:
OF: ERROR: Bad of_node_put() on /soc/gpio@e6050000
CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 5.1.0-rc2-koelsch+ #407
Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
Workqueue: events deferred_probe_work_func
[<c020ec74>] (unwind_backtrace) from [<c020ae58>] (show_stack+0x10/0x14)
[<c020ae58>] (show_stack) from [<c07c1224>] (dump_stack+0x7c/0x9c)
[<c07c1224>] (dump_stack) from [<c07c5a80>] (kobject_put+0x94/0xbc)
[<c07c5a80>] (kobject_put) from [<c0470420>] (gpiochip_add_data_with_key+0x8d8/0xa3c)
[<c0470420>] (gpiochip_add_data_with_key) from [<c0473738>] (gpio_rcar_probe+0x1d4/0x314)
[<c0473738>] (gpio_rcar_probe) from [<c052fca8>] (platform_drv_probe+0x48/0x94)
and later, if a GPIO consumer tries to use a GPIO from a failed
controller:
WARNING: CPU: 0 PID: 1 at lib/refcount.c:156 kobject_get+0x38/0x4c
refcount_t: increment on 0; use-after-free.
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc2-koelsch+ #407
Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
[<c020ec74>] (unwind_backtrace) from [<c020ae58>] (show_stack+0x10/0x14)
[<c020ae58>] (show_stack) from [<c07c1224>] (dump_stack+0x7c/0x9c)
[<c07c1224>] (dump_stack) from [<c0221580>] (__warn+0xd0/0xec)
[<c0221580>] (__warn) from [<c02215e0>] (warn_slowpath_fmt+0x44/0x6c)
[<c02215e0>] (warn_slowpath_fmt) from [<c07c58fc>] (kobject_get+0x38/0x4c)
[<c07c58fc>] (kobject_get) from [<c068b3ec>] (of_node_get+0x14/0x1c)
[<c068b3ec>] (of_node_get) from [<c0686f24>] (of_find_node_by_phandle+0xc0/0xf0)
[<c0686f24>] (of_find_node_by_phandle) from [<c0686fbc>] (of_phandle_iterator_next+0x68/0x154)
[<c0686fbc>] (of_phandle_iterator_next) from [<c0687fe4>] (__of_parse_phandle_with_args+0x40/0xd0)
[<c0687fe4>] (__of_parse_phandle_with_args) from [<c0688204>] (of_parse_phandle_with_args_map+0x100/0x3ac)
[<c0688204>] (of_parse_phandle_with_args_map) from [<c0471240>] (of_get_named_gpiod_flags+0x38/0x380)
[<c0471240>] (of_get_named_gpiod_flags) from [<c046f864>] (gpiod_get_from_of_node+0x24/0xd8)
[<c046f864>] (gpiod_get_from_of_node) from [<c0470aa4>] (devm_fwnode_get_index_gpiod_from_child+0xa0/0x144)
[<c0470aa4>] (devm_fwnode_get_index_gpiod_from_child) from [<c05f425c>] (gpio_keys_probe+0x418/0x7bc)
[<c05f425c>] (gpio_keys_probe) from [<c052fca8>] (platform_drv_probe+0x48/0x94)
Fix this by splitting the cleanup block, and adding a missing call to
gpiochip_irqchip_remove().
Fixes: 28355f81969962cf ("gpio: defer probe if pinctrl cannot be found")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Remove the check for IOMMU presence since it was considered a
layer violation.
This means we have no reliable way to destinguish between coherent
hardware IOMMU DMA address translations and incoherent SWIOTLB DMA
address translations, which we can't handle. So always presume the
former. This means that if anybody forces SWIOTLB without also setting
the vmw_force_coherent=1 vmwgfx option, driver operation will fail,
like it will on most other graphics drivers.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Mel Gorman says:
"32-bit NUMA systems should be non-existent in practice. The last NUMA
system I'm aware of that was both NUMA and 32-bit only died somewhere
between 2004 and 2007. If someone is running a 64-bit capable system in
32-bit mode with NUMA, they really are just punishing themselves for fun."
Mark DISCONTIGMEM broken for now as suggested by Christoph Hellwig,
and (hopefully) remove it in a couple of releases.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1556112252-9339-3-git-send-email-rppt@linux.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Sparsemem has been a default memory model for x86-64 for over a decade,
since:
b263295dbffd ("x86: 64-bit, make sparsemem vmemmap the only memory model").
Make it the default for 32-bit NUMA systems (if there any left) as well.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1556112252-9339-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
PC8/PC9/PC10 counters
Kaby Lake (and Coffee Lake) has PC8/PC9/PC10 residency counters.
This patch updates the list of Kaby/Coffee Lake PMU event counters
from the snb_cstates[] list of events to the hswult_cstates[]
list of events, which keeps all previously supported events and
also adds the PKG_C8, PKG_C9 and PKG_C10 residency counters.
This allows user space tools to profile them through the perf interface.
Signed-off-by: Harry Pan <harry.pan@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: gs0622@gmail.com
Link: http://lkml.kernel.org/r/20190424145033.1924-1-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pull networking fixes from David Miller:
"Just the usual assortment of small'ish fixes:
1) Conntrack timeout is sometimes not initialized properly, from
Alexander Potapenko.
2) Add a reasonable range limit to tcp_min_rtt_wlen to avoid
undefined behavior. From ZhangXiaoxu.
3) des1 field of descriptor in stmmac driver is initialized with the
wrong variable. From Yue Haibing.
4) Increase mlxsw pci sw reset timeout a little bit more, from Ido
Schimmel.
5) Match IOT2000 stmmac devices more accurately, from Su Bao Cheng.
6) Fallback refcount fix in TLS code, from Jakub Kicinski.
7) Fix max MTU check when using XDP in mlx5, from Maxim Mikityanskiy.
8) Fix recursive locking in team driver, from Hangbin Liu.
9) Fix tls_set_device_offload_Rx() deadlock, from Jakub Kicinski.
10) Don't use napi_alloc_frag() outside of softiq context of socionext
driver, from Ilias Apalodimas.
11) MAC address increment overflow in ncsi, from Tao Ren.
12) Fix a regression in 8K/1M pool switching of RDS, from Zhu Yanjun.
13) ipv4_link_failure has to validate the headers that are actually
there because RAW sockets can pass in arbitrary garbage, from Eric
Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
ipv4: add sanity checks in ipv4_link_failure()
net/rose: fix unbound loop in rose_loopback_timer()
rxrpc: fix race condition in rxrpc_input_packet()
net: rds: exchange of 8K and 1M pool
net: vrf: Fix operation not supported when set vrf mac
net/ncsi: handle overflow when incrementing mac address
net: socionext: replace napi_alloc_frag with the netdev variant on init
net: atheros: fix spelling mistake "underun" -> "underrun"
spi: ST ST95HF NFC: declare missing of table
spi: Micrel eth switch: declare missing of table
net: stmmac: move stmmac_check_ether_addr() to driver probe
netfilter: fix nf_l4proto_log_invalid to log invalid packets
netfilter: never get/set skb->tstamp
netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ON
Documentation: decnet: remove reference to CONFIG_DECNET_ROUTE_FWMARK
dt-bindings: add an explanation for internal phy-mode
net/tls: don't leak IV and record seq when offload fails
net/tls: avoid potential deadlock in tls_set_device_offload_rx()
selftests/net: correct the return value for run_afpackettests
team: fix possible recursive locking when add slaves
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED update from Jacek Anaszewski:
"A single change to MAINTAINERS:
We announce a new LED reviewer - Dan Murphy"
* tag 'leds-for-5.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
MAINTAINERS: LEDs: Add designated reviewer for LED subsystem
|
|
Before calling __ip_options_compile(), we need to ensure the network
header is a an IPv4 one, and that it is already pulled in skb->head.
RAW sockets going through a tunnel can end up calling ipv4_link_failure()
with total garbage in the skb, or arbitrary lengthes.
syzbot report :
BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:355 [inline]
BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x294/0x1120 net/ipv4/ip_options.c:123
Write of size 69 at addr ffff888096abf068 by task syz-executor.4/9204
CPU: 0 PID: 9204 Comm: syz-executor.4 Not tainted 5.1.0-rc5+ #77
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
check_memory_region_inline mm/kasan/generic.c:185 [inline]
check_memory_region+0x123/0x190 mm/kasan/generic.c:191
memcpy+0x38/0x50 mm/kasan/common.c:133
memcpy include/linux/string.h:355 [inline]
__ip_options_echo+0x294/0x1120 net/ipv4/ip_options.c:123
__icmp_send+0x725/0x1400 net/ipv4/icmp.c:695
ipv4_link_failure+0x29f/0x550 net/ipv4/route.c:1204
dst_link_failure include/net/dst.h:427 [inline]
vti6_xmit net/ipv6/ip6_vti.c:514 [inline]
vti6_tnl_xmit+0x10d4/0x1c0c net/ipv6/ip6_vti.c:553
__netdev_start_xmit include/linux/netdevice.h:4414 [inline]
netdev_start_xmit include/linux/netdevice.h:4423 [inline]
xmit_one net/core/dev.c:3292 [inline]
dev_hard_start_xmit+0x1b2/0x980 net/core/dev.c:3308
__dev_queue_xmit+0x271d/0x3060 net/core/dev.c:3878
dev_queue_xmit+0x18/0x20 net/core/dev.c:3911
neigh_direct_output+0x16/0x20 net/core/neighbour.c:1527
neigh_output include/net/neighbour.h:508 [inline]
ip_finish_output2+0x949/0x1740 net/ipv4/ip_output.c:229
ip_finish_output+0x73c/0xd50 net/ipv4/ip_output.c:317
NF_HOOK_COND include/linux/netfilter.h:278 [inline]
ip_output+0x21f/0x670 net/ipv4/ip_output.c:405
dst_output include/net/dst.h:444 [inline]
NF_HOOK include/linux/netfilter.h:289 [inline]
raw_send_hdrinc net/ipv4/raw.c:432 [inline]
raw_sendmsg+0x1d2b/0x2f20 net/ipv4/raw.c:663
inet_sendmsg+0x147/0x5d0 net/ipv4/af_inet.c:798
sock_sendmsg_nosec net/socket.c:651 [inline]
sock_sendmsg+0xdd/0x130 net/socket.c:661
sock_write_iter+0x27c/0x3e0 net/socket.c:988
call_write_iter include/linux/fs.h:1866 [inline]
new_sync_write+0x4c7/0x760 fs/read_write.c:474
__vfs_write+0xe4/0x110 fs/read_write.c:487
vfs_write+0x20c/0x580 fs/read_write.c:549
ksys_write+0x14f/0x2d0 fs/read_write.c:599
__do_sys_write fs/read_write.c:611 [inline]
__se_sys_write fs/read_write.c:608 [inline]
__x64_sys_write+0x73/0xb0 fs/read_write.c:608
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458c29
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f293b44bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458c29
RDX: 0000000000000014 RSI: 00000000200002c0 RDI: 0000000000000003
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f293b44c6d4
R13: 00000000004c8623 R14: 00000000004ded68 R15: 00000000ffffffff
The buggy address belongs to the page:
page:ffffea00025aafc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x1fffc0000000000()
raw: 01fffc0000000000 0000000000000000 ffffffff025a0101 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888096abef80: 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 f2
ffff888096abf000: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
>ffff888096abf080: 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
^
ffff888096abf100: 00 00 00 00 f1 f1 f1 f1 00 00 f3 f3 00 00 00 00
ffff888096abf180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds a limit on the number of skbs that fuzzers can queue
into loopback_queue. 1000 packets for rose loopback seems more than enough.
Then, since we now have multiple cpus in most linux hosts,
we also need to limit the number of skbs rose_loopback_timer()
can dequeue at each round.
rose_loopback_queue() can be drop-monitor friendly, calling
consume_skb() or kfree_skb() appropriately.
Finally, use mod_timer() instead of del_timer() + add_timer()
syzbot report was :
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-...!: (10499 ticks this GP) idle=536/1/0x4000000000000002 softirq=103291/103291 fqs=34
rcu: (t=10500 jiffies g=140321 q=323)
rcu: rcu_preempt kthread starved for 10426 jiffies! g140321 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
rcu: RCU grace-period kthread stack dump:
rcu_preempt I29168 10 2 0x80000000
Call Trace:
context_switch kernel/sched/core.c:2877 [inline]
__schedule+0x813/0x1cc0 kernel/sched/core.c:3518
schedule+0x92/0x180 kernel/sched/core.c:3562
schedule_timeout+0x4db/0xfd0 kernel/time/timer.c:1803
rcu_gp_fqs_loop kernel/rcu/tree.c:1971 [inline]
rcu_gp_kthread+0x962/0x17b0 kernel/rcu/tree.c:2128
kthread+0x357/0x430 kernel/kthread.c:253
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
NMI backtrace for cpu 0
CPU: 0 PID: 7632 Comm: kworker/0:4 Not tainted 5.1.0-rc5+ #172
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events iterate_cleanup_work
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
nmi_cpu_backtrace.cold+0x63/0xa4 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x1be/0x236 lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
rcu_dump_cpu_stacks+0x183/0x1cf kernel/rcu/tree.c:1223
print_cpu_stall kernel/rcu/tree.c:1360 [inline]
check_cpu_stall kernel/rcu/tree.c:1434 [inline]
rcu_pending kernel/rcu/tree.c:3103 [inline]
rcu_sched_clock_irq.cold+0x500/0xa4a kernel/rcu/tree.c:2544
update_process_times+0x32/0x80 kernel/time/timer.c:1635
tick_sched_handle+0xa2/0x190 kernel/time/tick-sched.c:161
tick_sched_timer+0x47/0x130 kernel/time/tick-sched.c:1271
__run_hrtimer kernel/time/hrtimer.c:1389 [inline]
__hrtimer_run_queues+0x33e/0xde0 kernel/time/hrtimer.c:1451
hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1509
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1035 [inline]
smp_apic_timer_interrupt+0x120/0x570 arch/x86/kernel/apic/apic.c:1060
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x50 kernel/kcov.c:95
Code: 89 25 b4 6e ec 08 41 bc f4 ff ff ff e8 cd 5d ea ff 48 c7 05 9e 6e ec 08 00 00 00 00 e9 a4 e9 ff ff 90 90 90 90 90 90 90 90 90 <55> 48 89 e5 48 8b 75 08 65 48 8b 04 25 00 ee 01 00 65 8b 15 c8 60
RSP: 0018:ffff8880ae807ce0 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
RAX: ffff88806fd40640 RBX: dffffc0000000000 RCX: ffffffff863fbc56
RDX: 0000000000000100 RSI: ffffffff863fbc1d RDI: ffff88808cf94228
RBP: ffff8880ae807d10 R08: ffff88806fd40640 R09: ffffed1015d00f8b
R10: ffffed1015d00f8a R11: 0000000000000003 R12: ffff88808cf941c0
R13: 00000000fffff034 R14: ffff8882166cd840 R15: 0000000000000000
rose_loopback_timer+0x30d/0x3f0 net/rose/rose_loopback.c:91
call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
expire_timers kernel/time/timer.c:1362 [inline]
__run_timers kernel/time/timer.c:1681 [inline]
__run_timers kernel/time/timer.c:1649 [inline]
run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
__do_softirq+0x266/0x95a kernel/softirq.c:293
do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1027
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After commit 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook"),
rxrpc_input_packet() is directly called from lockless UDP receive
path, under rcu_read_lock() protection.
It must therefore use RCU rules :
- udp_sk->sk_user_data can be cleared at any point in this function.
rcu_dereference_sk_user_data() is what we need here.
- Also, since sk_user_data might have been set in rxrpc_open_socket()
we must observe a proper RCU grace period before kfree(local) in
rxrpc_lookup_local()
v4: @local can be NULL in xrpc_lookup_local() as reported by kbuild test robot <lkp@intel.com>
and Julia Lawall <julia.lawall@lip6.fr>, thanks !
v3,v2 : addressed David Howells feedback, thanks !
syzbot reported :
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 19236 Comm: syz-executor703 Not tainted 5.1.0-rc6 #79
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__lock_acquire+0xbef/0x3fb0 kernel/locking/lockdep.c:3573
Code: 00 0f 85 a5 1f 00 00 48 81 c4 10 01 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <80> 3c 02 00 0f 85 4a 21 00 00 49 81 7d 00 20 54 9c 89 0f 84 cf f4
RSP: 0018:ffff88809d7aef58 EFLAGS: 00010002
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000026 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffff88809d7af090 R08: 0000000000000001 R09: 0000000000000001
R10: ffffed1015d05bc7 R11: ffff888089428600 R12: 0000000000000000
R13: 0000000000000130 R14: 0000000000000001 R15: 0000000000000001
FS: 00007f059044d700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b6040 CR3: 00000000955ca000 CR4: 00000000001406f0
Call Trace:
lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4211
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x95/0xcd kernel/locking/spinlock.c:152
skb_queue_tail+0x26/0x150 net/core/skbuff.c:2972
rxrpc_reject_packet net/rxrpc/input.c:1126 [inline]
rxrpc_input_packet+0x4a0/0x5536 net/rxrpc/input.c:1414
udp_queue_rcv_one_skb+0xaf2/0x1780 net/ipv4/udp.c:2011
udp_queue_rcv_skb+0x128/0x730 net/ipv4/udp.c:2085
udp_unicast_rcv_skb.isra.0+0xb9/0x360 net/ipv4/udp.c:2245
__udp4_lib_rcv+0x701/0x2ca0 net/ipv4/udp.c:2301
udp_rcv+0x22/0x30 net/ipv4/udp.c:2482
ip_protocol_deliver_rcu+0x60/0x8f0 net/ipv4/ip_input.c:208
ip_local_deliver_finish+0x23b/0x390 net/ipv4/ip_input.c:234
NF_HOOK include/linux/netfilter.h:289 [inline]
NF_HOOK include/linux/netfilter.h:283 [inline]
ip_local_deliver+0x1e9/0x520 net/ipv4/ip_input.c:255
dst_input include/net/dst.h:450 [inline]
ip_rcv_finish+0x1e1/0x300 net/ipv4/ip_input.c:413
NF_HOOK include/linux/netfilter.h:289 [inline]
NF_HOOK include/linux/netfilter.h:283 [inline]
ip_rcv+0xe8/0x3f0 net/ipv4/ip_input.c:523
__netif_receive_skb_one_core+0x115/0x1a0 net/core/dev.c:4987
__netif_receive_skb+0x2c/0x1c0 net/core/dev.c:5099
netif_receive_skb_internal+0x117/0x660 net/core/dev.c:5202
napi_frags_finish net/core/dev.c:5769 [inline]
napi_gro_frags+0xade/0xd10 net/core/dev.c:5843
tun_get_user+0x2f24/0x3fb0 drivers/net/tun.c:1981
tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2027
call_write_iter include/linux/fs.h:1866 [inline]
do_iter_readv_writev+0x5e1/0x8e0 fs/read_write.c:681
do_iter_write fs/read_write.c:957 [inline]
do_iter_write+0x184/0x610 fs/read_write.c:938
vfs_writev+0x1b3/0x2f0 fs/read_write.c:1002
do_writev+0x15e/0x370 fs/read_write.c:1037
__do_sys_writev fs/read_write.c:1110 [inline]
__se_sys_writev fs/read_write.c:1107 [inline]
__x64_sys_writev+0x75/0xb0 fs/read_write.c:1107
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
CONFIG_INIT_STACK_ALL turns on stack initialization based on
-ftrivial-auto-var-init in Clang builds, which has greater coverage
than CONFIG_GCC_PLUGINS_STRUCTLEAK_BYREF_ALL.
-ftrivial-auto-var-init Clang option provides trivial initializers for
uninitialized local variables, variable fields and padding.
It has three possible values:
pattern - uninitialized locals are filled with a fixed pattern
(mostly 0xAA on 64-bit platforms, see https://reviews.llvm.org/D54604
for more details, but 0x000000AA for 32-bit pointers) likely to cause
crashes when uninitialized value is used;
zero (it's still debated whether this flag makes it to the official
Clang release) - uninitialized locals are filled with zeroes;
uninitialized (default) - uninitialized locals are left intact.
This patch uses only the "pattern" mode when CONFIG_INIT_STACK_ALL is
enabled.
Developers have the possibility to opt-out of this feature on a
per-variable basis by using __attribute__((uninitialized)), but such
use should be well justified in comments.
Co-developed-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Alexander Potapenko <glider@google.com>
Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
This moves the stackleak plugin options to Kconfig.hardening's memory
initialization menu.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Alexander Popov <alex.popov@linux.com>
Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Right now kernel hardening options are scattered around various Kconfig
files. This can be a central place to collect these kinds of options
going forward. This is initially populated with the memory initialization
options from the gcc-plugins.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Add a designated reviewer for the LED subsystem as there
are already two maintainers assigned.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
|
|
Before the commit 490ea5967b0d ("RDS: IB: move FMR code to its own file"),
when the dirty_count is greater than 9/10 of max_items of 8K pool,
1M pool is used, Vice versa. After the commit 490ea5967b0d ("RDS: IB: move
FMR code to its own file"), the above is removed. When we make the
following tests.
Server:
rds-stress -r 1.1.1.16 -D 1M
Client:
rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M
The following will appear.
"
connecting to 1.1.1.16:4000
negotiated options, tasks will start in 2 seconds
Starting up..header from 1.1.1.166:4001 to id 4001 bogus
..
tsks tx/s rx/s tx+rx K/s mbi K/s mbo K/s tx us/c rtt us
cpu %
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
...
"
So this exchange between 8K and 1M pool is added back.
Fixes: commit 490ea5967b0d ("RDS: IB: move FMR code to its own file")
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vrf device is not able to change mac address now because lack of
ndo_set_mac_address. Complete this in case some apps need to do
this.
Reported-by: Hui Wang <wanghui104@huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
regmap_update_bits could fail and deserves a check.
The patch adds the checks and if it fails, returns its error
code upstream.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
CIFS can leak pages reference gotten through GUP (get_user_pages*()
through iov_iter_get_pages()). This happen if cifs_send_async_read()
or cifs_write_from_iter() calls fail from within __cifs_readv() and
__cifs_writev() respectively. This patch move page unreference to
cifs_aio_ctx_release() which will happens on all code paths this is
all simpler to follow for correctness.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
A path-based rename returning EBUSY will incorrectly try opening
the file with a cifs (NT Create AndX) operation on an smb2+ mount,
which causes the server to force a session close.
If the mount is smb2+, skip the fallback.
Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
|
|
Commit 088aaf17aa79300cab14dbee2569c58cfafd7d6e introduced a leak where
if SMB2_read() returned an error we would return without freeing the
request buffer.
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
CACHE_SYNC branch of run_cache_set
In the CACHE_SYNC branch of run_cache_set(), LIST_HEAD(journal) is used
to collect journal_replay(s) and filled by bch_journal_read().
If all goes well, bch_journal_replay() will release the list of
jounal_replay(s) at the end of the branch.
If something goes wrong, code flow will jump to the label "err:" and leave
the list unreleased.
This patch will release the list of journal_replay(s) in the case of
error detected.
v1 -> v2:
* Move the release code to the location after label 'err:' to
simply the change.
Signed-off-by: Shenghui Wang <shhuiw@foxmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
of btree_gc_coalesce
Elements of keylist should be accessed before the list is freed.
Move bch_keylist_free() calling after the while loop to avoid wrong
content accessed.
Signed-off-by: Shenghui Wang <shhuiw@foxmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
journal replay failed with messages:
Sep 10 19:10:43 ceph kernel: bcache: error on
bb379a64-e44e-4812-b91d-a5599871a3b1: bcache: journal entries
2057493-2057567 missing! (replaying 2057493-2076601), disabling
caching
The reason is in journal_reclaim(), when discard is enabled, we send
discard command and reclaim those journal buckets whose seq is old
than the last_seq_now, but before we write a journal with last_seq_now,
the machine is restarted, so the journal with the last_seq_now is not
written to the journal bucket, and the last_seq_wrote in the newest
journal is old than last_seq_now which we expect to be, so when we doing
replay, journals from last_seq_wrote to last_seq_now are missing.
It's hard to write a journal immediately after journal_reclaim(),
and it harmless if those missed journal are caused by discarding
since those journals are already wrote to btree node. So, if miss
seqs are started from the beginning journal, we treat it as normal,
and only print a message to show the miss journal, and point out
it maybe caused by discarding.
Patch v2 add a judgement condition to ignore the missed journal
only when discard enabled as Coly suggested.
(Coly Li: rebase the patch with other changes in bch_journal_replay())
Signed-off-by: Tang Junhui <tang.junhui.linux@gmail.com>
Tested-by: Dennis Schridde <devurandom@gmx.net>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This patch tries to release mutex bch_register_lock early, to give
chance to stop cache set and bcache device early.
This patch also expends time out of stopping all bcache device from
2 seconds to 10 seconds, because stopping writeback rate update worker
may delay for 5 seconds, 2 seconds is not enough.
After this patch applied, stopping bcache devices during system reboot
or shutdown is very hard to be observed any more.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add code comments to explain which call back function might be called
for the closure_queue(). This is an effort to make code to be more
understandable for readers.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add comments to explain why in register_bcache() blkdev_put() won't
be called in two location. Add comments to explain why blkdev_put()
must be called in register_cache() when cache_alloc() failed.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This patch adds return value to register_bdev(). Then if failure happens
inside register_bdev(), its caller register_bcache() may detect and
handle the failure more properly.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When failure happens inside bch_journal_replay(), calling
cache_set_err_on() and handling the failure in async way is not a good
idea. Because after bch_journal_replay() returns, registering code will
continue to execute following steps, and unregistering code triggered
by cache_set_err_on() is running in same time. First it is unnecessary
to handle failure and unregister cache set in an async way, second there
might be potential race condition to run register and unregister code
for same cache set.
So in this patch, if failure happens in bch_journal_replay(), we don't
call cache_set_err_on(), and just print out the same error message to
kernel message buffer, then return -EIO immediately caller. Then caller
can detect such failure and handle it in synchrnozied way.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Bcache has several routines to release resources in implicit way, they
are called when the associated kobj released. This patch adds code
comments to notice when and which release callback will be called,
- When dc->disk.kobj released:
void bch_cached_dev_release(struct kobject *kobj)
- When d->kobj released:
void bch_flash_dev_release(struct kobject *kobj)
- When c->kobj released:
void bch_cache_set_release(struct kobject *kobj)
- When ca->kobj released
void bch_cache_release(struct kobject *kobj)
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently run_cache_set() has no return value, if there is failure in
bch_journal_replay(), the caller of run_cache_set() has no idea about
such failure and just continue to execute following code after
run_cache_set(). The internal failure is triggered inside
bch_journal_replay() and being handled in async way. This behavior is
inefficient, while failure handling inside bch_journal_replay(), cache
register code is still running to start the cache set. Registering and
unregistering code running as same time may introduce some rare race
condition, and make the code to be more hard to be understood.
This patch adds return value to run_cache_set(), and returns -EIO if
bch_journal_rreplay() fails. Then caller of run_cache_set() may detect
such failure and stop registering code flow immedidately inside
register_cache_set().
If journal replay fails, run_cache_set() can report error immediately
to register_cache_set(). This patch makes the failure handling for
bch_journal_replay() be in synchronized way, easier to understand and
debug, and avoid poetential race condition for register-and-unregister
in same time.
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In journal_reclaim() ja->cur_idx of each cache will be update to
reclaim available journal buckets. Variable 'int n' is used to count how
many cache is successfully reclaimed, then n is set to c->journal.key
by SET_KEY_PTRS(). Later in journal_write_unlocked(), a for_each_cache()
loop will write the jset data onto each cache.
The problem is, if all jouranl buckets on each cache is full, the
following code in journal_reclaim(),
529 for_each_cache(ca, c, iter) {
530 struct journal_device *ja = &ca->journal;
531 unsigned int next = (ja->cur_idx + 1) % ca->sb.njournal_buckets;
532
533 /* No space available on this device */
534 if (next == ja->discard_idx)
535 continue;
536
537 ja->cur_idx = next;
538 k->ptr[n++] = MAKE_PTR(0,
539 bucket_to_sector(c, ca->sb.d[ja->cur_idx]),
540 ca->sb.nr_this_dev);
541 }
542
543 bkey_init(k);
544 SET_KEY_PTRS(k, n);
If there is no available bucket to reclaim, the if() condition at line
534 will always true, and n remains 0. Then at line 544, SET_KEY_PTRS()
will set KEY_PTRS field of c->journal.key to 0.
Setting KEY_PTRS field of c->journal.key to 0 is wrong. Because in
journal_write_unlocked() the journal data is written in following loop,
649 for (i = 0; i < KEY_PTRS(k); i++) {
650-671 submit journal data to cache device
672 }
If KEY_PTRS field is set to 0 in jouranl_reclaim(), the journal data
won't be written to cache device here. If system crahed or rebooted
before bkeys of the lost journal entries written into btree nodes, data
corruption will be reported during bcache reload after rebooting the
system.
Indeed there is only one cache in a cache set, there is no need to set
KEY_PTRS field in journal_reclaim() at all. But in order to keep the
for_each_cache() logic consistent for now, this patch fixes the above
problem by not setting 0 KEY_PTRS of journal key, if there is no bucket
available to reclaim.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
'int ret' is defined as a local variable inside macro read_bucket().
Since this macro is called multiple times, and following patches will
use a 'int ret' variable in bch_journal_read(), this patch moves
definition of 'int ret' from macro read_bucket() to range of function
bch_journal_read().
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is a race between cache device register and cache set unregister.
For an already registered cache device, register_bcache will call
bch_is_open to iterate through all cachesets and check every cache
there. The race occurs if cache_set_free executes at the same time and
clears the caches right before ca is dereferenced in bch_is_open_cache.
To close the race, let's make sure the clean up work is protected by
the bch_register_lock as well.
This issue can be reproduced as follows,
while true; do echo /dev/XXX> /sys/fs/bcache/register ; done&
while true; do echo 1> /sys/block/XXX/bcache/set/unregister ; done &
and results in the following oops,
[ +0.000053] BUG: unable to handle kernel NULL pointer dereference at 0000000000000998
[ +0.000457] #PF error: [normal kernel read fault]
[ +0.000464] PGD 800000003ca9d067 P4D 800000003ca9d067 PUD 3ca9c067 PMD 0
[ +0.000388] Oops: 0000 [#1] SMP PTI
[ +0.000269] CPU: 1 PID: 3266 Comm: bash Not tainted 5.0.0+ #6
[ +0.000346] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014
[ +0.000472] RIP: 0010:register_bcache+0x1829/0x1990 [bcache]
[ +0.000344] Code: b0 48 83 e8 50 48 81 fa e0 e1 10 c0 0f 84 a9 00 00 00 48 89 c6 48 89 ca 0f b7 ba 54 04 00 00 4c 8b 82 60 0c 00 00 85 ff 74 2f <49> 3b a8 98 09 00 00 74 4e 44 8d 47 ff 31 ff 49 c1 e0 03 eb 0d
[ +0.000839] RSP: 0018:ffff92ee804cbd88 EFLAGS: 00010202
[ +0.000328] RAX: ffffffffc010e190 RBX: ffff918b5c6b5000 RCX: ffff918b7d8e0000
[ +0.000399] RDX: ffff918b7d8e0000 RSI: ffffffffc010e190 RDI: 0000000000000001
[ +0.000398] RBP: ffff918b7d318340 R08: 0000000000000000 R09: ffffffffb9bd2d7a
[ +0.000385] R10: ffff918b7eb253c0 R11: ffffb95980f51200 R12: ffffffffc010e1a0
[ +0.000411] R13: fffffffffffffff2 R14: 000000000000000b R15: ffff918b7e232620
[ +0.000384] FS: 00007f955bec2740(0000) GS:ffff918b7eb00000(0000) knlGS:0000000000000000
[ +0.000420] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000801] CR2: 0000000000000998 CR3: 000000003cad6000 CR4: 00000000001406e0
[ +0.000837] Call Trace:
[ +0.000682] ? _cond_resched+0x10/0x20
[ +0.000691] ? __kmalloc+0x131/0x1b0
[ +0.000710] kernfs_fop_write+0xfa/0x170
[ +0.000733] __vfs_write+0x2e/0x190
[ +0.000688] ? inode_security+0x10/0x30
[ +0.000698] ? selinux_file_permission+0xd2/0x120
[ +0.000752] ? security_file_permission+0x2b/0x100
[ +0.000753] vfs_write+0xa8/0x1a0
[ +0.000676] ksys_write+0x4d/0xb0
[ +0.000699] do_syscall_64+0x3a/0xf0
[ +0.000692] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There are a few nits in this function. They could in theory all
be separate patches, but that's probably taking small commits
too far.
1) I added a brief comment saying what it does.
2) I like to declare pointer parameters "const" where possible
for documentation reasons.
3) It uses bitmap_weight(&rand, BITS_PER_LONG) to compute the Hamming
weight of a 32-bit random number (giving a random integer with
mean 16 and variance 8). Passing by reference in a 64-bit variable
is silly; just use hweight32().
4) Its helper function fract_exp_two is unnecessarily tangled.
Gcc can optimize the multiply by (1 << x) to a shift, but it can
be written in a much more straightforward way at the cost of one
more bit of internal precision. Some analysis reveals that this
bit is always available.
This shrinks the object code for fract_exp_two(x, 6) from 23 bytes:
0000000000000000 <foo1>:
0: 89 f9 mov %edi,%ecx
2: c1 e9 06 shr $0x6,%ecx
5: b8 01 00 00 00 mov $0x1,%eax
a: d3 e0 shl %cl,%eax
c: 83 e7 3f and $0x3f,%edi
f: d3 e7 shl %cl,%edi
11: c1 ef 06 shr $0x6,%edi
14: 01 f8 add %edi,%eax
16: c3 retq
To 19:
0000000000000017 <foo2>:
17: 89 f8 mov %edi,%eax
19: 83 e0 3f and $0x3f,%eax
1c: 83 c0 40 add $0x40,%eax
1f: 89 f9 mov %edi,%ecx
21: c1 e9 06 shr $0x6,%ecx
24: d3 e0 shl %cl,%eax
26: c1 e8 06 shr $0x6,%eax
29: c3 retq
(Verified with 0 <= frac_bits <= 8, 0 <= x < 16<<frac_bits;
both versions produce the same output.)
5) And finally, the call to bch_get_congested() in check_should_bypass()
is separated from the use of the value by multiple tests which
could moot the need to compute it. Move the computation down to
where it's needed. This also saves a local register to hold the
computed value.
Signed-off-by: George Spelvin <lkml@sdf.org>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This patch uses kmemdup_nul to create a NUL-terminated string from
dc->sb.label. This is better than open coding it.
With this, we can move env[2] initialization into env[] array to make
code more elegant.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
clang has identified a code path in which it thinks a
variable may be unused:
drivers/md/bcache/alloc.c:333:4: error: variable 'bucket' is used uninitialized whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]
fifo_pop(&ca->free_inc, bucket);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/md/bcache/util.h:219:27: note: expanded from macro 'fifo_pop'
#define fifo_pop(fifo, i) fifo_pop_front(fifo, (i))
^~~~~~~~~~~~~~~~~~~~~~~~~
drivers/md/bcache/util.h:189:6: note: expanded from macro 'fifo_pop_front'
if (_r) { \
^~
drivers/md/bcache/alloc.c:343:46: note: uninitialized use occurs here
allocator_wait(ca, bch_allocator_push(ca, bucket));
^~~~~~
drivers/md/bcache/alloc.c:287:7: note: expanded from macro 'allocator_wait'
if (cond) \
^~~~
drivers/md/bcache/alloc.c:333:4: note: remove the 'if' if its condition is always true
fifo_pop(&ca->free_inc, bucket);
^
drivers/md/bcache/util.h:219:27: note: expanded from macro 'fifo_pop'
#define fifo_pop(fifo, i) fifo_pop_front(fifo, (i))
^
drivers/md/bcache/util.h:189:2: note: expanded from macro 'fifo_pop_front'
if (_r) { \
^
drivers/md/bcache/alloc.c:331:15: note: initialize the variable 'bucket' to silence this warning
long bucket;
^
This cannot happen in practice because we only enter the loop
if there is at least one element in the list.
Slightly rearranging the code makes this clearer to both the
reader and the compiler, which avoids the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
To get the amount of unused buckets in sysfs_priority_stats, the code
count the buckets which GC_SECTORS_USED is zero. It's correct and should
not be overwritten by the count of buckets which prio is zero.
Signed-off-by: Guoju Fang <fangguoju@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The bio from upper layer is considered completed when bio_complete()
returns. In most scenarios bio_complete() is called in search_free(),
but when read miss happens, the bio_compete() is called when backing
device reading completed, while the struct search is still in use until
cache inserting finished.
If someone stops the bcache device just then, the device may be closed
and released, but after cache inserting finished the struct search will
access a freed struct cached_dev.
This patch add the reference of bcache device before bio_complete() when
read miss happens, and put it after the search is not used.
Signed-off-by: Guoju Fang <fangguoju@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The RMI4 function structure has been released in rmi_register_function
if error occurs. However, it will be released again in the function
rmi_create_function, which may result in a double-free bug.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
When this code was consolidated the intention was that the VMA would
become backed by anonymous zero pages after the zap_vma_pte - however this
very subtly relied on setting the vm_ops = NULL and clearing the VM_SHARED
bits to transform the VMA into an anonymous VMA. Since the vm_ops was
removed this broke.
Now userspace gets a SIGBUS if it touches the vma after disassociation.
Instead of converting the VMA to anonymous provide a fault handler that
puts a zero'd page into the VMA when user-space touches it after
disassociation.
Cc: stable@vger.kernel.org
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Fixes: 5f9794dc94f5 ("RDMA/ucontext: Add a core API for mmaping driver IO memory")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The SNVS power key is not only used on i.MX6SX and i.MX7D, it is also
used by i.MX6UL and NXP's latest ARMv8 based i.MX8M series SOC. So
update the config dependency to use ARCH_MXC, and add the COMPILE_TEST
too.
Signed-off-by: Jacky Bai <ping.bai@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Since mlx5 supports device disassociate it must use this API for all
BAR page mmaps, otherwise the pages can remain mapped after the device
is unplugged causing a system crash.
Cc: stable@vger.kernel.org
Fixes: 5f9794dc94f5 ("RDMA/ucontext: Add a core API for mmaping driver IO memory")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
|
The intent of this VMA was to be read-only from user space, but the
VM_MAYWRITE masking was missed, so mprotect could make it writable.
Cc: stable@vger.kernel.org
Fixes: 5c99eaecb1fc ("IB/mlx5: Mmap the HCA's clock info to user-space")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|