Age | Commit message (Collapse) | Author |
|
SCSI restarts its queue in scsi_end_request() automatically, so we don't
need to handle this case in blk-mq.
Especailly any request won't be dequeued in this case, we needn't to
worry about IO hang caused by restart vs. dispatch.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now restart is used in the following cases, and TAG_SHARED is for
SCSI only.
1) .get_budget() returns BLK_STS_RESOURCE
- if resource in target/host level isn't satisfied, this SCSI device
will be added in shost->starved_list, and the whole queue will be rerun
(via SCSI's built-in RESTART) in scsi_end_request() after any request
initiated from this host/targe is completed. Forget to mention, host level
resource can't be an issue for blk-mq at all.
- the same is true if resource in the queue level isn't satisfied.
- if there isn't outstanding request on this queue, then SCSI's RESTART
can't work(blk-mq's can't work too), and the queue will be run after
SCSI_QUEUE_DELAY, and finally all starved sdevs will be handled by SCSI's
RESTART when this request is finished
2) scsi_dispatch_cmd() returns BLK_STS_RESOURCE
- if there isn't onprogressing request on this queue, the queue
will be run after SCSI_QUEUE_DELAY
- otherwise, SCSI's RESTART covers the rerun.
3) blk_mq_get_driver_tag() failed
- BLK_MQ_S_TAG_WAITING covers the cross-queue RESTART for driver
allocation.
In one word, SCSI's built-in RESTART is enough to cover the queue
rerun, and we don't need to pay special attention to TAG_SHARED wrt. restart.
In my test on scsi_debug(8 luns), this patch improves IOPS by 20% ~ 30% when
running I/O on these 8 luns concurrently.
Aslo Roman Pen reported the current RESTART is very expensive especialy
when there are lots of LUNs attached in one host, such as in his
test, RESTART causes half of IOPS be cut.
Fixes: https://marc.info/?l=linux-kernel&m=150832216727524&w=2
Fixes: 6d8c6c0f97ad ("blk-mq: Restart a single queue if tag sets are shared")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We need to tell blk-mq to reserve resources before queuing one request,
so implement these two callbacks. Then blk-mq can avoid to dequeue
request too early, and IO merging can be improved a lot.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In the following patch, we will implement scsi_get_budget()
which need to call scsi_prep_state_check() when rq isn't
dequeued yet.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
SCSI devices use host-wide tagset, and the shared driver tag space is
often quite big. However, there is also a queue depth for each lun(
.cmd_per_lun), which is often small, for example, on both lpfc and
qla2xxx, .cmd_per_lun is just 3.
So lots of requests may stay in sw queue, and we always flush all
belonging to same hw queue and dispatch them all to driver.
Unfortunately it is easy to cause queue busy because of the small
.cmd_per_lun. Once these requests are flushed out, they have to stay in
hctx->dispatch, and no bio merge can happen on these requests, and
sequential IO performance is harmed.
This patch introduces blk_mq_dequeue_from_ctx for dequeuing a request
from a sw queue, so that we can dispatch them in scheduler's way. We can
then avoid dequeueing too many requests from sw queue, since we don't
flush ->dispatch completely.
This patch improves dispatching from sw queue by using the .get_budget
and .put_budget callbacks.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For SCSI devices, there is often a per-request-queue depth, which needs
to be respected before queuing one request.
Currently blk-mq always dequeues the request first, then calls
.queue_rq() to dispatch the request to lld. One obvious issue with this
approach is that I/O merging may not be successful, because when the
per-request-queue depth can't be respected, .queue_rq() has to return
BLK_STS_RESOURCE, and then this request has to stay in hctx->dispatch
list. This means it never gets a chance to be merged with other IO.
This patch introduces .get_budget and .put_budget callback in blk_mq_ops,
then we can try to get reserved budget first before dequeuing request.
If the budget for queueing I/O can't be satisfied, we don't need to
dequeue request at all. Hence the request can be left in the IO
scheduler queue, for more merging opportunities.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There may be request in sw queue, and not fetched to domain queue
yet, so check it in kyber_has_work().
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For blk-mq, we need to be able to iterate software queues starting
from any queue in a round robin fashion, so introduce this helper.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
So that it becomes easy to support to dispatch from sw queue in the
following patch.
No functional change.
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Suggested-by: Christoph Hellwig <hch@lst.de> # for simplifying dispatch logic
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When the hw queue is busy, we shouldn't take requests from the scheduler
queue any more, otherwise it is difficult to do IO merge.
This patch fixes the awful IO performance on some SCSI devices(lpfc,
qla2xxx, ...) when mq-deadline/kyber is used by not taking requests if
hw queue is busy.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Just like the CTO timeout calculation introduced recently, the DTO
timeout calculation was incorrect. It used "bus_hz" but, as far as I
can tell, it's supposed to use the card clock. Let's account for the
div value, which is documented as 2x the value stored in the register,
or 1 if the register is 0.
NOTE: This was likely not terribly important until commit 16a34574c6ca
("mmc: dw_mmc: remove the quirks flags") landed because "DIV" is
documented on Rockchip SoCs (the ones that used to define the quirk)
to always be 0 or 1. ...and, in fact, it's documented to only be 1
with EMMC in 8-bit DDR52 mode. Thus before the quirk was applied to
everyone it was mostly OK to ignore the DIV value.
I haven't personally observed any problems that are fixed by this
patch but I also haven't tested this anywhere with a DIV other an 0.
AKA: this problem was found simply by code inspection and I have no
failing test cases that are fixed by it. Presumably this could fix
real bugs for someone out there, though.
Fixes: 16a34574c6ca ("mmc: dw_mmc: remove the quirks flags")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
In dquot_writeback_dquots(), we write back dquot from dirty dquots
list. There is a potential infinite loop if ->write_dquot() failure
and forget remove dquot from the list. This patch clear dirty bit
anyway to avoid it.
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
'asoc/fix/rt5514', 'asoc/fix/rt5616', 'asoc/fix/rt5659' and 'asoc/fix/rt5663' into tmp
|
|
This patch removes the un-necessary ifdef CONFIG_ACPI and directly
uses the acpi_match_table from the driver pdev.
Signed-off-by: Hoan Tran <hotran@apm.com>
[groeck: Dropped unnecessary initialization]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Based on SNMP values provided by Roman, Yuchung made the observation
that some crashes in tcp_sacktag_walk() might be caused by MTU probing.
Looking at tcp_mtu_probe(), I found that when a new skb was placed
in front of the write queue, we were not updating tcp highest sack.
If one skb is freed because all its content was copied to the new skb
(for MTU probing), then tp->highest_sack could point to a now freed skb.
Bad things would then happen, including infinite loops.
This patch renames tcp_highest_sack_combine() and uses it
from tcp_mtu_probe() to fix the bug.
Note that I also removed one test against tp->sacked_out,
since we want to replace tp->highest_sack regardless of whatever
condition, since keeping a stale pointer to freed skb is a recipe
for disaster.
Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct access")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Roman Gushchin <guro@fb.com>
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the (unlikely) event fixup_permanent_addr() returns a failure,
addrconf_permanent_addr() calls ipv6_del_addr() without the
mandatory call to in6_ifa_hold(), leading to a refcount error,
spotted by syzkaller :
WARNING: CPU: 1 PID: 3142 at lib/refcount.c:227 refcount_dec+0x4c/0x50
lib/refcount.c:227
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 3142 Comm: ip Not tainted 4.14.0-rc4-next-20171009+ #33
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x41c kernel/panic.c:181
__warn+0x1c4/0x1e0 kernel/panic.c:544
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_dec+0x4c/0x50 lib/refcount.c:227
RSP: 0018:ffff8801ca49e680 EFLAGS: 00010286
RAX: 000000000000002c RBX: ffff8801d07cfcdc RCX: 0000000000000000
RDX: 000000000000002c RSI: 1ffff10039493c90 RDI: ffffed0039493cc4
RBP: ffff8801ca49e688 R08: ffff8801ca49dd70 R09: 0000000000000000
R10: ffff8801ca49df58 R11: 0000000000000000 R12: 1ffff10039493cd9
R13: ffff8801ca49e6e8 R14: ffff8801ca49e7e8 R15: ffff8801d07cfcdc
__in6_ifa_put include/net/addrconf.h:369 [inline]
ipv6_del_addr+0x42b/0xb60 net/ipv6/addrconf.c:1208
addrconf_permanent_addr net/ipv6/addrconf.c:3327 [inline]
addrconf_notify+0x1c66/0x2190 net/ipv6/addrconf.c:3393
notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
__raw_notifier_call_chain kernel/notifier.c:394 [inline]
raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
call_netdevice_notifiers_info+0x32/0x60 net/core/dev.c:1697
call_netdevice_notifiers net/core/dev.c:1715 [inline]
__dev_notify_flags+0x15d/0x430 net/core/dev.c:6843
dev_change_flags+0xf5/0x140 net/core/dev.c:6879
do_setlink+0xa1b/0x38e0 net/core/rtnetlink.c:2113
rtnl_newlink+0xf0d/0x1a40 net/core/rtnetlink.c:2661
rtnetlink_rcv_msg+0x733/0x1090 net/core/rtnetlink.c:4301
netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2408
rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4313
netlink_unicast_kernel net/netlink/af_netlink.c:1273 [inline]
netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1299
netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1862
sock_sendmsg_nosec net/socket.c:633 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:643
___sys_sendmsg+0x75b/0x8a0 net/socket.c:2049
__sys_sendmsg+0xe5/0x210 net/socket.c:2083
SYSC_sendmsg net/socket.c:2094 [inline]
SyS_sendmsg+0x2d/0x50 net/socket.c:2090
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7fa9174d3320
RSP: 002b:00007ffe302ae9e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffe302b2ae0 RCX: 00007fa9174d3320
RDX: 0000000000000000 RSI: 00007ffe302aea20 RDI: 0000000000000016
RBP: 0000000000000082 R08: 0000000000000000 R09: 000000000000000f
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe302b32a0
R13: 0000000000000000 R14: 00007ffe302b2ab8 R15: 00007ffe302b32b8
Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ahern <dsahern@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Syzkaller found several variants of the lockup below by setting negative
values with the TUNSETSNDBUF ioctl. This patch adds a sanity check
to both the tun and tap versions of this ioctl.
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [repro:2389]
Modules linked in:
irq event stamp: 329692056
hardirqs last enabled at (329692055): [<ffffffff824b8381>] _raw_spin_unlock_irqrestore+0x31/0x75
hardirqs last disabled at (329692056): [<ffffffff824b9e58>] apic_timer_interrupt+0x98/0xb0
softirqs last enabled at (35659740): [<ffffffff824bc958>] __do_softirq+0x328/0x48c
softirqs last disabled at (35659731): [<ffffffff811c796c>] irq_exit+0xbc/0xd0
CPU: 0 PID: 2389 Comm: repro Not tainted 4.14.0-rc7 #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880009452140 task.stack: ffff880006a20000
RIP: 0010:_raw_spin_lock_irqsave+0x11/0x80
RSP: 0018:ffff880006a27c50 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
RAX: ffff880009ac68d0 RBX: ffff880006a27ce0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff880006a27ce0 RDI: ffff880009ac6900
RBP: ffff880006a27c60 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 000000000063ff00 R12: ffff880009ac6900
R13: ffff880006a27cf8 R14: 0000000000000001 R15: ffff880006a27cf8
FS: 00007f4be4838700(0000) GS:ffff88000cc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020101000 CR3: 0000000009616000 CR4: 00000000000006f0
Call Trace:
prepare_to_wait+0x26/0xc0
sock_alloc_send_pskb+0x14e/0x270
? remove_wait_queue+0x60/0x60
tun_get_user+0x2cc/0x19d0
? __tun_get+0x60/0x1b0
tun_chr_write_iter+0x57/0x86
__vfs_write+0x156/0x1e0
vfs_write+0xf7/0x230
SyS_write+0x57/0xd0
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7f4be4356df9
RSP: 002b:00007ffc18101c08 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4be4356df9
RDX: 0000000000000046 RSI: 0000000020101000 RDI: 0000000000000005
RBP: 00007ffc18101c40 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000293 R12: 0000559c75f64780
R13: 00007ffc18101d30 R14: 0000000000000000 R15: 0000000000000000
Fixes: 33dccbb050bb ("tun: Limit amount of queued packets per device")
Fixes: 20d29d7a916a ("net: macvtap driver")
Signed-off-by: Craig Gallek <kraig@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It fixes a problem for the last chunk where 'chunk_size' is smaller than
MLXSW_I2C_BLK_MAX and data is copied to the wrong offset, overriding
previous data.
Fixes: 6882b0aee180 ("mlxsw: Introduce support for I2C bus")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
niph is not updated after pskb_expand_head changes the skb head. It
still points to the freed data, which is then used to update tot_len and
checksum. This could cause use-after-free poison crash.
Update niph, if ip_route_me_harder does not fail.
This only affects the interaction with REJECT targets and br_netfilter.
Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2017-11-01
1) Fix a memleak when a packet matches a policy
without a matching state.
2) Reset the socket cached dst_entry when inserting
a socket policy, otherwise the policy might be
ignored. From Jonathan Basseri.
3) Fix GSO for a IPsec, GRE tunnel combination.
We reset the encapsulation field at the skb
too erly, as a result GRE does not segment
GSO packets. Fix this by resetting the the
encapsulation field right before the
transformation where the inner headers get
invalid.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix kernel-doc build error. A symbol that ends with an underscore
character ('_') has special meaning in reST (reStructuredText), so add
a '*' to prevent this error and to indicate that there are several of
these values to choose from.
../sound/soc/soc-core.c:2799: ERROR: Unknown target name: "snd_soc_daifmt".
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
On some platforms, when reading or writing some special registers through
regmap, we should acquire one hardware spinlock to synchronize between
the multiple subsystems. Thus this patch adds the hardware spinlock
support for regmap.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Current rsnd driver has rsnd_mod_id() which returns mod ID,
and it returns -1 if mod was NULL.
In the same time, this driver has rsnd_mod_name() which returns mod
name, and it returns "unknown" if mod or mod->ops was NULL.
Basically these "mod" never be NULL, but the reason why rsnd driver
has such behavior is that DMA path finder is assuming memory as
"mod == NULL".
Thus, current DMA path debug code prints like below.
Here "unknown[-1]" means it was memory.
...
rcar_sound ec500000.sound: unknown[-1] from
rcar_sound ec500000.sound: src[0] to
rcar_sound ec500000.sound: ctu[2]
rcar_sound ec500000.sound: mix[0]
rcar_sound ec500000.sound: dvc[0]
rcar_sound ec500000.sound: ssi[0]
rcar_sound ec500000.sound: audmac[0] unknown[-1] -> src[0]
...
1st issue is that it is confusable for user.
2nd issue is rsnd driver has something like below code.
mask |= 1 << rsnd_mod_id(mod);
Because of this kind of code, some statically code checker will
reports "Shifting by a negative value is undefined behaviour".
But this "mod" never be NULL, thus negative shift never happen.
To avoid these issues, this patch adds new dummy "mem" to
indicate memory, and use it to indicate debug information,
and, remove unneeded "NULL mod" behavior from rsnd_mod_id() and
rsnd_mod_name().
The debug information will be like below by this patch
...
rcar_sound ec500000.sound: mem[0] from
rcar_sound ec500000.sound: src[0] to
rcar_sound ec500000.sound: ctu[2]
rcar_sound ec500000.sound: mix[0]
rcar_sound ec500000.sound: dvc[0]
rcar_sound ec500000.sound: ssi[0]
rcar_sound ec500000.sound: audmac[0] mem[0] -> src[0]
...
Reported-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
SSI parent mod might be NULL. ssi_parent_mod() needs to care
about it. Otherwise, it uses negative shift.
This patch fixes it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
CPUMASK_OFFSTACK=y kernels(v1)
When the irqaffinity= kernel parameter is passed in a CPUMASK_OFFSTACK=y
kernel, it fails to boot, because zalloc_cpumask_var() cannot be used before
initializing the slab allocator to allocate a cpumask.
So, use alloc_bootmem_cpumask_var() instead.
Also do some cleanups while at it: in init_irq_default_affinity() remove
an #ifdef via using cpumask_available().
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171026045800.27087-1-rakib.mullick@gmail.com
Link: http://lkml.kernel.org/r/20171101041451.12581-1-rakib.mullick@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The definition of sysctl_sched_migration_cost, sysctl_sched_nr_migrate
and sysctl_sched_time_avg includes the attribute const_debug. This
attribute is not part of the extern declaration of these variables in
include/linux/sched/sysctl.h, while it is in kernel/sched/sched.h,
and as a result Clang generates warnings like this:
kernel/sched/sched.h:1618:33: warning: section attribute is specified on redeclared variable [-Wsection]
extern const_debug unsigned int sysctl_sched_time_avg;
^
./include/linux/sched/sysctl.h:42:21: note: previous declaration is here
extern unsigned int sysctl_sched_time_avg;
The header only declares the variables when CONFIG_SCHED_DEBUG is defined,
therefore it is not necessary to duplicate the definition of const_debug.
Instead we can use the attribute __read_mostly, which is the expansion of
const_debug when CONFIG_SCHED_DEBUG=y is set.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shile Zhang <shile.zhang@nokia.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171030180816.170850-1-mka@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Dmitry (through syzbot) reported being able to trigger the WARN in
get_pi_state() and a use-after-free on:
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
Both are due to this race:
exit_pi_state_list() put_pi_state()
lock(&curr->pi_lock)
while() {
pi_state = list_first_entry(head);
hb = hash_futex(&pi_state->key);
unlock(&curr->pi_lock);
dec_and_test(&pi_state->refcount);
lock(&hb->lock)
lock(&pi_state->pi_mutex.wait_lock) // uaf if pi_state free'd
lock(&curr->pi_lock);
....
unlock(&curr->pi_lock);
get_pi_state(); // WARN; refcount==0
The problem is we take the reference count too late, and don't allow it
being 0. Fix it by using inc_not_zero() and simply retrying the loop
when we fail to get a refcount. In that case put_pi_state() should
remove the entry from the list.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Gratian Crisan <gratian.crisan@ni.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: dvhart@infradead.org
Cc: syzbot <bot+2af19c9e1ffe4d4ee1d16c56ae7580feaee75765@syzkaller.appspotmail.com>
Cc: syzkaller-bugs@googlegroups.com
Cc: <stable@vger.kernel.org>
Fixes: c74aef2d06a9 ("futex: Fix pi_state->owner serialization")
Link: http://lkml.kernel.org/r/20171031101853.xpfh72y643kdfhjs@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
belong to kernel text
This makes the changes introduced in commit 83e840c770f2c5
("powerpc64/elfv1: Only dereference function descriptor for non-text
symbols") to be specific to the kprobe subsystem.
We previously changed ppc_function_entry() to always check the provided
address to confirm if it needed to be dereferenced. This is actually
only an issue for kprobe blacklisted asm labels (through use of
_ASM_NOKPROBE_SYMBOL) and can cause other issues with ftrace. Also, the
additional checks are not really necessary for our other uses.
As such, move this check to the kprobes subsystem.
Fixes: 83e840c770f2 ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
symbols"
This reverts commit 83e840c770f2c5 ("powerpc64/elfv1: Only dereference
function descriptor for non-text symbols").
Chandan reported that on newer kernels, trying to enable function_graph
tracer on ppc64 (BE) locks up the system with the following trace:
Unable to handle kernel paging request for data at address 0x600000002fa30010
Faulting instruction address: 0xc0000000001f1300
Thread overran stack, or stack corrupted
Oops: Kernel access of bad area, sig: 11 [#1]
BE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
Modules linked in:
CPU: 1 PID: 6586 Comm: bash Not tainted 4.14.0-rc3-00162-g6e51f1f-dirty #20
task: c000000625c07200 task.stack: c000000625c07310
NIP: c0000000001f1300 LR: c000000000121cac CTR: c000000000061af8
REGS: c000000625c088c0 TRAP: 0380 Not tainted (4.14.0-rc3-00162-g6e51f1f-dirty)
MSR: 8000000000001032 <SF,ME,IR,DR,RI> CR: 28002848 XER: 00000000
CFAR: c0000000001f1320 SOFTE: 0
...
NIP [c0000000001f1300] .__is_insn_slot_addr+0x30/0x90
LR [c000000000121cac] .kernel_text_address+0x18c/0x1c0
Call Trace:
[c000000625c08b40] [c0000000001bd040] .is_module_text_address+0x20/0x40 (unreliable)
[c000000625c08bc0] [c000000000121cac] .kernel_text_address+0x18c/0x1c0
[c000000625c08c50] [c000000000061960] .prepare_ftrace_return+0x50/0x130
[c000000625c08cf0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
[c000000625c08d60] [c000000000121b40] .kernel_text_address+0x20/0x1c0
[c000000625c08df0] [c000000000061960] .prepare_ftrace_return+0x50/0x130
...
[c000000625c0ab30] [c000000000061960] .prepare_ftrace_return+0x50/0x130
[c000000625c0abd0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
[c000000625c0ac40] [c000000000121b40] .kernel_text_address+0x20/0x1c0
[c000000625c0acd0] [c000000000061960] .prepare_ftrace_return+0x50/0x130
[c000000625c0ad70] [c000000000061b10] .ftrace_graph_caller+0x14/0x34
[c000000625c0ade0] [c000000000121b40] .kernel_text_address+0x20/0x1c0
This is because ftrace is using ppc_function_entry() for obtaining the
address of return_to_handler() in prepare_ftrace_return(). The call to
kernel_text_address() itself gets traced and we end up in a recursive
loop.
Fixes: 83e840c770f2 ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols")
Cc: stable@vger.kernel.org # v4.13+
Reported-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The ASIC has the ability to generate events whenever a sensor indicates
the temperature goes above or below its high or low thresholds,
respectively.
In new firmware versions the firmware enforces a minimum of 5
degrees Celsius difference between both thresholds. Make the driver
conform to this requirement.
Note that this is required even when the events are disabled, as in
certain systems interrupts are generated via GPIO based on these
thresholds.
Fixes: 85926f877040 ("mlxsw: reg: Add definition of temperature management registers")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Provide a mailing list for maintenance of the module instead.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For the time being I will be available in my private mail. Update both the
MAINTAINERS file and the individual modules MODULE_AUTHOR directive with
the new address.
Signed-off-by: Yotam Gigi <yotam.gi@gmail.com>
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function of_parse_phandle() returns a NULL pointer if it cannot
resolve a phandle property to a device_node pointer. In function
hns_nic_dev_probe(), its return value is passed to PTR_ERR to extract
the error code. However, in this case, the extracted error code will
always be zero, which is unexpected.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function netdev_priv() returns the private data of the device. The
memory to store the private data is allocated in alloc_netdev() and is
released in netdev_free(). Calling kfree() on the return value of
netdev_priv() after netdev_free() results in a double free bug.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that SK_REDIRECT is no longer a valid return code. Remove it
from the UAPI completely. Then do a namespace remapping internal
to sockmap so SK_REDIRECT is no longer externally visible.
Patchs primary change is to do a namechange from SK_REDIRECT to
__SK_REDIRECT
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The fix 5987feb38aa5 ("net: phy: marvell: logical vs bitwise OR typo")
uncovered another bug in the Marvell PHY driver, which broke the
Marvell OpenRD platform. It relies on the bootloader configuring the
RGMII delays and does not specify a phy-mode in its device tree. The
PHY driver should only configure RGMII delays if the phy mode
indicates it is using RGMII. Without anything in device tree, the
mv643xx Ethernet driver defaults to GMII.
Fixes: 5987feb38aa5 ("net: phy: marvell: logical vs bitwise OR typo")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.14
The most important here is the security vulnerabitility fix for
ath10k.
ath10k
* fix security vulnerability with missing PN check on certain hardware
* revert ath10k napi fix as it caused regressions on QCA6174
wcn36xx
* remove unnecessary rcu_read_unlock() from error path
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 6f542ebeaee0 ("MIPS: Fix race on setting and getting
cpu_online_mask") effectively reverted commit 8f46cca1e6c06 ("MIPS: SMP:
Fix possibility of deadlock when bringing CPUs online") and thus has
reinstated the possibility of deadlock.
The commit was based on testing of kernel v4.4, where the CPU hotplug
core code issued a BUG() if the starting CPU is not marked online when
the boot CPU returns from __cpu_up. The commit fixes this race (in
v4.4), but re-introduces the deadlock situation.
As noted in the commit message, upstream differs in this area. Commit
8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu bring itself fully up")
adds a completion event in the CPU hotplug core code, making this race
impossible. However, people were unhappy with relying on the core code
to do the right thing.
To address the issues both commits were trying to fix, add a second
completion event in the MIPS smp hotplug path. It removes the
possibility of a race, since the MIPS smp hotplug code now synchronises
both the boot and secondary CPUs before they return to the hotplug core
code. It also addresses the deadlock by ensuring that the secondary CPU
is not marked online before it's counters are synchronised.
This fix should also be backported to fix the race condition introduced
by the backport of commit 8f46cca1e6c06 ("MIPS: SMP: Fix possibility of
deadlock when bringing CPUs online"), through really that race only
existed before commit 8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu
bring itself fully up").
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Fixes: 6f542ebeaee0 ("MIPS: Fix race on setting and getting cpu_online_mask")
CC: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: <stable@vger.kernel.org> # v4.1+: 8f46cca1e6c0: "MIPS: SMP: Fix possibility of deadlock when bringing CPUs online"
Cc: <stable@vger.kernel.org> # v4.1+: a00eeede507c: "MIPS: SMP: Use a completion event to signal CPU up"
Cc: <stable@vger.kernel.org> # v4.1+: 6f542ebeaee0: "MIPS: Fix race on setting and getting cpu_online_mask"
Cc: <stable@vger.kernel.org> # v4.1+
Patchwork: https://patchwork.linux-mips.org/patch/17376/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
Fix a typo in build_one_insn().
Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Cc: <stable@vger.kernel.org> # 4.13+
Patchwork: https://patchwork.linux-mips.org/patch/17491/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
It seems that this is a typo error and the proper bit masking is
"RT | RS" instead of "RS | RS".
This issue was detected with the help of Coccinelle.
Fixes: d6b3314b49e1 ("MIPS: uasm: Add lh uam instruction")
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: James Hogan <jhogan@kernel.org>
Cc: <stable@vger.kernel.org> # 3.16+
Patchwork: https://patchwork.linux-mips.org/patch/17551/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
single nouveau regression fix.
* 'linux-4.14' of git://github.com/skeggsb/linux:
drm/nouveau/kms/nv50: use the correct state for base channel notifier setup
|
|
The default CM target field in the GCR_BASE register is encoded with 0
meaning memory & 1 being reserved. However the definitions we use for
those bits effectively get these two values backwards - likely because
they were copied from the definitions for the CM regions where the
target is encoded differently. This results in use setting up GCR_BASE
with the reserved target value by default, rather than targeting memory
as intended. Although we currently seem to get away with this it's not a
great idea to rely upon.
Fix this by changing our macros to match the documentated target values.
The incorrect encoding became used as of commit 9f98f3dd0c51 ("MIPS: Add
generic CM probe & access code") in the Linux v3.15 cycle, and was
likely carried forwards from older but unused code introduced by
commit 39b8d5254246 ("[MIPS] Add support for MIPS CMP platform.") in the
v2.6.26 cycle.
Fixes: 9f98f3dd0c51 ("MIPS: Add generic CM probe & access code")
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Matt Redfearn <matt.redfearn@mips.com>
Reviewed-by: James Hogan <jhogan@kernel.org>
Cc: Matt Redfearn <matt.redfearn@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.15+
Patchwork: https://patchwork.linux-mips.org/patch/17562/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
Commit e83f7e02af50c ("MIPS: CPS: Have asm/mips-cps.h include CM & CPC
headers") adds a #error to arch/mips/include/asm/mips-cpc.h if it is
included directly. While this commit replaced almost all direct includes
of mips-cm.h and mips-cpc.h, 2 remain.
With some defconfigs, mips-cps.h is indirectly included before
mips-cpc.h, but in others this results in compilation errors:
In file included from arch/mips/generic/init.c:23:0:
./arch/mips/include/asm/mips-cpc.h:12:3: error: #error Please include
asm/mips-cps.h rather than asm/mips-cpc.h
# error Please include asm/mips-cps.h rather than asm/mips-cpc.h
In file included from arch/mips/kernel/smp.c:23:0:
./arch/mips/include/asm/mips-cpc.h:12:3: error: #error Please include
asm/mips-cps.h rather than asm/mips-cpc.h
# error Please include asm/mips-cps.h rather than asm/mips-cpc.h
In both cases, fix this by including mips-cps.h instead.
Fixes: e83f7e02af50c ("MIPS: CPS: Have asm/mips-cps.h include CM & CPC headers")
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/17492/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
Commit 9fef68686317b ("MIPS: Make SAVE_SOME more standard") made several
changes to the order in which registers are saved in the SAVE_SOME
macro, used by exception handlers to save the processor state. In
particular, it removed the
move k1, sp
in the delay slot of the branch testing if the processor is already in
kernel mode. This is replaced later in the macro by a
move k0, sp
When CONFIG_EVA is disabled, this instruction actually appears in the
delay slot of the branch. However, when CONFIG_EVA is enabled, instead
the RPS workaround of
MFC0 k0, CP0_ENTRYHI
appears in the delay slot. This results in k0 not containing the stack
pointer, but some unrelated value, which is then saved to the kernel
stack. On exit from the exception, this bogus value is restored to the
stack pointer, resulting in an OOPS.
Fix this by moving the save of SP in k0 explicitly in the delay slot of
the branch, outside of the CONFIG_EVA section, restoring the expected
instruction ordering when CONFIG_EVA is active.
Fixes: 9fef68686317b ("MIPS: Make SAVE_SOME more standard")
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: James Hogan <jhogan@kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/17471/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
A spin lock is used in the irq-mvebu-gicp driver, but it is never
initialized. This patch adds the missing spin_lock_init() call in the
driver's probe function.
Fixes: a68a63cb4dfc ("irqchip/irq-mvebu-gicp: Add new driver for Marvell GICP")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: gregory.clement@free-electrons.com
Acked-by: marc.zyngier@arm.com
Cc: thomas.petazzoni@free-electrons.com
Cc: andrew@lunn.ch
Cc: jason@lakedaemon.net
Cc: nadavh@marvell.com
Cc: miquel.raynal@free-electrons.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: sebastian.hesselbarth@gmail.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20171025072326.21030-1-antoine.tenart@free-electrons.com
|
|
Fixes: 857263 ("drm/nouveau: Handle drm_atomic_helper_swap_state failure")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed by: Lyude Paul <lyude@redhat.com>
|
|
https://git.linaro.org/people/daniel.lezcano/linux into timers/core
Pull clockevent updates from Daniel Lezcano:
- Improve the generic clockevents dependency by factoring out the option
in the Kconfig menu option (Arnd Bergmann)
- Add missing "\n" in pr_err messages for fttmr010, owl and rockchip
(Arvind Yadav)
- Add missing timer_of_exit function to rollback timer_of_init (Benjamin
Gaignard)
- Fix path and add bindings to timers (Daniel Lezcano)
- Cleanup and remove support for renesas,cmt-32* (Geert Uytterhoeven)
- Add support for separate R-Car Gen2 (Magnus Damm)
- Fix DEFINE_PER_CPU length definition to prevent warning at expansion
time for the arm_arch_timer (Mark Rutland)
- Remove pointless irq_save,restore in an already irq-disabled callback
and add a shortcut optimization for the local cpu on mips-gic-timer
(Matt Redfearn)
|
|
Since commit 04a85e087ad6 ("MIPS: generic: Move NI 169445 FIT image
source to its own file"), a generic 32r2el_defconfig kernel fails to
build with the following build error:
ITB arch/mips/boot/vmlinux.gz.itb
Error: arch/mips/boot/vmlinux.gz.its:111.1-2 syntax error
FATAL ERROR: Unable to parse input tree
mkimage Can't read arch/mips/boot/vmlinux.gz.itb.tmp: Invalid argument
Fix arch/mips/generic/board-ni169445.its.S to include the necessary "/"
node path before the first open brace.
The original issue in arch/mips/generic/vmlinux.its.S was fixed directly
in the original commit 7aacf86b75bc ("MIPS: NI 169445 board support")
after https://patchwork.linux-mips.org/patch/16941/ was submitted, but
the separate its.S file wasn't correctly fixed when resolving the
conflict in commit 04a85e087ad6 ("MIPS: generic: Move NI 169445 FIT
image source to its own file").
Fixes: 04a85e087ad6 ("MIPS: generic: Move NI 169445 FIT image source to its own file")
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Nathan Sullivan <nathan.sullivan@ni.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/17561/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management commit reverts from Rafael Wysocki:
"Since Geert reports additional problems with my PM QoS fix from the
last week that have not been addressed by the most recent fixup on top
of it, they both should better be reverted now and let's fix the
original issue properly in 4.15.
This reverts two recent PM QoS commits one of which introduced
multiple problems and the other one fixed some, but not all of them
(Rafael Wysocki)"
* tag 'pm-reverts-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "PM / QoS: Fix device resume latency PM QoS"
Revert "PM / QoS: Fix default runtime_pm device resume latency"
|