Age | Commit message (Collapse) | Author |
|
The AON_PM_L2 is normally used to trigger and identify the source of a
wake-up event. Since the RX_SYS clock is no longer turned off, we also
have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and
that interrupt remains active up until the magic packet detector is
disabled which happens much later during the driver resumption.
The race happens if we have a CPU that is entering the SYSTEMPORT
INTRL2_0 handler during resume, and another CPU has managed to clear the
wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we
have the first CPU stuck in the interrupt handler with an interrupt
cause that has been cleared under its feet, and so we keep returning
IRQ_NONE and we never make any progress.
This was not a problem before because we would always turn off the
RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned
off as well, thus not latching the interrupt.
The fix is to make sure we do not enable either the MPD or
BRCM_TAG_MATCH interrupts since those are redundant with what the
AON_PM_L2 interrupt controller already processes and they would cause
such a race to occur.
Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER")
Fixes: 83e82f4c706b ("net: systemport: add Wake-on-LAN support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes problem (discovered by Aurelien) introduced by recent commit:
commit b24df3e30cbf48255db866720fb71f14bf9d2f39
("cifs: update receive_encrypted_standard to handle compounded responses")
which broke the ability to respond to some lease breaks
(lease breaks being ignored is a problem since can block
server response for duration of the lease break timeout).
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
For compounded PDUs we whould only wake the waiting thread for the
very last PDU of the compound.
We do this so that we are guaranteed that the demultiplex_thread will
not process or access any of those MIDs any more once the send/recv
thread starts processing.
Else there is a race where at the end of the send/recv processing we
will try to delete all the mids of the compound. If the multiplex
thread still has other mids to process at this point for this compound
this can lead to an oops.
Needed to fix recent commit:
commit 730928c8f4be88e9d6a027a16b1e8fa9c59fc077
("cifs: update smb2_queryfs() to use compounding")
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.19
First, and also hopefully the last, set of fixes for 4.19. All small
but still important fixes
mt76x0
* fix a bug when a virtual interface is removed multiple times
b43
* fix DMA error related regression with proprietary firmware
iwlwifi
* fix an oops which was a regression in v4.19-rc1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
cifs_delete_mid() is called once we are finished handling a mid and we
expect no more work done on this mid.
Needed to fix recent commit:
commit 730928c8f4be88e9d6a027a16b1e8fa9c59fc077
("cifs: update smb2_queryfs() to use compounding")
Add a warning if someone tries to dequeue a mid that has already been
flagged to be deleted.
Also change list_del() to list_del_init() so that if we have similar bugs
resurface in the future we will not oops.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
When mounting a Windows share that is the root of a drive (eg. C$)
the server does not return . and .. directory entries. This results in
the smb2 code path erroneously skipping the 2 first entries.
Pseudo-code of the readdir() code path:
cifs_readdir(struct file, struct dir_context)
initiate_cifs_search <-- if no reponse cached yet
server->ops->query_dir_first
dir_emit_dots
dir_emit <-- adds "." and ".." if we're at pos=0
find_cifs_entry
initiate_cifs_search <-- if pos < start of current response
(restart search)
server->ops->query_dir_next <-- if pos > end of current response
(fetch next search res)
for(...) <-- loops over cur response entries
starting at pos
cifs_filldir <-- skip . and .., emit entry
cifs_fill_dirent
dir_emit
pos++
A) dir_emit_dots() always adds . & ..
and sets the current dir pos to 2 (0 and 1 are done).
Therefore we always want the index_to_find to be 2 regardless of if
the response has . and ..
B) smb1 code initializes index_of_last_entry with a +2 offset
in cifssmb.c CIFSFindFirst():
psrch_inf->index_of_last_entry = 2 /* skip . and .. */ +
psrch_inf->entries_in_buffer;
Later in find_cifs_entry() we want to find the next dir entry at pos=2
as a result of (A)
first_entry_in_buffer = cfile->srch_inf.index_of_last_entry -
cfile->srch_inf.entries_in_buffer;
This var is the dir pos that the first entry in the buffer will
have therefore it must be 2 in the first call.
If we don't offset index_of_last_entry by 2 (like in (B)),
first_entry_in_buffer=0 but we were instructed to get pos=2 so this
code in find_cifs_entry() skips the 2 first which is ok for non-root
shares, as it skips . and .. from the response but is not ok for root
shares where the 2 first are actual files
pos_in_buf = index_to_find - first_entry_in_buffer;
// pos_in_buf=2
// we skip 2 first response entries :(
for (i = 0; (i < (pos_in_buf)) && (cur_ent != NULL); i++) {
/* go entry by entry figuring out which is first */
cur_ent = nxt_dir_entry(cur_ent, end_of_smb,
cfile->srch_inf.info_level);
}
C) cifs_filldir() skips . and .. so we can safely ignore them for now.
Sample program:
int main(int argc, char **argv)
{
const char *path = argc >= 2 ? argv[1] : ".";
DIR *dh;
struct dirent *de;
printf("listing path <%s>\n", path);
dh = opendir(path);
if (!dh) {
printf("opendir error %d\n", errno);
return 1;
}
while (1) {
de = readdir(dh);
if (!de) {
if (errno) {
printf("readdir error %d\n", errno);
return 1;
}
printf("end of listing\n");
break;
}
printf("off=%lu <%s>\n", de->d_off, de->d_name);
}
return 0;
}
Before the fix with SMB1 on root shares:
<.> off=1
<..> off=2
<$Recycle.Bin> off=3
<bootmgr> off=4
and on non-root shares:
<.> off=1
<..> off=4 <-- after adding .., the offsets jumps to +2 because
<2536> off=5 we skipped . and .. from response buffer (C)
<411> off=6 but still incremented pos
<file> off=7
<fsx> off=8
Therefore the fix for smb2 is to mimic smb1 behaviour and offset the
index_of_last_entry by 2.
Test results comparing smb1 and smb2 before/after the fix on root
share, non-root shares and on large directories (ie. multi-response
dir listing):
PRE FIX
=======
pre-1-root VS pre-2-root:
ERR pre-2-root is missing [bootmgr, $Recycle.Bin]
pre-1-nonroot VS pre-2-nonroot:
OK~ same files, same order, different offsets
pre-1-nonroot-large VS pre-2-nonroot-large:
OK~ same files, same order, different offsets
POST FIX
========
post-1-root VS post-2-root:
OK same files, same order, same offsets
post-1-nonroot VS post-2-nonroot:
OK same files, same order, same offsets
post-1-nonroot-large VS post-2-nonroot-large:
OK same files, same order, same offsets
REGRESSION?
===========
pre-1-root VS post-1-root:
OK same files, same order, same offsets
pre-1-nonroot VS post-1-nonroot:
OK same files, same order, same offsets
BugLink: https://bugzilla.samba.org/show_bug.cgi?id=13107
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Paulo Alcantara <palcantara@suse.deR>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
|
|
We have an impressive number of syzkaller bugs that are linked
to the fact that syzbot was able to create a networking device
with millions of TX (or RX) queues.
Let's limit the number of RX/TX queues to 4096, this really should
cover all known cases.
A separate patch will add various cond_resched() in the loops
handling sysfs entries at device creation and dismantle.
Tested:
lpaa6:~# ip link add gre-4097 numtxqueues 4097 numrxqueues 4097 type ip6gretap
RTNETLINK answers: Invalid argument
lpaa6:~# time ip link add gre-4096 numtxqueues 4096 numrxqueues 4096 type ip6gretap
real 0m0.180s
user 0m0.000s
sys 0m0.107s
Fixes: 76ff5cc91935 ("rtnl: allow to specify number of rx and tx queues on device creation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/misc/vmw_vmci/vmci_host.c: In function 'vmci_host_do_alloc_queuepair':
drivers/misc/vmw_vmci/vmci_host.c:450:6: warning:
variable 'cid' set but not used [-Wunused-but-set-variable]
u32 cid;
^
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In preparation to remove the node name pointer from struct device_node,
convert printf users to use the %pOFn format specifier.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
delay.h and dma-mapping.h have duplicated include. hence just remove
redundant file.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
RX queue config for bonding master could be different from its slave
device(s). With the commit 6a9e461f6fe4 ("bonding: pass link-local
packets to bonding master also."), the packet is reinjected into stack
with skb->dev as bonding master. This potentially triggers the
message:
"bondX received packet on queue Y, but number of RX queues is Z"
whenever the queue that packet is received on is higher than the
numrxqueues on bonding master (Y > Z).
Fixes: 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also.")
Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes the following sparse warning:
drivers/android/binder.c:3312:1: warning:
symbol 'binder_free_buf' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Timer handlers do not imply rcu_read_lock(), so my recent fix
triggered a LOCKDEP warning when SYNACK is retransmit.
Lets add rcu_read_lock()/rcu_read_unlock() pairs around ireq->ireq_opt
usages instead of guessing what is done by callers, since it is
not worth the pain.
Get rid of ireq_opt_deref() helper since it hides the logic
without real benefit, since it is now a standard rcu_dereference().
Fixes: 1ad98e9d1bdf ("tcp/dccp: fix lockdep issue when SYN is backlogged")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Intel has done pretty major changes to the driver and we continue to do
so in the future as well. Add Intel as copyright holder of the files we
have done changes.
While there drop "Cactus Ridge" from the headers because this driver
works also with other Thunderbolt controllers.
No functional changes intended.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Yehezkel Bernat <yehezkelshb@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This gets rid of the licence boilerplate duplicated in each file. While
there fix doubled space in domain.c author line.
No functional changes intended.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Yehezkel Bernat <yehezkelshb@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The previous patch made the driver less verbose meanining that all the
switch structures and ports are now logged as debug level. However, we
have been missing similar output that USB for intance prints when a new
USB device is connected and disconnected. This information is useful for
end users as well as developers because it immediately shows the actual
device that was connected.
This patch adds printing of the actual connected devices to the driver.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Yehezkel Bernat <yehezkelshb@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Currently the driver logs quite a lot to the system message buffer even
when doing normal operations. This information is not useful for
ordinary users and might even annoy some.
For this reason convert most of the logs at info level to happen at
debug level instead. The nice output formatting is untouched.
Logging can be easily re-enabled by passing "thunderbolt.dyndbg" in the
kernel command line (or using the corresponding control file runtime).
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Yehezkel Bernat <yehezkelshb@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
dma_pool_destroy() already takes NULL pointer into account so there is
no need to check that again in tb_ctl_free().
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
[mw: reword commit log a bit]
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
"out_buf_sz" needs to be signed for the error handling to work.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Looks like during merging the bulk POLL* -> EPOLL* replacement
missed the patch
'commit af336cabe083 ("mei: limit the number of queued writes")'
Fix sparse warning:
drivers/misc/mei/main.c:602:13: warning: restricted __poll_t degrades to integer
drivers/misc/mei/main.c:605:30: warning: invalid assignment: |=
drivers/misc/mei/main.c:605:30: left side has type restricted __poll_t
drivers/misc/mei/main.c:605:30: right side has type int
Fixes: af336cabe083 ("mei: limit the number of queued writes")
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fixes the following sparse warning:
drivers/uio/uio.c:277:6: warning:
symbol 'uio_class_registered' was not declared. Should it be static?
Fixes: ae61cf5b9913 ("uio: ensure class is registered before devices")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
A cpumask structure on the stack can cause a warning with
CONFIG_NR_CPUS=8192 (e.g. Ubuntu 16.04 and 18.04 use this):
drivers/hv//channel_mgmt.c: In function ‘init_vp_index’:
drivers/hv//channel_mgmt.c:702:1: warning: the frame size of 1032 bytes
is larger than 1024 bytes [-Wframe-larger-than=]
Nowadays it looks most distros enable CONFIG_CPUMASK_OFFSTACK=y, and
hence we can work around the warning by using cpumask_var_t.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We don't need to call process_ib_ipinfo() if message->kvp_hdr.operation is
KVP_OP_GET_IP_INFO in kvp_send_key(), because here we just need to pass on
the op code from the host to the userspace; when the userspace returns
the info requested by the host, we pass the info on to the host in
kvp_respond_to_host() -> process_ob_ipinfo(). BTW, the current buggy code
actually doesn't cause any harm, because only message->kvp_hdr.operation
is used by the userspace, in the case of KVP_OP_GET_IP_INFO.
The patch also adds a missing "break;" in kvp_send_key(). BTW, the current
buggy code actually doesn't cause any harm, because in the case of
KVP_OP_SET, the unexpected fall-through corrupts
message->body.kvp_set.data.key_size, but that is not really used: see
the definition of struct hv_kvp_exchg_msg_value.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
No functional change.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
No functional change.
Added descriptions for some parameters.
Fixed some typos.
Removed some out-of-date comments.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The bus master was not removed after unloading the module
or unbinding the driver. That lead to oopses like this
[ 127.842987] Unable to handle kernel paging request at virtual address bf01d04c
[ 127.850646] pgd = 70e3cd9a
[ 127.853698] [bf01d04c] *pgd=8f908811, *pte=00000000, *ppte=00000000
[ 127.860412] Internal error: Oops: 80000007 [#1] PREEMPT SMP ARM
[ 127.866668] Modules linked in: bq27xxx_battery overlay [last unloaded: omap_hdq]
[ 127.874542] CPU: 0 PID: 1022 Comm: w1_bus_master1 Not tainted 4.19.0-rc4-00001-g2d51da718324 #12
[ 127.883819] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[ 127.890441] PC is at 0xbf01d04c
[ 127.893798] LR is at w1_search_process_cb+0x4c/0xfc
[ 127.898956] pc : [<bf01d04c>] lr : [<c05f9580>] psr: a0070013
[ 127.905609] sp : cf885f48 ip : bf01d04c fp : ddf1e11c
[ 127.911132] r10: cf8fe040 r9 : c05f8d00 r8 : cf8fe040
[ 127.916656] r7 : 000000f0 r6 : cf8fe02c r5 : cf8fe000 r4 : cf8fe01c
[ 127.923553] r3 : c05f8d00 r2 : 000000f0 r1 : cf8fe000 r0 : dde1ef10
[ 127.930450] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 127.938018] Control: 10c5387d Table: 8f8f0019 DAC: 00000051
[ 127.944091] Process w1_bus_master1 (pid: 1022, stack limit = 0x9135699f)
[ 127.951171] Stack: (0xcf885f48 to 0xcf886000)
[ 127.955810] 5f40: cf8fe000 00000000 cf884000 cf8fe090 000003e8 c05f8d00
[ 127.964477] 5f60: dde5fc34 c05f9700 ddf1e100 ddf1e540 cf884000 cf8fe000 c05f9694 00000000
[ 127.973114] 5f80: dde5fc34 c01499a4 00000000 ddf1e540 c0149874 00000000 00000000 00000000
[ 127.981781] 5fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
[ 127.990447] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 127.999114] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[ 128.007781] [<c05f9580>] (w1_search_process_cb) from [<c05f9700>] (w1_process+0x6c/0x118)
[ 128.016479] [<c05f9700>] (w1_process) from [<c01499a4>] (kthread+0x130/0x148)
[ 128.024047] [<c01499a4>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[ 128.031677] Exception stack(0xcf885fb0 to 0xcf885ff8)
[ 128.037017] 5fa0: 00000000 00000000 00000000 00000000
[ 128.045684] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 128.054351] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 128.061340] Code: bad PC value
[ 128.064697] ---[ end trace af066e33c0e14119 ]---
Cc: <stable@vger.kernel.org>
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When adding a VMCI resource, the check for an existing entry
would ignore that the new entry could be a wildcard. This could
result in multiple resource entries that would match a given
handle. One disastrous outcome of this is that the
refcounting used to ensure that delayed callbacks for VMCI
datagrams have run before the datagram is destroyed can be
wrong, since the refcount could be increased on the duplicate
entry. This in turn leads to a use after free bug. This issue
was discovered by Hangbin Liu using KASAN and syzkaller.
Fixes: bc63dedb7d46 ("VMCI: resource object implementation")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 2d4dd0da45401c7ae7332b4d1eb7bbb1348edde9.
This broke earlycon on all Renesas ARM platforms using a SCIF port for the
serial console (R-Car, RZ/A1, RZ/G1, RZ/G2 SoCs), due to an incorrect value
of port->regshift.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 7acece71a517cad83a0842a94d94c13f271b680c.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit d76c74387e1c978b6c5524a146ab0f3f72206f98.
While commit d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling")
fixes runtime PM handling when using kgdb, it introduces a traceback for
everyone else.
BUG: sleeping function called from invalid context at
/mnt/host/source/src/third_party/kernel/next/drivers/base/power/runtime.c:1034
in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper/0
7 locks held by swapper/0/1:
#0: 000000005ec5bc72 (&dev->mutex){....}, at: __driver_attach+0xb5/0x12b
#1: 000000005d5fa9e5 (&dev->mutex){....}, at: __device_attach+0x3e/0x15b
#2: 0000000047e93286 (serial_mutex){+.+.}, at: serial8250_register_8250_port+0x51/0x8bb
#3: 000000003b328f07 (port_mutex){+.+.}, at: uart_add_one_port+0xab/0x8b0
#4: 00000000fa313d4d (&port->mutex){+.+.}, at: uart_add_one_port+0xcc/0x8b0
#5: 00000000090983ca (console_lock){+.+.}, at: vprintk_emit+0xdb/0x217
#6: 00000000c743e583 (console_owner){-...}, at: console_unlock+0x211/0x60f
irq event stamp: 735222
__down_trylock_console_sem+0x4a/0x84
console_unlock+0x338/0x60f
__do_softirq+0x4a4/0x50d
irq_exit+0x64/0xe2
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc5 #6
Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.286.0 03/15/2017
Call Trace:
dump_stack+0x7d/0xbd
___might_sleep+0x238/0x259
__pm_runtime_resume+0x4e/0xa4
? serial8250_rpm_get+0x2e/0x44
serial8250_console_write+0x44/0x301
? lock_acquire+0x1b8/0x1fa
console_unlock+0x577/0x60f
vprintk_emit+0x1f0/0x217
printk+0x52/0x6e
register_console+0x43b/0x524
uart_add_one_port+0x672/0x8b0
? set_io_from_upio+0x150/0x162
serial8250_register_8250_port+0x825/0x8bb
dw8250_probe+0x80c/0x8b0
? dw8250_serial_inq+0x8e/0x8e
? dw8250_check_lcr+0x108/0x108
? dw8250_runtime_resume+0x5b/0x5b
? dw8250_serial_outq+0xa1/0xa1
? dw8250_remove+0x115/0x115
platform_drv_probe+0x76/0xc5
really_probe+0x1f1/0x3ee
? driver_allows_async_probing+0x5d/0x5d
driver_probe_device+0xd6/0x112
? driver_allows_async_probing+0x5d/0x5d
bus_for_each_drv+0xbe/0xe5
__device_attach+0xdd/0x15b
bus_probe_device+0x5a/0x10b
device_add+0x501/0x894
? _raw_write_unlock+0x27/0x3a
platform_device_add+0x224/0x2b7
mfd_add_device+0x718/0x75b
? __kmalloc+0x144/0x16a
? mfd_add_devices+0x38/0xdb
mfd_add_devices+0x9b/0xdb
intel_lpss_probe+0x7d4/0x8ee
intel_lpss_pci_probe+0xac/0xd4
pci_device_probe+0x101/0x18e
...
Revert the offending patch until a more comprehensive solution
is available.
Cc: Tony Lindgren <tony@atomide.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Phil Edworthy <phil.edworthy@renesas.com>
Fixes: d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use memblock_end_of_DRAM which provides correct last low memory
PFN. Without that, DMA32 region becomes empty resulting in zero
pages being allocated for DMA32.
This patch is based on earlier patch from palmer which never
merged into 4.19. I just edited the commit text to make more
sense.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
|
|
The recent rework of the TSC calibration code introduced a regression on UV
systems as it added a call to tsc_early_init() which initializes the TSC
ADJUST values before acpi_boot_table_init(). In the case of UV systems,
that is a necessary step that calls uv_system_init(). This informs
tsc_sanitize_first_cpu() that the kernel runs on a platform with async TSC
resets as documented in commit 341102c3ef29 ("x86/tsc: Add option that TSC
on Socket 0 being non-zero is valid")
Fix it by skipping the early tsc initialization on UV systems and let TSC
init tests take place later in tsc_init().
Fixes: cf7a63ef4e02 ("x86/tsc: Calibrate tsc only once")
Suggested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Russ Anderson <rja@hpe.com>
Reviewed-by: Dimitri Sivanich <sivanich@hpe.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Xiaoming Gao <gxm.linux.kernel@gmail.com>
Cc: Rajvi Jingar <rajvi.jingar@intel.com>
Link: https://lkml.kernel.org/r/20181002180144.923579706@stormcage.americas.sgi.com
|
|
Introduce is_early_uv_system() which uses efi.uv_systab to decide early
in the boot process whether the kernel runs on a UV system.
This is needed to skip other early setup/init code that might break
the UV platform if done too early such as before necessary ACPI tables
parsing takes place.
Suggested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Russ Anderson <rja@hpe.com>
Reviewed-by: Dimitri Sivanich <sivanich@hpe.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Xiaoming Gao <gxm.linux.kernel@gmail.com>
Cc: Rajvi Jingar <rajvi.jingar@intel.com>
Link: https://lkml.kernel.org/r/20181002180144.801700401@stormcage.americas.sgi.com
|
|
When FW floods the driver with control messages try to exit the cmsg
processing loop every now and then to avoid soft lockups. Cmsg
processing is generally very lightweight so 512 seems like a reasonable
budget, which should not be exceeded under normal conditions.
Fixes: 77ece8d5f196 ("nfp: add control vNIC datapath")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix a commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
continuation lines") regression with the `declance' driver, which caused
the adapter identification message to be split between two lines, e.g.:
declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA
, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.
Address that properly, by printing identification with a single call,
making the messages now look like:
declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines")
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During certain heavy network loads TX could time out
with TX ring dump.
TX is sometimes never restarted after reaching
"tx_stop_threshold" because function "fec_enet_tx_queue"
only tests the first queue.
In addition the TX timeout callback function failed to
recover because it also operated only on the first queue.
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If IOMMU is enabled and Thunderbolt driver is built into the kernel
image, it will be probed before IOMMUs are attached to the PCI bus.
Because of this DMA mappings the driver does will not go through IOMMU
and start failing right after IOMMUs are enabled.
For this reason move the Thunderbolt driver initialization happen at
rootfs level.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If there is a long chain of devices connected when the driver is loaded
ICM sends device connected event for each and those are put to tb->wq
for later processing. Now if the driver gets unloaded in the middle, so
that the work queue is not yet empty it gets flushed by tb_domain_stop().
However, by that time the root switch is already removed so the driver
crashes when it tries to dereference it in ICM event handling callbacks.
Fix this by checking whether the root switch is already removed. If it
is we know that the domain is stopped and we should merely skip handling
the event.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next
Vinod writes:
soundwire updates for 4.20-rc1
- support for multi-link streaming
- updates in intel driver for multi-link streaming
- Update Vinod's email
- Fix rst formatting
* tag 'soundwire-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
Documentation: soundwire: fix stream.rst markup warnings
soundwire: intel: Remove duplicate assignment
MAINTAINERS: Update Vinod's email
soundwire: intel: Fix uninitialized adev deref
soundwire: intel: Add pre/post bank switch ops
soundwire: keep track of Masters in a stream
soundwire: Add support for multi link bank switch
soundwire: Handle multiple master instances in a stream
soundwire: Add support to lock across bus instances
soundwire: Initialize completion for defer messages
Documentation: soundwire: Add documentation for multi link
|
|
Commit 51c3c62b58b3 ("powerpc: Avoid code patching freed init
sections") accesses 'init_mem_is_free' flag too early, before the
kernel is relocated. This provokes early boot failure (before the
console is active).
As it is not necessary to do this verification that early, this
patch moves the test into patch_instruction() instead of
__patch_instruction().
This modification also has the advantage of avoiding unnecessary
remappings.
Fixes: 51c3c62b58b3 ("powerpc: Avoid code patching freed init sections")
Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Explicitly forbid creating cgroup local storage maps with zero value
size, as it makes no sense and might even cause a panic.
Reported-by: syzbot+18628320d3b14a5c459c@syzkaller.appspotmail.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Bartlomiej writes:
"fbdev fixes for v4.19-rc7:
- fix OMAPFB_MEMORY_READ ioctl to not leak kernel memory in omapfb driver
(Tomi Valkeinen)
- add missing prepare/unprepare clock operations in pxa168fb driver
(Lubomir Rintel)
- add nobgrt option in efifb driver to disable ACPI BGRT logo restore
(Hans de Goede)
- fix spelling mistake in fall-through annotation in stifb driver
(Gustavo A. R. Silva)
- fix URL for uvesafb repository in the documentation (Adam Jackson)"
* tag 'fbdev-v4.19-rc7' of https://github.com/bzolnier/linux:
video/fbdev/stifb: Fix spelling mistake in fall-through annotation
uvesafb: Fix URLs in the documentation
efifb: BGRT: Add nobgrt option
fbdev/omapfb: fix omapfb_memory_read infoleak
pxa168fb: prepare the clock
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Ulf writes:
"MMC core:
- Fixup conversion of debounce time to/from ms/us
MMC host:
- sdhi: Fixup whitelisting for Gen3 types"
* tag 'mmc-v4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: slot-gpio: Fix debounce time to use miliseconds again
mmc: core: Fix debounce time to use microseconds
mmc: sdhi: sys_dmac: check for all Gen3 types when whitelisting
|
|
Sergey Suloev reported a crash happening in drm_client_dev_hotplug()
when fbdev had failed to register.
[ 9.124598] vc4_hdmi 3f902000.hdmi: ASoC: Failed to create component debugfs directory
[ 9.147667] vc4_hdmi 3f902000.hdmi: vc4-hdmi-hifi <-> 3f902000.hdmi mapping ok
[ 9.155184] vc4_hdmi 3f902000.hdmi: ASoC: no DMI vendor name!
[ 9.166544] vc4-drm soc:gpu: bound 3f902000.hdmi (ops vc4_hdmi_ops [vc4])
[ 9.173840] vc4-drm soc:gpu: bound 3f806000.vec (ops vc4_vec_ops [vc4])
[ 9.181029] vc4-drm soc:gpu: bound 3f004000.txp (ops vc4_txp_ops [vc4])
[ 9.188519] vc4-drm soc:gpu: bound 3f400000.hvs (ops vc4_hvs_ops [vc4])
[ 9.195690] vc4-drm soc:gpu: bound 3f206000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 9.203523] vc4-drm soc:gpu: bound 3f207000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 9.215032] vc4-drm soc:gpu: bound 3f807000.pixelvalve (ops vc4_crtc_ops [vc4])
[ 9.274785] vc4-drm soc:gpu: bound 3fc00000.v3d (ops vc4_v3d_ops [vc4])
[ 9.290246] [drm] Initialized vc4 0.0.0 20140616 for soc:gpu on minor 0
[ 9.297464] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 9.304600] [drm] Driver supports precise vblank timestamp query.
[ 9.382856] vc4-drm soc:gpu: [drm:drm_fb_helper_fbdev_setup [drm_kms_helper]] *ERROR* Failed to set fbdev configuration
[ 10.404937] Unable to handle kernel paging request at virtual address 00330a656369768a
[ 10.441620] [00330a656369768a] address between user and kernel address ranges
[ 10.449087] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 10.454762] Modules linked in: brcmfmac vc4 drm_kms_helper cfg80211 drm rfkill smsc95xx brcmutil usbnet drm_panel_orientation_quirks raspberrypi_hwmon bcm2835_dma crc32_ce pwm_bcm2835 bcm2835_rng virt_dma rng_core i2c_bcm2835 ip_tables x_tables ipv6
[ 10.477296] CPU: 2 PID: 45 Comm: kworker/2:1 Not tainted 4.19.0-rc5 #3
[ 10.483934] Hardware name: Raspberry Pi 3 Model B Rev 1.2 (DT)
[ 10.489966] Workqueue: events output_poll_execute [drm_kms_helper]
[ 10.596515] Process kworker/2:1 (pid: 45, stack limit = 0x000000007e8924dc)
[ 10.603590] Call trace:
[ 10.606259] drm_client_dev_hotplug+0x5c/0xb0 [drm]
[ 10.611303] drm_kms_helper_hotplug_event+0x30/0x40 [drm_kms_helper]
[ 10.617849] output_poll_execute+0xc4/0x1e0 [drm_kms_helper]
[ 10.623616] process_one_work+0x1c8/0x318
[ 10.627695] worker_thread+0x48/0x428
[ 10.631420] kthread+0xf8/0x128
[ 10.634615] ret_from_fork+0x10/0x18
[ 10.638255] Code: 54000220 f9401261 aa1303e0 b4000141 (f9400c21)
[ 10.644456] ---[ end trace c75b4a4b0e141908 ]---
The reason for this is that drm_fbdev_cma_init() removes the drm_client
when fbdev registration fails, but it doesn't remove the client from the
drm_device client list. So the client list now has a pointer that points
into the unknown and we have a 'use after free' situation.
Split drm_client_new() into drm_client_init() and drm_client_add() to fix
removal in the error path.
Fixes: 894a677f4b3e ("drm/cma-helper: Use the generic fbdev emulation")
Reported-by: Sergey Suloev <ssuloev@orpaltech.com>
Cc: Stefan Wahren <stefan.wahren@i2se.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181001194536.57756-1-noralf@tronnes.org
|
|
Automatic NUMA Balancing uses a multi-stage pass to decide whether a page
should migrate to a local node. This filter avoids excessive ping-ponging
if a page is shared or used by threads that migrate cross-node frequently.
Threads inherit both page tables and the preferred node ID from the
parent. This means that threads can trigger hinting faults earlier than
a new task which delays scanning for a number of seconds. As it can be
load balanced very early in its lifetime there can be an unnecessary delay
before it starts migrating thread-local data. This patch migrates private
pages faster early in the lifetime of a thread using the sequence counter
as an identifier of new tasks.
With this patch applied, STREAM performance is the same as 4.17 even though
processes are not spread cross-node prematurely. Other workloads showed
a mix of minor gains and losses. This is somewhat expected most workloads
are not very sensitive to the starting conditions of a process.
4.19.0-rc5 4.19.0-rc5 4.17.0
numab-v1r1 fastmigrate-v1r1 vanilla
MB/sec copy 43298.52 ( 0.00%) 47335.46 ( 9.32%) 47219.24 ( 9.06%)
MB/sec scale 30115.06 ( 0.00%) 32568.12 ( 8.15%) 32527.56 ( 8.01%)
MB/sec add 32825.12 ( 0.00%) 36078.94 ( 9.91%) 35928.02 ( 9.45%)
MB/sec triad 32549.52 ( 0.00%) 35935.94 ( 10.40%) 35969.88 ( 10.51%)
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Rik van Riel <riel@surriel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jirka Hladky <jhladky@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181001100525.29789-3-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Rate limiting of page migrations due to automatic NUMA balancing was
introduced to mitigate the worst-case scenario of migrating at high
frequency due to false sharing or slowly ping-ponging between nodes.
Since then, a lot of effort was spent on correctly identifying these
pages and avoiding unnecessary migrations and the safety net may no longer
be required.
Jirka Hladky reported a regression in 4.17 due to a scheduler patch that
avoids spreading STREAM tasks wide prematurely. However, once the task
was properly placed, it delayed migrating the memory due to rate limiting.
Increasing the limit fixed the problem for him.
Currently, the limit is hard-coded and does not account for the real
capabilities of the hardware. Even if an estimate was attempted, it would
not properly account for the number of memory controllers and it could
not account for the amount of bandwidth used for normal accesses. Rather
than fudging, this patch simply eliminates the rate limiting.
However, Jirka reports that a STREAM configuration using multiple
processes achieved similar performance to 4.16. In local tests, this patch
improved performance of STREAM relative to the baseline but it is somewhat
machine-dependent. Most workloads show little or not performance difference
implying that there is not a heavily reliance on the throttling mechanism
and it is safe to remove.
STREAM on 2-socket machine
4.19.0-rc5 4.19.0-rc5
numab-v1r1 noratelimit-v1r1
MB/sec copy 43298.52 ( 0.00%) 44673.38 ( 3.18%)
MB/sec scale 30115.06 ( 0.00%) 31293.06 ( 3.91%)
MB/sec add 32825.12 ( 0.00%) 34883.62 ( 6.27%)
MB/sec triad 32549.52 ( 0.00%) 34906.60 ( 7.24%
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Rik van Riel <riel@surriel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jirka Hladky <jhladky@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181001100525.29789-2-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since 890658b7ab48 ("locking/mutex: Kill arch specific code"), there
are no mutex header files under arch/, so we can remove the redundant
entry from MAINTAINERS.
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Jason Low <jason.low2@hpe.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181001142856.GC9716@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
fd_install() moves the reference given to it into the file descriptor table
of the current process. If the current process is multithreaded, then
immediately after fd_install(), another thread can close() the file
descriptor and cause the file's resources to be cleaned up.
Since the reference to "lessee" is held by the file, we must not access
"lessee" after the fd_install() call.
As far as I can tell, to reach this codepath, the caller must have an open
file descriptor to a DRI device in master mode. I'm not sure what the
requirements for that are.
Signed-off-by: Jann Horn <jannh@google.com>
Fixes: 62884cd386b8 ("drm: Add four ioctls for managing drm mode object leases [v7]")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181001153117.216923-1-jannh@google.com
|
|
If NUMA improvement from the task migration is going to be very
minimal, then avoid task migration.
Specjbb2005 results (8 warehouses)
Higher bops are better
2 Socket - 2 Node Haswell - X86
JVMS Prev Current %Change
4 198512 205910 3.72673
1 313559 318491 1.57291
2 Socket - 4 Node Power8 - PowerNV
JVMS Prev Current %Change
8 74761.9 74935.9 0.232739
1 214874 226796 5.54837
2 Socket - 2 Node Power9 - PowerNV
JVMS Prev Current %Change
4 180536 189780 5.12031
1 210281 205695 -2.18089
4 Socket - 4 Node Power7 - PowerVM
JVMS Prev Current %Change
8 56511.4 60370 6.828
1 104899 108100 3.05151
1/7 cases is regressing, if we look at events migrate_pages seem
to vary the most especially in the regressing case. Also some
amount of variance is expected between different runs of
Specjbb2005.
Some events stats before and after applying the patch.
perf stats 8th warehouse Multi JVM 2 Socket - 2 Node Haswell - X86
Event Before After
cs 13,818,546 13,801,554
migrations 1,149,960 1,151,541
faults 385,583 433,246
cache-misses 55,259,546,768 55,168,691,835
sched:sched_move_numa 2,257 2,551
sched:sched_stick_numa 9 24
sched:sched_swap_numa 512 904
migrate:mm_migrate_pages 2,225 1,571
vmstat 8th warehouse Multi JVM 2 Socket - 2 Node Haswell - X86
Event Before After
numa_hint_faults 72692 113682
numa_hint_faults_local 62270 102163
numa_hit 238762 240181
numa_huge_pte_updates 48 36
numa_interleave 75 64
numa_local 238676 240103
numa_other 86 78
numa_pages_migrated 2225 1564
numa_pte_updates 98557 134080
perf stats 8th warehouse Single JVM 2 Socket - 2 Node Haswell - X86
Event Before After
cs 3,173,490 3,079,150
migrations 36,966 31,455
faults 108,776 99,081
cache-misses 12,200,075,320 11,588,126,740
sched:sched_move_numa 1,264 1
sched:sched_stick_numa 0 0
sched:sched_swap_numa 0 0
migrate:mm_migrate_pages 899 36
vmstat 8th warehouse Single JVM 2 Socket - 2 Node Haswell - X86
Event Before After
numa_hint_faults 21109 430
numa_hint_faults_local 17120 77
numa_hit 72934 71277
numa_huge_pte_updates 42 0
numa_interleave 33 22
numa_local 72866 71218
numa_other 68 59
numa_pages_migrated 915 23
numa_pte_updates 42326 0
perf stats 8th warehouse Multi JVM 2 Socket - 2 Node Power9 - PowerNV
Event Before After
cs 8,312,022 8,707,565
migrations 231,705 171,342
faults 310,242 310,820
cache-misses 402,324,573 136,115,400
sched:sched_move_numa 193 215
sched:sched_stick_numa 0 6
sched:sched_swap_numa 3 24
migrate:mm_migrate_pages 93 162
vmstat 8th warehouse Multi JVM 2 Socket - 2 Node Power9 - PowerNV
Event Before After
numa_hint_faults 11838 8985
numa_hint_faults_local 11216 8154
numa_hit 90689 93819
numa_huge_pte_updates 0 0
numa_interleave 1579 882
numa_local 89634 93496
numa_other 1055 323
numa_pages_migrated 92 169
numa_pte_updates 12109 9217
perf stats 8th warehouse Single JVM 2 Socket - 2 Node Power9 - PowerNV
Event Before After
cs 2,170,481 2,152,072
migrations 10,126 10,704
faults 160,962 164,376
cache-misses 10,834,845 3,818,437
sched:sched_move_numa 10 16
sched:sched_stick_numa 0 0
sched:sched_swap_numa 0 7
migrate:mm_migrate_pages 2 199
vmstat 8th warehouse Single JVM 2 Socket - 2 Node Power9 - PowerNV
Event Before After
numa_hint_faults 403 2248
numa_hint_faults_local 358 1666
numa_hit 25898 25704
numa_huge_pte_updates 0 0
numa_interleave 207 200
numa_local 25860 25679
numa_other 38 25
numa_pages_migrated 2 197
numa_pte_updates 400 2234
perf stats 8th warehouse Multi JVM 4 Socket - 4 Node Power7 - PowerVM
Event Before After
cs 110,339,633 93,330,595
migrations 4,139,812 4,122,061
faults 863,622 865,979
cache-misses 231,838,045,660 225,395,083,479
sched:sched_move_numa 2,196 2,372
sched:sched_stick_numa 33 24
sched:sched_swap_numa 544 769
migrate:mm_migrate_pages 2,469 1,677
vmstat 8th warehouse Multi JVM 4 Socket - 4 Node Power7 - PowerVM
Event Before After
numa_hint_faults 85748 91638
numa_hint_faults_local 66831 78096
numa_hit 242213 242225
numa_huge_pte_updates 0 0
numa_interleave 0 2
numa_local 242211 242219
numa_other 2 6
numa_pages_migrated 2376 1515
numa_pte_updates 86233 92274
perf stats 8th warehouse Single JVM 4 Socket - 4 Node Power7 - PowerVM
Event Before After
cs 59,331,057 51,487,271
migrations 552,019 537,170
faults 266,586 256,921
cache-misses 73,796,312,990 70,073,831,187
sched:sched_move_numa 981 576
sched:sched_stick_numa 54 24
sched:sched_swap_numa 286 327
migrate:mm_migrate_pages 713 726
vmstat 8th warehouse Single JVM 4 Socket - 4 Node Power7 - PowerVM
Event Before After
numa_hint_faults 14807 12000
numa_hint_faults_local 5738 5024
numa_hit 36230 36470
numa_huge_pte_updates 0 0
numa_interleave 0 0
numa_local 36228 36465
numa_other 2 5
numa_pages_migrated 703 726
numa_pte_updates 14742 11930
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jirka Hladky <jhladky@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1537552141-27815-7-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Since this spinlock will only serialize the migrate rate limiting,
convert the spin_lock() to a spin_trylock(). If another thread is updating, this
task can move on.
Specjbb2005 results (8 warehouses)
Higher bops are better
2 Socket - 2 Node Haswell - X86
JVMS Prev Current %Change
4 205332 198512 -3.32145
1 319785 313559 -1.94693
2 Socket - 4 Node Power8 - PowerNV
JVMS Prev Current %Change
8 74912 74761.9 -0.200368
1 206585 214874 4.01239
2 Socket - 2 Node Power9 - PowerNV
JVMS Prev Current %Change
4 189162 180536 -4.56011
1 213760 210281 -1.62753
4 Socket - 4 Node Power7 - PowerVM
JVMS Prev Current %Change
8 58736.8 56511.4 -3.78877
1 105419 104899 -0.49327
Avoiding stretching of window intervals may be the reason for the
regression. Also code now uses READ_ONCE/WRITE_ONCE. That may
also be hurting performance to some extent.
Some events stats before and after applying the patch.
perf stats 8th warehouse Multi JVM 2 Socket - 2 Node Haswell - X86
Event Before After
cs 14,285,708 13,818,546
migrations 1,180,621 1,149,960
faults 339,114 385,583
cache-misses 55,205,631,894 55,259,546,768
sched:sched_move_numa 843 2,257
sched:sched_stick_numa 6 9
sched:sched_swap_numa 219 512
migrate:mm_migrate_pages 365 2,225
vmstat 8th warehouse Multi JVM 2 Socket - 2 Node Haswell - X86
Event Before After
numa_hint_faults 26907 72692
numa_hint_faults_local 24279 62270
numa_hit 239771 238762
numa_huge_pte_updates 0 48
numa_interleave 68 75
numa_local 239688 238676
numa_other 83 86
numa_pages_migrated 363 2225
numa_pte_updates 27415 98557
perf stats 8th warehouse Single JVM 2 Socket - 2 Node Haswell - X86
Event Before After
cs 3,202,779 3,173,490
migrations 37,186 36,966
faults 106,076 108,776
cache-misses 12,024,873,744 12,200,075,320
sched:sched_move_numa 931 1,264
sched:sched_stick_numa 0 0
sched:sched_swap_numa 1 0
migrate:mm_migrate_pages 637 899
vmstat 8th warehouse Single JVM 2 Socket - 2 Node Haswell - X86
Event Before After
numa_hint_faults 17409 21109
numa_hint_faults_local 14367 17120
numa_hit 73953 72934
numa_huge_pte_updates 20 42
numa_interleave 25 33
numa_local 73892 72866
numa_other 61 68
numa_pages_migrated 668 915
numa_pte_updates 27276 42326
perf stats 8th warehouse Multi JVM 2 Socket - 2 Node Power9 - PowerNV
Event Before After
cs 8,474,013 8,312,022
migrations 254,934 231,705
faults 320,506 310,242
cache-misses 110,580,458 402,324,573
sched:sched_move_numa 725 193
sched:sched_stick_numa 0 0
sched:sched_swap_numa 7 3
migrate:mm_migrate_pages 145 93
vmstat 8th warehouse Multi JVM 2 Socket - 2 Node Power9 - PowerNV
Event Before After
numa_hint_faults 22797 11838
numa_hint_faults_local 21539 11216
numa_hit 89308 90689
numa_huge_pte_updates 0 0
numa_interleave 865 1579
numa_local 88955 89634
numa_other 353 1055
numa_pages_migrated 149 92
numa_pte_updates 22930 12109
perf stats 8th warehouse Single JVM 2 Socket - 2 Node Power9 - PowerNV
Event Before After
cs 2,195,628 2,170,481
migrations 11,179 10,126
faults 149,656 160,962
cache-misses 8,117,515 10,834,845
sched:sched_move_numa 49 10
sched:sched_stick_numa 0 0
sched:sched_swap_numa 0 0
migrate:mm_migrate_pages 5 2
vmstat 8th warehouse Single JVM 2 Socket - 2 Node Power9 - PowerNV
Event Before After
numa_hint_faults 3577 403
numa_hint_faults_local 3476 358
numa_hit 26142 25898
numa_huge_pte_updates 0 0
numa_interleave 358 207
numa_local 26042 25860
numa_other 100 38
numa_pages_migrated 5 2
numa_pte_updates 3587 400
perf stats 8th warehouse Multi JVM 4 Socket - 4 Node Power7 - PowerVM
Event Before After
cs 100,602,296 110,339,633
migrations 4,135,630 4,139,812
faults 789,256 863,622
cache-misses 226,160,621,058 231,838,045,660
sched:sched_move_numa 1,366 2,196
sched:sched_stick_numa 16 33
sched:sched_swap_numa 374 544
migrate:mm_migrate_pages 1,350 2,469
vmstat 8th warehouse Multi JVM 4 Socket - 4 Node Power7 - PowerVM
Event Before After
numa_hint_faults 47857 85748
numa_hint_faults_local 39768 66831
numa_hit 240165 242213
numa_huge_pte_updates 0 0
numa_interleave 0 0
numa_local 240165 242211
numa_other 0 2
numa_pages_migrated 1224 2376
numa_pte_updates 48354 86233
perf stats 8th warehouse Single JVM 4 Socket - 4 Node Power7 - PowerVM
Event Before After
cs 58,515,496 59,331,057
migrations 564,845 552,019
faults 245,807 266,586
cache-misses 73,603,757,976 73,796,312,990
sched:sched_move_numa 996 981
sched:sched_stick_numa 10 54
sched:sched_swap_numa 193 286
migrate:mm_migrate_pages 646 713
vmstat 8th warehouse Single JVM 4 Socket - 4 Node Power7 - PowerVM
Event Before After
numa_hint_faults 13422 14807
numa_hint_faults_local 5619 5738
numa_hit 36118 36230
numa_huge_pte_updates 0 0
numa_interleave 0 0
numa_local 36116 36228
numa_other 2 2
numa_pages_migrated 616 703
numa_pte_updates 13374 14742
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jirka Hladky <jhladky@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1537552141-27815-6-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|