Age | Commit message (Collapse) | Author |
|
'perf top' stopped working on hw architectures that do not provide a
get_cpuid() implementation and thus fallback to the weak get_cpuid()
default function.
This is done because at annotation time we may need it in the arch
specific annotation init routine, but that is only being used by arches
that do provide a get_cpuid() implementation:
$ find tools/ -name "*.[ch]" | xargs grep 'evlist->env'
tools/perf/builtin-top.c: top.evlist->env = &perf_env;
tools/perf/util/evsel.c: return evsel->evlist->env;
tools/perf/util/s390-cpumsf.c: sf->machine_type = s390_cpumsf_get_type(session->evlist->env->cpuid);
tools/perf/util/header.c: session->evlist->env = &header->env;
tools/perf/util/sample-raw.c: const char *arch_pf = perf_env__arch(evlist->env);
$
$ find tools/perf/arch -name "*.[ch]" | xargs grep -w get_cpuid
tools/perf/arch/x86/util/auxtrace.c: ret = get_cpuid(buffer, sizeof(buffer));
tools/perf/arch/x86/util/header.c:get_cpuid(char *buffer, size_t sz)
tools/perf/arch/powerpc/util/header.c:get_cpuid(char *buffer, size_t sz)
tools/perf/arch/s390/util/header.c: * Implementation of get_cpuid().
tools/perf/arch/s390/util/header.c:int get_cpuid(char *buffer, size_t sz)
tools/perf/arch/s390/util/header.c: if (buf && get_cpuid(buf, 128))
$
For 'report' or 'script', i.e. tools working on perf.data files, that is
setup while reading the header, its just top that needs to explicitely
read it at tool start.
Fixes: 608127f73779 ("perf top: Initialize perf_env->cpuid, needed by the per arch annotation init routine")
Reported-by: John Garry <john.garry@huawei.com>
Analysed-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: John Garry <john.garry@huawei.com> # arm64
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-lxwjr0cd2eggzx04a780ffrv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some of the functions calling get_cpuid() propagate back the error it
returns, and all are using errno (positive) values, make the weak
default get_cpuid() function return ENOSYS to be consistent and to allow
checking if this is an arch not providing this function or if a provided
one is having trouble getting the cpuid, to decide if the warning should
be provided to the user or just a debug message should be emitted.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: John Garry <john.garry@huawei.com> # arm64
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-lxwjr0cd2eggzx04a780ffrv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
After original patch was merged there were additional comments/requests
provided by Rob Herring [1]. Mostly they are related to json-schema usage,
and this patch fixes them. Also SPDX-License-Identifier has been changed to
(GPL-2.0-only OR BSD-2-Clause) as requested.
[1] https://lkml.org/lkml/2019/11/21/875
Fixes: ef63fe72f698 ("dt-bindings: net: ti: add new cpsw switch driver bindings")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
[robh: Remove 2 more maxItems that aren't necessary]
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
If the optional wdg interrupt is defined, then this property
may be defined.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
spin_unlock_irqrestore() might be called with stale flags after
reading port status, possibly restoring interrupts to a incorrect
state.
If a usb2 port just finished resuming while the port status is read
the spin lock will be temporary released and re-acquired in a separate
function. The flags parameter is passed as value instead of a pointer,
not updating flags properly before the final spin_unlock_irqrestore()
is called.
Cc: <stable@vger.kernel.org> # v3.12+
Fixes: 8b3d45705e54 ("usb: Fix xHCI host issues on remote wakeup.")
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20191211142007.8847-7-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
xhci driver claims it needs XHCI_TRUST_TX_LENGTH quirk for both
Broadcom/Cavium and a Renesas xHC controllers.
The quirk was inteded for handling false "success" complete event for
transfers that had data left untransferred.
These transfers should complete with "short packet" events instead.
In these two new cases the false "success" completion is reported
after a "short packet" if the TD consists of several TRBs.
xHCI specs 4.10.1.1.2 say remaining TRBs should report "short packet"
as well after the first short packet in a TD, but this issue seems so
common it doesn't make sense to add the quirk for all vendors.
Turn these events into short packets automatically instead.
This gets rid of the "The WARN Successful completion on short TX for
slot 1 ep 1: needs XHCI_TRUST_TX_LENGTH quirk" warning in many cases.
Cc: <stable@vger.kernel.org>
Reported-by: Eli Billauer <eli.billauer@gmail.com>
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Eli Billauer <eli.billauer@gmail.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20191211142007.8847-6-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
I've recently observed failed xHCI suspend attempt on AMD Raven Ridge
system:
kernel: xhci_hcd 0000:04:00.4: WARN: xHC CMD_RUN timeout
kernel: PM: suspend_common(): xhci_pci_suspend+0x0/0xd0 returns -110
kernel: PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -110
kernel: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns -110
kernel: PM: Device 0000:04:00.4 failed to suspend async: error -110
Similar to commit ac343366846a ("xhci: Increase STS_SAVE timeout in
xhci_suspend()") we also need to increase the HALT timeout to make it be
able to suspend again.
Cc: <stable@vger.kernel.org> # 5.2+
Fixes: f7fac17ca925 ("xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20191211142007.8847-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Xhci driver cannot call pci_set_power_state() on non-pci xhci host
controllers. For example, NVIDIA Tegra XHCI host controller which acts
as platform device with XHCI_SPURIOUS_WAKEUP quirk set in some platform
hits this issue during shutdown.
Cc: <stable@vger.kernel.org>
Fixes: 638298dc66ea ("xhci: Fix spurious wakeups after S5 on Haswell")
Signed-off-by: Henry Lin <henryl@nvidia.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20191211142007.8847-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
A race in xhci USB3 remote wake handling may force device back to suspend
after it initiated resume siganaling, causing a missed resume event or warm
reset of device.
When a USB3 link completes resume signaling and goes to enabled (UO)
state a interrupt is issued and the interrupt handler will clear the
bus_state->port_remote_wakeup resume flag, allowing bus suspend.
If the USB3 roothub thread just finished reading port status before
the interrupt, finding ports still in suspended (U3) state, but hasn't
yet started suspending the hub, then the xhci interrupt handler will clear
the flag that prevented roothub suspend and allow bus to suspend, forcing
all port links back to suspended (U3) state.
Example case:
usb_runtime_suspend() # because all ports still show suspended U3
usb_suspend_both()
hub_suspend(); # successful as hub->wakeup_bits not set yet
==> INTERRUPT
xhci_irq()
handle_port_status()
clear bus_state->port_remote_wakeup
usb_wakeup_notification()
sets hub->wakeup_bits;
kick_hub_wq()
<== END INTERRUPT
hcd_bus_suspend()
xhci_bus_suspend() # success as port_remote_wakeup bits cleared
Fix this by increasing roothub usage count during port resume to prevent
roothub autosuspend, and by making sure bus_state->port_remote_wakeup
flag is only cleared after resume completion is visible, i.e.
after xhci roothub returned U0 or other non-U3 link state link on a
get port status request.
Issue rootcaused by Chiasheng Lee
Cc: <stable@vger.kernel.org>
Cc: Lee, Hou-hsun <hou-hsun.lee@intel.com>
Reported-by: Lee, Chiasheng <chiasheng.lee@intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20191211142007.8847-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When xHCI is part of Alpine or Titan Ridge Thunderbolt controller and
the xHCI device is hot-removed as a result of unplugging a dock for
example, the driver leaks memory it allocates for xhci->usb3_rhub.psi
and xhci->usb2_rhub.psi in xhci_add_in_port() as reported by kmemleak:
unreferenced object 0xffff922c24ef42f0 (size 16):
comm "kworker/u16:2", pid 178, jiffies 4294711640 (age 956.620s)
hex dump (first 16 bytes):
21 00 0c 00 12 00 dc 05 23 00 e0 01 00 00 00 00 !.......#.......
backtrace:
[<000000007ac80914>] xhci_mem_init+0xcf8/0xeb7
[<0000000001b6d775>] xhci_init+0x7c/0x160
[<00000000db443fe3>] xhci_gen_setup+0x214/0x340
[<00000000fdffd320>] xhci_pci_setup+0x48/0x110
[<00000000541e1e03>] usb_add_hcd.cold+0x265/0x747
[<00000000ca47a56b>] usb_hcd_pci_probe+0x219/0x3b4
[<0000000021043861>] xhci_pci_probe+0x24/0x1c0
[<00000000b9231f25>] local_pci_probe+0x3d/0x70
[<000000006385c9d7>] pci_device_probe+0xd0/0x150
[<0000000070241068>] really_probe+0xf5/0x3c0
[<0000000061f35c0a>] driver_probe_device+0x58/0x100
[<000000009da11198>] bus_for_each_drv+0x79/0xc0
[<000000009ce45f69>] __device_attach+0xda/0x160
[<00000000df201aaf>] pci_bus_add_device+0x46/0x70
[<0000000088a1bc48>] pci_bus_add_devices+0x27/0x60
[<00000000ad9ee708>] pci_bus_add_devices+0x52/0x60
unreferenced object 0xffff922c24ef3318 (size 8):
comm "kworker/u16:2", pid 178, jiffies 4294711640 (age 956.620s)
hex dump (first 8 bytes):
34 01 05 00 35 41 0a 00 4...5A..
backtrace:
[<000000007ac80914>] xhci_mem_init+0xcf8/0xeb7
[<0000000001b6d775>] xhci_init+0x7c/0x160
[<00000000db443fe3>] xhci_gen_setup+0x214/0x340
[<00000000fdffd320>] xhci_pci_setup+0x48/0x110
[<00000000541e1e03>] usb_add_hcd.cold+0x265/0x747
[<00000000ca47a56b>] usb_hcd_pci_probe+0x219/0x3b4
[<0000000021043861>] xhci_pci_probe+0x24/0x1c0
[<00000000b9231f25>] local_pci_probe+0x3d/0x70
[<000000006385c9d7>] pci_device_probe+0xd0/0x150
[<0000000070241068>] really_probe+0xf5/0x3c0
[<0000000061f35c0a>] driver_probe_device+0x58/0x100
[<000000009da11198>] bus_for_each_drv+0x79/0xc0
[<000000009ce45f69>] __device_attach+0xda/0x160
[<00000000df201aaf>] pci_bus_add_device+0x46/0x70
[<0000000088a1bc48>] pci_bus_add_devices+0x27/0x60
[<00000000ad9ee708>] pci_bus_add_devices+0x52/0x60
Fix this by calling kfree() for the both psi objects in
xhci_mem_cleanup().
Cc: <stable@vger.kernel.org> # 4.4+
Fixes: 47189098f8be ("xhci: parse xhci protocol speed ID list for usb 3.1 usage")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20191211142007.8847-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus
Felipe writes:
USB: fixes for v5.5-rc2
Only four patches here this time around. Three of them are on dwc3
fixing some small bugs related to our 'started' flag.
None are major fixes but they're important nevertheless.
* tag 'fixes-for-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb:
usb: gadget: fix wrong endpoint desc
usb: dwc3: ep0: Clear started flag on completion
usb: dwc3: gadget: Clear started flag for non-IOC
usb: dwc3: gadget: Fix logical condition
|
|
Since retirement may be running in a worker on another CPU, it may be
skipped in the local intel_gt_wait_for_idle(). To ensure the state is
consistent for our sanity checks upon load, serialise with the remote
retirer by waiting on the timeline->mutex.
Outside of this use case, e.g. on suspend or module unload, we expect the
slack to be picked up by intel_gt_pm_wait_for_idle() and so prefer to
put the special case serialisation with retirement in its single user,
for now at least.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191121071044.97798-2-chris@chris-wilson.co.uk
(cherry picked from commit 2d0fb251360ab7eccbffd99f6933a2a4de678d52)
Fixes: 093b92287363 ("drm/i915: Split i915_active.mutex into an irq-safe spinlock for the rbtree")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/754
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
We managed to get confused about the shift direction at least once.
Let's switch to division/multiplcation instead. Add a number of pages
macro for this purpose. We still keep the order macro around too since
this is what alloc/free pages want.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
|
|
free_page_order is a confusing name. It's not a page order
actually, it's the order of the block of memory we are hinting.
Rename to hint_block_order. Also, rename SIZE to BYTES
to make it clear it's the block size in bytes.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
|
|
In case we have to migrate a ballon page to a newpage of another zone, the
managed page count of both zones is wrong. Paired with memory offlining
(which will adjust the managed page count), we can trigger kernel crashes
and all kinds of different symptoms.
One way to reproduce:
1. Start a QEMU guest with 4GB, no NUMA
2. Hotplug a 1GB DIMM and online the memory to ZONE_NORMAL
3. Inflate the balloon to 1GB
4. Unplug the DIMM (be quick, otherwise unmovable data ends up on it)
5. Observe /proc/zoneinfo
Node 0, zone Normal
pages free 16810
min 24848885473806
low 18471592959183339
high 36918337032892872
spanned 262144
present 262144
managed 18446744073709533486
6. Do anything that requires some memory (e.g., inflate the balloon some
more). The OOM goes crazy and the system crashes
[ 238.324946] Out of memory: Killed process 537 (login) total-vm:27584kB, anon-rss:860kB, file-rss:0kB, shmem-rss:00
[ 238.338585] systemd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ 238.339420] CPU: 0 PID: 1 Comm: systemd Tainted: G D W 5.4.0-next-20191204+ #75
[ 238.340139] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu4
[ 238.341121] Call Trace:
[ 238.341337] dump_stack+0x8f/0xd0
[ 238.341630] dump_header+0x61/0x5ea
[ 238.341942] oom_kill_process.cold+0xb/0x10
[ 238.342299] out_of_memory+0x24d/0x5a0
[ 238.342625] __alloc_pages_slowpath+0xd12/0x1020
[ 238.343024] __alloc_pages_nodemask+0x391/0x410
[ 238.343407] pagecache_get_page+0xc3/0x3a0
[ 238.343757] filemap_fault+0x804/0xc30
[ 238.344083] ? ext4_filemap_fault+0x28/0x42
[ 238.344444] ext4_filemap_fault+0x30/0x42
[ 238.344789] __do_fault+0x37/0x1a0
[ 238.345087] __handle_mm_fault+0x104d/0x1ab0
[ 238.345450] handle_mm_fault+0x169/0x360
[ 238.345790] do_user_addr_fault+0x20d/0x490
[ 238.346154] do_page_fault+0x31/0x210
[ 238.346468] async_page_fault+0x43/0x50
[ 238.346797] RIP: 0033:0x7f47eba4197e
[ 238.347110] Code: Bad RIP value.
[ 238.347387] RSP: 002b:00007ffd7c0c1890 EFLAGS: 00010293
[ 238.347834] RAX: 0000000000000002 RBX: 000055d196a20a20 RCX: 00007f47eba4197e
[ 238.348437] RDX: 0000000000000033 RSI: 00007ffd7c0c18c0 RDI: 0000000000000004
[ 238.349047] RBP: 00007ffd7c0c1c20 R08: 0000000000000000 R09: 0000000000000033
[ 238.349660] R10: 00000000ffffffff R11: 0000000000000293 R12: 0000000000000001
[ 238.350261] R13: ffffffffffffffff R14: 0000000000000000 R15: 00007ffd7c0c18c0
[ 238.350878] Mem-Info:
[ 238.351085] active_anon:3121 inactive_anon:51 isolated_anon:0
[ 238.351085] active_file:12 inactive_file:7 isolated_file:0
[ 238.351085] unevictable:0 dirty:0 writeback:0 unstable:0
[ 238.351085] slab_reclaimable:5565 slab_unreclaimable:10170
[ 238.351085] mapped:3 shmem:111 pagetables:155 bounce:0
[ 238.351085] free:720717 free_pcp:2 free_cma:0
[ 238.353757] Node 0 active_anon:12484kB inactive_anon:204kB active_file:48kB inactive_file:28kB unevictable:0kB iss
[ 238.355979] Node 0 DMA free:11556kB min:36kB low:48kB high:60kB reserved_highatomic:0KB active_anon:152kB inactivB
[ 238.358345] lowmem_reserve[]: 0 2955 2884 2884 2884
[ 238.358761] Node 0 DMA32 free:2677864kB min:7004kB low:10028kB high:13052kB reserved_highatomic:0KB active_anon:0B
[ 238.361202] lowmem_reserve[]: 0 0 72057594037927865 72057594037927865 72057594037927865
[ 238.361888] Node 0 Normal free:193448kB min:99395541895224kB low:73886371836733356kB high:147673348131571488kB reB
[ 238.364765] lowmem_reserve[]: 0 0 0 0 0
[ 238.365101] Node 0 DMA: 7*4kB (U) 5*8kB (UE) 6*16kB (UME) 2*32kB (UM) 1*64kB (U) 2*128kB (UE) 3*256kB (UME) 2*512B
[ 238.366379] Node 0 DMA32: 0*4kB 1*8kB (U) 2*16kB (UM) 2*32kB (UM) 2*64kB (UM) 1*128kB (U) 1*256kB (U) 1*512kB (U)B
[ 238.367654] Node 0 Normal: 1985*4kB (UME) 1321*8kB (UME) 844*16kB (UME) 524*32kB (UME) 300*64kB (UME) 138*128kB (B
[ 238.369184] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 238.369915] 130 total pagecache pages
[ 238.370241] 0 pages in swap cache
[ 238.370533] Swap cache stats: add 0, delete 0, find 0/0
[ 238.370981] Free swap = 0kB
[ 238.371239] Total swap = 0kB
[ 238.371488] 1048445 pages RAM
[ 238.371756] 0 pages HighMem/MovableOnly
[ 238.372090] 306992 pages reserved
[ 238.372376] 0 pages cma reserved
[ 238.372661] 0 pages hwpoisoned
In another instance (older kernel), I was able to observe this
(negative page count :/):
[ 180.896971] Offlined Pages 32768
[ 182.667462] Offlined Pages 32768
[ 184.408117] Offlined Pages 32768
[ 186.026321] Offlined Pages 32768
[ 187.684861] Offlined Pages 32768
[ 189.227013] Offlined Pages 32768
[ 190.830303] Offlined Pages 32768
[ 190.833071] Built 1 zonelists, mobility grouping on. Total pages: -36920272750453009
In another instance (older kernel), I was no longer able to start any
process:
[root@vm ~]# [ 214.348068] Offlined Pages 32768
[ 215.973009] Offlined Pages 32768
cat /proc/meminfo
-bash: fork: Cannot allocate memory
[root@vm ~]# cat /proc/meminfo
-bash: fork: Cannot allocate memory
Fix it by properly adjusting the managed page count when migrating if
the zone changed. The managed page count of the zones now looks after
unplug of the DIMM (and after deflating the balloon) just like before
inflating the balloon (and plugging+onlining the DIMM).
We'll temporarily modify the totalram page count. If this ever becomes a
problem, we can fine tune by providing helpers that don't touch
the totalram pages (e.g., adjust_zone_managed_page_count()).
Please note that fixing up the managed page count is only necessary when
we adjusted the managed page count when inflating - only if we
don't have VIRTIO_BALLOON_F_DEFLATE_ON_OOM. With that feature, the
managed page count is not touched when inflating/deflating.
Reported-by: Yumei Huang <yuhuang@redhat.com>
Fixes: 3dcc0571cd64 ("mm: correctly update zone->managed_pages")
Cc: <stable@vger.kernel.org> # v3.11+
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
To pick up the changes from:
22945688acd4 ("KVM: PPC: Book3S HV: Support reset of secure guest")
No tools changes are caused by this, as the only defines so far used
from these files are for syscall arg pretty printing are:
$ grep KVM tools/perf/trace/beauty/*.sh
tools/perf/trace/beauty/kvm_ioctl.sh:regex='^#[[:space:]]*define[[:space:]]+KVM_(\w+)[[:space:]]+_IO[RW]*\([[:space:]]*KVMIO[[:space:]]*,[[:space:]]*(0x[[:xdigit:]]+).*'
$
This addresses these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Link: https://lkml.kernel.org/n/tip-bdbe4x02johhul05a03o27zj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up BPF fixes to allow a clean 'make -C tools/perf build-test':
7c3977d1e804 libbpf: Fix sym->st_value print on 32-bit arches
1fd450f99272 libbpf: Fix up generation of bpf_helper_defs.h
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When the kptr_restrict sysctl is set, the kernel can fail to return
jited_ksyms or jited_prog_insns, but still have positive values in
nr_jited_ksyms and jited_prog_len. This causes bpftool to crash when
trying to dump the program because it only checks the len fields not
the actual pointers to the instructions and ksyms.
Fix this by adding the missing checks.
Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
Fixes: f84192ee00b7 ("tools: bpftool: resolve calls without using imm field")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191210181412.151226-1-toke@redhat.com
|
|
Building with -Werror showed another failure:
kernel/bpf/btf.c: In function 'btf_get_prog_ctx_type.isra.31':
kernel/bpf/btf.c:3508:63: error: array subscript 0 is above array bounds of 'u8[0]' {aka 'unsigned char[0]'} [-Werror=array-bounds]
ctx_type = btf_type_member(conv_struct) + bpf_ctx_convert_map[prog_type] * 2;
I don't actually understand why the array is empty, but a similar
fix has addressed a related problem, so I suppose we can do the
same thing here.
Fixes: ce27709b8162 ("bpf: Fix build in minimal configurations")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191210203553.2941035-1-arnd@arndb.de
|
|
All BPF JIT compilers except RISC-V's and MIPS' enforce a 33-tail calls
limit at runtime. In addition, a test was recently added, in tailcalls2,
to check this limit.
This patch updates the tail call limit in MIPS' JIT compiler to allow
33 tail calls.
Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.")
Reported-by: Mahshid Khezri <khezri.mahshid@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/b8eb2caac1c25453c539248e56ca22f74b5316af.1575916815.git.paul.chaignon@gmail.com
|
|
All BPF JIT compilers except RISC-V's and MIPS' enforce a 33-tail calls
limit at runtime. In addition, a test was recently added, in tailcalls2,
to check this limit.
This patch updates the tail call limit in RISC-V's JIT compiler to allow
33 tail calls. I tested it using the above selftest on an emulated
RISCV64.
Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G")
Reported-by: Mahshid Khezri <khezri.mahshid@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/966fe384383bf23a0ee1efe8d7291c78a3fb832b.1575916815.git.paul.chaignon@gmail.com
|
|
The temperature sensor may jump backwards because there is a wrong
calibration value. Both values have to be monotonically increasing.
Fix it.
This was tested on a custom board.
Fixes: 571cebfe8e2b ("arm64: dts: ls1028a: Add Thermal Monitor Unit node")
Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Tang Yuantian <andy.tang@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
Need to align with deactivate, should only use vgpu's lock for
active state setting instead of gvt lock.
Fixes: f25a49ab8ab9 ("drm/i915/gvt: Use vgpu_lock to protect per vgpu access")
Cc: Colin Xu <colin.xu@intel.com>
Reviewed-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20191202070109.73924-2-zhenyuw@linux.intel.com
|
|
Fix commit 7b81cb6bddd2 ("usb: add a HCD_DMA flag instead of
guestimating DMA capabilities") where local memory USB drivers
erroneously allocate DMA memory instead of pool memory, causing
OHCI Unrecoverable Error, disabled
HC died; cleaning up
The order between hcd_uses_dma() and hcd->localmem_pool is now
arranged as in hcd_buffer_alloc() and hcd_buffer_free(), with the
test for hcd->localmem_pool placed first.
As an alternative, one might consider adjusting hcd_uses_dma() with
static inline bool hcd_uses_dma(struct usb_hcd *hcd)
{
- return IS_ENABLED(CONFIG_HAS_DMA) && (hcd->driver->flags & HCD_DMA);
+ return IS_ENABLED(CONFIG_HAS_DMA) &&
+ (hcd->driver->flags & HCD_DMA) &&
+ (hcd->localmem_pool == NULL);
}
One can also consider unsetting HCD_DMA for local memory pool drivers.
Fixes: 7b81cb6bddd2 ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Fredrik Noring <noring@nocrew.org>
Link: https://lore.kernel.org/r/20191210172905.GA52526@sx9
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
As a preparation for an API conversion, factor out something frequently
used in the media subsystem. As an improvement, it bails out on both,
NULL and ERRPTR to handle the old and new API.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The usage of readl_poll_timeout is wrong, the 3rd parameter(cond)
should be "val & LOCK_STATUS" not "val & LOCK_TIMEOUT_US",
It is not check whether the pll locked, LOCK_STATUS reflects the mask,
not LOCK_TIMEOUT_US.
Fixes: 8646d4dcc7fb ("clk: imx: Add PLLs driver for imx8mm soc")
Cc: <stable@vger.kernel.org>
Reviewed-by: Abel Vesa <abel.vesa@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
After the commit 8267ff89b713 ("ARM: imx: Add serial number support for i.MX6/7 SoCs")
the kernel doesn't start on i.MX6ULL/ULZ SoC.
Tested on next-20191205.
For i.MX6ULL/ULZ the variable "ocotp_compat" is set to "fsl,imx6ul-ocotp", but with commit
ffbc34bf0e9c ("nvmem: imx-ocotp: Implement i.MX6ULL/ULZ support") and commit
f243bc821ee3 ("ARM: dts: imx6ull: Fix i.MX6ULL/ULZ ocotp compatible") the value
"fsl,imx6ull-ocotp" is already defined and set in device tree...
By setting "ocotp_compat" to "fsl,imx6ull-ocotp" the kernel does boot.
Fixes: 8267ff89b713 ("ARM: imx: Add serial number support for i.MX6/7 SoCs")
Signed-off-by: Christoph Niedermaier <cniedermaier@dh-electronics.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
The value being used for guest_nice should be CPUTIME_GUEST_NICE
and not CPUTIME_USER.
Fixes: 26dae145a76c ("procfs: Use all-in-one vtime aware kcpustat accessor")
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20191205020344.14940-1-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
After applying the fixup ALC274_FIXUP_DELL_AIO_LINEOUT_VERB, the
Line-out jack works well. And instead of adding a new set of pin
definition in the pin_fixup_tbl, we put a more generic matching entry
in the fallback_pin_fixup_tbl.
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20191211051321.5883-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
In netpoll the napi handler could be called with budget equal to zero.
Current ENA napi handler doesn't take that into consideration.
The napi handler handles Rx packets in a do-while loop.
Currently, the budget check happens only after decrementing the
budget, therefore the napi handler, in rare cases, could run over
MAX_INT packets.
In addition to that, this moves all budget related variables to int
calculation and stop mixing u32 to avoid ambiguity
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tuong Lien says:
====================
tipc: fix some issues
This series consists of some bug-fixes for TIPC.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the function 'tipc_disc_rcv()', the 'msg_peer_net_hash()' is called
to read the header data field but after the message skb has been freed,
that might result in a garbage value...
This commit fixes it by defining a new local variable to store the data
first, just like the other header fields' handling.
Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a user message is sent, TIPC will check if the socket has faced a
congestion at link layer. If that happens, it will make a sleep to wait
for the congestion to disappear. This leaves a gap for other users to
take over the socket (e.g. multi threads) since the socket is released
as well. Also, in case of connectionless (e.g. SOCK_RDM), user is free
to send messages to various destinations (e.g. via 'sendto()'), then
the socket's preformatted header has to be updated correspondingly
prior to the actual payload message building.
Unfortunately, the latter action is done before the first action which
causes a condition issue that the destination of a certain message can
be modified incorrectly in the middle, leading to wrong destination
when that message is built. Consequently, when the message is sent to
the link layer, it gets stuck there forever because the peer node will
simply reject it. After a number of retransmission attempts, the link
is eventually taken down and the retransmission failure is reported.
This commit fixes the problem by rearranging the order of actions to
prevent the race condition from occurring, so the message building is
'atomic' and its header will not be modified by anyone.
Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit c55c8edafa91 ("tipc: smooth change between replicast and
broadcast"), we allow instant switching between replicast and broadcast
by sending a dummy 'SYN' packet on the last used link to synchronize
packets on the links. The 'SYN' message is an object of link congestion
also, so if that happens, a 'SOCK_WAKEUP' will be scheduled to be sent
back to the socket...
However, in that commit, we simply use the same socket 'cong_link_cnt'
counter for both the 'SYN' & normal payload message sending. Therefore,
if both the replicast & broadcast links are congested, the counter will
be not updated correctly but overwritten by the latter congestion.
Later on, when the 'SOCK_WAKEUP' messages are processed, the counter is
reduced one by one and eventually overflowed. Consequently, further
activities on the socket will only wait for the false congestion signal
to disappear but never been met.
Because sending the 'SYN' message is vital for the mechanism, it should
be done anyway. This commit fixes the issue by marking the message with
an error code e.g. 'TIPC_ERR_NO_PORT', so its sending should not face a
link congestion, there is no need to touch the socket 'cong_link_cnt'
either. In addition, in the event of any error (e.g. -ENOBUFS), we will
purge the entire payload message queue and make a return immediately.
Fixes: c55c8edafa91 ("tipc: smooth change between replicast and broadcast")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current rbtree for service ranges in the name table is built based
on the 'lower' & 'upper' range values resulting in a flaw in the rbtree
searching. Some issues have been observed in case of range overlapping:
Case #1: unable to withdraw a name entry:
After some name services are bound, all of them are withdrawn by user
but one remains in the name table forever. This corrupts the table and
that service becomes dummy i.e. no real port.
E.g.
/
{22, 22}
/
/
---> {10, 50}
/ \
/ \
{10, 30} {20, 60}
The node {10, 30} cannot be removed since the rbtree searching stops at
the node's ancestor i.e. {10, 50}, so starting from it will never reach
the finding node.
Case #2: failed to send data in some cases:
E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for
this service will be one of the two cases below depending on the order
of the bindings:
{20, 60} {10, 50} <--
/ \ / \
/ \ / \
{10, 50} NIL <-- NIL {20, 60}
(a) (b)
Now, try to send some data to service {30}, there will be two results:
(a): Failed, no route to host.
(b): Ok.
The reason is that the rbtree searching will stop at the pointing node
as shown above.
Case #3: Same as case #2b above but if the data sending's scope is
local and the {10, 50} is published by a peer node, then it will result
in 'no route to host' even though the other {20, 60} is for example on
the local node which should be able to get the data.
The issues are actually due to the way we built the rbtree. This commit
fixes it by introducing an additional field to each node - named 'max',
which is the largest 'upper' of that node subtree. The 'max' value for
each subtrees will be propagated correctly whenever a node is inserted/
removed or the tree is rebalanced by the augmented rbtree callbacks.
By this way, we can change the rbtree searching appoarch to solve the
issues above. Another benefit from this is that we can now improve the
searching for a next range matching e.g. in case of multicast, so get
rid of the unneeded looping over all nodes in the tree.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Michael Chan says:
====================
bnxt_en: Error recovery fixes.
This patch series contains fixes mostly for the error recovery feature
and related areas. Please queue the series for -stable also. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The VF driver also needs to create the health reporters since
VFs are also involved in firmware reset and recovery. Modify
bnxt_dl_register() and bnxt_dl_unregister() so that they can
be called by the VFs to register/unregister devlink. Only the PF
will register the devlink parameters. With devlink registered,
we can now create the health reporters on the VFs.
Fixes: 6763c779c2d8 ("bnxt_en: Add new FW devlink_health_reporter")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the logic to properly check the fw capabilities and create the
devlink health reporters only when needed. The current code creates
the reporters unconditionally as long as bp->fw_health is valid, and
that's not correct.
Call bnxt_dl_fw_reporters_create() directly from the init and reset
code path instead of from bnxt_dl_register(). This allows the
reporters to be adjusted when capabilities change. The same
applies to bnxt_dl_fw_reporters_destroy().
Fixes: 6763c779c2d8 ("bnxt_en: Add new FW devlink_health_reporter")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After fixing the allocation of bp->fw_health in the previous patch,
the driver will not go through the fw reset and recovery code paths
if bp->fw_health allocation fails. So we can now remove the
unnecessary NULL checks.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bp->fw_health needs to be allocated for either the firmware initiated
reset feature or the driver initiated error recovery feature. The
current code is not allocating bp->fw_health for all the necessary cases.
This patch corrects the logic to allocate bp->fw_health correctly when
needed. If allocation fails, we clear the feature flags.
We also add the the missing kfree(bp->fw_health) when the driver is
unloaded. If we get an async reset message from the firmware, we also
need to make sure that we have a valid bp->fw_health before proceeding.
Fixes: 07f83d72d238 ("bnxt_en: Discover firmware error recovery capabilities.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If any change happened in the configuration of VF in VM while
collecting live dump, there could be a race and firmware can return
more data than allocated dump length. Fix it by keeping track of
the accumulated core dump length copied so far and abort the copy
with error code if the next chunk of core dump will exceed the
original dump length.
Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This will trigger new context memory to be rediscovered and allocated
during the re-probe process after a firmware reset. Without this, the
newly reset firmware does not have valid context memory and the driver
will eventually fail to allocate some resources.
Fixes: ec5d31e3c15d ("bnxt_en: Handle firmware reset status during IF_UP.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The logic needs to check both bp->total_irqs and the reserved IRQs in
hw_resc->resv_irqs if applicable and see if both are enough to cover
the L2 and RDMA requested vectors. The current code is only checking
bp->total_irqs and can fail in some code paths, such as the TX timeout
code path with the RDMA driver requesting vectors after recovery. In
this code path, we have not reserved enough MSIX resources for the
RDMA driver yet.
Fixes: 75720e6323a1 ("bnxt_en: Keep track of reserved IRQs.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In chasing a performance issue between using IORING_OP_RECVMSG and
IORING_OP_READV on sockets, tracing showed that we always punt the
socket reads to async offload. This is due to io_file_supports_async()
not checking for S_ISSOCK on the inode. Since sockets supports the
O_NONBLOCK (or MSG_DONTWAIT) flag just fine, add sockets to the list
of file types that we can do a non-blocking issue to.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The socket read/write helpers only look at the file O_NONBLOCK. not
the iocb IOCB_NOWAIT flag. This breaks users like preadv2/pwritev2
and io_uring that rely on not having the file itself marked nonblocking,
but rather the iocb itself.
Cc: netdev@vger.kernel.org
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We hash regular files to avoid having multiple threads hammer on the
inode mutex, but it should not be needed on other types of files
(like sockets).
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
One major use case of linked commands is the ability to run the next
link inline, if at all possible. This is done correctly for async
offload, but somewhere along the line we lost the ability to do so when
we were able to complete a request without having to punt it. Ensure
that we do so correctly.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This essentially reverts commit e944475e6984. For high poll ops
workloads, like TAO, the dynamic allocation of the wait_queue
entry for IORING_OP_POLL_ADD adds considerable extra overhead.
Go back to embedding the wait_queue_entry, but keep the usage of
wait->private for the pointer stashing.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Don't just assign it from the main call path, that can miss the case
when we're called from issue deferral.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We use the mutex to guard against registered file updates, for instance.
Ensure we're safe in accessing that state against concurrent updates.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|