Age | Commit message (Collapse) | Author |
|
At the end of rxrpc_release_call(), rxrpc_cleanup_ring() is called to clear
the Rx/Tx skbuff ring, but this doesn't lock the ring whilst it's accessing
it. Unfortunately, rxrpc_resend() might be trying to retransmit a packet
concurrently with this - and whilst it does lock the ring, this isn't
protection against rxrpc_cleanup_call().
Fix this by removing the call to rxrpc_cleanup_ring() from
rxrpc_release_call(). rxrpc_cleanup_ring() will be called again anyway
from rxrpc_cleanup_call(). The earlier call is just an optimisation to
recycle skbuffs more quickly.
Alternative solutions include rxrpc_release_call() could try to cancel the
work item or wait for it to complete or rxrpc_cleanup_ring() could lock
when accessing the ring (which would require a bh lock).
This can produce a report like the following:
BUG: KASAN: use-after-free in rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372
Read of size 4 at addr ffff888011606e04 by task kworker/0:0/5
...
Workqueue: krxrpcd rxrpc_process_call
Call Trace:
...
kasan_report.cold+0x79/0xd5 mm/kasan/report.c:413
rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372
rxrpc_resend net/rxrpc/call_event.c:266 [inline]
rxrpc_process_call+0x1634/0x1f60 net/rxrpc/call_event.c:412
process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
...
Allocated by task 2318:
...
sock_alloc_send_pskb+0x793/0x920 net/core/sock.c:2348
rxrpc_send_data+0xb51/0x2bf0 net/rxrpc/sendmsg.c:358
rxrpc_do_sendmsg+0xc03/0x1350 net/rxrpc/sendmsg.c:744
rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:560
...
Freed by task 2318:
...
kfree_skb+0x140/0x3f0 net/core/skbuff.c:704
rxrpc_free_skb+0x11d/0x150 net/rxrpc/skbuff.c:78
rxrpc_cleanup_ring net/rxrpc/call_object.c:485 [inline]
rxrpc_release_call+0x5dd/0x860 net/rxrpc/call_object.c:552
rxrpc_release_calls_on_socket+0x21c/0x300 net/rxrpc/call_object.c:579
rxrpc_release_sock net/rxrpc/af_rxrpc.c:885 [inline]
rxrpc_release+0x263/0x5a0 net/rxrpc/af_rxrpc.c:916
__sock_release+0xcd/0x280 net/socket.c:597
...
The buggy address belongs to the object at ffff888011606dc0
which belongs to the cache skbuff_head_cache of size 232
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Reported-by: syzbot+174de899852504e4a74a@syzkaller.appspotmail.com
Reported-by: syzbot+3d1c772efafd3c38d007@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hillf Danton <hdanton@sina.com>
Link: https://lore.kernel.org/r/161234207610.653119.5287360098400436976.stgit@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Link: https://lore.kernel.org/r/20210202182511.8109-1-rajur@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.11-rc7:
- Skip vswing programming for TBT
- Power up combo PHY lanes for HDMI
- Fix double YUV range correction on HDR planes
- Fix the MST PBN divider calculation
- Fix LTTPR vswing/pre-emp setting in non-transparent mode
- Move the breadcrumb to the signaler if completed upon cancel
- Close race between enable_breadcrumbs and cancel_breadcrumbs
- Drop lru bumping on display unpinning
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87bld0f36b.fsf@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Revert ASPM suspend/resume fix that regressed NVMe devices (Bjorn
Helgaas)"
* tag 'pci-v5.11-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI/ASPM: Save/restore L1SS Capability for suspend/resume"
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.11-2021-02-03:
amdgpu:
- Fix retry in gem create
- Vangogh fixes
- Fix for display from shared buffers
- Various display fixes
amdkfd:
- Fix regression in buffer free
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210204041300.4425-1-alexander.deucher@amd.com
|
|
Since SQPOLL task can be shared and so task_work entries can be a mix of
them, we need to drop mm and files before trying to issue next request.
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Since commit 23025393dbeb3b8b3 ("xen/netback: use lateeoi irq binding")
xenvif_rx_ring_slots_available() is no longer called only from the rx
queue kernel thread, so it needs to access the rx queue with the
associated queue held.
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Fixes: 23025393dbeb3b8b3 ("xen/netback: use lateeoi irq binding")
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Link: https://lore.kernel.org/r/20210202070938.7863-1-jgross@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jan Kiszka reported that the x2apic_wrmsr_fence() function uses a plain
MFENCE while the Intel SDM (10.12.3 MSR Access in x2APIC Mode) calls for
MFENCE; LFENCE.
Short summary: we have special MSRs that have weaker ordering than all
the rest. Add fencing consistent with current SDM recommendations.
This is not known to cause any issues in practice, only in theory.
Longer story below:
The reason the kernel uses a different semantic is that the SDM changed
(roughly in late 2017). The SDM changed because folks at Intel were
auditing all of the recommended fences in the SDM and realized that the
x2apic fences were insufficient.
Why was the pain MFENCE judged insufficient?
WRMSR itself is normally a serializing instruction. No fences are needed
because the instruction itself serializes everything.
But, there are explicit exceptions for this serializing behavior written
into the WRMSR instruction documentation for two classes of MSRs:
IA32_TSC_DEADLINE and the X2APIC MSRs.
Back to x2apic: WRMSR is *not* serializing in this specific case.
But why is MFENCE insufficient? MFENCE makes writes visible, but
only affects load/store instructions. WRMSR is unfortunately not a
load/store instruction and is unaffected by MFENCE. This means that a
non-serializing WRMSR could be reordered by the CPU to execute before
the writes made visible by the MFENCE have even occurred in the first
place.
This means that an x2apic IPI could theoretically be triggered before
there is any (visible) data to process.
Does this affect anything in practice? I honestly don't know. It seems
quite possible that by the time an interrupt gets to consume the (not
yet) MFENCE'd data, it has become visible, mostly by accident.
To be safe, add the SDM-recommended fences for all x2apic WRMSRs.
This also leaves open the question of the _other_ weakly-ordered WRMSR:
MSR_IA32_TSC_DEADLINE. While it has the same ordering architecture as
the x2APIC MSRs, it seems substantially less likely to be a problem in
practice. While writes to the in-memory Local Vector Table (LVT) might
theoretically be reordered with respect to a weakly-ordered WRMSR like
TSC_DEADLINE, the SDM has this to say:
In x2APIC mode, the WRMSR instruction is used to write to the LVT
entry. The processor ensures the ordering of this write and any
subsequent WRMSR to the deadline; no fencing is required.
But, that might still leave xAPIC exposed. The safest thing to do for
now is to add the extra, recommended LFENCE.
[ bp: Massage commit message, fix typos, drop accidentally added
newline to tools/arch/x86/include/asm/barrier.h. ]
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200305174708.F77040DD@viggo.jf.intel.com
|
|
This reverts commit bde9cfa3afe4324ec251e4af80ebf9b7afaf7afe.
Changing the first memory page type from E820_TYPE_RESERVED to
E820_TYPE_RAM makes it a part of "System RAM" resource rather than a
reserved resource and this in turn causes devmem_is_allowed() to treat
is as area that can be accessed but it is filled with zeroes instead of
the actual data as previously.
The change in /dev/mem output causes lilo to fail as was reported at
slakware users forum, and probably other legacy applications will
experience similar problems.
Link: https://www.linuxquestions.org/questions/slackware-14/slackware-current-lilo-vesa-warnings-after-recent-updates-4175689617/#post6214439
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Address recent regression causing battery devices to be never bound to
a driver on some systems (Hans de Goede)"
* tag 'acpi-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: scan: Fix battery devices sometimes never binding
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
- Fix capability conversion and minor overlayfs bugs that are related
to the unprivileged overlay mounts introduced in this cycle.
- Fix two recent (v5.10) and one old (v4.10) bug.
- Clean up security xattr copy-up (related to a SELinux regression).
* tag 'ovl-fixes-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: implement volatile-specific fsync error behaviour
ovl: skip getxattr of security labels
ovl: fix dentry leak in ovl_get_redirect
ovl: avoid deadlock on directory ioctl
cap: fix conversions on getxattr
ovl: perform vfs_getxattr() with mounter creds
ovl: add warning on user_ns mismatch
|
|
Set cr3_lm_rsvd_bits, which is effectively an invalid GPA mask, at vCPU
reset. The reserved bits check needs to be done even if userspace never
configures the guest's CPUID model.
Cc: stable@vger.kernel.org
Fixes: 0107973a80ad ("KVM: x86: Introduce cr3_lm_rsvd_bits in kvm_vcpu_arch")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210204000117.3303214-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Pull NVMe fixes from Christoph.
* 'nvme-5.11' of git://git.infradead.org/nvme:
nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs
update the email address for Keith Bush
nvme-pci: ignore the subsysem NQN on Phison E16
nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
|
|
Abaci Robot reported following panic:
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 800000010ef3f067 P4D 800000010ef3f067 PUD 10d9df067 PMD 0
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 1869 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc3+ #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:put_files_struct+0x1b/0x120
Code: 24 18 c7 00 f4 ff ff ff e9 4d fd ff ff 66 90 0f 1f 44 00 00 41 57 41 56 49 89 fe 41 55 41 54 55 53 48 83 ec 08 e8 b5 6b db ff 41 ff 0e 74 13 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 9c
RSP: 0000:ffffc90002147d48 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88810d9a5300 RCX: 0000000000000000
RDX: ffff88810d87c280 RSI: ffffffff8144ba6b RDI: 0000000000000000
RBP: 0000000000000080 R08: 0000000000000001 R09: ffffffff81431500
R10: ffff8881001be000 R11: 0000000000000000 R12: ffff88810ac2f800
R13: ffff88810af38a00 R14: 0000000000000000 R15: ffff8881057130c0
FS: 0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010dbaa002 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__io_clean_op+0x10c/0x2a0
io_dismantle_req+0x3c7/0x600
__io_free_req+0x34/0x280
io_put_req+0x63/0xb0
io_worker_handle_work+0x60e/0x830
? io_wqe_worker+0x135/0x520
io_wqe_worker+0x158/0x520
? __kthread_parkme+0x96/0xc0
? io_worker_handle_work+0x830/0x830
kthread+0x134/0x180
? kthread_create_worker_on_cpu+0x90/0x90
ret_from_fork+0x1f/0x30
Modules linked in:
CR2: 0000000000000000
---[ end trace c358ca86af95b1e7 ]---
I guess case below can trigger above panic: there're two threads which
operates different io_uring ctxs and share same sqthread identity, and
later one thread exits, io_uring_cancel_task_requests() will clear
task->io_uring->identity->files to be NULL in sqpoll mode, then another
ctx that uses same identity will panic.
Indeed we don't need to clear task->io_uring->identity->files here,
io_grab_identity() should handle identity->files changes well, if
task->io_uring->identity->files is not equal to current->files,
io_cow_identity() should handle this changes well.
Cc: stable@vger.kernel.org # 5.5+
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is a bug in the TDP MMU function to zap SPTEs which could be
replaced with a larger mapping which prevents the function from doing
anything. Fix this by correctly zapping the last level SPTEs.
Cc: stable@vger.kernel.org
Fixes: 14881998566d ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU")
Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20210202185734.1680553-11-bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
TLS selftests were broken also because of use of structure that
was not exported to UAPI. Fix by defining the union in tests.
Fixes: 4f336e88a870 (selftests/tls: add CHACHA20-POLY1305 to tls selftests)
Reported-by: Rong Chen <rong.a.chen@intel.com>
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Link: https://lore.kernel.org/r/1612384634-5377-1-git-send-email-vfedorenko@novek.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* drm/bridge/lontium-lt9611uxc: EDID fixes; Don't handle hotplug
events in IRQ handler
* drm/ttm: Use _GFP_NOWARN for huge pages
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YBlHU4sc/5GHpXpg@linux-uq9g
|
|
syzbot found WARNING in qrtr_tun_write_iter [1] when write_iter length
exceeds KMALLOC_MAX_SIZE causing order >= MAX_ORDER condition.
Additionally, there is no check for 0 length write.
[1]
WARNING: mm/page_alloc.c:5011
[..]
Call Trace:
alloc_pages_current+0x18c/0x2a0 mm/mempolicy.c:2267
alloc_pages include/linux/gfp.h:547 [inline]
kmalloc_order+0x2e/0xb0 mm/slab_common.c:837
kmalloc_order_trace+0x14/0x120 mm/slab_common.c:853
kmalloc include/linux/slab.h:557 [inline]
kzalloc include/linux/slab.h:682 [inline]
qrtr_tun_write_iter+0x8a/0x180 net/qrtr/tun.c:83
call_write_iter include/linux/fs.h:1901 [inline]
Reported-by: syzbot+c2a7e5c5211605a90865@syzkaller.appspotmail.com
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Link: https://lore.kernel.org/r/20210202092059.1361381-1-snovitoll@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When updating the tcp or udp header checksum on port nat the function
inet_proto_csum_replace2 with the last parameter pseudohdr as true.
This leads to an error in the case that GRO is used and packets are
split up in GSO. The tcp or udp checksum of all packets is incorrect.
The error is probably masked due to the fact the most network driver
implement tcp/udp checksum offloading. It also only happens when GRO is
applied and not on single packets.
The error is most visible when using a pppoe connection which is not
triggering the tcp/udp checksum offload.
Fixes: ac2a66665e23 ("netfilter: add generic flow table infrastructure")
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Although hooks are released via call_rcu(), chain and rule objects are
immediately released while packets are still walking over these bits.
This patch adds the .pre_exit callback which is invoked before
synchronize_rcu() in the netns framework to stay safe.
Remove a comment which is not valid anymore since the core does not use
synchronize_net() anymore since 8c873e219970 ("netfilter: core: free
hooks with call_rcu").
Suggested-by: Florian Westphal <fw@strlen.de>
Fixes: df05ef874b28 ("netfilter: nf_tables: release objects on netns destruction")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
use date %Y instead of %G to read current year
Problem appeared when running lkp-tests on 01/01/2021
Fixes: 48d072c4e8cd ("selftests: netfilter: add time counter check")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When both --reap and --update flag are specified, there's a code
path at which the entry to be updated is reaped beforehand,
which then leads to kernel crash. Reap only entries which won't be
updated.
Fixes kernel bugzilla #207773.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207773
Reported-by: Reindl Harald <h.reindl@thelounge.net>
Fixes: 0079c5aee348 ("netfilter: xt_recent: add an entry reaper")
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Adding support for Cinterion MV31 with PID 0x00B7.
T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 11 Spd=5000 MxCh= 0
D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1
P: Vendor=1e2d ProdID=00b7 Rev=04.14
S: Manufacturer=Cinterion
S: Product=Cinterion USB Mobile Broadband
S: SerialNumber=b3246eed
C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=896mA
I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
Signed-off-by: Christoph Schemmel <christoph.schemmel@gmail.com>
Link: https://lore.kernel.org/r/20210202084523.4371-1-christoph.schemmel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since I wrote the below patch if you run a debug kernel you can a
dma debug warning like:
nouveau 0000:1f:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000016e012000] [size=4096 bytes]
The old nouveau code wasn't consolidate the pages like the ttm code,
but the dma-debug expects the sync code to give it the same base/range
pairs as the allocator.
Fix the nouveau sync code to consolidate pages before calling the
sync code.
Fixes: bd549d35b4be0 ("nouveau: use ttm populate mapping functions. (v2)")
Reported-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/417588/
|
|
On 32-bit architecture, roundup_pow_of_two() can return 0 when the argument
has upper most bit set due to resulting 1UL << 32. Add a check for this case.
Fixes: d5a3b1f69186 ("bpf: introduce BPF_MAP_TYPE_STACK_TRACE")
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210127063653.3576-1-minhquangbui99@gmail.com
|
|
Reporting a port as connected if nothing is attached to them leads to
any i2c transactions on this port trying to use an uninitialized i2c
adapter, fix this.
Let's account for this case even if branch devices have no good reason
to report a port as plugged with their peer device type set to 'none'.
Fixes: db1a07956968 ("drm/dp_mst: Handle SST-only branch device case")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/2987
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1963
Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: <stable@vger.kernel.org> # v5.5+
Cc: intel-gfx@lists.freedesktop.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reported-by: Thiago Macieira <gitlab@gitlab.freedesktop.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210201120145.350258-1-imre.deak@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
- Make sure to set a default console, otherwise ttynull is selected
- Revert initial ARCH_HAS_SET_MEMORY support, this needs more work
- Fix a regression due to ubd refactoring
- Various small fixes
* tag 'for-linus-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: time: fix initialization in time-travel mode
um: fix os_idle_sleep() to not hang
Revert "um: support some of ARCH_HAS_SET_MEMORY"
Revert "um: allocate a guard page to helper threads"
um: virtio: free vu_dev only with the contained struct device
um: kmsg_dumper: always dump when not tty console
um: stdio_console: Make preferred console
um: return error from ioremap()
um: ubd: fix command line handling of ubd
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"Fix the arm64 linear map range detection for tagged addresses and
replace the bitwise operations with subtract (virt_addr_valid(),
__is_lm_address(), __lm_to_phys())"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Use simpler arithmetics for the linear map macros
arm64: Do not pass tagged addresses to __is_lm_address()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Initialize tracing-graph-pause at task creation, not start of
function tracing, to avoid corrupting the pause counter.
- Set "pause-on-trace" for latency tracers as that option breaks their
output (regression).
- Fix the wrong error return for setting kretprobes on future modules
(before they are loaded).
- Fix re-registering the same kretprobe.
- Add missing value check for added RCU variable reload.
* tag 'trace-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracepoint: Fix race between tracing and removing tracepoint
kretprobe: Avoid re-registration of the same kretprobe earlier
tracing/kprobe: Fix to support kretprobe events on unloaded modules
tracing: Use pause-on-trace with the latency tracers
fgraph: Initialize tracing_graph_pause at task creation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"The code fixes in this round are all for the Texas Instruments OMAP
platform, addressing several regressions related to the ti-sysc
interconnect changes that was merged in linux-5.11 and one recently
introduced RCU usage warning.
Tero Kristo updates his maintainer file entries as he is changing to a
new employer.
The other changes are for devicetree files across eight different
platforms:
TI OMAP:
- multiple gpio related one-line fixes
Allwinner/sunxi:
- ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode
- soc: sunxi: mbus: Remove DE2 display engine compatibles
NXP lpc32xx:
- ARM: dts: lpc32xx: Revert set default clock rate of HCLK PLL
STMicroelectronics stm32
- multiple minor fixes for DHCOM/DHCOR boards
NXP Layerscape:
- Fix DCFG address range on LS1046A SoC
Amlogic meson:
- fix reboot issue on odroid C4
- revert an ethernet change that caused a regression
- meson-g12: Set FL-adj property value
Rockchip:
- multiple minor fixes on 64-bit rockchip machines
Qualcomm:
- Regression fixes for Lenovo Yoga touchpad and for interconnect
configuration
- Boot fixes for 'LPASS' clock configuration on two machines"
* tag 'arm-soc-fixes-v5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (31 commits)
ARM: dts: lpc32xx: Revert set default clock rate of HCLK PLL
ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode
arm64: dts: ls1046a: fix dcfg address range
soc: sunxi: mbus: Remove DE2 display engine compatibles
arm64: dts: meson: switch TFLASH_VDD_EN pin to open drain on Odroid-C4
Revert "arm64: dts: amlogic: add missing ethernet reset ID"
arm64: dts: rockchip: Disable display for NanoPi R2S
ARM: dts: omap4-droid4: Fix lost keypad slide interrupts for droid4
arm64: dts: rockchip: remove interrupt-names property from rk3399 vdec node
drivers: bus: simple-pm-bus: Fix compatibility with simple-bus for auxdata
ARM: OMAP2+: Fix booting for am335x after moving to simple-pm-bus
ARM: OMAP2+: Fix suspcious RCU usage splats for omap_enter_idle_coupled
ARM: dts: stm32: Fix GPIO hog flags on DHCOM DRC02
ARM: dts: stm32: Fix GPIO hog flags on DHCOM PicoITX
ARM: dts: stm32: Fix GPIO hog names on DHCOM
ARM: dts: stm32: Disable optional TSC2004 on DRC02 board
ARM: dts: stm32: Disable WP on DHCOM uSD slot
ARM: dts: stm32: Connect card-detect signal on DHCOM
ARM: dts: stm32: Fix polarity of the DH DRC02 uSD card detect
arm64: dts: qcom: sdm845: Reserve LPASS clocks in gcc
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"Some more fixes from the GPIO subsystem for this release. This time
it's only core fixes:
- fix a memory leak in error path in gpiolib
- clear debounce period in output mode in the character device code
- remove shadowed variable"
* tag 'gpio-fixes-for-v5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: gpiolib: remove shadowed variable
gpiolib: free device name on error path to fix kmemleak
gpiolib: cdev: clear debounce period if line set to output
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
"Two last minute small but important fixes.
The hp-wmi change fixes an issue which is actively being hit by users:
https://bugzilla.redhat.com/show_bug.cgi?id=1918255
https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/3564
And the dell-wmi-sysman patch fixes a bug in the new dell-wmi-sysman
driver which causes some systems to hang at boot when the driver
loads"
* tag 'platform-drivers-x86-v5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: dell-wmi-sysman: fix a NULL pointer dereference
platform/x86: hp-wmi: Disable tablet-mode reporting by default
|
|
Avoid a naming conflict with for_each_event with similar code in
parse-events.c, rename to for_each_event_tps.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210203052659.2975736-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up fixes
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Running "perf mem report" in TUI mode fails with ENOMEM message in
powerpc:
failed to process sample
Running with debug and verbose options points that issue is while
allocating memory for sample histograms.
The error path is:
symbol__inc_addr_samples() ->
__symbol__inc_addr_samples() ->
annotated_source__histogram()
symbol__inc_addr_samples() calls annotated_source__alloc_histograms ()
to allocate memory for sample histograms using calloc(). Here calloc()
fails since the size of symbol is huge. The size of a symbol is
calculated as difference between its start and end address.
Example histogram allocation that fails is:
sym->name is _end
sym->start is 0xc0000000027a0000
sym->end is 0xc008000003890000
symbol__size(sym) is 0x80000010f0000
In the above case, the difference between sym->start
(0xc0000000027a0000) and sym->end (0xc008000003890000) is huge.
This is same problem as in s390 and arm64 which are fixed in commits:
b9c0a64901d5 ("perf annotate: Fix s390 gap between kernel end and module start")
78886f3ed37e ("perf symbols: Fix arm64 gap between kernel start and module end")
When this symbol was read first, its start and end address was set to
address which matches with data from /proc/kallsyms.
After symbol__new():
symbol__new: _end 0xc0000000027a0000-0xc0000000027a0000
From /proc/kallsyms:
...
c000000002799370 b backtrace_flag
c000000002799378 B radix_tree_node_cachep
c000000002799380 B __bss_stop
c0000000027a0000 B _end
c008000003890000 t icmp_checkentry [ip_tables]
c008000003890038 t ipt_alloc_initial_table [ip_tables]
c008000003890468 T ipt_do_table [ip_tables]
c008000003890de8 T ipt_unregister_table_pre_exit [ip_tables]
...
Perf calls function symbols__fixup_end() which sets the end of symbol to
0xc008000003890000, which is the next address and this is the start
address of first module (icmp_checkentry in above) which will make the
huge symbol size of 0x80000010f0000.
After symbols__fixup_end:
symbols__fixup_end: sym->name: _end
sym->start: 0xc0000000027a0000
sym->end: 0xc008000003890000
On powerpc, kernel text segment is located at 0xc000000000000000 whereas
the modules are located at very high memory addresses,
0xc00800000xxxxxxx. Since the gap between end of kernel text segment and
beginning of first module's address is high, histogram allocation using
calloc fails.
Fix this by detecting the kernel's last symbol and limiting the range of
last kernel symbol to pagesize.
Signed-off-by: Athira Rajeev<atrajeev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-By: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1609208054-1566-1-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch fixes "perf inject --jit" to properly operate on
namespaced/containerized processes:
* jitdump files are generated by the process, thus they should be
looked up in its mount NS.
* DSOs of injected MMAP events will later be looked up in the process
mount NS, so write them into its NS.
* PIDs & TIDs from jitdump events need to be translated to the PID as
seen by "perf record" before written into MMAP events.
For a process in a different PID NS, the TID & PID given in the jitdump
event are actually ignored; I use the TID & PID of the thread which
mmap()ed the jitdump file. This is simplified and won't do for forks of
the initial process, if they continue using the same jitdump file.
Future patches might improve it.
This was tested by recording a NodeJS process running with
"--perf-prof", inside a Docker container, and by recording another
NodeJS process running in the same namespaces as perf itself, to make
sure it's not broken for non-containerized processes.
Signed-off-by: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20201105015604.1726943-1-yonatan.goldschmidt@granulate.io
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Provides an accurate mean to determine if the owner thread is in a
different PID namespace.
Signed-off-by: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20201105015418.1725218-1-yonatan.goldschmidt@granulate.io
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
events
Like in __event__synthesize_thread(), I think it's better to use
scandir() instead of the readdir() loop. In case some malicious task
continues to create new threads, the readdir() loop will run over and
over to collect tids. The scandir() also has the problem but the window
is much smaller since it doesn't do much work during the iteration.
Also add filter_task() function as we only care the tasks.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210202090118.2008551-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To synthesize information to resolve sample IPs, it needs to scan task
and mmap info from the /proc filesystem. For each process, it opens
(and reads) status and maps file respectively. But as kernel threads
don't have memory maps so we can skip the maps file.
To find kernel threads, check "VmPeak:" line in /proc/<PID>/status file.
It's about the peak virtual memory usage so only user-level tasks have
that. Note that it's possible to miss the line due to partial reads.
So we should double-check if it's a really kernel thread when there's no
VmPeak line.
Thus check "Threads:" line (which follows the VmPeak line whether or not
it exists) to be sure it's read enough data - just in case of deeply
nested pid namespaces or large number of supplementary groups are
involved.
This is for user process:
$ head -40 /proc/1/status
Name: systemd
Umask: 0000
State: S (sleeping)
Tgid: 1
Ngid: 0
Pid: 1
PPid: 0
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 256
Groups:
NStgid: 1
NSpid: 1
NSpgid: 1
NSsid: 1
VmPeak: 234192 kB <-- here
VmSize: 169964 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 29528 kB
VmRSS: 6104 kB
RssAnon: 2756 kB
RssFile: 3348 kB
RssShmem: 0 kB
VmData: 19776 kB
VmStk: 1036 kB
VmExe: 784 kB
VmLib: 9532 kB
VmPTE: 116 kB
VmSwap: 2400 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 1 <-- and here
SigQ: 1/62808
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 7be3c0fe28014a03
SigIgn: 0000000000001000
And this is for kernel thread:
$ head -20 /proc/2/status
Name: kthreadd
Umask: 0000
State: S (sleeping)
Tgid: 2
Ngid: 0
Pid: 2
PPid: 0
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 64
Groups:
NStgid: 2
NSpid: 2
NSpgid: 0
NSsid: 0
Threads: 1 <-- here
SigQ: 1/62808
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210202090118.2008551-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To save memory usage, it needs to reduce the number of entries in the
proc filesystem. It's using /proc/<PID>/task directory to traverse
threads in the process and then kernel creates /proc/<PID>/task/<TID>
entries.
After that it checks the thread info using the /proc/<TID>/status file
rather than /proc/<PID>/task/<TID>/status. As far as I can see, they
are the same and contain all the info we need.
Using the latter eliminates the unnecessary /proc/<TID> entry. This can
be useful especially a large number of threads are used in the system.
In my experiment around 1KB of memory on average was saved for each
thread (which is not a thread group leader).
To do this, pass both pid and tid to perf_event_prepare_comm() if it
knows them. In case it doesn't know, passing 0 as pid will do the old
way.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210202090118.2008551-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Reduce duplication in the JSONs by referencing standard events from
armv8-common-and-microarch.json
In general the "PublicDescription" fields are not modified when somewhat
significantly worded differently than the standard.
Apart from that, description and names for events slightly different to
standard are changed (to standard) for consistency.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611835236-34696-5-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Reduce duplication in the JSONs by referencing standard events from
armv8-common-and-microarch.json
In general the "PublicDescription" fields are not modified when somewhat
significantly worded differently than the standard.
Apart from that, description and names for events slightly different to
standard are changed (to standard) for consistency.
Note that names for events 0x34 and 0x35 are non-standard and remain
unchanged. Those events came from the following originally:
https://github.com/AmpereComputing/ampere-centos-kernel/blob/4c2479c67bbcf35b35224db12a092b33682b181c/Documentation/arm64/eMAG-ARM-CoreImpDefined.pdf
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com>
Cc: mathieu.poirier@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611835236-34696-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a common and microarch JSON, which can be referenced from CPU JSONs.
For now, brief and public description are as event brief event
description from the ARMv8 ARM [0], D7-11.
The list of events is not complete, as not all events will be referenced
yet.
Reference document is at the following:
[0] https://documentation-service.arm.com/static/5fa3bd1eb209f547eebd4141?token=
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611835236-34696-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The "briefdescription" for event 0x35 has a typo - fix it.
Fixes: d35c595bf005 ("perf vendor events arm64: Revise core JSON events for eMAG")
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611835236-34696-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Other perf tool builtins already supported a DSO filter.
For example:
$ perf report --dsos a,b,c
which only considers symbols in these dsos.
Now the DSO filter is supported in 'perf script':
root@kbl-ppc:~# ./perf script --dsos "[kernel.kallsyms]"
perf 18123 [000] 6142863.075104: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075107: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075108: 10 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075109: 273 cycles: ffffffff9ca7730a native_write_msr+0xa ([kernel.kallsyms])
perf 18123 [000] 6142863.075110: 7684 cycles: ffffffff9ca3c9c0 native_sched_clock+0x50 ([kernel.kallsyms])
perf 18123 [000] 6142863.075112: 213017 cycles: ffffffff9d765a92 syscall_exit_to_user_mode+0x32 ([kernel.kallsyms])
perf 18123 [001] 6142863.075156: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075158: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075159: 17 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
Committer testing:
$ perf script
ls 2364888 29303.010949: 1 cycles:u: ffffffffa4bbc6a9 [unknown] ([unknown])
ls 2364888 29303.010957: 1 cycles:u: ffffffffa429ef48 [unknown] ([unknown])
ls 2364888 29303.010961: 1 cycles:u: ffffffffa4260133 [unknown] ([unknown])
ls 2364888 29303.010964: 5 cycles:u: ffffffffa429efad [unknown] ([unknown])
ls 2364888 29303.010967: 41 cycles:u: ffffffffa42a4586 [unknown] ([unknown])
ls 2364888 29303.010972: 435 cycles:u: ffffffffa429efe0 [unknown] ([unknown])
ls 2364888 29303.010978: 5142 cycles:u: 7f9b95bc2abf __GI___tunables_init+0x11f (/usr/lib64/ld-2.32.so)
ls 2364888 29303.011006: 38551 cycles:u: ffffffffa4290f61 [unknown] ([unknown])
ls 2364888 29303.011486: 238234 cycles:u: 7f9b95bb7741 _dl_relocate_object+0xa71 (/usr/lib64/ld-2.32.so)
ls 2364888 29303.011937: 415870 cycles:u: 7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so)
$
Before:
$ perf script --dsos /usr/lib64/libc-2.32.so |& head -5
Error: unknown option `dsos'
Usage: perf script [<options>]
or: perf script [<options>] record <script> [<record-options>] <command>
or: perf script [<options>] report <script> [script-args]
$
After:
$ perf script --dsos /usr/lib64/libc-2.32.so
ls 2364888 29303.011937: 415870 cycles:u: 7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so)
$
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210124232750.19170-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When we lookup an address and don't find a map we should filter that
sample if the user specified a list of --dso entries to filter on, fix
it.
Before:
$ perf script
sleep 274800 2843.556162: 1 cycles:u: ffffffffbb26bff4 [unknown] ([unknown])
sleep 274800 2843.556168: 1 cycles:u: ffffffffbb2b047d [unknown] ([unknown])
sleep 274800 2843.556171: 1 cycles:u: ffffffffbb2706b2 [unknown] ([unknown])
sleep 274800 2843.556174: 6 cycles:u: ffffffffbb2b0267 [unknown] ([unknown])
sleep 274800 2843.556176: 59 cycles:u: ffffffffbb2b03b1 [unknown] ([unknown])
sleep 274800 2843.556180: 691 cycles:u: ffffffffbb26bff4 [unknown] ([unknown])
sleep 274800 2843.556189: 9160 cycles:u: 7fa9550eeaa3 __GI___tunables_init+0xf3 (/usr/lib64/ld-2.32.so)
sleep 274800 2843.556312: 86937 cycles:u: 7fa9550e157b _dl_lookup_symbol_x+0x4b (/usr/lib64/ld-2.32.so)
$
So we have some samples we somehow didn't find in a map for, if we now
do:
$ perf report --stdio --dso /usr/lib64/ld-2.32.so
# dso: /usr/lib64/ld-2.32.so
#
# Total Lost Samples: 0
#
# Samples: 8 of event 'cycles:u'
# Event count (approx.): 96856
#
# Overhead Command Symbol
# ........ ....... ........................
#
89.76% sleep [.] _dl_lookup_symbol_x
9.46% sleep [.] __GI___tunables_init
0.71% sleep [k] 0xffffffffbb26bff4
0.06% sleep [k] 0xffffffffbb2b03b1
0.01% sleep [k] 0xffffffffbb2b0267
0.00% sleep [k] 0xffffffffbb2706b2
0.00% sleep [k] 0xffffffffbb2b047d
$
After this patch we get the right output with just entries for the DSOs
specified in --dso:
$ perf report --stdio --dso /usr/lib64/ld-2.32.so
# dso: /usr/lib64/ld-2.32.so
#
# Total Lost Samples: 0
#
# Samples: 8 of event 'cycles:u'
# Event count (approx.): 96856
#
# Overhead Command Symbol
# ........ ....... ........................
#
89.76% sleep [.] _dl_lookup_symbol_x
9.46% sleep [.] __GI___tunables_init
$
#
Fixes: 96415e4d3f5fdf9c ("perf symbols: Avoid unnecessary symbol loading when dso list is specified")
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210128131209.GD775562@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The Topdown Microarchitecture Analysis (TMA) Method is a structured
analysis methodology to identify critical performance bottlenecks in
out-of-order processors. From the Ice Lake and later platforms, the
Topdown information can be retrieved from the dedicated "metrics"
register, which isn't impacted by other events. Also, the Topdown
metrics support both per thread/process and per core measuring. Adding
Topdown metrics events as default events can enrich the default
measuring information, and would not cost any extra multiplexing.
Introduce arch_evlist__add_default_attrs() to allow architecture
specific default events. Add the Topdown metrics events in the X86
specific arch_evlist__add_default_attrs(). Other architectures can add
their own default events later separately.
With the patch:
$ perf stat sleep 1
Performance counter stats for 'sleep 1':
0.82 msec task-clock:u # 0.001 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
61 page-faults:u # 0.074 M/sec
319,941 cycles:u # 0.388 GHz
242,802 instructions:u # 0.76 insn per cycle
54,380 branches:u # 66.028 M/sec
4,043 branch-misses:u # 7.43% of all branches
1,585,555 slots:u # 1925.189 M/sec
238,941 topdown-retiring:u # 15.0% retiring
410,378 topdown-bad-spec:u # 25.8% bad speculation
634,222 topdown-fe-bound:u # 39.9% frontend bound
304,675 topdown-be-bound:u # 19.2% backend bound
1.001791625 seconds time elapsed
0.000000000 seconds user
0.001572000 seconds sys
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20210121133752.118327-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Event duration_time in a metric expression requires special handling.
Improve test coverage by including a metric whose expression includes
duration_time. The actual metric is a copied from the L1D_Cache_Fill_BW
metric on my broadwell machine.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@openeuler.org
Link: http://lore.kernel.org/lkml/1611578842-5749-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When the host sends multiple h2cdata PDUs, we keep track on
the receive progress and calculate the scatterlist index and
offsets.
The issue is that sg_offset should only be kept for the first
iov entry we map in the iovec as this is the difference between
our cursor and the sg entry offset itself.
In addition, the sg index was calculated wrong because we should
not round up when dividing the command byte offset with PAG_SIZE.
Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver")
Reported-by: Narayan Ayalasomayajula <Narayan.Ayalasomayajula@wdc.com>
Tested-by: Narayan Ayalasomayajula <Narayan.Ayalasomayajula@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
The commit 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
converted do_int3 handler to be "NMI-like".
That made old if (in_nmi()) check abort execution of bpf programs
attached to kprobe when kprobe is firing via int3
(For example when kprobe is placed in the middle of the function).
Remove the check to restore user visible behavior.
Fixes: 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()")
Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/20210203070636.70926-1-alexei.starovoitov@gmail.com
|