Age | Commit message (Collapse) | Author |
|
The 'perf kvm' command set up things so that we can record, report, top,
etc, but not 'script', so make 'perf script' be able to process samples
by allowing to pass guest kallsyms, vmlinux, modules, etc, and if at
least one of those is provided, set perf_guest to true so that guest
samples get properly resolved.
Testing it:
# perf kvm --guest --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules record -e cycles:Gk
^C[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 3.602 MB perf.data.guest (10492 samples) ]
#
# perf evlist -i perf.data.guest
cycles:Gk
# perf evlist -v -i perf.data.guest
cycles:Gk: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_host: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
#
# perf kvm --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules report --stdio -s sym | head -30
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 10K of event 'cycles:Gk'
# Event count (approx.): 2434201408
#
# Overhead Symbol
# ........ ..............................................
#
11.93% [g] avtab_search_node
3.95% [g] sidtab_context_to_sid
2.41% [g] n_tty_write
2.20% [g] _spin_unlock_irqrestore
1.37% [g] _aesni_dec4
1.33% [g] kmem_cache_alloc
1.07% [g] native_write_cr0
0.99% [g] kfree
0.95% [g] _spin_lock
0.91% [g] __memset
0.87% [g] schedule
0.83% [g] _spin_lock_irqsave
0.76% [g] __kmalloc
0.67% [g] avc_has_perm_noaudit
0.66% [g] kmem_cache_free
0.65% [g] glue_xts_crypt_128bit
0.59% [g] __d_lookup
0.59% [g] __audit_syscall_exit
0.56% [g] __memcpy
#
Then, when trying to use perf script to generate a python script and
then process the events after adding a python hook for non-tracepoint
events:
# perf script -i perf.data.guest -g python
generated Python script: perf-script.py
# vim perf-script.py
# tail -2 perf-script.py
def process_event(param_dict):
print(param_dict["symbol"])
#
# perf script -i perf.data.guest -s perf-script.py | head
in trace_begin
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
vmx_vmexit
231
#
We'd see just the vmx_vmexit, i.e. the samples from the guest don't show
up.
After this patch:
# perf script --guestkallsyms /wb/rhel6.kallsyms --guestmodules /wb/rhel6.modules -i perf.data.guest -s perf-script.py 2> /dev/null | head -30
in trace_begin
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
save_args
do_timer
drain_array
inode_permission
avc_has_perm_noaudit
run_timer_softirq
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
apic_timer_interrupt
kvm_guest_apic_eoi_write
run_posix_cpu_timers
_spin_lock
handle_pte_fault
rcu_irq_enter
delay_tsc
delay_tsc
native_read_tsc
apic_timer_interrupt
sys_open
internal_add_timer
list_del
rcu_exit_nohz
#
Jiri Olsa noticed we need to set 'perf_guest' to true if we want to
process guest samples and I made it be set if one of the guest files
settings get set via the command line options added in this patch, that
match those present in the 'perf kvm' command.
We probably want to have 'perf record', 'perf report' etc to notice that
there are guest samples and do the right thing, which is to look for
files with some suffix that make it be associated with the guest used to
collect the samples, i.e. if a vmlinux file is passed, we can get the
build-id from it, if not some other identifier or simply looking for
"kallsyms.guest", for instance, in the current directory.
Reported-by: Mariano Pache <npache@redhat.com>
Tested-by: Mariano Pache <npache@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Ali Raza <alirazabhutta.10@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Orran Krieger <okrieger@redhat.com>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Yunlong Song <yunlong.song@huawei.com>
Link: https://lkml.kernel.org/n/tip-d54gj64rerlxcqsrod05biwn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move the blk_mq_bio_to_request() call in front of the if-statement.
Cc: Hannes Reinecke <hare@suse.com>
Cc: Omar Sandoval <osandov@fb.com>
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
No code that occurs between blk_mq_get_ctx() and blk_mq_put_ctx() depends
on preemption being disabled for its correctness. Since removing the CPU
preemption calls does not measurably affect performance, simplify the
blk-mq code by removing the blk_mq_put_ctx() function and also by not
disabling preemption in blk_mq_get_ctx().
Cc: Hannes Reinecke <hare@suse.com>
Cc: Omar Sandoval <osandov@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The psock_tpacket test will need to access /proc/kallsyms, this would
require the kernel config CONFIG_KALLSYMS to be enabled first.
Apart from adding CONFIG_KALLSYMS to the net/config file here, check the
file existence to determine if we can run this test will be helpful to
avoid a false-positive test result when testing it directly with the
following commad against a kernel that have CONFIG_KALLSYMS disabled:
make -C tools/testing/selftests TARGETS=net run_tests
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before mlxsw_sp1_ptp_packet_finish() sends the packet back, it validates
whether the corresponding port is still valid. However the condition is
incorrect: when mlxsw_sp_port == NULL, the code dereferences the port to
compare it to skb->dev.
The condition needs to check whether the port is present and skb->dev still
refers to that port (or else is NULL). If that does not hold, bail out.
Add a pair of parentheses to fix the condition.
Fixes: d92e4e6e33c8 ("mlxsw: spectrum: PTP: Support timestamping on Spectrum-1")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the rxrpc_eproto tracepoint is enabled, an oops will be cause by the
trace line that rxrpc_extract_header() tries to emit when a protocol error
occurs (typically because the packet is short) because the call argument is
NULL.
Fix this by using ?: to assume 0 as the debug_id if call is NULL.
This can then be induced by:
echo -e '\0\0\0\0\0\0\0\0' | ncat -4u --send-only <addr> 20001
where addr has the following program running on it:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/rxrpc.h>
int main(void)
{
struct sockaddr_rxrpc srx;
int fd;
memset(&srx, 0, sizeof(srx));
srx.srx_family = AF_RXRPC;
srx.srx_service = 0;
srx.transport_type = AF_INET;
srx.transport_len = sizeof(srx.transport.sin);
srx.transport.sin.sin_family = AF_INET;
srx.transport.sin.sin_port = htons(0x4e21);
fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET6);
bind(fd, (struct sockaddr *)&srx, sizeof(srx));
sleep(20);
return 0;
}
It results in the following oops.
BUG: kernel NULL pointer dereference, address: 0000000000000340
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
...
RIP: 0010:trace_event_raw_event_rxrpc_rx_eproto+0x47/0xac
...
Call Trace:
<IRQ>
rxrpc_extract_header+0x86/0x171
? rcu_read_lock_sched_held+0x5d/0x63
? rxrpc_new_skb+0xd4/0x109
rxrpc_input_packet+0xef/0x14fc
? rxrpc_input_data+0x986/0x986
udp_queue_rcv_one_skb+0xbf/0x3d0
udp_unicast_rcv_skb.isra.8+0x64/0x71
ip_protocol_deliver_rcu+0xe4/0x1b4
ip_local_deliver+0xf0/0x154
__netif_receive_skb_one_core+0x50/0x6c
netif_receive_skb_internal+0x26b/0x2e9
napi_gro_receive+0xf8/0x1da
rtl8169_poll+0x303/0x4c4
net_rx_action+0x10e/0x333
__do_softirq+0x1a5/0x38f
irq_exit+0x54/0xc4
do_IRQ+0xda/0xf8
common_interrupt+0xf/0xf
</IRQ>
...
? cpuidle_enter_state+0x23c/0x34d
cpuidle_enter+0x2a/0x36
do_idle+0x163/0x1ea
cpu_startup_entry+0x1d/0x1f
start_secondary+0x157/0x172
secondary_startup_64+0xa4/0xb0
Fixes: a25e21f0bcd2 ("rxrpc, afs: Use debug_ids rather than pointers in traces")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It was reported that the GPD MicroPC is broken in a way that no valid
MAC address can be read from the network chip. The vendor driver deals
with this by assigning a random MAC address as fallback. So let's do
the same.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 759d095741721888b6ee51afa74e0a66ce65e974.
The patch was based on a misunderstanding. As Al Viro pointed out [0]
it's simply wrong on big endian. So let's revert it.
[0] https://marc.info/?t=156200975600004&r=1&w=2
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is for fixing bug KMSAN: uninit-value in ax88772_bind
Tested by
https://groups.google.com/d/msg/syzkaller-bugs/aFQurGotng4/eB_HlNhhCwAJ
Reported-by: syzbot+8a3fc6674bbc3978ed4e@syzkaller.appspotmail.com
syzbot found the following crash on:
HEAD commit: f75e4cfe kmsan: use kmsan_handle_urb() in urb.c
git tree: kmsan
console output: https://syzkaller.appspot.com/x/log.txt?x=136d720ea00000
kernel config:
https://syzkaller.appspot.com/x/.config?x=602468164ccdc30a
dashboard link:
https://syzkaller.appspot.com/bug?extid=8a3fc6674bbc3978ed4e
compiler: clang version 9.0.0 (/home/glider/llvm/clang
06d00afa61eef8f7f501ebdb4e8612ea43ec2d78)
syz repro:
https://syzkaller.appspot.com/x/repro.syz?x=12788316a00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=120359aaa00000
==================================================================
BUG: KMSAN: uninit-value in is_valid_ether_addr
include/linux/etherdevice.h:200 [inline]
BUG: KMSAN: uninit-value in asix_set_netdev_dev_addr
drivers/net/usb/asix_devices.c:73 [inline]
BUG: KMSAN: uninit-value in ax88772_bind+0x93d/0x11e0
drivers/net/usb/asix_devices.c:724
CPU: 0 PID: 3348 Comm: kworker/0:2 Not tainted 5.1.0+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x191/0x1f0 lib/dump_stack.c:113
kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622
__msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310
is_valid_ether_addr include/linux/etherdevice.h:200 [inline]
asix_set_netdev_dev_addr drivers/net/usb/asix_devices.c:73 [inline]
ax88772_bind+0x93d/0x11e0 drivers/net/usb/asix_devices.c:724
usbnet_probe+0x10f5/0x3940 drivers/net/usb/usbnet.c:1728
usb_probe_interface+0xd66/0x1320 drivers/usb/core/driver.c:361
really_probe+0xdae/0x1d80 drivers/base/dd.c:513
driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
__device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
__device_attach+0x454/0x730 drivers/base/dd.c:844
device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
bus_probe_device+0x137/0x390 drivers/base/bus.c:514
device_add+0x288d/0x30e0 drivers/base/core.c:2106
usb_set_configuration+0x30dc/0x3750 drivers/usb/core/message.c:2027
generic_probe+0xe7/0x280 drivers/usb/core/generic.c:210
usb_probe_device+0x14c/0x200 drivers/usb/core/driver.c:266
really_probe+0xdae/0x1d80 drivers/base/dd.c:513
driver_probe_device+0x1b3/0x4f0 drivers/base/dd.c:671
__device_attach_driver+0x5b8/0x790 drivers/base/dd.c:778
bus_for_each_drv+0x28e/0x3b0 drivers/base/bus.c:454
__device_attach+0x454/0x730 drivers/base/dd.c:844
device_initial_probe+0x4a/0x60 drivers/base/dd.c:891
bus_probe_device+0x137/0x390 drivers/base/bus.c:514
device_add+0x288d/0x30e0 drivers/base/core.c:2106
usb_new_device+0x23e5/0x2ff0 drivers/usb/core/hub.c:2534
hub_port_connect drivers/usb/core/hub.c:5089 [inline]
hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
port_event drivers/usb/core/hub.c:5350 [inline]
hub_event+0x48d1/0x7290 drivers/usb/core/hub.c:5432
process_one_work+0x1572/0x1f00 kernel/workqueue.c:2269
process_scheduled_works kernel/workqueue.c:2331 [inline]
worker_thread+0x189c/0x2460 kernel/workqueue.c:2417
kthread+0x4b5/0x4f0 kernel/kthread.c:254
ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:355
Signed-off-by: Phong Tran <tranmanphong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 760f1dc2958022 ("net: stmmac: add sanity check to
device_property_read_u32_array call") introduced error checking of the
device_property_read_u32_array() call in stmmac_mdio_reset().
This results in the following error when the "snps,reset-delays-us"
property is not defined in devicetree:
invalid property snps,reset-delays-us
This sanity check made sense until commit 84ce4d0f9f55b4 ("net: stmmac:
initialize the reset delay array") ensured that there are fallback
values for the reset delay if the "snps,reset-delays-us" property is
absent. That was at the cost of making that property mandatory though.
Drop the sanity check for device_property_read_u32_array() and thus make
the "snps,reset-delays-us" property optional again (avoiding the error
message while loading the stmmac driver with a .dtb where the property
is absent).
Fixes: 760f1dc2958022 ("net: stmmac: add sanity check to device_property_read_u32_array call")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A bonding master can be up while best_slave is NULL.
[12105.636318] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[12105.638204] mlx4_en: eth1: Linkstate event 1 -> 1
[12105.648984] IP: bond_select_active_slave+0x125/0x250
[12105.653977] PGD 0 P4D 0
[12105.656572] Oops: 0000 [#1] SMP PTI
[12105.660487] gsmi: Log Shutdown Reason 0x03
[12105.664620] Modules linked in: kvm_intel loop act_mirred uhaul vfat fat stg_standard_ftl stg_megablocks stg_idt stg_hdi stg elephant_dev_num stg_idt_eeprom w1_therm wire i2c_mux_pca954x i2c_mux mlx4_i2c i2c_usb cdc_acm ehci_pci ehci_hcd i2c_iimc mlx4_en mlx4_ib ib_uverbs ib_core mlx4_core [last unloaded: kvm_intel]
[12105.685686] mlx4_core 0000:03:00.0: dispatching link up event for port 2
[12105.685700] mlx4_en: eth2: Linkstate event 2 -> 1
[12105.685700] mlx4_en: eth2: Link Up (linkstate)
[12105.724452] Workqueue: bond0 bond_mii_monitor
[12105.728854] RIP: 0010:bond_select_active_slave+0x125/0x250
[12105.734355] RSP: 0018:ffffaf146a81fd88 EFLAGS: 00010246
[12105.739637] RAX: 0000000000000003 RBX: ffff8c62b03c6900 RCX: 0000000000000000
[12105.746838] RDX: 0000000000000000 RSI: ffffaf146a81fd08 RDI: ffff8c62b03c6000
[12105.754054] RBP: ffffaf146a81fdb8 R08: 0000000000000001 R09: ffff8c517d387600
[12105.761299] R10: 00000000001075d9 R11: ffffffffaceba92f R12: 0000000000000000
[12105.768553] R13: ffff8c8240ae4800 R14: 0000000000000000 R15: 0000000000000000
[12105.775748] FS: 0000000000000000(0000) GS:ffff8c62bfa40000(0000) knlGS:0000000000000000
[12105.783892] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12105.789716] CR2: 0000000000000000 CR3: 0000000d0520e001 CR4: 00000000001626f0
[12105.796976] Call Trace:
[12105.799446] [<ffffffffac31d387>] bond_mii_monitor+0x497/0x6f0
[12105.805317] [<ffffffffabd42643>] process_one_work+0x143/0x370
[12105.811225] [<ffffffffabd42c7a>] worker_thread+0x4a/0x360
[12105.816761] [<ffffffffabd48bc5>] kthread+0x105/0x140
[12105.821865] [<ffffffffabd42c30>] ? rescuer_thread+0x380/0x380
[12105.827757] [<ffffffffabd48ac0>] ? kthread_associate_blkcg+0xc0/0xc0
[12105.834266] [<ffffffffac600241>] ret_from_fork+0x51/0x60
Fixes: e2a7420df2e0 ("bonding/main: convert to using slave printk macros")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: John Sperbeck <jsperbeck@google.com>
Cc: Jarod Wilson <jarod@redhat.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove a leftover function header and a static inline stub with no
users from the ACPI header file.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
|
|
In general, it is not correct to call pm_generic_suspend(),
pm_generic_suspend_late() and pm_generic_suspend_noirq() during the
hibernation's "poweroff" transition, because device drivers may
provide special callbacks to be invoked then and the wrappers in
question cause system suspend callbacks to be run. Unfortunately,
that happens in the ACPI PM domain and ACPI LPSS.
To address this potential issue, introduce "poweroff" callbacks
for the ACPI PM and LPSS that will use pm_generic_poweroff(),
pm_generic_poweroff_late() and pm_generic_poweroff_noirq() as
appropriate.
Fixes: 05087360fd7a (ACPI / PM: Take SMART_SUSPEND driver flag into account)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
|
|
First, after a previous change causing all runtime-suspended devices
in the ACPI PM domain (and ACPI LPSS devices) to be resumed before
creating a snapshot image of memory during hibernation, it is not
necessary to worry about the case in which them might be left in
runtime-suspend any more, so get rid of the code related to that from
ACPI PM domain and ACPI LPSS hibernation callbacks.
Second, it is not correct to use pm_generic_resume_early() and
acpi_subsys_resume_noirq() in hibernation "restore" callbacks (which
currently happens in the ACPI PM domain and ACPI LPSS), so introduce
proper _restore_late and _restore_noirq callbacks for the ACPI PM
domain and ACPI LPSS.
Fixes: 05087360fd7a (ACPI / PM: Take SMART_SUSPEND driver flag into account)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
|
|
After a previous change causing all runtime-suspended PCI devices
to be resumed before creating a snapshot image of memory during
hibernation, it is not necessary to worry about the case in which
them might be left in runtime-suspend any more, so get rid of the
code related to that from bus-level PCI hibernation callbacks.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
|
|
Both the PCI bus type and the ACPI PM domain avoid resuming
runtime-suspended devices with DPM_FLAG_SMART_SUSPEND set during
hibernation (before creating the snapshot image of system memory),
but that turns out to be a mistake. It leads to functional issues
and adds complexity that's hard to justify.
For this reason, resume all runtime-suspended PCI devices and all
devices in the ACPI PM domains before creating a snapshot image of
system memory during hibernation.
Fixes: 05087360fd7a (ACPI / PM: Take SMART_SUSPEND driver flag into account)
Fixes: c4b65157aeef (PCI / PM: Take SMART_SUSPEND driver flag into account)
Link: https://lore.kernel.org/linux-acpi/917d4399-2e22-67b1-9d54-808561f9083f@uwyo.edu/T/#maf065fe6e4974f2a9d79f332ab99dfaba635f64c
Reported-by: Robert R. Howell <RHowell@uwyo.edu>
Tested-by: Robert R. Howell <RHowell@uwyo.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci into arm/fixes
This set of patches fixes regressions introduced in v5.2 kernel when DA8xx
OHCI driver was converted over to use GPIO regulators.
* tag 'davinci-fixes-for-v5.2-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: davinci: da830-evm: fix GPIO lookup for OHCI
ARM: davinci: omapl138-hawk: add missing regulator constraints for OHCI
ARM: davinci: da830-evm: add missing regulator constraints for OHCI
+ Linux 5.2-rc7
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Both tipc_udp_enable and tipc_udp_disable are called under rtnl_lock,
ub->ubsock could never be NULL in tipc_udp_disable and cleanup_bearer,
so remove the check.
Also remove the one in tipc_udp_enable by adding "free" label.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the following coverity warning reported by Dan Carpenter:
fs/ext4/namei.c:1311 ext4_fname_setup_ci_filename()
warn: 'cf_name->len' unsigned <= 0
Fixes: 3ae72562ad91 ("ext4: optimize case-insensitive lookups")
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
|
|
There are several firmware versions between version 2AR10001 and
2BA30001, presumably these also have broken FPDMA_AA activation, so
lets play it safe and apply the quirk to all firmware versions.
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The kobj_type default_attrs field is being replaced by the
default_groups field. Replace the default_attrs field in ext4_sb_ktype
and ext4_feat_ktype with default_groups. Use the ATTRIBUTE_GROUPS macro
to create ext4_groups and ext4_feat_groups.
Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Andreas Steinmetz says:
====================
macsec: fix some bugs in the receive path
This series fixes some bugs in the receive path of macsec. The first
is a use after free when processing macsec frames with a SecTAG that
has the TCI E bit set but the C bit clear. In the 2nd bug, the driver
leaves an invalid checksumming state after decrypting the packet.
This is a combined effort of Sabrina Dubroca <sd@queasysnail.net> and me.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix checksumming after decryption.
Signed-off-by: Andreas Steinmetz <ast@domdv.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix use-after-free of skb when rx_handler returns RX_HANDLER_PASS.
Signed-off-by: Andreas Steinmetz <ast@domdv.de>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit ee28906fd7a1 ("ipv4: Dump route exceptions if requested") I
added a counter of per-node dumped routes (including actual routes and
exceptions), analogous to the existing counter for dumped nodes. Dumping
exceptions means we need to also keep track of how many routes are dumped
for each node: this would be just one route per node, without exceptions.
When netlink strict checking is not enabled, we dump both routes and
exceptions at the same time: the RTM_F_CLONED flag is not used as a
filter. In this case, the per-node counter 'i_fa' is incremented by one
to track the single dumped route, then also incremented by one for each
exception dumped, and then stored as netlink callback argument as skip
counter, 's_fa', to be used when a partial dump operation restarts.
The per-node counter needs to be increased by one also when we skip a
route (exception) due to a previous non-zero skip counter, because it
needs to match the existing skip counter, if we are dumping both routes
and exceptions. I missed this, and only incremented the counter, for
regular routes, if the previous skip counter was zero. This means that,
in case of a mixed dump, partial dump operations after the first one
will start with a mismatching skip counter value, one less than expected.
This means in turn that the first exception for a given node is skipped
every time a partial dump operation restarts, if netlink strict checking
is not enabled (iproute < 5.0).
It turns out I didn't repeat the test in its final version, commit
de755a85130e ("selftests: pmtu: Introduce list_flush_ipv4_exception test
case"), which also counts the number of route exceptions returned, with
iproute2 versions < 5.0 -- I was instead using the equivalent of the IPv6
test as it was before commit b964641e9925 ("selftests: pmtu: Make
list_flush_ipv6_exception test more demanding").
Always increment the per-node counter by one if we previously dumped
a regular route, so that it matches the current skip counter.
Fixes: ee28906fd7a1 ("ipv4: Dump route exceptions if requested")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No reason to error out on a MT7621 device with DDR2 memory when non
TRGMII mode is selected.
Only MT7621 DDR2 clock setup is not supported for TRGMII mode.
But non TRGMII mode doesn't need any special clock setup.
Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the CHAP_A value is not supported, the chap_server_open() function
should free the auth_protocol pointer and set it to NULL, or we will leave
a dangling pointer around.
[ 66.010905] Unsupported CHAP_A value
[ 66.011660] Security negotiation failed.
[ 66.012443] iSCSI Login negotiation failed.
[ 68.413924] general protection fault: 0000 [#1] SMP PTI
[ 68.414962] CPU: 0 PID: 1562 Comm: targetcli Kdump: loaded Not tainted 4.18.0-80.el8.x86_64 #1
[ 68.416589] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 68.417677] RIP: 0010:__kmalloc_track_caller+0xc2/0x210
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
WRITE SAME corrupts data on the block device behind iblock if the command
is emulated. The emulation code issues (M - 1) * N times more bios than
requested, where M is the number of 512 blocks per real block size and N is
the NUMBER OF LOGICAL BLOCKS specified in WRITE SAME command. So, for a
device with 4k blocks, 7 * N more LBAs gets written after the requested
range.
The issue happens because the number of 512 byte sectors to be written is
decreased one by one while the real bios are typically from 1 to 8 512 byte
sectors per bio.
Fixes: c66ac9db8d4a ("[SCSI] target: Add LIO target core v4.0.0-rc6")
Cc: <stable@vger.kernel.org>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
I ran into an intriguing bug caused by
commit ""spi: gpio: Don't request CS GPIO in DT use-case"
affecting all SPI GPIO devices with an active high
chip select line.
The commit switches the CS gpio handling over to the GPIO
core, which will parse and handle "cs-gpios" from the OF
node without even calling down to the driver to get the
job done.
However the GPIO core handles the standard bindings in
Documentation/devicetree/bindings/spi/spi-controller.yaml
that specifies that active high CS needs to be specified
using "spi-cs-high" in the DT node.
The code in drivers/spi/spi-gpio.c never respected this
and never tried to inspect subnodes to see if they contained
"spi-cs-high" like the gpiolib OF quirks does. Instead the
only way to get an active high CS was to tag it in the
device tree using the flags cell such as
cs-gpios = <&gpio 4 GPIO_ACTIVE_HIGH>;
This alters the quirks to not inspect the subnodes of SPI
masters on "spi-gpio" for the standard attribute "spi-cs-high",
making old device trees work as expected.
This semantic is a bit ambigous, but just allowing the
flags on the GPIO descriptor to modify polarity is what
the kernel at large mostly uses so let's encourage that.
Fixes: 249e2632dcd0 ("spi: gpio: Don't request CS GPIO in DT use-case")
Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-spi@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare()
ftrace_arch_code_modify_prepare() is acquiring text_mutex, while the
corresponding release is happening in ftrace_arch_code_modify_post_process().
This has already been documented in the code, but let's also make the fact
that this is intentional clear to the semantic analysis tools such as sparse.
Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1906292321170.27227@cbobk.fhfr.pm
Fixes: 39611265edc1a ("ftrace/x86: Add a comment to why we take text_mutex in ftrace_arch_code_modify_prepare()")
Fixes: d5b844a2cf507 ("ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
If sendmsg() or sendmmsg() is called on a connected socket that hasn't had
bind() called on it, then an oops will occur when the kernel tries to
connect the call because no local endpoint has been allocated.
Fix this by implicitly binding the socket if it is in the
RXRPC_CLIENT_UNBOUND state, just like it does for the RXRPC_UNBOUND state.
Further, the state should be transitioned to RXRPC_CLIENT_BOUND after this
to prevent further attempts to bind it.
This can be tested with:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/rxrpc.h>
static const unsigned char inet6_addr[16] = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1, -1, 0xac, 0x14, 0x14, 0xaa
};
int main(void)
{
struct sockaddr_rxrpc srx;
struct cmsghdr *cm;
struct msghdr msg;
unsigned char control[16];
int fd;
memset(&srx, 0, sizeof(srx));
srx.srx_family = 0x21;
srx.srx_service = 0;
srx.transport_type = AF_INET;
srx.transport_len = 0x1c;
srx.transport.sin6.sin6_family = AF_INET6;
srx.transport.sin6.sin6_port = htons(0x4e22);
srx.transport.sin6.sin6_flowinfo = htons(0x4e22);
srx.transport.sin6.sin6_scope_id = htons(0xaa3b);
memcpy(&srx.transport.sin6.sin6_addr, inet6_addr, 16);
cm = (struct cmsghdr *)control;
cm->cmsg_len = CMSG_LEN(sizeof(unsigned long));
cm->cmsg_level = SOL_RXRPC;
cm->cmsg_type = RXRPC_USER_CALL_ID;
*(unsigned long *)CMSG_DATA(cm) = 0;
msg.msg_name = NULL;
msg.msg_namelen = 0;
msg.msg_iov = NULL;
msg.msg_iovlen = 0;
msg.msg_control = control;
msg.msg_controllen = cm->cmsg_len;
msg.msg_flags = 0;
fd = socket(AF_RXRPC, SOCK_DGRAM, AF_INET);
connect(fd, (struct sockaddr *)&srx, sizeof(srx));
sendmsg(fd, &msg, 0);
return 0;
}
Leading to the following oops:
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
...
RIP: 0010:rxrpc_connect_call+0x42/0xa01
...
Call Trace:
? mark_held_locks+0x47/0x59
? __local_bh_enable_ip+0xb6/0xba
rxrpc_new_client_call+0x3b1/0x762
? rxrpc_do_sendmsg+0x3c0/0x92e
rxrpc_do_sendmsg+0x3c0/0x92e
rxrpc_sendmsg+0x16b/0x1b5
sock_sendmsg+0x2d/0x39
___sys_sendmsg+0x1a4/0x22a
? release_sock+0x19/0x9e
? reacquire_held_locks+0x136/0x160
? release_sock+0x19/0x9e
? find_held_lock+0x2b/0x6e
? __lock_acquire+0x268/0xf73
? rxrpc_connect+0xdd/0xe4
? __local_bh_enable_ip+0xb6/0xba
__sys_sendmsg+0x5e/0x94
do_syscall_64+0x7d/0x1bf
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 2341e0775747 ("rxrpc: Simplify connect() implementation and simplify sendmsg() op")
Reported-by: syzbot+7966f2a0b2c7da8939b4@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With gcc 4.1:
net/rxrpc/output.c: In function ‘rxrpc_send_data_packet’:
net/rxrpc/output.c:338: warning: ‘ret’ may be used uninitialized in this function
Indeed, if the first jump to the send_fragmentable label is made, and
the address family is not handled in the switch() statement, ret will be
used uninitialized.
Fix this by BUG()'ing as is done in other places in rxrpc where internal
support for future address families will need adding. It should not be
possible to reach this normally as the address families are checked
up-front.
Fixes: 5a924b8951f835b5 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Memory_BW metric generates groups including duration_time, which
maps to a software event.
For some reason this makes the group always not count.
Always put duration_time outside a group when generating metrics. It's
always the same time, so no need to group it.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190628220737.13259-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When printing the metrics raw, don't print : after the metricgroups.
This helps the command line completion to complete those too.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190628220737.13259-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
- Add a missing filter for the DRAM_Latency / DRAM_Parallel_Reads metrics
- Remove the useless PMM_* metrics from Skylake
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190628220737.13259-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
- Fix a typo in the man page
- Fix a tip that doesn't make any sense.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190628220900.13741-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add support for Hisi hip08 L3C PMU aliasing.
The kernel driver is in drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lkml.kernel.org/r/1561732552-143038-5-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add support for Hisi hip08 HHA PMU aliasing.
The kernel driver is in drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lkml.kernel.org/r/1561732552-143038-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add support for Hisi hip08 DDRC PMU aliasing. We can now do something like
this:
$perf list
[snip]
uncore ddrc:
uncore_hisi_ddrc.act_cmd
[DDRC active commands. Unit: hisi_sccl,ddrc]
uncore_hisi_ddrc.flux_rcmd
[DDRC read commands. Unit: hisi_sccl,ddrc]
uncore_hisi_ddrc.flux_wcmd
[DDRC write commands. Unit: hisi_sccl,ddrc]
uncore_hisi_ddrc.flux_wr
[DDRC precharge commands. Unit: hisi_sccl,ddrc]
uncore_hisi_ddrc.rnk_chg
[DDRC rank commands. Unit: hisi_sccl,ddrc]
uncore_hisi_ddrc.rw_chg
[DDRC read and write changes. Unit: hisi_sccl,ddrc]
Performance counter stats for 'system wide':
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl1_ddrc0]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl3_ddrc1]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl5_ddrc2]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl7_ddrc3]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl5_ddrc0]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl7_ddrc1]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl1_ddrc3]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl1_ddrc1]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl3_ddrc2]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl5_ddrc3]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl3_ddrc0]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl5_ddrc1]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl7_ddrc2]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl7_ddrc0]
20,421 uncore_hisi_ddrc.flux_rcmd [hisi_sccl1_ddrc2]
0 uncore_hisi_ddrc.flux_rcmd [hisi_sccl3_ddrc3]
1.001559011 seconds time elapsed
The kernel driver is in drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lkml.kernel.org/r/1561732552-143038-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The jevent "Unit" field is used for uncore PMU alias definition.
The form uncore_pmu_example_X is supported, where "X" is a wildcard, to
support multiple instances of the same PMU in a system.
Unfortunately this format not suitable for all uncore PMUs; take the
Hisi DDRC uncore PMU for example, where the name is in the form
hisi_scclX_ddrcY.
For for current jevent parsing, we would be required to hardcode an
uncore alias translation for each possible value of X. This is not
scalable.
Instead, add support for "Unit" field in the form "hisi_sccl,ddrc",
where we can match by hisi_scclX and ddrcY. Tokens in Unit field are
delimited by ','.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lkml.kernel.org/r/1561732552-143038-2-git-send-email-john.garry@huawei.com
[ Shut up older gcc complianing about the last arg to strtok_r() being uninitialized, set that tmp to NULL ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The variable r is being initialized with a value that is never
read and it is being updated later with a new value. The
initialization is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Nikolay Aleksandrov says:
====================
net: bridge: fix possible stale skb pointers
In the bridge driver we have a couple of places which call pskb_may_pull
but we've cached skb pointers before that and use them after which can
lead to out-of-bounds/stale pointer use. I've had these in my "to fix"
list for some time and now we got a report (patch 01) so here they are.
Patches 02-04 are fixes based on code inspection. Also patch 01 was
tested by Martin Weinelt, Martin if you don't mind please add your
tested-by tag to it by replying with Tested-by: name <email>.
I've also briefly tested the set by trying to exercise those code paths.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Don't cache eth dest pointer before calling pskb_may_pull.
Fixes: cf0f02d04a83 ("[BRIDGE]: use llc for receiving STP packets")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We would cache ether dst pointer on input in br_handle_frame_finish but
after the neigh suppress code that could lead to a stale pointer since
both ipv4 and ipv6 suppress code do pskb_may_pull. This means we have to
always reload it after the suppress code so there's no point in having
it cached just retrieve it directly.
Fixes: 057658cb33fbf ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports")
Fixes: ed842faeb2bd ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We get a pointer to the ipv6 hdr in br_ip6_multicast_query but we may
call pskb_may_pull afterwards and end up using a stale pointer.
So use the header directly, it's just 1 place where it's needed.
Fixes: 08b202b67264 ("bridge br_multicast: IPv6 MLD support.")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Tested-by: Martin Weinelt <martin@linuxlounge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We take a pointer to grec prior to calling pskb_may_pull and use it
afterwards to get nsrcs so record nsrcs before the pull when handling
igmp3 and we get a pointer to nsrcs and call pskb_may_pull when handling
mld2 which again could lead to reading 2 bytes out-of-bounds.
==================================================================
BUG: KASAN: use-after-free in br_multicast_rcv+0x480c/0x4ad0 [bridge]
Read of size 2 at addr ffff8880421302b4 by task ksoftirqd/1/16
CPU: 1 PID: 16 Comm: ksoftirqd/1 Tainted: G OE 5.2.0-rc6+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0x71/0xab
print_address_description+0x6a/0x280
? br_multicast_rcv+0x480c/0x4ad0 [bridge]
__kasan_report+0x152/0x1aa
? br_multicast_rcv+0x480c/0x4ad0 [bridge]
? br_multicast_rcv+0x480c/0x4ad0 [bridge]
kasan_report+0xe/0x20
br_multicast_rcv+0x480c/0x4ad0 [bridge]
? br_multicast_disable_port+0x150/0x150 [bridge]
? ktime_get_with_offset+0xb4/0x150
? __kasan_kmalloc.constprop.6+0xa6/0xf0
? __netif_receive_skb+0x1b0/0x1b0
? br_fdb_update+0x10e/0x6e0 [bridge]
? br_handle_frame_finish+0x3c6/0x11d0 [bridge]
br_handle_frame_finish+0x3c6/0x11d0 [bridge]
? br_pass_frame_up+0x3a0/0x3a0 [bridge]
? virtnet_probe+0x1c80/0x1c80 [virtio_net]
br_handle_frame+0x731/0xd90 [bridge]
? select_idle_sibling+0x25/0x7d0
? br_handle_frame_finish+0x11d0/0x11d0 [bridge]
__netif_receive_skb_core+0xced/0x2d70
? virtqueue_get_buf_ctx+0x230/0x1130 [virtio_ring]
? do_xdp_generic+0x20/0x20
? virtqueue_napi_complete+0x39/0x70 [virtio_net]
? virtnet_poll+0x94d/0xc78 [virtio_net]
? receive_buf+0x5120/0x5120 [virtio_net]
? __netif_receive_skb_one_core+0x97/0x1d0
__netif_receive_skb_one_core+0x97/0x1d0
? __netif_receive_skb_core+0x2d70/0x2d70
? _raw_write_trylock+0x100/0x100
? __queue_work+0x41e/0xbe0
process_backlog+0x19c/0x650
? _raw_read_lock_irq+0x40/0x40
net_rx_action+0x71e/0xbc0
? __switch_to_asm+0x40/0x70
? napi_complete_done+0x360/0x360
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __schedule+0x85e/0x14d0
__do_softirq+0x1db/0x5f9
? takeover_tasklets+0x5f0/0x5f0
run_ksoftirqd+0x26/0x40
smpboot_thread_fn+0x443/0x680
? sort_range+0x20/0x20
? schedule+0x94/0x210
? __kthread_parkme+0x78/0xf0
? sort_range+0x20/0x20
kthread+0x2ae/0x3a0
? kthread_create_worker_on_cpu+0xc0/0xc0
ret_from_fork+0x35/0x40
The buggy address belongs to the page:
page:ffffea0001084c00 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
flags: 0xffffc000000000()
raw: 00ffffc000000000 ffffea0000cfca08 ffffea0001098608 0000000000000000
raw: 0000000000000000 0000000000000003 00000000ffffff7f 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888042130180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888042130200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> ffff888042130280: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff888042130300: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888042130380: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Disabling lock debugging due to kernel taint
Fixes: bc8c20acaea1 ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
Reported-by: Martin Weinelt <martin@linuxlounge.net>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Tested-by: Martin Weinelt <martin@linuxlounge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch removes standard netdev stats in ethtool -S.
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
LINE6 drivers allocate the buffers based on the value returned from
usb_maxpacket() calls. The manipulated device may return zero for
this, and this results in the kmalloc() with zero size (and it may
succeed) while the other part of the driver code writes the packet
data with the fixed size -- which eventually overwrites.
This patch adds a simple sanity check for the invalid buffer size for
avoiding that problem.
Reported-by: syzbot+219f00fb49874dcaea17@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
We support many speeds and it doesn't make much sense to list them all
in the Kconfig. Let's just call it Multi-Gigabit.
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Thomas reported that:
| Background:
|
| In preparation of supporting IPI shorthands I changed the CPU offline
| code to software disable the local APIC instead of just masking it.
| That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV
| register.
|
| Failure:
|
| When the CPU comes back online the startup code triggers occasionally
| the warning in apic_pending_intr_clear(). That complains that the IRRs
| are not empty.
|
| The offending vector is the local APIC timer vector who's IRR bit is set
| and stays set.
|
| It took me quite some time to reproduce the issue locally, but now I can
| see what happens.
|
| It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1
| (and hardware support) it behaves correctly.
|
| Here is the series of events:
|
| Guest CPU
|
| goes down
|
| native_cpu_disable()
|
| apic_soft_disable();
|
| play_dead()
|
| ....
|
| startup()
|
| if (apic_enabled())
| apic_pending_intr_clear() <- Not taken
|
| enable APIC
|
| apic_pending_intr_clear() <- Triggers warning because IRR is stale
|
| When this happens then the deadline timer or the regular APIC timer -
| happens with both, has fired shortly before the APIC is disabled, but the
| interrupt was not serviced because the guest CPU was in an interrupt
| disabled region at that point.
|
| The state of the timer vector ISR/IRR bits:
|
| ISR IRR
| before apic_soft_disable() 0 1
| after apic_soft_disable() 0 1
|
| On startup 0 1
|
| Now one would assume that the IRR is cleared after the INIT reset, but this
| happens only on CPU0.
|
| Why?
|
| Because our CPU0 hotplug is just for testing to make sure nothing breaks
| and goes through an NMI wakeup vehicle because INIT would send it through
| the boots-trap code which is not really working if that CPU was not
| physically unplugged.
|
| Now looking at a real world APIC the situation in that case is:
|
| ISR IRR
| before apic_soft_disable() 0 1
| after apic_soft_disable() 0 1
|
| On startup 0 0
|
| Why?
|
| Once the dying CPU reenables interrupts the pending interrupt gets
| delivered as a spurious interupt and then the state is clear.
|
| While that CPU0 hotplug test case is surely an esoteric issue, the APIC
| emulation is still wrong, Even if the play_dead() code would not enable
| interrupts then the pending IRR bit would turn into an ISR .. interrupt
| when the APIC is reenabled on startup.
From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled
* Pending interrupts in the IRR and ISR registers are held and require
masking or handling by the CPU.
In Thomas's testing, hardware cpu will not respect soft disable LAPIC
when IRR has already been set or APICv posted-interrupt is in flight,
so we can skip soft disable APIC checking when clearing IRR and set ISR,
continue to respect soft disable APIC when attempting to set IRR.
Reported-by: Rong Chen <rong.a.chen@intel.com>
Reported-by: Feng Tang <feng.tang@intel.com>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rong Chen <rong.a.chen@intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|