Age | Commit message (Collapse) | Author |
|
Cross-merge networking fixes after downstream PR (net-6.12-rc4).
Conflicts:
107a034d5c1e ("net/mlx5: qos: Store rate groups in a qos domain")
1da9cfd6c41c ("net/mlx5: Unregister notifier on eswitch init failure")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Uprobe needs to fetch args into a percpu buffer, and then copy to ring
buffer to avoid non-atomic context problem.
Sometimes user-space strings, arrays can be very large, but the size of
percpu buffer is only page size. And store_trace_args() won't check
whether these data exceeds a single page or not, caused out-of-bounds
memory access.
It could be reproduced by following steps:
1. build kernel with CONFIG_KASAN enabled
2. save follow program as test.c
```
\#include <stdio.h>
\#include <stdlib.h>
\#include <string.h>
// If string length large than MAX_STRING_SIZE, the fetch_store_strlen()
// will return 0, cause __get_data_size() return shorter size, and
// store_trace_args() will not trigger out-of-bounds access.
// So make string length less than 4096.
\#define STRLEN 4093
void generate_string(char *str, int n)
{
int i;
for (i = 0; i < n; ++i)
{
char c = i % 26 + 'a';
str[i] = c;
}
str[n-1] = '\0';
}
void print_string(char *str)
{
printf("%s\n", str);
}
int main()
{
char tmp[STRLEN];
generate_string(tmp, STRLEN);
print_string(tmp);
return 0;
}
```
3. compile program
`gcc -o test test.c`
4. get the offset of `print_string()`
```
objdump -t test | grep -w print_string
0000000000401199 g F .text 000000000000001b print_string
```
5. configure uprobe with offset 0x1199
```
off=0x1199
cd /sys/kernel/debug/tracing/
echo "p /root/test:${off} arg1=+0(%di):ustring arg2=\$comm arg3=+0(%di):ustring"
> uprobe_events
echo 1 > events/uprobes/enable
echo 1 > tracing_on
```
6. run `test`, and kasan will report error.
==================================================================
BUG: KASAN: use-after-free in strncpy_from_user+0x1d6/0x1f0
Write of size 8 at addr ffff88812311c004 by task test/499CPU: 0 UID: 0 PID: 499 Comm: test Not tainted 6.12.0-rc3+ #18
Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x55/0x70
print_address_description.constprop.0+0x27/0x310
kasan_report+0x10f/0x120
? strncpy_from_user+0x1d6/0x1f0
strncpy_from_user+0x1d6/0x1f0
? rmqueue.constprop.0+0x70d/0x2ad0
process_fetch_insn+0xb26/0x1470
? __pfx_process_fetch_insn+0x10/0x10
? _raw_spin_lock+0x85/0xe0
? __pfx__raw_spin_lock+0x10/0x10
? __pte_offset_map+0x1f/0x2d0
? unwind_next_frame+0xc5f/0x1f80
? arch_stack_walk+0x68/0xf0
? is_bpf_text_address+0x23/0x30
? kernel_text_address.part.0+0xbb/0xd0
? __kernel_text_address+0x66/0xb0
? unwind_get_return_address+0x5e/0xa0
? __pfx_stack_trace_consume_entry+0x10/0x10
? arch_stack_walk+0xa2/0xf0
? _raw_spin_lock_irqsave+0x8b/0xf0
? __pfx__raw_spin_lock_irqsave+0x10/0x10
? depot_alloc_stack+0x4c/0x1f0
? _raw_spin_unlock_irqrestore+0xe/0x30
? stack_depot_save_flags+0x35d/0x4f0
? kasan_save_stack+0x34/0x50
? kasan_save_stack+0x24/0x50
? mutex_lock+0x91/0xe0
? __pfx_mutex_lock+0x10/0x10
prepare_uprobe_buffer.part.0+0x2cd/0x500
uprobe_dispatcher+0x2c3/0x6a0
? __pfx_uprobe_dispatcher+0x10/0x10
? __kasan_slab_alloc+0x4d/0x90
handler_chain+0xdd/0x3e0
handle_swbp+0x26e/0x3d0
? __pfx_handle_swbp+0x10/0x10
? uprobe_pre_sstep_notifier+0x151/0x1b0
irqentry_exit_to_user_mode+0xe2/0x1b0
asm_exc_int3+0x39/0x40
RIP: 0033:0x401199
Code: 01 c2 0f b6 45 fb 88 02 83 45 fc 01 8b 45 fc 3b 45 e4 7c b7 8b 45 e4 48 98 48 8d 50 ff 48 8b 45 e8 48 01 d0 ce
RSP: 002b:00007ffdf00576a8 EFLAGS: 00000206
RAX: 00007ffdf00576b0 RBX: 0000000000000000 RCX: 0000000000000ff2
RDX: 0000000000000ffc RSI: 0000000000000ffd RDI: 00007ffdf00576b0
RBP: 00007ffdf00586b0 R08: 00007feb2f9c0d20 R09: 00007feb2f9c0d20
R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000401040
R13: 00007ffdf0058780 R14: 0000000000000000 R15: 0000000000000000
</TASK>
This commit enforces the buffer's maxlen less than a page-size to avoid
store_trace_args() out-of-memory access.
Link: https://lore.kernel.org/all/20241015060148.1108331-1-mqaio@linux.alibaba.com/
Fixes: dcad1a204f72 ("tracing/uprobes: Fetch args before reserving a ring buffer")
Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
|
|
This fixes a fsck bug on a very old filesystem (pre mainline merge).
Fixes: 72350ee0ea22 ("bcachefs: Kill snapshot arg to fsck_write_inode()")
Reported-by: Marcin Mirosław <marcin@mejor.pl>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Reported-by: Marcin Mirosław <marcin@mejor.pl>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Pull bluetooth fixes from Luiz Augusto Von Dentz:
- ISO: Fix multiple init when debugfs is disabled
- Call iso_exit() on module unload
- Remove debugfs directory on module init failure
- btusb: Fix not being able to reconnect after suspend
- btusb: Fix regression with fake CSR controllers 0a12:0001
- bnep: fix wild-memory-access in proto_unregister
Note: normally the bluetooth fixes go through the networking tree, but
this missed the weekly merge, and two of the commits fix regressions
that have caused a fair amount of noise and have now hit stable too:
https://lore.kernel.org/all/4e1977ca-6166-4891-965e-34a6f319035f@leemhuis.info/
So I'm pulling it directly just to expedite things and not miss yet
another -rc release. This is not meant to become a new pattern.
* tag 'for-net-2024-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001
Bluetooth: bnep: fix wild-memory-access in proto_unregister
Bluetooth: btusb: Fix not being able to reconnect after suspend
Bluetooth: Remove debugfs directory on module init failure
Bluetooth: Call iso_exit() on module unload
Bluetooth: ISO: Fix multiple init when debugfs is disabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Mostly error path fixes, but one pretty serious interrupt problem in
the Ocelot driver as well:
- Fix two error paths and a missing semicolon in the Intel driver
- Add a missing ACPI ID for the Intel Panther Lake
- Check return value of devm_kasprintf() in the Apple and STM32
drivers
- Add a missing mutex_destroy() in the aw9523 driver
- Fix a double free in cv1800_pctrl_dt_node_to_map() in the Sophgo
driver
- Fix a double free in ma35_pinctrl_dt_node_to_map_func() in the
Nuvoton driver
- Fix a bug in the Ocelot interrupt handler making the system hang"
* tag 'pinctrl-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: ocelot: fix system hang on level based interrupts
pinctrl: nuvoton: fix a double free in ma35_pinctrl_dt_node_to_map_func()
pinctrl: sophgo: fix double free in cv1800_pctrl_dt_node_to_map()
pinctrl: intel: platform: Add Panther Lake to the list of supported
pinctrl: aw9523: add missing mutex_destroy
pinctrl: stm32: check devm_kasprintf() returned value
pinctrl: apple: check devm_kasprintf() returned value
pinctrl: intel: platform: use semicolon instead of comma in ncommunities assignment
pinctrl: intel: platform: fix error path in device_for_each_child_node()
|
|
kvmalloc() doesn't support allocations > INT_MAX, but vmalloc() does -
the limit should be lifted, but we can work around this for now.
A user with a 75 TB filesystem reported the following journal replay
error:
https://github.com/koverstreet/bcachefs/issues/769
In journal replay we have to sort and dedup all the keys from the
journal, which means we need a large contiguous allocation. Given that
the user has 128GB of ram, the 2GB limit on allocation size has become
far too small.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix a bug where mount was failing with -ERESTARTSYS:
https://github.com/koverstreet/bcachefs/issues/741
We only want the interruptible wait when called from fsync.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We only warn about having a btree_trans that wasn't passed in if we'll
be prompting.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull misc driver fixes from Greg KH:
"Here are a number of small char/misc/iio driver fixes for 6.12-rc4:
- loads of small iio driver fixes for reported problems
- parport driver out-of-bounds fix
- Kconfig description and MAINTAINERS file updates
All of these, except for the Kconfig and MAINTAINERS file updates have
been in linux-next all week. Those other two are just documentation
changes and will have no runtime issues and were merged on Friday"
* tag 'char-misc-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (39 commits)
misc: rtsx: list supported models in Kconfig help
MAINTAINERS: Remove some entries due to various compliance requirements.
misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for OTP device
misc: microchip: pci1xxxx: add support for NVMEM_DEVID_AUTO for EEPROM device
parport: Proper fix for array out-of-bounds access
iio: frequency: admv4420: fix missing select REMAP_SPI in Kconfig
iio: frequency: {admv4420,adrf6780}: format Kconfig entries
iio: adc: ad4695: Add missing Kconfig select
iio: adc: ti-ads8688: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
iio: hid-sensors: Fix an error handling path in _hid_sensor_set_report_latency()
iioc: dac: ltc2664: Fix span variable usage in ltc2664_channel_config()
iio: dac: stm32-dac-core: add missing select REGMAP_MMIO in Kconfig
iio: dac: ltc1660: add missing select REGMAP_SPI in Kconfig
iio: dac: ad5770r: add missing select REGMAP_SPI in Kconfig
iio: amplifiers: ada4250: add missing select REGMAP_SPI in Kconfig
iio: frequency: adf4377: add missing select REMAP_SPI in Kconfig
iio: resolver: ad2s1210: add missing select (TRIGGERED_)BUFFER in Kconfig
iio: resolver: ad2s1210 add missing select REGMAP in Kconfig
iio: proximity: mb1232: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
iio: pressure: bm1390: add missing select IIO_(TRIGGERED_)BUFFER in Kconfig
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty and serial driver fixes for 6.12-rc4:
- qcom-geni serial driver fixes, wow what a mess of a UART chip that
thing is...
- vt infoleak fix for odd font sizes
- imx serial driver bugfix
- yet-another n_gsm ldisc bugfix, slowly chipping down the issues in
that piece of code
All of these have been in linux-next for over a week with no reported
issues"
* tag 'tty-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: qcom-geni: rename suspend functions
serial: qcom-geni: drop unused receive parameter
serial: qcom-geni: drop flip buffer WARN()
serial: qcom-geni: fix rx cancel dma status bit
serial: qcom-geni: fix receiver enable
serial: qcom-geni: fix dma rx cancellation
serial: qcom-geni: fix shutdown race
serial: qcom-geni: revert broken hibernation support
serial: qcom-geni: fix polled console initialisation
serial: imx: Update mctrl old_status on RTSD interrupt
tty: n_gsm: Fix use-after-free in gsm_cleanup_mux
vt: prevent kernel-infoleak in con_font_get()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
"Here are some small USB driver fixes and new device ids for 6.12-rc4:
- xhci driver fixes for a number of reported issues
- new usb-serial driver ids
- dwc3 driver fixes for reported problems.
- usb gadget driver fixes for reported problems
- typec driver fixes
- MAINTAINER file updates
All of these have been in linux-next this week with no reported issues"
* tag 'usb-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: option: add Telit FN920C04 MBIM compositions
USB: serial: option: add support for Quectel EG916Q-GL
xhci: dbc: honor usb transfer size boundaries.
usb: xhci: Fix handling errors mid TD followed by other errors
xhci: Mitigate failed set dequeue pointer commands
xhci: Fix incorrect stream context type macro
USB: gadget: dummy-hcd: Fix "task hung" problem
usb: gadget: f_uac2: fix return value for UAC2_ATTRIBUTE_STRING store
usb: dwc3: core: Fix system suspend on TI AM62 platforms
xhci: tegra: fix checked USB2 port number
usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFG
usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEF
usb: typec: altmode should keep reference to parent
MAINTAINERS: usb: raw-gadget: add bug tracker link
MAINTAINERS: Add an entry for the LJCA drivers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Explicitly disable the TSC deadline timer when going idle to address
some CPU errata in that area
- Do not apply the Zenbleed fix on anything else except AMD Zen2 on the
late microcode loading path
- Clear CPU buffers later in the NMI exit path on 32-bit to avoid
register clearing while they still contain sensitive data, for the
RDFS mitigation
- Do not clobber EFLAGS.ZF with VERW on the opportunistic SYSRET exit
path on 32-bit
- Fix parsing issues of memory bandwidth specification in sysfs for
resctrl's memory bandwidth allocation feature
- Other small cleanups and improvements
* tag 'x86_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Always explicitly disarm TSC-deadline timer
x86/CPU/AMD: Only apply Zenbleed fix for Zen2 during late microcode load
x86/bugs: Use code segment selector for VERW operand
x86/entry_32: Clear CPU buffers after register restore in NMI return
x86/entry_32: Do not clobber user EFLAGS.ZF
x86/resctrl: Annotate get_mem_config() functions as __init
x86/resctrl: Avoid overflow in MB settings in bw_validate()
x86/amd_nb: Add new PCI ID for AMD family 1Ah model 20h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Fix a case for sifive-plic where an interrupt gets disabled *and*
masked and remains masked when it gets reenabled later
- Plug a small race in GIC-v4 where userspace can force an affinity
change of a virtual CPU (vPE) in its unmapping path
- Do not mix the two sets of ocelot irqchip's registers in the mask
calculation of the main interrupt sticky register
- Other smaller fixlets and cleanups
* tag 'irq_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/renesas-rzg2l: Fix missing put_device
irqchip/riscv-intc: Fix SMP=n boot with ACPI
irqchip/sifive-plic: Unmask interrupt in plic_irq_enable()
irqchip/gic-v4: Don't allow a VMOVP on a dying VPE
irqchip/sifive-plic: Return error code on failure
irqchip/riscv-imsic: Fix output text of base address
irqchip/ocelot: Comment sticky register clearing code
irqchip/ocelot: Fix trigger register address
irqchip: Remove obsolete config ARM_GIC_V3_ITS_PCI
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduling fixes from Borislav Petkov:
- Add PREEMPT_RT maintainers
- Fix another aspect of delayed dequeued tasks wrt determining their
state, i.e., whether they're runnable or blocked
- Handle delayed dequeued tasks and their migration wrt PSI properly
- Fix the situation where a delayed dequeue task gets enqueued into a
new class, which should not happen
- Fix a case where memory allocation would happen while the runqueue
lock is held, which is a no-no
- Do not over-schedule when tasks with shorter slices preempt the
currently running task
- Make sure delayed to deque entities are properly handled before
unthrottling
- Other smaller cleanups and improvements
* tag 'sched_urgent_for_v6.12_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Add an entry for PREEMPT_RT.
sched/fair: Fix external p->on_rq users
sched/psi: Fix mistaken CPU pressure indication after corrupted task state bug
sched/core: Dequeue PSI signals for blocked tasks that are delayed
sched: Fix delayed_dequeue vs switched_from_fair()
sched/core: Disable page allocation in task_tick_mm_cid()
sched/deadline: Use hrtick_enabled_dl() before start_hrtick_dl()
sched/eevdf: Fix wakeup-preempt by checking cfs_rq->nr_running
sched: Fix sched_delayed vs cfs_bandwidth
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A single fix for a build failure introduced this merge window"
* tag 'for-linus-6.12a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Remove dependency between pciback and privcmd
|
|
git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
"Just another small tracing fix from Sean"
* tag 'dma-mapping-6.12-2024-10-20' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: fix tracing dma_alloc/free with vmalloc'd memory
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.12, take #3
- Stop wasting space in the HYP idmap, as we are dangerously close
to the 4kB limit, and this has already exploded in -next
- Fix another race in vgic_init()
- Fix a UBSAN error when faking the cache topology with MTE
enabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.12, take #2
- Fix the guest view of the ID registers, making the relevant fields
writable from userspace (affecting ID_AA64DFR0_EL1 and ID_AA64PFR1_EL1)
- Correcly expose S1PIE to guests, fixing a regression introduced
in 6.12-rc1 with the S1POE support
- Fix the recycling of stage-2 shadow MMUs by tracking the context
(are we allowed to block or not) as well as the recycling state
- Address a couple of issues with the vgic when userspace misconfigures
the emulation, resulting in various splats. Headaches courtesy
of our Syzkaller friends
|
|
For the external interrupt updating procedure in imsic, there was a
spinlock to protect it already. But since it should not be preempted in
any cases, we should turn to use raw_spinlock to prevent any preemption
in case PREEMPT_RT was enabled.
Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Reviewed-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Message-ID: <20240919160126.44487-1-cyan.yang@sifive.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When looking for a "mangled", i.e. dynamic, CPUID entry, terminate the
walk based on the number of array _entries_, not the size in bytes of
the array. Iterating based on the total size of the array can result in
false passes, e.g. if the random data beyond the array happens to match
a CPUID entry's function and index.
Fixes: fb18d053b7f8 ("selftest: kvm: x86: test KVM_GET_CPUID2 and guest visible CPUIDs against KVM_GET_SUPPORTED_CPUID")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-ID: <20241003234337.273364-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Some distros switched gcc to '-march=x86-64-v3' by default and while it's
hard to find a CPU which doesn't support it today, many KVM selftests fail
with
==== Test Assertion Failure ====
lib/x86_64/processor.c:570: Unhandled exception in guest
pid=72747 tid=72747 errno=4 - Interrupted system call
Unhandled exception '0x6' at guest RIP '0x4104f7'
The failure is easy to reproduce elsewhere with
$ make clean && CFLAGS='-march=x86-64-v3' make -j && ./x86_64/kvm_pv_test
The root cause of the problem seems to be that with '-march=x86-64-v3' GCC
uses AVX* instructions (VMOVQ in the example above) and without prior
XSETBV() in the guest this results in #UD. It is certainly possible to add
it there, e.g. the following saves the day as well:
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-ID: <20240920154422.2890096-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In NC-SI specification, NC-SI is using RMII, not MII.
Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Message-ID: <20241018053331.1900100-1-jacky_chou@aspeedtech.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
There are some spelling mistakes of 'accelaration', 'exprienced' and
'rewritting' in comments which should be 'acceleration', 'experienced'
and 'rewriting'.
Suggested-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/all/20241017162846.GA51712@kernel.org/
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <90D42CB167CA0842+20241018021910.31359-1-wangyuli@uniontech.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
Register a6d/12 is shadowing register MDIO_AN_EEE_ADV2. So this line
disables advertisement of EEE at 2.5G. Latest vendor driver r8125
doesn't do this (any longer?), so this mode seems to be safe.
EEE saves quite some energy, therefore enable this mode per default.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <95dd5a0c-09ea-4847-94d9-b7aa3063e8ff@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
The first boards show up with Realtek's RTL8125D. This MAC/PHY chip
comes with an integrated 2.5Gbps PHY with ID 0x001cc841. It's not
clear yet whether there's an external version of this PHY and how
Realtek calls it, therefore use the numeric id for now.
Link: https://lore.kernel.org/netdev/2ada65e1-5dfa-456c-9334-2bc51272e9da@gmail.com/T/
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Message-ID: <7d2924de-053b-44d2-a479-870dc3878170@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
Run airoha_qdma_cleanup_tx_queue() in ndo_stop callback in order to
unmap pending skbs. Moreover, reset BQL txq state stopping the netdevice,
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Message-ID: <20241017-airoha-en7581-reset-bql-v1-1-08c0c9888de5@kernel.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
This patch propagates error code correctly in cal_cycle()
and improve with FIELD_GET().
Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
This patch shrinks line wrapping to 80 chars. Also, in
tx_amp_fill_result(), use FIELD_PREP() to prettify code.
Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
This patch fixes spelling errors, re-arrange vars with
reverse Xmas tree and remove unnecessary parens in
mediatek-ge-soc.c.
Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
Remove rtl_dash_loop_wait_high/low to simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <fb8c490c-2d92-48f5-8bbf-1fc1f2ee1649@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
warn level
In case of a problem with firmware loading we inform at the driver level,
in addition the firmware load code itself issues warnings. Therefore
switch to firmware_request_nowarn() to avoid duplicated error messages.
In addition switch to warn level because the firmware is optional and
typically just fixes compatibility issues.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <d9c5094c-89a6-40e2-b5fe-8df7df4624ef@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
So far we use a custom flag to define when a task can be scheduled and
when not. Let's use the standard mechanism with disable_work() et al
instead.
Note that in rtl8169_close() we can remove the call to cancel_work()
because we now call disable_work_sync() in rtl8169_down() already.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
There's not really a benefit here in taking the RTNL lock. The task
handler does exception handling only, so we're in trouble anyway when
we come here, and there's no need to protect against e.g. a parallel
ethtool call.
A benefit of removing the RTNL lock here is that we now can
synchronously cancel the workqueue from a context holding the RTNL mutex.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
fbnic fails to link as built-in when PTP support is in a loadable
module:
aarch64-linux-ld: drivers/net/ethernet/meta/fbnic/fbnic_ethtool.o: in function `fbnic_get_ts_info':
fbnic_ethtool.c:(.text+0x428): undefined reference to `ptp_clock_index'
aarch64-linux-ld: drivers/net/ethernet/meta/fbnic/fbnic_time.o: in function `fbnic_time_start':
fbnic_time.c:(.text+0x820): undefined reference to `ptp_schedule_worker'
aarch64-linux-ld: drivers/net/ethernet/meta/fbnic/fbnic_time.o: in function `fbnic_ptp_setup':
fbnic_time.c:(.text+0xa68): undefined reference to `ptp_clock_register'
Add the appropriate dependency to enforce this.
Fixes: 6a2b3ede9543 ("eth: fbnic: add RX packets timestamping support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Message-ID: <20241016062303.2551686-1-arnd@kernel.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
samples/pktgen is missing in the MAINTAINERS file.
Suggested-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Message-ID: <20241018005301.10052-1-liuhangbin@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
Mapping all my previously used emails to my kernel.org email.
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Message-ID: <172909057364.2452383.8019986488234344607.stgit@firesoul>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
mv88e6393x_port_set_policy doesn't correctly shift the ptr value when
converting the policy format between the old and new styles, so the
target register ends up with the ptr being written over the data bits.
Shift the pointer to align with the format expected by
mv88e6393x_port_policy_write().
Fixes: 6584b26020fc ("net: dsa: mv88e6xxx: implement .port_set_policy for Amethyst")
Signed-off-by: Peter Rashleigh <peter@rashleigh.ca>
Reviewed-by: Simon Horman <horms@kernel.org>
Message-ID: <20241016040822.3917-1-peter@rashleigh.ca>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
Ignore nCR3[4:0] when loading PDPTEs from memory for nested SVM, as bits
4:0 of CR3 are ignored when PAE paging is used, and thus VMRUN doesn't
enforce 32-byte alignment of nCR3.
In the absolute worst case scenario, failure to ignore bits 4:0 can result
in an out-of-bounds read, e.g. if the target page is at the end of a
memslot, and the VMM isn't using guard pages.
Per the APM:
The CR3 register points to the base address of the page-directory-pointer
table. The page-directory-pointer table is aligned on a 32-byte boundary,
with the low 5 address bits 4:0 assumed to be 0.
And the SDM's much more explicit:
4:0 Ignored
Note, KVM gets this right when loading PDPTRs, it's only the nSVM flow
that is broken.
Fixes: e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE from guest memory")
Reported-by: Kirk Swidowski <swidowski@google.com>
Cc: Andy Nguyen <theflow@google.com>
Cc: 3pvd <3pvd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241009140838.1036226-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Reset the segment cache after segment initialization in vmx_vcpu_reset()
to harden KVM against caching stale/uninitialized data. Without the
recent fix to bypass the cache in kvm_arch_vcpu_put(), the following
scenario is possible:
- vCPU is just created, and the vCPU thread is preempted before
SS.AR_BYTES is written in vmx_vcpu_reset().
- When scheduling out the vCPU task, kvm_arch_vcpu_in_kernel() =>
vmx_get_cpl() reads and caches '0' for SS.AR_BYTES.
- vmx_vcpu_reset() => seg_setup() configures SS.AR_BYTES, but doesn't
invoke vmx_segment_cache_clear() to invalidate the cache.
As a result, KVM retains a stale value in the cache, which can be read,
e.g. via KVM_GET_SREGS. Usually this is not a problem because the VMX
segment cache is reset on each VM-Exit, but if the userspace VMM (e.g KVM
selftests) reads and writes system registers just after the vCPU was
created, _without_ modifying SS.AR_BYTES, userspace will write back the
stale '0' value and ultimately will trigger a VM-Entry failure due to
incorrect SS segment type.
Invalidating the cache after writing the VMCS doesn't address the general
issue of cache accesses from IRQ context being unsafe, but it does prevent
KVM from clobbering the VMCS, i.e. mitigates the harm done _if_ KVM has a
bug that results in an unsafe cache access.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Fixes: 2fb92db1ec08 ("KVM: VMX: Cache vmcs segment fields")
[sean: rework changelog to account for previous patch]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241009175002.1118178-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Massage the documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL to call out that
it applies to moved memslots as well as deleted memslots, to avoid KVM's
"fast zap" terminology (which has no meaning for userspace), and to reword
the documented targeted zap behavior to specifically say that KVM _may_
zap a subset of all SPTEs. As evidenced by the fix to zap non-leafs SPTEs
with gPTEs, formally documenting KVM's exact internal behavior is risky
and unnecessary.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241009192345.1148353-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Add a lockdep assertion in kvm_unmap_gfn_range() to ensure that either
mmu_invalidate_in_progress is elevated, or that the range is being zapped
due to memslot removal (loosely detected by slots_lock being held).
Zapping SPTEs without mmu_invalidate_{in_progress,seq} protection is unsafe
as KVM's page fault path snapshots state before acquiring mmu_lock, and
thus can create SPTEs with stale information if vCPUs aren't forced to
retry faults (due to seeing an in-progress or past MMU invalidation).
Memslot removal is a special case, as the memslot is retrieved outside of
mmu_invalidate_seq, i.e. doesn't use the "standard" protections, and
instead relies on SRCU synchronization to ensure any in-flight page faults
are fully resolved before zapping SPTEs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20241009192345.1148353-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When performing a targeted zap on memslot removal, zap only MMU pages that
shadow guest PTEs, as zapping all SPs that "match" the gfn is inexact and
unnecessary. Furthermore, for_each_gfn_valid_sp() arguably shouldn't
exist, because it doesn't do what most people would it expect it to do.
The "round gfn for level" adjustment that is done for direct SPs (no gPTE)
means that the exact gfn comparison will not get a match, even when a SP
does "cover" a gfn, or was even created specifically for a gfn.
For memslot deletion specifically, KVM's behavior will vary significantly
based on the size and alignment of a memslot, and in weird ways. E.g. for
a 4KiB memslot, KVM will zap more SPs if the slot is 1GiB aligned than if
it's only 4KiB aligned. And as described below, zapping SPs in the
aligned case overzaps for direct MMUs, as odds are good the upper-level
SPs are serving other memslots.
To iterate over all potentially-relevant gfns, KVM would need to make a
pass over the hash table for each level, with the gfn used for lookup
rounded for said level. And then check that the SP is of the correct
level, too, e.g. to avoid over-zapping.
But even then, KVM would massively overzap, as processing every level is
all but guaranteed to zap SPs that serve other memslots, especially if the
memslot being removed is relatively small. KVM could mitigate that issue
by processing only levels that can be possible guest huge pages, i.e. are
less likely to be re-used for other memslot, but while somewhat logical,
that's quite arbitrary and would be a bit of a mess to implement.
So, zap only SPs with gPTEs, as the resulting behavior is easy to describe,
is predictable, and is explicitly minimal, i.e. KVM only zaps SPs that
absolutely must be zapped.
Cc: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241009192345.1148353-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
AMD SEV-SNP and Intel TDX have limited access to MTRR: either it is not
advertised in CPUID or it cannot be programmed (on TDX, due to #VE on
CR0.CD clear).
This results in guests using uncached mappings where it shouldn't and
pmd/pud_set_huge() failures due to non-uniform memory type reported by
mtrr_type_lookup().
Override MTRR state, making it WB by default as the kernel does for
Hyper-V guests.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: Binbin Wu <binbin.wu@intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Message-ID: <20241015095818.357915-1-kirill.shutemov@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The last use of kvm_vcpu_gfn_to_pfn_atomic was removed by commit
1bbc60d0c7e5 ("KVM: x86/mmu: Remove MMU auditing")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Message-ID: <20241001141354.18009-3-linux@treblig.org>
[Adjust Documentation/virt/kvm/locking.rst. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The last use of kvm_vcpu_gfn_to_pfn was removed by commit
b1624f99aa8f ("KVM: Remove kvm_vcpu_gfn_to_page() and kvm_vcpu_gpa_to_page()")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Message-ID: <20241001141354.18009-2-linux@treblig.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Pull one more io_uring fix from Jens Axboe:
"Fix for a regression introduced in 6.12-rc2, where a condition check
was negated and hence -EAGAIN would bubble back up up to userspace
rather than trigger a retry condition"
* tag 'io_uring-6.12-20241019' of git://git.kernel.dk/linux:
io_uring/rw: fix wrong NOWAIT check in io_rw_init_file()
|
|
build_skb() returns NULL in case of a memory allocation failure so handle
it inside __octep_oq_process_rx() to avoid NULL pointer dereference.
__octep_oq_process_rx() is called during NAPI polling by the driver. If
skb allocation fails, keep on pulling packets out of the Rx DMA queue: we
shouldn't break the polling immediately and thus falsely indicate to the
octep_napi_poll() that the Rx pressure is going down. As there is no
associated skb in this case, don't process the packets and don't push them
up the network stack - they are skipped.
Helper function is implemented to unmmap/flush all the fragment buffers
used by the dropped packet. 'alloc_failures' counter is incremented to
mark the skb allocation error in driver statistics.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 37d79d059606 ("octeon_ep: add Tx/Rx processing and interrupt support")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|
|
The common code with some packet and index manipulations is extracted and
moved to newly implemented helper to make the code more readable and avoid
duplication. This is a preparation for skb allocation failure handling.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Suggested-by: Simon Horman <horms@kernel.org>
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|