Age | Commit message (Collapse) | Author |
|
Verify that the replicated kernel image for the non-boot nodes matches
the boot kernel image, and report differences found. This ensures that
the non-boot modes are running an identical copy of the kernel.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Add a module to allow kernel text replication to be tested; this
exposes some data in procfs which can be used to verify that:
(a) we're using different page tables in TTBR1 on CPUs in different
NUMA nodes
(b) that CPUs in different NUMA nodes are indeed accessing different
copies of the kernel
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Add a kernel configuration option to determine whether kernel text
replication should default to being enabled or disabled at boot
without a command line specifier.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Add the Kconfig symbol for kernel text replication. This unfortunately
requires KASAN and kernel text randomisation options to be disabled at
the moment.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Provide an early kernel option "ktext=" which allows the kernel text
replication to be enabled. This takes a boolean argument.
The way this has been implemented means that we take all the same paths
through the kernel at runtime whether kernel text replication has been
enabled or not; this allows the performance effects of the code changes
to be evaluated separately from the act of running with replicating the
kernel text.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Include as much of the read-only data in the replication as we can
without needing to move away from the generic RO_DATA() macro in
the linker script.
Unfortunately, the read-only data section is immedaitely followed
by the read-only after init data with no page alignment, which
means we can't have separate mappings for the read-only data
section and everything else. Changing that would mean replacing
the generic RO_DATA() macro which increases the maintenance burden.
however, this is likely not worth the effort as the majority of
read-only data will be covered.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Setup page table entries in each non-boot NUMA node page table to
point at each node's own copy of the kernel text. This switches
each node to use its own unique copy of the kernel text.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Add changes for CNP (Common Not Private) support of kernel text
replication. Although text replication has only been tested on
dual-socket Ampere A1 systems, provided the different NUMA nodes
are not part of the same inner shareable domain, CNP should not
be a problem.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Arrange for secondary CPUs to boot with TTBR1 pointing at the
appropriate per-node copy of the kernel page tables for the CPUs NUMA
node.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Allocate the level 0 page tables for the per-node kernel text
replication, but copy all level 0 table entries from the NUMA node 0
table. Therefore, for the time being, each node's level 0 page tables
will contain identical entries, and thus other nodes will continue
to use the node 0 kernel text.
Since the level 0 page tables can be updated at runtime to add entries
for vmalloc and module space, propagate these updates to the other
swapper page tables. The exception is if we see an update for the
level 0 entry which points to the kernel mapping.
We also need to setup a copy of the trampoline page tables as well, as
the assembly code relies on the two page tables being a fixed offset
apart.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Add a series of helpers for the swapper page directories - a set which
return those for the calling CPU, and those which take the NUMA node
number.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Add a struct definition for the level zero page table group (the
optional trampoline page tables, reserved page tables, and swapper page
tables).
Add a symbol and extern declaration for the node 0 page table group.
Add an array of pointers to per-node page tables, which will default to
using the node 0 page table group.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
aarch64_insn_write_literal_u64() was introduced in v6.3-rc1 for
updating ftrace ops pointers in the kernel text. This needs to be
fixed up for kernel text replication, so provide a version that
will update the mapping.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Add support for text patching on our replicated texts.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Allocate memory on the appropriate node for the per-node copies of the
kernel text, and copy the kernel text to that memory. Clean and
invalidate the caches to the point of unification so that the copied
text is correctly visible to the target node.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
The kernel text and modules must be in separate L0 page table entries.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
A simple patch that adds an empty function for kernel text replication
initialisation and hooks it into the initialisation path.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Kernel text replication needs to maintain separate per-node page
tables for the kernel text. In order to do this without affecting
other kernel memory mappings, placing the kernel such that it does
not share a L0 page table entry with any other mapping is desirable.
Prior to this commit, the layout without KASLR was:
+----------+
| vmalloc |
+----------+
| Kernel |
+----------+ MODULES_END, VMALLOC_START, KIMAGE_VADDR =
| Modules | MODULES_VADDR + MODULES_VSIZE
+----------+ MODULES_VADDR = _PAGE_END(VA_BITS_MIN)
| VA space |
+----------+ 0
This becomes:
+----------+
| vmalloc |
+----------+ VMALLOC_START = MODULES_END + PGDIR_SIZE
| Kernel |
+----------+ MODULES_END, KIMAGE_VADDR = _PAGE_END(VA_BITS_MIN) + PGDIR_SIZE
| Modules |
+----------+ MODULES_VADDR = MODULES_END - MODULES_VSIZE
| VA space |
+----------+ 0
This assumes MODULES_VSIZE (128M) <= PGDIR_SIZE.
One side effect of this change is that KIMAGE_VADDR's definition now
includes PGDIR_SIZE (to leave room for the modules) but this is not
defined when asm/memory.h is included. This means KIMAGE_VADDR can
not be used in inline functions within this file, so we convert
kaslr_offset() and kaslr_enabled() to be macros instead.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
When we hook into the kernel text patching code, we will need to call
clean_dcache_range_nopatch() to ensure that the patching of the
replicated kernel text is properly visible to other CPUs. Make this
function available to the replication code.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Arrange for __cpu_replace_ttbr1() to take the physical address of the
page tables rather than the virtual page table pointer. This will allow
us to provide unique swapper page tables depending on the NUMA node.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Improve the detection when to run atomic transfer handlers for kernels
with preemption disabled. This removes some false positive splats a
number of users were seeing if their driver didn't have support for
atomic transfers.
Also, fix a typo in the docs while we are here"
* tag 'i2c-for-6.7-final' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: core: Fix atomic xfer check for non-preempt config
Documentation/i2c: fix spelling error in i2c-address-translators
|
|
Since commit aa49c90894d0 ("i2c: core: Run atomic i2c xfer when
!preemptible"), the whole reboot/power off sequence on non-preempt kernels
is using atomic i2c xfer, as !preemptible() always results to 1.
During device_shutdown(), the i2c might be used a lot and not all busses
have implemented an atomic xfer handler. This results in a lot of
avoidable noise, like:
[ 12.687169] No atomic I2C transfer handler for 'i2c-0'
[ 12.692313] WARNING: CPU: 6 PID: 275 at drivers/i2c/i2c-core.h:40 i2c_smbus_xfer+0x100/0x118
...
Fix this by allowing non-atomic xfer when the interrupts are enabled, as
it was before.
Link: https://lore.kernel.org/r/20231222230106.73f030a5@yea
Link: https://lore.kernel.org/r/20240102150350.3180741-1-mwalle@kernel.org
Link: https://lore.kernel.org/linux-i2c/13271b9b-4132-46ef-abf8-2c311967bb46@mailbox.org/
Fixes: aa49c90894d0 ("i2c: core: Run atomic i2c xfer when !preemptible")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com>
Tested-by: Michael Walle <mwalle@kernel.org>
Tested-by: Tor Vic <torvic9@mailbox.org>
[wsa: removed a comment which needs more work, code is ok]
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc mm fixes from Andrew Morton:
"12 hotfixes.
Two are cc:stable and the remainder either address post-6.7 issues or
aren't considered necessary for earlier kernel versions"
* tag 'mm-hotfixes-stable-2024-01-05-11-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info()
mailmap: add entries for Mathieu Othacehe
MAINTAINERS: change vmware.com addresses to broadcom.com
arch/mm/fault: fix major fault accounting when retrying under per-VMA lock
mm/mglru: skip special VMAs in lru_gen_look_around()
MAINTAINERS: hand over hwpoison maintainership to Miaohe Lin
MAINTAINERS: remove hugetlb maintainer Mike Kravetz
mm: fix unmap_mapping_range high bits shift bug
mm: memcg: fix split queue list crash when large folio migration
mm: fix arithmetic for max_prop_frac when setting max_ratio
mm: fix arithmetic for bdi min_ratio
mm: align larger anonymous mappings on THP boundaries
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fix from Chuck Lever:
- Fix another regression in the NFSD administrative API
* tag 'nfsd-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: drop the nfsd_put helper
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Takashi Sakamoto:
"A single patch to suppress unexpected system reboot in AMD Ryzen
machines with PCIe card consisting of Asmedia ASM1083/1085 and
VT6306/6307/6308.
When the 1394 OHCI driver for the card accesses a specific register
in PCI memory space, the system reboot often occurs.
The issue affects all versions of Linux kernel as long as the 1394
OHCI driver is included. The mechanism of unexpected system reboot is
not clear, so the driver is changed to avoid the access itself when
detecting the combination of hardware"
* tag 'firewire-fixes-6.7-final' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix releasing the host by canceling the delayed work
- Fix pause retune on all RPMB partitions
MMC host:
- meson-mx-sdhc: Fix HW hang during card initialization
- sdhci-sprd: Fix eMMC init failure after HW reset"
* tag 'mmc-v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-sprd: Fix eMMC init failure after hw reset
mmc: core: Cancel delayed work before releasing host
mmc: rpmb: fixes pause retune on all RPMB partitions.
mmc: meson-mx-sdhc: Fix initialization frozen issue
|
|
Pull more drm fixes from Dave Airlie:
"The amdgpu ones are fairly normal, the one that is a bit large is a
fix for a newly introduced IP in 6.7 so unlikely to cause regressions.
The nouveau ones are mostly memory leaks and debugging cleanups from
the GSP (new nvidia firmware) enablement. There are some GSP changes
to the message passing code and a subsequent fix for eDP panel turn
on, that means my laptop can turn on the panel in GSP mode. These are
fairly low chance of disrupting things since GSP is new in 6.7. The
final not all in GSP fix is a deadlock seen with i915/nouveau when GSP
is used where the the fence and irq paths have locking inversions,
I've pushed some irq enablement out to a workqueue, and this has seen
some fairly decent testing.
amdgpu:
- DP MST fix
- SMU 13.0.6 fixes
- fix displays on macbooks using vega12
- fix VSC and colorimetry on DP/eDP
nouveau:
- fix deadlock between fence signalling and irq paths
- fix GSP memory leaks
- fix GSP leftover debug
- hide some GSP callback messages
- fix GSP display disable path
- fix GSP ACPI interaction
- handle errors in ctrl messages
- use errors info to fix DP link training"
* tag 'drm-fixes-2024-01-05' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/dp: Honor GSP link training retry timeouts
nouveau: push event block/allowing out of the fence context
nouveau/gsp: always free the alloc messages on r535
nouveau/gsp: don't free ctrl messages on errors
nouveau/gsp: convert gsp errors to generic errors
drm/nouveau/gsp: Fix ACPI MXDM/MXDS method invocations
nouveau/gsp: free userd allocation.
nouveau/gsp: free acpi object after use
nouveau: fix disp disabling with GSP
nouveau/gsp: drop some acpi related debug
nouveau/gsp: add three notifier callbacks that we see in normal operation (v2)
drm/amd/pm: Use gpu_metrics_v1_5 for SMUv13.0.6
drm/amd/pm: Add gpu_metrics_v1_5
drm/amd/pm: Add mem_busy_percent for GCv9.4.3 apu
drm/amd/display: Fix sending VSC (+ colorimetry) packets for DP/eDP displays without PSR
drm/amdgpu: skip gpu_info fw loading on navi12
drm/amd/display: add nv12 bounding box
drm/amd/pm: Update metric table for jpeg/vcn data
drm/amd/pm: Use separate metric table for APU
drm/amd/display: pbn_div need be updated for hotplug event
|
|
syzbot is reporting uninit-value at shrinker_alloc(), for commit
307bececcd12 ("mm: shrinker: add a secondary array for
shrinker_info::{map, nr_deferred}") which assumed that the ->unit was
allocated with __GFP_ZERO forgot to replace kvmalloc_node() in
expand_one_shrinker_info() with kvzalloc_node().
Link: https://lkml.kernel.org/r/9226cc0a-10e0-4489-80c5-58c3b5b4359c@I-love.SAKURA.ne.jp
Reported-by: syzbot <syzbot+1e0ed05798af62917464@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=1e0ed05798af62917464
Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"These are two correctness fixes for handing DT input in the
Allwinner (sunxi) SMP startup code"
* tag 'soc-fixes-6.7-3a' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: sun9i: smp: fix return code check of of_property_match_string
ARM: sun9i: smp: Fix array-index-out-of-bounds read in sunxi_mc_smp_init
|
|
Pull kvm fix from Paolo Bonzini:
- Fix boolean logic in intel_guest_get_msrs
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull kprobes/x86 fix from Masami Hiramatsu:
- Fix to emulate indirect call which size is not 5 byte.
Current code expects the indirect call instructions are 5 bytes, but
that is incorrect. Usually indirect call based on register is shorter
than that, thus the emulation causes a kernel crash by accessing
wrong instruction boundary. This uses the instruction size to
calculate the return address correctly.
* tag 'probes-fixes-v6.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect
|
|
Pull smb client fixes from Steve French:
"Three important multichannel smb3 client fixes found in recent
testing:
- fix oops due to incorrect refcounting of interfaces after
disabling multichannel
- fix possible unrecoverable session state after disabling
multichannel with active sessions
- fix two places that were missing use of chan_lock"
* tag '6.7-rc8-smb3-mchan-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: do not depend on release_iface for maintaining iface_list
cifs: cifs_chan_is_iface_active should be called with chan_lock held
cifs: after disabling multichannel, mark tcon for reconnect
|
|
ASM108x/VT630x PCIe cards
VIA VT6306/6307/6308 provides PCI interface compliant to 1394 OHCI. When
the hardware is combined with Asmedia ASM1083/1085 PCIe-to-PCI bus bridge,
it appears that accesses to its 'Isochronous Cycle Timer' register (offset
0xf0 on PCI memory space) often causes unexpected system reboot in any
type of AMD Ryzen machine (both 0x17 and 0x19 families). It does not
appears in the other type of machine (AMD pre-Ryzen machine, Intel
machine, at least), or in the other OHCI 1394 hardware (e.g. Texas
Instruments).
The issue explicitly appears at a commit dcadfd7f7c74 ("firewire: core:
use union for callback of transaction completion") added to v6.5 kernel.
It changed 1394 OHCI driver to access to the register every time to
dispatch local asynchronous transaction. However, the issue exists in
older version of kernel as long as it runs in AMD Ryzen machine, since
the access to the register is required to maintain bus time. It is not
hard to imagine that users experience the unexpected system reboot when
generating bus reset by plugging any devices in, or reading the register
by time-aware application programs; e.g. audio sample processing.
This commit suppresses the unexpected system reboot in the combination of
hardware. It avoids the access itself. As a result, the software stack can
not provide the hardware time anymore to unit drivers, userspace
applications, and nodes in the same IEEE 1394 bus. It brings apparent
disadvantage since time-aware application programs require it, while
time-unaware applications are available again; e.g. sbp2.
Cc: stable@vger.kernel.org
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Closes: https://bugzilla.suse.com/show_bug.cgi?id=1215436
Reported-by: Mario Limonciello <mario.limonciello@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217994
Reported-by: Tobias Gruetzmacher <tobias-lists@23.gs>
Closes: https://sourceforge.net/p/linux1394/mailman/message/58711901/
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2240973
Closes: https://bugs.launchpad.net/linux/+bug/2043905
Link: https://lore.kernel.org/r/20240102110150.244475-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
|
|
It's not safe to call nfsd_put once nfsd_last_thread has been called, as
that function will zero out the nn->nfsd_serv pointer.
Drop the nfsd_put helper altogether and open-code the svc_put in its
callers instead. That allows us to not be reliant on the value of that
pointer when handling an error.
Fixes: 2a501f55cd64 ("nfsd: call nfsd_last_thread() before final nfsd_put()")
Reported-by: Zhi Li <yieli@redhat.com>
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Jeffrey Layton <jlayton@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Turns out that one of the ways that Nvidia's driver handles the pre-LT
timeout for eDP panels is by providing a retry timeout in their link
training callbacks that we're expected to wait for. Up until now we didn't
pay any attention to this parameter.
So, start honoring the timeout if link training fails - and retry up to 3
times. The "3 times" bit comes from OpenRM's link training code.
[airlied: this fixes the panel on one of my laptops]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-12-airlied@gmail.com
|
|
There is a deadlock between the irq and fctx locks,
the irq handling takes irq then fctx lock
the fence signalling takes fctx then irq lock
This splits the fence signalling path so the code that hits
the irq lock is done in a separate work queue.
This seems to fix crashes/hangs when using nouveau gsp with
i915 primary GPU.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-11-airlied@gmail.com
|
|
Fixes a memory leak seen with kmemleak.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-10-airlied@gmail.com
|
|
It looks like for some messages the upper layers need to get access to the
results of the message so we can interpret it.
Rework the ctrl push interface to not free things and cleanup properly
whereever it errors out.
Requested-by: Lyude
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-9-airlied@gmail.com
|
|
This should let the upper layers retry as needed on EAGAIN.
There may be other values we will care about in the future, but
this covers our present needs.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-8-airlied@gmail.com
|
|
Currently we get an error from ACPI because both of these arguments expect
a single argument, and we don't provide one. I'm not totally clear on what
that argument does, but we're able to find the missing value from
_acpiCacheMethodData() in src/kernel/platform/acpi_common.c in nvidia's
driver. So, let's add that - which doesn't get eDP displays to power on
quite yet, but gets rid of the argument warning at least.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-7-airlied@gmail.com
|
|
This was being leaked.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-6-airlied@gmail.com
|
|
This fixes a memory leak for the acpi dod object.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-5-airlied@gmail.com
|
|
This func ptr here is normally static allocation, but gsp r535
uses a dynamic pointer, so we need to handle that better.
This fixes a crash with GSP when you use config=disp=0 to avoid
disp problems.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-4-airlied@gmail.com
|
|
These were leftover debug, if we need to bring them back do so
for debugging later.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-3-airlied@gmail.com
|
|
Add NULL callbacks for some things GSP calls that we don't handle, but know about
so we avoid the logging.
v2: Timur suggested allowing null fn.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222043308.3090089-2-airlied@gmail.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amdgpu:
- DP MST fix
- SMU 13.0.6 fixes
- Fix displays on macbooks using vega12
- Fix VSC and colorimetry on DP/eDP
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240104152139.4931-1-alexander.deucher@amd.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless and netfilter.
We haven't accumulated much over the break. If it wasn't for the
uninterrupted stream of fixes for Intel drivers this PR would be very
slim. There was a handful of user reports, however, either they stood
out because of the lower traffic or users have had more time to test
over the break. The ones which are v6.7-relevant should be wrapped up.
Current release - regressions:
- Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum
required", it caused issues on networks where routers send prefixes
with preferred_lft=0
- wifi:
- iwlwifi: pcie: don't synchronize IRQs from IRQ, prevent deadlock
- mac80211: fix re-adding debugfs entries during reconfiguration
Current release - new code bugs:
- tcp: print AO/MD5 messages only if there are any keys
Previous releases - regressions:
- virtio_net: fix missing dma unmap for resize, prevent OOM
Previous releases - always broken:
- mptcp: prevent tcp diag from closing listener subflows
- nf_tables:
- set transport header offset for egress hook, fix IPv4 mangling
- skip set commit for deleted/destroyed sets, avoid double deactivation
- nat: make sure action is set for all ct states, fix openvswitch
matching on ICMP packets in related state
- eth: mlxbf_gige: fix receive hang under heavy traffic
- eth: r8169: fix PCI error on system resume for RTL8168FP
- net: add missing getsockopt(SO_TIMESTAMPING_NEW) and cmsg handling"
* tag 'net-6.7-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
net/tcp: Only produce AO/MD5 logs if there are any keys
net: Implement missing SO_TIMESTAMPING_NEW cmsg support
bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()
net: ravb: Wait for operating mode to be applied
asix: Add check for usbnet_get_endpoints
octeontx2-af: Re-enable MAC TX in otx2_stop processing
octeontx2-af: Always configure NIX TX link credits based on max frame size
net/smc: fix invalid link access in dumping SMC-R connections
net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues
virtio_net: fix missing dma unmap for resize
igc: Fix hicredit calculation
ice: fix Get link status data length
i40e: Restore VF MSI-X state during PCI reset
i40e: fix use-after-free in i40e_aqc_add_filters()
net: Save and restore msg_namelen in sock_sendmsg
netfilter: nft_immediate: drop chain reference counter on error
netfilter: nf_nat: fix action not being set for all ct states
net: bcmgenet: Fix FCS generation for fragmented skbuffs
mptcp: prevent tcp diag from closing listener subflows
MAINTAINERS: add Geliang as reviewer for MPTCP
...
|
|
Commit 688eb8191b47 ("x86/csum: Improve performance of `csum_partial`")
ended up improving the code generation for the IP csum calculations, and
in particular special-casing the 40-byte case that is a hot case for
IPv6 headers.
It then had _another_ special case for the 64-byte unrolled loop, which
did two chains of 32-byte blocks, which allows modern CPU's to improve
performance by doing the chains in parallel thanks to renaming the carry
flag.
This just unifies the special cases and combines them into just one
single helper the 40-byte csum case, and replaces the 64-byte case by a
80-byte case that just does that single helper twice. It avoids having
all these different versions of inline assembly, and actually improved
performance further in my tests.
There was never anything magical about the 64-byte unrolled case, even
though it happens to be a common size (and typically is the cacheline
size).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The special case for odd aligned buffers is unnecessary and mostly
just adds overhead. Aligned buffers is the expectations, and even for
unaligned buffer, the only case that was helped is if the buffer was
1-byte from word aligned which is ~1/7 of the cases. Overall it seems
highly unlikely to be worth to extra branch.
It was left in the previous perf improvement patch because I was
erroneously comparing the exact output of `csum_partial(...)`, but
really we only need `csum_fold(csum_partial(...))` to match so its
safe to remove.
All csum kunit tests pass.
Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Laight <david.laight@aculab.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|