Age | Commit message | Author |
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20231023095428.166563-10-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
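
The effect of including the component's name prefix in the comparison can be illustrated with a minimal user-space sketch. The names below (widget_sketch, widget_name_cmp_sketch) are hypothetical and do not reproduce the kernel's snd_soc_dapm_widget_name_cmp() implementation; the sketch only assumes that a prefixed widget is registered under a name of the form "<prefix> <name>".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a widget; not the kernel's struct. */
struct widget_sketch {
	const char *name;    /* full registered name, e.g. "Left Speaker" */
	const char *prefix;  /* component name prefix, or NULL if none */
};

/*
 * Compare the widget's registered name against a bare name "s",
 * prepending the prefix (assumed form "<prefix> <name>") when present.
 * Returns 0 on a match, like strcmp().
 */
static int widget_name_cmp_sketch(const struct widget_sketch *w, const char *s)
{
	char buf[64];

	if (!w->prefix)
		return strcmp(w->name, s);

	/* Treat names that do not fit the buffer as a mismatch. */
	if (snprintf(buf, sizeof(buf), "%s %s", w->prefix, s) >= (int)sizeof(buf))
		return 1;

	return strcmp(w->name, buf);
}
```

A plain strcmp() of "Left Speaker" against "Speaker" would report a mismatch, whereas the prefix-aware comparison in this sketch reports a match.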
|
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231023095428.166563-9-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231023095428.166563-8-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231023095428.166563-7-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20231023095428.166563-6-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231023095428.166563-5-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20231023095428.166563-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20231023095428.166563-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use snd_soc_dapm_widget_name_cmp() helper when comparing widget names,
to include also the component's name prefix.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231023095428.166563-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The VC4 mock helpers allocate the CRTC, encoders and connectors using a
call to kunit_kzalloc(), but the DRM device they are attached to survives
for longer than the test itself which leads to use-after-frees reported
by KASAN.
Switch to drmm_kzalloc to tie the lifetime of these objects to the main
DRM device.
Fixes: f759f5b53f1c ("drm/vc4: tests: Introduce a mocking infrastructure")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/all/CA+G9fYvJA2HGqzR9LGgq63v0SKaUejHAE6f7+z9cwWN-ourJ_g@mail.gmail.com/
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231024105640.352752-1-mripard@kernel.org
|
|
On Ampere Altra, PCIe hotplug is handled through ACPI. A side interface is
also present to request system firmware control of the hotplug Attention
Indicators. Add an ACPI PCI Hotplug companion driver to support Attention
Indicator control.
Link: https://lore.kernel.org/r/20230930002036.6491-2-scott@os.amperecomputing.com
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
[bhelgaas: mask domain to low 4 bits]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
|
|
The TongFang GMxXGxx/TUXEDO Stellaris/Pollaris Gen5 needs IRQ overriding
for the keyboard to work.
Adding an entry for this laptop to the override_table makes the internal
keyboard functional.
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next. Mostly
nf_tables updates with two patches for connlabel and br_netfilter.
1) Rename the function that performs on-demand GC for rbtree elements,
and replace async GC in rbtree by sync GC. Patches from Florian Westphal.
2) Use commit_mutex for NFT_MSG_GETRULE_RESET to ensure that two
concurrent threads invoking this command do not underrun stateful
objects. Patches from Phil Sutter.
3) Use single hook to deal with IP and ARP packets in br_netfilter.
Patch from Florian Westphal.
4) Use atomic_t in netns->connlabel use counter instead of using a
spinlock, also patch from Florian.
5) Cleanups for stateful objects infrastructure in nf_tables.
Patches from Phil Sutter.
6) Flush path uses the opaque set element offered by the iterator, instead
of calling pipapo_deactivate(), which looks it up again.
7) Set backend .flush interface always succeeds, make it return void
instead.
8) Add struct nft_elem_priv placeholder structure and use it to pass the
opaque set element representation from backend to frontend, replacing
void *, which defeats compiler type checks.
9) Shrink memory consumption of set element transactions, by reducing
struct nft_trans_elem object size and reducing stack memory usage.
10) Use struct nft_elem_priv for the set backend .insert operation too.
11) Carry reset flag in nft_set_dump_ctx structure, instead of passing it
as a function argument, from Phil Sutter.
netfilter pull request 23-10-25
* tag 'nf-next-23-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_tables: Carry reset boolean in nft_set_dump_ctx
netfilter: nf_tables: set->ops->insert returns opaque set element in case of EEXIST
netfilter: nf_tables: shrink memory consumption of set elements
netfilter: nf_tables: expose opaque set element as struct nft_elem_priv
netfilter: nf_tables: set backend .flush always succeeds
netfilter: nft_set_pipapo: no need to call pipapo_deactivate() from flush
netfilter: nf_tables: Carry reset boolean in nft_obj_dump_ctx
netfilter: nf_tables: nft_obj_filter fits into cb->ctx
netfilter: nf_tables: Carry s_idx in nft_obj_dump_ctx
netfilter: nf_tables: A better name for nft_obj_filter
netfilter: nf_tables: Unconditionally allocate nft_obj_filter
netfilter: nf_tables: Drop pointless memset in nf_tables_dump_obj
netfilter: conntrack: switch connlabels to atomic_t
br_netfilter: use single forward hook for ip and arp
netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests
netfilter: nf_tables: Introduce nf_tables_getrule_single()
netfilter: nf_tables: Open-code audit log call in nf_tables_getrule()
netfilter: nft_set_rbtree: prefer sync gc to async worker
netfilter: nft_set_rbtree: rename gc deactivate+erase function
====================
Link: https://lore.kernel.org/r/20231025212555.132775-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
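
Item 4 of the pull request above replaces a spinlock-protected use counter with an atomic one. A minimal user-space sketch of that idea, using C11 <stdatomic.h> in place of the kernel's atomic_t, might look as follows. The struct and function names are hypothetical and are not the netfilter code.

```c
#include <stdatomic.h>

/* Hypothetical per-namespace state; only the use counter is modeled. */
struct netns_sketch {
	atomic_uint labels_used;
};

static void netns_sketch_init(struct netns_sketch *ns)
{
	atomic_init(&ns->labels_used, 0);
}

/* Take a reference: a single atomic increment, no lock needed. */
static void connlabel_get_sketch(struct netns_sketch *ns)
{
	atomic_fetch_add(&ns->labels_used, 1);
}

/* Drop a reference: a single atomic decrement. */
static void connlabel_put_sketch(struct netns_sketch *ns)
{
	atomic_fetch_sub(&ns->labels_used, 1);
}

/* Read the current number of users. */
static unsigned int connlabel_users_sketch(struct netns_sketch *ns)
{
	return atomic_load(&ns->labels_used);
}
```

Because each operation is a single atomic read-modify-write, concurrent callers do not need to serialize on a lock just to adjust the counter.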
|
|
Merge an ACPICA change for 6.7-rc1 which adds symbol definitions related
to CDAT (Dave Jiang).
* acpica:
ACPICA: Add defines for CDAT SSLBIS
|
|
gsm_cleanup_mux() cleans up the gsm by closing all DLCIs, stopping all
timers, removing the virtual tty devices and clearing the data queues.
This procedure, however, may cause subsequent changes of the virtual modem
status lines of a DLCI. More data is being added to the outgoing data queue
and the deleted kick timer is restarted to handle this. At this point many
resources have already been removed by the cleanup procedure. Thus, a
kernel panic occurs.
Fix this by checking in gsm_modem_update() that the cleanup procedure has
not been started and the mux is still alive.
Note that writing to a virtual tty is already protected by checks against
the DLCI specific connection state.
Fixes: c568f7086c6e ("tty: n_gsm: fix missing timer to handle stalled links")
Cc: stable <stable@kernel.org>
Signed-off-by: Daniel Starke <daniel.starke@siemens.com>
Link: https://lore.kernel.org/r/20231026055844.3127-1-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Flaky tests can be very difficult to reproduce after the fact, which
makes them even harder to fix.
Let's document the metadata we agreed on to provide more context to
anyone trying to address these fixes.
Link: https://lore.kernel.org/dri-devel/CAPj87rPbJ1V1-R7WMTHkDat2A4nwSd61Df9mdGH2PR=ZzxaU=Q@mail.gmail.com/
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231025142441.745947-1-mripard@kernel.org
|
|
hid_connect() will call hid_pidff_init(), which does
hid_device_io_start(), leading to an "io already started" warning.
To fix this, call hid_device_io_stop() before calling hid_connect().
Stopping IO means that connect events may be lost while hid_connect()
runs, so re-enable IO and move the hidpp_connect_event() work queuing
to after the hid_connect() call.
Note that re-enabling IO is also necessary for the g920_get_config()
call later during hidpp_probe().
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20231025190151.302376-1-hdegoede@redhat.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-12-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
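
As a small illustration of why this is a pure cleanup, the sketch below declares the same structure with both spellings and exposes their sizes. The struct names are hypothetical, and __packed is redefined locally for user space (the kernel provides it as a shorthand for the same GCC/Clang attribute); the sketch assumes a GCC-compatible compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's shorthand macro. */
#ifndef __packed
#define __packed __attribute__((__packed__))
#endif

/* Old spelling: attribute written out in full. */
struct hdr_old {
	uint8_t  type;
	uint32_t length;
} __attribute__((packed));

/* New spelling: same attribute via the macro, identical layout. */
struct hdr_new {
	uint8_t  type;
	uint32_t length;
} __packed;

static size_t hdr_old_size(void) { return sizeof(struct hdr_old); }
static size_t hdr_new_size(void) { return sizeof(struct hdr_new); }
```

Both structures occupy 5 bytes with no padding between the members, so switching spellings changes nothing in the generated layout.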
|
|
The header file contains lots of outdated comments and definitions.
Drop those as cleanup.
Link: https://lore.kernel.org/r/20231025132314.5878-11-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-10-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-9-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-8-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-7-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-6-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-5-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-4-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Replace the old __attribute__((packed)) with the new __packed.
Only cleanup, no functional changes.
Link: https://lore.kernel.org/r/20231025132314.5878-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
ARCH_R9A07G043 has its own non-standard global pool based DMA coherent
allocator, which conflicts with the remap based RISCV_ISA_ZICBOM version.
Add a proper dependency.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20231018052654.50074-4-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
RISCV_DMA_NONCOHERENT is also used for whacky non-standard
non-coherent ops that use different hooks in dma-direct.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20231018052654.50074-3-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
RISCV_NONSTANDARD_CACHE_OPS is also used for the pmem cache maintenance
helpers, which are built into the kernel unconditionally.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231018052654.50074-2-hch@lst.de
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Move sequence of masking and unmasking global interrupts from buttress
interrupt handler to generic one that handles both VPUIP and BTRS
interrupts. Unmasking global interrupts will re-trigger MSI for any
pending interrupts.
Lack of this sequence will cause the driver to miss any
VPUIP interrupt that comes after reading VPU_37XX_HOST_SS_ICB_STATUS_0
and before clearing all active interrupt sources.
Fixes: 35b137630f08 ("accel/ivpu: Introduce a new DRM driver for Intel VPU")
Cc: stable@vger.kernel.org
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231024161952.759914-1-stanislaw.gruszka@linux.intel.com
|
|
The get_task_exe_file() function locks the given task with task_lock()
which when used inside audit_exe_compare() can cause deadlocks on
systems that generate audit records when the task_lock() is held. We
resolve this problem with two changes: ignoring those cases where the
task being audited is not the current task, and changing our approach
to obtaining the executable file struct to not require task_lock().
With the intent of the audit exe filter being to filter on audit events
generated by processes started by the specified executable, it makes
sense that we would only want to use the exe filter on audit records
associated with the currently executing process, e.g. @current. If
we are asked to filter records using a non-@current task_struct we can
safely ignore the exe filter without negatively impacting the admin's
expectations for the exe filter.
Knowing that we only have to worry about filtering the currently
executing task in audit_exe_compare() we can do away with the
task_lock() and call get_mm_exe_file() with @current->mm directly.
Cc: <stable@vger.kernel.org>
Fixes: 5efc244346f9 ("audit: fix exe_file access in audit_exe_compare")
Reported-by: Andreas Steinmetz <anstein99@googlemail.com>
Reviewed-by: John Johansen <john.johanse@canonical.com>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
It's eventfs_inode not eventfs_indoe. There's no deer involved!
Link: https://lore.kernel.org/linux-trace-kernel/20231024131024.5634c743@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
As the comment right above a WARN_ON() in create_file_dentry() states:
* Note, with the mutex held, the e_dentry cannot have content
* and the ei->is_freed be true at the same time.
But the WARN_ON() only has:
WARN_ON_ONCE(ei->is_freed);
To match the comment (and do what it should actually do), it should be:
dentry = *e_dentry;
WARN_ON_ONCE(dentry && ei->is_freed);
Also in that case, set dentry to NULL (although it should never happen).
Link: https://lore.kernel.org/linux-trace-kernel/20231024123628.62b88755@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
While powerpc doesn't use the seq_buf readpos, it did explicitly
initialise it for no good reason.
Link: https://lore.kernel.org/linux-trace-kernel/20231024145600.739451-1-willy@infradead.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Fixes: d0ed46b60396 ("tracing: Move readpos from seq_buf to trace_seq")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
'net-ipv6-addrconf-ensure-that-temporary-addresses-preferred-lifetimes-are-in-the-valid-range'
Alex Henrie says:
====================
net: ipv6/addrconf: ensure that temporary addresses' preferred lifetimes are in the valid range
No changes from v2, but there are only four patches now because the
first patch has already been applied.
https://lore.kernel.org/all/20230829054623.104293-1-alexhenrie24@gmail.com/
====================
Link: https://lore.kernel.org/r/20231024212312.299370-1-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
small or too large
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231024212312.299370-5-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231024212312.299370-4-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the preferred lifetime was less than the minimum required lifetime,
ipv6_create_tempaddr would error out without creating any new address.
On my machine and network, this error happened immediately with the
preferred lifetime set to 1 second, after a few minutes with the
preferred lifetime set to 4 seconds, and not at all with the preferred
lifetime set to 5 seconds. During my investigation, I found a Stack
Exchange post from another person who seems to have had the same
problem: They stopped getting new addresses if they lowered the
preferred lifetime below 3 seconds, and they didn't really know why.
The preferred lifetime is a preference, not a hard requirement. The
kernel does not strictly forbid new connections on a deprecated address,
nor does it guarantee that the address will be disposed of the instant
its total valid lifetime expires. So rather than disable IPv6 privacy
extensions altogether if the minimum required lifetime swells above the
preferred lifetime, it is more in keeping with the user's intent to
increase the temporary address's lifetime to the minimum necessary for
the current network conditions.
With these fixes, setting the preferred lifetime to 3 or 4 seconds "just
works" because the extra fraction of a second is practically
unnoticeable. It's even possible to reduce the time before deprecation
to 1 or 2 seconds by also disabling duplicate address detection (setting
/proc/sys/net/ipv6/conf/*/dad_transmits to 0). I realize that that is a
pretty niche use case, but I know at least one person who would gladly
sacrifice performance and convenience to be sure that they are getting
the maximum possible level of privacy.
Link: https://serverfault.com/a/1031168/310447
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231024212312.299370-3-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
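
The "preference, not a hard requirement" reasoning above can be sketched as follows. The helper name and its parameters are hypothetical and this is not the logic of ipv6_create_tempaddr(); "min_required" simply stands for whatever minimum lifetime the current network conditions demand, with all values in seconds.

```c
/*
 * Hypothetical helper: treat the requested preferred lifetime as a
 * preference and extend it to the minimum required value instead of
 * refusing to create a temporary address.
 */
static unsigned long raise_preferred_lft(unsigned long requested,
					 unsigned long min_required)
{
	if (requested < min_required)
		return min_required;
	return requested;
}
```

With a requested lifetime of 1 second and a minimum of 3 seconds, this sketch yields 3 seconds rather than an error.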
|
|
Without this patch, there is nothing to stop the preferred lifetime of a
temporary address from being greater than its valid lifetime. If that
was the case, the valid lifetime was effectively ignored.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231024212312.299370-2-alexhenrie24@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
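
The invariant described above, that the preferred lifetime never exceeds the valid lifetime, can be sketched as a simple cap. The helper name is hypothetical and the code is an illustration of the invariant only, not the patch itself; values are in seconds.

```c
/*
 * Hypothetical helper: cap the preferred lifetime at the valid
 * lifetime so that the valid lifetime is never effectively ignored.
 */
static unsigned long cap_preferred_lft(unsigned long preferred,
				       unsigned long valid)
{
	return preferred > valid ? valid : preferred;
}
```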
|
|
Yan Zhai says:
====================
ipv6: avoid atomic fragment on GSO output
When the ipv6 stack outputs a GSO packet, if its gso_size is larger than
dst MTU, then all segments would be fragmented. However, it is possible
for a GSO packet to have a trailing segment whose actual size is smaller
than both gso_size and the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
existing report from APNIC also shows that atomic fragments are more
likely to be dropped even though they are equivalent to a no-op [1].
The series contains the following changes:
* drop feature RTAX_FEATURE_ALLFRAG, which has been broken. This helps
simplify other changes in this set.
* refactor __ip6_finish_output code to separate GSO and non-GSO packet
processing, mirroring IPv4 side logic.
* avoid generating atomic fragments on GSO packets.
Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
V4: https://lore.kernel.org/netdev/cover.1698114636.git.yan@cloudflare.com/
V3: https://lore.kernel.org/netdev/cover.1697779681.git.yan@cloudflare.com/
V2: https://lore.kernel.org/netdev/ZS1%2Fqtr0dZJ35VII@debian.debian/
====================
Link: https://lore.kernel.org/r/cover.1698156966.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the ipv6 stack outputs a GSO packet, if its gso_size is larger than
dst MTU, then all segments would be fragmented. However, it is possible
for a GSO packet to have a trailing segment whose actual size is smaller
than both gso_size and the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
existing report from APNIC also shows that atomic fragments are more
likely to be dropped even though they are equivalent to a no-op [1].
Add an extra check in the GSO slow output path. For each segment from
the original over-sized packet, if it fits with the path MTU, then avoid
generating an atomic fragment.
Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
Reported-by: David Wragg <dwragg@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Link: https://lore.kernel.org/r/90912e3503a242dca0bc36958b11ed03a2696e5e.1698156966.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
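
The per-segment check described above can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel's GSO slow output path: the function name is hypothetical, and each segment is modeled simply as a fixed header of hdr_len bytes plus up to gso_size bytes of payload.

```c
#include <stddef.h>

/*
 * Hypothetical model: split payload_len bytes into segments of at most
 * gso_size bytes, and count only the segments whose total length
 * (header plus payload) exceeds the path MTU. Segments that already
 * fit are sent as-is and never become atomic fragments.
 */
static size_t count_segments_to_fragment(size_t payload_len, size_t gso_size,
					  size_t hdr_len, size_t mtu)
{
	size_t remaining = payload_len;
	size_t count = 0;

	if (gso_size == 0)
		return 0;

	while (remaining > 0) {
		size_t seg_payload = remaining < gso_size ? remaining : gso_size;

		if (hdr_len + seg_payload > mtu)
			count++;
		remaining -= seg_payload;
	}
	return count;
}
```

For a 3000-byte payload with gso_size 1400, 60 bytes of headers and an MTU of 1400, the two full segments (1460 bytes each) need fragmentation while the 260-byte trailing segment does not, so the sketch counts 2 rather than 3.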
|
|
Separate GSO and non-GSO packet handling to make the logic cleaner. For
GSO packets, the frag_max_size check can be omitted because it is only
useful for packets defragmented by netfilter hooks. Both local output
and GRO logic won't produce GSO packets when defragmentation is needed.
This also mirrors what the IPv4 side code is doing.
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/0e1d4599f858e2becff5c4fe0b5f843236bc3fe8.1698156966.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
RTAX_FEATURE_ALLFRAG was added before the first git commit:
https://www.mail-archive.com/bk-commits-head@vger.kernel.org/msg03399.html
The feature would send packets to the fragmentation path if a box
receives a PMTU value of less than 1280 bytes. However, since commit
9d289715eb5c ("ipv6: stop sending PTB packets for MTU < 1280"), such
messages are simply discarded. The feature flag is not supported by the
iproute2 utility either. In theory one can still manipulate it with a
direct netlink message, but that is not ideal because it was based on the
obsolete guidance of RFC-2460 (replaced by RFC-8200).
The feature would always test false at the moment, so remove the related
code or mark it as unused.
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/d78e44dcd9968a252143ffe78460446476a472a1.1698156966.git.yan@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In iavf_down, we're skipping the scheduling of certain operations if
the driver is being removed. However, the IAVF_FLAG_AQ_DISABLE_QUEUES
request must not be skipped in this case, because iavf_close waits
for the transition to the __IAVF_DOWN state, which happens in
iavf_virtchnl_completion after the queues are released.
Without this fix, "rmmod iavf" takes half a second per interface that's
up and prints the "Device resources not yet released" warning.
Fixes: c8de44b577eb ("iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231025183213.874283-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Improve alloc_migration_target_by_mpol()'s treatment of MPOL_INTERLEAVE.
Make an effort in do_mbind(), to identify the correct interleave index for
the first page to be migrated, so that it and all subsequent pages from
the same vma will be targeted to precisely their intended nodes. Pages
from following vmas will still be interleaved from the requested nodemask,
but perhaps starting from a different base.
Whether this is worth doing at all, or worth improving further, is
arguable: queue_folio_required() is right not to care about the precise
placement on interleaved nodes; but this little effort seems appropriate.
[hughd@google.com: do vma_iter search under mmap_write_unlock()]
Link: https://lkml.kernel.org/r/3311d544-fb05-a7f1-1b74-16aa0f6cd4fe@google.com
Link: https://lkml.kernel.org/r/77954a5-9c9b-1c11-7d5c-3262c01b895f@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tejun heo <tj@kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
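
How an interleave index might select a node from a nodemask can be sketched as below. This is a simplified illustration, not the code in mm/mempolicy.c: the function name is hypothetical, the nodemask is modeled as a 64-bit bitmap, and the sketch assumes the target is the (ilx modulo number-of-nodes)-th set bit of the mask.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the node chosen by interleave index
 * "ilx" from "nodemask", or -1 if the mask is empty.
 */
static int interleave_node_sketch(uint64_t nodemask, unsigned long ilx)
{
	unsigned int nnodes = 0;
	unsigned int target, node;
	uint64_t m;

	/* Count the set bits, i.e. the number of allowed nodes. */
	for (m = nodemask; m; m &= m - 1)
		nnodes++;
	if (!nnodes)
		return -1;

	/* Walk the mask to the target-th allowed node. */
	target = ilx % nnodes;
	for (node = 0; node < 64; node++) {
		if (!(nodemask & (UINT64_C(1) << node)))
			continue;
		if (target-- == 0)
			return (int)node;
	}
	return -1;
}
```

With nodes 0, 1 and 3 allowed, consecutive indices map to nodes 0, 1, 3, 0 and so on, which is why starting from the correct index for the first page places each subsequent page of the same vma on its intended node.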
|
|
mbind(2) holds down_write of current task's mmap_lock throughout
(exclusive because it needs to set the new mempolicy on the vmas);
migrate_pages(2) holds down_read of pid's mmap_lock throughout.
They both hold mmap_lock across the internal migrate_pages(), under which
all new page allocations (huge or small) are made. I'm nervous about it;
and migrate_pages() certainly does not need mmap_lock itself. It's done
this way for mbind(2), because its page allocator is vma_alloc_folio() or
alloc_hugetlb_folio_vma(), both of which depend on vma and address.
Now that we have alloc_pages_mpol(), depending on (refcounted) memory
policy and interleave index, mbind(2) can be modified to use that or
alloc_hugetlb_folio_nodemask(), and then not need mmap_lock across the
internal migrate_pages() at all: add alloc_migration_target_by_mpol() to
replace mbind's new_page().
(After that change, alloc_hugetlb_folio_vma() is used by nothing but a
userfaultfd function: move it out of hugetlb.h and into the #ifdef.)
migrate_pages(2) has chosen its target node before migrating, so can
continue to use the standard alloc_migration_target(); but let it take and
drop mmap_lock just around migrate_to_node()'s queue_pages_range():
neither the node-to-node calculations nor the page migrations need it.
It seems unlikely, but it is conceivable that some userspace depends on
the kernel's mmap_lock exclusion here, instead of doing its own locking:
more likely in a testsuite than in real life. It is also possible, of
course, that some pages on the list will be munmapped by another thread
before they are migrated, or a newer memory policy applied to the range by
that time: but such races could happen before, as soon as mmap_lock was
dropped, so it does not appear to be a concern.
Link: https://lkml.kernel.org/r/21e564e8-269f-6a89-7ee2-fd612831c289@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tejun heo <tj@kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Shrink shmem's stack usage by eliminating the pseudo-vma from its folio
allocation. alloc_pages_mpol(gfp, order, pol, ilx, nid) becomes the
principal actor for passing mempolicy choice down to __alloc_pages(),
rather than vma_alloc_folio(gfp, order, vma, addr, hugepage).
vma_alloc_folio() and alloc_pages() remain, but as wrappers around
alloc_pages_mpol(). alloc_pages_bulk_*() untouched, except to provide the
additional args to policy_nodemask(), which subsumes policy_node().
Cleanup throughout, cutting out some unhelpful "helpers".
It would all be much simpler without MPOL_INTERLEAVE, but that adds a
dynamic to the constant mpol: complicated by v3.6 commit 09c231cb8bfd
("tmpfs: distribute interleave better across nodes"), which added ino bias
to the interleave, hidden from mm/mempolicy.c until this commit.
Hence "ilx" throughout, the "interleave index". Originally I thought it
could be done just with nid, but that's wrong: the nodemask may come from
the shared policy layer below a shmem vma, or it may come from the task
layer above a shmem vma; and without the final nodemask then nodeid cannot
be decided. And how ilx is applied depends also on page order.
The interleave index is almost always irrelevant unless MPOL_INTERLEAVE:
with one exception in alloc_pages_mpol(), where the NO_INTERLEAVE_INDEX
passed down from vma-less alloc_pages() is also used as a hint not to use
THP-style hugepage allocation - to avoid the overhead of a hugepage arg
(though I don't understand why we never just added a GFP bit for THP - if
it actually needs a different allocation strategy from other pages of the
same order). vma_alloc_folio() still carries its hugepage arg here, but
it is not used, and should be removed when agreed.
get_vma_policy() no longer allows a NULL vma: over time I believe we've
eradicated all the places which used to need it e.g. swapoff and madvise
used to pass NULL vma to read_swap_cache_async(), but now know the vma.
[hughd@google.com: handle NULL mpol being passed to __read_swap_cache_async()]
Link: https://lkml.kernel.org/r/ea419956-4751-0102-21f7-9c93cb957892@google.com
Link: https://lkml.kernel.org/r/74e34633-6060-f5e3-aee-7040d43f2e93@google.com
Link: https://lkml.kernel.org/r/1738368e-bac0-fd11-ed7f-b87142a939fe@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tejun heo <tj@kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Domenico Cerasuolo <mimmocerasuolo@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|