Age | Commit message (Collapse) | Author |
|
Add 2 per-cpu monitors as part of the sched model:
* sco: scheduling context operations
Monitor to ensure sched_set_state happens only in thread context
* tss: task switch while scheduling
Monitor to ensure sched_switch happens only in scheduling context
To: Ingo Molnar <mingo@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Link: https://lore.kernel.org/20250305140406.350227-4-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Monitors describing complex systems, such as the scheduler, can easily
grow to the point where they are just hard to understand because of the
many possible state transitions.
Often it is possible to break such descriptions into smaller monitors,
sharing some or all events. Enabling those smaller monitors concurrently
is, in fact, testing the system as if we had one single larger monitor.
Splitting models into multiple specification is not only easier to
understand, but gives some more clues when we see errors.
Add the possibility to create container monitors, whose only purpose is
to host other nested monitors. Enabling a container monitor enables all
nested ones, but it's still possible to enable nested monitors
independently.
Add the sched monitor as first container, for now empty.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/20250305140406.350227-3-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Add the following tracepoints:
* sched_entry(bool preempt, ip)
Called while entering __schedule
* sched_exit(bool is_switch, ip)
Called while exiting __schedule
* sched_set_state(task, curr_state, state)
Called when a task changes its state (to and from running)
These tracepoints are useful to describe the Linux task model and are
adapted from the patches by Daniel Bristot de Oliveira
(https://bristot.me/linux-task-model/).
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/20250305140406.350227-2-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Michael Chan says:
====================
bnxt_en: Fix MAX_SKB_FRAGS > 30
The driver was written with the assumption that MAX_SKB_FRAGS could
not exceed what the NIC can support. About 2 years ago,
CONFIG_MAX_SKB_FRAGS was added. The value can exceed what the NIC
can support and it may cause TX timeout. These 2 patches will fix
the issue.
====================
Link: https://patch.msgid.link/20250321211639.3812992-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If skb_shinfo(skb)->nr_frags excceds what the chip can support,
linearize the SKB and warn once to let the user know.
net.core.max_skb_frags can be lowered, for example, to avoid the
issue.
Fixes: 3948b05950fd ("net: introduce a config option to tweak MAX_SKB_FRAGS")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250321211639.3812992-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The bd_cnt field in the TX BD specifies the total number of BDs for
the TX packet. The bd_cnt field has 5 bits and the maximum number
supported is 32 with the value 0.
CONFIG_MAX_SKB_FRAGS can be modified and the total number of SKB
fragments can approach or exceed the maximum supported by the chip.
Add a macro to properly mask the bd_cnt field so that the value 32
will be properly masked and set to 0 in the bd_cnd field.
Without this patch, the out-of-range bd_cnt value will corrupt the
TX BD and may cause TX timeout.
The next patch will check for values exceeding 32.
Fixes: 3948b05950fd ("net: introduce a config option to tweak MAX_SKB_FRAGS")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250321211639.3812992-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
An update was made to up the module ref count when a synthetic event is
registered for both trace and perf events. But if perf is not configured
in, the perf enums used will cause the kernel to fail to build.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Douglas Raillard <douglas.raillard@arm.com>
Link: https://lore.kernel.org/20250323152151.528b5ced@batman.local.home
Fixes: 21581dd4e7ff ("tracing: Ensure module defining synth event cannot be unloaded while tracing")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503232230.TeREVy8R-lkp@intel.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Currently network taps unbound to any interface are linked in the
global ptype_all list, affecting the performance in all the network
namespaces.
Add per netns ptypes chains, so that in the mentioned case only
the netns owning the packet socket(s) is affected.
While at that drop the global ptype_all list: no in kernel user
registers a tap on "any" type without specifying either the target
device or the target namespace (and IMHO doing that would not make
any sense).
Note that this adds a conditional in the fast path (to check for
per netns ptype_specific list) and increases the dataset size by
a cacheline (owing the per netns lists).
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumaze@google.com>
Link: https://patch.msgid.link/ae405f98875ee87f8150c460ad162de7e466f8a7.1742494826.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove debugfs_tx() which was added when the caif driver was added in
commit 9b27105b4a44 ("net-caif-driver: add CAIF serial driver (ldisc)")
but it has never been used.
Flagged by LLVM 19.1.7 W=1 builds.
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250320-caif-debugfs-tx-v1-1-be5654770088@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
of_gpio.h is deprecated. Since there is no of_gpio_x API, drop
unused of_gpio.h. While at here, drop gpio.h and gpio/consumer.h if
no user in driver.
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250320031542.3960381-1-peng.fan@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The net fixed phy driver does not require the creation of a platform
device. Originally, this approach was chosen for simplicity when the
driver was first implemented.
With the introduction of the lightweight faux device interface, we now
have a more appropriate alternative. Migrate the device to utilize the
faux bus, given that the platform device it previously created was not
a real one anyway. This will get rid of the fake platform device.
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/20250319135209.2734594-1-sudeep.holla@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tariq Toukan says:
====================
mlx5 cleanups 2025-03-19
This series contains small cleanups to the mlx5 core and Eth drivers.
====================
Link: https://patch.msgid.link/1742412199-159596-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Always set PAGE_POOL_STATS in mlx5 Eth driver.
Cleanup the corresponding #ifdefs.
Page pool stats are essential to monitor and analyze RX performance.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/1742412199-159596-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use bitmap_free() to free memory allocated with bitmap_zalloc_node().
This fixes memtrack error:
mtl rsc inconsistency: memtrack_free: .../drivers/net/ethernet/mellanox/mlx5/core/en_main.c::466: kfree for unknown address=0xFFFF0000CA3619E8, device=0x0
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/1742412199-159596-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix coccinelle warnings:
WARNING: NULL check before dev_{put, hold} functions is not needed.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/1742412199-159596-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
From what I can tell the .get_fixed_state pointer in the phylink structure
hasn't been used since commit 5c05c1dbb177 ("net: phylink, dsa: eliminate
phylink_fixed_state_cb()") . Since I can't find any users for it we might
as well just drop the pointer.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/174240634772.1745174.5690351737682751849.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The assignment of zero to udph->check is unnecessary as it is
immediately overwritten in the subsequent line. Remove the redundant
assignment.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250319-netpoll_nit-v1-1-a7faac5cbd92@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull tasklist_lock optimizations from Christian Brauner:
"According to the performance testbots this brings a 23% performance
increase when creating new processes:
- Reduce tasklist_lock hold time on exit:
- Perform add_device_randomness() without tasklist_lock
- Perform free_pid() calls outside of tasklist_lock
- Drop irq disablement around pidmap_lock
- Add some tasklist_lock asserts
- Call flush_sigqueue() lockless by changing release_task()
- Don't pointlessly clear TIF_SIGPENDING in __exit_signal() ->
clear_tsk_thread_flag()"
* tag 'kernel-6.15-rc1.tasklist_lock' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
pid: drop irq disablement around pidmap_lock
pid: perform free_pid() calls outside of tasklist_lock
pid: sprinkle tasklist_lock asserts
exit: hoist get_pid() in release_task() outside of tasklist_lock
exit: perform add_device_randomness() without tasklist_lock
exit: kill the pointless __exit_signal()->clear_tsk_thread_flag(TIF_SIGPENDING)
exit: change the release_task() paths to call flush_sigqueue() lockless
|
|
There commands can be used to add an RSS context and steer some traffic
into it:
# ethtool -X eth0 context new
New RSS context is 1
# ethtool -N eth0 flow-type ip4 dst-ip 1.1.1.1 context 1
Added rule with ID 1023
However, the second command fails with EINVAL on mlx5e:
# ethtool -N eth0 flow-type ip4 dst-ip 1.1.1.1 context 1
rmgr: Cannot insert RX class rule: Invalid argument
Cannot insert classification rule
It happens when flow_get_tirn calls flow_type_to_traffic_type with
flow_type = IP_USER_FLOW or IPV6_USER_FLOW. That function only handles
IPV4_FLOW and IPV6_FLOW cases, but unlike all other cases which are
common for hash and spec, IPv4 and IPv6 defines different contants for
hash and for spec:
#define TCP_V4_FLOW 0x01 /* hash or spec (tcp_ip4_spec) */
#define UDP_V4_FLOW 0x02 /* hash or spec (udp_ip4_spec) */
...
#define IPV4_USER_FLOW 0x0d /* spec only (usr_ip4_spec) */
#define IP_USER_FLOW IPV4_USER_FLOW
#define IPV6_USER_FLOW 0x0e /* spec only (usr_ip6_spec; nfc only) */
#define IPV4_FLOW 0x10 /* hash only */
#define IPV6_FLOW 0x11 /* hash only */
Extend the switch in flow_type_to_traffic_type to support both, which
fixes the failing ethtool -N command with flow-type ip4 or ip6.
Fixes: 248d3b4c9a39 ("net/mlx5e: Support flow classification into RSS contexts")
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250319124508.3979818-1-maxim@isovalent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the "struct" keyword in kernel-doc when describing struct
ipefs_file. Add kernel-doc for the struct members also.
Don't use kernel-doc notation for 'policy_subdir'. kernel-doc does
not support documentation comments for data definitions.
This eliminates multiple kernel-doc warnings:
security/ipe/policy_fs.c:21: warning: cannot understand function prototype: 'struct ipefs_file '
security/ipe/policy_fs.c:407: warning: cannot understand function prototype: 'const struct ipefs_file policy_subdir[] = '
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Fan Wu <wufan@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Fan Wu <wufan@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs rust updates from Christian Brauner:
"This contains minor fixes and improvements to rust file bindings:
- Optimize rust symbol generation for FileDescriptorReservation
- Optimize rust symbol generation for SeqFile"
* tag 'vfs-6.15-rc1.rust' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
rust: optimize rust symbol generation for SeqFile
rust: file: optimize rust symbol generation for FileDescriptorReservation
|
|
Some dwmac variants such as dwmac_socfpga don't use xpcs but lynx_pcs.
Don't call xpcs_config_eee_mult_fact() in this case, as this causes a
crash at init :
Unable to handle kernel NULL pointer dereference at virtual address 00000039 when write
[...]
Call trace:
xpcs_config_eee_mult_fact from stmmac_pcs_setup+0x40/0x10c
stmmac_pcs_setup from stmmac_dvr_probe+0xc0c/0x1244
stmmac_dvr_probe from socfpga_dwmac_probe+0x130/0x1bc
socfpga_dwmac_probe from platform_probe+0x5c/0xb0
Fixes: 060fb27060e8 ("net: stmmac: call xpcs_config_eee_mult_fact()")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250321103502.1303539-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs file handling updates from Christian Brauner:
"This contains performance improvements for struct file's new refcount
mechanism and various other performance work:
- The stock kernel transitioning the file to no refs held penalizes
the caller with an extra atomic to block any increments. For cases
where the file is highly likely to be going away this is easily
avoidable.
Add file_ref_put_close() to better handle the common case where
closing a file descriptor also operates on the last reference and
build fput_close_sync() and fput_close() on top of it. This brings
about 1% performance improvement by eliding one atomic in the
common case.
- Predict no error in close() since the vast majority of the time
system call returns 0.
- Reduce the work done in fdget_pos() by predicting that the file was
found and by explicitly comparing the reference count to one and
ignoring the dead zone"
* tag 'vfs-6.15-rc1.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: reduce work in fdget_pos()
fs: use fput_close() in path_openat()
fs: use fput_close() in filp_close()
fs: use fput_close_sync() in close()
file: add fput and file_ref_put routines optimized for use when closing a fd
fs: predict no error in close()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs orangefs updates from Christian Brauner:
"This contains the work to remove orangefs_writepage() and partially
convert it to folios.
A few regular bugfixes are included as well"
* tag 'vfs-6.15-rc1.orangefs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
orangefs: Convert orangefs_writepages to contain an array of folios
orangefs: Simplify bvec setup in orangefs_writepages_work()
orangefs: Unify error & success paths in orangefs_writepages_work()
orangefs: Pass mapping to orangefs_writepages_work()
orangefs: Convert orangefs_writepage_locked() to take a folio
orangefs: Remove orangefs_writepage()
orangefs: make open_for_read and open_for_write boolean
orangefs: Move s_kmod_keyword_mask_map to orangefs-debugfs.c
orangefs: Do not truncate file size
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs afs updates from Christian Brauner:
"This contains the work for afs for this cycle:
- Fix an occasional hang that's only really encountered when
rmmod'ing the kafs module
- Remove the "-o autocell" mount option. This is obsolete with the
dynamic root and removing it makes the next patch slightly easier
- Change how the dynamic root mount is constructed. Currently, the
root directory is (de)populated when it is (un)mounted if there are
cells already configured and, further, pairs of automount points
have to be created/removed each time a cell is added/deleted
This is changed so that readdir on the root dir lists all the known
cell automount pairs plus the @cell symlinks and the inodes and
dentries are constructed by lookup on demand. This simplifies the
cell management code
- A few improvements to the afs_volume and afs_server tracepoints
- Pass trace info into the afs_lookup_cell() function to allow the
trace log to indicate the purpose of the lookup
- Remove the 'net' parameter from afs_unuse_cell() as it's
superfluous
- In rxrpc, allow a kernel app (such as kafs) to store a word of
information on rxrpc_peer records
- Use the information stored on the rxrpc_peer record to point to the
afs_server record. This allows the server address lookup to be done
away with
- Simplify the afs_server ref/activity accounting to make each one
self-contained and not garbage collected from the cell management
work item
- Simplify the afs_cell ref/activity accounting to make each one of
these also self-contained and not driven by a central management
work item
The current code was intended to make it such that a single timer
for the namespace and one work item per cell could do all the work
required to maintain these records. This, however, made for some
sequencing problems when cleaning up these records. Further, the
attempt to pass refs along with timers and work items made getting
it right rather tricky when the timer or work item already had a
ref attached and now a ref had to be got rid of"
* tag 'vfs-6.15-rc1.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
afs: Simplify cell record handling
afs: Fix afs_server ref accounting
afs: Use the per-peer app data provided by rxrpc
rxrpc: Allow the app to store private data on peer structs
afs: Drop the net parameter from afs_unuse_cell()
afs: Make afs_lookup_cell() take a trace note
afs: Improve server refcount/active count tracing
afs: Improve afs_volume tracing to display a debug ID
afs: Change dynroot to create contents on demand
afs: Remove the "autocell" mount option
|
|
Common YAML schema for devices that exports internal peripherals through
PCI BARs. The BARs are exposed as simple-buses through which the
peripherals can be accessed.
This is not intended to be used as a standalone binding, but should be
included by device specific bindings.
Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: fix typo]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/096ab7addb39e498e28ac2526c07157cc9327c42.1742418429.git.andrea.porta@suse.com
|
|
Remove intel_pcie_cpu_addr(), the .cpu_addr_fixup() method, because the dwc
core driver already handles address translation based on the devicetree
description.
[bhelgaas: this does require a minor dts change, but maintainer
Lei Chuan Hua <lchuanhua@maxlinear.com> confirms that the driver is only
used internally to Maxlinear and internal users will update dts:
https://lore.kernel.org/r/BY3PR19MB507667CE7531D863E1E5F8AEBDD82@BY3PR19MB5076.namprd19.prod.outlook.com]
Link: https://lore.kernel.org/r/20250305-intel-v1-1-40db3a685490@nxp.com
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Remove imx_pcie_cpu_addr_fixup, the .cpu_addr_fixup() method, because the
dwc core driver already handles address translation based on the devicetree
description.
Link: https://lore.kernel.org/r/20250315201548.858189-14-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
|
|
We know the parent_bus_offset, either computed from a DT reg property (the
offset is the CPU physical addr - the 'config'/'addr_space' address on the
parent bus) or from a .cpu_addr_fixup() (which may have used a host bridge
window offset).
Apply that parent_bus_offset instead of calling .cpu_addr_fixup() when
programming the ATU.
This assumes all intermediate addresses are at the same offset from the CPU
physical addresses.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20250315201548.858189-13-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Most systems' PCIe outbound map windows have non-zero physical addresses,
but the possibility of encountering zero increased after following commit
("PCI: dwc: Use parent_bus_offset").
'ep->outbound_addr[n]', representing 'parent_bus_address', might be 0 on
some hardware, which trims high address bits through bus fabric before
sending to the PCIe controller.
Replace the iteration logic with 'for_each_set_bit()' to ensure only
allocated map windows are iterated when determining the ATU index from a
given address.
Link: https://lore.kernel.org/r/20250315201548.858189-12-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Endpoint
┌───────────────────────────────────────────────┐
│ pcie-ep@5f010000 │
│ ┌────────────────┐│
│ │ Endpoint ││
│ │ PCIe ││
│ │ Controller ││
│ bus@5f000000 │ ┌────────►
│ ┌──────────┐ │ │ ││dynamically
│ │ │ Outbound Transfer │ ││allocated
│┌─────┐ │ Bus ┼─────►│ ATU ───────┘ ││PCI Addr
││ │ │ Fabric │Bus │ ││
││ CPU ├───►│ │Addr │ ││
││ │CPU │ │0x8000_0000 ││
│└─────┘Addr└──────────┘ │ ││
│ 0x7000_0000 └────────────────┘│
└───────────────────────────────────────────────┘
bus@5f000000 {
compatible = "simple-bus";
ranges = <0x80000000 0x0 0x70000000 0x10000000>;
pcie-ep@5f010000 {
reg = <0x80000000 0x10000000>;
reg-names ="addr_space";
...
};
...
};
In the diagram above, CPU writes data to outbound window address
0x7000_0000, and the bus fabric maps it to 0x8000_0000. The ATU uses
bus address 0x8000_0000 as input address and maps to some PCI address
dynamically allocated by a PCI device driver on the host side.
The pcie-ep@5f010000 'reg[addr_space]' is the parent bus address, which is
the input of PCIe controller, including the ATU.
Set parent_bus_offset, the offset from the CPU address to the PCIe
controller input address using dw_pcie_init_parent_bus_offset(). The
parent_bus_offset is not used yet, so no functional change intended.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20250315201548.858189-11-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Consolidate devicetree resource handling in dw_pcie_ep_get_resources().
No functional change intended.
Link: https://lore.kernel.org/r/20250315201548.858189-10-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
|
|
Move devm_pci_epc_create() to the beginning of dw_pcie_ep_init().
devm_pci_epc_create() is generic code that doesn't depend on any DWC
resource, so moving it earlier keeps all the subsequent devicetree-related
code together.
Link: https://lore.kernel.org/r/20250315201548.858189-9-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
|
|
The 'ranges' property of a PCI controller's parent can indicate address
translation information. Most system use 1:1 map between CPU physical and
PCI controller input addresses.
But some hardware, like i.MX8QXP, doesn't use 1:1 map. See below diagram:
┌─────────┐ ┌────────────┐
┌─────┐ │ │ IA: 0x8ff8_0000 │ │
│ CPU ├───►│ ┌────►├─────────────────┐ │ PCI │
└─────┘ │ │ │ IA: 0x8ff0_0000 │ │ │
CPU Addr │ │ ┌─►├─────────────┐ │ │ Controller │
0x7ff8_0000─┼───┘ │ │ │ │ │ │
│ │ │ │ │ │ │ PCI Addr
0x7ff0_0000─┼──────┘ │ │ └──► IOSpace ─┼────────────►
│ │ │ │ │ 0
0x7000_0000─┼────────►├─────────┐ │ │ │
└─────────┘ │ └──────► CfgSpace ─┼────────────►
Bus Fabric │ │ │ 0
│ │ │
└──────────► MemSpace ─┼────────────►
IA: 0x8000_0000 │ │ 0x8000_0000
└────────────┘
bus@5f000000 {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x80000000 0x0 0x70000000 0x10000000>;
pcie@5f010000 {
compatible = "fsl,imx8q-pcie";
reg = <0x5f010000 0x10000>, <0x8ff00000 0x80000>;
reg-names = "dbi", "config";
...
};
};
Intermediate address (IA) here means the PCIe controller input address.
The pcie@5f010000 'reg[config]' address is the parent bus (PCIe controller
input) address of CfgSpace.
The ATU in MemSpace is not explicitly described via devicetree, so we
assume the offset from CPU address to intermediate MemSpace address is the
same as that for CfgSpace.
We could use bus@5f000000 'ranges' for the same purpose.
Set parent_bus_offset using dw_pcie_init_parent_bus_offset(). The
parent_bus_offset is not used yet, so no functional change intended.
Link: https://lore.kernel.org/r/20250315201548.858189-8-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
dw_pcie_parent_bus_offset() looks up the parent bus address of a PCI
controller 'reg' property in devicetree. If implemented, .cpu_addr_fixup()
is a hard-coded way to get the parent bus address corresponding to a CPU
physical address.
Add debug code to compare the address from .cpu_addr_fixup() with the
address from devicetree. If they match, warn that .cpu_addr_fixup() is
redundant and should be removed; if they differ, warn that something is
wrong with the devicetree.
If .cpu_addr_fixup() is not implemented, the parent bus address should be
identical to the CPU physical address because we previously ignored the
parent bus address from devicetree. If the devicetree has a different
parent bus address, warn about it being broken.
[bhelgaas: split debug to separate patch for easier future revert, commit
log]
Link: https://lore.kernel.org/r/20250315201548.858189-7-helgaas@kernel.org
Signed-off-by: Frank Li <Frank.Li@nxp.com>
[bhelgaas: squash Ioana Ciornei <ioana.ciornei@nxp.com> fix for NULL
pointer deref when driver doesn't supply dw_pcie_ops, e.g., layerscape-pcie
https://lore.kernel.org/r/20250319134339.3114817-1-ioana.ciornei@nxp.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs initramfs updates from Christian Brauner:
"This adds basic kunit test coverage for initramfs unpacking and cleans
up some buffer handling issues and inefficiencies"
* tag 'vfs-6.15-rc1.initramfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
MAINTAINERS: append initramfs files to the VFS section
initramfs: avoid static buffer for error message
initramfs: fix hardlink hash leak without TRAILER
initramfs: reuse name_len for dir mtime tracking
initramfs: allocate heap buffers together
initramfs: avoid memcpy for hex header fields
vsprintf: add simple_strntoul
initramfs_test: kunit tests for initramfs unpacking
init: add initramfs_internal.h
|
|
The test_rss_context_dump() test assumes the indirection table is always
supported, which is not true for all drivers, e.g., virtio_net when
VIRTIO_NET_F_RSS is disabled.
Skip the check if 'indir' is not present.
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250318112426.386651-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'BFP' should be 'BPF'.
Signed-off-by: Ryohei Kinugawa <ryohei.kinugawa@gmail.com>
Link: https://patch.msgid.link/20250318095154.4187952-1-ryohei.kinugawa@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ChunHao Lin says:
====================
r8169: enable more devices ASPM support
This series of patches will enable more devices ASPM support.
It also fix a RTL8126 cannot enter L1 substate issue when ASPM is
enabled.
====================
Link: https://patch.msgid.link/20250318083721.4127-1-hau@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Disable it due to it dose not meet ZRX-DC specification. If it is enabled,
device will exit L1 substate every 100ms. Disable it for saving more power
in L1 substate.
Signed-off-by: ChunHao Lin <hau@realtek.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/20250318083721.4127-3-hau@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch will enable RTL8168H/RTL8168EP/RTL8168FP ASPM support on
the platforms that have tested with ASPM enabled.
Signed-off-by: ChunHao Lin <hau@realtek.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/20250318083721.4127-2-hau@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs ceph updates from Christian Brauner:
"This contains the work to remove access to page->index from ceph
and fixes the test failure observed for ceph with generic/421 by
refactoring ceph_writepages_start()"
* tag 'vfs-6.15-rc1.ceph' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fscrypt: Change fscrypt_encrypt_pagecache_blocks() to take a folio
ceph: Fix error handling in fill_readdir_cache()
fs: Remove page_mkwrite_check_truncate()
ceph: Pass a folio to ceph_allocate_page_array()
ceph: Convert ceph_move_dirty_page_in_page_array() to move_dirty_folio_in_page_array()
ceph: Remove uses of page from ceph_process_folio_batch()
ceph: Convert ceph_check_page_before_write() to use a folio
ceph: Convert writepage_nounlock() to write_folio_nounlock()
ceph: Convert ceph_readdir_cache_control to store a folio
ceph: Convert ceph_find_incompatible() to take a folio
ceph: Use a folio in ceph_page_mkwrite()
ceph: Remove ceph_writepage()
ceph: fix generic/421 test failure
ceph: introduce ceph_submit_write() method
ceph: introduce ceph_process_folio_batch() method
ceph: extend ceph_writeback_ctl for ceph_writepages_start() refactoring
|
|
The context indicates that 'than' is the correct word instead of 'then',
as a comparison is being performed.
Given that 'then' is also a valid English word, checkpatch.pl wouldn't
have picked up on this spelling error.
This typo was caught by AI during code review.
Suggested-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Link: https://patch.msgid.link/A43BEA49ED5CC6E5+20250318074656.644391-1-wangyuli@uniontech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This updates the old path and fixes the description of unavailable options.
Signed-off-by: Yui Washizu <yui.washidu@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250318061251.775191-1-yui.washidu@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
inet_connection_sock_af_ops.addr2sockaddr() hasn't been used at all
in the git era.
$ git grep addr2sockaddr $(git rev-list HEAD | tail -n 1)
Let's remove it.
Note that there was a 4 bytes hole after sockaddr_len and now it's
6 bytes, so the binary layout is not changed.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250318060112.3729-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit de70981f295e ("gve: unlink old napi when stopping a queue using
queue API") unlinks the old napi when stopping a queue. But this breaks
QPL mode of the driver which does not use page pool. Fix this by checking
that there's a page pool associated with the ring.
Cc: stable@vger.kernel.org
Fixes: de70981f295e ("gve: unlink old napi when stopping a queue using queue API")
Reviewed-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250317214141.286854-1-hramamurthy@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add more imix_weights test cases (for incomplete input).
Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250317090401.1240704-2-ps.report@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add strict buffer parsing index check to avoid the following Smatch
warning:
net/core/pktgen.c:877 get_imix_entries()
warn: check that incremented offset 'i' is capped
Checking the buffer index i after every get_user/i++ step and returning
with error code immediately avoids the current indirect (but correct)
error handling.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/netdev/36cf3ee2-38b1-47e5-a42a-363efeb0ace3@stanley.mountain/
Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250317090401.1240704-1-ps.report@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs pagesize updates from Christian Brauner:
"This enables block sizes greater than the page size for block devices.
With this we can start supporting block devices with logical block
sizes larger than 4k.
It also allows to lift the device cache sector size support to 64k.
This allows filesystems which can use larger sector sizes up to 64k to
ensure that the filesystem will not generate writes that are smaller
than the specified sector size"
* tag 'vfs-6.15-rc1.pagesize' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
bdev: add back PAGE_SIZE block size validation for sb_set_blocksize()
bdev: use bdev_io_min() for statx block size
block/bdev: lift block size restrictions to 64k
block/bdev: enable large folio support for large logical block sizes
fs/buffer fs/mpage: remove large folio restriction
fs/mpage: use blocks_per_folio instead of blocks_per_page
fs/mpage: avoid negative shift for large blocksize
fs/buffer: remove batching from async read
fs/buffer: simplify block_read_full_folio() with bh_offset()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount namespace updates from Christian Brauner:
"This expands the ability of anonymous mount namespaces:
- Creating detached mounts from detached mounts
Currently, detached mounts can only be created from attached
mounts. This limitaton prevents various use-cases. For example, the
ability to mount a subdirectory without ever having to make the
whole filesystem visible first.
The current permission modelis:
(1) Check that the caller is privileged over the owning user
namespace of it's current mount namespace.
(2) Check that the caller is located in the mount namespace of the
mount it wants to create a detached copy of.
While it is not strictly necessary to do it this way it is
consistently applied in the new mount api. This model will also be
used when allowing the creation of detached mount from another
detached mount.
The (1) requirement can simply be met by performing the same check
as for the non-detached case, i.e., verify that the caller is
privileged over its current mount namespace.
To meet the (2) requirement it must be possible to infer the origin
mount namespace that the anonymous mount namespace of the detached
mount was created from.
The origin mount namespace of an anonymous mount is the mount
namespace that the mounts that were copied into the anonymous mount
namespace originate from.
In order to check the origin mount namespace of an anonymous mount
namespace the sequence number of the original mount namespace is
recorded in the anonymous mount namespace.
With this in place it is possible to perform an equivalent check
(2') to (2). The origin mount namespace of the anonymous mount
namespace must be the same as the caller's mount namespace. To
establish this the sequence number of the caller's mount namespace
and the origin sequence number of the anonymous mount namespace are
compared.
The caller is always located in a non-anonymous mount namespace
since anonymous mount namespaces cannot be setns()ed into. The
caller's mount namespace will thus always have a valid sequence
number.
The owning namespace of any mount namespace, anonymous or
non-anonymous, can never change. A mount attached to a
non-anonymous mount namespace can never change mount namespace.
If the sequence number of the non-anonymous mount namespace and the
origin sequence number of the anonymous mount namespace match, the
owning namespaces must match as well.
Hence, the capability check on the owning namespace of the caller's
mount namespace ensures that the caller has the ability to copy the
mount tree.
- Allow mount detached mounts on detached mounts
Currently, detached mounts can only be mounted onto attached
mounts. This limitation makes it impossible to assemble a new
private rootfs and move it into place. Instead, a detached tree
must be created, attached, then mounted open and then either moved
or detached again. Lift this restriction.
In order to allow mounting detached mounts onto other detached
mounts the same permission model used for creating detached mounts
from detached mounts can be used (cf. above).
Allowing to mount detached mounts onto detached mounts leaves three
cases to consider:
(1) The source mount is an attached mount and the target mount is
a detached mount. This would be equivalent to moving a mount
between different mount namespaces. A caller could move an
attached mount to a detached mount. The detached mount can now
be freely attached to any mount namespace. This changes the
current delegatioh model significantly for no good reason. So
this will fail.
(2) Anonymous mount namespaces are always attached fully, i.e., it
is not possible to only attach a subtree of an anoymous mount
namespace. This simplifies the implementation and reasoning.
Consequently, if the anonymous mount namespace of the source
detached mount and the target detached mount are the identical
the mount request will fail.
(3) The source mount's anonymous mount namespace is different from
the target mount's anonymous mount namespace.
In this case the source anonymous mount namespace of the
source mount tree must be freed after its mounts have been
moved to the target anonymous mount namespace. The source
anonymous mount namespace must be empty afterwards.
By allowing to mount detached mounts onto detached mounts a caller
may do the following:
fd_tree1 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE)
fd_tree2 = open_tree(-EBADF, "/tmp", OPEN_TREE_CLONE)
fd_tree1 and fd_tree2 refer to two different detached mount trees
that belong to two different anonymous mount namespace.
It is important to note that fd_tree1 and fd_tree2 both refer to
the root of their respective anonymous mount namespaces.
By allowing to mount detached mounts onto detached mounts the
caller may now do:
move_mount(fd_tree1, "", fd_tree2, "",
MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_EMPTY_PATH)
This will cause the detached mount referred to by fd_tree1 to be
mounted on top of the detached mount referred to by fd_tree2.
Thus, the detached mount fd_tree1 is moved from its separate
anonymous mount namespace into fd_tree2's anonymous mount
namespace.
It also means that while fd_tree2 continues to refer to the root of
its respective anonymous mount namespace fd_tree1 doesn't anymore.
This has the consequence that only fd_tree2 can be moved to another
anonymous or non-anonymous mount namespace. Moving fd_tree1 will
now fail as fd_tree1 doesn't refer to the root of an anoymous mount
namespace anymore.
Now fd_tree1 and fd_tree2 refer to separate detached mount trees
referring to the same anonymous mount namespace.
This is conceptually fine. The new mount api does allow for this to
happen already via:
mount -t tmpfs tmpfs /mnt
mkdir -p /mnt/A
mount -t tmpfs tmpfs /mnt/A
fd_tree3 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE | AT_RECURSIVE)
fd_tree4 = open_tree(-EBADF, "/mnt/A", 0)
Both fd_tree3 and fd_tree4 refer to two different detached mount
trees but both detached mount trees refer to the same anonymous
mount namespace. An as with fd_tree1 and fd_tree2, only fd_tree3
may be moved another mount namespace as fd_tree3 refers to the root
of the anonymous mount namespace just while fd_tree4 doesn't.
However, there's an important difference between the
fd_tree3/fd_tree4 and the fd_tree1/fd_tree2 example.
Closing fd_tree4 and releasing the respective struct file will have
no further effect on fd_tree3's detached mount tree.
However, closing fd_tree3 will cause the mount tree and the
respective anonymous mount namespace to be destroyed causing the
detached mount tree of fd_tree4 to be invalid for further mounting.
By allowing to mount detached mounts on detached mounts as in the
fd_tree1/fd_tree2 example both struct files will affect each other.
Both fd_tree1 and fd_tree2 refer to struct files that have
FMODE_NEED_UNMOUNT set.
To handle this we use the fact that @fd_tree1 will have a parent
mount once it has been attached to @fd_tree2.
When dissolve_on_fput() is called the mount that has been passed in
will refer to the root of the anonymous mount namespace. If it
doesn't it would mean that mounts are leaked. So before allowing to
mount detached mounts onto detached mounts this would be a bug.
Now that detached mounts can be mounted onto detached mounts it
just means that the mount has been attached to another anonymous
mount namespace and thus dissolve_on_fput() must not unmount the
mount tree or free the anonymous mount namespace as the file
referring to the root of the namespace hasn't been closed yet.
If it had been closed yet it would be obvious because the mount
namespace would be NULL, i.e., the @fd_tree1 would have already
been unmounted. If @fd_tree1 hasn't been unmounted yet and has a
parent mount it is safe to skip any cleanup as closing @fd_tree2
will take care of all cleanup operations.
- Allow mount propagation for detached mount trees
In commit ee2e3f50629f ("mount: fix mounting of detached mounts
onto targets that reside on shared mounts") I fixed a bug where
propagating the source mount tree of an anonymous mount namespace
into a target mount tree of a non-anonymous mount namespace could
be used to trigger an integer overflow in the non-anonymous mount
namespace causing any new mounts to fail.
The cause of this was that the propagation algorithm was unable to
recognize mounts from the source mount tree that were already
propagated into the target mount tree and then reappeared as
propagation targets when walking the destination propagation mount
tree.
When fixing this I disabled mount propagation into anonymous mount
namespaces. Make it possible for anonymous mount namespace to
receive mount propagation events correctly. This is now also a
correctness issue now that we allow mounting detached mount trees
onto detached mount trees.
Mark the source anonymous mount namespace with MNTNS_PROPAGATING
indicating that all mounts belonging to this mount namespace are
currently in the process of being propagated and make the
propagation algorithm discard those if they appear as propagation
targets"
* tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits)
selftests: test subdirectory mounting
selftests: add test for detached mount tree propagation
fs: namespace: fix uninitialized variable use
mount: handle mount propagation for detached mount trees
fs: allow creating detached mounts from fsmount() file descriptors
selftests: seventh test for mounting detached mounts onto detached mounts
selftests: sixth test for mounting detached mounts onto detached mounts
selftests: fifth test for mounting detached mounts onto detached mounts
selftests: fourth test for mounting detached mounts onto detached mounts
selftests: third test for mounting detached mounts onto detached mounts
selftests: second test for mounting detached mounts onto detached mounts
selftests: first test for mounting detached mounts onto detached mounts
fs: mount detached mounts onto detached mounts
fs: support getname_maybe_null() in move_mount()
selftests: create detached mounts from detached mounts
fs: create detached mounts from detached mounts
fs: add may_copy_tree()
fs: add fastpath for dissolve_on_fput()
fs: add assert for move_mount()
fs: add mnt_ns_empty() helper
...
|