summaryrefslogtreecommitdiff
path: root/Documentation/core-api
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/core-api')
-rw-r--r--Documentation/core-api/index.rst1
-rw-r--r--Documentation/core-api/netlink.rst101
-rw-r--r--Documentation/core-api/packing.rst2
-rw-r--r--Documentation/core-api/padata.rst2
-rw-r--r--Documentation/core-api/pin_user_pages.rst31
-rw-r--r--Documentation/core-api/workqueue.rst4
6 files changed, 121 insertions, 20 deletions
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 77eb775b8b42..7a3a08d81f11 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -127,6 +127,7 @@ Documents that don't fit elsewhere or which have yet to be categorized.
:maxdepth: 1
librs
+ netlink
.. only:: subproject and html
diff --git a/Documentation/core-api/netlink.rst b/Documentation/core-api/netlink.rst
new file mode 100644
index 000000000000..e4a938a05cc9
--- /dev/null
+++ b/Documentation/core-api/netlink.rst
@@ -0,0 +1,101 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+
+.. _kernel_netlink:
+
+===================================
+Netlink notes for kernel developers
+===================================
+
+General guidance
+================
+
+Attribute enums
+---------------
+
+Older families often define "null" attributes and commands with value
+of ``0`` and named ``unspec``. This is supported (``type: unused``)
+but should be avoided in new families. The ``unspec`` enum values are
+not used in practice, so just set the value of the first attribute to ``1``.
+
+Message enums
+-------------
+
+Use the same command IDs for requests and replies. This makes it easier
+to match them up, and we have plenty of ID space.
+
+Use separate command IDs for notifications. This makes it easier to
+sort the notifications from replies (and present them to the user
+application via a different API than replies).
+
+Answer requests
+---------------
+
+Older families do not reply to all of the commands, especially NEW / ADD
+commands. User only gets information whether the operation succeeded or
+not via the ACK. Try to find useful data to return. Once the command is
+added whether it replies with a full message or only an ACK is uAPI and
+cannot be changed. It's better to err on the side of replying.
+
+Specifically NEW and ADD commands should reply with information identifying
+the created object such as the allocated object's ID (without having to
+resort to using ``NLM_F_ECHO``).
+
+NLM_F_ECHO
+----------
+
+Make sure to pass the request info to genl_notify() to allow ``NLM_F_ECHO``
+to take effect. This is useful for programs that need precise feedback
+from the kernel (for example for logging purposes).
+
+Support dump consistency
+------------------------
+
+If iterating over objects during dump may skip over objects or repeat
+them - make sure to report dump inconsistency with ``NLM_F_DUMP_INTR``.
+This is usually implemented by maintaining a generation id for the
+structure and recording it in the ``seq`` member of struct netlink_callback.
+
+Netlink specification
+=====================
+
+Documentation of the Netlink specification parts which are only relevant
+to the kernel space.
+
+Globals
+-------
+
+kernel-policy
+~~~~~~~~~~~~~
+
+Defines if the kernel validation policy is per operation (``per-op``)
+or for the entire family (``global``). New families should use ``per-op``
+(default) to be able to narrow down the attributes accepted by a specific
+command.
+
+checks
+------
+
+Documentation for the ``checks`` sub-sections of attribute specs.
+
+unterminated-ok
+~~~~~~~~~~~~~~~
+
+Accept strings without the null-termination (for legacy families only).
+Switches from the ``NLA_NUL_STRING`` to ``NLA_STRING`` policy type.
+
+max-len
+~~~~~~~
+
+Defines max length for a binary or string attribute (corresponding
+to the ``len`` member of struct nla_policy). For string attributes terminating
+null character is not counted towards ``max-len``.
+
+The field may either be a literal integer value or a name of a defined
+constant. String types may reduce the constant by one
+(i.e. specify ``max-len: CONST - 1``) to reserve space for the terminating
+character so implementations should recognize such pattern.
+
+min-len
+~~~~~~~
+
+Similar to ``max-len`` but defines minimum length.
diff --git a/Documentation/core-api/packing.rst b/Documentation/core-api/packing.rst
index d8c341fe383e..3ed13bc9a195 100644
--- a/Documentation/core-api/packing.rst
+++ b/Documentation/core-api/packing.rst
@@ -161,6 +161,6 @@ xxx_packing() that calls it using the proper QUIRK_* one-hot bits set.
The packing() function returns an int-encoded error code, which protects the
programmer against incorrect API use. The errors are not expected to occur
-durring runtime, therefore it is reasonable for xxx_packing() to return void
+during runtime, therefore it is reasonable for xxx_packing() to return void
and simply swallow those errors. Optionally it can dump stack or print the
error description.
diff --git a/Documentation/core-api/padata.rst b/Documentation/core-api/padata.rst
index 35175710b43c..05b73c6c105f 100644
--- a/Documentation/core-api/padata.rst
+++ b/Documentation/core-api/padata.rst
@@ -42,7 +42,7 @@ padata_shells associated with it, each allowing a separate series of jobs.
Modifying cpumasks
------------------
-The CPUs used to run jobs can be changed in two ways, programatically with
+The CPUs used to run jobs can be changed in two ways, programmatically with
padata_set_cpumask() or via sysfs. The former is defined::
int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index b18416f4500f..9fb0b1080d3b 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -55,18 +55,17 @@ flags the caller provides. The caller is required to pass in a non-null struct
pages* array, and the function then pins pages by incrementing each by a special
value: GUP_PIN_COUNTING_BIAS.
-For compound pages, the GUP_PIN_COUNTING_BIAS scheme is not used. Instead,
-an exact form of pin counting is achieved, by using the 2nd struct page
-in the compound page. A new struct page field, compound_pincount, has
-been added in order to support this.
-
-This approach for compound pages avoids the counting upper limit problems that
-are discussed below. Those limitations would have been aggravated severely by
-huge pages, because each tail page adds a refcount to the head page. And in
-fact, testing revealed that, without a separate compound_pincount field,
-page overflows were seen in some huge page stress tests.
-
-This also means that huge pages and compound pages do not suffer
+For large folios, the GUP_PIN_COUNTING_BIAS scheme is not used. Instead,
+the extra space available in the struct folio is used to store the
+pincount directly.
+
+This approach for large folios avoids the counting upper limit problems
+that are discussed below. Those limitations would have been aggravated
+severely by huge pages, because each tail page adds a refcount to the
+head page. And in fact, testing revealed that, without a separate pincount
+field, refcount overflows were seen in some huge page stress tests.
+
+This also means that huge pages and large folios do not suffer
from the false positives problem that is mentioned below.::
Function
@@ -221,7 +220,7 @@ Unit testing
============
This file::
- tools/testing/selftests/vm/gup_test.c
+ tools/testing/selftests/mm/gup_test.c
has the following new calls to exercise the new pin*() wrapper functions:
@@ -264,9 +263,9 @@ place.)
Other diagnostics
=================
-dump_page() has been enhanced slightly, to handle these new counting
-fields, and to better report on compound pages in general. Specifically,
-for compound pages, the exact (compound_pincount) pincount is reported.
+dump_page() has been enhanced slightly to handle these new counting
+fields, and to better report on large folios in general. Specifically,
+for large folios, the exact pincount is reported.
References
==========
diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst
index 3b22ed137662..8ec4d6270b24 100644
--- a/Documentation/core-api/workqueue.rst
+++ b/Documentation/core-api/workqueue.rst
@@ -370,8 +370,8 @@ of possible problems:
The first one can be tracked using tracing: ::
- $ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
- $ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
+ $ echo workqueue:workqueue_queue_work > /sys/kernel/tracing/set_event
+ $ cat /sys/kernel/tracing/trace_pipe > out.txt
(wait a few secs)
^C