Age | Commit message (Collapse) | Author |
|
Change pci_bus_distribute_available_resources() arguments from
resource_size_t to struct resource to add more information required to get
the alignment correct for bridge windows with alignment >1M.
We require (size, alignment), instead of just (size) which is what is
currently available. The change from resource_size_t to struct resource
does just that.
Note that the struct resource arguments are passed by value and not by
reference. We do not want to pass by reference and change the resource size
of the parent bridge window. We only want the size information.
No functional change intended.
Link: https://lore.kernel.org/r/PSXP216MB0438587C47CBEDF365B1EA27803C0@PSXP216MB0438.KORP216.PROD.OUTLOOK.COM
[bhelgaas: split parts to other patches to reduce the size of this one]
Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
In pci_bus_distribute_available_resources(), rename:
io => io_per_hp
mmio => mmio_per_hp
mmio_pref => mmio_pref_per_hp
No functional change; this is just to make a subsequent patch smaller.
[bhelgaas: extracted from https://lore.kernel.org/r/PSXP216MB0438587C47CBEDF365B1EA27803C0@PSXP216MB0438.KORP216.PROD.OUTLOOK.COM]
Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground
Pull y2038 updates from Arnd Bergmann:
"Core, driver and file system changes
These are updates to device drivers and file systems that for some
reason or another were not included in the kernel in the previous
y2038 series.
I've gone through all users of time_t again to make sure the kernel is
in a long-term maintainable state, replacing all remaining references
to time_t with safe alternatives.
Some related parts of the series were picked up into the nfsd, xfs,
alsa and v4l2 trees. A final set of patches in linux-mm removes the
now unused time_t/timeval/timespec types and helper functions after
all five branches are merged for linux-5.6, ensuring that no new users
get merged.
As a result, linux-5.6, or my backport of the patches to 5.4 [1],
should be the first release that can serve as a base for a 32-bit
system designed to run beyond year 2038, with a few remaining caveats:
- All user space must be compiled with a 64-bit time_t, which will be
supported in the coming musl-1.2 and glibc-2.32 releases, along
with installed kernel headers from linux-5.6 or higher.
- Applications that use the system call interfaces directly need to
be ported to use the time64 syscalls added in linux-5.1 in place of
the existing system calls. This impacts most users of futex() and
seccomp() as well as programming languages that have their own
runtime environment not based on libc.
- Applications that use a private copy of kernel uapi header files or
their contents may need to update to the linux-5.6 version, in
particular for sound/asound.h, xfs/xfs_fs.h, linux/input.h,
linux/elfcore.h, linux/sockios.h, linux/timex.h and
linux/can/bcm.h.
- A few remaining interfaces cannot be changed to pass a 64-bit
time_t in a compatible way, so they must be configured to use
CLOCK_MONOTONIC times or (with a y2106 problem) unsigned 32-bit
timestamps. Most importantly this impacts all users of 'struct
input_event'.
- All y2038 problems that are present on 64-bit machines also apply
to 32-bit machines. In particular this affects file systems with
on-disk timestamps using signed 32-bit seconds: ext4 with
ext3-style small inodes, ext2, xfs (to be fixed soon) and ufs"
[1] https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/log/?h=y2038-endgame
* tag 'y2038-drivers-for-v5.6-signed' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (21 commits)
Revert "drm/etnaviv: reject timeouts with tv_nsec >= NSEC_PER_SEC"
y2038: sh: remove timeval/timespec usage from headers
y2038: sparc: remove use of struct timex
y2038: rename itimerval to __kernel_old_itimerval
y2038: remove obsolete jiffies conversion functions
nfs: fscache: use timespec64 in inode auxdata
nfs: fix timstamp debug prints
nfs: use time64_t internally
sunrpc: convert to time64_t for expiry
drm/etnaviv: avoid deprecated timespec
drm/etnaviv: reject timeouts with tv_nsec >= NSEC_PER_SEC
drm/msm: avoid using 'timespec'
hfs/hfsplus: use 64-bit inode timestamps
hostfs: pass 64-bit timestamps to/from user space
packet: clarify timestamp overflow
tsacct: add 64-bit btime field
acct: stop using get_seconds()
um: ubd: use 64-bit time_t where possible
xtensa: ISS: avoid struct timeval
dlm: use SO_SNDTIMEO_NEW instead of SO_SNDTIMEO_OLD
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk update from Petr Mladek:
"Prevent replaying log on all consoles"
* tag 'printk-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
printk: fix exclusive_console replaying
|
|
Add new VMD device IDs that require the bus restriction mode.
Signed-off-by: Sushma Kalakota <sushmax.kalakota@intel.com>
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
This adds IORING_OP_EPOLL_CTL, which can perform the same work as the
epoll_ctl(2) system call.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Also make it available outside of epoll, along with the helper that
decides if we need to copy the passed in epoll_event.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
No functional changes in this patch.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add support for Intel Comet Lake PCH-V which is based on Intel Kaby
Lake. Difference between it and other Comet Lake variants is that former
uses previous iTCO version 4 and latter use version 6 like Intel Cannon
Lake PCH.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
In I2C there is no such thing as a "stop bit". Use the proper naming: "stop
condition".
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reported-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
In smbus-protocol.rst we use the text "Implemented by" for the same meaning
as "This corresponds to". Change everything to "Implemented by" for
coherency.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reported-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Some of the section names are not very clear. Reading those names in the
index.rst page does not help much in grasping what the content is supposed
to be.
Rename those sections to clarify their content, especially when reading
the index page.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Acked-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Use a monospace (literal) formatting for better readability of sysfs
attributes and the "dummy" client name. This looks much more readable in
ReST-generated output.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
This section applies only to code for very old kernels. Avoid people
reading this unnecessarily.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Use ReST syntax so that a proper hyperlink is generated.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Use a monospace (literal) formatting for better readability of sysfs
attributes.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Among the "static" instantiation methods the "board file" method is
described first. Move it as last, since it is being replaced by the other
methods.
Also fix subsubsection heading syntax and remove the "Method 1[abc]"
prefix as the subsubsection structure clarifies the logical hierarchy.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Use ReST syntax so that a proper hyperlink is generated.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Clarify from the beginning what these transactions are, and specifically
how they differ from the SMBus counterparts, i.e. the lack of a Count byte.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Remove misplaced dot before colon.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The subject is plural, fix the verb.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
This clarifies these are functions (and would/will adds a hyperlink to the
function documentation if/when documented).
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Hyperlinks from function names are not generated in headings. Move them in
the plain text so they are rendered as clickable hyperlinks.
While there also remove an unneeded colon in a heading.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Use the proper ACK and NACK naming from the I2C specification instead of
"accept" and "reverse accept".
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
These colons are not needed: the columns already nicely separate the
symbols from their description. They are also inconsistently preceded by
whitespace.
Remove the colons completely to simplify and clean up.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
In I2C there is no such thing as a "start bit" or a "stop bit". Use the
proper naming: "start condition" and "stop condition".
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Use the proper ReST syntax to generate a valid hyperlink.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Use the proper ACK and NACK naming from the I2C specification instead of
"accept" and "reverse accept".
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
These colons are not needed: the columns already nicely separate the
symbols from their description. They are also inconsistently preceded by
whitespace.
Remove the colons completely to simplify and clean up.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
In I2C there is no such thing as a "start bit" or a "stop bit". Use the
proper naming: "start condition" and "stop condition".
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
This clarifies these are functions and adds a hyperlink to the function
documentation.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
"I2C transfer" is a legitimate english sentence, no need for a hyphen
between the two words, as as such it is used in most of the
documentation. Remove the hyphen in the few places where it is present.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Acked-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Fix "issus" -> "issues".
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Acked-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Uppercase "I2C" is used almost everywhere in the docs, but the lowercase
version "i2c" is used somewhere. Use the uppercase form consistently.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Acked-by: Peter Rosin <peda@axentia.se>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
This section, partly dating back to the pre-git era, is somewhat
unclear and partly incorrect. Rewrite it almost completely including a
reference figure, concise but precise definition of each term and the
paths where drivers are found. Particular care has been put in clarifying
the relation between adapter and algorithm, which has no correspondence
in the I2C spec terminology.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
- state the "official" name (I²C, not I2C, according to the spec) at
the beginning but keep using the more practical I2C elsewhere
- mention some known different names
- add link to the specification document
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The index page currently lists sections in alphabetical file order without
caring about their content. Sort sections based on their content logically,
according to the following structure:
* Intro to I2C/SMBus and their usage in Linux: summary, i2c-protocol,
smbus-protocol, instantiating-devices, busses/index, i2c-topology,
muxes/i2c-mux-gpio
* Implementing drivers: writing-clients, dev-interface,
dma-considerations, fault-codes, functionality
* Debugging: gpio-fault-injection, i2c-stub
* Slave I2C: slave-interface, slave-eeprom-backend
* Advanced: ten-bit-addresses
* Obsolete info: upgrading-clients, old-module-parameters
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
i2c/for-5.6
The main feature is the idle-state rework of the pca954x driver from Biwen Li.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-5.6
at24 updates for linux v5.6
- minor maintenance: update the license tag, sort headers
- move support for the write-protect pin into nvmem core
- add a reference to the new wp-gpios property in nvmem to at25 bindings
- add support for regulator and pm_runtime control
|
|
There is a statement that is indented one level too deeply, remove
the extraneous tab.
Fixes: b4c119dbc300 ("i2c: xiic: Add timeout to the rx fifo wait loop")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Shubhrajyoti Datta <shubhrajyoti.datta@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
There is a spelling mistake in a module parameter description.
Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
We're not consistent in how the file table is grabbed and assigned if we
have a command linked that requires the use of it.
Add ->file_table to the io_op_defs[] array, and use that to determine
when to grab the table instead of having the handlers set it if they
need to defer. This also means we can kill the IO_WQ_WORK_NEEDS_FILES
flag. We always initialize work->files, so io-wq can just check for
that.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The uapi value IB_UVERBS_ACCESS_OPTIONAL_RANGE shouldn't be used inside
the driver, use IB_ACCESS_OPTIONAL instead.
Fixes: 86dd738cf20c ("RDMA/efa: Allow passing of optional access flags for MR registration")
Link: https://lore.kernel.org/r/20200129071803.40117-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Remove unnecessary braces in pci_bus_distribute_available_resources(). No
functional changes.
Link: https://lore.kernel.org/r/PSXP216MB0438061CB4442460BB92A75F803C0@PSXP216MB0438.KORP216.PROD.OUTLOOK.COM
Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
ALSA PCM core recently introduced a new managed PCM buffer allocation
mode that does allocate / free automatically at hw_params and
hw_free. However, it overlooked the code path directly calling
hw_free PCM ops at releasing the PCM substream, and it may result in a
memory leak as spotted by syzkaller when no buffer preallocation is
used (e.g. vmalloc buffer).
This patch papers over it with a slight refactoring. The hw_free ops
call and relevant tasks are unified in a new helper function, and call
it from both places.
Fixes: 0dba808eae26 ("ALSA: pcm: Introduce managed buffer allocation mode")
Reported-by: syzbot+30edd0f34bfcdc548ac4@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200129195907.12197-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Fix the following sparse warning generated due to
64-bit compat type having fields defined explicitly
with __s32:
sound/soc/sof/sof-audio.c:46:31: warning: incorrect type in assignment (different base types)
sound/soc/sof/sof-audio.c:46:31: expected restricted snd_pcm_state_t [usertype] state
sound/soc/sof/sof-audio.c:46:31: got signed int [usertype] state
Fixes: 80fe7430c708 ("ALSA: add new 32-bit layout for snd_pcm_mmap_status/control")
Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://lore.kernel.org/r/20200129184448.3005-1-ranjani.sridharan@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"A regression fix, several cleanups and (maybe) plus an upcoming new
mount api convert patch as a part of vfs update are considered
available for this cycle.
All commits have been in linux-next and tested with no smoke out.
Summary:
- fix an out-of-bound read access introduced in v5.3, which could
rarely cause data corruption
- various cleanup patches"
* tag 'erofs-for-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: clean up z_erofs_submit_queue()
erofs: fold in postsubmit_is_all_bypassed()
erofs: fix out-of-bound read for shifted uncompressed block
erofs: remove void tagging/untagging of workgroup pointers
erofs: remove unused tag argument while registering a workgroup
erofs: remove unused tag argument while finding a workgroup
erofs: correct indentation of an assigned structure inside a function
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull adfs updates from Al Viro:
"adfs stuff for this cycle"
* 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (42 commits)
fs/adfs: bigdir: Fix an error code in adfs_fplus_read()
Documentation: update adfs filesystem documentation
fs/adfs: mostly divorse inode number from indirect disc address
fs/adfs: super: add support for E and E+ floppy image formats
fs/adfs: super: extract filesystem block probe
fs/adfs: dir: remove debug in adfs_dir_update()
fs/adfs: super: fix inode dropping
fs/adfs: bigdir: implement directory update support
fs/adfs: bigdir: calculate and validate directory checkbyte
fs/adfs: bigdir: directory validation strengthening
fs/adfs: bigdir: extract directory validation
fs/adfs: bigdir: factor out directory entry offset calculation
fs/adfs: newdir: split out directory commit from update
fs/adfs: newdir: clean up adfs_f_update()
fs/adfs: newdir: merge adfs_dir_read() into adfs_f_read()
fs/adfs: newdir: improve directory validation
fs/adfs: newdir: factor out directory format validation
fs/adfs: dir: use pointers to access directory head/tails
fs/adfs: dir: add more efficient iterate() per-format method
fs/adfs: dir: switch to iterate_shared method
...
|
|
This reverts commit 74759e1693311a8d1441de836c4080c192374238.
mptcp@lists.01.org accepts messages from non-subscribers. There was an
invisible and unexpected server-wide rule limiting the number of
recipients for subscribers and non-subscribers alike, and that has now
been turned off for this list.
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull openat2 support from Al Viro:
"This is the openat2() series from Aleksa Sarai.
I'm afraid that the rest of namei stuff will have to wait - it got
zero review the last time I'd posted #work.namei, and there had been a
leak in the posted series I'd caught only last weekend. I was going to
repost it on Monday, but the window opened and the odds of getting any
review during that... Oh, well.
Anyway, openat2 part should be ready; that _did_ get sane amount of
review and public testing, so here it comes"
From Aleksa's description of the series:
"For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown
flags are present[1].
This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road
to being added to openat(2).
Furthermore, the need for some sort of control over VFS's path
resolution (to avoid malicious paths resulting in inadvertent
breakouts) has been a very long-standing desire of many userspace
applications.
This patchset is a revival of Al Viro's old AT_NO_JUMPS[3] patchset
(which was a variant of David Drysdale's O_BENEATH patchset[4] which
was a spin-off of the Capsicum project[5]) with a few additions and
changes made based on the previous discussion within [6] as well as
others I felt were useful.
In line with the conclusions of the original discussion of
AT_NO_JUMPS, the flag has been split up into separate flags. However,
instead of being an openat(2) flag it is provided through a new
syscall openat2(2) which provides several other improvements to the
openat(2) interface (see the patch description for more details). The
following new LOOKUP_* flags are added:
LOOKUP_NO_XDEV:
Blocks all mountpoint crossings (upwards, downwards, or through
absolute links). Absolute pathnames alone in openat(2) do not
trigger this. Magic-link traversal which implies a vfsmount jump is
also blocked (though magic-link jumps on the same vfsmount are
permitted).
LOOKUP_NO_MAGICLINKS:
Blocks resolution through /proc/$pid/fd-style links. This is done
by blocking the usage of nd_jump_link() during resolution in a
filesystem. The term "magic-links" is used to match with the only
reference to these links in Documentation/, but I'm happy to change
the name.
It should be noted that this is different to the scope of
~LOOKUP_FOLLOW in that it applies to all path components. However,
you can do openat2(NO_FOLLOW|NO_MAGICLINKS) on a magic-link and it
will *not* fail (assuming that no parent component was a
magic-link), and you will have an fd for the magic-link.
In order to correctly detect magic-links, the introduction of a new
LOOKUP_MAGICLINK_JUMPED state flag was required.
LOOKUP_BENEATH:
Disallows escapes to outside the starting dirfd's
tree, using techniques such as ".." or absolute links. Absolute
paths in openat(2) are also disallowed.
Conceptually this flag is to ensure you "stay below" a certain
point in the filesystem tree -- but this requires some additional
to protect against various races that would allow escape using
"..".
Currently LOOKUP_BENEATH implies LOOKUP_NO_MAGICLINKS, because it
can trivially beam you around the filesystem (breaking the
protection). In future, there might be similar safety checks done
as in LOOKUP_IN_ROOT, but that requires more discussion.
In addition, two new flags are added that expand on the above ideas:
LOOKUP_NO_SYMLINKS:
Does what it says on the tin. No symlink resolution is allowed at
all, including magic-links. Just as with LOOKUP_NO_MAGICLINKS this
can still be used with NOFOLLOW to open an fd for the symlink as
long as no parent path had a symlink component.
LOOKUP_IN_ROOT:
This is an extension of LOOKUP_BENEATH that, rather than blocking
attempts to move past the root, forces all such movements to be
scoped to the starting point. This provides chroot(2)-like
protection but without the cost of a chroot(2) for each filesystem
operation, as well as being safe against race attacks that
chroot(2) is not.
If a race is detected (as with LOOKUP_BENEATH) then an error is
generated, and similar to LOOKUP_BENEATH it is not permitted to
cross magic-links with LOOKUP_IN_ROOT.
The primary need for this is from container runtimes, which
currently need to do symlink scoping in userspace[7] when opening
paths in a potentially malicious container.
There is a long list of CVEs that could have bene mitigated by
having RESOLVE_THIS_ROOT (such as CVE-2017-1002101,
CVE-2017-1002102, CVE-2018-15664, and CVE-2019-5736, just to name a
few).
In order to make all of the above more usable, I'm working on
libpathrs[8] which is a C-friendly library for safe path resolution.
It features a userspace-emulated backend if the kernel doesn't support
openat2(2). Hopefully we can get userspace to switch to using it, and
thus get openat2(2) support for free once it's ready.
Future work would include implementing things like
RESOLVE_NO_AUTOMOUNT and possibly a RESOLVE_NO_REMOTE (to allow
programs to be sure they don't hit DoSes though stale NFS handles)"
* 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Documentation: path-lookup: include new LOOKUP flags
selftests: add openat2(2) selftests
open: introduce openat2(2) syscall
namei: LOOKUP_{IN_ROOT,BENEATH}: permit limited ".." resolution
namei: LOOKUP_IN_ROOT: chroot-like scoped resolution
namei: LOOKUP_BENEATH: O_BENEATH-like scoped resolution
namei: LOOKUP_NO_XDEV: block mountpoint crossing
namei: LOOKUP_NO_MAGICLINKS: block magic-link resolution
namei: LOOKUP_NO_SYMLINKS: block symlink resolution
namei: allow set_root() to produce errors
namei: allow nd_jump_link() to produce errors
nsfs: clean-up ns_get_path() signature to return int
namei: only return -ECHILD from follow_dotdot_rcu()
|