Age | Commit message (Collapse) | Author |
|
The physical address of page tables is passed around and
used as virtual address in various locations.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
There is little sense in applying __pa() to a physical
address, but that what pfn_pXd() macros do.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
This update fixes semantics of pXd_deref macros which
are expected to return a CPU-addressable pointer.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Provide correct mnemonic for selfhr.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more clang LTO updates from Kees Cook:
"Clang LTO x86 enablement.
Full disclosure: while this has _not_ been in linux-next (since it
initially looked like the objtool dependencies weren't going to make
v5.12), it has been under daily build and runtime testing by Sami for
quite some time. These x86 portions have been discussed on lkml, with
Peter, Josh, and others helping nail things down.
The bulk of the changes are to get objtool working happily. The rest
of the x86 enablement is very small.
Summary:
- Generate __mcount_loc in objtool (Peter Zijlstra)
- Support running objtool against vmlinux.o (Sami Tolvanen)
- Clang LTO enablement for x86 (Sami Tolvanen)"
Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/
Link: https://lore.kernel.org/lkml/cover.1611263461.git.jpoimboe@redhat.com/
* tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
kbuild: lto: force rebuilds when switching CONFIG_LTO
x86, build: allow LTO to be selected
x86, cpu: disable LTO for cpu.c
x86, vdso: disable LTO only for vDSO
kbuild: lto: postpone objtool
objtool: Split noinstr validation from --vmlinux
x86, build: use objtool mcount
tracing: add support for objtool mcount
objtool: Don't autodetect vmlinux.o
objtool: Fix __mcount_loc generation with Clang's assembler
objtool: Add a pass for generating __mcount_loc
|
|
The PCI error recovery always resets the link for a frozen state, so the
port driver should return that a reset is required for its result. This
will get the .slot_reset() callback invoked, which is necessary to
restore the port's config space. Without this, the driver had been
relying on downstream drivers to return this status.
Link: https://lore.kernel.org/r/20210104230300.1277180-6-kbusch@kernel.org
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
|
|
The AER driver may be called upon to reset either a Downstream or a Root
Port. Check which type it is to properly identify it when logging that
the reset occurred.
Link: https://lore.kernel.org/r/20210104230300.1277180-5-kbusch@kernel.org
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
|
|
Overwriting the frozen detected status with the result of the link reset
loses the NEED_RESET result that drivers are depending on for error
handling to report the .slot_reset() callback. Retain this status so
that subsequent error handling has the correct flow.
Link: https://lore.kernel.org/r/20210104230300.1277180-4-kbusch@kernel.org
Reported-by: Hinko Kocevar <hinko.kocevar@ess.eu>
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
|
|
The pci_dev parameter given to aer_root_reset() may be a Downstream Port
rather than the Root Port. Get the Root Port from the provided device in
order to clear the root's AER status.
Link: https://lore.kernel.org/r/20210104230300.1277180-3-kbusch@kernel.org
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
|
|
Error handling operates on the first Downstream Port above the detected
error, but the error may have been reported by a downstream device.
Clear the AER status of the device that reported the error rather than
the first Downstream Port.
Link: https://lore.kernel.org/r/20210104230300.1277180-2-kbusch@kernel.org
Tested-by: Hedi Berriche <hedi.berriche@hpe.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Sean V Kelley <sean.v.kelley@intel.com>
Acked-by: Hedi Berriche <hedi.berriche@hpe.com>
|
|
Pull sparc updates from David Miller:
"A host of mall cleanups and adjustments that have accumulated while I
was away, nothing major"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: (26 commits)
sparc: make xchg() into a statement expression
sparc64: Use arch_validate_flags() to validate ADI flag
sparc32: Fix comparing pointer to 0 coccicheck warning
sparc: fix led.c driver when PROC_FS is not enabled
sparc: Fix handling of page table constructor failure
sparc64: only select COMPAT_BINFMT_ELF if BINFMT_ELF is set
tty: hvcs: Drop unnecessary if block
tty: vcc: Drop unnecessary if block
tty: vcc: Drop impossible to hit WARN_ON
sparc: sparc64_defconfig: add necessary configs for qemu
sparc64: switch defconfig from the legacy ide driver to libata
sparc32: Preserve clone syscall flags argument for restarts due to signals
sparc32: Limit memblock allocation to low memory
sparc: Replace test_ti_thread_flag() with test_tsk_thread_flag()
sbus: char: Remove meaningless jump label out_free
sparc32: signal: Fix stack trampoline for RT signals
sparc: remove SA_STATIC_ALLOC macro definition
sparc: use for_each_child_of_node() macro
sparc: Use fallthrough pseudo-keyword
sparc32: srmmu: improve type safety of __nocache_fix()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"We have couple of drivers removed a new driver and bunch of new device
support and few updates to drivers for this round.
New drivers/devices:
- Intel LGM SoC DMA driver
- Actions Semi S500 DMA controller
- Renesas r8a779a0 dma controller
- Ingenic JZ4760(B) dma controller
- Intel KeemBay AxiDMA controller
Removed:
- Coh901318 dma driver
- Zte zx dma driver
- Sirfsoc dma driver
Updates:
- mmp_pdma, mmp_tdma gained module support
- imx-sdma become modern and dropped platform data support
- dw-axi driver gained slave and cyclic dma support"
* tag 'dmaengine-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (58 commits)
dmaengine: dw-axi-dmac: remove redundant null check on desc
dmaengine: xilinx_dma: Alloc tx descriptors GFP_NOWAIT
dmaengine: dw-axi-dmac: Virtually split the linked-list
dmaengine: dw-axi-dmac: Set constraint to the Max segment size
dmaengine: dw-axi-dmac: Add Intel KeemBay AxiDMA BYTE and HALFWORD registers
dmaengine: dw-axi-dmac: Add Intel KeemBay AxiDMA handshake
dmaengine: dw-axi-dmac: Add Intel KeemBay AxiDMA support
dmaengine: drivers: Kconfig: add HAS_IOMEM dependency to DW_AXI_DMAC
dmaengine: dw-axi-dmac: Add Intel KeemBay DMA register fields
dt-binding: dma: dw-axi-dmac: Add support for Intel KeemBay AxiDMA
dmaengine: dw-axi-dmac: Support burst residue granularity
dmaengine: dw-axi-dmac: Support of_dma_controller_register()
dmaegine: dw-axi-dmac: Support device_prep_dma_cyclic()
dmaengine: dw-axi-dmac: Support device_prep_slave_sg
dmaengine: dw-axi-dmac: Add device_config operation
dmaengine: dw-axi-dmac: Add device_synchronize() callback
dmaengine: dw-axi-dmac: move dma_pool_create() to alloc_chan_resources()
dmaengine: dw-axi-dmac: simplify descriptor management
dt-bindings: dma: Add YAML schemas for dw-axi-dmac
dmaengine: ti: k3-psil: optimize struct psil_endpoint_config for size
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"Fix race condition in generic_serial_bus (I2C) and GPIO Operation
Region handling in ACPICA and reduce some related code duplication
(Hans de Goede)"
* tag 'acpi-5.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Remove some code duplication from acpi_ev_address_space_dispatch
ACPICA: Fix race in generic_serial_bus (I2C) and GPIO op_region parameter handling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are fixes and cleanups on top of the power management material
for 5.12-rc1 merged previously.
Specifics:
- Address cpufreq regression introduced in 5.11 that causes CPU
frequency reporting to be distorted on systems with CPPC that use
acpi-cpufreq as the scaling driver (Rafael Wysocki).
- Fix regression introduced during the 5.10 development cycle related
to CPU hotplug and policy recreation in the qcom-cpufreq-hw driver
(Shawn Guo).
- Fix recent regression in the operating performance points (OPP)
framework that may cause frequency updates to be skipped by mistake
in some cases (Jonathan Marek).
- Simplify schedutil governor code and remove a misleading comment
from it (Yue Hu).
- Fix kerneldoc comment typo in the cpufreq core (Yue Hu)"
* tag 'pm-5.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: Fix typo in kerneldoc comment
cpufreq: schedutil: Remove update_lock comment from struct sugov_policy definition
cpufreq: schedutil: Remove needless sg_policy parameter from ignore_dl_rate_limit()
cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is known
cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks
opp: Don't skip freq update for different frequency
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
"Mostly existing driver fixes plus a new driver for game controllers
directly connected to Nintendo 64, and an enhancement for keyboards
driven by Chrome OS EC to communicate layout of the top row to
userspace"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (47 commits)
Input: st1232 - fix NORMAL vs. IDLE state handling
Input: aiptek - convert sysfs sprintf/snprintf family to sysfs_emit
Input: alps - fix spelling of "positive"
ARM: dts: cros-ec-keyboard: Use keymap macros
dt-bindings: input: Fix the keymap for LOCK key
dt-bindings: input: Create macros for cros-ec keymap
Input: cros-ec-keyb - expose function row physical map to userspace
dt-bindings: input: cros-ec-keyb: Add a new property describing top row
Input: applespi - fix occasional crc errors under load.
Input: applespi - don't wait for responses to commands indefinitely.
Input: st1232 - add IDLE state as ready condition
Input: zinitix - fix return type of zinitix_init_touch()
Input: i8042 - add ASUS Zenbook Flip to noselftest list
Input: add missing dependencies on CONFIG_HAS_IOMEM
Input: joydev - prevent potential read overflow in ioctl
Input: elo - fix an error code in elo_connect()
Input: xpad - add support for PowerA Enhanced Wired Controller for Xbox Series X|S
Input: sur40 - fix an error code in sur40_probe()
Input: elants_i2c - detect enum overflow
Input: zinitix - remove unneeded semicolon
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Jiri Kosina:
- support for "Unified Battery" feature on Logitech devices from Filipe
Laíns
- power management improvements for intel-ish driver from Zhang Lixu
- support for Goodix devices from Douglas Anderson
- improved handling of generic HID keyboard in order to make it easier
for userspace to figure out the details of the device, from Dmitry
Torokhov
- Playstation DualSense support from Roderick Colenbrander
- other assorted small fixes and device ID additions.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (49 commits)
HID: playstation: add DualSense player LED support.
HID: playstation: add microphone mute support for DualSense.
HID: playstation: add initial DualSense lightbar support.
HID: wacom: Ignore attempts to overwrite the touch_max value from HID
HID: playstation: fix array size comparison (off-by-one)
HID: playstation: fix unused variable in ps_battery_get_property.
HID: playstation: report DualSense hardware and firmware version.
HID: playstation: add DualSense classic rumble support.
HID: playstation: add DualSense Bluetooth support.
HID: playstation: track devices in list.
HID: playstation: add DualSense accelerometer and gyroscope support.
HID: playstation: add DualSense touchpad support.
HID: playstation: add DualSense battery support.
HID: playstation: use DualSense MAC address as unique identifier.
HID: playstation: initial DualSense USB support.
HID: ite: Enable QUIRK_TOUCHPAD_ON_OFF_REPORT on Acer Aspire Switch 10E
HID: Ignore battery for Elan touchscreen on HP Spectre X360 15-df0xxx
HID: logitech-dj: add support for the new lightspeed connection iteration
HID: intel-ish-hid: ipc: Add Tiger Lake H PCI device ID
HID: logitech-dj: add support for keyboard events in eQUAD step 4 Gaming
...
|
|
Commit 0da6bcd9fcc0 ("scripts: dtc: Build fdtoverlay tool") enabled
building fdtoverlay, but failed to add it to .gitignore.
Also add a note to keep hostprogs in sync with .gitignore.
Fixes: 0da6bcd9fcc0 ("scripts: dtc: Build fdtoverlay tool")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Restore the previous behavior by using the correct flag for the whole device
("part0").
Fixes: 99dfc43ecbf6 ("block: use ->bi_bdev for bio based I/O accounting")
Reported-by: John Stultz <john.stultz@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When doing non-clean builds and switching between CONFIG_LTO=n and
CONFIG_LTO=y, the build system (correctly) didn't notice that assembly
and LTO-excluded C object files were rewritten in place by objtool (to
add the .orc_unwind* sections), since their build command lines were the
same between CONFIG_LTO=y and CONFIG_LTO=n. The objtool step would fail:
vmlinux.o: warning: objtool: file already has .orc_unwind section, skipping
make: *** [Makefile:1194: vmlinux] Error 255
Avoid this by making sure the build will see a difference between an LTO
and non-LTO build (by including "-fno-lto" in KBUILD_*FLAGS). This will
get ignored when CC_FLAGS_LTO is present, and will not be included at
all when CONFIG_LTO=n.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Log space and revoke accounting rework to fix some failed asserts.
- Local resource group glock sharing for better local performance.
- Add support for version 1802 filesystems: trusted xattr support and
'-o rgrplvb' mounts by default.
- Actually synchronize on the inode glock's FREEING bit during withdraw
("gfs2: fix glock confusion in function signal_our_withdraw").
- Fix parallel recovery of multiple journals ("gfs2: keep bios separate
for each journal").
- Various other bug fixes.
* tag 'gfs2-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (49 commits)
gfs2: Don't get stuck with I/O plugged in gfs2_ail1_flush
gfs2: Per-revoke accounting in transactions
gfs2: Rework the log space allocation logic
gfs2: Minor calc_reserved cleanup
gfs2: Use resource group glock sharing
gfs2: Allow node-wide exclusive glock sharing
gfs2: Add local resource group locking
gfs2: Add per-reservation reserved block accounting
gfs2: Rename rs_{free -> requested} and rd_{reserved -> requested}
gfs2: Check for active reservation in gfs2_release
gfs2: Don't search for unreserved space twice
gfs2: Only pass reservation down to gfs2_rbm_find
gfs2: Also reflect single-block allocations in rgd->rd_extfail_pt
gfs2: Recursive gfs2_quota_hold in gfs2_iomap_end
gfs2: Add trusted xattr support
gfs2: Enable rgrplvb for sb_fs_format 1802
gfs2: Don't skip dlm unlock if glock has an lvb
gfs2: Lock imbalance on error path in gfs2_recover_one
gfs2: Move function gfs2_ail_empty_tr
gfs2: Get rid of current_tail()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull idmapped mounts from Christian Brauner:
"This introduces idmapped mounts which has been in the making for some
time. Simply put, different mounts can expose the same file or
directory with different ownership. This initial implementation comes
with ports for fat, ext4 and with Christoph's port for xfs with more
filesystems being actively worked on by independent people and
maintainers.
Idmapping mounts handle a wide range of long standing use-cases. Here
are just a few:
- Idmapped mounts make it possible to easily share files between
multiple users or multiple machines especially in complex
scenarios. For example, idmapped mounts will be used in the
implementation of portable home directories in
systemd-homed.service(8) where they allow users to move their home
directory to an external storage device and use it on multiple
computers where they are assigned different uids and gids. This
effectively makes it possible to assign random uids and gids at
login time.
- It is possible to share files from the host with unprivileged
containers without having to change ownership permanently through
chown(2).
- It is possible to idmap a container's rootfs and without having to
mangle every file. For example, Chromebooks use it to share the
user's Download folder with their unprivileged containers in their
Linux subsystem.
- It is possible to share files between containers with
non-overlapping idmappings.
- Filesystem that lack a proper concept of ownership such as fat can
use idmapped mounts to implement discretionary access (DAC)
permission checking.
- They allow users to efficiently changing ownership on a per-mount
basis without having to (recursively) chown(2) all files. In
contrast to chown (2) changing ownership of large sets of files is
instantenous with idmapped mounts. This is especially useful when
ownership of a whole root filesystem of a virtual machine or
container is changed. With idmapped mounts a single syscall
mount_setattr syscall will be sufficient to change the ownership of
all files.
- Idmapped mounts always take the current ownership into account as
idmappings specify what a given uid or gid is supposed to be mapped
to. This contrasts with the chown(2) syscall which cannot by itself
take the current ownership of the files it changes into account. It
simply changes the ownership to the specified uid and gid. This is
especially problematic when recursively chown(2)ing a large set of
files which is commong with the aforementioned portable home
directory and container and vm scenario.
- Idmapped mounts allow to change ownership locally, restricting it
to specific mounts, and temporarily as the ownership changes only
apply as long as the mount exists.
Several userspace projects have either already put up patches and
pull-requests for this feature or will do so should you decide to pull
this:
- systemd: In a wide variety of scenarios but especially right away
in their implementation of portable home directories.
https://systemd.io/HOME_DIRECTORY/
- container runtimes: containerd, runC, LXD:To share data between
host and unprivileged containers, unprivileged and privileged
containers, etc. The pull request for idmapped mounts support in
containerd, the default Kubernetes runtime is already up for quite
a while now: https://github.com/containerd/containerd/pull/4734
- The virtio-fs developers and several users have expressed interest
in using this feature with virtual machines once virtio-fs is
ported.
- ChromeOS: Sharing host-directories with unprivileged containers.
I've tightly synced with all those projects and all of those listed
here have also expressed their need/desire for this feature on the
mailing list. For more info on how people use this there's a bunch of
talks about this too. Here's just two recent ones:
https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf
https://fosdem.org/2021/schedule/event/containers_idmap/
This comes with an extensive xfstests suite covering both ext4 and
xfs:
https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts
It covers truncation, creation, opening, xattrs, vfscaps, setid
execution, setgid inheritance and more both with idmapped and
non-idmapped mounts. It already helped to discover an unrelated xfs
setgid inheritance bug which has since been fixed in mainline. It will
be sent for inclusion with the xfstests project should you decide to
merge this.
In order to support per-mount idmappings vfsmounts are marked with
user namespaces. The idmapping of the user namespace will be used to
map the ids of vfs objects when they are accessed through that mount.
By default all vfsmounts are marked with the initial user namespace.
The initial user namespace is used to indicate that a mount is not
idmapped. All operations behave as before and this is verified in the
testsuite.
Based on prior discussions we want to attach the whole user namespace
and not just a dedicated idmapping struct. This allows us to reuse all
the helpers that already exist for dealing with idmappings instead of
introducing a whole new range of helpers. In addition, if we decide in
the future that we are confident enough to enable unprivileged users
to setup idmapped mounts the permission checking can take into account
whether the caller is privileged in the user namespace the mount is
currently marked with.
The user namespace the mount will be marked with can be specified by
passing a file descriptor refering to the user namespace as an
argument to the new mount_setattr() syscall together with the new
MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern
of extensibility.
The following conditions must be met in order to create an idmapped
mount:
- The caller must currently have the CAP_SYS_ADMIN capability in the
user namespace the underlying filesystem has been mounted in.
- The underlying filesystem must support idmapped mounts.
- The mount must not already be idmapped. This also implies that the
idmapping of a mount cannot be altered once it has been idmapped.
- The mount must be a detached/anonymous mount, i.e. it must have
been created by calling open_tree() with the OPEN_TREE_CLONE flag
and it must not already have been visible in the filesystem.
The last two points guarantee easier semantics for userspace and the
kernel and make the implementation significantly simpler.
By default vfsmounts are marked with the initial user namespace and no
behavioral or performance changes are observed.
The manpage with a detailed description can be found here:
https://git.kernel.org/brauner/man-pages/c/1d7b902e2875a1ff342e036a9f866a995640aea8
In order to support idmapped mounts, filesystems need to be changed
and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The
patches to convert individual filesystem are not very large or
complicated overall as can be seen from the included fat, ext4, and
xfs ports. Patches for other filesystems are actively worked on and
will be sent out separately. The xfstestsuite can be used to verify
that port has been done correctly.
The mount_setattr() syscall is motivated independent of the idmapped
mounts patches and it's been around since July 2019. One of the most
valuable features of the new mount api is the ability to perform
mounts based on file descriptors only.
Together with the lookup restrictions available in the openat2()
RESOLVE_* flag namespace which we added in v5.6 this is the first time
we are close to hardened and race-free (e.g. symlinks) mounting and
path resolution.
While userspace has started porting to the new mount api to mount
proper filesystems and create new bind-mounts it is currently not
possible to change mount options of an already existing bind mount in
the new mount api since the mount_setattr() syscall is missing.
With the addition of the mount_setattr() syscall we remove this last
restriction and userspace can now fully port to the new mount api,
covering every use-case the old mount api could. We also add the
crucial ability to recursively change mount options for a whole mount
tree, both removing and adding mount options at the same time. This
syscall has been requested multiple times by various people and
projects.
There is a simple tool available at
https://github.com/brauner/mount-idmapped
that allows to create idmapped mounts so people can play with this
patch series. I'll add support for the regular mount binary should you
decide to pull this in the following weeks:
Here's an example to a simple idmapped mount of another user's home
directory:
u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt
u1001@f2-vm:/$ ls -al /home/ubuntu/
total 28
drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
drwxr-xr-x 4 root root 4096 Oct 28 04:00 ..
-rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
-rw-r--r-- 1 ubuntu ubuntu 220 Feb 25 2020 .bash_logout
-rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25 2020 .bashrc
-rw-r--r-- 1 ubuntu ubuntu 807 Feb 25 2020 .profile
-rw-r--r-- 1 ubuntu ubuntu 0 Oct 16 16:11 .sudo_as_admin_successful
-rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo
u1001@f2-vm:/$ ls -al /mnt/
total 28
drwxr-xr-x 2 u1001 u1001 4096 Oct 28 22:07 .
drwxr-xr-x 29 root root 4096 Oct 28 22:01 ..
-rw------- 1 u1001 u1001 3154 Oct 28 22:12 .bash_history
-rw-r--r-- 1 u1001 u1001 220 Feb 25 2020 .bash_logout
-rw-r--r-- 1 u1001 u1001 3771 Feb 25 2020 .bashrc
-rw-r--r-- 1 u1001 u1001 807 Feb 25 2020 .profile
-rw-r--r-- 1 u1001 u1001 0 Oct 16 16:11 .sudo_as_admin_successful
-rw------- 1 u1001 u1001 1144 Oct 28 00:43 .viminfo
u1001@f2-vm:/$ touch /mnt/my-file
u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file
u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file
u1001@f2-vm:/$ ls -al /mnt/my-file
-rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file
u1001@f2-vm:/$ ls -al /home/ubuntu/my-file
-rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file
u1001@f2-vm:/$ getfacl /mnt/my-file
getfacl: Removing leading '/' from absolute path names
# file: mnt/my-file
# owner: u1001
# group: u1001
user::rw-
user:u1001:rwx
group::rw-
mask::rwx
other::r--
u1001@f2-vm:/$ getfacl /home/ubuntu/my-file
getfacl: Removing leading '/' from absolute path names
# file: home/ubuntu/my-file
# owner: ubuntu
# group: ubuntu
user::rw-
user:ubuntu:rwx
group::rw-
mask::rwx
other::r--"
* tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits)
xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl
xfs: support idmapped mounts
ext4: support idmapped mounts
fat: handle idmapped mounts
tests: add mount_setattr() selftests
fs: introduce MOUNT_ATTR_IDMAP
fs: add mount_setattr()
fs: add attr_flags_to_mnt_flags helper
fs: split out functions to hold writers
namespace: only take read lock in do_reconfigure_mnt()
mount: make {lock,unlock}_mount_hash() static
namespace: take lock_mount_hash() directly when changing flags
nfs: do not export idmapped mounts
overlayfs: do not mount on top of idmapped mounts
ecryptfs: do not mount on top of idmapped mounts
ima: handle idmapped mounts
apparmor: handle idmapped mounts
fs: make helpers idmap mount aware
exec: handle idmapped mounts
would_dump: handle idmapped mounts
...
|
|
The debug check must be done after unregister_netdevice_many() call --
the hlist_del_rcu() for this is done inside .ndo_stop.
This is the same with commit 0fda7600c2e1 ("geneve: move debug check after
netdev unregister")
Test commands:
ip netns del A
ip netns add A
ip netns add B
ip netns exec B ip link add vxlan0 type vxlan vni 100 local 10.0.0.1 \
remote 10.0.0.2 dstport 4789 srcport 4789 4789
ip netns exec B ip link set vxlan0 netns A
ip netns exec A ip link set vxlan0 up
ip netns del B
Splat looks like:
[ 73.176249][ T7] ------------[ cut here ]------------
[ 73.178662][ T7] WARNING: CPU: 4 PID: 7 at drivers/net/vxlan.c:4743 vxlan_exit_batch_net+0x52e/0x720 [vxlan]
[ 73.182597][ T7] Modules linked in: vxlan openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_core nfp mlxfw ixgbevf tls sch_fq_codel nf_tables nfnetlink ip_tables x_tables unix
[ 73.190113][ T7] CPU: 4 PID: 7 Comm: kworker/u16:0 Not tainted 5.11.0-rc7+ #838
[ 73.193037][ T7] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 73.196986][ T7] Workqueue: netns cleanup_net
[ 73.198946][ T7] RIP: 0010:vxlan_exit_batch_net+0x52e/0x720 [vxlan]
[ 73.201509][ T7] Code: 00 01 00 00 0f 84 39 fd ff ff 48 89 ca 48 c1 ea 03 80 3c 1a 00 0f 85 a6 00 00 00 89 c2 48 83 c2 02 49 8b 14 d4 48 85 d2 74 ce <0f> 0b eb ca e8 b9 51 db dd 84 c0 0f 85 4a fe ff ff 48 c7 c2 80 bc
[ 73.208813][ T7] RSP: 0018:ffff888100907c10 EFLAGS: 00010286
[ 73.211027][ T7] RAX: 000000000000003c RBX: dffffc0000000000 RCX: ffff88800ec411f0
[ 73.213702][ T7] RDX: ffff88800a278000 RSI: ffff88800fc78c70 RDI: ffff88800fc78070
[ 73.216169][ T7] RBP: ffff88800b5cbdc0 R08: fffffbfff424de61 R09: fffffbfff424de61
[ 73.218463][ T7] R10: ffffffffa126f307 R11: fffffbfff424de60 R12: ffff88800ec41000
[ 73.220794][ T7] R13: ffff888100907d08 R14: ffff888100907c50 R15: ffff88800fc78c40
[ 73.223337][ T7] FS: 0000000000000000(0000) GS:ffff888114800000(0000) knlGS:0000000000000000
[ 73.225814][ T7] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 73.227616][ T7] CR2: 0000562b5cb4f4d0 CR3: 0000000105fbe001 CR4: 00000000003706e0
[ 73.229700][ T7] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 73.231820][ T7] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 73.233844][ T7] Call Trace:
[ 73.234698][ T7] ? vxlan_err_lookup+0x3c0/0x3c0 [vxlan]
[ 73.235962][ T7] ? ops_exit_list.isra.11+0x93/0x140
[ 73.237134][ T7] cleanup_net+0x45e/0x8a0
[ ... ]
Fixes: 57b61127ab7d ("vxlan: speedup vxlan tunnels dismantle")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20210221154552.11749-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pass code model and stack alignment to the linker as these are not
stored in LLVM bitcode, and allow CONFIG_LTO_CLANG* to be enabled.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
|
|
Clang incorrectly inlines functions with differing stack protector
attributes, which breaks __restore_processor_state() that relies on
stack protector being disabled. This change disables LTO for cpu.c
to work aroung the bug.
Link: https://bugs.llvm.org/show_bug.cgi?id=47479
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
|
|
Disable LTO for the vDSO. Note that while we could use Clang's LTO
for the 64-bit vDSO, it won't add noticeable benefit for the small
amount of C code.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
|
|
With LTO, LLVM bitcode won't be compiled into native code until
modpost_link, or modfinal for modules. This change postpones calls
to objtool until after these steps, and moves objtool_args to
Makefile.lib, so the arguments can be reused in Makefile.modfinal.
As we didn't have objects to process earlier, we use --duplicate
when processing vmlinux.o. This change also disables unreachable
instruction warnings with LTO to avoid warnings about the int3
padding between functions.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
|
|
This change adds a --noinstr flag to objtool to allow us to specify
that we're processing vmlinux.o without also enabling noinstr
validation. This is needed to avoid false positives with LTO when we
run objtool on vmlinux.o without CONFIG_DEBUG_ENTRY.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
|
|
Select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION is selected to use
objtool to generate __mcount_loc sections for dynamic ftrace with
Clang and gcc <5 (later versions of gcc use -mrecord-mcount).
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
|
|
This change adds build support for using objtool to generate
__mcount_loc sections.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
|
|
With LTO, we run objtool on vmlinux.o, but don't want noinstr
validation. This change requires --vmlinux to be passed to objtool
explicitly.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
|
|
When objtool generates relocations for the __mcount_loc section, it
tries to reference __fentry__ calls by their section symbol offset.
However, this fails with Clang's integrated assembler as it may not
generate section symbols for every section. This patch looks up a
function symbol instead if the section symbol is missing, similarly
to commit e81e07244325 ("objtool: Support Clang non-section symbols
in ORC generation").
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
|
|
Add the --mcount option for generating __mcount_loc sections
needed for dynamic ftrace. Using this pass requires the kernel to
be compiled with -mfentry and CC_USING_NOP_MCOUNT to be defined
in Makefile.
Link: https://lore.kernel.org/lkml/20200625200235.GQ4781@hirez.programming.kicks-ass.net/
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
[Sami: rebased, dropped config changes, fixed to actually use --mcount,
and wrote a commit message.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
|
|
Hayes Wang says:
====================
r8152: minor adjustments
These patches are used to adjust the code.
====================
Link: https://lore.kernel.org/r/1394712342-15778-341-Taiwan-albertk@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add rtl_eee_plus_en() and rtl_green_en().
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some messages are before calling register_netdev(), so replace
netif_err() with dev_err().
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Return error code if autosuspend_en, eee_get, or eee_set don't exist.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
U1/U2 shoued be enabled for USB 3.0 or later. The USB 2.0 doesn't
support it.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu
Pull percpu updates from Dennis Zhou:
"Percpu had a cleanup come in that makes use of the cpu bitmask helpers
instead of the current iterative approach.
This clean up then had an adverse interaction when clang's inlining
sensitivity is changed such that not all sites are inlined resulting
in modpost being upset with section mismatch due to percpu setup being
marked __init.
That was fixed by introducing __flatten to compiler_attributes.h"
* 'for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
percpu: fix clang modpost section mismatch
percpu: reduce the number of cpu distance comparisons
|
|
The NanoPi M4B is a minor revision of the original M4.
The differences against the original Nanopi M4 that are common with the
other M4V2 revision include:
- microphone header removed
- power button added
- recovery button added
Additional changes specific to the M4B:
- USB 3.0 hub removed; board now has 2x USB 3.0 type-A ports and 2x
USB 2.0 ports
- ADB toggle switch added; this changes the top USB 3.0 host port to
a peripheral port
- Type-C port no longer supports data or PD
- WiFi/Bluetooth combo chip switched to AP6256, which supports BT 5.0
but only 1T1R (down from 2T2R) for WiFi
Add a compatible string for the new board revision.
Link: https://lore.kernel.org/r/20210121162321.4538-3-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rob Herring <robh@kernel.org>
|
|
The Rockchip PCIe controller DT binding clearly states that 'ep-gpios' is
an optional property. And indeed there are boards that don't require it.
Make the driver follow the binding by using devm_gpiod_get_optional()
instead of devm_gpiod_get().
[bhelgaas: tidy whitespace]
Link: https://lore.kernel.org/r/20210121162321.4538-2-wens@kernel.org
Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support")
Fixes: 956cd99b35a8 ("PCI: rockchip: Separate common code from RC driver")
Fixes: 964bac9455be ("PCI: rockchip: Split out rockchip_pcie_parse_dt() to parse DT")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Add invalid and reply flags validate in the fl_validate_ct_state.
This makes the checking complete if compared to ovs'
validate_ct_state().
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/1614064315-364-1-git-send-email-wenxu@ucloud.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Florian Fainelli says:
====================
net: dsa: Learning fixes for b53/bcm_sf2
This patch series contains a couple of fixes for the b53/bcm_sf2 drivers
with respect to configuring learning.
The first patch is wiring-up the necessary dsa_switch_ops operations in
order to support the offloading of bridge flags.
The second patch corrects the switch driver's default learning behavior
which was unfortunately wrong from day one.
This is submitted against "net" because this is technically a bug fix
since ports should not have had learning enabled by default but given
this is dependent upon Vladimir's recent br_flags series, there is no
Fixes tag provided.
I will be providing targeted stable backports that look a bit
different.
Changes in v2:
- added first patch
- updated second patch to include BR_LEARNING check in br_flags_pre as
a support bridge flag to offload
====================
Link: https://lore.kernel.org/r/20210222223010.2907234-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for being able to set the learning attribute on port, and
make sure that the standalone ports start up with learning disabled.
We can remove the code in bcm_sf2 that configured the ports learning
attribute because we want the standalone ports to have learning disabled
by default and port 7 cannot be bridged, so its learning attribute will
not change past its initial configuration.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Because bcm_sf2 implements its own dsa_switch_ops we need to export the
b53_br_flags_pre(), b53_br_flags() and b53_set_mrouter so we can wire-up
them up like they used to be with the former b53_br_egress_floods().
Fixes: a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The removal of EXPORT_UNUSED_SYMBOL() in commit 367948220fce looks like
(and was sold as) a no-op, but it actually had a rather serious and
subtle side effect: the UNUSED_SYMBOLS option not only enabled the
removed (unused) functionality, it also _disabled_ the TRIM_UNUSED_KSYMS
functionality.
And it turns out that TRIM_UNUSED_KSYMS is a huge time waste, and takes
up a third of the kernel build time for me. For no actual upside, since
no distro is likely to ever be able to enable it (because they all
support external kernel modules).
Rather than re-enable EXPORT_UNUSED_SYMBOL, this just disables the
TRIM_UNUSED_KSYMS option by marking it broken. I'm tempted to just
remove the support entirely, but maybe somebody has a use-case and can
fix the behavior of it.
I could have just disabled it for COMPILE_TEST, but it really smells
like the TRIM_UNUSED_KSYMS option is badly done and not really useful,
so this takes the more direct approach - let's see if anybody ever
actually notices or complains.
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jessica Yu <jeyu@kernel.org>
Fixes: 367948220fce ("module: remove EXPORT_UNUSED_SYMBOL*")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
sky2.c driver uses netdev_warn() before the net device is initialized.
Fix it by using dev_warn() instead.
Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
Link: https://lore.kernel.org/r/m3a6s1r1ul.fsf@t19.piap.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add documentation to help users use pci-epf-ntb function driver and
existing host side NTB infrastructure for NTB functionality.
[bhelgaas: fix a few typos]
Link: https://lore.kernel.org/r/20210201195809.7342-18-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
|
|
In ndo_stop functions, netdev_completed_queue() is called during forced
tx reclaim, after netdev_reset_queue(). This may trigger kernel panic if
there is any tx skb left.
This patch moves netdev_reset_queue() to after tx reclaim, so BQL can
complete successfully then reset.
Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
Fixes: 4c59b0f5543d ("bcm63xx_enet: add BQL support")
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210222013530.1356-1-liew.s.piaw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
function
Add binding documentation for pci-ntb endpoint function that helps in
adding and configuring pci-ntb endpoint function.
Link: https://lore.kernel.org/r/20210201195809.7342-17-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Add support for EPF PCI Non-Transparent Bridge (NTB) devices. This driver
is platform independent and may be used by any platform that has multiple
PCI endpoint instances configured using the pci-epf-ntb driver. The driver
connnects to the standard NTB subsystem interface. The EPF NTB device has a
configurable number of memory windows (max 4), a configurable number of
doorbells (max 32), and a configurable number of scratch-pad registers.
Link: https://lore.kernel.org/r/20210201195809.7342-16-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
|