Age | Commit message (Collapse) | Author |
|
Currently both iommu_alloc_coherent() and iommu_free_coherent() align the
desired allocation size to PAGE_SIZE, and gets system pages and IOMMU
mappings (TCEs) for that value.
When IOMMU_PAGE_SIZE < PAGE_SIZE, this behavior may cause unnecessary
TCEs to be created for mapping the whole system page.
Example:
- PAGE_SIZE = 64k, IOMMU_PAGE_SIZE() = 4k
- iommu_alloc_coherent() is called for 128 bytes
- 1 system page (64k) is allocated
- 16 IOMMU pages (16 x 4k) are allocated (16 TCEs used)
It would be enough to use a single TCE for this, so 15 TCEs are
wasted in the process.
Update iommu_*_coherent() to make sure the size alignment happens only
for IOMMU_PAGE_SIZE() before calling iommu_alloc() and iommu_free().
Also, on iommu_range_alloc(), replace ALIGN(n, 1 << tbl->it_page_shift)
with IOMMU_PAGE_ALIGN(n, tbl), which is easier to read and does the
same.
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210318174414.684630-1-leobras.c@gmail.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- GVT's BDW regression fix for cmd parser (Zhenyu)
- Fix modesetting in case of unexpected AUX timeouts (Imre)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YIGZ3pQPgPQtZtyI@intel.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.12-2021-04-21:
amdgpu:
- Fix gpuvm page table update issue
- Modifier fixes
- Register fix for dimgrey cavefish
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210421220456.3839-1-alexander.deucher@amd.com
|
|
Pull virtio fixes from Michael Tsirkin:
"Very late in the cycle but both risky if left unfixed and more or less
obvious.."
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails
vhost-vdpa: protect concurrent access to vhost device iotlb
|
|
Set err = -ENOMEM if dma_map_sg_attrs() fails so the function reutrns
error.
Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20210411083646.910546-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
|
|
Protect vhost device iotlb by vhost_dev->mutex. Otherwise,
it might cause corruption of the list and interval tree in
struct vhost_iotlb if userspace sends the VHOST_IOTLB_MSG_V2
message concurrently.
Fixes: 4c8cf318("vhost: introduce vDPA-based backend")
Cc: stable@vger.kernel.org
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210412095512.178-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Kunihiko Hayashi says:
====================
Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
UniPhier PXs2, LD20, and PXs3 boards have RTL8211E ethernet phy, and the
phy have the RX/TX delays of RGMII interface using pull-ups on the RXDLY
and TXDLY pins.
After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx
delay config"), the delays are working correctly, however, "rgmii" means
no delay and the phy doesn't work. So need to set the phy-mode to
"rgmii-id" to show that RX/TX delays are enabled.
Changes since v1:
- Fix the commit message
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RTL8211E
UniPhier LD20 and PXs3 boards have RTL8211E ethernet phy, and the phy have
the RX/TX delays of RGMII interface using pull-ups on the RXDLY and TXDLY
pins.
After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx
delay config"), the delays are working correctly, however, "rgmii" means
no delay and the phy doesn't work. So need to set the phy-mode to
"rgmii-id" to show that RX/TX delays are enabled.
Fixes: c73730ee4c9a ("arm64: dts: uniphier: add AVE ethernet node")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RTL8211E
UniPhier PXs2 boards have RTL8211E ethernet phy, and the phy have the RX/TX
delays of RGMII interface using pull-ups on the RXDLY and TXDLY pins.
After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx
delay config"), the delays are working correctly, however, "rgmii" means
no delay and the phy doesn't work. So need to set the phy-mode to
"rgmii-id" to show that RX/TX delays are enabled.
Fixes: e3cc931921d2 ("ARM: dts: uniphier: add AVE ethernet node")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Mohammad Athari Bin Ismail says:
====================
Enable DWMAC HW descriptor prefetch
This patch series to add setting for HW descriptor prefetch for DWMAC
version 5.20 onwards. For Intel platform, enable the capability by
default.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enable HW descriptor prefetch by default by setting plat->dma_cfg->dche =
true in intel_mgbe_common_data(). Need to be noted that this capability
only be supported in DWMAC core version 5.20 onwards. In stmmac, there is
a checking to check the core version. If the core version is below 5.20,
this capability wouldn`t be configured.
Below is the iperf result comparison between HW descriptor prefetch
disabled(DCHE=0b) and enabled(DCHE=1b). Tested on Intel Elkhartlake
platform with DWMAC Core 5.20. Observed line rate performance
improvement with HW descriptor prefetch enabled.
DCHE = 0b
[ 5] local 169.254.1.162 port 42123 connected to 169.254.244.142 port 5201
[ ID] Interval Transfer Bitrate Total Datagrams
[ 5] 0.00-1.00 sec 96.7 MBytes 811 Mbits/sec 70050
[ 5] 1.00-2.00 sec 96.5 MBytes 809 Mbits/sec 69850
[ 5] 2.00-3.00 sec 96.3 MBytes 808 Mbits/sec 69720
[ 5] 3.00-4.00 sec 95.9 MBytes 804 Mbits/sec 69450
[ 5] 4.00-5.00 sec 96.0 MBytes 806 Mbits/sec 69530
[ 5] 5.00-6.00 sec 96.8 MBytes 812 Mbits/sec 70080
[ 5] 6.00-7.00 sec 96.9 MBytes 813 Mbits/sec 70140
[ 5] 7.00-8.00 sec 96.8 MBytes 812 Mbits/sec 70080
[ 5] 8.00-9.00 sec 97.0 MBytes 814 Mbits/sec 70230
[ 5] 9.00-10.00 sec 96.9 MBytes 813 Mbits/sec 70170
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.00 sec 966 MBytes 810 Mbits/sec 0.000 ms 0/699300 (0%) sender
[ 5] 0.00-10.00 sec 966 MBytes 810 Mbits/sec 0.011 ms 0/699265 (0%) receiver
DCHE = 1b
[ 5] local 169.254.1.162 port 49740 connected to 169.254.244.142 port 5201
[ ID] Interval Transfer Bitrate Total Datagrams
[ 5] 0.00-1.00 sec 97.9 MBytes 821 Mbits/sec 70880
[ 5] 1.00-2.00 sec 98.1 MBytes 823 Mbits/sec 71060
[ 5] 2.00-3.00 sec 98.2 MBytes 824 Mbits/sec 71140
[ 5] 3.00-4.00 sec 98.2 MBytes 824 Mbits/sec 71090
[ 5] 4.00-5.00 sec 98.1 MBytes 823 Mbits/sec 71050
[ 5] 5.00-6.00 sec 98.1 MBytes 823 Mbits/sec 71040
[ 5] 6.00-7.00 sec 98.1 MBytes 823 Mbits/sec 71050
[ 5] 7.00-8.00 sec 98.2 MBytes 824 Mbits/sec 71140
[ 5] 8.00-9.00 sec 98.2 MBytes 824 Mbits/sec 71120
[ 5] 9.00-10.00 sec 98.3 MBytes 824 Mbits/sec 71150
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.00 sec 981 MBytes 823 Mbits/sec 0.000 ms 0/710720 (0%) sender
[ 5] 0.00-10.00 sec 981 MBytes 823 Mbits/sec 0.041 ms 0/710650 (0%) receiver
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DWMAC Core 5.20 onwards supports HW descriptor prefetching.
Additionally, it also depends on platform specific RTL configuration.
This capability could be enabled by setting DMA_Mode bit-19 (DCHE).
So, to enable this cability, platform must set plat->dma_cfg->dche = true
and the DWMAC core version must be 5.20 onwards. Else, this capability
wouldn`t be configured
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Handling comm_channel_event in mlx4_master_comm_channel uses a double
loop to determine which slaves have requested work. The search is
always started at lowest slave. This leads to unfairness; lower VFs
tends to be prioritized over higher VFs.
The patch uses find_next_bit to determine which slaves to handle.
Fairness is implemented by always starting at the next to the last
start.
An MPI program has been used to measure improvements. It runs 500
ibv_reg_mr, synchronizes with all other instances and then runs 500
ibv_dereg_mr.
The results running 500 processes, time reported is for running 500
calls:
ibv_reg_mr:
Mod. Org.
mlx4_1 403.356ms 424.674ms
mlx4_2 403.355ms 424.674ms
mlx4_3 403.354ms 424.674ms
mlx4_4 403.355ms 424.674ms
mlx4_5 403.357ms 424.677ms
mlx4_6 403.354ms 424.676ms
mlx4_7 403.357ms 424.675ms
mlx4_8 403.355ms 424.675ms
ibv_dereg_mr:
Mod. Org.
mlx4_1 116.408ms 142.818ms
mlx4_2 116.434ms 142.793ms
mlx4_3 116.488ms 143.247ms
mlx4_4 116.679ms 143.230ms
mlx4_5 112.017ms 107.204ms
mlx4_6 112.032ms 107.516ms
mlx4_7 112.083ms 184.195ms
mlx4_8 115.089ms 190.618ms
Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The problem is that bnxt_show_temp() returns long but "rc" is an int
and "len" is a u32. With ternary operations the type promotion is quite
tricky. The negative "rc" is first promoted to u32 and then to long so
it ends up being a high positive value instead of a a negative as we
intended.
Fix this by removing the ternary.
Fixes: d69753fa1ecb ("bnxt_en: return proper error codes in bnxt_show_temp")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There exist some errors "404 Not Found" when I click the link
of "MAINTAINERS" [1], "samples/bpf/" [2] and "selftests" [3]
in the documentation "HOWTO interact with BPF subsystem" [4].
As Alexei Starovoitov suggested, just remove "MAINTAINERS" and
"samples/bpf/" links and use correct link of "selftests".
[1] https://www.kernel.org/doc/html/MAINTAINERS
[2] https://www.kernel.org/doc/html/samples/bpf/
[3] https://www.kernel.org/doc/html/tools/testing/selftests/bpf/
[4] https://www.kernel.org/doc/html/latest/bpf/bpf_devel_QA.html
Fixes: 542228384888 ("bpf, doc: convert bpf_devel_QA.rst to use RST formatting")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/1619062560-30483-1-git-send-email-yangtiezhu@loongson.cn
|
|
Add optional dma-coherent property to binding doc.
Found by 'make dtbs_check' on arm64/amlogic DT files.
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210421204833.18523-2-khilman@baylibre.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Take a pass at cleaning up a bunch of warnings
from 'make dtbs_check' that have crept in.
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210421204833.18523-1-khilman@baylibre.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes
One fix for the MMC card detect on the Pine H64 board
* tag 'sunxi-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS
Link: https://lore.kernel.org/r/45fc5e4d-ef48-4729-a869-79a8f288bb83.lettre@localhost
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
If a generic XDP program changes the destination MAC address from/to
multicast/broadcast, the skb->pkt_type is updated to properly handle
the packet when passed up the stack. When changing the MAC from/to
the NICs MAC, PACKET_HOST/OTHERHOST is not updated, though, making
the behavior different from that of native XDP.
Remember the PACKET_HOST/OTHERHOST state before calling the program
in generic XDP, and update pkt_type accordingly if the destination
MAC address has changed. As eth_type_trans() assumes a default
pkt_type of PACKET_HOST, restore that before calling it.
The use case for this is when a XDP program wants to push received
packets up the stack by rewriting the MAC to the NICs MAC, for
example by cluster nodes sharing MAC addresses.
Fixes: 297249569932 ("net: fix generic XDP to handle if eth header was mangled")
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210419141559.8611-1-martin@strongswan.org
|
|
When the timeout occurs, we still have to run the following process
for releasing patch request. Otherwise, the PHY would keep no link.
Therefore, use break to stop the loop of loading firmware and
release the patch request rather than return the function directly.
Fixes: 4a51b0e8a014 ("r8152: support PHY firmware for RTL8156 series")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-04-22
This series contains updates to virtchnl header file, ice, and iavf
drivers.
Vignesh adds support to warn about potentially malicious VFs; those that
are overflowing the mailbox for the ice driver.
Michal adds support for an allowlist/denylist of VF commands based on
supported capabilities for the ice driver.
Brett adds support for iavf UDP segmentation offload by adding the
capability bit to virtchnl, advertising support in the ice driver, and
enabling it in the iavf driver. He also adds a helper function for
getting the VF VSI for ice.
Colin Ian King removes an unneeded pointer assignment.
Qi enables support in the ice driver to support virtchnl requests from
the iavf to configure its own RSS input set. This includes adding new
capability bits, structures, and commands to virtchnl header file.
Haiyue enables configuring RSS flow hash via ethtool to support TCP, UDP
and SCTP protocols in iavf.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull tpm fix from James Bottomley:
"This is an urgent regression fix for a tpm patch set that went in this
merge window. It looks like a rebase before the original pull request
lost a tpm_try_get_ops() so we have a lock imbalance in our code which
is causing oopses. The original patch was correct on the mailing list.
I'm sending this in agreement with Mimi (as joint maintainers of
trusted keys) because Jarkko is off communing with the Reindeer or
whatever it is Finns do when on holiday"
* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmdd:
KEYS: trusted: Fix TPM reservation for seal/unseal
|
|
Upon file deletion, zero out all fields in ext4_dir_entry2 besides rec_len.
In case sensitive data is stored in filenames, this ensures no potentially
sensitive data is left in the directory entry upon deletion. Also, wipe
these fields upon moving a directory entry during the conversion to an
htree and when splitting htree nodes.
The data wiped may still exist in the journal, but there are future
commits planned to address this.
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
Link: https://lore.kernel.org/r/20210422180834.2242353-1-leah.rumancik@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Eric has noticed that after pagecache read rework, generic/418 is
occasionally failing for ext4 when blocksize < pagesize. In fact, the
pagecache rework just made hard to hit race in ext4 more likely. The
problem is that since ext4 conversion of direct IO writes to iomap
framework (commit 378f32bab371), we update inode size after direct IO
write only after invalidating page cache. Thus if buffered read sneaks
at unfortunate moment like:
CPU1 - write at offset 1k CPU2 - read from offset 0
iomap_dio_rw(..., IOMAP_DIO_FORCE_WAIT);
ext4_readpage();
ext4_handle_inode_extension()
the read will zero out tail of the page as it still sees smaller inode
size and thus page cache becomes inconsistent with on-disk contents with
all the consequences.
Fix the problem by moving inode size update into end_io handler which
gets called before the page cache is invalidated.
Reported-and-tested-by: Eric Whitney <enwlinux@gmail.com>
Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Dave Chinner <dchinner@redhat.com>
Link: https://lore.kernel.org/r/20210415155417.4734-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
There are a few warnings about empty debug macros in this driver:
drivers/net/ethernet/neterion/vxge/vxge-main.c: In function 'vxge_probe':
drivers/net/ethernet/neterion/vxge/vxge-main.c:4480:76: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
4480 | "Failed in enabling SRIOV mode: %d\n", ret);
Change them to proper 'do { } while (0)' expressions to make the
code a little more robust and avoid the warnings.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ensure that the poll system call returns proper error flags when port
is removed (nullified port ops), allowing user side to properly fail,
without further read or write.
Fixes: 9a44c1cc6388 ("net: Add a WWAN subsystem")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the sampling truncation length is invalid (zero), pass the length
of the packet. Without the fix, no payload is reported to user space
when the truncation length is zero.
Fixes: a8700c3dd0a4 ("netdevsim: Add dummy psample implementation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A link time bug that I had fixed before has come back now that
another sub-module was added to the enetc driver:
ERROR: modpost: "enetc_ierb_register_pf" [drivers/net/ethernet/freescale/enetc/fsl-enetc.ko] undefined!
The problem is that the enetc Makefile is not actually used for
the ierb module if that is the only built-in driver in there
and everything else is a loadable module.
Fix it by always entering the directory this time, regardless
of which symbols are configured. This should reliably fix the
problem and prevent it from coming back another time.
Fixes: 112463ddbe82 ("net: dsa: felix: fix link error")
Fixes: e7d48e5fbf30 ("net: enetc: add a mini driver for the Integrated Endpoint Register Block")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The MANA driver causes a build failure in some configurations when
it selects an unavailable symbol:
WARNING: unmet direct dependencies detected for PCI_HYPERV
Depends on [n]: PCI [=y] && X86_64 [=y] && HYPERV [=n] && PCI_MSI [=y] && PCI_MSI_IRQ_DOMAIN [=y] && SYSFS [=y]
Selected by [y]:
- MICROSOFT_MANA [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROSOFT [=y] && PCI_MSI [=y] && X86_64 [=y]
drivers/pci/controller/pci-hyperv.c: In function 'hv_irq_unmask':
drivers/pci/controller/pci-hyperv.c:1217:9: error: implicit declaration of function 'hv_set_msi_entry_from_desc' [-Werror=implicit-function-declaration]
1217 | hv_set_msi_entry_from_desc(¶ms->int_entry.msi_entry, msi_desc);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
A PCI driver should never depend on a particular host bridge
implementation in the first place, but if we have this dependency
it's better to express it as a 'depends on' rather than 'select'.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Changing downshift params without software reset has no effect,
so call genphy_soft_reset() after change downshift params.
As the datasheet says:
Changes to these bits are disruptive to the normal operation therefore,
any changes to these registers must be followed by software reset
to take effect.
Fixes: 5c6bc5199b5d ("net: phy: marvell: add downshift support for M88E1111")
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Changing downshift params without software reset has no effect,
so call genphy_soft_reset() after change downshift params.
As the datasheet says:
Changes to these bits are disruptive to the normal operation therefore,
any changes to these registers must be followed by software reset
to take effect.
Fixes: 911af5e149bb ("net: phy: marvell: fix downshift function naming")
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a new flag LANDLOCK_CREATE_RULESET_VERSION to
landlock_create_ruleset(2). This enables to retreive a Landlock ABI
version that is useful to efficiently follow a best-effort security
approach. Indeed, it would be a missed opportunity to abort the whole
sandbox building, because some features are unavailable, instead of
protecting users as much as possible with the subset of features
provided by the running kernel.
This new flag enables user space to identify the minimum set of Landlock
features supported by the running kernel without relying on a filesystem
interface (e.g. /proc/version, which might be inaccessible) nor testing
multiple syscall argument combinations (i.e. syscall bisection). New
Landlock features will be documented and tied to a minimum version
number (greater than 1). The current version will be incremented for
each new kernel release supporting new Landlock features. User space
libraries can leverage this information to seamlessly restrict processes
as much as possible while being compatible with newer APIs.
This is a much more lighter approach than the previous
landlock_get_features(2): the complexity is pushed to user space
libraries. This flag meets similar needs as securityfs versions:
selinux/policyvers, apparmor/features/*/version* and tomoyo/version.
Supporting this flag now will be convenient for backward compatibility.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210422154123.13086-14-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Add a first document describing userspace API: how to define and enforce
a Landlock security policy. This is explained with a simple example.
The Landlock system calls are described with their expected behavior and
current limitations.
Another document is dedicated to kernel developers, describing guiding
principles and some important kernel structures.
This documentation can be built with the Sphinx framework.
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Vincent Dagonneau <vincent.dagonneau@ssi.gouv.fr>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-13-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Add a basic sandbox tool to launch a command which can only access a
list of file hierarchies in a read-only or read-write way.
Cc: James Morris <jmorris@namei.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-12-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Test all Landlock system calls, ptrace hooks semantic and filesystem
access-control with multiple layouts.
Test coverage for security/landlock/ is 93.6% of lines. The code not
covered only deals with internal kernel errors (e.g. memory allocation)
and race conditions.
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Vincent Dagonneau <vincent.dagonneau@ssi.gouv.fr>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-11-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
These 3 system calls are designed to be used by unprivileged processes
to sandbox themselves:
* landlock_create_ruleset(2): Creates a ruleset and returns its file
descriptor.
* landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a
ruleset, identified by the dedicated file descriptor.
* landlock_restrict_self(2): Enforces a ruleset on the calling thread
and its future children (similar to seccomp). This syscall has the
same usage restrictions as seccomp(2): the caller must have the
no_new_privs attribute set or have CAP_SYS_ADMIN in the current user
namespace.
All these syscalls have a "flags" argument (not currently used) to
enable extensibility.
Here are the motivations for these new syscalls:
* A sandboxed process may not have access to file systems, including
/dev, /sys or /proc, but it should still be able to add more
restrictions to itself.
* Neither prctl(2) nor seccomp(2) (which was used in a previous version)
fit well with the current definition of a Landlock security policy.
All passed structs (attributes) are checked at build time to ensure that
they don't contain holes and that they are aligned the same way for each
architecture.
See the user and kernel documentation for more details (provided by a
following commit):
* Documentation/userspace-api/landlock.rst
* Documentation/security/landlock.rst
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Link: https://lore.kernel.org/r/20210422154123.13086-9-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Wire up the following system calls for all architectures:
* landlock_create_ruleset(2)
* landlock_add_rule(2)
* landlock_restrict_self(2)
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210422154123.13086-10-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
The sb_delete security hook is called when shutting down a superblock,
which may be useful to release kernel objects tied to the superblock's
lifetime (e.g. inodes).
This new hook is needed by Landlock to release (ephemerally) tagged
struct inodes. This comes from the unprivileged nature of Landlock
described in the next commit.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-7-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Using Landlock objects and ruleset, it is possible to tag inodes
according to a process's domain. To enable an unprivileged process to
express a file hierarchy, it first needs to open a directory (or a file)
and pass this file descriptor to the kernel through
landlock_add_rule(2). When checking if a file access request is
allowed, we walk from the requested dentry to the real root, following
the different mount layers. The access to each "tagged" inodes are
collected according to their rule layer level, and ANDed to create
access to the requested file hierarchy. This makes possible to identify
a lot of files without tagging every inodes nor modifying the
filesystem, while still following the view and understanding the user
has from the filesystem.
Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not
keep the same struct inodes for the same inodes whereas these inodes are
in use.
This commit adds a minimal set of supported filesystem access-control
which doesn't enable to restrict all file-related actions. This is the
result of multiple discussions to minimize the code of Landlock to ease
review. Thanks to the Landlock design, extending this access-control
without breaking user space will not be a problem. Moreover, seccomp
filters can be used to restrict the use of syscall families which may
not be currently handled by Landlock.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210422154123.13086-8-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Move management of the superblock->sb_security blob out of the
individual security modules and into the security infrastructure.
Instead of allocating the blobs from within the modules, the modules
tell the infrastructure how much space is required, and the space is
allocated there.
Cc: John Johansen <john.johansen@canonical.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-6-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Using ptrace(2) and related debug features on a target process can lead
to a privilege escalation. Indeed, ptrace(2) can be used by an attacker
to impersonate another task and to remain undetected while performing
malicious activities. Thanks to ptrace_may_access(), various part of
the kernel can check if a tracer is more privileged than a tracee.
A landlocked process has fewer privileges than a non-landlocked process
and must then be subject to additional restrictions when manipulating
processes. To be allowed to use ptrace(2) and related syscalls on a
target process, a landlocked process must have a subset of the target
process's rules (i.e. the tracee must be in a sub-domain of the tracer).
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-5-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Process's credentials point to a Landlock domain, which is underneath
implemented with a ruleset. In the following commits, this domain is
used to check and enforce the ptrace and filesystem security policies.
A domain is inherited from a parent to its child the same way a thread
inherits a seccomp policy.
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-4-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
A Landlock ruleset is mainly a red-black tree with Landlock rules as
nodes. This enables quick update and lookup to match a requested
access, e.g. to a file. A ruleset is usable through a dedicated file
descriptor (cf. following commit implementing syscalls) which enables a
process to create and populate a ruleset with new rules.
A domain is a ruleset tied to a set of processes. This group of rules
defines the security policy enforced on these processes and their future
children. A domain can transition to a new domain which is the
intersection of all its constraints and those of a ruleset provided by
the current process. This modification only impact the current process.
This means that a process can only gain more constraints (i.e. lose
accesses) over time.
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20210422154123.13086-3-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
A Landlock object enables to identify a kernel object (e.g. an inode).
A Landlock rule is a set of access rights allowed on an object. Rules
are grouped in rulesets that may be tied to a set of processes (i.e.
subjects) to enforce a scoped access-control (i.e. a domain).
Because Landlock's goal is to empower any process (especially
unprivileged ones) to sandbox themselves, we cannot rely on a
system-wide object identification such as file extended attributes.
Indeed, we need innocuous, composable and modular access-controls.
The main challenge with these constraints is to identify kernel objects
while this identification is useful (i.e. when a security policy makes
use of this object). But this identification data should be freed once
no policy is using it. This ephemeral tagging should not and may not be
written in the filesystem. We then need to manage the lifetime of a
rule according to the lifetime of its objects. To avoid a global lock,
this implementation make use of RCU and counters to safely reference
objects.
A following commit uses this generic object management for inodes.
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-2-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
|
|
Update Topdown documentation to permit calls to rdpmc, and describe
interaction with system calls.
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: http://lore.kernel.org/lkml/20210421091009.1711565-1-mdr@ashroe.eu
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
Provide the ability to enable SCTP RSS hashing by ethtool.
It gives users option of generating RSS hash based on the SCTP source
and destination ports numbers, IPv4 or IPv6 source and destination
addresses.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Provides the ability to enable UDP RSS hashing by ethtool.
It gives users option of generating RSS hash based on the UDP source
and destination ports numbers, IPv4 or IPv6 source and destination
addresses.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Provides the ability to enable TCP RSS hashing by ethtool.
It gives users option of generating RSS hash based on the TCP source
and destination ports numbers, IPv4 or IPv6 source and destination
addresses.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Add the virtchnl message interface to VF, so that VF can request RSS
input set(s) based on PF's capability.
This framework allows ethtool RSS config support on the VF driver.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|