summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-04-23powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEsLeonardo Bras
Currently both iommu_alloc_coherent() and iommu_free_coherent() align the desired allocation size to PAGE_SIZE, and gets system pages and IOMMU mappings (TCEs) for that value. When IOMMU_PAGE_SIZE < PAGE_SIZE, this behavior may cause unnecessary TCEs to be created for mapping the whole system page. Example: - PAGE_SIZE = 64k, IOMMU_PAGE_SIZE() = 4k - iommu_alloc_coherent() is called for 128 bytes - 1 system page (64k) is allocated - 16 IOMMU pages (16 x 4k) are allocated (16 TCEs used) It would be enough to use a single TCE for this, so 15 TCEs are wasted in the process. Update iommu_*_coherent() to make sure the size alignment happens only for IOMMU_PAGE_SIZE() before calling iommu_alloc() and iommu_free(). Also, on iommu_range_alloc(), replace ALIGN(n, 1 << tbl->it_page_shift) with IOMMU_PAGE_ALIGN(n, tbl), which is easier to read and does the same. Signed-off-by: Leonardo Bras <leobras.c@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210318174414.684630-1-leobras.c@gmail.com
2021-04-23Merge tag 'drm-intel-fixes-2021-04-22' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - GVT's BDW regression fix for cmd parser (Zhenyu) - Fix modesetting in case of unexpected AUX timeouts (Imre) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YIGZ3pQPgPQtZtyI@intel.com
2021-04-23Merge tag 'amd-drm-fixes-5.12-2021-04-21' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.12-2021-04-21: amdgpu: - Fix gpuvm page table update issue - Modifier fixes - Register fix for dimgrey cavefish Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210421220456.3839-1-alexander.deucher@amd.com
2021-04-22Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Very late in the cycle but both risky if left unfixed and more or less obvious.." * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails vhost-vdpa: protect concurrent access to vhost device iotlb
2021-04-22vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs failsEli Cohen
Set err = -ENOMEM if dma_map_sg_attrs() fails so the function reutrns error. Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code") Signed-off-by: Eli Cohen <elic@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20210411083646.910546-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-04-22vhost-vdpa: protect concurrent access to vhost device iotlbXie Yongji
Protect vhost device iotlb by vhost_dev->mutex. Otherwise, it might cause corruption of the list and interval tree in struct vhost_iotlb if userspace sends the VHOST_IOTLB_MSG_V2 message concurrently. Fixes: 4c8cf318("vhost: introduce vDPA-based backend") Cc: stable@vger.kernel.org Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210412095512.178-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-04-22Merge branch 'RTL8211E-RGMII-D'David S. Miller
Kunihiko Hayashi says: ==================== Change phy-mode to RGMII-ID to enable delay pins for RTL8211E UniPhier PXs2, LD20, and PXs3 boards have RTL8211E ethernet phy, and the phy have the RX/TX delays of RGMII interface using pull-ups on the RXDLY and TXDLY pins. After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx delay config"), the delays are working correctly, however, "rgmii" means no delay and the phy doesn't work. So need to set the phy-mode to "rgmii-id" to show that RX/TX delays are enabled. Changes since v1: - Fix the commit message ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22arm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for ↵Kunihiko Hayashi
RTL8211E UniPhier LD20 and PXs3 boards have RTL8211E ethernet phy, and the phy have the RX/TX delays of RGMII interface using pull-ups on the RXDLY and TXDLY pins. After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx delay config"), the delays are working correctly, however, "rgmii" means no delay and the phy doesn't work. So need to set the phy-mode to "rgmii-id" to show that RX/TX delays are enabled. Fixes: c73730ee4c9a ("arm64: dts: uniphier: add AVE ethernet node") Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22ARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for ↵Kunihiko Hayashi
RTL8211E UniPhier PXs2 boards have RTL8211E ethernet phy, and the phy have the RX/TX delays of RGMII interface using pull-ups on the RXDLY and TXDLY pins. After the commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx delay config"), the delays are working correctly, however, "rgmii" means no delay and the phy doesn't work. So need to set the phy-mode to "rgmii-id" to show that RX/TX delays are enabled. Fixes: e3cc931921d2 ("ARM: dts: uniphier: add AVE ethernet node") Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22Merge branch 'stmmac-swmac-desc-prefetch'David S. Miller
Mohammad Athari Bin Ismail says: ==================== Enable DWMAC HW descriptor prefetch This patch series to add setting for HW descriptor prefetch for DWMAC version 5.20 onwards. For Intel platform, enable the capability by default. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22stmmac: intel: Enable HW descriptor prefetch by defaultMohammad Athari Bin Ismail
Enable HW descriptor prefetch by default by setting plat->dma_cfg->dche = true in intel_mgbe_common_data(). Need to be noted that this capability only be supported in DWMAC core version 5.20 onwards. In stmmac, there is a checking to check the core version. If the core version is below 5.20, this capability wouldn`t be configured. Below is the iperf result comparison between HW descriptor prefetch disabled(DCHE=0b) and enabled(DCHE=1b). Tested on Intel Elkhartlake platform with DWMAC Core 5.20. Observed line rate performance improvement with HW descriptor prefetch enabled. DCHE = 0b [ 5] local 169.254.1.162 port 42123 connected to 169.254.244.142 port 5201 [ ID] Interval Transfer Bitrate Total Datagrams [ 5] 0.00-1.00 sec 96.7 MBytes 811 Mbits/sec 70050 [ 5] 1.00-2.00 sec 96.5 MBytes 809 Mbits/sec 69850 [ 5] 2.00-3.00 sec 96.3 MBytes 808 Mbits/sec 69720 [ 5] 3.00-4.00 sec 95.9 MBytes 804 Mbits/sec 69450 [ 5] 4.00-5.00 sec 96.0 MBytes 806 Mbits/sec 69530 [ 5] 5.00-6.00 sec 96.8 MBytes 812 Mbits/sec 70080 [ 5] 6.00-7.00 sec 96.9 MBytes 813 Mbits/sec 70140 [ 5] 7.00-8.00 sec 96.8 MBytes 812 Mbits/sec 70080 [ 5] 8.00-9.00 sec 97.0 MBytes 814 Mbits/sec 70230 [ 5] 9.00-10.00 sec 96.9 MBytes 813 Mbits/sec 70170 - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-10.00 sec 966 MBytes 810 Mbits/sec 0.000 ms 0/699300 (0%) sender [ 5] 0.00-10.00 sec 966 MBytes 810 Mbits/sec 0.011 ms 0/699265 (0%) receiver DCHE = 1b [ 5] local 169.254.1.162 port 49740 connected to 169.254.244.142 port 5201 [ ID] Interval Transfer Bitrate Total Datagrams [ 5] 0.00-1.00 sec 97.9 MBytes 821 Mbits/sec 70880 [ 5] 1.00-2.00 sec 98.1 MBytes 823 Mbits/sec 71060 [ 5] 2.00-3.00 sec 98.2 MBytes 824 Mbits/sec 71140 [ 5] 3.00-4.00 sec 98.2 MBytes 824 Mbits/sec 71090 [ 5] 4.00-5.00 sec 98.1 MBytes 823 Mbits/sec 71050 [ 5] 5.00-6.00 sec 98.1 MBytes 823 Mbits/sec 71040 [ 5] 6.00-7.00 sec 98.1 MBytes 823 Mbits/sec 71050 [ 5] 7.00-8.00 sec 98.2 MBytes 824 Mbits/sec 71140 [ 5] 8.00-9.00 sec 98.2 MBytes 824 Mbits/sec 71120 [ 5] 9.00-10.00 sec 98.3 MBytes 824 Mbits/sec 71150 - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-10.00 sec 981 MBytes 823 Mbits/sec 0.000 ms 0/710720 (0%) sender [ 5] 0.00-10.00 sec 981 MBytes 823 Mbits/sec 0.041 ms 0/710650 (0%) receiver Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22net: stmmac: Add HW descriptor prefetch setting for DWMAC Core 5.20 onwardsMohammad Athari Bin Ismail
DWMAC Core 5.20 onwards supports HW descriptor prefetching. Additionally, it also depends on platform specific RTL configuration. This capability could be enabled by setting DMA_Mode bit-19 (DCHE). So, to enable this cability, platform must set plat->dma_cfg->dche = true and the DWMAC core version must be 5.20 onwards. Else, this capability wouldn`t be configured Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22net/mlx4: Treat VFs fair when handling comm_channel_eventsHans Westgaard Ry
Handling comm_channel_event in mlx4_master_comm_channel uses a double loop to determine which slaves have requested work. The search is always started at lowest slave. This leads to unfairness; lower VFs tends to be prioritized over higher VFs. The patch uses find_next_bit to determine which slaves to handle. Fairness is implemented by always starting at the next to the last start. An MPI program has been used to measure improvements. It runs 500 ibv_reg_mr, synchronizes with all other instances and then runs 500 ibv_dereg_mr. The results running 500 processes, time reported is for running 500 calls: ibv_reg_mr: Mod. Org. mlx4_1 403.356ms 424.674ms mlx4_2 403.355ms 424.674ms mlx4_3 403.354ms 424.674ms mlx4_4 403.355ms 424.674ms mlx4_5 403.357ms 424.677ms mlx4_6 403.354ms 424.676ms mlx4_7 403.357ms 424.675ms mlx4_8 403.355ms 424.675ms ibv_dereg_mr: Mod. Org. mlx4_1 116.408ms 142.818ms mlx4_2 116.434ms 142.793ms mlx4_3 116.488ms 143.247ms mlx4_4 116.679ms 143.230ms mlx4_5 112.017ms 107.204ms mlx4_6 112.032ms 107.516ms mlx4_7 112.083ms 184.195ms mlx4_8 115.089ms 190.618ms Suggested-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22bnxt_en: fix ternary sign extension bug in bnxt_show_temp()Dan Carpenter
The problem is that bnxt_show_temp() returns long but "rc" is an int and "len" is a u32. With ternary operations the type promotion is quite tricky. The negative "rc" is first promoted to u32 and then to long so it ends up being a high positive value instead of a a negative as we intended. Fix this by removing the ternary. Fixes: d69753fa1ecb ("bnxt_en: return proper error codes in bnxt_show_temp") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22bpf, doc: Fix some invalid links in bpf_devel_QA.rstTiezhu Yang
There exist some errors "404 Not Found" when I click the link of "MAINTAINERS" [1], "samples/bpf/" [2] and "selftests" [3] in the documentation "HOWTO interact with BPF subsystem" [4]. As Alexei Starovoitov suggested, just remove "MAINTAINERS" and "samples/bpf/" links and use correct link of "selftests". [1] https://www.kernel.org/doc/html/MAINTAINERS [2] https://www.kernel.org/doc/html/samples/bpf/ [3] https://www.kernel.org/doc/html/tools/testing/selftests/bpf/ [4] https://www.kernel.org/doc/html/latest/bpf/bpf_devel_QA.html Fixes: 542228384888 ("bpf, doc: convert bpf_devel_QA.rst to use RST formatting") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/1619062560-30483-1-git-send-email-yangtiezhu@loongson.cn
2021-04-22dt-bindings: mali-bifrost: add dma-coherentKevin Hilman
Add optional dma-coherent property to binding doc. Found by 'make dtbs_check' on arm64/amlogic DT files. Signed-off-by: Kevin Hilman <khilman@baylibre.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210421204833.18523-2-khilman@baylibre.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-22arm64: dts: amlogic: misc DT schema fixupsKevin Hilman
Take a pass at cleaning up a bunch of warnings from 'make dtbs_check' that have crept in. Signed-off-by: Kevin Hilman <khilman@baylibre.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210421204833.18523-1-khilman@baylibre.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-22Merge tag 'sunxi-fixes-for-5.12-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes One fix for the MMC card detect on the Pine H64 board * tag 'sunxi-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS Link: https://lore.kernel.org/r/45fc5e4d-ef48-4729-a869-79a8f288bb83.lettre@localhost Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-22net, xdp: Update pkt_type if generic XDP changes unicast MACMartin Willi
If a generic XDP program changes the destination MAC address from/to multicast/broadcast, the skb->pkt_type is updated to properly handle the packet when passed up the stack. When changing the MAC from/to the NICs MAC, PACKET_HOST/OTHERHOST is not updated, though, making the behavior different from that of native XDP. Remember the PACKET_HOST/OTHERHOST state before calling the program in generic XDP, and update pkt_type accordingly if the destination MAC address has changed. As eth_type_trans() assumes a default pkt_type of PACKET_HOST, restore that before calling it. The use case for this is when a XDP program wants to push received packets up the stack by rewriting the MAC to the NICs MAC, for example by cluster nodes sharing MAC addresses. Fixes: 297249569932 ("net: fix generic XDP to handle if eth header was mangled") Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210419141559.8611-1-martin@strongswan.org
2021-04-22r8152: replace return with break for ram code speedup mode timeoutHayes Wang
When the timeout occurs, we still have to run the following process for releasing patch request. Otherwise, the PHY would keep no link. Therefore, use break to stop the loop of loading firmware and release the patch request rather than return the function directly. Fixes: 4a51b0e8a014 ("r8152: support PHY firmware for RTL8156 series") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22Merge branch '100GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-04-22 This series contains updates to virtchnl header file, ice, and iavf drivers. Vignesh adds support to warn about potentially malicious VFs; those that are overflowing the mailbox for the ice driver. Michal adds support for an allowlist/denylist of VF commands based on supported capabilities for the ice driver. Brett adds support for iavf UDP segmentation offload by adding the capability bit to virtchnl, advertising support in the ice driver, and enabling it in the iavf driver. He also adds a helper function for getting the VF VSI for ice. Colin Ian King removes an unneeded pointer assignment. Qi enables support in the ice driver to support virtchnl requests from the iavf to configure its own RSS input set. This includes adding new capability bits, structures, and commands to virtchnl header file. Haiyue enables configuring RSS flow hash via ethtool to support TCP, UDP and SCTP protocols in iavf. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmddLinus Torvalds
Pull tpm fix from James Bottomley: "This is an urgent regression fix for a tpm patch set that went in this merge window. It looks like a rebase before the original pull request lost a tpm_try_get_ops() so we have a lock imbalance in our code which is causing oopses. The original patch was correct on the mailing list. I'm sending this in agreement with Mimi (as joint maintainers of trusted keys) because Jarkko is off communing with the Reindeer or whatever it is Finns do when on holiday" * tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmdd: KEYS: trusted: Fix TPM reservation for seal/unseal
2021-04-22ext4: wipe ext4_dir_entry2 upon file deletionLeah Rumancik
Upon file deletion, zero out all fields in ext4_dir_entry2 besides rec_len. In case sensitive data is stored in filenames, this ensures no potentially sensitive data is left in the directory entry upon deletion. Also, wipe these fields upon moving a directory entry during the conversion to an htree and when splitting htree nodes. The data wiped may still exist in the journal, but there are future commits planned to address this. Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com> Link: https://lore.kernel.org/r/20210422180834.2242353-1-leah.rumancik@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-04-22ext4: Fix occasional generic/418 failureJan Kara
Eric has noticed that after pagecache read rework, generic/418 is occasionally failing for ext4 when blocksize < pagesize. In fact, the pagecache rework just made hard to hit race in ext4 more likely. The problem is that since ext4 conversion of direct IO writes to iomap framework (commit 378f32bab371), we update inode size after direct IO write only after invalidating page cache. Thus if buffered read sneaks at unfortunate moment like: CPU1 - write at offset 1k CPU2 - read from offset 0 iomap_dio_rw(..., IOMAP_DIO_FORCE_WAIT); ext4_readpage(); ext4_handle_inode_extension() the read will zero out tail of the page as it still sees smaller inode size and thus page cache becomes inconsistent with on-disk contents with all the consequences. Fix the problem by moving inode size update into end_io handler which gets called before the page cache is invalidated. Reported-and-tested-by: Eric Whitney <enwlinux@gmail.com> Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Link: https://lore.kernel.org/r/20210415155417.4734-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-04-22vxge: avoid -Wemtpy-body warningsArnd Bergmann
There are a few warnings about empty debug macros in this driver: drivers/net/ethernet/neterion/vxge/vxge-main.c: In function 'vxge_probe': drivers/net/ethernet/neterion/vxge/vxge-main.c:4480:76: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 4480 | "Failed in enabling SRIOV mode: %d\n", ret); Change them to proper 'do { } while (0)' expressions to make the code a little more robust and avoid the warnings. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22net: wwan: core: Return poll error in case of port removalLoic Poulain
Ensure that the poll system call returns proper error flags when port is removed (nullified port ops), allowing user side to properly fail, without further read or write. Fixes: 9a44c1cc6388 ("net: Add a WWAN subsystem") Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22netdevsim: Only use sampling truncation length when validIdo Schimmel
When the sampling truncation length is invalid (zero), pass the length of the packet. Without the fix, no payload is reported to user space when the truncation length is zero. Fixes: a8700c3dd0a4 ("netdevsim: Add dummy psample implementation") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22net: enetc: fix link error againArnd Bergmann
A link time bug that I had fixed before has come back now that another sub-module was added to the enetc driver: ERROR: modpost: "enetc_ierb_register_pf" [drivers/net/ethernet/freescale/enetc/fsl-enetc.ko] undefined! The problem is that the enetc Makefile is not actually used for the ierb module if that is the only built-in driver in there and everything else is a loadable module. Fix it by always entering the directory this time, regardless of which symbols are configured. This should reliably fix the problem and prevent it from coming back another time. Fixes: 112463ddbe82 ("net: dsa: felix: fix link error") Fixes: e7d48e5fbf30 ("net: enetc: add a mini driver for the Integrated Endpoint Register Block") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22net: mana: fix PCI_HYPERV dependencyArnd Bergmann
The MANA driver causes a build failure in some configurations when it selects an unavailable symbol: WARNING: unmet direct dependencies detected for PCI_HYPERV Depends on [n]: PCI [=y] && X86_64 [=y] && HYPERV [=n] && PCI_MSI [=y] && PCI_MSI_IRQ_DOMAIN [=y] && SYSFS [=y] Selected by [y]: - MICROSOFT_MANA [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROSOFT [=y] && PCI_MSI [=y] && X86_64 [=y] drivers/pci/controller/pci-hyperv.c: In function 'hv_irq_unmask': drivers/pci/controller/pci-hyperv.c:1217:9: error: implicit declaration of function 'hv_set_msi_entry_from_desc' [-Werror=implicit-function-declaration] 1217 | hv_set_msi_entry_from_desc(&params->int_entry.msi_entry, msi_desc); | ^~~~~~~~~~~~~~~~~~~~~~~~~~ A PCI driver should never depend on a particular host bridge implementation in the first place, but if we have this dependency it's better to express it as a 'depends on' rather than 'select'. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22net: phy: marvell: fix m88e1111_set_downshiftMaxim Kochetkov
Changing downshift params without software reset has no effect, so call genphy_soft_reset() after change downshift params. As the datasheet says: Changes to these bits are disruptive to the normal operation therefore, any changes to these registers must be followed by software reset to take effect. Fixes: 5c6bc5199b5d ("net: phy: marvell: add downshift support for M88E1111") Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22net: phy: marvell: fix m88e1011_set_downshiftMaxim Kochetkov
Changing downshift params without software reset has no effect, so call genphy_soft_reset() after change downshift params. As the datasheet says: Changes to these bits are disruptive to the normal operation therefore, any changes to these registers must be followed by software reset to take effect. Fixes: 911af5e149bb ("net: phy: marvell: fix downshift function naming") Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22landlock: Enable user space to infer supported featuresMickaël Salaün
Add a new flag LANDLOCK_CREATE_RULESET_VERSION to landlock_create_ruleset(2). This enables to retreive a Landlock ABI version that is useful to efficiently follow a best-effort security approach. Indeed, it would be a missed opportunity to abort the whole sandbox building, because some features are unavailable, instead of protecting users as much as possible with the subset of features provided by the running kernel. This new flag enables user space to identify the minimum set of Landlock features supported by the running kernel without relying on a filesystem interface (e.g. /proc/version, which might be inaccessible) nor testing multiple syscall argument combinations (i.e. syscall bisection). New Landlock features will be documented and tied to a minimum version number (greater than 1). The current version will be incremented for each new kernel release supporting new Landlock features. User space libraries can leverage this information to seamlessly restrict processes as much as possible while being compatible with newer APIs. This is a much more lighter approach than the previous landlock_get_features(2): the complexity is pushed to user space libraries. This flag meets similar needs as securityfs versions: selinux/policyvers, apparmor/features/*/version* and tomoyo/version. Supporting this flag now will be convenient for backward compatibility. Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-14-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Add user and kernel documentationMickaël Salaün
Add a first document describing userspace API: how to define and enforce a Landlock security policy. This is explained with a simple example. The Landlock system calls are described with their expected behavior and current limitations. Another document is dedicated to kernel developers, describing guiding principles and some important kernel structures. This documentation can be built with the Sphinx framework. Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Vincent Dagonneau <vincent.dagonneau@ssi.gouv.fr> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-13-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22samples/landlock: Add a sandbox manager exampleMickaël Salaün
Add a basic sandbox tool to launch a command which can only access a list of file hierarchies in a read-only or read-write way. Cc: James Morris <jmorris@namei.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-12-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22selftests/landlock: Add user space testsMickaël Salaün
Test all Landlock system calls, ptrace hooks semantic and filesystem access-control with multiple layouts. Test coverage for security/landlock/ is 93.6% of lines. The code not covered only deals with internal kernel errors (e.g. memory allocation) and race conditions. Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Vincent Dagonneau <vincent.dagonneau@ssi.gouv.fr> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-11-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Add syscall implementationsMickaël Salaün
These 3 system calls are designed to be used by unprivileged processes to sandbox themselves: * landlock_create_ruleset(2): Creates a ruleset and returns its file descriptor. * landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a ruleset, identified by the dedicated file descriptor. * landlock_restrict_self(2): Enforces a ruleset on the calling thread and its future children (similar to seccomp). This syscall has the same usage restrictions as seccomp(2): the caller must have the no_new_privs attribute set or have CAP_SYS_ADMIN in the current user namespace. All these syscalls have a "flags" argument (not currently used) to enable extensibility. Here are the motivations for these new syscalls: * A sandboxed process may not have access to file systems, including /dev, /sys or /proc, but it should still be able to add more restrictions to itself. * Neither prctl(2) nor seccomp(2) (which was used in a previous version) fit well with the current definition of a Landlock security policy. All passed structs (attributes) are checked at build time to ensure that they don't contain holes and that they are aligned the same way for each architecture. See the user and kernel documentation for more details (provided by a following commit): * Documentation/userspace-api/landlock.rst * Documentation/security/landlock.rst Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Link: https://lore.kernel.org/r/20210422154123.13086-9-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22arch: Wire up Landlock syscallsMickaël Salaün
Wire up the following system calls for all architectures: * landlock_create_ruleset(2) * landlock_add_rule(2) * landlock_restrict_self(2) Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-10-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22fs,security: Add sb_delete hookMickaël Salaün
The sb_delete security hook is called when shutting down a superblock, which may be useful to release kernel objects tied to the superblock's lifetime (e.g. inodes). This new hook is needed by Landlock to release (ephemerally) tagged struct inodes. This comes from the unprivileged nature of Landlock described in the next commit. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: James Morris <jmorris@namei.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Jann Horn <jannh@google.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-7-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Support filesystem access-controlMickaël Salaün
Using Landlock objects and ruleset, it is possible to tag inodes according to a process's domain. To enable an unprivileged process to express a file hierarchy, it first needs to open a directory (or a file) and pass this file descriptor to the kernel through landlock_add_rule(2). When checking if a file access request is allowed, we walk from the requested dentry to the real root, following the different mount layers. The access to each "tagged" inodes are collected according to their rule layer level, and ANDed to create access to the requested file hierarchy. This makes possible to identify a lot of files without tagging every inodes nor modifying the filesystem, while still following the view and understanding the user has from the filesystem. Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not keep the same struct inodes for the same inodes whereas these inodes are in use. This commit adds a minimal set of supported filesystem access-control which doesn't enable to restrict all file-related actions. This is the result of multiple discussions to minimize the code of Landlock to ease review. Thanks to the Landlock design, extending this access-control without breaking user space will not be a problem. Moreover, seccomp filters can be used to restrict the use of syscall families which may not be currently handled by Landlock. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Kees Cook <keescook@chromium.org> Cc: Richard Weinberger <richard@nod.at> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-8-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22LSM: Infrastructure management of the superblockCasey Schaufler
Move management of the superblock->sb_security blob out of the individual security modules and into the security infrastructure. Instead of allocating the blobs from within the modules, the modules tell the infrastructure how much space is required, and the space is allocated there. Cc: John Johansen <john.johansen@canonical.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-6-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Add ptrace restrictionsMickaël Salaün
Using ptrace(2) and related debug features on a target process can lead to a privilege escalation. Indeed, ptrace(2) can be used by an attacker to impersonate another task and to remain undetected while performing malicious activities. Thanks to ptrace_may_access(), various part of the kernel can check if a tracer is more privileged than a tracee. A landlocked process has fewer privileges than a non-landlocked process and must then be subject to additional restrictions when manipulating processes. To be allowed to use ptrace(2) and related syscalls on a target process, a landlocked process must have a subset of the target process's rules (i.e. the tracee must be in a sub-domain of the tracer). Cc: James Morris <jmorris@namei.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Jann Horn <jannh@google.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-5-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Set up the security framework and manage credentialsMickaël Salaün
Process's credentials point to a Landlock domain, which is underneath implemented with a ruleset. In the following commits, this domain is used to check and enforce the ptrace and filesystem security policies. A domain is inherited from a parent to its child the same way a thread inherits a seccomp policy. Cc: James Morris <jmorris@namei.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Jann Horn <jannh@google.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-4-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Add ruleset and domain managementMickaël Salaün
A Landlock ruleset is mainly a red-black tree with Landlock rules as nodes. This enables quick update and lookup to match a requested access, e.g. to a file. A ruleset is usable through a dedicated file descriptor (cf. following commit implementing syscalls) which enables a process to create and populate a ruleset with new rules. A domain is a ruleset tied to a set of processes. This group of rules defines the security policy enforced on these processes and their future children. A domain can transition to a new domain which is the intersection of all its constraints and those of a ruleset provided by the current process. This modification only impact the current process. This means that a process can only gain more constraints (i.e. lose accesses) over time. Cc: James Morris <jmorris@namei.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20210422154123.13086-3-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Add object managementMickaël Salaün
A Landlock object enables to identify a kernel object (e.g. an inode). A Landlock rule is a set of access rights allowed on an object. Rules are grouped in rulesets that may be tied to a set of processes (i.e. subjects) to enforce a scoped access-control (i.e. a domain). Because Landlock's goal is to empower any process (especially unprivileged ones) to sandbox themselves, we cannot rely on a system-wide object identification such as file extended attributes. Indeed, we need innocuous, composable and modular access-controls. The main challenge with these constraints is to identify kernel objects while this identification is useful (i.e. when a security policy makes use of this object). But this identification data should be freed once no policy is using it. This ephemeral tagging should not and may not be written in the filesystem. We then need to manage the lifetime of a rule according to the lifetime of its objects. To avoid a global lock, this implementation make use of RCU and counters to safely reference objects. A following commit uses this generic object management for inodes. Cc: James Morris <jmorris@namei.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Jann Horn <jannh@google.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-2-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22perf tools: Update topdown documentation to permit rdpmc callsRay Kinsella
Update Topdown documentation to permit calls to rdpmc, and describe interaction with system calls. Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lore.kernel.org/lkml/20210421091009.1711565-1-mdr@ashroe.eu Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-22Merge branch 'kvm-sev-cgroup' into HEADPaolo Bonzini
2021-04-22iavf: Support for modifying SCTP RSS flow hashingHaiyue Wang
Provide the ability to enable SCTP RSS hashing by ethtool. It gives users option of generating RSS hash based on the SCTP source and destination ports numbers, IPv4 or IPv6 source and destination addresses. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22iavf: Support for modifying UDP RSS flow hashingHaiyue Wang
Provides the ability to enable UDP RSS hashing by ethtool. It gives users option of generating RSS hash based on the UDP source and destination ports numbers, IPv4 or IPv6 source and destination addresses. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22iavf: Support for modifying TCP RSS flow hashingHaiyue Wang
Provides the ability to enable TCP RSS hashing by ethtool. It gives users option of generating RSS hash based on the TCP source and destination ports numbers, IPv4 or IPv6 source and destination addresses. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22iavf: Add framework to enable ethtool RSS configHaiyue Wang
Add the virtchnl message interface to VF, so that VF can request RSS input set(s) based on PF's capability. This framework allows ethtool RSS config support on the VF driver. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>