summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-11-25Merge branch 'lan966x-extend-xdp-support'David S. Miller
Horatiu Vultur says: ==================== net: lan966x: Extend xdp support Extend the current support of XDP in lan966x with the action XDP_TX and XDP_REDIRECT. The first patches just prepare the things such that it would be easier to add XDP_TX and XDP_REDIRECT actions. Like adding XDP_PACKET_HEADROOM, introduce helper functions, use the correct dma_dir for the page pool The last 2 patches introduce the XDP actions XDP_TX and XDP_REDIRECT. v4->v5: - add iterator declaration inside for loops - move the scope of port inside the function lan966x_fdma_rx_alloc_page_pool - create union for skb and xdpf inside struct lan966x_tx_dcb_buf v3->v4: - use napi_consume_skb instead of dev_kfree_skb_any - arrange members in struct lan966x_tx_dcb_buf not to have holes - fix when xdp program is added the check for determining if page pool needs to be recreated was wrong - change type for len in lan966x_tx_dcb_buf to u32 v2->v3: - make sure to update rxq memory model - update the page pool direction if there is any xdp program - in case of action XDP_TX give back to reuse the page - in case of action XDP_REDIRECT, remap the frame and make sure to unmap it when is transmitted. v1->v2: - use skb_reserve of using skb_put and skb_pull - make sure that data_len doesn't include XDP_PACKET_HEADROOM ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: lan966x: Add support for XDP_REDIRECTHoratiu Vultur
Extend lan966x XDP support with the action XDP_REDIRECT. This is similar with the XDP_TX, so a lot of functionality can be reused. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: lan966x: Add support for XDP_TXHoratiu Vultur
Extend lan966x XDP support with the action XDP_TX. In this case when the received buffer needs to execute XDP_TX, the buffer will be moved to the TX buffers. So a new RX buffer will be allocated. When the TX finish with the frame, it would give back the buffer to the page pool. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: lan966x: Update dma_dir of page_pool_paramsHoratiu Vultur
To add support for XDP_TX it is required to be able to write to the DMA area therefore it is required that the pages will be mapped using DMA_BIDIRECTIONAL flag. Therefore check if there are any xdp programs on the interfaces and in that case set DMA_BIDRECTIONAL otherwise use DMA_FROM_DEVICE. Therefore when a new XDP program is added it is required to redo the page_pool. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: lan966x: Update rxq memory modelHoratiu Vultur
By default the rxq memory model is MEM_TYPE_PAGE_SHARED but to be able to reuse pages on the TX side, when the XDP action XDP_TX it is required to update the memory model to PAGE_POOL. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: lan966x: Add len field to lan966x_tx_dcb_bufHoratiu Vultur
Currently when a frame was transmitted, it is required to unamp the frame that was transmitted. The length of the frame was taken from the transmitted skb. In the future we might not have an skb, therefore store the length skb directly in the lan966x_tx_dcb_buf and use this one to unamp the frame. While at this, also arrange the members in lan966x_tx_dcb_buf not to have any holes. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: lan966x: Introduce helper functionsHoratiu Vultur
Introduce lan966x_fdma_tx_setup_dcb and lan966x_fdma_tx_start functions and use of them inside lan966x_fdma_xmit. There is no functional change in here. They are introduced to be used when XDP_TX/REDIRECT actions are introduced. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: lan966x: Add XDP_PACKET_HEADROOMHoratiu Vultur
Update the page_pool params to allocate XDP_PACKET_HEADROOM space as headroom for all received frames. This is needed for when the XDP_TX and XDP_REDIRECT are implemented. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25ptp: idt82p33: remove PEROUT_ENABLE_OUTPUT_MASKMin Li
PEROUT_ENABLE_OUTPUT_MASK was there to allow us to enable/disable all the perout pins. But it is not standard procedure, we will have to discard it. Signed-off-by: Min Li <min.li.xe@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25ptp: idt82p33: Add PTP_CLK_REQ_EXTTS supportMin Li
82P33 family of chips can trigger TOD read/write by external signal from one of the IN12/13/14 pins, which are set user space programs by calling PTP_PIN_SETFUNC through ptp_ioctl Signed-off-by: Min Li <min.li.xe@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25Merge branch 'sparx5-tc-protocol-all'David S. Miller
Steen Hegelund says: ==================== net: TC protocol all support in Sparx5 IS2 VCAP This provides support for the TC flower filters 'protocol all' clause in the Sparx5 IS2 VCAP. It builds on top of the initial IS2 VCAP support found in these series: https://lore.kernel.org/all/20221020130904.1215072-1-steen.hegelund@microchip.com/ https://lore.kernel.org/all/20221109114116.3612477-1-steen.hegelund@microchip.com/ https://lore.kernel.org/all/20221111130519.1459549-1-steen.hegelund@microchip.com/ https://lore.kernel.org/all/20221117213114.699375-1-steen.hegelund@microchip.com/ Functionality: ============== As the configuration for the Sparx5 IS2 VCAP consists of one (or more) keyset(s) for each lookup/port per traffic classification, it is not always possible to cover all protocols with just one ordinary VCAP rule. To improve this situation the driver will try to find out what keysets a rule will need to cover a TC flower "protocol all" filter and then compare this set of keysets to what the hardware is currently configured for. In case multiple keysets are needed then the driver can create a rule per rule size (e.g. X6 and X12) and use a mask on the keyset type field to allow the VCAP to match more than one keyset with just one rule. This is possible because the keysets that have the same size typically has many keys in common, so the VCAP rule keys can make a common match. The result is that one TC filter command may create multiple IS2 VCAP rules of different sizes that have a type field with a masked type id. Delivery: ========= This is current plan for delivering the full VCAP feature set of Sparx5: - Sparx5 IS0 VCAP support - TC policer and drop action support (depends on the Sparx5 QoS support upstreamed separately) - Sparx5 ES0 VCAP support - TC flower template support - TC matchall filter support for mirroring and policing ports - TC flower filter mirror action support - Sparx5 ES2 VCAP support Version History: ================ v2 Fixed a NULL return value compiler warning. Moved the new vcap_find_actionfield function a bit up in the file. v1 Initial version ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: microchip: sparx5: Add VCAP filter keys KUNIT testSteen Hegelund
This tests the filtering of keys, either dropping unsupported keys or dropping keys specified in a list. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: microchip: sparx5: Support for displaying a list of keysetsSteen Hegelund
This will display a list of keyset in case the type_id field in the VCAP rule has been wildcarded. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: microchip: sparx5: Support for TC protocol allSteen Hegelund
This allows support of TC protocol all for the Sparx5 IS2 VCAP. This is done by creating multiple rules that covers the rule size and traffic types in the IS2. Each rule size (e.g X16 and X6) may have multiple keysets and if there are more than one the type field in the VCAP rule will be wildcarded to support these keysets. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: microchip: sparx5: Support for copying and modifying rules in the APISteen Hegelund
This adds support for making a copy of a rule and modify keys and actions to differentiate the copy. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: loopback: use NET_NAME_PREDICTABLE for name_assign_typeRasmus Villemoes
When the name_assign_type attribute was introduced (commit 685343fc3ba6, "net: add name_assign_type netdev attribute"), the loopback device was explicitly mentioned as one which would make use of NET_NAME_PREDICTABLE: The name_assign_type attribute gives hints where the interface name of a given net-device comes from. These values are currently defined: ... NET_NAME_PREDICTABLE: The ifname has been assigned by the kernel in a predictable way that is guaranteed to avoid reuse and always be the same for a given device. Examples include statically created devices like the loopback device [...] Switch to that so that reading /sys/class/net/lo/name_assign_type produces something sensible instead of returning -EINVAL. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: fec: don't reset irq coalesce settings to defaults on "ip link up"Rasmus Villemoes
Currently, when a FEC device is brought up, the irq coalesce settings are reset to their default values (1000us, 200 frames). That's unexpected, and breaks for example use of an appropriate .link file to make systemd-udev apply the desired settings (https://www.freedesktop.org/software/systemd/man/systemd.link.html), or any other method that would do a one-time setup during early boot. Refactor the code so that fec_restart() instead uses fec_enet_itr_coal_set(), which simply applies the settings that are stored in the private data, and initialize that private data with the default values. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25octeontx2-pf: Fix pfc_alloc_status array overflowSuman Ghosh
This patch addresses pfc_alloc_status array overflow occurring for send queue index value greater than PFC priority. Queue index can be greater than supported PFC priority for multiple scenarios (e.g. QoS, during non zero SMQ allocation for a PF/VF). In those scenarios the API should return default tx scheduler '0'. This is causing mbox errors as otx2_get_smq_idx returing invalid smq value. Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support") Signed-off-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: stmmac: Set MAC's flow control register to reflect current settingsGoh, Wei Sheng
Currently, pause frame register GMAC_RX_FLOW_CTRL_RFE is not updated correctly when 'ethtool -A <IFACE> autoneg off rx off tx off' command is issued. This fix ensures the flow control change is reflected directly in the GMAC_RX_FLOW_CTRL_RFE register. Fixes: 46f69ded988d ("net: stmmac: Use resolved link config in mac_link_up()") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Goh, Wei Sheng <wei.sheng.goh@intel.com> Signed-off-by: Noor Azura Ahmad Tarmizi <noor.azura.ahmad.tarmizi@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25aquantia: Do not purge addresses when setting the number of ringsIzabela Bakollari
IPV6 addresses are purged when setting the number of rx/tx rings using ethtool -G. The function aq_set_ringparam calls dev_close, which removes the addresses. As a solution, call an internal function (aq_ndev_close). Fixes: c1af5427954b ("net: aquantia: Ethtool based ring size configuration") Signed-off-by: Izabela Bakollari <ibakolla@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25xfrm: add extack to xfrm_set_spdinfoSabrina Dubroca
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-11-25xfrm: add extack to xfrm_alloc_userspiSabrina Dubroca
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-11-25xfrm: add extack to xfrm_do_migrateSabrina Dubroca
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-11-25xfrm: add extack to xfrm_new_ae and xfrm_replay_verify_lenSabrina Dubroca
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-11-25xfrm: add extack to xfrm_del_saSabrina Dubroca
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-11-25xfrm: add extack to xfrm_add_sa_expireSabrina Dubroca
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-11-25xfrm: a few coding style clean upsSabrina Dubroca
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-11-25qed: avoid defines prefixed with CONFIGLukas Bulwahn
Defines prefixed with "CONFIG" should be limited to proper Kconfig options, that are introduced in a Kconfig file. Here, constants for bitmap indices of some configs are defined and these defines begin with the config's name, and are suffixed with BITMAP_IDX. To avoid defines prefixed with "CONFIG", name these constants BITMAP_IDX_FOR_CONFIG_XYZ instead of CONFIG_XYZ_BITMAP_IDX. No functional change. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25qlcnic: fix sleep-in-atomic-context bugs caused by msleepDuoming Zhou
The watchdog timer is used to monitor whether the process of transmitting data is timeout. If we use qlcnic driver, the dev_watchdog() that is the timer handler of watchdog timer will call qlcnic_tx_timeout() to process the timeout. But the qlcnic_tx_timeout() calls msleep(), as a result, the sleep-in-atomic-context bugs will happen. The processes are shown below: (atomic context) dev_watchdog qlcnic_tx_timeout qlcnic_83xx_idc_request_reset qlcnic_83xx_lock_driver msleep --------------------------- (atomic context) dev_watchdog qlcnic_tx_timeout qlcnic_83xx_idc_request_reset qlcnic_83xx_lock_driver qlcnic_83xx_recover_driver_lock msleep Fix by changing msleep() to mdelay(), the mdelay() is busy-waiting and the bugs could be mitigated. Fixes: 629263acaea3 ("qlcnic: 83xx CNA inter driver communication mechanism") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25lib/test_rhashtable: Remove set but unused variable 'insert_retries'Jiapeng Chong
Variable 'insert_retries' is not effectively used in the function, so delete it. lib/test_rhashtable.c:437:18: warning: variable 'insert_retries' set but not used. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3242 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25net: stmmac: use sysfs_streq() instead of strncmp()Xu Panda
Replace the open-code with sysfs_streq(). Signed-off-by: Xu Panda <xu.panda@zte.com.cn> Signed-off-by: Yang Yang <yang.yang29@zte.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25zonefs: Fix active zone accountingDamien Le Moal
If a file zone transitions to the offline or readonly state from an active state, we must clear the zone active flag and decrement the active seq file counter. Do so in zonefs_account_active() using the new zonefs inode flags ZONEFS_ZONE_OFFLINE and ZONEFS_ZONE_READONLY. These flags are set if necessary in zonefs_check_zone_condition() based on the result of report zones operation after an IO error. Fixes: 87c9ce3ffec9 ("zonefs: Add active seq file accounting") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
2022-11-25net: phy: add Motorcomm YT8531S phy id.Frank
We added patch for motorcomm.c to support YT8531S. This patch has been tested on AM335x platform which has one YT8531S interface card and passed all test cases. The tested cases indluding: YT8531S UTP function with support of 10M/100M/1000M; YT8531S Fiber function with support of 100M/1000M; and YT8531S Combo function that supports auto detection of media type. Since most functions of YT8531S are similar to YT8521 and we reuse some codes for YT8521 in the patch file. Signed-off-by: Frank <Frank.Sae@motor-comm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25Merge tag 'linux-can-fixes-for-6.1-20221124' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== linux-can-fixes-for-6.1-20221124 this is a pull request of 8 patches for net/master. Ziyang Xuan contributes a patch for the can327, fixing a potential SKB leak when the netdev is down. Heiko Schocher's patch for the sja1000 driver fixes the width of the definition of the OCR_MODE_MASK. Zhang Changzhong contributes 4 patches. In the sja1000_isa, cc770, and m_can_pci drivers the error path in the probe() function and in case of the etas_es58x a function that is called by probe() are fixed. Jiasheng Jiang add a missing check for the return value of the devm_clk_get() in the m_can driver. Yasushi SHOJI's patch for the mcba_usb fixes setting of the external termination resistor. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25vfs: fix copy_file_range() averts filesystem freeze protectionAmir Goldstein
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range(). To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to generic_copy_file_range() was added in nfsd and ksmbd code, but that call is missing sb_start_write(), fsnotify hooks and more. Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that will take care of the fallback, but that code would be subtle and we got vfs_copy_file_range() logic wrong too many times already. Instead, add a flag to explicitly request vfs_copy_file_range() to perform only generic_copy_file_range() and let nfsd and ksmbd use this flag only in the fallback path. This choise keeps the logic changes to minimum in the non-nfsd/ksmbd code paths to reduce the risk of further regressions. Fixes: 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") Tested-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-11-25Merge tag 'amd-drm-fixes-6.1-2022-11-23' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.1-2022-11-23: amdgpu: - DCN 3.1.4 fixes - DP MST DSC deadlock fixes - HMM userptr fixes - Fix Aldebaran CU occupancy reporting - GFX11 fixes - PSP suspend/resume fix - DCE12 KASAN fix - DCN 3.2.x fixes - Rotated cursor fix - SMU 13.x fix - DELL platform suspend/resume fixes - VCN4 SR-IOV fix - Display regression fix for polled connectors Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123143453.8977-1-alexander.deucher@amd.com
2022-11-25Merge tag 'drm-intel-fixes-2022-11-24' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix GVT KVM reference count handling (Sean Christopherson) - Never purge busy TTM objects (Matthew Auld) - Fix warn in intel_display_power_*_domain() functions (Imre Deak) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y38u44hb1LZfZC+M@tursulin-desk
2022-11-25docs/bpf: Add BPF_MAP_TYPE_XSKMAP documentationMaryam Tahhan
Add documentation for BPF_MAP_TYPE_XSKMAP including kernel version introduced, usage and examples. Signed-off-by: Maryam Tahhan <mtahhan@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221123090043.83945-1-mtahhan@redhat.com
2022-11-25samples/bpf: Fix wrong allocation size in xdp_router_ipv4_userRong Tao
prefix_key->data allocates three bytes using alloca(), but four bytes are actually accessed in the program. Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/tencent_F9E2E81922B0C181D05B96DAE5AB0ACE6B06@qq.com
2022-11-25Merge tag 'drm-misc-fixes-2022-11-24' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v6.1-rc7: - Another amdgpu gang submit fix. - Use dma_fence_unwrap_for_each when importing sync files. - Fix race in dma_heap_add(). - Fix use of uninitialized memory in logo. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/a5721505-4823-98ef-7d6f-0ea478221391@linux.intel.com
2022-11-25docs/bpf: Update btf selftests program and add linkRong Tao
Commit c64779e24e88("selftests/bpf: Merge most of test_btf into test_progs") renamed the BTF selftest from 'test_btf.c' to 'prog_tests/btf.c'. Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/tencent_1FA6904156E8E599CAE4ABDBE80F22830106@qq.com
2022-11-24bpf: Don't mark arguments to fentry/fexit programs as trusted.Alexei Starovoitov
The PTR_TRUSTED flag should only be applied to pointers where the verifier can guarantee that such pointers are valid. The fentry/fexit/fmod_ret programs are not in this category. Only arguments of SEC("tp_btf") and SEC("iter") programs are trusted (which have BPF_TRACE_RAW_TP and BPF_TRACE_ITER attach_type correspondingly) This bug was masked because convert_ctx_accesses() was converting trusted loads into BPF_PROBE_MEM loads. Fix it as well. The loads from trusted pointers don't need exception handling. Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221124215314.55890-1-alexei.starovoitov@gmail.com
2022-11-24Merge branch 'bpf: Add bpf_rcu_read_lock() support'Alexei Starovoitov
Yonghong Song says: ==================== Currently, without rcu attribute info in BTF, the verifier treats rcu tagged pointer as a normal pointer. This might be a problem for sleepable program where rcu_read_lock()/unlock() is not available. For example, for a sleepable fentry program, if rcu protected memory access is interleaved with a sleepable helper/kfunc, it is possible the memory access after the sleepable helper/kfunc might be invalid since the object might have been freed then. Even without a sleepable helper/kfunc, without rcu_read_lock() protection, it is possible that the rcu protected object might be release in the middle of bpf program execution which may cause incorrect result. To prevent above cases, enable btf_type_tag("rcu") attributes, introduce new bpf_rcu_read_lock/unlock() kfuncs and add verifier support. In the rest of patch set, Patch 1 enabled btf_type_tag for __rcu attribute. Patche 2 added might_sleep in bpf_func_proto. Patch 3 added new bpf_rcu_read_lock/unlock() kfuncs and verifier support. Patch 4 added some tests for these two new kfuncs. Changelogs: v9 -> v10: . if no rcu tag support in vmlinux btf, using bpf_rcu_read_lock/unlock() will cause verification error. . at bpf_rcu_read_unlock(), invalidate rcu ptr to PTR_UNTRUSTED instead of SCALAR_VALUE. . a few other comment changes and other minor changes. v8 -> v9: . remove sleepable prog check for ld_abs/ind checking in rcu read lock region. . fix a test failure with gcc-compiled kernel. . a couple of other minor fixes. v7 -> v8: . add might_sleep in bpf_func_proto so we can easily identify whether a helper is sleepable or not. . do not enforce rcu rules for sleepable, e.g., rcu dereference must be in a bpf_rcu_read_lock region. This is to keep old code working fine. . Mark 'b' in 'b = a->b' (b is tagged with __rcu) as MEM_RCU only if 'b = a->b' in rcu read region and 'a' is trusted. This adds safety guarantee for 'b' inside the rcu read region. v6 -> v7: . rebase on top of bpf-next. . remove the patch which enables sleepable program using cgrp_local_storage map. This is orthogonal to this patch set and will be addressed separately. . mark the rcu pointer dereference result as UNTRUSTED if inside a bpf_rcu_read_lock() region. v5 -> v6: . fix selftest prog miss_unlock which tested nested locking. . add comments in selftest prog cgrp_succ to explain how to handle nested memory access after rcu memory load. v4 -> v5: . add new test to aarch64 deny list. v3 -> v4: . fix selftest failures when built with gcc. gcc doesn't support btf_type_tag yet and some tests relies on that. skip these tests if vmlinux BTF does not have btf_type_tag("rcu"). v2 -> v3: . went back to MEM_RCU approach with invalidate rcu ptr registers at bpf_rcu_read_unlock() place. . remove KF_RCU_LOCK/UNLOCK flag and compare btf_id at verification time instead. v1 -> v2: . use kfunc instead of helper for bpf_rcu_read_lock/unlock. . not use MEM_RCU bpf_type_flag, instead use active_rcu_lock in reg state to identify rcu ptr's. . Add more self tests. . add new test to s390x deny list. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-24selftests/bpf: Add tests for bpf_rcu_read_lock()Yonghong Song
Add a few positive/negative tests to test bpf_rcu_read_lock() and its corresponding verifier support. The new test will fail on s390x and aarch64, so an entry is added to each of their respective deny lists. Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221124053222.2374650-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-24bpf: Add kfunc bpf_rcu_read_lock/unlock()Yonghong Song
Add two kfunc's bpf_rcu_read_lock() and bpf_rcu_read_unlock(). These two kfunc's can be used for all program types. The following is an example about how rcu pointer are used w.r.t. bpf_rcu_read_lock()/bpf_rcu_read_unlock(). struct task_struct { ... struct task_struct *last_wakee; struct task_struct __rcu *real_parent; ... }; Let us say prog does 'task = bpf_get_current_task_btf()' to get a 'task' pointer. The basic rules are: - 'real_parent = task->real_parent' should be inside bpf_rcu_read_lock region. This is to simulate rcu_dereference() operation. The 'real_parent' is marked as MEM_RCU only if (1). task->real_parent is inside bpf_rcu_read_lock region, and (2). task is a trusted ptr. So MEM_RCU marked ptr can be 'trusted' inside the bpf_rcu_read_lock region. - 'last_wakee = real_parent->last_wakee' should be inside bpf_rcu_read_lock region since it tries to access rcu protected memory. - the ptr 'last_wakee' will be marked as PTR_UNTRUSTED since in general it is not clear whether the object pointed by 'last_wakee' is valid or not even inside bpf_rcu_read_lock region. The verifier will reset all rcu pointer register states to untrusted at bpf_rcu_read_unlock() kfunc call site, so any such rcu pointer won't be trusted any more outside the bpf_rcu_read_lock() region. The current implementation does not support nested rcu read lock region in the prog. Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221124053217.2373910-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-24bpf: Introduce might_sleep field in bpf_func_protoYonghong Song
Introduce bpf_func_proto->might_sleep to indicate a particular helper might sleep. This will make later check whether a helper might be sleepable or not easier. Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221124053211.2373553-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-24compiler_types: Define __rcu as __attribute__((btf_type_tag("rcu")))Yonghong Song
Currently, without rcu attribute info in BTF, the verifier treats rcu tagged pointer as a normal pointer. This might be a problem for sleepable program where rcu_read_lock()/unlock() is not available. For example, for a sleepable fentry program, if rcu protected memory access is interleaved with a sleepable helper/kfunc, it is possible the memory access after the sleepable helper/kfunc might be invalid since the object might have been freed then. To prevent such cases, introducing rcu tagging for memory accesses in verifier can help to reject such programs. To enable rcu tagging in BTF, during kernel compilation, define __rcu as attribute btf_type_tag("rcu") so __rcu information can be preserved in dwarf and btf, and later can be used for bpf prog verification. Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221124053206.2373141-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-24Merge tag 'net-6.1-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from rxrpc, netfilter and xfrm. Current release - regressions: - dccp/tcp: fix bhash2 issues related to WARN_ON() in inet_csk_get_port() - l2tp: don't sleep and disable BH under writer-side sk_callback_lock - eth: ice: fix handling of burst tx timestamps Current release - new code bugs: - xfrm: squelch kernel warning in case XFRM encap type is not available - eth: mlx5e: fix possible race condition in macsec extended packet number update routine Previous releases - regressions: - neigh: decrement the family specific qlen - netfilter: fix ipset regression - rxrpc: fix race between conn bundle lookup and bundle removal [ZDI-CAN-15975] - eth: iavf: do not restart tx queues after reset task failure - eth: nfp: add port from netdev validation for EEPROM access - eth: mtk_eth_soc: fix potential memory leak in mtk_rx_alloc() Previous releases - always broken: - tipc: set con sock in tipc_conn_alloc - nfc: - fix potential memory leaks - fix incorrect sizing calculations in EVT_TRANSACTION - eth: octeontx2-af: fix pci device refcount leak - eth: bonding: fix ICMPv6 header handling when receiving IPv6 messages - eth: prestera: add missing unregister_netdev() in prestera_port_create() - eth: tsnep: fix rotten packets Misc: - usb: qmi_wwan: add support for LARA-L6" * tag 'net-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits) net: thunderx: Fix the ACPI memory leak octeontx2-af: Fix reference count issue in rvu_sdp_init() net: altera_tse: release phylink resources in tse_shutdown() virtio_net: Fix probe failed when modprobe virtio_net net: wwan: t7xx: Fix the ACPI memory leak octeontx2-pf: Add check for devm_kcalloc net: enetc: preserve TX ring priority across reconfiguration net: marvell: prestera: add missing unregister_netdev() in prestera_port_create() nfc: st-nci: fix incorrect sizing calculations in EVT_TRANSACTION nfc: st-nci: fix memory leaks in EVT_TRANSACTION nfc: st-nci: fix incorrect validating logic in EVT_TRANSACTION Documentation: networking: Update generic_netlink_howto URL net/cdc_ncm: Fix multicast RX support for CDC NCM devices with ZLP net: usb: qmi_wwan: add u-blox 0x1342 composition l2tp: Don't sleep and disable BH under writer-side sk_callback_lock net: dm9051: Fix missing dev_kfree_skb() in dm9051_loop_rx() arcnet: fix potential memory leak in com20020_probe() ipv4: Fix error return code in fib_table_insert() net: ethernet: mtk_eth_soc: fix memory leak in error path net: ethernet: mtk_eth_soc: fix resource leak in error path ...
2022-11-24Merge tag 'soc-fixes-6.1-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There are a bunch of late fixes that just came in, in particular a longer series for Rockchips devicetree files, but most of those just address cosmetic errors that were found during the binding validation. There are a couple of code changes: - A regression fix to the IXP42x PCI bus - A fix for a memory leak on optee, and another one for mach-mxs - Two fixes for the sunxi rsb bus driver, to address problems with the shutdown logic The rest are small but important devicetree fixes for a number of individual boards, addressing issues across all platforms: - arm global timer on older rockchip SoCs is unstable and needs to be disabled in favor of a more reliable clocksource - Corrections to fix bluetooth, mmc, and networking on a few Rockchip boards - at91/sam9g20ek UDC needs a pin controller config change - an omap board runs into mmc probe errors because of regulator nodes in the wrong place - imx8mp-evk has a minor inaccuracy with its pin config, but without user visible impact - The Allwinner H6 Hantro G2 video decoder needs an IOMMU reference to prevent the driver from crashing" * tag 'soc-fixes-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits) bus: ixp4xx: Don't touch bit 7 on IXP42x ARM: dts: imx6q-prti6q: Fix ref/tcxo-clock-frequency properties arm64: dts: imx8mp-evk: correct pcie pad settings ARM: mxs: fix memory leak in mxs_machine_init() ARM: dts: at91: sam9g20ek: enable udc vbus gpio pinctrl tee: optee: fix possible memory leak in optee_register_device() arm64: dts: allwinner: h6: Add IOMMU reference to Hantro G2 media: dt-bindings: allwinner: h6-vpu-g2: Add IOMMU reference property bus: sunxi-rsb: Support atomic transfers bus: sunxi-rsb: Remove the shutdown callback ARM: dts: rockchip: disable arm_global_timer on rk3066 and rk3188 arm64: dts: rockchip: Fix Pine64 Quartz4-B PMIC interrupt ARM: dts: am335x-pcm-953: Define fixed regulators in root node ARM: dts: rockchip: rk3188: fix lcdc1-rgb24 node name arm64: dts: rockchip: fix ir-receiver node names ARM: dts: rockchip: fix ir-receiver node names arm64: dts: rockchip: fix adc-keys sub node names ARM: dts: rockchip: fix adc-keys sub node names arm: dts: rockchip: remove clock-frequency from rtc arm: dts: rockchip: fix node name for hym8563 rtc ...
2022-11-24Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Two fixes for 6.1: - fix stacktraces for tracepoint events in Thumb2 mode - fix for noMMU ZERO_PAGE() implementation" * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9266/1: mm: fix no-MMU ZERO_PAGE() implementation ARM: 9251/1: perf: Fix stacktraces for tracepoint events in THUMB2 kernels