linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-12-05	PM / devfreq: event: Use device_match_of_node()	ye xingchen
	Replace the open-code with device_match_of_node(). Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2022-12-05	vfio: Move vfio group specific code into group.c	Yi Liu
	This prepares for compiling out vfio group after vfio device cdev is added. No vfio_group decode code should be in vfio_main.c, and neither device->group reference should be in vfio_main.c. No functional change is intended. Link: https://lore.kernel.org/r/20221201145535.589687-11-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Yu He <yu.he@intel.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-05	vfio: Refactor dma APIs for emulated devices	Yi Liu
	To use group helpers instead of opening group related code in the API. This prepares moving group specific code out of vfio_main.c. Link: https://lore.kernel.org/r/20221201145535.589687-10-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-05	vfio: Wrap vfio group module init/clean code into helpers	Yi Liu
	This wraps the init/clean code of vfio group global variable to be helpers, and prepares for further moving vfio group specific code into separate file. As container is used by group, so vfio_container_init/cleanup() is moved into vfio_group_init/cleanup(). Link: https://lore.kernel.org/r/20221201145535.589687-9-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-05	vfio: Refactor vfio_device open and close	Yi Liu
	This refactor makes the vfio_device_open() to accept device, iommufd_ctx pointer and kvm pointer. These parameters are generic items in today's group path and future device cdev path. Caller of vfio_device_open() should take care the necessary protections. e.g. the current group path need to hold the group_lock to ensure the iommufd_ctx and kvm pointer are valid. This refactor also wraps the group spefcific codes in the device open and close paths to be paired helpers like: - vfio_device_group_open/close(): call vfio_device_open/close() - vfio_device_group_use/unuse_iommu(): this pair is container specific. iommufd vs. container is selected in vfio_device_first_open(). Such helpers are supposed to be moved to group.c. While iommufd related codes will be kept in the generic helpers since future device cdev path also need to handle iommufd. Link: https://lore.kernel.org/r/20221201145535.589687-8-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-05	vfio: Make vfio_device_open() truly device specific	Yi Liu
	Then move group related logic into vfio_device_open_file(). Accordingly introduce a vfio_device_close() to pair up. Link: https://lore.kernel.org/r/20221201145535.589687-7-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-05	vfio: Swap order of vfio_device_container_register() and open_device()	Yi Liu
	This makes the DMA unmap callback registration to container be consistent across the vfio iommufd compat mode and the legacy container mode. In the vfio iommufd compat mode, this registration is done in the vfio_iommufd_bind() when creating access which has an unmap callback. This is prior to calling the open_device() op. The existing mdev drivers have been converted to be OK with this order. So it is ok to swap the order of vfio_device_container_register() and open_device() for legacy mode. This also prepares for further moving group specific code into separate source file. Link: https://lore.kernel.org/r/20221201145535.589687-6-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-05	vfio: Set device->group in helper function	Yi Liu
	This avoids referencing device->group in __vfio_register_dev(). Link: https://lore.kernel.org/r/20221201145535.589687-5-yi.l.liu@intel.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-05	vfio: Create wrappers for group register/unregister	Yi Liu
	This avoids decoding group fields in the common functions used by vfio_device registration, and prepares for further moving the vfio group specific code into separate file. Link: https://lore.kernel.org/r/20221201145535.589687-4-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-05	vfio: Move the sanity check of the group to vfio_create_group()	Jason Gunthorpe
	This avoids opening group specific code in __vfio_register_dev() for the sanity check if an (existing) group is not corrupted by having two copies of the same struct device in it. It also simplifies the error unwind for this sanity check since the failure can be detected in the group allocation. This also prepares for moving the group specific code into separate group.c. Grabbed from: https://lore.kernel.org/kvm/20220922152338.2a2238fe.alex.williamson@redhat.com/ Link: https://lore.kernel.org/r/20221201145535.589687-3-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
2022-12-05	vfio: Simplify vfio_create_group()	Jason Gunthorpe
	The vfio.group_lock is now only used to serialize vfio_group creation and destruction, we don't need a micro-optimization of searching, unlocking, then allocating and searching again. Just hold the lock the whole time. Grabbed from: https://lore.kernel.org/kvm/20220922152338.2a2238fe.alex.williamson@redhat.com/ Link: https://lore.kernel.org/r/20221201145535.589687-2-yi.l.liu@intel.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Lixiao Yang <lixiao.yang@intel.com> Tested-by: Yu He <yu.he@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
2022-12-05	xen/privcmd: Fix a possible warning in privcmd_ioctl_mmap_resource()	Harshit Mogalapalli
	As 'kdata.num' is user-controlled data, if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Call trace: -> privcmd_ioctl --> privcmd_ioctl_mmap_resource Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: 3ad0876554ca ("xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221126050745.778967-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-05	ipmi/watchdog: use strscpy() to instead of strncpy()	yang.yang29@zte.com.cn
	Xu Panda <xu.panda@zte.com.cn> The implementation of strscpy() is more robust and safer. That's now the recommended way to copy NUL terminated strings. Signed-off-by: Xu Panda <xu.panda@zte.com.cn> Signed-off-by: Yang Yang <yang.yang29@zte.com> Message-Id: <202212051936400309332@zte.com.cn> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2022-12-05	x86/xen: Fix memory leak in xen_init_lock_cpu()	Xiu Jianfeng
	In xen_init_lock_cpu(), the @name has allocated new string by kasprintf(), if bind_ipi_to_irqhandler() fails, it should be freed, otherwise may lead to a memory leak issue, fix it. Fixes: 2d9e1e2f58b5 ("xen: implement Xen-specific spinlocks") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221123155858.11382-3-xiujianfeng@huawei.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-05	x86/xen: Fix memory leak in xen_smp_intr_init{_pv}()	Xiu Jianfeng
	These local variables @{resched\|pmu\|callfunc...}_name saves the new string allocated by kasprintf(), and when bind_{v}ipi_to_irqhandler() fails, it goes to the @fail tag, and calls xen_smp_intr_free{_pv}() to free resource, however the new string is not saved, which cause a memory leak issue. fix it. Fixes: 9702785a747a ("i386: move xen") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221123155858.11382-2-xiujianfeng@huawei.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-05	xen: fix xen.h build for CONFIG_XEN_PVH=y	Jani Nikula
	For CONFIG_XEN_PVH=y, xen.h uses bool before the type is known. Include <linux/types.h> earlier. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221123131057.3864183-1-jani.nikula@intel.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-05	xen/virtio: Handle PCI devices which Host controller is described in DT	Oleksandr Tyshchenko
	Use the same "xen-grant-dma" device concept for the PCI devices behind device-tree based PCI Host controller, but with one modification. Unlike for platform devices, we cannot use generic IOMMU bindings (iommus property), as we need to support more flexible configuration. The problem is that PCI devices under the single PCI Host controller may have the backends running in different Xen domains and thus have different endpoints ID (backend domains ID). Add ability to deal with generic PCI-IOMMU bindings (iommu-map/ iommu-map-mask properties) which allows us to describe relationship between PCI devices and backend domains ID properly. To avoid having to look up for the PCI Host bridge twice and reduce the amount of checks pass an extra struct device_node *np to xen_dt_grant_init_backend_domid(). So with current patch the code expects iommus property for the platform devices and iommu-map/iommu-map-mask properties for PCI devices. The example of generated by the toolstack iommu-map property for two PCI devices 0000:00:01.0 and 0000:00:02.0 whose backends are running in different Xen domains with IDs 1 and 2 respectively: iommu-map = <0x08 0xfde9 0x01 0x08 0x10 0xfde9 0x02 0x08>; Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Xenia Ragiadakou <burzalodowa@gmail.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20221025162004.8501-3-olekstysh@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-05	xen/virtio: Optimize the setup of "xen-grant-dma" devices	Oleksandr Tyshchenko
	This is needed to avoid having to parse the same device-tree several times for a given device. For this to work we need to install the xen_virtio_restricted_mem_acc callback in Arm's xen_guest_init() which is same callback as x86's PV and HVM modes already use and remove the manual assignment in xen_setup_dma_ops(). Also we need to split the code to initialize backend_domid into a separate function. Prior to current patch we parsed the device-tree three times: 1. xen_setup_dma_ops()->...->xen_is_dt_grant_dma_device() 2. xen_setup_dma_ops()->...->xen_dt_grant_init_backend_domid() 3. xen_virtio_mem_acc()->...->xen_is_dt_grant_dma_device() With current patch we parse the device-tree only once in xen_virtio_restricted_mem_acc()->...->xen_dt_grant_init_backend_domid() Other benefits are: - Not diverge from x86 when setting up Xen grant DMA ops - Drop several global functions Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Xenia Ragiadakou <burzalodowa@gmail.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20221025162004.8501-2-olekstysh@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2022-12-05	net: phy: mxl-gpy: rename MMD_VEND1 macros to match datasheet	Michael Walle
	Rename the temperature sensors macros to match the names in the datasheet. Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	net: mvneta: Prevent out of bounds read in mvneta_config_rss()	Dan Carpenter
	The pp->indir[0] value comes from the user. It is passed to: if (cpu_online(pp->rxq_def)) inside the mvneta_percpu_elect() function. It needs bounds checkeding to ensure that it is not beyond the end of the cpu bitmap. Fixes: cad5d847a093 ("net: mvneta: Fix the CPU choice in mvneta_percpu_elect") Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	nfp: add support for multicast filter	Diana Wang
	Rewrite nfp_net_set_rx_mode() to implement interface to delivery mc address and operations to firmware by using general mailbox for filtering multicast packets. The operations include add mc address and delete mc address. And the limitation of mc addresses number is 1024 for each net device. User triggers adding mc address by using command below: ip maddress add <mc address> dev <interface name> User triggers deleting mc address by using command below: ip maddress del <mc address> dev <interface name> Signed-off-by: Diana Wang <na.wang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	xen-netfront: Fix NULL sring after live migration	Lin Liu
	A NAPI is setup for each network sring to poll data to kernel The sring with source host is destroyed before live migration and new sring with target host is setup after live migration. The NAPI for the old sring is not deleted until setup new sring with target host after migration. With busy_poll/busy_read enabled, the NAPI can be polled before got deleted when resume VM. BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: xennet_poll+0xae/0xd20 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI Call Trace: finish_task_switch+0x71/0x230 timerqueue_del+0x1d/0x40 hrtimer_try_to_cancel+0xb5/0x110 xennet_alloc_rx_buffers+0x2a0/0x2a0 napi_busy_loop+0xdb/0x270 sock_poll+0x87/0x90 do_sys_poll+0x26f/0x580 tracing_map_insert+0x1d4/0x2f0 event_hist_trigger+0x14a/0x260 finish_task_switch+0x71/0x230 __schedule+0x256/0x890 recalc_sigpending+0x1b/0x50 xen_sched_clock+0x15/0x20 __rb_reserve_next+0x12d/0x140 ring_buffer_lock_reserve+0x123/0x3d0 event_triggers_call+0x87/0xb0 trace_event_buffer_commit+0x1c4/0x210 xen_clocksource_get_cycles+0x15/0x20 ktime_get_ts64+0x51/0xf0 SyS_ppoll+0x160/0x1a0 SyS_ppoll+0x160/0x1a0 do_syscall_64+0x73/0x130 entry_SYSCALL_64_after_hwframe+0x41/0xa6 ... RIP: xennet_poll+0xae/0xd20 RSP: ffffb4f041933900 CR2: 0000000000000008 ---[ end trace f8601785b354351c ]--- xen frontend should remove the NAPIs for the old srings before live migration as the bond srings are destroyed There is a tiny window between the srings are set to NULL and the NAPIs are disabled, It is safe as the NAPI threads are still frozen at that time Signed-off-by: Lin Liu <lin.liu@citrix.com> Fixes: 4ec2411980d0 ([NET]: Do not check netif_running() and carrier state in ->poll()) Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	net: microchip: sparx5: correctly free skb in xmit	Casper Andersson
	consume_skb on transmitted, kfree_skb on dropped, do not free on TX_BUSY. Previously the xmit function could return -EBUSY without freeing, which supposedly is interpreted as a drop. And was using kfree on successfully transmitted packets. sparx5_fdma_xmit and sparx5_inject returns error code, where -EBUSY indicates TX_BUSY and any other error code indicates dropped. Fixes: f3cad2611a77 ("net: sparx5: add hostmode with phylink support") Signed-off-by: Casper Andersson <casper.casan@gmail.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	octeontx2-pf: Fix potential memory leak in otx2_init_tc()	Ziyang Xuan
	In otx2_init_tc(), if rhashtable_init() failed, it does not free tc->tc_entries_bitmap which is allocated in otx2_tc_alloc_ent_bitmap(). Fixes: 2e2a8126ffac ("octeontx2-pf: Unify flow management variables") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	net: ipa: use sysfs_emit() to instead of scnprintf()	ye xingchen
	Follow the advice of the Documentation/filesystems/sysfs.rst and show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	net: mdiobus: fix double put fwnode in the error path	Yang Yingliang
	If phy_device_register() or fwnode_mdiobus_phy_device_register() fail, phy_device_free() is called, the device refcount is decreased to 0, then fwnode_handle_put() will be called in phy_device_release(), but in the error path, fwnode_handle_put() has already been called, so set fwnode to NULL after fwnode_handle_put() in the error path to avoid double put. Fixes: cdde1560118f ("net: mdiobus: fix unbalanced node reference count") Reported-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	Merge tag 'rxrpc-next-20221201-b' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Increasing SACK size and moving away from softirq, parts 2 & 3 Here are the second and third parts of patches in the process of moving rxrpc from doing a lot of its stuff in softirq context to doing it in an I/O thread in process context and thereby making it easier to support a larger SACK table. The full description is in the description for the first part[1] which is already in net-next. The second part includes some cleanups, adds some testing and overhauls some tracing: (1) Remove declaration of rxrpc_kernel_call_is_complete() as the definition is no longer present. (2) Remove the knet() and kproto() macros in favour of using tracepoints. (3) Remove handling of duplicate packets from recvmsg. The input side isn't now going to insert overlapping/duplicate packets into the recvmsg queue. (4) Don't use the rxrpc_conn_parameters struct in the rxrpc_connection or rxrpc_bundle structs - rather put the members in directly. (5) Extract the abort code from a received abort packet right up front rather than doing it in multiple places later. (6) Use enums and symbol lists rather than __builtin_return_address() to indicate where a tracepoint was triggered for local, peer, conn, call and skbuff tracing. (7) Add a refcount tracepoint for the rxrpc_bundle struct. (8) Implement an in-kernel server for the AFS rxperf testing program to talk to (enabled by a Kconfig option). This is tagged as rxrpc-next-20221201-a. The third part introduces the I/O thread and switches various bits over to running there: (1) Fix call timers and call and connection workqueues to not hold refs on the rxrpc_call and rxrpc_connection structs to thereby avoid messy cleanup when the last ref is put in softirq mode. (2) Split input.c so that the call packet processing bits are separate from the received packet distribution bits. Call packet processing gets bumped over to the call event handler. (3) Create a per-local endpoint I/O thread. Barring some tiny bits that still get done in softirq context, all packet reception, processing and transmission is done in this thread. That will allow a load of locking to be removed. (4) Perform packet processing and error processing from the I/O thread. (5) Provide a mechanism to process call event notifications in the I/O thread rather than queuing a work item for that call. (6) Move data and ACK transmission into the I/O thread. ACKs can then be transmitted at the point they're generated rather than getting delegated from softirq context to some process context somewhere. (7) Move call and local processor event handling into the I/O thread. (8) Move cwnd degradation to after packets have been transmitted so that they don't shorten the window too quickly. A bunch of simplifications can then be done: (1) The input_lock is no longer necessary as exclusion is achieved by running the code in the I/O thread only. (2) Don't need to use sk->sk_receive_queue.lock to guard socket state changes as the socket mutex should suffice. (3) Don't take spinlocks in RCU callback functions as they get run in softirq context and thus need _bh annotations. (4) RCU is then no longer needed for the peer's error_targets list. (5) Simplify the skbuff handling in the receive path by dropping the ref in the basic I/O thread loop and getting an extra ref as and when we need to queue the packet for recvmsg or another context. (6) Get the peer address earlier in the input process and pass it to the users so that we only do it once. This is tagged as rxrpc-next-20221201-b. Changes: ======== ver #2) - Added a patch to change four assertions into warnings in rxrpc_read() and fixed a checker warning from a __user annotation that should have been removed.. - Change a min() to min_t() in rxperf as PAGE_SIZE doesn't seem to match type size_t on i386. - Three error handling issues in rxrpc_new_incoming_call(): - If not DATA or not seq #1, should drop the packet, not abort. - Fix a goto that went to the wrong place, dropping a non-held lock. - Fix an rcu_read_lock that should've been an unlock. Tested-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: kafs-testing+fedora36_64checkkafs-build-144@auristor.com Link: https://lore.kernel.org/r/166794587113.2389296.16484814996876530222.stgit@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/166982725699.621383.2358362793992993374.stgit@warthog.procyon.org.uk/ # v1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	uprobes/x86: Allow to probe a NOP instruction with 0x66 prefix	Oleg Nesterov
	Intel ICC -hotpatch inserts 2-byte "0x66 0x90" NOP at the start of each function to reserve extra space for hot-patching, and currently it is not possible to probe these functions because branch_setup_xol_ops() wrongly rejects NOP with REP prefix as it treats them like word-sized branch instructions. Fixes: 250bbd12c2fe ("uprobes/x86: Refuse to attach uprobe to "word-sized" branch insns") Reported-by: Seiji Nishikawa <snishika@redhat.com> Suggested-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20221204173933.GA31544@redhat.com
2022-12-05	udf: Increase UDF_MAX_READ_VERSION to 0x0260	Bartosz Taudul
	Some discs containing the UDF file system are unable to be mounted, failing with the following message: UDF-fs: error (device sr0): udf_fill_super: minUDFReadRev=260 (max is 250) The UDF 2.60 specification [0] states in the section Basic Restrictions & Requirements (page 10): The Minimum UDF Read Revision value shall be at most #0250 for all media with a UDF 2.60 file system. This indicates that a UDF 2.50 implementation can read all UDF 2.60 media. Media that do not have a Metadata Partition may use a value lower than #250. The conclusion is that the discs failing to mount were burned with a faulty software, which didn't follow the specification. This can be worked around by increasing UDF_MAX_READ_VERSION to 0x260, to match the Minimum Read Revision. No other changes are required, as reading UDF 2.60 is backward compatible with UDF 2.50. [0] http://www.osta.org/specs/pdf/udf260.pdf Signed-off-by: Bartosz Taudul <wolf@nereid.pl> Signed-off-by: Jan Kara <jack@suse.cz>
2022-12-05	Merge branch irq/misc-6.2 into irq/irqchip-next	Marc Zyngier
	* irq/misc-6.2: : . : Random minor fixes and improvments: : : - More Loongson fixes after the Loongarch merge : : - Error handling fixes for wpcm450, GIC... : : - BE detection for a FSL controller : : - Declare the Sifive PLIC as wake-up agnostic : : - Simplify fishing out the device data for the ST irqchip : : - Mark some data structures as __initconst in the apple-aic driver : : - Switch over from strtobool to kstrtobool : : - COMPILET_TEST fixes : : - and the mandatory "repeated word" commit... : . irqchip/ls-extirq: Fix endianness detection irqchip/gic: Use kstrtobool() instead of strtobool() irqchip/sifive-plic: Support wake IRQs irqchip/loongson-liointc: Fix improper error handling in liointc_init() irqchip/sl28cpld: Replace irqchip mask_invert with unmask_base irqchip/wpcm450: Fix memory leak in wpcm450_aic_of_init() irqchip/st: Use device_get_match_data() to simplify the code irqchip/al-fic: Drop obsolete dependency on COMPILE_TEST irqchip: gic-pm: Use pm_runtime_resume_and_get() in gic_probe() irqchip/mips-gic: Drop repeated word in comment irqchip/apple-aic: Mark aic_info structs __initconst Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05	Merge branch irq/cirq-v2 into irq/irqchip-next	Marc Zyngier
	* irq/cirq-v2: : . : Support for the MTK CIRQv2, courtesy of AngeloGioacchino Del Regno: : : "On newer SoCs (like MT8192/95 and also other non-chromebook chips), the : MediaTek CIRQ controller has a new register layout: this series adds : some more flexibility to the irq-mtk-cirq driver, allowing to select : the register layout based on a SoC-specific compatible." : : . irqchip/irq-mtk-cirq: Add support for System CIRQ on MT8192 irqchip/irq-mtk-cirq: Move register offsets to const array dt-bindings: interrupt-controller: mediatek,cirq: Document MT8192 dt-bindings: interrupt-controller: mediatek,cirq: Migrate to dt schema Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05	Merge branch irq/loongarch-of into irq/irqchip-next	Marc Zyngier
	* irq/loongarch-of: : . : Initial OF support for LoongArch. Funny how it only took : one release from plumbing ACPI into an unsuspecting : architecture to start enabling OF on it. Oh well... : . irqchip/loongarch-cpu: Fix a missing prototype warning dt-bindings: interrupt-controller: add yaml for LoongArch CPU interrupt controller irqchip: loongarch-cpu: add DT support Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05	irqchip/loongarch-cpu: Fix a missing prototype warning	Huacai Chen
	1, Rename loongarch_cpu_irq_of_init() to cpuintc_of_init() in order to keep the same style as the ACPI version. 2, Fix a missing prototype warning by adding a "static" modifier. Fixes: 855d4ca4bdb366aab3d4 ("irqchip: loongarch-cpu: add DT support") Reported-by: kernel test robot <lkp@intel.com> Cc: Peibao Liu <liupeibao@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221205044708.2054022-1-chenhuacai@loongson.cn
2022-12-05	irqchip/ls-extirq: Fix endianness detection	Sean Anderson
	parent is the interrupt parent, not the parent of node. Use node->parent. This fixes endianness detection on big-endian platforms. Fixes: 1b00adce8afd ("irqchip/ls-extirq: Fix invalid wait context by avoiding to use regmap") Signed-off-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221201212807.616191-1-sean.anderson@seco.com
2022-12-05	x86/mtrr: Make message for disabled MTRRs more descriptive	Juergen Gross
	Instead of just saying "Disabled" when MTRRs are disabled for any reason, tell what is disabled and why. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20221205080433.16643-3-jgross@suse.com
2022-12-05	x86/pat: Handle TDX guest PAT initialization	Juergen Gross
	With the decoupling of PAT and MTRR initialization, PAT will be used even with MTRRs disabled. This seems to break booting up as TDX guest, as the recommended sequence to set the PAT MSR across CPUs can't work in TDX guests due to disabling caches via setting CR0.CD isn't allowed in TDX mode. This is an inconsistency in the Intel documentation between the SDM and the TDX specification. For now handle TDX mode the same way as Xen PV guest mode by just accepting the current PAT MSR setting without trying to modify it. [ bp: Align conditions for better readability. ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20221205080433.16643-2-jgross@suse.com
2022-12-05	Documentation/features: Use loongarch instead of loong	Tiezhu Yang
	The official arch name is LoongArch [1], we should use small letter loongarch instead of loong in Documentation/features, just use the features-refresh.sh to refresh all the related files. [1] https://www.kernel.org/doc/html/latest/loongarch/index.html Fixes: 5860800e8696 ("Documentation/features: Update the arch support status files") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/1670156327-9631-3-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-12-05	Documentation/features-refresh.sh: Only sed the beginning "arch" of ARCH_DIR	Tiezhu Yang
	It should only sed the beginning "arch" of ARCH_DIR in features-refresh.sh, otherwise loongarch is recognized as loong, that is not what we want. Fixes: be99f610a110 ("Documentation/features: Add script that refreshes the arch support status files in place") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/1670156327-9631-2-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-12-05	net: encx24j600: Fix invalid logic in reading of MISTAT register	Valentina Goncharenko
	A loop for reading MISTAT register continues while regmap_read() fails and (mistat & BUSY), but if regmap_read() fails a value of mistat is undefined. The patch proposes to check for BUSY flag only when regmap_read() succeed. Compile test only. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d70e53262f5c ("net: Microchip encx24j600 driver") Signed-off-by: Valentina Goncharenko <goncharenko.vp@ispras.ru> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	net: encx24j600: Add parentheses to fix precedence	Valentina Goncharenko
	In functions regmap_encx24j600_phy_reg_read() and regmap_encx24j600_phy_reg_write() in the conditions of the waiting cycles for filling the variable 'ret' it is necessary to add parentheses to prevent wrong assignment due to logical operations precedence. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d70e53262f5c ("net: Microchip encx24j600 driver") Signed-off-by: Valentina Goncharenko <goncharenko.vp@ispras.ru> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	net: stmmac: tegra: Add MGBE support	Bhadram Varka
	Add support for the Multi-Gigabit Ethernet (MGBE/XPCS) IP found on NVIDIA Tegra234 SoCs. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Bhadram Varka <vbhadram@nvidia.com> Co-developed-by: Revanth Kumar Uppala <ruppala@nvidia.com> Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com> Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	net: stmmac: Power up SERDES after the PHY link	Revanth Kumar Uppala
	The Tegra MGBE ethernet controller requires that the SERDES link is powered-up after the PHY link is up, otherwise the link fails to become ready following a resume from suspend. Add a variable to indicate that the SERDES link must be powered-up after the PHY link. Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com> Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-05	xfrm: document IPsec packet offload mode	Leon Romanovsky
	Extend XFRM device offload API description with newly added packet offload mode. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-05	xfrm: add support to HW update soft and hard limits	Leon Romanovsky
	Both in RX and TX, the traffic that performs IPsec packet offload transformation is accounted by HW. It is needed to properly handle hard limits that require to drop the packet. It means that XFRM core needs to update internal counters with the one that accounted by the HW, so new callbacks are introduced in this patch. In case of soft or hard limit is occurred, the driver should call to xfrm_state_check_expire() that will perform key rekeying exactly as done by XFRM core. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-05	xfrm: speed-up lookup of HW policies	Leon Romanovsky
	Devices that implement IPsec packet offload mode should offload SA and policies too. In RX path, it causes to the situation that HW will always have higher priority over any SW policies. It means that we don't need to perform any search of inexact policies and/or priority checks if HW policy was discovered. In such situation, the HW will catch the packets anyway and HW can still implement inexact lookups. In case specific policy is not found, we will continue with packet lookup and check for existence of HW policies in inexact list. HW policies are added to the head of SPD to ensure fast lookup, as XFRM iterates over all policies in the loop. The same solution of adding HW SAs at the begging of the list is applied to SA database too. However, we don't need to change lookups as they are sorted by insertion order and not priority. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-05	xfrm: add RX datapath protection for IPsec packet offload mode	Leon Romanovsky
	Traffic received by device with enabled IPsec packet offload should be forwarded to the stack only after decryption, packet headers and trailers removed. Such packets are expected to be seen as normal (non-XFRM) ones, while not-supported packets should be dropped by the HW. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-05	xfrm: add TX datapath support for IPsec packet offload mode	Leon Romanovsky
	In IPsec packet mode, the device is going to encrypt and encapsulate packets that are associated with offloaded policy. After successful policy lookup to indicate if packets should be offloaded or not, the stack forwards packets to the device to do the magic. Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Huy Nguyen <huyn@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-05	docs/zh_CN: Fix '.. only::' directive's expression	Akira Yokosawa
	Commit febe6c2f859e ("docs/zh_CN: Add translation zh_CN/doc-guide/index.rst") translated ".. only::" directive too much. Use the one as found in the original doc-guide/index.rst. Fixes: febe6c2f859e ("docs/zh_CN: Add translation zh_CN/doc-guide/index.rst") Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Cc: Wu XiangCheng <bobwxc@email.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Alex Shi <alexs@kernel.org> Reviewed-by: Wu XiangCheng <bobwxc@email.cn> Link: https://lore.kernel.org/r/20221205032622.8697-1-akiyks@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-12-05	xfrm: add an interface to offload policy	Leon Romanovsky
	Extend netlink interface to add and delete XFRM policy from the device. This functionality is a first step to implement packet IPsec offload solution. Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-12-05	xfrm: allow state packet offload mode	Leon Romanovsky
	Allow users to configure xfrm states with packet offload mode. The packet mode must be requested both for policy and state, and such requires us to do not implement fallback. We explicitly return an error if requested packet mode can't be configured. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>