summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2021-04-09Merge branch 'mlx5-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next 2021-04-09 This pr contains changes from mlx5-next branch, already reviewed on netdev and rdma mailing lists, links below. 1) From Leon, Dynamically assign MSI-X vectors count Already Acked by Bjorn Helgaas. https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/ 2) Cleanup series: https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/ From Mark, E-Switch cleanups and refactoring, and the addition of single FDB mode needed HW bits. From Mikhael, Remove unused struct field From Saeed, Cleanup W=1 prototype warning From Zheng, Esw related cleanup From Tariq, User order-0 page allocation for EQs * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks net/mlx5: Dynamically assign MSI-X vectors count net/mlx5: Add dynamic MSI-X capabilities bits PCI/IOV: Add sysfs MSI-X vector assignment interface net/mlx5: Use order-0 allocations for EQs net/mlx5: Add IFC bits needed for single FDB mode net/mlx5: E-Switch, Refactor send to vport to be more generic RDMA/mlx5: Use representor E-Switch when getting netdev and metadata net/mlx5: E-Switch, Add eswitch pointer to each representor net/mlx5: E-Switch, Add match on vhca id to default send rules net/mlx5: Remove unused mlx5_core_health member recover_work net/mlx5: simplify the return expression of mlx5_esw_offloads_pair() net/mlx5: Cleanup prototype warning ==================== Link: https://lore.kernel.org/r/20210409200704.10886-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-09Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "14 patches. Subsystems affected by this patch series: mm (kasan, gup, pagecache, and kfence), MAINTAINERS, mailmap, nds32, gcov, ocfs2, ia64, and lib" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS kfence, x86: fix preemptible warning on KPTI-enabled systems lib/test_kasan_module.c: suppress unused var warning kasan: fix conflict with page poisoning fs: direct-io: fix missing sdio->boundary ia64: fix user_stack_pointer() for ptrace() ocfs2: fix deadlock between setattr and dio_end_io_write gcov: re-fix clang-11+ support nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff mm/gup: check page posion status for coredump. .mailmap: fix old email addresses mailmap: update email address for Jordan Crouse treewide: change my e-mail address, fix my name MAINTAINERS: update CZ.NIC's Turris information
2021-04-09net: phy: make PHY PM ops a no-op if MAC driver manages PHY PMHeiner Kallweit
Resume callback of the PHY driver is called after the one for the MAC driver. The PHY driver resume callback calls phy_init_hw(), and this is potentially problematic if the MAC driver calls phy_start() in its resume callback. One issue was reported with the fec driver and a KSZ8081 PHY which seems to become unstable if a soft reset is triggered during aneg. The new flag allows MAC drivers to indicate that they take care of suspending/resuming the PHY. Then the MAC PM callbacks can handle any dependency between MAC and PHY PM. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-09Merge tag 'net-5.12-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.12-rc7, including fixes from can, ipsec, mac80211, wireless, and bpf trees. No scary regressions here or in the works, but small fixes for 5.12 changes keep coming. Current release - regressions: - virtio: do not pull payload in skb->head - virtio: ensure mac header is set in virtio_net_hdr_to_skb() - Revert "net: correct sk_acceptq_is_full()" - mptcp: revert "mptcp: provide subflow aware release function" - ethernet: lan743x: fix ethernet frame cutoff issue - dsa: fix type was not set for devlink port - ethtool: remove link_mode param and derive link params from driver - sched: htb: fix null pointer dereference on a null new_q - wireless: iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_enqueue_hcmd() - wireless: iwlwifi: fw: fix notification wait locking - wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding the rtnl dependency Current release - new code bugs: - napi: fix hangup on napi_disable for threaded napi - bpf: take module reference for trampoline in module - wireless: mt76: mt7921: fix airtime reporting and related tx hangs - wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending config command Previous releases - regressions: - rfkill: revert back to old userspace API by default - nfc: fix infinite loop, refcount & memory leaks in LLCP sockets - let skb_orphan_partial wake-up waiters - xfrm/compat: Cleanup WARN()s that can be user-triggered - vxlan, geneve: do not modify the shared tunnel info when PMTU triggers an ICMP reply - can: fix msg_namelen values depending on CAN_REQUIRED_SIZE - can: uapi: mark union inside struct can_frame packed - sched: cls: fix action overwrite reference counting - sched: cls: fix err handler in tcf_action_init() - ethernet: mlxsw: fix ECN marking in tunnel decapsulation - ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx - ethernet: i40e: fix receiving of single packets in xsk zero-copy mode - ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic Previous releases - always broken: - bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET - bpf: Refcount task stack in bpf_get_task_stack - bpf, x86: Validate computation of branch displacements - ieee802154: fix many similar syzbot-found bugs - fix NULL dereferences in netlink attribute handling - reject unsupported operations on monitor interfaces - fix error handling in llsec_key_alloc() - xfrm: make ipv4 pmtu check honor ip header df - xfrm: make hash generation lock per network namespace - xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp offload - ethtool: fix incorrect datatype in set_eee ops - xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory model - openvswitch: fix send of uninitialized stack memory in ct limit reply Misc: - udp: add get handling for UDP_GRO sockopt" * tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (182 commits) net: fix hangup on napi_disable for threaded napi net: hns3: Trivial spell fix in hns3 driver lan743x: fix ethernet frame cutoff issue net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits net: dsa: lantiq_gswip: Don't use PHY auto polling net: sched: sch_teql: fix null-pointer dereference ipv6: report errors for iftoken via netlink extack net: sched: fix err handler in tcf_action_init() net: sched: fix action overwrite reference counting Revert "net: sched: bump refcount for new action in ACT replace mode" ice: fix memory leak of aRFS after resuming from suspend i40e: Fix sparse warning: missing error code 'err' i40e: Fix sparse error: 'vsi->netdev' could be null i40e: Fix sparse error: uninitialized symbol 'ring' i40e: Fix sparse errors in i40e_txrx.c i40e: Fix parameters in aq_get_phy_register() nl80211: fix beacon head validation bpf, x86: Validate computation of branch displacements for x86-32 bpf, x86: Validate computation of branch displacements for x86-64 ...
2021-04-09treewide: change my e-mail address, fix my nameMarek Behún
Change my e-mail address to kabel@kernel.org, and fix my name in non-code parts (add diacritical mark). Link: https://lkml.kernel.org/r/20210325171123.28093-2-kabel@kernel.org Signed-off-by: Marek Behún <kabel@kernel.org> Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jassi Brar <jassisinghbrar@gmail.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-10Merge tag 'drm-misc-next-2021-04-09' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.13: UAPI Changes: Cross-subsystem Changes: Core Changes: - bridge: Fix Kconfig dependency - cmdline: Refuse zero width/height mode - ttm: Ignore signaled move fences, ioremap buffer according to mem caching settins Driver Changes: - Conversions to sysfs_emit - tegra: Don't register DP AUX channels before connectors - zynqmp: Fix for an out-of-bound (but within struct padding) memset Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210409090020.jroa2d4p4qansrpa@gilmour
2021-04-09Merge tag 'qcom-drivers-for-5.13-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/drivers More Qualcomm driver updates for 5.13 This improves the Qualcomm SCM driver logic related to detecting the calling convention, in particular on SC7180, and fixes a few small issues in the same. It introduces additonal sanity checks of the size of loaded segments in the MDT loader and adds a missing error in the return path of pdr_register_listener(). It makes it possible to specify the OEM specific firmware path in the wcn36xx control (and WiFi) driver. Lastly it adds a missing path specifier in the MAINTAINERS' entry and fixes a bunch of kerneldoc issues in various drivers. * tag 'qcom-drivers-for-5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: soc: qcom: mdt_loader: Detect truncated read of segments soc: qcom: mdt_loader: Validate that p_filesz < p_memsz soc: qcom: pdr: Fix error return code in pdr_register_listener firmware: qcom_scm: Fix kernel-doc function names to match firmware: qcom_scm: Suppress sysfs bind attributes firmware: qcom_scm: Workaround lack of "is available" call on SC7180 firmware: qcom_scm: Reduce locking section for __get_convention() firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool soc: qcom: wcnss_ctrl: Allow reading firmware-name from DT soc: qcom: wcnss_ctrl: Introduce local variable "dev" dt-bindings: soc: qcom: wcnss: Add firmware-name property soc: qcom: address kernel-doc warnings MAINTAINERS: add another entry for ARM/QUALCOMM SUPPORT Link: https://lore.kernel.org/r/20210409162001.775851-1-bjorn.andersson@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-09usb: typec: Add typec_port_register_altmodes()Hans de Goede
This can be used by Type-C controller drivers which use a standard usb-connector fwnode, with altmodes sub-node, to describe the available altmodes. Note there are is no devicetree bindings documentation for the altmodes node, this is deliberate. ATM the fwnodes used to register the altmodes are only used internally to pass platform info from a drivers/platform/x86 driver to the type-c subsystem. When a devicetree user of this functionally comes up and the dt-bindings have been hashed out the internal use can be adjusted to match the dt-bindings. Currently the typec_port_register_altmodes() function expects an "altmodes" child fwnode on port->dev with this "altmodes" fwnode having child fwnodes itself with each child containing 2 integer properties: 1. A "svid" property, which sets the id of the altmode, e.g. displayport altmode has a svid of 0xff01. 2. A "vdo" property, typically used as a bitmask describing the capabilities of the altmode, the bits in the vdo are specified in the specification of the altmode. Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210409134033.105834-2-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-09usb: Iterator for portsHeikki Krogerus
Introducing usb_for_each_port(). It works the same way as usb_for_each_dev(), but instead of going through every USB device in the system, it walks through the USB ports in the system. Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20210407065555.88110-4-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-09usb: typec: Port mapping utilityHeikki Krogerus
Adding functions that can be used to link/unlink ports - USB ports, TBT3/USB4 ports, DisplayPorts and so on - to the USB Type-C connectors they are attached to inside a system. The symlink that is created for the port device is named "connector". Initially only ACPI is supported. ACPI port object shares the _PLD (Physical Location of Device) with the USB Type-C connector that it's attached to. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20210407065555.88110-2-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-09bus: mhi: fix typo in comments for struct mhi_channel_configJarvis Jiang
The word 'rung' is a typo in below comment, fix it. * @event_ring: The event rung index that services this channel Signed-off-by: Jarvis Jiang <jarvis.w.jiang@gmail.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20210408100220.3853-1-jarvis.w.jiang@gmail.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2021-04-09pwm: Clarify which state pwm_get_state() returnsUwe Kleine-König
Given that lowlevel drivers usually cannot implement exactly what a consumer requests with pwm_apply_state() there is some rounding involved. pwm_get_state() returns the setting that was requested most recently by the consumer (opposed to what was actually implemented in hardware in reply to the last request). Clarify this in the function kerneldoc. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2021-04-09static_call: Relax static_call_update() function argument typePeter Zijlstra
static_call_update() had stronger type requirements than regular C, relax them to match. Instead of requiring the @func argument has the exact matching type, allow any type which C is willing to promote to the right (function) pointer type. Specifically this allows (void *) arguments. This cleans up a bunch of static_call_update() callers for PREEMPT_DYNAMIC and should get around silly GCC11 warnings for free. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/YFoN7nCl8OfGtpeh@hirez.programming.kicks-ass.net
2021-04-08libnvdimm: Notify disk drivers to revalidate region read-onlyDan Williams
Previous kernels allowed the BLKROSET to override the disk's read-only status. With that situation fixed the pmem driver needs to rely on notification events to reevaluate the disk read-only status after the host region has been marked read-write. Recall that when libnvdimm determines that the persistent memory has lost persistence (for example lack of energy to flush from DRAM to FLASH on an NVDIMM-N device) it marks the region read-only, but that state can be overridden by the user via: echo 0 > /sys/bus/nd/devices/regionX/read_only ...to date there is no notification that the region has restored persistence, so the user override is the only recovery. Fixes: 52f019d43c22 ("block: add a hard-readonly flag to struct gendisk") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/161534060720.528671.2341213328968989192.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-04-08treewide: Change list_sort to use const pointersSami Tolvanen
list_sort() internally casts the comparison function passed to it to a different type with constant struct list_head pointers, and uses this pointer to call the functions, which trips indirect call Control-Flow Integrity (CFI) checking. Instead of removing the consts, this change defines the list_cmp_func_t type and changes the comparison function types of all list_sort() callers to use const pointers, thus avoiding type mismatches. Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
2021-04-08bpf: disable CFI in dispatcher functionsSami Tolvanen
BPF dispatcher functions are patched at runtime to perform direct instead of indirect calls. Disable CFI for the dispatcher functions to avoid conflicts. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-9-samitolvanen@google.com
2021-04-08mm: add generic function_nocfi macroSami Tolvanen
With CONFIG_CFI_CLANG, the compiler replaces function addresses in instrumented C code with jump table addresses. This means that __pa_symbol(function) returns the physical address of the jump table entry instead of the actual function, which may not work as the jump table code will immediately jump to a virtual address that may not be mapped. To avoid this address space confusion, this change adds a generic definition for function_nocfi(), which architectures that support CFI can override. The typical implementation of would use inline assembly to take the function address, which avoids compiler instrumentation. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-4-samitolvanen@google.com
2021-04-08cfi: add __cficanonicalSami Tolvanen
With CONFIG_CFI_CLANG, the compiler replaces a function address taken in C code with the address of a local jump table entry, which passes runtime indirect call checks. However, the compiler won't replace addresses taken in assembly code, which will result in a CFI failure if we later jump to such an address in instrumented C code. The code generated for the non-canonical jump table looks this: <noncanonical.cfi_jt>: /* In C, &noncanonical points here */ jmp noncanonical ... <noncanonical>: /* function body */ ... This change adds the __cficanonical attribute, which tells the compiler to use a canonical jump table for the function instead. This means the compiler will rename the actual function to <function>.cfi and points the original symbol to the jump table entry instead: <canonical>: /* jump table entry */ jmp canonical.cfi ... <canonical.cfi>: /* function body */ ... As a result, the address taken in assembly, or other non-instrumented code always points to the jump table and therefore, can be used for indirect calls in instrumented code without tripping CFI checks. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci.h Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-3-samitolvanen@google.com
2021-04-08add support for Clang CFISami Tolvanen
This change adds support for Clang’s forward-edge Control Flow Integrity (CFI) checking. With CONFIG_CFI_CLANG, the compiler injects a runtime check before each indirect function call to ensure the target is a valid function with the correct static type. This restricts possible call targets and makes it more difficult for an attacker to exploit bugs that allow the modification of stored function pointers. For more details, see: https://clang.llvm.org/docs/ControlFlowIntegrity.html Clang requires CONFIG_LTO_CLANG to be enabled with CFI to gain visibility to possible call targets. Kernel modules are supported with Clang’s cross-DSO CFI mode, which allows checking between independently compiled components. With CFI enabled, the compiler injects a __cfi_check() function into the kernel and each module for validating local call targets. For cross-module calls that cannot be validated locally, the compiler calls the global __cfi_slowpath_diag() function, which determines the target module and calls the correct __cfi_check() function. This patch includes a slowpath implementation that uses __module_address() to resolve call targets, and with CONFIG_CFI_CLANG_SHADOW enabled, a shadow map that speeds up module look-ups by ~3x. Clang implements indirect call checking using jump tables and offers two methods of generating them. With canonical jump tables, the compiler renames each address-taken function to <function>.cfi and points the original symbol to a jump table entry, which passes __cfi_check() validation. This isn’t compatible with stand-alone assembly code, which the compiler doesn’t instrument, and would result in indirect calls to assembly code to fail. Therefore, we default to using non-canonical jump tables instead, where the compiler generates a local jump table entry <function>.cfi_jt for each address-taken function, and replaces all references to the function with the address of the jump table entry. Note that because non-canonical jump table addresses are local to each component, they break cross-module function address equality. Specifically, the address of a global function will be different in each module, as it's replaced with the address of a local jump table entry. If this address is passed to a different module, it won’t match the address of the same function taken there. This may break code that relies on comparing addresses passed from other components. CFI checking can be disabled in a function with the __nocfi attribute. Additionally, CFI can be disabled for an entire compilation unit by filtering out CC_FLAGS_CFI. By default, CFI failures result in a kernel panic to stop a potential exploit. CONFIG_CFI_PERMISSIVE enables a permissive mode, where the kernel prints out a rate-limited warning instead, and allows execution to continue. This option is helpful for locating type mismatches, but should only be enabled during development. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-2-samitolvanen@google.com
2021-04-08i2c: Add support for software nodesHeikki Krogerus
This makes it possible for the drivers to assign complete software fwnodes to the devices instead of only the device properties in those nodes. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2021-04-08 The following pull-request contains BPF updates for your *net* tree. We've added 4 non-merge commits during the last 2 day(s) which contain a total of 4 files changed, 31 insertions(+), 10 deletions(-). The main changes are: 1) Validate and reject invalid JIT branch displacements, from Piotr Krysiuk. 2) Fix incorrect unhash restore as well as fwd_alloc memory accounting in sock map, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08net: qed: remove unused including <linux/version.h>Tian Tao
Remove including <linux/version.h> that don't need it. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Zhiqi Song <songzhiqi1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08net: phy: marvell10g: add separate structure for 88X3340Marek Behún
The 88X3340 contains 4 cores similar to 88X3310, but there is a difference: it does not support xaui host mode. Instead the corresponding MACTYPE means rxaui / 5gbase-r / 2500base-x / sgmii without AN Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08Merge back earlier cpuidle updates for v5.13.Rafael J. Wysocki
2021-04-08resource: Prevent irqresource_disabled() from erasing flagsAngela Czubak
Some Chromebooks use hard-coded interrupts in their ACPI tables. This is an excerpt as dumped on Relm: ... Name (_HID, "ELAN0001") // _HID: Hardware ID Name (_DDN, "Elan Touchscreen ") // _DDN: DOS Device Name Name (_UID, 0x05) // _UID: Unique ID Name (ISTP, Zero) Method (_CRS, 0, NotSerialized) // _CRS: Current Resource Settings { Name (BUF0, ResourceTemplate () { I2cSerialBusV2 (0x0010, ControllerInitiated, 0x00061A80, AddressingMode7Bit, "\\_SB.I2C1", 0x00, ResourceConsumer, , Exclusive, ) Interrupt (ResourceConsumer, Edge, ActiveLow, Exclusive, ,, ) { 0x000000B8, } }) Return (BUF0) /* \_SB_.I2C1.ETSA._CRS.BUF0 */ } ... This interrupt is hard-coded to 0xB8 = 184 which is too high to be mapped to IO-APIC, so no triggering information is propagated as acpi_register_gsi() fails and irqresource_disabled() is issued, which leads to erasing triggering and polarity information. Do not overwrite flags as it leads to erasing triggering and polarity information which might be useful in case of hard-coded interrupts. This way the information can be read later on even though mapping to APIC domain failed. Signed-off-by: Angela Czubak <acz@semihalf.com> [ rjw: Changelog rearrangement ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08ACPI: utils: Add acpi_reduced_hardware() helperHans de Goede
Add a getter for the acpi_gbl_reduced_hardware variable so that modules can check if they are running on an ACPI reduced-hw platform or not. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08PM: runtime: Replace inline function pm_runtime_callbacks_present()YueHaibing
Commit 9a7875461fd0 ("PM: runtime: Replace pm_runtime_callbacks_present()") forgot to change the inline version. Fixes: 9a7875461fd0 ("PM: runtime: Replace pm_runtime_callbacks_present()") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08freezer: Remove unused inline function try_to_freeze_nowarn()YueHaibing
There is no caller in tree, so can remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08powercap: RAPL: Fix struct declaration in header fileWan Jiabing
struct rapl_package is declared twice in intel_rapl.h, once at line 80 and once earlier. Code inspection suggests that the first instance should be struct rapl_domain rather than rapl_package, so change it. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08block: remove disk_part_iterChristoph Hellwig
Just open code the xa_for_each in the remaining user. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210406062303.811835-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-08block: refactor blk_drop_partitionsChristoph Hellwig
Move the busy check and disk-wide sync into the only caller, so that the remainder can be shared with del_gendisk. Also pass the gendisk instead of the bdev as that is all that is needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210406062303.811835-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-08Merge tag 'scmi-updates-5.13' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/drivers ARM SCMI updates for v5.13 The major and big addition this time is to support modularisation of individual SCMI protocols thus enabling to add support for vendors' custom SCMI protocol. This changes the interface provided by the SCMI driver to all the users of SCMI and hence involved changes in various other subsystem SCMI drivers. The change has been split with a bit of transient code to preserve bisectability and avoiding one big patch bomb changing all the users. This also includes SCMI IIO driver(pulled from IIO tree) and support for per-cpu DVFS. * tag 'scmi-updates-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (41 commits) firmware: arm_scmi: Add dynamic scmi devices creation firmware: arm_scmi: Add protocol modularization support firmware: arm_scmi: Rename non devres notify_ops firmware: arm_scmi: Make notify_priv really private firmware: arm_scmi: Cleanup events registration transient code firmware: arm_scmi: Cleanup unused core transfer helper wrappers firmware: arm_scmi: Cleanup legacy protocol init code firmware: arm_scmi: Make references to handle const firmware: arm_scmi: Remove legacy scmi_voltage_ops protocol interface regulator: scmi: Port driver to the new scmi_voltage_proto_ops interface firmware: arm_scmi: Port voltage protocol to new protocols interface firmware: arm_scmi: Port systempower protocol to new protocols interface firmware: arm_scmi: Remove legacy scmi_sensor_ops protocol interface iio/scmi: Port driver to the new scmi_sensor_proto_ops interface hwmon: (scmi) port driver to the new scmi_sensor_proto_ops interface firmware: arm_scmi: Port sensor protocol to new protocols interface firmware: arm_scmi: Remove legacy scmi_reset_ops protocol interface reset: reset-scmi: Port driver to the new scmi_reset_proto_ops interface firmware: arm_scmi: Port reset protocol to new protocols interface firmware: arm_scmi: Remove legacy scmi_clk_ops protocol interface ... Link: https://lore.kernel.org/r/20210331100657.ilu63i4swnr3zp4e@bogus Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-08clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940Tony Lindgren
There is a timer wrap issue on dra7 for the ARM architected timer. In a typical clock configuration the timer fails to wrap after 388 days. To work around the issue, we need to use timer-ti-dm percpu timers instead. Let's configure dmtimer3 and 4 as percpu timers by default, and warn about the issue if the dtb is not configured properly. Let's do this as a single patch so it can be backported to v5.8 and later kernels easily. Note that this patch depends on earlier timer-ti-dm systimer posted mode fixes, and a preparatory clockevent patch "clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue". For more information, please see the errata for "AM572x Sitara Processors Silicon Revisions 1.1, 2.0": https://www.ti.com/lit/er/sprz429m/sprz429m.pdf The concept is based on earlier reference patches done by Tero Kristo and Keerthy. Cc: Keerthy <j-keerthy@ti.com> Cc: Tero Kristo <kristo@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210323074326.28302-3-tony@atomide.com
2021-04-08Merge tag 'irq-no-autoen-2021-03-25' into review-hansHans de Goede
Tag for the input subsystem to pick up
2021-04-08spi: Fix use-after-free with devm_spi_alloc_*William A. Kennington III
We can't rely on the contents of the devres list during spi_unregister_controller(), as the list is already torn down at the time we perform devres_find() for devm_spi_release_controller. This causes devices registered with devm_spi_alloc_{master,slave}() to be mistakenly identified as legacy, non-devm managed devices and have their reference counters decremented below 0. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 660 at lib/refcount.c:28 refcount_warn_saturate+0x108/0x174 [<b0396f04>] (refcount_warn_saturate) from [<b03c56a4>] (kobject_put+0x90/0x98) [<b03c5614>] (kobject_put) from [<b0447b4c>] (put_device+0x20/0x24) r4:b6700140 [<b0447b2c>] (put_device) from [<b07515e8>] (devm_spi_release_controller+0x3c/0x40) [<b07515ac>] (devm_spi_release_controller) from [<b045343c>] (release_nodes+0x84/0xc4) r5:b6700180 r4:b6700100 [<b04533b8>] (release_nodes) from [<b0454160>] (devres_release_all+0x5c/0x60) r8:b1638c54 r7:b117ad94 r6:b1638c10 r5:b117ad94 r4:b163dc10 [<b0454104>] (devres_release_all) from [<b044e41c>] (__device_release_driver+0x144/0x1ec) r5:b117ad94 r4:b163dc10 [<b044e2d8>] (__device_release_driver) from [<b044f70c>] (device_driver_detach+0x84/0xa0) r9:00000000 r8:00000000 r7:b117ad94 r6:b163dc54 r5:b1638c10 r4:b163dc10 [<b044f688>] (device_driver_detach) from [<b044d274>] (unbind_store+0xe4/0xf8) Instead, determine the devm allocation state as a flag on the controller which is guaranteed to be stable during cleanup. Fixes: 5e844cc37a5c ("spi: Introduce device-managed SPI controller allocation") Signed-off-by: William A. Kennington III <wak@google.com> Link: https://lore.kernel.org/r/20210407095527.2771582-1-wak@google.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-08stack: Optionally randomize kernel stack offset each syscallKees Cook
This provides the ability for architectures to enable kernel stack base address offset randomization. This feature is controlled by the boot param "randomize_kstack_offset=on/off", with its default value set by CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT. This feature is based on the original idea from the last public release of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt All the credit for the original idea goes to the PaX team. Note that the design and implementation of this upstream randomize_kstack_offset feature differs greatly from the RANDKSTACK feature (see below). Reasoning for the feature: This feature aims to make harder the various stack-based attacks that rely on deterministic stack structure. We have had many such attacks in past (just to name few): https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html As Linux kernel stack protections have been constantly improving (vmap-based stack allocation with guard pages, removal of thread_info, STACKLEAK), attackers have had to find new ways for their exploits to work. They have done so, continuing to rely on the kernel's stack determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT were not relevant. For example, the following recent attacks would have been hampered if the stack offset was non-deterministic between syscalls: https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf (page 70: targeting the pt_regs copy with linear stack overflow) https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html (leaked stack address from one syscall as a target during next syscall) The main idea is that since the stack offset is randomized on each system call, it is harder for an attack to reliably land in any particular place on the thread stack, even with address exposures, as the stack base will change on the next syscall. Also, since randomization is performed after placing pt_regs, the ptrace-based approach[1] to discover the randomized offset during a long-running syscall should not be possible. Design description: During most of the kernel's execution, it runs on the "thread stack", which is pretty deterministic in its structure: it is fixed in size, and on every entry from userspace to kernel on a syscall the thread stack starts construction from an address fetched from the per-cpu cpu_current_top_of_stack variable. The first element to be pushed to the thread stack is the pt_regs struct that stores all required CPU registers and syscall parameters. Finally the specific syscall function is called, with the stack being used as the kernel executes the resulting request. The goal of randomize_kstack_offset feature is to add a random offset after the pt_regs has been pushed to the stack and before the rest of the thread stack is used during the syscall processing, and to change it every time a process issues a syscall. The source of randomness is currently architecture-defined (but x86 is using the low byte of rdtsc()). Future improvements for different entropy sources is possible, but out of scope for this patch. Further more, to add more unpredictability, new offsets are chosen at the end of syscalls (the timing of which should be less easy to measure from userspace than at syscall entry time), and stored in a per-CPU variable, so that the life of the value does not stay explicitly tied to a single task. As suggested by Andy Lutomirski, the offset is added using alloca() and an empty asm() statement with an output constraint, since it avoids changes to assembly syscall entry code, to the unwinder, and provides correct stack alignment as defined by the compiler. In order to make this available by default with zero performance impact for those that don't want it, it is boot-time selectable with static branches. This way, if the overhead is not wanted, it can just be left turned off with no performance impact. The generated assembly for x86_64 with GCC looks like this: ... ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax # 12380 <kstack_offset> ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax ffffffff81003983: 48 83 c0 0f add $0xf,%rax ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax ffffffff8100398c: 48 29 c4 sub %rax,%rsp ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax ... As a result of the above stack alignment, this patch introduces about 5 bits of randomness after pt_regs is spilled to the thread stack on x86_64, and 6 bits on x86_32 (since its has 1 fewer bit required for stack alignment). The amount of entropy could be adjusted based on how much of the stack space we wish to trade for security. My measure of syscall performance overhead (on x86_64): lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null randomize_kstack_offset=y Simple syscall: 0.7082 microseconds randomize_kstack_offset=n Simple syscall: 0.7016 microseconds So, roughly 0.9% overhead growth for a no-op syscall, which is very manageable. And for people that don't want this, it's off by default. There are two gotchas with using the alloca() trick. First, compilers that have Stack Clash protection (-fstack-clash-protection) enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to any dynamic stack allocations. While the randomization offset is always less than a page, the resulting assembly would still contain (unreachable!) probing routines, bloating the resulting assembly. To avoid this, -fno-stack-clash-protection is unconditionally added to the kernel Makefile since this is the only dynamic stack allocation in the kernel (now that VLAs have been removed) and it is provably safe from Stack Clash style attacks. The second gotcha with alloca() is a negative interaction with -fstack-protector*, in that it sees the alloca() as an array allocation, which triggers the unconditional addition of the stack canary function pre/post-amble which slows down syscalls regardless of the static branch. In order to avoid adding this unneeded check and its associated performance impact, architectures need to carefully remove uses of -fstack-protector-strong (or -fstack-protector) in the compilation units that use the add_random_kstack() macro and to audit the resulting stack mitigation coverage (to make sure no desired coverage disappears). No change is visible for this on x86 because the stack protector is already unconditionally disabled for the compilation unit, but the change is required on arm64. There is, unfortunately, no attribute that can be used to disable stack protector for specific functions. Comparison to PaX RANDKSTACK feature: The RANDKSTACK feature randomizes the location of the stack start (cpu_current_top_of_stack), i.e. including the location of pt_regs structure itself on the stack. Initially this patch followed the same approach, but during the recent discussions[2], it has been determined to be of a little value since, if ptrace functionality is available for an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write different offsets in the pt_regs struct, observe the cache behavior of the pt_regs accesses, and figure out the random stack offset. Another difference is that the random offset is stored in a per-cpu variable, rather than having it be per-thread. As a result, these implementations differ a fair bit in their implementation details and results, though obviously the intent is similar. [1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/ [2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/ [3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html Co-developed-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org
2021-04-08init_on_alloc: Optimize static branchesKees Cook
The state of CONFIG_INIT_ON_ALLOC_DEFAULT_ON (and ...ON_FREE...) did not change the assembly ordering of the static branches: they were always out of line. Use the new jump_label macros to check the CONFIG settings to default to the "expected" state, which slightly optimizes the resulting assembly code. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexander Potapenko <glider@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20210401232347.2791257-3-keescook@chromium.org
2021-04-08jump_label: Provide CONFIG-driven build state defaultsKees Cook
As shown in the comment in jump_label.h, choosing the initial state of static branches changes the assembly layout. If the condition is expected to be likely it's inline, and if unlikely it is out of line via a jump. A few places in the kernel use (or could be using) a CONFIG to choose the default state, which would give a small performance benefit to their compile-time declared default. Provide the infrastructure to do this. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210401232347.2791257-2-keescook@chromium.org
2021-04-08irqchip/apple-aic: Add support for the Apple Interrupt ControllerHector Martin
This is the root interrupt controller used on Apple ARM SoCs such as the M1. This irqchip driver performs multiple functions: * Handles both IRQs and FIQs * Drives the AIC peripheral itself (which handles IRQs) * Dispatches FIQs to downstream hard-wired clients (currently the ARM timer). * Implements a virtual IPI multiplexer to funnel multiple Linux IPIs into a single hardware IPI Reviewed-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Hector Martin <marcan@marcan.st>
2021-04-08arm64: Move ICH_ sysreg bits from arm-gic-v3.h to sysreg.hHector Martin
These definitions are in arm-gic-v3.h for historical reasons which no longer apply. Move them to sysreg.h so the AIC driver can use them, as it needs to peek into vGIC registers to deal with the GIC maintentance interrupt. Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Hector Martin <marcan@marcan.st>
2021-04-08asm-generic/io.h: implement pci_remap_cfgspace using ioremap_npHector Martin
Now that we have ioremap_np(), we can make pci_remap_cfgspace() default to it, falling back to ioremap() on platforms where it is not available. Remove the arm64 implementation, since that is now redundant. Future cleanups should be able to do the same for other arches, and eventually make the generic pci_remap_cfgspace() unconditional. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Hector Martin <marcan@marcan.st>
2021-04-08asm-generic/io.h: Add a non-posted variant of ioremap()Hector Martin
ARM64 currently defaults to posted MMIO (nGnRE), but some devices require the use of non-posted MMIO (nGnRnE). Introduce a new ioremap() variant to handle this case. ioremap_np() returns NULL on arches that do not implement this variant. sparc64 is the only architecture that needs to be touched directly, because it includes neither of the generic io.h or iomap.h headers. This adds the IORESOURCE_MEM_NONPOSTED flag, which maps to this variant and marks a given resource as requiring non-posted mappings. This is implemented in the resource system because it is a SoC-level requirement, so existing drivers do not need special-case code to pick this ioremap variant. Then this is implemented in devres by introducing devm_ioremap_np(), and making devm_ioremap_resource() automatically select this variant when the resource has the IORESOURCE_MEM_NONPOSTED flag set. Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Hector Martin <marcan@marcan.st>
2021-04-08Merge remote-tracking branch 'arm64/for-next/fiq'Hector Martin
The FIQ support series, already merged into arm64, is a dependency of the M1 bring-up series and was split off after the first few versions. Signed-off-by: Hector Martin <marcan@marcan.st>
2021-04-08drm/syncobj: use newly allocated stub fencesDavid Stevens
Allocate a new private stub fence in drm_syncobj_assign_null_handle, instead of using a static stub fence. When userspace creates a fence with DRM_SYNCOBJ_CREATE_SIGNALED or when userspace signals a fence via DRM_IOCTL_SYNCOBJ_SIGNAL, the timestamp obtained when the fence is exported and queried with SYNC_IOC_FILE_INFO should match when the fence's status was changed from the perspective of userspace, which is during the respective ioctl. When a static stub fence started being used in by these ioctls, this behavior changed. Instead, the timestamp returned by SYNC_IOC_FILE_INFO became the first time anything used the static stub fence, which has no meaning to userspace. Signed-off-by: David Stevens <stevensd@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210408095428.3983055-1-stevensd@google.com Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2021-04-08Merge commit '71b25f4df984' from tty/tty-nextHector Martin
This point in gregkh's tty-next tree includes all the samsung_tty changes that were part of v3 of the M1 bring-up series, and have already been merged in. Signed-off-by: Hector Martin <marcan@marcan.st>
2021-04-08USB: serial: add generic support for TIOCSSERIALJohan Hovold
TIOCSSERIAL is a horrid, underspecified, legacy interface which for most serial devices is only useful for setting the close_delay and closing_wait parameters. The closing_wait parameter determines how long to wait for the transfer buffers to drain during close and the default timeout of 30 seconds may not be sufficient at low line speeds. In other cases, when for example flow is stopped, the default timeout may instead be too long. Add generic support for TIOCSSERIAL and TIOCGSERIAL with handling of the three common parameters close_delay, closing_wait and line for the benefit of all USB serial drivers while still allowing drivers to implement further functionality through the existing callbacks. This currently includes a few drivers that report their base baud clock rate even if that is really only of interest when setting custom divisors through the deprecated ASYNC_SPD_CUST interface; an interface which only the FTDI driver actually implements. Some drivers have also been reporting back a fake UART type, something which should no longer be needed and will be dropped by a follow-on patch. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2021-04-08Merge branch 'immutable-devfreq-v5.13-rc1' into devfreq-nextChanwoo Choi
2021-04-07ethtool: Remove link_mode param and derive link params from driverDanielle Ratson
Some drivers clear the 'ethtool_link_ksettings' struct in their get_link_ksettings() callback, before populating it with actual values. Such drivers will set the new 'link_mode' field to zero, resulting in user space receiving wrong link mode information given that zero is a valid value for the field. Another problem is that some drivers (notably tun) can report random values in the 'link_mode' field. This can result in a general protection fault when the field is used as an index to the 'link_mode_params' array [1]. This happens because such drivers implement their set_link_ksettings() callback by simply overwriting their private copy of 'ethtool_link_ksettings' struct with the one they get from the stack, which is not always properly initialized. Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings' and instead have drivers call ethtool_params_from_link_mode() with the current link mode. The function will derive the link parameters (e.g., speed) from the link mode and fill them in the 'ethtool_link_ksettings' struct. v3: * Remove link_mode parameter and derive the link parameters in the driver instead of passing link_mode parameter to ethtool and derive it there. v2: * Introduce 'cap_link_mode_supported' instead of adding a validity field to 'ethtool_link_ksettings' struct. [1] general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967] CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 +38 d0 7c 08 84 d2 0f 85 b9 RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000 RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960 RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000 R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210 FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07net: remove the new_ifindex argument from dev_change_net_namespaceAndrei Vagin
Here is only one place where we want to specify new_ifindex. In all other cases, callers pass 0 as new_ifindex. It looks reasonable to add a low-level function with new_ifindex and to convert dev_change_net_namespace to a static inline wrapper. Fixes: eeb85a14ee34 ("net: Allow to specify ifindex when device is moved to another namespace") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Andrei Vagin <avagin@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07vfio/mdev: Add mdev/mtype_get_type_group_id()Jason Gunthorpe
This returns the index in the supported_type_groups array that is associated with the mdev_type attached to the struct mdev_device or its containing struct kobject. Each mdev_device can be spawned from exactly one mdev_type, which in turn originates from exactly one supported_type_group. Drivers are using weird string calculations to try and get back to this index, providing a direct access to the index removes a bunch of wonky driver code. mdev_type->group can be deleted as the group is obtained using the type_group_id. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <11-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>