summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2022-03-10signal: Move set_notify_signal and clear_notify_signal into sched/signal.hEric W. Biederman
The header tracehook.h is no place for code to live. The functions set_notify_signal and clear_notify_signal are not about signals. They are about interruptions that act like signals. The fundamental signal primitives wind up calling set_notify_signal and clear_notify_signal. Which means they need to be maintained with the signal code. Since set_notify_signal and clear_notify_signal must be maintained with the signal subsystem move them into sched/signal.h and claim them as part of the signal subsystem. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-10-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10task_work: Decouple TIF_NOTIFY_SIGNAL and task_workEric W. Biederman
There are a small handful of reasons besides pending signals that the kernel might want to break out of interruptible sleeps. The flag TIF_NOTIFY_SIGNAL and the helpers that set and clear TIF_NOTIFY_SIGNAL provide that the infrastructure for breaking out of interruptible sleeps and entering the return to user space slow path for those cases. Expand tracehook_notify_signal inline in it's callers and remove it, which makes clear that TIF_NOTIFY_SIGNAL and task_work are separate concepts. Update the comment on set_notify_signal to more accurately describe it's purpose. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-9-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10task_work: Call tracehook_notify_signal from get_signal on all architecturesEric W. Biederman
Always handle TIF_NOTIFY_SIGNAL in get_signal. With commit 35d0b389f3b2 ("task_work: unconditionally run task_work from get_signal()") always calling task_work_run all of the work of tracehook_notify_signal is already happening except clearing TIF_NOTIFY_SIGNAL. Factor clear_notify_signal out of tracehook_notify_signal and use it in get_signal so that get_signal only needs one call of task_work_run. To keep the semantics in sync update xfer_to_guest_mode_work (which does not call get_signal) to call tracehook_notify_signal if either _TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-8-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10net: phy: correct spelling error of media in documentationColin Foster
The header file incorrectly referenced "median-independant interface" instead of media. Correct this typo. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Fixes: 4069a572d423 ("net: phy: Document core PHY structures") Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20220309062544.3073-1-colin.foster@in-advantage.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10Merge tag 'mlx5-updates-2022-03-09' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-03-09 1) Remove kernel log prints on FW events regarding FW pages management and replace that with debugfs entries to track FW pages management commands failures and general stats, we do that for all FW commands in general since it's the same effort to do so under the already existing debugfs entry for FW commands. 2) Add support for ConnectX-7 Software managed steering, in other words STEv2 which shares a lot in common with STE V1, the difference is in specific offsets in the devices, the logic is almost the same, thus we implement STEv1 and STEv2 in the same file. * tag 'mlx5-updates-2022-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: DR, Add support for ConnectX-7 steering net/mlx5: DR, Refactor ste_ctx handling for STE v0/1 net/mlx5: DR, Rename action modify fields to reflect naming in HW spec net/mlx5: DR, Fix handling of different actions on the same STE in STEv1 net/mlx5: DR, Remove unneeded comments net/mlx5: DR, Add support for matching on Internet Header Length (IHL) net/mlx5: DR, Align mlx5dv_dr API vport action with FW behavior net/mlx5: Add debugfs counters for page commands failures net/mlx5: Add pages debugfs net/mlx5: Move debugfs entries to separate struct net/mlx5: Change release_all_pages cap bit location net/mlx5: Remove redundant error on reclaim pages net/mlx5: Remove redundant error on give pages net/mlx5: Remove redundant notify fail on give pages net/mlx5: Add command failures data to debugfs net/mlx5e: TC, Fix use after free in mlx5e_clone_flow_attr_for_post_act() ==================== Link: https://lore.kernel.org/r/20220309213755.610202-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10bpf: Remove BPF_SKB_DELIVERY_TIME_NONE and rename s/delivery_time_/tstamp_/Martin KaFai Lau
This patch is to simplify the uapi bpf.h regarding to the tstamp type and use a similar way as the kernel to describe the value stored in __sk_buff->tstamp. My earlier thought was to avoid describing the semantic and clock base for the rcv timestamp until there is more clarity on the use case, so the __sk_buff->delivery_time_type naming instead of __sk_buff->tstamp_type. With some thoughts, it can reuse the UNSPEC naming. This patch first removes BPF_SKB_DELIVERY_TIME_NONE and also rename BPF_SKB_DELIVERY_TIME_UNSPEC to BPF_SKB_TSTAMP_UNSPEC and BPF_SKB_DELIVERY_TIME_MONO to BPF_SKB_TSTAMP_DELIVERY_MONO. The semantic of BPF_SKB_TSTAMP_DELIVERY_MONO is the same: __sk_buff->tstamp has delivery time in mono clock base. BPF_SKB_TSTAMP_UNSPEC means __sk_buff->tstamp has the (rcv) tstamp at ingress and the delivery time at egress. At egress, the clock base could be found from skb->sk->sk_clockid. __sk_buff->tstamp == 0 naturally means NONE, so NONE is not needed. With BPF_SKB_TSTAMP_UNSPEC for the rcv tstamp at ingress, the __sk_buff->delivery_time_type is also renamed to __sk_buff->tstamp_type which was also suggested in the earlier discussion: https://lore.kernel.org/bpf/b181acbe-caf8-502d-4b7b-7d96b9fc5d55@iogearbox.net/ The above will then make __sk_buff->tstamp and __sk_buff->tstamp_type the same as its kernel skb->tstamp and skb->mono_delivery_time counter part. The internal kernel function bpf_skb_convert_dtime_type_read() is then renamed to bpf_skb_convert_tstamp_type_read() and it can be simplified with the BPF_SKB_DELIVERY_TIME_NONE gone. A BPF_ALU32_IMM(BPF_AND) insn is also saved by using BPF_JMP32_IMM(BPF_JSET). The bpf helper bpf_skb_set_delivery_time() is also renamed to bpf_skb_set_tstamp(). The arg name is changed from dtime to tstamp also. It only allows setting tstamp 0 for BPF_SKB_TSTAMP_UNSPEC and it could be relaxed later if there is use case to change mono delivery time to non mono. prog->delivery_time_access is also renamed to prog->tstamp_type_access. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220309090509.3712315-1-kafai@fb.com
2022-03-10bpf: net: Remove TC_AT_INGRESS_OFFSET and SKB_MONO_DELIVERY_TIME_OFFSET macroMartin KaFai Lau
This patch removes the TC_AT_INGRESS_OFFSET and SKB_MONO_DELIVERY_TIME_OFFSET macros. Instead, PKT_VLAN_PRESENT_OFFSET is used because all of them are at the same offset. Comment is added to make it clear that changing the position of tc_at_ingress or mono_delivery_time will require to adjust the defined macros. The earlier discussion can be found here: https://lore.kernel.org/bpf/419d994e-ff61-7c11-0ec7-11fefcb0186e@iogearbox.net/ Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220309090450.3710955-1-kafai@fb.com
2022-03-10task_work: Introduce task_work_pendingEric W. Biederman
Wrap the test of task->task_works in a helper function to make it clear what is being tested. All of the other readers of task->task_work use READ_ONCE and this is even necessary on current as other processes can update task->task_work. So for consistency I have added READ_ONCE into task_work_pending. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-7-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10task_work: Remove unnecessary include from posix_timers.hEric W. Biederman
Break a header file circular dependency by removing the unnecessary include of task_work.h from posix_timers.h. sched.h -> posix-timers.h posix-timers.h -> task_work.h task_work.h -> sched.h Add missing includes of task_work.h to: arch/x86/mm/tlb.c kernel/time/posix-cpu-timers.c Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-6-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10ptrace: Remove tracehook_signal_handlerEric W. Biederman
The two line function tracehook_signal_handler is only called from signal_delivered. Expand it inline in signal_delivered and remove it. Just to make it easier to understand what is going on. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-5-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10ptrace: Remove arch_syscall_{enter,exit}_tracehookEric W. Biederman
These functions are alwasy one-to-one wrappers around ptrace_report_syscall_entry and ptrace_report_syscall_exit. So directly call the functions they are wrapping instead. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.hEric W. Biederman
Rename tracehook_report_syscall_{entry,exit} to ptrace_report_syscall_{entry,exit} and place them in ptrace.h There is no longer any generic tracehook infractructure so make these ptrace specific functions ptrace specific. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-3-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10ptrace: Move ptrace_report_syscall into ptrace.hEric W. Biederman
Move ptrace_report_syscall from tracehook.h into ptrace.h where it belongs. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-1-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-10arch_topology: obtain cpu capacity using information from CPPCIonela Voinescu
Define topology_init_cpu_capacity_cppc() to use highest performance values from _CPC objects to obtain and set maximum capacity information for each CPU. acpi_cppc_processor_probe() is a good point at which to trigger the initialization of CPU (u-arch) capacity values, as at this point the highest performance values can be obtained from each CPU's _CPC objects. Architectures can therefore use this functionality through arch_init_invariance_cppc(). The performance scale used by CPPC is a unified scale for all CPUs in the system. Therefore, by obtaining the raw highest performance values from the _CPC objects, and normalizing them on the [0, 1024] capacity scale, used by the task scheduler, we obtain the CPU capacity of each CPU. While an ACPI Notify(0x85) could alert about a change in the highest performance value, which should in turn retrigger the CPU capacity computations, this notification is not currently handled by the ACPI processor driver. When supported, a call to arch_init_invariance_cppc() would perform the update. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-10ACPI: AGDI: Add driver for Arm Generic Diagnostic Dump and Reset deviceIlkka Koskinen
ACPI for Arm Components 1.1 Platform Design Document v1.1 [0] specifices Arm Generic Diagnostic Device Interface (AGDI). It allows an admin to issue diagnostic dump and reset via an SDEI event or an interrupt. This patch implements SDEI path. [0] https://developer.arm.com/documentation/den0093/latest/ Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-10dm: simplify dm_sumbit_bio_remap interfaceMike Snitzer
Remove the from_wq argument from dm_sumbit_bio_remap(). Eliminates the need for dm_sumbit_bio_remap() callers to know whether they are calling for a workqueue or from the original dm_submit_bio(). Add map_task to dm_io struct, record the map_task in alloc_io and clear it after all target ->map() calls have completed. Update dm_sumbit_bio_remap to check if 'current' matches io->map_task rather than rely on passed 'from_rq' argument. This change really simplifies the chore of porting each DM target to using dm_sumbit_bio_remap() because there is no longer the risk of programming error by not completely knowing all the different contexts a particular method that calls dm_sumbit_bio_remap() might be used in. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2022-03-10arm64: Add gcc Shadow Call Stack supportDan Li
Shadow call stacks will be available in GCC >= 12, this patch makes the corresponding kernel configuration available when compiling the kernel with the gcc. Note that the implementation in GCC is slightly different from Clang. With SCS enabled, functions will only pop x30 once in the epilogue, like: str x30, [x18], #8 stp x29, x30, [sp, #-16]! ...... - ldp x29, x30, [sp], #16 //clang + ldr x29, [sp], #16 //GCC ldr x30, [x18, #-8]! Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ce09ab17ddd21f73ff2caf6eec3b0ee9b0e1a11e Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Dan Li <ashimida@linux.alibaba.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220303074323.86282-1-ashimida@linux.alibaba.com
2022-03-10mm: slab: Delete unused SLAB_DEACTIVATED flagXiongwei Song
Since commit 9855609bde03 ("mm: memcg/slab: use a single set of kmem_caches for all accounted allocations") deleted all SLAB_DEACTIVATED users, therefore this flag is not needed any more, let's delete it. Signed-off-by: Xiongwei Song <sxwjean@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20220310140701.87908-2-sxwjean@me.com
2022-03-10io_uring: add support for registering ring file descriptorsJens Axboe
Lots of workloads use multiple threads, in which case the file table is shared between them. This makes getting and putting the ring file descriptor for each io_uring_enter(2) system call more expensive, as it involves an atomic get and put for each call. Similarly to how we allow registering normal file descriptors to avoid this overhead, add support for an io_uring_register(2) API that allows to register the ring fds themselves: 1) IORING_REGISTER_RING_FDS - takes an array of io_uring_rsrc_update structs, and registers them with the task. 2) IORING_UNREGISTER_RING_FDS - takes an array of io_uring_src_update structs, and unregisters them. When a ring fd is registered, it is internally represented by an offset. This offset is returned to the application, and the application then uses this offset and sets IORING_ENTER_REGISTERED_RING for the io_uring_enter(2) system call. This works just like using a registered file descriptor, rather than a real one, in an SQE, where IOSQE_FIXED_FILE gets set to tell io_uring that we're using an internal offset/descriptor rather than a real file descriptor. In initial testing, this provides a nice bump in performance for threaded applications in real world cases where the batch count (eg number of requests submitted per io_uring_enter(2) invocation) is low. In a microbenchmark, submitting NOP requests, we see the following increases in performance: Requests per syscall Baseline Registered Increase ---------------------------------------------------------------- 1 ~7030K ~8080K +15% 2 ~13120K ~14800K +13% 4 ~22740K ~25300K +11% Co-developed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-10dma-mapping: benchmark: extract a common header file for map_benchmark ↵Tian Tao
definition kernel/dma/map_benchmark.c and selftests/dma/dma_map_benchmark.c have duplicate map_benchmark definitions, which tends to lead to inconsistent changes to map_benchmark on both sides, extract a common header file to avoid this problem. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Acked-by: Barry Song <song.bao.hua@hisilicon.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-03-09ptp: idt82p33: use rsmu driver to access i2c/spi busMin Li
rsmu (Renesas Synchronization Management Unit ) driver is located in drivers/mfd and responsible for creating multiple devices including idt82p33 phc, which will then use the exposed regmap and mutex handle to access i2c/spi bus. Signed-off-by: Min Li <min.li.xe@renesas.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/1646748651-16811-1-git-send-email-min.li.xe@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09drivers/nvdimm: Add perf interface to expose nvdimm performance statsKajol Jain
A common interface is added to get performance stats reporting support for nvdimm devices. Added interface defines supported event list, config fields for the event attributes and their corresponding bit values which are exported via sysfs. Interface also added support for pmu register/unregister functions, cpu hotplug feature along with macros for handling events addition via sysfs. It adds attribute groups for format, cpumask and events to the pmu structure. User could use the standard perf tool to access perf events exposed via nvdimm pmu. [Declare pmu functions in nd.h file to resolve implicit-function-declaration warning and make hotplug function static as reported by kernel test robot] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Link: https://lore.kernel.org/all/202202241242.zqzGkguy-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Madhavan Srinivasan <maddy@in.ibm.com> Link: https://lore.kernel.org/r/20220225143024.47947-3-kjain@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-03-09drivers/nvdimm: Add nvdimm pmu structureKajol Jain
A structure is added called nvdimm_pmu, for performance stats reporting support of nvdimm devices. It can be used to add device pmu data such as pmu data structure for performance stats, nvdimm device pointer along with cpumask attributes. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@in.ibm.com> Link: https://lore.kernel.org/r/20220225143024.47947-2-kjain@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-03-09net/mlx5: DR, Add support for ConnectX-7 steeringYevgeny Kliteynik
Add support for a new SW format version that is implemented by ConnectX-7. Except for several differences, the STEv2 is identical to STEv1, so for most callbacks the STEv2 context struct will call STEv1 functions. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: DR, Add support for matching on Internet Header Length (IHL)Yevgeny Kliteynik
Add support for matching on new field - Internet Header Length (IHL). Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add debugfs counters for page commands failuresMoshe Shemesh
Add the following new debugfs counters for debug and verbosity: fw_pages_alloc_failed - number of pages FW requested but driver failed to allocate. give_pages_dropped - number of pages given to FW, but command give pages failed by FW. reclaim_pages_discard - number of pages which were about to reclaim back and FW failed the command. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add pages debugfsMoshe Shemesh
Add pages debugfs to expose the following counters for debuggability: fw_pages_total - How many pages were given to FW and not returned yet. vfs_pages - For SRIOV, how many pages were given to FW for virtual functions usage. host_pf_pages - For ECPF, how many pages were given to FW for external hosts physical functions usage. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Move debugfs entries to separate structMoshe Shemesh
Move the debugfs entry pointers under priv to their own struct. Add get function for device debugfs root. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Change release_all_pages cap bit locationMoshe Shemesh
mlx5 FW has changed release_all_pages cap bit by one bit offset to reflect a fix in the FW flow for release_all_pages. The driver should use the new bit to ensure it calls release_all_pages only if the FW fix is there. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add command failures data to debugfsMoshe Shemesh
Add new counters to command interface debugfs to count command failures. The following counters added: total_failed - number of times command failed (any kind of failure). failed_mbox_status - number of times command failed on bad status returned by FW. In addition, add data about last command failure to command interface debugfs: last_failed_errno - last command failed returned errno. last_failed_mbox_status - last bad status returned by FW. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5e: SHAMPO, reduce TIR indicationBen Ben-Ishay
SHAMPO is an RQ / WQ feature, an indication was added to the TIR in the first place to enforce suitability between connected TIR and RQ, this enforcement does not exist in current the Firmware implementation and was redundant in the first place. Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload") Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Fix size field in bufferx_reg structMohammad Kabat
According to HW spec the field "size" should be 16 bits in bufferx register. Fixes: e281682bf294 ("net/mlx5_core: HW data structs/types definitions cleanup") Signed-off-by: Mohammad Kabat <mohammadkab@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09block: add ->poll_bio to block_device_operationsMing Lei
Prepare for supporting IO polling for bio-based driver. Add ->poll_bio callback so that bio-based driver can provide their own logic for polling bio. Also fix ->submit_bio_bio typo in comment block above __submit_bio_noacct. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2022-03-09clk: qcom: smd: Add missing RPM clocks for msm8992/4Konrad Dybcio
XO and MSS_CFG were omitted when first adding the clocks for these SoCs. Add them, and while at it, move the XO clock to the top of the definition list, as ideally everyone should start using it sooner or later.. Fixes: b4297844995f ("clk: qcom: smd: Add support for MSM8992/4 rpm clocks") Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220226214126.21209-2-konrad.dybcio@somainline.org
2022-03-09Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2022-03-09 1) Fix IPv6 PMTU discovery for xfrm interfaces. From Lina Wang. 2) Revert failing for policies and states that are configured with XFRMA_IF_ID 0. It broke a user configuration. From Kai Lueke. 3) Fix a possible buffer overflow in the ESP output path. 4) Fix ESP GSO for tunnel and BEET mode on inter address family tunnels. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09kasan: fix a missing header include of static_keys.hJoey Gouly
The kasan-enabled.h header relies on static keys, so make sure to include the header to avoid compilation errors (with JUMP_LABEL=n). It fixes the following: ./include/linux/kasan-enabled.h:9:1: warning: data definition has no type or storage class 9 | DECLARE_STATIC_KEY_FALSE(kasan_flag_enabled); | ^~~~~~~~~~~~~~~~~~~~~~~~ error: type defaults to 'int' in declaration of 'DECLARE_STATIC_KEY_FALSE' [-Werror=implicit-int] Fixes: f9b5e46f4097eb29 ("kasan: split kasan_*enabled() functions into a separate header") Cc: Peter Collingbourne <pcc@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20220301154518.19456-1-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-03-09skb: make drop reason booleanableJakub Kicinski
We have a number of cases where function returns drop/no drop decision as a boolean. Now that we want to report the reason code as well we have to pass extra output arguments. We can make the reason code evaluate correctly as bool. I believe we're good to reorder the reasons as they are reported to user space as strings. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-08blk-mq: manage hctx map via xarrayMing Lei
First code becomes more clean by switching to xarray from plain array. Second use-after-free on q->queue_hw_ctx can be fixed because queue_for_each_hw_ctx() may be run when updating nr_hw_queues is in-progress. With this patch, q->hctx_table is defined as xarray, and this structure will share same lifetime with request queue, so queue_for_each_hw_ctx() can use q->hctx_table to lookup hctx reliably. Reported-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220308073219.91173-7-ming.lei@redhat.com [axboe: fix blk_mq_hw_ctx forward declaration] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08fs: remove fs.f_write_hintChristoph Hellwig
The value is now completely unused except for reporting it back through the F_GET_FILE_RW_HINT ioctl, so remove the value and the two ioctls for it. Trying to use the F_SET_FILE_RW_HINT and F_GET_FILE_RW_HINT fcntls will now return EINVAL, just like it would on a kernel that never supported this functionality in the first place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220308060529.736277-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08fs: remove kiocb.ki_hintChristoph Hellwig
This field is entirely unused now except for a tracepoint in f2fs, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220308060529.736277-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08prlimit: do not grab the tasklist_lockBarret Rhoden
Unnecessarily grabbing the tasklist_lock can be a scalability bottleneck for workloads that also must grab the tasklist_lock for waiting, killing, and cloning. The tasklist_lock was grabbed to protect tsk->sighand from disappearing (becoming NULL). tsk->signal was already protected by holding a reference to tsk. update_rlimit_cpu() assumed tsk->sighand != NULL. With this commit, it attempts to lock_task_sighand(). However, this means that update_rlimit_cpu() can fail. This only happens when a task is exiting. Note that during exec, sighand may *change*, but it will not be NULL. Prior to this commit, the do_prlimit() ensured that update_rlimit_cpu() would not fail by read locking the tasklist_lock and checking tsk->sighand != NULL. If update_rlimit_cpu() fails, there may be other tasks that are not exiting that share tsk->signal. However, the group_leader is the last task to be released, so if we cannot update_rlimit_cpu(group_leader), then the entire process is exiting. The only other caller of update_rlimit_cpu() is selinux_bprm_committing_creds(). It has tsk == current, so update_rlimit_cpu() cannot fail (current->sighand cannot disappear until current exits). This change resulted in a 14% speedup on a microbenchmark where parents kill and wait on their children, and children getpriority, setpriority, and getrlimit. Signed-off-by: Barret Rhoden <brho@google.com> Link: https://lkml.kernel.org/r/20220106172041.522167-4-brho@google.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2022-03-08prlimit: make do_prlimit() staticBarret Rhoden
There are no other callers in the kernel. Fixed up a comment format and whitespace issue when moving do_prlimit() higher in sys.c. Signed-off-by: Barret Rhoden <brho@google.com> Link: https://lkml.kernel.org/r/20220106172041.522167-3-brho@google.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2022-03-08coredump: Use the vma snapshot in fill_files_noteEric W. Biederman
Matthew Wilcox reported that there is a missing mmap_lock in file_files_note that could possibly lead to a user after free. Solve this by using the existing vma snapshot for consistency and to avoid the need to take the mmap_lock anywhere in the coredump code except for dump_vma_snapshot. Update the dump_vma_snapshot to capture vm_pgoff and vm_file that are neeeded by fill_files_note. Add free_vma_snapshot to free the captured values of vm_file. Reported-by: Matthew Wilcox <willy@infradead.org> Link: https://lkml.kernel.org/r/20220131153740.2396974-1-willy@infradead.org Cc: stable@vger.kernel.org Fixes: a07279c9a8cd ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Fixes: 2aa362c49c31 ("coredump: extend core dump note section to contain file names of mapped files") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-08PM: sleep: Add device name to suspend_report_result()Youngjin Jang
Currently, suspend_report_result() prints only function information. If any driver uses a common PM function, nobody knows who exactly called the failing function. A device pinter is needed to recognize the failing device. For example: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0 PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0 become after the change: serial 00:05: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0 pci 0000:00:01.3: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0 Signed-off-by: Youngjin Jang <yj84.jang@samsung.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08coredump: Snapshot the vmas in do_coredumpEric W. Biederman
Move the call of dump_vma_snapshot and kvfree(vma_meta) out of the individual coredump routines into do_coredump itself. This makes the code less error prone and easier to maintain. Make the vma snapshot available to the coredump routines in struct coredump_params. This makes it easier to change and update what is captures in the vma snapshot and will be needed for fixing fill_file_notes. Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-08coredump: Move definition of struct coredump_params into coredump.hEric W. Biederman
Move the definition of struct coredump_params into coredump.h where it belongs. Remove the slightly errorneous comment explaining why struct coredump_params was declared in binfmts.h. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-03-08Merge tag 'arm64-spectre-bhb-for-v5.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 spectre fixes from James Morse: "ARM64 Spectre-BHB mitigations: - Make EL1 vectors per-cpu - Add mitigation sequences to the EL1 and EL2 vectors on vulnerble CPUs - Implement ARCH_WORKAROUND_3 for KVM guests - Report Vulnerable when unprivileged eBPF is enabled" * tag 'arm64-spectre-bhb-for-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting arm64: Use the clearbhb instruction in mitigations KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated arm64: Mitigate spectre style branch history side channels arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2 arm64: Add percpu vectors for EL1 arm64: entry: Add macro for reading symbol addresses from the trampoline arm64: entry: Add vectors that have the bhb mitigation sequences arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations arm64: entry: Allow the trampoline text to occupy multiple pages arm64: entry: Make the kpti trampoline's kpti sequence optional arm64: entry: Move trampoline macros out of ifdef'd section arm64: entry: Don't assume tramp_vectors is the start of the vectors arm64: entry: Allow tramp_alias to access symbols after the 4K boundary arm64: entry: Move the trampoline data page before the text page arm64: entry: Free up another register on kpti's tramp_exit path arm64: entry: Make the trampoline cleanup optional KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit arm64: entry.S: Add ventry overflow sanity checks
2022-03-08Merge tag 'at91-soc-5.18-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/drivers AT91 SoC #2 for 5.18: - SAMA5D29 variant to the SAMA5D2 family in SoC driver. * tag 'at91-soc-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: at91: add support in soc driver for new SAMA5D29 soc: add microchip polarfire soc system controller ARM: at91: Kconfig: select PM_OPP ARM: at91: PM: add cpu idle support for sama7g5 ARM: at91: ddr: fix typo to align with datasheet naming ARM: at91: ddr: align macro definitions ARM: at91: ddr: remove CONFIG_SOC_SAMA7 dependency Link: https://lore.kernel.org/r/20220304144216.23340-1-nicolas.ferre@microchip.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-08mm: vmalloc: introduce array allocation functionsPaolo Bonzini
Linux has dozens of occurrences of vmalloc(array_size()) and vzalloc(array_size()). Allow to simplify the code by providing vmalloc_array and vcalloc, as well as the underscored variants that let the caller specify the GFP flags. Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-08Merge branch 'for-next/perf-m1' into for-next/perfWill Deacon
Support for the CPU PMUs on the Apple M1. * for-next/perf-m1: drivers/perf: Add Apple icestorm/firestorm CPU PMU driver drivers/perf: arm_pmu: Handle 47 bit counters irqchip/apple-aic: Move PMU-specific registers to their own include file arm64: dts: apple: Add t8303 PMU nodes arm64: dts: apple: Add t8103 PMU interrupt affinities irqchip/apple-aic: Wire PMU interrupts irqchip/apple-aic: Parse FIQ affinities from device-tree dt-bindings: apple,aic: Add affinity description for per-cpu pseudo-interrupts dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts dt-bindings: arm-pmu: Document Apple PMU compatible strings