summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-10-27fsnotify: pass dentry instead of inode dataAmir Goldstein
Define a new data type to pass for event - FSNOTIFY_EVENT_DENTRY. Use it to pass the dentry instead of it's ->d_inode where available. This is needed in preparation to the refactor to retrieve the super block from the data field. In some cases (i.e. mkdir in kernfs), the data inode comes from a negative dentry, such that no super block information would be available. By receiving the dentry itself, instead of the inode, fsnotify can derive the super block even on these cases. Link: https://lore.kernel.org/r/20211025192746.66445-3-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> [Expand explanation in commit message] Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-10-27fsnotify: pass data_type to fsnotify_name()Amir Goldstein
Align the arguments of fsnotify_name() to those of fsnotify(). Link: https://lore.kernel.org/r/20211025192746.66445-2-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-10-27vmlinux.lds.h: Have ORC lookup cover entire _etext - _stextKristen Carlson Accardi
When using -ffunction-sections to place each function in its own text section (so it can be randomized at load time in the future FGKASLR series), the linker will place most of the functions into separate .text.* sections. SIZEOF(.text) won't work here for calculating the ORC lookup table size, so the total text size must be calculated to include .text AND all .text.* sections. Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> [ alobakin: move it to vmlinux.lds.h and make arch-indep ] Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20211013175742.1197608-5-keescook@chromium.org
2021-10-27x86/boot/compressed: Avoid duplicate malloc() implementationsKees Cook
The early malloc() and free() implementation in include/linux/decompress/mm.h (which is also included by the static decompressors) is static. This is fine when the only thing interested in using malloc() is the decompression code, but the x86 early boot environment may use malloc() in a couple places, leading to a potential collision when the static copies of the available memory region ("malloc_ptr") gets reset to the global "free_mem_ptr" value. As it happened, the existing usage pattern was accidentally safe because each user did 1 malloc() and 1 free() before returning and were not nested: extract_kernel() (misc.c) choose_random_location() (kaslr.c) mem_avoid_init() handle_mem_options() malloc() ... free() ... parse_elf() (misc.c) malloc() ... free() Once the future FGKASLR series is added, however, it will insert additional malloc() calls local to fgkaslr.c in the middle of parse_elf()'s malloc()/free() pair: parse_elf() (misc.c) malloc() if (...) { layout_randomized_image(output, &ehdr, phdrs); malloc() <- boom ... else layout_image(output, &ehdr, phdrs); free() To avoid collisions, there must be a single implementation of malloc(). Adjust include/linux/decompress/mm.h so that visibility can be controlled, provide prototypes in misc.h, and implement the functions in misc.c. This also results in a small size savings: $ size vmlinux.before vmlinux.after text data bss dec hex filename 8842314 468 178320 9021102 89a6ae vmlinux.before 8842240 468 178320 9021028 89a664 vmlinux.after Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211013175742.1197608-4-keescook@chromium.org
2021-10-27leds: add new LED_FUNCTION_PLAYER for player LEDs for game controllers.Roderick Colenbrander
Player LEDs are commonly found on game controllers from Nintendo and Sony to indicate a player ID across a number of LEDs. For example, "Player 2" might be indicated as "-x--" on a device with 4 LEDs where "x" means on. This patch introduces LED_FUNCTION_PLAYER1-5 defines to properly indicate player LEDs from the kernel. Until now there was no good standard, which resulted in inconsistent behavior across xpad, hid-sony, hid-wiimote and other drivers. Moving forward new drivers should use LED_FUNCTION_PLAYERx. Note: management of Player IDs is left to user space, though a kernel driver may pick a default value. Signed-off-by: Roderick Colenbrander <roderick.colenbrander@sony.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2021-10-27nvme: add new discovery log page entry definitionsHannes Reinecke
TP8014 adds a new SUBTYPE value and a new field EFLAGS for the discovery log page entry. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-26libata: support concurrent positioning ranges logDamien Le Moal
Add support to discover if an ATA device supports the Concurrent Positioning Ranges data log (address 0x47), indicating that the device is capable of seeking to multiple different locations in parallel using multiple actuators serving different LBA ranges. Also add support to translate the concurrent positioning ranges log into its equivalent Concurrent Positioning Ranges VPD page B9h in libata-scsi.c. The format of the Concurrent Positioning Ranges Log is defined in ACS-5 r9. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20211027022223.183838-4-damien.lemoal@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26block: Add independent access ranges supportDamien Le Moal
The Concurrent Positioning Ranges VPD page (for SCSI) and data log page (for ATA) contain parameters describing the set of contiguous LBAs that can be served independently by a single LUN multi-actuator hard-disk. Similarly, a logically defined block device composed of multiple disks can in some cases execute requests directed at different sector ranges in parallel. A dm-linear device aggregating 2 block devices together is an example. This patch implements support for exposing a block device independent access ranges to the user through sysfs to allow optimizing device accesses to increase performance. To describe the set of independent sector ranges of a device (actuators of a multi-actuator HDDs or table entries of a dm-linear device), The type struct blk_independent_access_ranges is introduced. This structure describes the sector ranges using an array of struct blk_independent_access_range structures. This range structure defines the start sector and number of sectors of the access range. The ranges in the array cannot overlap and must contain all sectors within the device capacity. The function disk_set_independent_access_ranges() allows a device driver to signal to the block layer that a device has multiple independent access ranges. In this case, a struct blk_independent_access_ranges is attached to the device request queue by the function disk_set_independent_access_ranges(). The function disk_alloc_independent_access_ranges() is provided for drivers to allocate this structure. struct blk_independent_access_ranges contains kobjects (struct kobject) to expose to the user through sysfs the set of independent access ranges supported by a device. When the device is initialized, sysfs registration of the ranges information is done from blk_register_queue() using the block layer internal function disk_register_independent_access_ranges(). If a driver calls disk_set_independent_access_ranges() for a registered queue, e.g. when a device is revalidated, disk_set_independent_access_ranges() will execute disk_register_independent_access_ranges() to update the sysfs attribute files. The sysfs file structure created starts from the independent_access_ranges sub-directory and contains the start sector and number of sectors of each range, with the information for each range grouped in numbered sub-directories. E.g. for a dual actuator HDD, the user sees: $ tree /sys/block/sdk/queue/independent_access_ranges/ /sys/block/sdk/queue/independent_access_ranges/ |-- 0 | |-- nr_sectors | `-- sector `-- 1 |-- nr_sectors `-- sector For a regular device with a single access range, the independent_access_ranges sysfs directory does not exist. Device revalidation may lead to changes to this structure and to the attribute values. When manipulated, the queue sysfs_lock and sysfs_dir_lock mutexes are held for atomicity, similarly to how the blk-mq and elevator sysfs queue sub-directories are protected. The code related to the management of independent access ranges is added in the new file block/blk-ia-ranges.c. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20211027022223.183838-2-damien.lemoal@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26net/mlx5e: Add handle SHAMPO cqe supportKhalid Manaa
This patch adds the new CQE SHAMPO fields: - flush: indicates that we must close the current session and pass the SKB to the network stack. - match: indicates that the current packet matches the oppened session, the packet will be merge into the current SKB. - header_size: the size of the packet headers that written into the headers buffer. - header_entry_index: the entry index in the headers buffer. - data_offset: packets data offset in the WQE. Also new cqe handler is added to handle SHAMPO packets: - The new handler uses CQE SHAMPO fields to build the SKB. CQE's Flush and match fields are not used in this patch, packets are not merged in this patch. Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5e: Add support to klm_umr_wqeBen Ben-Ishay
This commit adds the needed definitions for using the klm_umr_wqe. UMR stands for user-mode memory registration, is a mechanism to alter address translation properties of MKEY by posting WorkQueueElement aka WQE on send queue. MKEY stands for memory key, MKEY are used to describe a region in memory that can be later used by HW. KLM stands for {Key, Length, MemVa}, KLM_MKEY is indirect MKEY that enables to map multiple memory spaces with different sizes in unified MKEY. klm_umr_wqe is a UMR that use to update a KLM_MKEY. SHAMPO feature uses KLM_MKEY for memory registration of his header buffer. Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5e: Rename TIR lro functions to TIR packet merge functionsKhalid Manaa
This series introduces new packet merge type, therefore rename lro functions to packet merge to support the new merge type: - Generalize + rename mlx5e_build_tir_ctx_lro to mlx5e_build_tir_ctx_packet_merge. - Rename mlx5e_modify_tirs_lro to mlx5e_modify_tirs_packet_merge. - Rename lro bit in mlx5_ifc_modify_tir_bitmask_bits to packet_merge. - Rename lro_en in mlx5e_params to packet_merge_type type and combine packet_merge params into one struct mlx5e_packet_merge_param. Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5: Add SHAMPO caps, HW bits and enumerationsBen Ben-Ishay
This commit adds SHAMPO bit to hca_cap and SHAMPO capabilities structure, SHAMPO related HW spec hardware fields and enumerations. SHAMPO stands for: split headers and merge payload offload. SHAMPO new fields: WQ: - headers_mkey: mkey that represents the headers buffer, where the packets headers will be written by the HW. - shampo_enable: flag to verify if the WQ supports SHAMPO feature. - log_reservation_size: the log of the reservation size where the data of the packet will be written by the HW. - log_max_num_of_packets_per_reservation: log of the maximum number of packets that can be written to the same reservation. - log_headers_entry_size: log of the header entry size of the headers buffer. - log_headers_buffer_entry_num: log of the entries number of the headers buffer. RQ: - shampo_no_match_alignment_granularity: the HW alignment granularity in case the received packet doesn't match the current session. - shampo_match_criteria_type: the type of match criteria. - reservation_timeout: the maximum time that the HW will hold the reservation. mlx5_ifc_shampo_cap_bits, the capabilities of the SHAMPO feature: - shampo_log_max_reservation_size: the maximum allowed value of the field WQ.log_reservation_size. - log_reservation_size: the minimum allowed value of the field WQ.log_reservation_size. - shampo_min_mss_size: the minimum payload size of packet that can open a new session or be merged to a session. - shampo_max_log_headers_entry_size: the maximum allowed value of the field WQ.log_headers_entry_size Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26net/mlx5e: Rename lro_timeout to packet_merge_timeoutBen Ben-Ishay
TIR stands for transport interface receive, the TIR object is responsible for performing all transport related operations on the receive side like packet processing, demultiplexing the packets to different RQ's, etc. lro_timeout is a field in the TIR that is used to set the timeout for lro session, this series introduces new packet merge type, therefore rename lro_timeout to packet_merge_timeout for all packet merge types. Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26lib: bitmap: Introduce node-aware alloc APITariq Toukan
Expose new node-aware API for bitmap allocation: bitmap_alloc_node() / bitmap_zalloc_node(). Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26dt-bindings: clock: u8500: Rewrite in YAML and extendLinus Walleij
This rewrites the ux500/u8500 clock bindings in YAML schema and extends them with the PRCC reset controller. The bindings are a bit idiomatic but it just reflects their age, the ux500 platform was used as guinea pig for early device tree conversion of platforms in 2015. The new subnode for the reset controller follows the pattern of the old bindings and adds a node with reset-cells for this. Cc: devicetree@vger.kernel.org Cc: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20210921184803.1757916-1-linus.walleij@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-27Merge branch 'ib-gpio-ppid' into develLinus Walleij
2021-10-27gpio: Allow per-parent interrupt dataMarc Zyngier
The core gpiolib code is able to deal with multiple interrupt parents for a single gpio irqchip. It however only allows a single piece of data to be conveyed to all flow handlers (either the gpio_chip or some other, driver-specific data). This means that drivers have to go through some interesting dance to find the correct context, something that isn't great in interrupt context (see aebdc8abc9db86e2bd33070fc2f961012fff74b4 for a prime example). Instead, offer an optional way for a pinctrl/gpio driver to provide an array of pointers which gets used to provide the correct context to the flow handler. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20211026175815.52703-2-joey.gouly@arm.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-10-26Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Daniel Borkmann says: ==================== pull-request: bpf 2021-10-26 We've added 12 non-merge commits during the last 7 day(s) which contain a total of 23 files changed, 118 insertions(+), 98 deletions(-). The main changes are: 1) Fix potential race window in BPF tail call compatibility check, from Toke Høiland-Jørgensen. 2) Fix memory leak in cgroup fs due to missing cgroup_bpf_offline(), from Quanyang Wang. 3) Fix file descriptor reference counting in generic_map_update_batch(), from Xu Kuohai. 4) Fix bpf_jit_limit knob to the max supported limit by the arch's JIT, from Lorenz Bauer. 5) Fix BPF sockmap ->poll callbacks for UDP and AF_UNIX sockets, from Cong Wang and Yucong Sun. 6) Fix BPF sockmap concurrency issue in TCP on non-blocking sendmsg calls, from Liu Jian. 7) Fix build failure of INODE_STORAGE and TASK_STORAGE maps on !CONFIG_NET, from Tejun Heo. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Fix potential race in tail call compatibility check bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET selftests/bpf: Use recv_timeout() instead of retries net: Implement ->sock_is_readable() for UDP and AF_UNIX skmsg: Extract and reuse sk_msg_is_readable() net: Rename ->stream_memory_read to ->sock_is_readable tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function cgroup: Fix memory leak caused by missing cgroup_bpf_offline bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch() bpf: Prevent increasing bpf_jit_limit above max bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT bpf: Define bpf_jit_alloc_exec_limit for riscv JIT ==================== Link: https://lore.kernel.org/r/20211026201920.11296-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-26f2fs: multidevice: support direct IOChao Yu
Commit 3c62be17d4f5 ("f2fs: support multiple devices") missed to support direct IO for multiple device feature, this patch adds to support the missing part of multidevice feature. In addition, for multiple device image, we should be aware of any issued direct write IO rather than just buffered write IO, so that fsync and syncfs can issue a preflush command to the device where direct write IO goes, to persist user data for posix compliant. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-10-26bpf: Fix potential race in tail call compatibility checkToke Høiland-Jørgensen
Lorenzo noticed that the code testing for program type compatibility of tail call maps is potentially racy in that two threads could encounter a map with an unset type simultaneously and both return true even though they are inserting incompatible programs. The race window is quite small, but artificially enlarging it by adding a usleep_range() inside the check in bpf_prog_array_compatible() makes it trivial to trigger from userspace with a program that does, essentially: map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0); pid = fork(); if (pid) { key = 0; value = xdp_fd; } else { key = 1; value = tc_fd; } err = bpf_map_update_elem(map_fd, &key, &value, 0); While the race window is small, it has potentially serious ramifications in that triggering it would allow a BPF program to tail call to a program of a different type. So let's get rid of it by protecting the update with a spinlock. The commit in the Fixes tag is the last commit that touches the code in question. v2: - Use a spinlock instead of an atomic variable and cmpxchg() (Alexei) v3: - Put lock and the members it protects into an embedded 'owner' struct (Daniel) Fixes: 3324b584b6f6 ("ebpf: misc core cleanup") Reported-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211026110019.363464-1-toke@redhat.com
2021-10-26bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NETTejun Heo
bpf_types.h has BPF_MAP_TYPE_INODE_STORAGE and BPF_MAP_TYPE_TASK_STORAGE declared inside #ifdef CONFIG_NET although they are built regardless of CONFIG_NET. So, when CONFIG_BPF_SYSCALL && !CONFIG_NET, they are built without the declarations leading to spurious build failures and not registered to bpf_map_types making them unavailable. Fix it by moving the BPF_MAP_TYPE for the two map types outside of CONFIG_NET. Reported-by: kernel test robot <lkp@intel.com> Fixes: a10787e6d58c ("bpf: Enable task local storage for tracing programs") Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/YXG1cuuSJDqHQfRY@slm.duckdns.org
2021-10-26skmsg: Extract and reuse sk_msg_is_readable()Cong Wang
tcp_bpf_sock_is_readable() is pretty much generic, we can extract it and reuse it for non-TCP sockets. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211008203306.37525-3-xiyou.wangcong@gmail.com
2021-10-26net: Rename ->stream_memory_read to ->sock_is_readableCong Wang
The proto ops ->stream_memory_read() is currently only used by TCP to check whether psock queue is empty or not. We need to rename it before reusing it for non-TCP protocols, and adjust the exsiting users accordingly. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211008203306.37525-2-xiyou.wangcong@gmail.com
2021-10-26net/mlx5: remove the recent devlink paramsJakub Kicinski
revert commit 46ae40b94d88 ("net/mlx5: Let user configure io_eq_size param") revert commit a6cb08daa3b4 ("net/mlx5: Let user configure event_eq_size param") revert commit 554604061979 ("net/mlx5: Let user configure max_macs param") The EQE parameters are applicable to more drivers, they should be configured via standard API, probably ethtool. Example of another driver needing something similar: https://lore.kernel.org/all/1633454136-14679-3-git-send-email-sbhatta@marvell.com/ The last param for "max_macs" is probably fine but the documentation is severely lacking. The meaning and implications for changing the param need to be stated. Link: https://lore.kernel.org/r/20211026152939.3125950-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-26fs: export an inode_update_time helperJosef Bacik
If you already have an inode and need to update the time on the inode there is no way to do this properly. Export this helper to allow file systems to update time on the inode so the appropriate handler is called, either ->update_time or generic_update_time. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26Merge tag 'arm-ffa-updates-5.16' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/drivers Arm FF-A updates for v5.16 Just couple of minor updates: - Adding support for MEMORY_LEND API - Handling compatibility with different firmware versions(especially dealing with newer/higher versions than the one supported by the driver) * tag 'arm-ffa-updates-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Remove unused 'compat_version' variable firmware: arm_ffa: Add support for MEM_LEND firmware: arm_ffa: Handle compatibility with different firmware versions firmware: arm_ffa: Fix __ffa_devices_unregister firmware: arm_ffa: Add missing remove callback to ffa_bus_type Link: https://lore.kernel.org/r/20211026141535.1920602-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-26Merge tag 'samsung-drivers-5.16' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/drivers Samsung SoC drivers changes for v5.16 1. Convert Exynos ChipID and ASV driver to a module and make it a default, instead of selected. The driver is not essential, so it could be disabled, if needed. 2. Add support for Exynos850 and Exynos Auto v9 to Exynos ChipID and ASV driver. 3. Get rid of HAVE_S3C_RTC because it was adding just another layer instead of direct dependencies. 4. Minor cleanups. * tag 'samsung-drivers-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: soc: samsung: exynos-chipid: add exynosautov9 SoC support rtc: s3c: remove HAVE_S3C_RTC in favor of direct dependencies soc: samsung: exynos-chipid: Add Exynos850 support dt-bindings: samsung: exynos-chipid: Document Exynos850 compatible soc: samsung: exynos-chipid: Pass revision reg offsets soc: samsung: pm_domains: drop unused is_off field arm64: exynos: don't have ARCH_EXYNOS select EXYNOS_CHIPID soc: samsung: exynos-chipid: do not enforce built-in soc: samsung: exynos-chipid: convert to a module soc: samsung: exynos-chipid: avoid soc_device_to_device() soc: samsung: exynos-pmu: Fix compilation when nothing selects CONFIG_MFD_CORE Link: https://lore.kernel.org/r/20211026094709.75692-2-krzysztof.kozlowski@canonical.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-26net: phylink: use supported_interfaces for phylink validationRussell King (Oracle)
If the network device supplies a supported interface bitmap, we can use that during phylink's validation to simplify MAC drivers in two ways by using the supported_interfaces bitmap to: 1. reject unsupported interfaces before calling into the MAC driver. 2. generate the set of all supported link modes across all supported interfaces (used mainly for SFP, but also some 10G PHYs.) Suggested-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26net: phylink: add MAC phy_interface_t bitmapRussell King
Add a phy_interface_t bitmap so the MAC driver can specifiy which PHY interface modes it supports. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26net: phy: add phy_interface_t bitmap supportRussell King (Oracle)
Add support for a bitmap for phy interface modes, which includes: - a macro to declare the interface bitmap - an inline helper to zero the interface bitmap - an inline helper to detect an empty interface bitmap - inline helpers to do a bitwise AND and OR operations on two interface bitmaps Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26mctp: Implement extended addressingJeremy Kerr
This change allows an extended address struct - struct sockaddr_mctp_ext - to be passed to sendmsg/recvmsg. This allows userspace to specify output ifindex and physical address information (for sendmsg) or receive the input ifindex/physaddr for incoming messages (for recvmsg). This is typically used by userspace for MCTP address discovery and assignment operations. The extended addressing facility is conditional on a new sockopt: MCTP_OPT_ADDR_EXT; userspace must explicitly enable addressing before the kernel will consume/populate the extended address data. Includes a fix for an uninitialised var: Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26tcp: rename sk_stream_alloc_skbEric Dumazet
sk_stream_alloc_skb() is only used by TCP. Rename it to make this clear, and move its declaration to include/net/tcp.h Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26ASoC: qdsp6: audioreach: add topology supportSrinivas Kandagatla
Add ASoC topology support in audioreach Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20211026111655.1702-14-srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-26ASoC: dt-bindings: rename q6afe.h to q6dsp-lpass-ports.hSrinivas Kandagatla
move all LPASS audio ports defines from q6afe.h to q6dsp-lpass-ports.h as these belong to LPASS IP. Also this move helps in reusing this header across multiple audio frameworks on Qualcomm Audio DSP. This patch is split out of the dt-bindings patch to enable easy review. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211026111655.1702-4-srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-26Merge tag '20210927135559.738-6-srinivas.kandagatla@linaro.org' of ↵Mark Brown
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into v11_20211026_srinivas_kandagatla_asoc_qcom_add_audioreach_support for audioreach support v5.15-rc1 + 20210927135559.738-[23456]-srinivas.kandagatla@linaro.org This immutable branch is based on v5.15-rc1 and contains the following patches extending the existig APR driver to also implement GPR: 20210927135559.738-2-srinivas.kandagatla@linaro.org 20210927135559.738-3-srinivas.kandagatla@linaro.org 20210927135559.738-4-srinivas.kandagatla@linaro.org 20210927135559.738-5-srinivas.kandagatla@linaro.org 20210927135559.738-6-srinivas.kandagatla@linaro.org
2021-10-26net: annotate data-race in neigh_output()Eric Dumazet
neigh_output() reads n->nud_state and hh->hh_len locklessly. This is fine, but we need to add annotations and document this. We evaluate skip_cache first to avoid reading these fields if the cache has to by bypassed. syzbot report: BUG: KCSAN: data-race in __neigh_event_send / ip_finish_output2 write to 0xffff88810798a885 of 1 bytes by interrupt on cpu 1: __neigh_event_send+0x40d/0xac0 net/core/neighbour.c:1128 neigh_event_send include/net/neighbour.h:444 [inline] neigh_resolve_output+0x104/0x410 net/core/neighbour.c:1476 neigh_output include/net/neighbour.h:510 [inline] ip_finish_output2+0x80a/0xaa0 net/ipv4/ip_output.c:221 ip_finish_output+0x3b5/0x510 net/ipv4/ip_output.c:309 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:423 dst_output include/net/dst.h:450 [inline] ip_local_out+0x164/0x220 net/ipv4/ip_output.c:126 __ip_queue_xmit+0x9d3/0xa20 net/ipv4/ip_output.c:525 ip_queue_xmit+0x34/0x40 net/ipv4/ip_output.c:539 __tcp_transmit_skb+0x142a/0x1a00 net/ipv4/tcp_output.c:1405 tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline] tcp_xmit_probe_skb net/ipv4/tcp_output.c:4011 [inline] tcp_write_wakeup+0x4a9/0x810 net/ipv4/tcp_output.c:4064 tcp_send_probe0+0x2c/0x2b0 net/ipv4/tcp_output.c:4079 tcp_probe_timer net/ipv4/tcp_timer.c:398 [inline] tcp_write_timer_handler+0x394/0x520 net/ipv4/tcp_timer.c:626 tcp_write_timer+0xb9/0x180 net/ipv4/tcp_timer.c:642 call_timer_fn+0x2e/0x1d0 kernel/time/timer.c:1421 expire_timers+0x135/0x240 kernel/time/timer.c:1466 __run_timers+0x368/0x430 kernel/time/timer.c:1734 run_timer_softirq+0x19/0x30 kernel/time/timer.c:1747 __do_softirq+0x12c/0x26e kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x4e/0xa0 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x69/0x80 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline] arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline] acpi_safe_halt drivers/acpi/processor_idle.c:109 [inline] acpi_idle_do_entry drivers/acpi/processor_idle.c:553 [inline] acpi_idle_enter+0x258/0x2e0 drivers/acpi/processor_idle.c:688 cpuidle_enter_state+0x2b4/0x760 drivers/cpuidle/cpuidle.c:237 cpuidle_enter+0x3c/0x60 drivers/cpuidle/cpuidle.c:351 call_cpuidle kernel/sched/idle.c:158 [inline] cpuidle_idle_call kernel/sched/idle.c:239 [inline] do_idle+0x1a3/0x250 kernel/sched/idle.c:306 cpu_startup_entry+0x15/0x20 kernel/sched/idle.c:403 secondary_startup_64_no_verify+0xb1/0xbb read to 0xffff88810798a885 of 1 bytes by interrupt on cpu 0: neigh_output include/net/neighbour.h:507 [inline] ip_finish_output2+0x79a/0xaa0 net/ipv4/ip_output.c:221 ip_finish_output+0x3b5/0x510 net/ipv4/ip_output.c:309 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:423 dst_output include/net/dst.h:450 [inline] ip_local_out+0x164/0x220 net/ipv4/ip_output.c:126 __ip_queue_xmit+0x9d3/0xa20 net/ipv4/ip_output.c:525 ip_queue_xmit+0x34/0x40 net/ipv4/ip_output.c:539 __tcp_transmit_skb+0x142a/0x1a00 net/ipv4/tcp_output.c:1405 tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline] tcp_xmit_probe_skb net/ipv4/tcp_output.c:4011 [inline] tcp_write_wakeup+0x4a9/0x810 net/ipv4/tcp_output.c:4064 tcp_send_probe0+0x2c/0x2b0 net/ipv4/tcp_output.c:4079 tcp_probe_timer net/ipv4/tcp_timer.c:398 [inline] tcp_write_timer_handler+0x394/0x520 net/ipv4/tcp_timer.c:626 tcp_write_timer+0xb9/0x180 net/ipv4/tcp_timer.c:642 call_timer_fn+0x2e/0x1d0 kernel/time/timer.c:1421 expire_timers+0x135/0x240 kernel/time/timer.c:1466 __run_timers+0x368/0x430 kernel/time/timer.c:1734 run_timer_softirq+0x19/0x30 kernel/time/timer.c:1747 __do_softirq+0x12c/0x26e kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu kernel/softirq.c:636 [inline] irq_exit_rcu+0x4e/0xa0 kernel/softirq.c:648 sysvec_apic_timer_interrupt+0x69/0x80 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline] arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline] acpi_safe_halt drivers/acpi/processor_idle.c:109 [inline] acpi_idle_do_entry drivers/acpi/processor_idle.c:553 [inline] acpi_idle_enter+0x258/0x2e0 drivers/acpi/processor_idle.c:688 cpuidle_enter_state+0x2b4/0x760 drivers/cpuidle/cpuidle.c:237 cpuidle_enter+0x3c/0x60 drivers/cpuidle/cpuidle.c:351 call_cpuidle kernel/sched/idle.c:158 [inline] cpuidle_idle_call kernel/sched/idle.c:239 [inline] do_idle+0x1a3/0x250 kernel/sched/idle.c:306 cpu_startup_entry+0x15/0x20 kernel/sched/idle.c:403 rest_init+0xee/0x100 init/main.c:734 arch_call_rest_init+0xa/0xb start_kernel+0x5e4/0x669 init/main.c:1142 secondary_startup_64_no_verify+0xb1/0xbb value changed: 0x20 -> 0x01 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26Merge tag 'mlx5-updates-2021-10-25' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-10-25 Misc updates for mlx5 driver: 1) Misc updates and cleanups: - Don't write directly to netdev->dev_addr, From Jakub Kicinski - Remove unnecessary checks for slow path flag in tc module - Fix unused function warning of mlx5i_flow_type_mask - Bridge, support replacing existing FDB entry 2) Sub Functions, Reduction in memory usage: - Reduce flow counters bulk query buffer size - Implement max_macs devlink parameter - Add devlink vendor params to control Event Queue sizes - Added SF life cycle trace points by Parav/ 3) From Aya, Firmware health buffer reporting improvements - Print health buffer by log level and more missing information - Periodic update of host time to firmware ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26net: multicast: calculate csum of looped-back and forwarded packetsCyril Strejc
During a testing of an user-space application which transmits UDP multicast datagrams and utilizes multicast routing to send the UDP datagrams out of defined network interfaces, I've found a multicast router does not fill-in UDP checksum into locally produced, looped-back and forwarded UDP datagrams, if an original output NIC the datagrams are sent to has UDP TX checksum offload enabled. The datagrams are sent malformed out of the NIC the datagrams have been forwarded to. It is because: 1. If TX checksum offload is enabled on the output NIC, UDP checksum is not calculated by kernel and is not filled into skb data. 2. dev_loopback_xmit(), which is called solely by ip_mc_finish_output(), sets skb->ip_summed = CHECKSUM_UNNECESSARY unconditionally. 3. Since 35fc92a9 ("[NET]: Allow forwarding of ip_summed except CHECKSUM_COMPLETE"), the ip_summed value is preserved during forwarding. 4. If ip_summed != CHECKSUM_PARTIAL, checksum is not calculated during a packet egress. The minimum fix in dev_loopback_xmit(): 1. Preserves skb->ip_summed CHECKSUM_PARTIAL. This is the case when the original output NIC has TX checksum offload enabled. The effects are: a) If the forwarding destination interface supports TX checksum offloading, the NIC driver is responsible to fill-in the checksum. b) If the forwarding destination interface does NOT support TX checksum offloading, checksums are filled-in by kernel before skb is submitted to the NIC driver. c) For local delivery, checksum validation is skipped as in the case of CHECKSUM_UNNECESSARY, thanks to skb_csum_unnecessary(). 2. Translates ip_summed CHECKSUM_NONE to CHECKSUM_UNNECESSARY. It means, for CHECKSUM_NONE, the behavior is unmodified and is there to skip a looped-back packet local delivery checksum validation. Signed-off-by: Cyril Strejc <cyril.strejc@skoda.cz> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26dt-bindings: phy: cadence-torrent: Add clock IDs for derived and received refclkSwapnil Jakhade
Add clock IDs for derived and received reference clock output. Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210922123735.21927-3-sjakhade@cadence.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-10-26genirq: Hide irq_cpu_{on,off}line() behind a deprecated optionMarc Zyngier
irq_cpu_{on,off}line() are now only used by the Octeon platform. Make their use conditional on this plaform being enabled, and otherwise hidden away. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/r/20211021170414.3341522-4-maz@kernel.org
2021-10-26irq: remove handle_domain_{irq,nmi}()Mark Rutland
Now that entry code handles IRQ entry (including setting the IRQ regs) before calling irqchip code, irqchip code can safely call generic_handle_domain_irq(), and there's no functional reason for it to call handle_domain_irq(). Let's cement this split of responsibility and remove handle_domain_irq() entirely, updating irqchip drivers to call generic_handle_domain_irq(). For consistency, handle_domain_nmi() is similarly removed and replaced with a generic_handle_domain_nmi() function which also does not perform any entry logic. Previously handle_domain_{irq,nmi}() had a WARN_ON() which would fire when they were called in an inappropriate context. So that we can identify similar issues going forward, similar WARN_ON_ONCE() logic is added to the generic_handle_*() functions, and comments are updated for clarity and consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2021-10-26signal: Add an optional check for altstack sizeThomas Gleixner
New x86 FPU features will be very large, requiring ~10k of stack in signal handlers. These new features require a new approach called "dynamic features". The kernel currently tries to ensure that altstacks are reasonably sized. Right now, on x86, sys_sigaltstack() requires a size of >=2k. However, that 2k is a constant. Simply raising that 2k requirement to >10k for the new features would break existing apps which have a compiled-in size of 2k. Instead of universally enforcing a larger stack, prohibit a process from using dynamic features without properly-sized altstacks. This must be enforced in two places: * A dynamic feature can not be enabled without an large-enough altstack for each process thread. * Once a dynamic feature is enabled, any request to install a too-small altstack will be rejected The dynamic feature enabling code must examine each thread in a process to ensure that the altstacks are large enough. Add a new lock (sigaltstack_lock()) to ensure that threads can not race and change their altstack after being examined. Add the infrastructure in form of a config option and provide empty stubs for architectures which do not need dynamic altstack size checks. This implementation will be fleshed out for x86 in a future patch called x86/arch_prctl: Add controls for dynamic XSTATE components [dhansen: commit message. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211021225527.10184-2-chang.seok.bae@intel.com
2021-10-26tpm: fix Atmel TPM crash caused by too frequent queriesHao Wu
The Atmel TPM 1.2 chips crash with error `tpm_try_transmit: send(): error -62` since kernel 4.14. It is observed from the kernel log after running `tpm_sealdata -z`. The error thrown from the command is as follows ``` $ tpm_sealdata -z Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl, code=0087 (135), I/O error ``` The issue was reproduced with the following Atmel TPM chip: ``` $ tpm_version T0 TPM 1.2 Version Info: Chip Version: 1.2.66.1 Spec Level: 2 Errata Revision: 3 TPM Vendor ID: ATML TPM Version: 01010000 Manufacturer Info: 41544d4c ``` The root cause of the issue is due to the TPM calls to msleep() were replaced with usleep_range() [1], which reduces the actual timeout. Via experiments, it is observed that the original msleep(5) actually sleeps for 15ms. Because of a known timeout issue in Atmel TPM 1.2 chip, the shorter timeout than 15ms can cause the error described above. A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further reduced the timeout to less than 1ms. With experiments, the problematic timeout in the latest kernel is the one for `wait_for_tpm_stat`. To fix it, the patch reverts the timeout of `wait_for_tpm_stat` to 15ms for all Atmel TPM 1.2 chips, but leave it untouched for Ateml TPM 2.0 chip, and chips from other vendors. As explained above, the chosen 15ms timeout is the actual timeout before this issue introduced, thus the old value is used here. Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to the existing TPM_TIMEOUT_RANGE_US (300us). The fixed has been tested in the system with the affected Atmel chip with no issues observed after boot up. References: [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit() [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer granularity Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers") Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/ Signed-off-by: Hao Wu <hao.wu@rubrik.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-10-25drm: Update MST First Link Slot Information Based on Encoding FormatBhawanpreet Lakha
8b/10b encoding format requires to reserve the first slot for recording metadata. Real data transmission starts from the second slot, with a total of available 63 slots available. In 128b/132b encoding format, metadata is transmitted separately in LLCP packet before MTP. Real data transmission starts from the first slot, with a total of 64 slots available. v2: * Move total/start slots to mst_state, and copy it to mst_mgr in atomic_check v3: * Only keep the slot info on the mst_state * add a start_slot parameter to the payload function, to facilitate non atomic drivers (this is a temporary workaround and should be removed when we are moving out the non atomic driver helpers) v4: *fixed typo and formatting v5: (no functional changes) * Fixed formatting in drm_dp_mst_update_slots() * Reference mst_state instead of mst_state->mgr for debugging info Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> [v5 nitpicks] Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211025223825.301703-3-lyude@redhat.com
2021-10-25ipv4: guard IP_MINTTL with a static keyEric Dumazet
RFC 5082 IP_MINTTL option is rarely used on hosts. Add a static key to remove from TCP fast path useless code, and potential cache line miss to fetch inet_sk(sk)->min_ttl Note that once ip4_min_ttl static key has been enabled, it stays enabled until next boot. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25ipv6: guard IPV6_MINHOPCOUNT with a static keyEric Dumazet
RFC 5082 IPV6_MINHOPCOUNT is rarely used on hosts. Add a static key to remove from TCP fast path useless code, and potential cache line miss to fetch tcp_inet6_sk(sk)->min_hopcount Note that once ip6_min_hopcount static key has been enabled, it stays enabled until next boot. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net: annotate accesses to sk->sk_rx_queue_mappingEric Dumazet
sk->sk_rx_queue_mapping can be modified locklessly, add a couple of READ_ONCE()/WRITE_ONCE() to document this fact. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net: avoid dirtying sk->sk_rx_queue_mappingEric Dumazet
sk_rx_queue_mapping is located in a cache line that should be kept read mostly. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25net: avoid dirtying sk->sk_napi_idEric Dumazet
sk_napi_id is located in a cache line that can be kept read mostly. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25ipv6: move inet6_sk(sk)->rx_dst_cookie to sk->sk_rx_dst_cookieEric Dumazet
Increase cache locality by moving rx_dst_coookie next to sk->sk_rx_dst This removes one or two cache line misses in IPv6 early demux (TCP/UDP) Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>