summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2021-04-19net: phy: add genphy_c45_pma_suspend/resumeRadu Pirea (NXP OSS)
Add generic PMA suspend and resume callback functions for C45 PHYs. Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19netlink: simplify nl_set_extack_cookie_u64(), nl_set_extack_cookie_u32()Alexey Dobriyan
Taking address of a function argument directly works just fine. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19bpf: Add a bpf_snprintf helperFlorent Revest
The implementation takes inspiration from the existing bpf_trace_printk helper but there are a few differences: To allow for a large number of format-specifiers, parameters are provided in an array, like in bpf_seq_printf. Because the output string takes two arguments and the array of parameters also takes two arguments, the format string needs to fit in one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to a zero-terminated read-only map so we don't need a format string length arg. Because the format-string is known at verification time, we also do a first pass of format string validation in the verifier logic. This makes debugging easier. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org
2021-04-19bpf: Add a ARG_PTR_TO_CONST_STR argument typeFlorent Revest
This type provides the guarantee that an argument is going to be a const pointer to somewhere in a read-only map value. It also checks that this pointer is followed by a zero character before the end of the map value. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-3-revest@chromium.org
2021-04-19bpf: Factorize bpf_trace_printk and bpf_seq_printfFlorent Revest
Two helpers (trace_printk and seq_printf) have very similar implementations of format string parsing and a third one is coming (snprintf). To avoid code duplication and make the code easier to maintain, this moves the operations associated with format string parsing (validation and argument sanitization) into one generic function. The implementation of the two existing helpers already drifted quite a bit so unifying them entailed a lot of changes: - bpf_trace_printk always expected fmt[fmt_size] to be the terminating NULL character, this is no longer true, the first 0 is terminating. - bpf_trace_printk now supports %% (which produces the percentage char). - bpf_trace_printk now skips width formating fields. - bpf_trace_printk now supports the X modifier (capital hexadecimal). - bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6 - argument casting on 32 bit has been simplified into one macro and using an enum instead of obscure int increments. - bpf_seq_printf now uses bpf_trace_copy_string instead of strncpy_from_kernel_nofault and handles the %pks %pus specifiers. - bpf_seq_printf now prints longs correctly on 32 bit architectures. - both were changed to use a global per-cpu tmp buffer instead of one stack buffer for trace_printk and 6 small buffers for seq_printf. - to avoid per-cpu buffer usage conflict, these helpers disable preemption while the per-cpu buffer is in use. - both helpers now support the %ps and %pS specifiers to print symbols. The implementation is also moved from bpf_trace.c to helpers.c because the upcoming bpf_snprintf helper will be made available to all BPF programs and will need it. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-2-revest@chromium.org
2021-04-19perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHEKan Liang
Current Hardware events and Hardware cache events have special perf types, PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The two types don't pass the PMU type in the user interface. For a hybrid system, the perf subsystem doesn't know which PMU the events belong to. The first capable PMU will always be assigned to the events. The events never get a chance to run on the other capable PMUs. Extend the two types to become PMU aware types. The PMU type ID is stored at attr.config[63:32]. Add a new PMU capability, PERF_PMU_CAP_EXTENDED_HW_TYPE, to indicate a PMU which supports the extended PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The PMU type is only required when searching a specific PMU. The PMU specific codes will only be interested in the 'real' config value, which is stored in the low 32 bit of the event->attr.config. Update the event->attr.config in the generic code, so the PMU specific codes don't need to calculate it separately. If a user specifies a PMU type, but the PMU doesn't support the extended type, error out. If an event cannot be initialized in a PMU specified by a user, error out immediately. Perf should not try to open it on other PMUs. The new PMU capability is only set for the X86 hybrid PMUs for now. Other architectures, e.g., ARM, may need it as well. The support on ARM may be implemented later separately. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-22-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Add structures for the attributes of Hybrid PMUsKan Liang
Hybrid PMUs have different events and formats. In theory, Hybrid PMU specific attributes should be maintained in the dedicated struct x86_hybrid_pmu, but it wastes space because the events and formats are similar among Hybrid PMUs. To reduce duplication, all hybrid PMUs will share a group of attributes in the following patch. To distinguish an attribute from different Hybrid PMUs, a PMU aware attribute structure is introduced. A PMU type is required for the attribute structure. The type is internal usage. It is not visible in the sysfs API. Hybrid PMUs may support the same event name, but with different event encoding, e.g., the mem-loads event on an Atom PMU has different event encoding from a Core PMU. It brings issue if two attributes are created for them. Current sysfs_update_group finds an attribute by searching the attr name (aka event name). If two attributes have the same event name, the first attribute will be replaced. To address the issue, only one attribute is created for the event. The event_str is extended and stores event encodings from all Hybrid PMUs. Each event encoding is divided by ";". The order of the event encodings must follow the order of the hybrid PMU index. The event_str is internal usage as well. When a user wants to show the attribute of a Hybrid PMU, only the corresponding part of the string is displayed. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-18-git-send-email-kan.liang@linux.intel.com
2021-04-19dm: replace dm_vcalloc()Matthew Wilcox (Oracle)
Use kvcalloc or kvmalloc_array instead (depending whether zeroing is useful). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-04-19btrfs: add and use readahead_batch_lengthMatthew Wilcox (Oracle)
Implement readahead_batch_length() to determine the number of bytes in the current batch of readahead pages and use it in btrfs. Also use the readahead_pos to get the offset. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-19fs: introduce a wrapper uuid_to_fsid()Amir Goldstein
Some filesystem's use a digest of their uuid for f_fsid. Create a simple wrapper for this open coded folding. Filesystems that have a non null uuid but use the block device number for f_fsid may also consider using this helper. [JK: Added missing asm/byteorder.h include] Link: https://lore.kernel.org/r/20210322173944.449469-2-amir73il@gmail.com Acked-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-04-19KVM: x86/mmu: Re-add const qualifier in kvm_tdp_mmu_zap_collapsible_sptesBen Gardon
kvm_tdp_mmu_zap_collapsible_sptes unnecessarily removes the const qualifier from its memlsot argument, leading to a compiler warning. Add the const annotation and pass it to subsequent functions. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-2-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-19wireless: fix spelling of A-MSDU in HE capabilitiesJohannes Berg
In the HE capabilities, spell A-MSDU correctly, not "A-MDSU". Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210409123755.9e6ff1af1181.If6868bc6902ccd9a95c74c78f716c4b41473ef14@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-19wireless: align HE capabilities A-MPDU Length Exponent ExtensionJohannes Berg
The A-MPDU length exponent extension is defined differently in 802.11ax D6.1, align with that. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210409123755.c2a257d3e2df.I3455245d388c52c61dace7e7958dbed7e807cfb6@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-19wireless: align some HE capabilities with the specJohannes Berg
Some names were changed, align that with the spec as of 802.11ax-D6.1. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210409123755.b1e5fbab0d8c.I3eb6076cb0714ec6aec6b8f9dee613ce4a05d825@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-19ieee80211: add the values of ranging parameters max LTF total fieldAvraham Stern
Add an enum with the values of the ranging parameters max LTF total field, as defined in IEEE802.11az_D2.6, table Table 9-322h23fc. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20210409123755.d2588ebb1974.I9424c8ade13c4c938cb9999d8ce99d0d4c1cc198@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-18Drivers: hv: vmbus: Drivers: hv: vmbus: Introduce ↵Andrea Parri (Microsoft)
CHANNELMSG_MODIFYCHANNEL_RESPONSE Introduce the CHANNELMSG_MODIFYCHANNEL_RESPONSE message type, and code to receive and process such a message. Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210416143449.16185-3-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-04-18Drivers: hv: vmbus: Introduce and negotiate VMBus protocol version 5.3Andrea Parri (Microsoft)
Hyper-V has added VMBus protocol version 5.3. Allow Linux guests to negotiate the new version on version of Hyper-V that support it. Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210416143449.16185-2-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c - keep the ZC code, drop the code related to reinit net/bridge/netfilter/ebtables.c - fix build after move to net_generic Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-17Merge tag 'net-5.12-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.12-rc8, including fixes from netfilter, and bpf. BPF verifier changes stand out, otherwise things have slowed down. Current release - regressions: - gro: ensure frag0 meets IP header alignment - Revert "net: stmmac: re-init rx buffers when mac resume back" - ethernet: macb: fix the restore of cmp registers Previous releases - regressions: - ixgbe: Fix NULL pointer dereference in ethtool loopback test - ixgbe: fix unbalanced device enable/disable in suspend/resume - phy: marvell: fix detection of PHY on Topaz switches - make tcp_allowed_congestion_control readonly in non-init netns - xen-netback: Check for hotplug-status existence before watching Previous releases - always broken: - bpf: mitigate a speculative oob read of up to map value size by tightening the masking window - sctp: fix race condition in sctp_destroy_sock - sit, ip6_tunnel: Unregister catch-all devices - netfilter: nftables: clone set element expression template - netfilter: flowtable: fix NAT IPv6 offload mangling - net: geneve: check skb is large enough for IPv4/IPv6 header - netlink: don't call ->netlink_bind with table lock held" * tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits) netlink: don't call ->netlink_bind with table lock held MAINTAINERS: update my email bpf: Update selftests to reflect new error states bpf: Tighten speculative pointer arithmetic mask bpf: Move sanitize_val_alu out of op switch bpf: Refactor and streamline bounds check into helper bpf: Improve verifier error messages for users bpf: Rework ptr_limit into alu_limit and add common error path bpf: Ensure off_reg has no mixed signed bounds for all types bpf: Move off_reg into sanitize_ptr_alu bpf: Use correct permission flag for mixed signed bounds arithmetic ch_ktls: do not send snd_una update to TCB in middle ch_ktls: tcb close causes tls connection failure ch_ktls: fix device connection close ch_ktls: Fix kernel panic i40e: fix the panic when running bpf in xdpdrv mode net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta net/mlx5e: Fix setting of RS FEC mode net/mlx5: Fix setting of devlink traps in switchdev mode Revert "net: stmmac: re-init rx buffers when mac resume back" ...
2021-04-17Merge tag 'libnvdimm-fixes-for-5.12-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "The largest change is for a regression that landed during -rc1 for block-device read-only handling. Vaibhav found a new use for the ability (originally introduced by virtio_pmem) to call back to the platform to flush data, but also found an original bug in that implementation. Lastly, Arnd cleans up some compile warnings in dax. This has all appeared in -next with no reported issues. Summary: - Fix a regression of read-only handling in the pmem driver - Fix a compile warning - Fix support for platform cache flush commands on powerpc/papr" * tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC libnvdimm: Notify disk drivers to revalidate region read-only dax: avoid -Wempty-body warnings
2021-04-17KVM: Kill off the old hva-based MMU notifier callbacksSean Christopherson
Yank out the hva-based MMU notifier APIs now that all architectures that use the notifiers have moved to the gfn-based APIs. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: Move x86's MMU notifier memslot walkers to generic codeSean Christopherson
Move the hva->gfn lookup for MMU notifiers into common code. Every arch does a similar lookup, and some arch code is all but identical across multiple architectures. In addition to consolidating code, this will allow introducing optimizations that will benefit all architectures without incurring multiple walks of the memslots, e.g. by taking mmu_lock if and only if a relevant range exists in the memslots. The use of __always_inline to avoid indirect call retpolines, as done by x86, may also benefit other architectures. Consolidating the lookups also fixes a wart in x86, where the legacy MMU and TDP MMU each do their own memslot walks. Lastly, future enhancements to the memslot implementation, e.g. to add an interval tree to track host address, will need to touch far less arch specific code. MIPS, PPC, and arm64 will be converted one at a time in future patches. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: constify kvm_arch_flush_remote_tlbs_memslotPaolo Bonzini
memslots are stored in RCU and there should be no need to change them. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: Move prototypes for MMU notifier callbacks to generic codeSean Christopherson
Move the prototypes for the MMU notifier callbacks out of arch code and into common code. There is no benefit to having each arch replicate the prototypes since any deviation from the invocation in common code will explode. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-16ethtool: add interface to read RMON statsJakub Kicinski
Most devices maintain RMON (RFC 2819) stats - particularly the "histogram" of packets received by size. Unlike other RFCs which duplicate IEEE stats, the short/oversized frame counters in RMON don't seem to match IEEE stats 1-to-1 either, so expose those, too. Do not expose basic packet, CRC errors etc - those are already otherwise covered. Because standard defines packet ranges only up to 1518, and everything above that should theoretically be "oversized" - devices often create their own ranges. Going beyond what the RFC defines - expose the "histogram" in the Tx direction (assume for now that the ranges will be the same). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16ethtool: add interface to read standard MAC Ctrl statsJakub Kicinski
Number of devices maintains the standard-based MAC control counters for control frames. Add a API for those. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16ethtool: add interface to read standard MAC statsJakub Kicinski
Most of the MAC statistics are included in struct rtnl_link_stats64, but some fields are aggregated. Besides it's good to expose these clearly hardware stats separately. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16ethtool: add a new command for reading standard statsJakub Kicinski
Add an interface for reading standard stats, including stats which don't have a corresponding control interface. Start with IEEE 802.3 PHY stats. There seems to be only one stat to expose there. Define API to not require user space changes when new stats or groups are added. Groups are based on bitset, stats have a string set associated. v1: wrap stats in a nest Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16Merge tag 'mlx5-updates-2021-04-16' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-04-16 This patchset introduces updates to mlx5e netdev driver. 1) Tariq refactors TLS offloads and adds resiliency against RX resync failures 2) Maxim reduces code duplications by unifying channels reset flow regardless if channels are closed or open 3) Aya Enhances TX/RX health reporters diagnostics to expose the internal clock time-stamping format 4) Moshe adds support for ethtool extended link state, to show the reason for link down ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16kasan: remove redundant config optionWalter Wu
CONFIG_KASAN_STACK and CONFIG_KASAN_STACK_ENABLE both enable KASAN stack instrumentation, but we should only need one config, so that we remove CONFIG_KASAN_STACK_ENABLE and make CONFIG_KASAN_STACK workable. see [1]. When enable KASAN stack instrumentation, then for gcc we could do no prompt and default value y, and for clang prompt and default value n. This patch fixes the following compilation warning: include/linux/kasan.h:333:30: warning: 'CONFIG_KASAN_STACK' is not defined, evaluates to 0 [-Wundef] [akpm@linux-foundation.org: fix merge snafu] Link: https://bugzilla.kernel.org/show_bug.cgi?id=210221 [1] Link: https://lkml.kernel.org/r/20210226012531.29231-1-walter-zh.wu@mediatek.com Fixes: d9b571c885a8 ("kasan: fix KASAN_STACK dependency for HW_TAGS") Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-16net: Add a WWAN subsystemLoic Poulain
This change introduces initial support for a WWAN framework. Given the complexity and heterogeneity of existing WWAN hardwares and interfaces, there is no strict definition of what a WWAN device is and how it should be represented. It's often a collection of multiple devices that perform the global WWAN feature (netdev, tty, chardev, etc). One usual way to expose modem controls and configuration is via high level protocols such as the well known AT command protocol, MBIM or QMI. The USB modems started to expose them as character devices, and user daemons such as ModemManager learnt to use them. This initial version adds the concept of WWAN port, which is a logical pipe to a modem control protocol. The protocols are rawly exposed to user via character device, allowing straigthforward support in existing tools (ModemManager, ofono...). The WWAN core takes care of the generic part, including character device management, and relies on port driver operations to receive/submit protocol data. Since the different devices exposing protocols for a same WWAN hardware do not necessarily know about each others (e.g. two different USB interfaces, PCI/MHI channel devices...) and can be created/removed in different orders, the WWAN core ensures that all WAN ports contributing to the 'whole' WWAN feature are grouped under the same virtual WWAN device, relying on the provided parent device (e.g. mhi controller, USB device). It's a 'trick' I copied from Johannes's earlier WWAN subsystem proposal. This initial version is purposely minimalist, it's essentially moving the generic part of the previously proposed mhi_wwan_ctrl driver inside a common WWAN framework, but the implementation is open and flexible enough to allow extension for further drivers. Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16time/timecounter: Mark 1st argument of timecounter_cyc2time() as constMarc Kleine-Budde
The timecounter is not modified in this function. Mark it as const. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210303103544.994855-1-mkl@pengutronix.de
2021-04-16net/mlx5: Add register layout to support extended link stateMoshe Tal
Add needed structure layouts and defines for pddr register (Port Diagnostics Database Register) and the troublshooting page. This will be used to get extended link state from the monitor opcode bits. Signed-off-by: Moshe Tal <moshet@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16mtd: core: Constify buf in mtd_write_user_prot_reg()Tudor Ambarus
The write buffer comes from user and should be const. Constify write buffer in mtd core and across all _write_user_prot_reg() users. cfi_cmdset_{0001, 0002} and onenand_base will pay the cost of an explicit cast to discard the const qualifier since the beginning, since they are using an otp_op_t function prototype that is used for both reads and writes. mtd_dataflash and SPI NOR will benefit of the const buffer because they are using different paths for writes and reads. Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20210403060931.7119-1-tudor.ambarus@microchip.com
2021-04-16perf core: Add PERF_COUNT_SW_CGROUP_SWITCHES eventNamhyung Kim
This patch adds a new software event to count context switches involving cgroup switches. So it's counted only if cgroups of previous and next tasks are different. Note that it only checks the cgroups in the perf_event subsystem. For cgroup v2, it shouldn't matter anyway. One can argue that we can do this by using existing sched_switch event with eBPF. But some systems might not have eBPF for some reason so I'd like to add this as a simple way. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210210083327.22726-2-namhyung@kernel.org
2021-04-16perf core: Factor out __perf_sw_event_schedNamhyung Kim
In some cases, we need to check more than whether the software event is enabled. So split the condition check and the actual event handling. This is a preparation for the next change. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210210083327.22726-1-namhyung@kernel.org
2021-04-16PCI: tegra: Add Tegra194 MCFG quirks for ECAM errataVidya Sagar
The PCIe controller in Tegra194 SoC is not ECAM-compliant. With the current hardware design, ECAM can be enabled only for one controller (the C5 controller) with bus numbers starting from 160 instead of 0. A different approach is taken to avoid this abnormal way of enabling ECAM for just one controller but to enable configuration space access for all the other controllers. In this approach, ops are added through MCFG quirk mechanism which access the configuration spaces by dynamically programming iATU (internal AddressTranslation Unit) to generate respective configuration accesses just like the way it is done in DesignWare core sub-system. This issue is specific to Tegra194 and it would be fixed in the future generations of Tegra SoCs. Link: https://lore.kernel.org/r/20210416134537.19474-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-04-16iommu: Streamline registration interfaceRobin Murphy
Rather than have separate opaque setter functions that are easy to overlook and lead to repetitive boilerplate in drivers, let's pass the relevant initialisation parameters directly to iommu_device_register(). Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ab001b87c533b6f4db71eb90db6f888953986c36.1617285386.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-16iommu: Statically set module ownerRobin Murphy
It happens that the 3 drivers which first supported being modular are also ones which play games with their pgsize_bitmap, so have non-const iommu_ops where dynamically setting the owner manages to work out OK. However, it's less than ideal to force that upon all drivers which want to be modular - like the new sprd-iommu driver which now has a potential bug in that regard - so let's just statically set the module owner and let ops remain const wherever possible. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/31423b99ff609c3d4b291c701a7a7a810d9ce8dc.1617285386.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-16Merge branches 'iommu/fixes', 'arm/mediatek', 'arm/smmu', 'arm/exynos', ↵Joerg Roedel
'unisoc', 'x86/vt-d', 'x86/amd' and 'core' into next
2021-04-16debugfs: Implement debugfs_create_str()Peter Zijlstra
Implement debugfs_create_str() to easily display names and such in debugfs. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.415407080@infradead.org
2021-04-16sched: Move SCHED_DEBUG sysctl to debugfsPeter Zijlstra
Stop polluting sysctl with undocumented knobs that really are debug only, move them all to /debug/sched/ along with the existing /debug/sched_* files that already exist. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210412102001.287610138@infradead.org
2021-04-16cpumask: Introduce DYING maskPeter Zijlstra
Introduce a cpumask that indicates (for each CPU) what direction the CPU hotplug is currently going. Notably, it tracks rollbacks. Eg. when an up fails and we do a roll-back down, it will accurately reflect the direction. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210310150109.151441252@infradead.org
2021-04-16cpumask: Make cpu_{online,possible,present,active}() inlinePeter Zijlstra
Prepare for addition of another mask. Primarily a code movement to avoid having to create more #ifdef, but while there, convert everything with an argument to an inline function. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210310150109.045447765@infradead.org
2021-04-16perf: Add support for SIGTRAP on perf eventsMarco Elver
Adds bit perf_event_attr::sigtrap, which can be set to cause events to send SIGTRAP (with si_code TRAP_PERF) to the task where the event occurred. The primary motivation is to support synchronous signals on perf events in the task where an event (such as breakpoints) triggered. To distinguish perf events based on the event type, the type is set in si_errno. For events that are associated with an address, si_addr is copied from perf_sample_data. The new field perf_event_attr::sig_data is copied to si_perf, which allows user space to disambiguate which event (of the same type) triggered the signal. For example, user space could encode the relevant information it cares about in sig_data. We note that the choice of an opaque u64 provides the simplest and most flexible option. Alternatives where a reference to some user space data is passed back suffer from the problem that modification of referenced data (be it the event fd, or the perf_event_attr) can race with the signal being delivered (of course, the same caveat applies if user space decides to store a pointer in sig_data, but the ABI explicitly avoids prescribing such a design). Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@hirez.programming.kicks-ass.net/
2021-04-16signal: Introduce TRAP_PERF si_code and si_perf to siginfoMarco Elver
Introduces the TRAP_PERF si_code, and associated siginfo_t field si_perf. These will be used by the perf event subsystem to send signals (if requested) to the task where an event occurred. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Acked-by: Arnd Bergmann <arnd@arndb.de> # asm-generic Link: https://lkml.kernel.org/r/20210408103605.1676875-6-elver@google.com
2021-04-16perf: Support only inheriting events if cloned with CLONE_THREADMarco Elver
Adds bit perf_event_attr::inherit_thread, to restricting inheriting events only if the child was cloned with CLONE_THREAD. This option supports the case where an event is supposed to be process-wide only (including subthreads), but should not propagate beyond the current process's shared environment. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/lkml/YBvj6eJR%2FDY2TsEB@hirez.programming.kicks-ass.net/
2021-04-16perf: Rework perf_event_exit_event()Peter Zijlstra
Make perf_event_exit_event() more robust, such that we can use it from other contexts. Specifically the up and coming remove_on_exec. For this to work we need to address a few issues. Remove_on_exec will not destroy the entire context, so we cannot rely on TASK_TOMBSTONE to disable event_function_call() and we thus have to use perf_remove_from_context(). When using perf_remove_from_context(), there's two races to consider. The first is against close(), where we can have concurrent tear-down of the event. The second is against child_list iteration, which should not find a half baked event. To address this, teach perf_remove_from_context() to special case !ctx->is_active and about DETACH_CHILD. [ elver@google.com: fix racing parent/child exit in sync_child_event(). ] Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210408103605.1676875-2-elver@google.com
2021-04-16blk-mq: bypass IO scheduler's limit_depth for passthrough requestLin Feng
Commit 01e99aeca39796003 ("blk-mq: insert passthrough request into hctx->dispatch directly") gives high priority to passthrough requests and bypass underlying IO scheduler. But as we allocate tag for such request it still runs io-scheduler's callback limit_depth, while we really want is to give full sbitmap-depth capabity to such request for acquiring available tag. blktrace shows PC requests(dmraid -s -c -i) hit bfq's limit_depth: 8,0 2 0 0.000000000 39952 1,0 m N bfq [bfq_limit_depth] wr_busy 0 sync 0 depth 8 8,0 2 1 0.000008134 39952 D R 4 [dmraid] 8,0 2 2 0.000021538 24 C R [0] 8,0 2 0 0.000035442 39952 1,0 m N bfq [bfq_limit_depth] wr_busy 0 sync 0 depth 8 8,0 2 3 0.000038813 39952 D R 24 [dmraid] 8,0 2 4 0.000044356 24 C R [0] This patch introduce a new wrapper to make code not that ugly. Signed-off-by: Lin Feng <linf@wangsu.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210415033920.213963-1-linf@wangsu.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-16fs: split receive_fd_replace from __receive_fdChristoph Hellwig
receive_fd_replace shares almost no code with the general case, so split it out. Also remove the "Bump the sock usage counts" comment from both copies, as that is now what __receive_sock actually does. [AV: ... and make the only user of receive_fd_replace() choose between it and receive_fd() according to what userland had passed to it in flags] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>