summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-12-09driver core: Add fwnode_init()Saravana Kannan
There are multiple locations in the kernel where a struct fwnode_handle is initialized. Add fwnode_init() so that we have one way of initializing a fwnode_handle. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20201121020232.908850-8-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09Revert "driver core: fw_devlink: Add support for batching fwnode parsing"Saravana Kannan
This reverts commit 716a7a25969003d82ab738179c3f1068a120ed11. The fw_devlink_pause/resume() APIs added by the commit being reverted were a first cut attempt at optimizing boot time. But these APIs don't fully solve the problem and are very fragile (can only be used for the top level devices being added). This series replaces them with a much better optimization that works for all device additions and also has the benefit of reducing the complexity of the firmware (DT, EFI) specific code and abstracting out common code to driver core. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20201121020232.908850-7-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09Revert "of: platform: Batch fwnode parsing when adding all top level devices"Saravana Kannan
This reverts commit 93d2e4322aa74c1ad1e8c2160608eb9a960d69ff. The fw_devlink_pause/resume() optimization attempt is getting replaced with a much more robust optimization by the end of this series. So, stop using those APIs. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20201121020232.908850-6-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09Revert "driver core: Remove check in driver_deferred_probe_force_trigger()"Saravana Kannan
This reverts commit fefcfc968723caf93318613a08e1f3ad07a6154f. The reverted commit is fixing commit 716a7a259690 ("driver core: fw_devlink: Add support for batching fwnode parsing"). Since the original commit will be reverted, the fix can be reverted too. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20201121020232.908850-5-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09Revert "driver core: Don't do deferred probe in parallel with kernel_init ↵Saravana Kannan
thread" This reverts commit cec72f3efc6272420c2c2c699607f03d09b93e41. Commit cec72f3efc62 ("driver core: Don't do deferred probe in parallel with kernel_init thread") was fixing a commit 716a7a259690 ("driver core: fw_devlink: Add support for batching fwnode parsing"). Since the commit being fixed itself is going to be reverted, the fix can also be reverted. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20201121020232.908850-4-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09Revert "driver core: Rename dev_links_info.defer_sync to defer_hook"Saravana Kannan
This reverts commit ec7bd78498f29680f536451fbdf9464e851273ed. This field rename was done to reuse defer_syc list head for multiple lists. That's not needed anymore and this list head will only be used for defer sync. So revert this patch to avoid conflicts with the other reverts coming after this. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20201121020232.908850-3-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09Revert "driver core: Avoid deferred probe due to fw_devlink_pause/resume()"Saravana Kannan
This reverts commit 2451e746478a6a6e981cfa66b62b791ca93b90c8. fw_devlink_pause/resume() was an incomplete attempt at boot time optimization. That's going to get replaced by a much better optimization at the end of the series. Since fw_devlink_pause/resume() is going away, changes made for that can also go away. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20201121020232.908850-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09driver: core: Fix list corruption after device_del()Takashi Iwai
The device_links_purge() function (called from device_del()) tries to remove the links.needs_suppliers list entry, but it's using list_del(), hence it doesn't initialize after the removal. This is OK for normal cases where device_del() is called via device_destroy(). However, it's not guaranteed that the device object will be really deleted soon after device_del(). In a minor case like HD-audio codec reconfiguration that re-initializes the device after device_del(), it may lead to a crash by the corrupted list entry. As a simple fix, replace list_del() with list_del_init() in order to make the list intact after the device_del() call. Fixes: e2ae9bcc4aaa ("driver core: Add support for linking devices during device addition") Cc: <stable@vger.kernel.org> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20201208190326.27531-1-tiwai@suse.de Cc: Saravana Kannan <saravanak@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09Merge remote-tracking branch 'arm64/for-next/fixes' into for-next/coreCatalin Marinas
* arm64/for-next/fixes: (26 commits) arm64: mte: fix prctl(PR_GET_TAGGED_ADDR_CTRL) if TCF0=NONE arm64: mte: Fix typo in macro definition arm64: entry: fix EL1 debug transitions arm64: entry: fix NMI {user, kernel}->kernel transitions arm64: entry: fix non-NMI kernel<->kernel transitions arm64: ptrace: prepare for EL1 irq/rcu tracking arm64: entry: fix non-NMI user<->kernel transitions arm64: entry: move el1 irq/nmi logic to C arm64: entry: prepare ret_to_user for function call arm64: entry: move enter_from_user_mode to entry-common.c arm64: entry: mark entry code as noinstr arm64: mark idle code as noinstr arm64: syscall: exit userspace before unmasking exceptions arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect() arm64: pgtable: Fix pte_accessible() ACPI/IORT: Fix doc warnings in iort.c arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build arm64: cpu_errata: Apply Erratum 845719 to KRYO2XX Silver arm64: proton-pack: Add KRYO2XX silver CPUs to spectre-v2 safe-list arm64: kpti: Add KRYO2XX gold/silver CPU cores to kpti safelist ... # Conflicts: # arch/arm64/include/asm/exception.h # arch/arm64/kernel/sdei.c
2020-12-09Merge remote-tracking branch 'arm64/for-next/scs' into for-next/coreCatalin Marinas
* arm64/for-next/scs: arm64: sdei: Push IS_ENABLED() checks down to callee functions arm64: scs: use vmapped IRQ and SDEI shadow stacks scs: switch to vmapped shadow stacks
2020-12-09Merge remote-tracking branch 'arm64/for-next/perf' into for-next/coreCatalin Marinas
* arm64/for-next/perf: perf/imx_ddr: Add system PMU identifier for userspace bindings: perf: imx-ddr: add compatible string arm64: Fix build failure when HARDLOCKUP_DETECTOR_PERF is enabled arm64: Enable perf events based hard lockup detector perf/imx_ddr: Add stop event counters support for i.MX8MP perf/smmuv3: Support sysfs identifier file drivers/perf: hisi: Add identifier sysfs file perf: remove duplicate check on fwnode driver/perf: Add PMU driver for the ARM DMC-620 memory controller
2020-12-09Merge branch 'for-next/misc' into for-next/coreCatalin Marinas
* for-next/misc: : Miscellaneous patches arm64: vmlinux.lds.S: Drop redundant *.init.rodata.* kasan: arm64: set TCR_EL1.TBID1 when enabled arm64: mte: optimize asynchronous tag check fault flag check arm64/mm: add fallback option to allocate virtually contiguous memory arm64/smp: Drop the macro S(x,s) arm64: consistently use reserved_pg_dir arm64: kprobes: Remove redundant kprobe_step_ctx # Conflicts: # arch/arm64/kernel/vmlinux.lds.S
2020-12-09Merge branch 'for-next/uaccess' into for-next/coreCatalin Marinas
* for-next/uaccess: : uaccess routines clean-up and set_fs() removal arm64: mark __system_matches_cap as __maybe_unused arm64: uaccess: remove vestigal UAO support arm64: uaccess: remove redundant PAN toggling arm64: uaccess: remove addr_limit_user_check() arm64: uaccess: remove set_fs() arm64: uaccess cleanup macro naming arm64: uaccess: split user/kernel routines arm64: uaccess: refactor __{get,put}_user arm64: uaccess: simplify __copy_user_flushcache() arm64: uaccess: rename privileged uaccess routines arm64: sdei: explicitly simulate PAN/UAO entry arm64: sdei: move uaccess logic to arch/arm64/ arm64: head.S: always initialize PSTATE arm64: head.S: cleanup SCTLR_ELx initialization arm64: head.S: rename el2_setup -> init_kernel_el arm64: add C wrappers for SET_PSTATE_*() arm64: ensure ERET from kthread is illegal
2020-12-09Merge branches 'for-next/kvm-build-fix', 'for-next/va-refactor', ↵Catalin Marinas
'for-next/lto', 'for-next/mem-hotplug', 'for-next/cppc-ffh', 'for-next/pad-image-header', 'for-next/zone-dma-default-32-bit', 'for-next/signal-tag-bits' and 'for-next/cmdline-extended' into for-next/core * for-next/kvm-build-fix: : Fix KVM build issues with 64K pages KVM: arm64: Fix build error in user_mem_abort() * for-next/va-refactor: : VA layout changes arm64: mm: don't assume struct page is always 64 bytes Documentation/arm64: fix RST layout of memory.rst arm64: mm: tidy up top of kernel VA space arm64: mm: make vmemmap region a projection of the linear region arm64: mm: extend linear region for 52-bit VA configurations * for-next/lto: : Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y arm64: alternatives: Remove READ_ONCE() usage during patch operation arm64: cpufeatures: Add capability for LDAPR instruction arm64: alternatives: Split up alternative.h arm64: uaccess: move uao_* alternatives to asm-uaccess.h * for-next/mem-hotplug: : Memory hotplug improvements arm64/mm/hotplug: Ensure early memory sections are all online arm64/mm/hotplug: Enable MEM_OFFLINE event handling arm64/mm/hotplug: Register boot memory hot remove notifier earlier arm64: mm: account for hotplug memory when randomizing the linear region * for-next/cppc-ffh: : Add CPPC FFH support using arm64 AMU counters arm64: abort counter_read_on_cpu() when irqs_disabled() arm64: implement CPPC FFH support using AMUs arm64: split counter validation function arm64: wrap and generalise counter read functions * for-next/pad-image-header: : Pad Image header to 64KB and unmap it arm64: head: tidy up the Image header definition arm64/head: avoid symbol names pointing into first 64 KB of kernel image arm64: omit [_text, _stext) from permanent kernel mapping * for-next/zone-dma-default-32-bit: : Default to 32-bit wide ZONE_DMA (previously reduced to 1GB for RPi4) of: unittest: Fix build on architectures without CONFIG_OF_ADDRESS mm: Remove examples from enum zone_type comment arm64: mm: Set ZONE_DMA size based on early IORT scan arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges of: unittest: Add test for of_dma_get_max_cpu_address() of/address: Introduce of_dma_get_max_cpu_address() arm64: mm: Move zone_dma_bits initialization into zone_sizes_init() arm64: mm: Move reserve_crashkernel() into mem_init() arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required arm64: Ignore any DMA offsets in the max_zone_phys() calculation * for-next/signal-tag-bits: : Expose the FAR_EL1 tag bits in siginfo arm64: expose FAR_EL1 tag bits in siginfo signal: define the SA_EXPOSE_TAGBITS bit in sa_flags signal: define the SA_UNSUPPORTED bit in sa_flags arch: provide better documentation for the arch-specific SA_* flags signal: clear non-uapi flag bits when passing/returning sa_flags arch: move SA_* definitions to generic headers parisc: start using signal-defs.h parisc: Drop parisc special case for __sighandler_t * for-next/cmdline-extended: : Add support for CONFIG_CMDLINE_EXTENDED arm64: Extend the kernel command line from the bootloader arm64: kaslr: Refactor early init command line parsing
2020-12-09fs/kernfs: remove the double check of dentry->inodeHui Su
In both kernfs_node_from_dentry() and in kernfs_dentry_node(), we will check the dentry->inode is NULL or not, which is superfluous. So remove the check in kernfs_node_from_dentry(). Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Hui Su <sh_def@163.com> Link: https://lore.kernel.org/r/20201113132143.GA119541@rlk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-09Merge tag 'iommu-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull iommu fix from Will Deacon: "Fix interrupt table length definition for AMD IOMMU. It's actually a fix for a fix, where the size of the interrupt remapping table was increased but a related constant for the size of the interrupt table was forgotten" * tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs
2020-12-09USB: serial: ftdi_sio: log the CBUS GPIO validityMarc Zyngier
The validity of the ftdi CBUS GPIO is pretty hidden so far, and finding out *why* some GPIOs don't work is sometimes hard to identify. So let's help the user by displaying the map of the CBUS pins that are valid for a GPIO. Suggested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201204164739.781812-4-maz@kernel.org Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> [johan: demote to KERN_DEBUG, rephrase messages, drop ftx-prog warning] Signed-off-by: Johan Hovold <johan@kernel.org>
2020-12-09USB: serial: ftdi_sio: drop GPIO line checking dead codeMarc Zyngier
Now that gpiolib can track the validity of GPIO pins, there is no need to check whether the line is valid in request(). Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201204164739.781812-5-maz@kernel.org Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> [johan: amend commit message] Signed-off-by: Johan Hovold <johan@kernel.org>
2020-12-09USB: serial: ftdi_sio: report the valid GPIO lines to gpiolibMarc Zyngier
Since it is pretty common for only some of the CBUS lines to be valid as GPIO lines, let's report such validity to the rest of the kernel. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201204164739.781812-3-maz@kernel.org Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2020-12-09ASoC: add simple-muxAlexandre Belloni
Add a driver for simple mux driven by gpios. It currently only supports one gpio, muxing one of two inputs to a single output. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201205001508.346439-2-alexandre.belloni@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-09ASoC: add simple-audio-mux bindingAlexandre Belloni
Add devicetree documentation for simple audio multiplexers Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201205001508.346439-1-alexandre.belloni@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-09ASoC: SOF: Intel: add SoundWire support for ADL-SKai Vehmanen
Expand SOF support for Alder Lake by adding ACPI machine tables for ADL-S systems with SoundWire codecs. Modify kernel config to choose SND_SOC_SOF_INTEL_SOUNDWIRE_LINK_BASELINE for these platforms. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20201209153102.3028310-2-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-09ASoC: Intel: common: add ACPI matching tables for Alder LakeKai Vehmanen
Initial support for ADL w/ RT711 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20201209153102.3028310-1-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-09can: isotp: isotp_setsockopt(): block setsockopt on bound socketsOliver Hartkopp
The isotp socket can be widely configured in its behaviour regarding addressing types, fill-ups, receive pattern tests and link layer length. Usually all these settings need to be fixed before bind() and can not be changed afterwards. This patch adds a check to enforce the common usage pattern. Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Thomas Wagner <thwa1@web.de> Link: https://lore.kernel.org/r/20201203140604.25488-2-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20201204133508.742120-3-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-09ice: Add space to unknown speedSimon Perron Caissy
Add space to the end of 'Unknown' string in order to avoid concatenation with 'bps' string when formatting netdev log message. Signed-off-by: Simon Perron Caissy <simon.perron.caissy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: join format strings to same line as ice_debugJacob Keller
When printing messages with ice_debug, align the printed string to the origin line of the message in order to ease debugging and tracking messages back to their source. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: silence static analysis warningBruce Allan
sparse warns about cast to/from restricted types which is not an actual problem; silence the warning. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: cleanup misleading commentBruce Allan
The maximum Admin Queue buffer size and NVM shadow RAM sector size are both 4 Kilobytes. Some comments refer to those as 4Kb which can be confused with 4 Kilobits. Update the comments to use the commonly used KB symbol instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: Remove vlan_ena from vsi structureNick Nunley
vlan_ena was introduced to track whether VLAN filters are enabled on the device, but 1) checking for num_vlan > 1 already gives us this information, and is currently used in this way throughout the code 2) the logic for vlan_ena is broken when multiple VLANs are active Just remove vlan_ena and use num_vlan instead. Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: Remove gate to OROM initJeb Cramer
Remove the gate that prevents the OROM and netlist info from being populated. The NVM now has the appropriate section for software to reference the versioning info. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: Enable Support for FW Override (E82X)Jeb Cramer
The driver is able to override the firmware when it comes to supporting a more lenient link mode. This feature was limited to E810 devices. It is now extended to E82X devices. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: don't always return an error for Get PHY Abilities AQ commandPaul M Stillwell Jr
There are times when the driver shouldn't return an error when the Get PHY abilities AQ command (0x0600) returns an error. Instead the driver should log that the error occurred and continue on. This allows the driver to load even though the AQ command failed. The user can then later determine the reason for the failure and correct it. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: cleanup stack hogBruce Allan
In ice_flow_add_prof_sync(), struct ice_flow_prof_params has recently grown in size hogging stack space when allocated there. Hogging stack space should be avoided. Change allocation to be on the heap when needed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Harikumar Bokkena <harikumarx.bokkena@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09perf/x86/intel: Add Tremont Topdown supportKan Liang
Tremont has four L1 Topdown events, TOPDOWN_FE_BOUND.ALL, TOPDOWN_BAD_SPECULATION.ALL, TOPDOWN_BE_BOUND.ALL and TOPDOWN_RETIRING.ALL. They are available on GP counters. Export them to sysfs and facilitate the perf stat tool. $perf stat --topdown -- sleep 1 Performance counter stats for 'sleep 1': retiring bad speculation frontend bound backend bound 24.9% 16.8% 31.7% 26.6% 1.001224610 seconds time elapsed 0.001150000 seconds user 0.000000000 seconds sys Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1607457952-3519-1-git-send-email-kan.liang@linux.intel.com
2020-12-09uprobes/x86: Fix fall-through warnings for ClangGustavo A. R. Silva
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://github.com/KSPP/linux/issues/115
2020-12-09perf/x86: Fix fall-through warnings for ClangGustavo A. R. Silva
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a fallthrough pseudo-keyword as a replacement for a /* fall through */ comment, instead of letting the code fall through to the next case. Notice that Clang doesn't recognize /* fall through */ comments as implicit fall-through markings. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://github.com/KSPP/linux/issues/115
2020-12-09kprobes/x86: Fix fall-through warnings for ClangGustavo A. R. Silva
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of just letting the code fall through to the next case. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://github.com/KSPP/linux/issues/115
2020-12-09perf/x86/intel/lbr: Fix the return type of get_lbr_cycles()Kan Liang
The cycle count of a timed LBR is always 1 in perf record -D. The cycle count is stored in the first 16 bits of the IA32_LBR_x_INFO register, but the get_lbr_cycles() return Boolean type. Use u16 to replace the Boolean type. Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR") Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201125213720.15692-2-kan.liang@linux.intel.com
2020-12-09perf/x86/intel: Fix rtm_abort_event encoding on Ice LakeKan Liang
According to the event list from icelake_core_v1.09.json, the encoding of the RTM_RETIRED.ABORTED event on Ice Lake should be, "EventCode": "0xc9", "UMask": "0x04", "EventName": "RTM_RETIRED.ABORTED", Correct the wrong encoding. Fixes: 6017608936c1 ("perf/x86/intel: Add Icelake support") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201125213720.15692-1-kan.liang@linux.intel.com
2020-12-09x86/kprobes: Restore BTF if the single-stepping is cancelledMasami Hiramatsu
Fix to restore BTF if single-stepping causes a page fault and it is cancelled. Usually the BTF flag was restored when the single stepping is done (in resume_execution()). However, if a page fault happens on the single stepping instruction, the fault handler is invoked and the single stepping is cancelled. Thus, the BTF flag is not restored. Fixes: 1ecc798c6764 ("x86: debugctlmsr kprobes") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/160389546985.106936.12727996109376240993.stgit@devnote2
2020-12-09perf: Break deadlock involving exec_update_mutexpeterz@infradead.org
Syzbot reported a lock inversion involving perf. The sore point being perf holding exec_update_mutex() for a very long time, specifically across a whole bunch of filesystem ops in pmu::event_init() (uprobes) and anon_inode_getfile(). This then inverts against procfs code trying to take exec_update_mutex. Move the permission checks later, such that we need to hold the mutex over less code. Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-12-09sparc64/mm: Implement pXX_leaf_size() supportPeter Zijlstra
Sparc64 has non-pagetable aligned large page support; wire up the pXX_leaf_size() functions to report the correct pagetable page size. This enables PERF_SAMPLE_{DATA,CODE}_PAGE_SIZE to report accurate pagetable leaf sizes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201126121121.301768209@infradead.org
2020-12-09powerpc/8xx: Implement pXX_leaf_size() supportPeter Zijlstra
Christophe Leroy wrote: > I can help with powerpc 8xx. It is a 32 bits powerpc. The PGD has 1024 > entries, that means each entry maps 4M. > > Page sizes are 4k, 16k, 512k and 8M. > > For the 8M pages we use hugepd with a single entry. The two related PGD > entries point to the same hugepd. > > For the other sizes, they are in standard page tables. 16k pages appear > 4 times in the page table. 512k entries appear 128 times in the page > table. > > When the PGD entry has _PMD_PAGE_8M bits, the PMD entry points to a > hugepd with holds the single 8M entry. > > In the PTE, we have two bits: _PAGE_SPS and _PAGE_HUGE > > _PAGE_HUGE means it is a 512k page > _PAGE_SPS means it is not a 4k page > > The kernel can by build either with 4k pages as standard page size, or > 16k pages. It doesn't change the page table layout though. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201126121121.364451610@infradead.org
2020-12-09seqlock: kernel-doc: Specify when preemption is automatically alteredAhmed S. Darwish
The kernel-doc annotations for sequence counters write side functions are incomplete: they do not specify when preemption is automatically disabled and re-enabled. This has confused a number of call-site developers. Fix it. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/CAHk-=wikhGExmprXgaW+MVXG1zsGpztBbVwOb23vetk41EtTBQ@mail.gmail.com
2020-12-09seqlock: Prefix internal seqcount_t-only macros with a "do_"Ahmed S. Darwish
When the seqcount_LOCKNAME_t group of data types were introduced, two classes of seqlock.h sequence counter macros were added: - An external public API which can either take a plain seqcount_t or any of the seqcount_LOCKNAME_t variants. - An internal API which takes only a plain seqcount_t. To distinguish between the two groups, the "*_seqcount_t_*" pattern was used for the latter. This confused a number of mm/ call-site developers, and Linus also commented that it was not a standard practice for marking seqlock.h internal APIs. Distinguish the latter group of macros by prefixing a "do_". Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/CAHk-=wikhGExmprXgaW+MVXG1zsGpztBbVwOb23vetk41EtTBQ@mail.gmail.com
2020-12-09Documentation: seqlock: s/LOCKTYPE/LOCKNAME/gAhmed S. Darwish
Sequence counters with an associated write serialization lock are called seqcount_LOCKNAME_t. Fix the documentation accordingly. While at it, remove a paragraph that inappropriately discussed a seqlock.h implementation detail. Fixes: 6dd699b13d53 ("seqlock: seqcount_LOCKNAME_t: Standardize naming convention") Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20201206162143.14387-2-a.darwish@linutronix.de
2020-12-09locking/rwsem: Remove reader optimistic spinningWaiman Long
Reader optimistic spinning is helpful when the reader critical section is short and there aren't that many readers around. It also improves the chance that a reader can get the lock as writer optimistic spinning disproportionally favors writers much more than readers. Since commit d3681e269fff ("locking/rwsem: Wake up almost all readers in wait queue"), all the waiting readers are woken up so that they can all get the read lock and run in parallel. When the number of contending readers is large, allowing reader optimistic spinning will likely cause reader fragmentation where multiple smaller groups of readers can get the read lock in a sequential manner separated by writers. That reduces reader parallelism. One possible way to address that drawback is to limit the number of readers (preferably one) that can do optimistic spinning. These readers act as representatives of all the waiting readers in the wait queue as they will wake up all those waiting readers once they get the lock. Alternatively, as reader optimistic lock stealing has already enhanced fairness to readers, it may be easier to just remove reader optimistic spinning and simplifying the optimistic spinning code as a result. Performance measurements (locking throughput kops/s) using a locking microbenchmark with 50/50 reader/writer distribution and turbo-boost disabled was done on a 2-socket Cascade Lake system (48-core 96-thread) to see the impacts of these changes: 1) Vanilla - 5.10-rc3 kernel 2) Before - 5.10-rc3 kernel with previous patches in this series 2) limit-rspin - 5.10-rc3 kernel with limited reader spinning patch 3) no-rspin - 5.10-rc3 kernel with reader spinning disabled # of threads CS Load Vanilla Before limit-rspin no-rspin ------------ ------- ------- ------ ----------- -------- 2 1 5,185 5,662 5,214 5,077 4 1 5,107 4,983 5,188 4,760 8 1 4,782 4,564 4,720 4,628 16 1 4,680 4,053 4,567 3,402 32 1 4,299 1,115 1,118 1,098 64 1 3,218 983 1,001 957 96 1 1,938 944 957 930 2 20 2,008 2,128 2,264 1,665 4 20 1,390 1,033 1,046 1,101 8 20 1,472 1,155 1,098 1,213 16 20 1,332 1,077 1,089 1,122 32 20 967 914 917 980 64 20 787 874 891 858 96 20 730 836 847 844 2 100 372 356 360 355 4 100 492 425 434 392 8 100 533 537 529 538 16 100 548 572 568 598 32 100 499 520 527 537 64 100 466 517 526 512 96 100 406 497 506 509 The column "CS Load" represents the number of pause instructions issued in the locking critical section. A CS load of 1 is extremely short and is not likey in real situations. A load of 20 (moderate) and 100 (long) are more realistic. It can be seen that the previous patches in this series have reduced performance in general except in highly contended cases with moderate or long critical sections that performance improves a bit. This change is mostly caused by the "Prevent potential lock starvation" patch that reduce reader optimistic spinning and hence reduce reader fragmentation. The patch that further limit reader optimistic spinning doesn't seem to have too much impact on overall performance as shown in the benchmark data. The patch that disables reader optimistic spinning shows reduced performance at lightly loaded cases, but comparable or slightly better performance on with heavier contention. This patch just removes reader optimistic spinning for now. As readers are not going to do optimistic spinning anymore, we don't need to consider if the OSQ is empty or not when doing lock stealing. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lkml.kernel.org/r/20201121041416.12285-6-longman@redhat.com
2020-12-09locking/rwsem: Enable reader optimistic lock stealingWaiman Long
If the optimistic spinning queue is empty and the rwsem does not have the handoff or write-lock bits set, it is actually not necessary to call rwsem_optimistic_spin() to spin on it. Instead, it can steal the lock directly as its reader bias is in the count already. If it is the first reader in this state, it will try to wake up other readers in the wait queue. With this patch applied, the following were the lock event counts after rebooting a 2-socket system and a "make -j96" kernel rebuild. rwsem_opt_rlock=4437 rwsem_rlock=29 rwsem_rlock_steal=19 So lock stealing represents about 0.4% of all the read locks acquired in the slow path. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lkml.kernel.org/r/20201121041416.12285-4-longman@redhat.com
2020-12-09locking/rwsem: Prevent potential lock starvationWaiman Long
The lock handoff bit is added in commit 4f23dbc1e657 ("locking/rwsem: Implement lock handoff to prevent lock starvation") to avoid lock starvation. However, allowing readers to do optimistic spinning does introduce an unlikely scenario where lock starvation can happen. The lock handoff bit may only be set when a waiter is being woken up. In the case of reader unlock, wakeup happens only when the reader count reaches 0. If there is a continuous stream of incoming readers acquiring read lock via optimistic spinning, it is possible that the reader count may never reach 0 and so the handoff bit will never be asserted. One way to prevent this scenario from happening is to disallow optimistic spinning if the rwsem is currently owned by readers. If the previous or current owner is a writer, optimistic spinning will be allowed. If the previous owner is a reader but the reader count has reached 0 before, a wakeup should have been issued. So the handoff mechanism will be kicked in to prevent lock starvation. As a result, it should be OK to do optimistic spinning in this case. This patch may have some impact on reader performance as it reduces reader optimistic spinning especially if the lock critical sections are short the number of contending readers are small. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lkml.kernel.org/r/20201121041416.12285-3-longman@redhat.com
2020-12-09locking/rwsem: Pass the current atomic count to rwsem_down_read_slowpath()Waiman Long
The atomic count value right after reader count increment can be useful to determine the rwsem state at trylock time. So the count value is passed down to rwsem_down_read_slowpath() to be used when appropriate. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lkml.kernel.org/r/20201121041416.12285-2-longman@redhat.com