Preemption mode selection is currently hardcoded through Kconfig choices.
Introduce a dedicated option to tune the preemption flavour at boot time.
This will only be available on architectures that support static calls
efficiently, so that the feature does not come with additional overhead
that might be prohibitive or undesirable.
CONFIG_PREEMPT_DYNAMIC is automatically selected by CONFIG_PREEMPT if
the architecture provides the necessary support (CONFIG_STATIC_CALL_INLINE,
CONFIG_GENERIC_ENTRY, and provides __preempt_schedule_function() /
__preempt_schedule_notrace_function()).
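A minimal sketch of the mechanism this enables, with illustrative names
(a hedged example, not the kernel's exact code): the preemption entry
point becomes a static call whose target is re-patched according to the
flavour chosen at boot.

  /* Illustrative: route the preemption point through a static call. */
  DEFINE_STATIC_CALL(preempt_schedule, __preempt_schedule_function);

  /* Call sites compile to a direct CALL that can be re-patched. */
  static_call(preempt_schedule)();

  /* Boot-time selection then amounts to updating the target, e.g.
   * patching the call sites out entirely for flavour "none". */
  static_call_update(preempt_schedule, NULL);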
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
[peterz: relax requirement to HAVE_STATIC_CALL]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-5-frederic@kernel.org
|
|
DECLARE_STATIC_CALL() must pass the original function targeted for a
given static call. But DEFINE_STATIC_CALL() may want to initialize it as
disabled. In this case we can't pass NULL (for functions without a return
value) or __static_call_return0 (for functions returning a value) directly
to DEFINE_STATIC_CALL(), as that may trigger a static call redeclaration
with a different function prototype. Type casts can't work around that
either, as they don't get along with typeof().
The proper way to do this for functions that don't return a value is
to use DEFINE_STATIC_CALL_NULL(). But functions returning an actual value
don't have an equivalent yet.
Provide DEFINE_STATIC_CALL_RET0() to solve this situation.
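As a hedged illustration (the call name here is only an example), a
static call over an int-returning function can now default to the
return-0 stub without a prototype mismatch:

  /* Original target: int __cond_resched(void); */
  DECLARE_STATIC_CALL(might_resched, __cond_resched);

  /* Initialized as "off": call sites behave as if returning 0.
   * Passing NULL or a cast stub would redeclare the call with a
   * different prototype; the dedicated macro avoids that. */
  DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);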
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-3-frederic@kernel.org
|
|
Provide a stub function that returns 0 and wire up the static call site
patching to replace the CALL with a single 5 byte instruction that
clears %RAX, the return value register.
The function can be cast to any function pointer type that has a
single %RAX return (including pointers). Also provide a version that
returns an int for convenience. We are clearing the entire %RAX register
in any case, whether the return value is 32 or 64 bits, since %RAX is
always a scratch register anyway.
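For reference, a hedged sketch of that 5-byte sequence (the exact
padding bytes are an assumption for illustration):

  /* Same length as the CALL rel32 it replaces, so it can be patched
   * in place: */
  static const u8 xor5rax[] = {
          0x2e, 0x2e, 0x2e,  /* segment-override prefixes as padding */
          0x31, 0xc0,        /* xor %eax,%eax: zero-extends, clearing %rax */
  };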
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-2-frederic@kernel.org
|
|
Some static call declarations are going to be needed in low-level header
files. Move the necessary material to the dedicated static call types
header to avoid inclusion dependency hell.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-4-frederic@kernel.org
|
|
The description of the RT offset and the values for 'normal' tasks need
updating. Moreover, there are DL tasks now.
task_prio() has to stay like it is to guarantee compatibility with the
/proc/<pid>/stat priority field:
# cat /proc/<pid>/stat | awk '{ print $18; }'
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-4-dietmar.eggemann@arm.com
|
|
The only remaining use of MAX_USER_PRIO (and USER_PRIO) is the
SCALE_PRIO() definition in the PowerPC Cell architecture's Synergistic
Processor Unit (SPU) scheduler. TASK_USER_PRIO isn't used anymore.
Commit fe443ef2ac42 ("[POWERPC] spusched: Dynamic timeslicing for
SCHED_OTHER") copied SCALE_PRIO() from the task scheduler in v2.6.23.
Commit a4ec24b48dde ("sched: tidy up SCHED_RR") removed it from the task
scheduler in v2.6.24.
Commit 3ee237dddcd8 ("sched/prio: Add 3 macros of MAX_NICE, MIN_NICE and
NICE_WIDTH in prio.h") introduced NICE_WIDTH much later.
With:

  MAX_USER_PRIO = USER_PRIO(MAX_PRIO)
                = MAX_PRIO - MAX_RT_PRIO

  MAX_PRIO      = MAX_RT_PRIO + NICE_WIDTH

it follows that:

  MAX_USER_PRIO = MAX_RT_PRIO + NICE_WIDTH - MAX_RT_PRIO
                = NICE_WIDTH
MAX_USER_PRIO can be replaced by NICE_WIDTH to be able to remove all the
{*_}USER_PRIO defines.
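For context, the definitions involved (a hedged reconstruction of
include/linux/sched/prio.h at the time):

  #define MAX_RT_PRIO    100
  #define NICE_WIDTH     40
  #define MAX_PRIO       (MAX_RT_PRIO + NICE_WIDTH)   /* 140 */
  #define USER_PRIO(p)   ((p) - MAX_RT_PRIO)
  #define MAX_USER_PRIO  (USER_PRIO(MAX_PRIO))        /* == NICE_WIDTH */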
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-3-dietmar.eggemann@arm.com
|
|
Commit d46523ea32a7 ("[PATCH] fix MAX_USER_RT_PRIO and MAX_RT_PRIO")
was introduced due to a small time period in which the realtime patch
set was using different values for MAX_USER_RT_PRIO and MAX_RT_PRIO.
This is no longer true, i.e. now MAX_RT_PRIO == MAX_USER_RT_PRIO.
Get rid of MAX_USER_RT_PRIO and make everything use MAX_RT_PRIO
instead.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-2-dietmar.eggemann@arm.com
|
|
Commit "sched/topology: Make sched_init_numa() use a set for the
deduplicating sort" allocates 'i + nr_levels (level)' instead of
'i + nr_levels + 1' sched_domain_topology_level.
This led to an Oops (on Arm64 juno with CONFIG_SCHED_DEBUG):
  sched_init_domains()
    build_sched_domains()
      __free_domain_allocs()
        __sdt_free() {
          ...
          for_each_sd_topology(tl)
            ...
            sd = *per_cpu_ptr(sdd->sd, j); <-- Oops
          ...
        }
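A hedged sketch of the corrected allocation (the surrounding code is
reconstructed; the point is the restored '+ 1'):

  /* One extra, zeroed entry terminates the array walked by
   * for_each_sd_topology(). */
  tl = kzalloc((i + nr_levels + 1) *
               sizeof(struct sched_domain_topology_level), GFP_KERNEL);
  if (!tl)
          return;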
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Barry Song <song.bao.hua@hisilicon.com>
Link: https://lkml.kernel.org/r/6000e39e-7d28-c360-9cd6-8798fd22a9bf@arm.com
|
|
Reduce rbtree boilerplate by using the new helpers.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
|
|
Reduce rbtree boilerplate by using the new helpers.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
|
|
Reduce rbtree boilerplate by using the new helpers.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
|
|
Reduce rbtree boilerplate by using the new helpers.
One noteworthy change is unification of the various (partial) compare
functions. We construct a subtree match by forcing the sub-order to
always match, see __group_cmp().
Due to 'const' we had to touch cgroup_id().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
|
|
Reduce rbtree boilerplate by using the new helpers.
Make rb_add_cached() / rb_erase_cached() return a pointer to the
leftmost node to aid in updating additional state.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
|
|
Reduce rbtree boilerplate by using the new helper function.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
|
|
I've always been bothered by the endless (fragile) boilerplate for
rbtree, and I recently wrote some rbtree helpers for objtool and
figured I should lift them into the kernel and use them more widely.
Provide:
partial-order; less() based:
- rb_add(): add a new entry to the rbtree
- rb_add_cached(): like rb_add(), but for a rb_root_cached
total-order; cmp() based:
- rb_find(): find an entry in an rbtree
- rb_find_add(): find an entry, and add if not found
- rb_find_first(): find the first (leftmost) matching entry
- rb_next_match(): continue from rb_find_first()
- rb_for_each(): iterate a sub-tree using the previous two
Inlining and constant propagation should see the compiler inline the
whole thing, including the various compare functions.
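A hedged usage sketch of the less()/cmp() split (struct and key names
are illustrative):

  struct item {
          struct rb_node node;
          u64 key;
  };

  static bool item_less(struct rb_node *a, const struct rb_node *b)
  {
          return rb_entry(a, struct item, node)->key <
                 rb_entry(b, struct item, node)->key;
  }

  static int item_cmp(const void *key, const struct rb_node *n)
  {
          u64 k = *(const u64 *)key;
          u64 nk = rb_entry(n, struct item, node)->key;

          return k < nk ? -1 : k > nk ? 1 : 0;
  }

  struct rb_root root = RB_ROOT;
  u64 needle = 42;

  rb_add(&it->node, &root, item_less);                     /* partial order */
  struct rb_node *hit = rb_find(&needle, &root, item_cmp); /* total order */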
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
|
|
Both select_idle_core() and select_idle_cpu() do a loop over the same
cpumask. Observe that by clearing the already visited CPUs, we can
fold the iteration and iterate a core at a time.
All we need to do is remember any non-idle CPU we encountered while
scanning for an idle core. This way we'll only iterate every CPU once.
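A hedged sketch of the folded per-core scan (simplified; the scheduler
helpers named here exist, but details are trimmed):

  static int select_idle_core(struct task_struct *p, int core,
                              struct cpumask *cpus, int *idle_cpu)
  {
          bool idle = true;
          int cpu;

          for_each_cpu(cpu, cpu_smt_mask(core)) {
                  if (!available_idle_cpu(cpu))
                          idle = false;
                  else if (*idle_cpu == -1)
                          *idle_cpu = cpu;   /* remember a fallback CPU */
          }

          /* Clear the whole core so no CPU is visited twice. */
          cpumask_andnot(cpus, cpus, cpu_smt_mask(core));

          return idle ? core : -1;
  }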
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210127135203.19633-5-mgorman@techsingularity.net
|
|
In order to make the next patch more readable, and to quantify the
actual effectiveness of this pass, start by removing it.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210125085909.4600-4-mgorman@techsingularity.net
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
With LTO, there are symbols like these:
/usr/lib/debug/usr/lib64/libantlr4-runtime.so.4.8-4.8-1.4.x86_64.debug
10305: 0000000000955fa4 0 NOTYPE LOCAL DEFAULT 29 Predicate.cpp.2bc410e7
This comes from a runtime/debug split done by the standard way:
objcopy --only-keep-debug $runtime $debug
objcopy --add-gnu-debuglink=$debugfn -R .comment -R .GCC.command.line --strip-all $runtime
perf currently cannot resolve such symbols (relics of LTO), as section
29 exists only in the debug file (29 is .debug_info), and perf resolves
symbols only against the runtime file. This results in all symbols from
such a library being unresolved:
 0.38%  main2  libantlr4-runtime.so.4.8  [.] 0x00000000000671e0
So try resolving against the debug file first, and only if that fails
(the section has NOBITS set), try the runtime file. We can do this, as
"objcopy --only-keep-debug" per its documentation preserves all sections,
but clears the data of some of them (the runtime ones) and marks them as
NOBITS.
The correct result is now:
0.38% main2 libantlr4-runtime.so.4.8 [.] antlr4::IntStream::~IntStream
Note that these LTO symbols are properly skipped anyway as they belong
neither to *text* nor to *data* (is_label && !elf_sec__filter(&shdr,
secstrs) is true).
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210217122125.26416-1-jslaby@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The surface_state_base is an offset into the batch, so we need to pass
the correct batch address for STATE_BASE_ADDRESS.
Fixes: 47f8253d2b89 ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Link: https://patchwork.freedesktop.org/patch/msgid/20210210122728.20097-1-chris@chris-wilson.co.uk
(cherry picked from commit 1914911f4aa08ddc05bae71d3516419463e0c567)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
ilk+ planes get notably unhappy when the plane x+w exceeds
the stride. This wasn't a problem previously because we
always aligned SURF to the closest tile boundary so the
x offset never got particularly large. But now with async
flips we have to align to 256KiB instead and thus this
becomes a real issue.
On ilk/snb/ivb it looks like the accesses just wrap early to the next
tile row when scanout goes past the SURF+n*stride boundary; hsw/bdw
suffer more heavily and start to underrun constantly. i965/g4x appear
to be immune. vlv/chv I've not yet checked.
Let's borrow another trick from the skl+ code and search
backwards for a better SURF offset in the hopes of getting the
x offset below the limit. IIRC when I ran into a similar issue
on skl years ago it was causing the hardware to fall over
pretty hard as well.
And let's be consistent and include i965/g4x in the check
as well, just in case I just got super lucky somehow when
I wasn't able to reproduce the issue. Not that it really
matters since we still use 4k SURF alignment for i965/g4x
anyway.
Fixes: 6ede6b0616b2 ("drm/i915: Implement async flips for vlv/chv")
Fixes: 4bb18054adc4 ("drm/i915: Implement async flip for ilk/snb")
Fixes: 2a636e240c77 ("drm/i915: Implement async flip for ivb/hsw")
Fixes: cda195f13abd ("drm/i915: Implement async flips for bdw")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210209021918.16234-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 59fb8218c8e5001f854e7d5fdb5fb135cba58102)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo also exported some functions from intel_display.c during backport]
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Series X|S
Signed-off-by: Olivier Crête <olivier.crete@ocrete.ca>
Link: https://lore.kernel.org/r/20210204005318.615647-1-olivier.crete@collabora.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
If v4l2_ctrl_handler_setup() fails then probe() should return an error
code instead of returning success.
Fixes: cee1e3e2ef39 ("media: add video control handlers using V4L2 control framework")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YBKFkbATXa5fA3xj@mwanda
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/20210217040958.1354670-9-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add an initial set of formal commands beyond basic identify and command
enumeration.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v2)
Link: https://lore.kernel.org/r/20210217040958.1354670-8-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
CXL devices identified by the memory-device class code must implement
the Device Command Interface (described in 8.2.9 of the CXL 2.0 spec).
While the driver already maintains a list of commands it supports, there
is still a need to distinguish the commands that the driver knows about
from those that are optionally supported by the hardware.
The Command Effects Log (CEL) is specified in the CXL 2.0 specification.
The CEL is one of two types of logs, the other being vendor specific.
They are distinguished in hardware/spec via UUID. The CEL is useful for
two things:
1. Determine which optional commands are supported by the CXL device.
2. Enumerate any vendor specific commands
The CEL is used by the driver to determine which commands are available
in the hardware and therefore which commands userspace is allowed to
execute. The set of enabled commands might be a subset of commands which
are advertised in UAPI via CXL_MEM_SEND_COMMAND IOCTL.
With the CEL enabling comes an internal flag to indicate a base set of
commands that are enabled regardless of the CEL. Such commands are
required for basic interaction with the hardware and thus can be useful
in debug cases, for example if the CEL is corrupted.
The implementation leaves the statically defined table of commands and
supplements it with a bitmap to determine commands that are enabled.
This organization was chosen for the following reasons:
- Smaller memory footprint. Doesn't need a table per device.
- Reduce memory allocation complexity.
- Fixed command IDs to opcode mapping for all devices makes development
and debugging easier.
- Certain helpers are easily achievable, like cxl_for_each_cmd().
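A hedged sketch of that organization (names are illustrative of the
description above, not a verbatim copy of the driver):

  struct cxl_mem_command {
          u16 opcode;     /* fixed command-ID -> opcode mapping */
          u32 flags;      /* e.g. enabled regardless of the CEL */
  };

  /* One static table shared by all devices... */
  static const struct cxl_mem_command mem_commands[CXL_MEM_COMMAND_ID_MAX];

  /* ...plus a small per-device bitmap of what the CEL enabled. */
  struct cxl_mem {
          DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
  };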
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v3)
Link: https://lore.kernel.org/r/20210217040958.1354670-7-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The CXL memory device send interface will have a number of supported
commands. The raw command is not such a command. Raw commands allow
userspace to send a specified opcode to the underlying hardware and
bypass all driver checks on the command. The primary use for this
command is to [begrudgingly] allow undocumented vendor specific hardware
commands.
While not the main motivation, it also allows prototyping new hardware
commands without a driver patch and rebuild.
While this all sounds very powerful it comes with a couple of caveats:
1. Bug reports using raw commands will not get the same level of
attention as bug reports using supported commands (via taint).
2. Supported commands will be rejected by the RAW command.
With this comes a new debugfs knob to allow full access to your toes
with your weapon of choice.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ariel Sibley <Ariel.Sibley@microchip.com>
Link: https://lore.kernel.org/r/20210217040958.1354670-6-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add a straightforward IOCTL that provides a mechanism for userspace to
query the supported memory device commands. CXL commands as they appear
to userspace are described as part of the UAPI kerneldoc. The command
list returned via this IOCTL will contain the full set of commands that
the driver supports, however, some of those commands may not be
available for use by userspace.
Memory device commands first appear in the CXL 2.0 specification. They
are submitted through a mailbox mechanism specified in the CXL 2.0
specification.
The send command allows userspace to issue mailbox commands directly to
the hardware. The list of available commands to send are the output of
the query command. The driver verifies basic properties of the command
and possibly inspects the input (or output) payload to determine whether
or not the command is allowed (or might taint the kernel).
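A hedged userspace sketch of the query/send split (the ioctl names
follow the UAPI described here; exact fields are illustrative):

  /* Ask how many commands the driver supports... */
  struct cxl_mem_query_commands q = { .n_commands = 0 };
  ioctl(fd, CXL_MEM_QUERY_COMMANDS, &q);

  /* ...then allocate q.n_commands entries, query again to fill them,
   * and issue an enabled one via CXL_MEM_SEND_COMMAND. */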
Reported-by: kernel test robot <lkp@intel.com> # bug in earlier revision
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20210217040958.1354670-5-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Create the /sys/bus/cxl hierarchy to enumerate:
* Memory Devices (per-endpoint control devices)
* Memory Address Space Devices (platform address ranges with
interleaving, performance, and persistence attributes)
* Memory Regions (active provisioned memory from an address space device
that is in use as System RAM or delegated to libnvdimm as Persistent
Memory regions).
For now, only the per-endpoint control devices are registered on the
'cxl' bus. However, going forward it will provide a mechanism to
coordinate cross-device interleave.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v2)
Link: https://lore.kernel.org/r/20210217040958.1354670-4-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Provide enough functionality to utilize the mailbox of a memory device.
The mailbox is used to interact with the firmware running on the memory
device. The flow is proven with one implemented command, "identify",
because the class code has already told the driver this is a memory
device, and the identify command is mandatory.
CXL devices contain an array of capabilities that describe the
interactions software can have with the device or firmware running on
the device. A CXL compliant device must implement the device status and
the mailbox capability. Additionally, a CXL compliant memory device must
implement the memory device capability. Each of the capabilities can
[will] provide an offset within the MMIO region for interacting with the
CXL device.
The capabilities tell the driver how to find and map the register space
for CXL Memory Devices. The registers are required to utilize the CXL
spec defined mailbox interface. The spec outlines two mailboxes, primary
and secondary. The secondary mailbox is earmarked for system firmware,
and not handled in this driver.
Primary mailboxes are capable of generating an interrupt when submitting
a background command. That implementation is saved for a later time.
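A hedged sketch of the polled primary-mailbox handshake per CXL 2.0
section 8.2.8.4 (register names follow the spec's mailbox layout;
offsets and details are elided):

  /* Write the command (and payload), ring the doorbell, poll it
   * clear, then read back the status. */
  writeq(cmd, mbox + CXLDEV_MBOX_CMD_OFFSET);
  writel(CXLDEV_MBOX_CTRL_DOORBELL, mbox + CXLDEV_MBOX_CTRL_OFFSET);

  while (readl(mbox + CXLDEV_MBOX_CTRL_OFFSET) & CXLDEV_MBOX_CTRL_DOORBELL)
          cpu_relax();  /* background commands would use the IRQ instead */

  status = readq(mbox + CXLDEV_MBOX_STATUS_OFFSET);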
Reported-by: Colin Ian King <colin.king@canonical.com> (coverity)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> (smatch)
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Link: https://www.computeexpresslink.org/download-the-specification
Link: https://lore.kernel.org/r/20210217040958.1354670-3-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The CXL.mem protocol allows a device to act as a provider of "System
RAM" and/or "Persistent Memory" that is fully coherent as if the memory
was attached to the typical CPU memory controller.
With the CXL-2.0 specification a PCI endpoint can implement a "Type-3"
device interface and give the operating system control over "Host
Managed Device Memory". See section 2.3 Type 3 CXL Device.
The memory range exported by the device may optionally be described by
the platform firmware memory map, or by infrastructure like LIBNVDIMM to
provision persistent memory capacity from one, or more, CXL.mem devices.
A pre-requisite for Linux-managed memory-capacity provisioning is this
cxl_mem driver that can speak the mailbox protocol defined in section
8.2.8.4 Mailbox Registers.
For now just land the initial driver boilerplate and Documentation/
infrastructure.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: David Rientjes <rientjes@google.com> (v1)
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://www.computeexpresslink.org/download-the-specification
Link: https://lore.kernel.org/r/20210217040958.1354670-2-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The driver core ignores the return value of struct bus_type::remove()
because there is little that can be done about a failure there. To
simplify the quest to make this function return void, let struct
dax_device_driver::remove() return void, too. All users already
unconditionally return 0; this commit makes it obvious that returning
an error code isn't intended.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-6-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The dax core properly handles a dax driver not having a remove callback.
So drop it without changing the effective behaviour.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-5-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The static variable match_always_count is supposed to track if there is
a driver registered that has .match_always set. If driver_register()
fails, the previous increment must be undone.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-4-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
If all resources are allocated in .probe() using devm_ functions it
might make sense to not provide a .remove() callback. Then the right
thing is to just return success.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-3-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The bus probe function dax_bus_probe() calls the probe callback without
checking it to be non-NULL. Prevent a NULL pointer exception if a driver
without a probe function is registered by refusing to register this
driver.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-2-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
All drivers return 0 in their remove callback and the driver core ignores
the return value of nvdimm_bus_remove() anyhow. So simplify by changing
the driver remove callback to return void and return 0 unconditionally
to the upper layer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210212171043.2136580-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
nvdimm_remove is only ever called after nvdimm_probe() returned
successfully. In this case driver data is always set to a non-NULL value
so the check for driver data being NULL can go away as it's always false.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210212171043.2136580-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
Also support transmitting frames using the custom "8899 A" 4-byte tag.
Qingfang came up with the solution: we need to pad the Ethernet frame
to 60 bytes using eth_skb_pad(); the switch will then happily accept
frames with custom tags.
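A hedged sketch of the relevant part of the xmit path (context
trimmed):

  /* Pad to the 60-byte minimum before inserting the 4-byte tag,
   * otherwise the switch drops the frame. */
  if (unlikely(eth_skb_pad(skb)))
          return NULL;  /* eth_skb_pad() frees the skb on failure */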
Cc: Mauri Sandberg <sandberg@mailfence.com>
Reported-by: DENG Qingfang <dqfext@gmail.com>
Fixes: efd7fe68f0c6 ("net: dsa: tag_rtl4_a: Implement Realtek 4 byte A tag")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
smatch gives the warning:
  drivers/infiniband/ulp/rtrs/rtrs-srv.c:1805 rtrs_rdma_connect() warn: passing zero to 'PTR_ERR'
which is to say smatch has shown that srv is not an error pointer and
thus cannot be passed to PTR_ERR().
The solution is to move the list_add() down after full initialization
of rtrs_srv. To avoid holding the srv_mutex too long, only hold it
during the list operation, as suggested by Leon.
Fixes: 03e9b33a0fd6 ("RDMA/rtrs: Only allow addition of path to an already established session")
Link: https://lore.kernel.org/r/20210216143807.65923-1-jinpu.wang@cloud.ionos.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The current code computes a number of channels per SRP target and spreads
them equally across all online NUMA nodes. Each channel is then assigned
a CPU within this node.
In the case of unbalanced or even unpopulated nodes, some channels do
not get a CPU associated with them and thus do not get connected. This
causes the SRP connection to fail.
This patch solves the issue by rewriting channel computation and
allocation:
- Drop channel to node/CPU association as it had no real effect on
locality but added unnecessary complexity.
- Tweak the number of channels allocated to reduce CPU contention when
possible:
- Up to one channel per CPU (instead of up to 4 by node)
- At least 4 channels per node, unless ch_count module parameter is
used.
Link: https://lore.kernel.org/r/9cb4d9d3-30ad-2276-7eff-e85f7ddfb411@suse.com
Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Robert Hancock says:
====================
Broadcom PHY driver updates
Updates to the Broadcom PHY driver related to use with copper SFP modules.
Changed since v3:
-fixed kerneldoc error
Changed since v2:
-Create flag for PHY on SFP module and use that rather than accessing
attached_dev directly in PHY driver
Changed since v1:
-Reversed conditional to reduce indentation
-Added missing setting of MII_BCM54XX_AUXCTL_MISC_WREN in
MII_BCM54XX_AUXCTL_SHDWSEL_MISC register
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bcm54xx_config_init was modifying the PHY LED configuration to enable link
and activity indications. However, some SFP modules (such as Bel-Fuse
SFP-1GBT-06) have no LEDs but use the LED outputs to control the SFP LOS
signal, and modifying the LED settings will cause the LOS output to
malfunction. Skip this configuration for PHYs which are bound to an SFP
bus.
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a flag and helper function to indicate that a PHY device is part of
an SFP module, which is set on attach. This can be used by PHY drivers
to handle SFP-specific quirks or behavior.
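A hedged sketch of how a driver consumes the helper (the quirk shown
mirrors the LED change above; the callee name is illustrative):

  /* Skip LED reconfiguration when the PHY lives on an SFP module,
   * where the LED pins may drive the LOS signal instead. */
  if (!phy_on_sfp(phydev))
          bcm54xx_adjust_leds(phydev);  /* illustrative helper */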
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The default configuration for the BCM54616S PHY may not match the desired
mode when using 1000BaseX or SGMII interface modes, such as when it is on
an SFP module. Add code to explicitly set the correct mode using
programming sequences provided by Bel-Fuse:
https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-05-series.pdf
https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-06-series.pdf
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On CPU architectures without DMA cache snooping, dma_unmap() is a very
expensive operation, because its resulting sync needs to invalidate CPU
caches.
Increase efficiency/performance by syncing only those sections of the
lan743x's rx ring buffers that are actually in use.
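A hedged sketch of the idea (names are illustrative; the driver's
actual helpers differ):

  /* Sync only the bytes the hardware wrote for this frame, not the
   * whole pre-allocated ring buffer. */
  dma_sync_single_for_cpu(dev, buf_dma_addr, frame_len, DMA_FROM_DEVICE);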
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The buffers in the lan743x driver's receive ring are always 9K,
even when the largest packet that can be received (the mtu) is
much smaller. This performs particularly badly on cpu archs
without dma cache snooping (such as ARM): each received packet
results in a 9K dma_{map|unmap} operation, which is very expensive
because cpu caches need to be invalidated.
Careful measurement of the driver rx path on armv7 reveals that
the cpu spends the majority of its time waiting for cache
invalidation.
Optimize by keeping the rx ring buffer size as close as possible
to the mtu. This limits the amount of cache that requires
invalidation.
This optimization would normally force us to re-allocate all
ring buffers when the mtu is changed - a disruptive event,
because it can only happen when the network interface is down.
Remove the need to re-allocate all ring buffers by adding support
for multi-buffer frames. Now any combination of mtu and ring
buffer size will work. When the mtu changes from mtu1 to mtu2,
consumed buffers of size mtu1 are lazily replaced by newly
allocated buffers of size mtu2.
These optimizations double the rx performance on armv7.
Third parties report 3x rx speedup on armv8.
Tested with iperf3 on a freescale imx6qp + lan7430, both sides
set to mtu 1500 bytes, measure rx performance:
Before:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-20.00 sec 550 MBytes 231 Mbits/sec 0
After:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-20.00 sec 1.33 GBytes 570 Mbits/sec 0
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|