summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2020-09-01locks: Remove extra "0x" in tracepoint format specifierChuck Lever
Clean up: %p adds its own 0x already. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-09-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-09-01 The following pull-request contains BPF updates for your *net-next* tree. There are two small conflicts when pulling, resolve as follows: 1) Merge conflict in tools/lib/bpf/libbpf.c between 88a82120282b ("libbpf: Factor out common ELF operations and improve logging") in bpf-next and 1e891e513e16 ("libbpf: Fix map index used in error message") in net-next. Resolve by taking the hunk in bpf-next: [...] scn = elf_sec_by_idx(obj, obj->efile.btf_maps_shndx); data = elf_sec_data(obj, scn); if (!scn || !data) { pr_warn("elf: failed to get %s map definitions for %s\n", MAPS_ELF_SEC, obj->path); return -EINVAL; } [...] 2) Merge conflict in drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c between 9647c57b11e5 ("xsk: i40e: ice: ixgbe: mlx5: Test for dma_need_sync earlier for better performance") in bpf-next and e20f0dbf204f ("net/mlx5e: RX, Add a prefetch command for small L1_CACHE_BYTES") in net-next. Resolve the two locations by retaining net_prefetch() and taking xsk_buff_dma_sync_for_cpu() from bpf-next. Should look like: [...] xdp_set_data_meta_invalid(xdp); xsk_buff_dma_sync_for_cpu(xdp, rq->xsk_pool); net_prefetch(xdp->data); [...] We've added 133 non-merge commits during the last 14 day(s) which contain a total of 246 files changed, 13832 insertions(+), 3105 deletions(-). The main changes are: 1) Initial support for sleepable BPF programs along with bpf_copy_from_user() helper for tracing to reliably access user memory, from Alexei Starovoitov. 2) Add BPF infra for writing and parsing TCP header options, from Martin KaFai Lau. 3) bpf_d_path() helper for returning full path for given 'struct path', from Jiri Olsa. 4) AF_XDP support for shared umems between devices and queues, from Magnus Karlsson. 5) Initial prep work for full BPF-to-BPF call support in libbpf, from Andrii Nakryiko. 6) Generalize bpf_sk_storage map & add local storage for inodes, from KP Singh. 7) Implement sockmap/hash updates from BPF context, from Lorenz Bauer. 8) BPF xor verification for scalar types & add BPF link iterator, from Yonghong Song. 9) Use target's prog type for BPF_PROG_TYPE_EXT prog verification, from Udip Pant. 10) Rework BPF tracing samples to use libbpf loader, from Daniel T. Lee. 11) Fix xdpsock sample to really cycle through all buffers, from Weqaar Janjua. 12) Improve type safety for tun/veth XDP frame handling, from Maciej Żenczykowski. 13) Various smaller cleanups and improvements all over the place. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-01Merge branch 'master' into for-nextJiri Kosina
Sync with Linus' branch in order to be able to apply fixups of more recent patches.
2020-09-01scif: Fix spelling of EACCESGeert Uytterhoeven
As per POSIX, the correct spelling is EACCES: include/uapi/asm-generic/errno-base.h:#define EACCES 13 /* Permission denied */ Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-09-01media: v4l2-ctrl: Add frame-skip std encoder controlStanimir Varbanov
Adds encoders standard v4l2 control for frame-skip. The control is a copy of a custom encoder control so that other v4l2 encoder drivers can use it. Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: v4l2-ctrls: Add encoder constant quality controlMaheshwar Ajja
When V4L2_CID_MPEG_VIDEO_BITRATE_MODE value is V4L2_MPEG_VIDEO_BITRATE_MODE_CQ, encoder will produce constant quality output indicated by V4L2_CID_MPEG_VIDEO_CONSTANT_QUALITY control value. Encoder will choose appropriate quantization parameter and bitrate to produce requested frame quality level. Signed-off-by: Maheshwar Ajja <majja@codeaurora.org> Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Rename and clarify PPS_FLAG_SCALING_MATRIX_PRESENTEzequiel Garcia
Applications are expected to fill V4L2_CID_MPEG_VIDEO_H264_SCALING_MATRIX if a non-flat scaling matrix applies to the picture. This is the case if SPS scaling_matrix_present_flag or PPS pic_scaling_matrix_present_flag are set, and should be handled by applications. On one hand, the PPS bitstream syntax element signals the presence of a Picture scaling matrix modifying the Sequence (SPS) scaling matrix. On the other hand, our flag should indicate if the scaling matrix V4L2 control is applicable to this request. Rename the flag from PPS_FLAG_PIC_SCALING_MATRIX_PRESENT to PPS_FLAG_SCALING_MATRIX_PRESENT, to avoid mixing this flag with bitstream syntax element pic_scaling_matrix_present_flag, and clarify the meaning of our flag. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Clean slice invariants syntax elementsEzequiel Garcia
The H.264 specification requires in section 7.4.3 "Slice header semantics", that the following values shall be the same in all slice headers: pic_parameter_set_id frame_num field_pic_flag bottom_field_flag idr_pic_id pic_order_cnt_lsb delta_pic_order_cnt_bottom delta_pic_order_cnt[ 0 ] delta_pic_order_cnt[ 1 ] sp_for_switch_flag slice_group_change_cycle These bitstream fields are part of the slice header, and therefore passed redundantly on each slice. The purpose of the redundancy is to make the codec fault-tolerant in network scenarios. This is of course not needed to be reflected in the V4L2 controls, given the bitstream has already been parsed by applications. Therefore, move the redundant fields to the per-frame decode parameters control (DECODE_PARAMS). Field 'pic_parameter_set_id' is simply removed in this case, because the PPS control must currently contain the active PPS. Syntax elements dec_ref_pic_marking() and those related to pic order count, remain invariant as well, and therefore, the fields dec_ref_pic_marking_bit_size and pic_order_cnt_bit_size are also common to all slices. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Clarify SLICE_BASED modeEzequiel Garcia
Currently, the SLICE_BASED and FRAME_BASED modes documentation is misleading and not matching the intended use-cases. Drop non-required fields SLICE_PARAMS 'start_byte_offset' and DECODE_PARAMS 'num_slices' and clarify the decoding modes in the documentation. On SLICE_BASED mode, a single slice is expected per OUTPUT buffer, and therefore 'start_byte_offset' is not needed (since the offset to the slice is the start of the buffer). This mode requires the use of CAPTURE buffer holding, and so the number of slices shall not be required. On FRAME_BASED mode, the devices are expected to take care of slice parsing. Neither SLICE_PARAMS are required (and shouldn't be exposed by frame-based drivers), nor the number of slices. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Drop SLICE_PARAMS 'size' fieldEzequiel Garcia
The SLICE_PARAMS control is intended for slice-based devices. In this mode, the OUTPUT buffer contains a single slice, and so the buffer's plane payload size can be used to query the slice size. To reduce the API surface drop the size from the SLICE_PARAMS control. A follow-up change will remove other members in SLICE_PARAMS so we don't need to add padding fields here. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Increase size of DPB entry pic_numEzequiel Garcia
DPB entry PicNum maximum value is 2*MaxFrameNum for interlaced content (field_pic_flag=1). As specified, MaxFrameNum is 2^(log2_max_frame_num_minus4 + 4) and log2_max_frame_num_minus4 is in the range of 0 to 12, which means pic_num should be a 32-bit field. The v4l2_h264_dpb_entry struct needs to be padded to avoid a hole, which might be also useful to allow future uAPI extensions. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Clean DPB entry interfaceEzequiel Garcia
As discussed recently, the current interface for the Decoded Picture Buffer is not enough to properly support field coding. This commit introduces enough semantics to support frame and field coding, and to signal how DPB entries are "used for reference". Reserved fields will be added by a follow-up commit. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Increase size of 'first_mb_in_slice' fieldEzequiel Garcia
Slice header syntax element 'first_mb_in_slice' can point to the last macroblock, currently the field can only reference 65536 macroblocks which is insufficient for 8K videos. Although unlikely, a 8192x4320 video (where macroblocks are 16x16), would contain 138240 macroblocks on a frame. As per the H264 specification, 'first_mb_in_slice' can be up to PicSizeInMbs - 1, so increase the size of the field to 32-bits. Note that v4l2_ctrl_h264_slice_params struct will be modified in a follow-up commit, and so we defer its 64-bit padding. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Split prediction weight parametersEzequiel Garcia
The prediction weight parameters are only required under certain conditions, which depend on slice header parameters. As specified in section 7.3.3 Slice header syntax, the prediction weight table is present if: ((weighted_pred_flag && (slice_type == P || slice_type == SP)) || \ (weighted_bipred_idc == 1 && slice_type == B)) Given its size, it makes sense to move this table to its control, so applications can avoid passing it if the slice doesn't specify it. Before this change struct v4l2_ctrl_h264_slice_params was 960 bytes. With this change, it's 188 bytes and struct v4l2_ctrl_h264_pred_weight is 772 bytes. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: uapi: h264: Update reference listsJernej Skrabec
When dealing with interlaced frames, reference lists must tell if each particular reference is meant for top or bottom field. This info is currently not provided at all in the H264 related controls. Change reference lists to hold a structure, which specifies an index into the DPB array and the field/frame specification for the picture. Currently the only user of these lists is Cedrus which is just compile fixed here. Actual usage of will come in a following commit. Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Tested-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01media: cec: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-01drm: Replace mode->export_head with a booleanVille Syrjälä
In order to shrink drm_display_mode below the magic two cacheline mark in 64bit we need to shrink it by another 8 bytes. The easiest thing to eliminate is the 'export_head' list head which is only used during the getconnector ioctl to temporarly track which modes on the connector's mode list are to be exposed and which are to remain hidden. We can simply replace the list head with a boolean which we use to tag the modes that are to be exposed. If we make sure to clear the tags after we're done with them we don't even need an extra loop over the modes to reset the tags at the start of the getconnector ioctl. Conveniently we already have a hole for the boolean left behind by the removal of mode->private_flags. The final size of the struct is now 112 bytes on 32bit and 120 bytes on 64bit. Another alternative would be a temp bitmask so we wouldn't have to have anything in the mode struct itself. The main issue is how large of a bitmask do we need? I guess we could allocate it dynamically but that means an extra kcalloc() and an extra loop through the modes to count them first (or grow the bitmask with krealloc() as needed). CC: Sam Ravnborg <sam@ravnborg.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428171940.19552-17-ville.syrjala@linux.intel.com Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2020-09-01drm: Nuke mode->private_flagsVille Syrjälä
The last two uses of mode->private_flags (in i915 and gma500) are now gone. So let's remove mode->private_flags entirely. v2: Drop the earlier int->u8 conversion CC: Sam Ravnborg <sam@ravnborg.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428171940.19552-16-ville.syrjala@linux.intel.com Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2020-09-01HID: core: Sanitize event code and type when mapping inputMarc Zyngier
When calling into hid_map_usage(), the passed event code is blindly stored as is, even if it doesn't fit in the associated bitmap. This event code can come from a variety of sources, including devices masquerading as input devices, only a bit more "programmable". Instead of taking the event code at face value, check that it actually fits the corresponding bitmap, and if it doesn't: - spit out a warning so that we know which device is acting up - NULLify the bitmap pointer so that we catch unexpected uses Code paths that can make use of untrusted inputs can now check that the mapping was indeed correct and bail out if not. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
2020-09-01tracepoint: Optimize using static_call()Steven Rostedt (VMware)
Currently the tracepoint site will iterate a vector and issue indirect calls to however many handlers are registered (ie. the vector is long). Using static_call() it is possible to optimize this for the common case of only having a single handler registered. In this case the static_call() can directly call this handler. Otherwise, if the vector is longer than 1, call a function that iterates the whole vector like the current code. [peterz: updated to new interface] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.279421092@infradead.org
2020-09-01static_call: Allow early initPeter Zijlstra
In order to use static_call() to wire up x86_pmu, we need to initialize earlier, specifically before memory allocation works; copy some of the tricks from jump_label to enable this. Primarily we overload key->next to store a sites pointer when there are no modules, this avoids having to use kmalloc() to initialize the sites and allows us to run much earlier. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20200818135805.220737930@infradead.org
2020-09-01static_call: Handle tail-callsPeter Zijlstra
GCC can turn our static_call(name)(args...) into a tail call, in which case we get a JMP.d32 into the trampoline (which then does a further tail-call). Teach objtool to recognise and mark these in .static_call_sites and adjust the code patching to deal with this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.101186767@infradead.org
2020-09-01static_call: Add static_call_cond()Peter Zijlstra
Extend the static_call infrastructure to optimize the following common pattern: if (func_ptr) func_ptr(args...) For the trampoline (which is in effect a tail-call), we patch the JMP.d32 into a RET, which then directly consumes the trampoline call. For the in-line sites we replace the CALL with a NOP5. NOTE: this is 'obviously' limited to functions with a 'void' return type. NOTE: DEFINE_STATIC_COND_CALL() only requires a typename, as opposed to a full function. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.042977182@infradead.org
2020-09-01x86/static_call: Add inline static call implementation for x86-64Josh Poimboeuf
Add the inline static call implementation for x86-64. The generated code is identical to the out-of-line case, except we move the trampoline into it's own section. Objtool uses the trampoline naming convention to detect all the call sites. It then annotates those call sites in the .static_call_sites section. During boot (and module init), the call sites are patched to call directly into the destination function. The temporary trampoline is then no longer used. [peterz: merged trampolines, put trampoline in section] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org
2020-09-01static_call: Avoid kprobes on inline static_call()sPeter Zijlstra
Similar to how we disallow kprobes on any other dynamic text (ftrace/jump_label) also disallow kprobes on inline static_call()s. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200818135804.744920586@infradead.org
2020-09-01static_call: Add inline static call infrastructureJosh Poimboeuf
Add infrastructure for an arch-specific CONFIG_HAVE_STATIC_CALL_INLINE option, which is a faster version of CONFIG_HAVE_STATIC_CALL. At runtime, the static call sites are patched directly, rather than using the out-of-line trampolines. Compared to out-of-line static calls, the performance benefits are more modest, but still measurable. Steven Rostedt did some tracepoint measurements: https://lkml.kernel.org/r/20181126155405.72b4f718@gandalf.local.home This code is heavily inspired by the jump label code (aka "static jumps"), as some of the concepts are very similar. For more details, see the comments in include/linux/static_call.h. [peterz: simplified interface; merged trampolines] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.684334440@infradead.org
2020-09-01static_call: Add basic static call infrastructureJosh Poimboeuf
Static calls are a replacement for global function pointers. They use code patching to allow direct calls to be used instead of indirect calls. They give the flexibility of function pointers, but with improved performance. This is especially important for cases where retpolines would otherwise be used, as retpolines can significantly impact performance. The concept and code are an extension of previous work done by Ard Biesheuvel and Steven Rostedt: https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@linaro.org https://lkml.kernel.org/r/20181006015110.653946300@goodmis.org There are two implementations, depending on arch support: 1) out-of-line: patched trampolines (CONFIG_HAVE_STATIC_CALL) 2) basic function pointers For more details, see the comments in include/linux/static_call.h. [peterz: simplified interface] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.623259796@infradead.org
2020-09-01compiler.h: Make __ADDRESSABLE() symbol truly uniqueJosh Poimboeuf
The __ADDRESSABLE() macro uses the __LINE__ macro to create a temporary symbol which has a unique name. However, if the macro is used multiple times from within another macro, the line number will always be the same, resulting in duplicate symbols. Make the temporary symbols truly unique by using __UNIQUE_ID instead of __LINE__. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Link: https://lore.kernel.org/r/20200818135804.564436253@infradead.org
2020-09-01notifier: Fix broken error handling patternPeter Zijlstra
The current notifiers have the following error handling pattern all over the place: int err, nr; err = __foo_notifier_call_chain(&chain, val_up, v, -1, &nr); if (err & NOTIFIER_STOP_MASK) __foo_notifier_call_chain(&chain, val_down, v, nr-1, NULL) And aside from the endless repetition thereof, it is broken. Consider blocking notifiers; both calls take and drop the rwsem, this means that the notifier list can change in between the two calls, making @nr meaningless. Fix this by replacing all the __foo_notifier_call_chain() functions with foo_notifier_call_chain_robust() that embeds the above pattern, but ensures it is inside a single lock region. Note: I switched atomic_notifier_call_chain_robust() to use the spinlock, since RCU cannot provide the guarantee required for the recovery. Note: software_resume() error handling was broken afaict. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20200818135804.325626653@infradead.org
2020-09-01vmlinux.lds.h: Add PGO and AutoFDO input sectionsNick Desaulniers
Basically, consider .text.{hot|unlikely|unknown}.* part of .text, too. When compiling with profiling information (collected via PGO instrumentations or AutoFDO sampling), Clang will separate code into .text.hot, .text.unlikely, or .text.unknown sections based on profiling information. After D79600 (clang-11), these sections will have a trailing `.` suffix, ie. .text.hot., .text.unlikely., .text.unknown.. When using -ffunction-sections together with profiling infomation, either explicitly (FGKASLR) or implicitly (LTO), code may be placed in sections following the convention: .text.hot.<foo>, .text.unlikely.<bar>, .text.unknown.<baz> where <foo>, <bar>, and <baz> are functions. (This produces one section per function; we generally try to merge these all back via linker script so that we don't have 50k sections). For the above cases, we need to teach our linker scripts that such sections might exist and that we'd explicitly like them grouped together, otherwise we can wind up with code outside of the _stext/_etext boundaries that might not be mapped properly for some architectures, resulting in boot failures. If the linker script is not told about possible input sections, then where the section is placed as output is a heuristic-laiden mess that's non-portable between linkers (ie. BFD and LLD), and has resulted in many hard to debug bugs. Kees Cook is working on cleaning this up by adding --orphan-handling=warn linker flag used in ARCH=powerpc to additional architectures. In the case of linker scripts, borrowing from the Zen of Python: explicit is better than implicit. Also, ld.bfd's internal linker script considers .text.hot AND .text.hot.* to be part of .text, as well as .text.unlikely and .text.unlikely.*. I didn't see support for .text.unknown.*, and didn't see Clang producing such code in our kernel builds, but I see code in LLVM that can produce such section names if profiling information is missing. That may point to a larger issue with generating or collecting profiles, but I would much rather be safe and explicit than have to debug yet another issue related to orphan section placement. Reported-by: Jian Cai <jiancai@google.com> Suggested-by: Fāng-ruì Sòng <maskray@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Luis Lozano <llozano@google.com> Tested-by: Manoj Gupta <manojgupta@google.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: linux-arch@vger.kernel.org Cc: stable@vger.kernel.org Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=add44f8d5c5c05e08b11e033127a744d61c26aee Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=1de778ed23ce7492c523d5850c6c6dbb34152655 Link: https://reviews.llvm.org/D79600 Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1084760 Link: https://lore.kernel.org/r/20200821194310.3089815-7-keescook@chromium.org Debugged-by: Luis Lozano <llozano@google.com>
2020-09-01vmlinux.lds.h: Add .symtab, .strtab, and .shstrtab to ELF_DETAILSKees Cook
When linking vmlinux with LLD, the synthetic sections .symtab, .strtab, and .shstrtab are listed as orphaned. Add them to the ELF_DETAILS section so there will be no warnings when --orphan-handling=warn is used more widely. (They are added above comment as it is the more common order[1].) ld.lld: warning: <internal>:(.symtab) is being placed in '.symtab' ld.lld: warning: <internal>:(.shstrtab) is being placed in '.shstrtab' ld.lld: warning: <internal>:(.strtab) is being placed in '.strtab' [1] https://lore.kernel.org/lkml/20200622224928.o2a7jkq33guxfci4@google.com/ Reported-by: Fangrui Song <maskray@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-6-keescook@chromium.org
2020-09-01vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUGKees Cook
The .comment section doesn't belong in STABS_DEBUG. Split it out into a new macro named ELF_DETAILS. This will gain other non-debug sections that need to be accounted for when linking with --orphan-handling=warn. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-5-keescook@chromium.org
2020-09-01vmlinux.lds.h: Avoid KASAN and KCSAN's unwanted sectionsKees Cook
KASAN (-fsanitize=kernel-address) and KCSAN (-fsanitize=thread) produce unwanted[1] .eh_frame and .init_array.* sections. Add them to COMMON_DISCARDS, except with CONFIG_CONSTRUCTORS, which wants to keep .init_array.* sections. [1] https://bugs.llvm.org/show_bug.cgi?id=46478 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Marco Elver <elver@google.com> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-4-keescook@chromium.org
2020-09-01vmlinux.lds.h: Add .gnu.version* to COMMON_DISCARDSKees Cook
For vmlinux linking, no architecture uses the .gnu.version* sections, so remove it via the COMMON_DISCARDS macro in preparation for adding --orphan-handling=warn more widely. This is a work-around for what appears to be a bug[1] in ld.bfd which warns for this synthetic section even when none is found in input objects, and even when no section is emitted for an output object[2]. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=26153 [2] https://lore.kernel.org/lkml/202006221524.CEB86E036B@keescook/ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Fangrui Song <maskray@google.com> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-3-keescook@chromium.org
2020-09-01vmlinux.lds.h: Create COMMON_DISCARDSKees Cook
Collect the common DISCARD sections for architectures that need more specialized discard control than what the standard DISCARDS section provides. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-2-keescook@chromium.org
2020-09-01drm/mst: Add support for QUERY_STREAM_ENCRYPTION_STATUS MST sideband messageSean Paul
Used to query whether an MST stream is encrypted or not. Cc: Lyude Paul <lyude@redhat.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200218220242.107265-14-sean@poorly.run #v4 Link: https://patchwork.freedesktop.org/patch/msgid/20200305201236.152307-15-sean@poorly.run #v5 Link: https://patchwork.freedesktop.org/patch/msgid/20200429195502.39919-15-sean@poorly.run #v6 Link: https://patchwork.freedesktop.org/patch/msgid/20200623155907.22961-16-sean@poorly.run #v7 Link: https://patchwork.freedesktop.org/patch/msgid/20200818153910.27894-16-sean@poorly.run #v8 Changes in v4: -Added to the set Changes in v5: -None Changes in v6: -Use FIELD_PREP to generate request buffer bitfields (Lyude) -Add mst selftest and dump/decode_sideband_req for QSES (Lyude) Changes in v7: -None Changes in v8: -Reverse the parsing on the hdcp_*x_device_present bits and leave breadcrumb in case this is incorrect (Anshuman) Changes in v8.5: -s/DRM_DEBUG_KMS/drm_dbg_kms/ (Lyude) Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Acked-by: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819143133.46232-1-sean@poorly.run
2020-09-01drm/i915: Fix sha_text population codeSean Paul
This patch fixes a few bugs: 1- We weren't taking into account sha_leftovers when adding multiple ksvs to sha_text. As such, we were or'ing the end of ksv[j - 1] with the beginning of ksv[j] 2- In the sha_leftovers == 2 and sha_leftovers == 3 case, bstatus was being placed on the wrong half of sha_text, overlapping the leftover ksv value 3- In the sha_leftovers == 2 case, we need to manually terminate the byte stream with 0x80 since the hardware doesn't have enough room to add it after writing M0 The upside is that all of the HDCP supported HDMI repeaters I could find on Amazon just strip HDCP anyways, so it turns out to be _really_ hard to hit any of these cases without an MST hub, which is not (yet) supported. Oh, and the sha_leftovers == 1 case works perfectly! Fixes: ee5e5e7a5e0f ("drm/i915: Add HDCP framework + base implementation") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sean Paul <seanpaul@chromium.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v4.17+ Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20191203173638.94919-2-sean@poorly.run #v1 Link: https://patchwork.freedesktop.org/patch/msgid/20191212190230.188505-2-sean@poorly.run #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20200117193103.156821-2-sean@poorly.run #v3 Link: https://patchwork.freedesktop.org/patch/msgid/20200218220242.107265-2-sean@poorly.run #v4 Link: https://patchwork.freedesktop.org/patch/msgid/20200305201236.152307-2-sean@poorly.run #v5 Link: https://patchwork.freedesktop.org/patch/msgid/20200429195502.39919-2-sean@poorly.run #v6 Link: https://patchwork.freedesktop.org/patch/msgid/20200623155907.22961-2-sean@poorly.run #v7 Changes in v2: -None Changes in v3: -None Changes in v4: -Rebased on intel_de_write changes Changes in v5: -None Changes in v6: -None Changes in v7: -None Changes in v8: -None Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200818153910.27894-2-sean@poorly.run
2020-09-01mm: cma: use CMA_MAX_NAME to define the length of cma name arrayBarry Song
CMA_MAX_NAME should be visible to CMA's users as they might need it to set the name of CMA areas and avoid hardcoding the size locally. So this patch moves CMA_MAX_NAME from local header file to include/linux header file and removes the hardcode in both hugetlb.c and contiguous.c. Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-01dma-contiguous: provide the ability to reserve per-numa CMABarry Song
Right now, drivers like ARM SMMU are using dma_alloc_coherent() to get coherent DMA buffers to save their command queues and page tables. As there is only one default CMA in the whole system, SMMUs on nodes other than node0 will get remote memory. This leads to significant latency. This patch provides per-numa CMA so that drivers like SMMU can get local memory. Tests show localizing CMA can decrease dma_unmap latency much. For instance, before this patch, SMMU on node2 has to wait for more than 560ns for the completion of CMD_SYNC in an empty command queue; with this patch, it needs 240ns only. A positive side effect of this patch would be improving performance even further for those users who are worried about performance more than DMA security and use iommu.passthrough=1 to skip IOMMU. With local CMA, all drivers can get local coherent DMA buffers. Also, this patch changes the default CONFIG_CMA_AREAS to 19 in NUMA. As 1+CONFIG_CMA_AREAS should be quite enough for most servers on the market even they enable both hugetlb_cma and pernuma_cma. 2 numa nodes: 2(hugetlb) + 2(pernuma) + 1(default global cma) = 5 4 numa nodes: 4(hugetlb) + 4(pernuma) + 1(default global cma) = 9 8 numa nodes: 8(hugetlb) + 8(pernuma) + 1(default global cma) = 17 Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-08-31drm/i915/dp: Extract drm_dp_read_dpcd_caps()Lyude Paul
Since DP 1.3, it's been possible for DP receivers to specify an additional set of DPCD capabilities, which can take precedence over the capabilities reported at DP_DPCD_REV. Basically any device supporting DP is going to need to read these in an identical manner, in particular nouveau, so let's go ahead and just move this code out of i915 into a shared DRM DP helper that we can use in other drivers. v2: * Remove redundant dpcd[DP_DPCD_REV] == 0 check * Fix drm_dp_dpcd_read() ret checks Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20200826182456.322681-20-lyude@redhat.com
2020-08-31drm/i915/dp: Extract drm_dp_read_sink_count()Lyude Paul
And of course, we'll also need to read the sink count from other drivers as well if we're checking whether or not it's supported. So, let's extract the code for this into another helper. v2: * Fix drm_dp_dpcd_readb() ret check * Add back comment and move back sink_count assignment in intel_dp_get_dpcd() v5: * Change name from drm_dp_get_sink_count() to drm_dp_read_sink_count() * Also, add "See also:" section to kdocs Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20200826182456.322681-17-lyude@redhat.com
2020-08-31drm/i915/dp: Extract drm_dp_read_sink_count_cap()Lyude Paul
Since other drivers are also going to need to be aware of the sink count in order to do proper dongle detection, we might as well steal i915's DP_SINK_COUNT helpers and move them into DRM helpers so that other dirvers can use them as well. Note that this also starts using intel_dp_has_sink_count() in intel_dp_detect_dpcd(), which is a functional change. v5: * Change name from drm_dp_has_sink_count() to drm_dp_read_sink_count_cap() Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20200826182456.322681-16-lyude@redhat.com
2020-08-31drm/i915/dp: Extract drm_dp_read_downstream_info()Lyude Paul
We're going to be doing the same probing process in nouveau for determining downstream DP port capabilities, so let's deduplicate the work by moving i915's code for handling this into a shared helper: drm_dp_read_downstream_info(). Note that when we do this, we also do make some functional changes while we're at it: * We always clear the downstream port info before trying to read it, just to make things easier for the caller * We skip reading downstream port info if the DPCD indicates that we don't support downstream port info * We only read as many bytes as needed for the reported number of downstream ports, no sense in reading the whole thing every time v2: * Fixup logic for calculating the downstream port length to account for the fact that downstream port caps can be either 1 byte or 4 bytes long. We can actually skip fixing the max_clock/max_bpc helpers here since they all check for DP_DETAILED_CAP_INFO_AVAILABLE anyway. * Fix ret code check for drm_dp_dpcd_read v5: * Change name from drm_dp_downstream_read_info() to drm_dp_read_downstream_info() * Also, add "See Also" sections for the various downstream info functions (drm_dp_read_downstream_info(), drm_dp_downstream_max_clock(), drm_dp_downstream_max_bpc()) Reviewed-by: Sean Paul <sean@poorly.run> Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200826182456.322681-14-lyude@redhat.com
2020-08-31drm/i915/dp: Extract drm_dp_read_mst_cap()Lyude Paul
Just a tiny drive-by cleanup, we can consolidate i915's code for checking for MST support into a helper to be shared across drivers. v5: * Drop !!() * Move drm_dp_has_mst() out of header * Change name from drm_dp_has_mst() to drm_dp_read_mst_cap() Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Sean Paul <sean@poorly.run> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200826182456.322681-10-lyude@redhat.com
2020-08-31ipvs: remove dependency on ip6_tablesYaroslav Bolyukin
This dependency was added because ipv6_find_hdr was in iptables specific code but is no longer required Fixes: f8f626754ebe ("ipv6: Move ipv6_find_hdr() out of Netfilter code.") Fixes: 63dca2c0b0e7 ("ipvs: Fix faulty IPv6 extension header handling in IPVS") Signed-off-by: Yaroslav Bolyukin <iam@lach.pw> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-31net: ipv4: remove unused arg exact_dif in compute_scoreMiaohe Lin
The arg exact_dif is not used anymore, remove it. inet_exact_dif_match() is no longer needed after the above is removed, so remove it too. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: ipv6: remove unused arg exact_dif in compute_scoreMiaohe Lin
The arg exact_dif is not used anymore, remove it. inet6_exact_dif_match() is no longer needed after the above is removed, remove it too. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: phy: add Lynx PCS moduleIoana Ciornei
Add a Lynx PCS module which exposes the necessary operations to drive the PCS using phylink. The majority of the code is extracted from the Felix DSA driver, which will be also modified in a later patch, and exposed as a separate module for code reusability purposes. As such, this aims at feature and bug parity with the existing Felix DSA driver, and thus USXGMII, SGMII, QSGMII and 2500Base-X (only w/o in-band AN) are supported by the Lynx PCS module since these were also supported by Felix. The module can only be enabled by the drivers in need and not user selectable. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: mdiobus: add clause 45 mdiobus write accessorIoana Ciornei
Add the locked variant of the clause 45 mdiobus write accessor - mdiobus_c45_write(). Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31net: phylink: add helper function to decode USXGMII wordIoana Ciornei
With the new addition of the USXGMII link partner ability constants we can now introduce a phylink helper that decodes the USXGMII word and populates the appropriate fields in the phylink_link_state structure based on them. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>