linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2019-07-18	drm/amd/display: Fix dc_create failure handling and 666 color depths	Julian Parkin
	[Why] It is possible (but very unlikely) that constructing dc fails before current_state is created. We support 666 color depth in some scenarios, but this isn't handled in get_norm_pix_clk. It uses exactly the same pixel clock as the 888 case. [How] Check for non null current_state before destructing. Add case for 666 color depth to get_norm_pix_clk to avoid assertion. Signed-off-by: Julian Parkin <julian.parkin@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: allocate 4 ddc engines for RV2	Derek Lai
	[Why] Driver will create 0, 1, and 2 ddc engines for RV2, but some platforms used 0, 1, and 3. [How] Still allocate 4 ddc engines for RV2. Signed-off-by: Derek Lai <Derek.Lai@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: put back front end initialization sequence	Eric Yang
	[Why] Seamless boot optimization removed proper front end power off sequence. In driver disable enable case, this causes driver to power gate hubp and dpp while there is still memory fetching going on, this can cause invalid memory requests to be generated which will hang data fabric. [How] Put back proper front end power off sequence Signed-off-by: Eric Yang <Eric.Yang2@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Acked-by: Tony Cheng <Tony.Cheng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Wait for flip to complete	Alvin Lee
	[why] In pipe split issue occurs when we program immediate flip while vsync flip is pending [how] Don't program immediate flip until flip is no longer pending Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Reviewed-by: Jaehyun Chung <Jaehyun.Chung@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Change min_h_sync_width from 8 to 4	Fatemeh Darbehani
	[Why] Some display's hsync width is lower than the minimum dcn20 is set to support right now. This will cause optc1_validate_timing to fail which eventually will result in wrong set mode. This was set to 8 as per HW team's request for no valid reason. [How] Changing min_h_sync_width to 4 will let us validate timing for preffered mode and light up the headset. This change was made to Vega 10 before for a similar issue. Signed-off-by: Fatemeh Darbehani <fatemeh.darbehani@amd.com> Reviewed-by: Joshua Aberback <Joshua.Aberback@amd.com> Acked-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: use encoder's engine id to find matched free audio device	Tai Man
	[Why] On some platforms, the encoder id 3 is not populated. So the encoders are not stored in right order as index (id: 0, 1, 2, 4, 5) at pool. This would cause encoders id 4 & id 5 to fail when finding corresponding audio device, defaulting to the first available audio device. As result, we cannot stream audio into two DP ports with encoders id 4 & id 5. [How] It need to create enough audio device objects (0 - 5) to perform matching. Then use encoder engine id to find matched audio device. Signed-off-by: Tai Man <taiman.wong@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: fix DMCU hang when going into Modern Standby	Zi Yu Liao
	[why] When the system is going into suspend, set_backlight gets called after the eDP got blanked. Since smooth brightness is enabled, the driver will make a call into the DMCU to ramp the brightness. The DMCU would try to enable ABM to do so. But since the display is blanked, this ends up causing ABM1_ACE_DBUF_REG_UPDATE_PENDING to get stuck at 1, which results in a dead lock in the DMCU firmware. [how] Disable brightness ramping when the eDP display is blanked. Signed-off-by: Zi Yu Liao <ziyu.liao@amd.com> Reviewed-by: Eric Yang <eric.yang2@amd.com> Acked-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Disable Audio on reinitialize hardware	Alvin Lee
	[Why] When we recover from hang, we do not want to skip the audio enable call. [How] Disable audio in dc_reinitialize_hardware Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Read max down spread	Derek Lai
	[Why] When launch D10.2, driver will write DPCD 0x107 with 0x00 [How] Read MAX_DOWNSPREAD (0x0003h) then keep in current link settings Signed-off-by: Derek Lai <Derek.Lai@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Check for valid stream_encode	Ilya Bakoulin
	Before accessing it's vtable, check that stream_encoder is non-null. Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com> Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Wait for backlight programming completion in set backlight ↵	SivapiriyanKumarasamy
	level [WHY] Currently we don't wait for blacklight programming completion in DMCU when setting backlight level. Some sequences such as PSR static screen event trigger reprogramming requires it to be complete. [How] Add generic wait for dmcu command completion in set backlight level. Signed-off-by: SivapiriyanKumarasamy <sivapiriyan.kumarasamy@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Poll for GPUVM context ready (v2)	Julian Parkin
	[Why] Hardware docs state that we must wait until the GPUVM context is ready after programming it. [How] Poll until the valid bit of PAGE_TABLE_BASE_ADDR_LO32 is set to 1 after programming it. v2: fix include for udelay (Alex) Signed-off-by: Julian Parkin <julian.parkin@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: wait for the whole frame after global unlock	Wenjing Liu
	[why] The current code will not wait for the entire frame after global unlock. This causes dsc dynamic target bpp update corruption when there is a surface update immediately happens after this. [how] Wait for the entire whole frame after unlock before continuing the rest of stream and surface update. Signed-off-by: Wenjing Liu <Wenjing.Liu@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Copy max_clks_by_state after dce_clk_mgr_construct	Nicholas Kazlauskas
	[Why] For DCE110, DCE112 and DCE120 the max_clks_by_state for the clk_mgr are copied from their respective table before the call to dce_clk_mgr_construct, but then dce_clk_mgr_construct overwrites these with the dce80_max_clks_by_state. [How] Copy these after we call dce_clk_mgr_construct so we're using the right tables. Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <David.Francis@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Clock does not lower in Updateplanes	Murton Liu
	[why] We reset the optimized_required in atomic_plane_disable flag immediately after it is set in atomic_plane_disconnect, causing us to never have flag set during next flip in UpdatePlanes. [how] Optimize directly after each time plane is removed. Signed-off-by: Murton Liu <murton.liu@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: skip retrain in dc_link_set_preferred_link_settings() if ↵	Samson Tam
	using passive dongle [Why] Fixes issue when we have a display connected using a passive dongle and then emulate over it using a DP connection at 1 x 1.62 Ghz. System hangs because register bus returns back 0xFFFFFFFF for all register reads after setting register DIG_BE_CNTL in dcn10_link_encoder_connect_dig_be_to_fe(). Hang occurs later when trying to do a register read. [How] At the start of the emulation, dc_link_set_preferred_link_settings() and dp_retrain_link_dp_test() is called, even though it is connected using a passive dongle. Add an extra condition in dp_retrain_link_dp_test() to check for link->dongle_max_pix_clk > 0. This is the only way we know if the connection is using passive dongle so we don't retrain DP. Signed-off-by: Samson Tam <Samson.Tam@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: swap system aperture high/low	Jun Lei
	[why] Currently logical values are swapped in HW, causing system aperture to be undefined, so VA and PA cannot co-exist [how] program values correctly Signed-off-by: Jun Lei <Jun.Lei@amd.com> Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Set one 4:2:0-related PPS field as recommended by DSC spec	Nikola Cornij
	[why] 'second_line_offset_adj' was mistakenly left at zero, even though DSC spec v1.2a recommends setting this field to 512 for 4:2:0. [how] Set 'second_line_offset_adj' to 512 for 4:2:0 and leave at zero otherwise Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Set default block_size, even in unexpected cases	Dmytro Laktyushkin
	We're not expected to enter the default case, but not returning a default value here is incorrect. Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: No audio endpoint for Dell MST display	Harmanprit Tatla
	[Why] There are certain MST displays (i.e. Dell P2715Q) that although have the MST feature set to off may still report it is a branch device and a non-zero value for downstream port present. This can lead to us incorrectly classifying a dp dongle connection as being active and disabling the audio endpoint for the display. [How] Modified the placement and condition used to assign the is_branch_dev bit. Signed-off-by: Harmanprit Tatla <harmanprit.tatla@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: cap DCFCLK hardmin to 507 for NV10	Jun Lei
	[why] Due to limitation in SMU/PPLIB, it is not possible to know Fmax @ Vmin for DCFCLK. This causes issues at high display configurations where extra headroom of DCFCLK can enable P-state switching [how] Use existing override logic. If override not defined, then force min = 507 Signed-off-by: Jun Lei <Jun.Lei@amd.com> Reviewed-by: Eric Yang <eric.yang2@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: fix dsc disable	Dmytro Laktyushkin
	A regression caused dsc to never get disabled in certain situations. Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: fix up HUBBUB hw programming for VM	Jun Lei
	[why] Some values were not being converted or bit-shifted properly for HW registers, causing black screen [how] Fix up the values before programming HW Signed-off-by: Jun Lei <jun.lei@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: initialize p_state to proper value	Jun Lei
	[why] On some modes SMU will be in infinite loop state at boot, this is because driver assumes p_state_support is false, but this is the opposite of the assumed boot state by SMU. we optimize away notifying SMU about no pstate, and so they will get stuck [how] when we init clk manager, init pstate to true, so it matches driver load assumption Signed-off-by: Jun Lei <Jun.Lei@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amdgpu: use VCN firmware offset for cache window	Leo Liu
	Since we are using the signed FW now, and also using PSP firmware loading, but it's still potential to break driver when loading FW directly instead of PSP, so we should add offset. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/ttm: use the same attributes when freeing d_page->vaddr	Fuqian Huang
	In function __ttm_dma_alloc_page(), d_page->addr is allocated by dma_alloc_attrs() but freed with use dma_free_coherent() in __ttm_dma_free_page(). Use the correct dma_free_attrs() to free d_page->vaddr. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/powerplay: change sysfs pp_dpm_xxx format for navi10	Kevin Wang
	v2: set average clock value on level 1 when current clock equal min or max clock (fine grained dpm support). the navi10 gfxclk (sclk) support fine grained DPM, so use level 1 to show current dpm freq in sysfs pp_dpm_xxx Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amdgpu: drop ras self test	Hawking Zhang
	this function is not needed any more. error injection is the only way to validate ras but it can't be executed in amdgpu_ras_init, where gpu is even not initialized Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amdgpu: only allow error injection to UMC IP block	Hawking Zhang
	error injection to other IP blocks (except UMC) will be enabled until RAS feature stablize on those IP blocks Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amdgpu: disable GFX RAS by default	Hawking Zhang
	GFX RAS has not been stablized yet. disable GFX ras until it is fully funcitonal. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amdgpu: do not create ras debugfs/sysfs node for ASICs that don't have ↵	Hawking Zhang
	ras ability driver shouldn't init any ras debugfs/sysfs node for ASICs that don't have ras hardware ability Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/powerplay: report bootup clock as max supported on dpm disabled	Evan Quan
	With gfxclk or uclk dpm disabled, it's reasonable to report bootup clock as the max supported. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amdgpu: Default disable GDS for compute VMIDs	Joseph Greathouse
	The GDS and GWS blocks default to allowing all VMIDs to access all entries. Graphics VMIDs can handle setting these limits when the driver launches work. However, compute workloads under HWS control don't go through the kernel driver. Instead, HWS firmware should set these limits when a process is put into a VMID slot. Disable access to these devices by default by turning off all mask bits (for OA) and setting BASE=SIZE=0 (for GDS and GWS) for all compute VMIDs. If a process wants to use these resources, they can request this from the HWS firmware (when such capabilities are enabled). HWS will then handle setting the base and limit for the process when it is assigned to a VMID. This will also prevent user kernels from getting 'stuck' in GWS by accident if they write GWS-using code but HWS firmware is not set up to handle GWS reset. Until HWS is enabled to handle GWS properly, all GWS accesses will MEM_VIOL fault the kernel. v2: Move initialization outside of SRBM mutex Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: init res_pool dccg_ref, dchub_ref with xtalin_freq	hersen wu
	[WHY] dc sw clock implementation of navi10 and raven are not exact the same. dcccg, dchub reference clock initialization is done after dc calls vbios dispcontroller_init table. for raven family, before dispcontroller_init is called by dc, the ref clk values are referred by sw clock implementation and program asic register using wrong values. this causes dchub pstate error. This need provide valid ref clk values. for navi10, since dispcontroller_init is not called, dchubbub_global_timer_enable = 0, hubbub2_get_dchub_ref_freq will hit aeert. this need remove hubbub2_get_dchub_ref_freq from this location and move to dcn20_init_hw. [HOW] for all asic, initialize dccg, dchub ref clk with data from vbios firmware table by default. for raven asic family, use these data from vbios, for asic which support sw dccg component, like navi10, read ref clk by sw dccg functions and update the ref clk. Signed-off-by: hersen wu <hersenxs.wu@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amdgpu/pm: remove check for pp funcs in freq sysfs handlers	Alex Deucher
	The dpm sensor function already does this for us. This fixes the freq*_input files with the new SMU implementation. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	drm/amd/display: Force uclk to max for every state	Nicholas Kazlauskas
	Workaround for now to avoid underflow. The uclk switch time should really be bumped up to 404, but doing so would expose p-state hang issues for higher bandwidth display configurations. Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18	net/mlx5: Replace kfree with kvfree	Chuhong Yuan
	Variable allocated by kvmalloc should not be freed by kfree. Because it may be allocated by vmalloc. So replace kfree with kvfree here. Fixes: 9b1f298236057 ("net/mlx5: Add support for FW fatal reporter dump") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-18	Merge tag 'modules-for-v5.3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull module updates from Jessica Yu: "Summary of modules changes for the 5.3 merge window: - Code fixes and cleanups - Fix bug where set_memory_x() wasn't being called when rodata=n - Fix bug where -EEXIST was being returned for going modules - Allow arches to override module_exit_section()" * tag 'modules-for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: modules: fix compile error if don't have strict module rwx ARM: module: recognize unwind exit sections module: allow arch overrides for .exit section names modules: fix BUG when load module with rodata=n kernel/module: Fix mem leak in module_add_modinfo_attrs kernel: module: Use struct_size() helper kernel/module.c: Only return -EEXIST for modules that have finished loading
2019-07-18	MAINTAINERS: update netsec driver	Ilias Apalodimas
	Add myself to maintainers since i provided the XDP and page_pool implementation Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jassi Brar <jaswinder.singh@linaro.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-18	cifs: copy_file_range needs to strip setuid bits and update timestamps	Amir Goldstein
	cifs has both source and destination inodes locked throughout the copy. Like ->write_iter(), we update mtime and strip setuid bits of destination file before copy and like ->read_iter(), we update atime of source file after copy. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-18	ipv6: Unlink sibling route in case of failure	Ido Schimmel
	When a route needs to be appended to an existing multipath route, fib6_add_rt2node() first appends it to the siblings list and increments the number of sibling routes on each sibling. Later, the function notifies the route via call_fib6_entry_notifiers(). In case the notification is vetoed, the route is not unlinked from the siblings list, which can result in a use-after-free. Fix this by unlinking the route from the siblings list before returning an error. Audited the rest of the call sites from which the FIB notification chain is called and could not find more problems. Fixes: 2233000cba40 ("net/ipv6: Move call_fib6_entry_notifiers up for route adds") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-18	objtool: Support conditional retpolines	Josh Poimboeuf
	A Clang-built kernel is showing the following warning: arch/x86/kernel/platform-quirks.o: warning: objtool: x86_early_init_platform_quirks()+0x84: unreachable instruction That corresponds to this code: 7e: 0f 85 00 00 00 00 jne 84 <x86_early_init_platform_quirks+0x84> 80: R_X86_64_PC32 __x86_indirect_thunk_r11-0x4 84: c3 retq This is a conditional retpoline sibling call, which is now possible thanks to retpolines. Objtool hasn't seen that before. It's incorrectly interpreting the conditional jump as an unconditional dynamic jump. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/30d4c758b267ef487fb97e6ecb2f148ad007b554.1563413318.git.jpoimboe@redhat.com
2019-07-18	objtool: Convert insn type to enum	Josh Poimboeuf
	This makes it easier to add new instruction types. Also it's hopefully more robust since the compiler should warn about out-of-range enums. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/0740e96af0d40e54cfd6a07bf09db0fbd10793cd.1563413318.git.jpoimboe@redhat.com
2019-07-18	objtool: Fix seg fault on bad switch table entry	Josh Poimboeuf
	In one rare case, Clang generated the following code: 5ca: 83 e0 21 and $0x21,%eax 5cd: b9 04 00 00 00 mov $0x4,%ecx 5d2: ff 24 c5 00 00 00 00 jmpq *0x0(,%rax,8) 5d5: R_X86_64_32S .rodata+0x38 which uses the corresponding jump table relocations: 000000000038 000200000001 R_X86_64_64 0000000000000000 .text + 834 000000000040 000200000001 R_X86_64_64 0000000000000000 .text + 5d9 000000000048 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000050 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000058 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000060 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000068 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000070 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000078 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000080 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000088 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000090 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000098 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000a0 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000a8 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000b0 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000b8 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000c0 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000c8 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000d0 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000d8 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000e0 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000e8 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000f0 000200000001 R_X86_64_64 0000000000000000 .text + b96 0000000000f8 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000100 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000108 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000110 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000118 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000120 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000128 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000130 000200000001 R_X86_64_64 0000000000000000 .text + b96 000000000138 000200000001 R_X86_64_64 0000000000000000 .text + 82f 000000000140 000200000001 R_X86_64_64 0000000000000000 .text + 828 Since %eax was masked with 0x21, only the first two and the last two entries are possible. Objtool doesn't actually emulate all the code, so it isn't smart enough to know that all the middle entries aren't reachable. They point to the NOP padding area after the end of the function, so objtool seg faulted when it tried to dereference a NULL insn->func. After this fix, objtool still gives an "unreachable" error because it stops reading the jump table when it encounters the bad addresses: /home/jpoimboe/objtool-tests/adm1275.o: warning: objtool: adm1275_probe()+0x828: unreachable instruction While the above code is technically correct, it's very wasteful of memory -- it uses 34 jump table entries when only 4 are needed. It's also not possible for objtool to validate this type of switch table because the unused entries point outside the function and objtool has no way of determining if that's intentional. Hopefully the Clang folks can fix it. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/a9db88eec4f1ca089e040989846961748238b6d8.1563413318.git.jpoimboe@redhat.com
2019-07-18	objtool: Support repeated uses of the same C jump table	Jann Horn
	This fixes objtool for both a GCC issue and a Clang issue: 1) GCC issue: kernel/bpf/core.o: warning: objtool: ___bpf_prog_run()+0x8d5: sibling call from callable instruction with modified stack frame With CONFIG_RETPOLINE=n, GCC is doing the following optimization in ___bpf_prog_run(). Before: select_insn: jmp jumptable(,%rax,8) ... ALU64_ADD_X: ... jmp select_insn ALU_ADD_X: ... jmp select_insn After: select_insn: jmp jumptable(, %rax, 8) ... ALU64_ADD_X: ... jmp jumptable(, %rax, 8) ALU_ADD_X: ... jmp jumptable(, %rax, 8) This confuses objtool. It has never seen multiple indirect jump sites which use the same jump table. For GCC switch tables, the only way of detecting the size of a table is by continuing to scan for more tables. The size of the previous table can only be determined after another switch table is found, or when the scan reaches the end of the function. That logic was reused for C jump tables, and was based on the assumption that each jump table only has a single jump site. The above optimization breaks that assumption. 2) Clang issue: drivers/usb/misc/sisusbvga/sisusb.o: warning: objtool: sisusb_write_mem_bulk()+0x588: can't find switch jump table With clang 9, code can be generated where a function contains two indirect jump instructions which use the same switch table. The fix is the same for both issues: split the jump table parsing into two passes. In the first pass, locate the heads of all switch tables for the function and mark their locations. In the second pass, parse the switch tables and add them. Fixes: e55a73251da3 ("bpf: Fix ORC unwinding in non-JIT BPF code") Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/e995befaada9d4d8b2cf788ff3f566ba900d2b4d.1563413318.git.jpoimboe@redhat.com Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
2019-07-18	objtool: Refactor jump table code	Josh Poimboeuf
	Now that C jump tables are supported, call them "jump tables" instead of "switch tables". Also rename some other variables, add comments, and simplify the code flow a bit. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/cf951b0c0641628e0b9b81f7ceccd9bcabcb4bd8.1563413318.git.jpoimboe@redhat.com
2019-07-18	objtool: Refactor sibling call detection logic	Josh Poimboeuf
	Simplify the sibling call detection logic a bit. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/8357dbef9e7f5512e76bf83a76c81722fc09eb5e.1563413318.git.jpoimboe@redhat.com
2019-07-18	objtool: Do frame pointer check before dead end check	Josh Poimboeuf
	Even calls to __noreturn functions need the frame pointer setup first. Such functions often dump the stack. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/aed62fbd60e239280218be623f751a433658e896.1563413318.git.jpoimboe@redhat.com
2019-07-18	objtool: Change dead_end_function() to return boolean	Josh Poimboeuf
	dead_end_function() can no longer return an error. Simplify its interface by making it return boolean. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/9e6679610768fb6e6c51dca23f7d4d0c03b0c910.1563413318.git.jpoimboe@redhat.com
2019-07-18	objtool: Warn on zero-length functions	Josh Poimboeuf
	All callable functions should have an ELF size. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/03d429c4fa87829c61c5dc0e89652f4d9efb62f1.1563413318.git.jpoimboe@redhat.com