Age | Commit message (Collapse) | Author |
|
Currently; we're grabbing all of the modesetting locks before adding MST
connectors to fbdev. This isn't actually necessary, and causes a
deadlock as well:
======================================================
WARNING: possible circular locking dependency detected
4.17.0-rc3Lyude-Test+ #1 Tainted: G O
------------------------------------------------------
kworker/1:0/18 is trying to acquire lock:
00000000c832f62d (&helper->lock){+.+.}, at: drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
but task is already holding lock:
00000000942e28e2 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8e/0x1c0 [drm]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (crtc_ww_class_mutex){+.+.}:
ww_mutex_lock+0x43/0x80
drm_modeset_lock+0x71/0x130 [drm]
drm_helper_probe_single_connector_modes+0x7d/0x6b0 [drm_kms_helper]
drm_setup_crtcs+0x15e/0xc90 [drm_kms_helper]
__drm_fb_helper_initial_config_and_unlock+0x29/0x480 [drm_kms_helper]
nouveau_fbcon_init+0x138/0x1a0 [nouveau]
nouveau_drm_load+0x173/0x7e0 [nouveau]
drm_dev_register+0x134/0x1c0 [drm]
drm_get_pci_dev+0x8e/0x160 [drm]
nouveau_drm_probe+0x1a9/0x230 [nouveau]
pci_device_probe+0xcd/0x150
driver_probe_device+0x30b/0x480
__driver_attach+0xbc/0xe0
bus_for_each_dev+0x67/0x90
bus_add_driver+0x164/0x260
driver_register+0x57/0xc0
do_one_initcall+0x4d/0x323
do_init_module+0x5b/0x1f8
load_module+0x20e5/0x2ac0
__do_sys_finit_module+0xb7/0xd0
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #2 (crtc_ww_class_acquire){+.+.}:
drm_helper_probe_single_connector_modes+0x58/0x6b0 [drm_kms_helper]
drm_setup_crtcs+0x15e/0xc90 [drm_kms_helper]
__drm_fb_helper_initial_config_and_unlock+0x29/0x480 [drm_kms_helper]
nouveau_fbcon_init+0x138/0x1a0 [nouveau]
nouveau_drm_load+0x173/0x7e0 [nouveau]
drm_dev_register+0x134/0x1c0 [drm]
drm_get_pci_dev+0x8e/0x160 [drm]
nouveau_drm_probe+0x1a9/0x230 [nouveau]
pci_device_probe+0xcd/0x150
driver_probe_device+0x30b/0x480
__driver_attach+0xbc/0xe0
bus_for_each_dev+0x67/0x90
bus_add_driver+0x164/0x260
driver_register+0x57/0xc0
do_one_initcall+0x4d/0x323
do_init_module+0x5b/0x1f8
load_module+0x20e5/0x2ac0
__do_sys_finit_module+0xb7/0xd0
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #1 (&dev->mode_config.mutex){+.+.}:
drm_setup_crtcs+0x10c/0xc90 [drm_kms_helper]
__drm_fb_helper_initial_config_and_unlock+0x29/0x480 [drm_kms_helper]
nouveau_fbcon_init+0x138/0x1a0 [nouveau]
nouveau_drm_load+0x173/0x7e0 [nouveau]
drm_dev_register+0x134/0x1c0 [drm]
drm_get_pci_dev+0x8e/0x160 [drm]
nouveau_drm_probe+0x1a9/0x230 [nouveau]
pci_device_probe+0xcd/0x150
driver_probe_device+0x30b/0x480
__driver_attach+0xbc/0xe0
bus_for_each_dev+0x67/0x90
bus_add_driver+0x164/0x260
driver_register+0x57/0xc0
do_one_initcall+0x4d/0x323
do_init_module+0x5b/0x1f8
load_module+0x20e5/0x2ac0
__do_sys_finit_module+0xb7/0xd0
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (&helper->lock){+.+.}:
__mutex_lock+0x70/0x9d0
drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
nv50_mstm_register_connector+0x2c/0x50 [nouveau]
drm_dp_add_port+0x2f5/0x420 [drm_kms_helper]
drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
drm_dp_add_port+0x33f/0x420 [drm_kms_helper]
drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
drm_dp_check_and_send_link_address+0x87/0xd0 [drm_kms_helper]
drm_dp_mst_link_probe_work+0x4d/0x80 [drm_kms_helper]
process_one_work+0x20d/0x650
worker_thread+0x3a/0x390
kthread+0x11e/0x140
ret_from_fork+0x3a/0x50
other info that might help us debug this:
Chain exists of:
&helper->lock --> crtc_ww_class_acquire --> crtc_ww_class_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(crtc_ww_class_mutex);
lock(crtc_ww_class_acquire);
lock(crtc_ww_class_mutex);
lock(&helper->lock);
*** DEADLOCK ***
5 locks held by kworker/1:0/18:
#0: 000000004a05cd50 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x187/0x650
#1: 00000000601c11d1 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x187/0x650
#2: 00000000586ca0df (&dev->mode_config.mutex){+.+.}, at: drm_modeset_lock_all+0x3a/0x1b0 [drm]
#3: 00000000d3ca0ffa (crtc_ww_class_acquire){+.+.}, at: drm_modeset_lock_all+0x44/0x1b0 [drm]
#4: 00000000942e28e2 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8e/0x1c0 [drm]
stack backtrace:
CPU: 1 PID: 18 Comm: kworker/1:0 Tainted: G O 4.17.0-rc3Lyude-Test+ #1
Hardware name: Gateway FX6840/FX6840, BIOS P01-A3 05/17/2010
Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
Call Trace:
dump_stack+0x85/0xcb
print_circular_bug.isra.38+0x1ce/0x1db
__lock_acquire+0x128f/0x1350
? lock_acquire+0x9f/0x200
? lock_acquire+0x9f/0x200
? __ww_mutex_lock.constprop.13+0x8f/0x1000
lock_acquire+0x9f/0x200
? drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
? drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
__mutex_lock+0x70/0x9d0
? drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
? ww_mutex_lock+0x43/0x80
? _cond_resched+0x15/0x30
? ww_mutex_lock+0x43/0x80
? drm_modeset_lock+0xb2/0x130 [drm]
? drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
drm_fb_helper_add_one_connector+0x2a/0x60 [drm_kms_helper]
nv50_mstm_register_connector+0x2c/0x50 [nouveau]
drm_dp_add_port+0x2f5/0x420 [drm_kms_helper]
? mark_held_locks+0x50/0x80
? kfree+0xcf/0x2a0
? drm_dp_check_mstb_guid+0xd6/0x120 [drm_kms_helper]
? trace_hardirqs_on_caller+0xed/0x180
? drm_dp_check_mstb_guid+0xd6/0x120 [drm_kms_helper]
drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
drm_dp_add_port+0x33f/0x420 [drm_kms_helper]
? nouveau_connector_aux_xfer+0x7c/0xb0 [nouveau]
? find_held_lock+0x2d/0x90
? drm_dp_dpcd_access+0xd9/0xf0 [drm_kms_helper]
? __mutex_unlock_slowpath+0x3b/0x280
? drm_dp_dpcd_access+0xd9/0xf0 [drm_kms_helper]
drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
drm_dp_check_and_send_link_address+0x87/0xd0 [drm_kms_helper]
drm_dp_mst_link_probe_work+0x4d/0x80 [drm_kms_helper]
process_one_work+0x20d/0x650
worker_thread+0x3a/0x390
? process_one_work+0x650/0x650
kthread+0x11e/0x140
? kthread_create_worker_on_cpu+0x50/0x50
ret_from_fork+0x3a/0x50
Taking example from i915, the only time we need to hold any modesetting
locks is when changing the port on the mstc, and in that case we only
need to hold the connection mutex.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Potentially responsible for some random OOPSes.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org [v4.15+]
|
|
into drm-fixes
A little bigger than normal since this is two weeks of fixes.
- Atom firmware table updates for vega12
- Fix fallout from huge page support
- Fix up smu7 power profile interface to be consistent with vega
- Misc other fixes
* 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux:
drm/amd/pp: Refine the output of pp_power_profile_mode on VI
drm/amdgpu: Switch to interruptable wait to recover from ring hang.
drm/ttm: Use GFP_TRANSHUGE_LIGHT for allocating huge pages
drm/amd/display: Use kvzalloc for potentially large allocations
drm/amd/display: Don't return ddc result and read_bytes in same return value
drm/amd/display: Add get_firmware_info_v3_2 for VG12
drm/amd: Add BIOS smu_info v3_3 required struct def.
drm/amd/display: Add VG12 ASIC IDs
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
atomic: Clear state pointers on clear (Ville)
vc4: Fix oops in dpi disable (Eric)
omap: Various error-checking + uninitialized var fixes (Tomi)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
* tag 'drm-misc-fixes-2018-05-09' of git://anongit.freedesktop.org/drm/drm-misc:
drm/vc4: Fix scaling of uni-planar formats
drm/bridge/sii8620: add Kconfig dependency on extcon
drm/omap: handle alloc failures in omap_connector
drm/omap: add missing linefeeds to prints
drm/omap: handle error if scale coefs are not found
drm/omap: check return value from soc_device_match
drm/omap: fix possible NULL ref issue in tiler_reserve_2d
drm/omap: fix uninitialized ret variable
drm/omap: silence unititialized variable warning
drm/vc4: Fix oops dereferencing DPI's connector since panel_bridge.
drm/atomic: Clean private obj old_state/new_state in drm_atomic_state_default_clear()
drm/atomic: Clean old_state/new_state in drm_atomic_state_default_clear()
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Increase LVDS panel timeout to 5s to avoid spurious *ERROR*
- Fix 2 WARNS: BIOS framebuffer related (FDO #105992) and eDP cdclk mismatch
* tag 'drm-intel-fixes-2018-05-09' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Fix drm:intel_enable_lvds ERROR message in kernel log
drm/i915: Correctly populate user mode h/vdisplay with pipe src size during readout
drm/i915: Adjust eDP's logical vco in a reliable place.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
Fixup pagefault issue of mixer driver
- it makes sure to check shadow register for interlace scan.
- it corrects chroma_addr[1], height and vertical position values.
And trivial cleanup
- it just removes duplicated drm_bridge_attach.
* tag 'exynos-drm-fixes-for-v4.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos: hdmi: avoid duplicating drm_bridge_attach
drm/exynos: mixer: avoid Oops in vp_video_buffer()
drm/exynos/mixer: fix synchronization check in interlaced mode
|
|
Both ‘uninorth_remove_memory’ and ‘null_cache_flush’ can be made
static. So make them.
Silence the following gcc warning (W=1):
drivers/char/agp/uninorth-agp.c:198:5: warning: no previous prototype for ‘uninorth_remove_memory’ [-Wmissing-prototypes]
and
drivers/char/agp/uninorth-agp.c:473:6: warning: no previous prototype for ‘null_cache_flush’ [-Wmissing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- quirk for Toshiba Click Mini L9W-B, from Hans de Goede
- intel-ish-hid and wacom error handling (device freeing) path fixes
from Arvind Yadav
- memory corruption fix in intel-ish-hid driver from Hans de Goede
- a few new device ID additions to hid-lenovo from Peter Ganzhorn
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: i2c-hid: Add RESEND_REPORT_DESCR quirk for Toshiba Click Mini L9W-B
HID: intel-ish-hid: use put_device() instead of kfree()
HID: intel_ish-hid: Stop using a static local buffer in get_report()
HID: intel_ish-hid: Move header size check to inside the loop
HID: wacom: Release device resource data obtained by devres_alloc()
HID: lenovo: Add support for IBM/Lenovo Scrollpoint mice
|
|
In order to keep consist with Vega,
the output format of the pp_power_profile_mode would be
<integer><mode name string>< “*” for current profile>:"detail settings"
and remove the "CURRENT" mode line.
for example:
NUM MODE_NAME SCLK_UP_HYST SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL MCLK_UP_HYST MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL
0 3D_FULL_SCREEN: 0 100 30 0 100 10
1 POWER_SAVING: 10 0 30 - - -
2 VIDEO: - - - 10 16 31
3 VR: 0 11 50 0 100 10
4 COMPUTE: 0 5 30 - - -
5 CUSTOM *: 0 5 30 0 100 10
NUM MODE_NAME SCLK_UP_HYST SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL MCLK_UP_HYST MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL
0 3D_FULL_SCREEN: 0 100 30 0 100 10
1 POWER_SAVING *: 10 0 30 0 100 10
2 VIDEO: - - - 10 16 31
3 VR: 0 11 50 0 100 10
4 COMPUTE: 0 5 30 - - -
5 CUSTOM: - - - - - -
Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
v2:
Use dma_fence_wait instead of dma_fence_wait_timeout(...,MAX_SCHEDULE_TIMEOUT)
Avoid printing error message for ERESTARTSYS
Originally-by: David Panariti <David.Panariti@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
GFP_TRANSHUGE tries very hard to allocate huge pages, which can result
in long delays with high memory pressure. I have observed firefox
freezing for up to around a minute due to this while restic was taking
a full system backup.
Since we don't really need huge pages, use GFP_TRANSHUGE_LIGHT |
__GFP_NORETRY instead, in order to fail quickly when there are no huge
pages available.
Set __GFP_KSWAPD_RECLAIM as well, in order for huge pages to be freed
up in the background if necessary.
With these changes, I'm no longer seeing freezes during a restic backup.
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Allocating up to 32 physically contiguous pages can easily fail (and has
failed for me), and isn't necessary anyway.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The two ranges overlap.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Some BPF capabilities carry no value, they simply indicate feature
is present. Our capability parsing loop will exit early if last
capability is zero-length because it's looking for more than 8 bytes
of data (8B is our TLV header length). Allow the last capability to
be zero-length.
This bug would lead to driver failing to probe with the following error
if the last capability FW advertises is zero-length:
nfp: BPF capabilities left after parsing, parsed:92 total length:100
nfp: invalid BPF capabilities at offset:92
Note the "parsed" and "length" values are 8 apart.
No shipping FW runs into this issue, but we can't guarantee that will
remain the case.
Fixes: 77a844ee650c ("nfp: bpf: prepare for parsing BPF FW capabilities")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-current
Single bug-fix for a regression introduced during the 4.17 merge window.
|
|
Commit feb2f19b1e8f ("eeprom: at24: move platform data processing into
a separate routine") introduced a bug where we incorrectly retireve the
at24_chip_data structure. Remove the unnecessary ampersand operator.
Fixes: feb2f19b1e8f ("eeprom: at24: move platform data processing into a separate routine")
Reported-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
The error handling path of 'c4iw_get_dma_mr()' does not free resources
in the correct order.
If an error occures, it can leak 'mhp->wr_waitp'.
Fixes: a3f12da0e99a ("iw_cxgb4: allocate wait object for each memory object")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The current code sets an affinity hint with a cpumask_t stored on the
stack. This value can then be accessed through /proc/irq/*/affinity_hint/,
causing a segfault or returning corrupt data.
Move the cpumask_t into struct i40iw_msix_vector so it is available later.
Backtrace:
BUG: unable to handle kernel paging request at ffffb16e600e7c90
IP: irq_affinity_hint_proc_show+0x60/0xf0
PGD 17c0c6d067
PUD 17c0c6e067
PMD 15d4a0e067
PTE 0
Oops: 0000 [#1] SMP
Modules linked in: ...
CPU: 3 PID: 172543 Comm: grep Tainted: G OE ... #1
Hardware name: ...
task: ffff9a5caee08000 task.stack: ffffb16e659d8000
RIP: 0010:irq_affinity_hint_proc_show+0x60/0xf0
RSP: 0018:ffffb16e659dbd20 EFLAGS: 00010086
RAX: 0000000000000246 RBX: ffffb16e659dbd20 RCX: 0000000000000000
RDX: ffffb16e600e7c90 RSI: 0000000000000003 RDI: 0000000000000046
RBP: ffffb16e659dbd88 R08: 0000000000000038 R09: 0000000000000001
R10: 0000000070803079 R11: 0000000000000000 R12: ffff9a59d1d97a00
R13: ffff9a5da47a6cd8 R14: ffff9a5da47a6c00 R15: ffff9a59d1d97a00
FS: 00007f946c31d740(0000) GS:ffff9a5dc1800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb16e600e7c90 CR3: 00000016a4339000 CR4: 00000000007406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
seq_read+0x12d/0x430
? sched_clock_cpu+0x11/0xb0
proc_reg_read+0x48/0x70
__vfs_read+0x37/0x140
? security_file_permission+0xa0/0xc0
vfs_read+0x96/0x140
SyS_read+0x58/0xc0
do_syscall_64+0x5a/0x190
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f946bbc97e0
RSP: 002b:00007ffdd0c4ae08 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000000000096b000 RCX: 00007f946bbc97e0
RDX: 000000000096b000 RSI: 00007f946a2f0000 RDI: 0000000000000004
RBP: 0000000000001000 R08: 00007f946a2ef011 R09: 000000000000000a
R10: 0000000000001000 R11: 0000000000000246 R12: 00007f946a2f0000
R13: 0000000000000004 R14: 0000000000000000 R15: 00007f946a2f0000
Code: b9 08 00 00 00 49 89 c6 48 89 df 31 c0 4d 8d ae d8 00 00 00 f3 48 ab 4c 89 ef e8 6c 9a 56 00 49 8b 96 30 01 00 00 48 85 d2 74 3f <48> 8b 0a 48 89 4d 98 48 8b 4a 08 48 89 4d a0 48 8b 4a 10 48 89
RIP: irq_affinity_hint_proc_show+0x60/0xf0 RSP: ffffb16e659dbd20
CR2: ffffb16e600e7c90
Fixes: 8e06af711bf2 ("i40iw: add main, hdr, status")
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In this switch there is a reference held on the QP. 'continue' will grab
the next event without releasing the reference, causing a leak.
Change it to 'break' to drop the reference before grabbing the next event.
Fixes: 4e9042e647ff ("i40iw: add hw and utils files")
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
A panic occurs when there is a newly-registered element on the QP/CQ MR
list waiting to be attached, but a different MR is deregistered. The
current code only checks for whether the list is empty, not whether the
element being deregistered is actually on the list.
Fix the panic by adding a boolean to track if the object is on the list.
Fixes: d37498417947 ("i40iw: add files for iwarp interface")
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When the last QP of eight QPs is not exist in
hns_roce_v1_mr_free_work_fn function, the
print for qpn of hr_qp may introduce a
calltrace for NULL pointer.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch mainly configure value for __internal_mr of mr_free_pd.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When enabled inner_pa_vld field of mpt, The pa0 and
pa1 will be valid and the hardware will use it
directly and not use base address of pbl. As a
result, it can reduce the delay.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In order to avoid illegal use for desc_dma_addr of ring,
it needs to set it zero when free cmq desc.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When received multiply rq sge, it should tag the
invalid lkey for the last non-zero length sge
when have some sges' length are zero. This patch
fixes it.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Because hip06 hardware is not support for qp transition from
reset to reset state, it need to return errno when qp
transited from reset to reset. This patch fixes it.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When configure global param function run fail, it should directly return
and the initial flow will stop.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Because the sys_image_guid of ib_device_attr structure is __be64, it
need to use cpu_to_be64 for converting.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
To enable the linux-kernel system to load the hns-roce-hw-v2 driver
automatically when hns-roce-hw-v2 is plugged in pci bus, it need to
create a MODULE_DEVICE_TABLE for expose the pci_table of
hns-roce-hw-v2 to user.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Reported-by: Zhou Wang <wangzhou1@hisilicon.com>
Tested-by: Xiaojun Tan <tanxiaojun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When used rq record db for kernel, it needs to set the rdb_en of
hr_qp to 1 and configures the dma address of record rq db of qp
context.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
It needs to set the rqie field of qp context by configured rq inline
flags. Besides, it need to decide whether posting inline rqwqe by
judged rq inline flags.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This addresses 3 separate problems:
1. When using NVME over Fabrics we may end up sending IP
packets in interrupt context, we should defer this work
to a tasklet.
[ 50.939957] WARNING: CPU: 3 PID: 0 at kernel/softirq.c:161 __local_bh_enable_ip+0x1f/0xa0
[ 50.942602] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G W 4.17.0-rc3-ARCH+ #104
[ 50.945466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[ 50.948163] RIP: 0010:__local_bh_enable_ip+0x1f/0xa0
[ 50.949631] RSP: 0018:ffff88009c183900 EFLAGS: 00010006
[ 50.951029] RAX: 0000000080010403 RBX: 0000000000000200 RCX: 0000000000000001
[ 50.952636] RDX: 0000000000000000 RSI: 0000000000000200 RDI: ffffffff817e04ec
[ 50.954278] RBP: ffff88009c183910 R08: 0000000000000001 R09: 0000000000000614
[ 50.956000] R10: ffffea00021d5500 R11: 0000000000000001 R12: ffffffff817e04ec
[ 50.957779] R13: 0000000000000000 R14: ffff88009566f400 R15: ffff8800956c7000
[ 50.959402] FS: 0000000000000000(0000) GS:ffff88009c180000(0000) knlGS:0000000000000000
[ 50.961552] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 50.963798] CR2: 000055c4ec0ccac0 CR3: 0000000002209001 CR4: 00000000000606e0
[ 50.966121] Call Trace:
[ 50.966845] <IRQ>
[ 50.967497] __dev_queue_xmit+0x62d/0x690
[ 50.968722] dev_queue_xmit+0x10/0x20
[ 50.969894] neigh_resolve_output+0x173/0x190
[ 50.971244] ip_finish_output2+0x2b8/0x370
[ 50.972527] ip_finish_output+0x1d2/0x220
[ 50.973785] ? ip_finish_output+0x1d2/0x220
[ 50.975010] ip_output+0xd4/0x100
[ 50.975903] ip_local_out+0x3b/0x50
[ 50.976823] rxe_send+0x74/0x120
[ 50.977702] rxe_requester+0xe3b/0x10b0
[ 50.978881] ? ip_local_deliver_finish+0xd1/0xe0
[ 50.980260] rxe_do_task+0x85/0x100
[ 50.981386] rxe_run_task+0x2f/0x40
[ 50.982470] rxe_post_send+0x51a/0x550
[ 50.983591] nvmet_rdma_queue_response+0x10a/0x170
[ 50.985024] __nvmet_req_complete+0x95/0xa0
[ 50.986287] nvmet_req_complete+0x15/0x60
[ 50.987469] nvmet_bio_done+0x2d/0x40
[ 50.988564] bio_endio+0x12c/0x140
[ 50.989654] blk_update_request+0x185/0x2a0
[ 50.990947] blk_mq_end_request+0x1e/0x80
[ 50.991997] nvme_complete_rq+0x1cc/0x1e0
[ 50.993171] nvme_pci_complete_rq+0x117/0x120
[ 50.994355] __blk_mq_complete_request+0x15e/0x180
[ 50.995988] blk_mq_complete_request+0x6f/0xa0
[ 50.997304] nvme_process_cq+0xe0/0x1b0
[ 50.998494] nvme_irq+0x28/0x50
[ 50.999572] __handle_irq_event_percpu+0xa2/0x1c0
[ 51.000986] handle_irq_event_percpu+0x32/0x80
[ 51.002356] handle_irq_event+0x3c/0x60
[ 51.003463] handle_edge_irq+0x1c9/0x200
[ 51.004473] handle_irq+0x23/0x30
[ 51.005363] do_IRQ+0x46/0xd0
[ 51.006182] common_interrupt+0xf/0xf
[ 51.007129] </IRQ>
2. Work must always be offloaded to tasklet for rxe_post_send_kernel()
when using NVMEoF in order to solve lock ordering between neigh->ha_lock
seqlock and the nvme queue lock:
[ 77.833783] Possible interrupt unsafe locking scenario:
[ 77.833783]
[ 77.835831] CPU0 CPU1
[ 77.837129] ---- ----
[ 77.838313] lock(&(&n->ha_lock)->seqcount);
[ 77.839550] local_irq_disable();
[ 77.841377] lock(&(&nvmeq->q_lock)->rlock);
[ 77.843222] lock(&(&n->ha_lock)->seqcount);
[ 77.845178] <Interrupt>
[ 77.846298] lock(&(&nvmeq->q_lock)->rlock);
[ 77.847986]
[ 77.847986] *** DEADLOCK ***
3. Same goes for the lock ordering between sch->q.lock and nvme queue lock:
[ 47.634271] Possible interrupt unsafe locking scenario:
[ 47.634271]
[ 47.636452] CPU0 CPU1
[ 47.637861] ---- ----
[ 47.639285] lock(&(&sch->q.lock)->rlock);
[ 47.640654] local_irq_disable();
[ 47.642451] lock(&(&nvmeq->q_lock)->rlock);
[ 47.644521] lock(&(&sch->q.lock)->rlock);
[ 47.646480] <Interrupt>
[ 47.647263] lock(&(&nvmeq->q_lock)->rlock);
[ 47.648492]
[ 47.648492] *** DEADLOCK ***
Using NVMEoF after this patch seems to finally be stable, without it,
rxe eventually deadlocks the whole system and causes RCU stalls.
Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Use of incorrect structure address for IPv6 neighbor lookup
causes connections to IPv6 addresses to fail. Fix this by
using correct address in call to dst_neigh_lookup.
Fixes: f27b4746f378 ("i40iw: add connection management code")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
If i40iw_allocate_dma_mem fails when creating a QP, the
memory allocated for the QP structure using kzalloc is not
freed because iwqp->allocated_buffer is used to free the
memory and it is not setup until later. Fix this by setting
iwqp->allocated_buffer before allocating the dma memory.
Fixes: d37498417947 ("i40iw: add files for iwarp interface")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Flow label is defined as u32 in the in ipv6 flow spec, but
used internally in the flow specs parsing as u8. That was
causing loss of part of flow_label value.
Fixes: 2d1e697e9b716 ('IB/mlx5: Add support to match inner packet fields')
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Daria Velikovsky <daria@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
User can leave system without medium BlueFlames registers,
however the code assumed that at least one such register exists.
This patch fixes that assumption.
Fixes: c1be5232d21d ("IB/mlx5: Fix micro UAR allocator")
Reported-by: Rohit Zambre <rzambre@uci.edu>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
A pio send egress error can occur when the PSM library attempts to
to send a bad packet. That issue is still being investigated.
The pio error interrupt handler then attempts to progress the recovery
of the errored pio send context.
Code inspection reveals that the handling lacks the necessary locking
if that recovery interleaves with a PSM close of the "context" object
contains the pio send context.
The lack of the locking can cause the recovery to access the already
freed pio send context object and incorrectly deduce that the pio
send context is actually a kernel pio send context as shown by the
NULL deref stack below:
[<ffffffff8143d78c>] _dev_info+0x6c/0x90
[<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1]
[<ffffffff816ab124>] ? __schedule+0x424/0x9b0
[<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1]
[<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[<ffffffff810ab086>] worker_thread+0x126/0x3c0
[<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[<ffffffff810b252f>] kthread+0xcf/0xe0
[<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[<ffffffff816b8798>] ret_from_fork+0x58/0x90
[<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
This is the best case scenario and other scenarios can corrupt the
already freed memory.
Fix by adding the necessary locking in the pio send context error
handler.
Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
INFINIBAND_ADDR_TRANS depends on INFINIBAND. So there's no need for
options which depend INFINIBAND_ADDR_TRANS to also depend on INFINIBAND.
Remove the unnecessary INFINIBAND depends.
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The 0457:10fb touchscreen found on the Toshiba Click Mini L9W-B needs
to have a report-decriptors command send to it on resume in order for
the touchscreen to start generating events again on resume.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
USB controller ASM1042 stops working after commit de3ef1eb1cd0 (PM /
core: Drop run_wake flag from struct dev_pm_info).
The device in question is not power managed by platform firmware,
furthermore, it only supports PME# from D3cold:
Capabilities: [78] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Before commit de3ef1eb1cd0, the device never gets runtime suspended.
After that commit, the device gets runtime suspended to D3hot, which can
not generate any PME#.
usb_hcd_pci_probe() unconditionally calls device_wakeup_enable(), hence
device_can_wakeup() in pci_dev_run_wake() always returns true.
So pci_dev_run_wake() needs to check PME wakeup capability as its first
condition.
In addition, change wakeup flag passed to pci_target_state() from false
to true, because we want to find the deepest state different from D3cold
that the device can still generate PME#. In this case, it's D0 for the
device in question.
Fixes: de3ef1eb1cd0 (PM / core: Drop run_wake flag from struct dev_pm_info)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: 4.13+ <stable@vger.kernel.org> # 4.13+
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When using uni-planar formats (like RGB), the scaling parameters are
stored in plane 0, not plane 1.
Fixes: fc04023fafec ("drm/vc4: Add support for YUV planes.")
Cc: stable@vger.kernel.org
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180507121303.5610-1-boris.brezillon@bootlin.com
|
|
Since commit a92a08499b1f "r8169: improve runtime pm in general and
suspend unused ports" interfaces w/o link are runtime-suspended after
10s. On systems where drivers take longer to load this can lead to the
situation that the interface is runtime-suspended already when it's
initially brought up.
This shouldn't be a problem because rtl_open() resumes MAC/PHY.
However with at least one chip version the interface doesn't properly
come up, as reported here:
https://bugzilla.kernel.org/show_bug.cgi?id=199549
The vendor driver uses a delay to give certain chip versions some
time to resume before starting the PHY configuration. So let's do
the same. I don't know which chip versions may be affected,
therefore apply this delay always.
This patch was reported to fix the issue for RTL8168h.
I was able to reproduce the issue on an Asus H310I-Plus which also
uses a RTL8168h. Also in my case the patch fixed the issue.
Reported-by: Slava Kardakov <ojab@ojab.ru>
Tested-by: Slava Kardakov <ojab@ojab.ru>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
drm_bridge_attach takes care of these assignments, so there is no need
to open-code them a second time.
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
|
|
Sandisk SSDs SD7SN6S256G and SD8SN8U256G are regularly locking up
regularly under sustained moderate load with NCQ enabled. Blacklist
for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: stable@vger.kernel.org
|
|
We need to return here instead of setting up the freed sdev device as a
transport.
Fixes: 907b6d14911d ("firmware: arm_scmi: add per-protocol channels support using idr objects")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"An earlier commit to add reset control for embedded ahci controllers
affected some of the hardware specific drivers and got reverted for
now.
Other than that, just per-device workarounds and trivial changes"
* 'for-4.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
driver core: add __printf verification to __ata_ehi_pushv_desc
ata: fix spelling mistake: "directon" -> "direction"
libata: Apply NOLPM quirk for SanDisk SD7UB3Q*G1001 SSDs
libata: Apply NOLPM quirk for SAMSUNG MZMPC128HBFU-000MV SSD
ata: ahci: mvebu: override ahci_stop_engine for mvebu AHCI
libahci: Allow drivers to override stop_engine
Revert "ata: ahci-platform: add reset control support"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Here are three pin control fixes.
The Intel fixes are the most serious and important things I had queued
since it affects a large portion of deployed Chromebooks.
- Two major fixes for the Intel Cherryview and Sunrisepoint pin
controllers, adjusting numberspaces so that they get aligned with
various messed-up numbers encoded into the BIOS.
- A fix for the Meson driver GPIO pin range"
* tag 'pinctrl-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sunrisepoint: Align GPIO number space with Windows
pinctrl: cherryview: Associate IRQ descriptors to irqdomain
pinctrl: meson-axg: fix the range of aobus bank
|