summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-09-14usb: testusb: Fix for showing the connection speedFaizel K B
testusb' application which uses 'usbtest' driver reports 'unknown speed' from the function 'find_testdev'. The variable 'entry->speed' was not updated from the application. The IOCTL mentioned in the FIXME comment can only report whether the connection is low speed or not. Speed is read using the IOCTL USBDEVFS_GET_SPEED which reports the proper speed grade. The call is implemented in the function 'handle_testdev' where the file descriptor was availble locally. Sample output is given below where 'high speed' is printed as the connected speed. sudo ./testusb -a high speed /dev/bus/usb/001/011 0 /dev/bus/usb/001/011 test 0, 0.000015 secs /dev/bus/usb/001/011 test 1, 0.194208 secs /dev/bus/usb/001/011 test 2, 0.077289 secs /dev/bus/usb/001/011 test 3, 0.170604 secs /dev/bus/usb/001/011 test 4, 0.108335 secs /dev/bus/usb/001/011 test 5, 2.788076 secs /dev/bus/usb/001/011 test 6, 2.594610 secs /dev/bus/usb/001/011 test 7, 2.905459 secs /dev/bus/usb/001/011 test 8, 2.795193 secs /dev/bus/usb/001/011 test 9, 8.372651 secs /dev/bus/usb/001/011 test 10, 6.919731 secs /dev/bus/usb/001/011 test 11, 16.372687 secs /dev/bus/usb/001/011 test 12, 16.375233 secs /dev/bus/usb/001/011 test 13, 2.977457 secs /dev/bus/usb/001/011 test 14 --> 22 (Invalid argument) /dev/bus/usb/001/011 test 17, 0.148826 secs /dev/bus/usb/001/011 test 18, 0.068718 secs /dev/bus/usb/001/011 test 19, 0.125992 secs /dev/bus/usb/001/011 test 20, 0.127477 secs /dev/bus/usb/001/011 test 21 --> 22 (Invalid argument) /dev/bus/usb/001/011 test 24, 4.133763 secs /dev/bus/usb/001/011 test 27, 2.140066 secs /dev/bus/usb/001/011 test 28, 2.120713 secs /dev/bus/usb/001/011 test 29, 0.507762 secs Signed-off-by: Faizel K B <faizel.kb@dicortech.com> Link: https://lore.kernel.org/r/20210902114444.15106-1-faizel.kb@dicortech.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14x86/mce: Avoid infinite loop for copy from user recoveryTony Luck
There are two cases for machine check recovery: 1) The machine check was triggered by ring3 (application) code. This is the simpler case. The machine check handler simply queues work to be executed on return to user. That code unmaps the page from all users and arranges to send a SIGBUS to the task that triggered the poison. 2) The machine check was triggered in kernel code that is covered by an exception table entry. In this case the machine check handler still queues a work entry to unmap the page, etc. but this will not be called right away because the #MC handler returns to the fix up code address in the exception table entry. Problems occur if the kernel triggers another machine check before the return to user processes the first queued work item. Specifically, the work is queued using the ->mce_kill_me callback structure in the task struct for the current thread. Attempting to queue a second work item using this same callback results in a loop in the linked list of work functions to call. So when the kernel does return to user, it enters an infinite loop processing the same entry for ever. There are some legitimate scenarios where the kernel may take a second machine check before returning to the user. 1) Some code (e.g. futex) first tries a get_user() with page faults disabled. If this fails, the code retries with page faults enabled expecting that this will resolve the page fault. 2) Copy from user code retries a copy in byte-at-time mode to check whether any additional bytes can be copied. On the other side of the fence are some bad drivers that do not check the return value from individual get_user() calls and may access multiple user addresses without noticing that some/all calls have failed. Fix by adding a counter (current->mce_count) to keep track of repeated machine checks before task_work() is called. First machine check saves the address information and calls task_work_add(). Subsequent machine checks before that task_work call back is executed check that the address is in the same page as the first machine check (since the callback will offline exactly one page). Expected worst case is four machine checks before moving on (e.g. one user access with page faults disabled, then a repeat to the same address with page faults enabled ... repeat in copy tail bytes). Just in case there is some code that loops forever enforce a limit of 10. [ bp: Massage commit message, drop noinstr, fix typo, extend panic messages. ] Fixes: 5567d11c21a1 ("x86/mce: Send #MC singal from task work") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com
2021-09-14rtc: cmos: Disable irq around direct invocation of cmos_interrupt()Chris Wilson
As previously noted in commit 66e4f4a9cc38 ("rtc: cmos: Use spin_lock_irqsave() in cmos_interrupt()"): <4>[ 254.192378] WARNING: inconsistent lock state <4>[ 254.192384] 5.12.0-rc1-CI-CI_DRM_9834+ #1 Not tainted <4>[ 254.192396] -------------------------------- <4>[ 254.192400] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. <4>[ 254.192409] rtcwake/5309 [HC0[0]:SC0[0]:HE1:SE1] takes: <4>[ 254.192429] ffffffff8263c5f8 (rtc_lock){?...}-{2:2}, at: cmos_interrupt+0x18/0x100 <4>[ 254.192481] {IN-HARDIRQ-W} state was registered at: <4>[ 254.192488] lock_acquire+0xd1/0x3d0 <4>[ 254.192504] _raw_spin_lock+0x2a/0x40 <4>[ 254.192519] cmos_interrupt+0x18/0x100 <4>[ 254.192536] rtc_handler+0x1f/0xc0 <4>[ 254.192553] acpi_ev_fixed_event_detect+0x109/0x13c <4>[ 254.192574] acpi_ev_sci_xrupt_handler+0xb/0x28 <4>[ 254.192596] acpi_irq+0x13/0x30 <4>[ 254.192620] __handle_irq_event_percpu+0x43/0x2c0 <4>[ 254.192641] handle_irq_event_percpu+0x2b/0x70 <4>[ 254.192661] handle_irq_event+0x2f/0x50 <4>[ 254.192680] handle_fasteoi_irq+0x9e/0x150 <4>[ 254.192693] __common_interrupt+0x76/0x140 <4>[ 254.192715] common_interrupt+0x96/0xc0 <4>[ 254.192732] asm_common_interrupt+0x1e/0x40 <4>[ 254.192750] _raw_spin_unlock_irqrestore+0x38/0x60 <4>[ 254.192767] resume_irqs+0xba/0xf0 <4>[ 254.192786] dpm_resume_noirq+0x245/0x3d0 <4>[ 254.192811] suspend_devices_and_enter+0x230/0xaa0 <4>[ 254.192835] pm_suspend.cold.8+0x301/0x34a <4>[ 254.192859] state_store+0x7b/0xe0 <4>[ 254.192879] kernfs_fop_write_iter+0x11d/0x1c0 <4>[ 254.192899] new_sync_write+0x11d/0x1b0 <4>[ 254.192916] vfs_write+0x265/0x390 <4>[ 254.192933] ksys_write+0x5a/0xd0 <4>[ 254.192949] do_syscall_64+0x33/0x80 <4>[ 254.192965] entry_SYSCALL_64_after_hwframe+0x44/0xae <4>[ 254.192986] irq event stamp: 43775 <4>[ 254.192994] hardirqs last enabled at (43775): [<ffffffff81c00c42>] asm_sysvec_apic_timer_interrupt+0x12/0x20 <4>[ 254.193023] hardirqs last disabled at (43774): [<ffffffff81aa691a>] sysvec_apic_timer_interrupt+0xa/0xb0 <4>[ 254.193049] softirqs last enabled at (42548): [<ffffffff81e00342>] __do_softirq+0x342/0x48e <4>[ 254.193074] softirqs last disabled at (42543): [<ffffffff810b45fd>] irq_exit_rcu+0xad/0xd0 <4>[ 254.193101] other info that might help us debug this: <4>[ 254.193107] Possible unsafe locking scenario: <4>[ 254.193112] CPU0 <4>[ 254.193117] ---- <4>[ 254.193121] lock(rtc_lock); <4>[ 254.193137] <Interrupt> <4>[ 254.193142] lock(rtc_lock); <4>[ 254.193156] *** DEADLOCK *** <4>[ 254.193161] 6 locks held by rtcwake/5309: <4>[ 254.193174] #0: ffff888104861430 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x5a/0xd0 <4>[ 254.193232] #1: ffff88810f823288 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xe7/0x1c0 <4>[ 254.193282] #2: ffff888100cef3c0 (kn->active#285 <7>[ 254.192706] i915 0000:00:02.0: [drm:intel_modeset_setup_hw_state [i915]] [CRTC:51:pipe A] hw state readout: disabled <4>[ 254.193307] ){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xf0/0x1c0 <4>[ 254.193333] #3: ffffffff82649fa8 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend.cold.8+0xce/0x34a <4>[ 254.193387] #4: ffffffff827a2108 (acpi_scan_lock){+.+.}-{3:3}, at: acpi_suspend_begin+0x47/0x70 <4>[ 254.193433] #5: ffff8881019ea178 (&dev->mutex){....}-{3:3}, at: device_resume+0x68/0x1e0 <4>[ 254.193485] stack backtrace: <4>[ 254.193492] CPU: 1 PID: 5309 Comm: rtcwake Not tainted 5.12.0-rc1-CI-CI_DRM_9834+ #1 <4>[ 254.193514] Hardware name: Google Soraka/Soraka, BIOS MrChromebox-4.10 08/25/2019 <4>[ 254.193524] Call Trace: <4>[ 254.193536] dump_stack+0x7f/0xad <4>[ 254.193567] mark_lock.part.47+0x8ca/0xce0 <4>[ 254.193604] __lock_acquire+0x39b/0x2590 <4>[ 254.193626] ? asm_sysvec_apic_timer_interrupt+0x12/0x20 <4>[ 254.193660] lock_acquire+0xd1/0x3d0 <4>[ 254.193677] ? cmos_interrupt+0x18/0x100 <4>[ 254.193716] _raw_spin_lock+0x2a/0x40 <4>[ 254.193735] ? cmos_interrupt+0x18/0x100 <4>[ 254.193758] cmos_interrupt+0x18/0x100 <4>[ 254.193785] cmos_resume+0x2ac/0x2d0 <4>[ 254.193813] ? acpi_pm_set_device_wakeup+0x1f/0x110 <4>[ 254.193842] ? pnp_bus_suspend+0x10/0x10 <4>[ 254.193864] pnp_bus_resume+0x5e/0x90 <4>[ 254.193885] dpm_run_callback+0x5f/0x240 <4>[ 254.193914] device_resume+0xb2/0x1e0 <4>[ 254.193942] ? pm_dev_err+0x25/0x25 <4>[ 254.193974] dpm_resume+0xea/0x3f0 <4>[ 254.194005] dpm_resume_end+0x8/0x10 <4>[ 254.194030] suspend_devices_and_enter+0x29b/0xaa0 <4>[ 254.194066] pm_suspend.cold.8+0x301/0x34a <4>[ 254.194094] state_store+0x7b/0xe0 <4>[ 254.194124] kernfs_fop_write_iter+0x11d/0x1c0 <4>[ 254.194151] new_sync_write+0x11d/0x1b0 <4>[ 254.194183] vfs_write+0x265/0x390 <4>[ 254.194207] ksys_write+0x5a/0xd0 <4>[ 254.194232] do_syscall_64+0x33/0x80 <4>[ 254.194251] entry_SYSCALL_64_after_hwframe+0x44/0xae <4>[ 254.194274] RIP: 0033:0x7f07d79691e7 <4>[ 254.194293] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 <4>[ 254.194312] RSP: 002b:00007ffd9cc2c768 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 <4>[ 254.194337] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f07d79691e7 <4>[ 254.194352] RDX: 0000000000000004 RSI: 0000556ebfc63590 RDI: 000000000000000b <4>[ 254.194366] RBP: 0000556ebfc63590 R08: 0000000000000000 R09: 0000000000000004 <4>[ 254.194379] R10: 0000556ebf0ec2a6 R11: 0000000000000246 R12: 0000000000000004 which breaks S3-resume on fi-kbl-soraka presumably as that's slow enough to trigger the alarm during the suspend. Fixes: 6950d046eb6e ("rtc: cmos: Replace spin_lock_irqsave with spin_lock in hard IRQ") References: 66e4f4a9cc38 ("rtc: cmos: Use spin_lock_irqsave() in cmos_interrupt()"): Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Xiaofei Tan <tanxiaofei@huawei.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210305122140.28774-1-chris@chris-wilson.co.uk
2021-09-14serial: mvebu-uart: fix driver's tx_empty callbackPali Rohár
Driver's tx_empty callback should signal when the transmit shift register is empty. So when the last character has been sent. STAT_TX_FIFO_EMP bit signals only that HW transmit FIFO is empty, which happens when the last byte is loaded into transmit shift register. STAT_TX_EMP bit signals when the both HW transmit FIFO and transmit shift register are empty. So replace STAT_TX_FIFO_EMP check by STAT_TX_EMP in mvebu_uart_tx_empty() callback function. Fixes: 30530791a7a0 ("serial: mvebu-uart: initial support for Armada-3700 serial port") Cc: stable <stable@vger.kernel.org> Signed-off-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20210911132017.25505-1-pali@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14serial: 8250: 8250_omap: Fix RX_LVL register offsetNishanth Menon
Commit b67e830d38fa ("serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs") introduced fixup including a register read to RX_LVL, however, we should be using word offset than byte offset since our registers are on 4 byte boundary (port.regshift = 2) for 8250_omap. Fixes: b67e830d38fa ("serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs") Cc: stable <stable@vger.kernel.org> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Nishanth Menon <nm@ti.com> Link: https://lore.kernel.org/r/20210903050550.29050-1-nm@ti.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14drm/i915: Get PM ref before accessing HW registerVinay Belgaumkar
Seeing these errors when GT is likely in suspend state- "RPM wakelock ref not held during HW access" Ensure GT is awake before trying to access HW registers. Avoid reading the register if that is not the case. Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Fixes: 41e5c17ebfc2 ("drm/i915/guc/slpc: Sysfs hooks for SLPC") Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210907232704.12982-1-vinay.belgaumkar@intel.com (cherry picked from commit f25e3908b9cd4a3fe819e9bdcdde58f20bacb34c) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-09-14drm/i915: Release ctx->syncobj on final put, not on ctx closeDaniel Vetter
gem context refcounting is another exercise in least locking design it seems, where most things get destroyed upon context closure (which can race with anything really). Only the actual memory allocation and the locks survive while holding a reference. This tripped up Jason when reimplementing the single timeline feature in commit 00dae4d3d35d4f526929633b76e00b0ab4d3970d Author: Jason Ekstrand <jason@jlekstrand.net> Date: Thu Jul 8 10:48:12 2021 -0500 drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4) We could fix the bug by holding ctx->mutex in execbuf and clear the pointer (again while holding the mutex) context_close, but it's cleaner to just make the context object actually invariant over its _entire_ lifetime. This way any other ioctl that's potentially racing, but holding a full reference, can still rely on ctx->syncobj being an immutable pointer. Which without this change, is not the case. Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Fixes: 00dae4d3d35d ("drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)") Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-2-daniel.vetter@ffwll.ch (cherry picked from commit c238980efd3b35af70fc926066cf7440f50a97a9) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-09-14drm/i915/gem: Fix the mman selftestThomas Hellström
Using the I915_MMAP_TYPE_FIXED mmap type requires the TTM backend, so for that mmap type, use __i915_gem_object_create_user() instead of i915_gem_object_create_internal(), as we really want to tests objects mmap-able by user-space. This also means that the out-of-space error happens at object creation and returns -ENXIO rather than -ENOSPC, so fix the code up to expect that on out-of-offset-space errors. Finally only use I915_MMAP_TYPE_FIXED for LMEM and SMEM for now if testing on LMEM-capable devices. For stolen LMEM, we still take the same path as for integrated, as that haven't been moved over to TTM yet, and user-space should not be able to create out of stolen LMEM anyway. v2: - Check the presence of the obj->ops->mmap_offset callback rather than hardcoding the supported mmap regions in can_mmap() (Maarten Lankhorst) Fixes: 7961c5b60f23 ("drm/i915: Add TTM offset argument to mmap.") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210831122931.157536-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 450cede7f3804ca7f8b3da210ebefa61c0958f22) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-09-14drm/i915/guc: drop guc_communication_enabledDaniele Ceraolo Spurio
The function is only used from within GEM_BUG_ON(), which is causing warnings with Wunneeded-internal-declaration in some builds. Since the function is a simple wrapper around a CT function, we can just call the CT function directly instead. Fixes: 1fb12c587152 ("drm/i915/guc: skip disabling CTBs before sanitizing the GuC") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210823163137.19770-1-daniele.ceraolospurio@intel.com (cherry picked from commit 5db1856781e45c9610f7652a19cc656b984235e7) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-09-14drm/i915/dp: Use max params for panels < eDP 1.4Kai-Heng Feng
Users reported that after commit 2bbd6dba84d4 ("drm/i915: Try to use fast+narrow link on eDP again and fall back to the old max strategy on failure"), the screen starts to have wobbly effect. Commit a5c936add6a2 ("drm/i915/dp: Use slow and wide link training for everything") doesn't help either, that means the affected eDP 1.2 panels only work with max params. So use max params for panels < eDP 1.4 as Windows does to solve the issue. v3: - Do the eDP rev check in intel_edp_init_dpcd() v2: - Check eDP 1.4 instead of DPCD 1.1 to apply max params Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3714 Fixes: 2bbd6dba84d4 ("drm/i915: Try to use fast+narrow link on eDP again and fall back to the old max strategy on failure") Fixes: a5c936add6a2 ("drm/i915/dp: Use slow and wide link training for everything") Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210820075301.693099-1-kai.heng.feng@canonical.com (cherry picked from commit d7f213c131adf0bec8b731553eb82990cdac265d) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-09-14drm/i915/dp: return proper DPRX link training resultLee Shawn C
After DPRX link training, intel_dp_link_train_phy() did not return the training result properly. If link training failed, i915 driver would not run into link train fallback function. And no hotplug uevent would be received by user space application. Fixes: b30edfd8d0b4 ("drm/i915: Switch to LTTPR non-transparent mode link training") Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Cooper Chiou <cooper.chiou@intel.com> Cc: William Tseng <william.tseng@intel.com> Signed-off-by: Lee Shawn C <shawn.c.lee@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210706152541.25021-1-shawn.c.lee@intel.com (cherry picked from commit dab1b47e57e053b2a02c22ead8e7449f79961335) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-09-14PM: base: power: don't try to use non-existing RTC for storing dataJuergen Gross
If there is no legacy RTC device, don't try to use it for storing trace data across suspend/resume. Cc: <stable@vger.kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20210903084937.19392-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-14xen/balloon: use a kernel thread instead a workqueueJuergen Gross
Today the Xen ballooning is done via delayed work in a workqueue. This might result in workqueue hangups being reported in case of large amounts of memory are being ballooned in one go (here 16GB): BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 stuck for 64s! Showing busy workqueues and worker pools: workqueue events: flags=0x0 pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=2/256 refcnt=3 in-flight: 229:balloon_process pending: cache_reap workqueue events_freezable_power_: flags=0x84 pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 pending: disk_events_workfn workqueue mm_percpu_wq: flags=0x8 pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 pending: vmstat_update pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=64s workers=3 idle: 2222 43 This can easily be avoided by using a dedicated kernel thread for doing the ballooning work. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210827123206.15429-1-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-14staging: greybus: uart: fix tty use after freeJohan Hovold
User space can hold a tty open indefinitely and tty drivers must not release the underlying structures until the last user is gone. Switch to using the tty-port reference counter to manage the life time of the greybus tty state to avoid use after free after a disconnect. Fixes: a18e15175708 ("greybus: more uart work") Cc: stable@vger.kernel.org # 4.9 Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20210906124538.22358-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14coresight: syscfg: Fix compiler warningJian Cai
This fixes warnings with -Wimplicit-function-declaration, e.g. drivers/hwtracing/coresight/coresight-syscfg.c:455:15: error: implicit declaration of function 'kzalloc' [-Werror, -Wimplicit-function-declaration] csdev_item = kzalloc(sizeof(struct cscfg_registered_csdev), GFP_KERNEL); Link: https://lore.kernel.org/r/20210830172820.2840433-1-jiancai@google.com Fixes: 85e2414c518a ("coresight: syscfg: Initial coresight system configuration") Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jian Cai <jiancai@google.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20210913164613.1675791-2-mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14nvmem: core: Add stubs for nvmem_cell_read_variable_le_u32/64 if !CONFIG_NVMEMDouglas Anderson
When I added nvmem_cell_read_variable_le_u32() and nvmem_cell_read_variable_le_u64() I forgot to add the "static inline" stub functions for when CONFIG_NVMEM wasn't defined. Add them now. This was causing problems with randconfig builds that compiled `drivers/soc/qcom/cpr.c`. Fixes: 6feba6a62c57 ("PM: AVS: qcom-cpr: Use nvmem_cell_read_variable_le_u32()") Fixes: a28e824fb827 ("nvmem: core: Add functions to make number reading easy") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20210913160551.12907-1-srinivas.kandagatla@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14binder: make sure fd closes completeTodd Kjos
During BC_FREE_BUFFER processing, the BINDER_TYPE_FDA object cleanup may close 1 or more fds. The close operations are completed using the task work mechanism -- which means the thread needs to return to userspace or the file object may never be dereferenced -- which can lead to hung processes. Force the binder thread back to userspace if an fd is closed during BC_FREE_BUFFER handling. Fixes: 80cd795630d6 ("binder: fix use-after-free due to ksys_close() during fdget()") Cc: stable <stable@vger.kernel.org> Reviewed-by: Martijn Coenen <maco@android.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Todd Kjos <tkjos@google.com> Link: https://lore.kernel.org/r/20210830195146.587206-1-tkjos@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14binder: fix freeze raceLi Li
Currently cgroup freezer is used to freeze the application threads, and BINDER_FREEZE is used to freeze the corresponding binder interface. There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any existing transactions to drain out before actually freezing the binder interface. But freezing an app requires 2 steps, freezing the binder interface with ioctl(BINDER_FREEZE) and then freezing the application main threads with cgroupfs. This is not an atomic operation. The following race issue might happen. 1) Binder interface is frozen by ioctl(BINDER_FREEZE); 2) Main thread A initiates a new sync binder transaction to process B; 3) Main thread A is frozen by "echo 1 > cgroup.freeze"; 4) The response from process B reaches the frozen thread, which will unexpectedly fail. This patch provides a mechanism to check if there's any new pending transaction happening between ioctl(BINDER_FREEZE) and freezing the main thread. If there's any, the main thread freezing operation can be rolled back to finish the pending transaction. Furthermore, the response might reach the binder driver before the rollback actually happens. That will still cause failed transaction. As the other process doesn't wait for another response of the response, the response transaction failure can be fixed by treating the response transaction like an oneway/async one, allowing it to reach the frozen thread. And it will be consumed when the thread gets unfrozen later. NOTE: This patch reuses the existing definition of struct binder_frozen_status_info but expands the bit assignments of __u32 member sync_recv. To ensure backward compatibility, bit 0 of sync_recv still indicates there's an outstanding sync binder transaction. This patch adds new information to bit 1 of sync_recv, indicating the binder transaction happens exactly when there's a race. If an existing userspace app runs on a new kernel, a sync binder call will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still return the expected value (true). The app just doesn't check bit 1 intentionally so it doesn't have the ability to tell if there's a race. This behavior is aligned with what happens on an old kernel which doesn't set bit 1 at all. A new userspace app can 1) check bit 0 to know if there's a sync binder transaction happened when being frozen - same as before; and 2) check bit 1 to know if that sync binder transaction happened exactly when there's a race - a new information for rollback decision. the same time, confirmed the pending transactions succeeded. Fixes: 432ff1e91694 ("binder: BINDER_FREEZE ioctl") Acked-by: Todd Kjos <tkjos@google.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Li Li <dualli@google.com> Test: stress test with apps being frozen and initiating binder calls at Link: https://lore.kernel.org/r/20210910164210.2282716-2-dualli@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14scsi: bsg: Fix device unregistrationZenghui Yu
device_initialize() is used to take a refcount on the device. However, put_device() is not called during device teardown. This leads to a leak of private data of the driver core, dev_name(), etc. This is reported by kmemleak at boot time if we compile kernel with DEBUG_TEST_DRIVER_REMOVE. Fix memory leaks during unregistration and implement a release function. Link: https://lore.kernel.org/r/20210911105306.1511-1-yuzenghui@huawei.com Fixes: ead09dd3aed5 ("scsi: bsg: Simplify device registration") Reviewed-by: Johan Hovold <johan@kernel.org> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-14scsi: sd: Make sd_spinup_disk() less noisyHeiner Kallweit
sd_spinup_disk() is a little bit noisy after commit 848ade90ba9c ("scsi: sd: Do not exit sd_spinup_disk() quietly"): scsi 0:0:0:0: Direct-Access Multiple Card Reader 1.00 PQ: 0 ANSI: 0 sd 0:0:0:0: Attached scsi generic sg0 type 0 sd 0:0:0:0: [sda] Media removed, stopped polling sd 0:0:0:0: [sda] Media removed, stopped polling sd 0:0:0:0: [sda] Attached SCSI removable disk sd 0:0:0:0: [sda] Media removed, stopped polling There's not really a benefit in printing the same message multiple times. Therefore print it only if media_present was previously set. Link: https://lore.kernel.org/r/a2d0a249-6035-9697-626a-e14ec50ef6ee@gmail.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: ufs: ufs-pci: Fix Intel LKF link stabilityAdrian Hunter
Intel LKF can experience link errors. Make fixes to increase link stability, especially when switching to high speed modes. Link: https://lore.kernel.org/r/20210831145317.26306-1-adrian.hunter@intel.com Fixes: b2c57925df1f ("scsi: ufs: ufs-pci: Add support for Intel LKF") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: mpt3sas: Clean up some inconsistent indentingColin Ian King
There are a couple of statements where the indentation is not correct, clean these up. Remove a redundant break statement. Link: https://lore.kernel.org/r/20210902224215.57286-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: megaraid: Clean up some inconsistent indentingColin Ian King
There are a few statements where the indentation is not correct, clean these up. Link: https://lore.kernel.org/r/20210902223643.56979-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: sr: Fix spelling mistake "does'nt" -> "doesn't"Colin Ian King
There is a spelling mistake in a literal string. Fix it. Link: https://lore.kernel.org/r/20210826115714.11844-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: Remove SCSI CDROM MAINTAINERS entryJens Axboe
There's little point in keeping this one separately maintained these days, so just remove the entry and it'll fall under the SCSI subsystem where it belongs. Link: https://lore.kernel.org/r/c5e12bd1-10de-634c-d6b3-dac79ed01af5@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: megaraid: Fix Coccinelle warningjing yangyang
WARNING !A || A && B is equivalent to !A || B This issue was detected with the help of Coccinelle. Link: https://lore.kernel.org/r/20210820030805.12383-1-jing.yangyang@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: jing yangyang <jing.yangyang@zte.com.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: ncr53c8xx: Remove unused retrieve_from_waiting_list() functionHelge Deller
Drop retrieve_from_waiting_list() to avoid this warning: drivers/scsi/ncr53c8xx.c:8000:26: warning: ‘retrieve_from_waiting_list’ defined but not used [-Wunused-function] Link: https://lore.kernel.org/r/YTfS/LH5vCN6afDW@ls3530 Fixes: 1c22e327545c ("scsi: ncr53c8xx: Remove unused code") Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: elx: efct: Do not hold lock while calling fc_vport_terminate()James Smart
Smatch checker reported the following error: drivers/base/power/sysfs.c:833 dpm_sysfs_remove() warn: sleeping in atomic context With a calling sequence of: efct_lio_npiv_drop_nport() <- disables preempt -> fc_vport_terminate() -> device_del() -> dpm_sysfs_remove() Issue is efct_lio_npiv_drop_nport() is making the fc_vport_terminate() call while holding a lock w/ ipl raised. It is unnecessary to hold the lock over this call, shift where the lock is taken. Link: https://lore.kernel.org/r/20210907165225.10821-1-jsmart2021@gmail.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com> Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: target: Fix the pgr/alua_support_store functionsMaurizio Lombardi
Commit 356ba2a8bc8d ("scsi: target: tcmu: Make pgr_support and alua_support attributes writable") introduced support for changeable alua_support and pgr_support target attributes. These can only be changed if the backstore is user-backed, otherwise the kernel returns -EINVAL. This triggers a warning in the targetcli/rtslib code when performing a target restore that includes non-userbacked backstores: # targetctl restore Storage Object block/storage1: Cannot set attribute alua_support: [Errno 22] Invalid argument, skipped Storage Object block/storage1: Cannot set attribute pgr_support: [Errno 22] Invalid argument, skipped Fix this warning by returning an error code only if we are really going to flip the PGR/ALUA bit in the transport_flags field, otherwise we will do nothing and return success. Return ENOSYS instead of EINVAL if the pgr/alua attributes can not be changed, this way it will be possible for userspace to understand if the operation failed because an invalid value has been passed to strtobool() or because the attributes are fixed. Fixes: 356ba2a8bc8d ("scsi: target: tcmu: Make pgr_support and alua_support attributes writable") Link: https://lore.kernel.org/r/20210906151809.52811-1-mlombard@redhat.com Reviewed-by: Bodo Stroesser <bostroesser@gmail.com> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: sd_zbc: Ensure buffer size is aligned to SECTOR_SIZENaohiro Aota
Reporting zones on a SCSI device sometimes fail with the following error: [76248.516390] ata16.00: invalid transfer count 131328 [76248.523618] sd 15:0:0:0: [sda] REPORT ZONES start lba 536870912 failed The error (from drivers/ata/libata-scsi.c:ata_scsi_zbc_in_xlat()) indicates that buffer size is not aligned to SECTOR_SIZE. This happens when the __vmalloc() failed. Consider we are reporting 4096 zones, then we will have "bufsize = roundup((4096 + 1) * 64, SECTOR_SIZE)" = (513 * 512) = 262656. Then, __vmalloc() failure halves the bufsize to 131328, which is no longer aligned to SECTOR_SIZE. Use rounddown() to ensure the size is always aligned to SECTOR_SIZE and fix the comment as well. Link: https://lore.kernel.org/r/20210906140642.2267569-1-naohiro.aota@wdc.com Fixes: 23a50861adda ("scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer()") Cc: stable@vger.kernel.org # 5.5+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: sd: Free scsi_disk device via put_device()Ming Lei
After a device is initialized via device_initialize() it should be freed via put_device(). sd_probe() currently gets this wrong, fix it up. Link: https://lore.kernel.org/r/20210906090112.531442-1-ming.lei@redhat.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: mpt3sas: Call cpu_relax() before calling udelay()Sreekanth Reddy
Call cpu_relax() while waiting for the current blk-mq polling instance to complete. Link: https://lore.kernel.org/r/20210901152542.27866-1-sreekanth.reddy@broadcom.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: iscsi: Adjust iface sysfs attr detectionBaokun Li
ISCSI_NET_PARAM_IFACE_ENABLE belongs to enum iscsi_net_param instead of iscsi_iface_param so move it to ISCSI_NET_PARAM. Otherwise, when we call into the driver, we might not match and return that we don't want attr visible in sysfs. Found in code review. Link: https://lore.kernel.org/r/20210901085336.2264295-1-libaokun1@huawei.com Fixes: e746f3451ec7 ("scsi: iscsi: Fix iface sysfs attr detection") Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: ufs: ufshpb: Remove unused parametersChanWoo Lee
The following parameters are not used in the function. Remove them. *func(): ufshpb_set_hpb_read_to_upiu -> struct ufshpb_lu *hpb -> u32 lpn Link: https://lore.kernel.org/r/20210901025617.31174-1-cw9316.lee@samsung.com Reviewed-by: Daejun Park <daejun7.park@samsung.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: lpfc: Remove unneeded variableChi Minghao
Fix the following coccicheck REVIEW: ./drivers/scsi/lpfc/lpfc_scsi.c:1498:9-12 REVIEW Unneeded variable Link: https://lore.kernel.org/r/20210831114058.17817-1-lv.ruyi@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cm> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Chi Minghao <chi.minghao@zte.com.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: lpfc: Fix compilation errors on kernels with no CONFIG_DEBUG_FSJames Smart
The Kernel test robot flagged the following warning: ".../lpfc_init.c:7788:35: error: 'struct lpfc_sli4_hba' has no member named 'c_stat'" Reviewing this issue highlighted that one of the recent patches caused the driver to no longer compile cleanly if CONFIG_DEBUG_FS is not set. Correct the different areas that are failing to compile. Link: https://lore.kernel.org/r/20210908050927.37275-1-jsmart2021@gmail.com Fixes: 02243836ad6f ("scsi: lpfc: Add support for the CM framework") Reviewed-by: Nathan Chancellor <nathan@kernel.org> Build-tested-by: Nathan Chancellor <nathan@kernel.org> Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: lpfc: Fix CPU to/from endian warnings introduced by ELS processingJames Smart
The kernel test robot reported the following sparse warning: ".../lpfc_els.c:3984:25: sparse: sparse: cast from restricted __be16" For the error being flagged, using be32_to_cpu() on a be16 data type, it was simple enough. But a review of other elements and warnings were also evaluated. This patch corrected several items in the original patch: - Using be32_to_cpu() on a be16 data type - cpu_to_le32() used on a std uint32_t (CPU) data type. Note: This is a byte array, but stored in LE layout by hardware at 32-bit boundaries. So it possibly needed conversion. - Using cpu_to_le32() on a std uint16_t and assigned to a char typeA - Using le32_to_cpu() on a le16 type - Missing cpu_to_le16() on an assignment Link: https://lore.kernel.org/r/20210830231243.6227-1-jsmart2021@gmail.com Fixes: 9064aeb2df8e ("scsi: lpfc: Add EDC ELS support") Reported-by: kernel test robot <lkp@intel.com> Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: elx: efct: Fix void-pointer-to-enum-cast warning for efc_nport_topologyJames Smart
The kernel test robot flagged an warning for ".../efc_device.c:932:6: warning: cast to smaller integer type 'enum efc_nport_topology' from 'void *'" For the topology events, the "arg" field is generically defined as a void * and is used to pass different arguments. Most of the arguments are pointers to data structures. But for the EFC_EVT_NPORT_TOPOLOGY_NOTIFY event, the argument is an enum value, and the code is typecasting the void * to an enum generating the warning. Fix by converting the EFC_EVT_NPORT_TOPOLOGY_NOTIFY event to pass a pointer to the enum, thus it's a straight-forward pointer dereference in the event handler. Link: https://lore.kernel.org/r/20210830231050.5951-1-jsmart2021@gmail.com Fixes: 202bfdffae27 ("scsi: elx: libefc: FC node ELS and state handling") Reported-by: kernel test robot <lkp@intel.com> Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com> Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13scsi: st: Add missing break in switch statement in st_ioctl()Nathan Chancellor
Clang + -Wimplicit-fallthrough warns: drivers/scsi/st.c:3831:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] default: ^ drivers/scsi/st.c:3831:2: note: insert 'break;' to avoid fall-through default: ^ break; 1 warning generated. Clang's -Wimplicit-fallthrough is a little bit more pedantic than GCC's, requiring every case block to end in break, return, or fallthrough, rather than allowing implicit fallthroughs to cases that just contain break or return. Add a break so that there is no more warning, as has been done all over the tree already. Link: https://lore.kernel.org/r/20210817235531.172995-1-nathan@kernel.org Fixes: 2e27f576abc6 ("scsi: scsi_ioctl: Call scsi_cmd_ioctl() from scsi_ioctl()") Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-13io_uring: pin SQPOLL data before unlocking ring lockJens Axboe
We need to re-check sqd->thread after we've dropped the lock. Pin the sqd before doing the lockdep lock dance, and check if the thread is alive after that. It's either NULL or alive, as the SQPOLL thread cannot exit without holding the same sqd->lock. Reported-and-tested-by: syzbot+337de45f13a4fd54d708@syzkaller.appspotmail.com Fixes: fa84693b3c89 ("io_uring: ensure IORING_REGISTER_IOWQ_MAX_WORKERS works with SQPOLL") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-13bpf, selftests: Add test case for mixed cgroup v1/v2Daniel Borkmann
Minimal selftest which implements a small BPF policy program to the connect(2) hook which rejects TCP connection requests to port 60123 with EPERM. This is being attached to a non-root cgroup v2 path. The test asserts that this works under cgroup v2-only and under a mixed cgroup v1/v2 environment where net_classid is set in the former case. Before fix: # ./test_progs -t cgroup_v1v2 test_cgroup_v1v2:PASS:server_fd 0 nsec test_cgroup_v1v2:PASS:client_fd 0 nsec test_cgroup_v1v2:PASS:cgroup_fd 0 nsec test_cgroup_v1v2:PASS:server_fd 0 nsec run_test:PASS:skel_open 0 nsec run_test:PASS:prog_attach 0 nsec test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec run_test:PASS:skel_open 0 nsec run_test:PASS:prog_attach 0 nsec run_test:PASS:join_classid 0 nsec (network_helpers.c:219: errno: None) Unexpected success to connect to server test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 0) #27 cgroup_v1v2:FAIL Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED After fix: # ./test_progs -t cgroup_v1v2 #27 cgroup_v1v2:OK Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210913230759.2313-3-daniel@iogearbox.net
2021-09-13bpf, selftests: Add cgroup v1 net_cls classid helpersDaniel Borkmann
Minimal set of helpers for net_cls classid cgroupv1 management in order to set an id, join from a process, initiate setup and teardown. cgroupv2 helpers are left as-is, but reused where possible. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210913230759.2313-2-daniel@iogearbox.net
2021-09-13bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed modeDaniel Borkmann
Fix cgroup v1 interference when non-root cgroup v2 BPF programs are used. Back in the days, commit bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") embedded per-socket cgroup information into sock->sk_cgrp_data and in order to save 8 bytes in struct sock made both mutually exclusive, that is, when cgroup v1 socket tagging (e.g. net_cls/net_prio) is used, then cgroup v2 falls back to the root cgroup in sock_cgroup_ptr() (&cgrp_dfl_root.cgrp). The assumption made was "there is no reason to mix the two and this is in line with how legacy and v2 compatibility is handled" as stated in bd1060a1d671. However, with Kubernetes more widely supporting cgroups v2 as well nowadays, this assumption no longer holds, and the possibility of the v1/v2 mixed mode with the v2 root fallback being hit becomes a real security issue. Many of the cgroup v2 BPF programs are also used for policy enforcement, just to pick _one_ example, that is, to programmatically deny socket related system calls like connect(2) or bind(2). A v2 root fallback would implicitly cause a policy bypass for the affected Pods. In production environments, we have recently seen this case due to various circumstances: i) a different 3rd party agent and/or ii) a container runtime such as [0] in the user's environment configuring legacy cgroup v1 net_cls tags, which triggered implicitly mentioned root fallback. Another case is Kubernetes projects like kind [1] which create Kubernetes nodes in a container and also add cgroup namespaces to the mix, meaning programs which are attached to the cgroup v2 root of the cgroup namespace get attached to a non-root cgroup v2 path from init namespace point of view. And the latter's root is out of reach for agents on a kind Kubernetes node to configure. Meaning, any entity on the node setting cgroup v1 net_cls tag will trigger the bypass despite cgroup v2 BPF programs attached to the namespace root. Generally, this mutual exclusiveness does not hold anymore in today's user environments and makes cgroup v2 usage from BPF side fragile and unreliable. This fix adds proper struct cgroup pointer for the cgroup v2 case to struct sock_cgroup_data in order to address these issues; this implicitly also fixes the tradeoffs being made back then with regards to races and refcount leaks as stated in bd1060a1d671, and removes the fallback, so that cgroup v2 BPF programs always operate as expected. [0] https://github.com/nestybox/sysbox/ [1] https://kind.sigs.k8s.io/ Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/bpf/20210913230759.2313-1-daniel@iogearbox.net
2021-09-13cifs: fix incorrect kernel doc commentsSteve French
Correct kernel-doc comments pointed out by the automated kernel test robot. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-13bpf: Add oversize check before call kvcalloc()Bixuan Cui
Commit 7661809d493b ("mm: don't allow oversized kvmalloc() calls") add the oversize check. When the allocation is larger than what kmalloc() supports, the following warning triggered: WARNING: CPU: 0 PID: 8408 at mm/util.c:597 kvmalloc_node+0x108/0x110 mm/util.c:597 Modules linked in: CPU: 0 PID: 8408 Comm: syz-executor221 Not tainted 5.14.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:kvmalloc_node+0x108/0x110 mm/util.c:597 Call Trace: kvmalloc include/linux/mm.h:806 [inline] kvmalloc_array include/linux/mm.h:824 [inline] kvcalloc include/linux/mm.h:829 [inline] check_btf_line kernel/bpf/verifier.c:9925 [inline] check_btf_info kernel/bpf/verifier.c:10049 [inline] bpf_check+0xd634/0x150d0 kernel/bpf/verifier.c:13759 bpf_prog_load kernel/bpf/syscall.c:2301 [inline] __sys_bpf+0x11181/0x126e0 kernel/bpf/syscall.c:4587 __do_sys_bpf kernel/bpf/syscall.c:4691 [inline] __se_sys_bpf kernel/bpf/syscall.c:4689 [inline] __x64_sys_bpf+0x78/0x90 kernel/bpf/syscall.c:4689 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-by: syzbot+f3e749d4c662818ae439@syzkaller.appspotmail.com Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210911005557.45518-1-cuibixuan@huawei.com
2021-09-13tools: compiler-gcc.h: Guard error attribute use with __has_attributeNathan Chancellor
When building objtool with HOSTCC=clang, there are several errors along the lines of orc_dump.c:201:28: error: unknown attribute 'error' ignored [-Werror,-Wunknown-attributes] This occurs after commit 4e59869aa655 ("compiler-gcc.h: drop checks for older GCC versions"), which removed the GCC_VERSION gating. The removed version check just so happened to prevent __compiletime_error() from being defined with clang because it pretends to be GCC 4.2.1 for compatibility but the error attribute was not added to clang until 14.0.0. Commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive") and commit a3f8a30f3f00 ("Compiler Attributes: use feature checks instead of version checks") refactored the handling of attributes in the main kernel to avoid situations like this but that refactoring has never been done for the tools directory. Refactoring is a rather large undertaking and this has never been an issue before so instead, just guard the definition of __compiletime_error() with __has_attribute() so that there are no more errors. Fixes: 4e59869aa655 ("compiler-gcc.h: drop checks for older GCC versions") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13cifs: remove pathname for file from SPDX headerSteve French
checkpatch complains about source files with filenames (e.g. in these cases just below the SPDX header in comments at the top of various files in fs/cifs). It also is helpful to change this now so will be less confusing when the parent directory is renamed e.g. from fs/cifs to fs/smb_client (or fs/smbfs) Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-13Merge branch 'gcc-min-version-5.1' (make gcc-5.1 the minimum version)Linus Torvalds
Merge patch series from Nick Desaulniers to update the minimum gcc version to 5.1. This is some of the left-overs from the merge window that I didn't want to deal with yesterday, so it comes in after -rc1 but was sent before. Gcc-4.9 support has been an annoyance for some time, and with -Werror I had the choice of applying a fairly big patch from Kees Cook to remove a fair number of initializer warnings (still leaving some), or this patch series from Nick that just removes the source of the problem. The initializer cleanups might still be worth it regardless, but honestly, I preferred just tackling the problem with gcc-4.9 head-on. We've been more aggressiuve about no longer having to care about compilers that were released a long time ago, and I think it's been a good thing. I added a couple of patches on top to sort out a few left-overs now that we no longer support gcc-4.x. As noted by Arnd, as a result of this minimum compiler version upgrade we can probably change our use of '--std=gnu89' to '--std=gnu11', and finally start using local loop declarations etc. But this series does _not_ yet do that. Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/ Link: https://lore.kernel.org/lkml/CAK7LNASs6dvU6D3jL2GG3jW58fXfaj6VNOe55NJnTB8UPuk2pA@mail.gmail.com/ Link: https://github.com/ClangBuiltLinux/linux/issues/1438 * emailed patches from Nick Desaulniers <ndesaulniers@google.com>: Drop some straggling mentions of gcc-4.9 as being stale compiler_attributes.h: drop __has_attribute() support for gcc4 vmlinux.lds.h: remove old check for GCC 4.9 compiler-gcc.h: drop checks for older GCC versions Makefile: drop GCC < 5 -fno-var-tracking-assignments workaround arm64: remove GCC version check for ARCH_SUPPORTS_INT128 powerpc: remove GCC version check for UPD_CONSTR riscv: remove Kconfig check for GCC version for ARCH_RV64I Kconfig.debug: drop GCC 5+ version check for DWARF5 mm/ksm: remove old GCC 4.9+ check compiler.h: drop fallback overflow checkers Documentation: raise minimum supported version of GCC to 5.1
2021-09-13Drop some straggling mentions of gcc-4.9 as being staleLinus Torvalds
Fix up the admin-guide README file to the new gcc-5.1 requirement, and remove a stale comment about gcc support for the __assume_aligned__ attribute. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13cpufreq: intel_pstate: Override parameters if HWP forced by BIOSDoug Smythies
If HWP has been already been enabled by BIOS, it may be necessary to override some kernel command line parameters. Once it has been enabled it requires a reset to be disabled. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>