summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-09-16nfs: Fix security label length not being resetJeffrey Mitchell
nfs_readdir_page_filler() iterates over entries in a directory, reusing the same security label buffer, but does not reset the buffer's length. This causes decode_attr_security_label() to return -ERANGE if an entry's security label is longer than the previous one's. This error, in nfs4_decode_dirent(), only gets passed up as -EAGAIN, which causes another failed attempt to copy into the buffer. The second error is ignored and the remaining entries do not show up in ls, specifically the getdents64() syscall. Reproduce by creating multiple files in NFS and giving one of the later files a longer security label. ls will not see that file nor any that are added afterwards, though they will exist on the backend. In nfs_readdir_page_filler(), reset security label buffer length before every reuse Signed-off-by: Jeffrey Mitchell <jeffrey.mitchell@starlab.io> Fixes: b4487b935452 ("nfs: Fix getxattr kernel panic and memory overflow") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-09-16locking/percpu-rwsem: Use this_cpu_{inc,dec}() for read_countHou Tao
The __this_cpu*() accessors are (in general) IRQ-unsafe which, given that percpu-rwsem is a blocking primitive, should be just fine. However, file_end_write() is used from IRQ context and will cause load-store issues on architectures where the per-cpu accessors are not natively irq-safe. Fix it by using the IRQ-safe this_cpu_*() for operations on read_count. This will generate more expensive code on a number of platforms, which might cause a performance regression for some of the other percpu-rwsem users. If any such is reported, we can consider alternative solutions. Fixes: 70fe2f48152e ("aio: fix freeze protection of aio writes") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20200915140750.137881-1-houtao1@huawei.com
2020-09-16perf stat: Fix the ratio comments of miss-eventsQi Liu
'perf stat' displays miss ratio of L1-dcache, L1-icache, dTLB cache, iTLB cache and LL-cache. Take L1-dcache for example, miss ratio is caculated as "L1-dcache-load-misses/L1-dcache-loads". So "of all L1-dcache hits" is unsuitable to describe it, and "of all L1-dcache accesses" seems better. The comments of L1-icache, dTLB cache, iTLB cache and LL-cache are fixed in the same way. Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: linuxarm@huawei.com Link: http://lore.kernel.org/lkml/1600253331-10535-1-git-send-email-liuqi115@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-16fbcon: Fix user font detection test at fbcon_resize().Tetsuo Handa
syzbot is reporting OOB read at fbcon_resize() [1], for commit 39b3cffb8cf31117 ("fbcon: prevent user font height or width change from causing potential out-of-bounds access") is by error using registered_fb[con2fb_map[vc->vc_num]]->fbcon_par->p->userfont (which was set to non-zero) instead of fb_display[vc->vc_num].userfont (which remains zero for that display). We could remove tricky userfont flag [2], for we can determine it by comparing address of the font data and addresses of built-in font data. But since that commit is failing to fix the original OOB read [3], this patch keeps the change minimal in case we decide to revert altogether. [1] https://syzkaller.appspot.com/bug?id=ebcbbb6576958a496500fee9cf7aa83ea00b5920 [2] https://syzkaller.appspot.com/text?tag=Patch&x=14030853900000 [3] https://syzkaller.appspot.com/bug?id=6fba8c186d97cf1011ab17660e633b1cc4e080c9 Reported-by: syzbot <syzbot+b38b1ef6edf0c74a8d97@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 39b3cffb8cf31117 ("fbcon: prevent user font height or width change from causing potential out-of-bounds access") Cc: George Kennedy <george.kennedy@oracle.com> Link: https://lore.kernel.org/r/f6e3e611-8704-1263-d163-f52c906a4f06@I-love.SAKURA.ne.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16powercap: RAPL: Add support for LakefieldRicardo Neri
Simply add Lakefield model ID. No additional changes are needed. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> [ rjw: Minor subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-16serial: 8250_pci: Add Realtek 816a and 816bTobias Diedrich
These serial ports are exposed by the OOB-management-engine on RealManage-enabled network cards (e.g. AMD DASH enabled systems using Realtek cards). Because these have 3 BARs, they fail the "num_iomem <= 1" check in serial_pci_guess_board. I've manually checked the two IOMEM regions and BAR 2 doesn't seem to respond to reads, but BAR 4 seems to be an MMIO version of the IO ports (untested). With this change, the ports are detected: 0000:02:00.1: ttyS0 at I/O 0x2200 (irq = 82, base_baud = 115200) is a 16550A 0000:02:00.2: ttyS1 at I/O 0x2100 (irq = 55, base_baud = 115200) is a 16550A lspci output: 02:00.1 0700: 10ec:816a (rev 0e) (prog-if 02 [16550]) Subsystem: 17aa:5082 Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort+ <TAbort- <MAbort- >SERR- <PERR- INTx- Interrupt: pin B routed to IRQ 82 IOMMU group: 11 Region 0: I/O ports at 2200 [size=256] Region 2: Memory at fd715000 (64-bit, non-prefetchable) [size=4K] Region 4: Memory at fd704000 (64-bit, non-prefetchable) [size=16K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [70] Express (v2) Endpoint, MSI 01 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend- LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+ LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s (ok), Width x1 (ok) TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+ 10BitTagComp- 10BitTagReq- OBFF Via message/WAKE#, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled, AtomicOpsCtl: ReqEn- LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported Capabilities: [b0] MSI-X: Enable- Count=4 Masked- Vector table: BAR=4 offset=00000000 PBA: BAR=4 offset=00000800 Capabilities: [d0] Vital Product Data Not readable Capabilities: [100 v2] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap- HeaderLog: 00000000 00000000 00000000 00000000 Capabilities: [160 v1] Device Serial Number 00-00-00-00-00-00-00-00 Capabilities: [170 v1] Latency Tolerance Reporting Max snoop latency: 0ns Max no snoop latency: 0ns Capabilities: [178 v1] L1 PM Substates L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+ PortCommonModeRestoreTime=150us PortTPowerOnTime=150us L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1- T_CommonMode=0us LTR1.2_Threshold=0ns L1SubCtl2: T_PwrOn=10us 02:00.2 0700: 10ec:816b (rev 0e) [...same...] Signed-off-by: Tobias Diedrich <tobiasdiedrich@gmail.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200914173628.GA22508@yamamaya.is-a-geek.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16serial: core: fix console port-lock regressionJohan Hovold
Fix the port-lock initialisation regression introduced by commit a3cb39d258ef ("serial: core: Allow detach and attach serial device for console") by making sure that the lock is again initialised during console setup. The console may be registered before the serial controller has been probed in which case the port lock needs to be initialised during console setup by a call to uart_set_options(). The console-detach changes introduced a regression in several drivers by effectively removing that initialisation by not initialising the lock when the port is used as a console (which is always the case during console setup). Add back the early lock initialisation and instead use a new console-reinit flag to handle the case where a console is being re-attached through sysfs. The question whether the console-detach interface should have been added in the first place is left for another discussion. Note that the console-enabled check in uart_set_options() is not redundant because of kgdboc, which can end up reinitialising an already enabled console (see commit 42b6a1baa3ec ("serial_core: Don't re-initialize a previously initialized spinlock.")). Fixes: a3cb39d258ef ("serial: core: Allow detach and attach serial device for console") Cc: stable <stable@vger.kernel.org> # 5.7 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200909143101.15389-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16serial: core: fix port-lock initialisationJohan Hovold
Commit f743061a85f5 ("serial: core: Initialise spin lock before use in uart_configure_port()") tried to work around a breakage introduced by commit a3cb39d258ef ("serial: core: Allow detach and attach serial device for console") by adding a second initialisation of the port lock when registering the port. As reported by the build robots [1], this doesn't really solve the regression introduced by the console-detach changes and also adds a second redundant initialisation of the lock for normal ports. Start cleaning up this mess by removing the redundant initialisation and making sure that the port lock is again initialised once-only for ports that aren't already in use as a console. [1] https://lore.kernel.org/r/20200802054852.GR23458@shao2-debian Fixes: f743061a85f5 ("serial: core: Initialise spin lock before use in uart_configure_port()") Fixes: a3cb39d258ef ("serial: core: Allow detach and attach serial device for console") Cc: stable <stable@vger.kernel.org> # 5.7 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200909143101.15389-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16usb: typec: intel_pmc_mux: Handle SCU IPC error conditionsMadhusudanarao Amara
Check and return if there are errors. The response bits are valid only on no errors. Fixes: b7404a29cd3d ("usb: typec: intel_pmc_mux: Definitions for response status bits") Signed-off-by: Madhusudanarao Amara <madhusudanarao.amara@intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20200916091102.27118-4-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16USB: quirks: Add USB_QUIRK_IGNORE_REMOTE_WAKEUP quirk for BYD zhaoxin notebookPenghao
Add a USB_QUIRK_IGNORE_REMOTE_WAKEUP quirk for the BYD zhaoxin notebook. This notebook come with usb touchpad. And we would like to disable touchpad wakeup on this notebook by default. Signed-off-by: Penghao <penghao@uniontech.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200907023026.28189-1-penghao@uniontech.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16USB: UAS: fix disconnect by unplugging a hubOliver Neukum
The SCSI layer can go into an ugly loop if you ignore that a device is gone. You need to report an error in the command rather than in the return value of the queue method. We need to specifically check for ENODEV. The issue goes back to the introduction of the driver. Fixes: 115bb1ffa54c3 ("USB: Add UAS driver") Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200916094026.30085-2-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16usb: typec: ucsi: Prevent mode overrunHeikki Krogerus
Sometimes the embedded controller firmware does not terminate the list of alternate modes that the partner supports in its response to the GET_ALTERNATE_MODES command. Instead the firmware returns the supported alternate modes over and over again until the driver stops requesting them. If that happens, the number of modes for each alternate mode will exceed the maximum 6 that is defined in the USB Power Delivery specification. Making sure that can't happen by adding a check for it. This fixes NULL pointer dereference that is caused by the overrun. Fixes: ad74b8649beaf ("usb: typec: ucsi: Preliminary support for alternate modes") Cc: stable@vger.kernel.org Reported-by: Zwane Mwaikambo <zwanem@gmail.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20200916090034.25119-3-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16usb: typec: ucsi: acpi: Increase command completion timeout valueHeikki Krogerus
UCSI specification quite clearly states that if a command can't be completed in 10ms, the firmware must notify about BUSY condition. Unfortunately almost none of the platforms (the firmware on them) generate the BUSY notification even if a command can't be completed in time. The driver already considered that, and used a timeout value of 5 seconds, but processing especially the alternate mode discovery commands takes often considerable amount of time from the firmware, much more than the 5 seconds. That happens especially after bootup when devices are already connected to the USB Type-C connector. For now on those platforms the alternate mode discovery has simply failed because of the timeout. To improve the situation, increasing the timeout value for the command completion to 1 minute. That should give enough time for even the slowest firmware to process the commands. Fixes: f56de278e8ec ("usb: typec: ucsi: acpi: Move to the new API") Cc: stable@vger.kernel.org Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20200916090034.25119-2-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16drm/i915: Filter wake_flags passed to default_wake_functionChris Wilson
(NOTE: This is the minimal backportable fix, a full fix is being developed at https://patchwork.freedesktop.org/patch/388048/) The flags passed to the wait_entry.func are passed onwards to try_to_wake_up(), which has a very particular interpretation for its wake_flags. In particular, beyond the published WF_SYNC, it has a few internal flags as well. Since we passed the fence->error down the chain via the flags argument, these ended up in the default_wake_function confusing the kernel/sched. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2110 Fixes: ef4688497512 ("drm/i915: Propagate fence errors") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728152144.1100-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Joonas: Rebased and reordered into drm-intel-gt-next branch] [Joonas: Added a note and link about more complete fix] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit f4b3c395540aa3d4f5a6275c5bdd83ab89034806) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-16drm/i915: Be wary of data races when reading the active execlistsChris Wilson
To implement preempt-to-busy (and so efficient timeslicing and best utilization of the hardware submission ports) we let the GPU run asynchronously in respect to the ELSP submission queue. This created challenges in keeping and accessing the driver state mirroring the asynchronous GPU execution. The latest occurence of this was spotted by KCSAN: [ 1413.563200] BUG: KCSAN: data-race in __await_execution+0x217/0x370 [i915] [ 1413.563221] [ 1413.563236] race at unknown origin, with read to 0xffff88885bb6c478 of 8 bytes by task 9654 on cpu 1: [ 1413.563548] __await_execution+0x217/0x370 [i915] [ 1413.563891] i915_request_await_dma_fence+0x4eb/0x6a0 [i915] [ 1413.564235] i915_request_await_object+0x421/0x490 [i915] [ 1413.564577] i915_gem_do_execbuffer+0x29b7/0x3c40 [i915] [ 1413.564967] i915_gem_execbuffer2_ioctl+0x22f/0x5c0 [i915] [ 1413.564998] drm_ioctl_kernel+0x156/0x1b0 [ 1413.565022] drm_ioctl+0x2ff/0x480 [ 1413.565046] __x64_sys_ioctl+0x87/0xd0 [ 1413.565069] do_syscall_64+0x4d/0x80 [ 1413.565094] entry_SYSCALL_64_after_hwframe+0x44/0xa9 To complicate matters, we have to both avoid the read tearing of *active and avoid any write tearing as perform the pending[] -> inflight[] promotion of the execlists. This is because we cannot rely on the memcpy doing u64 aligned copies on all kernels/platforms and so we opt to open-code it with explicit WRITE_ONCE annotations to satisfy KCSAN. v2: When in doubt, write the same comment again. v3: Expanded commit message. Fixes: b55230e5e800 ("drm/i915: Check for awaits on still currently executing requests") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200716142207.13003-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Joonas: Rebased and reordered into drm-intel-gt-next branch] [Joonas: Added expanded commit message from Tvrtko and Chris] Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit b4d9145b0154f8c71dafc2db5fd445f1f3db9426) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-16drm/i915/gem: Reduce context termination list iteration guard to RCUChris Wilson
As we now protect the timeline list using RCU, we can drop the timeline->mutex for guarding the list iteration during context close, as we are searching for an inflight request. Any new request will see the context is banned and not be submitted. In doing so, pull the checks for a concurrent submission of the request (notably the i915_request_completed()) under the engine spinlock, to fully serialise with __i915_request_submit()). That is in the case of preempt-to-busy where the request may be completed during the __i915_request_submit(), we need to be careful that we sample the request status after serialising so that we don't miss the request the engine is actually submitting. Fixes: 4a3174152147 ("drm/i915/gem: Refine occupancy test in kill_context()") References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early waits") # rcu protection of timeline->requests References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622 References: https://gitlab.freedesktop.org/drm/intel/-/issues/2158 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806105954.7766-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit 736e785f9b28cd9ef2d16a80960a04fd00e64b22) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-16drm/i915/gem: Delay tracking the GEM context until it is registeredChris Wilson
Avoid exposing a partially constructed context by deferring the list_add() from the initial construction to the end of registration. Otherwise, if we peek into the list of contexts from inside debugfs, we may see the partially constructed context and chase down some dangling incomplete pointers. Reported-by: CQ Tang <cq.tang@intel.com> Fixes: 3aa9945a528e ("drm/i915: Separate GEM context construction and registration to userspace") References: f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: CQ Tang <cq.tang@intel.com> Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200730092856.23615-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit eb4dedae920a07c485328af3da2202ec5184fb17) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-15dt-bindings: leds: cznic,turris-omnia-leds: fix error in bindingMarek Behún
There is a bug in the device tree binding for cznic,turris-omnia-leds which causes make dt_binding_check to complain. The reason is that the multi-led property binding's regular expression does not contain the `@` character, while the example nodes do. Fix this, and also adjust the maximum address to 'b' as there are 12 LEDs. Cc: Rob Herring <robh+dt@kernel.org> Cc: devicetree@vger.kernel.org Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Marek Behún <marek.behun@nic.cz> Link: https://lore.kernel.org/r/20200915005426.15957-1-marek.behun@nic.cz Signed-off-by: Rob Herring <robh@kernel.org>
2020-09-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf 2020-09-15 The following pull-request contains BPF updates for your *net* tree. We've added 12 non-merge commits during the last 19 day(s) which contain a total of 10 files changed, 47 insertions(+), 38 deletions(-). The main changes are: 1) docs/bpf fixes, from Andrii. 2) ld_abs fix, from Daniel. 3) socket casting helpers fix, from Martin. 4) hash iterator fixes, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15bpf: Fix a rcu warning for bpffs map pretty-printYonghong Song
Running selftest ./btf_btf -p the kernel had the following warning: [ 51.528185] WARNING: CPU: 3 PID: 1756 at kernel/bpf/hashtab.c:717 htab_map_get_next_key+0x2eb/0x300 [ 51.529217] Modules linked in: [ 51.529583] CPU: 3 PID: 1756 Comm: test_btf Not tainted 5.9.0-rc1+ #878 [ 51.530346] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.el7.centos 04/01/2014 [ 51.531410] RIP: 0010:htab_map_get_next_key+0x2eb/0x300 ... [ 51.542826] Call Trace: [ 51.543119] map_seq_next+0x53/0x80 [ 51.543528] seq_read+0x263/0x400 [ 51.543932] vfs_read+0xad/0x1c0 [ 51.544311] ksys_read+0x5f/0xe0 [ 51.544689] do_syscall_64+0x33/0x40 [ 51.545116] entry_SYSCALL_64_after_hwframe+0x44/0xa9 The related source code in kernel/bpf/hashtab.c: 709 static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key) 710 { 711 struct bpf_htab *htab = container_of(map, struct bpf_htab, map); 712 struct hlist_nulls_head *head; 713 struct htab_elem *l, *next_l; 714 u32 hash, key_size; 715 int i = 0; 716 717 WARN_ON_ONCE(!rcu_read_lock_held()); In kernel/bpf/inode.c, bpffs map pretty print calls map->ops->map_get_next_key() without holding a rcu_read_lock(), hence causing the above warning. To fix the issue, just surrounding map->ops->map_get_next_key() with rcu read lock. Fixes: a26ca7c982cb ("bpf: btf: Add pretty print support to the basic arraymap") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200916004401.146277-1-yhs@fb.com
2020-09-15bpf: Bpf_skc_to_* casting helpers require a NULL check on skMartin KaFai Lau
The bpf_skc_to_* type casting helpers are available to BPF_PROG_TYPE_TRACING. The traced PTR_TO_BTF_ID may be NULL. For example, the skb->sk may be NULL. Thus, these casting helpers need to check "!sk" also and this patch fixes them. Fixes: 0d4fad3e57df ("bpf: Add bpf_skc_to_udp6_sock() helper") Fixes: 478cfbdf5f13 ("bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers") Fixes: af7ec1383361 ("bpf: Add bpf_skc_to_tcp6_sock() helper") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200915182959.241101-1-kafai@fb.com
2020-09-15scsi: sd: sd_zbc: Fix ZBC disk initializationDamien Le Moal
Make sure to call sd_zbc_init_disk() when the sdkp->zoned field is known, that is, once sd_read_block_characteristics() is executed in sd_revalidate_disk(), so that host-aware disks also get initialized. To do so, move sd_zbc_init_disk() call in sd_zbc_revalidate_zones() and make sure to execute it for all zoned disks, including for host-aware disks used as regular disks as these disk zoned model may be changed back to BLK_ZONED_HA when partitions are deleted. Link: https://lore.kernel.org/r/20200915073347.832424-3-damien.lemoal@wdc.com Fixes: 5795eb443060 ("scsi: sd_zbc: emulate ZONE_APPEND commands") Cc: <stable@vger.kernel.org> # v5.8+ Reported-by: Borislav Petkov <bp@alien8.de> Tested-by: Borislav Petkov <bp@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15scsi: sd: sd_zbc: Fix handling of host-aware ZBC disksDamien Le Moal
When CONFIG_BLK_DEV_ZONED is disabled, allow using host-aware ZBC disks as regular disks. In this case, ensure that command completion is correctly executed by changing sd_zbc_complete() to return good_bytes instead of 0 and causing a hang during device probe (endless retries). When CONFIG_BLK_DEV_ZONED is enabled and a host-aware disk is detected to have partitions, it will be used as a regular disk. In this case, make sure to not do anything in sd_zbc_revalidate_zones() as that triggers warnings. Since all these different cases result in subtle settings of the disk queue zoned model, introduce the block layer helper function blk_queue_set_zoned() to generically implement setting up the effective zoned model according to the disk type, the presence of partitions on the disk and CONFIG_BLK_DEV_ZONED configuration. Link: https://lore.kernel.org/r/20200915073347.832424-2-damien.lemoal@wdc.com Fixes: b72053072c0b ("block: allow partitions on host aware zone devices") Cc: <stable@vger.kernel.org> Reported-by: Borislav Petkov <bp@alien8.de> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "Just one fix in libsas for a resource leak in an error path" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: libsas: Fix error path in sas_notify_lldd_dev_found()
2020-09-15Merge tag 'fixes-v5.9a' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security layer fix from James Morris: "A device_cgroup RCU warning fix from Amol Grover" * tag 'fixes-v5.9a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: device_cgroup: Fix RCU list debugging warning
2020-09-15Merge tag 'hyperv-fixes-signed' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: "Two patches from Michael and Dexuan to fix vmbus hanging issues" * tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload Drivers: hv: vmbus: hibernation: do not hang forever in vmbus_bus_resume()
2020-09-15scsi: lpfc: Fix initial FLOGI failure due to BBSCN not supportedJames Smart
Initial FLOGIs are failing with the following message: lpfc 0000:13:00.1: 1:(0):0820 FLOGI Failed (x300). BBCredit Not Supported In a prior patch, the READ_SPARAM command was re-ordered to post after CONFIG_LINK as the driver is expected to update the driver's copy of the service parameters for the FLOGI payload. If the bb-credit recovery feature is enabled, this is fine. But on adapters were bb-credit recovery isn't enabled, it would cause the FLOGI to fail. Fix by restoring the original command order (READ_SPARAM before CONFIG_LINK), and after issuing CONFIG_LINK, detect bb-credit recovery support and reissuing READ_SPARAM to obtain the updated service parameters (effectively adding in the fix command order). [mkp: corrected SHA] Link: https://lore.kernel.org/r/20200911200147.110826-1-james.smart@broadcom.com Fixes: 835214f5d5f5 ("scsi: lpfc: Fix broken Credit Recovery after driver load") CC: <stable@vger.kernel.org> # v5.7+ Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15ipv4: Update exception handling for multipath routes via same deviceDavid Ahern
Kfir reported that pmtu exceptions are not created properly for deployments where multipath routes use the same device. After some digging I see 2 compounding problems: 1. ip_route_output_key_hash_rcu is updating the flowi4_oif *after* the route lookup. This is the second use case where this has been a problem (the first is related to use of vti devices with VRF). I can not find any reason for the oif to be changed after the lookup; the code goes back to the start of git. It does not seem logical so remove it. 2. fib_lookups for exceptions do not call fib_select_path to handle multipath route selection based on the hash. The end result is that the fib_lookup used to add the exception always creates it based using the first leg of the route. An example topology showing the problem: | host1 +------+ | eth0 | .209 +------+ | +------+ switch | br0 | +------+ | +---------+---------+ | host2 | host3 +------+ +------+ | eth0 | .250 | eth0 | 192.168.252.252 +------+ +------+ +-----+ +-----+ | vti | .2 | vti | 192.168.247.3 +-----+ +-----+ \ / ================================= tunnels 192.168.247.1/24 for h in host1 host2 host3; do ip netns add ${h} ip -netns ${h} link set lo up ip netns exec ${h} sysctl -wq net.ipv4.ip_forward=1 done ip netns add switch ip -netns switch li set lo up ip -netns switch link add br0 type bridge stp 0 ip -netns switch link set br0 up for n in 1 2 3; do ip -netns switch link add eth-sw type veth peer name eth-h${n} ip -netns switch li set eth-h${n} master br0 up ip -netns switch li set eth-sw netns host${n} name eth0 done ip -netns host1 addr add 192.168.252.209/24 dev eth0 ip -netns host1 link set dev eth0 up ip -netns host1 route add 192.168.247.0/24 \ nexthop via 192.168.252.250 dev eth0 nexthop via 192.168.252.252 dev eth0 ip -netns host2 addr add 192.168.252.250/24 dev eth0 ip -netns host2 link set dev eth0 up ip -netns host2 addr add 192.168.252.252/24 dev eth0 ip -netns host3 link set dev eth0 up ip netns add tunnel ip -netns tunnel li set lo up ip -netns tunnel li add br0 type bridge ip -netns tunnel li set br0 up for n in $(seq 11 20); do ip -netns tunnel addr add dev br0 192.168.247.${n}/24 done for n in 2 3 do ip -netns tunnel link add vti${n} type veth peer name eth${n} ip -netns tunnel link set eth${n} mtu 1360 master br0 up ip -netns tunnel link set vti${n} netns host${n} mtu 1360 up ip -netns host${n} addr add dev vti${n} 192.168.247.${n}/24 done ip -netns tunnel ro add default nexthop via 192.168.247.2 nexthop via 192.168.247.3 ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.11 ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.15 ip -netns host1 ro ls cache Before this patch the cache always shows exceptions against the first leg in the multipath route; 192.168.252.250 per this example. Since the hash has an initial random seed, you may need to vary the final octet more than what is listed. In my tests, using addresses between 11 and 19 usually found 1 that used both legs. With this patch, the cache will have exceptions for both legs. Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions") Reported-by: Kfir Itzhak <mastertheknife@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15drm/amdgpu/dc: Require primary plane to be enabled whenever the CRTC isMichel Dänzer
Don't check drm_crtc_state::active for this either, per its documentation in include/drm/drm_crtc.h: * Hence drivers must not consult @active in their various * &drm_mode_config_funcs.atomic_check callback to reject an atomic * commit. atomic_remove_fb disables the CRTC as needed for disabling the primary plane. This prevents at least the following problems if the primary plane gets disabled (e.g. due to destroying the FB assigned to the primary plane, as happens e.g. with mutter in Wayland mode): * The legacy cursor ioctl returned EINVAL for a non-0 cursor FB ID (which enables the cursor plane). * If the cursor plane was enabled, changing the legacy DPMS property value from off to on returned EINVAL. v2: * Minor changes to code comment and commit log, per review feedback. GitLab: https://gitlab.gnome.org/GNOME/mutter/-/issues/1108 GitLab: https://gitlab.gnome.org/GNOME/mutter/-/issues/1165 GitLab: https://gitlab.gnome.org/GNOME/mutter/-/issues/1344 Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/radeon: revert "Prefer lower feedback dividers"Christian König
Turns out this breaks a lot of different hardware. This reverts commit fc8c70526bd30733ea8667adb8b8ffebea30a8ed. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amdgpu: Include sienna_cichlid in USBC PD FW support.Andrey Grodzovsky
Create sysfs interface also for sienna_cichlid. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amd/display: update nv1x stutter latenciesJun Lei
[why] Recent characterization shows increased stutter latencies on some SKUs, leading to underflow. [how] Update SOC params to account for this worst case latency. Signed-off-by: Jun Lei <jun.lei@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amd/display: Don't use DRM_ERROR() for DTM add topologyBhawanpreet Lakha
[Why] Previously we were only calling add_topology when hdcp was being enabled. Now we call add_topology by default so the ERROR messages are printed if the firmware is not loaded. This error message is not relevant for normal display functionality so no need to print a ERROR message. [How] Change DRM_ERROR to DRM_INFO Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amd/pm: support runtime pptable update for sienna_cichlid etc.Jiansong Chen
This avoids smu issue when enabling runtime pptable update for sienna_cichlid and so on. Runtime pptable udpate is needed for test and debug purpose. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amdkfd: fix a memory leak issueDennis Li
In the resume stage of GPU recovery, start_cpsch will call pm_init which set pm->allocated as false, cause the next pm_release_ib has no chance to release ib memory. Add pm_release_ib in stop_cpsch which will be called in the suspend stage of GPU recovery. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/kfd: fix a system crash issue during GPU recoveryDennis Li
The crash log as the below: [Thu Aug 20 23:18:14 2020] general protection fault: 0000 [#1] SMP NOPTI [Thu Aug 20 23:18:14 2020] CPU: 152 PID: 1837 Comm: kworker/152:1 Tainted: G OE 5.4.0-42-generic #46~18.04.1-Ubuntu [Thu Aug 20 23:18:14 2020] Hardware name: GIGABYTE G482-Z53-YF/MZ52-G40-00, BIOS R12 05/13/2020 [Thu Aug 20 23:18:14 2020] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [Thu Aug 20 23:18:14 2020] RIP: 0010:evict_process_queues_cpsch+0xc9/0x130 [amdgpu] [Thu Aug 20 23:18:14 2020] Code: 49 8d 4d 10 48 39 c8 75 21 eb 44 83 fa 03 74 36 80 78 72 00 74 0c 83 ab 68 01 00 00 01 41 c6 45 41 00 48 8b 00 48 39 c8 74 25 <80> 78 70 00 c6 40 6d 01 74 ee 8b 50 28 c6 40 70 00 83 ab 60 01 00 [Thu Aug 20 23:18:14 2020] RSP: 0018:ffffb29b52f6fc90 EFLAGS: 00010213 [Thu Aug 20 23:18:14 2020] RAX: 1c884edb0a118914 RBX: ffff8a0d45ff3c00 RCX: ffff8a2d83e41038 [Thu Aug 20 23:18:14 2020] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8a0e2e4178c0 [Thu Aug 20 23:18:14 2020] RBP: ffffb29b52f6fcb0 R08: 0000000000001b64 R09: 0000000000000004 [Thu Aug 20 23:18:14 2020] R10: ffffb29b52f6fb78 R11: 0000000000000001 R12: ffff8a0d45ff3d28 [Thu Aug 20 23:18:14 2020] R13: ffff8a2d83e41028 R14: 0000000000000000 R15: 0000000000000000 [Thu Aug 20 23:18:14 2020] FS: 0000000000000000(0000) GS:ffff8a0e2e400000(0000) knlGS:0000000000000000 [Thu Aug 20 23:18:14 2020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Thu Aug 20 23:18:14 2020] CR2: 000055c783c0e6a8 CR3: 00000034a1284000 CR4: 0000000000340ee0 [Thu Aug 20 23:18:14 2020] Call Trace: [Thu Aug 20 23:18:14 2020] kfd_process_evict_queues+0x43/0xd0 [amdgpu] [Thu Aug 20 23:18:14 2020] kfd_suspend_all_processes+0x60/0xf0 [amdgpu] [Thu Aug 20 23:18:14 2020] kgd2kfd_suspend.part.7+0x43/0x50 [amdgpu] [Thu Aug 20 23:18:14 2020] kgd2kfd_pre_reset+0x46/0x60 [amdgpu] [Thu Aug 20 23:18:14 2020] amdgpu_amdkfd_pre_reset+0x1a/0x20 [amdgpu] [Thu Aug 20 23:18:14 2020] amdgpu_device_gpu_recover+0x377/0xf90 [amdgpu] [Thu Aug 20 23:18:14 2020] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [Thu Aug 20 23:18:14 2020] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [Thu Aug 20 23:18:14 2020] process_one_work+0x20f/0x400 [Thu Aug 20 23:18:14 2020] worker_thread+0x34/0x410 When GPU hang, user process will fail to create a compute queue whose struct object will be freed later, but driver wrongly add this queue to queue list of the proccess. And then kfd_process_evict_queues will access a freed memory, which cause a system crash. v2: The failure to execute_queues should probably not be reported to the caller of create_queue, because the queue was already created. Therefore change to ignore the return value from execute_queues. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-09-15net: tipc: kerneldoc fixesLu Wei
Fix parameter description of tipc_link_bc_create() Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: 16ad3f4022bb ("tipc: introduce variable window congestion control") Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15ibmvnic: update MAINTAINERSDany Madden
Update supporters for IBM Power SRIOV Virtual NIC Device Driver. Thomas Falcon is moving on to other works. Dany Madden, Lijun Pan and Sukadev Bhattiprolu are the current supporters. Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15Merge tag 'v5.9-clk-samsung-fixes' of ↵Stephen Boyd
https://git.kernel.org/pub/scm/linux/kernel/git/snawrocki/clk into clk-fixes Pull a Samsung clk driver fix from Sylwester Nawrocki: - fix for a regression in v5.9-rc1 on Odroid XU3/XU4, i.e. booting failure due to the DRAM controller root clock being disabled * tag 'v5.9-clk-samsung-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/snawrocki/clk: clk: samsung: Keep top BPLL mux on Exynos542x enabled
2020-09-15spi: bcm2835: Make polling_limit_us staticJason Yan
This eliminates the following sparse warning: drivers/spi/spi-bcm2835.c:78:14: warning: symbol 'polling_limit_us' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20200912072211.602735-1-yanaijie@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-15efi: efibc: check for efivars write capabilityArd Biesheuvel
Branden reports that commit f88814cc2578c1 ("efi/efivars: Expose RT service availability via efivars abstraction") regresses UEFI platforms that implement GetVariable but not SetVariable when booting kernels that have EFIBC (bootloader control) enabled. The reason is that EFIBC is a user of the efivars abstraction, which was updated to permit users that rely only on the read capability, but not on the write capability. EFIBC is in the latter category, so it has to check explicitly whether efivars supports writes. Fixes: f88814cc2578c1 ("efi/efivars: Expose RT service availability via efivars abstraction") Tested-by: Branden Sherrell <sherrellbc@gmail.com> Link: https://lore.kernel.org/linux-efi/AE217103-C96F-4AFC-8417-83EC11962004@gmail.com/ Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-15perf test: Free formats for perf pmu parse testNamhyung Kim
The following leaks were detected by ASAN: Indirect leak of 360 byte(s) in 9 object(s) allocated from: #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e) #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333 #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59 #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73 #4 0x560578e07045 in test__pmu tests/pmu.c:155 #5 0x560578de109b in run_test tests/builtin-test.c:410 #6 0x560578de109b in test_and_print tests/builtin-test.c:440 #7 0x560578de401a in __cmd_test tests/builtin-test.c:661 #8 0x560578de401a in cmd_test tests/builtin-test.c:807 #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: cff7f956ec4a1 ("perf tests: Move pmu tests into separate object") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15perf metric: Do not free metric when failed to resolveNamhyung Kim
It's dangerous to free the original metric when it's called from resolve_metric() as it's already in the metric_list and might have other resources too. Instead, it'd better let them bail out and be released properly at the later stage. So add a check when it's called from metricgroup__add_metric() and release it. Also make sure that mp is set properly. Fixes: 83de0b7d535de ("perf metric: Collect referenced metrics in struct metric_ref_node") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15perf metric: Free metric when it failed to resolveNamhyung Kim
The metricgroup__add_metric() can find multiple match for a metric group and it's possible to fail. Also it can fail in the middle like in resolve_metric() even for single metric. In those cases, the intermediate list and ids will be leaked like: Direct leak of 3 byte(s) in 1 object(s) allocated from: #0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5) #1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683 #2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906 #3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940 #4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993 #5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045 #6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087 #7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164 #8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196 #9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318 #10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356 #11 0x55f7e70be09b in run_test tests/builtin-test.c:410 #12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440 #13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661 #14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807 #15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: 83de0b7d535de ("perf metric: Collect referenced metrics in struct metric_ref_node") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15perf metric: Release expr_parse_ctx after testingNamhyung Kim
The test_generic_metric() missed to release entries in the pctx. Asan reported following leak (and more): Direct leak of 128 byte(s) in 1 object(s) allocated from: #0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e) #1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14) #2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497) #3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111 #4 0x55f7e7341667 in expr__add_ref util/expr.c:120 #5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783 #6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858 #7 0x55f7e712390b in compute_single tests/parse-metric.c:128 #8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180 #9 0x55f7e712446d in compute_metric tests/parse-metric.c:196 #10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295 #11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355 #12 0x55f7e70be09b in run_test tests/builtin-test.c:410 #13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440 #14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661 #15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807 #16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: 6d432c4c8aa56 ("perf tools: Add test_generic_metric function") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15perf test: Fix memory leaks in parse-metric testNamhyung Kim
It didn't release resources when there's an error so the test_recursion_fail() will leak some memory. Fixes: 0a507af9c681a ("perf tests: Add parse metric test for ipc metric") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15perf parse-event: Fix memory leak in evsel->unitNamhyung Kim
The evsel->unit borrows a pointer of pmu event or alias instead of owns a string. But tool event (duration_time) passes a result of strdup() caused a leak. It was found by ASAN during metric test: Direct leak of 210 byte(s) in 70 object(s) allocated from: #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5) #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414 #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414 #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439 #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096 #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141 #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406 #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393 #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415 #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498 #10 0x559fbbc0109b in run_test tests/builtin-test.c:410 #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440 #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695 #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807 #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: f0fbb114e3025 ("perf stat: Implement duration_time as a proper event") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15perf evlist: Fix cpu/thread map leakNamhyung Kim
Asan reported leak of cpu and thread maps as they have one more refcount than released. I found that after setting evlist maps it should release it's refcount. It seems to be broken from the beginning so I chose the original commit as the culprit. But not sure how it's applied to stable trees since there are many changes in the code after that. Fixes: 7e2ed097538c5 ("perf evlist: Store pointer to the cpu and thread maps") Fixes: 4112eb1899c0e ("perf evlist: Default to syswide target when no thread/cpu maps set") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15perf metric: Fix some memory leaks - part 2Namhyung Kim
The metric_event_delete() missed to free expr->metric_events and it should free an expr when metric_refs allocation failed. Fixes: 4ea2896715e67 ("perf metric: Collect referenced metrics in struct metric_expr") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15perf metric: Fix some memory leaksNamhyung Kim
I found some memory leaks while reading the metric code. Some are real and others only occur in the error path. When it failed during metric or event parsing, it should release all resources properly. Fixes: b18f3e365019d ("perf stat: Support JSON metrics in perf stat") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>