Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Avoid kprobe recursion when cortex_a76_erratum_1463225_debug_handler()
is not inlined (change to __always_inline).
- Fix the visibility of compat hwcaps, broken by recent changes to
consolidate the visibility of hwcaps and the user-space view of the
ID registers.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: cpufeature: Fix the visibility of compat hwcaps
arm64: entry: avoid kprobe recursion
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"A documentation fix and driver fixes for piix4, tegra, and i801"
* tag 'i2c-for-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Documentation: devres: add missing I2C helper
i2c: i801: add lis3lv02d's I2C address for Vostro 5568
i2c: tegra: Allocate DMA memory for DMA engine
i2c: piix4: Fix adapter not be removed in piix4_remove()
|
|
When the mac device gets removed, it leaves behind the ethernet device.
This will result in a segfault next time the ethernet device accesses
mac_dev. Remove the ethernet device when we get removed to prevent
this. This is not completely reversible, since some resources aren't
cleaned up properly, but that can be addressed later.
Fixes: 3933961682a3 ("fsl/fman: Add FMan MAC driver")
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Link: https://lore.kernel.org/r/20221103182831.2248833-1-sean.anderson@seco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Michael Chan says:
====================
bnxt_en: Bug fixes
This bug fix series includes fixes for PCIE AER, a crash that may occur
when doing ethtool -C in the middle of error recovery, and aRFS.
====================
Link: https://lore.kernel.org/r/1667518407-15761-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the bnxt_en driver ndo_rx_flow_steer returns '0' whenever an entry
that we are attempting to steer is already found. This is not the
correct behavior. The return code should be the value/index that
corresponds to the entry. Returning zero all the time causes the
RFS records to be incorrect unless entry '0' is the correct one. As
flows migrate to different cores this can create entries that are not
correct.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Akshay Navgire <anavgire@purestorage.com>
Signed-off-by: Alex Barba <alex.barba@broadcom.com>
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During the error recovery sequence, the rtnl_lock is not held for the
entire duration and some datastructures may be freed during the sequence.
Check for the BNXT_STATE_OPEN flag instead of netif_running() to ensure
that the device is fully operational before proceeding to reconfigure
the coalescing settings.
This will fix a possible crash like this:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 10 PID: 181276 Comm: ethtool Kdump: loaded Tainted: G IOE --------- - - 4.18.0-348.el8.x86_64 #1
Hardware name: Dell Inc. PowerEdge R740/0F9N89, BIOS 2.3.10 08/15/2019
RIP: 0010:bnxt_hwrm_set_coal+0x1fb/0x2a0 [bnxt_en]
Code: c2 66 83 4e 22 08 66 89 46 1c e8 10 cb 00 00 41 83 c6 01 44 39 b3 68 01 00 00 0f 8e a3 00 00 00 48 8b 93 c8 00 00 00 49 63 c6 <48> 8b 2c c2 48 8b 85 b8 02 00 00 48 85 c0 74 2e 48 8b 74 24 08 f6
RSP: 0018:ffffb11c8dcaba50 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8d168a8b0ac0 RCX: 00000000000000c5
RDX: 0000000000000000 RSI: ffff8d162f72c000 RDI: ffff8d168a8b0b28
RBP: 0000000000000000 R08: b6e1f68a12e9a7eb R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000037 R12: ffff8d168a8b109c
R13: ffff8d168a8b10aa R14: 0000000000000000 R15: ffffffffc01ac4e0
FS: 00007f3852e4c740(0000) GS:ffff8d24c0080000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000041b3ee003 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
ethnl_set_coalesce+0x3ce/0x4c0
genl_family_rcv_msg_doit.isra.15+0x10f/0x150
genl_family_rcv_msg+0xb3/0x160
? coalesce_fill_reply+0x480/0x480
genl_rcv_msg+0x47/0x90
? genl_family_rcv_msg+0x160/0x160
netlink_rcv_skb+0x4c/0x120
genl_rcv+0x24/0x40
netlink_unicast+0x196/0x230
netlink_sendmsg+0x204/0x3d0
sock_sendmsg+0x4c/0x50
__sys_sendto+0xee/0x160
? syscall_trace_enter+0x1d3/0x2c0
? __audit_syscall_exit+0x249/0x2a0
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x5b/0x1a0
entry_SYSCALL_64_after_hwframe+0x65/0xca
RIP: 0033:0x7f38524163bb
Fixes: 2151fe0830fd ("bnxt_en: Handle RESET_NOTIFY async event from firmware.")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix the sequence required for PCIE-AER. While slot reset occurs, firmware
might not be ready and the driver needs to check for its recovery. We
also need to remap the health registers for some chips and clear the
resource reservations. The resources will be allocated again during
bnxt_io_resume().
Fixes: fb1e6e562b37 ("bnxt_en: Fix AER recovery.")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce bnxt_clear_reservations() to clear the reserved attributes only.
This will be used in the next patch to fix PCI AER handling.
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 54cc3dbfc10dc3db7cb1cf49aee4477a8398fbde.
Zev Weiss reports that the reverted patch may cause a regulator
undercount. Here is his report:
... having regulator-dummy set as a supply on my PMBus regulators
(instead of having them as their own top-level regulators without
an upstream supply) leads to enable-count underflow errors when
disabling them:
# echo 0 > /sys/bus/platform/devices/efuse01/state
[ 906.094477] regulator-dummy: Underflow of regulator enable count
[ 906.100563] Failed to disable vout: -EINVAL
[ 136.992676] reg-userspace-consumer efuse01: Failed to configure state: -22
Zev reports that reverting the patch fixes the problem. So let's do that
for now.
Fixes: 54cc3dbfc10d ("hwmon: (pmbus) Add regulator supply into macro")
Cc: Marcello Sylvester Bauer <sylv@sylv.io>
Reported-by: Zev Weiss <zev@bewilderbeest.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Available sensors are enumerated and reported by the SCMI platform server
using a 16bit identification number; not all such sensors are of a type
supported by hwmon subsystem and, among the supported ones, only a subset
could be temperature sensors that have to be registered with the Thermal
Framework.
Potential clashes between hwmon channels indexes and the underlying real
sensors IDs do not play well with the hwmon<-->thermal bridge automatic
registration routines and could need a sensible number of fake dummy
sensors to be made up in order to keep indexes and IDs in sync.
Avoid to use the hwmon<-->thermal bridge dropping the HWMON_C_REGISTER_TZ
attribute and instead explicit register temperature sensors directly with
the Thermal Framework.
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-hwmon@vger.kernel.org
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20221031114018.59048-1-cristian.marussi@arm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
At region creation time the next region-id is atomically cached so that
there is predictability of region device names. If that region is
destroyed and then a new one is created the region id increments. That
ends up looking like a memory leak, or is otherwise surprising that
identifiers roll forward even after destroying all previously created
regions.
Try to reuse rather than free old region ids at region release time.
While this fixes a cosmetic issue, the needlessly advancing memory
region-id gives the appearance of a memory leak, hence the "Fixes" tag,
but no "Cc: stable" tag.
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fixes: 779dd20cfb56 ("cxl/region: Add region creation support")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752186062.947915.13200195701224993317.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
When programming port decode targets, the algorithm wants to ensure that
two devices are compatible to be programmed as peers beneath a given
port. A compatible peer is a target that shares the same dport, and
where that target's interleave position also routes it to the same
dport. Compatibility is determined by the device's interleave position
being >= to distance. For example, if a given dport can only map every
Nth position then positions less than N away from the last target
programmed are incompatible.
The @distance for the host-bridge's cxl_port in a simple dual-ported
host-bridge configuration with 2 direct-attached devices is 1, i.e. An
x2 region divided by 2 dports to reach 2 region targets.
An x4 region under an x2 host-bridge would need 2 intervening switches
where the @distance at the host bridge level is 2 (x4 region divided by
2 switches to reach 4 devices).
However, the distance between peers underneath a single ported
host-bridge is always zero because there is no limit to the number of
devices that can be mapped. In other words, there are no decoders to
program in a passthrough, all descendants are mapped and distance only
starts matters for the intervening descendant ports of the passthrough
port.
Add tracking for the number of dports mapped to a port, and use that to
detect the passthrough case for calculating @distance.
Cc: <stable@vger.kernel.org>
Reported-by: Bobo WL <lmw.bobo@gmail.com>
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752185440.947915.6617495912508299445.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Jonathan reports that region creation fails when a single-port
host-bridge connects to a multi-port switch. Mock up that configuration
so a fix can be tested and regression tested going forward.
Reported-by: Bobo WL <lmw.bobo@gmail.com>
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752184838.947915.2167957540894293891.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Fix a few typos where 'goto err_port' was used rather than the object
specific cleanup.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752184255.947915.16163477849330181425.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
When a cxl_nvdimm object goes through a ->remove() event (device
physically removed, nvdimm-bridge disabled, or nvdimm device disabled),
then any associated regions must also be disabled. As highlighted by the
cxl-create-region.sh test [1], a single device may host multiple
regions, but the driver was only tracking one region at a time. This
leads to a situation where only the last enabled region per nvdimm
device is cleaned up properly. Other regions are leaked, and this also
causes cxl_memdev reference leaks.
Fix the tracking by allowing cxl_nvdimm objects to track multiple region
associations.
Cc: <stable@vger.kernel.org>
Link: https://github.com/pmem/ndctl/blob/main/test/cxl-create-region.sh [1]
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 04ad63f086d1 ("cxl/region: Introduce cxl_pmem_region objects")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752183647.947915.2045230911503793901.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
When a region is deleted any targets that have been previously assigned
to that region hold references to it. Trigger those references to
drop by detaching all targets at unregister_region() time.
Otherwise that region object will leak as userspace has lost the ability
to detach targets once region sysfs is torn down.
Cc: <stable@vger.kernel.org>
Fixes: b9686e8c8e39 ("cxl/region: Enable the assignment of endpoint decoders to regions")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752183055.947915.17681995648556534844.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Some regions may not have any address space allocated. Skip them when
validating HPA order otherwise a crash like the following may result:
devm_cxl_add_region: cxl_acpi cxl_acpi.0: decoder3.4: created region9
BUG: kernel NULL pointer dereference, address: 0000000000000000
[..]
RIP: 0010:store_targetN+0x655/0x1740 [cxl_core]
[..]
Call Trace:
<TASK>
kernfs_fop_write_iter+0x144/0x200
vfs_write+0x24a/0x4d0
ksys_write+0x69/0xf0
do_syscall_64+0x3a/0x90
store_targetN+0x655/0x1740:
alloc_region_ref at drivers/cxl/core/region.c:676
(inlined by) cxl_port_attach_region at drivers/cxl/core/region.c:850
(inlined by) cxl_region_attach at drivers/cxl/core/region.c:1290
(inlined by) attach_target at drivers/cxl/core/region.c:1410
(inlined by) store_targetN at drivers/cxl/core/region.c:1453
Cc: <stable@vger.kernel.org>
Fixes: 384e624bb211 ("cxl/region: Attach endpoint decoders")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166752182461.947915.497032805239915067.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The dispatcher function is currently abusing the ftrace __fentry__
call location for its own purposes -- this obviously gives trouble
when the dispatcher and ftrace are both in use.
A previous solution tried using __attribute__((patchable_function_entry()))
which works, except it is GCC-8+ only, breaking the build on the
earlier still supported compilers. Instead use static_call() -- which
has its own annotations and does not conflict with ftrace -- to
rewrite the dispatch function.
By using: return static_call()(ctx, insni, bpf_func) you get a perfect
forwarding tail call as function body (iow a single jmp instruction).
By having the default static_call() target be bpf_dispatcher_nop_func()
it retains the default behaviour (an indirect call to the argument
function). Only once a dispatcher program is attached is the target
rewritten to directly call the JIT'ed image.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/Y1/oBlK0yFk5c/Im@hirez.programming.kicks-ass.net
Link: https://lore.kernel.org/bpf/20221103120647.796772565@infradead.org
|
|
Because __attribute__((patchable_function_entry)) is only available
since GCC-8 this solution fails to build on the minimum required GCC
version.
Undo these changes so we might try again -- without cluttering up the
patches with too many changes.
This is an almost complete revert of:
dbe69b299884 ("bpf: Fix dispatcher patchable function entry to 5 bytes nop")
ceea991a019c ("bpf: Move bpf_dispatcher function out of ftrace locations")
(notably the arch/x86/Kconfig hunk is kept).
Reported-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/439d8dc735bb4858875377df67f1b29a@AcuMS.aculab.com
Link: https://lore.kernel.org/bpf/20221103120647.728830733@infradead.org
|
|
Pull xfs fixes from Darrick Wong:
"Dave and I had thought that this would be a very quiet cycle, but we
thought wrong.
At first there were the usual trickle of minor bugfixes, but then
Zorro pulled -rc1 and noticed complaints about the stronger memcpy
checks w.r.t. flex arrays.
Analyzing how to fix that revealed a bunch of validation gaps in
validating ondisk log items during recovery, and then a customer hit
an infinite loop in the refcounting code on a corrupt filesystem.
So. This largeish batch of fixes addresses all those problems, I hope.
Summary:
- Fix a UAF bug during log recovery
- Fix memory leaks when mount fails
- Detect corrupt bestfree information in a directory block
- Fix incorrect return value type for the dax page fault handlers
- Fix fortify complaints about memcpy of xfs log item objects
- Strengthen inadequate validation of recovered log items
- Fix incorrectly declared flex array in EFI log item structs
- Log corrupt log items for debugging purposes
- Fix infinite loop problems in the refcount code if the refcount
btree node block keys are corrupt
- Fix infinite loop problems in the refcount code if the refcount
btree records suffer MSB bitflips
- Add more sanity checking to continued defer ops to prevent
overflows from one AG to the next or off EOFS"
* tag 'xfs-6.1-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits)
xfs: rename XFS_REFC_COW_START to _COWFLAG
xfs: fix uninitialized list head in struct xfs_refcount_recovery
xfs: fix agblocks check in the cow leftover recovery function
xfs: check record domain when accessing refcount records
xfs: remove XFS_FIND_RCEXT_SHARED and _COW
xfs: refactor domain and refcount checking
xfs: report refcount domain in tracepoints
xfs: track cow/shared record domains explicitly in xfs_refcount_irec
xfs: refactor refcount record usage in xchk_refcountbt_rec
xfs: dump corrupt recovered log intent items to dmesg consistently
xfs: move _irec structs to xfs_types.h
xfs: actually abort log recovery on corrupt intent-done log items
xfs: check deferred refcount op continuation parameters
xfs: refactor all the EFI/EFD log item sizeof logic
xfs: create a predicate to verify per-AG extents
xfs: fix memcpy fortify errors in EFI log format copying
xfs: make sure aglen never goes negative in xfs_refcount_adjust_extents
xfs: fix memcpy fortify errors in RUI log format copying
xfs: fix memcpy fortify errors in CUI log format copying
xfs: fix memcpy fortify errors in BUI log format copying
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fix from Mickaël Salaün:
"Fix the test build for some distros"
* tag 'landlock-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
selftests/landlock: Build without static libraries
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fix from Kees Cook:
- Correctly report struct member size on memcpy overflow (Kees Cook)
* tag 'hardening-v6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
fortify: Capture __bos() results in const temp vars
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- A pair of tweaks to the EFI random seed code so that externally
provided version of this config table are handled more robustly
- Another fix for the v6.0 EFI variable refactor that turned out to
break Apple machines which don't provide QueryVariableInfo()
- Add some guard rails to the EFI runtime service call wrapper so we
can recover from synchronous exceptions caused by firmware
* tag 'efi-fixes-for-v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
arm64: efi: Recover from synchronous exceptions occurring in firmware
efi: efivars: Fix variable writes with unsupported query_variable_store()
efi: random: Use 'ACPI reclaim' memory for random seed
efi: random: reduce seed size to 32 bytes
efi/tpm: Pass correct address to memblock_reserve
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"There are not a lot of important fixes for the soc tree yet this time,
but it's time to upstream what I got so far:
- DT Fixes for Arm Juno and ST-Ericsson Ux500 to add missing critical
temperature points
- A number of fixes for the Arm SCMI firmware, addressing correctness
issues in the code, in particular error handling and resource
leaks.
- One error handling fix for the new i.MX93 power domain driver
- Several devicetree fixes for NXP i.MX6/8/9 and Layerscape chips,
fixing incorrect or missing DT properties for MDIO controller
nodes, CPLD, USB and regulators for various boards, as well as some
fixes for DT schema checks.
- MAINTAINERS file updates for HiSilicon LPC Bus and Broadcom git
URLs"
* tag 'soc-fixes-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (26 commits)
arm64: dts: juno: Add thermal critical trip points
firmware: arm_scmi: Fix deferred_tx_wq release on error paths
firmware: arm_scmi: Fix devres allocation device in virtio transport
firmware: arm_scmi: Make Rx chan_setup fail on memory errors
firmware: arm_scmi: Make tx_prepare time out eventually
firmware: arm_scmi: Suppress the driver's bind attributes
firmware: arm_scmi: Cleanup the core driver removal callback
MAINTAINERS: Update HiSilicon LPC BUS Driver maintainer
ARM: dts: ux500: Add trips to battery thermal zones
arm64: dts: ls208xa: specify clock frequencies for the MDIO controllers
arm64: dts: ls1088a: specify clock frequencies for the MDIO controllers
arm64: dts: lx2160a: specify clock frequencies for the MDIO controllers
soc: imx: imx93-pd: Fix the error handling path of imx93_pd_probe()
arm64: dts: imx93: correct gpio-ranges
arm64: dts: imx93: correct s4mu interrupt names
dt-bindings: power: gpcv2: add power-domains property
arm64: dts: imx8: correct clock order
ARM: dts: imx6dl-yapp4: Do not allow PM to switch PU regulator off on Q/QP
ARM: dts: imx6qdl-gw59{10,13}: fix user pushbutton GPIO offset
arm64: dts: imx8mn: Correct the usb power domain
...
|
|
The engine busyness stats has a worker function to do things like
64bit extend the 32bit hardware counters. The GuC's reset prepare
function flushes out this worker function to ensure no corruption
happens during the reset. Unforunately, the worker function has an
infinite wait for active resets to finish before doing its work. Thus
a deadlock would occur if the worker function had actually started
just as the reset starts.
The function being used to lock the reset-in-progress mutex is called
intel_gt_reset_trylock(). However, as noted it does not follow
standard 'trylock' conventions and exit if already locked. So rename
the current _trylock function to intel_gt_reset_lock_interruptible(),
which is the behaviour it actually provides. In addition, add a new
implementation of _trylock and call that from the busyness stats
worker instead.
v2: Rename existing trylock to interruptible rather than trying to
preserve the existing (confusing) naming scheme (review comments from
Tvrtko).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-3-John.C.Harrison@Intel.com
|
|
If a context has already been registered prior to first submission
then context init code was not being called. The noticeable effect of
that was the scheduling priority was left at zero (meaning super high
priority) instead of being set to normal. This would occur with
kernel contexts at start of day as they are manually pinned up front
rather than on first submission. So add a call to initialise those
when they are pinned.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-2-John.C.Harrison@Intel.com
|
|
These servers are all on the public versions of the roadmap. The model
numbers for Grand Ridge, Granite Rapids, and Sierra Forest were included
in the September 2022 edition of the Instruction Set Extensions document.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20221103203310.5058-1-tony.luck@intel.com
|
|
as invalid
mmu notifier does not always hold mm->sem during call back. That causes
a race condition between kfd userprt buffer mapping and mmu notifier
which leds to gpu shadder or SDMA access userptr buffer before it has been
mapped to gpu VM. Always map userptr buffer to avoid that though it may make
some userprt buffers mapped two times.
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
After moving all FPU code to the DML folder, we can enable DCN support
for the ARM64 platform. Remove the -mgeneral-regs-only CFLAG from the
code in the DML folder that needs to use hardware FPU, and add a control
mechanism for ARM Neon.
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Ao Zhong <hacc1225@gmail.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
- clear kiq ring after suspend/resume under sriov to aviod kiq ring
test failure
- update irq after resume to fix kiq interrput loss
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
temporary workaround to skip ras error for gc_v11_0_3 until IFWI release later
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For some GPUs with more CUs, the original sibling_map[32]
in struct crat_subtype_cache is not enough
to save the cache information when create the VCRAT table,
so skip filling the struct crat_subtype_cache info instead
fill struct kfd_cache_properties directly to fix this problem.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To allow invoking ip specific callbacks
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:3008:29: error: incompatible function pointer types initializing 'int (*)(void *, uint32_t, long *, uint32_t)' (aka 'int (*)(void *, unsigned int, long *, unsigned int)') with an expression of type 'int (void *, enum PP_OD_DPM_TABLE_COMMAND, long *, uint32_t)' (aka 'int (void *, enum PP_OD_DPM_TABLE_COMMAND, long *, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict]
.odn_edit_dpm_table = smu_od_edit_dpm_table,
^~~~~~~~~~~~~~~~~~~~~
1 error generated.
There are only two implementations of ->odn_edit_dpm_table() in 'struct
amd_pm_funcs': smu_od_edit_dpm_table() and pp_odn_edit_dpm_table(). One
has a second parameter type of 'enum PP_OD_DPM_TABLE_COMMAND' and the
other uses 'u32'. Ultimately, smu_od_edit_dpm_table() calls
->od_edit_dpm_table() from 'struct pptable_funcs' and
pp_odn_edit_dpm_table() calls ->odn_edit_dpm_table() from 'struct
pp_hwmgr_func', which both have a second parameter type of 'enum
PP_OD_DPM_TABLE_COMMAND'.
Update the type parameter in both the prototype in 'struct amd_pm_funcs'
and pp_odn_edit_dpm_table() to 'enum PP_OD_DPM_TABLE_COMMAND', which
cleans up the warning.
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:
drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:412:15: error: incompatible function pointer types initializing 'void (*)(struct amdgpu_device *, u32, u32, u32, u32)' (aka 'void (*)(struct amdgpu_device *, unsigned int, unsigned int, unsigned int, unsigned int)') with an expression of type 'void (struct amdgpu_device *, enum idh_request, u32, u32, u32)' (aka 'void (struct amdgpu_device *, enum idh_request, unsigned int, unsigned int, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict]
.trans_msg = xgpu_ai_mailbox_trans_msg,
^~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:435:15: error: incompatible function pointer types initializing 'void (*)(struct amdgpu_device *, u32, u32, u32, u32)' (aka 'void (*)(struct amdgpu_device *, unsigned int, unsigned int, unsigned int, unsigned int)') with an expression of type 'void (struct amdgpu_device *, enum idh_request, u32, u32, u32)' (aka 'void (struct amdgpu_device *, enum idh_request, unsigned int, unsigned int, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict]
.trans_msg = xgpu_nv_mailbox_trans_msg,
^~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
The type of the second parameter in the prototype should be 'enum
idh_request' instead of 'u32'. Update it to clear up the warnings.
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
One-element arrays are deprecated, and we are replacing them with
flexible array members instead. So, replace one-element array with
flexible-array member in struct _ATOM_FAKE_EDID_PATCH_RECORD and
refactor the rest of the code accordingly.
Important to mention is that doing a build before/after this patch
results in no binary output differences.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/238
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [1]
Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
One-element arrays are deprecated, and we are replacing them with
flexible array members instead. So, replace one-element array with
flexible-array member in struct _ATOM_FAKE_EDID_PATCH_RECORD and
refactor the rest of the code accordingly.
It's worth mentioning that doing a build before/after this patch results
in no binary output differences.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/239
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [1]
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
So the callbacks are set early in case we need them.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
So the callbacks are set early in case we need them.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
So the callbacks are set before we use them.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove unnecessary register program in SRIOV
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Start from soc21, CP does not support MCBP, so disable it.
[how]
Used amgpu_mcbp flag alone instead of checking if is in SRIOV to
enable/disable MCBP.
Only set flag to enable on asic_type prior to soc21 in SRIOV.
Signed-off-by: Yiqing Yao <yiqing.yao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use virt_init_setting instead of per ip version setting.
Signed-off-by: Yiqing Yao <yiqing.yao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Hang on MES timeout if halt_if_hws_hang is set to 1.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
With corresponding FW change fixes issue where triggering CWSR on a
workgroup with waves in s_barrier wouldn't lead to a back-off and
therefore cause a hang.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Tested-by: Graham Sider <Graham.Sider@amd.com>
Acked-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Graham Sider <Graham.Sider@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Some of the GuC state dump messages were adding extra line feeds. When
printing via a DRM printer to dmesg, for example, that messes up the
log formatting as it loses any prefixing from the printer. Given that
the extra line feeds are just in the middle of random bits of GuC
state, there isn't any real need for them. So just remove them
completely.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221031220007.4176835-1-John.C.Harrison@Intel.com
|
|
The devm_ioremap() function returns NULL on error, it doesn't return
error pointers.
Fixes: 99d9ccd973852 ("phy: usb: Add USB2.0 phy driver for Sunplus SP7021")
Signed-off-by: Peng Wu <wupeng58@huawei.com>
Link: https://lore.kernel.org/r/20220911060053.123594-1-wupeng58@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Pull drm fixes from Dave Airlie:
"This is the weekly fixes for rc4. Misc fixes across rockchip, imx,
amdgpu and i915.
The biggest change is for amdkfd where the trap handler needs an
updated fw from a header which makes it a bit larger. I hadn't noticed
this particular file before so I'm going to figure out what the magic
is for, but the fix should be fine for now.
amdgpu:
- DCN 3.1.4 fixes
- DCN 3.2.x fixes
- GC 11.x fixes
- Virtual display fix
- Fail suspend if resources can't be evicted
- SR-IOV fix
- Display PSR fix
amdkfd:
- Fix possible NULL pointer deref
- GC 11.x trap handler fix
i915:
- Add locking around DKL PHY register accesses
- Stop abusing swiotlb_max_segment
- Filter out invalid outputs more sensibly
- Setup DDC fully before output init
- Simplify intel_panel_add_edid_alt_fixed_modes()
- Grab mode_config.mutex during LVDS init to avoid WARNs
rockchip:
- fix probing issues
- fix framebuffer without iommu
- fix vop selection
- fix NULL ptr access
imx:
- Fix Kconfig
- fix mode_valid function"
* tag 'drm-fixes-2022-11-04-1' of git://anongit.freedesktop.org/drm/drm: (35 commits)
drm/amdkfd: update GFX11 CWSR trap handler
drm/amd/display: Investigate tool reported FCLK P-state deviations
drm/amd/display: Add DSC delay factor workaround
drm/amd/display: Round up DST_after_scaler to nearest int
drm/amd/display: Use forced DSC bpp in DML
drm/amd/display: Fix DCN32 DSC delay calculation
drm/amdgpu: Disable GPU reset on SRIOV before remove pci.
drm/amdgpu: disable GFXOFF during compute for GFX11
drm/amd: Fail the suspend if resources can't be evicted
drm/amdkfd: Fix NULL pointer dereference in svm_migrate_to_ram()
drm/amdgpu: correct MES debugfs versions
drm/amdgpu: set fb_modifiers_not_supported in vkms
drm/amd/display: cursor update command incomplete
drm/amd/display: Enable timing sync on DCN32
drm/amd/display: Set memclk levels to be at least 1 for dcn32
drm/amd/display: Update latencies on DCN321
drm/amd/display: Limit dcn32 to 1950Mhz display clock
drm/amd/display: Ignore Cable ID Feature
drm/amd/display: Update DSC capabilitie for DCN314
drm/imx: imx-tve: Fix return type of imx_tve_connector_mode_valid
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Fixes in clk drivers and some clk rate range fixes in the core as
well:
- Make sure the struct clk_rate_request is more sane
- Remove a WARN_ON that was triggering for clks with no parents that
can change frequency
- Fix bad i2c bus transactions on Renesas rs9
- Actually return an error in clk_mt8195_topck_probe() on an error
path
- Keep the GPU memories powered while the clk isn't enabled on
Qualcomm's sc7280 SoC
- Fix the parent clk for HSCIF modules on Renesas' R-Car V4H SoC"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: Update the force mem core bit for GPU clocks
clk: Initialize max_rate in struct clk_rate_request
clk: Initialize the clk_rate_request even if clk_core is NULL
clk: Remove WARN_ON NULL parent in clk_core_init_rate_req()
clk: renesas: r8a779g0: Fix HSCIF parent clocks
clk: renesas: r8a779g0: Add SASYNCPER clocks
clk: mediatek: clk-mt8195-topckgen: Fix error return code in clk_mt8195_topck_probe()
clk: sifive: select by default if SOC_SIFIVE
clk: rs9: Fix I2C accessors
|
|
We got a syzkaller problem because of aarch64 alignment fault
if KFENCE enabled. When the size from user bpf program is an odd
number, like 399, 407, etc, it will cause the struct skb_shared_info's
unaligned access. As seen below:
BUG: KFENCE: use-after-free read in __skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
Use-after-free read at 0xffff6254fffac077 (in kfence-#213):
__lse_atomic_add arch/arm64/include/asm/atomic_lse.h:26 [inline]
arch_atomic_add arch/arm64/include/asm/atomic.h:28 [inline]
arch_atomic_inc include/linux/atomic-arch-fallback.h:270 [inline]
atomic_inc include/asm-generic/atomic-instrumented.h:241 [inline]
__skb_clone+0x23c/0x2a0 net/core/skbuff.c:1032
skb_clone+0xf4/0x214 net/core/skbuff.c:1481
____bpf_clone_redirect net/core/filter.c:2433 [inline]
bpf_clone_redirect+0x78/0x1c0 net/core/filter.c:2420
bpf_prog_d3839dd9068ceb51+0x80/0x330
bpf_dispatcher_nop_func include/linux/bpf.h:728 [inline]
bpf_test_run+0x3c0/0x6c0 net/bpf/test_run.c:53
bpf_prog_test_run_skb+0x638/0xa7c net/bpf/test_run.c:594
bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
__do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
__se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
kfence-#213: 0xffff6254fffac000-0xffff6254fffac196, size=407, cache=kmalloc-512
allocated by task 15074 on cpu 0 at 1342.585390s:
kmalloc include/linux/slab.h:568 [inline]
kzalloc include/linux/slab.h:675 [inline]
bpf_test_init.isra.0+0xac/0x290 net/bpf/test_run.c:191
bpf_prog_test_run_skb+0x11c/0xa7c net/bpf/test_run.c:512
bpf_prog_test_run kernel/bpf/syscall.c:3148 [inline]
__do_sys_bpf kernel/bpf/syscall.c:4441 [inline]
__se_sys_bpf+0xad0/0x1634 kernel/bpf/syscall.c:4381
__arm64_sys_bpf+0x50/0x60 kernel/bpf/syscall.c:4381
To fix the problem, we adjust @size so that (@size + @hearoom) is a
multiple of SMP_CACHE_BYTES. So we make sure the struct skb_shared_info
is aligned to a cache line.
Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command")
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20221102081620.1465154-1-zhongbaisong@huawei.com
|