Age | Commit message (Collapse) | Author |
|
v2:
- Move to context workarounds. ROW_CHICKEN4 is part of the context
image on gen11 (although it isn't on gen12).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-5-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
|
|
The bspec description for this workaround tells us to program
0xFFFF_FFFF into both FBC_RT_BASE_ADDR_REGISTER_* registers, but we've
previously found that this leads to failures in CI. Our suspicion is
that the failures are caused by this valid turning on the "address valid
bit" even though we're intentionally supplying an invalid address.
Experimentation has shown that setting all bits _except_ for the
RT_VALID bit seems to avoid these failures.
v2:
- Mask off the RT_VALID bit. Experimentation with CI trybot indicates
that this is necessary to avoid reset failures on BCS.
v3:
- Program RT_BASE before RT_BASE_UPPER so that the valid bit is turned
off by the first write. (Chris)
Bspec: 11388
Bspec: 33451
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-4-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
|
|
On gen11 the XY_FAST_COPY_BLT command has some size restrictions on its
usage. Although this instruction is mainly used by userspace, i915 also
uses it to copy object contents during some selftests, so let's ensure
the restrictions are followed.
Bspec: 6544
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-3-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
|
|
The bspec documents multiple MCR ranges; make sure they're all captured
by the driver.
Bspec: 13991, 52079
Fixes: 592a7c5e082e ("drm/i915: Extend non readable mcr range")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311162300.1838847-2-matthew.d.roper@intel.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
|
|
1db8c142b6c5 (drm/scheduler: Add drm_sched_suspend/resume_timeout()) made
the job_list_lock IRQ safe in as the suspend/resume calls were expected to
be called from IRQ context. This usage never materialized in upstream.
Instead amdgpu started locking the job_list_lock in an IRQ unsafe way in
amdgpu_ib_preempt_mark_partial_job() and amdgpu_ib_preempt_job_recovery(),
which leads to potential deadlock if one would actually start to call the
drm_sched_suspend/resume_timeout functions from IRQ context.
As no current user needs the locking to be IRQ safe, the local IRQ
disable/enable is pure overhead. Fix the inconsistent locking by changing
all uses of job_list_lock to use the IRQ unsafe locking primitives.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a new trace event to show when jobs are run on the HW.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
clean up unused header in swsmu driver stack:
1. pp_debug.h
2. amd_pcie.h
3. soc15_common.h
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
AccVGPRs are newly added in arcturus. Before reading these
registers, they should be initialized. Otherwise edc error
happens, when RAS is enabled.
v2: reuse the existing logical to calculate register size
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This fixes a problem found on the MacBookPro 2017 Retina panel:
The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec aka LINK_RATE_RBR2
aka 0xc), but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.
This restricts the panel to 8 bpc, not providing the full
color depth of the panel on Linux <= 5.5. Additionally, commit
'4a8ca46bae8a ("drm/amd/display: Default max bpc to 16 for eDP")'
introduced into Linux 5.6-rc1 will unclamp panel depth to
its full 10 bpc, thereby requiring a eDP bandwidth for all
modes that exceeds the bandwidth available and causes all modes
to fail validation -> No modes for the laptop panel -> failure
to set any mode -> Panel goes dark.
This patch adds a quirk specific to the MBP 2017 15" Retina
panel to override reported max link rate to the correct maximum
of 0xc = LINK_RATE_RBR2 to fix the darkness and reduced display
precision.
Please apply for Linux 5.6+ to avoid regressing Apple MBP panel
support.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
These are some very loud debug statements that get printed on every
vblank when driver level debug printing is enabled in DRM, and doesn't
really tell us anything that isn't related to vblanks. So let's move
this over to the proper debug flag to be a little less spammy with our
debug output.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If "speed" is zero then we use it as a divisor to find "prescale". It's
better to move the check for zero to the very start of the function.
Fixes: 9eeec26a1339 ("drm/amd/display: Refine i2c frequency calculating sequence")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
These lines were accidentally indented 4 spaces more than they should
be.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We can remove the NULL check for "res_ctx" and
"res_ctx->pipe_ctx[i].stream->link". Also it's nicer to align the
conditions using spaces so I re-indented a bit.
Longer explanation: The "res_ctx" pointer points to an address in the
middle of a struct so it can't be NULL. For
"res_ctx->pipe_ctx[i].stream->link" we know that it is equal to "link"
and "link" is non-NULL.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix the Kconfig dependencies so that the menu is presented
correctly by adding a dependency on DRM_AMDGPU to the "menu"
Kconfig statement. This makes a continuous dependency on
DRM_AMDGPU in the DRM AMD menus and eliminates a broken menu
structure.
Fixes: a8fe58cec351 ("drm/amd: add ACP driver support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: David (ChunMing) Zhou <David1.Zhou@amd.com>
Cc: Maruthi Bayyavarapu <maruthi.bayyavarapu@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Convert the various uses of fallthrough comments to fallthrough;
Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The automated conversion of /* fallthrough */ comments converted
a comment outside of an #ifdef/#endif case block that should be
inside the block.
Move the fallthrough inside the block to silence the warning.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Convert the various uses of fallthrough comments to fallthrough;
Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Convert the various uses of fallthrough comments to fallthrough;
Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix the warning
"warn: variable dereferenced before check 'obj' (see line 1131)"
by removing unnecessary checks as amdgpu_ras_debugfs_create_all()
is only called from amdgpu_debugfs_init() where obj member in
con->head list is not NULL.
Use list_for_each_entry() instead list_for_each_entry_safe() as obj
do not to be freeing or removing from list during this process.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This can fix the baco reset failure seen on Navi10.
And this should be a low risk fix as the same sequence
is already used for system suspend/resume.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
RAS support capability needs to be updated on top of different
memeory ECC enablement, and remove redundant memory ecc check
in gmc module for vega20 and arcturus.
v2: check HBM ECC enablement and set ras mask accordingly.
v3: avoid to invoke atomfirmware interface to query twice.
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The offset into the array was specified in bytes but should
be in terms of 32-bit words. Also prevent large reads that
would also cause a buffer overread.
v2: Read from correct offset from internal storage buffer.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In dcn20_funcs and dcn21_funcs struct, the member ".dsc_pg_control = NULL"
should be removed due to .dsc_pg_control be assigned to dcn20_dsc_pg_control.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
include amdgpu_ras.h head file instead of use extern
ras_debugfs_create_all function
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
disallow the logical to be enabled on platforms that
don't support gfx ras at this stage, like sriov skus,
dgpu with legacy ras.etc
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
invoking an error injection successfully will cause an at_event intterrupt that
will occur before the invoke sequence can complete causing an invalid error
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
refine the assignment for vcn.num_vcn_inst,
vcn.harvest_config, vcn.num_enc_rings in VF
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-current
at24 fixes for v5.6-rc6
- fix regulator underflow bug introduced during the v5.6 merge window
|
|
Lockdep reports "WARNING: lock held when returning to user space!" due to
async write holding freeze lock over the write. Apparently aio.c already
deals with this by lying to lockdep about the state of the lock.
Do the same here. No need to check for S_IFREG() here since these file ops
are regular-only.
Reported-by: syzbot+9331a354f4f624a52a55@syzkaller.appspotmail.com
Fixes: 2406a307ac7d ("ovl: implement async IO routines")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Fix up two bugs in the coversion to xino_mode:
1. xino=off does not always end up in disabled mode
2. xino=auto on 32bit arch should end up in disabled mode
Take a proactive approach to disabling xino on 32bit kernel:
1. Disable XINO_AUTO config during build time
2. Disable xino with a warning on mount time
As a by product, xino=on on 32bit arch also ends up in disabled mode.
We never intended to enable xino on 32bit arch and this will make the
rest of the logic simpler.
Fixes: 0f831ec85eda ("ovl: simplify ovl_same_sb() helper")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
The L3 interconnect's memory map is from 0x0 to
0xffffffff. Out of this, System memory (SDRAM) can be
accessed from 0x80000000 to 0xffffffff (2GB)
DRA7 does support 4GB of SDRAM but upper 2GB can only be
accessed by the MPU subsystem.
Add the dma-ranges property to reflect the physical address limit
of the L3 bus.
Issues ere observed only with SATA on DRA7-EVM with 4GB RAM
and CONFIG_ARM_LPAE enabled. This is because the controller
supports 64-bit DMA and its driver sets the dma_mask to 64-bit
thus resulting in DMA accesses beyond L3 limit of 2G.
Setting the correct bus_dma_limit fixes the issue.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Cc: stable@kernel.org
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
Shutdown of firmware framebuffer has a bunch of problems. Because
of this the framebuffer region might still be reserved even after
drm_fb_helper_remove_conflicting_pci_framebuffers() returned.
Don't consider pci_request_region() failure for the framebuffer
region as fatal error to workaround this issue.
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20200313084152.2734-1-kraxel@redhat.com
|
|
When a kernel ULP requests the rdmavt to create a completion queue, it
allocated the queue and set cq->kqueue to point to it. However, when the
completion queue is destroyed, cq->queue is freed instead, leading to a
memory leak:
https://lore.kernel.org/r/215235485.15264050.1583334487658.JavaMail.zimbra@redhat.com
unreferenced object 0xffffc90006639000 (size 12288):
comm "kworker/u128:0", pid 8, jiffies 4295777598 (age 589.085s)
hex dump (first 32 bytes):
4d 00 00 00 4d 00 00 00 00 c0 08 ac 8b 88 ff ff M...M...........
00 00 00 00 80 00 00 00 00 00 00 00 10 00 00 00 ................
backtrace:
[<0000000035a3d625>] __vmalloc_node_range+0x361/0x720
[<000000002942ce4f>] __vmalloc_node.constprop.30+0x63/0xb0
[<00000000f228f784>] rvt_create_cq+0x98a/0xd80 [rdmavt]
[<00000000b84aec66>] __ib_alloc_cq_user+0x281/0x1260 [ib_core]
[<00000000ef3764be>] nvme_rdma_cm_handler+0xdb7/0x1b80 [nvme_rdma]
[<00000000936b401c>] cma_cm_event_handler+0xb7/0x550 [rdma_cm]
[<00000000d9c40b7b>] addr_handler+0x195/0x310 [rdma_cm]
[<00000000c7398a03>] process_one_req+0xdd/0x600 [ib_core]
[<000000004d29675b>] process_one_work+0x920/0x1740
[<00000000efedcdb5>] worker_thread+0x87/0xb40
[<000000005688b340>] kthread+0x327/0x3f0
[<0000000043a168d6>] ret_from_fork+0x3a/0x50
This patch fixes the issue by freeing cq->kqueue instead.
Fixes: 239b0e52d8aa ("IB/hfi1: Move rvt_cq_wc struct into uapi directory")
Link: https://lore.kernel.org/r/20200313123957.14343.43879.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org> # 5.4.x
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The vector management code assumes that managed interrupts cannot be
migrated away from an online CPU. free_moved_vector() has a WARN_ON_ONCE()
which triggers when a managed interrupt vector association on a online CPU
is cleared. The CPU offline code uses a different mechanism which cannot
trigger this.
This assumption is not longer correct because the new CPU isolation feature
which affects the placement of managed interrupts must be able to move a
managed interrupt away from an online CPU.
There are two reasons why this can happen:
1) When the interrupt is activated the affinity mask which was
established in irq_create_affinity_masks() is handed in to
the vector allocation code. This mask contains all CPUs to which
the interrupt can be made affine to, but this does not take the
CPU isolation 'managed_irq' mask into account.
When the interrupt is finally requested by the device driver then the
affinity is checked again and the CPU isolation 'managed_irq' mask is
taken into account, which moves the interrupt to a non-isolated CPU if
possible.
2) The interrupt can be affine to an isolated CPU because the
non-isolated CPUs in the calculated affinity mask are not online.
Once a non-isolated CPU which is in the mask comes online the
interrupt is migrated to this non-isolated CPU
In both cases the regular online migration mechanism is used which triggers
the WARN_ON_ONCE() in free_moved_vector().
Case #1 could have been addressed by taking the isolation mask into
account, but that would require a massive code change in the activation
logic and the eventual migration event was accepted as a reasonable
tradeoff when the isolation feature was developed. But even if #1 would be
addressed, #2 would still trigger it.
Of course the warning in free_moved_vector() was overlooked at that time
and the above two cases which have been discussed during patch review have
obviously never been tested before the final submission.
So keep it simple and remove the warning.
[ tglx: Rewrote changelog and added a comment to free_moved_vector() ]
Fixes: 11ea68f553e2 ("genirq, sched/isolation: Isolate from handling managed interrupts")
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lkml.kernel.org/r/20200312205830.81796-1-peterx@redhat.com
|
|
i2c_verify_client() can fail, so we need to put the device when that
happens.
Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications")
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Non-IB devices do not have a umad interface and the client_data will be
left set to NULL. In this case calling get_nl_info() will try to kref a
NULL cdev causing a crash:
general protection fault, probably for non-canonical address 0xdffffc00000000ba: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000005d0-0x00000000000005d7]
CPU: 0 PID: 20851 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:kobject_get+0x35/0x150 lib/kobject.c:640
Code: 53 e8 3f b0 8b f9 4d 85 e4 0f 84 a2 00 00 00 e8 31 b0 8b f9 49 8d 7c 24 3c 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f b6 04 02 48 89 fa
+83 e2 07 38 d0 7f 08 84 c0 0f 85 eb 00 00 00
RSP: 0018:ffffc9000946f1a0 EFLAGS: 00010203
RAX: dffffc0000000000 RBX: ffffffff85bdbbb0 RCX: ffffc9000bf22000
RDX: 00000000000000ba RSI: ffffffff87e9d78f RDI: 00000000000005d4
RBP: ffffc9000946f1b8 R08: ffff8880581a6440 R09: ffff8880581a6cd0
R10: fffffbfff154b838 R11: ffffffff8aa5c1c7 R12: 0000000000000598
R13: 0000000000000000 R14: ffffc9000946f278 R15: ffff88805cb0c4d0
FS: 00007faa9e8af700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30121000 CR3: 000000004515d000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
get_device+0x25/0x40 drivers/base/core.c:2574
__ib_get_client_nl_info+0x205/0x2e0 drivers/infiniband/core/device.c:1861
ib_get_client_nl_info+0x35/0x180 drivers/infiniband/core/device.c:1881
nldev_get_chardev+0x575/0xac0 drivers/infiniband/core/nldev.c:1621
rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline]
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x5d9/0x980 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:672
____sys_sendmsg+0x753/0x880 net/socket.c:2343
___sys_sendmsg+0x100/0x170 net/socket.c:2397
__sys_sendmsg+0x105/0x1d0 net/socket.c:2430
__do_sys_sendmsg net/socket.c:2439 [inline]
__se_sys_sendmsg net/socket.c:2437 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Cc: stable@kernel.org
Fixes: 8f71bb0030b8 ("RDMA: Report available cdevs through RDMA_NLDEV_CMD_GET_CHARDEV")
Link: https://lore.kernel.org/r/20200310075339.238090-1-leon@kernel.org
Reported-by: syzbot+46fe08363dbba223dec5@syzkaller.appspotmail.com
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
If name memory allocation fails the name will be left empty and
device_add_one() will crash:
kobject: (0000000004952746): attempted to be registered with empty name!
WARNING: CPU: 0 PID: 329 at lib/kobject.c:234 kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 329 Comm: syz-executor.5 Not tainted 5.6.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
panic+0x2e3/0x75c kernel/panic.c:221
__warn.cold+0x2f/0x3e kernel/panic.c:582
report_bug+0x289/0x300 lib/bug.c:195
fixup_bug arch/x86/kernel/traps.c:174 [inline]
fixup_bug arch/x86/kernel/traps.c:169 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234
Code: 1a 98 ca f9 e9 f0 f8 ff ff 4c 89 f7 e8 6d 98 ca f9 e9 95 f9 ff ff e8 c3 f0 8b f9 4c 89 e6 48 c7 c7 a0 0e 1a 89 e8 e3 41 5c f9 <0f> 0b 41 bd ea ff ff ff e9 52 ff ff ff e8 a2 f0 8b f9 0f 0b e8 9b
RSP: 0018:ffffc90005b27908 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000040000 RSI: ffffffff815eae46 RDI: fffff52000b64f13
RBP: ffffc90005b27960 R08: ffff88805aeba480 R09: ffffed1015d06659
R10: ffffed1015d06658 R11: ffff8880ae8332c7 R12: ffff8880a37fd000
R13: 0000000000000000 R14: ffff888096691780 R15: 0000000000000001
kobject_add_varg lib/kobject.c:390 [inline]
kobject_add+0x150/0x1c0 lib/kobject.c:442
device_add+0x3be/0x1d00 drivers/base/core.c:2412
add_one_compat_dev drivers/infiniband/core/device.c:901 [inline]
add_one_compat_dev+0x46a/0x7e0 drivers/infiniband/core/device.c:857
rdma_dev_init_net+0x2eb/0x490 drivers/infiniband/core/device.c:1120
ops_init+0xb3/0x420 net/core/net_namespace.c:137
setup_net+0x2d5/0x8b0 net/core/net_namespace.c:327
copy_net_ns+0x29e/0x5a0 net/core/net_namespace.c:468
create_new_namespaces+0x403/0xb50 kernel/nsproxy.c:108
unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:229
ksys_unshare+0x444/0x980 kernel/fork.c:2955
__do_sys_unshare kernel/fork.c:3023 [inline]
__se_sys_unshare kernel/fork.c:3021 [inline]
__x64_sys_unshare+0x31/0x40 kernel/fork.c:3021
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Link: https://lore.kernel.org/r/20200309193200.GA10633@ziepe.ca
Cc: stable@kernel.org
Fixes: 4e0f7b907072 ("RDMA/core: Implement compat device/sysfs tree in net namespace")
Reported-by: syzbot+ab4dae63f7d310641ded@syzkaller.appspotmail.com
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Commit 6825d3ea6cde ("iommu/vt-d: Add debugfs support to show register
contents") dumps the register contents for all IOMMU devices.
Currently, a 64 bit read(dmar_readq) is done for all the IOMMU registers,
even though some of the registers are 32 bits, which is incorrect.
Use the correct read function variant (dmar_readl/dmar_readq) while
reading the contents of 32/64 bit registers respectively.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Link: https://lore.kernel.org/r/1583784587-26126-2-git-send-email-megha.dey@linux.intel.com
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
add_taint
Quoting from the comment describing the WARN functions in
include/asm-generic/bug.h:
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
* significant kernel issues that need prompt attention if they should ever
* appear at runtime.
*
* Do not use these macros when checking for invalid external inputs
The (buggy) firmware tables which the dmar code was calling WARN_TAINT
for really are invalid external inputs. They are not under the kernel's
control and the issues in them cannot be fixed by a kernel update.
So logging a backtrace, which invites bug reports to be filed about this,
is not helpful.
Fixes: 556ab45f9a77 ("ioat2: catch and recover from broken vtd configurations v6")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200309182510.373875-1-hdegoede@redhat.com
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=701847
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Quoting from the comment describing the WARN functions in
include/asm-generic/bug.h:
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
* significant kernel issues that need prompt attention if they should ever
* appear at runtime.
*
* Do not use these macros when checking for invalid external inputs
The (buggy) firmware tables which the dmar code was calling WARN_TAINT
for really are invalid external inputs. They are not under the kernel's
control and the issues in them cannot be fixed by a kernel update.
So logging a backtrace, which invites bug reports to be filed about this,
is not helpful.
Some distros, e.g. Fedora, have tools watching for the kernel backtraces
logged by the WARN macros and offer the user an option to file a bug for
this when these are encountered. The WARN_TAINT in dmar_parse_one_rmrr
+ another iommu WARN_TAINT, addressed in another patch, have lead to over
a 100 bugs being filed this way.
This commit replaces the WARN_TAINT("...") call, with a
pr_warn(FW_BUG "...") + add_taint(TAINT_FIRMWARE_WORKAROUND, ...) call
avoiding the backtrace and thus also avoiding bug-reports being filed
about this against the kernel.
Fixes: f5a68bb0752e ("iommu/vt-d: Mark firmware tainted if RMRR fails sanity check")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: Barret Rhoden <brho@google.com>
Link: https://lore.kernel.org/r/20200309140138.3753-3-hdegoede@redhat.com
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1808874
|
|
Quoting from the comment describing the WARN functions in
include/asm-generic/bug.h:
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
* significant kernel issues that need prompt attention if they should ever
* appear at runtime.
*
* Do not use these macros when checking for invalid external inputs
The (buggy) firmware tables which the dmar code was calling WARN_TAINT
for really are invalid external inputs. They are not under the kernel's
control and the issues in them cannot be fixed by a kernel update.
So logging a backtrace, which invites bug reports to be filed about this,
is not helpful.
Some distros, e.g. Fedora, have tools watching for the kernel backtraces
logged by the WARN macros and offer the user an option to file a bug for
this when these are encountered. The WARN_TAINT in warn_invalid_dmar()
+ another iommu WARN_TAINT, addressed in another patch, have lead to over
a 100 bugs being filed this way.
This commit replaces the WARN_TAINT("...") calls, with
pr_warn(FW_BUG "...") + add_taint(TAINT_FIRMWARE_WORKAROUND, ...) calls
avoiding the backtrace and thus also avoiding bug-reports being filed
about this against the kernel.
Fixes: fd0c8894893c ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables")
Fixes: e625b4a95d50 ("iommu/vt-d: Parse ANDD records")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200309140138.3753-2-hdegoede@redhat.com
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1564895
|
|
Empty device names cannot be added to sysfs and crash with:
kobject: (00000000f9de3792): attempted to be registered with empty name!
WARNING: CPU: 1 PID: 10856 at lib/kobject.c:234 kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 10856 Comm: syz-executor459 Not tainted 5.6.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
panic+0x2e3/0x75c kernel/panic.c:221
__warn.cold+0x2f/0x3e kernel/panic.c:582
report_bug+0x289/0x300 lib/bug.c:195
fixup_bug arch/x86/kernel/traps.c:174 [inline]
fixup_bug arch/x86/kernel/traps.c:169 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:kobject_add_internal+0x7ac/0x9a0 lib/kobject.c:234
Code: 7a ca ca f9 e9 f0 f8 ff ff 4c 89 f7 e8 cd ca ca f9 e9 95 f9 ff ff e8 13 25 8c f9 4c 89 e6 48 c7 c7 a0 08 1a 89 e8 a3 76 5c f9 <0f> 0b 41 bd ea ff ff ff e9 52 ff ff ff e8 f2 24 8c f9 0f 0b e8 eb
RSP: 0018:ffffc90002006eb0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815eae46 RDI: fffff52000400dc8
RBP: ffffc90002006f08 R08: ffff8880972ac500 R09: ffffed1015d26659
R10: ffffed1015d26658 R11: ffff8880ae9332c7 R12: ffff888093034668
R13: 0000000000000000 R14: ffff8880a69d7600 R15: 0000000000000001
kobject_add_varg lib/kobject.c:390 [inline]
kobject_add+0x150/0x1c0 lib/kobject.c:442
device_add+0x3be/0x1d00 drivers/base/core.c:2412
ib_register_device drivers/infiniband/core/device.c:1371 [inline]
ib_register_device+0x93e/0xe40 drivers/infiniband/core/device.c:1343
rxe_register_device+0x52e/0x655 drivers/infiniband/sw/rxe/rxe_verbs.c:1231
rxe_add+0x122b/0x1661 drivers/infiniband/sw/rxe/rxe.c:302
rxe_net_add+0x91/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:539
rxe_newlink+0x39/0x90 drivers/infiniband/sw/rxe/rxe.c:318
nldev_newlink+0x28a/0x430 drivers/infiniband/core/nldev.c:1538
rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline]
rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
rdma_nl_rcv+0x5d9/0x980 drivers/infiniband/core/netlink.c:259
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:672
____sys_sendmsg+0x753/0x880 net/socket.c:2343
___sys_sendmsg+0x100/0x170 net/socket.c:2397
__sys_sendmsg+0x105/0x1d0 net/socket.c:2430
__do_sys_sendmsg net/socket.c:2439 [inline]
__se_sys_sendmsg net/socket.c:2437 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Prevent empty names when checking the name provided from userspace during
newlink and rename.
Fixes: 3856ec4b93c9 ("RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK support")
Fixes: 05d940d3a3ec ("RDMA/nldev: Allow IB device rename through RDMA netlink")
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20200309191648.GA30852@ziepe.ca
Reported-and-tested-by: syzbot+da615ac67d4dbea32cbc@syzkaller.appspotmail.com
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
This fixes a problem found on the MacBookPro 2017 Retina panel:
The panel reports 10 bpc color depth in its EDID, and the
firmware chooses link settings at boot which support enough
bandwidth for 10 bpc (324000 kbit/sec aka LINK_RATE_RBR2
aka 0xc), but the DP_MAX_LINK_RATE dpcd register only reports
2.7 Gbps (multiplier value 0xa) as possible, in direct
contradiction of what the firmware successfully set up.
This restricts the panel to 8 bpc, not providing the full
color depth of the panel on Linux <= 5.5. Additionally, commit
'4a8ca46bae8a ("drm/amd/display: Default max bpc to 16 for eDP")'
introduced into Linux 5.6-rc1 will unclamp panel depth to
its full 10 bpc, thereby requiring a eDP bandwidth for all
modes that exceeds the bandwidth available and causes all modes
to fail validation -> No modes for the laptop panel -> failure
to set any mode -> Panel goes dark.
This patch adds a quirk specific to the MBP 2017 15" Retina
panel to override reported max link rate to the correct maximum
of 0xc = LINK_RATE_RBR2 to fix the darkness and reduced display
precision.
Please apply for Linux 5.6+ to avoid regressing Apple MBP panel
support.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This can fix the baco reset failure seen on Navi10.
And this should be a low risk fix as the same sequence
is already used for system suspend/resume.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The offset into the array was specified in bytes but should
be in terms of 32-bit words. Also prevent large reads that
would also cause a buffer overread.
v2: Read from correct offset from internal storage buffer.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
In dcn20_funcs and dcn21_funcs struct, the member ".dsc_pg_control = NULL"
should be removed due to .dsc_pg_control be assigned to dcn20_dsc_pg_control.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Throw in the inverse patterns to create more examples of poison to use
against the LRC state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200313102812.30173-1-chris@chris-wilson.co.uk
|
|
With CONFIG_KASAN_VMALLOC, new page tables are created at the time
shadow memory for vmalloc area is unmapped. If some parts of the
page table still have entries to the zero page shadow memory, the
entries are wrongly marked RW.
With CONFIG_KASAN_VMALLOC, almost the entire kernel address space
is managed by KASAN. To make it simple, just create KASAN page tables
for the entire kernel space at kasan_init(). That doesn't use much
more space, and that's anyway already done for hash platforms.
Fixes: 3d4247fcc938 ("powerpc/32: Add support of KASAN_VMALLOC")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ef5248fc1f496c6b0dfdb59380f24968f25f75c5.1583513368.git.christophe.leroy@c-s.fr
|
|
Pull drm fixes from Dave Airlie:
"It's a bit quieter, probably not as much as it could be.
There is on large regression fix in here from Lyude for displayport
bandwidth calculations, there've been reports of multi-monitor in
docks not working since -rc1 and this has been tested to fix those.
Otherwise it's a bunch of i915 (with some GVT fixes), a set of amdgpu
watermark + bios fixes, and an exynos iommu cleanup fix.
core:
- DP MST bandwidth regression fix.
i915:
- hard lockup fix
- GVT fixes
- 32-bit alignment issue fix
- timeline wait fixes
- cacheline_retire and free
amdgpu:
- Update the display watermark bounding box for navi14
- Fix fetching vbios directly from rom on vega20/arcturus
- Navi and renoir watermark fixes
exynos:
- iommu object cleanup fix"
`
* tag 'drm-fixes-2020-03-13' of git://anongit.freedesktop.org/drm/drm:
drm/dp_mst: Rewrite and fix bandwidth limit checks
drm/dp_mst: Reprobe path resources in CSN handler
drm/dp_mst: Use full_pbn instead of available_pbn for bandwidth checks
drm/dp_mst: Rename drm_dp_mst_is_dp_mst_end_device() to be less redundant
drm/i915: Defer semaphore priority bumping to a workqueue
drm/i915/gt: Close race between cacheline_retire and free
drm/i915/execlists: Enable timeslice on partial virtual engine dequeue
drm/i915: be more solid in checking the alignment
drm/i915/gvt: Fix dma-buf display blur issue on CFL
drm/i915: Return early for await_start on same timeline
drm/i915: Actually emit the await_start
drm/amdgpu/powerplay: nv1x, renior copy dcn clock settings of watermark to smu during boot up
drm/exynos: Fix cleanup of IOMMU related objects
drm/amdgpu: correct ROM_INDEX/DATA offset for VEGA20
drm/amd/display: update soc bb for nv14
drm/i915/gvt: Fix emulated vbt size issue
drm/i915/gvt: Fix unnecessary schedule timer when no vGPU exits
|
|
All the files in Documentation/kbuild/ were converted to reST.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|