Age | Commit message (Collapse) | Author |
|
Dave writes:
"drm fixes for 4.19-rc6
Looks like a pretty normal week for graphics,
core: syncobj fix, panel link regression revert
amd: suspend/resume fixes, EDID emulation fix
mali-dp: NV12 writeback and vblank reset fixes
etnaviv: DMA setup fix"
* tag 'drm-fixes-2018-09-28' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Fix Edid emulation for linux
drm/amd/display: Fix Vega10 lightup on S3 resume
drm/amdgpu: Fix vce work queue was not cancelled when suspend
Revert "drm/panel: Add device_link from panel device to DRM device"
drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set
drm/malidp: Fix writeback in NV12
drm: mali-dp: Call drm_crtc_vblank_reset on device init
drm/etnaviv: add DMA configuration for etnaviv platform device
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Palmer writes:
"A Single RISC-V Update for 4.19-rc6
The Debian guys have been pushing on our port and found some
unversioned symbols leaking into modules. This PR contains a single
fix for that issue."
* tag 'riscv-for-linus-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
RISC-V: include linux/ftrace.h in asm-prototypes.h
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Bjorn writes:
"PCI fixes:
- Fix ACPI hotplug issue that causes black screen crash at boot (Mika
Westerberg)
- Fix DesignWare "scheduling while atomic" issues (Jisheng Zhang)
- Add PPC contacts to MAINTAINERS for PCI core error handling (Bjorn
Helgaas)
- Sort Mobiveil MAINTAINERS entry (Lorenzo Pieralisi)"
* tag 'pci-v4.19-fixes-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge
PCI: dwc: Fix scheduling while atomic issues
MAINTAINERS: Move mobiveil PCI driver entry where it belongs
MAINTAINERS: Update PPC contacts for PCI core error handling
|
|
The debounce value passed to mmc_gpiod_request_cd() function is in
microseconds, but msecs_to_jiffies() requires the value to be in
miliseconds to properly calculate the delay, so adjust the value stored
in cd_debounce_delay_ms context entry.
Fixes: 1d71926bbd59 ("mmc: core: Fix debounce time to use microseconds")
Fixes: bfd694d5e21c ("mmc: core: Add tunable delay before detecting card
after card is inserted")
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Pull NVMe fix from Christoph.
* 'nvme-4.19' of git://git.infradead.org/nvme:
nvme: properly propagate errors in nvme_mpath_init
|
|
Commit a46b53672b2c2e3770b38a4abf90d16364d2584b ("xen/blkfront: cleanup
stale persistent grants") introduced a regression as purged persistent
grants were not pu into the list of free grants again. Correct that.
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fix didn't work for all cases, reverting to add a (hopefully)
better fix.
This reverts commit f151ba989d149bbdfc90e5405724bbea094f9b17.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
cgroup_storage_update_elem() shouldn't accept any flags
argument values except BPF_ANY and BPF_EXIST to guarantee
the backward compatibility, had a new flag value been added.
Fixes: de9cbbaadba5 ("bpf: introduce cgroup storage maps")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Only check for the network namespace if the socket is available.
Fixes: f564650106a6 ("netfilter: check if the socket netns is correct.")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Unfortunately some versions of gcc emit following warning:
$ make net/xfrm/xfrm_output.o
linux/compiler.h:252:20: warning: array subscript is above array bounds [-Warray-bounds]
hook_head = rcu_dereference(net->nf.hooks_arp[hook]);
^~~~~~~~~~~~~~~~~~~~~
xfrm_output_resume passes skb_dst(skb)->ops->family as its 'pf' arg so compiler
can't know that we'll never access hooks_arp[].
(NFPROTO_IPV4 or NFPROTO_IPV6 are only possible cases).
Avoid this by adding an explicit WARN_ON_ONCE() check.
This patch has no effect if the family is a compile-time constant as gcc
will remove the switch() construct entirely.
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The nft_set_gc_batch_check() checks whether gc buffer is full.
If gc buffer is full, gc buffer is released by
the nft_set_gc_batch_complete() internally.
In case of rbtree, the rb_erase() should be called before calling the
nft_set_gc_batch_complete(). therefore the rb_erase() should
be called before calling the nft_set_gc_batch_check() too.
test commands:
table ip filter {
set set1 {
type ipv4_addr; flags interval, timeout;
gc-interval 10s;
timeout 1s;
elements = {
1-2,
3-4,
5-6,
...
10000-10001,
}
}
}
%nft -f test.nft
splat looks like:
[ 430.273885] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 430.282158] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 430.283116] CPU: 1 PID: 190 Comm: kworker/1:2 Tainted: G B 4.18.0+ #7
[ 430.283116] Workqueue: events_power_efficient nft_rbtree_gc [nf_tables_set]
[ 430.313559] RIP: 0010:rb_next+0x81/0x130
[ 430.313559] Code: 08 49 bd 00 00 00 00 00 fc ff df 48 bb 00 00 00 00 00 fc ff df 48 85 c0 75 05 eb 58 48 89 d4
[ 430.313559] RSP: 0018:ffff88010cdb7680 EFLAGS: 00010207
[ 430.313559] RAX: 0000000000b84854 RBX: dffffc0000000000 RCX: ffffffff83f01973
[ 430.313559] RDX: 000000000017090c RSI: 0000000000000008 RDI: 0000000000b84864
[ 430.313559] RBP: ffff8801060d4588 R08: fffffbfff09bc349 R09: fffffbfff09bc349
[ 430.313559] R10: 0000000000000001 R11: fffffbfff09bc348 R12: ffff880100f081a8
[ 430.313559] R13: dffffc0000000000 R14: ffff880100ff8688 R15: dffffc0000000000
[ 430.313559] FS: 0000000000000000(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
[ 430.313559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 430.313559] CR2: 0000000001551008 CR3: 000000005dc16000 CR4: 00000000001006e0
[ 430.313559] Call Trace:
[ 430.313559] nft_rbtree_gc+0x112/0x5c0 [nf_tables_set]
[ 430.313559] process_one_work+0xc13/0x1ec0
[ 430.313559] ? _raw_spin_unlock_irq+0x29/0x40
[ 430.313559] ? pwq_dec_nr_in_flight+0x3c0/0x3c0
[ 430.313559] ? set_load_weight+0x270/0x270
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __schedule+0x6d3/0x1f50
[ 430.313559] ? find_held_lock+0x39/0x1c0
[ 430.313559] ? __sched_text_start+0x8/0x8
[ 430.313559] ? cyc2ns_read_end+0x10/0x10
[ 430.313559] ? save_trace+0x300/0x300
[ 430.313559] ? sched_clock_local+0xd4/0x140
[ 430.313559] ? find_held_lock+0x39/0x1c0
[ 430.313559] ? worker_thread+0x353/0x1120
[ 430.313559] ? worker_thread+0x353/0x1120
[ 430.313559] ? lock_contended+0xe70/0xe70
[ 430.313559] ? __lock_acquire+0x4500/0x4500
[ 430.535635] ? do_raw_spin_unlock+0xa5/0x330
[ 430.535635] ? do_raw_spin_trylock+0x101/0x1a0
[ 430.535635] ? do_raw_spin_lock+0x1f0/0x1f0
[ 430.535635] ? _raw_spin_lock_irq+0x10/0x70
[ 430.535635] worker_thread+0x15d/0x1120
[ ... ]
Fixes: 8d8540c4f5e0 ("netfilter: nft_set_rbtree: add timeout support")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Fix error distribution by immediately delivering the errors to all the
affected calls rather than deferring them to a worker thread. The problem
with the latter is that retries and things can happen in the meantime when we
want to stop that sooner.
To this end:
(1) Stop the error distributor from removing calls from the error_targets
list so that peer->lock isn't needed to synchronise against other adds
and removals.
(2) Require the peer's error_targets list to be accessed with RCU, thereby
avoiding the need to take peer->lock over distribution.
(3) Don't attempt to affect a call's state if it is already marked complete.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
It seems that enabling IPV6_RECVERR on an IPv6 socket doesn't also turn on
IP_RECVERR, so neither local errors nor ICMP-transported remote errors from
IPv4 peer addresses are returned to the AF_RXRPC protocol.
Make the sockopt setting code in rxrpc_open_socket() fall through from the
AF_INET6 case to the AF_INET case to turn on all the AF_INET options too in
the AF_INET6 case.
Fixes: f2aeed3a591f ("rxrpc: Fix error reception on AF_INET6 sockets")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Make the following changes to improve the robustness of the code that sets
up a new service call:
(1) Cache the rxrpc_sock struct obtained in rxrpc_data_ready() to do a
service ID check and pass that along to rxrpc_new_incoming_call().
This means that I can remove the check from rxrpc_new_incoming_call()
without the need to worry about the socket attached to the local
endpoint getting replaced - which would invalidate the check.
(2) Cache the rxrpc_peer struct, thereby allowing the peer search to be
done once. The peer is passed to rxrpc_new_incoming_call(), thereby
saving the need to repeat the search.
This also reduces the possibility of rxrpc_publish_service_conn()
BUG()'ing due to the detection of a duplicate connection, despite the
initial search done by rxrpc_find_connection_rcu() having turned up
nothing.
This BUG() shouldn't ever get hit since rxrpc_data_ready() *should* be
non-reentrant and the result of the initial search should still hold
true, but it has proven possible to hit.
I *think* this may be due to __rxrpc_lookup_peer_rcu() cutting short
the iteration over the hash table if it finds a matching peer with a
zero usage count, but I don't know for sure since it's only ever been
hit once that I know of.
Another possibility is that a bug in rxrpc_data_ready() that checked
the wrong byte in the header for the RXRPC_CLIENT_INITIATED flag
might've let through a packet that caused a spurious and invalid call
to be set up. That is addressed in another patch.
(3) Fix __rxrpc_lookup_peer_rcu() to skip peer records that have a zero
usage count rather than stopping and returning not found, just in case
there's another peer record behind it in the bucket.
(4) Don't search the peer records in rxrpc_alloc_incoming_call(), but
rather either use the peer cached in (2) or, if one wasn't found,
preemptively install a new one.
Fixes: 8496af50eb38 ("rxrpc: Use RCU to access a peer's service connection tree")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Do more up-front checking on incoming packets to weed out invalid ones and
also ones aimed at services that we don't support.
Whilst we're at it, replace the clearing of call and skew if we don't find
a connection with just initialising the variables to zero at the top of the
function.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
In the input path, a received sk_buff can be marked for rejection by
setting RXRPC_SKB_MARK_* in skb->mark and, if needed, some auxiliary data
(such as an abort code) in skb->priority. The rejection is handled by
queueing the sk_buff up for dealing with in process context. The output
code reads the mark and priority and, theoretically, generates an
appropriate response packet.
However, if RXRPC_SKB_MARK_BUSY is set, this isn't noticed and an ABORT
message with a random abort code is generated (since skb->priority wasn't
set to anything).
Fix this by outputting the appropriate sort of packet.
Also, whilst we're at it, most of the marks are no longer used, so remove
them and rename the remaining two to something more obvious.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix RTT information gathering in AF_RXRPC by the following means:
(1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS.
(2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready()
collects it, set it at that point.
(3) Allow ACKs to be requested on the last packet of a client call, but
not a service call. We need to be careful lest we undo:
bf7d620abf22c321208a4da4f435e7af52551a21
Author: David Howells <dhowells@redhat.com>
Date: Thu Oct 6 08:11:51 2016 +0100
rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase
but that only really applies to service calls that we're handling,
since the client side gets to send the final ACK (or not).
(4) When about to transmit an ACK or DATA packet, record the Tx timestamp
before only; don't update the timestamp afterwards.
(5) Switch the ordering between recording the serial and recording the
timestamp to always set the serial number first. The serial number
shouldn't be seen referenced by an ACK packet until we've transmitted
the packet bearing it - so in the Rx path, we don't need the timestamp
until we've checked the serial number.
Fixes: cf1a6474f807 ("rxrpc: Add per-peer RTT tracker")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
There's a check in rxrpc_data_ready() that's checking the CLIENT_INITIATED
flag in the packet type field rather than in the packet flags field.
Fix this by creating a pair of helper functions to check whether the packet
is going to the client or to the server and use them generally.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix kernel NULL pointer dereference,
Call Trace:
[<ffffffff9b7658e6>] __mutex_lock_slowpath+0xa6/0x1d0
[<ffffffff9b764cef>] mutex_lock+0x1f/0x2f
[<ffffffffc061b5e1>] qedi_get_protocol_tlv_data+0x61/0x450 [qedi]
[<ffffffff9b1f9d8e>] ? map_vm_area+0x2e/0x40
[<ffffffff9b1fc370>] ? __vmalloc_node_range+0x170/0x280
[<ffffffffc0b81c3d>] ? qed_mfw_process_tlv_req+0x27d/0xbd0 [qed]
[<ffffffffc0b6461b>] qed_mfw_fill_tlv_data+0x4b/0xb0 [qed]
[<ffffffffc0b81c59>] qed_mfw_process_tlv_req+0x299/0xbd0 [qed]
[<ffffffff9b02a59e>] ? __switch_to+0xce/0x580
[<ffffffffc0b61e5b>] qed_slowpath_task+0x5b/0x80 [qed]
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit b2d35fa5fc80 ("selftests: add headers_install to lib.mk")
introduced a requirement that Makefiles more than one level below the
selftests directory need to define top_srcdir, but it didn't update
any of the powerpc Makefiles.
This broke building all the powerpc selftests with eg:
make[1]: Entering directory '/src/linux/tools/testing/selftests/powerpc'
BUILD_TARGET=/src/linux/tools/testing/selftests/powerpc/alignment; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C alignment all
make[2]: Entering directory '/src/linux/tools/testing/selftests/powerpc/alignment'
../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory
make[2]: *** No rule to make target '../../../../scripts/subarch.include'.
make[2]: Failed to remake makefile '../../../../scripts/subarch.include'.
Makefile:38: recipe for target 'alignment' failed
Fix it by setting top_srcdir in the affected Makefiles.
Fixes: b2d35fa5fc80 ("selftests: add headers_install to lib.mk")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The following KASAN warning was printed when booting a 64-bit kernel
on some systems with Intel CPUs:
[ 44.512826] ==================================================================
[ 44.520165] BUG: KASAN: stack-out-of-bounds in find_first_bit+0xb0/0xc0
[ 44.526786] Read of size 8 at addr ffff88041e02fc50 by task kworker/0:2/124
[ 44.535253] CPU: 0 PID: 124 Comm: kworker/0:2 Tainted: G X --------- --- 4.18.0-12.el8.x86_64+debug #1
[ 44.545858] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS BKVDTRL1.86B.0005.D08.1712070559 12/07/2017
[ 44.555682] Workqueue: events work_for_cpu_fn
[ 44.560043] Call Trace:
[ 44.562502] dump_stack+0x9a/0xe9
[ 44.565832] print_address_description+0x65/0x22e
[ 44.570683] ? find_first_bit+0xb0/0xc0
[ 44.570689] kasan_report.cold.6+0x92/0x19f
[ 44.578726] find_first_bit+0xb0/0xc0
[ 44.578737] adf_probe+0x9eb/0x19a0 [qat_c62x]
[ 44.578751] ? adf_remove+0x110/0x110 [qat_c62x]
[ 44.591490] ? mark_held_locks+0xc8/0x140
[ 44.591498] ? _raw_spin_unlock+0x30/0x30
[ 44.591505] ? trace_hardirqs_on_caller+0x381/0x570
[ 44.604418] ? adf_remove+0x110/0x110 [qat_c62x]
[ 44.604427] local_pci_probe+0xd4/0x180
[ 44.604432] ? pci_device_shutdown+0x110/0x110
[ 44.617386] work_for_cpu_fn+0x51/0xa0
[ 44.621145] process_one_work+0x8fe/0x16e0
[ 44.625263] ? pwq_dec_nr_in_flight+0x2d0/0x2d0
[ 44.629799] ? lock_acquire+0x14c/0x400
[ 44.633645] ? move_linked_works+0x12e/0x2a0
[ 44.637928] worker_thread+0x536/0xb50
[ 44.641690] ? __kthread_parkme+0xb6/0x180
[ 44.645796] ? process_one_work+0x16e0/0x16e0
[ 44.650160] kthread+0x30c/0x3d0
[ 44.653400] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 44.658457] ret_from_fork+0x3a/0x50
[ 44.663557] The buggy address belongs to the page:
[ 44.668350] page:ffffea0010780bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 44.676356] flags: 0x17ffffc0000000()
[ 44.680023] raw: 0017ffffc0000000 ffffea0010780bc8 ffffea0010780bc8 0000000000000000
[ 44.687769] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 44.695510] page dumped because: kasan: bad access detected
[ 44.702578] Memory state around the buggy address:
[ 44.707372] ffff88041e02fb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.714593] ffff88041e02fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.721810] >ffff88041e02fc00: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 f2 f2
[ 44.729028] ^
[ 44.734864] ffff88041e02fc80: f2 f2 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00
[ 44.742082] ffff88041e02fd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.749299] ==================================================================
Looking into the code:
int ret, bar_mask;
:
for_each_set_bit(bar_nr, (const unsigned long *)&bar_mask,
It is casting a 32-bit integer pointer to a 64-bit unsigned long
pointer. There are two problems here. First, the 32-bit pointer address
may not be 64-bit aligned. Secondly, it is accessing an extra 4 bytes.
This is fixed by changing the bar_mask type to unsigned long.
Cc: <stable@vger.kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When compiling with CONFIG_DEBUG_ATOMIC_SLEEP=y the mxs-dcp driver
prints warnings such as:
WARNING: CPU: 0 PID: 120 at kernel/sched/core.c:7736 __might_sleep+0x98/0x9c
do not call blocking ops when !TASK_RUNNING; state=1 set at [<8081978c>] dcp_chan_thread_sha+0x3c/0x2ec
The problem is that blocking ops will manipulate current->state
themselves so it is not allowed to call them between
set_current_state(TASK_INTERRUPTIBLE) and schedule().
Fix this by converting the per-chan mutex to a spinlock (it only
protects tiny list ops anyway) and rearranging the wait logic so that
callbacks are called current->state as TASK_RUNNING. Those callbacks
will indeed call blocking ops themselves so this is required.
Cc: <stable@vger.kernel.org>
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Update PCI Id in "cpl_rx_phys_dsgl" header. In case pci_chan_id and
tx_chan_id are not derived from same queue, H/W can send request
completion indication before completing DMA Transfer.
Herbert, It would be good if fix can be merge to stable tree.
For 4.14 kernel, It requires some update to avoid mege conficts.
Cc: <stable@vger.kernel.org>
Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
into drm-fixes
Just a few fixes for 4.19:
- Couple of suspend/resume fixes
- Fix EDID emulation with DC
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180927155418.2813-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- Revert adding device-link to panels
- Don't leak fences in drm/syncobj
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20180927152712.GA53076@art_vandelay
|
|
On 38+ Intel-based ASUS products, the NVIDIA GPU becomes unusable after S3
suspend/resume. The affected products include multiple generations of
NVIDIA GPUs and Intel SoCs. After resume, nouveau logs many errors such
as:
fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04
[HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
DRM: failed to idle channel 0 [DRM]
Similarly, the NVIDIA proprietary driver also fails after resume (black
screen, 100% CPU usage in Xorg process). We shipped a sample to NVIDIA for
diagnosis, and their response indicated that it's a problem with the parent
PCI bridge (on the Intel SoC), not the GPU.
Runtime suspend/resume works fine, only S3 suspend is affected.
We found a workaround: on resume, rewrite the Intel PCI bridge
'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In the
cases that I checked, this register has value 0 and we just have to rewrite
that value.
Linux already saves and restores PCI config space during suspend/resume,
but this register was being skipped because upon resume, it already has
value 0 (the correct, pre-suspend value).
Intel appear to have previously acknowledged this behaviour and the
requirement to rewrite this register:
https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23
Based on that, rewrite the prefetch register values even when that appears
unnecessary.
We have confirmed this solution on all the affected models we have in-hands
(X542UQ, UX533FD, X530UN, V272UN).
Additionally, this solves an issue where r8169 MSI-X interrupts were broken
after S3 suspend/resume on ASUS X441UAR. This issue was recently worked
around in commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e"). It
also fixes the same issue on RTL6186evl/8111evl on an Aimfor-tech laptop
that we had not yet patched. I suspect it will also fix the issue that was
worked around in commit 7c53a722459c ("r8169: don't use MSI-X on
RTL8168g").
Thomas Martitz reports that this change also solves an issue where the AMD
Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive after S3
suspend/resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-By: Peter Wu <peter@lekensteyn.nl>
CC: stable@vger.kernel.org
|
|
Add a couple of new APIs to check the probing status of qman and bman:
'int bman_is_probed()' and 'int qman_is_probed()'.
They return the following values.
* 1 if qman/bman were probed correctly
* 0 if qman/bman were not yet probed
* -1 if probing of qman/bman failed
Drivers that use qman/bman driver services are required to use these
APIs before calling any functions exported by qman or bman drivers
or otherwise they will crash the kernel.
The APIs will be used in the following couple of qbman portal patches
and later in the series in the dpaa1 ethernet driver.
Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
|
|
Jason writes:
"Second RDMA rc pull request
- Fix a long standing race bug when destroying comp_event file descriptors
- srp, hfi1, bnxt_re: Various driver crashes from missing validation
and other cases
- Fixes for regressions in patches merged this window in the gid
cache, devx, ucma and uapi."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Set right entry state before releasing reference
IB/mlx5: Destroy the DEVX object upon error flow
IB/uverbs: Free uapi on destroy
RDMA/bnxt_re: Fix system crash during RDMA resource initialization
IB/hfi1: Fix destroy_qp hang after a link down
IB/hfi1: Fix context recovery when PBC has an UnsupportedVL
IB/hfi1: Invalid user input can result in crash
IB/hfi1: Fix SL array bounds check
RDMA/uverbs: Fix validity check for modify QP
IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop
ucma: fix a use-after-free in ucma_resolve_ip()
RDMA/uverbs: Atomically flush and mark closed the comp event queue
cxgb4: fix abort_req_rss6 struct
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Jan writes:
"an ext2 patch fixing fsync(2) for DAX mounts."
* tag 'for_v4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2, dax: set ext2_dax_aops for dax files
|
|
trace_block_unplug() takes true for explicit unplugs and false for
implicit unplugs. schedule() unplugs are implicit and should be
reported as timer unplugs. While correct in the legacy code, this has
been inverted in blk-mq since 4.11.
Cc: stable@vger.kernel.org
Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fixes a crash when the report encounters an address that could not be
associated with an mmaped region:
#0 0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329
#1 unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329
#2 0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at util/unwind-libunwind-local.c:586
#3 get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703
#4 0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725
#5 0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351
#6 thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378
#7 0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750,
max_stack=<optimized out>) at util/callchain.c:1085
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Tested-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 2a9d5050dc84 ("perf script: Show correct offsets for DWARF-based unwinding")
Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On x86-64, the parametrized selftest code for rseq crashes with a
segmentation fault when compiled with -fpie. This happens when the
param_test binary is loaded at an address beyond 32-bit on x86-64.
The issue is caused by use of a 32-bit register to hold the address
of the loop counter variable.
Fix this by using a 64-bit register to calculate the address of the
loop counter variables as an offset from rip.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v4.18
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Chris Lameter <cl@linux.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Maurer <bmaurer@fb.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
|
|
.max_tfd_queue_size was ommited for 1000 card serries leading to oops in
swiotlb.
Fixes: 7b3e42ea2ead ("iwlwifi: support multiple tfd queue max sizes for different devices")
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
When dax_lock_mapping_entry() has to sleep to obtain entry lock, it will
fail to unlock mapping->i_pages spinlock and thus immediately deadlock
against itself when retrying to grab the entry lock again. Fix the
problem by unlocking mapping->i_pages before retrying.
Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
Reported-by: Barret Rhoden <brho@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Commit
1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active")
can occasionally cause system resets when kexec-ing a second kernel even
if SEV is not active.
That's because get_sev_encryption_bit() uses 32-bit rIP-relative
addressing to read the value of enc_bit - a variable which caches a
previously detected encryption bit position - but kexec may allocate
the early boot code to a higher location, beyond the 32-bit addressing
limit.
In this case, garbage will be read and get_sev_encryption_bit() will
return the wrong value, leading to accessing memory with the wrong
encryption setting.
Therefore, remove enc_bit, and thus get rid of the need to do 32-bit
rIP-relative addressing in the first place.
[ bp: massage commit message heavily. ]
Fixes: 1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active")
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: brijesh.singh@amd.com
Cc: kexec@lists.infradead.org
Cc: dyoung@redhat.com
Cc: bhe@redhat.com
Cc: ghook@redhat.com
Link: https://lkml.kernel.org/r/20180927123845.32052-1-kasong@redhat.com
|
|
After write SSD completed, bcache schedules journal_write work to
system_wq, which is a public workqueue in system, without WQ_MEM_RECLAIM
flag. system_wq is also a bound wq, and there may be no idle kworker on
current processor. Creating a new kworker may unfortunately need to
reclaim memory first, by shrinking cache and slab used by vfs, which
depends on bcache device. That's a deadlock.
This patch create a new workqueue for journal_write with WQ_MEM_RECLAIM
flag. It's rescuer thread will work to avoid the deadlock.
Signed-off-by: Guoju Fang <fangguoju@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The combination of defined constants are used to present the
state of IRQ so the magic numbers has been replaced.
This is a simple coding style change which should have no impact on
runtime code execution.
Signed-off-by: Xue Liu <liuxuenetmail@gmail.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
[Why]
EDID emulation didn't work properly for linux, as we stop programming
if nothing is connected physically.
[How]
We get a flag from DRM when we want to do edid emulation. We check if
this flag is true and nothing is connected physically, if so we only
program the front end using VIRTUAL_SIGNAL.
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
There have been a few reports of Vega10 display remaining blank
after S3 resume. The regression is caused by workaround for mode
change on Vega10 - skip set_bandwidth if stream count is 0.
As a result we skipped dispclk reset on suspend, thus on resume
we may skip the clock update assuming it hasn't been changed.
On some systems it causes display blank or 'out of range'.
[How]
Revert "drm/amd/display: Fix Vega10 black screen after mode change"
Verified that it hadn't cause mode change regression.
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Sun peng Li <Sunpeng.Li@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The vce cancel_delayed_work_sync never be called.
driver call the function in error path.
This caused the A+A suspend hang when runtime pm enebled.
As we will visit the smu in the idle queue. this will cause
smu hang because the dgpu has been suspend, and the dgpu also
will be waked up. As the smu has been hang, so the dgpu resume
will failed.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
This reverts commit 0c08754b59da5557532d946599854e6df28edc22.
commit 0c08754b59da
("drm/panel: Add device_link from panel device to DRM device")
creates a circular dependency under these circumstances:
1. The panel depends on dsi-host because it is MIPI-DSI child
device.
2. dsi-host depends on the drm parent device (connector->dev->dev)
this should be allowed.
3. drm parent dev (connector->dev->dev) depends on the panel
after this patch.
This makes the dependency circular and while it appears it
does not affect any in-tree drivers (they do not seem to have
dsi hosts depending on the same parent device) this does not
seem right.
As noted in a response from Andrzej Hajda, the intent is
likely to make the panel dependent on the DRM device
(connector->dev) not its parent. But we have no way of
doing that since the DRM device doesn't contain any
struct device on its own (arguably it should).
Revert this until a proper approach is figured out.
Cc: Jyri Sarha <jsarha@ti.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180927124130.9102-1-linus.walleij@linaro.org
|
|
https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent
Pull another fix from Daniel Lezcano, which felt through the cracks:
- Fix a potential memory leak reported by smatch in the atmel timer driver
|
|
If I attach a vfio-ccw device to my guest, I get the following warning
on the host when the host kernel is CONFIG_HARDENED_USERCOPY=y
[250757.595325] Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected to SLUB object 'dma-kmalloc-512' (offset 64, size 124)!
[250757.595365] WARNING: CPU: 2 PID: 10958 at mm/usercopy.c:81 usercopy_warn+0xac/0xd8
[250757.595369] Modules linked in: kvm vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c devlink tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables sunrpc dm_multipath s390_trng crc32_vx_s390 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha1_s390 eadm_sch tape_3590 tape tape_class qeth_l2 qeth ccwgroup vfio_ccw vfio_mdev zcrypt_cex4 mdev vfio_iommu_type1 zcrypt vfio sha256_s390 sha_common zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod
[250757.595424] CPU: 2 PID: 10958 Comm: CPU 2/KVM Not tainted 4.18.0-derp #2
[250757.595426] Hardware name: IBM 3906 M05 780 (LPAR)
...snip regs...
[250757.595523] Call Trace:
[250757.595529] ([<0000000000349210>] usercopy_warn+0xa8/0xd8)
[250757.595535] [<000000000032daaa>] __check_heap_object+0xfa/0x160
[250757.595540] [<0000000000349396>] __check_object_size+0x156/0x1d0
[250757.595547] [<000003ff80332d04>] vfio_ccw_mdev_write+0x74/0x148 [vfio_ccw]
[250757.595552] [<000000000034ed12>] __vfs_write+0x3a/0x188
[250757.595556] [<000000000034f040>] vfs_write+0xa8/0x1b8
[250757.595559] [<000000000034f4e6>] ksys_pwrite64+0x86/0xc0
[250757.595568] [<00000000008959a0>] system_call+0xdc/0x2b0
[250757.595570] Last Breaking-Event-Address:
[250757.595573] [<0000000000349210>] usercopy_warn+0xa8/0xd8
While vfio_ccw_mdev_{write|read} validates that the input position/count
does not run over the ccw_io_region struct, the usercopy code that does
copy_{to|from}_user doesn't necessarily know this. It sees the variable
length and gets worried that it's affecting a normal kmalloc'd struct,
and generates the above warning.
Adjust how the ccw_io_region is alloc'd with a whitelist to remove this
warning. The boundary checking will continue to do its thing.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20180921204013.95804-3-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
In the event that we want to change the layout of the ccw_io_region in the
future[1], it might be easier to work with it as a pointer within the
vfio_ccw_private struct rather than an embedded struct.
[1] https://patchwork.kernel.org/comment/22228541/
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20180921204013.95804-2-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
added support for purging persistent grants when they are not in use. As
part of the purge, the grants were removed from the grant buffer, This
eventually causes the buffer to become empty, with BUG_ON triggered in
get_free_grant(). This can be observed even on an idle system, within
20-30 minutes.
We should keep the grants in the buffer when purging, and only free the
grant ref.
Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
rxrpc_find_connection_rcu() initialises variable k twice with the same
information. Remove one of the initialisations.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
debugfs_remove has taken the IS_ERR into account. Just
remove the unnecessary condition.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
The smatch utility reports a possible leak:
smatch warnings:
drivers/clocksource/timer-atmel-pit.c:183 at91sam926x_pit_dt_init() warn: possible memory leak of 'data'
Ensure data is freed before exiting with an error.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
Use array_index_nospec() to sanitize i with respect to speculation.
Note that the user doesn't control i directly, but can make it out
of bounds by not finding a threshold in the array.
Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
[add note about user control, as explained by Masashi]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
(fix documentation and sysctl access to treat it as such)
Tested:
# zcat /proc/config.gz | egrep ^CONFIG_HZ
CONFIG_HZ_1000=y
CONFIG_HZ=1000
# echo $[(1<<32)/1000 + 1] | tee /proc/sys/net/ipv4/tcp_probe_interval
4294968
tee: /proc/sys/net/ipv4/tcp_probe_interval: Invalid argument
# echo $[(1<<32)/1000] | tee /proc/sys/net/ipv4/tcp_probe_interval
4294967
# echo 0 | tee /proc/sys/net/ipv4/tcp_probe_interval
# echo -1 | tee /proc/sys/net/ipv4/tcp_probe_interval
-1
tee: /proc/sys/net/ipv4/tcp_probe_interval: Invalid argument
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|