Age | Commit message (Collapse) | Author |
|
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ena uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Netanel Belgazal <netanel@amazon.com>
Cc: Saeed Bishara <saeedb@amazon.com>
Cc: Zorik Machulsky <zorik@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
netxen uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Cc: Rahul Verma <rahul.verma@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
qlcnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
virto_net uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
hns uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ehea uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
hinic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Note that hinic_netpoll() was incorrectly scheduling NAPI
on both RX and TX queues.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since we do no longer require NAPI drivers to provide
an ndo_poll_controller(), napi_schedule() has not been done
before poll_one_napi() invocation.
So testing NAPI_STATE_SCHED is likely to cause early returns.
While we are at it, remove outdated comment.
Note to future bisections : This change might surface prior
bugs in drivers. See commit 73f21c653f93 ("bnxt_en: Fix TX
timeout during netpoll.") for one occurrence.
Fixes: ac3d9dd034e5 ("netpoll: make ndo_poll_controller() optional")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
More patches than I'd like perhaps, but each seems reasonable:
* two new spectre-v1 mitigations in nl80211
* TX status fix in general, and mesh in particular
* powersave vs. offchannel fix
* regulatory initialization fix
* fix for a queue hang due to a bad return value
* allocate TXQs for active monitor interfaces, fixing my
earlier patch to avoid unnecessary allocations where I
missed this case needed them
* fix TDLS data frames priority assignment
* fix scan results processing to take into account duplicate
channel numbers (over different operating classes, but we
don't necessarily know the operating class)
* various hwsim fixes for radio destruction and new radio
announcement messages
* remove an extraneous kernel-doc line
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The structure shared between driver and the management FW (mfw) differ in
sizes. This would lead to issues when driver try to access the structure
members which are not-aligned with the mfw copy e.g., data_ptr usage in the
case of mfw_tlv request.
Align the driver structure with mfw copy, add reserved field(s) to driver
structure for the members not used by the driver.
Fixes: dd006921d67f ("qed: Add MFW interfaces for TLV request support.)
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
|
|
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ameen Rahman <Ameen.Rahman@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I haven't been doing reviews only but not active development on bridge
code for several years. Roopa and Nikolay have been doing most of
the new features and have agreed to take over as new co-maintainers.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
|
|
Julian Wiedmann says:
====================
s390/qeth: fixes 2019-09-26
please apply two qeth patches for -net. The first is a trivial cleanup
required for patch #2 by Jean, which fixes a potential endless loop.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Functions qeth_get_ipa_msg and qeth_get_ipa_cmd_name are modifying
the last member of global arrays without any locking that I can see.
If two instances of either function are running at the same time,
it could cause a race ultimately leading to an array overrun (the
contents of the last entry of the array is the only guarantee that
the loop will ever stop).
Performing the lookups without modifying the arrays is admittedly
slower (two comparisons per iteration instead of one) but these
are operations which are rare (should only be needed in error
cases or when debugging, not during successful operation) and it
seems still less costly than introducing a mutex to protect the
arrays in question.
As a side bonus, it allows us to declare both arrays as const data.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Julian Wiedmann <jwi@linux.ibm.com>
Cc: Ursula Braun <ubraun@linux.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the common code ARRAY_SIZE macro instead of a private implementation.
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Dave writes:
"drm fixes for 4.19-rc6
Looks like a pretty normal week for graphics,
core: syncobj fix, panel link regression revert
amd: suspend/resume fixes, EDID emulation fix
mali-dp: NV12 writeback and vblank reset fixes
etnaviv: DMA setup fix"
* tag 'drm-fixes-2018-09-28' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Fix Edid emulation for linux
drm/amd/display: Fix Vega10 lightup on S3 resume
drm/amdgpu: Fix vce work queue was not cancelled when suspend
Revert "drm/panel: Add device_link from panel device to DRM device"
drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set
drm/malidp: Fix writeback in NV12
drm: mali-dp: Call drm_crtc_vblank_reset on device init
drm/etnaviv: add DMA configuration for etnaviv platform device
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Palmer writes:
"A Single RISC-V Update for 4.19-rc6
The Debian guys have been pushing on our port and found some
unversioned symbols leaking into modules. This PR contains a single
fix for that issue."
* tag 'riscv-for-linus-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
RISC-V: include linux/ftrace.h in asm-prototypes.h
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Bjorn writes:
"PCI fixes:
- Fix ACPI hotplug issue that causes black screen crash at boot (Mika
Westerberg)
- Fix DesignWare "scheduling while atomic" issues (Jisheng Zhang)
- Add PPC contacts to MAINTAINERS for PCI core error handling (Bjorn
Helgaas)
- Sort Mobiveil MAINTAINERS entry (Lorenzo Pieralisi)"
* tag 'pci-v4.19-fixes-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge
PCI: dwc: Fix scheduling while atomic issues
MAINTAINERS: Move mobiveil PCI driver entry where it belongs
MAINTAINERS: Update PPC contacts for PCI core error handling
|
|
The debounce value passed to mmc_gpiod_request_cd() function is in
microseconds, but msecs_to_jiffies() requires the value to be in
miliseconds to properly calculate the delay, so adjust the value stored
in cd_debounce_delay_ms context entry.
Fixes: 1d71926bbd59 ("mmc: core: Fix debounce time to use microseconds")
Fixes: bfd694d5e21c ("mmc: core: Add tunable delay before detecting card
after card is inserted")
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Pull NVMe fix from Christoph.
* 'nvme-4.19' of git://git.infradead.org/nvme:
nvme: properly propagate errors in nvme_mpath_init
|
|
Commit a46b53672b2c2e3770b38a4abf90d16364d2584b ("xen/blkfront: cleanup
stale persistent grants") introduced a regression as purged persistent
grants were not pu into the list of free grants again. Correct that.
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fix didn't work for all cases, reverting to add a (hopefully)
better fix.
This reverts commit f151ba989d149bbdfc90e5405724bbea094f9b17.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
cgroup_storage_update_elem() shouldn't accept any flags
argument values except BPF_ANY and BPF_EXIST to guarantee
the backward compatibility, had a new flag value been added.
Fixes: de9cbbaadba5 ("bpf: introduce cgroup storage maps")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Only check for the network namespace if the socket is available.
Fixes: f564650106a6 ("netfilter: check if the socket netns is correct.")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Unfortunately some versions of gcc emit following warning:
$ make net/xfrm/xfrm_output.o
linux/compiler.h:252:20: warning: array subscript is above array bounds [-Warray-bounds]
hook_head = rcu_dereference(net->nf.hooks_arp[hook]);
^~~~~~~~~~~~~~~~~~~~~
xfrm_output_resume passes skb_dst(skb)->ops->family as its 'pf' arg so compiler
can't know that we'll never access hooks_arp[].
(NFPROTO_IPV4 or NFPROTO_IPV6 are only possible cases).
Avoid this by adding an explicit WARN_ON_ONCE() check.
This patch has no effect if the family is a compile-time constant as gcc
will remove the switch() construct entirely.
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The nft_set_gc_batch_check() checks whether gc buffer is full.
If gc buffer is full, gc buffer is released by
the nft_set_gc_batch_complete() internally.
In case of rbtree, the rb_erase() should be called before calling the
nft_set_gc_batch_complete(). therefore the rb_erase() should
be called before calling the nft_set_gc_batch_check() too.
test commands:
table ip filter {
set set1 {
type ipv4_addr; flags interval, timeout;
gc-interval 10s;
timeout 1s;
elements = {
1-2,
3-4,
5-6,
...
10000-10001,
}
}
}
%nft -f test.nft
splat looks like:
[ 430.273885] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 430.282158] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 430.283116] CPU: 1 PID: 190 Comm: kworker/1:2 Tainted: G B 4.18.0+ #7
[ 430.283116] Workqueue: events_power_efficient nft_rbtree_gc [nf_tables_set]
[ 430.313559] RIP: 0010:rb_next+0x81/0x130
[ 430.313559] Code: 08 49 bd 00 00 00 00 00 fc ff df 48 bb 00 00 00 00 00 fc ff df 48 85 c0 75 05 eb 58 48 89 d4
[ 430.313559] RSP: 0018:ffff88010cdb7680 EFLAGS: 00010207
[ 430.313559] RAX: 0000000000b84854 RBX: dffffc0000000000 RCX: ffffffff83f01973
[ 430.313559] RDX: 000000000017090c RSI: 0000000000000008 RDI: 0000000000b84864
[ 430.313559] RBP: ffff8801060d4588 R08: fffffbfff09bc349 R09: fffffbfff09bc349
[ 430.313559] R10: 0000000000000001 R11: fffffbfff09bc348 R12: ffff880100f081a8
[ 430.313559] R13: dffffc0000000000 R14: ffff880100ff8688 R15: dffffc0000000000
[ 430.313559] FS: 0000000000000000(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
[ 430.313559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 430.313559] CR2: 0000000001551008 CR3: 000000005dc16000 CR4: 00000000001006e0
[ 430.313559] Call Trace:
[ 430.313559] nft_rbtree_gc+0x112/0x5c0 [nf_tables_set]
[ 430.313559] process_one_work+0xc13/0x1ec0
[ 430.313559] ? _raw_spin_unlock_irq+0x29/0x40
[ 430.313559] ? pwq_dec_nr_in_flight+0x3c0/0x3c0
[ 430.313559] ? set_load_weight+0x270/0x270
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __schedule+0x6d3/0x1f50
[ 430.313559] ? find_held_lock+0x39/0x1c0
[ 430.313559] ? __sched_text_start+0x8/0x8
[ 430.313559] ? cyc2ns_read_end+0x10/0x10
[ 430.313559] ? save_trace+0x300/0x300
[ 430.313559] ? sched_clock_local+0xd4/0x140
[ 430.313559] ? find_held_lock+0x39/0x1c0
[ 430.313559] ? worker_thread+0x353/0x1120
[ 430.313559] ? worker_thread+0x353/0x1120
[ 430.313559] ? lock_contended+0xe70/0xe70
[ 430.313559] ? __lock_acquire+0x4500/0x4500
[ 430.535635] ? do_raw_spin_unlock+0xa5/0x330
[ 430.535635] ? do_raw_spin_trylock+0x101/0x1a0
[ 430.535635] ? do_raw_spin_lock+0x1f0/0x1f0
[ 430.535635] ? _raw_spin_lock_irq+0x10/0x70
[ 430.535635] worker_thread+0x15d/0x1120
[ ... ]
Fixes: 8d8540c4f5e0 ("netfilter: nft_set_rbtree: add timeout support")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Fix error distribution by immediately delivering the errors to all the
affected calls rather than deferring them to a worker thread. The problem
with the latter is that retries and things can happen in the meantime when we
want to stop that sooner.
To this end:
(1) Stop the error distributor from removing calls from the error_targets
list so that peer->lock isn't needed to synchronise against other adds
and removals.
(2) Require the peer's error_targets list to be accessed with RCU, thereby
avoiding the need to take peer->lock over distribution.
(3) Don't attempt to affect a call's state if it is already marked complete.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
It seems that enabling IPV6_RECVERR on an IPv6 socket doesn't also turn on
IP_RECVERR, so neither local errors nor ICMP-transported remote errors from
IPv4 peer addresses are returned to the AF_RXRPC protocol.
Make the sockopt setting code in rxrpc_open_socket() fall through from the
AF_INET6 case to the AF_INET case to turn on all the AF_INET options too in
the AF_INET6 case.
Fixes: f2aeed3a591f ("rxrpc: Fix error reception on AF_INET6 sockets")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Make the following changes to improve the robustness of the code that sets
up a new service call:
(1) Cache the rxrpc_sock struct obtained in rxrpc_data_ready() to do a
service ID check and pass that along to rxrpc_new_incoming_call().
This means that I can remove the check from rxrpc_new_incoming_call()
without the need to worry about the socket attached to the local
endpoint getting replaced - which would invalidate the check.
(2) Cache the rxrpc_peer struct, thereby allowing the peer search to be
done once. The peer is passed to rxrpc_new_incoming_call(), thereby
saving the need to repeat the search.
This also reduces the possibility of rxrpc_publish_service_conn()
BUG()'ing due to the detection of a duplicate connection, despite the
initial search done by rxrpc_find_connection_rcu() having turned up
nothing.
This BUG() shouldn't ever get hit since rxrpc_data_ready() *should* be
non-reentrant and the result of the initial search should still hold
true, but it has proven possible to hit.
I *think* this may be due to __rxrpc_lookup_peer_rcu() cutting short
the iteration over the hash table if it finds a matching peer with a
zero usage count, but I don't know for sure since it's only ever been
hit once that I know of.
Another possibility is that a bug in rxrpc_data_ready() that checked
the wrong byte in the header for the RXRPC_CLIENT_INITIATED flag
might've let through a packet that caused a spurious and invalid call
to be set up. That is addressed in another patch.
(3) Fix __rxrpc_lookup_peer_rcu() to skip peer records that have a zero
usage count rather than stopping and returning not found, just in case
there's another peer record behind it in the bucket.
(4) Don't search the peer records in rxrpc_alloc_incoming_call(), but
rather either use the peer cached in (2) or, if one wasn't found,
preemptively install a new one.
Fixes: 8496af50eb38 ("rxrpc: Use RCU to access a peer's service connection tree")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Do more up-front checking on incoming packets to weed out invalid ones and
also ones aimed at services that we don't support.
Whilst we're at it, replace the clearing of call and skew if we don't find
a connection with just initialising the variables to zero at the top of the
function.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
In the input path, a received sk_buff can be marked for rejection by
setting RXRPC_SKB_MARK_* in skb->mark and, if needed, some auxiliary data
(such as an abort code) in skb->priority. The rejection is handled by
queueing the sk_buff up for dealing with in process context. The output
code reads the mark and priority and, theoretically, generates an
appropriate response packet.
However, if RXRPC_SKB_MARK_BUSY is set, this isn't noticed and an ABORT
message with a random abort code is generated (since skb->priority wasn't
set to anything).
Fix this by outputting the appropriate sort of packet.
Also, whilst we're at it, most of the marks are no longer used, so remove
them and rename the remaining two to something more obvious.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix RTT information gathering in AF_RXRPC by the following means:
(1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS.
(2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready()
collects it, set it at that point.
(3) Allow ACKs to be requested on the last packet of a client call, but
not a service call. We need to be careful lest we undo:
bf7d620abf22c321208a4da4f435e7af52551a21
Author: David Howells <dhowells@redhat.com>
Date: Thu Oct 6 08:11:51 2016 +0100
rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase
but that only really applies to service calls that we're handling,
since the client side gets to send the final ACK (or not).
(4) When about to transmit an ACK or DATA packet, record the Tx timestamp
before only; don't update the timestamp afterwards.
(5) Switch the ordering between recording the serial and recording the
timestamp to always set the serial number first. The serial number
shouldn't be seen referenced by an ACK packet until we've transmitted
the packet bearing it - so in the Rx path, we don't need the timestamp
until we've checked the serial number.
Fixes: cf1a6474f807 ("rxrpc: Add per-peer RTT tracker")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
There's a check in rxrpc_data_ready() that's checking the CLIENT_INITIATED
flag in the packet type field rather than in the packet flags field.
Fix this by creating a pair of helper functions to check whether the packet
is going to the client or to the server and use them generally.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix kernel NULL pointer dereference,
Call Trace:
[<ffffffff9b7658e6>] __mutex_lock_slowpath+0xa6/0x1d0
[<ffffffff9b764cef>] mutex_lock+0x1f/0x2f
[<ffffffffc061b5e1>] qedi_get_protocol_tlv_data+0x61/0x450 [qedi]
[<ffffffff9b1f9d8e>] ? map_vm_area+0x2e/0x40
[<ffffffff9b1fc370>] ? __vmalloc_node_range+0x170/0x280
[<ffffffffc0b81c3d>] ? qed_mfw_process_tlv_req+0x27d/0xbd0 [qed]
[<ffffffffc0b6461b>] qed_mfw_fill_tlv_data+0x4b/0xb0 [qed]
[<ffffffffc0b81c59>] qed_mfw_process_tlv_req+0x299/0xbd0 [qed]
[<ffffffff9b02a59e>] ? __switch_to+0xce/0x580
[<ffffffffc0b61e5b>] qed_slowpath_task+0x5b/0x80 [qed]
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit b2d35fa5fc80 ("selftests: add headers_install to lib.mk")
introduced a requirement that Makefiles more than one level below the
selftests directory need to define top_srcdir, but it didn't update
any of the powerpc Makefiles.
This broke building all the powerpc selftests with eg:
make[1]: Entering directory '/src/linux/tools/testing/selftests/powerpc'
BUILD_TARGET=/src/linux/tools/testing/selftests/powerpc/alignment; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C alignment all
make[2]: Entering directory '/src/linux/tools/testing/selftests/powerpc/alignment'
../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory
make[2]: *** No rule to make target '../../../../scripts/subarch.include'.
make[2]: Failed to remake makefile '../../../../scripts/subarch.include'.
Makefile:38: recipe for target 'alignment' failed
Fix it by setting top_srcdir in the affected Makefiles.
Fixes: b2d35fa5fc80 ("selftests: add headers_install to lib.mk")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The following KASAN warning was printed when booting a 64-bit kernel
on some systems with Intel CPUs:
[ 44.512826] ==================================================================
[ 44.520165] BUG: KASAN: stack-out-of-bounds in find_first_bit+0xb0/0xc0
[ 44.526786] Read of size 8 at addr ffff88041e02fc50 by task kworker/0:2/124
[ 44.535253] CPU: 0 PID: 124 Comm: kworker/0:2 Tainted: G X --------- --- 4.18.0-12.el8.x86_64+debug #1
[ 44.545858] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS BKVDTRL1.86B.0005.D08.1712070559 12/07/2017
[ 44.555682] Workqueue: events work_for_cpu_fn
[ 44.560043] Call Trace:
[ 44.562502] dump_stack+0x9a/0xe9
[ 44.565832] print_address_description+0x65/0x22e
[ 44.570683] ? find_first_bit+0xb0/0xc0
[ 44.570689] kasan_report.cold.6+0x92/0x19f
[ 44.578726] find_first_bit+0xb0/0xc0
[ 44.578737] adf_probe+0x9eb/0x19a0 [qat_c62x]
[ 44.578751] ? adf_remove+0x110/0x110 [qat_c62x]
[ 44.591490] ? mark_held_locks+0xc8/0x140
[ 44.591498] ? _raw_spin_unlock+0x30/0x30
[ 44.591505] ? trace_hardirqs_on_caller+0x381/0x570
[ 44.604418] ? adf_remove+0x110/0x110 [qat_c62x]
[ 44.604427] local_pci_probe+0xd4/0x180
[ 44.604432] ? pci_device_shutdown+0x110/0x110
[ 44.617386] work_for_cpu_fn+0x51/0xa0
[ 44.621145] process_one_work+0x8fe/0x16e0
[ 44.625263] ? pwq_dec_nr_in_flight+0x2d0/0x2d0
[ 44.629799] ? lock_acquire+0x14c/0x400
[ 44.633645] ? move_linked_works+0x12e/0x2a0
[ 44.637928] worker_thread+0x536/0xb50
[ 44.641690] ? __kthread_parkme+0xb6/0x180
[ 44.645796] ? process_one_work+0x16e0/0x16e0
[ 44.650160] kthread+0x30c/0x3d0
[ 44.653400] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 44.658457] ret_from_fork+0x3a/0x50
[ 44.663557] The buggy address belongs to the page:
[ 44.668350] page:ffffea0010780bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 44.676356] flags: 0x17ffffc0000000()
[ 44.680023] raw: 0017ffffc0000000 ffffea0010780bc8 ffffea0010780bc8 0000000000000000
[ 44.687769] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 44.695510] page dumped because: kasan: bad access detected
[ 44.702578] Memory state around the buggy address:
[ 44.707372] ffff88041e02fb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.714593] ffff88041e02fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.721810] >ffff88041e02fc00: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 f2 f2
[ 44.729028] ^
[ 44.734864] ffff88041e02fc80: f2 f2 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00
[ 44.742082] ffff88041e02fd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.749299] ==================================================================
Looking into the code:
int ret, bar_mask;
:
for_each_set_bit(bar_nr, (const unsigned long *)&bar_mask,
It is casting a 32-bit integer pointer to a 64-bit unsigned long
pointer. There are two problems here. First, the 32-bit pointer address
may not be 64-bit aligned. Secondly, it is accessing an extra 4 bytes.
This is fixed by changing the bar_mask type to unsigned long.
Cc: <stable@vger.kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When compiling with CONFIG_DEBUG_ATOMIC_SLEEP=y the mxs-dcp driver
prints warnings such as:
WARNING: CPU: 0 PID: 120 at kernel/sched/core.c:7736 __might_sleep+0x98/0x9c
do not call blocking ops when !TASK_RUNNING; state=1 set at [<8081978c>] dcp_chan_thread_sha+0x3c/0x2ec
The problem is that blocking ops will manipulate current->state
themselves so it is not allowed to call them between
set_current_state(TASK_INTERRUPTIBLE) and schedule().
Fix this by converting the per-chan mutex to a spinlock (it only
protects tiny list ops anyway) and rearranging the wait logic so that
callbacks are called current->state as TASK_RUNNING. Those callbacks
will indeed call blocking ops themselves so this is required.
Cc: <stable@vger.kernel.org>
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Update PCI Id in "cpl_rx_phys_dsgl" header. In case pci_chan_id and
tx_chan_id are not derived from same queue, H/W can send request
completion indication before completing DMA Transfer.
Herbert, It would be good if fix can be merge to stable tree.
For 4.14 kernel, It requires some update to avoid mege conficts.
Cc: <stable@vger.kernel.org>
Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
into drm-fixes
Just a few fixes for 4.19:
- Couple of suspend/resume fixes
- Fix EDID emulation with DC
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180927155418.2813-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- Revert adding device-link to panels
- Don't leak fences in drm/syncobj
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20180927152712.GA53076@art_vandelay
|
|
On 38+ Intel-based ASUS products, the NVIDIA GPU becomes unusable after S3
suspend/resume. The affected products include multiple generations of
NVIDIA GPUs and Intel SoCs. After resume, nouveau logs many errors such
as:
fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04
[HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
DRM: failed to idle channel 0 [DRM]
Similarly, the NVIDIA proprietary driver also fails after resume (black
screen, 100% CPU usage in Xorg process). We shipped a sample to NVIDIA for
diagnosis, and their response indicated that it's a problem with the parent
PCI bridge (on the Intel SoC), not the GPU.
Runtime suspend/resume works fine, only S3 suspend is affected.
We found a workaround: on resume, rewrite the Intel PCI bridge
'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In the
cases that I checked, this register has value 0 and we just have to rewrite
that value.
Linux already saves and restores PCI config space during suspend/resume,
but this register was being skipped because upon resume, it already has
value 0 (the correct, pre-suspend value).
Intel appear to have previously acknowledged this behaviour and the
requirement to rewrite this register:
https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23
Based on that, rewrite the prefetch register values even when that appears
unnecessary.
We have confirmed this solution on all the affected models we have in-hands
(X542UQ, UX533FD, X530UN, V272UN).
Additionally, this solves an issue where r8169 MSI-X interrupts were broken
after S3 suspend/resume on ASUS X441UAR. This issue was recently worked
around in commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e"). It
also fixes the same issue on RTL6186evl/8111evl on an Aimfor-tech laptop
that we had not yet patched. I suspect it will also fix the issue that was
worked around in commit 7c53a722459c ("r8169: don't use MSI-X on
RTL8168g").
Thomas Martitz reports that this change also solves an issue where the AMD
Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive after S3
suspend/resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-By: Peter Wu <peter@lekensteyn.nl>
CC: stable@vger.kernel.org
|
|
Jason writes:
"Second RDMA rc pull request
- Fix a long standing race bug when destroying comp_event file descriptors
- srp, hfi1, bnxt_re: Various driver crashes from missing validation
and other cases
- Fixes for regressions in patches merged this window in the gid
cache, devx, ucma and uapi."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Set right entry state before releasing reference
IB/mlx5: Destroy the DEVX object upon error flow
IB/uverbs: Free uapi on destroy
RDMA/bnxt_re: Fix system crash during RDMA resource initialization
IB/hfi1: Fix destroy_qp hang after a link down
IB/hfi1: Fix context recovery when PBC has an UnsupportedVL
IB/hfi1: Invalid user input can result in crash
IB/hfi1: Fix SL array bounds check
RDMA/uverbs: Fix validity check for modify QP
IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop
ucma: fix a use-after-free in ucma_resolve_ip()
RDMA/uverbs: Atomically flush and mark closed the comp event queue
cxgb4: fix abort_req_rss6 struct
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Jan writes:
"an ext2 patch fixing fsync(2) for DAX mounts."
* tag 'for_v4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2, dax: set ext2_dax_aops for dax files
|
|
trace_block_unplug() takes true for explicit unplugs and false for
implicit unplugs. schedule() unplugs are implicit and should be
reported as timer unplugs. While correct in the legacy code, this has
been inverted in blk-mq since 4.11.
Cc: stable@vger.kernel.org
Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
On x86-64, the parametrized selftest code for rseq crashes with a
segmentation fault when compiled with -fpie. This happens when the
param_test binary is loaded at an address beyond 32-bit on x86-64.
The issue is caused by use of a 32-bit register to hold the address
of the loop counter variable.
Fix this by using a 64-bit register to calculate the address of the
loop counter variables as an offset from rip.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # v4.18
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Chris Lameter <cl@linux.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Maurer <bmaurer@fb.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
|
|
.max_tfd_queue_size was ommited for 1000 card serries leading to oops in
swiotlb.
Fixes: 7b3e42ea2ead ("iwlwifi: support multiple tfd queue max sizes for different devices")
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
When dax_lock_mapping_entry() has to sleep to obtain entry lock, it will
fail to unlock mapping->i_pages spinlock and thus immediately deadlock
against itself when retrying to grab the entry lock again. Fix the
problem by unlocking mapping->i_pages before retrying.
Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
Reported-by: Barret Rhoden <brho@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Commit
1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active")
can occasionally cause system resets when kexec-ing a second kernel even
if SEV is not active.
That's because get_sev_encryption_bit() uses 32-bit rIP-relative
addressing to read the value of enc_bit - a variable which caches a
previously detected encryption bit position - but kexec may allocate
the early boot code to a higher location, beyond the 32-bit addressing
limit.
In this case, garbage will be read and get_sev_encryption_bit() will
return the wrong value, leading to accessing memory with the wrong
encryption setting.
Therefore, remove enc_bit, and thus get rid of the need to do 32-bit
rIP-relative addressing in the first place.
[ bp: massage commit message heavily. ]
Fixes: 1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active")
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: brijesh.singh@amd.com
Cc: kexec@lists.infradead.org
Cc: dyoung@redhat.com
Cc: bhe@redhat.com
Cc: ghook@redhat.com
Link: https://lkml.kernel.org/r/20180927123845.32052-1-kasong@redhat.com
|
|
After write SSD completed, bcache schedules journal_write work to
system_wq, which is a public workqueue in system, without WQ_MEM_RECLAIM
flag. system_wq is also a bound wq, and there may be no idle kworker on
current processor. Creating a new kworker may unfortunately need to
reclaim memory first, by shrinking cache and slab used by vfs, which
depends on bcache device. That's a deadlock.
This patch create a new workqueue for journal_write with WQ_MEM_RECLAIM
flag. It's rescuer thread will work to avoid the deadlock.
Signed-off-by: Guoju Fang <fangguoju@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The combination of defined constants are used to present the
state of IRQ so the magic numbers has been replaced.
This is a simple coding style change which should have no impact on
runtime code execution.
Signed-off-by: Xue Liu <liuxuenetmail@gmail.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|