Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"A set of fixes for perf:
Kernel side:
- Fix the hardcoded index of extra PCI devices on Broadwell which
caused a resource conflict and triggered warnings on CPU hotplug.
Tooling:
- Update the tools copy of several files, including perf_event.h,
powerpc's asm/unistd.h (new io_pgetevents syscall), bpf.h and x86's
memcpy_64.s (used in 'perf bench mem'), silencing the respective
warnings during the perf tools build.
- Fix the build on the alpine:edge distro"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Fix hardcoded index of Broadwell extra PCI devices
perf tools: Fix the build on the alpine:edge distro
tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'
tools headers uapi: Refresh linux/bpf.h copy
tools headers powerpc: Update asm/unistd.h copy to pick new
tools headers uapi: Update tools's copy of linux/perf_event.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
"A single bugfix for the irq core to prevent silent data corruption and
malfunction of threaded interrupts under certain conditions"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Make force irq threading setup more robust
|
|
Pull networking fixes from David Miller:
1) Handle frames in error situations properly in AF_XDP, from Jakub
Kicinski.
2) tcp_mmap test case only tests ipv6 due to a thinko, fix from
Maninder Singh.
3) Session refcnt fix in l2tp_ppp, from Guillaume Nault.
4) Fix regression in netlink bind handling of multicast gruops, from
Dmitry Safonov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
netlink: Don't shift on 64 for ngroups
net/smc: no cursor update send in state SMC_INIT
l2tp: fix missing refcount drop in pppol2tp_tunnel_ioctl()
mlxsw: core_acl_flex_actions: Remove redundant mirror resource destruction
mlxsw: core_acl_flex_actions: Remove redundant counter destruction
mlxsw: core_acl_flex_actions: Remove redundant resource destruction
mlxsw: core_acl_flex_actions: Return error for conflicting actions
selftests/bpf: update test_lwt_seg6local.sh according to iproute2
drivers: net: lmc: fix case value for target abort error
selftest/net: fix protocol family to work for IPv4.
net: xsk: don't return frames via the allocator on error
tools/bpftool: fix a percpu_array map dump problem
|
|
When nested virtualization is in use, VMENTER operations from the nested
hypervisor into the nested guest will always be processed by the bare metal
hypervisor, and KVM's "conditional cache flushes" mode in particular does a
flush on nested vmentry. Therefore, include the "skip L1D flush on
vmentry" bit in KVM's suggested ARCH_CAPABILITIES setting.
Add the relevant Documentation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is
not needed. Add a new value to enum vmx_l1d_flush_state, which is used
either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Three changes to the content of the sysfs file:
- If EPT is disabled, L1TF cannot be exploited even across threads on the
same core, and SMT is irrelevant.
- If mitigation is completely disabled, and SMT is enabled, print "vulnerable"
instead of "vulnerable, SMT vulnerable"
- Reorder the two parts so that the main vulnerability state comes first
and the detail on SMT is second.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Dave reported, that it's not confirmed that Yonah processors are
unaffected. Remove them from the list.
Reported-by: ave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
For VMEXITs caused by external interrupts, vmx_handle_external_intr()
indirectly calls into the interrupt handlers through the host's IDT.
It follows that these interrupts get accounted for in the
kvm_cpu_l1tf_flush_l1d per-cpu flag.
The subsequently executed vmx_l1d_flush() will thus be aware that some
interrupts have happened and conduct a L1d flush anyway.
Setting l1tf_flush_l1d from vmx_handle_external_intr() isn't needed
anymore. Drop it.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The last missing piece to having vmx_l1d_flush() take interrupts after
VMEXIT into account is to set the kvm_cpu_l1tf_flush_l1d per-cpu flag on
irq entry.
Issue calls to kvm_set_cpu_l1tf_flush_l1d() from entering_irq(),
ipi_entering_ack_irq(), smp_reschedule_interrupt() and
uv_bau_message_interrupt().
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The next patch in this series will have to make the definition of
irq_cpustat_t available to entering_irq().
Inclusion of asm/hardirq.h into asm/apic.h would cause circular header
dependencies like
asm/smp.h
asm/apic.h
asm/hardirq.h
linux/irq.h
linux/topology.h
linux/smp.h
asm/smp.h
or
linux/gfp.h
linux/mmzone.h
asm/mmzone.h
asm/mmzone_64.h
asm/smp.h
asm/apic.h
asm/hardirq.h
linux/irq.h
linux/irqdesc.h
linux/kobject.h
linux/sysfs.h
linux/kernfs.h
linux/idr.h
linux/gfp.h
and others.
This causes compilation errors because of the header guards becoming
effective in the second inclusion: symbols/macros that had been defined
before wouldn't be available to intermediate headers in the #include chain
anymore.
A possible workaround would be to move the definition of irq_cpustat_t
into its own header and include that from both, asm/hardirq.h and
asm/apic.h.
However, this wouldn't solve the real problem, namely asm/harirq.h
unnecessarily pulling in all the linux/irq.h cruft: nothing in
asm/hardirq.h itself requires it. Also, note that there are some other
archs, like e.g. arm64, which don't have that #include in their
asm/hardirq.h.
Remove the linux/irq.h #include from x86' asm/hardirq.h.
Fix resulting compilation errors by adding appropriate #includes to *.c
files as needed.
Note that some of these *.c files could be cleaned up a bit wrt. to their
set of #includes, but that should better be done from separate patches, if
at all.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Part of the L1TF mitigation for vmx includes flushing the L1D cache upon
VMENTRY.
L1D flushes are costly and two modes of operations are provided to users:
"always" and the more selective "conditional" mode.
If operating in the latter, the cache would get flushed only if a host side
code path considered unconfined had been traversed. "Unconfined" in this
context means that it might have pulled in sensitive data like user data
or kernel crypto keys.
The need for L1D flushes is tracked by means of the per-vcpu flag
l1tf_flush_l1d. KVM exit handlers considered unconfined set it. A
vmx_l1d_flush() subsequently invoked before the next VMENTER will conduct a
L1d flush based on its value and reset that flag again.
Currently, interrupts delivered "normally" while in root operation between
VMEXIT and VMENTER are not taken into account. Part of the reason is that
these don't leave any traces and thus, the vmx code is unable to tell if
any such has happened.
As proposed by Paolo Bonzini, prepare for tracking all interrupts by
introducing a new per-cpu flag, "kvm_cpu_l1tf_flush_l1d". It will be in
strong analogy to the per-vcpu ->l1tf_flush_l1d.
A later patch will make interrupt handlers set it.
For the sake of cache locality, group kvm_cpu_l1tf_flush_l1d into x86'
per-cpu irq_cpustat_t as suggested by Peter Zijlstra.
Provide the helpers kvm_set_cpu_l1tf_flush_l1d(),
kvm_clear_cpu_l1tf_flush_l1d() and kvm_get_cpu_l1tf_flush_l1d(). Make them
trivial resp. non-existent for !CONFIG_KVM_INTEL as appropriate.
Let vmx_l1d_flush() handle kvm_cpu_l1tf_flush_l1d in the same way as
l1tf_flush_l1d.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
An upcoming patch will extend KVM's L1TF mitigation in conditional mode
to also cover interrupts after VMEXITs. For tracking those, stores to a
new per-cpu flag from interrupt handlers will become necessary.
In order to improve cache locality, this new flag will be added to x86's
irq_cpustat_t.
Make some space available there by shrinking the ->softirq_pending bitfield
from 32 to 16 bits: the number of bits actually used is only NR_SOFTIRQS,
i.e. 10.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Currently, vmx_vcpu_run() checks if l1tf_flush_l1d is set and invokes
vmx_l1d_flush() if so.
This test is unncessary for the "always flush L1D" mode.
Move the check to vmx_l1d_flush()'s conditional mode code path.
Notes:
- vmx_l1d_flush() is likely to get inlined anyway and thus, there's no
extra function call.
- This inverts the (static) branch prediction, but there hadn't been any
explicit likely()/unlikely() annotations before and so it stays as is.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The vmx_l1d_flush_always static key is only ever evaluated if
vmx_l1d_should_flush is enabled. In that case however, there are only two
L1d flushing modes possible: "always" and "conditional".
The "conditional" mode's implementation tends to require more sophisticated
logic than the "always" mode.
Avoid inverted logic by replacing the 'vmx_l1d_flush_always' static key
with a 'vmx_l1d_flush_cond' one.
There is no change in functionality.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
vmx_l1d_flush() gets invoked only if l1tf_flush_l1d is true. There's no
point in setting l1tf_flush_l1d to true from there again.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull usercopy whitelisting fix from Kees Cook:
"Bart Massey discovered that the usercopy whitelist for JFS was
incomplete: the inline inode data may intentionally "overflow" into
the neighboring "extended area", so the size of the whitelist needed
to be raised to include the neighboring field"
* tag 'usercopy-fix-v4.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
jfs: Fix usercopy whitelist for inline inode data
|
|
Pull xfs bugfix from Darrick Wong:
"One more patch for 4.18 to fix a coding error in the iomap_bmap()
function introduced in -rc1: fix incorrect shifting"
* tag 'xfs-4.18-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
fs: fix iomap_bmap position calculation
|
|
It turns out that commit 721c7fc701c7 ("block: fail op_is_write()
requests to read-only partitions"), while obviously correct, causes
problems for some older lvm2 installations.
The reason is that the lvm snapshotting will continue to write to the
snapshow COW volume, even after the volume has been marked read-only.
End result: snapshot failure.
This has actually been fixed in newer version of the lvm2 tool, but the
old tools still exist, and the breakage was reported both in the kernel
bugzilla and in the Debian bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=200439
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900442
The lvm2 fix is here
https://sourceware.org/git/?p=lvm2.git;a=commit;h=a6fdb9d9d70f51c49ad11a87ab4243344e6701a3
but until everybody has updated to recent versions, we'll have to weaken
the "never write to read-only partitions" check. It now allows the
write to happen, but causes a warning, something like this:
generic_make_request: Trying to write to read-only block-device dm-3 (partno X)
Modules linked in: nf_tables xt_cgroup xt_owner kvm_intel iwlmvm kvm irqbypass iwlwifi
CPU: 1 PID: 77 Comm: kworker/1:1 Not tainted 4.17.9-gentoo #3
Hardware name: LENOVO 20B6A019RT/20B6A019RT, BIOS GJET91WW (2.41 ) 09/21/2016
Workqueue: ksnaphd do_metadata
RIP: 0010:generic_make_request_checks+0x4ac/0x600
...
Call Trace:
generic_make_request+0x64/0x400
submit_bio+0x6c/0x140
dispatch_io+0x287/0x430
sync_io+0xc3/0x120
dm_io+0x1f8/0x220
do_metadata+0x1d/0x30
process_one_work+0x1b9/0x3e0
worker_thread+0x2b/0x3c0
kthread+0x113/0x130
ret_from_fork+0x35/0x40
Note that this is a "revert" in behavior only. I'm leaving alone the
actual code cleanups in commit 721c7fc701c7, but letting the previously
uncaught request go through with a warning instead of stopping it.
Fixes: 721c7fc701c7 ("block: fail op_is_write() requests to read-only partitions")
Reported-and-tested-by: WGH <wgh@torlan.ru>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
It's legal to have 64 groups for netlink_sock.
As user-supplied nladdr->nl_groups is __u32, it's possible to subscribe
only to first 32 groups.
The check for correctness of .bind() userspace supplied parameter
is done by applying mask made from ngroups shift. Which broke Android
as they have 64 groups and the shift for mask resulted in an overflow.
Fixes: 61f4b23769f0 ("netlink: Don't shift with UB on nlk->ngroups")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Reported-and-Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2018-08-05
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix bpftool percpu_array dump by using correct roundup to next
multiple of 8 for the value size, from Yonghong.
2) Fix in AF_XDP's __xsk_rcv_zc() to not returning frames back to
allocator since driver will recycle frame anyway in case of an
error, from Jakub.
3) Fix up BPF test_lwt_seg6local test cases to final iproute2
syntax, from Mathieu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The err is not used after initalization. So just remove the variable.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
If a writer blocked condition is received without data, the current
consumer cursor is immediately sent. Servers could already receive this
condition in state SMC_INIT without finished tx-setup. This patch
avoids sending a consumer cursor update in this case.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Bart Massey reported what turned out to be a usercopy whitelist false
positive in JFS when symlink contents exceeded 128 bytes. The inline
inode data (i_inline) is actually designed to overflow into the "extended
area" following it (i_inline_ea) when needed. So the whitelist needed to
be expanded to include both i_inline and i_inline_ea (the whole size
of which is calculated internally using IDATASIZE, 256, instead of
sizeof(i_inline), 128).
$ cd /mnt/jfs
$ touch $(perl -e 'print "B" x 250')
$ ln -s B* b
$ ls -l >/dev/null
[ 249.436410] Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'jfs_ip' (offset 616, size 250)!
Reported-by: Bart Massey <bart.massey@gmail.com>
Fixes: 8d2704d382a9 ("jfs: Define usercopy region in jfs_ip slab cache")
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: jfs-discussion@lists.sourceforge.net
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Warning level 2 was used: -Wimplicit-fallthrough=2
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Notice that in this particular case, I replaced the code comment with
a proper "fall through" annotation, which is what GCC is expecting
to find.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
This patch added the 6th compression algorithm support for pstore: zstd.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Pull KVM fixes from Paolo Bonzini:
"Two vmx bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: vmx: fix vpid leak
KVM: vmx: use local variable for current_vmptr when emulating VMPTRST
|
|
We hit that when inumber allocation has failed. In that case
the in-core inode is not hashed and since its ->i_nlink is 1
the only place where jfs checks is_bad_inode() won't be reached.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We never look them up in there; inode_fake_hash() will make them appear
hashed for mark_inode_dirty() purposes. And don't leave them around
until memory pressure kicks them out - we never look them up again.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
open-coded in a quite a few places...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
iput() ends up calling ->evict() on new inode, which is not yet initialized
by owning fs. So use destroy_inode() instead.
Add to sb->s_inodes list only if inode is not in I_CREATING state (meaning
that it wasn't allocated with new_inode(), which already does the
insertion).
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 80ea09a002bf ("vfs: factor out inode_insert5()")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
we don't want open-by-handle to pick an in-core inode that
has failed setup halfway through.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
we don't want open-by-handle to pick an in-core inode that
has failed setup halfway through.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
we don't want open-by-handle to pick an in-core inode that
has failed setup halfway through.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Make sure that no partially set up inodes can be returned by
open-by-handle.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We don't want open-by-handle picking half-set-up in-core
struct inode from e.g. mkdir() having failed halfway through.
In other words, we don't want such inodes returned by iget_locked()
on their way to extinction. However, we can't just have them
unhashed - otherwise open-by-handle immediately *after* that would've
ended up creating a new in-core inode over the on-disk one that
is in process of being freed right under us.
Solution: new flag (I_CREATING) set by insert_inode_locked() and
removed by unlock_new_inode() and a new primitive (discard_new_inode())
to be used by such halfway-through-setup failure exits instead of
unlock_new_inode() / iput() combinations. That primitive unlocks new
inode, but leaves I_CREATING in place.
iget_locked() treats finding an I_CREATING inode as failure
(-ESTALE, once we sort out the error propagation).
insert_inode_locked() treats the same as instant -EBUSY.
ilookup() treats those as icache miss.
[Fix by Dan Carpenter <dan.carpenter@oracle.com> folded in]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
If 'session' is not NULL and is not a PPP pseudo-wire, then we fail to
drop the reference taken by l2tp_session_get().
Fixes: ecd012e45ab5 ("l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ido Schimmel says:
====================
mlxsw: Fix ACL actions error condition handling
Nir says:
Two issues were lately noticed within mlxsw ACL actions error condition
handling. The first patch deals with conflicting actions such as:
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 flower skip_sw dst_ip 192.168.101.1 \
action goto chain 100 \
action mirred egress redirect dev swp4
The second action will never execute, however SW model allows this
configuration, while the mlxsw driver cannot allow for it as it
implements actions in sets of up to three actions per set with a single
termination marking. Conflicting actions create a contradiction over
this single marking and thus cannot be configured. The fix replaces a
misplaced warning with an error code to be returned.
Patches 2-4 fix a condition of duplicate destruction of resources. Some
actions require allocation of specific resource prior to setting the
action itself. On error condition this resource was destroyed twice,
leading to a crash when using mirror action, and to a redundant
destruction in other cases, since for error condition rule destruction
also takes care of resource destruction. In order to fix this state a
symmetry in behavior is added and resource destruction also takes care
of removing the resource from rule's resource list.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In previous patch mlxsw_afa_resource_del() was added to avoid a duplicate
resource detruction scenario.
For mirror actions, such duplicate destruction leads to a crash as in:
# tc qdisc add dev swp49 ingress
# tc filter add dev swp49 parent ffff: \
protocol ip chain 100 pref 10 \
flower skip_sw dst_ip 192.168.101.1 action drop
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \
action mirred egress mirror dev swp4
Therefore add a call to mlxsw_afa_resource_del() in
mlxsw_afa_mirror_destroy() in order to clear that resource
from rule's resources.
Fixes: d0d13c1858a1 ("mlxsw: spectrum_acl: Add support for mirror action")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Each tc flower rule uses a hidden count action. As counter resource may
not be available due to limited HW resources, update _counter_create()
and _counter_destroy() pair to follow previously introduced symmetric
error condition handling, add a call to mlxsw_afa_resource_del() as part
of the counter resource destruction.
Fixes: c18c1e186ba8 ("mlxsw: core: Make counter index allocated inside the action append")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some ACL actions require the allocation of a separate resource
prior to applying the action itself. When facing an error condition
during the setup phase of the action, resource should be destroyed.
For such actions the destruction was done twice which is dangerous
and lead to a potential crash.
The destruction took place first upon error on action setup phase
and then as the rule was destroyed.
The following sequence generated a crash:
# tc qdisc add dev swp49 ingress
# tc filter add dev swp49 parent ffff: \
protocol ip chain 100 pref 10 \
flower skip_sw dst_ip 192.168.101.1 action drop
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \
action mirred egress mirror dev swp4
Therefore add mlxsw_afa_resource_del() as a complement of
mlxsw_afa_resource_add() to add symmetry to resource_list membership
handling. Call this from mlxsw_afa_fwd_entry_ref_destroy() to make the
_fwd_entry_ref_create() and _fwd_entry_ref_destroy() pair of calls a
NOP.
Fixes: 140ce421217e ("mlxsw: core: Convert fwd_entry_ref list to be generic per-block resource list")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Spectrum switch ACL action set is built in groups of three actions
which may point to additional actions. A group holds a single record
which can be set as goto record for pointing at a following group
or can be set to mark the termination of the lookup. This is perfectly
adequate for handling a series of actions to be executed on a packet.
While the SW model allows configuration of conflicting actions
where it is clear that some actions will never execute, the mlxsw
driver must block such configurations as it creates a conflict
over the single terminate/goto record value.
For a conflicting actions configuration such as:
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 \
action goto chain 100 \
action mirred egress mirror dev swp4
Where it is clear that the last action will never execute, the
mlxsw driver was issuing a warning instead of returning an error.
Therefore replace that warning with an error for this specific
case.
Fixes: 4cda7d8d7098 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commands that are reset are returned with status
SAM_STAT_COMMAND_TERMINATED. PVSCSI currently returns DID_OK |
SAM_STAT_COMMAND_TERMINATED which fails the command. Instead, set hostbyte
to DID_RESET to allow upper layers to retry.
Tested by copying a large file between two pvscsi disks on same adapter
while performing a bus reset at 1-second intervals. Before fix, commands
sometimes fail with DID_OK. After fix, commands observed to fail with
DID_RESET.
Signed-off-by: Jim Gill <jgill@vmware.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
enabled
Surround scsi_execute() calls with scsi_autopm_get_device() and
scsi_autopm_put_device(). Note: removing sr_mutex protection from the
scsi_cd_get() and scsi_cd_put() calls is safe because the purpose of
sr_mutex is to serialize cdrom_*() calls.
This patch avoids that complaints similar to the following appear in the
kernel log if runtime power management is enabled:
INFO: task systemd-udevd:650 blocked for more than 120 seconds.
Not tainted 4.18.0-rc7-dbg+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
systemd-udevd D28176 650 513 0x00000104
Call Trace:
__schedule+0x444/0xfe0
schedule+0x4e/0xe0
schedule_preempt_disabled+0x18/0x30
__mutex_lock+0x41c/0xc70
mutex_lock_nested+0x1b/0x20
__blkdev_get+0x106/0x970
blkdev_get+0x22c/0x5a0
blkdev_open+0xe9/0x100
do_dentry_open.isra.19+0x33e/0x570
vfs_open+0x7c/0xd0
path_openat+0x6e3/0x1120
do_filp_open+0x11c/0x1c0
do_sys_open+0x208/0x2d0
__x64_sys_openat+0x59/0x70
do_syscall_64+0x77/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Maurizio Lombardi <mlombard@redhat.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: <stable@vger.kernel.org>
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Swap the I/O memory read value back to cpu endianness before storing it in
a data structures which are defined in the MPI headers where u8 components
are not defined in the endianness order.
In this area from day one mpt3sas driver is using le32_to_cpu() &
cpu_to_le32() APIs. But in commit cf6bf9710c
(mpt3sas: Bug fix for big endian systems) we have removed these APIs
before reading I/O memory which we should haven't done it. So
in this patch I am correcting it by adding these APIs back
before accessing I/O memory.
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull rdma fix from Jason Gunthorpe:
"One bug for missing user input validation: refuse invalid port numbers
in the modify_qp system call"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/uverbs: Expand primary and alt AV port checks
|
|
Pull block fix from Jens Axboe:
"Just a single fix, from Ming, fixing a regression in this cycle where
the busy tag iteration was changed to only calling the callback
function for requests that are started. We really want all non-free
requests.
This fixes a boot regression on certain VM setups"
* tag 'for-linus-20180803' of git://git.kernel.dk/linux-block:
blk-mq: fix blk_mq_tagset_busy_iter
|
|
gpiochip_lock_as_irq() may return a few error codes,
do not shadow them by -EINVAL and let caller to decide.
No functional change intended.
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|