Age | Commit message (Collapse) | Author |
|
is_external_interrupt() is not used now and so remove it.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The code tries to pre-allocate *min* number of objects, so it is ok to
return 0 when the kvm_mmu_memory_cache meets the requirement.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On a 64bits machine, struct is naturally aligned with 8 bytes. Since
kvm_mmu_page member *unsync* and *role* are less then 4 bytes, we can
rearrange the sequence to compace the struct.
As the comment shows, *role* and *gfn* are used to key the shadow page. In
order to keep the comment valid, this patch moves the *unsync* up and
exchange the position of *role* and *gfn*.
From /proc/slabinfo, it shows the size of kvm_mmu_page is 8 bytes less and
with one more object per slap after applying this patch.
# name <active_objs> <num_objs> <objsize> <objperslab>
kvm_mmu_page_header 0 0 168 24
kvm_mmu_page_header 0 0 160 25
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
A VMEnter that VMFails (as opposed to VMExits) does not touch host
state beyond registers that are explicitly noted in the VMFail path,
e.g. EFLAGS. Host state does not need to be loaded because VMFail
is only signaled for consistency checks that occur before the CPU
starts to load guest state, i.e. there is no need to restore any
state as nothing has been modified. But in the case where a VMFail
is detected by hardware and not by KVM (due to deferring consistency
checks to hardware), KVM has already loaded some amount of guest
state. Luckily, "loaded" only means loaded to KVM's software model,
i.e. vmcs01 has not been modified. So, unwind our software model to
the pre-VMEntry host state.
Not restoring host state in this VMFail path leads to a variety of
failures because we end up with stale data in vcpu->arch, e.g. CR0,
CR4, EFER, etc... will all be out of sync relative to vmcs01. Any
significant delta in the stale data is all but guaranteed to crash
L1, e.g. emulation of SMEP, SMAP, UMIP, WP, etc... will be wrong.
An alternative to this "soft" reload would be to load host state from
vmcs12 as if we triggered a VMExit (as opposed to VMFail), but that is
wildly inconsistent with respect to the VMX architecture, e.g. an L1
VMM with separate VMExit and VMFail paths would explode.
Note that this approach does not mean KVM is 100% accurate with
respect to VMX hardware behavior, even at an architectural level
(the exact order of consistency checks is microarchitecture specific).
But 100% emulation accuracy isn't the goal (with this patch), rather
the goal is to be consistent in the information delivered to L1, e.g.
a VMExit should not fall-through VMENTER, and a VMFail should not jump
to HOST_RIP.
This technically reverts commit "5af4157388ad (KVM: nVMX: Fix mmu
context after VMLAUNCH/VMRESUME failure)", but retains the core
aspects of that patch, just in an open coded form due to the need to
pull state from vmcs01 instead of vmcs12. Restoring host state
resolves a variety of issues introduced by commit "4f350c6dbcb9
(kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly)",
which remedied the incorrect behavior of treating VMFail like VMExit
but in doing so neglected to restore arch state that had been modified
prior to attempting nested VMEnter.
A sample failure that occurs due to stale vcpu.arch state is a fault
of some form while emulating an LGDT (due to emulated UMIP) from L1
after a failed VMEntry to L3, in this case when running the KVM unit
test test_tpr_threshold_values in L1. L0 also hits a WARN in this
case due to a stale arch.cr4.UMIP.
L1:
BUG: unable to handle kernel paging request at ffffc90000663b9e
PGD 276512067 P4D 276512067 PUD 276513067 PMD 274efa067 PTE 8000000271de2163
Oops: 0009 [#1] SMP
CPU: 5 PID: 12495 Comm: qemu-system-x86 Tainted: G W 4.18.0-rc2+ #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:native_load_gdt+0x0/0x10
...
Call Trace:
load_fixmap_gdt+0x22/0x30
__vmx_load_host_state+0x10e/0x1c0 [kvm_intel]
vmx_switch_vmcs+0x2d/0x50 [kvm_intel]
nested_vmx_vmexit+0x222/0x9c0 [kvm_intel]
vmx_handle_exit+0x246/0x15a0 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0x850/0x1830 [kvm]
kvm_vcpu_ioctl+0x3a1/0x5c0 [kvm]
do_vfs_ioctl+0x9f/0x600
ksys_ioctl+0x66/0x70
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x4f/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
L0:
WARNING: CPU: 2 PID: 3529 at arch/x86/kvm/vmx.c:6618 handle_desc+0x28/0x30 [kvm_intel]
...
CPU: 2 PID: 3529 Comm: qemu-system-x86 Not tainted 4.17.2-coffee+ #76
Hardware name: Intel Corporation Kabylake Client platform/KBL S
RIP: 0010:handle_desc+0x28/0x30 [kvm_intel]
...
Call Trace:
kvm_arch_vcpu_ioctl_run+0x863/0x1840 [kvm]
kvm_vcpu_ioctl+0x3a1/0x5c0 [kvm]
do_vfs_ioctl+0x9f/0x5e0
ksys_ioctl+0x66/0x70
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x49/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 5af4157388ad (KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure)
Fixes: 4f350c6dbcb9 (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly)
Cc: Jim Mattson <jmattson@google.com>
Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
According to volume 3 of the SDM, bits 63:15 and 12:4 of the exit
qualification field for debug exceptions are reserved (cleared to
0). However, the SDM is incorrect about bit 16 (corresponding to
DR6.RTM). This bit should be set if a debug exception (#DB) or a
breakpoint exception (#BP) occurred inside an RTM region while
advanced debugging of RTM transactional regions was enabled. Note that
this is the opposite of DR6.RTM, which "indicates (when clear) that a
debug exception (#DB) or breakpoint exception (#BP) occurred inside an
RTM region while advanced debugging of RTM transactional regions was
enabled."
There is still an issue with stale DR6 bits potentially being
misreported for the current debug exception. DR6 should not have been
modified before vectoring the #DB exception, and the "new DR6 bits"
should be available somewhere, but it was and they aren't.
Fixes: b96fb439774e1 ("KVM: nVMX: fixes to nested virt interrupt injection")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Let's add the 40 PA-bit versions of the VM modes, that AArch64
should have been using, so we can extend the dirty log test without
breaking things.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
While we're messing with the code for the port and to support guest
page sizes that are less than the host page size, we also make some
code formatting cleanups and apply sync_global_to_guest().
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rename VM_MODE_FLAT48PG to be more descriptive of its config and add a
new config that has the same parameters, except with 64K pages.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This code adds VM and VCPU setup code for the VM_MODE_FLAT48PG mode.
The VM_MODE_FLAT48PG isn't yet fully supportable, as it defines the
guest physical address limit as 52-bits, and KVM currently only
supports guests with up to 40-bit physical addresses (see
KVM_PHYS_SHIFT). VM_MODE_FLAT48PG will work fine, though, as long as
no >= 40-bit physical addresses are used.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Tidy up kvm-util code: code/comment formatting, remove unused code,
and move x86 specific code out. We also move vcpu_dump() out of
common code, because not all arches (AArch64) have KVM_GET_REGS.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rework the guest exit to userspace code to generalize the concept
into what it is, a "hypercall to userspace", and provide two
implementations of it: the PortIO version currently used, but only
useable by x86, and an MMIO version that other architectures (except
s390) can use.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Guest code may want to call functions that have variable arguments.
To do so, we either need to compile with -mno-sse or enable SSE in
the VCPUs. As it should be pretty safe to turn on the feature, and
-mno-sse would make linking test code with standard libraries
difficult, we choose the feature enabling.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
In cloud environment, lapic_timer_advance_ns is needed to be tuned for every CPU
generations, and every host kernel versions(the kvm-unit-tests/tscdeadline_latency.flat
is 5700 cycles for upstream kernel and 9600 cycles for our 3.10 product kernel,
both preemption_timer=N, Skylake server).
This patch adds the capability to automatically tune lapic_timer_advance_ns
step by step, the initial value is 1000ns as 'commit d0659d946be0 ("KVM: x86:
add option to advance tscdeadline hrtimer expiration")' recommended, it will be
reduced when it is too early, and increased when it is too late. The guest_tsc
and tsc_deadline are hard to equal, so we assume we are done when the delta
is within a small scope e.g. 100 cycles. This patch reduces latency
(kvm-unit-tests/tscdeadline_latency, busy waits, preemption_timer enabled)
from ~2600 cyles to ~1200 cyles on our Skylake server.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Clang generates a warning when it sees a logical not followed by a
conditional operator like ==, >, or < because it thinks that the logical
not should be applied to the whole statement:
drivers/scsi/qla2xxx/qla_nx.c:3702:7: warning: logical not is only
applied to the left hand side of this comparison
[-Wlogical-not-parentheses]
if (!qla2x00_eh_wait_for_pending_commands(vha, 0, 0,
^
drivers/scsi/qla2xxx/qla_nx.c:3702:7: note: add parentheses after the
'!' to evaluate the comparison first
if (!qla2x00_eh_wait_for_pending_commands(vha, 0, 0,
^
(
drivers/scsi/qla2xxx/qla_nx.c:3702:7: note: add parentheses around left
hand side expression to silence this warning
if (!qla2x00_eh_wait_for_pending_commands(vha, 0, 0,
^
(
1 warning generated.
It assumes the author might have made a mistake in their logic:
if (!a == b) -> if (!(a == b))
Sometimes that is the case; other times, it's just a super convoluted
way of saying 'if (a)' when b = 0:
if (!1 == 0) -> if (0 == 0) -> if (true)
Alternatively:
if (!1 == 0) -> if (!!1) -> if (1)
Simplify this comparison so that Clang doesn't complain.
Link: https://github.com/ClangBuiltLinux/linux/issues/80
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Jakub Kicinski says:
====================
this set adds check to make sure offload behaviour is correct.
First when atomic counters are used, we must make sure the map
does not already contain data we did not prepare for holding
atomics.
Second patch double checks vNIC capabilities for program offload
in case program is shared by multiple vNICs with different
constraints.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Program translation stage checks that program can be offloaded to
the netdev which was passed during the load (bpf_attr->prog_ifindex).
After program sharing was introduced, however, the netdev on which
program is loaded can theoretically be different, and therefore
we should recheck the program size and max stack size at load time.
This was found by code inspection, AFAIK today all vNICs have
identical caps.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Atomic operations on the NFP are currently always in big endian.
The driver keeps track of regions of memory storing atomic values
and byte swaps them accordingly. There are corner cases where
the map values may be initialized before the driver knows they
are used as atomic counters. This can happen either when the
datapath is performing the update and the stack contents are
unknown or when map is updated before the program which will
use it for atomic values is loaded.
To avoid situation where user initializes the value to 0 1 2 3
and then after loading a program which uses the word as an atomic
counter starts reading 3 2 1 0 - only allow atomic counters to be
initialized to endian-neutral values.
For updates from the datapath the stack information may not be
as precise, so just allow initializing such values to 0.
Example code which would break:
struct bpf_map_def SEC("maps") rxcnt = {
.type = BPF_MAP_TYPE_HASH,
.key_size = sizeof(__u32),
.value_size = sizeof(__u64),
.max_entries = 1,
};
int xdp_prog1()
{
__u64 nonzeroval = 3;
__u32 key = 0;
__u64 *value;
value = bpf_map_lookup_elem(&rxcnt, &key);
if (!value)
bpf_map_update_elem(&rxcnt, &key, &nonzeroval, BPF_ANY);
else
__sync_fetch_and_add(value, 1);
return XDP_PASS;
}
$ offload bpftool map dump
key: 00 00 00 00 value: 00 00 00 03 00 00 00 00
should be:
$ offload bpftool map dump
key: 00 00 00 00 value: 03 00 00 00 00 00 00 00
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Clang warns when a variable is assigned to itself.
drivers/scsi/bfa/bfa_fcbuild.c:199:6: warning: explicitly assigning
value of variable of type 'int' to itself [-Wself-assign]
len = len;
~~~ ^ ~~~
drivers/scsi/bfa/bfa_fcbuild.c:838:6: warning: explicitly assigning
value of variable of type 'int' to itself [-Wself-assign]
len = len;
~~~ ^ ~~~
drivers/scsi/bfa/bfa_fcbuild.c:917:6: warning: explicitly assigning
value of variable of type 'int' to itself [-Wself-assign]
len = len;
~~~ ^ ~~~
drivers/scsi/bfa/bfa_fcbuild.c:981:6: warning: explicitly assigning
value of variable of type 'int' to itself [-Wself-assign]
len = len;
~~~ ^ ~~~
drivers/scsi/bfa/bfa_fcbuild.c:1008:6: warning: explicitly assigning
value of variable of type 'int' to itself [-Wself-assign]
len = len;
~~~ ^ ~~~
5 warnings generated.
This construct is usually used to avoid unused variable warnings, which
I assume is the case here. -Wunused-parameter is hidden behind -Wextra
with GCC 4.6, which is the minimum version to compile the kernel as of
commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6").
However, upon further inspection, these functions aren't actually used
anywhere; they're just defined. Rather than just removing the self
assignments, remove all of this dead code.
Link: https://github.com/ClangBuiltLinux/linux/issues/148
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Clang warns when a variable is assigned to itself.
drivers/scsi/qla2xxx/qla_mbx.c:1514:4: warning: explicitly assigning
value of variable of type 'uint64_t' (aka 'unsigned long long') to
itself [-Wself-assign]
l = l;
~ ^ ~
1 warning generated.
This construct is usually used to avoid unused variable warnings, which
I assume is the case here. -Wunused-parameter is hidden behind -Wextra
with GCC 4.6, which is the minimum version to compile the kernel as of
commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6").
Just remove this line to silence Clang.
Link: https://github.com/ClangBuiltLinux/linux/issues/83
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Make global symbols in libbpf DSO hidden by default with
-fvisibility=hidden and export symbols that are part of ABI explicitly
with __attribute__((visibility("default"))).
This is common practice that should prevent from accidentally exporting
a symbol, that is not supposed to be a part of ABI what, in turn,
improves both libbpf developer- and user-experiences. See [1] for more
details.
Export control becomes more important since more and more projects use
libbpf.
The patch doesn't export a bunch of netlink related functions since as
agreed in [2] they'll be reworked. That doesn't break bpftool since
bpftool links libbpf statically.
[1] https://www.akkadia.org/drepper/dsohowto.pdf (2.2 Export Control)
[2] https://www.mail-archive.com/netdev@vger.kernel.org/msg251434.html
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/scsi/arcmsr/arcmsr_hba.c: In function 'arcmsr_drain_donequeue':
drivers/scsi/arcmsr/arcmsr_hba.c:1320:10: warning:
variable 'lun' set but not used [-Wunused-but-set-variable]
drivers/scsi/arcmsr/arcmsr_hba.c:1320:6: warning:
variable 'id' set but not used [-Wunused-but-set-variable]
Never used since introduction in commit ae52e7f09ff5 ("arcmsr: Support 1024 scatter-gather list entries and improve AP while FW trapped and behaviors of EHs").
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In megasas_mgmt_compat_ioctl_fw(), to handle the structure
compat_megasas_iocpacket 'cioc', a user-space structure megasas_iocpacket
'ioc' is allocated before megasas_mgmt_ioctl_fw() is invoked to handle
the packet. Since the two data structures have different fields, the data
is copied from 'cioc' to 'ioc' field by field. In the copy process,
'sense_ptr' is prepared if the field 'sense_len' is not null, because it
will be used in megasas_mgmt_ioctl_fw(). To prepare 'sense_ptr', the
user-space data 'ioc->sense_off' and 'cioc->sense_off' are copied and
saved to kernel-space variables 'local_sense_off' and 'user_sense_off'
respectively. Given that 'ioc->sense_off' is also copied from
'cioc->sense_off', 'local_sense_off' and 'user_sense_off' should have the
same value. However, 'cioc' is in the user space and a malicious user can
race to change the value of 'cioc->sense_off' after it is copied to
'ioc->sense_off' but before it is copied to 'user_sense_off'. By doing
so, the attacker can inject different values into 'local_sense_off' and
'user_sense_off'. This can cause undefined behavior in the following
execution, because the two variables are supposed to be same.
This patch enforces a check on the two kernel variables 'local_sense_off'
and 'user_sense_off' to make sure they are the same after the copy. In
case they are not, an error code EINVAL will be returned.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Trivial fix to spelling mistake in beiscsi_log message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Trivial fix to spelling mistake in lpfc_printf_log message text.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently a spin_unlock_irqrestore() call is missing on the error path,
so add it.
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
We just need to free the request here. Additionally, this is currently
wrong for a queue that's using MQ currently, it'll crash.
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This is currently wrong since it isn't dependent on if we're using mq or
not. At least now it'll be correct when we force mq.
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add missing break statement in order to prevent the code from falling
through to case TEST_UNIT_READY.
Addresses-Coverity-ID: 1357338 ("Missing break in switch")
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Eric reported that syzkaller triggered a splat in tcp_cleanup_ulp()
where assertion sock_owned_by_me() failed. This happened through
inet_csk_prepare_forced_close() first releasing the socket lock,
then calling into tcp_done(newsk) which is called after the
inet_csk_prepare_forced_close() and therefore without the socket
lock held. The sock_owned_by_me() assertion can generally be
removed as the only place where tcp_cleanup_ulp() is called from
now is out of inet_csk_destroy_sock() -> sk->sk_prot->destroy()
where socket is in dead state and unreachable. Therefore, add a
comment why the check is not needed instead.
Fixes: 8b9088f806e1 ("tcp, ulp: enforce sock_owned_by_me upon ulp init and cleanup")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dc_req_scat_data_cqe capability bit determines
if requester scatter to cqe is available for 64 bytes CQE over
DC transport type.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
|
|
Replace custom code to allocate indexes to generic kernel API.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Replace custom code to allocate indexes to generic kernel API.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
IDA adds overhead to store IDs bitmap with maximal value of IDA
can be upto 2099202 (IDA_MAX = 0x80000000U / IDA_BITMAP_BITS - 1).
However, there is no need to add such enormous number of devices
and it is enough for now to limit it to be 8192.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
WARN_ON() already contains an unlikely(), so it's not necessary to
wrap it into another.
Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Add an entry for the Broadcom SPI controller in the MAINTAINERS file.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The API surrounding refcount_t should be used in place of atomic_t
when variables are being used as reference counters. This API can
prevent issues such as counter overflows and use-after-free
conditions. Within the dm zoned target stack, the atomic_t API is
used for bioctx->ref and cw->refcount. Change these to use
refcount_t, avoiding the issues mentioned.
Signed-off-by: John Pittman <jpittman@redhat.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
The API surrounding refcount_t should be used in place of atomic_t
when variables are being used as reference counters. It can
potentially prevent reference counter overflows and use-after-free
conditions. In the dm thin layer, one such example is tc->refcount.
Change this from the atomic_t API to the refcount_t API to prevent
mentioned conditions.
Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Add said information and make the debug print format consistent.
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
IB Subnet Management Packets (SMPs) were excluded from debug prints.
Fixed by enabling print even on QP0 MADs.
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Normally kobj objects have kobj suffix to reflect it.
Rename ports_parent to ports_kobj.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
If the provider driver (such as rdma_rxe) doesn't support pma counters,
avoid exposing its directory similar to optional hw_counters directory.
If core fails to read the PMA counter, return an error so that user can
retry later if needed.
Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c")
Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
iov sysfs tree is created under ib device at
/sys/class/infiniband/mlx4_0/iov.
And,
ibdev->ports_parent->parent = &ibdev->dev.
Therefore, refer to device's kobject directly instead of
indirect access to it.
Additionally, iov entries are created under device kobject and deleted
before device is removed. There is no need to hold additional reference
to device kobject in provider driver.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
So the extra user build flags are propagated to libtraceevent.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: "Herton R. Krzesinski" <herton@redhat.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Link: http://lkml.kernel.org/r/20181016150614.21260-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|