Age | Commit message (Collapse) | Author |
|
Ensure that CONDITION MET and other non-zero status values that indicate
success are translated into BLK_STS_OK.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Lee Duncan <lduncan@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
scsi_result_to_blk_status()
Since the next patch will modify this function such that it checks more than
just the host byte of the SCSI result, rename __scsi_error_from_host_byte()
into scsi_result_to_blk_status(). This patch does not change any
functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Lee Duncan <lduncan@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
__scsi_error_from_host_byte()"
The description of commit e39a97353e53 is wrong: it mentions that commit
2a842acab109 introduced a bug in __scsi_error_from_host_byte() although that
commit did not change the behavior of that function. Additionally, commit
e39a97353e53 introduced a bug: it causes commands that fail with
hostbyte=DID_OK and driverbyte=DRIVER_SENSE to be completed with
BLK_STS_OK. Hence revert that commit.
Fixes: e39a97353e53 ("scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()")
Reported-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Lee Duncan <lduncan@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
If a recursive IOP_RESET is invoked, usually due to the eh_thread
handling errors after the first reset, be sure we flag that the command
thread has been stopped to avoid an Oops of the form;
[ 336.620256] CPU: 28 PID: 1193 Comm: scsi_eh_0 Kdump: loaded Not tainted 4.14.0-49.el7a.ppc64le #1
[ 336.620297] task: c000003fd630b800 task.stack: c000003fd61a4000
[ 336.620326] NIP: c000000000176794 LR: c00000000013038c CTR: c00000000024bc10
[ 336.620361] REGS: c000003fd61a7720 TRAP: 0300 Not tainted (4.14.0-49.el7a.ppc64le)
[ 336.620395] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22084022 XER: 20040000
[ 336.620435] CFAR: c000000000130388 DAR: 0000000000000000 DSISR: 40000000 SOFTE: 1
[ 336.620435] GPR00: c00000000013038c c000003fd61a79a0 c0000000014c7e00 0000000000000000
[ 336.620435] GPR04: 000000000000000c 000000000000000c 9000000000009033 0000000000000477
[ 336.620435] GPR08: 0000000000000477 0000000000000000 0000000000000000 c008000010f7d940
[ 336.620435] GPR12: c00000000024bc10 c000000007a33400 c0000000001708a8 c000003fe3b881d8
[ 336.620435] GPR16: c000003fe3b88060 c000003fd61a7d10 fffffffffffff000 000000000000001e
[ 336.620435] GPR20: 0000000000000001 c000000000ebf1a0 0000000000000001 c000003fe3b88000
[ 336.620435] GPR24: 0000000000000003 0000000000000002 c000003fe3b88840 c000003fe3b887e8
[ 336.620435] GPR28: c000003fe3b88000 c000003fc8181788 0000000000000000 c000003fc8181700
[ 336.620750] NIP [c000000000176794] exit_creds+0x34/0x160
[ 336.620775] LR [c00000000013038c] __put_task_struct+0x8c/0x1f0
[ 336.620804] Call Trace:
[ 336.620817] [c000003fd61a79a0] [c000003fe3b88000] 0xc000003fe3b88000 (unreliable)
[ 336.620853] [c000003fd61a79d0] [c00000000013038c] __put_task_struct+0x8c/0x1f0
[ 336.620889] [c000003fd61a7a00] [c000000000171418] kthread_stop+0x1e8/0x1f0
[ 336.620922] [c000003fd61a7a40] [c008000010f7448c] aac_reset_adapter+0x14c/0x8d0 [aacraid]
[ 336.620959] [c000003fd61a7b00] [c008000010f60174] aac_eh_host_reset+0x84/0x100 [aacraid]
[ 336.621010] [c000003fd61a7b30] [c000000000864f24] scsi_try_host_reset+0x74/0x180
[ 336.621046] [c000003fd61a7bb0] [c000000000867ac0] scsi_eh_ready_devs+0xc00/0x14d0
[ 336.625165] [c000003fd61a7ca0] [c0000000008699e0] scsi_error_handler+0x550/0x730
[ 336.632101] [c000003fd61a7dc0] [c000000000170a08] kthread+0x168/0x1b0
[ 336.639031] [c000003fd61a7e30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4
[ 336.645971] Instruction dump:
[ 336.648743] 384216a0 7c0802a6 fbe1fff8 f8010010 f821ffd1 7c7f1b78 60000000 60000000
[ 336.657056] 39400000 e87f0838 f95f0838 7c0004ac <7d401828> 314affff 7d40192d 40c2fff4
[ 336.663997] -[ end trace 4640cf8d4945ad95 ]-
So flag when the thread is stopped by setting the thread pointer to NULL.
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Bart reports that in qla_isr.c's qla2x00_handle_dif_error we're wrongly
shifting the SAM_STAT_CHECK_CONDITION by one instead of directly ORing it
onto the SCSI command's result.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The SCSI host byte has to be shifted by 16 not 6.
As Bart pointed out this patch does not change any functionality because
DID_OK == 0, but a wrong shift is irritating for the reviewer.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
qla2x00_init_timer() calls add_timer() on the iocb timeout timer, which
means the timeout function pointer and any data that the function depends on
must be initialised beforehand.
Move this initialisation before each call to qla2x00_init_timer(). In some
cases qla2x00_init_timer() initialises a completion structure needed by the
timeout function, so move the call to add_timer() after that.
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
qla2x00_tmf_sp_done() now deletes the timer that will run
qla2x00_tmf_iocb_timeout(), but doesn't check whether the timer already
expired. Check the return value from del_timer() to avoid calling
complete() a second time.
Fixes: 4440e46d5db7 ("[SCSI] qla2xxx: Add IOCB Abort command asynchronous ...")
Fixes: 1514839b3664 ("scsi: qla2xxx: Fix NULL pointer crash due to active ...")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull networking fixes from David Miller:
1) The sockmap code has to free socket memory on close if there is
corked data, from John Fastabend.
2) Tunnel names coming from userspace need to be length validated. From
Eric Dumazet.
3) arp_filter() has to take VRFs properly into account, from Miguel
Fadon Perlines.
4) Fix oops in error path of tcf_bpf_init(), from Davide Caratti.
5) Missing idr_remove() in u32_delete_key(), from Cong Wang.
6) More syzbot stuff. Several use of uninitialized value fixes all
over, from Eric Dumazet.
7) Do not leak kernel memory to userspace in sctp, also from Eric
Dumazet.
8) Discard frames from unused ports in DSA, from Andrew Lunn.
9) Fix DMA mapping and reset/failover problems in ibmvnic, from Thomas
Falcon.
10) Do not access dp83640 PHY registers prematurely after reset, from
Esben Haabendal.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
vhost-net: set packet weight of tx polling to 2 * vq size
net: thunderx: rework mac addresses list to u64 array
inetpeer: fix uninit-value in inet_getpeer
dp83640: Ensure against premature access to PHY registers after reset
devlink: convert occ_get op to separate registration
ARM: dts: ls1021a: Specify TBIPA register address
net/fsl_pq_mdio: Allow explicit speficition of TBIPA address
ibmvnic: Do not reset CRQ for Mobility driver resets
ibmvnic: Fix failover case for non-redundant configuration
ibmvnic: Fix reset scheduler error handling
ibmvnic: Zero used TX descriptor counter on reset
ibmvnic: Fix DMA mapping mistakes
tipc: use the right skb in tipc_sk_fill_sock_diag()
sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6
net: dsa: Discard frames from unused ports
sctp: do not leak kernel memory to user space
soreuseport: initialise timewait reuseport field
ipv4: fix uninit-value in ip_route_output_key_hash_rcu()
dccp: initialize ireq->ir_mark
net: fix uninit-value in __hw_addr_add_ex()
...
|
|
The code that fixes the crashes in the following commit introduced a small
memory leak:
commit 6a2cf8d3663e ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure")
Fixing this requires a bit of reworking, which I've explained. Also provide
some code cleanup.
There is a small window in qla2x00_probe_one where if qla2x00_alloc_queues
fails, we end up never freeing req and rsp and leak 0xc0 and 0xc8 bytes
respectively (the sizes of req and rsp).
I originally put in checks to test for this condition which were based on
the incorrect assumption that if ha->rsp_q_map and ha->req_q_map were
allocated, then rsp and req were allocated as well. This is incorrect.
There is a window between these allocations:
ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp);
goto probe_hw_failed;
[if successful, both rsp and req allocated]
base_vha = qla2x00_create_host(sht, ha);
goto probe_hw_failed;
ret = qla2x00_request_irqs(ha, rsp);
goto probe_failed;
if (qla2x00_alloc_queues(ha, req, rsp)) {
goto probe_failed;
[if successful, now ha->rsp_q_map and ha->req_q_map allocated]
To simplify this, we should just set req and rsp to NULL after we free
them. Sounds simple enough? The problem is that req and rsp are pointers
defined in the qla2x00_probe_one and they are not always passed by reference
to the routines that free them.
Here are paths which can free req and rsp:
PATH 1:
qla2x00_probe_one
ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp);
[req and rsp are passed by reference, but if this fails, we currently
do not NULL out req and rsp. Easily fixed]
PATH 2:
qla2x00_probe_one
failing in qla2x00_request_irqs or qla2x00_alloc_queues
probe_failed:
qla2x00_free_device(base_vha);
qla2x00_free_req_que(ha, req)
qla2x00_free_rsp_que(ha, rsp)
PATH 3:
qla2x00_probe_one:
failing in qla2x00_mem_alloc or qla2x00_create_host
probe_hw_failed:
qla2x00_free_req_que(ha, req)
qla2x00_free_rsp_que(ha, rsp)
PATH 1: This should currently work, but it doesn't because rsp and rsp are
not set to NULL in qla2x00_mem_alloc. Easily remedied.
PATH 2: req and rsp aren't passed in at all to qla2x00_free_device but are
derived from ha->req_q_map[0] and ha->rsp_q_map[0]. These are only set up if
qla2x00_alloc_queues succeeds.
In qla2x00_free_queues, we are protected from crashing if these don't exist
because req_qid_map and rsp_qid_map are only set on their allocation. We are
guarded in this way:
for (cnt = 0; cnt < ha->max_req_queues; cnt++) {
if (!test_bit(cnt, ha->req_qid_map))
continue;
PATH 3: This works. We haven't freed req or rsp yet (or they were never
allocated if qla2x00_mem_alloc failed), so we'll attempt to free them here.
To summarize, there are a few small changes to make this work correctly and
(and for some cleanup):
1) (For PATH 1) Set *rsp and *req to NULL in case of failure in
qla2x00_mem_alloc so these are correctly set to NULL back in
qla2x00_probe_one
2) After jumping to probe_failed: and calling qla2x00_free_device,
explicitly set rsp and req to NULL so further calls with these pointers do
not crash, i.e. the free queue calls in the probe_hw_failed section we fall
through to.
3) Fix return code check in the call to qla2x00_alloc_queues. We currently
drop the return code on the floor. The probe fails but the caller of the
probe doesn't have an error code, so it attaches to pci. This can result in
a crash on module shutdown.
4) Remove unnecessary NULL checks in qla2x00_free_req_que,
qla2x00_free_rsp_que, and the egregious NULL checks before kfrees and vfrees
in qla2x00_mem_free.
I tested this out running a scenario where the card breaks at various times
during initialization. I made sure I forced every error exit path in
qla2x00_probe_one.
Cc: <stable@vger.kernel.org> # v4.16
Fixes: 6a2cf8d3663e ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently scsi_dh_lookup() doesn't check for NULL as a device name. This
combined with nvme over dm-mpath results in the following messages
emitted by device-mapper:
device-mapper: multipath: Could not failover device 259:67: Handler scsi_dh_(null) error 14.
Let scsi_dh_lookup() fail fast on NULL names.
[mkp: typo fix]
Cc: <stable@vger.kernel.org> # v4.16
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The first assignment to shost->use_blk_mq is redundant as it is
overwritten by the following statement. Remove this redundant code.
Detected by CoverityScan, CID#1466993 ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Pull kvm updates from Paolo Bonzini:
"ARM:
- VHE optimizations
- EL2 address space randomization
- speculative execution mitigations ("variant 3a", aka execution past
invalid privilege register access)
- bugfixes and cleanups
PPC:
- improvements for the radix page fault handler for HV KVM on POWER9
s390:
- more kvm stat counters
- virtio gpu plumbing
- documentation
- facilities improvements
x86:
- support for VMware magic I/O port and pseudo-PMCs
- AMD pause loop exiting
- support for AMD core performance extensions
- support for synchronous register access
- expose nVMX capabilities to userspace
- support for Hyper-V signaling via eventfd
- use Enlightened VMCS when running on Hyper-V
- allow userspace to disable MWAIT/HLT/PAUSE vmexits
- usual roundup of optimizations and nested virtualization bugfixes
Generic:
- API selftest infrastructure (though the only tests are for x86 as
of now)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (174 commits)
kvm: x86: fix a prototype warning
kvm: selftests: add sync_regs_test
kvm: selftests: add API testing infrastructure
kvm: x86: fix a compile warning
KVM: X86: Add Force Emulation Prefix for "emulate the next instruction"
KVM: X86: Introduce handle_ud()
KVM: vmx: unify adjacent #ifdefs
x86: kvm: hide the unused 'cpu' variable
KVM: VMX: remove bogus WARN_ON in handle_ept_misconfig
Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown"
kvm: Add emulation for movups/movupd
KVM: VMX: raise internal error for exception during invalid protected mode state
KVM: nVMX: Optimization: Dont set KVM_REQ_EVENT when VMExit with nested_run_pending
KVM: nVMX: Require immediate-exit when event reinjected to L2 and L1 event pending
KVM: x86: Fix misleading comments on handling pending exceptions
KVM: x86: Rename interrupt.pending to interrupt.injected
KVM: VMX: No need to clear pending NMI/interrupt on inject realmode interrupt
x86/kvm: use Enlightened VMCS when running on Hyper-V
x86/hyper-v: detect nested features
x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits
...
|
|
|
|
|
|
Pull ARM SA1100 updates from Russell King:
"We have support for arbitary MMIO registers providing platform GPIOs,
which allows us to abstract some of the SA11x0 CF support.
This set of updates makes that change"
* 'for-linus-sa1100' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: sa1100/simpad: switch simpad CF to use gpiod APIs
ARM: sa1100/shannon: convert to generic CF sockets
ARM: sa1100/nanoengine: convert to generic CF sockets
ARM: sa1100/h3xxx: switch h3xxx PCMCIA to use gpiod APIs
ARM: sa1100/cerf: convert to generic CF sockets
ARM: sa1100/assabet: convert to generic CF sockets
ARM: sa1100: provide infrastructure to support generic CF sockets
pcmcia: sa1100: provide generic CF support
|
|
Stephen reports that an x86 allmodconfig build fails to build the
of_pmem driver due to a missing definition of of_node_to_nid(). That
helper is currently only exported in the OF_NUMA=y case. In other cases,
ppc and sparc, it is a weak symbol, and outside of those platforms it is
a static inline.
Until an OF_NUMA=n configuration can reliably support usage of
of_node_to_nid() in modules across architectures, mark this driver as
'bool' instead of 'tristate'.
Cc: Rob Herring <robh@kernel.org>
Cc: Oliver O'Halloran <oohall@gmail.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
- Improvements for the spectre defense:
* The spectre related code is consolidated to a single file
nospec-branch.c
* Automatic enable/disable for the spectre v2 defenses (expoline vs.
nobp)
* Syslog messages for specve v2 are added
* Enable CONFIG_GENERIC_CPU_VULNERABILITIES and define the attribute
functions for spectre v1 and v2
- Add helper macros for assembler alternatives and use them to shorten
the code in entry.S.
- Add support for persistent configuration data via the SCLP Store Data
interface. The H/W interface requires a page table that uses 4K pages
only, the code to setup such an address space is added as well.
- Enable virtio GPU emulation in QEMU. To do this the depends
statements for a few common Kconfig options are modified.
- Add support for format-3 channel path descriptors and add a binary
sysfs interface to export the associated utility strings.
- Add a sysfs attribute to control the IFCC handling in case of
constant channel errors.
- The vfio-ccw changes from Cornelia.
- Bug fixes and cleanups.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (40 commits)
s390/kvm: improve stack frame constants in entry.S
s390/lpp: use assembler alternatives for the LPP instruction
s390/entry.S: use assembler alternatives
s390: add assembler macros for CPU alternatives
s390: add sysfs attributes for spectre
s390: report spectre mitigation via syslog
s390: add automatic detection of the spectre defense
s390: move nobp parameter functions to nospec-branch.c
s390/cio: add util_string sysfs attribute
s390/chsc: query utility strings via fmt3 channel path descriptor
s390/cio: rename struct channel_path_desc
s390/cio: fix unbind of io_subchannel_driver
s390/qdio: split up CCQ handling for EQBS / SQBS
s390/qdio: don't retry EQBS after CCQ 96
s390/qdio: restrict buffer merging to eligible devices
s390/qdio: don't merge ERROR output buffers
s390/qdio: simplify math in get_*_buffer_frontier()
s390/decompressor: trim uncompressed image head during the build
s390/crypto: Fix kernel crash on aes_s390 module remove.
s390/defkeymap: fix global init to zero
...
|
|
handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
polling udp packets with small length(e.g. 1byte udp payload), because setting
VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length.
Ping-Latencies shown below were tested between two Virtual Machines using
netperf (UDP_STREAM, len=1), and then another machine pinged the client:
vq size=256
Packet-Weight Ping-Latencies(millisecond)
min avg max
Origin 3.319 18.489 57.303
64 1.643 2.021 2.552
128 1.825 2.600 3.224
256 1.997 2.710 4.295
512 1.860 3.171 4.631
1024 2.002 4.173 9.056
2048 2.257 5.650 9.688
4096 2.093 8.508 15.943
vq size=512
Packet-Weight Ping-Latencies(millisecond)
min avg max
Origin 6.537 29.177 66.245
64 2.798 3.614 4.403
128 2.861 3.820 4.775
256 3.008 4.018 4.807
512 3.254 4.523 5.824
1024 3.079 5.335 7.747
2048 3.944 8.201 12.762
4096 4.158 11.057 19.985
Seems pretty consistent, a small dip at 2 VQ sizes.
Ring size is a hint from device about a burst size it can tolerate. Based on
benchmarks, set the weight to 2 * vq size.
To evaluate this change, another tests were done using netperf(RR, TX) between
two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
tweaked through qemu. Results shown below does not show obvious changes.
vq size=256 TCP_RR vq size=512 TCP_RR
size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize%
1/ 1/ -7%/ -2% 1/ 1/ 0%/ -2%
1/ 4/ +1%/ 0% 1/ 4/ +1%/ 0%
1/ 8/ +1%/ -2% 1/ 8/ 0%/ +1%
64/ 1/ -6%/ 0% 64/ 1/ +7%/ +3%
64/ 4/ 0%/ +2% 64/ 4/ -1%/ +1%
64/ 8/ 0%/ 0% 64/ 8/ -1%/ -2%
256/ 1/ -3%/ -4% 256/ 1/ -4%/ -2%
256/ 4/ +3%/ +4% 256/ 4/ +1%/ +2%
256/ 8/ +2%/ 0% 256/ 8/ +1%/ -1%
vq size=256 UDP_RR vq size=512 UDP_RR
size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize%
1/ 1/ -5%/ +1% 1/ 1/ -3%/ -2%
1/ 4/ +4%/ +1% 1/ 4/ -2%/ +2%
1/ 8/ -1%/ -1% 1/ 8/ -1%/ 0%
64/ 1/ -2%/ -3% 64/ 1/ +1%/ +1%
64/ 4/ -5%/ -1% 64/ 4/ +2%/ 0%
64/ 8/ 0%/ -1% 64/ 8/ -2%/ +1%
256/ 1/ +7%/ +1% 256/ 1/ -7%/ 0%
256/ 4/ +1%/ +1% 256/ 4/ -3%/ -4%
256/ 8/ +2%/ +2% 256/ 8/ +1%/ +1%
vq size=256 TCP_STREAM vq size=512 TCP_STREAM
size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize%
64/ 1/ 0%/ -3% 64/ 1/ 0%/ 0%
64/ 4/ +3%/ -1% 64/ 4/ -2%/ +4%
64/ 8/ +9%/ -4% 64/ 8/ -1%/ +2%
256/ 1/ +1%/ -4% 256/ 1/ +1%/ +1%
256/ 4/ -1%/ -1% 256/ 4/ -3%/ 0%
256/ 8/ +7%/ +5% 256/ 8/ -3%/ 0%
512/ 1/ +1%/ 0% 512/ 1/ -1%/ -1%
512/ 4/ +1%/ -1% 512/ 4/ 0%/ 0%
512/ 8/ +7%/ -5% 512/ 8/ +6%/ -1%
1024/ 1/ 0%/ -1% 1024/ 1/ 0%/ +1%
1024/ 4/ +3%/ 0% 1024/ 4/ +1%/ 0%
1024/ 8/ +8%/ +5% 1024/ 8/ -1%/ 0%
2048/ 1/ +2%/ +2% 2048/ 1/ -1%/ 0%
2048/ 4/ +1%/ 0% 2048/ 4/ 0%/ -1%
2048/ 8/ -2%/ 0% 2048/ 8/ 5%/ -1%
4096/ 1/ -2%/ 0% 4096/ 1/ -2%/ 0%
4096/ 4/ +2%/ 0% 4096/ 4/ 0%/ 0%
4096/ 8/ +9%/ -2% 4096/ 8/ -5%/ -1%
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Haibin Zhang <haibinzhang@tencent.com>
Signed-off-by: Yunfang Tai <yunfangtai@tencent.com>
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is too expensive to pass u64 values via linked list, instead
allocate array for them by overall number of mac addresses from netdev.
This eventually removes multiple kmalloc() calls, aviod memory
fragmentation and allow to put single null check on kmalloc
return value in order to prevent a potential null pointer dereference.
Addresses-Coverity-ID: 1467429 ("Dereference null return value")
Fixes: 37c3347eb247 ("net: thunderx: add ndo_set_rx_mode callback implementation for VF")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the scheduler tick has been stopped already and the governor
selects a shallow idle state, the CPU can spend a long time in that
state if the selection is based on an inaccurate prediction of idle
time. That effect turns out to be relevant, so it needs to be
mitigated.
To that end, modify the menu governor to discard the result of the
idle time prediction if the tick is stopped and the predicted idle
time is less than the tick period length, unless the tick timer is
going to expire soon.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
If the tick isn't stopped, the target residency of the state selected
by the menu governor may be greater than the actual time to the next
tick and that means lost energy.
To avoid that, make tick_nohz_get_sleep_length() return the current
time to the next event (before stopping the tick) in addition to the
estimated one via an extra pointer argument and make menu_select()
use that value to refine the state selection when necessary.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
The datasheet specifies a 3uS pause after performing a software
reset. The default implementation of genphy_soft_reset() does not
provide this, so implement soft_reset with the needed pause.
Signed-off-by: Esben Haabendal <eha@deif.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This resolves race during initialization where the resources with
ops are registered before driver and the structures used by occ_get
op is initialized. So keep occ_get callbacks registered only when
all structs are initialized.
The example flows, as it is in mlxsw:
1) driver load/asic probe:
mlxsw_core
-> mlxsw_sp_resources_register
-> mlxsw_sp_kvdl_resources_register
-> devlink_resource_register IDX
mlxsw_spectrum
-> mlxsw_sp_kvdl_init
-> mlxsw_sp_kvdl_parts_init
-> mlxsw_sp_kvdl_part_init
-> devlink_resource_size_get IDX (to get the current setup
size from devlink)
-> devlink_resource_occ_get_register IDX (register current
occupancy getter)
2) reload triggered by devlink command:
-> mlxsw_devlink_core_bus_device_reload
-> mlxsw_sp_fini
-> mlxsw_sp_kvdl_fini
-> devlink_resource_occ_get_unregister IDX
(struct mlxsw_sp *mlxsw_sp is freed at this point, call to occ get
which is using mlxsw_sp would cause use-after free)
-> mlxsw_sp_init
-> mlxsw_sp_kvdl_init
-> mlxsw_sp_kvdl_parts_init
-> mlxsw_sp_kvdl_part_init
-> devlink_resource_size_get IDX (to get the current setup
size from devlink)
-> devlink_resource_occ_get_register IDX (register current
occupancy getter)
Fixes: d9f9b9a4d05f ("devlink: Add support for resource abstraction")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This introduces a simpler and generic method for for finding (and mapping)
the TBIPA register.
Instead of relying of complicated logic for finding the TBIPA register
address based on the MDIO or MII register block base
address, which even in some cases relies on undocumented shadow registers,
a second "reg" entry for the mdio bus devicetree node specifies the TBIPA
register.
Backwards compatibility is kept, as the existing logic is applied when
only a single "reg" mapping is specified.
Signed-off-by: Esben Haabendal <eha@deif.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When resetting the ibmvnic driver after a partition migration occurs
there is no requirement to do a reset of the main CRQ. The current
driver code does the required re-enable of the main CRQ, then does
a reset of the main CRQ later.
What we should be doing for a driver reset after a migration is to
re-enable the main CRQ, release all the sub-CRQs, and then allocate
new sub-CRQs after capability negotiation.
This patch updates the handling of mobility resets to do the proper
work and not reset the main CRQ. To do this the initialization/reset
of the main CRQ had to be moved out of the ibmvnic_init routine
and in to the ibmvnic_probe and do_reset routines.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a failover case for a non-redundant pseries VNIC
configuration that was not being handled properly. The current
implementation assumes that the driver will always have a redandant
device to communicate with following a failover notification. There
are cases, however, when a non-redundant configuration can receive
a failover request. If that happens, the driver should wait until
it receives a signal that the device is ready for operation.
The driver is agnostic of its backing hardware configuration,
so this fix necessarily affects all device failover management.
The driver needs to wait until it receives a signal that the device
is ready for resetting. A flag is introduced to track this intermediary
state where the driver is waiting for an active device.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In some cases, if the driver is waiting for a reset following
a device parameter change, failure to schedule a reset can result
in a hang since a completion signal is never sent.
If the device configuration is being altered by a tool such
as ethtool or ifconfig, it could cause the console to hang
if the reset request does not get scheduled. Add some additional
error handling code to exit the wait_for_completion if there is
one in progress.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The counter that tracks used TX descriptors pending completion
needs to be zeroed as part of a device reset. This change fixes
a bug causing transmit queues to be stopped unnecessarily and in
some cases a transmit queue stall and timeout reset. If the counter
is not reset, the remaining descriptors will not be "removed",
effectively reducing queue capacity. If the queue is over half full,
it will cause the queue to stall if stopped.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix some mistakes caught by the DMA debugger. The first change
fixes a unnecessary unmap that should have been removed in an
earlier update. The next hunk fixes another bad unmap by zeroing
the bit checked to determine that an unmap is needed. The final
change fixes some buffers that are unmapped with the wrong
direction specified.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull TPM updates from James Morris:
"This release contains only bug fixes. There are no new major features
added"
* 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
tpm: fix intermittent failure with self tests
tpm: add retry logic
tpm: self test failure should not cause suspend to fail
tpm2: add longer timeouts for creation commands.
tpm_crb: use __le64 annotated variable for response buffer address
tpm: fix buffer type in tpm_transmit_cmd
tpm: tpm-interface: fix tpm_transmit/_cmd kdoc
tpm: cmd_ready command can be issued only after granting locality
|
|
Joe Perches noted that we have a few source files that for some
inexplicable reason (read: I'm too lazy to even go look at the history)
are marked executable:
drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
drivers/net/ethernet/cadence/macb_ptp.c
A simple git command line to show executable C/asm/header files is this:
git ls-files -s '*.[chsS]' | grep '^100755'
and then you can fix them up with scripting by just feeding that output
into:
| cut -f2 | xargs chmod -x
and commit it.
Which is exactly what this commit does.
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
-I2C core now reports proper OF style module alias. I'd like to repeat
the note from the commit msg here (Thanks, Javier!):
NOTE: This patch may break out-of-tree drivers that were relying
on this behavior, and only had an I2C device ID table even
when the device was registered via OF.
There are no remaining drivers in mainline that do this, but
out-of-tree drivers have to be fixed and define a proper OF
device ID table to have module auto-loading working.
- new driver for the SynQuacer I2C controller
- major refactoring of the QUP driver
- the piix4 driver now uses request_muxed_region which should fix a
long standing resource conflict with the sp5100_tco watchdog
- a bunch of small core & driver improvements
* 'i2c/for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (53 commits)
i2c: add support for Socionext SynQuacer I2C controller
dt-bindings: i2c: add binding for Socionext SynQuacer I2C
i2c: Update i2c_trace_msg static key to modern api
i2c: fix parameter of trace_i2c_result
i2c: imx: avoid taking clk_prepare mutex in PM callbacks
i2c: imx: use clk notifier for rate changes
i2c: make i2c_check_addr_validity() static
i2c: rcar: fix mask value of prohibited bit
dt-bindings: i2c: document R8A77965 bindings
i2c: pca-platform: drop gpio from platform data
i2c: pca-platform: use device_property_read_u32
i2c: pca-platform: unconditionally use devm_gpiod_get_optional
sh: sh7785lcr: add GPIO lookup table for i2c controller reset
i2c: qup: reorganization of driver code to remove polling for qup v2
i2c: qup: reorganization of driver code to remove polling for qup v1
i2c: qup: send NACK for last read sub transfers
i2c: qup: fix buffer overflow for multiple msg of maximum xfer len
i2c: qup: change completion timeout according to transfer length
i2c: qup: use the complete transfer length to choose DMA mode
i2c: qup: proper error handling for i2c error in BAM mode
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Notable changes:
- Support for 4PB user address space on 64-bit, opt-in via mmap().
- Removal of POWER4 support, which was accidentally broken in 2016
and no one noticed, and blocked use of some modern instructions.
- Workarounds so that the hypervisor can enable Transactional Memory
on Power9.
- A series to disable the DAWR (Data Address Watchpoint Register) on
Power9.
- More information displayed in the meltdown/spectre_v1/v2 sysfs
files.
- A vpermxor (Power8 Altivec) implementation for the raid6 Q
Syndrome.
- A big series to make the allocation of our pacas (per cpu area),
kernel page tables, and per-cpu stacks NUMA aware when using the
Radix MMU on Power9.
And as usual many fixes, reworks and cleanups.
Thanks to: Aaro Koskinen, Alexandre Belloni, Alexey Kardashevskiy,
Alistair Popple, Andy Shevchenko, Aneesh Kumar K.V, Anshuman Khandual,
Balbir Singh, Benjamin Herrenschmidt, Christophe Leroy, Christophe
Lombard, Cyril Bur, Daniel Axtens, Dave Young, Finn Thain, Frederic
Barrat, Gustavo Romero, Horia Geantă, Jonathan Neuschäfer, Kees Cook,
Larry Finger, Laurent Dufour, Laurent Vivier, Logan Gunthorpe,
Madhavan Srinivasan, Mark Greer, Mark Hairgrove, Markus Elfring,
Mathieu Malaterre, Matt Brown, Matt Evans, Mauricio Faria de Oliveira,
Michael Neuling, Naveen N. Rao, Nicholas Piggin, Paul Mackerras,
Philippe Bergheaud, Ram Pai, Rob Herring, Sam Bobroff, Segher
Boessenkool, Simon Guo, Simon Horman, Stewart Smith, Sukadev
Bhattiprolu, Suraj Jitindar Singh, Thiago Jung Bauermann, Vaibhav
Jain, Vaidyanathan Srinivasan, Vasant Hegde, Wei Yongjun"
* tag 'powerpc-4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (207 commits)
powerpc/64s/idle: Fix restore of AMOR on POWER9 after deep sleep
powerpc/64s: Fix POWER9 DD2.2 and above in cputable features
powerpc/64s: Fix pkey support in dt_cpu_ftrs, add CPU_FTR_PKEY bit
powerpc/64s: Fix dt_cpu_ftrs to have restore_cpu clear unwanted LPCR bits
Revert "powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead"
powerpc: iomap.c: introduce io{read|write}64_{lo_hi|hi_lo}
powerpc: io.h: move iomap.h include so that it can use readq/writeq defs
cxl: Fix possible deadlock when processing page faults from cxllib
powerpc/hw_breakpoint: Only disable hw breakpoint if cpu supports it
powerpc/mm/radix: Update command line parsing for disable_radix
powerpc/mm/radix: Parse disable_radix commandline correctly.
powerpc/mm/hugetlb: initialize the pagetable cache correctly for hugetlb
powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix
powerpc/mm/keys: Update documentation and remove unnecessary check
powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead
powerpc/64s/idle: Consolidate power9_offline_stop()/power9_idle_stop()
powerpc/powernv: Always stop secondaries before reboot/shutdown
powerpc: hard disable irqs in smp_send_stop loop
powerpc: use NMI IPI for smp_send_stop
powerpc/powernv: Fix SMT4 forcing idle code
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull general security layer updates from James Morris:
- Convert security hooks from list to hlist, a nice cleanup, saving
about 50% of space, from Sargun Dhillon.
- Only pass the cred, not the secid, to kill_pid_info_as_cred and
security_task_kill (as the secid can be determined from the cred),
from Stephen Smalley.
- Close a potential race in kernel_read_file(), by making the file
unwritable before calling the LSM check (vs after), from Kees Cook.
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: convert security hooks to use hlist
exec: Set file unwritable before LSM check
usb, signal, security: only pass the cred, not the secid, to kill_pid_info_as_cred and security_task_kill
|
|
After attempting to quickly retrieve known errors the kernel proceeds to
kick off a long running ARS. Add a module option to disable this
behavior at initialization time, or at new region discovery time.
Otherwise, ARS can be started manually regardless of the state of this
setting.
Co-developed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
ARS is an operation that can take 10s to 100s of seconds to find media
errors that should rarely be present. If the platform crashes due to
media errors in persistent memory, the expectation is that the BIOS will
report those known errors in a 'short' ARS request.
A 'short' ARS request asks platform firmware to return an ARS payload
with all known errors, but without issuing a 'long' scrub. At driver
init a short request is issued to all PMEM ranges before registering
regions. Then, in the background, a long ARS is scheduled for each
region.
The ARS implementation is simplified to centralize ARS completion work
in the ars_complete() helper. The timeout is removed since there is no
facility to cancel ARS, and this otherwise arranges for system init to
never be blocked waiting for a 'long' ARS. The ars_state flags are used
to coordinate ARS requests from driver init, ARS requests from
userspace, and ARS requests in response to media error notifications.
Given that there is no notification of ARS completion the implementation
still needs to poll. It backs off exponentially to a maximum poll period
of 30 minutes.
Suggested-by: Toshi Kani <toshi.kani@hpe.com>
Co-developed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
acpi_nfit_query_poison() is awkward in that it requires an nfit_spa
argument in order to determine what max_ars value to use. Instead probe
for the minimum max_ars across all scrub-capable ranges in the system
and drop the nfit_spa argument.
This enables a larger rework / simplification of the ARS state machine
whereby the status can be retrieved once and then iterated over all
address ranges to reap completions.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
This patch adds peliminary device-tree bindings for persistent memory
regions. The driver registers a libnvdimm bus for each pmem-region
node and each address range under the node is converted to a region
within that bus.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
We want to be able to cross reference the region and bus devices
with the device tree node that they were spawned from. libNVDIMM
handles creating the actual devices for these internally, so we
need to pass in a pointer to the relevant node in the descriptor.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The message about constraining number of online cpus to be less than or
equal to ND_MAX_LANES (256) is only useful for block-aperture
configurations and BTT. Make it debug since it is only relevant when
debugging performance.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
GNU Make automatically deletes intermediate files that are updated
in a chain of pattern rules.
Example 1) %.dtb.o <- %.dtb.S <- %.dtb <- %.dts
Example 2) %.o <- %.c <- %.c_shipped
A couple of makefiles mark such targets as .PRECIOUS to prevent Make
from deleting them, but the correct way is to use .SECONDARY.
.SECONDARY
Prerequisites of this special target are treated as intermediate
files but are never automatically deleted.
.PRECIOUS
When make is interrupted during execution, it may delete the target
file it is updating if the file was modified since make started.
If you mark the file as precious, make will never delete the file
if interrupted.
Both can avoid deletion of intermediate files, but the difference is
the behavior when Make is interrupted; .SECONDARY deletes the target,
but .PRECIOUS does not.
The use of .PRECIOUS is relatively rare since we do not want to keep
partially constructed (possibly corrupted) targets.
Another difference is that .PRECIOUS works with pattern rules whereas
.SECONDARY does not.
.PRECIOUS: $(obj)/%.lex.c
works, but
.SECONDARY: $(obj)/%.lex.c
has no effect. However, for the reason above, I do not want to use
.PRECIOUS which could cause obscure build breakage.
The targets specified as .SECONDARY must be explicit. $(targets)
contains all targets that need to include .*.cmd files. So, the
intermediates you want to keep are mostly in there. Therefore, mark
$(targets) as .SECONDARY. It means primary targets are also marked
as .SECONDARY, but I do not see any drawback for this.
I replaced some .SECONDARY / .PRECIOUS markers with 'targets'. This
will make Kbuild search for non-existing .*.cmd files, but this is
not a noticeable performance issue.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Frank Rowand <frowand.list@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
|
|
These are common patterns where source files are parsed by the
asn1_compiler.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Another common pattern that consists of chained commands is to compile
a DTB as binary data into the kernel image or a module. It is used in
several places in the source tree. Support it in the core Makefile.
$(call if_changed,dt_S_dtb) is more suitable than $(call cmd,dt_S_dtb)
in case cmd_dt_S_dtb is changed in the future.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Frank Rowand <frowand.list@gmail.com>
|
|
The following NULL dereference results from incorrectly assuming that
ndd is valid in this print:
struct nvdimm_drvdata *ndd = to_ndd(&nd_region->mapping[i]);
/*
* Give up if we don't find an instance of a uuid at each
* position (from 0 to nd_region->ndr_mappings - 1), or if we
* find a dimm with two instances of the same uuid.
*/
dev_err(&nd_region->dev, "%s missing label for %pUb\n",
dev_name(ndd->dev), nd_label->uuid);
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
IP: nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 43 PID: 673 Comm: kworker/u609:10 Not tainted 4.16.0-rc4+ #1
[..]
RIP: 0010:nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
[..]
Call Trace:
? devres_add+0x2f/0x40
? devm_kmalloc+0x52/0x60
? nd_region_activate+0x9c/0x320 [libnvdimm]
nd_region_probe+0x94/0x260 [libnvdimm]
? kernfs_add_one+0xe4/0x130
nvdimm_bus_probe+0x63/0x100 [libnvdimm]
Switch to using the nvdimm device directly.
Fixes: 0e3b0d123c8f ("libnvdimm, namespace: allow multiple pmem...")
Cc: <stable@vger.kernel.org>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
At initialization time the 'dimm' driver caches a copy of the memory
device's label area and reserves address space for each of the
namespaces defined.
However, as can be seen below, the reservation occurs even when the
index blocks are invalid:
nvdimm nmem0: nvdimm_init_config_data: len: 131072 rc: 0
nvdimm nmem0: config data size: 131072
nvdimm nmem0: __nd_label_validate: nsindex0 labelsize 1 invalid
nvdimm nmem0: __nd_label_validate: nsindex1 labelsize 1 invalid
nvdimm nmem0: : pmem-6025e505: 0x1000000000 @ 0xf50000000 reserve <-- bad
Gate dpa reservation on the presence of valid index blocks.
Cc: <stable@vger.kernel.org>
Fixes: 4a826c83db4e ("libnvdimm: namespace indices: read and validate")
Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Pull VFIO updates from Alex Williamson:
- Adopt iommu_unmap_fast() interface to type1 backend
(Suravee Suthikulpanit)
- mdev sample driver fixup (Shunyong Yang)
- More efficient PFN mapping handling in type1 backend
(Jason Cai)
- VFIO device ioeventfd interface (Alex Williamson)
- Tag new vfio-platform sub-maintainer (Alex Williamson)
* tag 'vfio-v4.17-rc1' of git://github.com/awilliam/linux-vfio:
MAINTAINERS: vfio/platform: Update sub-maintainer
vfio/pci: Add ioeventfd support
vfio/pci: Use endian neutral helpers
vfio/pci: Pull BAR mapping setup from read-write path
vfio/type1: Improve memory pinning process for raw PFN mapping
vfio-mdev/samples: change RDI interrupt condition
vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
|
|
Pull fw_cfg, vhost updates from Michael Tsirkin:
"This cleans up the qemu fw cfg device driver.
On top of this, vmcore is dumped there on crash to help debugging
with kASLR enabled.
Also included are some fixes in vhost"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: add vsock compat ioctl
vhost: fix vhost ioctl signature to build with clang
fw_cfg: write vmcoreinfo details
crash: export paddr_vmcoreinfo_note()
fw_cfg: add DMA register
fw_cfg: add a public uapi header
fw_cfg: handle fw_cfg_read_blob() error
fw_cfg: remove inline from fw_cfg_read_blob()
fw_cfg: fix sparse warnings around FW_CFG_FILE_DIR read
fw_cfg: fix sparse warning reading FW_CFG_ID
fw_cfg: fix sparse warnings with fw_cfg_file
fw_cfg: fix sparse warnings in fw_cfg_sel_endianness()
ptr_ring: fix build
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- move pci_uevent_ers() out of pci.h (Michael Ellerman)
- skip ASPM common clock warning if BIOS already configured it (Sinan
Kaya)
- fix ASPM Coverity warning about threshold_ns (Gustavo A. R. Silva)
- remove last user of pci_get_bus_and_slot() and the function itself
(Sinan Kaya)
- add decoding for 16 GT/s link speed (Jay Fang)
- add interfaces to get max link speed and width (Tal Gilboa)
- add pcie_bandwidth_capable() to compute max supported link bandwidth
(Tal Gilboa)
- add pcie_bandwidth_available() to compute bandwidth available to
device (Tal Gilboa)
- add pcie_print_link_status() to log link speed and whether it's
limited (Tal Gilboa)
- use PCI core interfaces to report when device performance may be
limited by its slot instead of doing it in each driver (Tal Gilboa)
- fix possible cpqphp NULL pointer dereference (Shawn Lin)
- rescan more of the hierarchy on ACPI hotplug to fix Thunderbolt/xHCI
hotplug (Mika Westerberg)
- add support for PCI I/O port space that's neither directly accessible
via CPU in/out instructions nor directly mapped into CPU physical
memory space. This is fairly intrusive and includes minor changes to
interfaces used for I/O space on most platforms (Zhichang Yuan, John
Garry)
- add support for HiSilicon Hip06/Hip07 LPC I/O space (Zhichang Yuan,
John Garry)
- use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas)
- remove possible NULL pointer dereference in of_pci_bus_find_domain_nr()
(Shawn Lin)
- report quirk timings with dev_info (Bjorn Helgaas)
- report quirks that take longer than 10ms (Bjorn Helgaas)
- add and use Altera Vendor ID (Johannes Thumshirn)
- tidy Makefiles and comments (Bjorn Helgaas)
- don't set up INTx if MSI or MSI-X is enabled to align cris, frv,
ia64, and mn10300 with x86 (Bjorn Helgaas)
- move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick
Lawler)
- merge pcieport_if.h into portdrv.h (Bjorn Helgaas)
- move workaround for BIOS PME issue from portdrv to PCI core (Bjorn
Helgaas)
- completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas)
- remove portdrv link order dependency (Bjorn Helgaas)
- remove support for unused VC portdrv service (Bjorn Helgaas)
- simplify portdrv feature permission checking (Bjorn Helgaas)
- remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn
Helgaas)
- remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas)
- use cached AER capability offset (Frederick Lawler)
- don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg)
- rename pcie-dpc.c to dpc.c (Bjorn Helgaas)
- use generic pci_mmap_resource_range() instead of powerpc and xtensa
arch-specific versions (David Woodhouse)
- support arbitrary PCI host bridge offsets on sparc (Yinghai Lu)
- remove System and Video ROM reservations on sparc (Bjorn Helgaas)
- probe for device reset support during enumeration instead of runtime
(Bjorn Helgaas)
- add ACS quirk for Ampere (née APM) root ports (Feng Kan)
- add function 1 DMA alias quirk for Marvell 88SE9220 (Thomas
Vincent-Cross)
- protect device restore with device lock (Sinan Kaya)
- handle failure of FLR gracefully (Sinan Kaya)
- handle CRS (config retry status) after device resets (Sinan Kaya)
- skip various config reads for SR-IOV VFs as an optimization
(KarimAllah Ahmed)
- consolidate VPD code in vpd.c (Bjorn Helgaas)
- add Tegra dependency on PCI_MSI_IRQ_DOMAIN (Arnd Bergmann)
- add DT support for R-Car r8a7743 (Biju Das)
- fix a PCI_EJECT vs PCI_BUS_RELATIONS race condition in Hyper-V host
bridge driver that causes a general protection fault (Dexuan Cui)
- fix Hyper-V host bridge hang in MSI setup on 1-vCPU VMs with SR-IOV
(Dexuan Cui)
- fix Hyper-V host bridge hang when ejecting a VF before setting up MSI
(Dexuan Cui)
- make several structures static (Fengguang Wu)
- increase number of MSI IRQs supported by Synopsys DesignWare bridges
from 32 to 256 (Gustavo Pimentel)
- implemented multiplexed IRQ domain API and remove obsolete MSI IRQ
API from DesignWare drivers (Gustavo Pimentel)
- add Tegra power management support (Manikanta Maddireddy)
- add Tegra loadable module support (Manikanta Maddireddy)
- handle 64-bit BARs correctly in endpoint support (Niklas Cassel)
- support optional regulator for HiSilicon STB (Shawn Guo)
- use regulator bulk API for Qualcomm apq8064 (Srinivas Kandagatla)
- support power supplies for Qualcomm msm8996 (Srinivas Kandagatla)
* tag 'pci-v4.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (123 commits)
MAINTAINERS: Add John Garry as maintainer for HiSilicon LPC driver
HISI LPC: Add ACPI support
ACPI / scan: Do not enumerate Indirect IO host children
ACPI / scan: Rename acpi_is_serial_bus_slave() for more general use
HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings
of: Add missing I/O range exception for indirect-IO devices
PCI: Apply the new generic I/O management on PCI IO hosts
PCI: Add fwnode handler as input param of pci_register_io_range()
PCI: Remove __weak tag from pci_register_io_range()
MAINTAINERS: Add missing /drivers/pci/cadence directory entry
fm10k: Report PCIe link properties with pcie_print_link_status()
net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth
net/mlx5: Report PCIe link properties with pcie_print_link_status()
net/mlx4_core: Report PCIe link properties with pcie_print_link_status()
PCI: Add pcie_print_link_status() to log link speed and whether it's limited
PCI: Add pcie_bandwidth_available() to compute bandwidth available to device
misc: pci_endpoint_test: Handle 64-bit BARs properly
PCI: designware-ep: Make dw_pcie_ep_reset_bar() handle 64-bit BARs properly
PCI: endpoint: Make sure that BAR_5 does not have 64-bit flag set when clearing
PCI: endpoint: Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct *epf_bar
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Doug and I are at a conference next week so if another PR is sent I
expect it to only be bug fixes. Parav noted yesterday that there are
some fringe case behavior changes in his work that he would like to
fix, and I see that Intel has a number of rc looking patches for HFI1
they posted yesterday.
Parav is again the biggest contributor by patch count with his ongoing
work to enable container support in the RDMA stack, followed by Leon
doing syzkaller inspired cleanups, though most of the actual fixing
went to RC.
There is one uncomfortable series here fixing the user ABI to actually
work as intended in 32 bit mode. There are lots of notes in the commit
messages, but the basic summary is we don't think there is an actual
32 bit kernel user of drivers/infiniband for several good reasons.
However we are seeing people want to use a 32 bit user space with 64
bit kernel, which didn't completely work today. So in fixing it we
required a 32 bit rxe user to upgrade their userspace. rxe users are
still already quite rare and we think a 32 bit one is non-existing.
- Fix RDMA uapi headers to actually compile in userspace and be more
complete
- Three shared with netdev pull requests from Mellanox:
* 7 patches, mostly to net with 1 IB related one at the back).
This series addresses an IRQ performance issue (patch 1),
cleanups related to the fix for the IRQ performance problem
(patches 2-6), and then extends the fragmented completion queue
support that already exists in the net side of the driver to the
ib side of the driver (patch 7).
* Mostly IB, with 5 patches to net that are needed to support the
remaining 10 patches to the IB subsystem. This series extends
the current 'representor' framework when the mlx5 driver is in
switchdev mode from being a netdev only construct to being a
netdev/IB dev construct. The IB dev is limited to raw Eth queue
pairs only, but by having an IB dev of this type attached to the
representor for a switchdev port, it enables DPDK to work on the
switchdev device.
* All net related, but needed as infrastructure for the rdma
driver
- Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers
- SRP performance updates
- IB uverbs write path cleanup patch series from Leon
- Add RDMA_CM support to ib_srpt. This is disabled by default. Users
need to set the port for ib_srpt to listen on in configfs in order
for it to be enabled
(/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port)
- TSO and Scatter FCS support in mlx4
- Refactor of modify_qp routine to resolve problems seen while
working on new code that is forthcoming
- More refactoring and updates of RDMA CM for containers support from
Parav
- mlx5 'fine grained packet pacing', 'ipsec offload' and 'device
memory' user API features
- Infrastructure updates for the new IOCTL interface, based on
increased usage
- ABI compatibility bug fixes to fully support 32 bit userspace on 64
bit kernel as was originally intended. See the commit messages for
extensive details
- Syzkaller bugs and code cleanups motivated by them"
* tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits)
IB/rxe: Fix for oops in rxe_register_device on ppc64le arch
IB/mlx5: Device memory mr registration support
net/mlx5: Mkey creation command adjustments
IB/mlx5: Device memory support in mlx5_ib
net/mlx5: Query device memory capabilities
IB/uverbs: Add device memory registration ioctl support
IB/uverbs: Add alloc/free dm uverbs ioctl support
IB/uverbs: Add device memory capabilities reporting
IB/uverbs: Expose device memory capabilities to user
RDMA/qedr: Fix wmb usage in qedr
IB/rxe: Removed GID add/del dummy routines
RDMA/qedr: Zero stack memory before copying to user space
IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR
IB/mlx5: Add information for querying IPsec capabilities
IB/mlx5: Add IPsec support for egress and ingress
{net,IB}/mlx5: Add ipsec helper
IB/mlx5: Add modify_flow_action_esp verb
IB/mlx5: Add implementation for create and destroy action_xfrm
IB/uverbs: Introduce ESP steering match filter
IB/uverbs: Add modify ESP flow_action
...
|