Age | Commit message (Collapse) | Author |
|
The arguments for trace_hfi1_rcvhdr() get computed every
time in the hot path regardless of the whether the trace
is on or off. This is seen to be costly with a profile.
The handling of fault inject isolates the verbs device for
all packets regardless of the presence of a RHF_DC_ERR error.
Fix the first by computing trace_hfi1_rcvhdr() arguments within
the trace itself, so that when the trace is off, the argument
data isn't computed. Fix the second by moving the error check to
handle_eflags() when an RHF error occurs and by testing for
RHF_DC_ERR before executing the reset of handle_eflags().
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
packet->fecn and packet->becn are calculated in the hot path
and are never used. Remove these fields as they show to be
costly in a profile. Also, remove initialization for
becn and fecn in process_ecn() as they're unconditionally
assigned in the function and ensure fecn and becn variables
use a boolean type.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
In the receive path, hfi1_ibport is looked up by indexing into an
array. A profile shows this to be expensive. The receive context
data has a pointer to the ibport data, use that pointer instead.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The packet type comparison used to find out if a packet is a bypass
packet in the hot path is an expensive operation as seen in a profile.
Determine packet's pkey and migration bit through the bypass and 9B
code paths instead.
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
In hfi1_rc_rcv(), BTH is computed for all packets received.
However, it's only used for packets received with opcodes
RDMA_WRITE_LAST and SEND_LAST, and it is a costly operation.
Compute BTH only in the RDMA_WRITE_LAST/SEND_LAST code path
and let the compiler handle endianness conversion for bitwise
operations.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The s_hdrwords variable was used to indicate whether a
packet was already built on a previous iteration of the
send engine. This variable assumed the protection of the
QP's RVT_S_BUSY flag, which was required since the the
QP's s_lock was dropped just prior to the packet being
queued on the one of the egress mechanisms.
Support for multiple send engine instantiations require
that the field not be used due to concurency issues.
The ps.txreq signals the "already built" without the
potential concurency issues.
Fix by getting rid of all s_hdrword usage. A wrapper
is added to test for the already built case that used to
use s_hdrwords.
What used to be stored in s_hdrwords is now in the txreq.
The PBC is not counted, but is added in the pio/sdma code
paths prior to posting the packet.
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The dd refcount is speculatively incremented prior to allocating
the fd memory with kzalloc(). If that kzalloc() failed the dd
refcount leaks.
Increment refcount on kzalloc success.
Fixes: e11ffbd57520 ("IB/hfi1: Do not free hfi1 cdev parent structure early")
Reviewed-by: Michael J Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
With IRQF_SHARED flag set and CONFIG_DEBUG_SHIRQ enabled
module removal may result in panic in sdma_interrupt() routine
if associated sdma context was released before pci_free_irq();
[ 9198.939885] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 9198.940514] IP: sdma_make_progress+0xa5/0x450 [hfi1]
[ 9198.941114] PGD 170bdc0067 P4D 170bdc0067 PUD 172063e067 PMD 0
[ 9198.941783] Oops: 0000 [#1] SMP
.....
[ 9198.958877] CPU: 132 PID: 64173 Comm: rmmod Tainted: G OE 4.14.0-rc4+ #1
[ 9198.961032] Hardware name: Intel Corporation S7200AP/S7200AP, BIOS S72C610.86B.01.02.0118.080620171935 08/06/2017
[ 9198.963323] task: ffff9681397f0000 task.stack: ffffae1647c40000
[ 9198.965695] RIP: 0010:sdma_make_progress+0xa5/0x450 [hfi1]
[ 9198.968082] RSP: 0018:ffffae1647c43be8 EFLAGS: 00010046
[ 9198.970503] RAX: 0000000000000000 RBX: ffff9680ce8b5ca8 RCX: 0000000000000000
[ 9198.973006] RDX: 0000000000000000 RSI: 0000000001a00d28 RDI: ffff9680ce8b5ca0
[ 9198.975546] RBP: ffffae1647c43c40 R08: ffff96814325ec00 R09: 00000000ffffffff
[ 9198.978142] R10: 000000004325e501 R11: ffff96814325ec00 R12: ffff9680ce8b5c44
[ 9198.980779] R13: ffff9680ce8b5ca0 R14: 0000000000000000 R15: ffff9680ce8b5b00
[ 9198.983462] FS: 00007f31196ba740(0000) GS:ffff96819df00000(0000) knlGS:0000000000000000
[ 9198.986231] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9198.989036] CR2: 0000000000000000 CR3: 000000170833f000 CR4: 00000000001406e0
[ 9198.991911] Call Trace:
[ 9198.994847] sdma_engine_interrupt+0x82/0x100 [hfi1]
[ 9198.997852] sdma_interrupt+0x61/0xc0 [hfi1]
[ 9199.000852] __free_irq+0x1b3/0x2d0
[ 9199.003873] free_irq+0x35/0x70
[ 9199.006909] pci_free_irq+0x1c/0x30
[ 9199.009999] clean_up_interrupts+0x53/0xf0 [hfi1]
[ 9199.013137] hfi1_start_cleanup+0x117/0x190 [hfi1]
[ 9199.016315] postinit_cleanup+0x1d/0x270 [hfi1]
[ 9199.019529] remove_one+0x1f3/0x210 [hfi1]
[ 9199.022738] pci_device_remove+0x39/0xc0
[ 9199.025974] device_release_driver_internal+0x141/0x210
[ 9199.029268] driver_detach+0x3f/0x80
[ 9199.032580] bus_remove_driver+0x55/0xd0
[ 9199.035931] driver_unregister+0x2c/0x50
[ 9199.039321] pci_unregister_driver+0x2a/0xa0
[ 9199.042755] hfi1_mod_cleanup+0x10/0xb50 [hfi1]
[ 9199.046196] SyS_delete_module+0x171/0x250
...
Fix by exporting sdma_clean() and removing from sdma_exit().
sdma_exit() now just manipulates the engine state,
leaving the memory free to sdma_clean() which is now called
just before the dd is freed.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Michael J Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The pci_request_irq() interfaces always adds the IRQF_SHARED bit to
all IRQ requests.
When the kernel is built with CONFIG_DEBUG_SHIRQ config flag, if the
IRQF_SHARED bit is set, a call to the IRQ handler is made from the
__free_irq() function. This is testing a race condition between the
IRQ cleanup and an IRQ racing the cleanup. The HFI driver should be
able to handle this race, but does not.
This race can cause traces that start with this footprint:
BUG: unable to handle kernel NULL pointer dereference at (null)
Call Trace:
<hfi1 irq handler>
...
__free_irq+0x1b3/0x2d0
free_irq+0x35/0x70
pci_free_irq+0x1c/0x30
clean_up_interrupts+0x53/0xf0 [hfi1]
hfi1_start_cleanup+0x122/0x190 [hfi1]
postinit_cleanup+0x1d/0x280 [hfi1]
remove_one+0x233/0x250 [hfi1]
pci_device_remove+0x39/0xc0
Export IRQ cleanup function so it can be called from other modules.
Using the exported cleanup function:
Re-order the driver cleanup code to clean up IRQ resources before
other resources, eliminating the race.
Re-order error path for init so that the race does not occur.
Reduce severity on spurious error message for SDMA IRQs to info.
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Reviewed-by: Patel Jay P <jay.p.patel@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
We should return -ENOMEM if the allocation fails. The current code
accidentally returns success.
Fixes: bf3c5a93c523 ("RDMA/nldev: Provide global resource utilization")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The mtt_table is cleaned up during the err_unmap_cqe label, it is a
mistake to duplicate the cleanup during the later unwind labels.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
This patch mainly fix some style warings matched with the new checkpatch
requirement. The warning as follows:
WARNING: function definition argument 'struct hns_roce_cq *' should also have
an identifier name
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The double not-operator is unncessary when used in a boolean context. This
patch removes them.
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Add a jump target so that a bit of exception handling can be better reused
at the end of this function.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
bnxt_qplib_alloc_dpi_tbl()
Omit extra messages for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Adding NFIT platform capabilities sub table in nfit_test simulated ACPI
NFIT table. Only the first NFIT table is added with the capability
sub-table.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
|
|
Providing a sysfs attribute for nd_region that shows the persistence
capabilities for the platform.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
|
|
Propagate the ADR attribute flag from the NFIT platform capabilities
sub-table to nd_region.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
|
|
In ACPI 6.2a the platform capability structure has been added to the NFIT
tables. That provides software the ability to determine whether a system
supports the auto flushing of CPU caches on power loss. If the capability
is supported, we do not need to do dax_flush(). Plumbing the path to set the
property on per region from the NFIT tables.
This patch depends on the ACPI NFIT 6.2a platform capabilities support code
in include/acpi/actbl1.h.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
|
|
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:
- Add a console_msg_format command line option:
The value "default" keeps the old "[time stamp] text\n" format. The
value "syslog" allows to see the syslog-like "<log
level>[timestamp] text" format.
This feature was requested by people doing regression tests, for
example, 0day robot. They want to have both filtered and full logs
at hands.
- Reduce the risk of softlockup:
Pass the console owner in a busy loop.
This is a new approach to the old problem. It was first proposed by
Steven Rostedt on Kernel Summit 2017. It marks a context in which
the console_lock owner calls console drivers and could not sleep.
On the other side, printk() callers could detect this state and use
a busy wait instead of a simple console_trylock(). Finally, the
console_lock owner checks if there is a busy waiter at the end of
the special context and eventually passes the console_lock to the
waiter.
The hand-off works surprisingly well and helps in many situations.
Well, there is still a possibility of the softlockup, for example,
when the flood of messages stops and the last owner still has too
much to flush.
There is increasing number of people having problems with
printk-related softlockups. We might eventually need to get better
solution. Anyway, this looks like a good start and promising
direction.
- Do not allow to schedule in console_unlock() called from printk():
This reverts an older controversial commit. The reschedule helped
to avoid softlockups. But it also slowed down the console output.
This patch is obsoleted by the new console waiter logic described
above. In fact, the reschedule made the hand-off less effective.
- Deprecate "%pf" and "%pF" format specifier:
It was needed on ia64, ppc64 and parisc64 to dereference function
descriptors and show the real function address. It is done
transparently by "%ps" and "pS" format specifier now.
Sergey Senozhatsky found that all the function descriptors were in
a special elf section and could be easily detected.
- Remove printk_symbol() API:
It has been obsoleted by "%pS" format specifier, and this change
helped to remove few continuous lines and a less intuitive old API.
- Remove redundant memsets:
Sergey removed unnecessary memset when processing printk.devkmsg
command line option.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (27 commits)
printk: drop redundant devkmsg_log_str memsets
printk: Never set console_may_schedule in console_trylock()
printk: Hide console waiter logic into helpers
printk: Add console owner and waiter logic to load balance console writes
kallsyms: remove print_symbol() function
checkpatch: add pF/pf deprecation warning
symbol lookup: introduce dereference_symbol_descriptor()
parisc64: Add .opd based function descriptor dereference
powerpc64: Add .opd based function descriptor dereference
ia64: Add .opd based function descriptor dereference
sections: split dereference_function_descriptor()
openrisc: Fix conflicting types for _exext and _stext
lib: do not use print_symbol()
irq debug: do not use print_symbol()
sysfs: do not use print_symbol()
drivers: do not use print_symbol()
x86: do not use print_symbol()
unicore32: do not use print_symbol()
sh: do not use print_symbol()
mn10300: do not use print_symbol()
...
|
|
Pull VFIO updates from Alex Williamson:
- Mask INTx from user if pdev->irq is zero (Alexey Kardashevskiy)
- Capability helper cleanup (Alex Williamson)
- Allow mmaps overlapping MSI-X vector table with region capability
exposing this feature (Alexey Kardashevskiy)
- mdev static cleanups (Xiongwei Song)
* tag 'vfio-v4.16-rc1' of git://github.com/awilliam/linux-vfio:
vfio: mdev: make a couple of functions and structure vfio_mdev_driver static
vfio-pci: Allow mapping MSIX BAR
vfio: Simplify capability helper
vfio-pci: Mask INTx if a device is not capabable of enabling it
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"There's not much changes for the tracing system this release. Mostly
small clean ups and fixes.
The biggest change is to how bprintf works. bprintf is used by
trace_printk() to just save the format and args of a printf call, and
the formatting is done when the trace buffer is read. This is done to
keep the formatting out of the fast path (this was recommended by
you). The issue is when arguments are de-referenced.
If a pointer is saved, and the format has something like "%*pbl", when
the buffer is read, it will de-reference the argument then. The
problem is if the data no longer exists. This can cause the kernel to
oops.
The fix for this was to make these de-reference pointes do the
formatting at the time it is called (the fast path), as this
guarantees that the data exists (and doesn't change later)"
* tag 'trace-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
vsprintf: Do not have bprintf dereference pointers
ftrace: Mark function tracer test functions noinline/noclone
trace_uprobe: Display correct offset in uprobe_events
tracing: Make sure the parsed string always terminates with '\0'
tracing: Clear parser->idx if only spaces are read
tracing: Detect the string nul character when parsing user input string
|
|
I ran into an issue on my laptop that triggered a bug on the
discard path:
WARNING: CPU: 2 PID: 207 at drivers/nvme/host/core.c:527 nvme_setup_cmd+0x3d3/0x430
Modules linked in: rfcomm fuse ctr ccm bnep arc4 binfmt_misc snd_hda_codec_hdmi nls_iso8859_1 nls_cp437 vfat snd_hda_codec_conexant fat snd_hda_codec_generic iwlmvm snd_hda_intel snd_hda_codec snd_hwdep mac80211 snd_hda_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq x86_pkg_temp_thermal intel_powerclamp kvm_intel uvcvideo iwlwifi btusb snd_seq_device videobuf2_vmalloc btintel videobuf2_memops kvm snd_timer videobuf2_v4l2 bluetooth irqbypass videobuf2_core aesni_intel aes_x86_64 crypto_simd cryptd snd glue_helper videodev cfg80211 ecdh_generic soundcore hid_generic usbhid hid i915 psmouse e1000e ptp pps_core xhci_pci xhci_hcd intel_gtt
CPU: 2 PID: 207 Comm: jbd2/nvme0n1p7- Tainted: G U 4.15.0+ #176
Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET59W (1.33 ) 12/19/2017
RIP: 0010:nvme_setup_cmd+0x3d3/0x430
RSP: 0018:ffff880423e9f838 EFLAGS: 00010217
RAX: 0000000000000000 RBX: ffff880423e9f8c8 RCX: 0000000000010000
RDX: ffff88022b200010 RSI: 0000000000000002 RDI: 00000000327f0000
RBP: ffff880421251400 R08: ffff88022b200000 R09: 0000000000000009
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000ffff
R13: ffff88042341e280 R14: 000000000000ffff R15: ffff880421251440
FS: 0000000000000000(0000) GS:ffff880441500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b684795030 CR3: 0000000002e09006 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
nvme_queue_rq+0x40/0xa00
? __sbitmap_queue_get+0x24/0x90
? blk_mq_get_tag+0xa3/0x250
? wait_woken+0x80/0x80
? blk_mq_get_driver_tag+0x97/0xf0
blk_mq_dispatch_rq_list+0x7b/0x4a0
? deadline_remove_request+0x49/0xb0
blk_mq_do_dispatch_sched+0x4f/0xc0
blk_mq_sched_dispatch_requests+0x106/0x170
__blk_mq_run_hw_queue+0x53/0xa0
__blk_mq_delay_run_hw_queue+0x83/0xa0
blk_mq_run_hw_queue+0x6c/0xd0
blk_mq_sched_insert_request+0x96/0x140
__blk_mq_try_issue_directly+0x3d/0x190
blk_mq_try_issue_directly+0x30/0x70
blk_mq_make_request+0x1a4/0x6a0
generic_make_request+0xfd/0x2f0
? submit_bio+0x5c/0x110
submit_bio+0x5c/0x110
? __blkdev_issue_discard+0x152/0x200
submit_bio_wait+0x43/0x60
ext4_process_freed_data+0x1cd/0x440
? account_page_dirtied+0xe2/0x1a0
ext4_journal_commit_callback+0x4a/0xc0
jbd2_journal_commit_transaction+0x17e2/0x19e0
? kjournald2+0xb0/0x250
kjournald2+0xb0/0x250
? wait_woken+0x80/0x80
? commit_timeout+0x10/0x10
kthread+0x111/0x130
? kthread_create_worker_on_cpu+0x50/0x50
? do_group_exit+0x3a/0xa0
ret_from_fork+0x1f/0x30
Code: 73 89 c1 83 ce 10 c1 e1 10 09 ca 83 f8 04 0f 87 0f ff ff ff 8b 4d 20 48 8b 7d 00 c1 e9 09 48 01 8c c7 00 08 00 00 e9 f8 fe ff ff <0f> ff 4c 89 c7 41 bc 0a 00 00 00 e8 0d 78 d6 ff e9 a1 fc ff ff
---[ end trace 50d361cc444506c8 ]---
print_req_error: I/O error, dev nvme0n1, sector 847167488
Decoding the assembly, the request claims to have 0xffff segments,
while nvme counts two. This turns out to be because we don't check
for a data carrying request on the mq scheduler path, and since
blk_phys_contig_segment() returns true for a non-data request,
we decrement the initial segment count of 0 and end up with
0xffff in the unsigned short.
There are a few issues here:
1) We should initialize the segment count for a discard to 1.
2) The discard merging is currently using the data limits for
segments and sectors.
Fix this up by having attempt_merge() correctly identify the
request, and by initializing the segment count correctly
for discards.
This can only be triggered with mq-deadline on discard capable
devices right now, which isn't a common configuration.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Merge KASAN word-at-a-time fixups from Andrey Ryabinin.
The word-at-a-time optimizations have caused headaches for KASAN, since
the whole point is that we access byte streams in bigger chunks, and
KASAN can be unhappy about the potential extra access at the end of the
string.
We used to have a horrible hack in dcache, and then people got
complaints from the strscpy() case. This fixes it all up properly, by
adding an explicit helper for the "access byte stream one word at a
time" case.
* emailed patches from Andrey Ryabinin <aryabinin@virtuozzo.com>:
fs: dcache: Revert "manually unpoison dname after allocation to shut up kasan's reports"
fs/dcache: Use read_word_at_a_time() in dentry_string_cmp()
lib/strscpy: Shut up KASAN false-positives in strscpy()
compiler.h: Add read_word_at_a_time() function.
compiler.h, kasan: Avoid duplicating __read_once_size_nocheck()
|
|
kasan's reports"
This reverts commit df4c0e36f1b1782b0611a77c52cc240e5c4752dd.
It's no longer needed since dentry_string_cmp() now uses
read_word_at_a_time() to avoid kasan's reports.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
dentry_string_cmp() performs the word-at-a-time reads from 'cs' and may
read slightly more than it was requested in kmallac(). Normally this
would make KASAN to report out-of-bounds access, but this was
workarounded by commit df4c0e36f1b1 ("fs: dcache: manually unpoison
dname after allocation to shut up kasan's reports").
This workaround is not perfect, since it allows out-of-bounds access to
dentry's name for all the code, not just in dentry_string_cmp().
So it would be better to use read_word_at_a_time() instead and revert
commit df4c0e36f1b1.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
strscpy() performs the word-at-a-time optimistic reads. So it may may
access the memory past the end of the object, which is perfectly fine
since strscpy() doesn't use that (past-the-end) data and makes sure the
optimistic read won't cross a page boundary.
Use new read_word_at_a_time() to shut up the KASAN.
Note that this potentially could hide some bugs. In example bellow,
stscpy() will copy more than we should (1-3 extra uninitialized bytes):
char dst[8];
char *src;
src = kmalloc(5, GFP_KERNEL);
memset(src, 0xff, 5);
strscpy(dst, src, 8);
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Sometimes we know that it's safe to do potentially out-of-bounds access
because we know it won't cross a page boundary. Still, KASAN will
report this as a bug.
Add read_word_at_a_time() function which is supposed to be used in such
cases. In read_word_at_a_time() KASAN performs relaxed check - only the
first byte of access is validated.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Instead of having two identical __read_once_size_nocheck() functions
with different attributes, consolidate all the difference in new macro
__no_kasan_or_inline and use it. No functional changes.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This implements ndo_poll_controller callback which is necessary to
enable netconsole.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
Timer callbacks do not ensure this locking.
=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
the VIOS server
Older versions of VIOS servers do not send the firmware level in the VPD
buffer for the ibmvnic driver. Thus, not only the current message is mis-
leading but the firmware version in the ethtool will be NULL. Therefore,
this patch fixes the firmware string and its warning.
Fixes: 4e6759be28e4 ("ibmvnic: Feature implementation of VPD for the ibmvnic driver")
Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pointer rq is being initialized but this value is never read, it
is being updated inside a for-loop. Remove the initialization and
move it into the scope of the for-loop.
Cleans up clang warning:
drivers/net/vmxnet3/vmxnet3_drv.c:2763:27: warning: Value stored
to 'rq' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pointer phydev is initialized and this value is never read, phydev
is immediately updated to a new value, hence this initialization
is redundant and can be removed
Cleans up clang warning:
drivers/net/usb/lan78xx.c:2009:21: warning: Value stored to 'phydev'
during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pointer rxdesc is assigned a value that is never read, it is overwritten
by a new assignment inside a while loop hence the initial assignment
is redundant and can be removed.
Cleans up clang warning:
drivers/net/ethernet/jme.c:1074:17: warning: Value stored to 'rxdesc'
during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Edits for grammar, punctuation, and a doubled-up word.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andreas Noever <andreas.noever@gmail.com>
Cc: Michael Jamet <michael.jamet@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Yehezkel Bernat <yehezkel.bernat@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig updates from Masahiro Yamada:
"A pretty big batch of Kconfig updates.
I have to mention the lexer and parser of Kconfig are now built from
real .l and .y sources. So, flex and bison are the requirement for
building the kernel. Both of them (unlike gperf) have been stable for
a long time. This change has been tested several weeks in linux-next,
and I did not receive any problem report about this.
Summary:
- add checks for mistakes, like the choice default is not in choice,
help is doubled
- document data structure and complex code
- fix various memory leaks
- change Makefile to build lexer and parser instead of using
pre-generated C files
- drop 'boolean' keyword, which is equivalent to 'bool'
- use default 'yy' prefix and remove unneeded Make variables
- fix gettext() check for xconfig
- announce that oldnoconfig will be finally removed
- make 'Selected by:' and 'Implied by' readable in help and search
result
- hide silentoldconfig from 'make help' to stop confusing people
- fix misc things and cleanups"
* tag 'kconfig-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (37 commits)
kconfig: Remove silentoldconfig from help and docs; fix kconfig/conf's help
kconfig: make "Selected by:" and "Implied by:" readable
kconfig: announce removal of oldnoconfig if used
kconfig: fix make xconfig when gettext is missing
kconfig: Clarify menu and 'if' dependency propagation
kconfig: Document 'if' flattening logic
kconfig: Clarify choice dependency propagation
kconfig: Document SYMBOL_OPTIONAL logic
kbuild: remove unnecessary LEX_PREFIX and YACC_PREFIX
kconfig: use default 'yy' prefix for lexer and parser
kconfig: make conf_unsaved a local variable of conf_read()
kconfig: make xfgets() really static
kconfig: make input_mode static
kconfig: Warn if there is more than one help text
kconfig: drop 'boolean' keyword
kconfig: use bool instead of boolean for type definition attributes, again
kconfig: Remove menu_end_entry()
kconfig: Document important expression functions
kconfig: Document automatic submenu creation code
kconfig: Fix choice symbol expression leak
...
|
|
IDEDMA_AUTO IDEDMA_PCI_AUTO was removed in commit 120b9cfddff2 ("ide: remove CONFIG_IDEDMA_{ICS,PCI}_AUTO config")
BLK_DEV_IDEDISK was removed in commit 806f80a6fc20 ("ide: add generic ATA/ATAPI disk driver")
BLK_DEV_IDE_AU1XXX_BURSTABLE_ON was removed in commit 8f29e650bffc ("ide: AU1200 IDE update")
Remove them from documentation
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild misc updates from Masahiro Yamada:
- add snap-pkg target to create Linux kernel snap package
- make out-of-tree creation of source packages fail correctly
- improve and fix several semantic patches
* tag 'kbuild-misc-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
Coccinelle: coccicheck: fix typo
Coccinelle: memdup: drop spurious line
Coccinelle: kzalloc-simple: Rename kzalloc-simple to zalloc-simple
Coccinelle: ifnullfree: Trim the warning reported in report mode
Coccinelle: alloc_cast: Add more memory allocating functions to the list
Coccinelle: array_size: report even if include is missing
Coccinelle: kzalloc-simple: Add all zero allocating functions
kbuild: pkg: make out-of-tree rpm/deb-pkg build immediately fail
scripts/package: snap-pkg target
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Fix OOM that syskaller triggers with ipt_replace.size = -1 and
IPT_SO_SET_REPLACE socket option, from Dmitry Vyukov.
2) Check for too long extension name in xt_request_find_{match|target}
that result in out-of-bound reads, from Eric Dumazet.
3) Fix memory exhaustion bug in ipset hash:*net* types when adding ranges
that look like x.x.x.x-255.255.255.255, from Jozsef Kadlecsik.
4) Fix pointer leaks to userspace in x_tables, from Dmitry Vyukov.
5) Insufficient sanity checks in clusterip_tg_check(), also from Dmitry.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- terminate the build correctly in case of fixdep errors
- clean up fixdep
- suppress packed-not-aligned warnings from GCC-8
- fix W= handling for extra DTC warnings
* tag 'kbuild-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: fix W= option checks for extra DTC warnings
Kbuild: suppress packed-not-aligned warning for default setting only
fixdep: use existing helper to check modular CONFIG options
fixdep: refactor parse_dep_file()
fixdep: move global variables to local variables of main()
fixdep: remove unneeded memcpy() in parse_dep_file()
fixdep: factor out common code for reading files
fixdep: use malloc() and read() to load dep_file to buffer
fixdep: remove unnecessary <arpa/inet.h> inclusion
fixdep: exit with error code in error branches of do_config_file()
|
|
Kernel Glossary has moved from /Glossary to /KernelGlossary
on kernelnewbies.org. This patch corrects this link.
Signed-off-by: Grigory Shipunov <mail@oxapentane.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree updates from Rob Herring:
- Convert to use memblock_virt_alloc in DT code which supports
bootmem arches. With this we can remove the arch specific
early_init_dt_alloc_memory_arch() functions.
- Enable running the DT unittests on UML
- Use SPDX license tags on DT files
- Fix early FDT kconfig ifdef logic
- Clean-up unittest Makefile
- Fix function comment for of_irq_parse_raw
- Add missing documentation for linux,initrd-{start,end} properties
- Clean-up of binding examples using uppercase hex
- Add trivial devices W83773G and Infineon TLV493D-A1B6
- Add missing STM32 SoC bindings
- Various small binding doc fixes
* tag 'devicetree-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (23 commits)
xtensa: remove arch specific early DT functions
x86: remove arch specific early_init_dt_alloc_memory_arch
nios2: remove arch specific early_init_dt_alloc_memory_arch
mips: remove arch specific early_init_dt_alloc_memory_arch
metag: remove arch specific early DT functions
cris: remove arch specific early DT functions
libfdt: remove unnecessary include directive from <linux/libfdt.h>
of: unittest: refactor Makefile
of/fdt: use memblock_virt_alloc for early alloc
of: Use SPDX license tag for DT files
of/fdt: Fix #ifdef dependency of early flattree declarations
dt-bindings: h8300 clocksource: correct spelling of pulse
dt-bindings: imx6q-pcie: Add required property for i.MX6SX
mmc: Don't reference Linux-specific OF_GPIO_ACTIVE_LOW flag in DT binding
dt-bindings: Use lower case hex in unit-addresses
dt-bindings: display: panel: Fix compatible string for Toshiba LT089AC29000
dt-bindings: Add Infineon TLV493D-A1B6
dt-bindings: mailbox: ti,message-manager: Fix interrupt name error
dt-bindings: chosen: Document linux,initrd-{start,end}
dt-bindings: arm: document supported STM32 SoC family
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input layer updates from Dmitry Torokhov:
- evdev interface has been adjusted to extend the life of timestamps on
32 bit systems to the year of 2108
- Synaptics RMI4 driver's PS/2 guest handling ha beed updated to
improve chances of detecting trackpoints on the pass-through port
- mms114 touchcsreen controller driver has been updated to support
generic device properties and work with mms152 cntrollers
- Goodix driver now supports generic touchscreen properties
- couple of drivers for AVR32 architecture are gone as the architecture
support has been removed from the kernel
- gpio-tilt driver has been removed as there are no mainline users and
the driver itself is using legacy APIs and relies on platform data
- MODULE_LINECSE/MODULE_VERSION cleanups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (45 commits)
Input: goodix - use generic touchscreen_properties
Input: mms114 - fix typo in definition
Input: mms114 - use BIT() macro instead of explicit shifting
Input: mms114 - replace mdelay with msleep
Input: mms114 - add support for mms152
Input: mms114 - drop platform data and use generic APIs
Input: mms114 - mark as direct input device
Input: mms114 - do not clobber interrupt trigger
Input: edt-ft5x06 - fix error handling for factory mode on non-M06
Input: stmfts - set IRQ_NOAUTOEN to the irq flag
Input: auo-pixcir-ts - delete an unnecessary return statement
Input: auo-pixcir-ts - remove custom log for a failed memory allocation
Input: da9052_tsi - remove unused mutex
Input: docs - use PROPERTY_ENTRY_U32() directly
Input: synaptics-rmi4 - log when we create a guest serio port
Input: synaptics-rmi4 - unmask F03 interrupts when port is opened
Input: synaptics-rmi4 - do not delete interrupt memory too early
Input: ad7877 - use managed resource allocations
Input: stmfts,s6sy671 - add SPDX identifier
Input: remove atmel-wm97xx touchscreen driver
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big pull request for char/misc drivers for 4.16-rc1.
There's a lot of stuff in here. Three new driver subsystems were added
for various types of hardware busses:
- siox
- slimbus
- soundwire
as well as a new vboxguest subsystem for the VirtualBox hypervisor
drivers.
There's also big updates from the FPGA subsystem, lots of Android
binder fixes, the usual handful of hyper-v updates, and lots of other
smaller driver updates.
All of these have been in linux-next for a long time, with no reported
issues"
* tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (155 commits)
char: lp: use true or false for boolean values
android: binder: use VM_ALLOC to get vm area
android: binder: Use true and false for boolean values
lkdtm: fix handle_irq_event symbol for INT_HW_IRQ_EN
EISA: Delete error message for a failed memory allocation in eisa_probe()
EISA: Whitespace cleanup
misc: remove AVR32 dependencies
virt: vbox: Add error mapping for VERR_INVALID_NAME and VERR_NO_MORE_FILES
soundwire: Fix a signedness bug
uio_hv_generic: fix new type mismatch warnings
uio_hv_generic: fix type mismatch warnings
auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
uio_hv_generic: add rescind support
uio_hv_generic: check that host supports monitor page
uio_hv_generic: create send and receive buffers
uio: document uio_hv_generic regions
doc: fix documentation about uio_hv_generic
vmbus: add monitor_id and subchannel_id to sysfs per channel
vmbus: fix ABI documentation
uio_hv_generic: use ISR callback method
...
|
|
Restore an optimization removed in commit 7f19449553 "Fix debugfs glocks
dump": keep the glock hash table iterator active while the glock dump
file is held open. This avoids having to rescan the hash table from the
start for each read, with quadratically rising runtime.
In addition, use rhastable_walk_peek for resuming a glock dump at the
current position: when a glock doesn't fit in the provided buffer
anymore, the next read must revisit the same glock.
Finally, also restart the dump from the first entry when we notice that
the hash table has been resized in gfs2_glock_seq_start.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Depend on LIBCRC32C which uses the crypto API to select the appropriate
crc32c implementation. With the CRYPTO and CRYPTO_CRC32C dependencies,
gfs2 would still need to use the crypto API directly like ext4 and btrfs
do, which isn't necessary.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of "big" driver core patches for 4.16-rc1.
The majority of the work here is in the firmware subsystem, with
reworks to try to attempt to make the code easier to handle in the
long run, but no functional change. There's also some tree-wide sysfs
attribute fixups with lots of acks from the various subsystem
maintainers, as well as a handful of other normal fixes and changes.
And finally, some license cleanups for the driver core and sysfs code.
All have been in linux-next for a while with no reported issues"
* tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (48 commits)
device property: Define type of PROPERTY_ENRTY_*() macros
device property: Reuse property_entry_free_data()
device property: Move property_entry_free_data() upper
firmware: Fix up docs referring to FIRMWARE_IN_KERNEL
firmware: Drop FIRMWARE_IN_KERNEL Kconfig option
USB: serial: keyspan: Drop firmware Kconfig options
sysfs: remove DEBUG defines
sysfs: use SPDX identifiers
drivers: base: add coredump driver ops
sysfs: add attribute specification for /sysfs/devices/.../coredump
test_firmware: fix missing unlock on error in config_num_requests_store()
test_firmware: make local symbol test_fw_config static
sysfs: turn WARN() into pr_warn()
firmware: Fix a typo in fallback-mechanisms.rst
treewide: Use DEVICE_ATTR_WO
treewide: Use DEVICE_ATTR_RO
treewide: Use DEVICE_ATTR_RW
sysfs.h: Use octal permissions
component: add debugfs support
bus: simple-pm-bus: convert bool SIMPLE_PM_BUS to tristate
...
|
|
This guide is an adapted version of the more general "Protecting Code
Integrity" guide written and maintained by The Linux Foundation IT for
use with open-source projects. It provides the oft-lacking guidance on
the following topics:
- how to properly protect one's PGP keys to minimize the risks of them
being stolen and used maliciously to impersonate a kernel developer
- how to configure Git to properly use GnuPG
- when and how to use PGP with Git
- how to verify fellow Linux Kernel developer identities
I believe this document should live with the rest of the documentation
describing proper processes one should follow when participating in
kernel development. Placing it in a wiki on some place like kernel.org
would be insufficient for a number of reasons -- primarily, because only
a relatively small subset of maintainers have accounts on kernel.org,
but also because even those who do rarely remember that such wiki
exists. Keeping it with the rest of in-kernel docs should hopefully give
it more visibility, but also help keep it up-to-date as tools and
processes evolve.
Signed-off-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|