Age | Commit message | Author
2018-10-16 | qede: Check available link modes before link set from ethtool. | Rahul Verma
Set link mode after checking available "supported" link caps of the port. Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | qed: Add supported link and advertise link to display in ethtool. | Rahul Verma
Added transceiver type, speed capability and board types to the HSI; these are used to display accurate link information in ethtool. Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | qed: Added supported transceiver modes, speed capability and board config to HSI. | Rahul Verma
Added transceiver modes with different speed and media type, speed capability and supported board types to the HSI, which will be used to display the correct link modes and speed type. Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | qed: Align local and global PTT to propagate through the APIs. | Rahul Verma
Align the use of local PTT to propagate through the qed_mcp* APIs. Global PTT should not be used. Register access should be done through layers. A register address is mapped into a PTT (PF translation table). Several interface functions require a PTT to direct reads/writes into a register. A pool of PTTs is maintained, and several PTTs are used simultaneously to access device registers in different flows. The same PTT should not be used in flows that can run concurrently. To avoid running out of PTT resources, too many PTTs should not be acquired without releasing them. Every PF has a global PTT, which is used throughout the life of the PF, in the most important flows for register access. Generic functions acquire the PTT locally and release it after use. This patch aligns the use of Global PTT and Local PTT accordingly. Signed-off-by: Rahul Verma <rahul.verma@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
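The acquire/release discipline described in this message can be sketched roughly as follows. This is a minimal single-threaded userspace illustration, not the qed driver's code: `ptt_pool`, `ptt_acquire()`, `ptt_release()`, `do_register_access()` and the pool size are hypothetical names chosen for the example, and a real driver would also need locking around the pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PTT_POOL_SIZE 4 /* hypothetical pool size, for illustration only */

struct ptt_entry {
	bool in_use;
};

struct ptt_pool {
	struct ptt_entry entries[PTT_POOL_SIZE];
};

/* Take a free entry from the pool; returns NULL when the pool is exhausted. */
static struct ptt_entry *ptt_acquire(struct ptt_pool *pool)
{
	for (size_t i = 0; i < PTT_POOL_SIZE; i++) {
		if (!pool->entries[i].in_use) {
			pool->entries[i].in_use = true;
			return &pool->entries[i];
		}
	}
	return NULL;
}

/* Hand an entry back so that other flows can use it. */
static void ptt_release(struct ptt_entry *ptt)
{
	assert(ptt != NULL && ptt->in_use);
	ptt->in_use = false;
}

/* A generic helper acquires a PTT locally and releases it after use. */
static int do_register_access(struct ptt_pool *pool)
{
	struct ptt_entry *ptt = ptt_acquire(pool);

	if (ptt == NULL)
		return -1; /* pool exhausted: too many PTTs held elsewhere */

	/* ... register reads/writes through ptt would go here ... */

	ptt_release(ptt);
	return 0;
}
```

Holding every entry without releasing it makes `do_register_access()` fail, which is the resource exhaustion the message warns about.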
2018-10-16 | net: aquantia: make function aq_fw2x_update_stats static | YueHaibing
Fixes the following sparse warning: drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils_fw2x.c:282:5: warning: symbol 'aq_fw2x_update_stats' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL instead | Xin Long
According to rfc7496 section 4.3 or 4.4: sprstat_policy: This parameter indicates for which PR-SCTP policy the user wants the information. It is an error to use SCTP_PR_SCTP_NONE in sprstat_policy. If SCTP_PR_SCTP_ALL is used, the counters provided are aggregated over all supported policies. We change to dump pr_assoc and pr_stream all status by SCTP_PR_SCTP_ALL instead, and return error for SCTP_PR_SCTP_NONE, as it also said "It is an error to use SCTP_PR_SCTP_NONE in sprstat_policy. " Fixes: 826d253d57b1 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") Fixes: d229d48d183f ("sctp: add SCTP_PR_STREAM_STATUS sockopt for prsctp") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc | Greg Kroah-Hartman
David writes: "Sparc fixes 1) Revert the %pOF change, it causes regressions. 2) Wire up io_pgetevents(). 3) Fix perf events on single-PCR sparc64 cpus. 4) Do proper perf event throttling like arm and x86." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: Revert "sparc: Convert to using %pOFn instead of device_node.name" sparc64: Set %l4 properly on trap return after handling signals. sparc64: Make proc_id signed. sparc: Throttle perf events properly. sparc: Fix single-pcr perf event counter management. sparc: Wire up io_pgetevents system call. sunvdc: Remove VLA usage
2018-10-16 | Merge tag 'selinux-pr-20181015' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux | Greg Kroah-Hartman
Paul writes: "SELinux fixes for v4.19 We've got one SELinux "fix" that I'd like to get into v4.19 if possible. I'm using double quotes on "fix" as this is just an update to the MAINTAINERS file and not a code change. From my perspective, MAINTAINERS updates generally don't warrant inclusion during the -rcX phase, but this is a change to the mailing list location so it seemed prudent to get this in before v4.19 is released" * tag 'selinux-pr-20181015' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: MAINTAINERS: update the SELinux mailing list location
2018-10-16 | RDMA/ucma: Fix Spectre v1 vulnerability | Gustavo A. R. Silva
hdr.cmd can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/infiniband/core/ucma.c:1686 ucma_write() warn: potential spectre issue 'ucma_cmd_table' [r] (local cap) Fix this by sanitizing hdr.cmd before using it to index ucma_cmd_table. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
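Sanitizing an index before a table lookup can be sketched in userspace as follows. In the kernel this is normally done with the `array_index_nospec()` helper; the code below is a simplified stand-in, and `index_mask()`, `sanitize_index()`, `cmd_table` and `dispatch()` are hypothetical names for this example only. The mask computation assumes that right-shifting a negative `long` is an arithmetic shift, as it is with GCC and Clang.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * All-ones when index < size, zero otherwise, computed without a branch
 * so that a mispredicted bounds check cannot yield an out-of-range index.
 * Assumes size <= LONG_MAX and an arithmetic right shift on signed long.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >>
	       (sizeof(long) * CHAR_BIT - 1);
}

/* Clamp index to 0 whenever it is out of bounds. */
static unsigned long sanitize_index(unsigned long index, unsigned long size)
{
	return index & index_mask(index, size);
}

typedef int (*cmd_handler)(void);

static int cmd_a(void) { return 10; }
static int cmd_b(void) { return 20; }

/* Hypothetical command table indexed by a user-controlled value. */
static const cmd_handler cmd_table[] = { cmd_a, cmd_b };
#define CMD_COUNT (sizeof(cmd_table) / sizeof(cmd_table[0]))

static int dispatch(unsigned long cmd)
{
	if (cmd >= CMD_COUNT)
		return -1; /* architectural bounds check */

	/* Speculation-safe clamp before the value is used as an index. */
	cmd = sanitize_index(cmd, CMD_COUNT);
	return cmd_table[cmd]();
}
```

The bounds check alone protects the architectural path; the mask is what keeps a speculatively executed lookup inside the table.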
2018-10-16 | MIPS: VDSO: Reduce VDSO_RANDOMIZE_SIZE to 64MB for 64bit | Huacai Chen
Commit ea7e0480a4b6 ("MIPS: VDSO: Always map near top of user memory") set VDSO_RANDOMIZE_SIZE to 256MB for 64bit kernels. But looking at arch/mips/mm/mmap.c we can see that MIN_GAP is 128MB, which means the mmap_base may be at (user_address_top - 128MB). This leaves the stack surrounded by mmapped areas, so stack expansion fails and causes a segmentation fault. Therefore, VDSO_RANDOMIZE_SIZE should be less than MIN_GAP, and this patch reduces it to 64MB. Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: ea7e0480a4b6 ("MIPS: VDSO: Always map near top of user memory") Patchwork: https://patchwork.linux-mips.org/patch/20910/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: Huacai Chen <chenhuacai@gmail.com>
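The size constraint in this message comes down to a simple comparison, sketched here with the figures quoted above. `SZ_1M` and `vdso_window_below_gap()` are names made up for the illustration; the real constants live in the MIPS architecture code.

```c
#include <assert.h>
#include <stdbool.h>

#define SZ_1M (1024UL * 1024UL)

/*
 * The VDSO randomization window must stay smaller than the gap reserved
 * below the top of user memory, otherwise mappings can end up around
 * the stack and block its growth.
 */
static bool vdso_window_below_gap(unsigned long randomize_size,
				  unsigned long min_gap)
{
	return randomize_size < min_gap;
}

/* Compile-time check of the new value against the quoted 128MB MIN_GAP. */
static_assert(64 * SZ_1M < 128 * SZ_1M, "64MB window must fit below MIN_GAP");
```

With the old 256MB value the same check fails, which matches the failure mode the message describes.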
2018-10-16 | f2fs: allow to mount, if quota is failed | Jaegeuk Kim
Since we can use the filesystem without quotas till next boot. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: update REQ_TIME in f2fs_cross_rename() | Sahitya Tummala
Update REQ_TIME in the missing path - f2fs_cross_rename(). Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> [Jaegeuk Kim: add it in f2fs_rename()] Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: do not update REQ_TIME in case of error conditions | Sahitya Tummala
The REQ_TIME should be updated only on success, as is done everywhere else in the file system. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: remove unneeded disable_nat_bits() | Chao Yu
Commit 7735730d39d7 ("f2fs: fix to propagate error from __get_meta_page()") added disable_nat_bits() in the error path of __get_nat_bitmaps(), but it is unneeded: because the mount will fail, we won't have a chance to change nid usage status without the nat full/empty bitmaps. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: remove unused sbi->trigger_ssr_threshold | Chao Yu
Commit a2a12b679f36 ("f2fs: export SSR allocation threshold") introduced two thresholds, .min_ssr_sections and .trigger_ssr_threshold, but only .min_ssr_sections is used, so just remove the redundant one for cleanup. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: shrink sbi->sb_lock coverage in set_file_temperature() | Chao Yu
file_set_{cold,hot} don't need to hold sbi->sb_lock, so move them out of the lock. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: use rb_*_cached friends | Chao Yu
As rbtree supports caching the leftmost node natively, update the f2fs code to use the rb_*_cached helpers to speed up leftmost node visiting. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: fix to recover cold bit of inode block during POR | Chao Yu
Testcase to reproduce this bug:
1. mkfs.f2fs /dev/sdd
2. mount -t f2fs /dev/sdd /mnt/f2fs
3. touch /mnt/f2fs/file
4. sync
5. chattr +A /mnt/f2fs/file
6. xfs_io -f /mnt/f2fs/file -c "fsync"
7. godown /mnt/f2fs
8. umount /mnt/f2fs
9. mount -t f2fs /dev/sdd /mnt/f2fs
10. chattr -A /mnt/f2fs/file
11. xfs_io -f /mnt/f2fs/file -c "fsync"
12. umount /mnt/f2fs
13. mount -t f2fs /dev/sdd /mnt/f2fs
14. lsattr /mnt/f2fs/file

-----------------N- /mnt/f2fs/file

But actually, we expect the correct result is:

-------A---------N- /mnt/f2fs/file

The reason is that in step 9) we missed recovering the cold bit flag in the inode block, so later, in fsync, we skip writing the inode block due to the condition check below, resulting in data loss in another SPOR.

f2fs_fsync_node_pages()
	if (!IS_DNODE(page) || !is_cold_node(page))
		continue;

Note that, I guess that some non-dir inodes have already lost the cold bit during POR, so in order to reenable recovery for those inodes, let's try to recover the cold bit in f2fs_iget() to save more fsynced data.

Fixes: c56675750d7c ("f2fs: remove unneeded set_cold_node()") Cc: <stable@vger.kernel.org> 4.17+ Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: submit cached bio to avoid endless PageWriteback | Chao Yu
When migrating encrypted blocks from the background GC thread, we only add them into the f2fs inner bio cache but forget to submit the cached bio. This may cause a potential deadlock when we are waiting for page writeback to complete, so fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | f2fs: checkpoint disabling | Daniel Rosenberg
Note that, it requires "f2fs: return correct errno in f2fs_gc". This adds a lightweight non-persistent snapshotting scheme to f2fs. To use, mount with the option checkpoint=disable, and to return to normal operation, remount with checkpoint=enable. If the filesystem is shut down before remounting with checkpoint=enable, it will revert back to its apparent state when it was first mounted with checkpoint=disable. This is useful for situations where you wish to be able to roll back the state of the disk in case of some critical failure. Signed-off-by: Daniel Rosenberg <drosen@google.com> [Jaegeuk Kim: use SB_RDONLY instead of MS_RDONLY] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-10-16 | sx8: convert to blk-mq | Jens Axboe
Convert from the old request_fn style driver to blk-mq. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | z2ram: convert to blk-mq | Jens Axboe
Straightforward conversion to blk-mq, nothing special about this driver. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | gdrom: convert to blk-mq | Jens Axboe
Ditch the deferred list, lock, and workqueue handling. Just mark the set as being blocking, so we are invoked from a workqueue already. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | floppy: convert to blk-mq | Omar Sandoval
This driver likes to fetch requests from all over the place, so make queue_rq put requests on a list so that the logic stays the same. Tested with QEMU. Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue() and fixed a few spots where the tag_set leaked on cleanup. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | ataflop: convert to blk-mq | Omar Sandoval
This driver is already pretty broken, in that it has two wait_events() (one in stdma_lock()) in request_fn. Get rid of the first one by freezing/quiescing the queue on format, and the second one by replacing it with stdma_try_lock(). The rest is straightforward. Compile-tested only and probably incorrect. Cc: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | ataflop: fix error handling during setup | Omar Sandoval
Move queue allocation next to disk allocation to fix a couple of issues: - If add_disk() hasn't been called, we should clear disk->queue before calling put_disk(). - If we fail to allocate a request queue, we still need to put all of the disks, not just the ones that we allocated queues for. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | ataflop: fold headers into C file | Omar Sandoval
atafd.h and atafdreg.h are only used from ataflop.c, so merge them in there. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | amiflop: convert to blk-mq | Omar Sandoval
Straightforward conversion, just use the existing amiflop_lock to serialize access to the controller. Compile-tested only. Cc: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | amiflop: clean up on errors during setup | Omar Sandoval
The error handling in fd_probe_drives() doesn't clean up at all. Fix it up in preparation for converting to blk-mq. While we're here, get rid of the commented out amiga_floppy_remove(). Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | amiflop: fold headers into C file | Omar Sandoval
amifd.h and amifdreg.h are only used from amiflop.c, and they're pretty small, so move the contents to amiflop.c and get rid of the .h files. This is preparation for adding a struct blk_mq_tag_set to struct amiga_floppy_struct. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | swim3: convert to blk-mq | Omar Sandoval
Pretty simple conversion. grab_drive() could probably be replaced by some freeze/quiesce incantation, but I left it alone, and just used freeze/quiesce for eject. Compile-tested only. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue(). Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | swim3: add real error handling in setup | Omar Sandoval
The driver doesn't have support for removing a device that has already been configured, but with more careful ordering we can avoid the need for that and make sure that we don't leak generic resources. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | swim: convert to blk-mq | Omar Sandoval
The only interesting thing here is that there may be two floppies (i.e., request queues) sharing the same controller, so we use the global struct swim_priv->lock to check whether the controller is busy. Compile-tested only. Tested-by: Finn Thain <fthain@telegraphics.com.au> Acked-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Converted to blk_mq_init_sq_queue() Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | swim: fix cleanup on setup error | Omar Sandoval
If we fail to allocate the request queue for a disk, we still need to free that disk, not just the previous ones. Additionally, we need to cleanup the previous request queues. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-16 | locking/qspinlock, x86: Provide liveness guarantee | Peter Zijlstra
On x86 we cannot do fetch_or() with a single instruction and thus end up using a cmpxchg loop, this reduces determinism. Replace the fetch_or() with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not have. Consider the scenario:

        CPU0            CPU1            CPU2

 1)     lock
          trylock -> (0,0,1)

 2)                     lock
                          trylock /* fail */

 3)     unlock -> (0,0,0)

 4)                                     lock
                                          trylock -> (0,0,1)

 5)                       tas-pending -> (0,1,1)
                          load-val <- (0,1,0) from 3

 6)                       clear-pending-set-locked -> (0,0,1)

                          FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of the qspinlock state as a separate variable (as we can when _Q_PENDING_BITS == 8) then the above is entirely possible, because tas-pending will only RmW the pending byte, so the later load is able to observe prior tail and lock state (but not earlier than its own trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it can trivially observe prior state).

 - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for example, such that we cannot observe other state prior to setting pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for tas-pending followed by a regular load.

Note that observing later state is not a problem:

 - if we fail to observe a later unlock, we'll simply spin-wait for that store to become visible.

 - if we observe a later xchg_tail(), there is no difference from that xchg_tail() having taken place before the tas-pending.
Suggested-by: Will Deacon <will.deacon@arm.com> Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: andrea.parri@amarulasolutions.com Cc: longman@redhat.com Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath") Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-16 | x86/asm: 'Simplify' GEN_*_RMWcc() macros | Peter Zijlstra
Currently the GEN_*_RMWcc() macros include a return statement, which pretty much mandates we directly wrap them in a (inline) function. Macros with return statements are tricky and, as per the above, limit use, so remove the return statement and make them statement-expressions. This allows them to be used more widely. Also, shuffle the arguments a bit. Place the @cc argument as 3rd, this makes it consistent between UNARY and BINARY, but more importantly, it makes the @arg0 argument last. Since the @arg0 argument is now last, we can do CPP trickery and make it an optional argument, simplifying the users; 17 out of 18 occurrences do not need this argument. Finally, change to asm symbolic names, instead of the numeric ordering of operands, which allows us to get rid of __BINARY_RMWcc_ARG and get cleaner code overall. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: JBeulich@suse.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: hpa@linux.intel.com Link: https://lkml.kernel.org/r/20181003130957.108960094@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
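The difference between a macro that hides a return statement and one written as a statement-expression can be sketched like this. The macros below are simplified, hypothetical stand-ins (`GEN_TEST_BIT_RETURN`, `GEN_TEST_BIT_EXPR`) that use plain C instead of the inline assembly the real GEN_*_RMWcc() macros emit; statement-expressions are a GNU C extension supported by GCC and Clang.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Old style: the macro contains a return, so it can only be used as the
 * body of a dedicated wrapper function.
 */
#define GEN_TEST_BIT_RETURN(word, bit) \
	do { return (((word) >> (bit)) & 1U) != 0; } while (0)

static inline bool test_bit_old(unsigned int word, unsigned int bit)
{
	GEN_TEST_BIT_RETURN(word, bit);
}

/*
 * New style: a statement-expression evaluates to a value, so the macro
 * can appear anywhere an expression can.
 */
#define GEN_TEST_BIT_EXPR(word, bit) \
	({ bool c_ = (((word) >> (bit)) & 1U) != 0; c_; })

static inline int count_low_bits(unsigned int word)
{
	/* Used twice inside one expression, which the old form cannot do. */
	return GEN_TEST_BIT_EXPR(word, 0) + GEN_TEST_BIT_EXPR(word, 1);
}
```

The expression form composes with surrounding code, which is what the message means by allowing the macros "to be used more widely".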
2018-10-16 | locking/qspinlock: Rework some comments | Peter Zijlstra
While working my way through the code again; I felt the comments could use help. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: andrea.parri@amarulasolutions.com Cc: longman@redhat.com Link: https://lkml.kernel.org/r/20181003130257.156322446@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-16 | locking/qspinlock: Re-order code | Peter Zijlstra
Flip the branch condition after atomic_fetch_or_acquire(_Q_PENDING_VAL) such that we lose the indent. This also results in a more natural code flow IMO. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: andrea.parri@amarulasolutions.com Cc: longman@redhat.com Link: https://lkml.kernel.org/r/20181003130257.156322446@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-16 | IB/ucm: Fix Spectre v1 vulnerability | Gustavo A. R. Silva
hdr.cmd can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/infiniband/core/ucm.c:1127 ib_ucm_write() warn: potential spectre issue 'ucm_cmd_table' [r] (local cap) Fix this by sanitizing hdr.cmd before using it to index ucm_cmd_table. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16 | Merge branch 'x86/build' into locking/core, to pick up dependent patches and unify jump-label work | Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-16 | perf cpu_map: Align cpu map synthesized events properly. | David Miller
The size of the resulting cpu map can be smaller than a multiple of sizeof(u64), resulting in SIGBUS on cpus like Sparc as the next event will not be aligned properly. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Fixes: 6c872901af07 ("perf cpu_map: Add cpu_map event synthesize function") Link: http://lkml.kernel.org/r/20181011.224655.716771175766946817.davem@davemloft.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
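The alignment requirement can be illustrated with a small round-up helper. This is a sketch of the general technique, not the perf source: `align_up()` and `padded_event_size()` are hypothetical names, and the header and payload sizes used with them are made-up inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * Pad an event so that whatever follows it in the stream starts on a
 * 64-bit boundary, which strict-alignment CPUs such as Sparc require.
 */
static size_t padded_event_size(size_t header_size, size_t payload_size)
{
	return align_up(header_size + payload_size, sizeof(uint64_t));
}
```

For example, an 8-byte header followed by a 6-byte payload is padded from 14 to 16 bytes, so the next event begins at an offset that is a multiple of `sizeof(uint64_t)`.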
2018-10-16 | perf/x86/intel: Export mem events only if there's PEBS support | Jiri Olsa
Memory events depend on PEBS support and access to the LDLAT MSR, but we display them in /sys/devices/cpu/events even if the CPU does not provide those, as is the case for KVM guests. That gives the false impression that those events are available, while they fail even to open. Separate the mem-* event attributes and merge them with cpu_events only if PEBS support is detected. We could also check if the LDLAT MSR is available, but the PEBS check seems to cover the need now. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20180906135748.GC9577@krava Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-16 | perf tools: Fix tracing_path_mount proper path | Jiri Olsa
If there's no tracefs (RHEL7) support the tracing_path_mount returns debugfs path which results in following fail: # perf probe sys_write kprobe_events file does not exist - please rebuild kernel with CONFIG_KPROBE_EVENTS. Error: Failed to add events. In tracing_path_debugfs_mount function we need to return the 'tracing' path instead of just the mount to make it work: # perf probe sys_write Added new event: probe:sys_write (on sys_write) You can now use it in all perf tools, such as: perf record -e probe:sys_write -aR sleep 1 Adding the 'return tracing_path;' also to tracing_path_tracefs_mount function just for consistency with tracing_path_debugfs_mount. Upstream keeps working, because it has the tracefs support. Link: http://lkml.kernel.org/n/tip-yiwkzexq9fk1ey1xg3gnjlw4@git.kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Fixes: 23773ca18b39 ("perf tools: Make perf aware of tracefs") Link: http://lkml.kernel.org/r/20181016114818.3595-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-16 | bpf, tls: add tls header to tools infrastructure | Daniel Borkmann
Andrey reported a build error for the BPF kselftest suite when compiled on a machine which does not have tls related header bits installed natively: test_sockmap.c:120:23: fatal error: linux/tls.h: No such file or directory #include <linux/tls.h> ^ compilation terminated. Fix it by adding the header to the tools include infrastructure and add definitions such as SOL_TLS that could potentially be missing. Fixes: e9dd904708c4 ("bpf: add tls support for testing in test_sockmap") Reported-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-16 | perf tools: Fix use of alternatives to find JDIR | Jarod Wilson
When a build is run from something like a cron job, the user's $PATH is rather minimal, of note, not including /usr/sbin in my own case. Because of that, an automated rpm package build ultimately fails to find libperf-jvmti.so, because somewhere within the build, this happens... /bin/sh: alternatives: command not found /bin/sh: alternatives: command not found Makefile.config:849: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel ...and while the build continues, libperf-jvmti.so isn't built, and things fall down when rpm tries to find all the %files specified. Exact same system builds everything just fine when the job is launched from a login shell instead of a cron job, since alternatives is in $PATH, so openjdk is actually found. The test required to get into this section of code actually specifies the full path, as does a block just above it, so let's do that here too. Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: William Cohen <wcohen@redhat.com> Fixes: d4dfdf00d43e ("perf jvmti: Plug compilation into perf build") Link: http://lkml.kernel.org/r/20180906221812.11167-1-jarod@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-10-16 | dmaengine: ppc4xx: fix off-by-one build failure | Christian Lamparter
There are two poly_store, but one should have been poly_show. |adma.c:4382:16: error: conflicting types for 'poly_store' | static ssize_t poly_store(struct device_driver *dev, const char *buf, | ^~~~~~~~~~ |adma.c:4363:16: note: previous definition of 'poly_store' was here | static ssize_t poly_store(struct device_driver *dev, char *buf) | ^~~~~~~~~~ CC: stable@vger.kernel.org Fixes: 13efe1a05384 ("dmaengine: ppc4xx: remove DRIVER_ATTR() usage") Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Vinod Koul <vkoul@kernel.org>
2018-10-16 | IB/mlx5: Fix MR cache initialization | Artemy Kovalyov
Schedule MR cache work only after bucket was initialized. Cc: <stable@vger.kernel.org> # 4.10 Fixes: 49780d42dfc9 ("IB/mlx5: Expose MR cache for mlx5_ib") Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | RDMA/cm: Respect returned status of cm_init_av_by_path | Leon Romanovsky
Add missing check for failure of cm_init_av_by_path Fixes: e1444b5a163e ("IB/cm: Fix automatic path migration support") Reported-by: Slava Shwartsman <slavash@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 | IB/ipoib: Clear IPCB before icmp_send | Denis Drozdov
IPCB should be cleared before icmp_send, since it may contain data from previous layers and the data could be misinterpreted as ip header options, which later caused the ihl to be set to an invalid value and resulted in the following stack corruption: [ 1083.031512] ib0: packet len 57824 (> 2048) too long to send, dropping [ 1083.031843] ib0: packet len 37904 (> 2048) too long to send, dropping [ 1083.032004] ib0: packet len 4040 (> 2048) too long to send, dropping [ 1083.032253] ib0: packet len 63800 (> 2048) too long to send, dropping [ 1083.032481] ib0: packet len 23960 (> 2048) too long to send, dropping [ 1083.033149] ib0: packet len 63800 (> 2048) too long to send, dropping [ 1083.033439] ib0: packet len 63800 (> 2048) too long to send, dropping [ 1083.033700] ib0: packet len 63800 (> 2048) too long to send, dropping [ 1083.034124] ib0: packet len 63800 (> 2048) too long to send, dropping [ 1083.034387] ================================================================== [ 1083.034602] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0xf08/0x1310 [ 1083.034798] Write of size 4 at addr ffff880353457c5f by task kworker/u16:0/7 [ 1083.034990] [ 1083.035104] CPU: 7 PID: 7 Comm: kworker/u16:0 Tainted: G O 4.19.0-rc5+ #1 [ 1083.035316] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014 [ 1083.035573] Workqueue: ipoib_wq ipoib_cm_skb_reap [ib_ipoib] [ 1083.035750] Call Trace: [ 1083.035888] dump_stack+0x9a/0xeb [ 1083.036031] print_address_description+0xe3/0x2e0 [ 1083.036213] kasan_report+0x18a/0x2e0 [ 1083.036356] ? __ip_options_echo+0xf08/0x1310 [ 1083.036522] __ip_options_echo+0xf08/0x1310 [ 1083.036688] icmp_send+0x7b9/0x1cd0 [ 1083.036843] ? icmp_route_lookup.constprop.9+0x1070/0x1070 [ 1083.037018] ? netif_schedule_queue+0x5/0x200 [ 1083.037180] ? debug_show_all_locks+0x310/0x310 [ 1083.037341] ? rcu_dynticks_curr_cpu_in_eqs+0x85/0x120 [ 1083.037519] ? debug_locks_off+0x11/0x80 [ 1083.037673] ? 
debug_check_no_obj_freed+0x207/0x4c6 [ 1083.037841] ? check_flags.part.27+0x450/0x450 [ 1083.037995] ? debug_check_no_obj_freed+0xc3/0x4c6 [ 1083.038169] ? debug_locks_off+0x11/0x80 [ 1083.038318] ? skb_dequeue+0x10e/0x1a0 [ 1083.038476] ? ipoib_cm_skb_reap+0x2b5/0x650 [ib_ipoib] [ 1083.038642] ? netif_schedule_queue+0xa8/0x200 [ 1083.038820] ? ipoib_cm_skb_reap+0x544/0x650 [ib_ipoib] [ 1083.038996] ipoib_cm_skb_reap+0x544/0x650 [ib_ipoib] [ 1083.039174] process_one_work+0x912/0x1830 [ 1083.039336] ? wq_pool_ids_show+0x310/0x310 [ 1083.039491] ? lock_acquire+0x145/0x3a0 [ 1083.042312] worker_thread+0x87/0xbb0 [ 1083.045099] ? process_one_work+0x1830/0x1830 [ 1083.047865] kthread+0x322/0x3e0 [ 1083.050624] ? kthread_create_worker_on_cpu+0xc0/0xc0 [ 1083.053354] ret_from_fork+0x3a/0x50 For instance __ip_options_echo is failing to proceed with invalid srr and optlen passed from another layer via IPCB [ 762.139568] IPv4: __ip_options_echo rr=0 ts=0 srr=43 cipso=0 [ 762.139720] IPv4: ip_options_build: IPCB 00000000f3cd969e opt 000000002ccb3533 [ 762.139838] IPv4: __ip_options_echo in srr: optlen 197 soffset 84 [ 762.139852] IPv4: ip_options_build srr=0 is_frag=0 rr_needaddr=0 ts_needaddr=0 ts_needtime=0 rr=0 ts=0 [ 762.140269] ================================================================== [ 762.140713] IPv4: __ip_options_echo rr=0 ts=0 srr=0 cipso=0 [ 762.141078] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x12ec/0x1680 [ 762.141087] Write of size 4 at addr ffff880353457c7f by task kworker/u16:0/7 Signed-off-by: Denis Drozdov <denisd@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16RDMA/restrack: Protect from reentry to resource return pathLeon Romanovsky
Nullify the resource task struct pointer to ensure that subsequent calls won't try to release task_struct again. ------------[ cut here ]------------ ODEBUG: free active (active state 1) object type: rcu_head hint: (null) WARNING: CPU: 0 PID: 6048 at lib/debugobjects.c:329 debug_print_object+0x16a/0x210 lib/debugobjects.c:326 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 6048 Comm: syz-executor022 Not tainted 4.19.0-rc7-next-20181008+ #89 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x3ab lib/dump_stack.c:113 panic+0x238/0x4e7 kernel/panic.c:184 __warn.cold.8+0x163/0x1ba kernel/panic.c:536 report_bug+0x254/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271 do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:290 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:969 RIP: 0010:debug_print_object+0x16a/0x210 lib/debugobjects.c:326 Code: 41 88 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 92 00 00 00 48 8b 14 dd 60 02 41 88 4c 89 fe 48 c7 c7 00 f8 40 88 e8 36 2f b4 fd <0f> 0b 83 05 a9 f4 5e 06 01 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f RSP: 0018:ffff8801d8c3eda8 EFLAGS: 00010086 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8164d235 RDI: 0000000000000005 RBP: ffff8801d8c3ede8 R08: ffff8801d70aa280 R09: ffffed003b5c3eda R10: ffffed003b5c3eda R11: ffff8801dae1f6d7 R12: 0000000000000001 R13: ffffffff8939a760 R14: 0000000000000000 R15: ffffffff8840fca0 __debug_check_no_obj_freed lib/debugobjects.c:786 [inline] debug_check_no_obj_freed+0x3ae/0x58d lib/debugobjects.c:818 kmem_cache_free+0x202/0x290 mm/slab.c:3759 free_task_struct kernel/fork.c:163 [inline] free_task+0x16e/0x1f0 kernel/fork.c:457 __put_task_struct+0x2e6/0x620 kernel/fork.c:730 put_task_struct include/linux/sched/task.h:96 [inline] finish_task_switch+0x66c/0x900 
kernel/sched/core.c:2715 context_switch kernel/sched/core.c:2834 [inline] __schedule+0x8d7/0x21d0 kernel/sched/core.c:3480 schedule+0xfe/0x460 kernel/sched/core.c:3524 freezable_schedule include/linux/freezer.h:172 [inline] futex_wait_queue_me+0x3f9/0x840 kernel/futex.c:2530 futex_wait+0x45c/0xa50 kernel/futex.c:2645 do_futex+0x31a/0x26d0 kernel/futex.c:3528 __do_sys_futex kernel/futex.c:3589 [inline] __se_sys_futex kernel/futex.c:3557 [inline] __x64_sys_futex+0x472/0x6a0 kernel/futex.c:3557 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x446549 Code: e8 2c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f3a998f5da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: ffffffffffffffda RBX: 00000000006dbc38 RCX: 0000000000446549 RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000006dbc38 RBP: 00000000006dbc30 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc3c R13: 2f646e6162696e69 R14: 666e692f7665642f R15: 00000000006dbd2c Kernel Offset: disabled Reported-by: syzbot+71aff6ea121ffefc280f@syzkaller.appspotmail.com Fixes: ed7a01fd3fd7 ("RDMA/restrack: Release task struct which was hold by CM_ID object") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
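The reentry protection described in this commit, clearing the task pointer so that a second pass through the release path does nothing, can be sketched in userspace as follows. The names fake_task, fake_res, fake_put_task and res_release are hypothetical and stand in for task_struct, the restrack resource entry and the kernel's reference-dropping helper. This is an illustration of the pattern under those assumptions, not the patch itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for task_struct with a plain reference count. */
struct fake_task {
    int refcount;
};

/* Hypothetical stand-in for a tracked resource holding a task reference. */
struct fake_res {
    struct fake_task *task;
};

/* Drops one reference; stands in for the kernel's put helper. */
void fake_put_task(struct fake_task *t)
{
    t->refcount--;
}

/* Release path that is safe to enter more than once: the pointer is
 * nullified after the first put, so later calls return early. */
void res_release(struct fake_res *res)
{
    if (!res->task)
        return;

    fake_put_task(res->task);
    res->task = NULL;
}
```

With a task whose refcount starts at 2, calling res_release() twice on the same resource drops the count only once, because the second call sees a NULL pointer.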