summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-08-29f2fs: return error when accessing insane flie offsetJaegeuk Kim
If file offset is insane, we have to return error instead of kernel panic. Reported-by: Eric Zhang <followme999@163.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-29f2fs: trigger normal fsync for non-atomic_write fileChao Yu
If file was not opened with atomic write mode, but user uses atomic write ioctl to fsync datas, in the flow, we should not fsync that file with atomic write mode. Fixes: 608514deba38 ("f2fs: set fsync mark only for the last dnode") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-29f2fs: clear FI_HOT_DATA correctlyChao Yu
This patch fixes to clear FI_HOT_DATA correctly in below path: - error handling in f2fs_ioc_start_atomic_write - after commit atomic write in f2fs_ioc_commit_atomic_write - after drop atomic write in drop_inmem_pages Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-29f2fs: fix out-of-order execution in f2fs_issue_flushChao Yu
In f2fs_issue_flush, due to out-of-order execution of CPU, wake_up can be called before we insert issue_list, result in long latency of wait_for_completion. Fix this by adding smp_mb() to force the order of related codes. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-29f2fs: issue discard commands if gc_urgent is setJaegeuk Kim
It's time to issue all the discard commands, if user sets the idle time. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-29bsg: remove #if 0'ed codeChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29mq-deadline: Enable auto-loading when built as moduleBen Hutchings
The block core requests modules with the "-iosched" name suffix, but mq-deadline does not have that suffix. Add an alias. Fixes: 945ffb60c11d ("mq-deadline: add blk-mq adaptation of the deadline ...") Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29bfq: Re-enable auto-loading when built as a moduleBen Hutchings
The block core requests modules with the "-iosched" name suffix, but bfq no longer has that suffix. Add an alias. Fixes: ea25da48086d ("block, bfq: split bfq-iosched.c into multiple ...") Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29Merge tag 'rxrpc-next-20170829' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Miscellany Here are a number of patches that make some changes/fixes and add a couple of extensions to AF_RXRPC for kernel services to use. The changes and fixes are: (1) Use time64_t rather than u32 outside of protocol or UAPI-representative structures. (2) Use the correct time stamp when loading a key from an XDR-encoded Kerberos 5 key. (3) Fix IPv6 support. (4) Fix some places where the error code is being incorrectly made positive before returning. (5) Remove some white space. And the extensions: (6) Add an end-of-Tx phase notification, thereby allowing kAFS to transition the state on its own call record at the correct point, rather than having to do it in advance and risk non-completion of the call in the wrong state. (7) Allow a kernel client call to be retried if it fails on a network error, thereby making it possible for kAFS to iterate over a number of IP addresses without having to reload the Tx queue and re-encrypt data each time. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29Merge branch 'addrlabel-no-rtnl-locking'David S. Miller
Florian Westphal says: ==================== addrlabel: don't use rtnl locking addrlabel doesn't appear to require rtnl lock as the addrlabel table uses a spinlock to serialize add/delete operations. Also, entries are reference counted so it should be safe to call the rtnl ops without the rtnl mutex. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29addrlabel: add/delete/get can run without rtnlFlorian Westphal
There appears to be no need to use rtnl, addrlabel entries are refcounted and add/delete is serialized by the addrlabel table spinlock. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29selftests: add addrlabel add/delete to rtnetlink.shFlorian Westphal
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2017-08-29 1) Fix dst_entry refcount imbalance when using socket policies. From Lorenzo Colitti. 2) Fix locking when adding the ESP trailers. 3) Fix tailroom calculation for the ESP trailer by using skb_tailroom instead of skb_availroom. 4) Fix some info leaks in xfrm_user. From Mathias Krause. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29staging: irda: update MAINTAINERSGreg Kroah-Hartman
Now that the IRDA code has moved under drivers/staging/irda/, update the MAINTAINERS file with the new location. Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29bnxt_en: add a dummy definition for bnxt_vf_rep_get_fid()Sathya Perla
When bnxt VF-reps are not compiled in (CONFIG_BNXT_SRIOV is off) bnxt_tc.c needs a dummy definition of the routine bnxt_vf_rep_get_fid(). Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: 2ae7408fedfe ("bnxt_en: bnxt: add TC flower filter offload support") Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29mtd: nand: complain loudly when chip->bits_per_cell is not correctly initializedLothar Waßmann
chip->bits_per_cell which is used to determine the NAND cell type (SLC/MLC) should always have a value != 0. Complain loudly if the value is 0 in nand_is_slc() to catch use before correct initialization. Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-08-29mtd: nand: make Samsung SLC NAND usable againLothar Waßmann
commit c51d0ac59f24 ("mtd: nand: Move Samsung specific init/detection logic in nand_samsung.c") introduced a regression for Samsung SLC NAND chips. Prior to this commit chip->bits_per_cell was initialized by calling nand_get_bits_per_cell() before using nand_is_slc(). With the offending commit this call is skipped, leaving chip->bits_per_cell cleared to zero when the manufacturer specific '.detect' function calls nand_is_slc() which in turn interprets bits_per_cell != 1 as indication for an MLC chip. The effect is that e.g. a K9F1G08U0F NAND chip is falsely detected as MLC NAND with 4KiB page size rather than SLC with 2KiB page size. Add a call to nand_get_bits_per_cell() before calling the .detect hook function in nand_manufacturer_detect(), so that the nand_is_slc() calls in the manufacturer specific code will return correct results. Fixes: c51d0ac59f24 ("mtd: nand: Move Samsung specific init/detection logic in nand_samsung.c") Cc: <stable@vger.kernel.org> Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-08-29Revert "rmap: do not call mmu_notifier_invalidate_page() under ptl"Linus Torvalds
This reverts commit aac2fea94f7a3df8ad1eeb477eb2643f81fd5393. It turns out that that patch was complete and utter garbage, and broke KVM, resulting in odd oopses. Quoting Andrea Arcangeli: "The aforementioned commit has 3 bugs. 1) mmu_notifier_invalidate_range cannot be used in replacement of mmu_notifier_invalidate_range_start/end. For KVM mmu_notifier_invalidate_range is a noop and rightfully so. A MMU notifier implementation has to implement either ->invalidate_range method or the invalidate_range_start/end methods, not both. And if you implement invalidate_range_start/end like KVM is forced to do, calling mmu_notifier_invalidate_range in common code is a noop for KVM. For those MMU notifiers that can get away only implementing ->invalidate_range, the ->invalidate_range is implicitly called by mmu_notifier_invalidate_range_end(). And only those secondary MMUs that share the same pagetable with the primary MMU (like AMD iommuv2) can get away only implementing ->invalidate_range. So all cases (THP on/off) are broken right now. To fix this is enough to replace mmu_notifier_invalidate_range with mmu_notifier_invalidate_range_start;mmu_notifier_invalidate_range_end. Either that or call multiple mmu_notifier_invalidate_page like before. 2) address + (1UL << compound_order(page) is buggy, it should be PAGE_SIZE << compound_order(page), it's bytes not pages, 2M not 512. 3) The whole invalidate_range thing was an attempt to call a single invalidate while walking multiple 4k ptes that maps the same THP (after a pmd virtual split without physical compound page THP split). It's unclear if the rmap_walk will always provide an address that is 2M aligned as parameter to try_to_unmap_one, in presence of THP. I think it needs also an address &= (PAGE_SIZE << compound_order(page)) - 1 to be safe" In general, we should stop making excuses for horrible MMU notifier users. It's much more important that the core VM is sane and safe, than letting MMU notifiers sleep. So if some MMU notifier is sleeping under a spinlock, we need to fix the notifier, not try to make excuses for that garbage in the core VM. Reported-and-tested-by: Bernhard Held <berny156@gmx.de> Reported-and-tested-by: Adam Borowski <kilobyte@angband.pl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: axie <axie@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-29block: Make blk_dequeue_request() staticDamien Le Moal
The only caller of this function is blk_start_request() in the same file. Fix blk_start_request() description accordingly. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29skd: Let the block layer core choose .nr_requestsBart Van Assche
Since blk_mq_init_queue() initializes .nr_requests to the tag set size and since that value is a good default for the skd driver, do not overwrite the value set by blk_mq_init_queue(). This change doubles the default value of .nr_requests. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29skd: Remove blk_queue_bounce_limit() callBart Van Assche
Since sTec s1120 devices support 64-bit DMA it is not necessary to request data buffer bouncing. Hence remove the blk_queue_bounce_limit() call. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-29Revert "libata: quirk read log on no-name M.2 SSD"Tejun Heo
This reverts commit 35f0b6a779b8b7a98faefd7c1c660b4dac9a5c26. We now conditionalize issuing of READ LOG PAGE on the TRUSTED COMPUTING SUPPORTED bit in the identity data and this shouldn't be necessary. Signed-off-by: Tejun Heo <tj@kernel.org>
2017-08-29libata: check for trusted computing in IDENTIFY DEVICE dataChristoph Hellwig
ATA-8 and later mirrors the TRUSTED COMPUTING SUPPORTED bit in word 48 of the IDENTIFY DEVICE data. Check this before issuing a READ LOG PAGE command to avoid issues with buggy devices. The only downside is that we can't support Security Send / Receive for a device with an older revision due to the conflicting use of this field in earlier specifications. tj: The reason we need this is because some devices which don't support READ LOG PAGE lock up after getting issued that command. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2017-08-29Merge branch 'nvme-4.14' of git://git.infradead.org/nvme into ↵Jens Axboe
for-4.14/block-postmerge Pull NVMe changes from Christoph: "Below is the current set of NVMe updates for Linux 4.14, now against your postmerge branch, and with three more patches. The biggest bit comes from Sagi and refactors the RDMA driver to prepare for more code sharing in the setup and teardown path. But we have various features and bug fixes from a lot of people as well."
2017-08-29KVM: s390: we are always in czam modeDavid Hildenbrand
Independent of the underlying hardware, kvm will now always handle SIGP SET ARCHITECTURE as if czam were enabled. Therefore, let's not only forward that bit but always set it. While at it, add a comment regarding STHYI. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170829143108.14703-1-david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-29perf symbols: Fix plt entry calculation for ARM and AARCH64Li Bin
On x86, the plt header size is as same as the plt entry size, and can be identified from shdr's sh_entsize of the plt. But we can't assume that the sh_entsize of the plt shdr is always the plt entry size in all architecture, and the plt header size may be not as same as the plt entry size in some architecure. On ARM, the plt header size is 20 bytes and the plt entry size is 12 bytes (don't consider the FOUR_WORD_PLT case) that refer to the binutils implementation. The plt section is as follows: Disassembly of section .plt: 000004a0 <__cxa_finalize@plt-0x14>: 4a0: e52de004 push {lr} ; (str lr, [sp, #-4]!) 4a4: e59fe004 ldr lr, [pc, #4] ; 4b0 <_init+0x1c> 4a8: e08fe00e add lr, pc, lr 4ac: e5bef008 ldr pc, [lr, #8]! 4b0: 00008424 .word 0x00008424 000004b4 <__cxa_finalize@plt>: 4b4: e28fc600 add ip, pc, #0, 12 4b8: e28cca08 add ip, ip, #8, 20 ; 0x8000 4bc: e5bcf424 ldr pc, [ip, #1060]! ; 0x424 000004c0 <printf@plt>: 4c0: e28fc600 add ip, pc, #0, 12 4c4: e28cca08 add ip, ip, #8, 20 ; 0x8000 4c8: e5bcf41c ldr pc, [ip, #1052]! ; 0x41c On AARCH64, the plt header size is 32 bytes and the plt entry size is 16 bytes. The plt section is as follows: Disassembly of section .plt: 0000000000000560 <__cxa_finalize@plt-0x20>: 560: a9bf7bf0 stp x16, x30, [sp,#-16]! 564: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 568: f944be11 ldr x17, [x16,#2424] 56c: 9125e210 add x16, x16, #0x978 570: d61f0220 br x17 574: d503201f nop 578: d503201f nop 57c: d503201f nop 0000000000000580 <__cxa_finalize@plt>: 580: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 584: f944c211 ldr x17, [x16,#2432] 588: 91260210 add x16, x16, #0x980 58c: d61f0220 br x17 0000000000000590 <__gmon_start__@plt>: 590: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 594: f944c611 ldr x17, [x16,#2440] 598: 91262210 add x16, x16, #0x988 59c: d61f0220 br x17 NOTES: In addition to ARM and AARCH64, other architectures, such as s390/alpha/mips/parisc/poperpc/sh/sparc/xtensa also need to consider this issue. Signed-off-by: Li Bin <huawei.libin@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: David Tolnay <dtolnay@gmail.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: zhangmengting@huawei.com Link: http://lkml.kernel.org/r/1496622849-21877-1-git-send-email-huawei.libin@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-29Merge branch 'stable/for-jens-4.13' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus Pull xen-blkback fix from Konrad: "[...] A bug-fix when shutting down xen block backend driver with multiple queues and the driver not clearing all of them."
2017-08-29s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangsChristian Borntraeger
Right now there is a potential hang situation for postcopy migrations, if the guest is enabling storage keys on the target system during the postcopy process. For storage key virtualization, we have to forbid the empty zero page as the storage key is a property of the physical page frame. As we enable storage key handling lazily we then drop all mappings for empty zero pages for lazy refaulting later on. This does not work with the postcopy migration, which relies on the empty zero page never triggering a fault again in the future. The reason is that postcopy migration will simply read a page on the target system if that page is a known zero page to fault in an empty zero page. At the same time postcopy remembers that this page was already transferred - so any future userfault on that page will NOT be retransmitted again to avoid races. If now the guest enters the storage key mode while in postcopy, we will break this assumption of postcopy. The solution is to disable the empty zero page for KVM guests early on and not during storage key enablement. With this change, the postcopy migration process is guaranteed to start after no zero pages are left. As guest pages are very likely not empty zero pages anyway the memory overhead is also pretty small. While at it this also adds proper page table locking to the zero page removal. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29s390/dasd: Add discard support for FBA devicesJan Höppner
The z/VM hypervisor provides virtual disks (VDISK) which are backed by main memory of the hypervisor. Those devices are seen as DASD FBA disks within the Linux guest. Whenever data is written to such a device, memory is allocated on-the-fly by z/VM accordingly. This memory, however, is not being freed if data on the device is deleted by the guest OS. In order to make memory usable after deletion again, add discard support to the FBA discipline. While at it, update comments regarding the DASD_FEATURE_* flags. Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29s390/zcrypt: make CPRBX constBhumika Goyal
Make this const as it is only used in a copy operation. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29s390/uaccess: avoid mvcos jump labelMartin Schwidefsky
If the kernel is compiled for z10 or later machines the uaccess code inlines the mvcos instruction. The facility bit 27 which indicates the availability of MVCOS has to be set. The have_mvcos jump label will always be true. Make the generation of the have_mvcos jump label conditional on !CONFIG_HAVE_MARCH_Z10_FEATURES. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29s390/mm: use generic mm_hooksMartin Schwidefsky
With git commit 3446c13b268af86391d06611327006b059b8bab1 "s390/mm: four page table levels vs. fork" s390 dropped its architecture specific version of arch_dup_mmap. Now all functions defined by include/asm-generic/mm_hooks.h are identical to the s390 versions. Use the generic header. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29s390/facilities: fix typoHeiko Carstens
There really is no "general extension" facility available. Use the correct name "general instructions extension" facility instead. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29s390/vmcp: simplify vmcp_response_free()Heiko Carstens
Get rid of the goto and "out" label within vmcp_response_free() which I added. This just makes the code harder to read than necessary. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29dt-bindings: ata: add DT bindings for MediaTek SATA controllerRyder Lee
Add DT bindings for the onboard SATA controller present on the MediaTek SoCs. Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2017-08-29perf probe: Fix kprobe blacklist checking conditionLi Bin
The commit 9aaf5a5f479b ("perf probe: Check kprobes blacklist when adding new events"), 'perf probe' supports checking the blacklist of the fuctions which can not be probed. But the checking condition is wrong, that the end_addr of the symbol which is the start_addr of the next symbol can't be included. Committer notes: IOW make it match its kernel counterpart in kernel/kprobes.c: bool within_kprobe_blacklist(unsigned long addr) Each entry have as its end address not its end address, but the first address _outside_ that symbol, which for related functions, is the first address of the next symbol, like these from kernel/trace/trace_probe.c: 0xffffffffbd198df0-0xffffffffbd198e40 print_type_u8 0xffffffffbd198e40-0xffffffffbd198e90 print_type_u16 0xffffffffbd198e90-0xffffffffbd198ee0 print_type_u32 0xffffffffbd198ee0-0xffffffffbd198f30 print_type_u64 0xffffffffbd198f30-0xffffffffbd198f80 print_type_s8 0xffffffffbd198f80-0xffffffffbd198fd0 print_type_s16 0xffffffffbd198fd0-0xffffffffbd199020 print_type_s32 0xffffffffbd199020-0xffffffffbd199070 print_type_s64 0xffffffffbd199070-0xffffffffbd1990c0 print_type_x8 0xffffffffbd1990c0-0xffffffffbd199110 print_type_x16 0xffffffffbd199110-0xffffffffbd199160 print_type_x32 0xffffffffbd199160-0xffffffffbd1991b0 print_type_x64 But not always: 0xffffffffbd1997b0-0xffffffffbd1997c0 fetch_kernel_stack_address (kernel/trace/trace_probe.c) 0xffffffffbd1c57f0-0xffffffffbd1c58b0 __context_tracking_enter (kernel/context_tracking.c) Signed-off-by: Li Bin <huawei.libin@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: zhangmengting@huawei.com Fixes: 9aaf5a5f479b ("perf probe: Check kprobes blacklist when adding new events") Link: http://lkml.kernel.org/r/1504011443-7269-1-git-send-email-huawei.libin@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-29virt: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org>
2017-08-29macintosh: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org
2017-08-29ide: pmac: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linux-ide@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org
2017-08-29microblaze: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Michal Simek <monstr@monstr.eu>
2017-08-29MIPS: Remove pt_regs adjustments in indirect syscall handlerJames Cowgill
If a restartable syscall is called using the indirect o32 syscall handler - eg: syscall(__NR_waitid, ...), then it is possible for the incorrect arguments to be passed to the syscall after it has been restarted. This is because the syscall handler tries to shift all the registers down one place in pt_regs so that when the syscall is restarted, the "real" syscall is called instead. Unfortunately it only shifts the arguments passed in registers, not the arguments on the user stack. This causes the 4th argument to be duplicated when the syscall is restarted. Fix by removing all the pt_regs shifting so that the indirect syscall handler is called again when the syscall is restarted. The comment "some syscalls like execve get their arguments from struct pt_regs" is long out of date so this should now be safe. Signed-off-by: James Cowgill <James.Cowgill@imgtec.com> Reviewed-by: James Hogan <james.hogan@imgtec.com> Tested-by: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15856/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2017-08-29MIPS: seccomp: Fix indirect syscall argsJames Hogan
Since commit 669c4092225f ("MIPS: Give __secure_computing() access to syscall arguments."), upon syscall entry when seccomp is enabled, syscall_trace_enter() passes a carefully prepared struct seccomp_data containing syscall arguments to __secure_computing(). Unfortunately it directly uses mips_get_syscall_arg() and fails to take into account the indirect O32 system calls (i.e. syscall(2)) which put the system call number in a0 and have the arguments shifted up by one entry. We can't just revert that commit as samples/bpf/tracex5 would break again, so use syscall_get_arguments() which already takes indirect syscalls into account instead of directly using mips_get_syscall_arg(), similar to what populate_seccomp_data() does. This also removes the redundant error checking of the mips_get_syscall_arg() return value (get_user() already zeroes the result if an argument from the stack can't be loaded). Reported-by: James Cowgill <James.Cowgill@imgtec.com> Fixes: 669c4092225f ("MIPS: Give __secure_computing() access to syscall arguments.") Signed-off-by: James Hogan <james.hogan@imgtec.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: David Daney <david.daney@cavium.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/16994/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2017-08-29clocksource/drivers/bcm2835: Remove message for a memory allocation failureMarkus Elfring
The bcm2835_timer_init() function emits an error message in case of a memory allocation failure. This is pointless as the mm core does that already. Remove this message. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-29KVM: s390: expose no-DAT to guest and migration supportClaudio Imbrenda
The STFLE bit 147 indicates whether the ESSA no-DAT operation code is valid, the bit is not normally provided to the host; the host is instead provided with an SCLP bit that indicates whether guests can support the feature. This patch: * enables the STFLE bit in the guest if the corresponding SCLP bit is present in the host. * adds support for migrating the no-DAT bit in the PGSTEs * fixes the software interpretation of the ESSA instruction that is used when migrating, both for the new operation code and for the old "set stable", as per specifications. Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-29KVM: s390: sthyi: remove invalid guest write accessHeiko Carstens
handle_sthyi() always writes to guest memory if the sthyi function code is zero in order to fault in the page that later is written to. However a function code of zero does not necessarily mean that a write to guest memory happens: if the KVM host is running as a second level guest under z/VM 6.2 the sthyi instruction is indicated to be available to the KVM host, however if the instruction is executed it will always return with a return code that indicates "unsupported function code". In such a case handle_sthyi() must not write to guest memory. This means that the prior write access to fault in the guest page may result in invalid guest exceptions, and/or invalid data modification. In order to be architecture compliant simply remove the write_guest() call. Given that the guest assumed a write access anyway, this fix does not qualify for -stable. This just makes sure the sthyi handler is architecture compliant. Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation") Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-29KVM: s390: Multiple Epoch Facility supportCollin L. Walling
Allow for the enablement of MEF and the support for the extended epoch in SIE and VSIE for the extended guest TOD-Clock. A new interface is used for getting/setting a guest's extended TOD-Clock that uses a single ioctl invocation, KVM_S390_VM_TOD_EXT. Since the host time is a moving target that might see an epoch switch or STP sync checks we need an atomic ioctl and cannot use the exisiting two interfaces. The old method of getting and setting the guest TOD-Clock is still retained and is used when the old ioctls are called. Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-08-29locking/lockdep/selftests: Fix mixed read-write ABBA testsPeter Zijlstra
Commit: e91498589746 ("locking/lockdep/selftests: Add mixed read-write ABBA tests") adds an explicit FAILURE to the locking selftest but overlooked the fact that this kills lockdep. Fudge the test to avoid this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Link: http://lkml.kernel.org/r/20170828124245.xlo2yshxq2btgmuf@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29sched/completion: Avoid unnecessary stack allocation for ↵Boqun Feng
COMPLETION_INITIALIZER_ONSTACK() In theory, COMPLETION_INITIALIZER_ONSTACK() should never affect the stack allocation of the caller. However, on some compilers, a temporary structure was allocated for the return value of COMPLETION_INITIALIZER_ONSTACK(). For example in write_journal() with LOCKDEP_COMPLETIONS=y (GCC is 7.1.1): io_comp.comp = COMPLETION_INITIALIZER_ONSTACK(io_comp.comp); 2462: e8 00 00 00 00 callq 2467 <write_journal+0x47> 2467: 48 8d 85 80 fd ff ff lea -0x280(%rbp),%rax 246e: 48 c7 c6 00 00 00 00 mov $0x0,%rsi 2475: 48 c7 c2 00 00 00 00 mov $0x0,%rdx x->done = 0; 247c: c7 85 90 fd ff ff 00 movl $0x0,-0x270(%rbp) 2483: 00 00 00 init_waitqueue_head(&x->wait); 2486: 48 8d 78 18 lea 0x18(%rax),%rdi 248a: e8 00 00 00 00 callq 248f <write_journal+0x6f> if (commit_start + commit_sections <= ic->journal_sections) { 248f: 41 8b 87 a8 00 00 00 mov 0xa8(%r15),%eax io_comp.comp = COMPLETION_INITIALIZER_ONSTACK(io_comp.comp); 2496: 48 8d bd e8 f9 ff ff lea -0x618(%rbp),%rdi 249d: 48 8d b5 90 fd ff ff lea -0x270(%rbp),%rsi 24a4: b9 17 00 00 00 mov $0x17,%ecx 24a9: f3 48 a5 rep movsq %ds:(%rsi),%es:(%rdi) if (commit_start + commit_sections <= ic->journal_sections) { 24ac: 41 39 c6 cmp %eax,%r14d io_comp.comp = COMPLETION_INITIALIZER_ONSTACK(io_comp.comp); 24af: 48 8d bd 90 fd ff ff lea -0x270(%rbp),%rdi 24b6: 48 8d b5 e8 f9 ff ff lea -0x618(%rbp),%rsi 24bd: b9 17 00 00 00 mov $0x17,%ecx 24c2: f3 48 a5 rep movsq %ds:(%rsi),%es:(%rdi) We can obviously see the temporary structure allocated, and the compiler also does two meaningless memcpy with "rep movsq". And according to: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html#Statement-Exprs The return value of a statement expression is returned by value, so the temporary variable is created in COMPLETION_INITIALIZER_ONSTACK(), and that's why the temporary structures are allocted. To fix this, make the brace block in COMPLETION_INITIALIZER_ONSTACK() return a pointer and dereference it outside the block rather than return the whole structure, in this way, we are able to teach the compiler not to do the unnecessary stack allocation. This could also reduce the stack size even if !LOCKDEP, for example in write_journal(), compiled with gcc 7.1.1, the result of command: objdump -d drivers/md/dm-integrity.o | ./scripts/checkstack.pl x86 before: 0x0000246a write_journal [dm-integrity.o]: 696 after: 0x00002b7a write_journal [dm-integrity.o]: 296 Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: walken@google.com Cc: willy@infradead.org Link: http://lkml.kernel.org/r/20170823152542.5150-3-boqun.feng@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuseBoqun Feng
COMPLETION_INITIALIZER_ONSTACK() is supposed to be used as an initializer, in other words, it should only be used in assignment expressions or compound literals. So the usage in drivers/acpi/nfit/core.c: COMPLETION_INITIALIZER_ONSTACK(flush.cmp); ... is inappropriate. Besides, this usage could also break the build for another fix that reduces stack sizes caused by COMPLETION_INITIALIZER_ONSTACK(), because that fix changes COMPLETION_INITIALIZER_ONSTACK() from rvalue to lvalue, and usage as above will report the following error: drivers/acpi/nfit/core.c: In function 'acpi_nfit_flush_probe': include/linux/completion.h:77:3: error: value computed is not used [-Werror=unused-value] (*({ init_completion(&work); &work; })) This patch fixes this by replacing COMPLETION_INITIALIZER_ONSTACK() with init_completion() in acpi_nfit_flush_probe(), which does the same initialization without any other problems. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: walken@google.com Cc: willy@infradead.org Link: http://lkml.kernel.org/r/20170824142239.15178-1-boqun.feng@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29locking/pvqspinlock: Relax cmpxchg's to improve performance on some ↵Waiman Long
architectures All the locking related cmpxchg's in the following functions are replaced with the _acquire variants: - pv_queued_spin_steal_lock() - trylock_clear_pending() This change should help performance on architectures that use LL/SC. The cmpxchg in pv_kick_node() is replaced with a relaxed version with explicit memory barrier to make sure that it is fully ordered in the writing of next->lock and the reading of pn->state whether the cmpxchg is a success or failure without affecting performance in non-LL/SC architectures. On a 2-socket 12-core 96-thread Power8 system with pvqspinlock explicitly enabled, the performance of a locking microbenchmark with and without this patch on a 4.13-rc4 kernel with Xinhui's PPC qspinlock patch were as follows: # of thread w/o patch with patch % Change ----------- --------- ---------- -------- 8 5054.8 Mop/s 5209.4 Mop/s +3.1% 16 3985.0 Mop/s 4015.0 Mop/s +0.8% 32 2378.2 Mop/s 2396.0 Mop/s +0.7% Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pan Xinhui <xinhui@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1502741222-24360-1-git-send-email-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>