Age | Commit message (Collapse) | Author |
|
If file offset is insane, we have to return error instead of kernel panic.
Reported-by: Eric Zhang <followme999@163.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If file was not opened with atomic write mode, but user uses atomic write
ioctl to fsync datas, in the flow, we should not fsync that file with
atomic write mode.
Fixes: 608514deba38 ("f2fs: set fsync mark only for the last dnode")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch fixes to clear FI_HOT_DATA correctly in below path:
- error handling in f2fs_ioc_start_atomic_write
- after commit atomic write in f2fs_ioc_commit_atomic_write
- after drop atomic write in drop_inmem_pages
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In f2fs_issue_flush, due to out-of-order execution of CPU, wake_up can
be called before we insert issue_list, result in long latency of
wait_for_completion. Fix this by adding smp_mb() to force the order of
related codes.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
It's time to issue all the discard commands, if user sets the idle time.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The block core requests modules with the "-iosched" name suffix, but
mq-deadline does not have that suffix. Add an alias.
Fixes: 945ffb60c11d ("mq-deadline: add blk-mq adaptation of the deadline ...")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The block core requests modules with the "-iosched" name suffix, but
bfq no longer has that suffix. Add an alias.
Fixes: ea25da48086d ("block, bfq: split bfq-iosched.c into multiple ...")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Miscellany
Here are a number of patches that make some changes/fixes and add a couple
of extensions to AF_RXRPC for kernel services to use. The changes and
fixes are:
(1) Use time64_t rather than u32 outside of protocol or
UAPI-representative structures.
(2) Use the correct time stamp when loading a key from an XDR-encoded
Kerberos 5 key.
(3) Fix IPv6 support.
(4) Fix some places where the error code is being incorrectly made
positive before returning.
(5) Remove some white space.
And the extensions:
(6) Add an end-of-Tx phase notification, thereby allowing kAFS to
transition the state on its own call record at the correct point,
rather than having to do it in advance and risk non-completion of the
call in the wrong state.
(7) Allow a kernel client call to be retried if it fails on a network
error, thereby making it possible for kAFS to iterate over a number of
IP addresses without having to reload the Tx queue and re-encrypt data
each time.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Florian Westphal says:
====================
addrlabel: don't use rtnl locking
addrlabel doesn't appear to require rtnl lock as the addrlabel
table uses a spinlock to serialize add/delete operations.
Also, entries are reference counted so it should be safe
to call the rtnl ops without the rtnl mutex.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There appears to be no need to use rtnl, addrlabel entries are refcounted
and add/delete is serialized by the addrlabel table spinlock.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2017-08-29
1) Fix dst_entry refcount imbalance when using socket policies.
From Lorenzo Colitti.
2) Fix locking when adding the ESP trailers.
3) Fix tailroom calculation for the ESP trailer by using
skb_tailroom instead of skb_availroom.
4) Fix some info leaks in xfrm_user.
From Mathias Krause.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that the IRDA code has moved under drivers/staging/irda/, update the
MAINTAINERS file with the new location.
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When bnxt VF-reps are not compiled in (CONFIG_BNXT_SRIOV is off)
bnxt_tc.c needs a dummy definition of the routine bnxt_vf_rep_get_fid().
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 2ae7408fedfe ("bnxt_en: bnxt: add TC flower filter offload support")
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
chip->bits_per_cell which is used to determine the NAND cell type
(SLC/MLC) should always have a value != 0.
Complain loudly if the value is 0 in nand_is_slc() to catch use before
correct initialization.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
|
|
commit c51d0ac59f24 ("mtd: nand: Move Samsung specific init/detection
logic in nand_samsung.c") introduced a regression for Samsung SLC NAND
chips. Prior to this commit chip->bits_per_cell was initialized by calling
nand_get_bits_per_cell() before using nand_is_slc().
With the offending commit this call is skipped, leaving
chip->bits_per_cell cleared to zero when the manufacturer specific
'.detect' function calls nand_is_slc() which in turn interprets
bits_per_cell != 1 as indication for an MLC chip.
The effect is that e.g. a K9F1G08U0F NAND chip is falsely detected as
MLC NAND with 4KiB page size rather than SLC with 2KiB page size.
Add a call to nand_get_bits_per_cell() before calling the .detect hook
function in nand_manufacturer_detect(), so that the nand_is_slc()
calls in the manufacturer specific code will return correct results.
Fixes: c51d0ac59f24 ("mtd: nand: Move Samsung specific init/detection logic in nand_samsung.c")
Cc: <stable@vger.kernel.org>
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
|
|
This reverts commit aac2fea94f7a3df8ad1eeb477eb2643f81fd5393.
It turns out that that patch was complete and utter garbage, and broke
KVM, resulting in odd oopses.
Quoting Andrea Arcangeli:
"The aforementioned commit has 3 bugs.
1) mmu_notifier_invalidate_range cannot be used in replacement of
mmu_notifier_invalidate_range_start/end.
For KVM mmu_notifier_invalidate_range is a noop and rightfully so.
A MMU notifier implementation has to implement either
->invalidate_range method or the invalidate_range_start/end
methods, not both. And if you implement invalidate_range_start/end
like KVM is forced to do, calling mmu_notifier_invalidate_range in
common code is a noop for KVM.
For those MMU notifiers that can get away only implementing
->invalidate_range, the ->invalidate_range is implicitly called by
mmu_notifier_invalidate_range_end(). And only those secondary MMUs
that share the same pagetable with the primary MMU (like AMD
iommuv2) can get away only implementing ->invalidate_range.
So all cases (THP on/off) are broken right now.
To fix this is enough to replace mmu_notifier_invalidate_range with
mmu_notifier_invalidate_range_start;mmu_notifier_invalidate_range_end.
Either that or call multiple mmu_notifier_invalidate_page like
before.
2) address + (1UL << compound_order(page) is buggy, it should be
PAGE_SIZE << compound_order(page), it's bytes not pages, 2M not
512.
3) The whole invalidate_range thing was an attempt to call a single
invalidate while walking multiple 4k ptes that maps the same THP
(after a pmd virtual split without physical compound page THP
split).
It's unclear if the rmap_walk will always provide an address that
is 2M aligned as parameter to try_to_unmap_one, in presence of THP.
I think it needs also an address &= (PAGE_SIZE <<
compound_order(page)) - 1 to be safe"
In general, we should stop making excuses for horrible MMU notifier
users. It's much more important that the core VM is sane and safe, than
letting MMU notifiers sleep.
So if some MMU notifier is sleeping under a spinlock, we need to fix the
notifier, not try to make excuses for that garbage in the core VM.
Reported-and-tested-by: Bernhard Held <berny156@gmx.de>
Reported-and-tested-by: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The only caller of this function is blk_start_request() in the same
file. Fix blk_start_request() description accordingly.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Since blk_mq_init_queue() initializes .nr_requests to the tag set
size and since that value is a good default for the skd driver, do
not overwrite the value set by blk_mq_init_queue(). This change
doubles the default value of .nr_requests.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Since sTec s1120 devices support 64-bit DMA it is not necessary
to request data buffer bouncing. Hence remove the
blk_queue_bounce_limit() call.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This reverts commit 35f0b6a779b8b7a98faefd7c1c660b4dac9a5c26.
We now conditionalize issuing of READ LOG PAGE on the TRUSTED
COMPUTING SUPPORTED bit in the identity data and this shouldn't be
necessary.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
ATA-8 and later mirrors the TRUSTED COMPUTING SUPPORTED bit in word 48 of
the IDENTIFY DEVICE data. Check this before issuing a READ LOG PAGE
command to avoid issues with buggy devices. The only downside is that
we can't support Security Send / Receive for a device with an older
revision due to the conflicting use of this field in earlier
specifications.
tj: The reason we need this is because some devices which don't
support READ LOG PAGE lock up after getting issued that command.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
for-4.14/block-postmerge
Pull NVMe changes from Christoph:
"Below is the current set of NVMe updates for Linux 4.14, now against
your postmerge branch, and with three more patches.
The biggest bit comes from Sagi and refactors the RDMA driver to
prepare for more code sharing in the setup and teardown path. But we
have various features and bug fixes from a lot of people as well."
|
|
Independent of the underlying hardware, kvm will now always handle
SIGP SET ARCHITECTURE as if czam were enabled. Therefore, let's not
only forward that bit but always set it.
While at it, add a comment regarding STHYI.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170829143108.14703-1-david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|
On x86, the plt header size is as same as the plt entry size, and can be
identified from shdr's sh_entsize of the plt.
But we can't assume that the sh_entsize of the plt shdr is always the
plt entry size in all architecture, and the plt header size may be not
as same as the plt entry size in some architecure.
On ARM, the plt header size is 20 bytes and the plt entry size is 12
bytes (don't consider the FOUR_WORD_PLT case) that refer to the binutils
implementation. The plt section is as follows:
Disassembly of section .plt:
000004a0 <__cxa_finalize@plt-0x14>:
4a0: e52de004 push {lr} ; (str lr, [sp, #-4]!)
4a4: e59fe004 ldr lr, [pc, #4] ; 4b0 <_init+0x1c>
4a8: e08fe00e add lr, pc, lr
4ac: e5bef008 ldr pc, [lr, #8]!
4b0: 00008424 .word 0x00008424
000004b4 <__cxa_finalize@plt>:
4b4: e28fc600 add ip, pc, #0, 12
4b8: e28cca08 add ip, ip, #8, 20 ; 0x8000
4bc: e5bcf424 ldr pc, [ip, #1060]! ; 0x424
000004c0 <printf@plt>:
4c0: e28fc600 add ip, pc, #0, 12
4c4: e28cca08 add ip, ip, #8, 20 ; 0x8000
4c8: e5bcf41c ldr pc, [ip, #1052]! ; 0x41c
On AARCH64, the plt header size is 32 bytes and the plt entry size is 16
bytes. The plt section is as follows:
Disassembly of section .plt:
0000000000000560 <__cxa_finalize@plt-0x20>:
560: a9bf7bf0 stp x16, x30, [sp,#-16]!
564: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8>
568: f944be11 ldr x17, [x16,#2424]
56c: 9125e210 add x16, x16, #0x978
570: d61f0220 br x17
574: d503201f nop
578: d503201f nop
57c: d503201f nop
0000000000000580 <__cxa_finalize@plt>:
580: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8>
584: f944c211 ldr x17, [x16,#2432]
588: 91260210 add x16, x16, #0x980
58c: d61f0220 br x17
0000000000000590 <__gmon_start__@plt>:
590: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8>
594: f944c611 ldr x17, [x16,#2440]
598: 91262210 add x16, x16, #0x988
59c: d61f0220 br x17
NOTES:
In addition to ARM and AARCH64, other architectures, such as
s390/alpha/mips/parisc/poperpc/sh/sparc/xtensa also need to consider
this issue.
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: David Tolnay <dtolnay@gmail.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: zhangmengting@huawei.com
Link: http://lkml.kernel.org/r/1496622849-21877-1-git-send-email-huawei.libin@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
Pull xen-blkback fix from Konrad:
"[...] A bug-fix when shutting down xen block backend driver with
multiple queues and the driver not clearing all of them."
|
|
Right now there is a potential hang situation for postcopy migrations,
if the guest is enabling storage keys on the target system during the
postcopy process.
For storage key virtualization, we have to forbid the empty zero page as
the storage key is a property of the physical page frame. As we enable
storage key handling lazily we then drop all mappings for empty zero
pages for lazy refaulting later on.
This does not work with the postcopy migration, which relies on the
empty zero page never triggering a fault again in the future. The reason
is that postcopy migration will simply read a page on the target system
if that page is a known zero page to fault in an empty zero page. At
the same time postcopy remembers that this page was already transferred
- so any future userfault on that page will NOT be retransmitted again
to avoid races.
If now the guest enters the storage key mode while in postcopy, we will
break this assumption of postcopy.
The solution is to disable the empty zero page for KVM guests early on
and not during storage key enablement. With this change, the postcopy
migration process is guaranteed to start after no zero pages are left.
As guest pages are very likely not empty zero pages anyway the memory
overhead is also pretty small.
While at it this also adds proper page table locking to the zero page
removal.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The z/VM hypervisor provides virtual disks (VDISK) which are backed by
main memory of the hypervisor. Those devices are seen as DASD FBA disks
within the Linux guest.
Whenever data is written to such a device, memory is allocated
on-the-fly by z/VM accordingly. This memory, however, is not being freed
if data on the device is deleted by the guest OS.
In order to make memory usable after deletion again, add discard support
to the FBA discipline.
While at it, update comments regarding the DASD_FEATURE_* flags.
Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Make this const as it is only used in a copy operation.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
If the kernel is compiled for z10 or later machines the uaccess
code inlines the mvcos instruction. The facility bit 27 which
indicates the availability of MVCOS has to be set. The have_mvcos
jump label will always be true.
Make the generation of the have_mvcos jump label conditional on
!CONFIG_HAVE_MARCH_Z10_FEATURES.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
With git commit 3446c13b268af86391d06611327006b059b8bab1
"s390/mm: four page table levels vs. fork"
s390 dropped its architecture specific version of arch_dup_mmap.
Now all functions defined by include/asm-generic/mm_hooks.h are
identical to the s390 versions. Use the generic header.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
There really is no "general extension" facility available. Use the
correct name "general instructions extension" facility instead.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Get rid of the goto and "out" label within vmcp_response_free() which
I added. This just makes the code harder to read than necessary.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Add DT bindings for the onboard SATA controller present on the MediaTek
SoCs.
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The commit 9aaf5a5f479b ("perf probe: Check kprobes blacklist when
adding new events"), 'perf probe' supports checking the blacklist of the
fuctions which can not be probed. But the checking condition is wrong,
that the end_addr of the symbol which is the start_addr of the next
symbol can't be included.
Committer notes:
IOW make it match its kernel counterpart in kernel/kprobes.c:
bool within_kprobe_blacklist(unsigned long addr)
Each entry have as its end address not its end address, but the first
address _outside_ that symbol, which for related functions, is the first
address of the next symbol, like these from kernel/trace/trace_probe.c:
0xffffffffbd198df0-0xffffffffbd198e40 print_type_u8
0xffffffffbd198e40-0xffffffffbd198e90 print_type_u16
0xffffffffbd198e90-0xffffffffbd198ee0 print_type_u32
0xffffffffbd198ee0-0xffffffffbd198f30 print_type_u64
0xffffffffbd198f30-0xffffffffbd198f80 print_type_s8
0xffffffffbd198f80-0xffffffffbd198fd0 print_type_s16
0xffffffffbd198fd0-0xffffffffbd199020 print_type_s32
0xffffffffbd199020-0xffffffffbd199070 print_type_s64
0xffffffffbd199070-0xffffffffbd1990c0 print_type_x8
0xffffffffbd1990c0-0xffffffffbd199110 print_type_x16
0xffffffffbd199110-0xffffffffbd199160 print_type_x32
0xffffffffbd199160-0xffffffffbd1991b0 print_type_x64
But not always:
0xffffffffbd1997b0-0xffffffffbd1997c0 fetch_kernel_stack_address (kernel/trace/trace_probe.c)
0xffffffffbd1c57f0-0xffffffffbd1c58b0 __context_tracking_enter (kernel/context_tracking.c)
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: zhangmengting@huawei.com
Fixes: 9aaf5a5f479b ("perf probe: Check kprobes blacklist when adding new events")
Link: http://lkml.kernel.org/r/1504011443-7269-1-git-send-email-huawei.libin@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
|
|
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-ide@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
|
|
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
|
|
If a restartable syscall is called using the indirect o32 syscall
handler - eg: syscall(__NR_waitid, ...), then it is possible for the
incorrect arguments to be passed to the syscall after it has been
restarted. This is because the syscall handler tries to shift all the
registers down one place in pt_regs so that when the syscall is restarted,
the "real" syscall is called instead. Unfortunately it only shifts the
arguments passed in registers, not the arguments on the user stack. This
causes the 4th argument to be duplicated when the syscall is restarted.
Fix by removing all the pt_regs shifting so that the indirect syscall
handler is called again when the syscall is restarted. The comment "some
syscalls like execve get their arguments from struct pt_regs" is long
out of date so this should now be safe.
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15856/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Since commit 669c4092225f ("MIPS: Give __secure_computing() access to
syscall arguments."), upon syscall entry when seccomp is enabled,
syscall_trace_enter() passes a carefully prepared struct seccomp_data
containing syscall arguments to __secure_computing(). Unfortunately it
directly uses mips_get_syscall_arg() and fails to take into account the
indirect O32 system calls (i.e. syscall(2)) which put the system call
number in a0 and have the arguments shifted up by one entry.
We can't just revert that commit as samples/bpf/tracex5 would break
again, so use syscall_get_arguments() which already takes indirect
syscalls into account instead of directly using mips_get_syscall_arg(),
similar to what populate_seccomp_data() does.
This also removes the redundant error checking of the
mips_get_syscall_arg() return value (get_user() already zeroes the
result if an argument from the stack can't be loaded).
Reported-by: James Cowgill <James.Cowgill@imgtec.com>
Fixes: 669c4092225f ("MIPS: Give __secure_computing() access to syscall arguments.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16994/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The bcm2835_timer_init() function emits an error message in case of a memory
allocation failure. This is pointless as the mm core does that already.
Remove this message.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
The STFLE bit 147 indicates whether the ESSA no-DAT operation code is
valid, the bit is not normally provided to the host; the host is
instead provided with an SCLP bit that indicates whether guests can
support the feature.
This patch:
* enables the STFLE bit in the guest if the corresponding SCLP bit is
present in the host.
* adds support for migrating the no-DAT bit in the PGSTEs
* fixes the software interpretation of the ESSA instruction that is
used when migrating, both for the new operation code and for the old
"set stable", as per specifications.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|
handle_sthyi() always writes to guest memory if the sthyi function
code is zero in order to fault in the page that later is written to.
However a function code of zero does not necessarily mean that a write
to guest memory happens: if the KVM host is running as a second level
guest under z/VM 6.2 the sthyi instruction is indicated to be
available to the KVM host, however if the instruction is executed it
will always return with a return code that indicates "unsupported
function code".
In such a case handle_sthyi() must not write to guest memory. This
means that the prior write access to fault in the guest page may
result in invalid guest exceptions, and/or invalid data modification.
In order to be architecture compliant simply remove the write_guest()
call.
Given that the guest assumed a write access anyway, this fix does not
qualify for -stable. This just makes sure the sthyi handler is
architecture compliant.
Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation")
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|
Allow for the enablement of MEF and the support for the extended
epoch in SIE and VSIE for the extended guest TOD-Clock.
A new interface is used for getting/setting a guest's extended TOD-Clock
that uses a single ioctl invocation, KVM_S390_VM_TOD_EXT. Since the
host time is a moving target that might see an epoch switch or STP sync
checks we need an atomic ioctl and cannot use the exisiting two
interfaces. The old method of getting and setting the guest TOD-Clock is
still retained and is used when the old ioctls are called.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|
Commit:
e91498589746 ("locking/lockdep/selftests: Add mixed read-write ABBA tests")
adds an explicit FAILURE to the locking selftest but overlooked the
fact that this kills lockdep. Fudge the test to avoid this.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/20170828124245.xlo2yshxq2btgmuf@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
COMPLETION_INITIALIZER_ONSTACK()
In theory, COMPLETION_INITIALIZER_ONSTACK() should never affect the
stack allocation of the caller. However, on some compilers, a temporary
structure was allocated for the return value of
COMPLETION_INITIALIZER_ONSTACK().
For example in write_journal() with LOCKDEP_COMPLETIONS=y (GCC is 7.1.1):
io_comp.comp = COMPLETION_INITIALIZER_ONSTACK(io_comp.comp);
2462: e8 00 00 00 00 callq 2467 <write_journal+0x47>
2467: 48 8d 85 80 fd ff ff lea -0x280(%rbp),%rax
246e: 48 c7 c6 00 00 00 00 mov $0x0,%rsi
2475: 48 c7 c2 00 00 00 00 mov $0x0,%rdx
x->done = 0;
247c: c7 85 90 fd ff ff 00 movl $0x0,-0x270(%rbp)
2483: 00 00 00
init_waitqueue_head(&x->wait);
2486: 48 8d 78 18 lea 0x18(%rax),%rdi
248a: e8 00 00 00 00 callq 248f <write_journal+0x6f>
if (commit_start + commit_sections <= ic->journal_sections) {
248f: 41 8b 87 a8 00 00 00 mov 0xa8(%r15),%eax
io_comp.comp = COMPLETION_INITIALIZER_ONSTACK(io_comp.comp);
2496: 48 8d bd e8 f9 ff ff lea -0x618(%rbp),%rdi
249d: 48 8d b5 90 fd ff ff lea -0x270(%rbp),%rsi
24a4: b9 17 00 00 00 mov $0x17,%ecx
24a9: f3 48 a5 rep movsq %ds:(%rsi),%es:(%rdi)
if (commit_start + commit_sections <= ic->journal_sections) {
24ac: 41 39 c6 cmp %eax,%r14d
io_comp.comp = COMPLETION_INITIALIZER_ONSTACK(io_comp.comp);
24af: 48 8d bd 90 fd ff ff lea -0x270(%rbp),%rdi
24b6: 48 8d b5 e8 f9 ff ff lea -0x618(%rbp),%rsi
24bd: b9 17 00 00 00 mov $0x17,%ecx
24c2: f3 48 a5 rep movsq %ds:(%rsi),%es:(%rdi)
We can obviously see the temporary structure allocated, and the compiler
also does two meaningless memcpy with "rep movsq".
And according to:
https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html#Statement-Exprs
The return value of a statement expression is returned by value, so the
temporary variable is created in COMPLETION_INITIALIZER_ONSTACK(), and
that's why the temporary structures are allocted.
To fix this, make the brace block in COMPLETION_INITIALIZER_ONSTACK()
return a pointer and dereference it outside the block rather than return
the whole structure, in this way, we are able to teach the compiler not
to do the unnecessary stack allocation.
This could also reduce the stack size even if !LOCKDEP, for example in
write_journal(), compiled with gcc 7.1.1, the result of command:
objdump -d drivers/md/dm-integrity.o | ./scripts/checkstack.pl x86
before:
0x0000246a write_journal [dm-integrity.o]: 696
after:
0x00002b7a write_journal [dm-integrity.o]: 296
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: walken@google.com
Cc: willy@infradead.org
Link: http://lkml.kernel.org/r/20170823152542.5150-3-boqun.feng@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
COMPLETION_INITIALIZER_ONSTACK() is supposed to be used as an initializer,
in other words, it should only be used in assignment expressions or
compound literals. So the usage in drivers/acpi/nfit/core.c:
COMPLETION_INITIALIZER_ONSTACK(flush.cmp);
... is inappropriate.
Besides, this usage could also break the build for another fix that
reduces stack sizes caused by COMPLETION_INITIALIZER_ONSTACK(), because
that fix changes COMPLETION_INITIALIZER_ONSTACK() from rvalue to lvalue,
and usage as above will report the following error:
drivers/acpi/nfit/core.c: In function 'acpi_nfit_flush_probe':
include/linux/completion.h:77:3: error: value computed is not used [-Werror=unused-value]
(*({ init_completion(&work); &work; }))
This patch fixes this by replacing COMPLETION_INITIALIZER_ONSTACK()
with init_completion() in acpi_nfit_flush_probe(), which does the
same initialization without any other problems.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: walken@google.com
Cc: willy@infradead.org
Link: http://lkml.kernel.org/r/20170824142239.15178-1-boqun.feng@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
architectures
All the locking related cmpxchg's in the following functions are
replaced with the _acquire variants:
- pv_queued_spin_steal_lock()
- trylock_clear_pending()
This change should help performance on architectures that use LL/SC.
The cmpxchg in pv_kick_node() is replaced with a relaxed version
with explicit memory barrier to make sure that it is fully ordered
in the writing of next->lock and the reading of pn->state whether
the cmpxchg is a success or failure without affecting performance in
non-LL/SC architectures.
On a 2-socket 12-core 96-thread Power8 system with pvqspinlock
explicitly enabled, the performance of a locking microbenchmark
with and without this patch on a 4.13-rc4 kernel with Xinhui's PPC
qspinlock patch were as follows:
# of thread w/o patch with patch % Change
----------- --------- ---------- --------
8 5054.8 Mop/s 5209.4 Mop/s +3.1%
16 3985.0 Mop/s 4015.0 Mop/s +0.8%
32 2378.2 Mop/s 2396.0 Mop/s +0.7%
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1502741222-24360-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|