Age | Commit message | Author |
|
There is no more reason to check the return value of
check_symbol_range().
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Collect the ignored patterns to is_ignored_symbol().
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Refactoring for shortening the code.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Unless the address range matters, symbols can be ignored earlier,
which avoids unneeded memory allocation.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Add 'const' where a function does not write to the pointer dereferences.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
The callers of this function expect (unsigned char *). I do not see
a good reason to make this function return (void *).
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
You can do equivalent things with strspn(). I do not see noticeable
performance difference.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
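As a rough illustration of the strspn() replacement described above, the sketch below compares a hand-written loop with the library call. It assumes the replaced helper counted leading '_' characters; the message itself does not say so, and all function names here are hypothetical, not taken from scripts/kallsyms.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hand-written loop: count the '_' characters at the start of name. */
static size_t count_underscores_loop(const char *name)
{
	size_t n = 0;

	while (name[n] == '_')
		n++;
	return n;
}

/* Same result via strspn(): length of the initial run made up only of '_'. */
static size_t count_underscores_strspn(const char *name)
{
	return strspn(name, "_");
}

/* Illustrative check that both variants agree for a given input. */
static void check_equivalent(const char *name)
{
	assert(count_underscores_loop(name) == count_underscores_strspn(name));
}
```

Both variants stop at the first character that is not '_', which includes the terminating NUL, so neither reads past the end of the string.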
|
|
sym_entry::sym is (unsigned char *) instead of (char *) because
kallsyms exploits the MSB for compression, and the characters are
used as the index of token_profit array.
However, it requires casting (unsigned char *) to (char *) in some
places since standard library functions such as strcmp(), strlen()
expect (char *).
Introduce a new helper, sym_name(), which advances the given pointer
by 1 and casts it to (char *).
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
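A minimal sketch of what such a helper could look like, based only on the description above (advance by 1, cast to char *). The struct is a simplified stand-in, and the assumption that the skipped first byte holds something other than the name (for example a one-character type code) is not stated in the message.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for the kallsyms entry; only the fields used here. */
struct sym_entry {
	unsigned int len;
	unsigned char *sym;
};

/* Skip the first byte of the stored string and hand back a plain char pointer. */
static char *sym_name(const struct sym_entry *s)
{
	assert(s->sym != NULL);
	return (char *)s->sym + 1;
}

/* Hypothetical caller: no cast is needed at the strcmp() call site. */
static int sym_is(const struct sym_entry *s, const char *name)
{
	return strcmp(sym_name(s), name) == 0;
}
```

The point of the helper is that the cast lives in one place, so callers of strcmp() or strlen() stay free of pointer arithmetic and casts.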
|
|
l <= strlen(sym_name) is unnecessary for prefix matching.
strncmp() will do.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
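To illustrate why the length check is redundant, here is a small sketch of prefix matching with strncmp() alone. has_prefix() is a hypothetical name, not a function from the kallsyms source.

```c
#include <assert.h>
#include <string.h>

/*
 * Return non-zero if name starts with prefix.
 * strncmp() stops at the first NUL in either string, so a name that is
 * shorter than the prefix compares unequal without a separate strlen() test.
 */
static int has_prefix(const char *name, const char *prefix)
{
	assert(name != NULL && prefix != NULL);
	return strncmp(name, prefix, strlen(prefix)) == 0;
}
```

For example, has_prefix("__c", "__crc_") compares the NUL at name[3] against 'r' and reports a mismatch, which is the case the explicit length check was guarding.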
|
|
Since commit 6f00df24ee39 ("[PATCH] Strip local symbols from kallsyms"),
all symbols starting with '$' are ignored.
is_arm_mapping_symbol() particularly ignores $a, $t, etc. but it is
redundant.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
Currently, record_relative_base() iterates over the entire table to
find the minimum address, but it is not efficient because we sort
the table anyway.
After sort_symbols(), the table is sorted by address. (kallsyms parses
the 'nm -n' output, so the data is already sorted by address, but this
commit does not rely on it.)
Move record_relative_base() after sort_symbols(), and take the first
non-absolute symbol value.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
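The idea above can be sketched as follows: once the table is sorted by address, the minimum non-absolute address is simply the first non-absolute entry. The struct, the is_absolute flag and the function name are hypothetical simplifications, not the actual kallsyms definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified symbol record; is_absolute marks symbols excluded from the base. */
struct sym {
	unsigned long long addr;
	int is_absolute;
};

/*
 * Return the address of the first non-absolute symbol, or 0 if there is none.
 * Because the table is sorted, this is also the lowest such address.
 */
static unsigned long long first_relative_addr(const struct sym *table, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		/* Precondition: the caller has already sorted the table by address. */
		assert(i == 0 || table[i - 1].addr <= table[i].addr);
		if (!table[i].is_absolute)
			return table[i].addr;
	}
	return 0;
}
```

Compared with a full scan for the minimum, the loop usually exits after the first few entries.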
|
|
Currently, build_initial_tok_table() trims unused symbols, but it is
called after sort_symbols().
It is not efficient to sort the huge table that contains unused entries.
Shrink the table before sorting it.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
build_initial_tok_table() overwrites unused sym_entry to shrink the
table size. Before the entry is overwritten, table[i].sym must be freed
since it is malloc'ed data.
This fixes the 'definitely lost' report from valgrind. I ran valgrind
against x86_64_defconfig of v5.4-rc8 kernel, and here is the summary:
[Before the fix]
LEAK SUMMARY:
definitely lost: 53,184 bytes in 2,874 blocks
[After the fix]
LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocks
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
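A minimal sketch of the leak pattern and its fix, assuming a table that is compacted in place. The struct layout, the used flag and shrink_table() are hypothetical names for illustration; the real code decides validity differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified entry: sym points to malloc'ed data owned by the table. */
struct entry {
	char *sym;
	int used;
};

/*
 * Compact the table in place, keeping only used entries.
 * An unused entry is freed when it is visited, before its slot can be
 * overwritten by a later used entry. Returns the new entry count.
 */
static size_t shrink_table(struct entry *table, size_t count)
{
	size_t i, pos = 0;

	assert(table != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (table[i].used) {
			if (pos != i)
				table[pos] = table[i];
			pos++;
		} else {
			/* Without this free(), the pointer is lost once the slot is reused. */
			free(table[i].sym);
		}
	}
	return pos;
}
```

After the call, slots at index pos and beyond hold stale pointers and must not be freed again by the caller.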
|
|
This is not defined in the standard headers. #ifndef is unneeded.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
The AML code implementing the WMI methods creates a variable length
field to hold the input data we pass like this:
CreateDWordField (Arg1, 0x0C, DSZI)
Local5 = DSZI /* \HWMC.DSZI */
CreateField (Arg1, 0x80, (Local5 * 0x08), DAIN)
If we pass 0 as bios_args.datasize argument then (Local5 * 0x08)
is 0 which results in these errors:
[ 71.973305] ACPI BIOS Error (bug): Attempt to CreateField of length zero (20190816/dsopcode-133)
[ 71.973332] ACPI Error: Aborting method \HWMC due to previous error (AE_AML_OPERAND_VALUE) (20190816/psparse-529)
[ 71.973413] ACPI Error: Aborting method \_SB.WMID.WMAA due to previous error (AE_AML_OPERAND_VALUE) (20190816/psparse-529)
These errors cause our HPWMI_WIRELESS2_QUERY calls to always fail. For read commands
like HPWMI_WIRELESS2_QUERY the DSZI value is not used / checked, except for
read commands where extra input is needed to specify exactly what to read.
So for HPWMI_WIRELESS2_QUERY we can safely pass the size of the expected
output as insize to hp_wmi_perform_query(), as we are already doing for all
other HPWMI_READ commands we send. Doing so fixes these errors.
Cc: stable@vger.kernel.org
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=197007
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201981
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1520703
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
The HP WMI calls may take up to 128 bytes of data as input, and
the AML methods implementing the WMI calls declare a couple of fields for
accessing input in different sizes, specifically the HWMC method contains:
CreateField (Arg1, 0x80, 0x0400, D128)
Even though we do not use any of the WMI command-types which need a buffer
of this size, the ACPI interpreter still tries to create it as it is
declared in generic code at the top of the HWMC method which runs before
the code looks at which command-type is requested.
This results in many of these errors on many different HP laptop models:
[ 14.459261] ACPI Error: Field [D128] at 1152 exceeds Buffer [NULL] size 160 (bits) (20170303/dsopcode-236)
[ 14.459268] ACPI Error: Method parse/execution failed [\HWMC] (Node ffff8edcc61507f8), AE_AML_BUFFER_LIMIT (20170303/psparse-543)
[ 14.459279] ACPI Error: Method parse/execution failed [\_SB.WMID.WMAA] (Node ffff8edcc61523c0), AE_AML_BUFFER_LIMIT (20170303/psparse-543)
This commit increases the size of the data element of the bios_args struct
to 128 bytes fixing these errors.
Cc: stable@vger.kernel.org
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=197007
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201981
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1520703
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
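The bit arithmetic in the message above can be checked with a small sketch. Only the data and datasize members are named in the log; the other field names are placeholders, and the 16-byte header is inferred from the AML offsets (DSZI at byte 0x0C, D128 starting at bit 0x80). The 160-bit buffer in the error output would be consistent with that header plus a 4-byte data element, but that is an inference, not something the message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the input buffer; header field names are assumptions. */
struct bios_args {
	uint32_t signature;   /* placeholder name */
	uint32_t command;     /* placeholder name */
	uint32_t commandtype; /* placeholder name */
	uint32_t datasize;    /* read by CreateDWordField (Arg1, 0x0C, DSZI) */
	uint8_t data[128];    /* covered by CreateField (Arg1, 0x80, 0x0400, D128) */
};

static_assert(offsetof(struct bios_args, datasize) == 0x0C,
	      "datasize must sit at byte 0x0C");
static_assert(offsetof(struct bios_args, data) * 8 == 0x80,
	      "data must start at bit 0x80");

/* End of the D128 field in bits: start offset 0x80 plus length 0x0400. */
static size_t d128_end_bits(void)
{
	return 0x80 + 0x0400;
}

/* Total size of the buffer handed to the method, in bits. */
static size_t bios_args_bits(void)
{
	return sizeof(struct bios_args) * 8;
}
```

With a 128-byte data array the buffer is 144 bytes, or 1152 bits, which matches the end of D128 reported in the error ("at 1152").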
|
|
LLVM revision r374662 gives LLVM the ability to convert certain loops
into a reference to bcmp as an optimization; this breaks
prom_init_check.sh:
CALL arch/powerpc/kernel/prom_init_check.sh
Error: External symbol 'bcmp' referenced from prom_init.c
make[2]: *** [arch/powerpc/kernel/Makefile:196: prom_init_check] Error 1
bcmp is defined in lib/string.c as a wrapper for memcmp so this could
be added to the whitelist. However, commit
450e7dd4001f ("powerpc/prom_init: don't use string functions from
lib/") copied memcmp as prom_memcmp to avoid KASAN instrumentation so
having bcmp be resolved to regular memcmp would break that assumption.
Furthermore, because the compiler is the one that inserted bcmp, we
cannot provide something like prom_bcmp.
To prevent LLVM from being clever with optimizations like this, use
-ffreestanding to tell LLVM we are not hosted so it is not free to
make transformations like this.
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191119045712.39633-4-natechancellor@gmail.com
|
|
Commit aea447141c7e ("powerpc: Disable -Wbuiltin-requires-header when
setjmp is used") disabled -Wbuiltin-requires-header because of a
warning about the setjmp and longjmp declarations.
r367387 in clang added another diagnostic around this, complaining
that there is no jmp_buf declaration.
In file included from ../arch/powerpc/xmon/xmon.c:47:
../arch/powerpc/include/asm/setjmp.h:10:13: error: declaration of
built-in function 'setjmp' requires the declaration of the 'jmp_buf'
type, commonly provided in the header <setjmp.h>.
[-Werror,-Wincomplete-setjmp-declaration]
extern long setjmp(long *);
^
../arch/powerpc/include/asm/setjmp.h:11:13: error: declaration of
built-in function 'longjmp' requires the declaration of the 'jmp_buf'
type, commonly provided in the header <setjmp.h>.
[-Werror,-Wincomplete-setjmp-declaration]
extern void longjmp(long *, long);
^
2 errors generated.
We are not using the standard library's longjmp/setjmp implementations
for obvious reasons; make this clear to clang by using -ffreestanding
on these files.
Cc: stable@vger.kernel.org # 4.14+
Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191119045712.39633-3-natechancellor@gmail.com
|
|
When building pseries_defconfig, building vdso32 errors out:
error: unknown target ABI 'elfv1'
This happens because -m32 in clang changes the target to 32-bit,
which does not allow the ABI to be changed.
Commit 4dc831aa8813 ("powerpc: Fix compiling a BE kernel with a
powerpc64le toolchain") added these flags to fix building big endian
kernels with a little endian GCC.
Clang doesn't need -mabi because the target triple controls the
default value. -mlittle-endian and -mbig-endian manipulate the triple
into either powerpc64-* or powerpc64le-*, which properly sets the
default ABI.
Adding a debug print out in the PPC64TargetInfo constructor after line
383 above shows this:
$ echo | ./clang -E --target=powerpc64-linux -mbig-endian -o /dev/null -
Default ABI: elfv1
$ echo | ./clang -E --target=powerpc64-linux -mlittle-endian -o /dev/null -
Default ABI: elfv2
$ echo | ./clang -E --target=powerpc64le-linux -mbig-endian -o /dev/null -
Default ABI: elfv1
$ echo | ./clang -E --target=powerpc64le-linux -mlittle-endian -o /dev/null -
Default ABI: elfv2
Don't specify -mabi when building with clang to avoid the build error
with -m32 and not change any code generation.
-mcall-aixdesc is not an implemented flag in clang so it can be safely
excluded as well, see commit 238abecde8ad ("powerpc: Don't use gcc
specific options on clang").
pseries_defconfig successfully builds after this patch and
powernv_defconfig and ppc44x_defconfig don't regress.
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
[mpe: Trim clang links in change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191119045712.39633-2-natechancellor@gmail.com
|
|
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
$ sed -e 's/^ /\t/' -i */Kconfig
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1574306461-7646-1-git-send-email-krzk@kernel.org
|
|
fixmap is intended to map things permanently like the IMMR region on
FSL SOC (8xx, 83xx, ...), so don't clear it when initialising paging.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/41c99bc06394a6bc2888631cb98a3ed2ae281ddb.1568295907.git.christophe.leroy@c-s.fr
|
|
For a read-only mapping, ask for a set of features that make the image
only unwritable rather than both unreadable and unwritable by a client
that doesn't understand them. As of today, the difference between them
for krbd is journaling (JOURNALING) and live migration (MIGRATING).
get_features method supports read_only parameter since hammer, ceph.git
commit 6176ec5fde2a ("librbd: differentiate between R/O vs R/W RBD
features").
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
Since infernalis, ceph.git commit 281f87f9ee52 ("cls_rbd: get_features
on snapshots returns HEAD image features"), querying and checking that
is pointless. Userspace support for manipulating image features after
image creation also came in infernalis, so a snapshot with a different
set of features wasn't ever possible.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
RBD_DEV_FLAG_EXISTS check in rbd_queue_workfn() is racy and leads to
inconsistent behaviour. If the object (or its snapshot) isn't there,
the OSD returns ENOENT. A read submitted before the snapshot removal
notification is processed would be zero-filled and ended with status
OK, while future reads would be failed with IOERR. It also doesn't
handle a case when an image that is mapped read-only is removed.
On top of this, because watch is no longer established for read-only
mappings, we no longer get notifications, so rbd_exists_validate() is
effectively dead code. While failing requests rather than returning
zeros is a good thing, RBD_DEV_FLAG_EXISTS is not it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
With exclusive lock out of the way, watch is the only thing left that
prevents a read-only mapping from being used with read-only OSD caps.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
A read-only mapping should be usable with read-only OSD caps, so
neither the header lock nor the object map lock can be acquired.
Unfortunately, this means that images mapped read-only lose the
advantage of the object map.
Snapshots, however, can take advantage of the object map without
any exclusionary locks, so if the object map is desired, snapshot
the image and map the snapshot instead of the image.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
If an image is mapped read-only, don't allow setting its partition(s)
to read-write via BLKROSET: with the previous patch all writes to such
images are failed anyway.
If an image is mapped read-write, its partition(s) can be set to
read-only (and back to read-write) as before. Note that at the rbd
level the image will remain writeable: anything sent down by the block
layer will be executed, including any write from internal kernel users.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
Even though -o ro/-o read_only/--read-only options are very old, we
have never really treated them seriously (on par with snapshots). As
a first step, fail writes to images mapped read-only just like we do
for snapshots.
We need this check in rbd because the block layer basically ignores
read-only setting, see commit a32e236eb93e ("Partially revert "block:
fail op_is_write() requests to read-only partitions"").
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
rbd_dev->opts is not available for parent images, making checking
rbd_dev->opts->read_only in various places (rbd_dev_image_probe(),
need_exclusive_lock(), use_object_map() in the following patches)
harder than it needs to be.
Keeping rbd_dev_image_probe() in mind, move the initialization in
do_rbd_add() up. snap_id isn't filled in at that point, so replace
rbd_is_snap() with a snap_name comparison.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
We currently just pass junk in this field unless we're retransmitting a
create, but in later patches, we'll need a mechanism to pass a delegated
inode number on an initial create request. Prepare for this by ensuring
this field is zeroed out.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
When this occurs, it usually means that we raced with a rename, and
there is no need to warn in that case. Only printk if we pass the
rename sequence check but still ended up with pos < 0.
Either way, this doesn't warrant a KERN_ERR message. Change it to
KERN_WARNING.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Alex has got plenty on his plate aside from rbd and hasn't really been
active in recent years. Remove his maintainership entry.
Dongsheng is very familiar with the code base and has been reviewing rbd
patches for a while now. Add him as a reviewer.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Alex Elder <elder@kernel.org>
Acked-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
|
|
For example, if we have 5 mds in the mdsmap and the states are:
m_info[5] --> [-1, 1, -1, 1, 1]
If we get a random number 1, then we should get the mds index 3 as
expected, but actually we will get index 2, whose state is -1.
The issue is that the for loop increment will advance past any "up"
MDS that was found during the while loop search.
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
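One way to express the intended selection without the double-advance problem is a single loop that counts only "up" entries. This is a sketch under the assumption that a state greater than 0 means "up", as in the example array above; nth_up_mds() is a hypothetical name and this is not the patched kernel code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the n-th (0-based) entry whose state is > 0,
 * or -1 if fewer than n + 1 entries are up.
 * The index advances exactly once per iteration, so an up entry can
 * never be skipped by a second increment.
 */
static int nth_up_mds(const int *state, int count, int n)
{
	int i;

	assert(state != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (state[i] <= 0)
			continue;
		if (n-- == 0)
			return i;
	}
	return -1;
}
```

For the states [-1, 1, -1, 1, 1] the up entries are at indices 1, 3 and 4, so n = 1 selects index 3, the result the message describes as expected.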
|
|
There is a spelling mistake in a debug message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
None of these helper functions change anything in memory, so we can
declare their arguments as const.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
con->private is set in ceph_con_init() and is never cleared.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
Second KVM PPC update for 5.5
- Two fixes from Greg Kurz to fix memory leak bugs in the XIVE code.
|
|
UNWIND_ESPFIX_STACK needs to read the GDT, and the GDT mapping that
can be accessed via %fs is not mapped in the user pagetables. Use
SGDT to find the cpu_entry_area mapping and read the espfix offset
from that instead.
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
CONFIG_REFCOUNT_FULL no longer exists, so remove all references to it.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-11-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
'refcount_error_report()' has no callers. Remove it.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-10-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The generic implementation of refcount_t should be good enough for
everybody, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely,
leaving the generic implementation enabled unconditionally.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-9-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The definitions of REFCOUNT_MAX and REFCOUNT_SATURATED are the same,
regardless of CONFIG_REFCOUNT_FULL, so consolidate them into a single
pair of definitions.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-8-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Having the refcount saturation and warnings inline bloats the text,
despite the fact that these paths should never be executed in normal
operation.
Move the refcount saturation and warnings out of line to reduce the
image size when refcount_t checking is enabled. Relative to an x86_64
defconfig, the sizes reported by bloat-o-meter are:
# defconfig+REFCOUNT_FULL, inline saturation (i.e. before this patch)
Total: Before=14762076, After=14915442, chg +1.04%
# defconfig+REFCOUNT_FULL, out-of-line saturation (i.e. after this patch)
Total: Before=14762076, After=14835497, chg +0.50%
A side-effect of this change is that we now only get one warning per
refcount saturation type, rather than one per problematic call-site.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-7-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Rewrite the generic REFCOUNT_FULL implementation so that the saturation
point is moved to INT_MIN / 2. This allows us to defer the sanity checks
until after the atomic operation, which removes many uses of cmpxchg()
in favour of atomic_fetch_{add,sub}().
Some crude perf results obtained from lkdtm show substantially less
overhead, despite the checking:
$ perf stat -r 3 -B -- echo {ATOMIC,REFCOUNT}_TIMING >/sys/kernel/debug/provoke-crash/DIRECT
# arm64
ATOMIC_TIMING: 46.50451 +- 0.00134 seconds time elapsed ( +- 0.00% )
REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 77.57522 +- 0.00982 seconds time elapsed ( +- 0.01% )
REFCOUNT_TIMING (REFCOUNT_FULL, this series): 48.7181 +- 0.0256 seconds time elapsed ( +- 0.05% )
# x86
ATOMIC_TIMING: 31.6225 +- 0.0776 seconds time elapsed ( +- 0.25% )
REFCOUNT_TIMING (!REFCOUNT_FULL, mainline/x86 asm): 31.6689 +- 0.0901 seconds time elapsed ( +- 0.28% )
REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 53.203 +- 0.138 seconds time elapsed ( +- 0.26% )
REFCOUNT_TIMING (REFCOUNT_FULL, this series): 31.7408 +- 0.0486 seconds time elapsed ( +- 0.15% )
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Jan Glauber <jglauber@marvell.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-6-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
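The "operate first, check afterwards" scheme described above can be sketched in userspace C11 as follows. This is an illustration only: it uses <stdatomic.h> rather than the kernel's atomic_t API, the refcount_sketch_* names are hypothetical, and the real implementation also emits warnings and covers more operations.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Saturation point in the middle of the negative range. */
#define REFCOUNT_SATURATED (INT_MIN / 2)

static_assert(REFCOUNT_SATURATED < 0, "saturated value must be negative");

typedef struct {
	atomic_int refs;
} refcount_sketch_t;

static void refcount_sketch_set(refcount_sketch_t *r, int n)
{
	atomic_store_explicit(&r->refs, n, memory_order_relaxed);
}

static int refcount_sketch_read(refcount_sketch_t *r)
{
	return atomic_load_explicit(&r->refs, memory_order_relaxed);
}

static void refcount_sketch_inc(refcount_sketch_t *r)
{
	/* Unconditional atomic add; the old value is inspected afterwards. */
	int old = atomic_fetch_add_explicit(&r->refs, 1, memory_order_relaxed);

	/*
	 * old == 0:       increment of an object that was already released.
	 * old < 0:        counter was saturated (or had wrapped) before this call.
	 * old == INT_MAX: this increment wrapped the counter.
	 */
	if (old <= 0 || old == INT_MAX)
		atomic_store_explicit(&r->refs, REFCOUNT_SATURATED,
				      memory_order_relaxed);
}
```

Because the saturated value sits far from both zero and INT_MIN, a few racing increments or decrements leave the counter negative, so the check after the fact still catches them; this is what makes a plain fetch-and-add usable in place of a cmpxchg() loop.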
|
|
<linux/refcount.h> header
In an effort to improve performance of the REFCOUNT_FULL implementation,
move the bulk of its functions into linux/refcount.h. This allows them
to be inlined in the same way as if they had been provided via
CONFIG_ARCH_HAS_REFCOUNT.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-5-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The full-fat refcount implementation is exposed via a set of functions
suffixed with "_checked()", the idea being that code can choose to use
the more expensive, yet more secure implementation on a case-by-case
basis.
In reality, this hasn't happened, so with a grand total of zero users,
let's remove the checked variants for now by simply dropping the suffix
and predicating the out-of-line functions on CONFIG_REFCOUNT_FULL=y.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-4-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
In preparation for changing the saturation point of REFCOUNT_FULL to
INT_MIN/2, change the type of integer operands passed into the API
from 'unsigned int' to 'int' so that we can avoid casting during
comparisons when we don't want to fall foul of C integral conversion
rules for signed and unsigned types.
Since the kernel is compiled with '-fno-strict-overflow', we don't need
to worry about the UB introduced by signed overflow here. Furthermore,
we're already making heavy use of the atomic_t API, which operates
exclusively on signed types.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-3-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
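The conversion pitfall mentioned above can be shown with a short sketch. When an int is compared with an unsigned int, the usual arithmetic conversions turn the int into unsigned first; the explicit cast below spells out what the compiler would do implicitly. The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* A negative int converted to unsigned becomes a very large value. */
static_assert((unsigned int)-1 == UINT_MAX, "-1 converts to UINT_MAX");

/* Mixed comparison: the signed operand is converted to unsigned. */
static int less_than_mixed(int a, unsigned int b)
{
	return (unsigned int)a < b;
}

/* Both operands signed: negative values compare as expected. */
static int less_than_signed(int a, int b)
{
	return a < b;
}
```

With a = -1 and b = 1, the mixed comparison reports "not less" while the signed one reports "less", which is why checks against a negative saturation value are simpler when every operand is an int.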
|
|
The REFCOUNT_FULL implementation uses a different saturation point than
the x86 implementation, which means that the shared refcount code in
lib/refcount.c (e.g. refcount_dec_not_one()) needs to be aware of the
difference.
Rather than duplicate the definitions from the lkdtm driver, instead
move them into <linux/refcount.h> and update all references accordingly.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-2-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
completed topic tree
Conflicts:
tools/perf/check-headers.sh
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|