summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ARM: 8961/2: Fix Kbuild issue caused by per-task stack protector GCC pluginfixesArd Biesheuvel5 days2-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using plugins, GCC requires that the -fplugin= options precedes any of its plugin arguments appearing on the command line as well. This is usually not a concern, but as it turns out, this requirement is causing some issues with ARM's per-task stack protector plugin and Kbuild's implementation of $(cc-option). When the per-task stack protector plugin is enabled, and we tweak the implementation of cc-option not to pipe the stderr output of GCC to /dev/null, the following output is generated when GCC is executed in the context of cc-option: cc1: error: plugin arm_ssp_per_task_plugin should be specified before \ -fplugin-arg-arm_ssp_per_task_plugin-tso=1 in the command line cc1: error: plugin arm_ssp_per_task_plugin should be specified before \ -fplugin-arg-arm_ssp_per_task_plugin-offset=24 in the command line These errors will cause any option passed to cc-option to be treated as unsupported, which is obviously incorrect. The cause of this issue is the fact that the -fplugin= argument is added to GCC_PLUGINS_CFLAGS, whereas the arguments above are added to KBUILD_CFLAGS, and the contents of the former get filtered out of the latter before being passed to the GCC running the cc-option test, and so the -fplugin= option does not appear at all on the GCC command line. Adding the arguments to GCC_PLUGINS_CFLAGS instead of KBUILD_CFLAGS would be the correct approach here, if it weren't for the fact that we are using $(eval) to defer the moment that they are added until after asm-offsets.h is generated, which is after the point where the contents of GCC_PLUGINS_CFLAGS are added to KBUILD_CFLAGS. So instead, we have to add our plugin arguments to both. For similar reasons, we cannot append DISABLE_ARM_SSP_PER_TASK_PLUGIN to KBUILD_CFLAGS, as it will be passed to GCC when executing in the context of cc-option, whereas the other plugin arguments will have been filtered out, resulting in a similar error and false negative result as above. So add it to ccflags-y instead. Fixes: 189af4657186da08 ("ARM: smp: add support for per-task stack canaries") Reported-by: Merlijn Wajer <merlijn@wizzup.org> Tested-by: Tony Lindgren <tony@atomide.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
* ARM: 8958/1: rename missed uaccess .fixup sectionKees Cook5 days1-1/+1
| | | | | | | | | | | | | | | | | | | | When the uaccess .fixup section was renamed to .text.fixup, one case was missed. Under ld.bfd, the orphaned section was moved close to .text (since they share the "ax" bits), so things would work normally on uaccess faults. Under ld.lld, the orphaned section was placed outside the .text section, making it unreachable. Link: https://github.com/ClangBuiltLinux/linux/issues/282 Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1020633#c44 Link: https://lore.kernel.org/r/nycvar.YSQ.7.76.1912032147340.17114@knanqh.ubzr Link: https://lore.kernel.org/lkml/202002071754.F5F073F1D@keescook/ Fixes: c4a84ae39b4a5 ("ARM: 8322/1: keep .text and .fixup regions closer together") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
* ARM: 8957/1: VDSO: Match ARMv8 timer in cntvct_functional()Florian Fainelli5 days1-0/+2
| | | | | | | | | | | | It is possible for a system with an ARMv8 timer to run a 32-bit kernel. When this happens we will unconditionally have the vDSO code remove the __vdso_gettimeofday and __vdso_clock_gettime symbols because cntvct_functional() returns false since it does not match that compatibility string. Fixes: ecf99a439105 ("ARM: 8331/1: VDSO initialization, mapping, and synchronization") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
* Linux 5.6-rc1Linus Torvalds2020-02-091-2/+2
|
* Merge tag 'kbuild-v5.6-2' of ↵Linus Torvalds2020-02-0953-261/+252
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - fix randconfig to generate a sane .config - rename hostprogs-y / always to hostprogs / always-y, which are more natual syntax. - optimize scripts/kallsyms - fix yes2modconfig and mod2yesconfig - make multiple directory targets ('make foo/ bar/') work * tag 'kbuild-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: make multiple directory targets work kconfig: Invalidate all symbols after changing to y or m. kallsyms: fix type of kallsyms_token_table[] scripts/kallsyms: change table to store (strcut sym_entry *) scripts/kallsyms: rename local variables in read_symbol() kbuild: rename hostprogs-y/always to hostprogs/always-y kbuild: fix the document to use extra-y for vmlinux.lds kconfig: fix broken dependency in randconfig-generated .config
| * kbuild: make multiple directory targets workMasahiro Yamada2020-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the single-target build does not work when two or more sub-directories are given: $ make fs/ kernel/ lib/ CALL scripts/checksyscalls.sh CALL scripts/atomic/check-atomics.sh DESCEND objtool make[2]: Nothing to be done for 'kernel/'. make[2]: Nothing to be done for 'fs/'. make[2]: Nothing to be done for 'lib/'. Make it work properly. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * kconfig: Invalidate all symbols after changing to y or m.Tetsuo Handa2020-02-051-3/+2
| | | | | | | | | | | | | | | | | | | | | | Since commit 89b9060987d9 ("kconfig: Add yes2modconfig and mod2yesconfig targets.") forgot to clear SYMBOL_VALID bit after changing to y or m, these targets did not save the changes. Call sym_clear_all_valid() so that all symbols are revalidated. Fixes: 89b9060987d9 ("kconfig: Add yes2modconfig and mod2yesconfig targets.") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * kallsyms: fix type of kallsyms_token_table[]Masahiro Yamada2020-02-051-2/+3
| | | | | | | | | | | | | | | | kallsyms_token_table[] only contains ASCII characters. It should be char instead of u8. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
| * scripts/kallsyms: change table to store (strcut sym_entry *)Masahiro Yamada2020-02-041-56/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | The symbol table is extended every 10000 addition by using realloc(), where data copy might occur to the new buffer. To decrease the amount of possible data copy, let's change the table to store the pointer. The symbol type + symbol name part is appended at the end of (struct sym_entry), and allocated together with the struct body. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * scripts/kallsyms: rename local variables in read_symbol()Masahiro Yamada2020-02-041-12/+12
| | | | | | | | | | | | | | I will use 'sym' for the point to struce sym_entry in the next commit. Rename 'sym', 'stype' to 'name', 'type', which are more intuitive. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * kbuild: rename hostprogs-y/always to hostprogs/always-yMasahiro Yamada2020-02-0449-190/+172
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In old days, the "host-progs" syntax was used for specifying host programs. It was renamed to the current "hostprogs-y" in 2004. It is typically useful in scripts/Makefile because it allows Kbuild to selectively compile host programs based on the kernel configuration. This commit renames like follows: always -> always-y hostprogs-y -> hostprogs So, scripts/Makefile will look like this: always-$(CONFIG_BUILD_BIN2C) += ... always-$(CONFIG_KALLSYMS) += ... ... hostprogs := $(always-y) $(always-m) I think this makes more sense because a host program is always a host program, irrespective of the kernel configuration. We want to specify which ones to compile by CONFIG options, so always-y will be handier. The "always", "hostprogs-y", "hostprogs-m" will be kept for backward compatibility for a while. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * kbuild: fix the document to use extra-y for vmlinux.ldsMasahiro Yamada2020-02-041-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The difference between "always" and "extra-y" is that the targets listed in $(always) are always built, whereas the ones in $(extra-y) are built only when KBUILD_BUILTIN is set. So, "make modules" does not build the targets in $(extra-y). vmlinux.lds is only needed for linking vmlinux. So, adding it to extra-y is more correct. In fact, arch/x86/kernel/Makefile does this. Fix the example code. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * kconfig: fix broken dependency in randconfig-generated .configMasahiro Yamada2020-02-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running randconfig on arm64 using KCONFIG_SEED=0x40C5E904 (e.g. on v5.5) produces the .config with CONFIG_EFI=y and CONFIG_CPU_BIG_ENDIAN=y, which does not meet the !CONFIG_CPU_BIG_ENDIAN dependency. This is because the user choice for CONFIG_CPU_LITTLE_ENDIAN vs CONFIG_CPU_BIG_ENDIAN is set by randomize_choice_values() after the value of CONFIG_EFI is calculated. When this happens, the has_changed flag should be set. Currently, it takes the result from the last iteration. It should accumulate all the results of the loop. Fixes: 3b9a19e08960 ("kconfig: loop as long as we changed some symbols in randconfig") Reported-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* | Merge tag 'zonefs-5.6-rc1' of ↵Linus Torvalds2020-02-099-0/+2058
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull new zonefs file system from Damien Le Moal: "Zonefs is a very simple file system exposing each zone of a zoned block device as a file. Unlike a regular file system with native zoned block device support (e.g. f2fs or the on-going btrfs effort), zonefs does not hide the sequential write constraint of zoned block devices to the user. As a result, zonefs is not a POSIX compliant file system. Its goal is to simplify the implementation of zoned block devices support in applications by replacing raw block device file accesses with a richer file based API, avoiding relying on direct block device file ioctls which may be more obscure to developers. One example of this approach is the implementation of LSM (log-structured merge) tree structures (such as used in RocksDB and LevelDB) on zoned block devices by allowing SSTables to be stored in a zone file similarly to a regular file system rather than as a range of sectors of a zoned device. The introduction of the higher level construct "one file is one zone" can help reducing the amount of changes needed in the application while at the same time allowing the use of zoned block devices with various programming languages other than C. Zonefs IO management implementation uses the new iomap generic code. Zonefs has been successfully tested using a functional test suite (available with zonefs userland format tool on github) and a prototype implementation of LevelDB on top of zonefs" * tag 'zonefs-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Add documentation fs: New zonefs file system
| * | zonefs: Add documentationDamien Le Moal2020-02-072-0/+405
| | | | | | | | | | | | | | | | | | | | | | | | Add the new file Documentation/filesystems/zonefs.txt to document zonefs principles and user-space tool usage. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
| * | fs: New zonefs file systemDamien Le Moal2020-02-078-0/+1653
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | zonefs is a very simple file system exposing each zone of a zoned block device as a file. Unlike a regular file system with zoned block device support (e.g. f2fs), zonefs does not hide the sequential write constraint of zoned block devices to the user. Files representing sequential write zones of the device must be written sequentially starting from the end of the file (append only writes). As such, zonefs is in essence closer to a raw block device access interface than to a full featured POSIX file system. The goal of zonefs is to simplify the implementation of zoned block device support in applications by replacing raw block device file accesses with a richer file API, avoiding relying on direct block device file ioctls which may be more obscure to developers. One example of this approach is the implementation of LSM (log-structured merge) tree structures (such as used in RocksDB and LevelDB) on zoned block devices by allowing SSTables to be stored in a zone file similarly to a regular file system rather than as a range of sectors of a zoned device. The introduction of the higher level construct "one file is one zone" can help reducing the amount of changes needed in the application as well as introducing support for different application programming languages. Zonefs on-disk metadata is reduced to an immutable super block to persistently store a magic number and optional feature flags and values. On mount, zonefs uses blkdev_report_zones() to obtain the device zone configuration and populates the mount point with a static file tree solely based on this information. E.g. file sizes come from the device zone type and write pointer offset managed by the device itself. The zone files created on mount have the following characteristics. 1) Files representing zones of the same type are grouped together under a common sub-directory: * For conventional zones, the sub-directory "cnv" is used. * For sequential write zones, the sub-directory "seq" is used. These two directories are the only directories that exist in zonefs. Users cannot create other directories and cannot rename nor delete the "cnv" and "seq" sub-directories. 2) The name of zone files is the number of the file within the zone type sub-directory, in order of increasing zone start sector. 3) The size of conventional zone files is fixed to the device zone size. Conventional zone files cannot be truncated. 4) The size of sequential zone files represent the file's zone write pointer position relative to the zone start sector. Truncating these files is allowed only down to 0, in which case, the zone is reset to rewind the zone write pointer position to the start of the zone, or up to the zone size, in which case the file's zone is transitioned to the FULL state (finish zone operation). 5) All read and write operations to files are not allowed beyond the file zone size. Any access exceeding the zone size is failed with the -EFBIG error. 6) Creating, deleting, renaming or modifying any attribute of files and sub-directories is not allowed. 7) There are no restrictions on the type of read and write operations that can be issued to conventional zone files. Buffered, direct and mmap read & write operations are accepted. For sequential zone files, there are no restrictions on read operations, but all write operations must be direct IO append writes. mmap write of sequential files is not allowed. Several optional features of zonefs can be enabled at format time. * Conventional zone aggregation: ranges of contiguous conventional zones can be aggregated into a single larger file instead of the default one file per zone. * File ownership: The owner UID and GID of zone files is by default 0 (root) but can be changed to any valid UID/GID. * File access permissions: the default 640 access permissions can be changed. The mkzonefs tool is used to format zoned block devices for use with zonefs. This tool is available on Github at: git@github.com:damien-lemoal/zonefs-tools.git. zonefs-tools also includes a test suite which can be run against any zoned block device, including null_blk block device created with zoned mode. Example: the following formats a 15TB host-managed SMR HDD with 256 MB zones with the conventional zones aggregation feature enabled. $ sudo mkzonefs -o aggr_cnv /dev/sdX $ sudo mount -t zonefs /dev/sdX /mnt $ ls -l /mnt/ total 0 dr-xr-xr-x 2 root root 1 Nov 25 13:23 cnv dr-xr-xr-x 2 root root 55356 Nov 25 13:23 seq The size of the zone files sub-directories indicate the number of files existing for each type of zones. In this example, there is only one conventional zone file (all conventional zones are aggregated under a single file). $ ls -l /mnt/cnv total 137101312 -rw-r----- 1 root root 140391743488 Nov 25 13:23 0 This aggregated conventional zone file can be used as a regular file. $ sudo mkfs.ext4 /mnt/cnv/0 $ sudo mount -o loop /mnt/cnv/0 /data The "seq" sub-directory grouping files for sequential write zones has in this example 55356 zones. $ ls -lv /mnt/seq total 14511243264 -rw-r----- 1 root root 0 Nov 25 13:23 0 -rw-r----- 1 root root 0 Nov 25 13:23 1 -rw-r----- 1 root root 0 Nov 25 13:23 2 ... -rw-r----- 1 root root 0 Nov 25 13:23 55354 -rw-r----- 1 root root 0 Nov 25 13:23 55355 For sequential write zone files, the file size changes as data is appended at the end of the file, similarly to any regular file system. $ dd if=/dev/zero of=/mnt/seq/0 bs=4K count=1 conv=notrunc oflag=direct 1+0 records in 1+0 records out 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000452219 s, 9.1 MB/s $ ls -l /mnt/seq/0 -rw-r----- 1 root root 4096 Nov 25 13:23 /mnt/seq/0 The written file can be truncated to the zone size, preventing any further write operation. $ truncate -s 268435456 /mnt/seq/0 $ ls -l /mnt/seq/0 -rw-r----- 1 root root 268435456 Nov 25 13:49 /mnt/seq/0 Truncation to 0 size allows freeing the file zone storage space and restart append-writes to the file. $ truncate -s 0 /mnt/seq/0 $ ls -l /mnt/seq/0 -rw-r----- 1 root root 0 Nov 25 13:49 /mnt/seq/0 Since files are statically mapped to zones on the disk, the number of blocks of a file as reported by stat() and fstat() indicates the size of the file zone. $ stat /mnt/seq/0 File: /mnt/seq/0 Size: 0 Blocks: 524288 IO Block: 4096 regular empty file Device: 870h/2160d Inode: 50431 Links: 1 Access: (0640/-rw-r-----) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2019-11-25 13:23:57.048971997 +0900 Modify: 2019-11-25 13:52:25.553805765 +0900 Change: 2019-11-25 13:52:25.553805765 +0900 Birth: - The number of blocks of the file ("Blocks") in units of 512B blocks gives the maximum file size of 524288 * 512 B = 256 MB, corresponding to the device zone size in this example. Of note is that the "IO block" field always indicates the minimum IO size for writes and corresponds to the device physical sector size. This code contains contributions from: * Johannes Thumshirn <jthumshirn@suse.de>, * Darrick J. Wong <darrick.wong@oracle.com>, * Christoph Hellwig <hch@lst.de>, * Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> and * Ting Yao <tingyao@hust.edu.cn>. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* | | irqchip/gic-v4.1: Avoid 64bit division for the sake of 32bit ARMMarc Zyngier2020-02-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to allow the GICv4 code to link properly on 32bit ARM, make sure we don't use 64bit divisions when it isn't strictly necessary. Fixes: 4e6437f12d6e ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag '5.6-rc-smb3-plugfest-patches' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2020-02-0922-129/+247
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes from Steve French: "13 cifs/smb3 patches, most from testing at the SMB3 plugfest this week: - Important fix for multichannel and for modefromsid mounts. - Two reconnect fixes - Addition of SMB3 change notify support - Backup tools fix - A few additional minor debug improvements (tracepoints and additional logging found useful during testing this week)" * tag '5.6-rc-smb3-plugfest-patches' of git://git.samba.org/sfrench/cifs-2.6: smb3: Add defines for new information level, FileIdInformation smb3: print warning once if posix context returned on open smb3: add one more dynamic tracepoint missing from strict fsync path cifs: fix mode bits from dir listing when mounted with modefromsid cifs: fix channel signing cifs: add SMB3 change notification support cifs: make multichannel warning more visible cifs: fix soft mounts hanging in the reconnect code cifs: Add tracepoints for errors on flush or fsync cifs: log warning message (once) if out of disk space cifs: fail i/o on soft mounts if sessionsetup errors out smb3: fix problem with null cifs super block with previous patch SMB3: Backup intent flag missing from some more ops
| * | | smb3: Add defines for new information level, FileIdInformationSteve French2020-02-061-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | See MS-FSCC 2.4.43. Valid to be quried from most Windows servers (among others). Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
| * | | smb3: print warning once if posix context returned on openSteve French2020-02-062-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SMB3.1.1 POSIX Context processing is not complete yet - so print warning (once) if server returns it on open. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
| * | | smb3: add one more dynamic tracepoint missing from strict fsync pathSteve French2020-02-061-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We didn't have a dynamic trace point for catching errors in file_write_and_wait_range error cases in cifs_strict_fsync. Since not all apps check for write behind errors, it can be important for debugging to be able to trace these error paths. Suggested-and-reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | cifs: fix mode bits from dir listing when mounted with modefromsidAurelien Aptel2020-02-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When mounting with -o modefromsid, the mode bits are stored in an ACE. Directory enumeration (e.g. ls -l /mnt) triggers an SMB Query Dir which does not include ACEs in its response. The mode bits in this case are silently set to a default value of 755 instead. This patch marks the dentry created during the directory enumeration as needing re-evaluation (i.e. additional Query Info with ACEs) so that the mode bits can be properly extracted. Quick repro: $ mount.cifs //win19.test/data /mnt -o ...,modefromsid $ touch /mnt/foo && chmod 751 /mnt/foo $ stat /mnt/foo # reports 751 (OK) $ sleep 2 # dentry older than 1s by default get invalidated $ ls -l /mnt # since dentry invalid, ls does a Query Dir # and reports foo as 755 (WRONG) Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | cifs: fix channel signingAurelien Aptel2020-02-061-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The server var was accidentally used as an iterator over the global list of connections, thus overwritten the passed argument. This resulted in the wrong signing key being returned for extra channels. Fix this by using a separate var to iterate. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | cifs: add SMB3 change notification supportSteve French2020-02-065-0/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A commonly used SMB3 feature is change notification, allowing an app to be notified about changes to a directory. The SMB3 Notify request blocks until the server detects a change to that directory or its contents that matches the completion flags that were passed in and the "watch_tree" flag (which indicates whether subdirectories under this directory should be also included). See MS-SMB2 2.2.35 for additional detail. To use this simply pass in the following structure to ioctl: struct __attribute__((__packed__)) smb3_notify { uint32_t completion_filter; bool watch_tree; } __packed; using CIFS_IOC_NOTIFY 0x4005cf09 or equivalently _IOW(CIFS_IOCTL_MAGIC, 9, struct smb3_notify) SMB3 change notification is supported by all major servers. The ioctl will block until the server detects a change to that directory or its subdirectories (if watch_tree is set). Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
| * | | cifs: make multichannel warning more visibleAurelien Aptel2020-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When no interfaces are returned by the server we cannot open multiple channels. Make it more obvious by reporting that to the user at the VFS log level. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | cifs: fix soft mounts hanging in the reconnect codeRonnie Sahlberg2020-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RHBZ: 1795423 This is the SMB1 version of a patch we already have for SMB2 In recent DFS updates we have a new variable controlling how many times we will retry to reconnect the share. If DFS is not used, then this variable is initialized to 0 in: static inline int dfs_cache_get_nr_tgts(const struct dfs_cache_tgt_list *tl) { return tl ? tl->tl_numtgts : 0; } This means that in the reconnect loop in smb2_reconnect() we will immediately wrap retries to -1 and never actually get to pass this conditional: if (--retries) continue; The effect is that we no longer reach the point where we fail the commands with -EHOSTDOWN and basically the kernel threads are virtually hung and unkillable. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
| * | | cifs: Add tracepoints for errors on flush or fsyncSteve French2020-02-052-2/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Makes it easier to debug errors on writeback that happen later, and are being returned on flush or fsync For example: writetest-17829 [002] .... 13583.407859: cifs_flush_err: ino=90 rc=-28 Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | cifs: log warning message (once) if out of disk spaceSteve French2020-02-051-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We ran into a confusing problem where an application wasn't checking return code on close and so user didn't realize that the application ran out of disk space. log a warning message (once) in these cases. For example: [ 8407.391909] Out of space writing to \\oleg-server\small-share Signed-off-by: Steve French <stfrench@microsoft.com> Reported-by: Oleg Kravtsov <oleg@tuxera.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | cifs: fail i/o on soft mounts if sessionsetup errors outRonnie Sahlberg2020-02-051-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RHBZ: 1579050 If we have a soft mount we should fail commands for session-setup failures (such as the password having changed/ account being deleted/ ...) and return an error back to the application. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
| * | | smb3: fix problem with null cifs super block with previous patchSteve French2020-02-052-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add check for null cifs_sb to create_options helper Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
| * | | SMB3: Backup intent flag missing from some more opsAmir Goldstein2020-02-0314-118/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When "backup intent" is requested on the mount (e.g. backupuid or backupgid mount options), the corresponding flag was missing from some of the operations. Change all operations to use the macro cifs_create_options() to set the backup intent flag if needed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* | | | Merge branch 'work.vboxsf' of ↵Linus Torvalds2020-02-0912-0/+3280
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vboxfs from Al Viro: "This is the VirtualBox guest shared folder support by Hans de Goede, with fixups for fs_parse folded in to avoid bisection hazards from those API changes..." * 'work.vboxsf' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Add VirtualBox guest shared folder (vboxsf) support
| * | | | fs: Add VirtualBox guest shared folder (vboxsf) supportHans de Goede2020-02-0812-0/+3280
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VirtualBox hosts can share folders with guests, this commit adds a VFS driver implementing the Linux-guest side of this, allowing folders exported by the host to be mounted under Linux. This driver depends on the guest <-> host IPC functions exported by the vboxguest driver. Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | | Merge tag 'x86-urgent-2020-02-09' of ↵Linus Torvalds2020-02-0913-11/+260
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of fixes for X86: - Ensure that the PIT is set up when the local APIC is disable or configured in legacy mode. This is caused by an ordering issue introduced in the recent changes which skip PIT initialization when the TSC and APIC frequencies are already known. - Handle malformed SRAT tables during early ACPI parsing which caused an infinite loop anda boot hang. - Fix a long standing race in the affinity setting code which affects PCI devices with non-maskable MSI interrupts. The problem is caused by the non-atomic writes of the MSI address (destination APIC id) and data (vector) fields which the device uses to construct the MSI message. The non-atomic writes are mandated by PCI. If both fields change and the device raises an interrupt after writing address and before writing data, then the MSI block constructs a inconsistent message which causes interrupts to be lost and subsequent malfunction of the device. The fix is to redirect the interrupt to the new vector on the current CPU first and then switch it over to the new target CPU. This allows to observe an eventually raised interrupt in the transitional stage (old CPU, new vector) to be observed in the APIC IRR and retriggered on the new target CPU and the new vector. The potential spurious interrupts caused by this are harmless and can in the worst case expose a buggy driver (all handlers have to be able to deal with spurious interrupts as they can and do happen for various reasons). - Add the missing suspend/resume mechanism for the HYPERV hypercall page which prevents resume hibernation on HYPERV guests. This change got lost before the merge window. - Mask the IOAPIC before disabling the local APIC to prevent potentially stale IOAPIC remote IRR bits which cause stale interrupt lines after resume" * tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Mask IOAPIC entries when disabling the local APIC x86/hyperv: Suspend/resume the hypercall page for hibernation x86/apic/msi: Plug non-maskable MSI affinity race x86/boot: Handle malformed SRAT tables during early ACPI parsing x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode
| * | | | | x86/apic: Mask IOAPIC entries when disabling the local APICTony W Wang-oc2020-02-071-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a system suspends, the local APIC is disabled in the suspend sequence, but the IOAPIC is left in the current state. This means unmasked interrupt lines stay unmasked. This is usually the case for IOAPIC pin 9 to which the ACPI interrupt is connected. That means that in suspended state the IOAPIC can respond to an external interrupt, e.g. the wakeup via keyboard/RTC/ACPI, but the interrupt message cannot be handled by the disabled local APIC. As a consequence the Remote IRR bit is set, but the local APIC does not send an EOI to acknowledge it. This causes the affected interrupt line to become stale and the stale Remote IRR bit will cause a hang when __synchronize_hardirq() is invoked for that interrupt line. To prevent this, mask all IOAPIC entries before disabling the local APIC. The resume code already has the unmask operation inside. [ tglx: Massaged changelog ] Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/1579076539-7267-1-git-send-email-TonyWWang-oc@zhaoxin.com
| * | | | | x86/hyperv: Suspend/resume the hypercall page for hibernationDexuan Cui2020-02-011-0/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For hibernation the hypercall page must be disabled before the hibernation image is created so that subsequent hypercall operations fail safely. On resume the hypercall page has to be restored and reenabled to ensure proper operation of the resumed kernel. Implement the necessary suspend/resume callbacks. [ tglx: Decrypted changelog ] Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1578350559-130275-1-git-send-email-decui@microsoft.com
| * | | | | x86/apic/msi: Plug non-maskable MSI affinity raceThomas Gleixner2020-02-016-4/+163
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Evan tracked down a subtle race between the update of the MSI message and the device raising an interrupt internally on PCI devices which do not support MSI masking. The update of the MSI message is non-atomic and consists of either 2 or 3 sequential 32bit wide writes to the PCI config space. - Write address low 32bits - Write address high 32bits (If supported by device) - Write data When an interrupt is migrated then both address and data might change, so the kernel attempts to mask the MSI interrupt first. But for MSI masking is optional, so there exist devices which do not provide it. That means that if the device raises an interrupt internally between the writes then a MSI message is sent built from half updated state. On x86 this can lead to spurious interrupts on the wrong interrupt vector when the affinity setting changes both address and data. As a consequence the device interrupt can be lost causing the device to become stuck or malfunctioning. Evan tried to handle that by disabling MSI accross an MSI message update. That's not feasible because disabling MSI has issues on its own: If MSI is disabled the PCI device is routing an interrupt to the legacy INTx mechanism. The INTx delivery can be disabled, but the disablement is not working on all devices. Some devices lose interrupts when both MSI and INTx delivery are disabled. Another way to solve this would be to enforce the allocation of the same vector on all CPUs in the system for this kind of screwed devices. That could be done, but it would bring back the vector space exhaustion problems which got solved a few years ago. Fortunately the high address (if supported by the device) is only relevant when X2APIC is enabled which implies interrupt remapping. In the interrupt remapping case the affinity setting is happening at the interrupt remapping unit and the PCI MSI message is programmed only once when the PCI device is initialized. That makes it possible to solve it with a two step update: 1) Target the MSI msg to the new vector on the current target CPU 2) Target the MSI msg to the new vector on the new target CPU In both cases writing the MSI message is only changing a single 32bit word which prevents the issue of inconsistency. After writing the final destination it is necessary to check whether the device issued an interrupt while the intermediate state #1 (new vector, current CPU) was in effect. This is possible because the affinity change is always happening on the current target CPU. The code runs with interrupts disabled, so the interrupt can be detected by checking the IRR of the local APIC. If the vector is pending in the IRR then the interrupt is retriggered on the new target CPU by sending an IPI for the associated vector on the target CPU. This can cause spurious interrupts on both the local and the new target CPU. 1) If the new vector is not in use on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then interrupt entry code will ignore that spurious interrupt. The vector is marked so that the 'No irq handler for vector' warning is supressed once. 2) If the new vector is in use already on the local CPU then the IRR check might see an pending interrupt from the device which is using this vector. The IPI to the new target CPU will then invoke the handler of the device, which got the affinity change, even if that device did not issue an interrupt 3) If the new vector is in use already on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then the handler of the device which uses that vector on the local CPU will be invoked. expose issues in device driver interrupt handlers which are not prepared to handle a spurious interrupt correctly. This not a regression, it's just exposing something which was already broken as spurious interrupts can happen for a lot of reasons and all driver handlers need to be able to deal with them. Reported-by: Evan Green <evgreen@chromium.org> Debugged-by: Evan Green <evgreen@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Evan Green <evgreen@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87imkr4s7n.fsf@nanos.tec.linutronix.de
| * | | | | x86/boot: Handle malformed SRAT tables during early ACPI parsingSteven Clarkson2020-01-311-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Break an infinite loop when early parsing of the SRAT table is caused by a subtable with zero length. Known to affect the ASUS WS X299 SAGE motherboard with firmware version 1201 which has a large block of zeros in its SRAT table. The kernel could boot successfully on this board/firmware prior to the introduction of early parsing this table or after a BIOS update. [ bp: Fixup whitespace damage and commit message. Make it return 0 to denote that there are no immovable regions because who knows what else is broken in this BIOS. ] Fixes: 02a3e3cdb7f1 ("x86/boot: Parse SRAT table and count immovable memory regions") Signed-off-by: Steven Clarkson <sc@lambdal.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: linux-acpi@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=206343 Link: https://lkml.kernel.org/r/CAHKq8taGzj0u1E_i=poHUam60Bko5BpiJ9jn0fAupFUYexvdUQ@mail.gmail.com
| * | | | | x86/timer: Don't skip PIT setup when APIC is disabled or in legacy modeThomas Gleixner2020-01-296-7/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tony reported a boot regression caused by the recent workaround for systems which have a disabled (clock gate off) PIT. On his machine the kernel fails to initialize the PIT because apic_needs_pit() does not take into account whether the local APIC interrupt delivery mode will actually allow to setup and use the local APIC timer. This should be easy to reproduce with acpi=off on the command line which also disables HPET. Due to the way the PIT/HPET and APIC setup ordering works (APIC setup can require working PIT/HPET) the information is not available at the point where apic_needs_pit() makes this decision. To address this, split out the interrupt mode selection from apic_intr_mode_init(), invoke the selection before making the decision whether PIT is required or not, and add the missing checks into apic_needs_pit(). Fixes: c8c4076723da ("x86/timer: Skip PIT initialization on modern chipsets") Reported-by: Anthony Buckley <tony.buckley000@gmail.com> Tested-by: Anthony Buckley <tony.buckley000@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Daniel Drake <drake@endlessm.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206125 Link: https://lore.kernel.org/r/87sgk6tmk2.fsf@nanos.tec.linutronix.de
* | | | | | Merge tag 'smp-urgent-2020-02-09' of ↵Linus Torvalds2020-02-092-2/+3
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP fixes from Thomas Gleixner: "Two fixes for the SMP related functionality: - Make the UP version of smp_call_function_single() match SMP semantics when called for a not available CPU. Instead of emitting a warning and assuming that the function call target is CPU0, return a proper error code like the SMP version does. - Remove a superfluous check in smp_call_function_many_cond()" * tag 'smp-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp/up: Make smp_call_function_single() match SMP semantics smp: Remove superfluous cond_func check in smp_call_function_many_cond()
| * | | | | | smp/up: Make smp_call_function_single() match SMP semanticsPaul E. McKenney2020-02-071-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In CONFIG_SMP=y kernels, smp_call_function_single() returns -ENXIO when invoked for a non-existent CPU. In contrast, in CONFIG_SMP=n kernels, a splat is emitted and smp_call_function_single() otherwise silently ignores its "cpu" argument, instead pretending that the caller intended to have something happen on CPU 0. Given that there is now code that expects smp_call_function_single() to return an error if a bad CPU was specified, this difference in semantics needs to be addressed. Bring the semantics of the CONFIG_SMP=n version of smp_call_function_single() into alignment with its CONFIG_SMP=y counterpart. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200205143409.GA7021@paulmck-ThinkPad-P72
| * | | | | | smp: Remove superfluous cond_func check in smp_call_function_many_cond()Sebastian Andrzej Siewior2020-01-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was requested to remove the cond_func check but the follow up patch was overlooked. Remove it now. Fixes: 67719ef25eeb ("smp: Add a smp_cond_func_t argument to smp_call_function_many()") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200127083915.434tdkztorkklpdu@linutronix.de
* | | | | | | Merge tag 'perf-urgent-2020-02-09' of ↵Linus Torvalds2020-02-0910-36/+88
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A set of fixes and improvements for the perf subsystem: Kernel fixes: - Install cgroup events to the correct CPU context to prevent a potential list double add - Prevent an integer underflow in the perf mlock accounting - Add a missing prototype for arch_perf_update_userpage() Tooling: - Add a missing unlock in the error path of maps__insert() in perf maps. - Fix the build with the latest libbfd - Fix the perf parser so it does not delete parse event terms, which caused a regression for using perf with the ARM CoreSight as the sink configuration was missing due to the deletion. - Fix the double free in the perf CPU map merging test case - Add the missing ustring support for the perf probe command" * tag 'perf-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf maps: Add missing unlock to maps__insert() error case perf probe: Add ustring support for perf probe command perf: Make perf able to build with latest libbfd perf test: Fix test case Merge cpu map perf parse: Copy string to perf_evsel_config_term perf parse: Refactor 'struct perf_evsel_config_term' kernel/events: Add a missing prototype for arch_perf_update_userpage() perf/cgroups: Install cgroup events to correct cpuctx perf/core: Fix mlock accounting in perf_mmap()
| * \ \ \ \ \ \ Merge tag 'perf-core-for-mingo-5.6-20200201' of ↵Ingo Molnar2020-02-058-32/+71
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: perf maps: Cengiz Can: - Add missing unlock to maps__insert() error case. srcline: Changbin Du: - Make perf able to build with latest libbfd. perf parse: Leo Yan: - Keep copy of string in perf_evsel_config_term() to fix sink terms processing in ARM CoreSight. perf test: Thomas Richter: - Fix test case Merge cpu map, removing extra reference count drop that causes a segfault on s/390. perf probe: Thomas Richter: - Add ustring support for perf probe command Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | | | | perf maps: Add missing unlock to maps__insert() error caseCengiz Can2020-01-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `tools/perf/util/map.c` has a function named `maps__insert` that acquires a write lock if its in multithread context. Even though this lock is released when function successfully completes, there's a branch that is executed when `maps_by_name == NULL` that returns from this function without releasing the write lock. Added an `up_write` to release the lock when this happens. Fixes: a7c2b572e217 ("perf map_groups: Auto sort maps by name, if needed") Signed-off-by: Cengiz Can <cengiz@kernel.wtf> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20200120141553.23934-1-cengiz@kernel.wtf Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | perf probe: Add ustring support for perf probe commandThomas Richter2020-01-311-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kernel commit 88903c464321 ("tracing/probe: Add ustring type for user-space string") adds support for user-space strings when type 'ustring' is specified. Here is an example using sysfs command line interface for kprobes: Function to probe: struct filename * getname_flags(const char __user *filename, int flags, int *empty) Setup: # cd /sys/kernel/debug/tracing/ # echo 'p:tmr1 getname_flags +0(%r2):ustring' > kprobe_events # cat events/kprobes/tmr1/format | fgrep print print fmt: "(%lx) arg1=\"%s\"", REC->__probe_ip, REC->arg1 # echo 1 > events/kprobes/tmr1/enable # touch /tmp/111 # echo 0 > events/kprobes/tmr1/enable # cat trace|fgrep /tmp/111 touch-5846 [005] d..2 255520.717960: tmr1:\ (getname_flags+0x0/0x400) arg1="/tmp/111" Doing the same with the perf tool fails. Using type 'string' succeeds: # perf probe "vfs_getname=getname_flags:72 pathname=filename:string" Added new event: probe:vfs_getname (on getname_flags:72 with pathname=filename:string) .... # perf probe -d probe:vfs_getname Removed event: probe:vfs_getname However using type 'ustring' fails (output before): # perf probe "vfs_getname=getname_flags:72 pathname=filename:ustring" Failed to write event: Invalid argument Error: Failed to add events. # Fix this by adding type 'ustring' in function convert_variable_type(). Using ustring succeeds (output after): # ./perf probe "vfs_getname=getname_flags:72 pathname=filename:ustring" Added new event: probe:vfs_getname (on getname_flags:72 with pathname=filename:ustring) You can now use it in all perf tools, such as: perf record -e probe:vfs_getname -aR sleep 1 # Note: This issue also exists on x86, it is not s390 specific. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: sumanthk@linux.ibm.com Link: http://lore.kernel.org/lkml/20200120132011.64698-2-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | perf: Make perf able to build with latest libbfdChangbin Du2020-01-301-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | libbfd has changed the bfd_section_* macros to inline functions bfd_section_<field> since 2019-09-18. See below two commits: o http://www.sourceware.org/ml/gdb-cvs/2019-09/msg00064.html o https://www.sourceware.org/ml/gdb-cvs/2019-09/msg00072.html This fix make perf able to build with both old and new libbfd. Signed-off-by: Changbin Du <changbin.du@gmail.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200128152938.31413-1-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | perf test: Fix test case Merge cpu mapThomas Richter2020-01-301-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a2408a70368a ("perf evlist: Maintain evlist->all_cpus") introduces a test case for cpumap merge operation, see functions perf_cpu_map__merge() and test__cpu_map_merge(). The test case fails on s390 with this error message: [root@m35lp76 perf]# ./perf test -Fvvvvv 52 52: Merge cpu map : --- start --- cpumask list: 1-2,4-5,7 perf: /root/linux/tools/include/linux/refcount.h:131:\ refcount_sub_and_test: Assertion `!(new > val)' failed. Aborted (core dumped) [root@m35lp76 perf]# The root cause is in the function test__cpu_map_merge(): It creates two cpu_maps named 'a' and 'b': struct perf_cpu_map *a = perf_cpu_map__new("4,2,1"); struct perf_cpu_map *b = perf_cpu_map__new("4,5,7"); and creates a third map named 'c' which is the result of the merge of maps a and b: struct perf_cpu_map *c = perf_cpu_map__merge(a, b); After some verifaction of the merged cpu_map all three of them are have their reference count reduced and are freed: perf_cpu_map__put(a); (1) perf_cpu_map__put(b); perf_cpu_map__put(c); The release of perf_cpu_map__put(a) is wrong. The map is already released and free'ed as part of the function perf_cpu_map__merge(struct perf_cpu_map *orig, | struct perf_cpu_map *other) +--> perf_cpu_map__put(orig); | +--> cpu_map__delete(orig) At the end perf_cpu_map_put() is called for map 'orig' alias 'a' and since the reference count is 1, the map is deleted, as can be seen by the following gdb trace: (gdb) where #0 tcache_put (tc_idx=0, chunk=0x156cc30) at malloc.c:2940 #1 _int_free (av=0x3fffd49ee80 <main_arena>, p=0x156cc30, have_lock=<optimized out>) at malloc.c:4222 #2 0x00000000012d5e78 in cpu_map__delete (map=0x156cc40) at cpumap.c:31 #3 0x00000000012d5f7a in perf_cpu_map__put (map=0x156cc40) at cpumap.c:45 #4 0x00000000012d723a in perf_cpu_map__merge (orig=0x156cc40, other=0x156cc60) at cpumap.c:343 #5 0x000000000110cdd0 in test__cpu_map_merge ( test=0x14ea6c8 <generic_tests+2856>, subtest=-1) at tests/cpumap.c:128 Thus the perf_cpu_map__put(a) (see (1) above) frees map 'a' a second time and causes the failure. Fix this be removing that function call. Output after: [root@m35lp76 perf]# ./perf test -Fvvvvv 52 52: Merge cpu map : --- start --- cpumask list: 1-2,4-5,7 ---- end ---- Merge cpu map: Ok [root@m35lp76 perf]# Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: sumanthk@linux.ibm.com Link: http://lore.kernel.org/lkml/20200120132011.64698-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | | perf parse: Copy string to perf_evsel_config_termLeo Yan2020-01-303-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf with CoreSight fails to record trace data with command: perf record -e cs_etm/@tmc_etr0/u --per-thread ls failed to set sink "" on event cs_etm/@tmc_etr0/u with 21 (Is a directory)/perf/ This failure is root caused with the commit 1dc925568f01 ("perf parse: Add a deep delete for parse event terms"). The log shows, cs_etm fails to parse the sink attribution; cs_etm event relies on the event configuration to pass sink name, but the event specific configuration data cannot be passed properly with flow: get_config_terms() ADD_CONFIG_TERM(DRV_CFG, term->val.str); __t->val.str = term->val.str; `> __t->val.str is assigned to term->val.str; parse_events_terms__purge() parse_events_term__delete() zfree(&term->val.str); `> term->val.str is freed and assigned to NULL pointer; cs_etm_set_sink_attr() sink = __t->val.str; `> sink string has been freed. To fix this issue, in the function get_config_terms(), this patch changes to use strdup() for allocation a new duplicate string rather than directly assignment string pointer. This patch addes a new field 'free_str' in the data structure perf_evsel_config_term; 'free_str' is set to true when the union is used as a string pointer; thus it can tell perf_evsel__free_config_terms() to free the string. Fixes: 1dc925568f01 ("perf parse: Add a deep delete for parse event terms") Suggested-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20200117055251.24058-2-leo.yan@linaro.org [ Use zfree() in perf_evsel__free_config_terms ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> :# modified: tools/perf/util/evsel_config.h
| | * | | | | | | perf parse: Refactor 'struct perf_evsel_config_term'Leo Yan2020-01-304-29/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct perf_evsel_config_term::val is a union which contains fields 'callgraph', 'drv_cfg' and 'branch' as string pointers. This leads to the complex code logic for handling every type's string separately, and it's hard to release string as a general way. This patch refactors the structure to add a common field 'str' in the 'val' union as string pointer and remove the other three fields 'callgraph', 'drv_cfg' and 'branch'. Without passing field name, the patch simplifies the string handling with macro ADD_CONFIG_TERM_STR() for string pointer assignment. This patch fixes multiple warnings of line over 80 characters detected by checkpatch tool. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20200117055251.24058-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>