summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-03-05kasan: remove use after scope bugs detection.Andrey Ryabinin
Use after scope bugs detector seems to be almost entirely useless for the linux kernel. It exists over two years, but I've seen only one valid bug so far [1]. And the bug was fixed before it has been reported. There were some other use-after-scope reports, but they were false-positives due to different reasons like incompatibility with structleak plugin. This feature significantly increases stack usage, especially with GCC < 9 version, and causes a 32K stack overflow. It probably adds performance penalty too. Given all that, let's remove use-after-scope detector entirely. While preparing this patch I've noticed that we mistakenly enable use-after-scope detection for clang compiler regardless of CONFIG_KASAN_EXTRA setting. This is also fixed now. [1] http://lkml.kernel.org/r/<20171129052106.rhgbjhhis53hkgfn@wfg-t540p.sh.intel.com> Link: http://lkml.kernel.org/r/20190111185842.13978-1-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Will Deacon <will.deacon@arm.com> [arm64] Cc: Qian Cai <cai@lca.pw> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05mm: hwpoison: fix thp split handing in soft_offline_in_use_page()zhongjiang
When soft_offline_in_use_page() runs on a thp tail page after pmd is split, we trigger the following VM_BUG_ON_PAGE(): Memory failure: 0x3755ff: non anonymous thp __get_any_page: 0x3755ff: unknown zero refcount page type 2fffff80000000 Soft offlining pfn 0x34d805 at process virtual address 0x20fff000 page:ffffea000d360140 count:0 mapcount:0 mapping:0000000000000000 index:0x1 flags: 0x2fffff80000000() raw: 002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) ------------[ cut here ]------------ kernel BUG at ./include/linux/mm.h:519! soft_offline_in_use_page() passed refcount and page lock from tail page to head page, which is not needed because we can pass any subpage to split_huge_page(). Naoya had fixed a similar issue in c3901e722b29 ("mm: hwpoison: fix thp split handling in memory_failure()"). But he missed fixing soft offline. Link: http://lkml.kernel.org/r/1551452476-24000-1-git-send-email-zhongjiang@huawei.com Fixes: 61f5d698cc97 ("mm: re-enable THP") Signed-off-by: zhongjiang <zhongjiang@huawei.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05smb3: request more credits on normal (non-large read/write) opsSteve French
We can end up building up credits too slowly to do large operations (reads and writes for example) that require many credits. By comparison most other SMB3 clients request many more (sometimes thousands) of credits on all operations. Increase the number of credits we request on typical (non-large e.g read/write) operations to 10 from 2 so we can build a pool of credits faster. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-03-05CIFS: Mask off signals when sending SMB packetsPavel Shilovsky
We don't want to break SMB sessions if we receive signals when sending packets through the network. Fix it by masking off signals inside __smb_send_rqst() to avoid partial packet sends due to interrupts. Return -EINTR if a signal is pending and only a part of the packet was sent. Return a success status code if the whole packet was sent regardless of signal being pending or not. This keeps a mid entry for the request in the pending queue and allows the demultiplex thread to handle a response from the server properly. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Return -EAGAIN instead of -ENOTSOCKPavel Shilovsky
When we attempt to send a packet while the demultiplex thread is in the middle of cifs_reconnect() we may end up returning -ENOTSOCK to upper layers. The intent here is to retry the request once the TCP connection is up, so change it to return -EAGAIN instead. The latter error code is retryable and the upper layers will retry the request if needed. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Only send SMB2_NEGOTIATE command on new TCP connectionsPavel Shilovsky
Do not allow commands other than SMB2_NEGOTIATE to be sent over recently established TCP connections. Return -EAGAIN to let upper layers handle it properly. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Fix read after write for files with read cachingPavel Shilovsky
When we have a READ lease for a file and have just issued a write operation to the server we need to purge the cache and set oplock/lease level to NONE to avoid reading stale data. Currently we do that only if a write operation succedeed thus not covering cases when a request was sent to the server but a negative error code was returned later for some other reasons (e.g. -EIOCBQUEUED or -EINTR). Fix this by turning off caching regardless of the error code being returned. The patches fixes generic tests 075 and 112 from the xfs-tests. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-03-05smb3: for kerberos mounts display the credential uid usedSteve French
For kerberos mounts, the cruid is helpful to display in /proc/mounts in order to tell which uid's krb5 cache we got the ticket for and to tell in the multiuser krb5 case which local users (uids) we have Kerberos authentic sessions for. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-03-05cifs: use correct format charactersLouis Taylor
When compiling with -Wformat, clang emits the following warnings: fs/cifs/smb1ops.c:312:20: warning: format specifies type 'unsigned short' but the argument has type 'unsigned int' [-Wformat] tgt_total_cnt, total_in_tgt); ^~~~~~~~~~~~ fs/cifs/cifs_dfs_ref.c:289:4: warning: format specifies type 'short' but the argument has type 'int' [-Wformat] ref->flags, ref->server_type); ^~~~~~~~~~ fs/cifs/cifs_dfs_ref.c:289:16: warning: format specifies type 'short' but the argument has type 'int' [-Wformat] ref->flags, ref->server_type); ^~~~~~~~~~~~~~~~ fs/cifs/cifs_dfs_ref.c:291:4: warning: format specifies type 'short' but the argument has type 'int' [-Wformat] ref->ref_flag, ref->path_consumed); ^~~~~~~~~~~~~ fs/cifs/cifs_dfs_ref.c:291:19: warning: format specifies type 'short' but the argument has type 'int' [-Wformat] ref->ref_flag, ref->path_consumed); ^~~~~~~~~~~~~~~~~~ The types of these arguments are unconditionally defined, so this patch updates the format character to the correct ones for ints and unsigned ints. Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Louis Taylor <louis@kragniz.eu> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2019-03-05smb3: add dynamic trace point for query_info_enter/doneSteve French
Adds dynamic trace points for the query_info_enter and query_info_done (no error) case. We only had one existing trace point related to this which was on query_info errors. Note that these two new tracepoints are for the non-compounded query_info paths. Sample output (from: trace-cmd record -e smb3_query_info*) ls-24140 [001] .... 27811.866068: smb3_query_info_enter: xid=7 sid=0xd2d00587 tid=0xb5441939 fid=0xcf082bac class=18 type=0x1 ls-24140 [001] .... 27811.867656: smb3_query_info_done: xid=7 sid=0xd2d00587 tid=0xb5441939 fid=0xcf082bac class=18 type=0x1 getcifsacl-24149 [005] .... 27854.759873: smb3_query_info_enter: xid=15 sid=0xd2d00587 tid=0xb5441939 fid=0x99896e72 class=0 type=0x3 getcifsacl-24149 [005] .... 27854.761730: smb3_query_info_done: xid=15 sid=0xd2d00587 tid=0xb5441939 fid=0x99896e72 class=0 type=0x3 Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-05smb3: add dynamic trace point for smb3_cmd_enterSteve French
Add tracepoint before sending an SMB3 command on the wire (ie add an smb3_cmd_enter tracepoint). This allows us to look in much more detail at response times (between request and response). Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-05smb3: improve dynamic tracing of open and posix mkdirSteve French
Add dynamic trace point for open_enter (and posix mkdir enter) Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-05smb3: add missing read completion trace pointSteve French
When ENODATA returned we weren't logging the read completion (not an error, but can be indicated by logging length 0) which makes looking at read traces confusing for smb3. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-03-05smb3: Add tracepoints for read, write and query_dir enterSteve French
Allows tracing begin (not just completion) of read, write and query_dir which may be helpful in finding slow requests and other timing information Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-03-05smb3: add tracepoints for query dirSteve French
Adds two tracepoints - one for query_dir done (no err) and one for query_dir_err Sanple output: To start the trace in one window: trace-cmd record -e smb3_query_dir* Then in another window after doing an ls /mnt View the trace output by: trace-cmd show Sample output: TASK-PID CPU# |||| TIMESTAMP FUNCTION | | | |||| | | ls-24869 [007] .... 90695.452009: smb3_query_dir_done: xid=7 sid=0x5027d24d tid=0xb95cf25a fid=0xc41a8c3e offset=0x0 len=0x16 ls-24869 [000] .... 90695.452764: smb3_query_dir_done: xid=8 sid=0x5027d24d tid=0xb95cf25a fid=0xc41a8c3e offset=0x0 len=0x0 ls-24874 [003] .... 90701.506342: smb3_query_dir_done: xid=11 sid=0x5027d24d tid=0xb95cf25a fid=0x33ad3601 offset=0x0 len=0x8 ls-24874 [003] .... 90701.506917: smb3_query_dir_done: xid=12 sid=0x5027d24d tid=0xb95cf25a fid=0x33ad3601 offset=0x0 len=0x0 Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05smb3: Update POSIX negotiate context with POSIX ctxt GUIDSteve French
POSIX negotiate context now includes the GUID specifying which POSIX open context we support. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Jeremy Allison <jra@samba.org>
2019-03-05cifs: update internal module version numberSteve French
To 2.18 Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Try to acquire credits at once for compound requestsPavel Shilovsky
Currently we get one credit per compound part of the request individually. This may lead to being stuck on waiting for credits if multiple compounded operations happen in parallel. Try acquire credits for all compound parts at once. Return immediately if not enough credits and too few requests are in flight currently thus narrowing the possibility of infinite waiting for credits. The more advance fix is to return right away if not enough credits for the compound request and do not look at the number of requests in flight. The caller should handle such situations by falling back to sequential execution of SMB commands instead of compounding. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Return error code when getting file handle for writebackPavel Shilovsky
Now we just return NULL cifsFileInfo pointer in cases we didn't find or couldn't reopen a file. This hides errors from cifs_reopen_file() especially retryable errors which should be handled appropriately. Create new cifs_get_writable_file() routine that returns error codes from cifs_reopen_file() and use it in the writeback codepath. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Move open file handling to writepagesPavel Shilovsky
Currently we check for an open file existence in wdata_send_pages() which doesn't provide an easy way to handle error codes that will be returned from find_writable_filehandle() once it is changed. Move the check to writepages. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Move unlocking pages from wdata_send_pages()Pavel Shilovsky
Currently wdata_send_pages() unlocks pages after sending. This complicates further refactoring and doesn't align with the function name. Move unlocking to writepages. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Find and reopen a file before get MTU credits in writepagesPavel Shilovsky
Reorder finding and reopening a writable handle file and getting MTU credits in writepages because we may be stuck on low credits otherwise. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Reopen file before get SMB2 MTU credits for async IOPavel Shilovsky
Currently we get MTU credits before we check an open file if it needs to be reopened. Reopening the file in such conditions leads to a possibility of being stuck waiting indefinitely for credits in the transport layer. Fix this by reopening the file first if needed and then getting MTU credits for async IO. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Remove custom credit adjustments for SMB2 async IOPavel Shilovsky
Currently we do proper accounting for credits in regards to reconnects and error handling, thus we do not need custom credit adjustments when reconnect is detected developed previously. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Adjust MTU credits before reopening a filePavel Shilovsky
Currently we adjust MTU credits before sending an IO request and after reopening a file. This approach doesn't allow the reopen routine to use existing credits that are not needed for IO. Reorder credit adjustment and reopening a file to use credits available to the client more efficiently. Also unwrap complex if statement into few pieces to improve readability. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Check for reconnects before sending compound requestsPavel Shilovsky
The reconnect might have happended after we obtained credits and before we acquired srv_mutex. Check for that under the mutex and retry a sync operation if the reconnect is detected. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Check for reconnects before sending async requestsPavel Shilovsky
The reconnect might have happended after we obtained credits and before we acquired srv_mutex. Check for that under the mutex and retry an async operation if the reconnect is detected. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Respect reconnect in non-MTU credits calculationsPavel Shilovsky
Every time after a session reconnect we don't need to account for credits obtained in previous sessions. Make use of the recently added cifs_credits structure to properly calculate credits for non-MTU requests the same way we did for MTU ones. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Respect reconnect in MTU credits calculationsPavel Shilovsky
Every time after a session reconnect we don't need to account for credits obtained in previous sessions. Introduce new struct cifs_credits which contains both credits value and reconnect instance of the time those credits were taken. Modify a routine that add credits back to handle the reconnect instance by assuming zero credits if the reconnect happened after the credits were obtained and before we decided to add them back due to some errors during sending. This patch fixes the MTU credits cases. The subsequent patch will handle non-MTU ones. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Set reconnect instance to one initiallyPavel Shilovsky
Currently we set reconnect instance to zero on the first connection but this is not convenient because we need to reserve some special value for credit handling on reconnects which is coming in subsequent patches. Fix this by starting with one when initiating a new TCP connection. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU related changes in this cycle were: - Additional cleanups after RCU flavor consolidation - Grace-period forward-progress cleanups and improvements - Documentation updates - Miscellaneous fixes - spin_is_locked() conversions to lockdep - SPDX changes to RCU source and header files - SRCU updates - Torture-test updates, including nolibc updates and moving nolibc to tools/include" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits) locking/locktorture: Convert to SPDX license identifier linux/torture: Convert to SPDX license identifier torture: Convert to SPDX license identifier linux/srcu: Convert to SPDX license identifier linux/rcutree: Convert to SPDX license identifier linux/rcutiny: Convert to SPDX license identifier linux/rcu_sync: Convert to SPDX license identifier linux/rcu_segcblist: Convert to SPDX license identifier linux/rcupdate: Convert to SPDX license identifier linux/rcu_node_tree: Convert to SPDX license identifier rcu/update: Convert to SPDX license identifier rcu/tree: Convert to SPDX license identifier rcu/tiny: Convert to SPDX license identifier rcu/sync: Convert to SPDX license identifier rcu/srcu: Convert to SPDX license identifier rcu/rcutorture: Convert to SPDX license identifier rcu/rcu_segcblist: Convert to SPDX license identifier rcu/rcuperf: Convert to SPDX license identifier rcu/rcu.h: Convert to SPDX license identifier RCU/torture.txt: Remove section MODULE PARAMETERS ...
2019-03-05Merge branch 'timers-2038-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull year 2038 updates from Thomas Gleixner: "Another round of changes to make the kernel ready for 2038. After lots of preparatory work this is the first set of syscalls which are 2038 safe: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 The syscall numbers are identical all over the architectures" * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) riscv: Use latest system call ABI checksyscalls: fix up mq_timedreceive and stat exceptions unicore32: Fix __ARCH_WANT_STAT64 definition asm-generic: Make time32 syscall numbers optional asm-generic: Drop getrlimit and setrlimit syscalls from default list 32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option compat ABI: use non-compat openat and open_by_handle_at variants y2038: add 64-bit time_t syscalls to all 32-bit architectures y2038: rename old time and utime syscalls y2038: remove struct definition redirects y2038: use time32 syscall names on 32-bit syscalls: remove obsolete __IGNORE_ macros y2038: syscalls: rename y2038 compat syscalls x86/x32: use time64 versions of sigtimedwait and recvmmsg timex: change syscalls to use struct __kernel_timex timex: use __kernel_timex internally sparc64: add custom adjtimex/clock_adjtime functions time: fix sys_timer_settime prototype time: Add struct __kernel_timex time: make adjtime compat handling available for 32 bit ...
2019-03-05PCI: Fix "try" semantics of bus and slot resetAlex Williamson
The commit referenced below introduced device locking around save and restore of state for each device during a PCI bus "try" reset, making it decidely non-"try" and prone to deadlock in the event that a device is already locked. Restore __pci_reset_bus() and __pci_reset_slot() to their advertised locking semantics by pushing the save and restore functions into the branch where the entire tree is already locked. Extend the helper function names with "_locked" and update the comment to reflect this calling requirement. Fixes: b014e96d1abb ("PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>
2019-03-05PCI/LINK: Report degraded links via link bandwidth notificationAlexandru Gagniuc
A warning is generated when a PCIe device is probed with a degraded link, but there was no similar mechanism to warn when the link becomes degraded after probing. The Link Bandwidth Notification provides this mechanism. Use the Link Bandwidth Management Interrupt to detect bandwidth changes, and rescan the bandwidth, looking for the weakest point. This is the same logic used in probe(). Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de>
2019-03-05net/sched: act_tunnel_key: Fix double free dst_cachewenxu
dst_cache_destroy will be called in dst_release dst_release-->dst_destroy_rcu-->dst_destroy-->metadata_dst_free -->dst_cache_destroy It should not call dst_cache_destroy before dst_release Fixes: 41411e2fd6b8 ("net/sched: act_tunnel_key: Add dst_cache support") Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-05Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti update from Thomas Gleixner: "Just a single change from the anti-performance departement: - Add a new PR_SPEC_DISABLE_NOEXEC option which allows to apply the speculation protections on a process without inheriting the state on exec. This remedies a situation where a Java-launcher has speculation protections enabled because that's the default for JVMs which causes the launched regular harmless processes to inherit the protection state which results in unintended performance degradation" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation: Add PR_SPEC_DISABLE_NOEXEC
2019-03-05tipc: fix RDM/DGRAM connect() regressionErik Hugne
Fix regression bug introduced in commit 365ad353c256 ("tipc: reduce risk of user starvation during link congestion") Only signal -EDESTADDRREQ for RDM/DGRAM if we don't have a cached sockaddr. Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion") Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-05Merge branch 'irq-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The interrupt departement delivers this time: - New infrastructure to manage NMIs on platforms which have a sane NMI delivery, i.e. identifiable NMI vectors instead of a single lump. - Simplification of the interrupt affinity management so drivers don't have to implement ugly loops around the PCI/MSI enablement. - Speedup for interrupt statistics in /proc/stat - Provide a function to retrieve the default irq domain - A new interrupt controller for the Loongson LS1X platform - Affinity support for the SiFive PLIC - Better support for the iMX irqsteer driver - NUMA aware memory allocations for GICv3 - The usual small fixes, improvements and cleanups all over the place" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) irqchip/imx-irqsteer: Add multi output interrupts support irqchip/imx-irqsteer: Change to use reg_num instead of irq_group dt-bindings: irq: imx-irqsteer: Add multi output interrupts support dt-binding: irq: imx-irqsteer: Use irq number instead of group number irqchip/brcmstb-l2: Use _irqsave locking variants in non-interrupt code irqchip/gicv3-its: Use NUMA aware memory allocation for ITS tables irqdomain: Allow the default irq domain to be retrieved irqchip/sifive-plic: Implement irq_set_affinity() for SMP host irqchip/sifive-plic: Differentiate between PLIC handler and context irqchip/sifive-plic: Add warning in plic_init() if handler already present irqchip/sifive-plic: Pre-compute context hart base and enable base PCI/MSI: Remove obsolete sanity checks for multiple interrupt sets genirq/affinity: Remove the leftovers of the original set support nvme-pci: Simplify interrupt allocation genirq/affinity: Add new callback for (re)calculating interrupt sets genirq/affinity: Store interrupt sets size in struct irq_affinity genirq/affinity: Code consolidation irqchip/irq-sifive-plic: Check and continue in case of an invalid cpuid. irqchip/i8259: Fix shutdown order by moving syscore_ops registration dt-bindings: interrupt-controller: loongson ls1x intc ...
2019-03-05Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer and clockevent updates from Thomas Gleixner: "The time(r) core and clockevent updates are mostly boring this time: - A new driver for the Tegra210 timer - Small fixes and improvements alll over the place - Documentation updates and cleanups" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) soc/tegra: default select TEGRA_TIMER for Tegra210 clocksource/drivers/tegra: Add Tegra210 timer support dt-bindings: timer: add Tegra210 timer clocksource/drivers/timer-cs5535: Rename the file for consistency clocksource/drivers/timer-pxa: Rename the file for consistency clocksource/drivers/tango-xtal: Rename the file for consistency dt-bindings: timer: gpt: update binding doc clocksource/drivers/exynos_mct: Remove unused header includes dt-bindings: timer: mediatek: update bindings for MT7629 SoC clocksource/drivers/exynos_mct: Fix error path in timer resources initialization clocksource/drivers/exynos_mct: Remove dead code clocksource/drivers/riscv: Add required checks during clock source init dt-bindings: timer: renesas: tmu: Document r8a774c0 bindings dt-bindings: timer: renesas, cmt: Document r8a774c0 CMT support clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable timers: Mark expected switch fall-throughs timekeeping/debug: No need to check return value of debugfs_create functions ...
2019-03-05dm snapshot: don't define direct_access if we don't support itMikulas Patocka
Don't define a direct_access function that fails, dm_dax_direct_access already fails with -EIO if the pointer is zero; Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm cache: add support for discard passdown to the origin deviceMike Snitzer
DM cache now defaults to passing discards down to the origin device. User may disable this using the "no_discard_passdown" feature when creating the cache device. If the cache's underlying origin device doesn't support discards then passdown is disabled (with warning). Similarly, if the underlying origin device's max_discard_sectors is less than a cache block discard passdown will be disabled (this is required because sizing of the cache internal discard bitset depends on it). Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm writecache: fix typo in name for writeback_wqHuaisheng Ye
The workqueue's name should be "writecache-writeback" instead of "writecache-writeabck". Signed-off-by: Huaisheng Ye <yehs1@lenovo.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm: add support to directly boot to a mapped deviceHelen Koike
Add a "create" module parameter, which allows device-mapper targets to be configured at boot time. This enables early use of DM targets in the boot process (as the root device or otherwise) without the need of an initramfs. The syntax used in the boot param is based on the concise format from the dmsetup tool to follow the rule of least surprise: dmsetup table --concise /dev/mapper/lroot Which is: dm-mod.create=<name>,<uuid>,<minor>,<flags>,<table>[,<table>+][;<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]+] Where, <name> ::= The device name. <uuid> ::= xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | "" <minor> ::= The device minor number | "" <flags> ::= "ro" | "rw" <table> ::= <start_sector> <num_sectors> <target_type> <target_args> <target_type> ::= "verity" | "linear" | ... For example, the following could be added in the boot parameters: dm-mod.create="lroot,,,rw, 0 4096 linear 98:16 0, 4096 4096 linear 98:32 0" root=/dev/dm-0 Only the targets that were tested are allowed and the ones that don't change any block device when the device is create as read-only. For example, mirror and cache targets are not allowed. The rationale behind this is that if the user makes a mistake, choosing the wrong device to be the mirror or the cache can corrupt data. The only targets initially allowed are: * crypt * delay * linear * snapshot-origin * striped * verity Co-developed-by: Will Drewry <wad@chromium.org> Co-developed-by: Kees Cook <keescook@chromium.org> Co-developed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm thin: add sanity checks to thin-pool and external snapshot creationJason Cai (Xiang Feng)
Invoking dm_get_device() twice on the same device path with different modes is dangerous. Because in that case, upgrade_mode() will alloc a new 'dm_dev' and free the old one, which may be referenced by a previous caller. Dereferencing the dangling pointer will trigger kernel NULL pointer dereference. The following two cases can reproduce this issue. Actually, they are invalid setups that must be disallowed, e.g.: 1. Creating a thin-pool with read_only mode, and the same device as both metadata and data. dmsetup create thinp --table \ "0 41943040 thin-pool /dev/vdb /dev/vdb 128 0 1 read_only" BUG: unable to handle kernel NULL pointer dereference at 0000000000000080 ... Call Trace: new_read+0xfb/0x110 [dm_bufio] dm_bm_read_lock+0x43/0x190 [dm_persistent_data] ? kmem_cache_alloc_trace+0x15c/0x1e0 __create_persistent_data_objects+0x65/0x3e0 [dm_thin_pool] dm_pool_metadata_open+0x8c/0xf0 [dm_thin_pool] pool_ctr.cold.79+0x213/0x913 [dm_thin_pool] ? realloc_argv+0x50/0x70 [dm_mod] dm_table_add_target+0x14e/0x330 [dm_mod] table_load+0x122/0x2e0 [dm_mod] ? dev_status+0x40/0x40 [dm_mod] ctl_ioctl+0x1aa/0x3e0 [dm_mod] dm_ctl_ioctl+0xa/0x10 [dm_mod] do_vfs_ioctl+0xa2/0x600 ? handle_mm_fault+0xda/0x200 ? __do_page_fault+0x26c/0x4f0 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x55/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9 2. Creating a external snapshot using the same thin-pool device. dmsetup create thinp --table \ "0 41943040 thin-pool /dev/vdc /dev/vdb 128 0 2 ignore_discard" dmsetup message /dev/mapper/thinp 0 "create_thin 0" dmsetup create snap --table \ "0 204800 thin /dev/mapper/thinp 0 /dev/mapper/thinp" BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 ... Call Trace: ? __alloc_pages_nodemask+0x13c/0x2e0 retrieve_status+0xa5/0x1f0 [dm_mod] ? dm_get_live_or_inactive_table.isra.7+0x20/0x20 [dm_mod] table_status+0x61/0xa0 [dm_mod] ctl_ioctl+0x1aa/0x3e0 [dm_mod] dm_ctl_ioctl+0xa/0x10 [dm_mod] do_vfs_ioctl+0xa2/0x600 ksys_ioctl+0x60/0x90 ? ksys_write+0x4f/0xb0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x55/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Jason Cai (Xiang Feng) <jason.cai@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm block manager: remove redundant unlikely annotationChengguang Xu
unlikely has already included in IS_ERR(), so just remove redundant unlikely annotation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm verity fec: remove redundant unlikely annotationChengguang Xu
unlikely has already included in IS_ERR(), so just remove redundant unlikely annotation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm integrity: remove redundant unlikely annotationChengguang Xu
unlikely has already included in IS_ERR(), so just remove redundant unlikely annotation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm: always call blk_queue_split() in dm_process_bio()Mike Snitzer
Do not just call blk_queue_split() if the bio is_abnormal_io(). Fixes: 568c73a355e ("dm: update dm_process_bio() to split bio if in ->make_request_fn()") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm: fix to_sector() for 32bitNeilBrown
A dm-raid array with devices larger than 4GB won't assemble on a 32 bit host since _check_data_dev_sectors() was added in 4.16. This is because to_sector() treats its argument as an "unsigned long" which is 32bits (4GB) on a 32bit host. Using "unsigned long long" is more correct. Kernels as early as 4.2 can have other problems due to to_sector() being used on the size of a device. Fixes: 0cf4503174c1 ("dm raid: add support for the MD RAID0 personality") cc: stable@vger.kernel.org (v4.2+) Reported-and-tested-by: Guillaume Perréal <gperreal@free.fr> Signed-off-by: NeilBrown <neil@brown.name> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-05dm switch: use struct_size() in kzalloc()Gustavo A. R. Silva
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void *entry[]; }; instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>