summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-01-04fat: new inline functions to determine the FAT variant (32, 16 or 12)Carmeli Tamir
This patch introduces 3 new inline functions - is_fat12, is_fat16 and is_fat32, and replaces every occurrence in the code in which the FS variant (whether this is FAT12, FAT16 or FAT32) was previously checked using msdos_sb_info->fat_bits. Link: http://lkml.kernel.org/r/1544990640-11604-4-git-send-email-carmeli.tamir@gmail.com Signed-off-by: Carmeli Tamir <carmeli.tamir@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fat: move MAX_FAT to fat.h and change it to inline functionCarmeli Tamir
MAX_FAT is useless in msdos_fs.h, since it uses the MSDOS_SB function that is defined in fat.h. So really, this macro can be only called from code that already includes fat.h. Hence, this patch moves it to fat.h, right after MSDOS_SB is defined. I also changed it to an inline function in order to save the double call to MSDOS_SB. This was suggested by joe@perches.com in the previous version. This patch is required for the next in the series, in which the variant (whether this is FAT12, FAT16 or FAT32) checks are replaced with new macros. Link: http://lkml.kernel.org/r/1544990640-11604-3-git-send-email-carmeli.tamir@gmail.com Signed-off-by: Carmeli Tamir <carmeli.tamir@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fat: remove FAT_FIRST_ENT macroCarmeli Tamir
The comment edited in this patch was the only reference to the FAT_FIRST_ENT macro, which is not used anymore. Moreover, the commented line of code does not compile with the current code. Since the FAT_FIRST_ENT macro checks the FAT variant in a way that the patch series changes, I removed it, and instead wrote a clear explanation of what was checked. I verified that the changed comment is correct according to Microsoft FAT spec, search for "BPB_Media" in the following references: 1. Microsoft FAT specification 2005 (http://read.pudn.com/downloads77/ebook/294884/FAT32%20Spec%20%28SDA%20Contribution%29.pdf). Search for 'volume label'. 2. Microsoft Extensible Firmware Initiative, FAT32 File System Specification (https://staff.washington.edu/dittrich/misc/fatgen103.pdf). Search for 'volume label'. Link: http://lkml.kernel.org/r/1544990640-11604-2-git-send-email-carmeli.tamir@gmail.com Signed-off-by: Carmeli Tamir <carmeli.tamir@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04include/uapi/linux/msdos_fs.h: use MSDOS_NAME for volume label sizeCarmeli Tamir
The FAT file system volume label file stored in the root directory should match the volume label field in the FAT boot sector. As consequence, the max length of these fields ought to be the same. This patch replaces the magic '11' usef in the struct fat_boot_sector with MSDOS_NAME, which is used in struct msdos_dir_entry. Please check the following references: 1. Microsoft FAT specification 2005 (http://read.pudn.com/downloads77/ebook/294884/FAT32%20Spec%20%28SDA%20Contribution%29.pdf). Search for 'volume label'. 2. Microsoft Extensible Firmware Initiative, FAT32 File System Specification (https://staff.washington.edu/dittrich/misc/fatgen103.pdf). Search for 'volume label'. 3. User space code that creates FAT filesystem sometimes uses MSDOS_NAME for the label, sometimes not. Search for 'if (memcmp(label, NO_NAME, MSDOS_NAME))'. I consider to make the same patch there as well. https://github.com/dosfstools/dosfstools/blob/master/src/mkfs.fat.c Link: http://lkml.kernel.org/r/1543096879-82837-1-git-send-email-carmeli.tamir@gmail.com Signed-off-by: Carmeli Tamir <carmeli.tamir@gmail.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Jens Axboe <axboe@kernel.dk> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04hfsplus: return file attributes on statxErnesto A. Fernández
The immutable, append-only and no-dump attributes can only be retrieved with an ioctl; implement the ->getattr() method to return them on statx. Do not return the inode birthtime yet, because the issue of how best to handle the post-2038 timestamps is still under discussion. This patch is needed to pass xfstests generic/424. Link: http://lkml.kernel.org/r/20181014163558.sxorxlzjqccq2lpw@eaf Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Cc: Viacheslav Dubeyko <slava@dubeyko.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04autofs: add strictexpire mount optionIan Kent
Commit 092a53452bb7 ("autofs: take more care to not update last_used on path walk") helped to (partially) resolve a problem where automounts were not expiring due to aggressive accesses from user space. This patch was later reverted because, for very large environments, it meant more mount requests from clients and when there are a lot of clients this caused a fairly significant increase in server load. But there is a need for both types of expire check, depending on use case, so add a mount option to allow for strict update of last use of autofs dentrys (which just means not updating the last use on path walk access). Link: http://lkml.kernel.org/r/154296973880.9889.14085372741514507967.stgit@pluto-themaw-net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04autofs: change catatonic setting to a bit flagIan Kent
Change the superblock info. catatonic setting to be part of a flags bit field. Link: http://lkml.kernel.org/r/154296973142.9889.17275721668508589639.stgit@pluto-themaw-net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04autofs: simplify parse_options() function callIan Kent
The parse_options() function uses a long list of parameters, most of which are present in the super block info structure already. The mount parameters set in parse_options() options don't require cleanup so using the super block info struct directly is simpler. Link: http://lkml.kernel.org/r/154296972423.9889.9368859245676473329.stgit@pluto-themaw-net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04autofs: improve ioctl sbi checksIan Kent
Al Viro made some suggestions to improve the implementation of commit 0633da48f0 ("fix autofs_sbi() does not check super block type"). The check is unnecessary in all cases except for ioctl usage so placing the check in the super block accessor function adds a small overhead to the common case where it isn't needed. So it's sufficient to do this in the ioctl code only. Also the check in the ioctl code is needlessly complex. [akpm@linux-foundation.org: declare autofs_fs_type in .h, not .c] Link: http://lkml.kernel.org/r/154296970987.9889.1597442413573683096.stgit@pluto-themaw-net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04init/main.c: make "initcall_level_names[]" const char *Alexey Dobriyan
Initcall names should not be changed. Link: http://lkml.kernel.org/r/20181124091829.GD10969@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/epoll: deal with wait_queue only onceDavidlohr Bueso
There is no reason why we rearm the waitiqueue upon every fetch_events retry (for when events are found yet send_events() fails). If nothing else, this saves four lock operations per retry, and furthermore reduces the scope of the lock even further. [akpm@linux-foundation.org: restore code to original position, fix and reflow comment] Link: http://lkml.kernel.org/r/20181114182532.27981-2-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/epoll: rename check_events label to send_eventsDavidlohr Bueso
It is currently called check_events because it, well, did exactly that. However, since the lockless ep_events_available() call, the label no longer checks, but just sends the events. Rename as such. Link: http://lkml.kernel.org/r/20181114182532.27981-1-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/epoll: avoid barrier after an epoll_wait(2) timeoutDavidlohr Bueso
Upon timeout, we can just exit out of the loop, without the cost of the changing the task's state with an smp_store_mb call. Just exit out of the loop and be done - setting the task state afterwards will be, of course, redundant. [dave@stgolabs.net: forgotten fixlets] Link: http://lkml.kernel.org/r/20181109155258.jxcr4t2pnz6zqct3@linux-r8p5 Link: http://lkml.kernel.org/r/20181108051006.18751-7-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/epoll: reduce the scope of wq lock in epoll_wait()Davidlohr Bueso
This patch aims at reducing ep wq.lock hold times in epoll_wait(2). For the blocking case, there is no need to constantly take and drop the spinlock, which is only needed to manipulate the waitqueue. The call to ep_events_available() is now lockless, and only exposed to benign races. Here, if false positive (returns available events and does not see another thread deleting an epi from the list) we call into send_events and then the list's state is correctly seen. Otoh, if a false negative and we don't see a list_add_tail(), for example, from irq callback, then it is rechecked again before blocking, which will see the correct state. In order for more accuracy to see concurrent list_del_init(), use the list_empty_careful() variant -- of course, this won't be safe against insertions from wakeup. For the overflow list we obviously need to prevent load/store tearing as we don't want to see partial values while the ready list is disabled. [dave@stgolabs.net: forgotten fixlets] Link: http://lkml.kernel.org/r/20181109155258.jxcr4t2pnz6zqct3@linux-r8p5 Link: http://lkml.kernel.org/r/20181108051006.18751-6-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Suggested-by: Jason Baron <jbaron@akamai.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/epoll: robustify ep->mtx held checksDavidlohr Bueso
Insted of just commenting how important it is, lets make it more robust and add a lockdep_assert_held() call. Link: http://lkml.kernel.org/r/20181108051006.18751-5-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/epoll: drop ovflist branch predictionDavidlohr Bueso
The ep->ovflist is a secondary ready-list to temporarily store events that might occur when doing sproc without holding the ep->wq.lock. This accounts for every time we check for ready events and also send events back to userspace; both callbacks, particularly the latter because of copy_to_user, can account for a non-trivial time. As such, the unlikely() check to see if the pointer is being used, seems both misleading and sub-optimal. In fact, we go to an awful lot of trouble to sync both lists, and populating the ovflist is far from an uncommon scenario. For example, profiling a concurrent epoll_wait(2) benchmark, with CONFIG_PROFILE_ANNOTATED_BRANCHES shows that for a two threads a 33% incorrect rate was seen; and when incrementally increasing the number of epoll instances (which is used, for example for multiple queuing load balancing models), up to a 90% incorrect rate was seen. Similarly, by deleting the prediction, 3% throughput boost was seen across incremental threads. Link: http://lkml.kernel.org/r/20181108051006.18751-4-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/epoll: simplify ep_send_events_proc() ready-list loopDavidlohr Bueso
The current logic is a bit convoluted. Lets simplify this with a standard list_for_each_entry_safe() loop instead and just break out after maxevents is reached. While at it, remove an unnecessary indentation level in the loop when there are in fact ready events. Link: http://lkml.kernel.org/r/20181108051006.18751-3-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/epoll: remove max_nests argument from ep_call_nested()Davidlohr Bueso
Patch series "epoll: some miscellaneous optimizations". The following are some incremental optimizations on some of the epoll core. Each patch has the details, but together, the series is seen to shave off measurable cycles on a number of systems and workloads. For example, on a 40-core IB, a pipetest as well as parallel epoll_wait() benchmark show around a 20-30% increase in raw operations per second when the box is fully occupied (incremental thread counts), and up to 15% performance improvement with lower counts. Passes ltp epoll related testcases. This patch(of 6): All callers pass the EP_MAX_NESTS constant already, so lets simplify this a tad and get rid of the redundant parameter for nested eventpolls. Link: http://lkml.kernel.org/r/20181108051006.18751-2-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04checkpatch: warn on const char foo[] = "bar"; declarationsJoe Perches
These declarations should generally be static const to avoid poor compilation and runtime performance where compilers tend to initialize the const declaration for every call instead of using .rodata for the string. Miscellanea: - Convert spaces to tabs for indentation in 2 adjacent checks Link: http://lkml.kernel.org/r/10ea5f4b087dc911e41e187a4a2b5e79c7529aa3.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04drivers/firmware/memmap.c: modify memblock_alloc to memblock_alloc_nopanichuang.zijiang
memblock_alloc() never returns NULL because panic never returns. Link: http://lkml.kernel.org/r/1545640882-42009-1-git-send-email-huang.zijiang@zte.com.cn Signed-off-by: huang.zijiang <huang.zijiang@zte.com.cn> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04lib/genalloc.c: use vzalloc_node() to allocate the bitmapHuang Shijie
Some devices may have big memory on chip, such as over 1G. In some cases, the nbytes maybe bigger then 4M which is the bounday of the memory buddy system (4K default). So use vzalloc_node() to allocate the bitmap. Also use vfree to free it. Link: http://lkml.kernel.org/r/20181225015701.6289-1-sjhuang@iluvatar.ai Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Skidanov <alexey.skidanov@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04lib/find_bit_benchmark.c: align test_find_next_and_bit with othersYury Norov
Contrary to other tests, test_find_next_and_bit() test uses tab formatting in output and get_cycles() instead of ktime_get(). get_cycles() is not supported by some arches, so ktime_get() fits better in generic code. Fix it and minor style issues, so the output looks like this: Start testing find_bit() with random-filled bitmap find_next_bit: 7142816 ns, 163282 iterations find_next_zero_bit: 8545712 ns, 164399 iterations find_last_bit: 6332032 ns, 163282 iterations find_first_bit: 20509424 ns, 16606 iterations find_next_and_bit: 4060016 ns, 73424 iterations Start testing find_bit() with sparse bitmap find_next_bit: 55984 ns, 656 iterations find_next_zero_bit: 19197536 ns, 327025 iterations find_last_bit: 65088 ns, 656 iterations find_first_bit: 5923712 ns, 656 iterations find_next_and_bit: 29088 ns, 1 iterations Link: http://lkml.kernel.org/r/20181123174803.10916-1-ynorov@caviumnetworks.com Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: "Norov, Yuri" <Yuri.Norov@cavium.com> Cc: Clement Courbet <courbet@google.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04lib/genalloc.c: fix allocation of aligned buffer from non-aligned chunkAlexey Skidanov
gen_pool_alloc_algo() uses different allocation functions implementing different allocation algorithms. With gen_pool_first_fit_align() allocation function, the returned address should be aligned on the requested boundary. If chunk start address isn't aligned on the requested boundary, the returned address isn't aligned too. The only way to get properly aligned address is to initialize the pool with chunks aligned on the requested boundary. If want to have an ability to allocate buffers aligned on different boundaries (for example, 4K, 1MB, ...), the chunk start address should be aligned on the max possible alignment. This happens because gen_pool_first_fit_align() looks for properly aligned memory block without taking into account the chunk start address alignment. To fix this, we provide chunk start address to gen_pool_first_fit_align() and change its implementation such that it starts looking for properly aligned block with appropriate offset (exactly as is done in CMA). Link: https://lkml.kernel.org/lkml/a170cf65-6884-3592-1de9-4c235888cc8a@intel.com Link: http://lkml.kernel.org/r/1541690953-4623-1-git-send-email-alexey.skidanov@intel.com Signed-off-by: Alexey Skidanov <alexey.skidanov@intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Daniel Mentz <danielmentz@google.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Laura Abbott <labbott@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fls: change parameter to unsigned intMatthew Wilcox
When testing in userspace, UBSAN pointed out that shifting into the sign bit is undefined behaviour. It doesn't really make sense to ask for the highest set bit of a negative value, so just turn the argument type into an unsigned int. Some architectures (eg ppc) already had it declared as an unsigned int, so I don't expect too many problems. Link: http://lkml.kernel.org/r/20181105221117.31828-1-willy@infradead.org Signed-off-by: Matthew Wilcox <willy@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04include/linux/printk.h: drop silly "static inline asmlinkage" from dump_stack()Alexey Dobriyan
Empty function will be inlined so asmlinkage doesn't do anything. Link: http://lkml.kernel.org/r/20181124093530.GE10969@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joey Pabalinas <joeypabalinas@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04drivers/dma-buf/udmabuf.c: convert to use vm_fault_tSouptick Joarder
Use new return type vm_fault_t for fault handler. Link: http://lkml.kernel.org/r/20181106173628.GA12989@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04kernel/hung_task.c: break RCU locks based on jiffiesTetsuo Handa
check_hung_uninterruptible_tasks() is currently calling rcu_lock_break() for every 1024 threads. But check_hung_task() is very slow if printk() was called, and is very fast otherwise. If many threads within some 1024 threads called printk(), the RCU grace period might be extended enough to trigger RCU stall warnings. Therefore, calling rcu_lock_break() for every some fixed jiffies will be safer. Link: http://lkml.kernel.org/r/1544800658-11423-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04kernel/hung_task.c: force console verbose before panicLiu, Chuansheng
Based on commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks before panic"), we could get the call stack of hung task. However, if the console loglevel is not high, we still can not see the useful panic information in practice, and in most cases users don't set console loglevel to high level. This patch is to force console verbose before system panic, so that the real useful information can be seen in the console, instead of being like the following, which doesn't have hung task information. INFO: task init:1 blocked for more than 120 seconds. Tainted: G U W 4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Kernel panic - not syncing: hung_task: blocked tasks CPU: 2 PID: 479 Comm: khungtaskd Tainted: G U W 4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1 Call Trace: dump_stack+0x4f/0x65 panic+0xde/0x231 watchdog+0x290/0x410 kthread+0x12c/0x150 ret_from_fork+0x35/0x40 reboot: panic mode set: p,w Kernel Offset: 0x34000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A6015B675@SHSMSX101.ccr.corp.intel.com Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04build_bug.h: remove most of dummy BUILD_BUG_ON stubs for SparseMasahiro Yamada
The introduction of these dummy BUILD_BUG_ON stubs dates back to commmit 903c0c7cdc21 ("sparse: define dummy BUILD_BUG_ON definition for sparse"). At that time, BUILD_BUG_ON() was implemented with the negative array trick *and* the link-time trick, like this: extern int __build_bug_on_failed; #define BUILD_BUG_ON(condition) \ do { \ ((void)sizeof(char[1 - 2*!!(condition)])); \ if (condition) __build_bug_on_failed = 1; \ } while(0) Sparse is more strict about the negative array trick than GCC because Sparse requires the array length to be really constant. Here is the simple test code for the macro above: static const int x = 0; BUILD_BUG_ON(x); GCC is absolutely fine with it (-Wvla was enabled only very recently), but Sparse warns like this: error: bad constant expression error: cannot size expression (If you are using a newer version of Sparse, you will see a different warning message, "warning: Variable length array is used".) Anyway, Sparse was producing many false positives, and noisier than it should be at that time. With the previous commit, the leftover negative array trick is gone. Sparse is fine with the current BUILD_BUG_ON(), which is implemented by using the 'error' attribute. I am keeping the stub for BUILD_BUG_ON_ZERO(). Otherwise, Sparse would complain about the following code, which GCC is fine with: static const int x = 0; int y = BUILD_BUG_ON_ZERO(x); Link: http://lkml.kernel.org/r/1542856462-18836-3-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04build_bug.h: remove negative-array fallback for BUILD_BUG_ON()Masahiro Yamada
The kernel can only be compiled with an optimization option (-O2, -Os, or the currently proposed -Og). Hence, __OPTIMIZE__ is always defined in the kernel source. The fallback for the -O0 case is just hypothetical and pointless. Moreover, commit 0bb95f80a38f ("Makefile: Globally enable VLA warning") enabled -Wvla warning. The use of variable length arrays is banned. Link: http://lkml.kernel.org/r/1542856462-18836-2-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04Documentation/process/coding-style.rst: don't use "extern" with function ↵Alexey Dobriyan
prototypes `extern' with function prototypes makes lines longer and creates more characters on the screen. Do not bug people with checkpatch.pl warnings for now as fallout can be devastating. Link: http://lkml.kernel.org/r/20181101134153.GA29267@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04proc/sysctl: fix return error for proc_doulongvec_minmax()Cheng Lin
If the number of input parameters is less than the total parameters, an EINVAL error will be returned. For example, we use proc_doulongvec_minmax to pass up to two parameters with kern_table: { .procname = "monitor_signals", .data = &monitor_sigs, .maxlen = 2*sizeof(unsigned long), .mode = 0644, .proc_handler = proc_doulongvec_minmax, }, Reproduce: When passing two parameters, it's work normal. But passing only one parameter, an error "Invalid argument"(EINVAL) is returned. [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals 1 2 [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals -bash: echo: write error: Invalid argument [root@cl150 ~]# echo $? 1 [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals 3 2 [root@cl150 ~]# The following is the result after apply this patch. No error is returned when the number of input parameters is less than the total parameters. [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals 1 2 [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals [root@cl150 ~]# echo $? 0 [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals 3 2 [root@cl150 ~]# There are three processing functions dealing with digital parameters, __do_proc_dointvec/__do_proc_douintvec/__do_proc_doulongvec_minmax. This patch deals with __do_proc_doulongvec_minmax, just as __do_proc_dointvec does, adding a check for parameters 'left'. In __do_proc_douintvec, its code implementation explicitly does not support multiple inputs. static int __do_proc_douintvec(...){ ... /* * Arrays are not supported, keep this simple. *Do not* add * support for them. */ if (vleft != 1) { *lenp = 0; return -EINVAL; } ... } So, just __do_proc_doulongvec_minmax has the problem. And most use of proc_doulongvec_minmax/proc_doulongvec_ms_jiffies_minmax just have one parameter. Link: http://lkml.kernel.org/r/1544081775-15720-1-git-send-email-cheng.lin130@zte.com.cn Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/proc/base.c: slightly faster /proc/*/limitsAlexey Dobriyan
Header of /proc/*/limits is a fixed string, so print it directly without formatting specifiers. Link: http://lkml.kernel.org/r/20181203164242.GB6904@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/proc/inode.c: delete unnecessary variable in proc_alloc_inode()Alexey Dobriyan
Link: http://lkml.kernel.org/r/20181203164015.GA6904@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/proc/util.c: include fs/proc/internal.h for name_to_int()Eric Biggers
name_to_int() is defined in fs/proc/util.c and declared in fs/proc/internal.h, but the declaration isn't included at the point of the definition. Include the header to enforce that the definition matches the declaration. This addresses a gcc warning when -Wmissing-prototypes is enabled. Link: http://lkml.kernel.org/r/20181115001833.49371-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fs/proc/base.c: use ns_capable instead of capable for timerslack_nsBenjamin Gordon
Access to timerslack_ns is controlled by a process having CAP_SYS_NICE in its effective capability set, but the current check looks in the root namespace instead of the process' user namespace. Since a process is allowed to do other activities controlled by CAP_SYS_NICE inside a namespace, it should also be able to adjust timerslack_ns. Link: http://lkml.kernel.org/r/20181030180012.232896-1-bmgordon@google.com Signed-off-by: Benjamin Gordon <bmgordon@google.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: John Stultz <john.stultz@linaro.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Oren Laadan <orenl@cellrox.com> Cc: Ruchi Kandoi <kandoiruchi@google.com> Cc: Rom Lemarchand <romlem@android.com> Cc: Todd Kjos <tkjos@google.com> Cc: Colin Cross <ccross@android.com> Cc: Nick Kralevich <nnk@google.com> Cc: Dmitry Shmidt <dimitrysh@google.com> Cc: Elliott Hughes <enh@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04net: dsa: mt7530: Drop unused GPIO includeLinus Walleij
This driver uses GPIO descriptors only, <linux/of_gpio.h> is not used so drop the include. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04Merge branch 'GUE-error-recursion'David S. Miller
Stefano Brivio says: ==================== Fix two further potential unbounded recursions in GUE error handlers Patch 1/2 takes care of preventing the issue fixed by commit 11789039da53 ("fou: Prevent unbounded recursion in GUE error handler") also with UDP-Lite payloads -- I just realised this might happen from a syzbot report. Patch 2/2 fixes the issue for both UDP and UDP-Lite on IPv6, which I also forgot to deal with in that same commit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04fou6: Prevent unbounded recursion in GUE error handlerStefano Brivio
I forgot to deal with IPv6 in commit 11789039da53 ("fou: Prevent unbounded recursion in GUE error handler"). Now syzbot reported what might be the same type of issue, caused by gue6_err(), that is, handling exceptions for direct UDP encapsulation in GUE (UDP-in-UDP) leads to unbounded recursion in the GUE exception handler. As it probably doesn't make sense to set up GUE this way, and it's currently not even possible to configure this, skip exception handling for UDP (or UDP-Lite) packets encapsulated in UDP (or UDP-Lite) packets with GUE on IPv6. Reported-by: syzbot+4ad25edc7a33e4ab91e0@syzkaller.appspotmail.com Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: b8a51b38e4d4 ("fou, fou6: ICMP error handlers for FoU and GUE") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04fou: Prevent unbounded recursion in GUE error handler also with UDP-LiteStefano Brivio
In commit 11789039da53 ("fou: Prevent unbounded recursion in GUE error handler"), I didn't take care of the case where UDP-Lite is encapsulated into UDP or UDP-Lite with GUE. From a syzbot report about a possibly similar issue with GUE on IPv6, I just realised the same thing might happen with a UDP-Lite inner payload. Also skip exception handling for inner UDP-Lite protocol. Fixes: 11789039da53 ("fou: Prevent unbounded recursion in GUE error handler") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04openvswitch: Fix IPv6 later frags parsingYi-Hung Wei
The previous commit fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags") introduces IP protocol number parsing for IPv6 later frags that can mess up the network header length calculation logic, i.e. nh_len < 0. However, the network header length calculation is mainly for deriving the transport layer header in the key extraction process which the later fragment does not apply. Therefore, this commit skips the network header length calculation to fix the issue. Reported-by: Chris Mi <chrism@mellanox.com> Reported-by: Greg Rose <gvrose8192@gmail.com> Fixes: fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags") Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04net: macb: remove unnecessary codeClaudiu Beznea
Commit 653e92a9175e ("net: macb: add support for padding and fcs computation") introduced a bug fixed by commit 899ecaedd155 ("net: ethernet: cadence: fix socket buffer corruption problem"). Code removed in this patch is not reachable at all so remove it. Fixes: 653e92a9175e ("net: macb: add support for padding and fcs computation") Cc: Tristram Ha <Tristram.Ha@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04net: dsa: microchip: Drop unused GPIO includesLinus Walleij
This driver does not use the old GPIO includes so drop them. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04Merge branch 'qed-fixes'David S. Miller
Denis Bolotin says: ==================== qed: Misc fixes in qed This patch series fixes 2 potential bugs in qed. Please consider applying to net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04qed: Fix qed_ll2_post_rx_buffer_notify_fw() by adding a write memory barrierDenis Bolotin
Make sure chain element is updated before ringing the doorbell. Signed-off-by: Denis Bolotin <dbolotin@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04qed: Fix qed_chain_set_prod() for PBL chains with non power of 2 page countDenis Bolotin
In PBL chains with non power of 2 page count, the producer is not at the beginning of the chain when index is 0 after a wrap. Therefore, after the producer index wrap around, page index should be calculated more carefully. Signed-off-by: Denis Bolotin <dbolotin@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04make 'user_access_begin()' do 'access_ok()'Linus Torvalds
Originally, the rule used to be that you'd have to do access_ok() separately, and then user_access_begin() before actually doing the direct (optimized) user access. But experience has shown that people then decide not to do access_ok() at all, and instead rely on it being implied by other operations or similar. Which makes it very hard to verify that the access has actually been range-checked. If you use the unsafe direct user accesses, hardware features (either SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged Access Never - on ARM) do force you to use user_access_begin(). But nothing really forces the range check. By putting the range check into user_access_begin(), we actually force people to do the right thing (tm), and the range check vill be visible near the actual accesses. We have way too long a history of people trying to avoid them. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04net, skbuff: do not prefer skb allocation fails earlyDavid Rientjes
Commit dcda9b04713c ("mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic") replaced __GFP_REPEAT in alloc_skb_with_frags() with __GFP_RETRY_MAYFAIL when the allocation may directly reclaim. The previous behavior would require reclaim up to 1 << order pages for skb aligned header_len of order > PAGE_ALLOC_COSTLY_ORDER before failing, otherwise the allocations in alloc_skb() would loop in the page allocator looking for memory. __GFP_RETRY_MAYFAIL makes both allocations failable under memory pressure, including for the HEAD allocation. This can cause, among many other things, write() to fail with ENOTCONN during RPC when under memory pressure. These allocations should succeed as they did previous to dcda9b04713c even if it requires calling the oom killer and additional looping in the page allocator to find memory. There is no way to specify the previous behavior of __GFP_REPEAT, but it's unlikely to be necessary since the previous behavior only guaranteed that 1 << order pages would be reclaimed before failing for order > PAGE_ALLOC_COSTLY_ORDER. That reclaim is not guaranteed to be contiguous memory, so repeating for such large orders is usually not beneficial. Removing the setting of __GFP_RETRY_MAYFAIL to restore the previous behavior, specifically not allowing alloc_skb() to fail for small orders and oom kill if necessary rather than allowing RPCs to fail. Fixes: dcda9b04713c ("mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic") Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04soc/fsl/qe: fix err handling of ucc_of_parse_tdmWen Yang
Currently there are some issues with the ucc_of_parse_tdm function: 1, a possible null pointer dereference in ucc_of_parse_tdm, detected by the semantic patch deref_null.cocci, with the following warning: drivers/soc/fsl/qe/qe_tdm.c:177:21-24: ERROR: pdev is NULL but dereferenced. 2, dev gets modified, so in any case that devm_iounmap() will fail even when the new pdev is valid, because the iomap was done with a different pdev. 3, there is no driver bind with the "fsl,t1040-qe-si" or "fsl,t1040-qe-siram" device. So allocating resources using devm_*() with these devices won't provide a cleanup path for these resources when the caller fails. This patch fixes them. Suggested-by: Li Yang <leoyang.li@nxp.com> Suggested-by: Christophe LEROY <christophe.leroy@c-s.fr> Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Reviewed-by: Peng Hao <peng.hao2@zte.com.cn> CC: Julia Lawall <julia.lawall@lip6.fr> CC: Zhao Qiang <qiang.zhao@nxp.com> CC: David S. Miller <davem@davemloft.net> CC: netdev@vger.kernel.org CC: linuxppc-dev@lists.ozlabs.org CC: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-04r8169: Add support for new Realtek EthernetKai-Heng Feng
There are two new Realtek Ethernet devices which are re-branded r8168h. Add the IDs to to support them. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>