summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-06-04swait: strengthen language to discourage useLinus Torvalds
We already earlier discouraged people from using this interface in commit 88796e7e5c45 ("sched/swait: Document it clearly that the swait facilities are special and shouldn't be used"), but I just got a pull request with a new broken user. So make the comment *really* clear. The swait interfaces are bad, and should not be used unless you have some *very* strong reasons that include tons of hard performance numbers on just why you want to use them, and you show that you actually understand that they aren't at all like the normal wait/wakeup interfaces. So far, every single user has been suspect. The main user is KVM, which is completely pointless (there is only ever one waiter, which avoids the interface subtleties, but also means that having a queue instead of a pointer is counter-productive and certainly not an "optimization"). So make the comments much stronger. Not that anybody likely reads them anyway, but there's always some slight hope that it will cause somebody to think twice. I'd like to remove this interface entirely, but there is the theoretical possibility that it's actually the right thing to use in some situation, most likely some deep RT use. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-04rbd: flush rbd_dev->watch_dwork after watch is unregisteredDongsheng Yang
There is a problem if we are going to unmap a rbd device and the watch_dwork is going to queue delayed work for watch: unmap Thread watch Thread timer do_rbd_remove cancel_tasks_sync(rbd_dev) queue_delayed_work for watch destroy_workqueue(rbd_dev->task_wq) drain_workqueue(wq) destroy other resources in wq call_timer_fn __queue_work() Then the delayed work escape the cancel_tasks_sync() and destroy_workqueue() and we will get an user-after-free call trace: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI Modules linked in: CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 4.17.0-rc6+ #13 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__queue_work+0x6a/0x3b0 RSP: 0018:ffff9427df1c3e90 EFLAGS: 00010086 RAX: ffff9427deca8400 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff9427deca8400 RSI: ffff9427df1c3e50 RDI: 0000000000000000 RBP: ffff942783e39e00 R08: ffff9427deca8400 R09: ffff9427df1c3f00 R10: 0000000000000004 R11: 0000000000000005 R12: ffff9427cfb85970 R13: 0000000000002000 R14: 000000000001eca0 R15: 0000000000000007 FS: 0000000000000000(0000) GS:ffff9427df1c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000004c900a005 CR4: 00000000000206e0 Call Trace: <IRQ> ? __queue_work+0x3b0/0x3b0 call_timer_fn+0x2d/0x130 run_timer_softirq+0x16e/0x430 ? tick_sched_timer+0x37/0x70 __do_softirq+0xd2/0x280 irq_exit+0xd5/0xe0 smp_apic_timer_interrupt+0x6c/0x130 apic_timer_interrupt+0xf/0x20 [ Move rbd_dev->watch_dwork cancellation so that rbd_reregister_watch() either bails out early because the watch is UNREGISTERED at that point or just gets cancelled. ] Cc: stable@vger.kernel.org Fixes: 99d1694310df ("rbd: retry watch re-registration periodically") Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: update description of some mount optionsChengguang Xu
Based on code, default value of rsize/wsize is 16 MB. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: show ino32 if the value is different with defaultChengguang Xu
In current ceph_show_options(), there is no item for showing 'ino32', so add showing mount option 'ino32' if the value is different with default. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: strengthen rsize/wsize/readdir_max_bytes validationChengguang Xu
The check (intval < PAGE_SIZE) will involve type cast, so even when specifying negative value to rsize/wsize/readdir_max_bytes, it will pass the validation check successfully. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: fix alignment of rasizeChengguang Xu
On currently logic: when I specify rasize=0~1 then it will be 4096. when I specify rasize=2~4097 then it will be 8192. Make it the same as rsize & wsize. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: fix use-after-free in ceph_statfs()Luis Henriques
KASAN found an UAF in ceph_statfs. This was a one-off bug but looking at the code it looks like the monmap access needs to be protected as it can be modified while we're accessing it. Fix this by protecting the access with the monc->mutex. BUG: KASAN: use-after-free in ceph_statfs+0x21d/0x2c0 Read of size 8 at addr ffff88006844f2e0 by task trinity-c5/304 CPU: 0 PID: 304 Comm: trinity-c5 Not tainted 4.17.0-rc6+ #172 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xa5/0x11b ? show_regs_print_info+0x5/0x5 ? kmsg_dump_rewind+0x118/0x118 ? ceph_statfs+0x21d/0x2c0 print_address_description+0x73/0x2b0 ? ceph_statfs+0x21d/0x2c0 kasan_report+0x243/0x360 ceph_statfs+0x21d/0x2c0 ? ceph_umount_begin+0x80/0x80 ? kmem_cache_alloc+0xdf/0x1a0 statfs_by_dentry+0x79/0xb0 vfs_statfs+0x28/0x110 user_statfs+0x8c/0xe0 ? vfs_statfs+0x110/0x110 ? __fdget_raw+0x10/0x10 __se_sys_statfs+0x5d/0xa0 ? user_statfs+0xe0/0xe0 ? mutex_unlock+0x1d/0x40 ? __x64_sys_statfs+0x20/0x30 do_syscall_64+0xee/0x290 ? syscall_return_slowpath+0x1c0/0x1c0 ? page_fault+0x1e/0x30 ? syscall_return_slowpath+0x13c/0x1c0 ? prepare_exit_to_usermode+0xdb/0x140 ? syscall_trace_enter+0x330/0x330 ? __put_user_4+0x1c/0x30 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Allocated by task 130: __kmalloc+0x124/0x210 ceph_monmap_decode+0x1c1/0x400 dispatch+0x113/0xd20 ceph_con_workfn+0xa7e/0x44e0 process_one_work+0x5f0/0xa30 worker_thread+0x184/0xa70 kthread+0x1a0/0x1c0 ret_from_fork+0x35/0x40 Freed by task 130: kfree+0xb8/0x210 dispatch+0x15a/0xd20 ceph_con_workfn+0xa7e/0x44e0 process_one_work+0x5f0/0xa30 worker_thread+0x184/0xa70 kthread+0x1a0/0x1c0 ret_from_fork+0x35/0x40 Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: prevent i_version from going backYan, Zheng
inode info from non-auth can be stale. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: fix wrong check for the case of updating link countYan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04libceph: allocate the locator string with GFP_NOFAILIlya Dryomov
calc_target() isn't supposed to fail with anything but POOL_DNE, in which case we report that the pool doesn't exist and fail the request with -ENOENT. Doing this for -ENOMEM is at the very least confusing and also harmful -- as the preceding requests complete, a short-lived locator string allocation is likely to succeed after a wait. (We used to call ceph_object_locator_to_pg() for a pi lookup. In theory that could fail with -ENOENT, hence the "ret != -ENOENT" warning being removed.) Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04libceph: make abort_on_full a per-osdc settingIlya Dryomov
The intent behind making it a per-request setting was that it would be set for writes, but not for reads. As it is, the flag is set for all fs/ceph requests except for pool perm check stat request (technically a read). ceph_osdc_abort_on_full() skips reads since the previous commit and I don't see a use case for marking individual requests. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
2018-06-04libceph: don't abort reads in ceph_osdc_abort_on_full()Ilya Dryomov
Don't consider reads for aborting and use ->base_oloc instead of ->target_oloc, as done in __submit_request(). Strictly speaking, we shouldn't be aborting FULL_TRY/FULL_FORCE writes either. But, there is an inconsistency in FULL_TRY/FULL_FORCE handling on the OSD side [1], so given that neither of these is used in the kernel client, leave it for when the OSD behaviour is sorted out. [1] http://tracker.ceph.com/issues/24339 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
2018-06-04libceph: avoid a use-after-free during map checkIlya Dryomov
Sending map check after complete_request() was called is not only useless, but can lead to a use-after-free as req->r_kref decrement in __complete_request() races with map check code. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
2018-06-04libceph: don't warn if req->r_abort_on_full is setIlya Dryomov
The "FULL or reached pool quota" warning is there to explain paused requests. No need to emit it if pausing isn't going to occur. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
2018-06-04libceph: use for_each_request() in ceph_osdc_abort_on_full()Ilya Dryomov
Scanning the trees just to see if there is anything to abort is unnecessary -- all that is needed here is to update the epoch barrier first, before we start aborting. Simplify and do the update inside the loop before calling abort_request() for the first time. The switch to for_each_request() also fixes a bug: homeless requests weren't even considered for aborting. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
2018-06-04libceph: defer __complete_request() to a workqueueIlya Dryomov
In the common case, req->r_callback is called by handle_reply() on the ceph-msgr worker thread without any locks. If handle_reply() fails, it is called with both osd->lock and osdc->lock. In the map check case, it is called with just osdc->lock but held for write. Finally, if the request is aborted because of -ENOSPC or by ceph_osdc_abort_requests(), it is called directly on the submitter's thread, again with both locks. req->r_callback on the submitter's thread is relatively new (introduced in 4.12) and ripe for deadlocks -- e.g. writeback worker thread waiting on itself: inode_wait_for_writeback+0x26/0x40 evict+0xb5/0x1a0 iput+0x1d2/0x220 ceph_put_wrbuffer_cap_refs+0xe0/0x2c0 [ceph] writepages_finish+0x2d3/0x410 [ceph] __complete_request+0x26/0x60 [libceph] complete_request+0x2e/0x70 [libceph] __submit_request+0x256/0x330 [libceph] submit_request+0x2b/0x30 [libceph] ceph_osdc_start_request+0x25/0x40 [libceph] ceph_writepages_start+0xdfe/0x1320 [ceph] do_writepages+0x1f/0x70 __writeback_single_inode+0x45/0x330 writeback_sb_inodes+0x26a/0x600 __writeback_inodes_wb+0x92/0xc0 wb_writeback+0x274/0x330 wb_workfn+0x2d5/0x3b0 Defer __complete_request() to a workqueue in all failure cases so it's never on the same thread as ceph_osdc_start_request() and always called with no locks held. Link: http://tracker.ceph.com/issues/23978 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
2018-06-04libceph: move more code into __complete_request()Ilya Dryomov
Move req->r_completion wake up and req->r_kref decrement into __complete_request(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
2018-06-04libceph: no need to call flush_workqueue() before destructionIlya Dryomov
destroy_workqueue() drains the workqueue before proceeding with destruction. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: flush pending works before shutdown superYan, Zheng
Pending works hold inode references, which cause "Busy inodes after unmount" warning. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: abort osd requests on force umountYan, Zheng
This avoid force umount waiting on page writeback: io_schedule+0xd/0x30 wait_on_page_bit_common+0xc6/0x130 __filemap_fdatawait_range+0xbd/0x100 filemap_fdatawait_keep_errors+0x15/0x40 sync_inodes_sb+0x1cf/0x240 sync_filesystem+0x52/0x90 generic_shutdown_super+0x1d/0x110 ceph_kill_sb+0x28/0x80 [ceph] deactivate_locked_super+0x35/0x60 cleanup_mnt+0x36/0x70 task_work_run+0x79/0xa0 exit_to_usermode_loop+0x62/0x70 do_syscall_64+0xdb/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 0xffffffffffffffff Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04libceph: introduce ceph_osdc_abort_requests()Ilya Dryomov
This will be used by the filesystem for "umount -f". Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: fix st_nlink stat for directoriesLuis Henriques
Currently, calling stat on a cephfs directory returns 1 for st_nlink. This behaviour has recently changed in the fuse client, as some applications seem to expect this value to be either 0 (if it's unlinked) or 2 + number of subdirectories. This behaviour was changed in the fuse client with commit 67c7e4619188 ("client: use common interp of st_nlink for dirs"). This patch modifies the kernel client to have a similar behaviour. Link: https://tracker.ceph.com/issues/23873 Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: support file lock on directoryYan, Zheng
Link: http://tracker.ceph.com/issues/24028 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: show wsize only if non-defaultIlya Dryomov
This is how it was before commit 95cca2b44e54 ("ceph: limit osd write size") went in. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: handle the new nfiles/nsubdirs fields in cap messageYan, Zheng
Without these new fields, stale st_size is returned in following case. 1. MDS modifies a directory 2. MDS issues CEPH_CAP_ANY_SHARED to client 3. The client satifies stat(2) by its cached metadata. set st_size to "i_files + i_subdirs". Link: http://tracker.ceph.com/issues/23855 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: define argument structure for handle_cap_grantYan, Zheng
The data structure includes the versioned feilds of cap message. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: update i_files/i_subdirs only when Fs cap is issuedYan, Zheng
In MDS, file/subdir counts of a directory inode are protected by filelock. In request reply without Fs cap, nfiles/nsubdirs can be stale. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: always get rstat from auth mdsYan, Zheng
rstat is not tracked by capability. client can't know if rstat from non-auth mds is uptodate or not. Link: http://tracker.ceph.com/issues/23538 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04ceph: use bit flags to define vxattr attributesYan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04libceph: use MSG_TRUNC for discarding received bytesIlya Dryomov
Avoid a copy into the "skip buffer". Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04libceph: get rid of more_kvec in try_write()Ilya Dryomov
All gotos to "more" are conditioned on con->state == OPEN, but the only thing "more" does is opening the socket if con->state == PREOPEN. Kill that label and rename "more_kvec" to "more". Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-06-04libceph, rbd: add error handling for osd_req_op_cls_init()Chengguang Xu
Add proper error handling for osd_req_op_cls_init() to replace BUG_ON statement when failing from memory allocation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-06-04Merge tag 'regmap-v4.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap updates from Mark Brown: "This is another quiet release for regmap, there's one minor feature improvement for the recently added slimbus support and a few minor fixes and cleanups" * tag 'regmap-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: slimbus: allow register offsets up to 16 bits regmap: add missing prototype for devm_init_slimbus regmap: Skip clk_put for attached clocks when freeing context regmap: include <linux/ktime.h> from include/linux/regmap.h
2018-06-04Merge tag 'spi-v4.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi updates from Mark Brown: "Quite a busy release for SPI, mainly as a result of Boris Brezillon's work on improving the integration with MTD for accelerated SPI flash controllers. He's added a new spi_mem interface which works a lot better with general hardware and converted the users over to it, as a result of this work we've got some MTD changes in here as well. Other highlights include: - Lots of spring cleaning for the s3c64xx driver. - Removal of the bcm53xx, the hardware is also supported by the mspi driver but SoC naming had caused people to miss the duplication. - Conversion of the pxa2xx driver to use the standard message processing loop rather than open coding. - A bunch of improvements to the runtime PM of the OMAP McSPI driver" * tag 'spi-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (47 commits) spi: Fix typo on SPI_MEM help text spi: sh-msiof: Fix setting SIRMDR1.SYNCAC to match SITMDR1.SYNCAC mtd: devices: m25p80: Use spi_mem_set_drvdata() instead of spi_set_drvdata() spi: omap2-mcspi: Remove unnecessary pm_runtime_force_suspend() spi: Add missing pm_runtime_put_noidle() after failed get spi: ti-qspi: Make sure res_mmap != NULL before dereferencing it spi: spi-s3c64xx: Fix system resume support spi: bcm-qspi: Fix build failure caused by spi_flash_read() API removal spi: Get rid of the spi_flash_read() API mtd: spi-nor: Use the spi_mem_xx() API spi: ti-qspi: Implement the spi_mem interface spi: bcm-qspi: Implement the spi_mem interface spi: Make support for regular transfers optional when ->mem_ops != NULL spi: Extend the core to ease integration of SPI memory controllers spi: remove forgotten CONFIG_SPI_BCM53XX spi: remove the older/duplicated bcm53xx driver spi: pxa2xx: check clk_prepare_enable() return value spi: lpspi: Switch to SPDX identifier spi: mxs: Switch to SPDX identifier spi: imx: Switch to SPDX identifier ...
2018-06-04Merge tag 'chrome-platform-for-linus-4.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform Pull chrome platform updates from Benson Leung: - further changes from Dmitry related to the removal of platform data from atmel_mxt_ts and chromeos_laptop. This time, we have some changes that teach chromeos_laptop how to supply acpi properties for some input devices so that the peripheral driver doesn't have to do dmi matching on some Chromebook platforms. - new Chromebook Tablet switch driver, which is useful for x86 convertible Chromebooks. - other misc cleanup * tag 'chrome-platform-for-linus-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: platform/chrome: Use to_cros_ec_dev more broadly platform/chrome: chromeos_laptop: fix touchpad button mapping on Celes platform: chrome: Add input dependency for tablet switch driver platform/chrome: chromeos_laptop - supply properties for ACPI devices platform/chrome: chromeos_tbmc - add SPDX identifier platform: chrome: Add Tablet Switch ACPI driver platform/chrome: cros_ec_lpc: do not try DMI match when ACPI device found
2018-06-04Merge tag 'hwmon-for-linus-v4.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon updates from Guenter Roeck: - asus_atk0110 driver modified to use new API - k10temp supports new CPUs and reports both Tctl and Tdie - minor fixes in gpio-fan, ltc2990, fschmd, and mc13783 drivers * tag 'hwmon-for-linus-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (asus_atk0110) Make use of device managed memory hwmon: (asus_atk0110) Replace deprecated device register call hwmon: (k10temp) Make function get_raw_temp static hwmon: (gpio-fan) Fix "#cooling-cells" property name in bindings MAINTAINERS: hwmon: Add Documentation/devicetree/bindings/hwmon hwmon: (ltc2990) support all measurement modes hwmon: (ltc2990) add devicetree binding hwmon: (ltc2990) Fix incorrect conversion of negative temperatures hwmon: (core) check parent dev != NULL when chip != NULL hwmon: (fschmd) fix typo 'can by' to 'can be' hwmon: (k10temp) Display both Tctl and Tdie hwmon: (k10temp) Add support for Stoney Ridge and Bristol Ridge CPUs hwmon: MC13783: Add uid and die temperature sensor inputs
2018-06-04fs: aio ioprio use ioprio_check_cap ret valAdam Manzanares
Previously the value was ignored. Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-06-04fs: aio ioprio add explicit block layer dependenceAdam Manzanares
Previously, the ioprio_check_cap function was only defined when CONFIG_BLOCK was set. Make this relationship explicit and add a stub for !CONFIG_BLOCK. Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-06-04blk-mq: return when hctx is stopped in blk_mq_run_work_fnJianchao Wang
If a hardware queue is stopped, it should not be run again before explicitly started. Ignore stopped queues in blk_mq_run_work_fn(), fixing a regression recently introduced when the START_ON_RUN bit was removed. Fixes: 15fe8a90bb45 ("blk-mq: remove blk_mq_delay_queue()") Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-04Merge tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - replace the force_dma flag with a dma_configure bus method. (Nipun Gupta, although one patch is іncorrectly attributed to me due to a git rebase bug) - use GFP_DMA32 more agressively in dma-direct. (Takashi Iwai) - remove PCI_DMA_BUS_IS_PHYS and rely on the dma-mapping API to do the right thing for bounce buffering. - move dma-debug initialization to common code, and apply a few cleanups to the dma-debug code. - cleanup the Kconfig mess around swiotlb selection - swiotlb comment fixup (Yisheng Xie) - a trivial swiotlb fix. (Dan Carpenter) - support swiotlb on RISC-V. (based on a patch from Palmer Dabbelt) - add a new generic dma-noncoherent dma_map_ops implementation and use it for arc, c6x and nds32. - improve scatterlist validity checking in dma-debug. (Robin Murphy) - add a struct device quirk to limit the dma-mask to 32-bit due to bridge/system issues, and switch x86 to use it instead of a local hack for VIA bridges. - handle devices without a dma_mask more gracefully in the dma-direct code. * tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mapping: (48 commits) dma-direct: don't crash on device without dma_mask nds32: use generic dma_noncoherent_ops nds32: implement the unmap_sg DMA operation nds32: consolidate DMA cache maintainance routines x86/pci-dma: switch the VIA 32-bit DMA quirk to use the struct device flag x86/pci-dma: remove the explicit nodac and allowdac option x86/pci-dma: remove the experimental forcesac boot option Documentation/x86: remove a stray reference to pci-nommu.c core, dma-direct: add a flag 32-bit dma limits dma-mapping: remove unused gfp_t parameter to arch_dma_alloc_attrs dma-debug: check scatterlist segments c6x: use generic dma_noncoherent_ops arc: use generic dma_noncoherent_ops arc: fix arc_dma_{map,unmap}_page arc: fix arc_dma_sync_sg_for_{cpu,device} arc: simplify arc_dma_sync_single_for_{cpu,device} dma-mapping: provide a generic dma-noncoherent implementation dma-mapping: simplify Kconfig dependencies riscv: add swiotlb support riscv: only enable ZONE_DMA32 for 64-bit ...
2018-06-04PCI: qcom: Include gpio/consumer.hArnd Bergmann
When CONFIG_GPIOLIB is disabled, we run into a build failure: drivers/pci/dwc/pcie-qcom.c: In function 'qcom_pcie_probe': drivers/pci/dwc/pcie-qcom.c:1223:16: error: implicit declaration of function 'devm_gpiod_get_optional'; did you mean 'devm_regulator_get_optional'? [-Werror=implicit-function-declaration] pcie->reset = devm_gpiod_get_optional(dev, "perst", GPIOD_OUT_LOW); Including gpio/consumer.h directly is the correct fix. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2018-06-04ixgbe: fix broken ipsec Rx with proper cast on spiShannon Nelson
Fix up a cast problem introduced by a sparse cleanup patch. This fixes a problem where the encrypted packets were not recognized on Rx and subsequently dropped. Fixes: 9cfbfa701b55 ("ixgbe: cleanup sparse warnings") Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04ixgbe: check ipsec ip addr against mgmt filtersShannon Nelson
Make sure we don't try to offload the decryption of an incoming packet that should get delivered to the management engine. This is a corner case that will likely be very seldom seen, but could really confuse someone if they were to hit it. Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04Merge branch 'mlxsw-Fixes-in-offloading-of-mirror-to-gretap'David S. Miller
Ido Schimmel says: ==================== mlxsw: Fixes in offloading of mirror-to-gretap Petr says: These two patches fix issues in offloading of mirror-to-gretap when bridge is present in the underlay. In patch #1, reconsideration of SPAN configuration is not done right at the point that SWITCHDEV_OBJ_ID_PORT_VLAN deletion notification is distributed, but is postponed, because the notifications are actually distributed before the relevant change is implemented in the bridge. In patch #2, a problem in configuring VLAN tagging in situations when a VLAN device is on top of an 802.1Q bridge whose egress port is marked as "egress untagged". In that case, mlxsw would neglect to suppress the tagging implicitly assumed after the VLAN device was seen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04mlxsw: spectrum_span: Suppress VLAN on BRIDGE_VLAN_INFO_UNTAGGEDPetr Machata
When offloading mirroring to gretap or ip6gretap netdevices, an 802.1q bridge is one of the soft devices permissible in the underlay when resolving the packet path. After the packet path is resolved to a particular bridge egress device, flags on packet VLAN determine whether the egressed packet should be tagged. The current logic however only ever sets the VLAN tag, never suppresses it. Thus if there's a VLAN netdevice above the bridge that determines the packet VLAN, that VLAN is never unset, and mirroring is configured with VLAN tagging. Fix by setting the packet VLAN on both branches: set to zero (for unset) when BRIDGE_VLAN_INFO_UNTAGGED, copy the resolved VLAN (e.g. from bridge PVID) otherwise. Fixes: 946a11e7408e ("mlxsw: spectrum_span: Allow bridge for gretap mirror") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04mlxsw: spectrum_switchdev: Postpone respin on object deletionPetr Machata
VLAN deletion notifications are emitted before the relevant change is projected to bridge configuration. Thus, like with VLAN addition, schedule SPAN respin for later. Fixes: c520bc698647 ("mlxsw: Respin SPAN on switchdev events") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04ixgbe: fix possible race in reset subtaskTony Nguyen
Similar to ixgbevf, the same possibility for race exists. Extend the RTNL lock in ixgbe_reset_subtask() to protect the state bits; this is to make sure that we get the most up-to-date values for the bits and avoid a possible race when going down. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04bpf, i40e: add meta data supportDaniel Borkmann
Add support for XDP meta data when using build skb variant of the i40e driver. Implementation is analogous to the existing ixgbe and ixgbevf support for meta data from 366a88fe2f40 ("bpf, ixgbe: add meta data support") and be8333322eff ("ixgbevf: Add support for meta data"). With the build skb variant we get 192 bytes of extra headroom which can be used for encaps or meta data. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-04ipv6: omit traffic class when calculating flow hashMichal Kubecek
Some of the code paths calculating flow hash for IPv6 use flowlabel member of struct flowi6 which, despite its name, encodes both flow label and traffic class. If traffic class changes within a TCP connection (as e.g. ssh does), ECMP route can switch between path. It's also inconsistent with other code paths where ip6_flowlabel() (returning only flow label) is used to feed the key. Use only flow label everywhere, including one place where hash key is set using ip6_flowinfo(). Fixes: 51ebd3181572 ("ipv6: add support of equal cost multipath (ECMP)") Fixes: f70ea018da06 ("net: Add functions to get skb->hash based on flow structures") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04ixgbe: introduce a helper to simplify codeYueHaibing
ixgbe_dbg_reg_ops_read and ixgbe_dbg_netdev_ops_read copy-pasting the same code except for ixgbe_dbg_netdev_ops_buf/ixgbe_dbg_reg_ops_buf, so introduce a helper ixgbe_dbg_common_ops_read to remove redundant code. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>