git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2019-11-18	watchdog: prevent deferral of watchdogd wakeup on RT	Julia Cartwright
	When PREEMPT_RT is enabled, all hrtimer expiry functions are deferred for execution into the context of ksoftirqd unless otherwise annotated. Deferring the expiry of the hrtimer used by the watchdog core, however, is a waste, as the callback does nothing but queue a kthread work item and wakeup watchdogd. It's worst then that, too: the deferral through ksoftirqd also means that for correct behavior a user must adjust the scheduling parameters of both watchdogd _and_ ksoftirqd, which is unnecessary and has other side effects (like causing unrelated expiry functions to execute at potentially elevated priority). Instead, mark the hrtimer used by the watchdog core as being _HARD to allow it's execution directly from hardirq context. The work done in this expiry function is well-bounded and minimal. A user still must adjust the scheduling parameters of the watchdogd to be correct w.r.t. their application needs. Link: https://lkml.kernel.org/r/0e02d8327aeca344096c246713033887bc490dd7.1538089180.git.julia@ni.com Cc: Guenter Roeck <linux@roeck-us.net> Reported-and-tested-by: Steffen Trumtrar <s.trumtrar@pengutronix.de> Reported-by: Tim Sander <tim@krieglstein.org> Signed-off-by: Julia Cartwright <julia@ni.com> Acked-by: Guenter Roeck <linux@roeck-us.net> [bigeasy: use only HRTIMER_MODE_REL_HARD] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191105144506.clyadjbvnn7b7b2m@linutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: imx7ulp: Use definitions instead of magic values	Fabio Estevam
	Use definitions instead of magic values in order to improve readability. Since the CLK field of the WDOG CS register is composed of two bits to select the watchdog clock source, use a shift representation instead of BIT(). Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191029174037.25381-5-festevam@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: imx7ulp: Remove inline annotations	Fabio Estevam
	Compiler is smart enough and knows when to inline, so there is no need to mark some of these functions as 'inline'. Remove such annotations. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191029174037.25381-4-festevam@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: imx7ulp: Remove unused structure member	Fabio Estevam
	The 'notifier_block' structure member is unused, so just remove it. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191029174037.25381-3-festevam@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: imx7ulp: Pass the wdog instance inimx7ulp_wdt_enable()	Fabio Estevam
	It is more natural to pass the watchdog instance inside imx7ulp_wdt_enable() instead of the base address. This also has the benefit to reduce the code a bit. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191029174037.25381-2-festevam@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: wdat_wdt: Spelling s/configrable/configurable/	Geert Uytterhoeven
	Fix misspelling of "configurable". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191024152856.30788-1-geert+renesas@glider.be Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: bd70528: Trivial function documentation fix	Matti Vaittinen
	The function documentation for the exported ROHM BD70528 WDG control functions used old argument names. Fix the names. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191010060733.GA9979@localhost.localdomain Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: cadence: Do not show error in case of deferred probe	Michal Simek
	There is no reason to show error message if clocks are not ready yet. Signed-off-by: Michal Simek <michal.simek@xilinx.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/d3e295d5ba79f453b4aa4128c4fa63fbd6c16920.1570544944.git.michal.simek@xilinx.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: Fix the race between the release of watchdog_core_data and cdev	Kevin Hao
	The struct cdev is embedded in the struct watchdog_core_data. In the current code, we manage the watchdog_core_data with a kref, but the cdev is manged by a kobject. There is no any relationship between this kref and kobject. So it is possible that the watchdog_core_data is freed before the cdev is entirely released. We can easily get the following call trace with CONFIG_DEBUG_KOBJECT_RELEASE and CONFIG_DEBUG_OBJECTS_TIMERS enabled. ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38 WARNING: CPU: 23 PID: 1028 at lib/debugobjects.c:481 debug_print_object+0xb0/0xf0 Modules linked in: softdog(-) deflate ctr twofish_generic twofish_common camellia_generic serpent_generic blowfish_generic blowfish_common cast5_generic cast_common cmac xcbc af_key sch_fq_codel openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 CPU: 23 PID: 1028 Comm: modprobe Not tainted 5.3.0-next-20190924-yoctodev-standard+ #180 Hardware name: Marvell OcteonTX CN96XX board (DT) pstate: 00400009 (nzcv daif +PAN -UAO) pc : debug_print_object+0xb0/0xf0 lr : debug_print_object+0xb0/0xf0 sp : ffff80001cbcfc70 x29: ffff80001cbcfc70 x28: ffff800010ea2128 x27: ffff800010bad000 x26: 0000000000000000 x25: ffff80001103c640 x24: ffff80001107b268 x23: ffff800010bad9e8 x22: ffff800010ea2128 x21: ffff000bc2c62af8 x20: ffff80001103c600 x19: ffff800010e867d8 x18: 0000000000000060 x17: 0000000000000000 x16: 0000000000000000 x15: ffff000bd7240470 x14: 6e6968207473696c x13: 5f72656d6974203a x12: 6570797420746365 x11: 6a626f2029302065 x10: 7461747320657669 x9 : 7463612820657669 x8 : 3378302f3078302b x7 : 0000000000001d7a x6 : ffff800010fd5889 x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 x2 : ffff000bff948548 x1 : 276a1c9e1edc2300 x0 : 0000000000000000 Call trace: debug_print_object+0xb0/0xf0 debug_check_no_obj_freed+0x1e8/0x210 kfree+0x1b8/0x368 watchdog_cdev_unregister+0x88/0xc8 watchdog_dev_unregister+0x38/0x48 watchdog_unregister_device+0xa8/0x100 softdog_exit+0x18/0xfec4 [softdog] __arm64_sys_delete_module+0x174/0x200 el0_svc_handler+0xd0/0x1c8 el0_svc+0x8/0xc This is a common issue when using cdev embedded in a struct. Fortunately, we already have a mechanism to solve this kind of issue. Please see commit 233ed09d7fda ("chardev: add helper function to register char devs with a struct device") for more detail. In this patch, we choose to embed the struct device into the watchdog_core_data, and use the API provided by the commit 233ed09d7fda to make sure that the release of watchdog_core_data and cdev are in sequence. Signed-off-by: Kevin Hao <haokexin@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191008112934.29669-1-haokexin@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: sbc7240_wdt: Fix yet another -Wimplicit-fallthrough warning	Borislav Petkov
	... by moving the fall through comment outside of the code block so that gcc sees it. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Wim Van Sebroeck <wim@linux-watchdog.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-watchdog@vger.kernel.org Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20190929114649.22740-1-bp@alien8.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: intel-mid_wdt: Add WATCHDOG_NOWAYOUT support	Andy Shevchenko
	Normally, the watchdog is disabled when /dev/watchdog is closed, but if CONFIG_WATCHDOG_NOWAYOUT is defined, then it means that the watchdog should remain enabled. So we should keep it enabled if CONFIG_WATCHDOG_NOWAYOUT is defined. Reported-by: Razvan Becheriu <razvan.becheriu@qualitance.com> Cc: Ferry Toth <fntoth@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Razvan Becheriu <razvan.becheriu@gmail.com> Link: https://lore.kernel.org/r/20190924143116.69823-1-andriy.shevchenko@linux.intel.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: imx2_wdt: Use 'dev' instead of dereferencing it repeatedly	Anson Huang
	Add helper variable dev = &pdev->dev to simply the code. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/1569308828-8320-3-git-send-email-Anson.Huang@nxp.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: imx2_wdt: Use __maybe_unused instead of #if CONFIG_PM_SLEEP	Anson Huang
	Use __maybe_unused for power management related functions instead of #if CONFIG_PM_SLEEP to simply the code. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/1569308828-8320-1-git-send-email-Anson.Huang@nxp.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: imx2_wdt: Remove unnecessary blank line	Anson Huang
	Remove unnecessary blank line. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/1569308828-8320-1-git-send-email-Anson.Huang@nxp.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	watchdog: w83627hf_wdt: Support NCT6116D	Srikanth Krishnakar
	The watchdog controller on NCT6116D is compatible with NCT6102D. Extend the support to enable SuperIO based NCT6116D watchdog device. Signed-off-by: Srikanth Krishnakar <Srikanth_Krishnakar@mentor.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20190918160458.10108-1-Srikanth_Krishnakar@mentor.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18	spi: mediatek: add SPI_CS_HIGH support	Luhua Xu
	Change to use SPI_CS_HIGH to support spi CS polarity setting for chips support enhance_timing. Signed-off-by: Luhua Xu <luhua.xu@mediatek.com> Link: https://lore.kernel.org/r/1574053037-26721-2-git-send-email-luhua.xu@mediatek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-11-18	drm/i915/userptr: Try to acquire the page lock around set_page_dirty()	Chris Wilson
	set_page_dirty says: For pages with a mapping this should be done under the page lock for the benefit of asynchronous memory errors who prefer a consistent dirty state. This rule can be broken in some special cases, but should be better not to. Under those rules, it is only safe for us to use the plain set_page_dirty calls for shmemfs/anonymous memory. Userptr may be used with real mappings and so needs to use the locked version (set_page_dirty_lock). However, following a try_to_unmap() we may want to remove the userptr and so call put_pages(). However, try_to_unmap() acquires the page lock and so we must avoid recursively locking the pages ourselves -- which means that we cannot safely acquire the lock around set_page_dirty(). Since we can't be sure of the lock, we have to risk skip dirtying the page, or else risk calling set_page_dirty() without a lock and so risk fs corruption. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012 Fixes: 5cc9ed4b9a7a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl") References: cb6d7c7dc7ff ("drm/i915/userptr: Acquire the page lock around set_page_dirty()") References: 505a8ec7e11a ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"") References: 6dcc693bc57f ("ext4: warn when page is dirtied without buffers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk (cherry picked from commit 0d4bbe3d407f79438dc4f87943db21f7134cfc65) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit cee7fb437edcdb2f9f8affa959e274997f5dca4d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18	drm/i915/pmu: "Frequency" is reported as accumulated cycles	Chris Wilson
	We report "frequencies" (actual-frequency, requested-frequency) as the number of accumulated cycles so that the average frequency over that period may be determined by the user. This means the units we report to the user are Mcycles (or just M), not MHz. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk (cherry picked from commit e88866ef02851c88fe95a4bb97820b94b4d46f36) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit a7d87b70d6da96c6772e50728c8b4e78e4cbfd55) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18	drm/i915: Preload LUTs if the hw isn't currently using them	Ville Syrjälä
	The LUTs are single buffered so in order to program them without tearing we'd have to do it during vblank (actually to be 100% effective it has to happen between start of vblank and frame start). We have no proper mechanism for that at the moment so we just defer loading them after the vblank waits have happened. That is not quite sufficient (especially when committing multiple pipes whose vblanks don't line up) so the LUT load will often leak into the following frame causing tearing. However in case the hardware wasn't previously using the LUT we can preload it before setting the enable bit (which is double buffered so won't tear). Let's determine if we can do such preloading and make it happen. Slight variation between the hardware requires some platforms specifics in the checks. Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win and Asus T100HA) when the gamma LUT gets loaded for the first time as the BIOS has left some junk in the LUT memory. v2: Deal with uapi vs. hw crtc state split s/GCM/CGM/ typo fix Cc: Hans de Goede <hdegoede@redhat.com> Fixes: 051a6d8d3ca0 ("drm/i915: Move LUT programming to happen after vblank waits") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com Tested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> (cherry picked from commit 0ccc42a2fd5107a7f58e62c8b35b61de9a70ce82) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit f77021372e2880237278e0ee57faadc077a8256a) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18	drm/i915: Don't oops in dumb_create ioctl if we have no crtcs	Ville Syrjälä
	Make sure we have a crtc before probing its primary plane's max stride. Initially I thought we can't get this far without crtcs, but looks like we can via the dumb_create ioctl. Not sure if we shouldn't disable dumb buffer support entirely when we have no crtcs, but that would require some amount of work as the only thing currently being checked is dev->driver->dumb_create which we'd have to convert to some device specific dynamic thing. Cc: stable@vger.kernel.org Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Fixes: aa5ca8b7421c ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit baea9ffe64200033499a4955f431e315bb807899) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit aeec766133f99d45aad60d650de50fb382104d95) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18	usb-storage: Disable UAS on JMicron SATA enclosure	Laura Abbott
	Steve Ellis reported incorrect block sizes and alignement offsets with a SATA enclosure. Adding a quirk to disable UAS fixes the problems. Reported-by: Steven Ellis <sellis@redhat.com> Cc: Pacho Ramos <pachoramos@gmail.com> Signed-off-by: Laura Abbott <labbott@fedoraproject.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	Revert "bcache: fix fifo index swapping condition in journal_pin_cmp()"	Jens Axboe
	Coly says: "Guoju Fang talked to me today, he told me this change was unnecessary and I was over-thought. Then I realize fifo_idx() uses a mask to handle the array index overflow condition, so the index swap in journal_pin_cmp() won't happen. And yes, Guoju and Kent are correct. Since you already applied this patch, can you please to remove this patch from your for-next branch? This single patch does not break thing, but it is unecessary at this moment." This reverts commit c0e0954e909c17b43d176ab219fc598964616ae6. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-18	scsi: sd_zbc: Remove set but not used variable 'buflen'	YueHaibing
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/sd_zbc.c: In function 'sd_zbc_check_zones': drivers/scsi/sd_zbc.c:341:9: warning: variable 'buflen' set but not used [-Wunused-but-set-variable] It is not used since commit d9dd73087a8b ("block: Enhance blk_revalidate_disk_zones()") Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-18	dm thin: wakeup worker only when deferred bios exist	Jeffle Xu
	Single thread fio test (read, bs=4k, ioengine=libaio, iodepth=128, numjobs=1) over dm-thin device has poor performance versus bare nvme device. Further investigation with perf indicates that queue_work_on() consumes over 20% CPU time when doing IO over dm-thin device. The call stack is as follows. - 40.57% thin_map + 22.07% queue_work_on + 9.95% dm_thin_find_block + 2.80% cell_defer_no_holder 1.91% inc_all_io_entry.isra.33.part.34 + 1.78% bio_detain.isra.35 In cell_defer_no_holder(), wakeup_worker() is always called, no matter whether the tc->deferred_bio_list list is empty or not. In single thread IO model, this list is most likely empty. So skip waking up worker thread if tc->deferred_bio_list list is empty. Single thread IO performance improves from 448 MiB/s to 646 MiB/s (+44%) once the needless wake_worker() calls are properly skipped. Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-18	drm/i915: fix accidental static variable use	Jani Nikula
	It's supposed to be just a const pointer. Fixes: 074c77e3ec63 ("drm/i915/tgl: Gen-12 display loses Yf tiling and legacy CCS support") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191115120440.17883-1-jani.nikula@intel.com (cherry picked from commit 48ea97fabe75c83adf4e6ff9262bbda229e6ee73) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915/guc: Skip suspend/resume GuC action on platforms w/o GuC submission	Don Hiatt
	On some platforms (e.g. KBL) that do not support GuC submission, but the user enabled the GuC communication (e.g for HuC authentication) calling the GuC EXIT_S_STATE action results in lose of ability to enter RC6. We can remove the GuC suspend/resume entirely as we do not need to save the GuC submission status. Add intel_guc_submission_is_enabled() function to determine if GuC submission is active. v2: Do not suspend/resume the GuC on platforms that do not support Guc Submission. v3: Fix typo, move suspend logic to remove goto. v4: Use intel_guc_submission_is_enabled() to check GuC submission status. v5: No need to look at engine to determine if submission is enabled. Squash fix + intel_guc_submission_is_enabled() patch into one. v6: Move resume check into intel_guc_resume() for symmetry. Fix commit Fixes tag. Reported-by: KiteStramuort <kitestramuort@autistici.org> Reported-by: S. Zharkoff <s.zharkoff@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111594 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111623 Fixes: ffd5ce22faa4 ("drm/i915/guc: Updates for GuC 32.0.3 firmware") Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceralo Spurio <daniele.ceraolospurio@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Tomas Janousek <tomi@nomi.cz> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191115231538.1249-1-don.hiatt@intel.com (cherry picked from commit 82e0c5bbd6eb1d274b5a3e519ff0ab91f1f8e537) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915/gt: Wait for new requests in intel_gt_retire_requests()	Chris Wilson
	Our callers fall into two categories, those passing timeout=0 who just want to flush request retirements and those passing a timeout that need to wait for submission completion (e.g. intel_gt_wait_for_idle()). Currently, we only wait for a snapshot of timelines at the start of the wait (but there was an expectation that new requests would cause timelines to appear at the end). However, our callers, such as intel_gt_wait_for_idle() before suspend, do require us to wait for the power management requests emitted by retirement as well. If we don't, then it takes an extra second or two for the background worker to flush the queue and mark the GT as idle. Fixes: 7e8057626640 ("drm/i915: Drop struct_mutex from around i915_retire_requests()") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191114225736.616885-1-chris@chris-wilson.co.uk (cherry picked from commit 7936a22dd4660d24b4ca0668c02b0372127cab44) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915: Restore GT coarse power gating workaround	Imre Deak
	The workaround to disable coarse power gating is still needed on SKL GT3/GT4 machines and since the RC6 context corruption was discovered by the hardware team also on all GEN9 machines. Restore applying the workaround. Fixes: c113236718e8 ("drm/i915: Extract GT render sleep (rc6) management") Testcase: igt/intel_gt_pm_late_selftests/live_rc6_ctx Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191114152621.7235-1-imre.deak@intel.com (cherry picked from commit 980f87a2edb3e7825949ebd0a7e63ab574c20816) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915/fbdev: Restore physical addresses for fb_mmap()	Chris Wilson
	fbdev uses the physical address of our framebuffer for its fb_mmap() routine. While we need to adapt this address for the new io BAR, we have to fix v5.4 first! The simplest fix is to restore the smem back to v5.3 and we will then probably have to implement our fbops->fb_mmap() callback to handle local memory. Reported-by: Neil MacLeod <freedesktop@nmacleod.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112256 Fixes: 5f889b9a61dd ("drm/i915: Disregard drm_mode_config.fb_base") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Tested-by: Neil MacLeod <freedesktop@nmacleod.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191113180633.3947-1-chris@chris-wilson.co.uk (cherry picked from commit abc5520704ab438099fe352636b30b05c1253bea) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915/perf: don't forget noa wait after oa config	Lionel Landwerlin
	I'm observing incoherence metric values, changing from run to run. It appears the patches introducing noa wait & reconfiguration from command stream switched places in the series multiple times during the review. This lead to the dependency of one onto the order to go missing... Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 15d0ace1f876 ("drm/i915/perf: execute OA configuration from command stream") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191113154639.27144-1-lionel.g.landwerlin@intel.com (cherry picked from commit 93937659dc644f708def8fa58cb63c5c9f499f26) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915: Avoid atomic context for error capture	Bruce Chang
	io_mapping_map_atomic/kmap_atomic are occasionally taken in error capture (if there is no aperture preallocated for the use of error capture), but the error capture and compression routines are now run in normal context: <3> [113.316247] BUG: sleeping function called from invalid context at mm/page_alloc.c:4653 <3> [113.318190] in_atomic(): 1, irqs_disabled(): 0, pid: 678, name: debugfs_test <4> [113.319900] no locks held by debugfs_test/678. <3> [113.321002] Preemption disabled at: <4> [113.321130] [<ffffffffa02506d4>] i915_error_object_create+0x494/0x610 [i915] <4> [113.327259] Call Trace: <4> [113.327871] dump_stack+0x67/0x9b <4> [113.328683] ___might_sleep+0x167/0x250 <4> [113.329618] __alloc_pages_nodemask+0x26b/0x1110 <4> [113.334614] pool_alloc.constprop.19+0x14/0x60 [i915] <4> [113.335951] compress_page+0x7c/0x100 [i915] <4> [113.337110] i915_error_object_create+0x4bd/0x610 [i915] <4> [113.338515] i915_capture_gpu_state+0x384/0x1680 [i915] However, it is not a good idea to run the slow compression inside atomic context, so we choose not to. Fixes: 895d8ebeaa924 ("drm/i915: error capture with no ggtt slot") Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191113231104.24208-1-yu.bruce.chang@intel.com (cherry picked from commit 48715f7001742e0d1cb20cffab1a0d75f5f7ad72) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915/display: Fix TRANS_DDI_MST_TRANSPORT_SELECT definition	José Roberto de Souza
	TRANS_DDI_MST_TRANSPORT_SELECT is 2 bits wide not 3, it was taking one bit from EDP/DSI Input Select. Fixes: b3545e086877 ("drm/i915/tgl: add support to one DP-MST stream") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191107214559.77087-1-jose.souza@intel.com (cherry picked from commit bb747fa5a9cbf561e5a30649f360feb9e6855645) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915: Fix detection for a CMP-V PCH	Imre Deak
	According to internal documents I found for CMP PCHs the PCI ID 0xA3C1 belongs to a CMP-V chipset. Based on the same docs the programming of the PCH is compatible with that of KBP. Fix up my previous wrong assumption accordingly using the SPT programming which in turn is the basis for KBP. The original bug reporter verified that this is the correct PCH identification (the only way we'll program valid DDC pin-pair values to the GMBUS register) and the Windows team uses the same identification (that is using the KBP programming model for this PCH). I filed the necessary Bspec update requests (BSpec/33734). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112051 Fixes: 37c92dc303dd ("drm/i915: Add new CNL PCH ID seen on a CML platform") Reported-and-tested-by: Cyrus <cyrus.lien@canonical.com> Cc: Cyrus <cyrus.lien@canonical.com> Cc: Timo Aaltonen <tjaalton@ubuntu.com> Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191112104608.24587-1-imre.deak@intel.com (cherry picked from commit 50a5065f4474c2dbc1f7462b45a32d33d7b48d88) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	drm/i915: Flush context free work on cleanup	Chris Wilson
	Throw in a flush_work() to specifically flush the context cleanup work before the module is unloaded. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112248 Fixes: a4e7ccdac38e ("drm/i915: Move context management under GEM") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191112150051.1603-1-chris@chris-wilson.co.uk (cherry picked from commit 5f00cac921b1219bc9daf00d169385b4cb3916ce) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18	Merge tag 'v5.4-rc8' into sched/core, to pick up fixes and dependencies	Ingo Molnar
	Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-18	spi: st-ssc4: add missed pm_runtime_disable	Chuhong Yuan
	The driver forgets to call pm_runtime_disable in probe failure and remove. Add the missed calls to fix it. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Link: https://lore.kernel.org/r/20191118024848.21645-1-hslester96@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-11-18	regulator: vexpress: Use PTR_ERR_OR_ZERO() to simplify code	zhengbin
	Fixes coccicheck warning: drivers/regulator/vexpress-regulator.c:78:1-3: WARNING: PTR_ERR_OR_ZERO can be used Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Link: https://lore.kernel.org/r/1574074762-34629-1-git-send-email-zhengbin13@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-11-18	staging: rtl8723bs: remove set but not used variable 'change', 'pos'	zhengbin
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function cfg80211_rtw_change_iface: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:1310:5: warning: variable change set but not used [-Wunused-but-set-variable] drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function rtw_cfg80211_set_wpa_ie: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:1803:19: warning: variable pos set but not used [-Wunused-but-set-variable] They are introduced by commit 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver"), but never used, so remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Link: https://lore.kernel.org/r/1574063156-68155-6-git-send-email-zhengbin13@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	staging: rtl8723bs: remove set but not used variable 'notify_ielen', ↵	zhengbin
	'notify_ie', 'notify_interval', 'notify_capability' Fixes gcc '-Wunused-but-set-variable' warning: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function rtw_cfg80211_inform_bss: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:248:9: warning: variable notify_ielen set but not used [-Wunused-but-set-variable] drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function rtw_cfg80211_inform_bss: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:247:6: warning: variable notify_ie set but not used [-Wunused-but-set-variable] drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function rtw_cfg80211_inform_bss: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:246:6: warning: variable notify_interval set but not used [-Wunused-but-set-variable] drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function rtw_cfg80211_inform_bss: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:245:6: warning: variable notify_capability set but not used [-Wunused-but-set-variable] They are introduced by commit 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver"), but never used, so remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Link: https://lore.kernel.org/r/1574063156-68155-5-git-send-email-zhengbin13@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	staging: rtl8723bs: remove set but not used variable 'pmlmeinfo', 'pHalData'	zhengbin
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/staging/rtl8723bs/hal/rtl8723b_cmd.c: In function ConstructBtNullFunctionData: drivers/staging/rtl8723bs/hal/rtl8723b_cmd.c:2099:24: warning: variable pmlmeinfo set but not used [-Wunused-but-set-variable] drivers/staging/rtl8723bs/hal/rtl8723b_cmd.c: In function SetFwRsvdPagePkt_BTCoex: drivers/staging/rtl8723bs/hal/rtl8723b_cmd.c:2154:24: warning: variable pmlmeinfo set but not used [-Wunused-but-set-variable] drivers/staging/rtl8723bs/hal/rtl8723b_cmd.c: In function SetFwRsvdPagePkt_BTCoex: drivers/staging/rtl8723bs/hal/rtl8723b_cmd.c:2149:23: warning: variable pHalData set but not used [-Wunused-but-set-variable] They are introduced by commit 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver"), but never used, so remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Link: https://lore.kernel.org/r/1574063156-68155-4-git-send-email-zhengbin13@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	staging: rtl8723bs: remove set but not used variable 'pHalData', 'pdmpriv'	zhengbin
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/staging/rtl8723bs/hal/rtl8723b_hal_init.c: In function rtl8723b_InitAntenna_Selection: drivers/staging/rtl8723bs/hal/rtl8723b_hal_init.c:2237:23: warning: variable pHalData set but not used [-Wunused-but-set-variable] drivers/staging/rtl8723bs/hal/rtl8723b_hal_init.c: In function rtl8723b_fill_default_txdesc: drivers/staging/rtl8723bs/hal/rtl8723b_hal_init.c:3056:18: warning: variable pdmpriv set but not used [-Wunused-but-set-variable] They are introduced by commit 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver"), but never used, so remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Link: https://lore.kernel.org/r/1574063156-68155-3-git-send-email-zhengbin13@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	staging: rtl8723bs: remove set but not used variable 'pHalData', 'pregistrypriv'	zhengbin
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/staging/rtl8723bs/hal/sdio_halinit.c: In function sdio_AggSettingRxUpdate: drivers/staging/rtl8723bs/hal/sdio_halinit.c:578:23: warning: variable pHalData set but not used [-Wunused-but-set-variable] drivers/staging/rtl8723bs/hal/sdio_halinit.c: In function rtl8723bs_hal_init: drivers/staging/rtl8723bs/hal/sdio_halinit.c:734:24: warning: variable pregistrypriv set but not used [-Wunused-but-set-variable] They are introduced by commit 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver"), but never used, so remove them. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Link: https://lore.kernel.org/r/1574063156-68155-2-git-send-email-zhengbin13@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	staging: rtl8192e: remove set but not used variable 'frag'	zhengbin
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/staging/rtl8192e/rtl8192e/r8192E_dev.c: In function _rtl92e_process_phyinfo: drivers/staging/rtl8192e/rtl8192e/r8192E_dev.c:1689:15: warning: variable frag set but not used [-Wunused-but-set-variable] It is introduced by commit 3d461c912462 ("rtl8192e: Split into two directories"), but never used, so remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Link: https://lore.kernel.org/r/1574063901-87429-1-git-send-email-zhengbin13@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	USB: uas: heed CAPACITY_HEURISTICS	Oliver Neukum
	There is no need to ignore this flag. We should be as close to storage in that regard as makes sense, so honor flags whose cost is tiny. Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191114112758.32747-3-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	USB: uas: honor flag to avoid CAPACITY16	Oliver Neukum
	Copy the support over from usb-storage to get feature parity Signed-off-by: Oliver Neukum <oneukum@suse.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191114112758.32747-2-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	usb: host: xhci-tegra: Correct phy enable sequence	Nagarjuna Kristam
	XUSB phy needs to be enabled before un-powergating the power partitions. However in the current sequence, it happens opposite. Correct the phy enable and powergating partition sequence to avoid any boot hangs. Signed-off-by: Nagarjuna Kristam <nkristam@nvidia.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Jui Chang Kuo <jckuo@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/1572859470-7823-1-git-send-email-nkristam@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	Merge tag 'usb-ci-v5.5-rc1' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-next Peter writes: - Improve the Vbus handler - Improve the HSIC handler for i.mx serial SoC - Some other tiny changes * tag 'usb-ci-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb: usb: chipidea: imx: pinctrl for HSIC is optional usb: chipidea: imx: refine the error handling for hsic usb: chipidea: imx: change hsic power regulator as optional usb: chipidea: imx: check data->usbmisc_data against NULL before access usb: chipidea: core: change vbus-regulator as optional usb: chipidea: imx: enable vbus and id wakeup only for OTG events usb: chipidea: udc: protect usb interrupt enable usb: chipidea: udc: add new API ci_hdrc_gadget_connect
2019-11-18	usb-serial: cp201x: support Mark-10 digital force gauge	Greg Kroah-Hartman
	Add support for the Mark-10 digital force gauge device to the cp201x driver. Based on a report and a larger patch from Joel Jennings Reported-by: Joel Jennings <joel.jennings@makeitlabs.com> Cc: stable <stable@vger.kernel.org> Acked-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191118092119.GA153852@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18	bpf: Convert bpf_prog refcnt to atomic64_t	Andrii Nakryiko
	Similarly to bpf_map's refcnt/usercnt, convert bpf_prog's refcnt to atomic64 and remove artificial 32k limit. This allows to make bpf_prog's refcounting non-failing, simplifying logic of users of bpf_prog_add/bpf_prog_inc. Validated compilation by running allyesconfig kernel build. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191117172806.2195367-3-andriin@fb.com
2019-11-18	bpf: Switch bpf_map ref counter to atomic64_t so bpf_map_inc() never fails	Andrii Nakryiko
	92117d8443bc ("bpf: fix refcnt overflow") turned refcounting of bpf_map into potentially failing operation, when refcount reaches BPF_MAX_REFCNT limit (32k). Due to using 32-bit counter, it's possible in practice to overflow refcounter and make it wrap around to 0, causing erroneous map free, while there are still references to it, causing use-after-free problems. But having a failing refcounting operations are problematic in some cases. One example is mmap() interface. After establishing initial memory-mapping, user is allowed to arbitrarily map/remap/unmap parts of mapped memory, arbitrarily splitting it into multiple non-contiguous regions. All this happening without any control from the users of mmap subsystem. Rather mmap subsystem sends notifications to original creator of memory mapping through open/close callbacks, which are optionally specified during initial memory mapping creation. These callbacks are used to maintain accurate refcount for bpf_map (see next patch in this series). The problem is that open() callback is not supposed to fail, because memory-mapped resource is set up and properly referenced. This is posing a problem for using memory-mapping with BPF maps. One solution to this is to maintain separate refcount for just memory-mappings and do single bpf_map_inc/bpf_map_put when it goes from/to zero, respectively. There are similar use cases in current work on tcp-bpf, necessitating extra counter as well. This seems like a rather unfortunate and ugly solution that doesn't scale well to various new use cases. Another approach to solve this is to use non-failing refcount_t type, which uses 32-bit counter internally, but, once reaching overflow state at UINT_MAX, stays there. This utlimately causes memory leak, but prevents use after free. But given refcounting is not the most performance-critical operation with BPF maps (it's not used from running BPF program code), we can also just switch to 64-bit counter that can't overflow in practice, potentially disadvantaging 32-bit platforms a tiny bit. This simplifies semantics and allows above described scenarios to not worry about failing refcount increment operation. In terms of struct bpf_map size, we are still good and use the same amount of space: BEFORE (3 cache lines, 8 bytes of padding at the end): struct bpf_map { const struct bpf_map_ops * ops __attribute__((__aligned__(64))); /* 0 8 / struct bpf_map inner_map_meta; /* 8 8 / void security; /* 16 8 / enum bpf_map_type map_type; / 24 4 / u32 key_size; / 28 4 / u32 value_size; / 32 4 / u32 max_entries; / 36 4 / u32 map_flags; / 40 4 / int spin_lock_off; / 44 4 / u32 id; / 48 4 / int numa_node; / 52 4 / u32 btf_key_type_id; / 56 4 / u32 btf_value_type_id; / 60 4 / / --- cacheline 1 boundary (64 bytes) --- / struct btf btf; /* 64 8 / struct bpf_map_memory memory; / 72 16 / bool unpriv_array; / 88 1 / bool frozen; / 89 1 / / XXX 38 bytes hole, try to pack / / --- cacheline 2 boundary (128 bytes) --- / atomic_t refcnt __attribute__((__aligned__(64))); / 128 4 / atomic_t usercnt; / 132 4 / struct work_struct work; / 136 32 / char name[16]; / 168 16 / / size: 192, cachelines: 3, members: 21 / / sum members: 146, holes: 1, sum holes: 38 / / padding: 8 / / forced alignments: 2, forced holes: 1, sum forced holes: 38 / } __attribute__((__aligned__(64))); AFTER (same 3 cache lines, no extra padding now): struct bpf_map { const struct bpf_map_ops ops __attribute__((__aligned__(64))); /* 0 8 / struct bpf_map inner_map_meta; /* 8 8 / void security; /* 16 8 / enum bpf_map_type map_type; / 24 4 / u32 key_size; / 28 4 / u32 value_size; / 32 4 / u32 max_entries; / 36 4 / u32 map_flags; / 40 4 / int spin_lock_off; / 44 4 / u32 id; / 48 4 / int numa_node; / 52 4 / u32 btf_key_type_id; / 56 4 / u32 btf_value_type_id; / 60 4 / / --- cacheline 1 boundary (64 bytes) --- / struct btf btf; /* 64 8 / struct bpf_map_memory memory; / 72 16 / bool unpriv_array; / 88 1 / bool frozen; / 89 1 / / XXX 38 bytes hole, try to pack / / --- cacheline 2 boundary (128 bytes) --- / atomic64_t refcnt __attribute__((__aligned__(64))); / 128 8 / atomic64_t usercnt; / 136 8 / struct work_struct work; / 144 32 / char name[16]; / 176 16 / / size: 192, cachelines: 3, members: 21 / / sum members: 154, holes: 1, sum holes: 38 / / forced alignments: 2, forced holes: 1, sum forced holes: 38 */ } __attribute__((__aligned__(64))); This patch, while modifying all users of bpf_map_inc, also cleans up its interface to match bpf_map_put with separate operations for bpf_map_inc and bpf_map_inc_with_uref (to match bpf_map_put and bpf_map_put_with_uref, respectively). Also, given there are no users of bpf_map_inc_not_zero specifying uref=true, remove uref flag and default to uref=false internally. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191117172806.2195367-2-andriin@fb.com