summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-29dm mpath: delay the retry of a request if the target responded as busyMike Snitzer
Add DM_ENDIO_DELAY_REQUEUE to allow request-based multipath's multipath_end_io() to instruct dm-rq.c:dm_done() to delay a requeue. This is beneficial to do if BLK_STS_RESOURCE is returned from the target (because target is busy). Relative to blk-mq: kick the hw queues via blk_mq_requeue_work(), indirectly from dm-rq.c:__dm_mq_kick_requeue_list(), after a delay. For old .request_fn: use blk_delay_queue(). bio-based multipath doesn't have feature parity with request-based for retryable error requeues; that is something that'll need fixing in the future. Suggested-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Bart Van Assche <bart.vanassche@wdc.com> [as interpreted from Bart's "... patch looks fine to me."]
2018-01-29platform/x86: GPD pocket fan: Stop work on suspendHans de Goede
Stop the work on suspend, otherwise it may run between our suspend method running and the system suspending, possibly restarting the fan which we've just stopped. Note we already requeue the work on resume, so that we get a fresh speed at resume. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-01-29platform/x86: GPD pocket fan: Use a min-speed of 2 while chargingHans de Goede
Newer versions of the GPD pocket BIOS set the fan-speed to 2 when a charger gets plugged in while the device is off. Mirror this in our fan driver and use a minimum speed of 2 while charging, Cc: James <kernel@madingley.org> Suggested-by: James <kernel@madingley.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-01-29platform/x86: GPD pocket fan: Set speed to max on get_temp failureHans de Goede
When we fail to get the temperature, assume the worst and set the speed to max. While at it introduce a define for MAX_SPEED. Cc: James <kernel@madingley.org> Suggested-by: James <kernel@madingley.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-01-29platform/x86: intel_pmc_core: Convert to ICPU macroRajneesh Bhardwaj
Use ICPU macro to refactor code related to x86_cpu_id for better readability. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-01-29platform/x86: intel_pmc_core: Substitute PCI with CPUID enumerationSrinivas Pandruvada
The Only use of PCI device enumeration here is to get the PMC base address which is a fixed value i.e. 0xFE000000. On some platforms this can be read through a non standard PCI BAR. But after Kabylake, PMC is not exposed as a PCI device anymore. There are other non standard methods like ACPI LPIT which can also be used for obtaining this value. For simplicity, this value can be hardcoded as it won't change. Since we don't have a PMC PCI device on any platform after Kabylake, this creates a foundation for future SoC support. Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-01-29platform/x86: intel_pmc_core: Refactor debugfs entriesRajneesh Bhardwaj
When on a platform if we can't show MPHY and PLL status, don't even bother to create a debugfs entry as it will fail anyway. In fact unless OEM builds a special BIOS for test, it will fail on every production system. This will help to add future platform support where we can't support these entries. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-01-29platform/x86: dell-smbios: Correct notation for filteringMario Limonciello
The class/select were mistakenly put into octal notation but intended to be in decimal notation. Suggested-by: Pali Rohar <pali.rohar@gmail.com> Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2018-01-29btrfs: drop devid as device_list_add() argAnand Jain
As struct btrfs_disk_super is being passed, so it can get devid the same way its parent does. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-29btrfs: get device pointer from device_list_add()Anand Jain
Instead of pointer to btrfs_fs_devices as an arg in device_list_add() better to get pointer to btrfs_device as return value, then we have both, pointer to btrfs_device and btrfs_fs_devices. btrfs_device is needed to handle reappearing missing device. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-29Merge tag 'acpi-4.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "The majority of this is an update of the ACPICA kernel code to upstream revision 20171215 with a cosmetic change and a maintainers information update on top of it. The rest is mostly some minor fixes and cleanups in the ACPI drivers and cleanups to initialization on x86. Specifics: - Update the ACPICA kernel code to upstream revision 20171215 including: * Support for ACPI 6.0A changes in the NFIT table (Bob Moore) * Local 64-bit divide in string conversions (Bob Moore) * Fix for a regression in acpi_evaluate_object_type() (Bob Moore) * Fixes for memory leaks during package object resolution (Bob Moore) * Deployment of safe version of strncpy() (Bob Moore) * Debug and messaging updates (Bob Moore) * Support for PDTT, SDEV, TPM2 tables in iASL and tools (Bob Moore) * Null pointer dereference avoidance in Op and cleanups (Colin Ian King) * Fix for memory leak from building prefixed pathname (Erik Schmauss) * Coding style fixes, disassembler and compiler updates (Hanjun Guo, Erik Schmauss) * Additional PPTT flags from ACPI 6.2 (Jeremy Linton) * Fix for an off-by-one error in acpi_get_timer_duration() (Jung-uk Kim) * Infinite loop detection timeout and utilities cleanups (Lv Zheng) * Windows 10 version 1607 and 1703 OSI strings (Mario Limonciello) - Update ACPICA information in MAINTAINERS to reflect the current status of ACPICA maintenance and rename a local variable in one function to match the corresponding upstream code (Rafael Wysocki) - Clean up ACPI-related initialization on x86 (Andy Shevchenko) - Add support for Intel Merrifield to the ACPI GPIO code (Andy Shevchenko) - Clean up ACPI PMIC drivers (Andy Shevchenko, Arvind Yadav) - Fix the ACPI Generic Event Device (GED) driver to free IRQs on shutdown and clean up the PCI IRQ Link driver (Sinan Kaya) - Make the GHES code call into the AER driver on all errors and clean up the ACPI APEI code (Colin Ian King, Tyler Baicar) - Make the IA64 ACPI NUMA code parse all SRAT entries (Ganapatrao Kulkarni) - Add a lid switch blacklist to the ACPI button driver and make it print extra debug messages on lid events (Hans de Goede) - Add quirks for Asus GL502VSK and UX305LA to the ACPI battery driver and clean it up somewhat (Bjørn Mork, Kai-Heng Feng) - Add device link for CHT SD card dependency on I2C to the ACPI LPSS (Intel SoCs) driver and make it avoid creating platform device objects for devices without MMIO resources (Adrian Hunter, Hans de Goede) - Fix the ACPI GPE mask kernel command line parameter handling (Prarit Bhargava) - Fix the handling of (incorrectly exposed) backlight interfaces without LCD (Hans de Goede) - Fix the usage of debugfs_create_*() in the ACPI EC driver (Geert Uytterhoeven)" * tag 'acpi-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (62 commits) ACPI/PCI: pci_link: reduce verbosity when IRQ is enabled ACPI / LPSS: Do not instiate platform_dev for devs without MMIO resources ACPI / PMIC: Convert to use builtin_platform_driver() macro ACPI / x86: boot: Propagate error code in acpi_gsi_to_irq() ACPICA: Update version to 20171215 ACPICA: trivial style fix, no functional change ACPICA: Fix a couple memory leaks during package object resolution ACPICA: Recognize the Windows 10 version 1607 and 1703 OSI strings ACPICA: DT compiler: prevent error if optional field at the end of table is not present ACPICA: Rename a global variable, no functional change ACPICA: Create and deploy safe version of strncpy ACPICA: Cleanup the global variables and update comments ACPICA: Debugger: fix slight indentation issue ACPICA: Fix a regression in the acpi_evaluate_object_type() interface ACPICA: Update for a few debug output statements ACPICA: Debug output, no functional change ACPI: EC: Fix debugfs_create_*() usage ACPI / video: Default lcd_only to true on Win8-ready and newer machines ACPI / x86: boot: Don't setup SCI on HW-reduced platforms ACPI / x86: boot: Use INVALID_ACPI_IRQ instead of 0 for acpi_sci_override_gsi ...
2018-01-29Merge tag 'pm-4.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "This includes some infrastructure changes in the PM core, mostly related to integration between runtime PM and system-wide suspend and hibernation, plus some driver changes depending on them and fixes for issues in that area which have become quite apparent recently. Also included are changes making more x86-based systems use the Low Power Sleep S0 _DSM interface by default, which turned out to be necessary to handle power button wakeups from suspend-to-idle on Surface Pro3. On the cpufreq front we have fixes and cleanups in the core, some new hardware support, driver updates and the removal of some unused code from the CPU cooling thermal driver. Apart from this, the Operating Performance Points (OPP) framework is prepared to be used with power domains in the future and there is a usual bunch of assorted fixes and cleanups. Specifics: - Define a PM driver flag allowing drivers to request that their devices be left in suspend after system-wide transitions to the working state if possible and add support for it to the PCI bus type and the ACPI PM domain (Rafael Wysocki). - Make the PM core carry out optimizations for devices with driver PM flags set in some cases and make a few drivers set those flags (Rafael Wysocki). - Fix and clean up wrapper routines allowing runtime PM device callbacks to be re-used for system-wide PM, change the generic power domains (genpd) framework to stop using those routines incorrectly and fix up a driver depending on that behavior of genpd (Rafael Wysocki, Ulf Hansson, Geert Uytterhoeven). - Fix and clean up the PM core's device wakeup framework and re-factor system-wide PM core code related to device wakeup (Rafael Wysocki, Ulf Hansson, Brian Norris). - Make more x86-based systems use the Low Power Sleep S0 _DSM interface by default (to fix power button wakeup from suspend-to-idle on Surface Pro3) and add a kernel command line switch to tell it to ignore the system sleep blacklist in the ACPI core (Rafael Wysocki). - Fix a race condition related to cpufreq governor module removal and clean up the governor management code in the cpufreq core (Rafael Wysocki). - Drop the unused generic code related to the handling of the static power energy usage model in the CPU cooling thermal driver along with the corresponding documentation (Viresh Kumar). - Add mt2712 support to the Mediatek cpufreq driver (Andrew-sh Cheng). - Add a new operating point to the imx6ul and imx6q cpufreq drivers and switch the latter to using clk_bulk_get() (Anson Huang, Dong Aisheng). - Add support for multiple regulators to the TI cpufreq driver along with a new DT binding related to that and clean up that driver somewhat (Dave Gerlach). - Fix a powernv cpufreq driver regression leading to incorrect CPU frequency reporting, fix that driver to deal with non-continguous P-states correctly and clean it up (Gautham Shenoy, Shilpasri Bhat). - Add support for frequency scaling on Armada 37xx SoCs through the generic DT cpufreq driver (Gregory CLEMENT). - Fix error code paths in the mvebu cpufreq driver (Gregory CLEMENT). - Fix a transition delay setting regression in the longhaul cpufreq driver (Viresh Kumar). - Add Skylake X (server) support to the intel_pstate cpufreq driver and clean up that driver somewhat (Srinivas Pandruvada). - Clean up the cpufreq statistics collection code (Viresh Kumar). - Drop cluster terminology and dependency on physical_package_id from the PSCI driver and drop dependency on arm_big_little from the SCPI cpufreq driver (Sudeep Holla). - Add support for system-wide suspend and resume to the RAPL power capping driver and drop a redundant semicolon from it (Zhen Han, Luis de Bethencourt). - Make SPI domain validation (in the SCSI SPI transport driver) and system-wide suspend mutually exclusive as they rely on the same underlying mechanism and cannot be carried out at the same time (Bart Van Assche). - Fix the computation of the amount of memory to preallocate in the hibernation core and clean up one function in there (Rainer Fiebig, Kyungsik Lee). - Prepare the Operating Performance Points (OPP) framework for being used with power domains and clean up one function in it (Viresh Kumar, Wei Yongjun). - Clean up the generic sysfs interface for device PM (Andy Shevchenko). - Fix several minor issues in power management frameworks and clean them up a bit (Arvind Yadav, Bjorn Andersson, Geert Uytterhoeven, Gustavo Silva, Julia Lawall, Luis de Bethencourt, Paul Gortmaker, Sergey Senozhatsky, gaurav jindal). - Make it easier to disable PM via Kconfig (Mark Brown). - Clean up the cpupower and intel_pstate_tracer utilities (Doug Smythies, Laura Abbott)" * tag 'pm-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits) PCI / PM: Remove spurious semicolon cpufreq: scpi: remove arm_big_little dependency drivers: psci: remove cluster terminology and dependency on physical_package_id powercap: intel_rapl: Fix trailing semicolon dmaengine: rcar-dmac: Make DMAC reinit during system resume explicit PM / runtime: Allow no callbacks in pm_runtime_force_suspend|resume() PM / hibernate: Drop unused parameter of enough_swap PM / runtime: Check ignore_children in pm_runtime_need_not_resume() PM / runtime: Rework pm_runtime_force_suspend/resume() PM / genpd: Stop/start devices without pm_runtime_force_suspend/resume() cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin cpufreq: intel_pstate: Add Skylake servers support cpufreq: intel_pstate: Replace bxt_funcs with core_funcs platform/x86: surfacepro3: Support for wakeup from suspend-to-idle ACPI / PM: Use Low Power S0 Idle on more systems PM / wakeup: Print warn if device gets enabled as wakeup source during sleep PM / domains: Don't skip driver's ->suspend|resume_noirq() callbacks PM / core: Propagate wakeup_path status flag in __device_suspend_late() PM / core: Re-structure code for clearing the direct_complete flag powercap: add suspend and resume mechanism for SOC power limit ...
2018-01-29Merge branch 'net_sched-reflect-tx_queue_len-change-for-pfifo_fast'David S. Miller
Cong Wang says: ==================== net_sched: reflect tx_queue_len change for pfifo_fast This pathcset restores the pfifo_fast qdisc behavior of dropping packets based on latest dev->tx_queue_len. Patch 1 introduces a helper, patch 2 introduces a new Qdisc ops which is called when we modify tx_queue_len, patch 3 implements this ops for pfifo_fast. Please see each patch for details. --- v3: use skb_array_resize_multiple() v2: handle error case for ->change_tx_queue_len() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29net_sched: implement ->change_tx_queue_len() for pfifo_fastCong Wang
pfifo_fast used to drop based on qdisc_dev(qdisc)->tx_queue_len, so we have to resize skb array when we change tx_queue_len. Other qdiscs which read tx_queue_len are fine because they all save it to sch->limit or somewhere else in qdisc during init. They don't have to implement this, it is nicer if they do so that users don't have to re-configure qdisc after changing tx_queue_len. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29net_sched: plug in qdisc ops change_tx_queue_lenCong Wang
Introduce a new qdisc ops ->change_tx_queue_len() so that each qdisc could decide how to implement this if it wants. Previously we simply read dev->tx_queue_len, after pfifo_fast switches to skb array, we need this API to resize the skb array when we change dev->tx_queue_len. To avoid handling race conditions with TX BH, we need to deactivate all TX queues before change the value and bring them back after we are done, this also makes implementation easier. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29net: introduce helper dev_change_tx_queue_len()Cong Wang
This patch promotes the local change_tx_queue_len() to a core helper function, dev_change_tx_queue_len(), so that rtnetlink and net-sysfs could share the code. This also prepares for the following patch. Note, the -EFAULT in the original code doesn't make sense, we should propagate the errno from notifiers. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29Merge tag 'sound-4.16-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "The major changes in the core API side in this cycle are the still on-going ASoC componentization works. Other than that, only few small changes such as 20bit PCM format support are found. Meanwhile the rest majority of changes are for ASoC drivers: - Large cleanups of some of the TI CODEC drivers - Continued work on Intel ASoC stuff for new quirks, ACPI GPIO handling, Kconfigs and lots of cleanups - Refactoring of the Freescale SSI driver, as preliminary work for the upcoming changes - Work on ST DFSDM driver, including the required IIO patches - New drivers for Allwinner A83T, Maxim MAX89373, SocioNext UiniPhier EVEA Tempo Semiconductor TSCS42xx and TI PCM816x, TAS5722 and TAS6424 devices - Removal of dead codes for SN95031 and board drivers Last but not least, a few HD-audio and USB-audio quirks are included as usual, too" * tag 'sound-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (303 commits) ALSA: hda - Reduce the suspend time consumption for ALC256 ASoC: use seq_file to dump the contents of dai_list,platform_list and codec_list ASoC: soc-core: add missing EXPORT_SYMBOL_GPL() for snd_soc_rtdcom_lookup IIO: ADC: stm32-dfsdm: remove unused variable again ASoC: bcm2835: fix hw_params error when device is in prepared state ASoC: mxs-sgtl5000: Do not print error on probe deferral ASoC: sgtl5000: Do not print error on probe deferral ASoC: Intel: remove select on non-existing SND_SOC_INTEL_COMMON ALSA: usb-audio: Support changing input on Sound Blaster E1 ASoC: Intel: remove second duplicated assignment to pointer 'res' ALSA: hda/realtek - update ALC215 depop optimize ALSA: hda/realtek - Support headset mode for ALC215/ALC285/ALC289 ALSA: pcm: Fix trailing semicolon ASoC: add Component level .read/.write ASoC: cx20442: fix regression by adding back .read/.write ASoC: uda1380: fix regression by adding back .read/.write ASoC: tlv320dac33: fix regression by adding back .read/.write ALSA: hda - Use IS_REACHABLE() for dependency on input IIO: ADC: stm32-dfsdm: fix static check warning IIO: ADC: stm32-dfsdm: code optimization ...
2018-01-29libceph: check kstrndup() return valueChengguang Xu
Should check result of kstrndup() in case of memory allocation failure. Signed-off-by: Chengguang Xu <cgxu519@icloud.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: try to allocate enough memory for reserved capsZhi Zhang
ceph_reserve_caps() may not reserve enough caps under high memory pressure, but it saved the needed caps number that expected to be reserved. When getting caps, crash would happen due to number mismatch. Now we will try to trim more caps when failing to allocate memory for caps need to be reserved, then try again. If still failing to allocate memory, return -ENOMEM. Signed-off-by: Zhi Zhang <zhang.david2011@gmail.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: fix race of queuing delayed capsYan, Zheng
When called with CHECK_CAPS_AUTHONLY flag, ceph_check_caps() only processes auth caps. In that case, it's unsafe to remove inode from mdsc->cap_delay_list, because there can be delayed non-auth caps. Besides, ceph_check_caps() may lock/unlock i_ceph_lock several times, when multiple threads call ceph_check_caps() at the same time. It's possible that one thread calls __cap_delay_requeue(), another thread calls __cap_delay_cancel(). __cap_delay_cancel() should be called at very beginning of ceph_check_caps(), so that it does not race with __cap_delay_requeue(). Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: delete unreachable code in ceph_check_caps()Yan, Zheng
"revoking & (CEPH_CAP_FILE_CACHE|CEPH_CAP_FILE_LAZYIO)" has already been tested before calling try_nonblocking_invalidate() Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: limit rate of cap import/export error messagesYan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: fix incorrect snaprealm when adding capsYan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: fix un-balanced fsc->writeback_count updateYan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: track read contexts in ceph_file_infoYan, Zheng
Previously ceph_read_iter() uses current->journal to pass context info to ceph_readpages(), so that ceph_readpages() can distinguish read(2) from readahead(2)/fadvise(2)/madvise(2). The problem is that page fault can happen when copying data to userspace memory. Page fault may call other filesystem's page_mkwrite() if the userspace memory is mapped to a file. The later filesystem may also want to use current->journal. The fix is define a on-stack data structure in ceph_read_iter(), add it to context list in ceph_file_info. ceph_readpages() searches the list, find if there is a context belongs to current thread. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: avoid dereferencing invalid pointer during cached readdirYan, Zheng
Readdir cache keeps array of dentry pointers in page cache. If any dentry in readdir cache gets pruned, ceph_d_prune() disables readdir cache for later readdir syscall. The problem is that ceph_d_prune() ignores unhashed dentry. Ideally MDS should have already revoked CEPH_CAP_FILE_SHARED (which also disables readdir cache) when dentry gets unhashed. But if it is somehow MDS does not properly revoke CEPH_CAP_FILE_SHARED and the unhashed dentry gets pruned later, ceph_d_prune() will not disable readdir cache, later readdir may reference invalid dentry pointer. The fix is make ceph_d_prune() do extra check for unhashed dentry. Disable readdir cache if the unhashed dentry is still referenced by readdir cache. Another fix in this patch is handle d_splice_alias(). If a dentry gets spliced into new parent dentry, treat it as if it was pruned (call ceph_d_prune() for it). Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: use atomic_t for ceph_inode_info::i_shared_genYan, Zheng
It allows accessing i_shared_gen without holding i_ceph_lock. It is preparation for later patch. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: cleanup traceless reply handling for renameYan, Zheng
ceph_fill_trace() already calls ceph_invalidate_dir_request() for traceless reply. No need to duplicate the code in ceph_rename(). Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: voluntarily drop Fx cap for readdir requestYan, Zheng
MDS need to rdlock directory inode's filelock when handling readdir request. Voluntarily dropping CEPH_CAP_AUTH_EXCL avoids a cap revoke message. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: properly drop caps for setattr requestYan, Zheng
For CEPH_SETATTR_ATIME, MDS needs to xlock filelock, Fsxrw caps are not allowed for xlocked filelock. For CEPH_SETATTR_SIZE request that truncates file to smaller size, MDS needs to xlock filelock, Fsxrw caps are not allowed for xlocked filelock. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: voluntarily drop Lx cap for link/rename requestsYan, Zheng
MDS need to xlock inode's linklock when handling link/rename requests. Voluntarily dropping CEPH_CAP_AUTH_EXCL avoids a cap revoke message. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29ceph: voluntarily drop Ax cap for requests that create new inodeYan, Zheng
MDS need to rdlock directory inode's authlock when handling these requests. Voluntarily dropping CEPH_CAP_AUTH_EXCL avoids a cap revoke message. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-29rbd: whitelist RBD_FEATURE_OPERATIONS feature bitIlya Dryomov
This feature bit restricts older clients from performing certain maintenance operations against an image (e.g. clone, snap create). krbd does not perform maintenance operations. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2018-01-29vhost_net: stop device during reset ownerJason Wang
We don't stop device before reset owner, this means we could try to serve any virtqueue kick before reset dev->worker. This will result a warn since the work was pending at llist during owner resetting. Fix this by stopping device during owner reset. Reported-by: syzbot+eb17c6162478cc50632c@syzkaller.appspotmail.com Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29Merge branch 'net-Ease-to-follow-an-interface-that-moves-to-another-netns'David S. Miller
Nicolas Dichtel says: ==================== net: Ease to follow an interface that moves to another netns The goal of this series is to ease the user to follow an interface that moves to another netns. After this series, with a patched iproute2: $ ip netns bar foo $ ip monitor link & $ ip link set dummy0 netns foo Deleted 5: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default link/ether 6e:a7:82:35:96:46 brd ff:ff:ff:ff:ff:ff new-nsid 0 new-ifindex 6 => new nsid: 0, new ifindex: 6 (was 5 in the previous netns) $ ip link set eth1 netns bar Deleted 3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default link/ether 52:54:01:12:34:57 brd ff:ff:ff:ff:ff:ff new-nsid 1 new-ifindex 3 => new nsid: 1, new ifindex: 3 (same ifindex) $ ip netns bar (id: 1) foo (id: 0) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29dev: advertise the new ifindex when the netns iface changesNicolas Dichtel
The goal is to let the user follow an interface that moves to another netns. CC: Jiri Benc <jbenc@redhat.com> CC: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29dev: always advertise the new nsid when the netns iface changesNicolas Dichtel
The user should be able to follow any interface that moves to another netns. There is no reason to hide physical interfaces. CC: Jiri Benc <jbenc@redhat.com> CC: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29net: ethernet: cavium: Correct Cavium Thunderx NIC driver names accordingly ↵Vadim Lomovtsev
to module name It was found that ethtool provides unexisting module name while it queries the specified network device for associated driver information. Then user tries to unload that module by provided module name and fails. This happens because ethtool reads value of DRV_NAME macro, while module name is defined at the driver's Makefile. This patch is to correct Cavium CN88xx Thunder NIC driver names (DRV_NAME macro) 'thunder-nicvf' to 'nicvf' and 'thunder-nic' to 'nicpf', sync bgx and xcv driver names accordingly to their module names. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29Merge tag 'init_task-20180117' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull init_task initializer cleanups from David Howells: "It doesn't seem useful to have the init_task in a header file rather than in a normal source file. We could consolidate init_task handling instead and expand out various macros. Here's a series of patches that consolidate init_task handling: (1) Make THREAD_SIZE available to vmlinux.lds for cris, hexagon and openrisc. (2) Alter the INIT_TASK_DATA linker script macro to set init_thread_union and init_stack rather than defining these in C. Insert init_task and init_thread_into into the init_stack area in the linker script as appropriate to the configuration, with different section markers so that they end up correctly ordered. We can then get merge ia64's init_task.c into the main one. We then have a bunch of single-use INIT_*() macros that seem only to be macros because they used to be used per-arch. We can then expand these in place of the user and get rid of a few lines and a lot of backslashes. (3) Expand INIT_TASK() in place. (4) Expand in place various small INIT_*() macros that are defined conditionally. Expand them and surround them by #if[n]def/#endif in the .c file as it takes fewer lines. (5) Expand INIT_SIGNALS() and INIT_SIGHAND() in place. (6) Expand INIT_STRUCT_PID in place. These macros can then be discarded" * tag 'init_task-20180117' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: Expand INIT_STRUCT_PID and remove Expand the INIT_SIGNALS and INIT_SIGHAND macros and remove Expand various INIT_* macros and remove Expand INIT_TASK() in init/init_task.c and remove Construct init thread stack in the linker script rather than by union openrisc: Make THREAD_SIZE available to vmlinux.lds hexagon: Make THREAD_SIZE available to vmlinux.lds cris: Make THREAD_SIZE available to vmlinux.lds
2018-01-29Merge branch 'ptr_ring-fixes'David S. Miller
Michael S. Tsirkin says: ==================== ptr_ring fixes This fixes a bunch of issues around ptr_ring use in net core. One of these: "tap: fix use-after-free" is also needed on net, but can't be backported cleanly. I will post a net patch separately. Lightly tested - Jason, could you pls confirm this addresses the security issue you saw with ptr_ring? Testing reports would be appreciated too. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Jason Wang <jasowang@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2018-01-29tools/virtio: fix smp_mb on x86Michael S. Tsirkin
Offset 128 overlaps the last word of the redzone. Use 132 which is always beyond that. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29tools/virtio: copy READ/WRITE_ONCEMichael S. Tsirkin
This is to make ptr_ring test build again. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29tools/virtio: more stubs to fix tools buildMichael S. Tsirkin
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29tools/virtio: switch to __ptr_ring_emptyMichael S. Tsirkin
We don't rely on lockless guarantees, but it seems cleaner than inverting __ptr_ring_peek. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29ptr_ring: prevent queue load/store tearingMichael S. Tsirkin
In theory compiler could tear queue loads or stores in two. It does not seem to be happening in practice but it seems easier to convert the cases where this would be a problem to READ/WRITE_ONCE than worry about it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29skb_array: use __ptr_ring_emptyMichael S. Tsirkin
__skb_array_empty should use __ptr_ring_empty since that's the only legal lockless function. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29Revert "net: ptr_ring: otherwise safe empty checks can overrun array bounds"Michael S. Tsirkin
This reverts commit bcecb4bbf88aa03171c30652bca761cf27755a6b. If we try to allocate an extra entry as the above commit did, and when the requested size is UINT_MAX, addition overflows causing zero size to be passed to kmalloc(). kmalloc then returns ZERO_SIZE_PTR with a subsequent crash. Reported-by: syzbot+87678bcf753b44c39b67@syzkaller.appspotmail.com Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29ptr_ring: disallow lockless __ptr_ring_fullMichael S. Tsirkin
Similar to bcecb4bbf88a ("net: ptr_ring: otherwise safe empty checks can overrun array bounds") a lockless use of __ptr_ring_full might cause an out of bounds access. We can fix this, but it's easier to just disallow lockless __ptr_ring_full for now. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29tap: fix use-after-freeMichael S. Tsirkin
Lockless access to __ptr_ring_full is only legal if ring is never resized, otherwise it might cause use-after free errors. Simply drop the lockless test, we'll drop the packet a bit later when produce fails. Fixes: 362899b8 ("macvtap: switch to use skb array") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29ptr_ring: READ/WRITE_ONCE for __ptr_ring_emptyMichael S. Tsirkin
Lockless __ptr_ring_empty requires that consumer head is read and written at once, atomically. Annotate accordingly to make sure compiler does it correctly. Switch locked callers to __ptr_ring_peek which does not support the lockless operation. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>