summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-10-18PM / Domains: Deal with multiple states but no governor in genpdUlf Hansson
A caller of pm_genpd_init() that provides some states for the genpd via the ->states pointer in the struct generic_pm_domain, should also provide a governor. This because it's the job of the governor to pick a state that satisfies the constraints. Therefore, let's print a warning to inform the user about such bogus configuration and avoid to bail out, by instead picking the shallowest state before genpd invokes the ->power_off() callback. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18PM / Domains: Don't treat zero found compatible idle states as an errorUlf Hansson
Instead of returning -EINVAL from of_genpd_parse_idle_states() in case none compatible states was found, let's return 0 to indicate success. Assign also the out-parameter *states to NULL and *n to 0, to indicate to the caller that zero states have been found/allocated. This enables the caller of of_genpd_parse_idle_states() to easier act on the returned error code. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18Merge branch 'acpica'Rafael J. Wysocki
* acpica: ACPICA: Remove acpi_gbl_group_module_level_code and only use acpi_gbl_execute_tables_as_methods instead ACPICA: AML Parser: fix parse loop to correctly skip erroneous extended opcodes ACPICA: AML interpreter: add region addresses in global list during initialization ACPICA: Update version to 20181003 ACPICA: Never run _REG on system_memory and system_IO ACPICA: Split large interpreter file ACPICA: Update for field unit access ACPICA: Rename some of the Field Attribute defines ACPICA: Update for generic_serial_bus and attrib_raw_process_bytes protocol
2018-10-18fscache: Fix out of bound read in long cookie keysEric Sandeen
fscache_set_key() can incur an out-of-bounds read, reported by KASAN: BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache] Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615 and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236 BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline] BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171 Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466 This happens for any index_key_len which is not divisible by 4 and is larger than the size of the inline key, because the code allocates exactly index_key_len for the key buffer, but the hashing loop is stepping through it 4 bytes (u32) at a time in the buf[] array. Fix this by calculating how many u32 buffers we'll need by using DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation buffer to hold the index_key, then using that same count as the hashing index limit. Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies") Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18fscache: Fix incomplete initialisation of inline key spaceDavid Howells
The inline key in struct rxrpc_cookie is insufficiently initialized, zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15 bytes will end up hashing uninitialized memory because the memcpy only partially fills the last buf[] element. Fix this by clearing fscache_cookie objects on allocation rather than using the slab constructor to initialise them. We're going to pretty much fill in the entire struct anyway, so bringing it into our dcache writably shouldn't incur much overhead. This removes the need to do clearance in fscache_set_key() (where we aren't doing it correctly anyway). Also, we don't need to set cookie->key_len in fscache_set_key() as we already did it in the only caller, so remove that. Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies") Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com Reported-by: Eric Sandeen <sandeen@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18cachefiles: fix the race between cachefiles_bury_object() and rmdir(2)Al Viro
the victim might've been rmdir'ed just before the lock_rename(); unlike the normal callers, we do not look the source up after the parents are locked - we know it beforehand and just recheck that it's still the child of what used to be its parent. Unfortunately, the check is too weak - we don't spot a dead directory since its ->d_parent is unchanged, dentry is positive, etc. So we sail all the way to ->rename(), with hosting filesystems _not_ expecting to be asked renaming an rmdir'ed subdirectory. The fix is easy, fortunately - the lock on parent is sufficient for making IS_DEADDIR() on child safe. Cc: stable@vger.kernel.org Fixes: 9ae326a69004 (CacheFiles: A cache that backs onto a mounted filesystem) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18mremap: properly flush TLB before releasing the pageLinus Torvalds
Jann Horn points out that our TLB flushing was subtly wrong for the mremap() case. What makes mremap() special is that we don't follow the usual "add page to list of pages to be freed, then flush tlb, and then free pages". No, mremap() obviously just _moves_ the page from one page table location to another. That matters, because mremap() thus doesn't directly control the lifetime of the moved page with a freelist: instead, the lifetime of the page is controlled by the page table locking, that serializes access to the entry. As a result, we need to flush the TLB not just before releasing the lock for the source location (to avoid any concurrent accesses to the entry), but also before we release the destination page table lock (to avoid the TLB being flushed after somebody else has already done something to that page). This also makes the whole "need_flush" logic unnecessary, since we now always end up flushing the TLB for every valid entry. Reported-and-tested-by: Jann Horn <jannh@google.com> Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18LICENSES: Remove CC-BY-SA-4.0 license textChristoph Hellwig
Using non-GPL licenses for our documentation is rather problematic, as it can directly include other files, which generally are GPLv2 licensed and thus not compatible. Remove this license now that the only user (idr.rst) is gone to avoid people semi-accidentally using it again. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18Merge branch 'ida-fixes-4.19-rc8' of ↵Greg Kroah-Hartman
git://git.infradead.org/users/willy/linux-dax Matthew writes: "IDA/IDR fixes for 4.19 I have two tiny fixes, one for the IDA test-suite and one for the IDR documentation license." * 'ida-fixes-4.19-rc8' of git://git.infradead.org/users/willy/linux-dax: idr: Change documentation license test_ida: Fix lockdep warning
2018-10-18ALSA: doc: Brush up the old writing-an-alsa-driverTakashi Iwai
Slightly brushing up and throw the old dust away from my ancient writing-an-alsa-driver document. The contents aren't changed so much but the obsoleted parts are dropped. Also, remove the date and the version number. It's useless. Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-10-18Bluetooth: hci_qca: Add support for controller debug logs.Balakrishna Godavarthi
This patch will prevent error messages splashing on console. [ 78.426697] Bluetooth: hci_core.c:hci_acldata_packet() hci0: ACL packet for unknown connection handle 3804 [ 78.436682] Bluetooth: hci_core.c:hci_acldata_packet() hci0: ACL packet for unknown connection handle 3804 [ 78.446639] Bluetooth: hci_core.c:hci_acldata_packet() hci0: ACL packet for unknown connection handle 3804 [ 78.456596] Bluetooth: hci_core.c:hci_acldata_packet() hci0: ACL packet for unknown connection handle 3804 QCA wcn3990 will send the debug logs in the form of ACL packets. While decoding packet in qca_recv(), marking the received debug log packet as diagnostic packet. Signed-off-by: Harish Bandi <c-hbandi@codeaurora.org> Signed-off-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-10-18cpuidle: menu: Avoid computations when result will be discardedRafael J. Wysocki
If the minimum interval taken into account in the average computation loop in get_typical_interval() is less than the expected idle duration determined so far, the resultant average cannot be greater than that value as well and the entire return result of the function is going to be discarded anyway going forward. In that case, it is a waste of time to carry out the remaining computations in get_typical_interval(), so avoid that by returning early if the minimum interval is not below the expected idle duration. No intentional changes of behavior. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18cpuidle: menu: Drop redundant comparisonRafael J. Wysocki
Since the correction factor cannot be greater than RESOLUTION * DECAY, the result of the predicted_us computation in menu_select() cannot be greater than data->next_timer_us, so it is not necessary to compare the "typical interval" value coming from get_typical_interval() with data->next_timer_us separately. It is sufficient to copmare predicted_us with the return value of get_typical_interval() directly, so do that and drop the now redundant expected_interval variable. No intentional changes of behavior. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18nvme-pci: remove duplicate checkChaitanya Kulkarni
This is a cleanup patch doesn't change any functionality. It removes the duplicate call to the blk_integrity_rq() in the nvme_map_data(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-10-18Bluetooth: btusb: Add support for 0cf3:535b QCA_ROME deviceOwen Lin
T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=02 Dev#= 3 Spd=12 MxCh= 0 D: Ver= 2.01 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0cf3 ProdID=535b Rev= 0.01 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms Signed-off-by: Owen Lin <olin@rivetnetworks.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-10-18ACPI / scan: Create platform device for INT33FE ACPI nodesHans de Goede
Bay and Cherry Trail devices with a Dollar Cove or Whiskey Cove PMIC have an ACPI node with a HID of INT33FE which is a "virtual" battery device implementing a standard ACPI battery interface which depends upon a proprietary, undocument OpRegion called BMOP. Since we do have docs for the actual fuel-gauges used on these boards we instead use native fuel-gauge drivers talking directly to the fuel-gauge ICs on boards which rely on this INT33FE device for their battery monitoring. On boards with a Dollar Cove PMIC the INT33FE device's resources (_CRS) describe a non-existing I2C client at address 0x6b with a bus-speed of 100KHz. This is a problem on some boards since there are actual devices on that same bus which need a speed of 400KHz to function properly. This commit adds the INT33FE HID to the list of devices with I2C resources which should be enumerated as a platform-device rather then letting the i2c-core instantiate an i2c-client matching the first I2C resource, so that its bus-speed will not influence the max speed of the I2C bus. This fixes e.g. the touchscreen not working on the Teclast X98 II Plus. The INT33FE device on boards with a Whiskey Cove PMIC is somewhat special. Its first I2C resource is for a secondary I2C address of the PMIC itself, which is already described in an ACPI device with an INT34D3 HID. But it has 3 more I2C resources describing 3 other chips for which we do need to instantiate I2C clients and which need device-connections added between them for things to work properly. This special case is handled by the drivers/platform/x86/intel_cht_int33fe.c code. Before this commit that code was binding to the i2c-client instantiated for the secondary I2C address of the PMIC, since we now instantiate a platform device for the INT33FE device instead, this commit also changes the intel_cht_int33fe driver from an i2c driver to a platform driver. This also brings the intel_cht_int33fe drv inline with how we instantiate multiple i2c clients from a single ACPI device in other cases, as done by the drivers/platform/x86/i2c-multi-instantiate.c code. Reported-and-tested-by: Alexander Meiler <alex.meiler@protonmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18ACPI / OSL: Use 'jiffies' as the time bassis for acpi_os_get_timer()Bart Van Assche
Since acpi_os_get_timer() may be called after the timer subsystem has been suspended, use the jiffies counter instead of ktime_get(). This patch avoids that the following warning is reported during hibernation: WARNING: CPU: 0 PID: 612 at kernel/time/timekeeping.c:751 ktime_get+0x116/0x120 RIP: 0010:ktime_get+0x116/0x120 Call Trace: acpi_os_get_timer+0xe/0x30 acpi_ds_exec_begin_control_op+0x175/0x1de acpi_ds_exec_begin_op+0x2c7/0x39a acpi_ps_create_op+0x573/0x5e4 acpi_ps_parse_loop+0x349/0x1220 acpi_ps_parse_aml+0x25b/0x6da acpi_ps_execute_method+0x327/0x41b acpi_ns_evaluate+0x4e9/0x6f5 acpi_ut_evaluate_object+0xd9/0x2f2 acpi_rs_get_method_data+0x8f/0x114 acpi_walk_resources+0x122/0x1b6 acpi_pci_link_get_current.isra.2+0x157/0x280 acpi_pci_link_set+0x32f/0x4a0 irqrouter_resume+0x58/0x80 syscore_resume+0x84/0x380 hibernation_snapshot+0x20c/0x4f0 hibernate+0x22d/0x3a6 state_store+0x99/0xa0 kobj_attr_store+0x37/0x50 sysfs_kf_write+0x87/0xa0 kernfs_fop_write+0x1a5/0x240 __vfs_write+0xd2/0x410 vfs_write+0x101/0x250 ksys_write+0xab/0x120 __x64_sys_write+0x43/0x50 do_syscall_64+0x71/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 164a08cee135 (ACPICA: Dispatcher: Introduce timeout mechanism for infinite loop detection) Reported-by: Fengguang Wu <fengguang.wu@intel.com> References: https://lists.01.org/pipermail/lkp/2018-April/008406.html Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18ACPI: probe ECDT before loading AML tables regardless of module-level code flagErik Schmauss
It was discovered that AML tables were loaded before or after the ECDT depending on acpi_gbl_execute_tables_as_methods. According to the ACPI spec, the ECDT should be loaded before the namespace is populated by loading AML tables (DSDT and SSDT). Since the ECDT should be loaded early in the boot process, this change moves the ECDT probing to acpi_early_init. Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18ACPICA: Remove acpi_gbl_group_module_level_code and only use ↵Erik Schmauss
acpi_gbl_execute_tables_as_methods instead acpi_gbl_group_module_level_code and acpi_gbl_execute_tables_as_methods were used to enable different table load behavior. The different table load behaviors are as follows: A.) acpi_gbl_group_module_level_code enabled the legacy approach where ASL if statements are executed after the namespace object has been loaded. B.) acpi_gbl_execute_tables_as_methods is currently used to enable the table load to be a method invocation. This meaning that ASL If statements are executed in-line rather than deferred until after the ACPI namespace has been populated. This is the correct behavior and option A will be removed in the future. We do not support a table load behavior where these variables are assigned the same value. In otherwords, we only support option A or B and do not need acpi_gbl_group_module_level_code to enable A. From now on, acpi_gbl_execute_tables_as_methods == 0 enables option A and acpi_gbl_execute_tables_as_methods == 1 enables option B. Note: option A is expected to be removed in the future and option B will become the only supported table load behavior. Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18ACPICA: AML Parser: fix parse loop to correctly skip erroneous extended opcodesErik Schmauss
AML opcodes come in two lengths: 1-byte opcodes and 2-byte, extended opcodes. If an error occurs due to illegal opcodes during table load, the AML parser needs to continue loading the table. In order to do this, it needs to skip parsing of the offending opcode and operands associated with that opcode. This change fixes the AML parse loop to correctly skip parsing of incorrect extended opcodes. Previously, only the short opcodes were skipped correctly. Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18ACPICA: AML interpreter: add region addresses in global list during ↵Erik Schmauss
initialization The table load process omitted adding the operation region address range to the global list. This omission is problematic because the OS queries the global list to check for address range conflicts before deciding which drivers to load. This commit may result in warning messages that look like the following: [ 7.871761] ACPI Warning: system_IO range 0x00000428-0x0000042F conflicts with op_region 0x00000400-0x0000047F (\PMIO) (20180531/utaddress-213) [ 7.871769] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver However, these messages do not signify regressions. It is a result of properly adding address ranges within the global address list. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200011 Tested-by: Jean-Marc Lenoir <archlinux@jihemel.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-10-18ACPI: TAD: Add low-level support for real time capabilityRafael J. Wysocki
Add low-level support for the (optional) real time capability of the ACPI Time and Alarm Device (TAD) to the ACPI TAD driver. This allows the real time to be acquired or set via sysfs with the help of the _GRT and _SRT methods of the TAD, respectively. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2018-10-18kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stackSteven Rostedt (VMware)
Andy had some concerns about using regs_get_kernel_stack_nth() in a new function regs_get_kernel_argument() as if there's any error in the stack code, it could cause a bad memory access. To be on the safe side, call probe_kernel_read() on the stack address to be extra careful in accessing the memory. A helper function, regs_get_kernel_stack_nth_addr(), was added to just return the stack address (or NULL if not on the stack), that will be used to find the address (and could be used by other functions) and read the address with kernel_probe_read(). Requested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20181017165951.09119177@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-18xfs: cancel COW blocks before swapextChristoph Hellwig
We need to make sure we have no outstanding COW blocks before we swap extents, as there is nothing preventing us from having preallocated COW delalloc on either inode that swapext is called on. That case can easily be reproduced by running generic/324 in always_cow mode: [ 620.760572] XFS: Assertion failed: tip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap_util.c, line: 1669 [ 620.761608] ------------[ cut here ]------------ [ 620.762171] kernel BUG at fs/xfs/xfs_message.c:102! [ 620.762732] invalid opcode: 0000 [#1] SMP PTI [ 620.763272] CPU: 0 PID: 24153 Comm: xfs_fsr Tainted: G W 4.19.0-rc1+ #4182 [ 620.764203] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014 [ 620.765202] RIP: 0010:assfail+0x20/0x28 [ 620.765646] Code: 31 ff e8 83 fc ff ff 0f 0b c3 48 89 f1 41 89 d0 48 c7 c6 48 ca 8d 82 48 89 fa 38 [ 620.767758] RSP: 0018:ffffc9000898bc10 EFLAGS: 00010202 [ 620.768359] RAX: 0000000000000000 RBX: ffff88012f14ba40 RCX: 0000000000000000 [ 620.769174] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff828560d9 [ 620.769982] RBP: ffff88012f14b300 R08: 0000000000000000 R09: 0000000000000000 [ 620.770788] R10: 000000000000000a R11: f000000000000000 R12: ffffc9000898bc98 [ 620.771638] R13: ffffc9000898bc9c R14: ffff880130b5e2b8 R15: ffff88012a1fa2a8 [ 620.772504] FS: 00007fdc36e0fbc0(0000) GS:ffff88013ba00000(0000) knlGS:0000000000000000 [ 620.773475] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 620.774168] CR2: 00007fdc3604d000 CR3: 0000000132afc000 CR4: 00000000000006f0 [ 620.774978] Call Trace: [ 620.775274] xfs_swap_extent_forks+0x2a0/0x2e0 [ 620.775792] xfs_swap_extents+0x38b/0xab0 [ 620.776256] xfs_ioc_swapext+0x121/0x140 [ 620.776709] xfs_file_ioctl+0x328/0xc90 [ 620.777154] ? rcu_read_lock_sched_held+0x50/0x60 [ 620.777694] ? xfs_iunlock+0x233/0x260 [ 620.778127] ? xfs_setattr_nonsize+0x3be/0x6a0 [ 620.778647] do_vfs_ioctl+0x9d/0x680 [ 620.779071] ? ksys_fchown+0x47/0x80 [ 620.779552] ksys_ioctl+0x35/0x70 [ 620.780040] __x64_sys_ioctl+0x11/0x20 [ 620.780530] do_syscall_64+0x4b/0x190 [ 620.780927] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 620.781467] RIP: 0033:0x7fdc364d0f07 [ 620.781900] Code: b3 66 90 48 8b 05 81 5f 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 28 [ 620.784044] RSP: 002b:00007ffe2a766038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 620.784896] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 00007fdc364d0f07 [ 620.785667] RDX: 0000560296ca2fc0 RSI: 00000000c0c0586d RDI: 0000000000000005 [ 620.786398] RBP: 0000000000000025 R08: 0000000000001200 R09: 0000000000000000 [ 620.787283] R10: 0000000000000432 R11: 0000000000000246 R12: 0000000000000005 [ 620.788051] R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000006 [ 620.788927] Modules linked in: [ 620.789340] ---[ end trace 9503b7417ffdbdb0 ]--- [ 620.790065] RIP: 0010:assfail+0x20/0x28 [ 620.790642] Code: 31 ff e8 83 fc ff ff 0f 0b c3 48 89 f1 41 89 d0 48 c7 c6 48 ca 8d 82 48 89 fa 38 [ 620.793038] RSP: 0018:ffffc9000898bc10 EFLAGS: 00010202 [ 620.793609] RAX: 0000000000000000 RBX: ffff88012f14ba40 RCX: 0000000000000000 [ 620.794317] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff828560d9 [ 620.795025] RBP: ffff88012f14b300 R08: 0000000000000000 R09: 0000000000000000 [ 620.795778] R10: 000000000000000a R11: f000000000000000 R12: ffffc9000898bc98 [ 620.796675] R13: ffffc9000898bc9c R14: ffff880130b5e2b8 R15: ffff88012a1fa2a8 [ 620.797782] FS: 00007fdc36e0fbc0(0000) GS:ffff88013ba00000(0000) knlGS:0000000000000000 [ 620.798908] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 620.799594] CR2: 00007fdc3604d000 CR3: 0000000132afc000 CR4: 00000000000006f0 [ 620.800424] Kernel panic - not syncing: Fatal exception [ 620.801191] Kernel Offset: disabled [ 620.801597] ---[ end Kernel panic - not syncing: Fatal exception ]--- Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: clear ail delwri queued bufs on unmount of shutdown fsBrian Foster
In the typical unmount case, the AIL is forced out by the unmount sequence before the xfsaild task is stopped. Since AIL items are removed on writeback completion, this means that the AIL ->ail_buf_list delwri queue has been drained. This is not always true in the shutdown case, however. It's possible for buffers to sit on a delwri queue for a period of time across submission attempts if said items are locked or have been relogged and pinned since first added to the queue. If the attempt to log such an item results in a log I/O error, the error processing can shutdown the fs, remove the item from the AIL, stale the buffer (dropping the LRU reference) and clear its delwri queue state. The latter bit means the buffer will be released from a delwri queue on the next submission attempt, but this might never occur if the filesystem has shutdown and the AIL is empty. This means that such buffers are held indefinitely by the AIL delwri queue across destruction of the AIL. Aside from being a memory leak, these buffers can also hold references to in-core perag structures. The latter problem manifests as a generic/475 failure, reproducing the following asserts at unmount time: XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 151 XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 132 To prevent this problem, clear the AIL delwri queue as a final step before xfsaild() exit. The !empty state should never occur in the normal case, so add an assert to catch unexpected problems going forward. [dgc: add comment explaining need for xfs_buf_delwri_cancel() after calling xfs_buf_delwri_submit_nowait().] Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: use offsetof() in place of offset macros for __xfsstatsCarlos Maiolino
Most offset macro mess is used in xfs_stats_format() only, and we can simply get the right offsets using offsetof(), instead of several macros to mark the offsets inside __xfsstats structure. Replace all XFSSTAT_END_* macros by a single helper macro to get the right offset into __xfsstats, and use this helper in xfs_stats_format() directly. The quota stats code, still looks a bit cleaner when using XFSSTAT_* macros, so, this patch also defines XFSSTAT_START_XQMSTAT and XFSSTAT_END_XQMSTAT locally to that code. This also should prevent offset mistakes when updates are done into __xfsstats. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstatCarlos Maiolino
The addition of FIBT, RMAP and REFCOUNT changed the offsets into __xfssats structure. This caused xqmstat_proc_show() to display garbage data via /proc/fs/xfs/xqmstat, once it relies on the offsets marked via macros. Fix it. Fixes: 00f4e4f9 xfs: add rmap btree stats infrastructure Fixes: aafc3c24 xfs: support the XFS_BTNUM_FINOBT free inode btree type Fixes: 46eeb521 xfs: introduce refcount btree definitions Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: fix use-after-free race in xfs_buf_releDave Chinner
When looking at a 4.18 based KASAN use after free report, I noticed that racing xfs_buf_rele() may race on dropping the last reference to the buffer and taking the buffer lock. This was the symptom displayed by the KASAN report, but the actual issue that was reported had already been fixed in 4.19-rc1 by commit e339dd8d8b04 ("xfs: use sync buffer I/O for sync delwri queue submission"). Despite this, I think there is still an issue with xfs_buf_rele() in this code: release = atomic_dec_and_lock(&bp->b_hold, &pag->pag_buf_lock); spin_lock(&bp->b_lock); if (!release) { ..... If two threads race on the b_lock after both dropping a reference and one getting dropping the last reference so release = true, we end up with: CPU 0 CPU 1 atomic_dec_and_lock() atomic_dec_and_lock() spin_lock(&bp->b_lock) spin_lock(&bp->b_lock) <spins> <release = true bp->b_lru_ref = 0> <remove from lists> freebuf = true spin_unlock(&bp->b_lock) xfs_buf_free(bp) <gets lock, reading and writing freed memory> <accesses freed memory> spin_unlock(&bp->b_lock) <reads/writes freed memory> IOWs, we can't safely take bp->b_lock after dropping the hold reference because the buffer may go away at any time after we drop that reference. However, this can be fixed simply by taking the bp->b_lock before we drop the reference. It is safe to nest the pag_buf_lock inside bp->b_lock as the pag_buf_lock is only used to serialise against lookup in xfs_buf_find() and no other locks are held over or under the pag_buf_lock there. Make this clear by documenting the buffer lock orders at the top of the file. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: Add attibute remove and helper functionsAllison Henderson
This patch adds xfs_attr_remove_args. These sub-routines remove the attributes specified in @args. We will use this later for setting parent pointers as a deferred attribute operation. Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: Add attibute set and helper functionsAllison Henderson
This patch adds xfs_attr_set_args and xfs_bmap_set_attrforkoff. These sub-routines set the attributes specified in @args. We will use this later for setting parent pointers as a deferred attribute operation. [dgc: remove attr fork init code from xfs_attr_set_args().] [dgc: xfs_attr_try_sf_addname() NULLs args.trans after commit.] [dgc: correct sf add error handling.] Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: Add helper function xfs_attr_try_sf_addnameAllison Henderson
This patch adds a subroutine xfs_attr_try_sf_addname used by xfs_attr_set. This subrotine will attempt to add the attribute name specified in args in shortform, as well and perform error handling previously done in xfs_attr_set. This patch helps to pre-simplify xfs_attr_set for reviewing purposes and reduce indentation. New function will be added in the next patch. [dgc: moved commit to helper function, too.] Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.hAllison Henderson
This patch moves fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h since xfs_attr.c is in libxfs. We will need these later in xfsprogs. Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: issue log message on user force shutdownDave Chinner
The kernel only issues a log message that it's been shut down when the filesystem triggers a shutdown itself. Hence there is no trace in the log when a shutdown is triggered manually from userspace. This can make it hard to see sequence of events in the log when things go wrong, so make sure we always log a message when a shutdown is run. While there, clean up the logic flow so we don't have to continually check if the shutdown trigger was user initiated before logging shutdown messages. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: fix buffer state management in xrep_findroot_blockDarrick J. Wong
We don't handle buffer state properly in online repair's findroot routine. If a buffer already has b_ops set, we don't ever want to touch that, and we don't want to call the read verifiers on a buffer that could be dirty (CRCs are only recomputed during log checkpoints). Therefore, be more careful about what we do with a buffer -- if someone else already attached ops that are not the ones for this btree type, just ignore the buffer. We only attach our btree type's buf ops if it matches the magic/uuid and structure checks. We also modify xfs_buf_read_map to allow callers to set buffer ops on a DONE buffer with NULL ops so that repair doesn't leave behind buffers which won't have buffers attached to them. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: always assign buffer verifiers when one is providedDarrick J. Wong
If a caller supplies buffer ops when trying to read a buffer and the buffer doesn't already have buf ops assigned, ensure that the ops are assigned to the buffer and the verifier is run on that buffer. Note that current XFS code is careful to assign buffer ops after a xfs_{trans_,}buf_read call in which ops were not supplied. However, we should apply ops defensively in case there is ever a coding mistake; and an upcoming repair patch will need to be able to read a buffer without assigning buf ops. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: xrep_findroot_block should reject root blocks with siblingsDarrick J. Wong
In xrep_findroot_block, if we find a candidate root block with sibling pointers or sibling blocks on the same tree level, we should not return that block as a tree root because root blocks cannot have siblings. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: add a define for statfs magic to uapiAdam Borowski
Needed by userspace programs that call fstatfs(). It'd be natural to publish XFS_SB_MAGIC in uapi, but while these two have identical values, they have different semantic meaning: one is an enum cookie meant for statfs, the other a signature of the on-disk format. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: print dangling delalloc extentsChristoph Hellwig
Instead of just asserting that we have no delalloc space dangling in an inode that gets freed print the actual offenders for debug mode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: fix fork selection in xfs_find_trim_cow_extentChristoph Hellwig
We should want to write directly into the data fork for blocks that don't have an extent in the COW fork covering them yet. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: remove the unused trimmed argument from xfs_reflink_trim_around_sharedChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: remove the unused shared argument to xfs_reflink_reserve_cowChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: handle zeroing in xfs_file_iomap_begin_delayChristoph Hellwig
We only need to allocate blocks for zeroing for reflink inodes, and for we currently have a special case for reflink files in the otherwise direct I/O path that I'd like to get rid of. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: remove suport for filesystems without unwritten extent flagChristoph Hellwig
The option to enable unwritten extents was made default in 2003, removed from mkfs in 2007, and cannot be disabled in v5. We also rely on it for a lot of common functionality, so filesystems without it will run a completely untested and buggy code path. Enabling the support also is a simple bit flip using xfs_db, so legacy file systems can still be brought forward. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18xfs: remove XFS_IO_INVALIDChristoph Hellwig
The invalid state isn't any different from a hole, so merge the two states. Use the more descriptive hole name, but keep it as the first value of the enum to catch uninitialized fields. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-18Merge tag 'perf-urgent-for-mingo-4.19-20181017' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Stop falling back to kallsyms for vDSO symbols lookup, this wasn't being really used and is not valid in arches such as Sparc, where user and kernel space don't share the address space, relying only on cpumode to figure out what DSOs to lookup (Arnaldo Carvalho de Melo) - Align CPU map synthesized events properly, fixing SIGBUS in CPUs like Sparc (David Miller) - Fix use of alternatives to find JDIR (Jarod Wilson) - Store IDs for events with their own CPUs when synthesizing user level event details (scale, unit, etc) events, fixing a crash when recording a PMU event with a cpumask defined (Jiri Olsa) - Fix wrong filter_band* values for uncore Intel vendor events (Jiri Olsa) - Fix detection of tracefs path in systems without tracefs, where that path should be the debugfs mountpoint plus "/tracing/" (Jiri Olsa) - Pass build flags to traceevent build, allowing using alternative flags in distro packages, RPM, for instance (Jiri Olsa) - Fix 'perf report' crash on invalid inline debug information (Milian Wolff) - Synch KVM UAPI copies (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-17net: ipmr: fix unresolved entry dumpsNikolay Aleksandrov
If the skb space ends in an unresolved entry while dumping we'll miss some unresolved entries. The reason is due to zeroing the entry counter between dumping resolved and unresolved mfc entries. We should just keep counting until the whole table is dumped and zero when we move to the next as we have a separate table counter. Reported-by: Colin Ian King <colin.king@canonical.com> Fixes: 8fb472c09b9d ("ipmr: improve hash scalability") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17net: mscc: ocelot: Fix comment in ocelot_vlant_wait_for_completion()Gregory CLEMENT
The ocelot_vlant_wait_for_completion() function is very similar to the ocelot_mact_wait_for_completion(). It seemed to have be copied but the comment was not updated, so let's fix it. Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17sctp: fix the data size calculation in sctp_data_sizeXin Long
sctp data size should be calculated by subtracting data chunk header's length from chunk_hdr->length, not just data header. Fixes: 668c9beb9020 ("sctp: implement assign_number for sctp_stream_interleave") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17net: skbuff.h: Mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17net: ena: enable Low Latency QueuesArthur Kiyanovski
Use the new API to enable usage of LLQ. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>