2019-11-20firmware: xilinx: Add SDIO Tap Delay nodesManish Narani
Add tap delay nodes for setting SDIO Tap Delays on ZynqMP platform. Signed-off-by: Manish Narani <manish.narani@xilinx.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-11-20cpuidle: Pass exit latency limit to cpuidle_use_deepest_state()Daniel Lezcano
Modify cpuidle_use_deepest_state() to take an additional exit latency limit argument to be passed to find_deepest_idle_state() and make cpuidle_idle_call() pass dev->forced_idle_latency_limit_ns to it for forced idle. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: Rebase and rearrange code, subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-20cpuidle: Allow idle injection to apply exit latency limitDaniel Lezcano
In some cases it may be useful to specify an exit latency limit for the idle state to be used during CPU idle time injection. Instead of duplicating the information in struct cpuidle_device or propagating the latency limit in the call stack, replace the use_deepest_state field with forced_latency_limit_ns to represent that limit, so that the deepest idle state with exit latency within that limit is forced (i.e. no governors) when it is set. A zero exit latency limit for forced idle means to use governors in the usual way (analogous to use_deepest_state equal to "false" before this change). Additionally, add play_idle_precise() taking two arguments, the duration of forced idle and the idle state exit latency limit, both in nanoseconds, and redefine play_idle() as a wrapper around that new function. This change is preparatory, no functional impact is expected. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: Subject, changelog, cpuidle_use_deepest_state() kerneldoc, whitespace ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
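For illustration, a minimal sketch of the interface described above; the "no limit" constant passed by the wrapper is an assumption, not taken from the patch:

	/* Sketch: play_idle() kept as a wrapper around the new function.
	 * Both arguments of play_idle_precise() are in nanoseconds. */
	extern void play_idle_precise(u64 duration_ns, u64 latency_limit_ns);

	static inline void play_idle(unsigned long duration_us)
	{
		/* Effectively unlimited exit latency: keep forcing the
		 * deepest state, preserving the old play_idle() behaviour. */
		play_idle_precise(duration_us * NSEC_PER_USEC, U64_MAX);
	}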
2019-11-20futex: Add mutex around futex exitThomas Gleixner
The mutex will be used in subsequent changes to replace the busy looping of a waiter when the futex owner is currently executing the exit cleanup to prevent a potential live lock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.845798895@linutronix.de
2019-11-20futex: Mark the begin of futex exit explicitlyThomas Gleixner
Instead of relying on PF_EXITING use an explicit state for the futex exit and set it in the futex exit function. This moves the smp barrier and the lock/unlock serialization into the futex code. As with the DEAD state this is restricted to the exit path, as exec continues to use the same task struct. This allows that logic to be simplified in a later step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.539409004@linutronix.de
2019-11-20futex: Split futex_mm_release() for exit/execThomas Gleixner
To allow separate handling of the futex exit state in the futex exit code for exit and exec, split futex_mm_release() into two functions and invoke them from the corresponding exit/exec_mm_release() callsites. Preparatory only, no functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.332094221@linutronix.de
2019-11-20exit/exec: Separate mm_release()Thomas Gleixner
mm_release() contains the futex exit handling. mm_release() is called from do_exit()->exit_mm() and from exec()->exec_mm(). In the exit_mm() case PF_EXITING and the futex state are updated. In the exec_mm() case these states are not touched. As the futex exit code needs further protections against exit races, this needs to be split into two functions. Preparatory only, no functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.240518241@linutronix.de
2019-11-20futex: Replace PF_EXITPIDONE with a stateThomas Gleixner
The futex exit handling relies on PF_ flags. That's suboptimal as it requires a smp_mb() and an ugly lock/unlock of the exiting tasks pi_lock in the middle of do_exit() to enforce the observability of PF_EXITING in the futex code. Add a futex_state member to task_struct and convert the PF_EXITPIDONE logic over to the new state. The PF_EXITING dependency will be cleaned up in a later step. This prepares for handling various futex exit issues later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.149449274@linutronix.de
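In sketch form, the kind of state this introduces; only task_struct gaining a futex_state member is stated by the changelog, the names and values below are illustrative:

	/* Sketch: an explicit exit state replacing the PF_EXITPIDONE bit.
	 * Identifier names are assumptions, not copied from the patch. */
	enum {
		FUTEX_STATE_OK,		/* normal operation */
		FUTEX_STATE_DEAD,	/* exit cleanup finished */
	};

	struct task_struct {
		/* existing members elided */
		unsigned int futex_state;
	};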
2019-11-20futex: Move futex exit handling into futex codeThomas Gleixner
The futex exit handling is #ifdeffed into mm_release() which is not pretty to begin with. But upcoming changes to address futex exit races need to add more functionality to this exit code. Split it out into a function, move it into futex code and make the various futex exit functions static. Preparatory only and no functional change. Folded build fix from Borislav. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191106224556.049705556@linutronix.de
2019-11-19scsi: target: iscsi: Wait for all commands to finish before freeing a sessionBart Van Assche
The iSCSI target driver is the only target driver that does not wait for ongoing commands to finish before freeing a session. Make the iSCSI target driver wait for ongoing commands to finish before freeing a session. This patch fixes the following KASAN complaint:

BUG: KASAN: use-after-free in __lock_acquire+0xb1a/0x2710
Read of size 8 at addr ffff8881154eca70 by task kworker/0:2/247

CPU: 0 PID: 247 Comm: kworker/0:2 Not tainted 5.4.0-rc1-dbg+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Workqueue: target_completion target_complete_ok_work [target_core_mod]
Call Trace:
 dump_stack+0x8a/0xd6
 print_address_description.constprop.0+0x40/0x60
 __kasan_report.cold+0x1b/0x33
 kasan_report+0x16/0x20
 __asan_load8+0x58/0x90
 __lock_acquire+0xb1a/0x2710
 lock_acquire+0xd3/0x200
 _raw_spin_lock_irqsave+0x43/0x60
 target_release_cmd_kref+0x162/0x7f0 [target_core_mod]
 target_put_sess_cmd+0x2e/0x40 [target_core_mod]
 lio_check_stop_free+0x12/0x20 [iscsi_target_mod]
 transport_cmd_check_stop_to_fabric+0xd8/0xe0 [target_core_mod]
 target_complete_ok_work+0x1b0/0x790 [target_core_mod]
 process_one_work+0x549/0xa40
 worker_thread+0x7a/0x5d0
 kthread+0x1bc/0x210
 ret_from_fork+0x24/0x30

Allocated by task 889:
 save_stack+0x23/0x90
 __kasan_kmalloc.constprop.0+0xcf/0xe0
 kasan_slab_alloc+0x12/0x20
 kmem_cache_alloc+0xf6/0x360
 transport_alloc_session+0x29/0x80 [target_core_mod]
 iscsi_target_login_thread+0xcd6/0x18f0 [iscsi_target_mod]
 kthread+0x1bc/0x210
 ret_from_fork+0x24/0x30

Freed by task 1025:
 save_stack+0x23/0x90
 __kasan_slab_free+0x13a/0x190
 kasan_slab_free+0x12/0x20
 kmem_cache_free+0x146/0x400
 transport_free_session+0x179/0x2f0 [target_core_mod]
 transport_deregister_session+0x130/0x180 [target_core_mod]
 iscsit_close_session+0x12c/0x350 [iscsi_target_mod]
 iscsit_logout_post_handler+0x136/0x380 [iscsi_target_mod]
 iscsit_response_queue+0x8de/0xbe0 [iscsi_target_mod]
 iscsi_target_tx_thread+0x27f/0x370 [iscsi_target_mod]
 kthread+0x1bc/0x210
 ret_from_fork+0x24/0x30

The buggy address belongs to the object at ffff8881154ec9c0 which belongs to the cache se_sess_cache of size 352
The buggy address is located 176 bytes inside of 352-byte region [ffff8881154ec9c0, ffff8881154ecb20)
The buggy address belongs to the page:
page:ffffea0004553b00 refcount:1 mapcount:0 mapping:ffff888101755400 index:0x0 compound_mapcount: 0
flags: 0x2fff000000010200(slab|head)
raw: 2fff000000010200 dead000000000100 dead000000000122 ffff888101755400
raw: 0000000000000000 0000000080130013 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8881154ec900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8881154ec980: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
>ffff8881154eca00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                             ^
 ffff8881154eca80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8881154ecb00: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc

Cc: Mike Christie <mchristi@redhat.com>
Link: https://lore.kernel.org/r/20191113220508.198257-3-bvanassche@acm.org
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-19net/tls: enable sk_msg redirect to tls socket egressWillem de Bruijn
Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket with TLS_TX takes the following path:

  tcp_bpf_sendmsg_redir
    tcp_bpf_push_locked
      tcp_bpf_push
        kernel_sendpage_locked
          sock->ops->sendpage_locked

Also update the flags test in tls_sw_sendpage_locked to allow flag MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this.

Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t
Link: https://github.com/wdebruij/kerneltools/commits/icept.2
Fixes: 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP")
Fixes: f3de19af0f5b ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19ASoC: soc-component: tidyup snd_soc_pcm_component_new/free() parameterKuninori Morimoto
This patch uses rtd instead of pcm as the snd_soc_pcm_component_new/free() parameter. This is preparation for a dai_link removal bug fix in topology. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87pnhqx89j.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-11-19libnvdimm: Move nvdimm_bus_attribute_group to device_typeDan Williams
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nvdimm_bus_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309903815.1582359.6418211876315050283.stgit@dwillia2-desk3.amr.corp.intel.com
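The mechanism this series relies on, in a minimal hedged sketch; the identifier names below are illustrative, not copied from the patches:

	/* Sketch: let the driver core register default sysfs attributes
	 * via the device_type instead of each leaf implementation. */
	static const struct attribute_group *nvdimm_bus_attribute_groups[] = {
		&nvdimm_bus_attribute_group,
		NULL,
	};

	static const struct device_type nvdimm_bus_dev_type = {
		.name   = "nvdimm_bus",
		.groups = nvdimm_bus_attribute_groups,
	};

	/* Devices registered with dev->type = &nvdimm_bus_dev_type then
	 * get these attributes created automatically by the core. */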
2019-11-19libnvdimm: Move nvdimm_attribute_group to device_typeDan Williams
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nvdimm_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309903201.1582359.10966209746585062329.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19libnvdimm: Move nd_mapping_attribute_group to device_typeDan Williams
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_mapping_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309902686.1582359.6749533709859492704.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19libnvdimm: Move nd_region_attribute_group to device_typeDan Williams
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_region_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157309902169.1582359.16828508538444551337.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19libnvdimm: Move nd_numa_attribute_group to device_typeDan Williams
A 'struct device_type' instance can carry default attributes for the device. Use this facility to remove the export of nd_numa_attribute_group and put the responsibility on the core rather than leaf implementations to define this attribute. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/157401269537.43284.14411189404186877352.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-19platform/chrome: wilco_ec: Add keyboard backlight LED supportDaniel Campello
The EC is in charge of controlling the keyboard backlight on the Wilco platform. We expose a standard LED class device named platform::kbd_backlight. Since the EC will never change the backlight level of its own accord, we don't need to implement a brightness_get() method. Signed-off-by: Nick Crews <ncrews@chromium.org> Signed-off-by: Daniel Campello <campello@chromium.org> Reviewed-by: Daniel Campello <campello@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-19platform/chrome: wilco_ec: Add charging config driverNick Crews
Add a device to control the charging algorithm used on Wilco devices, which will be picked up by the drivers/power/supply/wilco-charger.c driver. See Documentation/ABI/testing/sysfs-class-power-wilco for the userspace interface and other info. Signed-off-by: Nick Crews <ncrews@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2019-11-19cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirksRafael J. Wysocki
Commit 99e98d3fb100 ("cpuidle: Consolidate disabled state checks") overlooked the fact that the imx6q and tegra20 cpuidle drivers use the "disabled" field in struct cpuidle_state for quirks which trigger after the initialization of cpuidle, so reading the initial value of that field is not sufficient for those drivers. In order to allow them to implement the quirks without using the "disabled" field in struct cpuidle_state, introduce a new helper function and modify them to use it. Fixes: 99e98d3fb100 ("cpuidle: Consolidate disabled state checks") Reported-by: Len Brown <lenb@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
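A sketch of what such a helper could look like; the signature and flag name are assumptions based on the changelog, not the actual patch:

	/* Hypothetical sketch: let a driver disable/enable a state after
	 * cpuidle initialization, without touching state->disabled. */
	void cpuidle_driver_state_disabled(struct cpuidle_driver *drv, int idx,
					   bool disable)
	{
		if (disable)
			drv->states[idx].flags |= CPUIDLE_FLAG_UNUSABLE;
		else
			drv->states[idx].flags &= ~CPUIDLE_FLAG_UNUSABLE;
	}

The imx6q and tegra20 drivers would then call such a helper from their quirk paths after initialization, rather than relying on the initial value of the "disabled" field.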
2019-11-18page_pool: extend tracepoint to also include the page PFNJesper Dangaard Brouer
The MM tracepoint for page free (called kmem:mm_page_free) doesn't provide the page pointer directly, instead it provides the PFN (Page Frame Number). This is annoying when writing a page_pool leak detector in BPF. This patch changes the page_pool tracepoints to also provide the PFN. The page pointer is still provided to allow other kinds of troubleshooting from BPF. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18page_pool: add destroy attempts counter and rename tracepointJesper Dangaard Brouer
When Jonathan changed the page_pool to become responsible for its own shutdown via a deferred work queue, the disconnect_cnt counter was removed from the xdp memory model tracepoint. This patch changes the page_pool_inflight tracepoint name to page_pool_release, because it reflects the new responsibility better. And it reintroduces a counter that reflects the number of times page_pool_release has been tried. The counter is also used by the code to only empty the alloc cache once. With a stuck work queue running every second and the counter being 64-bit, it will overrun in approx 584 billion years. For comparison, Earth's lifetime expectancy is 7.5 billion years, before the Sun will engulf, and destroy, the Earth. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18net: phy: add core phylib sfp supportRussell King
Add core phylib helpers for supporting SFP sockets on PHYs. This provides a mechanism to inform the SFP layer about PHY up/down events, and also to unregister the SFP bus when the PHY is going away. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Wildcard support for the net,iface set from Kristian Evensen.
2) Offload support for matching on the input interface.
3) Simplify matching on vlan header fields.
4) Add nft_payload_rebuild_vlan_hdr() function to rebuild the vlan header from the vlan sk_buff metadata.
5) Pass extack to nft_flow_cls_offload_setup().
6) Add C-VLAN matching support.
7) Use time64_t in xt_time to fix y2038 overflow, from Arnd Bergmann.
8) Use time_t in nft_meta to fix y2038 overflow, also from Arnd.
9) Add flow_action_entry_next() helper function to flowtable offload infrastructure.
10) Add IPv6 support to the flowtable offload infrastructure.
11) Support for input interface matching from postrouting, from Phil Sutter.
12) Missing check for ndo callback in flowtable offload, from wenxu.
13) Remove conntrack parameter from flow_offload_fill_dir(), from wenxu.
14) Do not pass flow_rule object for rule removal, cookie is sufficient to achieve this.
15) Release flow_rule object in case of error from the offload commit path.
16) Undo offload ruleset updates if transaction fails.
17) Check for error when binding flowtable callbacks, from wenxu.
18) Always unbind flowtable callbacks when unregistering hooks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18docs: Add request_irq() documentationJonathan Corbet
While checking the results of the :c:func: removal, I noticed that there was no documentation for request_irq(), and request_threaded_irq() was not mentioned at all. Add a kerneldoc comment for request_irq() and add request_threaded_irq() to the list of functions. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
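For readers landing on the new documentation, a canonical request_irq() usage sketch; the device name, handler body, and variables are illustrative:

	#include <linux/interrupt.h>

	static irqreturn_t mydev_irq_handler(int irq, void *dev_id)
	{
		/* dev_id is the cookie passed to request_irq() below;
		 * acknowledge the hardware here, then report handled. */
		return IRQ_HANDLED;
	}

	/* Typically in probe(); the cookie must be unique per handler
	 * when IRQF_SHARED is used, so the device struct is customary. */
	ret = request_irq(irq, mydev_irq_handler, IRQF_SHARED,
			  "mydev", mydev);
	if (ret)
		return ret;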
2019-11-18spi: mediatek: add SPI_CS_HIGH supportLuhua Xu
Change to use SPI_CS_HIGH to support SPI CS polarity setting for chips that support enhance_timing. Signed-off-by: Luhua Xu <luhua.xu@mediatek.com> Link: https://lore.kernel.org/r/1574053037-26721-2-git-send-email-luhua.xu@mediatek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-11-18btrfs: rename btrfs_block_group_cacheDavid Sterba
The type name is misleading, a single entry is named 'cache' while this normally means a collection of objects. Rename that everywhere. Also the identifier was quite long, making function prototypes harder to format. Suggested-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add incompat for raid1 with 3, 4 copiesDavid Sterba
The new raid1c3 and raid1c4 profiles are backward incompatible and the incompat feature name is 'raid1c34'; the status can be found in the global supported features in /sys/fs/btrfs/features or in the per-filesystem directory. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add support for 4-copy replication (raid1c4)David Sterba
Add new block group profile to store 4 copies in a similar way that current RAID1 does. The profile attributes and constraints are defined in the raid table and used by the same code that already handles the 2- and 3-copy RAID1. The minimum number of devices is 4, the maximum number of devices/chunks that can be lost/damaged is 3. There is no comparable traditional RAID level, the profile is added for future needs to accompany triple-parity and beyond. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add support for 3-copy replication (raid1c3)David Sterba
Add new block group profile to store 3 copies in a similar way that current RAID1 does. The profile attributes and constraints are defined in the raid table and used by the same code that already handles the 2-copy RAID1. The minimum number of devices is 3, the maximum number of devices/chunks that can be lost/damaged is 2. Like RAID6 but with 33% space utilization. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add dedicated members for start and length of a block groupDavid Sterba
The on-disk format of block group item makes use of the key that stores the offset and length. This is further used in the code, although this makes thing harder to understand. The key is also packed so the offset/length is not properly aligned as u64. Add start (key.objectid) and length (key.offset) members to block group and remove the embedded key. When the item is searched or written, a local variable for key is used. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
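The effect of the change, sketched diff-style from the changelog; surrounding members are elided and comments are illustrative:

	struct btrfs_block_group_cache {
	-	struct btrfs_key key;	/* packed: objectid = start,
					 * offset = length */
	+	u64 start;		/* was key.objectid, now aligned u64 */
	+	u64 length;		/* was key.offset */
	};

A local btrfs_key is then built only at the points where the item is actually searched or written.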
2019-11-18btrfs: move block_group_item::used to block groupDavid Sterba
For unknown reasons, the member 'used' in the block group struct is stored in the b-tree item and accessed everywhere using the special accessor helper. Let's unify it and make it a regular member and only update the item before writing it to the tree. The item is still being used for flags and chunk_objectid, there's some duplication until the item is removed in following patches. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add blake2b to checksumming algorithmsDavid Sterba
Add blake2b (with 256 bit digest) to the list of possible checksumming algorithms used by BTRFS. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add sha256 to checksumming algorithmJohannes Thumshirn
Add sha256 to the list of possible checksumming algorithms used by BTRFS. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: add xxhash64 to checksumming algorithmsJohannes Thumshirn
Add xxhash64 to the list of possible checksumming algorithms used by BTRFS. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18ftrace: Rename ftrace_graph_stub to ftrace_stub_graphSteven Rostedt (VMware)
The ftrace_graph_stub was created and points to ftrace_stub as a way to assign the function graph tracer function pointer to a stub function with a different prototype than what ftrace_stub has and not trigger the C verifier. The ftrace_graph_stub was created via the linker script vmlinux.lds.h. Unfortunately, powerpc already uses the name ftrace_graph_stub for its internal implementation of the function graph tracer, and even though powerpc would still build, the change via the linker script broke the function tracer on powerpc. By using the name ftrace_stub_graph, which does not exist anywhere else in the kernel, this should not be a problem. Link: https://lore.kernel.org/r/1573849732.5937.136.camel@lca.pw Fixes: b83b43ffc6e4 ("fgraph: Fix function type mismatches of ftrace_graph_return using ftrace_stub") Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-18ftrace: Add a helper function to modify_ftrace_direct() to allow arch optimizationSteven Rostedt (VMware)
If a direct ftrace callback is at a location that does not have any other ftrace helpers attached to it, it is possible to simply just change the text to call the new caller (if the architecture supports it). But this requires special architecture code. Currently, modify_ftrace_direct() uses a trick to add a stub ftrace callback to the location, forcing it to call the ftrace iterator. Then it can change the direct helper to call the new function in C, and then remove the stub. Removing the stub will have the location now call the new location that the direct helper is using. The new helper function does the register-the-stub trick, but is a weak function, allowing an architecture to override it to do something a bit more direct. Link: https://lore.kernel.org/r/20191115215125.mbqv7taqnx376yed@ast-mbp.dhcp.thefacebook.com Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-18blk-cgroup: cgroup_rstat_updated() shouldn't be called on cgroup1Tejun Heo
Currently, cgroup rstat is supported only on cgroup2 hierarchy and rstat functions shouldn't be called on cgroup1 cgroups. While converting blk-cgroup core statistics to rstat, f73316482977 ("blk-cgroup: reimplement basic IO stats using cgroup rstat") accidentally ended up calling cgroup_rstat_updated() on cgroup1 cgroups causing crashes. Longer term, we probably should add cgroup1 support to rstat but for now let's mask the call directly. Fixes: f73316482977 ("blk-cgroup: reimplement basic IO stats using cgroup rstat") Tested-by: Faiz Abbas <faiz_abbas@ti.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-18Merge tag 'v5.4-rc8' into sched/core, to pick up fixes and dependenciesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-18ALSA: compress: add flac decoder paramsVinod Koul
The current design of sending codec parameters assumes that decoders will have parsers so they can parse the encoded stream for parameters and configure the decoder. But this assumption may not be universally true, and we know of some DSPs which do not contain parsers, so additional parameters must be passed. So add these parameters, starting with the FLAC decoder. The size of snd_codec_options is still 120 bytes after this change (due to this being a union). Co-developed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20191115102705.649976-2-vkoul@kernel.org Acked-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org>
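A rough sketch of the kind of decoder options this adds; the field list below is an assumption for illustration, the authoritative layout is in include/uapi/sound/compress_params.h:

	/* Sketch: FLAC stream properties a parser-less DSP needs up front.
	 * Field names are illustrative, not copied from the patch. */
	struct snd_dec_flac {
		__u16 sample_size;
		__u16 min_blk_size;
		__u16 max_blk_size;
		__u16 min_frame_size;
		__u16 max_frame_size;
	};

Because snd_codec_options is a union, adding a member smaller than the largest existing one leaves the union's size, and hence the ABI, unchanged.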
2019-11-18btrfs: tracepoints: constify all pointersDavid Sterba
We don't modify the data passed to tracepoints, some of the declarations are already const, add it to the rest. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: tracepoints: drop typecasts from printkDavid Sterba
Remove typecasts from trace printk, adjust types and move typecast to the assignment if necessary. When assigning, the types are more obvious compared to matching the variables to the format strings. Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18btrfs: use enum for extent type definesChengguang Xu
Use enum to replace macro definitions of extent types. Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
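The shape of such a conversion, sketched; the three type values match the long-standing btrfs on-disk format, while the trailing count member is an assumption:

	/* Before: preprocessor macros, invisible to debuggers. */
	#define BTRFS_FILE_EXTENT_INLINE	0
	#define BTRFS_FILE_EXTENT_REG		1
	#define BTRFS_FILE_EXTENT_PREALLOC	2

	/* After: an enum keeps the values as symbols and lets the
	 * compiler warn about non-exhaustive switch() statements. */
	enum {
		BTRFS_FILE_EXTENT_INLINE   = 0,
		BTRFS_FILE_EXTENT_REG      = 1,
		BTRFS_FILE_EXTENT_PREALLOC = 2,
		BTRFS_NR_FILE_EXTENT_TYPES = 3,	/* assumed count member */
	};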
2019-11-18btrfs: get rid of pointless wtag variable in async-thread.cOmar Sandoval
Commit ac0c7cf8be00 ("btrfs: fix crash when tracepoint arguments are freed by wq callbacks") added a void pointer, wtag, which is passed into trace_btrfs_all_work_done() instead of the freed work item. This is silly for a few reasons:

1. The freed work item still has the same address.
2. work is still in scope after it's freed, so assigning wtag doesn't stop anyone from using it.
3. The tracepoint has always taken a void * argument, so assigning wtag doesn't actually make things any more type-safe. (Note that the original bug in commit bc074524e123 ("btrfs: prefix fsid to all trace events") was that the void * was implicitly casted when it was passed to btrfs_work_owner() in the trace point itself).

Instead, let's add some clearer warnings as comments.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18bpf: Add mmap() support for BPF_MAP_TYPE_ARRAYAndrii Nakryiko
Add the ability to memory-map contents of a BPF array map. This is extremely useful for working with BPF global data from userspace programs. It allows avoiding typical bpf_map_{lookup,update}_elem operations, improving both performance and usability.

There had to be special considerations for map freezing, to avoid having a writable memory view into a frozen map. To solve this issue, map freezing and mmap-ing now happen under a mutex:
- if a map is already frozen, no writable mapping is allowed;
- if a map has writable memory mappings active (accounted in map->writecnt), map freezing will keep failing with -EBUSY;
- once the number of writable memory mappings drops to zero, map freezing can be performed again.

Only non-per-CPU plain arrays are supported right now. Maps with spinlocks can't be memory mapped either.

For a BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc() to be mmap()'able. We also need to make sure that array data memory is page-sized and page-aligned, so we over-allocate memory in such a way that struct bpf_array is at the end of a single page of memory with array->value being aligned with the start of the second page. On deallocation we need to accommodate this memory arrangement to free vmalloc()'ed memory correctly.

One important consideration is how the memory-mapping subsystem functions. It provides a few optional callbacks, among them open() and close(). close() is called for each memory region that is unmapped, so that users can decrease their reference counters and free up resources, if necessary. open() is *almost* symmetrical: it's called for each memory region that is being mapped, **except** the very first one. So bpf_map_mmap does the initial refcnt bump, while open() will do any extra ones after that. Thus the number of close() calls is equal to the number of open() calls plus one more.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
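For context, a minimal userspace sketch of creating and mapping such an array; this assumes kernel headers that define BPF_F_MMAPABLE, and omits error handling for brevity:

	#include <linux/bpf.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		union bpf_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.map_type    = BPF_MAP_TYPE_ARRAY;
		attr.key_size    = 4;
		attr.value_size  = 4096;	/* page-sized: maps cleanly */
		attr.max_entries = 1;
		attr.map_flags   = BPF_F_MMAPABLE;

		int fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));

		/* Writes through this pointer update the map directly,
		 * with no bpf_map_update_elem() syscall per access. */
		void *data = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
				  MAP_SHARED, fd, 0);
		return data == MAP_FAILED;
	}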
2019-11-18bpf: Convert bpf_prog refcnt to atomic64_tAndrii Nakryiko
Similarly to bpf_map's refcnt/usercnt, convert bpf_prog's refcnt to atomic64 and remove the artificial 32k limit. This makes bpf_prog's refcounting non-failing, simplifying the logic of users of bpf_prog_add/bpf_prog_inc. Validated compilation by running an allyesconfig kernel build. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191117172806.2195367-3-andriin@fb.com
2019-11-18bpf: Switch bpf_map ref counter to atomic64_t so bpf_map_inc() never failsAndrii Nakryiko
92117d8443bc ("bpf: fix refcnt overflow") turned refcounting of bpf_map into a potentially failing operation, when the refcount reaches the BPF_MAX_REFCNT limit (32k). Due to using a 32-bit counter, it's possible in practice to overflow the refcounter and make it wrap around to 0, causing an erroneous map free while there are still references to it, causing use-after-free problems.

But having failing refcounting operations is problematic in some cases. One example is the mmap() interface. After establishing an initial memory-mapping, the user is allowed to arbitrarily map/remap/unmap parts of the mapped memory, arbitrarily splitting it into multiple non-contiguous regions. All this happens without any control from the users of the mmap subsystem. Rather, the mmap subsystem sends notifications to the original creator of the memory mapping through open/close callbacks, which are optionally specified during initial memory mapping creation. These callbacks are used to maintain an accurate refcount for bpf_map (see the next patch in this series). The problem is that the open() callback is not supposed to fail, because the memory-mapped resource is set up and properly referenced. This poses a problem for using memory-mapping with BPF maps.

One solution to this is to maintain a separate refcount for just memory-mappings and do a single bpf_map_inc/bpf_map_put when it goes from/to zero, respectively. There are similar use cases in current work on tcp-bpf, necessitating an extra counter as well. This seems like a rather unfortunate and ugly solution that doesn't scale well to various new use cases.

Another approach is to use the non-failing refcount_t type, which uses a 32-bit counter internally, but, once reaching the overflow state at UINT_MAX, stays there. This ultimately causes a memory leak, but prevents use after free. But given that refcounting is not the most performance-critical operation with BPF maps (it's not used from running BPF program code), we can also just switch to a 64-bit counter that can't overflow in practice, potentially disadvantaging 32-bit platforms a tiny bit. This simplifies semantics and allows the above described scenarios to not worry about failing refcount increment operations.
In terms of struct bpf_map size, we are still good and use the same amount of space:

BEFORE (3 cache lines, 8 bytes of padding at the end):

struct bpf_map {
	const struct bpf_map_ops * ops __attribute__((__aligned__(64))); /*     0     8 */
	struct bpf_map *           inner_map_meta;    /*     8     8 */
	void *                     security;          /*    16     8 */
	enum bpf_map_type          map_type;          /*    24     4 */
	u32                        key_size;          /*    28     4 */
	u32                        value_size;        /*    32     4 */
	u32                        max_entries;       /*    36     4 */
	u32                        map_flags;         /*    40     4 */
	int                        spin_lock_off;     /*    44     4 */
	u32                        id;                /*    48     4 */
	int                        numa_node;         /*    52     4 */
	u32                        btf_key_type_id;   /*    56     4 */
	u32                        btf_value_type_id; /*    60     4 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct btf *               btf;               /*    64     8 */
	struct bpf_map_memory      memory;            /*    72    16 */
	bool                       unpriv_array;      /*    88     1 */
	bool                       frozen;            /*    89     1 */

	/* XXX 38 bytes hole, try to pack */

	/* --- cacheline 2 boundary (128 bytes) --- */
	atomic_t                   refcnt __attribute__((__aligned__(64))); /* 128 4 */
	atomic_t                   usercnt;           /*   132     4 */
	struct work_struct         work;              /*   136    32 */
	char                       name[16];          /*   168    16 */

	/* size: 192, cachelines: 3, members: 21 */
	/* sum members: 146, holes: 1, sum holes: 38 */
	/* padding: 8 */
	/* forced alignments: 2, forced holes: 1, sum forced holes: 38 */
} __attribute__((__aligned__(64)));

AFTER (same 3 cache lines, no extra padding now):

struct bpf_map {
	const struct bpf_map_ops * ops __attribute__((__aligned__(64))); /*     0     8 */
	struct bpf_map *           inner_map_meta;    /*     8     8 */
	void *                     security;          /*    16     8 */
	enum bpf_map_type          map_type;          /*    24     4 */
	u32                        key_size;          /*    28     4 */
	u32                        value_size;        /*    32     4 */
	u32                        max_entries;       /*    36     4 */
	u32                        map_flags;         /*    40     4 */
	int                        spin_lock_off;     /*    44     4 */
	u32                        id;                /*    48     4 */
	int                        numa_node;         /*    52     4 */
	u32                        btf_key_type_id;   /*    56     4 */
	u32                        btf_value_type_id; /*    60     4 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct btf *               btf;               /*    64     8 */
	struct bpf_map_memory      memory;            /*    72    16 */
	bool                       unpriv_array;      /*    88     1 */
	bool                       frozen;            /*    89     1 */

	/* XXX 38 bytes hole, try to pack */

	/* --- cacheline 2 boundary (128 bytes) --- */
	atomic64_t                 refcnt __attribute__((__aligned__(64))); /* 128 8 */
	atomic64_t                 usercnt;           /*   136     8 */
	struct work_struct         work;              /*   144    32 */
	char                       name[16];          /*   176    16 */

	/* size: 192, cachelines: 3, members: 21 */
	/* sum members: 154, holes: 1, sum holes: 38 */
	/* forced alignments: 2, forced holes: 1, sum forced holes: 38 */
} __attribute__((__aligned__(64)));

This patch, while modifying all users of bpf_map_inc, also cleans up its interface to match bpf_map_put with separate operations for bpf_map_inc and bpf_map_inc_with_uref (to match bpf_map_put and bpf_map_put_with_uref, respectively). Also, given there are no users of bpf_map_inc_not_zero specifying uref=true, remove the uref flag and default to uref=false internally.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-2-andriin@fb.com
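With the 64-bit counter, the increment paths reduce to unconditional operations; a sketch of the assumed shape of the resulting helpers:

	/* Sketch: no overflow check remains, so nothing can fail. */
	void bpf_map_inc(struct bpf_map *map)
	{
		atomic64_inc(&map->refcnt);
	}

	void bpf_map_inc_with_uref(struct bpf_map *map)
	{
		atomic64_inc(&map->refcnt);
		atomic64_inc(&map->usercnt);
	}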
2019-11-18Merge tag 'nfs-rdma-for-5.5-1' of git://git.linux-nfs.org/projects/anna/linux-nfsTrond Myklebust
NFSoRDMA Client Updates for Linux 5.5

New Features:
- New tracepoints for congestion control and Local Invalidate WRs

Bugfixes and Cleanups:
- Eliminate log noise in call_reserveresult
- Fix unstable connections after a reconnect
- Clean up some code duplication
- Close race between waking a sender and posting a receive
- Fix MR list corruption, and clean up MR usage
- Remove unused rpcrdma_sendctx fields
- Try to avoid DMA mapping pages if it is too costly
- Wake pending tasks if connection fails
- Replace some dprintk()s with tracepoints
2019-11-18mmc: core: Fix size overflow for mmc partitionsBradley Bolen
With large eMMC cards, it is possible to create general purpose partitions that are bigger than 4GB. The size member of the mmc_part struct is only an unsigned int which overflows for gp partitions larger than 4GB. Change this to a u64 to handle the overflow. Signed-off-by: Bradley Bolen <bradleybolen@gmail.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
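The fix itself, sketched; other members of struct mmc_part are elided and the comment is illustrative:

	/* include/linux/mmc/card.h (sketch) */
	struct mmc_part {
		u64	size;	/* was: unsigned int, which overflowed for
				 * general purpose partitions above 4 GB */
		/* other members elided */
	};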
2019-11-17Merge tag 'iommu-fixes-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds
Pull iommu fixes from Joerg Roedel:

- Fix for Intel IOMMU to correct invalidation commands when in SVA mode.
- Update MAINTAINERS entry for Intel IOMMU

* tag 'iommu-fixes-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Fix QI_DEV_IOTLB_PFSID and QI_DEV_EIOTLB_PFSID macros
  MAINTAINERS: Update for INTEL IOMMU (VT-d) entry