summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-03-24xfrm: Fix NULL pointer dereference on policy lookupSteffen Klassert
When xfrm interfaces are used in combination with namespaces and ESP offload, we get a dst_entry NULL pointer dereference. This is because we don't have a dst_entry attached in the ESP offloading case and we need to do a policy lookup before the namespace transition. Fix this by expicit checking of skb_dst(skb) before accessing it. Fixes: f203b76d78092 ("xfrm: Add virtual xfrm interfaces") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-03-24pps: clients: gpio: Get rid of legacy platform dataAndy Shevchenko
Platform data is a legacy interface to supply device properties to the driver. In this case we even don't have in-kernel users for it. Just remove it for good. Acked-by: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210318130321.24227-4-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24binder: BINDER_GET_FROZEN_INFO ioctlMarco Ballesio
User space needs to know if binder transactions occurred to frozen processes. Introduce a new BINDER_GET_FROZEN ioctl and keep track of transactions occurring to frozen proceses. Signed-off-by: Marco Ballesio <balejs@google.com> Signed-off-by: Li Li <dualli@google.com> Acked-by: Todd Kjos <tkjos@google.com> Link: https://lore.kernel.org/r/20210316011630.1121213-4-dualli@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24binder: BINDER_FREEZE ioctlMarco Ballesio
Frozen tasks can't process binder transactions, so a way is required to inform transmitting ends of communication failures due to the frozen state of their receiving counterparts. Additionally, races are possible between transitions to frozen state and binder transactions enqueued to a specific process. Implement BINDER_FREEZE ioctl for user space to inform the binder driver about the intention to freeze or unfreeze a process. When the ioctl is called, block the caller until any pending binder transactions toward the target process are flushed. Return an error to transactions to processes marked as frozen. Co-developed-by: Todd Kjos <tkjos@google.com> Acked-by: Todd Kjos <tkjos@google.com> Signed-off-by: Marco Ballesio <balejs@google.com> Signed-off-by: Todd Kjos <tkjos@google.com> Signed-off-by: Li Li <dualli@google.com> Link: https://lore.kernel.org/r/20210316011630.1121213-2-dualli@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-24uapi: map_to_7segment: Remove licence boilerplateGeert Uytterhoeven
Remove the license boilerplate (containing an obsolete address), because we now have the SPDX header. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20210322141748.1062733-1-geert@linux-m68k.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23drm/amdkfd: Bump KFD API versionFelix Kuehling
Indicate the availability reliable SRAM EDC state in the new bit in the device properties. Proposed userspace changes: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/7cdd63475c36bb9f49bb960f90f9a8cdb7e80a21 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23drm/amdkfd: add new flag for uncached GPU mappingEric Huang
The macro is for memory mapped by GPU as uncached. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23net: make unregister netdev warning timeout configurableDmitry Vyukov
netdev_wait_allrefs() issues a warning if refcount does not drop to 0 after 10 seconds. While 10 second wait generally should not happen under normal workload in normal environment, it seems to fire falsely very often during fuzzing and/or in qemu emulation (~10x slower). At least it's not possible to understand if it's really a false positive or not. Automated testing generally bumps all timeouts to very high values to avoid flake failures. Add net.core.netdev_unregister_timeout_secs sysctl to make the timeout configurable for automated testing systems. Lowering the timeout may also be useful for e.g. manual bisection. The default value matches the current behavior. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=211877 Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23net: ocelot: replay switchdev events when joining bridgeVladimir Oltean
The premise of this change is that the switchdev port attributes and objects offloaded by ocelot might have been missed when we are joining an already existing bridge port, such as a bonding interface. The patch pulls these switchdev attributes and objects from the bridge, on behalf of the 'bridge port' net device which might be either the ocelot switch interface, or the bonding upper interface. The ocelot_net.c belongs strictly to the switchdev ocelot driver, while ocelot.c is part of a library shared with the DSA felix driver. The ocelot_port_bridge_leave function (part of the common library) used to call ocelot_port_vlan_filtering(false), something which is not necessary for DSA, since the framework deals with that already there. So we move this function to ocelot_switchdev_unsync, which is specific to the switchdev driver. The code movement described above makes ocelot_port_bridge_leave no longer return an error code, so we change its type from int to void. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23net: bridge: add helper to replay VLANs installed on portVladimir Oltean
Currently this simple setup with DSA: ip link add br0 type bridge vlan_filtering 1 ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 will not work because the bridge has created the PVID in br_add_if -> nbp_vlan_init, and it has notified switchdev of the existence of VLAN 1, but that was too early, since swp0 was not yet a lower of bond0, so it had no reason to act upon that notification. We need a helper in the bridge to replay the switchdev VLAN objects that were notified since the bridge port creation, because some of them may have been missed. As opposed to the br_mdb_replay function, the vg->vlan_list write side protection is offered by the rtnl_mutex which is sleepable, so we don't need to queue up the objects in atomic context, we can replay them right away. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23net: bridge: add helper to replay port and local fdb entriesVladimir Oltean
When a switchdev port starts offloading a LAG that is already in a bridge and has an FDB entry pointing to it: ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link set swp0 master bond0 the switchdev driver will have no idea that this FDB entry is there, because it missed the switchdev event emitted at its creation. Ido Schimmel pointed this out during a discussion about challenges with switchdev offloading of stacked interfaces between the physical port and the bridge, and recommended to just catch that condition and deny the CHANGEUPPER event: https://lore.kernel.org/netdev/20210210105949.GB287766@shredder.lan/ But in fact, we might need to deal with the hard thing anyway, which is to replay all FDB addresses relevant to this port, because it isn't just static FDB entries, but also local addresses (ones that are not forwarded but terminated by the bridge). There, we can't just say 'oh yeah, there was an upper already so I'm not joining that'. So, similar to the logic for replaying MDB entries, add a function that must be called by individual switchdev drivers and replays local FDB entries as well as ones pointing towards a bridge port. This time, we use the atomic switchdev notifier block, since that's what FDB entries expect for some reason. Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23net: bridge: add helper to replay port and host-joined mdb entriesVladimir Oltean
I have a system with DSA ports, and udhcpcd is configured to bring interfaces up as soon as they are created. I create a bridge as follows: ip link add br0 type bridge As soon as I create the bridge and udhcpcd brings it up, I also have avahi which automatically starts sending IPv6 packets to advertise some local services, and because of that, the br0 bridge joins the following IPv6 groups due to the code path detailed below: 33:33:ff:6d:c1:9c vid 0 33:33:00:00:00:6a vid 0 33:33:00:00:00:fb vid 0 br_dev_xmit -> br_multicast_rcv -> br_ip6_multicast_add_group -> __br_multicast_add_group -> br_multicast_host_join -> br_mdb_notify This is all fine, but inside br_mdb_notify we have br_mdb_switchdev_host hooked up, and switchdev will attempt to offload the host joined groups to an empty list of ports. Of course nobody offloads them. Then when we add a port to br0: ip link set swp0 master br0 the bridge doesn't replay the host-joined MDB entries from br_add_if, and eventually the host joined addresses expire, and a switchdev notification for deleting it is emitted, but surprise, the original addition was already completely missed. The strategy to address this problem is to replay the MDB entries (both the port ones and the host joined ones) when the new port joins the bridge, similar to what vxlan_fdb_replay does (in that case, its FDB can be populated and only then attached to a bridge that you offload). However there are 2 possibilities: the addresses can be 'pushed' by the bridge into the port, or the port can 'pull' them from the bridge. Considering that in the general case, the new port can be really late to the party, and there may have been many other switchdev ports that already received the initial notification, we would like to avoid delivering duplicate events to them, since they might misbehave. And currently, the bridge calls the entire switchdev notifier chain, whereas for replaying it should just call the notifier block of the new guy. But the bridge doesn't know what is the new guy's notifier block, it just knows where the switchdev notifier chain is. So for simplification, we make this a driver-initiated pull for now, and the notifier block is passed as an argument. To emulate the calling context for mdb objects (deferred and put on the blocking notifier chain), we must iterate under RCU protection through the bridge's mdb entries, queue them, and only call them once we're out of the RCU read-side critical section. There was some opportunity for reuse between br_mdb_switchdev_host_port, br_mdb_notify and the newly added br_mdb_queue_one in how the switchdev mdb object is created, so a helper was created. Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23net: bridge: add helper to retrieve the current ageing timeVladimir Oltean
The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from: sysfs/ioctl/netlink -> br_set_ageing_time -> __set_ageing_time therefore not at bridge port creation time, so: (a) switchdev drivers have to hardcode the initial value for the address ageing time, because they didn't get any notification (b) that hardcoded value can be out of sync, if the user changes the ageing time before enslaving the port to the bridge We need a helper in the bridge, such that switchdev drivers can query the current value of the bridge ageing time when they start offloading it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23net: bridge: add helper for retrieving the current bridge port STP stateVladimir Oltean
It may happen that we have the following topology with DSA or any other switchdev driver with LAG offload: ip link add br0 type bridge stp_state 1 ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 ip link set swp1 master bond0 STP decides that it should put bond0 into the BLOCKING state, and that's that. The ports that are actively listening for the switchdev port attributes emitted for the bond0 bridge port (because they are offloading it) and have the honor of seeing that switchdev port attribute can react to it, so we can program swp0 and swp1 into the BLOCKING state. But if then we do: ip link set swp2 master bond0 then as far as the bridge is concerned, nothing has changed: it still has one bridge port. But this new bridge port will not see any STP state change notification and will remain FORWARDING, which is how the standalone code leaves it in. We need a function in the bridge driver which retrieves the current STP state, such that drivers can synchronize to it when they may have missed switchdev events. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23net: lapb: Make "lapb_t1timer_running" able to detect an already running timerXie He
Problem: The "lapb_t1timer_running" function in "lapb_timer.c" is used in only one place: in the "lapb_kick" function in "lapb_out.c". "lapb_kick" calls "lapb_t1timer_running" to check if the timer is already pending, and if it is not, schedule it to run. However, if the timer has already fired and is running, and is waiting to get the "lapb->lock" lock, "lapb_t1timer_running" will not detect this, and "lapb_kick" will then schedule a new timer. The old timer will then abort when it sees a new timer pending. I think this is not right. The purpose of "lapb_kick" should be ensuring that the actual work of the timer function is scheduled to be done. If the timer function is already running but waiting for the lock, "lapb_kick" should not abort and reschedule it. Changes made: I added a new field "t1timer_running" in "struct lapb_cb" for "lapb_t1timer_running" to use. "t1timer_running" will accurately reflect whether the actual work of the timer is pending. If the timer has fired but is still waiting for the lock, "t1timer_running" will still correctly reflect whether the actual work is waiting to be done. The old "t1timer_stop" field, whose only responsibility is to ask a timer (that is already running but waiting for the lock) to abort, is no longer needed, because the new "t1timer_running" field can fully take over its responsibility. Therefore "t1timer_stop" is deleted. "t1timer_running" is not simply a negation of the old "t1timer_stop". At the end of the timer function, if it does not reschedule itself, "t1timer_running" is set to false to indicate that the timer is stopped. For consistency of the code, I also added "t2timer_running" and deleted "t2timer_stop". Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23mm/writeback: Add wait_on_page_writeback_killableMatthew Wilcox (Oracle)
This is the killable version of wait_on_page_writeback. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing@auristor.com cc: linux-afs@lists.infradead.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20210320054104.1300774-3-willy@infradead.org
2021-03-23fs/cachefiles: Remove wait_bit_key layout dependencyMatthew Wilcox (Oracle)
Cachefiles was relying on wait_page_key and wait_bit_key being the same layout, which is fragile. Now that wait_page_key is exposed in the pagemap.h header, we can remove that fragility A comment on the need to maintain structure layout equivalence was added by Linus[1] and that is no longer applicable. Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: kafs-testing@auristor.com cc: linux-cachefs@redhat.com cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20210320054104.1300774-2-willy@infradead.org/ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3510ca20ece0150af6b10c77a74ff1b5c198e3e2 [1]
2021-03-23ACPI: CPPC: Add emtpy stubs of functions for CONFIG_ACPI_CPPC_LIB unsetRafael J. Wysocki
For convenience, add empty stubs of library functions defined in cppc_acpi.c for the CONFIG_ACPI_CPPC_LIB unset case. Because one of them needs to return CPUFREQ_ETERNAL, include linux/cpufreq.h into the CPPC library header file and drop the direct inclusion of it from cppc_acpi.c. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com>
2021-03-23tracing: Fix various typos in commentsIngo Molnar
Fix ~59 single-word typos in the tracing code comments, and fix the grammar in a handful of places. Link: https://lore.kernel.org/r/20210322224546.GA1981273@gmail.com Link: https://lkml.kernel.org/r/20210323174935.GA4176821@gmail.com Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-23workqueue: Add resource managed version of delayed work initMatti Vaittinen
A few drivers which need a delayed work-queue must cancel work at driver detach. Some of those implement remove() solely for this purpose. Help drivers to avoid unnecessary remove and error-branch implementation by adding managed verision of delayed work initialization. This will also help drivers to avoid mixing manual and devm based unwinding when other resources are handled by devm. Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Link: https://lore.kernel.org/r/51769ea4668198deb798fe47fcfb5f5288d61586.1616506559.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23driver core: Avoid pointless deferred probe attemptsSaravana Kannan
There's no point in adding a device to the deferred probe list if we know for sure that it doesn't have a matching driver. So, check if a device can match with a driver before adding it to the deferred probe list. Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210302211133.2244281-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23usb: core: Track SuperSpeed Plus GenXxYThinh Nguyen
Introduce ssp_rate field to usb_device structure to capture the connected SuperSpeed Plus signaling rate generation and lane count with the corresponding usb_ssp_rate enum. Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/b7805d121e5ae4ad5ae144bd860b6ac04ee47436.1615432770.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23USB: core: rename usb_driver_claim_interface() data parameterJohan Hovold
It's been almost twenty years since the interface "private data" pointer was removed in favour of using the driver-data pointer of struct device. Let's rename the driver-data parameter of usb_driver_claim_interface() so that it better reflects how it's used. Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20210318155406.22399-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23usb: ehci: add spurious flag to disable overcurrent checkingFlorian Fainelli
This patch adds an ignore_oc flag which can be set by EHCI controller not supporting or wanting to disable overcurrent checking. The EHCI platform data in include/linux/usb/ehci_pdriver.h is also augmented to take advantage of this new flag. Signed-off-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Link: https://lore.kernel.org/r/20210223174455.1378-2-noltari@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23locking/mutex: Fix non debug version of mutex_lock_io_nested()Thomas Gleixner
If CONFIG_DEBUG_LOCK_ALLOC=n then mutex_lock_io_nested() maps to mutex_lock() which is clearly wrong because mutex_lock() lacks the io_schedule_prepare()/finish() invocations. Map it to mutex_lock_io(). Fixes: f21860bac05b ("locking/mutex, sched/wait: Fix the mutex_lock_io_nested() define") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/878s6fshii.fsf@nanos.tec.linutronix.de
2021-03-23fs: update kernel-doc for vfs_rename()Christian Brauner
Commit 9fe61450972d ("namei: introduce struct renamedata") introduces a new struct for vfs_rename() and makes the vfs_rename() kernel-doc argument description out of sync. Move the description of arguments for vfs_rename() to a new kernel-doc for the struct renamedata to make these descriptions checkable against the actual implementation. Link: https://lore.kernel.org/r/20210204180059.28360-3-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-03-23fs: turn some comments into kernel-docLukas Bulwahn
While reviewing ./include/linux/fs.h, I noticed that three comments can actually be turned into kernel-doc comments. This allows to check the consistency between the descriptions and the functions' signatures in case they may change in the future. A quick validation with the consistency check: ./scripts/kernel-doc -none include/linux/fs.h currently reports no issues in this file. Link: https://lore.kernel.org/r/20210204180059.28360-2-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-03-23fs: introduce two inode i_{u,g}id initialization helpersChristian Brauner
Give filesystem two little helpers that do the right thing when initializing the i_uid and i_gid fields on idmapped and non-idmapped mounts. Filesystems shouldn't have to be concerned with too many details. Link: https://lore.kernel.org/r/20210320122623.599086-5-christian.brauner@ubuntu.com Inspired-by: Vivek Goyal <vgoyal@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-03-23fs: introduce fsuidgid_has_mapping() helperChristian Brauner
Don't open-code the checks and instead move them into a clean little helper we can call. This also reduces the risk that if we ever change something we forget to change all locations. Link: https://lore.kernel.org/r/20210320122623.599086-4-christian.brauner@ubuntu.com Inspired-by: Vivek Goyal <vgoyal@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-03-23fs: document and rename fsid helpersChristian Brauner
Vivek pointed out that the fs{g,u}id_into_mnt() naming scheme can be misleading as it could be understood as implying they do the exact same thing as i_{g,u}id_into_mnt(). The original motivation for this naming scheme was to signal to callers that the helpers will always take care to map the k{g,u}id such that the ownership is expressed in terms of the mnt_users. Get rid of the confusion by renaming those helpers to something more sensible. Al suggested mapped_fs{g,u}id() which seems a really good fit. Usually filesystems don't need to bother with these helpers directly only in some cases where they allocate objects that carry {g,u}ids which are either filesystem specific (e.g. xfs quota objects) or don't have a clean set of helpers as inodes have. Link: https://lore.kernel.org/r/20210320122623.599086-3-christian.brauner@ubuntu.com Inspired-by: Vivek Goyal <vgoyal@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-03-23fs: document mapping helpersChristian Brauner
Document new helpers we introduced this cycle. Link: https://lore.kernel.org/r/20210320122623.599086-2-christian.brauner@ubuntu.com Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-03-23driver core: Trivial typo fixBhaskar Chowdhury
s/subsytem/subsystem/ Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Link: https://lore.kernel.org/r/20210320201240.23745-1-unixbhaskar@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-22net: dsa: hellcreek: Report switch name and IDKurt Kanzenbach
Report the driver name, ASIC ID and the switch name via devlink. This is a useful information for user space tooling. Signed-off-by: Kurt Kanzenbach <kurt@kmk-computers.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22RDMA/include: Mundane typo fixes throughout the fileBhaskar Chowdhury
s/proviee/provide/ s/undelying/underlying/ s/quesiton/question/ s/drivr/driver/ Link: https://lore.kernel.org/r/20210318100453.9759-1-unixbhaskar@gmail.com Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains Netfilter updates for net-next: 1) Split flowtable workqueues per events, from Oz Shlomo. 2) fall-through warnings for clang, from Gustavo A. R. Silva 3) Remove unused declaration in conntrack, from YueHaibing. 4) Consolidate skb_try_make_writable() in flowtable datapath, simplify some of the existing codebase. 5) Call dst_check() to fall back to static classic forwarding path. 6) Update table flags from commit phase. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22net: set initial device refcount to 1Eric Dumazet
When adding CONFIG_PCPU_DEV_REFCNT, I forgot that the initial net device refcount was 0. When CONFIG_PCPU_DEV_REFCNT is not set, this means the first dev_hold() triggers an illegal refcount operation (addition on 0) refcount_t: addition on 0; use-after-free. WARNING: CPU: 0 PID: 1 at lib/refcount.c:25 refcount_warn_saturate+0x128/0x1a4 Fix is to change initial (and final) refcount to be 1. Also add a missing kerneldoc piece, as reported by Stephen Rothwell. Fixes: 919067cc845f ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Guenter Roeck <groeck@google.com> Tested-by: Guenter Roeck <groeck@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22Merge branch '100GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-03-22 This series contains updates to ice and iavf drivers. Haiyue Wang says: The Intel E810 Series supports a programmable pipeline for a domain specific protocols classification, for example GTP by Dynamic Device Personalization (DDP) profile. The E810 PF has introduced flex-bytes support by ethtool user-def option allowing for packet deeper matching based on an offset and value for DDP usage. For making VF also benefit from this flexible protocol classification, some new virtchnl messages are defined and handled by PF, so VF can query this new flow director capability, and use ethtool with extending the user-def option to configure Rx flow classification. The new user-def 0xAAAABBBBCCCCDDDD: BBBB is the 2 byte pattern while AAAA corresponds to its offset in the packet. Similarly DDDD is the 2 byte pattern with CCCC being the corresponding offset. The offset ranges from 0x0 to 0x1F7 (up to 504 bytes into the packet). The offset starts from the beginning of the packet. This feature can be used to allow customers to set flow director rules for protocols headers that are beyond standard ones supported by ethtool (e.g. PFCP or GTP-U). Like for matching GTP-U's TEID value 0x10203040: ethtool -N ens787f0v0 flow-type udp4 dst-port 2152 \ user-def 0x002e102000303040 action 13 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22timekeeping, clocksource: Fix various typos in commentsIngo Molnar
Fix ~56 single-word typos in timekeeping & clocksource code comments. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: linux-kernel@vger.kernel.org
2021-03-22net: move the ptype_all and ptype_base declarations to include/linux/netdevice.hVladimir Oltean
ptype_all and ptype_base are declared in net/core/dev.c as non-static, because they are used by net-procfs.c too. However, a "make W=1" build complains that there was no previous declaration of ptype_all and ptype_base in a header file, so this way of declaring things constitutes a violation of coding style. Let's move the extern declarations of ptype_all and ptype_base to the linux/netdevice.h file, which is included by net-procfs.c too. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22linux/qed: Mundane spelling fixes throughout the fileBhaskar Chowdhury
s/unrequired/"not required"/ s/consme/consume/ .....two different places s/accros/across/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22netdev: add netdev_queue_set_dql_min_limit()Vincent Mailhol
Add a function to set the dynamic queue limit minimum value. Some specific drivers might have legitimate reasons to configure dql.min_limit to a given value. Typically, this is the case when the PDU of the protocol is smaller than the packet size to used to carry those frames to the device. Concrete example: a CAN (Control Area Network) device with an USB 2.0 interface. The PDU of classical CAN protocol are roughly 16 bytes but the USB packet size (which is used to carry the CAN frames to the device) might be up to 512 bytes. Wen small traffic burst occurs, BQL algorithm is not able to immediately adjust and this would result in having to send many small USB packets (i.e packet of 16 bytes for each CAN frame). Filling up the USB packet with CAN frames is relatively fast (small latency issue) but the gain of not having to send several small USB packets is huge (big throughput increase). In this case, forcing dql.min_limit to a given value that would allow to stuff the USB packet is always a win. This function is to be used by network drivers which are able to prove through a rationale and through empirical tests on several environment (with other applications, heavy context switching, virtualization...), that they constantly reach better performances with a specific predefined dql.min_limit value with no noticeable latency impact. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22lsm: separate security_task_getsecid() into subjective and objective variantsPaul Moore
Of the three LSMs that implement the security_task_getsecid() LSM hook, all three LSMs provide the task's objective security credentials. This turns out to be unfortunate as most of the hook's callers seem to expect the task's subjective credentials, although a small handful of callers do correctly expect the objective credentials. This patch is the first step towards fixing the problem: it splits the existing security_task_getsecid() hook into two variants, one for the subjective creds, one for the objective creds. void security_task_getsecid_subj(struct task_struct *p, u32 *secid); void security_task_getsecid_obj(struct task_struct *p, u32 *secid); While this patch does fix all of the callers to use the correct variant, in order to keep this patch focused on the callers and to ease review, the LSMs continue to use the same implementation for both hooks. The net effect is that this patch should not change the behavior of the kernel in any way, it will be up to the latter LSM specific patches in this series to change the hook implementations and return the correct credentials. Acked-by: Mimi Zohar <zohar@linux.ibm.com> (IMA) Acked-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22nfs: account for selinux security context when deciding to share superblockOlga Kornievskaia
Keep track of whether or not there were LSM security context options passed during mount (ie creation of the superblock). Then, while deciding if the superblock can be shared for the new mount, check if the newly passed in LSM security context options are compatible with the existing superblock's ones by calling security_sb_mnt_opts_compat(). Previously, with selinux enabled, NFS wasn't able to do the following 2mounts: mount -o vers=4.2,sec=sys,context=system_u:object_r:root_t:s0 <serverip>:/ /mnt mount -o vers=4.2,sec=sys,context=system_u:object_r:swapfile_t:s0 <serverip>:/scratch /scratch 2nd mount would fail with "mount.nfs: an incorrect mount option was specified" and var log messages would have: "SElinux: mount invalid. Same superblock, different security settings for.." Signed-off-by: Olga Kornievskaia <kolga@netapp.com> [PM: tweak subject line] Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22lsm,selinux: add new hook to compare new mount to an existing mountOlga Kornievskaia
Add a new hook that takes an existing super block and a new mount with new options and determines if new options confict with an existing mount or not. A filesystem can use this new hook to determine if it can share the an existing superblock with a new superblock for the new mount. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [PM: tweak the subject line, fix tab/space problems] Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22ice: Enable FDIR Configure for AVFQi Zhang
The virtual channel is going to be extended to support FDIR and RSS configure from AVF. New data structures and OP codes will be added, the patch enable the FDIR part. To support above advanced AVF feature, we need to figure out what kind of data structure should be passed from VF to PF to describe an FDIR rule or RSS config rule. The common part of the requirement is we need a data structure to represent the input set selection of a rule's hash key. An input set selection is a group of fields be selected from one or more network protocol layers that could be identified as a specific flow. For example, select dst IP address from an IPv4 header combined with dst port from the TCP header as the input set for an IPv4/TCP flow. The patch adds a new data structure virtchnl_proto_hdrs to abstract a network protocol headers group which is composed of layers of network protocol header(virtchnl_proto_hdr). A protocol header contains a 32 bits mask (field_selector) to describe which fields are selected as input sets, as well as a header type (enum virtchnl_proto_hdr_type). Each bit is mapped to a field in enum virtchnl_proto_hdr_field guided by its header type. +------------+-----------+------------------------------+ | | Proto Hdr | Header Type A | | | +------------------------------+ | | | BIT 31 | ... | BIT 1 | BIT 0 | | |-----------+------------------------------+ |Proto Hdrs | Proto Hdr | Header Type B | | | +------------------------------+ | | | BIT 31 | ... | BIT 1 | BIT 0 | | |-----------+------------------------------+ | | Proto Hdr | Header Type C | | | +------------------------------+ | | | BIT 31 | ... | BIT 1 | BIT 0 | | |-----------+------------------------------+ | | .... | +-------------------------------------------------------+ All fields in enum virtchnl_proto_hdr_fields are grouped with header type and the value of the first field of a header type is always 32 aligned. enum proto_hdr_type { header_type_A = 0; header_type_B = 1; .... } enum proto_hdr_field { /* header type A */ header_A_field_0 = 0, header_A_field_1 = 1, header_A_field_2 = 2, header_A_field_3 = 3, /* header type B */ header_B_field_0 = 32, // = header_type_B << 5 header_B_field_0 = 33, header_B_field_0 = 34 header_B_field_0 = 35, .... }; So we have: proto_hdr_type = proto_hdr_field / 32 bit offset = proto_hdr_field % 32 To simply the protocol header's operations, couple help macros are added. For example, to select src IP and dst port as input set for an IPv4/UDP flow. we have: struct virtchnl_proto_hdr hdr[2]; VIRTCHNL_SET_PROTO_HDR_TYPE(&hdr[0], IPV4) VIRTCHNL_ADD_PROTO_HDR_FIELD(&hdr[0], IPV4, SRC) VIRTCHNL_SET_PROTO_HDR_TYPE(&hdr[1], UDP) VIRTCHNL_ADD_PROTO_HDR_FIELD(&hdr[1], UDP, DST) The byte array is used to store the protocol header of a training package. The byte array must be network order. The patch added virtual channel support for iAVF FDIR add/validate/delete filter. iAVF FDIR is Flow Director for Intel Adaptive Virtual Function which can direct Ethernet packets to the queues of the Network Interface Card. Add/delete command is adding or deleting one rule for each virtual channel message, while validate command is just verifying if this rule is valid without any other operations. To add or delete one rule, driver needs to config TCAM and Profile, build training packets which contains the input set value, and send the training packets through FDIR Tx queue. In addition, driver needs to manage the software context to avoid adding duplicated rules, deleting non-existent rule, input set conflicts and other invalid cases. NOTE: Supported pattern/actions and their parse functions are not be included in this patch, they will be added in a separate one. Signed-off-by: Jeff Guo <jia.guo@intel.com> Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Simei Su <simei.su@intel.com> Signed-off-by: Beilei Xing <beilei.xing@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Chen Bo <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-22SUNRPC: Export svc_xprt_received()Chuck Lever
Prepare svc_xprt_received() to be called from transport code instead of from generic RPC server code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22svcrdma: Remove unused sc_pages fieldChuck Lever
Clean up. This significantly reduces the size of struct svc_rdma_send_ctxt. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22svcrdma: Normalize Send page handlingChuck Lever
Currently svc_rdma_sendto() migrates xdr_buf pages into a separate page list and NULLs out a bunch of entries in rq_pages while the pages are under I/O. The Send completion handler then frees those pages later. Instead, let's wait for the Send completion, then handle page releasing in the nfsd thread. I'd like to avoid the cost of 250+ put_page() calls in the Send completion handler, which is single- threaded. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22svcrdma: Add a "deferred close" helperChuck Lever
Refactor a bit of commonly used logic so that every site that wants a close deferred to an nfsd thread does all the right things (set_bit(XPT_CLOSE) then enqueue). Also, once XPT_CLOSE is set on a transport, it is never cleared. If XPT_CLOSE is already set, then the close is already being handled and the enqueue can be skipped. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22svcrdma: Maintain a Receive water markChuck Lever
Post more Receives when the number of pending Receives drops below a water mark. The batch mechanism is disabled if the underlying device cannot support a reasonably-sized Receive Queue. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>