linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-01-10	Merge branches 'acpi-ec' and 'acpi-processor'	Rafael J. Wysocki
	Merge ACPI EC driver updates and ACPI processor driver updates for 5.17-rc1: - Rework flushing of EC work while suspended to idle and clean up the handling of events in the ACPI EC driver (Rafael Wysocki). - Prohibit ec_sys module parameter write_support from being used when the system is locked down (Hans de Goede). - Make the ACPI processor thermal driver use cpufreq_cpu_get() to check for presence of cpufreq policy (Manfred Spraul). - Avoid unnecessary CPU cache flushing in the ACPI processor idle driver (Kirill A. Shutemov). - Replace kernel.h with the necessary inclusions in the ACPI processor driver (Andy Shevchenko). - Use swap() instead of open coding it in the ACPI processor idle driver (Guo Zhengkui). * acpi-ec: ACPI: EC: Mark the ec_sys write_support param as module_param_hw() ACPI: EC: Relocate acpi_ec_create_query() and drop acpi_ec_delete_query() ACPI: EC: Make the event work state machine visible ACPI: EC: Avoid queuing unnecessary work in acpi_ec_submit_event() ACPI: EC: Rename three functions ACPI: EC: Simplify locking in acpi_ec_event_handler() ACPI: EC: Rearrange the loop in acpi_ec_event_handler() ACPI: EC: Fold acpi_ec_check_event() into acpi_ec_event_handler() ACPI: EC: Pass one argument to acpi_ec_query() ACPI: EC: Call advance_transaction() from acpi_ec_dispatch_gpe() ACPI: EC: Rework flushing of EC work while suspended to idle * acpi-processor: ACPI: processor: thermal: avoid cpufreq_get_policy() ACPI: processor: idle: Only flush cache on entering C3 ACPI: processor idle: Use swap() instead of open coding it ACPI: processor: Replace kernel.h with the necessary inclusions
2022-01-10	SUNRPC: Fix sockaddr handling in svcsock_accept_class trace points	Chuck Lever
	Avoid potentially hazardous memory copying and the needless use of "%pIS" -- in the kernel, an RPC service listener is always bound to ANYADDR. Having the network namespace is helpful when recording errors, though. Fixes: a0469f46faab ("SUNRPC: Replace dprintk call sites in TCP state change callouts") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-01-10	SUNRPC: Fix sockaddr handling in the svc_xprt_create_error trace point	Chuck Lever
	While testing, I got an unexpected KASAN splat: Jan 08 13:50:27 oracle-102.nfsv4.dev kernel: BUG: KASAN: stack-out-of-bounds in trace_event_raw_event_svc_xprt_create_err+0x190/0x210 [sunrpc] Jan 08 13:50:27 oracle-102.nfsv4.dev kernel: Read of size 28 at addr ffffc9000008f728 by task mount.nfs/4628 The memcpy() in the TP_fast_assign section of this trace point copies the size of the destination buffer in order that the buffer won't be overrun. In other similar trace points, the source buffer for this memcpy is a "struct sockaddr_storage" so the actual length of the source buffer is always long enough to prevent the memcpy from reading uninitialized or unallocated memory. However, for this trace point, the source buffer can be as small as a "struct sockaddr_in". For AF_INET sockaddrs, the memcpy() reads memory that follows the source buffer, which is not always valid memory. To avoid copying past the end of the passed-in sockaddr, make the source address's length available to the memcpy(). It would be a little nicer if the tracing infrastructure was more friendly about storing socket addresses that are not AF_INET, but I could not find a way to make printk("%pIS") work with a dynamic array. Reported-by: KASAN Fixes: 4b8f380e46e4 ("SUNRPC: Tracepoint to record errors in svc_xpo_create()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-01-10	Merge branches 'acpi-scan', 'acpi-pm', 'acpi-power' and 'acpi-pci'	Rafael J. Wysocki
	Merge ACPI device enumeration updates, ACPI power management updates and PCI host bridge ACPI driver updates for 5.17-rc1: - Introduce acpi_fetch_acpi_dev() as a replacement for acpi_bus_get_device() and use it in the ACPI subsystem (Rafael Wysocki). - Avoid using _CID for device enumaration if _HID is missing or invalid (Rafael Wysocki). - Rework quirk handling during ACPI device enumeration and add some new quirks for known broken platforms (Hans de Goede). - Avoid unnecessary or redundant CPU cache flushing during system PM transitions (Kirill A. Shutemov). - Add PM debug messages related to power resources (Rafael Wysocki). - Fix kernel-doc comment in the PCI host bridge ACPI driver (Yang Li). * acpi-scan: serdev: Do not instantiate serdevs on boards with known bogus DSDT entries i2c: acpi: Do not instantiate I2C-clients on boards with known bogus DSDT entries ACPI / x86: Add acpi_quirk_skip_[i2c_client\|serdev]_enumeration() helpers ACPI: scan: Create platform device for BCM4752 and LNV4752 ACPI nodes ACPI: Use acpi_fetch_acpi_dev() instead of acpi_bus_get_device() ACPI: scan: Introduce acpi_fetch_acpi_dev() ACPI: scan: Do not add device IDs from _CID if _HID is not valid * acpi-pm: ACPI: PM: Remove redundant cache flushing ACPI: PM: Avoid CPU cache flush when entering S4 * acpi-power: ACPI: PM: Emit debug messages when enabling/disabling wakeup power * acpi-pci: PCI/ACPI: Fix acpi_pci_osc_control_set() kernel-doc comment
2022-01-10	Merge tag 'asoc-v5.17-2' of ↵	Takashi Iwai
	https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Updates for v5.17 A few more updates for v5.17, nothing hugely stand out in the few days since the initial pull request was sent.
2022-01-10	Merge tag 'irqchip-5.17' of ↵	Thomas Gleixner
	git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Fix GICv3 redistributor table reservation with RT across kexec - Fix GICv4.1 redistributor view of the VPE table across kexec - Add support for extra interrupts on spear-shirq - Make obtaining some interrupts optional for the Renesas drivers - Various cleanups and bug fixes Link: https://lore.kernel.org/lkml/20220108130807.4109738-1-maz@kernel.org
2022-01-10	Merge tag 'timers-v5.17-rc1' of ↵	Thomas Gleixner
	https://git.linaro.org/people/daniel.lezcano/linux into timers/core Pull clocksource/events updates from Daniel Lezcano: - Refactor resource allocation on the Exynos_mct driver without functional changes (Marek Szyprowski) - Add imx8ulp compatible string for NPX TPM driver (Jacky Bai) - Fix comma introduced by error by replacing it by the initial semicolon on the Exynos_mct (Will Deacon) - Add OSTM driver support on Renesas. The reset line must be deasserted before accessing the registers. This change depends on an external change resulting in a shared immutable branch 'reset/of-get-optional-exclusive' from git://git.pengutronix.de/pza/linux (Biju Das) - Make the OSTM Kconfig option visible to user in order to let him disable it when ARM architected timers is enabled (Biju Das) - Tag two variables on iMX sysctr _ro_afterinit (Peng Fan) - Set the cpumask to cpu_possible_mask in order to have full benefit of the DYNIRQ flag on iMX sysctr (Peng Fan) - Tag __maybe_unused a variable in the Pistachio timer driver in order to fix a warning reported by the kernel test robot (Drew Fustini) - Add MStar MSC313e timer support and the ssd20xd-based variant, as well as the DT bindings (Romain Perier) - Remove the incompatible compatible string for the rk3066 (Johan Jonker) - Fix dts_check warnings on the cadence ttc driver by adding the power domain bindings (Michal Simek) Link: https://lore.kernel.org/lkml/e093c706-c98d-29ee-0102-78b6d41c6164@linaro.org
2022-01-10	nfs: Implement cache I/O by accessing the cache directly	David Howells
	Move NFS to using fscache DIO API instead of the old upstream I/O API as that has been removed. This is a stopgap solution as the intention is that at sometime in the future, the cache will move to using larger blocks and won't be able to store individual pages in order to deal with the potential for data corruption due to the backing filesystem being able insert/remove bridging blocks of zeros into its extent list[1]. NFS then reads and writes cache pages synchronously and one page at a time. The preferred change would be to use the netfs lib, but the new I/O API can be used directly. It's just that as the cache now needs to track data for itself, caching blocks may exceed page size... This code is somewhat borrowed from my "fallback I/O" patchset[2]. Changes ======= ver #3: - Restore lost =n fallback for nfs_fscache_release_page()[2]. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna.schumaker@netapp.com> cc: linux-nfs@vger.kernel.org cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/YO17ZNOcq+9PajfQ@mit.edu [1] Link: https://lore.kernel.org/r/202112100957.2oEDT20W-lkp@intel.com/ [2] Link: https://lore.kernel.org/r/163189108292.2509237.12615909591150927232.stgit@warthog.procyon.org.uk/ [2] Link: https://lore.kernel.org/r/163906981318.143852.17220018647843475985.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967184451.1823006.6450645559828329590.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021577632.640689.11069627070150063812.stgit@warthog.procyon.org.uk/ # v4
2022-01-10	nfs: Convert to new fscache volume/cookie API	Dave Wysochanski
	Change the nfs filesystem to support fscache's indexing rewrite and reenable caching in nfs. The following changes have been made: (1) The fscache_netfs struct is no more, and there's no need to register the filesystem as a whole. (2) The session cookie is now an fscache_volume cookie, allocated with fscache_acquire_volume(). That takes three parameters: a string representing the "volume" in the index, a string naming the cache to use (or NULL) and a u64 that conveys coherency metadata for the volume. For nfs, I've made it render the volume name string as: "nfs,<ver>,<family>,<address>,<port>,<fsidH>,<fsidL>*<,param>[,<uniq>]" (3) The fscache_cookie_def is no more and needed information is passed directly to fscache_acquire_cookie(). The cache no longer calls back into the filesystem, but rather metadata changes are indicated at other times. fscache_acquire_cookie() is passed the same keying and coherency information as before. (4) fscache_enable/disable_cookie() have been removed. Call fscache_use_cookie() and fscache_unuse_cookie() when a file is opened or closed to prevent a cache file from being culled and to keep resources to hand that are needed to do I/O. If a file is opened for writing, we invalidate it with FSCACHE_INVAL_DIO_WRITE in lieu of doing writeback to the cache, thereby making it cease caching until all currently open files are closed. This should give the same behaviour as the uptream code. Making the cache store local modifications isn't straightforward for NFS, so that's left for future patches. (5) fscache_invalidate() now needs to be given uptodate auxiliary data and a file size. It also takes a flag to indicate if this was due to a DIO write. (6) Call nfs_fscache_invalidate() with FSCACHE_INVAL_DIO_WRITE on a file to which a DIO write is made. (7) Call fscache_note_page_release() from nfs_release_page(). (8) Use a killable wait in nfs_vm_page_mkwrite() when waiting for PG_fscache to be cleared. (9) The functions to read and write data to/from the cache are stubbed out pending a conversion to use netfslib. Changes ======= ver #3: - Added missing =n fallback for nfs_fscache_release_file()[1][2]. ver #2: - Use gfpflags_allow_blocking() rather than using flag directly. - fscache_acquire_volume() now returns errors. - Remove NFS_INO_FSCACHE as it's no longer used. - Need to unuse a cookie on file-release, not inode-clear. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Co-developed-by: David Howells <dhowells@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna.schumaker@netapp.com> cc: linux-nfs@vger.kernel.org cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/202112100804.nksO8K4u-lkp@intel.com/ [1] Link: https://lore.kernel.org/r/202112100957.2oEDT20W-lkp@intel.com/ [2] Link: https://lore.kernel.org/r/163819668938.215744.14448852181937731615.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906979003.143852.2601189243864854724.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967182112.1823006.7791504655391213379.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021575950.640689.12069642327533368467.stgit@warthog.procyon.org.uk/ # v4
2022-01-10	Merge branch 'for-5.17/letsketch' into for-linus	Jiri Kosina
	- new driver to support for LetSketch device (Hans de Goede)
2022-01-10	Merge branch 'for-5.17/hidraw' into for-linus	Jiri Kosina
	- locking performance improvement for hidraw code (André Almeida)
2022-01-10	Merge branch 'for-5.17/core' into for-linus	Jiri Kosina
	- support for USI style pens (Tero Kristo, Mika Westerberg) - quirk for devices that need inverted X/Y axes (Alistair Francis) - small core code cleanups and deduplication (Benjamin Tissoires)
2022-01-10	exfat: move super block magic number to magic.h	Namjae Jeon
	Move exfat superblock magic number from local definition to magic.h. It is also needed by userspace programs that call fstatfs(). Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2022-01-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Merge in fixes directly in prep for the 5.17 merge window. No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-10	net/p9: load default transports	Thomas Weißschuh
	Now that all transports are split into modules it may happen that no transports are registered when v9fs_get_default_trans() is called. When that is the case try to load more transports from modules. Link: https://lkml.kernel.org/r/20211103193823.111007-5-linux@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> [Dominique: constify v9fs_get_trans_by_name argument as per patch1v2] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-01-10	9p/trans_fd: split into dedicated module	Thomas Weißschuh
	This allows these transports only to be used when needed. Link: https://lkml.kernel.org/r/20211103193823.111007-3-linux@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> [Dominique: Kconfig NET_9P_FD: -depends VIRTIO, +default NET_9P] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-01-09	net: skb: use kfree_skb_reason() in __udp4_lib_rcv()	Menglong Dong
	Replace kfree_skb() with kfree_skb_reason() in __udp4_lib_rcv. New drop reason 'SKB_DROP_REASON_UDP_CSUM' is added for udp csum error. Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09	net: skb: use kfree_skb_reason() in tcp_v4_rcv()	Menglong Dong
	Replace kfree_skb() with kfree_skb_reason() in tcp_v4_rcv(). Following drop reasons are added: SKB_DROP_REASON_NO_SOCKET SKB_DROP_REASON_PKT_TOO_SMALL SKB_DROP_REASON_TCP_CSUM SKB_DROP_REASON_TCP_FILTER After this patch, 'kfree_skb' event will print message like this: $ TASK-PID CPU# \|\|\|\|\| TIMESTAMP FUNCTION $ \| \| \| \|\|\|\|\| \| \| <idle>-0 [000] ..s1. 36.113438: kfree_skb: skbaddr=(____ptrval____) protocol=2048 location=(____ptrval____) reason: NO_SOCKET The reason of skb drop is printed too. Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09	net: skb: introduce kfree_skb_reason()	Menglong Dong
	Introduce the interface kfree_skb_reason(), which is able to pass the reason why the skb is dropped to 'kfree_skb' tracepoint. Add the 'reason' field to 'trace_kfree_skb', therefor user can get more detail information about abnormal skb with 'drop_monitor' or eBPF. All drop reasons are defined in the enum 'skb_drop_reason', and they will be print as string in 'kfree_skb' tracepoint in format of 'reason: XXX'. ( Maybe the reasons should be defined in a uapi header file, so that user space can use them? ) Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09	net: openvswitch: Fix ct_state nat flags for conns arriving from tc	Paul Blakey
	Netfilter conntrack maintains NAT flags per connection indicating whether NAT was configured for the connection. Openvswitch maintains NAT flags on the per packet flow key ct_state field, indicating whether NAT was actually executed on the packet. When a packet misses from tc to ovs the conntrack NAT flags are set. However, NAT was not necessarily executed on the packet because the connection's state might still be in NEW state. As such, openvswitch wrongly assumes that NAT was executed and sets an incorrect flow key NAT flags. Fix this, by flagging to openvswitch which NAT was actually done in act_ct via tc_skb_ext and tc_skb_cb to the openvswitch module, so the packet flow key NAT flags will be correctly set. Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") Signed-off-by: Paul Blakey <paulb@nvidia.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20220106153804.26451-1-paulb@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next	Jakub Kicinski
	Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next. This includes one patch to update ovs and act_ct to use nf_ct_put() instead of nf_conntrack_put(). 1) Add netns_tracker to nfnetlink_log and masquerade, from Eric Dumazet. 2) Remove redundant rcu read-size lock in nf_tables packet path. 3) Replace BUG() by WARN_ON_ONCE() in nft_payload. 4) Consolidate rule verdict tracing. 5) Replace WARN_ON() by WARN_ON_ONCE() in nf_tables core. 6) Make counter support built-in in nf_tables. 7) Add new field to conntrack object to identify locally generated traffic, from Florian Westphal. 8) Prevent NAT from shadowing well-known ports, from Florian Westphal. 9) Merge nf_flow_table_{ipv4,ipv6} into nf_flow_table_inet, also from Florian. 10) Remove redundant pointer in nft_pipapo AVX2 support, from Colin Ian King. 11) Replace opencoded max() in conntrack, from Jiapeng Chong. 12) Update conntrack to use refcount_t API, from Florian Westphal. 13) Move ip_ct_attach indirection into the nf_ct_hook structure. 14) Constify several pointer object in the netfilter codebase, from Florian Westphal. 15) Tree-wide replacement of nf_conntrack_put() by nf_ct_put(), also from Florian. 16) Fix egress splat due to incorrect rcu notation, from Florian. 17) Move stateful fields of connlimit, last, quota, numgen and limit out of the expression data area. 18) Build a blob to represent the ruleset in nf_tables, this is a requirement of the new register tracking infrastructure. 19) Add NFT_REG32_NUM to define the maximum number of 32-bit registers. 20) Add register tracking infrastructure to skip redundant store-to-register operations, this includes support for payload, meta and bitwise expresssions. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next: (32 commits) netfilter: nft_meta: cancel register tracking after meta update netfilter: nft_payload: cancel register tracking after payload update netfilter: nft_bitwise: track register operations netfilter: nft_meta: track register operations netfilter: nft_payload: track register operations netfilter: nf_tables: add register tracking infrastructure netfilter: nf_tables: add NFT_REG32_NUM netfilter: nf_tables: add rule blob layout netfilter: nft_limit: move stateful fields out of expression data netfilter: nft_limit: rename stateful structure netfilter: nft_numgen: move stateful fields out of expression data netfilter: nft_quota: move stateful fields out of expression data netfilter: nft_last: move stateful fields out of expression data netfilter: nft_connlimit: move stateful fields out of expression data netfilter: egress: avoid a lockdep splat net: prefer nf_ct_put instead of nf_conntrack_put netfilter: conntrack: avoid useless indirection during conntrack destruction netfilter: make function op structures const netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook netfilter: conntrack: convert to refcount_t api ... ==================== Link: https://lore.kernel.org/r/20220109231640.104123-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09	netfilter: nft_bitwise: track register operations	Pablo Neira Ayuso
	Check if the destination register already contains the data that this bitwise expression performs. This allows to skip this redundant operation. If the destination contains a different bitwise operation, cancel the register tracking information. If the destination contains no bitwise operation, update the register tracking information. Update the payload and meta expression to check if this bitwise operation has been already performed on the register. Hence, both the payload/meta and the bitwise expressions are reduced. There is also a special case: If source register != destination register and source register is not updated by a previous bitwise operation, then transfer selector from the source register to the destination register. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: nf_tables: add register tracking infrastructure	Pablo Neira Ayuso
	This patch adds new infrastructure to skip redundant selector store operations on the same register to achieve a performance boost from the packet path. This is particularly noticeable in pure linear rulesets but it also helps in rulesets which are already heaving relying in maps to avoid ruleset linear inspection. The idea is to keep data of the most recurrent store operations on register to reuse them with cmp and lookup expressions. This infrastructure allows for dynamic ruleset updates since the ruleset blob reduction happens from the kernel. Userspace still needs to be updated to maximize register utilization to cooperate to improve register data reuse / reduce number of store on register operations. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: nf_tables: add NFT_REG32_NUM	Pablo Neira Ayuso
	Add a definition including the maximum number of 32-bits registers that are used a scratchpad memory area to store data. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: nf_tables: add rule blob layout	Pablo Neira Ayuso
	This patch adds a blob layout per chain to represent the ruleset in the packet datapath. size (unsigned long) struct nft_rule_dp struct nft_expr ... struct nft_rule_dp struct nft_expr ... struct nft_rule_dp (is_last=1) The new structure nft_rule_dp represents the rule in a more compact way (smaller memory footprint) compared to the control-plane nft_rule structure. The ruleset blob is a read-only data structure. The first field contains the blob size, then the rules containing expressions. There is a trailing rule which is used by the tracing infrastructure which is equivalent to the NULL rule marker in the previous representation. The blob size field does not include the size of this trailing rule marker. The ruleset blob is generated from the commit path. This patch reuses the infrastructure available since 0cbc06b3faba ("netfilter: nf_tables: remove synchronize_rcu in commit phase") to build the array of rules per chain. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: egress: avoid a lockdep splat	Florian Westphal
	include/linux/netfilter_netdev.h:97 suspicious rcu_dereference_check() usage! 2 locks held by sd-resolve/1100: 0: ..(rcu_read_lock_bh){1:3}, at: ip_finish_output2 1: ..(rcu_read_lock_bh){1:3}, at: __dev_queue_xmit __dev_queue_xmit+0 .. The helper has two callers, one uses rcu_read_lock, the other rcu_read_lock_bh(). Annotate the dereference to reflect this. Fixes: 42df6e1d221dd ("netfilter: Introduce egress hook") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: conntrack: avoid useless indirection during conntrack destruction	Florian Westphal
	nf_ct_put() results in a usesless indirection: nf_ct_put -> nf_conntrack_put -> nf_conntrack_destroy -> rcu readlock + indirect call of ct_hooks->destroy(). There are two _put helpers: nf_ct_put and nf_conntrack_put. The latter is what should be used in code that MUST NOT cause a linker dependency on the conntrack module (e.g. calls from core network stack). Everyone else should call nf_ct_put() instead. A followup patch will convert a few nf_conntrack_put() calls to nf_ct_put(), in particular from modules that already have a conntrack dependency such as act_ct or even nf_conntrack itself. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: make function op structures const	Florian Westphal
	No functional changes, these structures should be const. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook	Florian Westphal
	ip_ct_attach predates struct nf_ct_hook, we can place it there and remove the exported symbol. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: conntrack: convert to refcount_t api	Florian Westphal
	Convert nf_conn reference counting from atomic_t to refcount_t based api. refcount_t api provides more runtime sanity checks and will warn on certain constructs, e.g. refcount_inc() on a zero reference count, which usually indicates use-after-free. For this reason template allocation is changed to init the refcount to 1, the subsequenct add operations are removed. Likewise, init_conntrack() is changed to set the initial refcount to 1 instead refcount_inc(). This is safe because the new entry is not (yet) visible to other cpus. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	netfilter: conntrack: Use max() instead of doing it manually	Jiapeng Chong
	Fix following coccicheck warning: ./include/net/netfilter/nf_conntrack.h:282:16-17: WARNING opportunity for max(). Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-09	Merge tag 'for-net-next-2022-01-07' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Add support for Foxconn QCA 0xe0d0 - Fix HCI init sequence on MacBook Air 8,1 and 8,2 - Fix Intel firmware loading on legacy ROM devices * tag 'for-net-next-2022-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: Bluetooth: hci_sock: fix endian bug in hci_sock_setsockopt() Bluetooth: L2CAP: uninitialized variables in l2cap_sock_setsockopt() Bluetooth: btqca: sequential validation Bluetooth: btusb: Add support for Foxconn QCA 0xe0d0 Bluetooth: btintel: Fix broken LED quirk for legacy ROM devices Bluetooth: hci_event: Rework hci_inquiry_result_with_rssi_evt Bluetooth: btbcm: disable read tx power for MacBook Air 8,1 and 8,2 Bluetooth: hci_qca: Fix NULL vs IS_ERR_OR_NULL check in qca_serdev_probe Bluetooth: hci_bcm: Check for error irq ==================== Link: https://lore.kernel.org/r/20220107210942.3750887-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-09	fs/locks: fix fcntl_getlk64/fcntl_setlk64 stub prototypes	Arnd Bergmann
	My patch to rework oabi fcntl64() introduced a harmless sparse warning when file locking is disabled: arch/arm/kernel/sys_oabi-compat.c:251:51: sparse: sparse: incorrect type in argument 3 (different address spaces) @@ expected struct flock64 [noderef] __user user @@ got struct flock64 @@ arch/arm/kernel/sys_oabi-compat.c:251:51: sparse: expected struct flock64 [noderef] __user user arch/arm/kernel/sys_oabi-compat.c:251:51: sparse: got struct flock64 arch/arm/kernel/sys_oabi-compat.c:265:55: sparse: sparse: incorrect type in argument 4 (different address spaces) @@ expected struct flock64 [noderef] __user user @@ got struct flock64 @@ arch/arm/kernel/sys_oabi-compat.c:265:55: sparse: expected struct flock64 [noderef] __user user arch/arm/kernel/sys_oabi-compat.c:265:55: sparse: got struct flock64 When file locking is enabled, everything works correctly and the right data gets passed, but the stub declarations in linux/fs.h did not get modified when the calling conventions changed in an earlier patch. Reported-by: kernel test robot <lkp@intel.com> Fixes: 7e2d8c29ecdd ("ARM: 9111/1: oabi-compat: rework fcntl64() emulation") Fixes: a75d30c77207 ("fs/locks: pass kernel struct flock to fcntl_getlk/setlk") Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-01-09	block: fix old-style declaration	Yang Li
	Move the 'inline' keyword to the front of 'void'. Remove a warning found by clang(make W=1 LLVM=1) ./include/linux/blk-mq.h:259:1: warning: ‘inline’ is not at beginning of declaration Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20220107005228.103927-1-yang.lee@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-09	tpm: Add Upgrade/Reduced mode support for TPM2 modules	axelj
	If something went wrong during the TPM firmware upgrade, like power failure or the firmware image file get corrupted, the TPM might end up in Upgrade or Failure mode upon the next start. The state is persistent between the TPM power cycle/restart. According to TPM specification: * If the TPM is in Upgrade mode, it will answer with TPM2_RC_UPGRADE to all commands except TPM2_FieldUpgradeData(). It may also accept other commands if it is able to complete them using the previously installed firmware. * If the TPM is in Failure mode, it will allow performing TPM initialization but will not provide any crypto operations. Will happily respond to Field Upgrade calls. Change the behavior of the tpm2_auto_startup(), so it detects the active running mode of the TPM by adding the following checks. If tpm2_do_selftest() call returns TPM2_RC_UPGRADE, the TPM is in Upgrade mode. If the TPM is in Failure mode, it will successfully respond to both tpm2_do_selftest() and tpm2_startup() calls. Although, will fail to answer to tpm2_get_cc_attrs_tbl(). Use this fact to conclude that TPM is in Failure mode. If detected that the TPM is in the Upgrade or Failure mode, the function sets TPM_CHIP_FLAG_FIRMWARE_UPGRADE_MODE flag. The TPM_CHIP_FLAG_FIRMWARE_UPGRADE_MODE flag is used later during driver initialization/deinitialization to disable functionality which makes no sense or will fail in the current TPM state. Following functionality is affected: * Do not register TPM as a hwrng * Do not register sysfs entries which provide information impossible to obtain in limited mode * Do not register resource managed character device Signed-off-by: axelj <axelj@axis.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-01-09	keys: X.509 public key issuer lookup without AKID	Andrew Zaborowski
	There are non-root X.509 v3 certificates in use out there that contain no Authority Key Identifier extension (RFC5280 section 4.2.1.1). For trust verification purposes the kernel asymmetric key type keeps two struct asymmetric_key_id instances that the key can be looked up by, and another two to look up the key's issuer. The x509 public key type and the PKCS7 type generate them from the SKID and AKID extensions in the certificate. In effect current code has no way to look up the issuer certificate for verification without the AKID. To remedy this, add a third asymmetric_key_id blob to the arrays in both asymmetric_key_id's (for certficate subject) and in the public_keys_signature's auth_ids (for issuer lookup), using just raw subject and issuer DNs from the certificate. Adapt asymmetric_key_ids() and its callers to use the third ID for lookups when none of the other two are available. Attempt to keep the logic intact when they are, to minimise behaviour changes. Adapt the restrict functions' NULL-checks to include that ID too. Do not modify the lookup logic in pkcs7_verify.c, the AKID extensions are still required there. Internally use a new "dn:" prefix to the search specifier string generated for the key lookup in find_asymmetric_key(). This tells asymmetric_key_match_preparse to only match the data against the raw DN in the third ID and shouldn't conflict with search specifiers already in use. In effect implement what (2) in the struct asymmetric_key_id comment (include/keys/asymmetric-type.h) is probably talking about already, so do not modify that comment. It is also how "openssl verify" looks up issuer certificates without the AKID available. Lookups by the raw DN are unambiguous only provided that the CAs respect the condition in RFC5280 4.2.1.1 that the AKID may only be omitted if the CA uses a single signing key. The following is an example of two things that this change enables. A self-signed ceritficate is generated following the example from https://letsencrypt.org/docs/certificates-for-localhost/, and can be looked up by an identifier and verified against itself by linking to a restricted keyring -- both things not possible before due to the missing AKID extension: $ openssl req -x509 -out localhost.crt -outform DER -keyout localhost.key \ -newkey rsa:2048 -nodes -sha256 \ -subj '/CN=localhost' -extensions EXT -config <( \ echo -e "[dn]\nCN=localhost\n[req]\ndistinguished_name = dn\n[EXT]\n" \ "subjectAltName=DNS:localhost\nkeyUsage=digitalSignature\n" \ "extendedKeyUsage=serverAuth") $ keyring=`keyctl newring test @u` $ trusted=`keyctl padd asymmetric trusted $keyring < localhost.crt`; \ echo $trusted 39726322 $ keyctl search $keyring asymmetric dn:3112301006035504030c096c6f63616c686f7374 39726322 $ keyctl restrict_keyring $keyring asymmetric key_or_keyring:$trusted $ keyctl padd asymmetric verified $keyring < localhost.crt Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-01-08	nfs: block notification on fs with its own ->lock	J. Bruce Fields
	NFSv4.1 supports an optional lock notification feature which notifies the client when a lock comes available. (Normally NFSv4 clients just poll for locks if necessary.) To make that work, we need to request a blocking lock from the filesystem. We turned that off for NFS in commit f657f8eef3ff ("nfs: don't atempt blocking locks on nfs reexports") [sic] because it actually blocks the nfsd thread while waiting for the lock. Thanks to Vasily Averin for pointing out that NFS isn't the only filesystem with that problem. Any filesystem that leaves ->lock NULL will use posix_lock_file(), which does the right thing. Simplest is just to assume that any filesystem that defines its own ->lock is not safe to request a blocking lock from. So, this patch mostly reverts commit f657f8eef3ff ("nfs: don't atempt blocking locks on nfs reexports") [sic] and commit b840be2f00c0 ("lockd: don't attempt blocking locks on nfs reexports"), and instead uses a check of ->lock (Vasily's suggestion) to decide whether to support blocking lock notifications on a given filesystem. Also add a little documentation. Perhaps someday we could add back an export flag later to allow filesystems with "good" ->lock methods to support blocking lock notifications. Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> [ cel: Description rewritten to address checkpatch nits ] [ cel: Fixed warning when SUNRPC debugging is disabled ] [ cel: Fixed NULL check ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Vasily Averin <vvs@virtuozzo.com>
2022-01-08	ptrace: Remove unused regs argument from ptrace_report_syscall	Eric W. Biederman
	Link: https://lkml.kernel.org/r/20220103213312.9144-7-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08	exit: Remove profile_handoff_task	Eric W. Biederman
	All profile_handoff_task does is notify the task_free_notifier chain. The helpers task_handoff_register and task_handoff_unregister are used to add and delete entries from that chain and are never called. So remove the dead code and make it much easier to read and reason about __put_task_struct. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/87fspyw6m0.fsf@email.froward.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08	exit: Remove profile_task_exit & profile_munmap	Eric W. Biederman
	When I say remove I mean remove. All profile_task_exit and profile_munmap do is call a blocking notifier chain. The helpers profile_task_register and profile_task_unregister are not called anywhere in the tree. Which means this is all dead code. So remove the dead code and make it easier to read do_exit. Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lkml.kernel.org/r/20220103213312.9144-1-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08	signal: Remove the helper signal_group_exit	Eric W. Biederman
	This helper is misleading. It tests for an ongoing exec as well as the process having received a fatal signal. Sometimes it is appropriate to treat an on-going exec differently than a process that is shutting down due to a fatal signal. In particular taking the fast path out of exit_signals instead of retargeting signals is not appropriate during exec, and not changing the the exit code in do_group_exit during exec. Removing the helper makes it more obvious what is going on as both cases must be coded for explicitly. While removing the helper fix the two cases where I have observed using signal_group_exit resulted in the wrong result. In exit_signals only test for SIGNAL_GROUP_EXIT so that signals are retargetted during an exec. In do_group_exit use 0 as the exit code during an exec as de_thread does not set group_exit_code. As best as I can determine group_exit_code has been is set to 0 most of the time during de_thread. During a thread group stop group_exit_code is set to the stop signal and when the thread group receives SIGCONT group_exit_code is reset to 0. Link: https://lkml.kernel.org/r/20211213225350.27481-8-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08	signal: Rename group_exit_task group_exec_task	Eric W. Biederman
	The only remaining user of group_exit_task is exec. Rename the field so that it is clear which part of the code uses it. Update the comment above the definition of group_exec_task to document how it is currently used. Link: https://lkml.kernel.org/r/20211213225350.27481-7-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08	signal: Remove SIGNAL_GROUP_COREDUMP	Eric W. Biederman
	After the previous cleanups "signal->core_state" is set whenever SIGNAL_GROUP_COREDUMP is set and "signal->core_state" is tested whenver the code wants to know if a coredump is in progress. The remaining tests of SIGNAL_GROUP_COREDUMP also test to see if SIGNAL_GROUP_EXIT is set. Similarly the only place that sets SIGNAL_GROUP_COREDUMP also sets SIGNAL_GROUP_EXIT. Which makes SIGNAL_GROUP_COREDUMP unecessary and redundant. So stop setting SIGNAL_GROUP_COREDUMP, stop testing SIGNAL_GROUP_COREDUMP, and remove it's definition. With the setting of SIGNAL_GROUP_COREDUMP gone, coredump_finish no longer needs to clear SIGNAL_GROUP_COREDUMP out of signal->flags by setting SIGNAL_GROUP_EXIT. Link: https://lkml.kernel.org/r/20211213225350.27481-5-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08	power: supply: Provide stubs for charge_behaviour helpers	Thomas Weißschuh
	When CONFIG_SYSFS is not enabled provide stubs for the helper functions to not break their callers. Fixes: 539b9c94ac83 ("power: supply: add helpers for charge_behaviour sysfs") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20220108153158.189489-1-linux@weissschuh.net Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-01-08	kthread: Generalize pf_io_worker so it can point to struct kthread	Eric W. Biederman
	The point of using set_child_tid to hold the kthread pointer was that it already did what is necessary. There are now restrictions on when set_child_tid can be initialized and when set_child_tid can be used in schedule_tail. Which indicates that continuing to use set_child_tid to hold the kthread pointer is a bad idea. Instead of continuing to use the set_child_tid field of task_struct generalize the pf_io_worker field of task_struct and use it to hold the kthread pointer. Rename pf_io_worker (which is a void * pointer) to worker_private so it can be used to store kthreads struct kthread pointer. Update the kthread code to store the kthread pointer in the worker_private field. Remove the places where set_child_tid had to be dealt with carefully because kthreads also used it. Link: https://lkml.kernel.org/r/CAHk-=wgtFAA9SbVYg0gR1tqPMC17-NYcs0GQkaYg1bGhh1uJQQ@mail.gmail.com Link: https://lkml.kernel.org/r/87a6grvqy8.fsf_-_@email.froward.int.ebiederm.org Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-01-08	Merge branch 'linus' into irq/core, to fix conflict	Ingo Molnar
	Conflicts: drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-01-08	kbuild: move headers_check.pl to usr/include/	Masahiro Yamada
	This script is only used by usr/include/Makefile. Make it local to the directory. Update the comment in include/uapi/linux/soundcard.h because 'make headers_check' is no longer functional. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-01-08	mm: Use multi-index entries in the page cache	Matthew Wilcox (Oracle)
	We currently store large folios as 2^N consecutive entries. While this consumes rather more memory than necessary, it also turns out to be buggy. A writeback operation which starts within a tail page of a dirty folio will not write back the folio as the xarray's dirty bit is only set on the head index. With multi-index entries, the dirty bit will be found no matter where in the folio the operation starts. This does end up simplifying the page cache slightly, although not as much as I had hoped. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	XArray: Add xas_advance()	Matthew Wilcox (Oracle)
	Add a new helper function to help iterate over multi-index entries. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	mm: Remove pagevec_remove_exceptionals()	Matthew Wilcox (Oracle)
	All of its callers now call folio_batch_remove_exceptionals(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>