linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2022-09-29	fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE	Lukas Czerner
	Currently the I_DIRTY_TIME will never get set if the inode already has I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's true, however ext4 will only update the on-disk inode in ->dirty_inode(), not on actual writeback. As a result if the inode already has I_DIRTY_INODE state by the time we get to __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled into on-disk inode and will not get updated until the next I_DIRTY_INODE update, which might never come if we crash or get a power failure. The problem can be reproduced on ext4 by running xfstest generic/622 with -o iversion mount option. Fix it by allowing I_DIRTY_TIME to be set even if the inode already has I_DIRTY_INODE. Also make sure that the case is properly handled in writeback_single_inode() as well. Additionally changes in xfs_fs_dirty_inode() was made to accommodate for I_DIRTY_TIME in flag. Thanks Jan Kara for suggestions on how to make this work properly. Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: stable@kernel.org Signed-off-by: Lukas Czerner <lczerner@redhat.com> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220825100657.44217-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	ext4: don't increase iversion counter for ea_inodes	Lukas Czerner
	ea_inodes are using i_version for storing part of the reference count so we really need to leave it alone. The problem can be reproduced by xfstest ext4/026 when iversion is enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL inodes in ext4_mark_iloc_dirty(). Cc: stable@kernel.org Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Link: https://lore.kernel.org/r/20220824160349.39664-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	ext4: fix check for block being out of directory size	Jan Kara
	The check in __ext4_read_dirblock() for block being outside of directory size was wrong because it compared block number against directory size in bytes. Fix it. Fixes: 65f8ea4cd57d ("ext4: check if directory block is within i_size") CVE: CVE-2022-1184 CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Link: https://lore.kernel.org/r/20220822114832.1482-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	fs/buffer: make submit_bh & submit_bh_wbc return type as void	Ritesh Harjani (IBM)
	submit_bh/submit_bh_wbc are non-blocking functions which just submit the bio and return. The caller of submit_bh/submit_bh_wbc needs to wait on buffer till I/O completion and then check buffer head's b_state field to know if there was any I/O error. Hence there is no need for these functions to have any return type. Even now they always returns 0. Hence drop the return value and make their return type as void to avoid any confusion. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/cb66ef823374cdd94d2d03083ce13de844fffd41.1660788334.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	fs/buffer: drop useless return value of submit_bh	Ritesh Harjani (IBM)
	submit_bh always returns 0. This patch drops the useless return value of submit_bh from __sync_dirty_buffer(). Once all of submit_bh callers are cleaned up, we can make it's return type as void. Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/a98a6ddfac68f73d684c2724952e825bc1f4d238.1660788334.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	fs/ntfs: drop useless return value of submit_bh from ntfs_submit_bh_for_read	Ritesh Harjani (IBM)
	submit_bh always returns 0. This patch drops the useless return value of submit_bh from ntfs_submit_bh_for_read(). Once all of submit_bh callers are cleaned up, we can make it's return type as void. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ritesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/d82eb29e8dbc52fe13a7affef5c907ea4076aa31.1660788334.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	jbd2: drop useless return value of submit_bh	Ritesh Harjani (IBM)
	submit_bh always returns 0. This patch cleans up 2 of it's caller in jbd2 to drop submit_bh's useless return value. Once all submit_bh callers are cleaned up, we can make it's return type as void. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/e069c0539be0aec61abcdc6f6141982ec85d489d.1660788334.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	ext4: make ext4_lazyinit_thread freezable	Lalith Rajendran
	ext4_lazyinit_thread is not set freezable. Hence when the thread calls try_to_freeze it doesn't freeze during suspend and continues to send requests to the storage during suspend, resulting in suspend failures. Cc: stable@kernel.org Signed-off-by: Lalith Rajendran <lalithkraj@google.com> Link: https://lore.kernel.org/r/20220818214049.1519544-1-lalithkraj@google.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	Merge branch '100GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-09-28 (ice) Arkadiusz implements a single pin initialization function, checking feature bits, instead of having separate device functions and updates sub-device IDs for recognizing E810T devices. Martyna adds support for switchdev filters on VLAN priority field. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Add support for VLAN priority filters in switchdev ice: support features on new E810T variants ice: Merge pin initialization of E810 and E810T adapters ==================== Link: https://lore.kernel.org/r/20220928203217.411078-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	eth: alx: take rtnl_lock on resume	Jakub Kicinski
	Zbynek reports that alx trips an rtnl assertion on resume: RTNL: assertion failed at net/core/dev.c (2891) RIP: 0010:netif_set_real_num_tx_queues+0x1ac/0x1c0 Call Trace: <TASK> __alx_open+0x230/0x570 [alx] alx_resume+0x54/0x80 [alx] ? pci_legacy_resume+0x80/0x80 dpm_run_callback+0x4a/0x150 device_resume+0x8b/0x190 async_resume+0x19/0x30 async_run_entry_fn+0x30/0x130 process_one_work+0x1e5/0x3b0 indeed the driver does not hold rtnl_lock during its internal close and re-open functions during suspend/resume. Note that this is not a huge bug as the driver implements its own locking, and does not implement changing the number of queues, but we need to silence the splat. Fixes: 4a5fe57e7751 ("alx: use fine-grained locking instead of RTNL") Reported-and-tested-by: Zbynek Michl <zbynek.michl@gmail.com> Reviewed-by: Niels Dossche <dossche.niels@gmail.com> Link: https://lore.kernel.org/r/20220928181236.1053043-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	sparc: Unbreak the build	Bart Van Assche
	Fix the following build errors: arch/sparc/mm/srmmu.c: In function ‘smp_flush_page_for_dma’: arch/sparc/mm/srmmu.c:1639:13: error: cast between incompatible function types from ‘void ()(long unsigned int)’ to ‘void ()(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)’ [-Werror=cast-function-type] 1639 \| xc1((smpfunc_t) local_ops->page_for_dma, page); \| ^ arch/sparc/mm/srmmu.c: In function ‘smp_flush_cache_mm’: arch/sparc/mm/srmmu.c:1662:29: error: cast between incompatible function types from ‘void ()(struct mm_struct )’ to ‘void (*)(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)’ [-Werror=cast-function-type] 1662 \| xc1((smpfunc_t) local_ops->cache_mm, (unsigned long) mm); \| [ ... ] Compile-tested only. Fixes: 552a23a0e5d0 ("Makefile: Enable -Wcast-function-type") Cc: stable@vger.kernel.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220830205854.1918026-1-bvanassche@acm.org
2022-09-29	net: phy: Convert to use sysfs_emit() APIs	Wang Yufen
	Follow the advice of the Documentation/filesystems/sysfs.rst and show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. Signed-off-by: Wang Yufen <wangyufen@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/1664364860-29153-1-git-send-email-wangyufen@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	Merge branch 'add-tc-taprio-support-for-queuemaxsdu'	Jakub Kicinski
	Vladimir Oltean says: ==================== Add tc-taprio support for queueMaxSDU The tc-taprio offload mode supported by the Felix DSA driver has limitations surrounding its guard bands. The initial discussion was at: https://lore.kernel.org/netdev/c7618025da6723418c56a54fe4683bd7@walle.cc/ with the latest status being that we now have a vsc9959_tas_guard_bands_update() method which makes a best-guess attempt at how much useful space to reserve for packet scheduling in a taprio interval, and how much to reserve for guard bands. IEEE 802.1Q actually does offer a tunable variable (queueMaxSDU) which can determine the max MTU supported per traffic class. In turn we can determine the size we need for the guard bands, depending on the queueMaxSDU. This way we can make the guard band of small taprio intervals smaller than one full MTU worth of transmission time, if we know that said traffic class will transport only smaller packets. As discussed with Gerhard Engleder, the queueMaxSDU may also be useful in limiting the latency on an endpoint, if some of the TX queues are outside of the control of the Linux driver. https://patchwork.kernel.org/project/netdevbpf/patch/20220914153303.1792444-11-vladimir.oltean@nxp.com/ Allow input of queueMaxSDU through netlink into tc-taprio, offload it to the hardware I have access to (LS1028A), and (implicitly) deny non-default values to everyone else. Kurt Kanzenbach has also kindly tested and shared a patch to offload this to hellcreek. v3 at: https://patchwork.kernel.org/project/netdevbpf/cover/20220927234746.1823648-1-vladimir.oltean@nxp.com/ v2 at: https://patchwork.kernel.org/project/netdevbpf/list/?series=679954&state=* v1 at: https://patchwork.kernel.org/project/netdevbpf/cover/20220914153303.1792444-1-vladimir.oltean@nxp.com/ ==================== Link: https://lore.kernel.org/r/20220928095204.2093716-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: enetc: offload per-tc max SDU from tc-taprio	Vladimir Oltean
	The driver currently sets the PTCMSDUR register statically to the max MTU supported by the interface. Keep this logic if tc-taprio is absent or if the max_sdu for a traffic class is 0, and follow the requested max SDU size otherwise. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: enetc: use common naming scheme for PTGCR and PTGCAPR registers	Vladimir Oltean
	The Port Time Gating Control Register (PTGCR) and Port Time Gating Capability Register (PTGCAPR) have definitions in the driver which aren't in line with the other registers. Rename these. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: enetc: cache accesses to &priv->si->hw	Vladimir Oltean
	The &priv->si->hw construct dereferences 2 pointers and makes lines longer than they need to be, in turn making the code harder to read. Replace &priv->si->hw accesses with a "hw" variable when there are 2 or more accesses within a function that dereference this. This includes loops, since &priv->si->hw is a loop invariant. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: dsa: hellcreek: Offload per-tc max SDU from tc-taprio	Kurt Kanzenbach
	Add support for configuring the max SDU per priority and per port. If not specified, keep the default. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: dsa: hellcreek: refactor hellcreek_port_setup_tc() to use switch/case	Vladimir Oltean
	The following patch will need to make this function also respond to TC_QUERY_BASE, so make the processing more structured around the tc_setup_type. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: dsa: felix: offload per-tc max SDU from tc-taprio	Vladimir Oltean
	Our current vsc9959_tas_guard_bands_update() algorithm has a limitation imposed by the hardware design. To avoid packet overruns between one gate interval and the next (which would add jitter for scheduled traffic in the next gate), we configure the switch to use guard bands. These are as large as the largest packet which is possible to be transmitted. The problem is that at tc-taprio intervals of sizes comparable to a guard band, there isn't an obvious place in which to split the interval between the useful portion (for scheduling) and the guard band portion (where scheduling is blocked). For example, a 10 us interval at 1Gbps allows 1225 octets to be transmitted. We currently split the interval between the bare minimum of 33 ns useful time (required to schedule a single packet) and the rest as guard band. But 33 ns of useful scheduling time will only allow a single packet to be sent, be that packet 1200 octets in size, or 60 octets in size. It is impossible to send 2 60 octets frames in the 10 us window. Except that if we reduced the guard band (and therefore the maximum allowable SDU size) to 5 us, the useful time for scheduling is now also 5 us, so more packets could be scheduled. The hardware inflexibility of not scheduling according to individual packet lengths must unfortunately propagate to the user, who needs to tune the queueMaxSDU values if he wants to fit more small packets into a 10 us interval, rather than one large packet. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net/sched: taprio: allow user input of per-tc max SDU	Vladimir Oltean
	IEEE 802.1Q clause 12.29.1.1 "The queueMaxSDUTable structure and data types" and 8.6.8.4 "Enhancements for scheduled traffic" talk about the existence of a per traffic class limitation of maximum frame sizes, with a fallback on the port-based MTU. As far as I am able to understand, the 802.1Q Service Data Unit (SDU) represents the MAC Service Data Unit (MSDU, i.e. L2 payload), excluding any number of prepended VLAN headers which may be otherwise present in the MSDU. Therefore, the queueMaxSDU is directly comparable to the device MTU (1500 means L2 payload sizes are accepted, or frame sizes of 1518 octets, or 1522 plus one VLAN header). Drivers which offload this are directly responsible of translating into other units of measurement. To keep the fast path checks optimized, we keep 2 arrays in the qdisc, one for max_sdu translated into frame length (so that it's comparable to skb->len), and another for offloading and for dumping back to the user. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net/sched: query offload capabilities through ndo_setup_tc()	Vladimir Oltean
	When adding optional new features to Qdisc offloads, existing drivers must reject the new configuration until they are coded up to act on it. Since modifying all drivers in lockstep with the changes in the Qdisc can create problems of its own, it would be nice if there existed an automatic opt-in mechanism for offloading optional features. Jakub proposes that we multiplex one more kind of call through ndo_setup_tc(): one where the driver populates a Qdisc-specific capability structure. First user will be taprio in further changes. Here we are introducing the definitions for the base functionality. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220923163310.3192733-3-vladimir.oltean@nxp.com/ Suggested-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net/tipc: Remove unused struct distr_queue_item	Yuan Can
	After commit 09b5678c778f("tipc: remove dead code in tipc_net and relatives"), struct distr_queue_item is not used any more and can be removed as well. Signed-off-by: Yuan Can <yuancan@huawei.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20220928085636.71749-1-yuancan@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: skb: introduce and use a single page frag cache	Paolo Abeni
	After commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") we are observing 10-20% regressions in performance tests with small packets. The perf trace points to high pressure on the slab allocator. This change tries to improve the allocation schema for small packets using an idea originally suggested by Eric: a new per CPU page frag is introduced and used in __napi_alloc_skb to cope with small allocation requests. To ensure that the above does not lead to excessive truesize underestimation, the frag size for small allocation is inflated to 1K and all the above is restricted to build with 4K page size. Note that we need to update accordingly the run-time check introduced with commit fd9ea57f4e95 ("net: add napi_get_frags_check() helper"). Alex suggested a smart page refcount schema to reduce the number of atomic operations and deal properly with pfmemalloc pages. Under small packet UDP flood, I measure a 15% peak tput increases. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Suggested-by: Alexander H Duyck <alexanderduyck@fb.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/6b6f65957c59f86a353fc09a5127e83a32ab5999.1664350652.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: sched: cls_u32: Avoid memcpy() false-positive warning	Kees Cook
	To work around a misbehavior of the compiler's ability to see into composite flexible array structs (as detailed in the coming memcpy() hardening series[1]), use unsafe_memcpy(), as the sizing, bounds-checking, and allocation are all very tightly coupled here. This silences the false-positive reported by syzbot: memcpy: detected field-spanning write (size 80) of single field "&n->sel" at net/sched/cls_u32.c:1043 (size 16) [1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Reported-by: syzbot+a2c4601efc75848ba321@syzkaller.appspotmail.com Link: https://lore.kernel.org/lkml/000000000000a96c0b05e97f0444@google.com/ Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20220927153700.3071688-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	dt-bindings: net: snps,dwmac: Document stmmac-axi-config subnode	Marek Vasut
	The stmmac-axi-config subnode is present in multiple dwmac instance DTs, document its content per snps,axi-config property description which is a phandle to this subnode. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220927012449.698915-1-marex@denx.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	docs: netlink: clarify the historical baggage of Netlink flags	Jakub Kicinski
	nlmsg_flags are full of historical baggage, inconsistencies and strangeness. Try to document it more thoroughly. Explain the meaning of the ECHO flag (and while at it clarify the comment in the uAPI). Handwave a little about the NEW request flags and how they make sense on the surface but cater to really old paradigm before commands were a thing. I will add more notes on how to make use of ECHO and discouragement for reuse of flags to the kernel-side documentation. Link: https://lore.kernel.org/r/20220927212306.823862-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	vhost/vsock: Use kvmalloc/kvfree for larger packets.	Junichi Uekawa
	When copying a large file over sftp over vsock, data size is usually 32kB, and kmalloc seems to fail to try to allocate 32 32kB regions. vhost-5837: page allocation failure: order:4, mode:0x24040c0 Call Trace: [<ffffffffb6a0df64>] dump_stack+0x97/0xdb [<ffffffffb68d6aed>] warn_alloc_failed+0x10f/0x138 [<ffffffffb68d868a>] ? __alloc_pages_direct_compact+0x38/0xc8 [<ffffffffb664619f>] __alloc_pages_nodemask+0x84c/0x90d [<ffffffffb6646e56>] alloc_kmem_pages+0x17/0x19 [<ffffffffb6653a26>] kmalloc_order_trace+0x2b/0xdb [<ffffffffb66682f3>] __kmalloc+0x177/0x1f7 [<ffffffffb66e0d94>] ? copy_from_iter+0x8d/0x31d [<ffffffffc0689ab7>] vhost_vsock_handle_tx_kick+0x1fa/0x301 [vhost_vsock] [<ffffffffc06828d9>] vhost_worker+0xf7/0x157 [vhost] [<ffffffffb683ddce>] kthread+0xfd/0x105 [<ffffffffc06827e2>] ? vhost_dev_set_owner+0x22e/0x22e [vhost] [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3 [<ffffffffb6eb332e>] ret_from_fork+0x4e/0x80 [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3 Work around by doing kvmalloc instead. Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: Junichi Uekawa <uekawa@chromium.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20220928064538.667678-1-uekawa@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	clk: fixed-rate: add devm_clk_hw_register_fixed_rate	Dmitry Baryshkov
	Add devm_clk_hw_register_fixed_rate(), devres-managed helper to register fixed-rate clock. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220916061740.87167-3-dmitry.baryshkov@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-09-29	clk: asm9260: use parent index to link the reference clock	Dmitry Baryshkov
	Rewrite clk-asm9260 to use parent index to use the reference clock. During this rework two helpers are added: - clk_hw_register_mux_table_parent_data() to supplement clk_hw_register_mux_table() but using parent_data instead of parent_names - clk_hw_register_fixed_rate_parent_accuracy() to be used instead of directly calling __clk_hw_register_fixed_rate(). The later function is an internal API, which is better not to be called directly. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220916061740.87167-2-dmitry.baryshkov@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-09-29	Merge tag 'mtk-clk-for-6.1' of ↵	Stephen Boyd
	https://git.kernel.org/pub/scm/linux/kernel/git/wens/linux into clk-mtk Pull MediaTek clk driver updates from Chen-Yu Tsai: A lot of clean up work, as well as new drivers and new functions - New clock drivers for MediaTek Helio X10 MT6795 - Add missing DPI1_HDMI clock in MT8195 VDOSYS1 - Clock driver changes to support GPU DVFS on MT8183, MT8192, MT8195 - Fix GPU clock topology on MT8195 - Propogate rate changes from GPU clock gate up the tree - Clock mux notifiers for GPU-related PLLs - Conversion of more "simple" drivers to mtk_clk_simple_probe() - Hook up mtk_clk_simple_remove() for "simple" MT8192 clock drivers - Fixes to previous \|struct clk\| to \|struct clk_hw\| conversion - Shrink MT8192 clock driver by deduplicating clock parent lists * tag 'mtk-clk-for-6.1' of https://git.kernel.org/pub/scm/linux/kernel/git/wens/linux: (31 commits) clk: mediatek: mt8192: deduplicate parent clock lists clk: mediatek: Migrate remaining clk_unregister_() to clk_hw_unregister_() clk: mediatek: fix unregister function in mtk_clk_register_dividers cleanup clk: mediatek: clk-mt8192: Add clock mux notifier for mfg_pll_sel clk: mediatek: clk-mt8192-mfg: Propagate rate changes to parent clk: mediatek: clk-mt8195-topckgen: Drop univplls from mfg mux parents clk: mediatek: clk-mt8195-topckgen: Add GPU clock mux notifier clk: mediatek: clk-mt8195-topckgen: Register mfg_ck_fast_ref as generic mux clk: mediatek: clk-mt8195-mfg: Reparent mfg_bg3d and propagate rate changes clk: mediatek: mt8183: Add clk mux notifier for MFG mux clk: mediatek: mux: add clk notifier functions clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent clk: mediatek: Use mtk_clk_register_gates_with_dev in simple probe clk: mediatek: gate: Export mtk_clk_register_gates_with_dev clk: mediatek: add VDOSYS1 clock dt-bindings: clk: mediatek: Add MT8195 DPI clocks clk: mediatek: mt8192: add mtk_clk_simple_remove clk: mediatek: mt8183: use mtk_clk_simple_probe to simplify driver clk: mediatek: mt6797: use mtk_clk_simple_probe to simplify driver clk: mediatek: mt6779: use mtk_clk_simple_probe to simplify driver ...
2022-09-30	drm/msm: Fix build break with recent mm tree	Rob Clark
	9178e3dcb121 ("mm: discard __GFP_ATOMIC") removed __GFP_ATOMIC, replacing it with a check for not __GFP_DIRECT_RECLAIM. Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220929161404.2769414-1-robdclark@gmail.com
2022-09-29	sbitmap: fix lockup while swapping	Hugh Dickins
	Commit 4acb83417cad ("sbitmap: fix batched wait_cnt accounting") is a big improvement: without it, I had to revert to before commit 040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup") to avoid the high system time and freezes which that had introduced. Now okay on the NVME laptop, but 4acb83417cad is a disaster for heavy swapping (kernel builds in low memory) on another: soon locking up in sbitmap_queue_wake_up() (into which __sbq_wake_up() is inlined), cycling around with waitqueue_active() but wait_cnt 0 . Here is a backtrace, showing the common pattern of outer sbitmap_queue_wake_up() interrupted before setting wait_cnt 0 back to wake_batch (in some cases other CPUs are idle, in other cases they're spinning for a lock in dd_bio_merge()): sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag < __blk_mq_free_request < blk_mq_free_request < __blk_mq_end_request < scsi_end_request < scsi_io_completion < scsi_finish_command < scsi_complete < blk_complete_reqs < blk_done_softirq < __do_softirq < __irq_exit_rcu < irq_exit_rcu < common_interrupt < asm_common_interrupt < _raw_spin_unlock_irqrestore < __wake_up_common_lock < __wake_up < sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag < __blk_mq_free_request < blk_mq_free_request < dd_bio_merge < blk_mq_sched_bio_merge < blk_mq_attempt_bio_merge < blk_mq_submit_bio < __submit_bio < submit_bio_noacct_nocheck < submit_bio_noacct < submit_bio < __swap_writepage < swap_writepage < pageout < shrink_folio_list < evict_folios < lru_gen_shrink_lruvec < shrink_lruvec < shrink_node < do_try_to_free_pages < try_to_free_pages < __alloc_pages_slowpath < __alloc_pages < folio_alloc < vma_alloc_folio < do_anonymous_page < __handle_mm_fault < handle_mm_fault < do_user_addr_fault < exc_page_fault < asm_exc_page_fault See how the process-context sbitmap_queue_wake_up() has been interrupted, after bringing wait_cnt down to 0 (and in this example, after doing its wakeups), before advancing wake_index and refilling wake_cnt: an interrupt-context sbitmap_queue_wake_up() of the same sbq gets stuck. I have almost no grasp of all the possible sbitmap races, and their consequences: but __sbq_wake_up() can do nothing useful while wait_cnt 0, so it is better if sbq_wake_ptr() skips on to the next ws in that case: which fixes the lockup and shows no adverse consequence for me. The check for wait_cnt being 0 is obviously racy, and ultimately can lead to lost wakeups: for example, when there is only a single waitqueue with waiters. However, lost wakeups are unlikely to matter in these cases, and a proper fix requires redesign (and benchmarking) of the batched wakeup code: so let's plug the hole with this bandaid for now. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/9c2038a7-cdc5-5ee-854c-fbc6168bf16@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-29	io_uring/net: fix notif cqe reordering	Pavel Begunkov
	send zc is not restricted to !IO_URING_F_UNLOCKED anymore and so we can't use task-tw ordering trick to order notification cqes with requests completions. In this case leave it alone and let io_send_zc_cleanup() flush it. Cc: stable@vger.kernel.org Fixes: 53bdc88aac9a2 ("io_uring/notif: order notif vs send CQEs") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0031f3a00d492e814a4a0935a2029a46d9c9ba06.1664486545.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-29	io_uring/net: don't update msg_name if not provided	Pavel Begunkov
	io_sendmsg_copy_hdr() may clear msg->msg_name if the userspace didn't provide it, we should retain NULL in this case. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/97d49f61b5ec76d0900df658cfde3aa59ff22121.1664486545.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-30	Merge tag 'drm-intel-next-fixes-2022-09-29' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-intel into drm-next - Fix release build bug in 'remove GuC log size module parameters' (John Harrison) - Remove ipc_enabled from struct drm_i915_private (Jani Nikula) - Do not cleanup obj with NULL bo->resource (Nirmoy Das) - Fix device info for devices without display (Jani Nikula) - Force DPLL calculation for TC ports after readout (Ville Syrjälä) - Use i915_vm_put on ppgtt_create error paths (Chris Wilson) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YzWqtwPNxAe+r9FO@tursulin-desk
2022-09-29	binfmt: remove taso from linux_binprm struct	Lukas Bulwahn
	With commit 987f20a9dcce ("a.out: Remove the a.out implementation"), the use of the special taso flag for alpha architectures in the linux_binprm struct is gone. Remove the definition of taso in the linux_binprm struct. No functional change. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220929203903.9475-1-lukas.bulwahn@gmail.com
2022-09-30	Merge tag 'drm-intel-fixes-2022-09-29' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Restrict forced preemption to the active context (Chris) - Restrict perf_limit_reasons to the supported platforms - gen11+ (Ashutosh) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YzXAkH1a32pYJD33@intel.com
2022-09-30	Merge tag 'amd-drm-fixes-6.0-2022-09-29' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.0-2022-09-29: amdgpu: - GC 11.x fixes - SMU 13.x fixes - DCN 3.1.4 fixes - DCN 3.2.x fixes - GC 9.x fix - Fence fix - SR-IOV supend/resume fix - PSR regression fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220929144003.8363-1-alexander.deucher@amd.com
2022-09-30	Merge tag 'drm-misc-fixes-2022-09-29' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: * bridge/analogix: Revert earlier suspend fix * bridge/lt8912b: Fix corrupt display output Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YzWvHhaqHhYirn4L@linux-uq9g
2022-09-29	of: fdt: Remove unused struct fdt_scan_status	Yuan Can
	After commit bba04d965d06("of/fdt: remove unused of_scan_flat_dt_by_path"), no one use struct fdt_scan_status, so remove it. Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Link: https://lore.kernel.org/r/20220927133739.98493-1-yuancan@huawei.com Signed-off-by: Rob Herring <robh@kernel.org>
2022-09-29	dt-bindings: display: st,stm32-dsi: Handle data-lanes in DSI port node	Marek Vasut
	Handle 'data-lanes' property of the DSI output endpoint, it is possible to describe DSI link with 1 or 2 data lanes this way. Signed-off-by: Marek Vasut <marex@denx.de> Link: https://lore.kernel.org/r/20220926234501.583115-1-marex@denx.de Signed-off-by: Rob Herring <robh@kernel.org>
2022-09-29	io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL	Jens Axboe
	This isn't a reliable mechanism to tell if we have task_work pending, we really should be looking at whether we have any items queued. This is problematic if forward progress is gated on running said task_work. One such example is reading from a pipe, where the write side has been closed right before the read is started. The fput() of the file queues TWA_RESUME task_work, and we need that task_work to be run before ->release() is called for the pipe. If ->release() isn't called, then the read will sit forever waiting on data that will never arise. Fix this by io_run_task_work() so it checks if we have task_work pending rather than rely on TIF_NOTIFY_SIGNAL for that. The latter obviously doesn't work for task_work that is queued without TWA_SIGNAL. Reported-by: Christiano Haesbaert <haesbaert@haesbaert.org> Cc: stable@vger.kernel.org Link: https://github.com/axboe/liburing/issues/665 Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-29	Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	Linus Torvalds
	Pull coredump fix from Al Viro: "Fix for breakage in dump_user_range()" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: [coredump] don't use __kernel_write() on kmap_local_page()
2022-09-29	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	drm/panel: simple: Use dev_err_probe() to simplify code	Yuan Can
	In the probe path, dev_err() can be replaced with dev_err_probe() which will check if error code is -EPROBE_DEFER and prints the error name. It also sets the defer probe reason which can be checked later through debugfs. Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220929015503.17301-3-yuancan@huawei.com
2022-09-29	drm/panel: panel-edp: Use dev_err_probe() to simplify code	Yuan Can
	In the probe path, dev_err() can be replaced with dev_err_probe() which will check if error code is -EPROBE_DEFER and prints the error name. It also sets the defer probe reason which can be checked later through debugfs. Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220929015503.17301-2-yuancan@huawei.com Link: https://patchwork.freedesktop.org/patch/msgid/20220929015503.17301-2-yuancan@huawei.com
2022-09-29	Bluetooth: L2CAP: Fix user-after-free	Luiz Augusto von Dentz
	This uses l2cap_chan_hold_unless_zero() after calling __l2cap_get_chan_blah() to prevent the following trace: Bluetooth: l2cap_core.c:static void l2cap_chan_destroy(struct kref *kref) Bluetooth: chan 0000000023c4974d Bluetooth: parent 00000000ae861c08 ================================================================== BUG: KASAN: use-after-free in __mutex_waiter_is_first kernel/locking/mutex.c:191 [inline] BUG: KASAN: use-after-free in __mutex_lock_common kernel/locking/mutex.c:671 [inline] BUG: KASAN: use-after-free in __mutex_lock+0x278/0x400 kernel/locking/mutex.c:729 Read of size 8 at addr ffff888006a49b08 by task kworker/u3:2/389 Link: https://lore.kernel.org/lkml/20220622082716.478486-1-lee.jones@linaro.org Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
2022-09-29	prandom: make use of smaller types in prandom_u32_max	Jason A. Donenfeld
	When possible at compile-time, make use of smaller types in prandom_u32_max(), so that we can use smaller batches from random.c, which in turn leads to a 2x or 4x performance boost. This makes a difference, for example, in kfence, which needs a fast stream of small numbers (booleans). At the same time, we use the occasion to update the old documentation on these functions. prandom_u32() and prandom_bytes() have direct replacements now in random.h, while prandom_u32_max() remains useful as a prandom.h function, since it's not cryptographically secure by virtue of not being evenly distributed. Cc: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Marco Elver <elver@google.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-09-29	random: add 8-bit and 16-bit batches	Jason A. Donenfeld
	There are numerous places in the kernel that would be sped up by having smaller batches. Currently those callsites do `get_random_u32() & 0xff` or similar. Since these are pretty spread out, and will require patches to multiple different trees, let's get ahead of the curve and lay the foundation for `get_random_u8()` and `get_random_u16()`, so that it's then possible to start submitting conversion patches leisurely. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-09-29	utsname: contribute changes to RNG	Jason A. Donenfeld
	On some small machines with little entropy, a quasi-unique hostname is sometimes a relevant factor. I've seen, for example, 8 character alpha-numeric serial numbers. In addition, the time at which the hostname is set is usually a decent measurement of how long early boot took. So, call add_device_randomness() on new hostnames, which feeds its arguments to the RNG in addition to a fresh cycle counter. Low cost hooks like this never hurt and can only ever help, and since this costs basically nothing for an operation that is never a fast path, this is an overall easy win. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>