Age | Commit message (Collapse) | Author |
|
Currently the I_DIRTY_TIME will never get set if the inode already has
I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's
true, however ext4 will only update the on-disk inode in
->dirty_inode(), not on actual writeback. As a result if the inode
already has I_DIRTY_INODE state by the time we get to
__mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled
into on-disk inode and will not get updated until the next I_DIRTY_INODE
update, which might never come if we crash or get a power failure.
The problem can be reproduced on ext4 by running xfstest generic/622
with -o iversion mount option.
Fix it by allowing I_DIRTY_TIME to be set even if the inode already has
I_DIRTY_INODE. Also make sure that the case is properly handled in
writeback_single_inode() as well. Additionally changes in
xfs_fs_dirty_inode() was made to accommodate for I_DIRTY_TIME in flag.
Thanks Jan Kara for suggestions on how to make this work properly.
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220825100657.44217-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
ea_inodes are using i_version for storing part of the reference count so
we really need to leave it alone.
The problem can be reproduced by xfstest ext4/026 when iversion is
enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL
inodes in ext4_mark_iloc_dirty().
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Link: https://lore.kernel.org/r/20220824160349.39664-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
The check in __ext4_read_dirblock() for block being outside of directory
size was wrong because it compared block number against directory size
in bytes. Fix it.
Fixes: 65f8ea4cd57d ("ext4: check if directory block is within i_size")
CVE: CVE-2022-1184
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20220822114832.1482-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
submit_bh/submit_bh_wbc are non-blocking functions which just submit
the bio and return. The caller of submit_bh/submit_bh_wbc needs to wait
on buffer till I/O completion and then check buffer head's b_state field
to know if there was any I/O error.
Hence there is no need for these functions to have any return type.
Even now they always returns 0. Hence drop the return value and make
their return type as void to avoid any confusion.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/cb66ef823374cdd94d2d03083ce13de844fffd41.1660788334.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
submit_bh always returns 0. This patch drops the useless return value of
submit_bh from __sync_dirty_buffer(). Once all of submit_bh callers are
cleaned up, we can make it's return type as void.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/a98a6ddfac68f73d684c2724952e825bc1f4d238.1660788334.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
submit_bh always returns 0. This patch drops the useless return value of
submit_bh from ntfs_submit_bh_for_read(). Once all of submit_bh callers are
cleaned up, we can make it's return type as void.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/d82eb29e8dbc52fe13a7affef5c907ea4076aa31.1660788334.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
submit_bh always returns 0. This patch cleans up 2 of it's caller
in jbd2 to drop submit_bh's useless return value.
Once all submit_bh callers are cleaned up, we can make it's return
type as void.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/e069c0539be0aec61abcdc6f6141982ec85d489d.1660788334.git.ritesh.list@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
ext4_lazyinit_thread is not set freezable. Hence when the thread calls
try_to_freeze it doesn't freeze during suspend and continues to send
requests to the storage during suspend, resulting in suspend failures.
Cc: stable@kernel.org
Signed-off-by: Lalith Rajendran <lalithkraj@google.com>
Link: https://lore.kernel.org/r/20220818214049.1519544-1-lalithkraj@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-09-28 (ice)
Arkadiusz implements a single pin initialization function, checking feature
bits, instead of having separate device functions and updates sub-device
IDs for recognizing E810T devices.
Martyna adds support for switchdev filters on VLAN priority field.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: Add support for VLAN priority filters in switchdev
ice: support features on new E810T variants
ice: Merge pin initialization of E810 and E810T adapters
====================
Link: https://lore.kernel.org/r/20220928203217.411078-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Zbynek reports that alx trips an rtnl assertion on resume:
RTNL: assertion failed at net/core/dev.c (2891)
RIP: 0010:netif_set_real_num_tx_queues+0x1ac/0x1c0
Call Trace:
<TASK>
__alx_open+0x230/0x570 [alx]
alx_resume+0x54/0x80 [alx]
? pci_legacy_resume+0x80/0x80
dpm_run_callback+0x4a/0x150
device_resume+0x8b/0x190
async_resume+0x19/0x30
async_run_entry_fn+0x30/0x130
process_one_work+0x1e5/0x3b0
indeed the driver does not hold rtnl_lock during its internal close
and re-open functions during suspend/resume. Note that this is not
a huge bug as the driver implements its own locking, and does not
implement changing the number of queues, but we need to silence
the splat.
Fixes: 4a5fe57e7751 ("alx: use fine-grained locking instead of RTNL")
Reported-and-tested-by: Zbynek Michl <zbynek.michl@gmail.com>
Reviewed-by: Niels Dossche <dossche.niels@gmail.com>
Link: https://lore.kernel.org/r/20220928181236.1053043-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix the following build errors:
arch/sparc/mm/srmmu.c: In function ‘smp_flush_page_for_dma’:
arch/sparc/mm/srmmu.c:1639:13: error: cast between incompatible function types from ‘void (*)(long unsigned int)’ to ‘void (*)(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)’ [-Werror=cast-function-type]
1639 | xc1((smpfunc_t) local_ops->page_for_dma, page);
| ^
arch/sparc/mm/srmmu.c: In function ‘smp_flush_cache_mm’:
arch/sparc/mm/srmmu.c:1662:29: error: cast between incompatible function types from ‘void (*)(struct mm_struct *)’ to ‘void (*)(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)’ [-Werror=cast-function-type]
1662 | xc1((smpfunc_t) local_ops->cache_mm, (unsigned long) mm);
|
[ ... ]
Compile-tested only.
Fixes: 552a23a0e5d0 ("Makefile: Enable -Wcast-function-type")
Cc: stable@vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220830205854.1918026-1-bvanassche@acm.org
|
|
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the value
to be returned to user space.
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/1664364860-29153-1-git-send-email-wangyufen@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Vladimir Oltean says:
====================
Add tc-taprio support for queueMaxSDU
The tc-taprio offload mode supported by the Felix DSA driver has
limitations surrounding its guard bands.
The initial discussion was at:
https://lore.kernel.org/netdev/c7618025da6723418c56a54fe4683bd7@walle.cc/
with the latest status being that we now have a vsc9959_tas_guard_bands_update()
method which makes a best-guess attempt at how much useful space to
reserve for packet scheduling in a taprio interval, and how much to
reserve for guard bands.
IEEE 802.1Q actually does offer a tunable variable (queueMaxSDU) which
can determine the max MTU supported per traffic class. In turn we can
determine the size we need for the guard bands, depending on the
queueMaxSDU. This way we can make the guard band of small taprio
intervals smaller than one full MTU worth of transmission time, if we
know that said traffic class will transport only smaller packets.
As discussed with Gerhard Engleder, the queueMaxSDU may also be useful
in limiting the latency on an endpoint, if some of the TX queues are
outside of the control of the Linux driver.
https://patchwork.kernel.org/project/netdevbpf/patch/20220914153303.1792444-11-vladimir.oltean@nxp.com/
Allow input of queueMaxSDU through netlink into tc-taprio, offload it to
the hardware I have access to (LS1028A), and (implicitly) deny
non-default values to everyone else. Kurt Kanzenbach has also kindly
tested and shared a patch to offload this to hellcreek.
v3 at:
https://patchwork.kernel.org/project/netdevbpf/cover/20220927234746.1823648-1-vladimir.oltean@nxp.com/
v2 at:
https://patchwork.kernel.org/project/netdevbpf/list/?series=679954&state=*
v1 at:
https://patchwork.kernel.org/project/netdevbpf/cover/20220914153303.1792444-1-vladimir.oltean@nxp.com/
====================
Link: https://lore.kernel.org/r/20220928095204.2093716-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The driver currently sets the PTCMSDUR register statically to the max
MTU supported by the interface. Keep this logic if tc-taprio is absent
or if the max_sdu for a traffic class is 0, and follow the requested max
SDU size otherwise.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Port Time Gating Control Register (PTGCR) and Port Time Gating
Capability Register (PTGCAPR) have definitions in the driver which
aren't in line with the other registers. Rename these.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The &priv->si->hw construct dereferences 2 pointers and makes lines
longer than they need to be, in turn making the code harder to read.
Replace &priv->si->hw accesses with a "hw" variable when there are 2 or
more accesses within a function that dereference this. This includes
loops, since &priv->si->hw is a loop invariant.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for configuring the max SDU per priority and per port. If not
specified, keep the default.
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The following patch will need to make this function also respond to
TC_QUERY_BASE, so make the processing more structured around the
tc_setup_type.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Our current vsc9959_tas_guard_bands_update() algorithm has a limitation
imposed by the hardware design. To avoid packet overruns between one
gate interval and the next (which would add jitter for scheduled traffic
in the next gate), we configure the switch to use guard bands. These are
as large as the largest packet which is possible to be transmitted.
The problem is that at tc-taprio intervals of sizes comparable to a
guard band, there isn't an obvious place in which to split the interval
between the useful portion (for scheduling) and the guard band portion
(where scheduling is blocked).
For example, a 10 us interval at 1Gbps allows 1225 octets to be
transmitted. We currently split the interval between the bare minimum of
33 ns useful time (required to schedule a single packet) and the rest as
guard band.
But 33 ns of useful scheduling time will only allow a single packet to
be sent, be that packet 1200 octets in size, or 60 octets in size. It is
impossible to send 2 60 octets frames in the 10 us window. Except that
if we reduced the guard band (and therefore the maximum allowable SDU
size) to 5 us, the useful time for scheduling is now also 5 us, so more
packets could be scheduled.
The hardware inflexibility of not scheduling according to individual
packet lengths must unfortunately propagate to the user, who needs to
tune the queueMaxSDU values if he wants to fit more small packets into a
10 us interval, rather than one large packet.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
IEEE 802.1Q clause 12.29.1.1 "The queueMaxSDUTable structure and data
types" and 8.6.8.4 "Enhancements for scheduled traffic" talk about the
existence of a per traffic class limitation of maximum frame sizes, with
a fallback on the port-based MTU.
As far as I am able to understand, the 802.1Q Service Data Unit (SDU)
represents the MAC Service Data Unit (MSDU, i.e. L2 payload), excluding
any number of prepended VLAN headers which may be otherwise present in
the MSDU. Therefore, the queueMaxSDU is directly comparable to the
device MTU (1500 means L2 payload sizes are accepted, or frame sizes of
1518 octets, or 1522 plus one VLAN header). Drivers which offload this
are directly responsible of translating into other units of measurement.
To keep the fast path checks optimized, we keep 2 arrays in the qdisc,
one for max_sdu translated into frame length (so that it's comparable to
skb->len), and another for offloading and for dumping back to the user.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When adding optional new features to Qdisc offloads, existing drivers
must reject the new configuration until they are coded up to act on it.
Since modifying all drivers in lockstep with the changes in the Qdisc
can create problems of its own, it would be nice if there existed an
automatic opt-in mechanism for offloading optional features.
Jakub proposes that we multiplex one more kind of call through
ndo_setup_tc(): one where the driver populates a Qdisc-specific
capability structure.
First user will be taprio in further changes. Here we are introducing
the definitions for the base functionality.
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220923163310.3192733-3-vladimir.oltean@nxp.com/
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Co-developed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 09b5678c778f("tipc: remove dead code in tipc_net and relatives"),
struct distr_queue_item is not used any more and can be removed as well.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/20220928085636.71749-1-yuancan@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 3226b158e67c ("net: avoid 32 x truesize under-estimation
for tiny skbs") we are observing 10-20% regressions in performance
tests with small packets. The perf trace points to high pressure on
the slab allocator.
This change tries to improve the allocation schema for small packets
using an idea originally suggested by Eric: a new per CPU page frag is
introduced and used in __napi_alloc_skb to cope with small allocation
requests.
To ensure that the above does not lead to excessive truesize
underestimation, the frag size for small allocation is inflated to 1K
and all the above is restricted to build with 4K page size.
Note that we need to update accordingly the run-time check introduced
with commit fd9ea57f4e95 ("net: add napi_get_frags_check() helper").
Alex suggested a smart page refcount schema to reduce the number
of atomic operations and deal properly with pfmemalloc pages.
Under small packet UDP flood, I measure a 15% peak tput increases.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Suggested-by: Alexander H Duyck <alexanderduyck@fb.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/6b6f65957c59f86a353fc09a5127e83a32ab5999.1664350652.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To work around a misbehavior of the compiler's ability to see into
composite flexible array structs (as detailed in the coming memcpy()
hardening series[1]), use unsafe_memcpy(), as the sizing,
bounds-checking, and allocation are all very tightly coupled here.
This silences the false-positive reported by syzbot:
memcpy: detected field-spanning write (size 80) of single field "&n->sel" at net/sched/cls_u32.c:1043 (size 16)
[1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Reported-by: syzbot+a2c4601efc75848ba321@syzkaller.appspotmail.com
Link: https://lore.kernel.org/lkml/000000000000a96c0b05e97f0444@google.com/
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20220927153700.3071688-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The stmmac-axi-config subnode is present in multiple dwmac instance DTs,
document its content per snps,axi-config property description which is
a phandle to this subnode.
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220927012449.698915-1-marex@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
nlmsg_flags are full of historical baggage, inconsistencies and
strangeness. Try to document it more thoroughly. Explain the meaning
of the ECHO flag (and while at it clarify the comment in the uAPI).
Handwave a little about the NEW request flags and how they make
sense on the surface but cater to really old paradigm before commands
were a thing.
I will add more notes on how to make use of ECHO and discouragement
for reuse of flags to the kernel-side documentation.
Link: https://lore.kernel.org/r/20220927212306.823862-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When copying a large file over sftp over vsock, data size is usually 32kB,
and kmalloc seems to fail to try to allocate 32 32kB regions.
vhost-5837: page allocation failure: order:4, mode:0x24040c0
Call Trace:
[<ffffffffb6a0df64>] dump_stack+0x97/0xdb
[<ffffffffb68d6aed>] warn_alloc_failed+0x10f/0x138
[<ffffffffb68d868a>] ? __alloc_pages_direct_compact+0x38/0xc8
[<ffffffffb664619f>] __alloc_pages_nodemask+0x84c/0x90d
[<ffffffffb6646e56>] alloc_kmem_pages+0x17/0x19
[<ffffffffb6653a26>] kmalloc_order_trace+0x2b/0xdb
[<ffffffffb66682f3>] __kmalloc+0x177/0x1f7
[<ffffffffb66e0d94>] ? copy_from_iter+0x8d/0x31d
[<ffffffffc0689ab7>] vhost_vsock_handle_tx_kick+0x1fa/0x301 [vhost_vsock]
[<ffffffffc06828d9>] vhost_worker+0xf7/0x157 [vhost]
[<ffffffffb683ddce>] kthread+0xfd/0x105
[<ffffffffc06827e2>] ? vhost_dev_set_owner+0x22e/0x22e [vhost]
[<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3
[<ffffffffb6eb332e>] ret_from_fork+0x4e/0x80
[<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3
Work around by doing kvmalloc instead.
Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko")
Signed-off-by: Junichi Uekawa <uekawa@chromium.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20220928064538.667678-1-uekawa@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add devm_clk_hw_register_fixed_rate(), devres-managed helper to register
fixed-rate clock.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220916061740.87167-3-dmitry.baryshkov@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Rewrite clk-asm9260 to use parent index to use the reference clock.
During this rework two helpers are added:
- clk_hw_register_mux_table_parent_data() to supplement
clk_hw_register_mux_table() but using parent_data instead of
parent_names
- clk_hw_register_fixed_rate_parent_accuracy() to be used instead of
directly calling __clk_hw_register_fixed_rate(). The later function is
an internal API, which is better not to be called directly.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220916061740.87167-2-dmitry.baryshkov@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wens/linux into clk-mtk
Pull MediaTek clk driver updates from Chen-Yu Tsai:
A lot of clean up work, as well as new drivers and new functions
- New clock drivers for MediaTek Helio X10 MT6795
- Add missing DPI1_HDMI clock in MT8195 VDOSYS1
- Clock driver changes to support GPU DVFS on MT8183, MT8192, MT8195
- Fix GPU clock topology on MT8195
- Propogate rate changes from GPU clock gate up the tree
- Clock mux notifiers for GPU-related PLLs
- Conversion of more "simple" drivers to mtk_clk_simple_probe()
- Hook up mtk_clk_simple_remove() for "simple" MT8192 clock drivers
- Fixes to previous |struct clk| to |struct clk_hw| conversion
- Shrink MT8192 clock driver by deduplicating clock parent lists
* tag 'mtk-clk-for-6.1' of https://git.kernel.org/pub/scm/linux/kernel/git/wens/linux: (31 commits)
clk: mediatek: mt8192: deduplicate parent clock lists
clk: mediatek: Migrate remaining clk_unregister_*() to clk_hw_unregister_*()
clk: mediatek: fix unregister function in mtk_clk_register_dividers cleanup
clk: mediatek: clk-mt8192: Add clock mux notifier for mfg_pll_sel
clk: mediatek: clk-mt8192-mfg: Propagate rate changes to parent
clk: mediatek: clk-mt8195-topckgen: Drop univplls from mfg mux parents
clk: mediatek: clk-mt8195-topckgen: Add GPU clock mux notifier
clk: mediatek: clk-mt8195-topckgen: Register mfg_ck_fast_ref as generic mux
clk: mediatek: clk-mt8195-mfg: Reparent mfg_bg3d and propagate rate changes
clk: mediatek: mt8183: Add clk mux notifier for MFG mux
clk: mediatek: mux: add clk notifier functions
clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent
clk: mediatek: Use mtk_clk_register_gates_with_dev in simple probe
clk: mediatek: gate: Export mtk_clk_register_gates_with_dev
clk: mediatek: add VDOSYS1 clock
dt-bindings: clk: mediatek: Add MT8195 DPI clocks
clk: mediatek: mt8192: add mtk_clk_simple_remove
clk: mediatek: mt8183: use mtk_clk_simple_probe to simplify driver
clk: mediatek: mt6797: use mtk_clk_simple_probe to simplify driver
clk: mediatek: mt6779: use mtk_clk_simple_probe to simplify driver
...
|
|
9178e3dcb121 ("mm: discard __GFP_ATOMIC") removed __GFP_ATOMIC,
replacing it with a check for not __GFP_DIRECT_RECLAIM.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220929161404.2769414-1-robdclark@gmail.com
|
|
Commit 4acb83417cad ("sbitmap: fix batched wait_cnt accounting")
is a big improvement: without it, I had to revert to before commit
040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup")
to avoid the high system time and freezes which that had introduced.
Now okay on the NVME laptop, but 4acb83417cad is a disaster for heavy
swapping (kernel builds in low memory) on another: soon locking up in
sbitmap_queue_wake_up() (into which __sbq_wake_up() is inlined), cycling
around with waitqueue_active() but wait_cnt 0 . Here is a backtrace,
showing the common pattern of outer sbitmap_queue_wake_up() interrupted
before setting wait_cnt 0 back to wake_batch (in some cases other CPUs
are idle, in other cases they're spinning for a lock in dd_bio_merge()):
sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag <
__blk_mq_free_request < blk_mq_free_request < __blk_mq_end_request <
scsi_end_request < scsi_io_completion < scsi_finish_command <
scsi_complete < blk_complete_reqs < blk_done_softirq < __do_softirq <
__irq_exit_rcu < irq_exit_rcu < common_interrupt < asm_common_interrupt <
_raw_spin_unlock_irqrestore < __wake_up_common_lock < __wake_up <
sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag <
__blk_mq_free_request < blk_mq_free_request < dd_bio_merge <
blk_mq_sched_bio_merge < blk_mq_attempt_bio_merge < blk_mq_submit_bio <
__submit_bio < submit_bio_noacct_nocheck < submit_bio_noacct <
submit_bio < __swap_writepage < swap_writepage < pageout <
shrink_folio_list < evict_folios < lru_gen_shrink_lruvec <
shrink_lruvec < shrink_node < do_try_to_free_pages < try_to_free_pages <
__alloc_pages_slowpath < __alloc_pages < folio_alloc < vma_alloc_folio <
do_anonymous_page < __handle_mm_fault < handle_mm_fault <
do_user_addr_fault < exc_page_fault < asm_exc_page_fault
See how the process-context sbitmap_queue_wake_up() has been interrupted,
after bringing wait_cnt down to 0 (and in this example, after doing its
wakeups), before advancing wake_index and refilling wake_cnt: an
interrupt-context sbitmap_queue_wake_up() of the same sbq gets stuck.
I have almost no grasp of all the possible sbitmap races, and their
consequences: but __sbq_wake_up() can do nothing useful while wait_cnt 0,
so it is better if sbq_wake_ptr() skips on to the next ws in that case:
which fixes the lockup and shows no adverse consequence for me.
The check for wait_cnt being 0 is obviously racy, and ultimately can lead
to lost wakeups: for example, when there is only a single waitqueue with
waiters. However, lost wakeups are unlikely to matter in these cases,
and a proper fix requires redesign (and benchmarking) of the batched
wakeup code: so let's plug the hole with this bandaid for now.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/9c2038a7-cdc5-5ee-854c-fbc6168bf16@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
send zc is not restricted to !IO_URING_F_UNLOCKED anymore and so
we can't use task-tw ordering trick to order notification cqes
with requests completions. In this case leave it alone and let
io_send_zc_cleanup() flush it.
Cc: stable@vger.kernel.org
Fixes: 53bdc88aac9a2 ("io_uring/notif: order notif vs send CQEs")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0031f3a00d492e814a4a0935a2029a46d9c9ba06.1664486545.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_sendmsg_copy_hdr() may clear msg->msg_name if the userspace didn't
provide it, we should retain NULL in this case.
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/97d49f61b5ec76d0900df658cfde3aa59ff22121.1664486545.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Fix release build bug in 'remove GuC log size module parameters' (John Harrison)
- Remove ipc_enabled from struct drm_i915_private (Jani Nikula)
- Do not cleanup obj with NULL bo->resource (Nirmoy Das)
- Fix device info for devices without display (Jani Nikula)
- Force DPLL calculation for TC ports after readout (Ville Syrjälä)
- Use i915_vm_put on ppgtt_create error paths (Chris Wilson)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YzWqtwPNxAe+r9FO@tursulin-desk
|
|
With commit 987f20a9dcce ("a.out: Remove the a.out implementation"), the
use of the special taso flag for alpha architectures in the linux_binprm
struct is gone.
Remove the definition of taso in the linux_binprm struct.
No functional change.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220929203903.9475-1-lukas.bulwahn@gmail.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Restrict forced preemption to the active context (Chris)
- Restrict perf_limit_reasons to the supported platforms - gen11+ (Ashutosh)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YzXAkH1a32pYJD33@intel.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.0-2022-09-29:
amdgpu:
- GC 11.x fixes
- SMU 13.x fixes
- DCN 3.1.4 fixes
- DCN 3.2.x fixes
- GC 9.x fix
- Fence fix
- SR-IOV supend/resume fix
- PSR regression fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220929144003.8363-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:
* bridge/analogix: Revert earlier suspend fix
* bridge/lt8912b: Fix corrupt display output
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YzWvHhaqHhYirn4L@linux-uq9g
|
|
After commit bba04d965d06("of/fdt: remove unused of_scan_flat_dt_by_path"), no
one use struct fdt_scan_status, so remove it.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Link: https://lore.kernel.org/r/20220927133739.98493-1-yuancan@huawei.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Handle 'data-lanes' property of the DSI output endpoint, it is possible
to describe DSI link with 1 or 2 data lanes this way.
Signed-off-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220926234501.583115-1-marex@denx.de
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
This isn't a reliable mechanism to tell if we have task_work pending, we
really should be looking at whether we have any items queued. This is
problematic if forward progress is gated on running said task_work. One
such example is reading from a pipe, where the write side has been closed
right before the read is started. The fput() of the file queues TWA_RESUME
task_work, and we need that task_work to be run before ->release() is
called for the pipe. If ->release() isn't called, then the read will sit
forever waiting on data that will never arise.
Fix this by io_run_task_work() so it checks if we have task_work pending
rather than rely on TIF_NOTIFY_SIGNAL for that. The latter obviously
doesn't work for task_work that is queued without TWA_SIGNAL.
Reported-by: Christiano Haesbaert <haesbaert@haesbaert.org>
Cc: stable@vger.kernel.org
Link: https://github.com/axboe/liburing/issues/665
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull coredump fix from Al Viro:
"Fix for breakage in dump_user_range()"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[coredump] don't use __kernel_write() on kmap_local_page()
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the probe path, dev_err() can be replaced with dev_err_probe()
which will check if error code is -EPROBE_DEFER and prints the
error name. It also sets the defer probe reason which can be
checked later through debugfs.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220929015503.17301-3-yuancan@huawei.com
|
|
In the probe path, dev_err() can be replaced with dev_err_probe()
which will check if error code is -EPROBE_DEFER and prints the
error name. It also sets the defer probe reason which can be
checked later through debugfs.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220929015503.17301-2-yuancan@huawei.com
Link: https://patchwork.freedesktop.org/patch/msgid/20220929015503.17301-2-yuancan@huawei.com
|
|
This uses l2cap_chan_hold_unless_zero() after calling
__l2cap_get_chan_blah() to prevent the following trace:
Bluetooth: l2cap_core.c:static void l2cap_chan_destroy(struct kref
*kref)
Bluetooth: chan 0000000023c4974d
Bluetooth: parent 00000000ae861c08
==================================================================
BUG: KASAN: use-after-free in __mutex_waiter_is_first
kernel/locking/mutex.c:191 [inline]
BUG: KASAN: use-after-free in __mutex_lock_common
kernel/locking/mutex.c:671 [inline]
BUG: KASAN: use-after-free in __mutex_lock+0x278/0x400
kernel/locking/mutex.c:729
Read of size 8 at addr ffff888006a49b08 by task kworker/u3:2/389
Link: https://lore.kernel.org/lkml/20220622082716.478486-1-lee.jones@linaro.org
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
|
|
When possible at compile-time, make use of smaller types in
prandom_u32_max(), so that we can use smaller batches from random.c,
which in turn leads to a 2x or 4x performance boost. This makes a
difference, for example, in kfence, which needs a fast stream of small
numbers (booleans).
At the same time, we use the occasion to update the old documentation on
these functions. prandom_u32() and prandom_bytes() have direct
replacements now in random.h, while prandom_u32_max() remains useful as
a prandom.h function, since it's not cryptographically secure by virtue
of not being evenly distributed.
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
There are numerous places in the kernel that would be sped up by having
smaller batches. Currently those callsites do `get_random_u32() & 0xff`
or similar. Since these are pretty spread out, and will require patches
to multiple different trees, let's get ahead of the curve and lay the
foundation for `get_random_u8()` and `get_random_u16()`, so that it's
then possible to start submitting conversion patches leisurely.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
On some small machines with little entropy, a quasi-unique hostname is
sometimes a relevant factor. I've seen, for example, 8 character
alpha-numeric serial numbers. In addition, the time at which the hostname
is set is usually a decent measurement of how long early boot took. So,
call add_device_randomness() on new hostnames, which feeds its arguments
to the RNG in addition to a fresh cycle counter.
Low cost hooks like this never hurt and can only ever help, and since
this costs basically nothing for an operation that is never a fast path,
this is an overall easy win.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|