Age | Commit message (Collapse) | Author |
|
byt-rt5640 is deprecated in favor of bytcr_rt5640 used by
sound/soc/intel/atom and SOF solutions both. Remove redundant machine
board and all related code.
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Liam Girdwood <liam.r.girdwood@intel.com>
Link: https://lore.kernel.org/r/20201006064907.16277-4-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
byt-max98090 is deprecated in favor of cht-bsw-max98090 used by
sound/soc/intel/atom and SOF solutions both. Remove redundant machine
board and all related code.
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Liam Girdwood <liam.r.girdwood@intel.com>
Link: https://lore.kernel.org/r/20201006064907.16277-3-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Newly added catpt solution found in sound/soc/intel/catpt is a direct
replacement to sound/soc/intel/haswell. It covers all features supported
by it and more - by aligning to recommended flows and requirement list
based on Windows driver equivalent. No harm is done to userspace as
catpt - similarly to haswell - loads no extenal topology files while
sharing the exact same ADSP firmware binary.
Given the above, existing haswell code is redundant so remove it.
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Liam Girdwood <liam.r.girdwood@intel.com>
Link: https://lore.kernel.org/r/20201006064907.16277-2-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add quite common property - power-domains - to fix dtbs_check warnings
like:
arch/arm64/boot/dts/freescale/imx8qxp-mek.dt.yaml:
mailbox@5d280000: 'power-domains' does not match any of the regexes: 'pinctrl-[0-9]+'
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201002161837.5784-1-krzk@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
|
|
Canonalize to ioctl FS_* flags instead of inode S_* flags.
Note that we do not call the helper vfs_ioc_fssetxattr_check()
for FS_IOC_FSSETXATTR ioctl. The reason is that underlying filesystem
will perform all the checks. We only need to perform the capability
check before overriding credentials.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
[S|G]ETFLAGS and FS[S|G]ETXATTR ioctls are applicable to both files and
directories, so add ioctl operations to dir as well.
We teach ovl_real_fdget() to get the realfile of directories which use
a different type of file->private_data.
Ifdef away compat ioctl implementation to conform to standard practice.
With this change, xfstest generic/079 which tests these ioctls on files
and directories passes.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Move blk_mq_sched_try_merge to blk-merge.c, which allows to mark
a lot of the merge infrastructure static there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Also move the definition from the public blkdev.h to the private
block/blk.h header.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Also move the definition from the public blkdev.h to the private
block/blk.h header.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The field of 'q_usage_counter' is always fetched in fast path of every
block driver, and move it into front of 'request_queue', so it can be
fetched into 1st cacheline of 'request_queue' instance.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Veronika Kabatova <vkabatov@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
'struct percpu_ref' is often embedded into one user structure, and the
instance is usually referenced in fast path, however actually only
'percpu_count_ptr' is needed in fast path.
So move other fields into one new structure of 'percpu_ref_data', and
allocate it dynamically via kzalloc(), then memory footprint of
'percpu_ref' in fast path is reduced a lot and becomes suitable to put
into hot cacheline of user structure.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Veronika Kabatova <vkabatov@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Jakub Kicinski says:
====================
ethtool: allow dumping policies to user space
This series wires up ethtool policies to ops, so they can be
dumped to user space for feature discovery.
First patch wires up GET commands, and second patch wires up SETs.
The policy tables are trimmed to save space and LoC.
Next - take care of linking up nested policies for the header
(which is the policy what we actually care about). And once header
policy is linked make sure that attribute range validation for flags
is done by policy, not a conditions in the code. New type of policy
is needed to validate masks (patch 6).
Netlink as always staying a step ahead of all the other kernel
API interfaces :)
v2:
- merge patches 1 & 2 -> 1
- add patch 3 & 5
- remove .max_attr from struct ethnl_request_ops
====================
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Perform header flags validation through the policy.
Only pause command supports ETHTOOL_FLAG_STATS. Create a separate
policy to be able to express that in policy dumps to user space.
Note that even though the core will validate the header policy,
it cannot record multiple layers of attributes and we have to
re-parse header sub-attrs. When doing so we could skip attribute
validation, or use most permissive policy. Opt for the former.
We will no longer return the extack cookie for flags but since
we only added first new flag in this release it's not expected
that any user space had a chance to make use of it.
v2: - remove the re-validation in ethnl_parse_header_dev_get()
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We don't have good validation policy for existing unsigned int attrs
which serve as flags (for new ones we could use NLA_BITFIELD32).
With increased use of policy dumping having the validation be
expressed as part of the policy is important. Add validation
policy in form of a mask of supported/valid bits.
Support u64 in the uAPI to be future-proof, but really for now
the embedded mask member can only hold 32 bits, so anything with
bit 32+ set will always fail validation.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There's a number of policies which check if type is a uint or sint.
Factor the checking against the list of value sizes to a helper
for easier reuse.
v2: - new patch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To get the most out of parsing by the core, and to allow dumping
full policies we need to specify which policy applies to nested
attrs. For headers it's ethnl_header_policy.
$ sed -i 's@\(ETHTOOL_A_.*HEADER\].*=\) { .type = NLA_NESTED },@\1\n\t\tNLA_POLICY_NESTED(ethnl_header_policy),@' net/ethtool/*
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since ethtool uses strict attribute validation there's no need
to initialize all attributes in policy tables. 0 is NLA_UNSPEC
which is going to be rejected. Remove the NLA_REJECTs.
Similarly attributes above maxattrs are rejected, so there's
no need to always size the policy tables to ETHTOOL_A_..._MAX.
v2: - new patch
Suggested-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similarly to get commands wire up the policies of set commands
to get parsing by the core and policy dumps.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Wire up policies for get commands in struct nla_policy of the ethtool
family. Make use of genetlink code attr validation and parsing, as well
as allow dumping policies to user space.
For every ETHTOOL_MSG_*_GET:
- add 'ethnl_' prefix to policy name
- add extern declaration in net/ethtool/netlink.h
- wire up the policy & attr in ethtool_genl_ops[].
- remove .request_policy and .max_attr from ethnl_request_ops.
Obviously core only records the first "layer" of parsed attrs
so we still need to parse the sub-attrs of the nested header
attribute.
v2:
- merge of patches 1 and 2 from v1
- remove stray empty lines in ops
- also remove .max_attr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fabian Frederick says:
====================
drivers/net: add sw_netstats_rx_add helper
This small patchset creates netstats addition dev_sw_netstats_rx_add()
based on dev_lstats_add() and replaces some open coding
in both drivers/net and net branches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
use new helper for netstats settings
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
use new helper for netstats settings
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
use new helper for netstats settings
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
use new helper for netstats settings
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
use new helper for netstats settings
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
use new helper for netstats settings
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
use new helper for netstats settings
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
use new helper for netstats settings
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
some drivers/network protocols update rx bytes/packets under
u64_stats_update_begin/end sequence.
Add a specific helper like dev_lstats_add()
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Miscellaneous fixes
Here are some miscellaneous rxrpc fixes:
(1) Fix the xdr encoding of the contents read from an rxrpc key.
(2) Fix a BUG() for a unsupported encoding type.
(3) Fix missing _bh lock annotations.
(4) Fix acceptance handling for an incoming call where the incoming call
is encrypted.
(5) The server token keyring isn't network namespaced - it belongs to the
server, so there's no need. Namespacing it means that request_key()
fails to find it.
(6) Fix a leak of the server keyring.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a group that has TopDown members is failed to be scheduled, any
later TopDown groups will not return valid values.
Here is an example.
A background perf that occupies all the GP counters and the fixed
counter 1.
$perf stat -e "{cycles,cycles,cycles,cycles,cycles,cycles,cycles,
cycles,cycles}:D" -a
A user monitors a TopDown group. It works well, because the fixed
counter 3 and the PERF_METRICS are available.
$perf stat -x, --topdown -- ./workload
retiring,bad speculation,frontend bound,backend bound,
18.0,16.1,40.4,25.5,
Then the user tries to monitor a group that has TopDown members.
Because of the cycles event, the group is failed to be scheduled.
$perf stat -x, -e '{slots,topdown-retiring,topdown-be-bound,
topdown-fe-bound,topdown-bad-spec,cycles}'
-- ./workload
<not counted>,,slots,0,0.00,,
<not counted>,,topdown-retiring,0,0.00,,
<not counted>,,topdown-be-bound,0,0.00,,
<not counted>,,topdown-fe-bound,0,0.00,,
<not counted>,,topdown-bad-spec,0,0.00,,
<not counted>,,cycles,0,0.00,,
The user tries to monitor a TopDown group again. It doesn't work anymore.
$perf stat -x, --topdown -- ./workload
,,,,,
In a txn, cancel_txn() is to truncate the event_list for a canceled
group and update the number of events added in this transaction.
However, the number of TopDown events added in this transaction is not
updated. The kernel will probably fail to add new Topdown events.
Fixes: 7b2c05a15d29 ("perf/x86/intel: Generic support for hardware TopDown metrics")
Reported-by: Andi Kleen <ak@linux.intel.com>
Reported-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lkml.kernel.org/r/20201005082611.GH2628@hirez.programming.kicks-ass.net
|
|
Kan reported that n_metric gets corrupted for cancelled transactions;
a similar issue exists for n_pair for AMD's Large Increment thing.
The problem was confirmed and confirmed fixed by Kim using:
sudo perf stat -e "{cycles,cycles,cycles,cycles}:D" -a sleep 10 &
# should succeed:
sudo perf stat -e "{fp_ret_sse_avx_ops.all}:D" -a workload
# should fail:
sudo perf stat -e "{fp_ret_sse_avx_ops.all,fp_ret_sse_avx_ops.all,cycles}:D" -a workload
# previously failed, now succeeds with this patch:
sudo perf stat -e "{fp_ret_sse_avx_ops.all}:D" -a workload
Fixes: 5738891229a2 ("perf/x86/amd: Add support for Large Increment per Cycle Events")
Reported-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Link: https://lkml.kernel.org/r/20201005082516.GG2628@hirez.programming.kicks-ass.net
|
|
Igor Russkikh says:
====================
net: atlantic: phy tunables from mac driver
This series implements phy tunables settings via MAC driver callbacks.
AQC 10G devices use integrated MAC+PHY solution, where PHY is fully controlled
by MAC firmware. Therefore, it is not possible to implement separate phy driver
for these.
We use ethtool ops callbacks to implement downshift and EDPC tunables.
v3: fixed flaw in EDPD logic, from Andrew
v2: comments from Andrew
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Mediadetect is another name for the EDPD (energy detect power down).
This feature allows device to save extra power when no link is available.
PHY goes into the extreme power saving mode and only periodically wakes up
and checks for the link.
AQC devices has fixed check period of 6 seconds
The feature may increase linkup time.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
PHY downshift allows phy to try renegotiate if link is unstable
and can carry higher speed.
AQC devices has integrated PHY which is controlled by MAC firmware.
Thus, driver defines new ethtool callbacks to implement phy tunables
via netdev.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Define get/set phy tunable callbacks in ethtool ops.
This will allow MAC drivers with integrated PHY still to implement
these tunables.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently skb_dump has a restriction to only dump full packet for the
first 5 socket buffers, then only headers will be printed. Remove this
arbitrary and confusing restriction, which is only documented vaguely
("up to") in the comments above the prototype.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We got reports from GKE customers flows being reset by netfilter
conntrack unless nf_conntrack_tcp_be_liberal is set to 1.
Traces seemed to suggest ACK packet being dropped by the
packet capture, or more likely that ACK were received in the
wrong order.
wscale=7, SYN and SYNACK not shown here.
This ACK allows the sender to send 1871*128 bytes from seq 51359321 :
New right edge of the window -> 51359321+1871*128=51598809
09:17:23.389210 IP A > B: Flags [.], ack 51359321, win 1871, options [nop,nop,TS val 10 ecr 999], length 0
09:17:23.389212 IP B > A: Flags [.], seq 51422681:51424089, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 1408
09:17:23.389214 IP A > B: Flags [.], ack 51422681, win 1376, options [nop,nop,TS val 10 ecr 999], length 0
09:17:23.389253 IP B > A: Flags [.], seq 51424089:51488857, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 64768
09:17:23.389272 IP A > B: Flags [.], ack 51488857, win 859, options [nop,nop,TS val 10 ecr 999], length 0
09:17:23.389275 IP B > A: Flags [.], seq 51488857:51521241, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 32384
Receiver now allows to send 606*128=77568 from seq 51521241 :
New right edge of the window -> 51521241+606*128=51598809
09:17:23.389296 IP A > B: Flags [.], ack 51521241, win 606, options [nop,nop,TS val 10 ecr 999], length 0
09:17:23.389308 IP B > A: Flags [.], seq 51521241:51553625, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 32384
It seems the sender exceeds RWIN allowance, since 51611353 > 51598809
09:17:23.389346 IP B > A: Flags [.], seq 51553625:51611353, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 57728
09:17:23.389356 IP B > A: Flags [.], seq 51611353:51618393, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 7040
09:17:23.389367 IP A > B: Flags [.], ack 51611353, win 0, options [nop,nop,TS val 10 ecr 999], length 0
netfilter conntrack is not happy and sends RST
09:17:23.389389 IP A > B: Flags [R], seq 92176528, win 0, length 0
09:17:23.389488 IP B > A: Flags [R], seq 174478967, win 0, length 0
Now imagine ACK were delivered out of order and tcp_add_backlog() sets window based on wrong packet.
New right edge of the window -> 51521241+859*128=51631193
Normally TCP stack handles OOO packets just fine, but it
turns out tcp_add_backlog() does not. It can update the window
field of the aggregated packet even if the ACK sequence
of the last received packet is too old.
Many thanks to Alexandre Ferrieux for independently reporting the issue
and suggesting a fix.
Fixes: 4f693b55c3d2 ("tcp: implement coalescing on backlog queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
DA7219 uses I2S2 and I2S3 for input and output respectively. Commit
9e30251fb22e ("ASoC: mediatek: mt8183-da7219: support machine driver
with rt1015") introduces a bug that:
- If using I2S2 solely, MCLK to DA7219 is 256FS.
- If using I2S3 solely, MCLK to DA7219 is 128FS.
- If using I2S3 first and then I2S2, the MCLK changes from 128FS to
256FS. As a result, no sound output to the headset. Also no sound
input from the headset microphone.
Both I2S2 and I2S3 should set MCLK to 256FS. Fixes the wrong ops for
I2S3.
Fixes: 9e30251fb22e ("ASoC: mediatek: mt8183-da7219: support machine driver with rt1015")
Signed-off-by: Tzung-Bi Shih <tzungbi@google.com>
Link: https://lore.kernel.org/r/20201006101252.1890385-1-tzungbi@google.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
When get_registers() fails in set_ethernet_addr(),the uninitialized
value of node_id gets copied over as the address.
So, check the return value of get_registers().
If get_registers() executed successfully (i.e., it returns
sizeof(node_id)), copy over the MAC address using ether_addr_copy()
(instead of using memcpy()).
Else, if get_registers() failed instead, a randomly generated MAC
address is set as the MAC address instead.
Reported-by: syzbot+abbc768b560c84d92fd3@syzkaller.appspotmail.com
Tested-by: syzbot+abbc768b560c84d92fd3@syzkaller.appspotmail.com
Acked-by: Petko Manolov <petkan@nucleusys.com>
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently we skip calling tcp_cleanup_rbuf() when packets
are moved into the OoO queue or simply dropped. In both
cases we still increment tp->copied_seq, and we should
ask the TCP stack to check for ack.
Fixes: c76c6956566f ("mptcp: call tcp_cleanup_rbuf on subflows")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently data fin on data packet are not handled properly:
the 'rcv_data_fin_seq' field is interpreted as the last
sequence number carrying a valid data, but for data fin
packet with valid maps we currently store map_seq + map_len,
that is, the next value.
The 'write_seq' fields carries instead the value subseguent
to the last valid byte, so in mptcp_write_data_fin() we
never detect correctly the last DSS map.
Fixes: 7279da6145bb ("mptcp: Use MPTCP-level flag for sending DATA_FIN")
Fixes: 1a49b2c2a501 ("mptcp: Handle incoming 32-bit DATA_FIN values")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vladimir Oltean says:
====================
Fix tail dropping watermarks for Ocelot switches
This series adds a missing division by 60, and a warning to prevent that
in the future.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is an upper bound to the value that a watermark may hold. That
upper bound is not immediately obvious during configuration, and it
might be possible to have accidental truncation.
Actually this has happened already, add a warning to prevent it from
happening again.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tail dropping is enabled for a port when:
1. A source port consumes more packet buffers than the watermark encoded
in SYS:PORT:ATOP_CFG.ATOP.
AND
2. Total memory use exceeds the consumption watermark encoded in
SYS:PAUSE_CFG:ATOP_TOT_CFG.
The unit of these watermarks is a 60 byte memory cell. That unit is
programmed properly into ATOP_TOT_CFG, but not into ATOP. Actually when
written into ATOP, it would get truncated and wrap around.
Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The rcu_read_lock() is not supposed to lock the kernel_sendmsg() API
since it has the lock_sock() in qrtr_sendmsg() which will sleep. Hence,
fix it by excluding the locking for kernel_sendmsg().
While at it, let's also use radix_tree_deref_retry() to confirm the
validity of the pointer returned by radix_tree_deref_slot() and use
radix_tree_iter_resume() to resume iterating the tree properly before
releasing the lock as suggested by Doug.
Fixes: a7809ff90ce6 ("net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks")
Reported-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alex Elder <elder@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During memory hot-add, dlpar_add_lmb() calls memory_add_physaddr_to_nid()
to determine which node id (nid) to use when later calling __add_memory().
This is wasteful. On pseries, memory_add_physaddr_to_nid() finds an
appropriate nid for a given address by looking up the LMB containing the
address and then passing that LMB to of_drconf_to_nid_single() to get the
nid. In dlpar_add_lmb() we get this address from the LMB itself.
In short, we have a pointer to an LMB and then we are searching for
that LMB *again* in order to find its nid.
If we call of_drconf_to_nid_single() directly from dlpar_add_lmb() we
can skip the redundant lookup. The only error handling we need to
duplicate from memory_add_physaddr_to_nid() is the fallback to the
default nid when drconf_to_nid_single() returns -1 (NUMA_NO_NODE) or
an invalid nid.
Skipping the extra lookup makes hot-add operations faster, especially
on machines with many LMBs.
Consider an LPAR with 126976 LMBs. In one test, hot-adding 126000
LMBs on an upatched kernel took ~3.5 hours while a patched kernel
completed the same operation in ~2 hours:
Unpatched (12450 seconds):
Sep 9 04:06:31 ltc-brazos1 drmgr[810169]: drmgr: -c mem -a -q 126000
Sep 9 04:06:31 ltc-brazos1 kernel: pseries-hotplug-mem: Attempting to hot-add 126000 LMB(s)
[...]
Sep 9 07:34:01 ltc-brazos1 kernel: pseries-hotplug-mem: Memory at 20000000 (drc index 80000002) was hot-added
Patched (7065 seconds):
Sep 8 21:49:57 ltc-brazos1 drmgr[877703]: drmgr: -c mem -a -q 126000
Sep 8 21:49:57 ltc-brazos1 kernel: pseries-hotplug-mem: Attempting to hot-add 126000 LMB(s)
[...]
Sep 8 23:27:42 ltc-brazos1 kernel: pseries-hotplug-mem: Memory at 20000000 (drc index 80000002) was hot-added
It should be noted that the speedup grows more substantial when
hot-adding LMBs at the end of the drconf range. This is because we
are skipping a linear LMB search.
To see the distinction, consider smaller hot-add test on the same
LPAR. A perf-stat run with 10 iterations showed that hot-adding 4096
LMBs completed less than 1 second faster on a patched kernel:
Unpatched:
Performance counter stats for 'drmgr -c mem -a -q 4096' (10 runs):
104,753.42 msec task-clock # 0.992 CPUs utilized ( +- 0.55% )
4,708 context-switches # 0.045 K/sec ( +- 0.69% )
2,444 cpu-migrations # 0.023 K/sec ( +- 1.25% )
394 page-faults # 0.004 K/sec ( +- 0.22% )
445,902,503,057 cycles # 4.257 GHz ( +- 0.55% ) (66.67%)
8,558,376,740 stalled-cycles-frontend # 1.92% frontend cycles idle ( +- 0.88% ) (49.99%)
300,346,181,651 stalled-cycles-backend # 67.36% backend cycles idle ( +- 0.76% ) (50.01%)
258,091,488,691 instructions # 0.58 insn per cycle
# 1.16 stalled cycles per insn ( +- 0.22% ) (66.67%)
70,568,169,256 branches # 673.660 M/sec ( +- 0.17% ) (50.01%)
3,100,725,426 branch-misses # 4.39% of all branches ( +- 0.20% ) (49.99%)
105.583 +- 0.589 seconds time elapsed ( +- 0.56% )
Patched:
Performance counter stats for 'drmgr -c mem -a -q 4096' (10 runs):
104,055.69 msec task-clock # 0.993 CPUs utilized ( +- 0.32% )
4,606 context-switches # 0.044 K/sec ( +- 0.20% )
2,463 cpu-migrations # 0.024 K/sec ( +- 0.93% )
394 page-faults # 0.004 K/sec ( +- 0.25% )
442,951,129,921 cycles # 4.257 GHz ( +- 0.32% ) (66.66%)
8,710,413,329 stalled-cycles-frontend # 1.97% frontend cycles idle ( +- 0.47% ) (50.06%)
299,656,905,836 stalled-cycles-backend # 67.65% backend cycles idle ( +- 0.39% ) (50.02%)
252,731,168,193 instructions # 0.57 insn per cycle
# 1.19 stalled cycles per insn ( +- 0.20% ) (66.66%)
68,902,851,121 branches # 662.173 M/sec ( +- 0.13% ) (49.94%)
3,100,242,882 branch-misses # 4.50% of all branches ( +- 0.15% ) (49.98%)
104.829 +- 0.325 seconds time elapsed ( +- 0.31% )
This is consistent. An add-by-count hot-add operation adds LMBs
greedily, so LMBs near the start of the drconf range are considered
first. On an otherwise idle LPAR with so many LMBs we would expect to
find the LMBs we need near the start of the drconf range, hence the
smaller speedup.
Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200916145122.3408129-1-cheloha@linux.ibm.com
|
|
Add a selftest to test the basic functionality of CONFIG_RTAS_FILTER.
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
[mpe: Change rmo_start/end to 32-bit to avoid build errors on ppc64]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200820044512.7543-2-ajd@linux.ibm.com
|
|
A number of userspace utilities depend on making calls to RTAS to retrieve
information and update various things.
The existing API through which we expose RTAS to userspace exposes more
RTAS functionality than we actually need, through the sys_rtas syscall,
which allows root (or anyone with CAP_SYS_ADMIN) to make any RTAS call they
want with arbitrary arguments.
Many RTAS calls take the address of a buffer as an argument, and it's up to
the caller to specify the physical address of the buffer as an argument. We
allocate a buffer (the "RMO buffer") in the Real Memory Area that RTAS can
access, and then expose the physical address and size of this buffer in
/proc/powerpc/rtas/rmo_buffer. Userspace is expected to read this address,
poke at the buffer using /dev/mem, and pass an address in the RMO buffer to
the RTAS call.
However, there's nothing stopping the caller from specifying whatever
address they want in the RTAS call, and it's easy to construct a series of
RTAS calls that can overwrite arbitrary bytes (even without /dev/mem
access).
Additionally, there are some RTAS calls that do potentially dangerous
things and for which there are no legitimate userspace use cases.
In the past, this would not have been a particularly big deal as it was
assumed that root could modify all system state freely, but with Secure
Boot and lockdown we need to care about this.
We can't fundamentally change the ABI at this point, however we can address
this by implementing a filter that checks RTAS calls against a list
of permitted calls and forces the caller to use addresses within the RMO
buffer.
The list is based off the list of calls that are used by the librtas
userspace library, and has been tested with a number of existing userspace
RTAS utilities. For compatibility with any applications we are not aware of
that require other calls, the filter can be turned off at build time.
Cc: stable@vger.kernel.org
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200820044512.7543-1-ajd@linux.ibm.com
|