This converts one of the two users of mmu_notifiers to use the new API.
The conversion is fairly straightforward; however, the existing use of
notifiers here seems to be racy.
Link: https://lore.kernel.org/r/20191112202231.3856-7-jgg@ziepe.ca
Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Replace the internal interval tree based mmu notifier with the new common
mmu_interval_notifier_insert() API. This removes a lot of code and fixes a
deadlock that can be triggered in ODP:
  zap_page_range()
   mmu_notifier_invalidate_range_start()
    [..]
     ib_umem_notifier_invalidate_range_start()
       down_read(&per_mm->umem_rwsem)
   unmap_single_vma()
    [..]
      __split_huge_page_pmd()
        mmu_notifier_invalidate_range_start()
        [..]
           ib_umem_notifier_invalidate_range_start()
              down_read(&per_mm->umem_rwsem)   // DEADLOCK
        mmu_notifier_invalidate_range_end()
           up_read(&per_mm->umem_rwsem)
   mmu_notifier_invalidate_range_end()
      up_read(&per_mm->umem_rwsem)
The umem_rwsem is held across the range_start/end as the ODP algorithm for
invalidate_range_end cannot tolerate changes to the interval
tree. However, due to the nested invalidation regions the second
down_read() can deadlock if there are competing writers. The new core code
provides an alternative scheme to solve this problem.
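The nested read-lock hazard in the call chain above can be modelled in a few lines. This is a minimal single-threaded sketch, assuming a writer-fair semaphore in which new readers queue behind any waiting writer; `struct demo_rwsem` and the `demo_*` helpers are hypothetical and are not the kernel's rwsem implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a writer-fair rwsem; not the kernel implementation. */
struct demo_rwsem {
	int readers;		/* current read holders */
	int writers_waiting;	/* writers queued behind the readers */
};

/* Returns true if a read lock can be taken without blocking. */
bool demo_down_read_trylock(struct demo_rwsem *sem)
{
	if (sem->writers_waiting > 0)
		return false;	/* fairness: queue behind the writer */
	sem->readers++;
	return true;
}

void demo_up_read(struct demo_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* A writer arrives and starts waiting for the readers to drain. */
void demo_writer_queue(struct demo_rwsem *sem)
{
	sem->writers_waiting++;
}

/* The writer can only run once every reader has dropped the lock. */
bool demo_writer_can_run(const struct demo_rwsem *sem)
{
	return sem->readers == 0;
}
```

If the outer range_start holds the read side and a writer queues up before the nested range_start, the nested reader waits for the writer and the writer waits for the outer reader, so neither can make progress.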
Fixes: ca748c39ea3f ("RDMA/umem: Get rid of per_mm->notifier_count")
Link: https://lore.kernel.org/r/20191112202231.3856-6-jgg@ziepe.ca
Tested-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Only the function calls are stubbed out with static inlines that always
fail. This is the standard way to write a header for an optional component
and makes it easier for drivers that only optionally need HMM_MIRROR.
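The stub pattern can be sketched as follows. This is an illustration only, not the actual hmm.h contents: `CONFIG_DEMO_MIRROR`, `struct demo_mirror`, `demo_mirror_register()` and `demo_driver_init()` are hypothetical names, and `-EOPNOTSUPP` is an assumed choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical optional component; all names here are illustrative. */
struct demo_mirror {
	int registered;
};

#ifdef CONFIG_DEMO_MIRROR
/* Real implementation, built only when the option is enabled. */
int demo_mirror_register(struct demo_mirror *mirror);
#else
/* Option disabled: the call still compiles, but always fails. */
static inline int demo_mirror_register(struct demo_mirror *mirror)
{
	(void)mirror;
	return -EOPNOTSUPP;
}
#endif

/* A driver that only optionally needs the component needs no #ifdef. */
static inline int demo_driver_init(struct demo_mirror *mirror)
{
	int ret;

	assert(mirror != NULL);
	ret = demo_mirror_register(mirror);
	if (ret == -EOPNOTSUPP)
		return 0;	/* feature unavailable, carry on without it */
	return ret;
}
```

Because the types stay visible and only the calls are stubbed, the driver source compiles identically in both configurations.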
Link: https://lore.kernel.org/r/20191112202231.3856-5-jgg@ziepe.ca
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
hmm_mirror's handling of ranges does not use a sequence count which
results in this bug:
         CPU0                   CPU1
                     hmm_range_wait_until_valid(range)
                         valid == true
                     hmm_range_fault(range)
hmm_invalidate_range_start()
   range->valid = false
hmm_invalidate_range_end()
   range->valid = true
                     hmm_range_valid(range)
                          valid == true
Where the hmm_range_valid() should not have succeeded.
Adding the required sequence count would make it nearly identical to the
new mmu_interval_notifier. Instead replace the hmm_mirror stuff with
mmu_interval_notifier.
Co-existence of the two APIs is the first step.
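The difference between a valid flag and a sequence count can be shown by replaying the interleaving above on a single thread. This is a minimal sketch under stated assumptions: `struct demo_range` and the `demo_*` functions are hypothetical, and the real code uses proper locking and memory ordering that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a tracked range; not the hmm_range layout. */
struct demo_range {
	bool valid;		/* flag-only scheme */
	unsigned long seq;	/* bumped on every invalidation */
};

void demo_invalidate_start(struct demo_range *r)
{
	r->valid = false;
	r->seq++;
}

void demo_invalidate_end(struct demo_range *r)
{
	r->valid = true;
}

/* Flag check: only sees the flag's current value. */
bool demo_flag_still_valid(const struct demo_range *r)
{
	return r->valid;
}

/* Sequence check: compares against the value sampled before faulting. */
bool demo_seq_still_valid(const struct demo_range *r, unsigned long snap)
{
	return r->seq == snap;
}

/*
 * Replay the problematic interleaving and report whether each scheme
 * noticed the invalidation that ran during the fault.
 */
void demo_replay(bool *flag_noticed, bool *seq_noticed)
{
	struct demo_range r = { .valid = true, .seq = 0 };
	unsigned long snap;

	assert(flag_noticed && seq_noticed);

	snap = r.seq;			/* faulting side: range seen as valid */
	/* faulting side: the range fault would run here */
	demo_invalidate_start(&r);	/* invalidating side */
	demo_invalidate_end(&r);	/* invalidating side */

	*flag_noticed = !demo_flag_still_valid(&r);
	*seq_noticed = !demo_seq_still_valid(&r, snap);
}
```

The flag has returned to true by the time it is rechecked, so the invalidation goes unnoticed; the counter has moved on, so the comparison fails and the caller knows to retry.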
Link: https://lore.kernel.org/r/20191112202231.3856-4-jgg@ziepe.ca
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Of the 13 users of mmu_notifiers, 8 of them use only
invalidate_range_start/end() and immediately intersect the
mmu_notifier_range with some kind of internal list of VAs. 4 use an
interval tree (i915_gem, radeon_mn, umem_odp, hfi1). 4 use a linked list
of some kind (scif_dma, vhost, gntdev, hmm)
And the remaining 5 either don't use invalidate_range_start() or do some
special thing with it.
It turns out that building a correct scheme with an interval tree is
pretty complicated, particularly if the use case is synchronizing against
another thread doing get_user_pages(). Many of these implementations have
various subtle and difficult to fix races.
This approach puts the interval tree as common code at the top of the mmu
notifier call tree and implements a shareable locking scheme.
It includes:
- An interval tree tracking VA ranges, with per-range callbacks
- A read/write locking scheme for the interval tree that avoids
sleeping in the notifier path (for OOM killer)
- A sequence counter based collision-retry locking scheme to tell
device page fault that a VA range is being concurrently invalidated.
This is based on various ideas:
- hmm accumulates invalidated VA ranges and releases them when all
invalidates are done, via active_invalidate_ranges count.
This approach avoids having to intersect the interval tree twice (as
umem_odp does) at the potential cost of a longer device page fault.
- kvm/umem_odp use a sequence counter to drive the collision retry,
via invalidate_seq
- a deferred work todo list on unlock scheme like RTNL, via deferred_list.
This makes adding/removing interval tree members more deterministic
- seqlock, except this version makes the seqlock idea multi-holder on the
write side by protecting it with active_invalidate_ranges and a spinlock
To minimize MM overhead when only the interval tree is being used, the
entire SRCU and hlist overheads are dropped using some simple
branches. Similarly the interval tree overhead is dropped when in hlist
mode.
The overhead from the mandatory spinlock is broadly the same as for most of
the existing users, which already had a lock (or two) of some sort on the
invalidation path.
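The multi-holder seqlock idea can be sketched as below. This is a simplified single-threaded model and not a transcription of the patch: the field names `active_invalidate_ranges` and `invalidate_seq` come from the text above, while `struct demo_notifier_mm`, the `demo_*` helpers, and the odd/even convention are assumptions made for illustration. The spinlock that protects these fields in real code is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; real code protects these fields with a spinlock. */
struct demo_notifier_mm {
	unsigned int active_invalidate_ranges;	/* nested/concurrent holders */
	unsigned long invalidate_seq;		/* odd while any holder is active */
};

void demo_inv_start(struct demo_notifier_mm *mm)
{
	mm->active_invalidate_ranges++;
	mm->invalidate_seq |= 1;	/* first holder makes the count odd */
}

void demo_inv_end(struct demo_notifier_mm *mm)
{
	assert(mm->active_invalidate_ranges > 0);
	if (--mm->active_invalidate_ranges == 0)
		mm->invalidate_seq++;	/* last holder makes it even again */
}

/* Device page fault side: sample the sequence before doing the work. */
unsigned long demo_read_begin(const struct demo_notifier_mm *mm)
{
	return mm->invalidate_seq;
}

/* Returns true if the work collided with an invalidation and must retry. */
bool demo_read_retry(const struct demo_notifier_mm *mm, unsigned long seq)
{
	return (seq & 1) || mm->invalidate_seq != seq;
}
```

Unlike a plain seqlock, several invalidations may overlap on the write side; the count only becomes even again when the last of them finishes.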
Link: https://lore.kernel.org/r/20191112202231.3856-3-jgg@ziepe.ca
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The PROM only enables the Ethernet PHY on the first Origin 200 module, so
we must do it ourselves for the second module.
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-rtc@vger.kernel.org
Cc: linux-serial@vger.kernel.org
|
|
Generation of fake subdevice ID had vendor and device ID swapped.
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-rtc@vger.kernel.org
Cc: linux-serial@vger.kernel.org
|
|
Pull last minute virtio bugfixes from Michael Tsirkin:
"Minor bugfixes all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: fix shrinker count
virtio_balloon: fix shrinker scan number of pages
virtio_console: allocate inbufs in add_port() only if it is needed
virtio_ring: fix return code on DMA mapping fails
|
|
rhashtable_lookup_fast() internally calls rcu_read_lock() and then
calls rhashtable_lookup(). So if rcu_read_lock() is already held,
rhashtable_lookup() is enough.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.5
Last set of patches for v5.5. Major features here are 802.11ax support for
qtnfmac and airtime fairness support for mt76. And naturally smaller
fixes and improvements all over.
Major changes:
qtnfmac
* add 802.11ax support in AP mode
* enable offload bridging support
iwlwifi
* support TX/RX antennas reporting
mt76
* mt7615 smart carrier sense support
* aggregation statistics via debugfs
* airtime fairness (ATF) support
* mt76x0 OF mac address support
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Robert Schwebel says:
====================
here is v2 of the series converting the NFC documentation from txt to
rst. Thanks to Jonathan and Dave for the input.
Changes since (implicit) v1:
* replace code-block by more compact :: syntax
* really add the rst file to the index
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Now that the sphinx syntax has been fixed, change the document from txt
to rst and add it to the index.
Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Silence this warning:
Documentation/networking/nfc.rst:113: WARNING: Definition list ends without
a blank line; unexpected unindent.
Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Fix this warning:
Documentation/networking/nfc.rst:87: WARNING: Bullet list ends without
a blank line; unexpected unindent.
Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Change the block diagram to match the sphinx syntax. This will make it
possible to switch this file to rst in the future.
Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The headlines in this file are not in the standard kernel documentation
headline format. Change them so this file can be switched to
rst in the future.
Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
When a phydev is created, the speed and duplex are set to zero and
-1 respectively, rather than using the predefined SPEED_UNKNOWN and
DUPLEX_UNKNOWN constants.
There is a window at initialisation time where we may report link
down using the 0/-1 values. Tidy this up and use the predefined
constants, so debug doesn't complain with:
"Unsupported (update phy-core.c)/Unsupported (update phy-core.c)"
when the speed and duplex settings are printed.
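The effect can be reproduced with a small stand-alone model of the string lookup. The `SPEED_*` and `DUPLEX_*` values below repeat the kernel's uapi ethtool.h definitions so the sketch builds on its own; the `demo_*` functions are hypothetical and only approximate the phy-core.c helpers.

```c
#include <stdbool.h>
#include <string.h>

/* Values as in the kernel's uapi ethtool.h, repeated for a standalone build. */
#define SPEED_10	10
#define SPEED_100	100
#define SPEED_1000	1000
#define SPEED_UNKNOWN	(-1)
#define DUPLEX_HALF	0x00
#define DUPLEX_FULL	0x01
#define DUPLEX_UNKNOWN	0xff

#define DEMO_UNSUPPORTED "Unsupported (update phy-core.c)"

/* Simplified model of a speed-to-string helper (subset of speeds only). */
const char *demo_speed_to_str(int speed)
{
	switch (speed) {
	case SPEED_10:
		return "10Mbps";
	case SPEED_100:
		return "100Mbps";
	case SPEED_1000:
		return "1Gbps";
	case SPEED_UNKNOWN:
		return "Unknown";
	default:
		return DEMO_UNSUPPORTED;
	}
}

/* Simplified model of a duplex-to-string helper. */
const char *demo_duplex_to_str(unsigned int duplex)
{
	if (duplex == DUPLEX_HALF)
		return "Half";
	if (duplex == DUPLEX_FULL)
		return "Full";
	if (duplex == DUPLEX_UNKNOWN)
		return "Unknown";
	return DEMO_UNSUPPORTED;
}

/* True if printing this speed/duplex pair would show the complaint. */
bool demo_prints_unsupported(int speed, unsigned int duplex)
{
	return !strcmp(demo_speed_to_str(speed), DEMO_UNSUPPORTED) ||
	       !strcmp(demo_duplex_to_str(duplex), DEMO_UNSUPPORTED);
}
```

A speed of 0 and a duplex of -1 match no known constant and fall through to the "Unsupported" string, whereas SPEED_UNKNOWN and DUPLEX_UNKNOWN print as "Unknown".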
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
There are no users of phy_ethtool_sset() in the kernel anymore, and
as of commit 3c1bcc8614db ("net: ethernet: Convert phydev advertize
and supported from u32 to link mode"), the implementation is slightly
buggy - it doesn't correctly check the masked advertising mask as it
used to.
Remove it, and update the phy documentation to refer to its replacement
function.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
This commit reverts commit 91e6015b082b ("bpf: Emit audit messages
upon successful prog load and unload") and its follow up commit
7599a896f2e4 ("audit: Move audit_log_task declaration under
CONFIG_AUDITSYSCALL") as requested by Paul Moore. The change needs
close review on linux-audit, tests etc.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Commit 37e4c997dadf ("KVM: VMX: validate individual bits of guest
MSR_IA32_FEATURE_CONTROL") broke the KVM_SET_MSRS ABI by instituting
new constraints on the data values that kvm would accept for the guest
MSR, IA32_FEATURE_CONTROL. Perhaps these constraints should have been
opt-in via a new KVM capability, but they were applied
indiscriminately, breaking at least one existing hypervisor.
Relax the constraints to allow either or both of
FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX and
FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX to be set when nVMX is
enabled. This change is sufficient to fix the aforementioned breakage.
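A relaxed check of this kind can be sketched as a mask test. The bit positions follow the IA32_FEATURE_CONTROL layout (lock bit 0, VMXON inside SMX bit 1, VMXON outside SMX bit 2); `demo_feature_control_valid()` is a hypothetical helper, and treating the lock bit as always acceptable is an assumption of this sketch rather than something stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions per the IA32_FEATURE_CONTROL MSR layout. */
#define FEATURE_CONTROL_LOCKED				(1ULL << 0)
#define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX	(1ULL << 1)
#define FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX	(1ULL << 2)

/*
 * Hypothetical helper: accept a guest-supplied value if it sets no bit
 * outside the allowed mask. With nested VMX enabled, either or both of
 * the VMXON bits may be set.
 */
bool demo_feature_control_valid(uint64_t data, bool nested_vmx)
{
	uint64_t valid_bits = FEATURE_CONTROL_LOCKED;	/* assumed always OK */

	if (nested_vmx)
		valid_bits |= FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX |
			      FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;

	return (data & ~valid_bits) == 0;
}
```

A mask test of this form accepts any subset of the permitted bits, so userspace that sets only one of the two VMXON bits is not rejected.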
Fixes: 37e4c997dadf ("KVM: VMX: validate individual bits of guest MSR_IA32_FEATURE_CONTROL")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Acquire kvm->srcu for the duration of ->set_nested_state() to fix a bug
where nVMX dereferences ->memslots without holding ->srcu or ->slots_lock.
The other half of nested migration, ->get_nested_state(), does not need
to acquire ->srcu as it is purely a dump of internal KVM (and CPU)
state to userspace.
Detected as an RCU lockdep splat that is 100% reproducible by running
KVM's state_test selftest with CONFIG_PROVE_LOCKING=y. Note that the
failing function, kvm_is_visible_gfn(), is only checking the validity of
a gfn, it's not actually accessing guest memory (which is more or less
unsupported during vmx_set_nested_state() due to incorrect MMU state),
i.e. vmx_set_nested_state() itself isn't fundamentally broken. In any
case, setting nested state isn't a fast path so there's no reason to go
out of our way to avoid taking ->srcu.
=============================
WARNING: suspicious RCU usage
5.4.0-rc7+ #94 Not tainted
-----------------------------
include/linux/kvm_host.h:626 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by evmcs_test/10939:
#0: ffff88826ffcb800 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x630 [kvm]
stack backtrace:
CPU: 1 PID: 10939 Comm: evmcs_test Not tainted 5.4.0-rc7+ #94
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
dump_stack+0x68/0x9b
kvm_is_visible_gfn+0x179/0x180 [kvm]
mmu_check_root+0x11/0x30 [kvm]
fast_cr3_switch+0x40/0x120 [kvm]
kvm_mmu_new_cr3+0x34/0x60 [kvm]
nested_vmx_load_cr3+0xbd/0x1f0 [kvm_intel]
nested_vmx_enter_non_root_mode+0xab8/0x1d60 [kvm_intel]
vmx_set_nested_state+0x256/0x340 [kvm_intel]
kvm_arch_vcpu_ioctl+0x491/0x11a0 [kvm]
kvm_vcpu_ioctl+0xde/0x630 [kvm]
do_vfs_ioctl+0xa2/0x6c0
ksys_ioctl+0x66/0x70
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x54/0x200
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f59a2b95f47
Fixes: 8fcc4b5923af5 ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fold shared_msr_update() into its sole user to eliminate its pointless
bounds check, its godawful printk, its misleading comment (it's called
under a global lock), and its woefully inaccurate name.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The jump labels out_free_1 and out_free_2 deal with
the same stuff, so get rid of one and rename the
label out_free_0a to retain the label name order.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
A recent change inadvertently exported a static function, which results
in modpost throwing a warning. Fix it.
Fixes: cbbaa2727aa3 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf report:
Jin Yao:
- Allow entering the annotation view (symbol source/assembly +
overhead/cycles/etc column) from the 'perf report --total-cycles'
interface.
E.g.:
# perf record --all-cpus --branch-any --all-kernel
^C[ perf record: Woken up 5 times to write data ]
#
# perf evlist -v
cycles: size: 120, { sample_period, sample_freq }: 4000,
sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK,
read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, mmap: 1, comm: 1, freq: 1, task: 1,
precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1,
bpf_event: 1, branch_sample_type: ANY
#
# perf report --total-cycles
#
# Samples: 78762 of event 'cycles'
Sampled Sampled Avg Avg
Cycles% Cycles Cycles% Cycles [Program Block Range] Shared Object
1.72% 95.8K 0.00% 254 [msr.h:105 -> msr.h:166] [kernel.vmlinux]
1.56% 107.6K 0.00% 618 [compiler.h:199 -> common.c:301] [kernel.vmlinux]
0.83% 46.3K 0.00% 409 [entry_64.S:153 -> entry_64.S:175] [kernel.vmlinux]
0.83% 46.1K 0.00% 83 [jump_label.h:41 -> tsc.c:230] [kernel.vmlinux]
0.64% 36.9K 0.01% 1.4K [hda_intel.c:904 -> hda_intel.c:916] [snd_hda_intel]
0.57% 30.2K 0.00% 282 [file.c:710 -> file.c:730] [kernel.vmlinux]
0.48% 25.8K 0.00% 82 [spinlock.c:158 -> spinlock.c:160] [kernel.vmlinux]
0.45% 23.7K 0.00% 369 [tick-broadcast.c:585 -> tick-broadcast.c:586] [kernel.vmlinux]
0.44% 24.4K 0.00% 73 [msr.h:236 -> tsc.c:1088] [kernel.vmlinux]
0.43% 22.7K 0.00% 144 [cpuidle.c:229 -> cpuidle.c:232] [kernel.vmlinux]
Then press 'A' or Enter on one of those lines, just like with 'perf top', say
the top one: [msr.h:105 -> msr.h:166], then this shows up:
Samples: 78K of event 'cycles', 4000 Hz, Event count (approx.): 78762
native_write_msr /lib/modules/5.4.0-rc8/build/vmlinux [Percent: local period]
Percent│ IPC Cycle (Average IPC: 0.02, IPC Coverage: 50.0%)
│
│ Disassembly of section .text:
│
│ ffffffff8106c480 <native_write_msr>:
│ __wrmsr():
│ return EAX_EDX_VAL(val, low, high);
│ }
│
│ static inline void notrace __wrmsr(unsigned int msr, u32 low, u32 high)
│ {
│ asm volatile("1: wrmsr\n"
49.16 │0.02 mov %edi,%ecx
│0.02 mov %esi,%eax
│0.02 wrmsr
│ arch_static_branch():
│ #include <linux/stringify.h>
│ #include <linux/types.h>
│
│ static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
│ {
│ asm_volatile_goto("1:"
0.79 │0.02 nop
│ native_write_msr():
│ {
│ __wrmsr(msr, low, high);
│
│ if (msr_tracepoint_active(__tracepoint_write_msr))
│ do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
│ }
50.05 │0.02 254 ← retq
│ do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
│ shl $0x20,%rdx
│ mov %esi,%esi
│ or %rdx,%rsi
│ xor %edx,%edx
│ → jmpq do_trace_write_msr
We need to improve this to show the source code line numbers in the
annotation view, so one can go from that program block to the annotation view
and see those source code line numbers straight away.
auxtrace/Intel PT:
Adrian Hunter:
- Add support for AUX area sampling. This requires new functionality that
will land in 5.5; it's already in tip.
This includes kernel capability querying so that it fails gracefully
with older kernels, and dumping aux area samples in 'perf report -D' and
'perf script'.
perf.data:
Alexey Budankov:
- Fix decompression of PERF_RECORD_COMPRESSED records.
core:
Arnaldo Carvalho de Melo:
- Use the 'dcacheline' cmp routine to find the right DSOs taking into
account the 'maj', 'min', 'ino' and 'ino_generation', that got moved
from 'struct map' to 'struct dso', where it belongs.
This further reduces the size of 'struct map', there is still more
work to do to maybe get it to max one cacheline.
libtraceevent:
Hewenliang:
- Fix memory leakage in copy_filter_type().
Sudip Mukherjee:
- Fix header installation.
perf parse:
Ian Rogers:
- Fix potential memory leak when handling tracepoint errors, found using
LLVM's libFuzzer.
perf probe:
Colin Ian King:
- Fix spelling mistake "addrees" -> "address".
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Part of the documentation is taken from the README of the userspace
utils (https://github.com/vitorafsr/i8kutils). The license is GPL-2+
and the author Massimo Dal Zotto is already credited as author of
the module. Therefore there should be no copyright problem.
I also added a paragraph with specific information on the experimental
support for automatic BIOS fan control.
Signed-off-by: Giovanni Mascellani <gio@debian.org>
Link: https://lore.kernel.org/r/20191122101519.1246458-2-gio@debian.org
[groeck: Fixed some of the documentation warnings]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
This patch exports the standard hwmon pwmX_enable sysfs attribute for
enabling or disabling automatic fan control by the BIOS. The standard value
"1" disables automatic BIOS fan control and the value "2" enables
it.
By default, BIOS auto mode is enabled by the laptop firmware.
When BIOS auto mode is enabled, a custom fan speed value (set via the hwmon
pwmX sysfs attribute) is overwritten by SMM within a few seconds, so
any custom setting has no effect. This is why an option for
disabling BIOS auto mode is needed.
With this patch the kernel can set and control the fan speed on
laptops, but it can be dangerous (like setting the speed of other fans).
The SMM commands to enable or disable automatic fan control are not
documented and are not the same on all Dell laptops. Therefore a
whitelist is used to send the correct codes only on laptops for which
they are known.
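The attribute semantics and the whitelist lookup can be sketched as follows. Everything here is hypothetical: the `demo_*` names, the model strings and the command codes are placeholders invented for illustration and do not correspond to real SMM commands or to the driver's data structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum demo_fan_mode { DEMO_FAN_MANUAL, DEMO_FAN_BIOS_AUTO };

/* Map a value written to pwmX_enable: 1 = manual, 2 = BIOS automatic. */
int demo_pwm_enable_to_mode(long val, enum demo_fan_mode *mode)
{
	assert(mode != NULL);

	switch (val) {
	case 1:
		*mode = DEMO_FAN_MANUAL;
		return 0;
	case 2:
		*mode = DEMO_FAN_BIOS_AUTO;
		return 0;
	default:
		return -EINVAL;
	}
}

/* Placeholder whitelist entry; model names and codes are made up. */
struct demo_fan_codes {
	const char *model;
	unsigned int manual_cmd;
	unsigned int auto_cmd;
};

static const struct demo_fan_codes demo_whitelist[] = {
	{ "Example Laptop A", 0x01, 0x02 },
	{ "Example Laptop B", 0x11, 0x12 },
};

/* Returns the codes for a whitelisted model, or NULL if it is unknown. */
const struct demo_fan_codes *demo_lookup_codes(const char *model)
{
	size_t i;

	for (i = 0; i < sizeof(demo_whitelist) / sizeof(demo_whitelist[0]); i++)
		if (strcmp(demo_whitelist[i].model, model) == 0)
			return &demo_whitelist[i];
	return NULL;
}
```

A model missing from the table yields NULL, so no undocumented command is ever sent to hardware for which the codes are not known.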
This patch was originally developed by Pali Rohár; later Giovanni
Mascellani implemented the whitelist.
Signed-off-by: Giovanni Mascellani <gio@debian.org>
Co-Developed-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Link: https://lore.kernel.org/r/20191122101519.1246458-1-gio@debian.org
[groeck: Fixed checkpatch warnings]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Conflicts:
arch/riscv/boot/Makefile
arch/riscv/include/asm/sbi.h
|
|
Edward Cree says:
====================
A series of changes to how we check filters for expiry, manage how much
of that work to do & when, etc.
Prompted by some pathological behaviour under heavy load, which was
Reported-by: David Ahern <dahern@digitalocean.com>
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
If there's no traffic on a channel, its ARFS expiry work will never get
scheduled by efx_poll() as that isn't being run.
So make efx_filter_rfs_expire() reschedule itself to run after 30 seconds.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Report the number of successful and failed insertions, and also the
current count of filters, to aid in tuning e.g. rps_flow_cnt.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
In high connection count usage, the NIC's filter table may be filled with
sufficiently many ARFS filters that further insertions fail. As this
does not represent a correctness issue, do not log the resulting MCDI
errors. Add a debug-level message under the (by default disabled)
rx_status category instead; and take the opportunity to do a little extra
expiry work.
Since there are now multiple workitems able to call __efx_filter_rfs_expire
on a given channel, it is possible for them to race and thus pass quotas
which, combined, exceed rfs_filter_count. Thus, don't WARN_ON if we loop
all the way around the table with quota left over.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Tested-by: David Ahern <dahern@digitalocean.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The old rfs_filters_added method for determining the quota could potentially
allow the NIC to become filled with old filters, which never get tested for
expiry. Instead, explicitly make expiry check work depend on the number of
filters installed, and don't count checking slots without filters in as
doing work. This guarantees that each filter will be checked for expiry at
least once every thirty seconds (assuming the channel to which it belongs is
NAPI polling actively) regardless of fill level.
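The quota rule can be sketched as a scan over a slot table in which only occupied slots count as work. This is a minimal model, not the driver code: `struct demo_slot` and `demo_expire()` are hypothetical, and the expiry test is reduced to a precomputed flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filter table slot. */
struct demo_slot {
	bool in_use;	/* a filter is installed here */
	bool expired;	/* stand-in for the real expiry test */
};

/*
 * Scan from *cursor, checking at most `quota` installed filters. Empty
 * slots are skipped without consuming quota. The scan stops after one
 * full lap even if quota is left over. Returns the number removed.
 */
size_t demo_expire(struct demo_slot *table, size_t size, size_t *cursor,
		   size_t quota)
{
	size_t removed = 0, visited = 0, idx;

	assert(size > 0 && *cursor < size);
	idx = *cursor;

	while (quota > 0 && visited < size) {
		if (table[idx].in_use) {
			quota--;	/* only real filters count as work */
			if (table[idx].expired) {
				table[idx].in_use = false;
				removed++;
			}
		}
		idx = (idx + 1) % size;
		visited++;
	}

	*cursor = idx;	/* the next call resumes where this one stopped */
	return removed;
}
```

Because empty slots are free, a sparsely filled table is walked quickly and a full one is walked at the quota rate, so the time between checks of any given filter depends on the number of installed filters rather than on the table size.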
Signed-off-by: Edward Cree <ecree@solarflare.com>
Tested-by: David Ahern <dahern@digitalocean.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
This series contains updates to the ice driver only.
Bruce updates the driver to store the number of functions the device has
so that it won't have to compute it when setting safe mode capabilities.
Adds a check to adjust the reporting of capabilities for devices with
more than 4 ports, which differ for devices with less than 4 ports.
Brett adds a helper function to determine if the VF is allowed to do
VLAN operations based on the host's VF configuration. Also adds a new
function that initializes VLAN stripping (enabled/disabled) for the VF
based on the device's supported capabilities. Adds a check that the vector
index is valid with respect to the number of transmit and receive
queues configured when we set coalesce settings for DCB. Adds a check
whether the promisc_mask contains ICE_PROMISC_VLAN_RX or ICE_PROMISC_VLAN_TX
so that VLAN 0 promiscuous rules can be removed. Adds a helper macro for
a commonly used de-reference of a pointer to &pf->dev->pdev.
Jesse fixes an issue where, on an invalid virtchnl request from the VF,
the driver would return uninitialized data to the VF from the PF stack,
so ensure the stack variable is initialized earlier. Adds helpers to the
virtchnl interface to make the reporting of strings consistent and help
reduce stack space. Implements VF statistics gathering via the kernel
ndo_get_vf_stats().
Akeem ensures we disable the state flag for each VF when its resources
are returned to the device.
Tony does additional cleanup in the driver to ensure that when we
allocate and free memory within the same function, we do not use
devm_* variants; regular alloc and free functions are used instead.
Henry implements code to query and set the number of channels on the
primary VSI for a PF via ethtool.
Jake cleans up needless NULL checks in ice_sched_cleanup_all().
Kevin updates the firmware API version to align with current NVM images.
v2: Added "Fixes:" tag to patch 5 commit description and added the use
of netif_is_rxfh_configured() in patch 13 to see if RSS has been
configured by the user, if so do not overwrite that configuration.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fix from Dmitry Torokhov:
"Just a single revert as RMI mode should not have been enabled for this
model [yet?]"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"
|
|
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Since the MIPS architecture has a sparse syscall array, select
HAVE_SPARSE_SYSCALL_NR to save space.
Link: http://lkml.kernel.org/r/20191115234314.21599-2-hnaveed@wavecomp.com
Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com>
Reviewed-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Currently, a lot of memory is wasted for architectures like MIPS when
init_ftrace_syscalls() allocates the array for syscalls using kcalloc.
This is because syscall numbers start from 4000, 5000 or 6000 and
array elements up to that point are unused.
Fix this by using a data structure more suited to storing sparsely
populated arrays. The XARRAY data structure, implemented using radix
trees, is much more memory efficient for storing the syscalls in
question.
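The memory argument can be made concrete with a small userspace sketch. XArray is a kernel-only API, so a sorted table searched with the C library's `bsearch()` stands in for it here; `struct demo_syscall`, the `demo_*` functions, and the sample syscall numbers in the usage note are illustrative assumptions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical metadata entry keyed by syscall number. */
struct demo_syscall {
	unsigned long nr;
	const char *name;
};

/* Bytes used by a flat pointer array indexed directly by syscall number. */
size_t demo_dense_bytes(unsigned long first_nr, size_t count)
{
	return ((size_t)first_nr + count) * sizeof(void *);
}

/* bsearch() comparator: key is a syscall number, elem is a table entry. */
static int demo_cmp_nr(const void *key, const void *elem)
{
	unsigned long nr = *(const unsigned long *)key;
	const struct demo_syscall *sc = elem;

	return (nr > sc->nr) - (nr < sc->nr);
}

/* Sparse lookup: the table holds only the syscalls that exist, sorted by nr. */
const struct demo_syscall *demo_lookup(const struct demo_syscall *tbl,
				       size_t count, unsigned long nr)
{
	return bsearch(&nr, tbl, count, sizeof(*tbl), demo_cmp_nr);
}
```

With numbers starting at 4000, a flat array spends 4000 pointer-sized slots before the first valid entry, while the sparse table stores only the entries that exist, for example `{ 4001, "exit" }, { 4003, "read" }, { 4004, "write" }`.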
Link: http://lkml.kernel.org/r/20191115234314.21599-1-hnaveed@wavecomp.com
Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com>
Reviewed-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Rahul Lakkireddy says:
====================
This series of patches adds support for UDP Segmentation Offload (USO),
as supported by Chelsio T5/T6 NICs.
Patch 1 updates the current Scatter Gather List (SGL) DMA unmap logic
for USO requests.
Patch 2 adds USO support for NIC and MQPRIO QoS offload Tx path.
Patch 3 adds missing stats for MQPRIO QoS offload Tx path.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Export necessary stats for traffic flowing through MQPRIO QoS offload
Tx path.
v2:
- No change.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
Implement and export UDP segmentation offload (USO) support for both
NIC and MQPRIO QoS offload Tx path. Update appropriate logic in Tx to
parse GSO info in skb and configure FW_ETH_TX_EO_WR request needed to
perform USO.
v2:
- Remove inline keyword from write_eo_udp_wr() in sge.c. Let the
compiler decide.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
The FW_ETH_TX_EO_WR used for sending UDP Segmentation Offload (USO)
requests expects the headers to be part of the descriptor and the
payload to be part of the SGL containing the DMA mapped addresses.
Hence, the DMA address in the first entry of the SGL can start after
the packet headers. Currently, unmap_sgl() tries to unmap from this
wrong offset, instead of the originally mapped DMA address.
So, use existing unmap_skb() instead, which takes originally saved DMA
addresses as input. Update all necessary Tx paths to save the original
DMA addresses, so that unmap_skb() can unmap them properly.
v2:
- No change.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
This is a sample module to demonstrate the use of the newly introduced and
exported APIs to access Ftrace instances from within the kernel.
Newly introduced APIs used here -
1. Create/Lookup a trace array with the given name.
struct trace_array *trace_array_get_by_name(const char *name)
2. Destroy/Remove a trace array.
int trace_array_destroy(struct trace_array *tr)
3. Enable/Disable trace events:
int trace_array_set_clr_event(struct trace_array *tr, const char *system,
const char *event, bool enable);
Exported APIs -
1. trace_printk equivalent for instances.
int trace_array_printk(struct trace_array *tr,
unsigned long ip, const char *fmt, ...);
2. Helper function.
void trace_printk_init_buffers(void);
3. To decrement the reference counter.
void trace_array_put(struct trace_array *tr)
Sample output (contents of /sys/kernel/tracing/instances/sample-instance)
(NOTE: Tracing disabled after ~5 sec)
_-----=> irqs-off
/ _----=> need-resched
| / _---=> hardirq/softirq
|| / _--=> preempt-depth
||| / delay
TASK-PID CPU# |||| TIMESTAMP FUNCTION
| | | |||| | |
sample-instance-1452 [002] .... 49.430948: simple_thread: trace_array_printk: count=0
sample-instance-1452 [002] .... 49.430951: sample_event: count value=0 at jiffies=4294716608
sample-instance-1452 [002] .... 50.454847: simple_thread: trace_array_printk: count=1
sample-instance-1452 [002] .... 50.454849: sample_event: count value=1 at jiffies=4294717632
sample-instance-1452 [002] .... 51.478748: simple_thread: trace_array_printk: count=2
sample-instance-1452 [002] .... 51.478750: sample_event: count value=2 at jiffies=4294718656
sample-instance-1452 [002] .... 52.502652: simple_thread: trace_array_printk: count=3
sample-instance-1452 [002] .... 52.502655: sample_event: count value=3 at jiffies=4294719680
sample-instance-1452 [002] .... 53.526533: simple_thread: trace_array_printk: count=4
sample-instance-1452 [002] .... 53.526535: sample_event: count value=4 at jiffies=4294720704
sample-instance-1452 [002] .... 54.550438: simple_thread: trace_array_printk: count=5
sample-instance-1452 [002] .... 55.574336: simple_thread: trace_array_printk: count=6
Link: http://lkml.kernel.org/r/1574276919-11119-3-git-send-email-divya.indi@oracle.com
Reviewed-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Signed-off-by: Divya Indi <divya.indi@oracle.com>
[ Moved to samples/ftrace ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Adding 2 new functions -
1) struct trace_array *trace_array_get_by_name(const char *name);
Return pointer to a trace array with given name. If it does not exist,
create and return pointer to the new trace array.
2) int trace_array_set_clr_event(struct trace_array *tr,
const char *system, const char *event, bool enable);
Enable/Disable events to this trace array.
Additionally,
- To handle reference counters, export trace_array_put()
- Due to introduction of the above 2 new functions, we no longer need to
export - ftrace_set_clr_event & trace_array_create APIs.
Link: http://lkml.kernel.org/r/1574276919-11119-2-git-send-email-divya.indi@oracle.com
Signed-off-by: Divya Indi <divya.indi@oracle.com>
Reviewed-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
$ sed -e 's/^ /\t/' -i */Kconfig
Link: http://lkml.kernel.org/r/20191120133807.12741-1-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Fix spelling and other typos
Link: http://lkml.kernel.org/r/1573916755-32478-1-git-send-email-xianting_tian@126.com
Signed-off-by: Xianting Tian <xianting_tian@126.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|