Age | Commit message (Collapse) | Author |
|
add bpf_raw_tracepoint_open(const char *name, int prog_fd) api to libbpf
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Introduce BPF_PROG_TYPE_RAW_TRACEPOINT bpf program type to access
kernel internal arguments of the tracepoints in their raw form.
>From bpf program point of view the access to the arguments look like:
struct bpf_raw_tracepoint_args {
__u64 args[0];
};
int bpf_prog(struct bpf_raw_tracepoint_args *ctx)
{
// program can read args[N] where N depends on tracepoint
// and statically verified at program load+attach time
}
kprobe+bpf infrastructure allows programs access function arguments.
This feature allows programs access raw tracepoint arguments.
Similar to proposed 'dynamic ftrace events' there are no abi guarantees
to what the tracepoints arguments are and what their meaning is.
The program needs to type cast args properly and use bpf_probe_read()
helper to access struct fields when argument is a pointer.
For every tracepoint __bpf_trace_##call function is prepared.
In assembler it looks like:
(gdb) disassemble __bpf_trace_xdp_exception
Dump of assembler code for function __bpf_trace_xdp_exception:
0xffffffff81132080 <+0>: mov %ecx,%ecx
0xffffffff81132082 <+2>: jmpq 0xffffffff811231f0 <bpf_trace_run3>
where
TRACE_EVENT(xdp_exception,
TP_PROTO(const struct net_device *dev,
const struct bpf_prog *xdp, u32 act),
The above assembler snippet is casting 32-bit 'act' field into 'u64'
to pass into bpf_trace_run3(), while 'dev' and 'xdp' args are passed as-is.
All of ~500 of __bpf_trace_*() functions are only 5-10 byte long
and in total this approach adds 7k bytes to .text.
This approach gives the lowest possible overhead
while calling trace_xdp_exception() from kernel C code and
transitioning into bpf land.
Since tracepoint+bpf are used at speeds of 1M+ events per second
this is valuable optimization.
The new BPF_RAW_TRACEPOINT_OPEN sys_bpf command is introduced
that returns anon_inode FD of 'bpf-raw-tracepoint' object.
The user space looks like:
// load bpf prog with BPF_PROG_TYPE_RAW_TRACEPOINT type
prog_fd = bpf_prog_load(...);
// receive anon_inode fd for given bpf_raw_tracepoint with prog attached
raw_tp_fd = bpf_raw_tracepoint_open("xdp_exception", prog_fd);
Ctrl-C of tracing daemon or cmdline tool that uses this feature
will automatically detach bpf program, unload it and
unregister tracepoint probe.
On the kernel side the __bpf_raw_tp_map section of pointers to
tracepoint definition and to __bpf_trace_*() probe function is used
to find a tracepoint with "xdp_exception" name and
corresponding __bpf_trace_xdp_exception() probe function
which are passed to tracepoint_probe_register() to connect probe
with tracepoint.
Addition of bpf_raw_tracepoint doesn't interfere with ftrace and perf
tracepoint mechanisms. perf_event_open() can be used in parallel
on the same tracepoint.
Multiple bpf_raw_tracepoint_open("xdp_exception", prog_fd) are permitted.
Each with its own bpf program. The kernel will execute
all tracepoint probes and all attached bpf programs.
In the future bpf_raw_tracepoints can be extended with
query/introspection logic.
__bpf_raw_tp_map section logic was contributed by Steven Rostedt
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
move COUNT_ARGS() macro from apparmor to generic header and extend it
to count till twelve.
COUNT() was an alternative name for this logic, but it's used for
different purpose in many other places.
Similarly for CONCATENATE() macro.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
fix iwlwifi_dev_ucode_error tracepoint to pass pointer to a table
instead of all 17 arguments by value.
dvm/main.c and mvm/utils.c have 'struct iwl_error_event_table'
defined with very similar yet subtly different fields and offsets.
tracepoint is still common and using definition of 'struct iwl_error_event_table'
from dvm/commands.h while copying fields.
Long term this tracepoint probably should be split into two.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
two trace events defined with the same name and both unused.
They conflict in allyesconfig build. Rename one of them.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
two trace events defined with the same name and both unused.
They conflict in allyesconfig build. Rename one of them.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
- fix trace_hfi1_ctxt_info() to pass large struct by reference instead of by value
- convert 'type array[]' tracepoint arguments into 'type *array',
since compiler will warn that sizeof('type array[]') == sizeof('type *array')
and later should be used instead
The CAST_TO_U64 macro in the later patch will enforce that tracepoint
arguments can only be integers, pointers, or less than 8 byte structures.
Larger structures should be passed by reference.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
We can set triggers that cause a debug data collection when something
of interest happens (e.g. when too many probes are lost conscutively).
Normally, this triggers don't cause the FW to be restarted, but in
some cases that may be desired, so we recover from the problem. To
support this, add a flag that indicates that the FW should be
restarted when the trigger fires.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Currently we have a boolean variable for each cause.
This costs space, and requires to check each separately
when determining low latency.
Since we have another cause incoming, convert it to an enum.
While at it, move the retrieval of the prev value and the
assignment of the new value to be inside iwl_mvm_update_low_latency
and save the need for each caller to do it separately.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
We are now ready to load 38.ucode
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Some geographic profiles require specific handling. For example ETSI
profile requires special channel access handling. Add geographic
profile information to MCC_UPDATE response to allow it.
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
A lot of new PCI IDs were added for the 9000 series. Add them to the
list of supported PCI IDs.
Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Remove fragmented_dwell_time and add num_of_fragments to support
the new API version.
Signed-off-by: Ayala Beker <ayala.beker@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
The FW does not allocate quota air time for the binding of a station
MAC before iwlmvm indicates that it is associated. Currently iwlmvm
indicates that the MAC is associated only after hearing a beacon from
the AP. In case a deauthentication frame is sent before the MAC is
associated, the frame might not be sent as the corresponding binding
is not scheduled.
To handle such cases, set IEEE80211_HW_DEAUTH_NEED_MGD_TX_PREP in the
HW flags, requesting mac80211 to call the mgd_prepare_tx() callback
before transmitting a deauthentication frame if associated but no
beacon was heard from the AP.
In addition, do not warn in iwl_mvm_mac_mgd_prepare_tx() when already
associated as now the callback can be called also when associated.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Add support for Optimized Connectivity Experience (OCE). Get
capabilities from the fw, expose them with nl80211, and enable them in
UMAC scan if the relevant nl80211 flags are set by the userspace.
Signed-off-by: Roee Zamir <roee.zamir@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Update the scan command API with support for adaptive dwell. Adaptive
dwell is a type of scan that dynamically changes the time it remains
on each channel listening for beacons or probe responses.
Signed-off-by: Roee Zamir <roee.zamir@intel.com>
Signed-off-by: Beni Lev <beni.lev@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Sample usage for tos ...
bpf_getsockopt(skops, SOL_IP, IP_TOS, &v, sizeof(v))
... where skops is a pointer to the ctx (struct bpf_sock_ops).
Signed-off-by: Nikita V. Shirokov <tehnerd@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This reverts commit 4b1e84276a6172980c5bf39aa091ba13e90d6dad.
Software uses the valid bits to decide if the values can be used for
further processing or other actions. So setting the valid bits will have
software act on values that it shouldn't be acting on.
The recommendation to save all the register values does not mean that
the values are always valid.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tony.luck@intel.com
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: bp@suse.de
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/20180326191526.64314-1-Yazen.Ghannam@amd.com
|
|
Stephen Rothewell <sfr@canb.auug.org.au> wrote:
> After merging the userns tree, today's linux-next build (powerpc
> ppc64_defconfig) produced this warning:
>
> In file included from include/linux/sched.h:16:0,
> from arch/powerpc/lib/xor_vmx_glue.c:14:
> include/linux/shm.h:17:35: error: 'struct file' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
> bool is_file_shm_hugepages(struct file *file);
> ^~~~
>
> and many, many more (most warnings, but some errors - arch/powerpc is
> mostly built with -Werror)
I dug through this and I discovered that the error was caused by the
removal of struct shmid_kernel from shm.h when building on powerpc.
Except for observing the existence of "struct file *shm_file" in
struct shmid_kernel I have no clue why the structure move would cause
such a failure. I suspect shm.h always needed the forward declaration
and someting had been confusing gcc into not issuing the warning. --EWB
Fixes: a2e102cd3cdd ("shm: Move struct shmid_kernel into ipc/shm.c")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
A critical error was found testing the fixed UV4 HUB in that an MMR address
was found to be incorrect. This causes the virtual address space for
accessing the MMIOH1 region to be allocated with the incorrect size.
Fixes: 673aa20c55a1 ("x86/platform/UV: Update uv_mmrs.h to prepare for UV4A fixes")
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Andrew Banman <andrew.banman@hpe.com>
Link: https://lkml.kernel.org/r/20180328174011.041801248@stormcage.americas.sgi.com
|
|
Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the
modification of a breakpoint - simplify it and remove the pointless
local variables.
Also update the stale Docbook while at it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Revert the clearing of __GFP_ZERO in dma_alloc_attrs and move it to
dma_direct_alloc for now. While most common architectures always zero dma
cohereny allocations (and x86 did so since day one) this is not documented
and at least arc and s390 do not zero without the explicit __GFP_ZERO
argument.
Fixes: 57bf5a8963f8 ("dma-mapping: clear harmful GFP_* flags in common code")
Reported-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Evgeniy Didin <Evgeniy.Didin@synopsys.com>
Cc: iommu@lists.linux-foundation.org
Link: https://lkml.kernel.org/r/20180328133535.17302-2-hch@lst.de
|
|
Fixes the following sparse warnings:
drivers/gpu/drm/tegra/dc.c:2181:69: warning:
Using plain integer as NULL pointer
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
PV TLB FLUSH can only be turned on when steal time is enabled.
The condition got reversed during conflict resolution.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Fixes: 4f2f61fc5071 ("KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled")
[Rebased on top of kvm/master and reworded the commit message. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
Trivial fix to spelling mistake in error message text
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Sometimes iwl_mvm_disable_txq() may be called with mac80211_queue ==
IEEE80211_INVAL_HW_QUEUE, and this would cause us to use BIT(0xFF)
which is way too large for the u16 we used to store it in
hw_queue_to_mac820211. If this happens the following UBSAN warning
will be generated:
[ 167.185167] UBSAN: Undefined behaviour in drivers/net/wireless/intel/iwlwifi/mvm/utils.c:838:5
[ 167.185171] shift exponent 255 is too large for 64-bit type 'long unsigned int'
Fix that by checking that it is not IEEE80211_INVAL_HW_QUEUE and,
while at it, add a warning if the queue number is larger than
IEEE80211_MAX_QUEUES.
Fixes: 34e10860ae8d ("iwlwifi: mvm: remove references to queue_info in new TX path")
Reported-by: Paul Menzel <pmenzel+linux-wireless@molgen.mpg.de>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
In case debug configuration is started with LDBG cmd also start timestamp
marker for syncing logs witn the FW.
Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
This makes future bail-outs from transmitting an AMSDU more
readable.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
RCU isn't properly locked.
Fixes: 46d372af9935 ("iwlwifi: mvm: rs: new rate scale API - add FW notifications")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Hardware bug was fixed in later generation.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
Requested by Luca, needed for upcoming patch "iwlwifi: add a bunch of new 9000
PCI IDs".
|
|
In arch/x86/boot/compressed/kaslr_64.c, CONFIG_AMD_MEM_ENCRYPT support was
initially #undef'd to support SME with minimal effort. When support for
SEV was added, the #undef remained and some minimal support for setting the
encryption bit was added for building identity mapped pagetable entries.
Commit b83ce5ee9147 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
changed __PHYSICAL_MASK_SHIFT from 46 to 52 in support of 5-level paging.
This change resulted in SEV guests failing to boot because the encryption
bit was no longer being automatically masked out. The compressed boot
path now requires sme_me_mask to be defined in order for the pagetable
functions, such as pud_present(), to properly mask out the encryption bit
(currently bit 47) when evaluating pagetable entries.
Add an sme_me_mask variable in arch/x86/boot/compressed/mem_encrypt.S,
which is set when SEV is active, delete the #undef CONFIG_AMD_MEM_ENCRYPT
from arch/x86/boot/compressed/kaslr_64.c and use sme_me_mask when building
the identify mapped pagetable entries.
Fixes: b83ce5ee9147 ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20180327220711.8702.55842.stgit@tlendack-t1.amdoffice.net
|
|
BAU uses the old alloc_initr_gate90 method to setup its interrupt. This
fails silently as the BAU vector is in the range of APIC vectors that are
registered to the spurious interrupt handler. As a consequence BAU
broadcasts are not handled, and the broadcast source CPU hangs.
Update BAU to use new idt structure.
Fixes: dc20b2d52653 ("x86/idt: Move interrupt gate initialization to IDT code")
Signed-off-by: Andrew Banman <abanman@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Cc: Russ Anderson <rja@hpe.com>
Cc: stable@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/1522188546-196177-1-git-send-email-abanman@hpe.com
|
|
When changing rdmsr_safe_on_cpu() to schedule, it was missed that
__rdmsr_safe_on_cpu() was also used by rdmsrl_safe_on_cpu()
Make rdmsrl_safe_on_cpu() a wrapper instead of copy/pasting the code which
was added for the completion handling.
Fixes: 07cde313b2d2 ("x86/msr: Allow rdmsr_safe_on_cpu() to schedule")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20180328032233.153055-1-edumazet@google.com
|
|
git://people.freedesktop.org/~gabbayo/linux into drm-next
- GPUVM support for dGPUs
- KFD events support for dGPUs
- Fix live-lock situation when restoring multiple evicted processes
- Fix VM page table allocation on large-bar systems
- Fix for build failure on frv architecture
* tag 'drm-amdkfd-next-2018-03-27' of git://people.freedesktop.org/~gabbayo/linux:
drm/amdkfd: Use ordered workqueue to restore processes
drm/amdgpu: Fix acquiring VM on large-BAR systems
drm/amdkfd: Add module option for testing large-BAR functionality
drm/amdkfd: Kmap event page for dGPUs
drm/amdkfd: Add ioctls for GPUVM memory management
drm/amdkfd: Add TC flush on VMID deallocation for Hawaii
drm/amdkfd: Allocate CWSR trap handler memory for dGPUs
drm/amdkfd: Add per-process IDR for buffer handles
drm/amdkfd: Aperture setup for dGPUs
drm/amdkfd: Remove limit on number of GPUs
drm/amdkfd: Populate DRM render device minor
drm/amdkfd: Create KFD VMs on demand
drm/amdgpu: Add kfd2kgd interface to acquire an existing VM
drm/amdgpu: Add helper to turn an existing VM into a compute VM
drm/amdgpu: Fix initial validation of PD BO for KFD VMs
drm/amdgpu: Move KFD-specific fields into struct amdgpu_vm
drm/amdkfd: fix uninitialized variable use
drm/amdkfd: add missing include of mm.h
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Display fixes for booting with MST hub lid closed and display
freezing after hibernation (fd.o bugs 105470 & 105196)
- Fix for a very rare interrupt handling race resulting in GPU hang
* tag 'drm-intel-next-fixes-2018-03-27' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Fix hibernation with ACPI S0 target state
drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
drm/i915: Specify which engines to reset following semaphore/event lockups
drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
|
|
Linux 4.16-rc7
This was requested by Daniel, and things were getting
a bit hard to reconcile, most of the conflicts were
trivial though.
|
|
Use enum dma_transfer_direction as required by the functions
dmaengine_prep_slave_(sg|single)() instead of enum dma_data_direction.
This won't change behavior in practice as the enum values are
equivalent.
This fixes two warnings when building with clang:
drivers/spi/spi-atmel.c:771:12: warning: implicit conversion from enumeration
type 'enum dma_data_direction' to different enumeration type
'enum dma_transfer_direction'
[-Wenum-conversion]
DMA_FROM_DEVICE,
^~~~~~~~~~~~~~~
...
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
'regulator/topic/dt', 'regulator/topic/formatting' and 'regulator/topic/gpio' into regulator-next
|
|
|
|
|
|
|
|
Document support for the MSIOF module in the Renesas M3-N (r8a77965) SoC.
No driver update is needed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add the pm8998 and pmi8998 regulators as used in the MSM8998 platform.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
An error TX completion (CQE) which arrived on a specific SQ indicates
that this SQ got moved by the hardware to error state, which means all
pending and incoming TX requests are dropped or will be dropped and no
further "Good" CQEs will be generated for that SQ.
Before this patch TX completions (CQEs) were not monitored and were
handled as a regular CQE. This caused the SQ to stay in an error state,
making it useless for xmiting new packets.
Mitigation plan:
In case of an error completion, schedule a recovery work which would do
the following:
- Mark the TXQ as DRV_XOFF to disable new packets to arrive from the
stack
- NAPI to flush all pending SQ WQEs (via flush_in_error_en bit) to
release SW and HW resources(SKB, DMA, etc) and have the SQ and CQ
consumer/producer indices synced.
- Modify the SQ state ERR -> RST -> RDY (restart the SQ).
- Reactivate the SQ and reset SQ cc and pc
If we identify two consecutive requests for SQ recover in less than
500 msecs, drop the recover request to avoid CPU overload, as this
scenario most likely happened due to a severe repeated bug.
In addition, add SQ recover SW counter to monitor successful recoveries.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Pull ARM fixes from Russell King:
"A small number of small fixes for ARM, mostly for some build issues.
One fix for a regression caused by the cpu hotplug conversion from a
few kernel versions ago"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8750/1: deflate_xip_data.sh: minor fixes
ARM: 8748/1: mm: Define vdso_start, vdso_end as array
ARM: 8747/1: make CONFIG_DEBUG_WX depend on MMU
ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]
|
|
Monitor and dump xmit error completions. In addition, add err_cqe
counter to track the number of error completion per send queue.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Move mlx5_ib dump error CQE implementation to mlx5 CQ header file in
order to use it in a downstream patch from mlx5e.
In addition, use print_hex_dump instead of manual dumping of the buffer.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Move query SQ state function from mlx5_ib to mlx5_core in order to
have it in shared code.
It will be used in a downstream patch from mlx5e.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Driver callback for handling TX timeout should access some internal
resources (SQ, CQ) in order to decide if the tx timeout work should be
scheduled. These resources might be unavailable if channels are closed
in parallel (ifdown for example).
The state lock is the mechanism to protect from such races.
Move all TX timeout logic to be in the work under a state lock.
In addition, Move the work from the global WQ to mlx5e WQ to make sure
this work is flushed when device is detached..
Also, move the mlx5e_tx_timeout_work code to be next to the TX timeout
NDO for better code locality.
Fixes: 3947ca185999 ("net/mlx5e: Implement ndo_tx_timeout callback")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|