Age | Commit message (Collapse) | Author |
|
Avoid garbage in sigaction structs used in sigaction() syscalls.
Valgrind is complaining about it.
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200305083714.9381-4-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since commit 3b2323c2c1c4 ("perf bench futex: Use cpumaps") the default
number of threads the benchmark uses got changed from number of online
CPUs to zero:
$ perf bench futex wake
# Running 'futex/wake' benchmark:
Run summary [PID 15930]: blocking on 0 threads (at [private] futex 0x558b8ee4bfac), waking up 1 at a time.
[Run 1]: Wokeup 0 of 0 threads in 0.0000 ms
[...]
[Run 10]: Wokeup 0 of 0 threads in 0.0000 ms
Wokeup 0 of 0 threads in 0.0004 ms (+-40.82%)
Restore the old behavior by grabbing the number of online CPUs via
cpu->nr:
$ perf bench futex wake
# Running 'futex/wake' benchmark:
Run summary [PID 18356]: blocking on 8 threads (at [private] futex 0xb3e62c), waking up 1 at a time.
[Run 1]: Wokeup 8 of 8 threads in 0.0260 ms
[...]
[Run 10]: Wokeup 8 of 8 threads in 0.0270 ms
Wokeup 8 of 8 threads in 0.0419 ms (+-24.35%)
Fixes: 3b2323c2c1c4 ("perf bench futex: Use cpumaps")
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200305083714.9381-3-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since glibc 2.28 when running 'perf top --stdio', input handling no
longer works, but hitting any key always just prints the "Mapped keys"
help text.
To fix it, call clearerr() in the display_thread() loop to clear any EOF
sticky errors, as instructed in the glibc NEWS file
(https://sourceware.org/git/?p=glibc.git;a=blob;f=NEWS):
* All stdio functions now treat end-of-file as a sticky condition. If you
read from a file until EOF, and then the file is enlarged by another
process, you must call clearerr or another function with the same effect
(e.g. fseek, rewind) before you can read the additional data. This
corrects a longstanding C99 conformance bug. It is most likely to affect
programs that use stdio to read interactive input from a terminal.
(Bug #1190.)
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200305083714.9381-2-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
clang warns:
util/block-info.c:298:18: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:51: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:18: error: result of comparison against a string
literal is unspecified (use an explicit string
comparison function instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:51: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/map.c:434:15: error: result of comparison against a string literal
is unspecified (use an explicit string comparison function instead)
[-Werror,-Wstring-compare]
if (srcline != SRCLINE_UNKNOWN)
^ ~~~~~~~~~~~~~~~
Reviewer Notes:
Looks good to me. Some more context:
https://clang.llvm.org/docs/DiagnosticsReference.html#wstring-compare
The spec says:
J.1 Unspecified behavior
The following are unspecified:
.. Whether two string literals result in distinct arrays (6.4.5).
Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Keeping <john@metanate.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: clang-built-linux@googlegroups.com
Link: https://github.com/ClangBuiltLinux/linux/issues/900
Link: http://lore.kernel.org/lkml/20200223193456.25291-1-nick.desaulniers@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit ee88f4ebe575 ("ALSA: mips: Use managed buffer allocation") removed
superfluous hw_params/hw_free callbacks, but forgot to remove them where
they were used.
Fixes: ee88f4ebe575 ("ALSA: mips: Use managed buffer allocation")
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://lore.kernel.org/r/20200306105837.31523-1-tsbogend@alpha.franken.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
If one is monitoring 6 events on 20 cgroups the per-CPU RB tree will
hold 120 events. The scheduling in of the events currently iterates
over all events looking to see which events match the task's cgroup or
its cgroup hierarchy. If a task is in 1 cgroup with 6 events, then 114
events are considered unnecessarily.
This change orders events in the RB tree by cgroup id if it is present.
This means scheduling in may go directly to events associated with the
task's cgroup if one is present. The per-CPU iterator storage in
visit_groups_merge is sized sufficent for an iterator per cgroup depth,
where different iterators are needed for the task's cgroup and parent
cgroups. By considering the set of iterators when visiting, the lowest
group_index event may be selected and the insertion order group_index
property is maintained. This also allows event rotation to function
correctly, as although events are grouped into a cgroup, rotation always
selects the lowest group_index event to rotate (delete/insert into the
tree) and the min heap of iterators make it so that the group_index order
is maintained.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20190724223746.153620-3-irogers@google.com
|
|
Allow the per-CPU min heap storage to have sufficient space for per-cgroup
iterators.
Based-on-work-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200214075133.181299-6-irogers@google.com
|
|
The storage required for visit_groups_merge's min heap needs to vary in
order to support more iterators, such as when multiple nested cgroups'
events are being visited. This change allows for 2 iterators and doesn't
support growth.
Based-on-work-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200214075133.181299-5-irogers@google.com
|
|
visit_groups_merge will pick the next event based on when it was
inserted in to the context (perf_event group_index). Events may be per CPU
or for any CPU, but in the future we'd also like to have per cgroup events
to avoid searching all events for the events to schedule for a cgroup.
Introduce a min heap for the events that maintains a property that the
earliest inserted event is always at the 0th element. Initialize the heap
with per-CPU and any-CPU events for the context.
Based-on-work-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200214075133.181299-4-irogers@google.com
|
|
Supports push, pop and converting an array into a heap. If the sense of
the compare function is inverted then it can provide a max-heap.
Based-on-work-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200214075133.181299-3-irogers@google.com
|
|
Move perf_cgroup_connect() after perf_event_alloc(), such that we can
find/use the PMU's cpu context.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200214075133.181299-2-irogers@google.com
|
|
We can deduce the ctx and cpuctx from the event, no need to pass them
along. Remove the structure and pass in can_add_hw directly.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Less is more; unify the two very nearly identical function.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The comment in 'asm-generic/bitops.h' states that you should "recode
these in the native assembly language, if at all possible". This is
pretty crappy advice now that the generic implementation is defined in
terms of atomic_long_t rather than a spinlock, so update the comment and
hopefully save future architecture maintainers a bit of work.
Reported-by: Stefan Asserhall <stefana@xilinx.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200213093927.1836-1-will@kernel.org
|
|
Now that {get,drop}_futex_key_refs() have become a glorified NOP,
remove them entirely.
The only thing get_futex_key_refs() is still doing is an smp_mb(), and
now that we don't need to (ab)use existing atomic ops to obtain them,
we can place it explicitly where we need it.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
We always set 'key->private.mm' to 'current->mm', getting an extra
reference on 'current->mm' is quite pointless, because as long as the
task is blocked it isn't going to go away.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
|
|
As reported by Jann, ihold() does not in fact guarantee inode
persistence. And instead of making it so, replace the usage of inode
pointers with a per boot, machine wide, unique inode identifier.
This sequence number is global, but shared (file backed) futexes are
rare enough that this should not become a performance issue.
Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
* acpi-doc:
Documentation/admin-guide/acpi: fix fan_performance_states.rst warnings
|
|
Currently passive MPTCP socket can skip including the DACK
option - if the peer sends data before accept() completes.
The above happens because the msk 'can_ack' flag is set
only after the accept() call.
Such missing DACK option may cause - as per RFC spec -
unwanted fallback to TCP.
This change addresses the issue using the key material
available in the current subflow, if any, to create a suitable
dack option when msk ack seq is not yet available.
v1 -> v2:
- adavance the generated ack after the initial MPC packet
Fixes: d22f4988ffec ("mptcp: process MP_CAPABLE data option")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is similar to commit 674d9de02aa7 ("NFC: Fix possible memory
corruption when handling SHDLC I-Frame commands") and commit d7ee81ad09f0
("NFC: nci: Add some bounds checking in nci_hci_cmd_received()") which
added range checks on "pipe".
The "pipe" variable comes skb->data[0] in nfc_hci_msg_rx_work().
It's in the 0-255 range. We're using it as the array index into the
hdev->pipes[] array which has NFC_HCI_MAX_PIPES (128) members.
Fixes: 118278f20aa8 ("NFC: hci: Add pipes table to reference them with a tuple {gate, host}")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Without this, we get a couple of warnings when CONFIG_PM
is disabled:
drivers/gpu/drm/arm/display/komeda/komeda_drv.c:156:12: error: 'komeda_rt_pm_resume' defined but not used [-Werror=unused-function]
static int komeda_rt_pm_resume(struct device *dev)
^~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/arm/display/komeda/komeda_drv.c:149:12: error: 'komeda_rt_pm_suspend' defined but not used [-Werror=unused-function]
static int komeda_rt_pm_suspend(struct device *dev)
^~~~~~~~~~~~~~~~~~~~
Fixes: efb465088518 ("drm/komeda: Add runtime_pm support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: James Qian Wang (Arm Technology China) <james.qian.wang@arm.com>
Signed-off-by: james qian wang (Arm Technology China) <james.qian.wang@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200107215327.1579195-1-arnd@arndb.de
|
|
The emulated vbt doesn't tell its size correctly. According to the
intel_vbt_defs.h, vbt_header.vbt_size should the size of VBT (VBT Header,
BDB Header and data blocks), and bdb_header.bdb_size should be the size
of BDB (BDB Header and data blocks).
This patch fixes the issue and lets vbt provided by GVT-g pass the guest
i915's sanity test.
v2: refine the commit message. (Zhenyu)
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200305131600.29640-1-tina.zhang@intel.com
|
|
When local NET_RX backlog is full due to traffic overrun,
peer veth tx_dropped counter increases. At that time, list
local veth stats, rx_dropped has double value of peer
tx_dropped, even bigger than transmit packets by peer.
In NET_RX softirq process, if any packet drop case happens,
it increases dev's rx_dropped counter and returns NET_RX_DROP.
At veth tx side, it records any error returned from peer netif_rx
into local dev tx_dropped counter.
In veth get stats process, it puts local dev rx_dropped and
peer dev tx_dropped into together as local rx_drpped value.
So that it shows double value of real dropped packets number in
this case.
This patch ignores peer tx_dropped when counting local rx_dropped,
since peer tx_dropped is duplicated to local rx_dropped at most cases.
Signed-off-by: Jiang Lidong <jianglidong3@jd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.6-2020-03-05:
amdgpu:
- Gfx reset fix for gfx9, 10
- Fix for gfx10
- DP MST fix
- DCC fix
- Renoir power fixes
- Navi power fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305185957.4268-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.6-rc5:
- Break up long lists of object reclaim with cond_resched()
- PSR probe fix
- TGL workarounds
- Selftest return value fix
- Drop timeline mutex while waiting for retirement
- Wait for OA configuration completion before writes to OA buffer
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87eeu7nl6z.fsf@intel.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Fixes for v5.6.rc5:
- dma-buf fix memory leak
- Fix resource id creation race in virtio.
- Various mmap fixes.
- Fix fence leak in ttm_buffer_object_transfer().
- Fixes for sun4i VI layer format support.
- kirin: Revert "Fix for hikey620 display offset problem"
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/56de63c7-0cdf-5805-e268-44944af7fef2@linux.intel.com
|
|
After returning from unregister_netdevice_notifier_dev_net(), set the
notifier_call field to NULL so successive call to mlx5_lag_add() will
function as expected.
Fixes: 7907f23adc18 ("net/mlx5: Implement RoCE LAG feature")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
The mask value is provided as 64 bit and has to be casted in
either 32 or 16 bit. On big endian systems the wrong half was
casted which resulted in an all zero mask.
Fixes: 2b64beba0251 ("net/mlx5e: Support header re-write of partial fields in TC pedit offload")
Signed-off-by: Sebastian Hense <sebastian.hense1@ibm.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Fix to match the HW spec: TRACKING state is 1, SEARCHING is 2.
No real issue for now, as these values are not currently used.
Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
We have an off-by-1 issue in the TCP seq comparison.
The last sequence number that belongs to the TCP packet's payload
is not "start_seq + len", but one byte before it.
Fix it so the 'ends_before' is evaluated properly.
This fixes a bug that results in error completions in the
kTLS HW offload flows.
Fixes: ffbd9ca94e2e ("net/mlx5e: kTLS, Fix corner-case checks in TX resync flow")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Fix the send info write length to be (actions x action) size in bytes.
Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for v5.6
Second set of fixes for v5.6. Only two small fixes this time.
iwlwifi
* fix another initialisation regression with 3168 devices
mt76
* fix memory corruption with too many rx fragments
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
- Add a SPDX header;
- Add a document title, based on the original contents of
cpu-freq/index.txt;
- Use lists where needed;
- Comment out the existing text-only index;
- Adjust some title marks;
- Use bold on some places;
- Mark literal blocks as such;
- Some whitespace fixes and new line breaks;
- Add it to cpu-freq/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
- Add a SPDX header;
- Add a document title, based on the original contents of
cpu-freq/index.txt;
- Use lists where needed;
- Comment out the existing text-only index;
- Adjust some title marks;
- Add table markups;
- Add notes markups;
- Mark literal blocks as such;
- use ``foo`` for literal texts;
- Some whitespace fixes and new line breaks;
- Add it to cpu-freq/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
- Add a SPDX header;
- Adjust the document title, based on the original contents of
cpu-freq/index.txt;
- Use lists where needed;
- Comment out the existing text-only index;
- use ``foo`` to mark a literal expression with asterisk;
- Adjust some title marks;
- Add table markups;
- Add notes markups;
- Some whitespace fixes and new line breaks;
- Add it to cpu-freq/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
most of the stuff there can be re-used with ReST format,
but we need to add an empty TOC and remove the existing
entries, as the following conversion patches will be re-adding
them, as they're converted.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The old cpufreq docs got moved. Let's also update its
pointer inside Kconfig.x86.
Fixes: 03b22496502d ("Documentation: cpufreq: Move legacy driver documentation")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We now ignore the "completion" event when using tx queue timestamping,
and only pay attention to the two (high and low) timestamp events. The
NIC will send a pair of timestamp events for every packet transmitted.
The current firmware may merge the completion events, and it is possible
that future versions may reorder the completion and timestamp events.
As such the completion event is not useful.
Without this patch in place a merged completion event on a queue with
timestamping will cause a "spurious TX completion" error. This affects
SFN8000-series adapters.
Signed-off-by: Tom Zhao <tzhao@solarflare.com>
Acked-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
struct_ops map cannot support map_freeze. Otherwise, a struct_ops
cannot be unregistered from the subsystem.
Fixes: 85d33df357b6 ("bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200305013454.535397-1-kafai@fb.com
|
|
The current always succeed behavior in bpf_struct_ops_map_delete_elem()
is not ideal for userspace tool. It can be improved to return proper
error value.
If it is in TOBEFREE, it means unregistration has been already done
before but it is in progress and waiting for the subsystem to clear
the refcnt to zero, so -EINPROGRESS.
If it is INIT, it means the struct_ops has not been registered yet,
so -ENOENT.
Fixes: 85d33df357b6 ("bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200305013447.535326-1-kafai@fb.com
|
|
Yonghong Song says:
====================
Commit 8b401f9ed244 ("bpf: implement bpf_send_signal() helper")
introduced bpf_send_signal() helper and Commit 8482941f0906
("bpf: Add bpf_send_signal_thread() helper") added bpf_send_signal_thread()
helper. Both helpers try to send a signel to current process or thread.
When bpf_prog is called with scheduler rq_lock held, a deadlock
could happen since bpf_send_signal() and bpf_send_signal_thread()
will call group_send_sig_info() which may ultimately want to acquire
rq_lock() again. This happens in 5.2 and 4.16 based kernels in our
production environment with perf_sw_event.
In a different scenario, the following is also possible in the last kernel:
cpu 1:
do_task_stat <- holding sighand->siglock
...
task_sched_runtime <- trying to grab rq_lock
cpu 2:
__schedule <- holding rq_lock
...
do_send_sig_info <- trying to grab sighand->siglock
Commit eac9153f2b58 ("bpf/stackmap: Fix deadlock with
rq_lock in bpf_get_stack()") has a similar issue with above
rq_lock() deadlock. This patch set addressed the issue
in a similar way. Patch #1 provided kernel solution and
Patch #2 added a selftest.
Changelogs:
v2 -> v3:
. simplify selftest send_signal_sched_switch().
The previous version has mmap/munmap inherited
from Song's reproducer. They are not necessary
in this context.
v1 -> v2:
. previous fix using task_work in nmi() is incorrect.
there is no nmi() involvement here. Using task_work
in all cases might be a solution. But decided to
use a similar fix as in Commit eac9153f2b58.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This adds VP9 decoding for the Amlogic GXL, G12A & SM1 SoCs, using
the commong "HEVC" HW decoder.
For G12A & SM1, it uses the IOMMU support from the firmware.
For 10bit decoding, the firmware can only decode in the proprietary
Amlogic Framebuffer Compression format, but can output in 8bit NV12
buffer while writing the decoded frame.
Signed-off-by: Maxime Jourdan <mjourdan@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
Added one test, send_signal_sched_switch, to test bpf_send_signal()
helper triggered by sched/sched_switch tracepoint. This test can be used
to verify kernel deadlocks fixed by the previous commit. The test itself
is heavily borrowed from Commit eac9153f2b58 ("bpf/stackmap: Fix deadlock
with rq_lock in bpf_get_stack()").
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200304191105.2796601-1-yhs@fb.com
|
|
When experimenting with bpf_send_signal() helper in our production
environment (5.2 based), we experienced a deadlock in NMI mode:
#5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24
#6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012
#7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd
#8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55
#9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602
#10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a
#11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227
#12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140
#13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf
#14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09
#15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47
#16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d
#17 [ffffc9002219fc90] schedule at ffffffff81a3e219
#18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9
#19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529
#20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc
#21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c
#22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602
#23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068
The above call stack is actually very similar to an issue
reported by Commit eac9153f2b58 ("bpf/stackmap: Fix deadlock with
rq_lock in bpf_get_stack()") by Song Liu. The only difference is
bpf_send_signal() helper instead of bpf_get_stack() helper.
The above deadlock is triggered with a perf_sw_event.
Similar to Commit eac9153f2b58, the below almost identical reproducer
used tracepoint point sched/sched_switch so the issue can be easily caught.
/* stress_test.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define THREAD_COUNT 1000
char *filename;
void *worker(void *p)
{
void *ptr;
int fd;
char *pptr;
fd = open(filename, O_RDONLY);
if (fd < 0)
return NULL;
while (1) {
struct timespec ts = {0, 1000 + rand() % 2000};
ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0);
usleep(1);
if (ptr == MAP_FAILED) {
printf("failed to mmap\n");
break;
}
munmap(ptr, 4096 * 64);
usleep(1);
pptr = malloc(1);
usleep(1);
pptr[0] = 1;
usleep(1);
free(pptr);
usleep(1);
nanosleep(&ts, NULL);
}
close(fd);
return NULL;
}
int main(int argc, char *argv[])
{
void *ptr;
int i;
pthread_t threads[THREAD_COUNT];
if (argc < 2)
return 0;
filename = argv[1];
for (i = 0; i < THREAD_COUNT; i++) {
if (pthread_create(threads + i, NULL, worker, NULL)) {
fprintf(stderr, "Error creating thread\n");
return 0;
}
}
for (i = 0; i < THREAD_COUNT; i++)
pthread_join(threads[i], NULL);
return 0;
}
and the following command:
1. run `stress_test /bin/ls` in one windown
2. hack bcc trace.py with the following change:
--- a/tools/trace.py
+++ b/tools/trace.py
@@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s);
__data.tgid = __tgid;
__data.pid = __pid;
bpf_get_current_comm(&__data.comm, sizeof(__data.comm));
+ bpf_send_signal(10);
%s
%s
%s.perf_submit(%s, &__data, sizeof(__data));
3. in a different window run
./trace.py -p $(pidof stress_test) t:sched:sched_switch
The deadlock can be reproduced in our production system.
Similar to Song's fix, the fix is to delay sending signal if
irqs is disabled to avoid deadlocks involving with rq_lock.
With this change, my above stress-test in our production system
won't cause deadlock any more.
I also implemented a scale-down version of reproducer in the
selftest (a subsequent commit). With latest bpf-next,
it complains for the following potential deadlock.
[ 32.832450] -> #1 (&p->pi_lock){-.-.}:
[ 32.833100] _raw_spin_lock_irqsave+0x44/0x80
[ 32.833696] task_rq_lock+0x2c/0xa0
[ 32.834182] task_sched_runtime+0x59/0xd0
[ 32.834721] thread_group_cputime+0x250/0x270
[ 32.835304] thread_group_cputime_adjusted+0x2e/0x70
[ 32.835959] do_task_stat+0x8a7/0xb80
[ 32.836461] proc_single_show+0x51/0xb0
...
[ 32.839512] -> #0 (&(&sighand->siglock)->rlock){....}:
[ 32.840275] __lock_acquire+0x1358/0x1a20
[ 32.840826] lock_acquire+0xc7/0x1d0
[ 32.841309] _raw_spin_lock_irqsave+0x44/0x80
[ 32.841916] __lock_task_sighand+0x79/0x160
[ 32.842465] do_send_sig_info+0x35/0x90
[ 32.842977] bpf_send_signal+0xa/0x10
[ 32.843464] bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000
[ 32.844301] trace_call_bpf+0x115/0x270
[ 32.844809] perf_trace_run_bpf_submit+0x4a/0xc0
[ 32.845411] perf_trace_sched_switch+0x10f/0x180
[ 32.846014] __schedule+0x45d/0x880
[ 32.846483] schedule+0x5f/0xd0
...
[ 32.853148] Chain exists of:
[ 32.853148] &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock
[ 32.853148]
[ 32.854451] Possible unsafe locking scenario:
[ 32.854451]
[ 32.855173] CPU0 CPU1
[ 32.855745] ---- ----
[ 32.856278] lock(&rq->lock);
[ 32.856671] lock(&p->pi_lock);
[ 32.857332] lock(&rq->lock);
[ 32.857999] lock(&(&sighand->siglock)->rlock);
Deadlock happens on CPU0 when it tries to acquire &sighand->siglock
but it has been held by CPU1 and CPU1 tries to grab &rq->lock
and cannot get it.
This is not exactly the callstack in our production environment,
but sympotom is similar and both locks are using spin_lock_irqsave()
to acquire the lock, and both involves rq_lock. The fix to delay
sending signal when irq is disabled also fixed this issue.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200304191104.2796501-1-yhs@fb.com
|
|
Amlogic VP9 decoder requires an additional 16-byte payload before every
frame header.
The source buffer is updated in-place, then given to the Parser FIFO DMA.
The FIFO DMA copies the blocks into the 16MiB parser ring buffer, then parses
and copies the slice into the decoder "workspace".
Signed-off-by: Maxime Jourdan <mjourdan@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
Add support for the HEVC & VP9 common decoder support, handling
Amlogic GXBB, GXL, G12A and SM1 platforms.
This handles the "HEVC" hw decoder used for HEVC and VP9, and will be
using in the new H264 multi-instance decoder for G12A & SM1 platforms.
Signed-off-by: Maxime Jourdan <mjourdan@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
Add helpers to support the lossless framebuffer compression format that
will be used in HEVC & VP9 decoders when decoding 10bit content for
downsampling to 8bit NV12 and later proper compressed buffer support.
Signed-off-by: Maxime Jourdan <mjourdan@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
The HEVC/VP9 aligns the plane stride on 32, so align the planes stride
for all codecs to 32 to satisfy HEVC/VP9 decoding using the "HEVC" HW.
This fixes VP9 decoding of streams with following (not limited) widths:
- 264
-288
- 350
- 352
- 472
- 480
- 528
- 600
- 720
- 800
- 848
- 1440
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|