Age | Commit message (Collapse) | Author |
|
According to the ACPI spec, "The system must reset immediately following
the write to the ACPI RESET_REG register.", but there are cases that the
system does not follow this and results in racing with the subsequetial
reboot mechanism, which brings unexpected behavior.
Fix this by adding a 15ms delay after writing to the ACPI RESET_REG.
Reported-by: Ghorai, Sukumar <sukumar.ghorai@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Edit comment in the code and subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If ACPI is disabled then loading the acpi_dbg module will result in the
following splat when lock debugging is enabled.
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 0 PID: 1 at kernel/locking/mutex.c:938 __mutex_lock+0xa10/0x1290
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc8+ #103
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x4d8
show_stack+0x34/0x48
dump_stack+0x174/0x1f8
panic+0x360/0x7a0
__warn+0x244/0x2ec
report_bug+0x240/0x398
bug_handler+0x50/0xc0
call_break_hook+0x160/0x1d8
brk_handler+0x30/0xc0
do_debug_exception+0x184/0x340
el1_dbg+0x48/0xb0
el1_sync_handler+0x170/0x1c8
el1_sync+0x80/0x100
__mutex_lock+0xa10/0x1290
mutex_lock_nested+0x6c/0xc0
acpi_register_debugger+0x40/0x88
acpi_aml_init+0xc4/0x114
do_one_initcall+0x24c/0xb10
kernel_init_freeable+0x690/0x728
kernel_init+0x20/0x1e8
ret_from_fork+0x10/0x18
This is because acpi_debugger.lock has not been initialized as
acpi_debugger_init() is not called when ACPI is disabled. Fail module
loading to avoid this and any subsequent problems that might arise by
trying to debug AML when ACPI is disabled.
Fixes: 8cfb0cdf07e2 ("ACPI / debugger: Add IO interface to access debugger functionalities")
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
To enable better debug of PM domains, keep a track of successful
and failing attempts to enter each domain idle state.
This statistics are exported in debugfs when reading the
idle_states node associated with each PM domain.
Signed-off-by: Lina Iyer <ilina@codeaurora.org>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There is not strict need to group a comment and a single statement in an
if block, as comments are stripped by the pre-processor. However,
adding curly braces does make the code easier to read, and may avoid
mistakes when changing the code later.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The main intention of the max_segment argument to
__sg_alloc_table_from_pages() is to match the DMA layer segment size set
by dma_set_max_seg_size().
Restricting the input to be page aligned makes it impossible to just
connect the DMA layer to this API.
The only reason for a page alignment here is because the algorithm will
overshoot the max_segment if it is not a multiple of PAGE_SIZE. Simply fix
the alignment before starting and don't expose this implementation detail
to the callers.
A future patch will completely remove SCATTERLIST_MAX_SEGMENT.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
From Maor Gottlieb says:
====================
This series extends __sg_alloc_table_from_pages to allow chaining of new
pages to an already initialized SG table.
This allows for drivers to utilize the optimization of merging contiguous
pages without a need to pre allocate all the pages and hold them in a very
large temporary buffer prior to the call to SG table initialization.
The last patch changes the Infiniband core to use the new API. It removes
duplicate functionality from the code and benefits from the optimization
of allocating dynamic SG table from pages.
In huge pages system of 2MB page size, without this change, the SG table
would contain x512 SG entries.
====================
* branch 'dynamic_sg':
RDMA/umem: Move to allocate SG table from pages
lib/scatterlist: Add support in dynamic allocation of SG table from pages
tools/testing/scatterlist: Show errors in human readable form
tools/testing/scatterlist: Rejuvenate bit-rotten test
|
|
A device may have specific HW constraints that must be obeyed to, before
its corresponding PM domain (genpd) can be powered off - and vice verse at
power on. These constraints can't be managed through the regular runtime PM
based deployment for a device, because the access pattern for it, isn't
always request based. In other words, using the runtime PM callbacks to
deal with the constraints doesn't work for these cases.
For these reasons, let's instead add a PM domain power on/off notification
mechanism to genpd. To add/remove a notifier for a device, the device must
already have been attached to the genpd, which also means that it needs to
be a part of the PM domain topology.
To add/remove a notifier, let's introduce two genpd specific functions:
- dev_pm_genpd_add|remove_notifier()
Note that, to further clarify when genpd power on/off notifiers may be
used, one can compare with the existing CPU_CLUSTER_PM_ENTER|EXIT
notifiers. In the long run, the genpd power on/off notifiers should be able
to replace them, but that requires additional genpd based platform support
for the current users.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Lina Iyer <ilina@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
domain
On multi-package systems, the Psys MSR is only valid for CPUs on
specific package (master package). The current code makes the
assumption that package 0 is the master package, but this is not
true on new platforms like SPR.
Fix the problem by emuerating the Psys RAPL domain for every
package, so CPUs in slave packages will read 0 for the Psys energy
counter and only CPUs in master packages can get a valid reading
and register the Psys RAPL domain.
The sysfs I/F for the Psys RAPL domain is not changed.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
As only the low 32 bits of the RAPL_DOMAIN_REG_STATUS register
represents the energy counter, and the high 32 bits are reserved,
detect the existence of a RAPL domain by checking the low 32 bits only.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Before commit 9495b7e92f716ab2 ("driver core: platform: Initialize
dma_parms for platform devices"), the R-Car SATA device didn't have DMA
parameters. Hence the DMA boundary mask supplied by its driver was
silently ignored, as __scsi_init_queue() doesn't check the return value
of dma_set_seg_boundary(), and the default value of 0xffffffff was used.
Now the device has gained DMA parameters, the driver-supplied value is
used, and the following warning is printed on Salvator-XS:
DMA-API: sata_rcar ee300000.sata: mapping sg segment across boundary [start=0x00000000ffffe000] [end=0x00000000ffffefff] [boundary=0x000000001ffffffe]
WARNING: CPU: 5 PID: 38 at kernel/dma/debug.c:1233 debug_dma_map_sg+0x298/0x300
(the range of start/end values depend on whether IOMMU support is
enabled or not)
The issue here is that SATA_RCAR_DMA_BOUNDARY doesn't have bit 0 set, so
any typical end value, which is odd, will trigger the check.
Fix this by increasing the DMA boundary value by 1.
This also fixes the following WRITE DMA EXT timeout issue:
# dd if=/dev/urandom of=/mnt/de1/file1-1024M bs=1M count=1024
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.00: failed command: WRITE DMA EXT
ata1.00: cmd 35/00:00:00:e6:0c/00:0a:00:00:00/e0 tag 0 dma 1310720 out
res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
as seen by Shimoda-san since commit 429120f3df2dba2b ("block: fix
splitting segments on boundary masks").
Fixes: 8bfbeed58665dbbf ("sata_rcar: correct 'sata_rcar_sht'")
Fixes: 9495b7e92f716ab2 ("driver core: platform: Initialize dma_parms for platform devices")
Fixes: 429120f3df2dba2b ("block: fix splitting segments on boundary masks")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
sgl_alloc_order() can fail when 'length' is large on a memory
constrained system. When order > 0 it will potentially be
making several multi-page allocations with the later ones more
likely to fail than the earlier one. So it is important that
sgl_alloc_order() frees up any pages it has obtained before
returning NULL. In the case when order > 0 it calls the wrong
free page function and leaks. In testing the leak was
sufficient to bring down my 8 GiB laptop with OOM.
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is an off-by-one array check that can lead to a out-of-bounds
write to devices->info[i]. Fix this by checking by using >= rather
than > for the size check. Also replace hard-coded array size limit
with ARRAY_SIZE on the array.
Addresses-Coverity: ("Out-of-bounds write")
Fixes: cd9e9808d18f ("lightnvm: Support for Open-Channel SSDs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") avoids
enabling c-states that have been disabled by the platform with the
exception of C1E.
Unfortunately, BIOS implementations are not always consistent in terms
of how capabilities are advertised and control cannot always be handed
over. If control cannot be handed over then intel_idle reports that "ACPI
_CST not found or not usable" but does not clear acpi_state_table.count
meaning the information is still partially used.
This patch ignores ACPI information if CST control cannot be requested from
the platform. This was only observed on a number of Haswell platforms that
had identical CPUs but not identical BIOS versions. While this problem
may be rare overall, 24 separate test cases bisected to this specific
commit across 4 separate test machines and is worth addressing. If the
situation occurs, the kernel behaves as it did before commit e6d4f08a6776
and uses any c-states that are discovered.
The affected test cases were all ones that involved a small number of
processes -- exec microbenchmark, pipe microbenchmark, git test suite,
netperf, tbench with one client and system call microbenchmark. Each
case benefits from being able to use turboboost which is prevented if the
lower c-states are unavailable. This may mask real regressions specific
to older hardware so it is worth addressing.
C-state status before and after the patch
5.9.0-vanilla POLL latency:0 disabled:0 default:enabled
5.9.0-vanilla C1 latency:2 disabled:0 default:enabled
5.9.0-vanilla C1E latency:10 disabled:0 default:enabled
5.9.0-vanilla C3 latency:33 disabled:1 default:disabled
5.9.0-vanilla C6 latency:133 disabled:1 default:disabled
5.9.0-ignore-cst-v1r1 POLL latency:0 disabled:0 default:enabled
5.9.0-ignore-cst-v1r1 C1 latency:2 disabled:0 default:enabled
5.9.0-ignore-cst-v1r1 C1E latency:10 disabled:0 default:enabled
5.9.0-ignore-cst-v1r1 C3 latency:33 disabled:0 default:enabled
5.9.0-ignore-cst-v1r1 C6 latency:133 disabled:0 default:enabled
Patch enables C3/C6.
Netperf UDP_STREAM
netperf-udp
5.5.0 5.9.0
vanilla ignore-cst-v1r1
Hmean send-64 193.41 ( 0.00%) 226.54 * 17.13%*
Hmean send-128 392.16 ( 0.00%) 450.54 * 14.89%*
Hmean send-256 769.94 ( 0.00%) 881.85 * 14.53%*
Hmean send-1024 2994.21 ( 0.00%) 3468.95 * 15.85%*
Hmean send-2048 5725.60 ( 0.00%) 6628.99 * 15.78%*
Hmean send-3312 8468.36 ( 0.00%) 10288.02 * 21.49%*
Hmean send-4096 10135.46 ( 0.00%) 12387.57 * 22.22%*
Hmean send-8192 17142.07 ( 0.00%) 19748.11 * 15.20%*
Hmean send-16384 28539.71 ( 0.00%) 30084.45 * 5.41%*
Fixes: e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: 5.6+ <stable@vger.kernel.org> # 5.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The cpuidle.h header is declaring a function with an empty stub
for the cpuidle disabled case, but that function is only called
by cpuidle governors which depend on cpuidle anyway.
In other words, the function is only called when cpuidle is enabled,
so there is no need for the stub.
Remove the pointless stub.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Intel SDM does not explicitly say that entering a C-state via MWAIT will
implicitly flush CPU caches as appropriate for that C-state. However,
documentation for individual Intel CPU generations does mention this
behavior.
Since intel_idle binds to any Intel CPU with MWAIT, list this assumption
of MWAIT behavior.
In passing, reword opening comment to make it clear that the driver can
load on any old and future Intel CPU with MWAIT.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The cpuidle-psci-domain.c is not listed in the section for the cpuidle
driver for ARM PSCI.
From discussions at LKML, Lorenzo and Sudeep prefer to add a separate
section for it, so do that and add myself as the maintainer for that
part.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
driver
There is a corner case that if the intel_pstate driver fails to be
registered (might be due to invalid MSR access) and acpi_cpufreq
takse over, the intel_pstate sysfs interface is still populated
which may confuse user space (turbostat for example):
grep . /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
acpi-cpufreq
grep . /sys/devices/system/cpu/intel_pstate/*
/sys/devices/system/cpu/intel_pstate/max_perf_pct:0
/sys/devices/system/cpu/intel_pstate/min_perf_pct:0
grep: /sys/devices/system/cpu/intel_pstate/no_turbo: Resource temporarily unavailable
grep: /sys/devices/system/cpu/intel_pstate/num_pstates: Resource temporarily unavailable
/sys/devices/system/cpu/intel_pstate/status:off
grep: /sys/devices/system/cpu/intel_pstate/turbo_pct: Resource temporarily unavailable
The mere presence of the intel_pstate sysfs interface does not mean
that intel_pstate is in use (for example, echo "off" to "status"),
but it should not be created in the failing case.
Fix this issue by deleting the intel_pstate sysfs if the driver
registration fails.
Reported-by: Wendy Wang <wendy.wang@intel.com>
Suggested-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com
[ rjw: Refactor code to avoid jumps, change function name, changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The cpufreq core checks if the frequency programmed by the bootloaders
is not listed in the freq table and programs one from the table in such
a case. This is done only if the driver has set the
CPUFREQ_NEED_INITIAL_FREQ_CHECK flag.
Currently we print two separate messages, with almost the same content,
and do this with a pr_warn() which may be a bit too much as the driver
only asked us to check this as it expected this to be the case. Lower
down the severity of the print message by switching to pr_info() instead
and print a single message only.
Reported-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
Tested-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux
Pull cpupower utility updates for 5.10-rc1 from Shuah Khan:
"This update consists of minor fixes for spelling and speeding up
generating git version string which will in turn speed up compiles."
* tag 'linux-cpupower-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
cpupower: speed up generating git version string
cpupowerutils: fix spelling mistake "dependant" -> "dependent"
|
|
The commit 720dee53ad8d ("tracing/boot: Initialize per-instance event
list in early boot") removes __init from __trace_early_add_events()
but __trace_early_add_new_event() still has __init and will cause a
section mismatch.
Remove __init from __trace_early_add_new_event() as same as
__trace_early_add_events().
Link: https://lore.kernel.org/lkml/CAHk-=wjU86UhovK4XuwvCqTOfc+nvtpAuaN2PJBz15z=w=u0Xg@mail.gmail.com/
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Don't give an assertion failure on unpurgeable afs_server records - which
kills the thread - but rather emit a trace line when we are purging a
record (which only happens during network namespace removal or rmmod) and
print a notice of the problem.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Add a tracepoint to log the cell refcount and active user count and pass in
a reason code through various functions that manipulate these counters.
Additionally, a helper function, afs_see_cell(), is provided to log
interesting places that deal with a cell without actually doing any
accounting directly.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix cell removal by inserting a more final state than AFS_CELL_FAILED that
indicates that the cell has been unpublished in case the manager is already
requeued and will go through again. The new AFS_CELL_REMOVED state will
just immediately leave the manager function.
Going through a second time in the AFS_CELL_FAILED state will cause it to
try to remove the cell again, potentially leading to the proc list being
removed.
Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Reported-by: syzbot+b994ecf2b023f14832c1@syzkaller.appspotmail.com
Reported-by: syzbot+0e0db88e1eb44a91ae8d@syzkaller.appspotmail.com
Reported-by: syzbot+2d0585e5efcd43d113c2@syzkaller.appspotmail.com
Reported-by: syzbot+1ecc2f9d3387f1d79d42@syzkaller.appspotmail.com
Reported-by: syzbot+18d51774588492bf3f69@syzkaller.appspotmail.com
Reported-by: syzbot+a5e4946b04d6ca8fa5f3@syzkaller.appspotmail.com
Suggested-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hillf Danton <hdanton@sina.com>
|
|
When the afs module is removed, one of the things that has to be done is to
purge the cell database. afs_cell_purge() cancels the management timer and
then starts the cell manager work item to do the purging. This does a
single run through and then assumes that all cells are now purged - but
this is no longer the case.
With the introduction of alias detection, a later cell in the database can
now be holding an active count on an earlier cell (cell->alias_of). The
purge scan passes by the earlier cell first, but this can't be got rid of
until it has discarded the alias. Ordinarily, afs_unuse_cell() would
handle this by setting the management timer to trigger another pass - but
afs_set_cell_timer() doesn't do anything if the namespace is being removed
(net->live == false). rmmod then hangs in the wait on cells_outstanding in
afs_cell_purge().
Fix this by making afs_set_cell_timer() directly queue the cell manager if
net->live is false. This causes additional management passes.
Queueing the cell manager increments cells_outstanding to make sure the
wait won't complete until all cells are destroyed.
Fixes: 8a070a964877 ("afs: Detect cell aliases 1 - Cells with root volumes")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Management of the lifetime of afs_cell struct has some problems due to the
usage counter being used to determine whether objects of that type are in
use in addition to whether anyone might be interested in the structure.
This is made trickier by cell objects being cached for a period of time in
case they're quickly reused as they hold the result of a setup process that
may be slow (DNS lookups, AFS RPC ops).
Problems include the cached root volume from alias resolution pinning its
parent cell record, rmmod occasionally hanging and occasionally producing
assertion failures.
Fix this by splitting the count of active users from the struct reference
count. Things then work as follows:
(1) The cell cache keeps +1 on the cell's activity count and this has to
be dropped before the cell can be removed. afs_manage_cell() tries to
exchange the 1 to a 0 with the cells_lock write-locked, and if
successful, the record is removed from the net->cells.
(2) One struct ref is 'owned' by the activity count. That is put when the
active count is reduced to 0 (final_destruction label).
(3) A ref can be held on a cell whilst it is queued for management on a
work queue without confusing the active count. afs_queue_cell() is
added to wrap this.
(4) The queue's ref is dropped at the end of the management. This is
split out into a separate function, afs_manage_cell_work().
(5) The root volume record is put after a cell is removed (at the
final_destruction label) rather then in the RCU destruction routine.
(6) Volumes hold struct refs, but aren't active users.
(7) Both counts are displayed in /proc/net/afs/cells.
There are some management function changes:
(*) afs_put_cell() now just decrements the refcount and triggers the RCU
destruction if it becomes 0. It no longer sets a timer to have the
manager do this.
(*) afs_use_cell() and afs_unuse_cell() are added to increase and decrease
the active count. afs_unuse_cell() sets the management timer.
(*) afs_queue_cell() is added to queue a cell with approprate refs.
There are also some other fixes:
(*) Don't let /proc/net/afs/cells access a cell's vllist if it's NULL.
(*) Make sure that candidate cells in lookups are properly destroyed
rather than being simply kfree'd. This ensures the bits it points to
are destroyed also.
(*) afs_dec_cells_outstanding() is now called in cell destruction rather
than at "final_destruction". This ensures that cell->net is still
valid to the end of the destructor.
(*) As a consequence of the previous two changes, move the increment of
net->cells_outstanding that was at the point of insertion into the
tree to the allocation routine to correctly balance things.
Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
RFC 7862 introduced a new flag that either client or server is
allowed to set: EXCHGID4_FLAG_SUPP_FENCE_OPS.
Client needs to update its bitmask to allow for this flag value.
v2: changed minor version argument to unsigned int
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
There are a number of problems that are being seen by the rapidly mounting
and unmounting an afs dynamic root with an explicit cell and volume
specified (which should probably be rejected, but that's a separate issue):
What the tests are doing is to look up/create a cell record for the name
given and then tear it down again without actually using it to try to talk
to a server. This is repeated endlessly, very fast, and the new cell
collides with the old one if it's not quick enough to reuse it.
It appears (as suggested by Hillf Danton) that the search through the RB
tree under a read_seqbegin_or_lock() under RCU conditions isn't safe and
that it's not blocking the write_seqlock(), despite taking two passes at
it. He suggested that the code should take a ref on the cell it's
attempting to look at - but this shouldn't be necessary until we've
compared the cell names. It's possible that I'm missing a barrier
somewhere.
However, using an RCU search for this is overkill, really - we only need to
access the cell name in a few places, and they're places where we're may
end up sleeping anyway.
Fix this by switching to an R/W semaphore instead.
Additionally, draw the down_read() call inside the function (renamed to
afs_find_cell()) since all the callers were taking the RCU read lock (or
should've been[*]).
[*] afs_probe_cell_name() should have been, but that doesn't appear to be
involved in the bug reports.
The symptoms of this look like:
general protection fault, probably for non-canonical address 0xf27d208691691fdb: 0000 [#1] PREEMPT SMP KASAN
KASAN: maybe wild-memory-access in range [0x93e924348b48fed8-0x93e924348b48fedf]
...
RIP: 0010:strncasecmp lib/string.c:52 [inline]
RIP: 0010:strncasecmp+0x5f/0x240 lib/string.c:43
afs_lookup_cell_rcu+0x313/0x720 fs/afs/cell.c:88
afs_lookup_cell+0x2ee/0x1440 fs/afs/cell.c:249
afs_parse_source fs/afs/super.c:290 [inline]
...
Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Reported-by: syzbot+459a5dce0b4cb70fd076@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hillf Danton <hdanton@sina.com>
cc: syzkaller-bugs@googlegroups.com
|
|
Use of nmi_enter/exit in real mode handler causes the kernel to panic
and reboot on injecting SLB mutihit on pseries machine running in hash
MMU mode, because these calls try to accesses memory outside RMO
region in real mode handler where translation is disabled.
Add check to not to use these calls on pseries machine running in hash
MMU mode.
Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI accounting")
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201009064005.19777-2-ganeshgr@linux.ibm.com
|
|
The update_devfreq() is also documented at devfreq.c, which
has a more complete note.
So, drop the duplicated markup, in order to avoid this
warning:
.../Documentation/driver-api/device_link.rst: WARNING: Duplicate C declaration, also defined in 'driver-api/infrastructure'.
Declaration is 'device_link_state'.
(and to cause a problem with cross-references to it)
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
Literal blocks with :: markup should be indented, as otherwise
Sphinx complains:
Documentation/vm/hmm.rst:363: WARNING: Literal block expected; none found.
Fixes: f7ebd9ed7767 ("mm/doc: add usage description for migrate_vma_*()")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
As warned by Sphinx:
./Documentation/core-api/workqueue:400: ./kernel/workqueue.c:1218: WARNING: Unexpected indentation.
the return code table is currently not recognized, as it lacks
markups.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
There's a missing new line for a literal block:
.../Documentation/virt/uml/user_mode_linux_howto_v2.rst:682: WARNING: Unexpected indentation.
Fixes: 04301bf5b072 ("docs: replace the old User Mode Linux HowTo with a new one")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
Add new module load parameter enable_gcm_256. If set, then add
AES-256-GCM (strongest encryption type) to the list of encryption
types requested. Put it in the list as the second choice (since
AES-128-GCM is faster and much more broadly supported by
SMB3 servers). To make this stronger encryption type, GCM-256,
required (the first and only choice, you would use module parameter
"require_gcm_256."
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Add new module load parameter require_gcm_256. If set, then only
request AES-256-GCM (strongest encryption type).
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This is basically the same as STATUS_LOGON_FAILURE,
but after the account is locked out.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently there are three supported signing algorithms
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
RHBZ: 1848178
Some calls that set attributes, like utimensat(), are not supposed to return
-EINTR and thus do not have handlers for this in glibc which causes us
to leak -EINTR to the applications which are also unprepared to handle it.
For example tar will break if utimensat() return -EINTR and abort unpacking
the archive. Other applications may break too.
To handle this we add checks, and retry, for -EINTR in cifs_setattr()
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently STATUS_IO_TIMEOUT is not treated as retriable error.
It is currently mapped to ETIMEDOUT and returned to userspace
for most system calls. STATUS_IO_TIMEOUT is returned by server
in case of unavailability or throttling errors.
This patch will map the STATUS_IO_TIMEOUT to EAGAIN, so that it
can be retried. Also, added a check to drop the connection to
not overload the server in case of ongoing unavailability.
Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Even though we use self removing sysfs helper, we still need
to make sure we do the final kobject delete conditionally.
sysfs_remove_file_self() will handle parallel calls to remove
the sysfs attribute file and returns true only in the caller
that removed the attribute file. The other parallel callers
are returned false. Do the final kobject delete checking
the return value of sysfs_remove_file_self().
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201014064813.109515-1-aneesh.kumar@linux.ibm.com
|
|
As reported by Stephen, dsb() is not declared on e.g. x86_64, preventing
the mtp_scp from building. Simply remove the barrier (and the readback),
suggested by Pi-Hsun to resolve this.
Fixes: fd0b6c1ff85a ("remoteproc/mediatek: Add support for mt8192 SCP")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Suggested-by: Pi-Hsun Shih <pihsun@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
- Add redirect_neigh() BPF packet redirect helper, allowing to limit
stack traversal in common container configs and improving TCP
back-pressure.
Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.
- Expand netlink policy support and improve policy export to user
space. (Ge)netlink core performs request validation according to
declared policies. Expand the expressiveness of those policies
(min/max length and bitmasks). Allow dumping policies for particular
commands. This is used for feature discovery by user space (instead
of kernel version parsing or trial and error).
- Support IGMPv3/MLDv2 multicast listener discovery protocols in
bridge.
- Allow more than 255 IPv4 multicast interfaces.
- Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
packets of TCPv6.
- In Multi-patch TCP (MPTCP) support concurrent transmission of data on
multiple subflows in a load balancing scenario. Enhance advertising
addresses via the RM_ADDR/ADD_ADDR options.
- Support SMC-Dv2 version of SMC, which enables multi-subnet
deployments.
- Allow more calls to same peer in RxRPC.
- Support two new Controller Area Network (CAN) protocols - CAN-FD and
ISO 15765-2:2016.
- Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
kernel problem.
- Add TC actions for implementing MPLS L2 VPNs.
- Improve nexthop code - e.g. handle various corner cases when nexthop
objects are removed from groups better, skip unnecessary
notifications and make it easier to offload nexthops into HW by
converting to a blocking notifier.
- Support adding and consuming TCP header options by BPF programs,
opening the doors for easy experimental and deployment-specific TCP
option use.
- Reorganize TCP congestion control (CC) initialization to simplify
life of TCP CC implemented in BPF.
- Add support for shipping BPF programs with the kernel and loading
them early on boot via the User Mode Driver mechanism, hence reusing
all the user space infra we have.
- Support sleepable BPF programs, initially targeting LSM and tracing.
- Add bpf_d_path() helper for returning full path for given 'struct
path'.
- Make bpf_tail_call compatible with bpf-to-bpf calls.
- Allow BPF programs to call map_update_elem on sockmaps.
- Add BPF Type Format (BTF) support for type and enum discovery, as
well as support for using BTF within the kernel itself (current use
is for pretty printing structures).
- Support listing and getting information about bpf_links via the bpf
syscall.
- Enhance kernel interfaces around NIC firmware update. Allow
specifying overwrite mask to control if settings etc. are reset
during update; report expected max time operation may take to users;
support firmware activation without machine reboot incl. limits of
how much impact reset may have (e.g. dropping link or not).
- Extend ethtool configuration interface to report IEEE-standard
counters, to limit the need for per-vendor logic in user space.
- Adopt or extend devlink use for debug, monitoring, fw update in many
drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx,
dpaa2-eth).
- In mlxsw expose critical and emergency SFP module temperature alarms.
Refactor port buffer handling to make the defaults more suitable and
support setting these values explicitly via the DCBNL interface.
- Add XDP support for Intel's igb driver.
- Support offloading TC flower classification and filtering rules to
mscc_ocelot switches.
- Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
fixed interval period pulse generator and one-step timestamping in
dpaa-eth.
- Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
offload.
- Add Lynx PHY/PCS MDIO module, and convert various drivers which have
this HW to use it. Convert mvpp2 to split PCS.
- Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
7-port Mediatek MT7531 IP.
- Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
and wcn3680 support in wcn36xx.
- Improve performance for packets which don't require much offloads on
recent Mellanox NICs by 20% by making multiple packets share a
descriptor entry.
- Move chelsio inline crypto drivers (for TLS and IPsec) from the
crypto subtree to drivers/net. Move MDIO drivers out of the phy
directory.
- Clean up a lot of W=1 warnings, reportedly the actively developed
subsections of networking drivers should now build W=1 warning free.
- Make sure drivers don't use in_interrupt() to dynamically adapt their
code. Convert tasklets to use new tasklet_setup API (sadly this
conversion is not yet complete).
* tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits)
Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH"
net, sockmap: Don't call bpf_prog_put() on NULL pointer
bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo
bpf, sockmap: Add locking annotations to iterator
netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements
net: fix pos incrementment in ipv6_route_seq_next
net/smc: fix invalid return code in smcd_new_buf_create()
net/smc: fix valid DMBE buffer sizes
net/smc: fix use-after-free of delayed events
bpfilter: Fix build error with CONFIG_BPFILTER_UMH
cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr
net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
bpf: Fix register equivalence tracking.
rxrpc: Fix loss of final ack on shutdown
rxrpc: Fix bundle counting for exclusive connections
netfilter: restore NF_INET_NUMHOOKS
ibmveth: Identify ingress large send packets.
ibmveth: Switch order of ibmveth_helper calls.
cxgb4: handle 4-tuple PEDIT to NAT mode translation
selftests: Add VRF route leaking tests
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity updates from Mimi Zohar:
"Continuing IMA policy rule cleanup and validation in particular for
measuring keys, adding/removing/updating informational and error
messages (e.g. "ima_appraise" boot command line option), and other bug
fixes (e.g. minimal data size validation before use, return code and
NULL pointer checking)"
* tag 'integrity-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: Fix NULL pointer dereference in ima_file_hash
evm: Check size of security.evm before using it
ima: Remove semicolon at the end of ima_get_binary_runtime_size()
ima: Don't ignore errors from crypto_shash_update()
ima: Use kmemdup rather than kmalloc+memcpy
integrity: include keyring name for unknown key request
ima: limit secure boot feedback scope for appraise
integrity: invalid kernel parameters feedback
ima: add check for enforced appraise option
integrity: Use current_uid() in integrity_audit_message()
ima: Fail rule parsing when asymmetric key measurement isn't supportable
ima: Pre-parse the list of keyrings in a KEY_CHECK rule
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"Updates for tracing and bootconfig:
- Add support for "bool" type in synthetic events
- Add per instance tracing for bootconfig
- Support perf-style return probe ("SYMBOL%return") in kprobes and
uprobes
- Allow for kprobes to be enabled earlier in boot up
- Added tracepoint helper function to allow testing if tracepoints
are enabled in headers
- Synthetic events can now have dynamic strings (variable length)
- Various fixes and cleanups"
* tag 'trace-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (58 commits)
tracing: support "bool" type in synthetic trace events
selftests/ftrace: Add test case for synthetic event syntax errors
tracing: Handle synthetic event array field type checking correctly
selftests/ftrace: Change synthetic event name for inter-event-combined test
tracing: Add synthetic event error logging
tracing: Check that the synthetic event and field names are legal
tracing: Move is_good_name() from trace_probe.h to trace.h
tracing: Don't show dynamic string internals in synthetic event description
tracing: Fix some typos in comments
tracing/boot: Add ftrace.instance.*.alloc_snapshot option
tracing: Fix race in trace_open and buffer resize call
tracing: Check return value of __create_val_fields() before using its result
tracing: Fix synthetic print fmt check for use of __get_str()
tracing: Remove a pointless assignment
ftrace: ftrace_global_list is renamed to ftrace_ops_list
ftrace: Format variable declarations of ftrace_allocate_records
ftrace: Simplify the calculation of page number for ftrace_page->records
ftrace: Simplify the dyn_ftrace->flags macro
ftrace: Simplify the hash calculation
ftrace: Use fls() to get the bits for dup_hash()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull another Hyper-V update from Wei Liu:
"One patch from Michael to get VMbus interrupt from ACPI DSDT"
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Add parsing of VMbus interrupt in ACPI DSDT
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
- Added fw_cfg support for parisc on qemu
- Added font support in sti text console driver for byte- and word-mode
ROMs
- Switch to more fine grained lws locks and improve spinlock handling
- Add ioread64_hi_lo() and iowrite64_hi_lo() to avoid 0-day linking
errors
- Mark pointers volatile in __xchg8(), __xchg32() and __xchg64() to
help compiler
- Header file cleanups, mostly removal of unused HP-UX compat defines
- Drop one bit from our O_NONBLOCK define to become now 000200000
- Add MAP_UNINITIALIZED define to avoid userspace compile errors
- Drop CONFIG_IDE from defconfigs
- Speed up synchronize_caches() on UP machines
- Rewrite tlb flush threshold calculation
- Comment fixes and cleanups
* 'parisc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc/sticon: Add user font support
parisc/sticon: Always register sticon console driver
parisc: Add MAP_UNINITIALIZED define
parisc: Improve spinlock handling
parisc: Install vmlinuz instead of zImage file
parisc: Rewrite tlb flush threshold calculation
parisc: Switch to more fine grained lws locks
parisc: Mark pointers volatile in __xchg8(), __xchg32() and __xchg64()
parisc: Fix comments and enable interrupts later
parisc: Add alternative patching to synchronize_caches define
parisc: Add ioread64_hi_lo() and iowrite64_hi_lo()
parisc: disable CONFIG_IDE in defconfigs
parisc: Drop useless comments in uapi/asm/signal.h
parisc: Define O_NONBLOCK to become 000200000
parisc: Drop HP-UX specific fcntl and signal flags
parisc: Avoid external interrupts when IPI finishes
parisc: Add qemu fw_cfg interface
fw_cfg: Add support for parisc architecture
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kunit updates from Shuah Khan:
"Several kunit tool bug fixes in flag handling, run outside kernel
tree, make errors, and generating results"
* tag 'linux-kselftest-kunit-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: tool: fix display of make errors
kunit: tool: handle when .kunit exists but .kunitconfig does not
kunit: tool: fix --alltests flag
kunit: tool: allow generating test results in JSON
kunit: tool: fix running kunit_tool from outside kernel tree
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
- speed up headers_install done during selftest build
- add generic make nesting support
- add support to select individual tests:
Selftests build/install generates run_kselftest.sh script to run
selftests on a target system. Currently the script doesn't have
support for selecting individual tests. Add support for it.
With this enhancement, user can select test collections (or tests)
individually. e.g:
run_kselftest.sh -c seccomp -t timers:posix_timers -t timers:nanosleep
Additionally adds a way to list all known tests with "-l", usage with
"-h", and perform a dry run without running tests with "-n".
* tag 'linux-kselftest-next-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
doc: dev-tools: kselftest.rst: Update examples and paths
selftests/run_kselftest.sh: Make each test individually selectable
selftests: Extract run_kselftest.sh and generate stand-alone test list
selftests: Add missing gitignore entries
selftests: more general make nesting support
selftests: use "$(MAKE)" instead of "make" for headers_install
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina:
"The latest advances in computer science from the trivial queue"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
xtensa: fix Kconfig typo
spelling.txt: Remove some duplicate entries
mtd: rawnand: oxnas: cleanup/simplify code
selftests: vm: add fragment CONFIG_GUP_BENCHMARK
perf: Fix opt help text for --no-bpf-event
HID: logitech-dj: Fix spelling in comment
bootconfig: Fix kernel message mentioning CONFIG_BOOT_CONFIG
MAINTAINERS: rectify MMP SUPPORT after moving cputype.h
scif: Fix spelling of EACCES
printk: fix global comment
lib/bitmap.c: fix spello
fs: Fix missing 'bit' in comment
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Jiri Kosina:
- Lenovo X1 Tablet support improvements from Mikael Wikström
- "heartbeat" report fix for several Wacom devices from Jason Gerecke
- bounds checking fix in hid-roccat from Dan Carpenter
- stylus battery reporting fix from Dmitry Torokhov
- i2c-hid support for wakeup from suspend-to-idle from Kai-Heng Feng
- new driver for Vivaldi devices from Sean O'Brien
- other assorted small fixes and device ID additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: i2c-hid: Enable wakeup capability from Suspend-to-Idle
HID: add vivaldi HID driver
HID: hid-input: fix stylus battery reporting
HID: wacom: Avoid entering wacom_wac_pen_report for pad / battery
HID: i2c-hid: fix kerneldoc warnings in i2c-hid-core.c
HID: core: fix kerneldoc warnings in hid-core.c
HID: multitouch: Lenovo X1 Tablet Gen2 trackpoint and buttons
HID: multitouch: Lenovo X1 Tablet Gen3 trackpoint and buttons
HID: alps: clean up indentation issue
HID: intel-ish-hid: simplify the return expression of ishtp_bus_remove_device()
HID: hid-debug: fix nonblocking read semantics wrt EIO/ERESTARTSYS
HID: i2c-hid: Prefer asynchronous probe
HID: ite: Add USB id match for Acer One S1003 keyboard dock
HID: roccat: add bounds checking in kone_sysfs_write_settings()
HID: wiimote: narrow spinlock range in wiimote_hid_event()
HID: wiimote: make handlers[] const
HID: apple: Add support for Matias wireless keyboard
HID: cp2112: Use irqchip template
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching update from Jiri Kosina:
"livepatching kselftest output fix from Miroslav Benes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
selftests/livepatch: Do not check order when using "comm" for dmesg checking
|