Age | Commit message (Collapse) | Author |
|
- Add a SPDX header;
- Adjust document and section titles;
- Some whitespace fixes and new line breaks;
- Add table markups;
- Use notes markups;
- Add it to filesystems/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.
Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.
End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.
So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.
At the same time, some users simply don't even care.
For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.
This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.
The current semantics end up being:
- __get_user_pages_fast(): no change. If you don't ask for a write,
you won't break COW. You'd better know what you're doing.
- get_user_pages_fast(): the fast-case "look it up in the page tables
without anything getting mmap_sem" now refuses to follow a read-only
page, since it might need COW breaking. Which happens in the slow
path - the fast path doesn't know if the memory might be COW or not.
- get_user_pages() (including the slow-path fallback for gup_fast()):
for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
very similar semantics to FOLL_FORCE.
If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path. So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".
Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.
But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.
[ It might be worth noting that we've always had this ambiguity, and it
could arguably be seen as a user-space issue.
You only get private COW mappings that could break either way in
situations where user space is doing cooperative things (ie fork()
before an execve() etc), but it _is_ surprising and very subtle, and
fork() is supposed to give you independent address spaces.
So let's treat this as a kernel issue and make the semantics of
get_user_pages() easier to understand. Note that obviously a true
shared mapping will still get a page that can change under us, so this
does _not_ mean that get_user_pages() somehow returns any "stable"
page ]
Reported-by: Jann Horn <jannh@google.com>
Tested-by: Christoph Hellwig <hch@lst.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill Shutemov <kirill@shutemov.name>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
All Intel platforms guarantee that all root complex implementations must
send transactions up to IOMMU for address translations. Hence for Intel
RCiEP devices, we can assume some ACS-type isolation even without an ACS
capability.
From the Intel VT-d spec, r3.1, sec 3.16 ("Root-Complex Peer to Peer
Considerations"):
When DMA remapping is enabled, peer-to-peer requests through the
Root-Complex must be handled as follows:
- The input address in the request is translated (through first-level,
second-level or nested translation) to a host physical address (HPA).
The address decoding for peer addresses must be done only on the
translated HPA. Hardware implementations are free to further limit
peer-to-peer accesses to specific host physical address regions (or
to completely disallow peer-forwarding of translated requests).
- Since address translation changes the contents (address field) of
the PCI Express Transaction Layer Packet (TLP), for PCI Express
peer-to-peer requests with ECRC, the Root-Complex hardware must use
the new ECRC (re-computed with the translated address) if it
decides to forward the TLP as a peer request.
- Root-ports, and multi-function root-complex integrated endpoints, may
support additional peer-to-peer control features by supporting PCI
Express Access Control Services (ACS) capability. Refer to ACS
capability in PCI Express specifications for details.
Since Linux didn't give special treatment to allow this exception, certain
RCiEP MFD devices were grouped in a single IOMMU group. This doesn't permit
a single device to be assigned to a guest for instance.
In one vendor system: Device 14.x were grouped in a single IOMMU group.
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:14.3
After this patch:
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/6/devices/0000:00:14.3 <<< new group
14.0 and 14.2 are integrated devices, but legacy end points, whereas 14.3
was a PCIe-compliant RCiEP.
00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30)
Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
This permits assigning this device to a guest VM.
[bhelgaas: drop "Fixes" tag since this doesn't fix a bug in that commit]
Link: https://lore.kernel.org/r/1590699462-7131-1-git-send-email-ashok.raj@intel.com
Tested-by: Darrel Goeddel <dgoeddel@forcepoint.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Mark Scott <mscott@forcepoint.com>,
Cc: Romil Sharma <rsharma@forcepoint.com>
|
|
There is one more regression introduced by the last build fix:
arch/arm/mach-omap2/timer.c:170:6: error: attribute declaration must precede definition [-Werror,-Wignored-attributes]
void __init omap5_realtime_timer_init(void)
^
arch/arm/mach-omap2/common.h:118:20: note: previous definition is here
static inline void omap5_realtime_timer_init(void)
^
arch/arm/mach-omap2/timer.c:170:13: error: redefinition of 'omap5_realtime_timer_init'
void __init omap5_realtime_timer_init(void)
^
arch/arm/mach-omap2/common.h:118:20: note: previous definition is here
static inline void omap5_realtime_timer_init(void)
Address this by removing the now obsolete #ifdefs in that file and
just building the entire file based on the flag that controls the
omap5_realtime_timer_init function declaration.
Link: https://lore.kernel.org/r/20200529201701.521933-1-arnd@arndb.de
Fixes: d86ad463d670 ("ARM: OMAP2+: Fix regression for using local timer on non-SMP SoCs")
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into arm/dt
ARM: Keystone DTS updates for 5.7
Add display support for K2G EVM Board
* tag 'keystone_dts_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
ARM: dts: keystone-k2g-evm: add HDMI video support
ARM: dts: keystone-k2g: Add DSS node
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Fix config file to require CONFIG_TEST_SYSCTL=m instead of y
because this driver introduces a test sysctl interfaces which
are normally not used, and only used for the selftest.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Fix to load test_sysctl.ko module correctly.
sysctl.sh checks whether the test module is embedded (or loaded
already) or not at first, and if not, it returns skip error
instead of trying modprobe. Thus, there is no chance to load the
test_sysctl test module.
Instead, this removes that module embedded check and returns
skip error only if it ensures that there is no embedded test
module *and* no loadable test module.
This also avoid referring config file since that is not
installed.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
test_sysctl.c is expected to be used as a module, but since
it does not use module_init(), it never be registered as
a module and not appeared under /sys/module/.
In the result, the selftests/sysctl/sysctl.sh always fails
to find the test module and is skipped.
This makes test_sysctl.c initialized as a module by module_init()
and allow sysctl.sh to find the test module is loaded.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Make prime number generator independently selectable from
kconfig. This allows us to enable CONFIG_PRIME_NUMBERS=m
and run the tools/testing/selftests/lib/prime_numbers.sh
without other DRM selftest modules.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Implement the ->update op for the big_key type.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
A while back, I noticed that the crypto and crypto API usage in big_keys
were entirely broken in multiple ways, so I rewrote it. Now, I'm
rewriting it again, but this time using the simpler ChaCha20Poly1305
library function. This makes the file considerably more simple; the
diffstat alone should justify this commit. It also should be faster,
since it no longer requires a mutex around the "aead api object" (nor
allocations), allowing us to encrypt multiple items in parallel. We also
benefit from being able to pass any type of pointer, so we can get rid
of the ridiculously complex custom page allocator that big_key really
doesn't need.
[DH: Change the select CRYPTO_LIB_CHACHA20POLY1305 to a depends on as
select doesn't propagate and the build can end up with an =y dependending
on some =m pieces.
The depends on CRYPTO also had to be removed otherwise the configurator
complains about a recursive dependency.]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
|
|
This argument was just never documented in the first place.
Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
|
|
The commit is a nice cleanup, but breaks booting on exynos5 based
chromebooks. It's seems to come down to exynos5's i2c driver not
implementing I2C_FUNC_SMBUS_READ_BLOCK_DATA. It's not yet clear
why that breaks boot / massively slows it down when userspace
starts, so revert the problematic patch.
This reverts commit c4b12a2f3f3de670f6be5e96092a2cab0b877f1a.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
This depends on the simplification of sbs_read_string_data, which
breaks booting exynos5 based chromebooks. More investigation is
required, so this patch and the simplification patch are reverted
for this merge window.
Note, that this is only a partial revert, since sbs_update_presence()
has not been removed. It is also required for the charger broadcast
disabling.
This reverts commit 79bcd5a4a66076a8a8dacd7f4a3be1952283aef4.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
This commit moves channel picking code in separate function.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Fix four minor typos in comments and log messages
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
|
|
MS-SMB2 specification was updated in March. Make minor additions
and corrections to compression related definitions in smb2pdu.h
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
|
|
Currently, 'KDBFLAGS' is an internal variable of kdb, it is combined
by 'KDBDEBUG' and state flags. It will be shown only when 'KDBDEBUG'
is set, and the user can define an environment variable named 'KDBFLAGS'
too. These are puzzling indeed.
After communication with Daniel, it seems that 'KDBFLAGS' is a misfeature.
So let's replace 'KDBFLAGS' with 'KDBDEBUG' to just show the value we
wrote into. After this modification, we can use `md4c1 kdb_flags` instead,
to observe the state flags.
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/r/20200521072125.21103-1-liwei391@huawei.com
[daniel.thompson@linaro.org: Make kdb_flags unsigned to avoid arithmetic
right shift]
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|
From code inspection the math in handle_ctrl_cmd() looks super sketchy
because it subjects -1 from cmdptr and then does a "%
KDB_CMD_HISTORY_COUNT". It turns out that this code works because
"cmdptr" is unsigned and KDB_CMD_HISTORY_COUNT is a nice power of 2.
Let's make this a little less sketchy.
This patch should be a no-op.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200507161125.1.I2cce9ac66e141230c3644b8174b6c15d4e769232@changeid
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|
Implement the read() function in the early console driver. With
recently added kgdboc_earlycon feature, this allows you to use kgdb
to debug fairly early into the system boot.
We only bother implementing this if polling is enabled since kgdb can't
be enabled without that.
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200507130644.v4.12.I8ee0811f0e0816dd8bfe7f2f5540b3dba074fae8@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|
Implement the read() function in the early console driver. With
recent kgdb patches this allows you to use kgdb to debug fairly early
into the system boot.
We only bother implementing this if polling is enabled since kgdb
can't be enabled without that.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200507130644.v4.11.I8f668556c244776523320a95b09373a86eda11b7@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|
Implement the read() function in the early console driver. With
recent kgdb patches this allows you to use kgdb to debug fairly early
into the system boot.
We only bother implementing this if polling is enabled since kgdb
can't be enabled without that.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20200507130644.v4.10.If2deff9679a62c1ce1b8f2558a8635dc837adf8c@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|
Currently there is no guarantee that an earlycon will be initialized
before kgdboc tries to adopt it. Almost the opposite: on systems
with ACPI then if earlycon has no arguments then it is guaranteed that
earlycon will not be initialized.
This patch mitigates the problem by giving kgdboc_earlycon a second
chance during console_init(). This isn't quite as good as stopping during
early parameter parsing but it is still early in the kernel boot.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200430161741.1832050-1-daniel.thompson@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
|
|
The recent patch ("kgdboc: Add kgdboc_earlycon to support early kgdb
using boot consoles") adds a new kernel command line parameter.
Document it.
Note that the patch adding the feature does some comparing/contrasting
of "kgdboc_earlycon" vs. the existing "ekgdboc". See that patch for
more details, but briefly "ekgdboc" can be used _instead_ of "kgdboc"
and just makes "kgdboc" do its normal initialization early (only works
if your tty driver is already ready). The new "kgdboc_earlycon" works
in combination with "kgdboc" and is backed by boot consoles.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200507130644.v4.9.I7d5eb42c6180c831d47aef1af44d0b8be3fac559@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|
When I combined kgdboc_earlycon with an inflight patch titled ("soc:
qcom-geni-se: Add interconnect support to fix earlycon crash") [1]
things went boom. Specifically I got a crash during the transition
between kgdboc_earlycon and the main kgdboc that looked like this:
Call trace:
__schedule_bug+0x68/0x6c
__schedule+0x75c/0x924
schedule+0x8c/0xbc
schedule_timeout+0x9c/0xfc
do_wait_for_common+0xd0/0x160
wait_for_completion_timeout+0x54/0x74
rpmh_write_batch+0x1fc/0x23c
qcom_icc_bcm_voter_commit+0x1b4/0x388
qcom_icc_set+0x2c/0x3c
apply_constraints+0x5c/0x98
icc_set_bw+0x204/0x3bc
icc_put+0x30/0xf8
geni_remove_earlycon_icc_vote+0x6c/0x9c
qcom_geni_serial_earlycon_exit+0x10/0x1c
kgdboc_earlycon_deinit+0x38/0x58
kgdb_register_io_module+0x11c/0x194
configure_kgdboc+0x108/0x174
kgdboc_probe+0x38/0x60
platform_drv_probe+0x90/0xb0
really_probe+0x130/0x2fc
...
The problem was that we were holding the "kgdb_registration_lock"
while calling into code that didn't expect to be called in spinlock
context.
Let's slightly defer when we call the deinit code so that it's not
done under spinlock.
NOTE: this does mean that the "deinit" call of the old kgdb IO module
is now made _after_ the init of the new IO module, but presumably
that's OK.
[1] https://lkml.kernel.org/r/1588919619-21355-3-git-send-email-akashast@codeaurora.org
Fixes: 220995622da5 ("kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200526142001.1.I523dc33f96589cb9956f5679976d402c8cda36fa@changeid
[daniel.thompson@linaro.org: Resolved merge issues by hand]
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|
When kgdboc is compiled as a module all of the "ekgdboc" and
"kgdb_earlycon" code isn't useful and, in fact, breaks compilation.
This is because early_param() isn't defined for modules and that's how
this code gets configured.
It turns out that this was broken by commit eae3e19ca930 ("kgdboc:
Remove useless #ifdef CONFIG_KGDB_SERIAL_CONSOLE in kgdboc") and then
made worse by commit 220995622da5 ("kgdboc: Add kgdboc_earlycon to
support early kgdb using boot consoles"). I guess the #ifdef wasn't
so useless, even if it wasn't obvious why it was useful. When kgdboc
was compiled as a module only "CONFIG_KGDB_SERIAL_CONSOLE_MODULE" was
defined, not "CONFIG_KGDB_SERIAL_CONSOLE". That meant that the old
module.
Let's basically do the same thing that the old code (pre-removal of
the #ifdef) did but use "IS_BUILTIN(CONFIG_KGDB_SERIAL_CONSOLE)" to
make it more obvious what the point of the check is. We'll fix
kgdboc_earlycon in a similar way.
Fixes: 220995622da5 ("kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles")
Fixes: eae3e19ca930 ("kgdboc: Remove useless #ifdef CONFIG_KGDB_SERIAL_CONSOLE in kgdboc")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200519084345.1.I91670accc8a5ddabab227eb63bb4ad3e2e9d2b58@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
|
|
There exists some duplicated includes in tools/perf, remove them.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1591071304-19338-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adjust 'map->pgoff' also when moving a map's start address.
Example with v5.4.34 based kernel:
Before:
$ sudo tools/perf/perf record -a --kcore -e intel_pt//k sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.958 MB perf.data ]
$ sudo tools/perf/perf script --itrace=e >/dev/null
Warning:
961 instruction trace errors
After:
$ sudo tools/perf/perf script --itrace=e >/dev/null
$
Committer testing:
# uname -a
Linux seventh 5.6.10-100.fc30.x86_64 #1 SMP Mon May 4 15:36:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
#
Before:
# perf record -a --kcore -e intel_pt//k sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.923 MB perf.data ]
# perf script --itrace=e >/dev/null
Warning:
295 instruction trace errors
#
After:
# perf record -a --kcore -e intel_pt//k sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.919 MB perf.data ]
# perf script --itrace=e >/dev/null
#
Fixes: fb5a88d4131a ("perf tools: Preserve eBPF maps when loading kcore")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200602112505.1406-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up the changes in:
5cde265384ca ("perf/x86/rapl: Add AMD Fam17h RAPL support")
Addressing this tools/perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
With this one will be able to use these new AMD MSRs in filters, by
name, e.g.:
# perf trace -e msr:* --filter="msr==AMD_PKG_ENERGY_STATUS || msr==AMD_RAPL_POWER_UNIT"
Just like it is now possible with other MSRs:
[root@five ~]# uname -a
Linux five 5.5.17-200.fc31.x86_64 #1 SMP Mon Apr 13 15:29:42 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@five ~]# grep 'model name' -m1 /proc/cpuinfo
model name : AMD Ryzen 5 3600X 6-Core Processor
[root@five ~]#
[root@five ~]# perf trace -e msr:*/max-stack=16/ --filter="msr==AMD_PERF_CTL" --max-events=2
0.000 kworker/1:1-ev/2327824 msr:write_msr(msr: AMD_PERF_CTL, val: 2)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
[0xffffffffc01d71c3] ([acpi_cpufreq])
[0] ([unknown])
__cpufreq_driver_target ([kernel.kallsyms])
od_dbs_update ([kernel.kallsyms])
dbs_work_handler ([kernel.kallsyms])
process_one_work ([kernel.kallsyms])
worker_thread ([kernel.kallsyms])
kthread ([kernel.kallsyms])
ret_from_fork ([kernel.kallsyms])
8.597 kworker/2:2-ev/2338099 msr:write_msr(msr: AMD_PERF_CTL, val: 2)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
[0] ([unknown])
[0] ([unknown])
__cpufreq_driver_target ([kernel.kallsyms])
od_dbs_update ([kernel.kallsyms])
dbs_work_handler ([kernel.kallsyms])
process_one_work ([kernel.kallsyms])
worker_thread ([kernel.kallsyms])
kthread ([kernel.kallsyms])
ret_from_fork ([kernel.kallsyms])
[root@five ~]#
Longer explanation with what happens in the perf build process,
automatically after this is made in synch with the kernel sources:
$ make -C tools/perf O=/tmp/build/perf install-bin
<SNIP>
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
<SNIP>
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
$
$ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
--- tools/arch/x86/include/asm/msr-index.h 2020-06-02 10:46:36.217782288 -0300
+++ arch/x86/include/asm/msr-index.h 2020-05-28 10:41:23.313794627 -0300
@@ -301,6 +301,9 @@
#define MSR_PP1_ENERGY_STATUS 0x00000641
#define MSR_PP1_POLICY 0x00000642
+#define MSR_AMD_PKG_ENERGY_STATUS 0xc001029b
+#define MSR_AMD_RAPL_POWER_UNIT 0xc0010299
+
/* Config TDP MSRs */
#define MSR_CONFIG_TDP_NOMINAL 0x00000648
#define MSR_CONFIG_TDP_LEVEL_1 0x00000649
$ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
$
$ make -C tools/perf O=/tmp/build/perf install-bin
<SNIP>
CC /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
LD /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
LD /tmp/build/perf/trace/beauty/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
$ diff -u before after
--- before 2020-06-02 10:47:08.486334348 -0300
+++ after 2020-06-02 10:47:33.075008948 -0300
@@ -286,6 +286,8 @@
[0xc0010240 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTL",
[0xc0010241 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTR",
[0xc0010280 - x86_AMD_V_KVM_MSRs_offset] = "F15H_PTSC",
+ [0xc0010299 - x86_AMD_V_KVM_MSRs_offset] = "AMD_RAPL_POWER_UNIT",
+ [0xc001029b - x86_AMD_V_KVM_MSRs_offset] = "AMD_PKG_ENERGY_STATUS",
[0xc00102f0 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN_CTL",
[0xc00102f1 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN",
};
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Lee has kindly offered his help in sharing the patch review workload for
the PWM subsystem. If this works out well between Lee, Uwe and myself it
may be a good idea to maintain the subsystem as a group.
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
To not trigger the warnings provided by CONFIG_PWM_DEBUG
- use up-rounding in .get_state()
- don't divide by the result of a division
- don't use the rounded counter value for the period length to calculate
the counter value for the duty cycle
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
The way state->enabled is computed is rather convoluted and hard to
read - both branches of the if() actually do the exact same thing. So
remove the if(), and further simplify "<boolean condition> ? true :
false" to "<boolean condition>".
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
Even in failed case of pm_runtime_get_sync(), the usage_count is
incremented. In order to keep the usage_count with correct value call
appropriate pm_runtime_put().
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
into drm-intel-next-fixes
gvt-next-fixes-2020-05-28
- Fix one clang warning on debug only function (Nathan)
- Use ARRAY_SIZE for coccicheck warn (Aishwarya)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200528033559.GG23961@zhen-hp.sh.intel.com
|
|
Jin Yao reported the issue (and posted first versions of this change)
with groups being defined over events with different cpu mask.
This causes assert aborts in get_group_fd, like:
# perf stat -M "C2_Pkg_Residency" -a -- sleep 1
perf: util/evsel.c:1464: get_group_fd: Assertion `!(fd == -1)' failed.
Aborted
All the events in the group have to be defined over the same cpus so the
group_fd can be found for every leader/member pair.
Adding check to ensure this condition is met and removing the group
(with warning) if we detect mixed cpus, like:
$ sudo perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
WARNING: event cpu maps do not match, disabling group:
anon group { power/energy-cores/, cycles }
anon group { instructions, power/energy-cores/ }
Ian asked also for cpu maps details, it's displayed in verbose mode:
$ sudo perf stat -e '{cycles,power/energy-cores/}' -v
WARNING: group events cpu maps do not match, disabling group:
anon group { power/energy-cores/, cycles }
power/energy-cores/: 0
cycles: 0-7
anon group { instructions, power/energy-cores/ }
instructions: 0-7
power/energy-cores/: 0
Committer testing:
[root@seventh ~]# perf stat -e '{power/energy-cores/,cycles},{instructions,power/energy-cores/}'
WARNING: grouped events cpus do not match, disabling group:
anon group { power/energy-cores/, cycles }
anon group { instructions, power/energy-cores/ }
^C
Performance counter stats for 'system wide':
12.62 Joules power/energy-cores/
106,920,637 cycles
80,228,899 instructions # 0.75 insn per cycle
12.62 Joules power/energy-cores/
14.514476987 seconds time elapsed
[root@seventh ~]#
But if we put compatible events in each group it works:
[root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' -a sleep 2
Performance counter stats for 'system wide':
1.95 Joules power/energy-cores/
0.92 Joules power/energy-ram/
29,305,715 instructions # 1.03 insn per cycle
28,423,338 cycles
2.001438142 seconds time elapsed
[root@seventh ~]#
This needs improvement tho:
[root@seventh ~]# perf stat -e '{power/energy-cores/,power/energy-ram/},{instructions,cycles}' sleep 2
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (power/energy-cores/).
/bin/dmesg | grep -i perf may provide additional information.
[root@seventh ~]#
We need to emit a better message, one stating that the power/ events
can't be used for a specific workload, instead it is per-cpu or system
wide.
Fixes: 6a4bb04caacc8 ("perf tools: Enable grouping logic for parsed events")
Co-developed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200602101736.GE1112120@krava
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
fake_lmem_start does not need to be mutable via module param sysfs. It's
only used during driver probe.
Fixes: 1629224324b6 ("drm/i915/lmem: add the fake lmem region")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601215510.18379-2-jani.nikula@intel.com
(cherry picked from commit f322e851f20e534cf5305332a9ad5eefadb55d56)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The parameter only makes sense as a module parameter only.
Fixes: c43c5a8818d4 ("drm/i915/params: add i915 parameters to debugfs")
Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601215510.18379-1-jani.nikula@intel.com
(cherry picked from commit dbf4081ffb68c0d9b518a34c715a8d8681658411)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Allow batch buffers to read their own _local_ cumulative HW runtime of
their logical context.
Fixes: 0f2f39758341 ("drm/i915: Add gen9 BCS cmdparsing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601161942.30854-1-chris@chris-wilson.co.uk
(cherry picked from commit f9496520df11de00fbafc3cbd693b9570d600ab3)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
While the current locking/serialization of the global state
suffices for protecting the obj->state access and the actual
hardware reprogramming, we do have a problem with accessing
the old/new states during nonblocking commits.
The state computation and swap will be protected by the crtc
locks, but the commit_tails can finish out of order, thus also
causing the atomic states to be cleaned up out of order. This
would mean the commit that started first but finished last has
had its new state freed as the no-longer-needed old state by the
other commit.
To fix this let's just refcount the states. obj->state amounts
to one reference, and the intel_atomic_state holds extra references
to both its new and old global obj states.
Fixes: 0ef1905ecf2e ("drm/i915: Introduce better global state handling")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527200245.13184-1-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
(cherry picked from commit f8c86ffa2800adc80adc679c84c45e0c6b027374)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Added support for dynamic clock freq configuration in PWM kernel driver.
Earlier the PWM driver used to cache boot time clock rate by PWM clock
parent during probe. Hence dynamically changing PWM frequency was not
possible for all the possible ranges. With this change, dynamic
calculation is enabled and it is able to set the requested period from
sysfs knob provided the value is supported by clock source.
Changes mainly have 2 parts:
- Tegra186 and later chips [1]
- Tegra210 and prior chips [2]
For [1] - Changes implemented to set pwm period dynamically and also
checks added to allow only if requested period(ns) is below or
equals to higher range.
For [2] - Only checks if the requested period(ns) is below or equals to
higher range defined by max clock limit. The limitation in
Tegra210 or prior chips are due to the reason of having only
one PWM controller supporting multiple channels. But later
chips have multiple PWM controller instances each having
single channel support.
Signed-off-by: Sandipan Patra <spatra@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
The PWM hardware in the JZ4725B works the same as in the JZ4740, but has
only six channels available.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
The PWM in Ingenic SoCs starts in inactive state until the internal
timer reaches the duty value, then becomes active until the timer
reaches the period value. In theory, we should then use (period - duty)
as the real duty value, as a high duty value would otherwise result in
the PWM pin being inactive most of the time.
This is the reason why the duty value was inverted in the driver until
now, but it still had the problem that it would not start with the
active part.
To address this remaining issue, the common trick is to invert the
duty, and invert the polarity when the PWM is enabled.
Since the duty was already inverted, and we invert it again, we now
program the hardware for the requested duty, and simply invert the
polarity when the PWM is enabled.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
Calculating the hardware value for the duty from the hardware value of
the period resulted in a precision loss versus calculating it from the
clock rate directly.
(Also remove a cast that doesn't really need to be here)
Fixes: f6b8a5700057 ("pwm: Add Ingenic JZ4740 support")
Cc: <stable@vger.kernel.org>
Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
Depending on MACH_INGENIC prevent us from creating a generic kernel that
works on more than one MIPS board. Instead, we just depend on MIPS being
set.
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
Before commit cfc4c189bc70 ("pwm: Read initial hardware state at request
time"), a driver's get_state callback would get called once per PWM from
pwmchip_add().
pwm-lpss' runtime-pm code was relying on this, getting a runtime-pm ref for
PWMs which are enabled at probe time from within its get_state callback,
before enabling runtime-pm.
The change to calling get_state at request time causes a number of
problems:
1. PWMs enabled at probe time may get runtime suspended before they are
requested, causing e.g. a LCD backlight controlled by the PWM to turn off.
2. When the request happens when the PWM has been runtime suspended, the
ctrl register will read all 1 / 0xffffffff, causing get_state to store
bogus values in the pwm_state.
3. get_state was using an async pm_runtime_get() call, because it assumed
that runtime-pm has not been enabled yet. If shortly after the request an
apply call is made, then the pwm_lpss_is_updating() check may trigger
because the resume triggered by the pm_runtime_get() call is not complete
yet, so the ctrl register still reads all 1 / 0xffffffff.
This commit fixes these issues by moving the initial pm_runtime_get() call
for PWMs which are enabled at probe time to the pwm_lpss_probe() function;
and by making get_state take a runtime-pm ref before reading the ctrl reg.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1828927
Fixes: cfc4c189bc70 ("pwm: Read initial hardware state at request time")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
The vio and ibmebus buses are used for pseries specific
paravirtualised devices and currently they're initialised by the
generic initcall types. This is mostly fine, but it can result in some
nuisance errors in dmesg when booting on PowerNV on some OSes, e.g.
[ 2.984439] synth uevent: /devices/vio: failed to send uevent
[ 2.984442] vio vio: uevent: failed to send synthetic uevent
[ 17.968551] synth uevent: /devices/vio: failed to send uevent
[ 17.968554] vio vio: uevent: failed to send synthetic uevent
We don't see anything similar for the ibmebus because that depends on
!CONFIG_LITTLE_ENDIAN.
This patch squashes those by switching to using machine_*_initcall()
so the bus type is only registered when the kernel is running on a
pseries machine.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200421081539.7485-1-oohall@gmail.com
|
|
Allwinner A64 is capable of a direct clock output on PWM (see A64 User
Manual chapter 3.10). Add support for this in the sun4i PWM driver.
Signed-off-by: Peter Vasil <peter.vasil@gmail.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
To simplify future expansion.
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200505122745.53208-6-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|
|
The schib region can be used by userspace to get the subchannel-
information block (SCHIB) for the passthrough subchannel.
This can be useful to get information such as channel path
information via the SCHIB.PMCW fields.
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20200505122745.53208-5-farman@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
|