Age | Commit message (Collapse) | Author |
|
Fix a spelling mistake in the help text for PREEMPT_RT.
Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/157204450499.10518.4542293884417101528.stgit@srivatsa-ubuntu
|
|
Due to kptr_restrict, JITted BPF code is now displayed like this:
000000000b6ed1b2: ebdff0800024 stmg %r13,%r15,128(%r15)
000000004cde2ba0: 41d0f040 la %r13,64(%r15)
00000000fbad41b0: a7fbffa0 aghi %r15,-96
Leaking kernel addresses to dmesg is not a concern in this case, because
this happens only when JIT debugging is explicitly activated, which only
root can do.
Use %px in this particular instance, and also to print an instruction
address in show_code and PCREL (e.g. brasl) arguments in print_insn.
While at present functionally equivalent to %016lx, %px is recommended
by Documentation/core-api/printk-formats.rst for such cases.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
When starting the CPU Measurement sampling facility using
qsi() function, this function may return an error value.
This error value is referenced in the else part of the
if statement to dump its value in a debug statement.
Right now this value is always zero because it has not been
assigned a value.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Replace hard coded function names in debug statements
by the "%s ...", __func__ construct suggested by checkpatch.pl
script.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Use consistant debug print format of the form variable
blank value. Also add leading 0x for all hex values.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/core
Pull EFI changes for v5.5 from Ard Biesheuvel:
- Change my email address to @kernel.org so I am no longer at the mercy of
useless corporate email infrastructure
- Wire up the EFI RNG code for x86. This enables an additional source of
entropy during early boot.
- Enable the TPM event log code on ARM platforms.
|
|
The debounce time passed to gpiod_set_debounce() is specified in
microseconds, so make sure to use the correct unit when computing the
register values, which denote delays in milliseconds.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Cc: <stable@vger.kernel.org>
Fixes: 18bc64b3aebf ("gpio: Initial support for ROHM bd70528 GPIO block")
[Bartosz: fixed a typo in commit message]
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
|
|
When converting milliseconds to microseconds in commit fffa6af94894
("gpio: max77620: Use correct unit for debounce times") some ~1 ms gaps
were introduced between the various ranges supported by the controller.
Fix this by changing the start of each range to the value immediately
following the end of the previous range. This way a debounce time of,
say 8250 us will translate into 16 ms instead of returning an -EINVAL
error.
Typically the debounce delay is only ever set through device tree and
specified in milliseconds, so we can never really hit this issue because
debounce times are always a multiple of 1000 us.
The only notable exception for this is drivers/mmc/host/mmc-spi.c where
the CD GPIO is requested, which passes a 1 us debounce time. According
to a comment preceeding that code this should actually be 1 ms (i.e.
1000 us).
Reported-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Pavel Machek <pavel@denx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
|
|
Explicitly exempt ZONE_DEVICE pages from kvm_is_reserved_pfn() and
instead manually handle ZONE_DEVICE on a case-by-case basis. For things
like page refcounts, KVM needs to treat ZONE_DEVICE pages like normal
pages, e.g. put pages grabbed via gup(). But for flows such as setting
A/D bits or shifting refcounts for transparent huge pages, KVM needs to
to avoid processing ZONE_DEVICE pages as the flows in question lack the
underlying machinery for proper handling of ZONE_DEVICE pages.
This fixes a hang reported by Adam Borowski[*] in dev_pagemap_cleanup()
when running a KVM guest backed with /dev/dax memory, as KVM straight up
doesn't put any references to ZONE_DEVICE pages acquired by gup().
Note, Dan Williams proposed an alternative solution of doing put_page()
on ZONE_DEVICE pages immediately after gup() in order to simplify the
auditing needed to ensure is_zone_device_page() is called if and only if
the backing device is pinned (via gup()). But that approach would break
kvm_vcpu_{un}map() as KVM requires the page to be pinned from map() 'til
unmap() when accessing guest memory, unlike KVM's secondary MMU, which
coordinates with mmu_notifier invalidations to avoid creating stale
page references, i.e. doesn't rely on pages being pinned.
[*] http://lkml.kernel.org/r/20190919115547.GA17963@angband.pl
Reported-by: Adam Borowski <kilobyte@angband.pl>
Analyzed-by: David Hildenbrand <david@redhat.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: stable@vger.kernel.org
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Streamline the PID.PIR check and change its call sites to use
the newly added helper.
Suggested-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU
linked list of all vCPUs that are blocked on this pCPU. Afterwards, it
changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler
(wakeup_handler()) is responsible to kick (unblock) any vCPU on that
linked list that now has pending posted interrupts.
While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which
will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be
scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear
PID.SN but will also *overwrite PID.NDST to this different pCPU*.
Instead of keeping it with original pCPU which vCPU had entered block
phase on.
This results in an issue because when a posted interrupt is delivered, as
the wakeup_handler() will be executed and fail to find blocked vCPU on
its per pCPU linked list of all vCPUs that are blocked on this pCPU.
Which is due to the vCPU being placed on a *different* per pCPU
linked list i.e. the original pCPU in which it entered block phase.
The regression is introduced by commit c112b5f50232 ("KVM: x86:
Recompute PID.ON when clearing PID.SN"). Therefore, partially revert
it and reintroduce the condition in vmx_vcpu_pi_load() responsible for
avoiding changing PID.NDST when loading a blocked vCPU.
Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
Tested-by: Nathan Ni <nathan.ni@oracle.com>
Co-developed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Commit 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
introduced vmx_dy_apicv_has_pending_interrupt() in order to determine
if a vCPU have a pending posted interrupt. This routine is used by
kvm_vcpu_on_spin() when searching for a a new runnable vCPU to schedule
on pCPU instead of a vCPU doing busy loop.
vmx_dy_apicv_has_pending_interrupt() determines if a
vCPU has a pending posted interrupt solely based on PID.ON. However,
when a vCPU is preempted, vmx_vcpu_pi_put() sets PID.SN which cause
raised posted interrupts to only set bit in PID.PIR without setting
PID.ON (and without sending notification vector), as depicted in VT-d
manual section 5.2.3 "Interrupt-Posting Hardware Operation".
Therefore, checking PID.ON is insufficient to determine if a vCPU has
pending posted interrupts and instead we should also check if there is
some bit set on PID.PIR if PID.SN=1.
Fixes: 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Co-developed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The Outstanding Notification (ON) bit is part of the Posted Interrupt
Descriptor (PID) as opposed to the Posted Interrupts Register (PIR).
The latter is a bitmap for pending vectors.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The three MSR lists(msrs_to_save[], emulated_msrs[] and
msr_based_features[]) are global arrays of kvm.ko, which are
adjusted (copy supported MSRs forward to override the unsupported MSRs)
when insmod kvm-{intel,amd}.ko, but it doesn't reset these three arrays
to their initial value when rmmod kvm-{intel,amd}.ko. Thus, at the next
installation, kvm-{intel,amd}.ko will do operations on the modified
arrays with some MSRs lost and some MSRs duplicated.
So define three constant arrays to hold the initial MSR lists and
initialize msrs_to_save[], emulated_msrs[] and msr_based_features[]
based on the constant arrays.
Cc: stable@vger.kernel.org
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
[Remove now useless conditionals. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Use %u instead of %d to print u32 values to expand the value range,
especially when latency or bandwidth value is bigger than INT_MAX.
Then HMAT latency can support up to 4.29s and bandwidth can support
up to 4PB/s.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jingqi Liu <Jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Commit cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved"
memory as an "hmem" device") introduced a linker warning,
WARNING: vmlinux.o(.text+0x64ec3c): Section mismatch in reference from
the function hmat_register_target() to the function
.init.text:hmat_register_target_devices()
The function hmat_register_target() references the function __init
hmat_register_target_devices().
Since hmat_register_target() is also called from hmat_callback(), and
then register_hotmemory_notifier(), where it should not be freed when
hmat_init() is done, it indicates that the __init annotation of
hmat_register_target_devices() is incorrect.
Fixes: cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device")
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
I missed two instances of the old RODATA macro (seems I was searching
for vmlinux.lds* not vmlinux*lds*). Fix both instances and double-check
the entire tree for other "RODATA" instances in linker scripts.
Fixes: c82318254d15 ("vmlinux.lds.h: Replace RODATA with RO_DATA")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-m68k@lists.linux-m68k.org
Cc: Sam Creasey <sammy@sammy.net>
Link: https://lkml.kernel.org/r/201911110920.5840E9AF1@keescook
|
|
An ESP packet could be decrypted in async mode if the input handler for
this packet returns -EINPROGRESS in xfrm_input(). At this moment the device
reference in skb is held. Later xfrm_input() will be invoked again to
resume the processing.
If the transform state is still valid it would continue to release the
device reference and there won't be a problem; however if the transform
state is not valid when async resumption happens, the packet will be
dropped while the device reference is still being held.
When the device is deleted for some reason and the reference to this
device is not properly released, the kernel will keep logging like:
unregister_netdevice: waiting for ppp2 to become free. Usage count = 1
The issue is observed when running IPsec traffic over a PPPoE device based
on a bridge interface. By terminating the PPPoE connection on the server
end for multiple times, the PPPoE device on the client side will eventually
get stuck on the above warning message.
This patch will check the async mode first and continue to release device
reference in async resumption, before it is dropped due to invalid state.
v2: Do not assign address family from outer_mode in the transform if the
state is invalid
v3: Release device reference in the error path instead of jumping to resume
Fixes: 4ce3dbe397d7b ("xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)")
Signed-off-by: Xiaodong Xu <stid.smth@gmail.com>
Reported-by: Bo Chen <chenborfc@163.com>
Tested-by: Bo Chen <chenborfc@163.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Currently we make sequence == 0 be the same as sequence == 1, but that's
not super useful if the intent is really to have a timeout that's just
a pure timeout.
If the user passes in sqe->off == 0, then don't apply any sequence logic
to the request, let it purely be driven by the timeout specified.
Reported-by: 李通洲 <carter.li@eoitek.com>
Reviewed-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A cast to 'time_t' was accidentally left in place during the
conversion of __do_adjtimex() to 64-bit timestamps, so the
resulting value is incorrectly truncated.
Remove the cast so the 64-bit time gets propagated correctly.
Fixes: ead25417f82e ("timex: use __kernel_timex internally")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20191108203435.112759-2-arnd@arndb.de
|
|
Jose Abreu says:
====================
net: stmmac: Improvements for -next
Misc improvements for stmmac.
Patch 1/6, fixes a sparse warning that was introduced in recent commit in
-next.
Patch 2/6, adds the Split Header support which is also available in XGMAC
cores and now in GMAC4+ with this patch.
Patch 3/6, adds the C45 support for MDIO transactions when using XGMAC cores.
Patch 4/6, removes the speed dependency on CBS callbacks so that it can be used
in XGMAC cores.
Patch 5/6, reworks the over-engineered stmmac_rx() function so that its easier
to read.
Patch 6/6, implements the UDP Segmentation Offload feature in GMAC4+ cores.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement the UDP Segmentation Offload feature in stmmac. This is only
available in GMAC4+ cores.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This looks over-engineered. Let's use some helpers to get the buffer
length and hereby simplify the stmmac_rx() function. No performance drop
was seen with the new implementation.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
XGMAC3 supports full CBS features with speeds that can go up to 10G so
we can now remove the maximum speed check of CBS.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the support for C45 PHYs in the MDIO callbacks for XGMAC. This was
tested using Synopsys DesignWare XPCS.
v2:
- Pull out the readl_poll_timeout() calls into common code (Andrew)
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
GMAC4+ cores also support the Split Header feature.
Add the support for Split Header feature in the RX path following the
same implementation logic that XGMAC followed.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The VID is converted to le16 so the variable must be __le16 type.
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: c7ab0b8088d7 ("net: stmmac: Fallback to VLAN Perfect filtering if HASH is not available")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variable hdr_len is being assigned a value that is never read.
The assignment is redundant and hence can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variable err is not uninitialized and hence can potentially contain
any garbage value. This may cause an error when logical or'ing the
return values from the calls to functions crypto_aead_setauthsize or
crypto_aead_setkey. Fix this by setting err to the return of
crypto_aead_setauthsize rather than or'ing in the return into the
uninitialized variable
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The sysfs paths changed, updating to the current ones.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Recent changes in the dpaa_eth driver reduced the number of
buffer pools per interface from three to one.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix coccinelle warning:
./drivers/net/phy/mdio_bus.c:67:5-12: ERROR: PTR_ERR applied after initialization to constant on line 62
./drivers/net/phy/mdio_bus.c:68:5-12: ERROR: PTR_ERR applied after initialization to constant on line 62
Fix this by using IS_ERR before PTR_ERR
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 71dd6c0dff51 ("net: phy: add support for reset-controller")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Side effect of some kbuild changes resulted in breaking the
documented way to build samples/bpf/.
This patch change the samples/bpf/Makefile to work again, when
invoking make from the subdir samples/bpf/. Also update the
documentation in README.rst, to reflect the new way to build.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I2C communication errors (-EREMOTEIO) during the IRQ handler of nxp-nci
result in a NULL pointer dereference at the moment:
BUG: kernel NULL pointer dereference, address: 0000000000000000
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 355 Comm: irq/137-nxp-nci Not tainted 5.4.0-rc6 #1
RIP: 0010:skb_queue_tail+0x25/0x50
Call Trace:
nci_recv_frame+0x36/0x90 [nci]
nxp_nci_i2c_irq_thread_fn+0xd1/0x285 [nxp_nci_i2c]
? preempt_count_add+0x68/0xa0
? irq_forced_thread_fn+0x80/0x80
irq_thread_fn+0x20/0x60
irq_thread+0xee/0x180
? wake_threads_waitq+0x30/0x30
kthread+0xfb/0x130
? irq_thread_check_affinity+0xd0/0xd0
? kthread_park+0x90/0x90
ret_from_fork+0x1f/0x40
Afterward the kernel must be rebooted to work properly again.
This happens because it attempts to call nci_recv_frame() with skb == NULL.
However, unlike nxp_nci_fw_recv_frame(), nci_recv_frame() does not have any
NULL checks for skb, causing the NULL pointer dereference.
Change the code to call only nxp_nci_fw_recv_frame() in case of an error.
Make sure to log it so it is obvious that a communication error occurred.
The error above then becomes:
nxp-nci_i2c i2c-NXP1001:00: NFC: Read failed with error -121
nci: __nci_request: wait_for_completion_interruptible_timeout failed 0
nxp-nci_i2c i2c-NXP1001:00: NFC: Read failed with error -121
Fixes: 6be88670fc59 ("NFC: nxp-nci_i2c: Add I2C support to NXP NCI driver")
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Call devlink enable only during probe time and avoid deadlock
during reload.
Reported-by: Shalom Toledo <shalomt@mellanox.com>
Fixes: a0c76345e3d3 ("devlink: disallow reload operation during device cleanup")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Call devlink enable only during probe time and avoid deadlock
during reload.
Reported-by: Shalom Toledo <shalomt@mellanox.com>
Fixes: 5a508a254bed ("devlink: disallow reload operation during device cleanup")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for chip version RTL8117. Settings have been copied from
Realtek's r8168 driver, there however chip ID 54a belongs to a chip
version called RTL8168FP. It was confirmed that RTL8117 works with
Realtek's driver, so both chip versions seem to be the same or at
least compatible.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Due to unneeded multiplication in the out_free_pages portion of
r10buf_pool_alloc(), when using a 3-copy raid10 layout, it is
possible to access a resync_pages offset that has not been
initialized. This access translates into a crash of the system
within resync_free_pages() while passing a bad pointer to
put_page(). Remove the multiplication, preventing access to the
uninitialized area.
Fixes: f0250618361db ("md: raid10: don't use bio's vec table to manage resync pages")
Cc: stable@vger.kernel.org # 4.12+
Signed-off-by: John Pittman <jpittman@redhat.com>
Suggested-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
|
|
we need to gurantee 'desc_nr' valid before access array
of sb->dev_roles.
In addition, we should avoid .load_super always return '0'
when level is LEVEL_MULTIPATH, which is not expected.
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1487373 ("Memory - illegal accesses")
Fixes: 6a5cb53aaa4e ("md: no longer compare spare disk superblock events in super_load")
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
|
|
As all I/O is being pushed through a kernel thread the softlockup
watchdog might be triggered under high load.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Song Liu <songliubraving@fb.com>
|
|
This fixes two different classes of bugs in the Intel graphics hardware:
MMIO register read hang:
"On Intels Gen8 and Gen9 Graphics hardware, a read of specific graphics
MMIO registers when the product is in certain low power states causes
a system hang.
There are two potential triggers for DoS:
a) H/W corruption of the RC6 save/restore vector
b) Hard hang within the MIPI hardware
This prevents the DoS in two areas of the hardware:
1) Detect corruption of RC6 address on exit from low-power state,
and if we find it corrupted, disable RC6 and RPM
2) Permanently lower the MIPI MMIO timeout"
Blitter command streamer unrestricted memory accesses:
"On Intels Gen9 Graphics hardware the Blitter Command Streamer (BCS)
allows writing to Memory Mapped Input Output (MMIO) that should be
blocked. With modifications of page tables, this can lead to privilege
escalation. This exposure is limited to the Guest Physical Address
space and does not allow for access outside of the graphics virtual
machine.
This series establishes a software parser into the Blitter command
stream to scan for, and prevent, reads or writes to MMIO's that should
not be accessible to non-privileged contexts.
Much of the command parser infrastructure has existed for some time,
and is used on Ivybridge/Haswell/Valleyview derived products to allow
the use of features normally blocked by hardware. In this legacy
context, the command parser is employed to allow normally unprivileged
submissions to be run with elevated privileges in order to grant
access to a limited set of extra capabilities. In this mode the parser
is optional; In the event that the parser finds any construct that it
cannot properly validate (e.g. nested command buffers), it simply
aborts the scan and submits the buffer in non-privileged mode.
For Gen9 Graphics, this series makes the parser mandatory for all
Blitter submissions. The incoming user buffer is first copied to a
kernel owned buffer, and parsed. If all checks are successful the
kernel owned buffer is mapped READ-ONLY and submitted on behalf of the
user. If any checks fail, or the parser is unable to complete the scan
(nested buffers), it is forcibly rejected. The successfully scanned
buffer is executed with NORMAL user privileges (key difference from
legacy usage).
Modern usermode does not use the Blitter on later hardware, having
switched over to using the 3D engine instead for performance reasons.
There are however some legacy usermode apps that rely on Blitter,
notably the SNA X-Server. There are no known usermode applications
that require nested command buffers on the Blitter, so the forcible
rejection of such buffers in this patch series is considered an
acceptable limitation"
* Intel graphics fixes in emailed bundle from Jon Bloomfield <jon.bloomfield@intel.com>:
drm/i915/cmdparser: Fix jump whitelist clearing
drm/i915/gen8+: Add RC6 CTX corruption WA
drm/i915: Lower RM timeout to avoid DSI hard hangs
drm/i915/cmdparser: Ignore Length operands during command matching
drm/i915/cmdparser: Add support for backward jumps
drm/i915/cmdparser: Use explicit goto for error paths
drm/i915: Add gen9 BCS cmdparsing
drm/i915: Allow parsing of unsized batches
drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
drm/i915: Add support for mandatory cmdparsing
drm/i915: Remove Master tables from cmdparser
drm/i915: Disable Secure Batches for gen6+
drm/i915: Rename gen7 cmdparser tables
|
|
Russell King says:
====================
sfp: Allow slow to initialise GPON modules to work
Some GPON modules take longer than the SFF MSA specified time to
initialise and respond to transactions on the I2C bus for either
both 0x50 and 0x51, or 0x51 bus addresses. Technically these modules
are non-compliant with the SFP Multi-Source Agreement, they have
been around for some time, so are difficult to just ignore.
Most of the patch series is restructuring the code to make it more
readable, and split various things into separate functions.
We split the three state machines into three separate functions, and
re-arrange them to start probing the module as soon as a module has
been detected (without waiting for the network device.) We try to
read the module's EEPROM, retrying quickly for the first second, and
then once every five seconds for about a minute until we have read
the EEPROM. So that the kernel isn't entirely silent, we print a
message indicating that we're waiting for the module to respond after
the first second, or when all retries have expired.
Once the module ID has been read, we kick off a delayed work queue
which attempts to register the hwmon, retrying for up to a minute if
the monitoring parameters are unreadable; this allows us to proceed
with module initialisation independently of the hwmon state.
With high-power modules, we wait for the netdev to be attached before
switching the module power mode, and retry this in a similar way to
before until we have successfully read and written the EEPROM at 0x51.
We also move the handling of the TX_DISABLE signal entirely to the main
state machine, and avoid probing any on-board PHY while TX_FAULT is
set.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a module is inserted, we attempt to read read the ID from address
0x50. Once we are able to read the ID, we immediately attempt to
initialise the hwmon support by reading from address 0x51. If this
fails, then we fall into error state, and assume that the module is
not usable.
Modules such as the ALCATELLUCENT 3FE46541AA use a real EEPROM for
I2C address 0x50, which responds immediately. However, address 0x51
is an emulated, which only becomes available once the on-board firmware
has booted. This prompts us to fall into the error state.
Since the module may be usable without diagnostics, arrange for the
hwmon probe independent of the rest of the SFP itself, retrying every
5s for up to about 60s for the monitoring to become available, and
print an error message if it doesn't become available.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some GPON modules (e.g. Huawei MA5671A) take a significant amount of
time to start responding on the I2C bus, contary to the SFF
specifications.
Work around this by implementing a two-level timeout strategy, where
we initially quickly retry for the module, and then use a slower retry
after we exceed a maximum number of quick attempts.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move the module insertion reporting out of the probe handling, but
after we have detected that the upstream has attached (since that is
whom we are reporting insertion to.)
Only report module removal if we had previously reported a module
insertion.
This gives cleaner semantics, and means we can probe the module before
we have an upstream attached.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Switch the power mode switching from the probe, so that we don't
repeatedly re-probe the SFP device if there is a problem accessing
the registers at I2C address 0x51.
In splitting this out, we can also fix a bug where we leave the module
in high-power mode when the upstream device is detached but the module
is still inserted.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Track the upstream's attachment state in the state machine rather than
maintaining a boolean, which ensures that we have a strict order of
ATTACH followed by an UP event - we can never believe that a newly
attached upstream will be anything but down.
Rearrange the order of state machines so we run the module state
machine after the upstream device's state machine, so the module state
machine can check the current state of the device and take action to
e.g. reset back to empty state when the upstream is detached.
This is to allow the module detection to run independently of the
network device becoming available.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
TX_FAULT should be deasserted to indicate that the module has completed
its initialisation. This may include the on-board PHY, so wait until
the module has deasserted TX_FAULT before probing the PHY.
This means that we need an extra state to handle a TX_FAULT that
remains set for longer than t_init, since using the existing handling
state would bypass the PHY probe.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the next state to sfp_sm_fault() so that it can branch to other
states. This will be necessary to improve the initialisation path.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rather than using mdelay() to wait before probing the PHY (which holds
several locks, including the rtnl lock), add an extra wait state to
the state machine to introduce the 50ms delay without holding any
locks.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|