Age | Commit message (Collapse) | Author |
|
The imx SC api strongly assumes that messages are composed out of
4-bytes words but some of our message structs have odd sizeofs.
This produces many oopses with CONFIG_KASAN=y.
Fix by marking with __aligned(4).
Fixes: a3094fc1a15e ("rtc: imx-sc: add rtc alarm support")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Link: https://lore.kernel.org/r/13404bac8360852d86c61fad5ae5f0c91ffc4cb6.1582216144.git.leonard.crestez@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
Report interrupt state to the RTC core.
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Link: https://lore.kernel.org/r/20200327084457.45161-1-biwen.li@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
Merge vm fixes from Andrew Morton:
"5 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/sparse: fix kernel crash with pfn_section_valid check
mm: fork: fix kernel_stack memcg stats for various stack implementations
hugetlb_cgroup: fix illegal access to memory
drivers/base/memory.c: indicate all memory blocks as removable
mm/swapfile.c: move inode_lock out of claim_swapfile
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single fix for the Hyper-V clocksource driver to make sched clock
actually return nanoseconds and not the virtual clock value which
increments at 10e7 HZ (100ns)"
* tag 'timers-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/hyper-v: Make sched clock return nanoseconds correctly
|
|
We see multiple issues with the implementation/interface to compute
whether a memory block can be offlined (exposed via
/sys/devices/system/memory/memoryX/removable) and would like to simplify
it (remove the implementation).
1. It runs basically lockless. While this might be good for performance,
we see possible races with memory offlining that will require at
least some sort of locking to fix.
2. Nowadays, more false positives are possible. No arch-specific checks
are performed that validate if memory offlining will not be denied
right away (and such check will require locking). For example, arm64
won't allow to offline any memory block that was added during boot -
which will imply a very high error rate. Other archs have other
constraints.
3. The interface is inherently racy. E.g., if a memory block is detected
to be removable (and was not a false positive at that time), there is
still no guarantee that offlining will actually succeed. So any
caller already has to deal with false positives.
4. It is unclear which performance benefit this interface actually
provides. The introducing commit 5c755e9fd813 ("memory-hotplug: add
sysfs removable attribute for hotplug memory remove") mentioned
"A user-level agent must be able to identify which sections
of memory are likely to be removable before attempting the
potentially expensive operation."
However, no actual performance comparison was included.
Known users:
- lsmem: Will group memory blocks based on the "removable" property. [1]
- chmem: Indirect user. It has a RANGE mode where one can specify
removable ranges identified via lsmem to be offlined. However,
it also has a "SIZE" mode, which allows a sysadmin to skip the
manual "identify removable blocks" step. [2]
- powerpc-utils: Uses the "removable" attribute to skip some memory
blocks right away when trying to find some to offline+remove.
However, with ballooning enabled, it already skips this
information completely (because it once resulted in many false
negatives). Therefore, the implementation can deal with false
positives properly already. [3]
According to Nathan Fontenot, DLPAR on powerpc is nowadays no longer
driven from userspace via the drmgr command (powerpc-utils). Nowadays
it's managed in the kernel - including onlining/offlining of memory
blocks - triggered by drmgr writing to /sys/kernel/dlpar. So the
affected legacy userspace handling is only active on old kernels. Only
very old versions of drmgr on a new kernel (unlikely) might execute
slower - totally acceptable.
With CONFIG_MEMORY_HOTREMOVE, always indicating "removable" should not
break any user space tool. We implement a very bad heuristic now.
Without CONFIG_MEMORY_HOTREMOVE we cannot offline anything, so report
"not removable" as before.
Original discussion can be found in [4] ("[PATCH RFC v1] mm:
is_mem_section_removable() overhaul").
Other users of is_mem_section_removable() will be removed next, so that
we can remove is_mem_section_removable() completely.
[1] http://man7.org/linux/man-pages/man1/lsmem.1.html
[2] http://man7.org/linux/man-pages/man8/chmem.8.html
[3] https://github.com/ibm-power-utilities/powerpc-utils
[4] https://lkml.kernel.org/r/20200117105759.27905-1-david@redhat.com
Also, this patch probably fixes a crash reported by Steve.
http://lkml.kernel.org/r/CAPcyv4jpdaNvJ67SkjyUJLBnBnXXQv686BiVW042g03FUmWLXw@mail.gmail.com
Reported-by: "Scargall, Steve" <steve.scargall@intel.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Nathan Fontenot <ndfont@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Karel Zak <kzak@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200128093542.6908-1-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I was checking the field to see if it needed the full get_random_bytes()
and discovered it's unused.
Only compile-tested, as I don't have the hardware, but I'm still pretty
confident.
Link: https://lore.kernel.org/r/202003281643.02SGh6eG002694@sdf.org
Signed-off-by: George Spelvin <lkml@sdf.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
There is a potential execution path in which variable *ret* is returned
without being properly initialized, previously.
Fix this by initializing variable *ret* to 0.
Link: https://lore.kernel.org/r/20200328023539.GA32016@embeddedor
Addresses-Coverity-ID: 1491917 ("Uninitialized scalar variable")
Fixes: 2f49de21f3e9 ("RDMA/hns: Optimize mhop get flow for multi-hop addressing")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The hip08 supports up to 1M QPs, so the qpn mask of cqe should be
modified.
Link: https://lore.kernel.org/r/1585194018-4381-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Just reduce the default number to 64 for backward compatibility, the
driver can still get this configuration from the firmware.
Link: https://lore.kernel.org/r/1585194018-4381-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The original value means sending 16 packets at a time, and it should be
configured to 0 which means sending 1 packet instead. It is modified to
reduce the number of PFC frames to make sure the performance meets
expectations when flow control is enabled on hip08.
Link: https://lore.kernel.org/r/1585194018-4381-2-git-send-email-liweihang@huawei.com
Signed-off-by: Jihua Tao <taojihua4@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Split the big recipe into 3 stages: compile, link, and hexdump.
After this commit, the build log with CONFIG_WANXL_BUILD_FIRMWARE
will look like this:
M68KAS drivers/net/wan/wanxlfw.o
M68KLD drivers/net/wan/wanxlfw.bin
BLDFW drivers/net/wan/wanxlfw.inc
CC [M] drivers/net/wan/wanxl.o
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
The firmware source, wanxlfw.S, is currently compiled by the combo of
$(CPP) and $(M68KAS). This is not what we usually do for compiling *.S
files. In fact, this Makefile is the only user of $(AS) in the kernel
build.
Instead of combining $(CPP) and (AS) from different tool sets, using
$(M68KCC) as an assembler driver is simpler, and saner.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
As far as I understood from the Kconfig help text, this build rule is
used to rebuild the driver firmware, which runs on an old m68k-based
chip. So, you need m68k tools for the firmware rebuild.
wanxl.c is a PCI driver, but CONFIG_M68K does not select CONFIG_HAVE_PCI.
So, you cannot enable CONFIG_WANXL_BUILD_FIRMWARE for ARCH=m68k. In other
words, ifeq ($(ARCH),m68k) is false here.
I am keeping the dead code for now, but rebuilding the firmware requires
'as68k' and 'ld68k', which I do not have in hand.
Instead, the kernel.org m68k GCC [1] successfully built it.
Allowing a user to pass in CROSS_COMPILE_M68K= is handier.
[1] https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/9.2.0/x86_64-gcc-9.2.0-nolibc-m68k-linux.tar.xz
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Commit:
ec93fc371f014a6f ("efi/libstub: Add support for loading the initrd from a device path")
added a diagnostic print to the ARM version of the EFI stub that
reports whether an initrd has been loaded that was passed
via the command line using initrd=.
However, it failed to take into account that, for historical reasons,
the file loading routines return EFI_SUCCESS when no file was found,
and the only way to decide whether a file was loaded is to inspect
the 'size' argument that is passed by reference. So let's inspect
this returned size, to prevent the print from being emitted even if
no initrd was loaded at all.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-efi@vger.kernel.org
|
|
Commit:
9f9223778ef3 ("efi/libstub/arm: Make efi_entry() an ordinary PE/COFF entrypoint")
did some code refactoring to get rid of the EFI entry point assembler
code, and in the process, it got rid of the assignment of image_addr
to the value of _text. Instead, it switched to using the image_base
field of the efi_loaded_image struct provided by UEFI, which should
contain the same value.
However, Michael reports that this is not the case: older GRUB builds
corrupt this value in some way, and since we can easily switch back to
referring to _text to discover this value, let's simply do that.
While at it, fix another issue in commit 9f9223778ef3, which may result
in the unassigned image_addr to be misidentified as the preferred load
offset of the kernel, which is unlikely but will cause a boot crash if
it does occur.
Finally, let's add a warning if the _text vs. image_base discrepancy is
detected, so we can tell more easily how widespread this issue actually
is.
Reported-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-efi@vger.kernel.org
|
|
Move away from the deprecated API.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://lore.kernel.org/linux-i3c/20200326211002.13241-2-wsa+renesas@sang-engineering.com
|
|
Pull networking fixes from David Miller:
1) Fix memory leak in vti6, from Torsten Hilbrich.
2) Fix double free in xfrm_policy_timer, from YueHaibing.
3) NL80211_ATTR_CHANNEL_WIDTH attribute is put with wrong type, from
Johannes Berg.
4) Wrong allocation failure check in qlcnic driver, from Xu Wang.
5) Get ks8851-ml IO operations right, for real this time, from Marek
Vasut.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (22 commits)
r8169: fix PHY driver check on platforms w/o module softdeps
net: ks8851-ml: Fix IO operations, again
mlxsw: spectrum_mr: Fix list iteration in error path
qlcnic: Fix bad kzalloc null test
mac80211: set IEEE80211_TX_CTRL_PORT_CTRL_PROTO for nl80211 TX
mac80211: mark station unauthorized before key removal
mac80211: Check port authorization in the ieee80211_tx_dequeue() case
cfg80211: Do not warn on same channel at the end of CSA
mac80211: drop data frames without key on encrypted links
ieee80211: fix HE SPR size calculation
nl80211: fix NL80211_ATTR_CHANNEL_WIDTH attribute type
xfrm: policy: Fix doulbe free in xfrm_policy_timer
bpf: Explicitly memset some bpf info structures declared on the stack
bpf: Explicitly memset the bpf_attr structure
bpf: Sanitize the bpf_struct_ops tcp-cc name
vti6: Fix memory leak of skb if input policy check fails
esp: remove the skb from the chain when it's enqueued in cryptd_wq
ipv6: xfrm6_tunnel.c: Use built-in RCU list checking
xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire
xfrm: fix uctx len check in verify_sec_ctx_len
...
|
|
Many Zhaoxin Root Ports and Switch Downstream Ports do provide ACS-like
capability but have no ACS Capability Structure. Peer-to-Peer transactions
could be blocked between these ports, so add quirk so devices behind them
could be assigned to different IOMMU group.
Link: https://lore.kernel.org/r/20200327091148.5190-4-RaymondPang-oc@zhaoxin.com
Signed-off-by: Raymond Pang <RaymondPang-oc@zhaoxin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Some Zhaoxin endpoints are implemented as multi-function devices without an
ACS capability, but they actually don't support peer-to-peer transactions.
Add ACS quirks to declare DMA isolation.
Link: https://lore.kernel.org/r/20200327091148.5190-3-RaymondPang-oc@zhaoxin.com
Signed-off-by: Raymond Pang <RaymondPang-oc@zhaoxin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Report the maximum amount of sample the EC can hold.
This is not tunable, but can be useful for application to find out the
maximum amount of time it can sleep when hwfifo_timeout is set to a
large number.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
Expose EC minimal interrupt period through buffer/hwfifo_timeout:
- Maximal timeout is limited to 65s.
- When timeout for all sensors is set to 0, EC will not send events,
even if the sensor sampling rate is greater than 0.
Rename frequency to sampling_frequency to match IIO ABI.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
Since cros_ec_sensorhub is shutting down the FIFO when the device
suspends, no need to slow down the EC sampling period rate.
It was necesseary to do that before command CMD_FIFO_INT_ENABLE was
introduced, but now all supported chromebooks have it.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
When EC supports FIFO, each IIO device registers a callback, to put
samples in the buffer when they arrives from the FIFO.
When no FIFO, the user space app needs to call trigger_new, or better
register a high precision timer.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
Some IIO devices may want to override the default (realtime) to another
clock source by default.
It can beneficial when timestamps coming from the hardware or underlying
drivers are already in that format.
It can always be overridden by attribute current_timestamp_clock.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
To prevent comment rot, move function description to
cros_ec_sensors_core.c.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
Events are timestamped in EC time space, their timestamps need to be
converted in host time space.
The assumption is the time delta between when the interrupt is sent
by the EC and when it is receive by the host is a [small] constant.
This is not always true, even with hard-wired interrupt. To mitigate
worst offenders, add a median filter to weed out bigger than expected
delays.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
EC FIFO can send sensor events in batch. Spread them based on
previous (TSa) and currnet timestamp (TSb)
EC FIFO iio events
+-----------+
| TSa |
+-----------+ +---------------------------------------+
| event 1 | | event 1 | TSb - (TSb - TSa)/n * (n-1) |
+-----------+ +---------------------------------------+
| event 2 | | event 2 | TSb - (TSb - TSa)/n * (n-2) |
+-----------+ +---------------------------------------+
| ... | ------> | .... | |
+-----------+ +---------------------------------------+
| event n-1 | | event 2 | TSb - (TSb - TSa)/n |
+-----------+ +---------------------------------------+
| event n | | event 2 | TSb |
+-----------+ +---------------------------------------+
| TSb |
+-----------+
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
cros_ec_sensorhub registers a listener and query motion sense FIFO,
spread to iio sensors registers.
To test, we can use libiio:
iiod&
iio_readdev -u ip:localhost -T 10000 -s 25 -b 16 cros-ec-gyro | od -x
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
To better manage resources, store the number of sensors reported by
the EC.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
|
|
Kconfig section is misplaced. Put it in the same order as it is done
in Makefile for this driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
We obviously are users of bits.h and types.h. Add them to the list.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
For better readability reformat GUID assignment.
While here, add the comment how this GUID looks in a string representation.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
Driver depends to ACPI, this marco always is evaluated to the parameter,
thus useless. Drop it for good.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
For better namespace maintenance prefix POLL_INTERVAL macro with SURFACE_3.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
Refactor mshw0011_adp_psr() to be one liner.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
We have device and we may use it to print messages.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
As reported by kbuild bot the struct mshw0011_lookup in never used.
Drop its definition for good.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Three more driver bugfixes, and two doc improvements fixing build
warnings while we are here"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: pca-platform: Use platform_irq_get_optional
i2c: st: fix missing struct parameter description
i2c: nvidia-gpu: Handle timeout correctly in gpu_i2c_check_status()
i2c: fix a doc warning
i2c: hix5hd2: add missed clk_disable_unprepare in remove
|
|
When the UEFI/BIOS or bootloader has not initialised a PCIe device we would
get the following message:
kern.warning: pci 0000:00:01.0: ASPM: current common clock configuration is broken, reconfiguring
"warning" and "broken" are slightly misleading. On an embedded system it is
quite possible for the bootloader to avoid configuring PCIe devices if they
are not needed.
Downgrade the message to pci_info() and change "broken" to "inconsistent"
since we fix up the inconsistency in the code immediately following the
message (and emit an error if that fails).
Link: https://lore.kernel.org/r/20200323035530.11569-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
The AER interfaces to clear error status registers were a confusing mess:
- pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors
from the Uncorrectable Error Status register.
- pci_aer_clear_fatal_status() cleared fatal errors from the
Uncorrectable Error Status register.
- pci_cleanup_aer_error_status_regs() cleared the Root Error Status
register (for Root Ports), the Uncorrectable Error Status register,
and the Correctable Error Status register.
Rename them to make them consistent:
From To
---------------------------------------- -------------------------------
pci_cleanup_aer_uncorrect_error_status() pci_aer_clear_nonfatal_status()
pci_aer_clear_fatal_status() pci_aer_clear_fatal_status()
pci_cleanup_aer_error_status_regs() pci_aer_clear_status()
Since pci_cleanup_aer_error_status_regs() (renamed to
pci_aer_clear_status()) is only used within drivers/pci/, move the
declaration from <linux/aer.h> to drivers/pci/pci.h.
[bhelgaas: commit log, add renames]
Link: https://lore.kernel.org/r/d1310a75dc3d28f7e8da4e99c45fbd3e60fe238e.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Error Disconnect Recover (EDR) is a feature that allows ACPI firmware to
notify OSPM that a device has been disconnected due to an error condition
(ACPI v6.3, sec 5.6.6). OSPM advertises its support for EDR on PCI devices
via _OSC (see [1], sec 4.5.1, table 4-4). The OSPM EDR notify handler
should invalidate software state associated with disconnected devices and
may attempt to recover them. OSPM communicates the status of recovery to
the firmware via _OST (sec 6.3.5.2).
For PCIe, firmware may use Downstream Port Containment (DPC) to support
EDR. Per [1], sec 4.5.1, table 4-6, even if firmware has retained control
of DPC, OSPM may read/write DPC control and status registers during the EDR
notification processing window, i.e., from the time it receives an EDR
notification until it clears the DPC Trigger Status.
Note that per [1], sec 4.5.1 and 4.5.2.4,
1. If the OS supports EDR, it should advertise that to firmware by
setting OSC_PCI_EDR_SUPPORT in _OSC Support.
2. If the OS sets OSC_PCI_EXPRESS_DPC_CONTROL in _OSC Control to request
control of the DPC capability, it must also set OSC_PCI_EDR_SUPPORT in
_OSC Support.
Add an EDR notify handler to attempt recovery.
[1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888
[bhelgaas: squash add/enable patches into one]
Link: https://lore.kernel.org/r/90f91fe6d25c13f9d2255d2ce97ca15be307e1bb.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
|
|
If firmware controls DPC, it is generally responsible for managing the DPC
capability and events, and the OS should not access the DPC capability.
However, if firmware controls DPC and both the OS and the platform support
Error Disconnect Recover (EDR) notifications, the OS EDR notify handler is
responsible for recovery, and the notify handler may read/write the DPC
capability until it clears the DPC Trigger Status bit. See [1], sec 4.5.1,
table 4-6.
Expose some DPC error handling functions so they can be used by the EDR
notify handler.
[1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888
Link: https://lore.kernel.org/r/e9000bb15b3a4293e81d98bb29ead7c84a6393c9.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Per the SFI _OSC and DPC Updates ECN [1] implementation note flowchart, the
OS seems to be expected to clear AER status even if it doesn't have
ownership of the AER capability. Unlike the DPC capability, where a DPC
ECN [2] specifies a window when the OS is allowed to access DPC registers
even if it doesn't have ownership, there is no clear model for AER.
Add pci_aer_raw_clear_status() to clear the AER error status registers
unconditionally. This is intended for use only by the EDR path (see [2]).
[1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
2020, affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/14076
[2] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888
[bhelgaas: changelog]
Link: https://lore.kernel.org/r/c19ad28f3633cce67448609e89a75635da0da07d.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Since Error Disconnect Recover needs to use DPC error handling routines
even if the OS doesn't have control of DPC, move the initalization and
caching of DPC capabilities from the DPC driver to pci_init_capabilities().
Link: https://lore.kernel.org/r/5888380657c8b9551675b5dbd48e370e4fd2703d.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
As per the DPC Enhancements ECN [1], sec 4.5.1, table 4-4, if the OS
supports Error Disconnect Recover (EDR), it must invalidate the software
state associated with child devices of the port without attempting to
access the child device hardware. In addition, if the OS supports DPC, it
must attempt to recover the child devices if the port implements the DPC
Capability. If the OS continues operation, the OS must inform the firmware
of the status of the recovery operation via the _OST method.
Return the result of pcie_do_recovery() so we can report it to firmware via
_OST.
[1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
affecting PCI Firmware Specification, Rev. 3.2
https://members.pcisig.com/wg/PCI-SIG/document/12888
Link: https://lore.kernel.org/r/eb60ec89448769349c6722954ffbf2de163155b5.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Previously we passed the PCIe service type parameter to pcie_do_recovery(),
where reset_link() looked up the underlying pci_port_service_driver and its
.reset_link() function pointer. Instead of using this roundabout way, we
can just pass the driver-specific .reset_link() callback function when
calling pcie_do_recovery() function.
This allows us to call pcie_do_recovery() from code that is not a PCIe port
service driver, e.g., Error Disconnect Recover (EDR) support.
Remove pcie_port_find_service() and pcie_port_service_driver.reset_link
since they are now unused.
Link: https://lore.kernel.org/r/60e02b87b526cdf2930400059d98704bf0a147d1.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
We only need 25 bits of data for DPC, so I don't think it's worth the
complexity of allocating and keeping track of the struct dpc_dev separately
from the pci_dev. Move that data into the struct pci_dev.
Link: https://lore.kernel.org/r/98323eaa18080adbe5bb30846862f09f8722d4b3.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses
reset_link() to recover from fatal errors. But during fatal error
recovery, if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER then even after successful recovery (using
reset_link()) pcie_do_recovery() will report the recovery result as
failure. Update the status of error after reset_link().
You can reproduce this issue by triggering a SW DPC using "DPC Software
Trigger" bit in "DPC Control Register". You should see recovery failed
dmesg log as below:
pcieport 0000:00:16.0: DPC: containment event, status:0x1f27 source:0x0000
pcieport 0000:00:16.0: DPC: software trigger detected
pci 0000:04:00.0: AER: can't recover (no error_detected callback)
pcieport 0000:00:16.0: AER: device recovery failed
Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery")
Link: https://lore.kernel.org/r/a255fcb3a3fdebcd90f84e08b555f1786eb8eba2.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
[bhelgaas: split pci_channel_io_frozen simplification to separate patch]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
|
|
pcie_do_recovery() had two "if (state == pci_channel_io_frozen)" cases
right after each other. Combine them to make this easier to read. No
functional change intended.
Link: https://lore.kernel.org/r/20200317170654.GA23125@infradead.org
[bhelgaas: split from https://lore.kernel.org/r/a255fcb3a3fdebcd90f84e08b555f1786eb8eba2.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two small fixes: one in drivers (qla2xxx), and one in the core (sd) to
try to cope with USB enclosures that silently change reported
parameters"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Fix optimal I/O size for devices that change reported values
scsi: qla2xxx: Fix I/Os being passed down when FC device is being deleted
|