Age | Commit message (Collapse) | Author |
|
smq seems to be performing better than the old mq policy in all
situations, as well as using a quarter of the memory.
Make 'mq' an alias for 'smq' when choosing a cache policy. The tunables
that were present for the old mq are faked, and have no effect. mq
should be considered deprecated now.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
md->queue and q are the same thing in dm_old_init_request_queue() and
dm_mq_init_request_queue().
Also drop the temporary 'struct request_queue *q' in
dm_old_init_request_queue().
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Saves 16 bytes by eliminating 4 4byte holes but more importantly:
numerous members that crossed cachelines were fixed.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Change the map pointer in 'struct mapped_device' from 'struct dm_table
__rcu *' to 'void __rcu *' to avoid the need for the dummy definition.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Allows user to control which NUMA node the memory for DM device
structures (e.g. mapped_device, request_queue, gendisk, blk_mq_tag_set)
is allocated from.
Defaults to NUMA_NO_NODE (-1). Allowable range is from -1 until the
last online NUMA node id.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
fail_path() will print a "Failing path ..." message but reinstate_path()
doesn't print a "Reinstating path ...". Add that message to
reinstate_path() to add symmetry and aid system debugging.
Remove reinstate_path()'s check for the path_selector providing
.reinstate_path hook. All path selectors provide this and any future
ones must too.
activate_path() calls pg_init_done() with SCSI_DH_DEV_OFFLINED but
pg_init_done() doesn't expicitly handle it in its swicth statement. Add
SCSI_DH_DEV_OFFLINED to the default case.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Amir Vadai says:
====================
cls_flower hardware offload support
Please see changes from V2 at the bottom.
This patchset introduces cls_flower hardware offload support over ConnectX-4
driver, more hardware vendors are welcome to use it too.
This patchset is based on John's infrastructure for tc offloading [2] to add
hardware offload support to the flower filter. It also extends the support to
an additional tc action - skbedit mark operation.
NIC driver that was used is ConnectX-4. Feature is off by default and could be
turned on using ethtool.
Some commands to use this code:
export TC=../iproute2/tc/tc
export ETH=ens9
ethtool -K ens9 hw-tc-offload on
$TC qdisc add dev $ETH ingress
$TC filter add dev $ETH protocol ip prio 20 parent ffff: \
flower ip_proto 1 \
dst_mac 7c:fe:90:69:81:62 \
src_mac 7c:fe:90:69:81:56 \
dst_ip 11.11.11.11 \
src_ip 11.11.11.12 \
indev $ETH \
action drop
$TC filter add dev $ETH protocol ip prio 30 parent ffff: \
flower ip_proto 6 \
indev $ETH \
action skbedit mark 0x1234
$TC filter add dev $ETH protocol ip prio 10 parent ffff: \
handle 0x1234 fw action pass
The code was tested and applied on top of commit 3ebeac1 ("Merge branch
'cxgb4-next'")
Changes from V2:
- patch 1/10 ("net/flower: Introduce hardware offload support")
- Remove unused variable [Dave]
- Don't fail command when HW can't offload filter [John]
- patch 3/10 ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
- Mention in changelog that struct tc_action is now exposed out of the ifdef.
- patch 4/10 ("net/act_skbedit: Utility functions for mark action")
- Document clearly that is_tcf_skbedit_mark() is returning true if and only
if the only action is mark [Dave]
- patch 8/10 ("net/mlx5e: Introduce tc offload support")
- make mlx5e_tc_add_flow() static
Changes from V1:
- patch 3/10 ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef")
- fixed return value of tc_no_actions
Changes from V0:
- Use tc_no_actions and tc_for_each_action instead of ifdef CONFIG_NET_CLS_ACT
- Replace ENOTSUPP (and some EINVAL) with EOPNOTSUPP
- Name the flower command enum
- fl_hw_destroy_filter() to return void - nobody uses the return value
- mlx5e_tc_init() and mlx5e_tc_cleanup() to be called from the right places.
- When adding HW rule fails - fail the command
- Rules are added to be processed both by HW and SW unless SKIP_HW is given
- Adding patch 6/10 ("net/mlx5e: Relax ndo_setup_tc handle restriction")
Main changes from the RFC [1]:
- API
- Using ndo_setup_tc() instead of switchdev
- act_skbedit, act_gact
- Actions are not serialized to NIC driver, instead using access functions.
- cls_flower
- prevent double classification by software by not adding
successfuly offloaded filters to the hashtable
- Fixed some bugs in original RFC with rule delete
- mlx5
- Adding flow table to kernel namespace instead of a new namespace
- s/offload/tc/ in many places
- no need for a special kconfig since switchdev is not used
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce offloading of skbedit mark action.
For example, to mark with 0x1234, all TCP (ip_proto 6) packets arriving
to interface ens9:
# tc qdisc add dev ens9 ingress
# tc filter add dev ens9 protocol ip parent ffff: \
flower ip_proto 6 \
indev ens9 \
action skbedit mark 0x1234
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Parse tc_cls_flower_offload into device specific commands and program
the hardware to classify and act accordingly.
For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac,
src_ip, src_ip, arriving to interface ens9:
# tc qdisc add dev ens9 ingress
# tc filter add dev ens9 protocol ip parent ffff: \
flower ip_proto 1 \
dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \
dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \
action drop
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
later patches to offload tc flower filter.
Feature is off by default and could be enabled by issuing:
# ethtool -K eth0 hw-tc-offload on
Offloads flow table is dynamically created when first filter is
added.
Rules are saved in a hash table that is maintained by the consumer (for
example - the flower offload in the next patch).
When last filter is removed and no filters exist in the hash table, the
offload flow table is destroyed.
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move the vlan and main flow tables to use priority 1. This will allow
the upcoming TC offload logic to use a higher priority (0) for the
offload steering table.
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Restricting handle to TC_H_ROOT breaks the old instantiation of mqprio
to setup a hardware qdisc. This patch relaxes the test, to only check the
type.
Fixes: 08fb1da ("net/mlx5e: Support DCBNL IEEE ETS")
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We need to handle flow table entry destinations only if the action
associated with the rule is forwarding (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST).
Fixes: 26a8145390b3 ('net/mlx5_core: Introduce flow steering firmware commands')
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enable device drivers to query the action, if and only if is a mark
action and what value to use for marking.
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce the macros tc_no_actions and tc_for_each_action to make code
clearer.
Extracted struct tc_action out of the ifdef to make calls to
is_tcf_gact_shot() and similar functions valid, even when it is a nop.
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb_flow_dissector_target() public
Will be used in a following patch to query if a key is being used, and
what it's value in the target object.
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is based on a patch made by John Fastabend.
It adds support for offloading cls_flower.
when NETIF_F_HW_TC is on:
flags = 0 => Rule will be processed twice - by hardware, and if
still relevant, by software.
flags = SKIP_HW => Rull will be processed by software only
If hardware fail/not capabale to apply the rule, operation will NOT
fail. Filter will be processed by SW only.
Acked-by: Jiri Pirko <jiri@mellanox.com>
Suggested-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
John Crispin says:
====================
net-next: mediatek: add ethernet driver
This series adds support for the Mediatek ethernet core found on current ARM
based SoCs. The driver works on MT2701 and MT7623 SoCs
Instead of trying to upstream everything at once I decided to concentrate on
the important parts required to make current generation silicon work. The V3
series only includes the code required to make dual MAC setups work and only
supports the newer QDMA engine.
Changes in V5
* reduce the mdio timeut to HZ
* add a call to usleep_range() which schedules in the background.
Changes in V4
* remove ugly _FE macro, use offsetof() instead
Changes in V3
* only include code for MT2701/7623 support
* drop support for PDMA and older MIPS based SoCs
* drop switch support
Changes in V2
* change the namespace of the functions from fe_* to mtk_*
* add support for the latest generation of ARM SoCs
* add dual MAC support
* remove the swconfig specific bits
* remove most of the magic values and replace them with defines
* add verbose descriptions to the patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add myself and Felix as the Maintainers for the MediaTek ethernet driver.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds the Makefile and Kconfig required to make the driver build.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add ethernet support for MediaTek SoCs from the MT7623 family. These have
dual GMAC. Depending on the exact version, there might be a built-in
Gigabit switch (MT7530). The core does not have the typical DMA ring setup.
Instead there is a linked list that we add descriptors to. There is only
one linked list that both MACs use together. There is a special field
inside the TX descriptors called the VQID. This allows us to assign packets
to different internal queues. By using a separate id for each MAC we are
able to get deterministic results for BQL. Additionally we need to
provide the core with a block of scratch memory that is the same size as
the RX ring and data buffer. This is really needed to make the HW datapath
work. Although the driver does not support this yet, we still need to
assign the memory and tell the core about it for RX to work.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Michael Lee <igvtee@gmail.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds the binding documentation for the MediaTek Ethernet
controller.
Signed-off-by: John Crispin <blogic@openwrt.org>
Acked-by: Rob Herring <robh@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The initial commit badly merged into the dsa_resume method instead
of the dsa_remove_dst method.
As consequence, the dst->master_netdev->dsa_ptr is not set to NULL on
removal and re-bind of the dsa device fails with error -17.
Fixes: b0dc635d923c ("net: dsa: cleanup resources upon module removal ")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'commit 55482edc25f0606851de42e73618f813f310d009
("qede: Add slowpath/fastpath support and enable hardware GRO")'
introduces below error when compiling net-next with "make ARCH=x86_64"
drivers/built-in.o: In function `qede_rx_int':
qede_main.c:(.text+0x6101a0): undefined reference to `tcp_gro_complete'
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rajesh Borundia says:
====================
qlcnic fixes
This series adds following fixes.
o While processing mailbox if driver gets a spurious mailbox
interrupt it leads into premature completion of a next
mailbox request. Added a guard against this by checking current
state of mailbox and ignored spurious interrupt.
Added a stats counter to record this condition.
v2:
o Added patch that removes usage of atomic_t as we are not implemeting
atomicity by using atomic_t value.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
o While the driver is in the middle of a MB completion processing
and it receives a spurious MB interrupt, it is mistaken as a good MB
completion interrupt leading to premature completion of the next MB
request. Fix the driver to guard against this by checking the current
state of MB processing and ignore the spurious interrupt.
Also added a stats counter to record this condition.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
o atomic_t usage is incorrect as we are not implementing
any atomicity.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Hariprasad Shenai says:
====================
cxgb4vf: Interrupt and queue configuration changes
This series fixes some issues and some changes in the queue and interrupt
configuration for cxgb4vf driver. We need to enable interrupts before we
register our network device, so that we don't loose link up interrupts.
Allocate rx queues based on interrupt type. Set number of tx/rx queues in
probe function only. Also adds check for some invalid configurations.
This patch series has been created against net-next tree and includes
patches on cxgb4vf driver.
We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Queue Set Configuration code was always reserving room for a
Forwarded interrupt Queue even in the cases where we weren't using it.
Figure out how many Ports and Queue Sets we can support. This depends on
knowing our Virtual Function Resources and may be called a second time
if we fall back from MSI-X to MSI Interrupt Mode. This change fixes that
problem.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This avoids a race condition where a system that has network devices set up
to be automatically configured and we get the first Port Link Status
message from the firmware on the Asynchronous Firmware Event Queue before
we've enabled interrupts. If that happens, we end up losing the interrupt
and never realizing that the links has actually come up.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is no need to change the 802.1Q port mode for the same value.
Thus avoid such message:
[ 401.954836] dsa dsa@0 lan0: 802.1Q Mode: Disabled (was Disabled)
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The port register 0x07 contains more options than just the default VID,
even though they are not used yet. So prefer a read then write operation
over a direct write.
This also allows to keep track of the change through dynamic debug.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Apply a few non-functional changes on the port state setter:
* add a dynamic debug message with state names to track changes
* explicit states checking instead of assuming their numeric values
* lock mutex only once when changing several port states
* use bitmap macros to declare and access port_state_update_mask
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Sergei Shtylyov says:
====================
sh_eth: fix couple of bugs in sh_eth_ring_format()
Here's a set of 2 patches against DaveM's 'net.git' repo fixing two bugs
in sh_eth_.ring_format()...
[1/2] sh_eth: fix NULL pointer dereference in sh_eth_ring_format()
[2/2] sh_eth: advance 'rxdesc' later in sh_eth_ring_format()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Iff dma_map_single() fails, 'rxdesc' should point to the last filled RX
descriptor, so that it can be marked as the last one, however the driver
would have already advanced it by that time. In order to fix that, only
fill an RX descriptor once all the data for it is ready.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In a low memory situation, if netdev_alloc_skb() fails on a first RX ring
loop iteration in sh_eth_ring_format(), 'rxdesc' is still NULL. Avoid
kernel oops by adding the 'rxdesc' check after the loop.
Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Attach the new VMD domain's resources to the VMD device's resources. This
allows /proc/iomem to display a more complete picture.
Before:
c0000000-c1ffffff : 0000:5d:05.5
c2000000-c3ffffff : 0000:5d:05.5
c2010000-c2013fff : nvme
c4000000-c40fffff : 0000:5d:05.5
After:
c0000000-c1ffffff : 0000:5d:05.5
c2000000-c3ffffff : 0000:5d:05.5
c2000000-c3ffffff : VMD MEMBAR1
c2000000-c22fffff : PCI Bus 10000:01
c2000000-c200ffff : 10000:01:00.0
c2010000-c2013fff : 10000:01:00.0
c2010000-c2013fff : nvme
c2300000-c24fffff : PCI Bus 10000:01
c4000000-c40fffff : 0000:5d:05.5
c4002000-c40fffff : VMD MEMBAR2
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
|
|
The bus always starts at 0. Due to alignment and down-casting, this
happened to work before, but looked alarmingly incorrect in kernel logs.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Comment the less obvious portion of the code for setting up memory windows,
and the platform dependency for initializing the h/w with appropriate
resources.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Some devices take longer than the spec indicates to return from FLR reset,
a notable case of this is Intel integrated graphics (IGD), which can often
take an additional 300ms powering down an attached LCD panel as part of the
FLR. Allow devices up to 1000ms, testing every 100ms whether the second
dword of config space is read as -1.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Add PCI support to ARC and update drivers/pci Makefile enabling the ARC
arch to use the generic PCI setup functions.
[bhelgaas: fold in Joao's pci-dma-compat.h & pci-bridge.h build fix (I
should have caught this myself, sorry]
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
|
|
Add device HID AMDI0010 to match the AMD ACPI Vendor ID (AMDI) that
was registered in http://www.uefi.org/acpi_id_list, and the I2C
controller on future AMD paltform will use the HID instead of AMD0010.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In device_remove_property_set(), the secondary fwnode needs
to be cleared before the pset is freed. This fixes a
use-after-free when a property set is providing the primary
fwnode.
As a result of the fix, the primary fwnode may end up
containing ERR_PTR(-ENODEV), so also adding checks for it to
the property handling code.
Reported-by: John Youn <John.Youn@synopsys.com>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
support
It is reported that the following commit triggers regressions:
Linux commit: efaed9be998b5ae0afb7458e057e5f4402b43fa0
ACPICA commit: 31178590dde82368fdb0f6b0e466b6c0add96c57
Subject: ACPICA: Events: Enhance acpi_ev_execute_reg_method() to
ensure no _REG evaluations can happen during OS early boot
stages
This is because that the ECDT support is not corrected in Linux, and Linux
requires to execute _REG for ECDT (though this sounds so wrong), we need to
ensure acpi_gbl_namespace_initialized is set before ECDT probing in order
for _REG to be executed. Since we have to move
"acpi_gbl_namespace_initialized = TRUE" to the initialization step
happening before ECDT probing, acpi_load_tables() is the best candidate for
now. Thus this patch fixes the regression by doing so.
But if the ECDT support is fixed, Linux will not execute _REG for ECDT, and
ECDT probing will happen before acpi_load_tables(). At that time, we still
want to ensure acpi_gbl_namespace_initialized is set after executing
acpi_ns_initialize_objects() (under the condition of
acpi_gbl_group_module_level_code = FALSE), this patch also moves
acpi_ns_initialize_objects() to acpi_load_tables() accordingly.
Since acpi_ns_initialize_objects() doesn't seem to be skippable, this
patch also removes ACPI_NO_OBJECT_INIT for the one invoked in
acpi_load_tables(). And since the default region handlers should always be
installed before loading the tables, this patch also removes useless
acpi_gbl_group_module_level_code check accordingly. Reported by Chris
Bainbridge, Fixed by Lv Zheng.
Fixes: efaed9be998b (ACPICA: Events: Enhance acpi_ev_execute_reg_method() to ensure no _REG evaluations can happen during OS early boot stages)
Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
On some devices, reading or writing VPD causes a system panic.
This can be easily reproduced by running "lspci -vvv" or
"cat /sys/bus/devices/XX../vpd".
Blacklist these devices so we don't access VPD data at all.
[bhelgaas: changelog, comment, drop pci/access.c changes]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=110681
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
|
|
Use usleep_range() instead of udelay() while waiting for a VPD access to
complete. This is not a performance path, so no need to hog the CPU.
Rationale for usleep_range() parameters:
We clear PCI_VPD_ADDR_F for a read (or set it for a write), then wait for
the device to change it. For a device that updates PCI_VPD_ADDR between
our config write and subsequent config read, we won't sleep at all and
can get the device's maximum rate.
Sleeping a minimum of 10 usec per 4-byte access limits throughput to
about 400Kbytes/second. VPD is small (32K bytes at most), and most
devices use only a fraction of that.
We back off exponentially up to 1024 usec per iteration. If we reach
1024, we've already waited up to 1008 usec (16 + 32 + ... + 512), so if
we miss an update and wait an extra 1024 usec, we can still get about
1/2 of the device's maximum rate.
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
|
|
Commit 931ef163309e moved the smpboot thread park/unpark invocation to the
state machine. The move of the unpark invocation was premature as it depends
on work in progress patches.
As a result cpu down can fail, because rcu synchronization in takedown_cpu()
eventually requires a functional softirq thread. I never encountered the
problem in testing, but 0day testing managed to provide a reliable reproducer.
Remove the smpboot_threads_park() call from the state machine for now and put
it back into the original place after the rcu synchronization.
I'm embarrassed as I knew about the dependency and still managed to get it
wrong. Hotplug induced brain melt seems to be the only sensible explanation
for that.
Fixes: 931ef163309e "cpu/hotplug: Unpark smpboot threads from the state machine"
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
|