Age | Commit message (Collapse) | Author |
|
Some log messages lacks the final newline. So add them.
Fixes: 93d2725affd6 ("clk: bcm: rpi: Discover the firmware clocks")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Link: https://lore.kernel.org/r/20220713154953.3336-3-stefan.wahren@i2se.com
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ivan T. Ivanov <iivanov@suse.de>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
The while loop in raspberrypi_discover_clocks() relies on the assumption
that the id of the last clock element is zero. Because this data comes
from the Videocore firmware and it doesn't guarantuee such a behavior
this could lead to out-of-bounds access. So fix this by providing
a sentinel element.
Fixes: 93d2725affd6 ("clk: bcm: rpi: Discover the firmware clocks")
Link: https://github.com/raspberrypi/firmware/issues/1688
Suggested-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Link: https://lore.kernel.org/r/20220713154953.3336-2-stefan.wahren@i2se.com
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ivan T. Ivanov <iivanov@suse.de>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
LRO/GRO_HW should be disabled if there is an attached XDP program.
BNXT_FLAG_TPA is the current setting of the LRO/GRO_HW. Using
BNXT_FLAG_TPA to disable LRO/GRO_HW will cause these features to be
permanently disabled once they are disabled.
Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are 2 issues:
1. We should decrement hw_resc->max_nqs instead of hw_resc->max_irqs
with the number of NQs assigned to the VFs. The IRQs are fixed
on each function and cannot be re-assigned. Only the NQs are being
assigned to the VFs.
2. vf_msix is the total number of NQs to be assigned to the VFs. So
we should decrement vf_msix from hw_resc->max_nqs.
Fixes: b16b68918674 ("bnxt_en: Add SR-IOV support for 57500 chips.")
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add missing devlink_set_features() API for callbacks reload_down
and reload_up to function.
Fixes: 228ea8c187d8 ("bnxt_en: implement devlink dev reload driver_reinit")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Using BNXT_PAGE_MODE_BUF_SIZE + offset as buffer length value is not
sufficient when running single buffer XDP programs doing redirect
operations. The stack will complain on missing skb tail room. Fix it
by using PAGE_SIZE when calling xdp_init_buff() for single buffer
programs.
Fixes: b231c3f3414c ("bnxt: refactor bnxt_rx_xdp to separate xdp_init_buff/xdp_prepare_buff")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Address learning should initially be turned off by the driver for port
operation in standalone mode, then the DSA core handles changes to it
via ds->ops->port_bridge_flags().
Leaving address learning enabled while ports are standalone breaks any
kind of communication which involves port B receiving what port A has
sent. Notably it breaks the ksz9477 driver used with a (non offloaded,
ports act as if standalone) bonding interface in active-backup mode,
when the ports are connected together through external switches, for
redundancy purposes.
This fixes a major design flaw in the ksz9477 and ksz8795 drivers, which
unconditionally leave address learning enabled even while ports operate
as standalone.
Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477")
Link: https://lore.kernel.org/netdev/CAFZh4h-JVWt80CrQWkFji7tZJahMfOToUJQgKS5s0_=9zzpvYQ@mail.gmail.com/
Reported-by: Brian Hutchinson <b.hutchman@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220818164809.3198039-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"Some interesting background to the current patchset:
It turned out that the fldw instruction (which loads a 32-bit word
from memory into one half of a FP register) failed on unaligned
addresses and even trashed some other random FP register instead. It's
a trivial one-liner fix in the exception handler but this failure
dates back to the very beginnings of the parisc-port. It's strange
that it was never noticed before.
Another patch fixes an annoyance noticed by Randy Dunlap. Running
"make ARCH=parisc64 randconfig" always returned a 32-bit config,
although one would expect a 64-bit config. Masahiro Yamada suggested
to mimik sparc Kconfig code, which fixed the issue nicely. This
allowed to drop some compiler build checks too.
Third, it's possible to build an optimized 32-bit kernel for PA8X00
(64-bit) CPUs, which then wouldn't start on 32-bit-only (PA1.x)
machines. I've added a bootup check which prevents that and which
prints a message to the console. This can be tested with qemu, which
currently only supports 32-bit emulation.
The other patches are usual clean-up stuff like added return value
checks and typo fixes in comments.
Summary:
- Fix emulation of fldw instruction on unaligned addresses
- Fix "make ARCH=parisc64 randconfig" to return a 64-bit config
- Prevent boot if trying to boot a 32-bit kernel compiled for PA8X00
CPUs on 32-bit only machines
- ccio-dma: Handle kmalloc failure in ccio_init_resources()"
* tag 'parisc-for-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines
parisc: ccio-dma: Handle kmalloc failure in ccio_init_resources()
parisc: led: Move from strlcpy with unused retval to strscpy
parisc: ccio-dma: Fix typo in comment
Revert "parisc: Show error if wrong 32/64-bit compiler is being used"
parisc: Make CONFIG_64BIT available for ARCH=parisc64 only
parisc: Fix exception handler for fldw and fstw instructions
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Thirteen fixes, almost all for MM.
Seven of these are cc:stable and the remainder fix up the changes
which went into this -rc cycle"
* tag 'mm-hotfixes-stable-2022-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
kprobes: don't call disarm_kprobe() for disabled kprobes
mm/shmem: shmem_replace_page() remember NR_SHMEM
mm/shmem: tmpfs fallocate use file_modified()
mm/shmem: fix chattr fsflags support in tmpfs
mm/hugetlb: support write-faults in shared mappings
mm/hugetlb: fix hugetlb not supporting softdirty tracking
mm/uffd: reset write protection when unregister with wp-mode
mm/smaps: don't access young/dirty bit if pte unpresent
mm: add DEVICE_ZONE to FOR_ALL_ZONES
kernel/sys_ni: add compat entry for fadvise64_64
mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW
Revert "zram: remove double compression logic"
get_maintainer: add Alan to .get_maintainer.ignore
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit fixes from Shuah Khan:
"Fix for a mmc test and to load .kunit_test_suites section when
CONFIG_KUNIT=m, and not just when KUnit is built-in"
* tag 'linux-kselftest-kunit-fixes-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
module: kunit: Load .kunit_test_suites section when CONFIG_KUNIT=m
mmc: sdhci-of-aspeed: test: Fix dependencies when KUNIT=m
|
|
There is no need to check if the cpufreq driver implements callback
cpufreq_driver::target_index. The logic in the __resolve_freq uses
the frequency table available in the policy. It doesn't matter if the
driver provides 'target_index' or 'target' callback. It just has to
populate the 'policy->freq_table'.
Thus, check only frequency table during the frequency resolving call.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In some case, the GDDV returns a package with a buffer which has
zero length. It causes that kmemdup() returns ZERO_SIZE_PTR (0x10).
Then the data_vault_read() got NULL point dereference problem when
accessing the 0x10 value in data_vault.
[ 71.024560] BUG: kernel NULL pointer dereference, address:
0000000000000010
This patch uses ZERO_OR_NULL_PTR() for checking ZERO_SIZE_PTR or
NULL value in data_vault.
Signed-off-by: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add compatible string for GT1158 missing from the previous patch.
Fixes: 425fe4709c76 ("Input: goodix - add support for GT1158")
Signed-off-by: Jarrah Gosbell <kernel@undef.tools>
Link: https://lore.kernel.org/r/20220813043821.9981-1-kernel@undef.tools
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
The freq Qos request would be removed repeatedly if the cpufreq policy
relates to more than one CPU. Then, it would cause the "called for unknown
object" warning.
Remove the freq Qos request for each CPU relates to the cpufreq policy,
instead of removing repeatedly for the last CPU of it.
Fixes: a1bb46c36ce3 ("ACPI: processor: Add QoS requests for all CPUs")
Reported-by: Jeremy Linton <Jeremy.Linton@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Riwen Lu <luriwen@kylinos.cn>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
It is a bit unlcear to us why that's helping, but it does and unbreaks
suspend/resume on a lot of GPUs without any known drawbacks.
Cc: stable@vger.kernel.org # v5.15+
Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220819200928.401416-1-kherbst@redhat.com
|
|
Looks like adding clock gate flag patch forgot to remove the old code that
gets reset control.
This causes below crash on platforms that do not need reset.
[ 15.653501] reset_control_reset+0x124/0x170
[ 15.653508] qcom_swrm_init+0x50/0x1a0
[ 15.653514] qcom_swrm_probe+0x320/0x668
[ 15.653519] platform_probe+0x68/0xe0
[ 15.653529] really_probe+0xbc/0x2a8
[ 15.653535] __driver_probe_device+0x7c/0xe8
[ 15.653541] driver_probe_device+0x40/0x110
[ 15.653547] __device_attach_driver+0x98/0xd0
[ 15.653553] bus_for_each_drv+0x68/0xd0
[ 15.653559] __device_attach+0xf4/0x188
[ 15.653565] device_initial_probe+0x14/0x20
Fix this by removing old code.
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Fixes: 1fd0d85affe4 ("soundwire: qcom: Add flag for software clock gating check")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Link: https://lore.kernel.org/r/20220814123800.31200-1-srinivas.kandagatla@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Looks to have been left out in an oversight.
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Sainath Grandhi <sainath.grandhi@intel.com>
Fixes: 235a9d89da97 ('ipvtap: IP-VLAN based tap driver')
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20220821130808.12143-1-zenczykowski@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This reverts commit b09796d528bbf06e3e10a4a8f78038719da7ebc6.
An issue was reported[1] on the original commit. I'll need to address that
before I can delete the use of driver_deferred_probe_check_state(). So,
bring it back for now.
[1] - https://lore.kernel.org/lkml/4799738.LvFx2qVVIh@steina-w/
Fixes: b09796d528bb ("iommu/of: Delete usage of driver_deferred_probe_check_state()")
Reported-by: Jean-Philippe Brucker <jpb@kernel.org>
Tested-by: Jean-Philippe Brucker <jpb@kernel.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-5-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 5a46079a96451cfb15e4f5f01f73f7ba24ef851a.
Quite a few issues have been reported [1][2][3][4][5][6] on the original
commit. While about half of them have been fixed, I'll need to fix the rest
before driver_deferred_probe_check_state() can be deleted. So, revert the
deletion for now.
[1] - https://lore.kernel.org/all/DU0PR04MB941735271F45C716342D0410886B9@DU0PR04MB9417.eurprd04.prod.outlook.com/
[2] - https://lore.kernel.org/all/CM6REZS9Z8AC.2KCR9N3EFLNQR@otso/
[3] - https://lore.kernel.org/all/CAD=FV=XYVwaXZxqUKAuM5c7NiVjFz5C6m6gAHSJ7rBXBF94_Tg@mail.gmail.com/
[4] - https://lore.kernel.org/all/Yvpd2pwUJGp7R+YE@euler/
[5] - https://lore.kernel.org/lkml/20220601070707.3946847-2-saravanak@google.com/
[6] - https://lore.kernel.org/all/CA+G9fYt_cc5SiNv1Vbse=HYY_+uc+9OYPZuJ-x59bROSaLN6fw@mail.gmail.com/
Fixes: 5a46079a9645 ("PM: domains: Delete usage of driver_deferred_probe_check_state()")
Reported-by: Peng Fan <peng.fan@nxp.com>
Reported-by: Luca Weiss <luca.weiss@fairphone.com>
Reported-by: Doug Anderson <dianders@chromium.org>
Reported-by: Colin Foster <colin.foster@in-advantage.com>
Reported-by: Tony Lindgren <tony@atomide.com>
Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit f8217275b57aa48d98cc42051c2aac34152718d6.
There are a few more issues to fix that have been reported in the thread
for the original series [1]. We'll need to fix those before this will work.
So, revert it for now.
[1] - https://lore.kernel.org/lkml/CAMuHMdWo_wRwV-i_iyTxVnEsf3Th9GBAG+wxUQMQGnw1t2ijTg@mail.gmail.com/
Fixes: f8217275b57a ("net: mdio: Delete usage of driver_deferred_probe_check_state()")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 9cbffc7a59561be950ecc675d19a3d2b45202b2b.
There are a few more issues to fix that have been reported in the thread
for the original series [1]. We'll need to fix those before this will work.
So, revert it for now.
[1] - https://lore.kernel.org/lkml/20220601070707.3946847-1-saravanak@google.com/
Fixes: 9cbffc7a5956 ("driver core: Delete driver_deferred_probe_check_state()")
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus
Jonathan writes:
"1st set of IIO fixes for 6.0-rc1
adi,ad7292
- Prevent duplicate disable of regulator in error path.
bosch,bmg160
- Correct dt-binding to allow for 2 interrupt pins.
capella,cm3605
- Fix missing error cleanup due to premature return.
capella,cm32181
- Fix missing static on local symbol.
microchip,mcp33911
- Correctly handle sign bit.
- Fix mismatch between driver and DT binding, including fallback to old
driver behavior.
- Use correct formula for voltage calculation."
* tag 'iio-fixes-for-6.0a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
iio: light: cm32181: make cm32181_pm_ops static
iio: ad7292: Prevent regulator double disable
dt-bindings: iio: gyroscope: bosch,bmg160: correct number of pins
iio: adc: mcp3911: use correct formula for AD conversion
iio: adc: mcp3911: correct "microchip,device-addr" property
iio: light: cm3605: Fix an error handling path in cm3605_probe()
iio: adc: mcp3911: make use of the sign bit
|
|
Delete the redundant word 'was'.
Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220708151538.51483-1-yuanjilin@cdjrlc.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
There is a possible race condition (use-after-free) like below
(FREE) | (USE)
adf7242_remove | adf7242_channel
cancel_delayed_work_sync |
destroy_workqueue (1) | adf7242_cmd_rx
| mod_delayed_work (2)
|
The root cause for this race is that the upper layer (ieee802154) is
unaware of this detaching event and the function adf7242_channel can
be called without any checks.
To fix this, we can add a flag write at the beginning of adf7242_remove
and add flag check in adf7242_channel. Or we can just defer the
destructive operation like other commit 3e0588c291d6 ("hamradio: defer
ax25 kfree after unregister_netdev") which let the
ieee802154_unregister_hw() to handle the synchronization. This patch
takes the second option.
Fixes: 58e9683d1475 ("net: ieee802154: adf7242: Fix OCL calibration
runs")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Link: https://lore.kernel.org/r/20220808034224.12642-1-linma@zju.edu.cn
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
The recent commit 87d0e2f41b8c ("usb: typec: ucsi: add a common
function ucsi_unregister_connectors()") introduced a regression that
caused NULL dereference at reading the power supply sysfs. It's a
stale sysfs entry that should have been removed but remains with NULL
ops. The commit changed the error handling to skip the entries after
a NULL con->wq, and this leaves the power device unreleased.
For addressing the regression, the straight revert is applied here.
Further code improvements can be done from the scratch again.
Link: https://bugzilla.suse.com/show_bug.cgi?id=1202386
Link: https://lore.kernel.org/r/87r11cmbx0.wl-tiwai@suse.de
Fixes: 87d0e2f41b8c ("usb: typec: ucsi: add a common function ucsi_unregister_connectors()")
Cc: <stable@vger.kernel.org>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220823065455.32579-1-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The dwc3_qcom_read_usb2_speed() helper is now only called when the
controller is acting as host, but the compiler will warn that the hcd
variable is unused in gadget-only W=1 builds.
Fixes: c06795f114a6 ("usb: dwc3: qcom: fix gadget-only builds")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20220822100550.3039-1-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Although commit 88f1669019bd ("scsi: sd: Rework asynchronous resume support")
eliminates a delay for some ATA disks after resume, it causes resume of ATA
disks to fail on other setups. See also:
* "Resume process hangs for 5-6 seconds starting sometime in 5.16"
(https://bugzilla.kernel.org/show_bug.cgi?id=215880).
* Geert's regression report
(https://lore.kernel.org/linux-scsi/alpine.DEB.2.22.394.2207191125130.1006766@ramsan.of.borg/).
This is what I understand about this issue:
* During resume, ata_port_pm_resume() starts the SCSI error handler. This
changes the SCSI host state into SHOST_RECOVERY and causes
scsi_queue_rq() to return BLK_STS_RESOURCE.
* sd_resume() calls sd_start_stop_device() for ATA devices. That function
in turn calls sd_submit_start() which tries to submit a START STOP UNIT
command. That command can only be submitted after the SCSI error handler
has changed the SCSI host state back to SHOST_RUNNING.
* The SCSI error handler runs on its own thread and calls
schedule_work(&(ap->scsi_rescan_task)). That causes
ata_scsi_dev_rescan() to be called from the context of a kernel
workqueue. That call hangs in blk_mq_get_tag(). I'm not sure why - maybe
because all available tags have been allocated by sd_submit_start()
calls (this is a guess).
Link: https://lore.kernel.org/r/20220816172638.538734-1-bvanassche@acm.org
Fixes: 88f1669019bd ("scsi: sd: Rework asynchronous resume support")
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: gzhqyz@gmail.com
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: gzhqyz@gmail.com
Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: John Garry <john.garry@huawei.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The function raspberrypi_fw_get_rate (e.g. used for the recalc_rate
hook) can fail to get the clock rate from the firmware. In this case
we cannot return a signed error value, which would be casted to
unsigned long. Fix this by returning 0 instead.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Link: https://lore.kernel.org/r/20220625083643.4012-1-stefan.wahren@i2se.com
Fixes: 4e85e535e6cc ("clk: bcm283x: add driver interfacing with Raspberry Pi's firmware")
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
The value is only ever set once in bond_3ad_initialize and only ever
read otherwise. There seems to be no reason to set the variable via
bond_3ad_initialize when setting the global variable will do. Change
ad_ticks_per_sec to a const to enforce its read-only usage.
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is caused by the global variable ad_ticks_per_sec being zero as
demonstrated by the reproducer script discussed below. This causes
all timer values in __ad_timer_to_ticks to be zero, resulting
in the periodic timer to never fire.
To reproduce:
Run the script in
`tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh` which
puts bonding into a state where it never transmits LACPDUs.
line 44: ip link add fbond type bond mode 4 miimon 200 \
xmit_hash_policy 1 ad_actor_sys_prio 65535 lacp_rate fast
setting bond param: ad_actor_sys_prio
given:
params.ad_actor_system = 0
call stack:
bond_option_ad_actor_sys_prio()
-> bond_3ad_update_ad_actor_settings()
-> set ad.system.sys_priority = bond->params.ad_actor_sys_prio
-> ad.system.sys_mac_addr = bond->dev->dev_addr; because
params.ad_actor_system == 0
results:
ad.system.sys_mac_addr = bond->dev->dev_addr
line 48: ip link set fbond address 52:54:00:3B:7C:A6
setting bond MAC addr
call stack:
bond->dev->dev_addr = new_mac
line 52: ip link set fbond type bond ad_actor_sys_prio 65535
setting bond param: ad_actor_sys_prio
given:
params.ad_actor_system = 0
call stack:
bond_option_ad_actor_sys_prio()
-> bond_3ad_update_ad_actor_settings()
-> set ad.system.sys_priority = bond->params.ad_actor_sys_prio
-> ad.system.sys_mac_addr = bond->dev->dev_addr; because
params.ad_actor_system == 0
results:
ad.system.sys_mac_addr = bond->dev->dev_addr
line 60: ip link set veth1-bond down master fbond
given:
params.ad_actor_system = 0
params.mode = BOND_MODE_8023AD
ad.system.sys_mac_addr == bond->dev->dev_addr
call stack:
bond_enslave
-> bond_3ad_initialize(); because first slave
-> if ad.system.sys_mac_addr != bond->dev->dev_addr
return
results:
Nothing is run in bond_3ad_initialize() because dev_addr equals
sys_mac_addr leaving the global ad_ticks_per_sec zero as it is
never initialized anywhere else.
The if check around the contents of bond_3ad_initialize() is no longer
needed due to commit 5ee14e6d336f ("bonding: 3ad: apply ad_actor settings
changes immediately") which sets ad.system.sys_mac_addr if any one of
the bonding parameters whos set function calls
bond_3ad_update_ad_actor_settings(). This is because if
ad.system.sys_mac_addr is zero it will be set to the current bond mac
address, this causes the if check to never be true.
Fixes: 5ee14e6d336f ("bonding: 3ad: apply ad_actor settings changes immediately")
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since priv->rx_mapping[i] is maped in moxart_mac_open(), we
should unmap it from moxart_mac_stop(). Fixes 2 warnings.
1. During error unwinding in moxart_mac_probe(): "goto init_fail;",
then moxart_mac_free_memory() calls dma_unmap_single() with
priv->rx_mapping[i] pointers zeroed.
WARNING: CPU: 0 PID: 1 at kernel/dma/debug.c:963 check_unmap+0x704/0x980
DMA-API: moxart-ethernet 92000000.mac: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=1600 bytes]
CPU: 0 PID: 1 Comm: swapper Not tainted 5.19.0+ #60
Hardware name: Generic DT based system
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x34/0x44
dump_stack_lvl from __warn+0xbc/0x1f0
__warn from warn_slowpath_fmt+0x94/0xc8
warn_slowpath_fmt from check_unmap+0x704/0x980
check_unmap from debug_dma_unmap_page+0x8c/0x9c
debug_dma_unmap_page from moxart_mac_free_memory+0x3c/0xa8
moxart_mac_free_memory from moxart_mac_probe+0x190/0x218
moxart_mac_probe from platform_probe+0x48/0x88
platform_probe from really_probe+0xc0/0x2e4
2. After commands:
ip link set dev eth0 down
ip link set dev eth0 up
WARNING: CPU: 0 PID: 55 at kernel/dma/debug.c:570 add_dma_entry+0x204/0x2ec
DMA-API: moxart-ethernet 92000000.mac: cacheline tracking EEXIST, overlapping mappings aren't supported
CPU: 0 PID: 55 Comm: ip Not tainted 5.19.0+ #57
Hardware name: Generic DT based system
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x34/0x44
dump_stack_lvl from __warn+0xbc/0x1f0
__warn from warn_slowpath_fmt+0x94/0xc8
warn_slowpath_fmt from add_dma_entry+0x204/0x2ec
add_dma_entry from dma_map_page_attrs+0x110/0x328
dma_map_page_attrs from moxart_mac_open+0x134/0x320
moxart_mac_open from __dev_open+0x11c/0x1ec
__dev_open from __dev_change_flags+0x194/0x22c
__dev_change_flags from dev_change_flags+0x14/0x44
dev_change_flags from devinet_ioctl+0x6d4/0x93c
devinet_ioctl from inet_ioctl+0x1ac/0x25c
v1 -> v2:
Extraneous change removed.
Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver")
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220819110519.1230877-1-saproj@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For some MAC drivers, they set the mac_managed_pm to true in its
->ndo_open() callback. So before the mac_managed_pm is set to true,
we still want to leverage the mdio_bus_phy_suspend()/resume() for
the phy device suspend and resume. In this case, the phy device is
in PHY_READY, and we shouldn't warn about this. It also seems that
the check of mac_managed_pm in WARN_ON is redundant since we already
check this in the entry of mdio_bus_phy_resume(), so drop it.
Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220819082451.1992102-1-xiaolei.wang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In ipa_smem_init(), a Qualcomm SMEM region is allocated (if needed)
and then its virtual address is fetched using qcom_smem_get(). The
physical address associated with that region is also fetched.
The physical address is adjusted so that it is page-aligned, and an
attempt is made to update the size of the region to compensate for
any non-zero adjustment.
But that adjustment isn't done properly. The physical address is
aligned twice, and as a result the size is never actually adjusted.
Fix this by *not* aligning the "addr" local variable, and instead
making the "phys" local variable be the adjusted "addr" value.
Fixes: a0036bb413d5b ("net: ipa: define SMEM memory region for IPA")
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220818134206.567618-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
DSA has multiple ways of specifying a MAC connection to an internal PHY.
One requires a DT description like this:
port@0 {
reg = <0>;
phy-handle = <&internal_phy>;
phy-mode = "internal";
};
(which is IMO the recommended approach, as it is the clearest
description)
but it is also possible to leave the specification as just:
port@0 {
reg = <0>;
}
and if the driver implements ds->ops->phy_read and ds->ops->phy_write,
the DSA framework "knows" it should create a ds->slave_mii_bus, and it
should connect to a non-OF-based internal PHY on this MDIO bus, at an
MDIO address equal to the port address.
There is also an intermediary way of describing things:
port@0 {
reg = <0>;
phy-handle = <&internal_phy>;
};
In case 2, DSA calls phylink_connect_phy() and in case 3, it calls
phylink_of_phy_connect(). In both cases, phylink_create() has been
called with a phy_interface_t of PHY_INTERFACE_MODE_NA, and in both
cases, PHY_INTERFACE_MODE_NA is translated into phy->interface.
It is important to note that phy_device_create() initializes
dev->interface = PHY_INTERFACE_MODE_GMII, and so, when we use
phylink_create(PHY_INTERFACE_MODE_NA), no one will override this, and we
will end up with a PHY_INTERFACE_MODE_GMII interface inherited from the
PHY.
All this means that in order to maintain compatibility with device tree
blobs where the phy-mode property is missing, we need to allow the
"gmii" phy-mode and treat it as "internal".
Fixes: 2c709e0bdad4 ("net: dsa: microchip: ksz8795: add phylink support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216320
Reported-by: Craig McQueen <craig@mcqueen.id.au>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Tested-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Link: https://lore.kernel.org/r/20220818143250.2797111-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the module alias so the rk805-pwrkey driver will
autoload when built as a module.
Fixes: 5a35b85c2d92 ("Input: add power key driver for Rockchip RK805 PMIC")
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://lore.kernel.org/r/20220612225437.3628788-1-pbrobinson@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
In the original commit 9a34b45397e5 ("clk: Add support for runtime PM"),
the commit message mentioned that pm_runtime_put_sync() would be done
at the end of clk_core_unprepare(). This mirrors the operations in
clk_core_prepare() in the opposite order.
However, the actual code that was added wasn't in the order the commit
message described. Move clk_pm_runtime_put() to the end of
clk_core_unprepare() so that it is in the correct order.
Fixes: 9a34b45397e5 ("clk: Add support for runtime PM")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20220822081424.1310926-3-wenst@chromium.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
In the previous commits that added CLK_OPS_PARENT_ENABLE, support for
this flag was only added to rate change operations (rate setting and
reparent) and disabling unused subtree. It was not added to the
clock gate related operations. Any hardware driver that needs it for
these operations will either see bogus results, or worse, hang.
This has been seen on MT8192 and MT8195, where the imp_ii2_* clk
drivers set this, but dumping debugfs clk_summary would cause it
to hang.
Fixes: fc8726a2c021 ("clk: core: support clocks which requires parents enable (part 2)")
Fixes: a4b3518d146f ("clk: core: support clocks which requires parents enable (part 1)")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20220822081424.1310926-2-wenst@chromium.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Unlock before returning if mlx5_device_enable_sriov() fails.
Fixes: 84a433a40d0e ("net/mlx5: Lock mlx5 devlink reload callbacks")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Call mlx5e_fs_vlan_free(fs) before kvfree(fs).
Fixes: af8bbf730068 ("net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Use the list_for_each_entry_safe() macro to prevent dereferencing "obj"
after it has been freed.
Fixes: c4dfe704f53f ("net/mlx5e: kTLS, Recycle objects of device-offloaded TLS TX connections")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Unlock before returning on this error path.
Fixes: f1bc646c9a06 ("net/mlx5: Use devl_ API in mlx5_esw_offloads_devlink_port_register")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The cited commit reintroduced the ability to set hw-tc-offload
in switchdev mode by reusing NIC mode calls without modifying it
to support both modes, this can cause an illegal memory access
when trying to turn hw-tc-offload off.
Fix this by using the right TC_FLAG when checking if tc rules
are installed while disabling hw-tc-offload.
Fixes: d3cbd4254df8 ("net/mlx5e: Add ndo_set_feature for uplink representor")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
There is a missing policer validation when offloading police action
with tc action api. Add it.
Fixes: 7d1a5ce46e47 ("net/mlx5e: TC, Support tc action api for police")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Driver caches packet merge type in mlx5e_params instance which must be
in perfect sync with the netdev_feature's bit.
Prior to this patch, in certain conditions (*) LRO state was set in
mlx5e_params, while netdev_feature's bit was off. Causing the LRO to
be applied on the RQs (HW level).
(*) This can happen only on profile init (mlx5e_build_nic_params()),
when RQ expect non-linear SKB and PCI is fast enough in comparison to
link width.
Solution: remove setting of packet merge type from
mlx5e_build_nic_params() as netdev features are not updated.
Fixes: 619a8f2a42f1 ("net/mlx5e: Use linear SKB in Striding RQ")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add a lock_class_key per mlx5 device to avoid a false positive
"possible circular locking dependency" warning by lockdep, on flows
which lock more than one mlx5 device, such as adding SF.
kernel log:
======================================================
WARNING: possible circular locking dependency detected
5.19.0-rc8+ #2 Not tainted
------------------------------------------------------
kworker/u20:0/8 is trying to acquire lock:
ffff88812dfe0d98 (&dev->intf_state_mutex){+.+.}-{3:3}, at: mlx5_init_one+0x2e/0x490 [mlx5_core]
but task is already holding lock:
ffff888101aa7898 (&(¬ifier->n_head)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x5a/0x130
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&(¬ifier->n_head)->rwsem){++++}-{3:3}:
down_write+0x90/0x150
blocking_notifier_chain_register+0x53/0xa0
mlx5_sf_table_init+0x369/0x4a0 [mlx5_core]
mlx5_init_one+0x261/0x490 [mlx5_core]
probe_one+0x430/0x680 [mlx5_core]
local_pci_probe+0xd6/0x170
work_for_cpu_fn+0x4e/0xa0
process_one_work+0x7c2/0x1340
worker_thread+0x6f6/0xec0
kthread+0x28f/0x330
ret_from_fork+0x1f/0x30
-> #0 (&dev->intf_state_mutex){+.+.}-{3:3}:
__lock_acquire+0x2fc7/0x6720
lock_acquire+0x1c1/0x550
__mutex_lock+0x12c/0x14b0
mlx5_init_one+0x2e/0x490 [mlx5_core]
mlx5_sf_dev_probe+0x29c/0x370 [mlx5_core]
auxiliary_bus_probe+0x9d/0xe0
really_probe+0x1e0/0xaa0
__driver_probe_device+0x219/0x480
driver_probe_device+0x49/0x130
__device_attach_driver+0x1b8/0x280
bus_for_each_drv+0x123/0x1a0
__device_attach+0x1a3/0x460
bus_probe_device+0x1a2/0x260
device_add+0x9b1/0x1b40
__auxiliary_device_add+0x88/0xc0
mlx5_sf_dev_state_change_handler+0x67e/0x9d0 [mlx5_core]
blocking_notifier_call_chain+0xd5/0x130
mlx5_vhca_state_work_handler+0x2b0/0x3f0 [mlx5_core]
process_one_work+0x7c2/0x1340
worker_thread+0x59d/0xec0
kthread+0x28f/0x330
ret_from_fork+0x1f/0x30
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&(¬ifier->n_head)->rwsem);
lock(&dev->intf_state_mutex);
lock(&(¬ifier->n_head)->rwsem);
lock(&dev->intf_state_mutex);
*** DEADLOCK ***
4 locks held by kworker/u20:0/8:
#0: ffff888150612938 ((wq_completion)mlx5_events){+.+.}-{0:0}, at: process_one_work+0x6e2/0x1340
#1: ffff888100cafdb8 ((work_completion)(&work->work)#3){+.+.}-{0:0}, at: process_one_work+0x70f/0x1340
#2: ffff888101aa7898 (&(¬ifier->n_head)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x5a/0x130
#3: ffff88813682d0e8 (&dev->mutex){....}-{3:3}, at:__device_attach+0x76/0x460
stack backtrace:
CPU: 6 PID: 8 Comm: kworker/u20:0 Not tainted 5.19.0-rc8+
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: mlx5_events mlx5_vhca_state_work_handler [mlx5_core]
Call Trace:
<TASK>
dump_stack_lvl+0x57/0x7d
check_noncircular+0x278/0x300
? print_circular_bug+0x460/0x460
? lock_chain_count+0x20/0x20
? register_lock_class+0x1880/0x1880
__lock_acquire+0x2fc7/0x6720
? register_lock_class+0x1880/0x1880
? register_lock_class+0x1880/0x1880
lock_acquire+0x1c1/0x550
? mlx5_init_one+0x2e/0x490 [mlx5_core]
? lockdep_hardirqs_on_prepare+0x400/0x400
__mutex_lock+0x12c/0x14b0
? mlx5_init_one+0x2e/0x490 [mlx5_core]
? mlx5_init_one+0x2e/0x490 [mlx5_core]
? _raw_read_unlock+0x1f/0x30
? mutex_lock_io_nested+0x1320/0x1320
? __ioremap_caller.constprop.0+0x306/0x490
? mlx5_sf_dev_probe+0x269/0x370 [mlx5_core]
? iounmap+0x160/0x160
mlx5_init_one+0x2e/0x490 [mlx5_core]
mlx5_sf_dev_probe+0x29c/0x370 [mlx5_core]
? mlx5_sf_dev_remove+0x130/0x130 [mlx5_core]
auxiliary_bus_probe+0x9d/0xe0
really_probe+0x1e0/0xaa0
__driver_probe_device+0x219/0x480
? auxiliary_match_id+0xe9/0x140
driver_probe_device+0x49/0x130
__device_attach_driver+0x1b8/0x280
? driver_allows_async_probing+0x140/0x140
bus_for_each_drv+0x123/0x1a0
? bus_for_each_dev+0x1a0/0x1a0
? lockdep_hardirqs_on_prepare+0x286/0x400
? trace_hardirqs_on+0x2d/0x100
__device_attach+0x1a3/0x460
? device_driver_attach+0x1e0/0x1e0
? kobject_uevent_env+0x22d/0xf10
bus_probe_device+0x1a2/0x260
device_add+0x9b1/0x1b40
? dev_set_name+0xab/0xe0
? __fw_devlink_link_to_suppliers+0x260/0x260
? memset+0x20/0x40
? lockdep_init_map_type+0x21a/0x7d0
__auxiliary_device_add+0x88/0xc0
? auxiliary_device_init+0x86/0xa0
mlx5_sf_dev_state_change_handler+0x67e/0x9d0 [mlx5_core]
blocking_notifier_call_chain+0xd5/0x130
mlx5_vhca_state_work_handler+0x2b0/0x3f0 [mlx5_core]
? mlx5_vhca_event_arm+0x100/0x100 [mlx5_core]
? lock_downgrade+0x6e0/0x6e0
? lockdep_hardirqs_on_prepare+0x286/0x400
process_one_work+0x7c2/0x1340
? lockdep_hardirqs_on_prepare+0x400/0x400
? pwq_dec_nr_in_flight+0x230/0x230
? rwlock_bug.part.0+0x90/0x90
worker_thread+0x59d/0xec0
? process_one_work+0x1340/0x1340
kthread+0x28f/0x330
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
</TASK>
Fixes: 6a3273217469 ("net/mlx5: SF, Port function state change support")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When the driver unloads, give/reclaim_pages may fail as PF driver in
teardown flow, current code will lead to the following kernel log print
'failed reclaiming pages: err 0'.
Fix it to get same behavior as before the cited commits,
by calling mlx5_cmd_check before handling error state.
mlx5_cmd_check will verify if the returned error is an actual error
needed to be handled by the driver or not and will return an
appropriate value.
Fixes: 8d564292a166 ("net/mlx5: Remove redundant error on reclaim pages")
Fixes: 4dac2f10ada0 ("net/mlx5: Remove redundant notify fail on give pages")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The lag_lock is taken from both process and softirq contexts which results
lockdep warning[0] about potential deadlock. However, just disabling
softirqs by using *_bh spinlock API is not enough since it will cause
warning in some contexts where the lock is obtained with hard irqs
disabled. To fix the issue save current irq state, disable them before
obtaining the lock an re-enable irqs from saved state after releasing it.
[0]:
[Sun Aug 7 13:12:29 2022] ================================
[Sun Aug 7 13:12:29 2022] WARNING: inconsistent lock state
[Sun Aug 7 13:12:29 2022] 5.19.0_for_upstream_debug_2022_08_04_16_06 #1 Not tainted
[Sun Aug 7 13:12:29 2022] --------------------------------
[Sun Aug 7 13:12:29 2022] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[Sun Aug 7 13:12:29 2022] swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[Sun Aug 7 13:12:29 2022] ffffffffa06dc0d8 (lag_lock){+.?.}-{2:2}, at: mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug 7 13:12:29 2022] {SOFTIRQ-ON-W} state was registered at:
[Sun Aug 7 13:12:29 2022] lock_acquire+0x1c1/0x550
[Sun Aug 7 13:12:29 2022] _raw_spin_lock+0x2c/0x40
[Sun Aug 7 13:12:29 2022] mlx5_lag_add_netdev+0x13b/0x480 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5e_nic_enable+0x114/0x470 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5e_attach_netdev+0x30e/0x6a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5e_resume+0x105/0x160 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5e_probe+0xac3/0x14f0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] auxiliary_bus_probe+0x9d/0xe0
[Sun Aug 7 13:12:29 2022] really_probe+0x1e0/0xaa0
[Sun Aug 7 13:12:29 2022] __driver_probe_device+0x219/0x480
[Sun Aug 7 13:12:29 2022] driver_probe_device+0x49/0x130
[Sun Aug 7 13:12:29 2022] __driver_attach+0x1e4/0x4d0
[Sun Aug 7 13:12:29 2022] bus_for_each_dev+0x11e/0x1a0
[Sun Aug 7 13:12:29 2022] bus_add_driver+0x3f4/0x5a0
[Sun Aug 7 13:12:29 2022] driver_register+0x20f/0x390
[Sun Aug 7 13:12:29 2022] __auxiliary_driver_register+0x14e/0x260
[Sun Aug 7 13:12:29 2022] mlx5e_init+0x38/0x90 [mlx5_core]
[Sun Aug 7 13:12:29 2022] vhost_iotlb_itree_augment_rotate+0xcb/0x180 [vhost_iotlb]
[Sun Aug 7 13:12:29 2022] do_one_initcall+0xc4/0x400
[Sun Aug 7 13:12:29 2022] do_init_module+0x18a/0x620
[Sun Aug 7 13:12:29 2022] load_module+0x563a/0x7040
[Sun Aug 7 13:12:29 2022] __do_sys_finit_module+0x122/0x1d0
[Sun Aug 7 13:12:29 2022] do_syscall_64+0x3d/0x90
[Sun Aug 7 13:12:29 2022] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[Sun Aug 7 13:12:29 2022] irq event stamp: 3596508
[Sun Aug 7 13:12:29 2022] hardirqs last enabled at (3596508): [<ffffffff813687c2>] __local_bh_enable_ip+0xa2/0x100
[Sun Aug 7 13:12:29 2022] hardirqs last disabled at (3596507): [<ffffffff813687da>] __local_bh_enable_ip+0xba/0x100
[Sun Aug 7 13:12:29 2022] softirqs last enabled at (3596488): [<ffffffff81368a2a>] irq_exit_rcu+0x11a/0x170
[Sun Aug 7 13:12:29 2022] softirqs last disabled at (3596495): [<ffffffff81368a2a>] irq_exit_rcu+0x11a/0x170
[Sun Aug 7 13:12:29 2022]
other info that might help us debug this:
[Sun Aug 7 13:12:29 2022] Possible unsafe locking scenario:
[Sun Aug 7 13:12:29 2022] CPU0
[Sun Aug 7 13:12:29 2022] ----
[Sun Aug 7 13:12:29 2022] lock(lag_lock);
[Sun Aug 7 13:12:29 2022] <Interrupt>
[Sun Aug 7 13:12:29 2022] lock(lag_lock);
[Sun Aug 7 13:12:29 2022]
*** DEADLOCK ***
[Sun Aug 7 13:12:29 2022] 4 locks held by swapper/0/0:
[Sun Aug 7 13:12:29 2022] #0: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: mlx5e_napi_poll+0x43/0x20a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] #1: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb_list_internal+0x2d7/0xd60
[Sun Aug 7 13:12:29 2022] #2: ffff888144a18b58 (&br->hash_lock){+.-.}-{2:2}, at: br_fdb_update+0x301/0x570
[Sun Aug 7 13:12:29 2022] #3: ffffffff84643260 (rcu_read_lock){....}-{1:2}, at: atomic_notifier_call_chain+0x5/0x1d0
[Sun Aug 7 13:12:29 2022]
stack backtrace:
[Sun Aug 7 13:12:29 2022] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0_for_upstream_debug_2022_08_04_16_06 #1
[Sun Aug 7 13:12:29 2022] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[Sun Aug 7 13:12:29 2022] Call Trace:
[Sun Aug 7 13:12:29 2022] <IRQ>
[Sun Aug 7 13:12:29 2022] dump_stack_lvl+0x57/0x7d
[Sun Aug 7 13:12:29 2022] mark_lock.part.0.cold+0x5f/0x92
[Sun Aug 7 13:12:29 2022] ? lock_chain_count+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? unwind_next_frame+0x1c4/0x1b50
[Sun Aug 7 13:12:29 2022] ? secondary_startup_64_no_verify+0xcd/0xdb
[Sun Aug 7 13:12:29 2022] ? mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? stack_access_ok+0x1d0/0x1d0
[Sun Aug 7 13:12:29 2022] ? start_kernel+0x3a7/0x3c5
[Sun Aug 7 13:12:29 2022] __lock_acquire+0x1260/0x6720
[Sun Aug 7 13:12:29 2022] ? lock_chain_count+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? lock_chain_count+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? register_lock_class+0x1880/0x1880
[Sun Aug 7 13:12:29 2022] ? mark_lock.part.0+0xed/0x3060
[Sun Aug 7 13:12:29 2022] ? stack_trace_save+0x91/0xc0
[Sun Aug 7 13:12:29 2022] lock_acquire+0x1c1/0x550
[Sun Aug 7 13:12:29 2022] ? mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? lockdep_hardirqs_on_prepare+0x400/0x400
[Sun Aug 7 13:12:29 2022] ? __lock_acquire+0xd6f/0x6720
[Sun Aug 7 13:12:29 2022] _raw_spin_lock+0x2c/0x40
[Sun Aug 7 13:12:29 2022] ? mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5_lag_is_shared_fdb+0x1f/0x120 [mlx5_core]
[Sun Aug 7 13:12:29 2022] mlx5_esw_bridge_rep_vport_num_vhca_id_get+0x1a0/0x600 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? mlx5_esw_bridge_update_work+0x90/0x90 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? lock_acquire+0x1c1/0x550
[Sun Aug 7 13:12:29 2022] mlx5_esw_bridge_switchdev_event+0x185/0x8f0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? mlx5_esw_bridge_port_obj_attr_set+0x3e0/0x3e0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] atomic_notifier_call_chain+0xd7/0x1d0
[Sun Aug 7 13:12:29 2022] br_switchdev_fdb_notify+0xea/0x100
[Sun Aug 7 13:12:29 2022] ? br_switchdev_set_port_flag+0x310/0x310
[Sun Aug 7 13:12:29 2022] fdb_notify+0x11b/0x150
[Sun Aug 7 13:12:29 2022] br_fdb_update+0x34c/0x570
[Sun Aug 7 13:12:29 2022] ? lock_chain_count+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? br_fdb_add_local+0x50/0x50
[Sun Aug 7 13:12:29 2022] ? br_allowed_ingress+0x5f/0x1070
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] br_handle_frame_finish+0x786/0x18e0
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? __lock_acquire+0xd6f/0x6720
[Sun Aug 7 13:12:29 2022] ? sctp_inet_bind_verify+0x4d/0x190
[Sun Aug 7 13:12:29 2022] ? xlog_unpack_data+0x2e0/0x310
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] br_nf_hook_thresh+0x227/0x380 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? setup_pre_routing+0x460/0x460 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? br_nf_pre_routing_ipv6+0x48b/0x69c [br_netfilter]
[Sun Aug 7 13:12:29 2022] br_nf_pre_routing_finish_ipv6+0x5c2/0xbf0 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] br_nf_pre_routing_ipv6+0x4c6/0x69c [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_validate_ipv6+0x9e0/0x9e0 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_nf_forward_arp+0xb70/0xb70 [br_netfilter]
[Sun Aug 7 13:12:29 2022] ? br_nf_pre_routing+0xacf/0x1160 [br_netfilter]
[Sun Aug 7 13:12:29 2022] br_handle_frame+0x8a9/0x1270
[Sun Aug 7 13:12:29 2022] ? br_handle_frame_finish+0x18e0/0x18e0
[Sun Aug 7 13:12:29 2022] ? register_lock_class+0x1880/0x1880
[Sun Aug 7 13:12:29 2022] ? br_handle_local_finish+0x20/0x20
[Sun Aug 7 13:12:29 2022] ? bond_handle_frame+0xf9/0xac0 [bonding]
[Sun Aug 7 13:12:29 2022] ? br_handle_frame_finish+0x18e0/0x18e0
[Sun Aug 7 13:12:29 2022] __netif_receive_skb_core+0x7c0/0x2c70
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] ? generic_xdp_tx+0x5b0/0x5b0
[Sun Aug 7 13:12:29 2022] ? __lock_acquire+0xd6f/0x6720
[Sun Aug 7 13:12:29 2022] ? register_lock_class+0x1880/0x1880
[Sun Aug 7 13:12:29 2022] ? check_chain_key+0x24a/0x580
[Sun Aug 7 13:12:29 2022] __netif_receive_skb_list_core+0x2d7/0x8a0
[Sun Aug 7 13:12:29 2022] ? lock_acquire+0x1c1/0x550
[Sun Aug 7 13:12:29 2022] ? process_backlog+0x960/0x960
[Sun Aug 7 13:12:29 2022] ? lockdep_hardirqs_on_prepare+0x129/0x400
[Sun Aug 7 13:12:29 2022] ? kvm_clock_get_cycles+0x14/0x20
[Sun Aug 7 13:12:29 2022] netif_receive_skb_list_internal+0x5f4/0xd60
[Sun Aug 7 13:12:29 2022] ? do_xdp_generic+0x150/0x150
[Sun Aug 7 13:12:29 2022] ? mlx5e_poll_rx_cq+0xf6b/0x2960 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? mlx5e_poll_ico_cq+0x3d/0x1590 [mlx5_core]
[Sun Aug 7 13:12:29 2022] napi_complete_done+0x188/0x710
[Sun Aug 7 13:12:29 2022] mlx5e_napi_poll+0x4e9/0x20a0 [mlx5_core]
[Sun Aug 7 13:12:29 2022] ? __queue_work+0x53c/0xeb0
[Sun Aug 7 13:12:29 2022] __napi_poll+0x9f/0x540
[Sun Aug 7 13:12:29 2022] net_rx_action+0x420/0xb70
[Sun Aug 7 13:12:29 2022] ? napi_threaded_poll+0x470/0x470
[Sun Aug 7 13:12:29 2022] ? __common_interrupt+0x79/0x1a0
[Sun Aug 7 13:12:29 2022] __do_softirq+0x271/0x92c
[Sun Aug 7 13:12:29 2022] irq_exit_rcu+0x11a/0x170
[Sun Aug 7 13:12:29 2022] common_interrupt+0x7d/0xa0
[Sun Aug 7 13:12:29 2022] </IRQ>
[Sun Aug 7 13:12:29 2022] <TASK>
[Sun Aug 7 13:12:29 2022] asm_common_interrupt+0x22/0x40
[Sun Aug 7 13:12:29 2022] RIP: 0010:default_idle+0x42/0x60
[Sun Aug 7 13:12:29 2022] Code: c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 04 84 d2 75 14 8b 05 6b f1 22 02 85 c0 7e 07 0f 00 2d 80 3b 4a 00 fb f4 <c3> 48 c7 c7 e0 07 7e 85 e8 21 bd 40 fe eb de 66 66 2e 0f 1f 84 00
[Sun Aug 7 13:12:29 2022] RSP: 0018:ffffffff84407e18 EFLAGS: 00000242
[Sun Aug 7 13:12:29 2022] RAX: 0000000000000001 RBX: ffffffff84ec4a68 RCX: 1ffffffff0afc0fc
[Sun Aug 7 13:12:29 2022] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff835b1fac
[Sun Aug 7 13:12:29 2022] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff8884d2c44ac3
[Sun Aug 7 13:12:29 2022] R10: ffffed109a588958 R11: 00000000ffffffff R12: 0000000000000000
[Sun Aug 7 13:12:29 2022] R13: ffffffff84efac20 R14: 0000000000000000 R15: dffffc0000000000
[Sun Aug 7 13:12:29 2022] ? default_idle_call+0xcc/0x460
[Sun Aug 7 13:12:29 2022] default_idle_call+0xec/0x460
[Sun Aug 7 13:12:29 2022] do_idle+0x394/0x450
[Sun Aug 7 13:12:29 2022] ? arch_cpu_idle_exit+0x40/0x40
[Sun Aug 7 13:12:29 2022] cpu_startup_entry+0x19/0x20
[Sun Aug 7 13:12:29 2022] rest_init+0x156/0x250
[Sun Aug 7 13:12:29 2022] arch_call_rest_init+0xf/0x15
[Sun Aug 7 13:12:29 2022] start_kernel+0x3a7/0x3c5
[Sun Aug 7 13:12:29 2022] secondary_startup_64_no_verify+0xcd/0xdb
[Sun Aug 7 13:12:29 2022] </TASK>
Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Make sure to modify the rule for uplink forwarding only for the case
where destination vport number is MLX5_VPORT_UPLINK.
Fixes: 94db33177819 ("net/mlx5: Support multiport eswitch mode")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Only set MLX5_LAG_FLAG_NDEVS_READY if both netdevices are registered.
Doing so guarantees that both ldev->pf[MLX5_LAG_P0].dev and
ldev->pf[MLX5_LAG_P1].dev have valid pointers when
MLX5_LAG_FLAG_NDEVS_READY is set.
The core issue is asymmetry in setting MLX5_LAG_FLAG_NDEVS_READY and
clearing it. Setting it is done wrongly when both
ldev->pf[MLX5_LAG_P0].dev and ldev->pf[MLX5_LAG_P1].dev are set;
clearing it is done right when either of ldev->pf[i].netdev is cleared.
Consider the following scenario:
1. PF0 loads and sets ldev->pf[MLX5_LAG_P0].dev to a valid pointer
2. PF1 loads and sets both ldev->pf[MLX5_LAG_P1].dev and
ldev->pf[MLX5_LAG_P1].netdev with valid pointers. This results in
MLX5_LAG_FLAG_NDEVS_READY is set.
3. PF0 is unloaded before setting dev->pf[MLX5_LAG_P0].netdev.
MLX5_LAG_FLAG_NDEVS_READY remains set.
Further execution of mlx5_do_bond() will result in null pointer
dereference when calling mlx5_lag_is_multipath()
This patch fixes the following call trace actually encountered:
[ 1293.475195] BUG: kernel NULL pointer dereference, address: 00000000000009a8
[ 1293.478756] #PF: supervisor read access in kernel mode
[ 1293.481320] #PF: error_code(0x0000) - not-present page
[ 1293.483686] PGD 0 P4D 0
[ 1293.484434] Oops: 0000 [#1] SMP PTI
[ 1293.485377] CPU: 1 PID: 23690 Comm: kworker/u16:2 Not tainted 5.18.0-rc5_for_upstream_min_debug_2022_05_05_10_13 #1
[ 1293.488039] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 1293.490836] Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
[ 1293.492448] RIP: 0010:mlx5_lag_is_multipath+0x5/0x50 [mlx5_core]
[ 1293.494044] Code: e8 70 40 ff e0 48 8b 14 24 48 83 05 5c 1a 1b 00 01 e9 19 ff ff ff 48 83 05 47 1a 1b 00 01 eb d7 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 87 a8 09 00 00 48 85 c0 74 26 48 83 05 a7 1b 1b 00 01 41 b8
[ 1293.498673] RSP: 0018:ffff88811b2fbe40 EFLAGS: 00010202
[ 1293.500152] RAX: ffff88818a94e1c0 RBX: ffff888165eca6c0 RCX: 0000000000000000
[ 1293.501841] RDX: 0000000000000001 RSI: ffff88818a94e1c0 RDI: 0000000000000000
[ 1293.503585] RBP: 0000000000000000 R08: ffff888119886740 R09: ffff888165eca73c
[ 1293.505286] R10: 0000000000000018 R11: 0000000000000018 R12: ffff88818a94e1c0
[ 1293.506979] R13: ffff888112729800 R14: 0000000000000000 R15: ffff888112729858
[ 1293.508753] FS: 0000000000000000(0000) GS:ffff88852cc40000(0000) knlGS:0000000000000000
[ 1293.510782] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1293.512265] CR2: 00000000000009a8 CR3: 00000001032d4002 CR4: 0000000000370ea0
[ 1293.514001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1293.515806] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Fixes: 8a66e4585979 ("net/mlx5: Change ownership model for lag")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When querying mlx5 non-uplink representors capabilities with ethtool
rx-vlan-offload is marked as "off [fixed]". However, it is actually always
enabled because mlx5e_params->vlan_strip_disable is 0 by default when
initializing struct mlx5e_params instance. Fix the issue by explicitly
setting the vlan_strip_disable to 'true' for non-uplink representors.
Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|