Age | Commit message (Collapse) | Author |
|
There have been reports of approximately a 0.9%-1.7% failure rate in SMU
communication timeouts with s0i3 entry on some OEM designs. Currently
the design in amd-pmc is to try every 100us for up to 20ms.
However the GPU driver which also communicates with the SMU using a
mailbox register which the driver polls every 1us for up to 2000ms.
In the GPU driver this was increased by commit 055162645a40 ("drm/amd/pm:
increase time out value when sending msg to SMU")
Increase the maximum timeout used by amd-pmc to 2000ms to match this
behavior. This has been shown to improve the stability for machines
that randomly have failures.
Cc: stable@kernel.org
Reported-by: Julian Sikorski <belegdol@gmail.com>
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1629
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20210914020115.655-1-mario.limonciello@amd.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
There are two bugs:
1) If ida_simple_get() fails then this code calls put_device(carrier)
but we haven't yet called get_device(carrier) and probably that
leads to a use after free.
2) After device_initialize() then we need to use put_device() to
release the bus. This will free the internal resources tied to the
device and call mcb_free_bus() which will free the rest.
Fixes: 5d9e2ab9fea4 ("mcb: Implement bus->dev.release callback")
Fixes: 18d288198099 ("mcb: Correctly initialize the bus's device")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johannes Thumshirn <jth@kernel.org>
Link: https://lore.kernel.org/r/32e160cf6864ce77f9d62948338e24db9fd8ead9.1630931319.git.johannes.thumshirn@wdc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Initially, tty_ldisc_release() was exported for speakup (spk_tty) while
in staging. Later, the call to this function was removed as it was bogus
anyway.
Remove the export now.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Link: https://lore.kernel.org/r/20210914091134.17426-1-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
tx-fifo-resize is now added by default by the dwc3-qcom driver
to the SNPS DWC3 child node.
So, lets drop the tx-fifo-resize property from dwc3-qcom nodes
as having it there will cause the dwc3-qcom driver to error and
abort probe with:
[ 1.362938] dwc3-qcom 8af8800.usb: unable to add property
[ 1.368405] dwc3-qcom 8af8800.usb: failed to register DWC3 Core, err=-17
Fixes: cefdd52fa045 ("usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default")
Signed-off-by: Robert Marko <robimarko@gmail.com>
Link: https://lore.kernel.org/r/20210902220325.1783567-1-robimarko@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 laptops.
This patch initializes and enables speaker output on the Lenovo Legion 7i
15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 series of laptops using the
HDA verb sequence specific to each model.
Speaker automute is suppressed for the Lenovo Legion 7i 15IMHG05 to avoid
breaking speaker output on resume and when devices are unplugged from its
headphone jack.
Thanks to: Andreas Holzer, Vincent Morel, sycxyc, Max Christian Pohle and
all others that helped.
[ minor coding style fixes by tiwai ]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208555
Signed-off-by: Cameron Berkenpas <cam@neo-zeon.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210913212627.339362-1-cam@neo-zeon.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
'set_signals()' in synclink_gt.c conflicts with an exported symbol
in arch/um/, so change set_signals() to set_gtsignals(). Keep
the function names similar by also changing get_signals() to
get_gtsignals().
../drivers/tty/synclink_gt.c:442:13: error: conflicting types for ‘set_signals’
static void set_signals(struct slgt_info *info);
^~~~~~~~~~~
In file included from ../include/linux/irqflags.h:16:0,
from ../include/linux/spinlock.h:58,
from ../include/linux/mm_types.h:9,
from ../include/linux/buildid.h:5,
from ../include/linux/module.h:14,
from ../drivers/tty/synclink_gt.c:46:
../arch/um/include/asm/irqflags.h:6:5: note: previous declaration of ‘set_signals’ was here
int set_signals(int enable);
^~~~~~~~~~~
Fixes: 705b6c7b34f2 ("[PATCH] new driver synclink_gt")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210902003806.17054-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 505b08777d78 ("misc: genwqe: Use dma_set_mask_and_coherent to simplify code")
changed the logic in the code.
Instead of a ||, a && should have been used to keep the code the same.
Fixes: 505b08777d78 ("misc: genwqe: Use dma_set_mask_and_coherent to simplify code")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/be49835baa8ba6daba5813b399edf6300f7fdbda.1631130862.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
For Isochronous endpoints, the SS companion descriptor's
wBytesPerInterval field is required to reserve bus time in order
to transmit the required payload during the service interval.
If left at 0, the UAC2 function is unable to transact data on its
playback or capture endpoints in SuperSpeed mode.
Since f_uac2 currently does not support any bursting this value can
be exactly equal to the calculated wMaxPacketSize.
Tested with Windows 10 as a host.
Fixes: f8cb3d556be3 ("usb: f_uac2: adds support for SS and SSP")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210909174811.12534-3-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The f_uac2 function fails to enumerate when connected in SuperSpeed
due to the feedback endpoint missing the companion descriptor.
Add a new ss_epin_fback_desc_comp descriptor and append it behind the
ss_epin_fback_desc both in the static definition of the ss_audio_desc
structure as well as its dynamic construction in setup_headers().
Fixes: 24f779dac8f3 ("usb: gadget: f_uac2/u_audio: add feedback endpoint support")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210909174811.12534-2-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When last descriptor in a descriptor list completed with XferComplete
interrupt, core switching to handle next descriptor and assert BNA
interrupt. Both these interrupts are set while dwc2_hsotg_epint()
handler called. Each interrupt should be handled separately: first
XferComplete interrupt then BNA interrupt, otherwise last completed
transfer will not be giveback to function driver as completed
request.
Fixes: 729cac693eec ("usb: dwc2: Change ISOC DDMA flow")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
Link: https://lore.kernel.org/r/a36981accc26cd674c5d8f8da6164344b94ec1fe.1631386531.git.Minas.Harutyunyan@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
No functional change. Since configuration to stop HCD is invoked from
multiple places, group all of them in usb_stop_hcd().
Tested-by: Chris Chiu <chris.chiu@canonical.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20210909064200.16216-4-kishon@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Set "HCD_FLAG_DEFER_RH_REGISTER" to hcd->flags in xhci_run() to defer
registering primary roothub in usb_add_hcd(). This will make sure both
primary roothub and secondary roothub will be registered along with the
second HCD. This is required for cold plugged USB devices to be detected
in certain PCIe USB cards (like Inateck USB card connected to AM64 EVM
or J7200 EVM).
CC: stable@vger.kernel.org # 5.4+
Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20210909064200.16216-3-kishon@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
It has been observed with certain PCIe USB cards (like Inateck connected
to AM64 EVM or J7200 EVM) that as soon as the primary roothub is
registered, port status change is handled even before xHC is running
leading to cold plug USB devices not detected. For such cases, registering
both the root hubs along with the second HCD is required. Add support for
deferring roothub registration in usb_add_hcd(), so that both primary and
secondary roothubs are registered along with the second HCD.
CC: stable@vger.kernel.org # 5.4+
Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Chris Chiu <chris.chiu@canonical.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20210909064200.16216-2-kishon@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
According USB spec each ISOC transaction should be performed in a
designated for that transaction interval. On bus errors or delays
in operating system scheduling of client software can result in no
packet being transferred for a (micro)frame. An error indication
should be returned as status to the client software in such a case.
Current implementation in case of missed/dropped interval send same
data in next possible interval instead of reporting missed isoc.
This fix complete requests with -ENODATA if interval elapsed.
HSOTG core in BDMA and Slave modes haven't HW support for
(micro)frames tracking, this is why SW should care about tracking
of (micro)frames. Because of that method and consider operating
system scheduling delays, added few additional checking's of elapsed
target (micro)frame:
1. Immediately before enabling EP to start transfer.
2. With any transfer completion interrupt.
3. With incomplete isoc in/out interrupt.
4. With EP disabled interrupt because of incomplete transfer.
5. With OUT token received while EP disabled interrupt (for OUT
transfers).
6. With NAK replied to IN token interrupt (for IN transfers).
As part of ISOC flow, additionally fixed 'current' and 'target' frame
calculation functions. In HS mode SOF limits provided by DSTS register
is 0x3fff, but in non HS mode this limit is 0x7ff.
Tested by internal tool which also using for dwc3 testing.
Signed-off-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/95d1423adf4b0f68187c9894820c4b7e964a3f7f.1631175721.git.Minas.Harutyunyan@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
After we start to do core soft reset while usb role switch,
the phy init is invoked at every switch to device mode, but
its counter part de-init is missing, this causes the actual
phy init can not be done when we really want to re-init phy
like system resume, because the counter maintained by phy
core is not 0. considering phy init is actually redundant for
role switch, so move out the phy init from core soft reset to
dwc3 core init where is the only place required.
Fixes: f88359e1588b ("usb: dwc3: core: Do core softreset when switch mode")
Cc: <stable@vger.kernel.org>
Tested-by: faqiang.zhu <faqiang.zhu@nxp.com>
Tested-by: John Stultz <john.stultz@linaro.org> #HiKey960
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Li Jun <jun.li@nxp.com>
Link: https://lore.kernel.org/r/1631068099-13559-1-git-send-email-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit f3de5d857bb2362b00e2a8d4bc886cd49dcb66db.
That commit broke USB on all routers that have USB always powered on and
don't require toggling any GPIO. It's a majority of devices actually.
The original code worked and seemed safe: vcc GPIO is optional and
bcma_hci_platform_power_gpio() takes care of checking the pointer before
using it.
This revert fixes:
[ 10.801127] bcma_hcd: probe of bcma0:11 failed with error -2
Fixes: f3de5d857bb2 ("USB: bcma: Add a check for devm_gpiod_get")
Cc: stable <stable@vger.kernel.org>
Cc: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Link: https://lore.kernel.org/r/20210831065419.18371-1-zajec5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use platform_register_drivers() and platform_unregister_drivers() to
register and unregister ehci platform drivers. This simplifies the code
and prevents the following build errors seen with sparc:allmodconfig.
drivers/usb/host/ehci-hcd.c:1301: error:
"PLATFORM_DRIVER" redefined
drivers/usb/host/ehci-sh.c:173:31: error:
'ehci_hcd_sh_driver' defined but not used
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210907123002.3951446-1-linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If the driver runs out of minor numbers it would release minor 0 and
allow another device to claim the minor while still in use.
Fortunately, registering the tty class device of the second device would
fail (with a stack dump) due to the sysfs name collision so no memory is
leaked.
Fixes: cae2bc768d17 ("usb: cdc-acm: Decrement tty port's refcount if probe() fail")
Cc: stable@vger.kernel.org # 4.19
Cc: Jaejoong Kim <climbbb.kim@gmail.com>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20210907082318.7757-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210831084236.1359677-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
For DEV_VER_V3 version there exist race condition between clearing
ep_sts.EP_STS_TRBERR and setting ep_cmd.EP_CMD_DRDY bit.
Setting EP_CMD_DRDY will be ignored by controller when
EP_STS_TRBERR is set. So, between these two instructions we have
a small time gap in which the EP_STSS_TRBERR can be set. In such case
the transfer will not start after setting doorbell.
Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
cc: <stable@vger.kernel.org> # 5.12.x
Tested-by: Aswath Govindraju <a-govindraju@ti.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Signed-off-by: Pawel Laszczak <pawell@cadence.com>
Link: https://lore.kernel.org/r/20210907062619.34622-1-pawell@gli-login.cadence.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This loop is supposed to loop until if reads something other than
CS_IDST or until it times out after 30,000 attempts. But because of
the || vs && bug, it will never time out and instead it will loop a
minimum of 30,000 times.
This bug is quite old but the code is only used in USB_DEVICE_TEST_MODE
so it probably doesn't affect regular usage.
Fixes: 96fe53ef5498 ("usb: gadget: r8a66597-udc: add support for TEST_MODE")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20210906094221.GA10957@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The patch increases the bitshift in feedback frequency
calculation with EP-OUT bInterval value.
Tests have revealed that Win10 and OSX UAC2 drivers require
the feedback frequency to be based on the actual packet
interval instead of on the USB2 microframe. Otherwise they
ignore the feedback value. Linux snd-usb-audio driver
detects the applied bitshift automatically.
Tested-by: Henrik Enquist <henrik.enquist@gmail.com>
Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210906130822.12256-1-pavel.hofman@ivitera.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Dispatching requests inline with the .queue_rq() call may block while
holding the send_mutex. If the tcp io_work also happens to schedule, it
may see the req_list is non-empty, leaving "pending" true and remaining
in TASK_RUNNING. Since io_work is of higher scheduling priority, the
.queue_rq task may not get a chance to run, blocking forward progress
and leading to io timeouts.
Instead of checking for pending requests within io_work, let the queueing
restart io_work outside the send_mutex lock if there is more work to be
done.
Fixes: a0fdd1418007f ("nvme-tcp: rerun io_work if req_list is not empty")
Reported-by: Samuel Jones <sjones@kalrayinc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
We should always destroy cm_id before destroy qp to avoid to get cma
event after qp was destroyed, which may lead to use after free.
In RDMA connection establishment error flow, don't destroy qp in cm
event handler.Just report cm_error to upper level, qp will be destroy
in nvme_rdma_alloc_queue() after destroy cm id.
Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
nvme_update_ana_state() has a deficiency that results in a failure to
properly update the ana state for a namespace in the following case:
NSIDs in ctrl->namespaces: 1, 3, 4
NSIDs in desc->nsids: 1, 2, 3, 4
Loop iteration 0:
ns index = 0, n = 0, ns->head->ns_id = 1, nsid = 1, MATCH.
Loop iteration 1:
ns index = 1, n = 1, ns->head->ns_id = 3, nsid = 2, NO MATCH.
Loop iteration 2:
ns index = 2, n = 2, ns->head->ns_id = 4, nsid = 4, MATCH.
Where the update to the ANA state of NSID 3 is missed. To fix this
increment n and retry the update with the same ns when ns->head->ns_id is
higher than nsid,
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
|
|
testusb' application which uses 'usbtest' driver reports 'unknown speed'
from the function 'find_testdev'. The variable 'entry->speed' was not
updated from the application. The IOCTL mentioned in the FIXME comment can
only report whether the connection is low speed or not. Speed is read using
the IOCTL USBDEVFS_GET_SPEED which reports the proper speed grade. The
call is implemented in the function 'handle_testdev' where the file
descriptor was availble locally. Sample output is given below where 'high
speed' is printed as the connected speed.
sudo ./testusb -a
high speed /dev/bus/usb/001/011 0
/dev/bus/usb/001/011 test 0, 0.000015 secs
/dev/bus/usb/001/011 test 1, 0.194208 secs
/dev/bus/usb/001/011 test 2, 0.077289 secs
/dev/bus/usb/001/011 test 3, 0.170604 secs
/dev/bus/usb/001/011 test 4, 0.108335 secs
/dev/bus/usb/001/011 test 5, 2.788076 secs
/dev/bus/usb/001/011 test 6, 2.594610 secs
/dev/bus/usb/001/011 test 7, 2.905459 secs
/dev/bus/usb/001/011 test 8, 2.795193 secs
/dev/bus/usb/001/011 test 9, 8.372651 secs
/dev/bus/usb/001/011 test 10, 6.919731 secs
/dev/bus/usb/001/011 test 11, 16.372687 secs
/dev/bus/usb/001/011 test 12, 16.375233 secs
/dev/bus/usb/001/011 test 13, 2.977457 secs
/dev/bus/usb/001/011 test 14 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 17, 0.148826 secs
/dev/bus/usb/001/011 test 18, 0.068718 secs
/dev/bus/usb/001/011 test 19, 0.125992 secs
/dev/bus/usb/001/011 test 20, 0.127477 secs
/dev/bus/usb/001/011 test 21 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 24, 4.133763 secs
/dev/bus/usb/001/011 test 27, 2.140066 secs
/dev/bus/usb/001/011 test 28, 2.120713 secs
/dev/bus/usb/001/011 test 29, 0.507762 secs
Signed-off-by: Faizel K B <faizel.kb@dicortech.com>
Link: https://lore.kernel.org/r/20210902114444.15106-1-faizel.kb@dicortech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There are two cases for machine check recovery:
1) The machine check was triggered by ring3 (application) code.
This is the simpler case. The machine check handler simply queues
work to be executed on return to user. That code unmaps the page
from all users and arranges to send a SIGBUS to the task that
triggered the poison.
2) The machine check was triggered in kernel code that is covered by
an exception table entry. In this case the machine check handler
still queues a work entry to unmap the page, etc. but this will
not be called right away because the #MC handler returns to the
fix up code address in the exception table entry.
Problems occur if the kernel triggers another machine check before the
return to user processes the first queued work item.
Specifically, the work is queued using the ->mce_kill_me callback
structure in the task struct for the current thread. Attempting to queue
a second work item using this same callback results in a loop in the
linked list of work functions to call. So when the kernel does return to
user, it enters an infinite loop processing the same entry for ever.
There are some legitimate scenarios where the kernel may take a second
machine check before returning to the user.
1) Some code (e.g. futex) first tries a get_user() with page faults
disabled. If this fails, the code retries with page faults enabled
expecting that this will resolve the page fault.
2) Copy from user code retries a copy in byte-at-time mode to check
whether any additional bytes can be copied.
On the other side of the fence are some bad drivers that do not check
the return value from individual get_user() calls and may access
multiple user addresses without noticing that some/all calls have
failed.
Fix by adding a counter (current->mce_count) to keep track of repeated
machine checks before task_work() is called. First machine check saves
the address information and calls task_work_add(). Subsequent machine
checks before that task_work call back is executed check that the address
is in the same page as the first machine check (since the callback will
offline exactly one page).
Expected worst case is four machine checks before moving on (e.g. one
user access with page faults disabled, then a repeat to the same address
with page faults enabled ... repeat in copy tail bytes). Just in case
there is some code that loops forever enforce a limit of 10.
[ bp: Massage commit message, drop noinstr, fix typo, extend panic
messages. ]
Fixes: 5567d11c21a1 ("x86/mce: Send #MC singal from task work")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com
|
|
As previously noted in commit 66e4f4a9cc38 ("rtc: cmos: Use
spin_lock_irqsave() in cmos_interrupt()"):
<4>[ 254.192378] WARNING: inconsistent lock state
<4>[ 254.192384] 5.12.0-rc1-CI-CI_DRM_9834+ #1 Not tainted
<4>[ 254.192396] --------------------------------
<4>[ 254.192400] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
<4>[ 254.192409] rtcwake/5309 [HC0[0]:SC0[0]:HE1:SE1] takes:
<4>[ 254.192429] ffffffff8263c5f8 (rtc_lock){?...}-{2:2}, at: cmos_interrupt+0x18/0x100
<4>[ 254.192481] {IN-HARDIRQ-W} state was registered at:
<4>[ 254.192488] lock_acquire+0xd1/0x3d0
<4>[ 254.192504] _raw_spin_lock+0x2a/0x40
<4>[ 254.192519] cmos_interrupt+0x18/0x100
<4>[ 254.192536] rtc_handler+0x1f/0xc0
<4>[ 254.192553] acpi_ev_fixed_event_detect+0x109/0x13c
<4>[ 254.192574] acpi_ev_sci_xrupt_handler+0xb/0x28
<4>[ 254.192596] acpi_irq+0x13/0x30
<4>[ 254.192620] __handle_irq_event_percpu+0x43/0x2c0
<4>[ 254.192641] handle_irq_event_percpu+0x2b/0x70
<4>[ 254.192661] handle_irq_event+0x2f/0x50
<4>[ 254.192680] handle_fasteoi_irq+0x9e/0x150
<4>[ 254.192693] __common_interrupt+0x76/0x140
<4>[ 254.192715] common_interrupt+0x96/0xc0
<4>[ 254.192732] asm_common_interrupt+0x1e/0x40
<4>[ 254.192750] _raw_spin_unlock_irqrestore+0x38/0x60
<4>[ 254.192767] resume_irqs+0xba/0xf0
<4>[ 254.192786] dpm_resume_noirq+0x245/0x3d0
<4>[ 254.192811] suspend_devices_and_enter+0x230/0xaa0
<4>[ 254.192835] pm_suspend.cold.8+0x301/0x34a
<4>[ 254.192859] state_store+0x7b/0xe0
<4>[ 254.192879] kernfs_fop_write_iter+0x11d/0x1c0
<4>[ 254.192899] new_sync_write+0x11d/0x1b0
<4>[ 254.192916] vfs_write+0x265/0x390
<4>[ 254.192933] ksys_write+0x5a/0xd0
<4>[ 254.192949] do_syscall_64+0x33/0x80
<4>[ 254.192965] entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[ 254.192986] irq event stamp: 43775
<4>[ 254.192994] hardirqs last enabled at (43775): [<ffffffff81c00c42>] asm_sysvec_apic_timer_interrupt+0x12/0x20
<4>[ 254.193023] hardirqs last disabled at (43774): [<ffffffff81aa691a>] sysvec_apic_timer_interrupt+0xa/0xb0
<4>[ 254.193049] softirqs last enabled at (42548): [<ffffffff81e00342>] __do_softirq+0x342/0x48e
<4>[ 254.193074] softirqs last disabled at (42543): [<ffffffff810b45fd>] irq_exit_rcu+0xad/0xd0
<4>[ 254.193101]
other info that might help us debug this:
<4>[ 254.193107] Possible unsafe locking scenario:
<4>[ 254.193112] CPU0
<4>[ 254.193117] ----
<4>[ 254.193121] lock(rtc_lock);
<4>[ 254.193137] <Interrupt>
<4>[ 254.193142] lock(rtc_lock);
<4>[ 254.193156]
*** DEADLOCK ***
<4>[ 254.193161] 6 locks held by rtcwake/5309:
<4>[ 254.193174] #0: ffff888104861430 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x5a/0xd0
<4>[ 254.193232] #1: ffff88810f823288 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xe7/0x1c0
<4>[ 254.193282] #2: ffff888100cef3c0 (kn->active#285
<7>[ 254.192706] i915 0000:00:02.0: [drm:intel_modeset_setup_hw_state [i915]] [CRTC:51:pipe A] hw state readout: disabled
<4>[ 254.193307] ){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xf0/0x1c0
<4>[ 254.193333] #3: ffffffff82649fa8 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend.cold.8+0xce/0x34a
<4>[ 254.193387] #4: ffffffff827a2108 (acpi_scan_lock){+.+.}-{3:3}, at: acpi_suspend_begin+0x47/0x70
<4>[ 254.193433] #5: ffff8881019ea178 (&dev->mutex){....}-{3:3}, at: device_resume+0x68/0x1e0
<4>[ 254.193485]
stack backtrace:
<4>[ 254.193492] CPU: 1 PID: 5309 Comm: rtcwake Not tainted 5.12.0-rc1-CI-CI_DRM_9834+ #1
<4>[ 254.193514] Hardware name: Google Soraka/Soraka, BIOS MrChromebox-4.10 08/25/2019
<4>[ 254.193524] Call Trace:
<4>[ 254.193536] dump_stack+0x7f/0xad
<4>[ 254.193567] mark_lock.part.47+0x8ca/0xce0
<4>[ 254.193604] __lock_acquire+0x39b/0x2590
<4>[ 254.193626] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
<4>[ 254.193660] lock_acquire+0xd1/0x3d0
<4>[ 254.193677] ? cmos_interrupt+0x18/0x100
<4>[ 254.193716] _raw_spin_lock+0x2a/0x40
<4>[ 254.193735] ? cmos_interrupt+0x18/0x100
<4>[ 254.193758] cmos_interrupt+0x18/0x100
<4>[ 254.193785] cmos_resume+0x2ac/0x2d0
<4>[ 254.193813] ? acpi_pm_set_device_wakeup+0x1f/0x110
<4>[ 254.193842] ? pnp_bus_suspend+0x10/0x10
<4>[ 254.193864] pnp_bus_resume+0x5e/0x90
<4>[ 254.193885] dpm_run_callback+0x5f/0x240
<4>[ 254.193914] device_resume+0xb2/0x1e0
<4>[ 254.193942] ? pm_dev_err+0x25/0x25
<4>[ 254.193974] dpm_resume+0xea/0x3f0
<4>[ 254.194005] dpm_resume_end+0x8/0x10
<4>[ 254.194030] suspend_devices_and_enter+0x29b/0xaa0
<4>[ 254.194066] pm_suspend.cold.8+0x301/0x34a
<4>[ 254.194094] state_store+0x7b/0xe0
<4>[ 254.194124] kernfs_fop_write_iter+0x11d/0x1c0
<4>[ 254.194151] new_sync_write+0x11d/0x1b0
<4>[ 254.194183] vfs_write+0x265/0x390
<4>[ 254.194207] ksys_write+0x5a/0xd0
<4>[ 254.194232] do_syscall_64+0x33/0x80
<4>[ 254.194251] entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[ 254.194274] RIP: 0033:0x7f07d79691e7
<4>[ 254.194293] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
<4>[ 254.194312] RSP: 002b:00007ffd9cc2c768 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4>[ 254.194337] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f07d79691e7
<4>[ 254.194352] RDX: 0000000000000004 RSI: 0000556ebfc63590 RDI: 000000000000000b
<4>[ 254.194366] RBP: 0000556ebfc63590 R08: 0000000000000000 R09: 0000000000000004
<4>[ 254.194379] R10: 0000556ebf0ec2a6 R11: 0000000000000246 R12: 0000000000000004
which breaks S3-resume on fi-kbl-soraka presumably as that's slow enough
to trigger the alarm during the suspend.
Fixes: 6950d046eb6e ("rtc: cmos: Replace spin_lock_irqsave with spin_lock in hard IRQ")
References: 66e4f4a9cc38 ("rtc: cmos: Use spin_lock_irqsave() in cmos_interrupt()"):
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Xiaofei Tan <tanxiaofei@huawei.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210305122140.28774-1-chris@chris-wilson.co.uk
|
|
Driver's tx_empty callback should signal when the transmit shift register
is empty. So when the last character has been sent.
STAT_TX_FIFO_EMP bit signals only that HW transmit FIFO is empty, which
happens when the last byte is loaded into transmit shift register.
STAT_TX_EMP bit signals when the both HW transmit FIFO and transmit shift
register are empty.
So replace STAT_TX_FIFO_EMP check by STAT_TX_EMP in mvebu_uart_tx_empty()
callback function.
Fixes: 30530791a7a0 ("serial: mvebu-uart: initial support for Armada-3700 serial port")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Pali Rohár <pali@kernel.org>
Link: https://lore.kernel.org/r/20210911132017.25505-1-pali@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit b67e830d38fa ("serial: 8250: 8250_omap: Fix possible interrupt
storm on K3 SoCs") introduced fixup including a register read to
RX_LVL, however, we should be using word offset than byte offset
since our registers are on 4 byte boundary (port.regshift = 2) for
8250_omap.
Fixes: b67e830d38fa ("serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs")
Cc: stable <stable@vger.kernel.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/r/20210903050550.29050-1-nm@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Seeing these errors when GT is likely in suspend state-
"RPM wakelock ref not held during HW access"
Ensure GT is awake before trying to access HW registers. Avoid
reading the register if that is not the case.
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Fixes: 41e5c17ebfc2 ("drm/i915/guc/slpc: Sysfs hooks for SLPC")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210907232704.12982-1-vinay.belgaumkar@intel.com
(cherry picked from commit f25e3908b9cd4a3fe819e9bdcdde58f20bacb34c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
gem context refcounting is another exercise in least locking design it
seems, where most things get destroyed upon context closure (which can
race with anything really). Only the actual memory allocation and the
locks survive while holding a reference.
This tripped up Jason when reimplementing the single timeline feature
in
commit 00dae4d3d35d4f526929633b76e00b0ab4d3970d
Author: Jason Ekstrand <jason@jlekstrand.net>
Date: Thu Jul 8 10:48:12 2021 -0500
drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)
We could fix the bug by holding ctx->mutex in execbuf and clear the
pointer (again while holding the mutex) context_close, but it's
cleaner to just make the context object actually invariant over its
_entire_ lifetime. This way any other ioctl that's potentially racing,
but holding a full reference, can still rely on ctx->syncobj being
an immutable pointer. Which without this change, is not the case.
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Fixes: 00dae4d3d35d ("drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)")
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-2-daniel.vetter@ffwll.ch
(cherry picked from commit c238980efd3b35af70fc926066cf7440f50a97a9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Using the I915_MMAP_TYPE_FIXED mmap type requires the TTM backend, so
for that mmap type, use __i915_gem_object_create_user() instead of
i915_gem_object_create_internal(), as we really want to tests objects
mmap-able by user-space.
This also means that the out-of-space error happens at object creation
and returns -ENXIO rather than -ENOSPC, so fix the code up to expect
that on out-of-offset-space errors.
Finally only use I915_MMAP_TYPE_FIXED for LMEM and SMEM for now if
testing on LMEM-capable devices. For stolen LMEM, we still take the
same path as for integrated, as that haven't been moved over to TTM yet,
and user-space should not be able to create out of stolen LMEM anyway.
v2:
- Check the presence of the obj->ops->mmap_offset callback rather than
hardcoding the supported mmap regions in can_mmap() (Maarten Lankhorst)
Fixes: 7961c5b60f23 ("drm/i915: Add TTM offset argument to mmap.")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210831122931.157536-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 450cede7f3804ca7f8b3da210ebefa61c0958f22)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
The function is only used from within GEM_BUG_ON(), which is causing
warnings with Wunneeded-internal-declaration in some builds. Since the
function is a simple wrapper around a CT function, we can just call the
CT function directly instead.
Fixes: 1fb12c587152 ("drm/i915/guc: skip disabling CTBs before sanitizing the GuC")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210823163137.19770-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 5db1856781e45c9610f7652a19cc656b984235e7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Users reported that after commit 2bbd6dba84d4 ("drm/i915: Try to use
fast+narrow link on eDP again and fall back to the old max strategy on
failure"), the screen starts to have wobbly effect.
Commit a5c936add6a2 ("drm/i915/dp: Use slow and wide link training for
everything") doesn't help either, that means the affected eDP 1.2 panels
only work with max params.
So use max params for panels < eDP 1.4 as Windows does to solve the
issue.
v3:
- Do the eDP rev check in intel_edp_init_dpcd()
v2:
- Check eDP 1.4 instead of DPCD 1.1 to apply max params
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3714
Fixes: 2bbd6dba84d4 ("drm/i915: Try to use fast+narrow link on eDP again and fall back to the old max strategy on failure")
Fixes: a5c936add6a2 ("drm/i915/dp: Use slow and wide link training for everything")
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210820075301.693099-1-kai.heng.feng@canonical.com
(cherry picked from commit d7f213c131adf0bec8b731553eb82990cdac265d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
After DPRX link training, intel_dp_link_train_phy() did not
return the training result properly. If link training failed,
i915 driver would not run into link train fallback function.
And no hotplug uevent would be received by user space application.
Fixes: b30edfd8d0b4 ("drm/i915: Switch to LTTPR non-transparent mode link training")
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Cooper Chiou <cooper.chiou@intel.com>
Cc: William Tseng <william.tseng@intel.com>
Signed-off-by: Lee Shawn C <shawn.c.lee@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210706152541.25021-1-shawn.c.lee@intel.com
(cherry picked from commit dab1b47e57e053b2a02c22ead8e7449f79961335)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
If there is no legacy RTC device, don't try to use it for storing trace
data across suspend/resume.
Cc: <stable@vger.kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20210903084937.19392-2-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
Today the Xen ballooning is done via delayed work in a workqueue. This
might result in workqueue hangups being reported in case of large
amounts of memory are being ballooned in one go (here 16GB):
BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 stuck for 64s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=2/256 refcnt=3
in-flight: 229:balloon_process
pending: cache_reap
workqueue events_freezable_power_: flags=0x84
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
pending: disk_events_workfn
workqueue mm_percpu_wq: flags=0x8
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
pending: vmstat_update
pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=64s workers=3 idle: 2222 43
This can easily be avoided by using a dedicated kernel thread for doing
the ballooning work.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210827123206.15429-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
User space can hold a tty open indefinitely and tty drivers must not
release the underlying structures until the last user is gone.
Switch to using the tty-port reference counter to manage the life time
of the greybus tty state to avoid use after free after a disconnect.
Fixes: a18e15175708 ("greybus: more uart work")
Cc: stable@vger.kernel.org # 4.9
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20210906124538.22358-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This fixes warnings with -Wimplicit-function-declaration, e.g.
drivers/hwtracing/coresight/coresight-syscfg.c:455:15: error:
implicit declaration of function 'kzalloc' [-Werror,
-Wimplicit-function-declaration]
csdev_item = kzalloc(sizeof(struct cscfg_registered_csdev),
GFP_KERNEL);
Link: https://lore.kernel.org/r/20210830172820.2840433-1-jiancai@google.com
Fixes: 85e2414c518a ("coresight: syscfg: Initial coresight system configuration")
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jian Cai <jiancai@google.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20210913164613.1675791-2-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When I added nvmem_cell_read_variable_le_u32() and
nvmem_cell_read_variable_le_u64() I forgot to add the "static inline"
stub functions for when CONFIG_NVMEM wasn't defined. Add them
now. This was causing problems with randconfig builds that compiled
`drivers/soc/qcom/cpr.c`.
Fixes: 6feba6a62c57 ("PM: AVS: qcom-cpr: Use nvmem_cell_read_variable_le_u32()")
Fixes: a28e824fb827 ("nvmem: core: Add functions to make number reading easy")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210913160551.12907-1-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
During BC_FREE_BUFFER processing, the BINDER_TYPE_FDA object
cleanup may close 1 or more fds. The close operations are
completed using the task work mechanism -- which means the thread
needs to return to userspace or the file object may never be
dereferenced -- which can lead to hung processes.
Force the binder thread back to userspace if an fd is closed during
BC_FREE_BUFFER handling.
Fixes: 80cd795630d6 ("binder: fix use-after-free due to ksys_close() during fdget()")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Martijn Coenen <maco@android.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20210830195146.587206-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Currently cgroup freezer is used to freeze the application threads, and
BINDER_FREEZE is used to freeze the corresponding binder interface.
There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
existing transactions to drain out before actually freezing the binder
interface.
But freezing an app requires 2 steps, freezing the binder interface with
ioctl(BINDER_FREEZE) and then freezing the application main threads with
cgroupfs. This is not an atomic operation. The following race issue
might happen.
1) Binder interface is frozen by ioctl(BINDER_FREEZE);
2) Main thread A initiates a new sync binder transaction to process B;
3) Main thread A is frozen by "echo 1 > cgroup.freeze";
4) The response from process B reaches the frozen thread, which will
unexpectedly fail.
This patch provides a mechanism to check if there's any new pending
transaction happening between ioctl(BINDER_FREEZE) and freezing the
main thread. If there's any, the main thread freezing operation can
be rolled back to finish the pending transaction.
Furthermore, the response might reach the binder driver before the
rollback actually happens. That will still cause failed transaction.
As the other process doesn't wait for another response of the response,
the response transaction failure can be fixed by treating the response
transaction like an oneway/async one, allowing it to reach the frozen
thread. And it will be consumed when the thread gets unfrozen later.
NOTE: This patch reuses the existing definition of struct
binder_frozen_status_info but expands the bit assignments of __u32
member sync_recv.
To ensure backward compatibility, bit 0 of sync_recv still indicates
there's an outstanding sync binder transaction. This patch adds new
information to bit 1 of sync_recv, indicating the binder transaction
happens exactly when there's a race.
If an existing userspace app runs on a new kernel, a sync binder call
will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
return the expected value (true). The app just doesn't check bit 1
intentionally so it doesn't have the ability to tell if there's a race.
This behavior is aligned with what happens on an old kernel which
doesn't set bit 1 at all.
A new userspace app can 1) check bit 0 to know if there's a sync binder
transaction happened when being frozen - same as before; and 2) check
bit 1 to know if that sync binder transaction happened exactly when
there's a race - a new information for rollback decision.
the same time, confirmed the pending transactions succeeded.
Fixes: 432ff1e91694 ("binder: BINDER_FREEZE ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Li Li <dualli@google.com>
Test: stress test with apps being frozen and initiating binder calls at
Link: https://lore.kernel.org/r/20210910164210.2282716-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
device_initialize() is used to take a refcount on the device. However,
put_device() is not called during device teardown. This leads to a
leak of private data of the driver core, dev_name(), etc. This is
reported by kmemleak at boot time if we compile kernel with
DEBUG_TEST_DRIVER_REMOVE.
Fix memory leaks during unregistration and implement a release
function.
Link: https://lore.kernel.org/r/20210911105306.1511-1-yuzenghui@huawei.com
Fixes: ead09dd3aed5 ("scsi: bsg: Simplify device registration")
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
sd_spinup_disk() is a little bit noisy after commit 848ade90ba9c ("scsi:
sd: Do not exit sd_spinup_disk() quietly"):
scsi 0:0:0:0: Direct-Access Multiple Card Reader 1.00 PQ: 0 ANSI: 0
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] Media removed, stopped polling
sd 0:0:0:0: [sda] Media removed, stopped polling
sd 0:0:0:0: [sda] Attached SCSI removable disk
sd 0:0:0:0: [sda] Media removed, stopped polling
There's not really a benefit in printing the same message multiple
times. Therefore print it only if media_present was previously set.
Link: https://lore.kernel.org/r/a2d0a249-6035-9697-626a-e14ec50ef6ee@gmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Intel LKF can experience link errors. Make fixes to increase link
stability, especially when switching to high speed modes.
Link: https://lore.kernel.org/r/20210831145317.26306-1-adrian.hunter@intel.com
Fixes: b2c57925df1f ("scsi: ufs: ufs-pci: Add support for Intel LKF")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There are a couple of statements where the indentation is not correct,
clean these up. Remove a redundant break statement.
Link: https://lore.kernel.org/r/20210902224215.57286-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There are a few statements where the indentation is not correct, clean
these up.
Link: https://lore.kernel.org/r/20210902223643.56979-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There is a spelling mistake in a literal string. Fix it.
Link: https://lore.kernel.org/r/20210826115714.11844-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There's little point in keeping this one separately maintained these days,
so just remove the entry and it'll fall under the SCSI subsystem where it
belongs.
Link: https://lore.kernel.org/r/c5e12bd1-10de-634c-d6b3-dac79ed01af5@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|