Age | Commit message (Collapse) | Author |
|
team interface could be nested and it's lock variable could be nested too.
But this lock uses static lockdep key and there is no nested locking
handling code such as mutex_lock_nested() and so on.
so the Lockdep would warn about the circular locking scenario that
couldn't happen.
In order to fix, this patch makes the team module to use dynamic lock key
instead of static key.
Test commands:
ip link add team0 type team
ip link add team1 type team
ip link set team0 master team1
ip link set team0 nomaster
ip link set team1 master team0
ip link set team1 nomaster
Splat that looks like:
[ 40.364352] WARNING: possible recursive locking detected
[ 40.364964] 5.4.0-rc3+ #96 Not tainted
[ 40.365405] --------------------------------------------
[ 40.365973] ip/750 is trying to acquire lock:
[ 40.366542] ffff888060b34c40 (&team->lock){+.+.}, at: team_set_mac_address+0x151/0x290 [team]
[ 40.367689]
but task is already holding lock:
[ 40.368729] ffff888051201c40 (&team->lock){+.+.}, at: team_del_slave+0x29/0x60 [team]
[ 40.370280]
other info that might help us debug this:
[ 40.371159] Possible unsafe locking scenario:
[ 40.371942] CPU0
[ 40.372338] ----
[ 40.372673] lock(&team->lock);
[ 40.373115] lock(&team->lock);
[ 40.373549]
*** DEADLOCK ***
[ 40.374432] May be due to missing lock nesting notation
[ 40.375338] 2 locks held by ip/750:
[ 40.375851] #0: ffffffffabcc42b0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 40.376927] #1: ffff888051201c40 (&team->lock){+.+.}, at: team_del_slave+0x29/0x60 [team]
[ 40.377989]
stack backtrace:
[ 40.378650] CPU: 0 PID: 750 Comm: ip Not tainted 5.4.0-rc3+ #96
[ 40.379368] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 40.380574] Call Trace:
[ 40.381208] dump_stack+0x7c/0xbb
[ 40.381959] __lock_acquire+0x269d/0x3de0
[ 40.382817] ? register_lock_class+0x14d0/0x14d0
[ 40.383784] ? check_chain_key+0x236/0x5d0
[ 40.384518] lock_acquire+0x164/0x3b0
[ 40.385074] ? team_set_mac_address+0x151/0x290 [team]
[ 40.385805] __mutex_lock+0x14d/0x14c0
[ 40.386371] ? team_set_mac_address+0x151/0x290 [team]
[ 40.387038] ? team_set_mac_address+0x151/0x290 [team]
[ 40.387632] ? mutex_lock_io_nested+0x1380/0x1380
[ 40.388245] ? team_del_slave+0x60/0x60 [team]
[ 40.388752] ? rcu_read_lock_sched_held+0x90/0xc0
[ 40.389304] ? rcu_read_lock_bh_held+0xa0/0xa0
[ 40.389819] ? lock_acquire+0x164/0x3b0
[ 40.390285] ? lockdep_rtnl_is_held+0x16/0x20
[ 40.390797] ? team_port_get_rtnl+0x90/0xe0 [team]
[ 40.391353] ? __module_text_address+0x13/0x140
[ 40.391886] ? team_set_mac_address+0x151/0x290 [team]
[ 40.392547] team_set_mac_address+0x151/0x290 [team]
[ 40.393111] dev_set_mac_address+0x1f0/0x3f0
[ ... ]
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some interface types could be nested.
(VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc..)
These interface types should set lockdep class because, without lockdep
class key, lockdep always warn about unexisting circular locking.
In the current code, these interfaces have their own lockdep class keys and
these manage itself. So that there are so many duplicate code around the
/driver/net and /net/.
This patch adds new generic lockdep keys and some helper functions for it.
This patch does below changes.
a) Add lockdep class keys in struct net_device
- qdisc_running, xmit, addr_list, qdisc_busylock
- these keys are used as dynamic lockdep key.
b) When net_device is being allocated, lockdep keys are registered.
- alloc_netdev_mqs()
c) When net_device is being free'd llockdep keys are unregistered.
- free_netdev()
d) Add generic lockdep key helper function
- netdev_register_lockdep_key()
- netdev_unregister_lockdep_key()
- netdev_update_lockdep_key()
e) Remove unnecessary generic lockdep macro and functions
f) Remove unnecessary lockdep code of each interfaces.
After this patch, each interface modules don't need to maintain
their lockdep keys.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Current code doesn't limit the number of nested devices.
Nested devices would be handled recursively and this needs huge stack
memory. So, unlimited nested devices could make stack overflow.
This patch adds upper_level and lower_level, they are common variables
and represent maximum lower/upper depth.
When upper/lower device is attached or dettached,
{lower/upper}_level are updated. and if maximum depth is bigger than 8,
attach routine fails and returns -EMLINK.
In addition, this patch converts recursive routine of
netdev_walk_all_{lower/upper} to iterator routine.
Test commands:
ip link add dummy0 type dummy
ip link add link dummy0 name vlan1 type vlan id 1
ip link set vlan1 up
for i in {2..55}
do
let A=$i-1
ip link add vlan$i link vlan$A type vlan id $i
done
ip link del dummy0
Splat looks like:
[ 155.513226][ T908] BUG: KASAN: use-after-free in __unwind_start+0x71/0x850
[ 155.514162][ T908] Write of size 88 at addr ffff8880608a6cc0 by task ip/908
[ 155.515048][ T908]
[ 155.515333][ T908] CPU: 0 PID: 908 Comm: ip Not tainted 5.4.0-rc3+ #96
[ 155.516147][ T908] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 155.517233][ T908] Call Trace:
[ 155.517627][ T908]
[ 155.517918][ T908] Allocated by task 0:
[ 155.518412][ T908] (stack is not available)
[ 155.518955][ T908]
[ 155.519228][ T908] Freed by task 0:
[ 155.519885][ T908] (stack is not available)
[ 155.520452][ T908]
[ 155.520729][ T908] The buggy address belongs to the object at ffff8880608a6ac0
[ 155.520729][ T908] which belongs to the cache names_cache of size 4096
[ 155.522387][ T908] The buggy address is located 512 bytes inside of
[ 155.522387][ T908] 4096-byte region [ffff8880608a6ac0, ffff8880608a7ac0)
[ 155.523920][ T908] The buggy address belongs to the page:
[ 155.524552][ T908] page:ffffea0001822800 refcount:1 mapcount:0 mapping:ffff88806c657cc0 index:0x0 compound_mapcount:0
[ 155.525836][ T908] flags: 0x100000000010200(slab|head)
[ 155.526445][ T908] raw: 0100000000010200 ffffea0001813808 ffffea0001a26c08 ffff88806c657cc0
[ 155.527424][ T908] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[ 155.528429][ T908] page dumped because: kasan: bad access detected
[ 155.529158][ T908]
[ 155.529410][ T908] Memory state around the buggy address:
[ 155.530060][ T908] ffff8880608a6b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 155.530971][ T908] ffff8880608a6c00: fb fb fb fb fb f1 f1 f1 f1 00 f2 f2 f2 f3 f3 f3
[ 155.531889][ T908] >ffff8880608a6c80: f3 fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 155.532806][ T908] ^
[ 155.533509][ T908] ffff8880608a6d00: fb fb fb fb fb fb fb fb fb f1 f1 f1 f1 00 00 00
[ 155.534436][ T908] ffff8880608a6d80: f2 f3 f3 f3 f3 fb fb fb 00 00 00 00 00 00 00 00
[ ... ]
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
i2c-digital-filter-width-ns:
This optional timing property specifies the width of the spikes on the i2c
lines (in ns) that can be filtered out by built-in digital filters which are
embedded in some i2c controllers.
i2c-analog-filter-cutoff-frequency:
This optional timing property specifies the cutoff frequency of a low-pass
analog filter built-in i2c controllers. This low pass filter is used to filter
out high frequency noise on the i2c lines. Specified in Hz.
Include these properties in the timings structure and read them as integers.
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Reviewed-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Add kerneldoc comments for the optional reset_control_get variants.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
Mention of_reset_simple_xlate as the default if of_xlate is not set.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
Add missing parentheses to correctly hyperlink the reference to
reset_control_get_shared().
Fixes: 0b52297f2288 ("reset: Add support for shared reset controls")
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
Add a missing colon to fix a documentation build warning:
./include/linux/reset-controller.h:45: warning: Function parameter or member 'con_id' not described in 'reset_control_lookup'
Fixes: 6691dffab0ab ("reset: add support for non-DT systems")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
move code to separate header-file to reuse definitions later
in poweroff-driver (drivers/power/reset/mt6323-poweroff.c)
Suggested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Josef Friedl <josef.friedl@speed.at>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
|
This patch is a stripped down version of the locking changes
necessary to support dynamic DMA-buf handling.
It adds a dynamic flag for both importers as well as exporters
so that drivers can choose if they want the reservation object
locked or unlocked during mapping of attachments.
For compatibility between drivers we cache the DMA-buf mapping
during attaching an importer as soon as exporter/importer
disagree on the dynamic handling.
Issues and solutions we considered:
- We can't change all existing drivers, and existing improters have
strong opinions about which locks they're holding while calling
dma_buf_attachment_map/unmap. Exporters also have strong opinions about
which locks they can acquire in their ->map/unmap callbacks, levaing no
room for change. The solution to avoid this was to move the
actual map/unmap out from this call, into the attach/detach callbacks,
and cache the mapping. This works because drivers don't call
attach/detach from deep within their code callchains (like deep in
memory management code called from cs/execbuf ioctl), but directly from
the fd2handle implementation.
- The caching has some troubles on some soc drivers, which set other modes
than DMA_BIDIRECTIONAL. We can't have 2 incompatible mappings, and we
can't re-create the mapping at _map time due to the above locking fun.
We very carefuly step around that by only caching at attach time if the
dynamic mode between importer/expoert mismatches.
- There's been quite some discussion on dma-buf mappings which need active
cache management, which would all break down when caching, plus we don't
have explicit flush operations on the attachment side. The solution to
this was to shrug and keep the current discrepancy between what the
dma-buf docs claim and what implementations do, with the hope that the
begin/end_cpu_access hooks are good enough and that all necessary
flushing to keep device mappings consistent will be done there.
v2: cleanup set_name merge, improve kerneldoc
v3: update commit message, kerneldoc and cleanup _debug_show()
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/336788/
|
|
The BCM54616S PHY cannot work properly in RGMII->1000Base-X mode, mainly
because genphy functions are designed for copper links, and 1000Base-X
(clause 37) auto negotiation needs to be handled differently.
This patch enables 1000Base-X support for BCM54616S by customizing 3
driver callbacks, and it's verified to be working on Facebook CMM BMC
platform (RGMII->1000Base-KX):
- probe: probe callback detects PHY's operation mode based on
INTERF_SEL[1:0] pins and 1000X/100FX selection bit in SerDES 100-FX
Control register.
- config_aneg: calls genphy_c37_config_aneg when the PHY is running in
1000Base-X mode; otherwise, genphy_config_aneg will be called.
- read_status: calls genphy_c37_read_status when the PHY is running in
1000Base-X mode; otherwise, genphy_read_status will be called.
Note: BCM54616S PHY can also be configured in RGMII->100Base-FX mode, and
100Base-FX support is not available as of now.
Signed-off-by: Tao Ren <taoren@fb.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds support for clause 37 1000Base-X auto-negotiation.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Tao Ren <taoren@fb.com>
Tested-by: René van Dorst <opensource@vdorst.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
UDP IPv6 packets auto flowlabels are using a 32bit secret
(static u32 hashrnd in net/core/flow_dissector.c) and
apply jhash() over fields known by the receivers.
Attackers can easily infer the 32bit secret and use this information
to identify a device and/or user, since this 32bit secret is only
set at boot time.
Really, using jhash() to generate cookies sent on the wire
is a serious security concern.
Trying to change the rol32(hash, 16) in ip6_make_flowlabel() would be
a dead end. Trying to periodically change the secret (like in sch_sfq.c)
could change paths taken in the network for long lived flows.
Let's switch to siphash, as we did in commit df453700e8d8
("inet: switch IP ID generator to siphash")
Using a cryptographically strong pseudo random function will solve this
privacy issue and more generally remove other weak points in the stack.
Packet schedulers using skb_get_hash_perturb() benefit from this change.
Fixes: b56774163f99 ("ipv6: Enable auto flow labels by default")
Fixes: 42240901f7c4 ("ipv6: Implement different admin modes for automatic flow labels")
Fixes: 67800f9b1f4e ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
Fixes: cb1ce2ef387b ("ipv6: Implement automatic flow label generation on transmit")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jonathan Berger <jonathann1@walla.com>
Reported-by: Amit Klein <aksecurity@gmail.com>
Reported-by: Benny Pinkas <benny@pinkas.net>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change documents the CS setup, host & inactive times. They were
omitted when the fields were added, and were caught by one of the build
bots.
Fixes: 25093bdeb6bc ("spi: implement SW control for CS times")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20191023070046.12478-1-alexandru.ardelean@analog.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Remove the nr_exclusive argument from __wake_up_sync_key() and derived
functions as everything seems to set it to 1. Note also that if it wasn't
set to 1, it would clear WF_SYNC anyway.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
There are two code locations that implement the SG_IO ioctl: the old
sg.c driver, and the generic scsi_ioctl helper that is in turn used by
multiple drivers.
To eradicate the old compat_ioctl conversion handler for the SG_IO
command, I implement a readable pair of put_sg_io_hdr() /get_sg_io_hdr()
helper functions that can be used for both compat and native mode,
and then I call this from both drivers.
For the iovec handling, there is already a compat_import_iovec() function
that can simply be called in place of import_iovec().
To avoid having to pass the compat/native state through multiple
indirections, I mark the SG_IO command itself as compatible in
fs/compat_ioctl.c and use in_compat_syscall() to figure out where
we are called from.
As a side-effect of this, the sg.c driver now also accepts the 32-bit
sg_io_hdr format in compat mode using the read/write interface, not
just ioctl. This should improve compatiblity with old 32-bit binaries,
but it would break if any application intentionally passes the 64-bit
data structure in compat mode here.
Steffen Maier helped debug an issue in an earlier version of this patch.
Cc: Steffen Maier <maier@linux.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
MTIOCPOS and MTIOCGET are incompatible between 32-bit and 64-bit user
space, and traditionally have been translated in fs/compat_ioctl.c.
To get rid of that translation handler, move a corresponding
implementation into each of the four drivers implementing those commands.
The interesting part of that is now in a new linux/mtio.h header that
wraps the existing uapi/linux/mtio.h header and provides an abstraction
to let drivers handle both cases easily. Using an in_compat_syscall()
check, the caller does not have to keep track of whether this was
called through .unlocked_ioctl() or .compat_ioctl().
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "Kai Mäkisara" <Kai.Makisara@kolumbus.fi>
Cc: linux-scsi@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
... and lose the ridiculous games with compat_alloc_user_space()
there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Many drivers have ioctl() handlers that are completely compatible between
32-bit and 64-bit architectures, except for the argument that is passed
down from user space and may have to be passed through compat_ptr()
in order to become a valid 64-bit pointer.
Using ".compat_ptr = compat_ptr_ioctl" in file operations should let
us simplify a lot of those drivers to avoid #ifdef checks, and convert
additional drivers that don't have proper compat handling yet.
On most architectures, the compat_ptr_ioctl() just passes all arguments
to the corresponding ->ioctl handler. The exception is arch/s390, where
compat_ptr() clears the top bit of a 32-bit pointer value, so user space
pointers to the second 2GB alias the first 2GB, as is the case for native
32-bit s390 user space.
The compat_ptr_ioctl() function must therefore be used only with
ioctl functions that either ignore the argument or pass a pointer to a
compatible data type.
If any ioctl command handled by fops->unlocked_ioctl passes a plain
integer instead of a pointer, or any of the passed data types is
incompatible between 32-bit and 64-bit architectures, a proper handler
is required instead of compat_ptr_ioctl.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
v3: add a better description
v2: use compat_ptr_ioctl instead of generic_compat_ioctl_ptrarg,
as suggested by Al Viro
|
|
Thierry needs fd70c7755bf0 ("drm/bridge: tc358767: fix max_tu_symbol
value") to be able to merge his dp_link patch series.
Some adjacent changes conflicts, plus some clashes in i915 due to
cherry-picking and git trying to be helpful and leaving both versions
in.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Tegra XUSB device control driver needs to control vbus override
during its operations, add API for the support.
Signed-off-by: Nagarjuna Kristam <nkristam@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
|
|
Move all SPI NOR controller driver specific ops in a dedicated
structure. 'struct spi_nor' becomes lighter.
Use size_t for lengths in 'int (*write_reg)()' and 'int (*read_reg)()'.
Rename wite/read_buf to buf, the name of the functions are
suggestive enough. Constify buf in int (*write_reg). Comply with these
changes in the SPI NOR controller drivers.
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
|
|
Now that SPI flash controllers without a software sequencer are
supported, it's trivial to add support for CNL and its PCI ID.
Values from https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf
Signed-off-by: Jethro Beekman <jethro@fortanix.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
|
|
The ionic driver started using dymamic_hex_dump(), but
that is not always defined:
drivers/net/ethernet/pensando/ionic/ionic_main.c:229:2: error: implicit declaration of function 'dynamic_hex_dump' [-Werror,-Wimplicit-function-declaration]
Add a dummy implementation to use when CONFIG_DYNAMIC_DEBUG
is disabled, printing nothing.
Fixes: 938962d55229 ("ionic: Add adminq action")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|
|
If something has the IPMI driver open, don't allow the device
module to be unloaded. Before it would unload and the user would
get errors on use.
This change is made on user request, and it makes it consistent
with the I2C driver, which has the same behavior.
It does change things a little bit with respect to kernel users.
If the ACPI or IPMI watchdog (or any other kernel user) has
created a user, then the device module cannot be unloaded. Before
it could be unloaded,
This does not affect hot-plug. If the device goes away (it's on
something removable that is removed or is hot-removed via sysfs)
then it still behaves as it did before.
Reported-by: tony camuso <tcamuso@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: tony camuso <tcamuso@redhat.com>
|
|
syzkaller managed to trigger the following crash:
[...]
BUG: unable to handle page fault for address: ffffc90001923030
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD aa551067 P4D aa551067 PUD aa552067 PMD a572b067 PTE 80000000a1173163
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 7982 Comm: syz-executor912 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:bpf_jit_binary_hdr include/linux/filter.h:787 [inline]
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:531 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:600 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find kernel/bpf/core.c:674 [inline]
RIP: 0010:is_bpf_text_address+0x184/0x3b0 kernel/bpf/core.c:709
[...]
Call Trace:
kernel_text_address kernel/extable.c:147 [inline]
__kernel_text_address+0x9a/0x110 kernel/extable.c:102
unwind_get_return_address+0x4c/0x90 arch/x86/kernel/unwind_frame.c:19
arch_stack_walk+0x98/0xe0 arch/x86/kernel/stacktrace.c:26
stack_trace_save+0xb6/0x150 kernel/stacktrace.c:123
save_stack mm/kasan/common.c:69 [inline]
set_track mm/kasan/common.c:77 [inline]
__kasan_kmalloc+0x11c/0x1b0 mm/kasan/common.c:510
kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:518
slab_post_alloc_hook mm/slab.h:584 [inline]
slab_alloc mm/slab.c:3319 [inline]
kmem_cache_alloc+0x1f5/0x2e0 mm/slab.c:3483
getname_flags+0xba/0x640 fs/namei.c:138
getname+0x19/0x20 fs/namei.c:209
do_sys_open+0x261/0x560 fs/open.c:1091
__do_sys_open fs/open.c:1115 [inline]
__se_sys_open fs/open.c:1110 [inline]
__x64_sys_open+0x87/0x90 fs/open.c:1110
do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
[...]
After further debugging it turns out that we walk kallsyms while in parallel
we tear down a BPF program which contains subprograms that have been JITed
though the program itself has not been fully exposed and is eventually bailing
out with error.
The bpf_prog_kallsyms_del_subprogs() in bpf_prog_load()'s error path removes
the symbols, however, bpf_prog_free() tears down the JIT memory too early via
scheduled work. Instead, it needs to properly respect RCU grace period as the
kallsyms walk for BPF is under RCU.
Fix it by refactoring __bpf_prog_put()'s tear down and reuse it in our error
path where we defer final destruction when we have subprogs in the program.
Fixes: 7d1982b4e335 ("bpf: fix panic in prog load calls cleanup")
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
Reported-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com
Link: https://lore.kernel.org/bpf/55f6367324c2d7e9583fa9ccf5385dcbba0d7a6e.1571752452.git.daniel@iogearbox.net
|
|
Add a new helper, kvm_put_kvm_no_destroy(), to handle putting a borrowed
reference[*] to the VM when installing a new file descriptor fails. KVM
expects the refcount to remain valid in this case, as the in-progress
ioctl() has an explicit reference to the VM. The primary motiviation
for the helper is to document that the 'kvm' pointer is still valid
after putting the borrowed reference, e.g. to document that doing
mutex(&kvm->lock) immediately after putting a ref to kvm isn't broken.
[*] When exposing a new object to userspace via a file descriptor, e.g.
a new vcpu, KVM grabs a reference to itself (the VM) prior to making
the object visible to userspace to avoid prematurely freeing the VM
in the scenario where userspace immediately closes file descriptor.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The kvm_vcpu variable, guest_xcr0_loaded, is a waste of an 'int'
and a conditional branch. VMX and SVM are the only users, and both
unconditionally pair kvm_load_guest_xcr0() with kvm_put_guest_xcr0()
making this check unnecessary. Without this variable, the predicates in
kvm_load_guest_xcr0 and kvm_put_guest_xcr0 should match.
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Change-Id: I7b1eb9b62969d7bbb2850f27e42f863421641b23
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On some SoCs the Adaptive Voltage Scaling (AVS) technique is
employed to optimize the operating voltage of a device. At a
given frequency, the hardware monitors dynamic factors and either
makes a suggestion for how much to adjust a voltage for the
current frequency, or it automatically adjusts the voltage
without software intervention. Add an API to the OPP library for
the former case, so that AVS type devices can update the voltages
for an OPP when the hardware determines the voltage should
change. The assumption is that drivers like CPUfreq or devfreq
will register for the OPP notifiers and adjust the voltage
according to suggestions that AVS makes.
This patch is derived from [1] submitted by Stephen.
[1] https://lore.kernel.org/patchwork/patch/599279/
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Roger Lu <roger.lu@mediatek.com>
[s.nawrocki@samsung.com: added handling of OPP min/max voltage]
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
Since this driver is enabled for COMPILE_TEST, it avoids build error
on x86 allmodconfig:
In file included from /build/drivers/phy/marvell/phy-mmp3-usb.c:12:
/build/include/linux/soc/mmp/cputype.h:5:10: fatal error: asm/cputype.h: No such file or directory
Link: https://lore.kernel.org/r/20191022015658.14624-1-olof@lixom.net
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lkundrak/linux-mmp into arm/drivers
ARM: Marvell MMP driver patches for v5.5
This tag includes the MMP3 USB2 PHY driver. The branch is based on
mmp-soc-for-v5.5-2 because the driver depends on changes in MMP SoC
support.
* tag 'mmp-drivers-for-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/lkundrak/linux-mmp:
MAINTAINERS: phy: add entry for USB PHY drivers on MMP SoCs
phy: Add USB2 PHY driver for Marvell MMP3 SoC
MAINTAINERS: mmp: add Git repository
ARM: mmp: remove MMP3 USB PHY registers from regs-usb.h
ARM: mmp: move cputype.h to include/linux/soc/
ARM: mmp: add SMP support
ARM: mmp: add support for MMP3 SoC
ARM: mmp: define MMP_CHIPID by the means of CIU_REG()
ARM: mmp: DT: convert timer driver to use TIMER_OF_DECLARE
ARM: mmp: map the PGU as well
ARM: mmp: don't select CACHE_TAUROS2 on all ARCH_MMP
ARM: l2c: add definition for FWA in PL310 aux register
Link: https://lore.kernel.org/r/7cee3ddbb553ba7fe6e1420e0dbc5adb4922b317.camel@v3.sk
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lkundrak/linux-mmp into arm/soc
ARM: Marvell MMP SoC patches for v5.5
This tag includes initial support for the Marvell MMP3 processor.
MMP3 is used in OLPC XO-4 laptops, Panasonic Toughpad FZ-A1 tablet
and Dell Wyse 3020/Tx0D thin clients.
* tag 'mmp-soc-for-v5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/lkundrak/linux-mmp:
MAINTAINERS: mmp: add Git repository
ARM: mmp: remove MMP3 USB PHY registers from regs-usb.h
ARM: mmp: move cputype.h to include/linux/soc/
ARM: mmp: add SMP support
ARM: mmp: add support for MMP3 SoC
ARM: mmp: define MMP_CHIPID by the means of CIU_REG()
ARM: mmp: DT: convert timer driver to use TIMER_OF_DECLARE
ARM: mmp: map the PGU as well
ARM: mmp: don't select CACHE_TAUROS2 on all ARCH_MMP
ARM: l2c: add definition for FWA in PL310 aux register
Link: https://lore.kernel.org/r/3a035bed90f9d8acc49b2d11d20089b546062aea.camel@v3.sk
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Now that ext4 and f2fs implement their own post-read workflow that
supports both fscrypt and fsverity, the fscrypt-only workflow based
around struct fscrypt_ctx is no longer used. So remove the unused code.
This is based on a patch from Chandan Rajendra's "Consolidate FS read
I/O callbacks code" patchset, but rebased onto the latest kernel, folded
__fscrypt_decrypt_bio() into fscrypt_decrypt_bio(), cleaned up
fscrypt_initialize(), and updated the commit message.
Originally-from: Chandan Rajendra <chandan@linux.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
Enable paravirtualization features when running under a hypervisor
supporting the PV_TIME_ST hypercall.
For each (v)CPU, we ask the hypervisor for the location of a shared
page which the hypervisor will use to report stolen time to us. We set
pv_time_ops to the stolen time function which simply reads the stolen
value from the shared page for a VCPU. We guarantee single-copy
atomicity using READ_ONCE which means we can also read the stolen
time for another VCPU than the currently running one while it is
potentially being updated by the hypervisor.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
SMCCC 1.1 calls may use either HVC or SMC depending on the PSCI
conduit. Rather than coding this in every call site, provide a macro
which uses the correct instruction. The macro also handles the case
where no conduit is configured/available returning a not supported error
in res, along with returning the conduit used for the call.
This allow us to remove some duplicated code and will be useful later
when adding paravirtualized time hypervisor calls.
Signed-off-by: Steven Price <steven.price@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Currently a kvm_device_ops structure cannot be const without triggering
compiler warnings. However the structure doesn't need to be written to
and, by marking it const, it can be read-only in memory. Add some more
const keywords to allow this.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Implement the service call for configuring a shared structure between a
VCPU and the hypervisor in which the hypervisor can write the time
stolen from the VCPU's execution time by other tasks on the host.
User space allocates memory which is placed at an IPA also chosen by user
space. The hypervisor then updates the shared structure using
kvm_put_guest() to ensure single copy atomicity of the 64-bit value
reporting the stolen time in nanoseconds.
Whenever stolen time is enabled by the guest, the stolen time counter is
reset.
The stolen time itself is retrieved from the sched_info structure
maintained by the Linux scheduler code. We enable SCHEDSTATS when
selecting KVM Kconfig to ensure this value is meaningful.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
kvm_put_guest() is analogous to put_user() - it writes a single value to
the guest physical address. The implementation is built upon put_user()
and so it has the same single copy atomic properties.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
This provides a mechanism for querying which paravirtualized time
features are available in this hypervisor.
Also add the header file which defines the ABI for the paravirtualized
time features we're about to add.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The srcmap is used to identify where the read is to be performed from.
It is passed to ->iomap_begin, which can fill it in if we need to read
data for partially written blocks from a different location than the
write target. The srcmap is only supported for buffered writes so far.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
[hch: merged two patches, removed the IOMAP_F_COW flag, use iomap as
srcmap if not set, adjust length down to srcmap end as well]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
|
|
Instead of keeping a separate unnamed state for uninitialized iomaps,
renumber IOMAP_HOLE to zero so that an uninitialized iomap is treated
as a hole.
Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
xfs_file_dirty is used to unshare reflink blocks. Rename the function
to xfs_file_unshare to better document that purpose, and skip iomaps
that are not shared and don't need zeroing. This will allow to simplify
the caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
The documentation for IOMAP_F_* is a bit disorganized, and doesn't
mention the fact that most flags are set by the file system and consumed
by the iomap core, while IOMAP_F_SIZE_CHANGED is set by the core and
consumed by the file system.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Now that all the writepage code is in the iomap code there is no
need to keep this structure public.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Take the xfs writeback code and move it to fs/iomap. A new structure
with three methods is added as the abstraction from the generic writeback
code to the file system. These methods are used to map blocks, submit an
ioend, and cancel a page that encountered an error before it was added to
an ioend.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[darrick: rename ->submit_ioend to ->prepare_ioend to clarify what it
does]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
|
Remove usage of the ternary operator to assign values for register
fields. Instead, parameterize the register and field offset macros
and pass the index to them.
This removes clutter and improves readability.
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
Commit 71523d1812ac (pwm: Ensure pwm_apply_state() doesn't modify the
state argument) updated the kernel-doc for pwm_apply_state(), but not
for the ->apply callback in the pwm_ops struct.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
|
|
The inline prefix was missing in the dummy function pci_pr3_present()
definition. Fix it.
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 52525b7a3cf8 ("PCI: Add a helper to check Power Resource Requirements _PR3 existence")
Link: https://lore.kernel.org/r/201910212111.qHm6OcWx%lkp@intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Bit-spinlocks are problematic on PREEMPT_RT if functions which might sleep
on RT, e.g. spin_lock(), alloc/free(), are invoked inside the lock held
region because bit spinlocks disable preemption even on RT.
A first attempt was to replace state lock with a spinlock placed in struct
buffer_head and make the locking conditional on PREEMPT_RT and
DEBUG_BIT_SPINLOCKS.
Jan pointed out that there is a 4 byte hole in struct journal_head where a
regular spinlock fits in and he would not object to convert the state lock
to a spinlock unconditionally.
Aside of solving the RT problem, this also gains lockdep coverage for the
journal head state lock (bit-spinlocks are not covered by lockdep as it's
hard to fit a lockdep map into a single bit).
The trivial change would have been to convert the jbd_*lock_bh_state()
inlines, but that comes with the downside that these functions take a
buffer head pointer which needs to be converted to a journal head pointer
which adds another level of indirection.
As almost all functions which use this lock have a journal head pointer
readily available, it makes more sense to remove the lock helper inlines
and write out spin_*lock() at all call sites.
Fixup all locking comments as well.
Suggested-by: Jan Kara <jack@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jan Kara <jack@suse.com>
Cc: linux-ext4@vger.kernel.org
Link: https://lore.kernel.org/r/20190809124233.13277-7-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
__jbd2_journal_unfile_buffer() and __jbd2_journal_refile_buffer() drop
transaction's jh reference when they remove jh from a transaction. This
will be however inconvenient once we move state lock into journal_head
itself as we still need to unlock it and we'd need to grab jh reference
just for that. Move dropping of jh reference out of these functions into
the few callers.
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20190809124233.13277-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|