|
When using TIPC_SOCK_RECVQ_DEPTH with getsockopt(), it returns the
number of buffers in the receive socket buffer, which is not very
helpful for user space applications.
This commit introduces the new option TIPC_SOCK_RECVQ_USED, which
returns the number of bytes currently allocated in the receive socket
buffer. This helps user space applications dimension their buffer usage
and avoid receive buffer overload.
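As a rough illustration, a user space caller could query the new option like
this (a minimal sketch; it assumes SOL_TIPC and TIPC_SOCK_RECVQ_USED are
provided by the installed kernel headers, and that socket 'sk' already exists):
  #include <stdio.h>
  #include <sys/socket.h>
  #include <linux/tipc.h>

  /* Sketch: report how many bytes are currently allocated in the
   * receive socket buffer of TIPC socket 'sk'. */
  static int print_recvq_used(int sk)
  {
          int used = 0;
          socklen_t len = sizeof(used);

          if (getsockopt(sk, SOL_TIPC, TIPC_SOCK_RECVQ_USED, &used, &len) < 0) {
                  perror("getsockopt(TIPC_SOCK_RECVQ_USED)");
                  return -1;
          }
          printf("receive buffer bytes in use: %d\n", used);
          return 0;
  }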
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some clk providers are simple DT nodes that only have a 'clocks'
property without having an associated 'clock-names' property. In these
cases, we want to let these clk providers point to their parent clks
without having to dereference the 'clocks' property at probe time to
figure out the parent's globally unique clk name. Let's add an 'index'
property to the parent_data structure so that clk providers can indicate
that their parent is at a particular index in the 'clocks' DT property.
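For illustration, a provider might then describe its parent as entry 0 of its
'clocks' property roughly like this (a sketch only; the clk name "my_gate" and
the use of clk_gate_ops are hypothetical):
  #include <linux/clk-provider.h>

  /* Sketch: the parent is whatever DT points to at index 0 of 'clocks'. */
  static const struct clk_parent_data gate_parent_data[] = {
          { .index = 0 },
  };

  static const struct clk_init_data gate_init = {
          .name = "my_gate",              /* hypothetical clk name */
          .ops = &clk_gate_ops,
          .parent_data = gate_parent_data,
          .num_parents = ARRAY_SIZE(gate_parent_data),
  };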
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Jeffrey Hugo <jhugo@codeaurora.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
The common clk framework is lacking in ability to describe the clk
topology without specifying strings for every possible parent-child
link. There are a few drawbacks to the current approach:
1) String comparisons are used for everything, including describing
topologies that are 'local' to a single clock controller.
2) clk providers (e.g. i2c clk drivers) need to create globally unique
clk names to avoid collisions in the clk namespace, leading to awkward
name generation code in various clk drivers.
3) DT bindings may not fully describe the clk topology and linkages
between clk controllers because drivers can easily rely on globally unique
strings to describe connections between clks.
This leads to confusing DT bindings, complicated clk name generation
code, and inefficient string comparisons during clk registration just so
that the clk framework can detect the topology of the clk tree.
Furthermore, some drivers call clk_get() and then __clk_get_name() to
extract the globally unique clk name just so they can specify the parent
of the clk they're registering. We have of_clk_parent_fill() but that
mostly only works for single clks registered from a DT node, which isn't
the norm. Let's simplify this all by introducing two new ways of
specifying clk parents.
The first method is an array of pointers to clk_hw structures
corresponding to the parents at that index. This works for clks that are
registered when we have access to all the clk_hw pointers for the
parents.
The second method is a mix of clk_hw pointers and strings of local and
global parent clk names. If the .fw_name member of the map is set, we'll
look for that clk by performing a DT based lookup on the device the clk
is registered with, using the .fw_name specified in the map. If that fails,
we'll fall back to the .name member and perform a global clk name lookup
like we've always done before.
Using either one of these new methods is entirely optional. Existing
drivers will continue to work, and they can migrate to this new approach
as they see fit. Eventually, we'll want to get rid of the 'parent_names'
array in struct clk_init_data and use one of these new methods instead.
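A rough sketch of the second method described above (all names here are
hypothetical; the ordering of .hw, .fw_name and .name reflects the lookup
order described in this message):
  /* Sketch: two possible parents, one known by clk_hw pointer, the other
   * found via a DT 'clock-names' lookup with a global-name fallback. */
  static const struct clk_parent_data mux_parents[] = {
          { .hw = &local_pll_hw },                  /* clk_hw pointer is at hand */
          { .fw_name = "xo", .name = "xo_board" },  /* DT lookup, else global name */
  };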
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Jeffrey Hugo <jhugo@codeaurora.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Rob Herring <robh@kernel.org>
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
In some circumstances drivers register clks early and don't have access
to a struct device because the device model isn't initialized yet. Add
an API to let drivers register clks associated with a struct device_node
so that these drivers can participate in getting parent clks through DT.
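A minimal sketch of the new call, assuming an early provider that only has a
device_node 'np' and an already filled-in clk_hw 'my_clk_hw':
  /* Sketch: register against a device_node instead of a struct device so
   * DT-based parent lookups still work for early-registered clks. */
  ret = of_clk_hw_register(np, &my_clk_hw);
  if (ret)
          pr_err("failed to register clk: %d\n", ret);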
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Jeffrey Hugo <jhugo@codeaurora.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Rob Herring <robh@kernel.org>
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
We'd like to chain this in places where the 'dev' argument might be
NULL. Let this function take a NULL 'dev' so this can work.
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Michael Turquette <mturquette@baylibre.com>
Cc: Jeffrey Hugo <jhugo@codeaurora.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rob Herring <robh@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
The 'timeval' and 'timespec' data structures used for socket timestamps
are going to be redefined in user space based on 64-bit time_t in future
versions of the C library to deal with the y2038 overflow problem,
which breaks the ABI definition.
Unlike many modern ioctl commands, SIOCGSTAMP and SIOCGSTAMPNS do not
use the _IOR() macro to encode the size of the transferred data, so it
remains ambiguous whether the application uses the old or new layout.
The best workaround I could find is rather ugly: we redefine the command
code based on the size of the respective data structure with a ternary
operator. This lets it get evaluated as late as possible, hopefully after
that structure is visible to the caller. We cannot use an #ifdef here,
because linux/sockios.h might have been included before any libc header
that could determine the size of time_t.
The ioctl implementation now interprets the new command codes as always
referring to the 64-bit structure on all architectures, while the old
architecture specific command code still refers to the old architecture
specific layout. The new command number is only used when they are
actually different.
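The shape of the workaround is roughly the following (a hedged sketch, not the
literal header contents; the _OLD/_NEW names and the sizeof() test are taken
from the description above):
  /* Sketch: pick the command code only at the point of use, so the test on
   * sizeof(struct timeval) happens after libc has defined the structure. */
  #define SIOCGSTAMP    ((sizeof(struct timeval) == 8) ? \
                         SIOCGSTAMP_OLD : SIOCGSTAMP_NEW)
  #define SIOCGSTAMPNS  ((sizeof(struct timespec) == 8) ? \
                         SIOCGSTAMPNS_OLD : SIOCGSTAMPNS_NEW)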
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
socket protocol handlers, and all of those end up calling the same
sock_get_timestamp()/sock_get_timestampns() helper functions, which
results in a lot of duplicate code.
With the introduction of 64-bit time_t on 32-bit architectures, this
gets worse, as we then need four different ioctl commands in each
socket protocol implementation.
To simplify that, let's add a new .gettstamp() operation in
struct proto_ops, and move the ioctl implementation into the common
sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
through.
We can reuse the sock_get_timestamp() implementation, but generalize
it so it can deal with both native and compat mode, as well as
timeval and timespec structures.
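For reference, the new operation could look roughly like this in struct
proto_ops (a hedged sketch; the exact prototype is an assumption based on the
description above):
  /* Sketch of the new proto_ops member: copy the socket's last timestamp
   * to user space, in either timeval or timespec form, native or compat. */
  int  (*gettstamp)(struct socket *sock, void __user *userstamp,
                    bool timeval, bool time32);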
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the case of vlan filtering on bridges, the bridge may also have the
corresponding vlan devices as upper devices. Currently the link state
of vlan devices is transferred from the lower device. So this is up if
the bridge is in admin up state and there is at least one bridge port
that is up, regardless of the vlan that the port is a member of.
The link state of the vlan device may need to track only the state of
the subset of ports that are also members of the corresponding vlan,
rather than that of all ports.
Add a flag to specify a vlan bridge binding mode, by which the link
state is no longer automatically transferred from the lower device,
but is instead determined by the bridge ports that are members of the
vlan.
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The syzkaller fuzzer reported a bug in the USB hub driver which turned
out to be caused by a negative runtime-PM usage counter. This allowed
a hub to be runtime suspended at a time when the driver did not expect
it. The symptom is a WARNING issued because the hub's status URB is
submitted while it is already active:
URB 0000000031fb463e submitted while active
WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363
The negative runtime-PM usage count was caused by an unfortunate
design decision made when runtime PM was first implemented for USB.
At that time, USB class drivers were allowed to unbind from their
interfaces without balancing the usage counter (i.e., leaving it with
a positive count). The core code would take care of setting the
counter back to 0 before allowing another driver to bind to the
interface.
Later on when runtime PM was implemented for the entire kernel, the
opposite decision was made: Drivers were required to balance their
runtime-PM get and put calls. In order to maintain backward
compatibility, however, the USB subsystem adapted to the new
implementation by keeping an independent usage counter for each
interface and using it to automatically adjust the normal usage
counter back to 0 whenever a driver was unbound.
This approach involves duplicating information, but what is worse, it
doesn't work properly in cases where a USB class driver delays
decrementing the usage counter until after the driver's disconnect()
routine has returned and the counter has been adjusted back to 0.
Doing so would cause the usage counter to become negative. There's
even a warning about this in the USB power management documentation!
As it happens, this is exactly what the hub driver does. The
kick_hub_wq() routine increments the runtime-PM usage counter, and the
corresponding decrement is carried out by hub_event() in the context
of the hub_wq work-queue thread. This work routine may sometimes run
after the driver has been unbound from its interface, and when it does
it causes the usage counter to go negative.
It is not possible for hub_disconnect() to wait for a pending
hub_event() call to finish, because hub_disconnect() is called with
the device lock held and hub_event() acquires that lock. The only
feasible fix is to reverse the original design decision: remove the
duplicate interface-specific usage counter and require USB drivers to
balance their runtime PM gets and puts. As far as I know, all
existing drivers currently do this.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+7634edaea4d0b341c625@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add the capability to query information from a submit queue.
The first available parameter is for querying the number of GPU faults
(hangs) that can be attributed to the queue.
This is useful for implementing context robustness. A user context can
regularly query the number of faults to see if it is responsible for any
and if so it can invalidate itself.
This is also helpful for testing by confirming to the user driver if a
particular command stream caused a fault (or not as the case may be).
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
|
For KHR_robustness, userspace wants to know two things, the count of GPU
faults globally, and the count of faults attributed to a given context.
This patch provides the former, and the next patch provides the latter.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
|
|
For now it always returns '0' (false), but once the iommu work is in
place to enable per-process pagetables we can update the value returned.
Userspace needs to know this to make an informed decision about exposing
KHR_robustness.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
|
|
Merge misc fixes from Andrew Morton:
"16 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
mm/kmemleak.c: fix unused-function warning
init: initialize jump labels before command line option parsing
kernel/watchdog_hld.c: hard lockup message should end with a newline
kcov: improve CONFIG_ARCH_HAS_KCOV help text
mm: fix inactive list balancing between NUMA nodes and cgroups
mm/hotplug: treat CMA pages as unmovable
proc: fixup proc-pid-vm test
proc: fix map_files test on F29
mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock
mm: swapoff: shmem_unuse() stop eviction without igrab()
mm: swapoff: take notice of completion sooner
mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES
mm: swapoff: shmem_find_swap_entries() filter out other types
slab: store tagged freelist for off-slab slabmgmt
|
|
The IXP4xx (arch/arm/mach-ixp4xx) is an old Intel XScale
platform that has very wide deployment and use.
As part of modernizing the platform, we need to implement a
proper irqchip in the irqchip subsystem.
The IXP4xx irqchip is tightly coupled with the GPIO
controller, and whereas in the past we would deal with this
complex logic by adding necessarily different code, we can
nowadays modernize it using a hierarchical irqchip.
The actual IXP4xx irqchip is a simple active-low level IRQ
controller, whereas the GPIO functionality resides in a
different memory area and adds edge trigger support for
the interrupts.
The interrupts from GPIO lines 0..12 are 1:1 mapped to
a fixed set of hardware IRQs on this IRQchip, so we
expect the child GPIO interrupt controller to go in and
allocate descriptors for these interrupts.
For the other interrupts, as we do not yet have DT
support for this platform, we create a linear irqdomain
and then go in and allocate the IRQs that the legacy
boards use. This code will be removed on the DT probe
path when we add DT support to the platform.
We add some translation code for supporting DT
translations for the fwnodes, but we leave most of that
for later.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Add cgroup:cgroup_freeze and cgroup:cgroup_unfreeze events,
which are using the existing cgroup tracing infrastructure.
Add the cgroup_event event class, which is similar to the cgroup
class, but contains an additional integer field to store a new
value (the level field is dropped).
Also add two tracing events: cgroup_notify_populated and
cgroup_notify_frozen, which are raised in a generic way using
the TRACE_CGROUP_PATH() macro.
This makes it possible to trace cgroup state transitions and is generally
helpful for debugging the cgroup freezer code.
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Cgroup v1 implements the freezer controller, which provides an ability
to stop the workload in a cgroup and temporarily free up some
resources (cpu, io, network bandwidth and, potentially, memory)
for some other tasks. Cgroup v2 lacks this functionality.
This patch implements freezer for cgroup v2.
Cgroup v2 freezer tries to put tasks into a state similar to jobctl
stop. This means that tasks can be killed, ptraced (using
PTRACE_SEIZE*), and interrupted. It is possible to attach to
a frozen task, get some information (e.g. read registers) and detach.
It's also possible to migrate a frozen task to another cgroup.
This distinguishes the cgroup v2 freezer from the cgroup v1 freezer, which
mostly tried to imitate the system-wide freezer. While uninterruptible
sleep is fine when all tasks are going to be frozen (the hibernation case),
it's not an acceptable state when only a subset of the system is frozen.
The cgroup v2 freezer does not support freezing kthreads.
If a non-root cgroup contains a kthread, the cgroup can still be frozen,
but the kthread will remain running, the cgroup will be shown
as non-frozen, and the notification will not be delivered.
* PTRACE_ATTACH does not work because non-fatal signal delivery
is blocked in the frozen state.
There are also some interface differences between the cgroup v1 and cgroup v2
freezers, which are required to conform to the cgroup v2 interface
design principles:
1) There is no separate controller, which has to be turned on:
the functionality is always available and is represented by
cgroup.freeze and cgroup.events cgroup control files.
2) The desired state is defined by the cgroup.freeze control file.
Any hierarchical configuration is allowed.
3) The interface is asynchronous. The actual state is available
using cgroup.events control file ("frozen" field). There are no
dedicated transitional states.
4) It's allowed to make any changes with the cgroup hierarchy
(create new cgroups, remove old cgroups, move tasks between cgroups)
no matter if some cgroups are frozen.
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
No-objection-from-me-by: Oleg Nesterov <oleg@redhat.com>
Cc: kernel-team@fb.com
|
|
The number of descendant cgroups and the number of dying
descendant cgroups are currently synchronized using the cgroup_mutex.
The number of descendant cgroups will be required by the cgroup v2
freezer, which will use it to determine if a cgroup is frozen
(depending on total number of descendants and number of frozen
descendants). It's not always acceptable to grab the cgroup_mutex,
especially from quite hot paths (e.g. exit()).
To avoid this, let's additionally synchronize these counters using
the css_set_lock.
So, it's safe to read these counters with either cgroup_mutex or
css_set_lock held, and both locks must be held to change them.
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: kernel-team@fb.com
|
|
bvec->bv_offset may sometimes be bigger than PAGE_SIZE, for example
when one bio is split in the middle of one bvec via bio_split()
and bi_iter.bi_bvec_done is used to build the offset of the 1st bvec of
the remaining bio. The remaining bio's bvec may then be re-submitted to
the fs layer via ITER_BVEC, as is done by loop and nvme-loop.
So we have to make sure that every bvec's offset seen from
bio_for_each_segment_all() is less than PAGE_SIZE, because some drivers
(loop, nvme-loop) pass the split bvec to the fs layer via ITER_BVEC.
This fixes an issue reported by Yi Zhang when running nvme/011.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Yi Zhang <yi.zhang@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Fixes: 6dc4f100c175 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
all_q_node has not been used since commit 4b855ad37194 ("blk-mq: Create
hctx for each present CPU"), so remove it.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- several new key mappings for HID
- a host of new ACPI IDs used to identify Elan touchpads in Lenovo
laptops
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ
HID: input: add mapping for "Toggle Display" key
HID: input: add mapping for "Full Screen" key
HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys
HID: input: add mapping for Expose/Overview key
HID: input: fix mapping of aspect ratio key
[media] doc-rst: switch to new names for Full Screen/Aspect keys
Input: document meanings of KEY_SCREEN and KEY_ZOOM
Input: elan_i2c - add hardware ID for multiple Lenovo laptops
|
|
After patch "drm: Use the same mmap-range offset and size for GEM and
TTM", applications failed to create BOs in system memory because the drm
mmap_range size decreased to 64GB from the original 1TB. This is not big
enough for applications. Increase the drm mmap_range size back to 1TB.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190417221507.933-1-Philip.Yang@amd.com
|
|
The core dumping code has always run without holding the mmap_sem for
writing, even though that is the only way to ensure that the entire vma
layout will not change from under it. Only using some signal
serialization on the processes belonging to the mm is not nearly enough.
This was pointed out earlier. For example in Hugh's post from Jul 2017:
https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils
"Not strictly relevant here, but a related note: I was very surprised
to discover, only quite recently, how handle_mm_fault() may be called
without down_read(mmap_sem) - when core dumping. That seems a
misguided optimization to me, which would also be nice to correct"
In particular, because growsdown and growsup vmas can move
vm_start/vm_end, the various loops the core dump does over the vmas will
not be consistent if page faults can happen concurrently.
Pretty much all users calling mmget_not_zero()/get_task_mm() and then
taking the mmap_sem had the potential to introduce unexpected side
effects in the core dumping code.
Adding mmap_sem for writing around the ->core_dump invocation is a
viable long term fix, but it requires removing all copy-user and page
faults and replacing them with get_dump_page() for all binary formats,
which is not suitable as a short term fix.
For the time being this solution manually covers the places that can
confuse the core dump either by altering the vma layout or the vma flags
while it runs. Once ->core_dump runs under mmap_sem for writing the
function mmget_still_valid() can be dropped.
Allowing mmap_sem protected sections to run in parallel with the
coredump provides some minor parallelism advantage to the swapoff code
(which seems to be safe enough by never mangling any vma field and can
keep doing swapins in parallel to the core dumping) and to some other
corner case.
In order to facilitate backporting I added "Fixes: 86039bd3b4e6";
however, the side effect of this same race condition in /proc/pid/mem
should be reproducible since before 2.6.12-rc2, so I couldn't add any
other "Fixes:" tag because there's no hash beyond the git genesis commit.
Because find_extend_vma() is the only location outside of the process
context that could modify the "mm" structures under mmap_sem for
reading, by adding the mmget_still_valid() check to it, all other cases
that take the mmap_sem for reading don't need the new check after
mmget_not_zero()/get_task_mm(). The expand_stack() in page fault
context also doesn't need the new check, because all tasks under core
dumping are frozen.
Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The igrab() in shmem_unuse() looks good, but we forgot that it gives no
protection against concurrent unmounting: a point made by Konstantin
Khlebnikov eight years ago, and then fixed in 2.6.39 by 778dd893ae78
("tmpfs: fix race between umount and swapoff"). The current 5.1-rc
swapoff is liable to hit "VFS: Busy inodes after unmount of tmpfs.
Self-destruct in 5 seconds. Have a nice day..." followed by GPF.
Once again, give up on using igrab(); but don't go back to making such
heavy-handed use of shmem_swaplist_mutex as last time: that would spoil
the new design, and I expect could deadlock inside shmem_swapin_page().
Instead, shmem_unuse() now just raises a "stop_eviction" count in the
shmem-specific inode, and shmem_evict_inode() waits for that to go down to 0.
Call it "stop_eviction" rather than "swapoff_busy" because it can be put
to use for others later (huge tmpfs patches expect to use it).
That simplifies shmem_unuse(), protecting it from both unlink and
unmount; and in practice lets it locate all the swap in its first try.
But do not rely on that: there's still a theoretical case, when
shmem_writepage() might have been preempted after its get_swap_page(),
before making the swap entry visible to swapoff.
[hughd@google.com: remove incorrect list_del()]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904091133570.1898@eggly.anvils
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081259400.1523@eggly.anvils
Fixes: b56a2d8af914 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vineeth Pillai <vpillai@digitalocean.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When a driver unloads without unloading TTM, we don't correctly
clear the global structures, leading to errors on re-init.
Next step should probably be to remove the global structures and
kobjs all together, but this is tricky since we need to maintain
backward compatibility.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
CC: stable@vger.kernel.org # 5.0.x
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch breaks set_selection() into two functions so that when
called from kernel, copy_from_user() can be avoided. The two functions
are called set_selection_user() and set_selection_kernel() in order to
be explicit about their purposes. This also means updating any
references to set_selection() and fixing them up for the name change. It also
exports set_selection_kernel() and paste_selection().
These changes are used by the following patch, where speakup's selection
functionality calls into the above functions, thereby doing away with its
parallel implementation.
Signed-off-by: Okash Khawaja <okash.khawaja@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Gregory Nowak <greg@gregn.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Verify the stack frame pointer in the kretprobe trampoline handler.
If the stack frame pointer does not match, skip the wrong
entry and try to find the correct one.
This can happen if a user puts a kretprobe on a function
which can be used in the path of an ftrace user-function call.
Such functions should not be probed, so this adds a warning
message that reports which function should be blacklisted.
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/155094059185.6137.15527904013362842072.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Some tcpc device-drivers need to explicitly be told to watch for connection
events, otherwise the tcpc will not generate any TCPM_CC_EVENTs and devices
being plugged into the Type-C port will not be noticed.
For dual-role ports tcpm_start_drp_toggling() is used to tell the tcpc to
watch for connection events. So far we lack a similar callback into the tcpc
for single-role ports. With some tcpcs, such as the fusb302, this means
no TCPM_CC_EVENTs will be generated when the port is configured as a
single-role port.
This commit renames start_drp_toggling to start_toggling and since the
device-properties are parsed by the tcpm-core, adds a port_type parameter
to the start_toggling callback so that the tcpc_dev driver knows the
port-type and can act accordingly when it starts toggling.
The new start_toggling callback now always gets called if defined, instead
of only being called for DRP ports.
To avoid this causing undesirable functional changes all existing
start_drp_toggling implementations are not only renamed to start_toggling,
but also get a port_type check added and return -EOPNOTSUPP when port_type
is not DRP.
Fixes: ea3b4d5523bc ("usb: typec: fusb302: Resolve fixed power role ...")
Cc: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The rseq system call, when invoked with flags of "0" or
"RSEQ_FLAG_UNREGISTER" values, expects the rseq_len parameter to
be equal to sizeof(struct rseq), which is fixed-size and fixed-layout,
specified in uapi linux/rseq.h.
Expecting a fixed size for rseq_len is a design choice that ensures
multiple libraries and application defining __rseq_abi in the same
process agree on its exact size.
Considering that this size is and will always be the same value, there
is no point in saving this value within task_struct rseq_len. Remove
this field from task_struct.
No change in functionality intended.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ben Maurer <bmaurer@fb.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Lameter <cl@linux.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/20190305194755.2602-3-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This renames the GPIO reset of mdio devices from 'reset' to
'reset_gpio' to better differentiate between GPIO-driven and
reset-controller-driven reset lines.
Signed-off-by: David Bauer <mail@david-bauer.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit adds support for PHY reset pins handled by a reset controller.
Signed-off-by: David Bauer <mail@david-bauer.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We are discouraging the use of BUG() these days, remove the
unused ASSERT macros from skbuff.h.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To make ICMPv6 closer to ICMPv4, add a ratemask parameter. Since the ICMPv6
message types use larger numeric values, a simple bitmask doesn't fit,
so I use a large bitmap. The input and output are in the form of a list of
ranges. Set the default to rate limit all error messages but Packet Too
Big. For Packet Too Big, use the ratemask instead of the hard-coded behavior.
There are functions where icmpv6_xrlim_allow() and icmpv6_global_allow()
aren't called. This patch only adds them to icmpv6_echo_reply().
Rate limiting error messages is mandated by RFC 4443 but RFC 4890 says
that it is also acceptable to rate limit informational messages. Thus,
I removed the current hard-coded behavior of icmpv6_mask_allow() that
doesn't rate limit informational messages.
v2: Add dummy function proc_do_large_bitmap() if CONFIG_PROC_SYSCTL
isn't defined, expand the description in ip-sysctl.txt and remove
unnecessary conditional before kfree().
v3: Inline the bitmap instead of allocating it dynamically. A pointer to
it is still needed because of the way proc_do_large_bitmap() works.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add hlist_bl_add_before/behind helpers to add an element before/after an
existing element in a bl_list.
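A small usage sketch, assuming the same argument order as the existing
non-bl hlist helpers (new node first, reference node second); the element
types and lock handling here are hypothetical:
  /* Sketch: insert 'new' before and 'other' after an existing node 'cur'
   * in an hlist_bl list protected by its bit lock. */
  hlist_bl_lock(head);
  hlist_bl_add_before(&new->node, &cur->node);    /* new now precedes cur */
  hlist_bl_add_behind(&other->node, &cur->node);  /* other now follows cur */
  hlist_bl_unlock(head);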
Co-developed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Commit 1c97be677f72b3 ("list: Use WRITE_ONCE() when adding to lists and
hlists") introduced the use of WRITE_ONCE() to atomically write the list
head's ->next pointer.
hlist_add_behind() doesn't touch the hlist head's ->first pointer so
there is no reason to use WRITE_ONCE() in this case.
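For reference, the helper only rewires prev->next and the new node, never the
hlist head's ->first pointer, roughly like this (a sketch; the upstream body
may differ in minor details):
  static inline void hlist_add_behind(struct hlist_node *n,
                                      struct hlist_node *prev)
  {
          n->next = prev->next;
          prev->next = n;                 /* head->first is never touched */
          n->pprev = &prev->next;
          if (n->next)
                  n->next->pprev = &n->next;
  }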
Co-developed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Merge immutable branch containing the IIO changes required
for the new Ingenic JZ47xx battery fuel gauge driver.
Signed-off-by: Sebastian Reichel <sre@kernel.org>
|
|
The compute shader dispatch interface is pretty simple -- just pass in
the regs that userspace has passed us, with no CLs to run. However,
with no CL to run it means that we need to do manual cache flushing of
the L2 after the HW execution completes (for SSBO, atomic, and
image_load_store writes that are the output of compute shaders).
This doesn't yet expose the L2 cache's ability to have a region of the
address space not write back to memory (which could be used for
shared_var storage).
So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing
the ES31 tests), and on the kernel side on 7278 (failing atomic
compswap tests in a way that doesn't reproduce on simpenrose).
v2: Fix excessive allocation for the clean_job (reported by Dan
Carpenter). Keep refs on jobs until clean_job is finished, to
avoid spurious MMU errors if the output BOs are freed by userspace
before L2 cleaning is finished.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
|
|
fwnode_graph_get_endpoint_by_id() is intended for obtaining local
endpoints by a given local port.
fwnode_graph_get_endpoint_by_id() is slightly different from its OF
counterpart, of_graph_get_endpoint_by_regs(): instead of using -1 as
a value to indicate that a port or an endpoint number does not matter,
it uses flags to look for an equal or greater endpoint. The port number
is always fixed. It also returns only remote endpoints that belong
to an available device, a behaviour that can be turned off with a flag.
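A hedged usage sketch (the flag names follow the behaviour described above and
are assumptions about the final API; the port/endpoint numbers are arbitrary):
  /* Sketch: get endpoint 0, or the next higher one, on port 1 of 'fwnode',
   * also accepting remote endpoints whose device is not available. */
  struct fwnode_handle *ep;

  ep = fwnode_graph_get_endpoint_by_id(fwnode, 1, 0,
                                       FWNODE_GRAPH_ENDPOINT_NEXT |
                                       FWNODE_GRAPH_DEVICE_DISABLED);
  if (!ep)
          return -ENODEV;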
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Remove cryptd_alloc_ablkcipher() and the ability of cryptd to create
algorithms with the deprecated "ablkcipher" type.
This has been unused since commit 0e145b477dea ("crypto: ablk_helper -
remove ablk_helper"). Instead, cryptd_alloc_skcipher() is used.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add the Elliptic Curve Russian Digital Signature Algorithm (GOST R
34.10-2012, RFC 7091, ISO/IEC 14888-3), one of the Russian (and, since
2018, the CIS countries') cryptographic standard algorithms (called GOST
algorithms). Only signature verification is supported, with the intent
that it be used in IMA.
Summary of the changes:
* crypto/Kconfig:
- EC-RDSA is added into Public-key cryptography section.
* crypto/Makefile:
- ecrdsa objects are added.
* crypto/asymmetric_keys/x509_cert_parser.c:
- Recognize EC-RDSA and Streebog OIDs.
* include/linux/oid_registry.h:
- EC-RDSA OIDs are added to the enum. Also, two currently
unimplemented curve OIDs are added for possible extension later (so as
not to change the numbering and grouping).
* crypto/ecc.c:
- Kenneth MacKay copyright date is updated to 2014, because
vli_mmod_slow, ecc_point_add, ecc_point_mult_shamir are based on his
code from micro-ecc.
- Functions needed for ecrdsa are EXPORT_SYMBOL'ed.
- New functions:
vli_is_negative - helper to determine sign of vli;
vli_from_be64 - unpack big-endian array into vli (used for
a signature);
vli_from_le64 - unpack little-endian array into vli (used for
a public key);
vli_uadd, vli_usub - add/sub u64 value to/from vli (used for
increment/decrement);
mul_64_64 - optimized to use __int128 where appropriate, this speeds
up point multiplication (and as a consequence signature
verification) by the factor of 1.5-2;
vli_umult - multiply vli by a small value (speeds up point
multiplication by another factor of 1.5-2, depending on vli sizes);
vli_mmod_special - module reduction for some form of Pseudo-Mersenne
primes (used for the curves A);
vli_mmod_special2 - module reduction for another form of
Pseudo-Mersenne primes (used for the curves B);
vli_mmod_barrett - module reduction using pre-computed value (used
for the curve C);
vli_mmod_slow - more general module reduction which is much slower
(used when the modulus is subgroup order);
vli_mod_mult_slow - modular multiplication;
ecc_point_add - add two points;
ecc_point_mult_shamir - add two points multiplied by scalars in one
combined multiplication (this gives speed up by another factor 2 in
compare to two separate multiplications).
ecc_is_pubkey_valid_partial - an additional sanity check is added.
- Updated vli_mmod_fast with non-strict heuristic to call optimal
module reduction function depending on the prime value;
- All computations for the previously defined (two NIST) curves should
be unaffected.
* crypto/ecc.h:
- Newly exported functions are documented.
* crypto/ecrdsa_defs.h
- Five curves are defined.
* crypto/ecrdsa.c:
- Signature verification is implemented.
* crypto/ecrdsa_params.asn1, crypto/ecrdsa_pub_key.asn1:
- Templates for BER decoder for EC-RDSA parameters and public key.
Cc: linux-integrity@vger.kernel.org
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Some public key algorithms (like EC-DSA) keep important data, such as
digest and curve OIDs, in the parameters field (possibly more for
different EC-DSA variants). Thus, just setting a public key (as
for RSA) is not enough.
Append parameters into the key stream for akcipher_set_{pub,priv}_key.
Appended data is: (u32) algo OID, (u32) parameters length, parameters
data.
This does not affect the current akcipher API nor RSA ciphers (they can
ignore it). The idea of appending parameters to the key stream is by Herbert
Xu.
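The appended layout could be pictured roughly as follows (a sketch only; the
struct name is hypothetical and the exact byte order is whatever the patch
defines):
  /* Sketch: data appended to the akcipher key stream after the key itself. */
  struct appended_key_params {            /* hypothetical name */
          u32 algo_oid;                   /* (u32) OID of the algorithm */
          u32 params_len;                 /* (u32) length of the data below */
          u8  params[];                   /* parameters data */
  } __packed;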
Cc: David Howells <dhowells@redhat.com>
Cc: Denis Kenzior <denkenz@gmail.com>
Cc: keyrings@vger.kernel.org
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Reviewed-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Previously, akcipher .verify() just `decrypted' the signature (using RSA
encryption with the public key) to uncover the message hash, which was then
compared at the upper level in public_key_verify_signature() with the expected
hash value; that hash value itself was never passed into verify().
This approach was incompatible with the EC-DSA family of algorithms,
because, to verify a signature, an EC-DSA algorithm also needs the hash value
as input; it is then used (together with a signature divided into halves
`r||s') to produce a witness value, which is compared with `r' to
determine if the signature is correct. Thus, for EC-DSA, neither the
requirements of .verify() itself nor its output expectations in
public_key_verify_signature() were sufficient.
Make the improved .verify() call take the hash value as input and perform
the complete signature check, with no output besides a status.
Now for the top level verification only crypto_akcipher_verify() needs
to be called and its return value inspected.
Make sure that `digest' is in kmalloc'd memory (in place of `output`) in
{public,tpm}_key_verify_signature(), as insisted on by Herbert Xu; this will
be changed in the following commit.
Cc: David Howells <dhowells@redhat.com>
Cc: keyrings@vger.kernel.org
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Reviewed-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This patch adds a requirement to the generic 3DES implementation
such that 2-key 3DES (K1 == K3) is no longer allowed in FIPS mode.
We will also provide helpers that may be used by drivers that
implement 3DES to make the same check.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU and LKMM commits from Paul E. McKenney:
- An LKMM commit adding support for synchronize_srcu_expedited()
- A couple of straggling RCU flavor consolidation updates
- Documentation updates.
- Miscellaneous fixes
- SRCU updates
- RCU CPU stall-warning updates
- Torture-test updates
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When lockdep is enabled, and -Wuninitialized warnings are enabled,
Clang produces a silly warning for every file we compile:
In file included from kernel/sched/fair.c:23:
kernel/sched/sched.h:1094:15: error: variable 'cookie' is uninitialized when used here [-Werror,-Wuninitialized]
rf->cookie = lockdep_pin_lock(&rq->lock);
^~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/lockdep.h:474:60: note: expanded from macro 'lockdep_pin_lock'
#define lockdep_pin_lock(l) ({ struct pin_cookie cookie; cookie; })
^~~~~~
kernel/sched/sched.h:1094:15: note: variable 'cookie' is declared here
include/linux/lockdep.h:474:34: note: expanded from macro 'lockdep_pin_lock'
#define lockdep_pin_lock(l) ({ struct pin_cookie cookie; cookie; })
^
As the 'struct pin_cookie' structure is empty in this configuration,
there is no need to initialize it for correctness, but it also
does not hurt to set it to an empty structure, so do that to
avoid the warning.
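The fix presumably boils down to empty-initializing the cookie in the macro,
along these lines (a hedged sketch, not necessarily the literal diff):
  /* Sketch: give the (empty, in this config) cookie an explicit empty
   * initializer so Clang no longer sees an uninitialized use. */
  #define lockdep_pin_lock(l)   ({ struct pin_cookie cookie = { }; cookie; })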
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lkml.kernel.org/r/20190325125807.1437049-1-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
To be able to check and set bad block markers in the first and
second page of a block independently of each other, we create
separate flags for both cases.
Previously NAND_BBM_SECONDPAGE meant that both the first and the
second page were used. With this patch, NAND_BBM_FIRSTPAGE stands for
using the first page and NAND_BBM_SECONDPAGE for using the second
page.
This patch is only for preparation of subsequent changes and does
not implement the logic to actually handle both flags separately.
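Illustratively, once that logic lands, a check could treat each flag on its
own (a hedged sketch; check_bbm_page() is a hypothetical helper):
  /* Sketch: look at whichever page(s) of the block carry the marker. */
  if (chip->options & NAND_BBM_FIRSTPAGE)
          check_bbm_page(chip, first_page_of_block);
  if (chip->options & NAND_BBM_SECONDPAGE)
          check_bbm_page(chip, first_page_of_block + 1);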
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Reviewed-by: Boris Brezillon <bbrezillon@kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
Now that we have moved the information to the chip level, let's
remove all the unused flags and fields.
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
The information about where the manufacturer puts the bad block
markers inside the bad block and in the OOB data is stored in
different places. Let's move this information to the chip struct,
as we did it for rawnand.
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
The information about where the manufacturer puts the bad block
markers inside the bad block and in the OOB data is stored in
different places. Let's move this information to nand_chip.options
and nand_chip.badblockpos.
As this chip-specific information is not directly related to the
bad block table (BBT), we also rename the flags to NAND_BBM_*.
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
Currently, drivers are able to constify a nand_op_parser array,
but not nand_op_parser_pattern and nand_op_parser_pattern_elem,
since those are instantiated by using the NAND_OP_PARSER(_PATTERN) macros.
Add 'const' to them in order to move more driver data from the .data
section to the .rodata section.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|
|
Sphinx doesn't handle expressions in identifier references.
This fixes the following warnings:
./include/linux/mtd/rawnand.h:1184: WARNING: Inline strong start-string without end-string.
./include/linux/mtd/rawnand.h:1186: WARNING: Inline strong start-string without end-string.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
|