summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2019-04-22Merge tag 'v5.1-rc1' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into mlx5-next Linux 5.1-rc1 We forgot to reset the branch last merge window thus mlx5-next is outdated and still based on 5.0-rc2. This merge commit is needed to sync mlx5-next branch with 5.1-rc1. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-04-22RDMA/mlx5: Allow inserting a steering rule to the FDBMark Bloch
Allow this only via mlx5 raw create flow API, legacy verbs are not supported. To accommodate that, we add a new attribute to matcher creation to indicate the type of flow table to be used. MLX5_IB_ATTR_FLOW_MATCHER_FT_TYPE With this new attribute MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS is no longer needed, we keep it for compatibility but at most only a single attribute can be passed of the two. When inserting a flow rule to the FDB we require that a DEVX FT is provided as a destination, no other configuration is allowed. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22vfio: Use dev_printk() when possibleBjorn Helgaas
Use dev_printk() when possible to make messages consistent with other device-related messages. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-04-22RDMA/core: Add a netlink command to change net namespace of rdma deviceParav Pandit
Provide an option to change the net namespace of a rdma device through a netlink command. When multiple rdma devices exists in a system, and when containers are used, this will limit rdma device visibility to a specified net namespace. An example command to change net namespace of mlx5_1 device to the previously created net namespace 'foo' is: $ ip netns add foo $ rdma dev set mlx5_1 netns foo Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22media: cec-notifier: add cec_notifier_parse_hdmi_phandle helperHans Verkuil
Add helper function to parse the DT for the hdmi-phandle property and return the corresponding struct device pointer. It takes care to avoid increasing the device refcount since all we need is the device pointer. This pointer is used in the notifier list as a key, but it is never accessed by the CEC driver. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reported-by: Wen Yang <wen.yang99@zte.com.cn> Acked-by: Wen Yang <wen.yang99@zte.com.cn> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22media: rc: xbox_remote: add protocol and set timeoutMatthias Reichl
The timestamps in ir-keytable -t output showed that the Xbox DVD IR dongle decodes scancodes every 64ms. The last scancode of a longer button press is decodes 64ms after the last-but-one which indicates the decoder doesn't use a timeout but decodes on the last edge of the signal. 267.042629: lirc protocol(unknown): scancode = 0xace 267.042665: event type EV_MSC(0x04): scancode = 0xace 267.042665: event type EV_KEY(0x01) key_down: KEY_1(0x0002) 267.042665: event type EV_SYN(0x00). 267.106625: lirc protocol(unknown): scancode = 0xace 267.106643: event type EV_MSC(0x04): scancode = 0xace 267.106643: event type EV_SYN(0x00). 267.170623: lirc protocol(unknown): scancode = 0xace 267.170638: event type EV_MSC(0x04): scancode = 0xace 267.170638: event type EV_SYN(0x00). 267.234621: lirc protocol(unknown): scancode = 0xace 267.234636: event type EV_MSC(0x04): scancode = 0xace 267.234636: event type EV_SYN(0x00). 267.298623: lirc protocol(unknown): scancode = 0xace 267.298638: event type EV_MSC(0x04): scancode = 0xace 267.298638: event type EV_SYN(0x00). 267.543345: event type EV_KEY(0x01) key_down: KEY_1(0x0002) 267.543345: event type EV_SYN(0x00). 267.570015: event type EV_KEY(0x01) key_up: KEY_1(0x0002) 267.570015: event type EV_SYN(0x00). Add a protocol with the repeat value and set the timeout in the driver to 10ms (to have a bit of headroom for delays) so the Xbox DVD remote performs more responsive. Signed-off-by: Matthias Reichl <hias@horus.com> Acked-by: Benjamin Valentin <benpicco@googlemail.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22media: vb2: add waiting_in_dqbuf flagHans Verkuil
Calling VIDIOC_DQBUF can release the core serialization lock pointed to by vb2_queue->lock if it has to wait for a new buffer to arrive. However, if userspace dup()ped the video device filehandle, then it is possible to read or call DQBUF from two filehandles at the same time. It is also possible to call REQBUFS from one filehandle while the other is waiting for a buffer. This will remove all the buffers and reallocate new ones. Removing all the buffers isn't the problem here (that's already handled correctly by DQBUF), but the reallocating part is: DQBUF isn't aware that the buffers have changed. This is fixed by setting a flag whenever the lock is released while waiting for a buffer to arrive. And checking the flag where needed so we can return -EBUSY. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Reported-by: Syzbot <syzbot+4180ff9ca6810b06c1e9@syzkaller.appspotmail.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22block: fix use-after-free on gendiskYufen Yu
commit 2da78092dda "block: Fix dev_t minor allocation lifetime" specifically moved blk_free_devt(dev->devt) call to part_release() to avoid reallocating device number before the device is fully shutdown. However, it can cause use-after-free on gendisk in get_gendisk(). We use md device as example to show the race scenes: Process1 Worker Process2 md_free blkdev_open del_gendisk add delete_partition_work_fn() to wq __blkdev_get get_gendisk put_disk disk_release kfree(disk) find part from ext_devt_idr get_disk_and_module(disk) cause use after free delete_partition_work_fn put_device(part) part_release remove part from ext_devt_idr Before <devt, hd_struct pointer> is removed from ext_devt_idr by delete_partition_work_fn(), we can find the devt and then access gendisk by hd_struct pointer. But, if we access the gendisk after it have been freed, it can cause in use-after-freeon gendisk in get_gendisk(). We fix this by adding a new helper blk_invalidate_devt() in delete_partition() and del_gendisk(). It replaces hd_struct pointer in idr with value 'NULL', and deletes the entry from idr in part_release() as we do now. Thanks to Jan Kara for providing the solution and more clear comments for the code. Fixes: 2da78092dda1 ("block: Fix dev_t minor allocation lifetime") Cc: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-22Merge tag 'v5.1-rc6' into for-5.2/blockJens Axboe
Pull in v5.1-rc6 to resolve two conflicts. One is in BFQ, in just a comment, and is trivial. The other one is a conflict due to a later fix in the bio multi-page work, and needs a bit more care. * tag 'v5.1-rc6': (770 commits) Linux 5.1-rc6 block: make sure that bvec length can't be overflow block: kill all_q_node in request_queue x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping mm/kmemleak.c: fix unused-function warning init: initialize jump labels before command line option parsing kernel/watchdog_hld.c: hard lockup message should end with a newline kcov: improve CONFIG_ARCH_HAS_KCOV help text mm: fix inactive list balancing between NUMA nodes and cgroups mm/hotplug: treat CMA pages as unmovable proc: fixup proc-pid-vm test proc: fix map_files test on F29 mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock mm: swapoff: shmem_unuse() stop eviction without igrab() mm: swapoff: take notice of completion sooner mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES mm: swapoff: shmem_find_swap_entries() filter out other types slab: store tagged freelist for off-slab slabmgmt ... Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-22media: uapi: Add MEDIA_BUS_FMT_BGR888_3X8 media bus formatMickael Guene
This patch adds MEDIA_BUS_FMT_BGR888_3X8 used by STM MIPID02 CSI-2 to PARALLEL bridge driver when input format is MEDIA_BUS_FMT_BGR888_1X24. Signed-off-by: Mickael Guene <mickael.guene@st.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22media: media.h: Enable ALSA MEDIA_INTF_T* interface typesShuah Khan
Move PCM_CAPTURE, PCM_PLAYBACK, and CONTROL ALSA MEDIA_INTF_T* interface types back into __KERNEL__ scope to get ready for adding ALSA support for these to the media controller. Signed-off-by: Shuah Khan <shuah@kernel.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22media: Media Device Allocator APIShuah Khan
Media Device Allocator API to allows multiple drivers share a media device. This API solves a very common use-case for media devices where one physical device (an USB stick) provides both audio and video. When such media device exposes a standard USB Audio class, a proprietary Video class, two or more independent drivers will share a single physical USB bridge. In such cases, it is necessary to coordinate access to the shared resource. Using this API, drivers can allocate a media device with the shared struct device as the key. Once the media device is allocated by a driver, other drivers can get a reference to it. The media device is released when all the references are released. Signed-off-by: Shuah Khan <shuah@kernel.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22media: Introduce helpers to fill pixel format structsEzequiel Garcia
Add two new API helpers, v4l2_fill_pixfmt and v4l2_fill_pixfmt_mp, to be used by drivers to calculate plane sizes and bytes per lines. Note that driver-specific padding and alignment are not taken into account, and must be done by drivers using this API. Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22media: v4l2-ctrls.h: remove spurious textHans Verkuil
Somehow the string "Controls name" got pasted in two places where it doesn't belong. Remove that text. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22media: v4l: add I / P frame min max QP definitionsFish Lin
Add following V4L2 QP parameters for H.264: * V4L2_CID_MPEG_VIDEO_H264_I_FRAME_MIN_QP * V4L2_CID_MPEG_VIDEO_H264_I_FRAME_MAX_QP * V4L2_CID_MPEG_VIDEO_H264_P_FRAME_MIN_QP * V4L2_CID_MPEG_VIDEO_H264_P_FRAME_MAX_QP These controls will limit QP range for intra and inter frame, provide more manual control to improve video encode quality. Signed-off-by: Fish Lin <linfish@google.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-04-22iio: trigger: stm32-timer: fix build issue when disabledFabrice Gasnier
This fixes a build issue when CONFIG_IIO_STM32_TIMER_TRIGGER isn't set but used in stm32-dfsdm-adc driver (e.g. CONFIG_STM32_DFSDM_ADC is set): ERROR: "is_stm32_timer_trigger" [drivers/iio/adc/stm32-dfsdm-adc.ko] undefined! There are two possible options to fix this issue: - select IIO_STM32_TIMER_TRIGGER along with CONFIG_STM32_DFSDM_ADC. This is what's being done currently for CONFIG_STM32_ADC. - stub "is_stm32_timer_trigger" function Choice is made to stub this function as suggested in [1]. This is also inspired by similar "is_stm32_lptim_trigger" function (see [2]) in include/linux/iio/timer/stm32-lptim-trigger.h [1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1977377.html [2] https://lkml.org/lkml/2017/9/10/124 Fixes: 11646e81d775 ("iio: adc: stm32-dfsdm: add support for buffer modes") Reported-by: Randy Dunlap <rdunlap@infradead.org> Fix-suggested-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2019-04-21ipv6: Restore RTF_ADDRCONF check in rt6_qualify_for_ecmpDavid Ahern
The RTF_ADDRCONF flag filters out routes added by RA's in determining which routes can be appended to an existing one to create a multipath route. Restore the flag check and add a comment to document the RA piece. Fixes: 4e54507ab1a9 ("ipv6: Simplify rt6_qualify_for_ecmp") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-21i2c: mux: pca9541: remove support for unused platform dataRobert Shearman
There are no in-tree users of the platform data, so remove it to simplify the code slightly. Remove the now unused pca954x.h platform data header. Signed-off-by: Robert Shearman <robert.shearman@att.com> Signed-off-by: Peter Rosin <peda@axentia.se>
2019-04-21Merge 5.1-rc6 into tty-nextGreg Kroah-Hartman
We want the serial/tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-21Merge 5.1-rc6 into staging-nextGreg Kroah-Hartman
We want the fixes in here as well as this resolves an iio driver merge issue. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-21ipv6: Simplify rt6_qualify_for_ecmpDavid Ahern
After commit c7a1ce397ada ("ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create"), the gateway is no longer filled in for fib6_nh structs in a prefix route. Accordingly, the RTF_ADDRCONF flag check can be dropped from the 'rt6_qualify_for_ecmp'. Further, RTF_DYNAMIC is only set in rt6_info instances, so it can be removed from the check as well. This reduces rt6_qualify_for_ecmp and the mlxsw version to just checking if the nexthop has a gateway which is the real indication of whether entries can be coalesced into a multipath route. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-21uapi/habanalabs: add missing fields in bmon paramsOded Gabbay
This patch adds missing fields of start address 0 and 1 in the bmon parameter structure that is received from the user in the debug IOCTL. Without these fields, the functionality of the bmon trace is broken, because there is no configuration of the base address of the filter of the bus monitor. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-20Merge tag 'for-linus-20190420' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A set of small fixes that should go into this series. This contains: - Removal of unused queue member (Hou) - Overflow bvec fix (Ming) - Various little io_uring tweaks (me) - kthread parking - Only call cpu_possible() for verified CPU - Drop unused 'file' argument to io_file_put() - io_uring_enter vs io_uring_register deadlock fix - CQ overflow fix - BFQ internal depth update fix (me)" * tag 'for-linus-20190420' of git://git.kernel.dk/linux-block: block: make sure that bvec length can't be overflow block: kill all_q_node in request_queue io_uring: fix CQ overflow condition io_uring: fix possible deadlock between io_uring_{enter,register} io_uring: drop io_file_put() 'file' argument bfq: update internal depth state when queue depth changes io_uring: only test SQPOLL cpu after we've verified it io_uring: park SQPOLL thread if it's percpu
2019-04-20Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes: - various tooling fixes - kretprobe fixes - kprobes annotation fixes - kprobes error checking fix - fix the default events for AMD Family 17h CPUs - PEBS fix - AUX record fix - address filtering fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kprobes: Avoid kretprobe recursion bug kprobes: Mark ftrace mcount handler functions nokprobe x86/kprobes: Verify stack frame on kretprobe perf/x86/amd: Add event map for AMD Family 17h perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf() perf tools: Fix map reference counting perf evlist: Fix side band thread draining perf tools: Check maps for bpf programs perf bpf: Return NULL when RB tree lookup fails in perf_env__find_bpf_prog_info() tools include uapi: Sync sound/asound.h copy perf top: Always sample time to satisfy needs of use of ordered queuing perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user) tools lib traceevent: Fix missing equality check for strcmp perf stat: Disable DIR_FORMAT feature for 'perf stat record' perf scripts python: export-to-sqlite.py: Fix use of parent_id in calls_view perf header: Fix lock/unlock imbalances when processing BPF/BTF info perf/x86: Fix incorrect PEBS_REGS perf/ring_buffer: Fix AUX record suppression perf/core: Fix the address filtering fix kprobes: Fix error check when reusing optimized probes
2019-04-20Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes all over the place: a console spam fix, section attributes fixes, a KASLR fix, a TLB stack-variable alignment fix, a reboot quirk, boot options related warnings fix, an LTO fix, a deadlock fix and an RDT fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority x86/cpu/bugs: Use __initconst for 'const' init data x86/mm/KASLR: Fix the size of the direct mapping section x86/Kconfig: Fix spelling mistake "effectivness" -> "effectiveness" x86/mm/tlb: Revert "x86/mm: Align TLB invalidation info" x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T x86/mm: Prevent bogus warnings with "noexec=off" x86/build/lto: Fix truncated .bss with -fdata-sections x86/speculation: Prevent deadlock on ssb_state::lock x86/resctrl: Do not repeat rdtgroup mode initialization
2019-04-19random: move rand_initialize() earlierKees Cook
Right now rand_initialize() is run as an early_initcall(), but it only depends on timekeeping_init() (for mixing ktime_get_real() into the pools). However, the call to boot_init_stack_canary() for stack canary initialization runs earlier, which triggers a warning at boot: random: get_random_bytes called from start_kernel+0x357/0x548 with crng_init=0 Instead, this moves rand_initialize() to after timekeeping_init(), and moves canary initialization here as well. Note that this warning may still remain for machines that do not have UEFI RNG support (which initializes the RNG pools during setup_arch()), or for x86 machines without RDRAND (or booting without "random.trust=on" or CONFIG_RANDOM_TRUST_CPU=y). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-04-19tipc: introduce new socket option TIPC_SOCK_RECVQ_USEDTung Nguyen
When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the number of buffers in receive socket buffer which is not so helpful for user space applications. This commit introduces the new option TIPC_SOCK_RECVQ_USED which returns the current allocated bytes of the receive socket buffer. This helps user space applications dimension its buffer usage to avoid buffer overload issue. Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-19clk: Allow parents to be specified via clkspec indexStephen Boyd
Some clk providers are simple DT nodes that only have a 'clocks' property without having an associated 'clock-names' property. In these cases, we want to let these clk providers point to their parent clks without having to dereference the 'clocks' property at probe time to figure out the parent's globally unique clk name. Let's add an 'index' property to the parent_data structure so that clk providers can indicate that their parent is a particular index in the 'clocks' DT property. Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Jerome Brunet <jbrunet@baylibre.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Chen-Yu Tsai <wens@csie.org> Tested-by: Jeffrey Hugo <jhugo@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-19clk: Allow parents to be specified without string namesStephen Boyd
The common clk framework is lacking in ability to describe the clk topology without specifying strings for every possible parent-child link. There are a few drawbacks to the current approach: 1) String comparisons are used for everything, including describing topologies that are 'local' to a single clock controller. 2) clk providers (e.g. i2c clk drivers) need to create globally unique clk names to avoid collisions in the clk namespace, leading to awkward name generation code in various clk drivers. 3) DT bindings may not fully describe the clk topology and linkages between clk controllers because drivers can easily rely on globally unique strings to describe connections between clks. This leads to confusing DT bindings, complicated clk name generation code, and inefficient string comparisons during clk registration just so that the clk framework can detect the topology of the clk tree. Furthermore, some drivers call clk_get() and then __clk_get_name() to extract the globally unique clk name just so they can specify the parent of the clk they're registering. We have of_clk_parent_fill() but that mostly only works for single clks registered from a DT node, which isn't the norm. Let's simplify this all by introducing two new ways of specifying clk parents. The first method is an array of pointers to clk_hw structures corresponding to the parents at that index. This works for clks that are registered when we have access to all the clk_hw pointers for the parents. The second method is a mix of clk_hw pointers and strings of local and global parent clk names. If the .fw_name member of the map is set we'll look for that clk by performing a DT based lookup of the device the clk is registered with and the .name specified in the map. If that fails, we'll fallback to the .name member and perform a global clk name lookup like we've always done before. Using either one of these new methods is entirely optional. Existing drivers will continue to work, and they can migrate to this new approach as they see fit. Eventually, we'll want to get rid of the 'parent_names' array in struct clk_init_data and use one of these new methods instead. Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Jerome Brunet <jbrunet@baylibre.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Rob Herring <robh@kernel.org> Tested-by: Jeffrey Hugo <jhugo@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-19clk: Add of_clk_hw_register() API for early clk driversStephen Boyd
In some circumstances drivers register clks early and don't have access to a struct device because the device model isn't initialized yet. Add an API to let drivers register clks associated with a struct device_node so that these drivers can participate in getting parent clks through DT. Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Jerome Brunet <jbrunet@baylibre.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Rob Herring <robh@kernel.org> Tested-by: Jeffrey Hugo <jhugo@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-19driver core: Let dev_of_node() accept a NULL devStephen Boyd
We'd like to chain this in places where the 'dev' argument might be NULL. Let this function take a NULL 'dev' so this can work. Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Jerome Brunet <jbrunet@baylibre.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Jeffrey Hugo <jhugo@codeaurora.org> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Rob Herring <robh@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Jeffrey Hugo <jhugo@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-19net: socket: implement 64-bit timestampsArnd Bergmann
The 'timeval' and 'timespec' data structures used for socket timestamps are going to be redefined in user space based on 64-bit time_t in future versions of the C library to deal with the y2038 overflow problem, which breaks the ABI definition. Unlike many modern ioctl commands, SIOCGSTAMP and SIOCGSTAMPNS do not use the _IOR() macro to encode the size of the transferred data, so it remains ambiguous whether the application uses the old or new layout. The best workaround I could find is rather ugly: we redefine the command code based on the size of the respective data structure with a ternary operator. This lets it get evaluated as late as possible, hopefully after that structure is visible to the caller. We cannot use an #ifdef here, because inux/sockios.h might have been included before any libc header that could determine the size of time_t. The ioctl implementation now interprets the new command codes as always referring to the 64-bit structure on all architectures, while the old architecture specific command code still refers to the old architecture specific layout. The new command number is only used when they are actually different. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-19net: rework SIOCGSTAMP ioctl handlingArnd Bergmann
The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many socket protocol handlers, and all of those end up calling the same sock_get_timestamp()/sock_get_timestampns() helper functions, which results in a lot of duplicate code. With the introduction of 64-bit time_t on 32-bit architectures, this gets worse, as we then need four different ioctl commands in each socket protocol implementation. To simplify that, let's add a new .gettstamp() operation in struct proto_ops, and move ioctl implementation into the common sock_ioctl()/compat_sock_ioctl_trans() functions that these all go through. We can reuse the sock_get_timestamp() implementation, but generalize it so it can deal with both native and compat mode, as well as timeval and timespec structures. Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-19vlan: support binding link state to vlan member bridge portsMike Manning
In the case of vlan filtering on bridges, the bridge may also have the corresponding vlan devices as upper devices. Currently the link state of vlan devices is transferred from the lower device. So this is up if the bridge is in admin up state and there is at least one bridge port that is up, regardless of the vlan that the port is a member of. The link state of the vlan device may need to track only the state of the subset of ports that are also members of the corresponding vlan, rather than that of all ports. Add a flag to specify a vlan bridge binding mode, by which the link state is no longer automatically transferred from the lower device, but is instead determined by the bridge ports that are members of the vlan. Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-19USB: core: Fix bug caused by duplicate interface PM usage counterAlan Stern
The syzkaller fuzzer reported a bug in the USB hub driver which turned out to be caused by a negative runtime-PM usage counter. This allowed a hub to be runtime suspended at a time when the driver did not expect it. The symptom is a WARNING issued because the hub's status URB is submitted while it is already active: URB 0000000031fb463e submitted while active WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363 The negative runtime-PM usage count was caused by an unfortunate design decision made when runtime PM was first implemented for USB. At that time, USB class drivers were allowed to unbind from their interfaces without balancing the usage counter (i.e., leaving it with a positive count). The core code would take care of setting the counter back to 0 before allowing another driver to bind to the interface. Later on when runtime PM was implemented for the entire kernel, the opposite decision was made: Drivers were required to balance their runtime-PM get and put calls. In order to maintain backward compatibility, however, the USB subsystem adapted to the new implementation by keeping an independent usage counter for each interface and using it to automatically adjust the normal usage counter back to 0 whenever a driver was unbound. This approach involves duplicating information, but what is worse, it doesn't work properly in cases where a USB class driver delays decrementing the usage counter until after the driver's disconnect() routine has returned and the counter has been adjusted back to 0. Doing so would cause the usage counter to become negative. There's even a warning about this in the USB power management documentation! As it happens, this is exactly what the hub driver does. The kick_hub_wq() routine increments the runtime-PM usage counter, and the corresponding decrement is carried out by hub_event() in the context of the hub_wq work-queue thread. This work routine may sometimes run after the driver has been unbound from its interface, and when it does it causes the usage counter to go negative. It is not possible for hub_disconnect() to wait for a pending hub_event() call to finish, because hub_disconnect() is called with the device lock held and hub_event() acquires that lock. The only feasible fix is to reverse the original design decision: remove the duplicate interface-specific usage counter and require USB drivers to balance their runtime PM gets and puts. As far as I know, all existing drivers currently do this. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: syzbot+7634edaea4d0b341c625@syzkaller.appspotmail.com CC: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-19drm/msm/gpu: Add submit queue queriesJordan Crouse
Add the capability to query information from a submit queue. The first available parameter is for querying the number of GPU faults (hangs) that can be attributed to the queue. This is useful for implementing context robustness. A user context can regularly query the number of faults to see if it is responsible for any and if so it can invalidate itself. This is also helpful for testing by confirming to the user driver if a particular command stream caused a fault (or not as the case may be). Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-19drm/msm: add param to retrieve # of GPU faults (global)Rob Clark
For KHR_robustness, userspace wants to know two things, the count of GPU faults globally, and the count of faults attributed to a given context. This patch providees the former, and the next patch provides the latter. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
2019-04-19drm/msm/gpu: add per-process pagetables paramRob Clark
For now it always returns '0' (false), but once the iommu work is in place to enable per-process pagetables we can update the value returned. Userspace needs to know this to make an informed decision about exposing KHR_robustness. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
2019-04-19Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "16 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping mm/kmemleak.c: fix unused-function warning init: initialize jump labels before command line option parsing kernel/watchdog_hld.c: hard lockup message should end with a newline kcov: improve CONFIG_ARCH_HAS_KCOV help text mm: fix inactive list balancing between NUMA nodes and cgroups mm/hotplug: treat CMA pages as unmovable proc: fixup proc-pid-vm test proc: fix map_files test on F29 mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock mm: swapoff: shmem_unuse() stop eviction without igrab() mm: swapoff: take notice of completion sooner mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES mm: swapoff: shmem_find_swap_entries() filter out other types slab: store tagged freelist for off-slab slabmgmt
2019-04-19irqchip: Add driver for IXP4xxLinus Walleij
The IXP4xx (arch/arm/mach-ixp4xx) is an old Intel XScale platform that has very wide deployment and use. As part of modernizing the platform, we need to implement a proper irqchip in the irqchip subsystem. The IXP4xx irqchip is tightly jotted together with the GPIO controller, and whereas in the past we would deal with this complex logic by adding necessarily different code, we can nowadays modernize it using a hierarchical irqchip. The actual IXP4 irqchip is a simple active low level IRQ controller, whereas the GPIO functionality resides in a different memory area and adds edge trigger support for the interrupts. The interrupts from GPIO lines 0..12 are 1:1 mapped to a fixed set of hardware IRQs on this IRQchip, so we expect the child GPIO interrupt controller to go in and allocate descriptors for these interrupts. For the other interrupts, as we do not yet have DT support for this platform, we create a linear irqdomain and then go in and allocate the IRQs that the legacy boards use. This code will be removed on the DT probe path when we add DT support to the platform. We add some translation code for supporting DT translations for the fwnodes, but we leave most of that for later. Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Jason Cooper <jason@lakedaemon.net> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-04-19cgroup: add tracing points for cgroup v2 freezerRoman Gushchin
Add cgroup:cgroup_freeze and cgroup:cgroup_unfreeze events, which are using the existing cgroup tracing infrastructure. Add the cgroup_event event class, which is similar to the cgroup class, but contains an additional integer field to store a new value (the level field is dropped). Also add two tracing events: cgroup_notify_populated and cgroup_notify_frozen, which are raised in a generic way using the TRACE_CGROUP_PATH() macro. This allows to trace cgroup state transitions and is generally helpful for debugging the cgroup freezer code. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2019-04-19cgroup: cgroup v2 freezerRoman Gushchin
Cgroup v1 implements the freezer controller, which provides an ability to stop the workload in a cgroup and temporarily free up some resources (cpu, io, network bandwidth and, potentially, memory) for some other tasks. Cgroup v2 lacks this functionality. This patch implements freezer for cgroup v2. Cgroup v2 freezer tries to put tasks into a state similar to jobctl stop. This means that tasks can be killed, ptraced (using PTRACE_SEIZE*), and interrupted. It is possible to attach to a frozen task, get some information (e.g. read registers) and detach. It's also possible to migrate a frozen tasks to another cgroup. This differs cgroup v2 freezer from cgroup v1 freezer, which mostly tried to imitate the system-wide freezer. However uninterruptible sleep is fine when all tasks are going to be frozen (hibernation case), it's not the acceptable state for some subset of the system. Cgroup v2 freezer is not supporting freezing kthreads. If a non-root cgroup contains kthread, the cgroup still can be frozen, but the kthread will remain running, the cgroup will be shown as non-frozen, and the notification will not be delivered. * PTRACE_ATTACH is not working because non-fatal signal delivery is blocked in frozen state. There are some interface differences between cgroup v1 and cgroup v2 freezer too, which are required to conform the cgroup v2 interface design principles: 1) There is no separate controller, which has to be turned on: the functionality is always available and is represented by cgroup.freeze and cgroup.events cgroup control files. 2) The desired state is defined by the cgroup.freeze control file. Any hierarchical configuration is allowed. 3) The interface is asynchronous. The actual state is available using cgroup.events control file ("frozen" field). There are no dedicated transitional states. 4) It's allowed to make any changes with the cgroup hierarchy (create new cgroups, remove old cgroups, move tasks between cgroups) no matter if some cgroups are frozen. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> No-objection-from-me-by: Oleg Nesterov <oleg@redhat.com> Cc: kernel-team@fb.com
2019-04-19cgroup: protect cgroup->nr_(dying_)descendants by css_set_lockRoman Gushchin
The number of descendant cgroups and the number of dying descendant cgroups are currently synchronized using the cgroup_mutex. The number of descendant cgroups will be required by the cgroup v2 freezer, which will use it to determine if a cgroup is frozen (depending on total number of descendants and number of frozen descendants). It's not always acceptable to grab the cgroup_mutex, especially from quite hot paths (e.g. exit()). To avoid this, let's additionally synchronize these counters using the css_set_lock. So, it's safe to read these counters with either cgroup_mutex or css_set_lock locked, and for changing both locks should be acquired. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: kernel-team@fb.com
2019-04-19block: make sure that bvec length can't be overflowMing Lei
bvec->bv_offset may be bigger than PAGE_SIZE sometimes, such as, when one bio is splitted in the middle of one bvec via bio_split(), and bi_iter.bi_bvec_done is used to build offset of the 1st bvec of remained bio. And the remained bio's bvec may be re-submitted to fs layer via ITER_IBVEC, such as loop and nvme-loop. So we have to make sure that every bvec's offset is less than PAGE_SIZE from bio_for_each_segment_all() because some drivers(loop, nvme-loop) passes the splitted bvec to fs layer via ITER_BVEC. This patch fixes this issue reported by Zhang Yi When running nvme/011. Cc: Christoph Hellwig <hch@lst.de> Cc: Yi Zhang <yi.zhang@redhat.com> Reported-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: 6dc4f100c175 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-19block: kill all_q_node in request_queueHou Tao
all_q_node has not been used since commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"), so remove it. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - several new key mappings for HID - a host of new ACPI IDs used to identify Elan touchpads in Lenovo laptops * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ HID: input: add mapping for "Toggle Display" key HID: input: add mapping for "Full Screen" key HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys HID: input: add mapping for Expose/Overview key HID: input: fix mapping of aspect ratio key [media] doc-rst: switch to new names for Full Screen/Aspect keys Input: document meanings of KEY_SCREEN and KEY_ZOOM Input: elan_i2c - add hardware ID for multiple Lenovo laptops
2019-04-19coredump: fix race condition between mmget_not_zero()/get_task_mm() and core ↵Andrea Arcangeli
dumping The core dumping code has always run without holding the mmap_sem for writing, despite that is the only way to ensure that the entire vma layout will not change from under it. Only using some signal serialization on the processes belonging to the mm is not nearly enough. This was pointed out earlier. For example in Hugh's post from Jul 2017: https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils "Not strictly relevant here, but a related note: I was very surprised to discover, only quite recently, how handle_mm_fault() may be called without down_read(mmap_sem) - when core dumping. That seems a misguided optimization to me, which would also be nice to correct" In particular because the growsdown and growsup can move the vm_start/vm_end the various loops the core dump does around the vma will not be consistent if page faults can happen concurrently. Pretty much all users calling mmget_not_zero()/get_task_mm() and then taking the mmap_sem had the potential to introduce unexpected side effects in the core dumping code. Adding mmap_sem for writing around the ->core_dump invocation is a viable long term fix, but it requires removing all copy user and page faults and to replace them with get_dump_page() for all binary formats which is not suitable as a short term fix. For the time being this solution manually covers the places that can confuse the core dump either by altering the vma layout or the vma flags while it runs. Once ->core_dump runs under mmap_sem for writing the function mmget_still_valid() can be dropped. Allowing mmap_sem protected sections to run in parallel with the coredump provides some minor parallelism advantage to the swapoff code (which seems to be safe enough by never mangling any vma field and can keep doing swapins in parallel to the core dumping) and to some other corner case. In order to facilitate the backporting I added "Fixes: 86039bd3b4e6" however the side effect of this same race condition in /proc/pid/mem should be reproducible since before 2.6.12-rc2 so I couldn't add any other "Fixes:" because there's no hash beyond the git genesis commit. Because find_extend_vma() is the only location outside of the process context that could modify the "mm" structures under mmap_sem for reading, by adding the mmget_still_valid() check to it, all other cases that take the mmap_sem for reading don't need the new check after mmget_not_zero()/get_task_mm(). The expand_stack() in page fault context also doesn't need the new check, because all tasks under core dumping are frozen. Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Jann Horn <jannh@google.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Jann Horn <jannh@google.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19mm: swapoff: shmem_unuse() stop eviction without igrab()Hugh Dickins
The igrab() in shmem_unuse() looks good, but we forgot that it gives no protection against concurrent unmounting: a point made by Konstantin Khlebnikov eight years ago, and then fixed in 2.6.39 by 778dd893ae78 ("tmpfs: fix race between umount and swapoff"). The current 5.1-rc swapoff is liable to hit "VFS: Busy inodes after unmount of tmpfs. Self-destruct in 5 seconds. Have a nice day..." followed by GPF. Once again, give up on using igrab(); but don't go back to making such heavy-handed use of shmem_swaplist_mutex as last time: that would spoil the new design, and I expect could deadlock inside shmem_swapin_page(). Instead, shmem_unuse() just raise a "stop_eviction" count in the shmem- specific inode, and shmem_evict_inode() wait for that to go down to 0. Call it "stop_eviction" rather than "swapoff_busy" because it can be put to use for others later (huge tmpfs patches expect to use it). That simplifies shmem_unuse(), protecting it from both unlink and unmount; and in practice lets it locate all the swap in its first try. But do not rely on that: there's still a theoretical case, when shmem_writepage() might have been preempted after its get_swap_page(), before making the swap entry visible to swapoff. [hughd@google.com: remove incorrect list_del()] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904091133570.1898@eggly.anvils Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1904081259400.1523@eggly.anvils Fixes: b56a2d8af914 ("mm: rid swapoff of quadratic complexity") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca> Cc: Huang Ying <ying.huang@intel.com> Cc: Kelley Nielsen <kelleynnn@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Rik van Riel <riel@surriel.com> Cc: Vineeth Pillai <vpillai@digitalocean.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-19drm/ttm: fix re-init of global structuresChristian König
When a driver unloads without unloading TTM we don't correctly clear the global structures leading to errors on re-init. Next step should probably be to remove the global structures and kobjs all together, but this is tricky since we need to maintain backward compatibility. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Tested-by: Karol Herbst <kherbst@redhat.com> CC: stable@vger.kernel.org # 5.0.x Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-04-19vt: selection: allow functions to be called from inside kernelOkash Khawaja
This patch breaks set_selection() into two functions so that when called from kernel, copy_from_user() can be avoided. The two functions are called set_selection_user() and set_selection_kernel() in order to be explicit about their purposes. This also means updating any references to set_selection() and fixing for name change. It also exports set_selection_kernel() and paste_selection(). These changes are used the following patch where speakup's selection functionality calls into the above functions, thereby doing away with parallel implementation. Signed-off-by: Okash Khawaja <okash.khawaja@gmail.com> Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Tested-by: Gregory Nowak <greg@gregn.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>