Age | Commit message (Collapse) | Author |
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
As found by syzkaller, malicious users can set whatever tx_queue_len
on a tun device and eventually crash the kernel.
Lets remove the ALIGN(XXX, SMP_CACHE_BYTES) thing since a small
ring buffer is not fast anyway.
Fixes: 2e0ab8ca83c1 ("ptr_ring: array based FIFO for pointers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The only usage of vmbus_sendpacket_ctl was by vmbus_sendpacket.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function vmbus_sendpacket_pagebuffer_ctl was never used directly.
Just have vmbus_send_pagebuffer
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This function is not used anywhere in current code.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Resolve issues with !CONFIG_BPF_SYSCALL and !STREAM_PARSER
net/core/filter.c: In function ‘do_sk_redirect_map’:
net/core/filter.c:1881:3: error: implicit declaration of function ‘__sock_map_lookup_elem’ [-Werror=implicit-function-declaration]
sk = __sock_map_lookup_elem(ri->map, ri->ifindex);
^
net/core/filter.c:1881:6: warning: assignment makes pointer from integer without a cast [enabled by default]
sk = __sock_map_lookup_elem(ri->map, ri->ifindex);
Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://anongit.freedesktop.org/git/drm-misc into drm-next
UAPI Changes:
- vc4: Allow userspace to dictate rendering order in submit_cl ioctl (Eric)
Cross-subsystem Changes:
- vboxvideo: One of Cihangir's patches applies to vboxvideo which is maintained
in staging
Core Changes:
- atomic_legacy_backoff is officially killed (Daniel)
- Extract drm_device.h (Daniel)
- Unregister drm device on unplug (Daniel)
- Rename deprecated drm_*_(un)?reference functions to drm_*_{get|put} (Cihangir)
Driver Changes:
- vc4: Error/destroy path cleanups, log level demotion, edid leak (Eric)
- various: Make various drm_*_funcs structs const (Bhumika)
- tinydrm: add support for LEGO MINDSTORMS EV3 LCD (David)
- various: Second half of .dumb_{map_offset|destroy} defaults set (Noralf)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Eric Anholt <eric@anholt.net>
Cc: Bhumika Goyal <bhumirks@gmail.com>
Cc: Cihangir Akturk <cakturk@gmail.com>
Cc: David Lechner <david@lechnology.com>
Cc: Noralf Trønnes <noralf@tronnes.org>
* tag 'drm-misc-next-2017-08-16' of git://anongit.freedesktop.org/git/drm-misc: (50 commits)
drm/gem-cma-helper: Remove drm_gem_cma_dumb_map_offset()
drm/virtio: Use the drm_driver.dumb_destroy default
drm/bochs: Use the drm_driver.dumb_destroy default
drm/mgag200: Use the drm_driver.dumb_destroy default
drm/exynos: Use .dumb_map_offset and .dumb_destroy defaults
drm/msm: Use the drm_driver.dumb_destroy default
drm/ast: Use the drm_driver.dumb_destroy default
drm/qxl: Use the drm_driver.dumb_destroy default
drm/udl: Use the drm_driver.dumb_destroy default
drm/cirrus: Use the drm_driver.dumb_destroy default
drm/tegra: Use .dumb_map_offset and .dumb_destroy defaults
drm/gma500: Use .dumb_map_offset and .dumb_destroy defaults
drm/mxsfb: Use .dumb_map_offset and .dumb_destroy defaults
drm/meson: Use .dumb_map_offset and .dumb_destroy defaults
drm/kirin: Use .dumb_map_offset and .dumb_destroy defaults
drm/vc4: Continue the switch to drm_*_put() helpers
drm/vc4: Fix leak of HDMI EDID
dma-buf: fix reservation_object_wait_timeout_rcu to wait correctly v2
dma-buf: add reservation_object_copy_fences (v2)
drm/tinydrm: add support for LEGO MINDSTORMS EV3 LCD
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/soc
Pull "soc changes for omaps for v4.14" from Tony Lindgren:
SoC updates for omaps for v4.14. Most of the chages are to add
support for new dra762 SoC. The other changes are are for legacy
DMA code removal, and MMC quirk and iodelay config for dra7.
* tag 'omap-for-v4.14/soc-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP: dra7: powerdomain data: Register SoC specific powerdomains
ARM: dra762: Enable SMP for dra762
ARM: dra7: hwmod: Register dra76x specific hwmod
ARM: dra762: Add support for device identification
ARM: OMAP2+: board-generic: add support for dra762 family
ARM: OMAP2+: Select PINCTRL_TI_IODELAY for SOC_DRA7XX
ARM: OMAP2+: Add pdata-quirks for MMC/SD on DRA74x EVM
ARM: OMAP2+: Remove unused legacy code for DMA
|
|
http://git.linaro.org/people/jens.wiklander/linux-tee into next/drivers
Pull "Small fixes and enhancements for the TEE subsystem" from Jens Wiklander:
* tag 'tee-drv-for-4.14' of http://git.linaro.org/people/jens.wiklander/linux-tee:
tee: optee: sync with new naming of interrupts
tee: indicate privileged dev in gen_caps
tee: optee: interruptible RPC sleep
tee: optee: add const to tee_driver_ops and tee_desc structures
tee: tee_shm: Constify dma_buf_ops structures.
tee: add forward declaration for struct device
tee: optee: fix uninitialized symbol 'parg'
|
|
Instead add a mechanism to ensure that the request doesn't disappear
from underneath us while copying from the socket. We do this by
preventing xprt_release() from freeing the XDR buffers until the
flag RPC_TASK_MSG_RECV has been cleared from the request.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Recently we added a new map type called dev map used to forward XDP
packets between ports (6093ec2dc313). This patches introduces a
similar notion for sockets.
A sockmap allows users to add participating sockets to a map. When
sockets are added to the map enough context is stored with the
map entry to use the entry with a new helper
bpf_sk_redirect_map(map, key, flags)
This helper (analogous to bpf_redirect_map in XDP) is given the map
and an entry in the map. When called from a sockmap program, discussed
below, the skb will be sent on the socket using skb_send_sock().
With the above we need a bpf program to call the helper from that will
then implement the send logic. The initial site implemented in this
series is the recv_sock hook. For this to work we implemented a map
attach command to add attributes to a map. In sockmap we add two
programs a parse program and a verdict program. The parse program
uses strparser to build messages and pass them to the verdict program.
The parse programs use the normal strparser semantics. The verdict
program is of type SK_SKB.
The verdict program returns a verdict SK_DROP, or SK_REDIRECT for
now. Additional actions may be added later. When SK_REDIRECT is
returned, expected when bpf program uses bpf_sk_redirect_map(), the
sockmap logic will consult per cpu variables set by the helper routine
and pull the sock entry out of the sock map. This pattern follows the
existing redirect logic in cls and xdp programs.
This gives the flow,
recv_sock -> str_parser (parse_prog) -> verdict_prog -> skb_send_sock
\
-> kfree_skb
As an example use case a message based load balancer may use specific
logic in the verdict program to select the sock to send on.
Sample programs are provided in future patches that hopefully illustrate
the user interfaces. Also selftests are in follow-on patches.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bpf_prog_inc_not_zero will be used by upcoming sockmap patches this
patch simply exports it so we can pull it in.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A class of programs, run from strparser and soon from a new map type
called sock map, are used with skb as the context but on established
sockets. By creating a specific program type for these we can use
bpf helpers that expect full sockets and get the verifier to ensure
these helpers are not used out of context.
The new type is BPF_PROG_TYPE_SK_SKB. This patch introduces the
infrastructure and type.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Legacy PCI INTx interrupts are represented in the PCI_INTERRUPT_PIN
register using the range 1-4, which matches our enum pci_interrupt_pin.
This is however not ideal for an IRQ domain, where with 4 interrupts we
would ideally have a domain of size 4 & hwirq numbers in the range 0-3.
Different PCI host controller drivers have handled this in different ways.
Of those under drivers/pci/ which register an INTx IRQ domain, we have:
- pcie-altera uses the range 1-4 in device trees and an IRQ domain of
size 5 to cover that range, with entry 0 wasted.
- pcie-xilinx & pcie-xilinx-nwl use the range 1-4 in device trees but
register an IRQ domain of size 4, which doesn't cover the hwirq=4/INTD
case leading to that interrupt being broken.
- pci-ftpci100 & pci-aardvark use the range 0-3 in both device trees & as
hwirq numbering in the driver & IRQ domain.
In order to introduce some level of consistency in at least the hwirq
numbering used by the drivers & IRQ domains, this patch introduces a new
pci_irqd_intx_xlate() helper function which drivers using the 1-4 range in
device trees can assign as the xlate callback for their INTx IRQ domain.
This translates the 1-4 range into a 0-3 range, allowing us to use an IRQ
domain of size 4 & avoid a wasted entry. Further patches will make use of
this in drivers to allow them to use an IRQ domain of size 4 for legacy
INTx interrupts without breaking INTD.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
This patch handles the following issues that were observed when we are
handling racing channel offer message and rescind message for the same
offer:
1. Since the host does not respond to messages on a rescinded channel,
in the current code, we could be indefinitely blocked on the vmbus_open() call.
2. When a rescinded channel is being closed, if there is a pending interrupt on the
channel, we could end up freeing the channel that the interrupt handler would run on.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add a resource managed version of irq_sim_init(). This can be
conveniently used in device drivers.
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-gpio@vger.kernel.org
Cc: Bamvor Jian Zhang <bamvor.zhangjian@linaro.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Link: http://lkml.kernel.org/r/20170814145318.6495-3-brgl@bgdev.pl
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Implement a simple, irq_work-based framework for simulating
interrupts. Currently the API exposes routines for initializing and
deinitializing the simulator object, enqueueing the interrupts and
retrieving the allocated interrupt numbers based on the offset of the
dummy interrupt in the simulator struct.
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-gpio@vger.kernel.org
Cc: Bamvor Jian Zhang <bamvor.zhangjian@linaro.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Link: http://lkml.kernel.org/r/20170814145318.6495-2-brgl@bgdev.pl
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The omapdrm platform data are not used anymore, remove them.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
|
|
|
|
Pull networking fixes from David Miller:
1) Fix TCP checksum offload handling in iwlwifi driver, from Emmanuel
Grumbach.
2) In ksz DSA tagging code, free SKB if skb_put_padto() fails. From
Vivien Didelot.
3) Fix two regressions with bonding on wireless, from Andreas Born.
4) Fix build when busypoll is disabled, from Daniel Borkmann.
5) Fix copy_linear_skb() wrt. SO_PEEK_OFF, from Eric Dumazet.
6) Set SKB cached route properly in inet_rtm_getroute(), from Florian
Westphal.
7) Fix PCI-E relaxed ordering handling in cxgb4 driver, from Ding
Tianhong.
8) Fix module refcnt leak in ULP code, from Sabrina Dubroca.
9) Fix use of GFP_KERNEL in atomic contexts in AF_KEY code, from Eric
Dumazet.
10) Need to purge socket write queue in dccp_destroy_sock(), also from
Eric Dumazet.
11) Make bpf_trace_printk() work properly on 32-bit architectures, from
Daniel Borkmann.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
bpf: fix bpf_trace_printk on 32 bit archs
PCI: fix oops when try to find Root Port for a PCI device
sfc: don't try and read ef10 data on non-ef10 NIC
net_sched: remove warning from qdisc_hash_add
net_sched/sfq: update hierarchical backlog when drop packet
net_sched: reset pointers to tcf blocks in classful qdiscs' destructors
ipv4: fix NULL dereference in free_fib_info_rcu()
net: Fix a typo in comment about sock flags.
ipv6: fix NULL dereference in ip6_route_dev_notify()
tcp: fix possible deadlock in TCP stack vs BPF filter
dccp: purge write queue in dccp_destroy_sock()
udp: fix linear skb reception with PEEK_OFF
ipv6: release rt6->rt6i_idev properly during ifdown
af_key: do not use GFP_KERNEL in atomic contexts
tcp: ulp: avoid module refcnt leak in tcp_set_ulp
net/cxgb4vf: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
net/cxgb4: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
PCI: Disable Relaxed Ordering Attributes for AMD A1100
PCI: Disable Relaxed Ordering for some Intel processors
PCI: Disable PCIe Relaxed Ordering if unsupported
...
|
|
The extcon header file defines the functions which used the mismatched
indentation and used the space on some case. So, this patch clean-up
the indentation in order to improve the readbility.
And this patch changes the return value of extcon_get_extcon_dev()
because of maintaing the same value with extcon_get_edev_by_phandle().
Signed-off-by: Chanwoo Choi <cwchoi00@gmail.com>
|
|
The extcon files explains the detailed operation for functions and
what is meaning of extcon structure. There are different explanation
even if the same argument.
So, it modifies the description for both functions and structures
in order to improve the readability and guide the role of functions
more well.
Also, this patch fixes the mismatching license info as a GPL v2
and removes the inactive author information.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
|
|
The commit 575c2b867ee0 ("extcon: Rename the extcon_set/get_state()
to maintain the function naming pattern") renames the extcon function as
following: But, the extcon just keeps the old API to prevent the build error.
This patch removes the deprecatd extcon API.
- extcon_get_cable_state_() -> extcon_get_state()
- extcon_set_cable_state_() -> extcon_set_state_sync()
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
State of a register doesn't matter if it wasn't read in reaching an exit;
a write screens off all reads downstream of it from all explored_states
upstream of it.
This allows us to prune many more branches; here are some processed insn
counts for some Cilium programs:
Program before after
bpf_lb_opt_-DLB_L3.o 6515 3361
bpf_lb_opt_-DLB_L4.o 8976 5176
bpf_lb_opt_-DUNKNOWN.o 2960 1137
bpf_lxc_opt_-DDROP_ALL.o 95412 48537
bpf_lxc_opt_-DUNKNOWN.o 141706 78718
bpf_netdev.o 24251 17995
bpf_overlay.o 10999 9385
The runtime is also improved; here are 'time' results in ms:
Program before after
bpf_lb_opt_-DLB_L3.o 24 6
bpf_lb_opt_-DLB_L4.o 26 11
bpf_lb_opt_-DUNKNOWN.o 11 2
bpf_lxc_opt_-DDROP_ALL.o 1288 139
bpf_lxc_opt_-DUNKNOWN.o 1768 234
bpf_netdev.o 62 31
bpf_overlay.o 15 13
Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We currently have a definition of enum pci_interrupt_pin in a header
specific to PCI endpoints - linux/pci-epf.h. In order to allow for use of
this enum from PCI host code in a future commit, move its definition to
linux/pci.h & include that from linux/pci-epf.h.
Additionally we add a PCI_NUM_INTX macro which indicates the number of PCI
INTx interrupts, and will be used alongside enum pci_interrupt_pin in
further patches.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
[bhelgaas: move enum pci_interrupt_pin outside #ifdef CONFIG_PCI]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core
* 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux:
arm64: add VMAP_STACK overflow detection
arm64: add on_accessible_stack()
arm64: add basic VMAP_STACK support
arm64: use an irq stack pointer
arm64: assembler: allow adr_this_cpu to use the stack pointer
arm64: factor out entry stack manipulation
efi/arm64: add EFI_KIMG_ALIGN
arm64: move SEGMENT_ALIGN to <asm/memory.h>
arm64: clean up irq stack definitions
arm64: clean up THREAD_* definitions
arm64: factor out PAGE_* and CONT_* definitions
arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
fork: allow arch-override of VMAP stack alignment
arm64: remove __die()'s stack dump
|
|
In some cases, an architecture might wish its stacks to be aligned to a
boundary larger than THREAD_SIZE. For example, using an alignment of
double THREAD_SIZE can allow for stack overflows smaller than
THREAD_SIZE to be detected by checking a single bit of the stack
pointer.
This patch allows architectures to override the alignment of VMAP'd
stacks, by defining THREAD_ALIGN. Where not defined, this defaults to
THREAD_SIZE, as is the case today.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: linux-kernel@vger.kernel.org
|
|
Add a timer to flush entries from the Flush-Queues every
10ms. This makes sure that no stale TLB entries remain for
too long after an IOVA has been unmapped.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The lock is taken from the same CPU most of the time. But
having it allows to flush the queue also from another CPU if
necessary.
This will be used by a timer to regularily flush any pending
IOVAs from the Flush-Queues.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
There are two counters:
* fq_flush_start_cnt - Increased when a TLB flush
is started.
* fq_flush_finish_cnt - Increased when a TLB flush
is finished.
The fq_flush_start_cnt is assigned to every Flush-Queue
entry on its creation. When freeing entries from the
Flush-Queue, the value in the entry is compared to the
fq_flush_finish_cnt. The entry can only be freed when its
value is less than the value of fq_flush_finish_cnt.
The reason for these counters it to take advantage of IOMMU
TLB flushes that happened on other CPUs. These already
flushed the TLB for Flush-Queue entries on other CPUs so
that they can already be freed without flushing the TLB
again.
This makes it less likely that the Flush-Queue is full and
saves IOMMU TLB flushes.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add a function to add entries to the Flush-Queue ring
buffer. If the buffer is full, call the flush-callback and
free the entries.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This patch adds the basic data-structures to implement
flush-queues in the generic IOVA code. It also adds the
initialization and destroy routines for these data
structures.
The initialization routine is designed so that the use of
this feature is optional for the users of IOVA code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This new call-back will be used to check if the domain attach need be
deferred for now. If yes, the domain attach/detach will return directly.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add zstd compression and decompression kernel modules.
zstd offers a wide varity of compression speed and quality trade-offs.
It can compress at speeds approaching lz4, and quality approaching lzma.
zstd decompressions at speeds more than twice as fast as zlib, and
decompression speed remains roughly the same across all compression levels.
The code was ported from the upstream zstd source repository. The
`linux/zstd.h` header was modified to match linux kernel style.
The cross-platform and allocation code was stripped out. Instead zstd
requires the caller to pass a preallocated workspace. The source files
were clang-formatted [1] to match the Linux Kernel style as much as
possible. Otherwise, the code was unmodified. We would like to avoid
as much further manual modification to the source code as possible, so it
will be easier to keep the kernel zstd up to date.
I benchmarked zstd compression as a special character device. I ran zstd
and zlib compression at several levels, as well as performing no
compression, which measure the time spent copying the data to kernel space.
Data is passed to the compresser 4096 B at a time. The benchmark file is
located in the upstream zstd source repository under
`contrib/linux-kernel/zstd_compress_test.c` [2].
I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD. I benchmarked using `silesia.tar` [3], which is
211,988,480 B large. Run the following commands for the benchmark:
sudo modprobe zstd_compress_test
sudo mknod zstd_compress_test c 245 0
sudo cp silesia.tar zstd_compress_test
The time is reported by the time of the userland `cp`.
The MB/s is computed with
1,536,217,008 B / time(buffer size, hash)
which includes the time to copy from userland.
The Adjusted MB/s is computed with
1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
The memory reported is the amount of memory the compressor requests.
| Method | Size (B) | Time (s) | Ratio | MB/s | Adj MB/s | Mem (MB) |
|----------|----------|----------|-------|---------|----------|----------|
| none | 11988480 | 0.100 | 1 | 2119.88 | - | - |
| zstd -1 | 73645762 | 1.044 | 2.878 | 203.05 | 224.56 | 1.23 |
| zstd -3 | 66988878 | 1.761 | 3.165 | 120.38 | 127.63 | 2.47 |
| zstd -5 | 65001259 | 2.563 | 3.261 | 82.71 | 86.07 | 2.86 |
| zstd -10 | 60165346 | 13.242 | 3.523 | 16.01 | 16.13 | 13.22 |
| zstd -15 | 58009756 | 47.601 | 3.654 | 4.45 | 4.46 | 21.61 |
| zstd -19 | 54014593 | 102.835 | 3.925 | 2.06 | 2.06 | 60.15 |
| zlib -1 | 77260026 | 2.895 | 2.744 | 73.23 | 75.85 | 0.27 |
| zlib -3 | 72972206 | 4.116 | 2.905 | 51.50 | 52.79 | 0.27 |
| zlib -6 | 68190360 | 9.633 | 3.109 | 22.01 | 22.24 | 0.27 |
| zlib -9 | 67613382 | 22.554 | 3.135 | 9.40 | 9.44 | 0.27 |
I benchmarked zstd decompression using the same method on the same machine.
The benchmark file is located in the upstream zstd repo under
`contrib/linux-kernel/zstd_decompress_test.c` [4]. The memory reported is
the amount of memory required to decompress data compressed with the given
compression level. If you know the maximum size of your input, you can
reduce the memory usage of decompression irrespective of the compression
level.
| Method | Time (s) | MB/s | Adjusted MB/s | Memory (MB) |
|----------|----------|---------|---------------|-------------|
| none | 0.025 | 8479.54 | - | - |
| zstd -1 | 0.358 | 592.15 | 636.60 | 0.84 |
| zstd -3 | 0.396 | 535.32 | 571.40 | 1.46 |
| zstd -5 | 0.396 | 535.32 | 571.40 | 1.46 |
| zstd -10 | 0.374 | 566.81 | 607.42 | 2.51 |
| zstd -15 | 0.379 | 559.34 | 598.84 | 4.61 |
| zstd -19 | 0.412 | 514.54 | 547.77 | 8.80 |
| zlib -1 | 0.940 | 225.52 | 231.68 | 0.04 |
| zlib -3 | 0.883 | 240.08 | 247.07 | 0.04 |
| zlib -6 | 0.844 | 251.17 | 258.84 | 0.04 |
| zlib -9 | 0.837 | 253.27 | 287.64 | 0.04 |
Tested in userland using the test-suite in the zstd repo under
`contrib/linux-kernel/test/UserlandTest.cpp` [5] by mocking the kernel
functions. Fuzz tested using libfuzzer [6] with the fuzz harnesses under
`contrib/linux-kernel/test/{RoundTripCrash.c,DecompressCrash.c}` [7] [8]
with ASAN, UBSAN, and MSAN. Additionaly, it was tested while testing the
BtrFS and SquashFS patches coming next.
[1] https://clang.llvm.org/docs/ClangFormat.html
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/zstd_compress_test.c
[3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
[4] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/zstd_decompress_test.c
[5] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/UserlandTest.cpp
[6] http://llvm.org/docs/LibFuzzer.html
[7] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/RoundTripCrash.c
[8] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/DecompressCrash.c
zstd source repository: https://github.com/facebook/zstd
Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Adds xxhash kernel module with xxh32 and xxh64 hashes. xxhash is an
extremely fast non-cryptographic hash algorithm for checksumming.
The zstd compression and decompression modules added in the next patch
require xxhash. I extracted it out from zstd since it is useful on its
own. I copied the code from the upstream XXHash source repository and
translated it into kernel style. I ran benchmarks and tests in the kernel
and tests in userland.
I benchmarked xxhash as a special character device. I ran in four modes,
no-op, xxh32, xxh64, and crc32. The no-op mode simply copies the data to
kernel space and ignores it. The xxh32, xxh64, and crc32 modes compute
hashes on the copied data. I also ran it with four different buffer sizes.
The benchmark file is located in the upstream zstd source repository under
`contrib/linux-kernel/xxhash_test.c` [1].
I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD. I benchmarked using the file `filesystem.squashfs`
from `ubuntu-16.10-desktop-amd64.iso`, which is 1,536,217,088 B large.
Run the following commands for the benchmark:
modprobe xxhash_test
mknod xxhash_test c 245 0
time cp filesystem.squashfs xxhash_test
The time is reported by the time of the userland `cp`.
The GB/s is computed with
1,536,217,008 B / time(buffer size, hash)
which includes the time to copy from userland.
The Normalized GB/s is computed with
1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
| Buffer Size (B) | Hash | Time (s) | GB/s | Adjusted GB/s |
|-----------------|-------|----------|------|---------------|
| 1024 | none | 0.408 | 3.77 | - |
| 1024 | xxh32 | 0.649 | 2.37 | 6.37 |
| 1024 | xxh64 | 0.542 | 2.83 | 11.46 |
| 1024 | crc32 | 1.290 | 1.19 | 1.74 |
| 4096 | none | 0.380 | 4.04 | - |
| 4096 | xxh32 | 0.645 | 2.38 | 5.79 |
| 4096 | xxh64 | 0.500 | 3.07 | 12.80 |
| 4096 | crc32 | 1.168 | 1.32 | 1.95 |
| 8192 | none | 0.351 | 4.38 | - |
| 8192 | xxh32 | 0.614 | 2.50 | 5.84 |
| 8192 | xxh64 | 0.464 | 3.31 | 13.60 |
| 8192 | crc32 | 1.163 | 1.32 | 1.89 |
| 16384 | none | 0.346 | 4.43 | - |
| 16384 | xxh32 | 0.590 | 2.60 | 6.30 |
| 16384 | xxh64 | 0.466 | 3.30 | 12.80 |
| 16384 | crc32 | 1.183 | 1.30 | 1.84 |
Tested in userland using the test-suite in the zstd repo under
`contrib/linux-kernel/test/XXHashUserlandTest.cpp` [2] by mocking the
kernel functions. A line in each branch of every function in `xxhash.c`
was commented out to ensure that the test-suite fails. Additionally
tested while testing zstd and with SMHasher [3].
[1] https://phabricator.intern.facebook.com/P57526246
[2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/XXHashUserlandTest.cpp
[3] https://github.com/aappleby/smhasher
zstd source repository: https://github.com/facebook/zstd
XXHash source repository: https://github.com/cyan4973/xxhash
Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Rather than forcing us to take the inode->i_lock just in order to bump
the number.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
The commit lists can get very large, so using the inode->i_lock can
end up affecting general metadata performance.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
nfs_page_group_lock() is now always called with the 'nonblock'
parameter set to 'false'.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
|
When running in guest mode ppc64 supports a different mechanism for hugetlb
allocation/reservation. The LPAR management application called HMC can
be used to reserve a set of hugepages and we pass the details of
reserved pages via device tree to the guest. (more details in
htab_dt_scan_hugepage_blocks()) . We do the memblock_reserve of the range
and later in the boot sequence, we add the reserved range to huge_boot_pages.
But to enable 16G hugetlb on baremetal config (when we are not running as guest)
we want to do memblock reservation during boot. Generic code already does this
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This patch introduces the usb charger support based on usb phy that
makes an enhancement to a power driver. The basic conception of the
usb charger is that, when one usb charger is added or removed by
reporting from the extcon device state change, the usb charger will
report to power user to set the current limitation.
Power user can register a notifiee on the usb phy by issuing
usb_register_notifier() to get notified by charger status changes
or charger current changes.
we can notify what current to be drawn to power user according to
different charger type, and now we have 2 methods to get charger type.
One is get charger type from extcon subsystem, which also means the
charger state changes. Another is we can get the charger type from
USB controller detecting or PMIC detecting, and the charger state
changes should be told by issuing usb_phy_set_charger_state().
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
|
|
When XIP_KERNEL is enabled, some functions are defined in the .data
ELF section because we require them to be in RAM whenever we communicate
with the flash chip. However this causes problems when FTRACE is
enabled and gcc emits calls to __gnu_mcount_nc in the function
prolog:
drivers/built-in.o: In function `cfi_chip_setup':
:(.data+0x272fc): relocation truncated to fit: R_ARM_CALL against symbol `__gnu_mcount_nc' defined in .text section in arch/arm/kernel/built-in.o
drivers/built-in.o: In function `cfi_probe_chip':
:(.data+0x27de8): relocation truncated to fit: R_ARM_CALL against symbol `__gnu_mcount_nc' defined in .text section in arch/arm/kernel/built-in.o
/tmp/ccY172rP.s: Assembler messages:
/tmp/ccY172rP.s:70: Warning: ignoring changed section attributes for .data
/tmp/ccY172rP.s: Error: 1 warning, treating warnings as errors
make[5]: *** [drivers/mtd/chips/cfi_probe.o] Error 1
/tmp/ccK4rjeO.s: Assembler messages:
/tmp/ccK4rjeO.s:421: Warning: ignoring changed section attributes for .data
/tmp/ccK4rjeO.s: Error: 1 warning, treating warnings as errors
make[5]: *** [drivers/mtd/chips/cfi_util.o] Error 1
/tmp/ccUvhCYR.s: Assembler messages:
/tmp/ccUvhCYR.s:1895: Warning: ignoring changed section attributes for .data
/tmp/ccUvhCYR.s: Error: 1 warning, treating warnings as errors
Specifically, this does not work because the .data section is not
marked executable, which leads LD to not generate trampolines for
long calls.
This moves the __xipram functions into their own .xiptext section instead.
The section is still placed next to .data and located in RAM but is marked
executable, which avoids the build errors.
Also, we only need to place the XIP functions into a separate section
if both CONFIG_XIP_KERNEL and CONFIG_MTD_XIP are set: When only MTD_XIP
is used, the whole kernel is still in RAM and we do not need to worry
about pulling out the rug under it. When only XIP_KERNEL but not MTD_XIP
is set, the kernel is in some form of ROM, but we never write to it.
Note that MTD_XIP has been broken on ARM since around 2011 or 2012. I
have sent another patch[2] to fix compilation, which I plan to merge
through arm-soc unless there are objections. The obvious alternative
to that would be to completely rip out the MTD_XIP support from the
kernel, since obviously nobody has been using it in a long while.
Link: [1] https://patchwork.kernel.org/patch/8109771/
Link: [2] https://patchwork.kernel.org/patch/9855225/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
|
|
The struct iommu_device has a 'struct device' embedded into
it, not as a pointer, but the whole struct. In the
conversion of the iommu drivers to use struct iommu_device
it was forgotten that the relase function for that struct
device simply calls kfree() on the pointer.
This frees memory that was never allocated and causes memory
corruption.
To fix this issue, use a pointer to struct device instead of
embedding the whole struct. This needs some updates in the
iommu sysfs code as well as the Intel VT-d and AMD IOMMU
driver.
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: 39ab9555c241 ('iommu: Add sysfs bindings for struct iommu_device')
Cc: stable@vger.kernel.org # >= v4.11
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The MT6380 is a regulator found those boards with MediaTek MT7622 SoC
It is connected as a slave to the SoC using MediaTek PMIC wrapper which
is the common interface connecting with Mediatek made various PMICs.
Signed-off-by: Chenglin Xu <chenglin.xu@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
In the SG case this is already handled since a non-zero
request->num_mapped_sgs is a clear indicator that dma_map_sg()
had been called. While it would be nice to do the same for the
singly mapped case by simply checking for non-zero request->dma,
it's conceivable that 0 is a valid dma_addr_t handle. Hence add
a flag 'dma_mapped' to struct usb_request and use this to
determine the need to call dma_unmap_single(). Otherwise, if a
request is not DMA mapped then the result of calling
usb_request_unmap_request() would safely be a no-op.
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
|
|
The current f_hid driver doesn't handle GET_PROCOTOL and
SET_PROCOTOL requests, which are required to operate HID
gadgets in BOOT mode. This patch implements this feature for
devices that have the same implementation for REPORT and BOOT mode
so that these devices are recognized by older BIOSes.
Signed-off-by: Abdulhadi Mohamed <abdulahhadi2@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU fix from Paul McKenney:
" This pull request is for an RCU change that permits waiting for grace
periods started by CPUs late in the process of going offline. Lack of
this capability is causing failures:
http://lkml.kernel.org/r/db9c91f6-1b17-6136-84f0-03c3c2581ab4@codeaurora.org"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|