Age | Commit message (Collapse) | Author |
|
With the CPU renaming registers on its own, and all the overhead of the
syscall entry/exit, it is doubtful whether the compiled output of
mov %r8, %rax
mov %rcx, %r8
mov %rax, %rcx
jmpq sys_clone
is measurably slower than the hand-crafted version of
xchg %r8, %rcx
So get rid of this special case.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: luto@amacapital.net
Cc: viro@zeniv.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
While at it, convert declarations of type "unsigned" to "unsigned int".
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: luto@amacapital.net
Cc: viro@zeniv.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Using SYSCALL_DEFINEx() is recommended, so use it also here.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: luto@amacapital.net
Cc: viro@zeniv.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
sys32_vm86_warning() is long gone.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: luto@amacapital.net
Cc: viro@zeniv.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
If the compat entry point is equivalent to the native entry point, it
does not need to be specified explicitly.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: luto@amacapital.net
Cc: viro@zeniv.linux.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Fix the objtool build when cross-compiling a 64-bit kernel on a 32-bit
host. This also simplifies read_retpoline_hints() a bit and makes its
implementation similar to most of the other annotation reading
functions.
Reported-by: Sven Joachim <svenjoac@gmx.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: b5bc2231b8ad ("objtool: Add retpoline validation")
Link: http://lkml.kernel.org/r/2ca46c636c23aa9c9d57d53c75de4ee3ddf7a7df.1520380691.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The refcount.c file missed the mass-addition of the SPDX lines.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The LKDTM modules keep expanding, and it's getting weird to have each file
get a prefix. Instead, move to a subdirectory for cleaner handling.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Hitting the following hardlockup due to a race condition in
error CQE processing.
[26146.879798] bnxt_en 0000:04:00.0: QPLIB: FP: CQ Processed Req
[26146.886346] bnxt_en 0000:04:00.0: QPLIB: wr_id[1251] = 0x0 with status 0xa
[26156.350935] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
[26156.357470] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace
[26156.447957] CPU: 4 PID: 3413 Comm: kworker/4:1H Kdump: loaded
[26156.457994] Hardware name: Dell Inc. PowerEdge R430/0CN7X8,
[26156.466390] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[26156.472639] Call Trace:
[26156.475379] <NMI> [<ffffffff98d0d722>] dump_stack+0x19/0x1b
[26156.481833] [<ffffffff9873f775>] watchdog_overflow_callback+0x135/0x140
[26156.489341] [<ffffffff9877f237>] __perf_event_overflow+0x57/0x100
[26156.496256] [<ffffffff98787c24>] perf_event_overflow+0x14/0x20
[26156.502887] [<ffffffff9860a580>] intel_pmu_handle_irq+0x220/0x510
[26156.509813] [<ffffffff98d16031>] perf_event_nmi_handler+0x31/0x50
[26156.516738] [<ffffffff98d1790c>] nmi_handle.isra.0+0x8c/0x150
[26156.523273] [<ffffffff98d17be8>] do_nmi+0x218/0x460
[26156.528834] [<ffffffff98d16d79>] end_repeat_nmi+0x1e/0x7e
[26156.534980] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.543268] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.551556] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.559842] <EOE> [<ffffffff98d083e4>] queued_spin_lock_slowpath+0xb/0xf
[26156.567555] [<ffffffff98d15690>] _raw_spin_lock+0x20/0x30
[26156.573696] [<ffffffffc08381a1>] bnxt_qplib_lock_buddy_cq+0x31/0x40 [bnxt_re]
[26156.581789] [<ffffffffc083bbaa>] bnxt_qplib_poll_cq+0x43a/0xf10 [bnxt_re]
[26156.589493] [<ffffffffc083239b>] bnxt_re_poll_cq+0x9b/0x760 [bnxt_re]
The issue happens if RQ poll_cq or SQ poll_cq or Async error event tries to
put the error QP in flush list. Since SQ and RQ of each error qp are added
to two different flush list, we need to protect it using locks of
corresponding CQs. Difference in order of acquiring the lock in
SQ poll_cq and RQ poll_cq can cause a hard lockup.
Revisits the locking strategy and removes the usage of qplib_cq.hwq.lock.
Instead of this lock, introduces qplib_cq.flush_lock to handle
addition/deletion of QPs in flush list. Also, always invoke the flush_lock
in order (SQ CQ lock first and then RQ CQ lock) to avoid any potential
deadlock.
Other than the poll_cq context, the movement of QP to/from flush list can
be done in modify_qp context or from an async error event from HW.
Synchronize these operations using the bnxt_re verbs layer CQ locks.
To achieve this, adds a call back to the HW abstraction layer(qplib) to
bnxt_re ib_verbs layer in case of async error event. Also, removes the
buddy cq functions as it is no longer required.
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Fix warning limit for kernel stack consumption:
drivers/infiniband/core/cq.c: In function 'ib_process_cq_direct':
drivers/infiniband/core/cq.c:78:1: error: the frame size of 1032 bytes
is larger than 1024 bytes [-Werror=frame-larger-than=]
Using smaller ib_wc array on the stack brings us comfortably below that
limit again.
Fixes: 246d8b184c10 ("IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Sergey Gorenko <sergeygo@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
"err" is either zero or possibly uninitialized here. It should be
-EINVAL.
Fixes: 427c1e7bcd7e ("{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The series that introduced dual port RoCE mode assumed that we don't have
a dual port HCA that use the mlx5 driver, this is not the case for
Connect-IB HCAs. This reasoning led to assigning 1 as the native port
index which causes issue when the second port is used.
For example query_pkey() when called on the second port will return values
of the first port. Make sure that we assign the right port index as the
native port index.
Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The commit cited below added a gid_type field (RoCEv1 or RoCEv2)
to GID properties.
When adding GIDs, this gid_type field was copied over to the
hardware gid table. However, when deleting GIDs, the gid_type field
was not copied over to the hardware gid table.
As a result, when running RoCEv2, all RoCEv2 gids in the
hardware gid table were set to type RoCEv1 when any gid was deleted.
This problem would persist until the next gid was added (which would again
restore the gid_type field for all the gids in the hardware gid table).
Fix this by copying over the gid_type field to the hardware gid table
when deleting gids, so that the gid_type of all remaining gids is
preserved when a gid is deleted.
Fixes: b699a859d17b ("IB/mlx4: Add gid_type to GID properties")
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
When using IPv4 addresses in RoCEv2, the GID format for the mapped
IPv4 address should be: ::ffff:<4-byte IPv4 address>.
In the cited commit, IPv4 mapped IPV6 addresses had the 3 upper dwords
zeroed out by memset, which resulted in deleting the ffff field.
However, since procedure ipv6_addr_v4mapped() already verifies that the
gid has format ::ffff:<ipv4 address>, no change is needed for the gid,
and the memset can simply be removed.
Fixes: 7e57b85c444c ("IB/mlx4: Add support for setting RoCEv2 gids in hardware")
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
If the read-only flag is true on a SCSI disk, re-reading the partition
table sets the flag back to false.
To observe this bug, you can run:
1. blockdev --setro /dev/sda
2. blockdev --rereadpt /dev/sda
3. blockdev --getro /dev/sda
This commit reads the disk's old state and combines it with the device
disk-reported state rather than unconditionally marking it as RW.
Reported-by: Li Ning <lining916740672@icloud.com>
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Because of the shifting around of code in qla2x00_probe_one recently,
failures during adapter initialization can lead to problems, i.e. NULL
pointer crashes and doubly freed data structures which cause eventual
panics.
This V2 version makes the relevant memory free routines idempotent, so
repeat calls won't cause any harm. I also removed the problematic
probe_init_failed exit point as it is not needed.
Fixes: d64d6c5671db ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Rework sd_zbc_check_zone_size() to avoid a memory leak due to an early
return if sd_zbc_report_zones() fails.
Reported-by: David.butterfield <david.butterfield@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: stable@vger.kernel.org
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The firmware event workqueue should not be marked as WQ_MEM_RECLAIM
as it's doesn't need to make forward progress under memory pressure.
In the current state it will result in a deadlock if the device had been
forcefully removed.
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
iWARP does not support RDMA WRITE or SEND with immediate data.
Driver should check this before submitting to FW and return an
immediate error
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Race in qedr_poll_cq, lastest_cqe wasn't protected by lock,
leading to a case where two context's accessing poll_cq at
the same time lead to one of them having a pointer to an old
latest_cqe and reading an invalid cqe element
Signed-off-by: Amit Radzi <Amit.Radzi@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Fix iWARP connect and listen to use the mapped port for
ipv4 and ipv6. Without this fixed, running on a server
that has iwpmd enabled will not use the correct port
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The wrong parameter was passed to dst_neigh_lookup
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Samsung micro-USB 11-pin connector beside standard micro-USB pins,
has pins dedicated to route MHL traffic.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
These bindings allow to describe most known standard USB connectors
and it should be possible to extend it if necessary.
USB connectors, beside USB can be used to route other protocols,
for example UART, Audio, MHL. In such case every device passing data
through the connector should have appropriate graph bindings.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
In practice this is really only meaningful in the context of the DM
multipath target (which uses dm_table_set_type() to set the type of
device DM should create via its "queue_mode" option).
So this change allows a DM multipath device with "queue_mode bio" to be
upgraded from DM_TYPE_BIO_BASED to DM_TYPE_NVME_BIO_BASED -- iff the
underlying device(s) are NVMe.
DM_TYPE_NVME_BIO_BASED is just a DM core implementation detail that
allows for NVMe-specific optimizations (e.g. use direct_make_request
instead of generic_make_request). If in the future there is no benefit
or need to distinguish NVMe vs not: then it will be removed.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
This eliminates the "queue_mode" configuration's "nvme" mode. There
wasn't anything NVMe-specific about that mode. It was named "nvme"
because it was a short name for the mode. But the entire point of the
mode was to optimize the multipath target for underlying devices that
are _not_ SCSI-based. Devices that aren't SCSI have no need for the
various SCSI device handler (scsi_dh) specific code in DM multipath.
But rather than narrowly define this scsi_dh vs not branching in terms
of "nvme": invert the logic so that we're just checking whether a
multipath device is layered on SCSI devices with scsi_dh attached.
This allows any future storage technology to avoid scsi_dh specific code
in the multipath target too.
Fixes: 848b8aefd4 ("dm mpath: optimize NVMe bio-based support")
Suggested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
The strncmp function should compare 4 bytes.
Fixes: 22c11858e8002 ("dm: introduce DM_TYPE_NVME_BIO_BASED")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Upstream commit 4102d9de6d375 ("dm raid: fix rs_get_progress()
synchronization state/ratio") in combination with commit 7c29744ecce
("dm raid: simplify rs_get_progress()") introduced a regression by
incorrectly reporting a sync_ratio of 0 for degraded raid sets. This
caused lvm2 to fail to repair raid legs automatically.
Fix by identifying the degraded state by checking the MD_RECOVERY_INTR
flag and returning mddev->recovery_cp in case it is set.
MD sets recovery = [ MD_RECOVERY_RECOVER MD_RECOVERY_INTR
MD_RECOVERY_NEEDED ] when a RAID member fails. It then shuts down any
sync thread that is running and leaves us with all MD_RECOVERY_* flags
cleared. The bug occurs if a status is requested in the short time it
takes to shut down any sync thread and clear the flags, because we were
keying in on the MD_RECOVERY_NEEDED - understanding it to be the initial
phase of a “recover” sync thread. However, this is an incorrect
interpretation if MD_RECOVERY_INTR is also set.
This also explains why the bug only happened when automatic repair was
enabled and not a normal ‘manual’ method. It is impossible to react
quick enough to hit the problematic window without it being automated.
Fix passes automatic repair tests.
Fixes: 7c29744ecce ("dm raid: simplify rs_get_progress()")
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Otherwise an underlying device's teardown (e.g. SCSI) may race with the
DM ioctl or persistent reservation and result in dereferencing driver
memory that gets freed when the underlying device's final blkdev_put()
occurs.
bdgrab() only increases the refcount for the block_device's inode to
ensure the block_device struct itself will not be freed, but does not
guarantee the block_device will remain associated with the gendisk or
its storage.
Cc: stable@vger.kernel.org # 4.8+
Reported-by: David Jeffery <djeffery@redhat.com>
Suggested-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Ben Marzinski <bmarzins@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
gcc-6.3 and earlier show a new warning after a seemingly unrelated
change to the arm64 PAGE_KERNEL definition:
In file included from drivers/md/dm-bufio.c:14:0:
drivers/md/dm-bufio.c: In function 'alloc_buffer':
include/linux/sched/mm.h:182:56: warning: 'noio_flag' may be used uninitialized in this function [-Wmaybe-uninitialized]
current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | flags;
^
The same warning happened earlier on linux-3.18 for MIPS and I did a
workaround for that, but now it's come back.
gcc-7 and newer are apparently smart enough to figure this out, and
other architectures don't show it, so the best I could come up with is
to rework the caller slightly in a way that makes it obvious enough to
all arm64 compilers what is happening here.
Fixes: 41acec624087 ("arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0()")
Link: https://patchwork.kernel.org/patch/9692829/
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[snitzer: moved declarations inside conditional, altered vmalloc return]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Compilation of bpftool on a distro that lacks eBPF support in the installed
kernel headers fails with:
common.c: In function ‘is_bpffs’:
common.c:96:40: error: ‘BPF_FS_MAGIC’ undeclared (first use in this function)
return (unsigned long)st_fs.f_type == BPF_FS_MAGIC;
^
Fix this the same way it is already in tools/lib/bpf/libbpf.c and
tools/lib/api/fs/fs.c.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Only allow ifindex from IP_PKTINFO to override SO_BINDTODEVICE settings
if the index is actually set in the message.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull sigingo fix from Eric Biederman:
"The kbuild test robot found that I accidentally moved si_pkey when I
was cleaning up siginfo_t. A short followed by an int with the int
having 8 byte alignment. Sheesh siginfo_t is a weird structure.
I have now corrected it and added build time checks that with a little
luck will catch any similar future mistakes. The build time checks
were sufficient for me to verify the bug and to verify my fix. So they
are at least useful this once."
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal/x86: Include the field offsets in the build time checks
signal: Correct the offset of si_pkey in struct siginfo
|
|
devm_memremap_pages() was re-worked in e8d513483300 "memremap: change
devm_memremap_pages interface to use struct dev_pagemap" to take a
caller allocated struct dev_pagemap as a function parameter. A call to
devres_free() was left in the error cleanup path which results in a
kernel panic if the remap fails for some reason. Remove it to fix the
panic and let devm_memremap_pages() fail gracefully.
Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface...")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Use sysfs to allow supporting sub-channels. The userspace application
makes request to host to create sub-channels and the UIO kernel
driver populates the sysfs per-channel directory with a binary
attribute file that can be used to read/write ring.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Hyper-V balloon driver makes non-trivial calculations to convert Linux's
representation of free/used memory to what Hyper-V host expects to see. Add
a tracepoint to see what's being sent and where the data comes from.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Our num_pages_onlined accounting is buggy:
1) In case we're offlining a memory block which was present at boot (e.g.
when there was no hotplug at all) we subtract 32k from 0 and as
num_pages_onlined is unsigned get a very big positive number.
2) Commit 6df8d9aaf3af ("Drivers: hv: balloon: Correctly update onlined
page count") made num_pages_onlined counter accurate on onlining but
totally incorrect on offlining for partly populated regions: no matter
how many pages were onlined and what was actually added to
num_pages_onlined counter we always subtract the full region (32k) so
again, num_pages_onlined can wrap around zero. By onlining/offlining
the same partly populated region multiple times we can make the
situation worse.
Solve these issues by doing accurate accounting on offlining: walk HAS
list, check for covered range and gaps.
Fixes: 6df8d9aaf3af ("Drivers: hv: balloon: Correctly update onlined page count")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Instead of doing pfn_to_page() and continuosly casting page to unsigned
long just cache the pfn of the page with page_to_pfn().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We have a mix of different ideas of which loglevel should be used. Unify
on the following:
- pr_info() for normal operation
- pr_warn() for 'strange' host behavior
- pr_err() for all errors.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The 2016 version of Hyper-V offers the option to operate the guest VM
per-vcpu stimer's in Direct Mode, which means the timer interupts on its
own vector rather than queueing a VMbus message. Direct Mode reduces
timer processing overhead in both the hypervisor and the guest, and
avoids having timer interrupts pollute the VMbus interrupt stream for
the synthetic NIC and storage. This patch enables Direct Mode by
default on stimer0 when running on a version of Hyper-V that supports
it.
In prep for coming support of Hyper-V on ARM64, the arch independent
portion of the code contains calls to routines that will be populated
on ARM64 but are not needed and do nothing on x86.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The comments doesn't match what the current code does, also have a
typo. This patch corrects them.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The usage of strchr requires inclusion of string.h.
Fixes: 0c38cda64aec ("tools: hv: remove unnecessary header files and netlink related code")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch fixes the below warnings with new glibc and gcc:
hv_vss_daemon.c:100:13: warning: In the GNU C Library, "major" is defined
by <sys/sysmacros.h>. For historical compatibility, it is currently
defined by <sys/types.h> as well, but we plan to remove this soon.
To use "major", include <sys/sysmacros.h> directly.
hv_fcopy_daemon.c:42:2: note: 'snprintf' output between 2 and 1040
bytes into a destination of size 260
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Just a trivial tyop fix.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The kzalloc function is called with GFP_ATOMIC.
But according to driver call graph, it is not in atomic context,
namely no spinlock is held nor in an interrupt handler.
This GFP_ATOMIC is unnecessary, and replace with GFP_KERNEL.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
On Hyper-V the VF NIC has the same MAC as the related synthetic NIC.
VF NIC can work under the synthetic NIC transparently, without its
own IP address. The existing KVP daemon only gets IP from the first
NIC matching a MAC address, and may not be able to find the IP in
this case.
This patch fixes the problem by searching the NIC matching the MAC,
and having an IP address. So, the IP address will be found and
reported to the host successfully.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fix bugs in signaling the Hyper-V host when freeing space in the
host->guest ring buffer:
1. The interrupt_mask must not be used to determine whether to signal
on the host->guest ring buffer
2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
*after* pending_send_sz is read in order to avoid a race condition
3. Comparisons with pending_send_sz must treat the "equals" case as
not-enough-space
4. Don't signal if the pending_send_sz feature is not present. Older
versions of Hyper-V that don't implement this feature will poll.
Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
Cc: Stable <stable@vger.kernel.org> # 4.14 and above
Signed-off-by: Michael Kelley <mhkelley@outlook.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 57e6f0d7b804 ("typec: tcpm: Only request matching pdos") is causing
a regression, before this commit e.g. the GPD win and GPD pocket devices
were charging at 9V 3A with a PD charger, now they are instead slowly
discharging at 5V 0.4A, as this commit causes the ports max_snk_mv/ma/mw
settings to be completely ignored.
Arguably the way to fix this would be to add a PDO_VAR() describing the
voltage range to the snk_caps of boards which can handle any voltage in
their range, but the "typec: tcpm: Only request matching pdos" commit
looks at the type of PDO advertised by the source/charger and if that
is fixed (as it typically is) only compairs against PDO_FIXED entries
in the snk_caps so supporting a range of voltage would require adding a
PDO_FIXED entry for *every possible* voltage to snk_caps.
AFAICT there is no reason why a fixed source_cap cannot be matched against
a variable snk_cap, so at a minimum the commit should be rewritten to
support that.
For now lets revert the "typec: tcpm: Only request matching pdos" commit,
fixing the regression.
Fixes: 57e6f0d7b804 ("typec: tcpm: Only request matching pdos")
Cc: Badhri Jagan Sridharan <badhri@google.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Without pm_runtime_{get,put}_sync calls in place, reading
vbus status via /sys causes the following error:
Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa0ab060
pgd = b333e822
[fa0ab060] *pgd=48011452(bad)
[<c05261b0>] (musb_default_readb) from [<c0525bd0>] (musb_vbus_show+0x58/0xe4)
[<c0525bd0>] (musb_vbus_show) from [<c04c0148>] (dev_attr_show+0x20/0x44)
[<c04c0148>] (dev_attr_show) from [<c0259f74>] (sysfs_kf_seq_show+0x80/0xdc)
[<c0259f74>] (sysfs_kf_seq_show) from [<c0210bac>] (seq_read+0x250/0x448)
[<c0210bac>] (seq_read) from [<c01edb40>] (__vfs_read+0x1c/0x118)
[<c01edb40>] (__vfs_read) from [<c01edccc>] (vfs_read+0x90/0x144)
[<c01edccc>] (vfs_read) from [<c01ee1d0>] (SyS_read+0x3c/0x74)
[<c01ee1d0>] (SyS_read) from [<c0106fe0>] (ret_fast_syscall+0x0/0x54)
Solution was suggested by Tony Lindgren <tony@atomide.com>.
Signed-off-by: Merlijn Wajer <merlijn@wizzup.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Corsair Strafe RGB keyboard does not respond to usb control messages
sometimes and hence generates timeouts.
Commit de3af5bf259d ("usb: quirks: add delay init quirk for Corsair
Strafe RGB keyboard") tried to fix those timeouts by adding
USB_QUIRK_DELAY_INIT.
Unfortunately, even with this quirk timeouts of usb_control_msg()
can still be seen, but with a lower frequency (approx. 1 out of 15):
[ 29.103520] usb 1-8: string descriptor 0 read error: -110
[ 34.363097] usb 1-8: can't set config #1, error -110
Adding further delays to different locations where usb control
messages are issued just moves the timeouts to other locations,
e.g.:
[ 35.400533] usbhid 1-8:1.0: can't add hid device: -110
[ 35.401014] usbhid: probe of 1-8:1.0 failed with error -110
The only way to reliably avoid those issues is having a pause after
each usb control message. In approx. 200 boot cycles no more timeouts
were seen.
Addionaly, keep USB_QUIRK_DELAY_INIT as it turned out to be necessary
to have the delay in hub_port_connect() after hub_port_init().
The overall boot time seems not to be influenced by these additional
delays, even on fast machines and lightweight distributions.
Fixes: de3af5bf259d ("usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard")
Cc: stable@vger.kernel.org
Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|