Age | Commit message (Collapse) | Author |
|
Add support for displaying and modifying rx and tx ring sizes using
ethtool.
Also, increasing version to 2.3.0.45
Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since we are allowing rx ring size modification, reset fetch index
everytime. Otherwise it could have a stale value that can lead to a null
pointer dereference.
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
into drm-fixes
Just two small patches for stable to fix the driver failing to load on polaris
cards with harvested VCE or UVD blocks.
* 'drm-fixes-4.14' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: allow harvesting check for Polaris VCE
drm/amdgpu: return -ENOENT from uvd 6.0 early init for harvesting
|
|
Only MD_SB_CHANGE_PENDING should be used to wait for transition from
clean to dirty. Checking also MD_SB_CHANGE_CLEAN is unnecessary and can
race with e.g. md_do_sync(). This sporadically causes a hang when
changing consistency policy during resync:
INFO: task mdadm:6183 blocked for more than 30 seconds.
Not tainted 4.14.0-rc3+ #391
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdadm D12752 6183 6022 0x00000000
Call Trace:
__schedule+0x93f/0x990
schedule+0x6b/0x90
md_allow_write+0x100/0x130 [md_mod]
? do_wait_intr_irq+0x90/0x90
resize_stripes+0x3a/0x5b0 [raid456]
? kernfs_fop_write+0xbe/0x180
raid5_change_consistency_policy+0xa6/0x200 [raid456]
consistency_policy_store+0x2e/0x70 [md_mod]
md_attr_store+0x90/0xc0 [md_mod]
sysfs_kf_write+0x42/0x50
kernfs_fop_write+0x119/0x180
__vfs_write+0x28/0x110
? rcu_sync_lockdep_assert+0x12/0x60
? __sb_start_write+0x15a/0x1c0
? vfs_write+0xa3/0x1a0
vfs_write+0xb4/0x1a0
SyS_write+0x49/0xa0
entry_SYSCALL_64_fastpath+0x18/0xad
Fixes: 2214c260c72b ("md: don't return -EAGAIN in md_allow_write for external metadata arrays")
Cc: <stable@vger.kernel.org>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
The pointer q is assigned but never read; it is redundant and can
be removed. Cleans up clang warning:
drivers/md/md-multipath.c:260:4: warning: Value stored to 'q' is
never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
There are some lines could be removed due to recent
change for raid1 such as commit 3956df15d634 ("md:
move suspend_hi/lo handling into core md code").
Also, seems some comments are put to wrong place,
move them before wait_barrier.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Suspending the entire device for resync could take
too long. Resync in small chunks.
cluster's resync window is maintained in r10conf as
cluster_sync_low and cluster_sync_high, and processed
in raid10's sync_request(). If the current resync is
outside the cluster resync window:
1. Set the cluster_sync_low to curr_resync_completed.
2. Set cluster_sync_high to cluster_sync_low + stripe
size.
3. Send a message to all nodes so they may add it in
their suspension list.
Note:
We only support "near" raid10 so far, resync a far or
offset raid10 array could have trouble. So raid10_run
checks the layout of clustered raid10, it will refuse
to run if the layout is not correct.
With the "near" layout we process one stripe at a time
progressing monotonically through the address space.
So we can have a sliding window of whole-stripes which
moves through the array suspending IO on other nodes,
and both resync which uses array addresses and recovery
which uses device addresses can stay within this window.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
If there is a resync going on, all nodes must suspend
writes to the range. This is recorded in suspend_info
and suspend_list.
If there is an I/O within the ranges of any of the
suspend_info, area_resyncing will return 1.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Just like clustered raid1, it is impossible for cluster raid10
to choose the best device for read balance when the area of
array is resyncing. Because we cannot trust the data to be the
same on all devices at that time, so we choose just the first
one to use, so set do_balance to 0.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
lockdep_assert_held is a better way to assert lock held, and it works
for UP.
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
If freeze_array is attempted in the middle of close_sync/
wait_all_barriers, deadlock can occur.
freeze_array will wait for nr_pending and nr_queued to line up.
wait_all_barriers increments nr_pending for each barrier bucket, one
at a time, but doesn't actually issue IO that could be counted in
nr_queued. So freeze_array is blocked until wait_all_barriers
completes and allow_all_barriers runs. At the same time, when
_wait_barrier sees array_frozen == 1, it stops and waits for
freeze_array to complete.
Prevent the deadlock by making close_sync call _wait_barrier and
_allow_barrier for one bucket at a time, instead of deferring the
_allow_barrier calls until after all _wait_barriers are complete.
Signed-off-by: Nate Dailey <nate.dailey@stratus.com>
Fix: fd76863e37fe(RAID1: a new I/O barrier implementation to remove resync window)
Reviewed-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org (v4.11)
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Hi - I submit this patch for the next merge window:
Some times ago, I made a patch f9c79bc05a2a that blocks signals around the
schedule() calls in MD. The MD subsystem needs to do an uninterruptible
sleep that is not accounted in load average - so we block signals and use
interruptible sleep.
The kernel has a special TASK_IDLE state for this purpose, so we can use
it instead of blocking signals. This patch doesn't fix any bug, it just
makes the code simpler.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
The '2' argument means "wake up anything that is waiting".
This is an inelegant part of the design and was added
to help support management of suspend_lo/suspend_hi setting.
Now that suspend_lo/hi is managed in mddev_suspend/resume,
that need is gone.
These is still a couple of places where we call 'quiesce'
with an argument of '2', but they can safely be changed to
call ->quiesce(.., 1); ->quiesce(.., 0) which
achieve the same result at the small cost of pausing IO
briefly.
This removes a small "optimization" from suspend_{hi,lo}_store,
but it isn't clear that optimization served a useful purpose.
The code now is a lot clearer.
Suggested-by: Shaohua Li <shli@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
There are various deadlocks that can occur
when a thread holds reconfig_mutex and calls
->quiesce(mddev, 1).
As some write request block waiting for
metadata to be updated (e.g. to record device
failure), and as the md thread updates the metadata
while the reconfig mutex is held, holding the mutex
can stop write requests completing, and this prevents
->quiesce(mddev, 1) from completing.
->quiesce() is now usually called from mddev_suspend(),
and it is always called with reconfig_mutex held. So
at this time it is safe for the thread to update metadata
without explicitly taking the lock.
So add 2 new flags, one which says the unlocked updates is
allowed, and one which ways it is happening. Then allow it
while the quiesce completes, and then wait for it to finish.
Reported-and-tested-by: Xiao Ni <xni@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
mddev_suspend() is a more general interface than
calling ->quiesce() and is so more extensible. A
future patch will make use of this.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
responding to ->suspend_lo and ->suspend_hi is similar
to responding to ->suspended. It is best to wait in
the common core code without incrementing ->active_io.
This allows mddev_suspend()/mddev_resume() to work while
requests are waiting for suspend_lo/hi to change.
This is will be important after a subsequent patch
which uses mddev_suspend() to synchronize updating for
suspend_lo/hi.
So move the code for testing suspend_lo/hi out of raid1.c
and raid5.c, and place it in md.c
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
bitmap_create() allocates memory with GFP_KERNEL and
so can wait for IO.
If called while the array is quiesced, it could wait indefinitely
for write out to the array - deadlock.
So call bitmap_create() before quiescing the array.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Most often mddev_suspend() is called with
reconfig_mutex held. Make this a requirement in
preparation a subsequent patch. Also require
reconfig_mutex to be held for mddev_resume(),
partly for symmetry and partly to guarantee
no races with incr/decr of mddev->suspend.
Taking the mutex in r5c_disable_writeback_async() is
a little tricky as this is called from a work queue
via log->disable_writeback_work, and flush_work()
is called on that while holding ->reconfig_mutex.
If the work item hasn't run before flush_work()
is called, the work function will not be able to
get the mutex.
So we use mddev_trylock() inside the wait_event() call, and have that
abort when conf->log is set to NULL, which happens before
flush_work() is called.
We wait in mddev->sb_wait and ensure this is woken
when any of the conditions change. This requires
waking mddev->sb_wait in mddev_unlock(). This is only
like to trigger extra wake_ups of threads that needn't
be woken when metadata is being written, and that
doesn't happen often enough that the cost would be
noticeable.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Having both a bitmap and a journal is pointless.
Attempting to do so can corrupt the bitmap if the journal
replay happens before the bitmap is initialized.
Rather than try to avoid this corruption, simply
refuse to allow arrays with both a bitmap and a journal.
So:
- if raid5_run sees both are present, fail.
- if adding a bitmap finds a journal is present, fail
- if adding a journal finds a bitmap is present, fail.
Cc: stable@vger.kernel.org (4.10+)
Signed-off-by: NeilBrown <neilb@suse.com>
Tested-by: Joshua Kinard <kumba@gentoo.org>
Acked-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Fixes init failures on Polaris cards with harvested
VCE blocks.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Fixes init failures on polaris cards with harvested UVD.
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
Fixes for Stable:
- Fix KBL Blank Screen (Jani)
- Fix FIFO Underrun on SNB (Maarten)
Other fixes:
- Fix GPU Hang on i915gm (Chris)
- Fix gem_tiled_pread_pwrite IGT case (Chris)
- Cancel modeset retry work during modeset clean-up (Manasi)
* tag 'drm-intel-fixes-2017-11-01' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Check incoming alignment for unfenced buffers (on i915gm)
drm/i915: Hold rcu_read_lock when iterating over the radixtree (vma idr)
drm/i915: Hold rcu_read_lock when iterating over the radixtree (objects)
drm/i915/edp: read edp display control registers unconditionally
drm/i915: Do not rely on wm preservation for ILK watermarks
drm/i915: Cancel the modeset retry work during modeset cleanup
|
|
Supply the Smack module hooks in support of overlayfs.
Ensure that the Smack label of new files gets the correct
value when a directory is transmuting. Original implementation
by Romanini Daniele, with a few tweaks added.
Signed-off-by: Romanini Daniele <daniele.romanini@aalto.fi>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
Add an additional symbol to the decompressor image, which will allow
future debugging of non-bootable problems similar to the one encountered
with the EFI stub.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
The smp-cmp build has been (further) broken since commit 856fbcee6099
("MIPS: Store core & VP IDs in GlobalNumber-style variable") in
v4.14-rc1 like so:
arch/mips/kernel/smp-cmp.c: In function ‘cmp_init_secondary’:
arch/mips/kernel/smp-cmp.c:53:4: error: ‘struct cpuinfo_mips’ has no member named ‘vpe_id’
c->vpe_id = (read_c0_tcbind() >> TCBIND_CURVPE_SHIFT) &
^
Fix by replacing vpe_id with cpu_set_vpe_id().
Fixes: 856fbcee6099 ("MIPS: Store core & VP IDs in GlobalNumber-style variable")
Signed-off-by: James Hogan <jhogan@kernel.org>
Reviewed-by: Paul Burton <paul.burton@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/17569/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull signal bugfix from Eric Biederman:
"When making the generic support for SIGEMT conditional on the presence
of SIGEMT I made a typo that causes it to fail to activate. It was
noticed comparatively quickly but the bug report just made it to me
today"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Fix name of SIGEMT in #if defined() check
|
|
Since i2c_unregister_device() became NULL-aware we may remove duplicate
NULL check.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Since i2c_unregister_device() became NULL-aware we may remove duplicate
NULL check.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
It's a common pattern to be NULL-aware when freeing resources.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
This patch supports xgene-slimpro-i2c v2 which uses the non-cachable memory
as the PCC shared memory.
Signed-off-by: Hoan Tran <hotran@apm.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
cppcheck rightfully says:
drivers/i2c/busses/i2c-mpc.c:329: style: Variable 'node' is reassigned a value before the old one has been used.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio into i2c/for-4.15
Refactor i2c-gpio and its users to use gpiod. Done by GPIO maintainer
LinusW.
|
|
Neither of the current maintainers works for Imagination any more.
Removed both imgtec email addresses and added back mine for
occasional reviews, also changed from Maintained to Odd Fixes to
reflect the time that I will be able to spend on it.
Signed-off-by: James Hartley <james.hartley@sondrel.com>
Patchwork: https://patchwork.linux-mips.org/patch/17475/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
When task_struct was moved, this MIPS code was neglected. Evidently
nobody is using it anymore. This fixes this build error:
In file included from ./arch/mips/include/asm/thread_info.h:15:0,
from ./include/linux/thread_info.h:37,
from ./include/asm-generic/current.h:4,
from ./arch/mips/include/generated/asm/current.h:1,
from ./include/linux/sched.h:11,
from arch/mips/kernel/smp-cmp.c:22:
arch/mips/kernel/smp-cmp.c: In function ‘cmp_boot_secondary’:
./arch/mips/include/asm/processor.h:384:41: error: implicit declaration
of function ‘task_stack_page’ [-Werror=implicit-function-declaration]
#define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \
^
arch/mips/kernel/smp-cmp.c:84:21: note: in expansion of macro ‘__KSTK_TOS’
unsigned long sp = __KSTK_TOS(idle);
^~~~~~~~~~
Fixes: f3ac60671954 ("sched/headers: Move task-stack related APIs from <linux/sched.h> to <linux/sched/task_stack.h>")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: <stable@vger.kernel.org> # 4.11+
Patchwork: https://patchwork.linux-mips.org/patch/17522/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
Commit cc731525f26a ("signal: Remove kernel interal si_code magic")
added a check for SIGMET and NSIGEMT being defined. That SIGMET should
in fact be SIGEMT, with SIGEMT being defined in
arch/{alpha,mips,sparc}/include/uapi/asm/signal.h
This was actually pointed out by BenHutchings in a lwn.net comment
here https://lwn.net/Comments/734608/
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic")
Signed-off-by: Andrew Clayton <andrew@digital-domain.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Some were missed in the pass that converted the function return
values from int to bool. Update the remaining ones for consistency.
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
As we walk the attribute btree, explicitly check the structure of the
attribute leaves to make sure the pointers make sense and the freemap is
sensible.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
|
Move the error injection tag names into a libxfs header so that we can
share it between kernel and userspace.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
|
Remove xfs_inode_log_format_t now that xfs_inode_log_format is
explicitly padded and therefore is a real on-disk structure. This
enables xfs/122 to check the size of the structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
|
|
Pull block fixes from Jens Axboe:
"A few fixes that should go into this series:
- Regression fix for ide-cd, ensuring that a request is fully
initialized. From Hongxu.
- Ditto fix for virtio_blk, from Bart.
- NVMe fix from Keith, ensuring that we set the right block size on
revalidation. If the block size changed, we'd be in trouble without
it.
- NVMe rdma fix from Sagi, fixing a potential hang while the
controller is being removed"
* 'for-linus' of git://git.kernel.dk/linux-block:
ide:ide-cd: fix kernel panic resulting from missing scsi_req_init
nvme: Fix setting logical block format when revalidating
virtio_blk: Fix an SG_IO regression
nvme-rdma: fix possible hang when issuing commands during ctrl removal
|
|
Change all relevant instances of miodrag.dinic@imgtec.com
email address to miodrag.dinic@mips.com.
Signed-off-by: Miodrag Dinic <miodrag.dinic@mips.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/17515/
[jhogan@kernel.org: Fix .mailmap direction]
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
Change all relevant instances of aleksandar.markovic@imgtec.com
email address to aleksandar.markovic@mips.com.
Signed-off-by: Miodrag Dinic <miodrag.dinic@mips.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/17514/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
Next Thing Co. is the company behind the C.H.I.P. and C.H.I.P. Pro
miniature single board computers.
The "nextthing" vendor-prefix is already used for these two board as
well as their own "GR8" SoC.
Cc: Alexander Kaplan <alex@nextthing.co>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Commit 1ec9dd80bedc ("MIPS: CPS: Detect CPUs in secondary clusters")
added a check in cps_boot_secondary() that the secondary being booted is
in the same cluster as the CPU running this code. This check is
performed using current_cpu_data without disabling preemption. As such
when CONFIG_PREEMPT=y, a BUG is triggered:
[ 57.991693] BUG: using smp_processor_id() in preemptible [00000000] code: hotplug/1749
<snip>
[ 58.063077] Call Trace:
[ 58.065842] [<8040cdb4>] show_stack+0x84/0x114
[ 58.070830] [<80b11b38>] dump_stack+0xf8/0x140
[ 58.075796] [<8079b12c>] check_preemption_disabled+0xec/0x118
[ 58.082204] [<80415110>] cps_boot_secondary+0x84/0x44c
[ 58.087935] [<80413a14>] __cpu_up+0x34/0x98
[ 58.092624] [<80434240>] bringup_cpu+0x38/0x114
[ 58.097680] [<80434af0>] cpuhp_invoke_callback+0x168/0x8f0
[ 58.103801] [<804362d0>] _cpu_up+0x154/0x1c8
[ 58.108565] [<804363dc>] do_cpu_up+0x98/0xa8
[ 58.113333] [<808261f8>] device_online+0x84/0xc0
[ 58.118481] [<80826294>] online_store+0x60/0x98
[ 58.123562] [<8062261c>] kernfs_fop_write+0x158/0x1d4
[ 58.129196] [<805a2ae4>] __vfs_write+0x4c/0x168
[ 58.134247] [<805a2dc8>] vfs_write+0xe0/0x190
[ 58.139095] [<805a2fe0>] SyS_write+0x68/0xc4
[ 58.143854] [<80415d58>] syscall_common+0x34/0x58
In reality we don't currently support running the kernel on CPUs not in
cluster 0, so the answer to cpu_cluster(¤t_cpu_data) will always
be 0, even if this task being preempted and continues running on a
different CPU. Regardless, the BUG should not be triggered, so fix this
by switching to raw_current_cpu_data. When multicluster support lands
upstream this check will need removing or changing anyway.
Fixes: 1ec9dd80bedc ("MIPS: CPS: Detect CPUs in secondary clusters")
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Reviewed-by: Paul Burton <paul.burton@mips.com>
CC: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/17563/
Signed-off-by: James Hogan <jhogan@kernel.org>
|
|
insn_get_addr_ref() returns the effective address as defined by the
section 3.7.5.1 Vol 1 of the Intel 64 and IA-32 Architectures Software
Developer's Manual. In order to compute the linear address, we must add
to the effective address the segment base address as set in the segment
descriptor. The segment descriptor to use depends on the register used as
operand and segment override prefixes, if any.
In most cases, the segment base address will be 0 if the USER_DS/USER32_DS
segment is used or if segmentation is not used. However, the base address
is not necessarily zero if a user programs defines its own segments. This
is possible by using a local descriptor table.
Since the effective address is a signed quantity, the unsigned segment
base address is saved in a separate variable and added to the final,
unsigned, effective address.
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-19-git-send-email-ricardo.neri-calderon@linux.intel.com
|
|
is 101b
Section 2.2.1.3 of the Intel 64 and IA-32 Architectures Software
Developer's Manual volume 2A states that when ModRM.mod is zero and
ModRM.rm is 101b, a 32-bit displacement follows the ModRM byte. This means
that none of the registers are used in the computation of the effective
address. A return value of -EDOM indicates callers that they should not
use the value of registers when computing the effective address for the
instruction.
In long mode, the effective address is given by the 32-bit displacement
plus the location of the next instruction. In protected mode, only the
displacement is used.
The instruction decoder takes care of obtaining the displacement.
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-18-git-send-email-ricardo.neri-calderon@linux.intel.com
|
|
Obtain the default values of the address and operand sizes as specified in
the D and L bits of the the segment descriptor selected by the register
CS. The function can be used for both protected and long modes.
For virtual-8086 mode, the default address and operand sizes are always 2
bytes.
The returned parameters are encoded in a signed 8-bit data type. Auxiliar
macros are provided to encode and decode such values.
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-17-git-send-email-ricardo.neri-calderon@linux.intel.com
|
|
and limit
With segmentation, the base address of the segment is needed to compute a
linear address. This base address is obtained from the applicable segment
descriptor. Such segment descriptor is referenced from a segment selector.
These new functions obtain the segment base and limit of the segment
selector indicated by segment register index given as argument. This index
is any of the INAT_SEG_REG_* family of #define's.
The logic to obtain the segment selector is wrapped in the function
get_segment_selector() with the inputs described above. Once the selector
is known, the base address is determined. In protected mode, the selector
is used to obtain the segment descriptor and then its base address. In
long mode, the segment base address is zero except when FS or GS are used.
In virtual-8086 mode, the base address is computed as the value of the
segment selector shifted 4 positions to the left.
In protected mode, segment limits are enforced. Thus, a function to
determine the limit of the segment is added. Segment limits are not
enforced in long or virtual-8086. For the latter, addresses are limited
to 20 bits; address size will be handled when computing the linear
address.
Improvements-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: ricardo.neri@intel.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/1509135945-13762-16-git-send-email-ricardo.neri-calderon@linux.intel.com
|