Age | Commit message (Collapse) | Author |
|
See Documentation/filesystems/mount_api.rst for details on new mount API
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Also rename the function from cifs_ to smb3_
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
No change to logic, just moving the enum of cifs mount parms into a header
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Continue restructuring needed for support of new mount API
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Harmonize and change all such variables to 'ctx', where possible.
No changes to actual logic.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In the negotiate protocol preauth context, the server is not required
to populate the salt (although it is done by most servers) so do
not warn on mount.
We retain the checks (warn) that the preauth context is the minimum
size and that the salt does not exceed DataLength of the SMB response.
Although we use the defaults in the case that the preauth context
response is invalid, these checks may be useful in the future
as servers add support for additional mechanisms.
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Trivial changes to clarify confusing comment about
SPNEGO blog (and also one length comparisons in negotiate
context parsing).
Suggested-by: Tom Talpey <tom@talpey.com>
Suggested-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
For the cifsacl mount option, we did not support sticky bits.
With this patch, we do support it, by setting the DELETE_CHILD perm
on the directory only for the owner user. When sticky bit is not
enabled, allow DELETE_CHILD perm for everyone.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
With the "cifsacl" mount option, the mode bits set on the file/dir
is converted to corresponding ACEs in DACL. However, only the
ALLOWED ACEs were being set for "owner" and "group" SIDs. Since
owner is a subset of group, and group is a subset of
everyone/world SID, in order to properly emulate unix perm groups,
we need to add DENIED ACEs. If we don't do that, "owner" and "group"
SIDs could get more access rights than they should. Which is what
was happening. This fixes it.
We try to keep the "preferred" order of ACEs, i.e. DENYs followed
by ALLOWs. However, for a small subset of cases we cannot
maintain the preferred order. In that case, we'll end up with the
DENY ACE for group after the ALLOW for the owner.
If owner SID == group SID, use the more restrictive
among the two perm bits and convert them to ACEs.
Also, for reverse mapping, i.e. to convert ACL to unix perm bits,
for the "others" bits, we needed to add the masked bits of the
owner and group masks to others mask.
Updated version of patch fixes a problem noted by the kernel
test robot.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Azure does not send an SPNEGO blob in the negotiate protocol response,
so we shouldn't assume that it is there when validating the location
of the first negotiate context. This avoids the potential confusing
mount warning:
CIFS: Invalid negotiate context offset
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Mounts to Azure cause an unneeded warning message in dmesg
"CIFS: VFS: parse_server_interfaces: incomplete interface info"
Azure rounds up the size (by 8 additional bytes, to a
16 byte boundary) of the structure returned on the query
of the server interfaces at mount time. This is permissible
even though different than other servers so do not log a warning
if query network interfaces response is only rounded up by 8
bytes or fewer.
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding multiple break/goto statements instead of
just letting the code fall through to the next case.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
|
|
Lockdep (on 5.10-rc) points out that we're delivering IRQs while IRQs
are not even enabled, which clearly shouldn't happen. Defer the time
event IRQ delivery until they actually are enabled.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
If the sigio workaround needed to be applied to a file descriptor,
set_irq_wake() wouldn't work for it since it would get polled by
the thread instead of causing SIGIO, and thus could never really
cause a wakeup, since the thread notification FD wasn't marked as
being able to wake up the system.
Fix this by marking the thread's notification FD explicitly as a
wake source FD, i.e. not suppressing SIGIO for it in suspend. In
order to not cause spurious wakeups, we then need to remove all
FDs that shouldn't wake up the system from the polling thread. In
order to do this, add unlocked versions of ignore_sigio_fd() and
add_sigio_fd() (nothing else is happening in suspend, so this is
fine), and also modify ignore_sigio_fd() to return -ENOENT if the
FD wasn't originally in there. This doesn't matter because nothing
else currently checks the return value, but the irq code needs to
know which ones to restore the workaround for.
All told, this lets us use a timerfd for the RTC clock in the next
patch, which doesn't send SIGIO.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Due a bug - we never checked the time_travel_ext_free_until value - we
were always requesting time for every single scheduling. This adds up
since we make reading time cost 256ns, and it's a fairly common call.
Fix this.
While at it, also make reading time only cost something when we're not
currently waiting for our scheduling turn - otherwise things get mixed
up in a very confusing way. We should never get here, since we're not
actually running, but it's possible if you stick printk() or such into
the virtio code that must handle the external interrupts.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
xterm serial channel was leaking a fd used in setting up the
port helper
This bug is prehistoric - it predates switching to git. The "fixes"
header here is really just to mark all the versions we would like this to
apply to which is "Anything from the Cretaceous period onwards".
No dinosaurs were harmed in fixing this bug.
Fixes: b40997b872cd ("um: drivers/xterm.c: fix a file descriptor leak")
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Fix a logical error in tty reading. We get 0 and errno == EAGAIN
on the first attempt to read from a closed file descriptor.
Compared to that a true EAGAIN is EAGAIN and -1.
If we check errno for EAGAIN first, before checking the return
value we miss the fact that the descriptor is closed.
This bug is as old as the driver. It was not showing up with
the original POLL based IRQ controller, because it was
producing multiple events. Switching to EPOLL unmasked it.
Fixes: ff6a17989c08 ("Epoll based IRQ controller")
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Ensure that file closes, connection closes, etc are propagated
as interrupts in the interrupt controller.
Fixes: ff6a17989c08 ("Epoll based IRQ controller")
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
We've been running into stack overflows in helper threads
corrupting memory (e.g. because somebody put printf() or
os_info() there), so to avoid those causing hard-to-debug
issues later on, allocate a guard page for helper thread
stacks and mark it read-only.
Unfortunately, the crash dump at that point is useless as
the stack tracer will try to backtrace the *kernel* thread,
not the helper thread, but at least we don't survive to a
random issue caused by corruption.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
For now, only support set_memory_ro()/rw() which we need for
the stack protection in the next patch.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
If there is some kind of interrupt negotation or such then
it may happen that we send an update message multiple times,
avoid that in the interest of efficiency by storing the last
transmitted value and only sending a new update if it's not
the same as the last update.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
UML userspace fetches siginfo and passes it to signal handlers
in UML. This is needed only for some of the signals, because
key handlers like SIGIO make no use of this variable.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
With all the previous bits in place, we can now also support
suspend to RAM, in the sense that everything is suspended,
not just most, including userspace, processes like in s2idle.
Since um_idle_sleep() now waits forever, we can simply call
that to "suspend" the system.
As before, you can wake it up using SIGUSR1 since we're just
in a pause() call that only needs to return.
In order to implement selective resume from certain devices,
and not have any arbitrary device interrupt wake up, suspend
interrupts by removing SIGIO notification (O_ASYNC) from all
the FDs that are not supposed to wake up the system. However,
swap out the handler so we don't actually handle the SIGIO as
an interrupt.
Since we're in pause(), the mere act of receiving SIGIO wakes
us up, and then after things have been restored enough, re-set
O_ASYNC for all previously suspended FDs, reinstall the proper
SIGIO handler, and send SIGIO to self to process anything that
might now be pending.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
In order to be able to experiment with suspend in UML, add the
minimal work to be able to suspend (s2idle) an instance of UML,
and be able to wake it back up from that state with the USR1
signal sent to the main UML process.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
In time-travel mode, we've relied on read_persistent_clock64()
being called only once at system startup, but this is both the
right thing to call from the pseudo-RTC, and also gets called
by the timekeeping core during suspend/resume.
Thus, fix this to always fall make use of the time_travel_time
in any time-travel mode, initializing time_travel_start at boot
to the right value depending on the time-travel mode.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
There really is no reason to pass the amount of time we should
sleep, especially since it's just hard-coded to one second.
Additionally, one second isn't really all that long, and as we
are expecting to be woken up by a signal, we can sleep longer
and avoid doing some work every second, so replace the current
clock_nanosleep() with just an empty select() that can _only_
be woken up by a signal.
We can also remove the deliver_alarm() since we don't need to
do that when we got e.g. SIGIO that woke us up, and if we got
SIGALRM the signal handler will actually (have) run, so it's
just unnecessary extra work.
Similarly, in time-travel mode, just program the wakeup event
from idle to be S64_MAX, which is basically the most you could
ever simulate to. Of course, you should already have an event
in the list that's earlier and will cause a wakeup, normally
that's the regular timer interrupt, though in suspend it may
(later) also be an RTC event. Since actually getting to this
point would be a bug and you can't ever get out again, panic()
on it in the time control code.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Reduce dynamic allocations (and thereby cache misses) by simply
embedding the registration data for IRQs in the irq_entry, we
never supported these being really dynamic anyway as only one
was ever allowed ("Trying to reregister ...").
Lockless behaviour is preserved by removing the FD from the poll
set appropriately, but we use reg->events to indicate whether or
not this entry is used, rather than dynamically allocating them.
Also port the list of IRQ entries to list_head instead of the
current open-coded singly-linked list implementation, just for
sanity.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
We don't actually use this in um_request_irq(), so it can
never be assigned. It's also not clear what that would be
useful for, so just remove it.
This results in quite a number of cleanups, all the way to
removing the "SIGIO on close" startup check, since the data
it assigns (pty_close_sigio) is not used anymore.
While at it, also make this an enum so we get a minimum of
type checking, and remove the IRQ_NONE hack in virtio since
we now no longer have the name twice.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
We don't need an array of 4 entries to capture three and the
name 'MAX_IRQ_TYPE' really gets confusing as well. Remove it
and add a correct NUM_IRQ_TYPES, and use that correctly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
This really shouldn't be called "irq_fd" since it doesn't
carry an fd. Well, it used to, apparently, but that struct
member is unused.
Rename it to "irq_reg" since it more accurately reflects a
registered interrupt, and remove the unused 'next' and 'fd'
members from the struct as well.
While at it, also move it to the implementation, it's not
used anywhere else, and the header file is shared with the
userspace components.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
We don't use "SIGVTALRM", it's just SIGALRM. Clean up the naming.
While at it, fix the comment's grammar.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
This separates the devices, which is better for debug and for
later suspend/resume and wakeup support, since there we'll
have to separate which IRQs can wake up the system and which
cannot.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
It's cumbersome and error-prone to keep adding fixed IRQ numbers,
and for proper device wakeup support for the virtio/vhost-user
support we need to have different IRQs for each device. Even if
in theory two IRQs (with and without wake) might be sufficient,
it's much easier to reason about it when we have dynamic number
assignment. It also makes it easier to add new devices that may
dynamically exist or depending on the configuration, etc.
Add support for this, up to 64 IRQs (the same limit as epoll FDs
we have right now). Since it's not easy to port all the existing
places to dynamic allocation (some data is statically initialized)
keep the low numbers are reserved for the existing hard-coded IRQ
numbers.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
If we run out of space, return an error instead of 0.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Adds the ability to set the UBD device serial number from the
commandline, disabling the serial number functionality by default.
In some cases it may be useful to set a serial to the UBD device, such
that downstream users (i.e. udev) can use this information to better
describe the hardware to the user from the UML cmdline. In our case we
use this parameter to create some entries under /dev/disk/by-ubd-id/
for each of the UBD devices passed through the UML cmdline.
Signed-off-by: Christopher Obbard <chris.obbard@collabora.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The signal.c can't use heap for bit data located on stack. However,
by default a compiler warns us about overstepping stack frame size
threshold:
arch/um/os-Linux/signal.c: In function ‘sig_handler_common’:
arch/um/os-Linux/signal.c:51:1: warning: the frame size of 2960 bytes is larger than 2048 bytes [-Wframe-larger-than=]
51 | }
| ^
arch/um/os-Linux/signal.c: In function ‘timer_real_alarm_handler’:
arch/um/os-Linux/signal.c:95:1: warning: the frame size of 2960 bytes is larger than 2048 bytes [-Wframe-larger-than=]
95 | }
| ^
Due to above increase stack frame size threshold explicitly for signal.c
to avoid unnecessary warning.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Lockdep correctly complains that one shouldn't call um_free_irq()
with free_irq() inside under a spinlock since that will attempt
to acquire a mutex.
Rearrange the code to keep the list manipulations under the lock
while moving the actual freeing outside of it, to avoid this.
In particular, this removes the lockdep complaint at shutdown that
I was seeing with lockdep enabled.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: anton.ivanov@cambridgegreys.com
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Internally, UBD treats each physical IO segment as a separate command to
be submitted in the execution pipe. If the pipe returns a transient
error after a few segments have already been written, UBD will tell the
block layer to requeue the request, but there is no way to reclaim the
segments already submitted. When a new attempt to dispatch the request
is done, those segments already submitted will get duplicated, causing
the WARN_ON below in the best case, and potentially data corruption.
In my system, running a UML instance with 2GB of RAM and a 50M UBD disk,
I can reproduce the WARN_ON by simply running mkfs.fvat against the
disk on a freshly booted system.
There are a few ways to around this, like reducing the pressure on
the pipe by reducing the queue depth, which almost eliminates the
occurrence of the problem, increasing the pipe buffer size on the host
system, or by limiting the request to one physical segment, which causes
the block layer to submit way more requests to resolve a single
operation.
Instead, this patch modifies the format of a UBD command, such that all
segments are sent through a single element in the communication pipe,
turning the command submission atomic from the point of view of the
block layer. The new format has a variable size, depending on the
number of elements, and looks like this:
+------------+-----------+-----------+------------
| cmd_header | segment 0 | segment 1 | segment ...
+------------+-----------+-----------+------------
With this format, we push a pointer to cmd_header in the submission
pipe.
This has the advantage of reducing the memory footprint of executing a
single request, since it allow us to merge some fields in the header.
It is possible to reduce even further each segment memory footprint, by
merging bitmap_words and cow_offset, for instance, but this is not the
focus of this patch and is left as future work. One issue with the
patch is that for a big number of segments, we now perform one big
memory allocation instead of multiple small ones, but I wasn't able to
trigger any real issues or -ENOMEM because of this change, that wouldn't
be reproduced otherwise.
This was tested using fio with the verify-crc32 option, and by running
an ext4 filesystem over this UBD device.
The original WARN_ON was:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141
refcount_t: underflow; use-after-free.
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346
Stack:
6084eed0 6063dc77 00000009 6084ef60
00000000 604b8d9f 6084eee0 6063dcbc
6084ef40 6006ab8d e013d780 1c00000000
Call Trace:
[<600a0c1c>] ? printk+0x0/0x94
[<6004a888>] show_stack+0x13b/0x155
[<6063dc77>] ? dump_stack_print_info+0xdf/0xe8
[<604b8d9f>] ? refcount_warn_saturate+0x13f/0x141
[<6063dcbc>] dump_stack+0x2a/0x2c
[<6006ab8d>] __warn+0x107/0x134
[<6008da6c>] ? wake_up_process+0x17/0x19
[<60487628>] ? blk_queue_max_discard_sectors+0x0/0xd
[<6006b05f>] warn_slowpath_fmt+0xd1/0xdf
[<6006af8e>] ? warn_slowpath_fmt+0x0/0xdf
[<600acc14>] ? raw_read_seqcount_begin.constprop.0+0x0/0x15
[<600619ae>] ? os_nsecs+0x1d/0x2b
[<604b8d9f>] refcount_warn_saturate+0x13f/0x141
[<6048bc8f>] refcount_sub_and_test.constprop.0+0x2f/0x37
[<6048c8de>] blk_mq_free_request+0xf1/0x10d
[<6048ca06>] __blk_mq_end_request+0x10c/0x114
[<6005ac0f>] ubd_intr+0xb5/0x169
[<600a1a37>] __handle_irq_event_percpu+0x6b/0x17e
[<600a1b70>] handle_irq_event_percpu+0x26/0x69
[<600a1bd9>] handle_irq_event+0x26/0x34
[<600a1bb3>] ? handle_irq_event+0x0/0x34
[<600a5186>] ? unmask_irq+0x0/0x37
[<600a57e6>] handle_edge_irq+0xbc/0xd6
[<600a131a>] generic_handle_irq+0x21/0x29
[<60048f6e>] do_IRQ+0x39/0x54
[...]
---[ end trace c6e7444e55386c0f ]---
Cc: Christopher Obbard <chris.obbard@collabora.com>
Reported-by: Martyn Welch <martyn@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Tested-by: Christopher Obbard <chris.obbard@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Since the time-travel rework, basic time-travel mode hasn't worked
properly, but there's no longer a need for this WARN_ON() so just
remove it and thereby fix things.
Cc: stable@vger.kernel.org
Fixes: 4b786e24ca80 ("um: time-travel: Rewrite as an event scheduler")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
asprintf is not compatible with the existing uml memory allocation
mechanism. Its use on the "user" side of UML results in a corrupt slab
state.
Fixes: 0d4e5ac7e780 ("um: remove uses of variable length arrays")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Wire up TIF_NOTIFY_SIGNAL handling for um.
Cc: linux-um@lists.infradead.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The UML random driver creates a dummy device under the guest,
/dev/hw_random. When this file is read from the guest, the driver
reads from the host machine's /dev/random, in-turn reading from
the host kernel's entropy pool. This entropy pool could have been
filled by a hardware random number generator or just the host
kernel's internal software entropy generator.
Currently the driver does not fill the guests kernel entropy pool,
this requires a userspace tool running inside the guest (like
rng-tools) to read from the dummy device provided by this driver,
which then would fill the guest's internal entropy pool.
This all seems quite pointless when we are already reading from an
entropy pool, so this patch aims to register the device as a hwrng
device using the hwrng-core framework. This not only improves and
cleans up the driver, but also fills the guest's entropy pool
without having to resort to using extra userspace tools in the guest.
This is typically a nuisance when booting a guest: the random pool
takes a long time (~200s) to build up enough entropy since the dummy
hwrng is not used to fill the guest's pool.
This port was originally attempted by Alexander Neville "dark" (in CC,
discussion in Link), but the conversation there stalled since the
handling of -EAGAIN errors were no removed and longer handled by the
driver. This patch attempts to use the existing method of error
handling but utilises the new hwrng core.
The issue can be noticed when booting a UML guest:
[ 2.560000] random: fast init done
[ 214.000000] random: crng init done
With the patch applied, filling the pool becomes a lot quicker:
[ 2.560000] random: fast init done
[ 12.000000] random: crng init done
Cc: Alexander Neville <dark@volatile.bz>
Link: https://lore.kernel.org/lkml/20190828204609.02a7ff70@TheDarkness/
Link: https://lore.kernel.org/lkml/20190829135001.6a5ff940@TheDarkness.local/
Cc: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Signed-off-by: Christopher Obbard <chris.obbard@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.
Signed-off-by: Romain Perier <romain.perier@gmail.com>
Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
An index node can have up to c->fanout branches, all branches should be
displayed while dumping index node.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Function ubifs_dump_sleb() is defined but unused, it can be removed.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Function ubifs_dump_node() has been modified to avoid memory oob
accessing while dumping node, node length (corresponding to the
size of allocated memory for node) should be passed into all node
dumping callers.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
node_len"
This reverts commit acc5af3efa30 ("ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len")
No need to avoid memory oob in dumping for data node alone. Later, node
length will be passed into function 'ubifs_dump_node()' which replaces
all node dumping places.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
To prevent memory out-of-bounds accessing in ubifs_dump_node(), actual
dumping length should be restricted by another condition(size of memory
which is allocated for the node).
This patch handles following situations (These situations may be caused
by bit flipping due to hardware error, writing bypass ubifs, unknown
bugs in ubifs, etc.):
1. bad node_len: Dumping data according to 'ch->len' which may exceed
the size of memory allocated for node.
2. bad node content: Some kinds of node can record additional data, eg.
index node and orphan node, make sure the size of additional data
not beyond the node length.
3. node_type changes: Read data according to type A, but expected type
B, before that, node is allocated according to type B's size. Length
of type A node is greater than type B node.
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
There is a redundant return in dbg_check_nondata_nodes_order,
which will be never reached. In addition, error code should be
returned instead of zero in this branch.
Signed-off-by: Chengsong Ke <kechengsong@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|