Age | Commit message (Collapse) | Author |
|
When userspace use RTM_GETROUTE to dump route table, with an already
expired route entry, we always got an 'expires' value(2147157)
calculated base on INT_MAX.
The reason of this problem is in the following satement:
rt->dst.expires - jiffies < INT_MAX
gcc promoted the type of both sides of '<' to unsigned long, thus
a small negative value would be considered greater than INT_MAX.
With the help of Eric Dumazet, do the out of bound checks in
rtnl_put_cacheinfo(), _after_ conversion to clock_t.
Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The test for the fillempty condition was wrong in one place.
Changed the variable to the right boolean type.
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
the driver sees wether the dev_seeq pointer is having a error that can be
read by using the PTR_ERR, and returns it at error case, other wise 0 at
success case.
the PTR_RET does the same thing, and use PTR_RET instead of redoing the
code of PTR_RET
Signed-off-by: Devendra Naga <develkernel412222@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
casting the void pointer is redundant (Documentation/CodingStyle)
Signed-off-by: Devendra Naga <develkernel412222@gmail.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The first parameter struct trie *t is not used anymore.
Remove it.
Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It should print size of struct rt_trie_node * allocated instead of size
of struct rt_trie_node.
Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fills the net_device vlan_features with the proper hardware features,
thus, improving the vlan interface performance.
With the patch applied, I can see around 148% improvement on a TCP_STREAM test,
from 3.5 Gb/s to 8.7 Gb/s. On TCP_RR, I see a 11% improvement, from 18k
to 20. The CPU utilization is almost the same on both cases, from the comparison
above.
Signed-off-by: Breno Leitao <brenohl@br.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
OK, what we have so far is e.g.
setxattr(path, name, whatever, 0, XATTR_REPLACE)
with name being good enough to get through xattr_permission().
Then we reach security_inode_setxattr() with the desired value and size.
Aha. name should begin with "security.selinux", or we won't get that
far in selinux_inode_setxattr(). Suppose we got there and have enough
permissions to relabel that sucker. We call security_context_to_sid()
with value == NULL, size == 0. OK, we want ss_initialized to be non-zero.
I.e. after everything had been set up and running. No problem...
We do 1-byte kmalloc(), zero-length memcpy() (which doesn't oops, even
thought the source is NULL) and put a NUL there. I.e. form an empty
string. string_to_context_struct() is called and looks for the first
':' in there. Not found, -EINVAL we get. OK, security_context_to_sid_core()
has rc == -EINVAL, force == 0, so it silently returns -EINVAL.
All it takes now is not having CAP_MAC_ADMIN and we are fucked.
All right, it might be a different bug (modulo strange code quoted in the
report), but it's real. Easily fixed, AFAICS:
Deal with size == 0, value == NULL case in selinux_inode_setxattr()
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Dave Jones <davej@redhat.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
Prepare second set of changes for 3.6 merge window.
|
|
linux/key-type.h needs to #include linux/errno.h as it refers to ENOKEY.
Without this, with sparc's allmodconfig in one of my test trees, the following
error occurs:
include/linux/key-type.h: In function 'key_negate_and_link':
include/linux/key-type.h:122:43: error: 'ENOKEY' undeclared (first use in this function)
include/linux/key-type.h:122:43: note: each undeclared identifier is reported only once for each fun
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
Consider the input case of a rule that consists entirely of non space
symbols followed by a \0. Say 64 + \0
In this case strlen(data) = 64
kzalloc of subject and object are 64 byte objects
sscanfdata, "%s %s %s", subject, ...)
will put 65 bytes into subject.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
This got renamed and clarified, but let's not break any userspace out there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
This patch adds support for the new VIRTIO_BLK_F_CONFIG_WCE feature,
which exposes the cache mode in the configuration space and lets the
driver modify it. The cache mode is exposed via sysfs.
Even if the host does not support the new feature, the cache mode is
visible (thanks to the existing VIRTIO_BLK_F_WCE), but not modifiable.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Block layer will allocate a spinlock for the queue if the driver does
not provide one in blk_init_queue().
The reason to use the internal spinlock is that blk_cleanup_queue() will
switch to use the internal spinlock in the cleanup code path.
if (q->queue_lock != &q->__queue_lock)
q->queue_lock = &q->__queue_lock;
However, processes which are in D state might have taken the driver
provided spinlock, when the processes wake up, they would release the
block provided spinlock.
=====================================
[ BUG: bad unlock balance detected! ]
3.4.0-rc7+ #238 Not tainted
-------------------------------------
fio/3587 is trying to release lock (&(&q->__queue_lock)->rlock) at:
[<ffffffff813274d2>] blk_queue_bio+0x2a2/0x380
but there are no more locks to release!
other info that might help us debug this:
1 lock held by fio/3587:
#0: (&(&vblk->lock)->rlock){......}, at:
[<ffffffff8132661a>] get_request_wait+0x19a/0x250
Other drivers use block layer provided spinlock as well, e.g. SCSI.
Switching to the block layer provided spinlock saves a bit of memory and
does not increase lock contention. Performance test shows no real
difference is observed before and after this patch.
Changes in v2: Improve commit log as Michael suggested.
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Cc: stable@kernel.org
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
blk_cleanup_queue() will call blk_drian_queue() to drain all the
requests before queue DEAD marking. If we reset the device before
blk_cleanup_queue() the drain would fail.
1) if the queue is stopped in do_virtblk_request() because device is
full, the q->request_fn() will not be called.
blk_drain_queue() {
while(true) {
...
if (!list_empty(&q->queue_head))
__blk_run_queue(q) {
if (queue is not stoped)
q->request_fn()
}
...
}
}
Do no reset the device before blk_cleanup_queue() gives the chance to
start the queue in interrupt handler blk_done().
2) In commit b79d866c8b7014a51f611a64c40546109beaf24a, We abort requests
dispatched to driver before blk_cleanup_queue(). There is a race if
requests are dispatched to driver after the abort and before the queue
DEAD mark. To fix this, instead of aborting the requests explicitly, we
can just reset the device after after blk_cleanup_queue so that the
device can complete all the requests before queue DEAD marking in the
drain process.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Cc: stable@kernel.org
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
del_gendisk() might not return due to failing to remove the
/sys/block/vda/serial sysfs entry when another thread (udev) is
trying to read it.
virtblk_remove()
vdev->config->reset() : guest will not kick us through interrupt
del_gendisk()
device_del()
kobject_del(): got stuck, sysfs entry ref count non zero
sysfs_open_file(): user space process read /sys/block/vda/serial
sysfs_get_active() : got sysfs entry ref count
dev_attr_show()
virtblk_serial_show()
blk_execute_rq() : got stuck, interrupt is disabled
request cannot be finished
This patch fixes it by calling del_gendisk() before we disable guest's
interrupt so that the request sent in virtblk_serial_show() will be
finished and del_gendisk() will success.
This fixes another race in hot-unplug process.
It is save to call del_gendisk(vblk->disk) before
flush_work(&vblk->config_work) which might access vblk->disk, because
vblk->disk is not freed until put_disk(vblk->disk).
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Cc: stable@kernel.org
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Unregister from the hwrng interface and remove the vq before entering
the S3 or S4 states. Add the vq and re-register with hwrng on restore.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
The freeze/restore s3/s4 operations will use code that's common to the
probe and remove routines. Put the common code in separate funcitons.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
No use waiting for input from host when the module is being removed.
We're going to remove the vq in the next step anyway, so just perform
any other steps for cleanup (currently none).
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Use wait_for_completion_killable() instead of wait_for_completion() when
waiting for the host to send us entropy. Without this,
# cat /dev/hwrng
^C
just hangs.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
It's virtio-rng, not virtio-ring.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Commit 36a8f01cd24b ("IB/qib: Add congestion control agent
implementation") tries to store the value 1984 in a u8, which leads to
truncation. Fix this by making the member big enough.
This bug was detected by a smatch warning.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ramkrishna Vepa <ramkrishna.vepa@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
* devel: (33 commits)
edac i5000, i5400: fix pointer math in i5000_get_mc_regs()
edac: allow specifying the error count with fake_inject
edac: add support for Calxeda highbank L2 cache ecc
edac: add support for Calxeda highbank memory controller
edac: create top-level debugfs directory
sb_edac: properly handle error count
i7core_edac: properly handle error count
edac: edac_mc_handle_error(): add an error_count parameter
edac: remove arch-specific parameter for the error handler
amd64_edac: Don't pass driver name as an error parameter
edac_mc: check for allocation failure in edac_mc_alloc()
edac: Increase version to 3.0.0
edac_mc: Cleanup per-dimm_info debug messages
edac: Convert debugfX to edac_dbg(X,
edac: Use more normal debugging macro style
edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs
Edac: Add ABI Documentation for the new device nodes
edac: move documentation ABI to ABI/testing/sysfs-devices-edac
i7core_edac: change the mem allocation scheme to make Documentation/kobject.txt happy
edac: change the mem allocation scheme to make Documentation/kobject.txt happy
...
|
|
Linux 3.5
* tag 'v3.5': (1242 commits)
Linux 3.5
Remove SYSTEM_SUSPEND_DISK system state
kdb: Switch to nolock variants of kmsg_dump functions
printk: Implement some unlocked kmsg_dump functions
printk: Remove kdb_syslog_data
kdb: Revive dmesg command
dm raid1: set discard_zeroes_data_unsupported
dm thin: do not send discards to shared blocks
dm raid1: fix crash with mirror recovery and discard
pnfs-obj: Fix __r4w_get_page when offset is beyond i_size
pnfs-obj: don't leak objio_state if ore_write/read fails
ore: Unlock r4w pages in exact reverse order of locking
ore: Remove support of partial IO request (NFS crash)
ore: Fix NFS crash by supporting any unaligned RAID IO
UBIFS: fix a bug in empty space fix-up
cx25821: Remove bad strcpy to read-only char*
HID: hid-multitouch: add support for Zytronic panels
MIPS: PCI: Move fixups from __init to __devinit.
MIPS: Fix bug.h MIPS build regression
MIPS: sync-r4k: remove redundant irq operation
...
|
|
v2: Add the xfs_buf_lock to xfs_quiesce_attr().
Add explaination why xfs_buf_lock() is used to wait for write.
xfs_wait_buftarg() does not wait for the completion of the write of the
uncached superblock. This write can race with the shutdown of the log
and causes a panic if the write does not win the race.
During the log write, xfsaild_push() will lock the buffer and set the
XBF_ASYNC flag. Because the XBF_FLAG is set, complete() is not performed
on the buffer's iowait entry, we cannot call xfs_buf_iowait() to wait
for the write to complete. The buffer's lock is held until the write is
complete, so we can block on a xfs_buf_lock() request to be notified
that the write is complete.
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
xfsaild idle mode logic currently leads to a couple hangs:
1.) If xfsaild is rescheduled in during an incremental scan
(i.e., tout != 0) and the target has been updated since
the previous run, we can hit the new target and go into
idle mode with a still populated ail.
2.) A wake up is only issued when the target is pushed forward.
The wake up can race with xfsaild if it is currently in the
process of entering idle mode, causing future wake up
events to be lost.
These hangs have been reproduced and verified as fixed by
running xfstests 273 in a loop on a slightly modified upstream
kernel. The kernel is modified to re-enable idle mode as
previously implemented (when count == 0) and with a revert of
commit 670ce93f, which includes performance improvements that
make this harder to reproduce.
The solution, the algorithm for which has been outlined by
Dave Chinner, is to modify xfsaild to enter idle mode only when
the ail is empty and the push target has not been moved forward
since the last push.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
Content-Disposition: inline; filename=xfs-remove-iolock-classes
Now that we never take the iolock during inode reclaim we don't need
to play games with lock classes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Rich Johnston <rjohnston@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
Same rational as the last patch - these inodes are not reachable, so
don't bother with locking.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Rich Johnston <rjohnston@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
The memory regions which are passed to arm_add_memory() from
device tree blobs via early_init_dt_add_memory_arch() can
have sizes which are larger than will fit in a 32 bit integer,
so switch to using a phys_addr_t to hold them, to avoid
silently dropping the top 32 bits of the size. Similarly, use
phys_addr_t in early_mem() so that mem=size@start command line
options specifying more than 4GB behave sensibly.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
An inode that enters xfs_inactive has been removed from all global
lists but the inode hash, and can't be recycled in xfs_iget before
it has been marked reclaimable. Thus taking the iolock in here
is not nessecary at all, and given the amount of lockdep false
positives it has triggered already I'd rather remove the locking.
The only change outside of xfs_inactive is relaxing an assert in
xfs_itruncate_extents.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Rich Johnston <rjohnston@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
Remove this helper as the code flow is a lot more obvious when it gets
merged into its only caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Rich Johnston <rjohnston@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
The code to reserve log space and join the inode to the transaction is
common for all cases, so don't duplicate it. Also remove the trivial
xfs_inactive_symlink_local helper which can simply be opencode now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Rich Johnston <rjohnston@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
Refactor the AG selection loop in xfs_dialloc to operate on the in-memory
perag data as much as possible. We only read the AGI buffer once we have
selected an AG to allocate inodes now instead of for every AG considered.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
Loop over the in-core perag structures and prefer using pagi_freecount over
going out to the AGI buffer where possible.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
In this case we already have selected an AG and know it has free space
beause the buffer lock never got released. Jump directly into xfs_dialloc_ag
and short cut the AG selection loop.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
We can simplify check the IO_agbp pointer for being non-NULL instead of
passing another argument through two layers of function calls.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
Move the actual allocation once we have selected an allocation group into a
separate helper, and make xfs_dialloc a wrapper around it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
It's used both for client and server hosts; we can't do nlmclnt_release_host()
on failure exits, since the host might need nlmsvc_release_host(), with BUG_ON()
for calling the wrong one. Makes life simpler for callers, actually...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Adds audit messages for unexpected link restriction violations so that
system owners will have some sort of potentially actionable information
about misbehaving processes.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This adds symlink and hardlink restrictions to the Linux VFS.
Symlinks:
A long-standing class of security issues is the symlink-based
time-of-check-time-of-use race, most commonly seen in world-writable
directories like /tmp. The common method of exploitation of this flaw
is to cross privilege boundaries when following a given symlink (i.e. a
root process follows a symlink belonging to another user). For a likely
incomplete list of hundreds of examples across the years, please see:
http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
The solution is to permit symlinks to only be followed when outside
a sticky world-writable directory, or when the uid of the symlink and
follower match, or when the directory owner matches the symlink's owner.
Some pointers to the history of earlier discussion that I could find:
1996 Aug, Zygo Blaxell
http://marc.info/?l=bugtraq&m=87602167419830&w=2
1996 Oct, Andrew Tridgell
http://lkml.indiana.edu/hypermail/linux/kernel/9610.2/0086.html
1997 Dec, Albert D Cahalan
http://lkml.org/lkml/1997/12/16/4
2005 Feb, Lorenzo Hernández García-Hierro
http://lkml.indiana.edu/hypermail/linux/kernel/0502.0/1896.html
2010 May, Kees Cook
https://lkml.org/lkml/2010/5/30/144
Past objections and rebuttals could be summarized as:
- Violates POSIX.
- POSIX didn't consider this situation and it's not useful to follow
a broken specification at the cost of security.
- Might break unknown applications that use this feature.
- Applications that break because of the change are easy to spot and
fix. Applications that are vulnerable to symlink ToCToU by not having
the change aren't. Additionally, no applications have yet been found
that rely on this behavior.
- Applications should just use mkstemp() or O_CREATE|O_EXCL.
- True, but applications are not perfect, and new software is written
all the time that makes these mistakes; blocking this flaw at the
kernel is a single solution to the entire class of vulnerability.
- This should live in the core VFS.
- This should live in an LSM. (https://lkml.org/lkml/2010/5/31/135)
- This should live in an LSM.
- This should live in the core VFS. (https://lkml.org/lkml/2010/8/2/188)
Hardlinks:
On systems that have user-writable directories on the same partition
as system files, a long-standing class of security issues is the
hardlink-based time-of-check-time-of-use race, most commonly seen in
world-writable directories like /tmp. The common method of exploitation
of this flaw is to cross privilege boundaries when following a given
hardlink (i.e. a root process follows a hardlink created by another
user). Additionally, an issue exists where users can "pin" a potentially
vulnerable setuid/setgid file so that an administrator will not actually
upgrade a system fully.
The solution is to permit hardlinks to only be created when the user is
already the existing file's owner, or if they already have read/write
access to the existing file.
Many Linux users are surprised when they learn they can link to files
they have no access to, so this change appears to follow the doctrine
of "least surprise". Additionally, this change does not violate POSIX,
which states "the implementation may require that the calling process
has permission to access the existing file"[1].
This change is known to break some implementations of the "at" daemon,
though the version used by Fedora and Ubuntu has been fixed[2] for
a while. Otherwise, the change has been undisruptive while in use in
Ubuntu for the last 1.5 years.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/linkat.html
[2] http://anonscm.debian.org/gitweb/?p=collab-maint/at.git;a=commitdiff;h=f4114656c3a6c6f6070e315ffdf940a49eda3279
This patch is based on the patches in Openwall and grsecurity, along with
suggestions from Al Viro. I have added a sysctl to enable the protected
behavior, and documentation.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
I can reliably reproduce the following panic by simply setting an audit
rule on a recent 3.5.0+ kernel:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
IP: [<ffffffff810d1250>] audit_copy_inode+0x10/0x90
PGD 7acd9067 PUD 7b8fb067 PMD 0
Oops: 0000 [#86] SMP
Modules linked in: nfs nfs_acl auth_rpcgss fscache lockd sunrpc tpm_bios btrfs zlib_deflate libcrc32c kvm_amd kvm joydev virtio_net pcspkr i2c_piix4 floppy virtio_balloon microcode virtio_blk cirrus drm_kms_helper ttm drm i2c_core [last unloaded: scsi_wait_scan]
CPU 0
Pid: 1286, comm: abrt-dump-oops Tainted: G D 3.5.0+ #1 Bochs Bochs
RIP: 0010:[<ffffffff810d1250>] [<ffffffff810d1250>] audit_copy_inode+0x10/0x90
RSP: 0018:ffff88007aebfc38 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88003692d860 RCX: 00000000000038c4
RDX: 0000000000000000 RSI: ffff88006baf5d80 RDI: ffff88003692d860
RBP: ffff88007aebfc68 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffff880036d30f00 R14: ffff88006baf5d80 R15: ffff88003692d800
FS: 00007f7562634740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000040 CR3: 000000003643d000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process abrt-dump-oops (pid: 1286, threadinfo ffff88007aebe000, task ffff880079614530)
Stack:
ffff88007aebfdf8 ffff88007aebff28 ffff88007aebfc98 ffffffff81211358
ffff88003692d860 0000000000000000 ffff88007aebfcc8 ffffffff810d4968
ffff88007aebfcc8 ffff8800000038c4 0000000000000000 0000000000000000
Call Trace:
[<ffffffff81211358>] ? ext4_lookup+0xe8/0x160
[<ffffffff810d4968>] __audit_inode+0x118/0x2d0
[<ffffffff811955a9>] do_last+0x999/0xe80
[<ffffffff81191fe8>] ? inode_permission+0x18/0x50
[<ffffffff81171efa>] ? kmem_cache_alloc_trace+0x11a/0x130
[<ffffffff81195b4a>] path_openat+0xba/0x420
[<ffffffff81196111>] do_filp_open+0x41/0xa0
[<ffffffff811a24bd>] ? alloc_fd+0x4d/0x120
[<ffffffff811855cd>] do_sys_open+0xed/0x1c0
[<ffffffff810d40cc>] ? __audit_syscall_entry+0xcc/0x300
[<ffffffff811856c1>] sys_open+0x21/0x30
[<ffffffff81611ca9>] system_call_fastpath+0x16/0x1b
RSP <ffff88007aebfc38>
CR2: 0000000000000040
The problem is that do_last is passing a negative dentry to audit_inode.
The comments on lookup_open note that it can pass back a negative dentry
if O_CREAT is not set.
This patch fixes the oops, but I'm not clear on whether there's a better
approach.
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... only needed if it's been in descriptor table
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
sigh...
* opened files have non-NULL dentries and non-NULL inodes
* close_filp() needs current->files only if the file had been
in descriptor table.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* filp_close() needs non-NULL second argument only if it'd been in descriptor
table
* opened files have non-NULL dentries, TYVM
* ... and those dentries are positive - it's kinda hard to open a file that
doesn't exist.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
a) vfs_llseek() does *not* access userland pointers of any kind
b) neither does filp_close(), for that matter
c) ... nor filp_open()
d) vfs_read() does, but we do have a wrapper for that (kernel_read()),
so there's no need to reinvent it.
e) passing current->files to filp_close() on something that never
had been in descriptor table is pointless.
ISAGN: voodoo dolls to be used on voodoo programmers...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
What inline? Its only use is passing its address to call_rcu(), for fuck sake!
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|