Age | Commit message (Collapse) | Author |
|
If we received weak cache consistency data from the server, then those
attributes are up to date, and there is no reason to mark them as
dirty in the attribute cache.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Even if the change attribute is missing, it is still OK to mark the other
attributes as being up to date.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Starting with NFSv4.1, the server is able to deduce the client id from
the SEQUENCE op which means it can always figure out whether or not
the client is holding a delegation on a file that is being changed.
For that reason, RFC5661 does not require a delegation to be unconditionally
recalled on operations such as SETATTR, RENAME, or REMOVE.
Note that for now, we continue to return READ delegations since that is
still expected by the Linux knfsd server.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Ensure that when we do finally delete the file, then we return the
delegation.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Move the delegation recall out of the generic code, and into the NFSv4
specific callback.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Move the delegation return out of generic code and down into the
NFSv4 specific unlink code.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Move the delegation return out of generic code.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
The 'fmode' argument can take an FMODE_EXEC value, which we want to
filter out before comparing to the delegation type.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
If we have a RECLAIM_COMPLETE with a populated cl_lock_waitq, then
that implies that a reconnect has occurred. Since we can't expect a
CB_NOTIFY_LOCK callback at that point, just wake up the entire queue
so that all the tasks can re-poll for their locks.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
The task is expected to sleep for a while here, and it's possible that
a new EXCHANGE_ID has occurred in the interim, and we were assigned a
new clientid. Since this is a per-client list, there isn't a lot of
value in vetting the clientid on the incoming request.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
We may get a notification and lose the race to another client. Ensure
that we wait again for a notification in that case.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Pull ceph updates from Ilya Dryomov:
"The big ticket items are:
- support for rbd "fancy" striping (myself).
The striping feature bit is now fully implemented, allowing mapping
v2 images with non-default striping patterns. This completes
support for --image-format 2.
- CephFS quota support (Luis Henriques and Zheng Yan).
This set is based on the new SnapRealm code in the upcoming v13.y.z
("Mimic") release. Quota handling will be rejected on older
filesystems.
- memory usage improvements in CephFS (Chengguang Xu).
Directory specific bits have been split out of ceph_file_info and
some effort went into improving cap reservation code to avoid OOM
crashes.
Also included a bunch of assorted fixes all over the place from
Chengguang and others"
* tag 'ceph-for-4.17-rc1' of git://github.com/ceph/ceph-client: (67 commits)
ceph: quota: report root dir quota usage in statfs
ceph: quota: add counter for snaprealms with quota
ceph: quota: cache inode pointer in ceph_snap_realm
ceph: fix root quota realm check
ceph: don't check quota for snap inode
ceph: quota: update MDS when max_bytes is approaching
ceph: quota: support for ceph.quota.max_bytes
ceph: quota: don't allow cross-quota renames
ceph: quota: support for ceph.quota.max_files
ceph: quota: add initial infrastructure to support cephfs quotas
rbd: remove VLA usage
rbd: fix spelling mistake: "reregisteration" -> "reregistration"
ceph: rename function drop_leases() to a more descriptive name
ceph: fix invalid point dereference for error case in mdsc destroy
ceph: return proper bool type to caller instead of pointer
ceph: optimize memory usage
ceph: optimize mds session register
libceph, ceph: add __init attribution to init funcitons
ceph: filter out used flags when printing unused open flags
ceph: don't wait on writeback when there is no more dirty pages
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This cycle was was not something I ever want to repeat as there were
several late changes that have only now just settled.
Half of the branch up to commit d2c997c0f145 ("fs, dax: use
page->mapping to warn...") have been in -next for several releases.
The of_pmem driver and the address range scrub rework were late
arrivals, and the dax work was scaled back at the last moment.
The of_pmem driver missed a previous merge window due to an oversight.
A sense of obligation to rectify that miss is why it is included for
4.17. It has acks from PowerPC folks. Stephen reported a build failure
that only occurs when merging it with your latest tree, for now I have
fixed that up by disabling modular builds of of_pmem. A test merge
with your tree has received a build success report from the 0day robot
over 156 configs.
An initial version of the ARS rework was submitted before the merge
window. It is self contained to libnvdimm, a net code reduction, and
passing all unit tests.
The filesystem-dax changes are based on the wait_var_event()
functionality from tip/sched/core. However, late review feedback
showed that those changes regressed truncate performance to a large
degree. The branch was rewound to drop the truncate behavior change
and now only includes preparation patches and cleanups (with full acks
and reviews). The finalization of this dax-dma-vs-trnucate work will
need to wait for 4.18.
Summary:
- A rework of the filesytem-dax implementation provides for detection
of unmap operations (truncate / hole punch) colliding with
in-progress device-DMA. A fix for these collisions remains a
work-in-progress pending resolution of truncate latency and
starvation regressions.
- The of_pmem driver expands the users of libnvdimm outside of x86
and ACPI to describe an implementation of persistent memory on
PowerPC with Open Firmware / Device tree.
- Address Range Scrub (ARS) handling is completely rewritten to
account for the fact that ARS may run for 100s of seconds and there
is no platform defined way to cancel it. ARS will now no longer
block namespace initialization.
- The NVDIMM Namespace Label implementation is updated to handle
label areas as small as 1K, down from 128K.
- Miscellaneous cleanups and updates to unit test infrastructure"
* tag 'libnvdimm-for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (39 commits)
libnvdimm, of_pmem: workaround OF_NUMA=n build error
nfit, address-range-scrub: add module option to skip initial ars
nfit, address-range-scrub: rework and simplify ARS state machine
nfit, address-range-scrub: determine one platform max_ars value
powerpc/powernv: Create platform devs for nvdimm buses
doc/devicetree: Persistent memory region bindings
libnvdimm: Add device-tree based driver
libnvdimm: Add of_node to region and bus descriptors
libnvdimm, region: quiet region probe
libnvdimm, namespace: use a safe lookup for dimm device name
libnvdimm, dimm: fix dpa reservation vs uninitialized label area
libnvdimm, testing: update the default smart ctrl_temperature
libnvdimm, testing: Add emulation for smart injection commands
nfit, address-range-scrub: introduce nfit_spa->ars_state
libnvdimm: add an api to cast a 'struct nd_region' to its 'struct device'
nfit, address-range-scrub: fix scrub in-progress reporting
dax, dm: allow device-mapper to operate without dax support
dax: introduce CONFIG_DAX_DRIVER
fs, dax: use page->mapping to warn if truncate collides with a busy page
ext2, dax: introduce ext2_dax_aops
...
|
|
In xfs_itruncate_extents, only cancel cow blocks and clear the reflink
flag if we were asked to truncate the data fork. Attr fork blocks
cannot be shared, so this makes no sense.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Processes like ld that do lots of small writes that aren't necessarily
contiguous result in a lot of small StoreData operations to the server, the
idea being that if someone else changes the data on the server, we only
write our changes over that and not the space between. Further, we don't
want to write back empty space if we can avoid it to make it easier for the
server to do sparse files.
However, making lots of tiny RPC ops is a lot less efficient for the server
than one big one because each op requires allocation of resources and the
taking of locks, so we want to compromise a bit.
Reduce the load by the following:
(1) If a file is just created locally or has just been truncated with
O_TRUNC locally, allow subsequent writes to the file to be merged with
intervening space if that space doesn't cross an entire intervening
page.
(2) Don't flush the file on ->flush() but rather on ->release() if the
file was open for writing.
Just linking vmlinux.o, without this patch, looking in /proc/fs/afs/stats:
file-wr : n=441 nb=513581204
and after the patch:
file-wr : n=62 nb=513668555
there were 379 fewer StoreData RPC operations at the expense of an extra
87K being written.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Add statistics to /proc/fs/afs/stats for data transfer RPC operations. New
lines are added that look like:
file-rd : n=55794 nb=10252282150
file-wr : n=9789 nb=3247763645
where n= indicates the number of ops completed and nb= indicates the number
of bytes successfully transferred. file-rd is the counts for read/fetch
operations and file-wr the counts for write/store operations.
Note that directory and symlink downloading are included in the file-rd
stats at the moment.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Trace protocol errors detected in afs.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Locally edit the contents of an AFS directory upon a successful inode
operation that modifies that directory (such as mkdir, create and unlink)
so that we can avoid the current practice of re-downloading the directory
after each change.
This is viable provided that the directory version number we get back from
the modifying RPC op is exactly incremented by 1 from what we had
previously. The data in the directory contents is in a defined format that
we have to parse locally to perform lookups and readdir, so modifying isn't
a problem.
If the edit fails, we just clear the VALID flag on the directory and it
will be reloaded next time it is needed.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Adjust the AFS directory XDR structures in a number of superficial ways:
(1) Rename them to all begin afs_xdr_.
(2) Use u8 instead of uint8_t.
(3) Mark the structures as __packed so they don't get rearranged by the
compiler.
(4) Rename the hdr member of afs_xdr_dir_block to meta.
(5) Rename the pagehdr member of afs_xdr_dir_block to hdr.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Split the directory content definitions into a header file so that they can
be used by multiple .c files.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
AFS directories are structured blobs that are downloaded just like files
and then parsed by the lookup and readdir code and, as such, are currently
handled in the pagecache like any other file, with the entire directory
content being thrown away each time the directory changes.
However, since the blob is a known structure and since the data version
counter on a directory increases by exactly one for each change committed
to that directory, we can actually edit the directory locally rather than
fetching it from the server after each locally-induced change.
What we can't do, though, is mix data from the server and data from the
client since the server is technically at liberty to rearrange or compress
a directory if it sees fit, provided it updates the data version number
when it does so and breaks the callback (ie. sends a notification).
Further, lookup with lookup-ahead, readdir and, when it arrives, local
editing are likely want to scan the whole of a directory.
So directory handling needs to be improved to maintain the coherency of the
directory blob prior to permitting local directory editing.
To this end:
(1) If any directory page gets discarded, invalidate and reread the entire
directory.
(2) If readpage notes that if when it fetches a single page that the
version number has changed, the entire directory is flagged for
invalidation.
(3) Read as much of the directory in one go as we can.
Note that this removes local caching of directories in fscache for the
moment as we can't pass the pages to fscache_read_or_alloc_pages() since
page->lru is in use by the LRU.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Split the AFS dynamic root stuff out of the main directory handling file
and into its own file as they share little in common.
The dynamic root code also gets its own dentry and inode ops tables.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Each afs dentry is tagged with the version that the parent directory was at
last time it was validated and, currently, if this differs, the directory
is scanned and the dentry is refreshed.
However, this leads to an excessive amount of revalidation on directories
that get modified on the client without conflict with another client. We
know there's no conflict because the parent directory's data version number
got incremented by exactly 1 on any create, mkdir, unlink, etc., therefore
we can trust the current state of the unaffected dentries when we perform a
local directory modification.
Optimise by keeping track of the last version of the parent directory that
was changed outside of the client in the parent directory's vnode and using
that to validate the dentries rather than the current version.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Rearrange the AFSFetchStatus to inode attribute mapping code in a number of
ways:
(1) Use an XDR structure rather than a series of incremented pointer
accesses when decoding an AFSFetchStatus object. This allows
out-of-order decode.
(2) Don't store the if_version value but rather just check it and abort if
it's not something we can handle.
(3) Store the owner and group in the status record as raw values rather
than converting them to kuid/kgid. Do that when they're mapped into
i_uid/i_gid.
(4) Validate the type and abort code up front and abort if they're wrong.
(5) Split the inode attribute setting out into its own function from the
XDR decode of an AFSFetchStatus object. This allows it to be called
from elsewhere too.
(6) Differentiate changes to data from changes to metadata.
(7) Use the split-out attribute mapping function from afs_iget().
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Store the data version number indicated by an FS.FetchData op into the read
request structure so that it's accessible by the page reader.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
We no longer parse symlinks when we get the inode to determine if this
symlink is actually a mountpoint as we detect that by examining the mode
instead (symlinks are always 0777 and mountpoints 0644).
Access the cache after mapping the status so that we don't have to manually
set the inode size now.
Note that this may need adjusting if the disconnected operation is
implemented as the file metadata may have to be obtained from the cache.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Introduce a proc file that displays a bunch of statistics for the AFS
filesystem in the current network namespace.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Dump an AFS FileStatus record that is detected as invalid.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Implement @cell substitution handling such that if @cell is seen as a name
in a dynamic root mount, then the name of the root cell for that network
namespace will be substituted for @cell during lookup.
The substitution of @cell for the current net namespace is set by writing
the cell name to /proc/fs/afs/rootcell. The value can be obtained by
reading the file.
For example:
# mount -t afs none /kafs -o dyn
# echo grand.central.org >/proc/fs/afs/rootcell
# ls /kafs/@cell
archive/ cvs/ doc/ local/ project/ service/ software/ user/ www/
# cat /proc/fs/afs/rootcell
grand.central.org
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Implement the AFS feature by which @sys at the end of a pathname component
may be substituted for one of a list of values, typically naming the
operating system. Up to 16 alternatives may be specified and these are
tried in turn until one works. Each network namespace has[*] a separate
independent list.
Upon creation of a new network namespace, the list of values is
initialised[*] to a single OpenAFS-compatible string representing arch type
plus "_linux26". For example, on x86_64, the sysname is "amd64_linux26".
[*] Or will, once network namespace support is finalised in kAFS.
The list may be set by:
# for i in foo bar linux-x86_64; do echo $i; done >/proc/fs/afs/sysname
for which separate writes to the same fd are amalgamated and applied on
close. The LF character may be used as a separator to specify multiple
items in the same write() call.
The list may be cleared by:
# echo >/proc/fs/afs/sysname
and read by:
# cat /proc/fs/afs/sysname
foo
bar
linux-x86_64
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
When afs_lookup() is called, prospectively look up the next 50 uncached
fids also from that same directory and cache the results, rather than just
looking up the one file requested.
This allows us to use the FS.InlineBulkStatus RPC op to increase efficiency
by fetching up to 50 file statuses at a time.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
AFS cells that are added or set as the workstation cell through /proc are
pinned against removal by setting the AFS_CELL_FL_NO_GC flag on them and
taking a ref. The ref should be only taken if the flag wasn't already set.
Fix this by making it conditional.
Without this an assertion failure will occur during module removal
indicating that the refcount is too elevated.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix warnings raised by checker, including:
(*) Warnings raised by unequal comparison for the purposes of sorting,
where the endianness doesn't matter:
fs/afs/addr_list.c:246:21: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:246:30: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:248:21: warning: restricted __be32 degrades to integer
fs/afs/addr_list.c:248:49: warning: restricted __be32 degrades to integer
fs/afs/addr_list.c:283:21: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:283:30: warning: restricted __be16 degrades to integer
(*) afs_set_cb_interest() is not actually used and can be removed.
(*) afs_cell_gc_delay() should be provided with a sysctl.
(*) afs_cell_destroy() needs to use rcu_access_pointer() to read
cell->vl_addrs.
(*) afs_init_fs_cursor() should be static.
(*) struct afs_vnode::permit_cache needs to be marked __rcu.
(*) afs_server_rcu() needs to use rcu_access_pointer().
(*) afs_destroy_server() should use rcu_access_pointer() on
server->addresses as the server object is no longer accessible.
(*) afs_find_server() casts __be16/__be32 values to int in order to
directly compare them for the purpose of finding a match in a list,
but is should also annotate the cast with __force to avoid checker
warnings.
(*) afs_check_permit() accesses vnode->permit_cache outside of the RCU
readlock, though it doesn't then access the value; the extraneous
access is deleted.
False positives:
(*) Conditional locking around the code in xdr_decode_AFSFetchStatus. This
can be dealt with in a separate patch.
fs/afs/fsclient.c:148:9: warning: context imbalance in 'xdr_decode_AFSFetchStatus' - different lock contexts for basic block
(*) Incorrect handling of seq-retry lock context balance:
fs/afs/inode.c:455:38: warning: context imbalance in 'afs_getattr' - different
lock contexts for basic block
fs/afs/server.c:52:17: warning: context imbalance in 'afs_find_server' - different lock contexts for basic block
fs/afs/server.c:128:17: warning: context imbalance in 'afs_find_server_by_uuid' - different lock contexts for basic block
Errors:
(*) afs_lookup_cell_rcu() needs to break out of the seq-retry loop, not go
round again if it successfully found the workstation cell.
(*) Fix UUID decode in afs_deliver_cb_probe_uuid().
(*) afs_cache_permit() has a missing rcu_read_unlock() before one of the
jumps to the someone_else_changed_it label. Move the unlock to after
the label.
(*) afs_vl_get_addrs_u() is using ntohl() rather than htonl() when
encoding to XDR.
(*) afs_deliver_yfsvl_get_endpoints() is using htonl() rather than ntohl()
when decoding from XDR.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs namei updates from Al Viro:
- make lookup_one_len() safe with parent locked only shared(incoming
afs series wants that)
- fix of getname_kernel() regression from 2015 (-stable fodder, that
one).
* 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
getname_kernel() needs to make sure that ->name != ->iname in long case
make lookup_one_len() safe to use with directory locked shared
new helper: __lookup_slow()
merge common parts of lookup_one_len{,_unlocked} into common helper
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"Fixes and cleanups:
- Documentation cleanups
- removal of unused code
- make some structs static
- implement Orangefs vm_operations fault callout
- eliminate two single-use functions and put their cleaned up code in
line.
- replace a vmalloc/memset instance with vzalloc
- fix a race condition bug in wait code"
* tag 'for-linus-4.17-ofs' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
Orangefs: documentation updates
orangefs: document package install and xfstests procedure
orangefs: remove unused code
orangefs: make several *_operations structs static
orangefs: implement vm_ops->fault
orangefs: open code short single-use functions
orangefs: replace vmalloc and memset with vzalloc
orangefs: bug fix for a race condition when getting a slot
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore fix from Kees Cook:
"Fix another compression Kconfig combination missed in testing (Tobias
Regnery)"
* tag 'pstore-v4.17-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: fix crypto dependencies without compression
|
|
|
|
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
The filestreams allocator stores an xfs_fstrm_item structure in the MRU to
cache inode number to agno mappings for a particular length of time. Each
xfs_fstrm_item contains the internal MRU structure, an inode pointer and
agno value.
The inode pointer stored in the xfs_fstrm_item is not referenced, however,
which means the inode itself can be removed and reclaimed before the MRU
item is freed. If this occurs, xfs_fstrm_free_func() can access freed or
unrelated memory through xfs_fstrm_item->ip and crash.
The obvious solution is to grab an inode reference for xfs_fstrm_item.
The filestream mechanism only actually uses the inode pointer as a means
to access the xfs_mount, however. Rather than add unnecessary
complexity, simplify the implementation to store an xfs_mount pointer in
struct xfs_mru_cache, and pass it to the free callback. This also
requires updates to the tracepoint class to provide the associated data
via parameters rather than the inode and a minor hack to peek at the MRU
key to establish the inode number at free time.
Based on debugging work and an earlier patch from Brian Foster, who
also wrote most of this changelog.
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
dquot_init() is never called in atomic context.
This function is only set as a parameter of fs_initcall().
Despite never getting called from atomic context,
dquot_init() calls __get_free_pages() with GFP_ATOMIC,
which waits busily for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
to avoid busy waiting and improve the possibility of sucessful allocation.
This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
When event on child inodes are sent to the parent inode mark and
parent inode mark was not marked with FAN_EVENT_ON_CHILD, the event
will not be delivered to the listener process. However, if the same
process also has a mount mark, the event to the parent inode will be
delivered regadless of the mount mark mask.
This behavior is incorrect in the case where the mount mark mask does
not contain the specific event type. For example, the process adds
a mark on a directory with mask FAN_MODIFY (without FAN_EVENT_ON_CHILD)
and a mount mark with mask FAN_CLOSE_NOWRITE (without FAN_ONDIR).
A modify event on a file inside that directory (and inside that mount)
should not create a FAN_MODIFY event, because neither of the marks
requested to get that event on the file.
Fixes: 1968f5eed54c ("fanotify: use both marks when possible")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
missed it in "kill struct filename.separate" several years ago.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull integrity updates from James Morris:
"A mixture of bug fixes, code cleanup, and continues to close
IMA-measurement, IMA-appraisal, and IMA-audit gaps.
Also note the addition of a new cred_getsecid LSM hook by Matthew
Garrett:
For IMA purposes, we want to be able to obtain the prepared secid
in the bprm structure before the credentials are committed. Add a
cred_getsecid hook that makes this possible.
which is used by a new CREDS_CHECK target in IMA:
In ima_bprm_check(), check with both the existing process
credentials and the credentials that will be committed when the new
process is started. This will not change behaviour unless the
system policy is extended to include CREDS_CHECK targets -
BPRM_CHECK will continue to check the same credentials that it did
previously"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
ima: Fallback to the builtin hash algorithm
ima: Add smackfs to the default appraise/measure list
evm: check for remount ro in progress before writing
ima: Improvements in ima_appraise_measurement()
ima: Simplify ima_eventsig_init()
integrity: Remove unused macro IMA_ACTION_RULE_FLAGS
ima: drop vla in ima_audit_measurement()
ima: Fix Kconfig to select TPM 2.0 CRB interface
evm: Constify *integrity_status_msg[]
evm: Move evm_hmac and evm_hash from evm_main.c to evm_crypto.c
fuse: define the filesystem as untrusted
ima: fail signature verification based on policy
ima: clear IMA_HASH
ima: re-evaluate files on privileged mounted filesystems
ima: fail file signature verification on non-init mounted filesystems
IMA: Support using new creds in appraisal policy
security: Add a cred_getsecid hook
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull general security layer updates from James Morris:
- Convert security hooks from list to hlist, a nice cleanup, saving
about 50% of space, from Sargun Dhillon.
- Only pass the cred, not the secid, to kill_pid_info_as_cred and
security_task_kill (as the secid can be determined from the cred),
from Stephen Smalley.
- Close a potential race in kernel_read_file(), by making the file
unwritable before calling the LSM check (vs after), from Kees Cook.
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: convert security hooks to use hlist
exec: Set file unwritable before LSM check
usb, signal, security: only pass the cred, not the secid, to kill_pid_info_as_cred and security_task_kill
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull fscache updates from David Howells:
"Three patches that fix some of AFS's usage of fscache:
(1) Need to invalidate the cache if a foreign data change is detected
on the server.
(2) Move the vnode ID uniquifier (equivalent to i_generation) from
the auxiliary data to the index key to prevent a race between
file delete and a subsequent file create seeing the same index
key.
(3) Need to retire cookies that correspond to files that we think got
deleted on the server.
Four patches to fix some things in fscache and cachefiles:
(4) Fix a couple of checker warnings.
(5) Correctly indicate to the end-of-operation callback whether an
operation completed or was cancelled.
(6) Add a check for multiple cookie relinquishment.
(7) Fix a path through the asynchronous write that doesn't wake up a
waiter for a page if the cache decides not to write that page,
but discards it instead.
A couple of patches to add tracepoints to fscache and cachefiles:
(8) Add tracepoints for cookie operators, object state machine
execution, cachefiles object management and cachefiles VFS
operations.
(9) Add tracepoints for fscache operation management and page
wrangling.
And then three development patches:
(10) Attach the index key and auxiliary data to the cookie, pass this
information through various fscache-netfs API functions and get
rid of the callbacks to the netfs to get it.
This means that the cache can get at this information, even if
the netfs goes away. It also means that the cache can be lazy in
updating the coherency data.
(11) Pass the object data size through various fscache-netfs API
rather than calling back to the netfs for it, and store the value
in the object.
This makes it easier to correctly resize the object, as the size
is updated on writes to the cache, rather than calling back out
to the netfs.
(12) Maintain a catalogue of allocated cookies. This makes it possible
to catch cookie collision up front rather than down in the bowels
of the cache being run from a service thread from the object
state machine.
This will also make it possible in the future to reconnect to a
cookie that's not gone dead yet because it's waiting for
finalisation of the storage and also make it possible to bring
cookies online if the cache is added after the cookie has been
obtained"
* tag 'fscache-next-20180406' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
fscache: Maintain a catalogue of allocated cookies
fscache: Pass object size in rather than calling back for it
fscache: Attach the index key and aux data to the cookie
fscache: Add more tracepoints
fscache: Add tracepoints
fscache: Fix hanging wait on page discarded by writeback
fscache: Detect multiple relinquishment of a cookie
fscache: Pass the correct cancelled indications to fscache_op_complete()
fscache, cachefiles: Fix checker warnings
afs: Be more aggressive in retiring cached vnodes
afs: Use the vnode ID uniquifier in the cache key not the aux data
afs: Invalidate cache on server data change
|
|
Commit 58eb5b670747 ("pstore: fix crypto dependencies") fixed up the crypto
dependencies but missed the case when no compression is selected.
With CONFIG_PSTORE=y, CONFIG_PSTORE_COMPRESS=n and CONFIG_CRYPTO=m we see
the following link error:
fs/pstore/platform.o: In function `pstore_register':
(.text+0x1b1): undefined reference to `crypto_has_alg'
(.text+0x205): undefined reference to `crypto_alloc_base'
fs/pstore/platform.o: In function `pstore_unregister':
(.text+0x3b0): undefined reference to `crypto_destroy_tfm'
Fix this by checking at compile-time if CONFIG_PSTORE_COMPRESS is enabled.
Fixes: 58eb5b670747 ("pstore: fix crypto dependencies")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
"We didn't have anything to send for v4.16, but we're back with a
little more than usual for v4.17.
Eleven patches in total, most fall into the small fix category, but
there are three non-trivial changes worth calling out:
- the audit entry filter is being removed after deprecating it for
quite a while (years of no one really using it because it turns out
to be not very practical)
- created our own version of "__mutex_owner()" because the locking
folks were upset we were using theirs
- improved our handling of kernel command line parameters to make
them more forgiving
- we fixed auditing of symlink operations
Everything passes the audit-testsuite and as of a few minutes ago it
merges well with your tree"
* tag 'audit-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: add refused symlink to audit_names
audit: remove path param from link denied function
audit: link denied should not directly generate PATH record
audit: make ANOM_LINK obey audit_enabled and audit_dummy_context
audit: do not panic on invalid boot parameter
audit: track the owner of the command mutex ourselves
audit: return on memory error to avoid null pointer dereference
audit: bail before bug check if audit disabled
audit: deprecate the AUDIT_FILTER_ENTRY filter
audit: session ID should not set arch quick field pointer
audit: update bugtracker and source URIs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore updates from Kees Cook:
"This cycle was almost entirely improvements to the pstore compression
options, noted below:
- Add lz4hc and 842 to pstore compression options (Geliang Tang)
- Refactor to use crypto compression API (Geliang Tang)
- Fix up Kconfig dependencies for compression (Arnd Bergmann)
- Allow for run-time compression selection
- Remove stack VLA usage"
* tag 'pstore-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: fix crypto dependencies
pstore: Use crypto compress API
pstore/ram: Do not use stack VLA for parity workspace
pstore: Select compression at runtime
pstore: Avoid size casts for 842 compression
pstore: Add lz4hc and 842 compression support
|