Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- | Documentation/filesystems/9p.rst        | 10
-rw-r--r-- | Documentation/filesystems/ceph.rst      |  6
-rw-r--r-- | Documentation/filesystems/orangefs.rst  | 34
-rw-r--r-- | Documentation/filesystems/overlayfs.rst | 82
-rw-r--r-- | Documentation/filesystems/qnx6.rst      |  2
5 files changed, 113 insertions, 21 deletions
diff --git a/Documentation/filesystems/9p.rst b/Documentation/filesystems/9p.rst
index f054d1c45e86..671fef39a802 100644
--- a/Documentation/filesystems/9p.rst
+++ b/Documentation/filesystems/9p.rst
@@ -158,6 +158,16 @@ Options
                 /sys/fs/9p/caches. (applies only to cache=fscache)
   ============= ===============================================================

+Behavior
+========
+
+This section aims at describing 9p 'quirks' that can be different
+from a local filesystem behaviors.
+
+ - Setting O_NONBLOCK on a file will make client reads return as early
+   as the server returns some data instead of trying to fill the read
+   buffer with the requested amount of bytes or end of file is reached.
+
 Resources
 =========

diff --git a/Documentation/filesystems/ceph.rst b/Documentation/filesystems/ceph.rst
index b46a7218248f..0aa70750df0f 100644
--- a/Documentation/filesystems/ceph.rst
+++ b/Documentation/filesystems/ceph.rst
@@ -107,17 +107,17 @@ Mount Options
         address its connection to the monitor originates from.

   wsize=X
-        Specify the maximum write size in bytes. Default: 16 MB.
+        Specify the maximum write size in bytes. Default: 64 MB.

   rsize=X
-        Specify the maximum read size in bytes. Default: 16 MB.
+        Specify the maximum read size in bytes. Default: 64 MB.

   rasize=X
         Specify the maximum readahead size in bytes. Default: 8 MB.

   mount_timeout=X
         Specify the timeout value for mount (in seconds), in the case
-        of a non-responsive Ceph file system. The default is 30
+        of a non-responsive Ceph file system. The default is 60
         seconds.

   caps_max=X
diff --git a/Documentation/filesystems/orangefs.rst b/Documentation/filesystems/orangefs.rst
index 7d6d4cad73c4..e41369709c5b 100644
--- a/Documentation/filesystems/orangefs.rst
+++ b/Documentation/filesystems/orangefs.rst
@@ -41,16 +41,6 @@ Documentation

 http://www.orangefs.org/documentation/

-
-Userspace Filesystem Source
-===========================
-
-http://www.orangefs.org/download
-
-Orangefs versions prior to 2.9.3 would not be compatible with the
-upstream version of the kernel client.
-
-
 Running ORANGEFS On a Single Server
 ===================================

@@ -94,6 +84,14 @@ Mount the filesystem::

     mount -t pvfs2 tcp://localhost:3334/orangefs /pvfsmnt

+Userspace Filesystem Source
+===========================
+
+http://www.orangefs.org/download
+
+Orangefs versions prior to 2.9.3 would not be compatible with the
+upstream version of the kernel client.
+

 Building ORANGEFS on a Single Server
 ====================================
@@ -107,18 +105,24 @@ default, we will probably be changing the default to LMDB soon.

 ::

-    ./configure --prefix=/opt/ofs --with-db-backend=lmdb
+    ./configure --prefix=/opt/ofs --with-db-backend=lmdb --disable-usrint

     make

     make install

-Create an orangefs config file::
+Create an orangefs config file by running pvfs2-genconfig and
+specifying a target config file. Pvfs2-genconfig will prompt you
+through. Generally it works fine to take the defaults, but you
+should use your server's hostname, rather than "localhost" when
+it comes to that question::

     /opt/ofs/bin/pvfs2-genconfig /etc/pvfs2.conf

 Create an /etc/pvfs2tab file::

+Localhost is fine for your pvfs2tab file:
+
     echo tcp://localhost:3334/orangefs /pvfsmnt pvfs2 defaults,noauto 0 0 > \
         /etc/pvfs2tab

@@ -132,7 +136,7 @@ Bootstrap the server::

 Start the server::

-    /opt/osf/sbin/pvfs2-server /etc/pvfs2.conf
+    /opt/ofs/sbin/pvfs2-server /etc/pvfs2.conf

 Now the server should be running. Pvfs2-ls is a simple
 test to verify that the server is running::
@@ -142,11 +146,11 @@ test to verify that the server is running::
 If stuff seems to be working, load the kernel module and
 turn on the client core::

-    /opt/ofs/sbin/pvfs2-client -p /opt/osf/sbin/pvfs2-client-core
+    /opt/ofs/sbin/pvfs2-client -p /opt/ofs/sbin/pvfs2-client-core

 Mount your filesystem::

-    mount -t pvfs2 tcp://localhost:3334/orangefs /pvfsmnt
+    mount -t pvfs2 tcp://`hostname`:3334/orangefs /pvfsmnt


 Running xfstests
diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst
index e443be7928db..c9d2bf96b02d 100644
--- a/Documentation/filesystems/overlayfs.rst
+++ b/Documentation/filesystems/overlayfs.rst
@@ -40,13 +40,46 @@ On 64bit systems, even if all overlay layers are not on the same
 underlying filesystem, the same compliant behavior could be achieved
 with the "xino" feature. The "xino" feature composes a unique object
 identifier from the real object st_ino and an underlying fsid index.
+
 If all underlying filesystems support NFS file handles and export file
 handles with 32bit inode number encoding (e.g. ext4), overlay filesystem
 will use the high inode number bits for fsid. Even when the underlying
 filesystem uses 64bit inode numbers, users can still enable the "xino"
 feature with the "-o xino=on" overlay mount option. That is useful for the
 case of underlying filesystems like xfs and tmpfs, which use 64bit inode
-numbers, but are very unlikely to use the high inode number bit.
+numbers, but are very unlikely to use the high inode number bits. In case
+the underlying inode number does overflow into the high xino bits, overlay
+filesystem will fall back to the non xino behavior for that inode.
+
+The following table summarizes what can be expected in different overlay
+configurations.
+
+Inode properties
+````````````````
+
++--------------+------------+------------+-----------------+----------------+
+|Configuration | Persistent | Uniform    | st_ino == d_ino | d_ino == i_ino |
+|              | st_ino     | st_dev     |                 | [*]            |
++==============+=====+======+=====+======+========+========+========+=======+
+|              | dir | !dir | dir | !dir |  dir   +  !dir  |  dir   | !dir  |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| All layers   |  Y  |  Y   |  Y  |  Y   |  Y     |   Y    |  Y     |  Y    |
+| on same fs   |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| Layers not   |  N  |  Y   |  Y  |  N   |  N     |   Y    |  N     |  Y    |
+| on same fs,  |     |      |     |      |        |        |        |       |
+| xino=off     |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| xino=on/auto |  Y  |  Y   |  Y  |  Y   |  Y     |   Y    |  Y     |  Y    |
+|              |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| xino=on/auto,|  N  |  Y   |  Y  |  N   |  N     |   Y    |  N     |  Y    |
+| ino overflow |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+
+[*] nfsd v3 readdirplus verifies d_ino == i_ino. i_ino is exposed via several
+/proc files, such as /proc/locks and /proc/self/fdinfo/<fd> of an inotify
+file descriptor.


 Upper and Lower
@@ -248,6 +281,50 @@ overlay filesystem (though an operation on the name of the file such as
 rename or unlink will of course be noticed and handled).


+Permission model
+----------------
+
+Permission checking in the overlay filesystem follows these principles:
+
+ 1) permission check SHOULD return the same result before and after copy up
+
+ 2) task creating the overlay mount MUST NOT gain additional privileges
+
+ 3) non-mounting task MAY gain additional privileges through the overlay,
+    compared to direct access on underlying lower or upper filesystems
+
+This is achieved by performing two permission checks on each access
+
+ a) check if current task is allowed access based on local DAC (owner,
+    group, mode and posix acl), as well as MAC checks
+
+ b) check if mounting task would be allowed real operation on lower or
+    upper layer based on underlying filesystem permissions, again including
+    MAC checks
+
+Check (a) ensures consistency (1) since owner, group, mode and posix acls
+are copied up. On the other hand it can result in server enforced
+permissions (used by NFS, for example) being ignored (3).
+
+Check (b) ensures that no task gains permissions to underlying layers that
+the mounting task does not have (2). This also means that it is possible
+to create setups where the consistency rule (1) does not hold; normally,
+however, the mounting task will have sufficient privileges to perform all
+operations.
+
+Another way to demonstrate this model is drawing parallels between
+
+  mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,... /merged
+
+and
+
+  cp -a /lower /upper
+  mount --bind /upper /merged
+
+The resulting access permissions should be the same. The difference is in
+the time of copy (on-demand vs. up-front).
+
+
 Multiple lower layers
 ---------------------

@@ -383,7 +460,8 @@ guarantee that the values of st_ino and st_dev returned by stat(2) and the
 value of d_ino returned by readdir(3) will act like on a normal filesystem.
 E.g. the value of st_dev may be different for two objects in the same overlay
 filesystem and the value of st_ino for directory objects may not be
-persistent and could change even while the overlay filesystem is mounted.
+persistent and could change even while the overlay filesystem is mounted, as
+summarized in the `Inode properties`_ table above.


 Changes to underlying filesystems
diff --git a/Documentation/filesystems/qnx6.rst b/Documentation/filesystems/qnx6.rst
index b71308314070..fd13433d362c 100644
--- a/Documentation/filesystems/qnx6.rst
+++ b/Documentation/filesystems/qnx6.rst
@@ -185,7 +185,7 @@ tree structures are treated as system blocks.

 The rational behind that is that a write request can work on a new snapshot
 (system area of the inactive - resp. lower serial numbered superblock) while
-at the same time there is still a complete stable filesystem structer in the
+at the same time there is still a complete stable filesystem structure in the
 other half of the system area.

 When finished with writing (a sync write is completed, the maximum sync leap