Project
Requirements Specification
NFSv4.0 client referrals
Author
Evan Layton
Alok Aggarwal
NFSv4 allows the file namespace to extend beyond the boundaries of a single server. This is done with a mechanism known as referrals. When a client crosses a namespace boundary at a server, the server refers the client to another server via a specific error and the fs_locations attribute. The client connects to the new server and resumes its operations.
The specification of file system location provides a means by which file systems located on one server can be associated with a name space defined by another server, thus allowing a general multi-server namespace facility.
Multi-server namespaces can provide many advantages by separating a file system's logical position in a name space from the (possibly changing) logistical and administrative considerations that result in particular file systems being located on particular servers.
This project implements the client-side capabilities for processing a referral. That is, the Solaris NFSv4 client will be able to process a referral returned from the server. This project does not add the capability for the server to return a referral.
Automatic mounting of referral points
Whenever the client traverses a referral point in the server namespace, the client shall automatically mount the target of that referral (subject to the triggering rules listed below) in exactly the same place in the client namespace as where the referral was discovered.
The referral mount will be processed by the client in such a fashion that it is transparent to the user. No special configuration will be required on the client to enable this behavior.
As part of that mount, an entry will be placed in mnttab, as would normally be expected for a mount.
Triggering Actions
Any action on a target directory resulting in a vnode operation of either VOP_LOOKUP(target) or VOP_GETATTR(target) is not a triggering action.
Any action on a target directory resulting in any other vnode operation is a triggering action.
Any action on a parent directory containing the target directory resulting in a vnode operation of VOP_READDIR(parent) is not a triggering action.
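The three rules above can be summarised as a small decision function. The following is a minimal sketch in Python, not kernel code: the function name is_triggering_action and the string-based interface are assumptions made for illustration, and only the VOP_* names come from the rules themselves. It also assumes that operations on the parent other than VOP_READDIR do not touch the target and so do not trigger a mount.

```python
# Illustrative sketch only: models the triggering rules as a lookup.
# The function name and string-based interface are hypothetical.

NON_TRIGGERING_ON_TARGET = {"VOP_LOOKUP", "VOP_GETATTR"}


def is_triggering_action(vop: str, on_target: bool) -> bool:
    """Return True if the vnode operation should trigger a referral mount.

    vop       -- name of the vnode operation, e.g. "VOP_READDIR"
    on_target -- True if the operation is applied to the referral target,
                 False if it is applied to the enclosing parent directory
    """
    if not on_target:
        # VOP_READDIR(parent) is explicitly non-triggering. Treating every
        # other parent operation the same way is an assumption of this sketch.
        return False
    # On the target itself, everything except LOOKUP and GETATTR triggers.
    return vop not in NON_TRIGGERING_ON_TARGET
```

For example, is_triggering_action("VOP_READDIR", True) is true (listing the target itself), while is_triggering_action("VOP_READDIR", False) is false (listing the parent).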
Multiple referrals (referral is referred...)
The client must be able to handle getting the moved error while processing a previous referral.
A referral encountered during any mount (initial or otherwise) will result in the mount of the referred-to server.
If a referral points at a previously processed/mounted referral in a "chain" or set of nested referrals, the referral and the mount will fail.
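One way to picture the chain rule is to remember every location already processed while following referrals, and to fail as soon as one repeats. The sketch below assumes a hypothetical resolve callback standing in for the real exchange with the server; the names follow_referrals and ReferralLoopError are invented for illustration.

```python
# Illustrative sketch only: detect a referral that points back into its
# own chain. All names here are hypothetical.


class ReferralLoopError(Exception):
    """Raised when a referral points at one that was already processed."""


def follow_referrals(start, resolve):
    """Follow a chain of referrals until a mountable location is reached.

    start   -- (server, path) tuple named by the first referral
    resolve -- hypothetical callback: returns the next (server, path) if
               that location answers with another referral, or None if
               the location can be mounted directly
    """
    seen = set()
    current = start
    while True:
        if current in seen:
            # The chain loops back on itself: the referral and mount fail.
            raise ReferralLoopError(
                "referral loop at %s:%s" % (current[0], current[1]))
        seen.add(current)
        nxt = resolve(current)
        if nxt is None:
            return current
        current = nxt
```

With a resolve callback backed by a dictionary, a chain such as a:/x to b:/y resolves to b:/y, while a:/x to b:/y back to a:/x raises ReferralLoopError.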
List of referrals (multiple locations & possible failover)
For the first phase of referrals we will attempt to mount the first server/resource in the list returned in fs_locations.
If the first server in the list returned in fs_locations is not available we will iterate through that list until we find one that is available. If none are available the mount will fail.
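The first-phase behaviour amounts to a linear scan of the list in the order returned. The sketch below is only an illustration: pick_location and the try_mount callback are hypothetical names, and a real client would attempt an actual NFSv4 mount where the callback is called.

```python
# Illustrative sketch only: walk the fs_locations list in order and use
# the first entry that can be mounted. Names are hypothetical.


def pick_location(locations, try_mount):
    """Return the first location for which try_mount() succeeds.

    locations -- entries in the order returned in fs_locations
    try_mount -- hypothetical callback returning True if the mount of
                 that entry succeeded, False if the server is unavailable
    """
    for location in locations:
        if try_mount(location):
            return location
    # No server in the list was available, so the mount fails.
    raise OSError("no location in fs_locations is available")
```

Because the scan stops at the first success, entries after it are never contacted; no load balancing or preference ordering beyond the list order is implied.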
Host/node name and IP address resolution
When a host name is returned in fs_locations it will be resolved to an IP address.
We will also correctly handle a host's IP address (IPv4 or IPv6) when it is returned in fs_locations.
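The distinction between a host name and a literal address can be sketched as follows. This is a user-level illustration using the Python standard library, not the mechanism the Solaris client uses; the function name server_addresses and the injectable resolver parameter are assumptions made so the example can be exercised without a network.

```python
# Illustrative sketch only: accept either a literal IPv4/IPv6 address or
# a host name from an fs_locations server entry.
import ipaddress
import socket


def server_addresses(server, resolver=socket.getaddrinfo):
    """Return a list of IP address strings for a server entry."""
    try:
        # A literal IPv4 or IPv6 address is used directly (normalised).
        return [str(ipaddress.ip_address(server))]
    except ValueError:
        pass
    # Otherwise treat the entry as a host name and resolve it.
    # 2049 is the registered NFS port.
    addrs = []
    for _family, _type, _proto, _canon, sockaddr in resolver(
            server, 2049, type=socket.SOCK_STREAM):
        if sockaddr[0] not in addrs:
            addrs.append(sockaddr[0])
    return addrs
```

A host name may resolve to several addresses, which is why a list is returned even for a single fs_locations entry.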
Hierarchical unmounting
When unmounting the top of a tree/hierarchy of referrals, all referrals under it must also be unmounted.
When autofs mounts are unmounted, any referrals mounted within them must also be automatically unmounted.
Referral mounts under a Mirror Mount or Mirror Mounts under Referral mounts must also be automatically unmounted when a mount above them is unmounted.
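Hierarchical unmounting implies an ordering: the deepest mounts must go first so that nothing is still mounted beneath a file system when it is unmounted. The sketch below computes such an ordering from a plain list of mount points; the function name unmount_order is hypothetical, and the example ignores details such as busy file systems.

```python
# Illustrative sketch only: order the mounts at or below a given point so
# that children are unmounted before their parents.
import posixpath


def unmount_order(mount_points, top):
    """Return the mount points at or under `top`, deepest first."""
    top = posixpath.normpath(top)
    prefix = top.rstrip("/") + "/"
    below = [
        m for m in (posixpath.normpath(p) for p in mount_points)
        if m == top or m.startswith(prefix)
    ]
    # A deeper path contains more "/" separators; unmount those first.
    return sorted(below, key=lambda m: m.count("/"), reverse=True)
```

Matching on the prefix with a trailing "/" keeps an unrelated mount such as /mntx out of the result when /mnt is unmounted.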
Inherited mount properties
Mount properties specified by the client in the original mount should be used for the referral.
The client will always attempt to inherit the security flavor from the parent mount; however, if this does not match that of the referred-to server's file system, it will attempt to renegotiate the security flavor. If the renegotiation fails, the mount of that server will also fail.
In the case where the security flavor inherited from the parent mount does not match that of the referred-to server's file system and the renegotiation fails, the client will iterate through the list of servers returned in fs_locations. If the client is not able to agree on a security flavor with any of the servers in the list, the mount will fail.
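The flavor rules combine with the fs_locations list as two nested loops: for each server in order, try the inherited flavor, then fall back to renegotiation, then move on. The sketch below models negotiation as a simple set intersection, which is an assumption made for illustration (the real exchange happens on the wire); the function name and argument shapes are hypothetical.

```python
# Illustrative sketch only: choose a server and security flavor.
# Negotiation is modelled as set membership, which is a simplification.


def choose_server_and_flavor(inherited, client_flavors, servers):
    """Return (server, flavor) for the first server we can agree with.

    inherited      -- flavor inherited from the parent mount, e.g. "sys"
    client_flavors -- flavors the client is willing to use, in preference order
    servers        -- list of (name, set_of_supported_flavors) in the
                      order returned in fs_locations
    """
    for name, supported in servers:
        if inherited in supported:
            # The inherited flavor matches: no renegotiation needed.
            return name, inherited
        for flavor in client_flavors:
            if flavor in supported:
                # Renegotiation succeeded with this server.
                return name, flavor
        # Renegotiation failed for this server: try the next one.
    raise PermissionError("no common security flavor with any server")
```

For instance, with an inherited flavor of "sys" and servers [("srv1", {"krb5p"}), ("srv2", {"krb5", "sys"})], a client offering only krb5 and sys skips srv1 and mounts srv2 with "sys".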
Replication/migration detection
The client will be able to distinguish between a referral event and a replication/migration event. A replication/migration event will not be processed by the client, and the failure mode will be the same as the one that currently exists in Solaris 10.
Other:
Performance: when crossing this new mount there will be a slight delay while the referral is reconciled, but once the mount is established there will be no performance degradation compared to other NFS-mounted file systems. This slight delay in mounting will be very similar to that experienced while doing autofs mounts.
Out of scope/future work
This project will not be implementing v4.1 fs_locations_info.
Enabling of client-side fs_locations-based replication and migration will not be done as part of this project.
Enabling of server-side referrals and fs_locations-based replication and migration will not be done as part of this project.
Administration of server-side referrals and fs_locations-based replication and migration will not be done as part of this project.
This project will require changes to the automounter with respect to the unmounting of referrals that have been mounted under an autofs-mounted file system.
This project is closely linked with the NFSv4 Mirror Mounts project, and will share its implementation.
NFSv4.1 draft document
http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-minorversion1-10.txt
A referral event occurs upon the "first access" to a server filesystem. For example, a client looks up an object in the server namespace for the very first time but is told that that object is located on another server. The client is subsequently referred to the server filesystem that contains that object. In such a case the client is said to have encountered a referral event.
If the client traverses into a server filesystem and finds that objects that once existed in the filesystem (a fact established by having accessed those objects previously) are no longer present at that server but are present at an alternate set of servers, it is said to have encountered a migration/replication event.
A case of a migration/replication event is one in which the client accesses a file/directory a number of times before being told by the server that the file/directory in question has been migrated over to a different server.
As outlined above, a referral event will be handled by this project whereas a migration/replication event will not be.
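The distinction drawn above rests on a single question: has the client accessed this server filesystem before? A minimal sketch of that bookkeeping follows. The class name, the use of an opaque filesystem identifier as the key, and the string results are all assumptions for illustration and say nothing about how the Solaris client actually tracks this state.

```python
# Illustrative sketch only: classify a "moved" indication as a referral
# or a migration/replication event based on prior access.


class MovedEventClassifier:
    def __init__(self):
        # Identifiers of server filesystems already accessed successfully.
        self._accessed = set()

    def record_access(self, fs_id):
        """Note that objects in this filesystem have been accessed."""
        self._accessed.add(fs_id)

    def classify(self, fs_id):
        """Return "referral" on first access, else "migration/replication"."""
        if fs_id in self._accessed:
            # Objects were seen here before and are now elsewhere.
            return "migration/replication"
        return "referral"
```

Under this model only the "referral" result leads to an automatic mount; the other result would fall through to the existing failure path.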
The triggering actions on a target directory that will result in a referral are defined in terms of their resultant vnode operations.
From an API perspective: with the single exception of stat(2), all filesystem calls involving the target directory will trigger a mount. However, a readdir(3)/getdents(2) of the parent directory (/parent) enclosing the target (/parent/target) will not trigger a mount.
For example:
ls /parent
will not trigger a mount
ls -l /parent
will not trigger a mount
ls -d /parent/target
will not trigger a mount
All other filesystem commands involving the target directory will trigger a mount
Automounter comparison: the automounter will only automatically mount nested mounts when encountered under /net.
Referrals will enable a "browsing" feature similar, but not identical, to that of the automounter, using the same mechanism employed by NFSv4 mirror mounts.
When the automounter browsing option is enabled for indirect maps, it is possible to see the existence of automount trigger points before they are mounted:
estale $ ls -ld /home/alice
dr-xr-xr-x 1 root root 1 Oct 18 12:36 /home/alice
estale $ mount | grep alice
estale $
Note that the attributes of the directory are generated by the client, and do not match reality on the server. The directory is given mode 0555, with root ownership, and the modification time is the current time. If the directory is mounted - e.g. by changing into the directory - the automounter completes the mount, and the real directory attributes are seen:
estale $ ls -ld /home/alice
drwxr-xr-x 79 alice pawns 20480 Oct 18 13:19 /home/alice
(as well as its contents).
Note that the automounter "/net" feature is a special case, where the automounter will automatically mount any server filesystems it traverses. The functionality proposed here is similar but includes being referred to another server, by using solely NFSv4 mechanisms, with no involvement of the automounter. In addition, of course, it is not tied to a particular trigger-point (/net).
In the absence of any client automount map, the existing NFSv4 server implementation in Solaris still presents the entire server namespace to the client, i.e. server mount points (in effect) are visible to the client before the client has mounted them, even if the server mount points themselves are on a server filesystem that is not shared:
NFSv4-server # share
- /dum rw=pawns ""
- /dee rw=pawns ""
# note that the server does not share "/", yet we may mount it
NFSv4-client # mount NFSv4-server:/ /mnt
NFSv4-client # ls -l /mnt
total 4
drwxr-xr-x 3 alice pawns 512 Oct 18 15:01 dee
drwxr-xr-x 37 root sys 1024 Oct 18 14:50 dum
This continues to provide the useful browsing feature, previously available via the automounter, without imposing the overhead of a mount, which may be important in the presence of many server filesystems e.g. when using ZFS.
Note that the attributes of the file systems are not presented correctly. Because the new server has not been contacted yet, the client does not have access to the correct attributes. What will be presented are the attributes of the referral point, not those of the file system on the new server.
However, the contents of the server's filesystems cannot be seen:
NFSv4-server # ls -al /dee
total 20
drwxr-xr-x 3 alice pawns 512 Oct 18 15:01 .
drwxr-xr-x 31 root root 1024 Oct 18 15:01 ..
drwx------ 2 alice pawns 8192 Oct 18 14:53 lost+found
-rw-r--r-- 1 alice pawns 0 Oct 18 14:58 this_file_is_in_slash_dee
NFSv4-client # ls -al /mnt/dee
total 4
drwxr-xr-x 3 alice pawns 512 Oct 18 15:01 .
drwxr-xr-x 31 root root 1024 Oct 18 15:01 ..
The proposed referral functionality would cause a real NFSv4 mount to occur when the client crosses into the new filesystem on the referred-to server by accessing /mnt/dee. This functionality is the same as mirror mounts, with the additional caveat that NFSv4-server is the referred-to server and the original lookup was sent to another server, which referred the client to NFSv4-server.