Monday, 16 February 2009

NFS - CacheFS considerations

You can use CacheFS as a way to improve NFS performance over slow or inconsistent networks.

The following is taken from a great article on the subject:

The history of CacheFS
Sun didn't introduce this feature for webservers. Long ago, admins didn't want to manage dozens of operating system installations. Instead, they wanted to store all this data on a central fileserver (you know ... the network is the computer). Thus netbooting Solaris and SunOS was invented. But there was a problem: swap via the network was a really bad idea in those days (it was a bad idea in 10 MBit/s times and it's still a bad idea in 10 GBit/s times). Thus the diskless systems got a disk for local swap. But there was another problem. All the users started to work at 9 o'clock ... they switched on their workstations ... and the load on the fileserver and the network got higher and higher. They had a local disk ... local installation again? No ... the central installation had its advantages. Thus the idea of CacheFS was born.

CacheFS is a really old feature of Solaris/SunOS. Its first implementation dates back to 1991. I really think you can call this feature mature.

CacheFS in theory
The mechanism of CacheFS is pretty simple. As I told you before, CacheFS is somewhat similar to a caching web proxy. CacheFS is a proxy to the original filesystem and caches files on their way through CacheFS. The basic idea is to cache remote files locally on a hard disk, so you can deliver them without using the network when you access them a second time.

Of course CacheFS has to handle changes to the original files. So CacheFS checks the metadata of the file before delivering the copy. If the metadata has changed, CacheFS loads the original file from the server. If the metadata hasn't changed, it delivers the copy from the cache.
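
The check-then-deliver logic can be sketched in a few lines of shell. This is only an illustration of the idea, not how CacheFS is implemented: BACK_DIR, CACHE_DIR, cached_read and LAST_STATUS are hypothetical names, and the modification time stands in for "the metadata". The -nt test is not strictly POSIX, but ksh, bash and dash all support it.

```shell
# Sketch of the check-then-deliver read path; NOT the real CacheFS code.
# BACK_DIR and CACHE_DIR are hypothetical stand-ins for the back file
# system (the NFS mount) and the local cache directory.

cached_read() {
    cr_back="$BACK_DIR/$1"
    cr_copy="$CACHE_DIR/$1"

    if [ ! -f "$cr_copy" ] || [ "$cr_back" -nt "$cr_copy" ]; then
        # No copy yet, or the original has a newer modification time:
        # fetch it again. cp -p keeps the original's timestamp on the copy.
        cp -p "$cr_back" "$cr_copy"
        LAST_STATUS=miss
    else
        # Metadata unchanged: serve the local copy, no file data is transferred.
        LAST_STATUS=hit
    fi
    cat "$cr_copy"
}
```

With both variables pointing at existing directories, the first cached_read of a file sets LAST_STATUS to miss, a repeated read sets it to hit, and a read after the original was modified sets it to miss again.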

CacheFS isn't just usable for NFS; you could use it as well for caching optical media like CD or DVD.

Okay ... using CacheFS is really easy. Let's assume you have a fileserver called theoden. We use the directory /export/files as the directory shared by NFS. The client in our example is gandalf.


Preparations

Let's create an NFS server first. This is easy. Just share a directory on a Solaris server. We log in to theoden and execute the following commands with root privileges.

[root@theoden:/]# mkdir /export/files
[root@theoden:/]# share -o rw /export/files
[root@theoden:/]# share
- /export/files rw ""
Okay, of course it would be nice to have some files to play around with in this directory. I will use some files from the Solaris environment.
[root@theoden:/]# cd /export/files
[root@theoden:/export/files]# cp -R /usr/share/doc/pcre/html/* .
Let's do a quick test to see if we can mount the directory:
[root@gandalf:/]# mkdir /files
[root@gandalf:/]# mount theoden:/export/files /files
[root@gandalf:/]# umount /files
Both commands should complete without error messages. While the mount is in place, you can access the /export/files directory on theoden by accessing /files on gandalf.

Okay, first we have to create the location for our caching directories. Let's assume we want to place our cache at /var/cachefs/caches/cache1. First we create the directories above the cache directory. You don't create the last part of the directory structure manually.
[root@gandalf:/]# mkdir -p /var/cachefs/caches
This directory will be the place where we store our caches for CacheFS. After this step we have to create the cache for CacheFS.
[root@gandalf:/files]# cfsadmin -c -o maxblocks=60,minblocks=40,threshblocks=50 /var/cachefs/caches/cache1
The directory cache1 is created automatically by the command. If the directory already exists, the command will quit without creating the cache.

Besides creating the cache, this command also specified some basic parameters to control the behaviour of the cache. Citing the manpage of cfsadmin:
maxblocks: Maximum amount of storage space that CacheFS can use, expressed as a percentage of the total number of blocks in the front file system.
minblocks: Minimum amount of storage space, expressed as a percentage of the total number of blocks in the front file system, that CacheFS is always allowed to use without limitation by its internal control mechanisms.
threshblocks: A percentage of the total blocks in the front file system beyond which CacheFS cannot claim resources once its block usage has reached the level specified by minblocks.
All these parameters can be tuned to prevent CacheFS from eating away all the storage available in a filesystem, a behaviour that was quite common in early versions of this feature.
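
To make the interplay of the three percentages more concrete, here is a small sketch of how I read the manpage text. It is my interpretation, not the actual CacheFS allocation code, and the function name may_grow and its arguments are made up for this example. The defaults are the values from the cfsadmin call above (60, 40, 50).

```shell
# may_grow TOTAL CACHE_USED FS_USED [MAXPCT MINPCT THRESHPCT]
# TOTAL      = total blocks in the front file system
# CACHE_USED = blocks currently used by the cache
# FS_USED    = blocks currently used in the front file system overall
# Prints "yes" if the cache may claim more blocks, otherwise "no".
# Hypothetical helper: an interpretation of the manpage, not CacheFS code.
may_grow() {
    mg_total=$1
    mg_cache=$2
    mg_fs=$3
    mg_max=$(( mg_total * ${4:-60} / 100 ))
    mg_min=$(( mg_total * ${5:-40} / 100 ))
    mg_thresh=$(( mg_total * ${6:-50} / 100 ))

    if [ "$mg_cache" -ge "$mg_max" ]; then
        echo no     # maxblocks is a hard ceiling
    elif [ "$mg_cache" -lt "$mg_min" ]; then
        echo yes    # below minblocks the cache may always grow
    elif [ "$mg_fs" -lt "$mg_thresh" ]; then
        echo yes    # between min and max: only while the disk is below threshblocks
    else
        echo no
    fi
}

may_grow 1000 300 900   # prints yes: the cache is still below its 40% minimum
may_grow 1000 450 480   # prints yes: above the minimum, disk below the 50% threshold
may_grow 1000 450 520   # prints no:  above the minimum, disk above the threshold
may_grow 1000 600 600   # prints no:  the 60% maximum is reached
```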


Mounting a filesystem via CacheFS
We have to mount the original filesystem now.
[root@gandalf:/files]# mkdir -p /var/cachefs/backpaths/files
[root@gandalf:/files]# mount -o vers=3 theoden:/export/files /var/cachefs/backpaths/files
You may have noticed the parameter that sets the NFS version to 3. This is necessary, as CacheFS isn't supported with NFSv4. Thus you can only use it with NFSv3 and below. The reason for this limitation lies in the different way NFSv4 handles inodes.

Okay, now we mount the cache filesystem at the old location:
[root@gandalf:/files]# mount -F cachefs -o backfstype=nfs,backpath=/var/cachefs/backpaths/files,cachedir=/var/cachefs/caches/cache1 theoden:/export/files /files
The options of the mount control some basic parameters of the mount:
backfstype specifies what type of filesystem is proxied by the CacheFS filesystem
backpath specifies where this proxied filesystem is currently mounted
cachedir specifies the cache directory for this instance of the cache. Multiple CacheFS mounts can use the same cache.
From now on every access to the /files directory will be cached by CacheFS. Let's have a quick look into /etc/mnttab. There are two important mounts for us:
[root@gandalf:/etc]# cat mnttab
[...]
theoden:/export/files /var/cachefs/backpaths/files nfs vers=3,xattr,dev=4f80001 1219049560
/var/cachefs/backpaths/files /files cachefs backfstype=nfs,backpath=/var/cachefs/backpaths/files,cachedir=/var/cachefs/caches/cache1,dev=4fc0001 1219049688
The first mount is our back file system, it's a normal NFS mountpoint. But the second mount is a special one. This one is the consequence of the mount with the -F cachefs option.
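
If the mount table is long, the CacheFS entries can be filtered out by the file system type in the third column. The helper below is just a convenience sketch; the function name cachefs_mounts is made up, and it assumes the whitespace-separated column layout seen in the listing above (special, mount point, type, options, time).

```shell
# cachefs_mounts [FILE]
# Prints "<mount point> <- <back path>" for every cachefs entry.
# FILE defaults to /etc/mnttab; the function name is hypothetical.
cachefs_mounts() {
    awk '$3 == "cachefs" { print $2 " <- " $1 }' "${1:-/etc/mnttab}"
}
```

For the two entries shown above, this prints the single line "/files <- /var/cachefs/backpaths/files".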


Statistics about the cache
While using it, you will see the cache structure at /var/cachefs/caches/cache1 filling up with files. I will explain some of the structure in the next section. But how efficient is this cache? Solaris provides a command to gather some statistics about the cache. With cachefsstat you print out data like the hit rate, including the absolute number of cache hits and cache misses:
[root@gandalf:/files]# /usr/bin/cachefsstat

/files
cache hit rate: 60% (3 hits, 2 misses)
consistency checks: 7 (7 pass, 0 fail)
modifies: 0
garbage collection: 0
[root@gandalf:/files]#
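
The percentage is simply hits divided by all lookups: 3 / (3 + 2) = 60%. If you want to recompute it from the raw counters, for example when collecting the output in a script, something like the following should do. The function name hit_rate is made up, and the parsing assumes the exact line format shown above.

```shell
# hit_rate: reads cachefsstat-style text on stdin and prints the hit rate
# (integer percent) recomputed from the hit and miss counters.
# Hypothetical helper; assumes the "cache hit rate: N% (H hits, M misses)" format.
hit_rate() {
    awk '/cache hit rate:/ {
        line = $0
        sub(/^.*\(/, "", line)      # keep only "H hits, M misses)"
        split(line, f, " ")
        hits = f[1] + 0
        misses = f[3] + 0
        total = hits + misses
        if (total > 0) printf "%d\n", (100 * hits) / total
        else print 0
    }'
}

echo 'cache hit rate: 60% (3 hits, 2 misses)' | hit_rate   # prints 60
```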


###########################################################################

from http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.prftungd/doc/prftungd/cachefs_perf_impacts.htm

CacheFS performance impacts

CacheFS will not increase the write performance to NFS file systems. However, you have some write options to choose as parameters to the -o option of the mount command, when mounting a CacheFS. They will influence the subsequent read performance to the data.

The write options are as follows:

write around
The write around mode is the default mode and it handles writes the same way that NFS does. The writes are made to the back file system, and the affected file is purged from the cache. This means that write around voids the cache and new data must be obtained back from the server after the write.
non-shared
You can use the non-shared mode when you are certain that no one else will be writing to the cached file system. In this mode, all writes are made to both the front and the back file system, and the file remains in the cache. This means that future read accesses can be done to the cache, rather than going to the server.

Small reads might be kept in memory anyway (depending on your memory usage) so there is no benefit in also caching the data on the disk. Caching of random reads to different data blocks does not help, unless you will access the same data over and over again.

The initial read request still has to go to the server because only by the time a user attempts to access files that are part of the back file system will those files be placed in the cache. For the initial read request, you will see typical NFS speed. Only for subsequent accesses to the same data, you will see local JFS access performance.

The consistency of the cached data is only checked at intervals. Therefore, it is dangerous to cache data that is frequently changed. CacheFS should only be used for read-only or read-mostly data.

Write performance over a cached NFS file system differs from NFS Version 2 to NFS Version 3. Performance tests have shown the following:

Sequential writes to a new file over NFS Version 2 to a CacheFS mount point can be 25 percent slower than writes directly to the NFS Version 2 mount point.
Sequential writes to a new file over NFS Version 3 to a CacheFS mount point can be 6 times slower than writes directly to the NFS Version 3 mount point.
