DFS replicas: the key to better data availability

Written by Brien M. Posey, Contributor
A few months ago, I wrote an article entitled "Building a Windows 2000 Distributed File System." In it, I talked about how a distributed file system (DFS) can make files and directories that are scattered across multiple servers appear as though they exist on the same server. For example, if your users need to access files from share points on two different servers, you can create one DFS tree that includes the two different share points. Users can then access the DFS tree as a single share point without having to know which server or share point the files actually exist on.
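
To illustrate, here's roughly what publishing two share points under one DFS root looks like from the command line, using the DFSCMD utility included with Windows 2000. The domain, server, and share names below are hypothetical:

    REM Map two links under the existing DFS root \\MyDomain\Public
    dfscmd /map \\MyDomain\Public\Sales \\ServerA\SalesShare
    dfscmd /map \\MyDomain\Public\Reports \\ServerB\ReportsShare

Users simply open \\MyDomain\Public and never need to know that the Sales folder lives on ServerA while Reports lives on ServerB.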

The problem with implementing DFS in a large organization is that you can find an excessive number of users trying to access files through a single share point. The more users accessing a set of files, the slower the access to them becomes. This is where DFS load balancing comes into play.

You can build a DFS tree and copy the tree's files and directories to other servers. By doing so, you can have two or more identical copies of your entire DFS tree. In such an arrangement, the original DFS tree is known as the master and the copies are known as replicas. You can have up to 32 replicas.

How does Windows 2000 determine which replica a client should connect to? The server selection occurs at the client end. During the server selection process, clients use a list stored in the Active Directory to randomly select a replica. Because the server selection is a random process, client connections are somewhat evenly distributed among DFS servers. However, random server selection also has its disadvantages.

There's no way for a client to see the number of other client sessions attached to a given DFS server. This makes true load balancing impossible. If a client running Windows 2000 Professional attaches to a DFS server that appears to be overloaded, there's a way to select a different server. You can use the DFSUTIL command to flush the client's partition knowledge table (PKT), which forces the client to request a new DFS server referral. Of course, there's always the chance the client could select the same DFS server again. If the client is running a version of Windows other than Windows 2000, the client has to reboot to change DFS servers.
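
As a sketch, flushing the PKT from a command prompt on a Windows 2000 Professional machine looks something like this; the /pktflush switch assumes you have a version of DFSUTIL (from the Support Tools) that supports it:

    REM Flush the client's partition knowledge table so that the next
    REM DFS access requests a fresh referral
    dfsutil /pktflush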

On the client end, the PKT is the key to the load balancing process. Because the process works differently for different types of clients, the remainder of this article assumes that clients are running Windows 2000 Professional.

The PKT functions as a cache, storing information about available links and servers. Any time a client tries to access something from the DFS, it checks the cache for the resource's path. If the cache doesn't contain an entry, the client queries the DFS server to locate the desired resource. When the client finds the resource, it adds an entry to the cache.

The content in the PKT cache has a five-minute life span (unless the time to live is modified by an administrator). If a cached object hasn't been used in the last five minutes, it is removed from the cache, and the client is forced to attach to a different server to access a replica of the data.
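
If you're curious about what a client's PKT currently contains, including which replica each cached entry points to, DFSUTIL can display it. This is a sketch assuming the same Support Tools version of DFSUTIL mentioned earlier:

    REM Display the contents of the client's PKT cache
    dfsutil /pktinfo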

If an entire server goes dead, such as during a power failure, it takes clients a few moments to realize the server is no longer available. Keep in mind that the other DFS servers still show the failed server as being a valid replica, and entries pointing to the failed server may still exist in the client's PKT cache. What usually happens in such a situation is that clients attempt to communicate with the failed server, because they don't know it has dropped offline. However, because TCP/IP is designed to retry failed communications several times, it may take a few minutes before clients realize that communications are failing. At that point, a client will check its PKT cache for a new replica. If the cache doesn't contain any entries for other replicas, the client will consult the DFS root to find the name of another replica.

When a server fails, you can save yourself a lot of trouble by removing it from the list of replicas on the DFS root. To do so, open the Distributed File System console from the Administrative Tools menu, right-click the DFS root or the DFS link, and select the Replication Policy command from the resulting context menu to invoke the Replication Policy dialog box. Select the failed server from the list and click the Disable button followed by the OK button.
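
If you'd rather script this step, the DFSCMD utility can also remove a failed server's replica from an individual DFS link. The names below are hypothetical:

    REM Remove the failed server's replica from the Data link
    dfscmd /remove \\MyDomain\Public\Data \\FailedServer\DataShare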

Information about enabled DFS replicas is stored in the Active Directory, so it may take some time for changes you make to propagate to other domain controllers.

In the event of a partial server failure, such as a faulty hard disk, the server hosting the shared resource can report back to the client, telling it the resource is unavailable. The client can then look for a functional replica. Crash recovery is almost instantaneous unless a client has a file open during the crash. In this situation, the DFS failover process works the same way, but it's up to each application to detect the server change and establish new file locks.

Not only is a replica handy in a crash situation, but it also makes life easier when you have to perform server maintenance. For example, installing a new service pack often requires removing users from the server or performing the installation at night or over the weekend (assuming your company isn't a 24/7 operation). DFS makes this process more convenient. If you've created a replica of your DFS tree, you can simply make the replica unavailable for a period of time while you upgrade the servers. When you're done, you can make the replica available and then make the original unavailable while you tend to its servers. Users might experience slower response times while you're working on the servers, but they'll be able to keep working, and you won't have to lose sleep to perform system maintenance.

Implementing DFS with load balancing can be expensive, especially if you lack the necessary hardware and need to buy more servers to host the replicas. However, distributing file access across replicas can deliver a substantial performance boost, so administrators should weigh the throughput improvement against the extra hardware expense.

Now that you understand the benefits of DFS replicas, let's turn to the process of creating them.

Before you begin creating replicas, consider starting with the DFS root server, which is the server through which users access all DFS links. If the DFS root server goes down, users won't be able to access any of the DFS links through the usual method of attaching to the share point; they would have to know the exact server and path where a file resides.

The best way to avoid this problem is to replicate the DFS root. Open the Distributed File System console from the Administrative Tools menu. When the console opens, right-click the existing DFS root and select New Root Replica to launch the New DFS Root Wizard.

Enter the DNS name of the server that should host the new replica. You can use the Browse function to find the desired server. Once you've selected the server, click Next and you'll see a screen that asks which share point you want to use to store the replica. You can use an existing share point or create a new share. Once you've made your selection, click Finish to complete the wizard.

Although you've created an original and a replica of the DFS tree, Windows won't replicate data between them until you tell it to do so, so your next step is to configure the way that replication occurs. From the DFS console, right-click the DFS root and select Replication Policy. Select the DFS root that you want to make the original and click the Set Master button. When you do, you'll see the replication status for that root change from No to Yes (Primary). Now, enable replication by selecting the replica and clicking the Enable button. Doing so activates automatic replication, which will occur every 15 minutes. It's possible to disable automatic replication, but doing so only makes sense if you plan to replicate manually instead.
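
Once replication is enabled, you can verify the resulting configuration from the command line. As a sketch with hypothetical names, DFSCMD's /view switch lists every link under the root along with each link's replica set:

    REM List all links and replicas under the DFS root
    dfscmd /view \\MyDomain\Public /full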

Now that you've replicated the DFS root, you can turn your attention to other folders. The idea behind replicating shared folders is to replicate the critical data itself, not just the DFS root. For example, suppose a folder called DATA exists on server A. You could create a replica of the DATA folder on server B so that servers A and B both have copies of the data.

The process for replicating a share is almost identical to the process of replicating a DFS root. From the DFS console, right-click the DFS link that points to the share and select the New Replica command from the resulting context menu. In the Add Replica dialog box, enter the UNC (Universal Naming Convention) path to the share that you want to contain the replica, then choose automatic or manual replication.
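
This step can also be scripted with DFSCMD, again using hypothetical names. Note that, as far as I know, DFSCMD only adds the replica; you'd still set the replication policy from the console as described above:

    REM Add a second share as a replica of the existing Data link
    dfscmd /add \\MyDomain\Public\Data \\ServerB\DataShare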
