Naming in Distributed Systems


Introduction

This post studies naming in two kinds of systems:
• Directory services for wide area networks
• File systems or content management systems for collaborative work

Naming schemes fall into four categories:
1. Host based naming
2. Global naming
3. User/Object centered naming
4. Attribute based naming

The systems discussed in this post fall into one of these categories or are a hybrid of them. Broadly, names are associated with users, hosts, services, files, objects and groups. The requirements for a wide area directory and for a file system/CMS can be grouped under the following headings:
• Scalability
• Availability
• Consistency
• Reliability
• Fault isolation
• Performance and Efficiency

Directory Service for WAN – DNS †

DNS is the case study for a directory service on a wide area network. The requirements of a DS†† for a WAN††† are summarized below in terms of how DNS meets them:

Granularity of Names

In DNS, domain names are represented by character strings, and the machine-oriented binary identifier is called the Internet address. Names near the top of the hierarchy change far less often than the host names further down. Less frequent change means fewer update messages and fewer objects to manage. In the name-to-address lookup, the finer the granularity of a name, the more prone it is to change, which leads to more messages and queries on the network. So granularity is coarse at the top of the hierarchy and fine at the bottom.

Caching of Names/Placement of Caches

A name server caches every name it learns from other name servers while handling name resolution requests.

Iterative query – The query goes to the local name server, which matches it against the longest part of the name held in its local cache. The server caches the request and the response for future reference.
Recursive query – Every level of name server maintains a cache (multilevel caching) and tries to resolve the longest sequence of the queried name that it can. In a recursive lookup the server caches both the query and the response.
Negative caching – A name server also caches the fact that a name is bad or that a resource record does not exist, so that future queries for it can be answered quickly.
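
The longest-match lookup and the negative cache described above can be sketched in a few lines of Python. This is only an illustration, not how any real DNS server is implemented: the class ReferralCache, its method names and the server names in the usage note are all made up for this post.

```python
def labels(name):
    """Split 'www.example.com.' into ['www', 'example', 'com']."""
    return [part for part in name.lower().rstrip(".").split(".") if part]


class ReferralCache:
    """Hypothetical cache of name servers learned while resolving names."""

    def __init__(self):
        self.servers = {}      # domain -> name server heard about earlier
        self.missing = set()   # names known not to exist (negative cache)

    def learn_server(self, domain, server):
        self.servers[".".join(labels(domain))] = server

    def learn_missing(self, name):
        self.missing.add(".".join(labels(name)))

    def lookup(self, name):
        """Return ('negative', None) for a known-bad name, otherwise
        ('ask', server) for the closest cached ancestor of the name,
        or ('ask', None) when nothing useful is cached."""
        parts = labels(name)
        if ".".join(parts) in self.missing:
            return ("negative", None)
        # Try the full name first, then drop the leftmost label each time,
        # so the longest cached match wins.
        for start in range(len(parts)):
            candidate = ".".join(parts[start:])
            if candidate in self.servers:
                return ("ask", self.servers[candidate])
        return ("ask", None)
```

For example, after learn_server("com", "a.tld-server.example") and learn_server("example.com", "ns1.example.com"), a lookup of "www.example.com" is sent to ns1.example.com, because that is the longest match, while "mail.other.com" falls back to the server for "com".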

Use of Replication

To ensure high availability and improve the performance of the name service, name servers are replicated. How heavily a server is replicated depends on how often it is used and how important it is to the network. For example, the root name server is highly replicated to keep it available and to spread the load of frequent name queries.
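
To make the availability and load-sharing argument concrete, here is a rough sketch of a client that rotates over several replicas and skips the ones that are down. Everything in it is assumed for illustration: ReplicatedNameService and make_replica are made-up names, and a replica is modelled as a plain function that raises ConnectionError when it is unreachable.

```python
class ReplicatedNameService:
    """Hypothetical client-side view of a replicated name server."""

    def __init__(self, replicas):
        self.replicas = list(replicas)   # callables: name -> address
        self._next = 0                   # where the next query starts

    def resolve(self, name):
        count = len(self.replicas)
        for offset in range(count):
            index = (self._next + offset) % count
            try:
                answer = self.replicas[index](name)
            except ConnectionError:
                continue                 # replica is down, try the next one
            self._next = (index + 1) % count   # rotate to share the load
            return answer
        raise ConnectionError("no replica reachable")


calls = []   # records which replica answered, for the demonstration only


def make_replica(label, up=True):
    """Stand-in for a real server; all replicas hold the same toy table."""
    def replica(name):
        if not up:
            raise ConnectionError(label)
        calls.append(label)
        return {"www.example.com": "192.0.2.10"}.get(name)
    return replica
```

With three replicas of which the second is down, successive queries are answered by the first, the third, then the first again, so the failed replica costs a retry but never an outage.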

Use of Distribution

The hierarchical model of DNS distributes the work of managing and handing out names by distributing the responsibility for operating name servers. Different name servers serve different top-level domains, so queries are naturally separated: one kind of name query goes to one name server and another kind goes to others.

More formally, the namespace is delegated at every domain, and the whole space is partitioned into areas called zones. A zone starts at a domain and extends down either to the leaf nodes, which are individual computers, or to another domain where another zone starts.
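
The partition into zones can be pictured as a tree in which each zone either answers for a name itself or hands the query to the child zone that starts further down. The sketch below is a toy model under that assumption; Zone, in_domain and resolve are names invented here, and the addresses come from the documentation range 192.0.2.0/24.

```python
class Zone:
    """One zone: the hosts it answers for and the child zones it delegates to."""

    def __init__(self, apex):
        self.apex = apex
        self.hosts = {}         # full host name -> address
        self.delegations = {}   # child domain -> Zone that starts there


def in_domain(name, domain):
    """True if name is the domain itself or lies below it."""
    return name == domain or name.endswith("." + domain)


def resolve(root, name):
    """Follow delegations from the root; return (address, zones visited).

    Assumes the delegations form a tree, so the walk always terminates.
    """
    zone, visited = root, [root.apex]
    while True:
        if name in zone.hosts:
            return zone.hosts[name], visited
        children = [d for d in zone.delegations if in_domain(name, d)]
        if not children:
            return None, visited          # nobody below knows the name
        zone = zone.delegations[max(children, key=len)]
        visited.append(zone.apex)


# Toy namespace: root delegates "com", which delegates "example.com".
root, com, example = Zone("."), Zone("com"), Zone("example.com")
root.delegations["com"] = com
com.delegations["example.com"] = example
example.hosts["www.example.com"] = "192.0.2.10"
```

Resolving "www.example.com" in this toy namespace passes through the root zone, the "com" zone and the "example.com" zone, and each of them only needs to know about its own part of the tree.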

Consistency/Synchronization Requirement

The DNS cache manager refreshes cache records when they expire. For consistency, caches maintain a time to live (TTL) for every entry.
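
A minimal sketch of TTL-based expiry looks like this. The class TtlCache is made up for this post, and the clock is passed in as a parameter purely so that expiry can be demonstrated without waiting.

```python
import time


class TtlCache:
    """Hypothetical name cache in which every entry carries a time to live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock, an assumption for testing
        self._entries = {}       # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: the caller must ask again
            return None
        return address
```

An entry stored with a TTL of 30 seconds is served from the cache until the clock reaches 30 seconds after the put, and from then on get returns None so the resolver has to fetch a fresh record.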

Every update operation goes to the primary server of the zone. Each secondary server periodically connects to the primary server and fetches the updates.
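
The primary/secondary arrangement can be sketched as follows. The version counter is an assumption modelled on the serial number that DNS keeps for each zone; the classes PrimaryServer and SecondaryServer and the whole-table copy are simplifications made up for illustration.

```python
class PrimaryServer:
    """Hypothetical primary: the only place where updates are applied."""

    def __init__(self):
        self.serial = 1      # bumped on every change to the zone data
        self.records = {}

    def update(self, name, address):
        self.records[name] = address
        self.serial += 1


class SecondaryServer:
    """Hypothetical secondary: polls the primary and copies newer data."""

    def __init__(self, primary):
        self.primary = primary
        self.serial = 0      # nothing transferred yet
        self.records = {}

    def refresh(self):
        """Meant to be called periodically; True if data was transferred."""
        if self.primary.serial == self.serial:
            return False     # already up to date, nothing to copy
        self.records = dict(self.primary.records)
        self.serial = self.primary.serial
        return True
```

Between two refresh calls the secondary can serve stale data, which is the price of polling: the shorter the refresh interval, the smaller the window of inconsistency and the higher the traffic to the primary.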



File system/Content management system (CMS)

A file system or CMS for collaborative work is a smaller environment than a wide area directory service, so the priority of the requirements discussed earlier changes. Scalability is not the top priority at small scale, but it remains a requirement for future growth. Other requirements, such as availability and consistency, remain major.

A CMS or file system for a university can use global naming, user-centered naming, or attribute-based naming.

Granularity of Names

Names in a CMS have coarse granularity. Because the system is not huge and not highly distributed, coarse granularity works. Moreover, in a system like this more detail can be packed into a name, which improves the performance of the naming system.

For instance, the Tilde naming system is a relative naming system based on a collection of small, disjoint, hierarchical namespaces. The number of levels is much smaller than in a wide area system, hence the lower granularity.
In the Prospero File System, the Virtual System Model implements the concept of closure, which reduces the granularity.
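
To give a feel for relative naming over small, disjoint trees, here is a loose sketch in the spirit of Tilde names. It is not the actual Tilde implementation: the ~tree/path syntax handling, the class TildeEnvironment and the storage paths are all assumptions made for illustration.

```python
class TildeEnvironment:
    """Hypothetical per-user table mapping short tree names to trees."""

    def __init__(self, trees):
        self.trees = dict(trees)   # tree name -> root of one small namespace

    def resolve(self, name):
        """Resolve '~tree/path' relative to this user's own set of trees."""
        if not name.startswith("~"):
            raise ValueError("expected a name of the form ~tree/path")
        tree, _, path = name[1:].partition("/")
        root = self.trees[tree]    # KeyError if the tree is not visible here
        return f"{root}/{path}" if path else root
```

Two users can bind the same short name to different trees. If one environment maps "lib" to /srv/trees/shared-lib-v2 and another maps it to /srv/trees/shared-lib-v1, then ~lib/sort.c names a different file for each user, and neither name needs to mention the many levels that a global hierarchy would require.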

Caching of Names/Placement of Caches

In a CMS or file system, caching names gives a large performance boost because locality of reference, or caching of aliases, plays a significant role. A cache is most effective for the most frequently accessed file names, a set that is limited and manageable in a small environment like a university. Depending on the naming architecture, the cache can be managed on a centralized server or on the primary name server, and it improves overall availability.
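
One simple way to keep only the most frequently used file names in a small cache is least-recently-used eviction. The sketch below assumes that policy, which the discussion above does not prescribe; NameLruCache and the lookup callback that stands in for the name server are made-up names.

```python
from collections import OrderedDict


class NameLruCache:
    """Hypothetical bounded cache of file name -> location lookups."""

    def __init__(self, capacity, lookup):
        self.capacity = capacity
        self.lookup = lookup             # slow path: ask the name server
        self._entries = OrderedDict()    # oldest entry first
        self.hits = 0
        self.misses = 0

    def resolve(self, name):
        if name in self._entries:
            self._entries.move_to_end(name)     # mark as recently used
            self.hits += 1
            return self._entries[name]
        self.misses += 1
        location = self.lookup(name)
        self._entries[name] = location
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)   # evict least recently used
        return location
```

Because of locality of reference, a capacity far smaller than the total number of files can still serve most lookups from the cache; the hits and misses counters make that easy to measure for a given workload.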

Use of Replication

Replication is required for high availability of the naming service. In a university environment with a global naming system, replication improves performance and acts as a load balancer so that requests are served quickly. It also provides fault tolerance by keeping replicas of collaborative work in progress.

Use of Distribution

In a global naming system designed as a hierarchy, as in Prospero, the local name server handles each request first. The next component of the name is then resolved by a directory server in response to the local name server. The naming service is therefore distributed in the sense that a user's name query is processed in a distributed fashion rather than by one dedicated name server. Keeping the global name service non-distributed while the file content is distributed can turn the name server into a bottleneck, especially when files are moved between servers and renamed frequently.

Consistency/Synchronization Requirement

Consistency is achieved through synchronization, and in a collaborative environment like a university synchronization is a high priority. Stale names or overly long time to live values in the cache can lead to poor performance in a dynamic collaborative environment. Frequent synchronization degrades system performance, but that cost can be accepted in order to provide consistency and reliability.





† Domain Name System
†† Directory Service
††† Wide Area Network
