Heading:
Grapevine: A Distributed Electronic Message System
Page Numbers: Yes X: 527 Y: 10.5"
Grapevine:
A Distributed Electronic Message System
Summary of Research Interests and Activities
Andrew Birrell
Roy Levin
Mike Schroeder
Xerox Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
May 27, 1980
We are interested in the structure of distributed systems and the issues that surround the design and implementation of practical instances of such systems. We believe that a realistic distributed system must face squarely the problems of naming, access control, reliability, and performance. Obviously, the degree of concern with each of these issues depends upon the goals and specific task domain of the individual system. To explore a particular set of choices and tradeoffs, we chose to design and implement a distributed system for the transport of electronic messages. However, we soon realized that the system could be viewed as providing a number of useful and interesting services for the construction of other distributed systems, including:
- replicated data bases for maintaining names (atoms) and groups of names
- authentication and access control facilities
- several communication protocols and services
- resource location mechanisms
Of course, all of these facilities are well-suited to the message system task, but find application in other distributed systems as well.
Our research has been performed within the general framework of an internetwork architecture with high-bandwidth local networks. The specific characteristics of the network architecture do not fundamentally influence the organization of our system; we merely assume some network-wide addressing mechanism for identifying host computers and low-level protocols for reliable communication between hosts. Thus, our solutions to the problems mentioned above are not specifically tailored to our particular computing configuration and should find applicability in similar networks.
The remainder of this document provides a functional overview of Grapevine, the message system we are building, and briefly discusses several research areas to which we believe this project has made contributions.
Overview of the Grapevine Project
Most conventional message systems rely on a central facility (e.g., a time-shared computer) that is used by all members of the message community. While a centralized approach simplifies the design of the message transport mechanism, it complicates use, growth, and administration when the community is large, dispersed, and includes people from multiple organizations. A distributed transport mechanism is more complex to design, but permits more natural use, simple growth, and flexible administration.
At the outset, we established a number of functional requirements for our distributed transport facility:
Coherent naming and name management. We wanted to rationalize and simplify the naming of recipients. In many present-day systems, the concepts of naming, addressing, and routing are often confused. For example, in the Arpanet message "system", recipient names contain explicit routing and resource location information that may change when additional networks or host computers are added. We wanted to separate these notions, allowing recipient names to be unaffected by changes in the network structure. We also wanted to decentralize the administration of the name space, and permit individuals to access appropriate portions of the naming data base.
Reliability and Availability. We wanted to assure users of the system that once a message has been accepted for delivery, it will eventually be delivered. We were willing to sacrifice throughput to achieve this reliability. In addition, we wanted the system to be available at all times, implying that a particular user should not be tied to a single server machine for his message sending and receiving activities.
Rapid interactive response. We wanted the time it takes for a user to inject a message into the system to be small. In particular, we were willing to delay most of the processing of the message until after the system has accepted it for delivery and the user has gone off to do other things. At the same time, we didn’t want to give up certain error checking that normally occurs before a message is accepted, e.g., validation of recipients. We also wanted message retrieval to be as swift as possible.
Distribution lists. We wanted to allow users to specify groups of recipients indirectly via distribution lists. We wanted to avoid a common mistake of other systems (including an earlier version of our own) in which distribution list names are treated as syntactic macros and are expanded using the file system. This technique introduces a separate name space (file names) that can interact with the recipient name space in confusing and undesirable ways. It can also lead to significant maintenance problems and compromise the goal of high availability. We wanted to unify the management of recipient and group names.
Security. We wanted to offer users a means of ensuring secure transmittal of messages. It seemed likely that recent advances in encryption techniques and protocol design would enable us to offer high security at a reasonable cost.
Having established these requirements, we now had to answer a central question: how do we decompose this functionality into the various components of the distributed system? To answer this question requires an in-depth examination of the structure and implementation of Grapevine, which is inappropriate here (although we would like to present some of it at the workshop). Instead, we briefly describe the research areas in which we believe the Grapevine system embodies some advances and interesting notions. At the same time, some aspects of the functional decomposition will become apparent.
Research Topics
The following paragraphs identify the topics of primary research interest that we have addressed in the course of building the Grapevine system.
Naming, Addressing, and Resource Location
Naming is a critical issue in any distributed system, and is of paramount importance in a message system. There are many distinct entities in the system that require names: computers, services, data bases, recipients, groups of recipients, etc. We carefully separate the identification of these entities from the mechanisms used to locate them, and their associated descriptions, in the distributed system. Because we expect the number of names to be large (especially the number of recipients) and because the system is geographically dispersed, we cannot reasonably require centralized administration of the name space. We therefore define a hierarchical name space in which portions of the hierarchy can be managed independently by multiple administrators. The semantics of the hierarchical levels are not constrained by the system; they may represent geographical or organizational groupings, or some other convenient structure determined by the administrators.
Although a hierarchical naming structure is appropriate for administration, a more flexible grouping mechanism is needed for other purposes. A registered name in the administrative hierarchy is either an individual (e.g., a recipient, a machine, a service) or a group (i.e., a set of names). Groups have no other inherent semantics, and are used by Grapevine to represent a variety of notions, including distribution lists, resource instances, access control lists (discussed below), and a number of internal structures. Simple as they are, groups represent a powerful organizational mechanism that is fundamental to Grapevine’s naming data bases.
It should be evident from the above discussion that there is no implicit connection between names in the Grapevine data bases and addresses of nodes in the underlying physical network. We derive several benefits from this decoupling:
Conceptual cleaniness. Names identify distinct entities; the semantics of those entities are derived from their names through explicit mapping, not textual inclusion (as in the Arpanet). These mappings may be one-to-one, one-to-many, or (occasionally) one-to-none.
Flexibility. Explicit mappings permit easy dynamic reconfiguration and clearly identify the parts of the system structure that permit (expect) changes.
Generality. The semantics of names are not wired-in, but are implied by the values to which they map and the places in which they appear. This permits them to be used for other purposes than just message transmission. In particular, we exploit them for access control functions as well (see below).
By keeping names and addresses separated, Grapevine is able to offer generalized protocols for locating (instances of) named services in the network. The message system itself offers several such services (e.g., mail delivery and retrieval, distribution list maintenance), but other applications can use Grapevine’s data bases and resource-location protocols to make their services available to the network population.
Consistency of the Naming Data Bases
All names are stored in a collection of data bases that are manipulated in a uniform manner. Most of these data bases are replicated in multiple computers to increase reliability and performance. To make this workable, of course, we must ensure in some way that copies of the same data base have consistent contents. We must ask ourselves the following questions, which should be familiar to all designers of distributed data bases:
Is it ever reasonable for a client of the data base to get different information from different copies of the same logical entities? If so, what does "consistency" mean?
How do updates propagate among copies of a data base? In the presence of machine and communication line failures, what ensures that updates will not be lost? Can a lost update cause permanent divergence of data base copies?
Our answers to these questions and the solutions in which they are embodied differ in interesting ways from some of the conventional approaches, yet do not depend on particular properties of the message transport task.
Reliability and Performance
The systematic separation of naming, addressing, and resource location leads to many levels of indirection in the data bases, each of which permits some degree of variation in the global configuration. Such mappings include, among others:
machine name => network address
service name => server(s) offering the service
recipient name => server(s) willing to store mail in transit
distribution list name => members of list
This can make some queries of the naming data base expensive to answer, particularly if the information is distributed across several servers. To enhance performance without sacrificing the flexibility implicit in these mappings, we depend heavily upon local caching and "hints". We then need only ensure that nothing disastrous happens when an obsolete cached value is used, and that the user of the value soon discovers that it is out-of-date and discards it.
This general approach gives high reliability without great complexity. Components of the Grapevine system tend to cache substantial amounts of information obtained over time by probes of the naming data base. A component uses a cached value until it either receives explicit notification that the information is obsolete or it discovers, through using the value, that it is no longer appropriate. The component then flushes the obsolete information and restarts its current activity (e.g., delivering a message) from the beginning. The next time around, an updated cache item will be obtained by consulting the data bases. Thus, components rarely attempt so-called "forward" error recovery, but rather "roll back" to a well-defined starting point. This keeps the code simple by cutting down on the number of error paths that must be written (and debugged), and the performance cost (in terms of redundantly executed code) is inconsequential.
Access Control
The naming data bases are manipulated by administrators and users alike, and therefore require some form of access control. One might reasonably want to allow any registered user to create a new distribution list, but permit only administrators to register new users. Access control lists provide a natural mechanism for representing such policies. We observe that the grouping mechanism already present in the naming data bases can be exploited to define access control lists. For example, one can associate with an entry in the data base an access control list for the individuals permitted to maintain that entry. We represent this list by a registered group name, which is itself an item in the naming data base.
Given this mechanism for defining access control lists, we can define a general set of operations and permissions that collectively offer an "access control service". That is, other facilities in the network can use the Grapevine access control mechanism to implement their own policies. For example, a file server might implement protection on a file by defining a group whose representation and manipulation reside within the Grapevine system, but whose name (access control list) is stored with the file. To check an attempted access by an individual, the file server asks Grapevine if the individual’s name appears in (is a member of) the group. Grapevine itself has no idea that it is performing a protection check; it is merely providing a mechanism for maintaining and examining groups of names.
Authentication and Security
The message system task offers an opportunity to apply some of the recent work in secure communication in a practical, everyday context. We have designed protocols that perform two-way authentication and deliver a secure communication link between the authenticated participants. Thus, a mail system user can be sure that he is submitting his confidential message to an authentic mail server, and the server can be sure that the sender is who he says he is. We use encryption techniques to secure the connection protocol and ensuing conversation against various well-known hazards, e.g., wiretaps, injection of extraneous data, retransmission of previous data, etc. Our protocols permit certain information to be cached locally, thereby reducing the cost of subsequent reconnection of the same participants without incurring the full authentication cost. This caching, however, does not compromise the security of the authentication mechanism. The protocols are also designed to permit authenticated connections between arbitrary individuals; they are not specialized to the message task.
Present Status and Outlook
We presently have one server machine in daily operation. It provides mail transport, name registration, and authentication services to about 25 users, and has done so for about 3 months. By the time of the workshop, we will have had three servers in operation for several months, providing mail and registration facilities for a community of several hundred users in Palo Alto and the Los Angeles area. We expect, therefore, to have some concrete experience with the distributed updating algorithms and the replicated data bases they maintain. We will also know whether the functional decomposition we have chosen satisfies our performance requirements in the presence of a realistic load.