A New Large-Scale Distributed System

Leandro Navarro

1997

We introduce in this work Object Distribution System, a distributed system based on distribution models used in everyday life (e.g. food distribution chains, newspapers, etc.). This system is designed to scale correctly in a wide area network, using weak consistency replication mechanisms. It is formed by two independent virtual networks on top of Internet, one for

A New Large-Scale Distributed System

María Eva M. Lijding
Computer Science Department, Universidad de Buenos Aires
mlijding@dc.uba.ar

Claudio E. Righetti (*)
Computer Science Department, Universidad de Buenos Aires
claudio@dc.uba.ar

Leandro Navarro Moldes
Computer Architecture Department, Universidad Politécnica de Cataluña
leandro@ac.upc.es

Abstract

We introduce in this work the Object Distribution System, a distributed system based on distribution models used in everyday life (e.g. food distribution chains, newspapers, etc.). This system is designed to scale correctly in a wide area network, using weak consistency replication mechanisms. It is formed by two independent virtual networks on top of the Internet, one for replicating objects and the other to build the distribution chains used by the first network. Since some Internet sites often become inaccessible due to latency, partitions and flash-crowds, objects in our system are accessed locally and updated off-line. The system also provides methods for classifying objects. This allows selective distribution and brings order to the chaos that reigns nowadays in the Internet. Distribution chains are built dynamically to provide end users with the objects they want to consume, while making good use of available resources.

1. Introduction

In the last few years the Internet has been growing exponentially in number of users and hosts, but the communication infrastructure has not kept pace with that growth. This growth brings about a significant rise in traffic, so users experience high latency when accessing resources. A resource is any object that can be accessed or referenced in a network [Deutsch 94] (e.g. a document, a name server, etc.). Berners-Lee argues that a reasonable latency is about 100 ms [Berners 95], but Viles and French have found that latency is about 500 ms [Villes 95].
Our measurements show that the situation nowadays is even more discouraging: we have found average latencies between Spain and Argentina of over 2000 ms. Packet loss is considered common at about 40% or more [Goldin 92b], but we usually suffer losses between 60% and 70% (measurements taken in Jan/Feb. 97). Another problem in the Internet is network partitions, due to link or node failures.

We must take into account not only the growth in the number of users, but also their behavior. When a large number of users share a common interest at a certain moment, they cause a flooding phenomenon at a resource server and in the nearby network, called flash-crowd [Nielsen 95]. This phenomenon can be seen with resources that thousands of users want to access simultaneously (e.g. 880,000 accesses to the NASA Web servers during the collision of comet Shoemaker-Levy 9). If the resource involved is a document, it is usually referred to as a hot document.

(*) Claudio E. Righetti - Instituto Nacional de Tecnología Industrial, Av. Gral. Paz y Constituyentes s/n, C.C. 157 - C.P. 1650 San Martín (Pcia. Bs. As.) - Argentina. Phone: 54-1-7526915 / Fax: 54-1-7545194

As a result, many resources are considered unreachable by huge numbers of users, who feel frustrated by this.

Another problem is the great volume of information available in the Internet. Generally there is no guarantee of the quality and reliability of such information (e.g. in News over 90% of the news is considered noise [Saltz 92]). The information available in the Internet is not classified, except in News and some catalogues on Web servers. This brings about yet another problem: how to find information relevant to the user. The excess of information and its associated problems lead to total misinformation. Sometimes the excess of information is a problem as important as the lack of it.

The IRTF (Internet Research Task Force) has stated that resource discovery tools (RDT) (e.g.
WAIS, Gopher, Web, Archie, Prospero, etc.) have scalability failures. To put these problems in a proper framework, Danzing et al. [Danzing 94] describe scalability in three dimensions: data volume, number of users and RDT diversity. Our work intends to solve the lack of scalability in the first two dimensions and to assure the availability of resources even when network partitions occur.

2. Framework

Replicating resources improves performance and availability. By storing copies of shared data on processors where they are frequently accessed, the need for expensive remote read accesses is decreased. By storing copies of critical data on processors with independent failure modes, the probability that at least one copy of the data will be accessible increases. But when resources are replicated, the correctness of each copy must be taken into account. Consistency is defined as the agreement of all the copies of the same logical data item on exactly one value [Davidson 85].

Maintaining correctness and availability of data during network partitions are competing goals. Correctness may be guaranteed by allowing operations to take place in only one partition. This means first of all that the system must be able to determine the existence of a network partition. In very congested networks, such as the Internet, this cannot be achieved, because a partition frequently cannot be distinguished from network latency. This way of maintaining correctness also diminishes the system’s availability.

If we want to ensure the availability of resources at all times, we can allow the normal operation of the system even in the face of partitions. In that case replicas may not always be consistent, and some correcting mechanism must be applied once the partition has been resolved [Davidson 85]. The degree of consistency of a distributed system depends mainly on the constraints inherent to the application.
Keeping brokers informed about the stock exchange is not the same as providing new articles to a group of researchers.

Systems based on weak consistency replication allow the replicas to diverge in the face of a network partition, so that each replica can continue offering service. Once the partition is resolved, the replicas eventually converge to a consistent state. We consider that significant latencies must also be treated as network partitions. We believe weak consistency protocols are needed in order to make replication mechanisms scalable in wide area networks. These protocols have already been used in a great variety of systems, due to their high availability, good scalability and design simplicity (e.g. Grapevine [Birell 82], Clearinghouse [Oppen 83], Locus [Popek 85], Coda [Satya 92], GNS [Lampson 86], AFS [Satya 93], News, Refdbms [Goldin 92a], Harvest [Obraczka 94], OSCAR [Downing 90]).

Information tools deployed on the Internet may all be classified as using one of the following techniques to ‘disseminate’ their information [Weider 94]:

♦ Come get it: This is the most used technique (e.g. FTP, Gopher, Web, etc.). It wastes the user's time through a combination of access latency and excess of information (relevant and noise). These systems are mainly based on a client-server model.

♦ Send it everywhere: We believe it is better to call this technique ‘send it to all interested members’. News, Mbone [Eriksson 94] and Harvest may be considered to use this technique. They are generally distributed systems, where service agents cooperate in order to replicate information.

An important aspect to take into account is the topology used to disseminate information and the way it is built.

Let’s take the Web as an example of an information system used by a great number of users. In fact the Web is responsible for most of today's traffic on the Internet. Like many other information systems, the Web was not designed to be used on such a large scale.
One solution to diminish the access latency of a Web document is to use cache techniques. These techniques store the documents requested by a user, so that they will be locally available when another user wants to access them. But the first user that accesses a document must bear the whole latency at that precise moment. Caches generally also lack robustness when network partitions occur, because they must validate the local replica against the original document. Even though more efficient ways of caching have been proposed (e.g. geographical push caching [Gwertzman 94], demand based dissemination [Bestravros 95], cooperative cache [Malpani 95]), they all have the problems just mentioned.

Caching                        Distribution
- on demand (readers)          - on production (authors)
- partial (page)               - total (volume)
- synchronous (same time)      - asynchronous (diff. time)
- event: get                   - event: post
- 1 doc; N (old) fragments     - 1 doc replicated N times

Fig. 1: Cache techniques vs. Distribution techniques

Cache techniques fail to handle large information volumes, and they also fail when information changes rapidly. Web documents are generally formed by several files. Cache techniques access documents using HTTP, but HTTP opens a TCP connection for each file in a document. Handling multiple TCP connections places a significant load on busy servers.

Another widely used technique to diminish latency and the flash-crowd phenomenon is to make an exact copy of a remote FTP site and present it as a local FTP site. This is usually known as mirroring or mirrored sites. But using mirrored sites has various problems. Users usually do not trust the source of the information and doubt that it is kept up to date. Another problem is that users do not remember the mirrored sites, and even when they do, they are not able to decide which one is the closest. Keeping mirrored sites does not prevent access to the original document, so most users go on accessing only one site.
Even though all the problems we have just mentioned are brought about by the behavior of users, there is yet another underlying problem: replication is generally carried out in a centralized manner.

3. Object Distribution System (ODS)

Our idea is to introduce a system based on distribution models used in everyday life (e.g. food distribution chains, publications, etc.). Consumers do not go to the places where the goods are produced (e.g. factories, author’s house, etc.); they buy them in the closest retail shop. Consumers purchase the goods that are already available in the shops; they need not wait for them to arrive. Even if the factory is constantly producing new goods, consumers purchase the goods already available in the shops. This system works because consumers trust their retail shops: they believe that the shops have products as fresh as possible at a reasonable price. Consumers that do not want to accept these rules can try to obtain the goods straight from the producers, but the producers may not want to supply them directly. Even if the producers accept direct purchasing, it is not as easy for consumers to do this as to go to retail shops (distance, working hours, product presentation, customer service, etc.).

In the Internet community this concept does not exist. We would like to introduce it in a way that fits the infrastructure we have nowadays, without modifying the protocols and standards in use, and trying to take advantage of their wide deployment. We believe that models that propose a radical change cannot find immediate viability, because defining new models and protocols to replace the ones in use is not as fast as the Internet community demands.

The goal of the Object Distribution System is to provide this service in the Internet. It is formed by two independent virtual networks: the Object Distribution Network (ODN) and the Object Routing Network (ORN).
ODN brings objects close to end users according to their interests, and ORN builds the distribution chains that ODN needs to do this in an optimal way.

Fig. 2: Object Distribution System (diagram labels: Object Distribution Network (ODN), group membership, distribution chains, Object Routing Network (ORN))

ODN handles objects (not in the precise sense given in object programming) that are persistent and replicated in every interested service agent. ODN can handle different classes at the same time. Inside each class, objects can be further classified by their authors or by some classification authority. Our objects are write-one/read-many (e.g. Web documents, FTP archives, etc.); this means that each object is only modified in the service agent where it is registered. In this way no information correction is needed, because the worst that may happen is that some users access, in a read-only manner, a version of an object that is not the latest one. It must be noted that the objects will be updated in finite time, even if this time is not bounded.

Accessing a version of an object other than the latest one is not acceptable in certain applications. This can be clearly seen in stock exchange information for real-time decisions or in a videoconference. We have designed ODN for objects that do not undergo constant changes. A good example of this kind of object is documents, because they are persistent, can be classified and do not change often. We introduce the Document Distribution Network (DDN) as a special case of ODN, for which we have a reference model.

We state that documents generally do not change often based on the following statistics:

− Bestavros [Bestavros 95b] states that the number of updates to documents that he calls ‘globally popular’ is at most 0.5% per day; moreover, these updates are restricted to a small subset of them.

− Blaze [Blaze 93] states that the probability that a document changes gets smaller as more time passes since its last update.
Service agents (members of ODN) join different groups according to the interests of their users. In each ODN group the service agents cooperate to obtain an efficient replication. Having groups allows selective replication. There is a group for each kind of object. In this way we also want to bring some order to the chaos caused by unclassified information.

ORN builds distribution chains dynamically for each group. To build the chains, the routing agents (members of ORN) take into account each service agent's type of membership in a group and the state of the underlying network. Although News and GNS distribute objects in a hierarchical manner, they do not build distribution paths dynamically.

The routing mechanisms used in ORN for building distribution chains are completely independent of the class of objects handled by ODN. Both networks were designed to work independently, with a clear interface defined between them so that ORN can provide services to ODN in a transparent way.

Fig. 3: Object Distribution System’s Protocols (diagram labels: ODN, ORN, UA, SA, RA, UAP, SAP, RAP, ORP)

ODN: Object Distribution Network; ORN: Object Routing Network; SA: Service Agent; UA: User Agent; RA: Routing Agent; SAP: Service Agent Protocol; UAP: User Agent Protocol; RAP: Routing Agent Protocol; ORP: Object Routing Protocol

Some authors propose the use of IP multicast [Deering 89] for massive replication (e.g. DPM [Donnelley 95]). Using IP multicast implies best-effort sending of datagrams to a group of hosts that share a unique IP address. This suits real-time applications for audio and video, where the loss of a datagram is not a significant problem and receiving datagrams too late or duplicated is much worse. But using IP multicast directly does not work when data consistency is needed.
Message delivery is not reliable, and all the members of the group must be active to receive an update, because the distribution is done directly by the producer of the object. Multicast routing algorithms (e.g. CBT [Ballardie 93], DVMRP [Waitzman 88], MOSPF [Moy 94], PIM [Deering 96], etc.) may be studied separately from the use of IP multicast. As ours is an application-level problem, it is not right to depend on the network level to solve it. Our proposal is to adapt routing techniques used at the network level to the application level.

3.1 Object Distribution Network (ODN)

We define an Object Distribution Network (ODN) as a set of service agents (SA) that cooperate in order to replicate objects for users at different sites. The users need not know which is the original site of each object, let alone whether it is reachable. Each user accesses a replica of the objects he subscribes to in the service agent that provides him with an access point to ODN, and also registers there the new objects he wants to disseminate through ODN.

Users actually access ODN through user agents (UA) that work as interfaces. A user agent communicates with the service agent using the User Agent Protocol (UAP). User agents are not part of the system, so the way users interact with them is not defined.

Fig. 4: Service Agent’s Protocols (diagram labels: ODN, UA, UAP, SA, SAP)

SA: Service Agent; UA: User Agent; ODN: Object Distribution Network; UAP: User Agent Protocol; SAP: Service Agent Protocol

Objects are uniquely identified. Objects can only be updated or deleted by their owners. They have an associated version that allows the service agents to decide whether an object is up to date. A possible way to determine the version of an object is to have attributes for the last update and, if it was deleted, a timestamp indicating when this took place. These attributes of an object can only be modified in the service agent where the object is registered.
Each service agent keeps only the last version it received of an object, without keeping track of former versions. A new version of an object always replaces former versions. If an object is deleted, each service agent must keep some information about it, in order to be able to decide whether it is being offered an old version.

Objects have two parts: meta-information and body. Service agents can hold whole objects or meta-information alone. Users subscribe to certain kinds of objects in their user agents. Subscription is made to whole objects or to meta-information, by choosing individual objects or sets of objects.

Each service agent defines its role in ODN by tags that represent the types of objects that it produces and consumes. We say that a service agent produces the objects its users register. It consumes the kinds of objects its users are interested in. ODN handles three types of tags: Producer[X] (P[X]), Object Consumer[X] (C[X]) and Meta-information Consumer[X] (MIC[X]), where X indicates the kind of object the tag refers to.

A group is a set of service agents joined by a shared interest. There is a group for each possible X. Service agents belong to all the groups matching their tags. In each group a distribution chain must be maintained. This chain must assure that all the consumers of the group are able to receive the objects produced in the group. A service agent considers as clients all the agents in the group to which it must send objects. Conversely, a service agent that receives objects considers those who send them as suppliers.

Service agents work actively to distribute objects, without waiting until they are asked for (‘send everywhere’). Service agents communicate using the Service Agent Protocol (SAP). When a service agent receives a new version of an object, it offers it to its clients. In this way objects can reach all the service agents of the group, without users having to ask for them.
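As an illustration only, the offer-and-replace behavior described above could be sketched as follows in Python. The class ServiceAgent, its methods, the transfers counter and the use of integer versions are assumptions of this sketch (the text only suggests a last-update attribute as a possible version); it does not reproduce SAP itself. The sketch also assumes that an agent refuses any version that is not newer than the one it holds, which is what keeps an object from circulating forever when clients form a cycle.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceAgent:
    """Hypothetical service agent keeping only the last version of each object."""
    name: str
    clients: list = field(default_factory=list)  # agents this one must offer objects to
    store: dict = field(default_factory=dict)    # object id -> (version, body)
    transfers: int = 0                           # accepted offers, kept for illustration

    def offer(self, obj_id, version, body):
        """Accept the object only if it is newer than the stored one, then re-offer it."""
        current = self.store.get(obj_id)
        if current is not None and current[0] >= version:
            return False                         # old or already-seen version: refuse it
        self.store[obj_id] = (version, body)     # the new version replaces the former one
        self.transfers += 1
        for client in self.clients:
            client.offer(obj_id, version, body)  # pass the new version on to the clients
        return True

    def register(self, obj_id, version, body):
        """Register a version at this agent and start its distribution."""
        return self.offer(obj_id, version, body)


# Three agents whose client lists contain a cycle (C offers back to A).
a, b, c = ServiceAgent("A"), ServiceAgent("B"), ServiceAgent("C")
a.clients = [b, c]
b.clients = [c]
c.clients = [a]
a.register("doc-1", 1, "first version")
```

In this example every agent ends up storing version 1 after accepting it exactly once, even though the client lists contain a cycle. A deleted object could be represented in the same store by a newer version with an empty body, so that stale offers are still refused; that representation is likewise an assumption.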
Users need not make any effort to obtain the objects they subscribed to; they just access the version that is in their service agents. It is the task of the service agents to keep the objects up to date.

Distribution of an object begins at the service agent where it is registered. Let’s call this service agent the home service agent. The home SA offers the object to its clients, which do the same with their clients. If the distribution chain provides at least one path from the home SA to all consumers in the group, this flooding mechanism assures that the object will reach all interested SAs in the group.

Even if a distribution chain provides more than one path between two SAs, objects do not cycle. Service agents reject versions of an object that are not newer than the one they have. When they are offered a newer version, they accept it, and once they receive it, they offer it to their clients. A distribution chain is built only with service agents from one group. Service agents only receive objects they are interested in consuming.

Each consumer keeps a persistent local replica of all objects in the group and of the objects that are locally produced. In this way service agents can offer the same object to their clients again and again. They can also determine whether the version of an object being offered to them is newer than the ones they have seen, without this placing any burden on them. In other networks, such as Mbone, maintaining the information needed to determine whether an object was already flooded may become too expensive, so distribution chains must not have alternate paths [Semeria 97].

3.1.1 Distribution Chains

Networks have a physical and a logical topology. The physical topology is determined by the connections between physical components. The Internet’s logical topology is a completely connected graph, because IP hides the physical topology to allow all hosts to communicate. We must find distribution chains over this logical topology.
Distribution chains must allow the objects produced in the group to reach all the consumers in the group, while optimizing the distribution cost. The tags each service agent has define its role in every group. When building distribution chains, each service agent is treated according to its role. Roles are not exactly the same as tags, even though they are defined by the tags (e.g. there is no tag for mere consumers, but there is a consumer role, which implies that the service agent is either an object consumer or a meta-information consumer).

Fig. 5: Service Agent’s roles (diagram labels: Service Agent, Producer, Pure Producer, Consumer, Object Consumer, Meta-information Consumer, Producer/Consumer)

Distribution chains are built by an Object Routing Network (ORN). The way this network works is not known to the service agents. ORN is formed by routing agents that cooperate to build distribution chains that suit the requirements of the service agents, through a protocol known generically as the Object Routing Protocol (ORP). Each service agent is a user of a routing agent that provides it with the set of clients it must have in each group it joins. Service agents communicate with the routing agents using the Routing Agent Protocol (RAP).

Fig. 6: Routing Agent’s Protocols (diagram labels: ORN, SA, RAP, RA, ORP)

SA: Service Agent; RA: Routing Agent; ORN: Object Routing Network; RAP: Routing Agent Protocol; ORP: Object Routing Protocol

Finding efficient distribution chains is not trivial, and the most appropriate way of doing it depends strongly on the working scenario. The number of service agents must be taken into account, as well as their behavior. The network infrastructure is also very important. In some simple cases distribution chains may be manually configured, but in order to scale correctly a dynamic routing protocol is needed. All of these factors determine the kind of routing protocol that is needed.
Distribution chains must achieve the following goals:

• Respect the tags of service agents
• Minimize distribution costs
• Adaptivity
• Scalability

3.2 Object Routing Network (ORN)

It is the task of this network to build distribution chains for each group of ODN that has at least one producer and one consumer. Each chain must provide a path from every producer to all consumers. There need not be only one path; there may be several, as ODN has an independent way to prevent objects from cycling.

We can represent ODN as a graph, where each service agent is a node. As we are working over a logical network that allows all nodes to communicate, the graph is complete. Each arc in the graph is assigned a weight that represents the cost of using the logical link. A possible way to determine costs is to use statistical measurements of the state of the link (e.g. using ‘ping’). Nodes in this graph are classified according to their roles in each group. We will handle this graph as a set of graphs at different levels, one for each group. Computations will be done using these resulting graphs.

Routing agents must build distribution chains taking into account that the members of a group may change, paying special attention to the fact that their number may grow significantly. They must also consider the state of the underlying network.

We introduce here a distributed routing mechanism that may be used in a flat or hierarchical routing topology. We first introduce the basic algorithm to be used in a flat topology and afterwards the changes needed to adapt it to a hierarchical topology. Finally we show a protocol suite to be used in a hierarchical topology.

Routing agents determine the structure of groups dynamically. Each routing agent announces to its peers the tags of the service agents it represents. With this information, each routing agent can know the members of each group.
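A minimal sketch of how a routing agent might derive group membership and roles from the announced tags is given below. The function names, the role strings and the encoding of a tag as a (tag, kind) pair are assumptions made for illustration. The sketch also assumes that an agent holding C[X] needs no separate meta-information role in group X, since it already receives whole objects.

```python
from collections import defaultdict

PRODUCER_ROLES = {"pure producer", "producer/consumer"}
CONSUMER_ROLES = {"producer/consumer", "object consumer", "meta-information consumer"}


def roles_from_tags(tags):
    """Map the tags one agent holds for one kind X ("P", "C", "MIC") to its roles."""
    produces = "P" in tags
    if "C" in tags:
        return {"producer/consumer"} if produces else {"object consumer"}
    roles = set()
    if produces:
        roles.add("pure producer")
    if "MIC" in tags:
        # A pure producer that also consumes meta-information plays both roles.
        roles.add("meta-information consumer")
    return roles


def build_groups(announced):
    """announced: {agent_id: set of (tag, kind)} -> {kind: {agent_id: set of roles}}."""
    per_group = defaultdict(lambda: defaultdict(set))
    for agent, tags in announced.items():
        for tag, kind in tags:
            per_group[kind][agent].add(tag)
    return {kind: {agent: roles_from_tags(t) for agent, t in members.items()}
            for kind, members in per_group.items()}


def needs_chain(group):
    """A chain is needed only if the group has a producer and a consumer."""
    roles = set().union(*group.values())
    return bool(roles & PRODUCER_ROLES) and bool(roles & CONSUMER_ROLES)


announced = {
    "A": {("P", "X")},
    "B": {("P", "X"), ("C", "X")},
    "C": {("MIC", "X")},
    "D": {("C", "Y")},
    "E": {("P", "X"), ("MIC", "X")},
}
groups = build_groups(announced)
```

Here group X needs a distribution chain, while group Y (a single object consumer and no producer) does not.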
Routing agents must make computations for all the groups in which at least one of the service agents they represent is involved.

Each service agent has information according to its role in the group. In the graph for each group, nodes (service agents) with the same role form a layer. The nodes of each layer hold the following information:

1) Pure Producers: only locally produced objects.
2) Producers/Consumers: objects that are locally produced, because they are producers, and all objects in the group, because they are consumers.
3) Object Consumers: all objects in the group.
4) Meta-Information Consumers: meta-information of all objects in the group.

We build the distribution chain of a group in layers. Each node has clients in the same layer or in higher layers (see fig. 7). In this way we exclude certain distribution chains that may be optimal but do not obey this layer division. Within each layer the global distribution cost is minimized.

Producers/consumers form a completely connected graph that we call the core. Inside the core there is a path between every pair of nodes, so each node inside the core eventually has all objects produced in the core. Each pure producer chooses a node inside the core as its client. In this way every node inside the core will also have all objects produced by the pure producers of the group. In other words, every node in the core eventually has all objects in the group.

As long as there are producers in the group, there must be a core. If there are no producers/consumers, an object consumer is chosen. If there are no object consumers either, then a meta-information consumer is chosen.

Fig. 7: Communication between layers (diagram labels: Pure Producer, Producer/Consumer, Object Consumer, Meta-Info Consumer)

Fig. 8: Example of a distribution chain (nodes: A P[X], D P[X]; B PC[X], I PC[X], J PC[X]; E C[X], G C[X], H C[X], L C[X], M C[X]; C MIC[X], F MIC[X], K MIC[X]; other labels: core, Core Node, Objects, Metainformation)

Routing agents must have a global view of the group in order to build distribution chains. Each routing agent floods its peers with the link states of its nodes [Mc Quillan 80], independently of the groups they belong to. Each routing agent keeps this information, received from its peers, in a topological table. Routing agents make independent computations for each group using this table.

Routing agents need to know how to flood information to the other routing agents of ORN. Entities that we call managers are in charge of providing them with this information. When a routing agent starts working, it subscribes with the manager. The manager announces the subscription of the new member to the routing agents that are already working and informs the new member about all the members of ORN.

3.2.1 Algorithm

Nodes have a unique identifier. Identifiers can be sorted in ascending order (e.g. IP addresses or character strings), so given a set of nodes it is always possible to decide which one has the smallest identifier. The arcs of the graph are also uniquely identified by the identifiers of the nodes they join, given as an ordered pair (i, j) with i < j. In this way arcs can be sorted in lexicographic order: for any pair of arcs (i, j) and (k, l), (i, j) < (k, l) if and only if i < k, or i = k and j < l.

In our proposed model a core must always exist in the group if a distribution chain is needed. So if there are no producers/consumers, the object consumer with the smallest identifier is chosen to be the core of the group. If there are no object consumers either, the meta-information consumer with the smallest identifier is chosen.

A routing agent that represents a pure producer picks the link to a node of the core with minimum cost. Every routing agent that represents producers/consumers computes a minimum spanning tree that covers all nodes in the core. We call this tree the producers tree.
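The core selection rule and the choice made for a pure producer can be sketched as below. The function names, the role strings and the cost dictionary keyed by (producer, core node) are hypothetical. The text does not say how a pure producer breaks a tie between two core nodes of equal cost, so choosing the smaller identifier here is an assumption, made by analogy with the rule for arcs.

```python
def select_core(roles):
    """Return the sorted identifiers of the core nodes of one group.

    roles: {node_id: role}. All producers/consumers form the core; if there are
    none, the object consumer with the smallest identifier is chosen, and
    failing that the meta-information consumer with the smallest identifier.
    """
    producers_consumers = sorted(n for n, r in roles.items() if r == "producer/consumer")
    if producers_consumers:
        return producers_consumers
    for fallback in ("object consumer", "meta-information consumer"):
        candidates = sorted(n for n, r in roles.items() if r == fallback)
        if candidates:
            return candidates[:1]
    return []


def producer_client(producer, core, cost):
    """Pick the core node reached by the minimum-cost link from a pure producer.

    cost: {(producer_id, core_node_id): link cost}. Ties go to the smaller
    identifier (an assumption of this sketch).
    """
    return min(core, key=lambda node: (cost[(producer, node)], node))


group = {"a": "pure producer", "b": "producer/consumer",
         "j": "producer/consumer", "h": "object consumer"}
core = select_core(group)
client = producer_client("a", core, {("a", "b"): 5, ("a", "j"): 3})
```

With these inputs the core is the two producers/consumers, and pure producer "a" takes "j" as its client because that link is cheaper.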
We propose to use Prim’s algorithm to compute the spanning tree. Whenever there is a conflict between two arcs of equal cost, the one with the smaller identifier is chosen. By doing this all routing agents making this computation will obtain the same tree, as long as the topological tables they use are the same. As minimum spanning tree computations are defined over an undirected graph, we treat the topological table as a symmetric adjacency matrix. To build this new matrix (A*) we could assign a*_ij = a*_ji = (a_ij + a_ji) / 2.

    Define a set S of nodes included in the tree
    S = {1}    /* original version */ (#)
    Define a set LIST, that will have the arcs chosen for the tree
    LIST = ∅
    While |S| < n
        Choose arc (i, j) of minimum cost in the cut [S, Sc] (i ∈ S and j ∈ Sc)
        Add j to S
        Add (i, j) to LIST
    End While

Fig. 9: Prim’s Algorithm

The service agents represented by the nodes of the producers tree must consider all their adjacent nodes in the tree as clients. Each pair of adjacent nodes in this tree see each other as both supplier and client.

Routing agents that represent object consumers compute a minimum spanning tree in which all object consumers are included. We call this tree the consumers tree. It is built using Prim’s algorithm with S = {producers/consumers} (see (#) in fig. 9). In a similar way, routing agents that represent meta-information consumers must compute another tree, called the meta-information tree, that includes all nodes interested in consuming meta-information. To build this tree we use S = {all nodes that consume objects (producers/consumers and object consumers)} (see (#) in fig. 9). Fig. 8 illustrates a whole distribution chain.

When a node is both a pure producer and a meta-information consumer, its routing agent must carry out the computations as if for two different nodes. The node will appear twice in the distribution chain, sending objects to the core and receiving meta-information through the meta-information tree.
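The seeded variant of Prim’s algorithm described above could look roughly like the following sketch. The names symmetrize and prim, the representation of nodes as integer indices into a cost matrix, and the example costs are assumptions for illustration. Ties between arcs of equal cost are broken by the lexicographic order of the arc written as an ordered pair (i, j) with i < j, so that every routing agent working from the same table obtains the same tree.

```python
def symmetrize(a):
    """Build A* with a*_ij = a*_ji = (a_ij + a_ji) / 2 from a directed cost table."""
    n = len(a)
    return [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]


def prim(cost, seed, targets):
    """Grow a tree from the non-empty seed set S until every target node is included.

    cost:    symmetric matrix of link costs, indexed by node identifier
    seed:    nodes initially in S ({first node} in the original version, the
             producers/consumers for the consumers tree, and so on)
    targets: nodes that the finished tree must cover
    Returns the chosen arcs as (i, j), with i already in S and j newly added.
    """
    in_tree = set(seed)
    remaining = set(targets) - in_tree
    arcs = []
    while remaining:
        # Cheapest arc across the cut [S, Sc]; equal costs are resolved by the
        # smaller arc identifier (min(i, j), max(i, j)).
        _, _, i, j = min(
            (cost[i][j], (min(i, j), max(i, j)), i, j)
            for i in in_tree
            for j in remaining
        )
        in_tree.add(j)
        remaining.discard(j)
        arcs.append((i, j))
    return arcs


cost = [
    [0, 1, 4, 3],
    [1, 0, 2, 3],
    [4, 2, 0, 3],
    [3, 3, 3, 0],
]
producers_tree = prim(cost, {0}, range(4))    # original version: S = {first node}
consumers_tree = prim(cost, {0, 1}, {2, 3})   # S = {producers/consumers}
```

In the first call the last step faces three arcs of cost 3 leading to node 3, and the arc (0, 3) wins because it has the smallest identifier. The second call shows the seeded form: nodes 0 and 1 play the role of the core, and only the arcs that attach nodes 2 and 3 are returned.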
3.2.2 Hierarchical Routing

In a hierarchical topology nodes are divided into domains. In each domain an intra-domain routing protocol is used, generally called an internal routing protocol. This protocol may be different in each domain. To build paths between domains, an inter-domain routing protocol (or external routing protocol) is used.

Service agents should not even notice that the routing topology is hierarchical, because the way distribution chains are built must be transparent to them. Something similar happens in the Internet with IP, which hides all details about the division into autonomous systems.

We can also view domains as nodes of a graph. This graph is also complete. Its arcs are assigned weights statically. Arcs represent logical links that connect two domains. As all nodes inside a domain are able to communicate with all nodes in other domains, the weight of an arc can be taken as an estimate of the average behavior of the logical links between all nodes in the two domains the arc joins.

Each domain has its own manager, which handles the subscriptions of the routing agents of the domain, and an entity that we call the external router, in charge of inter-domain routing. External routers summarize the tags of the domain's nodes and assign this summary to the domain as a set of tags. The tags of the domain are announced to the peer external routers. External routers also flood their peers with the external link states of their domains.

Using this information, each external router runs the algorithm we proposed for each group with local members. The algorithm is computed over the graph that represents domains, to obtain inter-domain routing chains. As a result each external router knows which domains must be considered clients of its domain; but service agents do not know anything about domains and need information about service agents. Each domain must provide a node from its core for each group with local consumers.
This is only needed for the groups that have an inter-domain routing chain. A core must be formed even if there is no intra-domain distribution chain because of a lack of producers in the domain. The node chosen to represent the core externally is the one with the smallest identifier.

When there are no nodes that consume objects in the domain (e.g. there are only pure producers), there is no core with objects. There may be a core with meta-information if the domain has meta-information consumers, but a node from this core is not useful to supply other domains with the objects produced in the domain. In this case each producer in the domain must take as clients the selected nodes from the client domains.

The routing agent of the chosen service agent provides information about this SA to the external router. The external router forwards this information to the external routers of the supplier domains. These external routers finally pass this information to the routing agents of the service agents that will act as suppliers (a node from the core or the pure producers of the domain).

4. Document Distribution Network (DDN)

This network is a special case of ODN, where the main objects to be distributed are documents, although they are not the only class of objects DDN handles. The complete list of object classes is:
• Documents
• Classification Schemes
• Document Authors
• Classification Schemes Authors

Classification schemes make it possible to classify documents in order to fulfill users' requests better, as they provide a framework to carry out selective distribution. A classification scheme has labels that are assigned to documents in order to classify them. Depending on the working scenario, labels can be assigned by the authors or by classification authorities. We call document author everyone who produces documents inside DDN, and classification scheme authors those who produce schemes.
We consider that an author inside DDN can really be a working team. An author is restricted to work in only one service agent. In order to allow different people to act as an author, a service agent can be accessed by the same author through several user agents.

[Fig. 10: Relation between DDN objects. Entities: Classification Scheme Author, Document Author, Document, Classification Scheme, Label. Relations: belongs, classified as, part of.]

We propose that subscription to objects of DDN be done by specifying one of the following:
− Documents with a Label.
− Documents of an Author.
− Documents of an Author with a Label.
− A Document.
− Information about Document Authors.
− Information about Classification Schemes Authors.
− Classification Schemes of an Author.
− A Classification Scheme.

Each user agent acts as an interface, so it should adapt to the working environment and needs of its users, allowing them to configure it according to their preferences. User agents need not be the same everywhere. We mention here some possible user agent implementations:
• Access (potentially off-line) to publications by a client application in a personal workstation. At the user's request it fetches the new information that may interest the user. It also hands in the documents the user has produced since the last connection.
• CGI application. Our reference model uses this.
• A process that cooperates with a proxy server [Luotonen 94] and allows HTTP requests from users to be redirected to local requests to the service agent. This only uses part of the functionality provided by a service agent, but is transparent to users, who need not even know about the existence of DDN.

Each service agent can be provided with internal functionalities that configure its behavior in various ways (e.g. supplier of scheme collections; supplier of catalogues without contents; references to documents on paper; publisher (editorial) with different policies of document admission, charge and delivery; university libraries with thematic and admission policies; etc.).

[Fig. 11: Commercial DDN. Actors: Literary Author, Editorial, Bookshop, Reader. Links labelled Contract and Payment (Contract); connections marked Near (UA-SA) and Far (DDN).]

[Fig. 12: Academic DDN. Actors: Scientific Author, Scientific Society, University Library, Scientific Reader. Links labelled Contract and Contract (inter-organisational); connections marked Near (UA-SA) and Far (DDN).]

Through DDN we want to prevent the volume of available documents from becoming too large for users to handle, and to provide users with a reference about the quality of those documents. As we have already seen, this must be done in a way that improves access performance to documents and uses resources efficiently.

5. Conclusions

We have presented a weakly consistent replication system, Object Distribution System (ODS). It uses the available resources in Internet and adapts to the underlying network. Based on ODS, there may be multiple systems working in Internet at the same time, each one with its own kind of objects, security mechanisms, classification authorities, etc. They can be easily set up for special events, such as workshops, virtual courses, etc.

Objects are updated automatically off-line. Users subscribe to receive certain objects and the system assures them that they will always have access to the latest available version of those objects. This model is suitable for persistent objects that do not change constantly and whose changes need not be reported to users immediately.

In order to fulfill the requests of users better, service agents get together in groups. We use selective replication, by building a dynamic distribution chain for each group. Chains adapt to the state of the underlying network and respect the tags of each service agent.
The two virtual networks of ODS, Object Distribution Network (ODN) and Object Routing Network (ORN), were designed to work independently. This will allow us to continue developing each network separately. We also wish to generalize the routing mechanisms used in ORN for other systems and to define ORN services in a more general way.

Comparing the reference model of DDN with FTP and HTTP, we found that some documents could never be transferred between Argentina and Spain using either FTP or HTTP, due to the latency and high packet loss, but were successfully replicated using DDN. These documents were not as big as might be thought, just over 1Mb, but FTP and HTTP closed the connection before completing the transfer. We must add that the reference model we are now using is just a prototype and can be further improved to work more efficiently.

We are now studying how to implement copyrights, signatures, certification, payment and distribution registers (e.g. number of users that access a replicated object, number of replicas, etc.).

References

[Ballardie 93] Ballardie, A.J.; Francis, P.F.; Crowcroft, J. “Core Based Trees (CBT)”. Proc. of the ACM SIGCOMM '93. San Francisco, CA. August 1993.
[Berners 95] Berners-Lee, Tim. “Propagation, Replication and Caching”. Web Consortium. MIT. December 1995. http://www.w3.org/member/WWW/Propagation/Activity.html
[Bestavros 95] Bestavros, Azer. “Demand-based dissemination for Distributed Multimedia Applications”. Proceedings of the ACM/ISMM/IASTED International Conference on Distributed Multimedia Systems and Applications. Stanford, CA. August 1995.
[Birell 82] Birrell, Andrew; Levin, Roy; Needham, Roger; Schroeder, Michael. “Grapevine: An Exercise in Distributed Computing”. Communications of the ACM. Vol 25 - num 4. April 1982. Pp 260-274.
[Blaze 93] Blaze, Matthew. “Caching in Large Scale Distributed File Systems”. PhD Thesis, Technical Report 397-92. Princeton University, Dept. of Computer Science. January 1993.
[Bowman 94] Bowman, M.; Danzig, Peter; Manber, Udi; Schwartz, Michael F. “The Harvest Information Discovery and Access System”. Computer Networks and ISDN Systems. Vol 28. December 1995. Pp 119-125.
[Danzing 94] Bowman, M.; Danzig, Peter; Manber, Udi; Schwartz, Michael F. “Scalable Internet Resource Discovery: Research Problems and Approaches”. Comm. of the ACM. Vol 37 - num 8. August 1994. Pp 98-107.
[Davidson 85] Davidson, S.; García-Molina, H.; Skeen, D. “Consistency in Partitioned Networks”. ACM Computing Surveys. Vol 17 - num 3. September 1985. Pp 341-370.
[Deering 96] Deering, S.; Estrin, D.; Farinacci, D.; Jacobson, V.; Liu, C.; Wei, L. “Protocol Independent Multicast (PIM): Motivation and Architecture”. <draft-ietf-idmr-pim-arch-04.ps>. September 1996.
[Deutsch 94] Deutsch, P.; Emtage, A. “Publishing Information on the Internet with Anonymous FTP”. <draft-ietf-iiir-publishing-01.txt>. May 1994.
[Donnelley 95] Donnelley, James. “WWW media distribution via hopwise reliable multicast”. Computer Networks and ISDN Systems. Vol 27 - num 6. April 1995. Pp 781-788.
[Downing 90] Downing, A.R.; Greenberg, I.B.; Peha, J.M. “OSCAR: A system for weak-consistency replication”. Proceedings Workshop on the Management of Replicated Data. Houston, Texas. November 1990. Pp 26-30.
[Eriksson 94] Eriksson, Hans. “MBONE: The Multicast Backbone”. Comm. of the ACM. Vol 37 - num 8. August 1994.
[Golding 92a] Golding, Richard. “Weak Consistency group communications and memberships”. Ph.D. Thesis. University of California, Santa Cruz. December 1992. ftp://ftp.cse.ucsc.edu/pub/ucsc-crl -92-52.ps.Z
[Golding 92b] Golding, Richard. “End-to-end performance prediction for the internet”. Technical Report UCSC-CRL-92-26. University of California, Santa Cruz. June 1992. ftp://ftp.cse.ucsc.edu/pub/ucsc-crl -92-26.ps.Z
[Guyton 95] Guyton, J.; Schwartz, M. “Locating Nearby Copies of Replicated Internet Servers”. Technical Report CU-CS-762-95. Dept. of Computer Science, Univ. of Colorado, Boulder. February 1995.
[Gwertzman 94] Gwertzman, James; Seltzer, Margo. “The case for geographical push-caching”. HotOS Conference. 1994. ftp://das-ftp.harvard.edu/techreports/tr-3494.ps.gz
[Lampson 86] Lampson, B. W. “Designing a Global Name Service”. Proceedings of the Fifth ACM Annual Symposium on Principles of Distributed Computing. Calgary, Canada. August 1986. Pp 1-10.
[Luotonen 94] Luotonen, A.; Altis, K. “World Wide Web Proxies”. 1st International Conference on the World Wide Web. May 1994.
[Malpani 95] Malpani, Radhika; Lorch, Jacob; Berger, David. “Making World Wide Web Caching Servers Cooperate”. 5th International Conference on the World Wide Web. December 1996.
[Mc Quillan 80] McQuillan, John; Richer, Ira; Rosen, Eric C. “The New Routing Algorithm for the ARPANET”. IEEE Transactions on Communications. Vol 28 - num 5. May 1980.
[Moy 94] Moy, John. “Multicast Extensions to OSPF”. RFC 1584. March 1994.
[Nielsen 95] Nielsen, J. “Multimedia and Hypertext”. Academic Press, San Diego. 1995.
[Obraczka 94] Obraczka, Katia. “Massively replicating services in wide area internetworks”. Ph.D. Thesis. University of Southern California. December 1994.
[Oppen 83] Oppen, D. C.; Dalal, Y. K. “The Clearinghouse: A decentralized agent for locating named objects in a distributed environment”. ACM Transactions on Office Information Systems. Vol 1 - num 3. July 1983. Pp 230-253.
[Popek 85] Popek; Walker; Kiser; English; Matthews; Butterfield; Thiel. “The LOCUS Distributed System Architecture”. The MIT Press, Cambridge, London, England. 1985.
[Satya 92] Satyanarayanan, M.; Kistler, James. “Disconnected Operation in the CODA file system”. ACM Transactions on Computer Systems. Vol 10 - num 1. February 1992. Pp 3-25.
[Semeria 97] Semeria, C.; Maufer, T. “Introduction to IP Multicast Routing”. <draft-ietf-mboned-intro-multicast-00.txt>. 3Com Corporation. January 1997.
[Villes 95] Viles, C.L.; French, J.C. “Availability and Latency of World Wide Web Information Servers”. Computing Systems. Vol 1. 1995. Pp 61-91.
[Waitzman 88] Waitzman, D.; Partridge, C.; Deering, S. “Distance Vector Multicast Routing Protocol”. RFC 1075. November 1988.
