Bibliography on Replication
author: Cheriton, David and Skeen, Dale
title: Understanding the Limitations of Causally and Totally Ordered Communication
pages:
booktitle: Proceedings of the 14th Symposium on Operating
Systems Principles (SOSP-14)
year: 1993
publisher: ACM
address: Asheville, North Carolina
month:
author: Petersen, Karin and Spreitzer, Mike and Terry, Douglas and Theimer, Marvin and Demers, Alan
title: Flexible Update for Weakly Consistent Replication
pages: 3--6
booktitle: Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP-16)
year: 1997
publisher: ACM
address: Saint-Malo, France
month: oct
author: Schiper, André and Raynal, Michel
title: From Group Communication to Transactions in Distributed Systems
pages: 84--87
journal: cacm
year: 1996
volume: 39
number: 4
month: apr
author: Steen, Maarten van and Homburg, Philip and Tanenbaum, Andrew S.
title: The Architectural Design of Globe: A Wide-Area Distributed System
institution: vrije Universiteit - Faculty of Mathematics and Computer Science
year: 1997
number: 422
address: Amsterdam - Netherlands
month: mar
The authors focus on understanding the limitations of causally and
totally ordered communication (CATOCS). They point that the
fundamental dificulty with the CATOCS is that it attempts to solve
state problems at the communication level in violation to the
well-known «end-to-end» argument.
CATOCS is at the communication level, but consistency requirements
are typically expressed in terms of the application's state. CATOCS is
not adequate in itself to ensure application-level consistency, and
providing additional mechanism at the state level to remedy this
deficiency eliminates the need for CATOCS, or it is expensive.
The specific limitations of CATOCS can be summarized as:
- Unrecognized Causality (or it can't say for «sure»)
Causal relationships can arise between messages at the semantic level
that are not recognizable by the «hapens-before» relationship on messages.
- Lack of serialization ability (or can't say «together»)
CATOCS cannot ensure serializable ordering between operations that
correspond to groups of messages.
- Unexpressed semantic ordering constraints (or it can't say the
«whole story»)
Many semantic ordering constraints are not expressible in the
«hapens-before» relationship, and hence not enforceable by
CATOCS. Such ordering constraints, include causal memory,
linearizability and serializability.
- Lack of Efficiency Gain over State-level Techniques (or can't
say «efficiently»)
CATOCS protocols do not offer efficiency gain over state-level
techniques (ordering overhead, delaying messages based on false
causality...), and appear far less scalable.
K.P. Birman and T.A. Joseph.
Reliable communication in an unreliable environment.
ACM Transactions on Computer Systems 5,
1(Feb. 1987), 47-76.
D.R. Cheriton and D. Skeen.
Understanding the Limitations of Causally and Totally
Ordered Communication.
Comp. Sci. Research Report STAN-CS-93-1485,
Stanford Univ., Sept. 1993.
P. Keleher, A.L. Cox, and W.Zwaenepoel.
Lazy Release Consistency for Software Distributed Shared
Memory.
Proc. of the 19th INt. Symp. on Computer
Architecture, 13-21, May 1992.
R. Ladin, B. Liskov, L. Shrira and S. Ghemawat.
Providing High Availability Using Lazy Replication.
ACM Transactions on Computer Systems 10, 4
(Nov. 1992), 360-391.
A. Schiper and A. Sandoz.
Strong Stable Properties in Distributed Systems.
Technical Report LSE-TR93-02, Dept. of computer
Science, EPFL Lausanne, March 1993.
Bayou's anti-entropy protocol for replica reconciliation is based
on the following three design choices: (i) pair-wise communication;
(ii) propagation of write operations; (iii) set of ordering and
closure constraints on the propagation of the writes. the simplicity
of the design makes the protocol very flexible, thereby providing
support for diverse networking environments and usage scenarios. It
accomodates a variety of policies for when and where to propagate
updates. It operates over diverse network topologies, including
low-bandwidth links. It is incremental. It enables replica
convergence, and updates can be propagated using floppy disks and
similar transportable media. Moreover, the protocol handles replica
creation and retirement in a ligth-weight manner.
The Bayou system places additional requirements on its
anti-entropy protocol due to its support for conflict detection and
resolution based on per-write dependency-checks and merge procedures
and for session guarantees.
The epidemic property of Bayou is very interesting for the large
scale. The idea can be used on others algorithms for the large scale
environment. Another interest choice is the pair-wise one way
communication, it is simple and safe. Finally, the separation from
the protocol itself of the policies is the key to flexibility,
hence the key for large scale.
A. Birrell, R. Levin, R.M. Needham, and
M.D. Schroeder.
Grapevine: An exercise in distributed computing.
Communications of the ACM 25(4):260-274, April
1982.
A. Demers, D. Greene, C. Hauser, W. Irish, J. Larson,
S. shenker, H. Sturgis, D. Swinehart, and D. Terry.
Epidemic algorithms for replicated database maintenance.
Proceedings Sixth Symposium on Principles of
Distributed computing, Vancouver, B.C., Canada,
August 1987, pages 1-12.
R.A. Golding.
A weak-consistency architecture for distributed
information services.
Computing Systems, 5(4):379-405, Fall 1992
J.J. Kistler and M. Satyanarayanan.
Disconnected operation in the Coda file system.
ACM Transactions on Computer Systems, 10(1):3-25,
February 1992.
L. Lamport.
Time, clocks, and the ordering of events in a distributed
system.
Communicatins of the ACM, 21(7):558-565, July 1978.
D.C Oppen and Y.K. Dalal.
The Clearinhouse: Adecentralized agent for locating named
objects in a distributed environment.
ACM Transactions on Office Information Systems,
1(3):230-253, July 1983.
D.B. Terry, A.J. Demers, K. Petersen, M.J. Spreitzer,
M.M. Theimer and B.B. Welch.
Session guarantees for weakly consistent replicated
data.
Proceedings Third International Conference on Parallel
and Distributed Information Systems, Austin, Texas,
September 1994, pages 140-149.
B. Terry, M.M. Theimer, K. Petersen, A.J. Demers/
M.J. Spreitzer, and C.H. Hauser.
Managing update conflicts in Bayou, a weakly connected
replicated storage system.
Proceedings Fifteeenth ACM Symposium on Operating
Systems Principles, Cooper Mountain, Colorado,
December 1995, pages 172-183.
Adequate group communication can support a specific class of
transactions in asynchronous distributed systems. A transaction is a
sequence of operations on objects (or on data items) that satisfies
the following three properties: (i) atomicity; (ii) permanence; (iii)
ordering. A reliable multiple group communication is necessary to
support these transactions if each object is a member of a different
group.
Given two groups g and g', we define SEND (m
to {g,g'} by the following property: Either a majority of
g and a majority of g' agree to deliver m - and
consequently all operational members of g and g'
eventually deliver m - or none of the members of g or of
g' delivers m.
Most existing group-based systems provide only a total order
multicast primitive to one single group. Total order multicasts to
multiple groups permits extension of such systems to address the group
communication requirements of transactional applications.
Birman, K.
The process group approach to reliable distributed
computing.
Commun. ACM 36, 12 (DEc.1993), 37-53.
Cheriton, D.R., and Zwaenepoel, W.
Distributed process groups in the V-kernel.
ACM Transactions on Computer Systems 3, 2
(Feb. 1985) 77-107.
Fischer, M., Lynch, N., and Paterson, M.
Impossibility of distributed consensus with one faulty
process.
Journal of the ACM 32, 4 (Apr. 1985) 374-382.
Schneider, F.B.
Implementing fault-tolerant services using the state
machine approach: a tutorial.
ACM Computing Surveys 22, 4, (Apr. 1990) 299-319.
Globe is a wide-area distributed system that is constructed as a
middleware layer on top of existing networks and operating
systems. Globe aim to meet three major design objectives:
- provide a uniform model for distributed computing;
- support a flexible implementation framework, and
- ensure worldwide scalability.
Globe tackle the existing problems by:
- using the distributed shared objects model:
objects can be physically distributed; each object encapsulates its
own policy for replication, migration, communication, distribution,
etc; distributed policies on a per-object basis.
- separating the name of the object from its location:
two-level naming hierarchy; name --> object handle --> set of contact
address.
- having a hierarchical location service:
worldwide search tree; divide the world into a hierarchical set of
domains; partition scheme by which each site stores a part of the root
node and parts of the regional nodes.
Mahadev Satyanarayanan.
Scalable, Secure and Highly Available Distributed File
Access.
Computer, 23(5):9-21, May 1990.
M. van Steen, F. Hauck, and A.Tanenbaum.
A Model for Worldwide Tracking of Distributed Objects.
In Proc. TINA'96,pp.203-212, Heidelberg, Germany,
Sept. 1996. Eurescom.
D. Wessels.
Intelligent Caching for World-Wide Web Objects.
In Proc. INET'95, Honolulu, Hawaii, June
1995. Internet society.
M. Baentsch, G. Molter, and P. Sturm.
Introducing Application-level Replication and Naming into
today's Web.
Computer Networks and ISDN Systems, 28(7-11):920-930, 1996.
C. Partridge, T. Mendez, and W. Milliken.
Host Anycasting Service.
RFC 1546, Nov. 1993.
A. Grimshaw and W. Wulf.
The Legion Vision of a Worldwide Virtual Computer.
Communications of the ACM, 40(1):39-45, Jan. 1997.
M. Shapiro, Y. Gourhant, S. Habert, L. Mosseri, M. Ruffin
and C. Valot.
SOS: An Object-Oriented Operating System --- Assessment
and Perspectives.
Computing Systems, 2(4):287-337, Fall 1989.
B. Neuman.
Scale in Distributed Systems.
In T. Casavant and M. Singhal, (eds.), Readings in
Distributed computing systems, pp. 463-489. IEEE Computer
Neilze.Dorta@inria.fr
Last modified: Thu Apr 9 18:01:06 MET DST 1998