The Chandy–Lamport algorithm is a snapshot algorithm used in distributed computing for recording a consistent global state of an asynchronous distributed system. It was introduced by K. Mani Chandy and Leslie Lamport in 1985 and is one of the foundational algorithms for distributed-system observation; later applications include termination detection, deadlock detection, distributed debugging, and checkpointing for fault tolerance.1234
Background
The need for a snapshot algorithm arises because, in an asynchronous distributed system without a shared global clock, no single process can observe the simultaneous state of all other processes and the messages in transit between them. A naive collection of local states from individual processes can produce a "global state" that no real execution ever passed through, because messages may be in flight while the snapshots are taken.2 Recording a meaningful snapshot therefore requires coordination so that the recorded local states and the recorded channel contents together correspond to a state through which a possible execution of the system could have passed. Such a state is called a consistent global state.13
History
According to Lamport, the algorithm was developed during a visit he made to Chandy, who was then on the faculty of the University of Texas at Austin. Chandy posed the snapshot problem to Lamport over dinner; Lamport later wrote that he worked out the solution in the shower the next morning and arrived at Chandy's office to find that Chandy had independently arrived at the same solution. Lamport described the algorithm as a straightforward application of ideas in his earlier paper Time, Clocks, and the Ordering of Events in a Distributed System.5
Chandy and Lamport published the algorithm in February 1985 in ACM Transactions on Computer Systems under the title Distributed Snapshots: Determining Global States of Distributed Systems.1 The paper received the ACM SIGOPS Hall of Fame Award in 2013 and the Edsger W. Dijkstra Prize in Distributed Computing in 2014.67 Lamport's Turing Award biography credits him and Chandy with introducing the first algorithm for taking a snapshot of the state of an arbitrary distributed system.4
Model and assumptions
The algorithm is presented for a model of distributed computation in which:12
- The system consists of a fixed set of processes connected by directed FIFO channels.
- There is a path of channels between every pair of processes, so that any process can send a message to any other process directly or by relay.
- Channels do not lose, duplicate, or reorder messages.
- There are no process failures.
- The snapshot algorithm runs concurrently with the underlying computation and does not alter the behavior of that computation, other than by introducing additional marker messages.
A snapshot may be initiated by any single process, and multiple snapshots can run concurrently if marker messages are tagged with snapshot identifiers.3
Algorithm
The algorithm uses a special control message called a marker. Each process records its own local state and the contents of each of its incoming channels.18
- A process initiates the snapshot by recording its own state and then sending a marker on every outgoing channel before sending any further application messages on those channels.
- When a process p receives a marker on an incoming channel c:
- If p has not yet recorded its own state, it records its state, records the state of channel c as empty, and then sends a marker on every outgoing channel before sending any further application messages.
- If p has already recorded its state, it records the state of channel c as the sequence of application messages received on c after p recorded its own state and before it received the marker on c.
The algorithm terminates when every process has recorded its state and the state of all of its incoming channels. The collected local states and channel states constitute a consistent global state of the system.12

Properties and applications
Chandy and Lamport proved that the recorded global state corresponds to a possible execution of the system; that is, the recorded state is reachable from the initial state and the final state of the recorded execution is reachable from the recorded state.1 The recorded state need not coincide with any state that actually occurred during the recorded execution, but it represents a state through which a possible execution could have passed and is therefore suitable for evaluating stable predicates—predicates that, once true, remain true—such as deadlock or termination.23
Distributed snapshot algorithms have been used in:48
- Termination detection for distributed computations.
- Deadlock detection in distributed databases and resource managers.
- Checkpointing for fault tolerance and rollback recovery.
- Distributed debugging and the evaluation of stable predicates.
- Garbage collection in distributed object systems.
Subsequent work has generalized the algorithm to relax some of its assumptions. Variants exist for systems with non-FIFO channels, for systems with failures, and for systems with causal-message-ordering or other communication models.38
See also
See also
References
References
- Chandy, K. Mani; Lamport, Leslie (February 1985). "Distributed snapshots: determining global states of distributed systems". ACM Transactions on Computer Systems. 3 (1): 63–75. doi:10.1145/214451.214456.
- Lynch, Nancy A. (1996). Distributed Algorithms. Morgan Kaufmann. ISBN 978-1-55860-348-6.
- Tel, Gerard (2000). Introduction to Distributed Algorithms (2nd ed.). Cambridge University Press. ISBN 978-0-521-79483-1.
- "Leslie Barry Lamport: A.M. Turing Award Laureate". Association for Computing Machinery. Retrieved May 15, 2026.
- Lamport, Leslie. "The Writings of Leslie Lamport: Distributed Snapshots". lamport.azurewebsites.net. Retrieved May 15, 2026.
- "ACM SIGOPS Hall of Fame Award". ACM Special Interest Group on Operating Systems. Retrieved May 15, 2026.
- "The 2014 Edsger W. Dijkstra Prize in Distributed Computing". DISC 2014. Retrieved May 15, 2026.
- Ghosh, Sukumar (2014). Distributed Systems: An Algorithmic Approach (2nd ed.). Chapman and Hall/CRC. ISBN 978-1-4665-5297-5.