[ authors ]
Abstract: CRDT and OT are typically seen as competing approaches, but we show that they are instances of a more general concept: a Time Machine.
A Time Machine combines the abilities of both OT and CRDT and achieves things neither could do alone -- we can now create fully P2P systems that guarantee consistency, under arbitrary edits, over arbitrary network conditions, with improved performance, that can prune and forget old history, or share aspects of history in modules, with a common network protocol.
If you read hacker news, you might see people contrasting OT and CRDT as approaches to collaborative editing:
You might see people arguing for advantages of one or the other:
This could lead you to think that and one might win over the other:
But the model where programmers must choose between CRDT and OT is inaccurate:
Programmer Choice
o
/ \
/ \
V V
CRDT OT
A better model is:
System Capabilities
CRDT
^
|
|
|
+-------------> OT
CRDT and OT are abilities that an algorithm can support. A Time Machine supports both:
System Capabilities
CRDT
^ • <-- Time Machine
|
|
|
+-------------> OT
A Time Machine is a CRDT that can be used to do OT. It features the best of both worlds:
This article explains the Time Machine, its benefits, and demonstrates:
The general problem is:
•
/ \
A B
| \ |
C o
\ /
o
o o o
| | |
A A B
| |/ \|
C B' A'
|/ \| \|
B'' C' C'
o
, the timeline incorporates operations from other branches[example]
of text address change [example]
of text deletion does nothing if already deleted[example]
of register value change overriding this oneOperational Transform asks the question of how to convert operations onto other lines of time.
o OH
|
o
.../.\
. o . o <-- should it be only one version at a time?
. | . |
. o . o
..... |
.....
. .
. .
. .
.....
[example]
of text address change [example]
of text deletion does nothing if already deleted[example]
of register value change overridding this one[ diagram? ]
Whereas OT operates on Observable History (OH), CRDT converts all observed edits into an internal format, that we call Collapsed History (CH):
o OH
|
o
/ \
o o
| |
o o
---|---
v
CH
(*)
These collapses are like "Data Laced with History", they get closer to an observable snapshot of the state than the list of raw operations, but they still contain history.
Then we can compute an observation of the current version.
Example Collapses:
[ collapse everything ]
[ collapse nothing ]
General rules: [ callout box ]
Problems with CRDT
We want the best of both worlds:
And very pragmatic needs:
... in fact, most CRDT libraries implement this OT, but don't document it!
Approach:
What this looks like:
- [ Trad CRDT ]
OH->CH----------->|Snap
| ------------> |
-[ * * * * * ]- <--- time machine comic depiction
- [ CRD Time Machine ]
OH-------->CH->OH->Snap
=[___]= <--- encapsulated time machine comic (callout to make it big)
Let's point out key aspects:
Putting these together, a CRDT is like a time machine that is:
A much better time machine:
Now we formally define these concepts, to prove that a Time Machine is a generalization of OT and CRDT.
In Observed History, each event is described as-observed by the peer it originated at. Time Machines merge incoming observations by traveling them to the receiving peer's perspective in time.
We formally define Observed History as the set of observations , for all known mutation events, from all peers. This form of history is called the Partial-Order Log in Baquero, and the XX in Archagon.
An observation is the tuple , where:
The same event can exist in multiple observations; each at a different , displaying a potentially transformed representation of .
We also define the functions , , , and to retrieve the corresponding elements of the tuple.
An event marks a point in time, on a peer, when an edit operation occurred. You can also think of it as a “timestamp” in relative spacetime.
e1 Figure 1
/ \
e2 e3
| \ |
e4 e5
\ /
e6
In Figure 1, . e1 Figure 2
/ \
e2 e3
| \ |
e4 e5
Each mutation occurs via some . Different data types support different sets of . For instance, a counter might support:
Whereas, a collaborative text editor might support the operations:
Or it might only support explicit insertions and deletions:
Additionally, we assume that there exists an operation that composes other operations, so multiple operations can be expressed as a single .
When we observe the state of a peer, at any point of time, we call it a “snapshot,” and refer to it as . Operations apply to a snapshot of state, resulting in a new snapshot:
Each peer's local time progresses linearly as it applies a sequence of operations to its snapshots. Some operations are generated locally, and some are received remotely, and transformed so that they can be applied locally.
A Time Machine implements two functions over observed history:
The first function, , produces the observable snapshot at a in time. We sometimes abbreviate . This is what a CRDT does.
The second function, , transforms an operation to the equivalent , that would be observed when the edit event travels through time from the version to the version . This is what OT does.
To merge a new observation using View, we simply add it to our history and then View the result:
This is how a CRDT works—it reproduces the entire state with every merged edit. This makes it easier to define and guarantee consistency— you just constrain yourself to making the View function deterministic.
This is how OT works—it updates a snapshot by traveling the remote observation into our local time, and then applying it. We want this in practice because (a) re-generating a large data structure for a small edit can be inefficient, and (b) to update cursor positions and other local state annotations, we need to interpret the differential operation anyway. Practical systems need Travel.
We merge with Travel following the template , where we want to produce the transformed using our Travel function. Consider:
Now we can apply the traveled to our , and compute with:
If you a merger, you should see the same thing as the remote to the current version as and applying it with .
Both produce the same snapshot.
Practical Time Machines will want to cache, hold, and encapsulate history over time. This allows them to optimize the representation of history according to their needs, by e.g. pruning old history.
So we generally define a Time Machine as an object with fields:
class TimeMachine {
}
The method incorporates a new observation into the machine's internal_history, which different Time Machines might represent in different ways. The View and Travel methods draw on this history to produce snapshots and transformations.
Now the Time Machine implementation is free to represent history in new ways, such as via a collapse, without impinging on other peers' choices of representation. This e.g. lets peers prune history at different rates.
These functions can describe an array of systems in CS beyond CRDT/OT, such as MVCC in databases, event sourcing, STM, LFS, VCS, and backup systems which store and manipulate histories of operations and snapshots to implement archiving, collaboration, and parallel processing.
We differentiate a few properties here:
[fill in]
[fill in]
[fill in]
[fill in]
to do: relate delta crdts to this diff function
Here's a relative, consistent time machine. It has an outer CRDT and an inner CRDT. The outer CRDT does not collapse history — it works on pure observed history. But in order to guarantee consistent merging, it encapsulates an interior collapsing CRDT that we call the collapse. Since this interior CRDT is encapsulated, each implementation is free to use instantiate it (or not) on different subsets of histories, in different ways, in different cases.
First, we'll define the outer CRDT. This is a CRDT, as defined in Shapiro, that has four additional constraints to its definition, which allows it to simultaneously implement the Time Machine API.
So starting with the CmRDT definition from Shapiro (which he's proved is equivalent to the CvRDT definition):
We add four constraints to implement View and Travel:
Now we have an outer CRDT that implements both View and Travel:
This defines the contract of the outer CRDT with other peers. But it does not explain how to implement and consistently. We will do that by defining a new internal collapsing CRDT.
Now we add one more term to the CmRDT: the inner “collapsing” CRDT :
A CRD Time Machine implements View and Travel in terms of this inner collapsing CRDT . It is not required to always use — for instance, it can skip it entirely when transforming a non-concurrent (linear) sequence of operations to obtain better performance — but it always can, and its merge-type is defined in terms of .
We now constrain the Time Machine's View function in terms of the query function :
This defines the merge-type.
In order to be Consistent, we need View and Travel to be consistent. Well, View will be consistent as long as it is deterministic. That is easy to do. If we have that, we can define Travel's consistency in terms of View. We can do that with the equivalence property from before:
Given a Travel() function outputting for any merged version, TP0 is true iff applying the to any produces a , that's equivalent to the snapshots produced by View() at versions and .
A great way to test your system for TP0 is to see if it is satisfied for every possible merge, given some observation to merge into a base version .
We know that:
So we can now express TP0 in terms of View and Travel when merging at :
There are often multiple possible to go from to . For instance, the string mutation "XX"
"XXX"
could be achieved by inserting an "X"
at any of positions 0, 1, or 2; or even by replacing characters 1 or 2 with the two-byte string "XX"
.
Example: the function satisfies TP0.
TP0 guarantees that a peer's local snapshot updates consistently from Travel's transformed operations. This is all you need to prove that your snapshot-update satisfies Strong Eventual Consistency (SEC), and thus is a CRDT.
Surprise, this turns out to be equivalent to TP1 and TP2.
[Todo: incorporate from meeting 77]
Thus, far from OT and CRDT being separate, we will show that not only is a CTM a CRDT, and an OT implementation, but it is also implemented using an internal CRDT, and sometimes with internal OT, which runs top of a history that is itself a CRDT, and can be generated by OT.
Observed Observable Observed
@ Peer 1 @ Sender @ Peer 2
o o o time
| / \ | |
o o o o V
| | / | |
o o o o
| \ / |
o o o
- Collapse(H) -
v
(*)
| Sending Peer | | Receiving Peer |
| State | Edit | TM | Rebased | State |
|---------|-------|-------|---------|-------|
| | | | | |
|------>| this is a trivial crdt
|------>| this is a crdt
|------------------------>| this is a crdt too
|------>| this can be done with OT TP1, TP2
|---------------->| "" <-- maybe it should be this
|-------->| this can be done with OT TP0
|---------------->| this is OT
Figure 4.4
We hope the reader can dispel the notion that CRDTs and OTs are competing algorithms. They are rather complementary properties of algorithms that we can strive to implement together within our Time Machines.
Figure 4.4 (above) depicts the flow of information within a Collapsing Time Machine. (Acknowledge Seph.)
Todo: proofs & explanations of each line
A CRD Time Machine is a CRDT that does OT through a CRDT.
Braid-HTTP: standard protocol
Sync9 + Antimatter: prune all history
Sync7 and DT merge over subset of history
Enables Memory Hierarchies
DiffSync: a VCS-style implementation
Simpleton: easier implementation
Observed history is capable of replacing a span of history as a module with an equivalent module, without affecting the key properties of:
Examples of what we'll achieve
Peers can do this in order to:
And they can do it differently on each peer, and even change behaviors over time — all without breaking compatibility and consistency.
We can replace a span of history with an alternate subgraph of events.
Summaries are generated by a set of functions applicable to a CRDTM defined by Semantics = (Ops, View). Each function has a set of properties for that CRDTM:
Examples functions:
[ draw picture ]
forkpoint
And all these can interoperate.