Optimizing state-based CRDTs (part 1)
Other posts from this series:
- An introduction to state-based CRDTs
- Optimizing state-based CRDTs (part 1)
- Optimizing state-based CRDTs (part 2)
- State-based CRDTs: BoundedCounter
- State-based CRDTs: Map
- Operation-based CRDTs: protocol
- Operation-based CRDTs: registers and sets
- Operation-based CRDTs: arrays (part 1)
- Operation-based CRDTs: arrays (part 2)
- Operation-based CRDTs: JSON document
- Pure operation-based CRDTs
- CRDT optimizations
Last time we've talked about what are CRDTs and introduced the state-based variant of them. In this blog post we'll talk about the downsides of presented approaches and ways to optimize them. If you're not familiar with the previous post or CRDTs in general, I highly encourage you to read it, as I'll be referring to it a lot.
I've decided to split this topic in two. This is the first part, which aims to give you the better overview of delta-based state CRDT optimization. The second part will target more specific cases - causal contexts and observed-remove sets.
Updating state with deltas
As I mentioned in previous post, a state-based CRDT replication depends on the ability to serialize and disseminate the entire data structure over the network. While this approach has some advantages - i.e. very simple and straightforward replication protocol - it also comes with a cost.
Imagine that you have a set of 1000 elements. Now, when we'll try to add another item to it, the only way to synchronize this replica with others is to send an entire payload over the wire. It means serializing and sending 1001 elements, even though most of them have remain unchanged. This is even more painful given the fact that state-based CRDTs often carry additional metadata with them.
To face this problem, an alternative approach has been proposed: a delta-state CRDT. As the name suggests, instead of sending an entire state, we're going to accumulate the updates since the last synchronization as part of change set (delta). Once sync message was send, we simply reset delta and start from scratch. This way we'll send only the unseen parts.
This design doesn't stop us from performing full-state merge from time to time: delta-aware CRDTs still maintain the semantics of a state-based ones. Keep in mind that in terms of persistence, the delta itself doesn't have to be stored on disk.
If you want to play with the example data structures, I'm presenting here, feel free to take a look at this repository.
Delta-state growing-only counter
Let's start from defining some helper functions, that we'll use later on:
[<AutoOpen>]
module Helpers =
/// Insert or update the value of the map.
let upsert k v fn map =
match Map.tryFind k map with
| None -> Map.add k v map
| Some v -> Map.add k (fn v) map
/// Option-aware merge operation.
let mergeOption merge a b =
match a, b with
| Some x, Some y -> Some (merge x y)
| Some x, None -> Some x
| None, Some y -> Some y
| None, None -> None
Now, let's write the G-Counter implementation and the explanation will follow:
type GCounter = GCounter of values:Map<ReplicaId, int64> * delta:GCounter option
[<RequireQualifiedAccess>]
module GCounter =
/// Empty G-counter
let zero = GCounter(Map.empty, None)
/// Compute value of the G-counter
let value (GCounter(v, _)) =
v |> Map.toSeq |> Seq.map snd |> Seq.fold (+) 0L
/// Increment G-counter value for a given replica.
let inc r (GCounter(v, d)) =
let add1 map = upsert r 1L ((+) 1L) map
let (GCounter(dmap,None)) = defaultArg d zero
GCounter(add1 v, Some(GCounter(add1 dmap, None)))
/// Merge two G-counters.
let rec merge (GCounter(a, da)) (GCounter(b, db)) =
let values = a |> Map.fold (fun acc k va -> upsert k va (max va) acc) b
let delta = mergeOption merge da db
GCounter(values, delta)
/// Merge full-state G-counter with G-counter delta.
let mergeDelta delta counter = merge counter delta
/// Split G-counter into full-state G-counter with empty delta, and a delta itself.
let split (GCounter(v, d)) = GCounter(v, None), d
While this example is more complex than the original GCounter implementation, we again try to keep things fairly easy. Here, we're gonna represent GCounter
values as map of partial counters for each replica (just like the last time)... while the detla is optionally a GCounter
itself! Why? Because of composition of course :) This way we can merge counters with deltas and deltas themselves using the same merge
operation we defined before. In this design the delta of a delta counter will always be None
- we don't use it, so there's no risk of creating possibly infinite recursive data structure.
Delta-state increment/decrement counter
Previously, we've build a PNCounter
using two GCounter
s (one for incremented and one for decremented values). While you may ask, how will this work now? This is pretty simple: we'll build a delta of a PNCounter
dynamically from deltas of its two components.
type PNCounter = PNCounter of inc:GCounter * dec:GCounter
[<RequireQualifiedAccess>]
module PNCounter =
let zero = PNCounter(GCounter.zero, GCounter.zero)
let value (PNCounter(inc, dec)) = GCounter.value inc - GCounter.value dec
let inc r (PNCounter(inc, dec)) = PNCounter(GCounter.inc r inc, dec)
let dec r (PNCounter(inc, dec)) = PNCounter(inc, GCounter.inc r dec)
let rec merge (PNCounter(inc1, dec1)) (PNCounter(inc2, dec2)) =
PNCounter(GCounter.merge inc1 inc2, GCounter.merge dec1 dec2)
let mergeDelta delta counter = merge counter delta
let split (PNCounter(GCounter(inc, a), GCounter(dec, b))) =
let delta =
match a, b with
| None, None -> None
| _, _ ->
let inc = defaultArg a GCounter.zero
let dec = defaultArg b GCounter.zero
Some <| PNCounter(inc, dec)
PNCounter(GCounter(inc, None), GCounter(dec, None)), delta
I think, the only tricky part here is delta construction. We want it to be a (optionally) PNCounter
itself - because of this way we still can preserve commutativity, associativity and idempotency rules, all of our CRDT conform to, but also to reuse implementations of existing functions. Thankfully we can simply build PNCounter
from two GCounter
and since GCounter.delta
itself is a GCounter
instance, we can leverage that fact. If one of the deltas is not provided, we simply use a zero element.
Delta-state growing-only set
For the GSet
, its implementation is pretty analogous to GCounter
:
type GSet<'a when 'a: comparison> = GSet of values:Set<'a> * delta:GSet<'a> option
module GSet =
let zero = GSet(Set.empty, None)
let value (GSet(s, _)) = s
let add elem (GSet(v, d)) =
let (GSet(delta, None)) = defaultArg d zero
GSet(Set.add elem v, Some(GSet(Set.add elem delta, None)))
let rec merge (GSet(a, da)) (GSet(b, db)) =
let values = a + b
let delta = Helpers.mergeOption merge da db
GSet(values, delta)
let mergeDelta delta gset = merge gset delta
let split (GSet(v, d)) = GSet(v, None), d
Just like in case of original state-based CRDTs, we can take advantage of composability.
The technical difference here is that for G-Sets we should acknowledge either deltas or full state updates being broadcasted to all replicas. It was not so necessary for the counter implementations, as if we sent two following updates (D1, D2), if a first one (D1) didn't reach the target, but the later one (D2) did, we'll still end up in correct state - simply because D2 already override the partial counter value of D1.
Final notes about replication protocol
Up to this point our replication protocol is still pretty simple - the only requirement here is that we have to eventually be able to reach every single replica. However this doesn't have to occur immediately and doesn't have to occur in order (read: you're not constrained to use TCP for delta replication).
In the code presented above I'm merging deltas as part of a merge operation. This however is not a requirement, i.e. if you disseminate delta right after the operation being performed, you may as well clear deltas on each merge.
Summary
This part presented how to optimize counters and simple growing only set to take advantage of delta-state CRDTs. In the next post we'll take a closer look as specific optimizations targeting OR-Sets.