Akka.NET underestimated features - Akka.IO

Today I want to talk about one of the Akka.NET features, I think deserves a lot more attention - Akka.IO. What it gives you, is the ability to connect your actors directly to OS socket layer. This way you could handle any TCP/UDP-based protocol directly in your actor model. Just imagine the possibilities :)

But lets start from the basics. What would you say for simple telnet listener? We'll create a simple TCP server, which will listen for incoming connections and print input on the console.

To do so, we'll need two types of actors:

  1. Listener which will bind itself to the given endpoint and accept incoming connections.
  2. Connection handler, which will be delegated to serve particular client, once it connects to our server.

Nice thing here is, that whole TCP model uses the same ordinary actors, you can create by yourself. The whole process is pretty simple:

  • Create your TCP listener actor.
  • Bind it to some endpoint.
  • Handle connection event by creating connection handler actor and registering it for an incoming connection.
  • Inside connection handler, handle all received data, print it or make it stop, once connection gets closed.

First part is pretty simple. To bind actor to specific endpoint you can use so called TCP manager, which is part of Akka.NET extension. It's just an actor and you can get it by calling Context.System.Tcp() method. Then just send it a new Tcp.Bind(Self, endpoint) message if you want to register current actor as a listener for defined endpoint. Everything is message-based.

Second part is making our listener aware of incoming connections. They come in the form of Tcp.Connected messages. Once you receive them, Sender value associated with message is instance of actor reference being an abstraction over provided connection. What's great here is that it honors all attributes of actors, including location transparency. This means, we can handle incoming connections by actors materialized on different machines out of the box!

In this case we'll simply create a new actor and register it for this connection by sending new Tcp.Register(connectionHandler) message. To make this all clear and visible, you can see the whole actor definition below:

public class TelnetListener : ReceiveActor  
{
    public TelnetListener(EndPoint endpoint)
    {
        Context.System.Tcp().Tell(new Tcp.Bind(Self, endpoint));
        Receive<Tcp.Connected>(connected =>
        {
            var connection = Sender;
            var connectionHandler = Context.ActorOf(Props.Create(() => new TelnetHandler(connection)));
            connection.Tell(new Tcp.Register(connectionHandler));
        });
    }
}

Once you've got this ready, it's time for our connection handler actor. As we gonna provide it a reference to our connection - which as you may remember, is also an actor - we also want to link it's life time directly to the connection itself. In this context, having actor working on dead connection is unwanted. Instead we gonna watch for the connection using Context.Watch(connection) and stop current actor once it'll receive either Terminated or Tcp.ConnectionClosed message.

Then all that is left, is to handle incoming bytes from socket, transform them to string and display on the console. They come in form of Tcp.Received messages. The whole definition can be seen below:

public class TelnetHandler : ReceiveActor  
{
    public TelnetHandler(IActorRef connection)
    {
        Context.Watch(connection);
        Receive<Tcp.Received>(received =>
        {
            var text = Encoding.UTF8.GetString(received.Data.ToArray()).Trim();
            Console.Write(text);
        });
        Receive<Tcp.ConnectionClosed>(closed => Context.Stop(Self));
        Receive<Terminated>(terminated => Context.Stop(Self));
    }
}

What I think, is worth mentioning here is Data property of Tcp.Received message. It's not an usual byte array. It's instance of ByteString type instead. One of the problems with byte arrays, is that they may be heavily used in socket communication, causing a lot of allocations and deallocations, increasing pressure on .NET garbage collector. To avoid this we can use ByteStrings, which are fragments of preallocated buffer. Once they are no longer necessary, they are send back to buffer to be reused later. This way we can avoid unnecessary garbage collection cycles.

To test whole example, simply initialize TelnetListener actor under any free port, and connect to it using telnet <ip> <port> from your terminal. After that you should be able to see any text, you're writing in command line on the application console.

How Akka.NET persistence works?

In this post I'll explain, how the events sourcing concepts has been used by Akka.NET persistence plugin to create statefull actors being persisted and work in reliable way. My major goal is to give you an introduction over basic Akka.Persistence primitives and to describe how state recovering works for persistent actors.

State persistence and recovery

The most notable novelty, the Akka.Persistence offers, is a new actor type, you could possibly inherit from - a PersistentActor. The major difference when compared to standard Akka's actors is that it's state can be persisted into durable data store, and later recovered safely in case of actor restart. There is a variety of persistent backends already supported, including:

After persisting a state, it may be recovered at any time, once an actor gets recreated.

Persisting actor's state

There are two ways of storing a state. Major (required) is event sourcing - saving the list of all state changes as soon, as they occur. The second one is state snapshotting - serializing the whole state of an actor into a single object and store it. What is very important, snapshotting is only an optimization technique, not a reliable way of achieving persistence within in-memory systems. Reason for that is, that while you may ensure to store an actor state in case on it's failure, you're not able to ensure state persistence is case of external problems (infrastructure crashes, outages etc.).

Basic persist mechanism consists of several steps:

  1. After receiving a command, actor tries to persist an event, with a callback to be called, once persistence has been confirmed. In order to do so actor should invoke Persist(event, callback) method. There is also another version called PersistAsync, which may take advantage of event batching - useful, when a frequent state updates are expected.
  2. Event is send to dedicated actor - event journal - which stores it in the persistent backend.
  3. If message has been persisted successfully, journal sends a confirmation message back to actor. It updates it's state using previously defined callback. All state updates, that should be persisted, must be defined only inside persist callback. If you'll try to alter persistent state outside of the callback, you're risking potential state corruption.
  4. In case, when journal encountered a problem - usually due to problems or lack of response from backend provider - a failure message will be send. In that case, actor will restart itself. You may consider to use Backoff Supervision pattern in order to apply exponential backoff mechanism to reduce recovering footprint in case of cascading failures.

In case of snapshotting, there is no need for callbacks - whole mechanism is fully asynchronous. Use SaveSnapshot method for this purpose.

You can see whole process in the animation below:

Akka persistence writing

  • C - message (command) to be handled by persistent actor, which may result in producing a zero or more events.
  • E - sequenced events used to trigger actor state change.
  • S - persistent actor state, that may be snapshotted. Remember: snapshotting is optional here.

State recovering

Actor state recovery logic is encapsulated in the separate method (called OnRecovery or ReceiveRecovery). What is important here, this method should always produce a deterministic behavior - during recovery actor should neither try to persist any events nor communicate with other actors.

Here's how it works:

  1. Actor asks journal for the latest known snapshot of it's state. State is returned inside SnapshotOffer message. If no snapshot has been found, offer will contain null. Warning: don't rely on snapshot versioning, as recovered snapshot may have earlier version, than the one you've send before recovering - remember, what I said about asynchronous nature of this operation?
  2. Second step is to replay all events since the received snapshot creation date. If no snapshot was received, the whole event stream will be replayed. Event messages are received in the same order, as they were produced.
  3. Once actor fully recovered, it's invoking OnReplaySuccess method. If any exception has occurred, OnReplayFailure will be invoked instead. Override them, if you need a custom action once persistent actor is ready.

What I think is worth notice, is that persist/recovery mechanism is not aware of current actor behavior. Therefore, if you're having state machines (i.e. by using Become) in your actor, you're in charge of returning to the correct step after recovery.

One of the tradeoffs of persistent actors is that, they lay heavily on CP (Consistency) site of CAP theorem ;) . What does it actually mean?

In terms of command side evaluation, each persistent actor is guaranteed to maintain consistent memory model - use Defer method when you need to execute some logic, after actor has updated it's state from all stored events. Events are stored in linear fashion - each one of them is uniquely identified by PersistenceId/SequenceNr pair. PersistenceId is used to define stream inside event log related to a particular persistent actor, while SequenceNr is a monotonically increasing number used for ordering inside an event stream.

The negative side of that approach is that persistent actor's event stream cannot be globaly replicated, as this may result in inconsistent stream on updates. For the same reason your persistent actors should be singletons - there should never be more than one instance of an actor using the same PersistentId at the time.

Persistent Views

Sometimes you may decide, that in-memory model stored by persistent actor is not exactly what you want, and you want to display data associated with it in different schema. In this case Akka.Persistence gives you an ability to create so called Persistent Views. Major difference is that while persistent actor may write events directly to an event journal, views work in read only mode.

You may associate a persistent view with particular actor by joining them using PersistenceId property. Additionally since one persistent actor may have many views associated with it (each showing the same stream of events related to particular entity from a different perspective), each view has it's own unique persistent identifier called ViewId. It's used for view snapshotting - because views can have possibly different states (from the same stream of events), their snapshots must be unique for each particular view. Also because views don't alter an event stream, you can have multiple instances of each view actor.

How persistent views update their state, when new event is emitted? Akka.Persistence provides two methods by default:

  1. You may explicitly send Update messages through a view's ref to provoke it to read events from the journal since last update.
  2. All views used scheduled interval signals (ScheduledUpdate messages) to automatically update their internal state when signal is received. You may configure these by setting akka.persistence.view.auto-update (on by default) and akka.persistence.view.auto-update-interval (5sec by default) keys in your actor system HOCON configuration.

You probably already noticed, that views are eventually consistent with the state produced by their persistent actor. That means, they won't reflect the changes right away, as the update delay may occur.

NOTE: This part of Akka.Persistence plugin is the most likely to change in the future, once reactive streams will be fully implemented in Akka.NET.

AtLeastOnceDelivery

The last part of the Akka.Persistence is AtLeastOnceDeliveryActor. What's so special about it? By default Akka's messaging uses at-most-once delivery semantics. That means, each message is send with best effort... but it also means that it's possible for a message to never reach it's recipient.

In short at least once delivery means that message sending will be retried until confirmed or provided retry limit has been reached. In order to send a message using at-least-once-delivery semantics, you must post it with Deliver method instead of Tell. This way messages will be resend if not confirmed in specified timeout. You can adjust timeout by specifying akka.persistence.at-least-once-delivery.redeliver-interval in your HOCON configuration - by default it's 5 seconds. It's up to message sender to decide, whether to confirm message or not. If so, it may confirm a message delivery using ConfirmDelivery method.

What if our actor is trying to deliver a message, but no acknowledgement is being received? In that case we have 2 thresholds:

  • At some point your actor will see, that messages has not been confirmed for a particular number of times. If that will occur it will receive a UnconfirmedWarning message with list of unconfirmed messages. At that point you may decide how to handle that behavior. You may specify how many retries are allowed before warning to be received by setting an akka.persistence.at-leas-once-delivery.warn-after-number-of-unconfirmed-attempts key in the HOCON config (it's 5 by default).
  • Since each unconfirmed message is being stored inside current actor, in order to prevent burning down all of the program's memory, after reaching specified unconfirmed messages limit it will actively refuse to Deliver any more messages by throwing MaxUnconfirmedMessagesExceededException. You can set the possible threshold using akka.persistence.at-least-once-delivery.max-unconfirmed-messages key (it's 100 000 by default). This limit is set per each actor using at-least-once-delivery semantics.

All unconfirmed messages are stored on the internal list. If you want, you may snapshot them using GetDeliverySnapshot and set back on actor recovery using SetDeliverySnapshot methods. Remember that this methods are meant to be used for snapshotting only and won't guarantee true persistence without event sourcing for the reasons described above in this post.

Be cautious when making decisions when to use this semantics in your code. First it's not good for performance reasons. Second it would require from you recipient to have idempotent behavior, since the same message may be possibly received more than once.

Journals and Snapshot Stores

Last part of the Akka.Persistence are journals and snapshot stores, that make all persistence possible. Both of them are essentially actors working as proxy around underlying persistence mechanism. You can decide what direct implementation of the underlying backend to choose. You can also choose to create your own. If you decide to do so see my previous post to get some more details.

In order to set a default persistence backend, you must set akka.persistence.journal.plugin or akka.persistence.snapshot-store.plugin to point at specific HOCON configuration paths. Each specific plugin should describe it's configuration path, however it may also require some custom configuration to be applied first.

It's possible for different actor types to use different persistence providers. It's possible with persistent actor's JournalPluginId and SnapshotPluginId properties, which should return HOCON configuration paths to a particular journal/snapshot.

How to create an Akka.NET cluster in F#

Today I'm going to show you how to setup your own Akka.NET cluster using F# code. This is a kind of newbie guide of how to setup a cluster, directed to people unfamiliar with it. If you don't want to use Akka with F#, but you're understand the language syntaxt, don't worry. You won't miss a thing, since most of the concepts are common for .NET (and JVM) environment.

In this example I'm going to use the Akkling - it's my fork of the existing Akka.FSharp library. While it's API is mostly compatible with official F# Akka API, one of it's features is that actor refs are statically typed on handled message types. If you're using F#, you'll probably find it handy.

install-package Akka.Cluster -pre  
install-package Akkling -pre  

or using Paket

paket add nuget Akka.Cluster  
paket add nuget Akkling  

Almost all of the cluster initialization can be done through configuration. In case if you're not yet familiar with the whole concept, here are some useful informations providing both high level and detailed overview:

Cluster setup

While I'll focus on differences bellow, basic configuration, common for all nodes may look like this:

akka {  
  actor {
    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
  }
  remote {
    log-remote-lifecycle-events = off
    helios.tcp {
      hostname = "127.0.0.1"
      port = 2551        
    }
  }
  cluster {
    roles = ["seed"]  # custom node roles
    seed-nodes = ["akka.tcp://cluster-system@127.0.0.1:2551"]
    # when node cannot be reached within 10 sec, mark is as down
    auto-down-unreachable-after = 10s
  }
}

With this configuration set up, our cluster will be able to configure itself as the new nodes will join/leave it. No additional code will be necessary.

Some explanations:

  • Since from the outside cluster looks like a single actor system, it's important, that all nodes being a part of it, must share the same actor system name.
  • Just like in Remote Actor Deployment example, we override default actor provider here. In this case we'll use ClusterActorRefProvider.
  • While cluster is able to self-manage joining nodes, each of them must know at least one node (access point) that is already part of the cluster. For this reason, we may define so called seed-nodes under akka.cluster.seed-nodes config key. While in this example there is only one defined, in practice you'd want to have at least 2-3 of them in case, when some will go down for any reason.

The basic difference between seed node and the joining ones, is that seed must listen on well-defined address to be easily accessed. Because we have already defined seed node address on port 2551, our seed node must have akka.remote.helios.tcp.port value set to that port.

What we are going to achieve is to create a fully operative cluster environment with an actor aware of it's presence inside it. What I mean by that, is that we want an actor to be able to react on incoming cluster events. For this let's use Cluster extension, which gives a many useful features, when working with cluster from within an actor.

let aref =  
    spawn system "listener"
    <| fun mailbox ->
        // subscribe for cluster events at actor start 
        // and usubscribe from them when actor stops
        let cluster = Cluster.Get (mailbox.Context.System)
        cluster.Subscribe (mailbox.Self, [| typeof<ClusterEvent.IMemberEvent> |])
        mailbox.Defer <| fun () -> cluster.Unsubscribe (mailbox.Self)
        printfn "Created an actor on node [%A] with roles [%s]" cluster.SelfAddress (String.Join(",", cluster.SelfRoles))
        let rec seed () = 
            actor {
                let! (msg: obj) = mailbox.Receive ()
                match msg with
                | :? ClusterEvent.IMemberEvent -> printfn "Cluster event %A" msg
                | _ -> printfn "Received: %A" msg
                return! seed () }
        seed ()

At this moment we will be able to track all of the control events going through the cluster.

Few more important properties, I think are worth explaining, are:

  • cluster.SelfAddress returns an Akka address of the current node (with host:port included). That allows to quickly recognize current node among the others.
  • cluster.ReadView contains cluster overview including things such as health check monitors or collection of all cluster member nodes. Members are useful in cases when you need to communicate between actors placed on different nodes, as they contain necessary localization addresses.
  • cluster.SelfRoles returns collection of so called roles attached to the current node. You may specify them using akka.cluster.roles configuration key. One of their purposes is to group the nodes inside a cluster and limit responsibilities of your system to a particular subset of nodes using common abstract discriminators. Example: if you want, you may run akka system on a web server or even mobile device as part of the cluster, but ensure that no resource-intensive work will ever run there.

Sending messages between nodes

The last thing, I want to show in this post, is how to pass message between two actors placed on the different nodes.

With first node already set up, configuration of a second node is very similar, except that in this case akka.remote.helios.tcp.port may be set to 0 (dynamic resolution) and akka.cluster.roles shouldn't contain a seed role.

Here's an actor code:

let aref =  
    spawn system "greeter"
    <| fun mailbox ->
        let cluster = Cluster.Get (mailbox.Context.System)
        cluster.Subscribe (mailbox.Self, [| typeof<ClusterEvent.MemberUp> |])
        mailbox.Defer <| fun () -> cluster.Unsubscribe (mailbox.Self)
        let rec loop () = 
            actor {
                let! (msg: obj) = mailbox.Receive ()
                match msg with
                // wait for member up message from seed
                | :? ClusterEvent.MemberUp as up when up.Member.HasRole "seed" -> 
                    let sref = select (up.Member.Address.ToString() + "/user/listener") mailbox
                    sref <! "Hello"
                | _ -> printfn "Received: %A" msg
                return! loop () }
        loop ()

Because in cluster environment node addresses are usually dynamic, we can't use static configuration to resolve them. For this sample we're waiting for a node to join the cluster, and receive MemberUp event. Then if the member passed with the event is marked with seed role, we'll send a message to it's listener actor.

After runing both nodes we should be able to see Hello message on the seed node received each time a new process with the joining node code is started - with dynamic port allocation config value (0), you should be able to run it as a separate process multiple times receiving message on the seed each time.

Akka.NET application logging in your browser

In this post I want to share with fairly simple trick - binding Akka.NET logging bus directly to browser console output. While this is mostly form of exercise, you may find it useful when you're developing system working with Akka and web sockets.

Lets start with creating our custom logger actor. Here's a starter code.

public class BrowserLogger : ReceiveActor  
{
    public BrowserLogger()
    {
        Receive<Debug>(e => Log(LogLevel.Debug, e.ToString()));
        Receive<Info>(e => Log(LogLevel.Info, e.ToString()));
        Receive<Warning>(e => Log(LogLevel.Warning, e.ToString()));
        Receive<Error>(e => Log(LogLevel.Error, e.ToString()));
        Receive<InitializeLogger>(_ => Sender.Tell(new LoggerInitialized()));
    }
}

BTW: Congratulations! If you had done that, you've actually created a fully functional logger library :)

Unfortunately, devil is in the details. Now we need to integrate our logging actor with SignalR hubs. For this, lets create a dedicated Hub class, which we could use to forward log messages back to the client.

public class LoggerHub : Hub  
{
    public override Task OnConnected()
    {
        Bootstrap.Register(this);
        return base.OnConnected();
    }

    public void Log(LogLevel level, string message)
    {
        Clients.All.log((int)level, message);
    }

    public override Task OnDisconnected(bool stopCalled)
    {
        Bootstrap.Unregister(this);
        return base.OnDisconnected(stopCalled);
    }
}

Don't worry, if you don't know, what Bootstrap class is all about. We're gonna cover it later.

Now, once our SingalR hub is ready, it's time to enrich our logger actor to use it for forwarding messages back to client. The idea is to simply have a list of hubs to be informed, and provide additional register/unregister message handlers from there.

// message used to register a hub
public class RegisterHub  
{
    public readonly LoggerHub Hub;
    public RegisterHub(LoggerHub hub)
    {
        Hub = hub;
    }
}

// message used to unregister a hub
public sealed class UnregisterHub  
{
    public readonly LoggerHub Hub;
    public UnregisterHub(LoggerHub hub)
    {
        Hub = hub;
    }
}

public class BrowserLogger : ReceiveActor  
{
    private readonly ISet<LoggerHub> _hubs = new HashSet<LoggerHub>();

    public BrowserLogger()
    {
        Receive<RegisterHub>(register => _hubs.Add(register.Hub));
        Receive<UnregisterHub>(unregister => _hubs.Remove(unregister.Hub));

        ...
    }

    private void Log(LogLevel level, string message)
    {
        foreach (var hub in _hubs) hub.Log(level, message);
    }
}

Note: logging may be a risky operation, since invoking hub method may result with throwing an exception and reseting an actor. IMO in that case, the best solution is to catch an exception, wrap it with our own custom Exception to be ignored by current logger (using in example following statement: Receive<Error>(e => !(e.Cause is BrowserLoggingException), e => Log(LogLevel.Error, e.ToString()));) and propagate it using standard akka logging mechanism to other loggers in our actor system (you don't want to loose these exceptions, right?).

In one of the previous code snippets you might notice, that we're using Bootstrap.Register and Bootstrap.Unregister methods, which hasn't been defined yet. For this we create a static class used as access point to actor system and it's functionalities.

public static class Bootstrap  
{
    private static ActorSystem _system;
    private static ActorSelection _logger;

    public static ActorSystem System { get { return _system; } }

    // call this method in your Startup.cs or Global.asax when application starts
    public static void Init()
    {
        _system = ActorSystem.Create("logging-system", 
            ConfigurationFactory.ParseString(@"akka.loggers = [""MyNamespace.BrowserLogger, MyAssembly""]"));
        _logger = _system.ActorSelection("/system/log*");
    }

    public static void Register(LoggerHub hub)
    {
        _logger.Tell(new RegisterHub(hub));
    }

    public static void Unregister(LoggerHub hub)
    {
        _logger.Tell(new UnregisterHub(hub));
    }
}

I think, that one thing worth notice is the way, we refer to logger actor in code above. Loggers are created internally by actor system on it's start and live on their own rights. Therefore it's hard to get any reference to them. /system/log* path may be confusing at the beginning, here's why:

  • Logger actor is part of the actor system internals (Akka.NET not only exposes actors to external users, but also uses them to manage itself). For this reason unlike usual /user prefix, here we have /system actor guardian used for internal actors.
  • Loggers names are generated and they usually take a form of log-<inc>-<logger_type_name>, where inc is auto-incremented number. Therefore it's hard to predict, what exact name of our logger will be. To omit this issue we use wildchart operator to access all actors with name prefixed with log (basically all loggers). It's not a problem, since only our BrowserLogger is able to respond on Register/Unregister messages.

The final step is to register our hub on the browser side. Assuming that you already have SignalR configured in your application, this is pretty straightforward code:

 var logLevel = {
     DEBUG: 0,
     INFO: 1,
     WARNING: 2,
     ERROR: 3
 };
 var log = $.connection.loggerHub;
 log.client.log = function (level, message) {
     switch (level) {
         case logLevel.DEBUG: console.debug(message); break;
         case logLevel.INFO: console.info(message); break;
         case logLevel.WARNING: console.warn(message); break;
         case logLevel.ERROR: console.error(message); break;
     }
 };

 $.connection.hub.start().done(function() {
     console.log('connection initialized');
 });

Final notes

Following example can be used only in local actor system scope. In cluster wide scenarios you should ensure that hub registration doesn't pass local system scope, since SignalR hubs are not simple POCOs, and cannot be freely serialized/deserialized.

Additionally remember that logging bus, just like any other Akka's event stream, works only in local scope. To publish message in cluster-wide environments you'll need to provide your own event propagation between node boundaries (it's not so hard) or use Akka.Cluster.Tools plugin (which is still in progress at the moment of writing that post).

Create your own Akka.NET persistence plugin

Subject of this post is one of the Akka.NET plugins, Akka.Persistence. It’s major task is to give your actors an ability to store their internal state and recover it from the persistent storage after actor is created or underlying actor system crashes. While the rules driving the whole persistence mechanism deserves a separate article, for purposes of this post I’ll cover only topic of event journals and snapshot stores.

Introduction

If you’re already familiar with the idea of Eventsourcing, you probably know everything what’s needed by now. If not, here is a simplified model – statefull actors don’t store their state directly, but in form of the events. Event is a single immutable fact informing about a change made in time over actor’s state. Later, when any form of the recovery is needed, actor’s state is recreated by sending to it a series of persisted events, from which it can rebuild itself. Storage responsible for persisting those events is called Event Journal . In Akka.Persistence by default it works only in memory which does not make it a truly persistent one.

Since a possible stream of events to be replayed on actor may require a long time to full recovery, it’s internal state may be serialized at any point of time and stored separately. In this case by default actor will first try to get the latest intermediate state (a snapshot) and then recover only from those events, which have been stored from that point in time onward. Store used to persist those snapshots is called a Snapshot Store and in Akka.NET by default it uses a local file system as a persistence layer.

Akka.Persistence plugin is designed in modular fashion and allows you to create your own plugins and integrate them with any underlying persistence provider.

Journal

Just like most of the other entities in Akka, journals and snapshot stores are actors. To create your own journal, you could derive from either AsyncWriteJournal or SyncWriteJournal. Difference between these two is that first one returns a Task continuation from all of it’s operations (reads, writes and deletions – as said earlier journals don’t perform updates), while the second only from reads, expecting writes and deletions to be completed in synchronous manner. For this example I’m going to use sync journal implementation.

Lets start from the following state of our class:

public class MyJournal : SyncWriteJournal  
{
    private static readonly object Ack = new object();
    private readonly ConcurrentDictionary<string, ISet<IPersistentRepresentation>> _eventStreams =
              new ConcurrentDictionary<string, ISet<IPersistentRepresentation>>();

    Task ReplayMessagesAsync(string pid, long from, long to, long max, Action<IPersistentRepresentation> replay) ...
    Task<long> ReadHighestSequenceNrAsync(string pid, long from) ...
    void WriteMessages(IEnumerable<IPersistentRepresentation> messages) ...
    void DeleteMessagesTo(string pid, long to, bool isPermanent) ...
}

For this example all persistent actors events are stored in memory. Journal is able to recognize, which event stream refers to specific actor thanks to PersistenceId – an unique identifier for each PersistentActor which may identify it’s state across many actor incarnations. Each event inside the stream has it’s own Sequence number – an unique number in scope of event stream. It’s always increasing monotonically for every separate persistent actor.

This is how WriteMessages method may look like:

protected override void WriteMessages(IEnumerable<IPersistentRepresentation> messages)  
{
    foreach (var message in messages)
    {
        var list = _eventStreams.GetOrAdd(message.PersistenceId,
            new SortedSet<IPersistentRepresentation>(PersistentComparer.Instance));
        list.Add(message);
    }
}

This is a straightforward operation of finding an event stream for each message persistence id (or creating a new one if it was not found) and appending a message to it. PersistentComparer is simply a IComparer used to order messages by their sequence number. IPersistentRepresentation wraps user-defined event and contain all data necessary for the persistence plugin to perform it’s magic.

To be able to replay stored messages, now we must define a ReplayMessagesAsync logic:

public override Task ReplayMessagesAsync(string pid, long from, long to,  
      long max, Action<IPersistentRepresentation> replay)
{
    var dispatcher = Context.System.Dispatchers.DefaultGlobalDispatcher;
    var promise = new TaskCompletionSource<object>();
    dispatcher.Schedule(() =>
    {
        try
        {
            Replay(pid, from, to, max, replay);
            promise.SetResult(Ack);
        }
        catch (Exception e)
        {
            promise.SetException(e);
        }
    });

    return promise.Task;
}

Because this method is already prepared to work in asynchronous manner – which is the most probable solution for third party providers integration nowadays – here we delegate this work through our system’s dispatcher. Because API of the underlying dispatcher may work in various ways (Akka is not bound to TPL), it’s Schedule method doesn’t expose any task to be returned. Here I’ve introduced a little trick using TaskCompletionSource to create similar behavior.

The Replay method itself is pretty straightforward:

private void Replay(string pid, long from, long to,  
      long max, Action<IPersistentRepresentation> replay)
{
    ISet<IPersistentRepresentation> eventStream;
    if (_eventStreams.TryGetValue(pid, out eventStream))
    {
        var s = eventStream as IEnumerable<IPersistentRepresentation>;

        if (from > 0) s = s.Where(p => p.SequenceNr >= from);
        if (to < int.MaxValue) s = s.Where(p => p.SequenceNr <= to);
        if (max < eventStream.Count) s = s.Take((int)max);

        foreach (var persistent in s)
        {
            replay(persistent);
        }
    }
}

What’s worth to notice here is that, from/to sequence number range provided is expected to be inclusive from both sides.

Next in line is ReadHighestSequenceNrAsync – it’s used to receive the highest sequence number for current event stream associated with a particular persistent actor. While in your solution you may find appropriate to use some async API or dispatcher task delegation described above, I’ve used a simple synchronous solution here:

public override Task<long> ReadHighestSequenceNrAsync(string pid, long from)  
{
    ISet<IPersistentRepresentation> eventStream;
    if (_eventStreams.TryGetValue(pid, out eventStream))
    {
        var last = eventStream.LastOrDefault();
        var seqNr = last == null ? 0L : last.SequenceNr;
        return Task.FromResult(seqNr);
    }

    return Task.FromResult(0L);
}

The last method remaining is message deletion. While this is not a usual way of working i.e. in event sourcing model (it introduces a data loss), it’s still a viable option.

Akka.Persistance recognizes two types of deletion:

  • Logical, when messages are marked as deleted (IPersistentRepresentation.IsDeleted flag) but still resides in event journal.
  • Physical, when messages are physically removed from the journal and cannot be restored.
protected override void DeleteMessagesTo(string pid, long to, bool isPermanent)  
{
    ISet<IPersistentRepresentation> eventStream;
    if (_eventStreams.TryGetValue(pid, out eventStream))
    {
        foreach (var message in eventStream.ToArray())
        {
            if (message.SequenceNr <= to)
            {
                eventStream.Remove(message);
                if(!isPermanent)
                {
                    // copy message with IsDeleted flag set
                    var copy = message.Update(message.SequenceNr,
                    message.PersistenceId, isDeleted: true, sender: message.Sender);
                    eventStream.Add(copy);
                }
            }
            // messages are already stored by their sequence number order
            else break;
        }
    }
}

Test specification fulfillment

When your custom persistence plugin is ready, it’s good to check if it’s behavior is a valid one from the Akka.Persistence perspective. In order to do this, you may get an existing test suite in form of a Akka.Persistence.TestKit available as a NuGet package – it’s a set of tests used to check if your custom persistence plugin fulfills expected journal (and snaphot store) behavior. It used Akka.NET TestKit along with xUnit in order to run.

The only thing needed is your own spec class inheriting from JournalSpec with configuration set to use your plugin as a default one. The same configuration set may be used to inject your persistence provider to real working actor system.

public class MyJournalSpec : JournalSpec  
{
    private static readonly Config SpecConfig =
    ConfigurationFactory.ParseString(@"
       akka.persistence {
           publish-plugin-commands = on
           journal {
               plugin = ""akka.persistence.journal.my""
               my {
                   class = ""MyModule.MyJournal, MyModule""
                   plugin-dispatcher = ""akka.actor.default-dispatcher""
               }
           }
       }");

    public MyJournalSpec() : base(SpecConfig) 
    {
        Initialize();
    }
}

Snapshot store

Custom snapshot store implementation seems to be an easier task. Here I’m going to use again the store based on in-memory snapshots, which still is not a valid persistence layer, but works good for training purposes.

Actor state snapshots are identified by two entities: SnapshotMetadata and Snapshot itself. First covers the data necessary to correctly place snapshot instance in the lifetime of a specific persistent actor, while the second one is a wrapper over actor’s state itself.

For purposes of this in-memory plugin, I’ll wrap both of them with single object defined like this:

public class SnapshotBucket  
{
    public readonly SnapshotMetadata Metadata;
    public readonly Snapshot Snapshot;

    public SnapshotBucket(SnapshotMetadata metadata, Snapshot snapshot)
    {
        Metadata = metadata;
        Snapshot = snapshot;
    }
}

This way we may describe the snapshot store in the following manner:

 public class MySnapshotStore : SnapshotStore
 {
   private static readonly object Ack = new object();
   private readonly LinkedList<SnapshotMetadata> _saving = new LinkedList<SnapshotMetadata>();
   private readonly List<SnapshotBucket> _buckets = new List<SnapshotBucket>();

   Task<SelectedSnapshot> LoadAsync(string pid, SnapshotSelectionCriteria criteria) ...
   Task SaveAsync(SnapshotMetadata metadata, object snapshot) ...
   void Saved(SnapshotMetadata metadata) ...
   void Delete(SnapshotMetadata metadata) ...
   void Delete(string pid, SnapshotSelectionCriteria criteria) ...
}

To begin with, lets abstract dispatcher behavior described before. Remember, this is only a trick – nothing stands on your way to simply use Task.Run or delegating work to child actors. In real life scenario you’d probably use built in async behavior of underlying provider.

private Task<T> Dispatch<T>(Func<T> fn)  
{
    var dispatcher = Context.System.Dispatchers.DefaultGlobalDispatcher;
    var promise = new TaskCompletionSource<T>();
    dispatcher.Schedule(() =>
    {
        try
        {
            var result = fn();
            promise.SetResult(result);
        }
        catch (Exception e)
        {
            promise.SetException(e);
        }
    });
    return promise.Task;
}

Next, lets take a saving operation. In case of snapshot stores it’s divided into two separate methods:

  • SaveAsync asynchronous operation of saving the snapshot, returning the Task continuation.
  • Saved called only after message has been successfully saved.

They may be combined to i.e. introduce a log for actually saving, but not yet saved snapshots, as I’ve presented below.

protected override Task SaveAsync(SnapshotMetadata metadata, object snapshot)  
{
    return Dispatch(() =>
    {
        var bucket = new SnapshotBucket(metadata, new Snapshot(snapshot));
        _saving.AddLast(metadata);
        _buckets.Add(bucket);

        return Ack;
    });
}

protected override void Saved(SnapshotMetadata metadata)  
{
    _saving.Remove(metadata);
}

When it comes to loading a snapshots, snapshot store is expected to return only one snapshot being the latest one from the provided range. Just like any other range in persistence plugin, this one is expected to have inclusive boundaries.

protected override Task<SelectedSnapshot> LoadAsync(string pid, SnapshotSelectionCriteria criteria)  
{
    return Dispatch(() =>
    {
        var limited = _buckets.Where(bucket => bucket.Metadata.PersistenceId == pid
        && bucket.Metadata.SequenceNr <= criteria.MaxSequenceNr
        && bucket.Metadata.Timestamp <= criteria.MaxTimeStamp);

        var latest = limited.OrderBy(bucket => bucket.Metadata.Timestamp).Last();

        return new SelectedSnapshot(latest.Metadata, latest.Snapshot.Data);
    });
}

The last behavior is message deletion, represented in this case by two methods – one used for single snapshot and one able to operate on specific range.

protected override void Delete(SnapshotMetadata metadata)  
{
    var found = _buckets.SingleOrDefault(bucket => bucket.Metadata.Equals(metadata));
    if (found != null)
    {
        _buckets.Remove(found);
    }
}

protected override void Delete(string pid, SnapshotSelectionCriteria criteria)  
{
    _buckets.RemoveAll(bucket =>
       bucket.Metadata.PersistenceId == pid
        && bucket.Metadata.SequenceNr <= criteria.MaxSequenceNr
        && bucket.Metadata.Timestamp <= criteria.MaxTimeStamp);
}

Test specification fulfillment

Just like in the case of journal spec, you may verify your actor behavior simply by creating custom spec. While you may see a minimal configuration string below, nothing stands on your way to extend it with your custom settings.

public class MySnapshotStoreSpec : SnapshotStoreSpec  
{
    private static readonly Config SpecConfig =
      ConfigurationFactory.ParseString(@"
        akka.persistence.snapshot-store {
            plugin = ""akka.persistence.snapshot-store.my""
            my {
                class = ""TestPersistencePlugin.MySnapshotStore, TestPersistencePlugin""
                plugin-dispatcher = ""akka.persistence.dispatchers.default-plugin-dispatcher""
            }
        }
    ");

    public MySnapshotStoreSpec() : base(SpecConfig)
    {
        Initialize();
    }
}

Another difference from event journal spec is that fore most test cases snapshot store spec requires an existing collection of snapshots metadata to work. They are initialized in constructor in the code above.

Final notes

  1. Examples, I’ve presented, work on predefined constructs – they derive directly from SnapshotStore and SyncWriteJournal. However it’s not necessary, as you may promote any actor type to work as snapshot store or event journal as long as they expose necessary behavior and are able to respond on required set of persistence messages.
  2. While persistence stores specs may be used to verify your implementations, remember that they don’t ensure things such as data races etc.
  3. Remember that Akka.Persistence is still in Prerelease version. Therefore some breaking changes may occur in incoming versions.