
Quest for optionally asynchronous APIs

In this blog post I want to share some of the improvements we've made while working on the FSharp.Data.GraphQL library. To start with, I'm going to describe our use case, the issues we observed, and how we solved them. If you're not familiar with GraphQL, please get acquainted with it first.

Introduction

GraphQL is an application-level query language that can be integrated with server-side application logic through a library of your choice. FSharp.Data.GraphQL is one of those libraries.

What most server implementations allow you to do is define custom resolution logic for any of a schema object's fields. We can define them like this:

Define.Object("User", [  
    Define.Field("firstName", String, fun ctx user -> user.FirstName)
])

However, sometimes we need more advanced logic. Some fields may not refer to a particular entity instance; they may be obtained as part of an asynchronous call. Therefore we have introduced another creation method, which allows us to define asynchronous operations:

Define.AsyncField("picture", String, fun ctx user -> async { return! getPic user.AvatarId })  

GraphQL allows us to define queries that fetch only the necessary fields. Queries are highly dynamic constructs - we never know which fields will be requested. This also means that if we have synchronous and asynchronous fields in our API, any combination of them may be expected in the output. This is where the problem begins.

Problem

It turned out that the case above - resolving both synchronous and asynchronous values - was quite problematic for the underlying GraphQL execution engine we were working on. Since F# is a statically typed language, we needed a uniform way to work with both synchronously and asynchronously resolved fields.

We started by modeling them all in terms of F# Async computations. However, Async introduces a constant overhead - both in terms of CPU and memory - which would now apply to every resolved field. The bad part: as practice shows, ~99% of the time field resolvers are synchronous. This means introducing a heavy overhead by default, where it is not needed in most cases.
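To visualize the cost, consider a sketch (with a hypothetical User record): even a resolver that already has its value in hand must allocate an Async computation just to satisfy the uniform signature:

type User = { FirstName : string; AvatarId : int }

// the value is already available, yet every call allocates an Async instance
// and pays the async scheduling machinery when executed
let resolveFirstName (user : User) : Async<string> =
    async.Return user.FirstName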

In case you think you're free of that with the C# Task Parallel Library - you're not. As I said, since the combination of fields requested by a query is dynamic and runtime-dependent, the compiler is not able to determine at compile time when to optimize async methods.

Solution

We needed another kind of abstraction - something that would allow us to work with Async computations, but would also respect the mostly-synchronous nature of our problem.

If you're familiar with the list of changes planned for future versions of the C# language, you'll notice an idea called ValueTask - shortly speaking, it's a lightweight value type that is going to conform to the async/await API and allow both immediately returned values and TPL Tasks to be used in the same way. Exactly what we needed here.

However, ValueTask still belongs to the future. Besides, we're building an F# library, and we needed something that would feel natural to F# devs, since F# has its own Async primitive.

This is why we created our own AsyncVal type - it behaves similarly to Async, but is able to take an optimized path for synchronously resolved values. To make it easier to work with, we've also created an asyncVal { ... } computation expression and interop methods for the async { ... } builder. With it, we are free to express things such as:

let random = System.Random()

let result = asyncVal {
    let rand = random.Next()
    if rand % 1000 = 1
    then return! async { return! someAsyncOperation () } // rare path: a true asynchronous computation
    else return rand                                     // common path: synchronous value, no Async allocation
} |> AsyncVal.get

... and get both support for asynchronous execution and optimal performance for the happy path.
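To give an intuition of the idea, here is an illustrative sketch (hypothetical names, not the library's actual definition; struct unions require F# 4.1+) of how such a type can be modeled with two paths:

[<Struct>]
type AsyncVal<'T> =
    | Immediate of value : 'T             // synchronous path: no Async allocation
    | Delayed of computation : Async<'T>  // asynchronous path: a plain Async underneath

module AsyncVal =
    // synchronously extract the result, running the computation only when needed
    let get = function
        | Immediate v -> v
        | Delayed a -> Async.RunSynchronously a
    // convert back to a plain Async for interop with the async { } builder
    let toAsync = function
        | Immediate v -> async.Return v
        | Delayed a -> a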

How fast is it? While the implementation is still in flux, we've made some initial benchmarks comparing AsyncVal against async.Return (remember: benchmarks lie, and the actual performance gain in our case was not as optimistic). It turned out to be almost 3000 times faster with no heap allocations - it's a struct type, introducing only 4 bytes of overhead on 32-bit and 8 bytes on 64-bit machines. For truly asynchronous computations, it introduces minimal overhead over the existing Async implementation. You can see the actual benchmarks in the following PR.

This allowed us to optimize the most common cases without losing the ability to work with higher-level abstractions.

Summary

Right now AsyncVal is part of the FSharp.Data.GraphQL library and will probably require some more polishing, as it was created to solve our specific problems and is not yet ready for general-purpose use - e.g. error management isn't exactly uniform at the moment.

However, this PoC already proves its usefulness and may be used as a foundation for something greater. Maybe in the future it will deserve its own package.

On GraphQL issues and how we're going to solve them

Facebook's GraphQL is one of the emerging (web) technologies shedding new light on the future of web APIs. I'm not going to introduce it here - there are plenty of articles on the subject, starting with the official site. What I'm going to write about is one specific issue of GraphQL's design that we met while working on its F# implementation. But let's start from the beginning.

Designing with capabilities

The most important difference between REST APIs and GraphQL endpoints is the approach to exposing data and available actions to API consumers.

In RESTful APIs we describe these quite vaguely, as there is no standard approach here, and REST itself is more a set of guidelines than strict rules. Of course, some approaches have already been created to address this (e.g. Swagger).

In general, we expose our API as a set of URLs (routes) which - when called with the right set of input arguments and HTTP methods - reply with the right data in the response.

The problem is that this eventually leads to an epidemic growth of exposed routes, as we often need a few more (underfetching) or a few less (overfetching) fields than the endpoints exposed so far provide, but we don't want to end up making additional calls to complete the missing data.

On the other hand, with GraphQL over- and under-fetching is almost never a problem. The reason is simple: instead of exposing endpoints that each aim to answer one particular query, we create a single endpoint and a schema, which describes all the data - with all possible relationships between entities - that is potentially accessible from the corresponding service. Therefore you have all the capabilities of your web service in one place.

So, where's the catch?

While all of that sounds simple and fun, the story behind it is a little grimmer.

Let's consider a web service with some ORM and a SQL-backed database - quite a common setup nowadays. One of the well-known problems with ORMs is their leaky abstraction. The one I'm going to write about here is known as N+1 SELECTs.

For those of you who haven't heard of it: this is a common problem with almost all advanced ORMs, which happens when you haven't explicitly mentioned all the fields you're going to use when sending a query to the database. Once the query has been executed and you try to access a property which has not been fetched from the database, most "smart" ORMs will automagically send a subsequent query to retrieve the missing data. If you're iterating over a collection of such data objects, a new query will be sent each time a not-yet-fetched field is accessed.

How do we solve this problem in a RESTful API? Because we know exactly what is going to be returned from a route, we can simply tailor logic able to retrieve all the necessary data in a minimal number of queries.

And how about GraphQL? Well... you're fucked. There is no easy way to "tailor" a SQL query to match a specific request, as the expected result set is highly dynamic, depending on the incoming GraphQL query string.

Of course, there are multiple approaches to solving this problem across different implementations in multiple languages. Let's take a look at some of them.

Dedicated GraphQL-to-SQL library

Existing examples I know about:

  • postgraph (node.js), which exposes a PostgreSQL database schema as GraphQL and is able to compile GraphQL queries into SQL.
  • GraphQL.NET (.NET), which translates GraphQL queries into LINQ.

Both of them have the same problems:

  • They specialize in translating one query language to another. This solves only a subset of problems. What if you need to merge a response from the database with a call to/from another web service?
  • They usually introduce tight coupling between your Data Access Layer (or even the database schema itself) and the model exposed on the endpoint.
  • They are separate libraries, usually hardly compatible with other, more general GraphQL implementations.

Throttling and batching queries

The second kind - introduced in the Ruby implementation - digs into the backend of the ORM itself. What it does is split DB query execution into small timeframes. Instead of executing each SQL query immediately, we batch all requests within, let's say, a 10ms time frame and execute them at once.

I won't cover that idea in detail, as it's more a feature of the underlying database provider than of the application server implementation itself. Besides, it feels like a slightly hacky solution too ;)

Can we do more?

One of the major problems here is the interpretation of the incoming GraphQL query. While most implementations expose information about things such as the AST of the executed query and the schema itself, that data is usually not correlated in any way.

In FSharp.Data.GraphQL we have created something called an execution plan - an intermediate representation of a GraphQL query, which includes things like inlining query fragments' fields and correlating them with the schema and the type system.

To show this with an example - let's say we have the following type system described in our schema:

schema { query: Query }

type Query {  
    users: [User!]
}

type User {  
    id: ID!
    info: UserInfo!
    contacts: [UserInfo!]
}

type UserInfo {  
    firstName: String
    lastName: String
}

And we want to execute a query, which looks like this:

query Example {
    users {
        id
        info {
            ...fname
        }
        contacts {
            ...fname
        }
    }
}

fragment fname on UserInfo {  
    firstName
}

What does the execution plan for that query look like?

ResolveCollection: users of [User!]  
    SelectFields: users of User!
        ResolveValue: id of ID!
        SelectFields: info of UserInfo!
            ResolveValue: firstName of String
        ResolveCollection: contacts of [UserInfo!]
            SelectFields: contacts of UserInfo!
                ResolveValue: firstName of String

A little description:

  • ResolveValue is a leaf of the execution plan tree. It marks a scalar or enum coercion.
  • SelectFields is used to mark retrieval of a set of fields from an object type.
  • ResolveCollection marks a node that is going to return a list of results.
  • ResolveAbstraction (not present in the example) is a map of object type ⇒ SelectFields set, used for abstract types (unions and interfaces).

As we can see, the information about the query fragment has been erased (the fragment's fields have been inlined), and every field carries type information. What we get as a result is a fully declarative representation of our query. What can we do from here?

  • The standard execution path is trivial - we simply perform a tree traversal and execute every field's resolve function along the way (see the sketch after this list).
  • We can decide to use an interpreter to create another representation from it (e.g. a LINQ/SQL query).
  • Since the execution plan is built as a tree, we could "manually" add/remove nodes, or even split it in two and then interpret each part separately. Having set operations (such as unions and intersections) over them is especially promising.
  • We could cache execution plans, reducing the number of operations necessary when executing each query. This can be a big advantage in the future - especially when combined with the publish/subscribe pattern.
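To make the first point concrete, here is a minimal sketch of such a traversal (simplified node shapes, not the library's actual types):

type PlanNode =
    | ResolveValue of name : string
    | SelectFields of name : string * fields : PlanNode list
    | ResolveCollection of name : string * child : PlanNode

// depth-first walk collecting the names of all requested leaf fields -
// exactly the information needed to tailor, say, a SQL projection
let rec leafFields node =
    match node with
    | ResolveValue name -> [ name ]
    | SelectFields(_, fields) -> fields |> List.collect leafFields
    | ResolveCollection(_, child) -> leafFields child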

While the presence of an intermediate representation doesn't solve all the problems, I think it's a step in the right direction, as it allows us to compose our server logic in a more declarative fashion without headaches about performance issues.

Akka.NET remote deployment with F#

Today I want to present how Akka.NET can be used to leverage distributed computing, and discuss some of the changes in the latest F# package (Akka.FSharp 0.7.1). If you are not interested in an explanation of how Akka.NET internals allow us to create distributed actor-based applications, just skip the next part and dive directly into the examples.

How Akka.NET remote deployment works

When a new actor should be created, its actor system deployer has to figure out whether deployment should occur on the local machine or on a remote node. The second option requires a network connection to be established between the local and remote nodes participating in the communication. To be able to do so, each participating node needs to know the location of the others. In Akka, this is done by using actor paths.

An actor path allows us to identify actors within their systems' actor hierarchies as well as locate them over the network. There are two formats of actor paths, both of which use the standard URI convention:

  1. Local paths (e.g. akka://local-system/user/parent/child) may be used to identify actors deployed (remotely or not) by the local actor system we are referring to. In the provided example, the actor path refers to an actor named child, whose parent/supervisor actor is called parent, which resides under the user guardian (a specialized actor supervisor existing inside the actor system kernel). All of them exist in an actor system named local-system.
  2. Remote paths (e.g. akka.tcp://remote-system@localhost:9001/) are similar to local ones, but with a few differences. First, we need to change the akka protocol to either akka.tcp or akka.udp to indicate what kind of underlying network connection we want to target. Second, we have to suffix the actor system name with its location, by providing the address and port the remote node is expected to be listening on. Both path forms can be used to resolve actors, as shown below.
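For illustration, both path formats can be fed to Akka.FSharp's select function to obtain an actor selection (the addresses below are hypothetical):

// resolve a local actor and a remotely deployed one by their paths
let localChild = select "akka://local-system/user/parent/child" system
let remoteChild = select "akka.tcp://remote-system@localhost:9001/user/parent/child" system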

The image below shows how actors actually refer to each other in a remote deployment scenario.

When an actor is deployed remotely, the remote node is responsible for creating its instances, but we still refer to it using its actor reference and our local system context. Local/remote coordination is done by the remote daemon - a specialized system actor which resides directly under the /remote path of the system root. With the example above, we may refer to the remotely deployed child using the akka.tcp://sys@A:2552/user/parent/child address, while its true hierarchy lies under the akka.tcp://sys@B:2552/remote/sys@A:2552/user/parent/child path. This preserves location transparency, hiding the true underlying actor. References returned this way are almost indistinguishable from their local counterparts, allowing the use of features such as monitoring, supervising and so on.

Remote deployment

Hocon configuration

To start playing with remote deployment, we should configure both endpoints so they can maintain network communication. This is achievable through HOCON configuration, which is the default format for both the JVM and .NET Akka implementations. While you may be a little irritated by the need to learn a new format only to use Akka, remember:

  1. It’s actually very easy to learn
  2. It’s not XML ;)

Before continuing, I encourage you to get familiar with it.

To enable remote deployment feature, this would be a minimal Akka.NET configuration setup:

akka {  
    actor {
        provider = "Akka.Remote.RemoteActorRefProvider, Akka.Remote"
    }
    remote.helios.tcp {
        port = 9001
        hostname = localhost
    }
}

First, every configuration node resides under a common root called akka. Since remoting is not a default feature of Akka.NET, we have to inform our actor system how to couple remotely deployed actors with our local actor system. This is done through the RemoteActorRefProvider defined under the akka.actor.provider config node. It will allow us to associate references to spawned actors on both local and remote systems.

Next is the remote node, which defines configuration specific to the remoting feature. There, we need to define how our system will communicate with others. At the present moment this is achieved by default using the Helios server - a lightweight and highly concurrent TCP/UDP server developed by Aaron Stannard, one of the Akka.NET core developers (don't confuse it with Microsoft Helios, which is a totally different project). Akka.Remote will use it automatically under the hood - the only thing we have to define is the address and port under which it will listen for incoming messages. If you want the port to be resolved automatically, just set 0 as the port number.

Excluding differences in the nodes' locations, this configuration string may be shared by both the local and remote node. You may use this configuration simply by parsing it with the Configuration.parse function.

Let’s get to the point

Unlike the C# API, the F# API is able to construct actors using expressions compiled directly at runtime (while still providing the type safety F# programmers are used to). This is achievable by leveraging F# quotations, which can be serialized and compiled on demand on another machine.

For this sample we don't need two separate machines; instead we may mock them by running two Akka applications. What is necessary for the most basic example is that both of them share at least the Akka.FSharp and Akka.Remote assemblies, as well as all of their dependencies.

install-package Akka.FSharp -version 0.7.1  
install-package Akka.Remote -version 0.7.1  

Remote node

The only job of the remote system is to listen for incoming actor deployment requests and execute them. Therefore the implementation is very simple:

let config =  
    Configuration.parse
        @"akka {
            actor.provider = ""Akka.Remote.RemoteActorRefProvider, Akka.Remote""
            remote.helios.tcp {
                hostname = localhost
                port = 9001
            }
        }"

[<EntryPoint>]
let main args =  
    use system = System.create "remote-system" config
    System.Console.ReadLine() |> ignore
    0

After running it, our remote system should be listening on localhost port 9001, ready to instantiate remotely deployed actors.

Local actor system instance

The second application will be used to define actors and send deployment requests to the remote node. To do so, it has to know the remote node's address.

To deploy our actors remotely, let's build some helper functions. To begin with, we need some logic to inform our local system that deployment should occur on a remote machine. Remember that we need to provide the full address of the remote system, including its network location and the protocol type used for communication.

open Akka.Actor  
// return Deploy instance able to operate in remote scope
let deployRemotely address = Deploy(RemoteScope (Address.Parse address))  

Remote deployment in Akka F# is done through the spawne function, which requires the deployed code to be wrapped in an F# quotation.

let spawnRemote systemOrContext remoteSystemAddress actorName expr =  
    spawne systemOrContext actorName expr [SpawnOption.Deploy (deployRemotely remoteSystemAddress)]

Quotations give us a few nice features, but also have some limitations:

  1. Unlike the C# approach, here we don't define actors in shared libraries which have to be bound to both endpoints. Actor logic is compiled at runtime, while the remote actor system is operating. That means there is no need to stop your remote nodes to reload shared actor assemblies when they are updated.
  2. Code embedded inside a quotation must use only functions, types and variables known to both endpoints. There are limited ways to define functions inside a quotation expression (and no way to define types), but generally speaking, in most cases it's better to define them in a separate library shared between the nodes.

The last argument of the spawne function is a list of options used to configure the actor. We used SpawnOption.Deploy to specify the deployment specifics. Other options may describe things such as message mailboxes, actor routers or failure handling strategies.

Because the Akka actor system is required to negotiate deployment specifics with external nodes, a local instance of it has to be provided even though we want to deploy our actors on remote machines.

Finally, when everything is set up, we can run our example (remember that the remote node must be up and running):

let system = System.create "local-system" config  
let aref =  
    spawnRemote system "akka.tcp://remote-system@localhost:9001/" "hello"
       // actorOf wraps custom handling function with message receiver logic
       <@ actorOf (fun msg -> printfn "received '%s'" msg) @>

// send example message to remotely deployed actor
aref <! "Hello world"

// thanks to location transparency, we can select 
// remote actors as if they were existing on the local node
let sref = select "akka://local-system/user/hello" system  
sref <! "Hello again"

// we can still create actors in local system context
let lref = spawn system "local" (actorOf (fun msg -> printfn "local '%s'" msg))  
// this message should be printed in local application console
lref <! "Hello locally"  

As a result, you should see two messages printed in the remote application's console and one in the local one. See?

Final thoughts

Remember that Akka.NET is still in beta and all of the F# API functions are subject to change. If you have some concepts or interesting ideas, or want to help and become part of the family, you may share them on the Akka.NET discussion group or directly on GitHub.

Appendix A: set your configuration string in application config file

While you may define configuration strings in code, a better idea is to store them in .config files. To be able to do so, we simply extend the configuration file with a custom Akka config section:

<configSections>  
    <section name="akka" type="Akka.Configuration.Hocon.AkkaConfigurationSection, Akka" />
</configSections>  

Next, we can embed our HOCON-formatted configuration string directly in the configuration file by using a <![CDATA[]]> marker (directly under the main <configuration> root node):

<akka>  
    <hocon>
        <![CDATA[
          ... paste your config string here
        ]]>
    </hocon>
</akka>  

To load the configuration, simply call the Akka.Configuration.ConfigurationFactory.Load() method.
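For illustration, wiring it together could look like this (a minimal sketch; ConfigurationFactory.Load() reads the akka section shown above from the application's .config file):

open Akka.FSharp

let config = Akka.Configuration.ConfigurationFactory.Load()
let system = System.create "local-system" config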

Appendix B: Share-nothing

Since keeping the actual assemblies in sync across all of the remote nodes may feel cumbersome to some people, I decided to show a little trick which may be used to replace some of the complex data structures in your F# code. Let's look at how the data structure problem is simply solved in Akka's predecessor, Erlang:

go() ->  
  Pid = spawn(RemoteNode, loop),
  Pid ! {hello, "world"},
  Pid ! {hi}.

loop() ->  
  receive
      {hello, Msg} ->
          io:fwrite("hello ~p~n", [Msg]),
          loop();
      {hi} ->
          io:fwrite("hi~n"),
          loop()
  end.

I think that, based on your knowledge of functional programming, most of this should be at least familiar if not self-explanatory. The crux is the {hello, Msg} and {hi} syntax - these are actually instances of tuples (nested in a pattern-matched message receiver construct). Standard Erlang convention refers to them as tagged tuples, whose first element is an atom - also known as a symbol in other languages. Since Erlang is a dynamic language, this is how we may differentiate tuples of the same size from each other.

The example below shows how the usage of Erlang's tagged tuples can be carried over to F#. The big difference is that F# has no equivalent of an Erlang atom. Therefore they have been replaced by F# literal integers, which give us the advantage of human-readable tags while still being treated as integers inside quotation expressions.

[<Literal>]
let Hello = 1
[<Literal>]
let Hi = 2

// remoteAddr is the address of the remote node, e.g. "akka.tcp://remote-system@localhost:9001/"
let pid =
    spawne system "remote"
        (<@ fun (mailbox : Actor<obj>) ->
            let rec loop() : Cont<obj, obj> = actor {
                let! msg = mailbox.Receive()
                match msg with
                // messages arrive as obj, so we type-test before comparing the literal tags
                | :? (int * string) as t when fst t = Hello -> printfn "hello %A" (snd t)
                | :? int as tag when tag = Hi -> printfn "hi"
                | _ -> mailbox.Unhandled msg
                return! loop()
            }
            loop() @>)
        [SpawnOption.Deploy (Deploy(RemoteScope (Address.Parse remoteAddr)))]

pid <! (Hello, "world")  
pid <! (Hi)  

Hipsterize your backend for The Greater Good with Akka.NET, F# and some DDD flavor

Initially this post was supposed to cover the concepts of implementing an example DDD (Domain Driven Design) architecture using F# and Akka.NET. But the more I wrote, the more I realized that the whole idea wouldn't fit into a single blog post. Therefore I've decided to focus for now on a basic example covering command/event handlers. To make this more interesting, I want to show how Akka.NET and F# features can be used to leverage a design established this way.

Before we begin…

I want to note that all examples in this post are made mostly for learning purposes. There are some things missing, but I've decided to leave them out, since they don't have a direct impact on the overall design or the current state of the application. If I continue this thread, I will probably try to extend them and provide some more meaningful context. So please don't treat this code as a real-life reference DDD implementation.

What is clearly missing in the example below? I didn't describe aggregate roots, because the bounded context is very simplistic - to make this post clearer, I've focused more on the command-handler-event pattern. The sample doesn't enforce any specific database, using a simple dictionary as an abstraction. It also doesn't provide any form of data validation or bindings to existing HTTP server frameworks. These are outside the scope of this post, but are very simple to integrate.

Step 1: Define our domain data structures

Let's start with the domain model. As a bounded context I've chosen one of the most generic ones - the user account space. Our simplified business logic in this example will cover two actions - user registration and sign-in. Users will be identified by their email and will use the most basic security features.

But before we move on, let's use some of the features F# gives us to shape the type system to our advantage.

type Email = string  
type PasswordHash = byte array  

These are examples of type aliases. You may consider using them if you've found that the usage of some primitive type in a given context can be described as a subtype/supertype relation. In this example, Email can be considered a specialization of string - each email is a string, but not every string is a valid email. This kind of granularity can be stretched quite far. Just think about any primary/foreign key field in any entity as its own unique type - this would disallow mistakes such as assigning keys from entities of incompatible types, e.g. assigning a UserId value to a ProductId field! F#'s expressiveness promotes this style. Keep in mind, though, that plain aliases are transparent to the compiler; to make it actually enforce the distinction, you need dedicated types, as sketched below.
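A tiny sketch of that distinction (hypothetical types, not part of this sample's model): aliases compile down to their underlying type, while single-case unions create real, incompatible types.

type UserId = UserId of int
type ProductId = ProductId of int

let userId = UserId 42
// let productId : ProductId = userId  // compile error: UserId is not a ProductId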

Now let's use these types to define our User entity. It's very straightforward:

type User =  
    { email : Email
      phash : PasswordHash
      salt  : byte array }
    static member empty =  { email = ""; phash = [||]; salt = [||] }

Next I’m going to describe data models for our domain commands and events. It’s important to understand purpose of both. Commands are produced to describe tasks to be executed by the system and contains all data necessary to perform it. Events are describing diffs in general application state as a results of performed tasks. What is wort mentioning, events should be described in way, that will allow us to recreate current application state by replaying all events timeline stream. And because events describe changes, that already occurred, they should be immutable by nature – don’t try to change your past if you not wish to mess with your future.

type UserCommand =  
    | RegisterUser of email : Email * password : string
    | LoginUser of email : Email * password : string

NOTE: If you want to use F# records to describe your commands/events, that’s perfectly fine. I’ve chosen discriminated unions for their expressiveness.

type UserEvent =  
    | UserRegistered of email : Email * salt: byte array * phash : byte array * timestamp : DateTime
    | UserLogged of email : Email * timestamp : DateTime

As you can see, there are a few things worth emphasizing:

  • We didn’t store user password provided directly from command, using salt + password hash instead. Reason behind is that in DDD events usually could be serialized and stored in some kind of event database. Therefore it’s not secure to provide them any sensitive data. However events should contain enough information to give us ability to recreate current database state based on provided stream of events, assuming that it’s complete.
  • Each of our event’s contain a timestamp field. Events give us a historical change set of the application. And because they contain only diffs between database states rather than snapshot of entities, they have to be ordered by their creation date. Without it we couldn’t recreate them.

What other fields may be useful when working with events? Consider attaching the current API version number - this will allow you to endure breaking changes in the code responsible for handling events. Another handy field is a unique event id, for when you need functionality associated with generating cascading event chains in response to other events. In that case, having some kind of pointer to the originating event may be useful.
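A hypothetical envelope type showing how such metadata could be attached (illustrative only, not part of this sample's actual model):

open System

type EventEnvelope<'event> =
    { eventId : Guid            // unique id, handy for cascading event chains
      apiVersion : int          // guards against breaking changes in event handling code
      causationId : Guid option // pointer to the originating event, if any
      payload : 'event }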

Step 2: Describe application behavior in a functional manner

This may be a controversial statement, but I must say it - exceptions are bad. They leave leaks in your type system and fuck with your application's logic flow in a way that many programmers and almost all compilers cannot understand.

The alternative concept of error handling is old and well-known. It's based on returning error codes. To start with - if you already know Rust, this may look familiar to you:

type Result<'ok, 'err> =  
    | Ok of 'ok
    | Error of 'err

You may ask: why the hell do we return error values instead of just throwing exceptions? Is this some archaeological excavation - have we gone back to the C era? Actually, no. There is a very good reason behind this.

Think about exceptions as side effects of the application's state behavior. You cannot be 100% sure about the flow of your application process. Since information about exceptions is not part of any function/method signature, they actually act as dynamically typed, cascading return values anyway - both the compiler and programmers will ignore them until they occur at runtime.

This time, instead of the cryptic error codes returned by C, we use F#'s power to provide a type-safe and expressive way to do this. If you're interested in more interesting ways to utilize return-based error handling with higher-level concepts, I recommend the Railway Oriented Programming blog post and the video presented at NDC Oslo 2014. Here, I'll use a greatly simplified version, exposing an operator which chains two input functions together, calling the second one only if the result of the first was Ok, and forwarding the Error result otherwise:

let (>=>) f1 f2 arg =  
    match f1 arg with
    | Ok data -> f2 data
    | Error e -> Error e
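A small usage sketch of the operator, with hypothetical parsePositive and halve functions sharing the same string error type:

let parsePositive (s : string) =
    match System.Int32.TryParse s with
    | (true, n) when n > 0 -> Ok n
    | _ -> Error "not a positive number"

let halve n =
    if n % 2 = 0 then Ok (n / 2)
    else Error "odd number"

let parseAndHalve = parsePositive >=> halve
// parseAndHalve "10" = Ok 5
// parseAndHalve "7"  = Error "odd number"
// parseAndHalve "-3" = Error "not a positive number"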

I’ve also decided to use more discriminated unions to describe each specific business logic error in actual bounded context scope. I’ve found that returning a string-based messages back to programmers in error logs are not better in any way than creating a specialized error type for each case. The only disadvantage is amount of code you need to write, to declare new type, but again – it’s not a problem with F#.

type UserError =  
    | UserAlreadyExists of userEmail : Email
    | UserNotFound of userEmail : Email
    | WrongPassword of userEmail : Email * hashedInput : PasswordHash

Step 3: Handle user registration

How does F# command handling differ from its more object-oriented counterpart? Since we're dealing with a functionally oriented language, most of our logic is composed of functions instead of objects. You're used to executing your logic through services? Here you've got a function. Want some Dependency Injection? Here, have some currying.

I’ll describe our service as handle function which takes some UserCommand as an input and produces Result<UserEvent, UserError> as output:

let handle clock findUser saveUser =  
    function
    | RegisterUser(email, password) ->
        match findUser email with
        | Some _ -> Error(UserAlreadyExists email)
        | None ->
            let salt, phash = defaultGeneratePassword password
            let user =
                { email = email
                  salt = salt
                  phash = phash }
            saveUser user
            Ok (UserRegistered(email, salt, phash, clock()))

As you’ve seen, this function actually takes 4 parameters (clock, findUser, saveUser and command as implicit argument of calling function keyword). But as I’ve said, our handle function should take only one input argument. That’s because we’ll provide three first arguments through partial function application later – now you can think about them as constructor arguments injected by your IoC container when resolving new service.

While most of the parameters are quite understandable, one of them may be confusing - the clock parameter. Its only functionality is to provide the current DateTime value. If so, why didn't I just use DateTime.Now directly inside the code? In this example it's actually not that important, but I've used it to expose a simple problem - things such as date/configuration/environment providers or random number generators make our handler's behavior non-deterministic. That means if we call this function two times with the same input, we can't count on getting the same output. This is a problem for logic predictability, and in some cases it's especially cumbersome when writing tests to verify application behavior. Therefore I think it's better to separate them out as injectable arguments.
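A hypothetical test sketch showing the payoff: with a fixed clock and in-memory stubs, the handler becomes deterministic and trivially verifiable (this snippet assumes the hashing helpers defined in the next section):

let fixedClock () = DateTime(2015, 1, 1)
let noUser _ = None   // findUser stub: pretend no user exists yet
let skipSave _ = ()   // saveUser stub: discard the persisted user

match handle fixedClock noUser skipSave (RegisterUser("j.doe@testmail.co", "pass")) with
| Ok (UserRegistered(email, _, _, ts)) -> printfn "registered %s at %O" email ts // ts = 2015-01-01
| other -> printfn "unexpected result: %A" other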

Password hashing and generation

While you may find good libraries providing real-life hashing features out of the box, I've decided to keep my example free of unnecessary dependencies and use the standard library instead.

let private saltWith salt (p : string) =  
    let passBytes = System.Text.Encoding.UTF8.GetBytes p
    Array.append passBytes salt

let sha512 (bytes : byte array) =  
    use sha = System.Security.Cryptography.SHA512.Create()
    sha.ComputeHash bytes

let hashPassword hashFn salt password = hashFn (saltWith salt password)

let generatePassword hashFn saltSize password =  
    use saltGen = System.Security.Cryptography.RandomNumberGenerator.Create()
    let salt = Array.zeroCreate saltSize
    saltGen.GetBytes salt
    (salt, hashPassword hashFn salt password)

let inline defaultGeneratePassword pass = generatePassword sha512 64 pass  
let inline defaultHashPassword salt pass = hashPassword sha512 salt pass  
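A quick sanity check of the helpers above: hashing the same password with the same salt reproduces the stored hash (F# compares byte arrays structurally with =):

let salt, stored = defaultGeneratePassword "secret"
let ok = defaultHashPassword salt "secret" = stored   // true
let bad = defaultHashPassword salt "wrong" = stored   // false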

I hope this code doesn’t need any explanations.

Step 4: Handle user login

Now that we've described the nuances of function-based services, this one shouldn't be a problem. This case was created to give our example more diversity.

let handle clock findUser saveUser =  
    function
    | RegisterUser(email, password) -> ...
    | LoginUser(email, password) ->
        match findUser email with
        | None -> Error(UserNotFound email)
        | Some user ->
            let computedPasswordHash = defaultHashPassword (user.salt) password
            if computedPasswordHash = user.phash then
                Ok (UserLogged(user.email, clock()))
            else Error(WrongPassword(user.email, computedPasswordHash))

Mock database access

As I’ve said in the introduction, I won’t use any real-life database provider, mocking it with concurrent dictionary instead. However if you want to use a normal database, nothing stands in your way.

open System.Collections.Concurrent

let userStore = ConcurrentDictionary<string, User>()

let inline flip f x y = f y x  
let findInUserStore email (store : ConcurrentDictionary<string, User>) =  
    match store.TryGetValue email with
    | (false, _) -> None
    | (true, user) -> Some user

let saveInUserStore user (store : ConcurrentDictionary<string, User>) =  
    store.AddOrUpdate(user.email, user, System.Func<_, _, _>(fun key old -> user)) |> ignore

F# is heavily function-oriented, and this also applies to argument order. While in OOP the subject (this) usually precedes the method invocation, in functional languages it's better to pass it as the last argument of the function. The rationale is currying and partial function application, which allow us to define functions with only part of the necessary arguments provided. The more specific the argument, the later it can be applied. Therefore the common convention is to put the more specific parameters at the end of the function's parameter list. This is a very useful feature, especially when combined with pipeline operators.

On the other hand, the flip function may be used to reverse argument order in cases where, for example, we want to partially apply the second argument of a function without providing the first one (because it may not be known yet at that moment). This option will be presented later.

Handle event subscribers

One of the nice things about Akka is that it provides the Publish/Subscribe pattern out of the box. As it's an actor-based framework, the only necessary task is to subscribe the ActorRef of the desired observer directly to the Akka system's event stream.

As a lazy person, I don't want to be forced to explicitly create a new actor each time I want to subscribe some behavior reacting to events produced somewhere else. Therefore I've created a simple subscribe helper, which will handle the subscriber actor creation for me.

// automatic unique concurrent name generator
let mutable subscriptionNr = 0  
let inline getSubNr() = System.Threading.Interlocked.Increment(&subscriptionNr)

let subscribe system (fn : 'a -> unit) =  
    let subId = getSubNr()
    let subscriber = spawn system ("subscriber-" + (string subId)) <| actorOf fn
    system.EventStream.Subscribe(subscriber, typeof<'a>) |> ignore
    subscriber

let publish (bus : Akka.Event.EventStream) evt =  
    bus.Publish evt
    Ok evt

Bind everything together

Now that all the logic has been defined, it's time to show it in some example code. It will initialize a new Akka system, set up console logging, and demonstrate both successful and unsuccessful registration and sign-in of a user.

let inline defaultClock() = DateTime.Now  
// display any errors returned by command handler
let handleErrors = function  
    | Ok _ -> ()
    | Error data -> printf "User error: %A\n" data

let system = System.create "system" <| Akka.Configuration.ConfigurationFactory.Default()  
let userHandler =  
    // inject "dependencies" into handle function
    handle
    <| defaultClock
    <| flip findInUserStore userStore
    <| flip saveInUserStore userStore
let subscriber = subscribe system <| printf "User event: %A\n"  
let userActor = spawn system "users" <| actorOf (userHandler >=> (publish system.EventStream) >> handleErrors)

userActor <! RegisterUser("j.doe@testmail.co", "pass")  
userActor <! RegisterUser("j.doe@testmail.co", "pass")  // UserAlreadyExists error  
System.Threading.Thread.Sleep 100

userActor <! LoginUser("j.doe@testmail.co", "pass")  
userActor <! LoginUser("j.doe@testmail.co", "fails")     // WrongPassword error

System.Threading.Thread.Sleep 100  
system.Shutdown()  

That’s all. Whole example took about 130 lines of F# code to create. I hope this would give you some insights about embedding your business logic into F# function composition model.

PS: If you’re interested in more details about DDD and F# programming, at the present moment I can recommend you following other exploring lecture of the other developers, such as Lev Gorodinski (@eulerfx) and his blog for more advanced concepts.

Actor supervisors in Akka.NET FSharp API

This post describes changes that will affect the F# API for Akka.NET in upcoming versions of the framework (>= 0.62) - mostly those concerning supervision and manipulating actor hierarchies.

I don’t want to overload this post with explaining what Akka supervision is all about, so I’ll just recommend a very good article, which could be found here.

While previous versions of the Akka F# API introduced some nice features, such as function-based actor definitions and creation (the spawn function), there was still no simple way to handle some core framework concepts. Remember that the march towards Akka.NET 1.0 is still in progress and more and more features are being implemented. Therefore its F# API will most likely also be subject to change.

To give a better description of the new features, lets see an example:

let strategy =  
    Strategy.oneForOne (fun e ->
        match e with
        | :? ArithmeticException -> Directive.Resume
        | :? ArgumentException -> Directive.Stop
        | _ -> Directive.Escalate)

let supervisor =  
    spawnOpt system "master" (fun mailbox ->
        // by using spawn we may create a child actors without exiting a F# functional API
        // by using spawn we may create child actors without leaving the F# functional API
        let worker = spawn mailbox "worker" workerFun

        let rec loop() =
            actor {
                let! msg = mailbox.Receive()
                match msg with
                | Respond -> worker.Tell(msg, mailbox.Sender())
                | _ -> worker <! msg
                return! loop()
            }
        loop()) [ SupervisorStrategy(strategy) ]

The whole example is available inside the Akka.NET project source code, along with other examples.

Unlike the standard spawn, the spawnOpt function takes a list of options used to configure the spawned actor's internals. SupervisorStrategy is one of them. Using the F# API, you can use the Strategy module to quickly create a corresponding strategy instance. It supports two types of strategy creators:

  1. Strategy.oneForOne - the strategy will affect only the exception thrower.
  2. Strategy.allForOne - the strategy result will propagate to all children of the actor implementing it.

Both functions come in two variants. The first takes only one argument - a decider function, which determines the expected behavior based on the type of the exception thrown. The second (Strategy.oneForOne2 and Strategy.allForOne2) precedes the decider argument with a timeout and a number of possible retries. All of them correspond directly to the standard Akka strategy API, but are wrapped to fit the F# functional approach instead of method overrides.
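As a sketch of the timed variant (the argument order follows the description above; the exact signature may differ between versions, so treat this as an assumption rather than a reference):

let timedStrategy =
    Strategy.oneForOne2
        (System.TimeSpan.FromSeconds 10.)  // time window for counting retries
        3                                  // max number of retries
        (fun e ->
            match e with
            | :? System.ArithmeticException -> Directive.Resume
            | _ -> Directive.Escalate)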