Quest for optionally asynchronous APIs
In this blog post I want to share some of the improvements we've made while working on the FSharp.Data.GraphQL library. To start, I'm going to describe our use case, the issues we observed, and how we solved them. If you're not familiar with GraphQL, please get acquainted with it first.
Introduction
GraphQL is an application-level query language that can be integrated with application logic on the server side through a library of your choice. FSharp.Data.GraphQL is one of them.
Most server implementations let you define custom resolution logic for any field of a schema object. We can define one as:
Define.Object("User", [
    Define.Field("firstName", String, fun ctx user -> user.FirstName)
])
However, sometimes we need more advanced logic. Some fields may not refer to a particular entity instance at all, and their values may have to be obtained through an asynchronous call. Therefore we have introduced another creation method, which allows us to define asynchronous operations:
Define.AsyncField("picture", String, fun ctx user -> async { return! getPic user.AvatarId })
GraphQL allows us to define queries that fetch only the necessary fields. Queries are highly dynamic constructs - we never know in advance which fields will be requested. This also means that if our API has both synchronous and asynchronous fields, any combination of them may be expected in the output. This is where the problem begins.
Problem
It turned out that the case above - resolving an arbitrary mix of synchronous and asynchronous values - was quite problematic for the underlying GraphQL execution engine we were working on. Since F# is a statically typed language, we needed a uniform way to work with both synchronously and asynchronously resolved fields.
We started by modeling them all in terms of F# Async computations. However, Async introduces a constant overhead, both in CPU and in memory, and that overhead now applies to every resolved field. The bad part: in practice, field resolvers are synchronous roughly 99% of the time. This means paying a heavy cost by default, even though it is not needed in most cases.
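To make the problem concrete, here is a minimal sketch of the "everything is Async" approach. It is not the library's execution engine: the User record, the resolvers map, the resolve function and the getPic stand-in are all hypothetical names made up for this illustration.

```fsharp
// Hypothetical model: User, resolvers, resolve and getPic are illustrative only.
type User = { FirstName: string; AvatarId: int }

// Stand-in for a real asynchronous call (for example an HTTP request).
let getPic (avatarId: int) : Async<string> =
    async {
        do! Async.Sleep 10
        return sprintf "pic-%d.png" avatarId
    }

// Every resolver must have the same type, so synchronous ones get wrapped too.
let resolvers : Map<string, User -> Async<string>> =
    Map.ofList [
        "firstName", (fun user -> async.Return(user.FirstName)) // synchronous, but lifted
        "picture", (fun user -> getPic user.AvatarId)           // genuinely asynchronous
    ]

// The requested fields are only known at runtime, so each one goes through Async.
let resolve (fields: string list) (user: User) : (string * string) list =
    fields
    |> List.map (fun name ->
        async {
            let! value = resolvers.[name] user
            return name, value
        })
    |> Async.Parallel
    |> Async.RunSynchronously
    |> List.ofArray

let result = resolve [ "firstName"; "picture" ] { FirstName = "Ann"; AvatarId = 7 }
```

Note that the firstName resolver does no asynchronous work at all, yet it still has to allocate and run an Async just to fit the common signature.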
If you think the C# Task Parallel Library frees you from this, it doesn't. As I said, when the combination of fields requested by a query is dynamic and runtime-dependent, the compiler cannot determine at compile time which async methods it could optimize.
Solution
We needed another kind of abstraction - something that would allow us to work with Async computations, but would also respect the mostly synchronous nature of our problem.
If you're familiar with the list of changes planned for future versions of C#, you may have noticed an idea called ValueTask. In short, it's a lightweight value type that is meant to work with the async/await API and lets you treat immediately available values and TPL Tasks in the same way. Exactly what we needed here.
However, ValueTask still belongs to the future. Besides, we're building an F# library, and we needed something that would feel natural to F# developers, since F# has its own Async primitive.
This is why we created our own AsyncVal type. It behaves similarly to Async, but it can take an optimized path for synchronously resolved values. To make it easier to work with, we've also created an asyncVal { ... } computation expression and interop methods for the async { ... } builder. With it we are free to express things such as:
let result = asyncVal {
    let rand = random.Next()
    if rand % 1000 = 1
    then return! async { return! someAsyncOperation () }
    else return rand
} |> AsyncVal.get
... and get both support for asynchronous execution and optimal performance for the happy case.
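For readers curious how such a type can be put together, below is a minimal sketch of the idea. It is not the actual AsyncVal implementation from FSharp.Data.GraphQL: MaybeAsync, its Sync and Deferred cases, and the maybeAsync builder are hypothetical names chosen for this illustration, and error handling is left out entirely.

```fsharp
// A sketch only: the names below are hypothetical and are not the library's API.
[<Struct>]
type MaybeAsync<'T> =
    | Sync of value: 'T                   // already computed, no Async involved
    | Deferred of computation: Async<'T>  // genuinely asynchronous

module MaybeAsync =
    /// Chains a continuation, staying synchronous whenever both sides are.
    let bind (f: 'T -> MaybeAsync<'U>) (x: MaybeAsync<'T>) : MaybeAsync<'U> =
        match x with
        | Sync v -> f v
        | Deferred a ->
            Deferred (async {
                let! v = a
                match f v with
                | Sync u -> return u
                | Deferred b -> return! b
            })

    /// Blocks only when the value is actually asynchronous.
    let get (x: MaybeAsync<'T>) : 'T =
        match x with
        | Sync v -> v
        | Deferred a -> Async.RunSynchronously a

type MaybeAsyncBuilder() =
    member _.Return(v: 'T) : MaybeAsync<'T> = Sync v
    member _.ReturnFrom(x: MaybeAsync<'T>) : MaybeAsync<'T> = x
    // Overloads accepting Async<'T> give interop with async { ... } blocks.
    member _.ReturnFrom(a: Async<'T>) : MaybeAsync<'T> = Deferred a
    member _.Bind(x: MaybeAsync<'T>, f: 'T -> MaybeAsync<'U>) : MaybeAsync<'U> =
        MaybeAsync.bind f x
    member _.Bind(a: Async<'T>, f: 'T -> MaybeAsync<'U>) : MaybeAsync<'U> =
        MaybeAsync.bind f (Deferred a)

let maybeAsync = MaybeAsyncBuilder()

// Synchronous path: the result is a plain value wrapped in a struct.
let syncPath = maybeAsync { return 21 * 2 }

// Asynchronous path: the result wraps a real Async computation.
let asyncPath = maybeAsync { return! async { return 42 } }
```

The design choice worth noticing is that bind never creates an Async when its input is already a value, which is what keeps the common, synchronous case cheap.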
How fast is it? While this implementation is still in flux, we've made some initial benchmarks (remember: benchmarks lie, and the actual performance gain in our case was less impressive), comparing AsyncVal against async.Return. It turned out to be almost 3000 times faster, with no heap allocations (it's a struct type, and introduces only 4 bytes of overhead on 32-bit and 8 bytes on 64-bit machines). For truly async computations, it adds minimal overhead over the existing Async implementation. You can see actual benchmarks in the following PR.
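If you want a rough feeling for this kind of difference on your own machine, a crude Stopwatch loop like the one below is enough to get started. It is not the benchmark from the PR and it does not use the library's AsyncVal: MaybeAsync, get and time are hypothetical helpers written for this sketch, and a proper benchmarking tool will give far more trustworthy numbers.

```fsharp
open System.Diagnostics

// Hypothetical stand-in for a value-or-async struct; not the library's AsyncVal.
[<Struct>]
type MaybeAsync<'T> =
    | Sync of value: 'T
    | Deferred of computation: Async<'T>

let get (x: MaybeAsync<'T>) : 'T =
    match x with
    | Sync v -> v
    | Deferred a -> Async.RunSynchronously a

// Runs an action many times; returns elapsed milliseconds and the sum of results.
let time (iterations: int) (action: int -> int) : int64 * int64 =
    let sw = Stopwatch.StartNew()
    let mutable sum = 0L
    for i in 1 .. iterations do
        sum <- sum + int64 (action i)
    sw.Stop()
    sw.ElapsedMilliseconds, sum

let iterations = 100000

// Every value goes through a full Async computation.
let asyncMs, asyncSum = time iterations (fun i -> async.Return i |> Async.RunSynchronously)

// Every value takes the synchronous path of the struct.
let valueMs, valueSum = time iterations (fun i -> Sync i |> get)

printfn "async.Return: %d ms, value path: %d ms" asyncMs valueMs
```

Both loops compute the same sum, so the only thing that differs is how each value is delivered.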
This allowed us to optimize the most common cases without losing the ability to work with higher-level abstractions.
Summary
Right now AsyncVal is part of the FSharp.Data.GraphQL library and will probably require some more polishing. It was created to solve our specific problems and is not ready for general-purpose use: for example, error management isn't exactly uniform at the moment.
However, this PoC has already proven its usefulness and may be used as a foundation for something greater. Maybe in the future it will deserve its own package.