Value absence - why is the Scala/F# approach bad?
Handling absent values is one of the most common problems in programming languages. Why do we consider it so important? Because data processing is an essential task of every program ever made, and values that are not present in the system are one of the main causes of errors and undefined behavior. Most modern programming languages try to arm programmers with a way to handle those cases. Let’s look at the most common approaches:
- Null reference: introduced by Tony Hoare, who later called it his “billion-dollar mistake”. It represents the absence of a heap-allocated value (object) as a pointer that does not refer to a valid instance of the represented type. In most mainstream programming languages all heap objects are nullable by default, whether you want it or not.
- Option/Maybe type: this approach is characteristic of most functional languages. Instead of using a null pointer, it defines a special generic type (e.g. Option) whose value is always one of two cases: Some, a wrapper around an existing underlying value, or None if the value is not present.
So why do we consider a dedicated Option type the better solution? Because it’s explicit, it carries the information about nullability in the type and, most of all, it’s non-default. Most of the time we expect to have meaningful references to our data. Their absence is not desired and certainly should not be the expected behavior. The problem with nullable references is that literally any instance in our code could represent a non-existing value. In functional terms, it is as if every one of our objects were implicitly wrapped in an Option type.
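To make the “explicit and non-default” point concrete, here is a minimal Scala sketch. The `OptionDemo` object, the `findEmail` lookup and its sample data are hypothetical and exist only for illustration.

```scala
object OptionDemo {
  // Hypothetical sample data, used only for this illustration.
  private val emails = Map("alice" -> "alice@example.com")

  // The return type tells the caller that the value may be absent.
  def findEmail(user: String): Option[String] = emails.get(user)

  // The caller has to handle both cases before it can use the String.
  def greeting(user: String): String = findEmail(user) match {
    case Some(email) => s"Sending mail to $email"
    case None        => s"No email on file for $user"
  }
}
```

A function that returns a plain String makes no such promise in its signature, so the caller has no hint that a check might be needed.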
Currently, after decades of living in the shadows, the functional paradigm is gaining more and more ground among mainstream programming languages. This is also reflected in two of the most popular programming environments, the JVM and .NET. Their functional representatives, Scala and F#, claim to combine the best of both programming paradigms. In doing so they also introduced and popularized Option types in the object-oriented world.
But something has been missed along the way, and (in my opinion) both Scala and F# have failed to solve the problem of value absence. Why? Because of their dualism. Since they are functional languages, both let you create Option types with the full support expected from a functional language. But since they are also object-oriented and built on top of object-oriented VMs, they let you use and create nullable references as well. This leads to a false sense of security when using Option types. Let’s look at the following example (F#):
let value: String option = Some (doSmthAndReturnString ())
How can you be sure that the function doSmthAndReturnString won’t return null? Actually you can’t, until you perform an explicit null check. From the runtime perspective Some null is a perfectly valid value. Does it sound rational? No. So why is this even possible?
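The same trap exists in Scala. The sketch below is only an illustration: `legacyLookup` is a hypothetical stand-in for any Java-style API that may return null. It shows that `Some(...)` wraps whatever it is given, while the `Option(...)` factory performs the null check for you.

```scala
object SomeNullDemo {
  // Hypothetical stand-in for a Java-style API that may return null.
  def legacyLookup(key: String): String =
    if (key == "known") "value" else null

  // Some(...) wraps whatever it is given, including null.
  val unsafe: Option[String] = Some(legacyLookup("missing"))

  // Option(...) checks for null and yields None in that case.
  val safe: Option[String] = Option(legacyLookup("missing"))
}
```

Here `unsafe` is a defined Option holding null, while `safe` is None. F# offers a similar escape hatch in `Option.ofObj`, but in both languages the check is opt-in and nothing forces you to use it.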
How to combine the functional and object-oriented worlds?
From my perspective, the best solution to this problem came with the Ceylon language. It’s based on the concept of union types, one of the key features of Ceylon. For the tl;dr people: the union type X|Y is a supertype of both X and Y. How does this relate to null and options?
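As an aside for readers who think in Scala: union types can be tried out there too, assuming Scala 3 (Scala 2 has no equivalent). The `UnionDemo` object and `describe` function below are hypothetical names for a rough sketch of the idea, not a translation of Ceylon semantics.

```scala
// Requires Scala 3: union types are not available in Scala 2.
object UnionDemo {
  // Int | String accepts a value of either type, with no wrapper object.
  def describe(value: Int | String): String = value match {
    case i: Int    => s"number $i"
    case s: String => s"text $s"
  }
}
```

Both `UnionDemo.describe(42)` and `UnionDemo.describe("hi")` type-check, because Int and String are each subtypes of `Int | String`.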
In Ceylon null has its own unique type, Null (just like in Scala). By default all reference types are non-nullable (as in functional languages). However, a reference may hold null if it is declared with the union type T|Null (or T?, using some syntactic sugar). This concept is roughly, though not entirely, equivalent to Scala’s Either[T, Null], but without the verbose requirement of wrapping objects. Moreover, the Ceylon compiler is able to compile this down to standard JVM null references, without any overhead.
// Scala - Either
var a: Either[String, Null] = Left("hello")
var b: Either[String, Null] = Right(null)
var c: Either[String, Null] = Left(null) // still valid
// Scala - Option type
var x: Option[String] = Some("hello")
var y: Option[String] = None
var z: Option[String] = Some(null) // still valid
// Ceylon
String|Null m = "hello"; // no need for value wrapping
String? n = null; // String? -> String|Null
String f = null; // compile time error, references are not nullable by default
As the example above shows, this way we combine the advantages of safe, explicitly nullable types with the brevity and performance of plain null references. In my opinion this is a proper, yet still not widespread, solution to one of the most common problems in programming language design.