Adventure time of semi-dynamic typing
This post is the result of some ideas that came up (not only in my head) after reading yet another article about the inevitable fall of dynamically typed languages. However, I won't be discussing that article, as it's not the main subject here.
One of the core problems raised by static typing enthusiasts is the difficulty of maintenance and refactoring. This is a well-known issue that dynamic languages usually try to solve with unit testing and/or by introducing gradual typing.
But the question I asked myself was: can we create a dynamically typed language that allows us to perform compile-time type analysis without defining types?
I'll try to present my idea by starting from statically typed languages and gradually removing as much type information as possible, until we are left with type-less code.
Type inference
The first big step towards erasing type information is to introduce type inference. Most mainstream statically typed languages provide this feature to some degree (yes, even Java). While there are different approaches, one of the oldest and best known is the Hindley-Milner type system.
For those who are not yet familiar with it, it's a type system popular in the ML family (e.g. Haskell, F#), and it allows us to infer function signatures from how values are used.
let add x y = x + y
add 1 2 // function signature is inferred from the usage
While this looks pretty good, it will still fail on a more advanced example:
// what the hell is person?
let fullName person = person.firstName + " " + person.lastName
Unfortunately, the compiler still needs to know what the type of person is. It has to be defined before this point in the program.
Implicit interfaces
Another nice feature of some statically typed languages is related to interfaces. Interfaces allow us to define a contract, separating the type signature from the actual implementation. Most popular languages, like Java or C#, require this contract dependency to be declared explicitly, which makes the code quite verbose.
However, we could also improve the compiler to resolve this dependency for us. This way we can specify an interface as a function parameter and pass objects of different types without having them implement this interface explicitly, as long as they satisfy it.
This kind of interface is already used in languages such as TypeScript, Go and Pony.
interface Person {
    firstName: string
    lastName: string
}
function fullName(person: Person) {
    return person.firstName + " " + person.lastName;
}
class User {
    constructor(public firstName: string, public lastName: string) { }
}
fullName(new User("Joe", "Doe"));
fullName({ firstName: "Joe", lastName: "Doe" });
Next step - automatically generated interfaces?
Even with those two features, we're still pretty far from being OK, at least from the dynamic typing perspective. We still have to define a type if we want to take advantage of compile-time type checking.
Consider the following dynamic code:
function fullName(person) {
    return person.firstName + " " + person.lastName;
}
fullName({lastName: "Joe"})
You may see that inside this JS-like snippet we haven't actually defined any types. Does that mean we are not able to tell whether this piece of code will execute correctly? It's quite clear that the input parameter is missing the property firstName. How do we know that? We can infer it from the function body.
Based on the function body we know that the person argument should satisfy an interface of type { firstName: 'a, lastName: 'b }, where 'a and 'b are some generic types. This way we already know more than most dynamic languages usually know at compile time. But we still cannot guarantee anything about those generic types. Or could we? One way would be to use the heavy type inference introduced earlier. Based on that, let's apply two more rules to this language:
- The language cannot be weakly typed, in the sense that no implicit casts should be allowed. Their presence would seriously complicate the process of type inference. This also means no implicit toString() invocations.
- To avoid fully generic type analysis, we need to define some basic operations with well-known signatures, just like most programming languages need some primitive types (numbers, strings or booleans) to start building our own types from. A good example could be arithmetic operations such as addition or subtraction (signature (+): 'a -> 'a -> 'a).
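To make these two rules a bit more concrete, here is a minimal sketch of how a checker might treat (+): 'a -> 'a -> 'a. Everything in it (the Ty and Subst types, resolve, unify, typeOfPlus) is a hypothetical name invented for this illustration, and it only handles primitives and type variables, so treat it as a toy rather than a real inference engine.

```typescript
// Hypothetical mini type language: primitives and type variables only.
type Ty =
  | { kind: "prim"; name: "string" | "number" | "boolean" }
  | { kind: "var"; id: number };

// Substitution: type variable id -> the type it has been bound to so far.
type Subst = { [id: number]: Ty | undefined };

// Follow variable bindings until we hit a primitive or an unbound variable.
function resolve(t: Ty, s: Subst): Ty {
  while (t.kind === "var") {
    const bound = s[t.id];
    if (bound === undefined) return t;
    t = bound;
  }
  return t;
}

// Make two types equal or fail. Rule 1: there is no implicit cast to fall back on.
function unify(a: Ty, b: Ty, s: Subst): void {
  const x = resolve(a, s);
  const y = resolve(b, s);
  if (x.kind === "var") {
    if (y.kind === "var" && y.id === x.id) return; // same variable, nothing to do
    s[x.id] = y;
    return;
  }
  if (y.kind === "var") {
    s[y.id] = x;
    return;
  }
  if (x.name !== y.name) {
    throw new Error(`type mismatch: ${x.name} vs ${y.name}`);
  }
}

// Rule 2: (+): 'a -> 'a -> 'a, so both operands and the result share one type.
function typeOfPlus(left: Ty, right: Ty, s: Subst): Ty {
  unify(left, right, s);
  return resolve(left, s);
}

const str: Ty = { kind: "prim", name: "string" };
const num: Ty = { kind: "prim", name: "number" };

// An operand of unknown type added to a string literal must itself be a string.
const subst: Subst = {};
const unknownOperand: Ty = { kind: "var", id: 0 };
const result = typeOfPlus(unknownOperand, str, subst);
```

Here result resolves to the string primitive, while something like typeOfPlus(num, str, {}) throws, because without implicit casts a number and a string can never share the single 'a.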
Given these two rules, we can now tell that person.firstName + " " + person.lastName needs to satisfy the signature of the addition operation, and because there is a string literal between the two operands, we can tell that both firstName and lastName must be strings. We also know that the returned type is a string.
Using those rules we can verify that our function signature looks like this: fullName: {firstName: string, lastName: string} -> string. Having an automatically constructed interface, we can now use the well-known techniques of implicit interface resolution and type inference to verify possible arguments, without providing any actual types ourselves.
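The whole pipeline could be sketched roughly as follows. This is a simplified illustration, not a real compiler pass: the Expr AST, Shape, inferShape and checkCall are all hypothetical names, the body may only contain field accesses on a single parameter, literals and +, and instead of proper unification it leans on the fact that every operand in a chain of + must share one type.

```typescript
type Prim = "string" | "number";
type FieldType = Prim | "unknown"; // "unknown" = still a generic type

// Hypothetical AST for a function body that uses a single parameter.
type Expr =
  | { kind: "field"; name: string }            // person.<name>
  | { kind: "lit"; type: Prim }                // a string or number literal
  | { kind: "add"; left: Expr; right: Expr };  // left + right

type Shape = Record<string, FieldType>;

// Derive the parameter interface and the result type from the body.
function inferShape(body: Expr): { param: Shape; result: FieldType } {
  const fields: string[] = [];
  const literals: Prim[] = [];
  const visit = (e: Expr): void => {
    if (e.kind === "add") {
      visit(e.left);
      visit(e.right);
    } else if (e.kind === "field") {
      fields.push(e.name);
    } else {
      literals.push(e.type);
    }
  };
  visit(body);

  // (+): 'a -> 'a -> 'a and no implicit casts: all literals must agree.
  for (const lit of literals) {
    if (lit !== literals[0]) throw new Error(`type mismatch: ${literals[0]} vs ${lit}`);
  }
  const shared: FieldType = literals.length > 0 ? literals[0] : "unknown";

  const param: Shape = {};
  for (const f of fields) param[f] = shared;
  return { param, result: shared };
}

// Structural check of a call site: the argument's shape against the inferred one.
function checkCall(argShape: Shape, paramShape: Shape): string[] {
  const problems: string[] = [];
  for (const field of Object.keys(paramShape)) {
    const expected = paramShape[field];
    if (!(field in argShape)) {
      problems.push(`missing property '${field}'`);
    } else if (expected !== "unknown" && argShape[field] !== expected) {
      problems.push(`property '${field}' should be ${expected}`);
    }
  }
  return problems;
}

// person.firstName + " " + person.lastName
const fullNameBody: Expr = {
  kind: "add",
  left: {
    kind: "add",
    left: { kind: "field", name: "firstName" },
    right: { kind: "lit", type: "string" },
  },
  right: { kind: "field", name: "lastName" },
};

const signature = inferShape(fullNameBody);
// The call fullName({lastName: "Joe"}) passes an argument of shape { lastName: string }.
const callProblems = checkCall({ lastName: "string" }, signature.param);
```

With this sketch, signature.param comes out as { firstName: "string", lastName: "string" }, and callProblems contains a single entry about the missing firstName property, which is exactly the mistake we spotted by eye earlier.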
Limitations
While the idea seems clear and easy in theory, there are still problems that need consideration. Some of them come from the fact that most dynamically typed languages make heavy use of mutable data types.
Immutability is king here. It's a lot harder to perform compile-time analysis on a program where basically anything can move and change, from any place at any time. With immutability, each field change produces a new object. This way we know that a structure which satisfied some interface in the past will also satisfy it in the future. Of course we could go quite far without immutable types, but they surely help.
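As a small illustration of that point, here is how it might look in TypeScript with readonly fields. The withLastName helper is a hypothetical name made up for this example.

```typescript
interface Person {
  readonly firstName: string;
  readonly lastName: string;
}

function fullName(person: Person): string {
  return person.firstName + " " + person.lastName;
}

// "Changing" a field builds a new object; the original is left untouched,
// so it keeps satisfying every interface it satisfied before.
function withLastName(person: Person, lastName: string): Person {
  return { ...person, lastName };
}

const joe: Person = Object.freeze({ firstName: "Joe", lastName: "Doe" });
const renamed = withLastName(joe, "Smith");

const before = fullName(joe);     // "Joe Doe"
const after = fullName(renamed);  // "Joe Smith"
```

Because joe can never lose or retype a field after it was checked, the analysis done at the point of creation stays valid for the rest of the program.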
Another problem is collections, which in dynamic languages can contain a variety of objects of different types. Here we can take advantage of set theory and apply it to the type system (just like Ceylon does with its union and intersection types). This way we could allow collections to hold values of multiple different types and still keep that type information around.
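TypeScript's union types give a rough feel for how this could work, so I'll use them here as a stand-in for what Ceylon does. The Item type and the describeItem function are hypothetical names for this sketch.

```typescript
// The element type is the union of everything the collection may hold.
type Item = string | number | { firstName: string; lastName: string };

const items: Item[] = ["Joe", 42, { firstName: "Joe", lastName: "Doe" }];

// Each branch narrows the union, so the compiler knows which operations are valid.
function describeItem(item: Item): string {
  if (typeof item === "string") return "string: " + item;
  if (typeof item === "number") return "number: " + item.toFixed(0); // explicit conversion
  return "person: " + item.firstName + " " + item.lastName;
}

const described = items.map(describeItem);
// ["string: Joe", "number: 42", "person: Joe Doe"]
```

In the language imagined in this post, such a union would be inferred from the values put into the collection rather than written down by hand.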