Algebraic Datatypes in Java 8
Sum types, also known as tagged unions are a feature in many modern languages. Traditional object-oriented languages from the 90's, having recently aquired lambda functions, still have to catch up in this area. But with this trick you can get pretty close.
In Haskell, defining a tree structure is straightforward:
data Term = Constant Int | Add Term Term | IfZero Term Term Term
This means that a value of type Term
is either Constant
, Add
or IfZero
. Each of these constructors take arguments, such as the Int
argument for Constant
. The constructor and the arguments can be inspected via pattern matching, eg:
eval term = case term of
Constant i -> i
Add e1 e2 -> eval e1 + eval e2
IfZero e1 e2 e3 -> if eval e1 == 0 then eval e2 else eval e3
This means that you can have all the code required to implement eval
in one place, instead of having it spread over three different classes that each implement their little piece.
It also means you can put the datatype in a library, and somebody else can define eval
somewhere else without having to modify your code. This is not possible if eval
has to be defined as part of the Term
class, as is usual in object oriented programming.
In Java, the Visitor pattern is a common workaround for this missing feature. In this post, I'll showcase a very minimal way of implementing it. To continue with the running example, Term
will be a class in Java:
public interface Term {
We have the accept
method which takes a Visitor
:
public abstract <R> R accept(Visitor<R> visitor);
public interface Visitor<R> {
R constant(int value);
R add(Term left, Term right);
R ifZero(Term condition, Term then, Term otherwise);
}
The visitor has a method for each subtype of Term
. So far, so good. Not that much worse than Haskell. However, to be able to actually construct Terms
, we'll need to write a bit of boilerplate:
public static Visitor<Term> term = new Visitor<Term>() {
public Term constant(int value) { return new Term() {
public <R> R accept(Visitor<R> v) { return v.constant(value); }
};}
public Term add(Term left, Term right) { return new Term() {
public <R> R accept(Visitor<R> v) { return v.add(left, right); }
};}
public Term ifZero(Term condition, Term then, Term otherwise) { return new Term() {
public <R> R accept(Visitor<R> visitor) { return visitor.ifZero(condition, then, otherwise); }
};}
};
The static term
field acts as a factory for Terms
. It could have been a bunch of static methods, but implementing the Visitor
interface gives os a guarantee that we have exactly the constructors that the Visitor
interface supports - no more and no less.
The methods are all similar. For example, the constant
method returns a new Term
whose accept method invokes constant
on the provided visitor. The add
method returns a new Term
whose accept method invokes add
, etc. And that's it:
}
Now we can construct a Term
:
import static Term.term;
...
Term myTerm = term.add(term.constant(5), term.constant(7));
And we can implement eval
:
int eval(Term e) {
return e.accept(new Visitor<Integer>() {
public Integer constant(int value) {
return value;
}
public Integer add(Term left, Term right) {
return eval(left) + eval(right);
}
public Integer ifZero(Term condition, Term then, Term otherwise) {
return eval(condition) == 0 ? eval(then) : eval(otherwise);
}
});
}
eval(myTerm) // returns 12
There you have it - safe and flexible, if a little verbose.
If you have comments, catch me on Twitter! I'm @Continuational