Algebraic Data Types

CSC-430

Phillip Wright

Algebraic Data Types

Algebraic data types (ADTs), often loosely called sum types, provide a type-safe way to model problems concisely.

Sum Types

The term “sum type” comes from the mathematical representation of types. For instance, if we have a type that can be either a 2D or a 3D point, we could represent this as:

\[(\mathbb{Z} \times \mathbb{Z}) + (\mathbb{Z} \times \mathbb{Z} \times \mathbb{Z})\]

Here we can think of $\times$ as “and” and $+$ as “or”.

Sum Types II

In other words, the type we are defining is either a structure holding two ints (an int and an int) or a structure holding three ints (an int and an int and an int).

Each tuple is a “product type”, so the overall point type is a sum of products.
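One way to see why the names fit is to count values. The sketch below swaps in a hypothetical two-valued type $\text{Bool}$ for $\mathbb{Z}$ (an assumption made only for illustration, since $\mathbb{Z}$ is infinite): a product multiplies the number of possible values, and a sum adds them.

\[|\text{Bool} + (\text{Bool} \times \text{Bool})| = 2 + (2 \times 2) = 6\]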

Java

Bringing this from the theoretical world to the real world, we could imagine that such sum types might be implemented using an interface and different implementations.

Optional

For instance, we could implement an Optional type from scratch, without algebraic data types, like this:

public class Optional<T> {
  final T value;

  public Optional(T value){
    this.value = value;
  }

  public T get(){
    // what if there is no value? The type cannot say, so callers may get null
    return value;
  }
}

Optional II

But since this class really represents two distinct cases (a value is present, or it is not), each of them a product, we can make it safer by implementing it as a sum type:

public sealed interface Optional<T> permits Some, None {}

Optional III

public final class Some<T> implements Optional<T>{
  final T value;
  public Some(T value){
    this.value = value;
  }
  public T get(){
    return value;
  }
}

Optional IV

public final class None<T> implements Optional<T>{

}

// factory method, declared inside the Optional interface
public static <U> Optional<U> of(U value){
  return new Some<>(value);
}
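Put together, the class-based version might be used roughly as follows. This is a minimal sketch, assuming Java 17 or later; the wrapper class ClassBasedOptionalDemo and the helper orElse are hypothetical names introduced only so the example fits in one file.

```java
// Sketch: the class-based sum type, nested in one wrapper class.
final class ClassBasedOptionalDemo {
  sealed interface Optional<T> permits Some, None {
    static <U> Optional<U> of(U value) {
      return new Some<>(value);
    }
  }

  static final class Some<T> implements Optional<T> {
    final T value;
    Some(T value) { this.value = value; }
    T get() { return value; }
  }

  static final class None<T> implements Optional<T> {}

  // Hypothetical helper: get() exists only on Some, so the case must be
  // checked before a value can be read.
  static <T> T orElse(Optional<T> optional, T fallback) {
    if (optional instanceof Some<T> some) {
      return some.get();
    }
    return fallback;
  }

  public static void main(String[] args) {
    System.out.println(orElse(Optional.of("hello"), "empty"));  // prints hello
    System.out.println(orElse(new None<String>(), "empty"));    // prints empty
  }
}
```

Because None has no get() method at all, the "no value" mistake from the first version becomes a compile-time error rather than a surprise at runtime.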

Bloat

This approach works, but these classes really only exist to store data and encode some semantics in the types.

Java requires a lot of boilerplate to accomplish this relatively simple task.

Records

To facilitate such simple classes, Java introduced a new kind of type declaration: records (standard since Java 16).

Records allow for very concise definitions of types which primarily just carry data.

Records II

public sealed interface Optional<T> {
  record Some<T>(T value) implements Optional<T>{}
  record None<T>() implements Optional<T>{}
}

final var o = new Optional.Some<>("hello");
final var s = o.value(); // accessor method named after the field
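Beyond accessors, records also generate equals, hashCode, and toString from their components. A small sketch of that behavior, using a hypothetical wrapper class RecordBasicsDemo so that it compiles as a single file:

```java
// Sketch: what a record provides without any hand-written code.
final class RecordBasicsDemo {
  sealed interface Optional<T> {
    record Some<T>(T value) implements Optional<T> {}
    record None<T>() implements Optional<T> {}
  }

  public static void main(String[] args) {
    var a = new Optional.Some<>("hello");
    var b = new Optional.Some<>("hello");
    System.out.println(a.value());    // prints hello
    System.out.println(a.equals(b));  // prints true (compared by component)
    System.out.println(a);            // prints Some[value=hello]
  }
}
```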

Records III

Other members, such as a static factory method, can be added to the interface as well:

public sealed interface Optional<T> {
  record Some<T>(T value) implements Optional<T>{}
  record None<T>() implements Optional<T>{}
  static <U> Optional<U> of(U value) {
    return new Some<>(value);
  }
}

Points Again

public sealed interface Point {
  record Point2D(int x, int y) implements Point {}
  record Point3D(int x, int y, int z) implements Point {}
}

Students

public sealed interface Student {
  UserId id(); // since it's shared!

  record Freshman(UserId id) implements Student {}
  record Sophomore(UserId id) implements Student {}
  record Junior(UserId id) implements Student {}
  record Senior(UserId id) implements Student {}
}
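A sketch of how the shared accessor might be used, assuming Java 21 or later. The slides do not define UserId, so the UserId record here is a hypothetical stand-in, as are the wrapper class StudentDemo and the label method.

```java
// Sketch: a component shared by every case can be read without a switch.
final class StudentDemo {
  // Hypothetical stand-in for the UserId type.
  record UserId(String value) {}

  sealed interface Student {
    UserId id(); // each record's generated accessor implements this

    record Freshman(UserId id) implements Student {}
    record Sophomore(UserId id) implements Student {}
    record Junior(UserId id) implements Student {}
    record Senior(UserId id) implements Student {}
  }

  static String label(Student student) {
    // The case-specific part still needs a switch...
    String year = switch (student) {
      case Student.Freshman f -> "freshman";
      case Student.Sophomore s -> "sophomore";
      case Student.Junior j -> "junior";
      case Student.Senior s -> "senior";
    };
    // ...but id() is available on any Student directly.
    return student.id().value() + " (" + year + ")";
  }
}
```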

Constructors

By default, a record's “constructor” is defined implicitly from the list of components in the record header: it takes one argument per component and binds each value to a field and accessor method of the same name.

Constructors II

But we can have custom constructors as well:

record Some<T>(T value) implements Optional<T>{
  public Some(T value){
    this.value = Objects.requireNonNull(value);
  }
}

Constructors III

For validation, records offer a concise “compact constructor” syntax. The parameter list is omitted, and this.field = field assignments are implicit.

record Some<T>(T value) implements Optional<T>{
  // Compact constructor for validation
  public Some {
    Objects.requireNonNull(value);
  }
}
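The effect of the validation can be checked with a short sketch. The wrapper class CompactConstructorDemo and the helper rejectsNull are hypothetical names, and the record is shown without the Optional interface to keep the example self-contained.

```java
// Sketch: a compact constructor runs before the implicit field assignment.
final class CompactConstructorDemo {
  record Some<T>(T value) {
    Some {
      // Throws NullPointerException, so no Some can ever hold null.
      java.util.Objects.requireNonNull(value, "value");
    }
  }

  // Hypothetical helper: reports whether constructing Some(null) fails.
  static boolean rejectsNull() {
    try {
      new Some<String>(null);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(new Some<>("hello").value()); // prints hello
    System.out.println(rejectsNull());               // prints true
  }
}
```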

Switch

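Since the ADT pattern uses sealed interfaces, the compiler knows every possible case, so a switch expression (with pattern matching, standard since Java 21) can process them elegantly and needs no default branch: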

public static Point2D project(Point point){
  return switch(point){
    case Point2D p -> p;
    case Point3D p -> new Point2D(p.x(), p.y());
  };
}

Pattern Matching

Even better, by using records, we can extract fields using pattern matching:

public static Point2D project(Point point){
  return switch(point){
    case Point2D p -> p;
    case Point3D(var x, var y, var z) -> new Point2D(x, y);
  };
}

This will pull out the nested fields for you and bind them to variables.
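For reference, a self-contained version of the projection that compiles as one file, assuming Java 21 or later. The wrapper class ProjectDemo is a hypothetical name; the nested records are referred to through Point because they are declared inside the interface.

```java
// Sketch: record patterns deconstruct a Point3D directly in the case label.
final class ProjectDemo {
  sealed interface Point {
    record Point2D(int x, int y) implements Point {}
    record Point3D(int x, int y, int z) implements Point {}
  }

  static Point.Point2D project(Point point) {
    return switch (point) {
      case Point.Point2D p -> p;
      // z is bound but unused: the projection simply drops it.
      case Point.Point3D(var x, var y, var z) -> new Point.Point2D(x, y);
    };
  }

  public static void main(String[] args) {
    System.out.println(project(new Point.Point3D(1, 2, 3))); // prints Point2D[x=1, y=2]
  }
}
```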

Expression ADT

public sealed interface Exp{
  record Const(int value) implements Exp{}
  record Mult(Exp left, Exp right) implements Exp{}
  record Div(Exp left, Exp right) implements Exp{}
  record Add(Exp left, Exp right) implements Exp{}
  record Sub(Exp left, Exp right) implements Exp{}
}

Expression ADT II

public static int eval(Exp expression){
  return switch(expression){
    case Const(var i) -> i;
    case Mult(var l, var r) -> eval(l) * eval(r);
    case Div(var l, var r) -> eval(l) / eval(r);
    case Add(var l, var r) -> eval(l) + eval(r);
    case Sub(var l, var r) -> eval(l) - eval(r);
  };
}
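A worked example of the evaluator, as a sketch assuming Java 21 or later. The wrapper class EvalDemo is a hypothetical name; the expression evaluated in main is (2 + 3) * 4.

```java
// Sketch: building an expression tree and evaluating it recursively.
final class EvalDemo {
  sealed interface Exp {
    record Const(int value) implements Exp {}
    record Mult(Exp left, Exp right) implements Exp {}
    record Div(Exp left, Exp right) implements Exp {}
    record Add(Exp left, Exp right) implements Exp {}
    record Sub(Exp left, Exp right) implements Exp {}
  }

  static int eval(Exp expression) {
    return switch (expression) {
      case Exp.Const(var i) -> i;
      case Exp.Mult(var l, var r) -> eval(l) * eval(r);
      case Exp.Div(var l, var r) -> eval(l) / eval(r);
      case Exp.Add(var l, var r) -> eval(l) + eval(r);
      case Exp.Sub(var l, var r) -> eval(l) - eval(r);
    };
  }

  public static void main(String[] args) {
    Exp e = new Exp.Mult(
        new Exp.Add(new Exp.Const(2), new Exp.Const(3)),
        new Exp.Const(4));
    System.out.println(eval(e)); // prints 20
  }
}
```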

When

While we’re on the topic of switches, we can also add conditional checks to a case to further refine it:

public static int eval(Exp expression){
  return switch(expression){
    //...
    case Div(var l, var r) when eval(r) != 0
                         -> eval(l) / eval(r);
    // an unguarded Div case must follow, or the switch is no longer exhaustive
    //...
  };
}
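A guarded case does not count toward exhaustiveness, so the same record needs an unguarded case after it. The sketch below, assuming Java 21 or later, shows one possible arrangement on a trimmed-down Exp with only Const and Div. The wrapper class GuardDemo is a hypothetical name, and throwing ArithmeticException in the fallback is a choice made for this example, not something the slides prescribe.

```java
// Sketch: a guarded case followed by an unguarded fallback for the same record.
final class GuardDemo {
  sealed interface Exp {
    record Const(int value) implements Exp {}
    record Div(Exp left, Exp right) implements Exp {}
  }

  static int eval(Exp expression) {
    return switch (expression) {
      case Exp.Const(var i) -> i;
      // Taken only when the divisor evaluates to a nonzero value.
      case Exp.Div(var l, var r) when eval(r) != 0 -> eval(l) / eval(r);
      // Fallback (an assumption of this sketch): report the zero divisor.
      case Exp.Div d -> throw new ArithmeticException("division by zero: " + d);
    };
  }

  public static void main(String[] args) {
    System.out.println(eval(new Exp.Div(new Exp.Const(6), new Exp.Const(3)))); // prints 2
  }
}
```

Note that the divisor is evaluated twice here, once in the guard and once in the body; for large expressions it may be worth evaluating it once inside a block body instead.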