Type Safety

CSC-430

Phillip Wright

Type Safety

In a statically typed language like Java, we can significantly reduce the number of bugs in our code by leveraging types to model the problems we are solving.

The more accurately we model our problem, the fewer invalid states will even compile.

Example

Let’s consider a method that processes a learner based on their current class:

public boolean doSomething(int userId, String studentClass){
 //...
}

Example II

For this example, only four values really make sense, conceptually, but the String type permits a significantly larger set of values.

Almost every possible input will be invalid!

With sufficient unit tests, we might be able to make it safe…

Stringly Typed

When a String is used to represent data that has a more specific, constrained set of values, it’s often referred to as being “stringly typed”.

This is generally considered an anti-pattern because it defers type-checking to runtime, requires extensive validation, and reduces readability.
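To make that cost concrete, here is a minimal sketch of the runtime validation a stringly typed method has to carry around. The class name `StringlyTypedDemo`, the accepted spellings, the `isValidClass` helper, and the choice to throw `IllegalArgumentException` are all assumptions for illustration.

```java
import java.util.Set;

final class StringlyTypedDemo {
  // Assumed spellings of the four valid classes; nothing in the type system enforces them.
  private static final Set<String> VALID_CLASSES =
      Set.of("FRESHMAN", "SOPHOMORE", "JUNIOR", "SENIOR");

  // Every method that accepts the String has to repeat a check like this one.
  static boolean isValidClass(String studentClass) {
    return studentClass != null && VALID_CLASSES.contains(studentClass);
  }

  static boolean doSomething(int userId, String studentClass) {
    if (!isValidClass(studentClass)) {
      throw new IllegalArgumentException("Unknown class: " + studentClass);
    }
    // ... real processing would go here ...
    return true;
  }
}
```

A call such as `doSomething(1, "Sophmore")` compiles without complaint and only fails when that line actually runs.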

Enums

As an alternative, we could simply create a four-valued type which exactly matches the valid states in our model.

public enum StudentClass {
  FRESHMAN,
  SOPHOMORE,
  JUNIOR,
  SENIOR
}

Enums II

With this type, we can change the method signature to:

public boolean doSomething(int userId, StudentClass studentClass){
  //...
}

Now, we don’t need to test for invalid classes. Actually, we can’t even write such tests, because the invalid code won’t compile!

(ignoring null values…)

Enums III

By using a well-modeled, constrained type, we can also do other things more easily. Imagine code like this:

if(studentClass.equals(FRESHMAN)){
  return freshmanStuff;
}else if(studentClass.equals(JUNIOR)){
  return juniorStuff;
}else{
  return seniorStuff;
}

We accidentally forgot the Sophomores, who now get processed as Seniors!

Enums IV

Using modern Java features, we could instead write:

return switch(studentClass){
  case FRESHMAN -> freshmanStuff;
  case JUNIOR -> juniorStuff;
  case SENIOR -> seniorStuff;
};

This fails at compile time, because the switch expression does not exhaustively cover all of the enum’s values! This forces us to either add logic for Sophomores or to add a default case (which you should generally avoid, because it silences this check!).
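For comparison, a version that covers all four constants compiles without a default branch. This is only a sketch: the wrapper class `ExhaustiveSwitchDemo`, the `describe` method, and its return strings are hypothetical stand-ins for `freshmanStuff` and the other values above.

```java
final class ExhaustiveSwitchDemo {
  enum StudentClass { FRESHMAN, SOPHOMORE, JUNIOR, SENIOR }

  // No default branch: deleting any case below becomes a compile error.
  static String describe(StudentClass studentClass) {
    return switch (studentClass) {
      case FRESHMAN -> "freshman stuff";
      case SOPHOMORE -> "sophomore stuff";
      case JUNIOR -> "junior stuff";
      case SENIOR -> "senior stuff";
    };
  }
}
```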

Enums V

This also protects us in the future! Imagine we get a feature request to add support for GRAD students. We can add this value to the enum, and the compiler will flag every switch expression without a default case that must be updated to handle it!

Testing

There is even support for testing such types in unit tests:

@ParameterizedTest
@EnumSource(StudentClass.class)
void someTest(StudentClass studentClass){
  // ... Set up test data ...

  final var result = doSomething(id, studentClass);

  // ... Test result ...
}

Testing II

If we add an enum value, and have parameterized the unit tests for all code using the enum, then all of our tests will automatically start covering the new value!

Wrappers

Even if some large, primitive type accurately models some data, it’s still probably a good idea to partition the values in your code so you don’t use them in the wrong place.

For instance, instead of passing in an int, we could define:

public class UserId{
  private final int id;
  // ...
}

Wrappers II

So, now, we don’t have to worry about passing in an int that does not represent a user’s id:

public boolean doSomething(UserId id, StudentClass studentClass){
  // ...
}
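On recent Java versions, one compact way to write such a wrapper is a record. The sketch below is an assumption about how `UserId` might be filled in; the `CourseId` type, the positivity check, and the `describe` method are hypothetical and only illustrate that two int-backed wrappers cannot be swapped by accident.

```java
final class WrapperDemo {
  // A record generates the private final field, the accessor value(), equals, and hashCode.
  record UserId(int value) {
    UserId {
      // Assumed rule for illustration: ids are positive.
      if (value <= 0) {
        throw new IllegalArgumentException("User id must be positive: " + value);
      }
    }
  }

  // A second wrapper over the same primitive; hypothetical, to show the types stay distinct.
  record CourseId(int value) {}

  static String describe(UserId userId, CourseId courseId) {
    return "user " + userId.value() + " in course " + courseId.value();
  }
}
```

`WrapperDemo.describe(new UserId(7), new CourseId(430))` compiles, while swapping the two arguments is rejected by the compiler, which plain `int` parameters would not catch.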

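Ada Derived Types

Some languages, like Ada, allow you to derive new types from existing ones. The derived types share a representation but are incompatible with each other and with the original type. (Ada’s subtype keyword is something else: a subtype stays compatible with its base type.)

type Celsius is new Float;
type Fahrenheit is new Float;

Ada Derived Types II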

Now, you cannot accidentally assign a Celsius value to a variable expecting Fahrenheit, even though they are both just floats at runtime!

This is much nicer than Java, because you don’t have to unwrap values to access the underlying primitives.

Warning

Many languages offer type “aliases” which allow you to use an alternative name for a type, but they do not create a new type and can be used interchangeably with the original type!

Limitations

There are some obvious limitations, though:

  1. At some point we have raw input that could be invalid, but we only have to validate once at the edge of our code.

  2. Java doesn’t care if you just sprinkle null values around, so many of the protections assume you have not done so irresponsibly.
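Both points can be handled at the boundary. The following sketch assumes raw input arrives as a String; the `parse` helper, the trimming and upper-casing rules, and the use of `Optional` are illustrative choices rather than anything the original code prescribes.

```java
import java.util.Locale;
import java.util.Objects;
import java.util.Optional;

final class EdgeValidationDemo {
  enum StudentClass { FRESHMAN, SOPHOMORE, JUNIOR, SENIOR }

  // Convert raw input exactly once; everything past this point works with the enum.
  static Optional<StudentClass> parse(String raw) {
    if (raw == null) {
      return Optional.empty();
    }
    try {
      return Optional.of(StudentClass.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
    } catch (IllegalArgumentException e) {
      // valueOf throws when no constant has that name.
      return Optional.empty();
    }
  }

  // Internal code can reject null early instead of checking for it everywhere.
  static boolean doSomething(int userId, StudentClass studentClass) {
    Objects.requireNonNull(studentClass, "studentClass");
    // ... real processing would go here ...
    return true;
  }
}
```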

Example III

Often, people will implement an expression tree using a single node class like this:

public class Expression{
  private int value;
  private Operation op;
  private Expression left;
  private Expression right;
}

Nothing in this type says which fields are meaningful for a given node, so it takes a lot of runtime safety checks to make sure that the code is used correctly.
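Here is a sketch of what those checks can look like. How the single node class tells a constant from an operation is not specified above; the convention used here (a null `op` means the node is a constant), the two factory methods, and the `compute` method are assumptions for illustration.

```java
final class SingleNodeDemo {
  enum Operation { ADD, MULT, SUB }

  static final class Expression {
    private int value;
    private Operation op;      // assumed convention: null means "this node is a constant"
    private Expression left;
    private Expression right;

    static Expression constant(int value) {
      Expression e = new Expression();
      e.value = value;
      return e;
    }

    static Expression binary(Operation op, Expression left, Expression right) {
      Expression e = new Expression();
      e.op = op;
      e.left = left;
      e.right = right;
      return e;
    }
  }

  static int compute(Expression e) {
    if (e.op == null) {
      // Nothing stops a "constant" from also carrying children, so we have to check.
      if (e.left != null || e.right != null) {
        throw new IllegalStateException("Constant node must not have children");
      }
      return e.value;
    }
    if (e.left == null || e.right == null) {
      throw new IllegalStateException("Operation node needs two children");
    }
    int l = compute(e.left);
    int r = compute(e.right);
    return switch (e.op) {
      case ADD -> l + r;
      case MULT -> l * r;
      case SUB -> l - r;
    };
  }
}
```

Every one of these checks runs at runtime, and each new method that walks the tree has to repeat them.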

Better Expressions

public interface Expression {}

public class Constant implements Expression {
  private final int value;
  // ...
}

public class BinOp implements Expression {
  private final Operation op;
  private final Expression left;
  private final Expression right;
  // ...
}

Better Expressions II

Now, we don’t need to check if there are children or a value, etc. This is encoded in the type itself.

We can also leverage the types to process the tree:

int compute(Expression e){
  return switch(e){
    case Constant c -> c.value();
    case BinOp bo -> doOp(
      compute(bo.left()),
      compute(bo.right()),
      bo.op());
    default -> throw new IllegalArgumentException("Unknown Type");
  };
}

Exhaustiveness

Because Expression is a standard interface, the compiler cannot guarantee that Constant and BinOp are the only implementations.

Therefore, the switch is considered not exhaustive, and we are forced to handle the default case (or potential unknown types) at runtime.

Sealed Interfaces

We can fix this by sealing the interface:

public sealed interface Expression
    permits Constant, BinOp {}

Now the compiler knows the hierarchy is closed. It can prove exhaustiveness at compile time, meaning we can safely remove the default case! (Each permitted class must itself be declared final, sealed, or non-sealed.)
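Putting the pieces together, here is a minimal self-contained sketch. It assumes Java 21 or later for pattern matching in switch, and it uses records for brevity where the slides use classes (records are implicitly final, which fits a sealed hierarchy). The wrapper class name `SealedDemo` is hypothetical.

```java
final class SealedDemo {
  enum Operation { ADD, MULT, SUB }

  sealed interface Expression permits Constant, BinOp {}

  record Constant(int value) implements Expression {}

  record BinOp(Operation op, Expression left, Expression right) implements Expression {}

  static int doOp(int left, int right, Operation op) {
    return switch (op) {
      case ADD -> left + right;
      case MULT -> left * right;
      case SUB -> left - right;
    };
  }

  // No default branch: the compiler knows Constant and BinOp are the only cases.
  static int compute(Expression e) {
    return switch (e) {
      case Constant c -> c.value();
      case BinOp bo -> doOp(compute(bo.left()), compute(bo.right()), bo.op());
    };
  }
}
```

If a third implementation is later added to the `permits` list, `compute` stops compiling until it handles the new case.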

Better Expressions III

public int doOp(int left, int right, Operation op){
  return switch(op){
    case ADD -> left + right;
    case MULT -> left * right;
    case SUB -> left - right;
  };
}

Even Better Expressions

public class Constant implements Expression{
  // ...
}

public interface BinOp extends Expression{}

public class Mult implements BinOp {
  // ...
}

public class Add implements BinOp {
  // ...
}
// ...

Even Better Expressions II

return switch(e){
  case Constant c -> c.value();
  case Add a -> compute(a.left()) + compute(a.right());
  case Mult m -> compute(m.left()) * compute(m.right());
  // ...
};

Even Better Expressions III

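return switch(e){
  case Constant c -> c.value();
  case BinOp b -> {
    // Evaluate both children once, then dispatch on the operator's type.
    final var left = compute(b.left());
    final var right = compute(b.right());
    yield switch(b){
      case Add a -> left + right;
      case Mult m -> left * right;
      case Sub s -> left - right; // assumes a Sub class declared alongside Add and Mult
      // ...
    };
  }
};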
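A complete version of this last variant might look as follows. This is a sketch under assumptions: Java 21 or later, records instead of classes, sealed interfaces so the switches are exhaustive, a `Sub` type alongside `Add` and `Mult`, and `left()`/`right()` accessors declared on `BinOp`. None of these details are spelled out in the slides.

```java
final class OperatorTypesDemo {
  sealed interface Expression permits Constant, BinOp {}

  record Constant(int value) implements Expression {}

  // Declaring the accessors here lets the shared branch evaluate both children once.
  sealed interface BinOp extends Expression permits Add, Mult, Sub {
    Expression left();
    Expression right();
  }

  record Add(Expression left, Expression right) implements BinOp {}
  record Mult(Expression left, Expression right) implements BinOp {}
  record Sub(Expression left, Expression right) implements BinOp {}

  static int compute(Expression e) {
    return switch (e) {
      case Constant c -> c.value();
      case BinOp b -> {
        final var left = compute(b.left());
        final var right = compute(b.right());
        // Dispatch on the operator's type; no Operation enum is needed.
        yield switch (b) {
          case Add a -> left + right;
          case Mult m -> left * right;
          case Sub s -> left - right;
        };
      }
    };
  }
}
```

With the operators modeled as types, adding a new operator means adding a record to the `permits` list, and the inner switch fails to compile until it handles that record.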