pattern: extract mutating variables to State class

January 7, 2019

I’ve come to really like the pattern of extracting local or instance variables into their own State class, and writing an algorithm in terms of that State.

A really trivial (if contrived) example of this would be a “sum cubes” algorithm; something like this:

public class CubeSummer {
  private final PrintStream printStream;
  private long i;
  private long sum;
  
  public CubeSummer() {
    this.printStream = System.out;
    reset();
  }
  
  public long sumCubes(long limit) {
    for (; i<limit; i++) {
      sum += i * i * i;
      display();
    }
    return sum;
  }
  
  public void reset() {
    i = 0;
    sum = 0;
  }
  
  public void display() {
    printStream.print("i = " + i + ";");
    printStream.println(" sum = " + sum);
  }
}

What’s happened here is that we’ve promoted some variables that should arguably be local variables into instance variables, to avoid needing to pass them around explicitly. But in the process, we’ve also mixed long-lived state with short-lived state, broken thread safety, and forced clients of this class into a particular calling convention: either create a new instance per sumCubes call, or carefully respect thread safety and call reset() between calls.

Put a different way: the lifecycle of i and sum is not very clear. Are they scoped to a particular sumCubes operation? Or to many invocations? Or something else entirely? Is this class thread-safe? Who is responsible for resetting the per-request state: the sumCubes method, or its caller? If it is the sumCubes method, is that reset behavior guaranteed even when something terrible (an exception) happens inside the method?
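To make the calling-convention problem concrete, here is a minimal sketch. StatefulCubeSummer and StaleStateDemo are hypothetical names: the first is a stripped-down copy of the class above with the printing removed, so only the state handling is left.

```java
// Hypothetical, output-free variant of the CubeSummer above.
class StatefulCubeSummer {
  private long i;
  private long sum;

  long sumCubes(long limit) {
    // Continues from whatever i and sum were left at by the previous call.
    for (; i < limit; i++) {
      sum += i * i * i;
    }
    return sum;
  }

  void reset() {
    i = 0;
    sum = 0;
  }
}

class StaleStateDemo {
  public static void main(String[] args) {
    StatefulCubeSummer summer = new StatefulCubeSummer();
    long first = summer.sumCubes(3);   // 0 + 1 + 8 = 9
    long second = summer.sumCubes(2);  // loop never runs: i is already 3, so the stale 9 comes back
    System.out.println(first + " " + second);

    summer.reset();
    System.out.println(summer.sumCubes(2)); // 0 + 1 = 1, but only because we remembered to reset
  }
}
```

Nothing in the type signature tells the caller that the second call depends on the first one; that knowledge lives only in the caller's head.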

To make this clearer, we can pull the i and sum variables out into a new state class like:

public static class CubeSumState {
  public long i = 0;
  public long sum = 0;
}

With that definition in mind, our implementation becomes:

public long sumCubes(long limit) {
  CubeSumState state = new CubeSumState();
  for (; state.i < limit; state.i++) {
    state.sum += state.i * state.i * state.i;
    display(state);
  }
  return state.sum;
}

Now the lifecycle of state is very clear: it’s exactly one invocation of sumCubes! By removing the “per-request” state from the CubeSummer class, we’ve also made it thread-safe and safe to use as a singleton.
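As a rough illustration of the singleton claim, the sketch below shares one instance across a small thread pool. SharedCubeSummer, SharedSummerDemo and runConcurrently are hypothetical names, the display call is left out, and a passing run only illustrates the point rather than proving thread safety.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CubeSumState {
  long i = 0;
  long sum = 0;
}

// Hypothetical stand-in for the refactored CubeSummer, without display().
class SharedCubeSummer {
  long sumCubes(long limit) {
    CubeSumState state = new CubeSumState(); // one fresh state per invocation
    for (; state.i < limit; state.i++) {
      state.sum += state.i * state.i * state.i;
    }
    return state.sum;
  }
}

class SharedSummerDemo {
  // Submits the same computation `tasks` times against one shared instance.
  static List<Long> runConcurrently(SharedCubeSummer summer, int tasks, long limit) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      Callable<Long> task = () -> summer.sumCubes(limit);
      List<Future<Long>> futures = new ArrayList<>();
      for (int t = 0; t < tasks; t++) {
        futures.add(pool.submit(task));
      }
      List<Long> results = new ArrayList<>();
      for (Future<Long> future : futures) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("concurrent run failed", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // Every task should report the same sum: 0^3 + 1^3 + ... + 9^3 = 2025.
    System.out.println(runConcurrently(new SharedCubeSummer(), 8, 10));
  }
}
```

Because each call builds its own CubeSumState, the tasks have no mutable fields in common and no reset step to coordinate.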

We haven’t bothered with this here, but if it were useful, we could make the fields private and put validation logic into the setters of our State object, to ensure that we never enter an invalid state.
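A sketch of what that validation could look like, assuming (purely for illustration) that the invariants we care about are "i is never negative" and "sum is never negative". ValidatedCubeSumState and ValidatingCubeSummer are hypothetical names, not part of the code above.

```java
// Hypothetical state class: private fields, with setters that guard the invariants.
class ValidatedCubeSumState {
  private long i = 0;
  private long sum = 0;

  long getI() { return i; }
  long getSum() { return sum; }

  void setI(long i) {
    if (i < 0) {
      throw new IllegalArgumentException("i must be non-negative: " + i);
    }
    this.i = i;
  }

  void setSum(long sum) {
    // Assumed invariant: a sum of cubes of non-negative numbers is never negative.
    if (sum < 0) {
      throw new IllegalArgumentException("sum must be non-negative: " + sum);
    }
    this.sum = sum;
  }
}

class ValidatingCubeSummer {
  long sumCubes(long limit) {
    ValidatedCubeSumState state = new ValidatedCubeSumState();
    for (; state.getI() < limit; state.setI(state.getI() + 1)) {
      long i = state.getI();
      // Every mutation goes through a setter, so every mutation is checked.
      state.setSum(state.getSum() + i * i * i);
    }
    return state.getSum();
  }
}
```

The algorithm gets a little noisier, but any bad write now fails at the moment it happens instead of surfacing later as a wrong answer.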

Also note that we’re not strictly bound to our custom debugging output anymore: we can replace display() with something like:

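import com.google.gson.Gson;

public class CubeSummer {
  // ...
  private final Gson gson = new Gson();

  // Serializes every field of the state as one line of JSON.
  public void display(CubeSumState state) {
    printStream.println(gson.toJson(state));
  }
}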

This genericism comes with an immense benefit: the output is a structured log of all the relevant state, which can be used for subsequent (re)interpretation or as a checkpoint for later resumption. It also automatically “adapts” if we introduce new fields into CubeSumState! This approach has similar benefits for debugging, since all state and mutations are focused within a single object.
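The "checkpoint for later resumption" idea can be sketched roughly as follows. ResumableCubeSummer, ResumeDemo and fromLoggedLine are hypothetical names, and the JSON parsing step is skipped to keep the sketch dependency-free: we assume the logged values of i and sum have already been read back into plain longs.

```java
class CubeSumState {
  long i = 0;
  long sum = 0;
}

// Hypothetical variant that accepts an existing state instead of always creating one.
class ResumableCubeSummer {
  long sumCubes(CubeSumState state, long limit) {
    // Picks up from wherever the given state left off, mutating it in place.
    for (; state.i < limit; state.i++) {
      state.sum += state.i * state.i * state.i;
    }
    return state.sum;
  }
}

class ResumeDemo {
  // Rebuilds a state from one logged line. display(state) runs after the sum is
  // updated but before i is incremented, so the logged i has already been
  // processed and the resumed run must start at the next value.
  static CubeSumState fromLoggedLine(long loggedI, long loggedSum) {
    CubeSumState state = new CubeSumState();
    state.i = loggedI + 1;
    state.sum = loggedSum;
    return state;
  }

  public static void main(String[] args) {
    ResumableCubeSummer summer = new ResumableCubeSummer();
    // Suppose the last logged line said i = 2, sum = 9.
    CubeSumState checkpoint = fromLoggedLine(2, 9);
    System.out.println(summer.sumCubes(checkpoint, 5)); // 9 + 27 + 64 = 100
  }
}
```

The off-by-one in fromLoggedLine is an artifact of where display is called in the loop; logging after the increment instead would let a checkpoint be reused as-is.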

This example is, of course, a bit artificial: a pair of longs wouldn’t normally require a custom object to represent. But the distinction between state that is

long-lived / reused across many invocations
short-lived / used for a single invocation

is an extremely useful one in practice.

edit 2019-01-09: thanks Wouter for recommending some great improvements here wrt naming + correctness; nobody who has worked with him would be surprised at his rigor :-)