Versioning

These are, at present, only ideas for how a file format could support versioning.

Forward compatibility

The following sequence could preserve newer data that an older reader / writer can’t handle.

Two version case

The two version case has vOLD and vNEW. Each stored version carries a counter called fresh that indicates whether it is newer than the others. In the two version case, it must be 0 or 1.

Initial write by vNEW

A vNEW writer writes:

  • version: vNEW, data: {o:1, n:1}, fresh: 0

It then calls downcast_vNEW_to_vOLD(vNEW: {o:1, n:1}) to write:

  • version: vOLD, data: {o:1}, fresh: 0

Update by vOLD client

The vOLD reader reads {o:1} but can’t read {o:1, n:1}. It can copy the binary data, though. After updates, the vOLD writer writes:

  • version: vOLD, data: {o:2}, fresh: 1
  • version: vNEW, data: {o:1, n:1}, fresh: 0

Reading by vNEW client

The vNEW reader can read both versions. It returns to the host either:

  • upcast_vOLD_to_vNEW(vOLD: {o:2})
    • The upcast doesn’t know anything about the new value so picks a default.
    • Thus yields {o:2, n:0}
  • upcombine_vOLD_to_vNEW(vOLD: {o:2}, vNEW: {o:1, n:1})
    • This version can read the best value for n from the new data while using the most recent o.
    • Thus yields {o:2, n:1}

Either way, the vNEW reader has a single, current version to work with.
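The two version sequence can be sketched in Python. This is only an illustration under assumptions: the file is modeled as a dict of records rather than binary data, the default for n is taken to be 0, and read_vNEW and its "prefer the fresher record" rule are hypothetical names and choices, not part of the notes above.

```python
DEFAULT_N = 0  # assumed default for the field that vOLD does not know about


def downcast_vNEW_to_vOLD(new):
    # Keep only the field that vOLD understands.
    return {"o": new["o"]}


def upcast_vOLD_to_vNEW(old):
    # Nothing is known about n, so fall back to the default.
    return {"o": old["o"], "n": DEFAULT_N}


def upcombine_vOLD_to_vNEW(old, new):
    # Take the most recent o from vOLD and the best known n from vNEW.
    return {"o": old["o"], "n": new["n"]}


def read_vNEW(stored):
    # Hypothetical reader: combine only when vOLD is fresher than vNEW.
    old, new = stored["vOLD"], stored["vNEW"]
    if old["fresh"] > new["fresh"]:
        return upcombine_vOLD_to_vNEW(old["data"], new["data"])
    return new["data"]


# Initial write by vNEW: both versions are produced, both with fresh 0.
initial = {"o": 1, "n": 1}
stored = {
    "vNEW": {"data": initial, "fresh": 0},
    "vOLD": {"data": downcast_vNEW_to_vOLD(initial), "fresh": 0},
}

# Update by vOLD: it rewrites its own record and copies vNEW untouched.
stored["vOLD"] = {"data": {"o": 2}, "fresh": 1}

result = read_vNEW(stored)
```

Here result combines the updated o with the preserved n, matching the upcombine outcome in the walkthrough.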

Three version case

This would be more compelling reworked as an induction step. In the three version case, we have versions vA, vB, vC, where vC is the newest.

Let’s consider vC, vB, vA, vC.

  • Initial write by vC
    • vA(0): {a:1}, vB(0): {a:1, b:1}, vC(0): {a:1, b:1, c:1}
  • Update by vB
    • vA(1): {a:2}, vB(1): {a:2, b:2}, vC(0): {a:1, b:1, c:1}
  • Update by vA
    • vA(2): {a:3}, vB(1): {a:2, b:2}, vC(0): {a:1, b:1, c:1}
  • Read by vC
    • b = upcombine_vA_to_vB(vA: {a:3}, vB: {a:2, b:2})
    • b = {a:3, b:2}
    • c = upcombine_vB_to_vC(vB: b, vC: {a:1, b:1, c:1})
    • c = {a:3, b:2, c:1}

Let’s consider vC, vA, vB, vC.

  • Initial write by vC
    • vA(0): {a:1}, vB(0): {a:1, b:1}, vC(0): {a:1, b:1, c:1}
  • Update by vA
    • vA(1): {a:2}, vB(0): {a:1, b:1}, vC(0): {a:1, b:1, c:1}
  • Update by vB
    • b = upcombine_vA_to_vB(vA: {a:2}, vB: {a:1, b:1})
    • b = {a:2, b:1}
    • b.a = 3; b.b = 3
    • vA(1): {a:3}, vB(1): {a:3, b:3}, vC(0): {a:1, b:1, c:1}
  • Read by vC
    • c = upcombine_vB_to_vC(vB: {a:3, b:3}, vC: {a:1, b:1, c:1})
    • c = {a:3, b:3, c:1}

Let’s consider vC, vB, vA, vB, vC.

  • Initial write by vC
    • vA(0): {a:1}, vB(0): {a:1, b:1}, vC(0): {a:1, b:1, c:1}
  • Update by vB
    • vA(1): {a:2}, vB(1): {a:2, b:2}, vC(0): {a:1, b:1, c:1}
  • Update by vA
    • vA(2): {a:3}, vB(1): {a:2, b:2}, vC(0): {a:1, b:1, c:1}
  • Update by vB
    • b = upcombine_vA_to_vB(vA: {a:3}, vB: {a:2, b:2})
    • b = {a:3, b:2}
    • b.a = 4; b.b = 4
    • vA(1): {a:4}, vB(1): {a:4, b:4}, vC(0): {a:1, b:1, c:1}
  • Read by vC
    • c = upcombine_vB_to_vC(vB: {a:4, b:4}, vC: {a:1, b:1, c:1})
    • c = {a:4, b:4, c:1}
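The reads in these three sequences follow one pattern, which hints at the induction step: start from the newest version among the freshest records and upcombine upward one version at a time. The sketch below assumes that interpretation of the fresh counter, and it uses a single generic upcombine that merges by field name. Both are assumptions for illustration; real upcombines would be written per version pair, and read_as_newest is a hypothetical name.

```python
def upcombine(fresh_older, stale_newer):
    # Fields known to the older, fresher record win; the remaining
    # fields come from the newer, stale record.
    return {**stale_newer, **fresh_older}


def read_as_newest(records):
    """records: list of (fresh, data) ordered from oldest to newest version."""
    top = max(fresh for fresh, _ in records)
    # Start from the newest version among the records with the top counter.
    start = max(i for i, (fresh, _) in enumerate(records) if fresh == top)
    current = records[start][1]
    # Fold upward through every newer version's stored record.
    for _, stale in records[start + 1:]:
        current = upcombine(current, stale)
    return current


# State before the final read in the sequence vC, vB, vA, vC.
first = read_as_newest([
    (2, {"a": 3}),
    (1, {"a": 2, "b": 2}),
    (0, {"a": 1, "b": 1, "c": 1}),
])

# State before the final read in the sequence vC, vB, vA, vB, vC.
third = read_as_newest([
    (1, {"a": 4}),
    (1, {"a": 4, "b": 4}),
    (0, {"a": 1, "b": 1, "c": 1}),
])
```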

Generalized

Downcasts

An initial write must produce all versions. We ought to have, at a minimum, a downcast from each version n to n-1. This pseudocode would generate the downcast to use for each version:

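# tentofive, tentonine and ninetoeight are downcast functions defined elsewhere.
downcasts = [
  (10, 5, tentofive),
  (10, 9, tentonine),
  (9, 8, ninetoeight),
  # ...
]

def identity(data):
   return data  # Just reuse the parameter.

def expand_downcasts(downcasts):
   # For each target version, keep the downcast from the newest source.
   by_target = {}
   for up, down, func in downcasts:
      cur_up, _ = by_target.get(down, (None, None))
      if cur_up is None or cur_up < up:
         by_target[down] = up, func

   versions = {up for up, _, _ in downcasts}
   versions.update(down for _, down, _ in downcasts)
   versions = iter(sorted(versions, reverse=True))
   newest = next(versions)
   # Yield (target, source, func) triples, newest target first.
   yield newest, newest, identity
   for version in versions:
      up, func = by_target[version]
      yield version, up, func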
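As a rough illustration of how an initial write might consume such a list, the sketch below applies each downcast to the record it derives from, newest first. The PLAN literal, the field names, the two downcast functions, and write_all_versions are all hypothetical. The plan is assumed to be ordered so that every source version appears before its targets.

```python
def ten_to_nine(data):
    # Hypothetical downcast: version 9 does not know field "c".
    return {k: v for k, v in data.items() if k != "c"}


def nine_to_eight(data):
    # Hypothetical downcast: version 8 does not know field "b".
    return {k: v for k, v in data.items() if k != "b"}


# (target, source, downcast) triples, newest first.
# A source of None marks the newest version, which is written as-is.
PLAN = [
    (10, None, None),
    (9, 10, ten_to_nine),
    (8, 9, nine_to_eight),
]


def write_all_versions(newest_data, plan):
    records = {}
    for target, source, func in plan:
        if source is None:
            data = newest_data
        else:
            # The source record was already produced earlier in the plan.
            data = func(records[source]["data"])
        records[target] = {"data": data, "fresh": 0}
    return records


records = write_all_versions({"a": 1, "b": 1, "c": 1}, PLAN)
```

Every record starts with fresh set to 0, as in the initial writes of the walkthroughs above.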

Branching

There might be a case for branching versions, for instance if competing clients offer experimental features or custom functionality.

Opaque values and open types

One technique that might simplify some of this, especially for the consumer, would be for the designer to add strategically placed opaque values. Consider:

type MonochromeWindow := Tuple:
    x: Integer
    y: Integer
    title: String
    future: Opaque[empty~~]

Now, when the new ColorWindow API is released, it can store color information for a given window in future. Opaque behaves the same as Union but it’s treated in the type system as though there are additional unspecified tags.

That’s essentially an open union, and we might have a notion of open tuples.
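A small Python sketch of the consumer side may make the opaque field concrete. It assumes the opaque value can be modeled as uninterpreted bytes and that the newer client chooses its own encoding for the color. The rename helper and the payload shown are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MonochromeWindow:
    x: int
    y: int
    title: str
    future: bytes = b""  # opaque payload; this client never interprets it


def rename(window, title):
    # replace() copies every other field, so the opaque payload survives
    # an update made by a client that does not understand it.
    return replace(window, title=title)


# A record written by a newer client that stored color data in future.
stored = MonochromeWindow(x=0, y=0, title="old", future=b'{"color": "red"}')
updated = rename(stored, "new")
```

The older client changes only the title, and the bytes in future pass through unchanged for the newer client to read later.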