Versioning and serialization of enums

Versioning keeps coming up everywhere in Java, probably because Java is used to run huge applications and, guess what, some things change and others don’t (or at least not at the same rate). It doesn’t help that Java is just awful at versioning anything.

One thing I’ve come to realize over the last several years is that certain objects should not be serialized. This includes any class that relies on readResolve for object replacement. Most people are thinking, “wait, the new JDK 1.5 enums and all those cool type safe enums we’ve been using for ages fall into that category,” and they’d be correct. There are a number of issues with using readResolve in serialization, but the biggest is that you end up returning null or throwing an exception exactly when clients are expecting a result.

When you use the type safe enum pattern, a serialized value can come across the wire that your local version of the enum class doesn’t have. When that happens you have two choices:

  • Throw an exception
  • Return null from readResolve

Neither of these cases is ideal. You could also create a new instance of the enum class and add it to the local storage mechanism (usually a Map). However, when the enum contains multiple member variables that are usually statically initialized, you’re screwed.
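To make the two choices concrete, here is a minimal sketch of a type safe enum with a readResolve method. The Color class, its values, and the Wire helper are hypothetical names invented for this illustration. The sketch takes the first choice and throws; the comment marks where the return-null variant would go.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical type safe enum (pre-JDK 1.5 style); the names are illustrative only.
final class Color implements Serializable {
    private static final long serialVersionUID = 1L;

    // Local registry of known values; it must be initialized before the constants below.
    private static final Map<String, Color> VALUES = new HashMap<String, Color>();

    static final Color RED = new Color("RED");
    static final Color GREEN = new Color("GREEN");

    private final String name;

    private Color(String name) {
        this.name = name;
        VALUES.put(name, this);
    }

    String name() {
        return name;
    }

    // Serialization calls this after reading the fields, so the freshly
    // deserialized copy can be swapped for the canonical local instance.
    private Object readResolve() throws ObjectStreamException {
        Color local = VALUES.get(name);
        if (local == null) {
            // Choice 1: throw. Choice 2 would be "return null;" here instead.
            throw new InvalidObjectException("Unknown Color: " + name);
        }
        return local;
    }
}

// Small helper that pushes an object through Java serialization in memory.
final class Wire {
    static byte[] write(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object read(byte[] data) {
        try {
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class TypeSafeEnumDemo {
    public static void main(String[] args) {
        // A value both sides know about resolves to the single local instance.
        System.out.println(Wire.read(Wire.write(Color.RED)) == Color.RED);
    }
}
```

A value the receiving side has never heard of has no entry in VALUES, so readResolve has nothing sensible to hand back, which is the whole problem.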

JDK 1.5 takes the first solution and throws this exception:

Exception in thread "main" java.io.InvalidObjectException:
enum constant bar does not exist in class com.inversoft.TestEnum
        at java.io.ObjectInputStream.readEnum(ObjectInputStream.java:1665)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:339)
        at com.inversoft.EnumRead.main(EnumRead.java:8)
Caused by: java.lang.IllegalArgumentException:
No enum const class com.inversoft.TestEnum.bar
        at java.lang.Enum.valueOf(Enum.java:192)
        at java.io.ObjectInputStream.readEnum(ObjectInputStream.java:1663)
        ... 3 more

This and the return-null solution, which will inevitably result in a NullPointerException, are both bad options. But if you’ve made the choice to use enumerations and serialize them, you might hit these cases.

One problem with JDK 1.5 is that all enums are automatically serializable and developers have no choice in the matter. This means that the first novice developer, or even a seasoned developer who hasn’t been bitten by the errors mentioned above, will decide to pass an enum across the wire; you just can’t avoid it unless you dictate that JDK 1.5 enums are off limits. You can still use enumerations, though: write them with the type safe enum pattern and make sure they don’t implement Serializable.
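As a quick check of the "automatically serializable" point, the sketch below declares an enum with no mention of Serializable and still sends it through an ObjectOutputStream. The Level enum and the EnumIsSerializable class are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical enum; note that it never mentions Serializable.
enum Level { LOW, HIGH }

final class EnumIsSerializable {
    // Writes the value to an in-memory stream and reads it straight back.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // java.lang.Enum implements Serializable, so every enum inherits it.
        System.out.println(Serializable.class.isAssignableFrom(Level.class));
        // The constant is written by name and resolved to the local constant on read.
        System.out.println(roundTrip(Level.LOW) == Level.LOW);
    }
}
```

Nothing stops this from compiling or running, which is why the only real defence is a team rule rather than the compiler.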

All of this only occurs when enums change. If you have an enum called Foo that looks like:

public enum Foo {
  ONE, TWO;
}

Then you realize that you also need a THREE:

public enum Foo {
  ONE, TWO, THREE;
}

You update the server and the clients you know about. However, you missed one, and that client calls a service on the server which returns Foo.THREE. The client is going to throw the exception above. Likewise, you probably ended up revving clients that really didn’t care about the new value of Foo, solely because you don’t want runtime exceptions. This is just unnecessary and can be fixed by simply converting Foo to a POJO or, better yet in this case, a String. This way clients never throw exceptions when deserializing Foo.
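Here is one way the String approach might look from the point of view of the client that was missed. FooResponse and its methods are hypothetical names, and the use of Optional assumes a much newer JDK than 1.5; treat it as a sketch of the idea rather than a recommended design.

```java
import java.io.Serializable;
import java.util.Optional;

// The stale client's copy of Foo; the server already knows about THREE.
enum Foo { ONE, TWO }

// Hypothetical response object that carries the value as a plain String,
// so deserialization never depends on the client's copy of Foo.
final class FooResponse implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String foo;

    FooResponse(String foo) {
        this.foo = foo;
    }

    // The raw value is always available, even if this client has never heard of it.
    String rawFoo() {
        return foo;
    }

    // Clients that care can map to their local enum and decide what "unknown" means.
    Optional<Foo> foo() {
        if (foo == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(Foo.valueOf(foo));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }
}

final class FooResponseDemo {
    public static void main(String[] args) {
        FooResponse fromNewServer = new FooResponse("THREE");
        System.out.println(fromNewServer.rawFoo());          // the value survives the trip
        System.out.println(fromNewServer.foo().isPresent()); // unknown locally, but no exception
    }
}
```

The client that doesn’t care about THREE can ignore it or pass it along untouched, and the decision about unknown values moves out of the serialization layer and into code you control.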

2 thoughts on “Versioning and serialization of enums”

  1. Hey Brian,

    A few questions:

    What (if any) languages are actually good at versioning? Is it always something you have to graft into the language?

    Can’t you use the serialVersionUID to determine if an enum has changed (forgive my ignorance, as I haven’t done much with enums)?

    Should it be a requirement that all enums have a UNKNOWN value, or that every caller should check for null (which would mean the same thing)?

    Why would you have Foo be a string? Doesn’t that defeat the whole point of enums (that there are only X valid values allowed)?


  2. What (if any) languages are actually good at versioning? Is it always something you have to graft into the language?

    .Net versions every Class in addition to the serialized version of the Class instance. This gives them a lot of flexibility because they can see what version a Class is and distinguish between two different versions of a Class in the VM. Python does some cool things with Class loading and scoping of Classes. The thing to keep in mind with Java is that a Class instance and the Class object only know the ClassLoader that they came from. They don’t know what version they are, who called them, who they might call, what dependencies they have, how they were deployed, etc. This means that you have to do all versioning using ClassLoader magic and custom code on top of Java.

    Can’t you use the serialVersionUID to determine if an enum has changed (forgive my ignorance, as I haven’t done much with enums)?

    My example actually assumes you have added a serialVersionUID to the type safe enum (non JDK 1.5). If you haven’t, you won’t even have the opportunity to return null or throw an exception from the readResolve method, because the serialization will throw an UnmarshalException instead. Additionally, JDK 1.5 enums don’t get new serialVersionUIDs when you add or remove an enum value. This differs from the type safe enum pattern, where the enum is an ordinary Java class that does get a new serialVersionUID during compilation (if one wasn’t set). Regardless, relying on serialVersionUID just throws a different runtime exception, placing you back in the same tight spot, and with the type safe enum pattern it actually makes life a bit worse.

    Should it be a requirement that all enums have a UNKNOWN value, or that every caller should check for null (which would mean the same thing)?

    This is an option, but one that I try to shy away from. The reason is that if your enum on the server contains values that the client can actually use, then what you’ve actually done is place operational data into a static structure. An UNKNOWN value would have to retain the value from the enum on the server side so that the client could get at that information. I think the main point of this post is really that most enums shouldn’t be enums unless you can 100% guarantee that you’ve thought of all the possible values and they WILL NOT change in the foreseeable future, or ever. Thread state is a good example of a true enum. Countries is an example of a bad enum.

    Why would you have Foo be a string? Doesn’t that defeat the whole point of enums (that there are only X valid values allowed)?

    This is the tough part. You are using an enum to ensure a constrained set, but that set is in flux. I’d much rather make the objects flyweights that are backed by a database or properties file. This can ensure that bad values aren’t used, but allows client and server to be updated to work with new values without restarting the VM and changing code. There are other solutions as well, such as a reloadable resource bundle. But you’d better make sure to update the clients first and the server second 😉
