Nov 23 2004

I was just sitting down this morning to do a bit of coding on an as-yet undisclosed project (yep, it’s a secret for the next month or so until a prototype is ready, and then I’ll spill the beans) when I noticed something very interesting: Java 5 locks alone do not provide thread safety. (Edit 11/23/2004 – This post has been proven by the JSR 133 expert group to be incorrect for Java 5. It is still an interesting read, so please continue. All incorrect statements will be edited like this. The new title of this post is therefore Java 5 locks alone DO provide thread safety.) I’ll try to put an article together on this topic, but I wanted to go over it here quickly so that I can at least get the idea out into the community and start some discussion.

The Java memory model is a pretty complex beast and you had better understand it before diving in here (Edit 11/23/2004 JSR 133 is the new JMM adopted by Java 5.0 and above. It covers all aspects of the JMM to make certain that correctly synchronized code can be written. This JSR is required reading for those wanting to know more about multi-threaded Java). The thing about the java.util.concurrent.locks package, and more specifically the ReentrantLock implementation there, is that these are just plain old Java objects and have no synchronization or native code at all (Edit 11/23/2004 Even without synchronization or native code, these classes can provide thread safety because of JSR 133). The JMM states that assigns and stores can be separate and not necessarily in any order. That is, unless there is an intervening unlock operation. Then all assigns must be followed by stores. (Edit 11/23/2004 JSR 133 and the new memory model have removed the concept of assigns and stores because of the confusion that they introduced.)

This is really nice because it allows us developers to not have to think about whether or not a variable is actually going to be set to the value we just assigned to it. We can just slap a synchronized block in or put it on the method and we are 100% guaranteed that our variable will be set to that value in the eyes of all threads. Well, if we use locks, there’s no way for the JVM to enforce this. So, we now have to worry again. Here’s an example:
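As a minimal sketch of the kind of class being discussed (the class name `SynchronizedCounter` and the small `main` demo are assumptions, as is synchronizing `getValue` as well as `inc`; only `inc`, `getValue` and `i` come from the text):

```java
// Sketch only: the class name is hypothetical; inc, getValue and i come from the post.
class SynchronizedCounter {
    private int i = 0;

    // Leaving a synchronized method releases the monitor on this object.
    public synchronized void inc() {
        i++;
    }

    // Assumption: the read is synchronized too, so it acquires the same monitor.
    public synchronized int getValue() {
        return i;
    }

    // Small demo: four threads each call inc 1000 times.
    public static void main(String[] args) throws InterruptedException {
        final SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < 1000; n++) {
                        counter.inc();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(counter.getValue()); // prints 4000
    }
}
```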

Yeah, this code is ideal. It guarantees that no matter who is calling inc, a call to getValue will always return the latest and greatest value. Here it is again with locks:
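A hedged sketch of the lock-based version, using `ReentrantLock` from `java.util.concurrent.locks`. The class name `LockCounter` is hypothetical, and having `getValue` take the same lock as `inc` is an assumption rather than something the text spells out:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: the class name is hypothetical; inc, getValue and i come from the post.
class LockCounter {
    private final Lock lock = new ReentrantLock();
    private int i = 0; // plain field, guarded by lock

    public void inc() {
        lock.lock();
        try {
            i++;
        } finally {
            // Always release in finally so an exception cannot leave the lock held.
            lock.unlock();
        }
    }

    // Assumption: the read takes the same lock as the write.
    public int getValue() {
        lock.lock();
        try {
            return i;
        } finally {
            lock.unlock();
        }
    }
}
```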

This code is essentially the same except for that pesky JMM. We don’t get that nice feature where a thread releasing a monitor must also perform all its stores. Instead, one thread can call inc, get the lock, increment i, and release the lock, and another thread can call getValue and still see the old value. The reason is that the JVM does NOT have to perform a store when the lock is unlocked. Nope, not even a little store. No store at all! Yeah, this sucks. I thought the java.util.concurrent.locks package was supposed to make life easy and make code more readable. Not the case, my friends. We all now have to think really hard about what threads are calling which methods, etc.

(Edit 11/23/2004 The above paragraph is incorrect because it is trying to apply the Java 2 memory model to Java 5. The new memory model dictates that if a happens-before relationship can be identified, then the JVM is not allowed to reorder the writes or reads involved. A lock guarantees a happens-before relationship using volatile variables and the atomic check-and-store (compare-and-set) operation, where a variable’s value is checked and updated in one operation, so the JVM is not allowed to change the value of i without making that write visible in memory. The happens-before condition is that i being incremented happens-before the lock is released. Likewise, the lock being obtained happens-before i is incremented. Therefore, the JVM is not allowed to reorder the write of i and the release of the lock. Likewise, the JVM is not allowed to reorder the locking of the lock and the writing of i. If the lock were not present, the JVM would be allowed to wait until later to perform the increment and write of i.)
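To make the volatile plus check-and-store idea concrete, here is a toy lock built on `AtomicBoolean.compareAndSet`. It is an illustration only and not how `ReentrantLock` is actually implemented; `SpinLock` and `GuardedCounter` are hypothetical names:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy lock for illustration only; not how ReentrantLock is implemented.
class SpinLock {
    // AtomicBoolean reads and writes have volatile semantics.
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        // compareAndSet checks for false and writes true in one atomic step.
        while (!held.compareAndSet(false, true)) {
            Thread.yield(); // hint to the scheduler while the lock is busy
        }
    }

    void unlock() {
        // This volatile write happens-before any later compareAndSet that observes it.
        held.set(false);
    }
}

class GuardedCounter {
    private final SpinLock lock = new SpinLock();
    private int i = 0; // plain field, only touched while the lock is held

    void inc() {
        lock.lock();
        try {
            i++;
        } finally {
            lock.unlock();
        }
    }

    int getValue() {
        lock.lock();
        try {
            return i;
        } finally {
            lock.unlock();
        }
    }
}
```

Because every access to `i` sits between a successful `compareAndSet` and the following `set(false)`, each increment is ordered before the next thread's acquisition of the lock.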

So, how do you fix this? Well, luckily Java provides us with an easy fix. Just make every variable known to man volatile. Why? Because volatile variables guarantee that all assigns are followed by stores. This is excellent! I can just rewrite the code to this:
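A sketch of what that rewrite might look like. The class name `VolatileCounter` is hypothetical, and keeping the lock in `inc` while reading `i` without it in `getValue` is an assumption about the intended shape:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: the class name is hypothetical; inc, getValue and i come from the post.
class VolatileCounter {
    private final Lock lock = new ReentrantLock();
    private volatile int i = 0;

    public void inc() {
        lock.lock();
        try {
            // i++ is a read-modify-write; volatile alone would not make it atomic,
            // so the lock is kept here (an assumption about the intended code).
            i++;
        } finally {
            lock.unlock();
        }
    }

    // Volatile read: sees the most recent write to i without taking the lock.
    public int getValue() {
        return i;
    }
}
```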

(Edit 11/23/2004 This fix is not necessary, but will also work. The semantics for volatile variables became much more strict with JSR 133. A volatile variable is now a form of synchronization: a write to a volatile variable happens-before every subsequent read of that variable. For this reason, and with the addition of atomic check-and-store operations, volatile variables can now be used to build locks.)

Yep, that will do it. But you have to remember to do this every time you think that two threads will need to share data. Not to mention that you have NO control over how HashMap and other data structures are implemented and therefore can’t use those. You have to remember to use only the concurrent collections and primitives. If you have written your own data structures or structs or whatever, you now have to go back and retrofit those so that everything is volatile. Yeah, this totally sucks!

(Edit 11/23/2004 You no longer have to worry about HashMaps and other classes. Using locks guarantees happens-before conditions regardless of the classes being used. For this reason, any data structure or custom class can be used with locks, and as long as all access goes through the lock, the data will be thread safe.)
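For instance, a plain `HashMap` can be shared safely as long as every access goes through the same lock. This is a minimal sketch; the class `LockedScores` and its methods are hypothetical names chosen for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: an ordinary HashMap where every read and write holds the same lock.
class LockedScores {
    private final Lock lock = new ReentrantLock();
    private final Map<String, Integer> scores = new HashMap<String, Integer>();

    // Adds points to a name's running total.
    void add(String name, int points) {
        lock.lock();
        try {
            Integer current = scores.get(name);
            scores.put(name, current == null ? points : current + points);
        } finally {
            lock.unlock();
        }
    }

    // Returns the running total, or 0 if the name has never been seen.
    int get(String name) {
        lock.lock();
        try {
            Integer current = scores.get(name);
            return current == null ? 0 : current;
        } finally {
            lock.unlock();
        }
    }
}
```

The map itself has no internal synchronization; the guarantee comes entirely from never touching it outside the lock.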

(Edit 11/23/2004 I removed the last paragraph completely because it was a message inviting people to prove me wrong. Well, I proved myself wrong by finding JSR 133, reading it, and then talking with the members of the JSR to get everything straight. So, I won’t summarize best practices, but I will say this: forget everything that Java 2 taught you about the JMM. The new world order is JSR 133.)