Hibernate Pitfalls part 3
Friday, April 6th, 2007
In the two previous episodes we looked at implicit updates and delayed SQL and the consequences of these pitfalls.
In this episode we’ll explore the pitfall known as cache fetching.
This pitfall occurs when you have an EntityManager that you perform many operations through, including updating and selecting the same objects many times. We’ve seen that Hibernate keeps a cache of the Objects that have been inserted, updated, or deleted, as well as the Objects that have been queried. This pitfall is just an extension of that caching mechanism. Here’s an example:
EntityManager em = emf.createEntityManager();
EntityTransaction et = em.getTransaction();
et.begin();
Test t = new Test();
t.setName("Foo");
t.setId(1);
em.persist(t);
t.setName("Bar");
Test t2 = (Test) em.createQuery("select test from Test test where test.id=1").
getSingleResult();
Assert.assertSame(t, t2);
Assert.assertEquals("Bar", t2.getName());
et.commit();
em.close();
As you can see here, although we inserted one Test Object and then queried for another, the query matched the Test we had previously added, so we just got that same instance back. In addition, any modifications we made to our Test Object in memory are visible on the Object the query returns.
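If it helps to picture why the query hands back the same instance, here’s a toy model of that cache in plain Java. To be clear, this is not Hibernate’s code: IdentityMapSketch, ToySession, Row, and queryById are names I made up for illustration, and the real persistence context does a lot more. It only sketches the idea that a query result whose id is already cached gets swapped for the cached instance.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: every name in here is made up for illustration, not Hibernate API.
class IdentityMapSketch {
    /** Stand-in for the Test entity used in the example. */
    static class Row {
        final int id;
        String name;

        Row(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** A pretend session: a fake table plus a cache of managed instances keyed by id. */
    static class ToySession {
        final Map<Integer, String> table = new HashMap<>(); // id -> name, the "database"
        final Map<Integer, Row> cache = new HashMap<>();    // id -> managed instance

        void persist(Row row) {
            cache.put(row.id, row);
            table.put(row.id, row.name);
        }

        /** Looks the row up in the table, but prefers an already-managed instance. */
        Row queryById(int id) {
            if (!table.containsKey(id)) {
                return null;
            }
            Row managed = cache.get(id);
            if (managed != null) {
                return managed; // the values just read from the table are discarded
            }
            Row loaded = new Row(id, table.get(id));
            cache.put(id, loaded);
            return loaded;
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        Row t = new Row(1, "Foo");
        session.persist(t);
        t.name = "Bar"; // in-memory change only, the table still says Foo

        Row t2 = session.queryById(1);
        System.out.println(t == t2); // prints true
        System.out.println(t2.name); // prints Bar
    }
}
```

In this toy version the table still holds Foo after the query, which is the "no way to get the original back" problem in miniature.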
So, what’s bad about this? First, there is no way to retrieve the original Object after you have modified it unless you create a new EntityManager and use that for the query. We can’t call the refresh method because that would clobber the call to setName("Bar"). Creating extra EntityManagers can become very expensive because each EntityManager uses a different JDBC connection. You could quickly run out of connections, depending on the application (I’ve seen this happen).
Furthermore, if the database has changed underneath you (possibly via straight JDBC or by another server), you will not see the change. This is by far the trickiest pitfall we have seen in this series and one of the more non-intuitive Hibernate behaviors.
Just so folks know, other ORMs divide on this issue. Some feel that going direct to the DB for queries is best, others feel that queries are heavy and hitting a cache is better. I personally feel that caches should be added when necessary, but not by default. The ORM should do what the name implies, object-relational mapping, and not caching.
Here’s an example of the database changing underneath you:
EntityManager em = emf.createEntityManager();
EntityTransaction et = em.getTransaction();
et.begin();
Test t = new Test();
t.setName("Foo");
t.setId(1);
em.persist(t);
et.commit();
// Put the Object in the cache
Test t2 = (Test) em.createQuery("select test from Test test where test.id=1").
getSingleResult();
// This method uses a separate JDBC connection to update the row
executeViaJDBC("update Test set name = 'Baz' where id = 1");
Test t3 = (Test) em.createQuery("select test from Test test where test.id=1").
getSingleResult();
Assert.assertEquals("Foo", t3.getName()); // This object is Foo
em.refresh(t3);
Assert.assertEquals("Baz", t3.getName()); // Now it is Baz since we reloaded it
em.close();
You can see that if we first add our Test Object to the cache using a query and then modify the row using plain old JDBC, the next time we execute the query we do not get the new values from the database, even though we explicitly asked for them. Meanwhile, if we use MySQL and take a look at the database, the row really has changed:
mysql> select * from Test;
+----+------+
| id | name |
+----+------+
|  1 | Baz  |
+----+------+
This means that we must be exceptionally omniscient and know that something in our database has changed so that we can call refresh to re-fetch from the database. The problem only gets worse as you scale out to, say, 1500 machines or so. If your EntityManager is mostly read-only, you can call refresh after every query and sorta fix the problem, but this really isn’t usable at all. Because Hibernate doesn’t have any good work-arounds for this problem and there really isn’t any method of fixing it, the cache fetching pitfall gets a suckiness rating of 9.5 (out of 10).
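For the curious, here’s roughly what the stale read and the refresh-after-every-query band-aid look like in a toy model. Again, this is plain Java with made-up names (StaleReadSketch, query, queryAndRefresh), not Hibernate’s API or internals; it only assumes a cache-first query and a refresh that re-reads the row.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: every name in here is made up for illustration, not Hibernate API.
class StaleReadSketch {
    /** Stand-in for the Test entity used in the example. */
    static class Row {
        final int id;
        String name;

        Row(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    final Map<Integer, String> table = new HashMap<>(); // id -> name, the "database"
    final Map<Integer, Row> cache = new HashMap<>();    // id -> cached instance

    /** Cache-first query: an instance that is already cached wins over the table. */
    Row query(int id) {
        return cache.computeIfAbsent(id, key -> new Row(key, table.get(key)));
    }

    /** Overwrites the cached instance's state with whatever the table holds now. */
    void refresh(Row row) {
        row.name = table.get(row.id);
    }

    /** The "refresh after every query" workaround: one extra read per result. */
    Row queryAndRefresh(int id) {
        Row row = query(id);
        refresh(row);
        return row;
    }

    public static void main(String[] args) {
        StaleReadSketch session = new StaleReadSketch();
        session.table.put(1, "Foo");
        session.query(1);            // puts the row in the cache
        session.table.put(1, "Baz"); // someone else updates the row behind our back

        System.out.println(session.query(1).name);           // prints Foo (stale)
        System.out.println(session.queryAndRefresh(1).name); // prints Baz
    }
}
```

Note that refresh here also throws away any unsaved in-memory change, which is why this only sorta works for sessions that are mostly read-only.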