Hibernate Pitfalls part 2

This is the second installment of the Hibernate Pitfalls series. Before we get to it, I’d like to sum up some comments from the previous one. First of all, I did rename the series. Someone suggested that my title was a bit off and I agreed. Second, many folks wrote in to tell me I’m an idiot about delayed SQL, and someone went as far as to tell me that I have no idea about databases. Delaying the execution of SQL really doesn’t affect any part of Hibernate except batching. If Hibernate went to the database after every statement, the only impact would possibly be on performance. It would still be able to lazy fetch and do all the other nice things it does.

Remember that Hibernate is an ORM tool and I’m identifying points of pain that I and many others have felt for a long time with Hibernate. I’m not stating that Hibernate as a tool is completely worthless. I’m just pointing out places where I believe Hibernate is trying to be too smart, or is doing something I feel is cumbersome, annoying, obscure, etc. Now, on to today’s installment.

In this episode we’ll delve into another facet of Hibernate, one that appears to have spilled over into JPA and that can have consequences that are very difficult to handle: implicit updates.

As we have already seen, Hibernate maintains a cache of Objects that have been inserted, updated or deleted. It also maintains a cache of Objects that have been queried from the database. These Objects are referred to as persistent Objects (JPA calls them managed entities) as long as the EntityManager that was used to fetch them is still active. This means that any changes made to these Objects within the bounds of a transaction are automatically persisted when the transaction is committed. These updates are implicit within the boundary of the transaction: you don’t have to call any method to persist the values. Here’s an example to illustrate this. It uses the same table and entity as part 1:

// First unit of work: insert a row with id = 1 and name = "Foo".
EntityManager em = emf.createEntityManager();
EntityTransaction et = em.getTransaction();

et.begin();
Test t = new Test();
t.setName("Foo");
t.setId(1);
em.persist(t);
et.commit();
em.close();

// Second unit of work: load the row and change it in memory only.
em = emf.createEntityManager();
et = em.getTransaction();
et.begin();
t = (Test) em.createQuery("select test from Test test where test.name = 'Foo'").getSingleResult();
t.setName("Bar");
et.commit(); // This implicitly performs an update of test where id = 1
em.close();

// Verify with plain JDBC, bypassing the EntityManager, and close the JDBC resources afterwards.
try (Connection c = ds.getConnection();
     Statement s = c.createStatement();
     ResultSet rs = s.executeQuery("select * from Test where id = 1")) {
    Assert.assertTrue(rs.next());
    Assert.assertEquals("Bar", rs.getString("name"));
}

As you can see, Hibernate keeps a reference to the Test Object we fetched from the database using a JPA query. If we modify that Object through any of its properties within a transaction, all of the modifications are implicitly persisted when the transaction is committed.

Okay, so why is this an issue? The main downside to this is that we don’t really know whether or not modifications made to an object might later be persisted. Some code might modify the Object and not even realize that it is in a transaction. In fact we might call a toolkit or external library, which might modify the Object and we might not even know that the Object was modified. There is no way around this unless you forcibly refresh the Object instance from the EntityManager.
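
To make the mechanism concrete, here is a minimal sketch in plain Java of snapshot-based dirty checking. It is a toy model under my own assumptions, not Hibernate’s implementation: ToyEntity, ToyContext and formatForDisplay are made-up names, and a Map stands in for the Test table. The only point is to show how a change made by code that never asks for an update still reaches the "database" at commit time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy model only: none of these classes are Hibernate or JPA types.
class ToyEntity {
    final int id;
    String name;

    ToyEntity(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class ToyContext {
    private final Map<Integer, String> database;                     // stands in for the Test table
    private final Map<Integer, ToyEntity> managed = new HashMap<>(); // entities handed out by find()
    private final Map<Integer, String> snapshots = new HashMap<>();  // name as it was when loaded

    ToyContext(Map<Integer, String> database) {
        this.database = database;
    }

    // Loads a row, remembers its loaded state, and keeps a reference to the returned object.
    ToyEntity find(int id) {
        ToyEntity entity = new ToyEntity(id, database.get(id));
        managed.put(id, entity);
        snapshots.put(id, entity.name);
        return entity;
    }

    // Compares every managed entity with its snapshot and writes back whatever differs.
    // Returns the number of rows that were updated.
    int commit() {
        int updates = 0;
        for (ToyEntity entity : managed.values()) {
            if (!Objects.equals(entity.name, snapshots.get(entity.id))) {
                database.put(entity.id, entity.name);
                snapshots.put(entity.id, entity.name);
                updates++;
            }
        }
        return updates;
    }
}

class ImplicitUpdateSketch {
    // Hypothetical third-party helper that quietly modifies whatever it is given.
    static void formatForDisplay(ToyEntity entity) {
        entity.name = entity.name.toUpperCase();
    }

    // Returns the name stored in the toy database after the commit.
    static String run() {
        Map<Integer, String> database = new HashMap<>();
        database.put(1, "Foo");

        ToyContext context = new ToyContext(database);
        ToyEntity t = context.find(1);
        formatForDisplay(t); // the caller never asks for an update
        context.commit();    // the change is written anyway
        return database.get(1);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "FOO"
    }
}
```

Nothing in run() looks like a write, yet the stored name changes. In a large code base the call to formatForDisplay could be many layers away from the commit.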

Another downside is that we must manage all of our Objects by hand. Instead of telling the EntityManager to update an Object (which is far more intuitive), we must tell the EntityManager which Objects NOT to update. We do this by calling refresh, which essentially rolls back a single entity, either just before committing the transaction or as soon as we realize the Object shouldn’t be updated. This has its own downside, however, and it is difficult to remedy. If we want to keep the changes we’ve made to the Object so far without persisting them, we have very few options and sometimes none. We might use a copy constructor to copy all of our work and then refresh the persistent Object, but this is error-prone, brittle and a maintenance nightmare.
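
The copy-then-refresh workaround can be sketched in plain Java as follows. This is an illustration under stated assumptions rather than real JPA code: Draft is a made-up entity, the Map stands in for the table, and the static refresh method only imitates what EntityManager.refresh is documented to do (overwrite the in-memory state with the stored state).

```java
import java.util.HashMap;
import java.util.Map;

// Made-up entity for illustration; it is not tied to any persistence framework.
class Draft {
    final int id;
    String name;

    Draft(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // Copy constructor: every field added to the class later must also be copied here.
    Draft(Draft other) {
        this(other.id, other.name);
    }
}

class CopyThenRefreshSketch {
    // Stand-in for EntityManager.refresh: discard in-memory changes in favor of the stored row.
    static void refresh(Draft managed, Map<Integer, String> database) {
        managed.name = database.get(managed.id);
    }

    // Returns { name on the managed instance, name on the working copy }.
    static String[] run() {
        Map<Integer, String> database = new HashMap<>();
        database.put(1, "Foo");

        Draft managed = new Draft(1, database.get(1)); // pretend this came from a query
        managed.name = "Bar";                          // work in progress we want to keep

        Draft workingCopy = new Draft(managed);        // keep the changes on an unmanaged object
        refresh(managed, database);                    // make sure the managed one will not be updated

        return new String[] { managed.name, workingCopy.name };
    }

    public static void main(String[] args) {
        String[] names = run();
        System.out.println(names[0] + " / " + names[1]); // prints "Foo / Bar"
    }
}
```

The brittleness is visible even at this size: the copy constructor has to mirror the entity field by field, and forgetting one silently loses work.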

So, although there is a solution for this issue, it is completely non-intuitive and can cause some applications to be designed in horrible ways just to handle this pitfall. Therefore, implicit updates get a pitfall rating of 7 (out of 10).

20 thoughts on “Hibernate Pitfalls part 2”

  1. “I and many others have felt for a long time with Hibernate.”

    No, it’s just you. Stop using “many others” as an excuse for your poor technical arguments. Apparently you don’t know what session.evict() and session.clear() are doing. Using these together with session.update() you can get your desired “I want to tell Hibernate to update explicitly” behavior. You can also disable automatic dirty checking and updating by disabling the snapshots in Hibernate with session.setReadOnly(o, true) or for all queried objects with query.setReadOnly(true). Your conclusions about refresh() are completely wrong, it is not used for what you think it is used. This stuff just doesn’t work as you think it should work.

    Can you please stop this now and invest some more time in learning the basics?

  2. “No, it’s just you. Stop using ‘many others’ as an excuse for your poor technical arguments. Apparently you don’t know what session.evict() and session.clear() are doing. Using these together with session.update() you can get your desired ‘I want to tell Hibernate to update explicitly’ behavior. You can also disable automatic dirty checking and updating by disabling the snapshots in Hibernate with session.setReadOnly(o, true) or for all queried objects with query.setReadOnly(true). Your conclusions about refresh() are completely wrong, it is not used for what you think it is used. This stuff just doesn’t work as you think it should work.”

    No, there has already been one person who commented that they left Hibernate behind due to maintainability. Just because the opponents aren’t as vocal as I am doesn’t mean they aren’t out there. I know and have had long discussions with many folks who are not Hibernate fans.

    I feel that you really missed the point of the post if you think that evict and clear will fix this issue. Remember that this is JPA and not the Hibernate APIs. Regardless, I didn’t miss the point about refresh; it does exactly what is stated. Let’s look at them quickly:

    refresh (javadoc) – “Refresh the state of the instance from the database, overwriting changes made to the entity, if any.”

    refresh (me) – “which essentially rolls back a single entity.”

    clear (javadoc) – “Clear the persistence context, causing all managed entities to become detached.”

    clear (me) – didn’t even mention it, but this would essentially make all the work I’ve done thus far worthless because everything would be detached and no longer lazy loadable or really usable as persistent entities.

    Although I didn’t mention clear, my use of refresh was accurate.

    Lastly, disabling dirty checking would work, but again this is JPA. Still though, I might pass an Object to a library and not realize I need to set it to read only. What I’m saying is that the approach of least surprise is to programmatically update objects via persist or an update method and NOT via dirty checking. It is usually very difficult to track down if and where objects changed in large code bases (such as a few hundred code bases of a few hundred classes apiece).

    To sum up, I know that many folks love Hibernate and feel that it solves all the world’s problems, but I’m being quite pragmatic and thinking about large and small code bases with large and small teams and the possibility of having junior or mid-level developers working in the code.

  3. Going to be fun what the next thing is you are complaining about 😉

    It seems you just don’t like automatic (or help to do) *state* management. Again, StatelessSession would give you full control, but you declined that previously because it did too little.

    As Christian points out there are ways in Hibernate to control this very explicitly (e.g. .setReadOnly).
    FYI, independent of your “series”, we have been talking for a while about adding a session.setReadOnly(x) flag that you could use to get your “be-more-explicit-and-requires-more-user-code” approach.

    Your “series” seems to forget the downsides of your alternate approaches. Most users I have talked to actually get very happy when they realize they don’t have to keep track of whether a third-party library or code changed their objects; if they had to do that they would have to write a lot more code and most likely that code would be much more inefficient (both in runtime and developer performance).

    And yes, I’ve met people too who were surprised by some behavior, but in general, once it was explained to them why it works that way and what they would miss if it weren’t there, their opinion turned.

    And if that does not help them, Hibernate still in many cases allows you to configure or use it the way you like. It is just not the way it works by default, nor the way we document the most (since it would give a lot of other issues).

  4. I can see your point Brian – but again I’m biased as I wrote Ebean which requires an explicit save() and delete() unlike JPA (and is session less).

    I’d be interested to hear your thoughts on ‘session less’ ORMs such as Ebean.

  5. I don’t like implicit flush either. As far as Christian’s solution goes, which does not work with JPA, I don’t think that using evict() and clear() is really too helpful in any other abstract data layer implemented using Hibernate and JTA.

    I found that forcing FlushMode to MANUAL, since JTASessionContext defaults to flush before transaction completion, and calling flush() on update operations and finally commit() or rollback() is more flexible. Christian may argue that his approach is more efficient, and while I agree, I argue that it is impractical.

    Andre

  6. We ran into this same issue recently. We were surprised to have objects persisted when we never called save on them. Pretty lame “feature” if you ask me. Calling evict or session.clear is not a good solution for this issue.

  7. Brian Pontarelli’s post a year and a half ago just helped massively. We had Hibernate automatically saving objects to the database without being asked – calling session.clear() was the solution. Brian’s post was clearer than all the hibernate documentation!

  8. Remember the Atari game ‘Pitfall’? That’s what it’s like using Hibernate. You have to jump over one pit and then onto the next pit, jump, jump, jump. I am doing all this so that I don’t have to write some SQL? Hardly seems worth it to me. Especially when I still end up writing SQL wrapped in HQL for all the complex stuff anyway. Huh? Am I the only one who thinks this is the worst idea that ever got popular?

  9. I totally agree with Brian and StickyBandit. We are using Hibernate for our project and writing HQL most of the time. I really don’t see any point in using Hibernate. In most cases, I believe, pure JDBC is still the best way to go.

  10. No, Mark, it is not. I personally will never fall back to plain JDBC even in the most trivial cases. And still these implicit updates are lame stuff for sure. I would at least consider making it an optional feature (disabled by default).
    As for Christian’s comment: it is the worst workaround for such an offense. Are POJOs recovered from the persistence context supposed to participate in some business logic? And what if, depending on that logic, you eventually decide to persist them? If you only have a handful of queries, then it is probably OK (very ugly though and, yep, I didn’t manage to find an equivalent in the Criteria API, which is essentially the ‘O’ of ORM), but as soon as you deal with tons of queries within a transaction you just have to call readOnly on a unit of work, rendering it unable to do further persistence stuff.
    Actually I have never seen a beautiful solution for that ‘feature’, just ugly workarounds.

  11. Again this post is simple ignorance of the principles behind the development of Hibernate.

    I agree with Andre Piwoni, all of these pitfalls can be fixed by setting AutoFlushMode = Never

  12. Hello,

    I also think Brian’s post is very helpful. Thank you very much Brian, please keep up the good work.

    Hibernate tries to “solve” some of the problems of the SQL/relational world but on the other hand imposes problems on another level. Hibernate sometimes tries to be too smart. With regard to the article, I don’t necessarily consider FLUSHMODE = AUTO a feature. I had serious problems, for example, when running a series of validations: if one fails, the object is in an indeterminate state, but Hibernate will happily update the database. You then have to introduce logic to prevent this. OK, setting FLUSHMODE to never is easy, but you first have to figure this out and you then introduce some strange “Persistence Layer” dependent logic into your code. So I think the posts Hibernate Pitfalls 1 and 2 are very helpful!

  13. “Automatic dirty checking” is a design pattern that has its own synopsis and therefore fits only certain circumstances. Apparently, it does not fit the Open Session In View pattern: collecting values into an attached entity object, then validating this object and then deciding whether to persist it or not. While the session is alive all the way, you’ll get updates before you are able to validate (even if you eventually decide not to persist).
    Now a word about workarounds. The worst thing a developer can do is to introduce his own workarounds to avoid some undesirable features.

  14. In response to Jens’s statement that
    “OK, setting FLUSHMODE to never is easy, but you first have to figure this out and you then introduce some strange “Persistence Layer” dependent logic into your code”

    What do you mean by a strange persistence layer? You access data via a simple persistence layer (think of something like a Generic DAO) which encapsulates this behavior. How complex is this:

    public void delete( T argObject )
    {
        Session session = getSession();
        session.delete( argObject );
        session.flush();
    }

    When you are done, code upstream does the commit or rollback.
    Is there anything that you see here that requires upstream code to have some persistence-layer-dependent logic?

  15. I just found this lovely little ‘feature’ in Grails. I have to wonder how many millions of dollars have been lost in bugs/man hours working around this little ‘feature’.
