Aug 17, 2011
 

Okay, after many hours of battling with MySQL and PostgreSQL, I’ve come to the conclusion that database support for dates and times sucks. PostgreSQL appears to do a much better job than MySQL, but overall they both leave a lot to be desired.

The main issue with both of these databases is that they lack the ability to store values as UTC without any timezone information. That’s right, I don’t want the database to know anything about timezones. Once the database starts to store timezone information or, worse, stores dates and times without timezone information but not in UTC, everything melts down if the timezone of the operating system or the database server changes. You also get really nasty issues when you migrate data across locations, for example by exporting data from a server in San Francisco and importing it into a server in London.

If you think about it, databases are primarily used for storing and retrieving data. In the vast majority of applications, the database is NOT responsible for displaying data. For this reason, the database should not know anything about timezones, because these are purely for display purposes. The correct way to store date and time data is in UTC.

So, what’s my solution given the horrible date and time support in databases? Don’t use the date and time column types at all; use bigint instead. Store everything as the number of milliseconds since the epoch in UTC. Never ever do any timezone math with your dates and times. Handle everything in UTC and only do timezone manipulations just before you display the data to a user.
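To make the "UTC everywhere, timezone only at display time" rule concrete, here is a minimal sketch. It uses `java.time` from the standard library rather than Joda-Time so that it runs without extra dependencies, and the class and method names are made up for illustration:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper: the stored value is always epoch milliseconds in UTC.
final class UtcDisplay {
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss xxx", Locale.ROOT);

    // The only place a timezone is applied: right before showing the value to a user.
    static String formatForUser(long epochMillis, String userZoneId) {
        return Instant.ofEpochMilli(epochMillis)
            .atZone(ZoneId.of(userZoneId))
            .format(FORMAT);
    }
}
```

The same stored number renders differently for a user in San Francisco and a user in London, but the stored number itself never changes.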

I pulled this off by changing all of my datetime and timestamp columns to be bigint instead. They now look like this:
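A minimal sketch of such a column definition, assuming a hypothetical `users` table with made-up column names (`insert_instant`, `last_login_instant`). The DDL is built by a small Java helper only to keep the examples in one language; the generated SQL uses plain `BIGINT`, which both MySQL and PostgreSQL accept:

```java
// Hypothetical helper that builds DDL for instants stored as epoch milliseconds (UTC).
final class InstantColumns {
    // One column definition: a BIGINT holding milliseconds since the epoch.
    static String bigintInstantColumn(String name, boolean nullable) {
        return name + " BIGINT" + (nullable ? " NULL" : " NOT NULL");
    }

    // Example table; the table and column names are illustrative only.
    static String createUsersTable() {
        return "CREATE TABLE users (\n"
            + "  id BIGINT NOT NULL PRIMARY KEY,\n"
            + "  " + bigintInstantColumn("insert_instant", false) + ",\n"
            + "  " + bigintInstantColumn("last_login_instant", true) + "\n"
            + ")";
    }
}
```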

Additionally, in my application code when I use JDBC I pull the data out as Longs like this:
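A sketch of what reading such a column through JDBC might look like. The helper name and the idea of returning `null` for SQL NULL are assumptions for illustration; `ResultSet.getLong` and `ResultSet.wasNull` are the standard JDBC calls:

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper for reading a bigint "instant" column as a Long.
final class InstantReader {
    // Returns the epoch milliseconds, or null when the column is SQL NULL.
    static Long readEpochMillis(ResultSet rs, String column) throws SQLException {
        long value = rs.getLong(column); // getLong returns 0 for SQL NULL
        return rs.wasNull() ? null : Long.valueOf(value);
    }
}
```

Since the value is just a number, neither the driver nor the database gets a chance to apply a timezone to it.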

Lastly, I wrote a Hibernate UserType that converts from longs to Joda DateTime instances and annotated my entities like this:
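The following is not the UserType itself, only a sketch of the conversion such a type would need to perform in each direction. To stay dependency-free it uses `java.time.Instant` in place of Joda's DateTime and leaves out the Hibernate interfaces entirely; the class and method names are hypothetical:

```java
import java.time.Instant;

// Hypothetical conversion core for a custom type that maps bigint <-> instant.
final class EpochMillisConverter {
    // Entity value -> column value (milliseconds since the epoch, UTC).
    static Long toColumn(Instant instant) {
        return instant == null ? null : Long.valueOf(instant.toEpochMilli());
    }

    // Column value -> entity value; a NULL column maps to a null instant.
    static Instant fromColumn(Long epochMillis) {
        return epochMillis == null ? null : Instant.ofEpochMilli(epochMillis.longValue());
    }
}
```

Both directions are null-safe and involve no timezone at all, which is the whole point of the approach.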

And that’s it. It works perfectly and I never have to worry about any of my data getting screwed up if someone changes the system timezone or migrates data from one server to another. Furthermore, I finally get full support for milliseconds on MySQL and I also can add support to Clean Speak for new databases easily.

Aug 16, 2011
 

Today I’ve been working with PostgreSQL and MySQL, trying to figure out how they handle date-time values and timezones. This is actually quite tricky, so I wanted to write it down for later. First, both databases have two different types of columns:

  • One that stores the date-time without any time zone information
  • One that stores the date-time with time zone information

For PostgreSQL, these are:

  • timestamp without time zone
  • timestamp with time zone

respectively.

For MySQL, these are:

  • datetime
  • timestamp

There are a number of things to consider when dealing with date-time values:

  • What if the timezone of the server changes?
  • What if the server moves to a new physical location and therefore a new timezone?
  • What if the timezone of the database server changes (as distinct from the timezone of the operating system)?
  • How does the JDBC driver handle timezones?
  • How does the database handle timezones?

To figure all of this out, it is important to understand how the date-time value goes from the application, to the database, out of the database and back to the application (for inserts and later selects). Here is how this works for values without timezone information:

Insert

  1. Java creates a java.sql.Timestamp instance. This is stored as UTC
  2. Java sends this value to the JDBC driver in UTC
  3. The JDBC driver sends the value to the database in UTC
  4. The database converts the date-time from UTC to its current timezone setting
  5. The database inserts the value into the column in the current timezone (NOT UTC)

Select

  1. The database selects the value from the column in the system timezone
  2. The database converts the value to UTC
  3. The database sends the UTC value to the JDBC driver
  4. The JDBC driver creates a new java.sql.Timestamp instance with the UTC value

For this scenario, you will run into major issues if the server or database timezone changes between Insert #5 and Select #1. In this case, the value read back will not be correct, because the stored local time is converted to UTC using a different timezone than the one it was written with. The only way to make things work correctly for this setup is to ensure no timezone settings ever change or to set all timezones to UTC for everything.
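The failure can be simulated without a database. This sketch only models the steps described above (convert to the database's local time on insert, convert back on select) with `java.time`; the class and method names are made up, and the two zones stand in for a server that moves from San Francisco to London:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical model of a column that stores local time without timezone information.
final class ZoneChangeDemo {
    // Insert: the UTC instant is converted to the database's current timezone and stored.
    static LocalDateTime storeWithoutZone(Instant instant, ZoneId dbZone) {
        return LocalDateTime.ofInstant(instant, dbZone);
    }

    // Select: the stored local time is converted back to UTC using the current timezone.
    static Instant readWithoutZone(LocalDateTime stored, ZoneId dbZone) {
        return stored.atZone(dbZone).toInstant();
    }
}
```

Storing 2011-08-17T12:00:00Z with the zone set to `America/Los_Angeles` and reading it back with `Europe/London` yields an instant eight hours earlier than the original, while reading it back with the unchanged zone returns the original value.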

For the types that store timezone information, here is how the data is passed around:

Insert

  1. Java creates a java.sql.Timestamp instance. This is stored as UTC
  2. Java sends this value to the JDBC driver in UTC
  3. The JDBC driver sends the value to the database in UTC
  4. The database calculates the offset between the current timezone and UTC
  5. The database inserts the value as UTC into the column along with the offset it calculated

Select

  1. The database selects the value from the column in UTC along with the offset
  2. The database sends the UTC value to the JDBC driver
  3. The JDBC driver creates a new java.sql.Timestamp instance with the UTC value

Because the JDBC driver only handles UTC values and everything is stored in UTC inside the database, there is no potential for data mangling here. Although the database stores the timezone information along with the date-time value in UTC, it isn’t ever used, because the JDBC driver doesn’t care what the original timezone was and ignores that value.
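The "stored as UTC" part of step 1 can be checked directly on `java.sql.Timestamp`: the object holds milliseconds since the epoch, and a timezone only enters the picture when the value is rendered. A small sketch, with hypothetical helper names:

```java
import java.sql.Timestamp;
import java.time.ZoneId;

// Hypothetical helper showing that a Timestamp is a zone-independent instant.
final class TimestampIsUtc {
    // The underlying value: milliseconds since 1970-01-01T00:00:00Z.
    static long epochMillis(Timestamp ts) {
        return ts.getTime();
    }

    // Rendering is where a timezone is applied; the Timestamp itself is unchanged.
    static String render(Timestamp ts, String zoneId) {
        return ts.toInstant().atZone(ZoneId.of(zoneId)).toLocalDateTime().toString();
    }
}
```

Rendering the same Timestamp for `America/Los_Angeles` and `Europe/London` gives two different wall-clock strings, while `getTime()` returns the same number in both cases.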