May 162012
 

Inversoft has recently switched from Savant to Gradle until Savant 2.0 is finalized and can effectively rolled out (if this ever happens). Gradle is a decent temporary solution, but as a build tool junkie, it rubs me the wrong way. Here are some things that we encountered during the switch that Savant addresses.

Maven Central

We started by attempting to use Maven Central for our dependencies, but quickly realized that Maven Central is a complete mess. We also realized that it suffers from the “most surprise” pattern as well as “GPL” infestation. During our transition, we started to realize that using Maven Central actually caused our number of JARs to dramatically increase and for a couple of GPL JARs to sneak into our classpath.

In the end, we had to rever to managing our own repository and manage all of the ivy.xml dependency files by hand to ensure that everything was correct.

Transitive Mess

Gradle appears to suffer from the “let’s make it simple” methodology of Maven. In order to actually make life simple and avoid actual work, Maven pulls in the entire universe of your project dependencies for compiling and runtime. This means that even if you never import a class from a transitive dependency, Maven puts it in the classpath. This allows you to use classes from transitive dependencies. If you do this though, you’ll never actually know what JAR file a class comes from since it could come from some deeply nested dependency. It also tightly couples your project to your dependencies and those dependencies transitives. We prefer loose coupling rather than tight, so this is generally bad. We also like fewer dependencies at compile time rather than more. Again, bad news all around.

Flat Structure

Gradle plugins and domain are entirely flat. This makes it difficult to separate concerns and distinguish between Java compiling, testing and runtime. Maven actually got this right by cleaning separating concerns and separating plugins.

Configuration vs. tasks

Gradle build files can quickly become large and complex. It is often hard to distinguish between what is configuration and what are tasks.

Order matters

Order matters in Gradle build files unless you always use the String syntax for all configurations and tasks. If you have one task that depends on another task, the order of declaration matters. This is quite painful to debug and I thought we had finally grown out of single pass compiling. Apparently not.

No Versions

There are many pieces of Gradle that don’t like versions, e.g. plugins. These don’t have versions and just use the latest available. This is usually a bad idea because historical builds can break and you’ll never know what version you’ll be using when you execute a build.

MD5 and SHA1?

Why both?

Hashes not always used

The MD5 and SHA1 hashes aren’t always used. For example, the ivy.xml files don’t appear to use the hashes associated with that file. It really should use the hashes for everything.

UploadArchives doesn’t run the tests

WHAT?!?!?

Summary

Overall, I’m not a big fan of Gradle. It seems to be overly complex and sorta punts on dependency management. Having used Maven, Savant, Ant and Ivy, I would say that Savant 2.0 is still the best build system out there, even though it still isn’t finished.

Oct 142011
 

David Lloyd and I are working on getting some standards together for dependency management. We are hoping that this will help unify projects and provide the ability for all the different projects to use the same repositories.

Please join us for the discussion if you have ideas you want to share.

Here are the details:

10am PDT October 21
##java-modularity on irc.freenode.net

Aug 172011
 

Okay, after many hours of battling with MySQL and PostgreSQL, I’ve come to the conclusion that databases support for dates and times suck. PostgreSQL appears to do a much better job than MySQL, but overall they both leave a lot to be desired. The main

The main issues with both these databases is that the lack the ability to store values as UTC without any timezone information. That’s right, I don’t want the database to know anything about timezones. See, once the database starts to store timezone information or worse, store dates and times without timezone information but not in UTC, you get into the situation where everything melts down if the timezone of the operating system or the database server changes. You also get into really nasty issues when you migrate data across locations, for example by exporting data from a server in San Francisco and migrating it to a server in London.

If you think about it, databases are primarily used for storing and retrieving data. In 99.9999% of applications, the database is NOT responsible for displaying data. For this reason, the database should not know anything about timezones because these are purely for display purposes. The correct way to store date and time data is at UTC.

So, what’s my solution given the horrible date and time support in databases? Don’t use them and use bigint instead. Store everything using the number of milliseconds since epoch in UTC. Never ever do any timezone math with your dates and times. Handle everything in UTC and only do timezone manipulations just before you display the data to a user.

I pulled this off by changing all of my datetime and timestamp columns to be bigint instead. They now look like this:

Additionally, in my application code when I use JDBC I pull the data out as Longs like this:

Lastly, I wrote a Hibernate UserType that converts from longs to Joda DateTime instances and annotate my entities like this:

And that’s it. It works perfectly and I never have to worry about any of my data getting screwed up if someone changes the system timezone or migrates data from one server to another. Furthermore, I finally get full support for milliseconds on MySQL and I also can add support to Clean Speak for new databases easily.

Aug 162011
 

Today I’ve been working with Postgresql and MySQL trying to figure out how they handle date-time values and timezones. This is actually quite tricky, so I wanted to write it down for later. First, both databases have two different types of columns:

  • One that stores the date-time without any time zone information
  • One that stores the date-time with time zone information

For Postgresql, these are:

  • timestamp without time zone
  • timestamp with time zone

respectively.

For MySQL, these are:

  • datetime
  • timestamp

There are a number of things to consider when dealing with date-time values:

  • What if the timezone of the server changes?
  • What if the server moves physical locations thereby indicating a new time zone?
  • What if the timezone of the database server changes (different than the timezone of the server)?
  • How does the JDBC driver handle timezones?
  • How does the database handle timezones?

To figure all of this out, it is important to understand how the date-time value goes from the application, to the database, out of the database and back to the application (for inserts and later selects). Here is how this works for values without timezone information:

Insert

  1. Java creates a java.sql.Timestamp instance. This is stored as UTC
  2. Java sends this value to the JDBC driver in UTC
  3. The JDBC driver sends the value to the database in UTC
  4. The database converts the date-time from UTC to its current timezone setting
  5. The database inserts the value into the column in the current timezone (NOT UTC)

Select

  1. The database selects the value from the column in the system timezone
  2. The database converts the value to UTC
  3. The database sends the UTC value to the JDBC driver
  4. The JDBC driver creates a new java.sql.Timestamp instance with the UTC value

For this scenario, you will run into major issues if the the server or database timezone changes between Insert #5 and Select #1. In this case, the value will not be correct. The only way to make things work correctly for this setup is to ensure no timezone settings ever change or to set all timezones to UTC for everything.

For the types that store timezone information, here is how the data is passed around:

Insert

  1. Java creates a java.sql.Timestamp instance. This is stored as UTC
  2. Java sends this value to the JDBC driver in UTC
  3. The JDBC driver sends the value to the database in UTC
  4. The database calculates the offset between the current timezone and UTC
  5. The database inserts the value as UTC into the column along with the offset it calculated

Select

  1. The database selects the value from the column in UTC along with the offset
  2. The database sends the UTC value to the JDBC driver
  3. The JDBC driver creates a new java.sql.Timestamp instance with the UTC value

Since the JDBC driver only handles UTC values, there is no potential for data mangling here since everything is stored in UTC inside the database. Although the database stores the timezone information along with the date-time value in UTC, it isn’t ever used because the JDBC driver doesn’t care what the original timezone was and ignores that value.