On Shimmer and Patience
Remember Shimmer on Saturday Night Live? Dan Aykroyd and Gilda Radner? “It’s a dessert topping and a floor wax!” If you’re as old as I am or enough of an SNL fan to have caught that old episode on DVD, you’re probably chuckling a little. Humor has many triggers in the human brain, and absurdity is certainly one of them.
That’s why my initial reaction to something that happened at work the other day was to laugh – a reaction which, thankfully, I was successful in suppressing. Not only because laughing would have been unprofessional, but because, as is often the case, my first impression was superficial and, ultimately, off the mark.
The topic of discussion was the Continuous Integration build at a client. Let’s see if I can recreate the scene. Names, of course, have been changed to protect those who may not want their names to appear here…
Jeff the Consultant: I see that there are 800-odd unit tests in our code base, and I can run them locally, but the CI build only runs 150 or so of them. Is something wrong?
Bob the Builder: No, this is correct.
Jeff the Consultant: Come again?
Bob the Builder: Yes, we do this intentionally.
Jeff the Consultant: Okay, I have to ask. Why? Does it take too long to run all of the tests?
Bob the Builder: No, not really. The build with all the tests runs in 10 minutes or so.
Jeff the Consultant: Then why do we run just a subset of them on code check-in?
Bob the Builder: Well, we run the full set of tests in a different build that runs nightly.
Jeff the Consultant: Hmm… Doesn’t running the tests at night make it difficult to track down whose code submission is responsible for tests that start failing? And doesn’t it remove the benefit of immediate feedback? I’m getting on in years, and sometimes I get to work in the morning not remembering a lot about what I did the night before.
Bob the Builder: I see your point. It would be a lot better if we ran the full set of tests all the time, on every code check-in.
Jeff the Consultant: Great. So you’ll make the necessary changes to the build script?
Bob the Builder: No, I can’t do that.
Jeff the Consultant: Wonderful, thanks… wait – did you just say “No?”
Bob the Builder: Correct. I can’t change the build.
Jeff the Consultant: Okay, I have to ask. Why not?
Bob the Builder: Because we don’t want this build to fail.
Jeff the Consultant: (This is about the point where I had to start physically suppressing laughter) But the build is supposed to fail if there are broken tests. That’s how we know there’s something that needs to be fixed, and who’s responsible – er, best qualified to fix it.
Bob the Builder: Right, I understand that. But the jars we produce in this build get deployed in an overnight process. So we don’t want the build to fail, because if it does, the daily build won’t get deployed.
Jeff the Consultant: Okay, I see that. But why do you want to deploy a build that you know has bugs in it? Or, in this case, that might have bugs in it, since you’re not running all of the tests…
Bob the Builder: Well, the deployment isn’t just for this team. There’s another team on the East Coast that needs that build to be deployed every day.
Jeff the Consultant: Hmm… interesting. Are the tests we’re not running our tests or theirs?
Bob the Builder: Mostly ours, I think. But it doesn’t matter – we can’t let broken tests stop the deployment, because usually only a small number of features are affected by those tests, and lots of people need to see the deployed application the next day in our demo environment.
Jeff the Consultant: Okay, that makes sense – I guess. But isn’t there another way to do this? For example, couldn’t we run the full suite of unit tests at code check-in time, and make the nightly build the one that skips tests and reliably produces the demo jar files? That way at least we’d get back the benefit of immediate feedback and have a better idea whose code is causing a test to start failing.
Bob the Builder: I think we could do that. But I don’t have time to dig into an overhaul of the build process, because I have development responsibilities too, and we’re only 2 days from code freeze.
(Exit cubicle, stage right…)
By this time my head was spinning. I wasn’t sure whether what I’d just heard made sense, or whether I’d just been had. I was pretty sure the whole build arrangement they had settled on was crazy, but I could also see how an organization could have evolved into this arrangement because of conflicting needs.
The root of their problem is the Shimmer conundrum. It’s neither a dessert topping nor a floor wax, but their CI build is certainly trying to be two very different things at the same time. It was probably started as a CI build in the Agile sense, with the primary purpose being immediate feedback in the form of unit test results.
But of course, like any build, it also builds something. Those jar files were ripe for the picking, and some enterprising person had the great idea to deploy them every day so interested parties in the company would be able to see the application “fresh from the oven” – with all of the evolving new features on display.
The problem is that the secondary purpose of this particular build – generation of compiled, deployable software – became more critical to somebody – some person with influence – and the build slowly stopped meeting its initial purpose.
So, is there a point to this rambling blog post? Indeed there is. It’s a word of caution to those of us in the consulting profession. Those of us who spend a lot of time reading and in other ways learning about “ideal” situations – such as well-run Agile projects – can easily become frustrated when we encounter things that are in some way less than ideal. We have to remember to suspend judgment long enough to understand the big picture – all of the various problems our clients are trying to juggle – and then offer our assistance in helping improve things, at a pace that makes sense in their environment.
I’ll try to remember this myself as I return to work tomorrow…
Of Entities and Identities
I really like Hibernate. Sure, it can be frustrating at times, the learning curve can be steep, and some of the exceptions you get when you do stupid things aren’t all that helpful. But the frustration and the steep learning curve are signs that it’s tackling some really thorny problems. And the cryptic-exception issue does get a little better with each release.
There is one thing, however, that has bothered me since I started using Hibernate several years ago. A deep bother. Not a “this is a nuisance, but I can live with it” bother, but more of a “this is so fundamental an issue that there just has to be a better solution to it than is being offered by Hibernate’s authors” bother.
The problem: bridging the gap between object identity and database identity. In most modern database schemas, tables are defined with a numeric primary key column, which is populated through some unique key assignment mechanism (identity columns in SQL Server, AUTO_INCREMENT columns in MySQL, sequences in Oracle, etc.).
This works great at the database layer. But problems arise in the JVM (or CLR for you NHibernate folks). In order for objects representing database entities (those managed by Hibernate) to be handled properly in collections (including Hibernate’s cache), objects that have the same database identity need to be recognized as equal as well. In Java, this means overriding equals() and hashCode() such that any two objects that represent the same database entity will be equal, and return the same hash code.
The “obvious” solution, at first blush, would be to base equals() and hashCode() on entity ID values. This works great for entities created by Hibernate, which represent existing rows in a database, all with unique IDs. The problem has to do with new entities that you create programmatically, which haven’t yet been saved. Those entities will typically have a null ID value – and if you base equality on ID values, any two unsaved instances will be considered equal.
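There are actually two problems when dealing with unsaved entity objects. The first, as stated above, is that they don’t yet have a unique identity, so multiple new instances will collide in collections like Maps and Sets. The other problem is that the identity of these objects will be set at some point, typically when they are saved, and that changes the value hashCode() returns. The hashCode() contract only promises a stable value while the fields used in equals() stay the same, and a hash-based collection that already holds the object will look in the wrong bucket and fail to find it once its hash code has changed.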
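Both problems can be reproduced in a few lines of plain Java. The sketch below is only an illustration under stated assumptions: Customer is a hypothetical entity class, its setId() method stands in for whatever Hibernate does when it assigns the generated key on save, and no Hibernate code is involved at all.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equality is based on a database-generated ID.
class Customer {
    private Long id;            // null until the row has been inserted
    private final String name;

    Customer(String name) { this.name = name; }

    String getName() { return name; }

    // Stand-in for Hibernate assigning the generated key on save (an assumption).
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        return Objects.equals(id, ((Customer) o).id);
    }

    @Override
    public int hashCode() { return Objects.hashCode(id); }
}

class IdEqualityDemo {
    // Problem 1: two distinct unsaved entities both have a null ID,
    // so the Set treats them as the same element.
    static int sizeAfterAddingTwoUnsaved() {
        Set<Customer> set = new HashSet<>();
        set.add(new Customer("Alice"));
        set.add(new Customer("Bob"));
        return set.size();
    }

    // Problem 2: the hash code changes when the ID is assigned,
    // so the Set can no longer find an object it already holds.
    static boolean foundAfterSave() {
        Set<Customer> set = new HashSet<>();
        Customer c = new Customer("Alice");
        set.add(c);
        c.setId(42L);
        return set.contains(c);
    }
}
```

Here sizeAfterAddingTwoUnsaved() returns 1 rather than 2, and foundAfterSave() returns false even though the object was never removed from the set.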
To avoid all of these problems, Hibernate’s authors have long recommended a workaround – use “natural keys” instead of identity values for equals() and hashCode(). The problem with this is that many entities simply don’t have a natural key – that is, a set of one or more attributes whose values uniquely identify the entity. And those that do often fail the hashCode() qualification that the value should not change over time.
There are other possible solutions I’ve seen or come up with myself, but all had at least one glaring weakness. It seemed that there was no silver bullet for dealing with database-generated identity values in Hibernate.
And therein lies the silver bullet. Don’t use the database (or Hibernate) to generate identity values.
I stumbled across a wonderful article on O’Reilly’s OnJava site that quite clearly describes this problem (in more detail than I do here), and offers what I believe to be the only real solution. Check it out at http://www.onjava.com/pub/a/onjava/2006/09/13/dont-let-hibernate-steal-your-identity.html.
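To make the idea concrete, here is a minimal sketch of an entity that owns its identity from the moment it is constructed. One common way to do this, and the assumption made here, is to assign a random UUID in the constructor. Invoice and AssignedIdDemo are hypothetical names, and a real Hibernate mapping would also need details such as a no-argument constructor and a way to distinguish new objects from saved ones, which this sketch leaves out.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical entity: the ID is assigned by the object itself, not by the database.
class Invoice {
    private final String id;      // never null, never changes after construction
    private String description;

    Invoice(String description) {
        this.id = UUID.randomUUID().toString();
        this.description = description;
    }

    String getId() { return id; }
    String getDescription() { return description; }
    void setDescription(String description) { this.description = description; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Invoice)) return false;
        return id.equals(((Invoice) o).id);
    }

    @Override
    public int hashCode() { return id.hashCode(); }
}

class AssignedIdDemo {
    // Two unsaved entities already have distinct identities.
    static int sizeAfterAddingTwoUnsaved() {
        Set<Invoice> set = new HashSet<>();
        set.add(new Invoice("first"));
        set.add(new Invoice("second"));
        return set.size();
    }

    // Changing other attributes does not affect the hash code,
    // so the Set can still find the object.
    static boolean foundAfterChange() {
        Set<Invoice> set = new HashSet<>();
        Invoice invoice = new Invoice("draft");
        set.add(invoice);
        invoice.setDescription("final");
        return set.contains(invoice);
    }
}
```

Because the ID exists before the object ever reaches the database and never changes afterwards, equals() and hashCode() behave the same way for unsaved and saved instances.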
And now, as I look out the window at the late February snowstorm, I think maybe it’s time for me to hibernate…
