Remember Shimmer on Saturday Night Live? Dan Aykroyd and Gilda Radner? “It’s a dessert topping and a floor wax!” If you’re as old as I am or enough of an SNL fan to have caught that old episode on DVD, you’re probably chuckling a little. Humor has many triggers in the human brain, and absurdity is certainly one of them.
That’s why my initial reaction to something that happened at work the other day was to laugh – a reaction which, thankfully, I was successful in suppressing. Not only because laughing would have been unprofessional, but because, as is often the case, my first impression was superficial and, ultimately, off the mark.
The topic of discussion was the Continuous Integration build at a client. Let’s see if I can recreate the scene. Names, of course, have been changed to protect those who may not want their names to appear here…
Jeff the Consultant: I see that there are 800-odd unit tests in our code base, and I can run them locally, but the CI build only runs 150 or so of them. Is something wrong?
Bob the Builder: No, this is correct.
Jeff the Consultant: Come again?
Bob the Builder: Yes, we do this intentionally.
Jeff the Consultant: Okay, I have to ask. Why? Does it take too long to run all of the tests?
Bob the Builder: No, not really. The build with all the tests runs in 10 minutes or so.
Jeff the Consultant: Then why do we run just a subset of them on code check-in?
Bob the Builder: Well, we run the full set of tests in a different build that runs nightly.
Jeff the Consultant: Hmm… Doesn’t running the tests at night make it difficult to track down whose code submission is responsible for tests that start failing? And doesn’t it remove the benefit of immediate feedback? I’m getting on in years, and sometimes I get to work in the morning not remembering a lot about what I did the night before.
Bob the Builder: I see your point. It would be a lot better if we ran the full set of tests all the time, on every code check-in.
Jeff the Consultant: Great. So you’ll make the necessary changes to the build script?
Bob the Builder: No, I can’t do that.
Jeff the Consultant: Wonderful, thanks… wait – did you just say “No?”
Bob the Builder: Correct. I can’t change the build.
Jeff the Consultant: Okay, I have to ask. Why not?
Bob the Builder: Because we don’t want this build to fail.
Jeff the Consultant: (This is about the point where I had to start physically suppressing laughter) But the build is supposed to fail if there are broken tests. That’s how we know there’s something that needs to be fixed, and who’s responsible – er, best qualified to fix it.
Bob the Builder: Right, I understand that. But the jars we produce in this build get deployed in an overnight process. So we don’t want the build to fail, because if it does, the daily build won’t get deployed.
Jeff the Consultant: Okay, I see that. But why do you want to deploy a build that you know has bugs in it? Or, in this case, that might have bugs in it, since you’re not running all of the tests…
Bob the Builder: Well, the deployment isn’t just for this team. There’s another team on the East Coast that needs that build to be deployed every day.
Jeff the Consultant: Hmm… interesting. Are the tests we’re not running our tests or theirs?
Bob the Builder: Mostly ours, I think. But it doesn’t matter – we can’t let broken tests stop the deployment, because usually only a small number of features are affected by those tests, and lots of people need to see the deployed application the next day in our demo environment.
Jeff the Consultant: Okay, that makes sense – I guess. But isn’t there another way to do this? For example, couldn’t we run the full suite of unit tests at code check-in time, and make the nightly build the one that skips tests and reliably produces the demo jar files? That way at least we’d get back the benefit of immediate feedback and have a better idea whose code is causing a test to start failing.
Bob the Builder: I think we could do that. But I don’t have time to dig into an overhaul of the build process, because I have development responsibilities too, and we’re only two days from code freeze.
(Exit cubicle, stage right…)
By this time my head was spinning. I wasn’t sure whether what I’d just heard made sense, or whether I’d just been had. I was pretty sure the whole build arrangement they had settled on was crazy, but I could also see how an organization could have evolved into this arrangement because of conflicting needs.
The root of their problem is the Shimmer conundrum. Their CI build is neither a dessert topping nor a floor wax, but it is certainly trying to be two very different things at the same time. It probably started as a CI build in the Agile sense, with the primary purpose being immediate feedback in the form of unit test results.
But of course, like any build, it also builds something. Those jar files were ripe for the picking, and some enterprising person had the great idea to deploy them every day so interested parties in the company would be able to see the application “fresh from the oven” – with all of the evolving new features on display.
The problem is that the secondary purpose of this particular build – generation of compiled, deployable software – became more critical to somebody, some person with influence, and the build slowly stopped meeting its initial purpose.
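To make the alternative I suggested to Bob a little more concrete, here is a minimal sketch of one build pulled apart into two, each with a single job. It is a toy model in Python, not the client's actual build script. Every name in it (`BuildConfig`, `run_build`, `CHECKIN`, `NIGHTLY_DEMO`, the sample tests) is hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class BuildConfig:
    """Hypothetical description of what a single build is responsible for."""
    name: str
    run_tests: bool          # gate the build on the unit test suite?
    publish_artifacts: bool  # hand the compiled output to the deployment step?


@dataclass(frozen=True)
class BuildResult:
    name: str
    failed_tests: List[str]
    passed: bool
    published: bool


def run_build(config: BuildConfig, tests: Dict[str, Callable[[], bool]]) -> BuildResult:
    """Run one build; each test is modeled as a callable returning True on success."""
    failed: List[str] = []
    if config.run_tests:
        failed = [name for name, test in tests.items() if not test()]
    passed = not failed
    # Artifacts are only published by builds configured to do so, and only if they passed.
    published = config.publish_artifacts and passed
    return BuildResult(config.name, failed, passed, published)


# Check-in build: its only job is fast feedback, so it runs every test and publishes nothing.
CHECKIN = BuildConfig("check-in", run_tests=True, publish_artifacts=False)

# Nightly build: its only job is to produce the demo jars, so it skips the tests.
NIGHTLY_DEMO = BuildConfig("nightly-demo", run_tests=False, publish_artifacts=True)


if __name__ == "__main__":
    sample_tests = {
        "test_login": lambda: True,
        "test_checkout": lambda: False,  # a deliberately broken test
    }
    for config in (CHECKIN, NIGHTLY_DEMO):
        print(run_build(config, sample_tests))
```

With the broken sample test, the check-in build fails and names `test_checkout`, while the nightly build still publishes. The broken test no longer blocks the demo deployment, but it also no longer goes unnoticed until the next morning.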
So, is there a point to this rambling blog post? Indeed there is. It’s a word of caution to those of us in the consulting profession. Those of us who spend a lot of time reading and otherwise learning about “ideal” situations, such as well-run Agile projects, can easily become frustrated when we encounter things that are in some way less than ideal. We have to remember to suspend judgment long enough to understand the big picture – all of the various problems our clients are trying to juggle – and then offer our assistance in helping improve things, at a pace that makes sense in their environment.
I’ll try to remember this myself as I return to work tomorrow…