The great Derby slowdown


We recently switched our client database implementation from Mckoi to Derby because of some performance issues we’d run into. All was going well: performance was up significantly and everyone was happy. That was until Friday, when one of the developers noted that performance had dropped a little, then promptly went on holidays :).

It wasn’t until today that I got around to integrating the new database into the client application. Once I did, I pulled some customer information from the server onto the client and waited, and waited, and waited. Then the sinking feeling came. First I checked to see if I’d done something lame, like turned off the turbo switch (anyone else remember when PCs had turbo switches?). Nope, everything pretty much checked out. Bum.

So out comes the trusty profiler, YourKit: odd name, great profiler. A quick run with YourKit collecting profiling information made it apparent that we were compiling statements far more often than we should have been. The code seemed to be doing the correct thing. We use prepared statements almost everywhere, yet the database log and YourKit were showing that each time a statement was used it was being recompiled.

Okay, time to attach the Derby source and set a couple of breakpoints to get a better idea of what was going on. Nope, the Derby release is compiled without debug symbols. That’s okay, it’s open source, I can recompile it myself. Now, I’m only going to say one thing about the whole sordid process: any Ant build that has a property named sane that must be set to false to get the build to work has some very serious issues.

Now, finally, after the shock and horror of building Derby, I begin to trace execution. There are some prepared statements, there they are getting cached, all good. Let the application run and enter the area running slowly. What, the cache is empty? Huh? I saw statements in there just a moment ago. Trace, click, follow, ooohhh. It seems that one of the last little optimisations / sanity checks added by my holidaying colleague before departing was to drop a constraint before running a series of operations. The constraint was then added back once everything was done (the database is accessed by a single user on a single thread). Unbeknown to him, once you make a DDL call in a Derby transaction, no statements are cached for the rest of that transaction and every call recalculates the plan. Eeeks. A minor modification to drop and restore the constraint in separate transactions did no end of good.
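For the curious, the shape of the fix looks something like the following. This is a minimal sketch in plain JDBC rather than our actual code: the `ConstraintToggle` class, the `BulkWork` callback and the method names are all made up for illustration, and it assumes the connection has no uncommitted work pending when it is called.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical callback for the operations that run while the constraint is absent. */
interface BulkWork {
    void run(Connection conn) throws SQLException;
}

/** Illustrative helper: keeps DDL and the bulk work in separate transactions. */
final class ConstraintToggle {
    private ConstraintToggle() {}

    /** Runs one DDL statement and commits straight away, so the DDL gets its own transaction. */
    static void runDdlInOwnTransaction(Connection conn, String ddl) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(ddl);
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }

    /**
     * Drops the constraint, runs the work, then restores the constraint,
     * committing after each step. Assumes no uncommitted work is pending on entry.
     */
    static void withConstraintDropped(Connection conn, String dropSql,
                                      String restoreSql, BulkWork work) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try {
            runDdlInOwnTransaction(conn, dropSql);        // transaction 1: DDL only
            try {
                work.run(conn);                           // transaction 2: no DDL in here
                conn.commit();
            } catch (SQLException | RuntimeException e) {
                conn.rollback();                          // discard the partial work
                throw e;
            } finally {
                runDdlInOwnTransaction(conn, restoreSql); // transaction 3: DDL only
            }
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A call would pass something like `"ALTER TABLE orders DROP CONSTRAINT fk_customer"` as `dropSql` and the matching `ADD CONSTRAINT` statement as `restoreSql` (table and constraint names invented here). The point is simply that the transaction doing the real work never contains a DDL statement. Note that in this sketch a failure while restoring the constraint would mask an earlier exception from the work itself.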

Derby’s documentation could have been a little clearer. It states that the following must be true for the prepared statement to be cached:

  • The text must match exactly
  • The current schema must match
  • The Unicode flag that the statement was compiled under must match the current connection’s flag

Obviously we were violating the second point. It didn’t occur to me because the manipulation is done once at the start of the operation and the schema is the same for all calls after that. There is no mention of the fact that no statements will be cached until the transaction manipulating the schema commits. At least with the source available I was able to trace through and locate the cause.

Cedric is wrong.


I’m a big fan of Cedric Beust’s work, but in “Testing private methods? You bet.” Cedric asserts that private methods should have tests because “if it can break, test it”. In my opinion this is just plain wrong, and I believe that Cedric’s own examples back me up.

Cedric provides us with a reasonably common scenario.

Imagine that you are writing a Swing table widget that lets you order your columns in several different ways. Your sorting algorithm is private and one day, you introduce a bug in it. If all you do is test the public methods, the widget will simply start showing wrong results and your tests won’t tell you anything more. For all you know, there might be a bug in your Swing code. If you unit test your (private) sorting algorithm, you will catch the error right away instead of wasting time “higher up the stack”.

This is a variant on a theme that we’ve all come across. In the past I’ve seen people expose private methods as public just to test them, and these methods then become a part of the public contract of the class. Funnier is the public method with a note in the Javadoc that the method is really private and only exposed for testing purposes. Better yet is the use of reflection to call a private method in order to test it. I could go on, but I’m guessing you get my point.

My rule is simple. If it’s part of the public contract, test it; if it’s internal implementation, test it via the public contract.

So we now go back to Cedric’s problem. From the description, doesn’t his need for a sorting algorithm sound like the perfect place to use a strategy? You can then test your sorting algorithm simply and easily, external to its usage in the widget. This breaks the design down neatly but still leaves us with a dangling pointer. The sorting strategy class is now a standalone class that is not really part of the public contract of the application. We could make the class package-private and have the test in the same package, but I like to have my tests in a separate package to ensure that I am testing only the public contract. So, following the lead of the Eclipse team, I put classes like this, ones that need to be public for implementation purposes but are not a part of the public contract, in their own package, internal. Eclipse can even enforce that I don’t directly use classes found in any internal package.
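To make that concrete, here is a rough sketch of what I mean. It is my own illustration, not Cedric’s code or a real Swing component: `ColumnSortStrategy`, `AscendingTextSort` and `TableWidgetModel` are hypothetical names, rows are plain string arrays, and everything sits in one file for brevity. In a real project the strategy would be a public class in an internal package.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical strategy: the sorting algorithm pulled out of the widget. */
interface ColumnSortStrategy {
    /** Returns a new list sorted by the given column; the input list is left untouched. */
    List<String[]> sort(List<String[]> rows, int column);
}

/** One possible strategy: ascending, case-sensitive text order. */
final class AscendingTextSort implements ColumnSortStrategy {
    @Override
    public List<String[]> sort(List<String[]> rows, int column) {
        List<String[]> sorted = new ArrayList<>(rows);
        // List.sort is stable, so rows with equal keys keep their relative order.
        sorted.sort(Comparator.comparing((String[] row) -> row[column]));
        return sorted;
    }
}

/** Stand-in for the table widget: it only delegates to whichever strategy it was given. */
final class TableWidgetModel {
    private final ColumnSortStrategy strategy;
    private List<String[]> rows;

    TableWidgetModel(ColumnSortStrategy strategy, List<String[]> rows) {
        this.strategy = strategy;
        this.rows = new ArrayList<>(rows);
    }

    void sortBy(int column) {
        rows = strategy.sort(rows, column);
    }

    String valueAt(int row, int column) {
        return rows.get(row)[column];
    }
}
```

A test for `AscendingTextSort` now needs nothing but a list of rows, with no widget and no reflection involved. If the table shows the wrong order while the strategy’s own tests pass, the bug is in the widget code.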

In my opinion Cedric’s solution tests a bad design, and no amount of testing will “fix” it. If you need to test a private method, look at your design. If the method should be part of the public contract, then make it so. If not, take another look at the design of your classes.
