In the course of developing WattDepot, I ran into an annoying intermittent bug in my JUnit tests. I would sometimes get a failure in one particular test class, but not always in the same method of that class. The failure manifested as a 60-second pause on the affected test, followed by the WattDepotClient method returning a 1001 miscellaneous failure status code. Maddeningly, it would only fail sometimes, making it much harder to track down (and making continuous integration comical). Further, running the test from within Eclipse worked fine every time, so I was unable to use the debugger to figure out what was going on.
Philip pointed out that this sounded like a classic deadlock problem between threads, perhaps in Derby, which I’m using for persistence. He suggested that I use VisualVM to see if I could track down any deadlocks. Mac OS X comes with VisualVM installed as “jvisualvm”, and it’s pretty easy to use. Luckily, since the failure manifested as a 60-second pause, I could start the test, then attach to the JUnit process and obtain thread dumps to see what was going on.
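For anyone who can’t attach VisualVM, roughly the same information can be collected from inside the JVM with the standard java.lang.management API (Java 6 or later). This is only a sketch of an alternative, not what I actually did: the class name ThreadDumper is made up for illustration, and it has to be called from within the process under test (for example from a watchdog thread) rather than attached from outside.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: builds a plain-text thread dump of the current JVM. */
public class ThreadDumper {

  /** Returns a deadlock count followed by the stack trace of every live thread. */
  public static String dump() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    StringBuilder out = new StringBuilder();

    // findDeadlockedThreads() returns null when no deadlock is detected.
    long[] deadlocked = bean.findDeadlockedThreads();
    out.append("Deadlocked threads: ")
        .append(deadlocked == null ? 0 : deadlocked.length)
        .append('\n');

    // false, false: skip locked monitor and synchronizer details to keep the dump short.
    for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
      out.append('"').append(info.getThreadName()).append("\" ")
          .append(info.getThreadState()).append('\n');
      for (StackTraceElement frame : info.getStackTrace()) {
        out.append("    at ").append(frame).append('\n');
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.print(dump());
  }
}
```

A thread that is merely blocked waiting on I/O will not be counted as deadlocked, so the stack traces are the part worth reading: look for the thread that stays parked in the same frame across several dumps.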
After a few thread dumps, I tracked it down to HTTP communication. The failure happens when the client is using PUT to send a new resource to the server, and the server is waiting for the end of the entity body from the client. This happens before any Derby call, so it looks like Derby is ruled out (at least for this bug).
WattDepot uses the Restlet framework to make it easier to implement the REST API, and to perform all the HTTP client and server work. Restlet provides a variety of connectors for both the client and server HTTP connections. In fact, there are enough options that it is somewhat confusing trying to pick one. Restlet also has internal HTTP client and server connectors that come in the core Restlet jars. According to this email thread, the choice of connector is made automatically by scanning the classpath, with the first match winning.
When first setting up WattDepot, I based the set of Restlet jars I was using on Hackystat. Hackystat’s SensorBase includes org.restlet.jar (API classes), com.noelios.restlet.jar (reference implementation, including internal HTTP connectors), com.noelios.restlet.ext.net.jar (client connector based on JDK HTTP code), and com.noelios.restlet.ext.simple_3.1.jar (server connector based on the Simple framework). So it appears that WattDepot is using the Net connector for client HTTP connections and the Simple connector for server connections, both overriding the internal HTTP connectors in the reference implementation.
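One way to take some of the guesswork out of “it appears” is to ask the classloader which connector classes are actually visible at runtime. The sketch below is an assumption-heavy illustration: ConnectorCheck is a made-up class, and the four helper class names are my recollection of the Restlet 1.x package layout, so they should be checked against the jars before trusting the output. It also only reports which connectors are present, not which one Restlet ends up selecting.

```java
/** Hypothetical helper: reports which connector classes the classloader can see. */
public class ConnectorCheck {

  // ASSUMED class names for Restlet 1.x connectors; verify against the actual jars.
  static final String[] CANDIDATES = {
    "com.noelios.restlet.ext.net.HttpClientHelper",    // Net client connector (assumed)
    "com.noelios.restlet.ext.simple.HttpServerHelper", // Simple server connector (assumed)
    "com.noelios.restlet.http.StreamClientHelper",     // internal client connector (assumed)
    "com.noelios.restlet.http.StreamServerHelper"      // internal server connector (assumed)
  };

  /** Returns true if the named class can be loaded, without running its static initializers. */
  static boolean isOnClasspath(String className) {
    try {
      Class.forName(className, false, ConnectorCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException e) {
      return false; // class is not on the classpath
    } catch (LinkageError e) {
      return false; // class file found, but something it depends on is missing
    }
  }

  public static void main(String[] args) {
    for (String name : CANDIDATES) {
      System.out.println((isOnClasspath(name) ? "present  " : "missing  ") + name);
    }
  }
}
```

Running this with the same classpath as the tests, before and after removing a connector jar, gives a quick sanity check that the dependency change really took effect.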
Since my problem was taking place in the HTTP code, I decided to try experimenting with removing Net and Simple from the classpath, thereby allowing the appropriate internal HTTP connector to kick in. Since I’m using Ivy and Ivy RoundUp for dependency management, this turns out to be as easy as changing the configuration parameter in the Restlet Ivy config, deleting the project “lib” directory and rerunning the tests.
After trying all combinations (all internal connectors, internal server & Net client, Simple server & internal client, Simple server & Net client), I found that only the combination of the Simple server connector and the Net client connector leads to my unit test failure. I guess I’m just lucky that way. 🙂
The solution is then to stop using either the Net client or the Simple server. Since the WattDepot server is likely to be the more performance-sensitive aspect of WattDepot, I opted to keep the Simple server on the assumption that it is higher performance than the internal Restlet server. It would be nice to figure out which of the variety of client and server connectors is recommended as the best performing, but this will do for now.
In the future I plan to post something to the Restlet mailing list to see if anyone else has run into this problem so it can be tracked down and perhaps fixed.