Monthly Archives: February 2010

Using XPath to pick data out of XML

This week I wrote a WattDepot sensor for the TED 5000 home energy meter. The TED 5000 gateway (a small Internet-connected embedded computer) provides a URI that generates XML showing the current power data. First, I needed to figure out what the XML meant. Once that was done, I wanted a quick and simple way, in Java, to pick out the two pieces of data that I care about from the XML.

WattDepot uses JAXB extensively for XML processing, but that was kinda heavyweight for my needs here. I had heard about XPath, and it sounded like the right type of tool for just grabbing a little data from XML. Turns out that Java 1.5 and later have XPath built in, so there are no additional dependencies.

IBM has a good tutorial on using XPath from Java by Elliotte Rusty Harold. Unfortunately, I was confused initially because all the XPath examples in the tutorial are for finding all XML nodes in a document that meet certain criteria, whereas I knew exactly where in the XML tree my data was lurking. Luckily, it turns out that XPath is really a lot like a path in a filesystem (duh), so traversing the tree is easy.

Say you have the following XML from TED (some parts elided):

<LiveData>
  ...
  <Power>
    <Total>
      <PowerNow>2995</PowerNow>
      ...
      <PowerMTD>515227</PowerMTD>
      ...
    </Total>
  ...
  </Power>
</LiveData>

The XPath that would pull out the value from PowerNow is /LiveData/Power/Total/PowerNow/text(), and for PowerMTD it is /LiveData/Power/Total/PowerMTD/text(). Simple!

Here is a code fragment that extracts those two values from an XML file (stealing liberally from the XPath tutorial linked above):

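import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class XPathTest {

  public static void main(String[] args) throws ParserConfigurationException, SAXException,
      IOException, XPathExpressionException {
    if (args.length != 1) {
      System.out.println("Need XML filename arg.");
      return;
    }
    // Parse the XML file into a DOM tree. Wrapping the argument in a File makes sure it is
    // treated as a local filename rather than as a URI.
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true);
    DocumentBuilder builder = domFactory.newDocumentBuilder();
    Document doc = builder.parse(new File(args[0]));

    // One XPath object can compile any number of expressions.
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    XPathExpression exprPower = xpath.compile("/LiveData/Power/Total/PowerNow/text()");
    XPathExpression exprEnergy = xpath.compile("/LiveData/Power/Total/PowerMTD/text()");

    // Asking for XPathConstants.NUMBER makes evaluate() return a Double.
    Double power = (Double) exprPower.evaluate(doc, XPathConstants.NUMBER);
    Double energy = (Double) exprEnergy.evaluate(doc, XPathConstants.NUMBER);
    System.out.println("Power from TED 5K: " + power + "W");
    System.out.println("Energy from TED 5K month to date: " + energy + "Wh");
  }
}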
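
For a one-off lookup, the same idea can probably be squeezed down further. The following is only a sketch, not what the WattDepot sensor actually does: it assumes the XML is already in memory as a String, and the class name TedXPathSketch and the helper readNumber are made up for illustration. It relies on the XPath.evaluate() overload that takes an InputSource, so no explicit DocumentBuilder is needed.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class TedXPathSketch {

  /**
   * Evaluates an XPath expression against an XML string and returns the result as a number.
   * If the path matches nothing, XPath's number conversion yields NaN rather than an error.
   * (Hypothetical helper, not part of WattDepot.)
   */
  static double readNumber(String xml, String expression) throws XPathExpressionException {
    XPath xpath = XPathFactory.newInstance().newXPath();
    InputSource source = new InputSource(new StringReader(xml));
    return (Double) xpath.evaluate(expression, source, XPathConstants.NUMBER);
  }

  public static void main(String[] args) throws XPathExpressionException {
    // Trimmed-down version of the TED XML shown above.
    String xml = "<LiveData><Power><Total>"
        + "<PowerNow>2995</PowerNow><PowerMTD>515227</PowerMTD>"
        + "</Total></Power></LiveData>";
    System.out.println("Power: " + readNumber(xml, "/LiveData/Power/Total/PowerNow") + "W");
    System.out.println("Energy MTD: " + readNumber(xml, "/LiveData/Power/Total/PowerMTD") + "Wh");
  }
}
```

Each call re-parses the XML, so for more than a couple of values the parse-once approach in the fragment above is the better fit.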

It’s nice to have a quick and easy way to make use of XML from Java in my toolbox.

it’s electric: TED data storage and plotting

I was checking the website for The Energy Detective the other day looking for API info, and found that their page of 3rd-party applications had been updated to include an application called it’s electric. it’s electric is a Java web application that queries the TED gateway frequently for the 1-second-resolution power data and stores it in a Berkeley DB database. That alone is useful, as the TED has a segmented data storage system, keeping the 1-second-resolution data only for an hour (and so on for coarser-grained data).

It also provides a graphing system based on Google’s Annotated Timeline visualization, with some enhancements like automatically changing the resolution of the displayed data depending on the time interval displayed. Here’s a screenshot:

Screenshot of graph produced by it's electric

There’s a Google group for support and discussion, and the author, Robert Tupelo-Schneck, seems quite responsive. A jar file is provided on the group page (which I won’t link to since you should download the latest version). It includes the Java bytecode as well as the source, which is released under the AGPL. The application is not large, consisting of 5 class files.

Compared to WattDepot, it’s electric seems considerably snappier. Presumably this is due in part to using Berkeley DB for persistence instead of an SQL database. The code also stores data in byte form, rather than higher-level Java objects and XML. Also, it’s electric occupies a clear functionality niche: it provides long-term storage of the finest-grained TED data (which is otherwise lost every hour), and provides graphing of that data from locations outside the home network.
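
To give a feel for why storing raw bytes can be so much leaner than objects or XML, here is a rough sketch of packing one power sample into a fixed 8-byte record. To be clear, this is not it’s electric’s actual storage format, which I have not dug into; the record layout (a 4-byte timestamp in seconds followed by a 4-byte wattage) and the names SampleCodec, encode, and decode are all assumptions made up for illustration.

```java
import java.nio.ByteBuffer;

class SampleCodec {

  /** Hypothetical record layout: 4-byte epoch seconds followed by 4-byte watts. */
  static final int RECORD_BYTES = 8;

  /** Packs one sample into a fixed-size, big-endian byte record. */
  static byte[] encode(int epochSeconds, int watts) {
    return ByteBuffer.allocate(RECORD_BYTES).putInt(epochSeconds).putInt(watts).array();
  }

  /** Unpacks a record produced by encode() into {epochSeconds, watts}. */
  static int[] decode(byte[] record) {
    ByteBuffer buf = ByteBuffer.wrap(record);
    return new int[] {buf.getInt(), buf.getInt()};
  }

  public static void main(String[] args) {
    byte[] record = encode(1266000000, 2995);
    int[] sample = decode(record);
    System.out.println(record.length + " bytes: t=" + sample[0] + " watts=" + sample[1]);
  }
}
```

At one sample per second, a layout like this would come to 8 × 86,400 = 691,200 bytes per day, far less than the same readings wrapped in XML elements.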

I experienced some problems when scrolling around the data on the live it’s electric website: sometimes the graph would not update, and sometimes I was unable to scroll to where I wanted, apparently because new data was being loaded for the current location.

Overall, it’s electric looks like it could be useful for TED owners who want to hold on to that fine-grained data and want more options for displaying it outside the home.

WattDepot going “real-time”

In the past week I have added a new REST API method to support near-real-time queries in WattDepot. The goal is to support user interface widgets that display the latest sensor data from a source, such as a smart meter in a home or dormitory. I have also written a command line monitoring client that shows how to use the new functionality. Both of these will be released as part of WattDepot 1.2 in the near future, hopefully with the addition of a sensor that collects data from TED 5000 home smart meters.

Speaking of sensors, I created a wiki page that explains how to write a WattDepot sensor. This should be helpful for anyone planning to write a sensor to support a new type of meter.

In other WattDepot news, there are three projects in ICS414 this semester that are related to WattDepot. The WattDepot Apps team is working on demonstration web applications for WattDepot. The first application is a visualizer that makes use of the Google Visualization API to make graphs of WattDepot data. It should be ready for a 1.0 release very soon. Next they will be moving on to create a web application that monitors the latest sensor data from a source using the new API method. In the future, hopefully they will be working on a browsing application that lets users look over the users and sources in a WattDepot repository.

The Stoplight Gadget team is working on a Google Visualization gadget that checks a data source for a value, and based on user-settable thresholds displays a traffic light as either red, yellow, or green. While this is a general-purpose visualization gadget, we expect to use it with WattDepot data as part of the UH dorm energy competition, though precisely how is yet to be determined.
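
The threshold logic itself is simple enough to sketch. This is not the team’s code (the gadget is a Google Visualization gadget, not Java), just an illustration of the idea under one assumption: that higher values are worse, so a value at or above the red threshold shows red. The names StoplightSketch, Light, and classify are hypothetical.

```java
class StoplightSketch {

  enum Light { GREEN, YELLOW, RED }

  /**
   * Maps a value to a traffic-light color using two user-settable thresholds.
   * Assumes higher values are worse: value >= redThreshold is RED,
   * value >= yellowThreshold is YELLOW, anything lower is GREEN.
   */
  static Light classify(double value, double yellowThreshold, double redThreshold) {
    if (yellowThreshold > redThreshold) {
      throw new IllegalArgumentException("yellow threshold must not exceed red threshold");
    }
    if (value >= redThreshold) {
      return Light.RED;
    }
    if (value >= yellowThreshold) {
      return Light.YELLOW;
    }
    return Light.GREEN;
  }

  public static void main(String[] args) {
    // Example thresholds in watts, chosen arbitrarily for the demo.
    System.out.println(classify(2995, 2000, 4000));
  }
}
```

A gadget where lower values are worse (say, percent of an energy-saving goal met) would need the comparisons flipped, which is presumably part of what “user-settable” has to cover.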

Finally, the Energy Meter team is surveying power meters that can be used for the UH dorm energy competition. While they have been in a data gathering phase so far, they are now switching to implement a Modbus/TCP sensor for WattDepot. This sensor will be used to collect data from the floors of the dorms in the energy competition.

Debugging Restlet connector problem

In the course of developing WattDepot, I ran into an annoying intermittent bug in my JUnit tests. I would sometimes get a failure in one particular test class, but not always in the same method of that class. The failure manifested as a 60-second pause on the affected test, followed by the WattDepotClient method returning a 1001 miscellaneous failure status code. Maddeningly, it would only fail sometimes, making it much harder to track down (and making continuous integration comical). Further, running the test from within Eclipse would work fine every time, so I was unable to use the debugger to figure out what was going on.

Philip pointed out that this sounded like a classic deadlock problem between threads, perhaps in Derby, which I’m using for persistence. He suggested that I use VisualVM to see if I could track down any deadlocks. Mac OS X comes with VisualVM installed as “jvisualvm”, and it’s pretty easy to use. Luckily, since the failure manifested as a 60-second pause, I could start the test, and then attach to the JUnit process and obtain thread dumps to see what was going on.
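
For cases where attaching a tool is awkward, much of the same information can be pulled from inside the JVM with the standard java.lang.management API. The following is a sketch of that alternative, not something I used for this bug; the class name ThreadDumpSketch and its two methods are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {

  /**
   * Returns a text dump of all live threads. Note that ThreadInfo.toString()
   * only prints the first few stack frames of each thread.
   */
  static String dump() {
    ThreadMXBean threads = ManagementFactory.getThreadMXBean();
    StringBuilder out = new StringBuilder();
    // false, false: skip locked-monitor and locked-synchronizer details.
    for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
      out.append(info);
    }
    return out.toString();
  }

  /** Returns the IDs of threads the JVM considers deadlocked (empty if none). */
  static long[] deadlocked() {
    long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
    return ids == null ? new long[0] : ids;
  }

  public static void main(String[] args) {
    System.out.println("Deadlocked threads: " + deadlocked().length);
    System.out.print(dump());
  }
}
```

A check like this only reports deadlocks on Java locks. A thread that is simply blocked waiting on a socket shows up in the dump as an ordinary waiting thread and has to be spotted by reading the stack traces.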

After a few thread dumps, I tracked it down to HTTP communication. The failure happens when the client is using PUT to send a new resource to the server, and the server is waiting for the end of the entity body from the client. This happens before any Derby call, so it looks like Derby is ruled out (at least for this bug).

WattDepot uses the Restlet framework to make it easier to implement the REST API, and to perform all the HTTP client and server work. Restlet provides a variety of connectors for both the client and server HTTP connections. In fact, there are enough options that it is somewhat confusing trying to pick one. Restlet has internal HTTP client and server connectors that come in the core Restlet jars. According to this email thread, the choice of connector is made automatically by scanning the classpath, with the first match winning.

When first setting up WattDepot, I based the set of Restlet jars I was using on Hackystat. Hackystat’s SensorBase includes org.restlet.jar (API classes), com.noelios.restlet.jar (reference implementation, including internal HTTP connectors), com.noelios.restlet.ext.net.jar (client connector based on JDK HTTP code), and com.noelios.restlet.ext.simple_3.1.jar (server connector based on Simple framework). So it appears that WattDepot is using the Net connector for client HTTP connections, and the Simple connector for server connections, both overriding the internal HTTP connections in the reference implementation.

Since my problem was taking place in the HTTP code, I decided to try experimenting with removing Net and Simple from the classpath, thereby allowing the appropriate internal HTTP connector to kick in. Since I’m using Ivy and Ivy RoundUp for dependency management, this turns out to be as easy as changing the configuration parameter in the Restlet Ivy config, deleting the project “lib” directory and rerunning the tests.

After trying all combinations (all internal connectors, internal server & Net client, Simple server & internal client, Simple server & Net client), I found that only the combination of the Simple server connector and the Net client connector leads to my unit test failure. I guess I’m just lucky that way. :)

The solution is then to stop using either the Net client or the Simple server. Since the WattDepot server is likely to be the more performance-sensitive aspect of WattDepot, I opted to keep the Simple server, on the assumption that it is higher performance than the internal Restlet server, and to drop the Net client in favor of the internal client connector. It would be nice to figure out which of the variety of client and server connectors is recommended as the best performing, but this will do for now.

In the future I plan to post something to the Restlet mailing list to see if anyone else has run into this problem so it can be tracked down and perhaps fixed.

Future publication venues

Update March 5, 2010: now maintained as a wiki page rather than a blog post

Updated Feb 2, 2010: added IEEE upcoming journals and IEEE Sensors conference.

So I’ve been thinking lately about where I might publish my research on WattDepot, and later on the UH dorm energy challenge. Here’s what I have come up with so far.

Conferences:

  • IEEE Smart Grid Conference. The deadline for papers is May 1, with the conference happening Oct 4-6 in Maryland. Philip has suggested this might be a good place for a paper on WattDepot, and I agree. The maximum page length is 6 pages.
  • Behavior, Energy, Climate Change 2010. Philip attended BECC 2009 and it seems like an ideal conference for the dorm energy competition results. The call for abstracts (presentations & posters) goes out in March, with a mid-May deadline to submit abstracts. The conference is November 14-17 in Sacramento. There is no paper required for the presentation, just slides, so we could potentially present some actual results in the presentation (which wouldn’t be available in May when the abstract is submitted).
  • Hawaii International Conference on System Sciences 44. This is happening on Kauai January 4-7 2011. The deadline for papers is June 15. Jennifer Mankoff’s group has had a series of papers about StepGreen and related work at HICSS so this seems like a good venue, and the travel will be much easier. :)
  • CHI 2011. Apparently this is happening May 7-12 in Vancouver (BC I assume?). There have been a variety of papers on supporting green/sustainable behavior in CHI before, and the CHI community is a large and vibrant one.
  • Ubicomp 2010. There was plenty of sustainability work at Ubicomp 2008 (including a workshop I attended), so this is a possible venue. However, the submission deadlines are rather soon (March for papers) so it’s probably more realistic for 2011.
  • Pervasive 2011. This is similar to Ubicomp, but happens in May in Europe. Submission deadline is mid-October.
  • IEEE Sensors 2010. May 4 is the abstract submission deadline, with the conference happening November 1-4 at the Hilton Waikoloa. Maximum paper length is 4 pages. This is perhaps less relevant than the IEEE Smart Grid conference, but still worthy of consideration.

Journals:

  • International Journal of Sustainability in Higher Education. This is where the Oberlin dorm energy contest paper was published, so it seems an obvious choice. Not the broadest appeal though.
  • Environment & Behavior.
  • IEEE Transactions on Smart Grid. Journal to be launched soon.
  • IEEE Transactions on Sustainable Energy. Another journal to be launched soon. The smart grid one looks more relevant; this one looks to be focused on energy generation from renewables.

I’m sure more will come up as I read more, but this is a good starter list.