This week was mostly preliminary activities. As mentioned in my intro post, I’m currently investigating whether I can determine the amount of greenhouse gas produced by transportation using GPS data points. The basic idea is this:
- Collect data on where a person is located and their speed every few seconds via GPS
- Upload that information to a computer periodically
- Infer from GPS data what modes of transportation (walking, biking, driving, riding bus, riding train, riding boat, flying, etc) were used for each segment of each trip
- Determine the length and approximate speed of each segment
- Using the length, speed, and mode of transport for each segment, estimate the amount of greenhouse gas produced
- Provide a variety of analyses on the data such as: total GHG production per day or per week, calories expended, “what if” comparisons between modes of transport (how would my GHG production change if I traded in my Impala for a Prius?)
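The "length and approximate speed of each segment" step above could be sketched roughly like this. It is only a minimal sketch: it assumes each GPS fix has already been parsed into a `(unix_seconds, lat, lon)` tuple (I haven't looked at the AGL3080's actual file format yet), and the function names `haversine_km` and `segment_summary` are made up for illustration. It uses the standard haversine great-circle formula with a mean Earth radius, which should be plenty accurate at this scale.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def segment_summary(fixes):
    """Summarize one segment.

    fixes: list of (unix_seconds, lat, lon) tuples in time order
           (an assumed format, not the logger's native one).
    Returns (length_km, average_speed_kmh).
    """
    # Sum the distances between each pair of consecutive fixes.
    length_km = sum(
        haversine_km(a[1], a[2], b[1], b[2]) for a, b in zip(fixes, fixes[1:])
    )
    elapsed_h = (fixes[-1][0] - fixes[0][0]) / 3600.0 if len(fixes) > 1 else 0.0
    speed_kmh = length_km / elapsed_h if elapsed_h > 0 else 0.0
    return length_km, speed_kmh
```

Calling `segment_summary` on the fixes for one segment gives the two numbers the later GHG estimate would need. Noisy fixes (for example, jitter while standing still indoors) would inflate the length, so real data would probably need some filtering first.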
Obviously, to get started I need data, so I’ve started carrying the AMOD AGL3080 GPS data logger daily. The reason I went with the AGL3080 is that it stores the data onto built-in flash memory that can be mounted like any flash drive. Most GPS data loggers have some custom driver (OS-specific) that you have to install in order to get the data off, but the AGL3080 will work with any computer that can mount a USB flash drive.
I figure that for regular usage, there can’t be much special effort required to gather the GPS data. I’ve been clipping the included strap to one of my belt loops and turning it on before heading out of the house in the morning and turning it off when getting home at night. This is wasteful in terms of battery usage, since the data logger spends several hours each day inside with no view of the GPS satellites (and no useful data to record even if it could see the sky). However, requiring the user to remember to turn the logger on before going outside and off when coming inside seems like a recipe for constantly missing data. I’m also not making any special effort to put the logger somewhere with a better view of the sky when I’m in a car or riding the bus, to reflect how it would be used by a normal user.
I’ve been collecting data for about a week now, but I haven’t downloaded or analyzed it yet. The batteries (rechargeable Eneloops) seem to last about 2 days before needing to be recharged. In addition to the GPS data, I’ve started a travel log in a little notebook, recording where I went each day and via which modes of transportation. This should be helpful when segmenting the data, since it gives me some “ground truth” on what mode I was using when the data was collected.
Converting from GPS data to mode of transportation is where the magic happens. It may be the case that certain types of transport, like jogging and bicycling, cannot be reliably distinguished using just GPS data. For a pair like that we might not care, since both have very low GHG emissions. Telling car trips and bus trips apart could be tricky, though, and it’s an important distinction. I am hoping that the raw GPS data will be enough (buses usually make frequent stops, they accelerate and decelerate more slowly than cars, and riders often wait at stops before getting on the bus), but bus routes are another source of data for differentiating the two modes. Bus route data could be retrieved from things like Google Transit or bus maps, but it could also be crowdsourced, allowing users of the system to indicate which segments were by bus and then applying that to other users’ data.
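To make the bus-versus-car cues a bit more concrete, here is a rough sketch of two features that could be pulled out of a segment's speed samples: how many separate stops it contains, and the largest speed change per second. This is not a classifier, just feature extraction. The function names and the 2 km/h stop threshold are my own guesses for illustration, and it assumes the speeds arrive as a plain list sampled at a fixed interval.

```python
def count_stops(speeds_kmh, stop_threshold_kmh=2.0):
    """Count runs of consecutive samples at or below the threshold.

    The default threshold is a guess, not a tuned value.
    """
    stops = 0
    stopped = False
    for speed in speeds_kmh:
        if speed <= stop_threshold_kmh:
            if not stopped:
                stops += 1  # a new stop begins here
            stopped = True
        else:
            stopped = False
    return stops


def peak_acceleration(speeds_kmh, sample_interval_s):
    """Largest absolute speed change between consecutive samples,
    in km/h per second. Assumes a fixed sampling interval."""
    if len(speeds_kmh) < 2:
        return 0.0
    biggest_jump = max(abs(b - a) for a, b in zip(speeds_kmh, speeds_kmh[1:]))
    return biggest_jump / sample_interval_s
```

Whether features like these actually separate buses from cars is exactly what the travel log's ground truth should let me check. With samples only every few seconds, the acceleration figure will be heavily smoothed.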
If GPS data turns out to be insufficient, 3-axis accelerometer data could be another cheap, simple source for disambiguation. The acceleration profiles of different vehicles, fused with GPS data, might tell a bus from a car.
Once the segmentation/identification has been done, there is still the issue of converting it into GHG emissions. My impression (as yet not backed by citations) is that this is a fuzzy area where the results depend greatly on what assumptions you make. If this turns out to be the case, then I plan to expose the GHG calculation formula for user modification. This lets users see what assumptions have been made (important since the results could be deliberately skewed in favor of one mode of transport or a particular product), experiment with those assumptions, and see if they can come up with something better.
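One way the "exposed formula" idea might look is a plain per-kilometer factor table that the user can edit. This is a deliberately simplified sketch: it ignores speed entirely, the function name `estimate_ghg_kg` is hypothetical, and the factor values in the usage example are placeholders I made up to show the mechanics, not real emission figures.

```python
def estimate_ghg_kg(segments, factors_kg_per_km):
    """Estimate total GHG in kg for a list of trip segments.

    segments: list of (mode, length_km) tuples.
    factors_kg_per_km: user-editable dict mapping mode -> kg of GHG per km.
    Raises KeyError if a mode has no factor, so that missing
    assumptions are visible instead of silently counted as zero.
    """
    total = 0.0
    for mode, length_km in segments:
        total += length_km * factors_kg_per_km[mode]
    return total


# PLACEHOLDER factors for illustration only; not real emission data.
factors = {"walk": 0.0, "bus": 0.25, "car": 0.5}
trips = [("walk", 1.0), ("car", 10.0), ("bus", 4.0)]

baseline = estimate_ghg_kg(trips, factors)

# A "what if" comparison: copy the table and change one assumption.
what_if = estimate_ghg_kg(trips, {**factors, "car": 0.25})
```

Keeping the factors in a separate table means a skeptical user can inspect and change every assumption without touching the calculation itself, and a "what if" comparison is just a second call with a modified table.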
Past week accomplishments:
- Started logging GPS data and recording travel log
- Came up with ideas for ICS 413 project
- Started looking at LaTeX again
- Got permission from grad chair to submit research portfolio in web format
- Thought about and wrote up more details on ideas about tracking transportation GHG emissions via GPS data logging (above)
Hours worked: 10ish (target: 5 credits * 3 hours/credit = 15 hr)
Plans for coming week:
- Start web portfolio
- Download initial GPS data, start visualizing in Google Maps/Earth and maybe Excel
- Read AGL3080 manual, check out available logging rates
- Investigate LaTeX options for Mac OS X
- Install LaTeX
- Try getting uhthesis LaTeX style working on modern LaTeX
Pointers to work products:
- Just this blog post for this week
Google’s Chrome browser (currently Windows-only) looks interesting. It’s based on WebKit (which also drives Safari), so hopefully this means that we’ll see better support for Safari around the web (and certainly on Google sites).
The Chrome comic book explains the details of why it’s interesting. They even discuss their testing process (they apparently do regression testing on millions of crawled pages from Google’s caches) on page 9. They mention TDD on page 11. 🙂
DBEDT is now producing monthly energy reports. Lots of interesting data in there. Philip has suggested I throw it into Swivel and/or Many Eyes, but I haven’t undertaken that yet.
Mozilla Ubiquity seems like an interesting project. The demo video looks cool, though it’s not clear how rigged the demo is. 🙂