Monitoring My Energy Usage (Part 2)

Published on 2 February 2011

Following on from part 1, I am looking to reduce my energy usage with the help of a Current Cost Envi, a device which lets me capture data on how much energy I use. Now that I've received my Envi and data cable, it's time to start capturing that data!

The Envi provides its data over serial, and the data cable is simply a serial-to-USB converter (using the Prolific PL-2303 chipset, if you're interested). Once the drivers are installed it just shows up as a new serial port; in my case, on my Windows laptop, "COM4". Thankfully, the device is nice and simple and nothing special is required to get the data: there is no need to send any commands and no horrible protocol to figure out. The data is just spat out every few seconds, and it's up to you to be listening for it and to process it however you want.
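Because the device simply streams data, the listening side mostly has to reassemble whatever chunks the serial port hands over into complete messages. A minimal sketch of that buffering step follows. It assumes each message ends with a newline, which the text above does not confirm, and `createLineBuffer` and the sample strings are made up for the illustration. It is written in JavaScript to match the map/reduce listings later on, not the .NET logger itself.

```javascript
// Hypothetical helper: collects raw serial chunks and hands complete lines
// to a callback. Assumption (not from the article): one message per line.
function createLineBuffer(onLine) {
    let pending = "";
    return function feed(chunk) {
        pending += chunk;
        const parts = pending.split(/\r?\n/);
        pending = parts.pop(); // the last piece is incomplete (or empty)
        for (const line of parts) {
            if (line.length > 0) onLine(line);
        }
    };
}

// Usage: a chunk boundary can fall anywhere inside a message.
const received = [];
const feed = createLineBuffer((line) => received.push(line));
feed("reading-1\nread");
feed("ing-2\r\nreading-3");
// "reading-1" and "reading-2" have been delivered;
// "reading-3" stays buffered until its newline arrives.
```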

The Plan

I decided that the first job was to write a small app which can listen to this incoming data, process it and store it somewhere useful for later processing. Of course, just saying "store it somewhere" isn't really a plan as such, so I gave some thought as to what I was actually going to use for this. I decided on the following:

A .NET app for gathering the data (henceforth known as the "logger")
MongoDB to store the data
Rails or ASP.NET MVC for a front-end app (I haven't decided on this yet, it'll come in part 3)

The reason I went with .NET for the logger app is mainly that I work with C# on a day-to-day basis and am therefore familiar with it. Ruby was a close second choice (and there is no reason I can't switch later if I change my mind), but it's just as easy for me to set up a new Windows VM as it would be to set up a Linux VM to run all this on. Since working with serial ports is new to me, I thought it best to stick to at least some things which are familiar.

The Data

The data comes in several different flavours:

Instant readouts (containing the current energy consumption, current temperature as sensed by the unit and some metadata about the unit and sensor)
Historical data, giving a breakdown of how much energy (kWh) was used in two-hourly chunks
Historical data, giving a breakdown of how much energy (kWh) was used in daily chunks
Historical data, as above but broken down in monthly* chunks

* It's not clear to me how the monthly data is actually calculated as the device does not have any date which can be set, only the time. I assume it actually breaks it down into 30-day chunks.

I decided that I would record both the instant readings and the historical readings for the two-hourly chunks and daily chunks. The monthly (or whatever) data seems somewhat irrelevant since it probably won't match the real calendar and the trends can be calculated from the daily data.

I also decided that I would split up the temperature and store it separately since it's a totally separate piece of information. If I ever find myself wanting to extend the system with temperature readings from anything else, they can all live together in the same collection. I ended up with the following list of collections and document structures:

Instant temperature readings (in °C): temperatures

{
    "timestamp": /* a Date instance containing the date/time */,
    "temperature": /* the recorded temperature */
}

Instant energy readings (in watts): energy-instant

{
    "timestamp": /* a Date instance containing the date/time */,
    "meta": { 
        /* some metadata from the device inc. 'days since birth' 
           and current time on device */ 
    },
    "sensor": {
        "number": 0, /* the sensor number; 0 = whole house, 1-9 = appliance sensors */
        /* a couple of other bits of data about the sensor */
    },
    "readings": {
        "channels": { "1": 400 }, /* the number of channels can vary per sensor */
        "total": 400 /* the total of all channels - this is the important one */
    }
}
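To illustrate the split between the two collections, here is a rough sketch that turns one parsed instant reading into a temperatures document and an energy-instant document. The input shape (`tempC`, `sensorNumber`, `channels`) and the function name `toDocuments` are assumptions made for the example; the device metadata is left out for brevity.

```javascript
// Hypothetical input shape: { tempC, sensorNumber, channels: { "1": 400 } }.
// Output field names follow the document structures described above.
function toDocuments(parsed, receivedAt) {
    const channels = Object.assign({}, parsed.channels);
    let total = 0;
    for (const key of Object.keys(channels)) {
        total += channels[key]; // the total is the sum of all channels
    }
    return {
        temperature: {
            timestamp: receivedAt,
            temperature: parsed.tempC
        },
        energy: {
            timestamp: receivedAt,
            sensor: { number: parsed.sensorNumber },
            readings: { channels: channels, total: total }
        }
    };
}

// Usage with made-up values: a two-channel sensor.
const docs = toDocuments(
    { tempC: 21.5, sensorNumber: 0, channels: { "1": 400, "2": 250 } },
    new Date(2011, 1, 1, 20, 1, 6)
);
// docs.temperature goes to "temperatures", docs.energy to "energy-instant".
```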

Historical two-hourly energy readings (in kWh): energy-history-twohourly

{
    "timestamp": /* a Date instance corresponding to the start of the 2-hour slot */,
    "timestamp-end": /* a Date corresponding to the end of the slot - only included for my own clarification */,
    "value": 0.2, /* the recorded value for this time period */
    "unit": "kwhr", /* the device-returned unit for this value */
    "sensor": { "number": 1 },
    "meta": {
        /* meta-data similar to instant energy readings */
    }
}

Historical daily energy readings (in kWh): energy-history-daily

The document structure for this collection matches that of the two-hourly history, except that the timestamp is for the start of the day in question (e.g. '2011-02-01 00:00:00') and the timestamp-end field is not included.
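The difference between the two history document shapes can be sketched as a single builder function. This is only an illustration: `makeHistoryDoc` and its `kind` argument are hypothetical names, and the metadata block is omitted.

```javascript
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;

// Builds a history document; only two-hourly documents get "timestamp-end".
// "kind" is a made-up switch: "twohourly" or "daily".
function makeHistoryDoc(kind, start, value, sensorNumber) {
    const doc = {
        timestamp: start,
        value: value,
        unit: "kwhr", // the device-returned unit, stored as-is
        sensor: { number: sensorNumber }
    };
    if (kind === "twohourly") {
        doc["timestamp-end"] = new Date(start.getTime() + TWO_HOURS_MS);
    }
    return doc;
}

// Usage with made-up values.
const slotDoc = makeHistoryDoc("twohourly", new Date(2011, 1, 1, 14, 0, 0), 0.2, 1);
const dayDoc = makeHistoryDoc("daily", new Date(2011, 1, 1, 0, 0, 0), 9.4, 1);
```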

You may be thinking that I've included a lot of unnecessary stuff, and most of it won't be needed. You'd be right, but I decided to put it in anyway in case I find myself wanting it in the future. Disk space is cheap and I can always run a little script over the data later to slim it down if I want to—that's certainly easier than building a time machine to go back and collect the missing data!

Data Overload

At this point, you might be wondering how frequently this data is provided by the unit. The answer is this: the instant temperature/energy readings are sent by the device each time it receives a reading from the sensor, which is (roughly) every six seconds. The two-hourly data is sent every two hours, but covers a span of time going back days. The daily data is sent once a day, but again spans a large period of time. The same goes for the monthly data, but as I said before, I'm ignoring that.

Having most of the two-hourly and daily updates essentially repeated each time means I have to check each timestamp (after calculating it) to make sure I don't accidentally store data for the same period twice. Thankfully that is pretty easy: I simply determine the timestamp for the period and then check the database to see if there's already data for it. If there is, I can ignore it, and if there is not, I can add the new data.
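The check-then-insert step might look roughly like the sketch below. An in-memory `Map` stands in for the MongoDB collection so the example runs on its own; keying on the sensor number as well as the timestamp is an assumption, and `createHistoryStore` and `insertIfNew` are invented names.

```javascript
// In-memory stand-in for a history collection. A real logger would query
// the database instead of a Map.
function createHistoryStore() {
    const seen = new Map();
    return {
        // Stores the document only if that period is not already recorded.
        insertIfNew(doc) {
            const key = doc.sensor.number + "|" + doc.timestamp.toISOString();
            if (seen.has(key)) return false; // already have this period
            seen.set(key, doc);
            return true;
        },
        count() {
            return seen.size;
        }
    };
}

// Usage: the device repeats a slot it has already reported.
const store = createHistoryStore();
const slotStart = new Date(Date.UTC(2011, 1, 1, 14, 0, 0));
const firstInsert = store.insertIfNew({ timestamp: slotStart, value: 0.2, sensor: { number: 1 } });
const repeatInsert = store.insertIfNew({ timestamp: new Date(slotStart.getTime()), value: 0.2, sensor: { number: 1 } });
// firstInsert is true, repeatInsert is false, and only one document is kept.
```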

The second problem is that having a reading for the temperature and current energy usage every six seconds is too much for most things. Sure, having such a fine resolution is great, but if I want to graph the usage over a day, or a week, it is going to be far too many data points.
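To put a number on "too many", here is the arithmetic as a tiny script. It takes the six-second interval at face value, although the real interval is only roughly that.

```javascript
const SECONDS_BETWEEN_READINGS = 6; // "roughly" every six seconds

// Number of readings produced in a given span of seconds.
function readingsPer(seconds) {
    return Math.floor(seconds / SECONDS_BETWEEN_READINGS);
}

const perMinute = readingsPer(60);             // 10
const perDay = readingsPer(24 * 60 * 60);      // 14400
const perWeek = readingsPer(7 * 24 * 60 * 60); // 100800
```

A graph of one week would therefore have to cope with around a hundred thousand points per series.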

To solve this, the logger also runs a map/reduce operation every so often (currently once an hour) which averages the energy usage total for each minute and writes the results to another collection (energy-average-oneminute). I do the same for the temperature, but over ten-minute periods (temperature-average-tenminutes), since it is less likely to change significantly over short periods of time. Initially, I tried a ten-minute average for both, but this lost too much resolution for the energy data: significant peaks which only last for a short period of time disappeared completely.

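// Map/reduce for averaging out energy consumption each minute
function map() {
    var itemTS = this.timestamp;
    var timestamp = new Date(
        itemTS.getFullYear(),
        itemTS.getMonth(),
        itemTS.getDate(),
        itemTS.getHours(),
        itemTS.getMinutes(),
        0, 0
    ); // the timestamp of the minute we are covering (e.g. 20:01:00)

    // Emit a partial sum and count rather than the bare value, so that
    // reduce can safely be re-run on its own output.
    emit(timestamp, { sum: this.readings.total, count: 1 });
}

function reduce(key, values) {
    // MongoDB may call reduce more than once for the same key and feed it
    // earlier results, so the output must have the same shape as the input.
    // Averaging here would produce an average of averages.
    var result = { sum: 0.0, count: 0 };
    for (var i = 0; i < values.length; i++) {
        result.sum += values[i].sum;
        result.count += values[i].count;
    }
    return result;
}

// Passed to the map/reduce command as its finalize option; runs once per
// key after all reducing is done and turns the totals into the average.
function finalize(key, value) {
    return value.sum / value.count;
}


// Map/reduce for averaging out temperature every 10 mins
function map() {
    var itemTS = this.timestamp;
    var minutes = Math.floor(itemTS.getMinutes() / 10) * 10;  // 23 => 20, 57 => 50, etc.
    var timestamp = new Date(
        itemTS.getFullYear(),
        itemTS.getMonth(),
        itemTS.getDate(),
        itemTS.getHours(),
        minutes,
        0, 0
    ); // the timestamp of the ten mins we are covering (e.g. 20:10:00)

    emit(timestamp, { sum: this.temperature, count: 1 });
}

// Reduce and finalize are the same as before.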
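
The loss of short peaks with a ten-minute window is easy to reproduce outside the database. The sketch below is plain JavaScript rather than a map/reduce job, the sample data is invented, and the windows are aligned to UTC rather than local time. It averages the same readings into one-minute and ten-minute windows and compares the highest value in each.

```javascript
// Averages readings into fixed windows; no MongoDB involved.
function averageByWindow(readings, windowMinutes) {
    const windowMs = windowMinutes * 60 * 1000;
    const buckets = new Map(); // window start (ms) -> { sum, count }
    for (const r of readings) {
        const start = Math.floor(r.timestamp.getTime() / windowMs) * windowMs;
        const b = buckets.get(start) || { sum: 0, count: 0 };
        b.sum += r.total;
        b.count += 1;
        buckets.set(start, b);
    }
    return Array.from(buckets, ([start, b]) => ({
        timestamp: new Date(start),
        average: b.sum / b.count
    }));
}

// Made-up sample: ten minutes at 400 W, with a 3000 W spike during minute 4.
const base = Date.UTC(2011, 1, 1, 20, 0, 0);
const sample = [];
for (let i = 0; i < 100; i++) { // one reading every six seconds
    const minute = Math.floor(i / 10);
    sample.push({
        timestamp: new Date(base + i * 6000),
        total: minute === 4 ? 3000 : 400
    });
}

const oneMinute = averageByWindow(sample, 1);
const tenMinute = averageByWindow(sample, 10);
const peakOneMinute = Math.max(...oneMinute.map((p) => p.average)); // 3000
const peakTenMinute = Math.max(...tenMinute.map((p) => p.average)); // 660
```

With one-minute windows the spike survives intact, while the ten-minute window flattens it to 660 W.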

Future Issues

I am not sure whether constantly running this map/reduce operation across all the data is feasible, as the data is going to keep growing every day. I am considering clearing out older instant data after it's been averaged down and then running subsequent map/reduce operations so that they merge in their data rather than rewriting it each time.
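
One thing to watch when merging in results is that two averages cannot be combined correctly unless the number of readings behind each is known. A hedged sketch of the idea, in plain JavaScript with invented names and numbers: keep a running sum and count per minute, merge those, and derive the average only when it is read. MongoDB's map/reduce output options include modes for merging or re-reducing into an existing collection, which may be a fit here, though this example does not use them.

```javascript
// Hypothetical sketch: merge newly computed partial aggregates into stored
// ones. Each entry is { sum, count }, so merged averages stay exact.
function mergeAggregates(stored, incoming) {
    const merged = new Map(stored);
    for (const [key, part] of incoming) {
        const prev = merged.get(key) || { sum: 0, count: 0 };
        merged.set(key, {
            sum: prev.sum + part.sum,
            count: prev.count + part.count
        });
    }
    return merged;
}

function averageOf(part) {
    return part.sum / part.count;
}

// Usage: minute 20:01 was partly aggregated earlier (3 readings averaging
// 400 W) and 7 more readings averaging 700 W arrive in the next run.
const stored = new Map([["2011-02-01T20:01", { sum: 1200, count: 3 }]]);
const incoming = new Map([["2011-02-01T20:01", { sum: 4900, count: 7 }]]);
const merged = mergeAggregates(stored, incoming);
const exactAverage = averageOf(merged.get("2011-02-01T20:01")); // 610
const naiveAverage = (400 + 700) / 2; // 550, which is wrong
```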

I'm not too sure about this part yet; I'll keep an eye on it and watch how fast the data grows. If it turns out to be getting unwieldy, my first step will be to slim down the amount of extra data stored alongside the important stuff and go from there.

What's Next...?

Now that I've got the data being captured (and set up a backup schedule), it's time to write a web front-end for it all. This will give me the ability to view historical data, view recent data (e.g. "today's energy usage" and "yesterday's energy usage") and hopefully see the current consumption from the latest reading. And of course, it'll all have to be presented nicely. I'll write about that in part 3.