Raw data

The River Thames data logger above Shiplake Lock records data in an RRD database, which is periodically dumped to an XML file and archived here.

The data is open source and you are welcome to use it, but there are a few things that you need to be aware of:

  1. The sensors are MCP9701A chips, which have a typical accuracy of +/- 1C and max error of -4/+2C. I have not attempted to check the calibration. The analogue-to-digital conversion is done by an ATMega328P.
  2. The original river sensor was erratic at times. See the bottom graph at https://dl1.findlays.net/bin/display/rivertemp - in 2014, 2015, and 2016 it spent some time reading below 0C! The sensor was embedded in silicone but I suspect that water got in anyway.
    I replaced the river sensor in early October 2016 and it appears to be OK now. This one is in a sealed copper pipe.
  3. The river sensor is some way below the surface - in the silt of the river bed near the bank. It is probably 1m below the surface in normal conditions - rather more in floods!
  4. River temperature seems to follow air temperature more closely than I had originally expected. Where data is missing or dubious it will probably be OK to use a time-smoothed version of air temp.
  5. The data is stored in RRDtool - a constant-size time-series database format. The datalogger collects one measurement every minute and averages these into the 5-minute values that are stored. Further averaging produces 1-hour and 6-hour values, which are held for longer:
    # Archive periods based on 300s primary data points:
    #       samples/entry   time/entry      entries         archive length
    #       1               5m              10000           34 days
    #       12              1h              10000           416 days
    #       72              6h              15000           10 years
  6. Data is dumped from the database to XML on the first day of every month. There are a couple of older files, allowing access to more frequent readings back to about January 2015. All files contain 6-hour readings back to October 2012.
    Each XML file has a header and three sections containing 5-minute, 1-hour, and 6-hour data. Each row has these items:
  7. 6-hour averages are also dumped each month, and these are stored in CSV format, which is much easier to handle. This output started in November 2019 but the data goes back to about November 2012.
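
The archive lengths quoted in item 5 follow directly from the 300s step, the number of samples averaged into each entry, and the number of entries kept. A minimal Python sketch of that arithmetic is below. The names STEP_SECONDS, ARCHIVES, and archive_span are illustrative only and do not come from the datalogger's own code; the figures are copied from the archive table in item 5.

```python
from datetime import timedelta

# Primary data point interval, taken from the archive table (300s).
STEP_SECONDS = 300

# (samples per entry, number of entries), copied from the archive table.
ARCHIVES = [(1, 10000), (12, 10000), (72, 15000)]


def archive_span(samples_per_entry, entries, step=STEP_SECONDS):
    """Return (time covered by one entry, total archive length) as timedeltas."""
    per_entry = timedelta(seconds=step * samples_per_entry)
    return per_entry, per_entry * entries


for samples, entries in ARCHIVES:
    per_entry, total = archive_span(samples, entries)
    # total.days is truncated to whole days, matching the table's rounding.
    print(f"{samples:>3} samples/entry  {per_entry} per entry  "
          f"{entries} entries  {total.days} days "
          f"({total.days / 365.25:.1f} years)")
```

This is a quick way to work out which archive will still hold a given date when a file was dumped: anything older than about 34 days is only available at 1-hour resolution, and anything older than about 416 days only at 6-hour resolution.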

findlays.net is run by Andrew Findlay