Raw data
The River Thames data logger above Shiplake Lock records data in an RRD database,
which is periodically dumped to an XML file and archived here.
The data is open source and you are welcome to use it, but there are a
few things that you need to be aware of:
-
The temperature sensors are MCP9701A chips, which have a typical accuracy of
+/- 1C and max error of -4/+2C. I have not attempted to check the
calibration. The analogue-to-digital conversion is done by an
ATMega328P.
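
As a rough illustration of what that conversion involves, here is a minimal Python sketch that turns a raw ADC count into a temperature. The 400 mV output at 0C and the 19.5 mV/C slope are the nominal MCP9701A datasheet figures, and the 10-bit range matches the ATMega328P ADC. The 5.0 V reference voltage and the function names are assumptions made for this example; this is not the logger's actual firmware.

```python
# Nominal MCP9701A transfer function (datasheet typical values).
V_AT_0C = 0.400         # volts output at 0C
VOLTS_PER_DEG = 0.0195  # volts per degree C

ADC_STEPS = 1024        # 10-bit ADC on the ATMega328P
V_REF = 5.0             # ASSUMPTION: reference voltage used by the logger


def volts_to_celsius(volts):
    """Convert the sensor output voltage to degrees C."""
    return (volts - V_AT_0C) / VOLTS_PER_DEG


def adc_to_celsius(counts, v_ref=V_REF):
    """Convert a raw 10-bit ADC count to degrees C."""
    volts = counts * v_ref / ADC_STEPS
    return volts_to_celsius(volts)
```

Under these assumptions one ADC step is about 4.9 mV, or roughly 0.25C, so the resolution is much finer than the stated sensor accuracy.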
-
The original river sensor has been erratic at times. See the bottom
graph at https://dl1.findlays.net/bin/display/rivertemp :
2014, 2015, and 2016 all included periods when it read below 0C! The sensor was
embedded in silicone but I suspect that water got in anyway.
I replaced the river sensor in early October 2016 and it appears to be
OK now. This one is in a sealed copper pipe.
-
The river sensor is some way below the surface - in the silt of
the river bed near the bank. It is probably 1m below the surface
in normal conditions - rather more in floods!
-
River temperature seems to follow air temperature more closely than
I had originally expected. Where data is missing or dubious it will
probably be OK to use a time-smoothed version of air temp.
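
One possible way to build such a substitute is sketched below in Python. It uses an exponential moving average, which is only one choice of smoothing; the `alpha` value and the function names are assumptions for illustration. Dubious river readings would need to be set to NaN first, and the sketch makes no attempt to correct for any offset between air and river temperature.

```python
import math


def smooth(values, alpha=0.1):
    """Exponential moving average; NaN inputs keep the previous estimate."""
    out = []
    est = math.nan
    for v in values:
        if not math.isnan(v):
            # First valid sample seeds the estimate, later ones nudge it.
            est = v if math.isnan(est) else est + alpha * (v - est)
        out.append(est)
    return out


def fill_river(river, air, alpha=0.1):
    """Replace NaN river readings with the smoothed air temperature."""
    smoothed = smooth(air, alpha)
    return [s if math.isnan(r) else r for r, s in zip(river, smoothed)]
```

A smaller `alpha` gives heavier smoothing, which may suit a sensor buried in river silt better than a value close to 1.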
-
River level is measured in cm above an arbitrary datum that corresponds
roughly to the bank level, so in normal conditions the value is around
-50cm. Measurements are taken once per minute, and the values stored
are the 5-minute average, the minimum and the maximum. The level sensor
is ultrasonic and its readings are disturbed by atmospheric conditions at
times, so the maximum level in particular is often much too high.
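
If you need to use the maximum values, a simple plausibility check may help. The Python sketch below treats a maximum as suspect when it sits far above the interval average and falls back to the average in that case. The 20 cm limit and the function names are assumptions chosen for the example, not values derived from the data.

```python
def suspect_max(avg_cm, max_cm, limit_cm=20.0):
    """True if the interval maximum is implausibly far above the average."""
    return (max_cm - avg_cm) > limit_cm


def clean_max(avg_cm, max_cm, limit_cm=20.0):
    """Fall back to the average when the maximum looks like a sensor glitch."""
    return avg_cm if suspect_max(avg_cm, max_cm, limit_cm) else max_cm
```

The limit would need tuning against real records, since a genuine rapid rise during a flood could also exceed it.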
-
The data is stored in RRDtool - a constant-size time-series database
format. The datalogger collects one measurement every minute and
averages these to produce 5-minute intervals in storage. Further
averaging produces 1-hour and 6-hour intervals which are held for
longer:
# Archive periods based on 300s primary data points:
#   samples/entry   time/entry   entries   archive length
#         1             5m        10000       34 days
#        12             1h        10000      416 days
#        72             6h        15000       10 years
Timestamps in RRDtool databases are relative to the Unix epoch
(seconds since 1st January 1970, UTC). The graph generators convert
to UK local time for display. The CSV and XML files mentioned below
have timestamps in UTC.
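
The archive lengths in the table above follow directly from the 300-second step, and the timestamps can be converted to UK local time with the standard library. The Python sketch below shows both; the function names are made up for this example, and `zoneinfo` needs Python 3.9 or later with time zone data available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

STEP_SECONDS = 300  # primary data point interval from the table above


def archive_days(samples_per_entry, entries):
    """Length of one archive in days."""
    return samples_per_entry * STEP_SECONDS * entries / 86400


def to_uk_local(epoch_seconds):
    """Convert a Unix timestamp (seconds, UTC) to UK local time."""
    utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return utc.astimezone(ZoneInfo("Europe/London"))
```

For example, `archive_days(72, 15000)` gives 3750 days for the 6-hour archive, and a midday UTC timestamp in July converts to 13:00 local time because of British Summer Time.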
-
If you want recent data in an easy-to-use format then you should look at
these files which are regenerated every 5 minutes:
Be aware that some samples (often the latest one) will show 'nan' (not-a-number) values
as not enough data has arrived to complete the calculation.
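
When reading such data programmatically it is usually simplest to skip incomplete samples. The Python sketch below shows one way to do that. It assumes comma-separated rows with a Unix timestamp in the first column; the sample text and column layout are hypothetical, so check them against the real files before relying on this.

```python
import csv
import io
import math

# HYPOTHETICAL sample; the real files may use different columns or headers.
SAMPLE = """1700000000,8.1,9.4,-51.2
1700000300,8.0,9.4,-51.0
1700000600,nan,nan,nan
"""


def complete_rows(text):
    """Yield (timestamp, values) for rows that contain no NaN values."""
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        # float() accepts 'nan' directly, so no special parsing is needed.
        values = [float(field) for field in row[1:]]
        if any(math.isnan(v) for v in values):
            continue
        yield int(row[0]), values
```

Calling `list(complete_rows(SAMPLE))` keeps the first two rows of the sample and drops the last one.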
-
All data is dumped from the database to XML on the first day of every month.
There are a couple of older files, allowing access to more frequent readings
back to about January 2015. All files contain 6-hour readings back to
October 2012 (or 3750 days before the file was made if that is later).
Each XML file has a header and three sections containing 5-minute,
1-hour, and 6-hour data. Each row has these items:
- Air temperature (C)
- River temperature (C)
- River level (cm)
- Min river level in the interval (cm)
- Max river level in the interval (cm)
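
A minimal Python sketch for pulling rows out of one of these files is shown below. It assumes the files are standard `rrdtool dump` output, where each row sits on one line preceded by a comment holding the date and the Unix timestamp. The field names are labels invented for this example, in the order of the list above, and the sample row is made up.

```python
import re

# Field names are labels chosen for this sketch, in the order listed above.
FIELDS = ("air_temp", "river_temp", "level", "level_min", "level_max")

# ASSUMPTION: rows look like standard `rrdtool dump` output, e.g.
# <!-- 2016-10-01 00:00:00 UTC / 1475280000 --> <row><v>1.2e+01</v>...</row>
ROW_RE = re.compile(r"<!--.*?/\s*(\d+)\s*-->\s*<row>(.*?)</row>")
VALUE_RE = re.compile(r"<v>\s*([^<\s]+)\s*</v>")


def parse_rows(xml_text):
    """Yield (unix_timestamp, {field: value}) for each row in a dump."""
    for stamp, body in ROW_RE.findall(xml_text):
        values = [float(v) for v in VALUE_RE.findall(body)]  # 'NaN' -> nan
        yield int(stamp), dict(zip(FIELDS, values))


# Made-up sample row for demonstration only.
SAMPLE = (
    "<!-- 2016-10-01 00:00:00 UTC / 1475280000 --> "
    "<row><v>1.25e+01</v><v>1.40e+01</v><v>-5.0e+01</v>"
    "<v>-5.1e+01</v><v>NaN</v></row>"
)
```

This sketch does not distinguish the 5-minute, 1-hour, and 6-hour sections; to separate them you would also need to track which section each row belongs to, or use a full XML parser.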
-
6-hour averages are also dumped each month, and these are stored in CSV format
which is much easier to handle. This output started in November 2019 but the
data goes back to about November 2012.
findlays.net is run by Andrew Findlay