I have been intrigued by these graphs of sea level rise
which the UK Environment Agency (EA) used to display on this page of their website, although the page was removed some time around 13 October 2011.
http://www.environmentagency.gov.uk/cy/ymchwil/llyfrgell/data/34449.aspx
It was claimed that these graphs indicate an average sea level rise of between 2.3 and 4mm/year sea level rise. These graphs are part of the way EA attempts to justify its policy of managed realignment (abandoning or knocking down sea walls and flooding land). The argument seems to be that sea level rise means the we can not afford to maintain sea walls so we might as well get rid of them.
However EA are spending billions of tax payer pounds on their preferred policy, outlined in the Shoreline Management Plan. So if the claimed rate of sea level rise is too large we should ask why are EA proceeding with this plan.
Although the scale of these graphs is very small it seemed that these figures are too high. Just looking at the graphs, without a magnifying glass, and doing approximate sums it seemed that.
Location  Rise in Sea Level  Period  Rate of Rise 

North Shields  200 mm  100 years  2 mm/year 
Sheerness

200 mm  170 years  1.2 mm/year 
Liverpool  300 mm  150 years  2 mm/year 
Newlyn  100 mm  80 years  1.25 mm/year 
So the sea level rise was actually 1 to 2mm/year, half what was claimed.
Fortunately the data on sea level rise is publicly available at Permanent Service For Mean Sea Level
So I downloaded the data and used the free stats program R to perform linear regression.
Linear regression is a way of finding a line which best goes through a set of points on a graph, for example observations of sea level on different dates.
Location  Rate of Rise 

North Shields  1.95 mm/year 
Sheerness

1.67 mm/year 
Liverpool  1.08 mm/year 
Newlyn  1.78 mm/year 
So my approximations were a bit off but they’re in the right ball park. And in the interest of disclosure there are 2 datasets for Liverpool. One going from 1858 to 1983. The second from 1992 to 2007 but even so having 3 missing years. I only used the older dataset.
So I made a Freedom of Information request to the EA to supply the data and calculation they had used, and EA sent me an Excel spreadsheet.
To my amazement the average sea level rise had been calculated using the average worksheet function.
Sounds fair enough doesn’t it. You want to find out average sea level rise use the average function.
But there’s a problem and that problem is missing data.
To illustrate the problem lets generate some data, ranging from 1000 to 3000 in steps of 100. Linear regression requires to ranges, so we’ll just run a count from 1 to 21. In the spreadsheet below the count is in the column A, labelled x. Column B, labelled y, contains the values 1000 to 3000. In column C the spreadsheet calculates the change in y from the previous row which is, surprise surprise, 100.
In cell E3 the average change is calculated using the average function and in E4 the slope function (which uses linear regression) calculates the overall rate of change. Both methods give 100. So far so good.
Now lets simulate missing data by only using one row in 5.
The slope function gives us the correct result, but average is 5 times to large. Well we’ve used 1/5 the number of rows so it is logical that the average of change over the rows that have been used will be 5 times greater. Suppose each row of the original spreadsheet gave the sea level on successive years. Looking at every fifth row would give the sea level every five years. Hence, as the data was generated with a constant rise year on year, the rise after 5 years will be 5 times as great as after 1 year.
So the summary is don’t use average if there is missing data.
Sea levels vary about 4 meters between high and low tides. The rate of sea level rise is tiny in comparison, about 1 mm/year. In the following sheet I have simulated this, with no missing rows.
The starting sea level is 6000mm.
There is a rise of 1mm/row
The tides ranges +/ 2000mm
There is one tide every 4 rows.
The average and slope columns contain the results up to the current row.
Comparing average and slope columns it can be seen that slope settles to a value around 1 fairly quickly but average continues to vary widely.
And a graphic comparison of slope (red) and average (blue).
But remember there are only about 100 rows, 2 tides a day so 50 days.
Even so don’t use average use linear regression(slope).