I continue to be bothered by Pall et al, the paper which attempts to determine how much more likely the autumn 2000 floods in England and Wales were because of the anthropogenic global warming (AGW) since 1900.
To recap, Part 1 of this extended critique described the method adopted by Pall et al and made a few criticisms, one of which I’ll elaborate on in the first part of this post. Part 1 ended by asking why Pall et al didn’t eliminate more statistical uncertainty, given the large number of of data points they produced (they ran over 10,000 simulations of the climate in 2000 when floods occurred).
Part 2 looked more closely at how Pall et al had defined risk and uncertainty and handled it statistically. Part 3 will further question the approach adopted, in particular by considering the uncertainty introduced by the process of modelling the climate itself.
Oops, it’s a log scale, or “about this 0.41mm threshold” revisited
In Part 1, I noted the arbitrariness of the threshold for severe flooding adopted by Pall et al. They considered their model had predicted flooding when it estimated 0.41mm/day or more of runoff, but their Fig 3 clearly shows that this level actually gives rather more than the 5-7 floods in the ~2000 model runs of each of the 4 A2000N scenarios (those without AGW, the AGW runs being referred to as the A2000 series, of which around 2000 were also run) that would be expected for the once in 3-400 year event the 2000 floods are said to be.
Pall et al includes no evidence as to the skill of their model in predicting flooding or calibration between the models’ estimation of runoff in the 2000 floods and what actually happened in the real world. As I noted in Part 1, they could have run the model for years other than 2000 in order to show what is termed its “skill”, in this case in predicting flooding.
Why, then, did Pall et al not calibrate their model? Because they didn’t think it mattered, that’s why. They write:
“Crucially, however, most runoff occurrence frequency curves in Fig 3 remain approximately linear over a range of extreme values, so our FAR estimate would be consistent over a range of bias-corrected flood values.”
It’s about time we had a picture, and I can now include Pall et al’s Fig 3 itself. Ignore the sub-divisions on the bottom of the 2 scales in each diagram – these are in error as pointed out in Part 1. The question for any youngsters reading is: are the scales on these diagrams linear or logarithmic?:
Answer: logarithmic, of course.
So is it the case that the “FAR estimate would be consistent over a range of bias-corrected flood thresholds”? The FAR, remember, is the ratio of the AGW risk of flooding to the non-AGW risk of flooding. This ratio would indeed not depend on the level chosen in the model set-up to indicate flooding of the extent seen in the real world in 2000 were the runoff occurrence frequency curves linear. But they’re not. They’re logarithmic. The increased risk therefore does depend on the flood level, as was seen simply from reading figures off the diagrams in Part 1. One wonders if we’re all clear exactly what the graphs in Pall et al’s Fig 3 actually represent.
Does Pall et al actually tell us anything useful at all?
The Pall et al study assumes it has some skill in forecasting flooding in England in autumn from the state of the climate system in April. Unfortunately we have no idea what this level of skill actually is. The model has not been calibrated against the real world by running it for years other than 2000 (or if it has, this information is not included in Pall et al). Note that analysing the results of such an exercise would not be a trivial exercise, since there are two unknowns: the skill of the model and its bias. As far as we know, 0.41mm runoff in the model could be anything in the real world – 0.35mm or 0.5mm, we have no idea. Similarly we don’t know if the model would forecast floods such as those in 2000 with a probability of 1 in 10, 1 in 100 or whatever.
To be fair, Pall et al do devote one of their 4 pages in Nature to showing their modelling does bear some relation to reality. Their Fig 1 shows similar correlation between Northern Hemisphere (NH) air pressure patterns in the model and rainfall in England and Wales as exists in the real world. And their Fig 2 shows that the rainfall patterns in the model bear some resemblance to those in the real world.
But one (more) big problem nags away at me. The basic premise is that a particular pattern of SSTs and sea ice causes the pressure system patterns that lead to rainfall in the UK. Pall et al therefore used the observed April 2000 pattern as input to the A2000 (AGW) series of model runs. But the patterns used for the non-AGW (A2000N) runs were different. Here’s what they say:
“…four spatial patterns of attributable [i.e. to AGW] warming were obtained from simulations with four coupled atmosphere-ocean climate models (HadCM3, GFDLR30, NCARPCM1 and MIROC3.2)… Hence the full A2000N scenario actually comprises four scenarios with a range of SST patterns and sea ice…” [my stress]
So if the A2000 model runs can predict flooding in a particular year from the SST and sea ice pattern in April, we wouldn’t expect the A2000N runs to do so, not just because everything is warmer, but also because the SST and sea ice patterns are different! So we don’t know whether the increased flood risk in the A2000 series is because of global warming or because the SST patterns are different.
It also seems to me that were it the case that Pall et al’s model could predict autumn flooding in April around 15-20x as often as it actually occurs (around 1 in 20 times for 2000 compared to the actual risk of 1 in 3-400) as is implied by their Fig 3, then we’d be reading about a breakthrough in seasonal forecasting and more money would be being invested to improve the modelling further (and increase the speed of forecasting of course, so that it’s not autumn already by the time we know it’s going to be wet!). This isn’t just the forecast for the next season we’re talking about, which the Met Office has given up on, but the forecast for the season after that.
So I’m not convinced. I’m going to assume that Pall et al’s modelling can’t tell one year from another, and that all they’ve done is model the increased risk of flooding in a warmer world in general. (One way to test this would be to compare the flood risks of the 4 A2000N models against each other for the same extent of AGW – it could be that the models give different results simply because they suggest different amounts of warming, not different patterns).
Under this not very radical assumption, we can actually calibrate Pall et al’s modelling. We know that the floods in 2000 were a once in 3-400 year event. That implies that in each of the diagrams in Fig 3 there should be around 5-7 floods (there are – or should be – approx. 2000 dots representing non AGW model runs on each diagram). We can therefore estimate by inspecting the figures how much flooding in the model respresents a 3-400 flood – it’s the level with only 5-7 dots above. We can then read across to the line of blue dots (the AGW case) and then, by reading up to the return time scale (the one with correct subdivisions), work out how often the modelling suggests the flooding should then occur. Here’s what I get:
- Fig 3a: 3-400 year flood threshold ~0.49mm; risk after AGW once every 40 years.
- Fig 3b: ~0.47mm, and risk now once every 30 years.
- Fig 3c and d: ~0.5mm, and risk now once every 50 years.
So the Pall et al study implies, assuming it’s no better at forecasting flooding when it knows the SST and sea ice conditions in April than it is if it doesn’t, that the risk of a 3-400 year flood in England and Wales, similar or more severe to that which occurred in 2000 is now, as a result of AGW up to 2000 only, between once in 30 and once in 50 years. That is, under this assumption, the risk of flooding in England and Wales of what was previously once in 3-400 year severity has increased by a factor of between 6 and 13, according to Pall et al’s modelling.
Trouble is, the Pall et al model may have a bit of skill in forecasting flooding from April SST and sea ice conditions (the A2000 case) and this skill may have been reduced by an unknown factor when processing the data to remove the effects of 20th century warming. If Pall et al’s results are to have any meaning whatsoever they need to do further work to establish the skill of the model and calibrate it to measures of flooding in the real world.
More uncertainty about uncertainty
In Part 2 I discussed how Pall et al’s treatment of uncertainty has resulted in them actually saying very little. Essentially, they’ve estimated that the risk of autumn flooding as great as or exceeding that in 2000 has increased as a result of AGW by between 20% and around 700% – and there’s 20% probability it could be outside that range! I argued that the sources of this uncertainty are:
(i) the 4 different models used to derive conditions as if AGW hadn’t happened – fair enough, we can’t distinguish between these, (though in Part 1 I estimated how certain we’d be of the increased risk of flooding if we did assume they were all equally probable), and
(ii) statistical uncertainty which could have been eliminated.
But these are not the only sources of uncertainty. We are also uncertain of all the parameters used to drive the HadAM3-N144 model which attempts to reproduce the development of the autumn weather from the April conditions that were fed into it; we’re uncertain of the accuracy of the April SST and sea-ice conditions input into the model; we’re uncertain as to whether atmosphere-ocean feedbacks may have affected the autumn 2000 weather (Pall et al are explicit that such feedbacks were insignificant, so used “an atmosphere-only model, with SSTs and sea ice as bottom boundary conditions); we’re uncertain of the precise magnitude of the forcings in 2000 which affected the development of the autumn weather; we’re uncertain as to whether there are errors in the implementation of the models; and we’re uncertain as to whether there are processes below the resolution of the model which are important in the development of weather patterns. There are probably more.
Consider that the reason we are uncertain as to which of the 4 models used to derive the A2000N initial conditions is most correct (or how correct any of them are) is because we don’t know how well each of them perform on moreorless the same criteria as the higher resolution model used to simulate the 2000 weather. If they didn’t have different parameters, all had the same resolution and so on, then – tautologically – they’d all be the same! If we’re uncertain which of those is most accurate then we must also be uncertain about the HadAM3-N144 model. Just because only one model was used for that stage of the exercise doesn’t mean we’re not still uncertain (and for that matter the fact that we’ve used 4 in the first stage doesn’t mean we’re certain any of them, they could all be wildly wrong, a possibility not apparently taken account of in Pall et al).
It seems to me the real causes of uncertainty in the findings of Pall et al derive from the general characteristics of the models, not (as discussed in Part 2) the statistical uncertainty as to the amplitude of 20th century warming (the 10 sub-scenarios for each of the 4 cases) which has been used.
Judith Curry has recently written at length about uncertainty and her piece is well worth a look (though I disagree where statistical uncertainty belongs in Rumsfeld’s classification – I think it’s a known unknown, maybe in a “knowable” category, since it can be reduced simply by collecting more of the same type of data as one already has). In particular, though, she provides a link to IPCC guidelines on dealing with uncertainty (pdf). A quick skim of this document suggests to me that Probability Distribution Functions (PDFs) such as Pall et al’s Fig 4 should be accompanied by a discussion of the factors creating uncertainty in the estimate, including some consideration as to how complete the analysis is deemed to be. I say deemed to be, since by it’s very nature uncertainty is uncertain!
That seems a good note to end the discussion on.
Here’s Pall et al’s Fig 4 (apologies if it looks a bit smudged):
In Part 1 of this critique I identified the two main problems with Pall et al:
- the model results are not calibrated with real world data. The paper therefore chooses an arbitrary threshold of flooding.
- statistical uncertainty has not been eliminated, rather it seems to have been introduced unnecessarily.
Part 2 drilled down into the issue of statistical uncertainty and suggested how Pall et al could have used the vast computing resources at their disposal to eliminate much of the uncertainty of their headline findings.
Part 3 picks up on some of the issues raised in Parts 1 and 2, in particular noting that the paper seems to include an erroneous assumption which led them to conclude that calibration of their model for skill and bias was not important. If my reasoning is correct, this was a mistake. Part 3 also continues the discussion about uncertainty, suggesting that the real reasons for uncertainty as to the increased risk of flooding have not been included in the analysis (whereas statistical uncertainty should have been eliminated).
There are so many open questions that it is not clear what Pall et al does tell us, if anything. I suspect, though, that the models used have little skill in modelling autumn floods on the basis of April SST and sea ice conditions. If this is correct then the study confirms that extreme flooding in general is likely to become more frequent in a warmer world, with events that have historically been experienced only every few centuries occurring every few decades in the future.
Note: Towards the end of writing Part 3 I came across another critique by Willis Eschenbach. So there may well be a Part 4 when I’ve digested what Willis has to say!