Step 1 is to model A as a function of B to predict the “missing” values of A, the period from years 1 to 49. The result is the (hard-to-read) dashed red line. But even somebody slapped upside the head with a hockey stick knows that these predictions are not 100% certain. There should be some kind of plus or minus bounds. The dark red shaded area is the classical 95% parametric error bounds, spit right out of the linear model. These parametric bounds are the ones (always?) found in reporting of homogenizations (technically: these are the classical predictive bounds, which I call “parametric”, because the classical method is entirely concerned with making statements about non-observable parameters; why this is so is a long story).
Problem is, like I have been saying, they are too narrow. Those black dots in the years 1 – 49 are the actual values of A. If those parametric error bounds were doing their job, then about 95% of the black dots would be inside the dark red polygon. This is not the case.
I repeat: this is not the case. I emphasize: it is never the case. Using parametric confidence bounds when you are making predictions of real observables is sending a mewling boy to do a man’s job. Incidentally, climatologists are not the only ones making this mistake: it is rampant in statistics, a probabilistic pandemic.
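For readers who want to check the coverage claim themselves, here is a minimal sketch in Python. It does not use the data behind the figure: the series A and B are simulated, and the slope, intercept, noise level, and the function name coverage_demo are all made up for illustration. It also assumes that the "parametric" bounds correspond to the usual 95% interval for the fitted regression mean, and it uses a normal quantile in place of the exact t quantile. Under those assumptions it counts how many of the held-back values of A in years 1 to 49 fall inside the parametric band, and how many fall inside a wider band that also accounts for the scatter of the observable itself.

```python
import random
from statistics import NormalDist, mean


def coverage_demo(seed=1, n_total=100, n_missing=49):
    """Return (parametric coverage, observable coverage) on held-back years."""
    rng = random.Random(seed)

    # Simulated series: an assumption for illustration, not the post's data.
    b = [rng.gauss(0.0, 1.0) for _ in range(n_total)]
    a = [1.0 + 0.8 * bi + rng.gauss(0.0, 1.0) for bi in b]

    # Treat the first n_missing years of A as "missing"; fit A ~ B on the rest.
    b_fit, a_fit = b[n_missing:], a[n_missing:]
    n = len(b_fit)
    b_bar, a_bar = mean(b_fit), mean(a_fit)
    sxx = sum((x - b_bar) ** 2 for x in b_fit)
    slope = sum((x - b_bar) * (y - a_bar) for x, y in zip(b_fit, a_fit)) / sxx
    intercept = a_bar - slope * b_bar
    resid_var = sum(
        (y - intercept - slope * x) ** 2 for x, y in zip(b_fit, a_fit)
    ) / (n - 2)

    # Normal approximation to the t quantile (a simplification).
    z = NormalDist().inv_cdf(0.975)

    in_param = 0
    in_obs = 0
    for x, y in zip(b[:n_missing], a[:n_missing]):
        fitted = intercept + slope * x
        leverage = 1.0 / n + (x - b_bar) ** 2 / sxx
        # Band for the regression mean: parameter uncertainty only.
        half_param = z * (resid_var * leverage) ** 0.5
        # Band for a new observation: adds the residual scatter.
        half_obs = z * (resid_var * (1.0 + leverage)) ** 0.5
        in_param += int(abs(y - fitted) <= half_param)
        in_obs += int(abs(y - fitted) <= half_obs)

    return in_param / n_missing, in_obs / n_missing


if __name__ == "__main__":
    param_cov, obs_cov = coverage_demo()
    print(f"inside parametric band: {param_cov:.0%}")
    print(f"inside observable band: {obs_cov:.0%}")
```

Changing the seed changes the exact fractions, but in this setup the parametric band should capture far fewer than 95% of the actual values, which is the complaint made above.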
Make sure to read the whole series!