Brain Advance Access published online on May 21, 2008
Brain, doi:10.1093/brain/awn090
Reply: A plea for confidence intervals and consideration of generalizability in diagnostic studies
Medical Statistics Unit, Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
Correspondence to: Chris Frost, Medical Statistics Unit, Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK. E-mail: Chris.Frost{at}lshtm.ac.uk
Received March 20, 2008. Accepted April 18, 2008.
Sir, We read with interest the recent article by Klöppel and colleagues (2008) on automatic classification of MR scans in Alzheimer's disease and the coverage on the BBC's website. In particular, we were concerned that the claim on the website that computers "spot Alzheimer's fast" cannot be justified on the basis of this research. Although the authors do not make this claim in their paper, in our opinion they do not pay enough attention to two limitations of their work, which may have encouraged the media over-interpretation of their findings.
The first limitation of their work is the relatively small sample size. It has been recognized for many years that confidence intervals should be reported in medical research (Altman et al., 1983; Gardner and Altman, 1986). Indeed, some journals recommend that these be included in publications in preference to the rather over-used P-values. We were disappointed that they were not reported here, as their inclusion would have been revealing. For example, in Group 1 (confirmed Alzheimer's disease cases from the US versus controls), although the estimated sensitivity and specificity are both 95%, an exact 95% confidence interval on each of these extends from 75.1% to 99.9%. We believe that a balanced interpretation of the authors' findings would include reference to the fact that their results are consistent with somewhat lower (and higher) sensitivities and specificities.
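As a minimal sketch of how an interval of this kind can be obtained, the following computes an exact binomial interval, assuming that "exact" refers to the Clopper-Pearson method. The group size is not stated in this letter, so the counts used (19 correct classifications out of 20) are an assumption, chosen only because they give an estimate of 95%. The function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_increasing(g, lo=0.0, hi=1.0, iters=100):
    """Root of a function g that rises from negative to positive on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion k/n."""
    # Lower limit: the p at which P(X >= k) equals alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else bisect_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper limit: the p at which P(X <= k) equals alpha/2 (1 when k == n).
    upper = 1.0 if k == n else bisect_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


# Assumed counts (not given in the letter): 19 of 20 correctly classified.
lower, upper = clopper_pearson(19, 20)
print(f"{19 / 20:.0%} (95% CI {lower:.1%} to {upper:.1%})")
```

Under these assumed counts the limits agree, to one decimal place, with the interval quoted above. With so few subjects the interval is wide whichever exact method is used.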
The second limitation is that the paper does not, in our opinion, adequately discuss the limited generalizability of the results. The Standards for Reporting of Diagnostic Accuracy (STARD) initiative (Bossuyt et al., 2003) rightly emphasizes the importance of an evaluation of this issue. In this study, if our understanding is correct, a clinical diagnosis of Alzheimer's disease (or probable Alzheimer's disease in Group 3) had already been made before the MR scans were taken. It is this aspect of their study that makes the claim that computers "spot Alzheimer's fast" unreasonable. Clearly, if the MR-derived test is to be useful in practice, it must be able to detect Alzheimer's disease earlier than is possible using the existing best clinical techniques.
References
Altman DG, Gore SM, Gardner MJ, Pocock SJ. Statistical guidelines for contributors to medical journals. Br Med J (1983) 286:1489–93.
Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD Initiative. Clin Chem (2003) 49:1–6. (also published in other journals).
Computers spot Alzheimer's fast. (22 February 2008, date last accessed). http://news.bbc.co.uk/2/hi/health/7258379.stm.
Gardner MJ, Altman DG. Confidence intervals rather than P-values: estimation rather than hypothesis testing. Br Med J (1986) 292:746–50.
Klöppel S, Stonnington CM, Chu C, Draganski B, Scahill RI, Rohrer JD, et al. Automatic classification of MR scans in Alzheimer's disease. Brain (2008) 131:681–9.