Abstract
Following a fundamental statement issued in 2016 by the American Statistical Association, and amid broad and consistent changes in data analysis and interpretation methodology in public health and other sciences, statistical significance testing (null hypothesis testing) is increasingly being criticized and abandoned in the reporting and interpretation of biomedical research results. This shift toward a more comprehensive, non-dichotomous approach to assessing causal relationships may have a major impact on human health risk assessment. Notably, authoritative opinions by the Supreme Court of the United States and by European regulatory agencies anticipated this tide of criticism of statistical significance testing, lending additional support to its demise. Current methodological evidence further warrants abandoning this approach in both the biomedical and public law contexts, in favor of a more comprehensive and flexible method of assessing the effects on human and environmental health of exposures of toxicological interest.