On April 1st our Singapore ReproducibiliTea Journal Club held the first meeting of its Second Season. Together, we discussed the article by Greenland and colleagues, “Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations”. The discussion was moderated by Giulio Gabrieli from the Social and Affective Neuroscience Lab (SAN Lab – @SANLabNTU [twitter.com]) at Nanyang Technological University (NTU). The session started with a brief summary of the key points of the article. Giulio selected five common misconceptions from the article and turned them into a small interactive quiz, designed to trick the audience and show how easy it is to “hack” the definition of the p-value, as well as how little we know about the real meaning of the statistical terms we use daily.

The subsequent discussion focused on how the p-value is currently used in research, and on what can be done to make p-values easier to understand and less prone to misinterpretation. We also discussed the validity of confidence intervals as viable alternatives to p-values. Among our members, the general view seems to be that, while p-values are not always used and reported correctly, at present it is difficult to replace them completely with a different measure.
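
To make one of these misreadings concrete, here is a minimal sketch of a simulation (our own illustration, not something taken from the article or shown in the session). It assumes a simple z-test with a known standard deviation of 1, and the function names `two_sided_p` and `false_positive_rate` are made up for this example. Every simulated study is run with the null hypothesis true by construction.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a z statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_sims=5000, n=30, alpha=0.05, seed=1):
    """Fraction of simulated studies with p < alpha when the null is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Draw a sample from N(0, 1): the true mean is 0, so the null holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic for H0: mean = 0, assuming a known sigma of 1.
        z = (sum(sample) / n) * math.sqrt(n)
        if two_sided_p(z) < alpha:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    rate = false_positive_rate()
    print(f"share of 'significant' results under a true null: {rate:.3f}")
```

Under these assumptions, roughly 5% of the simulated studies should come out “significant” even though the null is true in all of them. The p-value is computed assuming the null hypothesis holds, so on its own it cannot be read as the probability that the null is true.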

A very interesting discussion point was raised by one of the references of the article we discussed: an editorial by David Trafimow and Michael Marks published in the journal “Basic and Applied Social Psychology”. The editorial challenged the validity of null hypothesis significance testing. Moreover, “Basic and Applied Social Psychology” asks authors to “remove all vestiges of the null hypothesis significance testing (p-values, t-values, F-values, statements about “significant” differences or lack thereof, and so on)”. At the same time, the journal also discussed the validity of confidence intervals, which were mentioned as possible alternatives in our discussion, together with Bayesian procedures.

One possible problem that emerged in the discussion concerns the way we learn to use p-values: in almost all cases they are introduced in the context of null hypothesis significance testing, and we are taught that null hypothesis significance testing provides information about the reliability of a research outcome. Here are some more good readings on the topic:

– Branch, M. (2014). Malignant side effects of null-hypothesis significance testing. Theory & Psychology, 24(2), 256-277. 

– Amrhein, V., Korner-Nievergelt, F., & Roth, T. (2017). The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research. PeerJ, 5, e3544.

– Meyer, K. E., Van Witteloostuijn, A., & Beugelsdijk, S. (2017). What’s in a p? Reassessing best practices for conducting and reporting hypothesis-testing research.

– Branch, M. N. (2019). The “reproducibility crisis:” Might the methods used frequently in behavior-analysis research help?. Perspectives on Behavior Science, 42(1), 77-89.

 

On a side note, we were extremely happy to see that our group is growing. Thanks to the new format, researchers from abroad, especially from Taiwan and Italy, also joined us for this first session of the year.

In our next session, on April 8th, we will focus on the topic of Reproducibility in Neuroimaging. Reena Koh Cheng Yee will guide us through the article “Scanning the horizon: towards transparent and reproducible neuroimaging research” by Poldrack and colleagues. Read the paper and come along for a great Open Science chat! Bring your own tea and snacks 🙂 We will be waiting for you virtually at:

 

https://ntu-sg.zoom.us/j/99534005049

Meeting ID: 995 3400 5049. Passcode: 032787

 

References:

– Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31(4), 337-350.

– Trafimow, D., & Marks, M. (2015). Editorial: banning null hypothesis significance testing procedures. Basic and Applied Social Psychology, 37(1), 1-2.

– Poldrack, R. A., Baker, C. I., Durnez, J., Gorgolewski, K. J., Matthews, P. M., Munafò, M. R., … & Yarkoni, T. (2017). Scanning the horizon: towards transparent and reproducible neuroimaging research. Nature Reviews Neuroscience, 18(2), 115.

 

Click here for the schedule of the Singapore ReproducibiliTea Journal Club sessions.

 

Author of post: Giulio Gabrieli from the Social and Affective Neuroscience Lab (SAN Lab – @SANLabNTU [twitter.com]) at Nanyang Technological University (NTU).