Easing the pain of preregistration: Data Unboxing Parties!

Hi all. It’s been a while, for sure – lots of stuff has been going on over the past months, including giving a TEDx talk, getting a new puppy, getting an annoying diagnosis, and presenting at the Parapsychological Association meeting in Boulder. Yes, I’ve turned to the dark side. Sort of. In Boulder, I met several wonderful people, and one of them gave me a brilliant idea. As you know, in parapsychology people happily borrow concepts from physics, often with disastrous/hilarious results (see here). However, I think this idea I got in Boulder from a physicist will appeal to many.

It is about preregistration. Yet another epic blow to my Introduction to Psychology lecture slides: smiling does not make you feel better! Well thanks, Eric-Jan, now I have to disappoint another 450 students. What’s next, terror management theory not being true, so I can throw my joke in which I remind students about their mortality right before the weekend out of the window (oh… cr@p)? Anyway, all this has led to another revival of the preregistration debate. Should we preregister our studies?

I am not going to reiterate what has already been said about the topic. The answer is unequivocally YES. Really, there is absolutely no sound argument against preregistration. It does not take away creativity, it does not take away ‘academic freedom’, all it does is MAKE YOUR SCIENCE BETTER. However, many people do fear preregistration is at best unnecessary, and, at worst, a severe limitation to academic freedom.

In all seriousness – I think we need to be a bit less stressed out about preregistration. Basically, it’s a very simple procedure in which you state your a priori information and beliefs about the outcomes of your manipulation. Together with the actual data and results, this gives a far more complete record of what an empirical observation (i.e., the outcomes of a study) actually tells us. That’s it. Nothing more. The preregistration is simply an extension of the data, telling us the beliefs and expectations of the researcher, allowing for better interpretation of the data. And yes, this is what an introduction section of a paper is for, but simply think of your preregistration as a verifiable account of that piece of data, just as your uploaded/shared data are a verifiable account of your observations.

This also means that if you have *not* preregistered your study or analysis, it’s still a valuable observation. But less so than a preregistered one, for the simple reason that we lack a verifiable account of the a priori information, and need to take a researcher at her/his word – similar to researchers who refuse to share empirical data for no good reason.

All this does not preclude exploratory analyses – you can still do them. However, it’s up to the reader to decide upon the interpretation of such outcomes. A preregistration (or lack thereof) will make this process easier and more transparent.

Now, how to implement all this in good lab practice and make it less of a pain?

A physicist I met in Boulder told me a very interesting thing about his work (among others at LIGO): for any experiment, they first develop the data analysis protocols. In this stage, they allow themselves all degrees of freedom in torturing pilot datasets. Once the team has settled on the analysis, the protocols are registered, and data collection begins. All data is stored in a ‘big black box’. No one is allowed to look at the data, touch the data, manipulate the data, or think about the data (I think I made that last one up). Then, once the data is in, the team gathers, with several crates of beer/bottles of wine/spirits/etc., and unboxes the data by running the preregistered script. The alcohol has two main uses: either to have a big party if the data confirm the hypothesis, or to drown the misery if the data do not confirm the hypothesis.

I found this such a great idea I’m implementing this in my lab as well. We’re going to have data unboxing parties!

Ideally, we’ll do stuff like this from now on:

[1] go crazy on experimental design
[2] run pilot *), get some data in (for most of my stuff, typically N=5)
[3] write analysis script to get desired DVs. If necessary, go back to [1]
[4] preregister study at aspredicted.org, upload analysis software and stimulus software to OSF
[5] collect data, store data immediately on secure server
[6] get lab together with beer, run analysis script
[7] sleep off hangover, write paper regardless of outcome
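
For the script-minded: one way to make sure the script run at the party in step [6] is really the one registered in step [4] is to note down a fingerprint (a SHA-256 hash) of the analysis script in the preregistration, and check it again right before unboxing. This is only a minimal sketch in Python of how that could look, not something I was told the physicists do; the file name `analysis.py` and the function names `fingerprint` and `can_unbox` are made up for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def can_unbox(script: Path, registered_digest: str) -> bool:
    """True only if the script is byte-for-byte what was registered."""
    return fingerprint(script) == registered_digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical analysis script, written here only for the demo.
        script = Path(tmp) / "analysis.py"
        script.write_text("print('run the preregistered analysis')\n")

        # Step [4]: store this digest alongside the preregistration.
        registered = fingerprint(script)

        # Step [6]: check before running anything on the real data.
        print("untouched script ok:", can_unbox(script, registered))

        # Any edit after registration changes the digest.
        script.write_text("print('a sneaky tweak')\n")
        print("edited script ok:", can_unbox(script, registered))
```

If the check fails, that does not have to mean foul play (a fixed typo is enough to change the hash), but it does mean the change should be reported along with the results.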

So – who’s in on this?!

*) the pilot as mentioned here is a full run of the procedure. This is not to get an estimate of an effect size, or to see ‘if the manipulation works’, but rather a check to see if the experimental software is running properly, if the participants understand what they need to do, if they do not come up with alternative strategies to do a task, etc. The data from these sessions is used to fine tune my analyses – often, I look for e.g. EEG components that need to be present in the data. My ‘signature paradigm’ for example evoked a strong 10 Hz oscillation. If I cannot extract that from a single dataset, I know something is wrong. So that’s what the pilot is for.
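
To give an idea of what such a sanity check could look like in code, here is a toy sketch in plain Python: it compares the power at 10 Hz with the power at two neighbouring frequencies and complains if the peak is missing. This is not my actual EEG pipeline; the sampling rate, the flanking frequencies (6 and 14 Hz), the threshold ratio, and the names `power_at` and `has_peak` are all assumptions for illustration, and real data would need proper preprocessing and spectral estimation first.

```python
import cmath
import math


def power_at(signal, freq_hz, sample_rate_hz):
    """Power at one frequency, via a single-bin discrete Fourier transform."""
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate_hz)
        for i, x in enumerate(signal)
    )
    return abs(acc / n) ** 2


def has_peak(signal, sample_rate_hz, target_hz=10.0,
             flank_hz=(6.0, 14.0), ratio=5.0):
    """True if power at target_hz exceeds ratio times the mean flank power."""
    target = power_at(signal, target_hz, sample_rate_hz)
    flank = sum(power_at(signal, f, sample_rate_hz) for f in flank_hz)
    flank /= len(flank_hz)
    return target > ratio * flank


if __name__ == "__main__":
    fs = 250.0  # assumed sampling rate in Hz
    n = 500     # two seconds of synthetic "pilot" data

    # A clean 10 Hz oscillation: the component we expect to find.
    good = [math.sin(2 * math.pi * 10.0 * i / fs) for i in range(n)]
    # A 6 Hz oscillation: stands in for "something is wrong".
    bad = [math.sin(2 * math.pi * 6.0 * i / fs) for i in range(n)]

    print("10 Hz present in good pilot:", has_peak(good, fs))
    print("10 Hz present in bad pilot:", has_peak(bad, fs))
```

The point is only that the check looks at whether the expected signal is there at all, not at whether the manipulation has an effect, so running it on pilot data does not spend any of the degrees of freedom the preregistration is meant to lock down.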