The pitfalls of open data

TL;DR summary: some data can be made publicly available without any problems. A lot of data, however, cannot. Therefore, unrestricted sharing should not be the default. Instead, all data could be hosted in institutional repositories to which researchers can get access upon request to the institution.

Data is an essential part of research, and it is a no-brainer that scientists should share their data. The default approach is and has been ‘share on request’: if you’re interested in a dataset, you simply e-mail the author of a paper and ask for the data. However, it turns out that this does not work that well. Wicherts, Borsboom, Kats, & Molenaar (2006) have shown, for example, that authors are not really enthusiastic about sharing data, something not unique to psychology.

This is bad, and sadly not just for the sake of scientific progress – recently, social science has seen another data-fabrication scandal, in which a graduate student faked his data for a study published in Science (you would think they had learned their lesson at Science after Stapel, but sadly, no). Making data available with your publication at least a) shows that you actually conducted the study, and b) allows others to (re)use your data, saving work in the end.

It is therefore not surprising that there is now an open research movement, calling for full transparency in research, including making all research data public by default. I totally support open research, and I have considered signing the ‘Agenda’ several times. After a discussion on the ISCON Facebook page, I have now decided not to.

As a matter of fact, the discussion has convinced me that making all research data publicly available by default, without restriction, is actually a bad idea.

Before the flame war starts, let me point out that I am not against sharing data between researchers, or even against compulsory data sharing (i.e., if an author refuses to share data without good reason, her/his boss will send the data). However, I disagree with unrestricted data publishing, i.e. putting all data online where anyone (including the general public) can access it. I am strongly in favour of a system where data is deposited in an institutional repository and anyone interested in the data may ask for access, if necessary even without the consent of the author.

Let me illustrate my concerns with the following thought experiment. You participate in an experiment on sexual arousal, and have to fill out a questionnaire about how aroused you are after watching a clip with the most depraved sex acts. Your data is stored anonymously, and will be uploaded to GitHub directly after the experiment (see Jeff Rouder’s paper on an implementation of such a system). Would you give consent?

For this example, I might. I can always fake my response on the questionnaire, should I feel something tingling in my nether regions, to avoid embarrassment.

For the next experiment, this study is repeated, but we’re now measuring physiological arousal (i.e., the response of your private parts to said depraved sex acts). Again, the data will be uploaded to GitHub directly after the experiment.

Now, I would be a bit uncomfortable. Suppose I got sexually aroused (or not – it actually does not matter, the behaviour of my private Willy Johnson is not anyone’s business besides my own and my wife’s, and for this one occasion, the researcher’s). This is now freely available for anyone to see. And by the timestamp on the file, I may be identified by the one or two students who saw me entering the sex research room for the 12:00 session on June 2nd. Unlikely, but not impossible. Oh sure, remove the timestamp then! Yes, but how is a researcher then going to show (s)he collected all the data after preregistering his/her study and not before (or did not fabricate the data on request after someone asked for it)?

Ok, we take it a step further. We still measure the response of your nether regions, but now we also ask you to have your fingerprint scanned and stored with the data.

Making this data publicly available would be a huge no for me. Fingerprints are unique identifiers – are you mad?

But now replace ‘fingerprint’ with raw EEG data. We do not often consider this, but EEG data is as uniquely identifiable as fingerprints. I can recognize some of my regular test subjects and students from their raw EEG data – shape and topography of alpha, for example, are individual traits and may be used to identify individuals if you really, really want to.

One step further: individual, raw fMRI data, associated with your physiological ‘performance’ on this sex rating task. Rendering a 3D face from the associated anatomical image is trivial – it’s one of the first things you do (for fun!) when you start learning MRI analysis. How identifiable do you want your participants to be? And note that raw individual fMRI data cannot be interpreted without the anatomical scan – you need the latter to map activations onto brain structures.
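To give a concrete sense of how low the bar is, here is a minimal sketch (not anyone’s actual pipeline) of extracting a head surface from an anatomical scan. It assumes a hypothetical NIfTI file name and uses nibabel and scikit-image, with a deliberately naive intensity threshold:

```python
# Minimal sketch: extract a head-surface mesh from a (hypothetical) T1 NIfTI file.
# Assumes nibabel and scikit-image are installed; "anat_T1w.nii.gz" is a placeholder path.
import nibabel as nib
import numpy as np
from skimage import measure

img = nib.load("anat_T1w.nii.gz")      # anatomical scan shared alongside the raw data
vol = img.get_fdata()

# Crude skin threshold: anything well above background counts as "head".
threshold = np.percentile(vol, 75)
verts, faces, normals, values = measure.marching_cubes(vol, level=threshold)

# verts/faces now describe a 3D surface that includes the participant's face;
# any standard mesh viewer (or matplotlib's plot_trisurf) will render it.
print(f"Surface mesh: {len(verts)} vertices, {len(faces)} triangles")
```

A dedicated defacing step exists precisely because this is so easy to do.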

So, don’t publish the raw data then! Sure, that fixes some problems, but creates others. What if I want to re-analyze a study’s data because I do not agree with the authors’ preprocessing pipeline, and would rather try my own? For that I would still have to ask the authors for the full dataset. Mind you – most researcher degrees of freedom for EEG and fMRI are in the preprocessing of data (e.g., what filters you use, what kind of corrections you apply, what rereferencing you apply, etc.), and aggregate datasets, such as those published on Neurosynth, do not allow you to reproduce a preprocessing pipeline.

But the main problem is that many data or patterns in data can be used as unique identifiers. Even questionnaire, reaction time, or psychophysics data. Data mining techniques can be used to find patterns in datasets, such as Facebook likes, that can be used for personal identification. What’s to stop people from running publicly available research data through such algorithms? Unlikely? Sure. Very much so, even. Impossible? Nope.
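To make the idea concrete, here is a toy sketch, with entirely invented data and names, of how a simple nearest-neighbour match could link an ‘anonymous’ response pattern in an open dataset back to a profile known from elsewhere:

```python
# Toy illustration (invented data): match an "anonymous" published response pattern
# against profiles known from another source, using nearest-neighbour matching.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are per-person mean reaction times on 20 conditions, known from
# another source (an earlier study, a leaked dataset, commercial profiling, ...).
known_profiles = {name: rng.normal(500, 50, size=20) for name in ["A", "B", "C", "D"]}

# The "anonymous" open dataset: person C's profile plus measurement noise.
anonymous_record = known_profiles["C"] + rng.normal(0, 10, size=20)

# Nearest neighbour in Euclidean distance = best re-identification guess.
best_match = min(known_profiles,
                 key=lambda name: np.linalg.norm(known_profiles[name] - anonymous_record))
print("Best match:", best_match)   # prints "C": the anonymous record is re-identified
```

Real re-identification attacks are far more sophisticated, but the principle is the same: a sufficiently distinctive pattern acts as a fingerprint.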

Of course, my thought experiment deals with a rather extreme example – I guess that very few people are willing to have their boy/girl-boner data in a public database for everyone to see. So let’s take another example. Visual masking. What can go wrong with that? Well – performance on a visual masking task may be affected by illnesses such as schizophrenia, or by being related to an individual with schizophrenia. Is that something you want to be publicly accessible? There are many other examples. Data reveals an awful lot about participants, and it is not at all clear how much data is needed to identify people. It may be less than we think.

I fully realize that the scenarios I put forward here are extreme and hypothetical, and I am sure some people will think I am fearmongering, making a fuss, and maybe even an enemy of open science. Ok, so be it. I think that we as scientists have a responsibility not only to each other, but even more so to our participants. People participating in our studies are the lifeblood of what we do and deserve our utmost respect and care. They participate in our studies, provide us with often very intimate data, trust us to handle that data conscientiously, and contribute their data for science. We need to protect their privacy. Just putting all data online for everyone to see does not fit with that idea. There is always a potential for violations of privacy, but making all data public also opens up the data to, let’s say, the government, insurance companies, marketeers, and so on, for corporate analyses, marketing purposes, and other goals than the progress of science. Do we want that?

Maybe I should give another example – what about video material? Suppose you carried out an experiment in which you taped participants’ emotional responses to shocking material. Even if you blurred out faces to prevent identification, and my IRB were ok with publishing these clips, I would still not submit such material to a public repository for every Tom, Dick, and Harry to browse clips of crying participants.

I am not saying these are realistic scenarios, but they are worth giving some thought – at least, more than people are giving them now.

There are and will be many datasets that can be made publicly available without any concern at all. I’ve got a feeling that the authors of the Agenda for Open Research primarily work with such datasets, but do not sufficiently realize that a lot of sensitive data is being collected as well. The ideal of all data being made public by default does not fit well with my idea of being a responsible experimenter. And there is a clear ‘grey zone’ here. Not everyone will share my concerns. Some will even say I am making a fuss out of nothing. But I would like to be able to carry out my job with a clear conscience. Towards my colleagues, but most of all towards my participants. And that means I will not make every dataset I collect publicly available, even if this means the signatories of the Agenda for Open Research will not review my papers because they do not agree with my reasons for not making a given dataset publicly available. Too bad.

So, you want access to the data I did not make publicly available, but I am on extended leave? There is a fix! And actually, this fix should appeal to the Open Research movement too.

For every IRB-approved experiment, require authors to deposit their data in an institutional repository. All data and materials, that is: raw data, stimuli, analysis code and scripts. The whole shebang, all documented of course. Authors are free to give anyone they want access to this data. Scientists interested in the data can request access via a link provided with a paper. In principle, the author will provide access, but if no reply is given within a reasonable term (let’s say two weeks), or the data is withheld without proper reason, the request is forwarded to the Director of Research (or another independent authority), who then decides.

In Groningen, we have such a system in place. It ensures that for every published study the data is accounted for, and access to the raw data can be granted if an individual requests it. The author of a study controls who has access to the data, but can be overruled by the Director of Research. It works for me, and I do not see what the added benefits of unrestricted access are over this system. Working in this way makes me feel a lot better. I can only hope that the signatories of the Agenda for Open Research consider this practice to be open enough.

A quick comment on my previous post

Well, my previous blogpost was poorly timed – I did not expect such an explosion on my Twitter timeline. I cannot reply to everyone (I am sorting Legos with my son on my day off, quite an important and pressing matter), and replying tomorrow during my staff meeting would be a bit rude, so let me clarify a couple of things I have seen in the message centre of my iPad.

An apology to Daniel Lakens

First, I feel I have to briefly apologize to Daniel Lakens – in the first two paragraphs of my post I made a little fun of him. On my part this was in good jest, but I admit that I may have let some of my annoyance with Daniel’s occasionally moralizing tone shine through a bit too much. I’m sorry, Daniel, if I offended you – if anything, know that I deeply appreciate the good work you’re doing.

Can you please derive psi from GTR?

Not sure if this was directed at me, but I’ve seen this briefly in a tweet. Er, no, I cannot derive psi from the general theory of relativity. But neither can the Stroop effect be derived from the GTR. So, it’s a silly question. If the question is “can you derive psi from known physics”, then it’s a different matter. Physical laws give the boundary conditions for normal biological functioning. Psi, according to many, cannot exist because these boundary conditions forbid it. I have argued that that is not necessarily true. That does not mean that psi exists, though – the existence of pink elephants and uranium-powered dragons is also not prohibited by physics, and their existence likewise remains unproven by many studies (my son and I prefer uranium-powered dragons over Russell’s Teapot, but essentially it’s the same argument).

By the way – I do suspect that the asker knows that GTR is indeed potentially problematic for psi. Most physics-based psi theories are based on the concept of quantum non-locality. The non-locality aspect of quantum theory is incompatible with GTR, and yet we know both are correct – it’s one of the great problems in physics. There is presently one theory that seems to integrate both successfully, based on the very speculative concept of spontaneous collapse of the wave function. If this theory is correct, it would rule out pretty much all non-local psi theories that assign a special role to consciousness.

There is no evidence for psi, can we please stop this nonsense?

True. There is no conclusive experimental evidence for the existence of psi. If such evidence existed, we would not have this debate. However, does this mean psi does not exist? No, of course not. But, “Russell’s Teapot!”, I hear you think. Sure, it would be, if psi were confined to a (non-existent) lab-only phenomenon. However, paranormal experiences have been reported throughout history, by all cultures. Of course, the vast majority of these phenomena can nowadays be explained by normal psychological or biological processes (including fraud). Psi research, however, started with the aim to reliably recreate such phenomena in the lab, which does not seem to work for many phenomena. Although that is of course not very promising, we have to acknowledge that reality is not confined to our labs. An inability to recreate something in a lab does not mean it does not exist.

You are a@#$@@#$ psi-believer!

Well, not really. As stated above, I do not agree with people like Bem and Tressoldi that there is convincing evidence for psi. However, I also do not agree with people like Daniel Lakens and EJ Wagenmakers that psi research is nonsense. I do believe psi is a very worthwhile topic of study, if done properly, because a convincing demonstration of psi would be a breakthrough for consciousness research. Given that there is a continuous stream of experiments that do seem to show effects, and that I have been getting some odd (and replicable) results in my own lab, I am inclined to keep a close eye on this line of research and give it the benefit of the doubt. However, I do not expect others to jump on the bandwagon or make it priority research (yet).

What genuinely annoys me, though, is the patronizing, scoffing, ridiculing, and accusations of QRPs or outright fraud directed at psi researchers by self-proclaimed skeptics. There’s a lot of chaff among the wheat, that’s absolutely true, but I would say it’s not really necessary to make fun of intelligent people who are genuinely trying to do serious research.

Any more?

Not for now – but if you’ve got comments/questions re: this topic, please do engage.

 

Why a meta-analysis of 90 studies does not tell us that much about psi, or why academic papers should not be reduced to their data

Social psychologist-turned-statistics-and-publication-ethics crusader Daniel Lakens has recently published his review of a meta-analysis of 90 studies by Bem and colleagues that allegedly shows that there is strong evidence for precognition. Lakens rips apart the meta-analysis in his review, in particular because of the poor control for publication bias. According to Lakens, who recently converted to PET-PEESE as the best way to control for publication bias, there is a huge publication bias in the literature on psi, and if one, contrary to the meta-analysis’ authors, properly controls for that, the actual effect size is not different from zero. Moreover, Lakens suggests in his post that doing experiments without a theoretical framework is like running around like a headless chicken – every now and then you bump into something, but it’s not as if you were actually aiming.

I cannot comment on Daniel’s statistical points. I have not spent the last two years brushing up on my stats, as Daniel so thoroughly has, so I have to assume that he knows to some extent what he’s talking about. However, it may be worth noting that the notion of what an effect is, and how to determine its existence, has become somewhat fluid over the past five years. An important part of the debate we’re presently having in psychology is no longer about interpretations of what we have observed, but increasingly about the question whether we have observed anything at all. Daniel’s critical review of Bem et al.’s meta-analysis is an excellent example.

However, I do think Daniel’s post shows something interesting about the role of theory and methods in meta-analyses as well, something that in my opinion stretches beyond the present topic. After reading Daniel’s post, and going through some of the original studies included in the meta-analysis, it struck me that something might be going wrong here. And with ‘here’ I mean reducing experiments to datasets and effect sizes. We all know that in order to truly appreciate an experiment and its outcomes, it does not suffice to look at the results section, or to have access to the data. You also need to carefully study the methods section to verify that an author has actually carried out the experiment in such a way that it measured what the author claims has been measured. And this is where many studies go wrong. I will illustrate this with an (in)famous example: Maier et al.’s 2014 paper ‘Feeling the Future Again’.

To give you some more background: Daniel claims that psi lacks a theoretical framework. This statement is incorrect. In fact, there are quite a few theoretical models that potentially explain psi effects. Most of these use (or abuse) concepts from (quantum) physics, and as a result many psychologists either do not understand the models, or do not bother to try to understand them, and simply draw the ‘quantum waffling’ card. Often this is the appropriate response, but it’s nothing more than a heuristic.

Maier et al. (2014) did not start running experiments like headless chickens hoping to find a weird effect. In fact, they quite carefully crafted a hypothesis about what can be expected from precognitive effects. Precognition is problematic from a physics point of view, not because it’s impossible (it isn’t), but because it creates the possibility for grandfather paradoxes. In typical precognition/presentiment experiments, an observer shows an anomalous response to an event that will take place in the near future, let’s say a chandelier falling down from the ceiling. However, if the observer is aware of his precognitive response, (s)he can act in order to prevent the future event (fixing new screws to the chandelier). However, now said event will not occur anymore, so how can it affect the past? Similarly, you cannot win roulette using your precognitive powers – any attempt to use a signal from the future to intentionally alter your behaviour leads to time paradoxes.

In order to avoid this loophole, Maier et al. suggest that precognition may only work unconsciously; that is, if there are precognitive effects, they may only work in a probabilistic way, and only affect unconsciously initiated behaviour. Very superficially, this line of reasoning resembles Deutsch’s closed timelike curves proposal for time travel of quantum information, but that’s beside the point here. The critical issue is that Maier et al. set up a series of experiments in which they manipulated awareness of the stimuli and actions that were believed to induce or be influenced by precognitive signals.

And that is where things go wrong in their paper.

Maier et al. used stimuli from the IAPS to evoke emotional responses. Basically, the methodology is this: participants had to press two buttons, left and right. Immediately after, two images would appear on the screen, one of which would have negative emotional content. The images were masked in order to prevent them from entering conscious awareness. The idea is that participants would respond more slowly when pressing the button on the same side as where the negative image would appear (i.e., they would precognitively avoid the negative image). However, since this would be a strictly unconscious effect, it would avoid time paradoxes (although one could argue about that).

What Maier et al. failed to do, though, is properly check whether their masking manipulation worked. Properly masking stimuli is deceptively difficult, and reading through their method sections, I am actually very skeptical about whether they could have been successful at all. The presentation time of the masked stimuli was one video frame, which would be necessary to properly mask the stimuli, but the presentation software used (E-Prime) is quite notorious for its timing errors, especially under Windows 7 or higher, with video cards low on VRAM. The authors, however, do not provide any details on what operating system or graphics board they used. To add insult to injury, they did not ask participants on a trial-by-trial basis whether the masked image was seen or not (and even that may not be the best way to check for awareness). Therefore, I have little faith that the authors actually succeeded in masking their emotional images in the lab. Their important, super-high-powered N=1221 study, which is often cited, was carried out online. It is very dubious whether masking was successful in that case at all.

If we follow the reasoning of Maier et al., conscious awareness of stimuli is important in getting precognitive effects (or not). Suppose that E-Prime’s timing messed up in 1 out of 4 trials, and the stimuli were visible – what does that mean for the results? Should these trials have been excluded? Can’t it be the case that such trials diluted the effect, so we end up with an underestimation? And, can’t it be that the inclusion of non-masked trials in the online experiment has affected the outcomes? Measuring ‘unconscious’ behaviour, as in blindsight-like behaviour, in normal individuals is extremely difficult and sensitive to general context – could this have played a role?
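As a rough illustration of the dilution argument (all numbers invented, and assuming the hypothesized effect only occurs on properly masked trials), a quick simulation shows how failed masking pulls the observed effect towards zero:

```python
# Toy simulation (invented numbers): dilution of a trial-level effect by failed masking.
import numpy as np

rng = np.random.default_rng(1)
n_trials = 10_000
true_effect_ms = 10.0      # hypothetical RT slowing on properly masked trials
fail_rate = 0.25           # suppose timing errors break masking on 1 in 4 trials

masked = rng.random(n_trials) > fail_rate                   # True where masking worked
effect_per_trial = np.where(masked, true_effect_ms, 0.0)    # no effect on unmasked trials
noise = rng.normal(0, 100, n_trials)                        # trial-to-trial RT noise

observed = effect_per_trial + noise
print(f"True effect: {true_effect_ms:.1f} ms, "
      f"observed mean effect: {observed.mean():.1f} ms")
# The observed effect is pulled towards (1 - fail_rate) * true_effect, roughly 7.5 ms here.
```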

In sum, if you do not carefully check your critical manipulations, you’re left with a high-powered study that may or may not tell us something about precognition. And that matters once you include it in a meta-analysis – a study with such a high N will appear very informative because of its (potential) power, but if the methodology is not sound, it is not informative at all.

On a separate note, Maier et al.’s study is not the only one where consciousness is sloppily manipulated – the average ‘social priming’ or ‘unconscious thinking’ study is far worse – make sure you read Tom Stafford’s excellent commentary on this matter!

So, how is this relevant to Bem’s meta-analysis? Quite simply put: which studies you put in matters. You cannot reduce an experiment to its data if you are not absolutely sure the experiment has been carried out properly. And in particular with sensitive techniques like visual masking, or manipulations of consciousness, having some expertise matters. To some extent, Neuroskeptic’s Harry Potter Theory makes perfect sense – there are effects and manipulations which require specific expertise and technical knowledge to replicate (ironically, Neuroskeptic came up with HPT to state the opposite). In order to evaluate an experiment you not only need access to the data, but also to the methods used. Given that this information seems to be lacking, it is unclear what this meta-analysis actually tells us.

Now, one problem is that you will run into a whole series of ‘No True Scotsman’ arguments (“we should leave Maier’s paper out of our discussions of psi, because they did not really measure psi”), but to some extent that is inevitable. The data of an experiment with clear methodological problems is worthless, even if it is preregistered, open, and high-powered. Open data is not necessarily good data, more data does not necessarily mean better data, and a replication of a bad experiment will not result in better data. The present focus in the ‘replication debate’ draws attention away from this – Tom Postmes referred to this as ‘data fetishism’ in a recent post, and he is right.

So how do we solve this problem? The answer is not just “more power”. The answer is “better methods”, and a better a priori theoretical justification of our experiments and analyses. What do we measure, how do we measure it, and why? Preferably, such a justification has to be peer-reviewed, and ideally a paper should be accepted on the basis of such a proposal rather than on the basis of the results. Hmm, this sounds somewhat familiar.

Is psi a legitimate topic of study?

Weird stuff happens from time to time – it’s a fact of life. When I was applying for a position as assistant professor in Exeter, in the weeks before the interview the word ‘Exeter’ started to pop up in weird places in my life – newspaper articles, names of conference rooms I happened to present in, stuff like that. Or that time that my then four-year-old, right before leaving home, started to talk about the street in front of his school being changed into a river – when we arrived fifteen minutes later, we found the street flooded by a burst water pipe.

Coincidence? Probably. Nevertheless, most people will have experienced such weird coincidences, and sometimes people interpret these experiences as ‘paranormal’. Paranormal experiences include clairvoyance, telepathy, precognition, and psychokinesis. Not surprisingly, these phenomena, collectively labelled ‘psi’, are met with great interest by the general audience, but traditionally, psychologists have been very interested in them as well. William James, the founding father of psychology, studied paranormal phenomena, and the namesake of our research institute, Gerard Heymans, credited as the first Dutch experimental psychologist, was also the founder of the Dutch Society for Psychical Research, for example.

However, psi research (or parapsychology) was met with increasing skepticism. The weird experiences we all have from time to time turned out to be very difficult to replicate in laboratory settings, and basically, ever since its inception, parapsychology has been struggling to confirm that psi even exists, let alone come up with a solid theoretical background. As a result, parapsychology is nowadays typically considered to be (at best) fringe science, or even pseudoscience, and is actively ignored by mainstream psychology.

A small number of researchers have continued parapsychological investigations, though. Rather than aiming for dramatic demonstrations of psi by mediums, for example, modern-day parapsychologists study far more subtle effects. One of the best-studied topics is presentiment (see e.g. Bierman and Radin, 1997): an anomalous baseline response to a stimulus that has yet to appear. An emotional picture, for example, induces a strong physiological response. In presentiment, this response appears to be temporally mirrored in the baseline: emotional pictures evoke a physiological reaction before they are presented, and apparently this response is symmetrical in time (i.e., if the typical physiological response occurs at t = 2.5 s, the ‘presponse’ will occur at t = -2.5 s). Presentiment has been quite widely studied, and the effects seem to be more stable and legit than your typical psi effect.

Presentiment drew wide mainstream attention with the publication of Bem’s landmark 2011 paper ‘Feeling the future’ in JPSP (Bem, 2011). Bem claimed to show solid evidence for the existence of presentiment in a whole series of experiments, in over 1000 subjects. However, the paper was met with a storm of criticism. I am calling Bem’s paper a landmark paper because it can be pinpointed as the paper that started off the massive debate on questionable research practices in (social) psychology, even before the uncovering of the fraud cases of Stapel, Hauser, Smeesters, and (likely) Foerster. The main critique of Bem’s work was that Bem appeared to ‘shop’ in his results, and did not use appropriate statistics: basically, he was accused of practices such as ‘p-hacking’ (mildly massaging your data until you hit p < .05) and ‘HARKing’ (hypothesizing after the results are known). Moreover, replications of Bem’s experiments failed, and the refusal of JPSP to publish these failed replications only added to the uproar surrounding the field of social psychology (one set of null results has been published in PLOS One, though: here).

Bem, however, did not give up. He and his coworkers have published two meta-analyses on the presentiment effect. Armed with the statistical techniques used by his critics (such as p-curves and Bayesian analyses), Bem and colleagues analyzed 90 experiments and argue in their latest meta-analysis, to be found here, that there ‘is something going on’.

What to make of this? Does Bem’s latest work indeed support the existence of psi? Let me first say that a) I consider myself to be an open-minded skeptic (more on that later), and b) I am far from a statistics expert. Having reviewed the latest paper, I find the case compelling at first sight. I have two major concerns, though.

1. How big is the file drawer? Bem et al. estimate that the overall effect size they find for psi can only be negated by at least 520 studies finding no effect. At first sight, this seems a large number. But is it really that unlikely that there are 520 experiments yielding no result that Bem et al. did not know about and therefore did not include? Well, it is not. The Bem experiment has drawn wide attention, and is likely to have been replicated. In our own institute, several teaching assistants (i.e. not academic faculty!) have used the Bem case as an example in second-year research practicals. Such data is typically not archived, as it is not considered to be ‘research’, and therefore goes unnoticed. Tenured and tenure-track faculty are less likely to engage in direct replications for the ‘usual’ reasons: lack of time, concerns about tenure, or simply lack of interest. In other words, I do not think 520 experiments (worldwide!) is a completely unrealistic number. The file drawer on psi might be quite substantial.
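For readers unfamiliar with where such file-drawer numbers come from: the 520 figure is Bem et al.’s own estimate, but a Rosenthal-style fail-safe N calculation, sketched below with made-up numbers (not the actual meta-analytic values), gives the flavour of the computation:

```python
# Rosenthal-style fail-safe N: how many unpublished null studies would be needed to
# drag the combined result below significance. The numbers below are made up for
# illustration; they are NOT the values from Bem et al.'s meta-analysis.
k = 90                      # number of studies in the meta-analysis
mean_z = 1.0                # hypothetical average z-value per included study
sum_z = k * mean_z
z_crit = 1.645              # one-tailed p = .05

# Combined Stouffer Z with N extra null (z = 0) studies is sum_z / sqrt(k + N);
# solving sum_z / sqrt(k + N) = z_crit for N gives:
failsafe_n = (sum_z / z_crit) ** 2 - k
print(f"Fail-safe N is roughly {failsafe_n:.0f} null studies")
```

The point of my argument above is simply that numbers of this order are not obviously out of reach for a paradigm that has been replicated informally all over the place.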

In addition, if they exist, precognitive effects should also emerge in mainstream research. If a stimulus can exert retroactive influence, this should also be apparent in paradigms that are not explicitly designed to measure psi. For example, in a typical priming experiment, one could look at effects of primes (or targets) presented in trials after the critical stimulus. If precognitive effects do exist, one should be able to find effects in such datasets as well. Now, I myself have several datasets in which such weird things are going on, although the most parsimonious account is still a mainstream one, obviously. However, ideally, to make a strong case for psi, mainstream research should be taken into account as well.

2. The theoretical background. The Achilles heel of psi research, though, is the lack of a clear theoretical framework. Merely finding a difference between two conditions in an experiment does not mean that much: the laws of probability dictate that you will find a difference in 1 out of every 20 experiments at p < .05. What matters is whether your difference fits with a given theory: were you able to predict that you would find a difference? Many critics argue that parapsychology lacks a theoretical basis, and that given that psi is incompatible with the laws of physics, research into psi is by definition pseudoscience.

And that is where many critics are wrong.

Psi is not necessarily incompatible with physics. Bem et al. drag in quantum physics, which is something more scholars appear to do when stuff gets complicated (I consider it to be a bad habit), but there is no need for quantum physics to find a loophole in physics that allows psi. Psi involves the flow of information backward in time. Perhaps surprisingly, this is not impossible according to physics. Most physical laws are in fact time-symmetric. For example, in the formula x(t) = v * t, which gives the position x of a point mass at moment t given its speed v, there is nothing against entering a negative number for t. The only physical law which imposes a direction of the flow of time is the second law of thermodynamics, which postulates that the entropy of a closed system will increase over time. Basically it means that the sugar cube you put in your coffee will dissolve, but not spontaneously re-appear after it dissolved, and that your coffee will get colder over time, but not hotter (unless you heat it).

This is such a fundamental property of the universe that it dictates a direction of time on all other physical processes. However, it is also not a law, but a statistical property of the universe. Basically, potential high-entropy states of a system are far more abundant than potential low-entropy states. Consider the sugar cube in your coffee: the organized, low-entropy state in which sugar molecules are when packed into a cube is quite specific. All molecules of the sugar cube are packed into a confined spatial location (the cube), which limits the number of possible states. However, molecules of a fully dissolved cube can be anywhere in your mug, yielding a far larger number of possible states. Yet, it is not impossible that a sugar cube will rematerialize in your cup of coffee after dissolving (there is no physical law prohibiting the sugar molecules in your cup of coffee of reassembling themselves into a sugar cube), it is just impossibly unlikely. And to date, we do not know how the universe got itself into this mess – why was entropy at the beginning of everything so low?
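That counting argument is exactly what Boltzmann’s entropy formula expresses; as a reminder (with the sugar-cube ratio purely illustrative):

```latex
S = k_B \ln W
\qquad\Rightarrow\qquad
\frac{p_{\text{dissolved}}}{p_{\text{cube}}}
  = \frac{W_{\text{dissolved}}}{W_{\text{cube}}} \gg 1
```

More accessible microstates means higher entropy and overwhelmingly higher probability; nothing forbids the low-entropy state, it is just vanishingly rare.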

That aside – technically, time reversals, or better, the universe adopting a state in which effect appears to precede cause, are not impossible, just extremely unlikely. But does that mean presentiment can be real? Well, there is a second problem. Even if we accept that information can, in extraordinary circumstances, travel backwards in time, presentiment potentially creates time paradoxes. The best-known example is the grandfather paradox. Suppose you have a time-travelling machine, and travel back in time to when your grandfather was a child. And you shoot him. Apart from that being quite cruel in the typical circumstance where you have a loving grandpa, it creates a paradox: if your grandfather dies before he fathered one of your parents, how can you exist and travel back in time to shoot him?

The same thing applies to presentiment. Technically, it is possible to build a presentiment detector if the effect is real: Bierman and Radin (1997), for example, report an anomalous increase in galvanic skin response before presentation of an unpredictable emotional stimulus, compared with the baseline response to neutral stimuli. If you sample a participant’s GSR to stimuli for a sufficient number of trials, it is quite possible to estimate that participant’s typical baseline to neutral stimuli and to estimate whether the baseline response on a given trial is typical of a subsequent neutral or a subsequent emotional stimulus. That would allow you to predict the future: by a quick analysis of anticipatory GSR activity you should be able to guess (not perfectly, though) whether the upcoming stimulus will be emotional or neutral. And this is where the paradox arises: if your analysis is quick enough, you can quickly switch off the monitor, blindfold the participant, or even change the upcoming stimulus before it is presented, and thus prevent the participant from seeing the stimulus that would trigger the emotional reaction. Effectively, this is the same as shooting your grandfather in the grandfather paradox: if you erase or change the event that triggered the presentiment response in the first place, how can it trigger presentiment?
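To spell out the detector logic, here is a minimal sketch with invented numbers and an arbitrary z-score rule (not Bierman and Radin’s actual analysis):

```python
# Minimal sketch of the "presentiment detector" logic described above.
# calibration_gsr: anticipatory GSR values from trials known to precede neutral stimuli.
# The numbers and the simple z-score rule are invented for illustration.
import numpy as np

def predict_upcoming_stimulus(anticipatory_gsr, calibration_gsr, z_cutoff=2.0):
    """Guess whether the upcoming stimulus will be 'emotional' or 'neutral'
    from the pre-stimulus GSR, relative to the participant's neutral baseline."""
    mu, sigma = np.mean(calibration_gsr), np.std(calibration_gsr)
    z = (anticipatory_gsr - mu) / sigma
    return "emotional" if z > z_cutoff else "neutral"

# If the prediction is "emotional", the experimenter could blank the screen before
# stimulus onset -- which is exactly where the grandfather-style paradox arises.
calibration = np.random.default_rng(2).normal(5.0, 0.5, size=200)
print(predict_upcoming_stimulus(6.3, calibration))
```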

Stephen Hawking has argued that the universe ‘actively resists’ time paradoxes. Bierman (2001, and personal communications) supposes that this is one of the reasons why psi effects are so elusive: as soon as they become informative, and are able to create time paradoxes, they cease to exist. I am not sure if this is the case or not. However, I find the time paradox argument against presentiment compelling – a lot more compelling than the incorrect argument that psi does not fit with the laws of physics.

In their meta-analysis, Bem et al. cite a Bayesian analysis of Bem’s 2011 paper by Wagenmakers et al. In Bayesian statistics, one evaluates how likely the observed data are under one hypothesis relative to how likely they are under another, and weighs this evidence against one’s prior odds. Wagenmakers et al. set their prior at 10^20, meaning that they would only be convinced of the reality of presentiment if the outcome of an experiment was 10^20 times as likely under the hypothesis that psi exists as under the hypothesis that it does not. Bem et al. report a Bayes factor of 10^9 in favour of the existence of psi. Impressive – but not impressive enough for Wagenmakers et al. Bem et al. argue that a prior of 10^20 is unrealistically and unreasonably high. Well, is it?

Wagenmakers et al. base their prior on the claim that psi is incompatible with the laws of physics. I do not agree with that argument per se (see above), but I do think the time paradox argument is convincing enough to warrant a very high prior. 10^20 does not seem unrealistic or unreasonable to me in that sense.
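The disagreement boils down to simple arithmetic on the numbers quoted above: multiplying prior odds of 10^-20 in favour of psi by a Bayes factor of 10^9 still leaves the posterior odds heavily against it.

```latex
\underbrace{10^{-20}}_{\text{prior odds for psi}}
\;\times\;
\underbrace{10^{9}}_{\text{Bayes factor}}
\;=\;
\underbrace{10^{-11}}_{\text{posterior odds for psi}}
```

In other words, even after evidence of a billion to one in favour, the odds remain about a hundred billion to one against – which is why a Bayes factor of 10^9 does not move someone who starts from such a prior.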

So to conclude: I am open to the possibility of psi. There are no universal laws against it; it is just extremely unlikely. Bem’s meta-analysis is convincing, but to me not convincing enough. It would be far more convincing if Bem et al. could make more specific theoretical predictions about when and how presentiment might occur. They do offer some interesting points – in particular, presentiment seems to be most prominent in ‘fast thinking’ scenarios. Now, that is actually what I would predict as well. If it exists, presentiment is most likely not a ‘conscious’ phenomenon. Personally, I have never consciously experienced a time-symmetric emotion. However, most of us will know the experience of a ‘gut feeling’ (“I knew something bad was gonna happen!”). My own research has shown that such ‘gut feelings’ are more easily picked up when you’re not consciously trying to ‘listen’ to them. So, in that respect, if presentiment exists, it might indeed be easier to find in experiments that require ‘fast thinking’.

One final remark: some critics argue that psi should not be studied at all, and ridicule any serious treatment of the topic. I have even seen some tweets arguing that psi is an excellent benchmark for novel statistical methods – if you find evidence for psi using your statistical method, your method is wrong!

This stance seems profoundly unscientific to me, in particular coming from people who claim to have a strong dislike for pseudoscience. You simply cannot and should not dismiss results purely on the basis that they do not fit your beliefs. I’d say we had some bad experiences with that in the Middle Ages.

Arguably, the theoretical case for psi is not strong, but it is strong enough to take seriously. As a researcher, I study consciousness, and to be frank, we haven’t got a clue how consciousness works. The hard problem of consciousness is still far from solved, and there are profound philosophical problems with materialist theories of consciousness. Psi (if it exists) may offer new insights into the nature of consciousness. For me, this is the main reason to regard the field with interest and not dismiss it out of hand. In that respect, I think Bem’s work is valuable, interesting, and deserves recognition from mainstream science, and I’d rather have skeptics trying to shoot at the data and analyses than actively ignoring psi altogether. Therefore, I am happy to see that many people have taken a critical look at the data; in the end, this can only give us a clearer picture of what is going on.