Does that mean the original research was wrong? No. It means science is really, really hard.

An article published on Vox, written by Julia Belluz on Jan 23, 2017.

Replication has emerged as a powerful tool to check science and get us closer to the truth. Researchers take an experiment that has already been done and test whether its conclusions hold up by reproducing it. The general principle is that if the results repeat, then the original results were correct and reliable. If they do not, then the first study must have been flawed, or its findings false.

But there is a big wrinkle with replication studies: they do not work like that. As researchers reproduce more experiments, they are learning that they cannot always get clear answers about the reliability of the original results. Replication, it seems, is a whole lot murkier than we thought.

Take the latest findings from the large-scale Reproducibility Project: Cancer Biology. Here, researchers focused on reproducing experiments from the highest-impact papers about cancer biology published from 2010 to 2012. They shared their results in five papers in the journal eLife last week – and not one of their replications definitively confirmed the original results.

The finding echoes those of another landmark reproducibility project, which, like the cancer biology project, came from the Center for Open Science. This time, the researchers replicated major psychology studies – and only 36 percent of the replications confirmed the original conclusions.

Replicating studies, it turns out, is really, really hard. For them to get easier, scientists are going to have to learn to be much more transparent about their process. We are also going to need to start looking at replication studies with the same critical gaze we reserve for single studies of any kind.

Replicating studies is really, really hard

Replicating a study does not just mean reading the paper and trying to run the experiment again. It is more like trying to play a complicated board game without all the instructions, or even all the parts.

To pull off the replication studies in cancer biology, the researchers had to go back to the authors of the original papers and ask them to share any information they may have left out of their methods sections, as well as any additional unpublished data that helped them arrive at their conclusions. They then came up with a plan for reproducing each study and got that plan peer reviewed – including by the original authors, statisticians, and experts in the relevant field. Even so, "sometimes, things as small as the temperature of the lab could cause a cell biology experiment to flop," said Tim Errington, who manages the cancer biology project at the Center for Open Science. Other researchers who have replicated psychology studies said they had to overcome language and cultural barriers in translating the original science to a new lab.

The takeaway here is that replication can go off the rails very easily if researchers do not take the utmost care – and sometimes even when they do.

If a replication fails, it may be a bad replication – not necessarily a bad original study

Interpreting the results of a replication study is another place where things can get messy, fast. And increasingly, scientists are realizing that when a replication fails to reproduce the original results, it doesn't necessarily mean the original was wrong.

With the Reproducibility Project: Cancer Biology, not one of the replications definitively confirmed the original results: two of the papers reproduced parts of the original experiments, one failed to reproduce the original experiment entirely, and the other two replications were impossible to interpret because of technical problems with the models. "It could be that the original was a false positive," Errington said. "It could be that the replication is a false negative – that the replication did something wrong. Or it could be that they are both right – the original is right and the replication is right – and what is occurring is that we don't know the cause of the discrepancy."

In other words, there are any number of reasons a replication may fail, and they may say nothing about the quality of the original research. "A single failure is no more definitive than a single study claiming discovery," Bristol researcher Marcus Munafo, who recently co-authored a manifesto for reproducible science, told me. "Given the latter, we need to do more of the former, and not just assume that a single study claiming discovery is robust."

Doing replication studies properly is also costly and time-consuming, said Lawrence Tabak, the principal deputy director at the National Institutes of Health. Which means "you cannot do wholesale replications – but there is likely a subset of studies where we really should consider doing replication." For example, researchers should try to replicate their findings from animal studies before moving on to costlier human studies. "Let's say a replication study of an animal study will cost half a million dollars," Tabak said. "One could argue that is a lot of money – why spend it? The answer is you would want to be certain the key experimental results could be replicated before spending $5 million on a first-in-human study." The neurology and aging centers at NIH already do this, he added, and the NIH recently released guidelines on reproducing research.

Scientists and journals need to get better at describing their methods and sharing data

Whenever I have talked to researchers who have done replication studies, it becomes clear pretty quickly that their endeavor won't go well if they don't have buy-in from the researchers who did the original study. That is because the methods sections of research papers – which describe how an experiment was done – are often not very detailed.

"One of the biggest barriers to reproducibility is simply knowing how the original research was done," said Brian Nosek, who co-founded the Center for Open Science, which supported the cancer biology and psychology reproducibility projects. A greater degree of transparency and data sharing would make replications easier and more reliable, so researchers need to get better at describing their methods in detail and sharing their data.

"There may be some results that can be reproduced under specific conditions and others that are not reproduced at all. The emerging picture is that the last two categories may be very common," Stanford meta-researcher John Ioannidis said in an email. "Funding agencies should clearly pay attention to this," he added, instead of wasting money on research that is not reproducible.

Researchers could also make use of tools like open-source software that tracks every version of a data set, so that they can share their data more easily and have transparency built into their workflow. As we wrote in a feature on the biggest problems in science at Vox, journals and funders also need to rethink their incentive structures to reward more transparency and replication.

"Replication projects are shining a light on the way we conduct research," Errington said. For better or worse, the way studies are done now makes the work of replicating them very hard and the results of replications more dubious. "But," he added, "if we make ourselves more open and transparent from the beginning, before our work is even published in a paper, that will probably help a lot."