Ideally, scientific research should be replicable (reproducible): it should use procedures that others can follow to conduct a similar or identical study. When people refer to the replication crisis, they usually mean the failure to replicate statistically significant findings. It would be more precise to say there is a "statistically significant replication crisis." Original studies that fail to show significance may also reflect a Type II error, that is, missing an effect that actually exists. This can occur due to a number of methodological or statistical issues. As an example, when I conducted a study on the influence of expectations on food liking, the finding was not statistically significant; a statistical power analysis then revealed that, given the effect size and significance level, I needed a larger sample to detect the effect. Both statistically significant and nonsignificant findings should be replicated, using different types of replication and samples with varying characteristics.
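To make the power analysis idea concrete, here is a minimal sketch of how sample size, effect size, and significance level relate. It is not the analysis from the food liking study. It assumes a simple two-group comparison, a standardized effect size (Cohen's d), and a normal approximation to the two-sided test. The function names and default values are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

# Standard normal distribution, used for the normal approximation.
_Z = NormalDist()


def required_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group comparison.

    Assumption: normal approximation, equal group sizes, effect size given
    as Cohen's d. An exact t-test calculation gives slightly larger numbers.
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = _Z.inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of the same test for a given sample size per group.

    Assumption: the negligible contribution of the opposite tail is ignored.
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * math.sqrt(n_per_group / 2)
    return _Z.cdf(noncentrality - z_alpha)


if __name__ == "__main__":
    # A medium effect (d = 0.5) with only 20 participants per group
    # leaves the study badly underpowered.
    print(round(approx_power(0.5, 20), 2))
    # Sample size per group needed to reach 80% power for the same effect.
    print(required_n_per_group(0.5))
```

Under these assumptions, a study with 20 participants per group has only about a one-in-three chance of detecting a medium effect, so a nonsignificant result there says little about whether the effect exists. Smaller effects require much larger samples: for d = 0.2 the same approximation calls for several hundred participants per group.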
What are some different types of replication studies?
There are at least three general types of replication studies: direct replication, conceptual replication, and replication-plus-extension. In a direct replication, researchers attempt to conduct the research using methods that are as close as possible to those used by the original researchers. The more transparent the original research, the easier it will generally be to replicate directly. In a conceptual replication, researchers address the same topics and questions but use different methods. Variables are manipulated and measured using different strategies, but the conceptualization remains intact. In a replication-plus-extension study, researchers replicate the original study but also add variables, which may include different operationalizations.
What are the implications of replication studies?
Extra weight is often given to findings that are replicated (that is, significance is found again) outside of the original lab, or by researchers other than those who made the original finding. It is a red flag if only a specific group or lab is able to produce a finding. Why can't others produce it? It is essential that researchers are transparent about their methods and share all relevant research materials. Strong evidence is the result of multiple studies, not a single study or a series of studies whose results only one research group can obtain. To reiterate, scientific progress is cumulative; it develops as a product of the work of, sometimes, many people. In some cases it is necessary to repeat studies that did not find significance, because the original study might be flawed. The apex of evidence is converging evidence. Individual research methods, statistics, models, and inferential strategies all have limitations; it is the preponderance of evidence from various lines of inquiry, converging on the same conclusion, that produces the highest level of evidence.
For further discussion of issues with scientific methods, refer to In Evidence We Trust, 2nd Edition
Various articles on replication from Andrew Gelman's site