What a difference a data repository makes: Six ways depositing data maximizes the impact of your science
Data is key to verification, replication, reuse, and enhanced understanding of research conclusions. When your data is in a repository—instead of an old hard drive, say, or even a Supporting Information file—its impact and its relevance are magnified. Here are six ways that putting your data in a public repository can help your research go further.
1. You can’t lose data that’s in a public data repository
Have you ever lost track of a dataset? Maybe you’ve upgraded your computer or moved to a new institution. Maybe you deleted a file by mistake, or simply can’t remember the name of the file you’re looking for. No matter the cause, lost data can be embarrassing and time-consuming. You’re unable to supply requested information to journals during the submission process or to readers after publication. Future meta-analyses or systematic reviews are impossible. And you may end up redoing experiments in order to move forward with your line of inquiry. When data is securely deposited in a repository, with a unique DOI for tracking, archival standards to prevent loss, and metadata and readme materials to make sure your data is used correctly, fulfilling journal requests or revisiting past work is easy.
2. Public data repositories support understanding, reanalysis and reuse
Transparently posting raw data to a public repository supports trustworthy, reproducible scientific research. Insight into the data and analysis gives readers a deeper understanding of published research articles. Offering the opportunity for others to interpret results demonstrates integrity and opens new avenues for discussion and collaboration. Machine-readable data formatting allows the work to be incorporated into future systematic reviews or meta-analyses, expanding its usefulness.
3. Public data repositories facilitate discovery
Even the best data can’t be used unless it can be found. Detailed metadata, database indexing, and bidirectional linking to and from related articles help to make data in public repositories easily searchable, so that it reaches the readers who need it most, maximizing the impact and influence of the study as a whole.
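To make the idea of bidirectional linking concrete, here is a minimal sketch in Python. It is an illustration only, not any repository's actual schema: the field names and both DOIs are hypothetical placeholders, and the relation labels are modeled on DataCite-style relation types. The dataset record points to the article, the article record points back to the dataset, and a small helper checks that both links exist.

```python
# Hypothetical placeholder identifiers for illustration; these are not real DOIs.
DATASET_DOI = "10.0000/example.dataset.1"
ARTICLE_DOI = "10.0000/example.article.1"


def build_dataset_record(dataset_doi, article_doi, title, keywords):
    """Return a minimal, illustrative metadata record for a deposited dataset."""
    return {
        "identifier": dataset_doi,
        "title": title,
        # Sorted keywords give indexing services a stable, searchable list.
        "keywords": sorted(keywords),
        # Link from the dataset to the article it supports.
        "related_identifiers": [
            {"relation": "IsSupplementTo", "identifier": article_doi}
        ],
    }


def build_article_record(article_doi, dataset_doi):
    """Return a minimal, illustrative record for the related article."""
    return {
        "identifier": article_doi,
        # Link from the article back to its underlying dataset.
        "related_identifiers": [
            {"relation": "IsSupplementedBy", "identifier": dataset_doi}
        ],
    }


def is_bidirectionally_linked(record_a, record_b):
    """True if each record lists the other's identifier among its relations."""
    a_targets = {r["identifier"] for r in record_a["related_identifiers"]}
    b_targets = {r["identifier"] for r in record_b["related_identifiers"]}
    return record_b["identifier"] in a_targets and record_a["identifier"] in b_targets
```

With both links in place, a reader who finds the article can reach the data, and a reader who finds the data can reach the article. A real repository will define its own required fields, so treat this as a way to think about the structure rather than something to submit.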
4. Public data repositories reflect the true value of data
Data shouldn’t be treated like an ancillary byproduct of a research article. Data is research. And researchers deserve academic credit for collecting, capturing and curating the data they generate through their work. Public repositories help to illustrate the true importance and lasting relevance of datasets by assigning each one its own unique DOI, distinct from that of related research articles, so that datasets can accumulate citations in their own right.
5. Public data demonstrates rigor
There’s no better way to illustrate the rigor of your results than explaining exactly how you achieved them. Sharing data lets you demonstrate your credibility and inspires confidence in readers by contextualizing results and facilitating reproducibility.
6. Research with data in public data repositories attracts more citations
A 2020 study of more than 500,000 published research articles found that articles linking to data in a public repository have a 25% higher citation rate on average than articles where data is available on request or as Supporting Information. The precise reasons for the association remain unclear. Are researchers who deposit carefully curated data in a repository also more likely to produce rigorous, citation-worthy research? Are researchers with the time and resources to devote to data curation and deposition more established in their careers, and therefore more highly cited? Are readers more likely to cite research when they trust that they can verify the conclusions with data? Perhaps some combination?
I hope that the official PLOS Blog will help to document, popularize, and promote my theoretical biology work of over 45 years. This unconventional, unified, integrated “Top-Down” digital approach is opposite yet complementary to the conventional, specific-aim “Bottom-Up” observational, open/empirical approach, which depends heavily on statistics.
I developed the mass-action law based biodynamics, pharmacodynamics, and bioinformatics theory (MAL-BD/PD/BI) and the combination index (CI) theorem for synergy (CI<1) by digital computer simulation. As of this week, my cost-effective, efficient MAL theory has received over 23,000 citations in over 1,446 journals.
This way of storing the information is very interesting
Agreed, thank you Lindsay. But what options are available for depositing data where they can be found please? Are repositories also suitable for software?
We do have a list of data repositories by discipline on our journal sites–it’s far from complete, but there are some great options:
Some of these repositories (like figshare) will accept code as well as data. But there are dedicated code repositories as well (GitHub, Code Ocean, etc.).