
Reproducible Data Management

Information and resources for reproducible data management for the UCSF research community

Why publish your research data?

Many biomedical journals and funders now require researchers to make their data publicly available. Journals like PLOS, PNAS, Science, and Nature require data sharing as a prerequisite to publication. Funders like the Howard Hughes Medical Institute, the Gates Foundation, and the Chan Zuckerberg Biohub require that grantees make data public alongside their articles. But what does it mean to publish your data? This page gives you an overview of the process.

Prepare your data for publishing

Before you publish your data, consider the following:

  • What data will you share? This depends largely on your field of research, but consider what someone else in your field would need to validate your results.
  • What documentation needs to accompany the data? Data by itself is seldom useful. What data dictionaries, metadata, or code would someone need to use your data?
  • Do you have consent to share? If your research involves human subjects, did you mention data sharing in your IRB and informed consent documents?
  • Has your data been completely de-identified? UCSF now offers a service to validate this.

Research data repositories

Public data repositories are the best place to publish your research data. Repositories preserve and archive your data and make it easy for others to find and cite it. The best repositories are the ones specific to your discipline (especially those at NIH) because they are designed with your community in mind. That said, there are also several excellent general-purpose data repositories.