Reproducible Data Management: Prepare your Research Data for Sharing

Information and resources for reproducible data management for the UCSF research community

Director of Data Science and Open Scholarship

Ariel Deardorff

Why share your research data?

Many biomedical and population health journals and funders now require researchers to make their de-identified research data publicly available when they publish their results. Journals like PLOS, PNAS, Science, and Nature require de-identified data sharing as a prerequisite to publication. Funders like Howard Hughes, NIH, and the Gates Foundation now require that grantees make data public alongside their articles, and, starting in 2023, NIH will require a data sharing plan from all grantees. But what does it mean to share your data? This page gives you an overview of the process.

UCSF Guidance for Data Sharing

UCSF now has guidance for sharing de-identified data to meet publisher and journal requirements. Check out the step-by-step guidance to learn how to plan for data sharing, and share data in accordance with UCSF policies.

Select data and documentation for sharing

What data will you share? This depends largely on your field of research, but you should consider what someone else in your field would need to validate your results.

What documentation needs to accompany the data? Data by itself is seldom useful. What data dictionaries, metadata, or code would someone else need in order to use your data?
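As one illustration of this kind of documentation, a starter data dictionary can be drafted directly from a dataset's columns. The sketch below is a minimal example under stated assumptions, not UCSF guidance: the function name build_data_dictionary, the column names, and the sample values are all hypothetical, and the descriptions still have to be written by someone who knows the study.

```python
import csv
import io


def build_data_dictionary(csv_text, descriptions):
    """Return one dictionary entry per column found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        # Treat empty or absent cells as missing values.
        values = [row.get(name) or "" for row in rows]
        present = [v for v in values if v != ""]
        entries.append({
            "variable": name,
            # Columns without a human-written description are marked for follow-up.
            "description": descriptions.get(name, "TODO: describe"),
            "n_missing": len(values) - len(present),
            "example": present[0] if present else "",
        })
    return entries


# Hypothetical sample data, made up for illustration only.
sample = "participant_id,age_years,sbp_mmhg\nP001,54,128\nP002,,135\n"
descriptions = {
    "participant_id": "Study-assigned ID (not a medical record number)",
    "age_years": "Age at enrollment, in years",
}

dictionary = build_data_dictionary(sample, descriptions)
for entry in dictionary:
    print(entry)
```

Any column still marked "TODO: describe" is a prompt to add a definition, units, and allowed values before the dataset is shared.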

Ensure you have permission to share

Do you have consent to share?

If your research involves human subjects, did you mention data sharing in your IRB and informed consent documents?

De-identify your data

Has your data been de-identified?

It is important to properly de-identify your data to reduce the risk that individuals in a dataset can be identified. UCSF now offers a service to validate that your data has been de-identified.
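Before requesting a formal review, a quick self-check of column names can catch obvious direct identifiers. The sketch below is only a rough first pass under stated assumptions: the pattern list and the function name flag_possible_identifiers are hypothetical and not exhaustive, the check looks at column names rather than values, and it is not a substitute for proper de-identification or for the UCSF validation service.

```python
import re

# Assumed, non-exhaustive patterns for column names that often hold
# direct identifiers. Adapt this list to your own dataset.
IDENTIFIER_PATTERNS = [
    r"name", r"ssn", r"mrn", r"email", r"phone", r"address", r"dob|birth",
]


def flag_possible_identifiers(column_names):
    """Return the column names that match any identifier pattern."""
    flagged = []
    for col in column_names:
        if any(re.search(p, col, flags=re.IGNORECASE) for p in IDENTIFIER_PATTERNS):
            flagged.append(col)
    return flagged


# Hypothetical column names, made up for illustration only.
columns = ["participant_id", "first_name", "DOB", "sbp_mmhg", "Email"]
flagged = flag_possible_identifiers(columns)
print(flagged)
```

A column that is not flagged may still contain identifying information (for example, free-text notes or rare combinations of values), so treat the result as a prompt for review rather than a clearance.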