UCSF Archives as Data: Guidance and Collections

This guide provides an overview of archival collections datasets (“Archives as Data”), primarily those made available by UCSF Archives and Special Collections, including guidance for accessing and using such data.

UCSF Archives as Data

UCSF Archives and Special Collections manages and provides access to special collections, rare materials, university archives, and manuscript collections that document events, organizations, individuals, and movements that have played important roles in the health sciences, as well as industries that influence public health. Many of the materials in UCSF Archives and Special Collections have been (or will be) digitized or are born digital. Of those, a number of archival collections have been prepared and packaged as datasets, comprising data extracted from, and/or about, the collections. These archival collections datasets are made available and open for research.

This guide serves as a starting point for researchers interested in working with "archives as data" and includes brief descriptions of the form and content of selected collections.

Updates to this guide will be added as more data from archival collections is made available.

Guidance for Accessing and Using UCSF Archives as Data

Where did UCSF Archives and Special Collections Datasets come from, and how can I find them?

  • UCSF Archives and Special Collections (A&SC) creates and maintains datasets comprising metadata and textual data that represent digitized and born-digital archival collections (“Collections Datasets”). In addition to providing these data for research purposes, A&SC in some cases makes available accompanying digital object files of collection materials, each of which is either a digitized copy or a born-digital record (typically a .pdf or .tif file). Collections datasets and accompanying digital objects are available for free download via Library-hosted links. Some collection materials may also be available via a Library-hosted API.
  • UCSF Library may change the type and format of the collections data we provide and may from time to time update collections datasets as collections are added or expanded. We will take reasonable steps to inform users of any changes to the type, format, or status of the data before new or updated collections datasets are made available.
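
Because the type and format of the data may change between releases, it can help to inspect a downloaded dataset before analyzing it. The following is a minimal sketch, assuming the dataset has already been downloaded as a CSV file with a header row; the function name `summarize_csv` and the choice of UTF-8 encoding are illustrative assumptions, not part of any UCSF tooling.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row.

    Assumption: the file is UTF-8 encoded. "utf-8-sig" also tolerates a
    leading byte-order mark, which some exported files carry.
    """
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        # Reading fieldnames consumes the header row only.
        columns = list(reader.fieldnames or [])
        # Count the remaining data rows without holding them in memory.
        row_count = sum(1 for _ in reader)
    return columns, row_count
```

For example, `summarize_csv("DC_Leaks_Coca_Cola_Emails.csv")` would return the column names and the number of records in a locally saved copy of that file. Comparing this summary across downloads is one simple way to notice a change in a dataset's format.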

What can I do with UCSF Archives and Special Collections Datasets?

  • UCSF A&SC Collections Datasets are open for research. While items in UCSF A&SC collections from which datasets have been derived may be protected by copyright and/or subject to restricted access as indicated in finding aids or collection descriptions, they are generally made open for research and are accessible for fair use purposes, including criticism, comment, news reporting, teaching, scholarship, and/or research. Restricted materials containing protected health information (PHI) as defined under HIPAA are made accessible in Library collections datasets only after they have been redacted and PHI removed. When applicable, collection-specific access considerations will be indicated with the collection dataset description.
  • All requests for permission to publish or quote from material must be submitted in writing to UCSF A&SC (through Ask an Archivist). Permission for publication is given on behalf of the Library as the owner of the physical items and is not intended to include or imply permission of the copyright holder, which must also be obtained by the researcher. Should you use UCSF A&SC Collections Datasets, please cite them with the name of the repository responsible for the dataset, the date associated with data publishing, the source collection name, the date/time accessed, the dataset filename (as appropriate), and the URL. Please attribute these with the following statement, “Courtesy of the UCSF Archives & Special Collections.” For example:

Industry Documents Library, 2021, DC Leaks Coca-Cola Emails Collection Dataset, accessed <date/time downloaded> as DC_Leaks_Coca_Cola_Emails.csv at https://ucsf.app.box.com/v/IDL-DataSets/file/484614170435. Courtesy of the UCSF Archives and Special Collections. 
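
Researchers who cite several datasets may find it convenient to generate citations programmatically. The following is a minimal sketch that assembles a citation in the pattern of the example above; the function name `format_citation`, its parameters, and the timestamp format are illustrative assumptions, and the access date shown is hypothetical.

```python
from datetime import datetime, timezone


def format_citation(repository, year, dataset_title, accessed, filename, url):
    """Build a citation string following the example pattern in this guide.

    `accessed` is a datetime; the "YYYY-MM-DD HH:MM TZ" rendering is an
    assumption, since the guide does not prescribe a date/time format.
    """
    accessed_text = accessed.strftime("%Y-%m-%d %H:%M %Z").strip()
    return (
        f"{repository}, {year}, {dataset_title}, "
        f"accessed {accessed_text} as {filename} at {url}. "
        "Courtesy of the UCSF Archives and Special Collections."
    )


# Hypothetical access time, used only for illustration.
example = format_citation(
    repository="Industry Documents Library",
    year=2021,
    dataset_title="DC Leaks Coca-Cola Emails Collection Dataset",
    accessed=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    filename="DC_Leaks_Coca_Cola_Emails.csv",
    url="https://ucsf.app.box.com/v/IDL-DataSets/file/484614170435",
)
print(example)
```

Recording the access time when the file is downloaded, rather than when the citation is written, keeps the citation accurate if the dataset is later updated.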

What if I still have a question about UCSF Library Collections Datasets?

Where can I find and learn more about Digital Collections from the UCSF Archives and Special Collections?

  • To learn more about or engage with the digital collection materials from which UCSF collection datasets have been derived, visit Calisphere, where UCSF A&SC deposits many of them. Calisphere is a searchable online gateway to digital materials contributed by all ten campuses of the University of California and by other important libraries, archives, and museums throughout the state of California.
  • This Calisphere overview provides important and useful context about how collection materials have come to be included in Calisphere, especially the circumstances of and considerations relating to their collection, description, and availability.