Reference and Instruction Librarian, University of Florida, Gainesville, Florida
Objectives: Delivery of health care is moving toward greater precision as access to vast amounts of data with more heterogeneous variables increases. Identifying datasets with variables of interest is time-consuming and challenging. To shift researcher time and effort from locating useful datasets to integrating and analyzing them, two librarians created two versions of a one-stop shop for relevant datasets.
Methods: Librarians collaborated with members of their institution’s Clinical Translational Science Institute (CTSI) and Precision Public Health Work Group to identify possible topics of interest to researchers. Librarians identified potentially useful datasets on these topics and, on a LibGuide (Springshare), curated metadata for each dataset: title, identifiers, creators/curators, landing and access page URLs, all relevant dates (release, creation, modification, etc.), licensing, relevant content (specific data elements of value), and indexing terms (geographical regions, diseases, and other subjects that the data are about). Use of the LibGuide for actual research projects led to further collaboration with CTSI to convert the browsable webpage to an ontology-based data catalog.
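The metadata elements listed in the Methods can be pictured as one record per dataset. The following is a minimal sketch only: the class name, field names, helper function, and example values are all hypothetical illustrations, not the schema or data used on the LibGuide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetRecord:
    """One curated catalog entry. Field names are illustrative assumptions."""
    title: str
    identifiers: List[str]
    creators: List[str]
    landing_url: str
    access_url: str
    dates: Dict[str, str]       # keyed by kind, e.g. "release", "modification"
    license: str
    data_elements: List[str]    # specific data elements of value
    index_terms: List[str] = field(default_factory=list)  # regions, diseases, etc.


def missing_fields(record: DatasetRecord) -> List[str]:
    """Return the names of fields left empty, as a simple curation check."""
    return [name for name, value in vars(record).items() if not value]


# Entirely made-up example values, for illustration only.
example = DatasetRecord(
    title="Example County Health Survey",
    identifiers=["example-0001"],
    creators=["Example Agency"],
    landing_url="https://example.org/dataset",
    access_url="https://example.org/dataset/download",
    dates={"release": "2018-01-01"},
    license="",  # deliberately left blank to show the check
    data_elements=["age", "county"],
    index_terms=["Florida"],
)

print(missing_fields(example))
```

A check like `missing_fields` is one way a curator might spot incomplete entries before publishing them; the abstract does not describe whether such a check was used.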
Results: High usage statistics indicate the browsable webpage is well used. The webpage supports browsing and investigation of datasets for utility. Its primary limitation is that it cannot point users to the specific datasets that contain their variables of research interest. To address this, librarians collaborated with a CTSI ontologist and a technology team, which resulted in the indexing of 94 unique variables from 28 datasets for a pilot ontology-based data catalog. The improved search capability is expected to assist researchers in generating hypotheses and assessing research feasibility.
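The variable-level search described in the Results can be illustrated with a plain lookup table from variable names to the datasets that contain them. This is a simplified sketch under stated assumptions, not the pilot catalog's implementation: it uses exact keyword matching rather than an ontology, and the function names and toy dataset contents are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def build_variable_index(datasets: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Map each normalized variable name to the datasets that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for dataset_name, variables in datasets.items():
        for variable in variables:
            index[variable.strip().lower()].add(dataset_name)
    return dict(index)


def datasets_with_all(index: Dict[str, Set[str]], wanted: Iterable[str]) -> List[str]:
    """Return datasets containing every requested variable, sorted by name."""
    matches = [index.get(v.strip().lower(), set()) for v in wanted]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


# Toy catalog with invented contents, for illustration only.
toy_catalog = {
    "Dataset A": ["age", "county", "smoking status"],
    "Dataset B": ["age", "zip code"],
    "Dataset C": ["County", "smoking status"],
}

variable_index = build_variable_index(toy_catalog)
print(datasets_with_all(variable_index, ["county", "smoking status"]))
```

An ontology-based catalog would presumably go further than this, for example by linking differently named variables to a shared term, which exact string matching cannot do.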
Conclusions: The ontology-based catalog’s soft release in May 2019 will be broadened as more datasets are indexed and case studies reveal potential target audiences. A head-to-head comparison of the two tools (browsable vs. ontology-based), planned for the next year, will measure time required and ease of navigation and is expected to justify expansion of the pilot ontology-based data catalog.