Information Management

Paper: Program Description Abstract

From Conception to Action: Elevating Library Projects through Collaboration Between Librarians and Developers

Sunday, May 5
5:35 PM - 5:50 PM
Room: Columbus GH (East Tower, Ballroom/Gold Level)

Background: The Data Catalog Collaboration Project (DCCP) is a multi-institutional effort to increase the value of research data by making datasets easy to find and reuse. The project has a technical component, consisting of an open-source search engine and a user interface for metadata curation, and a human component: collaboration both within and between institutions. Librarians and developers work on an equal footing to suggest new directions and innovations that continually improve the catalog. The DCCP uses several strategies and technologies to improve dataset visibility, and many of its innovations have arisen from collaboration between developers and librarians.
Description: Librarians and developers each bring their own expertise to the table, and the fusion of these viewpoints has been valuable. Initially, librarians suggested the use of Linked Open Data, and developers advocated for using the vocabulary; this combination has enabled our metadata to be harvested by Google’s new Dataset Search tool, which now prominently displays datasets from our catalogs. A new application programming interface (API) allows librarians to provide developers with a CSV file of curated metadata that can be ingested in bulk, allowing rapid growth of the catalog. Librarians suggest changes to the metadata schema to accommodate new use cases, and developers then implement them. Developers also suggested a novel way for librarians to provide researchers with a temporary link, so that researchers can approve the way their dataset appears in the catalog before it goes live.
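To illustrate the bulk-ingest step, here is a minimal sketch of how a CSV file of curated metadata might be validated and turned into a JSON payload for an ingest API. The abstract does not describe the DCCP schema or the API itself, so the column names (`title`, `description`, `keywords`, `authors`), the semicolon separator for multi-valued fields, and the `datasets` payload key are all assumptions made for this example.

```python
import csv
import io
import json

# Hypothetical column names; the real DCCP metadata schema is not given here.
REQUIRED_FIELDS = ("title", "description")
MULTI_VALUE_FIELDS = ("keywords", "authors")


def rows_to_records(csv_text):
    """Parse curated CSV text into (records, errors).

    Rows missing a required field are reported in errors rather than ingested.
    """
    records, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Ignore overflow columns (key None) and treat missing cells as empty.
        record = {
            key: (value or "").strip()
            for key, value in row.items()
            if key is not None
        }
        missing = [field for field in REQUIRED_FIELDS if not record.get(field)]
        if missing:
            # reader.line_num is the source line the row ended on.
            errors.append((reader.line_num, missing))
            continue
        # Assumed convention: multi-valued cells are separated by semicolons.
        for field in MULTI_VALUE_FIELDS:
            raw = record.get(field, "")
            record[field] = [v.strip() for v in raw.split(";") if v.strip()]
        records.append(record)
    return records, errors


SAMPLE = """title,description,keywords,authors
Sleep Study,Actigraphy data from adults,sleep; actigraphy,A. Researcher
,Missing title row,,
"""

records, errors = rows_to_records(SAMPLE)
# Hypothetical request body for a bulk-ingest endpoint.
payload = json.dumps({"datasets": records})
```

Reporting invalid rows alongside the valid ones, instead of stopping at the first problem, lets a librarian correct the whole spreadsheet in one pass before resubmitting it.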
Conclusion: Librarians and developers established a streamlined workflow for metadata curation which, combined with using the API to batch ingest dataset records, enabled DCCP institutions to add records more efficiently. Temporary links allow librarians to distribute pre-published records to researchers, who can view their datasets in context before approving them. Finally, records from DCCP catalogs now appear in Google Dataset Search, which has drastically increased the number of daily visitors and pageviews. This project demonstrates that collaboration between librarians and developers, where both parties work on an equal footing, can provide high-value outcomes for specific library projects.
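One common way to build a temporary preview link is a signed token that carries an expiry time. The sketch below shows that general technique only; the abstract does not say how DCCP implements its temporary links, and the names `SECRET_KEY`, `make_preview_token`, and `check_preview_token` are hypothetical.

```python
import hashlib
import hmac
import time

# Hypothetical secret; a real deployment would load this from configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def make_preview_token(record_id, expires_at, key=SECRET_KEY):
    """Build a token of the form record_id:expires_at:signature."""
    message = f"{record_id}:{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{record_id}:{expires_at}:{signature}"


def check_preview_token(token, now=None, key=SECRET_KEY):
    """Return the record id if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        record_id, expires_at, signature = token.rsplit(":", 2)
        expires = int(expires_at)
    except ValueError:
        return None  # malformed token
    message = f"{record_id}:{expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return record_id if now < expires else None


token = make_preview_token("dataset-42", expires_at=1000)
still_valid = check_preview_token(token, now=999)
expired = check_preview_token(token, now=1001)
tampered = check_preview_token(token.replace("dataset-42", "dataset-43"), now=999)
```

Because the expiry time is part of the signed message, a researcher cannot extend the link's lifetime by editing the token, and the server does not need to store issued links.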

Ian Lamb

Solutions Developer
NYU Health Sciences Library
New York, New York

Ian Lamb is a Solutions Developer at the NYU Health Sciences Library. He focuses on building friendly and usable library systems.



Joel Marchewka

Web Applications Developer
Health Sciences Library System / Digital Library Services
Pittsburgh, Pennsylvania



Jean-Paul Courneya

Health Sciences and Human Services Library
Baltimore, Maryland

Jean-Paul is the Bioinformationist at the University of Maryland Health Sciences and Human Services Library. He combines extensive experience in laboratory science for biomedical research with bioinformatics to provide education, consultation, and networking opportunities for the faculty, staff, and students of the University. His areas of interest are high performance computing and data analysis for global molecular profiling of disease. He is passionate about open-source statistical and scientific software, panOmics, research data management, and connecting users to software and information resources for meeting scholarly goals.



Kevin Read

Lead, Data Discovery and Data Services Librarian
NYU Health Sciences Library
New York, New York

Kevin Read, MLIS, MAS, is the Lead, Data Discovery and Data Services Librarian at NYU Langone Health. He leads the NYU Data Catalog project, an initiative to make research datasets created and used by NYU researchers more discoverable. He also leads the Data Catalog Collaboration Project, a multi-site collaboration of eight academic institutions working to improve the discoverability of institutional research data using the NYU Data Catalog model.

Beyond his data discovery efforts, Kevin provides training and research support to faculty, residents, students, and staff on topics including clinical research data management, REDCap, reproducibility, and data sharing.



