Automating Big Data Cleaning: An Example Using Local Bibliometric Data

Authors

Carson, Jana
Gordon, Shannon

Abstract

The University of Waterloo recognizes bibliometric data as an important part of evidence-based research assessment and recommends it as one measure, among many, for capturing research productivity trends and elements of research impact. Even when used within a basket of measures, bibliometric data remains complex and requires significant cleaning because of name ambiguity. This session will explore an innovative collaboration between the Library and Institutional Analysis and Planning (IAP) that supports the integrity of local, discipline-level bibliometric data by automating key data processes of an internal project. It will introduce how bibliometric data is relevant to the University, the process used to gather and vet local bibliometric data, and the ways in which key data processes have been automated using Python and a database to support efficient reporting. Given the known challenges of name ambiguity, this collaborative framework makes it possible to support the integrity of local bibliometric data, a key step in supporting this and similar in-demand analyses at the University.
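
To make the idea of automating name cleaning with Python and a database more concrete, the following is a minimal sketch and not the project's actual code. It assumes author names arrive as "Surname, Given Names" strings, uses SQLite from the Python standard library as a stand-in for whatever database the project used, and reduces each name to a crude surname-plus-initial key. The function names, the table layout, and the sample names are all hypothetical. A key like this only flags candidate variants for human vetting; it cannot tell two different people with the same surname and initial apart.

```python
import sqlite3
import unicodedata


def name_key(raw: str) -> str:
    """Reduce 'Surname, Given Names' to a lowercase 'surname initial' key.

    Hypothetical helper: a deliberately crude key for grouping variants.
    """
    # Decompose accented characters and drop the accents so that
    # 'Gagné' and 'Gagne' produce the same key.
    ascii_name = (
        unicodedata.normalize("NFKD", raw)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    surname, _, given = ascii_name.partition(",")
    surname = surname.strip().lower()
    initial = given.strip().lower()[:1]
    return f"{surname} {initial}".strip()


def load_records(conn: sqlite3.Connection, records) -> None:
    """Store (raw_name, title) pairs together with their computed key."""
    # Assumed table layout, chosen only for this illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS publications "
        "(raw_name TEXT, name_key TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO publications VALUES (?, ?, ?)",
        [(name, name_key(name), title) for name, title in records],
    )
    conn.commit()


def variants_by_key(conn: sqlite3.Connection) -> dict:
    """Map each key to the set of raw spellings that share it."""
    groups = {}
    rows = conn.execute("SELECT name_key, raw_name FROM publications")
    for key, raw in rows:
        groups.setdefault(key, set()).add(raw)
    return groups


if __name__ == "__main__":
    # Made-up sample data, not taken from any real dataset.
    conn = sqlite3.connect(":memory:")
    load_records(
        conn,
        [
            ("Smith, John A.", "Paper A"),
            ("SMITH, J.", "Paper B"),
            ("Jones, Amy", "Paper C"),
        ],
    )
    for key, spellings in sorted(variants_by_key(conn).items()):
        print(key, sorted(spellings))
```

Keys that map to more than one spelling are the ones a reviewer would check by hand before the records feed into any report.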
