Illusionary Order: Online Databases, Optical Character Recognition, and Canadian History, 1997–2010

Date

2013-12

Authors

Milligan, Ian

Publisher

University of Toronto Press

Abstract

It all seems so orderly and comprehensive. Instead of firing up the microfilm reader to navigate the Globe and Mail or the Toronto Star, one needs only to log into online newspaper databases. A keyword search for a particular event, person, or cultural phenomenon brings up a list of research findings. Previously impossible research projects can now be attempted. This process has fundamentally reshaped Canadian historical scholarship. We can see this in Canadian history dissertations. In 1998, a year with 67 dissertations, the Toronto Star was cited 74 times. In 2010, a year with 69 dissertations, it was cited 753 times. Similar data appear in the Canadian Historical Review (CHR), a prestigious peer-reviewed journal. Databases are skewing our research. We are witnessing the application of commercial Optical Character Recognition (OCR) technology – originally and primarily designed for the efficient digitization of large reams of conventionally formatted corporate and legal documents – to historical sources. The results are, unsurprisingly, a mixed bag. In this article, I make two arguments. Firstly, online historical databases have profoundly shaped Canadian historiography. In a shift that is rarely – if ever – made explicit, Canadian historians have reacted profoundly to the availability of online databases. Secondly, historians need to understand how OCR works in order to bring a level of methodological rigor to work that uses these sources.

Description

This is a pre-copyedited, author-produced version of an article published in Canadian Historical Review. Ian Milligan. “Illusionary Order: Online Databases, Optical Character Recognition, and Canadian History, 1997–2010.” Canadian Historical Review, Vol. 94, No. 4 (December 2013): 540–569. The version of record is available online at: http://www.utpjournals.press/doi/abs/10.3138/chr.694

Keywords

Historiography, Digital history, Databases, Newspapers, Primary sources, Historical methodology, Pedagogy
