Illusionary Order: Online Databases, Optical Character Recognition, and Canadian History, 1997–2010
It all seems so orderly and comprehensive. Instead of firing up the microfilm reader to navigate the Globe and Mail or the Toronto Star, one needs only to log into online newspaper databases. A keyword search, for a particular event, person, or cultural phenomenon, brings up a list of research findings. Previously impossible research projects can now be attempted. This process has fundamentally reshaped Canadian historical scholarship. We can see this in Canadian history dissertations. In 1998, a year with 67 dissertations, the Toronto Star was cited 74 times. However it was cited 753 times in 2010, a year with 69 dissertations. Similar data appears in the Canadian Historical Review (CHR), a prestigious peer-reviewed journal. Databases are skewing our research. We are witnessing the application of commercial Optical Character Recognition (OCR) technology – originally and primarily designed for the efficient digitization of large reams of corporate and legal documents, conventionally formatted – to historical sources. The results are, unsurprisingly, a mixed bag. In this article, I make two arguments. Firstly, online historical databases have profoundly shaped Canadian historiography. In a shift that is rarely – if ever – made explicit, Canadian historians have profoundly reacted to the availability of online databases. Secondly, historians need to understand how OCR works, in order to bring a level of methodological rigor to their work that use these sources.
Cite this version of the work
Ian Milligan (2013). Illusionary Order: Online Databases, Optical Character Recognition, and Canadian History, 1997–2010. UWSpace. http://hdl.handle.net/10012/11748