An Open-Source Strategy for Documenting Events: The Case Study of the 42nd Canadian Federal Election on Twitter

dc.contributor.author: Ruest, Nick
dc.contributor.author: Milligan, Ian
dc.date.accessioned: 2017-04-26T14:28:37Z
dc.date.available: 2017-04-26T14:28:37Z
dc.date.issued: 2016-04-25
dc.description: This work is licensed and made available under a Creative Commons Attribution 3.0 United States license. The article first appeared in Code4Lib Journal, issue 32, 2016-04-25. The original is available at http://journal.code4lib.org/articles/11358 [en]
dc.description.abstract: This article examines the tools, approaches, collaboration, and findings of the Web Archives for Historical Research Group around the capture and analysis of about 4 million tweets during the 2015 Canadian Federal Election. We hope that national libraries and other heritage institutions will find our model useful as they consider how to capture, preserve, and analyze ongoing events using Twitter. While Twitter is not a representative sample of broader society (Pew research on US users shows that it skews young, college-educated, and affluent, with household income above $50,000), Twitter still represents an exponential increase in the amount of information generated, retained, and preserved from 'everyday' people. Therefore, when historians study the 2015 federal election, Twitter will be a prime source. On August 3, 2015, the team initiated both a Search API and a Stream API collection with twarc, a tool developed by Ed Summers, using the hashtag #elxn42. The hashtag referred to the election being Canada's 42nd general federal election (hence 'election 42' or elxn42). Data collection ceased on November 5, 2015, the day after Justin Trudeau was sworn in as the 23rd Prime Minister of Canada. We collected for a total of 102 days, 13 hours and 50 minutes. To analyze the data set, we took advantage of a number of command-line tools: the utilities available within twarc and twarc-report, and jq. In accordance with the Twitter Developer Agreement & Policy, and after ethical deliberations discussed below, we made the tweet IDs and other derivative data available in a data repository. This allows other people to use our dataset, cite our dataset, and enhance their own research projects by drawing on #elxn42 tweets.
Our analytics included: breaking tweet text down by day to track change over time; client analysis, allowing us to see how the scale of mobile devices affected medium interactions; URL analysis, comparing the URLs against both Archive-It collections and the Wayback Availability API to add to our understanding of crawl completeness; and image analysis, using an archive of extracted images. Our article introduces our collecting work, ethical considerations, and the analysis we have done, and provides a framework for other collecting institutions to do similar work with our off-the-shelf open-source tools. We conclude by ruminating about connecting Twitter archiving with a broader web archiving strategy. [en]
dc.description.sponsorship: Social Sciences and Humanities Research Council of Canada || Insight Grant (435-2015-0011) [en]
dc.identifier.uri: http://journal.code4lib.org/articles/11358
dc.identifier.uri: http://hdl.handle.net/10012/11747
dc.language.iso: en [en]
dc.publisher: Code4Lib [en]
dc.rights: Attribution 3.0 United States [*]
dc.rights.uri: https://creativecommons.org/licenses/by/3.0/us/ [*]
dc.subject: Canadian federal election [en]
dc.subject: Tweet analysis [en]
dc.subject: Twitter [en]
dc.subject: Twarc [en]
dc.subject: Historical research [en]
dc.subject: Twitter archiving [en]
dc.title: An Open-Source Strategy for Documenting Events: The Case Study of the 42nd Canadian Federal Election on Twitter [en]
dc.type: Article [en]
dcterms.bibliographicCitation: Nick Ruest and Ian Milligan. “An Open-Source Strategy for Documenting Events: The Case Study of the 42nd Canadian Federal Election on Twitter.” Code4Lib Journal, Issue 32, April 2016. [en]
uws.contributor.affiliation1: Faculty of Arts [en]
uws.contributor.affiliation2: History [en]
uws.peerReviewStatus: Reviewed [en]
uws.scholarLevel: Faculty [en]
uws.typeOfResource: Text [en]
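
The per-day and client breakdowns mentioned in the abstract can be illustrated with a short Python sketch. This is not the authors' code (the abstract names twarc, twarc-report, and jq as the tools used); it only assumes line-oriented tweet JSON in the Twitter API v1.1 layout, where each record has a "created_at" timestamp and an HTML-formatted "source" field. The function name summarize and the sample records are hypothetical.

```python
import json
import re
from collections import Counter
from datetime import datetime

# Twitter API v1.1 timestamp layout, e.g. "Mon Aug 03 14:28:37 +0000 2015".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize(lines):
    """Count tweets per day and per client from line-oriented tweet JSON.

    Hypothetical helper for illustration; assumes one JSON object per line.
    """
    per_day = Counter()
    per_client = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        per_day[created.date().isoformat()] += 1
        # "source" is an HTML anchor; strip the tags to keep the client name.
        client = re.sub(r"<[^>]+>", "", tweet.get("source", "")).strip()
        per_client[client or "unknown"] += 1
    return per_day, per_client


# Hypothetical sample records, not taken from the #elxn42 dataset.
sample_tweets = [
    {
        "created_at": "Mon Aug 03 14:00:00 +0000 2015",
        "source": '<a href="http://twitter.com/download/iphone" '
                  'rel="nofollow">Twitter for iPhone</a>',
    },
    {
        "created_at": "Mon Aug 03 18:30:00 +0000 2015",
        "source": '<a href="http://twitter.com/download/iphone" '
                  'rel="nofollow">Twitter for iPhone</a>',
    },
    {
        "created_at": "Tue Aug 04 09:15:00 +0000 2015",
        "source": '<a href="http://twitter.com" '
                  'rel="nofollow">Twitter Web Client</a>',
    },
]
sample = [json.dumps(t) for t in sample_tweets]

if __name__ == "__main__":
    days, clients = summarize(sample)
    for day, count in sorted(days.items()):
        print(day, count)
    for client, count in clients.most_common():
        print(client, count)
```

Because summarize accepts any iterable of lines, an open file handle over a collected line-oriented JSON file could be passed in place of the sample list.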

Files

Original bundle
Name: The Code4Lib Journal – An Open-Source Strategy for Documenting Events_ The Case .pdf
Size: 876.86 KB
Format: Adobe Portable Document Format
Description: Publisher's version
License bundle
Name: license.txt
Size: 4.46 KB
Format: Item-specific license agreed to upon submission