Predicting Repository Upkeep with Textual Personality Analysis
Date
2019-08-29
Authors
Sachs, Alexander
Advisor
Hoey, Jesse
Publisher
University of Waterloo
Abstract
GitHub is an excellent democratic source of software. Unlike traditional work groups,
however, GitHub repositories are primarily anonymous and virtual.
Traditional strategies for improving the productivity of a work group often include
external consultation agencies that conduct in-person interviews. The data from these
interviews are then reviewed and recommendations are provided. This is one approach
within a group of strategies called group dynamics. In the online world, however, where
colleagues are often anonymous and geographically dispersed, it is often impossible to
apply such approaches.
We developed experimental methods that use automated means to discern the same
information one would normally obtain through in-person interviews. Here we present
this automated method of data collection and analysis, which can later be applied in
recommendation agents.
Comments from individual developers were collected via various GitHub APIs. That
data was then converted into personality traits for each individual through textual persona
extraction and mapped to a personality space called SYMLOG. The resulting dynamics
among the personalities of each repository's developers are analyzed through
SYMLOG to predict how successful each project is likely to be. These predictions are
compared against valid preexisting success metrics.
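The first step described above, collecting comments per developer, might be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not say which GitHub APIs were used, so the choice of the REST issue-comments endpoint is an assumption, and the function names fetch_issue_comments and group_comments_by_author are hypothetical. The personality extraction and SYMLOG mapping steps are not shown.

```python
import json
import urllib.request
from collections import defaultdict

API_ROOT = "https://api.github.com"


def fetch_issue_comments(owner, repo, page=1, per_page=100):
    """Fetch one page of issue comments for a repository.

    Assumption: the REST issue-comments endpoint stands in for the
    "various GitHub APIs" mentioned in the abstract. Unauthenticated
    requests are rate limited.
    """
    url = (
        f"{API_ROOT}/repos/{owner}/{repo}/issues/comments"
        f"?per_page={per_page}&page={page}"
    )
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def group_comments_by_author(comments):
    """Map each author login to the list of that author's comment bodies.

    Comments with a missing user or an empty body are skipped.
    """
    grouped = defaultdict(list)
    for comment in comments:
        user = comment.get("user") or {}
        login = user.get("login")
        body = comment.get("body")
        if login and body:
            grouped[login].append(body)
    return dict(grouped)
```

The grouped text for each login would then be the input to whatever textual persona extraction is applied downstream.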
Keywords
artificial intelligence, personality, SYMLOG, NLP, big 5