Logging Statements Analysis and Automation in Software Systems with Data Mining and Machine Learning Techniques

dc.contributor.author: Gholamian, Sina
dc.date.accessioned: 2022-01-19T18:37:07Z
dc.date.available: 2022-01-19T18:37:07Z
dc.date.issued: 2022-01-19
dc.date.submitted: 2022-01-13
dc.description.abstract: Log files are widely used to record runtime information of software systems, such as the timestamp of an event, the name or ID of the component that generated the log, and parts of the state of a task execution. The rich information in logs enables system developers (and operators) to monitor the runtime behavior of their systems and to track down system problems in development and production settings. With the ever-increasing scale and complexity of modern computing systems, the volume of logs is growing rapidly. For example, eBay reported in 2018 that the rate of log generation on their servers was on the order of several petabytes per day [17]. Therefore, the traditional way of log analysis, which largely relies on manual inspection (e.g., searching for error/warning keywords or using grep), has become an inefficient, labor-intensive, error-prone, and outdated task. The growth of logs has prompted the emergence of automated tools and approaches for log mining and analysis. In parallel, embedding logging statements in the source code is a manual and error-prone task, and developers may forget to add a logging statement to the software's source code. To address the logging challenge, many efforts have aimed to automate logging statements in the source code, and, in addition, many tools have been proposed to perform large-scale log file analysis using machine learning and data mining techniques. However, the current logging process is still mostly manual, and thus proper placement and content of logging statements remain challenges. To overcome these challenges, methods that aim to automate log placement and content prediction, i.e., 'where and what to log', are of high interest. In addition, approaches that can automatically mine and extract insight from large-scale logs are also well sought after.
Thus, in this research, we focus on predicting log statements, and for this purpose, we perform an experimental study on open-source Java projects. We introduce a log-aware code-clone detection method to predict the location and description of logging statements. Additionally, we incorporate natural language processing (NLP) and deep learning methods to further enhance the performance of the log statements' description prediction. We also introduce deep-learning-based approaches for automated analysis of software logs. In particular, we analyze execution logs and extract natural language characteristics of logs to enable the application of natural language models for automated log file analysis. Then, we propose automated tools for analyzing log files and measuring the information gain from logs for different log analysis tasks such as anomaly detection. We then continue our NLP-enabled approach by leveraging state-of-the-art language models, i.e., Transformers, to perform automated log parsing. [en]
dc.identifier.uri: http://hdl.handle.net/10012/17923
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.subject: software systems [en]
dc.subject: logging [en]
dc.subject: log statement [en]
dc.subject: log file [en]
dc.subject: log automation [en]
dc.subject: log analysis [en]
dc.subject: log mining [en]
dc.subject: logging cost [en]
dc.subject: natural language processing [en]
dc.subject: machine learning [en]
dc.title: Logging Statements Analysis and Automation in Software Systems with Data Mining and Machine Learning Techniques [en]
dc.type: Doctoral Thesis [en]
uws-etd.degree: Doctor of Philosophy [en]
uws-etd.degree.department: Electrical and Computer Engineering [en]
uws-etd.degree.discipline: Electrical and Computer Engineering [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.embargo.terms: 0 [en]
uws.contributor.advisor: Ward, Paul A. S.
uws.contributor.affiliation1: Faculty of Engineering [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle
Name: Gholamian_Sina.pdf
Size: 4.17 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission