Natural Language Processing using Deep Learning for Classifying Water Infrastructure Procurement Records and Calculating Unit Costs

Date

2024-03-06

Authors

Khaki, Milad

Publisher

University of Waterloo

Abstract

This thesis introduces a novel ontology-based deep learning classification model tailored to civil engineering applications. The model automates the extraction and classification of items in water infrastructure capital works tenders and progress certificates. An ontology standardizes the tender-bid data, and Named Entity Recognition and Classification (NERC) categorizes the items, which allows the model to handle the diversity of document styles and formats. Long Short-Term Memory (LSTM) structures within the model enable it to learn both linear and non-linear dependencies between words, which is particularly important for the unusual language constructs found in tender-bid records. The model achieves a classification accuracy of 92.6% on testing data and 98.7% on training data. Its deployment through a web server demonstrates its adaptability and efficiency in real-world scenarios, and its use in tasks such as unit cost calculation illustrates the thesis's contributions in ontology-based data structuring and LSTM-driven automated unit cost computation. Beyond its current scope, this research has potential for broader application and adaptation in other civil engineering domains. It lays the groundwork for future enhancements, including multilingual extensions and specialized approaches aligned with evolving industry trends. The thesis combines data preprocessing, deep learning, and engineering expertise to improve efficiency and accuracy, and the methodology supports continuous improvement and applicability across different regions.
The integration of this technology into civil engineering tasks, demonstrated through the web server, opens avenues for further development toward a wider range of applications. Future research directions include refining the framework to meet the changing needs of various civil engineering domains and extending the web server's capabilities to real-time data processing and analysis. Investigating the applicability of this methodology in other engineering or interdisciplinary contexts could also provide valuable insights and extend the utility of this research. This thesis lays a foundation for ongoing enhancements in capital work planning and tender contract assessment within the civil engineering industry.

Keywords

Deep learning, Ontology-based classification, Water infrastructure, Tender-bid documents, Long Short-Term Memory (LSTM), Data preprocessing, Capital work planning, Document style diversity, Interdisciplinary applicability
