Polymerization Data Mining: A Perspective

Mohammadi, Yousef; Penlidis, Alexander

Files

Polymerization Data Mining Dec 3 2018 as accepted ATS UWSpace upload Jan 15 2021.pdf (483.99 KB)

Date

2019-04-01

Authors

Mohammadi, Yousef

Penlidis, Alexander

Publisher

Wiley

Abstract

Nowadays, ‘data mining’ is widely proposed by data scientists as the most accepted and powerful approach to handling the information explosion. Data mining is defined as the extraction of interesting patterns and knowledge from huge amounts of data, where ‘interesting’ means ‘non-trivial’, ‘implicit’, ‘previously unknown’, and ‘potentially useful’. Generally, data mining projects consist of three essential steps: data pre-processing, processing, and post-processing. The first step, data pre-processing, is mostly applied for data cleaning, data integration, data transformation, and dimensionality reduction. Data processing, the heart of all data mining projects, applies powerful modeling and optimization techniques and results in knowledge discovery, the main outcome of data mining. Post-processing, the last step, is mostly employed to interpret, visualize, and present the processed outputs. The main functions of data mining are generalization, pattern discovery, classification, clustering, outlier analysis, time and ordering analysis (sequential pattern, trend, and evolution analysis), and structure/network analysis. Data mining is the confluence of multiple disciplines, including statistics, visualization technology, high-performance computing, database technology, algorithm design, machine learning, and pattern recognition, and it has a wide variety of applications. This breadth is mostly due to (1) the tremendous amount of data being generated (i.e. ‘big data’), (2) the high dimensionality of data, (3) the high complexity of data, and (4) the emergence of novel and sophisticated applications.
Today, data mining has been applied across a vast range of areas, such as web page analysis, market basket analysis, fraud and intrusion detection, banking, telecommunication, customer relationship management, bioinformatics, educational technology, software engineering, criminal investigation, medical and health systems, text analysis, voice recognition, social and information networks, and the analysis of large amounts of unstructured information in the oil and gas industry. Polymerization data mining, as in other disciplines, can be considered the measurement, collection, analysis, and reporting of data about polymerization systems for the purposes of understanding, controlling, and optimizing macromolecular reactions and the environments in which they occur. In fact, polymerization data mining is the effective and intelligent processing and analysis of the massive datasets frequently generated in polymerization systems. In general, in macromolecular reaction engineering projects, several polymerization recipes are first predefined using experimental design techniques. Then, the polymerization process is performed separately for each recipe. Afterwards, the produced macromolecules are analyzed using the available experimental techniques to determine their micromolecular characteristics and final properties. The microstructure and architecture of the synthesized chains are quantified by well-defined micromolecular indices, expressed either as average or distributional properties. The final properties, including chemical, physical, thermal, mechanical, optical, and/or biological properties, determine the suitability of the produced macromolecules for different applications. Undoubtedly, understanding the intricate interrelationships between polymerization recipe, microstructure, and ultimately polymer properties is the key to tailor-making complex macromolecules.
Hence, the ultimate goal of polymerization data mining is to ‘crack’ the complexity of recipe-architecture-property interrelationships via masterful processing of the collected data.
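
As a rough illustration only, the three steps named in the abstract (pre-processing, processing, post-processing) could be sketched as below. This is not the authors' method: the records, the column meanings, the function names, and the choice of a straight-line least-squares fit as the "processing" step are all assumptions made for the example, and every number is invented.

```python
from statistics import mean, pstdev

# Hypothetical records: (initiator concentration, temperature, average molecular weight).
# All values are made up for illustration; None marks a missing measurement.
RAW = [
    (0.10, 60.0, 120.0),
    (0.20, 60.0, 100.0),
    (0.30, 60.0, None),
    (0.40, 60.0, 60.0),
    (0.50, 60.0, 40.0),
]


def preprocess(records):
    """Pre-processing: data cleaning plus a crude dimensionality reduction."""
    # Cleaning: drop any record that has a missing value.
    complete = [r for r in records if None not in r]
    # Reduction: drop columns that never vary, since they carry no information.
    n_cols = len(complete[0])
    keep = [j for j in range(n_cols) if pstdev([r[j] for r in complete]) > 0]
    return [tuple(r[j] for j in keep) for r in complete]


def process(pairs):
    """Processing: ordinary least-squares fit of y = intercept + slope * x."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def postprocess(slope, intercept):
    """Post-processing: present the fitted relationship in readable form."""
    return f"property = {intercept:.1f} + (slope {slope:.1f}) * recipe variable"


if __name__ == "__main__":
    cleaned = preprocess(RAW)
    slope, intercept = process(cleaned)
    print(postprocess(slope, intercept))
```

In this toy case the constant temperature column is dropped during pre-processing, so the fit relates the one remaining recipe variable to the one measured property. A real recipe-architecture-property study would involve many more variables and more capable models.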

Description

This is the peer reviewed version of the following article: Mohammadi, Y. and Penlidis, A. (2019), Polymerization Data Mining: A Perspective. Adv. Theory Simul., 2: 1800144, which has been published in final form at https://doi.org/10.1002/adts.201800144. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.