Flexible Mixed-Effect Modeling of Functional Data, with Applications to Process Monitoring

Mosesova, Sofia

Files

mosesova-thesis.pdf (9.87 MB)

Date

2007-06-18T17:04:23Z

Authors

Mosesova, Sofia

Publisher

University of Waterloo

Abstract

High levels of automation in manufacturing industries are leading to data sets of increasing size and dimension. The challenge facing statisticians and field professionals is to develop methodology for analyzing such data. Functional data are one example of high-dimensional data: observations are recorded as a function of some continuous measure, such as time. An application considered in this thesis comes from the automotive industry. It involves a production process in which valve seats are force-fitted by a ram into cylinder heads of automobile engines. For each insertion, the force exerted by the ram is automatically recorded every fraction of a second for about two and a half seconds, generating a force profile. We can think of these profiles as individual functions of time that together form a collection of curves. The focus of this thesis is the analysis of functional process data such as the valve seat insertion example. A number of techniques are set forth. In the first part, two ways to model a single curve are considered: a B-spline fit via linear regression, and a nonlinear model based on differential equations. Each of these approaches is incorporated into a mixed-effects model for multiple curves, and multivariate process monitoring techniques are applied to the predicted random effects in order to identify anomalous curves. In the second part, a Bayesian hierarchical model is used to cluster low-dimensional summaries of the curves into meaningful groups. The belief is that the clusters correspond to distinct types of processes (e.g., various types of “good” or “faulty” assembly). A new observation can be assigned to one of these clusters by calculating its probability of belonging to each cluster. Mahalanobis distances are used to identify new observations that do not belong to any of the existing clusters. Synthetic and real data are used to validate the results.
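The first-part workflow described in the abstract (a B-spline fit by linear regression, followed by multivariate monitoring of per-curve quantities) can be illustrated with a minimal sketch. This is not the thesis's method: it replaces the mixed-effects model and its predicted random effects with independent least-squares coefficients for each curve, and it uses an empirical 99% quantile of reference distances as the control limit. The synthetic force profiles, the knot placement, the coefficient values, and all function names (`bspline_basis`, `simulate`, `fit_coefficients`, `mahalanobis_sq`) are assumptions made for illustration only.

```python
import numpy as np


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Degree 0: indicator of each half-open knot interval [t_i, t_{i+1}).
    B = np.zeros((len(x), len(knots) - 1))
    for i in range(len(knots) - 1):
        B[:, i] = (knots[i] <= x) & (x < knots[i + 1])
    # Put the right endpoint into the last interval of positive width.
    last = np.max(np.nonzero(np.diff(knots) > 0)[0])
    B[x == knots[-1], last] = 1.0
    # Raise the degree one step at a time; zero-width spans contribute nothing.
    for d in range(1, degree + 1):
        B_new = np.zeros((len(x), len(knots) - d - 1))
        for i in range(len(knots) - d - 1):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                B_new[:, i] += (x - knots[i]) / left * B[:, i]
            if right > 0:
                B_new[:, i] += (knots[i + d + 1] - x) / right * B[:, i + 1]
        B = B_new
    return B  # shape: (len(x), len(knots) - degree - 1)


rng = np.random.default_rng(0)

# Hypothetical sampling grid: 100 force readings over about 2.5 seconds.
t = np.linspace(0.0, 2.5, 100)
degree = 3
interior = [0.5, 1.0, 1.5, 2.0]  # assumed knot placement
knots = np.r_[[0.0] * (degree + 1), interior, [2.5] * (degree + 1)]
B = bspline_basis(t, knots, degree)  # 100 x 8 design matrix

# Hypothetical mean coefficients of an in-control force profile.
mean_coef = np.array([0.0, 2.0, 5.0, 6.0, 6.5, 7.0, 7.2, 7.3])


def simulate(n, shift=0.0):
    """Simulate n noisy curves whose spline coefficients vary around the mean."""
    coefs = mean_coef + shift + rng.normal(0.0, 0.3, size=(n, len(mean_coef)))
    return coefs @ B.T + rng.normal(0.0, 0.1, size=(n, len(t)))


def fit_coefficients(Y):
    """Least-squares B-spline coefficients, one row per curve."""
    return np.linalg.lstsq(B, Y.T, rcond=None)[0].T


# Reference (in-control) curves define the mean and covariance.
ref_coef = fit_coefficients(simulate(200))
mu = ref_coef.mean(axis=0)
S_inv = np.linalg.inv(np.cov(ref_coef, rowvar=False))


def mahalanobis_sq(C):
    """Squared Mahalanobis distance of each coefficient vector from mu."""
    diff = C - mu
    return np.einsum("ij,jk,ik->i", diff, S_inv, diff)


# Assumed control limit: empirical 99% quantile of the reference distances.
threshold = np.quantile(mahalanobis_sq(ref_coef), 0.99)

# New curves: five in-control and five with a shifted mid-profile force.
d2_good = mahalanobis_sq(fit_coefficients(simulate(5)))
bad_shift = np.array([0.0, 0.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0])
d2_bad = mahalanobis_sq(fit_coefficients(simulate(5, shift=bad_shift)))

flagged_good = d2_good > threshold
flagged_bad = d2_bad > threshold
print("in-control curves flagged:", flagged_good)
print("shifted curves flagged:  ", flagged_bad)
```

Estimating the mean and covariance from reference curves only, rather than from all curves, keeps anomalous profiles from inflating the covariance and masking themselves. A mixed-effects fit, as in the thesis, would additionally shrink each curve's coefficients toward the population mean, which this sketch does not attempt.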