Evaluation of High-Speed Videoendoscopy for Bayesian Inference on Reduced Order Vocal Fold Models
The ability to use our voice arises from a complex biomechanical process known as phonation. The study of this process is interesting not only because of the complex physical phenomena involved, but also because phonation disorders can make the everyday task of using one's voice difficult. Clinical studies of phonation aim to help diagnose such disorders using various measurement techniques, such as microphone recordings, video of the vocal folds, and perceptual sound quality measures. In contrast, scientific investigations of phonation have focused on understanding the physical phenomena behind phonation using simplified physical and numerical models constructed from representative, population-based parameters. Reduced-order numerical models are a particularly useful type of model: they are simplified representations of the vocal folds with low computational complexity, which allows broad parameter changes to be investigated. To bring the physical understanding of phonation from these models into clinical usage, it is necessary to have patient-specific parameters. Because vocal fold parameters and other structures involved in phonation are difficult to measure directly, inverse analysis techniques must be employed. These techniques estimate the parameters of a model by finding parameter values whose model outputs compare well with measured outputs. When the measured outputs are patient-specific measurements, these techniques produce patient-specific model parameters. However, this is complicated by the fact that measurements are uncertain, which leads to uncertainty in the inferred parameters. The uncertainty in the parameters provides a way to judge how confident clinicians should be in using them. Large measurement errors could result in high uncertainties (and vice versa), which should guide clinicians on whether or not to believe the estimated parameters.
Bayesian inference is an inverse analysis technique that accounts for the inherent uncertainty in measurements within a probabilistic framework. Applying Bayesian inference to reduced-order models and clinical measurements allows patient-specific model parameters, with associated uncertainties, to be inferred. A promising clinical measurement for use in Bayesian inference is high-speed videoendoscopy, in which high-speed video is taken of the vocal folds in motion. This captures the time-varying motion of the vocal folds, which allows many quantitative measurements to be derived from the resulting video, for example the glottal width (distance between the vocal folds) or glottal area (area between the vocal folds). High-speed videoendoscopy is subject to variable imaging parameters: in particular, the frame rate, the spatial resolution, and tilted views of the camera can all modify the resulting video of the vocal folds, changing the uncertainty in the derived measurements. To investigate the effect of these three imaging parameters on Bayesian inference applied to high-speed videoendoscopy, a simulated high-speed videoendoscopy experiment was conducted. A set of enlarged, artificial vocal folds was driven in slow motion using a reduced-order model with known parameters. These were imaged by a consumer DSLR camera; the slow motion increased the effective frame rate, and the enlarged vocal folds increased the effective spatial resolution, to a fidelity much greater than that of typical high-speed videos of the vocal folds. This allowed investigation of the three parameters: tilted views of the camera were investigated by physically tilting the camera, while variable frame rates and spatial resolutions were investigated by numerical downsampling of the original recording.
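As a minimal illustration of Bayesian inference on a glottal-width signal, the sketch below infers a single amplitude parameter from noisy samples by evaluating the posterior on a grid. Everything in it is an assumption made for illustration: the half-wave rectified sinusoid `toy_glottal_width` is a hypothetical stand-in and not the reduced-order model used in this work, and the flat prior, Gaussian noise level, frame rate, and function names are likewise invented for the example.

```python
import math
import random


def toy_glottal_width(amplitude, t, freq_hz=100.0):
    """Hypothetical stand-in model: half-wave rectified sinusoid (not the thesis model)."""
    return max(0.0, amplitude * math.sin(2.0 * math.pi * freq_hz * t))


def grid_posterior(times, measured, noise_std, amplitudes):
    """Posterior over candidate amplitudes, assuming a flat prior and Gaussian noise."""
    log_post = []
    for a in amplitudes:
        sq_err = sum((m - toy_glottal_width(a, t)) ** 2
                     for t, m in zip(times, measured))
        log_post.append(-0.5 * sq_err / noise_std ** 2)
    peak = max(log_post)  # subtract the peak before exponentiating for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_mean_std(amplitudes, probs):
    """Summarize the grid posterior by its mean and standard deviation."""
    mean = sum(a * p for a, p in zip(amplitudes, probs))
    var = sum((a - mean) ** 2 * p for a, p in zip(amplitudes, probs))
    return mean, math.sqrt(var)


# Synthetic "measurement": known amplitude plus Gaussian noise (assumed values).
rng = random.Random(0)
true_amplitude = 1.0
noise_std = 0.05
times = [i / 4000.0 for i in range(80)]  # 80 frames at an assumed 4000 frames/s
measured = [toy_glottal_width(true_amplitude, t) + rng.gauss(0.0, noise_std)
            for t in times]

amplitudes = [0.5 + 0.001 * i for i in range(1001)]  # candidate values 0.5 to 1.5
probs = grid_posterior(times, measured, noise_std, amplitudes)
mean, std = posterior_mean_std(amplitudes, probs)
print(f"estimated amplitude: {mean:.3f} +/- {std:.3f}")
```

The posterior standard deviation plays the role of the uncertainty on the estimated parameter: it is the quantity a clinician would consult when deciding how much to trust the estimate.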
Bayesian inference was conducted on these simulated high-speed videos, using measurements of the distance between the vocal folds (the glottal width), in order to determine the parameters of the same reduced-order model that drove the artificial vocal folds. The known driving parameters provided a reference against which the estimated parameters could be compared. The changes in the parameters estimated by Bayesian inference were then investigated as the angle of view, frame rate, and spatial resolution were modified. From the experiment, the effects of frame rate, spatial resolution, and angle of view in high-speed videoendoscopy were characterized relative to a reference video. Specifically, the uncertainty in the estimates increased linearly with the downsampling factor of the frame rate: a frame rate half that of the reference video yields an uncertainty in the estimated parameters that is twice as large. Spatial resolution affects the level of uncertainty through the edge detection technique used to extract quantitative data (i.e., the glottal width in this study). As the spatial resolution was downsampled, the error of the edge detection algorithm increased linearly with the downsampling factor, which led to the same linear increase in the uncertainty of the estimate. However, different edge detection algorithms will likely differ in accuracy as the image resolution decreases. While in this study it was preferable to decrease spatial resolution rather than frame rate, more general conclusions would depend on the specific edge detection technique used. The angle of view was found to bias estimates as a result of projecting the vocal folds (glottis) onto an offset image plane (much as a coin viewed from an increasingly oblique angle appears as an increasingly narrow ellipse, and finally as a single line, rather than as a circle). This decreased the measured glottal width, which biased the estimated parameters.
To account for this bias, it is suggested that the angle of view can be treated as an uncertain parameter, which leads to increased uncertainty in the quantitative measures from high-speed video. Alternatively, the angle of view can be estimated as an additional parameter.
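The coin analogy can be made concrete with a simple foreshortening sketch. It assumes an orthographic projection in which the camera is tilted about an axis that lies in the glottal plane and is perpendicular to the width direction, so that the apparent width scales with the cosine of the tilt angle. This geometric model and the function names are assumptions for illustration only, not the camera model used in this work.

```python
import math


def apparent_width(true_width, tilt_deg):
    """Width seen by a tilted camera under the assumed cosine foreshortening."""
    return true_width * math.cos(math.radians(tilt_deg))


def corrected_width(measured_width, tilt_deg):
    """Undo the foreshortening when the tilt angle is known."""
    scale = math.cos(math.radians(tilt_deg))
    if scale <= 1e-9:
        raise ValueError("tilt too close to 90 degrees; width is not recoverable")
    return measured_width / scale


def width_bounds(measured_width, tilt_min_deg, tilt_max_deg):
    """Range of true widths when the tilt is only known to lie in an interval.

    Assumes 0 <= tilt_min_deg <= tilt_max_deg < 90.
    """
    return (corrected_width(measured_width, tilt_min_deg),
            corrected_width(measured_width, tilt_max_deg))


# A 30 degree tilt shrinks the apparent width by about 13 percent.
seen = apparent_width(2.0, 30.0)
low, high = width_bounds(seen, 20.0, 40.0)
print(f"apparent width: {seen:.3f}, true width between {low:.3f} and {high:.3f}")
```

Under these assumptions, an uncertain tilt angle widens the range of true widths consistent with a measurement, which is one way to picture why treating the angle as uncertain increases the uncertainty in the derived measures.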
Cite this version of the work
Jonathan Deng (2018). Evaluation of High-Speed Videoendoscopy for Bayesian Inference on Reduced Order Vocal Fold Models. UWSpace. http://hdl.handle.net/10012/13319