
dc.contributor.author: Podder, Sushil [en]
dc.date.accessioned: 2006-08-22 14:02:42 (GMT)
dc.date.available: 2006-08-22 14:02:42 (GMT)
dc.date.issued: 2004 [en]
dc.date.submitted: 2004 [en]
dc.identifier.uri: http://hdl.handle.net/10012/933
dc.description.abstract: The goal of an automatic speech recognition system is to enable a computer to understand human speech and act accordingly. Language modeling plays an important role in realizing this goal. It serves as a knowledge source by mimicking the human comprehension mechanism for understanding language. Among many approaches, statistical language modeling is widely used in automatic speech recognition systems. However, generating a reliable and robust statistical model is a very difficult task, especially for a large vocabulary system. For such a system, the performance of the language model degrades as the vocabulary size increases. Hence, the performance of the speech recognition system also degrades, owing to the increased complexity and the mutual confusion among candidate words in the language model. To solve these problems, both the size of the language model and the mutual confusion between words must be reduced. In our work, we have employed clustering techniques, using a self-organizing map, to build topical language models. Moreover, in order to capture the inherent semantics of sentences, WordNet, a lexical dictionary, has been used in the clustering process. This thesis focuses on various aspects of clustering, language model generation, extraction of task-dependent acoustic parameters, and their implementation under the framework of the CMU Sphinx3 speech engine decoder. The preliminary results presented in this thesis show the effectiveness of the topical language models. [en]
dc.format: application/pdf [en]
dc.format.extent: 1353777 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.rights: Copyright: 2004, Podder, Sushil. All rights reserved. [en]
dc.subject: Systems Design [en]
dc.subject: ASR [en]
dc.subject: language model generation [en]
dc.subject: WordNet [en]
dc.title: Unsupervised Clustering and Automatic Language Model Generation for ASR [en]
dc.type: Master Thesis [en]
dc.pending: false [en]
uws-etd.degree.department: Systems Design Engineering [en]
uws-etd.degree: Master of Applied Science [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]




UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.
