Hidden conditional random fields for speech recognition

Sung, Yun-Hsuan; Stanford University, Department of Electrical Engineering

Abstract/Contents

Abstract: This thesis investigates using a new graphical model, hidden conditional random fields (HCRFs), for speech recognition. Conditional random fields (CRFs) are discriminative sequence models that have been successfully applied to several tasks in text processing, such as named entity recognition. Recently, there has been increasing interest in applying CRFs to speech recognition due to the similarity between speech and text processing. HCRFs are CRFs augmented with hidden variables that are capable of representing the dynamic changes and variations in speech signals. HCRFs also have the ability to incorporate correlated features from both speech signals and text without making strong independence assumptions among them. This thesis presents my current research on applying HCRFs to speech recognition and HCRFs' potential to replace the current hidden Markov model (HMM) for acoustic modeling. Experimental results of phone classification, phone recognition, and speaker adaptation are presented and discussed. Our monophone HCRFs outperform both maximum mutual information estimation (MMIE) and minimum phone error (MPE) trained HMMs and achieve state-of-the-art performance on the TIMIT phone classification and recognition tasks. We also show how to jointly train acoustic models and language models in HCRFs, which improves the results. For HCRF speaker adaptation, maximum a posteriori (MAP) and maximum conditional likelihood linear regression (MCLLR) successfully adapt speaker-independent models to speaker-dependent models with a small amount of adaptation data. Finally, we explore adding gender and dialect features for phone recognition, and experimental results are presented.

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2010
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Sung, Yun-Hsuan
Associated with	Stanford University, Department of Electrical Engineering
Primary advisor	Jurafsky, Dan, 1962-
Thesis advisor	Jurafsky, Dan, 1962-
Thesis advisor	Gray, Robert M, 1943-
Thesis advisor	Manning, Christopher D

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Yun-Hsuan Sung.
Note	Submitted to the Department of Electrical Engineering.
Thesis	Ph.D. Stanford University 2010
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
