Transcribing real-valued sequences with deep neural networks

Abstract/Contents

Abstract
Speech recognition and arrhythmia detection from electrocardiograms are examples of problems that can be formulated as transcribing real-valued sequences. These problems have traditionally been solved with frameworks like the Hidden Markov Model. To generalize well, these models rely on carefully hand-engineered building blocks. More general, end-to-end neural networks capable of learning from much larger datasets can achieve lower error rates. However, getting these models to work well in practice poses other challenges. In this work, we present end-to-end models for transcribing real-valued sequences and discuss several applications of these models. The first is detecting abnormal heart activity in electrocardiograms. The second is large-vocabulary continuous speech recognition. Finally, we investigate the tasks of keyword spotting and voice activity detection. In all cases we show how to scale high-capacity models to unprecedentedly large datasets. With these techniques we can achieve performance comparable to that of human experts for both arrhythmia detection and speech recognition, and state-of-the-art error rates in speech recognition for multiple languages.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2018
Issuance monographic
Language English

Creators/Contributors

Associated with Hannun, Awni
Associated with Stanford University, Computer Science Department.
Primary advisor Jurafsky, Dan, 1962-
Primary advisor Ng, Andrew Y, 1976-
Thesis advisor Jurafsky, Dan, 1962-
Thesis advisor Ng, Andrew Y, 1976-
Thesis advisor Kundaje, Anshul, 1980-
Thesis advisor Zou, James

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Awni Hannun.
Note Submitted to the Department of Computer Science.
Thesis Thesis (Ph.D.)--Stanford University, 2018.
Location electronic resource

Access conditions

Copyright
© 2018 by Awni Yusuf Hannun
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
