Interactive sound source separation
Abstract/Contents
- Abstract
- In applications such as audio denoising, music transcription, music remixing, and audio-based forensics, it is desirable to decompose a single-channel recording into its respective sources. One of the most promising and effective classes of methods to do so is based on non-negative matrix factorization (NMF) and related probabilistic latent variable models (PLVMs). Such techniques, however, typically perform poorly when no isolated training data is given and offer no mechanism to improve upon unsatisfactory results. To overcome these issues, we present a new interaction paradigm and separation algorithm for single-channel source separation. The method works by allowing an end-user to roughly paint on time-frequency displays of sound. The rough annotations are then used to constrain, regularize, or otherwise inform an NMF/PLVM algorithm using the framework of posterior regularization and to perform separation. The output estimates are presented back to the user and the entire process is repeated in an interactive manner until a desired result is achieved. To test the proposed method, we developed and released an open-source software project embodying our approach, conducted user studies, and submitted separation results to a community-based signal separation evaluation campaign. For a variety of real-world tasks, we found that expert users of our proposed method can achieve state-of-the-art separation quality according to standard evaluation metrics, and inexperienced users can achieve good separation quality with minimal instruction. In addition, we show that our method can perform well with or without isolated training data and is relatively insensitive to model selection, thus improving upon past methods in a variety of ways. Overall, these results demonstrate that our proposed approach is a general and powerful separation method, and they motivate further work on interactive approaches to source separation.
To download the application, code, and audio/video demonstrations, please see http://ccrma.stanford.edu/~njb/thesis.
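The abstract describes separation built on NMF of a time-frequency representation. As a minimal, hypothetical sketch of that unconstrained NMF core (KL-divergence multiplicative updates plus Wiener-style soft masking, not the thesis's interactive, posterior-regularized algorithm, and with function names of our own choosing):

```python
import numpy as np

def nmf(V, n_components, n_iter=200, seed=0):
    """Factor a non-negative magnitude spectrogram V (freq x time) as W @ H
    using multiplicative updates for the KL divergence, the variant
    commonly used for audio spectrograms."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, n_components)) + 1e-3   # spectral templates
    H = rng.random((n_components, T)) + 1e-3   # temporal activations
    eps = 1e-12
    for _ in range(n_iter):
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1) + eps)
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
    return W, H

def separate(V, W, H, groups):
    """Split V into source estimates via soft (Wiener-style) masks,
    one per group of NMF components; the estimates sum back to V."""
    eps = 1e-12
    WH = W @ H + eps
    return [(W[:, g] @ H[g, :]) / WH * V for g in groups]
```

In the interactive setting described above, the user's painted annotations would additionally bias which components explain which time-frequency regions; here the component-to-source grouping is simply fixed by hand, e.g. `separate(V, W, H, [[0, 1], [2, 3]])` for a two-source split of a four-component model.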
Description
| Type of resource | text |
| --- | --- |
| Form | electronic; electronic resource; remote |
| Extent | 1 online resource. |
| Publication date | 2014 |
| Issuance | monographic |
| Language | English |
Creators/Contributors
| Associated with | Bryan, Nicholas James |
| --- | --- |
| Associated with | Stanford University, Department of Music. |
| Primary advisor | Wang, Ge |
| Thesis advisor | Abel, Jonathan (Jonathan Stuart) |
| Thesis advisor | Chafe, Chris |
| Thesis advisor | Smith, Julius O. (Julius Orion) |
Subjects
| Genre | Theses |
| --- | --- |
Bibliographic information
| Statement of responsibility | Nicholas J. Bryan. |
| --- | --- |
| Note | Submitted to the Department of Music. |
| Thesis | Thesis (Ph.D.)--Stanford University, 2014. |
| Location | electronic resource |
Access conditions
- Copyright
- © 2014 by Nicholas James Bryan
- License
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).