Phonetic Classification Using Controlled Random Walks Katrin Kirchhoff Andrei Alexandrescuy Department of Electrical Engineering Facebook University of Washington, Seattle, WA, USA
[email protected] [email protected] Abstract a given node. When building a graph representing a speech Recently, semi-supervised learning algorithms for phonetic database, the neighbourhood sizes and entropies often exhibit classifiers have been proposed that have obtained promising re- large variance, due to the uneven distribution of phonetic classes sults. Often, these algorithms attempt to satisfy learning criteria (e.g. vowels are more frequent than laterals). MAD offers a that are not inherent in the standard generative or discriminative unified framework for accommodating such effects without the training procedures for phonetic classifiers. Graph-based learn- need for downsampling or upsampling the training data. In the ers in particular utilize an objective function that not only max- past, MAD has been applied to class-instance extraction from imizes the classification accuracy on a labeled set but also the text [10] but it has to our knowledge not been applied to speech global smoothness of the predicted label assignment. In this pa- classification tasks before. In this paper we contrast MAD to per we investigate a novel graph-based semi-supervised learn- the most widely used graph-based learning algorithm, viz. label ing framework that implements a controlled random walk where propagation. different possible moves in the random walk are controlled by probabilities that are dependent on the properties of the graph 2. Graph-Based Learning itself. Experimental results on the TIMIT corpus are presented In graph-based learning, the training and test data are jointly that demonstrate the effectiveness of this procedure.