Sampling from Stochastic Finite Automata with Applications to CTC Decoding
Abstract
Stochastic finite automata arise naturally in many language and speech
processing tasks. They include stochastic acceptors, which represent
certain probabilty distributions over random strings. We consider the
problem of efficient sampling: drawing random string variates from the
probability distribution represented by stochastic automata and
transformations of those. We show that path-sampling is effective and can
be efficient if the epsilon-graph of a finite automaton is acyclic. We
provide an algorithm that ensures this by conflating epsilon-cycles
within strongly connected components. Sampling is also effective in the
presence of non-injective transformations of strings. We illustrate this
in the context of decoding for Connectionist Temporal Classification
(CTC), where the predictive probabilities yield auxiliary sequences which
are transformed into shorter labeling strings. We can sample efficiently
from the tranformed labeling distribution and use this in two different
strategies for finding the most probable CTC labeling.
processing tasks. They include stochastic acceptors, which represent
certain probabilty distributions over random strings. We consider the
problem of efficient sampling: drawing random string variates from the
probability distribution represented by stochastic automata and
transformations of those. We show that path-sampling is effective and can
be efficient if the epsilon-graph of a finite automaton is acyclic. We
provide an algorithm that ensures this by conflating epsilon-cycles
within strongly connected components. Sampling is also effective in the
presence of non-injective transformations of strings. We illustrate this
in the context of decoding for Connectionist Temporal Classification
(CTC), where the predictive probabilities yield auxiliary sequences which
are transformed into shorter labeling strings. We can sample efficiently
from the tranformed labeling distribution and use this in two different
strategies for finding the most probable CTC labeling.