Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data

Amr Ahmed
Alexander Smola
WSDM, ACM (2017)

Recurrent neural networks, such as long short-term memory (LSTM) networks, are powerful tools for modeling sequential data; however, they lack interpretability and require a large number of parameters. On the other hand, topic models, such as Latent Dirichlet Allocation (LDA), are powerful tools for uncovering the hidden structure in a document collection; however, they lack the strong predictive power of deep models. In this paper we bridge the gap between such models and propose Latent LSTM Allocation (LLA). In LLA each document is modeled as a sequence of words, and the model jointly groups words into topics and learns the temporal dynamics over the sequence. Our model is interpretable, concise, and can capture intricate dynamics. We give an efficient MCMC-EM inference algorithm for our model that scales to millions of documents. Our experimental evaluations show that the proposed model compares favorably with several state-of-the-art baselines.
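
To make the idea of jointly grouping words into topics and modeling dynamics over the sequence more concrete, the following is a minimal toy sketch of one possible generative reading. It is not the authors' implementation. It assumes that the sequence dynamics act on topic assignments rather than directly on words, and it replaces the LSTM with a simple decayed count of past topics. All names (`generate_document`, `topic_word`, `transition`, `decay`) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Turn arbitrary real scores into a probability vector."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(length, topic_word, transition, rng, decay=0.5):
    """Toy generative sketch (assumption, not the paper's exact model).

    topic_word[k]     word distribution of topic k
    transition[k][j]  score contribution to next topic k from past topic j
    """
    num_topics = len(topic_word)
    # Stand-in for a recurrent hidden state: a decayed count of past topics.
    # A real LSTM would replace this update; it is simplified here.
    state = [0.0] * num_topics
    topics, words = [], []
    for _ in range(length):
        # The next topic depends on the state summarizing earlier topics.
        scores = [
            sum(transition[k][j] * state[j] for j in range(num_topics))
            for k in range(num_topics)
        ]
        z = sample_index(softmax(scores), rng)
        # The word is drawn from the chosen topic's word distribution.
        w = sample_index(topic_word[z], rng)
        topics.append(z)
        words.append(w)
        # Update the state with the newly sampled topic.
        state = [decay * s for s in state]
        state[z] += 1.0
    return topics, words


if __name__ == "__main__":
    rng = random.Random(0)
    # Two topics over a four-word vocabulary (made-up numbers).
    topic_word = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
    # Positive diagonal: topics tend to persist over time.
    transition = [[2.0, -2.0], [-2.0, 2.0]]
    print(generate_document(10, topic_word, transition, rng))
```

In this sketch the per-topic word distributions carry the interpretable, LDA-like part, while the recurrent state carries the sequential part. Inference, such as the MCMC-EM procedure mentioned in the abstract, is not shown.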