
Towards a Human-like Open-Domain Chatbot

Apoorv Kulshreshtha
Daniel De Freitas Adiwardana
David Richard So
Gaurav Nemade
Jamie Hall
Romal Thoppilan
Yifeng Lu
Zi Yang
arXiv (2020)


We present Meena, a multi-turn, end-to-end open-domain chatbot trained on data mined and filtered from public social media. The model is trained to minimize perplexity of the next token, and we have found evidence that this automatic metric correlates with human judgement of quality. We propose a human evaluation metric called Sensibleness and Specificity Average (SSA), which captures key elements of good conversation. Extensive experiments show a strong correlation between perplexity and SSA. The fact that Meena scores high on SSA (72% on multi-turn evaluation) suggests that a human-like chatbot with an SSA score of 82% is potentially within reach if we can optimize perplexity further.
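
The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes that perplexity is the exponential of the mean negative log-likelihood per token (the standard definition) and that SSA is the simple average of the fraction of responses labeled sensible and the fraction labeled specific, as the metric's name suggests. The function names, the input format, and the rule that a response counts as specific only if it is also sensible are assumptions made for illustration, not details taken from the abstract.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Computed as exp(-(1/N) * sum(log p_i)); lower is better.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def ssa(labels):
    """Sensibleness and Specificity Average over labeled responses.

    `labels` is a list of (sensible, specific) boolean pairs, one per
    model response (hypothetical input format). A response counts as
    specific only if it is also sensible (an assumption of this sketch).
    Returns a value in [0, 1].
    """
    if not labels:
        raise ValueError("need at least one labeled response")
    n = len(labels)
    sensibleness = sum(1 for sensible, _ in labels if sensible) / n
    specificity = sum(1 for sensible, specific in labels if sensible and specific) / n
    return (sensibleness + specificity) / 2


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every observed token.
    print(perplexity([math.log(0.25)] * 4))
    # Four responses: three sensible, two of those also specific.
    print(ssa([(True, True), (True, False), (False, False), (True, True)]))
```

In this toy example, three of four responses are sensible (0.75) and two of four are both sensible and specific (0.5), so the SSA is their mean. Relating the two metrics, as the abstract describes, would mean computing both for several models and measuring how closely they track each other.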