The Circa (meaning ‘approximately’) dataset aims to help machine learning systems interpret indirect answers to polar (yes/no) questions.
The dataset contains pairs of yes/no questions and indirect answers, together with annotations for the interpretation of the answer.
The data was collected in 10 different social conversational situations (e.g., food preferences of a friend).
Q: Want to get some dinner together?
A: I'd rather just go to bed. [No]
Q: Do you like spicy food?
A: I put hot sauce on everything. [Yes]
Q: Would you like to go see live music?
A: If it’s not too crowded. [Yes, upon a condition]
Currently, the Circa annotations focus on a few classes such as ‘yes’, ‘no’ and ‘yes, upon a condition’. The data can be used to build machine learning models that predict these classes for new question-answer pairs, and to evaluate methods for doing so.
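As a rough illustration of the classification setup described above, the following sketch stores the three example pairs as labeled records and scores a trivial majority-class baseline against them. The `QAPair` class, its field names, and the `majority_label` and `accuracy` helpers are hypothetical names chosen for this example; they do not reflect the dataset's actual file format or any official tooling.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One question-answer pair with its annotated interpretation.

    Field names are illustrative assumptions, not the dataset's real schema.
    """
    question: str
    answer: str
    label: str


# The three examples quoted in the text above.
EXAMPLES = [
    QAPair("Want to get some dinner together?",
           "I'd rather just go to bed.", "No"),
    QAPair("Do you like spicy food?",
           "I put hot sauce on everything.", "Yes"),
    QAPair("Would you like to go see live music?",
           "If it's not too crowded.", "Yes, upon a condition"),
]


def majority_label(pairs):
    """Return the most frequent label among the given pairs."""
    return Counter(p.label for p in pairs).most_common(1)[0][0]


def accuracy(predict, pairs):
    """Fraction of pairs whose predicted label matches the annotation."""
    correct = sum(predict(p.question, p.answer) == p.label for p in pairs)
    return correct / len(pairs)


# A baseline that ignores its input and always predicts one label.
baseline = majority_label(EXAMPLES)
score = accuracy(lambda question, answer: baseline, EXAMPLES)
print(f"Always predicting {baseline!r}: accuracy = {score:.2f}")
```

Any real model would replace the constant `lambda` with a function that maps a question and an answer to a predicted class; the `accuracy` helper then applies unchanged to held-out pairs.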