AIS (Attributable to Identified Sources) is an evaluation framework for assessing whether the output of natural language models contains only information about the external world that is verifiable in source documents, i.e., attributable to identified sources.
This repository contains annotations of model output on four source datasets (CNN/DM, QReCC, Wizard of Wikipedia, and ToTTo). We provide both the model output and the annotations. A detailed description of our evaluation methodology, including the annotation guidelines, annotation interfaces, and operational statistics, is included in our paper.
To obtain the model inputs and source documents, you will need to download the original datasets and map from our data back to the input examples. Please see the instructions for each data source in this repository.
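As a rough illustration, the mapping step amounts to joining our annotation records with the downloaded source-dataset examples on a shared example identifier. The sketch below is hypothetical: the field names (`example_id`, `document`, `ais_label`) are placeholders and not the repository's actual schema, so consult the per-dataset instructions for the real keys.

```python
# Hypothetical sketch of joining annotation records with source-dataset
# examples by a shared ID. Field names are placeholders, not the actual
# schema used in this repository.

def join_annotations(annotations, source_examples, key="example_id"):
    """Attach the original source document to each annotation record."""
    index = {ex[key]: ex for ex in source_examples}
    joined = []
    for ann in annotations:
        src = index.get(ann[key])
        if src is not None:
            # Merge the annotation with the recovered source document.
            joined.append({**ann, "source_document": src["document"]})
    return joined

# Toy records standing in for the downloaded files.
annotations = [{"example_id": "cnn-001", "ais_label": 1}]
source_examples = [{"example_id": "cnn-001", "document": "Original article text."}]

joined = join_annotations(annotations, source_examples)
```

The actual join key and record layout differ per dataset, which is why the per-dataset instructions should be followed rather than this sketch.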