Adapting Language Models to Temporal Knowledge

Abstract

It is only a matter of time before facts become out of date: from the name of \abr{POTUS} to the basketball team LeBron James plays for.
This continuously limits the usefulness of previously collected datasets and language models (LMs) trained on them.
This problem is exacerbated when LMs are used in the closed-book question answering setting,
where the pretraining data must contain the facts the model is expected to memorize within its fixed parameters.
A common paradigm is to update or refresh the dataset every so often and retrain models on the new data: this is costly, but does it work?
In this paper, we introduce a diagnostic dataset for probing LMs for factual knowledge that changes over time.
Using it, we show that models trained only on the most recent slice of data perform
worse on questions about the past than models trained on data drawn uniformly across time,
while performing better on questions about the present and the future.
Moreover, we propose jointly modeling text with the time it was created
and show that this improves memorization of facts from the past,
as well as reasoning about the uncertainty surrounding future facts.
We also show that models trained with temporal context can be refreshed efficiently as
new data arrives, without retraining from scratch.
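To make the idea of "jointly modeling text with the time it was created" concrete, the following is a minimal sketch of one plausible realization: serializing each training example with an explicit temporal prefix so a language model can condition on when the text was written. The example sentences, years, and the `year: ... text: ...` format are illustrative assumptions, not the paper's released pipeline.

```python
# Minimal sketch (assumed format, not the paper's exact implementation):
# prepend a timestamp prefix to each training example so the LM can
# condition on the time the text was created.

from dataclasses import dataclass


@dataclass
class Example:
    text: str  # raw training sentence
    year: int  # year the text was created


def add_time_prefix(example: Example) -> str:
    """Serialize an example with an explicit temporal prefix."""
    return f"year: {example.year} text: {example.text}"


if __name__ == "__main__":
    # Illustrative examples of the same fact changing over time.
    examples = [
        Example("LeBron James plays for the Cleveland Cavaliers.", 2017),
        Example("LeBron James plays for the Los Angeles Lakers.", 2019),
    ]
    for ex in examples:
        print(add_time_prefix(ex))
    # The prefixed strings would then be fed to ordinary LM pretraining,
    # letting the model learn which facts hold at which times.
```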