Item language model

Anushya Subbiah
Li Yang
Vikram Aggarwal
2025

Abstract

Embeddings are extensively used in many domains to represent information about domain entities in a compressed manner. In recommendation systems, these embeddings are trained to extract meaningful information about an item/user from collaborative filtering data consisting of users' ratings or implicit feedback on items. These behavioral embeddings are usually not trained on data from the language domain, but they encode very useful behavioral information which cannot be described using language. This collaborative data and the behavioral entities (users/items) are not well represented in large language model (LLM) pretraining data, as they are not textual and are specific to the recommendation system/product. Bridging this gap between behavioral understanding and language understanding can enable new item-and-language interleaved tasks. In our work we show how we can efficiently adapt rich behavioral embeddings and use them as a behavioral input representation in a pre-trained LLM. To achieve this we adapt the Querying Transformer technique with a new item contrastive loss and show improved item-text joint understanding in PALM2. Finally, we also demonstrate improved capabilities in the recommendation domain compared to using the behavioral embeddings directly as input to PALM2.
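The item contrastive loss mentioned above can be illustrated with a minimal InfoNCE-style sketch that aligns Querying Transformer query outputs with behavioral item embeddings, so that matching (query, item) pairs in a batch score higher than mismatched ones. The function name, array shapes, and temperature value here are illustrative assumptions, not the paper's actual implementation:

```python
# Hypothetical sketch of an item contrastive (InfoNCE-style) loss.
# All names, shapes, and hyperparameters are illustrative assumptions.
import numpy as np

def item_contrastive_loss(query_emb, item_emb, temperature=0.07):
    """InfoNCE loss over a batch of paired (query, item) embeddings.

    query_emb, item_emb: (batch, dim) arrays; row i of each forms a
    positive pair, and all other rows serve as in-batch negatives.
    """
    # L2-normalize so dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    v = item_emb / np.linalg.norm(item_emb, axis=1, keepdims=True)
    logits = q @ v.T / temperature                 # (batch, batch) similarities
    # Softmax cross-entropy with the diagonal (matching pair) as the target.
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss pulls each query output toward its paired item embedding and pushes it away from the other items in the batch, which is the standard way a contrastive objective grounds one modality's representations in another's.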