Scaling Embedding Layers in Language Models

Da Yu
Yangsibo Huang
Pritish Kamath
Daogao Liu
Chiyuan Zhang
2025

Abstract