Efficient transformation invariant preconditioning for LoRA optimization

Jui-Nan Yen
Zhao Meng
Inderjit Dhillon
Cho-Jui Hsieh
ICLR (2025)

Abstract

Adaptive methods with non-diagonal preconditioning have demonstrated state-of-the-art performance on various tasks. However, the high memory cost of existing non-diagonal preconditioning methods makes them unsuitable for training Low-Rank Adaptation (LoRA) parameters. Additionally, these methods do not satisfy the criteria for efficient feature learning, which is important for LoRA optimization. In this work, we propose a non-diagonal preconditioning method to improve LoRA optimization. It has a low memory cost and achieves efficient feature learning through transformation invariance among equivalent LoRA weights. We provide theoretical justifications for our method. Our experiments on LLM LoRA fine-tuning demonstrate the effectiveness of our method.
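
As a brief illustration of the equivalence the abstract refers to (a minimal sketch with assumed dimensions, not code from the paper): a LoRA update writes the adapted weight as W0 + BA, and for any invertible r×r matrix S the pair (BS, S⁻¹A) defines the same adapted weight. A transformation-invariant preconditioner is one whose updates treat all such equivalent parameterizations identically.

```python
import numpy as np

# Minimal sketch: equivalence of LoRA factor pairs under an invertible
# transformation S. The adapted weight W0 + B @ A is unchanged when the
# factors are reparameterized as (B @ S, inv(S) @ A).
rng = np.random.default_rng(0)
d, k, r = 8, 6, 2

W0 = rng.standard_normal((d, k))   # frozen pretrained weight
B = rng.standard_normal((d, r))    # LoRA down-projection factor
A = rng.standard_normal((r, k))    # LoRA up-projection factor
S = rng.standard_normal((r, r)) + 3 * np.eye(r)  # well-conditioned invertible S

W_original = W0 + B @ A
W_transformed = W0 + (B @ S) @ (np.linalg.inv(S) @ A)

# Both parameterizations yield the same adapted weight.
print(np.allclose(W_original, W_transformed))  # True
```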