CodeGemma: Open Code Models Based on Gemma

Heri Zhao
Joshua Howland
Nam Nguyen
Siqi Zuo
Andrea Hu
Christopher A. Choquette-Choo
Jingyue Shen
Joe Kelley
Mateo Wirth
Paul Michel
Peter Choy
Pratik Joshi
Sarmad Hashmi
Shubham Agrawal
Zhitao Gong
Jane Fine
Ale Hartman
Bin Ni
Kathy Korevec
Kelly Schaefer
(2024)

Abstract

This paper introduces CodeGemma, a family of specialized open code models built on top of Gemma, capable of a variety of code and natural language generation tasks. We release three model checkpoints. The CodeGemma 7B pretrained (PT) and instruction-tuned (IT) variants have remarkably resilient natural language understanding and excel in mathematical reasoning, while matching the code capabilities of other open models. CodeGemma 2B is a state-of-the-art code completion model designed for fast code infilling and open-ended generation in latency-sensitive settings.