ToolGrad: Efficient Tool-use Dataset Generation with Textual "Gradients"

Zhongyi Zhou; Kohei Uehara; Haoyu Zhang; Jingtao Zhou; Lin Gu; Ruofei Du; Zheng Xu; Tatsuya Harada

ACL 2026

Abstract

Prior work synthesizes tool-use LLM datasets by first generating a user query and then annotating it with a complex tool-use search procedure such as depth-first search (DFS). This leads to inevitable annotation failures and low efficiency in data generation. We introduce ToolGrad, an agentic framework that inverts this paradigm. ToolGrad first constructs valid tool-use chains through an iterative process guided by textual "gradients", and then synthesizes corresponding user queries. This "answer-first" approach led to ToolGrad-500, a dataset generated with more complex tool use, lower cost, and almost 100% pass rate. Experiments show that models trained on ToolGrad-500 outperform both models trained on expensive baseline datasets and proprietary LLMs.
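
The "answer-first" ordering can be illustrated with a toy sketch. This is not the ToolGrad implementation: the tools, the `propose` rule, and the plain-text failure notes standing in for textual "gradients" are all hypothetical, and a real system would presumably use LLM agents and real APIs in their place. The sketch only shows the inversion described above: a chain of tool calls that actually execute is built first, and the user query is written afterwards to match it.

```python
from typing import Callable, Optional

# Hypothetical toy tools standing in for real APIs (assumption, not from the paper).
TOOLS: dict[str, Callable[[str], str]] = {
    "to_int": lambda s: str(int(s)),  # raises ValueError on non-numeric input
    "upper": lambda s: s.upper(),
    "reverse": lambda s: s[::-1],
}


def propose(steps: list[str], feedback: list[str]) -> Optional[str]:
    """Stand-in for an LLM proposer: pick the first tool that is neither
    already in the chain nor reported as failing in the textual feedback."""
    for name in TOOLS:
        failed = any(note.startswith(name + " ") for note in feedback)
        if name not in steps and not failed:
            return name
    return None


def build_chain(seed: str, max_steps: int = 3) -> tuple[list[str], str, list[str]]:
    """Iteratively grow a chain of tool calls, keeping only calls that execute."""
    state, steps, feedback = seed, [], []
    while len(steps) < max_steps:
        name = propose(steps, feedback)
        if name is None:
            break
        try:
            state = TOOLS[name](state)
            steps.append(name)
        except Exception as err:
            # Plain-text note that steers the next proposal (toy "gradient").
            feedback.append(f"{name} failed: {err}")
    return steps, state, feedback


def synthesize_query(seed: str, steps: list[str]) -> str:
    """Write the user query last, so it matches a chain known to be valid."""
    return f"Starting from '{seed}', apply " + ", then ".join(steps) + "."


steps, final_state, feedback = build_chain("abc")
query = synthesize_query("abc", steps)
print(steps, final_state, query)
```

Because the query is derived from a chain that has already run, every sample in this toy setup is valid by construction, which is the intuition behind the high pass rate claimed in the abstract.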
