arxiv:2603.05515

Enhancing Tool Calling in LLMs with the International Tool Calling Dataset

Published on Jan 21

Abstract

International Tool Calling (ITC) presents a large-scale, multilingual benchmark for evaluating LLMs' ability to use real APIs across diverse geographical and linguistic contexts, revealing performance disparities between open- and closed-source models and demonstrating improved cross-lingual capabilities through fine-tuning.

AI-generated summary

Tool calling allows large language models (LLMs) to interact with external systems like APIs, enabling applications in customer support, data analysis, and dynamic content generation. While recent benchmarks have advanced tool-use research, they suffer from key limitations, including reliance on simulated or restricted APIs, limited reproducibility, and a lack of cultural and geographic diversity. To address these gaps, we introduce International Tool Calling (ITC), a large-scale, multilingual benchmark designed for realistic, globally distributed tool-calling scenarios. ITC includes 3,571 real APIs and 17,540 tool-calling tasks across 20 categories and 40 countries. Experiments reveal substantial performance gaps between open- and closed-source LLMs, while fine-tuning on ITC yields significant improvements, particularly for non-English queries, enhancing cross-lingual generalization, reasoning consistency, and robustness to out-of-domain tools. ITC provides a valuable benchmark for advancing LLM robustness and performance in complex, multi-tool, and international scenarios. Dataset: https://anonymous.4open.science/r/International-Tool-Calling-ITC-dataset-FAF4/.
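
For readers unfamiliar with the term, the following is a minimal sketch of what a tool call and a simple correctness check might look like. It is not the ITC data format or the paper's evaluation method: the `get_weather` tool, the JSON layout with `name` and `arguments` keys, and the exact-match criterion are all assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict


# Hypothetical tool: a stand-in for a real API, returning canned data.
def get_weather(city: str, unit: str = "celsius") -> Dict[str, Any]:
    return {"city": city, "unit": unit, "temperature": 21}


# Hypothetical registry mapping tool names to callables.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {"get_weather": get_weather}


def dispatch(tool_call_json: str) -> Dict[str, Any]:
    """Parse a model-emitted tool call and run the named tool."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


def exact_match(predicted: str, reference: str) -> bool:
    """Assumed metric: same tool name and identical arguments (key order ignored)."""
    p, r = json.loads(predicted), json.loads(reference)
    return p["name"] == r["name"] and p["arguments"] == r["arguments"]


# A non-English query such as "Quel temps fait-il à Nairobi ?" might yield:
model_output = '{"name": "get_weather", "arguments": {"city": "Nairobi"}}'
reference = '{"arguments": {"city": "Nairobi"}, "name": "get_weather"}'

result = dispatch(model_output)
is_correct = exact_match(model_output, reference)
```

Comparing parsed JSON rather than raw strings keeps the check insensitive to key order and whitespace; a real benchmark may use a looser or stricter criterion.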


