Tool Calling
Tool calling (also known as function calling) lets the model invoke external functions you define. You describe available tools in your request, and the model can choose to call one or more of them instead of — or in addition to — generating a text response. Your code executes the function, returns the result, and the model uses it to produce a final answer.
Supported Models
Tool calling is supported on all chat completion models:
- llama-3.3-70b
- llama-3.1-8b
- qwen-3-32b
- mistral-7b-instruct
- deepseek-r1-distill-70b
- codestral-latest
Defining Tools
A tool definition includes a function object with a name, description, and a JSON Schema for parameters:
```json
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get the current weather for a given city.",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {
          "type": "string",
          "description": "The city name, e.g. 'San Francisco'"
        },
        "units": {
          "type": "string",
          "enum": ["celsius", "fahrenheit"],
          "description": "Temperature unit"
        }
      },
      "required": ["city"]
    }
  }
}
```

Tips for tool definitions:
- Write clear, specific descriptions — the model uses them to decide when and how to call each tool.
- Mark only truly required parameters as `required`.
- Use `enum` to constrain values where possible.
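Because the model generates arguments as free-form JSON, it is worth validating them against your schema before invoking the function. A minimal sketch using only the standard library (the `PARAMS_SCHEMA` dict and `validate_args` helper below are illustrative, not part of the SDK; they mirror the `get_weather` definition above and check only required keys and enum values, not full JSON Schema):

```python
import json

# Mirrors the "parameters" schema from the get_weather definition above
PARAMS_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

def validate_args(raw_arguments: str, schema: dict) -> dict:
    """Parse the model's JSON arguments and check required keys and enum values."""
    args = json.loads(raw_arguments)
    for key in schema.get("required", []):
        if key not in args:
            raise ValueError(f"missing required argument: {key}")
    for key, value in args.items():
        prop = schema["properties"].get(key)
        if prop is None:
            raise ValueError(f"unexpected argument: {key}")
        if "enum" in prop and value not in prop["enum"]:
            raise ValueError(f"invalid value for {key}: {value!r}")
    return args

print(validate_args('{"city": "San Francisco", "units": "celsius"}', PARAMS_SCHEMA))
```

A full JSON Schema validator (such as the third-party `jsonschema` package) covers more cases, but even a shallow check like this catches most malformed tool calls before they reach your function.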
Full Python Example
```python
import json

from tensoras import Tensoras

client = Tensoras(api_key="tns_your_key_here")

# 1. Define your tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a given city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The city name, e.g. 'San Francisco'",
                    },
                    "units": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit",
                    },
                },
                "required": ["city"],
            },
        },
    }
]

# 2. Your actual function implementation
def get_weather(city: str, units: str = "celsius") -> dict:
    # In a real app, call a weather API here
    return {"city": city, "temperature": 18, "units": units, "condition": "partly cloudy"}

# 3. Send the initial request with tools
messages = [
    {"role": "user", "content": "What's the weather like in San Francisco?"}
]
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=messages,
    tools=tools,
)
message = response.choices[0].message

# 4. Check if the model wants to call a tool
if message.tool_calls:
    # Append the assistant message (with tool_calls) to the conversation
    messages.append(message)
    for tool_call in message.tool_calls:
        # Parse arguments and call the function
        args = json.loads(tool_call.function.arguments)
        result = get_weather(**args)
        # Append the tool result
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })
    # 5. Send the conversation back so the model can produce a final answer
    final_response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=messages,
        tools=tools,
    )
    print(final_response.choices[0].message.content)
else:
    print(message.content)
```

Output:

```
The weather in San Francisco is currently 18°C and partly cloudy.
```

Full Node.js Example
```javascript
import Tensoras from "tensoras";

const client = new Tensoras({ apiKey: "tns_your_key_here" });

// 1. Define your tools
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get the current weather for a given city.",
      parameters: {
        type: "object",
        properties: {
          city: {
            type: "string",
            description: "The city name, e.g. 'San Francisco'",
          },
          units: {
            type: "string",
            enum: ["celsius", "fahrenheit"],
            description: "Temperature unit",
          },
        },
        required: ["city"],
      },
    },
  },
];

// 2. Your actual function implementation
function getWeather(city, units = "celsius") {
  return { city, temperature: 18, units, condition: "partly cloudy" };
}

// 3. Send the initial request with tools
const messages = [
  { role: "user", content: "What's the weather like in San Francisco?" },
];
const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages,
  tools,
});
const message = response.choices[0].message;

// 4. Check if the model wants to call a tool
if (message.tool_calls) {
  messages.push(message);
  for (const toolCall of message.tool_calls) {
    const args = JSON.parse(toolCall.function.arguments);
    const result = getWeather(args.city, args.units);
    messages.push({
      role: "tool",
      tool_call_id: toolCall.id,
      content: JSON.stringify(result),
    });
  }
  // 5. Send the conversation back for a final answer
  const finalResponse = await client.chat.completions.create({
    model: "llama-3.3-70b",
    messages,
    tools,
  });
  console.log(finalResponse.choices[0].message.content);
} else {
  console.log(message.content);
}
```

How the Conversation Flows
A tool-calling conversation typically follows this pattern:
- You send messages + tool definitions.
- Model responds with `tool_calls` (instead of or alongside text content).
- You execute each function and append `role: "tool"` messages with the results.
- Model uses the tool results to produce a final text response.
The model may call multiple tools in a single response. Each tool call has a unique id that you must reference when returning the result via tool_call_id.
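Since a single response can contain several tool calls, a dispatch table mapping tool names to local implementations keeps the execution loop simple. A sketch of that pattern (the `TOOL_REGISTRY` and `run_tool_call` names are illustrative, not part of the SDK; dicts stand in for the SDK's tool call objects, which expose the same fields as attributes):

```python
import json

def get_weather(city, units="celsius"):
    return {"city": city, "temperature": 18, "units": units}

def get_time(city):
    return {"city": city, "time": "14:30"}

# Map tool names (as declared in your tool definitions) to implementations
TOOL_REGISTRY = {
    "get_weather": get_weather,
    "get_time": get_time,
}

def run_tool_call(tool_call):
    """Execute one tool call and return the role:"tool" message to append."""
    name = tool_call["function"]["name"]
    fn = TOOL_REGISTRY.get(name)
    if fn is None:
        # Return the error as the tool result so the model can recover
        result = {"error": f"unknown tool: {name}"}
    else:
        args = json.loads(tool_call["function"]["arguments"])
        result = fn(**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }
```

Returning errors as tool results, rather than raising, lets the model see what went wrong and retry or apologize instead of crashing your loop.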
Controlling Tool Use
You can guide the model’s tool-calling behavior with the tool_choice parameter:
| Value | Behavior |
|---|---|
| `"auto"` (default) | Model decides whether to call a tool or respond with text |
| `"none"` | Model will not call any tools |
| `"required"` | Model must call at least one tool |
| `{"type": "function", "function": {"name": "get_weather"}}` | Model must call the specified tool |
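The values above can be illustrated with a small helper that assembles the request keyword arguments, including a forced call to a specific tool (the `build_request` helper is illustrative, not part of the SDK; you would pass the resulting kwargs to `client.chat.completions.create`):

```python
def build_request(messages, tools, tool_choice="auto"):
    """Assemble kwargs for a chat completion request with a tool_choice."""
    allowed = ("auto", "none", "required")
    if isinstance(tool_choice, str) and tool_choice not in allowed:
        raise ValueError(f"tool_choice must be one of {allowed} or a function spec")
    return {
        "model": "llama-3.3-70b",
        "messages": messages,
        "tools": tools,
        "tool_choice": tool_choice,
    }

# Force the model to call get_weather regardless of the prompt:
kwargs = build_request(
    [{"role": "user", "content": "Hello!"}],
    tools=[],
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
```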
Streaming with Tool Calls
Tool calls work with streaming. When streaming, tool call arguments arrive incrementally in delta.tool_calls. See Streaming for more details.
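Because the arguments string arrives in fragments, you typically buffer the pieces by tool call index and parse the JSON only once the stream ends. A sketch of the accumulation logic (the `accumulate_tool_calls` helper is illustrative, and the chunk dicts below are stand-ins for the SDK's `delta.tool_calls` objects):

```python
import json

def accumulate_tool_calls(deltas):
    """Merge streamed delta.tool_calls fragments into complete tool calls."""
    calls = {}
    for delta in deltas:
        for tc in delta:
            slot = calls.setdefault(
                tc["index"], {"id": None, "name": None, "arguments": ""}
            )
            if tc.get("id"):
                slot["id"] = tc["id"]
            fn = tc.get("function", {})
            if fn.get("name"):
                slot["name"] = fn["name"]
            # Argument fragments concatenate in arrival order
            slot["arguments"] += fn.get("arguments", "")
    return [calls[i] for i in sorted(calls)]

# Fragments as they might arrive over the stream:
chunks = [
    [{"index": 0, "id": "call_1", "function": {"name": "get_weather", "arguments": ""}}],
    [{"index": 0, "function": {"arguments": '{"city": "San'}}],
    [{"index": 0, "function": {"arguments": ' Francisco"}'}}],
]
calls = accumulate_tool_calls(chunks)
print(json.loads(calls[0]["arguments"]))  # parse only once arguments are complete
```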
Server-Side Tool Calling with the Responses API
If you want the server to execute tool calls automatically (e.g., searching your Knowledge Bases), use the Responses API instead. The server runs a multi-turn agentic loop where the model issues tool calls, the server executes built-in tools like file_search, and the model produces a final answer — all in a single request.
```python
response = client.responses.create(
    model="llama-3.3-70b",
    input="What does our docs say about SSO configuration?",
    tools=[{
        "type": "file_search",
        "file_search": {
            "knowledge_base_ids": ["kb_abc123"],
        },
    }],
)
```

See the Responses API reference for details.
Related
- Responses API — server-side agentic tool-calling loop
- Streaming — real-time streaming of tool call arguments
- Structured Outputs — enforce JSON schemas on responses
- Chat Completions API — full endpoint reference