# Structured Outputs
Structured Outputs ensures the model’s response conforms to a JSON Schema you provide. This eliminates parsing failures and guarantees type-safe responses for extraction, classification, data transformation, or feeding results into downstream systems.
## Response Formats
Tensoras supports three response format modes, all set via the `response_format` parameter on the Chat Completions endpoint:
| Mode | `response_format` | Guarantee |
|---|---|---|
| Text (default) | `{ "type": "text" }` | No constraint; the model returns free-form text |
| JSON Object | `{ "type": "json_object" }` | Output is valid JSON, but no schema enforcement |
| JSON Schema | `{ "type": "json_schema", "json_schema": { ... } }` | Output conforms to the exact schema you provide |
## JSON Object Mode
JSON Object mode ensures the model always returns valid JSON. You still need to describe the desired structure in your prompt, but the output is guaranteed to parse without errors.
### Python

```python
import json

from tensoras import Tensoras

client = Tensoras(api_key="tns_your_key_here")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "system",
            "content": "You extract structured data. Always respond in JSON with keys: name, genre, year.",
        },
        {
            "role": "user",
            "content": "Tell me about the movie Inception.",
        },
    ],
    response_format={"type": "json_object"},
)

data = json.loads(response.choices[0].message.content)
print(data)
```

Output:

```json
{
  "name": "Inception",
  "genre": "Science Fiction",
  "year": 2010
}
```

### Node.js
```typescript
import Tensoras from "tensoras";

const client = new Tensoras({ apiKey: "tns_your_key_here" });

const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    {
      role: "system",
      content:
        "You extract structured data. Always respond in JSON with keys: name, genre, year.",
    },
    {
      role: "user",
      content: "Tell me about the movie Inception.",
    },
  ],
  response_format: { type: "json_object" },
});

const data = JSON.parse(response.choices[0].message.content!);
console.log(data);
```

### curl
```bash
curl https://api.tensoras.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tns_your_key_here" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [
      {"role": "system", "content": "You extract structured data. Always respond in JSON with keys: name, genre, year."},
      {"role": "user", "content": "Tell me about the movie Inception."}
    ],
    "response_format": {"type": "json_object"}
  }'
```

## JSON Schema Mode
JSON Schema mode goes further — you provide a JSON Schema definition, and the model’s output is guaranteed to conform to that schema. This uses constrained decoding to enforce the schema at the token level.
### Python
```python
import json

from tensoras import Tensoras

client = Tensoras(api_key="tns_your_key_here")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "system",
            "content": "You extract structured movie data from user queries.",
        },
        {
            "role": "user",
            "content": "Tell me about The Matrix and Inception.",
        },
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "movie_list",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "movies": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "genre": {"type": "string"},
                                "year": {"type": "integer"},
                                "director": {"type": "string"},
                            },
                            "required": ["name", "genre", "year", "director"],
                            "additionalProperties": False,
                        },
                    }
                },
                "required": ["movies"],
                "additionalProperties": False,
            },
        },
    },
)

data = json.loads(response.choices[0].message.content)
for movie in data["movies"]:
    print(f"{movie['name']} ({movie['year']}) - {movie['director']}")
```

The model's JSON output:

```json
{
  "movies": [
    {
      "name": "The Matrix",
      "genre": "Science Fiction",
      "year": 1999,
      "director": "The Wachowskis"
    },
    {
      "name": "Inception",
      "genre": "Science Fiction",
      "year": 2010,
      "director": "Christopher Nolan"
    }
  ]
}
```

You can also use the typed SDK models for a more structured approach:
```python
from tensoras import Tensoras
from tensoras.types import ResponseFormatJsonSchema, JsonSchemaConfig

client = Tensoras(api_key="tns_your_key_here")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "Extract structured movie data."},
        {"role": "user", "content": "Tell me about The Matrix and Inception."},
    ],
    response_format=ResponseFormatJsonSchema(
        json_schema=JsonSchemaConfig(
            name="movie_list",
            strict=True,
            schema={
                "type": "object",
                "properties": {
                    "movies": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "genre": {"type": "string"},
                                "year": {"type": "integer"},
                            },
                            "required": ["name", "genre", "year"],
                            "additionalProperties": False,
                        },
                    }
                },
                "required": ["movies"],
                "additionalProperties": False,
            },
        ),
    ),
)
```

### Node.js
```typescript
import Tensoras from "tensoras";
import type { ResponseFormatJsonSchema } from "tensoras";

const client = new Tensoras({ apiKey: "tns_your_key_here" });

const responseFormat: ResponseFormatJsonSchema = {
  type: "json_schema",
  json_schema: {
    name: "movie_list",
    strict: true,
    schema: {
      type: "object",
      properties: {
        movies: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name: { type: "string" },
              genre: { type: "string" },
              year: { type: "integer" },
              director: { type: "string" },
            },
            required: ["name", "genre", "year", "director"],
            additionalProperties: false,
          },
        },
      },
      required: ["movies"],
      additionalProperties: false,
    },
  },
};

const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    {
      role: "system",
      content: "You extract structured movie data from user queries.",
    },
    {
      role: "user",
      content: "Tell me about The Matrix and Inception.",
    },
  ],
  response_format: responseFormat,
});

const data = JSON.parse(response.choices[0].message.content!);
for (const movie of data.movies) {
  console.log(`${movie.name} (${movie.year}) - ${movie.director}`);
}
```

### curl
```bash
curl https://api.tensoras.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tns_your_key_here" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [
      {"role": "system", "content": "You extract structured movie data from user queries."},
      {"role": "user", "content": "Tell me about The Matrix and Inception."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "movie_list",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "movies": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name": {"type": "string"},
                  "genre": {"type": "string"},
                  "year": {"type": "integer"},
                  "director": {"type": "string"}
                },
                "required": ["name", "genre", "year", "director"],
                "additionalProperties": false
              }
            }
          },
          "required": ["movies"],
          "additionalProperties": false
        }
      }
    }
  }'
```

## Strict Tool Definitions
In addition to `response_format`, you can set `strict: true` on individual function/tool definitions. When strict mode is enabled, the model's function call arguments are guaranteed to match the tool's `parameters` schema exactly.
```python
from tensoras import Tensoras

client = Tensoras(api_key="tns_your_key_here")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a location.",
                "strict": True,
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string", "description": "City name"},
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location", "unit"],
                    "additionalProperties": False,
                },
            },
        }
    ],
)
```

With `strict: true`, the model always produces valid JSON arguments that match the `parameters` schema, so there is no need to handle malformed arguments.
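In practice that means the arguments string can go straight to `json.loads`. A minimal sketch, with a hard-coded string standing in for `response.choices[0].message.tool_calls[0].function.arguments` so it runs standalone:

```python
import json

# Stand-in for the arguments string of a strict-mode function call
# (in real code: response.choices[0].message.tool_calls[0].function.arguments).
arguments = '{"location": "Tokyo", "unit": "celsius"}'

# No try/except or fallback parsing needed: strict mode guarantees
# valid JSON that matches the parameters schema.
args = json.loads(arguments)
print(f"Fetching weather for {args['location']} in {args['unit']}")
```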
## Supported Schemas
JSON Schema mode supports a broad subset of JSON Schema Draft 2020-12:
| Feature | Supported |
|---|---|
| `type` (`string`, `number`, `integer`, `boolean`, `array`, `object`, `null`) | Yes |
| `properties` and `required` | Yes |
| `additionalProperties: false` | Yes (recommended) |
| `enum` | Yes |
| `const` | Yes |
| `items` (for arrays) | Yes |
| `anyOf` | Yes |
| `$ref` / `$defs` | Yes |
| `minLength` / `maxLength` | Yes |
| `minimum` / `maximum` | Yes |
| `pattern` | Limited |
| `minItems` / `maxItems` | Yes |
| `default` | Ignored (not used during generation) |
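As an illustration of several of these features together, here is a hypothetical schema (the support-ticket domain is invented for this example) combining `enum`, `anyOf`, and `$defs`:

```python
# A support-ticket schema exercising enum, anyOf, and $defs.
ticket_schema = {
    "type": "object",
    "properties": {
        # enum: constrain the value to a fixed set of strings.
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        # anyOf + $ref: either a user object (defined once in $defs) or null.
        "assignee": {"anyOf": [{"$ref": "#/$defs/user"}, {"type": "null"}]},
    },
    "required": ["priority", "assignee"],
    "additionalProperties": False,
    "$defs": {
        "user": {
            "type": "object",
            "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
            "required": ["id", "name"],
            "additionalProperties": False,
        }
    },
}
```

Note that the `$ref` points into `$defs` without creating a cycle, and `additionalProperties` is `false` at every object level.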
## Limitations

- **No recursive schemas.** A `$ref` that creates a cycle is not supported.
- **Maximum nesting depth.** Schemas deeper than 5 levels may increase latency.
- **Maximum schema size.** Very large schemas (hundreds of properties) may increase latency or fail validation.
- **`additionalProperties` should always be `false`.** Set `additionalProperties: false` at every object level for reliable constrained decoding, regardless of whether you use `strict: true`. This applies both to `json_schema` response formats and to strict tool definitions.
## Best Practices

- **Use strict mode for reliable parsing.** Setting `strict: true` in your `json_schema` config enables constrained decoding, guaranteeing schema conformance.
- **Set `additionalProperties: false`** at every level of your schema to prevent unexpected fields.
- **Define minimal schemas.** Fewer constraints and properties result in faster generation. Break complex extractions into multiple calls if needed.
- **Use `json_object` mode for simpler use cases.** If you just need valid JSON without strict schema enforcement, `json_object` mode has lower latency.
- **Include a clear system prompt.** Even with JSON Schema mode, a system prompt describing the task improves the quality of extracted values.
- **Works with streaming.** Both JSON Object mode and JSON Schema mode work alongside `stream: true`. Tokens are streamed as they are generated, and the final concatenated output is guaranteed to be valid.
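The usual streaming pattern is to accumulate content deltas and parse once the stream ends; intermediate buffers may not parse on their own. A minimal sketch, with a hard-coded chunk list standing in for the `chunk.choices[0].delta.content` values of a `stream=True` call:

```python
import json

# Stand-in for content deltas from a stream=True Chat Completions call
# (in real code: chunk.choices[0].delta.content for each streamed chunk).
deltas = ['{"name": "Incep', 'tion", "year"', ': 2010}']

# Accumulate partial tokens; a partial buffer like '{"name": "Incep'
# is not yet valid JSON, so do not parse until the stream finishes.
buffer = "".join(deltas)

# The final concatenated output is guaranteed to be valid JSON.
data = json.loads(buffer)
print(data["name"], data["year"])  # → Inception 2010
```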
## Migration from JSON Mode
If you are currently using `{"type": "json_object"}` and want stricter guarantees, upgrading to JSON Schema mode is straightforward:
Before (JSON mode):
```python
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "Return JSON with keys: name, year, genre."},
        {"role": "user", "content": "Tell me about Inception."},
    ],
    response_format={"type": "json_object"},
)
# Output might have unexpected keys or wrong types.
```

After (JSON Schema mode):
```python
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "Extract movie data."},
        {"role": "user", "content": "Tell me about Inception."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "movie",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "year": {"type": "integer"},
                    "genre": {"type": "string"},
                },
                "required": ["name", "year", "genre"],
                "additionalProperties": False,
            },
        },
    },
)
# Output is guaranteed to match the schema exactly.
```

Key differences when migrating:
- You no longer need to describe the JSON structure in the system prompt (though it can still help with value quality).
- The output is guaranteed to have exactly the fields you specify, with no missing or extra keys.
- All field types are enforced (e.g., `year` will always be an integer, not a string).
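Because the field set and types are now guaranteed, the parsed response can be unpacked directly into a typed structure. A sketch, with a hard-coded string standing in for `response.choices[0].message.content`:

```python
import json
from dataclasses import dataclass


@dataclass
class Movie:
    name: str
    year: int
    genre: str


# Stand-in for response.choices[0].message.content from the schema-mode call.
content = '{"name": "Inception", "year": 2010, "genre": "Science Fiction"}'

# Exact keys are guaranteed, so Movie(**...) cannot fail on missing
# or unexpected fields, and year arrives as an int, not a string.
movie = Movie(**json.loads(content))
print(movie.year + 1)  # → 2011; arithmetic works because year is an integer
```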
## When to Use Which
| Use case | Recommended mode |
|---|---|
| You need any valid JSON and handle flexible structures | JSON Object mode |
| You need output matching an exact schema (extraction, APIs, data pipelines) | JSON Schema mode |
| You want to stream structured output in real time | Either — both support streaming |
| You need strict function call arguments | Strict tool definitions |
## Related
- Tool Calling — another way to get structured data from models
- Streaming — stream structured output tokens in real time
- Chat Completions API — full endpoint reference