# Structured Outputs
Structured Outputs ensures the model’s response conforms to a JSON Schema you provide. This eliminates parsing failures and guarantees type-safe responses for extraction, classification, data transformation, or feeding results into downstream systems.
## Response Formats
Tensoras supports three response format modes, all set via the `response_format` parameter on the Chat Completions endpoint:
| Mode | `response_format` | Guarantee |
|---|---|---|
| Text (default) | `{ "type": "text" }` | No constraint; the model returns free-form text |
| JSON Object | `{ "type": "json_object" }` | Output is valid JSON, but no schema enforcement |
| JSON Schema | `{ "type": "json_schema", "json_schema": { ... } }` | Output conforms to the exact schema you provide |
## JSON Object Mode
JSON Object mode ensures the model always returns valid JSON. You still need to describe the desired structure in your prompt, but the output is guaranteed to parse without errors.
### Python

```python
import json

from tensoras import Tensoras

client = Tensoras(api_key="tns_your_key_here")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "system",
            "content": "You extract structured data. Always respond in JSON with keys: name, genre, year.",
        },
        {
            "role": "user",
            "content": "Tell me about the movie Inception.",
        },
    ],
    response_format={"type": "json_object"},
)

data = json.loads(response.choices[0].message.content)
print(data)
```

Output:

```json
{
  "name": "Inception",
  "genre": "Science Fiction",
  "year": 2010
}
```

### Node.js
```typescript
import Tensoras from "tensoras";

const client = new Tensoras({ apiKey: "tns_your_key_here" });

const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    {
      role: "system",
      content:
        "You extract structured data. Always respond in JSON with keys: name, genre, year.",
    },
    {
      role: "user",
      content: "Tell me about the movie Inception.",
    },
  ],
  response_format: { type: "json_object" },
});

const data = JSON.parse(response.choices[0].message.content!);
console.log(data);
```

### curl
```bash
curl https://api.tensoras.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tns_your_key_here" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [
      {"role": "system", "content": "You extract structured data. Always respond in JSON with keys: name, genre, year."},
      {"role": "user", "content": "Tell me about the movie Inception."}
    ],
    "response_format": {"type": "json_object"}
  }'
```

## JSON Schema Mode
JSON Schema mode goes further — you provide a JSON Schema definition, and the model’s output is guaranteed to conform to that schema. This uses constrained decoding to enforce the schema at the token level.
### Python
```python
import json

from tensoras import Tensoras

client = Tensoras(api_key="tns_your_key_here")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "system",
            "content": "You extract structured movie data from user queries.",
        },
        {
            "role": "user",
            "content": "Tell me about The Matrix and Inception.",
        },
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "movie_list",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "movies": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "genre": {"type": "string"},
                                "year": {"type": "integer"},
                                "director": {"type": "string"},
                            },
                            "required": ["name", "genre", "year", "director"],
                            "additionalProperties": False,
                        },
                    }
                },
                "required": ["movies"],
                "additionalProperties": False,
            },
        },
    },
)

data = json.loads(response.choices[0].message.content)
for movie in data["movies"]:
    print(f"{movie['name']} ({movie['year']}) - {movie['director']}")
```

The model's JSON output:

```json
{
  "movies": [
    {
      "name": "The Matrix",
      "genre": "Science Fiction",
      "year": 1999,
      "director": "The Wachowskis"
    },
    {
      "name": "Inception",
      "genre": "Science Fiction",
      "year": 2010,
      "director": "Christopher Nolan"
    }
  ]
}
```

You can also use the typed SDK models for a more structured approach:
```python
from tensoras import Tensoras
from tensoras.types import ResponseFormatJsonSchema, JsonSchemaConfig

client = Tensoras(api_key="tns_your_key_here")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "Extract structured movie data."},
        {"role": "user", "content": "Tell me about The Matrix and Inception."},
    ],
    response_format=ResponseFormatJsonSchema(
        json_schema=JsonSchemaConfig(
            name="movie_list",
            strict=True,
            schema={
                "type": "object",
                "properties": {
                    "movies": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "genre": {"type": "string"},
                                "year": {"type": "integer"},
                            },
                            "required": ["name", "genre", "year"],
                            "additionalProperties": False,
                        },
                    }
                },
                "required": ["movies"],
                "additionalProperties": False,
            },
        ),
    ),
)
```

### Node.js
```typescript
import Tensoras from "tensoras";
import type { ResponseFormatJsonSchema } from "tensoras";

const client = new Tensoras({ apiKey: "tns_your_key_here" });

const responseFormat: ResponseFormatJsonSchema = {
  type: "json_schema",
  json_schema: {
    name: "movie_list",
    strict: true,
    schema: {
      type: "object",
      properties: {
        movies: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name: { type: "string" },
              genre: { type: "string" },
              year: { type: "integer" },
              director: { type: "string" },
            },
            required: ["name", "genre", "year", "director"],
            additionalProperties: false,
          },
        },
      },
      required: ["movies"],
      additionalProperties: false,
    },
  },
};

const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    {
      role: "system",
      content: "You extract structured movie data from user queries.",
    },
    {
      role: "user",
      content: "Tell me about The Matrix and Inception.",
    },
  ],
  response_format: responseFormat,
});

const data = JSON.parse(response.choices[0].message.content!);
for (const movie of data.movies) {
  console.log(`${movie.name} (${movie.year}) - ${movie.director}`);
}
```

### curl
```bash
curl https://api.tensoras.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tns_your_key_here" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [
      {"role": "system", "content": "You extract structured movie data from user queries."},
      {"role": "user", "content": "Tell me about The Matrix and Inception."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "movie_list",
        "strict": true,
        "schema": {
          "type": "object",
          "properties": {
            "movies": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name": {"type": "string"},
                  "genre": {"type": "string"},
                  "year": {"type": "integer"},
                  "director": {"type": "string"}
                },
                "required": ["name", "genre", "year", "director"],
                "additionalProperties": false
              }
            }
          },
          "required": ["movies"],
          "additionalProperties": false
        }
      }
    }
  }'
```

## Strict Tool Definitions
In addition to `response_format`, you can set `strict: true` on individual function/tool definitions. When strict mode is enabled, the model's function call arguments are guaranteed to match the tool's `parameters` schema exactly.
```python
from tensoras import Tensoras

client = Tensoras(api_key="tns_your_key_here")

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a location.",
                "strict": True,
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string", "description": "City name"},
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location", "unit"],
                    "additionalProperties": False,
                },
            },
        }
    ],
)
```

With `strict: true`, the model always produces valid JSON arguments that match the `parameters` schema, so there is no need to handle malformed arguments.
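In practice that means the arguments string can go straight to `json.loads`. A minimal sketch, with a hard-coded string standing in for `response.choices[0].message.tool_calls[0].function.arguments` so it runs standalone:

```python
import json

# Stand-in for the arguments string of a strict-mode function call
# (in real code: response.choices[0].message.tool_calls[0].function.arguments).
arguments = '{"location": "Tokyo", "unit": "celsius"}'

# No try/except or fallback parsing needed: strict mode guarantees
# valid JSON that matches the parameters schema.
args = json.loads(arguments)
print(f"Fetching weather for {args['location']} in {args['unit']}")
```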
## Supported Schemas
JSON Schema mode supports a broad subset of JSON Schema Draft 2020-12:
| Feature | Supported |
|---|---|
| `type` (`string`, `number`, `integer`, `boolean`, `array`, `object`, `null`) | Yes |
| `properties` and `required` | Yes |
| `additionalProperties: false` | Yes (recommended) |
| `enum` | Yes |
| `const` | Yes |
| `items` (for arrays) | Yes |
| `anyOf` | Yes |
| `$ref` / `$defs` | Yes |
| `minLength` / `maxLength` | Yes |
| `minimum` / `maximum` | Yes |
| `pattern` | Limited |
| `minItems` / `maxItems` | Yes |
| `default` | Ignored (not used during generation) |
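As an illustration of several of these features together, here is a hypothetical schema (the support-ticket domain is invented for this example) combining `enum`, `anyOf`, and `$defs`:

```python
# A support-ticket schema exercising enum, anyOf, and $defs.
ticket_schema = {
    "type": "object",
    "properties": {
        # enum: constrain the value to a fixed set of strings.
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        # anyOf + $ref: either a user object (defined once in $defs) or null.
        "assignee": {"anyOf": [{"$ref": "#/$defs/user"}, {"type": "null"}]},
    },
    "required": ["priority", "assignee"],
    "additionalProperties": False,
    "$defs": {
        "user": {
            "type": "object",
            "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
            "required": ["id", "name"],
            "additionalProperties": False,
        }
    },
}
```

Note that the `$ref` points into `$defs` without creating a cycle, and `additionalProperties` is `false` at every object level.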
## Limitations

- **No recursive schemas.** A `$ref` that creates a cycle is not supported.
- **Maximum nesting depth.** Schemas deeper than 5 levels may increase latency.
- **Maximum schema size.** Very large schemas (hundreds of properties) may increase latency or fail validation.
- **`additionalProperties` should always be `false`.** Set `additionalProperties: false` at every object level for reliable constrained decoding, regardless of whether you use `strict: true`. This applies both to `json_schema` response formats and to strict tool definitions.
## Best Practices

- **Use strict mode for reliable parsing.** Setting `strict: true` in your `json_schema` config enables constrained decoding, guaranteeing schema conformance.
- **Set `additionalProperties: false`** at every level of your schema to prevent unexpected fields.
- **Define minimal schemas.** Fewer constraints and properties result in faster generation. Break complex extractions into multiple calls if needed.
- **Use `json_object` mode for simpler use cases.** If you just need valid JSON without strict schema enforcement, `json_object` mode has lower latency.
- **Include a clear system prompt.** Even with JSON Schema mode, a system prompt describing the task improves the quality of extracted values.
- **Works with streaming.** Both JSON Object mode and JSON Schema mode work alongside `stream: true`. Tokens are streamed as they are generated, and the final concatenated output is guaranteed to be valid.
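The usual streaming pattern is to accumulate content deltas and parse once the stream ends; intermediate buffers may not parse on their own. A minimal sketch, with a hard-coded chunk list standing in for the `chunk.choices[0].delta.content` values of a `stream=True` call:

```python
import json

# Stand-in for content deltas from a stream=True Chat Completions call
# (in real code: chunk.choices[0].delta.content for each streamed chunk).
deltas = ['{"name": "Incep', 'tion", "year"', ': 2010}']

# Accumulate partial tokens; a partial buffer like '{"name": "Incep'
# is not yet valid JSON, so do not parse until the stream finishes.
buffer = "".join(deltas)

# The final concatenated output is guaranteed to be valid JSON.
data = json.loads(buffer)
print(data["name"], data["year"])  # → Inception 2010
```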
## Migration from JSON Mode
If you are currently using `{"type": "json_object"}` and want stricter guarantees, upgrading to JSON Schema mode is straightforward:
Before (JSON mode):
```python
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "Return JSON with keys: name, year, genre."},
        {"role": "user", "content": "Tell me about Inception."},
    ],
    response_format={"type": "json_object"},
)
# Output might have unexpected keys or wrong types.
```

After (JSON Schema mode):
```python
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "Extract movie data."},
        {"role": "user", "content": "Tell me about Inception."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "movie",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "year": {"type": "integer"},
                    "genre": {"type": "string"},
                },
                "required": ["name", "year", "genre"],
                "additionalProperties": False,
            },
        },
    },
)
# Output is guaranteed to match the schema exactly.
```

Key differences when migrating:
- You no longer need to describe the JSON structure in the system prompt (though it can still help with value quality).
- The output is guaranteed to have exactly the fields you specify, with no missing or extra keys.
- All field types are enforced (e.g., `year` will always be an integer, not a string).
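Because the field set and types are now guaranteed, the parsed response can be unpacked directly into a typed structure. A sketch, with a hard-coded string standing in for `response.choices[0].message.content`:

```python
import json
from dataclasses import dataclass


@dataclass
class Movie:
    name: str
    year: int
    genre: str


# Stand-in for response.choices[0].message.content from the schema-mode call.
content = '{"name": "Inception", "year": 2010, "genre": "Science Fiction"}'

# Exact keys are guaranteed, so Movie(**...) cannot fail on missing
# or unexpected fields, and year arrives as an int, not a string.
movie = Movie(**json.loads(content))
print(movie.year + 1)  # → 2011; arithmetic works because year is an integer
```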
## When to Use Which
| Use case | Recommended mode |
|---|---|
| You need any valid JSON and handle flexible structures | JSON Object mode |
| You need output matching an exact schema (extraction, APIs, data pipelines) | JSON Schema mode |
| You want to stream structured output in real time | Either — both support streaming |
| You need strict function call arguments | Strict tool definitions |
## Related
- Tool Calling — another way to get structured data from models
- Streaming — stream structured output tokens in real time
- Chat Completions API — full endpoint reference