
OpenAI-Compatible Usage

Tensoras.ai is drop-in compatible with the OpenAI SDK. If you already use the OpenAI client in your application, you can switch to Tensoras by changing two lines: the API key and the base URL.

Python

Installation

pip install openai

Usage

from openai import OpenAI
 
client = OpenAI(
    api_key="tns_your_key_here",
    base_url="https://api.tensoras.ai/v1",
)
 
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is retrieval-augmented generation?"},
    ],
)
 
print(response.choices[0].message.content)

Async

import asyncio

from openai import AsyncOpenAI
 
client = AsyncOpenAI(
    api_key="tns_your_key_here",
    base_url="https://api.tensoras.ai/v1",
)
 
async def main() -> None:
    response = await client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "user", "content": "Hello!"},
        ],
    )
    print(response.choices[0].message.content)
 
asyncio.run(main())

Node.js

Installation

npm install openai

Usage

import OpenAI from "openai";
 
const client = new OpenAI({
  apiKey: "tns_your_key_here",
  baseURL: "https://api.tensoras.ai/v1",
});
 
const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is retrieval-augmented generation?" },
  ],
});
 
console.log(response.choices[0].message.content);

Environment Variables

You can set these environment variables instead of passing them in code:

export OPENAI_API_KEY="tns_your_key_here"
export OPENAI_BASE_URL="https://api.tensoras.ai/v1"

Then in Python:

from openai import OpenAI
 
client = OpenAI()  # picks up both env vars automatically

Streaming

Streaming works exactly the same as with the OpenAI SDK:

Python

stream = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "Write a haiku about open-source AI."},
    ],
    stream=True,
)
 
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
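If you also need the complete reply after streaming, accumulate the deltas as they arrive. A minimal sketch, using a stand-in list of delta values in place of live chunks (a real stream yields them via chunk.choices[0].delta.content):

```python
# Stand-ins for the delta.content values a live stream would yield;
# None mimics role/finish chunks that carry no text.
deltas = ["Open ", "weights, ", None, "open minds."]

parts = []
for delta in deltas:
    if delta:  # skip chunks with no content
        print(delta, end="", flush=True)
        parts.append(delta)

full_text = "".join(parts)
```

This keeps printing incremental output while `full_text` holds the assembled message at the end.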

Node.js

const stream = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "user", content: "Write a haiku about open-source AI." },
  ],
  stream: true,
});
 
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) process.stdout.write(delta);
}

Tool Calling

Tool calling follows the OpenAI format:

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "What is the weather in San Francisco?"},
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"},
                    },
                    "required": ["city"],
                },
            },
        }
    ],
)
 
tool_call = response.choices[0].message.tool_calls[0]
print(tool_call.function.name)       # "get_weather"
print(tool_call.function.arguments)  # '{"city": "San Francisco"}'
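The arguments field is delivered as a JSON string, so the usual next step is to parse it and dispatch to your own function. A minimal sketch; the get_weather implementation and the literal arguments string below are stand-ins for your code and a live response, not part of the API:

```python
import json

# Stand-in for tool_call.function.arguments from a live response.
arguments_json = '{"city": "San Francisco"}'

def get_weather(city: str) -> str:
    # Hypothetical local implementation; replace with a real lookup.
    return f"Sunny in {city}"

args = json.loads(arguments_json)  # parse the JSON-encoded arguments
result = get_weather(**args)
print(result)  # Sunny in San Francisco
```

In a full tool-calling loop you would then append the result as a message with role "tool" and call chat.completions.create again so the model can use it.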

JSON Mode

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "List 3 programming languages as JSON."},
    ],
    response_format={"type": "json_object"},
)
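In JSON mode the reply still arrives as a string in message.content, so parse it before use. A sketch with a stand-in string in place of a live response:

```python
import json

# Stand-in for response.choices[0].message.content under JSON mode.
content = '{"languages": ["Python", "Rust", "Go"]}'

data = json.loads(content)  # str -> dict
print(data["languages"])
```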

Embeddings

response = client.embeddings.create(
    model="gte-large-en-v1.5",
    input="The quick brown fox.",
)
 
print(len(response.data[0].embedding))
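Embedding vectors are typically compared with cosine similarity. A self-contained sketch on toy vectors (real gte-large-en-v1.5 embeddings have far more dimensions; in practice the inputs would be two response.data[i].embedding lists):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for two embedding vectors.
v1 = [0.1, 0.2, 0.3]
v2 = [0.1, 0.2, 0.25]
print(cosine_similarity(v1, v2))
```

Identical vectors score 1.0; orthogonal vectors score 0.0.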

File Uploads

file = client.files.create(
    file=open("data.jsonl", "rb"),
    purpose="batch",
)
 
print(file.id)

What Works

The following OpenAI SDK features are fully compatible with Tensoras:

  • Chat completions (streaming and non-streaming)
  • Tool calling / function calling
  • JSON mode and structured outputs
  • Embeddings
  • File uploads
  • Batch API
  • Model listing

What’s Different

These features are Tensoras extensions and are not part of the OpenAI API:

  • knowledge_bases parameter: Pass Knowledge Base IDs to a chat completion for RAG. Not part of the OpenAI spec; use extra_body to pass it via the OpenAI SDK.
  • /v1/rerank endpoint: Reranking is not part of the OpenAI API. Use the Tensoras SDK or call it directly over HTTP.
  • Citations in responses: RAG responses include a citations field, accessible via the raw response when using the OpenAI SDK.
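Since /v1/rerank is outside the OpenAI spec, one option is to call it over plain HTTP. A sketch of building such a request with the standard library; the request body shape (model, query, documents) and the model name are assumptions, not confirmed by this page, so check the Tensoras API reference before relying on them:

```python
import json
import urllib.request

# Assumed request shape; verify field names against the Tensoras API reference.
payload = {
    "model": "rerank-model-id",  # hypothetical model name
    "query": "How do I reset my password?",
    "documents": ["Billing FAQ", "Password reset guide"],
}

req = urllib.request.Request(
    "https://api.tensoras.ai/v1/rerank",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer tns_your_key_here",
        "Content-Type": "application/json",
    },
    method="POST",
)
# response = urllib.request.urlopen(req)  # send with a real key
```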

Using Tensoras Extensions with the OpenAI SDK

You can still access Tensoras-specific features through extra_body:

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "How do I reset my password?"},
    ],
    extra_body={
        "knowledge_bases": ["kb_a1b2c3d4"],
    },
)

Tip: For full access to Tensoras-specific features with proper typing, use the Tensoras Python SDK or Tensoras Node.js SDK.

Migrating from OpenAI

If you are migrating an existing OpenAI integration, see the Migrate from OpenAI guide for a step-by-step walkthrough.

Next Steps