
OpenAI-Compatible Usage

Tensoras.ai is drop-in compatible with the OpenAI SDK. If you already use the OpenAI client in your application, you can switch to Tensoras by changing two lines: the API key and the base URL.

Python

Installation

pip install openai

Usage

from openai import OpenAI
 
client = OpenAI(
    api_key="tns_your_key_here",
    base_url="https://api.tensoras.ai/v1",
)
 
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is retrieval-augmented generation?"},
    ],
)
 
print(response.choices[0].message.content)

Async

import asyncio

from openai import AsyncOpenAI
 
client = AsyncOpenAI(
    api_key="tns_your_key_here",
    base_url="https://api.tensoras.ai/v1",
)
 
async def main() -> None:
    response = await client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "user", "content": "Hello!"},
        ],
    )
    print(response.choices[0].message.content)
 
asyncio.run(main())

Node.js

Installation

npm install openai

Usage

import OpenAI from "openai";
 
const client = new OpenAI({
  apiKey: "tns_your_key_here",
  baseURL: "https://api.tensoras.ai/v1",
});
 
const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is retrieval-augmented generation?" },
  ],
});
 
console.log(response.choices[0].message.content);

Environment Variables

You can set these environment variables instead of passing them in code:

export OPENAI_API_KEY="tns_your_key_here"
export OPENAI_BASE_URL="https://api.tensoras.ai/v1"

Then in Python:

from openai import OpenAI
 
client = OpenAI()  # picks up both env vars automatically

Streaming

Streaming works exactly the same as with the OpenAI SDK:

Python

stream = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "Write a haiku about open-source AI."},
    ],
    stream=True,
)
 
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
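If you also need the complete reply after streaming, accumulate the deltas as they arrive. A minimal sketch, using a stand-in list of delta values in place of live chunks (a real stream yields them via chunk.choices[0].delta.content):

```python
# Stand-ins for the delta.content values a live stream would yield;
# None mimics role/finish chunks that carry no text.
deltas = ["Open ", "weights, ", None, "open minds."]

parts = []
for delta in deltas:
    if delta:  # skip chunks with no content
        print(delta, end="", flush=True)
        parts.append(delta)

full_text = "".join(parts)
```

This keeps printing incremental output while `full_text` holds the assembled message at the end.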

Node.js

const stream = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "user", content: "Write a haiku about open-source AI." },
  ],
  stream: true,
});
 
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) process.stdout.write(delta);
}

Tool Calling

Tool calling follows the OpenAI format:

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "What is the weather in San Francisco?"},
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"},
                    },
                    "required": ["city"],
                },
            },
        }
    ],
)
 
tool_call = response.choices[0].message.tool_calls[0]
print(tool_call.function.name)       # "get_weather"
print(tool_call.function.arguments)  # '{"city": "San Francisco"}'
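The arguments field is delivered as a JSON string, so the usual next step is to parse it and dispatch to your own function. A minimal sketch; the get_weather implementation and the literal arguments string below are stand-ins for your code and a live response, not part of the API:

```python
import json

# Stand-in for tool_call.function.arguments from a live response.
arguments_json = '{"city": "San Francisco"}'

def get_weather(city: str) -> str:
    # Hypothetical local implementation; replace with a real lookup.
    return f"Sunny in {city}"

args = json.loads(arguments_json)  # parse the JSON-encoded arguments
result = get_weather(**args)
print(result)  # Sunny in San Francisco
```

In a full tool-calling loop you would then append the result as a message with role "tool" and call chat.completions.create again so the model can use it.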

JSON Mode

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "List 3 programming languages as JSON."},
    ],
    response_format={"type": "json_object"},
)
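In JSON mode the reply still arrives as a string in message.content, so parse it before use. A sketch with a stand-in string in place of a live response:

```python
import json

# Stand-in for response.choices[0].message.content under JSON mode.
content = '{"languages": ["Python", "Rust", "Go"]}'

data = json.loads(content)  # str -> dict
print(data["languages"])
```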

Embeddings

response = client.embeddings.create(
    model="gte-large-en-v1.5",
    input="The quick brown fox.",
)
 
print(len(response.data[0].embedding))
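Embedding vectors are typically compared with cosine similarity. A self-contained sketch on toy vectors (real gte-large-en-v1.5 embeddings have far more dimensions; in practice the inputs would be two response.data[i].embedding lists):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for two embedding vectors.
v1 = [0.1, 0.2, 0.3]
v2 = [0.1, 0.2, 0.25]
print(cosine_similarity(v1, v2))
```

Identical vectors score 1.0; orthogonal vectors score 0.0.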

File Uploads

file = client.files.create(
    file=open("data.jsonl", "rb"),
    purpose="batch",
)
 
print(file.id)

What Works

The following OpenAI SDK features are fully compatible with Tensoras:

  • Chat completions (streaming and non-streaming)
  • Tool calling / function calling
  • JSON mode and structured outputs
  • Embeddings
  • File uploads
  • Batch API
  • Model listing

What’s Different

These features are Tensoras extensions and are not part of the OpenAI API:

  • knowledge_bases parameter: Pass Knowledge Base IDs to a chat completion for RAG. Not part of the OpenAI spec; use extra_body to pass it via the OpenAI SDK.
  • /v1/rerank endpoint: Reranking is not part of the OpenAI API. Use the Tensoras SDK or call it directly over HTTP.
  • Citations in responses: RAG responses include a citations field, accessible via the raw response when using the OpenAI SDK.
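Since /v1/rerank is outside the OpenAI spec, one option is to call it over plain HTTP. A sketch of building such a request with the standard library; the request body shape (model, query, documents) and the model name are assumptions, not confirmed by this page, so check the Tensoras API reference before relying on them:

```python
import json
import urllib.request

# Assumed request shape; verify field names against the Tensoras API reference.
payload = {
    "model": "rerank-model-id",  # hypothetical model name
    "query": "How do I reset my password?",
    "documents": ["Billing FAQ", "Password reset guide"],
}

req = urllib.request.Request(
    "https://api.tensoras.ai/v1/rerank",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer tns_your_key_here",
        "Content-Type": "application/json",
    },
    method="POST",
)
# response = urllib.request.urlopen(req)  # send with a real key
```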

Using Tensoras Extensions with the OpenAI SDK

You can still access Tensoras-specific features through extra_body:

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "How do I reset my password?"},
    ],
    extra_body={
        "knowledge_bases": ["kb_a1b2c3d4"],
    },
)

Tip: For full access to Tensoras-specific features with proper typing, use the Tensoras Python SDK or Tensoras Node.js SDK.

Migrating from OpenAI

If you are migrating an existing OpenAI integration, see the Migrate from OpenAI guide for a step-by-step walkthrough.

Next Steps