# Migrate from OpenAI
Tensoras.ai implements the OpenAI API specification, so migrating an existing OpenAI integration takes just a few minutes. This guide walks through each step and highlights the differences you should know about.
## Step 1: Get a Tensoras API Key

- Sign up or log in at cloud.tensoras.ai.
- Navigate to Console > API Keys.
- Click Create Key, give it a name, and copy the key. It starts with `tns_`.

```shell
export TENSORAS_API_KEY="tns_your_key_here"
```

## Step 2: Update Base URL and API Key
The only code change required is pointing your client at the Tensoras endpoint and swapping the key.
### Python (OpenAI SDK)

```python
# Before -- OpenAI
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
)

# After -- Tensoras
from openai import OpenAI

client = OpenAI(
    api_key="tns_your_key_here",
    base_url="https://api.tensoras.ai/v1",
)
```

Or use the native Tensoras SDK, which wraps the same API with additional helpers:

```python
from tensoras import Tensoras

client = Tensoras()  # reads TENSORAS_API_KEY from env
```

### Node.js (OpenAI SDK)
```javascript
// Before -- OpenAI
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-...",
});

// After -- Tensoras
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "tns_your_key_here",
  baseURL: "https://api.tensoras.ai/v1",
});
```

Or use the native Tensoras SDK:

```javascript
import Tensoras from "tensoras";

const client = new Tensoras(); // reads TENSORAS_API_KEY from env
```

### curl
```shell
# Before -- OpenAI
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{ ... }'

# After -- Tensoras
curl https://api.tensoras.ai/v1/chat/completions \
  -H "Authorization: Bearer $TENSORAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ ... }'
```

## Step 3: Map Your Models
Replace OpenAI model names with the corresponding Tensoras models:
| OpenAI Model | Tensoras Model | Notes |
|---|---|---|
| `gpt-4o` | `llama-3.3-70b` | Best overall quality, $0.20/$0.60 per M tokens |
| `gpt-4o-mini` | `llama-3.1-8b` | Fast and cheap, $0.05/$0.10 per M tokens |
| `o1` / `o1-mini` | `deepseek-r1-distill-70b` | Chain-of-thought reasoning, $0.15/$0.45 per M tokens |
| `gpt-3.5-turbo` | `mistral-7b-instruct` | Budget inference, $0.04/$0.08 per M tokens |
| `text-embedding-3-small` | `bge-large-en-v1.5` | Embeddings |
| `text-embedding-3-large` | `bge-large-en-v1.5` | Embeddings |
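If you reference model names in several places, the table above can be expressed as a small lookup helper. The `MODEL_MAP` dict and `map_model` function below are illustrative and not part of either SDK:

```python
# Mapping from OpenAI model names to their Tensoras equivalents,
# taken from the table above (illustrative helper, not an SDK API).
MODEL_MAP = {
    "gpt-4o": "llama-3.3-70b",
    "gpt-4o-mini": "llama-3.1-8b",
    "o1": "deepseek-r1-distill-70b",
    "o1-mini": "deepseek-r1-distill-70b",
    "gpt-3.5-turbo": "mistral-7b-instruct",
    "text-embedding-3-small": "bge-large-en-v1.5",
    "text-embedding-3-large": "bge-large-en-v1.5",
}

def map_model(openai_model: str) -> str:
    """Return the Tensoras equivalent of an OpenAI model name."""
    try:
        return MODEL_MAP[openai_model]
    except KeyError:
        raise ValueError(f"No Tensoras equivalent for {openai_model!r}")

print(map_model("gpt-4o"))  # llama-3.3-70b
```

Failing loudly on unknown names is deliberate: it surfaces any model string you forgot to migrate instead of silently sending it to the API.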
### Before and After — Python
```python
# Before -- OpenAI
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."},
    ],
)

# After -- Tensoras
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."},
    ],
)
```

### Before and After — Node.js
```javascript
// Before -- OpenAI
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing." },
  ],
});

// After -- Tensoras
const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing." },
  ],
});
```

## What’s Compatible
Tensoras supports the same request and response formats as OpenAI for the following features:
- Chat Completions — `POST /v1/chat/completions` with messages, temperature, top_p, max_tokens, stop sequences, and more
- Streaming — Server-Sent Events with `stream: true`, same delta format
- Tool Calling — `tools` and `tool_choice` parameters work identically
- JSON Mode — `response_format: { type: "json_object" }` is supported
- Structured Outputs — `response_format: { type: "json_schema", json_schema: {...} }` is supported
- Embeddings — `POST /v1/embeddings` with the same request/response shape
- Files — `POST /v1/files` for uploading documents
Any code that uses these features through the OpenAI SDK will work with Tensoras after updating the base URL, API key, and model name.
## What’s Different
While the core API is compatible, there are a few differences to be aware of:
### Model Names
Tensoras hosts open-source models. You must use Tensoras model identifiers (e.g., `llama-3.3-70b`) instead of OpenAI model names (e.g., `gpt-4o`). See the mapping table above.
### Knowledge Bases (Tensoras Extension)
Tensoras adds a knowledge_bases parameter to the chat completions endpoint. This lets you attach one or more Knowledge Bases for retrieval-augmented generation directly in the API call — no separate retrieval step required:
```python
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "What is our refund policy?"},
    ],
    knowledge_bases=["kb_a1b2c3d4"],
)
```

This parameter is not part of the OpenAI spec and will be ignored if you send it to OpenAI. See the RAG Overview for details.
### Pricing Model
OpenAI and Tensoras both charge per token, but Tensoras pricing is significantly lower because it serves open-source models on optimized infrastructure. See the Billing guide for the full pricing table.
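As a back-of-the-envelope comparison, you can estimate per-request cost from the $/M-token rates quoted in the mapping table above. The `estimate_cost` helper is illustrative only; consult the Billing guide for current rates:

```python
# Illustrative cost estimate using the (input, output) $/M-token
# rates from the model mapping table above -- check the Billing
# guide before relying on these numbers.
PRICES_PER_M = {
    "llama-3.3-70b": (0.20, 0.60),
    "llama-3.1-8b": (0.05, 0.10),
    "deepseek-r1-distill-70b": (0.15, 0.45),
    "mistral-7b-instruct": (0.04, 0.08),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    in_rate, out_rate = PRICES_PER_M[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# 1M input tokens plus 1M output tokens on llama-3.3-70b:
cost = estimate_cost("llama-3.3-70b", 1_000_000, 1_000_000)
print(f"${cost:.2f}")  # $0.80
```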
### No Assistants API
Tensoras does not implement the OpenAI Assistants API. If you use Assistants, Threads, or Runs, you will need to migrate that logic to direct chat completions with tool calling.
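One way to restructure Assistants-style code is to carry the conversation history yourself and send the whole list to chat completions on each turn. The `Thread` class below is a hypothetical stand-in for code that previously used OpenAI Threads and Runs, not an SDK API:

```python
# Sketch: replace an Assistants Thread with self-managed history.
# The Thread class is a hypothetical stand-in, not an SDK API.
class Thread:
    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user_message(self, content: str) -> None:
        self.messages.append({"role": "user", "content": content})

    def add_assistant_message(self, content: str) -> None:
        self.messages.append({"role": "assistant", "content": content})

thread = Thread("You are a helpful assistant.")
thread.add_user_message("Explain quantum computing.")

# Instead of creating a Run on a server-side Thread, send the full
# history on every turn and append the reply:
#   response = client.chat.completions.create(
#       model="llama-3.3-70b",
#       messages=thread.messages,
#   )
#   thread.add_assistant_message(response.choices[0].message.content)
print(len(thread.messages))  # 2
```

Tool definitions that lived on an Assistant move into the `tools` parameter of each chat completions call.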
## Full Migration Checklist
- Create a Tensoras API key at cloud.tensoras.ai
- Update `base_url` / `baseURL` to `https://api.tensoras.ai/v1`
- Replace `api_key` / `apiKey` with your `tns_` key
- Replace OpenAI model names with Tensoras equivalents
- Test chat completions, streaming, and any tool calling flows
- If using embeddings, switch to `bge-large-en-v1.5` and re-embed your data
- Update any hardcoded cost calculations to use Tensoras pricing
- (Optional) Switch from the OpenAI SDK to the native Tensoras Python SDK or Node.js SDK for Knowledge Base support
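Re-embedding a corpus is typically done in batches. The `batched` helper below is a plain-Python sketch; the embeddings call shown in the comment assumes the OpenAI-compatible `/v1/embeddings` endpoint described above:

```python
# Sketch: split a corpus into fixed-size batches for re-embedding.
def batched(items: list[str], size: int) -> list[list[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

docs = [f"document {i}" for i in range(5)]
for batch in batched(docs, size=2):
    # With a real client, each batch would be embedded like:
    #   client.embeddings.create(model="bge-large-en-v1.5", input=batch)
    print(len(batch))  # 2, 2, 1
```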
## Next Steps
- Quickstart — make your first Tensoras API call
- OpenAI-Compatible Usage — detailed OpenAI SDK configuration
- Billing — pricing details for all models
- RAG Overview — leverage Knowledge Bases for grounded responses