Quickstart

Go from zero to your first Tensoras.ai API call in under five minutes.

1. Get an API Key

  1. Sign up or log in at cloud.tensoras.ai.
  2. Navigate to Console > API Keys.
  3. Click Create Key, give it a name, and copy the key. It starts with tns_.

Important: Store your API key securely. You will not be able to view it again after creation.

Set it as an environment variable so the SDKs pick it up automatically:

export TENSORAS_API_KEY="tns_your_key_here"
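If you want to fail fast when the variable is missing or malformed, a small check like the following can help. This helper is illustrative, not part of the SDK; the client performs its own lookup but only errors at the first request:

```python
import os

def load_api_key(env=None):
    """Return the Tensoras API key, failing fast with a clear message.

    Illustrative convenience check, not part of the SDK itself.
    """
    env = os.environ if env is None else env
    key = env.get("TENSORAS_API_KEY", "")
    if not key.startswith("tns_"):  # keys start with tns_, per the console
        raise RuntimeError(
            "TENSORAS_API_KEY is missing or malformed; keys start with 'tns_'. "
            "Create one at cloud.tensoras.ai and export it first."
        )
    return key
```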

2. Install the SDK

pip install tensoras

You can also use the OpenAI SDK directly — see OpenAI-Compatible Usage.

3. Make Your First Request

Send a chat completion request to Llama 3.3 70B.

from tensoras import Tensoras
 
client = Tensoras()  # reads TENSORAS_API_KEY from env
 
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is retrieval-augmented generation?"},
    ],
)
 
print(response.choices[0].message.content)
Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "llama-3.3-70b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Retrieval-augmented generation (RAG) is a technique that enhances LLM responses by retrieving relevant documents from an external knowledge base and including them in the prompt context. This allows the model to ground its answers in specific, up-to-date information rather than relying solely on its training data."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 52,
    "total_tokens": 80
  }
}
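On the wire, the response above is plain JSON, so its fields can be pulled out with nothing but the standard library. This is handy if you call the HTTP API directly rather than through the SDK (the payload below is the example above, with the answer text abridged):

```python
import json

# Abridged copy of the example response; same structure as the wire format.
raw = '''
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "llama-3.3-70b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Retrieval-augmented generation (RAG) is a technique..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 28, "completion_tokens": 52, "total_tokens": 80}
}
'''

data = json.loads(raw)
print(data["choices"][0]["message"]["content"])
print("finish_reason:", data["choices"][0]["finish_reason"])
print("total tokens:", data["usage"]["total_tokens"])
```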

4. Try Streaming

Stream tokens as they are generated for a responsive user experience.

from tensoras import Tensoras
 
client = Tensoras()
 
stream = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "Write a haiku about open-source AI."},
    ],
    stream=True,
)
 
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
 
print()  # newline after stream completes
Output (streamed)
Weights shared with the world
Silicon dreams set free to learn
All may shape the mind

See the full Streaming guide for SSE details, error handling, and cancellation.
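If you also need the complete reply at the end (for logging or caching), the streaming loop generalizes to an accumulator. The stand-in chunks below only mimic the chunk shape used in the example (`choices[0].delta.content`); in real code you would pass the `stream` object returned by `chat.completions.create(..., stream=True)`:

```python
from types import SimpleNamespace

def accumulate(stream):
    """Print streamed deltas live and return the full reply as one string."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's delta is typically empty
            print(delta, end="", flush=True)
            parts.append(delta)
    print()  # newline after the stream completes
    return "".join(parts)

# Stand-in chunks shaped like the SDK's, for illustration only.
def fake_chunk(text):
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

demo = [fake_chunk("Weights "), fake_chunk("shared"), fake_chunk(None)]
full = accumulate(demo)
```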

5. Try RAG with Knowledge Bases

Tensoras Knowledge Bases let you upload documents and query them alongside a chat completion. The model receives relevant chunks as context and can return citations.

Step 1: Create a Knowledge Base

from tensoras import Tensoras
 
client = Tensoras()
 
kb = client.knowledge_bases.create(
    name="product-docs",
    description="Internal product documentation",
)
 
print(kb.id)  # e.g. "kb_a1b2c3d4"

Step 2: Upload a File

with open("product-guide.pdf", "rb") as f:
    data_source = client.knowledge_bases.data_sources.create(
        knowledge_base_id=kb.id,
        type="file_upload",
        file=f,
    )
 
# Ingestion runs asynchronously; poll until the status reaches "completed"
print(data_source.status)  # "processing" -> "completed"
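Because ingestion is asynchronous, queries should wait until the status reaches "completed". A small generic poller works; the exact re-fetch call is not shown in this quickstart, so pass in whatever retrieval function the Knowledge Bases API provides:

```python
import time

def wait_until_completed(fetch, poll_interval=2.0, timeout=300.0):
    """Poll fetch() until it reports "completed" or "failed".

    fetch should return an object with a .status attribute -- for example,
    a function that re-fetches the data source by id. (The specific
    retrieve call is an assumption; check the API reference.)
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        obj = fetch()
        if obj.status == "completed":
            return obj
        if obj.status == "failed":
            raise RuntimeError("ingestion failed")
        time.sleep(poll_interval)
    raise TimeoutError("ingestion did not finish in time")
```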

Step 3: Query with RAG

Pass the knowledge_bases parameter in your chat completion request. Tensoras retrieves relevant chunks using hybrid search (vector + keyword) and injects them as context.

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "user", "content": "How do I reset my password?"},
    ],
    knowledge_bases=["kb_a1b2c3d4"],
)
 
print(response.choices[0].message.content)
 
# Access citations
for citation in response.citations:
    print(f"Source: {citation.source}, Score: {citation.score:.3f}")
Output
To reset your password, go to Settings > Account > Security and click
"Reset Password." You will receive a confirmation email with a reset link
that expires after 24 hours.

Source: product-guide.pdf, Score: 0.934

See the RAG Overview, Hybrid Search, Citations, and Connectors guides for more.
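Conceptually, the retrieval step boils down to ranking chunks and injecting the best ones into the prompt. The sketch below illustrates that injection step only; the real prompt template, chunking, and hybrid-search scoring are internal to Tensoras, and the sample chunks are made up:

```python
def build_rag_prompt(question, chunks, max_chunks=3):
    """Assemble a context-augmented prompt from scored chunks (illustrative)."""
    top = sorted(chunks, key=lambda c: c["score"], reverse=True)[:max_chunks]
    context = "\n\n".join(f"[{c['source']}] {c['text']}" for c in top)
    return (
        "Answer using only the context below. Cite sources in brackets.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Hypothetical retrieved chunks, for illustration.
chunks = [
    {"source": "product-guide.pdf", "score": 0.934,
     "text": "Reset your password under Settings > Account > Security."},
    {"source": "product-guide.pdf", "score": 0.512,
     "text": "Reset links expire after 24 hours."},
]
print(build_rag_prompt("How do I reset my password?", chunks))
```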

6. Next Steps

You are all set. Here is where to go next depending on what you want to build: