
Node.js SDK

The official Tensoras Node.js SDK provides a fully typed client for the Tensoras.ai API with first-class TypeScript support.

Installation

npm install tensoras
# or with yarn / pnpm
yarn add tensoras
pnpm add tensoras

Requires Node.js 18+.

Quick Start

import Tensoras from "tensoras";
 
const client = new Tensoras({ apiKey: "tns_your_key_here" });
 
const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "user", content: "Explain RAG in one sentence." },
  ],
});
 
console.log(response.choices[0].message.content);

Authentication

The client looks for an API key in this order:

  1. The apiKey option passed to the constructor.
  2. The TENSORAS_API_KEY environment variable.
export TENSORAS_API_KEY="tns_your_key_here"

import Tensoras from "tensoras";
 
const client = new Tensoras(); // reads TENSORAS_API_KEY from env

Custom Base URL

Point the client at a different endpoint for local development or self-hosted deployments:

const client = new Tensoras({
  apiKey: "tns_...",
  baseURL: "http://localhost:8000/v1",
});

The default base URL is https://api.tensoras.ai/v1.

Available Resources

Resource                   Description
client.chat.completions    Chat completions (streaming and non-streaming)
client.embeddings          Text embeddings
client.rerank              Reranking
client.models              List and retrieve models
client.files               Upload and manage files
client.batches             Batch processing
client.knowledgeBases      Create and manage Knowledge Bases

Chat Completions

Basic Request

const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
  temperature: 0.7,
  max_tokens: 256,
});
 
console.log(response.choices[0].message.content);

Streaming

const stream = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "user", content: "Write a short poem about APIs." },
  ],
  stream: true,
});
 
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) process.stdout.write(delta);
}
 
console.log();
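When you need the full completion text rather than incremental output, the deltas can be accumulated as they arrive. A minimal sketch, using a mock stream shaped like the chunks above in place of a live client.chat.completions.create({ ..., stream: true }) call:

```typescript
// Accumulate streamed deltas into the full completion text.
// `Chunk` mirrors the shape of the chunks iterated above.
type Chunk = { choices: { delta?: { content?: string } }[] };

async function collectStream(stream: AsyncIterable<Chunk>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) text += delta;
  }
  return text;
}

// Mock stream for illustration only; a real stream comes from the SDK call above.
async function* mockStream(): AsyncIterable<Chunk> {
  yield { choices: [{ delta: { content: "Hello, " } }] };
  yield { choices: [{ delta: { content: "world!" } }] };
  yield { choices: [{ delta: {} }] };
}

collectStream(mockStream()).then((text) => console.log(text)); // "Hello, world!"
```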

Structured Outputs

Constrain the model's output to valid JSON, or to JSON conforming to a specific schema, using response_format:

JSON Object Mode

const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "system", content: "Return JSON with keys: name, genre, year." },
    { role: "user", content: "Tell me about Inception." },
  ],
  response_format: { type: "json_object" },
});
 
const data = JSON.parse(response.choices[0].message.content!);

JSON Schema Mode

import type { ResponseFormatJsonSchema } from "tensoras";
 
const responseFormat: ResponseFormatJsonSchema = {
  type: "json_schema",
  json_schema: {
    name: "movie",
    strict: true,
    schema: {
      type: "object",
      properties: {
        name: { type: "string" },
        year: { type: "integer" },
        genre: { type: "string" },
      },
      required: ["name", "year", "genre"],
      additionalProperties: false,
    },
  },
};
 
const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "system", content: "Extract movie data." },
    { role: "user", content: "Tell me about Inception." },
  ],
  response_format: responseFormat,
});
 
const data = JSON.parse(response.choices[0].message.content!);
console.log(data.name);  // "Inception"
console.log(data.year);  // 2010

See Structured Outputs for full details on schema support and best practices.
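Even with strict mode, it is good practice to validate the parsed value at runtime before using it. A plain type-guard sketch for the movie shape defined above, with no external validation library:

```typescript
// Runtime guard for the movie schema above; a minimal sketch
// with no external dependencies.
interface Movie {
  name: string;
  year: number;
  genre: string;
}

function isMovie(value: unknown): value is Movie {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === "string" &&
    Number.isInteger(v.year) &&
    typeof v.genre === "string"
  );
}

const parsed: unknown = JSON.parse(
  '{"name":"Inception","year":2010,"genre":"sci-fi"}'
);
if (isMovie(parsed)) {
  console.log(parsed.name); // safe, typed access
}
```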

Embeddings

const response = await client.embeddings.create({
  model: "gte-large-en-v1.5",
  input: "The quick brown fox jumps over the lazy dog.",
});
 
const embedding = response.data[0].embedding;
console.log(`Dimensions: ${embedding.length}`);
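Embeddings are typically compared with cosine similarity. A small helper for that (plain TypeScript; not part of the SDK):

```typescript
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// Utility sketch for comparing embeddings; not provided by the SDK.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```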

Reranking

const response = await client.rerank.create({
  model: "bge-reranker-v2-m3",
  query: "What is deep learning?",
  documents: [
    "Deep learning is a subset of machine learning.",
    "The weather today is sunny.",
    "Neural networks are the foundation of deep learning.",
  ],
});
 
for (const result of response.results) {
  console.log(`Index: ${result.index}, Score: ${result.relevance_score.toFixed(4)}`);
}
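Each result's index refers back to the position in the documents array, so you can sort by relevance and recover the top matches. A sketch using a mock results array shaped like response.results above:

```typescript
// Map rerank results back to the original documents and keep the top k.
// `results` here is a mock shaped like `response.results` above.
interface RerankResult {
  index: number;
  relevance_score: number;
}

function topDocuments(
  documents: string[],
  results: RerankResult[],
  k: number
): string[] {
  return [...results]
    .sort((a, b) => b.relevance_score - a.relevance_score)
    .slice(0, k)
    .map((r) => documents[r.index]);
}

const documents = [
  "Deep learning is a subset of machine learning.",
  "The weather today is sunny.",
  "Neural networks are the foundation of deep learning.",
];
const results: RerankResult[] = [
  { index: 0, relevance_score: 0.92 },
  { index: 1, relevance_score: 0.03 },
  { index: 2, relevance_score: 0.88 },
];

console.log(topDocuments(documents, results, 2));
```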

Knowledge Bases

Create and Query

import fs from "node:fs";

// Create a Knowledge Base
const kb = await client.knowledgeBases.create({
  name: "support-docs",
  description: "Customer support documentation",
});
 
console.log(kb.id); // e.g. "kb_a1b2c3d4"
 
// Add a data source
const dataSource = await client.knowledgeBases.dataSources.create(kb.id, {
  type: "file_upload",
  file: fs.createReadStream("handbook.pdf"),
});
 
// Query with RAG
const response = await client.chat.completions.create({
  model: "llama-3.3-70b",
  messages: [
    { role: "user", content: "How do I reset my password?" },
  ],
  knowledge_bases: [kb.id],
});
 
console.log(response.choices[0].message.content);
 
for (const citation of response.citations) {
  console.log(`  Source: ${citation.source}, Score: ${citation.score.toFixed(3)}`);
}

List Knowledge Bases

const knowledgeBases = await client.knowledgeBases.list();
 
for (const kb of knowledgeBases.data) {
  console.log(`${kb.id}: ${kb.name}`);
}

TypeScript Types

The SDK exports types for all request and response objects:

import Tensoras from "tensoras";
import type {
  ChatCompletion,
  ChatCompletionChunk,
  ChatCompletionMessage,
  ChatCompletionCreateParams,
  EmbeddingResponse,
  EmbeddingCreateParams,
  RerankResponse,
  Model,
  KnowledgeBase,
} from "tensoras";
 
const client = new Tensoras();
 
const params: ChatCompletionCreateParams = {
  model: "llama-3.3-70b",
  messages: [{ role: "user", content: "Hello" }],
};
 
const response: ChatCompletion = await client.chat.completions.create(params);
const message: ChatCompletionMessage = response.choices[0].message;

Error Handling

The SDK throws typed errors that you can catch individually:

import Tensoras, {
  TensorasAPIError,
  AuthenticationError,
  RateLimitError,
} from "tensoras";
 
const client = new Tensoras();
 
try {
  const response = await client.chat.completions.create({
    model: "llama-3.3-70b",
    messages: [{ role: "user", content: "Hello" }],
  });
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.error("Invalid or missing API key.");
  } else if (error instanceof RateLimitError) {
    console.error(`Rate limited. Retry after ${error.retryAfter}s.`);
  } else if (error instanceof TensorasAPIError) {
    console.error(`API error ${error.status}: ${error.message}`);
  } else {
    throw error;
  }
}

Exception Hierarchy

Error Class             Status Code   Description
TensorasAPIError                      Base class for all API errors
AuthenticationError     401           Invalid or missing API key
PermissionDeniedError   403           Key lacks required permissions
NotFoundError           404           Resource not found
RateLimitError          429           Too many requests
InternalServerError     500+          Server-side error
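The status-to-class mapping in the table above can be expressed as a simple lookup. This is a documentation sketch returning the class name for a given status, not the SDK's actual internals:

```typescript
// Map an HTTP status code to the error-class name from the table above.
// Illustrative only; the SDK's real dispatch logic is internal.
function errorClassFor(status: number): string {
  switch (status) {
    case 401: return "AuthenticationError";
    case 403: return "PermissionDeniedError";
    case 404: return "NotFoundError";
    case 429: return "RateLimitError";
    default:
      return status >= 500 ? "InternalServerError" : "TensorasAPIError";
  }
}

console.log(errorClassFor(429)); // "RateLimitError"
console.log(errorClassFor(503)); // "InternalServerError"
```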

Automatic Retries

The SDK automatically retries failed requests up to 3 times with exponential backoff for transient errors (429, 500, 502, 503, 504). You can customize this:

const client = new Tensoras({
  maxRetries: 5,      // default: 3
  timeout: 60_000,    // request timeout in ms, default: 120_000
});
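The backoff pattern itself looks like the following. The SDK's exact base delay, cap, and jitter are internal details, so the numbers here are illustrative assumptions; only the min(base * 2^attempt, cap) shape is the point:

```typescript
// Illustrative exponential backoff: delay = min(base * 2^attempt, cap).
// Base and cap values are assumptions, not the SDK's actual constants.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

for (let attempt = 0; attempt < 5; attempt++) {
  console.log(`retry ${attempt}: wait ${backoffDelayMs(attempt)} ms`);
}
// retry 0 waits 500 ms, doubling each attempt up to the 8000 ms cap
```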

Next Steps