Embeddings
Generate vector embeddings for text input. Embeddings are numerical representations of text that capture semantic meaning, useful for search, clustering, classification, and retrieval-augmented generation (RAG).
Endpoint
POST https://api.tensoras.ai/v1/embeddings

Authentication

Authorization: Bearer tns_your_key_here

Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | The embedding model to use. Currently supported: bge-large-en-v1.5. |
| input | string or array | Yes | — | The text to embed. Can be a single string or an array of strings. Maximum of 2048 tokens per string. |
| encoding_format | string | No | "float" | The format of the returned embeddings. Either "float" or "base64". |
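When encoding_format is "base64", each embedding arrives as a base64 string rather than a JSON array. A minimal decoding sketch, assuming the vector is packed as little-endian float32 (the convention used by OpenAI-compatible APIs — verify against an actual response before relying on it):

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    # Assumes little-endian float32 packing; 4 bytes per dimension.
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip check with a small synthetic vector:
vec = [0.0023064255, -0.009327292, 0.015797347]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode()
decoded = decode_embedding(encoded)
```

Base64 encoding roughly halves the payload size versus a JSON float array, which matters for batch requests with 1024-dimensional vectors.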
Response Body
```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023064255, -0.009327292, 0.015797347, ...]
    }
  ],
  "model": "bge-large-en-v1.5",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```

| Field | Type | Description |
|---|---|---|
| object | string | Always "list". |
| data | array | A list of embedding objects. |
| data[].object | string | Always "embedding". |
| data[].index | integer | The index of the embedding in the input array. |
| data[].embedding | array | The embedding vector. A list of floats with 1024 dimensions for bge-large-en-v1.5. |
| model | string | The model used to generate the embeddings. |
| usage | object | Token usage statistics. |
| usage.prompt_tokens | integer | Number of tokens in the input. |
| usage.total_tokens | integer | Total tokens processed. |
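The response can also be consumed without an SDK. A small sketch parsing a raw JSON payload (here the sample response shown above, embedded as a string for illustration):

```python
import json

# Sample payload matching the documented response shape.
sample = '''
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0,
     "embedding": [0.0023064255, -0.009327292, 0.015797347]}
  ],
  "model": "bge-large-en-v1.5",
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
'''

payload = json.loads(sample)

# Sort by index so vectors line up with the order of the input array.
vectors = [item["embedding"]
           for item in sorted(payload["data"], key=lambda d: d["index"])]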
Examples
Single Text Embedding
curl
```bash
curl https://api.tensoras.ai/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tns_your_key_here" \
  -d '{
    "model": "bge-large-en-v1.5",
    "input": "The quick brown fox jumps over the lazy dog."
  }'
```

Python
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tensoras.ai/v1",
    api_key="tns_your_key_here",
)

response = client.embeddings.create(
    model="bge-large-en-v1.5",
    input="The quick brown fox jumps over the lazy dog.",
)

embedding = response.data[0].embedding
print(f"Embedding dimensions: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
```

Node.js
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.tensoras.ai/v1",
  apiKey: "tns_your_key_here",
});

const response = await client.embeddings.create({
  model: "bge-large-en-v1.5",
  input: "The quick brown fox jumps over the lazy dog.",
});

const embedding = response.data[0].embedding;
console.log(`Embedding dimensions: ${embedding.length}`);
console.log(`First 5 values: ${embedding.slice(0, 5)}`);
```

Batch Embeddings
Generate embeddings for multiple texts in a single request.
curl
```bash
curl https://api.tensoras.ai/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer tns_your_key_here" \
  -d '{
    "model": "bge-large-en-v1.5",
    "input": [
      "What is machine learning?",
      "How does natural language processing work?",
      "Explain deep neural networks."
    ]
  }'
```

Python
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tensoras.ai/v1",
    api_key="tns_your_key_here",
)

texts = [
    "What is machine learning?",
    "How does natural language processing work?",
    "Explain deep neural networks.",
]

response = client.embeddings.create(
    model="bge-large-en-v1.5",
    input=texts,
)

for item in response.data:
    print(f"Text {item.index}: {len(item.embedding)} dimensions")
```

Node.js
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.tensoras.ai/v1",
  apiKey: "tns_your_key_here",
});

const texts = [
  "What is machine learning?",
  "How does natural language processing work?",
  "Explain deep neural networks.",
];

const response = await client.embeddings.create({
  model: "bge-large-en-v1.5",
  input: texts,
});

for (const item of response.data) {
  console.log(`Text ${item.index}: ${item.embedding.length} dimensions`);
}
```

Computing Similarity
Use cosine similarity to measure the semantic similarity between two texts.
Python
```python
import numpy as np
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tensoras.ai/v1",
    api_key="tns_your_key_here",
)

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

response = client.embeddings.create(
    model="bge-large-en-v1.5",
    input=[
        "I love programming in Python",
        "Python is my favorite coding language",
        "The weather is nice today",
    ],
)

emb_a = response.data[0].embedding
emb_b = response.data[1].embedding
emb_c = response.data[2].embedding

print(f"Similarity (a, b): {cosine_similarity(emb_a, emb_b):.4f}")  # High similarity
print(f"Similarity (a, c): {cosine_similarity(emb_a, emb_c):.4f}")  # Low similarity
```

Error Handling
Errors are returned as a JSON error object. For example, a request whose input exceeds the 2048-token limit fails with:
```json
{
  "error": {
    "message": "Input text exceeds maximum token limit of 2048",
    "type": "invalid_request_error",
    "param": "input",
    "code": "token_limit_exceeded"
  }
}
```
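One way to avoid this error is to split long documents before embedding them. A minimal sketch using a hypothetical word-based chunker — word counts only approximate token counts, so keep a wide safety margin or use the model's actual tokenizer:

```python
def chunk_text(text: str, max_words: int = 300) -> list[str]:
    # Rough guard against the 2048-token input limit: split on whitespace.
    # 300 words is a conservative margin, not an exact token budget.
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# A 1000-word document yields chunks of at most 300 words each.
chunks = chunk_text(" ".join(["word"] * 1000), max_words=300)
```

Each chunk can then be sent in a single batch request, with data[].index used to reassemble results in order.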