
LangChain

Use Tensoras.ai as the LLM, embeddings, and retrieval provider in your LangChain applications.

Installation

pip install langchain-tensoras

This installs the Tensoras integration package for LangChain. It requires langchain-core>=0.2.

Authentication

Set your API key as an environment variable:

export TENSORAS_API_KEY="tns_your_key_here"

Or pass it directly to each component.
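The precedence is: an explicit api_key argument wins, otherwise the component falls back to the environment variable. As a plain-Python sketch of that resolution order (resolve_api_key is an illustrative helper, not part of the package; the components handle this internally):

```python
import os

def resolve_api_key(explicit=None):
    """Explicit argument wins; otherwise fall back to TENSORAS_API_KEY."""
    key = explicit or os.environ.get("TENSORAS_API_KEY")
    if not key:
        raise RuntimeError("Set TENSORAS_API_KEY or pass api_key explicitly.")
    return key
```
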

Chat Model

Use ChatTensoras as a drop-in LangChain chat model:

from langchain_tensoras import ChatTensoras
 
llm = ChatTensoras(
    model="llama-3.3-70b",
    api_key="tns_your_key_here",  # or set TENSORAS_API_KEY
    temperature=0.7,
    max_tokens=512,
)
 
response = llm.invoke("What is retrieval-augmented generation?")
print(response.content)

Streaming

for chunk in llm.stream("Write a poem about open-source AI."):
    print(chunk.content, end="", flush=True)

Async

# ainvoke must be awaited inside an async function (e.g. run via asyncio.run)
response = await llm.ainvoke("What is RAG?")
print(response.content)

With Message History

from langchain_core.messages import HumanMessage, SystemMessage
 
messages = [
    SystemMessage(content="You are a helpful coding assistant."),
    HumanMessage(content="Write a Python function to reverse a string."),
]
 
response = llm.invoke(messages)
print(response.content)

Embeddings

Use TensorasEmbeddings for document and query embedding:

from langchain_tensoras import TensorasEmbeddings
 
embeddings = TensorasEmbeddings(
    model="gte-large-en-v1.5",
    api_key="tns_your_key_here",  # or set TENSORAS_API_KEY
)
 
# Embed a single query
query_vector = embeddings.embed_query("What is deep learning?")
print(f"Dimensions: {len(query_vector)}")
 
# Embed multiple documents
doc_vectors = embeddings.embed_documents([
    "Deep learning is a subset of machine learning.",
    "Neural networks have multiple layers.",
])
print(f"Embedded {len(doc_vectors)} documents")
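Embedding vectors are typically compared by cosine similarity: closer in meaning means a score nearer 1.0. The retriever and Knowledge Base handle this for you; the stdlib sketch below just shows what the vectors are for (the toy 2-D vectors stand in for real embed_query / embed_documents output):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```
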

Retriever

Use TensorasRetriever to retrieve documents from a Tensoras Knowledge Base:

from langchain_tensoras import TensorasRetriever
 
retriever = TensorasRetriever(
    knowledge_base_id="kb_a1b2c3d4",
    api_key="tns_your_key_here",  # or set TENSORAS_API_KEY
    top_k=5,
)
 
docs = retriever.invoke("How do I reset my password?")
 
for doc in docs:
    print(f"Score: {doc.metadata['score']:.3f}")
    print(doc.page_content[:200])
    print()
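The score in doc.metadata can be used to drop low-confidence matches before they reach a prompt. A sketch using a stand-in dataclass (the real Document class comes from langchain_core, and the 0.5 threshold is application-specific, not a package default):

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

docs = [
    Doc("Go to Settings > Security and click 'Reset password'.", {"score": 0.91}),
    Doc("Invoices are emailed on the 1st of each month.", {"score": 0.34}),
]

# Keep only confident matches
relevant = [d for d in docs if d.metadata.get("score", 0.0) >= 0.5]
print(len(relevant))  # 1
```
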

RAG Chain

Combine the chat model, embeddings, and retriever into a full RAG chain:

from langchain_tensoras import ChatTensoras, TensorasRetriever
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
 
# Components
llm = ChatTensoras(model="llama-3.3-70b")
retriever = TensorasRetriever(knowledge_base_id="kb_a1b2c3d4", top_k=5)
 
# Prompt
prompt = ChatPromptTemplate.from_template(
    """Answer the question based on the following context.
 
Context:
{context}
 
Question: {question}
 
Answer:"""
)
 
# Helper to format retrieved docs
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
 
# Chain
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
 
answer = chain.invoke("How do I reset my password?")
print(answer)
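Conceptually, the runnable pipeline above is ordinary function composition: retrieve, format, fill the prompt, generate. A plain-Python sketch of that data flow with stand-in callables (not the LCEL implementation, just the shape of it):

```python
def run_rag(question, retrieve, generate):
    """retrieve: question -> list of passage strings
    generate: prompt string -> answer string"""
    context = "\n\n".join(retrieve(question))
    prompt = (
        "Answer the question based on the following context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\nAnswer:"
    )
    return generate(prompt)

# Stand-in retriever and LLM for illustration
answer = run_rag(
    "How do I reset my password?",
    retrieve=lambda q: ["Use the 'Forgot password' link on the sign-in page."],
    generate=lambda p: "Use the 'Forgot password' link on the sign-in page.",
)
print(answer)
```
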

Tool Calling

ChatTensoras supports LangChain’s tool calling interface:

from langchain_core.tools import tool
from langchain_tensoras import ChatTensoras
 
@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is sunny, 72F."
 
llm = ChatTensoras(model="llama-3.3-70b")
llm_with_tools = llm.bind_tools([get_weather])
 
response = llm_with_tools.invoke("What is the weather in San Francisco?")
 
for tool_call in response.tool_calls:
    print(f"Tool: {tool_call['name']}, Args: {tool_call['args']}")
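Note that the model only requests the call; executing it is up to your code. A plain-Python dispatch sketch (using an ordinary function rather than the @tool wrapper, with a dict shaped like an entry in response.tool_calls):

```python
def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny, 72F."

# Shape mirrors one entry of response.tool_calls
tool_call = {"name": "get_weather", "args": {"city": "San Francisco"}, "id": "call_0"}

# Dispatch by name, execute with the model-supplied arguments
registry = {"get_weather": get_weather}
result = registry[tool_call["name"]](**tool_call["args"])
print(result)  # The weather in San Francisco is sunny, 72F.
```

In a full loop you would append the result to the message history as a ToolMessage (carrying the matching tool_call_id) and invoke the model again for the final answer.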

Using with Existing Vector Stores

If you prefer to use LangChain’s built-in vector stores with Tensoras embeddings:

from langchain_tensoras import TensorasEmbeddings
from langchain_community.vectorstores import FAISS
 
embeddings = TensorasEmbeddings(model="gte-large-en-v1.5")
 
texts = [
    "Tensoras provides serverless AI inference.",
    "Knowledge Bases enable automatic RAG.",
    "Hybrid search combines vector and keyword search.",
]
 
vectorstore = FAISS.from_texts(texts, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
 
docs = retriever.invoke("What is hybrid search?")
for doc in docs:
    print(doc.page_content)

Next Steps