API Reference

Complete API documentation for TensorCortex - your unified interface to 10 AI providers.

Getting Started

Authentication (BYOK)

Use your own provider API keys directly. TensorCortex uses BYOK (Bring Your Own Keys) - no markup fees.

// Add to your request headers
Authorization: Bearer your-openai-api-key

Use your actual provider API key (e.g., sk-proj-... for OpenAI). TensorCortex passes it directly to the provider - zero markup.

Base URL

https://openai.tensor.cx/v1

Replace openai with any supported provider subdomain: anthropic, google, mistral, groq, etc. Note: include the versioned path in the base URL (usually /v1; some providers differ, e.g. Groq uses /openai/v1; see Supported Providers below). TensorCortex is a pure pass-through proxy.

Rate Limits

Plan        Rate Limits   Details
Free        BYOK          Bring your own keys; provider rate limits apply
Pro         Custom        Configurable limits; daily & monthly cost caps
Enterprise  Unlimited     No gateway limits; custom SLA & support

TensorCortex uses BYOK - rate limits are determined by your provider plan. Set custom daily and monthly cost limits in the Cockpit dashboard.

Response Headers

All API responses include tracking headers for debugging, caching, and cost monitoring:

Header               Description                         Example
X-Request-Id         Unique request identifier (UUID)    550e8400-e29b...
X-Provider           Provider used for this request      openai
X-Cache              Cache status: HIT, MISS, or SKIP    HIT
X-Cost               Request cost in cents               25
X-Tokens-Prompt      Prompt token count                  150
X-Tokens-Completion  Completion token count              75
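
A minimal sketch using plain fetch (Node 18 or newer) that reads these headers off a chat completion response; the endpoint, header names, and key placeholder come from this page:

// Read the TensorCortex tracking headers from a raw fetch response
const res = await fetch('https://openai.tensor.cx/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer your-openai-api-key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Hello!' }]
  })
})

console.log('request id:', res.headers.get('X-Request-Id'))
console.log('provider:', res.headers.get('X-Provider'))
console.log('cache:', res.headers.get('X-Cache'))
console.log('cost (cents):', res.headers.get('X-Cost'))

const completion = await res.json()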

API Endpoints

GET /health

Basic health check endpoint. Returns service status and version information.

Response

{
  "status": "healthy",
  "service": "tensorcortex-gateway",
  "version": "0.1.0",
  "environment": "production",
  "timestamp": "2025-01-02T12:00:00Z"
}
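
To verify connectivity, a quick sketch with fetch; this assumes the health endpoint is served at the provider subdomain root (outside /v1) and requires no authentication:

// Ping the gateway health endpoint (host and auth-free access are assumptions)
const res = await fetch('https://openai.tensor.cx/health')
const health = await res.json()
console.log(health.status, health.version)  // e.g. "healthy" "0.1.0"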

POST /v1/chat/completions

Generate chat completions using any supported model. OpenAI-compatible format for most providers (OpenAI, Groq, Mistral, Together, Fireworks, Perplexity, DeepInfra, OpenRouter). Anthropic and Google use their native formats.

Request Body

{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "max_tokens": 1000,
  "temperature": 0.7,
  "stream": false
}

The provider is determined by your base URL (e.g., openai.tensor.cx, anthropic.tensor.cx).

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30
  }
}

Response format matches the provider's native format. TensorCortex is a transparent pass-through proxy.

Example (cURL)

curl https://openai.tensor.cx/v1/chat/completions \
  -H "Authorization: Bearer your-openai-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Example (TypeScript)

import OpenAI from 'openai'

// Just change baseURL - use your existing OpenAI SDK!
const client = new OpenAI({
  apiKey: 'your-openai-api-key',
  baseURL: 'https://openai.tensor.cx/v1'
})

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }]
})

console.log(response.choices[0].message.content)

GET /v1/models

List all available models for the selected provider. The response format matches the provider's native format.

Response (OpenAI example)

{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o",
      "object": "model",
      "created": 1715367049,
      "owned_by": "system"
    },
    {
      "id": "gpt-4o-mini",
      "object": "model",
      "created": 1721172741,
      "owned_by": "system"
    }
  ]
}

Response format is provider-native. TensorCortex passes through the provider's actual response.
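
A short sketch with the openai npm package (the key is a placeholder; the same call shape works against any OpenAI-compatible provider subdomain):

// List the models available through the gateway
import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'your-openai-api-key',
  baseURL: 'https://openai.tensor.cx/v1'
})

const models = await client.models.list()
for (const model of models.data) {
  console.log(model.id)
}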

POST /v1/embeddings

Generate embeddings for text using various embedding models. Available on providers that support embeddings (OpenAI, Mistral, Together, etc.).

Request Body

{
  "model": "text-embedding-3-large",
  "input": "The quick brown fox jumps over the lazy dog"
}
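
A minimal sketch with the openai npm package, mirroring the request body above (the key is a placeholder):

// Generate an embedding through the gateway
import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'your-openai-api-key',
  baseURL: 'https://openai.tensor.cx/v1'
})

const result = await client.embeddings.create({
  model: 'text-embedding-3-large',
  input: 'The quick brown fox jumps over the lazy dog'
})

console.log(result.data[0].embedding.length)  // vector dimension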

Chat Completion Parameters

model (required)

Model name (e.g., gpt-4o, claude-3-5-sonnet-20241022). Provider is determined by your base URL.

messages (required)

Array of message objects with role and content.

stream (optional)

Enable streaming responses via Server-Sent Events. Default: false. See the streaming sketch below.
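
Because stream: true switches the response to Server-Sent Events, here is a minimal streaming sketch with the openai npm package (key and model are placeholders):

// Stream a chat completion and print tokens as they arrive
import OpenAI from 'openai'

const client = new OpenAI({
  apiKey: 'your-openai-api-key',
  baseURL: 'https://openai.tensor.cx/v1'
})

const stream = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true
})

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '')
}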

Error Codes & Handling

All errors follow OpenAI-compatible format with detailed error messages. Always check the X-Request-Id header for debugging.

400 Bad Request (invalid_request_error)

Cause: Malformed request body or invalid parameters (e.g., missing required fields, invalid JSON).

Fix: Validate your JSON syntax and ensure all required fields (model, messages) are present.

401 Unauthorized (authentication_error)

Cause: Invalid or missing API key. Key may be deleted, archived, or misspelled.

Fix: Verify your API key in the Cockpit dashboard. Check for extra whitespace or truncation.

429 Too Many Requests (rate_limit_error)

Cause: Rate limit exceeded or cost limit reached (daily/monthly).

Fix: Implement exponential backoff. Check X-RateLimit-Reset header. Increase limits in Cockpit settings.

# Python retry example (requires the tenacity package)
from tenacity import retry, wait_exponential

@retry(wait=wait_exponential(min=1, max=60))
def make_request():
    return client.chat.completions.create(...)

500 Internal Server Error (internal_error)

Cause: Gateway internal error. Rare occurrence, usually temporary.

Fix: Retry after a brief delay. Save the X-Request-Id and contact support if persistent.

502 Bad Gateway (provider_error)

Cause: Upstream provider API error (OpenAI, Anthropic, etc.). Invalid model name or provider issues.

Fix: Check provider status page. Verify model name is correct for the provider. Check your provider API key validity.

503 Service Unavailable (service_unavailable)

Cause: Gateway temporarily unavailable. Scheduled maintenance or high traffic.

Fix: Retry with exponential backoff. Check status page for announcements.
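
Both 429 and 503 call for retries with exponential backoff; a minimal TypeScript sketch with plain fetch (the retry count and delay caps are illustrative assumptions):

// Retry a request with exponential backoff on 429/503
async function requestWithBackoff(url: string, init: RequestInit, maxRetries = 5): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(url, init)
    if (res.status !== 429 && res.status !== 503) return res
    // Wait 1s, 2s, 4s, ... capped at 60s
    const delayMs = Math.min(1000 * 2 ** attempt, 60_000)
    await new Promise(resolve => setTimeout(resolve, delayMs))
  }
  throw new Error('Retries exhausted')
}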

Request ID Tracking

All responses include an X-Request-Id header. Always log this ID for debugging. Example: 550e8400-e29b-41d4-a716-446655440000

Supported Providers

TensorCortex supports 10 AI providers via subdomain routing. Each provider has its own subdomain for clean separation.

OpenAI

Base URL: https://openai.tensor.cx/v1

General-purpose AI, coding, reasoning

Popular models: gpt-4o, gpt-4o-mini, gpt-4-turbo

Features: Chat, vision, function calling

Anthropic (Claude)

Base URL: https://anthropic.tensor.cx/v1

Long-form content, analysis, safety-critical

Popular models: claude-3-5-sonnet, claude-3-5-haiku

Features: Chat, vision, tool use

Groq

Base URL: https://groq.tensor.cx/openai/v1

Ultra-fast inference, real-time applications

Popular models: llama-3.1-70b, llama-3.1-8b, mixtral

Features: OpenAI-compatible, streaming

Google AI (Gemini)

Base URL: https://google.tensor.cx/v1beta

Multimodal, long context (up to 2M tokens)

Popular models: gemini-1.5-pro, gemini-1.5-flash

Features: Chat, vision, multimodal

Mistral AI

Base URL: https://mistral.tensor.cx/v1

European alternative, coding, GDPR-compliant

Popular models: mistral-large, mistral-small, codestral

Features: OpenAI-compatible, streaming

Together AI

Base URL: https://together.tensor.cx/v1

Open-source models, cost-effective inference

Popular models: Llama, Qwen, Mixtral variants

Features: OpenAI-compatible, streaming

Fireworks AI

Base URL: https://fireworks.tensor.cx/inference/v1

Fast inference, function calling specialists

Popular models: Llama, Mixtral, function calling models

Features: OpenAI-compatible, streaming

Perplexity AI

Base URL: https://perplexity.tensor.cx/v1

Search-enhanced AI, real-time information

Popular models: sonar-small, sonar-medium

Features: OpenAI-compatible, streaming

DeepInfra

Base URL: https://deepinfra.tensor.cx/v1/openai

Cost-effective open-source model hosting

Popular models: Llama, Mixtral, Qwen variants

Features: OpenAI-compatible, streaming

OpenRouter

Base URL: https://openrouter.tensor.cx/api/v1

Meta-router for multiple providers

Popular models: Access 100+ models from one key

Features: OpenAI-compatible, streaming

Provider Switching

Switch providers by changing the subdomain. Same API format for OpenAI-compatible providers (OpenAI, Groq, Mistral). Anthropic and Google use their native formats.
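
For example, with the openai npm package, pointing the same code at Groq is just a base URL change (keys are placeholders; note Groq's /openai/v1 path from the table above, and confirm exact model IDs via /v1/models):

// Same SDK and call shape; only the baseURL and key change per provider
import OpenAI from 'openai'

const groq = new OpenAI({
  apiKey: 'your-groq-api-key',
  baseURL: 'https://groq.tensor.cx/openai/v1'
})

const response = await groq.chat.completions.create({
  model: 'llama-3.1-8b',  // Groq model ID as listed above
  messages: [{ role: 'user', content: 'Hello!' }]
})

console.log(response.choices[0].message.content)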

Use Your Existing SDK

No new SDK needed! TensorCortex is fully compatible with your existing provider SDKs. Just change the base URL.

Python

Works with openai and anthropic packages.

client = OpenAI(base_url="https://openai.tensor.cx/v1")

TypeScript/JavaScript

Works with openai npm package.

new OpenAI({ baseURL: "https://openai.tensor.cx/v1" })

Any Language

Just change the API endpoint URL.

OPENAI_BASE_URL=https://openai.tensor.cx/v1

Zero-Config Setup

The easiest way to use TensorCortex - just set environment variables. No code changes required!

Python

1. Set environment variables:

export OPENAI_BASE_URL="https://openai.tensor.cx/v1"
export OPENAI_API_KEY="your-openai-api-key"

2. Your existing code works immediately:

from openai import OpenAI

client = OpenAI()  # Reads from environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

Node.js/TypeScript

1. Set environment variables:

export OPENAI_BASE_URL="https://openai.tensor.cx/v1"
export OPENAI_API_KEY="your-openai-api-key"

2. Your existing code works immediately:

import OpenAI from 'openai'

const client = new OpenAI()  // Reads from environment

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }]
})

Key Points

  • Include /v1 in your base URL (e.g., https://openai.tensor.cx/v1)
  • ✅ Works with existing code - SDKs read environment variables automatically
  • ✅ True pass-through - everything after hostname stays the same
  • ✅ Zero markup - you only pay your provider directly with your own key

Need Help?

Our developer support team is here to help you integrate TensorCortex.