Documentation
Complete guide to using TensorCortex for unified AI model access.
Quick Start
1. Create an Account
Sign up for TensorCortex and get your API key instantly. No credit card required for the free tier with 10,000 requests per month.
2. Use Your Existing SDK
No new SDK needed! Just change your base URL. Works with the OpenAI, Anthropic, and other provider SDKs you already use.
# Just change the base URL - that's it!
export OPENAI_BASE_URL=https://openai.tensor.cx/v1
export OPENAI_API_KEY=your-openai-api-key
⚠️ Important: Include /v1 in your base URL. TensorCortex is a pure pass-through proxy.
3. Make Your First Request
Your existing code works instantly. Zero code changes required - just point to TensorCortex.
from openai import OpenAI

# Just change base_url - your code stays the same!
client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://openai.tensor.cx/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
⚠️ BYOK (Bring Your Own Keys): Use your actual provider API key. TensorCortex uses your OpenAI key to call OpenAI directly - zero markup.
4. Enjoy Automatic Benefits
That's it! You now get automatic caching (up to 60% cost savings), global edge routing, and real-time cost tracking - all with zero configuration.
Core Concepts
Automatic Caching
TensorCortex automatically caches deterministic requests (temperature=0) to save costs and reduce latency. Cache hits are served from global edge storage in under 50ms.
Caching Benefits:
- Up to 60% cost savings - Cached responses are free
- 50x faster responses - <50ms vs 500-3000ms typical latency
- Zero configuration - Works automatically for temperature=0
- Global edge cache - Distributed worldwide
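As a sketch of what a cache-eligible request looks like (reusing the Quick Start client setup), setting temperature=0 is all it takes:

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://openai.tensor.cx/v1"
)

# temperature=0 makes the request deterministic, so TensorCortex
# can serve repeated identical requests from its edge cache.
response = client.chat.completions.create(
    model="gpt-4o",
    temperature=0,
    messages=[{"role": "user", "content": "Summarize the OpenAI API in one sentence."}]
)
print(response.choices[0].message.content)

Sending the identical request again should be served from the edge cache rather than the provider, at a fraction of the usual latency.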
Provider Routing via Subdomain
Route to different providers by changing the subdomain. Models use their native names.
Subdomain Routing:
- https://openai.tensor.cx/v1 → OpenAI (use gpt-4o, gpt-4o-mini, etc.)
- https://anthropic.tensor.cx/v1 → Anthropic (use claude-3-5-sonnet-20241022, etc.)
- https://groq.tensor.cx/openai/v1 → Groq (use llama-3.1-70b-versatile, etc.)
- https://google.tensor.cx/v1beta → Google AI (use gemini-1.5-pro, etc.)
- https://mistral.tensor.cx/v1 → Mistral (use mistral-large-latest, etc.)
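For example, routing to Anthropic is the same one-line change in the official anthropic Python SDK (a sketch; note that this SDK appends the /v1/... path to base_url itself, so the subdomain is passed without /v1):

from anthropic import Anthropic

# Point the official Anthropic SDK at the TensorCortex subdomain.
# The SDK adds the /v1/messages path on its own.
client = Anthropic(
    api_key="your-anthropic-api-key",
    base_url="https://anthropic.tensor.cx"
)

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.content[0].text)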
Cortex Configuration (Dashboard)
Sign up for the dashboard at app.tensorcortex.com to configure advanced features per Cortex:
- System Prompts - Auto-inject context into all requests
- Guardrails - Content filtering, PII detection
- Rate Limits - Daily/monthly cost and usage caps
- Analytics - Real-time cost tracking and insights
- Multi-Provider Keys - Manage all your provider keys in one place
Best Practices
Use Fallbacks for Production
Always configure at least 2-3 fallback models for production applications. This ensures your app stays online even during provider outages.
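If you prefer to handle fallbacks in application code as well, a minimal client-side sketch looks like this (the model order and error handling are illustrative, not a TensorCortex feature):

from openai import OpenAI, APIError, APITimeoutError

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://openai.tensor.cx/v1"
)

# Illustrative fallback order - adjust to your own model preferences.
FALLBACK_MODELS = ["gpt-4o", "gpt-4o-mini"]

def chat_with_fallback(messages):
    last_error = None
    for model in FALLBACK_MODELS:
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except (APIError, APITimeoutError) as exc:
            last_error = exc  # try the next model in the list
    raise last_error

response = chat_with_fallback([{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)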
Monitor Your Usage
Use the dashboard analytics to track costs across providers. Identify opportunities to optimize by switching to more cost-effective models for certain use cases.
Cache Responses When Possible
For deterministic queries (temperature=0), responses are cached automatically, cutting costs and latency for repeated requests - so set temperature=0 wherever determinism is acceptable.
Use Streaming for Long Responses
Enable streaming for chat applications to provide a better user experience with immediate feedback as tokens are generated.
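With the OpenAI SDK, streaming is a one-parameter change (shown here against the TensorCortex base URL from the Quick Start):

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://openai.tensor.cx/v1"
)

# stream=True yields chunks as tokens are generated instead of
# waiting for the full completion.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem about proxies."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)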
Security
API Key Security
Your API keys provide full access to TensorCortex. Keep them secure and never expose them in client-side code.
Security Best Practices:
- Store API keys in environment variables
- Never commit keys to version control
- Rotate keys regularly
- Use separate keys for dev/staging/production
- Monitor key usage for anomalies
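For the first point, reading the key from the environment keeps it out of your source (a minimal sketch; the variable names match the Quick Start exports):

import os

from openai import OpenAI

# The key never appears in source code - it is read from the
# environment set in your shell or deployment config.
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL", "https://openai.tensor.cx/v1")
)

In fact, the OpenAI SDK reads OPENAI_API_KEY and OPENAI_BASE_URL from the environment by default, so OpenAI() with no arguments works once the Quick Start exports are set.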
Data Privacy
TensorCortex processes requests at the edge and does not store your prompts or responses. Your provider API keys are encrypted with AES-256-GCM. Enterprise customers can request custom data residency options.
Need More Help?
Can't find what you're looking for? Our support team is here to help.