Documentation
Complete guide to using TensorCortex for unified AI model access.
Quick Start
1. Create an Account
Sign up for TensorCortex and get your API key instantly. No credit card required for the free tier with 10,000 requests per month.
2. Use Your Existing SDK
No new SDK needed! Just change your base URL. Works with the OpenAI, Anthropic, and other provider SDKs you already use.
# Just change the base URL - that's it!
export OPENAI_BASE_URL=https://openai.tensor.cx/v1
export OPENAI_API_KEY=your-openai-api-key
⚠️ Important: Include /v1 in your base URL. TensorCortex is a pure pass-through proxy.
3. Make Your First Request
Your existing code works instantly. Zero code changes required - just point to TensorCortex.
from openai import OpenAI

# Just change base_url - your code stays the same!
client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://openai.tensor.cx/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
⚠️ BYOK (Bring Your Own Keys): Use your actual provider API key. TensorCortex uses your OpenAI key to call OpenAI directly - zero markup.
4. Enjoy Automatic Benefits
That's it! You now get automatic caching (up to 60% cost savings), global edge routing, and real-time cost tracking - all with zero configuration.
Core Concepts
Automatic Caching
TensorCortex automatically caches deterministic requests (temperature=0) to save costs and reduce latency. Cache hits are served from global edge storage in under 50ms.
Caching Benefits:
- Up to 60% cost savings - Cached responses are free
- 50x faster responses - <50ms vs 500-3000ms typical latency
- Zero configuration - Works automatically for temperature=0
- Global edge cache - Distributed worldwide
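As a sketch of what a cache-eligible request looks like (reusing the Quick Start client setup), setting temperature=0 is all it takes:

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://openai.tensor.cx/v1"
)

# temperature=0 makes the request deterministic, so TensorCortex
# can serve repeated identical requests from its edge cache.
response = client.chat.completions.create(
    model="gpt-4o",
    temperature=0,
    messages=[{"role": "user", "content": "Summarize the OpenAI API in one sentence."}]
)
print(response.choices[0].message.content)

Sending the identical request again should be served from the edge cache rather than the provider, at a fraction of the usual latency.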
Provider Routing via Subdomain
Route to different providers by changing the subdomain. Models use their native names.
Subdomain Routing:
- https://openai.tensor.cx/v1 → OpenAI (use gpt-4o, gpt-4o-mini, etc.)
- https://anthropic.tensor.cx/v1 → Anthropic (use claude-3-5-sonnet-20241022, etc.)
- https://groq.tensor.cx/openai/v1 → Groq (use llama-3.1-70b-versatile, etc.)
- https://google.tensor.cx/v1beta → Google AI (use gemini-1.5-pro, etc.)
- https://mistral.tensor.cx/v1 → Mistral (use mistral-large-latest, etc.)
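For example, routing to Anthropic is the same one-line change in the official anthropic Python SDK (a sketch; note that this SDK appends the /v1/... path to base_url itself, so the subdomain is passed without /v1):

from anthropic import Anthropic

# Point the official Anthropic SDK at the TensorCortex subdomain.
# The SDK adds the /v1/messages path on its own.
client = Anthropic(
    api_key="your-anthropic-api-key",
    base_url="https://anthropic.tensor.cx"
)

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.content[0].text)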
Cortex Configuration (Dashboard)
Sign up for the dashboard at app.tensorcortex.com to configure advanced features per Cortex:
- System Prompts - Auto-inject context into all requests
- Guardrails - Content filtering, PII detection
- Rate Limits - Daily/monthly cost and usage caps
- Analytics - Real-time cost tracking and insights
- Multi-Provider Keys - Manage all your provider keys in one place
Best Practices
Use Fallbacks for Production
Always configure at least 2-3 fallback models for production applications. This ensures your app stays online even during provider outages.
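If you prefer to handle fallbacks in application code as well, a minimal client-side sketch looks like this (the model order and error handling are illustrative, not a TensorCortex feature):

from openai import OpenAI, APIError, APITimeoutError

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://openai.tensor.cx/v1"
)

# Illustrative fallback order - adjust to your own model preferences.
FALLBACK_MODELS = ["gpt-4o", "gpt-4o-mini"]

def chat_with_fallback(messages):
    last_error = None
    for model in FALLBACK_MODELS:
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except (APIError, APITimeoutError) as exc:
            last_error = exc  # try the next model in the list
    raise last_error

response = chat_with_fallback([{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)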
Monitor Your Usage
Use the dashboard analytics to track costs across providers. Identify opportunities to optimize by switching to more cost-effective models for certain use cases.
Cache Responses When Possible
For deterministic queries (temperature=0), responses are cached automatically, cutting costs and latency for repeated requests - so set temperature=0 wherever determinism is acceptable.
Use Streaming for Long Responses
Enable streaming for chat applications to provide a better user experience with immediate feedback as tokens are generated.
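With the OpenAI SDK, streaming is a one-parameter change (shown here against the TensorCortex base URL from the Quick Start):

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://openai.tensor.cx/v1"
)

# stream=True yields chunks as tokens are generated instead of
# waiting for the full completion.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem about proxies."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)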
Security
API Key Security
Your API keys provide full access to TensorCortex. Keep them secure and never expose them in client-side code.
Security Best Practices:
- Store API keys in environment variables
- Never commit keys to version control
- Rotate keys regularly
- Use separate keys for dev/staging/production
- Monitor key usage for anomalies
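For the first point, reading the key from the environment keeps it out of your source (a minimal sketch; the variable names match the Quick Start exports):

import os

from openai import OpenAI

# The key never appears in source code - it is read from the
# environment set in your shell or deployment config.
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL", "https://openai.tensor.cx/v1")
)

In fact, the OpenAI SDK reads OPENAI_API_KEY and OPENAI_BASE_URL from the environment by default, so OpenAI() with no arguments works once the Quick Start exports are set.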
Data Privacy
TensorCortex processes requests at the edge and does not store your prompts or responses. Your provider API keys are encrypted with AES-256-GCM. Enterprise customers can request custom data residency options.
Need More Help?
Can't find what you're looking for? Our support team is here to help.