Infrastructure
A single-code-path stack for model definition, training, evals, decontamination, conversion, lineage, and reproducible run artifacts.
Tensor Cortex builds deterministic evaluation, sandboxed execution, training infrastructure, and private coding-agent release gates for teams that need evidence before they trust an AI system.
Private release gate
Private coding-agent evaluation on a customer's own repo: hidden tests, sandboxed execution, repeatable scoring, and regression gates.
Provider-friendly evidence packages with config, git SHA, eval hashes, goodput, MFU, checkpoint behavior, and failure notes.
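As a rough illustration of what such an evidence package could contain, here is a minimal Python sketch. The field names, the `EvidencePackage` class, and the helper functions are hypothetical, not the Tensor Cortex format. MFU is estimated with the common approximation of about 6 FLOPs per parameter per training token, and goodput is taken as the fraction of wall-clock time spent on productive training; both definitions are assumptions made for this example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


def sha256_hex(data: bytes) -> str:
    """Content hash used to pin an eval file to an exact version."""
    return hashlib.sha256(data).hexdigest()


def compute_mfu(params: float, tokens_per_second: float,
                peak_flops_per_second: float) -> float:
    """Model FLOPs utilization, assuming ~6 FLOPs per parameter per token."""
    return 6.0 * params * tokens_per_second / peak_flops_per_second


def compute_goodput(productive_seconds: float, wall_clock_seconds: float) -> float:
    """Share of wall-clock time that advanced training (assumed definition)."""
    return productive_seconds / wall_clock_seconds


@dataclass
class EvidencePackage:
    # All field names are illustrative, not a published schema.
    git_sha: str
    config: dict
    eval_hashes: dict
    goodput: float
    mfu: float
    checkpoint_notes: str = ""
    failure_notes: list = field(default_factory=list)

    def to_json(self) -> str:
        # Sorted keys keep the serialized report stable across runs.
        return json.dumps(asdict(self), sort_keys=True, indent=2)
```

A report built this way can be diffed between runs, because the same inputs always serialize to the same JSON text.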
The public surface is deliberately narrow: reproducible infrastructure, private EvalOps tooling, and evidence reports. The private research set, model details, and data-mixture decisions stay private.
Most teams choose coding agents using public benchmarks, demos, or vibes. Private EvalOps answers the sharper question: which agent setup actually works on your codebase?
Bug fixes, multi-file changes, refactors, and feature tasks derived from real repo work.
The same tasks across Codex, Claude Code, Cursor-style agents, open-weight models, or internal loops.
Hidden tests and verifiers decide pass/fail. The agent's self-report is never the score.
Re-run the suite whenever prompts, tools, models, or guardrails change.
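The scoring and gating idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual EvalOps implementation: `TaskResult`, `pass_rate`, `regression_gate`, and the `max_drop` tolerance are hypothetical names. The agent's self-report is recorded but never read by the scoring code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    task_id: str
    verifier_passed: bool        # outcome of hidden tests and verifiers
    agent_claimed_success: bool  # kept for analysis, never used for scoring


def pass_rate(results: list) -> float:
    """Fraction of tasks the verifier accepted."""
    if not results:
        raise ValueError("no results to score")
    return sum(r.verifier_passed for r in results) / len(results)


def regression_gate(baseline: list, candidate: list, max_drop: float = 0.0):
    """Return (ok, regressed_task_ids) for a candidate agent setup.

    ok is True when the candidate pass rate is within max_drop of the
    baseline. regressed_task_ids lists tasks that passed in the baseline
    run and fail in the candidate run.
    """
    base = {r.task_id: r.verifier_passed for r in baseline}
    regressed = sorted(
        r.task_id
        for r in candidate
        if base.get(r.task_id, False) and not r.verifier_passed
    )
    ok = pass_rate(candidate) >= pass_rate(baseline) - max_drop
    return ok, regressed
```

Running the gate after each change to prompts, tools, models, or guardrails gives a yes/no signal plus the list of tasks that regressed.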
Tensor Cortex is looking for practical compute partnerships: start with a GPU smoke test, move to a bounded bootstrap run, and publish the evidence package rather than unsupported claims.