March 16, 2026

Building the future foundation for intelligence

Agentic AI demands more than raw compute. At NVIDIA GTC 2026, Crusoe is announcing an expanded collaboration with NVIDIA spanning the full AI infrastructure stack, plus new Crusoe Cloud capabilities.

Erwan Menard
SVP, Product Management

The transition from passive chatbots to autonomous agentic AI isn't just changing what we build — it's changing what the underlying infrastructure has to deliver. The agents defining the next era of AI don't just need raw compute power. They need compute that's fast, close, sovereign, and self-aware enough to get out of the way of the engineers building on top of it.

At Crusoe, we call this the "architecture of immediacy" — and we're building it from the ground up. That means collapsing the distance between data and compute, giving developers deep visibility and control over the full model lifecycle, and partnering with the best in the industry to future-proof every layer of the AI infrastructure stack. Today, we're excited to share a collection of capabilities and partnerships that move that vision forward.

The TL;DR:

Announced today:

  • NVIDIA integrations: A significant expansion of our collaboration with NVIDIA, with integrations across the full AI factory stack – from physical data center design through models and inference performance.
  • New Crusoe Cloud capabilities: Automate orchestration, observability, and support for maximum reliability with Crusoe Command Center (GA), stream data to your own stack with Telemetry Relay (GA), and accelerate model development using our new Serverless Fine-Tuning service (Private Preview).

Announced last week:

  • Crusoe Spark™ Factory: An expansion of our modular AI factory strategy to accelerate production of Crusoe Spark™ units – smaller, modular data center solutions that rapidly deliver high-performance compute to the edge, where data is born.
  • Crusoe Edge Zones: An expansion of Crusoe Cloud at the edge, deployed on top of our Crusoe Spark units, that lets you serve low-latency inference and run sovereign AI workloads globally.

Co-architecting for immediacy with NVIDIA

The future foundation of intelligence requires a breadth of cutting-edge technology. A critical part of that ecosystem includes our collaboration with NVIDIA, which is now expanding to every layer of the AI infrastructure stack, from physical data center design to frontier model inference. These announcements reflect our core thesis: delivering immediacy for AI developers requires more than great hardware. It demands a fully integrated, NVIDIA-native foundation built to move at the speed of intelligence. 

We are proud to be an early adopter of NVIDIA Vera, deploying Vera CPU platforms alongside NVIDIA HGX Rubin NVL8 systems and NVIDIA Vera Rubin NVL72 systems in late 2026 and throughout 2027. With a single Vera CPU Rack supporting over 22,500 concurrent environments, our infrastructure will keep accelerators fully utilized at AI factory scale — ensuring that as models grow more complex, the foundation beneath them grows stronger.

To underpin our physical infrastructure foundation, we are building to NVIDIA's most advanced standards. We are adopting the NVIDIA Omniverse DSX Blueprint to guide how we design and operate our gigawatt-scale AI factories — incorporating digital twins, AI-driven power and cooling optimization, and future-proofed mechanical and electrical integrations.

At the model and inference layer, we are a “Day 1” adopter of NVIDIA Nemotron 3 Super and NVIDIA Nemotron 3 VoiceChat (in early access), serving both models through our Crusoe Managed Inference service powered by our inference engine with MemoryAlloy™ technology. Nemotron 3 Super brings over 50% faster token generation and support for up to one million token context windows — purpose-built for the multi-agent, long-horizon workloads defining the agentic AI era. Nemotron 3 VoiceChat is a full duplex, speech-to-speech model built to support simultaneous speaking and listening. While many current voice agents rely on "cascaded" architectures that stitch multiple components together, this model represents a shift toward a more integrated approach that improves conversational dynamics and fluidity. 

To push performance even further, we are integrating our proprietary tokenizer with the NVIDIA Dynamo open-source inference library and contributing it back to the community. This high-performance Rust BPE tokenizer delivers an approximately 9× average speedup over Hugging Face tokenizers, and up to 31× on long prompts. In agentic workloads, this translates to up to 40% faster time-to-first-token.
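To see how a tokenizer speedup on that order can translate into a roughly 40% time-to-first-token (TTFT) improvement, consider a simple model in which TTFT is tokenization time plus everything else (prefill, scheduling, and so on). The 45% tokenization share below is an illustrative assumption for a very long prompt, not a measured Crusoe figure:

```python
def ttft_reduction(tok_fraction: float, speedup: float) -> float:
    """Fraction of baseline TTFT eliminated when only the tokenization
    stage gets faster by `speedup`x, all other stages unchanged."""
    return tok_fraction * (1 - 1 / speedup)

# Illustrative assumption: tokenization accounts for ~45% of TTFT on a
# very long agentic prompt. A 9x tokenizer speedup then removes ~40%.
print(f"{ttft_reduction(0.45, 9.0):.0%}")  # prints "40%"
```

The takeaway is that tokenizer gains matter most exactly where agentic workloads live: the longer the prompt, the larger tokenization's share of TTFT, and the more of that share a faster tokenizer can reclaim.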

Enabling intelligent, autonomous AI cloud

The future foundation of intelligence isn't just physical — it's operational. Building AI that delivers immediacy requires infrastructure that manages itself, so your engineers can focus on what actually creates value: building and innovating, not babysitting servers.

That's the idea behind Crusoe Command Center, our unified operations platform for AI workloads, now generally available to all Crusoe Cloud customers. Command Center gives you the high-fidelity visibility needed to manage a fleet of distributed agents with the same precision as if the hardware were in your own server room — turning blind spots into actionable insights and reducing the operational friction that slows teams down. Out-of-the-box telemetry delivers real-time visibility into individual GPU health, storage, and network metrics so you can quickly identify bottlenecks and maximize resource efficiency across your fleet. And with Telemetry Relay, now generally available, pre-defined metrics stream directly to your preferred tools — so your infrastructure data lives alongside your broader operational environment without requiring you to change your workflow.

But managing infrastructure is only half the equation. The other half is making your models smarter, faster, and more specific to your use case. Serverless Fine-Tuning, now in private preview, enables you to customize state-of-the-art open-source models without the overhead of managing GPU infrastructure. The developer-friendly, OpenAI-compatible fine-tuning APIs and UI accelerate the end-to-end model lifecycle as you create specialized models at optimal price-performance. Your IP, including custom datasets and fine-tuned models, is managed in the clean-room environment of Crusoe's Object Storage service.
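As a rough sketch of what an OpenAI-compatible fine-tuning workflow typically looks like, the snippet below writes training data in the standard chat JSONL format and then outlines the job submission shape. The base URL, model name, and file contents are hypothetical placeholders, not confirmed Crusoe identifiers:

```python
import json

# Training examples in the chat JSONL format accepted by OpenAI-compatible
# fine-tuning APIs: one JSON object per line, each with a "messages" list.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a concise support agent."},
        {"role": "user", "content": "How do I rotate an API key?"},
        {"role": "assistant",
         "content": "Open Settings > API Keys, then click Rotate."},
    ]},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# With an OpenAI-compatible client, job submission would look roughly like:
#   client = OpenAI(base_url="https://<your-endpoint>/v1", api_key="...")
#   file = client.files.create(file=open("train.jsonl", "rb"),
#                              purpose="fine-tune")
#   job = client.fine_tuning.jobs.create(training_file=file.id,
#                                        model="<base-model-name>")
```

Because the API surface is OpenAI-compatible, existing tooling built around that format — dataset validators, upload scripts, job monitors — should carry over with only an endpoint change.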

Building where the intelligence lives

Just as the range of AI models has grown to include trillion-parameter powerhouses and smaller models purpose-built for precision, the traditional approach to building data centers is evolving. As organizations move from resource-intensive model training to revenue-generating solutions, inference becomes critical to whether you succeed or stall. Multi-gigawatt data centers remain critical to powering AI infrastructure, but as AI scales globally and moves out to the edge, distributing compute power to deliver immediacy everywhere will require local, right-sized infrastructure. 

Last week we announced Crusoe Spark Factory and Crusoe Edge Zones to power and scale the next generation of AI. Crusoe Spark Factory expands the manufacturing capacity for our Crusoe Spark modular AI factories, providing a distributed compute ecosystem that decouples performance from the physics of distance. Crusoe Edge Zones leverage these units to give you full control over where your compute resides, enabling rapid expansion to new geographic regions, low-latency inference near business demand, and dedicated deployments for enterprises and governments with strict data residency requirements.

Whether it's a gigawatt-scale campus for foundational model training or a fleet of Spark units to power local inference, we are committed to building the best AI infrastructure foundation for your unique needs.

Create the future with us

The future foundation of intelligence won't be built in a single data center, by a single vendor, or on a single bet about where AI is headed. It will be assembled layer by layer — at the edge and at scale, in sovereign deployments and hyperscale campuses, from silicon to inference — by teams willing to rethink every assumption about how infrastructure should work. That's what we're building at Crusoe, and our latest advancements are the next step. 

Whether you're training frontier models, deploying real-time agents, or fine-tuning for a specific use case, we're committed to giving you the speed, control, and proximity to build the future faster. We're just getting started — and we'd love to build it with you.

  • Meet us in person: Visit us at NVIDIA GTC 2026 this week to see the architecture of immediacy in action. Explore demos, dive deeper with sessions led by Crusoe experts, and join us for an exclusive happy hour.
  • Explore the platform: Visit crusoe.ai to learn more about the announcements in this post and to request access to the Serverless Fine-Tuning private preview.
