Description

Overview

NVIDIA Rubin CPX is a specialized member of the Rubin family designed exclusively for the context-processing phase of large language model inference. By disaggregating the prefill (context) stage from the decode stage, Rubin CPX delivers dramatically higher throughput for million-token prompts, long-context agents, and video understanding.

Rubin CPX pairs with standard Rubin GPUs in the same rack: CPX nodes ingest and tokenize huge contexts; standard Rubin GPUs handle decode. The result is a more efficient, more profitable inference factory for modern reasoning workloads.

Key Features

Massive Context Optimization: Architectural focus on attention prefill at million-token scale.
Disaggregated Inference: Separates context and generation stages for higher utilization.
HBM4 Memory: High-capacity, high-bandwidth memory tier for long-sequence states.
NVLink 6 Compatible: Drops into Vera Rubin racks alongside standard Rubin GPUs.
NVFP4 Precision: Native support for NVIDIA’s next-gen 4-bit floating point.

Technical Specifications

Specification	Details
Architecture	NVIDIA Rubin (CPX variant)
Memory	HBM4 (capacity per partner configuration)
Precision Support	NVFP4, FP8, BF16
Interconnect	NVLink 6, PCIe Gen 6
Target Workload	Context prefill / massive-context inference
Form Factor	SXM / rack-integrated

Ideal Use Cases

Million-token coding agents and long-document reasoning
Video and multimodal understanding at production scale
Retrieval-augmented generation with very large context windows
Disaggregated inference factories optimizing TCO per token

Why Choose Rubin CPX?

If your inference workloads are dominated by long-context prefill, dedicating standard Rubin GPUs to decode while CPX handles context can deliver step-function improvements in tokens-per-dollar. We help you size CPX-to-Rubin ratios for your traffic mix and integrate with your serving stack.

Interested? Contact us for personalized pricing and configuration options.

NVIDIA Rubin CPX

Description

Overview

Key Features

Technical Specifications

Ideal Use Cases

Why Choose Rubin CPX?

Compare this GPU

Related Products

NVIDIA RTX PRO 6000 Blackwell Workstation Edition

NVIDIA L4

NVIDIA RTX 5000 Ada Generation