What Happened
NVIDIA has unveiled the B300 GPU, the successor to the B200, delivering 2x the AI training throughput and 1.5x the inference performance of its predecessor. The B300 is built on a refined Blackwell architecture with enhanced Tensor Cores and expanded HBM4 memory.
Specifications
| Feature | B300 | B200 | H100 |
|---|---|---|---|
| FP8 Performance | 40 PFLOPS | 20 PFLOPS | 4 PFLOPS |
| HBM Memory | 288 GB HBM4 | 192 GB HBM3e | 80 GB HBM3 |
| Memory Bandwidth | 12 TB/s | 8 TB/s | 3.35 TB/s |
| TDP | 1200W | 1000W | 700W |
| Interconnect | NVLink 6 | NVLink 5 | NVLink 4 |
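To make the generational gap concrete, here is a minimal back-of-envelope sketch in plain Python. It uses only the figures from the table above; the dictionary and variable names are ours for illustration, not anything from an NVIDIA API. It computes each chip's ratios versus the B200 and a rough FLOPs-per-byte balance (how much compute is available per byte of HBM bandwidth):

```python
# Back-of-envelope ratios computed from the spec table above.
# Names and structure are illustrative only, not an NVIDIA API.
specs = {
    "B300": {"fp8_pflops": 40, "hbm_gb": 288, "bw_tbps": 12.00, "tdp_w": 1200},
    "B200": {"fp8_pflops": 20, "hbm_gb": 192, "bw_tbps": 8.00,  "tdp_w": 1000},
    "H100": {"fp8_pflops": 4,  "hbm_gb": 80,  "bw_tbps": 3.35,  "tdp_w": 700},
}

base = specs["B200"]
for name, s in specs.items():
    # FLOPs available per byte of HBM bandwidth: a crude proxy for how
    # arithmetic-heavy a workload must be before the chip is memory-bound.
    flops_per_byte = (s["fp8_pflops"] * 1e15) / (s["bw_tbps"] * 1e12)
    print(
        f"{name}: {s['fp8_pflops'] / base['fp8_pflops']:.1f}x FP8, "
        f"{s['hbm_gb'] / base['hbm_gb']:.1f}x HBM, "
        f"{s['bw_tbps'] / base['bw_tbps']:.1f}x bandwidth vs B200; "
        f"~{flops_per_byte:.0f} FLOPs/byte"
    )
```

Note that compute grows faster than bandwidth across generations (the FLOPs-per-byte figure rises), which is part of why memory capacity and interconnect get so much attention below.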
Impact on AI Development
The B300's doubled performance means:
- Faster training: A 2x throughput gain roughly halves wall-clock time for compute-bound training runs
- Larger models: 288 GB of HBM4 fits bigger models on a single GPU, reducing reliance on model parallelism (see the sketch after this list)
- Better efficiency: Performance-per-watt improves by 30% over B200
- Cost reduction: Fewer GPUs needed for the same workload
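As a rough illustration of the memory headroom, the sketch below estimates how many parameters fit in 288 GB under two common rules of thumb: about 1 byte per parameter for FP8 inference weights, and about 16 bytes per parameter for mixed-precision Adam training (FP16 weights and gradients plus FP32 master weights and two optimizer moments). Both multipliers are standard approximations, not NVIDIA figures, and the estimate ignores activations, KV cache, and framework overhead:

```python
# Hypothetical capacity estimate; the bytes-per-parameter multipliers are
# common rules of thumb, not NVIDIA-published figures.
HBM_BYTES = 288e9  # B300 HBM capacity from the spec table

BYTES_PER_PARAM_INFERENCE = 1   # FP8 weights only
BYTES_PER_PARAM_TRAINING = 16   # FP16 weights+grads, FP32 master + Adam moments

print(f"FP8 inference: ~{HBM_BYTES / BYTES_PER_PARAM_INFERENCE / 1e9:.0f}B params per GPU")
print(f"Adam training: ~{HBM_BYTES / BYTES_PER_PARAM_TRAINING / 1e9:.0f}B params per GPU")
# Real capacity is lower once activations and runtime overhead are counted;
# the point is the relative headroom versus the B200's 192 GB.
```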
The NVLink 6 Advantage
NVLink 6 provides 3.6 TB/s of GPU-to-GPU bandwidth, enabling:
- Seamless scaling across thousands of GPUs
- Reduced communication overhead in distributed training (a rough estimate follows this list)
- Support for the new NVLink Switch System connecting up to 576 GPUs in a single domain
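To see what 3.6 TB/s buys, here is an idealized estimate of a ring all-reduce over one full gradient copy for a hypothetical 70B-parameter model with BF16 gradients. The cost model (each GPU transfers about 2(N-1)/N times the message size) is the standard ring all-reduce formula; it ignores latency, protocol overhead, and compute/communication overlap, so real systems will be slower:

```python
# Idealized ring all-reduce time; assumes the full per-GPU link bandwidth is
# usable. Model size and GPU count are hypothetical examples.
PARAMS = 70e9           # example model size (parameters)
BYTES_PER_GRAD = 2      # BF16 gradients
NVLINK6_BW = 3.6e12     # bytes/s per GPU, from the article
N_GPUS = 72             # one GB300 NVL72 domain

message = PARAMS * BYTES_PER_GRAD              # 140 GB of gradients
traffic = 2 * (N_GPUS - 1) / N_GPUS * message  # ring all-reduce volume per GPU
print(f"~{traffic / NVLINK6_BW * 1e3:.0f} ms per full gradient sync")
```

Under these assumptions a full gradient synchronization comes in under 100 ms, which is why per-GPU interconnect bandwidth, not just raw FLOPS, governs how well training scales across a large NVLink domain.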
Pricing and Availability
- B300 GPU: Available Q2 2026
- DGX B300 System (8x B300): $400,000+
- GB300 NVL72 (72 GPUs): Designed for hyperscale data centers
What's Next
NVIDIA's roadmap includes:
- Rubin architecture (2027): Expected 4x improvement over Blackwell
- Vera CPU: ARM-based CPU designed specifically for AI workloads
- Enhanced software ecosystem with updates to CUDA, TensorRT, and Triton
Summary
The B300 GPU continues NVIDIA's dominance in AI hardware with meaningful performance gains. For AI labs and enterprises, the upgrade path is clear: more performance, more memory, and better efficiency for the ever-growing demands of frontier model training.