16GB NVIDIA Tesla V100 Gets Reprieve; Remains in Production

Publish date: 2024-07-09

Back in March at their annual GPU Technology Conference, NVIDIA announced the long-anticipated 32GB version of their flagship Tesla V100 accelerator. By using newer 8-Hi HBM2 memory stacks, NVIDIA was able to double the accelerator’s previous 16GB of VRAM to a class-leading 32GB. At the time, company representatives told us that the launch of the 32GB model would be a wholesale replacement of the 16GB model, with the smaller version to be phased out and all future cards to ship as the 32GB model.

However, this week NVIDIA has reached out to inform us that this will not be the case, and that the 16GB model is being continued after all.

In a somewhat odd exchange, the official line from the company is that the previous statement – made in the heat of a pre-briefing Q&A session – was in error, and that the 16GB model was never being discontinued. Instead, NVIDIA’s plan has always been to sell the two models side-by-side. Unfortunately, the company hasn’t been able to make it clear why that information wasn’t presented at the show; what I do know is that the error wasn’t caught until customers recently started asking questions.

NVIDIA Tesla/Titan Family Specification Comparison

| | Tesla V100 (SXM2) | Tesla V100 (PCIe) | Titan V (PCIe) | Tesla P100 (SXM2) |
|---|---|---|---|---|
| CUDA Cores | 5120 | 5120 | 5120 | 3584 |
| Tensor Cores | 640 | 640 | 640 | N/A |
| Core Clock | ? | ? | 1200MHz | 1328MHz |
| Boost Clock | 1455MHz | 1370MHz | 1455MHz | 1480MHz |
| Memory Clock | 1.75Gbps HBM2 | 1.75Gbps HBM2 | 1.7Gbps HBM2 | 1.4Gbps HBM2 |
| Memory Bus Width | 4096-bit | 4096-bit | 3072-bit | 4096-bit |
| Memory Bandwidth | 900GB/sec | 900GB/sec | 653GB/sec | 720GB/sec |
| VRAM | 16GB / 32GB | 16GB / 32GB | 12GB | 16GB |
| L2 Cache | 6MB | 6MB | 4.5MB | 4MB |
| Half Precision | 30 TFLOPS | 28 TFLOPS | 27.6 TFLOPS | 21.2 TFLOPS |
| Single Precision | 15 TFLOPS | 14 TFLOPS | 13.8 TFLOPS | 10.6 TFLOPS |
| Double Precision | 7.5 TFLOPS | 7 TFLOPS | 6.9 TFLOPS | 5.3 TFLOPS |
| Tensor Performance (Deep Learning) | 120 TFLOPS | 112 TFLOPS | 110 TFLOPS | N/A |
| GPU | GV100 | GV100 | GV100 | GP100 |
| Transistor Count | 21B | 21B | 21.1B | 15.3B |
| TDP | 300W | 250W | 250W | 300W |
| Form Factor | Mezzanine (SXM2) | PCIe | PCIe | Mezzanine (SXM2) |
| Cooling | Passive | Passive | Active | Passive |
| Manufacturing Process | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 16nm FinFET |
| Architecture | Volta | Volta | Volta | Pascal |
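
As a quick sanity check on the memory figures in the comparison above, peak memory bandwidth can be estimated as bus width times per-pin data rate, divided by 8 bits per byte. The sketch below is only an illustration: the function name `peak_bandwidth_gbs` is made up for this example, and the inputs are simply the bus widths and memory clocks listed in the table.

```python
# Estimate peak memory bandwidth from bus width and per-pin data rate.
# peak_bandwidth_gbs is a hypothetical helper name, not an NVIDIA API.
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    # bits per transfer * Gbps per pin / 8 bits per byte = GB/sec
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, memory clock in Gbps) as listed in the table.
cards = {
    "Tesla V100": (4096, 1.75),
    "Titan V": (3072, 1.7),
    "Tesla P100": (4096, 1.4),
}

for name, (width, rate) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/sec")
```

This works out to roughly 896, 653, and 717 GB/sec, which suggests the table's 900GB/sec and 720GB/sec entries are rounded figures.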

But whatever the internal rationale and timetable on NVIDIA’s part, the end result is that, at least for the foreseeable future, NVIDIA is going to be offering multiple V100 capacities across its lineup, in both the SXM2 and PCIe form factors. NVIDIA’s customers, then, now have a choice to make on capacity. The larger version is clocked identically to its 16GB counterpart, so it doesn’t have an immediate performance advantage outside of memory capacity. However, in cases where a dataset that doesn’t fit in the 16GB model fits in the 32GB model, the performance differences can be very significant due to the large impact of memory thrashing; NVIDIA is advertising a 50% performance boost in some memory-limited HPC applications thanks to the larger RAM pool.

Finally, the company also confirmed that these cards will be priced differently. However, NVIDIA isn’t sharing the list prices for the parts, so it’s not clear whether the new pricing structure gives the 16GB model a price cut, or if the 32GB model is being offered at a price premium.

Source: NVIDIA
