The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration—at every scale—to power the world’s highest performing elastic data centers for AI, data analytics, and high-performance computing (HPC) applications. As the engine of the NVIDIA data center platform, A100 provides up to 20x higher performance over the prior NVIDIA Volta generation
PNY NVIDIA A100 40GB PCIe 4.0 Graphics Card | NVA100TCGPU-KIT
Unprecedented Acceleration for World’s Highest-Performing Elastic Data Centers
The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration—at every scale—to power the world’s highest performing elastic data centers for AI, data analytics, and high-performance computing (HPC) applications. As the engine of the NVIDIA data center platform, A100 provides up to 20x higher performance over the prior NVIDIA Volta generation. A100 can efficiently scale up or be partitioned into seven isolated GPU instances, with Multi-Instance GPU (MIG) providing a unified platform that enables elastic data centers to dynamically adjust to shifting workload demands.
NVIDIA Ampere-Based Architecture
A100 accelerates workloads big and small. Whether using MIG to partition an A100 GPU into smaller instances, or NVLink to connect multiple GPUs to accelerate large-scale workloads, the A100 easily handles different-sized application needs, from the smallest job to the biggest multi-node workload.
With 40 gigabytes (GB) of high-bandwidth memory (HBM2e), A100 delivers improved raw bandwidth of 1.6TB/sec, as well as higher dynamic random access memory (DRAM) utilization efficiency at 95 percent. A100 delivers 1.7x higher memory bandwidth over the previous generation.
Third-Generation Tensor Cores
First introduced in the NVIDIA Volta architecture, NVIDIA Tensor Core technology has brought dramatic speedups to AI training and inference operations, bringing down training times from weeks to hours and providing massive acceleration to inference. The NVIDIA Ampere architecture builds upon these innovations by providing up to 20x higher FLOPS for AI. It does so by improving the performance of existing precisions and bringing new precisions—TF32, INT8, and FP64—that accelerate and simplify AI adoption and extend the power of NVIDIA Tensor Cores to HPC.
AI networks are big, having millions to billions of parameters. Not all of these parameters are needed for accurate predictions, and some can be converted to zeros to make the models “sparse” without compromising accuracy. Tensor Cores in A100 can provide up to 2x higher performance for sparse models. While the sparsity feature more readily benefits AI inference, it can also improve the performance of model training.
Next Generation NVLink
NVIDIA NVLink in A100 delivers 2x higher throughput compared to the previous generation, at up to 600 GB/s to unleash the highest application performance possible on a single server. Two NVIDIA A100 PCIe boards can be bridged via NVLink, and multiple pairs of NVLink connected boards can reside in a single server (number varies based on server enclose, thermals, and power supply capacity).
MIG (Multi-Instance GPU) Support
Maximum Power Consumption
Typically replies within a few minutes.