Components NVIDIA DGX A100 – 640GB (8 x 80 GB GPUs) Specifications
Processors & performance (per node, minimum) Dual AMD Rome processors with a total of 128 CPU cores at a minimum of 2.25 GHz, with 8 x NVIDIA A100 GPU accelerators; minimum of 160 TFLOPS peak double-precision performance. GPU-to-CPU topology should be 4:1 (4 GPUs connected to 1 CPU).
Number of GPUs and GPU Communication 8 x NVIDIA A100 GPUs with 80 GB of memory each, interconnected via NVLink 3.0 or NVSwitch with a minimum of 600 GB/s bidirectional communication bandwidth
Performance 160 TFLOPS double-precision performance; 5 petaFLOPS AI performance; 10 petaOPS INT8
Multi-Instance GPU A single GPU can be partitioned into as many as 7 GPU instances
Internal switches 6 internal NVSwitches for GPU connectivity
System Memory Minimum 1 TB DDR4, 3200 MHz RAM; upgradable to 2 TB
GPU Memory Minimum 80 GB per GPU, minimum 640 GB per node, with 1.6 TB/s of memory bandwidth
CUDA Cores Minimum of 5,000 per GPU
Tensor Cores Minimum of 400 per GPU
Network Minimum 8 x single-port Mellanox ConnectX IB HDR ports (200 Gbps); minimum 2 x dual-port Mellanox ConnectX-6 (10/25/50/100/200 Gb/s Ethernet) for storage connectivity
Internal Storage OS – Minimum 2 x 1.92 TB NVMe RAID; Internal storage – Minimum 8 x 3.84 TB NVMe
Security Features The platform should support: Trusted Platform Module (TPM) for secure cryptographic key generation; self-encrypting drives for enhanced data-at-rest security; secure firmware updates for GPU, CPU and BMC
Power requirements 6.5 kW or less; hot-plug and redundant power
Rack space 6U or less
System Network (IPMI) 1 Gbps network
OS Support Red Hat Enterprise Linux / CentOS / Ubuntu Linux. The quoted OS should be under enterprise support from the OEM.
AI, HPC Software Containers and Required DL SDKs with Support NVIDIA NGC (NVIDIA GPU Cloud) containers, with NVIDIA NGC support for 5 years for each system and unlimited user access. The proposed system should be an NGC-certified system.
SDKs/libraries/containers that need to be on the system are:
CUDA Toolkit
CUDA-tuned Neural Network Primitives (cuDNN)
TensorRT Inference Engine
CUDA-tuned BLAS (cuBLAS)
CUDA-tuned Sparse Matrix Operations (cuSPARSE)
Multi-GPU Communications (NCCL)
Industry SDKs – NVIDIA DeepStream, Isaac, DRIVE, NeMo, Jarvis
Preinstalled AI frameworks Preinstalled, optimized AI frameworks such as Caffe, CNTK, TensorFlow, Theano and Torch, with Docker containers for deploying deep learning frameworks.
Preinstalled Deep Learning GPU Training System to train highly accurate deep neural networks (DNNs) for image classification, segmentation, and object detection tasks.
Scalability & Cluster software The system should be scalable to a multi-node cluster.
Software support and cluster tools are to be supplied along with the product. Full-stack reference designs with all of the leading storage providers should be available.
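Several rows above state both per-component and per-node minimums, so the totals can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the specification: the dictionary keys and the helper name `derived_totals` are hypothetical, and the figures are copied from the table above.

```python
# Minimum figures copied from the specification table. The key names are
# illustrative assumptions and are not defined by the specification.
SPEC = {
    "gpus_per_node": 8,
    "gpu_memory_gb": 80,
    "node_fp64_tflops": 160,
    "os_drives": (2, 1.92),     # (count, TB per drive)
    "data_drives": (8, 3.84),   # (count, TB per drive)
    "ib_ports": (8, 200),       # (count, Gbps per port)
}


def derived_totals(spec):
    """Return aggregate figures implied by the per-component minimums."""
    os_count, os_tb = spec["os_drives"]
    data_count, data_tb = spec["data_drives"]
    ib_count, ib_gbps = spec["ib_ports"]
    return {
        # Total GPU memory per node, in GB
        "node_gpu_memory_gb": spec["gpus_per_node"] * spec["gpu_memory_gb"],
        # Share of the node-level FP64 figure attributable to each GPU
        "fp64_tflops_per_gpu": spec["node_fp64_tflops"] / spec["gpus_per_node"],
        # Raw capacities, before any RAID overhead
        "os_raw_tb": round(os_count * os_tb, 2),
        "data_raw_tb": round(data_count * data_tb, 2),
        # Sum of the InfiniBand port speeds
        "ib_aggregate_gbps": ib_count * ib_gbps,
    }


if __name__ == "__main__":
    for name, value in derived_totals(SPEC).items():
        print(f"{name}: {value}")
```

Under these assumptions, 8 x 80 GB gives the 640 GB per-node GPU memory stated in the table, the 160 TFLOPS node figure corresponds to 20 TFLOPS per GPU, and the eight 3.84 TB drives provide 30.72 TB of raw capacity before any RAID overhead.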