Article · Wikipedia archive · Last revised May 30, 2026

SXM (socket)


Computing node of the TSUBAME 3.0 supercomputer showing four Nvidia Tesla P100 SXM modules
Bare SXM sockets next to sockets with GPUs installed

SXM (Server PCI Express Module)[1] is a high-bandwidth socket for connecting Nvidia compute accelerators to a system. Every generation of Nvidia Tesla accelerators since the P100, along with the DGX computer series and the HGX board series, uses an SXM socket type that provides high bandwidth and power delivery for the GPU daughter cards.[2] Nvidia also sells these combinations as end-user products, for example in the DGX system series. Socket generations include SXM for Pascal-based GPUs, SXM2 and SXM3 for Volta-based GPUs, SXM4 for Ampere-based GPUs, and SXM5 for Hopper-based GPUs. These sockets are used for specific models of these accelerators and offer higher performance per card than their PCIe equivalents.[2] The DGX-1 was the first system equipped with SXM2 sockets and therefore the first to carry form-factor-compatible SXM modules with P100 GPUs; it was later shown to be upgradable to (or available pre-equipped with) SXM2 modules with V100 GPUs.[3][4]

Technical details

SXM boards are typically built with four or eight GPU slots, although some solutions, such as the Nvidia DGX-2, connect multiple boards to deliver higher performance. While third-party SXM boards exist, most system integrators, such as Supermicro, use prebuilt Nvidia HGX boards, which come in four- or eight-socket configurations.[5] This approach greatly lowers the cost and difficulty of building SXM-based GPU servers and ensures compatibility and reliability across all boards of the same generation.

SXM modules, for example on HGX boards and particularly in recent generations, may be paired with NVLink switches to allow faster GPU-to-GPU communication. This further reduces bottlenecks that would otherwise be imposed by CPU and PCIe limitations.[2][6] The GPUs on the daughter cards use NVLink as their main communication protocol. For example, a Hopper-based H100 SXM5 GPU can use up to 900 GB/s of bandwidth across 18 NVLink 4 links, each contributing 50 GB/s;[7] in contrast, PCIe 5.0 can handle up to 64 GB/s in an x16 slot.[8] This high bandwidth also means that GPUs can share memory over the NVLink bus, allowing an entire HGX board to present itself to the host system as a single, massive GPU.[9]
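The bandwidth comparison above comes down to simple arithmetic. The following Python sketch is illustrative only: it uses the peak figures quoted in this section (18 links at 50 GB/s each, and 64 GB/s for a PCIe 5.0 x16 slot), the function names are hypothetical, and it ignores protocol overhead, so real transfers will be slower than these idealized numbers.

```python
# Peak figures quoted in the text; real-world throughput is lower.
NVLINK4_LINKS = 18           # NVLink 4 links on an H100 SXM5 module
NVLINK4_GBPS_PER_LINK = 50   # GB/s contributed by each link
PCIE5_X16_GBPS = 64          # GB/s for a PCIe 5.0 x16 slot


def aggregate_bandwidth(links: int, gbps_per_link: float) -> float:
    """Total peak bandwidth in GB/s across all links (hypothetical helper)."""
    return links * gbps_per_link


def ideal_transfer_seconds(size_gb: float, bandwidth_gbps: float) -> float:
    """Idealized time to move size_gb gigabytes at a given peak rate."""
    return size_gb / bandwidth_gbps


nvlink_total = aggregate_bandwidth(NVLINK4_LINKS, NVLINK4_GBPS_PER_LINK)
print(nvlink_total)                   # 900
print(nvlink_total / PCIE5_X16_GBPS)  # 14.0625

# Idealized time to copy an 80 GB buffer over each interconnect.
print(ideal_transfer_seconds(80, PCIE5_X16_GBPS))  # 1.25
print(round(ideal_transfer_seconds(80, nvlink_total), 3))
```

On these peak numbers, the aggregate NVLink bandwidth of one H100 module is roughly fourteen times that of a single PCIe 5.0 x16 slot.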

Power delivery is also handled by the SXM socket, eliminating the need for external power cables such as those required by equivalent PCIe cards. This, combined with the horizontal mounting, allows more efficient cooling, which in turn lets SXM-based GPUs operate at a much higher thermal design power (TDP). The Hopper-based H100, for example, can draw up to 700 W solely from the SXM socket.[10] The lack of cabling also makes large systems easier to assemble and repair and reduces the number of possible points of failure.[2]
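To give a sense of scale, the per-socket figure can be multiplied out for the four- and eight-socket board configurations mentioned earlier. This is a rough sketch under stated assumptions: it counts only the GPU modules at the 700 W figure quoted for the H100, the helper name is hypothetical, and it leaves out CPUs, switches, fans, and power-conversion losses.

```python
H100_SXM_TDP_W = 700  # per-module TDP quoted in the text


def board_gpu_power_watts(sockets: int, tdp_w: int) -> int:
    """Upper bound on GPU power for one board, GPUs only (hypothetical helper)."""
    return sockets * tdp_w


for sockets in (4, 8):
    watts = board_gpu_power_watts(sockets, H100_SXM_TDP_W)
    print(f"{sockets} sockets: {watts} W")
# 4 sockets: 2800 W
# 8 sockets: 5600 W
```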

Comparison of accelerators used in DGX:[11][12][13]

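In the tables below, ″ means the value is the same as in the row above, and N/a means not applicable or not available.

Cores and clock

| Model | Architecture | Socket | FP32 CUDA cores | FP64 cores (excl. tensor) | Mixed INT32/FP32 cores | INT32 cores | Boost clock (MHz) |
|---|---|---|---|---|---|---|---|
| P100 | Pascal | SXM/SXM2 | 3584 | 1792 | N/a | N/a | 1480 |
| V100 16GB | Volta | SXM2 | 5120 | 2560 | N/a | 5120 | 1530 |
| V100 32GB | ″ | SXM3 | ″ | ″ | ″ | ″ | ″ |
| A100 40GB | Ampere | SXM4 | 6912 | 3456 | 6912 | N/a | 1410 |
| A100 80GB | ″ | ″ | ″ | ″ | ″ | ″ | ″ |
| H100 | Hopper | SXM5 | 16896 | 4608 | 16896 | N/a | 1980 |
| H200 | ″ | ″ | ″ | ″ | ″ | ″ | ″ |
| B100 | Blackwell | SXM6 | N/a | N/a | N/a | N/a | N/a |
| B200 | ″ | ″ | ″ | ″ | ″ | ″ | ″ |

Memory

| Model | Memory type (HBM) | Memory speed (Gb/s) | Bus width (bits) | Bandwidth (TB/s) | VRAM type (HBM) | VRAM size (GB) |
|---|---|---|---|---|---|---|
| P100 | HBM2 | 1.4 | 4096 | 0.72 | HBM2 | 16 |
| V100 16GB | HBM2 | 1.75 | 4096 | 0.9 | HBM2 | 16 |
| V100 32GB | ″ | ″ | ″ | ″ | ″ | 32 |
| A100 40GB | HBM2 | 2.4 | 5120 | 1.52 | HBM2 | 40 |
| A100 80GB | HBM2e | 3.2 | ″ | ″ | HBM2e | 80 |
| H100 | HBM3 | 5.2 | 5120 | 3.35 | HBM3 | 80 |
| H200 | HBM3e | 6.3 | 6144 | 4.8 | HBM3e | 141 |
| B100 | HBM3e | 8 | 8192 | 8 | HBM3e | 192 |
| B200 | ″ | ″ | ″ | ″ | ″ | ″ |

Compute throughput

| Model | Single precision (FP32; TFLOPS) | Double precision (FP64; TFLOPS) | INT8 (non-tensor) | INT8 dense tensor | INT32 | FP4 dense tensor | FP16 (TFLOPS) |
|---|---|---|---|---|---|---|---|
| P100 | 10.6 | 5.3 | N/a | N/a | N/a | N/a | 21.2 |
| V100 16GB | 15.7 | 7.8 | 62 TOPS | N/a | 15.7 TOPS | N/a | 31.4 |
| V100 32GB | ″ | ″ | ″ | ″ | ″ | ″ | ″ |
| A100 40GB | 19.5 | 9.7 | N/a | 624 TOPS | 19.5 TOPS | N/a | 78 |
| A100 80GB | ″ | ″ | ″ | ″ | ″ | ″ | ″ |
| H100 | 67 | 34 | N/a | 1.98 POPS | N/a | N/a | N/a |
| H200 | ″ | ″ | ″ | ″ | ″ | ″ | ″ |
| B100 | N/a | N/a | N/a | 3.5 POPS | N/a | 7 PFLOPS | N/a |
| B200 | ″ | ″ | ″ | 4.5 POPS | ″ | 9 PFLOPS | ″ |

Dense tensor throughput

| Model | FP16 dense tensor | bfloat16 dense tensor | TensorFloat-32 (TF32) dense tensor | FP64 dense tensor |
|---|---|---|---|---|
| P100 | N/a | N/a | N/a | N/a |
| V100 16GB | 125 TFLOPS | N/a | N/a | N/a |
| V100 32GB | ″ | ″ | ″ | ″ |
| A100 40GB | 312 TFLOPS | 312 TFLOPS | 156 TFLOPS | 19.5 TFLOPS |
| A100 80GB | ″ | ″ | ″ | ″ |
| H100 | 990 TFLOPS | 990 TFLOPS | 495 TFLOPS | 67 TFLOPS |
| H200 | ″ | ″ | ″ | ″ |
| B100 | 1.98 PFLOPS | 1.98 PFLOPS | 989 TFLOPS | 30 TFLOPS |
| B200 | 2.25 PFLOPS | 2.25 PFLOPS | 1.2 PFLOPS | 40 TFLOPS |

Interconnect, chip, and physical characteristics

| Model | Interconnect (NVLink; TB/s) | GPU | SM count | L1 cache per SM (KB) | L1 cache total (KB) | L2 cache (KB) | TDP (W) | Die size (mm²) | Transistor count (billion) | Fabrication process | Launched |
|---|---|---|---|---|---|---|---|---|---|---|---|
| P100 | 0.16 | GP100 | 56 | 24 | 1344 | 4096 | 300 | 610 | 15.3 | TSMC 16FF+ | Q2 2016 |
| V100 16GB | 0.3 | GV100 | 80 | 128 | 10240 | 6144 | 300 | 815 | 21.1 | TSMC 12FFN | Q3 2017 |
| V100 32GB | ″ | ″ | ″ | ″ | ″ | ″ | 350 | ″ | ″ | ″ | ″ |
| A100 40GB | 0.6 | GA100 | 108 | 192 | 20736 | 40960 | 400 | 826 | 54.2 | TSMC N7 | Q1 2020 |
| A100 80GB | ″ | ″ | ″ | ″ | ″ | ″ | ″ | ″ | ″ | ″ | ″ |
| H100 | 0.9 | GH100 | 132 | 192 | 25344 | 51200 | 700 | 814 | 80 | TSMC 4N | Q3 2022 |
| H200 | ″ | ″ | ″ | ″ | ″ | ″ | 1000 | ″ | ″ | ″ | Q3 2023 |
| B100 | 1.8 | GB100 | N/a | N/a | N/a | N/a | 700 | N/a | 208 | TSMC 4NP | Q4 2024 |
| B200 | ″ | ″ | ″ | ″ | ″ | ″ | 1000 | ″ | ″ | ″ | ″ |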
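
Two of the table's columns can be derived from others, which gives a quick consistency check. The sketch below is an illustration rather than part of the cited sources: it assumes the total L1 cache is the SM count multiplied by the per-SM figure, and that peak memory bandwidth is approximately the per-pin speed multiplied by the bus width, divided by 8 bits per byte. The helper names are hypothetical, and the listed bandwidths are rounded vendor figures, so only approximate agreement is expected.

```python
# (SM count, L1 KB per SM, listed L1 total in KB), copied from the table.
L1_ROWS = {
    "P100": (56, 24, 1344),
    "V100": (80, 128, 10240),
    "A100": (108, 192, 20736),
    "H100": (132, 192, 25344),
}

# (memory speed in Gb/s per pin, bus width in bits, listed bandwidth in TB/s).
MEM_ROWS = {
    "P100": (1.4, 4096, 0.72),
    "V100": (1.75, 4096, 0.9),
    "A100 40GB": (2.4, 5120, 1.52),
    "H100": (5.2, 5120, 3.35),
    "H200": (6.3, 6144, 4.8),
}


def l1_total_kb(sm_count: int, kb_per_sm: int) -> int:
    """Total L1 cache, assuming every SM has the same amount."""
    return sm_count * kb_per_sm


def peak_bandwidth_tbps(speed_gbps: float, bus_bits: int) -> float:
    """Approximate peak bandwidth: Gb/s per pin * pins / 8 -> GB/s, then TB/s."""
    return speed_gbps * bus_bits / 8 / 1000


for name, (sms, per_sm, listed) in L1_ROWS.items():
    print(f"{name}: L1 total {l1_total_kb(sms, per_sm)} KB, listed {listed} KB")

for name, (speed, bus, listed) in MEM_ROWS.items():
    estimate = peak_bandwidth_tbps(speed, bus)
    print(f"{name}: estimated {estimate:.2f} TB/s, listed {listed} TB/s")
```

The L1 totals match the table exactly, while the bandwidth estimates land within a few percent of the listed values.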
See also

  • Tegra – System on a chip by Nvidia
References

  1. Brown, W. Michael; Nguyen, Trung D.; Fuentes-Cabrera, Miguel; et al. (2012). "An Evaluation of Molecular Dynamics Performance on the Hybrid Cray XK6 Supercomputer". Procedia Computer Science. 9: 186–195. doi:10.1016/j.procs.2012.04.020.
  2. Kharya, Paresh (February 2, 2018). "Achieving Maximum Compute Throughput: PCIe vs. SXM2" (Press release). Nvidia. Retrieved March 31, 2022 – via TheNextPlatform.com.
  3. "Volta architecture whitepaper" (PDF). Nvidia.
  4. "DGX 1 User Guide" (PDF). Nvidia.
  5. Kennedy, Patrick (May 14, 2020). "Nvidia A100 4x GPU HGX Redstone Platform". ServeTheHome.com. Axautik Group. Retrieved December 30, 2025.
  6. "Nvidia NVLink and NVSwitch". Nvidia. Retrieved December 30, 2025.
  7. "Nvidia's H100 – What It Is, What It Does, and Why It Matters". DataCenterKnowledge.com. March 23, 2022. Retrieved March 31, 2022.
  8. "Is PCIe 5.0 Worth It? The Benefits of PCIe 5.0 (2022)". TechReviewer.com. Retrieved March 31, 2022.
  9. "Nvidia HGX A100: Powered by A100 GPUs and NVSwitch". Nvidia. Retrieved March 31, 2022.
  10. "Nvidia H100 GPU full details: TSMC N4, HBM3, PCIe 5.0, 700W TDP, more". TweakTown.com. March 23, 2022. Retrieved March 31, 2022.
  11. Smith, Ryan (March 22, 2022). "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder". AnandTech. Archived from the original on September 23, 2023.
  12. Smith, Ryan (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech. Archived from the original on July 29, 2024.
  13. Garreffa, Anthony (September 17, 2017). "NVIDIA Tesla V100 Tested: Near Unbelievable GPU Power". TweakTown.com. Retrieved December 30, 2025.
External links