2024.09.26

What's HBM3 (High Bandwidth Memory 3)?

Definition

High Bandwidth Memory 3 (HBM3) is a memory standard (JESD238) for 3D stacked synchronous dynamic random-access memory (SDRAM) released by JEDEC in January 2022, offering significant improvements over the previous HBM2E standard (JESD235D). These enhancements include support for larger densities, higher speed operation, increased bank count, advanced reliability, availability, and serviceability (RAS) capabilities, a lower power interface, and a new clocking architecture. HBM3 is poised to make an impact in high-performance computing (HPC) applications such as AI, graphics, networking, and even potentially automotive. The HBM3 standard enables devices with up to 32 Gb of density and up to 16-high stack for a total of 64 GB storage, nearly a 3x growth compared to HBM2E. With top speeds of 6.4 Gbps, HBM3 almost doubles the top speed of HBM2E at 3.6 Gbps.


How Does HBM3 Work?

High Bandwidth Memory 3 (HBM3) is a cutting-edge memory technology that is tightly coupled to the host compute die via a distributed interface. This interface is split into multiple independent channels, which may operate asynchronously. HBM3 DRAM utilizes a wide-interface architecture to ensure high-speed performance and energy efficiency. Each channel interface features two pseudo-channels with a 32-bit data bus operating at double data rate (DDR).


Every channel provides access to a distinct set of DRAM banks. Data requests from one channel cannot access data linked to another channel. Channels operate with independent clocks and are not required to be synchronous.
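The channel independence described above can be pictured with a toy model. The 16-channel count and the two 32-bit pseudo-channels per channel come from the text; the `Channel` class and its read/write methods are purely illustrative, not any real controller API:

```python
# Toy model of HBM3 channel independence (illustrative only).
class Channel:
    """One HBM3 channel: its own banks, its own clock domain."""
    def __init__(self, channel_id, bus_bits=64):
        self.channel_id = channel_id
        # Each 64-bit channel is split into two 32-bit pseudo-channels.
        self.pseudo_channel_bits = [bus_bits // 2, bus_bits // 2]
        self.banks = {}  # bank address -> data; private to this channel

    def write(self, bank, data):
        self.banks[bank] = data

    def read(self, bank):
        # A request on this channel can only see this channel's own banks.
        return self.banks.get(bank)

# A full HBM3 stack interface: 16 independent 64-bit channels.
channels = [Channel(i) for i in range(16)]
channels[0].write(bank=3, data="A")
assert channels[0].read(bank=3) == "A"
assert channels[1].read(bank=3) is None  # channel 1 cannot see channel 0's banks
```

The key point the model captures is isolation: a request issued on one channel can never resolve to banks owned by another channel, and nothing forces the channels to share a clock.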




Benefits of HBM3

HBM3 (High Bandwidth Memory 3) is an advanced memory technology that offers significant improvements over its predecessor, HBM2E. With increased storage capacity, speed, and power efficiency, HBM3 is set to transform graphics, cloud computing, networking, AI, and potentially the automotive industry. Here, we outline the key benefits of HBM3:



Increased Storage Capacity

HBM3 supports up to 32 Gb of density and up to a 16-high stack, resulting in a maximum of 64 GB storage. This is almost a 3x growth compared to HBM2E, providing more memory for advanced applications.
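The capacity arithmetic above can be checked directly. The 32 Gb density and 16-high stack are from the text; the HBM2E ceiling of 24 GB (16 Gb dies in a 12-high stack) is an assumption used to show where the "almost 3x" figure comes from:

```python
# Capacity arithmetic: 32 Gb dies x 16-high stack = 64 GB per device.
die_density_gbit = 32                          # max die density (gigabits)
stack_height = 16                              # max dies per stack
total_gbit = die_density_gbit * stack_height   # 512 Gb
total_gbyte = total_gbit // 8                  # 64 GB

# Assumed HBM2E maximum: 16 Gb dies, 12-high stack -> 24 GB.
hbm2e_max_gbyte = (16 * 12) // 8
growth = total_gbyte / hbm2e_max_gbyte         # ~2.67x, i.e. "almost 3x"
```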


Faster Data Transfer Rates

With a top speed of 6.4 Gbps, HBM3 is almost double the speed of HBM2E (3.6 Gbps). The market may see a second generation of HBM3 devices in the not-too-distant future. One need only look at the speed history of HBM2/2E, DDR5 (6400 Mbps upgraded to 8400 Mbps), and LPDDR5, which maxed out at 6400 Mbps and quickly gave way to LPDDR5X operating at 8533 Mbps. HBM3 above 6.4 Gbps is within reason.
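The per-pin speeds quoted above translate into per-stack bandwidth when multiplied across the full 1024-bit interface. The GB/s figures below are derived here for illustration, not quoted from the text:

```python
# Peak per-stack bandwidth implied by the quoted pin speeds, assuming
# the full 1024-bit HBM interface is active.
INTERFACE_BITS = 1024

def peak_bandwidth_gb_s(pin_rate_gbps):
    # Gb/s per pin * number of pins / 8 bits-per-byte -> GB/s per stack
    return pin_rate_gbps * INTERFACE_BITS / 8

hbm2e_bw = peak_bandwidth_gb_s(3.6)  # 460.8 GB/s
hbm3_bw = peak_bandwidth_gb_s(6.4)   # 819.2 GB/s
```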


Improved Power Efficiency

HBM2E already offers the lowest energy per bit transferred, largely because it uses an unterminated interface, but HBM3 improves on it substantially. HBM3 decreases the core voltage to 1.1V from HBM2E's 1.2V. In addition to the 100mV core supply drop, HBM3 reduces the I/O signaling swing to 400mV, down from HBM2E's 1.2V.
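The voltage reductions above matter because CMOS dynamic power scales roughly with the square of voltage (P = C·V²·f). The ratios below are a first-order sketch only; real savings depend on capacitance, frequency, and circuit details not covered here:

```python
# First-order V^2 scaling of dynamic power (C and f terms omitted;
# only the relative ratio between the old and new voltages is shown).
core_ratio = (1.1 / 1.2) ** 2   # ~0.84: roughly 16% core dynamic-power cut
io_ratio = (0.4 / 1.2) ** 2     # ~0.11: large reduction in I/O swing power
```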


Enhanced Channel Architecture

HBM3 keeps the overall interface size the same for the HBM DRAMs – 1024 bits of data. However, this 1024-bit interface is now divided into 16 64-bit channels or, more importantly, 32 32-bit pseudo-channels. Since the width of a pseudo-channel has been reduced to 4 bytes, the burst length of memory accesses has increased to 8 beats, maintaining a 32-byte packet size for memory accesses. Doubling the number of pseudo-channels improves performance over HBM2E, and combined with the increase in data rate, HBM3 delivers a substantial performance gain over its predecessor.
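The channel arithmetic above fits together as follows (all numbers are from the text; the sketch simply verifies that they are consistent):

```python
# Channel-architecture arithmetic for HBM3.
interface_bits = 1024
channels = 16
channel_bits = interface_bits // channels     # 64-bit channels
pseudo_channels = channels * 2                # 32 pseudo-channels
pc_bytes = (channel_bits // 2) // 8           # 4-byte pseudo-channel width
burst_length = 8                              # beats per access
access_bytes = pc_bytes * burst_length        # 32-byte packet per access
```

Halving the pseudo-channel width while doubling the burst length is what keeps the 32-byte access granularity unchanged from HBM2E.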


Advanced RAS Features

HBM3 introduces several improvements to its reliability, availability, and serviceability (RAS) capabilities, including on-die error correcting code (ECC). HBM3 DRAM devices also support error check and scrub (ECS) when the device is in self-refresh mode or when the host issues a refresh-all-bank command. The results of the ECS operation may be obtained by accessing ECC transparency registers via the IEEE 1500 Test Access Port (TAP). The HBM3 standard's new RAS features also include support for refresh management (RFM) and adaptive refresh management (ARFM).


New Clocking Architecture

HBM3 changes the clocking architecture by decoupling the traditional clock signal from the host to the device from the data strobe signals. While the new maximum rate of the WDQS and RDQS strobes in HBM3 is 3.2 GHz, enabling a data transfer rate of up to 6.4 Gbps, the fastest rate the CK will run from host to device is only 1.6 GHz (even when the data channels are operating at 6.4 Gbps). Decoupling the clock signal from the strobes allows the clock to run significantly slower than the data strobes. Because the CA clock is capped at 1.6 GHz, the maximum transfer rate on the CA bus is now 3.2 Gbps; while HBM2E requires a command and address (CA) transfer rate of 3.6 Gbps, HBM3 requires only 3.2 Gbps.
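The clock-to-transfer-rate relationships above follow from double-data-rate signaling, where two transfers occur per clock cycle; the 2x factors below are implied by the rates quoted in the text:

```python
# Clock vs. transfer-rate relationships in HBM3 (DDR: 2 transfers/cycle).
ck_ghz = 1.6       # host-to-device clock, capped at 1.6 GHz
wdqs_ghz = 3.2     # write/read data strobe maximum

data_rate_gbps = wdqs_ghz * 2   # 6.4 Gb/s per data pin
ca_rate_gbps = ck_ghz * 2       # 3.2 Gb/s on the CA bus
```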


High-speed Internal Clocking

The new HBM3 clocking architecture enables the user to keep focus on a low-latency, high-performance solution when migrating from HBM2E to HBM3. The highest defined frequency for the CA bus with HBM3 is 1.6 GHz while the data strobes operate at 3.2 GHz. This enables users to implement a DFI 1:1:2 frequency ratio for an HBM3 controller and PHY. In this case, the controller, DFI, PHY and memory clock all run at 1.6 GHz while the strobe frequency is 3.2 GHz. This gives designers a DFI 1:1 frequency ratio for the CA interface and a DFI 1:2 frequency ratio for the data, all of which minimize latency.
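The DFI 1:1:2 arrangement described above can be expressed as two simple ratios (the clock values are from the text; treating them as plain ratios is an illustration, not a controller configuration):

```python
# DFI frequency ratios for the HBM3 controller/PHY setup described above.
memory_clock_ghz = 1.6   # controller, DFI, PHY, and memory clock
strobe_ghz = 3.2         # data strobe frequency

ca_ratio = memory_clock_ghz / memory_clock_ghz   # DFI 1:1 for the CA interface
data_ratio = strobe_ghz / memory_clock_ghz       # DFI 1:2 for the data path
```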