Open, industry-standard RDMA specifications have ignited the era of AI

It’s incredible to think that 25 years have passed since the inception of the InfiniBand® Trade Association (IBTA). Established in August 1999 by industry leaders, the IBTA created a clear and accessible interconnect architecture standard to provide a highly efficient, high-performance, scalable RDMA fabric for data center interconnect. Just four years later, InfiniBand made its mark in supercomputing: in November 2003, the SDR 10Gb/s system at Virginia Tech ranked as the world’s third most powerful system, a remarkable achievement from the outset.

As InfiniBand gained traction in supercomputing, demand grew in the data center market for RDMA’s performance and efficiency benefits within Ethernet-based systems. In response, the IBTA introduced the RDMA over Converged Ethernet (RoCE) specification in 2011. The standard evolved steadily, and by 2014 a multi-vendor ecosystem had emerged and RoCE was deployed at large scale. So, where do we stand today?

Since its debut on the 2003 TOP500, RDMA networking has seen significant growth. In the latest June 2024 TOP500 list, RDMA-based networking is now the predominant interconnect, with 72% adoption across the entire list. InfiniBand connects 238 systems, including 5 in the Top 10, while RoCE-based Ethernet connects 121 systems. Moreover, InfiniBand powers the #1 system in the Green500, highlighting its exceptional system efficiency.

Figure 1: June 2024 TOP500 interconnect breakdown

RDMA networking has not only solidified its position in supercomputing but has also experienced substantial growth catalyzed by the AI market. AI workloads demand significant computational power, especially those involving expansive and intricate models such as generative pre-trained transformers (GPT), a type of large language model (LLM), and BERT, a model for natural language processing. To expedite model training and handle vast datasets, AI practitioners have increasingly turned to distributed computing systems. These systems essentially mirror those used in scientific and supercomputing domains and benefit immensely from InfiniBand and RoCE-based networking solutions.

It’s no wonder that InfiniBand and RoCE are the preferred choices for connecting these large-scale AI data centers. Offering substantial bandwidth and exceptional efficiency, RDMA-based networking not only reduces total cost of ownership but also accelerates training times, an essential metric for foundational AI efforts at companies like Microsoft, OpenAI, Meta, and others.

Highlighting this surge of growth, the 650 Group updated its RDMA market size and forecast, revealing that demand for RDMA networking exceeded $6 billion in 2023 and is expected to surpass $22 billion by 2028. The 650 Group’s paper, accessible to everyone, includes an introduction to RDMA, RoCE, and the market sizing of this rapidly growing AI networking segment.

This massive growth has generated significant excitement in the industry, and the IBTA has seen remarkable growth in both membership and participation in our annual Plugfest. The IBTA now boasts over 50 members, and our 2024 Plugfest saw a record-breaking 350 devices from 20 vendors tested for compliance. During the Plugfest, attendees test their InfiniBand and RoCE products for compliance with IBTA specifications and interoperability with other compliant solutions. Products that pass these rigorous tests are listed in the IBTA Integrators’ List and RoCE Integrators’ List, providing the industry with a robust ecosystem of both InfiniBand and RoCE products.

These 25 years have been incredible, and the future looks bright! We anticipate new products based on the XDR 800Gb/s specification to be released within the next 12 months, potentially debuting in time for the June 2025 TOP500. With a robust roadmap to 3.2Tb/s links, the IBTA’s work is far from finished. Many member companies are diligently working on the specifications for InfiniBand and RoCE, and we look forward to announcing these developments as they reach their final stages of release.

Figure 2: IBTA Roadmap

I would like to extend my gratitude to HPE, IBM, Intel, and NVIDIA for their ongoing leadership in the Steering Committee. In particular, I want to recognize Jim Pappas from Intel, who will be retiring at the end of this month. Jim has been a vital part of the IBTA since its inception, helping to guide and inspire the entire consortium. I want to take this moment to thank him for his leadership, guidance, and unwavering support. Jim, we wish you a well-deserved and fulfilling retirement.

I also want to thank the many other members who contribute their expertise and technical knowledge to develop RDMA networking. The work accomplished has had a tremendous impact on the market and the advancements we see today, but there’s still more to do!

Is your company interested in participating? IBTA membership is open to any company, government department, or academic institution. For more information on membership and its benefits, visit the Membership page on the IBTA website.

Congrats and happy birthday, IBTA!

Brian Sparks

IBTA Marketing Working Group Chair