RNGD: A Game-Changing AI Inference Chip for Data Centers

Last week at Hot Chips 2024, FuriosaAI introduced the RNGD (pronounced “Renegade”) AI accelerator, a notable development for the AI semiconductor industry. The new accelerator could have far-reaching implications, potentially setting new standards for efficiency, performance, and scalability in AI-driven data centers.

June Paik, Co-Founder and CEO of FuriosaAI.

Implications for the Industry

The introduction of RNGD could mark a turning point in the AI hardware landscape. Traditional AI accelerators from established chipmakers have long dominated the market, often prioritizing raw performance at the expense of power efficiency. RNGD challenges this status quo by delivering high performance with significantly lower power consumption, making it an attractive alternative for data centers increasingly concerned with energy efficiency and sustainability.

For engineers, RNGD represents a shift toward more programmable and adaptable AI hardware. Its Tensor Contraction Processor (TCP) architecture, coupled with a co-designed compiler, allows for greater flexibility in developing and deploying AI models. This could mean faster time-to-market for AI applications and lower costs for model development and execution.

RNGD’s efficient design could influence broader industry trends, enabling a move away from the high-power, GPU-centric architectures that currently dominate AI computing. As data centers strive to manage energy costs and meet sustainability goals, RNGD offers a viable path forward, combining efficiency with the ability to handle large, complex AI models.

Inside the RNGD Accelerator

At the heart of RNGD is its Tensor Contraction Processor (TCP) architecture, which, as the name indicates, is organized around tensor contraction rather than the matrix multiplication (matmul) primitive used by most AI accelerators. This architecture allows RNGD to balance efficiency, programmability, and performance, three attributes that are critical for handling today’s demanding AI workloads.
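To make the distinction concrete, here is a minimal sketch in plain Python. It illustrates the underlying math only and is not FuriosaAI's implementation or API; the function names `matmul` and `contract_batched` are hypothetical. A matrix multiplication is the special case of a tensor contraction in which two 2-D operands are summed over one shared index, while a general contraction sums over shared indices of operands with any number of dimensions.

```python
# Illustrative only: shows the math, not how RNGD or its compiler works.
# Function names are hypothetical and chosen for this example.

def matmul(a, b):
    """C[i][j] = sum over k of A[i][k] * B[k][j].

    Two 2-D operands, contracted over the single shared index k.
    """
    n_k = len(b)
    n_j = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(n_k)) for j in range(n_j)]
        for row in a
    ]


def contract_batched(x, w):
    """Y[b][i][j] = sum over k of X[b][i][k] * W[k][j].

    A 3-D operand contracted with a 2-D operand over index k. Expressed as a
    tensor contraction, the batch index b is simply one more free index; to
    express it as a single matmul, X would first have to be reshaped to 2-D.
    """
    n_k = len(w)
    n_j = len(w[0])
    return [
        [[sum(row[k] * w[k][j] for k in range(n_k)) for j in range(n_j)]
         for row in x_b]
        for x_b in x
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))                      # [[19, 22], [43, 50]]

x = [a, [[0, 1], [1, 0]]]                # a batch of two 2x2 matrices
print(contract_batched(x, b))            # [[[19, 22], [43, 50]], [[7, 8], [5, 6]]]
```

The second function reduces to the first when the batch has one element, which is the sense in which tensor contraction generalizes matmul.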

RNGD is designed to efficiently handle large language models (LLMs) and multimodal models. Early testing has shown that a single RNGD PCIe card can deliver throughput of 2,000 to 3,000 tokens per second for models with around 10 billion parameters, depending on context length. This capability positions RNGD as a strong option for data centers looking to scale their AI operations.

Engineers will also appreciate RNGD’s power efficiency. With a thermal design power (TDP) of just 150W, RNGD draws far less power than leading GPUs, which can exceed 1,000W. Despite this lower power budget, RNGD carries 48GB of HBM3 memory, enough to run large models like Llama 3.1 efficiently on a single card.
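The figures quoted above can be combined into rough back-of-envelope estimates. The sketch below is an illustration under stated assumptions, not a measured result: it treats the 150W TDP as the card's actual power draw (real draw varies with workload), and it assumes 2 bytes per parameter (16-bit weights), a precision the article does not specify.

```python
# Back-of-envelope estimates from the article's figures. Illustrative only.

TDP_WATTS = 150                    # from the article
TOKENS_PER_SEC = (2_000, 3_000)    # from the article, for ~10B-parameter models
HBM_GB = 48                        # from the article
PARAMS = 10e9                      # "around 10 billion parameters"
BYTES_PER_PARAM = 2                # ASSUMPTION: 16-bit weights (not stated in the article)


def tokens_per_joule(tokens_per_sec, watts):
    """Tokens/s divided by watts (joules/s) gives tokens per joule."""
    return tokens_per_sec / watts


def weight_footprint_gb(params, bytes_per_param):
    """Memory needed for the model weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


low, high = (tokens_per_joule(t, TDP_WATTS) for t in TOKENS_PER_SEC)
weights_gb = weight_footprint_gb(PARAMS, BYTES_PER_PARAM)

print(f"Throughput per watt: {low:.1f} to {high:.1f} tokens/s/W")
print(f"Weights: {weights_gb:.0f} GB of {HBM_GB} GB HBM3")
```

Under these assumptions the weights of a 10-billion-parameter model take about 20GB, leaving the rest of the 48GB for the key-value cache and activations, which is consistent with the claim that such models fit on a single card.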

A New Era of Rapid Development

FuriosaAI has demonstrated a commitment to rapid innovation, and the development of RNGD is a prime example. The company completed the full bring-up of RNGD shortly after receiving its first silicon samples from TSMC. This mirrors its experience with its first-generation chip in 2021, when it submitted MLPerf benchmark results within three weeks of receiving silicon and then achieved a 113% performance boost in subsequent tests through compiler enhancements.

This agility in development underscores FuriosaAI’s technical capability and helps bring its products to market quickly. RNGD is already sampling to early-access customers, with broader availability anticipated in early 2025.

Conclusion

The launch of RNGD by FuriosaAI could signal a significant shift in the AI hardware industry. With its TCP architecture, throughput of 2,000 to 3,000 tokens per second on models with around 10 billion parameters, and a 150W TDP, RNGD aims to set a new benchmark for what AI accelerators can achieve. For engineers and data center operators, this means access to a powerful new tool that balances the need for high-performance AI processing with the growing demand for energy efficiency and sustainability.

As the AI landscape continues to evolve, RNGD may well be at the forefront of the next wave of advancements, offering a glimpse into the future of AI computing.

For more information, visit https://furiosa.ai/
