
Startup unveils trillion-transistor AI accelerator chip

The AI chip has 10,000 times more memory bandwidth than the largest GPU to accelerate artificial intelligence training

Silicon Valley startup Cerebras Systems has unveiled what it claims is the industry’s first trillion-transistor chip optimised for artificial intelligence (AI). The company’s Cerebras Wafer Scale Engine (WSE) measures 46,225mm², making it 56.7 times larger than the biggest graphics processing unit (GPU).

Compared with GPU-based AI accelerators, the WSE also contains 3,000 times more high-speed on-chip memory and 10,000 times more memory bandwidth, said Cerebras.
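Taken at face value, those multiples pin down the baseline Cerebras is comparing against. A minimal back-of-envelope sketch in Python, using only the WSE figures and ratios quoted in this article, shows the GPU they imply:

```python
# Divide the WSE's stated specs by the claimed multiples to see what
# comparison GPU they imply. Inputs are the figures quoted in this
# article; the derived values are illustrative only.
wse_area_mm2 = 46_225   # WSE die area
wse_memory_gb = 18      # WSE on-chip memory
wse_mem_bw_pbs = 9      # WSE memory bandwidth, petabytes per second

print(f"Implied GPU die area:         {wse_area_mm2 / 56.7:,.0f} mm^2")           # ~815 mm^2
print(f"Implied GPU on-chip memory:   {wse_memory_gb / 3_000 * 1e3:.0f} MB")      # ~6 MB
print(f"Implied GPU memory bandwidth: {wse_mem_bw_pbs / 10_000 * 1e6:,.0f} GB/s") # ~900 GB/s
```

Those derived figures, roughly 815mm², 6MB of on-chip cache and 900GB/s of memory bandwidth, closely match Nvidia’s Volta-class V100, which suggests that is the GPU behind Cerebras’s comparisons.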

According to Cerebras, larger chips can process information more quickly, producing answers in less time. It said that by reducing the time-to-insight, or “training time”, researchers can test more ideas, use more data and solve new problems.

“Designed from the ground up for AI work, the Cerebras WSE contains fundamental innovations that advance the state of the art by solving decades-old technical challenges that limited chip size – such as cross-reticle connectivity, yield, power delivery and packaging,” said Andrew Feldman, founder and CEO of Cerebras Systems. 

Cerebras said the Wafer Scale Engine accelerates calculations and communications, which reduces training time. “With 56.7 times more silicon area than the largest graphics processing unit, the WSE provides more cores to do calculations and more memory closer to the cores, so the cores can operate efficiently,” it said.

“Because this vast array of cores and memory are on a single chip, all communication is kept on-silicon. This means the WSE’s low-latency communication bandwidth is immense, so groups of cores can collaborate with maximum efficiency, and memory bandwidth is no longer a bottleneck.”

The 46,225mm² of silicon in the Cerebras WSE houses 400,000 AI-optimised compute cores and 18 gigabytes of local, distributed memory. Memory bandwidth is 9 petabytes per second. The cores are linked together by a fine-grained, all-hardware, on-chip mesh-connected communication network that delivers an aggregate bandwidth of 100 petabits per second.
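Spreading those aggregate numbers across the cores gives a sense of the per-core budget. A minimal sketch, again using only the figures stated above:

```python
# Per-core budgets derived from the aggregate WSE figures quoted above.
cores = 400_000
memory_bytes = 18e9        # 18 GB of local, distributed on-chip memory
mem_bw_bytes_s = 9e15      # 9 PB/s aggregate memory bandwidth
fabric_bw_bits_s = 100e15  # 100 Pbit/s aggregate mesh fabric bandwidth

print(f"Local memory per core:     {memory_bytes / cores / 1e3:.0f} KB")           # ~45 KB
print(f"Memory bandwidth per core: {mem_bw_bytes_s / cores / 1e9:.1f} GB/s")       # ~22.5 GB/s
print(f"Fabric bandwidth per core: {fabric_bw_bits_s / 8 / cores / 1e9:.2f} GB/s") # ~31.25 GB/s
```

Each core, in other words, has tens of gigabytes per second of bandwidth to its own 45KB of local memory, which is the arithmetic behind the claim that memory bandwidth is no longer a bottleneck.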
