
Bfloat16: What it is and how it impacts storage

Bfloat16 is an emerging way to handle very large numbers, developed by Google for its machine learning, neural network processing and prediction workloads. We look at what it means for IT and storage

Analysis and prediction are core to today’s IT as organisations embark on digital transformation, with use cases that range from speech recognition and pattern analysis in science, to fraud detection and security, to AIOps in IT and storage.

As artificial intelligence and predictive methods – machine learning, deep learning, neural processing, and so on – become more prevalent, ways to streamline these operations have developed.

Key among these is the emergence of new ways of dealing with large numbers, and bfloat16 – originally developed by Google – is the most prominent among these.

In this article, we look at what bfloat16 is, the impact it will have on memory and back-end storage, and which hardware makers support it. 

What is bfloat16?

Bfloat16 – short for brain floating point, 16-bit – is a way of representing floating point numbers in computing operations. Floating point numbers are a way that computers can handle very large numbers (think millions, billions, trillions, and so on) or very small ones (think lots of zeros after the decimal point) while using the same schema.

In floating point schemes, a number is stored in a set number of binary bits. One bit – the sign bit – indicates whether the number is positive or negative, a group of bits – the mantissa, or significand – holds the digits of the number itself, and the floating point element – the exponent – is a group of bits that says where the point should sit.

Bfloat16, as the name suggests, uses a 16-bit format to do all this. In doing so, it halves the size in bits of the most prevalent existing format, IEEE 754 single precision (FP32), which uses 32 bits.

But bfloat16 keeps an exponent the same size as FP32’s – eight bits – which allows it to represent the same range of numbers, but with less precision, because only seven bits remain for the mantissa compared with FP32’s 23.
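
To make this concrete, the short Python sketch below – purely illustrative, not taken from any bfloat16 library – prints the bit pattern of a number in FP32 and the 16 bits that bfloat16 keeps:

    import struct

    def fp32_bits(x: float) -> str:
        """Return the IEEE 754 single-precision (FP32) bit pattern of x."""
        (raw,) = struct.unpack(">I", struct.pack(">f", x))
        return f"{raw:032b}"

    bits = fp32_bits(3.14159265)
    sign, exponent, mantissa = bits[0], bits[1:9], bits[9:]
    print("FP32    :", sign, exponent, mantissa)      # 1 + 8 + 23 bits

    # bfloat16 simply keeps the top 16 bits: the same sign bit and 8-bit
    # exponent, but only the first 7 mantissa bits -- same range, less precision.
    print("bfloat16:", sign, exponent, mantissa[:7])  # 1 + 8 + 7 bits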

What is bfloat16 used for?

Bfloat16 was developed by Google – the B represents the company’s Brain project – specifically for its tensor processing units (TPUs) used for machine learning. The key thing here is that machine learning operations don’t need the precision that other calculations might require, but they do benefit from a wide range of representable values and from speed of operation, and that’s what bfloat16 is aimed at.
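
A rough sketch in plain Python illustrates the trade-off. The helper functions below are assumptions for illustration – they simulate bfloat16 by simple truncation and use IEEE 754 half precision (FP16) for comparison:

    import struct

    def to_bfloat16(x: float) -> float:
        """Simulate bfloat16 by zeroing the low 16 bits of the FP32 pattern."""
        (raw,) = struct.unpack(">I", struct.pack(">f", x))
        return struct.unpack(">f", struct.pack(">I", raw & 0xFFFF0000))[0]

    def to_fp16(x: float) -> float:
        """Round-trip x through IEEE 754 half precision (struct's 'e' format)."""
        return struct.unpack(">e", struct.pack(">e", x))[0]

    tiny = 1e-20                      # e.g. a very small gradient value
    print(to_bfloat16(tiny))          # about 9.95e-21: still in range
    print(to_fp16(tiny))              # 0.0: underflows in 16-bit IEEE 754

    print(to_bfloat16(3.14159265))    # 3.140625: only a few significant digits survive

The wide exponent keeps very small values representable, while the precision given up in the mantissa is something machine learning workloads can generally tolerate.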

How will bfloat16 impact storage?

The key benefits of bfloat16 are that it reduces storage requirements during processing and speeds up individual calculations during machine learning operations.

Bfloat16 takes half the memory of equivalent operations that use IEEE 754 32-bit numbers, meaning more data can be held in memory and takes less time to swap in and out of it. That means larger models and datasets can be used. Bfloat16 data also takes less time to load into memory from bulk storage.
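
As a back-of-the-envelope illustration – the one-billion-parameter model here is a hypothetical figure chosen purely for the arithmetic:

    # Memory footprint of a hypothetical model with one billion parameters.
    params = 1_000_000_000

    fp32_bytes = params * 4          # IEEE 754 single precision: 4 bytes per value
    bf16_bytes = params * 2          # bfloat16: 2 bytes per value

    print(f"FP32    : {fp32_bytes / 2**30:.1f} GiB")   # ~3.7 GiB
    print(f"bfloat16: {bf16_bytes / 2**30:.1f} GiB")   # ~1.9 GiB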

Hardware support for bfloat16 is built into processors and processing units themselves, so the silicon is tailored to the format.

Back-end storage volumes are likely to benefit too. In other words, you’ll need less capacity if you do a lot of machine learning work with data held in bfloat16. But it’s more likely, at least for a while, that IEEE 754 32-bit data will predominate on storage, with conversion to bfloat16 taking place as data is loaded for processing.
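
The NumPy sketch below shows one way that conversion can be done – keeping only the top 16 bits of each stored FP32 value – and the halving of the byte count that results. It is a simplified illustration; production frameworks typically round to nearest rather than truncate:

    import numpy as np

    # FP32 data as it might be read from back-end storage.
    fp32 = np.asarray([3.14159265, 1e-20, -42.5], dtype=np.float32)

    # Keep only the top 16 bits of each value: the raw bfloat16 bit patterns.
    bf16 = (fp32.view(np.uint32) >> 16).astype(np.uint16)

    print(fp32.nbytes, "bytes as FP32")      # 12
    print(bf16.nbytes, "bytes as bfloat16")  # 6

    # To compute with the values again, widen back to FP32 by restoring
    # the low 16 bits as zeros: precision is reduced, but range is preserved.
    restored = (bf16.astype(np.uint32) << 16).view(np.float32)
    print(restored)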

What hardware support exists for bfloat16?

Bfloat16 was first deployed on Google’s TPU hardware, which can be used via the provider’s cloud services (Cloud TPU) or bought as a product for customer on-premises use.

At the time of writing, it is also supported by Intel’s third-generation Xeon Scalable CPUs, IBM’s Power10 and ARM’s Neoverse processors. 

Bfloat16 is also supported in a number of NPUs – neural processing units, of which the TPU is one – including ARM’s Project Trillium, Centaur Technology, Flex Logix, Habana Labs, Intel’s Nervana and Wave Computing.
