IT leaders are preparing their organisations for increasing use of artificial intelligence (AI) to extract the insights they need from growing amounts of data. To help them, Intel is enabling businesses to future-proof their IT architecture by building AI accelerators into its second-generation Intel® Xeon® Scalable processors.
The key technology delivering this capability is Intel® DL Boost (DL stands for deep learning), which is built into the second-generation Intel® Xeon® Scalable processor. DL Boost gives classic CPUs more AI power through an instruction set that performs more operations in a single cycle of computation.
Better still, DL Boost is embedded into the CPU, so there is no extra cost for users looking to deploy AI, according to Walter Riviera, AI technical solution specialist at Intel.
“The effort Intel is making in the AI domain for customers and partners is huge, and it is providing the widest portfolio of technology available. The Intel model can be accelerated to make computations faster,” he says.
“The point is to evolve technology for all the different challenges in the AI space, and develop Xeon® processors with new AI muscles that allow people to accelerate their workload on their favourite platform.”
Intel also ensures that IT architectures are prepared for a future of growing big data requirements and rapidly expanding AI development. DL Boost uses Vector Neural Network Instructions (VNNI), an instruction set designed to speed up AI algorithms.
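To make the idea concrete, the sketch below shows in plain NumPy the arithmetic that a single VNNI instruction (VPDPBUSD) performs in hardware: a dot product of 8-bit integers accumulated into a 32-bit result, an operation that previously took three separate instructions. This is a conceptual illustration only; real VNNI code is generated by compilers and optimised libraries, not written by hand.

```python
import numpy as np

# Conceptual sketch of what one AVX-512 VNNI instruction (VPDPBUSD) computes
# per 32-bit accumulator lane: four unsigned 8-bit values multiplied by four
# signed 8-bit values, summed, and added to a 32-bit accumulator.
a = np.array([12, 250, 3, 77], dtype=np.uint8)  # unsigned 8-bit activations
b = np.array([-5, 9, 120, -2], dtype=np.int8)   # signed 8-bit weights
acc = 1000                                      # running 32-bit accumulator

# Widen to 32-bit before multiplying so nothing overflows, then accumulate.
acc += int(np.dot(a.astype(np.int32), b.astype(np.int32)))
print(acc)  # 1000 + (12*-5 + 250*9 + 3*120 + 77*-2) = 3396
```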
“We can see in the future an increased demand for compute and memory. Data resolution will increase with better cameras and mics, for example. Research will produce bigger and deeper models that will require big compute,” says Riviera.
He highlights how Intel is committed to meeting new compute challenges with different requirements, while continuing to build on its backbone, the Intel® Xeon® Scalable processor.
The AI journey
There are three phases that characterise an AI project engagement, says Riviera.
The first is data management, which involves collecting, accessing and cleansing the data. The second is the training phase, which consists of teaching the machine to solve specific tasks to the required level of accuracy. The third is inference, when the machine is used to make decisions.
DL Boost can help to accelerate the inference stage of AI workloads.
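As a minimal sketch of how the three phases fit together (using scikit-learn and a toy dataset as stand-ins; the dataset, model and split here are illustrative assumptions, not a workflow from the article):

```python
# Minimal sketch of the three phases of an AI project (illustrative only).
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Phase 1: data management - collect, access and cleanse the data.
X, y = load_digits(return_X_y=True)
X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.2)

# Phase 2: training - teach the model to solve the task to the required accuracy.
model = RandomForestClassifier().fit(X_train, y_train)

# Phase 3: inference - use the trained model to make decisions on new data.
predictions = model.predict(X_new)
print("accuracy on unseen data:", model.score(X_new, y_new))
```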
This can be described with a healthcare example, where an AI trained on historical magnetic resonance imaging (MRI) scan data can be used to infer results from new data. In other words, the AI can tell whether a new scan shows good or bad news for the patient, based on what it learned from previous scans.
Inference can take place in a large datacentre or at the edge; the final application defines the requirements and where the solution needs to be hosted.
“Training and inference bring their different challenges and require different technologies,” says Riviera. Intel has products that are appropriate for both scenarios.
Siemens Healthineers is one organisation making exciting developments with DL Boost. With one in three deaths worldwide caused by cardiovascular disease, MRI is essential for rapid diagnosis, but radiology departments must cope with huge amounts of data. Turnaround time is a challenge that Intel’s AI technologies can help address, and Intel has worked with Siemens Healthineers to improve the analysis of cardiac MRI exams using DL Boost and OpenVINO™, without the added cost and complexity of specialist accelerators.
OpenVINO is a free, open source developer toolkit for optimising and deploying deep learning solutions across multiple platforms.
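As a rough illustration of what deployment with the toolkit looks like, the sketch below loads and runs a model with OpenVINO’s Python API. The API names follow the post-2022 openvino.runtime interface (earlier releases exposed a different openvino.inference_engine API), and the model file name and input shape are placeholder assumptions:

```python
# Minimal OpenVINO inference sketch (post-2022 Python API). "model.xml" is a
# placeholder for a network already converted to OpenVINO's IR format.
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")         # load the optimised IR model
compiled = core.compile_model(model, "CPU")  # target an Intel Xeon CPU

# Dummy input; adjust the shape to match the model's actual input.
input_tensor = np.zeros((1, 3, 224, 224), dtype=np.float32)
result = compiled([input_tensor])[compiled.output(0)]
print(result.shape)
```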
The result is that AI-optimised Intel technologies have achieved a 5.5-times speed increase in quantifying heart function from cardiac MRI, thereby accelerating clinical workflows, improving diagnostic accuracy, reducing hospital costs and supporting medical research.
Meeting different priorities
However, as Riviera points out, different customers have different priorities. Intel can meet requirements for low-latency inference, where an answer is needed as soon as possible, such as braking during autonomous driving. It can also support workloads where throughput is the priority and as much data as possible must be processed, for example a bank that wants to maximise the amount of data it can process overnight.
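In newer OpenVINO releases, this latency-versus-throughput trade-off can be expressed as a performance hint when compiling the model. A sketch, assuming the 2022-era property names (older versions used lower-level stream and thread settings instead):

```python
# Sketch: expressing the latency-vs-throughput trade-off via OpenVINO
# performance hints (property names follow 2022-era releases).
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")  # placeholder IR model

# Low-latency deployment: answer each individual request as fast as possible.
latency_first = core.compile_model(model, "CPU",
                                   {"PERFORMANCE_HINT": "LATENCY"})

# Throughput deployment: process as much data as possible, e.g. overnight batches.
throughput_first = core.compile_model(model, "CPU",
                                      {"PERFORMANCE_HINT": "THROUGHPUT"})
```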
Adoption of Intel’s solutions is also made easier by OpenVINO.
Developers are harnessing the OpenVINO™ toolkit to unlock cost-effective, real-time vision applications by improving neural network performance on Intel processors.
Riviera explains that, with so many Intel products available on the market for inference, customers benefit from having a unified platform through which to interact with all of them.
With OpenVINO, a developer can deploy a trained model for inference on devices without knowing anything about the underlying architecture, as is the case with the Intel® Movidius™ vision processing unit (VPU), Intel® field programmable gate arrays (FPGAs), the recently launched Intel® Nervana™ Neural Network Processor (NNP) and many others.
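In practice, this portability comes down to a device string passed at compile time. A sketch, with the caveat that the device names available depend on the OpenVINO version and installed plugins (for example, "MYRIAD" was the name used for the Movidius VPU in older releases):

```python
# Sketch: the same OpenVINO code can target different Intel devices simply by
# changing the device string; the names below are examples and depend on the
# OpenVINO version and the plugins installed on the machine.
from openvino.runtime import Core

core = Core()
print(core.available_devices)         # list the accelerators OpenVINO can see
model = core.read_model("model.xml")  # placeholder IR model

for device in ("CPU", "GPU", "MYRIAD"):
    try:
        compiled = core.compile_model(model, device)
        print(f"compiled for {device}")
    except RuntimeError:
        print(f"{device} not available on this machine")
```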
“OpenVINO helps if you want to deploy on the CPU. Second-generation Xeon® can give you the optimised model that activates VNNI. Developers love OpenVINO because it unifies everything in one platform,” says Riviera.
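The “optimised model” in that quote typically means one quantised to INT8, which is what allows the CPU runtime to use VNNI. Below is a sketch of post-training quantisation with Intel’s NNCF library; the API names follow recent NNCF releases and the calibration data is a placeholder assumption, not the exact workflow described in the article:

```python
# Sketch: post-training INT8 quantisation with NNCF, so the CPU runtime can
# emit VNNI instructions. API names follow recent NNCF releases; the
# calibration samples below are placeholders for real representative inputs.
import nncf
import numpy as np
from openvino.runtime import Core, serialize

core = Core()
model = core.read_model("model.xml")  # placeholder FP32 IR model

# A handful of representative inputs used to calibrate quantisation ranges.
calibration_samples = [np.zeros((1, 3, 224, 224), dtype=np.float32)
                       for _ in range(10)]
quantized = nncf.quantize(model, nncf.Dataset(calibration_samples))
serialize(quantized, "model_int8.xml")  # save the INT8 model for deployment
```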
Thanks to DL Boost, organisations can prepare for their AI future, and second-generation Intel® Xeon® Scalable processors mean they are ready for these exciting developments.