Leveraging the IBM Telum Integrated Accelerator for AI
The IBM Integrated Accelerator for AI is an on-chip AI accelerator, available on the IBM Telum chip in IBM z16 and LinuxONE 4 servers, designed for high-throughput, low-latency inference for deep learning and machine learning.
With IBM z16 and the Integrated Accelerator for AI, you can build and train your models on any platform, including IBM zSystems and LinuxONE. When you deploy those models on IBM zSystems, they receive transparent acceleration and optimization, leveraging the best available acceleration for each model type.
The IBM Integrated Accelerator for AI is more than a matrix-multiply accelerator: it optimizes and accelerates a wide set of complex functions commonly found in deep learning and machine learning models, so a broader portion of a model can run directly on the chip.
The following operations are supported on the accelerator (by machine generation):
Operation | z16 or LinuxONE 4 (Telum I) | z17 (Telum II) |
---|---|---|
LSTM Activation | Supported | Supported |
GRU Activation | Supported | Supported |
Fused Matrix Multiply, Bias op | Supported | Supported (adds transpose, INT8 quantization) |
Fused Matrix Multiply (w/ broadcast) | Supported | Supported (adds transpose, INT8 quantization) |
Batch Normalization | Supported | Supported |
Fused Convolution, Bias Add, Relu | Supported | Supported |
L2 Norm | | Supported |
Layer Normalization | | Supported |
Max Pool 2D | Supported | Supported |
Average Pool 2D | Supported | Supported |
Softmax | Supported | Supported |
Relu | Supported | Supported |
Leaky Relu | | Supported |
Gelu | | Supported |
Tanh | Supported | Supported |
Sigmoid | Supported | Supported |
Add | Supported | Supported |
Subtract | Supported | Supported |
Multiply | Supported | Supported |
Divide | Supported | Supported |
Min | Supported | Supported |
Max | Supported | Supported |
Log | Supported | Supported |
Square root | | Supported |
Transform (Tensor) | | Supported |
Reduce | | Supported |
These operations allow supporting frameworks to offload a significantly larger portion of a model's computation to the Integrated Accelerator for AI.
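Before wiring any of the frameworks below to the accelerator, it can be useful to confirm that the hardware is present. The sketch below shows one way to do this on Linux on IBM Z, assuming a recent s390x kernel that reports the accelerator (NNPA) facility as an `nnpa` flag in `/proc/cpuinfo`:

```python
# Check for the Integrated Accelerator for AI (NNPA facility) on Linux on IBM Z.
# Assumes a recent s390x kernel that lists "nnpa" among the hardware features.

def nnpa_available() -> bool:
    try:
        with open("/proc/cpuinfo") as f:
            cpuinfo = f.read()
    except OSError:
        return False
    # The "features" line lists hardware capabilities, e.g. "features : ... nnpa ..."
    for line in cpuinfo.splitlines():
        if line.startswith("features"):
            return "nnpa" in line.split()
    return False

if __name__ == "__main__":
    print("NNPA accelerator available:", nnpa_available())
```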
Using the Integrated Accelerator for AI
Depending on your model type, there are several approaches to leveraging the Integrated Accelerator for AI. These capabilities are all available in various IBM product offerings as well as through no-cost channels (such as the IBM Z Container Image Repository).
For deep learning models, such as those created in PyTorch or TensorFlow:
- ONNX deep learning models, when compiled using the IBM Z Deep Learning Compiler (onnx-mlir); see the inference sketch after this list.
- TensorFlow
- PyTorch
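For example, once an ONNX model has been compiled with the IBM Z Deep Learning Compiler into a shared library (for instance with the `--EmitLib` and `--maccel=NNPA` options), it can be invoked from Python through the onnx-mlir runtime. This is a minimal sketch: the `OMExecutionSession` class follows onnx-mlir's documented Python API, while `model.so` and the input shape are placeholders:

```python
import numpy as np
# PyRuntime ships with onnx-mlir / the IBM Z Deep Learning Compiler;
# the OMExecutionSession class name is taken from onnx-mlir's Python API.
from PyRuntime import OMExecutionSession

# Load the shared library produced by compiling an ONNX model
# (file name is a placeholder for illustration).
session = OMExecutionSession("./model.so")

# Run inference; inputs are a list of numpy arrays matching the model's
# signature (a single 1x10 float32 tensor is assumed here).
inputs = [np.random.rand(1, 10).astype(np.float32)]
outputs = session.run(inputs)
print(outputs[0])
```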
For machine learning models, such as those created in scikit-learn, XGBoost, or LightGBM:
- IBM Snap ML, a machine learning framework that provides optimized training and inference; a brief example follows.
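Snap ML deliberately mirrors the scikit-learn API, so moving an existing pipeline usually amounts to changing an import. A minimal sketch with synthetic data (the estimator and parameter names follow Snap ML's published API):

```python
import numpy as np
from snapml import RandomForestClassifier  # scikit-learn-compatible estimator

# Synthetic data purely for illustration.
rng = np.random.default_rng(0)
X = rng.random((1000, 16), dtype=np.float32)
y = (X[:, 0] > 0.5).astype(np.int32)

# Train and predict; on IBM zSystems, Snap ML can route tree-ensemble
# inference to the Integrated Accelerator for AI where supported.
clf = RandomForestClassifier(n_estimators=100, n_jobs=4, random_state=42)
clf.fit(X, y)
predictions = clf.predict(X)
print(predictions[:10])
```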
For those interested in enhancing frameworks or compilers to use the Integrated Accelerator for AI:
- IBM zDNN, the accelerator development library, which exposes the accelerator's operations to framework and compiler developers; a probing sketch follows.
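zDNN itself is a C library (`libzdnn`); the brief ctypes sketch below probes it from Python on a Linux on IBM Z system. The shared-library name and the `zdnn_is_nnpa_installed` and `zdnn_get_library_version` entry points are taken from the zDNN headers and are assumptions to verify against your installed version:

```python
import ctypes

# Load the zDNN shared library (soname assumed; adjust to your install).
libzdnn = ctypes.CDLL("libzdnn.so")

# Entry points as declared in zdnn.h (assumed signatures):
#   bool zdnn_is_nnpa_installed(void);
#   uint32_t zdnn_get_library_version(void);
libzdnn.zdnn_is_nnpa_installed.restype = ctypes.c_bool
libzdnn.zdnn_get_library_version.restype = ctypes.c_uint32

print("NNPA installed:", bool(libzdnn.zdnn_is_nnpa_installed()))
print("zDNN library version: 0x%08x" % libzdnn.zdnn_get_library_version())
```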
For further details, use the navigation bar on this page to select a topic under 'Featured Frameworks and Technologies'.
Each of these is available as a standalone package, free of charge, or embedded within IBM products such as Machine Learning for z/OS and Cloud Pak for Data.
Further reading: