Optimized TensorFlow for Ampere

Ampere has released its latest TensorFlow update to help accelerate your AI workloads. Check it out here:


It’s awesome to see PyTorch optimisations land for Ampere / AArch64 CPUs.
A few years ago, when we benchmarked Altra CPUs against x86, I remember PyTorch being one of the major disappointments.
Way to go!


I wonder how it compares to Graviton 3, since that has BFloat16 and some other features not present in Altra.

Ampere Altra and Graviton 2 are both Neoverse N1 based; Graviton 3 is Neoverse V1.


Andrew Goodbody (Linaro LDCG) merged AArch64 build support into TensorFlow so that Python packages can be built (and tested) the same way as on x86-64.

I am working on getting it running on Linaro CI (we use ThunderX2 there). On Honeycomb (with only 32 GB of RAM), build and test for Python 3.8 takes ~6 hours. I wonder how fast it will go on Altra (I will publish scripts before x-mas).
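For anyone wanting to try the same build locally, the steps below are a rough sketch of the standard from-source TensorFlow pip-package build of that era (Bazel plus the `build_pip_package` target). Exact flags and target names vary by TensorFlow version, so treat this as illustrative and check the repo's build docs and the Linaro CI scripts once published.

```shell
# Illustrative sketch of a from-source TensorFlow pip build on AArch64.
# Flags/targets are version-dependent; verify against your checkout's docs.
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow

# Answer the interactive configuration prompts (Python path, options, etc.).
./configure

# Build the pip-package builder with optimizations enabled.
bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

# Produce the wheel and install it.
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-*.whl
```

On a 32 GB machine like the Honeycomb mentioned above, you may also need to cap Bazel's parallelism (e.g. `--jobs`) to keep the build from exhausting memory.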

As far as I understand, we compare well for inference, but for training and other vector/floating-point-heavy jobs we would not perform as well.


Ampere Altra supports FP16, and FP16 and BF16 have the same compute throughput on it. For inference, FP16 can be dropped into many common models easily. Ampere has benchmarked its Ampere Optimized Frameworks against Graviton 3 running TensorFlow with ACL: AI inference on Ampere Optimized Frameworks (TF, PyTorch, etc.) outperforms Graviton 3 with ACL.
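The FP16-vs-BF16 distinction behind this discussion is easy to see numerically: both are 16-bit, but BF16 keeps float32's 8-bit exponent (huge dynamic range, only 7 mantissa bits), while IEEE FP16 has a 5-bit exponent (range tops out near 65504) but 10 mantissa bits, so it is more precise in range. A small pure-Python sketch (using bit truncation to emulate BF16, which is a simplification of real round-to-nearest hardware behaviour):

```python
import struct

def to_bf16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.
    (Real hardware rounds-to-nearest; truncation is close enough to illustrate.)"""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

def to_fp16(x: float) -> float:
    """Round a value through IEEE half precision via struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# BF16 inherits float32's range: 1e38 is representable.
print(to_bf16(1e38))

# FP16's max finite value is ~65504, so 1e5 does not fit at all.
try:
    struct.pack("<e", 1e5)
except OverflowError:
    print("1e5 overflows FP16")

# In range, FP16's extra mantissa bits win: 1.001 survives in FP16
# but collapses to 1.0 under BF16's 7-bit mantissa.
print(to_fp16(1.001))
print(to_bf16(1.001))
```

This range-vs-precision trade-off is why BF16 is popular for training (gradients have wild dynamic range) while FP16 is often sufficient for inference of common models, as noted above.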