I came across an interesting writeup from Stony Brook University, presented at the 2025 HPC Asia Workshops: https://dl.acm.org/doi/10.1145/3703001.3724384
The paper describes the benchmarking tools and methods used to compare the AmpereOne A192-32X against a number of other machines, including Sapphire Rapids, Milan, and Grace. They tested a variety of primarily HPC workloads (not exactly the design point for AmpereOne, but an interesting comparison anyway), including genomics, AI/ML, computational fluid dynamics, molecular dynamics, linguistics, and statistical analysis.
A couple of interesting ideas I noticed:
- Strong performance on multithreaded applications looking for high throughput (e.g. genomics with BWA, AI inference with PyTorch, and linguistics with Bufia), but weaker on some HPC tasks, such as OpenFOAM, that require wide SIMD floating-point registers or high memory bandwidth, where HBM can offer dramatic improvements.
- Measuring power and energy is important when considering total response. Things like Turbo or HT may prove problematic for deterministic results on systems with high node occupancy.
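
To make the energy point a bit more concrete, here is a rough sketch of how energy-to-solution could be estimated from sampled power readings by integrating power over the run time. This is my own illustration, not the paper's method: the function name `energy_joules` and the sample values are made up, and how you collect the power samples in practice depends on the platform.

```python
def energy_joules(timestamps_s, power_w):
    """Estimate energy (J) from sampled power (W) using the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Average the two neighbouring samples over the interval between them.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical samples for illustration only (not measurements from the paper).
timestamps = [0.0, 1.0, 2.0, 3.0]        # seconds since the run started
power = [100.0, 200.0, 200.0, 100.0]     # watts at each timestamp

energy = energy_joules(timestamps, power)
runtime = timestamps[-1] - timestamps[0]
print(f"energy: {energy:.1f} J, average power: {energy / runtime:.1f} W")
```

Comparing machines on energy for the whole run, rather than on peak or average power alone, accounts for the fact that a faster machine drawing more power can still use less energy overall.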