One thing I always consider when introducing aarch64 into a developer ecosystem is how to deploy applications to the cloud.
We all build on Intel, except for some teammates who use M1 or M2 processors (ARM). That creates some friction: if you test locally on x64 but deploy to AArch64, you need an emulation layer based on QEMU or similar.
So I wonder how people handle this. At least from what I can see, we need:
- To be able to run x64 workloads on AArch64 systems (Postgres, MySQL, etc.), so that you have your dependencies running.
- Different builds for different architectures in the pipeline, so we test on Intel but also on the target. That means repos must hold twice as many artifacts (x64, aarch64).
- Larger pipelines to run test suites on both architectures. (I did a patch for Thrift that worked fine on 32-bit and 64-bit Intel but not on ARM.)
- An optimized JVM, if you run Java.
- An emulation layer to run the whole ecosystem locally for development or testing. QEMU? Or native images in case of performance issues.
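To make the emulation question in the list above a bit more concrete, here is a minimal Python sketch that checks whether a workload built for one architecture would need an emulation layer on the current host. The alias table and the helper names `normalize_arch` and `needs_emulation` are assumptions made up for illustration; they do not come from any tool mentioned in this thread.

```python
import platform
from typing import Optional

# Assumed alias table: tools report the same architecture under
# different names (macOS says "arm64", Linux says "aarch64", etc.).
_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "x64": "x86_64",
    "aarch64": "aarch64",
    "arm64": "aarch64",
}


def normalize_arch(name: str) -> str:
    """Map an architecture name to a canonical form (hypothetical helper)."""
    key = name.strip().lower()
    return _ALIASES.get(key, key)


def needs_emulation(target: str, host: Optional[str] = None) -> bool:
    """Return True if a `target` binary cannot run natively on `host`.

    When `host` is omitted, the current machine is used.
    """
    if host is None:
        host = platform.machine()
    return normalize_arch(target) != normalize_arch(host)


if __name__ == "__main__":
    for target in ("x86_64", "aarch64"):
        verdict = "emulation (e.g. QEMU)" if needs_emulation(target) else "native"
        print(f"{target} workload on this machine: {verdict}")
```

A check like this could decide, per developer machine, whether a dependency such as Postgres should be pulled as a native image or run under emulation.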
So… with all this in mind, what are the best practices?
The way I picture it, the team works on their Intel laptops, builds and tests on the native architecture, and pushes to a repo.
Once the code is there, it should pass the pipeline for the target architecture, aarch64. So the build system must run on that architecture to be able to run the tests, which means we must have aarch64 workers ready. Once it passes, it will push to an artifact repository in the native architecture and build images for common development architectures.
Intel workers will recompile, run the tests, and build artifacts for x64, and leave ready-to-use images for composing services and local use.
That means there’s no way to share compilation steps or artifact storage across architectures… They will be doubled or tripled depending on how many architectures we need to support.
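The doubling or tripling described above can be sketched as a simple build matrix. This is only an illustration: the component names, the version string, and the artifact naming scheme are all hypothetical.

```python
from itertools import product
from typing import List, Tuple


def build_matrix(components: List[str], arches: List[str]) -> List[Tuple[str, str]]:
    """Return one (component, architecture) build job per combination."""
    return list(product(components, arches))


def artifact_name(component: str, version: str, arch: str) -> str:
    """Hypothetical naming scheme that keeps per-architecture artifacts apart."""
    return f"{component}-{version}-linux-{arch}.tar.gz"


if __name__ == "__main__":
    components = ["api", "worker"]  # made-up component names
    for arches in (["x86_64"], ["x86_64", "aarch64"]):
        jobs = build_matrix(components, arches)
        print(f"{len(arches)} arch(es) -> {len(jobs)} build jobs")
        for component, arch in jobs:
            print("  ", artifact_name(component, "1.0.0", arch))
```

The number of jobs and stored artifacts grows linearly with the number of supported architectures, which is the cost the paragraph above is worried about.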
What happens with third-party software like MySQL, Postgres, and other products? I suppose Cassandra, for example, is not a problem since it runs on the JVM. But what about native ones… Do we have performance graphs for running them on the emulation layer?
Do you use native nodes to run this software? (Mixed architecture datacenters)
Somehow I missed this back in May! For sure, building aarch64 binaries and containers is not the default, so it’s a little more of a challenge than x86_64. However:
- You can run aarch64 builders for GitHub Actions in VMs on a CSP - all the big cloud service providers offer ARM instances now (and most are using Ampere)
- The wide availability of Raspberry Pi, M1 and M2 means that a lot of people now have a general purpose ARM v8 processor available at home
- There are lower-end options than a rackable server - Snapdragon-based desktop or laptop options, or for heavier workloads, Ampere based developer kit/software development platforms are ideal for including in a build farm, because they provide pretty high core counts and can thus run a lot of builders
- There’s always cross-compilation using qemu or docker buildx
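For the `docker buildx` option in the list above, a multi-architecture image build might look roughly like the following. This Python sketch only assembles the command line rather than running it; the image name is a placeholder, and the helper `buildx_command` is an assumption for illustration. Note that when buildx builds for a platform other than the host's, it typically relies on QEMU emulation unless native builder nodes are attached, so such builds can be slow.

```python
import shlex
from typing import List


def buildx_command(
    image: str,
    platforms: List[str],
    context: str = ".",
    push: bool = False,
) -> List[str]:
    """Assemble a `docker buildx build` invocation for several platforms."""
    cmd = [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),  # e.g. linux/amd64,linux/arm64
        "--tag", image,
    ]
    if push:
        # Multi-platform results are usually pushed to a registry,
        # since they are stored as a manifest list.
        cmd.append("--push")
    cmd.append(context)
    return cmd


if __name__ == "__main__":
    cmd = buildx_command(
        "registry.example.com/myapp:1.0",  # placeholder image name
        ["linux/amd64", "linux/arm64"],
        push=True,
    )
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```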
Open source projects that support arm64 have typically added arm64 infrastructure to their build farms (sometimes ThunderX or eMAG, but increasingly Ampere Altra hardware), which can take the load of automated integration tests and arm64 builds. I have not seen much need to run x86_64 applications like Postgres on arm64 (why would you avoid running the arm64 version?). Of course, you can run x86 workloads under emulation with QEMU or box86 if you need to, but I wouldn’t recommend it.
The best practice is to add ARM hardware to your build/test pipeline, and compile and test natively on all target platforms. As you say, this doubles both the size of your build artifacts and the test matrix you have to run, but it is what I have seen most often.
And welcome to the Ampere community!