Hi all,
A few weeks ago, we were at KubeCon EU, and among the Ampere-related activities at the event, we had a very nice System76 Thelio Astra system at the OpenTelemetry Observatory. We’ve been doing quite a bit with OTel recently, and it has been great working with that community! The demo that they had prepared, however, had a few issues, and fixing them raised more questions (for me) than it answered. I have a hypothesis below, but would love to hear if anyone else has seen this issue and has a better explanation.
This demo is a reference application that uses multiple languages and runtimes to exercise OpenTelemetry, including a React.js front-end, which pulls a lot of JavaScript dependencies from npm. The application is designed to dispatch (using a Spring Boot dispatcher) parts of the Mandelbrot set to different Golang workers before rendering the result using React.js. The idea is to exercise many different cores to render quickly, and to give your observability platform good raw data to report on.
Unfortunately, when we ran `npm build install` on site, we got the delightfully helpful error message “Bus error (core dumped)”. We could not find the core to get a stack trace, so this was all we had to go on. Searching the Internet yielded a number of promising hits:
- Bus error (core dumped) while starting a NextJs project
- How can I recover from ‘Bus Error’ in Ubuntu 20.04?
- Bus error trying to install npm
This error appears to happen more often on Arm64 nodes (lots of Mac links show up). The advice is universal, and typically arrives without any additional comment:
- delete the `node_modules` directory
- delete the `package-lock.json` file
- deleting `_next` is sometimes included
And indeed this did fix our issue on site! Unfortunately, it involves downloading and rebuilding hundreds of megabytes of dependencies, and over conference wifi, that wasn’t ideal, but we got there!
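For anyone who wants to script that cleanup, here is a minimal sketch of the steps above in Python. The function name `clean_npm_state` and its defaults are hypothetical, chosen only for illustration; the names it removes are the ones from the advice, and entries that do not exist are skipped.

```python
import shutil
from pathlib import Path


def clean_npm_state(project_dir, names=("node_modules", "package-lock.json", "_next")):
    """Remove cached npm state from project_dir and return the names that were removed.

    Hypothetical helper: the default names simply mirror the advice quoted above.
    """
    root = Path(project_dir)
    removed = []
    for name in names:
        target = root / name
        if target.is_dir() and not target.is_symlink():
            # A real directory such as node_modules: remove it recursively.
            shutil.rmtree(target)
        elif target.exists() or target.is_symlink():
            # A regular file (or a symlink): remove just that entry.
            target.unlink()
        else:
            # Nothing to delete for this name.
            continue
        removed.append(name)
    return removed
```

After something like `clean_npm_state(".")`, running `npm install` again re-resolves and re-downloads the dependencies, which is the slow part over conference wifi.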
My question, though, is: what’s going on? And how should people head the issue off if it is a frequently occurring one on Arm64 systems?
My hypothesis is that the `package-lock.json` file included in the repository causes npm to download specific binary artifacts containing some kind of pre-compiled JavaScript modules, to save rebuilding time locally. If those artifacts were built for a different architecture, we would end up with binary-incompatible modules, when they really need to be rebuilt locally (or npm needs some architecture awareness built in). Does that even make sense?
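One way to probe this hypothesis before deleting anything would be to look at the native binaries that are actually sitting in `node_modules`. npm packages can ship prebuilt native addons as `.node` files, which on Linux are ordinary ELF shared objects, and the ELF header records the target architecture. The sketch below is an assumption-laden diagnostic, not a confirmed explanation of the bus error: the function names are hypothetical, it only understands ELF (so it ignores Mach-O files on macOS), and it only knows a handful of architecture codes.

```python
import os
import platform
import struct

# Small subset of ELF e_machine codes, named to match platform.machine() on Linux
# where possible (32-bit x86 reports e.g. "i686", so "x86" will not match there).
ELF_MACHINES = {0x03: "x86", 0x28: "arm", 0x3E: "x86_64", 0xB7: "aarch64"}


def elf_machine(path):
    """Return the architecture named in a file's ELF header, or None if it is not ELF."""
    with open(path, "rb") as fh:
        header = fh.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # e_ident[5] (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, after e_ident (16 bytes) and e_type (2).
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, hex(machine))


def find_native_addons(root):
    """Yield (path, arch) for every ELF-format .node file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".node"):
                path = os.path.join(dirpath, name)
                arch = elf_machine(path)
                if arch is not None:
                    yield path, arch


def mismatched_addons(root, host_arch=None):
    """Return the addons whose recorded architecture differs from the host's."""
    host = host_arch or platform.machine()
    return [(path, arch) for path, arch in find_native_addons(root) if arch != host]


if __name__ == "__main__":
    for path, arch in mismatched_addons("node_modules"):
        print(f"{path}: built for {arch}, host is {platform.machine()}")
```

If this prints nothing on an affected Arm64 machine, the wrong-architecture theory looks less likely and the cause is probably elsewhere, for example a corrupted or partially written file in `node_modules`.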
Anyone else encountered this issue?
Dave.