64K Memory page sizes - any experiences to share?

Thanks for trying it out and reporting back!

2 Likes

Thanks Tristan.

It’ll take some time before I get to this - and I am worried about the precedent of having to include all the distributions afterwards. Is there a NixOS documentation page that describes the process? You mention that the system rebuilds the kernel - as in a from-scratch recompile?!

Dave.

1 Like

Yes, the entire kernel would rebuild due to the nature of how Nix works. However, this can be alleviated by using ccache. The good news is that the kernel takes less than 20 minutes to compile on an Ampere Altra Q64-22. I’ll link some documentation:

As for distros: the great thing about Nix is that you don’t have to use NixOS. Nix can run on Fedora, Ubuntu, RHEL, any distro (even Android, to some degree). These instructions won’t fully apply outside of NixOS, but the documentation is enough to point users in the right direction.

1 Like

Has anyone noticed memory consumption becoming very high after updating to the 64k kernel?

1 Like

Solved the issue. It seems hugepages were reserved. Thanks to @David.Zeng for the help. You need to free the hugepages with this command:
echo 0 | sudo tee /proc/sys/vm/nr_hugepages

1 Like

I just got an ext4 filesystem corruption on a GH200 system that uses a 64k page size :joy: Another system got OOM errors with simple compilation tasks.

Don’t know why. Ubuntu 22.04 with kernel “6.8.0-1013-nvidia-64k”

1 Like

So I don’t want to derail things too bad but Nix kinda fixes this lol. We have everything and it’s easy to recover from a broken system. Feel free to DM me if you’d like to find out more heh.

I’ve been investigating this JEMalloc issue with some colleagues and we’ve found that the issue is because the test code (not JEMalloc code) allocates a large amount of memory on the stack that happens to be dependent on the base page size of the kernel:

#  define HUGEPAGE_PAGES (HUGEPAGE / PAGE)

...

      edata_t alloc[HUGEPAGE_PAGES];


On a 4K kernel HUGEPAGE_PAGES = (HUGEPAGE / PAGE) = 2M/4K = 512. On a 64K kernel it’s 512M/64K = 8192. Because this large array is declared as a local variable and not malloc’ed, it’s allocated on the stack, so the code SEGVs at function entry when it tries to move the stack pointer by almost 10 MB.

The easiest workaround is to increase the size of your stack:

ulimit -S -s 200000

The better solution would be to allocate that array with malloc or some similar fix. We are engaging with the JEMalloc community to determine the best course of action.

This doesn’t imply that 64K kernels use more stack space. It’s just an odd quirk of this code that the amount of memory allocated depends on the ratio of huge page size to base page size, and that just happens to be a larger ratio with 64K kernels.

3 Likes

For me it was around 80% higher at idle.

1 Like

80% higher memory usage? While the computer was idle?

With the 4k kernel, for me it’s 8GB of RAM used by Ubuntu 24.04 or 22.04 a few minutes after boot, nothing else running, no startup apps. With the 64k kernel it’s 14GB of RAM. My M128-30 has 256GB of RAM.

That doesn’t seem right. On my systems it’s more like 1 GB used by the kernel after a fresh reboot. Does the memory use go down if you drop caches (echo 3 > /proc/sys/vm/drop_caches)? Does “top” report an application using a lot of memory (type “M” to sort by memory use)? Does “cat /proc/meminfo” tell you where the memory use is?

1 Like