So… I’m insane. I want to build Linux from scratch - entirely - because I wonder if the ILP32 ABI on ARM is worth some performance. I think it is. And this has been done, but not recently.
If you’re familiar with the x32 ABI, skip the next section.
Typically, OSes are built either for a 32-bit or 64-bit environment. In a 32-bit environment, registers are 32-bit, memory addresses are 32-bit, pointers are 32-bit, etc. In a 64-bit environment, all of those are 64-bit. BUT: That means your pointers are now double the size - even if you’re not using that much memory in a process.
The x32 (on x86-64) and ILP32 (on 64-bit ARM) ABIs run the CPU in its 64-bit operating mode (so you get the increased register count, 64-bit-only extensions, bigger math, typically an improved ISA) - BUT, they use 32-bit memory addresses and pointers. So your process only has up to 4 GB of virtual address space, but if you use a lot of pointers (which many modern things like browsers do), every pointer takes half the space, and pointer-heavy data structures can get close to double the density in your caches. It’s a thing, and it’s a thing worth quite a bit of performance in some workloads.
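To put rough numbers on the cache density point, here is a back-of-the-envelope sketch in Python. It models C struct layout under plain natural alignment rather than asking a real compiler, and the tree node (three pointers plus a 32-bit key), the 64-byte cache line, and the function names are all assumptions made up for illustration.

```python
def struct_size(field_sizes):
    """Size of a C struct, assuming each field is aligned to its own size."""
    offset = 0
    max_align = 1
    for size in field_sizes:
        # Pad up to the next multiple of this field's alignment, then place it.
        offset = (offset + size - 1) // size * size
        offset += size
        max_align = max(max_align, size)
    # Tail padding so every element in an array of structs stays aligned.
    return (offset + max_align - 1) // max_align * max_align


def tree_node_size(pointer_bytes):
    # Hypothetical node: left, right and parent pointers plus a 32-bit key.
    return struct_size([pointer_bytes, pointer_bytes, pointer_bytes, 4])


CACHE_LINE = 64  # bytes; an assumed, typical line size

for abi, ptr in (("LP64", 8), ("ILP32", 4)):
    size = tree_node_size(ptr)
    print(f"{abi}: {size} bytes per node, {CACHE_LINE // size} nodes per cache line")
```

In this toy case the node shrinks from 32 bytes to 16, so twice as many fit in a cache line. A struct that is mostly non-pointer data would gain far less, which is why the benefit depends so much on the workload.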
Given my hobby of gutless wonder ARM systems, this might be worth some useful performance gains. I’ve seen claims of 20% on ARM, which, given the itty bitty little caches, seems plausible.
The problem is, there’s no current, documented process to do this. And I’m not even sure what will break.
So, I’m considering a Gentoo build on ARM, just to see if I can do it. I may experiment on qemu first just for sanity reasons, but… any advice here other than “Go for it and document it”?