The Apple M1, ARM/x86 Linux Virtualization, and BOINC

About [six months ago](/2020/06/21/apple-and-arm-transition/), I speculated a bit on what Apple might do with their upcoming (rumored at the time) ARM transition. Apple did it, has shipped hardware, and I’ve had a chance to play with it for a while now. I’ve also, as is usual for me, gone down some weird paths - like ARM Linux virtualization, x86 Linux emulation, and BOINC in an ARM VM!


This is a companion discussion topic for the original entry at https://www.sevarg.net/2021/01/09/arm-mac-mini-and-boinc/

Hello,

The commit “macOS: provide hint for PCore binding [do not upstream]” (imbushuo/qemu@70b95c7 on GitHub) contains a patch that you can use for PCore pinning on macOS.

As for Arm chips with TSO: some heavier-duty 64-bit Arm chips, like the Fujitsu A64FX, implement TSO in hardware (always on).

And the NVIDIA Tegra Xavier processor specifically (present in the Jetson AGX Xavier and the Jetson Xavier NX) is one of the very rare modern processors with sequential consistency as the memory model, which makes those concerns moot.

Thanks - unfortunately, I’ve already tried that method, and it doesn’t work reliably.

The magic code in that patch,

pthread_attr_set_qos_class_np(&attr, QOS_CLASS_USER_INTERACTIVE, 0);

is consistent with Apple’s scheduling guidelines, but I spent part of today messing with it, with the following results:

I tried adding the pthread_set_qos_class_self_np call to the start of qemu_thread_start in util/qemu-thread-posix.c. That function is definitely getting called, and it seems to be the only start routine handed to pthread_create, which, as I understand it, is how qemu creates its threads.
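For anyone wanting to repeat the experiment, here is a minimal sketch of the idea rather than the actual QEMU change. It assumes a trampoline in the same spirit as qemu_thread_start; the names thread_args, thread_trampoline, and demo_trampoline are hypothetical and not taken from the QEMU tree. The QoS call is guarded so the file still builds on non-Apple systems, where it is a no-op.

```c
#include <stdio.h>
#ifdef __APPLE__
#include <pthread.h>
#include <pthread/qos.h>
#endif

/* Hypothetical stand-in for the (start routine, argument) pair
 * that a thread wrapper carries into the new thread. */
typedef struct {
    void *(*start_routine)(void *);
    void *arg;
} thread_args;

/* Ask macOS for a QoS class for the calling thread.
 * Returns 0 on success; always 0 on platforms without QoS classes. */
static int request_qos_for_current_thread(void)
{
#ifdef __APPLE__
    return pthread_set_qos_class_self_np(QOS_CLASS_USER_INITIATED, 0);
#else
    return 0;
#endif
}

/* Has the signature pthread_create expects, so it can be passed
 * as the start routine: set the QoS first, then run the real work. */
void *thread_trampoline(void *opaque)
{
    thread_args *args = opaque;
    int rc = request_qos_for_current_thread();
    if (rc != 0) {
        fprintf(stderr, "QoS request failed: %d\n", rc);
    }
    return args->start_routine(args->arg);
}

/* Sample workload: increments the int it is given. */
static void *bump_counter(void *arg)
{
    int *counter = arg;
    (*counter)++;
    return arg;
}

/* Runs the trampoline on the calling thread, just to exercise it. */
int demo_trampoline(void)
{
    int counter = 0;
    thread_args args = { bump_counter, &counter };
    thread_trampoline(&args);
    return counter;
}
```

In real use, thread_trampoline would be the function passed to pthread_create. Swapping QOS_CLASS_USER_INITIATED for one of the other classes is how the variations below would be reproduced.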

Results with -smp 4:

QOS_CLASS_USER_INTERACTIVE: No meaningful change from default behavior. Still lives on the performance cores when in the foreground and the efficiency cores in the background.

QOS_CLASS_USER_INITIATED: Definite improvements. Remains on the performance cores far, far longer when minimized, but will still shuttle over to the efficiency cores at some point. Remains on the performance cores for a short period with the screen locked, but still ends up on the efficiency cores after a while.

QOS_CLASS_UTILITY: Same general behavior as QOS_CLASS_USER_INITIATED, still moves to the efficiency cores.

QOS_CLASS_BACKGROUND: Runs purely on the efficiency cores. Performance sucks. However, if you were on a laptop and really didn’t want your VM to impact battery life, this might be useful.

Results with -smp 8:

QOS_CLASS_USER_INTERACTIVE: Hangs out on the performance cores for a while, including for a reasonable period of time (I got bored of waiting) when minimized. Still migrates off to the efficiency cores after the screen has been locked for a while.

QOS_CLASS_USER_INITIATED: Still moves to the efficiency cores, but takes longer when minimized. Remains on the performance cores for a while with screen lock, but eventually migrates over.

QOS_CLASS_UTILITY: Still migrates all threads to efficiency cores after being minimized for a while.

QOS_CLASS_BACKGROUND: Launches all 8 threads on the efficiency cores.

I’m still looking for some way to keep a VM on the performance cores, even with the screen locked, for long periods of time.

I was not aware of the default-TSO in some of the ARM cores, though - thanks! Very interesting!

One of the things there is that those Macs don’t have a conventional sleeping mode.

An Apple Silicon Mac on sleep just means one with everything on the efficiency cores in low power mode (eventually, with most processes being paused depending on configuration). Like an iPhone, notifications instantly wake up the system - which is always kept online.

pmset should allow tweaking those, but I haven’t tried that on my side…

I don’t think that’s true if you’ve disabled sleep, but… worth poking at, I suppose.

It’s quite impressive. Given what I know of your history, with the vast majority of your general, day-to-day use easily fitting into 8GB of RAM: what are your feelings on how much of this relies on Apple’s “these are the only hardware configurations” approach and optimizing heavily for them? And how much do you think would be lost if they had to offer more flexibility (e.g. standard DDR4 SODIMMs) instead of the unified 8 or 16GB the way they’ve implemented it?

And even more so, from my brief research it appears the SSD (NVMe, I presume) is soldered on. How do you feel about that, and how much do you think it’ll affect medium- to longer-term usefulness in actual usage: downloads, uploads, and I presume some amount of media processing?

Apple’s “This way or the highway” thing has been getting stronger over time, but… if you’re going to do something annoying, at least make it useful. Apple’s closely coupled RAM and SSD have really, really made for some amazing performance, and I think that, if anything, we’re actually OK with less RAM than we used to be. One of the common cases for lots of RAM was “Disk is glacial, keep everything in RAM cache.” That’s no longer true, and with compressed swap technology now being used widely, RAM doesn’t matter as much as it used to. Except for VMs.

Unfortunately, you’re right about the SSD. On the M1s, everything is soldered on, as-is. I don’t like that, but Apple doesn’t have large numbers of phones or tablets failing from disk/RAM issues, so… it’s probably fine? I’d obviously rather have it be user-serviceable, but that’s not the direction things are going.

I see what you mean, and I suppose Apple’s primary audience is NOT likely to want to upgrade things such as storage (or to think they can), so they won’t care about it. Or Apple is just going to push iCloud synced storage even more heavily, if that’s possible.

We’re also at a point where fast external storage is a thing - so internal storage just doesn’t matter quite as much, as far as I’m concerned. You need enough internal storage to handle OS/applications, but if a lot of data files are external, well… OK. It just doesn’t matter as much as it used to.

Using BlackMagic Disk Speed Test, because I have it laying around:

  • Internal Storage on the M1: Write 2300 MB/s, Read 2900 MB/s.
  • An external OWC Thunderbolt enclosure with NVMe 1TB drive (about $150 for the whole thing): Read/Write around 750MB/s.

That’s still plenty fast enough for just about anything, and it’s far cheaper than internal storage. If your goal is maximum upvotes on Reddit for buying hardware, sure, it’s rubbish, but in terms of actual use, well… OK.

I also don’t take quite so hard a line as some on things like phone repair. I want them repairable - in that, with reasonable tools, you can repair them. But I’m also willing to look at what you gain for various things that make them more difficult to repair. I’d argue that Apple’s last few gens of iPhones being waterproof in the “You can get pushed into the pool and they’ll probably be fine” sense is a good tradeoff. They’re a bit more annoying to repair with the adhesive sealants, but you can get them open without heroics, and the common failure point (the screen being cracked) comes off first. I’ll take that set of tradeoffs, because it optimizes for eliminating a common failure (water damage) at the cost of making repair a bit more difficult.

The aarch64-softmmu build is working for me and I can successfully run Ubuntu, but the x86_64-softmmu and i386-softmmu builds won’t let me run anything. I get this error:

qemu-system-x86_64: qemu_mprotect__osdep: mprotect failed: Permission denied
**
ERROR:../tcg/tcg.c:734:tcg_region_init: assertion failed: (!rc)
Bail out! ERROR:../tcg/tcg.c:734:tcg_region_init: assertion failed: (!rc)

I’ve tracked down the exact location where things go wrong. This function calls mprotect and receives a non-zero return code.

util/osdep.c:

static int qemu_mprotect__osdep(void *addr, size_t size, int prot)
{
    g_assert(!((uintptr_t)addr & ~qemu_real_host_page_mask));
    g_assert(!(size & ~qemu_real_host_page_mask));

#ifdef _WIN32
    DWORD old_protect;

    if (!VirtualProtect(addr, size, prot, &old_protect)) {
        g_autofree gchar *emsg = g_win32_error_message(GetLastError());
        error_report("%s: VirtualProtect failed: %s", __func__, emsg);
        return -1;
    }
    return 0;
#else
    if (mprotect(addr, size, prot)) {
        error_report("%s: mprotect failed: %s", __func__, strerror(errno));
        return -1;
    }
    return 0;
#endif
}

I’m assuming this has to do with Apple’s memory protection as mentioned in the blog post.

I should also mention that I’m specifically compiling QEMU 5.2.0. This is because the patches provided in the blog post don’t merge cleanly with the current master (and I assumed that 5.2.0 would roughly match up with the master when the blog post was written).

If anyone could help me out here I would appreciate it greatly!

Oof. The joys of open source changing…

Try checking out @ e551455f1e7a3d7eee9e11e2903e4050bc5511ae - that’s what I built the patches on top of.

Unfortunately I get the same error. There must be something wrong with my system or one of my dependencies.

Turns out I can get everything working by configuring QEMU with --enable-debug-tcg… Not quite sure what’s going on.

git clone https://git.qemu.org/git/qemu.git
cd qemu
mkdir build
cd build
../configure --target-list=i386-softmmu --enable-cocoa --enable-debug-tcg
make -j 8

No patches required!

It would be nice to get it working without any debug stuff in the way, but this investigation is now beyond my pay grade.

The joys of bleeding edge open source. :wink:

Hopefully those patches will get sucked into master soon so they won’t be needed.

Thanks for that tip. It allowed me to apply the patches and build without error. But at runtime I am getting an error that

qemu-system-aarch64: -accel hvf: invalid accelerator hvf

It is strange because in the configure step it lists HVF as supported.

After spending much time playing around with qemu’s settings trying to get Ubuntu for amd64 to boot up, I stumbled across UTM (see the releases page of utmapp/UTM on GitHub), which has qemu compiled in. Using this I was able to get Ubuntu 20.04.2 desktop installed and running with 2 cores. It boots slowly, but after that it allows me to build legacy apps that require x64. I took a look at ACVM, but there was no easy way to select x86_64 emulation.

I also started getting the invalid accelerator hvf error. However, I found this helpful post.

Make sure to substitute the right paths in the following commands…

Build qemu:

git clone https://git.qemu.org/git/qemu.git
cd qemu
git checkout 56a11a9b7580b576a9db930667be07f1dd1564d5
curl https://patchwork.kernel.org/series/418581/mbox/ | git am
mkdir build
cd build
../configure --target-list=aarch64-softmmu --enable-hvf --disable-gnutls
make -j 8

Add to path (I have this in my .zshrc):

export PATH="/Users/will/Desktop/qemu/build:$PATH"

Then assuming you’re in a new directory with an Ubuntu image called focal-desktop-arm64.iso :

qemu-img create -f qcow2 disk.qcow2 20G

dd if=/dev/zero conv=sync bs=1m count=64 of=ovmf_vars.fd

qemu-system-aarch64 \
    -accel hvf \
    -m 4096 \
    -smp 4 \
    -cpu cortex-a57 -M virt,highmem=off  \
    -drive file=/Users/will/Desktop/qemu/build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
    -drive file=ovmf_vars.fd,if=pflash,format=raw \
    -serial telnet::4444,server,nowait \
    -drive if=none,file=disk.qcow2,format=qcow2,id=hd0 \
    -device virtio-blk-device,drive=hd0,serial="dummyserial" \
    -device virtio-net-device,netdev=net0 \
    -netdev user,id=net0 \
    -vga none -device ramfb \
    -cdrom focal-desktop-arm64.iso \
    -device usb-ehci -device usb-kbd -device usb-mouse -usb \
    -monitor stdio

After installation, just run with:

qemu-system-aarch64 \
    -accel hvf \
    -m 4096 \
    -smp 4 \
    -cpu cortex-a57 -M virt,highmem=off  \
    -drive file=/Users/will/Desktop/qemu/build/pc-bios/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
    -drive file=ovmf_vars.fd,if=pflash,format=raw \
    -serial telnet::4444,server,nowait \
    -drive if=none,file=disk.qcow2,format=qcow2,id=hd0 \
    -device virtio-blk-device,drive=hd0,serial="dummyserial" \
    -device virtio-net-device,netdev=net0 \
    -netdev user,id=net0 \
    -vga none -device ramfb \
    -device usb-ehci -device usb-kbd -device usb-mouse -usb \
    -monitor stdio

I installed the qemu binary via M1-native Homebrew instead of building from source.
But creating a qcow2 file with qemu-img fails as shown below. Any thoughts on this?

$ brew info qemu
qemu: stable 5.2.0 (bottled), HEAD
Emulator for x86 and PowerPC
https://www.qemu.org/
/opt/homebrew/Cellar/qemu/5.2.0 (161 files, 560MB) *
  Poured from bottle on 2021-02-26 at 09:41:33
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/qemu.rb
License: GPL-2.0-only
==> Dependencies
Build: libtool ✔, meson ✔, ninja ✔, pkg-config ✔
Required: glib ✔, gnutls ✔, jpeg ✔, libpng ✔, libssh ✔, libusb ✔, lzo ✔, ncurses ✔, nettle ✔, pixman ✔, snappy ✔, vde ✔
==> Options
--HEAD
	Install HEAD version
==> Analytics
install: 12,020 (30 days), 29,110 (90 days), 106,888 (365 days)
install-on-request: 11,955 (30 days), 28,968 (90 days), 104,695 (365 days)
build-error: 0 (30 days)

$ qemu-img create -f qcow2 disk.qcow2 10G
[1]    70497 killed     qemu-img create -f qcow2 disk.qcow2 10G

Does anything show up in dmesg output? “Killed” implies that a process was terminated somewhat against its will.
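For context on what the shell is reporting: zsh prints “killed” when a child is terminated by SIGKILL rather than exiting on its own. A minimal sketch of how a parent process sees that difference follows; the function name terminating_signal_of_demo_child is hypothetical, and the child here kills itself only as a stand-in for whatever terminated qemu-img.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that is terminated by SIGKILL and report what the
 * parent observes. Returns the terminating signal number, 0 if the
 * child exited normally, or -1 if fork/waitpid failed. */
int terminating_signal_of_demo_child(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        raise(SIGKILL); /* stand-in for an external kill */
        _exit(0);       /* not reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

The process itself gets no chance to print an error in this case, which is why the system logs are the place to look for the reason.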