Monday, December 16, 2013

automatic bisection of qemu with console and monitor interaction

I had some fun debugging a performance issue after VM migration on v1.0 [1]. Because this was fixed in newer versions, a bisect was in order. Doing this by hand gets tedious quickly, so writing some scripts to drive 'git bisect run' really helps. Here I'll document how I did this in case others find it useful when trying to bisect issues in qemu.

The first step is getting the problem reproducible from the command line. You will also need the test case to print anything that can be used to decide pass or fail on the standard console output. This makes it easy to run the test case with expect. You will need to follow [2] on how to set up your VM to use serial output.
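As a rough illustration of what "pass or fail from console output" can look like, here is a minimal shell sketch that classifies a captured console log. The marker strings and the function name are hypothetical; use whatever your test case actually prints on the serial console.

```shell
#!/bin/sh
# Hypothetical markers: replace with the text your test prints on the console.
PASS_MARKER="TEST PASSED"
FAIL_MARKER="TEST FAILED"

# Read captured console output on stdin and print a one-word verdict.
console_verdict() {
    log=$(cat)
    case "$log" in
        *"$FAIL_MARKER"*) echo bad ;;      # an explicit failure wins
        *"$PASS_MARKER"*) echo good ;;
        *) echo unknown ;;                 # no marker seen, e.g. the guest never booted
    esac
}
```

With a saved log (the file name is an assumption) this would be used as `console_verdict < console.log`.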

Next, modify the expect script and make sure it works by running it on its own. Then do a coarse bisect by hand: identify which release versions pass and which fail. Once the problem is bracketed between two release tags, start a bisect between those tags.

Next run:
git bisect start
git bisect good <good tag>
git bisect bad <bad tag>
git bisect run ../bisect-run
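'git bisect run' decides what to do from the script's exit code: 0 marks the commit good, 125 skips it, and any other code from 1 to 127 marks it bad. The sketch below shows one way a wrapper could map a build and a test run onto those codes. It is not the actual bisect-run script from [3]; the configure flags are an assumption, and unlike the invocation above it takes the test command as arguments.

```shell
#!/bin/sh
# Sketch of a 'git bisect run' wrapper. The configure flags and the way the
# test command is passed in are assumptions, not the author's actual script.

# Translate build and test results into the exit codes 'git bisect run' reads:
#   0 = good, 1 = bad, 125 = cannot test this commit, skip it.
bisect_exit_code() {
    build_status=$1
    test_status=$2
    if [ "$build_status" -ne 0 ]; then
        echo 125
    elif [ "$test_status" -eq 0 ]; then
        echo 0
    else
        echo 1
    fi
}

main() {
    # Build the commit that git has checked out for this bisect step.
    build_status=0
    { ./configure --target-list=x86_64-softmmu && make; } >/dev/null 2>&1 || build_status=$?
    # Only run the test command when the build succeeded.
    test_status=0
    if [ "$build_status" -eq 0 ]; then
        "$@" || test_status=$?
    fi
    exit "$(bisect_exit_code "$build_status" "$test_status")"
}

# Do nothing unless a test command was given on the command line.
if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

With this variant the run step would look like `git bisect run ../bisect-run expect ../test.exp`, where test.exp is a hypothetical name for the expect script. Skipping commits that fail to build keeps unrelated build breakage from being blamed for the bug.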

I ran into a few gotchas, some of which may be bugs that really need fixing. Occasionally when running in '-noconsole' mode, I wouldn't see any prompt for a very long time. When re-running with '-vga std', I'd see that the guest was waiting at GRUB. You may need to adjust your timeouts so that you don't hit this when running without a VGA console.
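One possible safety net for hangs like this is a hard time limit around the whole test, so that a stuck guest skips the commit instead of stalling the bisect. This is only a sketch: it assumes GNU coreutils 'timeout' (which exits with 124 when the limit is hit), and the function name is made up for illustration.

```shell
#!/bin/sh
# Run a command with a time limit in seconds. Assumes GNU coreutils 'timeout'.
# Usage: run_with_timeout SECONDS command [args...]
run_with_timeout() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        return 125   # timed out: tell 'git bisect run' to skip this commit
    fi
    return "$status"
}
```

Treating a timeout as "skip" rather than "bad" is a design choice: a guest stuck at GRUB says nothing about the bug being bisected.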

You can find these scripts here [3].

Hopefully they will evolve a bit as they get used more and more.

set up external drives as raid1

I purchased a cheap two-drive USB enclosure in order to set up an external drive with RAID-1 so I could back up photos and recordings.

First, I formatted both drives. Then I ran extended SMART self-tests to ensure I had decent drives. With RAID-1 and two drives I can only tolerate a single drive failure.
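For reference, an extended self-test can be started and checked with smartctl from the smartmontools package. The sketch below assumes ATA-style self-test log output and uses a placeholder device name; some USB enclosures also need an extra '-d sat' option before smartctl can talk to the drive. The helper function name is hypothetical.

```shell
#!/bin/sh
# Start the extended (long) self-test; it runs inside the drive in the background:
#   sudo smartctl -t long /dev/sdx
# Once it has finished, check the most recent entry in the self-test log:
#   sudo smartctl -l selftest /dev/sdx | latest_selftest_ok && echo "drive looks OK"

# Succeeds if entry "# 1" (the newest) of the self-test log read on stdin
# is an extended test that completed without error.
latest_selftest_ok() {
    grep -q '^# *1 .*Extended offline.*Completed without error'
}
```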

Next ensure mdadm is installed.
sudo apt-get install mdadm

Determine which /dev devices the disks show up as (lsblk is one way to check).
Next, create the raid device pointing at the correct /dev entries, then create a filesystem on it.
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdx /dev/sdy
sudo mkfs.ext4 /dev/md0

The device will start a resync process, which on my system takes a really long time (days). If you want to skip this initial resync you can pass '--assume-clean' to 'mdadm --create'. I would recommend letting it resync.
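Resync progress shows up in /proc/mdstat. As a small sketch, the helper below pulls out just the percentage from that output; the function name is made up, and it assumes the usual "resync = N%" progress line.

```shell
#!/bin/sh
# Print the resync percentage found in mdstat-style text on stdin.
# Prints nothing when no resync is in progress.
resync_percent() {
    sed -n 's/.*resync *= *\([0-9.]*\)%.*/\1/p'
}

# Typical use while the array is syncing:
#   resync_percent < /proc/mdstat
# Fuller status is available from:
#   sudo mdadm --detail /dev/md0
```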

And there ya go.