Testing and automation. These two are key to ensuring high quality of software releases.
Ever since I worked briefly in the team at MySQL AB that is responsible for creating the binary (and source) packages of MySQL releases, I have had the vision of a fully automated release procedure. Whenever someone pushes a new commit to the release branch revision control tree, the continuous integration test framework should kick in and do all the steps needed for producing release packages:
To do this efficiently, clearly the use of virtual machines is needed. This weekend I played with KVM and Buildbot, and managed to set up a proof-of-concept of this that I am really pleased with.
There are lots of options for virtualisation these days, including KVM, Xen,
VirtualBox, and VMware. I use KVM, and I really like it. The integration into
the distributions is excellent (sudo apt-get install kvm and
you're up and running). The interface is powerful and flexible, and at the
same time really simple to learn and use. Just a couple of commands with man
pages, like it should be in a Unix system.
I started by installing a basic Ubuntu Jaunty server in a virtual machine:
  qemu-img create -f qcow2 vm-jaunty-i386-base.qcow2 8G
  kvm -m 2047 -hda vm-jaunty-i386-base.qcow2 -cdrom ubuntu-9.04-server-i386.iso \
    -boot d -smp 2 -cpu qemu32,-nx -net nic,model=virtio -net user -redir tcp:2222::22
I use the user mode network stack with port forwarding for ssh access. This
allows running kvm without root privileges, avoids any need to manage different
MAC addresses, and avoids the need for routing or configuring interfaces. In
short, it is nice and simple and just works :-).
Using the virtio network driver greatly improved throughput for
me when copying things into and out of the virtual machine. The -cpu
qemu32,-nx option (disable "No eXecute" support) is needed in this case due to
some bug or incompatibility; without it the installation hangs upon reboot. As
usual, Google is your friend in cases like this.
Incidentally, I did this using remote X over an SSH connection. This works fine; there is no need for physical access to the host server. After installation we will run the virtual machine without a graphical console, but it was just easier to use the stock Ubuntu installer than trying to find a way to install over the emulated serial port.
Next I did some basic preparation to make the installed virtual machine work well for command line and script usage. However, I kept the number of extra packages to a minimum, so that the tests can catch unwanted dependencies.
I installed an SSH server for remote access. I then set it up to use the serial
console (as we will be running kvm in -nographic mode). To get a login prompt
on serial port 0, create /etc/event.d/ttyS0:
    start on stopped rc2
    start on stopped rc3
    start on stopped rc4
    start on stopped rc5
    stop on runlevel 0
    stop on runlevel 1
    stop on runlevel 6
    respawn
    exec /sbin/getty 115200 ttyS0
To get the kernel to output its boot log to the serial port, edit the kernel
line in /boot/grub/menu.lst, removing quiet splash
and adding console=ttyS0,115200n8 console=tty0. To get Grub to
use the serial port, add these lines to /boot/grub/menu.lst:
    serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1
    terminal --timeout=3 serial console
Next, we need a user account inside the virtual machine that we can use from
the outside with passwordless login and sudo access. Inside the
guest, create the account and grant passwordless sudo:
    sudo adduser --disabled-password buildbot
    sudo adduser buildbot sudo
    sudo visudo
    # uncomment `%sudo ALL=NOPASSWD: ALL'
Then, on the host, create an SSH public/private key pair without a passphrase:
    ssh-keygen -t dsa
Copy the resulting ~/.ssh/id_dsa.pub from the host
into ~/.ssh/authorized_keys in the guest.
Now we should be able to test that things work:
    kvm -m 2047 -hda /kvm/vms/vm-jaunty-i386-makedist.qcow2 \
        -redir 'tcp:2222::22' -boot c -smp 2 -cpu qemu32,-nx -nographic \
        -net nic,model=virtio -net user
    # We should get a login prompt in the terminal window
    ssh -p 2222 buildbot@127.0.0.1 'sudo id'
    # We should get root access without login or sudo asking for password.
We now have the basis for scripting actions against the virtual machine: we
can start up the guest from the command line (and shut it down
with kill from the host or sudo shutdown -h now from
the guest). And we can run commands inside the guest using
ssh -p 2222 buildbot@127.0.0.1. The next step is to create
variants of this base virtual installation for the different purposes we need.
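The two commands that everything else builds on can also be captured as argument lists, which is handy when driving them from Python rather than from a shell string. This is only a sketch: kvm_command and ssh_command are hypothetical helper names, and their defaults simply mirror the options used in this post.

```python
def kvm_command(image, ssh_port=2222, memory_mb=2047, cpus=2):
    """Return the argv for booting `image` headless, with the options from this post."""
    return [
        "kvm",
        "-m", str(memory_mb),
        "-hda", image,
        # Forward a host port to the guest's ssh port (user mode networking).
        "-redir", "tcp:%d::22" % ssh_port,
        "-boot", "c",
        "-smp", str(cpus),
        "-cpu", "qemu32,-nx",
        "-nographic",
        "-net", "nic,model=virtio",
        "-net", "user",
    ]


def ssh_command(remote_command, ssh_port=2222, user="buildbot"):
    """Return the argv for running `remote_command` inside the guest."""
    return ["ssh", "-p", str(ssh_port), "%s@127.0.0.1" % user, remote_command]


print(" ".join(kvm_command("/kvm/vms/vm-jaunty-i386-makedist.qcow2")))
print(ssh_command("sudo id"))
```

Passing lists like these to subprocess avoids one layer of shell quoting, and a different ssh_port per guest should be all that is needed to tell several guests apart.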
The qcow2 virtual hard disk image format used by qemu (and kvm) has a very
powerful feature, activated with the -b option of
qemu-img create:
    qemu-img create -b vm-jaunty-i386-base.qcow2 -f qcow2 vm-jaunty-i386-makedist.qcow2
This creates a new image vm-jaunty-i386-makedist.qcow2, which is
initially a clone of the base image vm-jaunty-i386-base.qcow2
that takes up (almost) no extra space. But as we use this new image, changes
are added in the new image (copy-on-write), without modifying the original
base image. This allows painless mass cloning and modification of virtual
machines without having to re-install, and without taking up unnecessary extra
disk space and I/O for copying images.
We use this to create a virtual machine that we will use to produce the source tarball from bzr sources. This needs installing bzr and some development packages (compilers etc).
    sudo apt-get install bzr
    sudo apt-get build-dep mysql-server-5.1
(Note the very nice build-dep feature of apt-get; it
installs the ton of packages needed to build the MySQL server, and
MariaDB has the same dependencies.) I also copied in an existing shared bzr
repository; this is not strictly necessary, but saves a very painful initial
cloning of the entire MariaDB repository from Launchpad (bzr is just painfully
slow on source trees of the size of MariaDB/MySQL):
    scp -rp -P 2222 .bzr buildbot@127.0.0.1:
Another virtual machine image is set up for building the binary packages (this does not need bzr):
    qemu-img create -b vm-jaunty-i386-base.qcow2 -f qcow2 vm-jaunty-i386-build.qcow2
(with a bit more planning, I could have cloned -makedist
from -build; now I just repeated the install
of mysql-server-5.1 dependencies, but not the bzr install).
Finally, a third image for testing installation:
    qemu-img create -b vm-jaunty-i386-base.qcow2 -f qcow2 vm-jaunty-i386-install.qcow2
I will be testing a bintar package install, so create the mysql user and
group:
     sudo adduser --system --group mysql
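The three clones above all follow the same pattern, so the commands can be generated rather than typed. A minimal sketch, assuming the *-base.qcow2 naming used in this post; clone_command and variant_image are hypothetical helpers, and the sketch only builds the argument lists without running qemu-img.

```python
def clone_command(base_image, new_image):
    """Return the qemu-img argv for a copy-on-write clone of base_image."""
    return ["qemu-img", "create", "-b", base_image, "-f", "qcow2", new_image]


def variant_image(base_image, purpose):
    """Derive a variant file name by swapping the '-base' suffix (assumed naming)."""
    suffix = "-base.qcow2"
    if not base_image.endswith(suffix):
        raise ValueError("expected an image named *-base.qcow2: %r" % base_image)
    return base_image[:-len(suffix)] + "-%s.qcow2" % purpose


BASE = "vm-jaunty-i386-base.qcow2"
for purpose in ("makedist", "build", "install"):
    # Print the command instead of running it; pass the list to subprocess to execute.
    print(" ".join(clone_command(BASE, variant_image(BASE, purpose))))
```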
With these preparations, we should be ready to put the pieces together:
For MariaDB, we use Buildbot for continuous integration testing. The Pushbuild system I developed at MySQL was never released publicly, and in any case it is better to use a general tool like Buildbot that is widely used and maintained by a large community.
I have been very satisfied with Buildbot. It has its quirks and bugs, but we can fix those over time (and have fixed a number of them already, as well as added extra features we needed). I think Buildbot has all of the right ideas for doing serious continuous integration testing. As I read in some presentation, running the builds and tests is the easy part. The hard part is providing the information and tools needed by developers to fix problems that are found by testing. Fixing these problems is what it is all about, after all, not just producing pretty status reports.
First, I installed a buildbot slave on the host machine:
    sudo apt-get install buildbot
    sudo addgroup buildbot kvm  # To allow buildbot to run kvm
    sudo -u buildbot buildbot create-slave --usepty=0 /var/lib/buildbot/maria-slave hasky.askmonty.org:9989 knielsen-kvm-x86 <password>
Then I set up an account for this in the Buildbot master, and configured the
builder.
With the above preparation, configuring the build is just setting up the
proper shell commands to be run against the slave, although it is of course a
bit more involved than for a normal configure+make. I really like
the simplicity of this. Basically, after initial preparation of the KVM
images, there is very little setup required on the buildbot slave host; it is
all just normal shell commands configured on the master. Of course, going
forward we can refine some of this and maybe put some of it into generic
scripts called from the main config, but for a proof-of-concept I think it is
brilliant that one can see exactly which commands are run.
I have included the complete config in full detail at the end of this post, but here are the main points.
f_kvm_jaunty_x86.addStep(Compile(
        logfiles={"kernel": "kernel_2222.log"},
        command=["sh", "-c", """
kill -9 "$(cat kvm_pid_2222)"
(exec sh -c "echo \$\$ > 'kvm_pid_2222' ; exec kvm -m 2047 -hda /kvm/vms/vm-jaunty-i386-makedist.qcow2 -redir 'tcp:2222::22' -boot c -smp 2 -cpu qemu32,-nx -nographic -net nic,model=virtio -net user" </dev/null >kernel_2222.log 2>&1) &
sleep 15
while : ; do ssh -o ConnectTimeout=4 -p 2222 buildbot@127.0.0.1 true && break; sleep 2; done
ssh -p 2222 buildbot@127.0.0.1 'mkdir -p buildbot && cd buildbot && rm -Rf build && bzr co "lp:~maria-captains/maria/mariadb-5.1-knielsen" build && cd build && BUILD/compile-dist && make dist && mv "$(make show-dist-name).tar.gz" ..'
"""]))
The kill command removes any left-over kvm process from a previous run
(better safe than sorry). We run kvm in the background, getting the console
output through a log file. Note that redirecting the kvm output is necessary,
as the buildstep will wait for all processes to close their stdout before
considering the buildstep done.
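The same idea (start a process in the background, detach it from our own stdin and stdout, and remember its pid) looks like this in Python. It is a sketch for POSIX systems, not what the build step actually runs; start_background and is_running are hypothetical names.

```python
import os
import subprocess


def start_background(argv, log_path, pid_path):
    """Start argv detached from our stdio, appending its output to log_path."""
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,   # like </dev/null
            stdout=log,                 # like >>log_path
            stderr=subprocess.STDOUT,   # like 2>&1
        )
    # Record the pid so that a later step can wait for, or kill, the process.
    with open(pid_path, "w") as pid_file:
        pid_file.write("%d\n" % proc.pid)
    return proc


def is_running(pid):
    """Equivalent of `kill -0 pid`: True if a process with this pid exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Because the child writes to the log file rather than to an inherited pipe, the caller is free to return while the child keeps running.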
After starting the virtual machine, we wait for boot to have completed by
checking for successful ssh connection in the while loop. Once it
is ready, we send the commands to build the source tarball into the guest
using ssh.
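The while loop in the step above polls forever, which is fine for a proof of concept but would hang the build if the guest never comes up. Here is a sketch of the same polling with a timeout added; the timeout is my addition for illustration, and wait_until and ssh_is_up are hypothetical names.

```python
import subprocess
import time


def wait_until(probe, timeout=300.0, interval=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def ssh_is_up(port=2222, target="buildbot@127.0.0.1"):
    """True once a trivial command succeeds over ssh, like the shell loop's test."""
    argv = ["ssh", "-o", "ConnectTimeout=4", "-p", str(port), target, "true"]
    return subprocess.call(argv) == 0


# Usage (not executed here, since it needs a running guest):
#     if not wait_until(ssh_is_up, timeout=120):
#         raise SystemExit("guest did not boot in time")
```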
f_kvm_jaunty_x86.addStep(SetProperty(
        property="distname",
        command=["ssh", "-p", "2222", "buildbot@127.0.0.1", "cd buildbot/build && make show-dist-name"],
        ))
This gets the base name of the source tarball into a Buildbot build
property, an essential feature of Buildbot for more advanced usage. We
will need this name in the following build steps (the name depends on the
version of the MariaDB server code).
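Conceptually, such a step runs a command and stores what it printed under a name. The following is a rough plain-Python analogy, not Buildbot's implementation; capture_property is a hypothetical helper, and a local Python one-liner with a made-up version string stands in for the ssh command.

```python
import subprocess
import sys


def capture_property(properties, name, argv):
    """Run argv and store its whitespace-stripped standard output under `name`."""
    output = subprocess.check_output(argv)
    properties[name] = output.decode("utf-8").strip()
    return properties[name]


props = {}
# Stand-in for: ssh -p 2222 buildbot@127.0.0.1 'cd buildbot/build && make show-dist-name'
# The version string is hypothetical.
capture_property(props, "distname",
                 [sys.executable, "-c", "print('mariadb-5.1.38')"])
print(props)
```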
f_kvm_jaunty_x86.addStep(ShellCommand(
        command=["sh", "-c", WithProperties("""
scp -P 2222 buildbot@127.0.0.1:buildbot/%(distname)s.tar.gz .
ssh -p 2222 buildbot@127.0.0.1 'sudo shutdown -h now'
while : ; do sleep 5; kill -0 "$(cat kvm_pid_2222)" || break; done
rm -f kvm_pid_2222
""")],))
We copy out the generated source tarball (we will need it in the next
buildstep, which runs in a different virtual machine). We then shut down this
guest, and wait for it to finish with another while loop. Note
the use of WithProperties to interpolate the source tarball name
obtained in the previous build step.
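The %(distname)s placeholders follow Python's dictionary-style string formatting, so the effect can be tried out in plain Python. In this sketch, interpolate is a hypothetical stand-in for what happens to the command string, and the distname value is made up.

```python
def interpolate(template, properties):
    """Fill %(name)s placeholders from a dict, as Python's % operator does."""
    return template % properties


template = "scp -P 2222 buildbot@127.0.0.1:buildbot/%(distname)s.tar.gz ."
properties = {"distname": "mariadb-5.1.38"}  # hypothetical value
print(interpolate(template, properties))
# prints: scp -P 2222 buildbot@127.0.0.1:buildbot/mariadb-5.1.38.tar.gz .
```

One consequence in plain Python is that a literal percent sign in such a template has to be written as %%.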
f_kvm_jaunty_x86.addStep(Compile(
        command=["sh", "-c", WithProperties("""
qemu-img create -b /kvm/vms/vm-jaunty-i386-build.qcow2 -f qcow2 vm-tmp-2222.qcow2
kill -9 "$(cat kvm_pid_2222)"
(exec sh -c "echo \$\$ > 'kvm_pid_2222' ; exec kvm -m 2047 -hda vm-tmp-2222.qcow2 -redir 'tcp:2222::22' -boot c -smp 2 -cpu qemu32,-nx -nographic -net nic,model=virtio -net user" </dev/null >>kernel_2222.log 2>&1) &
# ...
ssh -p 2222 buildbot@127.0.0.1 'rm -Rf buildbot && mkdir buildbot'
scp -P 2222 %(distname)s.tar.gz buildbot@127.0.0.1:buildbot/
ssh -p 2222 buildbot@127.0.0.1 'cd buildbot && tar zxf %(distname)s.tar.gz && cd %(distname)s && ./configure ...'
# ...
""")],))
Here (and in the following install step), we use
qemu-img create -b to create a new, temporary image to work
in. This ensures that each build will run in a clean, fresh install, without
any risk of contamination from previous builds. (The reason we did not do this
for the initial step is that we want to save the bzr revisions pulled from
Launchpad so we do not have to keep repeatedly pulling the old ones over for
each new build. An alternative would be to keep the permanent shared
repository on the host machine and export from that inside the virtual
machine).
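Since every step repeats the same boot, wait, and shutdown boilerplate, one possible refinement is to generate the shell text from a small function in the master config. This is only a sketch of that idea: vm_step_script is a hypothetical helper, it always appends to the kernel log (the first step above truncates it instead), and it assumes the guest commands contain no single quotes.

```python
def vm_step_script(image, guest_commands, port=2222, fresh=False):
    """Return shell text that boots `image`, runs commands over ssh, and shuts down."""
    pidfile = "kvm_pid_%d" % port
    logfile = "kernel_%d.log" % port
    target = "buildbot@127.0.0.1"
    lines = []
    if fresh:
        # Work in a throwaway copy-on-write clone so the prepared image stays clean.
        tmp = "vm-tmp-%d.qcow2" % port
        lines.append("qemu-img create -b %s -f qcow2 %s" % (image, tmp))
        image = tmp
    kvm = ("kvm -m 2047 -hda %s -redir 'tcp:%d::22' -boot c -smp 2 "
           "-cpu qemu32,-nx -nographic -net nic,model=virtio -net user"
           % (image, port))
    # Remove any left-over guest, then start the new one in the background.
    lines.append('kill -9 "$(cat %s)"' % pidfile)
    lines.append('(exec sh -c "echo \\$\\$ > \'%s\' ; exec %s" </dev/null >>%s 2>&1) &'
                 % (pidfile, kvm, logfile))
    lines.append("sleep 15")
    lines.append("while : ; do ssh -o ConnectTimeout=4 -p %d %s true && break; "
                 "sleep 2; done" % (port, target))
    for command in guest_commands:
        # Placeholders such as %(distname)s pass through untouched.
        lines.append("ssh -p %d %s '%s'" % (port, target, command))
    lines.append("ssh -p %d %s 'sudo shutdown -h now'" % (port, target))
    lines.append('while : ; do sleep 5; kill -0 "$(cat %s)" || break; done' % pidfile)
    lines.append("rm -f %s" % pidfile)
    return "\n".join(lines) + "\n"


print(vm_step_script("/kvm/vms/vm-jaunty-i386-build.qcow2",
                     ["rm -Rf buildbot && mkdir buildbot"], fresh=True))
```

The returned text could then be wrapped in WithProperties and passed as the third element of a ["sh", "-c", ...] command, exactly like the hand-written strings.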
And that's it! Full config details below, but it is basically the same, just with different commands run in the different steps. The result is a builder that fully automatically tests build and install on real machines with the correct setup, 100% repeatable between builds.
This is just a quick proof of concept, but I think all of the essential ingredients are in there. I am hoping we will in the not too distant future be using something like this regularly to check MariaDB release builds, which should be very good for ensuring both the quality and efficiency of future MariaDB releases!
f_kvm_jaunty_x86 = factory.BuildFactory()
f_kvm_jaunty_x86.addStep(Compile(
        description=["making", "dist"],
        descriptionDone=["make", "dist"],
        logfiles={"kernel": "kernel_2222.log"},
        command=["sh", "-c", """
kill -9 "$(cat kvm_pid_2222)"
(exec sh -c "echo \$\$ > 'kvm_pid_2222' ; exec kvm -m 2047 -hda /kvm/vms/vm-jaunty-i386-makedist.qcow2 -redir 'tcp:2222::22' -boot c -smp 2 -cpu qemu32,-nx -nographic -net nic,model=virtio -net user" </dev/null >kernel_2222.log 2>&1) &
sleep 15
while : ; do ssh -o ConnectTimeout=4 -p 2222 buildbot@127.0.0.1 true && break; sleep 2; done
ssh -p 2222 buildbot@127.0.0.1 'mkdir -p buildbot && cd buildbot && rm -Rf build && bzr co "lp:~maria-captains/maria/mariadb-5.1-knielsen" build && cd build && BUILD/compile-dist && make dist && mv "$(make show-dist-name).tar.gz" ..'
"""
                 ],
        ))
f_kvm_jaunty_x86.addStep(SetProperty(
        property="distname",
        command=["ssh", "-p", "2222", "buildbot@127.0.0.1", "cd buildbot/build && make show-dist-name"],
        ))
f_kvm_jaunty_x86.addStep(ShellCommand(
        description=["copying", "tarball"],
        descriptionDone=["copying", "tarball"],
        logfiles={"kernel": "kernel_2222.log"},
        command=["sh", "-c", WithProperties("""
scp -P 2222 buildbot@127.0.0.1:buildbot/%(distname)s.tar.gz .
ssh -p 2222 buildbot@127.0.0.1 'sudo shutdown -h now'
while : ; do sleep 5; kill -0 "$(cat kvm_pid_2222)" || break; done
rm -f kvm_pid_2222
""")],
        ))
f_kvm_jaunty_x86.addStep(Compile(
        description=["making", "bintar"],
        descriptionDone=["make", "bintar"],
        logfiles={"kernel": "kernel_2222.log"},
        command=["sh", "-c", WithProperties("""
qemu-img create -b /kvm/vms/vm-jaunty-i386-build.qcow2 -f qcow2 vm-tmp-2222.qcow2
kill -9 "$(cat kvm_pid_2222)"
(exec sh -c "echo \$\$ > 'kvm_pid_2222' ; exec kvm -m 2047 -hda vm-tmp-2222.qcow2 -redir 'tcp:2222::22' -boot c -smp 2 -cpu qemu32,-nx -nographic -net nic,model=virtio -net user" </dev/null >>kernel_2222.log 2>&1) &
sleep 15
while : ; do ssh -o ConnectTimeout=4 -p 2222 buildbot@127.0.0.1 true && break; sleep 2; done
ssh -p 2222 buildbot@127.0.0.1 'rm -Rf buildbot && mkdir buildbot'
scp -P 2222 %(distname)s.tar.gz buildbot@127.0.0.1:buildbot/
ssh -p 2222 buildbot@127.0.0.1 'cd buildbot && tar zxf %(distname)s.tar.gz && cd %(distname)s && CC="gcc -static-libgcc" CXX="gcc -static-libgcc" CFLAGS="-O2 -fno-omit-frame-pointer -g" CXXFLAGS="-O2 -fno-omit-frame-pointer -g" ./configure --prefix=/usr/local/mysql --exec-prefix=/usr/local/mysql --libexecdir=/usr/local/mysql/bin --localstatedir=/usr/local/mysql/data --with-server-suffix=1 --with-comment="(MariaDB - http://askmonty.org/)" --with-system-type=linux-gnu --enable-shared --enable-static --enable-thread-safe-client --enable-local-infile --with-big-tables --with-libwrap --with-ssl --without-docs --with-readline --with-extra-charsets=all --with-embedded-server --with-libevent --with-partition --with-zlib-dir=bundled --with-plugins=max-no-ndb && make -j3 && sudo rm -Rf /usr/local/mysql && sudo make install && sudo mv /usr/local/mysql /usr/local/%(distname)s-Linux-x386 && tar zcf ../%(distname)s-Linux-x386.tar.gz -C /usr/local %(distname)s-Linux-x386/'
scp -P 2222 buildbot@127.0.0.1:buildbot/%(distname)s-Linux-x386.tar.gz .
ssh -p 2222 buildbot@127.0.0.1 'sudo shutdown -h now'
while : ; do sleep 5; kill -0 "$(cat kvm_pid_2222)" || break; done
rm -f kvm_pid_2222
""")],
        ))
f_kvm_jaunty_x86.addStep(Test(
        description=["testing", "bintar"],
        descriptionDone=["test", "bintar"],
        logfiles={"kernel": "kernel_2222.log"},
        command=["sh", "-c", WithProperties("""
qemu-img create -b /kvm/vms/vm-jaunty-i386-install.qcow2 -f qcow2 vm-tmp-2222.qcow2
kill -9 "$(cat kvm_pid_2222)"
(exec sh -c "echo \$\$ > 'kvm_pid_2222' ; exec kvm -m 2047 -hda vm-tmp-2222.qcow2 -redir 'tcp:2222::22' -boot c -smp 2 -cpu qemu32,-nx -nographic -net nic,model=virtio -net user" </dev/null >>kernel_2222.log 2>&1) &
sleep 15
while : ; do ssh -o ConnectTimeout=4 -p 2222 buildbot@127.0.0.1 true && break; sleep 2; done
ssh -p 2222 buildbot@127.0.0.1 'rm -Rf buildbot && mkdir buildbot'
scp -P 2222 %(distname)s-Linux-x386.tar.gz buildbot@127.0.0.1:buildbot/
ssh -p 2222 buildbot@127.0.0.1 'cd buildbot && sudo rm -Rf /usr/local/mysql /usr/local/%(distname)s-Linux-x386 && sudo tar zxf %(distname)s-Linux-x386.tar.gz -C /usr/local/ && sudo ln -s %(distname)s-Linux-x386 /usr/local/mysql && cd /usr/local/mysql && sudo chown -R mysql . && sudo chgrp -R mysql . && sudo bin/mysql_install_db --user=mysql && sudo chown -R root . && sudo chown -R mysql data mysql-test && cd mysql-test && sudo su -s /bin/sh -c "perl mysql-test-run.pl alias" mysql'
ssh -p 2222 buildbot@127.0.0.1 'sudo shutdown -h now'
while : ; do sleep 5; kill -0 "$(cat kvm_pid_2222)" || break; done
rm -f kvm_pid_2222
""")],
        ))
bld_kvm_jaunty_x86 = {'name': 'kvm-jaunty-x86',
                      'slavename': 'knielsen-kvm-x86',
                      'builddir': 'kvm-jaunty-x86',
                      'factory': f_kvm_jaunty_x86,
                      }
c['builders'].append(bld_kvm_jaunty_x86)