Valgrinding Drizzle

Like so many others, I got interested in the Drizzle project when it started. Some good ideas, lots of enthusiasm, and just pure GPL license, no “yes, we will take your work for free and sell proprietary licenses to it” SCA.

I even started contributing some development, fixing a number of Valgrind-detected bugs in Drizzle. I am proud that we kept the MySQL code 100% free of Valgrind errors, and wanted to help keep the same in Drizzle. So I debugged and fixed quite a few of the Valgrind-detected bugs that had crept in since forking from MySQL.

As I remember, I got down to two or three remaining or so. However, I it did discourage me somewhat to see how quickly these bugs had been allowed to enter the code. I remember one case where there was a Drizzle patch that had tried to simplify some field types. As I remember, the patch tried to simplify the code by eliminating some of multiple variants of string types. All well and good, but then there was one place where this elimination was a bit tricky, and the patch just #ifdef’ed out the offending part of the code, leaving the resulting code completely broken, as detected by Valgrind. And this had been in the source for 4 months! Cleaning up code is good, but not if only the easy 90% is done, and the rest is left undone. [Later the Drizzle people started the nonsense with “Drizzle is GPL, but contributions are considered licensed to Sun under BSD”, and I kind of lost interest.]

Anyway, so half a year later I though it would be interesting to see how the state of Valgrind is for Drizzle nowadays. So I branced the latest lp:drizzle (and lp:libdrizzle), built it, and ran the test suite under Valgrind:

    (cd tests && ./mtr --valgrind --force)

Unfortunately, the results are not good: 1900 Valgrind warnings!

The warnings are all kinds: Memory leaks, mismatched free()/delete, uninitialised memory accessed from system calls or conditional jumps, etc. Some are probably benign or even false positives from Valgrind. Some are probably minor bugs, like tiny memory leaks in seldom-used features or garbage in log output. And some are most definitely serious bugs in the code that need to be fixed. With a flood of 1900 errors, it is impossible to tell which is which without days of careful study of the errors and debugging of the code.

I hope Drizzle will fix these issues. I have a lot of experience with Valgrind, and I know how hard it is both to debug and fix the issues reported, and also to get all developers to understand the importance of not allowing code into the tree with Valgrind problems. But I have also learned how many real, serious problems Valgrind can detect, problems that are often impossible to otherwise catch during development. Valgrind warnings can be caused by benign problems, but they are very rarely false alarms. But it is important to fix problems quickly, otherwise the number of problems will pile up until the sheer mass of issues makes it impossible to ever get back to a clean state with zero warnings in the test suite.

(Drizzle has done other good stuff. Like Building with -Werror -pedantic -Wall. This is something I hope we can soon duplicate in MariaDB. We do have a clean Valgrind test run in MariaDB, and make sure we keep it by running every push through Valgrind in Buildbot).

4 comments

  1. Funny – we were just working on that…

    this past Friday. We have a tree that’s clean except for 2 or 3 issues and expect to have the rest cleaned on Monday before we push again. We’re also adding automated Valgrind to our set of tests that a branch must pass before it will hit the trunk.

    Thanks for noticing!

    http://mysql-ha.com/post/64

    Monty (the Drizzle one)

  2. Hi!

    We have only been doing spot checks on Drizzle. I noticed that as of a recent patch to use STL that we had gained a few more (unfortunately a lot of the older code is not C++ safe). Going forward we have added to our staging system a “valgrind” free requirement.

    Bug reports are always welcome.

    Cheers,
    -Brian

  3. On BSD

    Hi!

    The reason for BSD is that we simply prefer it at this point. Any code provided to Drizzle which is “derivative” continues to be GPL. All new code? Like for instance our Drivers?

    That is all BSD. Over time you will see more code that is BSD, and less that is GPL.

    Derivative code continues to be GPL, you will notice we have not changed any of the headers in regard to this.

    Why BSD? Simply put it the simplest way for us to avoid “licensing discussions”. This is not for just Sun’s benefit, I believe you would agree that our BSD based drivers are for the good of everyone.

    If someone comes along and modifies word done by another individual, I want everyone to be able to benefit from these changes. This way the original author can continue to make use of code they originally wrote, and share in any benefit that come from future work on it.

    Cheers,
    -Brian

Leave a Reply to knielsen Cancel reply

Your email address will not be published. Required fields are marked *