Leaving C++03: automation, part 1 — Implementing QuantLib

Hello.

In this post, I get some help continuing my experiment in migrating our code to C++11. Namely, I describe how I enlisted clang-tidy to do some of the work for me.

But first, a couple of notes on last post. The first: the post deals mostly with trying to modify the main code so that it can optionally use std::shared_ptr but it keeps backwards compatibility with Boost pointers. Why not stay in the context of the experiment, and just replace the pointers without worrying about breaking code? (After all, this is what I’ll be doing in all future posts, including this one; once we use C++11 syntax, we can’t go back.) The reason is twofold. On the one hand, it would make it possible to start using the new pointers in future official releases, which is a point of interest in itself; and on the other hand, it might make it possible to keep integrating the experimental branch with the official code. Having std::shared_ptr in one branch and boost::shared_ptr in the other would prevent any future merge.

The second note: I’m not the first experimenting with the modernization of QuantLib. About a year ago Hao Zhang forked QuantLib, replaced every Boost feature with the corresponding standard class, and also moved to CMake for building the library and Catch for testing it. His code is on GitHub if you’re interested.

Back to today’s topic. I won’t go into explaining what clang-tidy is, or how to use it. I’ll stand on the shoulders of others which have already written about it; you can also read the clang-tidy documentation for more details on any of the checks. Here, I’ll report how I fared running different modernize checks on the QuantLib code base.

To put things in context: we’re talking of about 2500 source and header files for a total of some 500 kloc, of which 100k or so are actually data (mostly direction integers or primitive polynomial for low-discrepancy sequences, hard-coded as arrays in the source). I’ll use 2500 files and 400 kloc as a baseline to estimate how invasive any check is.

Finally, the version of clang-tidy I used was the 5.0.1, running in a Docker container based on the upcoming Ubuntu Bionic. I ran each check in isolation, using a combination of find and xargs to run clang-tidy on each .cpp and .hpp file in the QuantLib folder as, for instance,

  clang-tidy -checks=-*,modernize-use-auto -fix -header-filter=.* <filename> \
    -- -std=c++11 -I <quantlib root>

The above is not the only way to do it. I could have omitted the -header-filter=.*, causing the modernization to be applied only to the file being processed, and not to the headers it includes, that will be processed in their own turn; for some of the checks, this gives different results that I’ll describe. It’s also possible to integrate clang-tidy with a cmake build, but that would cause header files to not be processed if they’re not included in any compilation unit. (Gasp! Does this mean that we don’t have full coverage? Yes, it does.)

As I write this, I was able to run about half of the available checks; I’ll be back for more in a later post. The summary so far is as follows:

	files	lines	compiles	fails PR merges
`modernize-use-override`	788	3302	no	11/43
`modernize-use-auto`	260	485	yes
`modernize-use-equals-default`	182	237	yes	1/43
`modernize-use-nullptr`	149	243	no
`modernize-use-emplace`	87	1724	yes
`modernize-use-default-member-init`	46	126	yes	1/43
`modernize-redundant-void-arg`	4	5	yes
`modernize-use-equals-delete`	3	4	no
`modernize-raw-string-literal`	3	3	yes
`modernize-use-noexcept`	2	3	yes
`modernize-deprecated-headers`	2	2	yes
`modernize-replace-random-shuffle`	1	20	yes

where the files column contains the number of files touched, the lines column contains the number of lines touched, compiles shows whether or not the code compiled after the modernization, and fails PR merges shows if any of my sample set of pull requests (see last post) failed to merge against the modernized code.

Ideally, we would have wanted all “yes” in the compile column and no failed merges in the last one, but that wasn’t always the case. The pull requests failing to merge can be fixed with a bit of straightforward, if tedious, work; and frankly, I expected more of them, so I won’t complain. The compilation failures, instead, were unexpected, given that clang-tidy is based on the same technology used by clang and thus should be able to verify that the transformed code works the same as the original. What happened, then?

When running modernize-use-override, the code managed to trick the tool. We have a template class PiecewiseDefaultCurve in the library which works a bit like this:

    template <class T>
    class Tricky : public typename T::base {
        // ... other stuff
        double foo() const;
        double bar() const;
        double baz() const;
    }

that is, its base class depends on its template parameter. Sometimes, the chosen base class declares virtual foo, bar and baz methods; and sometimes, it only declares foo and bar. When processing the first instantiation, clang-tidy adds override to all three methods; after which, the compiler gives an error on the second instantiation because baz is marked override but it’s not declared in the base class.

The problem seems to be easily fixed; we can remove the override from the third method, and the code compiles again. We’d probably need a couple of extra steps, though. First, removing the override manually would cause the problem to resurface if we were to run the checks again (or again and again, in case we wanted to add a tidy step to our Travis run); thus, we also need to disable the check for that method, which we can do by adding a comment containing NOLINT to its declaration. Second, I’m a stickler for running the compilation with -Wall -Werror, and recent versions of clang have an “inconsistent missing override” warning which is included in -Wall and would be triggered if we added override to only two out of the three methods defined in the base class. Therefore, we’d also have to add a pragma around the declaration to disable the warning.

A last note: running clang-tidy without -header-filter=.* doesn’t add override to the methods, since the header is only changed when processed on its own (and therefore, without information on the possible base classes). This prevents the compilation error, but still causes the warning; the tricky class also inherits from another base class, not dependent on the template parameter, and its virtual methods are marked with override.

The problem with modernize-use-nullptr was that we had probably used incorrectly the interface of boost::function, and that bit us back. After modernization, the initialization of such a function failed to compile; it turns out that both g++ and clang happily accept

    boost::function<void()> f(NULL);

but refuse to compile

    boost::function<void()> f(nullptr);

I haven’t investigated which constructor takes NULL, and why it doesn’t take nullptr instead. Frankly, in the abridged words of Rhett Butler, I don’t care. The intent here was to create an uninitialized function object, and that can be expressed more clearly by using the default constructor as in

    boost::function<void()> f;

Once I fixed that, the code compiled but I got a number of errors in the test suite. The problem was, again, with boost::function. Instead of using the provided empty() method to check if a function was initialized, we had compared it with NULL. Unfortunately, given an uninitialized function f, the expression f == NULL evaluates to true but f == nullptr evaluates to false. Again, the problem can be avoided by using the provided interface and writing f.empty() instead.

Both fixes make sense independently of my experiment, and have now been merged in the master branch. They will be included in the next release. For completeness, I’ll mention that this happened with version 1.66 of Boost, which was the latest one when I ran the checks; I haven’t tried it with other versions.

As a final note on nullptr, I didn’t expect the change to touch so many files. Apparently, we developed a habit of checking a pointer by writing if (p == 0) or if (p == NULL) instead of the shorter if (p), which wouldn’t need to be modernized.

The only other compilation error so far was after using modernize-use-equals-delete. As you can read from its documentation, the check uses a heuristics to avoid false positives. We managed to beat it by declaring a class with all methods implemented in its header, except for its private destructor. This can be avoided by moving some other method to the cpp file; I haven’t done it yet, though.

The other checks didn’t cause compilation errors; but that doesn’t mean I’m done with my commentary. However, I’m already past the 1500-words mark, so I think I’ll wrap this post here. I’ll be back with its sequel next week. If you have comments or suggestions in the meantime, tweet or email me.

Subscribe to my Substack to receive my posts in your inbox, or follow me on Twitter or LinkedIn if you want to be notified of new posts, or subscribe via RSS if you’re the tech type: the buttons for all that are in the footer. Also, I’m available for training, both online and (when possible) on-site: visit my Training page for more information.