Leaving C++03: automation, part 1
Hello.
In this post, I get some help continuing my experiment in migrating our code to C++11. Namely, I describe how I enlisted clang-tidy to do some of the work for me.
But first, a couple of notes on last post. The first: the post deals
mostly with trying to modify the main code so that it can optionally
use std::shared_ptr
but it keeps backwards compatibility with Boost
pointers. Why not stay in the context of the experiment, and just
replace the pointers without worrying about breaking code? (After
all, this is what I’ll be doing in all future posts, including this
one; once we use C++11 syntax, we can’t go back.) The reason is
twofold. On the one hand, it would make it possible to start using
the new pointers in future official releases, which is a point of
interest in itself; and on the other hand, it might make it possible
to keep integrating the experimental branch with the official code.
Having std::shared_ptr
in one branch and boost::shared_ptr
in the
other would prevent any future merge.
The second note: I’m not the first experimenting with the modernization of QuantLib. About a year ago Hao Zhang forked QuantLib, replaced every Boost feature with the corresponding standard class, and also moved to CMake for building the library and Catch for testing it. His code is on GitHub if you’re interested.
Back to today’s topic. I won’t go into explaining what clang-tidy is,
or how to use it. I’ll stand on the shoulders of others which have
already
written about it;
you can also read the
clang-tidy documentation
for more details on any of the checks. Here, I’ll report how I fared
running different modernize
checks on the QuantLib code base.
To put things in context: we’re talking of about 2500 source and header files for a total of some 500 kloc, of which 100k or so are actually data (mostly direction integers or primitive polynomial for low-discrepancy sequences, hard-coded as arrays in the source). I’ll use 2500 files and 400 kloc as a baseline to estimate how invasive any check is.
Finally, the version of clang-tidy I used was the 5.0.1, running in a
Docker container based on the upcoming Ubuntu Bionic. I ran each
check in isolation, using a combination of find
and xargs
to run
clang-tidy on each .cpp
and .hpp
file in the QuantLib folder as,
for instance,
The above is not the only way to do it. I could have omitted the
-header-filter=.*
, causing the modernization to be applied only to
the file being processed, and not to the headers it includes, that
will be processed in their own turn; for some of the checks, this
gives different results that I’ll describe. It’s also possible to
integrate clang-tidy with a cmake build, but that would cause header
files to not be processed if they’re not included in any compilation
unit. (Gasp! Does this mean that we don’t have full coverage? Yes, it
does.)
As I write this, I was able to run about half of the available checks; I’ll be back for more in a later post. The summary so far is as follows:
files | lines | compiles | fails PR merges | ||||
---|---|---|---|---|---|---|---|
modernize-use-override |
788 | 3302 | no | 11/43 | |||
modernize-use-auto |
260 | 485 | yes | ||||
modernize-use-equals-default |
182 | 237 | yes | 1/43 | |||
modernize-use-nullptr |
149 | 243 | no | ||||
modernize-use-emplace |
87 | 1724 | yes | ||||
modernize-use-default-member-init |
46 | 126 | yes | 1/43 | |||
modernize-redundant-void-arg |
4 | 5 | yes | ||||
modernize-use-equals-delete |
3 | 4 | no | ||||
modernize-raw-string-literal |
3 | 3 | yes | ||||
modernize-use-noexcept |
2 | 3 | yes | ||||
modernize-deprecated-headers |
2 | 2 | yes | ||||
modernize-replace-random-shuffle |
1 | 20 | yes | ||||
where the files column contains the number of files touched, the lines column contains the number of lines touched, compiles shows whether or not the code compiled after the modernization, and fails PR merges shows if any of my sample set of pull requests (see last post) failed to merge against the modernized code.
Ideally, we would have wanted all “yes” in the compile column and no failed merges in the last one, but that wasn’t always the case. The pull requests failing to merge can be fixed with a bit of straightforward, if tedious, work; and frankly, I expected more of them, so I won’t complain. The compilation failures, instead, were unexpected, given that clang-tidy is based on the same technology used by clang and thus should be able to verify that the transformed code works the same as the original. What happened, then?
When running modernize-use-override, the code managed to trick the
tool. We have a template class PiecewiseDefaultCurve
in the library
which works a bit like this:
that is, its base class depends on its template parameter. Sometimes,
the chosen base class declares virtual foo
, bar
and baz
methods;
and sometimes, it only declares foo
and bar
. When processing the
first instantiation, clang-tidy adds override
to all three methods;
after which, the compiler gives an error on the second instantiation
because baz
is marked override
but it’s not declared in the base
class.
The problem seems to be easily fixed; we can remove the override
from the third method, and the code compiles again. We’d probably
need a couple of extra steps, though. First, removing the override
manually would cause the problem to resurface if we were to run the
checks again (or again and again, in case we wanted to add a tidy step
to our Travis run); thus,
we also need to disable the check for that method, which we can do by
adding a comment containing NOLINT
to its declaration. Second, I’m
a stickler for running the compilation with -Wall -Werror
, and
recent versions of clang have an “inconsistent missing override”
warning which is included in -Wall
and would be triggered if we
added override
to only two out of the three methods defined in the
base class. Therefore, we’d also have to add a pragma around the
declaration to disable the warning.
A last note: running clang-tidy without -header-filter=.*
doesn’t
add override
to the methods, since the header is only changed when
processed on its own (and therefore, without information on the
possible base classes). This prevents the compilation error, but
still causes the warning; the tricky class also inherits from another
base class, not dependent on the template parameter, and its virtual
methods are marked with override
.
The problem with modernize-use-nullptr was that we had probably
used incorrectly the interface of boost::function
, and that bit us
back. After modernization, the initialization of such a function
failed to compile; it turns out that both g++ and clang happily accept
but refuse to compile
I haven’t investigated which constructor takes NULL
, and why it
doesn’t take nullptr
instead. Frankly, in the abridged words of
Rhett Butler, I don’t care. The intent here was to create an
uninitialized function object, and that can be expressed more clearly
by using the default constructor as in
Once I fixed that, the code compiled but I got a number of errors in
the test suite. The problem was, again, with boost::function
.
Instead of using the provided empty()
method to check if a function
was initialized, we had compared it with NULL
. Unfortunately, given
an uninitialized function f
, the expression f == NULL
evaluates to
true
but f == nullptr
evaluates to false
. Again, the problem
can be avoided by using the provided interface and writing f.empty()
instead.
Both fixes make sense independently of my experiment, and have now been merged in the master branch. They will be included in the next release. For completeness, I’ll mention that this happened with version 1.66 of Boost, which was the latest one when I ran the checks; I haven’t tried it with other versions.
As a final note on nullptr
, I didn’t expect the change to touch so
many files. Apparently, we developed a habit of checking a pointer by
writing if (p == 0)
or if (p == NULL)
instead of the shorter if
(p)
, which wouldn’t need to be modernized.
The only other compilation error so far was after using
modernize-use-equals-delete. As you can read from its
documentation,
the check uses a heuristics to avoid false positives. We managed to
beat it by declaring a class with all methods implemented in its
header, except for its private destructor. This can be avoided by
moving some other method to the cpp
file; I haven’t done it yet,
though.
The other checks didn’t cause compilation errors; but that doesn’t mean I’m done with my commentary. However, I’m already past the 1500-words mark, so I think I’ll wrap this post here. I’ll be back with its sequel next week. If you have comments or suggestions in the meantime, tweet or email me.
Subscribe to my Substack to receive my posts in your inbox, or follow me on Twitter or LinkedIn if you want to be notified of new posts, or subscribe via RSS if you’re the tech type: the buttons for all that are in the footer. Also, I’m available for training, both online and (when possible) on-site: visit my Training page for more information.