Odds and ends: linear algebra
Greetings.
This week, some more content from Implementing QuantLib. I’ll look at some classes for linear algebra.
As I mentioned, I’ll publish one such post per week until around my next course, which is based on Implementing QuantLib and a number of hands-on exercises. It’s organized by MoneyScience, and you can find more info at the links in the banner above (or on my Training page if you’re reading this after the course and the banner is no longer there). Click them, I’ll wait.
And now, arrays and matrices.
Subscribe to my Substack to receive my posts in your inbox, or follow me on Twitter or LinkedIn if you want to be notified of new posts, or subscribe via RSS if you’re the tech type: the buttons for all that are in the footer. Also, I’m available for training, both online and (when possible) on-site: visit my Training page for more information.
Linear algebra
I don’t have a lot to write about the current implementation of the
Array
and Matrix
classes, shown in the following listing.
Their interface is what you would expect: constructors and assignment
operators, element access (the Array
class provides the a[i]
syntax; the Matrix
class provides both m[i][j]
and m(i,j)
,
because we aim to please), a bunch of arithmetic operators, all
working element by element as usual (it always bothered me that a*b
returns the element-wise product and not the dot product, but I seem
to be alone among programmers), and a few utilities. There are no
methods for resizing, or for other operations suited for containers,
because this classes are not meant to be used as such; they’re
mathematical utilities. Storage is provided by a simple scoped_ptr
,
managing the lifetime of the underlying memory.
In the case of Array
, we also provide a few functions such as
Abs
, Log
and their like; being good citizens, we’re not
overloading the corresponding functions in namespace std
because that’s forbidden by the standard. More complex functionality
(such as matricial square root, or various decompositions) can be
found in separate modules.
In short, a more or less straightforward implementation of arrays and
matrices. The one thing which is not obvious is the presence of the
Disposable
class template, which I’ll describe in more detail
in a further section of this appendix; for the time being, let me just
say that it’s a pre-C++11 attempt at move semantics.
The idea was to try and reduce the abstraction penalty. Operator
overloading is very convenient—after all, c = a+b
is much
easier to read than understand than add(a,b,c)
—but
doesn’t come for free: declaring addition as
means that the operator must create and return a new Array
instance, that is, must allocate and possibly copy its memory. When
the number of operations increases, so does the overhead.
In the first versions of the library, we tried to mitigate the problem
by using expression templates. The idea (that I will only describe
very roughly, so I suggest you read Veldhuizen, 2000 [1] for details)
is that operators don’t return an array, but some kind of parse tree
holding references to the terms of the expression; so, for instance,
2*a+b
won’t actually perform any calculation but only create a small
structure with the relevant information. It is only when assigned
that the expression is unfolded; and at that point, the compiler would
examine the whole thing and generate a single loop that both
calculates the result and copies it into the array being assigned.
The technique is still relevant today (possibly even more so, given
the progress in compiler technology) but we abandoned it after a few
years. It was difficult to read and maintain (compare the current
declaration of operator+
with
for a taste of what the code was like) and not all compilers were able
to process it, forcing us to maintain both expression templates and
the simpler implementation; therefore, when the C++ community
started talking of move semantics and some ideas for implementations
began to appear, we took the hint and switched to Disposable
.
As I said, compilers progressed a lot during these years; nowadays,
I’m guessing that all of them would support an expression-template
implementation, and the technique itself has probably improved.
However, if I were to write the code today (or if I started to change
things) the question might be whether to write classes such as
Array
or Matrix
at all. At the very least, I’d consider
implementing them in terms of std::valarray
, which is supposed
to provide facilities for just such a task; but in the end, I’d
probably go for some existing library such as uBLAS, which is
available in Boost, is written by actual experts in numerical code,
and that we already use in some parts of the library for specialized
calculations.
Bibliography
[1] T. Veldhuizen, Techniques for Scientific C++. Indiana University Computer Science Technical Report TR542, 2000.