Chapter 8, part 1 of n: the finite-differences framework.
Welcome back.
As I write, release candidates for QuantLib 1.6 have been out for a week or two (if you haven’t heard about this already, you might want to subscribe to the QuantLib mailing list). If you have a few cycles to spare, please download it, give it a try and report any problems to the mailing list. The official 1.6 release should be out shortly. (Brace yourself for new developments afterwards. Just saying.)
This week, some initial content from chapter 8 of my book. Yes, I finally sat down to write it. The first couple of subsections were pushed to the Leanpub edition a couple of weeks ago, with more coming during the summer (both there and here).
The finite-differences framework
The last framework I’ll be writing about was, in fact, the first to appear in the library. Of course, it was not a framework back then; it was just a bunch of code hard-coded inside a single option pricer.
Since then, the reusable parts of the code were reorganized and grew in complexity—probably to the point where it’s hard to find one’s way inside the architecture. A few years ago, a new version of the framework was contributed adding 2-dimensional finite-difference models (which were absent in the first version) and sporting a more modular architecture.
The two versions of the framework still coexist in the library. In this chapter, I’ll describe both of them—while hoping for the new one to replace the first some day.
The old framework
As I mentioned in a previous post, our master plan to create a common framework for trees and finite-differences models never came to fruition. Instead, the finite-differences code stayed closer to its underlying mathematics and implemented concepts such as differential operators, boundary conditions and so on. I’ll tackle one of them in each of the following subsections.
Differential operators
If I were to start at the very beginning, I’d describe the Array class here; but it behaves as you can imagine, so you’ll forgive me if I leave its description to a later post and just note that it’s used here to discretize a function \( f(x) \) on a grid \( \{ x_0 \ldots x_{N-1} \} \) as the array \( \mathbf{f} \), with \( \mathbf{f}_i = f(x_i) \) for each \( i \).
Onwards to operators. Now, I’m probably preaching to the choir, so I’ll keep the math to a minimum to avoid telling you lots of stuff you know already. (And if you don’t, there’s no way I can tell you all you need. You can refer to any of the several books on finite differences; for instance, Duffy, 2006 [1]).
In short: a differential operator transforms a function \( f(x) \) into one of its derivatives, say, \( f^{\prime}(x) \). The discretized version transforms \( \mathbf{f} \) into \( \mathbf{f^{\prime}} \), and since differentiation is linear, it can be written as a matrix. In general, the operator doesn’t give the exact discretization of the derivative but just an approximation; that is, \( \mathbf{f^{\prime}}_i = f^{\prime}(x_i) + \epsilon_i \), where the error terms can be reduced by decreasing the spacing of the grid.
QuantLib provides differential operators for the first derivative \( \partial/\partial x \) and the second derivative \( \partial^2/\partial x^2 \). They are called \( D_0 \) and \( D_+D_- \), respectively, and are defined as:
\[\frac{\partial f}{\partial x}(x_i) \approx D_0 \mathbf{f}_i = \frac{\mathbf{f}_{i+1}-\mathbf{f}_{i-1}}{2h} \qquad \frac{\partial^2 f}{\partial x^2}(x_i) \approx D_+D_- \mathbf{f}_i = \frac{\mathbf{f}_{i+1}-2\mathbf{f}_i+\mathbf{f}_{i-1}}{h^2}\]where \( h = x_i - x_{i-1} \) (we’re assuming that the grid is evenly spaced). Taylor expansion shows that the error is \( O(h^2) \) for both of them.
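To make the error estimate explicit: expanding \( f(x_i \pm h) \) around \( x_i \), the second-derivative operator gives

\[D_+D_- \mathbf{f}_i = \frac{f(x_i+h) - 2f(x_i) + f(x_i-h)}{h^2} = f^{\prime\prime}(x_i) + \frac{h^2}{12}f^{(4)}(x_i) + O(h^4)\]

and the same expansion applied to \( D_0 \) yields a leading error term of \( \frac{h^2}{6}f^{\prime\prime\prime}(x_i) \).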
As you can see, the value of the derivative at any given index \( i \) (except the first and last, which I’ll ignore for now) only depends on the values of the function at the same index, the one that precedes it, and the one that follows. This makes \( D_0 \) and friends tridiagonal: the structure of such operators can be sketched as
\[\left( \begin{array}{ccccccc} d_0 & u_0 & & & & & \\ l_0 & d_1 & u_1 & & & & \\ & l_1 & d_2 & u_2 & & & \\ & & \ddots & \ddots & \ddots & & \\ & & & l_{n-4} & d_{n-3} & u_{n-3} & \\ & & & & l_{n-3} & d_{n-2} & u_{n-2} \\ & & & & & l_{n-2} & d_{n-1} \end{array} \right)\]

Tridiagonal operators have a number of desirable properties, including the fact that a linear combination of tridiagonal operators (such as the one found, say, in the Black-Scholes-Merton equation) is still tridiagonal; thus, instead of using a generic Matrix class (which does exist in the library, but is not used here), we made them instances of a specific class called TridiagonalOperator, shown in the following listing.
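(The listing below is an abridged sketch of the interface, reconstructed from the discussion that follows; a few methods and details are omitted, so refer to the library for the full declaration.)

```cpp
class TridiagonalOperator {
  public:
    // constructors and factory method
    explicit TridiagonalOperator(Size size = 0);
    TridiagonalOperator(const Array& low, const Array& mid,
                        const Array& high);
    static Disposable<TridiagonalOperator> identity(Size size);
    // inspectors
    Size size() const;
    const Array& lowerDiagonal() const;
    const Array& diagonal() const;
    const Array& upperDiagonal() const;
    // modifiers
    void setFirstRow(Real, Real);
    void setMidRow(Size, Real, Real, Real);
    void setMidRows(Real, Real, Real);
    void setLastRow(Real, Real);
    // operator interface
    Disposable<Array> applyTo(const Array& v) const;
    Disposable<Array> solveFor(const Array& rhs) const;
    void solveFor(const Array& rhs, Array& result) const;
    // ...arithmetic operators, declared as friends, not shown...
  private:
    Array diagonal_, lowerDiagonal_, upperDiagonal_;
};
```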
As you can see from the private members, the representation of the operator reflects its tridiagonal structure: instead of storing all the mostly null elements of the matrix, we just store three arrays corresponding to the diagonal, lower diagonal, and upper diagonal; that is, the \( d_i \), \( l_i \) and \( u_i \) of the earlier sketch.
The constructors and the first bunch of methods deal with building and inspecting the operator. The first constructor sets the size of the matrix (that is, of the three underlying arrays) and trusts the values to be filled later by means of the provided setters: referring to the figure again, the setFirstRow method sets \( d_0 \) and \( u_0 \), the setLastRow method sets \( l_{n-2} \) and \( d_{n-1} \), setMidRow sets \( l_{i-1} \), \( d_i \) and \( u_i \) for a given \( i \), and the convenience method setMidRows sets them for all \( i \). The second constructor sets all three arrays in one fell swoop; and a factory method, identity, works as a third constructor by building and returning an operator with \( 1 \) on the diagonal and \( 0 \) elsewhere. The inspectors do the obvious thing, each returning the corresponding data member. Oftentimes, the returned values are wrapped in a Disposable class template, in a possibly misguided attempt to avoid a few copies. Again, details will be in a future post.
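As an example of the setters at work, here is how one might build the \( D_0 \) operator on a uniform grid of \( N \) points with spacing \( h \). This is a sketch for illustration (the library ships an equivalent DZero class, so you wouldn’t write this yourself):

```cpp
#include <ql/quantlib.hpp>
using namespace QuantLib;

TridiagonalOperator makeDZero(Size N, Real h) {
    TridiagonalOperator D0(N);
    D0.setFirstRow(-1/h, 1/h);              // one-sided difference at the left boundary
    D0.setMidRows(-1/(2*h), 0.0, 1/(2*h));  // central difference in the interior
    D0.setLastRow(-1/h, 1/h);               // one-sided difference at the right boundary
    return D0;
}
```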
The applyTo and solveFor methods are the meat of the interface. Given a tridiagonal operator L and an array u, L.applyTo(u) returns the result \( L u \) of applying \( L \) to \( u \), while L.solveFor(u) returns the array \( v \) such that \( u = L v \) (the overloaded call L.solveFor(u,v) does the same while avoiding an allocation). Both operations will be used later, and for a tridiagonal operator they both have the nice property of being \( O(N) \) in time, that is, of only taking time proportional to the size \( N \) of the operator. The algorithms were taken from the classic Numerical Recipes in C [2]. Not the code, of course: besides not being free, the original code had a number of endearing old conventions (such as numbering arrays from 1, like in Downton Abbey) that wouldn’t have matched the rest of the library.
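As a quick (and again, sketched) usage example, applying an operator and then solving with the same operator should return the original array, up to round-off:

```cpp
#include <ql/quantlib.hpp>
#include <iostream>
using namespace QuantLib;

int main() {
    Size N = 5;
    TridiagonalOperator L(N);        // the classic second-difference matrix;
    L.setFirstRow(2.0, -1.0);        // any invertible tridiagonal operator
    L.setMidRows(-1.0, 2.0, -1.0);   // would do just as well
    L.setLastRow(-1.0, 2.0);

    Array u(N, 1.0);
    Array w = L.applyTo(u);          // w = L u, computed in O(N) time
    Array v = L.solveFor(w);         // solves L v = w, also in O(N) time

    for (Size i=0; i<N; ++i)
        std::cout << v[i] << "\n";   // prints back 1 for each element
    return 0;
}
```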
Finally, the class defines a few overloaded arithmetic operators. They make it possible to write numerical expressions such as 2*L1 + L2 that might be used to compose tridiagonal operators. To this effect, the library also provides a few building blocks such as the \( D_0 \) and \( D_+D_- \) operators I mentioned earlier; for instance, given the Black-Scholes equation for \( f(t,x) \), with \( x = \log S \), written in operator form as

\[\frac{\partial f}{\partial t} = L_{BS} f \qquad \textrm{with} \qquad L_{BS} = -\left(r - q - \frac{\sigma^2}{2}\right)\frac{\partial}{\partial x} - \frac{\sigma^2}{2}\frac{\partial^2}{\partial x^2} + r\]

(here \( r \) is the risk-free rate, \( q \) the dividend yield, and \( \sigma \) the volatility) one could define the corresponding \( L_{BS} \) operator along the lines of the following sketch, which uses the library’s DZero and DPlusDMinus building blocks:
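```cpp
#include <ql/quantlib.hpp>
using namespace QuantLib;

// A sketch under the stated assumptions: DZero and DPlusDMinus are the
// library's discretizations of the first and second derivative on a
// uniform grid of N points with spacing h.
TridiagonalOperator makeBSOperator(Size N, Real h,
                                   Rate r, Rate q, Volatility sigma) {
    Real nu = r - q - 0.5*sigma*sigma;   // drift in log-space
    return -nu*DZero(N, h)
           - 0.5*sigma*sigma*DPlusDMinus(N, h)
           + r*TridiagonalOperator::identity(N);
}
```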
In this case, though, we didn’t eat our own dog food: the library does define a Black-Scholes operator (the BSMOperator class), but its coefficients are computed directly rather than obtained through composition as above.
A final note: while they’re probably the most performant, tridiagonal operators are not the only ones that could be used in a finite-difference scheme. However, there’s no base class for such operators; the interface that they should implement (that is, applyTo and solveFor) is only defined implicitly, through its use in the template classes that we’ll see in the next subsection. In hindsight, defining such a base class wouldn’t have degraded performance: the cost of a virtual call to applyTo or solveFor would have been negligible, compared to the cost of the actual calculation. But we were eagerly reading about template numerical techniques at that time; and like the two guys in the Wondermark strip, we decided to jingle all the way.
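For what it’s worth, such a base class might have looked like the following sketch (hypothetical code, not something you’ll find in the library):

```cpp
#include <ql/quantlib.hpp>
using QuantLib::Array;

// Hypothetical base class making the implicit interface explicit;
// not part of QuantLib, names are illustrative.
class DifferentialOperator {
  public:
    virtual ~DifferentialOperator() {}
    virtual Array applyTo(const Array& v) const = 0;   // returns L v
    virtual Array solveFor(const Array& u) const = 0;  // returns v such that L v = u
};
```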
Bibliography
[1] D. J. Duffy, Finite Difference Methods in Financial Engineering: A Partial Differential Equation Approach. John Wiley and Sons, 2006.
[2] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C, 2nd edition. Cambridge University Press, 1992.