Chapter 6, part 5 of 8: path generators
Welcome back.
This week, back to some content from my book after the screencast of last week (and the announcement of the QuantLib Notebooks series; leave your feedback if you want the transcripts to be published). This post is the fifth in a series that started here and covers chapter 6, that is, the Monte Carlo framework in QuantLib.
Subscribe to my Substack to receive my posts in your inbox, or follow me on Twitter or LinkedIn if you want to be notified of new posts, or subscribe via RSS if you’re the tech type: the buttons for all that are in the footer. Also, I’m available for training, both online and (when possible) on-site: visit my Training page for more information.
Path generation
Before tackling path generation proper, we need a last basic
component: a structure to hold the path itself. Listing 6.9 shows the
Path
class, which models a random path for a single variable. It
contains the TimeGrid
instance and the Array
holding the node
times and values, respectively, and provides methods to access and
iterate over them. The method names are usually chosen to follow those
of the standard containers, allowing for a more familiar interface;
methods with domain-based names such as time
are also available.
Listing 6.9: Interface of the Path
and MultiPath
classes.
class Path {
  public:
    Path(const TimeGrid& timeGrid, const Array& values);
    bool empty() const;
    Size length() const;
    Real operator[](Size i) const;
    Real at(Size i) const;
    Real& operator[](Size i);
    Real& at(Size i);
    Real front() const;
    Real& front();
    Real back() const;
    Real& back();
    Real value(Size i) const;
    Real& value(Size i);
    Time time(Size i) const;
    const TimeGrid& timeGrid() const;
    typedef Array::const_iterator iterator;
    typedef Array::const_reverse_iterator reverse_iterator;
    iterator begin() const;
    iterator end() const;
    reverse_iterator rbegin() const;
    reverse_iterator rend() const;
  private:
    TimeGrid timeGrid_;
    Array values_;
};

class MultiPath {
  public:
    MultiPath(const std::vector<Path>& multiPath);
    // ...more constructors...
    Size assetNumber() const;
    Size pathSize() const;
    const Path& operator[](Size j) const;
    const Path& at(Size j) const;
    Path& operator[](Size j);
    Path& at(Size j);
  private:
    std::vector<Path> multiPath_;
};
The MultiPath
class, also shown in listing 6.9, holds paths for a
number of underlying variables. It is basically a glorified container,
holding a vector of Path
instances and providing a few additional
inspectors to uphold the law of Demeter. (For instance, if p
is a
MultiPath
instance, the additional methods allow us to write
p.pathSize()
instead of p[0].length()
. Besides being uglier, the
latter might also suggest that p[1].length()
could be different.)
The simplicity of the implementation comes at the expense of some space:
by storing a vector of Path
instances, we’re storing N identical
copies of the time grid.
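To make the interface concrete, here is a small usage sketch; the function names are mine, not part of the library, and I'm assuming a payoff that simply averages the values along a path.

#include <ql/methods/montecarlo/path.hpp>
#include <ql/methods/montecarlo/multipath.hpp>

using namespace QuantLib;

// Average of the values along a single path; Path::iterator gives
// read-only access to the values, and time(i) to the node times.
Real averageValue(const Path& path) {
    Real sum = 0.0;
    for (Path::iterator it = path.begin(); it != path.end(); ++it)
        sum += *it;
    return sum/path.length();
}

// The same quantity for the first variable of a MultiPath; note the
// use of assetNumber() and operator[] to get at the single paths.
Real averageOfFirstVariable(const MultiPath& p) {
    return p.assetNumber() > 0 ? averageValue(p[0]) : 0.0;
}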
Aside: access patterns
The basic inspectors in the Path
class, such as operator[]
and the
iterator interface, return the values \( S_i \) of the underlying
variable rather than the pairs \( (t_i ,S_i) \) including the node
times. While it could well be expected that p[i]
return the whole
node, we thought that in the most common cases a user would want just
the value, so we optimized such access (it’s some kind of
Huffman-coding principle, in which the most common operations should
require the least typing; the Perl language shines at this, or so I’m
told).
On a sad note about reuse (or lack thereof), if we were to add to the
Path
class a few methods to return whole nodes, we probably wouldn’t
use std::pair<Time,Real>
to hold the data, but an inner Node
struct. This is not because we instinctively leap at the most complex
implementation—even though it may seem so at times—but
because code such as p.node(i).second
is not self-describing (wait,
is second
the time or the value?). The advantage of writing the above
as p.node(i).value
would offset the cost of defining a Node
struct.
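To be clear, no such struct exists in the library at this time; a minimal sketch of what the addition might look like is the following.

// Hypothetical sketch, not in the library: an inner Node struct and
// a node(i) inspector added to the Path class of listing 6.9.
class Path {
  public:
    struct Node {
        Time time;
        Real value;
    };
    Node node(Size i) const {
        Node n = { timeGrid_[i], values_[i] };
        return n;
    }
    // ...the rest of the interface, as in listing 6.9...
  private:
    TimeGrid timeGrid_;
    Array values_;
};

With this in place, p.node(i).value would read a lot better than p.node(i).second.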
Path generators
What remains to be done for path generation is now simply to connect
the dots between the classes that I described in the last few
posts. The logic for doing so is implemented in the
MultiPathGenerator
class template, sketched in listing 6.10; I’ll
leave it to you to decide whether this makes it an instance of the
Factory pattern.
Listing 6.10: Sketch of the MultiPathGenerator
class template.
template <class GSG>
class MultiPathGenerator {
  public:
    typedef Sample<MultiPath> sample_type;
    MultiPathGenerator(const shared_ptr<StochasticProcess>&,
                       const TimeGrid&,
                       GSG generator,
                       bool brownianBridge = false);
    const sample_type& next() const { return next(false); }
    const sample_type& antithetic() const { return next(true); }
  private:
    const sample_type& next(bool antithetic) const;
    bool brownianBridge_;
    shared_ptr<StochasticProcess> process_;
    GSG generator_;
    mutable sample_type next_;
};

template <class GSG>
const typename MultiPathGenerator<GSG>::sample_type&
MultiPathGenerator<GSG>::next(bool antithetic) const {
    if (brownianBridge_) {
        QL_FAIL("Brownian bridge not supported");
    } else {
        typedef typename GSG::sample_type sequence_type;
        const sequence_type& sequence_ =
            antithetic ? generator_.lastSequence()
                       : generator_.nextSequence();
        Size m = process_->size(), n = process_->factors();
        MultiPath& path = next_.value;
        Array asset = process_->initialValues();
        for (Size j=0; j<m; j++)
            path[j].front() = asset[j];
        Array temp(n);
        next_.weight = sequence_.weight;
        TimeGrid timeGrid = path[0].timeGrid();
        Time t, dt;
        for (Size i = 1; i < path.pathSize(); i++) {
            Size offset = (i-1)*n;
            t = timeGrid[i-1];
            dt = timeGrid.dt(i-1);
            if (antithetic)
                std::transform(sequence_.value.begin()+offset,
                               sequence_.value.begin()+offset+n,
                               temp.begin(),
                               std::negate<Real>());
            else
                std::copy(sequence_.value.begin()+offset,
                          sequence_.value.begin()+offset+n,
                          temp.begin());
            asset = process_->evolve(t, asset, dt, temp);
            for (Size j=0; j<m; j++)
                path[j][i] = asset[j];
        }
        return next_;
    }
}
Its constructor takes and stores the stochastic process followed by
the variable to be modeled; the time grid to be used for generating
the path nodes; a generator of random Gaussian sequences, which must
be of the right dimension for the job (that is, at each draw it must
return N × M numbers for N factors and M steps); and a boolean flag to
specify whether or not it should use Brownian bridging, which at this
time cannot be set to true
and with which we’re unfortunately stuck
for backward compatibility.
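As a concrete (if hypothetical) example of putting a generator together, one might write something like the following sketch; process, grid and seed are assumed to be defined elsewhere, and the PseudoRandom traits class is used to build a Gaussian sequence generator of the right dimension.

// Sketch only: `process`, `grid` and `seed` are assumed to exist.
typedef PseudoRandom::rsg_type rsg_type;

Size factors = process->factors();
Size steps = grid.size() - 1;
rsg_type rsg = PseudoRandom::make_sequence_generator(factors*steps, seed);

MultiPathGenerator<rsg_type> generator(process, grid, rsg, false);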
The interface of the class provides a next
method, returning a new
random path (well, multipath, but cut me some slack here); and an
antithetic
method, returning the antithetic path of the one last
returned by next
. (Unfortunately, the correctness of the
antithetic
method depends on its being called at the correct time,
that is, after a call to next
; but I can’t think of any acceptable
way out of this.) Both methods forward to a private method
next(bool)
, which implements the actual logic. If the
brownianBridge
flag was set to true
, the method bails out by
raising an exception since the functionality is not yet
implemented. Following the general principles of C++, which suggest to
fail as early as possible, we should probably move the check to the
constructor (or better yet, implement the thing; send me a patch if
you do). As it is now, one can build a path generator and have the
constructor succeed, only to find out later that the instance is
unusable.
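A sketch of the intended calling sequence, assuming the generator built above and some path pricer defined elsewhere, might look like this; the point to note is that each antithetic draw is consumed right after the corresponding plain one.

// Sketch only: `generator` and `pathPricer` are assumed to exist.
Real sum = 0.0;
Size samples = 1024;
for (Size i = 0; i < samples; ++i) {
    Real v1 = pathPricer(generator.next().value);
    // antithetic() must be called before the next call to next(),
    // or it will pair up with the wrong draw.
    Real v2 = pathPricer(generator.antithetic().value);
    sum += 0.5*(v1 + v2);
}
Real estimate = sum/samples;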
If Brownian bridging is not required, the method gets to work. First,
it retrieves the random sequence it needs: a new one if we asked for a
new path, or the latest sequence we used if we asked for an antithetic
path. Second, it performs some setup. It gets a reference to the path
to be built (which is a data member, not a temporary; more on this
later); retrieves the array of the initial values of the variables and
copies them at the beginning of the path; creates an array to hold the
subsets of the random variables that will be used at each step; and
retrieves the time grid to be used. Finally, it puts the pieces
together. At each step, it takes as starting point the values at the
previous one (it uses the same array where it stored the initial
values, so the first step is covered, too); it retrieves from the time
grid the current time t
and the interval dt
over which to evolve
the variables; it copies the subset of the random variables that it
needs (the first N for the first step, the second N for the second
step, and so on, N being the number of factors; also, it negates them
if it must generate an antithetic path); and to close the circle, it
calls the evolve
method of the process and copies the returned
values at the correct places in the paths.
As I mentioned, the returned path is stored as a data member of the
generator instance. This saves quite a few allocations and copies; but
of course it also makes it impossible to share a path generator
between threads, since they would race for the same storage. However,
that’s just the last nail in the coffin. Multithreading was pretty
much out of the picture as soon as we stored the Gaussian number
generator as a data member; the path generator can only be as
thread-safe as its RNG—that is, not much, since most (if not
all) of them store state. A much easier strategy to distribute work
among threads would be to use different MultiPathGenerator
instances, each with its own RNG. The problem remains, though, of
creating RNGs that return non-overlapping sequences; for algorithms that allow it
(such as Sobol’s) one can tell each instance to skip ahead a given
number of draws.
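For instance, a per-thread setup based on the Sobol generator might look like the sketch below; threadIndex and samplesPerThread are placeholders, and I'm glossing over how the threads themselves are managed.

// Sketch only: each thread builds its own Sobol generator and skips
// ahead past the draws assigned to the other threads.
Size dimension = process->factors() * (grid.size() - 1);

SobolRsg sobol(dimension);
sobol.skipTo(threadIndex * samplesPerThread);

typedef InverseCumulativeRsg<SobolRsg, InverseCumulativeNormal> gsg_type;
gsg_type gaussianRsg(sobol);

MultiPathGenerator<gsg_type> generator(process, grid, gaussianRsg, false);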
To close the section, I’ll mention that the library also provides a
PathGenerator
class for 1-D stochastic processes. It has some
obvious differences (it calls the StochasticProcess1D
interface and
it creates Path instances instead of MultiPath instances; also, it implements
Brownian bridging, which is a lot easier in the 1-D case) but other
than that, it has the same logic as MultiPathGenerator
and doesn’t
warrant being described in detail here.
Aside: stepping on one's own toes
Unfortunately, there’s quite a bit of memory allocation going on
during multi-path generation. The obvious instances (the two Array objects
being created to store the current variable values and the current
subset of the random numbers) could be avoided by storing them as data
members; but on the one hand, we would have yet more mutable members,
which is something we should use sparingly; and on the other hand, the
allocation of the two arrays only happens once in a whole path
generation, so it might not make a lot of difference.
Instead, the worst offender might be the underlying process, which at
each step creates an Array
instance to be returned from its evolve
method. How can we fix this? In this case, I wouldn’t return a
reference to a data member if I can help it: the StochasticProcess
class might just work in a multithreaded environment (well, if market
data don’t change during calculation, or if at least the process is
kept frozen) and I'd rather not ruin it. Another possibility might be
for the evolve method to take a reference to the output array as an
argument; but then, we (and by "we" I mean whoever will implement a
stochastic process) would have to be very careful about aliasing.
The sad fact is that the natural way to write the evolve
method
would usually be something like (in pseudocode, and in the case of,
say, two variables)
void evolve(const Array& x, Time t, Time dt,
            const Array& w, Array& y) {
    y[0] = f(x[0], x[1], t, dt, w[0], w[1]);
    y[1] = g(x[0], x[1], t, dt, w[0], w[1]);
}
where x
is the input array and y
is the output. However, it would
be just as natural for a user to write
p->evolve(x,t,dt,w,x);
(that is, to pass the same x
as both input and output) meaning to
replace the old values of the variables with the new ones.
Well, you see the catch. When the first instruction in evolve
writes
into y[0]
, it actually writes into x[0]
. The second instruction
then calls g
with a value of x[0]
which is not the intended
one. Hilarity ensues.
Fixing this requires either the developer to make each process safe from aliasing, by writing
a = f(...); b = g(...); y[0] = a; y[1] = b;
or something to this effect; or the user to be mindful of the risk of aliasing, and never to pass the same array as both input and output. Either solution requires more constant care than I credit myself with.
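In full, the first of the two strategies would turn the pseudocode above into something like this (f and g are still placeholders).

// Still pseudocode: compute into temporaries so that writing into y
// cannot clobber inputs that are still needed.
void evolve(const Array& x, Time t, Time dt,
            const Array& w, Array& y) {
    Real a = f(x[0], x[1], t, dt, w[0], w[1]);
    Real b = g(x[0], x[1], t, dt, w[0], w[1]);
    y[0] = a;   // safe even if x and y are the same array
    y[1] = b;
}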
So, where does this leave us? For the time being, the current
interface might be an acceptable compromise. One future possibility is
that we take away the allocation cost at the origin, by using some
kind of allocation pool inside the Array
class. As usual, we’re open
to suggestions.
Pricing on a path
There’s not much to say on the PathPricer
class template, shown in
listing 6.11. It defines the interface that must be implemented by a
path pricer in order to be used in a Monte Carlo model: an
operator()
taking a path and returning some kind of result. It is
basically a function interface: in fact, if we were to write the
framework now instead of ten years ago, I’d just use boost::function
and dispense with PathPricer
altogether.
Listing 6.11: Interface of the PathPricer
class template.
template<class PathType, class ValueType=Real>
class PathPricer : public unary_function<PathType,ValueType> {
  public:
    virtual ~PathPricer() {}
    virtual ValueType operator()(const PathType& path) const = 0;
};
The template arguments generalize on both the type of the argument,
which can be a Path
, a MultiPath
, or a user-defined type if needed
(for instance, when pricing instruments with an early-termination
feature, one might want to save time by using a path class that
generates nodes lazily and only if they are actually used); and the
type of the result, which defaults to a simple Real
(usually the
value of the instrument) but might be an array of results, or some
kind of structure. On the one hand, one might like to have the choice:
an Array
would be more appropriate for returning a set of
homogeneous results, such as the amounts of a series of coupons, while
a structure with named fields would be better suited for inhomogeneous
results such as the value and the several Greeks of an instrument. On
the other hand, the results will need to be combined, and therefore
they’ll need some basic algebra; the Array
class already provides
it, which makes it ready to use, whereas a results structure would
need to define some operators in order to play well with the
facilities described in the next section.
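As an example (mine, not the library's), a path pricer for a European call on a single underlying might be sketched as follows; the class name and the choice of storing the discount factor are illustrative.

#include <ql/methods/montecarlo/pathpricer.hpp>
#include <ql/methods/montecarlo/path.hpp>
#include <algorithm>

using namespace QuantLib;

// Sketch of a possible concrete pricer: a discounted European call
// payoff read off the final node of a 1-D path.
class EuropeanCallPathPricer : public PathPricer<Path> {
  public:
    EuropeanCallPathPricer(Real strike, DiscountFactor discount)
    : strike_(strike), discount_(discount) {}
    Real operator()(const Path& path) const {
        return discount_ * std::max(path.back() - strike_, 0.0);
    }
  private:
    Real strike_;
    DiscountFactor discount_;
};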