The benefits of procrastination — Implementing QuantLib

Welcome back.

This post comes courtesy of a QuantLib user who submitted an interesting pull request to GitHub and gave me permission to write about it (thanks, Guillaume). It’s something that might bite you, too. This might go into Implementing QuantLib at some point, but for the time being, here it is.

The background: we have an ImpliedTermStructure class in the library that takes an existing interest-rate curve and creates another one based on the first but with a different reference date. (I posted a screencast about it a few years ago, and the corresponding notebook is in the QuantLib Python Cookbook.) The main bit of business logic in the class is the calculation of the implied discount factors, implemented as:

    DiscountFactor ImpliedTermStructure::discountImpl(Time t) const {
        ...
        return originalCurve_->discount(originalTime, true) /
               originalCurve_->discount(ref, true);
    }

that is, the discount \( \tilde{B}(d) \) between the reference date \( d_0 \) of the implied term structure and some date \( d \) is calculated as the ratio between the discounts \( B(d) \) and \( B(d_0) \) as given by the original term structure. The observation relevant to this post is that \( d \) changes at each invocation, while \( d_0 \) stays the same; however, we can’t precalculate \( B(d_0) \) at construction time because the original curve (which is stored in a handle) might change; therefore, the current implementation retrieves it at each call.
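
Written out, the relation described above is the following; the numbers in the second line are made up purely for illustration and don't come from any real curve:

\[ \tilde{B}(d) = \frac{B(d)}{B(d_0)} \]

\[ B(d_0) = 0.98, \quad B(d) = 0.95 \quad \Rightarrow \quad \tilde{B}(d) = \frac{0.95}{0.98} \approx 0.9694 \]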

The idea in the pull request was that this was wasteful, and that the discount \( B(d_0) \) could be calculated when the original curve notifies us of a change, that is, in the update method, and cached between changes. A simplified version of this would read:

    void ImpliedTermStructure::update() {
        ...
        df_ = originalCurve_->discount(ref, true);
        YieldTermStructure::update();
    }

    DiscountFactor ImpliedTermStructure::discountImpl(Time t) const {
        ...
        return originalCurve_->discount(originalTime, true) / df_;
    }

The above looks correct, but it has a serious drawback, which tells me that we might have designed a dangerous framework. The problem is that calculations performed inside the update method might be wasted, because they might not be needed before another change happens; this is why most implementations (in the LazyObject class, for instance, or in the TermStructure class) only set some kind of out-of-date flag.

How does this apply to the implementation above? Let’s say that the original curve is built by a bootstrap over a set of market quotes, and let’s say that the quotes change. We would set new values to all of them, probably with something like the following pseudocode:

    for each quote:
        quote.setValue(new_value)

It seems innocent, doesn’t it? In the current implementation, it is. Each time around the loop, one of the quotes is given a new value and notifies the bootstrapped curve, which is invalidated and in turn forwards the notification to the implied curve; and that’s all that happens.

The proposed implementation, instead, makes a world of difference. Each time around the loop, when the notification comes, the implied curve asks for the discount factor, and this triggers a bootstrap of the original curve (which otherwise wouldn’t be able to return it). The loop would cause as many bootstraps as there are quotes, each one of them but the last performed over an only partially updated set of values.
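
To make the count concrete, here is a toy model of the chain of notifications. It is only a sketch: `ToyQuote`, `ToyBootstrappedCurve`, `EagerImpliedCurve` and `bootstrapsWithEagerCaching` are hypothetical names invented for the illustration and are not QuantLib classes, and the "bootstrap" is just an average of the quotes standing in for an expensive calculation.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Toy stand-ins for the library classes; none of these are QuantLib types.
struct ToyQuote {
    double value = 0.0;
    std::function<void()> onChange;   // a single observer, for brevity
    void setValue(double v) {
        value = v;
        if (onChange) onChange();     // notify on every change
    }
};

// A curve that "bootstraps" lazily: a notification only marks it dirty.
struct ToyBootstrappedCurve {
    const std::vector<ToyQuote>* quotes = nullptr;
    std::function<void()> onChange;
    bool calculated = false;
    double rate = 0.0;
    int bootstraps = 0;               // counts the expensive recalculations

    void update() {
        calculated = false;           // invalidate only, then forward
        if (onChange) onChange();
    }
    double discount(double t) {
        if (!calculated) {            // expensive step, performed on demand
            assert(quotes && !quotes->empty());
            double sum = 0.0;
            for (const ToyQuote& q : *quotes) sum += q.value;
            rate = sum / quotes->size();
            calculated = true;
            ++bootstraps;
        }
        return std::exp(-rate * t);
    }
};

// The eager variant: update() immediately asks the original curve for B(d0).
struct EagerImpliedCurve {
    ToyBootstrappedCurve* original = nullptr;
    double refTime = 1.0;
    double df = 1.0;
    void update() { df = original->discount(refTime); }
};

// Sets n quotes one by one and returns how many bootstraps that caused.
int bootstrapsWithEagerCaching(int n) {
    std::vector<ToyQuote> quotes(n);
    ToyBootstrappedCurve curve;
    curve.quotes = &quotes;
    EagerImpliedCurve implied;
    implied.original = &curve;
    curve.onChange = [&implied] { implied.update(); };
    for (ToyQuote& q : quotes)
        q.onChange = [&curve] { curve.update(); };

    for (ToyQuote& q : quotes)        // the innocent-looking loop
        q.setValue(0.03);
    return curve.bootstraps;
}
```

In this toy, `bootstrapsWithEagerCaching(3)` returns 3: one recalculation per quote. If `EagerImpliedCurve::update` did nothing, the same loop would leave the counter at zero.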

The lesson to take away here is that the update method should never perform any calculations, but should instead defer them to a later time. In this case, for instance, we could declare the stored discount factor as a boost::optional and write:

    void ImpliedTermStructure::update() {
        ...
        df_ = boost::none;
        YieldTermStructure::update();
    }

    DiscountFactor ImpliedTermStructure::discountImpl(Time t) const {
        // df_ must be declared mutable to be set in a const method
        if (!df_) {
            ...
            df_ = originalCurve_->discount(ref, true);
        }
        ...
        return originalCurve_->discount(originalTime, true) / *df_;
    }

In this implementation, update merely invalidates the cache; the original curve is not asked for results until they’re actually needed, and thus no bootstrap is performed during the loop that sets the quotes.
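
The same idea can be tried out in isolation with a small self-contained sketch. It makes a few assumptions: it uses `std::optional` (C++17) in place of `boost::optional` so that it compiles without Boost, and `ToyCurve`, `LazyImpliedCurve`, `LazyRun` and `runLazyScenario` are hypothetical names made up for the illustration, not library classes. The notification is forwarded by hand rather than through an observer.

```cpp
#include <cmath>
#include <optional>

// Toy stand-in for the original curve; not a QuantLib class.
struct ToyCurve {
    double rate = 0.03;
    bool calculated = false;
    int bootstraps = 0;               // counts the expensive recalculations
    // Stands in for a quote change: only invalidates the curve.
    void setRate(double r) { rate = r; calculated = false; }
    double discount(double t) {
        if (!calculated) { calculated = true; ++bootstraps; }
        return std::exp(-rate * t);
    }
};

class LazyImpliedCurve {
  public:
    LazyImpliedCurve(ToyCurve& original, double refTime)
    : original_(original), refTime_(refTime) {}

    // Only invalidates the cache; no calculation happens here.
    void update() { df_ = std::nullopt; }

    double discount(double t) const {
        if (!df_)                     // first call after a change
            df_ = original_.discount(refTime_);
        return original_.discount(refTime_ + t) / *df_;
    }
  private:
    ToyCurve& original_;
    double refTime_;
    mutable std::optional<double> df_;   // mutable: filled in a const method
};

// Bootstrap counts observed at three points of the scenario.
struct LazyRun { int duringLoop; int afterFirstUse; int afterSecondUse; };

LazyRun runLazyScenario(int n) {
    ToyCurve curve;
    LazyImpliedCurve implied(curve, 1.0);
    for (int i = 0; i < n; ++i) {
        curve.setRate(0.03 + 0.001 * i);
        implied.update();             // notification forwarded by hand
    }
    LazyRun r;
    r.duringLoop = curve.bootstraps;  // nothing has been calculated yet
    implied.discount(2.0);
    r.afterFirstUse = curve.bootstraps;
    implied.discount(3.0);
    r.afterSecondUse = curve.bootstraps;
    return r;
}
```

In this toy, the counter is still 0 when the loop ends, goes to 1 at the first call to `discount`, and stays at 1 on the second call, since both the curve and the cached discount factor are up to date by then.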
