GSoC 2019: Designing an API

4 minute read

Published:

Let’s take stock of where we are on our journey to add ODE capabilities on PyMC3.

  • We know that HMC requires derivatives of the log likelihood, which means taking derivatives of the differential equation’s solution with repect to it’s parameters. That sounds hard, but we’ve discovered that it is actually as easy as solving more differential equations. The method by which we compute derivatives of the differential equation’s solution is called the “Forward Sensitivity Method”.

  • We’ve done a little proof of concept for the forward sensitivity method. We wrote a computation graph in theano to compute these derivatives via automatic differentiation, and then used those derivatives to do gradient descent. Since we successfully fit our differential equation to data, we become more confident in our approach.

  • We’ve dropped our computation graph into an existing notebook and were very surprised that the model sampled. We estimated the correct parameter value, the Gelman-Rubin statistic indicated our chains converged, and the number of effective samples was quite large. All in all, a great success.

Now, we are at the point where we have to think about generalizing, and honestly, I’ve been dreading this point.

It is not news that I am a bad software engineer (in fact, that is precisely the reason I did GSoC). So, I am going to lean very heavily on you, dear reader, to help me through this. I’m going to be very open with what I am thinking, and I hope that you will chime in when neccesary to tell me that things can be done better, faster, safer.

So, here we go.

Thinking About an API

I don’t think an entire blog post outlining how the API works is interesting. Instead, I’m going to highlight some relevant bits, and if you have questions, you can ask me directly.

I’m keeping all my code in this repo. The API should take as an entry a function which computes the appropriate derivatives for the differential equation given the system’s current state, time, and parameters. Then, the computation graph I wrote will be able to compute a theano compiled function which returns an augmented system (that is, the derivatives for the system and the derivatives to be used in the forward sensitivity method).

Before we begin integrating the augmented system, we need to construct appropriate initial conditions. In order to do that, we need to understand how the initial conditions look for the forward sensitivity method. For a differential equation

with sensitivities defined by

the sensitivities (the derivative of the ODE’s solution with respect to the parameters) is a matrix, namely

I tried to get my website to show the actual matrix version of the above, but no dice. Note, if $p_j$ is a model parameter, then $\partial y_i / \partial p_j = 0$. If $p_j$ is an initial condition for one of the equations in our system (i.e. $y_i(t_0) = p_j$), then the $\partial y_i / \partial p_j = 1$ (can you see why)? If we arrange the columns of $\partial y / \partial p$ so that all the model parameters are adjacent, and all the initial conditions are adjacent, that means that the identity matrix will appear somewhere in our initial condition for $\partial y / \partial p $. I’ve written a little helper method to make this initial condition and then flatten it (it needs to be flattened in order to be passed to the integrator).

The notebook I posted 2 weeks ago used a “cached simulator” to avoid having to integrate the system twice for the same parameters in the forward and backward pass, thereby increasing the speed of the computation. I’ve made that a part of the API so it is, more or less, automatic. Small digression, for some particular reason this was the HARDEST thing to implement. I struggled for more than I care to admit. To fix whatever bug I was experiencing, I rewrote the entire API in a fit of rage and frustration.

Here is a notebook with a small example showing the API working. There are a few changes I plan to make that I hope will result in some speed ups, but as it stands all you need to do is: pass your ODE function, pass the initial time that your system evolves forward from, pass an array of times you would like to evaluate your ODE at, pass the number of states your system has, and pass the number of parameters your model has (pure parameters, not the number of parameters plus the number of initial conditions you plan on estimating).