Neural Prophet: Bridging the Gap Between Accuracy and Interpretability

Dec 4, 2020

Time series forecasting has always been a bit of a pain for machine learning. Everyone wants to know what the future holds, and they say data is the key to all the answers, so “can a model tell me what happens next?” is a perfectly logical question to ask. Unfortunately, that question is significantly more difficult to answer than “is this a picture of a cat or a dog?”. Computer vision is no joke, but the difficulty in time series forecasting isn’t just a statistical or computational problem; it’s also a human problem.

In all of machine learning there is a struggle between accuracy and interpretability, but this struggle is especially apparent in time series forecasting. Inevitably, the question after “what happens next?” is “how did you come up with that answer?”. It turns out neither humans nor machines are very good at predicting the future, and when the answer isn’t right, everyone wants to know what went wrong. We are stuck between making a model as accurate as possible and hoping that no one asks why it makes the decisions it does, or making an easy-to-explain model that isn’t very useful because it’s mostly wrong. It’s a bit like Heisenberg’s uncertainty principle from quantum mechanics: the better we are at answering one of the two questions, the worse we are at answering the other.

A new model called Neural Prophet¹ aims to solve this see-saw of uncertainty. The authors’ goal is to combine the power of neural networks and the interpretability of traditional autoregressive models into a single easy-to-use Python library. If you’re stuck on that sentence, don’t worry — I’ll spend the remainder of this post attempting to explain what it means. Let’s start simple, with autoregressive models.

Autoregressive models try to predict future values of a variable based on what has happened in the past (previous values of the variable). These are linear models, and the inputs are whatever time steps are autocorrelated with the variable. Autocorrelated means correlated with itself: yesterday’s sales might be correlated with tomorrow’s sales, or last year’s Q4 numbers with this year’s Q4 numbers. Autoregressive models are great because they are very interpretable and intuitive, meaning we can see how each past time step is weighted and how that input influenced our prediction. The downside is that these models are parametric (overly rigid), are unable to account for any features outside of autocorrelation, and do a poor job of scaling to datasets with a large number of autocorrelated regressors².
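To make “we can see how a past time step is weighted” concrete, here is a minimal sketch of a classic autoregressive fit using ordinary least squares in NumPy. The function names (`fit_ar`, `predict_next`) and the synthetic series are my own illustrations, not part of any library discussed in this post.

```python
import numpy as np

def fit_ar(series, p):
    """Fit y[t] = c + sum_k w[k] * y[t-k] by ordinary least squares."""
    y = np.asarray(series, dtype=float)
    # The row for time t holds the p previous values: y[t-1], ..., y[t-p].
    lags = np.column_stack([y[p - k : len(y) - k] for k in range(1, p + 1)])
    X = np.column_stack([np.ones(len(lags)), lags])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef[0], coef[1:]  # intercept, lag weights

def predict_next(series, intercept, weights):
    """One-step-ahead forecast from the most recent len(weights) values."""
    recent = np.asarray(series, dtype=float)[-len(weights):][::-1]
    return intercept + float(recent @ weights)

# Synthetic AR(2) signal with known weights 0.6 and -0.3 (illustrative only).
rng = np.random.default_rng(0)
y = np.zeros(5000)
for t in range(2, len(y)):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

intercept, weights = fit_ar(y, p=2)
print(weights)                              # compare with the true 0.6 and -0.3
print(predict_next(y, intercept, weights))  # forecast for the next time step
```

Each entry of `weights` is the contribution of one lag, which is exactly what makes this kind of model easy to explain.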

Neural networks, on the other hand, are nonparametric, which is really important for time series data (time signals rarely follow a single standard form). With enough capacity they can approximate essentially any continuous function, nonlinear ones included, meaning they morph themselves into whatever function best explains your data. While these models sound like the perfect solution so far, the type of neural network most often applied to forecasting (the sequence-to-sequence network) was originally designed for natural language processing and computer vision, so these networks require a significant amount of data prep and their hyper-parameters aren’t the easiest to manipulate when applied to forecasting³. On top of that, interpreting these neural networks is by no means an easy task and has become an entirely separate field of research⁴.

The authors of Neural Prophet aimed to combine the scalability of neural networks with the interpretability of autoregressive models through AR-Net, or Auto Regressive Network. AR-Net is a single-layer network that is trained to mimic the autoregressive process in a time series signal, but at a much larger scale than traditional autoregressive models. The inputs for both classic autoregressive models and AR-Net are the same, but AR-Net is able to scale to a much larger input size with ease, as can be seen in the figure below⁵. A quick side note here: the “p-value” referred to in the figure is not the p-value we associate with frequentist null hypothesis testing. Here the p-value is the number of auto-regressors (lagged inputs) used in an autoregressive model.

As we increase the number of auto-regressors, computational time remains constant for AR-Net while increasing quadratically for Classic AR. This figure is from the original AR-Net paper.
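The idea of a single layer that mimics the autoregressive process can be sketched without any deep learning framework. The code below is not the AR-Net implementation; it is a plain NumPy stand-in under my own assumptions (hypothetical helper names, full-batch gradient descent, no regularization) that trains one linear layer on lagged inputs and compares the learned weights with the closed-form least-squares answer.

```python
import numpy as np

def lag_matrix(y, p):
    """Rows of [y[t-1], ..., y[t-p]] together with the matching targets y[t]."""
    X = np.column_stack([y[p - k : len(y) - k] for k in range(1, p + 1)])
    return X, y[p:]

def train_linear_layer(X, target, lr=0.05, epochs=500):
    """Full-batch gradient descent on mean squared error for one linear layer."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        residual = X @ w + b - target
        w -= lr * 2.0 * (X.T @ residual) / len(target)
        b -= lr * 2.0 * residual.mean()
    return w, b

# Synthetic AR(2) signal (illustrative data, not from the post).
rng = np.random.default_rng(1)
y = np.zeros(5000)
for t in range(2, len(y)):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

# Deliberately offer more lags (10) than the signal needs (2).
X, target = lag_matrix(y, p=10)
w, b = train_linear_layer(X, target)

# Closed-form least squares on the same inputs, for comparison.
A = np.column_stack([X, np.ones(len(X))])
ols, *_ = np.linalg.lstsq(A, target, rcond=None)

print(np.round(w, 2))                 # learned lag weights, one per lag
print(np.max(np.abs(w - ols[:-1])))   # gap between the two sets of weights
```

Because the layer is linear, its weights can be read exactly like classic autoregressive coefficients: the first two should carry the signal and the extra lags should stay near zero.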

Facebook’s Prophet model takes a different route from the basic autoregressive model. Instead of using values at previous time stamps, it fits an additive model of trend, seasonality, and holiday effects, with Fourier series terms serving as a form of feature engineering for the seasonality. This allows for more accurate models that can be fine-tuned via analyst-in-the-loop and decomposed for better interpretability. Prophet is also able to ingest additional covariates and regressors to better forecast a signal’s reaction to a system (a given city’s weather data might help a popsicle vendor more accurately forecast their sales for the week) and to autodetect change points in its piecewise trends. Not only is Prophet decently accurate and great for interpretability, but the parameters and model setup are fairly straightforward and intuitive. Neural Prophet aims to maintain the same level of interpretability and ease of use as regular Prophet, but with a more souped-up backend⁶.
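Fourier series as feature engineering is easier to picture with a small example. The sketch below uses the standard construction of sine and cosine columns as functions of time and fits them with least squares; it is not Prophet’s source code, and the function name, the weekly period, and the synthetic signal are assumptions made for illustration.

```python
import numpy as np

def fourier_features(t, period, order):
    """Columns sin(2*pi*n*t/period), cos(2*pi*n*t/period) for n = 1..order."""
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

# Four years of daily time stamps and a made-up signal with a weekly pattern.
t = np.arange(4 * 365)
y = 10.0 + 2.0 * np.sin(2 * np.pi * t / 7) + 0.5 * np.cos(2 * np.pi * 2 * t / 7)

# Design matrix: a constant plus three weekly harmonics.
X = np.column_stack([np.ones(len(t)), fourier_features(t, period=7, order=3)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# The seasonal component can be pulled out and plotted on its own.
weekly_component = X[:, 1:] @ coef[1:]
print(np.round(coef, 3))
```

The fitted coefficients line up with the columns in order (constant, sin 1, cos 1, sin 2, cos 2, ...), so the seasonal shape can be inspected separately from everything else, which is where the decomposability comes from.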

Prophet is known for its intuitive hyper-parameters and interpretable components, and Neural Prophet keeps both. What Neural Prophet aims to improve is the model’s backend computation. Optimization is done with PyTorch, which should converge quickly on a better optimum, and autoregression is handled by AR-Net to accommodate a larger number of inputs, which allows for a more accurate representation of the time series signal. Neural Prophet continues to use the original Prophet’s piecewise linear trends and Fourier series to account for multiple seasonalities, but additional regressors are now handled by additional neural networks¹.
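A piecewise linear trend is just a straight line whose slope is allowed to change at a handful of change points. Here is a minimal sketch of that shape; the function name, parameter names, and example values are hypothetical, and in a real model the change points and slope adjustments are learned from data rather than typed in by hand.

```python
import numpy as np

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Base line m + k*t, with the slope shifted by delta after each changepoint."""
    t = np.asarray(t, dtype=float)
    trend = m + k * t
    for s, delta in zip(changepoints, deltas):
        # max(0, t - s) is zero before the changepoint and grows linearly after,
        # so the trend stays continuous while its slope changes by delta.
        trend = trend + delta * np.maximum(0.0, t - s)
    return trend

t = np.arange(11)
trend = piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[4, 8], deltas=[-0.5, 1.5])
print(trend)
print(np.diff(trend))  # the slope in each interval between consecutive time steps
```

In this example the slope starts at 1.0, drops to 0.5 after t = 4, and rises to 2.0 after t = 8, and each of those numbers is something an analyst can read and question directly.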

The best one-line explanation for Neural Prophet that I can come up with is that Neural Prophet is still Prophet, but with a turbo-charged engine. The functionality, usability, and interpretability of the original Prophet are all still there; there’s just a more powerful backend for the computational lift, which should result in a more accurate model. Of course, the only way to show that any of these new features work is to run an experiment, so I did.

For this experiment I used the Peyton Manning web traffic dataset that the documentation for both Neural Prophet and the original Prophet refers to. I added a “holiday” dataframe (some playoff and Super Bowl dates) to help with forecast accuracy and ran both models with their default parameters across a series of forecast horizons ranging from 1 day to 730 days (in increments of 10). The results are plotted below.
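For anyone who wants to run a similar sweep, the loop over forecast horizons might look roughly like the sketch below. Almost everything here is an assumption made for illustration: the hold-out scheme (the last h points), the error metric (mean absolute error), the reading of the horizon grid as `range(1, 731, 10)`, and the `seasonal_naive` stand-in forecaster, which takes the place of a fitted Prophet or Neural Prophet model.

```python
import numpy as np

def seasonal_naive(train, horizon, season=7):
    """Stand-in forecaster: repeat the last full season of the training data."""
    last_season = np.asarray(train, dtype=float)[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_season, reps)[:horizon]

def mae_by_horizon(series, horizons, forecaster):
    """Hold out the last h points, forecast them, and record mean absolute error."""
    series = np.asarray(series, dtype=float)
    errors = {}
    for h in horizons:
        train, test = series[:-h], series[-h:]
        errors[h] = float(np.mean(np.abs(forecaster(train, h) - test)))
    return errors

# Sanity check on a perfectly weekly toy series (not the Peyton Manning data).
series = np.tile([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0], 200)
horizons = range(1, 731, 10)
errors = mae_by_horizon(series, horizons, seasonal_naive)
print(len(errors), max(errors.values()))
```

Swapping `seasonal_naive` for a function that fits a real model on `train` and returns `horizon` predictions gives one error value per horizon, which is the kind of curve plotted in this post.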

Let’s start on the left side and work our way right. It looks like the original Prophet was actually more accurate up through 70 days, which is probably due to some overfitting by Neural Prophet. From about 70 days through approximately 365 days, both models are very similar in accuracy, with error declining for both. My best guess is that the matched slope has to do with both models sharing a large number of default parameters, but we’ll have to do more research later to confirm. What’s really interesting is what happens when error starts to increase again, right after the 365-day mark. Original Prophet’s error increases steadily throughout the remaining forecast horizons, while Neural Prophet maintains a fairly steady error rate. This points to the AR-Net advantage in Neural Prophet: the ability to model a larger number of autoregressive terms allows for a more accurate forecast over longer forecast horizons.

There is obviously a lot more research that can be done to compare these two models, but I wanted to share at least one example to justify me writing this post and you taking the time to read it. Neural Prophet is still quite new, and some of the more advanced features from the original Prophet, like logistic growth, are still being rolled out (at the time of publication). Still, I think Neural Prophet is a very important package for time series forecasting, one that shows there is a road to being both accurate and interpretable.

References:

1. Neural Prophet GitHub: https://github.com/ourownstory/neural_prophet
2. R.J. Hyndman and G. Athanasopoulos. Forecasting: Principles and Practice.
3. Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. Sequence to sequence learning with neural networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Volume 2, NIPS’14, pages 3104–3112, Cambridge, MA, USA, 2014. MIT Press.
4. G. Montavon, W. Samek, and K.-R. Müller. Methods for interpreting and understanding deep neural networks. arXiv preprint arXiv:1706.07979, 2017.
5. AR-Net: https://arxiv.org/pdf/1911.12436.pdf
6. Facebook Prophet: https://facebook.github.io/prophet/

Written by Alec Delany