## Time Series Forecasting using Facebook Prophet

## Why Facebook prophet?

## Highlights of Facebook Prophet

## The Prophet Forecasting Model

### 1. Installing the packages

### 2. Loading and preprocessing data

### 3. Model Fitting

### 4. Obtaining the forecasts

### 5. Evaluating the model

## Summary

## References

Time series analysis is an approach to analysing historical, time-ordered data in order to extract meaningful characteristics and generate insights that businesses can act on. Time-series data is generally a sequence of observations stored in time order. Analysing it helps reveal the time-based patterns in a set of data points, which are critical for any business. Time series forecasting techniques can answer business questions such as what level of inventory to maintain, how much website traffic to expect in your e-store, or how many products will be sold next month. All of these are important time series problems to solve. For instance, large organisations like Facebook and Google must engage in capacity planning to allocate scarce resources, and in goal setting to keep pace with the rapid growth of their user base. The basic objective of time series analysis is usually to determine a model that describes the pattern of the time series and can be used for forecasting.

Classical time series forecasting techniques are built on statistical models that take a lot of effort to tune before they reach high accuracy. When a forecasting model doesn't perform as expected, someone has to tune the parameters of the method for the specific problem. Tuning these methods requires a thorough understanding of how the underlying time series models work. It's difficult for some organisations to handle that level of forecasting without a data science team. And it might not seem profitable for an organisation to keep a group of experts on board if there is no need to build a complex forecasting platform or other services.

Facebook developed "Prophet", an open source forecasting tool available in both **Python** and **R**. It provides intuitive parameters which are easy to tune. Even someone who lacks deep expertise in time-series forecasting models can use it to generate meaningful predictions for a variety of business problems.

Excerpt from Facebook Prophet website:

“Producing high quality forecasts is not an easy problem for either machines or for most analysts. We have observed two main themes in the practice of creating a variety of business forecasts:

- Completely automatic forecasting techniques can be brittle and they are often too inflexible to incorporate useful assumptions or heuristics.
- Analysts who can produce high quality forecasts are quite rare because forecasting is a specialised data science skill requiring substantial experience.”

- Very fast, since the model is fit in Stan, and the same fitting procedure backs both the R and Python interfaces.
- An additive regression model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects:
  - A piece-wise linear or logistic growth curve trend. Prophet automatically detects changes in trends by selecting changepoints from the data.
  - A yearly seasonal component modeled using Fourier series.
  - A weekly seasonal component using dummy variables.
  - A user-provided list of important holidays.
  - Ability to add additional regressors to the model.

- Robust to missing data and shifts in the trend, and typically handles outliers well.
- Easy to tweak and adjust the forecast by adding domain knowledge or business insights.

Prophet builds a model by finding a best smooth line which can be represented as a **sum of the following components:**

**y(t) = g(t) + s(t) + h(t) + ϵₜ**

- g(t) – Overall growth trend
- s(t) – Periodic changes (e.g., weekly and yearly seasonality)
- h(t) – Holiday effects which occur on irregular schedules
- ϵₜ – Error term (any idiosyncratic changes which are not accommodated by the model)
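To make the additive structure concrete, here is a minimal NumPy sketch that builds each component by hand and sums them. It is not how Prophet fits its model: the slope, the Fourier weights, the holiday positions and the helper `fourier_terms` are all made-up assumptions for illustration.

```python
import numpy as np

# Hypothetical toy setup: two years of daily time steps (not the article's data).
t = np.arange(730)

def fourier_terms(t, period, order):
    """Return sin/cos columns for a seasonal component with the given period."""
    k = np.arange(1, order + 1)
    angles = 2.0 * np.pi * np.outer(t, k) / period
    return np.hstack([np.sin(angles), np.cos(angles)])

# g(t): a simple linear growth trend (made-up offset and slope).
g = 10.0 + 0.01 * t

# s(t): yearly seasonality as a weighted sum of Fourier terms (made-up weights).
X_year = fourier_terms(t, period=365.25, order=3)
beta = np.array([0.5, 0.2, 0.1, 0.3, 0.1, 0.05])
s = X_year @ beta

# h(t): a fixed bump on two hypothetical holiday positions.
h = np.zeros(len(t))
h[[100, 465]] = 2.0

# The noise-free part of the model is the sum of the components.
y_hat = g + s + h
```

Fitting the model amounts to estimating quantities like the slope and the Fourier weights from data, rather than fixing them by hand as done here.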

In this blog post, we will look at some of the useful functions in the **fbprophet** library by training a basic Prophet model on an example data set. The tutorial covers the following topics.

- Installing & importing the dependencies
- Reading and preprocessing data
- Model Fitting
- Obtaining the forecasts
- Evaluating the model

Since this tutorial uses Python, the ways to install the Prophet package in a Python environment are listed below.

Just like every Python library, you can install **fbprophet** using pip. The major dependency that Prophet has is **pystan**, so install pystan with pip before using pip to install fbprophet.

`pip install pystan`

`pip install fbprophet`

You can also install prophet in your conda environment.

`conda install -c conda-forge fbprophet`
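Releases from 1.0 onward are published under the package name `prophet`, while this post uses the older `fbprophet` name. The following sketch checks which of the two is importable in the current environment; the helper name `find_prophet_module` is hypothetical.

```python
import importlib.util

def find_prophet_module():
    """Return the importable Prophet module name, or None if neither is installed."""
    # Newer releases are published as `prophet`, older ones as `fbprophet`.
    for name in ("prophet", "fbprophet"):
        if importlib.util.find_spec(name) is not None:
            return name
    return None

module_name = find_prophet_module()
print(module_name or "Prophet is not installed in this environment")
```

If the result is `prophet`, the import lines in this tutorial need `prophet` in place of `fbprophet`.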

After installation, let’s get started!

With the Python environment set up and the dependencies installed, let's import the libraries required for forecasting, including fbprophet.

`import pyodbc`

`import pandas as pd`

`import numpy as np`

`import matplotlib.pyplot as plt`

`from fbprophet import Prophet`

The dataset is then loaded as a pandas dataframe. It contains daily page views for the Wikipedia page of Peyton Manning. The dataset has been modified for presentation purposes in this article. You can access the dataset and the source code here.

`df = pd.read_csv("peyton_manning.csv")`

The first 10 and last 10 rows of the dataframe can be inspected with `df.head(10)` and `df.tail(10)`. The dataframe consists of two columns, “Date” and “Views”, where the number of views on each date has been recorded. This dataset has records from 2007 to 2016. The number of rows and columns in the dataset can be obtained with `df.shape`.
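As a quick illustration of these inspection calls, the sketch below runs them on a small stand-in dataframe. The values are made up and only mimic the two-column layout described above.

```python
import pandas as pd

# Stand-in data with the same two columns as the article's CSV (values are made up).
toy = pd.DataFrame({
    "Date": ["2007-12-10", "2007-12-11", "2007-12-12", "2007-12-13"],
    "Views": [14629, 5012, 3582, 3205],
})

print(toy.head(2))   # first rows
print(toy.tail(2))   # last rows
print(toy.shape)     # (number of rows, number of columns)
```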

First, the date column should be converted into datetime format before fitting our dataset to the model.

`df['Date'] = pd.to_datetime(df['Date'])`

`df.dtypes`

Output:

```
Date datetime64[ns]
Views int64
dtype: object
```

We can visualise how the data varies over time using the pandas `plot` method, which draws with Matplotlib.

`df.plot(x='Date')`

Taking the date column as the x axis produces the plot above, which does not look **stationary**. The values are right-skewed and the series is dominated by spikes. Prophet does not strictly require a stationary series, but a more stable series is easier to model. This can be achieved mainly in two ways.

- Taking the difference
  - `df.diff()`
  - y'(t) = y(t) - y(t-1)
  - `df['diff'] = df['a'] - df['a'].shift(1)`
- Log transformation: to stabilise inconsistent values
  - using `numpy.log()`
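The two transformations listed above can be sketched on a small made-up series. The numbers are illustrative only and are not taken from the Peyton Manning data.

```python
import numpy as np
import pandas as pd

# Made-up series with a growing level and growing swings.
views = pd.Series([100.0, 150.0, 300.0, 450.0, 900.0])

# First difference: y'(t) = y(t) - y(t-1); the first value has no predecessor.
diffed = views.diff()

# The same thing written with shift(), as in the list above.
diffed_manual = views - views.shift(1)

# Log transform: compresses large values, which tames multiplicative swings.
logged = np.log(views)

# The transform is undone with the exponential.
restored = np.exp(logged)
```

Differencing shortens the usable series by one observation, while the log transform keeps every row and can be reversed exactly, which is convenient when forecasts need to be reported in the original units.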

In this tutorial, the log transformation is applied to all the values in the Views column.

`df['Views'] = np.log(df['Views'])`

When the plot is drawn again, the spikes are compressed and the series looks much more stable.

Before fitting our model on the Peyton Manning dataset, the `Date` and `Views` columns should be renamed to `ds` and `y` respectively. Prophet expects these column names.

`df.columns = ['ds', 'y']`
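Assigning a list to `df.columns` relies on the columns being in the order Date, Views. A slightly more defensive alternative, sketched here on a stand-in dataframe with made-up values, is to rename by column name.

```python
import pandas as pd

# Stand-in frame; the column order is deliberately swapped to show why names matter.
toy = pd.DataFrame({
    "Views": [9.59, 8.52],
    "Date": pd.to_datetime(["2007-12-10", "2007-12-11"]),
})

# rename() maps by name, so it does not depend on column order.
toy = toy.rename(columns={"Date": "ds", "Views": "y"})
```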

When this is done, we are good to go ahead and train our prophet model.

We fit the model by instantiating a new `Prophet` object. Any settings to the forecasting procedure are passed into the constructor. Then you call its `fit` method and pass in the preprocessed dataset with historical data.

`model = Prophet()`

`model.fit(df)`
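As an illustration of passing settings into the constructor, the sketch below wraps the two calls in a hypothetical helper, `fit_prophet`. The three arguments shown are real constructor parameters set to their default values; the helper itself and the `SETTINGS` name are assumptions made for this example.

```python
# Default values for a few commonly tuned constructor arguments.
SETTINGS = {
    "changepoint_prior_scale": 0.05,  # larger values allow a more flexible trend
    "seasonality_mode": "additive",   # or "multiplicative"
    "interval_width": 0.80,           # width of the uncertainty interval
}

def fit_prophet(df, **settings):
    """Fit a Prophet model on a dataframe with `ds` and `y` columns."""
    # Imported lazily so the helper can be defined without Prophet installed.
    try:
        from prophet import Prophet      # package name in newer releases
    except ImportError:
        from fbprophet import Prophet    # package name used in this post
    model = Prophet(**settings)
    model.fit(df)
    return model

# Usage, assuming `df` is the preprocessed dataframe from the previous step:
# model = fit_prophet(df, **SETTINGS)
```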

Predictions are then made on a dataframe with a column `ds` containing the dates for which a prediction is to be made. You can get a suitable dataframe that extends into the future a specified number of days using the helper method `Prophet.make_future_dataframe`. By default it will also include the dates from the history, so we will see the model fit as well. The number of future dates to be predicted can be specified by the parameter `periods`.

`future_dates = model.make_future_dataframe(periods=365)`

The Peyton Manning dataset contains records from 2007 to 2016. If you examine the last rows of the `future_dates` dataframe, it now also contains dates from 2017, which will be included in the model's forecast.
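To show what such a future dataframe contains, here is a rough pandas-only sketch of the idea for daily data. The function `future_frame` is a hypothetical stand-in, not Prophet's implementation, and the three-day history is made up.

```python
import pandas as pd

def future_frame(history_dates, periods, include_history=True):
    """Build a `ds` dataframe that extends `periods` days past the last date."""
    last = history_dates.max()
    # Start one day after the last observed date.
    future = pd.date_range(start=last + pd.Timedelta(days=1), periods=periods, freq="D")
    dates = future.tolist()
    if include_history:
        dates = history_dates.tolist() + dates
    return pd.DataFrame({"ds": dates})

# Made-up three-day history, extended by two days.
history = pd.Series(pd.to_datetime(["2016-01-18", "2016-01-19", "2016-01-20"]))
frame = future_frame(history, periods=2)
```

The `include_history` flag mirrors the parameter of the same name on `make_future_dataframe`, which controls whether the historical dates are kept alongside the future ones.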

The `predict` method will assign each row in `future_dates` a predicted value which it names `yhat`. If you pass in historical dates, it will provide an in-sample fit. The `prediction` object here is a new dataframe that includes a column `yhat` with the forecast, as well as columns for components and uncertainty intervals.

`prediction = model.predict(future_dates)`

`model.plot(prediction)`

When you plot the prediction, it looks as follows.

In the figure above, the black dots are the actual data points. The dark blue line is the forecast `yhat`, including the predicted 2016-2017 period (indicated by the red arrow). The light blue band is the uncertainty interval bounded by `yhat_lower` and `yhat_upper`.
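Because the model was fit on log-transformed views, `yhat` and its bounds are on the log scale. The sketch below converts them back to page views with `np.exp`; the dataframe `prediction_toy` is a made-up stand-in for the last rows of `prediction`.

```python
import numpy as np
import pandas as pd

# Stand-in for the tail of a `prediction` dataframe (values are made up, log scale).
prediction_toy = pd.DataFrame({
    "ds": pd.to_datetime(["2017-01-18", "2017-01-19"]),
    "yhat": [8.0, 8.1],
    "yhat_lower": [7.4, 7.5],
    "yhat_upper": [8.6, 8.7],
})

cols = ["yhat", "yhat_lower", "yhat_upper"]

# exp() undoes the earlier np.log(), giving values in page views again.
views_forecast = prediction_toy[["ds"]].join(np.exp(prediction_toy[cols]))
```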

You can also see the forecast components using the `Prophet.plot_components` method. By default you’ll see the trend, yearly seasonality, and weekly seasonality of the time series. If you include holidays, you’ll see those here, too.

`model.plot_components(prediction)`

Once the forecast is obtained, the accuracy of the model has to be measured using a relevant performance metric. Prophet includes a built-in function for cross validation, which measures forecast error using historical data. The forecast horizon (`horizon`), the initial training period (`initial`) and the spacing between cutoff dates (`period`) should be specified.

Here cross-validation is done to assess prediction performance on a horizon of 365 days, starting with 730 days of training data in the first cutoff and then making predictions every 180 days. On this 8 year time series, this corresponds to 11 total forecasts.

`from fbprophet.diagnostics import cross_validation`

`df_cv = cross_validation(model, initial='730 days', period='180 days', horizon = '365 days')`
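The count of forecasts follows from how the cutoff dates are spaced. The sketch below is a simplified version of that logic, not Prophet's actual implementation, and the first and last dates are assumptions standing in for a series of roughly eight years; `make_cutoffs` is a hypothetical helper.

```python
import pandas as pd

def make_cutoffs(first, last, initial, period, horizon):
    """Walk backwards from (last - horizon) in steps of `period`."""
    cutoffs = []
    cutoff = last - horizon
    # Each cutoff must leave at least `initial` of training history before it.
    while cutoff >= first + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)

cutoffs = make_cutoffs(
    first=pd.Timestamp("2007-12-10"),   # assumed first date of the series
    last=pd.Timestamp("2016-01-20"),    # assumed last date of the series
    initial=pd.Timedelta(days=730),
    period=pd.Timedelta(days=180),
    horizon=pd.Timedelta(days=365),
)
print(len(cutoffs))
```

With these assumed dates the sketch yields 11 cutoffs, in line with the count quoted above. Each cutoff corresponds to one model fit on the data before it and one forecast over the following 365 days.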

The performance metrics can then be calculated. The statistics computed are mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), median absolute percent error (MDAPE) and coverage of the `yhat_lower` and `yhat_upper` estimates.

`from fbprophet.diagnostics import performance_metrics`

`df_p = performance_metrics(df_cv)`
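For reference, the formulas behind several of these statistics can be written out by hand. The sketch below computes them over a small made-up table with the same column names as `df_cv`; unlike `performance_metrics`, it averages over all rows at once rather than over a rolling window of the horizon.

```python
import numpy as np
import pandas as pd

# Made-up cross-validation rows: actual value, forecast and interval bounds.
cv_toy = pd.DataFrame({
    "y": [10.0, 20.0, 40.0],
    "yhat": [11.0, 18.0, 40.0],
    "yhat_lower": [9.5, 16.5, 38.5],
    "yhat_upper": [12.5, 19.5, 41.5],
})

err = cv_toy["y"] - cv_toy["yhat"]

mse = (err ** 2).mean()
rmse = np.sqrt(mse)
mae = err.abs().mean()
mape = (err.abs() / cv_toy["y"].abs()).mean()

# Share of actual values that fall inside the uncertainty interval.
inside = (cv_toy["y"] >= cv_toy["yhat_lower"]) & (cv_toy["y"] <= cv_toy["yhat_upper"])
coverage = inside.mean()
```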

Cross validation performance metrics can be visualized with `plot_cross_validation_metric`, here shown for MAPE. Dots show the absolute percent error for each prediction in `df_cv`. The blue line shows the MAPE, where the mean is taken over a rolling window of the dots. We see for this forecast that errors around 5% are typical for predictions one month into the future, and that errors increase up to around 11% for predictions that are a year out.

`from fbprophet.plot import plot_cross_validation_metric`

`fig = plot_cross_validation_metric(df_cv, metric='mape')`

The same plot can be produced for the other metrics such as rmse, mae and mse, which is done in the complete code. You can access the source code for this tutorial here.

There are many time-series models, such as ARIMA, exponential smoothing and seasonal naive (snaive), which can be used for forecasting from historical data. From the practical example, it seems that Prophet provides largely automated forecasts, just as its official documentation states. It’s fast and productive, which is very useful if your organisation doesn’t have a solid data science team handling predictive analytics. It saves time when answering forecasting questions from internal stakeholders or clients, without spending too much effort on building a model with classic time-series modeling techniques.

- Facebook prophet official documentation – https://facebook.github.io/prophet/
- Forecasting at Scale by Sean J. Taylor & Benjamin Letham – https://peerj.com/preprints/3190.pdf
