Time Series and Machine Learning - The mathematics beneath [2/4] With Examples

In the first part of this beginning-to-end tutorial on time series analysis, we looked at a few historical examples of time series data and analysis:

Time Series and Machine Learning – An introduction

This is the second part, so let’s get into the math and statistics of time series.

Also, the diagrams are hand-drawn by me, so they may look a little different from the graphs in other articles. 🙂

Time series classification

Time series can be classified in three ways:

  • Discrete & Continuous
  • Deterministic & Non-deterministic
  • Stationary & Non-stationary

Short definitions of these can be as follows:

Discrete: observations are taken at specific times, usually equally spaced.

Continuous: observations are taken continuously through time.

Deterministic: future values can be determined (predicted) exactly from past values.

Non-deterministic (aka stochastic): exact predictions are not possible; future values only have a probability distribution conditioned on past values.

Stationary: there is no systematic change in the mean (no trend), no systematic change in the variance, and no strictly periodic variation.

Stationary Time Series

Non-stationary: the statistical properties (such as the mean or the variance) of one period differ from those of another.


Non Stationary Time Series
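To make the stationary vs. non-stationary distinction a bit more concrete, here is a minimal sketch. It compares the mean of the first half of a series with the mean of the second half. The two synthetic series, the seed, and the helper name `half_means` are my own assumptions for illustration, and comparing half-means is only an informal check, not a formal stationarity test.

```python
import random
from statistics import mean


def half_means(series):
    """Return the means of the first and second halves of a series."""
    mid = len(series) // 2
    return mean(series[:mid]), mean(series[mid:])


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000

# Stationary example: pure noise around a constant level of 0.
stationary = [rng.gauss(0, 1) for _ in range(n)]

# Non-stationary example: the same kind of noise plus a linear trend,
# so the mean drifts upward over time.
trending = [0.05 * t + rng.gauss(0, 1) for t in range(n)]

s1, s2 = half_means(stationary)
t1, t2 = half_means(trending)
print(f"stationary halves: {s1:.2f} vs {s2:.2f}")
print(f"trending halves:   {t1:.2f} vs {t2:.2f}")
```

For the noise-only series the two half-means should be close to each other, while for the trending series the second half should sit clearly above the first.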

Components of a time series

A time series can be broken down into the following hierarchy of components:

  • A random element (irregularity), also called noise. It cannot be predicted and is always present to some degree.
  • Systematic component: Has two components –
    • Trend
    • Periodic elements
      • Short term periodic component
      • Long term periodic component

Mathematical Models for time-series

In statistics, a model is a representation of a system in which an unknown function is expressed in terms of known functions or variables.

There are two classical time series models:

  • Additive: $Y_t = T_t + S_t + C_t + I_t$
  • Multiplicative: $Y_t = T_t \times S_t \times C_t \times I_t$

where:

  • $Y_t$ = the time series function
  • $T_t$ = trend
  • $S_t$ = seasonality
  • $C_t$ = cyclical
  • $I_t$ = random components
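Here is a small sketch of how the two models combine their components. All the numbers (a 24-point monthly series, the trend slope, the seasonal amplitude) are made up for illustration, and to keep it short I leave out the cyclical and random components (i.e. set them to 0 in the additive model and to 1 in the multiplicative one). It also shows that taking logarithms turns the multiplicative model into an additive one.

```python
import math

n = 24  # two hypothetical years of monthly observations

# T_t: a made-up linear trend.
trend = [100 + 2 * t for t in range(n)]

# S_t for the additive model: expressed in the same units as Y_t.
seasonal_add = [10 * math.sin(2 * math.pi * t / 12) for t in range(n)]

# S_t for the multiplicative model: expressed as a factor around 1.
seasonal_mul = [1 + 0.1 * math.sin(2 * math.pi * t / 12) for t in range(n)]

# Additive: Y_t = T_t + S_t   (C_t and I_t taken as 0 here)
additive = [T + S for T, S in zip(trend, seasonal_add)]

# Multiplicative: Y_t = T_t * S_t   (C_t and I_t taken as 1 here)
multiplicative = [T * S for T, S in zip(trend, seasonal_mul)]

# Taking logs makes the multiplicative model additive:
# log Y_t = log T_t + log S_t
log_y = [math.log(y) for y in multiplicative]
log_parts = [math.log(T) + math.log(S) for T, S in zip(trend, seasonal_mul)]
```

In the additive series the seasonal swing stays the same size as the trend grows, while in the multiplicative series the swing grows in proportion to the trend.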

Methods of trend enumeration/trend component determination

One of the most important things to study is the trend of your series. Generally, there are two different reasons for studying the trend:

  • to eliminate the trend from the series
  • to study the trend and attempt to forecast future behavior of said trend.

There are four methods for determining the trend component:

  • Freehand curve fitting
  • Method of Semi-averages
  • Fitting mathematical curves
  • Moving Averages method

We’ll go over three of these methods to explain the concept well.

1. Method of semi-averages

Assumption: The underlying trend is linear.

The data are divided into two equal parts with respect to time (if the number of observations is odd, the middle observation is usually left out).

Then we compute the arithmetic mean for each part and plot these two averages against the mid values of the respective periods covered by each part.

The line obtained by joining these two points is the required trend line, and it may be extended both ways to estimate intermediate or future values.


Method Of Semi Averages

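If each half contains $m$ observations, so that $t = 1, 2, \ldots, 2m$, the two points are $\left(\frac{m+1}{2}, \bar{y}_1\right)$ and $\left(\frac{3m+1}{2}, \bar{y}_2\right)$, where $\bar{y}_1$ and $\bar{y}_2$ are the means of the first and second halves.

Equation of the straight line through these two points:

$$ y - \bar{y}_1 = \frac{\bar{y}_2 - \bar{y}_1}{m}\left(t - \frac{m+1}{2}\right) $$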
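The steps of the method of semi-averages can be sketched in a few lines of Python. This is only an illustration: the function name `semi_average_trend` is mine, the observations are assumed to sit at t = 1, 2, ..., n, and dropping the middle value when n is odd is a common convention that I am assuming here.

```python
from statistics import mean


def semi_average_trend(y):
    """Fit a trend line by the method of semi-averages.

    Observations are assumed to sit at t = 1, 2, ..., len(y).
    Returns (slope, intercept) of the line y = intercept + slope * t.
    """
    n = len(y)
    m = n // 2  # number of observations in each half
    if m < 1:
        raise ValueError("need at least two observations")

    first = y[:m]
    second = y[n - m:]  # skips the middle value when n is odd

    # Mid-times of the two halves.
    t1 = (m + 1) / 2
    t2 = (n - m) + (m + 1) / 2

    y1, y2 = mean(first), mean(second)

    # Straight line through (t1, y1) and (t2, y2).
    slope = (y2 - y1) / (t2 - t1)
    intercept = y1 - slope * t1
    return slope, intercept


# Usage on a perfectly linear toy series y_t = 3 + 2t, t = 1..8.
toy = [3 + 2 * t for t in range(1, 9)]
slope, intercept = semi_average_trend(toy)
print(slope, intercept)
```

On exactly linear data the method recovers the line itself; on real data it only gives a rough linear trend, since each half is summarised by a single average.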

2. Fitting mathematical curves

Assumption: The trend is of polynomial form, $y_t = T_t + I_t$,

where $T_t$ is a polynomial in $t$ such as $a + bt$, $a + bt + ct^2$, etc.

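We estimate the best-fitting polynomial by least squares: the sum of squared errors is minimized by setting its first derivatives with respect to the coefficients to zero (which gives the normal equations), and the Hessian matrix confirms that the solution is a minimum. The calculation would be quite difficult for me to type here, but you can read about it in some of the books mentioned in the recommended section.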
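As a rough sketch of what that calculation looks like in code, here is a least-squares polynomial fit done by building and solving the normal equations directly. I am assuming the fit is ordinary least squares; the function name `fit_polynomial` and the toy data are mine, and the small Gaussian-elimination solver is only there to keep the example free of dependencies.

```python
def fit_polynomial(t, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree] for the polynomial
    c0 + c1*t + ... + c_degree*t**degree.
    """
    k = degree + 1

    # Normal equations A c = b, obtained by setting the first derivatives
    # of the sum of squared errors to zero:
    #   A[i][j] = sum(t**(i+j)),  b[i] = sum(y * t**i)
    A = [[sum(ti ** (i + j) for ti in t) for j in range(k)] for i in range(k)]
    b = [sum(yi * ti ** i for ti, yi in zip(t, y)) for i in range(k)]

    # Solve by Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            factor = A[row][col] / A[col][col]
            for j in range(col, k):
                A[row][j] -= factor * A[col][j]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs


# Usage on a noise-free quadratic trend 1 + 2t + 0.5t^2, t = 1..10.
t = list(range(1, 11))
y = [1 + 2 * ti + 0.5 * ti ** 2 for ti in t]
coeffs = fit_polynomial(t, y, 2)
print(coeffs)
```

In practice you would probably reach for a library routine such as `numpy.polyfit` instead of solving the normal equations by hand, since they become numerically ill-conditioned for higher degrees.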

We can similarly fit exponential curves of the form $Y_t = ab^t$ by taking the logarithm, $\log Y_t = \log a + t \log b$, and then fitting a straight line (a first-degree curve) to the logarithm.
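Here is a minimal sketch of the exponential case. Since log(a·b^t) = log a + t·log b is linear in t, a straight-line least-squares fit on the logged values is enough to recover a and b. The function name `fit_exponential` and the toy data are my own, and the sketch assumes all observations are positive.

```python
import math


def fit_exponential(t, y):
    """Fit y = a * b**t by least squares on log(y) = log(a) + t*log(b).

    All y values must be positive. Returns (a, b).
    """
    log_y = [math.log(v) for v in y]
    n = len(t)
    t_bar = sum(t) / n
    ly_bar = sum(log_y) / n

    # Ordinary least-squares slope and intercept of log(y) against t.
    slope = (
        sum((ti - t_bar) * (li - ly_bar) for ti, li in zip(t, log_y))
        / sum((ti - t_bar) ** 2 for ti in t)
    )
    intercept = ly_bar - slope * t_bar

    # Undo the logarithm: intercept = log(a), slope = log(b).
    return math.exp(intercept), math.exp(slope)


# Usage on a noise-free toy series y_t = 3 * 1.5**t, t = 0..9.
ts = list(range(10))
ys = [3 * 1.5 ** ti for ti in ts]
a, b = fit_exponential(ts, ys)
print(a, b)
```

Note that this minimizes the squared error of log(y), not of y itself, so on noisy data it weights small values relatively more than a direct fit would.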

Demerits: it is quite tedious to perform by hand. Also, it completely ignores seasonal, cyclic and irregular fluctuations.

3. Method of moving averages

Let us consider a time series $\{ y_t \mid t = 1, 2, 3, \ldots \}$.

The k-point weighted moving average value is defined as:

K Points Weighted Moving Average

Here, $\sum_j w_j = 1$, and $M[w_1, w_2, w_3, \ldots]$ is called the k-point moving average operator. The k-point weighted moving average corresponding to $y_t$ may be defined as:

Solving The Moving Average Operator
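To see the operator in action, here is a small sketch of a k-point weighted moving average. I am assuming a centred window with an odd number of weights (so each output value lines up with the middle observation of its window); other indexing conventions exist, and the function name `weighted_moving_average` is my own.

```python
def weighted_moving_average(y, weights):
    """Centred k-point weighted moving average, with k = len(weights) odd.

    The weights must sum to 1. The result is shorter than y by k - 1
    values, because the first and last k // 2 points have no full window.
    """
    k = len(weights)
    if k % 2 == 0:
        raise ValueError("this sketch assumes an odd number of weights")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    m = k // 2  # number of points on each side of the centre
    return [
        sum(w * y[t + j - m] for j, w in enumerate(weights))
        for t in range(m, len(y) - m)
    ]


# Usage: a 3-point moving average with weights 1/4, 1/2, 1/4.
series = [2, 4, 6, 8, 10]
smoothed = weighted_moving_average(series, [0.25, 0.5, 0.25])
print(smoothed)
```

With equal weights 1/k this reduces to the simple moving average; unequal weights let you give the centre of the window more influence than its edges.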

For example, suppose we have a time series $\{x_t\}$ whose base (origin) and scale are changed as $y_t = (x_t - a)/b$.
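The worked result for this example is not shown here, so the following is only a sketch. It assumes the operator is a weighted sum of neighbouring values, $M y_t = \sum_j w_j y_{t+j}$, with $\sum_j w_j = 1$ (the exact index range does not matter for the argument). Under that assumption, the moving average of the transformed series follows the same change of base and scale:

```latex
\begin{aligned}
M y_t &= \sum_j w_j \, \frac{x_{t+j} - a}{b} \\
      &= \frac{1}{b}\left( \sum_j w_j x_{t+j} - a \sum_j w_j \right) \\
      &= \frac{M x_t - a}{b}
\end{aligned}
```

The last step is where the condition $\sum_j w_j = 1$ is used. In other words, you can compute the moving average on the shifted and rescaled series, which is often easier by hand, and then convert back with $M x_t = a + b \, M y_t$.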

Ending Note

Those are the absolute basics of time series, but there’s still another section on math to cover before we go on to working on a real dataset.

If you have any questions, mention them in the comments. Bookmark the website, and keep yourself updated. Here’s the third part of the series, so check that out:

Time Series & Machine Learning – Autocorrelation, Heteroskedasticity, ARMA, ARIMA and more

By admin
