# Calculating Exponential Moving Average in SQL with Recursive CTEs

Similar to simple/weighted moving averages, exponential moving averages (EMA) smooth out the observed data values. The difference is that these methods use the previously calculated EMA value as a basis rather than the original (non-smooth) data value. Since EMA builds upon itself, all previous data values have some effect on the new EMA, though the effect diminishes quickly with time.

## 1. The math

The equation for EMA is recursive, i.e., new values use the previously calculated EMA values:

where,

Variable | Explanation |
---|---|

Current observation at . | |

Current EMA at , i.e., EMA to be calculated. | |

Smoothing factor that determines how much weight we give to the most recent observed value versus the last calculated EMA. is between 1 and 0. A value of 0.5 weighs both sides equally. | |

Last calculated EMA, with a special value of to be equal to the first data point. |

## 2. The SQL with PostgreSQL recursive query

To implement this in SQL, let's take an illustrative example: our data is sales data that shows a strong seasonal component. Moreover, it's trending higher period-over-period.

The table has two columns: dt a timestamp and sales as a number.

To solve the recursive equation, we'll use PostgreSQL's recursive queries. PostgreSQL queries can have a `WITH`

clause, that allows you use statements – know as common table expressions (CTE) – that behave like temporary tables that only exist during the execution of the query.

If the CTE contains a recursive component, that's a recursive query. Recursive queries are useful to query data that demonstrate a hierarchical nature.

To define a recursive query, we need two parts – an initial query that is non-recursive and the recursive part. The general form of a recursive query is as follows:

```
WITH cte_name (
CTE_query_definition -- non-recursive portion
UNION [ALL]
CTE_query_definion -- recursive portion
)
SELECT * from cte_name;
```

For the non-recursive portion, we will pick out the first row of the sales data. We can get the first row by numbering the results with the `row_number`

window function and adding a `WHERE`

clause:

```
SELECT * FROM (
SELECT dt,
sales,
row_number() OVER ()
FROM sales_data) w
WHERE row_number = 1;
```

For the non-recursive portion, we use a self-join with the rows offset by 1 to get the previous row (we additionally re-factor getting the columns from the table as `t`

)

```
WITH t as (
SELECT dt,
sales,
row_number() OVER ()
)
SELECT * FROM t
JOIN t t2 ON t2.row_number = t.row_number - 1;
```

Now, it's a matter of filling in the equation to calculate EMA, a `UNION`

with the non-recursive part and making sure both portions have the same number of columns.

```
WITH RECURSIVE t AS (
SELECT dt,
0.5 AS alpha,
row_number() OVER (),
sales
FROM sales_data
),
ema AS (
SELECT *, sales AS sales_ema FROM t
WHERE row_number = 1
UNION ALL
SELECT t2.dt,
t2.alpha,
t2.row_number,
t2.sales,
t2.alpha * t2.sales + (1.0 - t2.alpha) * ema.sales AS sales_ema
FROM ema
JOIN t t2 ON ema.row_number = t2.row_number - 1
)
SELECT dt, sales, sales_ema
FROM ema;
```

You are reading a series of **Advanced SQL** tricks.

Read this next: Calculating Difference from Beginning Row

Join **300+** data analysts who are using our weekly SQL tips to improve their skills. No spam, promise!