Calculating Percentage (%) of Total Sum in SQL

How to compute the ratio of a particular row to the sum of values?

This question comes up frequently when you want to know the relative contribution of a row to the total sum of all the rows. For example:

  • how browser market share is changing over time
  • what each salesperson's contribution to your company's revenue is

Consider a table with the number of page views (in billions) for each browser:

Browser Pageviews
Chrome 7.1685
Safari 1.935
Firefox 1.3455
UC Browser 1.0965
IE 1.341
Opera 0.816
Android 0.7245
Rest 1.2

What we really want to see is the browser market share. We can use a Common Table Expression (CTE) to compute the total pageviews, which we then use to compute the ratio:

WITH total AS
    ( SELECT sum(pageviews) as total
    FROM pageviews )
SELECT browser,
    pageviews / total.total as share
FROM pageviews,
    total

This gives each browser's share of the total:

Browser Share
Chrome 0.4587
Safari 0.1238
Firefox 0.0861
UC Browser 0.0702
IE 0.0858
Opera 0.0522
Android 0.0464
Rest 0.0768
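
As a quick way to try the CTE approach without a database server, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3 module and runs a query of the same shape. SQLite is only a stand-in here (an assumption, not the engine the recipe targets), and the comma join is written as an explicit CROSS JOIN. A useful sanity check on any share calculation is that the shares add up to 1.

```python
import sqlite3

# Sample rows copied from the pageviews table above (billions of page views).
ROWS = [
    ("Chrome", 7.1685), ("Safari", 1.935), ("Firefox", 1.3455),
    ("UC Browser", 1.0965), ("IE", 1.341), ("Opera", 0.816),
    ("Android", 0.7245), ("Rest", 1.2),
]

# SQLite in memory is an assumption made so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pageviews (browser TEXT, pageviews REAL)")
conn.executemany("INSERT INTO pageviews VALUES (?, ?)", ROWS)

# Same shape as the CTE query above: one-row total, cross joined to every row.
SHARE_SQL = """
WITH total AS (SELECT sum(pageviews) AS total FROM pageviews)
SELECT browser, pageviews / total.total AS share
FROM pageviews CROSS JOIN total
ORDER BY share DESC
"""

shares = conn.execute(SHARE_SQL).fetchall()
for browser, share in shares:
    print(f"{browser:<12}{share:.4f}")

# Sanity check: all shares together should account for the whole.
print("sum of shares:", round(sum(share for _, share in shares), 6))
```

If the pageviews column held integers rather than decimals, some engines would perform integer division, so casting one operand to a decimal or float type first is a sensible precaution.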

[Figure: browser share of total pageviews]

Percentage to Total per Group

The next question to ask is how this changes over time.

What we are attempting to do here is group our data into months, compute the total for each group, and compute a ratio for each row within that group. An overall total wouldn't make sense here. Consider the pageviews table as before, but with an additional date field:

dt Browser Pageviews
2016-01-01 Chrome 7.1685
2016-01-01 Safari 1.935
2016-01-01 ... ...
2016-01-02 Chrome 7.2485
2016-01-02 Safari 1.721
2016-01-02 ... ...
... ... ...
2016-12-31 Chrome 7.864
2016-12-31 Safari 2.011
2016-12-31 ... ...

This time we resort to a window function, partitioned over the month portion of the date:

SELECT date_trunc('month', dt) AS month,
    browser,
    pageviews / sum(pageviews) OVER(PARTITION BY date_trunc('month', dt)) AS share
FROM pageviews

Let's unpack the query a bit. Our window function creates partitions (groups) for each month, and for each partition we sum up the pageviews. The ratio between the current row pageviews and the total for that group is exactly what we want.
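
To see the partitioned ratio in action, here is a small sketch using Python's sqlite3 module with only the six rows that are spelled out in the dated table above (the elided rows are left out). Two things are assumptions on my part: SQLite stands in for the real warehouse, and since SQLite has no date_trunc, strftime('%Y-%m', dt) plays that role. Window functions need SQLite 3.25 or later. The sketch also shows a variant that first aggregates to one row per month and browser, which may be closer to what you want when the table holds daily rows, because the row-level query returns one ratio per daily row.

```python
import sqlite3

# Only the rows shown explicitly in the dated table above.
ROWS = [
    ("2016-01-01", "Chrome", 7.1685), ("2016-01-01", "Safari", 1.935),
    ("2016-01-02", "Chrome", 7.2485), ("2016-01-02", "Safari", 1.721),
    ("2016-12-31", "Chrome", 7.864), ("2016-12-31", "Safari", 2.011),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pageviews (dt TEXT, browser TEXT, pageviews REAL)")
conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?)", ROWS)

# Row-level ratio: each daily row divided by its month's total.
# strftime('%Y-%m', dt) is SQLite's stand-in for date_trunc('month', dt).
ROW_SQL = """
SELECT strftime('%Y-%m', dt) AS month, dt, browser,
       pageviews / sum(pageviews)
           OVER (PARTITION BY strftime('%Y-%m', dt)) AS share
FROM pageviews
ORDER BY dt, browser
"""

# Monthly share per browser: aggregate to one row per (month, browser)
# first, then divide by the month's total with the same window function.
MONTHLY_SQL = """
WITH monthly AS (
    SELECT strftime('%Y-%m', dt) AS month, browser, sum(pageviews) AS views
    FROM pageviews
    GROUP BY month, browser
)
SELECT month, browser, views / sum(views) OVER (PARTITION BY month) AS share
FROM monthly
ORDER BY month, browser
"""

row_shares = conn.execute(ROW_SQL).fetchall()
monthly_shares = conn.execute(MONTHLY_SQL).fetchall()

for month, browser, share in monthly_shares:
    print(month, browser, round(share, 4))
```

In both queries the ratios within a month add up to 1; the difference is only whether a browser appears once per day or once per month.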

Redshift has ratio_to_report

Fortunately, Redshift supports a window function called ratio_to_report, which computes the ratio of a value in a row to the sum of that expression over the partition:

SELECT date_trunc('month', dt) AS month,
    browser,
    ratio_to_report(pageviews) OVER(PARTITION BY date_trunc('month', dt)) AS share
FROM pageviews
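
To make the semantics concrete, here is a pure-Python illustration of what a ratio-to-report calculation does: sum the values within each partition, then divide each row's value by its partition's sum. This is only a sketch of the idea, not Redshift's implementation; the helper name and its dictionary-based rows are hypothetical, and it does not handle NULLs or partitions that sum to zero.

```python
from collections import defaultdict


def ratio_to_report(rows, value_key, partition_key):
    """Return value / sum(value) within each partition, one ratio per row.

    Hypothetical helper for illustration; rows are plain dictionaries.
    """
    # First pass: total of the value column for each partition.
    totals = defaultdict(float)
    for row in rows:
        totals[row[partition_key]] += row[value_key]
    # Second pass: each row's value relative to its partition total.
    return [row[value_key] / totals[row[partition_key]] for row in rows]


# A few values from the dated table above, already labeled by month.
rows = [
    {"month": "2016-01", "browser": "Chrome", "pageviews": 7.1685},
    {"month": "2016-01", "browser": "Safari", "pageviews": 1.935},
    {"month": "2016-12", "browser": "Chrome", "pageviews": 7.864},
    {"month": "2016-12", "browser": "Safari", "pageviews": 2.011},
]

ratios = ratio_to_report(rows, "pageviews", "month")
for row, ratio in zip(rows, ratios):
    print(row["month"], row["browser"], round(ratio, 4))
```

This is the same calculation as dividing by sum(pageviews) OVER(PARTITION BY ...), which is why that form works as a fallback on engines without ratio_to_report.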
