Using SQL to analyzing Bitcoin, Ethereum & Cryptocurrency Performance

Cryptocurrency like Bitcoins, Ethererum and a hundred other currencies are probably the most exciting tech to appear on the scene after the Internet – some pundits even say cryptocoins maybe the TCP/IP of money. It remains to be seen how the tech evolves, but meanwhile we can do some basic technical analysis on what's fuelling the growth and to see if it's sustainable.

In this recipe, we'll analyze bitcoin, ethereum and other cryptocoin performances using intermediate to advanced SQL:

  1. Capturing correlations between the different cryptocoins ➡ helps balance/diversify your portfolio
  2. Exposure of cryptocoins to one specific fiat currency ➡ reduces risk of one foreign currency tanking your investments
  3. Sustainability of Initial Coin Offerings (ICO) ➡ the modern day gold-rush
  4. Liquidity of the top cryptocoins ➡ a health marker for the market for cryptocoins
  5. Possible emergent behavior that's typical of the public markets (window dressing) ➡ identifies cyclical patterns

Disclaimer: The purpose here is to teach SQL and timeseries data analysis, and not be a deep-dive technical post on cryptocoins. And it's most certainly not financial advice. If you find mistakes, please reach out with corrections.


Analysis 1: How correlated are cryptocoins with each other

As an investor, you'll be interested in diversifying your portfolio using a variety of assets – stocks, bonds, cash including foreign currency, real-estate and fine-art if you're into that. If you're looking to further diversify your portfolio using cryptocoins, you'll want to understand if the hundred or so coins are related in any way. Investing in assets that move up or down together exposes your portfolio to undue risk.

We'll look at the top three coins – Bitcoin, Ethereum and Monero and calculate their pair-wise correlations. We've calculated the correlation every quarter – the analysis shows certain quarters are highly correlated. Global events like the SEC rejecting the Bitcoin ETF proposal or the rise / fall of the DAO impacts all cryptocoins resulting in a strong correlation.

Chart 1: Quarterly correlations between Bitcoin, Ethereum and Monero

Chart 2: Monthly correlations between Bitcoin, Ethereum and Monero

We zoom into the fourth quarter of 2016, where apparently Ethereum shows a negative correlation with Bitcoin and Monero.

Analysis 2: What's the Japanese exposure to cryptocoin prices

There are hundreds of exchanges around the world where you can trade real-world money like US Dollars and Japanese Yen in exchange for cryptocoins. Charting the cryptocoin exchange volumes across different currencies, we see really interesting patterns. We've made the chart a proportional chart (a chart that adds upto 100% because trading volumes are growing over time and absolute numbers will mask this pattern.)

Until the end of 2013, the US dollar dominated the exchanges, but for the next three years, the Chinese Yuan dominates. Early on in 2017, the Japanese Yen picks up as a dominant currency equal to and sometimes surpassing the US dollar.

Just for fun, we compute correlations in cryptocoin trading volumes in Japanese Yen and US dollars across Bitcoin, Ethereum and Monero. As expected, it's mostly positive correlations. If there were any arbitrage opportunities, it's not visible in the data granuarity available to us.

Analysis 3: Are Initial Coin Offerings sustainable?

There's an Initial Coin Offering (ICO) almost every day and the data is too noisy to understand anything. There are now well over a thousand cryptocoins available for trade. Soliciting analysis ideas.

A comparison with the years 1999-2000 Internet stock bubble would be a good comparison. We know how that story ends – will ICOs have the same fate?

Analysis 4: How liquid are cryptocoins

I equate liquidity of a security / financial instrument as a proxy for the health of the underlying instrument. Tell me if I'm wrong.

Looking at month-over-month growth of trading volume – you see big spikes followed by big drops. The market for cryptocoins still has liquidity issues and in our humble opinion, still unstable.

Analysis 5: Do cryptocoins exhibit end-of-quarter volume volatility (window dressing)

Traditional assets classes like mutual funds show anomalous end-of-quarter effects while fund managers sell / buy holdings to make their portfolio look good. Our analysis is as follows:

For the S&P 500 index and bitcoins, we look at trading volumes on the last day for every quarter. For the S&P 500 index, we look at the last 30 years, while for bitcoins we look at the last five years. We assign a postiive one when the trading volume is higher than the 30-day trailing moving average, and a negative one when it's not.

For the S&P 500 index, this analysis shows clear bull and bear markets, specially around 1998-2000 and 2008. For the bitcoin trading volumes, there's no discernable pattern to speak of.

The SQL

Data retrieved from CoinGecko, Cryptocompare and Bitcoinity.

Here's the SQL code you can use as a template for your analysis.

Query 1: How correlated are cryptocoins with each other

We use PostgreSQL's corr function to calculate correlations between time series. We further breakdown the timeseries by quarters.

SELECT [crypto.eth_usd_jpy_cny.ts:quarter] AS "Quarter",
    corr(btc_usd, eth_usd) AS "BTC ETH corr", 
    corr(eth_usd, xmr_usd) AS "ETH XMR corr", 
    corr(btc_usd, xmr_usd) AS "BTC XMR corr"
FROM crypto.eth_usd_jpy_cny,
    crypto.btc_usd_jpy_cny,
    crypto.xmr_usd_jpy_cny
WHERE crypto.eth_usd_jpy_cny.ts = crypto.btc_usd_jpy_cny.ts
    AND crypto.xmr_usd_jpy_cny.ts = crypto.btc_usd_jpy_cny.ts
GROUP BY 1
ORDER BY 1 

Query 2: What's the Japanese exposure to cryptocoin prices

The proportional chart was done on the visualization layer. If you're interested in doing it in the SQL, Percentage of total sum shows you how.

SELECT [ts:month],
        sum(usd_volume) AS "US Dollars",
        sum(jpy_volume) AS "Japanese Yen",
        sum(cny_volume) AS "Chinese Yuan"
FROM crypto.btc_volume
GROUP BY 1
ORDER BY 1 DESC 

Query 3: Are Initial Coin Offerings sustainable

We find the first date a currency started trading with the caveat that it might not be the ICO date. We then calculate a cumulative sum using window functions.

SELECT ts,
    sum(count(*)) OVER (ORDER BY ts)
FROM
    (SELECT currency_code,
            date_trunc('month', min(TIME)) AS ts
    FROM crypto.crypto_prices
    GROUP BY currency_code
    ORDER BY 2 ) a
GROUP BY ts
ORDER BY ts

Query 4: Crytocoin liquidity

We use window functions to compute month-over-month growth rates in trading volumes.

SELECT [ts:month],
    SUM(volume_btc), 
    SUM(volume_btc) - LAG(SUM(volume_btc), 1) OVER (ORDER BY [time:month]) 
FROM crypto.crypto_prices
WHERE currency_code = 'XMR'
    AND ts > '2017-01-01'
GROUP BY 1 

Query 5: End-of-quarter volatility

A couple of tricks here:

  1. We find the last day of the quarter by truncating the date to a quarter (which resets the date to the first day), adding three months and subtracting one day
  2. Use a CASE statement to zero out days that are not the last day of the quarter
  3. Calculate a simple 30-day moving average for the time series
  4. Use another CASE statement to assign +1 and -1 for days that are higher or lower than the moving average
  5. Summing up the volatility by year to identify bull / bear years
SELECT [ts:year], 
SUM ( CASE WHEN last_day - ma > 0 THEN 1 ELSE -1 END )
FROM (
    SELECT [date:day] AS ts,

        CASE WHEN (date_trunc('quarter', date) + interval '3 month - 1 day')::date = date::date
        THEN volume 
        ELSE 0 
        END as last_day,

        AVG(volume) OVER (ORDER BY date ROWS BETWEEN 30 PRECEDING AND CURRENT ROW) AS ma
    FROM crypto.sp_volume
    WHERE date > '1980-01-01' 
    ORDER BY 1 
) a

WHERE last_day > 0
GROUP BY 1 
ORDER BY 1

👋 No fuss, just SQL We are open sourcing everything from the experience working with our agency clients. They spend thousands of dollars to get this level of detailed analysis – which you can now get for free. You'll learn bucketing, finding outliers, calculating correlations, budgeting & forecasting, return on spend and investment, data smoothing and cleaning applied to freemium, transactional, advertising, two-sided marketplace and subscription based business models. We send one update every week. Join 400+ data analysts who are leveling up with our recipes. 👊

No spam, ever! Unsubscribe any time.