Querying JSON (JSONB) data types in PostgreSQL

One of the unusual features of the PostgreSQL database is the ability to store and process JSON documents. In the past, data analysts and engineers had to revert to a specialized document store like MongoDB for JSON processing. But with the powerful JSON features built into PostgreSQL, the need for an external document store is no longer necessary.

Document stores are enticing because it enables you to "store data now, figure out schema later." You were always able to store arbitrary data structures as plain text in databases like PostgreSQL and MySQL. But processing and speed were a problem because the database had no internal knowledge of the structure of the document. On every query, the database had to load and parse the entire text blob. Moreover, querying deep into the JSON document required the use of gnarly regular expressions.

PostgreSQL has two native data types to store JSON documents: JSON and JSONB. The key difference between them is that JSON stores data in a raw format and JSONB stores data in a custom binary format. Our focus here is going to be on the JSONB data type because it allows the contents to be indexed and queried with ease.


1. Difference between JSON and JSONB

The JSON data type is basically a blob that stores JSON data in raw format, preserving even insignificant things such as whitespace, the order of keys in objects, or even duplicate keys in objects. It offers limited querying capabilities, and it's slow because it needs to load and parse the entire JSON blob each time.

JSONB on the other hand stores JSON data in a custom format that is optimized for querying and will not reparse the JSON blob each time.

If you know before hand that you will not be performing JSON querying operations, then use the JSON data type. For all other cases, use JSONB.

The following example demonstrates the difference:

select '{"user_id":1,    "paying":true}'::json, '{"user_id":1, "paying":true}'::jsonb;

            json                |             jsonb              
--------------------------------+--------------------------------
{"user_id":1,    "paying":true} | {"paying": true, "user_id": 1}
(1 row)

(the whitespace and the order of the keys are preserved in the JSOB column.)

2. Basics of PostgreSQL's JSONB data type

We'll first look at some basic operations for inserting and updating JSONB columns.

Creating a JSONB column

create table sales (
    id serial not null primary key,
    info jsonb not null
);

Inserting a JSON document

insert into sales values (1, '{name: "Alice", paying: true, tags: ["admin"]}');

Updating a JSON document

Updating by inserting a whole document:

update sales set info = '{name: "Bob", paying: false, tags: []}';

Updating by adding a key:

Use the || operator to concatenate existing data with new data. The operator will either update or insert the key to the existing document.

update sales set info = info || '{"country": "Canada"}';

Update by removing a key:

Use the - operator to remove a key from the document.

update sales set info = info - 'country';

3. Querying the JSON document

PostgreSQL has two native operators -> and ->> to query JSON documents. The first operator -> returns a JSON object, while the operator ->> returns text. These operators work on both JSON as well as JSONB columns.

Because the -> operator returns an object, you can chain it to inspect deep into a JSON document. For example,

select '{"name": "Alice", "agent": {"bot": true} }'::jsonb -> 'agent' -> 'bot';
-- returns true

There are additional operators available for JSONB columns. Of these, @>, ?, || and - are the most interesting. We've already seen the || and - operators in use in the prior sections.

3a. JSONB Containment with @>

The containment operator @> tests whether one document contains another.

select '{"name": "Alice", "agent": {"bot": true} }'::jsonb @> '{"agent": {"bot": false}}';
-- returns false

select '{"name": "Alice", "agent": {"bot": true} }'::jsonb @> '{"agent": {"bot": true}}';
-- return true

3b. JSONB Existence with ?

You can use the ->> operator to return text and PostgreSQL's standard filtering operations on strings like =, <>, is null etc. For example,

select '{"name": "Alice", "agent": {"bot": true} }'::jsonb -> 'agent' ->> 'bot' is not null;

Alternatively, you can use the JSONB existence operator ? to check if a string exists as a top-level key.

select '{"name": "Alice", "agent": {"bot": true} }'::jsonb -> 'agent' ? 'bot';

4. JSONB functions

There are four (out of many) interesting functions that operate on JSONB we want to highlight. These functions have the json_ variant that works on JSON columns.

4a. jsonb_each

Expands the top-level JSON document into a set of key-value pairs.

select jsonb_each( '{"name": "Alice", "agent": {"bot": true} }'::jsonb );
jsonb_each
1 (name,"Alice")
2 (agent,"{"bot": true}")

4b. jsonb_object_keys

Returns the keys of the top-level JSON document.

select jsonb_object_keys( '{"name": "Alice", "agent": {"bot": true} }'::jsonb );
jsonb_object_keys
1 name
2 agent

4c. jsonb_extract_path

Returns a JSON object that is traversed by a "path".

select jsonb_extract_path( '{"name": "Alice", "agent": {"bot": true} }'::jsonb, 'agent', 'bot');
jsonb_extract_path
1 true

4d. jsonb_pretty

By default, PostgreSQL returns a compact representation which works for machine consumption. If you want your JSON documents pretty printed for human consumption, use this function:

select jsonb_pretty( '{"name": "Alice", "agent": {"bot": true} }'::jsonb );
-- returns the following
{
    "name": "Alice",
    "agent": {
        "bot": true
    }
}

👋 No fuss, just SQL We are open sourcing everything from the experience working with our agency clients. They spend thousands of dollars to get this level of detailed analysis – which you can now get for free. We send one update every week. Join 400+ data analysts who are leveling up with our recipes. 👊

No spam, ever! Unsubscribe any time. See past emails here.