Querying JSON with PostgreSQL

Sivakumar V

3 min read

PostgreSQL provide powerful operator and methods to process JSON data. I have taken mock data to explore different operators and methods. Before proceed make sure you have installed PostgreSQL 9.4+ and psql.

I have created mock date with help of Faker gem. Mock data available at https://gist.githubusercontent.com/sivakumarbdu/5c6838015f1ad9efc1959b015d939587/raw/71c0810ffd7dc9e41bc01be8c8ee76f99b6d3dc2/person-data.sql

JSON data will be look like this

{"0":
{"name":"Kiera Halvorson","email":"keegan.bashirian@legros.name","interest":["memes","agriculture"],"age":25},
"1" :{},
...}

In this example, “0” is the Id of a person and it has related information as values. This JSON has information about persons interest. It contains two interest for a person. Lets call it as primary and secondary interest. They might be same (Because generated by Random)

Date generated using ruby script with help of Faker Gem. Ruby script for generating mock data is available here https://gist.github.com/sivakumarbdu/591f6faa9fe55f2d1c4744703b77c193

We have mock data of 10000 users and their interest and age. We are going to frame the SQL query for following.

  • Display all users’s details
  • User who are interested it “DIY home”
  • Users who are interested it “nature” and their age is less than 25
  • User who has same primary and secondary interest
  • User who are interested in “web design” and “ai”

Lets frame query for each. Before proceed lets create a database and table in PostgreSQL and insert our mock data.

Login to postgresql console and create database, table and mock data from file.

Download the mock data and save it on your desk. https://gist.githubusercontent.com/sivakumarbdu/5c6838015f1ad9efc1959b015d939587/raw/71c0810ffd7dc9e41bc01be8c8ee76f99b6d3dc2/person-data.sql

Lets create “person-data.json” at /home/$user/Documents/

 

CREATE DATABASE json_demo;
CREATE TABLE  person(data json)

Once done with above query exit from PostgreSQL and run the following command to insert data into person table. Below examples uses psql utility.

#insert data using psql client.
psql -h $host-U $user -d json_demo -f person-data.sql

Now we are ready to query the data. Now we have only on row which contains 10000 person data in a JSON. Lets start with our queries.

  1. Display all users’s details
SELECT
  json_each_text(person.data)
FROM
  person

Output will be like this. It contains two column key and value.

(0,"{""name"":""Kiera Halvorson"",""email"":""keegan.bashirian@legros.name"",""interest"":[""memes"",""agriculture""],""age"":25}")
(1,"{""name"":""Golden Lynch"",""email"":""brandon.kautzer@boehm.org"",""interest"":[""DIY home"",""electronics""],""age"":43}")
.
.
.

Now Lets take each persons interest. In the below query we are taking key(person’s Id) and navigating to next level to get person’s interest.

SELECT
 person_data.key ID,
 data -> person_data.key::text ->> 'interest'  Interest
FROM
  person, json_each_text(person.data) as person_data

Output will be

id  |interest                          |
----|----------------------------------|
0   |["memes","agriculture"]           |
1   |["DIY home","electronics"]        |
2   |["garden design","DIY home"]      |
...

Now lets take only primary interest of a person.

SELECT
  person_data.key ID,
  (data -> person_data.key::text ->> 'interest')::jsonb -> 0  Interest
FROM
  person, json_each_text(person.data) as person_data

Output will be

id  |interest        |
----|----------------|
0   |"memes"         |
1   |"DIY home"      |
2   |"garden design" |
.
.
.
.

Note the selection option “(data -> person_data.key::text ->> ‘interest’)::jsonb -> 0”.

Similarly to display person’s secondary interest use the index 1

SELECT
  person_data.key ID,
  (data -> person_data.key::text ->> 'interest')::jsonb -> 1 Interest
FROM
  person, json_each_text(person.data) as person_data

output will be

id  |interest        |
----|----------------|
0   |"agriculture"   |
1   |"electronics"   |
2   |"DIY home"      |
.
.
.

2. User who are interested it “DIY home”

SELECT
(data -> person_data.key::text ->> 'name') as Name ,
(data -> person_data.key::text ->> 'email') as Email,
(data -> person_data.key::text ->> 'interest')  as Interest
FROM
 person, json_each_text(person.data) as person_data
where
(data -> person_data.key::text ->> 'interest')::jsonb ? 'DIY home'

Above query will return person’s name, email and interest who are all interested in “DIY home”.

Output will be :

name                     |email                                    |interest                     |
-------------------------|-----------------------------------------|-----------------------------|
Golden Lynch             |brandon.kautzer@boehm.org                |["DIY home","electronics"]   |
Alex Zemlak              |braulio_kirlin@hane.biz                  |["garden design","DIY home"] |

3. Users who are interested it “nature” and their age is less than 25

SELECT
  (data -> person_data.key::text ->> 'name') as Name ,
  (data -> person_data.key::text ->> 'age') as Age,
  (data -> person_data.key::text ->> 'interest')  as Interest
FROM  person, json_each_text(person.data) as person_data
WHERE
(data -> person_data.key::text ->> 'age')::integer < 25 and
(data -> person_data.key::text ->> 'interest')::jsonb ? 'nature'

4. User who has same primary and secondary interest

SELECT
(data -> person_data.key::text ->> 'name') as Name ,
(data -> person_data.key::text ->> 'email') as Email,
(data -> person_data.key::text ->> 'interest')  as Interest
FROM
person, json_each_text(person.data) as person_data
where
(data -> person_data.key::text ->> 'interest')::jsonb -> 0 =
(data -> person_data.key::text ->> 'interest')::jsonb -> 1

I am comparing the Primary interest with Primary and Secondary interest.

Primary interest (data -> person_data.key::text ->> ‘interest’)::jsonb -> 0

Secondary interest (data -> person_data.key::text ->> ‘interest’)::jsonb -> 1

5. User who are interested in “web design” and “ai”

SELECT
  (data -> person_data.key::text ->> 'name') as Name ,
  (data -> person_data.key::text ->> 'email') as Email,
  (data -> person_data.key::text ->> 'interest')  as Interest
FROM
person, json_each_text(person.data) as person_data
where
 (data -> person_data.key::text ->> 'interest')::jsonb ? 'web design' and
(data -> person_data.key::text ->> 'interest')::jsonb ? 'ai'

Optimization for above queries are welcome. 🙂

 

Related posts:

Leave a Reply

Your email address will not be published. Required fields are marked *