Querying JSON with PostgreSQL

PostgreSQL provide powerful operator and methods to process JSON data. I have taken mock data to explore different operators and methods. Before proceed make sure you have installed PostgreSQL 9.4+ and psql.

I have created mock date with help of Faker gem. Mock data available at https://gist.githubusercontent.com/sivakumarbdu/5c6838015f1ad9efc1959b015d939587/raw/71c0810ffd7dc9e41bc01be8c8ee76f99b6d3dc2/person-data.sql

JSON data will be look like this

In this example, “0” is the Id of a person and it has related information as values. This JSON has information about persons interest. It contains two interest for a person. Lets call it as primary and secondary interest. They might be same (Because generated by Random)

Date generated using ruby script with help of Faker Gem. Ruby script for generating mock data is available here https://gist.github.com/sivakumarbdu/591f6faa9fe55f2d1c4744703b77c193

We have mock data of 10000 users and their interest and age. We are going to frame the SQL query for following.

  • Display all users’s details
  • User who are interested it “DIY home”
  • Users who are interested it “nature” and their age is less than 25
  • User who has same primary and secondary interest
  • User who are interested in “web design” and “ai”

Lets frame query for each. Before proceed lets create a database and table in PostgreSQL and insert our mock data.

Login to postgresql console and create database, table and mock data from file.

Download the mock data and save it on your desk. https://gist.githubusercontent.com/sivakumarbdu/5c6838015f1ad9efc1959b015d939587/raw/71c0810ffd7dc9e41bc01be8c8ee76f99b6d3dc2/person-data.sql

Lets create “person-data.json” at /home/$user/Documents/

 

Once done with above query exit from PostgreSQL and run the following command to insert data into person table. Below examples uses psql utility.

Now we are ready to query the data. Now we have only on row which contains 10000 person data in a JSON. Lets start with our queries.

  1. Display all users’s details

Output will be like this. It contains two column key and value.

Now Lets take each persons interest. In the below query we are taking key(person’s Id) and navigating to next level to get person’s interest.

Output will be

Now lets take only primary interest of a person.

Output will be

Note the selection option “(data -> person_data.key::text ->> ‘interest’)::jsonb -> 0”.

Similarly to display person’s secondary interest use the index 1

output will be

2. User who are interested it “DIY home”

Above query will return person’s name, email and interest who are all interested in “DIY home”.

Output will be :

3. Users who are interested it “nature” and their age is less than 25

4. User who has same primary and secondary interest

I am comparing the Primary interest with Primary and Secondary interest.

Primary interest (data -> person_data.key::text ->> ‘interest’)::jsonb -> 0

Secondary interest (data -> person_data.key::text ->> ‘interest’)::jsonb -> 1

5. User who are interested in “web design” and “ai”

Optimization for above queries are welcome. 🙂

 

Leave a comment