How to filter JSON data using jq?
With the increasing use of APIs and event-driven architectures, most of which use JSON as their payload format, it is essential for a DevOps or software engineer to know how to filter JSON data.
What is JSON?
JSON (JavaScript Object Notation) is a text-based standard used to store and transmit data objects between applications, clients and servers on the web. A JSON data object consists of key-value pairs and arrays and is written in a human-readable format. JSON was first conceptualized and introduced by Douglas Crockford around the year 2000. Since then it has gradually become the standard for communication between web applications and in event-driven architectures, and has largely replaced XML-like markup languages because it is simpler to use. In the cloud and DevOps space, most applications now use the JSON or YAML format.
Sample JSON data:
    {
      "id": "0001",
      "name": "debjeet",
      "dob": "10 Oct 1986",
      "weight_kg": 80,
      "skill": {
        "cloud": [
          { "provider": "aws", "proficiency": "intermediate" }
        ]
      }
    }
What is jq?
jq is a program developed to filter JSON data. You can think of jq as a program like sed, awk or grep, but designed specifically for JSON. jq supports filters, conditions, operators and other language features. While creating and understanding JSON data is simple thanks to its human-readable format, parsing it manually is still a complicated task. Instead, you can use a program like jq to parse JSON data in minutes.
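As a quick taste of that grep-like role, here is a minimal sketch that pulls two values out of the sample JSON shown earlier. The variable names (sample, provider, weight) are only for illustration, and the -r option asks jq for raw output, so strings are printed without quotes.

```shell
# Sample document from the "What is JSON?" section, kept in a shell variable.
sample='{"id":"0001","name":"debjeet","dob":"10 Oct 1986","weight_kg":80,"skill":{"cloud":[{"provider":"aws","proficiency":"intermediate"}]}}'

# Descend into the nested object and take the first element of the array.
provider=$(printf '%s' "$sample" | jq -r '.skill.cloud[0].provider')

# Numbers are printed as-is, with or without -r.
weight=$(printf '%s' "$sample" | jq '.weight_kg')

echo "$provider"   # aws
echo "$weight"     # 80
```

The path syntax used here (dots and array indexes) is covered step by step in the parsing sections further down.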
How to install jq?
jq is available for almost all modern operating systems and can be installed fairly easily with a couple of commands. In my view, tools like jq and git should be included as part of the OS distribution.
Installing JQ in Windows OS:
    ## ----------
    ## Windows OS
    ## ----------

    ## Check if Chocolatey is already installed
    choco --version #should return an output

    ## Install Chocolatey if not already installed
    ## Open a command prompt as admin and execute below command
    @"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "[System.Net.ServicePointManager]::SecurityProtocol = 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"

    ## Install jq
    choco install jq
Installing JQ in macOS:
    ## ------
    ## macOS
    ## ------

    ## Check if Homebrew already installed
    brew --version #should return an output

    ## Install Homebrew if not already installed
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

    ## Install jq
    brew install jq
Installing JQ in Ubuntu or Debian OS:
    ## ------------------
    ## Ubuntu | Debian OS
    ## ------------------

    ## Install jq
    sudo apt-get install jq
Installing JQ in Red Hat (RHEL) or CentOS:
    ## ----------------
    ## RHEL | CentOS OS
    ## ----------------

    ## Install jq
    sudo yum -y install jq
Installing JQ in Fedora OS:
    ## ---------
    ## Fedora OS
    ## ---------

    ## Install jq
    sudo dnf -y install jq
Installing JQ in AWS EC2 instance with Amazon Linux OS:
    ## ---------------
    ## Amazon Linux OS
    ## ---------------

    ## Install jq
    sudo yum -y install jq
Once you have installed jq on your OS, validate the installation by running the jq --version command.
    ## ---------------------
    ## Validate Installation
    ## ---------------------

    ## Should return an output
    jq --version
How to parse JSON data using JQ?
Next, we will start with JSON data parsing and explore different concepts of JQ using some examples.
Basic filtering of JSON data using jq:
Identity: .
A . (dot) in jq represents the entire input without any filter. The jq command followed by a dot outputs the input as it is, with some formatting to make it pretty. You can use the jq . command to format the output of a curl command.
    ## -----------------------------
    ## Basic Filtering Data Using jq
    ## -----------------------------

    ## Identity: .
    echo '{ "name": "deb", "age": 36 }' | jq .
    curl -s https://reqres.in/api/users | jq .
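By default jq pretty-prints its output across several lines. If you would rather have the result on a single line, for example to store it in a variable or write it to a log, the -c (--compact-output) option can be combined with the identity filter. A minimal sketch; the variable name compact is only for illustration.

```shell
# -c prints each result on one line instead of pretty-printing it.
compact=$(echo '{ "name": "deb", "age": 36 }' | jq -c .)

echo "$compact"   # {"name":"deb","age":36}
```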
Object Identifier-Index: .element, .element.sub-element
You can get the value of a JSON key using the . (dot) notation. As with a file path, if the key is nested under another key, you have to give its full location using dot notation. Let me explain with an example.
    ## Object Identifier-Index: .element, .element.sub-element
    cat << EOF > data.json
    {
      "name": "debjeet",
      "skills": {
        "cloud": "aws",
        "devops": "git"
      }
    }
    EOF
    cat data.json | jq .
    cat data.json | jq .name ## returns "debjeet"
    cat data.json | jq .skills.cloud ## returns "aws"
In the above example, “name” is a top-level element, so we can get its value using .name. “cloud”, however, is not a top-level element; it sits under the top-level element “skills”. Hence we cannot get its value using .cloud and instead have to descend the path with .skills.cloud to get the value “aws”.
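A few related behaviours are worth knowing when working with dot notation. The sketch below reuses the same data with one extra key, "first-name", which is a hypothetical addition made only to show how a key containing a hyphen can be addressed. It also shows that asking for a key that does not exist returns null rather than an error.

```shell
# Same shape as the example data, plus a hypothetical "first-name" key.
json='{"name":"debjeet","skills":{"cloud":"aws","devops":"git"},"first-name":"deb"}'

# A key that is not present yields null instead of failing.
missing=$(printf '%s' "$json" | jq '.skills.azure')

# -r (raw output) prints the string without the surrounding quotes.
raw=$(printf '%s' "$json" | jq -r '.skills.cloud')

# Keys with special characters such as a hyphen need the .["key"] form.
hyphen=$(printf '%s' "$json" | jq -r '.["first-name"]')

echo "$missing"   # null
echo "$raw"       # aws
echo "$hyphen"    # deb
```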
Array Index: .[<array_index>]
If you have an array in your JSON data, you can retrieve an element of the array using its index. Array indexes start at 0, and .[] without an index returns every element.
    ## Array Index: .[<array_index>]
    cat << EOF > data.json
    [
      {
        "name": "debjeet",
        "age": 36
      },
      {
        "name": "chandrima",
        "age": 33,
        "skills": ["sap", "hana", "bw"]
      }
    ]
    EOF
    cat data.json | jq .
    ## Quote filters that contain [ ] so the shell does not treat them as glob patterns
    cat data.json | jq '.[0].name' ## returns "debjeet"
    cat data.json | jq '.[].name' ## returns "debjeet" "chandrima"
    cat data.json | jq '.[1].skills[0]' ## returns "sap"
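Besides a single positive index, jq also accepts negative indexes (counted from the end of the array) and slices of the form .[from:to]. Here is a small sketch on the same data; the variable names are only for illustration.

```shell
people='[{"name":"debjeet","age":36},{"name":"chandrima","age":33,"skills":["sap","hana","bw"]}]'

# -1 addresses the last element of the array.
last=$(printf '%s' "$people" | jq -r '.[-1].name')

# A slice returns a sub-array: start index inclusive, end index exclusive.
first_two=$(printf '%s' "$people" | jq -c '.[1].skills[0:2]')

# length gives the number of elements in the array.
count=$(printf '%s' "$people" | jq 'length')

echo "$last"        # chandrima
echo "$first_two"   # ["sap","hana"]
echo "$count"       # 2
```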
Comma: , and Pipe: |
You can select multiple elements from your JSON data using the comma and pipe operators in jq. A comma emits the output of the filter on its left followed by the output of the filter on its right, while a pipe feeds the output of the left filter into the right filter as its input.
    ## Comma: , and Pipe: |
    cat << EOF > data.json
    [
      {
        "name": "debjeet",
        "age": 36
      },
      {
        "name": "chandrima",
        "age": 33
      }
    ]
    EOF
    cat data.json | jq .
    cat data.json | jq '.[0] | .name, .age' ## returns "debjeet" 36
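To see how the two operators interact, it helps to run the same filter over every element instead of just the first one. In the sketch below the pipe hands each array element to the right-hand side in turn, and the comma then emits two values per element. The variable names are only for illustration.

```shell
people='[{"name":"debjeet","age":36},{"name":"chandrima","age":33}]'

# For each element (.[]), emit .name and then .age.
pairs=$(printf '%s' "$people" | jq -r '.[] | .name, .age')

echo "$pairs"
# debjeet
# 36
# chandrima
# 33
```

Because the comma binds more tightly than the pipe, '.[] | .name, .age' is read as '.[] | (.name, .age)'.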
Array construction: []
You can collect the filtered output of jq into an array by wrapping the filter in [].
    ## Array construction: []
    cat << EOF > data.json
    [
      {
        "name": "debjeet",
        "age": 36,
        "skills": ["aws","azure"]
      },
      {
        "name": "chandrima",
        "age": 33
      }
    ]
    EOF
    cat data.json | jq .
    cat data.json | jq '[.[].name]' ## returns ["debjeet","chandrima"]
    cat data.json | jq '[.[0] | .name, .skills[0]]' ## returns ["debjeet","aws"]
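When the input is already an array and you want an array back, the built-in map(f) function is a shorter way to write [.[] | f]. The sketch below runs both forms on the same data to show that they give the same result; the variable names are only for illustration.

```shell
people='[{"name":"debjeet","age":36,"skills":["aws","azure"]},{"name":"chandrima","age":33}]'

# Explicit array construction around an iteration.
with_brackets=$(printf '%s' "$people" | jq -c '[.[] | .name]')

# map(f) applies f to every element and collects the results in an array.
with_map=$(printf '%s' "$people" | jq -c 'map(.name)')

echo "$with_brackets"   # ["debjeet","chandrima"]
echo "$with_map"        # ["debjeet","chandrima"]
```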
Object construction: {}
You can build a JSON object (a dictionary or hash) from the filtered output using {}.
    ## Object Construction: {}
    cat << EOF > data.json
    [
      {
        "name": "debjeet",
        "age": 36
      },
      {
        "name": "chandrima",
        "age": 33
      }
    ]
    EOF
    cat data.json | jq .
    cat data.json | jq '.[0] | {Name: .name, Age: .age}' ## returns {"Name": "debjeet","Age": 36}
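If you want to keep the original key names, jq offers a shorthand: {name, age} means the same as {name: .name, age: .age}. The sketch below applies it to every element of the array and also combines object construction with map; the hypothetical key Name and the variable names are only for illustration.

```shell
people='[{"name":"debjeet","age":36,"city":"kolkata"},{"name":"chandrima","age":33}]'

# Shorthand form: keep only the name and age keys of each element.
trimmed=$(printf '%s' "$people" | jq -c '.[] | {name, age}')

# Rename a key while building one object per element, collected in an array.
renamed=$(printf '%s' "$people" | jq -c 'map({Name: .name})')

echo "$trimmed"
# {"name":"debjeet","age":36}
# {"name":"chandrima","age":33}
echo "$renamed"   # [{"Name":"debjeet"},{"Name":"chandrima"}]
```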
Recursive Descent: ..
You can recursively descend to the required element using the two-dot operator (..) in jq.
    ## Recursive Descent: ..
    cat << EOF > data.json
    [
      [
        {
          "name": "debjeet"
        }
      ]
    ]
    EOF
    cat data.json | jq .
    cat data.json | jq '..|.name?' ## returns "debjeet"
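One thing to watch with '..|.name?' is that any object without a "name" key contributes a null to the output. A possible way around this, sketched below on a made-up document (the team, lead and members keys are hypothetical), is to keep only objects that actually have the key before reading it.

```shell
# Hypothetical nested document with "name" keys at different depths.
org='{"team":{"lead":{"name":"debjeet"},"members":[{"name":"chandrima"}]}}'

# Visit every value, keep objects that have a "name" key, collect the names.
names=$(printf '%s' "$org" | jq -c '[.. | objects | select(has("name")) | .name]')

echo "$names"   # ["debjeet","chandrima"]
```

The objects filter drops arrays, strings and other non-object values, and select(has("name")) drops the outer objects that only act as containers.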
Built-in operators:
jq supports the arithmetic operators for addition, subtraction, multiplication, division and modulo. Some of them also work on strings and arrays, as the examples show.
    ## ---------
    ## Operators
    ## ---------

    ## Addition: +
    echo '{ "x":20 }' | jq '.x + 10' ## returns 30
    echo '{ "x":"deb","y":"jeet" }' | jq '.x + .y' ## returns "debjeet"
    echo '{ "x":[1,2,3],"y":[4,5,6] }' | jq '.x + .y' ## returns [1,2,3,4,5,6]

    ## Subtraction: -
    echo '{ "x":20 }' | jq '.x - 10' ## returns 10
    echo '{ "x":[1,2,3,4],"y":[3,4] }' | jq '.x - .y' ## returns [1,2]

    ## Multiplication, division, modulo: *, /, and %
    echo '{ "x":20 }' | jq '.x * 10' ## returns 200
    echo '{ "x":20 }' | jq '.x / 10' ## returns 2
    echo '{ "x":20 }' | jq '.x % 7' ## returns 6
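The same operators behave differently depending on the type of their operands. Two cases that come up often are sketched below: adding two objects merges them, and dividing a string by another string splits it on that separator. The input values are made up for illustration.

```shell
# Adding objects merges their keys into one object.
merged=$(echo '{ "x":{"a":1},"y":{"b":2} }' | jq -c '.x + .y')

# Dividing a string by a separator string splits it into an array.
parts=$(echo '{ "x":"aws,azure,gcp" }' | jq -c '.x / ","')

echo "$merged"   # {"a":1,"b":2}
echo "$parts"    # ["aws","azure","gcp"]
```

When both objects contain the same key, the value from the right-hand object wins.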
Inbuilt functions:
jq has a comprehensive set of built-in functions to add, remove, sort, update and filter JSON data. Let's explore some of them with examples.
Check if a key exists in your json data
    ## ----------------------------------------
    ## Advance filters using built-in functions
    ## ----------------------------------------

    ## Check if a key exist in json output | has(key)
    cat << EOF > data.json
    {
      "name": "debjeet",
      "skills": {
        "cloud": "aws",
        "devops": "git"
      }
    }
    EOF
    cat data.json | jq 'has("salary")' ## returns false
    cat data.json | jq 'has("skills")' ## returns true
    cat data.json | jq '.skills | has("devops")' ## returns true
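has() also works on arrays, where it checks whether an index exists, and it pairs well with select() to keep only the elements that carry a given key. A minimal sketch with made-up input; the variable names are only for illustration.

```shell
# On an array, has(n) tells you whether index n exists.
has_second=$(echo '["aws","git"]' | jq 'has(1)')
has_third=$(echo '["aws","git"]' | jq 'has(2)')

# select(has("skills")) keeps only the objects that contain a "skills" key.
people='[{"name":"debjeet","skills":{"cloud":"aws"}},{"name":"chandrima"}]'
skilled=$(printf '%s' "$people" | jq -r '.[] | select(has("skills")) | .name')

echo "$has_second"   # true
echo "$has_third"    # false
echo "$skilled"      # debjeet
```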
Get all the elements of json data by element types
    ## Get different object types in json output
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|arrays' ## returns [1,2]
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|objects' ## returns {"x":"y"}
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|iterables' ## returns [1,2] {"x":"y"}
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|booleans' ## returns true false
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|numbers' ## returns 1
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|normals' ## returns 1
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|finites' ## returns 1
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|strings' ## returns "deb"
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|nulls' ## returns null
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|values' ## returns [1,2] {"x":"y"} 1 "deb" true false
    echo '[[1,2],{"x":"y"},1,"deb",null,true,false]' | jq '.[]|scalars' ## returns 1 "deb" null true false
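If you are not sure which of these filters you need, the built-in type function reports the type of each value as a string. The sketch below runs it over the same input, and also shows that select(type == "number") gives the same result as the numbers filter.

```shell
input='[[1,2],{"x":"y"},1,"deb",null,true,false]'

# type returns the type name of its input; map applies it to every element.
types=$(printf '%s' "$input" | jq -c 'map(type)')

# Selecting on the type name is equivalent to the dedicated filter.
only_numbers=$(printf '%s' "$input" | jq -c '[.[] | select(type == "number")]')

echo "$types"          # ["array","object","number","string","null","boolean","boolean"]
echo "$only_numbers"   # [1]
```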
Sort json data by key or value
    ## Sort by key in json output | sort, sort_by(path_expression)
    cat << EOF > data.json
    [
      {
        "id": 300,
        "did": [2,1,3,7,4,6,5]
      },
      {
        "name": "debjeet",
        "id": 100
      },
      {
        "name": "chandrima",
        "id": 200
      }
    ]
    EOF
    cat data.json | jq '.[0].did | sort' ## returns [1,2,3,4,5,6,7]
    cat data.json | jq 'sort_by(.id)' ## returns sorted output on id
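sort and sort_by always sort in ascending order. For descending order, one option is to pipe the sorted array through reverse. The sketch below uses the same ids as the example and prints only the id values so the order is easy to see; the variable names are only for illustration.

```shell
items='[{"id":300},{"name":"debjeet","id":100},{"name":"chandrima","id":200}]'

# Ascending order by id, then keep only the ids.
ascending=$(printf '%s' "$items" | jq -c 'sort_by(.id) | map(.id)')

# reverse flips the sorted array to get descending order.
descending=$(printf '%s' "$items" | jq -c 'sort_by(.id) | reverse | map(.id)')

echo "$ascending"    # [100,200,300]
echo "$descending"   # [300,200,100]
```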
Get JSON element with maximum and minimum values
    ## Get minimum and maximum value in json output | min, min_by, max, max_by
    cat << EOF > data.json
    [
      {
        "id": 300,
        "name": "cloudaffaire",
        "did": [2,1,3,7,4,6,5]
      },
      {
        "name": "debjeet",
        "id": 100
      },
      {
        "name": "chandrima",
        "id": 200
      }
    ]
    EOF
    cat data.json | jq '.[0].did | max' ## returns 7
    cat data.json | jq '.[0].did | min' ## returns 1
    cat data.json | jq 'max_by(.id)' ## returns output with id having max value
    cat data.json | jq 'min_by(.id)' ## returns output with id having min value
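max_by and min_by return the whole matching element, not just the value that was compared. If you only need one field of that element, you can pipe the result into a further filter, as in this small sketch on the same data (the variable names are only for illustration).

```shell
items='[{"id":300,"name":"cloudaffaire"},{"name":"debjeet","id":100},{"name":"chandrima","id":200}]'

# Name of the element with the highest id.
highest=$(printf '%s' "$items" | jq -r 'max_by(.id) | .name')

# Name of the element with the lowest id.
lowest=$(printf '%s' "$items" | jq -r 'min_by(.id) | .name')

echo "$highest"   # cloudaffaire
echo "$lowest"    # debjeet
```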
Remove duplicate data from JSON output
    ## Get unique values from json output | unique, unique_by(path_exp)
    cat << EOF > data.json
    [
      {
        "id": 300,
        "name": "debjeet",
        "did": [2,1,2,5,4,2,5]
      },
      {
        "name": "debjeet",
        "id": 100
      },
      {
        "name": "debjeet",
        "id": 100
      }
    ]
    EOF
    cat data.json | jq '.[0].did | unique' ## returns [1,2,4,5]
    cat data.json | jq 'unique_by(.id)' ## returns output with unique id value
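To make the effect of unique_by easier to see, the sketch below runs it on the same three records and prints only the surviving ids and their count. Note that the result comes back sorted by the key you deduplicate on. The variable names are only for illustration.

```shell
items='[{"id":300,"name":"debjeet"},{"name":"debjeet","id":100},{"name":"debjeet","id":100}]'

# One element is kept per distinct id; the result is sorted by id.
ids=$(printf '%s' "$items" | jq -c 'unique_by(.id) | map(.id)')
count=$(printf '%s' "$items" | jq 'unique_by(.id) | length')

# unique on a plain list of values removes duplicates in the same way.
names=$(printf '%s' "$items" | jq -c 'map(.name) | unique')

echo "$ids"     # [100,300]
echo "$count"   # 2
echo "$names"   # ["debjeet"]
```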
Check if a key has certain value in json
    ## Check if a key has certain value in json output | contains(element)
    cat << EOF > data.json
    [
      {
        "name": "debjeet",
        "id": 100
      }
    ]
    EOF
    cat data.json | jq '.[].name | contains("debjeet")' ## returns true
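Keep in mind that for strings, contains() is a substring test, so it also returns true for a partial match. When you need an exact match, comparing with == is the safer choice, and wrapping the comparison in select() returns the matching elements themselves. A minimal sketch with a second, made-up record added for contrast:

```shell
people='[{"name":"debjeet","id":100},{"name":"chandrima","id":200}]'

# Substring match: "deb" is part of "debjeet", so this is true.
partial=$(printf '%s' "$people" | jq '.[0].name | contains("deb")')

# Exact comparison: "debjeet" is not equal to "deb", so this is false.
exact=$(printf '%s' "$people" | jq '.[0].name == "deb"')

# select() keeps only the elements for which the condition is true.
match=$(printf '%s' "$people" | jq -c '.[] | select(.name == "debjeet")')

echo "$partial"   # true
echo "$exact"     # false
echo "$match"     # {"name":"debjeet","id":100}
```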
I hope you have enjoyed this article. For more details on jq, please refer to the official documentation below.
https://stedolan.github.io/jq/manual/