How to Query JSON Data in Redshift? – Examples

  • Post author:
  • Post last modified:March 3, 2023
  • Post category:Redshift
  • Reading time:7 mins read

Similar to many cloud data warehouses such as Snowflake, Amazon Redshift supports many json functions to perform operations on json such as validating json data, parsing json values, etc. You can use these json functions to query the JSON data stored in a varchar column. In this article, we will check how to query JSON data in an Amazon Redshift with an example.

How to Query JSON Data in Redshift? - Examples
Query JSON Data in Redshift

Amazon Redshift do not contain any special data type to store JSON string. You can store JSON value in SUPER data type.

Query JSON Data in Redshift

Querying semi-structured data in Amazon Redshift is not as sophisticated as other cloud data warehouses such as Snowflake. The Snowflake supports SQL queries that access semi-structured data using special operators and functions.

Recently, Amazon Redshift provided support to SUPER data type to store and query semi-structured data such as JSON.

Redshift SUPER Data Type

You can use the SUPER data type to store semistructured data or documents as values into your database table. Semistructured data doesn’t conform to the rigid and tabular structure of the relational data model used in SQL databases. The SUPER data type is a set of schemaless array and structure values that encompass all other scalar types of Amazon Redshift.

Querying Semistructured Data in Redshift

Following steps allow you to insert and query the semistructured data using Amazon Redshift.

  1. Create Redshift table with SUPER data type:
create table SS_DATA(col1 super);

2. Insert JSON data into table:

You can use Amazon Redshift json function JSON_PARSE to parse input json and insert into super data type.

insert into SS_DATA select JSON_PARSE('{"info": {"status": "alive", "responseTime": "9.054"}, "timestamp": "2014-01-01 00:20:00", "app": "sales_portal"}')
insert into SS_DATA select JSON_PARSE('{"info": {"status": "alive", "responseTime": "9.953"}, "timestamp": "2014-01-01 00:21:00", "app": "sales_portal"}')

3. Navigate into JSON Data in Amazon Redshift

Amazon Redshift uses PartiQL […] notations to navigate into json and array data.

For examples,

select col1."info"."status", 
col1."app"
from SS_DATA;

|status |app           |
|-------|--------------|
|"alive"|"sales_portal"|
|"alive"|"sales_portal"|

Query Semistructured Data using Amazon Redshift Spectrum

Amazon Redshift Spectrum allow you to access the semi structured data files including Parquet, ORC, RCFile, TextFile, SequenceFile, RegexSerde, OpenCSV, AVRO, Ion and JSON.

Follow the below article to create an external table and access semistructured data using Redshift Spectrum.

Apart from SUPER data type and Spectrum, Amazon Redshift also provides few useful json functions that you can use to extract values from simple json data. One of such function is JSON_EXTRACT_PATH_TEXT function.

Redshift JSON_EXTRACT_PATH_TEXT Function

The function JSON_EXTRACT_PATH_TEXT returns the value for the key:value pair referenced by a series of path elements in a JSON string. The input JSON path can be nested up to five levels. The function return element if present, otherwise empty string is returned.

In the following example, JSON_EXTRACT_PATH_TEXT function returns appid from a json string.

select json_extract_path_text ('{"appid": "app-1000", "appname": "Reporting"}', 'appid') as appid;
|appid   |
|--------|
|app-1000|

Path elements in a json string are case-sensitive. If a path element does not exist in the JSON string, JSON_EXTRACT_PATH_TEXT returns an empty string. If the null_if_invalid argument is set to true and the function returns NULL if the JSON string is invalid

For example,

select json_extract_path_text ('{"appid-": "app-1000", "appname": "Reporting"}','appid', true ) as appid;
|appid|
|------|
| NULL |

Note that, the last argument is set to True, which is null_if_invalid argument. In the above example, appid key is wrong hence function returns NULL.

Related Articles,

Hope this helps 🙂