You can use recursive query to query hierarchies of data, such as an organizational structure, bill-of-materials, and document hierarchy. Redshift does not support all features that are supported in PostgreSQL. One of such features is Recursive CTE or VIEWS. Redshift does not support either WITH RECURSIVE Clause or using the RECURSIVE Clause in a CREATE VIEW Statement. In this article, we will check Redshift Recursive Query Alternative with an working example.
Redshift Recursive Query
Amazon Redshift, a fully-managed cloud data warehouse, now adds support for Recursive Common Table Expression (CTE) to analyze hierarchical data, such as organizational charts where employees reports to other employees (managers), or multi-level product orders where a product consists of many components, which in turn consist of other components.
Following is the syntax of recursive CTE in Redshift.
[ WITH [RECURSIVE] common_table_expression [, common_table_expression , ...] ] SELECT statement;
For example, you can use the following recursive query for hierarchical data.
with recursive org_cte(id, name, manager_id, level) as
( select id, name, manager_id, 1 as level
from employee
where name = 'John'
union all
select e.id, e.name, e.manager_id, level + 1
from employee e, john_org j
where e.manager_id = j.id and level < 4
)
select id, name, manager_id from org_cte order by manager_id;
Redshift Recursive CTE Alternative
In the other RDBMS such as Teradata or Snowflake you can specify a recursive query by preceding a query with the WITH RECURSIVE clause or create a CREATE VIEW statement.
Amazon Redshift does not support recursive CTEs, you have to use Redshift union all set operators or inner join approach if you know the depth of the recursive query hierarchy.
But, if you don’t know the depth of the recursive query hierarchy, you have to use Redshift stored procedure to identify the depth dynamically.
Redshift Recursive Query Alternative using Stored Procedure
To implement the Redshift recursive CTE, we have already created emp table. We need to identify the employee and manager hierarchy. Here is the content of table.
EMPLOYEE_NUMBER | MANAGER_EMPLOYEE_NUMBER |
801 | NULL |
1016 | 801 |
1003 | 801 |
1019 | 801 |
1010 | 1003 |
1004 | 1003 |
1001 | 1003 |
1012 | 1004 |
1002 | 1004 |
1015 | 1004 |
1008 | 1019 |
1006 | 1019 |
1014 | 1019 |
1011 | 1019 |
Following is the example of Redshift recursive CTE using stored procedure.
create or replace procedure Rec_test2( result_set INOUT REFCURSOR)
AS $$
DECLARE
df_cnt int := 1;
cnt int :=1;
a int;
tblnm varchar(50) := '';
tblnm1 varchar(50) := '';
qry varchar(2500) := '';
BEGIN
-- Create initial temp table
create temp table temp_tab0 as (SELECT employee_number from employee_rec_test where manager_employee_number=801);
while (df_cnt != 0 )
loop
tblnm := 'temp_tab'||(cnt-1);
tblnm1 := 'temp_tab'||cnt;
execute 'select * from( select indirect.employee_number
FROM '||quote_ident(tblnm)||' direct, employee_rec_test indirect
WHERE direct.employee_number = indirect.manager_employee_number) a';
GET DIAGNOSTICS df_cnt := ROW_COUNT;
if (df_cnt!=0)
then
execute 'create temp table '||quote_ident(tblnm1)||' as select indirect.employee_number
FROM '||quote_ident(tblnm)||' direct, employee_rec_test indirect
WHERE direct.employee_number = indirect.manager_employee_number';
end if;
cnt := cnt + 1;
end loop;
for a in 0..(cnt-2)
loop
if (a=0)
then
qry := qry || 'select employee_number from temp_tab' || a;
else
qry := qry || ' union select employee_number from temp_tab' || a;
end if;
end loop;
-- Return Result set
--OPEN result_set for
execute 'create table result_rec as '|| qry;
END;
$$ LANGUAGE plpgsql;
Following is the result.
call Rec_test2('newcursor');
select * from result_rec order by 1;
employee_number
1001
1002
1003
1004
1006
1008
1010
1011
1012
1014
1015
1016
1019
Example, is borrowed from Teradata Recursive CTE example. Teradata Recursive Queries.
Related Articles,
Hope this helps 🙂
Hi Vithal,
Interesting post, but unfortunately, the code doesn’t seem to work.
cnt2 is declared as 0 and then you select into 0 a little later. I’m not sure how that works exactly.
Also, your exit condition is true for every loop, so why not just exit without the if condition?
Finally, without the call, some may find actually implementing this quite difficult.
Thanks for the approach though – really interesting!
Thanks for the quick edit! I think you’re missing a ; after FROM ‘||quote_ident(tblnm)||’ direct, employee_rec_test indirect WHERE direct.employee_number = indirect.manager_employee_number’
You’re also missing the level indicator seen in your other CTE-based approach – this just loops once and outputs the same output again
Hi,
Sorry for multiple edits. Corrected and attached working code for your reference. I have taken example from Teradata recursive query.
I have kept REFCURSOR so that you can modify and return the result sets.
Regards,
Vithal