BigQuery Recursive Query Alternative – Example

  • Post author:
  • Post last modified:April 18, 2023
  • Post category:GCP BigQuery
  • Reading time:6 mins read

You can use recursive query to identify the hierarchies of data, such as an organizational structure, employee-manager, bill-of-materials, and document hierarchy. Many relational databases such as Teradata, Oracle, etc. supports recursive queries, but Google BigQuery does not support all features. One of such features is Recursive CTE or VIEWS. Google BigQuery does not support either WITH RECURSIVE Clause or using the RECURSIVE Clause in a CREATE VIEW Statement. In this article, we will check Google BigQuery Recursive Query Alternative with an working example.

BigQuery Recursive Query Alternative - Example

BigQuery Recursive Query Alternative

As mentioned earlier, in the other RDBMS such as Teradata or Snowflake, you can specify a recursive query by preceding a query with the “WITH RECURSIVE” clause or create a CREATE VIEW statement.

For example, consider following Teradata recursive query example.

WITH RECURSIVE temp_table (employee_number) AS
( SELECT root.employee_number
FROM employee root
WHERE root.manager_employee_number = 801
UNION ALL 
SELECT indirect.employee_number
FROM temp_table direct, employee indirect
WHERE direct.employee_number = indirect.manager_employee_number
)
SELECT * FROM temp_table ORDER BY employee_number;

You can notice WITH clause is using RECURSIVE keyword.

BigQuery does not support these types of CTE. If you know the depth of the hierarchy, you can use union all or left outer join to identify the hierarchy.

But, depth of the recursive query hierarchy is unknown, you could do this with a BigQuery WHILE loop.

Related Articles,

BigQuery Recursive Query Alternative using WHILE Loop

Similar to SQL Server, BigQuery support scripting. We will write a BigQuery script to identify the employee-manager hierarchy.

We will be using following employee table to implement the recursive query logic.

+--------+------------+-----------+------------+
| emp_id | manager_id | last_name | first_name |
+--------+------------+-----------+------------+
|   1015 |       1004 | cutting   | tom        |
|   1003 |        801 | depp      | jhony      |
|   1010 |       1003 | bieder    | leo        |
|    801 |       NULL | mill      | jhon       |
|   1019 |        801 | joker     | jack       |
|   1012 |       1004 | kanch     | jhon       |
|   1004 |       1003 | samp      | sam        |
|   1016 |        801 | miller    | trace      |
|   1002 |       1004 | hit       |  ralph     |
|   1008 |       1019 | cutting   | sam        |
|   1014 |       1019 | lakh      | ram        |
|   1011 |       1019 | sunder    | sham       |
|   1006 |       1019 | sharma    | alok       |
|   1001 |       1003 | miller    | rock       |
+--------+------------+-----------+------------+

The recursive query and data, is borrowed from the Teradata Recursive CTE example. Teradata Recursive Queries.

Following is the BigQuery recursive query using BigQuery script and while loop.

-- Loop counter
DECLARE counter int64 default 1;

-- Create intermediate table with initial records where employee directly reports
CREATE TABLE td.emp
AS
WITH cte AS
(
SELECT 1 AS xlevel, *
FROM td.employee e
WHERE e.manager_id = 801
)
SELECT *
FROM cte;

WHILE EXISTS (
SELECT c.*
FROM td.emp p
INNER JOIN td.employee c ON p.emp_id = c.manager_id
WHERE p.xlevel = counter
)
DO
-- Insert next level
INSERT INTO td.emp ( xlevel, emp_id, manager_id, last_name, first_name )
SELECT counter + 1 AS xlevel, c.emp_id, c.manager_id, c.last_name, c.first_name
FROM td.emp p
INNER JOIN td.employee c ON p.emp_id = c.manager_id
WHERE p.xlevel = counter;
	
SET counter = counter + 1;

-- Loop safely
IF counter > 50
THEN
	BREAK;
END IF;
END WHILE;

-- Display employee-manger hierarchy
SELECT  xlevel, emp_id, manager_id, last_name, first_name  FROM td.emp ORDER BY xlevel, manager_id;

The script is almost similar to T-SQL script with minor syntax changes.

And here is the output of the query.

+--------+--------+------------+-----------+------------+
| xlevel | emp_id | manager_id | last_name | first_name |
+--------+--------+------------+-----------+------------+
|      1 |   1016 |        801 | miller    | trace      |
|      1 |   1019 |        801 | joker     | jack       |
|      1 |   1003 |        801 | depp      | jhony      |
|      2 |   1004 |       1003 | samp      | sam        |
|      2 |   1001 |       1003 | miller    | rock       |
|      2 |   1010 |       1003 | bieder    | leo        |
|      2 |   1014 |       1019 | lakh      | ram        |
|      2 |   1006 |       1019 | sharma    | alok       |
|      2 |   1011 |       1019 | sunder    | sham       |
|      2 |   1008 |       1019 | cutting   | sam        |
|      3 |   1015 |       1004 | cutting   | tom        |
|      3 |   1012 |       1004 | kanch     | jhon       |
|      3 |   1002 |       1004 | hit       |  ralph     |
+--------+--------+------------+-----------+------------+

When you write scripts, keep an eye on exit condition. Just remember if something goes wrong, the loop may end up in an infinite loop.

Related Articles

Hope this helps 🙂