UnderStanding QUALIFY Clause in Redshift

If you’re familiar with Teradata, you might have used the QUALIFY clause to filter the results of ordered analytical functions. Amazon Redshift also started supporting QUALIFY clause. In this blog post, we’ll explore what the QUALIFY clause does, why it’s useful, and how you can achieve similar functionality in Redshift.

Page Content

Introduction

What is the QUALIFY Clause?

How QUALIFY Clause is used in Redshift?

Alternatives to the QUALIFY Clause

Conclusion

Introduction

The QUALIFY is one of the important clause as it helps filter results using window functions, similar to how the HAVING clause filters results using aggregate functions with GROUP BY. This makes it easier to create precise and detailed queries, which is very useful for working with complex data. Using QUALIFY can make your data analysis more efficient, making it a valuable tool for anyone who works with SQL.

What is the QUALIFY Clause?

The QUALIFY clause in the Redshift filters results of a previously computed window function according to user‑specified search conditions. You can use the clause to apply filtering conditions to the result of a window function without using a subquery.

Basically, QUALIFY works with window functions in the same way that HAVING operates with aggregate functions and GROUP BY clauses

How QUALIFY Clause is used in Redshift?

The QUALIFY clause Amazon Redshift database is used in SQL to filter the results of a previously computed ordered analytical function. It allows you to apply conditions directly to the results of window functions, making your queries more concise, readable and easy to understand. Syntax is similar to the QUALIFY clause in other databases such as Snowflake, Teradata, etc.

The following example uses analytical functions and a QUALIFY clause.

SELECT StoreID, SUM(profit) OVER (PARTITION BY StoreID)
   FROM facts
   QUALIFY SUM(profit) OVER (PARTITION BY StoreID) > 2;

In this example, the QUALIFY clause filters the results to include only those rows where the total profit for each store exceeds 2.

The QUALIFY is a conditional clause that, similar to HAVING, further filters rows from a WHERE clause. The major difference between QUALIFY and HAVING is that with QUALIFY the filtering is based on the result of performing various ordered analytical functions on the data.

QUALIFY Clause Alternative

Some cloud databases do not support the QUALIFY clause. If you encounter this issue, you can use a subquery along with the WHERE clause as an alternative. For example, consider the following query:

SELECT StoreID, auto_01
FROM(SELECT StoreID, SUM(profit) OVER (PARTITION BY StoreID) as auto_01,
 SUM(profit) OVER (PARTITION BY StoreID) as rn
   FROM facts) as T
   WHERE rn > 2;

As you can see, windows function in a QUALIFY clause is treated as a column inside the subquery and you can use that column in the WHERE clause of the upper query to filter records.

In this query, the window function SUM(profit) OVER (PARTITION BY StoreID) is placed inside a subquery. The WHERE clause is then used to filter the results based on the rn calculated in the subquery. This approach allows you to achieve the same result as you would with the QUALIFY clause, ensuring you can still perform complex data filtering even if the QUALIFY clause is not available in your database.

Conclusion

In Amazon Redshift, the QUALIFY clause is a recently added feature for filtering results based on window functions, making it easier to handle filter on ordered withow functions. However, not all cloud databases support the QUALIFY clause. In such cases, you can use a subquery combined with the WHERE clause as an alternative. For instance, by placing the window function inside a subquery and applying the WHERE clause to filter the results, you can achieve similar outcomes. This approach ensures that you can still perform advanced data filtering and analysis, even in databases where the QUALIFY clause is not available.