Redshift CASE Statement, Usage and Examples

  • Post author:
  • Post last modified:April 19, 2023
  • Post category:Redshift
  • Reading time:12 mins read

In general, the CASE expression or command is a conditional expression, similar to if-then-else statements found in other languages. CASE is used to specify a result when there are multiple conditions. In this article, we will check how to use Redshift CASE Statement, its syntax and usage with some examples.

Redshift CASE Statement, Usage and Examples

Post Content

Introduction to Redshift CASE Statement

The Redshift CASE statement selects value from a sequence of conditions based on your requirement, and executes a corresponding statement that matches the evaluated condition. The CASE statement also evaluates a single expression and compares it against several potential available values, or evaluates multiple Boolean expressions and chooses the first one that is TRUE.

Syntax of CASE Statements in Redshift

There are two types of Redshift CASE statements:

Let us verify both CASE statement with an example.

Simple CASE Statement

In simple CASE expressions, an expression is compared with a value provided in WHEN clause. When a match is found, the specified action in the THEN clause is applied. If no match is found, the action in the ELSE clause is applied. The ELSE part will be the last statement.

Simple CASE Statement Syntax

Below is the example of simple case statement:

CASE expression
WHEN value THEN result
[WHEN...]
[ELSE result]
END;

For example, consider simple case example:

SELECT 
CASE 3
WHEN 1 THEN 'ONE'
WHEN 2 THEN 'TWO'
WHEN 3 THEN 'THREE'
WHEN 4 THEN 'FOUR'
ELSE 'NO MATCH FOUND'
END;

Here is the output:

 case
-------
 THREE
(1 row)

Searched CASE Statement

In searched CASE expressions, Boolean expression is used, each CASE is evaluated based on a Boolean expression, and the CASE statement returns the first matching CASE. If no matching CASE’s are found among the WHEN clauses, the action in the ELSE clause is returned.

Searched CASE Statement Syntax

Below is the syntax of searched case statement available in Redshift:

CASE
WHEN boolean condition THEN result
[WHEN ...]
[ELSE result]
END

For example, consider simple searched case statement example:

SELECT EXP,
CASE WHEN EXP = 1 THEN 'ONE'
WHEN EXP = 2 THEN 'TWO'
WHEN EXP = 3 THEN 'THREE'
WHEN EXP = 4 THEN 'FOUR'
ELSE 'NO MATCH FOUND'
END
FROM (
	SELECT 10 AS EXP 
	UNION 
	SELECT 2 AS EXP 
	UNION SELECT 3 AS EXP
	) AS TMP;

Here is the output:

 exp |      case
-----+----------------
   2 | TWO
   3 | THREE
  10 | NO MATCH FOUND
(3 rows)

Usage of CASE Statements in Redshift

CASE statements are used for data transformation and conditional logic in SQL, including in Amazon Redshift. In Redshift, the CASE statement is used to conditionally execute an expression and return different values depending on the result of the expression.

Here are some common use cases for CASE statements in Redshift:

CASE Statements with Aggregation

CASE statements with aggregation in Redshift are used to transform data in a way that categorizes it into groups, and then aggregates each group using a specified function. There are several aggregation functions that can be used with CASE statements in Redshift, including COUNT, SUM, AVG, MAX, and MIN. The function used depends on the type of data being aggregated and the insights that are being sought.

Here is the example of using a CASE statement to count records with conditions:

SELECT
  COUNT(*) AS total_sales,
  SUM(CASE WHEN EXTRACT(MONTH FROM sale_date) <= 3 THEN 1 ELSE 0 END) AS sales_q1,
  SUM(CASE WHEN EXTRACT(MONTH FROM sale_date) BETWEEN 4 AND 6 THEN 1 ELSE 0 END) AS sales_q2
FROM
  sales;

Redshift CASE Statement with Windows Functions

You can use Redshift CASE statement with windows or analytics functions. For examples, consider CASE statement with ROW_NUMBER analytics functions.

SELECT
  CASE
    WHEN ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col1) = 1
      THEN 'ONE'
      ELSE 'OTHER'
  END as al
FROM
  test1;

Redshift CASE Statement in HAVING Clause

You can also use the CASE statement in the HAVING clause to filter the results of a query based on complex conditions:

For examples,

SELECT column_1,
       SUM(column_2)
FROM table_name
GROUP BY column_1
HAVING SUM(column_2) >=
  (CASE
    WHEN column_1 = 'value_1' THEN 100
    WHEN column_1 = 'value_2' THEN 200

    ELSE 0
  END);

CASE Statement for Data Cleansing

Redshift CASE statements can be used to replace or transform specific values in a column. For example, you can use a CASE statement to replace NULL values with a default value:

SELECT
  customer_id,
  CASE WHEN first_name IS NULL THEN 'Unknown' ELSE first_name END AS first_name,
  CASE WHEN last_name IS NULL THEN 'Unknown' ELSE last_name END AS last_name
FROM
  customers;

Best Practices for Using CASE Statements in Redshift

Here are some best practices to keep in mind when using CASE statements in Redshift:

  • Use aliases for CASE statements: Always use aliases for the columns generated by CASE statements. This will make the query more readable and help you to easily refer to the generated columns in other parts of the query.
  • Keep the syntax simple: Avoid using overly complex syntax for CASE statements. Instead, use simple syntax with easy-to-understand conditions and expressions.
  • Use short statements: If you have multiple conditions in a single CASE statement, break them up into separate statements. This will make it easier to read and modify the code.
  • Optimize your queries: CASE statements can sometimes impact query performance, especially if you are using them with large tables.

Conclusion

In conclusion, the CASE statement is used for data transformation and conditional logic in Redshift queries. It allows you to execute expressions conditionally and return different values based on the results. By using CASE statements, you can perform conditional aggregation, data cleansing, and implement conditional logic in your queries. To get the most out of CASE statements in Redshift, it’s important to follow best practices such as using aliases, keeping syntax simple, optimizing queries, and testing thoroughly.

Related Articles

Hope this helps 🙂