SQL SET Operator MINUS Alternative in Impala

  • Post author:
  • Post last modified:June 2, 2019
  • Post category:BigData
  • Reading time:5 mins read

The SQL set operators are used to combine data from two or more SELECT statements. The set operators can combine only similar data sets. Here similar data set literally means the number of columns and its data type should match, otherwise you must explicitly type cast the types of the values or columns in the SELECT statements. Just like Apache hive, Impala support only UNION and UNION ALL set operator, INTERSECT and MINUS are not supported as of now. In this article, we will check SQL set operator MINUS alternative in Impala with an example.

SQL Minus Operator Example

Below example demonstrates SQL minus operator:

SELECT * 
 FROM   Table1 
 minus 
 SELECT * 
 FROM   Table2;

Output:

 SK  | PHONE_NO  |  NAME
-----+-----------+--------
 234 | 876543210 | Stuart
 113 |   7877434 | ACCA
(2 rows)

SQL SET Operator MINUS Alternative in Impala

In SQL, MINUS is also called EXCEPT. The MINUS operator finds the difference between two tables or sub-queries and return results from only first SELECT statement.

Cloudera Impala does not support MINUS set operator. If you have any requirement to perform MINUS, then you have to rewrite your queries using an alternate method.

There are two methods that you can use:

LEFT JOIN as a MINUS Alternative in Impala

We have used below mentioned two tables to test Impala SET operator MINUS alternative:

SQL Left outer join returns complete set of records from the left side table along with any matched records from right side table. By default, left join returns unmatched records from right side tables as NULL.

Related article:

Impala EXCEPT/MINUS Alternative

You can write Impala equivalent using left join:

SELECT a.* 
 FROM   table1 a 
        LEFT JOIN table2 b 
               ON ( a.sk = b.sk ) 
 WHERE  b.sk IS NULL; 

Output:

 WHERE  b.sk IS NULL; 
 
 <Output3>
 +-----+-----------+--------+
| sk  | phone_no  | name   |
+-----+-----------+--------+
| 234 | 876543210 | Stuart |
| 113 | 7877434   | ACCA   |
+-----+-----------+--------+

NOT EXISTS as a MINUS Alternative in Impala

Cloudera Impala supports EXISTS and NOT EXISTS clauses. You can use this clause as a MINUS alternative in Impala queries.

Below is the example:

SELECT * 
 FROM   table1 aa 
 WHERE  NOT EXISTS (SELECT 1 
                    FROM   table2 bb 
                    WHERE  ( aa.sk = bb.sk )); 

Output:

+-----+-----------+--------+
| sk  | phone_no  | name   |
+-----+-----------+--------+
| 113 | 7877434   | ACCA   |
| 234 | 876543210 | Stuart |
+-----+-----------+--------+

Related Article,

Hope this helps 🙂