Clouderal Impala SQL Join Types and Examples

  • Post author:
  • Post last modified:February 26, 2018
  • Post category:BigData
  • Reading time:4 mins read

Impala SQL Join is a clause that is used for combining specific fields from two or more tables based on the common columns. The joins in the Impala are similar to the SQL and Hive joins. Joins are used to combine rows from multiple tables. In this article, we will learn about different Impala SQL join types with examples.

Different Impala Join Types

Following are Different Hive Join Types

  • INNER JOIN
  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN
  • SEMI JOIN
  • ANTI JOIN
  • CROSS JOIN

Below are the tables that we will be using to demonstrate different Join types in Cloudera Impala:

Query: select * from students 
+----+------+-----------+ 
| id | name | city | 
+----+------+-----------+ 
| 2 | BCD | Mumbai | 
| 5 | EFG | Bangalore | 
| 1 | ABC | London | 
| 4 | DEF | Mumbai | 
| 3 | CDE | Bangalore | 
+----+------+-----------+

Fetched 5 row(s) in 1.52s 
Dept:

Query: select * from dept 
+------+---------+-----+ 
| code | subject | sid | 
+------+---------+-----+ 
| 102 | Physics | 2 | 
| 103 | Chem | 3 | 
| 101 | Maths | 1 | 
| 100 | CS | 1 | 
+------+---------+-----+

Fetched 4 row(s) in 0.43s

Impala INNER JOIN or Simply JOIN

Impala Inner join or Join returns matching columns from two or more tables.

Syntax and Examples

SELECT ID, code 
FROM students 
INNER JOIN dept 
ON (ID = sid);

Result:
+----+------+ 
| id | code | 
+----+------+ 
| 2 | 102 | 
| 3 | 103 | 
| 1 | 101 | 
| 1 | 100 | 
+----+------+

Impala LEFT OUTER JOIN or LEFT JOIN

You can use the name LEFT OUTER JOIN or LEFT JOIN to combine two or more tables. This type of join returns all rows from the left table even if there is no matching row available in the right table. Unmatched right table’s records will be NULL.

Syntax and Examples:

SELECT ID, code 
FROM students 
LEFT OUTER JOIN dept 
ON (ID = sid);

Results:
+----+------+ 
| id | code | 
+----+------+ 
| 5 | NULL | 
| 4 | NULL | 
| 3 | 103 | 
| 2 | 102 | 
| 1 | 101 | 
| 1 | 100 | 
+----+------+

Impala RIGHT OUTER JOIN or RIGHT JOIN

You can use the name RIGHT OUTER JOIN or RIGHT JOIN to combine two or more tables. This type of join returns all rows from the right table even if there is no matching row available in the left table. Unmatched left table’s records will be NULL.

Syntax and Examples

SELECT ID, code 
FROM students 
RIGHT OUTER JOIN dept 
ON (ID = sid);

Result:
+----+------+ 
| id | code | 
+----+------+ 
| 1 | 101 | 
| 1 | 100 | 
| 3 | 103 | 
| 2 | 102 | 
+----+------+

Impala FULL OUTER JOIN

It returns all rows from the both tables that fulfill the JOIN condition. The unmatched rows from both tables will be returned as a NULL.

Syntax and Examples

SELECT ID, code 
FROM students 
FULL OUTER JOIN dept 
ON (ID = sid);

Results:
+----+------+ 
| id | code | 
+----+------+ 
| 4 | NULL | 
| 5 | NULL | 
| 1 | 101 | 
| 1 | 100 | 
| 2 | 102 | 
| 3 | 103 | 
+----+------+

Impala SEMI JOIN

These types of joins are rarely used in Impala SQL. You can use LEFT and RIGHT along with SEMI JOIN. With the left semi join, only data from the left table is returned for rows where there is matching data in the right table. Only one instance of each row from the left table is returned, regardless of how many matching rows exist in the right table.

The right semi join is reverse of the left semi join.

Syntax and Examples

SELECT ID 
FROM students 
LEFT SEMI JOIN dept 
ON (ID = sid);

Results:
+----+ 
| id | 
+----+ 
| 1 | 
| 2 | 
| 3 | 
+----+ 

SELECT code 
FROM students 
RIGHT SEMI JOIN dept 
ON (ID = sid);

Results:
+------+ 
| code | 
+------+ 
| 102 | 
| 101 | 
| 100 | 
| 103 | 
+------+

Impala ANIT JOIN

The LEFT and RIGHT keywords are required for ANTI JOIN. For LEFT ANTI JOIN, this clause returns those values from the left-hand table that have no matching value in the right-hand table. And vice-versa for RIGHT ANTI JOIN.

Syntax and Examples

SELECT ID 
FROM students 
LEFT ANTI JOIN dept 
ON (ID = sid);

Results:
+----+ 
| id | 
+----+ 
| 5 | 
| 4 | 
+----+ 
SELECT CODE 
FROM students 
RIGHT ANTI JOIN dept 
ON (ID = sid);

Impala CROSS JOIN or Cartesian Join

To avoid huge data set producing, Impala does not allow the CROSS JOIN without where condition. The CROSS JOIN produces a result set with all combinations of rows from the left and right tables.

Syntax and Examples

SELECT ID,CODE 
FROM students 
CROSS JOIN dept 
where ID = 4

Result:
+----+------+ 
| id | code | 
+----+------+ 
| 4 | 100 | 
| 4 | 103 | 
| 4 | 102 | 
| 4 | 101 | 
+----+------+