Netezza Joins and it’s Algorithms

  • Post author:
  • Post last modified:May 20, 2019
  • Post category:Netezza
  • Reading time:4 mins read

Netezza joins combines the columns of the reports from the more than one table. SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them.

Netezza Joins and it's Algorithms

The best way to understand the Netezza joins is to get to know the SQL joins and then consider the algorithm that is used by database to resolve them.

Read:

Netezza Joins: Equi Join

An equijoin is a join where the join condition is an equality operator ( = ), and each term is a simple column reference. An equijoin combines rows that have equivalent values for the specified columns.

SELECT a.col1, b.col2 
FROM table1 a 
 JOIN table2 b 
 ON a.col1= b.col2

Netezza Algorithm:

Netezza usually resolves the equi-join by using a Hash Join approach, where the hashed values from the column provided in the join conditions and compared for the equality. This type of the join is a fastest join in the Netezza.

In some cases, data which we are comparing is already in the sorted order then Netezza resolves using the Merge sort join approach. For example, if you use the materilized views in join operations then Netezza makes use of Merge sort join.

Netezza Joins: Cross Joins

If two tables in a join query have no join condition, or join criteria does not include any columns from at least one of the table then these types of joins are called cross joins.

SELECT * from table1 a, table2 b;

or

SELECT a.col1, b.col2 
FROM table1 a 
 JOIN table2 b 
 ON a.col1 = 0;

Netezza Algorithm:

Netezza resolves cross join by using Cartesian product approach. Cartesian product approach is very expensive and will never be a co-located as every data slice from once table must be joined to every data slice in other table. You join be very careful while performing any cross joins.

Netezza Joins: Expression Joins

Expression joins are joins where the joining condition is anything other than equality conditions. For example, if you are using > operator condition in join to compare values from two tables then that could leads to expression join.

SELECT a.col1, b.col2 
FROM table1 a 
 JOIN table2 b 
 ON a.col1 > b.col2;

Netezza Algorithm:

Netezza resolves these types of joins by using Nested loop approach. These types of joins are also very expensive as they can never be a co-located joins as each rows in one table are evaluated individually with each rows in other table. These types of joins require lots of resources to resolve hence can degrade the Netezza performance.

Netezza Joins: Implicit IN/EXISTS Joins

These types of joins are not explicit join. If specified in the SQL queries, then Netezza resolves them with implicit joins.

SELECT a.col1, a.col2 
FROM table1 a 
WHERE a.col1 IN 
 (
 SELECT b.col1 
 FROM table2 b 
 );

or

SELECT a.col1, a.col2 
FROM table1 a 
WHERE NOT EXISTS 
 (
 SELECT * 
 FROM table2 b 
 WHERE a.col1= b.col1 
 );

Netezza Algorithm:

Netezza resolves these types of joins by using Exists join approach. Basically, the exists join is same as hash join approach.