Netezza Fuzzy String search Functions and Examples

  • Post author:
  • Post last modified:February 27, 2018
  • Post category:Netezza
  • Reading time:4 mins read

If you ever wondered how to verify how similar or different strings are. The Netezza SQL language supports two Netezza fuzzy string search functions:  Levenshtein Edit Distance and Damerau-Levenshtein Edit Distance. You can perform the fuzzy search with help of these functions.

netezza fuzzy string search functions

Basically, Netezza fuzzy string search function supports a search in a form of approximate string matching that is based on the defined technique or algorithms. These functions compare two strings to show how similar or different they are. These two fuzzy functions support only VARCHAR or CHAR data types.

Read:

Netezza Fuzzy String search Functions: Levenshtein Edit Distance

Syntax

le_dst(<str_expr_1>, <str_expr_2>)

Return value

Function returns the value which indicates how different the two input string expressions are according to Levenshtein Edit Distance algorithms.

A value of 0 indicates that the two given strings are equivalent and no other modification required. The algorithm computes the number of modifications that are required to change the first string expression into the second string. The string arguments to the functions are case-sensitive. A modification is a change such as an addition, deletion, letter case-change, or substitution of a single character.

Levenshtein Edit Distance Examples

TRAINING(ADMIN)=> select le_dst('fellow','hollow');
 LE_DST
--------
 2
(1 row)

This returns value 2. Replace the character ‘f’ to ‘h’ and ‘e’ to ‘0’.

TRAINING(ADMIN)=> select le_dst('12345','1236');
 LE_DST
--------
 2
(1 row)

This returns value 2. Remove character string ‘5’ and change ‘4’ to ‘6’.

Because the string comparisons are case-sensitive, you can use Netezza built in string functions such as upper() and lower() to change the letter casing of strings before the comparison and ignore case-change modifications.

Read:

For example,

TRAINING(ADMIN)=> select le_dst('fellow','FELLOW');
 LE_DST
--------
 6
(1 row)

Return a value of 6. Six uppercase letter changes and a letter substitution.

TRAINING(ADMIN)=> select le_dst(upper('fellow'),'FELLOW');
 LE_DST
--------
 0
(1 row)

Return a value of 0. String expressions are similar.

Netezza Fuzzy String search Functions: Damerau-Levenshtein Edit Distance

Syntax:

dle_dst (<str_expr_1>, <str_expr_2>);

Function returns the value which indicates how different the two input string expressions are according to Damerau-Levenshtein Edit Distance algorithms.

Similar to the Levenshtein algorithm, a modification is a change such as an addition, deletion, letter case-change, or substitution of a single character. However, in the Damerau-Levenshtein algorithm, a character transposition change such as ‘two’ to ‘tow’ counts as one change, not two.

Damerau-Levenshtein Edit Distance Examples

TRAINING(ADMIN)=> select dle_dst('fellow','hollow');
 DLE_DST
---------
 2
(1 row)

TRAINING(ADMIN)=> select dle_dst('two','tow');
 DLE_DST
---------
 1
(1 row)