Hadoop Hive Regular Expression Functions and Examples

  • Post last modified: December 25, 2019
  • Post category: BigData

Hadoop Hive regular expression functions identify patterns of characters in a string. They are useful for extracting substrings from data and for validating existing data: for example, validating dates, performing range checks, checking for specific characters, and extracting specific characters.

In this article, we will look at some commonly used Hadoop Hive regular expression functions, with examples.

Types of Hadoop Hive regular expression functions

Hive provides two regular expression functions for transforming strings (pattern matching in predicates is handled separately by the RLIKE and REGEXP operators):

  • REGEXP_REPLACE
  • REGEXP_EXTRACT

Hive REGEXP_REPLACE Function

Searches a string for a regular expression pattern and replaces every occurrence of the pattern with the specified replacement.

Hive REGEXP_REPLACE Function Syntax

Below is the syntax for Hive REGEXP_REPLACE Function.

regexp_replace(string INITIAL_STRING, string PATTERN, string REPLACEMENT);

Hive REGEXP_REPLACE Function Example

The example below replaces every caret (‘^’) in the string with a dollar sign (‘$’). Both characters have a special meaning in regular expressions, so each is escaped with a backslash:

hive> select regexp_replace('HA^G^FER$JY',"\\^","\\$");

HA$G$FER$JY
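Hive's regular expression functions follow Java regular expression syntax, so the escaping in the query above can be sketched in plain Java. This is only an illustration, not Hive code: the class RegexpReplaceSketch and its regexpReplace helper are hypothetical names that wrap java.util.regex to mimic the shape of the Hive call.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexpReplaceSketch {

    // Hypothetical helper with the same argument order as Hive's regexp_replace.
    public static String regexpReplace(String input, String pattern, String replacement) {
        return Pattern.compile(pattern).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // "\\^" is the regex \^ (a literal caret); "\\$" is the replacement \$ (a literal dollar sign).
        System.out.println(regexpReplace("HA^G^FER$JY", "\\^", "\\$")); // prints HA$G$FER$JY

        // Same result, letting the library do the escaping instead of writing backslashes by hand.
        String safePattern = Pattern.quote("^");
        String safeReplacement = Matcher.quoteReplacement("$");
        System.out.println(regexpReplace("HA^G^FER$JY", safePattern, safeReplacement)); // prints HA$G$FER$JY
    }
}
```

An unescaped ‘^’ in the pattern would be read as a start-of-line anchor, and an unescaped ‘$’ in the replacement would be read as the start of a group reference, which is why both need the backslash.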

You can also use the Hive translate function to replace values. You can read more about it in my other article, Apache Hive Replace Function and Examples.

Hive REGEXP_EXTRACT Function

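The Hive REGEXP_EXTRACT function returns the part of the string that matches the pattern. The index argument selects which capture group to return: 0 returns the entire match, 1 returns the first parenthesized group, 2 the second, and so on.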

Hive REGEXP_EXTRACT Function Syntax

Below is the Hive REGEXP_EXTRACT Function syntax:

regexp_extract(string subject, string pattern, int index);

Hive REGEXP_EXTRACT Function Example

The example below extracts data from a string. The pattern contains two capture groups, (.*?) and (bar), and the index 2 selects the second one:

hive> select regexp_extract('foothebar', 'foo(.*?)(bar)', 2);

bar
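To see how the index argument picks a capture group, the same pattern can be tried in plain Java, whose regular expression syntax Hive follows. This is a minimal sketch rather than Hive code: RegexpExtractSketch and regexpExtract are hypothetical names, and returning an empty string when nothing matches is a choice made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexpExtractSketch {

    // Hypothetical helper with the same argument order as Hive's regexp_extract.
    public static String regexpExtract(String subject, String pattern, int index) {
        Matcher m = Pattern.compile(pattern).matcher(subject);
        // find() searches for the pattern anywhere in the subject;
        // an empty string on no match is an assumption of this sketch.
        return m.find() ? m.group(index) : "";
    }

    public static void main(String[] args) {
        String subject = "foothebar";
        String pattern = "foo(.*?)(bar)";

        System.out.println(regexpExtract(subject, pattern, 0)); // prints foothebar (whole match)
        System.out.println(regexpExtract(subject, pattern, 1)); // prints the (first group)
        System.out.println(regexpExtract(subject, pattern, 2)); // prints bar (second group)
    }
}
```

The lazy quantifier in (.*?) stops at the first point where (bar) can match, so the first group captures only the text between foo and bar.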

Hope this helps 🙂
