Apache Hive Integer Value Check – Examples

  • Post author:
  • Post last modified:November 17, 2019
  • Post category:BigData
  • Reading time:6 mins read

In a data warehouse environment, there are many options that you can check for an integer value. Using this process, you can usually remove the unwanted records and save some I/O. For example, filter out non-numeric values when comparing it with integer types. In this article, we see different methods to check integer value in Hive.

Apache Hive Integer Value Check

Many relational databases provide an extended SQL function to help the data warehouse developers. The built-in functions such isnumeric is used to check given string value is a number or numeric values. In my other article, we have seen Hive isnumeric alternatives. As of now, Apache Hive does not support any built-in function to check if given string or expression is numeric or integer. But, luckily there are many other options those you can use to check integer values in Apache Hive.

Methods to Check Integer Value in Hive

As mentioned earlier, integer value of a column or an expression can be validated by many methods.

You can use any of the following methods to validate numeric or number type in Hive.

  • RLIKE with Regular Expression
  • CAST Function
  • Create Hive Macro, or
  • Optionally, create user defined function

Now, let use verify the above methods with an example.

Hive RLIKE Statement

The RLIKE is a pattern matching statements available in Hive. Integer type can be validated by RLIKE. The RLIKE uses regular expression to match patterns to identify the integer type value.

You can use this Hive pattern matching function in the WHERE clause to get only integer values from a column or an expression.

For example, consider below SQL statement to validate integer values.

select col1 rlike '^(([0-9]*))$'
from (select '11214' as col1) as q;

+-------+--+
|  _c0  |
+-------+--+
| true  |
+-------+--+

You can use RLIKE in a WHERE condition to filter out non-integer values. In short, RLIKE can be used in either SELECT or WHERE depends on your requirements.

Use Hive CAST Function

The Hive CAST function converts the value of an expression or column to any other type. However, the result of the function will be NULL in case if the function cannot convert to a particular data type.

For example, consider below Hive query with CAST function in WHERE clause to ignore non-integer values.

select * 
from (select '11214' as col1 union all select '1255' as col1 union all select 'aaaa' as col1) as q 
where cast(col1 as INT) is not null;

+---------+--+
| q.col1  |
+---------+--+
| 11214   |
| 1255    |
+---------+--+

Related Article

Create Hive Macro To Check Integer Type Column

You can also create macro function to validate the integer column or an expression.

For example, consider below macro function which uses CAST function.

DROP TEMPORARY MACRO IF EXISTS isinteger;

CREATE TEMPORARY MACRO isinteger(inputstring string)
CASE WHEN CAST(inputstring AS INT) IS NULL THEN false else true
END
;

Use macro to validate an expression. For example,

select isinteger('1232'), isinteger('1ddd');

+-------+--------+--+
|  _c0  |  _c1   |
+-------+--------+--+
| true  | false  |
+-------+--------+--+

Related Article

Create User Defined Function to Validate Integer Value

Optionally, create user defined functions that you can use to validate integer value. Follow my other article Apache Hive user defined function and examples. But, you should know the programming language such as Java or Python to create UDF.

Hope this helps 🙂