Details about bigdata

Apache Hive LIKE statement and Pattern Matching Example

Like various relational databases such as Netezza, Teradata, and Oracle, Apache Hive supports pattern matching using LIKE, RLIKE, or the INSTR function. You can search for a string by matching patterns. Note that the Hive LIKE statement is case-sensitive. LIKE returns TRUE if the string you are searching matches the pattern. NOT LIKE is the negation of LIKE. Related reading: Apache Hive Regular Expression Functions; Apache Hive String Functions and Examples. Hive LIKE Statement Patterns Matching: If the string does not contain any percentage sign or underscore, then pattern…
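
The wildcard behaviour described above can be sketched outside Hive. The snippet below is a minimal Python model, not Hive code: it assumes the standard SQL meaning of the wildcards (% matches any run of characters, _ matches exactly one) and ignores escape characters and NULL handling. The helper names like_to_regex, like, and not_like are made up for this illustration.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern into an anchored, case-sensitive regex.

    Assumption: % means "any sequence of characters" and _ means
    "exactly one character"; escape sequences are not modelled.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            # Every other character must match literally.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """Return True when the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


def not_like(value: str, pattern: str) -> bool:
    """Negation of like(), mirroring NOT LIKE."""
    return not like(value, pattern)


if __name__ == "__main__":
    print(like("Apache Hive", "Apache%"))   # prefix match
    print(like("apache hive", "Apache%"))   # fails: matching is case-sensitive
    print(like("Hive", "H_ve"))             # _ stands for one character
    print(not_like("Hive", "%x%"))          # no "x" anywhere in the value
```

A pattern with no wildcard at all, such as like("Hive", "Hive"), only matches the identical string in this model.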

Apache Hive Derived Tables and Examples

In some applications, you may have to derive column values from base tables. For example, you may have to find the maximum value of an aggregated column. In this scenario, you first create the aggregated data and then apply the MAX function to that column. You can achieve this by using Apache Hive derived tables. We will check the types of derived tables supported in Hive with some examples. Apache Hive Derived Tables: An Apache Hive derived table is a subquery in the FROM clause of the HiveQL…
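
The "aggregate first, then take the MAX" scenario can be sketched as follows. This is a hedged illustration rather than Hive output: it runs the query on Python's built-in sqlite3 module as a local stand-in, and the sales table, its columns, and its rows are hypothetical. The SQL itself uses only a FROM-clause subquery with an alias, which Hive should also accept, but that is not verified here.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Hive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100), ("east", 250), ("west", 300), ("west", 20), ("north", 90)],
)

# The inner SELECT is the derived table: it builds per-region totals.
# The outer SELECT then applies MAX to the aggregated column.
DERIVED_TABLE_QUERY = """
SELECT MAX(t.total_amount) AS max_total
FROM (
    SELECT region, SUM(amount) AS total_amount
    FROM sales
    GROUP BY region
) t
"""

max_total = conn.execute(DERIVED_TABLE_QUERY).fetchone()[0]
print(max_total)  # largest per-region total (east: 100 + 250)
```

The alias t on the subquery matters: Hive requires a name for a subquery used in the FROM clause.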

Apache Hive Correlated Subquery and its Restrictions

An Apache Hive correlated subquery is a query within a query that refers to columns from the outer query. Hive supports some kinds of subqueries, such as table subqueries and WHERE clause subqueries, as well as correlated subqueries. In most cases, Hive correlated subqueries are used to improve Hive query performance. The above diagram explains correlated subqueries in relational databases and Apache Hive. Read: Apache Hive Supported Subqueries and Examples. Apache Hive Correlated Subquery Examples: For example, consider the query, “check if student id already exists in the…
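
A correlated subquery in the spirit of the student id check can be sketched as below. The excerpt is cut off before its own example, so everything here is an assumption: the students and new_students tables, their columns, and their rows are hypothetical, and the query runs on Python's built-in sqlite3 module as a stand-in for Hive. The inner query refers to n.student_id from the outer query, which is what makes it correlated.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Hive.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (student_id INTEGER, name TEXT);
    CREATE TABLE new_students (student_id INTEGER, name TEXT);
    INSERT INTO students VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO new_students VALUES (2, 'Ben'), (3, 'Chen');
    """
)

# The subquery is evaluated against each outer row through n.student_id.
CORRELATED_QUERY = """
SELECT n.student_id
FROM new_students n
WHERE EXISTS (
    SELECT 1
    FROM students s
    WHERE s.student_id = n.student_id
)
ORDER BY n.student_id
"""

already_present = [row[0] for row in conn.execute(CORRELATED_QUERY)]
print(already_present)  # ids from new_students that already exist in students
```

Swapping EXISTS for NOT EXISTS would return the ids that are not yet present instead.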

Apache Hive Supported Subqueries and Examples

A subquery in Hive is a select expression enclosed in parentheses as a nested query block in a HiveQL query statement. Like a subquery in other relational databases, a Hive subquery may return zero, one, or more values to its enclosing select statement. In this article, we will check Apache Hive supported subqueries and some examples. Apache Hive Supported Subqueries: As mentioned above, a Hive subquery is a select expression enclosed in parentheses as a nested query block. You can use these nested query blocks in any…

How to List Hive High Volume Tables?

Unlike other relational databases, Apache Hive does not have a system table that keeps track of the size of growing tables, so it is difficult to find a table's size in Hive with a query. As part of maintenance, you should periodically identify the size of growing tables. Big tables can cause performance issues in Hive. Below are some methods that you can use to list Hive high-volume tables. Use hdfs dfs -du Command: Hadoop supports many useful commands that you can use in day-to-day activities such as…
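
One way to turn hdfs dfs -du output into a ranked list is sketched below. It is a rough illustration under stated assumptions: each output line is taken to start with the size in bytes and end with the path (any middle column is ignored), paths are assumed to contain no spaces, and the sample text with its demo.db paths and sizes is invented for the example rather than captured from a real cluster. The function name largest_paths is also hypothetical.

```python
def largest_paths(du_output: str, top_n: int = 3):
    """Return the top_n (size_in_bytes, path) pairs, largest first.

    Assumption: the first whitespace-separated field of each line is the
    size in bytes and the last field is the path.
    """
    rows = []
    for line in du_output.splitlines():
        fields = line.split()
        # Skip blank lines and anything that does not start with a number.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        rows.append((int(fields[0]), fields[-1]))
    rows.sort(reverse=True)
    return rows[:top_n]


# Hypothetical text, shaped like the output of a command such as:
#   hdfs dfs -du /user/hive/warehouse/demo.db
SAMPLE_DU_OUTPUT = """\
1048576    3145728     /user/hive/warehouse/demo.db/orders
2048       6144        /user/hive/warehouse/demo.db/customers
734003200  2202009600  /user/hive/warehouse/demo.db/events
"""

if __name__ == "__main__":
    for size, path in largest_paths(SAMPLE_DU_OUTPUT, top_n=2):
        print(size, path)
```

Running the command without the -h flag keeps the sizes as plain byte counts, which is what this parser expects.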

Apache Hive LEFT-RIGHT Functions Alternative and Examples

If you have worked with other RDBMSs such as Oracle or Redshift, you may be surprised to learn that Hive does not support the LEFT and RIGHT functions. You will either have to write your own UDFs in Java or find an alternative. In this article, we will check Apache Hive LEFT-RIGHT function alternatives with some examples. Hive LEFT-RIGHT Functions Alternatives: Since Hive does not support the LEFT and RIGHT functions, you can use the Hive SUBSTR string function or the regexp_extract regular expression function to select the leftmost or rightmost characters from string values. Other possible…
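
The SUBSTR approach can be sketched as follows. This is an illustration on Python's built-in sqlite3 module, used as a stand-in because its substr() takes a 1-based start position and accepts a negative start that counts from the end of the string, as Hive's SUBSTR does. The sample string and the lengths 6 and 4 are arbitrary choices for the example.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Hive.
conn = sqlite3.connect(":memory:")

# LEFT(str, 6)  -> SUBSTR(str, 1, 6): start at position 1, take 6 characters.
# RIGHT(str, 4) -> SUBSTR(str, -4):   start 4 characters from the end.
LEFT_RIGHT_QUERY = """
SELECT SUBSTR('Apache Hive', 1, 6) AS left_part,
       SUBSTR('Apache Hive', -4)   AS right_part
"""

left_part, right_part = conn.execute(LEFT_RIGHT_QUERY).fetchone()
print(left_part, right_part)
```

The regexp_extract route mentioned above is not shown, since SQLite has no equivalent built-in function.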

Cloudera Impala Extract Function and Examples

The Cloudera Impala extract function returns one of the numeric date or time fields from a TIMESTAMP value. It extracts the subfield represented by units from the date/time value, interval, or duration specified for the column. This function is equivalent to the Impala date_part() function, but with the parameters reversed. In this article, we will discuss the Impala extract function and its usage with some examples. Cloudera Impala Extract Function Syntax: The extract function complies with the SQL-99 standard. Its syntax is the same as that of extract functions in other RDBMSs. Below…

Apache Hive Extract Function Alternative and Examples

In general, the extract function extracts the subfield represented by units from the date/time value, interval, or duration specified for a column. Apache Hive does not support the extract function, but you can use other built-in functions to extract the required units from a date value. In this article, we will check the Hive extract function alternative and some examples. Hive Extract Function Alternative: There is no extract function in Hive to extract a sub part of date values. You can use the Hive built-in date function date_format() to extract the required values from date fields. Below…

Impala Create External Table, Syntax and Examples

An Impala external table allows you to access an external HDFS file as if it were a regular managed table. This saves the resources and expense of importing the data file into the Impala database. You can join external tables and write complex queries against them, the same as with managed tables. In this article, we will check Impala create external table with some examples. The syntax for creating an Impala external table is the same as for creating managed tables, with one exception: the LOCATION option is mandatory for creating external tables. LOCATION…

Difference Between Hive CLI and Beeline Client – Hive vs Beeline

Beeline was developed to interact with the new server, HiveServer2. Hive CLI is an Apache Thrift based client, whereas Beeline is a JDBC client based on the SQLLine CLI. In this article, we will check the differences between Hive CLI and Beeline client – Hive vs Beeline. Difference Between Hive CLI and Beeline Client – Hive vs Beeline: Below are some of the differences between Hive CLI and Beeline client. These differences will help you if you are migrating from the old Hive CLI to the new Beeline client. Server Connection: Hive…
