Hive Insert from Select Statement and Examples

  • Post author:
  • Post last modified:September 11, 2020
  • Post category:BigData
  • Reading time:6 mins read

Apache Hive is the data warehouse framework on top of the Hadoop distributed file system (HDFS). It provides a query language called Hive Query Language, HiveQL or HQL. HiveQL syntax is similar to SQL syntax with minor changes. Similar to SQL insert statements, HQL also supports inserting data into tables using various methods. In this article, we will check one of the data insert methods into Hive table using a Select statement or clause.

Hive Insert from Select Statement and Examples

Hive Insert Data into Table Methods

Below are the some of commonly used methods to insert data into tables.

  • INSERT INTO table using VALUES clause
  • The Insert data into table using LOAD command
  • INSERT INTO table using SELECT clause

Now let us check these methods with some simple examples. Though, we will check third method in details, but let us check other methods too.

INSERT INTO table using VALUES clause

This method is easiest and mostly widely used when you have a very limited set of records to be inserted into the Hive table.

Below is the syntax of the INSERT into Hive table using VALUES clause.

INSERT INTO table [(col1, col2,...)]  VALUES (value1, value2, ...);

For example, consider below example to insert recrods into ‘insert_test’ table using VALUES clause.

INSERT INTO insert_test (col1, col2) VALUES (1, 'ABC');

Insert data into table using LOAD command

The second method, inserting data into Hive table using LOAD command works well when you have a data stored in flat files.

I have written another post on a similar subject. You can read more about loading data from CSV file in my other post Apache Hive Load Quoted Values CSV File and Examples.

INSERT INTO table using SELECT clause

This is one of the widely used methods to insert data into Hive table. We will use the SELECT clause along with INSERT INTO command to insert data into a Hive table by selecting data from another table.

Below is the syntax of using SELECT statement with INSERT command.

INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1

You should use PARTITION clause only if your Hive table is partitioned.

For example, consider simple example of inserting data into Hive table using SELECT clause.

INSERT INTO insert_test SELECT * FROM insert_test2;

failed rule ‘regularbody’ in statement – Error while Inserting Data

This is interesting error you may get when you try to insert data into Hive table using SELECT statement without FROM clause.

This is a known issue, and it is fixed in Hive version 1.3.1/2.0.

You can’t use the insert as select without from a clause in earlier versions. Rewrite your query using insert into… values… or upgrade to latest Hive version.

Below is the workaround to insert data into Hive table using select clause without FROM if you can’t upgrade to the latest version.

INSERT INTO insert_test SELECT * (SELECT Col1, Col2 FROM insert_test2) as tmp;

The above query will work on all versions of Apache Hive. You can use it to provide back-word compatibility.

Related Articles

Hope this helps 🙂