INSERT OVERWRITE statements that write to HDFS or LOCAL directories are an efficient way to extract large amounts of data from a Hive table or from query output, because Hive can write to HDFS directories in parallel from within a map-reduce job. In this article, we will look at how to export Hive query output into a local directory using INSERT OVERWRITE, along with some examples.
Export Hive Query Output into Local Directory using INSERT OVERWRITE
Query results can be written to filesystem directories with the Hive INSERT OVERWRITE statement. The target can be either an HDFS or a LOCAL directory, so you can store high-volume query output wherever your requirements dictate.
Hive INSERT OVERWRITE Syntax
Below is the INSERT OVERWRITE syntax that you can use to export Hive query output into a local directory.
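As a rough sketch, the general form of the statement, as we understand it from the Hive language manual, looks like the following. Square brackets mark optional parts, and `directory1`, `directory2`, and the SELECT clauses are placeholders rather than real names. The ROW FORMAT and STORED AS clauses for directory exports are, as far as we know, only available in Hive 0.11.0 and later.

```sql
-- Syntax template only; replace the placeholders before running.

-- Standard form: write the result of a query to a directory.
INSERT OVERWRITE [LOCAL] DIRECTORY directory1
  [ROW FORMAT row_format] [STORED AS file_format]
  SELECT ... FROM ...;

-- Hive extension: several inserts fed by a single FROM clause.
FROM from_statement
INSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1
[INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...;
```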
If the LOCAL keyword is used, Hive writes the data to a directory on the local file system; without it, the directory is on HDFS. As an extension, Hive also supports multiple inserts in a single statement.
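To illustrate the multiple-insert extension, here is a minimal sketch that writes two different slices of one source to two directories. The table `export_demo` (with columns `id` and `name`) and both paths are hypothetical names chosen for this illustration; the first target is a local directory and the second is on HDFS.

```sql
-- Hypothetical table and paths, for illustration only.
-- Both INSERT clauses read from the same FROM clause.
FROM export_demo t
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/export_demo_small'
  SELECT t.id, t.name WHERE t.id < 100
INSERT OVERWRITE DIRECTORY '/tmp/export_demo_large'
  SELECT t.id, t.name WHERE t.id >= 100;
```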
Export Hive Query Output into Local Directory using INSERT OVERWRITE – Example
Export Hive Query Output into Local Directory
You can use the INSERT OVERWRITE command in Hive to export data to a local directory.
Below are some examples that demonstrate exporting Hive query output into a local directory using the INSERT OVERWRITE statement:
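A minimal sketch of a local export might look like this. The table `export_demo`, its two sample rows, and the path `/tmp/export_demo_local` are all assumptions made up for this illustration, and the `INSERT ... VALUES` line assumes Hive 0.14 or later. Note that INSERT OVERWRITE replaces whatever is already in the target directory, so point it at a directory you can afford to lose.

```sql
-- Hypothetical demo table; names and values are for illustration only.
CREATE TABLE IF NOT EXISTS export_demo (id INT, name STRING);
INSERT INTO TABLE export_demo VALUES (1, 'alpha'), (2, 'beta');

-- Export the query output to a local directory as comma-separated text.
-- Caution: existing contents of the target directory are overwritten.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/export_demo_local'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
SELECT id, name FROM export_demo;
```

If the ROW FORMAT clause is left out, Hive separates the fields with its default delimiter, the Ctrl-A character (`\001`), which is usually less convenient for downstream tools.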
Now let us check the local directory for output:
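One way to do this without leaving the Hive CLI is the `!` prefix, which hands the rest of the line to the host shell. The path below is the hypothetical export directory assumed for this illustration; with Beeline the equivalent would differ, so treat this as a Hive CLI sketch only.

```sql
-- Run from the Hive CLI; the leading "!" passes the command to the host shell.
-- '/tmp/export_demo_local' is a hypothetical path used for illustration.
!ls -l /tmp/export_demo_local;
```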
Verify the local directory for the data file.
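For a small export you can print the data file directly. The file name `000000_0` is an assumption: Hive typically names its output files this way, but the actual name and the number of files depend on the job, so use whatever the directory listing shows.

```sql
-- Hypothetical path and file name; substitute the name from the listing.
!cat /tmp/export_demo_local/000000_0;
```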
Export Hive Query Output into HDFS Directory
You can also export the query output to an HDFS directory. This feature saves the time you would otherwise spend exporting to a local file and then copying it to HDFS.
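A sketch of the HDFS variant simply drops the LOCAL keyword. The table `export_demo` (columns `id` and `name`) and the HDFS path `/tmp/export_demo_hdfs` are hypothetical names assumed for this illustration. As with the local form, any existing contents of the target directory are replaced.

```sql
-- Hypothetical table and HDFS path, for illustration only.
-- Without LOCAL, the directory is resolved on HDFS.
INSERT OVERWRITE DIRECTORY '/tmp/export_demo_hdfs'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
SELECT id, name FROM export_demo;
```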
Now let us check the HDFS directory for output:
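From the Hive CLI, the `dfs` command runs HDFS file system commands directly. The path below is the hypothetical export directory assumed for this illustration.

```sql
-- Run from the Hive CLI; '/tmp/export_demo_hdfs' is a hypothetical path.
dfs -ls /tmp/export_demo_hdfs;
```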
Verify that the HDFS directory contains the data file you exported.
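To inspect the contents of a small export, you can print the file from the Hive CLI. The file name `000000_0` is an assumption based on how Hive typically names its output files; use the name reported by the directory listing.

```sql
-- Hypothetical path and file name; substitute the name from the listing.
dfs -cat /tmp/export_demo_hdfs/000000_0;
```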