Hadoop – Export Hive Data with Quoted Values into Flat File and Example

  • Post last modified: February 26, 2018
  • Post category: BigData

In general, quoted values are values enclosed in single or double quotation marks. Quoted-value files are usually system generated, with every field in the flat file enclosed in single or double quotation marks. In this article, we will check how to export Hadoop Hive data with quoted values into a flat file such as a CSV file.


Quoted Value File Overview

In a quoted-value file, values are enclosed in quotation marks whenever a field contains an embedded delimiter. For example, a comma-separated values file can have commas embedded within its values; an address field is a typical example. Below is a sample with embedded commas.

"1", "Johny", "1, NYC"
"2", "Tim", "10, DC"
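If such a file is parsed on commas alone, the embedded comma splits the address field and shifts the columns. A plain delimited external table shows the failure mode (the customers_plain name and /data/customers location below are illustrative):

```sql
-- A plain comma-delimited table cannot handle quoted fields:
-- the default delimited SerDe splits on every comma, quotes included.
CREATE EXTERNAL TABLE customers_plain(id string, name string, address string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/customers';
-- The row "1", "Johny", "1, NYC" produces four comma-separated tokens,
-- so address ends up holding only '"1' and the trailing ' NYC"' is lost.
```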

Export Hive Data with Quoted Values into Flat File

One possible solution is to create an external table using the Hive CSV SerDe (Serializer/Deserializer), which lets you specify custom delimiter, quote, and escape characters.


Below is a Hive external table example that you can use to unload a table with values enclosed in quotation marks:

CREATE EXTERNAL TABLE quoted_file(name string, amount int)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = ",",
"quoteChar" = "\""
)
STORED AS TEXTFILE
LOCATION '/test';
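The SerDe also accepts an escapeChar property for data where the quote character itself appears inside a value. Here is a variant of the same DDL with all three properties (the quoted_file_esc name and /test_esc location are illustrative):

```sql
CREATE EXTERNAL TABLE quoted_file_esc(name string, amount int)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\"",
  "escapeChar"    = "\\"   -- backslash escapes embedded quote characters
)
STORED AS TEXTFILE
LOCATION '/test_esc';
```

One caveat: OpenCSVSerde treats every column as STRING at read time regardless of the declared types, so cast in your queries if you need numeric semantics.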

Then insert into the external table by selecting from the table that you want to unload:

hive> insert into quoted_file select * from test;
Query ID = cloudera_20171105031313_dd5450c3-1710-461b-94a6-6d6d7b22f02f
Total jobs = 3
Launching Job 1 out of 3
....
Stage-Stage-1: Map: 1 Cumulative CPU: 2.68 sec HDFS Read: 3532 HDFS Write: 166 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 680 msec
OK

Time taken: 34.141 seconds

Now check the HDFS /test location for the generated CSV file. You can copy it to a local disk using the hdfs dfs -get command.

[cloudera@quickstart test]$ hdfs dfs -ls /test 
Found 1 items 
-rwxr-xr-x 1 cloudera supergroup 108 2017-11-05 03:20 /test/000000_0

[cloudera@quickstart test]$ hdfs dfs -cat /test/000000_0 
"cde","400" 
"efg","600" 
"bcd","100" 
"cde","400" 
"efg","600" 
"bcd","100" 
"abc","100" 
"abc","200" 
"bcd","100" 
[cloudera@quickstart test]$
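You can also verify the round trip from Hive itself; reading the external table back through the SerDe strips the quotation marks again:

```sql
-- The SerDe removes the quoteChar on read, so the
-- original unquoted values are returned.
SELECT * FROM quoted_file LIMIT 3;
```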