There are many situations where you are required to export DDL’s. For example, you are migrating some of your Hive tables to the RDBMS for reporting. If you are working as a Hadoop administrator, you should have knowledge on how to export Table DDL. In this article, we will check on how to export Hive table DDL to a text file using shell script and beeline connection string.
Export Hive Table DDL
As mentioned earlier, it is good to have a utility that allows you to generate DDL in Hive. There are two methods that you can use to generate DDL:
- Use SHOW CREATE TABLE command
- Write Shell Script to generate able tables in Hive Database
Second method uses a SHOW CREATE TABLE command to dump all tables DDL present in given database.
Use SHOW CREATE TABLE command
The SHOW CREATE TABLE command Hive provided table command that can be used when you want to generate DDL for the single Hive table. The Command is similar to CREATE TABLE command, except it dumps DDL instead of creating tables.
SHOW CREATE TABLE command Synax
Below is the syntax to dump Hive table DDL on standard output:
SHOW CREATE TABLE Hive_Table;
SHOW CREATE TABLE command Example
Below is the example on using show create table command in Hive.
> SHOW CREATE TABLE test;
...
| CREATE TABLE `test`(`col1` int, `col2` int)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
)
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
TBLPROPERTIES (
'rawDataSize' = '11',
'numFiles' = '1',
'transient_lastDdlTime' = '1554958792',
'totalSize' = '14',
'COLUMN_STATS_ACCURATE' = '{"BASIC_STATS":"true"}',
'numRows' = '3'
)
...
As you can see in the example, the SHOW CREATE TABLE command provides all the information that are associated with your managed tables. For example, command will display the table properties that are associated with your table. You may keep these properties in the new table or simply ignore them from the output.
Export All Hive Tables DDL in the Database
You can make use of SHOW CREATE TABLE command to export all Hive tables DDL present in any database. This little script comes handy when you have requirement to export Hive DDL for multiple tables.
Below is the sample script. Note that, we have used beeline with kerberos details to connect to Hive database. You can simply use hive if you are on cluster node.
#!/bin/bash
tbl_name='/home/user/vithal/tableNames.txt'
ddl_name='/home/user/vithal/HiveTableDDL.txt'
rm -f $tbl_name
rm -f $ddl_name
beeline -u "jdbc:hive2://192.168.200.103:10000/test_db; principal=hive/company.example.com@EXAMPLE.COM; auth=Kerberos;" -n user1 --showHeader=false --silent=true --outputformat=csv2 -e "show tables" > $tbl_name
wait
cat $tbl_name | while read LINE
do
beeline -u "jdbc:hive2://192.168.200.103:10000/test_db; principal=hive/company.example.com@EXAMPLE.COM; auth=Kerberos;" -n user1 --showHeader=false --silent=true --outputformat=csv2 -e "show create table $LINE" >> $ddl_name
echo -e "\n" >> $ddl_name
echo $LINE
done
rm -f $tbl_name
echo "Table DDL generated"
Related Article
Hive Describe Formatted/Extended Output
Another command that you can use to get table data types and HDFS location is Describe Formatted/Extended command. It does not provide true DDL but you can use the information provided to build the DDL statement.
For example, you can use below command to get extended properties of customer table.
> describe Formatted customer;
Hope this helps 🙂