A Hive external table allows you to access external HDFS file as a regular managed tables. You can join the external table with other external table or managed table in the Hive to get required information or perform the complex transformations involving various tables. In this article, we will check on Hive create external tables with an examples.
You have to create external table same as if you are creating managed tables. LOCATION is mandatory for creating external tables. LOCATION indicates the location of the HDFS flat file that you want to access as a regular table.
Hive Create External Tables Syntax
Below is the simple syntax to create Hive external tables:
CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.] table_name [(col_name data_type [COMMENT col_comment], ...)] [COMMENT table_comment] [ROW FORMAT row_format] [FIELDS TERMINATED BY char] [STORED AS file_format] [LOCATION hdfs_path];
There are other options along with above commonly used options. You can refer Hive CREATE TABLE official documentation for more information.
Read:
You can also create the external table similar to existing managed tables. Use below syntax:
CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.]table_name LIKE existing_table_or_view_name [LOCATION hdfs_path];
A Hive External table has a definition or schema, the actual HDFS data files exists outside of hive databases. Dropping external table in Hive does not drop the HDFS file that it is referring whereas dropping managed tables drop all its associated HDFS files.
Read:
- Apache Hive Fixed-Width File Loading Options and Examples
- Apache Hive Temporary Tables and Examples
- Hadoop Distributed File System (HDFS) Architecture
Hive Create External Tables Examples
Below is the example to create external tables:
hive> CREATE EXTERNAL TABLE IF NOT EXISTS test_ext > (ID int, > DEPT int, > NAME string > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY ',' > STORED AS TEXTFILE > LOCATION '/test'; OK Time taken: 0.395 seconds hive> select * from test_ext; OK 1 100 abc 2 102 aaa 3 103 bbb 4 104 ccc 5 105 aba 6 106 sfe Time taken: 0.352 seconds, Fetched: 6 row(s) hive> CREATE EXTERNAL TABLE IF NOT EXISTS test_ext > LIKE test_ext21 > LOCATION '/test'; OK Time taken: 0.332 seconds
Hive External Table Skip Header
Skipping header comes to picture when your data file has a header row and you want to skip it before reading it.
From Hive version 0.13.0, you can use skip.header.line.count property to skip header row when creating external table. You could also specify the same while creating the table.
For example, consider below external table.
create external table test_ext (name string, message string)
row format delimited
fields terminated by '\t'
lines terminated by '\n'
location '/testtable'
tblproperties ("skip.header.line.count"="1");
or simply use ALTER TABLE command to add tblpoperties.
For example, Consider below alter table statement.
ALTER TABLE test_ext SET TBLPROPERTIES ("skip.header.line.count"="1");
Hope this helps 🙂