Cloudera Impala TRUNCATE TABLE statement removes all records from the table while keeping the table structure as it is. This statement is low overhead alternative for dropping and re-creating the tables. This statement is also low overhead compared to the INSERT OVERWRITE to replace the existing data from the HDFS directory before copying data. This is one of the features added in CDH 5.5 or higher.
This statement helps when you are performing ELT/ELT operation cycles on Cloudera Impala where you have to empty the table after the data has been copied to other table for the next stage of processing.
Read:
- Cloudera Impala Generate Sequence Numbers without UDF
- Run Impala SQL Script File Passing argument and Working Example
Cloudera Impala Truncate Table Statement in CDH 5.5 or higher Example
Below shows the example of the truncate table statement in CDH 5.8:
[quickstart.cloudera:21000] > create table trunc_demo as select * from test; Query: create table trunc_demo as select * from test +-------------------+ | summary | +-------------------+ | Inserted 4 row(s) | +-------------------+ Fetched 1 row(s) in 0.66s [quickstart.cloudera:21000] > select * from trunc_demo; Query: select * from trunc_demo +----+------+--------+ | id | name | dep_id | +----+------+--------+ | 1 | abc | 100 | | 2 | bcd | 102 | | 3 | cde | 103 | | 4 | def | 104 | +----+------+--------+ Fetched 4 row(s) in 0.37s [quickstart.cloudera:21000] > TRUNCATE TABLE trunc_demo; Query: truncate TABLE trunc_demo Fetched 0 row(s) in 0.45s [quickstart.cloudera:21000] > select * from trunc_demo; Query: select * from trunc_demo Fetched 0 row(s) in 0.17s [quickstart.cloudera:21000] >
Simulate Truncate Table in CDH 5.4 or Lower version
If you are using CDH 5.4 or lower, there is no TRUNCATE TABLE statement available. You have to simulate the TRUNCATE TABLE by using INSERT OVERWRITE. Below is the example how it works:
[quickstart.cloudera:21000] > create table trunc_demo as select * from test; Query: create table trunc_demo as select * from test +-------------------+ | summary | +-------------------+ | Inserted 4 row(s) | +-------------------+ Fetched 1 row(s) in 0.71s [quickstart.cloudera:21000] > select * from trunc_demo; Query: select * from trunc_demo +----+------+--------+ | id | name | dep_id | +----+------+--------+ | 1 | abc | 100 | | 2 | bcd | 102 | | 3 | cde | 103 | | 4 | def | 104 | +----+------+--------+ Fetched 4 row(s) in 0.55s [quickstart.cloudera:21000] > INSERT OVERWRITE trunc_demo select * from test limit 0; Query: insert OVERWRITE trunc_demo select * from test limit 0 Inserted 0 row(s) in 0.40s [quickstart.cloudera:21000] > select * from trunc_demo; Query: select * from trunc_demo Fetched 0 row(s) in 0.14s [quickstart.cloudera:21000] >
Related reading:
- An Introduction to Hadoop Cloudera Impala Architecture
- Cloudera Impala Cumulative Sum, Average and Example