Hadoop Hive Bucket Concept and Bucketing Examples
In Hadoop Hive, bucketing divides a Hive partition into a fixed number of roughly equal clusters, or buckets. The concept is very similar to the Netezza ORGANIZE ON clause for table clustering. Bucketing decomposes partitioned Hive data into more manageable parts. Let us look at an example of Hive bucket usage. Say we have a sales table with the columns sales_date, product_id, product_dtl, and so on. The Hive table is partitioned on sales_date only, because using product_id as a second-level partition would lead to too many small partitions in HDFS. To tackle this…