Apache Hive Different File Formats: TextFile, SequenceFile, RCFile, AVRO, ORC, Parquet

Apache Hive supports several familiar file formats used in Apache Hadoop. Hive can load and query data files created by other Hadoop components such as Pig or MapReduce. In this article, we will look at the different Apache Hive file formats: TextFile, SequenceFile, RCFile, AVRO, ORC and Parquet. Cloudera Impala also supports these file formats. Hive Different File Formats: Different file formats and compression codecs work better for different data sets in Apache Hive. The following are the Apache Hive file formats: Text File, Sequence File, RC File…

Apache HBase Data Model Explanation

Apache HBase is a column-oriented, scalable database built on top of Hadoop HDFS. HBase is an open-source implementation of Google’s BigTable. In this article, we will look at the Apache HBase data model. Apache HBase Data Model: The Apache HBase data model is designed to accommodate structured or semi-structured data that can vary in field size, data type and columns. HBase stores data in tables, which have rows and columns. The table schema is very different from traditional relational database tables. You can consider an HBase table as a multi-dimensional…
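
The multi-dimensional view of a table can be sketched in plain Python as a nested map keyed by row key, column family, column qualifier and timestamp. This is only an in-memory illustration under that assumption, not the HBase API; the helper names `make_table`, `put` and `get_latest` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory model, not the HBase client API:
# row key -> column family -> column qualifier -> {timestamp: value}
def make_table():
    return defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

def put(table, row_key, family, qualifier, value, timestamp):
    """Store one cell version; rows do not need to share the same columns."""
    table[row_key][family][qualifier][timestamp] = value

def get_latest(table, row_key, family, qualifier):
    """Return the value with the highest timestamp, or None if the cell is absent."""
    versions = table.get(row_key, {}).get(family, {}).get(qualifier, {})
    if not versions:
        return None
    return versions[max(versions)]

table = make_table()
put(table, "row1", "cf", "name", "alice", timestamp=1)
put(table, "row1", "cf", "name", "alicia", timestamp=2)  # newer version of the same cell
put(table, "row2", "cf", "city", "paris", timestamp=1)   # row2 has a different column
```

Because each row carries its own set of columns, a lookup for a column that a row never stored simply finds nothing, which is how the sketch mirrors data that varies in columns from row to row.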

HBase Table Schema Design and Concept

An HBase table can scale to billions of rows and a large number of columns, based on your requirements. Such a table allows you to store terabytes of data. HBase tables support high read and write throughput at low latency. A single value in each row is indexed; this value is known as the row key. In this article, we will look at HBase table schema design concepts. HBase Table Schema Design General Concepts: HBase schema design is very different from relational database schema design. Below…
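
The role of the row key as the single indexed value can be sketched as follows. This is a minimal in-memory illustration, assuming rows are kept sorted by row key; the class `RowKeyIndex`, its methods and the sample keys are hypothetical and are not part of any HBase API.

```python
import bisect

class RowKeyIndex:
    """Hypothetical sketch: rows are reachable only through their row key."""

    def __init__(self):
        self._keys = []   # row keys kept in sorted order
        self._rows = {}   # row key -> {column: value}

    def put(self, row_key, columns):
        # Insert the key in sorted position the first time it is seen.
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def get(self, row_key):
        # Direct lookup by row key; returns None for an unknown key.
        return self._rows.get(row_key)

    def scan_prefix(self, prefix):
        # Keys sharing a prefix are contiguous in sorted order,
        # so start at the first candidate and stop at the first mismatch.
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        return matches

index = RowKeyIndex()
index.put("user2#b", {"cf:name": "bob"})
index.put("user1#a", {"cf:name": "ann"})
index.put("user1#b", {"cf:name": "abe"})
index.put("order9", {"cf:total": "10"})
```

In this sketch, finding rows by anything other than the row key would require visiting every row, which is why the choice of row key tends to dominate schema design decisions.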

Sqoop import Relational Database Table into HBase Table

Apache Sqoop can be used to import relational database tables into HBase tables. You have to follow a certain process to import relational database or data warehouse tables into an HBase schema. In this article, we will look at using Sqoop to import a relational database table into an HBase table, with some working examples. Sqoop import Relational Database Table into HBase Table: You cannot directly import entire data warehouse or relational database tables into HBase. HBase is column-oriented, and the schema design for HBase tables is very different compared to Hive…

Apache HBase Bulk Load CSV and Examples

Apache HBase starts where Hadoop HDFS stops, i.e. HBase provides random, real-time read/write access to big data. If you have flat files such as CSV and TSV, you can use the Apache HBase bulk load features to get the data into HBase tables. In this post, I will show how to import data into HBase from CSV and TSV files. We will not dig into any transformation; we will only look at importing data into an already existing HBase table. HBase Importtsv utility: Importtsv is a utility that will load…

Commonly used HBase Data Manipulation Shell Commands

HBase is usually installed on top of Hadoop and uses the Hadoop file system for its storage. Just like other big data frameworks such as Hive and Pig, HBase provides a JRuby-based interactive shell. You can also use the Java API to interact with HBase. In this article, we will look at some commonly used HBase data manipulation shell commands and explain how to use them. Commonly used HBase Data Manipulation Shell Commands: HBase provides many data manipulation shell commands that you can use on the interactive shell to…

Read HBase Tables using scan shell command and examples

The basic HBase operations are referred to as CRUD operations, i.e. create, read, update and delete. The HBase scan command is used to get data out of HBase tables. In this article, we will look at how to read HBase tables using the scan shell command, with various examples. HBase scan command: The HBase scan command is another HBase shell command that you can use to read a table. The scan command is similar to the HBase get shell command but supports more options. The HBase scan command scans the entire table and…

Commonly used HBase Table Management Shell Commands

HBase is installed on top of Hadoop and uses the Hadoop file system for storage. Just like other big data frameworks such as Hive and Pig, HBase provides a JRuby-based interactive shell. You can also use the Java API to interact with HBase. In this article, we will look at some commonly used HBase table management shell commands and explain how to use them. Commonly used HBase Table Management Shell Commands: In HBase, the JRuby interactive shell is used to interact with HBase for table operations, table management, and data modeling.…

Hbase Namespace Commands and Examples

You can compare a namespace to a schema in an RDBMS. You can create a namespace in HBase and then create multiple tables in that namespace. In this article, we will look at HBase namespace commands with examples. HBase Namespace Commands: The commonly used namespace commands are create_namespace, alter_namespace, describe_namespace, drop_namespace, list_namespace and list_namespace_tables. HBase create_namespace command: This command is used to create a namespace in HBase. Below are the examples to create namespace: hbase(main):019:0> create 'test:test1','cf' 0 row(s) in 2.3760 seconds => Hbase::Table - test:test1 Create Table inside…

Commonly used General Hbase Shell Commands

HBase is installed on top of Hadoop and uses the Hadoop file system for storage. Just like other big data frameworks such as Hive and Pig, HBase provides an interactive shell. You can also use the Java API to interact with HBase. In this article, we will look at some commonly used general HBase shell commands. Commonly used General Hbase Shell Commands: In HBase, the interactive shell is used to interact with HBase for table operations, table management, and data modeling. We will go through the general HBase shell commands. There…
