Automatically Delete HBase row – Time to Live (TTL) Settings

One useful HBase feature is that it can delete rows from a table automatically. This saves much of the time otherwise spent on maintaining rows, for example when you handle sensitive data. In this article, we will check how to automatically delete HBase rows using the time to live (TTL) setting. HBase Time to Live (TTL) Option - Automatically Delete HBase Row: You can set a TTL length, in seconds, on a ColumnFamily, and HBase will automatically delete (expire) rows once the expiration time is reached. This setting applies…
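
The expiry behaviour described in the excerpt above can be sketched with a small in-memory model. This is only an illustration in Python, not HBase code: the class `TtlColumnFamily` and its methods are hypothetical names, and the `now` parameter exists only so that the example does not depend on the wall clock.

```python
import time


class TtlColumnFamily:
    """Toy in-memory model of a column family with a TTL in seconds.

    Hypothetical class for illustration; it is not part of any HBase API.
    """

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._cells = {}  # row key -> (value, write time in seconds)

    def put(self, row, value, now=None):
        # Record the value together with the time it was written.
        written_at = time.time() if now is None else now
        self._cells[row] = (value, written_at)

    def get(self, row, now=None):
        # Return the value unless its age has reached the TTL.
        current = time.time() if now is None else now
        entry = self._cells.get(row)
        if entry is None:
            return None
        value, written_at = entry
        if current - written_at >= self.ttl_seconds:
            del self._cells[row]  # expired: drop it from the toy store
            return None
        return value
```

In the HBase shell, the TTL itself is normally given as a column family attribute, for example `create 'demo', {NAME => 'cf', TTL => 20}` (the table and family names here are made up).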

HBase Auto Sharding Concept and Explanation

HBase is the Hadoop storage manager on top of HDFS that provides low-latency random reads and writes, and it can handle petabytes of data. One of its interesting capabilities is auto sharding, which simply means that tables are dynamically distributed by the system to different region servers when they become too large. In other words, splitting and serving regions can be thought of as auto sharding, as offered by other systems. Regions and Region Servers: In HBase, scalability and load balancing are…
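
As a rough illustration of the idea in the excerpt above, the following Python sketch splits a region in two once it grows past a threshold. The names `Region`, `Table`, and `max_rows_per_region` are hypothetical, and the row-count threshold is a simplifying assumption made for this example (a real cluster decides on region size, not on the number of rows).

```python
import bisect


class Region:
    """A contiguous, sorted range of row keys beginning at start_key."""

    def __init__(self, start_key, rows=None):
        self.start_key = start_key
        self.rows = rows if rows is not None else []  # kept sorted


class Table:
    """Toy table that splits a region when it holds too many rows."""

    def __init__(self, max_rows_per_region):
        self.max_rows = max_rows_per_region
        self.regions = [Region("")]  # one region covers the whole key space

    def region_index_for(self, key):
        # The owning region is the last one whose start key is <= key.
        starts = [region.start_key for region in self.regions]
        return bisect.bisect_right(starts, key) - 1

    def put(self, key):
        idx = self.region_index_for(key)
        region = self.regions[idx]
        if key not in region.rows:
            bisect.insort(region.rows, key)
        if len(region.rows) > self.max_rows:
            # Split at the middle key: the upper half becomes a new region.
            mid = len(region.rows) // 2
            upper = Region(region.rows[mid], region.rows[mid:])
            region.rows = region.rows[:mid]
            self.regions.insert(idx + 1, upper)
```

After a split, each half could be served by a different region server, which is what makes the scheme behave like sharding.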

Methods to Measure Data Dispersion

For data processing to be successful, it is essential to have an overall picture of the data. Descriptive data summarization techniques can be used to identify the typical properties of your data and highlight which data values should be treated as noise or outliers. It is therefore important to learn about the characteristics of the data and how to measure them. In this article, we will check methods to measure data dispersion. Methods to Measure Data Dispersion: Let’s look at how to measure the dispersion, or spread, of numeric data. Below are five…
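
The excerpt above is cut off before its list of measures, so the selection below is an assumption: it computes a few widely used dispersion measures (range, quartiles, interquartile range, variance, standard deviation) with Python's standard `statistics` module. The function name `dispersion_summary` is made up for this sketch.

```python
import statistics


def dispersion_summary(values):
    """Return common dispersion measures for a list of numbers."""
    data = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4)
    return {
        "range": data[-1] - data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,  # interquartile range
        "variance": statistics.pvariance(data),  # population variance
        "std_dev": statistics.pstdev(data),  # population standard deviation
    }
```

Values lying far outside the interquartile range are the usual candidates for the "noise or outliers" mentioned above.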

How Column Oriented Database Stores Data? – Details

Column-oriented databases save their data grouped by columns. Subsequent column values are stored contiguously on disk. Columnar storage for database tables is an important factor in optimizing analytic query performance. In this article, we will check how a column-oriented database stores data. We will also check the difference between a row-oriented database and a columnar database, as well as columnar database vs document database. What is Column Oriented Database? Column-oriented databases save their data grouped by columns. This differs from the usual row-oriented approach of traditional databases, which…
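
To make the difference concrete, here is a minimal Python sketch of the two layouts. The function names and the sample records are invented for illustration; a real storage engine works with encoded bytes on disk rather than Python lists.

```python
def to_row_layout(records, columns):
    """Row-oriented: all values of one record sit next to each other."""
    return [record[col] for record in records for col in columns]


def to_column_layout(records, columns):
    """Column-oriented: all values of one column sit next to each other."""
    return {col: [record[col] for record in records] for col in columns}


# Hypothetical sample data.
records = [
    {"id": 1, "city": "Pune", "amount": 10},
    {"id": 2, "city": "Oslo", "amount": 20},
]
columns = ["id", "city", "amount"]

row_store = to_row_layout(records, columns)
column_store = to_column_layout(records, columns)

# An aggregate over one column only needs that column's contiguous values.
total_amount = sum(column_store["amount"])
```

An analytic query such as the sum above touches a single list in the column layout, whereas the row layout would have to skip over the `id` and `city` values of every record.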

Apache HBase Column Versions and Explanations

A cell in HBase is identified by a combination of row, column family, and version; it contains a value and a timestamp, which represents the version. In this article, we will check Apache HBase column versions and explanations with some examples. Apache HBase Column Versions: As mentioned at the beginning of this post, a {row, column, version} tuple exactly specifies a cell in HBase. In Apache HBase you can have many cells where the row and column are the same and only the version values differ. A version is a timestamp value…
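
The {row, column, version} addressing described above can be modelled with a small dictionary-based store. This Python sketch is illustrative only: `VersionedStore` and the `max_versions` limit of 3 are hypothetical choices for the example, not HBase defaults.

```python
class VersionedStore:
    """Toy store in which each (row, column) keeps several timestamped values."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self._cells = {}  # (row, column) -> {timestamp: value}

    def put(self, row, column, value, timestamp):
        versions = self._cells.setdefault((row, column), {})
        versions[timestamp] = value  # same timestamp overwrites the value
        # Keep only the newest max_versions timestamps.
        for old_ts in sorted(versions, reverse=True)[self.max_versions:]:
            del versions[old_ts]

    def get(self, row, column, timestamp=None):
        versions = self._cells.get((row, column), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)  # default to the latest version
        return versions.get(timestamp)
```

Reading without a timestamp returns the newest value, while passing an explicit timestamp addresses one specific version of the same row and column.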

Splitting HBase Tables, Examples and Best Practices

Apache HBase distributes its load through region splitting. HBase stores rows in tables, and each table is split into ‘regions’. Those regions are distributed across the cluster, hosted and made available to client processes by the RegionServer process. All rows in a table are sorted, and each region holds the rows between its start key and end key. Every row belongs to exactly one region, and a region is served by a single region server at any given point in time. In this article, we will check Splitting HBase Tables, Examples and…
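
Because regions are defined by sorted start and end keys, finding the region that owns a row is a binary search over the split points. The Python sketch below assumes hypothetical split points and function names; it treats the start key as inclusive and the end key as exclusive, with an empty string standing for an open-ended boundary.

```python
import bisect


def region_boundaries(split_points):
    """Turn sorted split points into (start_key, end_key) pairs.

    An empty string marks the open start of the first region and the
    open end of the last one.
    """
    starts = [""] + list(split_points)
    ends = list(split_points) + [""]
    return list(zip(starts, ends))


def locate_region(split_points, row_key):
    """Return the index of the single region that owns row_key."""
    # Keys equal to a split point belong to the region starting there.
    return bisect.bisect_right(split_points, row_key)


# Hypothetical split points for a table pre-split into four regions.
splits = ["g", "n", "t"]
regions = region_boundaries(splits)
```

Every key maps to exactly one index, which mirrors the rule that a row belongs to exactly one region.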

HBase Exit Code – Capture Last Executed Command Status

You can use the HBase exit code to check the success or failure of the last executed command in a script. These exit codes help you decide whether to continue the script or abort it when a command fails. In this article, we discuss the HBase exit code and how to capture the status of the last executed command. HBase Exit Code: You can use $? to return the status of the last executed command in the HBase shell. Just as with relational databases such as Netezza, HBase exit code…
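
The same continue-or-abort decision can be sketched from Python, where the exit status that a shell exposes as `$?` is available as `returncode`. The helper names below are hypothetical, and the commands used in the test are trivial stand-ins run through the current Python interpreter so that the example works without an HBase installation; in practice you would substitute your own command line.

```python
import subprocess
import sys


def run_step(command):
    """Run one command and return its exit code (0 means success)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode


def run_pipeline(steps):
    """Run steps in order and stop at the first failure.

    Returns None if every step succeeded, otherwise a tuple of
    (index of the failing step, its exit code).
    """
    for index, command in enumerate(steps):
        code = run_step(command)
        if code != 0:
            return (index, code)  # abort: later steps are not executed
    return None
```

A calling script can then exit with the failing code itself, so that a scheduler or parent script sees the failure as well.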

Working with HBase Table Variables – Assign Table Name to jruby Variable

Apache HBase 0.95 allows you to assign a table name to a jruby variable. This feature saves a lot of time when working on table operations such as inserting, reading, and deleting data. In this article, we will discuss working with HBase table variables, that is, assigning a table name to a jruby variable, with some examples. Working with HBase Table Variables – Assign Table Name to jruby Variable: In earlier versions, HBase shell commands take the table name as an argument. HBase 0.95 adds the facility to…

How to Rename HBase Table? – Examples

Earlier, we had a simple script, ‘rename_table.rb’, that would rename the HBase HDFS table directory and then edit the hbase:meta table, replacing all details of the old table name with the new one. The script was deprecated and removed because it was unmaintained. In this article, we will check how to rename an HBase table using a snapshot, with some examples. How to Rename HBase Table? You can use the HBase snapshot facility to rename tables. Here is how you would do it using the HBase shell: Related reading: Steps to Migrate HBase…
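
The excerpt above is cut off before its own commands, so the sequence below is not taken from the article. It is a sketch of the commonly documented snapshot-based rename (disable, snapshot, clone, delete the snapshot, drop the old table), generated as HBase shell command strings by a small Python helper. The helper name and all table and snapshot names are hypothetical.

```python
def rename_via_snapshot_commands(old_name, new_name, snapshot_name):
    """Build the HBase shell commands for a snapshot-based table rename."""
    return [
        f"disable '{old_name}'",  # stop serving the old table
        f"snapshot '{old_name}', '{snapshot_name}'",  # capture its state
        f"clone_snapshot '{snapshot_name}', '{new_name}'",  # new table name
        f"delete_snapshot '{snapshot_name}'",  # snapshot no longer needed
        f"drop '{old_name}'",  # remove the old table
    ]


# Hypothetical names, printed so they can be reviewed before being run.
for command in rename_via_snapshot_commands("orders", "orders_v2", "orders_snap"):
    print(command)
```

Generating the commands as text keeps the destructive `drop` step visible for review before anything is pasted into the shell.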

Steps to Migrate HBase Tables from Default to another Namespace

In HBase, you can create different namespaces as per your requirements. You can think of a namespace as a schema in a relational database. When you create HBase tables without specifying a namespace, the tables are created in the “default” namespace. In this article, we will check the steps to migrate HBase tables from the default namespace to another namespace, with some examples. HBase Snapshots: You can insert data (a single value at a time) into tables in another namespace using the HBase put command. But you can’t use the HBase put command to copy an entire table…
