Impala or Hive Slowly Changing Dimension – SCD Type 2 Implementation

Slowly changing dimensions in a data warehouse, commonly known as SCDs, capture data that changes slowly and unpredictably rather than on a regular basis. Slowly changing dimension type 2 is the most popular method used in dimensional modelling to preserve historical data. Since Cloudera Impala and Hadoop Hive do not support UPDATE statements, you have to implement the update using intermediate tables. In this article, we will check the Cloudera Impala or Hive Slowly Changing Dimension - SCD Type 2 implementation steps with an example. For demonstration purposes, let's take the example…
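
The worked example is truncated above; as a rough, hypothetical sketch of the intermediate-table approach the article describes, assume a dimension table customer_dim with eff_date, end_date and active_flag tracking columns and a staging table customer_stage holding the incoming changes (all names are made up for illustration):

    -- Hypothetical tables: customer_dim (the SCD2 dimension) and
    -- customer_stage (incoming changed/new rows). Rebuild via a temp table.
    CREATE TABLE customer_dim_tmp AS
    -- 1. Expire active rows for which a newer version has arrived.
    SELECT d.cust_id, d.cust_name, d.eff_date,
           s.load_date AS end_date, 'N' AS active_flag
    FROM   customer_dim d
    JOIN   customer_stage s ON d.cust_id = s.cust_id
    WHERE  d.active_flag = 'Y'
    UNION ALL
    -- 2. Carry over history rows and active rows with no incoming change.
    SELECT d.cust_id, d.cust_name, d.eff_date, d.end_date, d.active_flag
    FROM   customer_dim d
    LEFT JOIN customer_stage s ON d.cust_id = s.cust_id
    WHERE  s.cust_id IS NULL OR d.active_flag = 'N'
    UNION ALL
    -- 3. Insert the incoming rows as the new active versions.
    SELECT s.cust_id, s.cust_name, s.load_date AS eff_date,
           '9999-12-31' AS end_date, 'Y' AS active_flag
    FROM   customer_stage s;

    -- Swap the rebuilt data back into the dimension table.
    INSERT OVERWRITE TABLE customer_dim
    SELECT * FROM customer_dim_tmp;

The intermediate table is rebuilt on every load and written back with INSERT OVERWRITE, which stands in for the UPDATE statement that Impala and Hive lack.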

Apache HBase Data Model Explanation

Apache HBase is a column-oriented, scalable database built on top of Hadoop HDFS. HBase is an open-source implementation of Google’s BigTable. In this article, we will check the Apache HBase data model with an explanation. The Apache HBase data model is designed to accommodate structured or semi-structured data that can vary in field size, data type and number of columns. HBase stores data in tables, which have rows and columns. The table schema is very different from traditional relational database tables. You can consider an HBase table as a multi-dimensional…
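
To make the multi-dimensional map idea concrete, here is a small HBase shell sketch; the table name 'customer' and the column family names are made up for illustration, not taken from the article:

    # Create a table with two column families (names are illustrative only).
    create 'customer', 'personal', 'sales'

    # Each cell is addressed by (row key, column family:qualifier, timestamp).
    put 'customer', 'row1', 'personal:name', 'John'
    put 'customer', 'row1', 'personal:city', 'Chicago'
    put 'customer', 'row1', 'sales:amount', '100'

    # Read back every cell of the row; versions are kept per cell, newest first.
    get 'customer', 'row1'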

HBase Table Schema Design and Concept

An HBase table can scale to billions of rows and any number of columns based on your requirements, and it allows you to store terabytes of data. HBase tables support high read and write throughput at low latency. A single value in each row is indexed; this value is known as the row key. In this article, we will check HBase table schema design and its general concepts. HBase schema design is very different compared to relational database schema design. Below…
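
As a hedged illustration of the kind of decisions the article goes on to cover, the HBase shell lets you set per-column-family properties at creation time (the table and family names below are invented):

    # Keep few column families; tune per-family settings such as the number
    # of retained cell versions and compression.
    create 'web_metrics',
      {NAME => 'stats',   VERSIONS => 3},
      {NAME => 'content', VERSIONS => 1, COMPRESSION => 'SNAPPY'}

    # The row key is the only indexed value, so design it around your reads,
    # e.g. a composite key such as userid plus a reversed timestamp.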

How to avoid HBase Hotspotting?

HBase hotspotting occurs when a large amount of traffic from various clients is redirected to a single node, or very few nodes, in the cluster. HBase hotspotting is caused by bad row key design. In this article, we will see how to avoid HBase hotspotting, also called region server hotspotting. How does HBase hotspotting occur? Because of a poorly designed row key, HBase stores a large amount of data on a single node, and the entire traffic is redirected to that node when clients request data, leaving…
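
One common remedy, which may or may not be the one the truncated article settles on, is to salt the row key and pre-split the table so writes spread across regions; a rough HBase shell sketch with made-up names:

    # Pre-split the table on the salt prefix so each bucket gets its own region
    # (table name, family and split points are illustrative only).
    create 'events', 'cf', SPLITS => ['1', '2', '3', '4', '5', '6', '7', '8', '9']

    # A monotonically increasing key such as a raw timestamp would pile all
    # writes onto one region; prefixing it with a salt spreads the load.
    put 'events', '3_20240101120000', 'cf:val', 'click'
    put 'events', '7_20240101120001', 'cf:val', 'view'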

Netezza Export Table Data into Excel Format

You can export a Netezza table in many ways. You can export the data to CSV format using either a Netezza external table or the nzsql command with the -o option. Netezza does not support exporting data to Excel (xls/xlsx) format directly, so you have to perform a workaround to get the data into Excel. In this article, we will check how to export Netezza table data into Excel format. The export involves two steps: export the Netezza table data to CSV…
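
For the CSV step mentioned above, a hedged nzsql sketch (the database, table and output path are placeholders, and the flags assume nzsql's psql-style options):

    # Unload the table as comma-separated text that Excel can then open:
    # -A unaligned output, -t tuples only, -F field separator, -o output file.
    nzsql -d testdb -c "SELECT * FROM customer" -A -t -F ',' -o /tmp/customer.csv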

Sqoop import Relational Database Table into HBase Table

Apache Sqoop can be used to import relational database tables into HBase tables. You have to follow a certain process to import relational database or data warehouse tables into an HBase schema. In this article, we will check how to Sqoop import a relational database table into an HBase table, with some working examples. You cannot directly import an entire data warehouse or relational database into HBase. HBase is column oriented, and its schema design is very different for HBase tables compared to Hive…
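
A hedged sketch of such an import; the JDBC URL, table, row key and column family names are placeholders rather than the article's working example:

    # Import one table into an HBase table, keyed on the source primary key;
    # every imported column lands under a single column family.
    sqoop import \
      --connect jdbc:mysql://dbserver/sales \
      --username sqoop_user -P \
      --table customer \
      --hbase-table customer \
      --column-family cf \
      --hbase-row-key cust_id \
      --hbase-create-table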

Apache HBase Bulk Load CSV and Examples

Apache HBase starts where Hadoop HDFS stops, i.e. HBase provides random, real-time read/write access to big data. If you have flat files such as CSV or TSV, you can use the Apache HBase bulk load features to get the data into HBase tables. In this post, I will show you how to import data into HBase from CSV and TSV files. We will not dig into any transformation; we will check importing data into an already existing HBase table. ImportTsv is a utility that will load…
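
For reference, a hedged ImportTsv invocation; the table name, column mapping and HDFS path are placeholders:

    # Load a comma-separated file into an existing table; the first field becomes
    # the row key and the rest map to the listed column family:qualifier pairs.
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.separator=',' \
      -Dimporttsv.columns='HBASE_ROW_KEY,cf:name,cf:city' \
      customer /user/hadoop/customer.csv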

Check Netezza System Configurations using System Commands

Verifying the Netezza system configuration is a basic step in identifying any issues with the system; once you have identified a problem, you can troubleshoot it. In this article, we will see how to check Netezza system configurations using system commands, with some examples. All the Netezza system configuration details are available in the system.cfg file. This configuration file contains the settings that Netezza uses for system startup, system management, host processes, and SPUs. It is advised to get in touch with the IBM support team before…
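
As a hedged illustration of the kind of commands the article walks through (run on the Netezza host as the nz user; treat the exact command set as an assumption, not the article's list):

    # Show the current state of the appliance.
    nzstate

    # Dump the current registry/configuration settings derived from system.cfg.
    nzsystem showRegistry

    # Show hardware components and data slice status.
    nzhw show
    nzds show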

Commonly used HBase Data Manipulation Shell Commands

HBase is usually installed on top of Hadoop and uses the Hadoop file system for its storage. Just like other big data frameworks such as Hive and Pig, HBase provides an interactive shell (based on JRuby). You can also use the various Java APIs to interact with HBase. In this article, we will check some commonly used HBase data manipulation shell commands and explain how to use them. HBase provides many data manipulation shell commands that you can use in the interactive shell to…
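
A few of the usual data manipulation shell commands, shown against a made-up 'customer' table and column family for illustration:

    put 'customer', 'row1', 'cf:name', 'John'      # insert or update a cell
    get 'customer', 'row1'                         # read one row
    get 'customer', 'row1', {COLUMN => 'cf:name'}  # read one cell
    delete 'customer', 'row1', 'cf:name'           # delete one cell
    deleteall 'customer', 'row1'                   # delete an entire row
    count 'customer'                               # count rows in the table
    truncate 'customer'                            # disable, drop and recreate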

Read HBase Tables using scan shell command and examples

The set of basic HBase operations is referred to as CRUD: create, read, update and delete. The HBase scan command is used to get data out of HBase tables. In this article, we will check how to read HBase tables using the scan shell command, with various examples. The scan command is yet another HBase shell command that you can use to read a table. It is similar to the HBase get shell command but supports more options. The scan command scans the entire table and…
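
Some typical scan variations, again using a made-up 'customer' table and column family:

    scan 'customer'                                           # full table scan
    scan 'customer', {COLUMNS => ['cf:name'], LIMIT => 10}    # few columns, few rows
    scan 'customer', {STARTROW => 'row1', STOPROW => 'row5'}  # bounded key range
    scan 'customer', {VERSIONS => 3}                          # include older cell versions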
