Netezza TwinFin Architecture

Netezza is a data warehouse and big data analytics appliance. It uses Asymmetric Massively Parallel Processing (AMPP) architecture, which combines an SMP front end with a shared MPP back end for query processing. Netezza is a result of  database integration , processing engine and storage in a system. Netezza architecture resembles Hadoop cluster design in may ways. e.g. Distribution, active-passive node, data storing methods, replications etc Check out: Key Features of Netezza Appliances Netezza Striper Architecture: N2001 Technical Specification Netezza Mako Architecture: N3001 Technical Specification Commonly used Netezza Utilities Related book: Netezza underground - This…

Continue ReadingNetezza TwinFin Architecture
1 Comment

Import data using Apache Sqoop

Sqoop allows easy import of data from structured data stores such as relational databases, enterprise data warehouses, and NoSQL systems. Using Sqoop, you can provision the data from external system on to HDFS, and populate tables in Hive and HBase. Sqoop can integrates with Oozie, allowing you to schedule and automate tasks. Read: Sqoop import Relational Database Table into HBase Table Export data using Sqoop Sqoop Command with Secure Password Import Netezza Tables using Apache Sqoop Sqoop uses a connector based architecture which supports plugins that provide connectivity to new…

Continue ReadingImport data using Apache Sqoop
1 Comment

Export data using Apache Sqoop

In some cases data processed by Hadoop pipelines may be needed in production systems to help run additional critical business functions. The sqoop can exports a set of files from HDFS to an RDBMS. The target table must already exist in the database. The input files are read and parsed into a set of records according to the user-specific delimiters. Delimiter is provided in the sqoop options file or CLI options.

Continue ReadingExport data using Apache Sqoop
Comments Off on Export data using Apache Sqoop

Sqoop Architecture – Mappers with No Reducers

Sqoop is a tool designed to transfer data between Hadoop and various relational databases. You can use Sqoop to import data from a relational database management system (RDBMS) such as Netezza, MySQL, Oracle or SQL Server into the Hadoop Distributed File System (HDFS), transform the data and perform complex calculations in Hadoop MapReduce, and then export the data back into an RDBMS. Sqoop is based on a connector architecture which supports plugins to provide connectivity to external systems (RDBMS) .

Continue ReadingSqoop Architecture – Mappers with No Reducers
Comments Off on Sqoop Architecture – Mappers with No Reducers