How to Import Netezza Tables using Sqoop?

  • Post last modified:March 15, 2019
  • Post category:BigData
  • Reading time:7 mins read

With growing data volumes, organizations are moving part of their computation to the Hadoop ecosystem. Apache Sqoop is an open source tool for importing data from relational databases into Hadoop and vice versa, and it is one of the easiest ways to bring data from a relational database such as Netezza into the Hadoop ecosystem. The Sqoop command line lets you import all tables or a single table, execute a free-form query, and store the results in HDFS. In this article, we will check how to import Netezza tables using Sqoop with some practical examples.

Import Netezza Tables using Apache Sqoop

Sqoop uses a connector based architecture which supports plugins that provide connectivity to new external systems.


Netezza is a relational database appliance with limited storage capacity, so organizations usually migrate the required workloads to Hadoop.


Apache Sqoop supports a Netezza connector. Using Sqoop, you can import data from an external system such as Netezza onto HDFS, and populate tables in Hive and HBase.
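For example, to land a Netezza table directly on HDFS without creating a Hive table, an import might look like the following sketch. The host, database, table name, and target directory are placeholder values, not from any real cluster:

```shell
# Import a single Netezza table into an HDFS directory as delimited text files.
# host_ip, DBNAME, CUSTOMER, and the target path are placeholders.
# -P prompts for the password instead of exposing it on the command line.
sqoop import \
  --connect jdbc:netezza://host_ip/DBNAME \
  --username admin \
  -P \
  --table CUSTOMER \
  --target-dir /user/hive/staging/customer \
  --fields-terminated-by ',' \
  -m 1
```

The `--target-dir` option controls where the files land on HDFS; without it, Sqoop writes under the user's home directory using the table name.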

Import Netezza Tables into Hive using Sqoop

As mentioned earlier, you can use Sqoop to import Netezza tables into the Hive metastore.

Apache Sqoop does the following when you invoke a Hive import:

  • It populates the Hive metastore with the appropriate metadata (table definition) for the table.
  • It invokes the necessary commands to load the table or partition.
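Conceptually, the two steps above are roughly equivalent to running the DDL and load statements yourself after staging the files on HDFS. The sketch below illustrates this with a hypothetical `customer` table; the column list, HiveServer2 URL, and HDFS path are made-up examples (Sqoop actually derives the DDL from the Netezza table's metadata):

```shell
# Step 1: create the Hive table (metadata). Sqoop's default Hive field
# delimiter is Ctrl-A (\001). Columns here are illustrative only.
beeline -u jdbc:hive2://localhost:10000/default -e "
  CREATE TABLE vithal_test.customer (
    c_customer_sk INT,
    c_first_name  STRING,
    c_last_name   STRING
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001';"

# Step 2: move the files Sqoop staged on HDFS into the table's location.
beeline -u jdbc:hive2://localhost:10000/default -e "
  LOAD DATA INPATH '/user/impadmin/CUSTOMER' INTO TABLE vithal_test.customer;"
```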

Sqoop Command to Import Netezza Table

Below is the Sqoop import command that you can use to import a Netezza table into Hive or HDFS.

sqoop import --connect jdbc:netezza://host_ip/database --table tableName --username userName --password password --hive-import --hive-database DatabaseName --create-hive-table --hive-table tableName -m 1
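Note that passing `--password` on the command line exposes the password in the process list and shell history. As a sketch of a safer variant, the same import can read the password from a protected file via Sqoop's `--password-file` option (the file path and connection details below are placeholders):

```shell
# Same single-table Hive import, but reading the password from a file
# with restricted permissions instead of passing it on the command line.
# The path and connection details are placeholder values.
sqoop import \
  --connect jdbc:netezza://host_ip/database \
  --username userName \
  --password-file /user/userName/.netezza.password \
  --table tableName \
  --hive-import \
  --hive-database DatabaseName \
  --create-hive-table \
  --hive-table tableName \
  -m 1
```

Alternatively, `-P` prompts interactively for the password at runtime.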

Import Netezza Tables into Hive using Sqoop Example

Below is an example of importing the CUSTOMER table into a Hive database:

$ sqoop import --connect jdbc:netezza://192.168.100.123/RETAIL --table CUSTOMER --username admin --password password --hive-import --hive-database vithal_test --create-hive-table --hive-table customer -m 1
Warning: /usr/hdp/2.5.0.0-1245/accumulo does not exist! Accumulo imports will fail.
...
...
Time taken: 0.843 seconds
Note: /tmp/sqoop-impadmin/compile/363c30ac0559f7783dd473977a3d71ca/CUSTOMER.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

Logging initialized using configuration in jar:file:/usr/hdp/2.5.0.0-1245/hive/lib/hive-common-1.2.1000.2.5.0.0-1245.jar!/hive-log4j.properties
OK
Time taken: 0.423 seconds
Loading data to table vithal_test.customer
Table vithal_test.customer stats: [numFiles=1, numRows=0, totalSize=402, rawDataSize=0]
OK
Time taken: 0.912 seconds

You can verify that the table was created in the Hive database:

0: jdbc:hive2://192.168.100.203:10000/default> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| customer  |
| test1     |
+-----------+--+
2 rows selected (0.038 seconds)

Import all Tables into Apache Hive using Sqoop

You can use Sqoop's import-all-tables command to import all tables from a Netezza database into Hive.

Sqoop command to Import all Tables

Below is the Sqoop command that you can use to import all tables from a relational database:

sqoop import-all-tables --connect jdbc:netezza://host_ip/database  --username admin --password password --hive-import --hive-database vithal_test --create-hive-table -m 1

Because Sqoop generates and compiles a jar and runs a separate import job for each table, it may take some time to migrate all your tables.
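The run can be sped up with more parallel mappers. As a sketch, the variant below uses four mappers per table, falls back to a single mapper for tables that lack a primary key (parallel splits require one), and skips a table by name; the connection details and excluded table name are placeholders:

```shell
# Import all tables with 4 parallel mappers where possible.
# --autoreset-to-one-mapper falls back to one mapper for tables
# without a primary key; --exclude-tables skips the listed tables.
# Connection details and the excluded table name are placeholders.
sqoop import-all-tables \
  --connect jdbc:netezza://host_ip/database \
  --username admin \
  -P \
  --hive-import \
  --hive-database vithal_test \
  --create-hive-table \
  --exclude-tables test1 \
  --autoreset-to-one-mapper \
  -m 4
```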

Import all Netezza Tables into Hive using Sqoop Example

Below is an example of importing all tables from a Netezza database into Hive:

$ sqoop import-all-tables --connect jdbc:netezza://192.168.100.123/RETAIL --username admin --password password --hive-import --hive-database vithal_test --create-hive-table -m 1
Warning: /usr/hdp/2.5.0.0-1245/accumulo does not exist! Accumulo imports will fail.
...
...
Time taken: 0.843 seconds
Note: /tmp/sqoop-impadmin/compile/363c30ac0559f7783dd473977a3d71ca/WEB_SALES.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

Logging initialized using configuration in jar:file:/usr/hdp/2.5.0.0-1245/hive/lib/hive-common-1.2.1000.2.5.0.0-1245.jar!/hive-log4j.properties
OK
Time taken: 0.423 seconds
Loading data to table vithal_test.web_sales
Table vithal_test.web_sales stats: [numFiles=1, numRows=0, totalSize=402, rawDataSize=0]
OK
Time taken: 0.912 seconds

You can verify the imported tables in the Hive database:

0: jdbc:hive2://192.168.100.203:10000/default> show tables;
+-------------------------+--+
|        tab_name         |
+-------------------------+--+
| catalog_sales           |
| customer                |
| customer_address        |
| customer_demographics   |
| date_dim                |
| household_demographics  |
| income_band             |
| item                    |
| store                   |
| store_returns           |
| store_sales             |
| test1                   |
| warehouse               |
| web_sales               |
+-------------------------+--+
14 rows selected (0.038 seconds)

Hope this helps 🙂