The Greenplum database is one of the widely used MPP machine. There are many methods that you can use to connect to Greenplum, for example, you can use JDBC, ODBC, etc. The Greenplum analytics database support connection from any programming language that supports JDBC driver. In this article, we will check how to connect Greenplum using JDBC driver from Python programming language. Connection string will remain same if you are using any programming language other than Python.
Greenplum JDBC Driver
The Greenplum database provides support to JDBC driver. You can use this JDBC driver with Python or any other programming language to connect to Greenplum database.
You can download Greenplum IDBC connectors from official website.
Before going into steps on using JDBC driver in Python program, let us see how to install and use jaydebeapi, a third-party python module.
Though you can use any module that supports JDBC drivers, jaydebeapi is one of the easy module that I have been using.
Greenplum JDBC Connection String
Connection string differ from database to database. The Greenplum jdbc connection string is slightly different from databases like Netezza, Vertica, etc.
Below is the JDBC connection string that you can use to connect Greenplum database:
jdbc:pivotal:greenplum://host_IP:port; DatabaseName=db_name;
for example;
jdbc:pivotal:greenplum://192.168.200.185:5432; DatabaseName=template1;
Install Jaydebeapi
The JayDeBeApi module allows you to connect from Python code to various databases using Java JDBC drivers and connection string. It provides a Python DB-API v2.0 to that database.
You can install it using pip:
# Python2
pip install Jaydebeapi
Set CLASSPATH to Driver Location
If there are dependent jar files for any JDBC jar file, you have to provide path of those jar files in CLASSPATH shell or environment variable.
The Greenplum JDBC driver does not have any dependent jars, you can directly refer this driver in your jaydebeapi module. Alternatively, you can export jar location to CLASSPATH shell variable and run your python program without needing to set jar location in your module.
You can read on how to set CLASSPATH variable in my other post – Set and Use Environment Variable inside Python Script.
How to Connect Greenplum using JDBC Driver from Python?
Now we have Greenplum jdbc driver in place and classpath variable is set. In this section, we will check how to connect Greenplum using JDBC driver from Python.
Note that, Greenplum jdbc driver class name is “com.pivotal.jdbc.GreenplumDriver“
Here is the code that can help you:
import jaydebeapi as jay
import os
# Greenplum Server Details
dsn_database = "template1"
dsn_hostname = "192.168.200.185"
dsn_port = "5432"
dsn_uid = "gpadmin"
dsn_pwd = "gpadmin"
# Greenplum JDBC class name
jdbc_driver_name = "com.pivotal.jdbc.GreenplumDriver"
# Greenplum JDBC driver path
jdbc_driver_loc = os.path.join('D:\\Work\\ Connections_Softwares\\Jar\\ greenplumdriver-5.1.1.jar')
# Sample SQL Query
sql_str = "select now()"
# jdbc:pivotal:greenplum://192.168.200.185:5432; DatabaseName=template1;
# JDBC connection string
connection_string='jdbc:pivotal:greenplum://'+ dsn_hostname+': '+dsn_port+';DatabaseName='+dsn_database
url = '{0}:user={1}; password={2}'.format(connection_string, dsn_uid, dsn_pwd)
print("Connection String: " + connection_string)
# Establish JDBC connection
conn = jay.connect(jdbc_driver_name, connection_string, {'user': dsn_uid, 'password': dsn_pwd},
jars=jdbc_driver_loc)
curs = conn.cursor()
curs.execute(sql_str)
result = curs.fetchall()
print(result[0])
Here is the sample output:
Connection String: jdbc:pivotal:greenplum://192.168.200.185:5432; DatabaseName=template1
('2019-04-26 10:11:19.135110',)
Hope this helps 🙂