Setting and using bash environment variables from a Python script file is somewhat difficult, even though the same step is easy and straightforward in a shell script. In this post, we will look at one method to set and use an environment variable inside a Python script file.
Note that the steps in this post help only if you set and use the variable inside the same process, i.e. in the same Python script. A Python script cannot modify the environment of the shell that launched it, so a variable set this way cannot be used from an unrelated process. Later, at the end of this post, I will explain some simple methods to achieve that.
Set and Use Environment Variable inside Python Script
I was working on a Python script that connects to Hadoop HiveServer2. Our requirement was to use the Hive JDBC driver, which requires the CLASSPATH bash environment variable to be set to the path of the Hadoop native jar files. Below are the steps that I followed to set and use the CLASSPATH environment variable:
Before attempting to set the variable, you should expand the path if it involves any shell variables.
Expand Shell Variables in Python
You can use the subprocess module with check_output to let bash expand the shell variables and capture the result in a Python variable.
Below is an example:
import subprocess
output = subprocess.check_output(['bash', '-c', 'echo $CLASSPATH:$(hadoop classpath):/usr/hdp/current/hadoop-client/*'], universal_newlines=True).strip()
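The call above can be wrapped in a small helper so that the expansion is reusable. This is only a minimal sketch, not a definitive implementation: the function name `expand_with_bash` and the `DEMO_DIR` variable are hypothetical, and it assumes `bash` is available on the PATH.

```python
import os
import subprocess


def expand_with_bash(expression):
    """Let bash expand variables and $(...) in `expression` and return the text."""
    # universal_newlines=True makes check_output return str rather than bytes.
    raw = subprocess.check_output(
        ['bash', '-c', 'echo ' + expression],
        universal_newlines=True,
    )
    # echo appends a newline; strip it so it does not end up in the variable.
    return raw.strip()


# Hypothetical variable used only for this demonstration.
os.environ['DEMO_DIR'] = '/opt/demo'
expanded = expand_with_bash('$DEMO_DIR/lib:$(echo /opt/extra)')
print(expanded)
```

Because the expression is interpreted by the shell, a helper like this should only be given strings you trust.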
Set and Use Environment Variable
To set the CLASSPATH environment variable inside a Python script, you can use ‘os.environ’.
Below is an example:
import os
os.environ['CLASSPATH'] = output
To confirm that the environment variable assignment was successful, you can print CLASSPATH and check the output.
print(os.environ['CLASSPATH'])
As mentioned at the beginning of this post, the variable above is visible only within the current process and any child processes it starts. Once the script finishes, the environment variable is lost.
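The scope of a variable set through ‘os.environ’ can be checked with a short experiment. The sketch below uses a hypothetical variable name, `DEMO_CLASSPATH`, and starts a second Python interpreter as a child process to show that the child inherits the value.

```python
import os
import subprocess
import sys

# Hypothetical variable name and value, used only for this demonstration.
os.environ['DEMO_CLASSPATH'] = '/opt/demo/lib/*'

# A child process started from this script inherits the modified environment.
child_code = "import os; print(os.environ.get('DEMO_CLASSPATH', 'not set'))"
child_output = subprocess.check_output(
    [sys.executable, '-c', child_code],
    universal_newlines=True,
).strip()
print(child_output)
```

The shell that launched the script is not affected: after the script exits, `DEMO_CLASSPATH` there still has whatever value it had before.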
Export Environment Variable in Shell script and Execute Python inside Shell Script
Another method is to use a shell script to ‘export’ the environment variable and then call the Python script that consumes it. As with the previous approach, the environment variable is visible only to the shell script's process and the commands it launches, and it is lost once execution completes.
For example:
#!/bin/bash
export CLASSPATH=$CLASSPATH:$(hadoop classpath):/usr/hdp/current/hadoop-client/*
python hive_jdbc_conn.py
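On the Python side, the called script can read the exported value from ‘os.environ’. The following is a sketch of how a script such as hive_jdbc_conn.py might do that; the function `read_classpath` is a hypothetical helper and not part of the original script.

```python
import os


def read_classpath(env=None):
    """Return the CLASSPATH entries as a list, or raise if it is not set."""
    env = os.environ if env is None else env
    value = env.get('CLASSPATH', '')
    if not value:
        raise RuntimeError('CLASSPATH is not set; run the wrapper script first')
    # os.pathsep is ':' on Linux, matching the separator used in the export.
    # Empty entries (e.g. from a leading ':') are dropped.
    return [entry for entry in value.split(os.pathsep) if entry]


# Demonstration with a hypothetical environment instead of the real one.
demo_env = {'CLASSPATH': os.pathsep.join(['/opt/a.jar', '/opt/lib/*'])}
entries = read_classpath(demo_env)
print(entries)
```

Failing early with a clear message makes it obvious when the Python script was started directly instead of through the wrapper.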
Change .bashrc or .profile file
Another simple approach is to set the environment variable by changing the .bashrc or .profile file.
Variables added to .bashrc or .profile are available in every new shell session that reads the file, and in the processes started from that session. If your environment variable is used by multiple applications, then you can use this approach.
For example:
Open the .bashrc file in the vi editor and switch to insert mode.
vi ~/.bashrc
Then add the line below, save the file, and source it.
export CLASSPATH=$CLASSPATH:$(hadoop classpath):/usr/hdp/current/hadoop-client/*
source ~/.bashrc
Hope this helps. Let me know if you have any other approach to set and use an environment variable inside a Python script file.