Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
503 views
in Technique[技术] by (71.8m points)

pyspark - How do I set the driver's python version in spark?

I'm using spark 1.4.0-rc2 so I can use python 3 with spark. If I add export PYSPARK_PYTHON=python3 to my .bashrc file, I can run spark interactively with python 3. However, if I want to run a standalone program in local mode, I get an error:

Exception: Python in worker has different version 3.4 than that in driver 2.7, PySpark cannot run with different minor versions

How can I specify the version of python for the driver? Setting export PYSPARK_DRIVER_PYTHON=python3 didn't work.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Setting PYSPARK_PYTHON=python3 and PYSPARK_DRIVER_PYTHON=python3 both to python3 works for me. I did this using export in my .bashrc. In the end, these are the variables I create:

export SPARK_HOME="$HOME/Downloads/spark-1.4.0-bin-hadoop2.4"
export IPYTHON=1
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=ipython3
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

I also followed this tutorial to make it work from within Ipython3 notebook: http://ramhiser.com/2015/02/01/configuring-ipython-notebook-support-for-pyspark/


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...