Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

apache spark - Pyspark append executor environment variable

Is it possible to append a value to the PYTHONPATH of a worker in Spark?

I know I could go to each worker node and configure its spark-env.sh file, but I want a more flexible approach.

I am trying to use the setExecutorEnv method, but with no success:

from pyspark import SparkConf

# Wrap the chain in parentheses so it can span several lines
conf = (SparkConf().setMaster("spark://192.168.10.11:7077")
        .setAppName('myname')
        .set("spark.cassandra.connection.host", "192.168.10.11")
        .setExecutorEnv('PYTHONPATH', '$PYTHONPATH:/custom_dir_that_I_want_to_append/'))

This creates a pythonpath environment variable on each executor, forces the name to lower case, and does not expand $PYTHONPATH, so the value is not appended.

I end up with two different environment variables:

pythonpath  :  $PYTHONPATH:/custom_dir_that_I_want_to_append
PYTHONPATH  :  /old/path/to_python

The first one is created dynamically; the second one already existed.

Does anyone know how to do it?

1 Answer

I figured it out myself.

The problem is not with Spark, but with ConfigParser.

Based on this answer, I fixed the ConfigParser to always preserve case.
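The linked answer is not reproduced here, so the following is only a sketch of one common way to make ConfigParser preserve case: overriding optionxform with str. It uses the Python 3 module name configparser; the section name executor_env and the helper read_options are hypothetical and exist only for illustration.

```python
import configparser

# Hypothetical config fragment; the section name is an assumption for this sketch.
SAMPLE = """
[executor_env]
PYTHONPATH = /custom_dir_that_I_want_to_append/
"""


def read_options(preserve_case):
    """Parse SAMPLE and return the options of the executor_env section."""
    parser = configparser.ConfigParser()
    if preserve_case:
        # By default optionxform lower-cases option names; str leaves them untouched.
        parser.optionxform = str
    parser.read_string(SAMPLE)
    return dict(parser["executor_env"])


# Default behavior: the key comes back as 'pythonpath'.
lowered = read_options(preserve_case=False)
# With optionxform = str: the key stays 'PYTHONPATH'.
preserved = read_options(preserve_case=True)
```

With the default setting the option name is lower-cased, which would explain a pythonpath variable showing up next to the existing PYTHONPATH.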

After this, I found out that the default Spark behavior is to append the value to an existing worker environment variable if one with the same name already exists.

So it is not necessary to reference $PYTHONPATH with a dollar sign:

.setExecutorEnv('PYTHONPATH', '/custom_dir_that_I_want_to_append/')
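To confirm what the executors actually end up with, it may help to read PYTHONPATH from inside a task. The helper below is a minimal sketch, not part of Spark: pythonpath_entries is a hypothetical name, and the commented usage assumes an already created SparkContext called sc.

```python
import os


def pythonpath_entries(environ=None):
    """Return the non-empty PYTHONPATH entries from the given environment mapping."""
    env = os.environ if environ is None else environ
    raw = env.get("PYTHONPATH", "")
    # os.pathsep is ':' on Linux and macOS, ';' on Windows
    return [entry for entry in raw.split(os.pathsep) if entry]


# Assumed usage on a running cluster (sc is an existing SparkContext):
#   seen = sc.parallelize(range(4), 4).map(lambda _: pythonpath_entries()).collect()
#   for entries in seen:
#       print(entries)
```

If the append behavior described above holds, each executor's list should contain both the old path and the custom directory.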
