I am new to Spark and I am trying to install PySpark by following the site below.
http://ramhiser.com/2015/02/01/configuring-ipython-notebook-support-for-pyspark/
I tried both the prebuilt package and building the Spark package through SBT.
When I try to run Python code in the IPython Notebook, I get the error below.
NameError Traceback (most recent call last)
<ipython-input-1-f7aa330f6984> in <module>()
1 # Check that Spark is working
----> 2 largeRange = sc.parallelize(xrange(100000))
3 reduceTest = largeRange.reduce(lambda a, b: a + b)
4 filterReduceTest = largeRange.filter(lambda x: x % 7 == 0).sum()
5
NameError: name 'sc' is not defined
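For reference, the sc variable is normally created by the PySpark startup script, so when it is missing the notebook has no SparkContext at all. The snippet below is only a rough sketch of how the context can be created by hand in a notebook cell, assuming PySpark is already on the Python path and using a local master; the app name "IPythonTest" is just a placeholder.

from pyspark import SparkConf, SparkContext

# Configure a context against a local master; the app name is arbitrary
conf = SparkConf().setMaster("local[*]").setAppName("IPythonTest")
sc = SparkContext(conf=conf)

# The failing notebook cell should then run
largeRange = sc.parallelize(xrange(100000))
print(largeRange.reduce(lambda a, b: a + b))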
In the command window I can see the error below.
Failed to find Spark assembly JAR.
You need to build Spark before running this program.
Note that I got a Scala prompt when I executed the spark-shell command.
Update:
With the help of a friend, I was able to fix the issue related to the Spark assembly JAR by correcting the contents of the .ipython/profile_pyspark/startup/00-pyspark-setup.py file.
Now only the problem with the SparkContext variable remains. I am changing the title to reflect my current issue appropriately.
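For anyone hitting the same problem, the sketch below shows roughly what the corrected startup file contains, following the linked post. It assumes SPARK_HOME is set in the environment, and the py4j zip file name is an assumption that depends on the Spark release shipped.

import os
import sys

# Locate the Spark installation; SPARK_HOME must be set in the environment
spark_home = os.environ.get('SPARK_HOME')
if not spark_home:
    raise ValueError('SPARK_HOME environment variable is not set')

# Put the PySpark libraries on the Python path
sys.path.insert(0, os.path.join(spark_home, 'python'))
# The py4j version below is an example; it must match the file shipped with Spark
sys.path.insert(0, os.path.join(spark_home, 'python', 'lib', 'py4j-0.8.2.1-src.zip'))

# Run the PySpark shell bootstrap, which creates the sc variable
execfile(os.path.join(spark_home, 'python', 'pyspark', 'shell.py'))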