dataframe - Running Spark on cluster: Initial job has not accepted any resources

Question

asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a remote Ubuntu server on linode.com with 4 cores and 8G RAM.
On that server I run a Spark 2 cluster consisting of 1 master and 1 slave.

I started a PySpark shell locally on my MacBook and connected it to the master node on the remote server with:

$ PYSPARK_PYTHON=python3 /vagrant/spark-2.0.0-bin-hadoop2.7/bin/pyspark --master spark://[server-ip]:7077

I tried executing a simple Spark example from the website:

from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.some.config.option", "some-value") \
    .getOrCreate()
df = spark.read.json("/path/to/spark-2.0.0-bin-hadoop2.7/examples/src/main/resources/people.json")

I got this error:

Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
I have enough memory on my server and on my local machine, but I keep getting this error. I have 6G allocated to my Spark cluster, and my script uses only 4 cores with 1G of memory per node.

I have Googled this error and tried different memory configurations, and I also disabled the firewall on both machines, but it did not help. I have no idea how to fix it.
Has anyone faced the same problem? Any ideas?

1 Answer

深蓝 · Answer 1 · 2021-10-17T03:09:03+0000

You are submitting the application in client mode, which means the driver process is started on your local machine.

When executing Spark applications, all machines have to be able to communicate with each other. Most likely your driver process is not reachable from the executors (for example, it is using a private IP or is hidden behind a firewall). If that is the case, you can confirm it by checking the executor logs: go to the application, select one of the workers with the status EXITED, and check stderr. You "should" see that the executor is failing due to org.apache.spark.rpc.RpcTimeoutException.
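One way to test the reachability hypothesis directly is to attempt a plain TCP connection from the worker machine back to the driver machine. The sketch below uses only the Python standard library. `can_connect` is a hypothetical helper, not part of Spark, and the host and port in the usage comment are placeholders. Keep in mind that the driver port is chosen at random unless `spark.driver.port` is set, so the port to probe has to be looked up for the running application or pinned beforehand.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration: run it on the worker machine,
    pointing at the address and port the driver is listening on.
    """
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False


# Example usage (placeholder values, replace with the real driver address):
#   can_connect("203.0.113.10", 40000)
```

If this returns False from the worker while the driver is running, the executors are unlikely to be able to register with the driver either, which would be consistent with the timeout in the executor logs.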

There are two possible solutions:

1. Submit the application from a machine that can be reached from your cluster.
2. Submit the application in cluster mode. This will use cluster resources to start the driver process, so you have to account for that.
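
If the local machine can be given an address that the cluster is able to reach, the first option can be approximated by telling Spark which address and ports the driver should advertise. The following is a minimal sketch, assuming the documented properties `spark.driver.host`, `spark.driver.port`, and `spark.blockManager.port`. The helper `pyspark_driver_args` and the port numbers 40000 and 40001 are made up for illustration, and `[driver-ip]` is a placeholder in the same spirit as `[server-ip]` in the question.

```python
def pyspark_driver_args(master_url, driver_host, driver_port, block_manager_port):
    """Build command-line arguments that pin the driver's address and ports.

    Hypothetical helper: it only assembles strings and does not start Spark.
    """
    conf = {
        # Address the executors should use to reach the driver.
        "spark.driver.host": driver_host,
        # Fixed ports instead of random ones, so a firewall rule can allow them.
        "spark.driver.port": str(driver_port),
        "spark.blockManager.port": str(block_manager_port),
    }
    args = ["--master", master_url]
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


# Print the arguments to append to the pyspark command from the question.
print(" ".join(pyspark_driver_args(
    "spark://[server-ip]:7077", "[driver-ip]", 40000, 40001)))
```

The chosen ports still have to be open for inbound connections on the driver machine. As for the second option, the Spark documentation states that standalone clusters do not support cluster deploy mode for Python applications, so it may not be available for a PySpark job on this particular setup.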
