I'm running into an issue when serving an ML model using PySpark 2.4.0 and MLflow.
The executor fails with the following exception:
org.apache.spark.util.TaskCompletionListenerException: Memory was leaked by query. Memory leaked: (2048) Allocator(stdin reader for ./my-job-impl-condaenv.tar.gz/bin/python) 0/2048/8194/9223372036854775807 (res/actual/peak/limit)
From articles about PySpark, I understood the following (a small sketch of how I inspect these settings follows the list):
- Spark runs at least one Python worker process for each core of each executor;
- spark.executor.memory configures only the JVM memory limit and does not affect the Python processes;
- the Python worker processes consume memory from the executor overhead, configured via spark.yarn.executor.memoryOverhead;
- since Spark 2.4.0 we can reserve memory for the Python workers explicitly using spark.executor.pyspark.memory, which lets us plan memory more granularly instead of overcommitting spark.yarn.executor.memoryOverhead.
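To keep the discussion concrete, this is roughly how I check which of these settings are actually in effect on the driver (the comments about defaults are my own understanding, not something the code prints):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("memory-settings-check").getOrCreate()
conf = spark.sparkContext.getConf()

# JVM heap of each executor (default 1g)
print(conf.get("spark.executor.memory", "1g"))
# off-heap headroom shared by Python workers and other non-JVM memory
# (defaults, as far as I know, to 10% of executor memory, at least 384 MiB)
print(conf.get("spark.yarn.executor.memoryOverhead", "<default>"))
# explicit per-executor budget for Python workers, available since Spark 2.4.0
print(conf.get("spark.executor.pyspark.memory", "<not set>"))
```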
Here is the explanation of spark.executor.pyspark.memory from the official docs:
The amount of memory to be allocated to PySpark in each executor, in MiB unless otherwise specified. If set, PySpark memory for an executor will be limited to this amount. If not set, Spark will not limit Python's memory use and it is up to the application to avoid exceeding the overhead memory space shared with other non-JVM processes. When PySpark is run in YARN or Kubernetes, this memory is added to executor resource requests.
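If I read that last sentence correctly, on YARN the per-executor container request should be the sum of the three pieces. A back-of-the-envelope check with made-up numbers (placeholders, not my real job's values):

```python
# Illustrative sizing in MiB -- placeholders, not my actual configuration.
executor_memory = 4096                                   # spark.executor.memory (JVM heap)
memory_overhead = max(384, int(executor_memory * 0.10))  # spark.yarn.executor.memoryOverhead default
pyspark_memory  = 2048                                   # spark.executor.pyspark.memory, if set

# My understanding of the per-executor YARN container request in Spark 2.4:
container_request = executor_memory + memory_overhead + pyspark_memory
print(container_request)  # 4096 + 409 + 2048 = 6553 MiB
```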
At first I simply increased spark.yarn.executor.memoryOverhead, and the error finally went away.
Then I decided to do it more cleanly and reserve memory for the Python workers explicitly via spark.executor.pyspark.memory, which brought the same error back.
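Concretely, the two variants I tried looked roughly like this (the sizes are placeholders, not my exact values):

```python
from pyspark.sql import SparkSession

builder = (SparkSession.builder
           .appName("mlflow-model-serving")
           .config("spark.executor.memory", "4g")
           # Variant 1: only enlarge the shared overhead -- the error went away.
           .config("spark.yarn.executor.memoryOverhead", "4g"))

# Variant 2 (run instead of variant 1): reserve Python worker memory explicitly
# -- the allocator error came back.
# builder = (SparkSession.builder
#            .appName("mlflow-model-serving")
#            .config("spark.executor.memory", "4g")
#            .config("spark.yarn.executor.memoryOverhead", "1g")
#            .config("spark.executor.pyspark.memory", "2g"))

spark = builder.getOrCreate()
```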
So it seems I haven't properly understood what exactly spark.executor.pyspark.memory configures and how it relates to spark.yarn.executor.memoryOverhead.
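For what it's worth, my current mental model (which may well be wrong, please correct me) is that spark.yarn.executor.memoryOverhead only enlarges the container without limiting anything, while spark.executor.pyspark.memory is enforced inside each Python worker as an address-space rlimit, roughly like this simplified sketch (limit_python_worker is a hypothetical helper of mine, not a real PySpark function):

```python
import resource

def limit_python_worker(memory_limit_mb):
    """Cap the worker's address space, similar to what I believe the
    PySpark worker does when spark.executor.pyspark.memory is set."""
    new_limit = memory_limit_mb * 1024 * 1024  # bytes
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if soft == resource.RLIM_INFINITY or new_limit < soft:
        # allocations beyond this limit fail inside the Python worker
        resource.setrlimit(resource.RLIMIT_AS, (new_limit, new_limit))

limit_python_worker(2048)  # e.g. spark.executor.pyspark.memory=2g
```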
I don't have much experience with PySpark yet, so I hope you can help me understand how memory allocation works in PySpark. Thanks!
question from: https://stackoverflow.com/questions/65922306/pyspark-correlation-between-spark-yarn-executor-memoryoverhead-and-spark-execut