
0 votes
1.2k views
in Technique by (71.8m points)

pyspark - How can I set the default Spark logging level?

I launch PySpark applications from PyCharm on my own workstation against an 8-node cluster. The cluster also has settings encoded in spark-defaults.conf and spark-env.sh.

This is how I obtain my Spark context variable:

    from pyspark.sql import SparkSession

    spark = SparkSession \
        .builder \
        .master("spark://stcpgrnlp06p.options-it.com:7087") \
        .appName(__SPARK_APP_NAME__) \
        .config("spark.executor.memory", "50g") \
        .config("spark.eventLog.enabled", "true") \
        .config("spark.eventLog.dir", r"/net/share/grid/bin/spark/UAT/SparkLogs/") \
        .config("spark.cores.max", 128) \
        .config("spark.sql.crossJoin.enabled", "True") \
        .config("spark.executor.extraLibraryPath", "/net/share/grid/bin/spark/UAT/bin/vertica-jdbc-8.0.0-0.jar") \
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
        .config("spark.logConf", "true") \
        .getOrCreate()

    sc = spark.sparkContext
    sc.setLogLevel("INFO")
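
(As a sanity check, I can already dump the resolved settings myself from the SparkConf; a sketch using the standard PySpark API:

    # Print every (key, value) pair from the effective SparkConf
    for key, value in sorted(sc.getConf().getAll()):
        print(key, '=', value)

)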

I want to see the effective config that is being used in my log. This line

        .config("spark.logConf", "true") 

should cause the Spark API to log its effective config as INFO messages, but the default log level is WARN, so I don't see them.
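
Presumably that WARN default comes from Spark's log4j configuration rather than from my builder code. A sketch of where it lives, assuming Spark 2.x with log4j 1.x (newer Spark releases read conf/log4j2.properties instead): the root level is set in conf/log4j.properties under the Spark conf directory, e.g.

    # conf/log4j.properties -- the level Spark's loggers start with
    log4j.rootCategory=WARN, console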

Setting the level with this line

    sc.setLogLevel("INFO")

shows INFO messages going forward, but it's too late by then: the effective-config dump happens during startup, before this call takes effect.

How can I set the default logging level that Spark starts with?


1 Answer

0 votes
by (71.8m points)

You can also update the log level programmatically: get hold of the Log4j API through the Spark session's JVM gateway, like below.

    def update_spark_log_level(spark, log_level='info'):
        # Change the level for Spark's own loggers
        spark.sparkContext.setLogLevel(log_level)
        # Reach into the JVM for a Log4j logger to use for your own messages
        log4j = spark._jvm.org.apache.log4j
        logger = log4j.LogManager.getLogger("my custom Log Level")
        return logger


use:

    logger = update_spark_log_level(spark, 'debug')
    logger.info('your log message')
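
If you also need the returned logger itself to emit below the root level, you can set its level through the same Log4j handle; a sketch, assuming the log4j 1.x API that PySpark exposes over its JVM gateway:

    # Set the custom logger's own level via Log4j (hypothetical follow-up)
    log4j = spark._jvm.org.apache.log4j
    my_logger = log4j.LogManager.getLogger("my custom Log Level")
    my_logger.setLevel(log4j.Level.toLevel('DEBUG'))
    my_logger.debug('now visible at DEBUG')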

Feel free to comment if you need more details.

