Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

java - Apache Spark - how to set timezone to UTC? currently defaulted to Zulu

In Spark's web UI (port 8080), the Environment tab shows the following setting:

user.timezone Zulu

Do you know how/where I can override this to UTC?

Env details:

  • Spark 2.1.1
  • jre-1.8.0-openjdk.x86_64
  • no jdk
  • EC2 Amazon Linux

EDIT (someone posted the link below as an answer, then deleted it): https://www.timeanddate.com/time/zones/z


1 Answer

Since Spark 2.2.0 (see https://issues.apache.org/jira/browse/SPARK-18936) you can set the SQL session time zone with:

spark.conf.set("spark.sql.session.timeZone", "UTC")
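The same setting can also be supplied when the session is built, so that it is in place before any query runs. This is a minimal sketch, assuming a throwaway local-mode session and Spark 2.2.0 or later on the classpath; the object name, the app name and the `local[1]` master are placeholders for illustration, not part of the original answer:

```scala
import org.apache.spark.sql.SparkSession

object SessionTimeZoneSketch {
  // Builds a local session with the SQL session time zone set to UTC.
  def build(): SparkSession =
    SparkSession.builder()
      .master("local[1]")            // assumption: local mode, for illustration only
      .appName("session-tz-sketch")  // hypothetical app name
      .config("spark.sql.session.timeZone", "UTC")
      .getOrCreate()

  def main(args: Array[String]): Unit = {
    val spark = build()
    try {
      // Read the value back to confirm what the session is using.
      println(spark.conf.get("spark.sql.session.timeZone"))
    } finally {
      spark.stop()
    }
  }
}
```

Note that this is a Spark SQL setting; the `user.timezone` entry on the Environment tab is a JVM system property and is not changed by it.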

EDIT:

Additionally, I set the JVM default time zone (java.util.TimeZone) to UTC to avoid implicit conversions:

TimeZone.setDefault(TimeZone.getTimeZone("UTC"))

Otherwise, a timestamp that carries no time zone information is implicitly converted from your default time zone to UTC.

Example:

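import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{DataTypes, StructField, StructType}
// Encoder for Seq[String]; assumes sparkJob.spark is a SparkSession held in a val
import sparkJob.spark.implicits._

// A timestamp string with no time zone information
val rawJson = """ {"some_date_field": "2018-09-14 16:05:37"} """

// Single-column Dataset[String]; the column is named "value"
val dsRaw = sparkJob.spark.createDataset(Seq(rawJson))

// Parse the JSON string into a struct with one TimestampType field
val output =
  dsRaw
    .select(
      from_json(
        col("value"),
        new StructType(
          Array(
            StructField("some_date_field", DataTypes.TimestampType)
          )
        )
      ).as("parsed")
    ).select("parsed.*")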

If my default time zone is Europe/Dublin (GMT+1 on that date) and the Spark SQL session time zone is set to UTC, Spark will assume that "2018-09-14 16:05:37" is in the Europe/Dublin time zone and convert it, so the result will be "2018-09-14 15:05:37".
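The one-hour shift can be reproduced without Spark. This is a minimal sketch using only `java.time`, assuming the string is read as Europe/Dublin wall-clock time and then rendered in UTC; the object and method names are illustrative and do not come from Spark:

```scala
import java.time.{LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter

object ImplicitConversionSketch {
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Interprets a zone-less timestamp string in `assumedZone`,
  // then renders the same instant in `targetZone`.
  def reinterpret(raw: String, assumedZone: String, targetZone: String): String =
    LocalDateTime.parse(raw, fmt)
      .atZone(ZoneId.of(assumedZone))
      .withZoneSameInstant(ZoneId.of(targetZone))
      .format(fmt)

  def main(args: Array[String]): Unit = {
    val raw = "2018-09-14 16:05:37"
    println(reinterpret(raw, "Europe/Dublin", "UTC")) // 2018-09-14 15:05:37
    println(reinterpret(raw, "UTC", "UTC"))           // 2018-09-14 16:05:37
  }
}
```

When the assumed zone and the target zone are both UTC, the value passes through unchanged, which is the effect of setting the default time zone to UTC.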

