Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
868 views
in Technique[技术] by (71.8m points)

scala - Replace null values in Spark DataFrame

I saw a solution here but when I tried it doesn't work for me.

First I import a cars.csv file :

val df = sqlContext.read
              .format("com.databricks.spark.csv")
              .option("header", "true")
              .load("/usr/local/spark/cars.csv")

Which looks like the following :

+----+-----+-----+--------------------+-----+
|year| make|model|             comment|blank|
+----+-----+-----+--------------------+-----+
|2012|Tesla|    S|          No comment|     |
|1997| Ford| E350|Go get one now th...|     |
|2015|Chevy| Volt|                null| null|

Then I do this :

df.na.fill("e",Seq("blank"))

But the null values didn't change.

Can anyone help me ?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

This is basically very simple. You'll need to create a new DataFrame. I'm using the DataFrame df that you have defined earlier.

val newDf = df.na.fill("e",Seq("blank"))

DataFrames are immutable structures. Each time you perform a transformation which you need to store, you'll need to affect the transformed DataFrame to a new value.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...