Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


scala - Is it possible to access another DataFrame while iterating through a DataFrame?

When iterating through a DataFrame with .foreach in Spark (Scala), is it possible to access another DataFrame, or load a DataFrame from Spark SQL, to make comparisons? For example, DF1 lists the available days; if a day is marked as not available in DF1 but appears in DF2, I would like to ignore that row of DF1. I have the logic working when I call .collect on DF1 and iterate over the result, but DF1 will be a large dataset and I do not want to pull all of that data back to the driver.

DF1 Schema
 |-- id: integer (nullable = false)
 |-- monday: boolean (nullable = false)
 |-- tuesday: boolean (nullable = false)
 |-- wednesday: boolean (nullable = false)
 |-- thursday: boolean (nullable = false)
 |-- friday: boolean (nullable = false)
 |-- saturday: boolean (nullable = false)
 |-- sunday: boolean (nullable = false)

DF2 Schema
 |-- start: timestamp (nullable = false)
 |-- end: timestamp (nullable = false)
 |-- dayStart: string (nullable = false)
 |-- dayEnd: string (nullable = false)


1 Answer

Waiting for an expert to reply.
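A DataFrame is a driver-side handle, so referencing DF2 (or calling Spark SQL) inside the closure passed to DF1.foreach is not supported; that closure runs on the executors. One possible way to avoid .collect is to express the comparison as joins. The following is only a minimal sketch under stated assumptions, not a definitive answer: it assumes that dayStart in DF2 holds day names matching DF1's column names (case-insensitively) and that dayEnd and the timestamps can be ignored. The object name AvailabilityFilter and the helper dropConflictingRows are hypothetical names introduced for illustration.

```scala
import java.sql.Timestamp

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

// Hypothetical helper object; not part of Spark or of the original question.
object AvailabilityFilter {
  // Day columns as they appear in the DF1 schema.
  val dayColumns: Seq[String] =
    Seq("monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday")

  def dropConflictingRows(df1: DataFrame, df2: DataFrame): DataFrame = {
    // Unpivot DF1 into one row per (id, day, available).
    val perDay = df1
      .select(
        col("id"),
        explode(array(dayColumns.map { d =>
          struct(lit(d).as("day"), col(d).as("available"))
        }: _*)).as("entry"))
      .select(
        col("id"),
        col("entry.day").as("day"),
        col("entry.available").as("available"))

    // ASSUMPTION: dayStart contains a day name such as "monday".
    val daysInDf2 = df2.select(lower(col("dayStart")).as("day")).distinct()

    // Ids that have at least one unavailable day that also appears in DF2.
    val conflictingIds = perDay
      .filter(!col("available"))
      .join(daysInDf2, Seq("day"), "left_semi")
      .select("id")
      .distinct()

    // Keep only the DF1 rows whose id is not in the conflicting set.
    df1.join(conflictingIds, Seq("id"), "left_anti")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("availability-filter")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data, for illustration only.
    val df1 = Seq(
      (1, true, true, true, true, true, false, false),
      (2, false, true, true, true, true, true, true)
    ).toDF(("id" +: dayColumns): _*)

    val df2 = Seq(
      (Timestamp.valueOf("2024-01-01 09:00:00"),
       Timestamp.valueOf("2024-01-01 17:00:00"),
       "monday", "monday")
    ).toDF("start", "end", "dayStart", "dayEnd")

    // Row 2 marks monday as unavailable and monday appears in DF2,
    // so only the row with id 1 should remain.
    dropConflictingRows(df1, df2).show()

    spark.stop()
  }
}
```

Because everything stays in DataFrame operations, Spark can plan the work across the cluster and nothing has to be pulled back to the driver. If DF2 is small, wrapping daysInDf2 in broadcast(...) may avoid a shuffle; whether that helps depends on the actual data sizes.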
