You can coalesce each null startValue with the maximum startValue found in the same id partition, minus 10:
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{coalesce, max}

val df2 = df.withColumn(
  "startValue",
  // replace nulls with the partition-wide max startValue minus 10
  coalesce($"startValue", max($"startValue").over(Window.partitionBy("id")) - 10)
)
df2.show
+---+----------+--------+
| id|startValue|endValue|
+---+----------+--------+
| 1| 544| 11a|
| 1| 554| 22b|
| 2| 6733| 33c|
| 2| 6743| 44d|
+---+----------+--------+
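If you want to reproduce this end to end, here's a sketch of an input df that yields the output above (the original input wasn't shown, so the null positions and values here are inferred from the result, assuming two rows per id):

import spark.implicits._

val df = Seq(
  (1, None,       "11a"),  // filled as 554 - 10 = 544
  (1, Some(554),  "22b"),
  (2, None,       "33c"),  // filled as 6743 - 10 = 6733
  (2, Some(6743), "44d")
).toDF("id", "startValue", "endValue")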