According to this article from Snowflake, Load Data in Spark with Overwrite mode without Changing Table Structure, you can set the two options usestagingtable to OFF and truncate_table to ON when writing.
[...] if you write the data into this table using the
Snowflake Spark Connector with OVERWRITE mode, then the table gets
re-created with the default lengths of the data types; for example,
VARCHAR(32) becomes VARCHAR(16777216).
[...] you can set the parameters TRUNCATE_TABLE=ON and USESTAGINGTABLE=OFF in the database connection string of your Spark code and run the Spark write job in "OVERWRITE" mode.
By default, the parameter USESTAGINGTABLE is set to ON: the connector first writes the data frame into a temporary staging table, and only if that write succeeds is the target table replaced by the new one.
So your code should be like this:
options = {
    "sfUrl": "xxxxxxxxxxxx",
    "sfUser": user,
    "sfPassword": password,
    "sfDatabase": "xxxx",
    "sfSchema": "xxx",
    "sfWarehouse": "xxxx",
    "truncate_table": "ON",     # truncate instead of dropping, so the table structure is kept
    "usestagingtable": "OFF"    # write directly to the target instead of via a staging table
}
df.write.format("net.snowflake.spark.snowflake") \
    .mode("overwrite") \
    .options(**options) \
    .option("dbtable", "xxx") \
    .save()
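
If you want to double-check that the overwrite kept the original column definitions, one way (a sketch: it assumes an active SparkSession named spark, reuses the same options dict, and 'XXX' is a placeholder for your table name as stored by Snowflake) is to read INFORMATION_SCHEMA back through the connector:

# Optional sanity check (sketch): query INFORMATION_SCHEMA via the connector's
# "query" option and confirm that e.g. VARCHAR(32) columns kept their length.
check_df = spark.read.format("net.snowflake.spark.snowflake") \
    .options(**options) \
    .option("query",
            "SELECT column_name, data_type, character_maximum_length "
            "FROM information_schema.columns "
            "WHERE table_name = 'XXX'") \
    .load()

check_df.show()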