I have a list of DataFrames: at each position in the list there is one DataFrame, and I need to combine all of them into a single DataFrame. This has to be done in PySpark; previously, in pandas, I was doing:

dataframe_new = pd.concat(listName)
Solution 1 I tried:
from pyspark.sql.types import StructType, StructField, StringType
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Explicit schema for the seven string columns
customSchema = StructType([
    StructField("col1", StringType(), True),
    StructField("col2", StringType(), True),
    StructField("col3", StringType(), True),
    StructField("col4", StringType(), True),
    StructField("col5", StringType(), True),
    StructField("col6", StringType(), True),
    StructField("col7", StringType(), True)
])

# This only converts the first DataFrame in the list
df = spark.createDataFrame(queried_dfs[0], schema=customSchema)
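What I suspect I need here is to convert every element, not just the first, and then union the results. A rough sketch of that idea, assuming each element of queried_dfs is a pandas DataFrame matching customSchema:

from functools import reduce

# Convert every pandas DataFrame in the list, not just the first one
spark_dfs = [spark.createDataFrame(pdf, schema=customSchema) for pdf in queried_dfs]

# Pairwise union into a single Spark DataFrame (requires all schemas to match)
combined = reduce(lambda a, b: a.union(b), spark_dfs)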
Solution 2 I tried (iterating through the list of DataFrames, but I don't know how to combine them):

for x in ListOfDataframe:
    new_df = union_all()

but this always creates a fresh new_df instead of accumulating the unions.
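I think the loop needs to carry the running union forward rather than overwrite it each time. A rough sketch of that, assuming every DataFrame in ListOfDataframe is a Spark DataFrame with the same schema:

new_df = None
for x in ListOfDataframe:
    # keep the first DataFrame, then union each following one onto it
    new_df = x if new_df is None else new_df.union(x)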
Any help resolving this would be appreciated.
question from:
https://stackoverflow.com/questions/65923884/make-single-dataframe-from-list-of-dataframes