I have the following piece of code:
from pyspark.sql import functions as f

# fact table
df = (spark.table(f'nn_squad7_{country}.fact_table')
      .filter(f.col('date_key').between(start_date, end_date))
      # .filter(f.col('is_lidl_plus') == 1)
      .filter(f.col('source') == 'tickets')
      .filter(f.col('subtype') == 'trx')
      .filter(f.col('is_trx_ok') == 1)
      .join(dim_stores, 'store_id', 'inner')
      .join(dim_customers, 'customer_id', 'inner')
      .withColumn('week', f.expr('DATE_FORMAT(DATE_SUB(date_key, 1), "Y-ww")'))
      .withColumn('quarter', f.expr('DATE_FORMAT(DATE_SUB(date_key, 1), "Q")')))

# checking metrics
df2 = (df
       .groupby('is_client_plus', 'quarter')
       .agg(
           f.countDistinct('store_id'),
           f.sum('customer_id'),
           f.sum('ticket_id')))

display(df2)
When I execute the query I get the following error:
SparkException: Job aborted due to stage failure: Task 58 in stage 13.0 failed 4 times, most recent failure: Lost task 58.3 in stage 13.0 (TID 488, 10.32.14.43, executor 4): java.lang.IllegalArgumentException: Illegal pattern character 'Q'
I'm not sure why I'm getting this error, because when I run the fact-table chunk on its own it completes without any error.
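For reference, here is a plain-Python sketch (no Spark; the helper name is my own) of the quarter value I expect the "Q" pattern to produce, mirroring the DATE_SUB(date_key, 1) shift in my expression:

```python
from datetime import date, timedelta

def expected_quarter(date_key: date) -> int:
    # Quarter of (date_key - 1 day), i.e. what I want
    # DATE_FORMAT(DATE_SUB(date_key, 1), "Q") to return.
    shifted = date_key - timedelta(days=1)
    return (shifted.month - 1) // 3 + 1

print(expected_quarter(date(2021, 4, 1)))  # day before is 2021-03-31 -> 1
```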
Any advice? Thanks!