I need to create a day-of-week column whose values are Monday, Tuesday, Wednesday, and so on, and then filter only for Friday.
The code I'm using is the following:
import pyspark.sql.functions as f

df = (
    spark.table(f'nn_squad7_{country}.fact_table')
    .filter(f.col('date_key').between(start, end))
    .filter(f.col('is_client_plus') == 1)
    .filter(f.col('source') == 'tickets')
    .filter(f.col('subtype') == 'trx')
    .filter(f.col('is_trx_ok') == 1)
    .withColumn('week', f.date_format(f.date_sub(f.col('date_key'), 1), 'YYYY-ww'))
    .withColumn('month', f.date_format(f.date_sub(f.col('date_key'), 1), 'M'))
    .withColumn('HP_client', f.col('customer_id').isNotNull())
    .withColumn('local_time', f.from_utc_timestamp(f.col('trx_begin_date_time'), 'Europe/Brussels'))
    .withColumn('Hour', f.hour(f.col('local_time')))
    .withColumn('Day', f.day(f.col('local_time')))  # this line raises the error below
    .filter(f.col('Hour').between(4, 8))
)
Here is the error I get:
AttributeError: module 'pyspark.sql.functions' has no attribute 'day'
How can I create a column with the day of the week? Thanks
question from:
https://stackoverflow.com/questions/65918594/how-to-select-day-for-week-pyspark
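For the error itself: pyspark.sql.functions has no day attribute in the Spark versions this code targets; dayofmonth returns the day of the month, and since the goal is a Monday/Tuesday/... column filtered to Friday, date_format(..., 'EEEE') returns the day name directly. Below is a minimal sketch of that approach on a hypothetical toy DataFrame with only a trx_begin_date_time column (the rest of the original pipeline would stay unchanged); the toy data and column names are assumptions for illustration.

import pyspark.sql.functions as f

# Toy input, assumed for illustration: one timestamp column like the original's trx_begin_date_time.
# 'spark' is the existing SparkSession.
df = spark.createDataFrame(
    [('2021-01-29 06:30:00',), ('2021-01-30 07:15:00',)],
    ['trx_begin_date_time'],
).withColumn('trx_begin_date_time', f.to_timestamp('trx_begin_date_time'))

result = (
    df
    .withColumn('local_time', f.from_utc_timestamp(f.col('trx_begin_date_time'), 'Europe/Brussels'))
    .withColumn('Hour', f.hour(f.col('local_time')))
    # Day of the month (what f.day was presumably reaching for):
    .withColumn('day_of_month', f.dayofmonth(f.col('local_time')))
    # Full day name: 'Monday', 'Tuesday', ..., which makes the Friday filter direct:
    .withColumn('day_name', f.date_format(f.col('local_time'), 'EEEE'))
    .filter(f.col('day_name') == 'Friday')
)
result.show()

With this sample data only the 2021-01-29 row survives, since that date falls on a Friday; the same two withColumn/filter lines can be appended to the original chain in place of the failing f.day call.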