I have a Snowflake table with a clustering key in the format yyyy-mm-dd-hh, and I would like to filter it on all the values for a certain date.
I have written the WHERE clause in the following ways:
1. WHERE dt >= current_date() || '-00' AND dt <= current_date() || '-23'
   - took 0.7 seconds
2. WHERE dt IN (current_date() || '-00',current_date() || '-01',current_date() || '-02',current_date() || '-03',current_date() || '-04',current_date() || '-05',current_date() || '-06',current_date() || '-07',current_date() || '-08',current_date() || '-09',current_date() || '-10',current_date() || '-11',current_date() || '-12',current_date() || '-13',current_date() || '-14',current_date() || '-15',current_date() || '-16',current_date() || '-17',current_date() || '-18',current_date() || '-19',current_date() || '-20',current_date() || '-21',current_date() || '-22',current_date() || '-23')
   - took the same as #1
3. WHERE dt LIKE current_date() || '%'
   - took 2.9 seconds
4. WHERE left(dt, 10) = current_date
   - took 9.2 seconds
Before running these, I executed ALTER SESSION SET USE_CACHED_RESULT = FALSE
to make sure the result cache was not being used.
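For anyone who wants to reproduce the comparison, the setup looks roughly like the sketch below. The table name my_table and the COUNT(*) select list are placeholders I am assuming here for illustration; only the dt column, the predicates, and the session setting come from my actual test.

```sql
-- Disable the result cache so every run actually reads the table.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- Range predicate on the clustering key.
-- my_table is a hypothetical table name; dt holds yyyy-mm-dd-hh strings.
SELECT COUNT(*)
FROM my_table
WHERE dt >= current_date() || '-00'
  AND dt <= current_date() || '-23';

-- Prefix match on the same key.
SELECT COUNT(*)
FROM my_table
WHERE dt LIKE current_date() || '%';
```

Comparing the partitions scanned against the partitions total in the query profile of each run should show how much pruning each predicate gets, which is probably a more reliable signal than wall-clock time alone.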
My questions are:
- Is there a more efficient way to query than #1?
- What types of queries can take advantage of the clustering? I would expect #3 to perform similarly to #1, so I'm wondering what the rules are for writing an efficient query.
Question from:
https://stackoverflow.com/questions/65937717/filter-a-table-on-a-cluster-key-efficiently