I have a large table of hourly aggregates in Athena, and users would like to run queries like
SELECT *
from hourly_table
where timestamp > START_TIME
and timestamp < END_TIME
The date range can be one or several weeks. However, because the data volume is huge, the query is very slow. I therefore tried generating two extra tables with daily and monthly aggregates. However, that means querying three tables (hourly, daily, and monthly) and breaking the date range down by hand. For example, to query data from Jan 1st 12:00 am to Feb 12th 1:00 am, users need to write a query like
SELECT * from monthly_table
where time < date '2020-02-01' and time >= date '2020-01-01'
UNION ALL
SELECT * from daily_table
where time < date '2020-02-12' and time >= date '2020-02-01'
UNION ALL
SELECT * from hourly_table
where time < timestamp '2020-02-12 01:00:00' and time >= timestamp '2020-02-12 00:00:00'
Is there a way to expose just one table (view) to the user so that the query's date range is broken down automatically? Thanks!
For example, the same query above would be written as
SELECT * from table_view
where time < timestamp '2020-02-12 01:00:00' and time >= timestamp '2020-01-01 00:00:00'
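To make the breakdown I have in mind concrete, here is a minimal client-side sketch in Python that splits a range into monthly, daily, and hourly pieces and builds the UNION ALL query shown above. It is only an illustration of the intended behaviour, not an Athena feature. The helper names (next_month, split_range, build_query) are hypothetical. It assumes the range boundaries fall on whole hours, that the three tables have compatible columns, and that the time column of every table can be compared with a timestamp literal (the hand-written query above uses date literals for the monthly and daily tables).

```python
from datetime import datetime, timedelta


def next_month(ts):
    """Return midnight on the first day of the month after ts."""
    if ts.month == 12:
        return datetime(ts.year + 1, 1, 1)
    return datetime(ts.year, ts.month + 1, 1)


def split_range(start, end):
    """Split [start, end) into (table, lo, hi) pieces.

    Each step uses the coarsest table whose period fits entirely
    inside the remaining range. Assumes start and end are on whole hours.
    """
    pieces = []
    cur = start
    while cur < end:
        at_midnight = cur.hour == 0
        if at_midnight and cur.day == 1 and next_month(cur) <= end:
            table, nxt = "monthly_table", next_month(cur)
        elif at_midnight and cur + timedelta(days=1) <= end:
            table, nxt = "daily_table", cur + timedelta(days=1)
        else:
            # Clamp so the last hourly piece never runs past end.
            table, nxt = "hourly_table", min(cur + timedelta(hours=1), end)
        # Merge with the previous piece when it is the same table and contiguous.
        if pieces and pieces[-1][0] == table and pieces[-1][2] == cur:
            pieces[-1] = (table, pieces[-1][1], nxt)
        else:
            pieces.append((table, cur, nxt))
        cur = nxt
    return pieces


def build_query(start, end):
    """Build one UNION ALL query covering [start, end)."""
    parts = [
        f"SELECT * FROM {table}\n"
        f"WHERE time >= timestamp '{lo:%Y-%m-%d %H:%M:%S}' "
        f"AND time < timestamp '{hi:%Y-%m-%d %H:%M:%S}'"
        for table, lo, hi in split_range(start, end)
    ]
    return "\nUNION ALL\n".join(parts)


if __name__ == "__main__":
    print(build_query(datetime(2020, 1, 1), datetime(2020, 2, 12, 1)))
```

For the range in my example this produces the same three pieces as the hand-written query: January from monthly_table, Feb 1st to Feb 12th from daily_table, and the final hour from hourly_table. What I am hoping for is a way to get this behaviour from a single view instead of generating the query outside Athena.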
question from:
https://stackoverflow.com/questions/65925650/automatic-data-range-breakdown-on-multiple-tables-for-queries-in-athena-presto