Optimize ETL Flow:
Is there any specific reason to not connect Tableau directly to Teradata? I am assuming that Teradata contains the actual data and you are extracting that data into a file, perhaps in CSV format, importing that file into Excel and then importing Excel into Tableau. Is that correct? If so, then you really should look into this link to set up a connection from Tableau directly to Teradata.
This should eliminate CSV files and Excel's involvement, which should be the main problems with large extracts' analysis.
Reduce Row Count:
If the issue really remains, you should look into narrowing your results. You mentioned your use case as assessment of sales performance. Are you interested in best, worst, average performers? You can trim your data based upon that. Replace DESC
with ASC
if you are looking for worst performers in following query. TOP N
limits the results to N
number of rows, if you need anything other than 10, replace it with your required count.
SELECT
TOP 10
"Store"
,"SKU"
,"WEEK"
,"YEAR"
,SUM("Sales") as SUM_SALES
FROM Database.tablename
WHERE "SKU" IN (1,2,3,4,5)
AND
"YEAR" >= 2020
GROUP BY "Store"
,"SKU"
,"WEEK"
,"YEAR"
ORDER BY SUM_SALES DESC
;
Use Sampling:
If you are more interested in overall trends rather than specific values (e.g. drawing charts etc.), and would like to reduce the result set randomly while preserving the high level trends, use Teradata's SAMPLE
clause.
SELECT "Store"
,"SKU"
,"WEEK"
,"YEAR"
,SUM("Sales") as SUM_SALES
FROM Database.tablename
WHERE "SKU" IN (1,2,3,4,5)
AND
"YEAR" >= 2020
GROUP BY "Store"
,"SKU"
,"WEEK"
,"YEAR"
ORDER BY SUM_SALES DESC
SAMPLE 0.25
;
Notice that TOP N
is no longer present. SAMPLE
can accept a specific row count if you provide a whole number > 1 or a percentage sample from total set if you provide a fraction between 0 and 1. In the example above, it will return 25% sample, so your extracted data will be 25% (1/4th) of original size. This link goes in depth about the SAMPLE
clause in Teradata.