Some background information:
I'm working with Dataiku DSS, HDFS, and partitioned datasets. I have a particular job running (Hive query) which has two input datasets - one a very large, partitioned dataset, the other a small (~250 rows, 2 columns), non-partitioned dataset. Let's call the partitioned table A, and the non-partitioned table B.
Question:
The query is of the following format,
SELECT a.f1, f2, ..., fn
FROM A as a
LEFT JOIN B as b
ON a.f1 = b.f1
WHERE {PARTITION_FILTER}
Here is the current ouput from the MapReduce job (keep in mind this job is still running):
[09:05:53] [INFO] [dku.utils] - INFO : Total jobs = 4
[09:05:53] [INFO] [dku.utils] - INFO : Starting task [Stage-10:CONDITIONAL] in serial mode
[09:05:53] [INFO] [dku.utils] - INFO : Stage-11 is filtered out by condition resolver.
[09:05:53] [INFO] [dku.utils] - INFO : Stage-1 is selected by condition resolver.
[09:05:53] [INFO] [dku.utils] - INFO : Launching Job 1 out of 4
[09:05:53] [INFO] [dku.utils] - INFO : Starting task [Stage-1:MAPRED] in serial mode
[09:05:53] [INFO] [dku.utils] - INFO : Number of reduce tasks not specified. Estimated from input data size: 307
[09:05:53] [INFO] [dku.utils] - INFO : In order to change the average load for a reducer (in bytes):
[09:05:53] [INFO] [dku.utils] - INFO : set hive.exec.reducers.bytes.per.reducer=<number>
[09:05:53] [INFO] [dku.utils] - INFO : In order to limit the maximum number of reducers:
[09:05:53] [INFO] [dku.utils] - INFO : set hive.exec.reducers.max=<number>
[09:05:53] [INFO] [dku.utils] - INFO : In order to set a constant number of reducers:
[09:05:53] [INFO] [dku.utils] - INFO : set mapreduce.job.reduces=<number>
[09:05:53] [INFO] [dku.utils] - INFO : number of splits:75
[09:05:53] [INFO] [dku.utils] - INFO : Submitting tokens for job: job_1529508387832_10211
[09:05:53] [INFO] [dip.hiveserver2.log.sniffer] - sniffed applicationId => application_1529508387832_10211/
[09:05:53] [INFO] [dku.utils] - INFO : Kill Command = /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/bin/hadoop job -kill job_1529508387832_10211
[09:05:53] [INFO] [dku.utils] - INFO : Hadoop job information for Stage-1: number of mappers: 75; number of reducers: 307
[09:05:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:05:47,749 Stage-1 map = 0%, reduce = 0%
[09:06:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:06:48,444 Stage-1 map = 0%, reduce = 0%, Cumulative CPU 278.18 sec
[09:06:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:06:51,682 Stage-1 map = 1%, reduce = 0%, Cumulative CPU 373.35 sec
[09:07:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:00,159 Stage-1 map = 2%, reduce = 0%, Cumulative CPU 501.46 sec
[09:07:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:02,235 Stage-1 map = 3%, reduce = 0%, Cumulative CPU 539.6 sec
[09:07:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:07,472 Stage-1 map = 4%, reduce = 0%, Cumulative CPU 1389.81 sec
[09:07:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:10,605 Stage-1 map = 5%, reduce = 0%, Cumulative CPU 2172.52 sec
[09:07:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:12,574 Stage-1 map = 6%, reduce = 0%, Cumulative CPU 2577.81 sec
[09:07:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:13,604 Stage-1 map = 8%, reduce = 0%, Cumulative CPU 2865.44 sec
[09:07:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:15,747 Stage-1 map = 9%, reduce = 0%, Cumulative CPU 3110.21 sec
[09:07:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:19,898 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 4080.2 sec
[09:07:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:21,988 Stage-1 map = 11%, reduce = 0%, Cumulative CPU 4522.48 sec
[09:07:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:23,015 Stage-1 map = 12%, reduce = 0%, Cumulative CPU 4755.96 sec
[09:07:33] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:29,335 Stage-1 map = 13%, reduce = 0%, Cumulative CPU 5710.85 sec
[09:07:33] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:31,407 Stage-1 map = 14%, reduce = 0%, Cumulative CPU 5948.34 sec
[09:07:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:34,555 Stage-1 map = 15%, reduce = 0%, Cumulative CPU 6399.6 sec
[09:07:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:37,663 Stage-1 map = 16%, reduce = 0%, Cumulative CPU 6811.22 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:38,695 Stage-1 map = 19%, reduce = 0%, Cumulative CPU 7087.68 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:39,729 Stage-1 map = 20%, reduce = 0%, Cumulative CPU 7288.22 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:40,769 Stage-1 map = 22%, reduce = 0%, Cumulative CPU 7520.54 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:41,903 Stage-1 map = 24%, reduce = 0%, Cumulative CPU 7771.37 sec
[09:07:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:42,930 Stage-1 map = 25%, reduce = 0%, Cumulative CPU 7936.9 sec
[09:07:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:45,035 Stage-1 map = 27%, reduce = 0%, Cumulative CPU 8254.78 sec
[09:07:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:46,075 Stage-1 map = 29%, reduce = 0%, Cumulative CPU 8428.35 sec
[09:07:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:47,111 Stage-1 map = 30%, reduce = 0%, Cumulative CPU 8661.23 sec
[09:07:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:48,153 Stage-1 map = 31%, reduce = 0%, Cumulative CPU 8834.37 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:49,193 Stage-1 map = 33%, reduce = 0%, Cumulative CPU 8983.68 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:50,227 Stage-1 map = 35%, reduce = 0%, Cumulative CPU 9149.94 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:51,263 Stage-1 map = 37%, reduce = 0%, Cumulative CPU 9268.9 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:52,301 Stage-1 map = 38%, reduce = 0%, Cumulative CPU 9415.86 sec
[09:07:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:53,352 Stage-1 map = 39%, reduce = 0%, Cumulative CPU 9540.63 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:54,381 Stage-1 map = 41%, reduce = 0%, Cumulative CPU 9711.54 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:55,421 Stage-1 map = 43%, reduce = 0%, Cumulative CPU 9823.52 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:56,453 Stage-1 map = 46%, reduce = 0%, Cumulative CPU 10010.83 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:57,492 Stage-1 map = 49%, reduce = 0%, Cumulative CPU 10081.9 sec
[09:07:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:58,532 Stage-1 map = 53%, reduce = 0%, Cumulative CPU 10230.13 sec
[09:08:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:07:59,576 Stage-1 map = 56%, reduce = 0%, Cumulative CPU 10392.61 sec
[09:08:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:00,607 Stage-1 map = 58%, reduce = 0%, Cumulative CPU 10483.38 sec
[09:08:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:01,649 Stage-1 map = 63%, reduce = 0%, Cumulative CPU 10618.16 sec
[09:08:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:02,672 Stage-1 map = 66%, reduce = 0%, Cumulative CPU 10684.82 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:03,695 Stage-1 map = 67%, reduce = 0%, Cumulative CPU 10701.95 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:04,720 Stage-1 map = 70%, reduce = 0%, Cumulative CPU 10767.21 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:05,750 Stage-1 map = 71%, reduce = 0%, Cumulative CPU 10849.92 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:06,780 Stage-1 map = 73%, reduce = 0%, Cumulative CPU 10924.45 sec
[09:08:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:07,902 Stage-1 map = 76%, reduce = 0%, Cumulative CPU 11000.21 sec
[09:08:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:09,965 Stage-1 map = 77%, reduce = 0%, Cumulative CPU 11013.58 sec
[09:08:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:10,991 Stage-1 map = 78%, reduce = 0%, Cumulative CPU 11057.76 sec
[09:08:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:14,216 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 11157.51 sec
[09:08:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:17,362 Stage-1 map = 82%, reduce = 0%, Cumulative CPU 11392.85 sec
[09:08:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:20,460 Stage-1 map = 83%, reduce = 0%, Cumulative CPU 11610.7 sec
[09:08:28] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:26,106 Stage-1 map = 84%, reduce = 0%, Cumulative CPU 11781.65 sec
[09:08:28] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:28,163 Stage-1 map = 85%, reduce = 0%, Cumulative CPU 11788.58 sec
[09:08:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:35,410 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 12167.24 sec
[09:08:38] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:38,540 Stage-1 map = 86%, reduce = 1%, Cumulative CPU 12317.09 sec
[09:08:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:39,583 Stage-1 map = 86%, reduce = 2%, Cumulative CPU 12329.8 sec
[09:08:43] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:40,631 Stage-1 map = 86%, reduce = 3%, Cumulative CPU 12333.61 sec
[09:08:48] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:47,927 Stage-1 map = 87%, reduce = 3%, Cumulative CPU 12651.77 sec
[09:08:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:48,970 Stage-1 map = 88%, reduce = 4%, Cumulative CPU 12826.37 sec
[09:08:53] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:51,092 Stage-1 map = 88%, reduce = 5%, Cumulative CPU 12857.19 sec
[09:08:58] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:54,214 Stage-1 map = 90%, reduce = 5%, Cumulative CPU 13037.63 sec
[09:09:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:08:59,396 Stage-1 map = 91%, reduce = 5%, Cumulative CPU 13117.71 sec
[09:09:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:00,440 Stage-1 map = 92%, reduce = 5%, Cumulative CPU 13238.06 sec
[09:09:03] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:01,485 Stage-1 map = 93%, reduce = 5%, Cumulative CPU 13249.8 sec
[09:09:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:05,660 Stage-1 map = 94%, reduce = 5%, Cumulative CPU 13306.0 sec
[09:09:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:06,706 Stage-1 map = 97%, reduce = 5%, Cumulative CPU 13393.5 sec
[09:09:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:07,751 Stage-1 map = 97%, reduce = 6%, Cumulative CPU 13409.12 sec
[09:09:08] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:08,795 Stage-1 map = 98%, reduce = 6%, Cumulative CPU 13433.07 sec
[09:09:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:09,835 Stage-1 map = 98%, reduce = 8%, Cumulative CPU 13474.03 sec
[09:09:13] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:10,874 Stage-1 map = 98%, reduce = 9%, Cumulative CPU 13484.64 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:14,004 Stage-1 map = 100%, reduce = 11%, Cumulative CPU 13580.71 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:15,118 Stage-1 map = 100%, reduce = 13%, Cumulative CPU 13619.15 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:16,160 Stage-1 map = 100%, reduce = 16%, Cumulative CPU 13707.2 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:17,210 Stage-1 map = 100%, reduce = 33%, Cumulative CPU 14456.75 sec
[09:09:18] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:18,258 Stage-1 map = 100%, reduce = 39%, Cumulative CPU 14708.07 sec
[09:09:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:19,291 Stage-1 map = 100%, reduce = 40%, Cumulative CPU 14768.29 sec
[09:09:23] [INFO] [dku.utils] - INFO : 2018-06-27 09:09:20,329 Stage-1 map = 100%, reduce = 41%, Cumulative