hadoop - Why does a Fetch task in Hive work faster than a Map-only task?

1 Answer

深蓝 · Answer 1 · 2021-10-23T18:23:40+0000

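A FetchTask reads the data directly, without launching a job, whereas a map-only task still submits a MapReduce job and pays the job startup and scheduling overhead that comes with it. Which queries are converted to a fetch task is controlled by hive.fetch.task.conversion: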

<property>
  <name>hive.fetch.task.conversion</name>
  <value>minimal</value>
  <description>
    Some select queries can be converted to a single FETCH task,
    minimizing latency. Currently the query should be single
    sourced, not having any subquery, and should not have
    any aggregations or distincts (which incurs RS),
    lateral views and joins.
    1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
    2. more    : SELECT, FILTER, LIMIT only (+TABLESAMPLE, virtual columns)
  </description>
</property>
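
To see the difference in practice, something like the following HiveQL session could be tried. This is only a sketch: the table `logs`, its partition column `dt`, and the columns `user_id`, `url`, and `status` are hypothetical names, not taken from the question. When a query is converted, the EXPLAIN output should show only a fetch stage and no MapReduce stage.

```sql
-- Hypothetical table: logs(user_id, url, status), partitioned by dt.

-- "minimal": only SELECT *, filters on partition columns, and LIMIT qualify.
SET hive.fetch.task.conversion=minimal;
EXPLAIN SELECT * FROM logs WHERE dt = '2021-10-23' LIMIT 10;

-- "more": projections and filters on ordinary columns may also be converted.
SET hive.fetch.task.conversion=more;
EXPLAIN SELECT user_id, url FROM logs WHERE status = 404 LIMIT 10;

-- Aggregations need a reduce sink (RS), so this is not converted
-- to a fetch task under either setting.
EXPLAIN SELECT status, COUNT(*) FROM logs GROUP BY status;
```

Comparing the plans for the second and third queries should make it clear which one runs as a job and which one is served directly by the fetch task.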
