Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
573 views
in Technique[技术] by (71.8m points)

hiveql - How to use Hive Query results(multiple) in a variable for other query

I have two tables one is schools and one is students.I want to find all the students of a particular school. The schema of schools is: id, name, location and of students is :id, name, schoolId. I wrote the following script:

schoolId=$(hive -e "set hive.cli.print.header=false;select id from school;")
 hive -hiveconf "schoolId"="$schoolId" 

hive>select id,name from student where schoolId like  '${hiveconf:schoolId}%'

I dont get any result as schoolId stores all the id together.For example there are 3 schools with id: 123, 256,346 schoolId variable stores as 123 256 346 and the result is null.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Use collect_set() with concat_ws to get comma delimited string, IDs should be cast to string:

schoolId=$(hive -e "set hive.cli.print.header=false;select concat_ws('\',\'',collect_set(cast(id as string))) from school;");

hive -hiveconf "schoolId"="$schoolId" 

Then use IN operator:

select id,name from student where schoolId in ('${hiveconf:schoolId}');

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...