Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
380 views
in Technique[技术] by (71.8m points)

pivot - Pivoting in Pig

This is related to the question in Pivot table with Apache Pig. I have the input data as

Id    Name     Value 
1     Column1  Row11 
1     Column2  Row12 
1     Column3  Row13 
2     Column1  Row21 
2     Column2  Row22 
2     Column3  Row23 

and want to pivot and get the output as

Id    Column1 Column2 Column3 
1      Row11    Row12   Row13 
2      Row21    Row22   Row23 

Pls let me know how to do it in Pig.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The simplest way to do it without UDF is to group on Id and than in nested foreach select rows for each of the column names, then join them in the generate. See script:

inpt = load '~/rows_to_cols.txt' as (Id : chararray, Name : chararray, Value: chararray);
grp = group inpt by Id;
maps = foreach grp {
    col1 = filter inpt by Name == 'Column1';
    col2 = filter inpt by Name == 'Column2';
    col3 = filter inpt by Name == 'Column3';
    generate flatten(group) as Id, flatten(col1.Value) as Column1, flatten(col2.Value)  as Column2, flatten(col3.Value)  as Column3;
};

Output:

(1,Row11,Row12,Row13)
(2,Row21,Row22,Row23)

Another option would be to write a UDF which converts a bag{name, value} into a map[], than use get values by using column names as keys (Ex. vals#'Column1').


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...