Aggregate results of separate elasticsearch queries with their own aggs


asked Oct 7, 2021 in Technique by 深蓝 (71.8m points)

Let's say I have a single index called /recipes. The mappings include keyword fields contributor and dish_name (the latter is always a single word, like "pancakes").

I have multiple recipes and multiple contributors of recipes (docs), but I'm really interested in those from Martha and Shane.

Moreover, I'd like to find out what percentage of the dish names contributed by these two individuals are unique to each of them, i.e. not overlapping with the dishes the other one has contributed.

E.g., they each could have contributed multiple different recipes that are all for dishes named "pancakes."

I imagine I want to find all the recipes where contributor:Martha and then further get the count of unique dish names (if Martha has multiple recipes for pancakes, I only want one of those to count). Then I would do the same for Shane. Finally, I need to have a way to compare these results against each other.

In SQL land this sounds like I want a left outer join. In ES, I've tried filter aggregations, terms aggregations, sub-aggregations, pipeline aggregations. However, I can't seem to find just the right combo to get a single query to do what I want.
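For reference, the filter-then-count idea described above can be written as one request body. This is only a sketch, not a confirmed answer: it assumes the keyword fields contributor and dish_name from the question, and the names build_unique_dish_query, by_contributor, dishes and the bucket cap of 1000 are made up for illustration. It uses a filters aggregation with a terms sub-aggregation, both of which exist in ES 5.6. Sending the body to the cluster is left out.

```python
import json


def build_unique_dish_query(contributors, max_dishes=1000):
    """Build a search body with one named bucket per contributor.

    Each bucket lists the distinct dish_name values for that contributor.
    max_dishes is an arbitrary cap on returned terms (an assumption).
    """
    return {
        "size": 0,  # aggregations only, no hits
        "query": {"terms": {"contributor": list(contributors)}},
        "aggs": {
            "by_contributor": {
                "filters": {
                    "filters": {
                        name: {"term": {"contributor": name}}
                        for name in contributors
                    }
                },
                "aggs": {
                    # one bucket per distinct dish, so duplicates collapse
                    "dishes": {
                        "terms": {"field": "dish_name", "size": max_dishes}
                    }
                },
            }
        },
    }


body = build_unique_dish_query(["Martha", "Shane"])
print(json.dumps(body, indent=2))
```

The terms sub-aggregation returns each dish name once per contributor no matter how many recipes share it, which covers the "only count pancakes once" part. Comparing the two contributors against each other is not expressed in this body.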

Example data:

recipes: [
 {
 _id: 1,
 dish_name: pancakes,
 contributor: Martha,
 ingredients: who cares
 },
 {
 _id: 2,
 dish_name: pancakes,
 contributor: Shane,
 ingredients: still doesn't matter
 },
 {
 _id: 3,
 dish_name: pancakes,
 contributor: Martha,
 ingredients: totally diff from id 1
 },
 {
 _id: 4,
 dish_name: souffle,
 contributor: Martha,
 ingredients: souffle stuff
 },
 {
 _id: 5,
 dish_name: pie,
 contributor: Shane,
 ingredients: pie stuff
 }
]

I would expect a total pool of 4 dish_names: 2 unique dishes contributed by Martha (pancakes and souffle) and 2 unique dishes contributed by Shane (pancakes and pie). The final outcome would reflect 50% uniqueness for each of them, because each has contributed one dish the other also contributed (non-unique) and one dish the other did not (unique).
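To make the expected numbers concrete, here is the same arithmetic done on the five example docs with plain Python sets. It does not touch Elasticsearch at all; the variable names are just for illustration and the ingredients field is left out because it plays no role.

```python
recipes = [
    {"_id": 1, "dish_name": "pancakes", "contributor": "Martha"},
    {"_id": 2, "dish_name": "pancakes", "contributor": "Shane"},
    {"_id": 3, "dish_name": "pancakes", "contributor": "Martha"},
    {"_id": 4, "dish_name": "souffle", "contributor": "Martha"},
    {"_id": 5, "dish_name": "pie", "contributor": "Shane"},
]

# Distinct dish names per contributor (duplicate pancakes collapse).
martha = {r["dish_name"] for r in recipes if r["contributor"] == "Martha"}
shane = {r["dish_name"] for r in recipes if r["contributor"] == "Shane"}

pool = len(martha) + len(shane)  # 2 + 2
martha_only = martha - shane     # dishes only Martha contributed
shane_only = shane - martha      # dishes only Shane contributed

martha_pct = 100 * len(martha_only) / len(martha)
shane_pct = 100 * len(shane_only) / len(shane)

print(pool, sorted(martha_only), martha_pct, sorted(shane_only), shane_pct)
# prints: 4 ['souffle'] 50.0 ['pie'] 50.0
```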

Is this doable in a single query? Does it require multiple queries? I'm on ES 5.6 btw, so I may not have some of the more modern options like the composite aggregation.
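One possible way to stay at a single request is to let ES return the distinct dish names per contributor and do the comparison on the client. The sketch below assumes a response shaped like a keyed filters aggregation named by_contributor with a terms sub-aggregation named dishes; both names, the function exclusive_share, and the sample_response dict are hypothetical. The sample is hand-written to match the example data, not real cluster output.

```python
def exclusive_share(response, agg_name="by_contributor", sub_agg="dishes"):
    """Per contributor, percent of their distinct dishes nobody else has."""
    buckets = response["aggregations"][agg_name]["buckets"]
    dishes = {
        name: {b["key"] for b in bucket[sub_agg]["buckets"]}
        for name, bucket in buckets.items()
    }
    shares = {}
    for name, own in dishes.items():
        # Union of every other contributor's dishes.
        others = set()
        for other, theirs in dishes.items():
            if other != name:
                others |= theirs
        shares[name] = 100.0 * len(own - others) / len(own) if own else 0.0
    return shares


# Hand-written stand-in for the aggregation part of a search response.
sample_response = {
    "aggregations": {
        "by_contributor": {
            "buckets": {
                "Martha": {
                    "doc_count": 3,
                    "dishes": {"buckets": [
                        {"key": "pancakes", "doc_count": 2},
                        {"key": "souffle", "doc_count": 1},
                    ]},
                },
                "Shane": {
                    "doc_count": 2,
                    "dishes": {"buckets": [
                        {"key": "pancakes", "doc_count": 1},
                        {"key": "pie", "doc_count": 1},
                    ]},
                },
            }
        }
    }
}

shares = exclusive_share(sample_response)
print(shares)
```

This keeps the set difference out of the query itself, which avoids relying on anything newer than 5.6, at the cost of pulling every distinct dish name back to the client.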

Question from: https://stackoverflow.com/questions/65850968/aggregate-results-of-separate-elasticsearch-queries-with-their-own-aggs
