I second jhs's answer, but I think his "Option 2" is too dangerous. I learned that the hard way. You can use a reduce function for many nice things, like getting the last post of each user of a blog, but you cannot use it for anything that does not actually reduce the amount of data returned.
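For instance, a "last post of each user" reduce genuinely shrinks the data: however many posts go in, exactly one value per user comes out. A sketch (assuming the map emits one {date: ..., title: ...} value per post, keyed by user):
function(keys, values, rereduce) {
  // Works for reduce and rereduce alike, because input and output
  // have the same shape: keep only the newest post.
  var latest = values[0];
  for (var i = 1; i < values.length; i++) {
    if (values[i].date > latest.date) {
      latest = values[i];
    }
  }
  return latest;
}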
To back this claim with facts, I made this little script that generates 200 customers with 20 orders each.
#!/bin/bash
# Emits a _bulk_docs payload: 200 customers with 20 orders each.
echo '{"docs":['
sep=''
for i in $(seq 1 200); do
  id=customer/$i
  echo $sep'{"_id":"'$id'","type":"customer","name":"Customer '$i'"}'
  # Every order (and every customer after the first) is prefixed with
  # a comma, so the output stays valid JSON with no trailing comma.
  for o in $(seq 1 20); do
    echo ',{"type":"order","_id":"order/'$i'/'$o'","for":"'$id'","desc":"Desc '$i$o'"}'
  done
  sep=','
done
echo ']}'
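If you save it as generate.sh (the name is mine), you can load the result straight into CouchDB, assuming a local server and a database called test:
./generate.sh > bulk.json
curl -X POST 'http://localhost:5984/test/_bulk_docs' \
     -H 'Content-Type: application/json' -d @bulk.json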
It is a very likely scenario, and it is enough to throw an Error: reduce_overflow_error.
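The kind of reduce I am talking about is one that merely groups the rows instead of shrinking them, something along these lines (a sketch, not jhs's exact code):
function(keys, values, rereduce) {
  // Anti-pattern: the output is as big as the input, so it keeps
  // growing with the number of orders until CouchDB gives up.
  if (rereduce) {
    return [].concat.apply([], values);
  }
  return values;
}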
IMHO, the two options you have are:
Option 1: Optimized list function
With a little work, you can build the JSON response by hand, so that you never need to accumulate a customer's orders in an array.
I have edited jhs's list function to avoid any use of arrays, so it works for customers with any number of orders.
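It assumes the same kind of view as jhs's answer: a map function that makes each customer collate right before its own orders. My reconstruction of it (field names are assumptions) looks like this:
function(doc) {
  if (doc.type === 'customer') {
    // Customers sort first in their group: [id, 1]
    emit([doc._id, 1], {'name': doc.name});
  } else if (doc.type === 'order') {
    // Orders sort right after their customer: [customer id, 2]
    emit([doc['for'], 2], {'id': doc._id, 'desc': doc.desc});
  }
}
And the list function itself: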
function(head, req) {
  start({'headers':{'Content-Type':'application/json'}});

  var first_customer = true
    , first_order = true
    , row
    ;

  send('{"rows":[');
  while(row = getRow()) {
    if(row.key[1] === 2) {
      // Order for the current customer
      if (first_order) {
        first_order = false;
      } else {
        send(',');
      }
      send(JSON.stringify(row.value));
    }
    else if (row.key[1] === 1) {
      // New customer: close the previous customer's orders, if any
      if (first_customer) {
        first_customer = false;
      } else {
        send(']},');
      }
      send('{"customer":');
      send(JSON.stringify(row.key[0]));
      send(',"orders":[');
      first_order = true;
    }
  }
  // Close the last customer, then the rows array and the outer object
  if (!first_customer)
    send(']}');
  send('\n]}');
}
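You query it like any other list function (the design document, list, and view names here are mine, adjust to yours):
curl 'http://localhost:5984/test/_design/app/_list/customers/by_customer'
Since the response is streamed with send(), memory use stays flat no matter how many orders a customer has.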
Option 2: Optimize the documents for your use case
If you really need to have the orders in the same document, then ask yourself whether you can store them that way and avoid any processing at query time.
In other words: try to fully exploit the possibilities offered by a document database. Design the documents to best fit your use case, and reduce the post-processing needed to consume them.
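For example (one possible design, not the only one), a customer document could simply embed its orders, so a single GET returns everything at once:
{
  "_id": "customer/42",
  "type": "customer",
  "name": "Customer 42",
  "orders": [
    {"id": "order/42/1", "desc": "Desc 421"},
    {"id": "order/42/2", "desc": "Desc 422"}
  ]
}
The trade-off is write contention: every new order is an update to the customer document, so this works best when orders per customer stay modest and writes are not highly concurrent.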