I am trying to insert a large(-ish) number of elements in the shortest time possible and I tried these two alternatives:
1) Pipelining:
List<Task> addTasks = new List<Task>();
for (int i = 0; i < table.Rows.Count; i++)
{
DataRow row = table.Rows[i];
Task<bool> addAsync = redisDB.SetAddAsync(string.Format(keyFormat, row.Field<int>("Id")), row.Field<int>("Value"));
addTasks.Add(addAsync);
}
Task[] tasks = addTasks.ToArray();
Task.WaitAll(tasks);
2) Batching:
List<Task> addTasks = new List<Task>();
IBatch batch = redisDB.CreateBatch();
for (int i = 0; i < table.Rows.Count; i++)
{
DataRow row = table.Rows[i];
Task<bool> addAsync = batch.SetAddAsync(string.Format(keyFormat, row.Field<int>("Id")), row.Field<int>("Value"));
addTasks.Add(addAsync);
}
batch.Execute();
Task[] tasks = addTasks.ToArray();
Task.WaitAll(tasks);
I am not noticing any significant time difference (actually I expected the batch method to be faster): for approx 250K inserts I get approx 7 sec for pipelining vs approx 8 sec for batching.
Reading from the documentation on pipelining,
"Using pipelining allows us to get both requests onto the network
immediately, eliminating most of the latency. Additionally, it also
helps reduce packet fragmentation: 20 requests sent individually
(waiting for each response) will require at least 20 packets, but 20
requests sent in a pipeline could fit into much fewer packets (perhaps
even just one)."
To me, this sounds a lot like the a batching behaviour. I wonder if behind the scenes there's any big difference between the two because at a simple check with procmon
I see almost the same number of TCP Send
s on both versions.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…