How does one store history of edits effectively?

1 Answer

深蓝 · Answer 1 · 2021-10-06T05:13:08+0000

There are a number of options. The simplest, of course, is to record every version independently. For a site like Stack Overflow, where posts are rarely edited many times, this is appropriate. For something like Wikipedia, however, one needs to be cleverer to save space.

In the case of Wikipedia, pages are initially stored with each version separate, in the text table. Periodically, a number of older revisions are compressed together and packed into a single field. Since consecutive revisions of a page repeat most of their content, this saves a lot of space.
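To see why compressing revisions together helps, here is a minimal sketch using Python's zlib. It is not Wikipedia's actual storage code; the function names, the NUL separator (which assumes revisions never contain a NUL character), and the sample revisions are all made up for illustration.

```python
import zlib

SEPARATOR = "\x00"  # assumption: revision text never contains NUL

def compress_individually(revisions):
    """Total bytes used when every revision is compressed on its own."""
    return sum(len(zlib.compress(r.encode("utf-8"))) for r in revisions)

def compress_together(revisions):
    """Join all revisions into one blob and compress it once."""
    blob = SEPARATOR.join(revisions).encode("utf-8")
    return zlib.compress(blob)

def unpack(packed):
    """Recover the list of revisions from a packed blob."""
    return zlib.decompress(packed).decode("utf-8").split(SEPARATOR)

# Hypothetical page history: a shared body plus one small edit per revision.
body = " ".join(f"word{i}" for i in range(300))
revisions = [f"{body} Edit number {i}." for i in range(20)]

packed = compress_together(revisions)
print("individually:", compress_individually(revisions), "bytes")
print("together:    ", len(packed), "bytes")
```

The packed form wins because the compressor can refer back to the previous revision instead of encoding the shared text again. The trade-off is that reading any one revision means decompressing the whole batch, which is why it suits older, rarely read revisions.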

You might also want to look into how some version control systems do it. For example, Subversion uses skip deltas, where revisions are stored as a difference from a revision halfway down the history. This means that one will have to examine at most O(log n) revisions to reconstruct the revision of interest.
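The logarithmic bound can be illustrated with a small sketch. It assumes the delta base for revision n is found by clearing the lowest set bit of n, which is how skip deltas are usually described; the function names are hypothetical and this is not Subversion's code.

```python
def skip_delta_base(rev):
    """Base revision for `rev`: its number with the lowest set bit cleared."""
    return rev & (rev - 1)

def reconstruction_chain(rev):
    """Revisions read to rebuild `rev`, newest first, ending at revision 0."""
    chain = [rev]
    while rev > 0:
        rev = skip_delta_base(rev)
        chain.append(rev)
    return chain

print(reconstruction_chain(54))  # [54, 52, 48, 32, 0]
```

Each step removes one set bit from the revision number, so the chain is never longer than the number of bits in n, plus one for revision 0.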

Git, on the other hand, uses something more similar to Wikipedia's approach.

Revisions are stored as individually compressed 'loose' objects at first. Periodically, Git takes all of the loose objects, sorts them according to a somewhat complex heuristic, builds compressed deltas between 'nearby' objects, and writes the result as a packfile.
The number of revisions that need to be read to reconstruct a file is bounded by an argument to the pack-building process (the --depth option of git repack and git pack-objects). Because delta bases are chosen by the sort order rather than by history, deltas can in some cases be built between objects that are otherwise unrelated.
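The effect of a depth bound can be sketched as follows. This is a toy store, not Git's packfile format: it always deltas against the immediately preceding revision rather than using Git's sorting heuristic, it works on lists of lines via difflib, and DeltaStore, make_delta, apply_delta, and max_depth are names invented for the example.

```python
import difflib

def make_delta(base, target):
    """Encode `target` as copy/insert operations against `base` (lists of lines)."""
    ops = []
    matcher = difflib.SequenceMatcher(a=base, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # reuse lines from the base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # carry new lines verbatim
        # "delete" needs no operation: those base lines are just not copied
    return ops

def apply_delta(base, ops):
    """Rebuild the target lines from `base` and the operations."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(base[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out

class DeltaStore:
    """Stores revisions as deltas, with a full copy whenever the chain gets too deep."""

    def __init__(self, max_depth=3):
        self.max_depth = max_depth
        self.entries = []  # (kind, payload, depth) per revision

    def add(self, lines):
        if self.entries and self.entries[-1][2] < self.max_depth:
            previous = self.get(len(self.entries) - 1)
            depth = self.entries[-1][2] + 1
            self.entries.append(("delta", make_delta(previous, lines), depth))
        else:
            # First revision, or chain limit reached: store the full text.
            self.entries.append(("full", list(lines), 0))

    def get(self, index):
        kind, payload, _depth = self.entries[index]
        if kind == "full":
            return list(payload)
        return apply_delta(self.get(index - 1), payload)

store = DeltaStore(max_depth=2)
history = [[f"line {j}" for j in range(5)] + [f"edit {i}"] for i in range(7)]
for revision in history:
    store.add(revision)
print([kind for kind, _payload, _depth in store.entries])
```

A larger max_depth saves more space but makes the worst-case read slower, which is the same trade-off the pack-building depth argument controls.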
