Question:
I want to know the exact algorithm (or something close to it) behind ‘git merge’. Answers to at least these sub-questions would be helpful:
- How does git detect the context of a particular non-conflicting change?
- How does git find out that there is a conflict in these exact lines?
- Which things does git auto-merge?
- What does git do when there is no common base for the branches being merged?
- What does git do when there are multiple common bases for the branches being merged?
- What happens when I merge multiple branches at once?
- What is the difference between the merge strategies?
But a description of the whole algorithm would be much better.
Answer:
You might be best off looking for a description of a 3-way merge algorithm. A high-level description would go something like this:
- Find a suitable merge base B: a version of the file that is an ancestor of both of the new versions (X and Y), and usually the most recent such base (although there are cases where it will have to go back further, which is one of the features of git's default recursive merge).
- Perform diffs of X with B and Y with B.
- Walk through the change blocks identified in the two diffs. If both sides introduce the same change in the same spot, accept either one; if one introduces a change and the other leaves that region alone, introduce the change in the final; if both introduce changes in a spot, but they don’t match, mark a conflict to be resolved manually.
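To make the three steps concrete, here is a minimal sketch of a line-based 3-way merge in Python. It is not git's implementation (git uses its own diff engine and handles many more cases); it assumes files are given as lists of lines, uses the standard library's difflib for the two diffs, and the names merge3 and _match_map are made up for this illustration.

```python
from difflib import SequenceMatcher


def _match_map(base, other):
    """Map each index of base to the index of its matching line in other."""
    mapping = {}
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            mapping[a + k] = b + k
    return mapping


def merge3(base, x, y):
    """Merge x and y relative to base. Returns (merged_lines, had_conflict)."""
    mx, my = _match_map(base, x), _match_map(base, y)
    # Sync points: base lines left untouched by BOTH sides, plus an end sentinel.
    sync = [(i, mx[i], my[i]) for i in range(len(base)) if i in mx and i in my]
    sync.append((len(base), len(x), len(y)))

    merged, had_conflict = [], False
    pb = px = py = 0
    for ib, ix, iy in sync:
        # The region between two sync points is one "change block".
        b_chunk, x_chunk, y_chunk = base[pb:ib], x[px:ix], y[py:iy]
        if x_chunk == y_chunk:        # same change on both sides (or no change)
            merged.extend(x_chunk)
        elif x_chunk == b_chunk:      # only Y changed this region
            merged.extend(y_chunk)
        elif y_chunk == b_chunk:      # only X changed this region
            merged.extend(x_chunk)
        else:                         # both changed it differently
            had_conflict = True
            merged.append("<<<<<<< X")
            merged.extend(x_chunk)
            merged.append("=======")
            merged.extend(y_chunk)
            merged.append(">>>>>>> Y")
        if ib < len(base):            # emit the unchanged sync line itself
            merged.append(base[ib])
        pb, px, py = ib + 1, ix + 1, iy + 1
    return merged, had_conflict
```

For example, with base ["a", "b", "c", "d"], merging X = ["a", "B", "c", "d"] and Y = ["a", "b", "c", "D"] takes both edits without a conflict, while two different edits to the same line produce a block between conflict markers.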
The full algorithm deals with this in a lot more detail, and even has some documentation (https://github.com/git/git/blob/master/Documentation/technical/trivial-merge.txt for one, along with the git help XXX pages, where XXX is one of merge-base, merge-file, merge, merge-one-file and possibly a few others). If that’s not deep enough, there’s always source code…
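As for the first step, the merge-base help page describes a best common ancestor as a common ancestor that is not itself an ancestor of another common ancestor. The sketch below illustrates that definition only; it is not how git computes it. The commit graph is assumed to be a plain dict mapping each commit to its list of parents, and the function names are hypothetical.

```python
def ancestors(parents, commit):
    """Return commit plus every commit reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_bases(parents, x, y):
    """Return the set of best common ancestors of x and y."""
    common = ancestors(parents, x) & ancestors(parents, y)
    # Drop any common ancestor that is a proper ancestor of another one.
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }
```

An empty result corresponds to the "no common base" case from the question, and a result with more than one commit (as in a criss-cross history) corresponds to the "multiple common bases" case.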