How to Merge Git Repositories Into One Keeping History

We had many Git repositories, and sometimes we had to implement a new feature across several of them and keep them in sync. That is hard, which means we really needed only one repository for our project. After all, Google keeps everything, literally everything, in one enormous repository, so why would you need a separate Git repository for every microservice, right?

The decision was made: merge them, but with history; no merge without it. It’s pretty easy to split one big repository into several. Going the other way is much harder, with challenges along the way. That’s why I want to offer this guide in case you are facing the same puzzle.

Let’s go! First of all, you need to know the final state you are aiming for. The best way to merge repositories is to first bring them into a state in which they cannot conflict. Let’s say you have repositories A, B, and C, and you want to blend them into X. Think about how you want them laid out in X. Probably the best option is for the new big repository to contain directories A, B, and C, each holding the contents of one original repository. By now you probably get the idea: the first step is to move all files into a directory in each repository, and then merge them.

You can do it simply with git mv, but then the history looks lost. It is not actually lost: you can still blame files and see their history with git log --follow, but a plain git log on the new path stops at the move. That’s why I don’t like this solution. There is a better one, which moves all files into a new directory and rewrites the whole history as if they had been there all along:

git filter-branch --index-filter \
    'git ls-files -s | sed "s-\t\"*-&FOLDER/-" |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
    git update-index --index-info &&
    mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
    ' --tag-name-filter cat -f -- --all

Important: look near the end of the second line; you have to replace FOLDER with the name of your target directory.
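
If you want to try the rewrite safely before touching a real repository, here is a minimal sketch on a throwaway repository. The temporary directory, the demo identity, and the file name file.txt are all made up for illustration, FOLDER is replaced by A, and the sed expression assumes GNU sed (which understands \t).

```shell
#!/bin/sh
# Throwaway demo: rewrite a small repository so every file lives under A/.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1    # skip filter-branch's startup warning and delay

demo=$(mktemp -d)
git init -q "$demo/A"
cd "$demo/A"
git config user.name "Demo User"          # hypothetical identity for the demo commits
git config user.email "demo@example.com"

# Two commits touching the same file, so there is some history to preserve.
echo "one" > file.txt
git add file.txt
git commit -q -m "Add file"
echo "two" >> file.txt
git commit -q -am "Update file"

# Same command as above, with FOLDER replaced by A.
git filter-branch --index-filter \
    'git ls-files -s | sed "s-\t\"*-&A/-" |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
    git update-index --index-info &&
    mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
    ' --tag-name-filter cat -f -- --all >/dev/null

git ls-files                              # every path now starts with A/
git log --oneline -- A/file.txt           # both commits are listed, no --follow needed
```

Checking the result with git ls-files and git log on the new path is a quick way to confirm that the history really was rewritten and not just moved.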

Once you have done this in all your current repositories, you can move on to the merging part. It’s a fairly easy step: add a remote and pull from it. Just repeat the following in the new repository for each of your repositories:

git remote add src_repo_name src_repo_filepath
# name the branch to merge explicitly (master here)
git pull --allow-unrelated-histories src_repo_name master
git remote rm src_repo_name

Important: src_repo_filepath is meant to be just a local path. I don’t recommend pushing the rewritten history back to the origin of the old repositories. It’s good to keep the old repositories untouched for historical purposes or in case something goes wrong.
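
The merging loop can be sketched end to end on throwaway repositories. Everything here is an assumption for illustration: the temporary directory, the demo identity, the two tiny source repositories A and B (created with their files already inside their own directory, standing in for the rewritten originals), and the branch name master. The --no-rebase and --no-edit flags are my additions so that the pull merges without asking questions on newer Git versions.

```shell
#!/bin/sh
# Throwaway demo: merge two unrelated repositories A and B into X.
set -e
work=$(mktemp -d)

# Create an empty repository with a hypothetical identity and a master branch.
init_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    git -C "$1" config user.name "Demo User"
    git -C "$1" config user.email "demo@example.com"
}

# Create a source repository whose only file already lives in its own directory.
make_source() {
    init_repo "$work/$1"
    mkdir "$work/$1/$1"
    echo "$1" > "$work/$1/$1/file.txt"
    git -C "$work/$1" add .
    git -C "$work/$1" commit -q -m "Add $1/file.txt"
}

make_source A
make_source B

init_repo "$work/X"
cd "$work/X"
git commit -q --allow-empty -m "Initial commit"

# Add each source as a temporary remote, merge its master, and drop the remote.
for name in A B; do
    git remote add "src_$name" "$work/$name"
    git pull -q --no-rebase --no-edit --allow-unrelated-histories "src_$name" master
    git remote rm "src_$name"
done

git ls-files                  # A/file.txt and B/file.txt side by side
git log --oneline --graph     # two merge commits joining the unrelated histories
```

Because the files of each source repository live in their own directory, the two merges cannot conflict with each other.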

And now you have your new shiny merged repository, sweet!

Yes, but… what about other branches? You can do the same for every other branch as for the master branch. There are just two cases where that is not enough or is too complicated. For example, I wanted two branches from two repositories to become one branch in the final repository. I didn’t want to make a mistake with some cross-repository merging (mistakes are very easy to make there), so I used a different technique: make patches from the affected commits and apply them in the order I need.

git format-patch -X HASH

Call this in the original repository, on the branch in question. HASH is the hash of the latest commit, and X is the number of commits you want to make patches for. You will then get patch files that you can apply. You can also modify them or merge several commits into one.

# --index also stages the changes, so the commit below picks them up
git apply --index xxx.patch
git commit -m "..." --author ""
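
Here is a small sketch of the whole patch round trip on throwaway repositories. The repository names source and target, the file notes.txt, the commit messages, and the author string are all hypothetical. I use git apply --index so that the applied changes are staged before the commit.

```shell
#!/bin/sh
# Throwaway demo: export the last two commits as patches and apply them elsewhere.
set -e
work=$(mktemp -d)

# Create an empty repository with a hypothetical identity.
init_repo() {
    git init -q "$1"
    git -C "$1" config user.name "Demo User"
    git -C "$1" config user.email "demo@example.com"
}

# Source repository with three commits.
init_repo "$work/source"
cd "$work/source"
echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "line 2" >> notes.txt
git commit -q -am "Add line 2"
echo "line 3" >> notes.txt
git commit -q -am "Add line 3"

# One patch file per commit for the last two commits (X = 2, HASH = HEAD).
git format-patch -q -2 -o "$work/patches" HEAD

# Target repository that only has the first state of the file.
init_repo "$work/target"
cd "$work/target"
echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Apply the patches in order; the numeric prefixes keep the glob sorted.
for patch in "$work"/patches/*.patch; do
    git apply --index "$patch"
    git commit -q -m "Apply $(basename "$patch")" \
        --author "Original Author <author@example.com>"
done

cat notes.txt                 # the file now contains all three lines
```

If you do not need to edit or squash anything, git am applies such patch files and keeps the original author and message for you.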

The second use case for this technique is when someone has a local branch. You can “easily” merge public branches because their history has been rewritten in the same way, so everything lines up. But a colleague may have a local branch that needs to be merged as well. The colleague can use the same technique; before making the patches, they just need to move everything into the same directory (in all commits). For that purpose, they don’t have to run the first, slow command that keeps history, but a faster one:

git subtree add --prefix=FOLDER <repository> <ref>
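
A different route to the same goal, and not the method described above, is to make the patches from the untouched local branch and let Git add the directory prefix while applying them: git am accepts --directory for that. The sketch below assumes throwaway repositories, a hypothetical branch named feature, and a stand-in for the merged repository X that already contains A/notes.txt.

```shell
#!/bin/sh
# Throwaway demo: apply a patch from an unrewritten branch into the A/ directory.
set -e
work=$(mktemp -d)

# Create an empty repository with a hypothetical identity.
init_repo() {
    git init -q "$1"
    git -C "$1" config user.name "Demo User"
    git -C "$1" config user.email "demo@example.com"
}

# Old repository A with a local branch that was never published.
init_repo "$work/A"
cd "$work/A"
echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git checkout -q -b feature
echo "local change" >> notes.txt
git commit -q -am "Local work"

# Export the single local commit as a patch file.
git format-patch -q -1 -o "$work/patches" feature

# Stand-in for the merged repository X, where the file already lives under A/.
init_repo "$work/X"
cd "$work/X"
mkdir A
echo "line 1" > A/notes.txt
git add A
git commit -q -m "Merged history (stand-in)"

# Apply the patch with every path prefixed by A/, keeping author and message.
git am -q --directory=A "$work"/patches/*.patch

git log --oneline -- A/notes.txt   # the local commit now sits on top of the merged history
```

This keeps the colleague’s old repository untouched, at the cost of applying commits one by one rather than merging a branch.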

And that’s it! I hope you will successfully merge it without problems.






5 responses

This post is not in English. It is in Czenglish. Understandable if translated to Czech in your head :)

@Martin Oh yes, I know. I will try to be better over time. :-)

Hi,
What if I want to move directory A somewhere deep in directory B, for example B/components/submodules/A
Should I create the same full folder structure inside repo B and move the repo in it with filter-branch script, so that the merge gives such outcome? "FOLDER for your name of the directory"
By this do you mean the source folder (git repo name itself), or target folder?

@TJ Hi. You can move directory to any directory you want. So just change the mv command properly. The first step is to move files in original repositories to places you want them after the merge. The FOLDER is the target directory.



