How to Merge Git Repositories Into One Keeping History


We had a lot of Git repositories, and sometimes a new feature had to be implemented across several of them and kept in sync. That is hard, and it told us we really needed only one repository for our project. After all, Google keeps nearly everything in one huge repository, so why would you need a separate Git repository for every microservice, right?

The decision was made: merge them, but with history. No merge without history. You know, it’s pretty easy to split one big repository into several smaller ones. Going the other way is much harder and has a few challenges along the way. That’s why I want to share this guide in case you are facing the same step.

Let’s go! First of all, you need to know some theory. The best way to merge repositories is to prepare them so that the merge produces no conflicts. Let’s say you have repositories A, B and C and want to merge them into X. Think about what X should look like afterwards. Probably the best option is for X to contain directories A, B and C, each holding the contents of one original repository. By now you probably get the idea: the first step is to move all files of each repository into its own directory, and only then merge.

You can do the move simply with git mv, but you will lose history. Well, you will not really lose it: you can still blame files and see older commits with git log --follow. But a plain git log on a moved path stops at the move, and that’s why I don’t like this solution. There is a better one, which moves all files into a new directory and rewrites the whole history as if they had been there the entire time:
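If you want to see the git mv limitation for yourself, here is a small throwaway experiment. It is only a sketch: the temporary repository, the file name file.txt and the demo identity are made up for illustration.

```shell
#!/bin/sh
# Throwaway demo: after a plain `git mv`, the default log for the new path
# only shows the move commit; --follow is needed to see the older commits.
set -eu
# Hypothetical identity so commits work on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q .

echo one > file.txt
git add file.txt
git commit -q -m "add file"
echo two >> file.txt
git commit -q -am "edit file"

# Move the file into a subdirectory the simple way.
mkdir FOLDER
git mv file.txt FOLDER/file.txt
git commit -q -m "move file into FOLDER"

# Count the commits each form of `git log` reports for the new path.
plain=$(git log --oneline -- FOLDER/file.txt | wc -l | tr -d ' ')
followed=$(git log --oneline --follow -- FOLDER/file.txt | wc -l | tr -d ' ')
echo "without --follow: $plain commit(s), with --follow: $followed commit(s)"
```

The history is still there, but every tool and every colleague has to remember to ask for it explicitly.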

git filter-branch --index-filter \
    'git ls-files -s | sed "s-\t\"*-&FOLDER/-" |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
    git update-index --index-info &&
    mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
    ' --tag-name-filter cat -f -- --all

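Important: look at the end of the second line, where you have to replace FOLDER with the name of your directory. The command rewrites every branch and tag, so run it in a fresh clone rather than in your only copy.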
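Before running the rewrite on a real repository, it may help to try it on a disposable one and check the result. The sketch below assumes GNU sed (the \t in the sed expression is not understood by every sed) and uses made-up file names. FILTER_BRANCH_SQUELCH_WARNING only silences the notice that newer Git versions print before filter-branch starts.

```shell
#!/bin/sh
# Try the history rewrite on a throwaway repository and inspect the result.
set -eu
# Hypothetical identity so commits work on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo one > a.txt
git add a.txt
git commit -q -m "first"
echo two > b.txt
git add b.txt
git commit -q -m "second"

# The command from the article, with FOLDER kept as the directory name.
git filter-branch --index-filter \
    'git ls-files -s | sed "s-\t\"*-&FOLDER/-" |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
    git update-index --index-info &&
    mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
    ' --tag-name-filter cat -f -- --all >/dev/null

# Every commit, including the very first one, should now list paths under FOLDER/.
first_commit=$(git rev-list --max-parents=0 HEAD)
echo "files in the first commit:"
git ls-tree -r --name-only "$first_commit"
echo "files in the latest commit:"
git ls-tree -r --name-only HEAD
```

If the first commit already shows its files inside FOLDER, the rewrite reached the whole history and not just the tip.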

Once you have done this for all your current repositories, you can move on to the merge. That’s the easy part: add a remote and pull from it. Just repeat the following in the new repository for each of your repositories:

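git remote add src_repo_name src_repo_filepath
# Git 2.9+ needs --allow-unrelated-histories here; newer versions also want
# --no-rebase (or a pull.rebase setting) when the histories have diverged
git pull --no-rebase --allow-unrelated-histories src_repo_name master
git remote rm src_repo_name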

Important: src_repo_filepath is meant to be just a local path to the rewritten clone. I don’t recommend pushing the rewritten history back to the origin of the old repositories. For historical purposes, or in case something goes wrong, it’s good to keep the old repositories untouched.
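Put together, the merge step could look roughly like the sketch below. It builds two stand-in repositories, A and B, whose files already sit in their own directories (as they would after the rewrite), and pulls both into a new repository X. The names, the master branch and the extra pull flags are assumptions for the demo. --allow-unrelated-histories is required by Git 2.9 and later, and --no-rebase --no-edit keep recent Git versions from stopping to ask how to merge.

```shell
#!/bin/sh
# Merge two prepared repositories into a new one, keeping both histories.
set -eu
# Hypothetical identity so commits work on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-ins for the rewritten repositories: each keeps its files in its own directory.
for name in A B; do
    git -c init.defaultBranch=master init -q "$work/$name"
    mkdir "$work/$name/$name"
    echo "$name" > "$work/$name/$name/file.txt"
    git -C "$work/$name" add .
    git -C "$work/$name" commit -q -m "initial commit of $name"
done

# The new repository that receives everything.
git -c init.defaultBranch=master init -q "$work/X"
cd "$work/X"
for name in A B; do
    git remote add "$name" "$work/$name"
    git pull -q --no-rebase --no-edit --allow-unrelated-histories "$name" master
    git remote rm "$name"
done

# Both histories are now joined by a merge commit, and both directories exist.
git log --oneline --graph
git ls-files
```

Because every repository only touches its own directory, the pulls cannot conflict with each other.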

And now you have your new shiny merged repository, nice!

Yes, but… what about other branches? You can do a similar move for all other branches as for the master branches. There are just two use cases where that is not enough or gets too complicated. The first: I had two branches from two repositories that I actually wanted as one branch in the final repository. I didn’t want to make a mistake with some cross-repository merging (and mistakes are very easy to make there), so I used a different technique: make patches from the affected commits and apply them in the order I need.

git format-patch -X HASH

Run this in the original repository, on the original branch. HASH is the hash of the latest commit and X is the number of commits, counting back from HASH, that you want patches for. You will then get one patch file per commit, which you can apply. You can also modify the patches or squash several commits into one.

git apply --index xxx.patch
git commit -m "..." --author ""
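Here is a minimal end-to-end sketch of the patch technique on two throwaway repositories. The file notes.txt, the commit messages and the author string are invented for the demo. git apply --index is used so the applied changes are staged and a plain git commit picks them up.

```shell
#!/bin/sh
# Export the last two commits of one repository as patches and replay them in another.
set -eu
# Hypothetical identity so commits work on a machine without Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Source repository with three commits.
git init -q "$work/src"
cd "$work/src"
for n in 1 2 3; do
    echo "line $n" >> notes.txt
    git add notes.txt
    git commit -q -m "change $n"
done

# X = 2 commits ending at HEAD; one numbered patch file is written per commit.
git format-patch -q -2 -o "$work/patches" HEAD

# Target repository that contains the state before those two commits.
git init -q "$work/dst"
cd "$work/dst"
echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "baseline"

# Apply the patches in order and commit each one under the original author's name.
for p in "$work"/patches/*.patch; do
    git apply --index "$p"
    git commit -q -m "applied $(basename "$p")" \
        --author "Original Author <author@example.com>"
done

git log --format='%an: %s'
```

If you don’t need to edit or reorder anything, git am applies such patch files and keeps the original author and message on its own.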

The second use case for this technique is when someone has a local branch. You can “easily” merge public branches, because they were rewritten together with everything else and their new hashes match, but a colleague may have a local branch that needs to be merged as well. They can use this technique too; before making the patches they just need to move everything (all commits) into the same directory. For that purpose they don’t have to run the first, slow, history-rewriting command, but can use a faster one:

git subtree add --prefix=FOLDER REPOSITORY BRANCH

And that’s it! I hope you will merge everything successfully and without problems.


Note: this is my first post in English. Sorry to those who prefer Czech. Don’t worry, I will still publish some posts in Czech as well. It will depend on how many people I want to share each post with. :-)




