We had many git repositories, and sometimes a new feature had to be implemented across several of them and kept in sync. That is hard, and it made us realize that we really needed just one repository for our project. After all, Google famously keeps almost everything in one enormous repository, so why would you need a separate git repository for every microservice, right?
The decision was made: merge them, but keep the history; no merge without it. You know, it’s pretty easy to split one big repository into several smaller ones. Going the other way is much harder, with challenges along the way. That’s why I want to share this guide in case you are facing the same puzzle.
Let’s go! First of all, you need to know the final state you are aiming for. The best way to merge repositories is to prepare them so that they cannot conflict. Let’s say you have repositories A, B, and C, and you want to blend them into X. Think about how you want them laid out in X. The best option is probably to have directories A, B, and C in the new big repository, each containing one of the original repositories. By now you probably get the idea: the first step is to move all files of each repository into its own directory, and only then merge.
You can do it simply with git mv, and at first it looks like you lose history. Actually, you don’t: you can still blame files and see their history with git log --follow. But git log only follows a file across the move when you pass --follow, and that’s why I don’t like this solution. There is a better one, which moves all files into a new directory and rewrites the whole history as if they had been there all along:
git filter-branch --index-filter \
'git ls-files -s | sed "s-\t\"*-&FOLDER/-" |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info &&
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' --tag-name-filter cat -f -- --all
Important: watch the end of the second line; you have to replace FOLDER with the name of your directory.
Once you have done this for all your current repositories, we can move on to the merging part. This step is fairly easy: add a remote and pull from it. Just repeat the following for each of your repositories:
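To see what the sed step inside the filter does, here is a minimal sketch that applies the same substitution to one made-up line of git ls-files -s output. The blob hash and the path src/main.c are placeholders chosen for illustration. The tab is produced with printf instead of \t, on the assumption that some sed implementations may not understand \t (GNU sed does).

```shell
#!/bin/sh
# Sketch only: shows the effect of the sed expression on a single index line.
# FOLDER, the hash and the path are placeholders, not taken from a real repo.

# A literal tab character, portable across sed implementations.
TAB=$(printf '\t')

# `git ls-files -s` prints: <mode> <blob hash> <stage><TAB><path>
sample="100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0${TAB}src/main.c"

# Match the tab (plus an optional opening quote used for unusual file names)
# and re-emit it followed by the new directory prefix.
rewritten=$(printf '%s\n' "$sample" | sed "s-${TAB}\"*-&FOLDER/-")

# Prints the same line with FOLDER/ inserted in front of the path.
printf '%s\n' "$rewritten"
```

Because the prefix is inserted right after the tab, only the path changes; the mode and blob hash stay the same, so the file contents in every rewritten commit are untouched.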
git remote add src_repo_name src_repo_filepath
git pull --allow-unrelated-histories src_repo_name master
git remote rm src_repo_name
Important: src_repo_filepath is meant to be a local path. I don’t recommend pushing the rewritten repositories back to their origin. It’s good to keep the old repositories untouched for historical purposes or in case something goes wrong.
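The merging steps above can be tried safely on throwaway repositories first. The following is a minimal sketch, assuming two tiny repositories A and B whose files already live in directories A/ and B/ (as they would after the history rewrite), and a branch called master. The temporary paths, commit messages and the demo identity are all made up for illustration. The --no-rebase and --no-edit flags are additions here so that the pull runs non-interactively on recent git versions.

```shell
#!/bin/sh
# Sketch only: merge two unrelated throwaway repositories into a third one.
set -e

# Hypothetical identity so commits work without any global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Create repository $1 with a single commit whose file lives under $1/.
make_repo() {
    git init -q "$work/$1"
    git -C "$work/$1" symbolic-ref HEAD refs/heads/master
    mkdir "$work/$1/$1"
    echo "hello from $1" > "$work/$1/$1/file.txt"
    git -C "$work/$1" add .
    git -C "$work/$1" commit -q -m "Add $1/file.txt"
}
make_repo A
make_repo B

# The new big repository X starts with one empty commit.
git init -q "$work/X"
git -C "$work/X" symbolic-ref HEAD refs/heads/master
git -C "$work/X" commit -q --allow-empty -m "Start merged repository"

# Add remote, pull, remove remote: repeated for every source repository.
for name in A B; do
    git -C "$work/X" remote add "$name" "$work/$name"
    git -C "$work/X" pull -q --no-rebase --no-edit \
        --allow-unrelated-histories "$name" master
    git -C "$work/X" remote rm "$name"
done

merged_files=$(git -C "$work/X" ls-files)
commit_count=$(git -C "$work/X" rev-list --count HEAD)
printf '%s\n' "$merged_files"
echo "commits in X: $commit_count"
```

Since the two source repositories never touch the same paths, neither pull can produce a conflict, which is the whole point of moving the files into directories first.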
And now you have your new shiny merged repository, sweet!
Yes, but… what about other branches? You can repeat the same procedure for every other branch, just as for the master branch. There are just two use cases where that is not enough or gets too complicated. The first: I wanted two branches from two repositories to end up as one branch in the final repository. I didn’t want to make a mistake with cross-repository merging (which is very easy to do), so I used a different technique: create patches from the affected commits and apply them in the order I need.
git format-patch -X HASH
Run this in the original repository, on the original branch. HASH is the hash of the latest commit, and X is the number of commits you want to create patches for. You will then get .patch files that you can apply. You can also modify them or combine several commits into one.
git apply --index xxx.patch
git commit -m "..." --author ""
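The patch workflow above can be sketched end to end on throwaway repositories. This is only an illustration under stated assumptions: the repositories src and dst, the file notes.txt, the commit messages and the author string are all hypothetical. The sketch uses git apply --index so that the changes from each patch are staged before committing.

```shell
#!/bin/sh
# Sketch only: export the last two commits as patches and replay them elsewhere.
set -e

# Hypothetical identity so commits work without any global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Source repository with three commits on master.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
    echo "line $n" >> "$work/src/notes.txt"
    git -C "$work/src" add notes.txt
    git -C "$work/src" commit -q -m "Change $n"
done

# Target repository that only has the first commit.
git clone -q "$work/src" "$work/dst"
git -C "$work/dst" reset -q --hard HEAD~2

# One patch file per commit for the last 2 commits ending at HEAD.
git -C "$work/src" format-patch -q -o "$work/patches" -2 HEAD

# Apply the patches in order; the glob sorts 0001-..., 0002-... correctly.
for p in "$work"/patches/*.patch; do
    git -C "$work/dst" apply --index "$p"
    git -C "$work/dst" commit -q -m "Apply $(basename "$p")" \
        --author "Original Author <author@example.com>"
done

applied_count=$(git -C "$work/dst" rev-list --count HEAD)
echo "commits in dst: $applied_count"
```

Committing after each patch keeps one commit per original commit; applying several patches before a single commit would squash them into one.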
The second use case for this technique is when someone has a local branch. You can “easily” merge public branches because the rewrite changed their hashes consistently, so they still match, but a colleague may have a local branch that needs to be merged as well. The colleague can use the same patch technique; before making the patches, they just need to move everything into the same directory (in all commits). For that purpose they don’t have to run the first, slow, history-preserving command; a faster one will do:
git subtree add --prefix=FOLDER
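As a possible alternative that is not the approach described above, git am accepts a --directory option that prepends a directory to every path in a patch, so patches made in the old layout can land under FOLDER without touching the old repository at all. The following is a minimal sketch under stated assumptions: the repositories src and X, the file app.txt and the commit messages are hypothetical.

```shell
#!/bin/sh
# Sketch only: replay a commit from the old layout into FOLDER/ of the new one.
set -e

# Hypothetical identity so commits work without any global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Old repository: app.txt sits at the top level.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/src/app.txt"
git -C "$work/src" add app.txt
git -C "$work/src" commit -q -m "Initial"

# Merged repository: the same file now lives under FOLDER/.
git init -q "$work/X"
git -C "$work/X" symbolic-ref HEAD refs/heads/master
mkdir "$work/X/FOLDER"
cp "$work/src/app.txt" "$work/X/FOLDER/app.txt"
git -C "$work/X" add .
git -C "$work/X" commit -q -m "Import under FOLDER"

# The colleague's local commit, still made against the old layout.
echo "v2" >> "$work/src/app.txt"
git -C "$work/src" commit -q -a -m "Local work"
git -C "$work/src" format-patch -q -o "$work/patches" -1 HEAD

# Apply the patch with every path prefixed by FOLDER/.
git -C "$work/X" am -q --directory=FOLDER "$work"/patches/*.patch

patched=$(cat "$work/X/FOLDER/app.txt")
subject=$(git -C "$work/X" log -1 --format=%s)
echo "$subject"
```

Unlike git apply followed by git commit, git am takes the author and commit message from the patch file itself, so the replayed commit keeps its original subject.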
And that’s it! I hope you will successfully merge it without problems.