A Sensible Way to Organize your Repositories Locally

Tags: github, gitlab, dev, git
Date: Feb 23, 2021

If you interact with any kind of code, you are probably using Git. Git is an open-source version control system that tracks the changes in a project, also known as a repository. Platforms like GitHub.com and GitLab.com host these repositories remotely so multiple people can collaborate and contribute. You can clone a repository locally to work on it, commit your changes, and push them to the remote repository so they are incorporated into the latest version.
So far so good.

What is the problem?

Depending on how many repositories you interact with on a day-to-day basis, you can easily lose track of where exactly they live on your local machine. You may just want to use a ~/projects/ folder and put everything there. It is straightforward, but other problems arise:
  • Sooner or later that folder becomes a huge mess with tons of projects, which makes it harder to find a particular one.
  • You try to clone a project twice because you don't remember that you have downloaded it before.
  • You lose track of the origin of the repository, meaning both where the remote is hosted (GitHub, GitLab, Bitbucket) and who it belongs to (personal, organization, open-source project...).
  • Maybe you want to separate your work and personal projects and use different users and keys.
This is just an organizational problem, so all we need is a simple system that puts the repositories in order and keeps them that way.

Solution

Replicate the structure of the remote repositories on your local machine
This is the very simple rule I follow to organize my repositories. The algorithm is as follows:
  1. Create a folder for your projects. E.g. ~/projects/
  2. For each git platform you use (GitHub, GitLab, etc.) create a single folder. E.g. ~/projects/github.com/ or ~/projects/gitlab.com/. I like to use the full domain name so I can distinguish between enterprise GitHub/GitLab instances and the public ones.
  3. For each repository on a platform, use the path in its URL as the directory structure. E.g. let's say I want to work on https://github.com/apache/spark, then I will clone the repository into ~/projects/github.com/apache/spark
That is the basic idea. This way you have a 1:1 mapping between the remote repositories and the directories on your local machine, which makes projects quick to locate. You can also tell a fork from the original at a glance, because the owner is part of the path. It is especially useful with GitLab, where projects can sit under an arbitrary number of nested subgroups.
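The three steps above can be sketched as a small shell helper. This is only an illustration, not part of the original workflow: the function names repo_dir and clone_organized and the PROJECTS_ROOT variable are made up for this example, and it assumes plain https://host/path or git@host:path URLs without a port number.

```shell
# Base folder for all clones; ~/projects is the example used in the text.
# PROJECTS_ROOT is a hypothetical variable introduced for this sketch.
PROJECTS_ROOT="${PROJECTS_ROOT:-$HOME/projects}"

# Print the local directory that mirrors a remote URL.
repo_dir() {
  url=${1%/}        # drop a trailing slash, if any
  url=${url%.git}   # drop a trailing ".git", if any
  case $url in
    *://*)          # https://host/owner/repo or ssh://git@host/owner/repo
      rest=${url#*://}
      repo_host=${rest%%/*}
      repo_path=${rest#*/}
      repo_host=${repo_host#*@}   # strip "user@" in front of the host
      ;;
    *@*:*)          # scp-style: git@host:owner/repo
      rest=${url#*@}
      repo_host=${rest%%:*}
      repo_path=${rest#*:}
      ;;
    *)
      echo "unsupported URL: $1" >&2
      return 1
      ;;
  esac
  printf '%s/%s/%s\n' "$PROJECTS_ROOT" "$repo_host" "$repo_path"
}

# Clone a repository into its mirrored location, skipping existing clones.
clone_organized() {
  dest=$(repo_dir "$1") || return 1
  if [ -d "$dest/.git" ]; then
    echo "already cloned at $dest"
    return 0
  fi
  mkdir -p "$(dirname "$dest")"
  git clone "$1" "$dest"
}

# Usage:
#   clone_organized https://github.com/apache/spark
```

Because the destination is derived from the URL, trying to clone the same project twice lands on the same directory, so the duplicate is caught before anything is downloaded.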

Advanced usage

Let's say that you want to keep your accounts separate. For example, your company runs GitHub Enterprise at github.company.com and you also have a personal account on github.com. You don't want to make commits with your personal email or username in company repositories, or the other way around! How can we avoid this?
The username and email are managed in your ~/.gitconfig file. They can also be set from the command line, for example with git config --global user.email "you@example.com". However, you are not restricted to a single .gitconfig file. You can have as many as you want and include them from your global ~/.gitconfig file. Moreover, the include can be conditional, like this:
    [user]
        email = hi@pablosanjose.com
        name = pablo
    [includeIf "gitdir:~/projects/github.com/"]
        path = ~/projects/github.com/.gitconfig
    [includeIf "gitdir:~/projects/github.company.com/"]
        path = ~/projects/github.company.com/.gitconfig
This way you can have a different configuration for each of the sites you use to host your repositories. Then it is just a matter of adding a [user] section to those .gitconfig files.
Note that the order is important. If a setting is defined more than once, the last value read wins. This is why the [user] section sits at the top of the file in the example: it acts as the default, and an included file that matches can override it.
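As a rough sketch of that last step, the helper below writes a per-site .gitconfig containing only a [user] section. The function name write_identity and the work email and name are placeholders invented for this example, and conditional includes need a reasonably recent Git (includeIf was added in Git 2.13).

```shell
# Write a per-site identity file such as ~/projects/github.company.com/.gitconfig.
# write_identity is a hypothetical helper; the email and name are placeholders.
write_identity() {
  site_dir=$1
  email=$2
  name=$3
  mkdir -p "$site_dir"
  printf '[user]\n\temail = %s\n\tname = %s\n' "$email" "$name" \
    > "$site_dir/.gitconfig"
}

# Usage:
#   write_identity ~/projects/github.company.com pablo@company.example "Pablo"
#
# Then, from inside any repository cloned under that folder, check which
# file the effective value comes from:
#   git config --show-origin --get user.email
```

Running git config --file with the path of the per-site file and a key/value pair should produce an equivalent file, if you prefer to let Git write it.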