Default namespace in Gitlab: hosting repositories at the root

Recently, we deployed Gitlab at Infinit, moving from gitolite, mainly for the code review and the continuous integration features. To preserve our historical repository URLs, I had to hack around a bit since Gitlab forces you to organise repositories in folders. Skip to TL;DR if you just want my solution.

The default namespace and why we need it

Gitlab organises projects in groups that act like namespaces: you can access your project dashboard via http://<hostname>/<group>/<project> and clone your repository via git clone git@<hostname>:<group>/<project>. This architecture is well suited to large organisations, or to setups that enable the fork feature, where everyone can and should create new repositories: it's tidier and avoids name collisions. However, for a small company such as a startup, or for someone managing personal projects, this is overkill. At Infinit, we have a few dozen repositories that all belong to the "Infinit" group, hence clone URLs are always of the git@git.infinit.sh:infinit/<repo> form, which is cumbersome and unneeded: git@git.infinit.sh:<repo> would be sufficient.

Gitlab used to have a "default namespace" enabling you to access a set of repositories directly. This would solve the problem, but the feature was dropped in version 6. This does not seem to be a massive deal: use the group and type an additional infinit every now and then, some would say. And that's actually the official answer to people asking for the feature back: it's tidier, fix your flow, just accept it, deal with it. Except that one size does not fit all.

If it was just about sometimes typing the additional "infinit", I would have settled for it, although it's quite frustrating to have to compromise with your brand new shiny system. However, there's more to it than that. Those repositories have been around for years with the previous URL, and the migration would not be easy. Everyone and every tool in the company would have to change their setup. Far worse: we make extensive use of submodules, and one thing git submodules are really not good at is changing URLs. Even with git submodule deinit, I always end up having to remove part of my .git directory, and I do not fancy sitting next to dozens of people helping them fix it and patching tons of servers. Especially when some people are questioning whether the change is worth it and you're assuring them the migration will be a piece of cake; if the whole company is a mess for one week because of broken submodules, that's going to be on you.

Final nail in the coffin: we work very, very hard here to make sure our builds are entirely reproducible and deterministic. We can reproduce any past version of our code thanks to submodules, pinned dependencies and stateless build environments. Changing the URL of git repositories would mean that any past build would be broken, since submodules would now refer to invalid URLs. This makes the "accept it" approach go from very inconvenient to unacceptable.

How to bypass this additional directory in Gitlab git hosting

I don't speak Ruby fluently - and really don't want to - and I don't know much about Rails, so patching Gitlab itself isn't really practical. Plus, the development team stresses that the notion was entirely removed, so bringing it back would probably require patching dozens of places, making it very hard to get right.

On the other hand, I know a lot about git. Cloning, fetching and pushing over SSH (git clone git@host:repo) is just a matter of invoking git-upload-pack or git-receive-pack on the server over SSH. Most git server management systems force the execution of a script when you connect to the git account with your SSH keys. That script checks that you are invoking one of the two git commands and just forwards them. This is a place where we can sneak in and hack something to get what we want.

$ head -n 1 ~git/.ssh/authorized_keys
command="/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell key-1",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq1pXuFI8l8MopHufZ4S3fe+WoR5wgeaPtZhw9IFuHZ+3F7V7fCzy76gKp5EPz5sk2Dowd90d+TuEUjUUkI0fRLJipRPjo2reFsuOAZ244ee/NLtG601vQUS/sV8ow2QZEAoNAiNZQGr4jEqvmjIB+rwOmx9eUgs887KjUYlX+wH5984EAr/qd62VddYXga8o4T2QX4GlYik/s/yKm0dlCQgZXQPYM5Wogv6KluGdLFKBaNc2HYkGEArZE51sATRcDOSQcycg2sGuwfL/LfClsCkx2LSYjJh9qkiBNUsAg+LeRt/9Hv3S32tcMszCph3nSX5u+1yz8VURHjVGh9ptAw== mefyl@alucard
$ 

Examining how gitlab handles SSH connections

Haha, gitlab indeed invokes the /opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell script when you log into the git account. This is what the command="..." part is for. Note that the location of the script may differ on your installation. Let's see that script:

#!/opt/gitlab/embedded/bin/ruby
# Fix the PATH so that gitlab-shell can find git-upload-pack and friends.
ENV['PATH'] = '/opt/gitlab/bin:/opt/gitlab/embedded/bin:' + ENV['PATH']

#!/usr/bin/env ruby

unless ENV['SSH_CONNECTION']
  puts "Only ssh allowed"
  exit
end

key_id = /key-[0-9]+/.match(ARGV.join).to_s
original_cmd = ENV['SSH_ORIGINAL_COMMAND']

require_relative '../lib/gitlab_init'

#
#
# GitLab shell, invoked from ~/.ssh/authorized_keys
#
#
require File.join(ROOT_PATH, 'lib', 'gitlab_shell')

if GitlabShell.new(key_id).exec(original_cmd)
  exit 0
else
  exit 1
end

Content of /opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell

This is what I'm talking about: the script sets up a few things, such as the PATH so it can find git and friends, and forwards the command to a GitlabShell instance that knows what to do. We can hook in here and alter the command: if someone performs a git operation on a repository outside of any directory, simply prepend a default one, "infinit/" in my case.

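# Insert after original_cmd is read from SSH_ORIGINAL_COMMAND and before
# GitlabShell.new(key_id).exec(original_cmd).
# original_cmd is nil when no command was given over SSH, hence the to_s.
words = original_cmd.to_s.split(" ")
if words[0] =~ /\Agit-(upload|receive)-pack\z/ && words[1] && !words[1].include?("/")
  # git quotes the path ('backend'): drop the opening quote, then re-add it
  # in front of the default group.
  original_cmd = ([words[0], "'infinit/" + words[1][1..-1]] + words[2..-1]).join(" ")
  # $stderr.puts "Rewrote git command to: #{original_cmd}"
end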

Forgive my ruby, I even had to google how to write a conditional statement

This snippet will for instance intercept any command of the form git-upload-pack <repository> ... and rewrite it as git-upload-pack infinit/<repository> ...; of course, replace "infinit" with your own default directory. Et voilà, we can now git clone git@git.infinit.io:backend and it will automatically map to git@git.infinit.io:infinit/backend. Standard Gitlab URLs still work, as we only prepend our directory at the git level if none was specified.
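For anyone who wants to try the rewriting logic outside of the gitlab-shell script, here is a minimal standalone sketch of the same idea. The helper name with_default_group and its group parameter are my own and are not part of gitlab-shell; it also assumes that parsing the command with Ruby's standard Shellwords module is acceptable, which is a choice of mine rather than what the snippet above does.

```ruby
require "shellwords"

# Hypothetical helper (not part of gitlab-shell): returns the SSH command with
# the default group prepended when the repository path contains no directory.
def with_default_group(command, group = "infinit")
  # Shellwords.split also strips the quotes git puts around the path.
  verb, repo, *rest = Shellwords.split(command.to_s)
  # Only touch the two git transport commands.
  return command unless verb.to_s =~ /\Agit-(upload|receive)-pack\z/
  # Leave the command alone if there is no path or it already has a directory.
  return command if repo.nil? || repo.include?("/")
  ([verb, "'#{group}/#{repo}'"] + rest).join(" ")
rescue ArgumentError
  # Unbalanced quotes: hand the command over untouched.
  command
end

puts with_default_group("git-upload-pack 'backend.git'")
# prints: git-upload-pack 'infinit/backend.git'
puts with_default_group("git-receive-pack 'infinit/backend.git'")
# prints: git-receive-pack 'infinit/backend.git'
```

Keeping the logic in a function like this makes it easy to check a few commands by hand before pasting anything into the live script.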

One slight drawback: this script is regenerated every time Gitlab is updated, so remember to reinsert our path patcher when you update.

Going further: git over HTTP?

You may have noticed my solution only works for git SSH operations and does not include HTTP: we still require git clone https://git.infinit.io/infinit/backend.git. I don't really care in my specific case as we only use SSH, and in the rare case when we want to use HTTP, we can still type that additional "infinit/". However, this could probably be hacked using a similar solution, except this time at the HTTP server level: add some rewrite rule in your NGINX/Apache/... configuration that prepends your default directory to /[...]+/[...]+\.git urls.
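To make the idea concrete, here is a rough Ruby sketch of the path mapping such a rewrite rule would have to perform. I have not tried this against a real Gitlab instance: the helper name with_default_group_path is hypothetical, and it assumes that repository URLs end in .git and that the function only sees the path, without the query string.

```ruby
# Hypothetical helper: shows the mapping an HTTP rewrite rule would apply.
# It prepends the default group when the first path segment is a bare
# "<repo>.git", e.g. "/backend.git" or "/backend.git/info/refs".
def with_default_group_path(path, group = "infinit")
  if path =~ %r{\A/[^/]+\.git(/|\z)}
    "/#{group}#{path}"
  else
    # Already inside a group, or not a repository path: leave untouched.
    path
  end
end

puts with_default_group_path("/backend.git/info/refs")
# prints: /infinit/backend.git/info/refs
puts with_default_group_path("/infinit/backend.git/info/refs")
# prints: /infinit/backend.git/info/refs
```

The same pattern would then have to be translated into whatever rewrite syntax your HTTP server uses.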

TL;DR

How to host the repositories of one of your Gitlab groups at the root level over SSH?

  • On your Gitlab box, look at ~git/.ssh/authorized_keys and find the script passed to command= at the beginning of every line. In my case, it is /opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell.
  • Edit that script and squeeze in the code snippet mentioned earlier, replacing "infinit" with the Gitlab group containing your repositories.
  • Remember to reinsert that code every time the script is overwritten by a Gitlab update.