The Next Generation of Code Hosting Platforms

February 12, 2016    code federation git github hacking selfhosting

CC BY-SA 2.0 by Christiaan Colen

Over the last few weeks there have been a lot of rumors about GitHub. GitHub is a code hosting platform which tries to make it as easy as possible to develop software and collaborate with people. GitHub’s main achievement is probably that it moved the social part of software development to a completely new level. As more and more Free Software initiatives started using GitHub, it became really easy to contribute a bug fix or a new feature to the third-party library or application you use. With a few clicks you can create a fork, add your changes and send them back to the original project as a pull request. You don’t need to create a new account, don’t need to learn the tools used by the project, etc. Everybody is on the same platform and you can contribute immediately. In many cases this improves the collaboration between projects a lot. The ability to easily mention developers of other projects in your pull request or issue has also improved the social interactions between developers and makes collaboration across different projects the default.

Those are the good parts of GitHub, but there are also bad parts. GitHub is completely proprietary, which makes it impossible to fix or improve things yourself or to run it on your own. Benjamin Mako Hill already argued in 2010 why this is a problem and why Free Software needs free tools. More and more people seem to realize that this can create serious problems, and a large group of active and influential GitHub users sent a letter to GitHub which ends with:

“Hopefully none of these are a surprise to you as we’ve told you them before. We’ve waited years now for progress on any of them. If GitHub were open source itself, we would be implementing these things ourselves as a community — we’re very good at that!”

I can’t stress this argument enough. The Free Software community is a community of people who are used to doing things rather than just consuming them. If we use a third-party library and find a bug or need a feature, we don’t just complain; instead we look at the code, try to fix it and provide a patch upstream. We could do the same for the tools we use. But we need to be able to do it. It has to be Free Software.

Now a lot of rumors and discussion revolve around the news that GitHub is undergoing a full-blown overhaul as execs and employees depart. Some people even predict that this will be the end of GitHub.

Wait for it. Three months from now, GitHub introduces “features” no-one wants or needs. 12 months from now, the exodus.

— Pieter Hintjens (@hintjens) February 7, 2016

It seems that for a long time many people underestimated the lock-in effect of the new hosting platforms such as GitHub. Now they are starting to realize that it might be easy to export the Git repository, but what about the issue tracker, the wiki, the CI integration, all the social interaction and collaboration between the projects, and all the useful scripts written for the GitHub API? You can’t clone all this stuff easily and move on.

I don’t want to go deeper into the discussion about what’s going on at GitHub and what will happen next. There are plenty of articles and discussions about it; you can read some of them if you follow the links in this blog post.

At the moment the ESLint initiative is discussing the option of moving away from GitHub, and by reading the comments you can get an idea of the lock-in effect I’m talking about. With the growing dissatisfaction, and with people realizing that they are sitting in a “golden cage”, I have the feeling that we might have an opportunity to think about the next generation of code hosting platforms and what they should look like.

Some of you may remember how Git, the tool which is used as the underlying technology of GitHub, came into existence. Ironically, Git was born for reasons quite similar to those from which the next generation of source code hosting platforms might arise. Before Git, the Linux kernel developer community used BitKeeper. BitKeeper is a proprietary source control management system. The developers decided to use it because, from a technical point of view, BitKeeper was so much better than what we had until then, mainly SVN and CVS. The developers enjoyed the tool and didn’t think about the problems such a dependency could create. At some point the copyright holder of BitKeeper withdrew gratis use of the product after claiming that Andrew Tridgell had reverse-engineered the BitKeeper protocols. The Linux kernel community had to move on, and Linus Torvalds wrote Git.

Back to the next generation of source code hosting and collaboration platforms. It is easy to find Free Software to run your own Git repository, an issue tracker and a wiki. But in 2016 I think that this is no longer enough. As described before, the crucial part is to connect software initiatives and developers and to make the interaction between them as easy as possible. That’s why traditional code hosting platforms such as Savannah are no longer a real option for many projects. I think the next generation code hosting platform needs to work in a decentralized way. Every project should be able to either host its own platform or choose a provider freely without losing the connection to other software initiatives and developers. This development, from proprietary and centralized solutions to centralized Free Software solutions to federated Free Software solutions, is something we already saw in the area of social networks and cloud services. Maybe it is worth looking at what they already achieved and how they did it.

To make the same transition happen for code hosting platforms we need implementations based on Free Software, Open Standards and protocols which enable this kind of federation. The good news is that we already have most of them. Git by itself is already a distributed revision control system and doesn’t need a central server for collaboration. What’s missing is a nice web interface to glue all these parts together: an issue tracker, a wiki, good integration with Free Software CI tools, good APIs and of course Git. This will enable us to fork projects across servers, send pull requests, interact with other developers and comment on issues, no matter if they are on the same server or not. Chances are high that we will find a suitable protocol by looking at the large number of federated social networks. By choosing an existing protocol of an established federated social network we could even provide a tight integration with traditional social networks, which could provide additional benefits beyond what we already have. The hard part will be to pull all this together. Will it happen? I don’t know. But I hope that after we have seen the rise and fall of SourceForge, Google Code and maybe at some point GitHub, we will move on to create something more sustainable instead of building the next data silo and waiting until it fails again.
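To make the idea of a pull request across servers a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the message format, the class name PullRequestNotice, all field names and the example URLs are hypothetical and not taken from any existing protocol. It shows the kind of small, self-describing message one server could send to another to say "please pull this branch from a fork hosted elsewhere".

```python
import json
from dataclasses import dataclass, asdict
from urllib.parse import urlparse


@dataclass
class PullRequestNotice:
    """Hypothetical message one server sends to another to propose a merge."""
    source_repo: str    # clone URL of the fork (may live on any Git server)
    source_branch: str  # branch containing the proposed changes
    target_repo: str    # clone URL of the upstream project
    target_branch: str  # branch the changes should be merged into
    author: str         # user@host address, as used by federated social networks
    title: str          # short human-readable summary


def encode_notice(notice: PullRequestNotice) -> str:
    """Serialize the notice to JSON for delivery to the upstream server."""
    return json.dumps(asdict(notice), sort_keys=True)


def decode_notice(payload: str) -> PullRequestNotice:
    """Rebuild the notice on the receiving server."""
    return PullRequestNotice(**json.loads(payload))


def is_cross_server(notice: PullRequestNotice) -> bool:
    """True when the fork and the upstream project live on different hosts."""
    source_host = urlparse(notice.source_repo).hostname
    target_host = urlparse(notice.target_repo).hostname
    return source_host != target_host


if __name__ == "__main__":
    notice = PullRequestNotice(
        source_repo="https://git.alice.example/alice/lib.git",
        source_branch="fix-typo",
        target_repo="https://code.project.example/project/lib.git",
        target_branch="master",
        author="alice@git.alice.example",
        title="Fix typo in README",
    )
    payload = encode_notice(notice)
    print(payload)
    print(is_cross_server(decode_notice(payload)))
```

Since Git can already fetch from any reachable repository, the receiving server would only need the two clone URLs and branch names to inspect the proposed changes. The open questions are how such a message gets delivered and how the author is authenticated, which is where an existing federation protocol could help.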

Feel free to drop me a mail if you want to share your thoughts on this topic or if you want to discuss it further.