Wednesday, May 5, 2010

Open source bioinformatics

All the code developed at SGN is "open source". Why the quotes? Because so far, no one could really see our code. Of course, if you requested it, we provided it. But that is not the whole story of open source. Ideally, you want to develop code collaboratively with others in the community and merge in the improvements they make. Until now, however, merging changes from outside SGN back into the code base was too cumbersome: it basically required giving other people full access to our filesystem, which is dangerous.

The version control system git has really changed that. Every git "checkout" (a clone, in git's terms) contains the entire history of the project, which means that there is no longer a true central repository: all copies are equivalent. Which repository counts as the central one is just a convention. There are websites like http://github.com that host git repositories (for small repositories, it is even free). We have exported our git repositories to github.com, where anyone in the world can 'clone' a repository, make changes, and send those changes back to be merged into our repository, a step that git makes particularly simple.

Our git repositories can be accessed at http://github.com/solgenomics.
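
To make the clone-and-change workflow described above more concrete, here is a minimal sketch in Python that builds the git commands a contributor might run. The repository name "example-repo", the branch name "my-fix", and the helper functions are hypothetical placeholders, not names taken from our repositories; the sketch also assumes a git version that supports the -C option.

```python
import subprocess


def contribution_commands(repo_url, target_dir, branch):
    """Build the git commands for cloning a repository, inspecting its
    history, and starting a local branch for changes."""
    return [
        # Copy the repository, including its entire history.
        ["git", "clone", repo_url, target_dir],
        # List the history that came along with the clone.
        ["git", "-C", target_dir, "log", "--oneline"],
        # Create and switch to a new branch for local changes.
        ["git", "-C", target_dir, "checkout", "-b", branch],
    ]


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # "example-repo" and "my-fix" are hypothetical placeholders.
    commands = contribution_commands(
        "http://github.com/solgenomics/example-repo.git",
        "example-repo",
        "my-fix",
    )
    # Only print the commands here; call run_commands(commands) to execute them.
    for command in commands:
        print(" ".join(command))
```

Run as a script, this only prints the commands. Passing the list to run_commands would execute them, which requires git to be installed and a repository that actually exists at the given URL.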