Bioinformatics Software Releases Best Practices

We write a lot of software, and we release all of it as open source. That has its good and bad sides, and it is often a challenge, especially when people graduate and move on. Here are some of the best practices that we have learnt over the years.

Provide test data sets and examples

This should be a no-brainer, but the most critical thing for ensuring the longevity of your code is providing a test data set and explaining the answer that you should get.

When we download and install your software on our machines, we want to be able to run the test suite and get an answer. This tells us that the software has been installed correctly, it tells us what format the input data should be, and it shows us what a working output should look like.
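As a rough sketch of what such a test could look like in Python: the snippet below runs a tool on a tiny data set and compares the result with the documented answer. The names count_fasta_records, TEST_FASTA and EXPECTED_RECORDS are made up for illustration, and the "tool" is just a stand-in that counts FASTA records. In a real release the test file would ship with the code rather than being written to a temporary directory.

```python
import tempfile
from pathlib import Path

# Hypothetical test data: two records, so the documented answer is 2.
TEST_FASTA = ">seq1\nACGT\n>seq2\nGGCC\n"
EXPECTED_RECORDS = 2


def count_fasta_records(fasta_path):
    """Stand-in for the real tool: count the '>' header lines in a FASTA file."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def run_smoke_test():
    """Run the tool on the test data and compare with the documented answer."""
    with tempfile.TemporaryDirectory() as tmp:
        test_file = Path(tmp) / "test.fasta"
        test_file.write_text(TEST_FASTA)
        observed = count_fasta_records(test_file)
    if observed != EXPECTED_RECORDS:
        # Say clearly what was expected and what we actually got
        raise SystemExit(
            f"Test FAILED: expected {EXPECTED_RECORDS} records, got {observed}"
        )
    print(f"Test passed: found {observed} records as expected")
    return observed
```

Calling run_smoke_test() right after installation tells the user whether the install worked, and reading TEST_FASTA shows them what the input should look like.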

You should have test data sets for all aspects of your code, and the errors should be clear and obvious. Don’t obfuscate an error! If you rely on third-party packages for your code to run, you should check that they are installed and let the user know if any are missing. In addition, don’t test for one package and die. Most package managers allow you to install multiple packages at once, so I’d rather have a list of all the packages that need to be installed (e.g. a requirements.txt file) than have to keep going through endless cycles of install-test-install-test.
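One way to report every missing package in a single pass, sketched in Python with the standard library's importlib.util.find_spec. The function names and the idea of passing in a list of import names are assumptions for this example. Note that the import name of a package can differ from the name you install it under, so the message points at requirements.txt rather than guessing.

```python
import importlib.util
import sys


def find_missing(packages):
    """Return every top-level package in `packages` that cannot be imported."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in packages if importlib.util.find_spec(name) is None]


def check_dependencies(packages):
    """Exit with one clear message listing ALL missing packages, not just the first."""
    missing = find_missing(packages)
    if missing:
        sys.exit(
            "Missing required packages: " + ", ".join(missing)
            + "\nInstall everything at once with: pip install -r requirements.txt"
        )
```

Calling check_dependencies with the list of modules your code imports, before doing any real work, means the user sees the whole list up front instead of discovering the gaps one at a time.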

Document your software

No one reads documentation, but we need at least brief documentation of the input and output formats. Tell us; don’t make us reverse engineer your output.

Provide a help menu

If I run your code with no options, don’t die with a weird message. Exit gracefully with a help menu that reminds me what the options are. That same menu should also appear if I use -h as an option.
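In Python, argparse gives you -h for free, and printing the same help when there are no arguments takes only a few lines. This is a minimal sketch: the program name mytool and the -f/--fasta option are invented for the example, and returning a non-zero status when nothing was requested is a judgment call.

```python
import argparse
import sys


def build_parser():
    parser = argparse.ArgumentParser(
        prog="mytool",  # hypothetical program name
        description="Count sequences in a FASTA file (example tool).",
    )
    parser.add_argument("-f", "--fasta", help="input FASTA file")
    return parser


def main(argv=None):
    argv = sys.argv[1:] if argv is None else argv
    parser = build_parser()
    if not argv:
        # No options given: show the same menu that -h prints, then stop
        parser.print_help(sys.stderr)
        return 1
    args = parser.parse_args(argv)
    print(f"Would process {args.fasta}")
    return 0
```

In a script you would finish with sys.exit(main()) so that the return value becomes the exit status.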

Provide a version number

Use -v or --version (or both) as an option for me to get the current version of your software. Then when it is updated I’ll know that I have an out of date version!
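With argparse this is a single add_argument call using the built-in "version" action. The program name and version string below are placeholders.

```python
import argparse

__version__ = "0.1.0"  # placeholder version string


def build_parser():
    parser = argparse.ArgumentParser(prog="mytool")  # hypothetical program name
    # Both -v and --version print the version and exit
    parser.add_argument(
        "-v", "--version",
        action="version",
        version=f"%(prog)s {__version__}",
    )
    return parser
```

Keeping the version in one variable such as __version__ means the number reported on the command line cannot drift from the one used elsewhere in the code.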

Provide the citation somewhere

You should put the citation in the help menu. You want us to cite your software – make sure we get the right citation!
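If the help menu is built with argparse, the epilog is a natural home for the citation. A small sketch, with a placeholder where the real reference would go:

```python
import argparse

# Placeholder text: replace with the real reference for your software
CITATION = "If you use this software, please cite: <your citation here>"


def build_parser():
    return argparse.ArgumentParser(
        prog="mytool",  # hypothetical program name
        description="Example tool.",
        epilog=CITATION,
        # Keep the citation exactly as written instead of re-wrapping it
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
```

The citation then appears at the bottom of the help menu every time someone runs the program with -h.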

Other resources

Here are some good references for other things you should do to make your software portable and usable by a wide audience. These were suggested by people on Twitter.

This article by Torsten Seemann reiterates what I said above!

Ten recommendations for creating usable bioinformatics command line software

Ten Simple Rules for the Open Development of Scientific Software

So you want to be a computational biologist?

Best Practices for Scientific Computing

Software Testing Techniques