While answering some reviewers' comments, I pulled out this data about the instruments used to submit data to the SRA. Clearly, the HiSeq and MiSeq dominate the number of runs that people are submitting.
Gap-fill a model in PyFBA
by Daniel Cuevas
In this notebook, we will present the steps to generate a genome-scale metabolic model from RAST annotations, gap-fill the model on rich LB-type media, and save the model to hard disk.
Generating, saving, loading models in PyFBA
by Daniel Cuevas
In this notebook, we will present the steps to generate a genome-scale metabolic model from RAST annotations, save the model on your computer, and load the model from your computer.
So you have written some software and want to release it to the world! Congratulations!
You’re not done yet. Now you need to document your software and make a release so that everyone knows how to use it. Here is the minimal set of files that you should include, with a discussion of each.
We have a Globus endpoint for large file transfers into and out of the lab. If you need to use it, contact Rob for authorization. Here are some details about the endpoint.
How many possible bacterial species could there be with the usual species definition, and how many are there?
To download things from NCBI a bit faster, you can try Aspera Connect. This is proprietary, closed-source software that the NCBI uses for large data transfers, but to run it in batch you need to figure out where to download it from and what to do with it.
We write a lot of software, and we release it all open source. That can be good and bad, but it is often a challenge, especially when people graduate and move on. Here are some of the best practices that we have learnt over the years.
I love standards; there are always so many to choose from. The Sequence Read Archive strives hard to capture appropriate information about the sequences that people deposit, but in the end scientists are people too, and they are never uniform and standard. This means there are a lot of ways to describe metagenomes. To get your data used by other people (and your papers cited), make sure you tag it so we can find it!
There is a lot of metagenomics data in the SRA, but it is not very well organized. To get it all, you need some wicked SQL-FU … or you can copy these recipes!