As easy as it is to install PyFBA using the pip command, it can be quite cumbersome to do so when you are working on a system where you have not been granted administrative or sudo permissions. Here is a quick guide that has worked for me when installing PyFBA on a CentOS 6.3 system running a Sun Grid Engine cluster. If you are working on a Linux system and you do have admin or sudo permissions, please follow the install guide here.
We were curious about how many bp of metagenomes are in the SRA. This was partly inspired by our grant writing, and partly by this question on Twitter from Tom Delmont:
This is how to answer the question!
CAMI (Critical Assessment of Metagenome Interpretation) is a community-led initiative designed to help tackle the problems faced by metagenomics analyses, aiming for an independent, comprehensive and bias-free evaluation of these metagenomics pipelines [source]. As part of the challenge, several simulated datasets were generated in order to evaluate each of the assembly, profiling, and binning tools submitted for review. Three distinct datasets were generated simulating microbiomes of varying complexities: low, medium, and high complexity. A pre-print version of the CAMI manuscript can be found on bioRxiv here: http://biorxiv.org/content/early/2017/01/09/099127
This blog post contains links to the binning and profiling results for those datasets.
Once again we are offering a one-week workshop on metagenomics data analysis at San Diego State University from June 26th to June 30th. The course will focus on random metagenomics sequencing and data analysis (not 16S sequencing). It will cover sequencing technologies and sequencing approaches, data analysis using the Linux command line, paired-end sequencing, sequence assembly, mapping reads and visualization, population genomics, and extracting data from the Sequence Read Archive. If you are interested, click through to read more.
Lately, I’ve been playing around with the Seaborn library and making heatmaps. The colors are usually some gradient showing highs and lows and I wanted to show how to make some of those here.
The metadata in the SRA is not all the data you can get about a run. Here is how to get more data about a run from the SRA without going to the SRA website.
Recall that in the SRA a project (SRP) has one or more samples, a sample (SRS) has one or more experiments (SRX), and an experiment has one or more runs (SRR). [source: davetang.org]
How many experiments only have one run, and how many experiments have lots of runs?
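As a rough illustration of the counting involved, here is a minimal Python sketch. It assumes you already have a list of (experiment, run) accession pairs; the accessions and the function names `runs_per_experiment` and `experiments_by_run_count` are made up for this example and are not taken from the SRA or from our own scripts.

```python
from collections import Counter

# Hypothetical (experiment, run) accession pairs. In practice these would be
# extracted from SRA metadata; the values here are invented for illustration.
pairs = [
    ("SRX000001", "SRR000001"),
    ("SRX000002", "SRR000002"),
    ("SRX000002", "SRR000003"),
    ("SRX000003", "SRR000004"),
    ("SRX000003", "SRR000005"),
    ("SRX000003", "SRR000006"),
]


def runs_per_experiment(pairs):
    """Count the distinct runs belonging to each experiment accession."""
    unique_pairs = set(pairs)  # ignore duplicated rows
    return Counter(srx for srx, _ in unique_pairs)


def experiments_by_run_count(pairs):
    """Map 'number of runs' to 'number of experiments with that many runs'."""
    return Counter(runs_per_experiment(pairs).values())


histogram = experiments_by_run_count(pairs)
for n_runs, n_experiments in sorted(histogram.items()):
    print(f"{n_runs} run(s): {n_experiments} experiment(s)")
```

The first counter groups runs by experiment, and the second turns those per-experiment counts into a histogram, which is the shape of answer the question above is asking for.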
Our genome-scale metabolic modeling software, PyFBA, was recently upgraded to version 1.2. With new features ranging from a new Model class to Python 3 support, this release expands the usefulness of PyFBA. Version 1.2 is now available on our GitHub master branch and on PyPI. Read further to see more details.
While answering some reviewers' comments, I pulled out this data about the instruments used to generate the data submitted to the SRA. Clearly the HiSeq and MiSeq dominate the number of runs that people are submitting.