uBiome Raw Data: Taxonomy

When you have a sample tested by uBiome, they provide the results through an interactive web interface. It’s designed for ordinary people rather than technically minded science boffins, so you should get on with it whatever your level of knowledge.

It is currently a beta release, so some displays are incomplete. The tree view, for instance, which lets you drill down through the hierarchy of a bacterium’s classification from phylum all the way to genus, might be missing some data, particularly toward the genus level. For this reason, I currently prefer the ‘Compare’ interface, which shows only one rank at a time but seems to list all the data correctly.

The obvious limitation of this is that you cannot see each bacterium’s hierarchy, that is, the taxonomic ranks it belongs to. For this reason you might find it useful to look at the raw data underneath it all. uBiome make this easily available.

Log in to the beta web interface and click “Dashboard”, and you’ll see “Raw Taxonomy” and “Raw Reads” links appear. If you click Raw Reads you can download a zip of all your sequence reads in FASTQ (.fastq) format. (I’m not that familiar with this format yet, but I know there are various programs you can run the data through; more on that in the future, maybe.)

This post is about the “Raw Taxonomy” data. Click this link and a few seconds later your data appears in raw form in your browser, a couple of hundred lines long. All bacteria are genetically related, and this data describes the family tree for the bacteria in your sample.

Let’s look at an example that many will be familiar with: Bifidobacterium.
If I search my raw data for that, I find this row:

{"taxon":"1678", "parent":"31953", "count":"549", "count_norm":"11041", "avg":null,
"tax_name":"Bifidobacterium", "tax_rank":"genus", "tax_color":null},

This tells us that the genus “Bifidobacterium” has been assigned the taxon identifier “1678”. It also tells us that its parent has the taxon ID “31953”. If you search for that ID in your raw data it will, predictably, take you to the row for “Bifidobacteriaceae”, the family that Bifidobacterium belongs to. That row has a parent too, and following it takes you to the order “Bifidobacteriales”. Its parent in turn is the subclass “Actinobacteridae”, above that is the class “Actinobacteria”, and one final level up you reach “Bacteria”, in row 2 of the raw data file.
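Following the chain by hand gets tedious, so here is a minimal Python sketch of the same walk. It assumes the raw taxonomy has already been parsed into a list of rows shaped like the one above, which I have not confirmed for the whole file. The three-row `rows` list is an illustrative stand-in: only the IDs 1678 and 31953 come from my data, "ID_ORDER" is a placeholder, and `lineage` is a name invented for this example.

```python
# Illustrative stand-in for the parsed raw taxonomy rows.
# Only the IDs "1678" and "31953" come from the post; "ID_ORDER" is a
# placeholder, and the real file has more ranks and more fields per row.
rows = [
    {"taxon": "1678", "parent": "31953",
     "tax_name": "Bifidobacterium", "tax_rank": "genus"},
    {"taxon": "31953", "parent": "ID_ORDER",
     "tax_name": "Bifidobacteriaceae", "tax_rank": "family"},
    {"taxon": "ID_ORDER", "parent": None,
     "tax_name": "Bifidobacteriales", "tax_rank": "order"},
]


def lineage(rows, taxon_id):
    """Return (rank, name) pairs from taxon_id up to the topmost known ancestor."""
    by_id = {row["taxon"]: row for row in rows}
    chain = []
    seen = set()  # guards against a malformed file with a parent loop
    current = taxon_id
    while current in by_id and current not in seen:
        seen.add(current)
        row = by_id[current]
        chain.append((row["tax_rank"], row["tax_name"]))
        current = row["parent"]
    return chain


for rank, name in lineage(rows, "1678"):
    print(f"{rank}: {name}")
```

The walk stops as soon as a parent ID is not found among the rows, so it also copes with a file where the top of the tree is missing.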

Going back to the genus “Bifidobacterium” row, it also gives a “count”, which is the number of sequence reads for this genus in my sample. The “count_norm” value is a normalization of that count so that it can be compared with other samples on a level playing field. It is calculated as follows:

count_norm = (count of taxon X) × 1,000,000 / (count of taxon 2)

Where taxon 2 represents the total sequence reads of all the bacteria in your sample.

If you want to know what percentage of the whole any given taxon is, take the “count_norm” value and divide it by 10,000. So, how abundant is Bifidobacterium in my sample? 11,041 / 10,000 = 1.1041, or about 1.1%.
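The same arithmetic can be written as a short Python sketch. The function names are made up for illustration, and the second example uses invented numbers purely to show the formula. Note that the raw file stores these values as strings, so they would need converting with `int()` first.

```python
def count_norm(count, total_count):
    """Reads for one taxon per million total reads (total = count of taxon 2)."""
    return count * 1_000_000 / total_count


def percent_from_norm(norm):
    """Convert a per-million count_norm value to a percentage of the sample."""
    return norm / 10_000


# count_norm value taken from the Bifidobacterium row in the post.
bifido_percent = percent_from_norm(11041)
print(f"Bifidobacterium: {bifido_percent:.1f}%")

# Invented numbers, only to show the formula: 50 reads out of 1,000 in total.
example_norm = count_norm(50, 1000)
print(example_norm, percent_from_norm(example_norm))
```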

The other neat thing I can see in the raw data that doesn’t yet appear in the web interface is some species-level information. I don’t know how this is worked out and not every species is identified, but some are there, which is a nice bonus!

It turns out that most of my Bifidobacterium is the species B. catenulatum.
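If you would rather pull out the species rows with a script than search by eye, here is a minimal sketch. It assumes the rows have been parsed into Python dictionaries with a "tax_rank" field as in the row shown earlier. The two sample rows are stand-ins with most fields omitted, and `species_rows` is a name chosen for this example.

```python
# Stand-in rows with most fields omitted; the real file has many more rows.
sample_rows = [
    {"tax_name": "Bifidobacterium", "tax_rank": "genus"},
    {"tax_name": "Bifidobacterium catenulatum", "tax_rank": "species"},
]


def species_rows(rows):
    """Keep only the rows whose rank is 'species'."""
    return [row for row in rows if row.get("tax_rank") == "species"]


for row in species_rows(sample_rows):
    print(row["tax_name"])
```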


2 thoughts on “uBiome Raw Data: Taxonomy”

    • That’s great to hear, I’m glad it is useful.
      To answer your question, if you look at the raw taxonomy file and you search with Ctrl+F for “species” it will show you the species level data that has been identified. How accurate it is, I don’t know yet. May just be the closest matching species in their database. But I’ll try and find out.

