Wednesday, April 22, 2015

Contamination of genome projects with DNA from other organisms

I was blasting a protein named PsbO, also known as the 'manganese stabilizing protein' of Photosystem II. This protein is found in cyanobacteria, algae, and plants, and it is important in photosynthesis. I was building a phylogenetic tree and noticed that one of the sequences came from the recently sequenced non-photosynthetic bacterium Paenibacillus sp. IHB B 3415. A BLAST search showed that the PsbO in this strain is identical to that of Camellia sinensis, the tea plant.

The chance of horizontal gene transfer from the chloroplast of the tea plant to Paenibacillus is, I would say, pretty close to 0%. So I imagine this is some form of contamination.

It is interesting that some of the investigators involved in the genome project are from the Hill Area Tea Science Division, CSIR-Institute of Himalayan Bioresource Technology in Palampur, India.

I'm a little bit concerned. How common is contamination in genome projects? When eukaryotic DNA contaminates a bacterial genome, I guess it is not such a big deal because it can be easily spotted... but if the contamination comes from another strain of bacteria, it might look like horizontal gene transfer, and it may not be that simple to tell the two apart using bioinformatics alone.
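To make the "easily spotted" case concrete, here is a minimal Python sketch of how one might screen for it. It is not what I or GenBank actually ran. It assumes you already have BLAST hits for each contig in a tab-separated table with five columns (contig, subject, percent identity, bit score, superkingdom of the subject); that column layout, the function name flag_foreign_contigs, and the contig names in the example are all made up for illustration. It only catches the eukaryote-in-a-bacterium kind of problem, not contamination from another bacterium.

```python
import csv
from io import StringIO


def flag_foreign_contigs(blast_tsv, expected="Bacteria", min_identity=95.0):
    """Flag contigs whose best-scoring hit is outside the expected superkingdom.

    blast_tsv: tab-separated text with the (assumed, hypothetical) columns
               contig, subject, percent identity, bit score, superkingdom.
    Returns {contig: (superkingdom, percent identity)} for flagged contigs.
    """
    best = {}  # contig -> (bit score, superkingdom, percent identity)
    for row in csv.reader(StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        contig, _subject, identity, bitscore, kingdom = row
        identity, bitscore = float(identity), float(bitscore)
        # Keep only the highest-scoring hit for each contig.
        if contig not in best or bitscore > best[contig][0]:
            best[contig] = (bitscore, kingdom, identity)
    # A near-identical best hit from the "wrong" superkingdom is suspicious.
    return {
        contig: (kingdom, identity)
        for contig, (_score, kingdom, identity) in best.items()
        if kingdom != expected and identity >= min_identity
    }


# Made-up example data: one ordinary contig and one with a plant-like best hit.
example = (
    "contig_001\tsubjA\t99.1\t850\tBacteria\n"
    "contig_195\tsubjB\t100.0\t480\tEukaryota\n"
    "contig_195\tsubjC\t71.3\t120\tBacteria\n"
)
print(flag_foreign_contigs(example))
```

A contig flagged this way still needs a human look, since a high-identity hit can also come from a mislabeled database entry rather than from contamination in your own assembly.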

Update (April 20, 2015)
I contacted GenBank to report the issue. They investigated, and this is what they told me:

"The submitter concurs with your assessment, so we have removed the contaminated contig JUEI01000195 from the public record." 

manganese stabilizing protein msp
The PsbO protein of Photosystem II
