Friday, September 8, 2017

2. Results – First phylogeny of BchC and we have discovered a new phototroph!

Phylogeny of BchC
BchC is also known as 3-hydroxyethyl BChlide a dehydrogenase (Bryant et al. 2012). It catalyses the oxidation of the 3-hydroxyethyl group, in the bacteriochlorophyllide a precursor, to a 3-acetyl moiety (Lange et al. 2015).
A Pfam search using a BchC from Chloroflexus found that it belongs to the Pfam family “ADH_zinc_N (PF00107)”, or Zinc-binding dehydrogenases.
It is a large family of proteins with more than 50k sequences in the database. That complicates things tremendously, but let's not lose courage just yet.
A BLAST search against the RefSeq database showed that all BchC sequences have an E-value of 7e-29 or lower. Some of the more divergent ones appear to be from Proteobacteria.
I used the BchC sequence of Chloroflexus aggregans DSM 9485 as the query and capped the search at 1000 sequences, excluding eukaryotes. Of these, about 550 are likely true BchC sequences, with the rest being a range of dehydrogenases found in a wide diversity of prokaryotes. I can tell this mostly from quick neighbor-joining trees of the sequences; judging from the E-values, the annotations, or the sequences alone, it is really hard to tell where BchC sequences end and where other dehydrogenases begin.
The E-value of the 450 ‘other’ dehydrogenases ranged from 1e-28 to 1e-18 relative to my query.
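The E-value split described above can be sketched in a few lines of Python. This is only an illustration, not the procedure I actually followed: it assumes the BLAST hits were exported as tab-separated text with the subject ID in the second column and the E-value in the eleventh. The function name, the sample rows and the use of 7e-29 as a hard threshold are all assumptions, and as noted above the E-value alone does not cleanly separate BchC from other dehydrogenases.

```python
import csv
import io


def split_hits_by_evalue(tabular_text, cutoff=7e-29):
    """Partition tabular BLAST hits into (likely_bchc, other) by E-value.

    Assumes tab-separated rows with the subject ID in column 2 and the
    E-value in column 11. The cutoff is an illustrative assumption.
    """
    likely, other = [], []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines
        if not row or row[0].startswith("#"):
            continue
        subject_id, evalue = row[1], float(row[10])
        if evalue <= cutoff:
            likely.append((subject_id, evalue))
        else:
            other.append((subject_id, evalue))
    return likely, other


# Hypothetical rows, invented for illustration only
sample = "\n".join([
    "query\thitA\t62.1\t310\t110\t3\t1\t308\t2\t309\t3e-120\t350",
    "query\thitB\t35.0\t300\t180\t9\t4\t301\t6\t300\t7e-29\t118",
    "query\thitC\t33.2\t295\t185\t8\t5\t298\t9\t299\t1e-28\t116",
])

likely_bchc, other_dehydrogenases = split_hits_by_evalue(sample)
```

A split like this would only give a first pass; the borderline hits would still need to be checked against a tree.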
I aligned the 1000 sequences using Clustal Omega with 10 HMM iterations. There were a significant number of gaps and insertions, so there are likely to be some artefacts, but this is preliminary: just to see what the trees look like so far and what type of sequences I have retrieved.
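One way to put a number on how gappy an alignment is would be to compute the fraction of gaps in each column. The sketch below assumes the alignment is available as aligned FASTA text with "-" as the gap character. The function names, the toy alignment and the 0.5 threshold are hypothetical, and I did not apply any such filter to the tree shown here.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {name: sequence}."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def gap_fraction_per_column(aligned):
    """Return the fraction of '-' characters in each alignment column."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences are not aligned to the same length")
    return [sum(s[i] == "-" for s in seqs) / len(seqs) for i in range(length)]


def gappy_columns(aligned, max_gap=0.5):
    """Return the indices of columns whose gap fraction exceeds max_gap."""
    fractions = gap_fraction_per_column(aligned)
    return [i for i, f in enumerate(fractions) if f > max_gap]


# Toy alignment, invented for illustration only
toy = """>seq1
MK-LV
>seq2
M--LV
>seq3
MKALV
"""

alignment = read_fasta(toy)
flagged = gappy_columns(alignment)
```

Columns flagged this way are candidates for trimming before a refined analysis, although the right threshold would depend on the dataset.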
The tree below was calculated using the online service: http://www.atgc-montpellier.fr/phyml/
I used their new Smart Model Selection option with the ‘Bayesian Information Criterion’ (Lefort et al. 2017). This pretty much computes all parameters from the data. You guys are amazing, thanks for making phylogenetics easier to deal with! Double thumbs up for you! I used the NNI tree searching setting and the aLRT SH-like setting for branch support. It ran for nearly 20 hours.
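For anyone unfamiliar with the Bayesian Information Criterion, the idea is to penalise a model's log-likelihood by its number of free parameters and pick the model with the lowest score. The sketch below shows the standard formula, BIC = k ln(n) - 2 lnL. It is not PhyML's or SMS's code. Taking n as the number of alignment sites is an assumption on my part, and the candidate models and their numbers are invented.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Standard BIC: k * ln(n) - 2 * lnL. Lower values are better."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def pick_best_model(candidates, n_sites):
    """Return the name of the candidate model with the lowest BIC."""
    best = min(candidates, key=lambda m: bic(m["lnL"], m["k"], n_sites))
    return best["name"]


# Hypothetical fits: names, log-likelihoods and parameter counts are invented
candidates = [
    {"name": "model_A", "lnL": -1000.0, "k": 10},
    {"name": "model_B", "lnL": -995.0, "k": 30},
]

best = pick_best_model(candidates, n_sites=400)
```

In this toy case the richer model fits slightly better but pays a larger penalty for its extra parameters, so the simpler one is selected.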
Preliminary ML phylogeny of BchC.
Black dots mark supported nodes (support above 0.8) and yellow dots mark nodes with no support. I only added these to key nodes.
I was thinking that the tree was going to reproduce the same phylogeny as BchF (see previous update), but there are a number of differences. I suspect those differences may be the result of artefacts.
You can see that all the BchC are monophyletic, with good support. BchC were split into two groups by the outgroup (grey branches): one includes the Proteobacteria sequences, and the other includes the rest of the phototrophs.
As expected, the BchC of Gemmatimonas branched within the Proteobacteria (Zeng et al. 2014).
I expected the Acidobacteria branch to be sister to Proteobacteria, as in BchF, but instead it clustered with the other phototrophs, with no support at all. I think this might be a bit of an artefact… Similarly, I expected the Chloroflexus sequence to cluster with Acidobacteria and Proteobacteria, as sister to the two, but it also branched with low support.
I had retrieved the same topology as BchF in some previous tests I had done with smaller datasets, so there may be some attraction going on here. But who knows… we’ll confirm this later on when I refine the analysis a bit.
It seems that the Chlorobi have two types of BchC, which is interesting because they also have two types of BchF. This may be known already though, not sure about that.
At this point, I don’t want to draw many conclusions on the implications. Unlike BchF, which does not seem to have homologs in other processes, these dehydrogenases seem to be quite abundant across the tree of life, so it is much harder to make sense of their evolution.

A new group of phototrophs in metagenomic sequences?
For the moment, I want to point out an unusual BchC sequence that branched ‘early’ within the Proteobacteria group. It is coloured orange in the tree. As you can see in the screen captures, that sequence is annotated as belonging to Euryarchaeota Archaeon TMED255 38454, GenBank: NHLA01000022.1.
Photosynthetic Archaea? The gene cluster in the metagenome
The sequence is in a fragment that includes 12 other sequences, all of which seem to be from a photosynthesis gene cluster, including BchF, BchIDH, BchG, BchXYZ, PufL and PufM. This is unusual because no strains of Archaea are known to be capable of chlorophyll-based photosynthesis.
I BLASTed the PufL sequence and the best hit was to:
AAM48602.1: photosynthetic reaction center L subunit [uncultured marine proteobacterium], with 97% sequence coverage, an E-value of 2e-157, and a sequence identity of 77%. It is quite divergent, which suggests that these sequences originated from a clade of Proteobacteria that has not yet been characterized. Some of the Mg-chelatase subunits in that cluster have a sequence identity to their best hit of only about 45%!
BLAST results for PufL in that gene cluster
Lo and behold my friends! We have just discovered a new type of phototroph!
Now, the question is: is this an event of horizontal gene transfer from an uncharacterized clade of phototrophic Proteobacteria into a strain of Archaea, or is it a misannotation, or some sort of contamination? I am not familiar with the intricacies of metagenomics and genome assemblies, so I cannot tell. Can you tell?
If the HGT event is real, we have the first case of a phototrophic archaeon ever discovered! But I think contamination or bad annotation is more likely.
The metagenome project is:
AUTHORS: Tully,B.J., Sachdeva,R., Graham,E.D. and Heidelberg,J.F.
TITLE: 290 Metagenome-assembled Genomes from the Mediterranean Sea: a resource for marine microbiology
JOURNAL Unpublished
I have also noted ChlLNB sequences in two genomes of Altiarchaea. This represents bona fide horizontal gene transfer, but these strains do not have any other photosynthesis genes and they live in subsurface environments. The transfer occurred from another unidentified group of phototrophs. I wrote a short blog post about this; please take a look if you’re interested: http://tanaiscience.blogspot.co.uk/2017/05/a-new-undiscribed-clade-of-phototrophic.html

References
Bryant, D., Z. Liu, T. Li, F. Zhao, C. G. Klatt, D. Ward, N. U. Frigaard and J. Overmann (2012). "Comparative and functional genomics of anoxygenic green bacteria from the taxa Chlorobi, Chloroflexi, and Acidobacteria." Functional Genomics and Evolution of Photosynthetic Systems. R. L. Burnap and W. Vermaas. Dordrecht, Springer. 33: 47-102.
Lange, C., S. Kiesel, S. Peters, S. Virus, H. Scheer, D. Jahn and J. Moser (2015). "Broadened Substrate Specificity of 3-Hydroxyethyl Bacteriochlorophyllide a Dehydrogenase (BchC) Indicates a New Route for the Biosynthesis of Bacteriochlorophyll a." Journal of Biological Chemistry 290(32): 19697-19709.
Lefort, V., J. E. Longueville and O. Gascuel (2017). "SMS: Smart Model Selection in PhyML." Molecular Biology and Evolution 34(9): 2422-2424.
Zeng, Y. H., F. Y. Feng, H. Medova, J. Dean and M. Koblizek (2014). "Functional Type 2 photosynthetic reaction centers found in the rare bacterial phylum Gemmatimonadetes." Proceedings of the National Academy of Sciences of the United States of America 111(21): 7795-7800.
