Thursday, November 30, 2017

Photosystem I and the evolution of oxygenic photosynthesis

I want to understand when and how oxygenic photosynthesis originated. Some time ago I posted an evolutionary analysis of the core proteins of Photosystem II, which is still undergoing peer review.
Check it out here if you have not seen it already:
Basically, all reaction centres are made of a dimer of homologous core proteins. All Type II reaction centres have a heterodimeric core, meaning that each monomer is different. Photosystem II, the water oxidising enzyme, has a core made of D1 and D2. D1 and D2 share slightly under 30% sequence identity and it is in D1 that the manganese cluster that oxidises water to oxygen is located.
It is pretty evident, when the sequence and structure of D1 and D2 are compared, that the homodimeric Photosystem II (before the divergence of D1 and D2) was able to do some highly oxidising photochemistry on both sides. Indeed, it looks like there was some kind of manganese cluster also on the D2 side at some point in time. It has been suggested before that this homodimeric Photosystem II was able to oxidise water. So if I can time when the gene duplication event that led to D1 and D2 happened, then I can have a pretty good idea of when water oxidation appeared for the first time. That is the subject of the above-mentioned paper.
Type I reaction centres come in two versions: with homodimeric cores, meaning that the core is made of two copies of the same subunit; or with heterodimeric cores. The homodimeric Type I reaction centres are found only in anoxygenic phototrophs, and the heterodimeric Type I reaction centres are found only in oxygenic phototrophs: Cyanobacteria and photosynthetic eukaryotes. This heterodimeric Type I reaction centre is also known as Photosystem I.
It is hypothesised that the reason why Photosystem I is heterodimeric in oxygenic photosynthesis has something to do with oxygen… and well, there is no other alternative hypothesis that I know of.
The core of Photosystem I is made of two subunits, PsaA and PsaB. They share about 45% sequence identity. So, if oxygen is responsible for the heterodimerisation of Photosystem I, that means that the gene duplication event that led to PsaA and PsaB had to occur AFTER the evolution of water oxidation to oxygen.
That is the subject of the new manuscript:
I found out, as was the case for Photosystem II, that the divergence of PsaA and PsaB is a lot older than anyone could have imagined… from my calculations, I would say that this gene duplication event occurred at least 3.4 billion years ago, but it is likely much older than that.
Because PsaA and PsaB are more similar to each other than D1 and D2, it would seem as if the gene duplication event that led to PsaA and PsaB occurred long after the gene duplication that led to D1 and D2… but that may not necessarily be the case: that is only an illusion. The two duplications could have followed each other almost immediately: from this perspective, it may even be possible that heterodimeric Photosystem I predates heterodimeric Photosystem II, since the latter was oxidising water in a homodimeric form!
One thing that needs to be taken into account is that PsaA and PsaB are each about 730 to 750 amino acids long, while D1 and D2 are only about 360 amino acids long.
So in the case of D1 and D2: 70% sequence divergence is equivalent to about 250 amino acid differences along the entire sequence.
In the case of PsaA and PsaB, 55% sequence divergence is equivalent to about 410 amino acid differences along the entire sequence.
So consider the following example: if we assume that D1, D2, PsaA, and PsaB evolve at exactly the same rate, measured as amino acid substitutions PER POSITION per unit of time, a change of amino acids at 100 positions would take each protein the same amount of time; but in the case of D1 and D2 that would represent a change in sequence identity of 27%, while in PsaA and PsaB it would represent a change of only 13%!
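The arithmetic behind these percentages can be checked with a few lines of Python. This is only a rough sketch: the two lengths are assumptions picked from the ranges quoted above (360 for D1/D2, and 740 as the midpoint of 730 to 750 for PsaA/PsaB), and the function names are made up for illustration. Counts taken from a real alignment, with its gaps and insertions, would differ somewhat.

```python
# Lengths are assumptions chosen within the ranges quoted in the post.
D1_D2_LENGTH = 360       # amino acids
PSAA_PSAB_LENGTH = 740   # midpoint of "730 to 750" amino acids


def differing_positions(length, divergence):
    """Number of positions that differ for a given fractional divergence."""
    return round(length * divergence)


def identity_change(length, changed_positions):
    """Change in sequence identity (in %) when `changed_positions` sites change."""
    return 100.0 * changed_positions / length


# Absolute differences implied by 70% and 55% divergence.
print(differing_positions(D1_D2_LENGTH, 0.70))
print(differing_positions(PSAA_PSAB_LENGTH, 0.55))

# Effect of changing 100 positions on percentage identity.
print(round(identity_change(D1_D2_LENGTH, 100), 1))
print(round(identity_change(PSAA_PSAB_LENGTH, 100), 1))
```

With these assumed lengths, the same 100 changed positions shift the identity of the shorter pair by roughly twice as many percentage points as that of the longer pair, which is the point of the example.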
In real life, however, PsaA and PsaB are evolving at a different pace than D1 and D2... but the difference is not huge.
So the fact that the level of sequence identity of PsaA and PsaB is significantly higher than that of D1 and D2 does not mean that the gene duplication event had to occur much later in Photosystem I than in Photosystem II. It is only an illusion caused by the fact that PsaA and PsaB are much longer than D1 and D2 and by the fact that sequence similarity in percentage is not necessarily the best measurement of sequence divergence.
Therefore, when the rates of evolution are taken into account and if we add to this the significant level of sequence divergence between PsaA and PsaB, it turns out that this gene duplication event happened a lot deeper in time than anyone could have guessed. Nevertheless, it is consistent with my hypothesis that water oxidation originated rapidly after (or at) the origin of photosynthesis, in the very early Archean.

Friday, September 15, 2017

Origin of water oxidation at the divergence of Type I and Type II reaction centres

Introduction
My friends, the way we think about the evolution of photosynthesis is about to change irreversibly.
I want to share with you some awesome stuff regarding the recent structure of the homodimeric Type I reaction centre by Gisriel et al. (2017) and what I think it all means for the origin of oxygenic photosynthesis. Huge thanks to all the authors. I know it must have been an unbelievable effort.
I have had a chance now to play with the structure a bit. What great pleasure! The structure is amazing and I have seen something that intrigues me enormously. Please, keep reading.
This is the link to the paper describing the structure:
This is a link to the pdb files:
We need first a bit of background though:
In a recent letter to the editor of the Journal of Molecular Evolution I argued that the peculiar structural characteristics of Photosystem II are better explained if water oxidation originated at the divergence of Type I and Type II reaction centres (Cardona, 2017).
What are these peculiar characteristics?
I find it quite peculiar that Photosystem II is made of a core, which originated from a Type II reaction centre (D1 and D2), and an antenna, which originated from a Type I reaction centre (CP43 and CP47).
Even more peculiar still is the fact that the CP43 subunit offers a direct ligand to the Mn4CaO5 cluster.
Another peculiar trait about Photosystem II is that D1 and D2 coordinate each a peripheral chlorophyll, ChlZ-D1 and ChlZ-D2. These peripheral chlorophylls and their binding sites are also conserved in Type I reaction centres, but are not found in anoxygenic Type II reaction centres. This means that the most ancestral reaction centre, before the divergence of Type I and Type II, had these peripheral pigments.
The implication of these peculiarities is that an interaction of ancestral Type I and Type II reaction centres is required for the origin of the Mn4CaO5 cluster. A second implication is that this interaction has been continuous since the origin of both types of reaction centres, and therefore since very early after the emergence of photosynthesis and the first reaction centres.
So, given the fact that Photosystem II and Photosystem I 'working in series' is the hallmark of oxygenic photosynthesis, and adding to this the fact that Photosystem II is a chimera of Type I and Type II reaction centres… it does not take a huge leap to think that the initial divergence of both types of reaction centres is actually linked to the origin of water oxidation chemistry.
Think about this for a moment.
This would actually mean that the earliest stages in the evolution of photosynthesis are related to the origin of water oxidation chemistry. In other words, this would mean that oxygenic photosynthesis traces back to the very early stages in the evolution of photochemical reaction centres (Cardona, 2017).
It sounds crazy, right?
Photosystem II and the homodimeric Type I reaction centre
So, what about the homodimeric Type I reaction centre? What about it?
AMAZINGLY, the structure of the homodimeric Type I reaction centre has a Ca-binding site with a number of intriguing parallels to the Mn4CaO5 cluster of Photosystem II:
1. It is positioned exactly where the redox Tyr-His pair is found in D1 and D2 (See Figures 1 and 2).
2. It is connected to the C-terminus by L605 and V608. V608 is the last amino acid in the sequence. In D1 of Photosystem II, the Mn4CaO5 cluster is coordinated by D342 and A344. A344 is the last amino acid of the processed D1 and it ligates not only Mn, but also the Ca!
3. It has a connection to the antenna domain, via N263, which is within the 5th and 6th helices. N263 connects to the Ca via two water molecules. In PSII, the CP43 antenna residue E354 offers a ligand to two Mn atoms and it is in the loop connecting the 5th and 6th helices of the antenna! A totally homologous site is also found in D2 and CP47, but phenylalanine residues are found instead of ligands.
4. At the overlapping position where TyrZ/TyrD is, there is a coordinating aspartate.

Figure 1: Panel A shows the Ca-binding site from the reaction centre of H. modesticaldum. In grey I show the connections from the core domain, and in orange the connections from the antenna domain. Panel B shows a schematized version of the Ca-binding site and in italics I have highlighted the parallels with the Mn4CaO5 cluster. In Panel C I show PSII for comparison; and in Panel D I overlap D1 (orange) and PshA (grey). The yellow atom is the Ca of the H. modesticaldum reaction centre.
Figure 2: Comparison of PSII and the homodimeric reaction centre. There is no doubt that the Ca-binding site and the Mn4CaO5 cluster occupy homologous positions.
Implications for the evolution of water oxidation and the origin of photosynthesis
The main implication is that the most ancestral reaction centre before the divergence of Type I and Type II reaction centres had, at the very least, a Ca-binding site like the one in the structure of H. modesticaldum.
This is strong evidence that the divergence of Type I and Type II reaction centres was due to the development of the structural and energetic requirements to support water oxidation chemistry and the emergence of the oxygen-evolving complex.
This explains why the Mn4CaO5 cluster has a Ca atom! It was there to begin with.
This explains why the Mn4CaO5 cluster has a ligand from the C-terminus. It was there to begin with!
This explains why the Mn4CaO5 cluster has a ligand from the antenna. Guess what? It was there to begin with!
This explains why the site is also mirrored in D2 and CP47, because it all started symmetrically on both sides, as suggested by Rutherford and Faller (2003).
In other words, this implies that the connection from the C-terminus and the antenna domain to the cluster site has been continuous since the emergence of the first reaction centres. Just as I mentioned in my letter!
This also implies that the emergence of water oxidation can be traced to the earliest events in the evolution of photosynthesis. It implies that water oxidation likely predates the diversification of most groups of phototrophs, including Cyanobacteria!
This implies that Cyanobacteria are the only bacteria to have retained water oxidation chemistry, but a greater diversity of oxygenic phototrophs must have predated them.
This is also in perfect agreement with the conclusions of my molecular clock analysis that suggests water oxidation started long before the most recent common ancestor of Cyanobacteria:
This implies that the anoxygenic Type II reaction centres likely evolved from a water-oxidizing Type II reaction centre before the gene duplication event that led to D1 and D2.
I had mentioned earlier that the ancestral Type II reaction centre (before D1, D2, L, and M) already had some of the components that were needed to evolve the water-oxidizing complex (Cardona, 2015, 2016; Cardona et al., 2015). This supports those observations too.
In the near future, I hope to write something more substantial about this. Put all these ideas together in a nice review: go a bit deeper. In the meantime, it would be nice to discuss what you all think of this madness!
Don't hesitate to leave comments, especially if you strongly disagree and think this is all bonkers.
References
Cardona, T. (2015). A fresh look at the evolution and diversification of photochemical reaction centers. Photosynth Res, 126(1), 111-134. doi:10.1007/s11120-014-0065-x
Cardona, T. (2016). Reconstructing the origin of oxygenic photosynthesis: Do assembly and photoactivation recapitulate evolution? Frontiers in Plant Science, 7, 257. doi:10.3389/fpls.2016.00257
Cardona, T. (2017). Photosystem II is a chimera of reaction centers. Journal of Molecular Evolution, 84(2-3), 149-151. doi:10.1007/s00239-017-9784-x
Cardona, T., Murray, J. W., & Rutherford, A. W. (2015). Origin and evolution of water oxidation before the last common ancestor of the cyanobacteria. Mol Biol Evol, 32(5), 1310-1328. doi:10.1093/molbev/msv024
Gisriel, C., Sarrou, I., Ferlez, B., Golbeck, J. H., Redding, K. E., & Fromme, R. (2017). Structure of a symmetric photosynthetic reaction center-photosystem. Science, 357(6355), 1021-1025. doi:10.1126/science.aan5611
Rutherford, A. W., & Faller, P. (2003). Photosystem II: Evolutionary perspectives. Philos Trans R Soc Lond B Biol Sci, 358(1429), 245-253. doi:10.1098/rstb.2002.1186

Friday, September 8, 2017

2. Results – First phylogeny of BchC and we have discovered a new phototroph!

Phylogeny of BchC
BchC is also known as 3-hydroxyethyl BChlide a dehydrogenase (Bryant et al. 2012). It catalyses the oxidation of the 3-hydroxyethyl group, in the bacteriochlorophyllide a precursor, to a 3-acetyl moiety (Lange et al. 2015).
A pfam search using a BchC from Chloroflexus found that it belongs to the pfam family: “ADH_zinc_N (PF00107)”, or Zinc-binding dehydrogenases.
It is a large family of proteins with more than 50k sequences in the database. That complicates things tremendously, but let's not lose courage just yet.
A BLAST in the refseq database showed that all BchC have an E-value of 7e-29 or lower. Some of the more divergent ones appear to be from Proteobacteria.
I used the BchC sequence of Chloroflexus aggregans DSM 9485 as query and made a cut-off at 1000 sequences, excluding eukaryotes. Of these, about 550 are likely true BchC sequences, with the rest being a range of dehydrogenases found in a wide diversity of prokaryotes. I can tell this mostly by doing quick neighbor-joining trees of the sequences: judging from the E-values, the annotations, or a comparison of the sequences alone, it is really hard to tell where BchC sequences end and where other dehydrogenases begin.
The E-value of the 450 ‘other’ dehydrogenases ranged from 1e-28 to 1e-18 relative to my query.
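As a rough first-pass triage of such a search, the hits can be split at the boundary between the stronger matches and the 1e-28 to 1e-18 range (a lower E-value means a stronger hit). The sketch below assumes the hits were exported in the standard 12-column BLAST tabular format; the hit names, the placeholder field values, and the `split_hits` and `make_row` helpers are all made up for illustration. As noted above, E-values alone do not settle which sequences are true BchC, so this only pre-sorts candidates before building quick trees.

```python
import csv
import io

# Standard column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def split_hits(tabular_text, evalue_cutoff=7e-29):
    """Split hits into (at or below cutoff, above cutoff) lists of (name, evalue)."""
    strong, weak = [], []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        evalue = float(hit["evalue"])
        target = strong if evalue <= evalue_cutoff else weak
        target.append((hit["sseqid"], evalue))
    return strong, weak


def make_row(subject, evalue):
    # Only sseqid and evalue matter here; the other fields are placeholders.
    return "\t".join(["query", subject, "50.0", "300", "150", "0",
                      "1", "300", "1", "300", evalue, "100"])


# Hypothetical hits, not real accessions.
EXAMPLE = "\n".join([
    make_row("hypothetical_hit_A", "3e-120"),
    make_row("hypothetical_hit_B", "7e-29"),
    make_row("hypothetical_hit_C", "1e-28"),
    make_row("hypothetical_hit_D", "1e-18"),
])

strong, weak = split_hits(EXAMPLE)
print([name for name, _ in strong])  # candidates for true BchC
print([name for name, _ in weak])    # candidates for other dehydrogenases
```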
I aligned the 1000 sequences using Clustal Omega and 10 HMM iterations. There was a significant number of gaps and insertions… so there are likely to be some artefacts… but it is preliminary: just to see what the trees look like so far, and to see what type of sequences I have retrieved.
The tree below was calculated using the online service: http://www.atgc-montpellier.fr/phyml/
I used their new Smart Model Selection option with the ‘Bayesian Information Criterion’ (Lefort et al. 2017). This pretty much computes all parameters from the data. You guys are amazing, thanks for making phylogenetics easier to deal with! Double thumbs up for you! I used the NNI tree searching setting and the aLRT SH-like setting for branch support. It ran for nearly 20 hours.
Preliminary ML phylogeny of BchC.
The black dots are supported nodes, above 0.8, and those in yellow have no support. I only added these to key nodes.
I was thinking that the tree was going to reproduce the same phylogeny as BchF (see previous update), but there are a number of differences. I suspect those differences may be the result of artefacts.
You can see that all the BchC are monophyletic, with good support. BchC were split into two groups by the outgroup (grey branches): one includes the Proteobacteria sequences; and the other includes the rest of the phototrophs.
As expected, the BchC of Gemmatimonas branched within the Proteobacteria (Zeng et al. 2014).
I expected that the Acidobacteria branch was going to be sister to Proteobacteria, as in BchF, but instead it clustered with the other ones, with no support at all. I think this might be a bit of an artefact… Similarly, I expected the sequence from Chloroflexus to cluster with Acidobacteria and Proteobacteria as sister of the two, but it also branched with low support.
I had retrieved the same topology as BchF in some previous tests I had done with smaller datasets, so there may be some attraction going on here. But who knows… we’ll confirm this later on when I refine the analysis a bit.
It seems that the Chlorobi have two types of BchC, which is interesting because they also have two types of BchF. This may be known already though, not sure about that.
At this point, I don’t want to draw many conclusions on the implications. Unlike BchF, which does not seem to have homologs in other processes, these dehydrogenases seem to be quite abundant across the tree of life, so it is much harder to make sense of their evolution.

A new group of phototrophs in metagenomic sequences?
For the moment, I want to point out an unusual BchC sequence that branched ‘early’ within the Proteobacteria group. It is coloured orange in the tree. As you can see in the screen captures, that sequence is annotated as belonging to Euryarchaeota Archaeon TMED255 38454, GenBank: NHLA01000022.1.
Photosynthetic Archaea? The gene cluster in the metagenome
The sequence is in a fragment that includes 12 other sequences. All of them seem to be from a photosynthesis gene cluster, which includes sequences like BchF, BchIDH, BchG, BchXYZ, PufL and PufM. This is unusual because there are no strains of Archaea known to be able to do chlorophyll-based photosynthesis.
I BLASTed the PufL sequence and the best hit was to:
AAM48602.1: photosynthetic reaction center L subunit [uncultured marine proteobacterium], 97% sequence coverage, an E-value of 2e-157, and a level of sequence identity of 77%. It is quite divergent, which means that these sequences originated from a clade of Proteobacteria that has not yet been characterized. Some of the Mg-chelatase subunits in that cluster have a level of sequence identity to the best hit of only about 45%!
BLAST results for PufL in that gene cluster
Lo and behold my friends! We have just discovered a new type of phototroph!
Now, the question is: is this an event of horizontal gene transfer from an uncharacterized clade of phototrophic Proteobacteria into a strain of Archaea, or is it a misannotation, or some sort of contamination? I am not familiar with the intricacies of metagenomics and genome assemblies, so I cannot tell. Can you tell?
If the HGT event is true, we have the first case of a phototrophic Archaea ever discovered! But I think contamination or bad annotation is more likely.
The metagenome project is:
AUTHORS: Tully,B.J., Sachdeva,R., Graham,E.D. and Heidelberg,J.F.
TITLE: 290 Metagenome-assembled Genomes from the Mediterranean Sea: a resource for marine microbiology
JOURNAL Unpublished
I have also noted ChlLNB sequences in two genomes of Altiarchaea. This represents bona fide horizontal gene transfer, but these strains do not have any other photosynthesis genes and they live in subsurface environments. The transfer occurred from another unidentified group of phototrophs. I wrote a short blog post about this, please take a look if you’re interested: http://tanaiscience.blogspot.co.uk/2017/05/a-new-undiscribed-clade-of-phototrophic.html

References
Bryant, D., Z. Liu, T. Li, F. Zhao, C. G. Klatt, D. Ward, N. U. Frigaard and J. Overmann (2012). Comparative and functional genomics of anoxygenic green bacteria from the taxa Chlorobi, Chloroflexi, and Acidobacteria. Functional Genomics and Evolution of Photosynthetic Systems. R. L. Burnap and W. Vermaas. Dordrecht, Springer. 33: 47-102.
Lange, C., S. Kiesel, S. Peters, S. Virus, H. Scheer, D. Jahn and J. Moser (2015). "Broadened Substrate Specificity of 3-Hydroxyethyl Bacteriochlorophyllide a Dehydrogenase (BchC) Indicates a New Route for the Biosynthesis of Bacteriochlorophyll a." Journal of Biological Chemistry 290(32): 19697-19709.
Lefort, V., J. E. Longueville and O. Gascuel (2017). "SMS: Smart Model Selection in PhyML." Molecular Biology and Evolution 34(9): 2422-2424.
Zeng, Y. H., F. Y. Feng, H. Medova, J. Dean and M. Koblizek (2014). "Functional Type 2 photosynthetic reaction centers found in the rare bacterial phylum Gemmatimonadetes." Proceedings of the National Academy of Sciences of the United States of America 111(21): 7795-7800.

Thursday, September 7, 2017

1. Evolution of BchC: an enzyme required for the synthesis of bacteriochlorophyll a — Introduction

I published in 2016 a little paper in PLoS ONE about the evolution of BchF (Cardona 2016), an enzyme in the pathway to make bacteriochlorophyll a. In this project, I will present data and analysis on the evolution of another enzyme required to make bacteriochlorophyll a, that is: BchC. As far as I know, there are no evolutionary studies on this enzyme, so I plan to update my progress in here. Not sure where this will lead, but let’s keep an open mind, and let’s keep it informal. I welcome your feedback and your participation!
I would say that there are 2 types of anoxygenic phototrophs described so far: those which use bacteriochlorophyll a as the ‘primary’ pigment and those which use bacteriochlorophyll g.
Those with bacteriochlorophyll a are:
  • Proteobacteria
  • Gemmatimonadetes
  • Acidobacteria
  • Chloroflexi
  • Chlorobi
Those with bacteriochlorophyll g are:
  • Heliobacteria
Cyanobacteria, which are the only bacteria capable of oxygenic photosynthesis, do not use any form of bacteriochlorophyll. They use chlorophyll a as the primary pigment, and other chlorophylls derived from it, like chlorophyll d and f.
So this is the thing:
To make bacteriochlorophyll you first need to make the chlorophyll precursor, chlorophyllide a. Chlorophyllide is produced from protochlorophyllide, and the reaction is catalysed by the ChlLNB enzyme, the one homologous to nitrogenases. If you attach a tail directly to chlorophyllide a, then it becomes chlorophyll a, which is used by Cyanobacteria and photosynthetic eukaryotes. However, Acidobacteria, Chlorobi, and Heliobacteria also make chlorophyll a in addition to their primary bacteriochlorophylls and use it in their homodimeric Type I reaction centres.
To make bacteriochlorophyll g, you need the enzyme BchXYZ, which is homologous to ChlLNB and the nitrogenases as well. That takes chlorophyllide and makes bacteriochlorophyllide. If you put a tail on it, without any further modifications, it becomes bacteriochlorophyll g (Tsukatani et al. 2013).
To make bacteriochlorophyll a, you need BchXYZ, and two additional enzymes not found in Cyanobacteria nor Heliobacteria, these are BchF and BchC. After these two extra steps, you add a tail and then you have—finally—bacteriochlorophyll a.
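The three routes described above can be laid out side by side in a small Python sketch. It only restates the simplified description in this post: every route starts from chlorophyllide a, the order of steps follows the order in which they are mentioned here, and the label "tail addition" and the `extra_steps` helper are my own placeholders rather than official enzyme or function names.

```python
# Placeholder label: the post does not name the enzyme that attaches the tail.
TAIL = "tail addition"

# Steps applied to chlorophyllide a, as described in the post (simplified).
PATHWAYS = {
    "chlorophyll a": [TAIL],
    "bacteriochlorophyll g": ["BchXYZ", TAIL],
    "bacteriochlorophyll a": ["BchXYZ", "BchF", "BchC", TAIL],
}


def extra_steps(pigment, reference):
    """Steps needed for `pigment` that are absent from the `reference` route."""
    return [step for step in PATHWAYS[pigment] if step not in PATHWAYS[reference]]


# What separates bacteriochlorophyll a from bacteriochlorophyll g?
print(extra_steps("bacteriochlorophyll a", "bacteriochlorophyll g"))
# What separates bacteriochlorophyll g from chlorophyll a?
print(extra_steps("bacteriochlorophyll g", "chlorophyll a"))
```

Seen this way, bacteriochlorophyll a is the route with the most steps, and BchF and BchC are exactly the two steps it adds on top of the bacteriochlorophyll g route.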
It seems to me that some of the earliest phototrophs likely made both chlorophyll a and bacteriochlorophyll g, or something like those, using the ancestral enzymes to LNB and XYZ, or the early versions of these. Only at a later stage was bacteriochlorophyll a invented, by adding BchF and BchC. And, I think, the addition of BchF and BchC happened in an ancestral bacterium predating the divergence of Chlorobi, Chloroflexi, Proteobacteria, and Acidobacteria... but already after the branching of some early groups of phototrophic Firmicutes, of which only Heliobacteria survives (or has been discovered).
And I also think the reason why bacteriochlorophyll a was invented is that bacteriochlorophyll g reacts with oxygen and bleaches really quickly.
I like a scheme presented in 1987 by Olson and Pierson, see the attached figure (Olson and Pierson 1987). I think they got pretty close! I have made some adjustments, which reflect more what I think… not exactly, but it will do for now.
A scheme by Olson and Pierson 1987 on the evolution of reaction centres and chlorophylls, with my modifications!

So, I suspect that bacteriochlorophyll a evolved at a relatively late stage, already when there was enough oxygen around to cause some damage. My hypothesis is that it evolved to stabilize the pigment in the presence of oxygen. In my PLoS ONE paper I argued that the phylogeny of BchF, if compared to that of Type I and Type II reaction centre proteins, suggests that when BchF appeared, both reaction centres had already diverged.
An interesting question is, when did bacteriochlorophyll a appear relative to bacteriochlorophyll g? I will attempt to measure this at some point.
Now, the evolution of BchC could also hold some interesting clues on the origin of bacteriochlorophyll a-based photosynthesis. And that is why I will present here some data, as I produce it, on the evolution of BchC. I already have some phylogenetic trees...

References
Cardona, T. (2016). "Origin of Bacteriochlorophyll a and the Early Diversification of Photosynthesis." PLoS One 11(3): e0151250.
Olson, J. M. and B. K. Pierson (1987). "Origin and Evolution of Photosynthetic Reaction Centers." Origins of Life and Evolution of the Biosphere 17(3-4): 419-430.
Tsukatani, Y., H. Yamamoto, T. Mizoguchi, Y. Fujita and H. Tamiaki (2013). "Completion of biosynthetic pathways for bacteriochlorophyll g in Heliobacterium modesticaldum: The C8-ethylidene group formation." Biochimica Et Biophysica Acta-Bioenergetics 1827(10): 1200-1204.