Friday, December 14, 2018

Evolution of the CP43 and CP47 antenna proteins of Photosystem II and the link to water oxidation

In our recent paper in Geobiology we made a strong case for the process of water oxidation to oxygen having originated before the duplication leading to D1 and D2.

Article Early Archean origin of Photosystem II

As you may know by now (if you follow my posts or work), the core of Photosystem II is not just made of D1 and D2, but these also have an intimate relationship with the antenna proteins CP43 and CP47. Why is it intimate? Because the CP43 binds the Mn4CaO5 cluster together with D1.

CP43-E354 coordinates two Mn atoms, and CP43-R357 offers a hydrogen bond to one of the Mn-bridging oxygen atoms and is within 4 Å of the calcium in the cluster.

We have seen now that D2 does not bind a cluster but instead a number of phenylalanine residues seem to replace the ligands and block access to Mn and water. What is remarkable is that CP47 also reaches within D2, as if to provide ligands to a long-gone cluster, but instead it inserts a few phenylalanine residues: one of them within less than 4 Å of the redox tyrosine, YD. Have a look at Figure 7H in the paper.

How? Why? What does this mean? Does it mean that in the homodimeric Photosystem II, before the D1/D2 duplication, the water-oxidising cluster was also coordinated by the antenna domain? Like CP43 does today?

When the crystal structure of the homodimeric Type I reaction centre of heliobacteria was released in 2017, I found a Ca2+ bound to the place where the Mn4CaO5 cluster would be, and these Ca2+-binding sites had a number of structural similarities with the water-oxidising cluster that I thought could not possibly be just coincidence. In particular, the putative Ca2+-binding site interacted with the antenna domain in a manner similar to Photosystem II.

I discussed this in an early and hasty version of a manuscript that I should be submitting for publication soon. Have a look:

Working Paper Origin of water oxidation at the divergence of Type I and Ty...

Funnily enough, Prof. Bob Blankenship said in a news article that he didn't believe it. Well, he should have believed it, because I'm right! :D haha

https://www.quantamagazine.org/simple-bacteria-offer-clues-to-the-origins-of-photosynthesis-20171017/

I jest.

Anyways, I have now taken a closer look at the antenna's extrinsic domains. And I found something AMAZING.

Have a look at the attached figure with the structural comparisons.


A, B, and C are the antenna of heliobacteria, CP43, and CP47 respectively, each in four different views. In grey you see the transmembrane helices and in colour the extrinsic domain between the 5th and 6th helices. Panel D shows a schematic view.

I have split the extrinsic domain of CP43 into three bits: EF2, EF3, and EF1.

EF1 is retained in all Type I reaction centres (except PsaA and PsaB) and in CP43 and CP47.

EF3 binds the manganese cluster in CP43. This EF3 region is also found in CP47, but it is at a different location! A change of place occurred!

There is sequence identity in all of the matching domains once they are compared to each other.

Have a look at the attached alignment comparing only the EF3. Sequence identity is unambiguous.


The green arrows indicate the positions where EF3/EF4 are “inserted” in both subunits.

The two residues at homologous positions in the CP47-EF3 region bind a calcium! Yeah, that is right! They bind a calcium!

CP43-E354 is CP47-E435, and CP43-R357 is CP47-N438 as shown in the figure. The Ca2+ is not found in the CP47 of photosynthetic eukaryotes (I did not see it in the structure of the red algae PSII). Except perhaps for the PSII of Cyanophora paradoxa and relatives: early-branching algae.

In CP47, EF1, which in heliobacteria binds the Ca2+, interacts with CP47-N438 via K332.

The phenylalanine residues that in CP47 insert themselves into D2 are found in the region marked as EF4, which does not exist in CP43.

The level of sequence identity between CP43 and CP47 is about 20%. But this falls to virtually 0% in the extrinsic domain if these are compared in their current order. If you remove EF4, and align the homologous bits together, the sequence identity is back to 20%! Unbelievable.
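To make the effect of domain order concrete, here is a minimal Python sketch. It is a toy illustration only: the six-residue blocks, the shuffled block order of the second protein, and the helper `percent_identity` are all invented for the example. They are not the real CP43 or CP47 sequences, and this is not the procedure used for the attached alignment. The sketch also assumes one particular definition of identity (matches divided by the number of gap-free columns compared).

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical blocks standing in for homologous regions (NOT real sequences).
protein_a = {"EF2": "ACDEFG", "EF3": "HIKLMN", "EF1": "PQRSTV"}
protein_b = {"EF2": "ACDEYG", "EF3": "HIKLMW", "EF1": "PQRSAV"}

# Assumed block orders along each protein; the order for protein_b is made up.
order_a = ["EF2", "EF3", "EF1"]
order_b = ["EF3", "EF1", "EF2"]

# Compare position by position in the order each protein carries its blocks.
in_order = percent_identity(
    "".join(protein_a[name] for name in order_a),
    "".join(protein_b[name] for name in order_b),
)

# Compare each block against its homologous partner instead.
by_block = percent_identity(
    "".join(protein_a[name] for name in order_a),
    "".join(protein_b[name] for name in order_a),
)

print(f"in current order: {in_order:.1f}%")
print(f"homologous blocks paired: {by_block:.1f}%")
```

With these toy blocks the in-order comparison finds no matches at all, while pairing homologous blocks recovers the identity that was there all along. The real numbers quoted above (virtually 0% versus about 20%) come from the actual sequences, not from this sketch.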

You might think that 20% overall sequence identity is too low, but the level of sequence identity between the alpha and beta subunits of ATP synthase is also 20%. Just to give you context.

You might think that the CP43 and CP47 have evolved very fast… the opposite is true. Currently after D1 and D2, the second slowest evolving reaction centre subunits are the CP43 and CP47, evolving even slower than ATP synthase today (unpublished data).

All in all it means that EF2, EF3, and EF1 were already present at the moment of duplication!

Given that EF4 only exists in CP47, we can then argue that this was not present before duplication, and therefore the phenylalanine residues that today get inserted into D2 and interact with YD could not have been in the homodimer. So the D2 and CP47 phenylalanine patch could not have been the ancestral state, as it is of course obvious from everything we discussed in the Geobiology paper and what had been described by Bill Rutherford and Wolfgang Nitschke in the 90s (see references in the paper).

Given that EF3 is found in both CP43 and CP47, and that CP43-E354 is conserved as CP47-E435, and similar for position CP43-357 (CP47-438), and given that they still bind something (manganese/calcium), we can then argue that these residues were also available for metal-binding before duplication.

It is consistent with a homodimer photosystem, with clusters on both sides, and with ligands from the antenna. It also strengthens the notion that the Ca2+-binding site in the homodimeric Type I reaction centre is a real thing, and that the structural divergence of Type I and Type II reaction centres is indeed linked to the evolution of the Mn4CaO5 cluster and water oxidation to oxygen.

What this means you can read here:

Article Photosystem II is a Chimera of Reaction Centers

And here:

Preprint Thinking Twice about the Evolution of Photosynthesis

I think that originally manganese and water oxidation started with the help of a small domain similar to that in heliobacteria. A metal-binding site exposed to the media and soluble ions. Once manganese oxidation and an early version of water oxidation got started, the extrinsic domain in the ancestral protein to CP43 and CP47 then increased in complexity, evolving EF2 and EF3 in a drive to provide proton and water channels, to shield the cluster, and to provide a site of interaction with extrinsic polypeptides.

Then the swap of position of EF3 and the evolution of EF4 in the ancestral CP47 contributed to heterodimerization and the loss of water oxidation in D2.

This happened immediately after the divergence of Type I and Type II reaction centres, LONG before the most recent common ancestor of Cyanobacteria.

Did you know that at the gene level, the N-terminus of the CP43 gene overlaps with the C-terminus of the D2 gene contributing a few additional amino-acids to the latter? This is a trait shared by most cyanobacteria, including the earliest branching, and explains how D2 lost the ligands to the cluster located at the C-terminus.

Beautiful, just beautiful.

Sunday, December 2, 2018

Early Archean origin of Photosystem II: materials for the press office


An integral part of research is outreach and dissemination. I like my papers to be accompanied by a press release, if possible, to make them more visible to the public. Sometimes, what I do is send some materials to the press officer in our faculty and ask whether a press release can be written from them.

Below you find those materials, which I think could help some interested readers digest some of the information in the paper. This is the official press release from the college: https://www.imperial.ac.uk/news/189232/oxygen-could-have-been-available-life/

This is our recent paper: Early Archean origin of Photosystem II

Summary of the paper

The problem
When or how oxygenic photosynthesis originated remains controversial. Understanding how and when oxygenic photosynthesis emerged is fundamental to understanding how life has evolved through the long history of the planet. For example, it is important to understand when oxygen was available to life for the first time. Oxygen permitted the evolution of aerobic respiration, which is the main energetic process that powers most life on Earth and is essential to sustain the complexity of animals and humans. It is also important for understanding the probability of complex life evolving in other solar systems. For example, if oxygenic photosynthesis is a very difficult process to evolve, then the probability of complex life emerging on a distant exoplanet may be very low.

The controversy is the result of the difficulty of unequivocally and unambiguously detecting oxygen in the rock record or figuring out when the first oxygen-producers evolved for the first time.

The older the rocks, the rarer they are, and the harder it is to prove conclusively that any fossil microbes found in these ancient rocks used or produced any amount of oxygen.

Today, the oldest known oxygen-producers are called cyanobacteria. These bacteria became the chloroplast of algae and plants, but all cyanobacteria that we know of use a very sophisticated form of oxygenic photosynthesis. So figuring out when cyanobacteria originated does not really tell us when oxygenic photosynthesis appeared for the first time, but only tells us when a very sophisticated form of oxygenic photosynthesis was already possible.

Therefore, it cannot tell us when oxygenic photosynthesis really got started and what ancestral forms of oxygenic photosynthesis looked like.

What we did
To overcome these difficulties, we studied the evolution of Photosystem II, nature’s solar panel, which uses the energy of light to break water molecules into their components: protons, electrons, and oxygen. If we can understand when and how Photosystem II evolved the capacity to oxidize water, then we may have a better idea of when and how oxygenic photosynthesis got started, even before there was enough oxygen on the planet to leave a trace in the rock record.

The core of Photosystem II is made of two evolutionarily related proteins, called D1 and D2, which originated from a gene duplication. D1 and D2 are very similar to each other at a structural level but they differ at the sequence level, that is, at the amino acid level. In other words, they look the same but the basic building blocks have changed. Today D1 and D2 share about 30% amino acid sequence identity. That means that of the approximately 350 building blocks that make up D1 and D2, slightly over a hundred are identical between the two, whereas at the moment of duplication they were 100% identical.

Fortunately, the function and structure of Photosystem II has been studied in great detail, so we can tell from what D1 and D2 look like, and from the remaining ~100 identical building blocks, that before the duplication that allowed the evolution of D1 and D2, water oxidation was possible.
Oxygen is a very reactive molecule: that is why it is so important to life, because it can drive many chemical reactions that are essential to life. Oxygen can also react with chlorophyll, leading to the formation of what are called reactive oxygen species. These reactive forms of oxygen are very toxic to life. So all photosynthetic organisms have evolved mechanisms to protect against reactive oxygen species and to prevent oxygen molecules from interacting with chlorophyll. By comparing D1 and D2 we can also tell that before the duplication, the ancestral Photosystem II had already evolved mechanisms to protect against damage caused by oxygen.

What needed to be done now is to find out the span of time between the duplication event (when D1 and D2 were 100% identical) to the ancestor of all cyanobacteria, which inherited a standard sophisticated Photosystem II (when D1 and D2 had left only about 30% identical building blocks).
To do that we need to find out how fast D1 and D2 are changing: that is, the rate of evolution. We can find out using a technique called Bayesian relaxed molecular clock analysis. The method uses the power of statistics and known events in the evolution of photosynthetic organisms from the fossil record to calculate the rates of change.

The results
We found out that D1 and D2 are evolving at a very slow rate. The rate is so slow that it would take about 8 billion years for two identical D1 sequences today to change so much that they no longer resemble each other. For example, we know that the ancestor of flowering plants and most algae is more than 1 billion years old, but if I compare D1 in an alga and D1 in the banana tree, they will be about 87% identical. So in more than 1 billion years of evolution, out of approximately 350 building blocks, fewer than 50 have changed in all plants and algae. If you compare the D1 in all flowering plants, which appeared around the time of the dinosaurs, they’ll be over 98% identical: that is fewer than 10 changes in more than 130 million years!
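The numbers in this section are simple arithmetic on the protein length and the percentage identity, and a few lines of Python make it easy to check them. This is only a back-of-the-envelope sketch: the function names are made up for the illustration, and the length of 350 is the approximate figure quoted in the text, not an exact count for any particular sequence.

```python
def identical_positions(length, percent_identity):
    """Number of positions that are the same in both sequences."""
    return length * percent_identity / 100


def changed_positions(length, percent_identity):
    """Number of positions that differ between the two sequences."""
    return length * (100 - percent_identity) / 100


LENGTH = 350  # approximate number of building blocks in D1 or D2

d1_vs_d2_identical = identical_positions(LENGTH, 30)     # D1 compared with D2
alga_vs_banana_changed = changed_positions(LENGTH, 87)   # D1 in an alga vs a banana tree
flowering_plants_changed = changed_positions(LENGTH, 98) # D1 across flowering plants

print(d1_vs_d2_identical, alga_vs_banana_changed, flowering_plants_changed)
```

The three results are about 105 identical positions between D1 and D2, about 45 changes between an alga and the banana tree, and about 7 changes among flowering plants, which is where "slightly over a hundred", "fewer than 50", and "fewer than 10" come from.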

It is not strange at all that Photosystem II evolves so slowly: all complex enzymes that can be traced to the earliest forms of life evolve at similar rates. Because they fulfil important functions, most changes are likely to result in a worse enzyme rather than a better one, so most mutations are naturally wiped out. That is why we can tell that all life on Earth had a single origin, because many of the enzymes important for function have evolved at a really slow pace, so that even after 4 billion years of evolution they still look the same and work in similar ways in all groups of life.

We found out that because D1 and D2 are evolving so slowly, the span of time between the duplication and the ancestor of cyanobacteria is likely to be a billion years or more! We cannot tell, however, exactly when the ancestor of cyanobacteria appeared for the first time, but if it existed about 2.5 billion years ago, then the duplication could easily have occurred more than 3.5 billion years ago. The important discovery is that it does not matter when the ancestor of cyanobacteria appeared, because the span of time between the duplication (the dawn of oxygenic photosynthesis) and this ancestor will always be very large.

Another amazing thing we discovered is that even when the span of time is one billion years, the rate of change at the moment of duplication had to be about 40 times greater than the observed rates in the past 2.0 billion years. Forty times the current speed of change is about the limit of what is possible for molecular machines of such a level of complexity. In fact, it is already above any measured rate for these kinds of complex, highly conserved molecular machines. Knowing that, we can calculate that if this gap of time were smaller, the rate at the duplication would have to be faster, and quickly enough the rates would be so large that they would be outside the realms of biology.

Imagine a car going from Paris to Berlin, a journey of about 1000 km. It would take about 10 hours to drive that distance at about 100 km per hour. If we want to arrive in 5 hours, we would need to drive at about twice the speed, but if we want to arrive in 1 hour, we would need to go at 10 times the speed, almost the speed of sound. That is not possible even for the fastest Formula 1 car. It is the same for the speed of evolution.
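The car analogy is just distance divided by time, so it can be written down in a couple of lines. This sketch covers the analogy only. The function name is invented, the distance is the round figure used above, and the real molecular clock analysis in the paper is a statistical model that is far more involved than a single division.

```python
def required_speed(distance_km, hours):
    """Average speed needed to cover a fixed distance in a given time."""
    if hours <= 0:
        raise ValueError("travel time must be positive")
    return distance_km / hours


PARIS_TO_BERLIN_KM = 1000  # round figure used in the analogy

# The shorter the time allowed, the faster the car has to go.
speeds = {hours: required_speed(PARIS_TO_BERLIN_KM, hours) for hours in (10, 5, 1)}

for hours, speed in speeds.items():
    print(f"{hours:>2} h -> {speed:.0f} km/h")
```

Halving the time doubles the required speed, and cutting it to a tenth multiplies the speed by ten. The same inverse relationship is what makes a very short gap between the duplication and the ancestor of cyanobacteria so implausible: the fixed amount of change has to be squeezed into less time.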

This is also important because it tells us in a very straightforward manner that evolutionary scenarios in which oxygenic photosynthesis originated very quickly before the ancestor of cyanobacteria can be ruled out with confidence. Even if we don’t know when exactly cyanobacteria originated.

The bigger picture
The main implication of the paper is that oxygen was available to life long before it started to accumulate in the air about 2.4 billion years ago. This is in agreement with current geological data that suggest that whiffs of oxygen or localized accumulations of oxygen were possible before 3.0 billion years ago.

There have been debates on whether aerobic respiration evolved before or after cyanobacteria, and therefore before or after oxygenic photosynthesis. This is because the enzymes used for aerobic respiration appear to be much older than cyanobacteria. But how can aerobic respiration have evolved before oxygen was available to life? In the absence of oxygenic photosynthesis it is expected that the amount of oxygen available to life would be virtually negligible. So scientists have had to come up with convoluted scenarios to explain this. Our data help explain how this is possible, because oxygenic photosynthesis likely got started long before the ancestor of cyanobacteria. Today oxygenic photosynthesis is only found in cyanobacteria, but our data suggest that many other microbes that today do not do photosynthesis may have had ancient ancestors with the capacity to split water using light.

In fact, recent data hint at the possibility that oxygen was important for the development of the genetic code, and reconstructions of the genetic capabilities of the earliest forms of life always retrieve enzymes to protect against reactive forms of oxygen, but the latter are usually dismissed as artefacts or anomalies. Our work can help explain how this is actually possible, because the older cyanobacteria are found to be, the more likely it is that oxygenic photosynthesis started at the earliest stages in the history of life and soon after the earliest forms of photosynthesis.

What’s next
We are trying now to bring back to life what the ancestral photosystem before the duplication looked like using a method called Ancestral Sequence Reconstruction. This is a well-established method that allows us to predict the basic building blocks of the ancestral enzyme using the known variation across all extant species. We cannot travel back in time to 3.0 billion years ago, but we can make the ancestral enzyme travel from the distant past into our test tube in the lab today.
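To give a feel for how an ancestral sequence can be predicted from living species, here is a deliberately simplified Python sketch. It uses Fitch parsimony, which is a much simpler technique than the statistical methods normally used for Ancestral Sequence Reconstruction, and it is not the procedure used in our project. The tree shape, the species names, the four-letter sequences, and the function names are all hypothetical.

```python
def fitch_root_states(tree, column):
    """Bottom-up Fitch pass: candidate residues at the root for one column.

    tree   -- a leaf name (str) or a pair (left_subtree, right_subtree)
    column -- dict mapping each leaf name to its residue at this position
    """
    if isinstance(tree, str):
        return {column[tree]}
    left, right = tree
    a = fitch_root_states(left, column)
    b = fitch_root_states(right, column)
    # Keep the residues both sides agree on; otherwise keep all candidates.
    return (a & b) or (a | b)


def reconstruct_root(tree, alignment, ambiguous="X"):
    """Predict the root sequence column by column; mark ties as ambiguous."""
    length = len(next(iter(alignment.values())))
    residues = []
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        states = fitch_root_states(tree, column)
        residues.append(min(states) if len(states) == 1 else ambiguous)
    return "".join(residues)


# Hypothetical tree and aligned toy sequences (not real D1 or D2 data).
toy_tree = (("sp1", "sp2"), ("sp3", "sp4"))
toy_alignment = {
    "sp1": "MKFA",
    "sp2": "MKYA",
    "sp3": "MRFA",
    "sp4": "MKFG",
}

ancestor = reconstruct_root(toy_tree, toy_alignment)
print(ancestor)
```

For each position the sketch asks which building block at the root would require the fewest changes along the tree. Where the answer is not unique it writes an X rather than guessing, which is one reason real reconstructions rely on statistical models that can attach a probability to each candidate.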

Because the enzyme is evolving so slowly, its structure has not changed much since its origin; what has changed are the particular building blocks at the different positions of the preserved structure. That makes it a very suitable system for Ancestral Sequence Reconstruction, or targeted site-directed mutagenesis, although that does not mean it is easy. Nevertheless, we now have modified strains of cyanobacteria expressing some of the ancestral genes and we will soon attempt to validate our predictions experimentally. This is a three-year project funded by the Leverhulme Trust.