This blog has now migrated to https://www.tanaicardona.com/blog
Thank you for reading and visiting.
All the best,
Tanai
Tanai's Science Blog
This is a space to share my thoughts, ideas, hypotheses, some data, and unpublished results.
Thursday, August 8, 2019
Tuesday, March 12, 2019
What if light was important for the origin and early evolution of life?
I have become very
interested in the idea that photosynthesis and photosynthetic water oxidation
were important for the origin and early evolution of life. Call me crazy, but I
have reasons!
In this brief post, I
will try to present the reasons why I suspect that the emergence of
photosynthesis may predate the last universal common ancestor (LUCA).
I have not always
thought this way. I have been led here by the results of my
research. I think the evolution of photosynthetic reaction centres strongly
suggests that light was involved in the early evolution of life.
But how?
Please, bear with me.
The reader should
know that I am not an expert on Origin of Life research, and I am only
superficially familiar with a couple of the different scenarios. I know for
example that today the idea that life arose in hydrothermal vents is very
popular, although I also know that it is not the only competing hypothesis. In
a couple of years, I might be an expert (I am studying hard).
I also know that
quite some time ago, it was speculated and considered that the origin of life
was somehow photosynthetic, even oxygenic.
For example, Sam
Granick wrote in his famous 1957 paper: “It
seems more reasonable to consider that the functions of oxidation and
photosynthesis were so fundamental that they were part of the first beginnings
of protoplasm that arose from inorganic origins.” Then he went on to say: “I propose, as speculation, that the
earliest unit around which any living entity arose was an energy-conversion
unit. This unit of mineral origin would contain an organization of atoms that
would serve as a photocatalyst, at first perhaps in the decomposition of water
by UV radiation.”
Today things have
changed and scientists do not think this way anymore. Why is that? It is due to
a number of reasonable, but unproven assumptions:
1) Photosynthesis has
only been discovered in the domain Bacteria, therefore it appears reasonable
that the origin of photosynthesis likely occurred after the divergence of Archaea
and Bacteria.
2) Oxygenic
photosynthesis evolved in Cyanobacteria, so it appears reasonable that the
origin of water oxidation is a late invention relative to the origin of life.
In reality, it is a
bit more complicated than that as I have recently discussed. This is mainly
because the origin of photosynthesis cannot be determined based on a species
tree alone. What I mean is that a gene tree and a species tree do not always correspond.
So, to understand at what point in the history of life photosynthesis arose we
must understand how and when photosynthetic reaction centres and the
chlorophyll synthesis pathway arose.
OK, so what is the
evidence that suggests photosynthesis is a pre-LUCA innovation?
Allow me to
recapitulate several aspects regarding the evolution of photosynthesis.
Firstly, I have
concluded that the divergence of Type I and Type II reaction centres predates
the divergence of the major groups of bacteria. This is true regardless of the specific
evolutionary processes that led to the current distribution of photosynthesis
across the tree of life. In other words, the earliest events in the origin of
photosynthesis predate the evolution of most groups of bacteria that we know
of, including all phototrophs.
There are several
reasons why this can be concluded with a good level of confidence. I cannot
discuss them here in huge detail because it is not the point of this post, but
if you want to know more please see this, this, or just message me for more
details. The most important reason, however, is that both Type I and Type II
reaction centres form monophyletic clades. Therefore, before we have the Type
II reaction centre of purple bacteria or the homodimeric Type I reaction centre
of the green sulfur bacteria, we first need to have the processes that led to
the ancestor of all Type I reaction centres and the ancestor of all Type II
reaction centres.
From this it can also
be concluded that at the point in time of the most recent common ancestor of all phototrophs, whatever this was,
Type I and Type II reaction centres had already appeared.
Secondly, I have
shown that to explain the structural characteristics of Photosystem II,
including the coordination sphere of the Mn4CaO5 cluster (the
oxygen evolving complex), water oxidation must have appeared before, at, or
immediately after the divergence of Type I and Type II reaction centres.
Putting these two
points together, we then get that water oxidation chemistry originated before
the diversification of most groups of bacteria, including Cyanobacteria.
Thirdly, I attempted
to understand the evolution of Photosystem II as a function of time. What I
discovered is that the roots of Photosystem II, as determined by the gene
duplication leading to the heterodimerisation of the photochemical core (D1 and
D2), trace back to long before the most recent common ancestor of Cyanobacteria.
This boils down to the fact that the rates of evolution of Photosystem II are
tremendously slow. It is a bit more complicated than that, but this should
suffice for the moment.
At this point we have
traced photosynthetic water oxidation to an early stage in the evolution of the
domain Bacteria.
But how do we go from
there to before the LUCA?
Warning! I am not
trying here to explain the origin of life. I am not trying to come up with a
reasonable evolutionary scenario. I am only following the evidence at hand,
which is directly derived from the study of the molecular evolution of the
reaction centres.
About two years ago,
I was at a local meeting at Imperial. I presented my research on the evolution
of Photosystem II and a well-known Nobel Prize winner mentioned that the
evolution of ATP synthase seemed to share some similarities with Photosystem
II.
Basically, the photochemical
core of Photosystem II is made of two homologous subunits, D1 and D2. Catalysis
occurs in D1. The catalytic core of ATP synthase is made of two homologous
subunits, the alpha and beta subunits: they make the hexameric head. The beta
subunit has the catalytic active site.
To provide further
support that Photosystem II and water oxidation are as old as I suggested in the
Geobiology paper, I thought that it would be a good idea to compare it to the
evolution of other enzymes. I wanted to compare the D1/D2 and CP43/CP47
duplication events with one duplication that is known to be very ancient and
with a duplication that is known to be very recent.
ATP synthase is a perfect
point of reference for the very ancient duplication, not only because of those
similarities with Photosystem II, but also because we know that the duplication
leading to alpha and beta predates the LUCA.
Therefore, if
Photosystem II emerged long after the LUCA, then, given the slow and very predictable
rates of evolution of these complexes, major differences in evolutionary
patterns should be absolutely clear.
What I found is that
Photosystem II evolves at a slower rate than ATP synthase.
I am talking here of some
of the slowest rates of evolution in all biology.
ATP synthase evolves
so slowly that even though the duplication leading to alpha and beta occurred
before the LUCA, they still retain about 20% sequence identity and they are still
structurally very similar. That is slow enough so that after billions of years
of evolution strong sequence and structural identity is retained. Because the
duplication is so old, then it makes sense that after billions of years the
level of sequence identity between alpha and beta is relatively low.
Well, Photosystem II
evolves slower than ATP synthase! And the core subunits, D1 and D2, show 29%
sequence identity. The antenna of Photosystem II, CP43 and CP47, which also
originated from a gene duplication event, have about the same level of sequence
identity as alpha and beta, 18%. And guess what, the rate of evolution of CP43
and CP47 is only slightly slower than the rate of alpha and beta.
From this reference.
Under similar
conditions D1 and D2 are evolving at about 0.12 ± 0.04 amino acid changes per
site per billion years (Cardona et al. 2019). CP43 and CP47 at about 0.19 ± 0.04 amino acid
changes per site per billion years (unpublished) and alpha and beta at about 0.28
± 0.06 amino acid changes per site per billion years (unpublished).
This means that there
are no differences in the evolutionary patterns of the ATP synthase catalytic unit
when compared to the core of Photosystem II! No matter how I model their
evolution, I will not be able to place the origin of Photosystem II after the
origin of ATP synthase.
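As a rough back-of-the-envelope illustration of the three rates quoted above (not the relaxed molecular clock analysis itself), the sketch below puts them side by side and converts each into the time needed for one expected change per site. The dictionary, the function names, and the assumption of a constant rate are all introduced here for illustration only.

```python
# Rates quoted in the post, in amino acid changes per site per billion years.
# Each entry is (mean, uncertainty) exactly as given in the text.
RATES = {
    "D1/D2": (0.12, 0.04),
    "CP43/CP47": (0.19, 0.04),
    "ATP synthase alpha/beta": (0.28, 0.06),
}


def years_per_change(rate):
    """Billions of years for one expected change per site.

    Assumes a constant rate, which is a simplification made for this sketch.
    """
    return 1.0 / rate


def summarise(rates):
    """Return (name, mean rate, Gyr per change per site), slowest first."""
    rows = [
        (name, mean, years_per_change(mean))
        for name, (mean, _uncertainty) in rates.items()
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, mean, gyr in summarise(RATES):
        print(f"{name}: {mean:.2f} changes/site/Gyr, ~{gyr:.1f} Gyr per change per site")
```

For D1/D2 the conversion is simply 1 / 0.12, or a little over 8 billion years per expected change per site, which gives a feel for how slow these rates are.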
The rate of evolution
is strongly related to the complexity of the system. A case could be made to argue
that all reaction centres show greater complexity than ATP synthase.
Therefore, the
earliest stages of Photosystem II evolution could coincide with or might
slightly predate those leading to V-/F-type ATP synthases. If this is the case,
then water oxidation and photosynthesis predate the LUCA.
Again, I want the
reader to understand that I am not trying to come up with an origin of life
scenario based on a collection of reasonable assumptions.
This is the path that
the evidence has pointed towards…
Imagine the ribosome.
Kind of in between the origin of information processing and protein synthesis.
A complex molecular machine made of protein and RNA.
Imagine now reaction
centres. Forget everything you know about reaction centres and look at them
with fresh eyes. A bag of cofactors and proteins unlike anything else in
biology. What if they emerged at the interface between the pre-biotic synthesis
of porphyrin-derived compounds and the very first proteins involved in
photochemical energy conversion and electron transfer?
Friday, March 1, 2019
Two phototrophic strains of Deltaproteobacteria (Myxococcota)
Phototrophy had not previously been reported in Deltaproteobacteria. Using bioinformatics, I show that two distant strains of Deltaproteobacteria probably acquired phototrophy via a single event of horizontal gene transfer from Alphaproteobacteria into the most recent common ancestor of the proposed class of Deltaproteobacteria, Polyangia.
I have uploaded a short document to ResearchGate with the details of this. Please have a look if you're interested and leave some feedback.
https://www.researchgate.net/publication/331453324_Two_phototrophic_strains_of_Deltaproteobacteria_Myxococcota
Friday, January 11, 2019
Has scientific output in photosynthesis research peaked?
I have bookmarked search queries for "photosystem", "cyanobacteria", and "photosynthesis" on the PubMed database to keep up to date with the literature. I have done that for quite a few years now and I have noticed a trend in the "results per year" box that the search usually shows, in the right corner...
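For anyone who would rather collect those yearly counts programmatically than read them off the results box, here is a minimal sketch that only builds the query URLs, one per keyword and year. It assumes NCBI's E-utilities `esearch` endpoint and its `db`, `term`, `datetype`, `mindate`, `maxdate`, and `rettype` parameters; the function names are made up for this example, fetching and parsing the responses is left out, and the counts returned may not match the website's box exactly.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed here; check the NCBI docs before use).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def count_url(term, year):
    """Build a URL asking PubMed how many records match `term` in one year."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",    # filter on publication date
        "mindate": str(year),
        "maxdate": str(year),
        "rettype": "count",    # ask for the hit count only
    }
    return ESEARCH + "?" + urlencode(params)


def yearly_urls(term, first_year, last_year):
    """Map each year in the inclusive range to its query URL."""
    return {year: count_url(term, year) for year in range(first_year, last_year + 1)}


if __name__ == "__main__":
    for keyword in ("photosystem", "cyanobacteria", "photosynthesis"):
        print(keyword, count_url(keyword, 2015))
```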
It looks like scientific output in photosynthesis research has peaked. See the graph below that shows the number of papers found for each keyword per year. The trend is clear:
In the years 2000 and 2001 there was a big rise in the number of publications on "photosynthesis" and "cyanobacteria"... and then it kept increasing non-stop. There is a tiny slow-down around the 2008 economic crisis, but since 2015/2016 the output has reached a plateau.
Is this reflecting the economy?
I don't think it is just photosynthesis research. Have a look at this, using "mice", "cancer", and "neuron" as search queries:
You can see similar trends... What does this mean? Have we reached the maximum capacity of our intellectual potential as humans?
Well, I do not think so... while the number of PhD graduates and postdocs has increased massively, the number of tenure-track positions at universities and other academic institutions has not changed at all for decades. So, I don't think it has anything to do with capacity for output; it is rather a reflection of the amount of cash that is invested in research.
It is a problematic trend, however, if one is counting on scientific innovations to overcome the greatest challenge we have ever faced: climate change!
Let me know what you think.
Friday, December 14, 2018
Evolution of the CP43 and CP47 antenna proteins of Photosystem II and the link to water oxidation
In our recent paper in Geobiology we made a strong case for the process of water oxidation to oxygen having originated before the duplication leading to D1 and D2.
Article Early Archean origin of Photosystem II
As you may know by now (if you follow my posts or work), the core of Photosystem II is not just made of D1 and D2, but these also have an intimate relationship with the antenna proteins CP43 and CP47. Why is it intimate? Because the CP43 binds the Mn4CaO5 cluster together with D1.
CP43-E354 coordinates two Mn atoms, and CP43-R357 offers a hydrogen bond to one of the Mn-bridging oxygen atoms and it is within 4 Å from the calcium in the cluster.
We have seen now that D2 does not bind a cluster but instead a number of phenylalanine residues seem to replace the ligands and block access to Mn and water. What is remarkable is that CP47 also reaches within D2, as if to provide ligands to a long-gone cluster, but instead it inserts a few phenylalanine residues: one of them within less than 4 Å of the redox tyrosine, YD. Have a look at Figure 7H in the paper.
How? Why? What does this mean? Does it mean that in the homodimeric Photosystem II, before the D1/D2 duplication, the water-oxidising cluster was also coordinated by the antenna domain? Like CP43 does today?
When the crystal structure of the homodimeric Type I reaction centre of heliobacteria was released in 2017, I found a Ca2+ bound to the place where the Mn4CaO5 cluster would be, and these Ca2+-binding sites had a number of structural similarities with the water-oxidising cluster that I thought could not possibly be just coincidence. In particular, the putative Ca2+-binding site interacted with the antenna domain in a manner similar to Photosystem II.
I discussed this in an early and hasty version of a manuscript that I should be submitting for publication soon. Have a look:
Working Paper Origin of water oxidation at the divergence of Type I and Ty...
Funnily enough, Prof. Bob Blankenship said in a news article that he didn't believe it. Well, he should believe it, because I'm right! :D haha
https://www.quantamagazine.org/simple-bacteria-offer-clues-to-the-origins-of-photosynthesis-20171017/
I jest.
Anyways, I have now taken a closer look at the antenna's extrinsic domains. And I found something AMAZING.
Have a look at the attached figure with the structural comparisons.
A, B, and C, are the antenna of heliobacteria, CP43, and CP47 respectively. In four different views. In grey you see the transmembrane helices and in colours the extrinsic domain between the 5th and 6th helices. In panel D you can see a schematic view.
I have split the extrinsic domain of CP43 into three bits: EF2, EF3, and EF1.
EF1 is retained in all Type I reaction centres (except PsaA and PsaB) and in CP43 and CP47.
EF3 binds the manganese cluster in CP43. This EF3 region is also found in CP47, but it is at a different location! A change of place occurred!
There is sequence identity in all of the matching domains once they are compared to each other.
Have a look at the attached alignment comparing only the EF3. Sequence identity is unambiguous.
The green arrows indicate the positions where EF3/EF4 are “inserted” in both subunits.
The two residues at homologous positions in the CP47-EF3 region bind a calcium! Yeah, that is right! They bind a calcium!
CP43-E354 is CP47-E435, and CP43-R357 is CP47-N438 as shown in the figure. The Ca2+ is not found in the CP47 of photosynthetic eukaryotes (I did not see it in the structure of the red algae PSII). Except perhaps for the PSII of Cyanophora paradoxa and relatives: early-branching algae.
In CP47, EF1 which in heliobacteria binds the Ca2+, interacts with the CP47-N438 via K332.
The phenylalanine residues that in CP47 insert themselves into D2 are found in the region marked as EF4, which does not exist in CP43.
The level of sequence identity between CP43 and CP47 is about 20%. But this falls to virtually 0% in the extrinsic domain if these are compared in their current order. If you remove EF4, and align the homologous bits together, the sequence identity is back to 20%! Unbelievable.
You might think that 20% overall sequence identity is too low, but the level of sequence identity between the alpha and beta subunits of ATP synthase is also 20%. Just to give you context.
You might think that the CP43 and CP47 have evolved very fast… the opposite is true. Currently after D1 and D2, the second slowest evolving reaction centre subunits are the CP43 and CP47, evolving even slower than ATP synthase today (unpublished data).
All in all it means that EF2, EF3, and EF1 were already present at the moment of duplication!
Given that EF4 only exists in CP47, we can then argue that this was not present before duplication, and therefore the phenylalanine residues that today get inserted into D2 and interact with YD could not have been in the homodimer. So the D2 and CP47 phenylalanine patch could not have been the ancestral state, as it is of course obvious from everything we discussed in the Geobiology paper and what had been described by Bill Rutherford and Wolfgang Nitschke in the 90s (see references in the paper).
Given that EF3 is found in both CP43 and CP47, that CP43-E354 is conserved as CP47-E435 (and similarly for position CP43-357, which is CP47-438), and that they still bind something (manganese/calcium), we can then argue that these residues were also available for metal-binding before duplication.
It is consistent with a homodimer photosystem, with clusters on both sides, and with ligands from the antenna. It also strengthens the notion that the Ca2+-binding site in the homodimeric Type I reaction centre is a real thing, and that the structural divergence of Type I and Type II reaction centres is indeed linked to the evolution of the Mn4CaO5 cluster and water oxidation to oxygen.
What this means you can read here:
Article Photosystem II is a Chimera of Reaction Centers
And here:
Preprint Thinking Twice about the Evolution of Photosynthesis
I think that originally manganese and water oxidation started with the help of a small domain similar to that in heliobacteria. A metal-binding site exposed to the media and soluble ions. Once manganese oxidation and an early version of water oxidation got started, the extrinsic domain in the ancestral protein to CP43 and CP47 then increased in complexity, evolving EF2 and EF3 in a drive to provide proton and water channels, to shield the cluster, and to provide a site of interaction with extrinsic polypeptides.
Then the swap of position of EF3 and the evolution of EF4 in the ancestral CP47 contributed to heterodimerization and the loss of water oxidation in D2.
This happened immediately after the divergence of Type I and Type II reaction centres, LONG before the most recent common ancestor of Cyanobacteria.
Did you know that at the gene level, the N-terminus of the CP43 gene overlaps with the C-terminus of the D2 gene, contributing a few additional amino acids to the latter? This is a trait shared by most cyanobacteria, including the earliest branching, and explains how D2 lost the ligands to the cluster located at the C-terminus.
Beautiful, just beautiful.
Sunday, December 2, 2018
Early Archean origin of Photosystem II: materials for the press office
An integral part of research is outreach and dissemination. I like my papers to be accompanied by a press release, if possible, to make them more visible to the public. Sometimes, what I do is send some materials to the press officer in our faculty and ask whether a press release can be written from them.
Below you find those materials, which I think could help some interested readers digest some of the information in the paper. This is the official press release from the college: https://www.imperial.ac.uk/news/189232/oxygen-could-have-been-available-life/
This is our recent paper: Early Archean origin of Photosystem II
Summary of the paper
The problem
When or how oxygenic photosynthesis originated remains
controversial. Understanding how and when oxygenic photosynthesis emerged is
fundamental to understanding how life has evolved through the long history of the
planet. For example, it is important to understand when oxygen was available to
life for the first time. Oxygen permitted the evolution of aerobic respiration,
which is the main energetic process that powers most life on Earth and it is
essential to sustain the complexity of animals and humans. It is also
important to understand the probability of complex life evolving in other solar
systems. For example, if oxygenic photosynthesis is a very difficult
process to evolve, then the probability of complex life emerging in a distant
exoplanet may be very low.
The controversy is the result of the difficulty of
unequivocally and unambiguously detecting oxygen in the rock record or figuring
out when the first oxygen-producers evolved.
The older the rocks, the rarer they are, and the harder it
is to prove conclusively that any fossil microbes found in these ancient rocks
used or produced any amount of oxygen.
Today, the oldest known oxygen-producers are called
cyanobacteria. These bacteria became the chloroplast of algae and plants, but
all cyanobacteria that we know of use a very sophisticated form of oxygenic
photosynthesis. So figuring out when cyanobacteria originated does not really
tell us when oxygenic photosynthesis appeared for the first time, but only tells
us when a very sophisticated form of oxygenic photosynthesis was already
possible.
Therefore, it cannot tell us when oxygenic photosynthesis really
got started and what ancestral forms of oxygenic photosynthesis looked like.
What we did
To overcome these difficulties, we studied the evolution of
Photosystem II, nature’s solar panels that use the energy of light to break
water molecules into their components: protons, electrons, and oxygen. If
we can understand when and how Photosystem II evolved the capacity to oxidize
water, then we may have a better idea of when and how oxygenic photosynthesis
got started, even before there was enough oxygen on the planet to leave a trace
in the rock record.
The core of Photosystem II is made of two evolutionarily
related proteins called D1 and D2, which originated from a gene duplication.
D1 and D2 are very similar to each other at a structural level but they differ
at the basic sequence level, the amino acid level. In other words, they
look the same but the basic building blocks have changed. Today D1 and D2 share
about 30% amino acid sequence identity. That means that of the approximately
350 building blocks that make up D1 and D2, slightly over a hundred are perfectly
identical between D1 and D2, but at some point in time they were 100%
identical.
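To make "sequence identity" concrete, here is a minimal sketch of how it can be computed from two aligned sequences, together with the arithmetic behind "slightly over a hundred". The two fragments are invented for illustration and are NOT real D1 or D2 sequences; counting only the columns without gaps is one common convention, assumed here, and other conventions give slightly different numbers.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of aligned, gap-free columns where both sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def identical_positions(length, identity_percent):
    """Approximate number of identical building blocks at a given identity."""
    return round(length * identity_percent / 100.0)


# Toy aligned fragments, made up for this example (not real sequences).
toy_a = "MTAILERRES-SLW"
toy_b = "MTIAVGRAPDRGLW"

if __name__ == "__main__":
    print(f"toy identity: {percent_identity(toy_a, toy_b):.1f}%")
    print("identical positions at 30% of 350:", identical_positions(350, 30))
```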
Fortunately, the function and structure of Photosystem II
has been studied in great detail, so we can tell from what D1 and D2 look like,
and from the remaining ~100 identical building blocks, that before the
duplication that allowed the evolution of D1 and D2, water oxidation was possible.
Oxygen is a very reactive molecule. That is why it is so
important to life: it can drive many chemical reactions that are
essential to life. Oxygen can also react with chlorophyll leading to the
formation of what is called reactive oxygen
species. These reactive forms of oxygen are very toxic to life. So all
photosynthetic organisms have evolved mechanisms to protect against reactive oxygen
species and to prevent oxygen molecules from interacting with chlorophyll. By
comparing D1 and D2 we can also tell that before the duplication, the ancestral
Photosystem II had already evolved mechanisms to protect against damage caused
by oxygen.
What needs to be done now is to find out the span of time
between the duplication event (when D1 and D2 were 100% identical) and the
ancestor of all cyanobacteria, which inherited a standard sophisticated Photosystem
II (when only about 30% of the building blocks of D1 and D2 remained identical).
To do that we need to find out how fast D1 and D2 are changing:
that is, the rate of evolution. We can find out using a technique called
Bayesian relaxed molecular clock analysis. The method uses the power of
statistics and known events in the evolution of photosynthetic organisms from
the fossil record to calculate the rates of change.
The results
We found out that D1 and D2 are evolving at a very slow
rate. The rate is so slow that it would take about 8 billion years for two
identical D1 sequences today to diverge so much that they no longer resemble each
other. For example, we know that the ancestor of flowering plants and most
algae is more than 1 billion years old, but if I compare D1 in an alga and D1
in the banana tree, they will be about 87% identical. So in more than 1 billion
years of evolution, out of approximately 350 building blocks, fewer than 50 have
changed across all plants and algae. If you compare the D1 of all flowering plants,
which appeared around the time of the dinosaurs, they’ll be over 98% identical:
that is fewer than 10 changes in more than 130 million years!
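The counts in this paragraph follow directly from the percentages. A minimal sketch, assuming a D1 length of roughly 350 positions as quoted above (the helper name is hypothetical):

```python
def changed_positions(length: int, identity: float) -> float:
    """Approximate number of positions that differ, given a fractional identity."""
    return length * (1.0 - identity)


# Alga vs. banana D1: about 87% identical after more than 1 billion years.
print(f"alga vs. banana: about {changed_positions(350, 0.87):.1f} changes")

# D1 among flowering plants: over 98% identical after more than 130 million years.
print(f"flowering plants: at most about {changed_positions(350, 0.98):.1f} changes")
```

Both results sit below the thresholds given in the text (fewer than 50 and fewer than 10 changes, respectively).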
It is not strange at all that Photosystem II evolves so slowly:
all complex enzymes that can be traced to the earliest forms of life evolve at
similar rates. Because they fulfil important functions, most changes are more likely
to result in a worse enzyme than a better one, so most mutations are
naturally wiped out. That is why we can tell that all life on Earth descends
from a single origin: many of the enzymes with essential functions have
evolved at such a slow pace that, even after 4 billion years of evolution,
they still look the same and work in similar ways in all groups of life.
We found out that because D1 and D2 are evolving so slowly,
the span of time between the duplication and the ancestor of cyanobacteria is
likely to be a billion years or more! We cannot tell precisely, however,
when the ancestor of cyanobacteria appeared for the first time, but
if it existed about 2.5 billion years ago, then the duplication could have
easily occurred more than 3.5 billion years ago. The important discovery is that
it does not matter when the ancestor of cyanobacteria appeared, because the
span of time between the duplication (the dawn of oxygenic photosynthesis) and
this ancestor will always be very large.
Another amazing thing we discovered is that even when the span
of time is one billion years, the rate of change at the moment of duplication
had to be about 40 times greater than the observed rates in the past 2.0
billion years. Forty times the current speed of change is about the limit of
what is possible for molecular machines of such level of complexity. In fact,
it is already above any measured rate for this kind of complex, highly
conserved, molecular machines. Then, knowing that, we can calculate that if
this gap of time were to be smaller, the rate at the duplication would have to
be faster, and quickly enough the rates would be so large that they would be
outside the realms of biology.
Imagine a car going from Paris to Berlin, a journey of about
1000 km. It would take about 10 hours to drive that distance at about 100 km
per hour. If we want to arrive in 5 hours, we would need to drive at
twice the speed, but if we want to arrive in 1 hour, we would need to go at 10
times the speed, at almost the speed of sound. That is not possible even for the
fastest Formula 1 car. It is the same for the speed of evolution.
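The car analogy is just speed = distance / time. The short Python sketch below works through it and then applies the same inverse scaling to the span of time after the duplication. The second function is only an illustration of the analogy: it assumes the required initial rate scales as 1/span from the reference point given above (about 40 times the observed rate for a span of one billion years). It is not the relaxed molecular clock model used in the paper, and the function names are made up.

```python
def required_speed(distance_km: float, hours: float) -> float:
    """Average speed needed to cover a distance in a given time."""
    return distance_km / hours


def naive_fold_rate(span_gyr: float, ref_span_gyr: float = 1.0,
                    ref_fold: float = 40.0) -> float:
    """Fold increase in the initial rate if it scaled as 1/span.

    Assumption for illustration only: a simple inverse proportion anchored
    at ~40x the observed rate for a span of 1 billion years.
    """
    return ref_fold * ref_span_gyr / span_gyr


# Paris to Berlin, about 1000 km.
for hours in (10, 5, 1):
    print(f"{hours:>2} h -> {required_speed(1000, hours):.0f} km/h")

# Shrinking the span of time forces the initial rate up in the same way.
for span in (1.0, 0.5, 0.1):
    print(f"span = {span} Gyr -> about {naive_fold_rate(span):.0f}x the observed rate")
```

Even under this crude scaling, squeezing the span to a tenth of a billion years would demand rates hundreds of times the observed ones, which is the point of the analogy.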
This is also important because it tells us in a very straightforward
manner that evolutionary scenarios in which oxygenic photosynthesis originated
shortly before the ancestor of cyanobacteria can be ruled out with
confidence, even if we don’t know exactly when cyanobacteria originated.
The bigger picture
The main implication of the paper is that oxygen was
available to life long before it started to accumulate in the air about 2.4
billion years ago. This
is in agreement with current geological data that suggest that whiffs of
oxygen or localized accumulations of oxygen were possible before 3.0 billion
years ago.
There have been debates on whether aerobic respiration
evolved before or after cyanobacteria, and therefore before or after oxygenic
photosynthesis. This
is because the enzymes used for aerobic respiration appear to be much older
than cyanobacteria. But how can aerobic respiration have evolved before
oxygen was available to life? In the absence of oxygenic photosynthesis it is
expected that the amount of oxygen available to life would be virtually
negligible. So scientists have had to come up with convoluted scenarios to
explain this. Our data help understand how this is possible, because oxygenic
photosynthesis likely got started long before the ancestor of cyanobacteria.
Today oxygenic photosynthesis is only found in cyanobacteria, but our data suggest
that many other kinds of microbes that today do not do
photosynthesis may have had ancient ancestors with the capacity to split water
using light.
In fact, recent
data hint at the possibility that oxygen was important for the development of
the genetic code, and reconstructions of the
genetic capabilities of the earliest forms of life always retrieve enzymes that
protect against reactive forms of oxygen, but the latter are usually
dismissed as artefacts or anomalies. Our work can help explain how this is
actually possible, because the older cyanobacteria are found to be, the more
likely it is that oxygenic photosynthesis started at the earliest stages in the
history of life, soon after the earliest forms of photosynthesis.
What’s next
We are trying now to bring back to life what the ancestral
photosystem before the duplication looked like using a method called Ancestral
Sequence Reconstruction. This is a well-established method that allows us to
predict the basic building blocks of the ancestral enzyme using the known
variation across all extant species. We cannot travel back in time to 3.0
billion years ago, but we can make the ancestral enzyme travel from the distant
past into our test tube in the lab today.
Because the enzyme is evolving so slowly, its structure has
not changed much since its origin. What has changed is the particular
building blocks at the different positions of the preserved structure. That
makes it a very suitable system for Ancestral Sequence Reconstruction, or
targeted site-directed mutagenesis, although that does not mean it is easy. Nevertheless,
we now have modified strains of cyanobacteria expressing some of the ancestral
genes and we will soon attempt to validate our predictions experimentally. This
is a three-year project funded by the Leverhulme Trust.
Thursday, November 15, 2018
Answer to Dawn Summer's comments and questions regarding the evolution of oxygenic photosynthesis
Regarding our paper published recently in Geobiology, titled "Early Archean origin of Photosystem II"
I wrote "undescribed assumptions" because usually
the papers read really well and describe many of their assumption in ways that
are convincing, but results vary significantly. I've identified a couple of
things that aren't justified, but I don't know if they are reasonable.
Example: It doesn't make sense to me that molecular
evolution rates in chloroplasts should be the same as in free-living
cyanobacteria given the significantly different "environmental"
contexts, including pigments to absorb damaging radiation. Has anyone looked at
this?
You are absolutely right. There
are differences in the rates of evolution between chloroplasts and
cyanobacteria, and overall plastid proteins evolve at a faster rate than those
in cyanobacteria. But that is not true for every protein. For example, proteins
involved in information processing (e.g. ribosomal proteins, RNA polymerase) are
evolving significantly faster in plastids. On the other hand, proteins of
bioenergetics and photosynthetic metabolism, like ATP synthase, the Rubisco large
subunit, and the core subunits of the photosystems, are evolving at similar
rates in cyanobacteria and plastids.
It has to do with the different
evolutionary pressures. The proteins of bioenergetics are under strong purifying
selection (slow rates), but those of information processing have undergone
periods of positive selection (accelerations of the rates) because they had to
be put under the control of the eukaryotic replication/gene
expression/translation systems. I don’t know much about it, but I have now been comparing systematically the rates of evolution between a bunch of these proteins. I am trying to
establish what is a reasonable time for the emergence of the most recent common
ancestor of Cyanobacteria... but of course, that is not so straightforward.
In our analysis, we used D1, one
of the slowest evolving proteins in all life. We found that there is hardly any difference
in the overall rates of evolution between D1 in all photosynthetic eukaryotes and in
Cyanobacteria. In fact, the G4 D1 forms that cyanobacteria use to do oxygenic
photosynthesis with chlorophyll f have experienced faster rates of evolution
than those in the chloroplast.
That is why we presented Figure
2 in our paper: to try to show that the rates of evolution of D1 and D2 are
quite slow, both in plants and cyanobacteria, and that if it turns out that
cyanobacteria are much older than we anticipate, that would imply even slower
rates, which would then push the duplication that led to D1 and D2 to even
older times.
To give you an idea of how slow
D1 and D2 are evolving... They are evolving slower than the alpha and beta
subunits of ATP synthase. Alpha and beta originated from a gene duplication
event that occurred before the LUCA. D1 and D2 are under tremendous evolutionary pressure, because
they bind so many cofactors and they have to be maintained at the right
orientations, plus they also interact with a bunch of other subunits, and in addition they have to incorporate protection mechanisms. Therefore, when primary endosymbiosis
occurred, this had virtually no effect on the rates of evolution of D1 and D2, unlike the ribosome, for example.
If they do evolve at different rates on average, almost none
of the fossil record calibrations will be effective without a deep dive into
these variations.
I agree 100%! That is something
I am exploring at the moment. In the case of cyanobacteria/chloroplast trees,
calibrations have to be placed on either side of the node you are more
interested in. That is why timing the most recent common ancestor of
cyanobacteria is so difficult. If we only put calibrations on fast evolving
branches, then the dates on the slowest evolving uncalibrated clades will be
overestimated. On the other hand, if we place calibrations on slower evolving
clades, then the rates in those clades that are fast evolving will
be underestimated resulting in older calculated ages.
Therefore, when performing a
molecular clock analysis it is important to maximize calibrations and to place
them strategically. However, the changes in the rates between clades should not
be a big problem. The molecular clock algorithms can cope with differences in
the rates orders of magnitude apart, believe me, I have tested this. But the
only way the software can infer accurate dates, is with the appropriate use of
calibrations.
There is no perfect dataset, and
there is no perfect molecular clock, but we tried to do the best we could. We tried to model every possible scenario. The
point of the paper is not to find out when cyanobacteria originated, but to
find out what is the span of time between the duplication leading to D1 and D2,
and standard Photosystem II (inherited by all cyanobacteria). And we find that
that span of time is likely to be pretty substantial…
Think about this, the origin of
ATP synthase (the duplication leading to alpha and beta subunit) does not
depend on the age of any particular group of bacteria. Same for Photosystem II,
the origin of Photosystem II does not depend on the age of the most recent
common ancestor of cyanobacteria, but it depends on when the duplication that
led to D1 and D2 occurred. And that photosystem, before the duplication, even
if it didn’t oxidize water, was already a pretty special photosystem unlike any
of the known anoxygenic ones.
Example 2: Atm O2 was lower pre-late Ediacaran, so there was
less O3 & more UV. Even more pre-GOE. And w/ more Fe2+ in seawater, more
free radicals are produced from light. How do environmental conditions such as
these affect mutation rates? Different in cyanos vs chloroplasts?
Different for organisms living in different environments?
E.g. Nostoc in super high light vs new cyanos found living in subsurface?
Phormidium living at light limit w/HS-? How do ecological variations feed into
long term mutation accumulation?
From the patterns that I have
seen, it appears that overall, chloroplast proteins (eukaryotes in general) are
evolving faster than cyanobacteria. But as I was mentioning above, the rates of
evolution vary between proteins. What scientists have tried to do is to
measure the background rates of evolution in non-coding regions of the genome,
and compare them to the coding regions. The change in the ratio of these rates
reflects different evolutionary pressures.
There are no systematic studies
of the changes of the rates of evolution across geological time. Your questions
are super interesting, and it is something that needs to be explored in more detail.
Have a look at the figure below.
That is a comparison of the level of sequence divergence between pairs of
cyanobacteria (a measurement of phylogenetic distance). What you see is a total
of 703 comparisons. And I am plotting that for RpoB (RNA polymerase subunit B)
and for the beta subunit of the ATP synthase. For example, if I compare the
beta subunit of Nostoc punctiforme with that of
Chroococcidiopsis thermalis, they’ll be about 10% different. If I compare it against
that of Gloeobacter violaceus, it would be about 30% different.
The dots in blue are comparisons
between heterocystous cyanobacteria, and the orange dots are every comparison
against Gloeobacter, the earliest branching cyano. There is a big scatter, but it
follows an overall linear trend, and the slope of the trend line is 1.06. It means
that RpoB and beta are evolving at pretty much the same rate across the core
diversity of cyanobacteria.
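For readers who want to try this kind of comparison, here is a minimal Python sketch of the idea: compute the percent difference for every pair of species in two pre-aligned protein sets, then fit a trend line. Everything in it is an assumption made for illustration: the toy sequences are invented, gaps are not handled, and the fit is a least-squares line forced through the origin, which may not be how the trend line in the figure was drawn. As a side note, 703 is the number of pairs among 38 sequences, although the number of genomes is not stated here.

```python
from itertools import combinations
from math import comb


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that differ (equal-length, pre-aligned input)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * diffs / len(seq_a)


def pairwise_differences(aligned: dict[str, str]) -> list[float]:
    """Percent difference for every pair of species, in a fixed (sorted) order."""
    return [percent_difference(aligned[a], aligned[b])
            for a, b in combinations(sorted(aligned), 2)]


def slope_through_origin(xs: list[float], ys: list[float]) -> float:
    """Least-squares slope of y = m * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Invented toy alignments, NOT real RpoB or ATP synthase beta sequences.
toy_rpob = {"sp1": "MKTAYIAKQR", "sp2": "MKTAYIAKQK", "sp3": "MRTAYLAKQK"}
toy_beta = {"sp1": "MSLNEQWPAG", "sp2": "MSLNEQWPAA", "sp3": "MSLDEQWPSA"}

x = pairwise_differences(toy_rpob)
y = pairwise_differences(toy_beta)
print("pairs among 38 sequences:", comb(38, 2))
print("toy slope:", slope_through_origin(x, y))
```

A slope close to 1 would indicate that the two proteins accumulate differences at about the same pace across the species compared.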
The figure also shows that the
distance between Gloeobacter and the rest of cyanobacteria is about three times
as great as that among heterocystous cyanobacteria. Then if it can be
established that the rates of evolution across most cyanobacteria follow
approximately uniform patterns, we can be more confident of a time for
their most recent common ancestor. We will only need a good fossil to calibrate
it all.
Let us assume that we have
identified a number of proteins that have evolved at a constant rate across
cyanobacteria (say those in the figure). Now, there was a recent paper showing
fossil heterocystous cyanobacteria in the Tonian period, did you see it? The lower age is 720
Ma. That would imply that the branch leading to Gloeobacter occurred at about
2.1 Ga. If instead we think that heterocystous cyanobacteria appeared about 1.0
Ga, then that would make the branching of Gloeobacter about 3.0 Ga. Molecular
clocks also behave in a similar way depending of course on the calibration
choices.
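The back-of-the-envelope scaling above can be written out as a small Python sketch. It assumes a constant rate of evolution and takes the factor of three from the figure discussed earlier (the distance to Gloeobacter being about three times that among heterocystous cyanobacteria); the constant and function names are made up for illustration.

```python
# Assumption taken from the figure: distance to Gloeobacter is ~3x the
# distance among heterocystous cyanobacteria.
DISTANCE_RATIO = 3.0


def gloeobacter_split_ga(heterocystous_age_ga: float,
                         ratio: float = DISTANCE_RATIO) -> float:
    """Age of the Gloeobacter split, assuming a constant rate of evolution."""
    return heterocystous_age_ga * ratio


for age in (0.72, 1.0):
    print(f"heterocystous cyanobacteria at {age} Ga -> "
          f"Gloeobacter split at about {gloeobacter_split_ga(age):.2f} Ga")
```

With 0.72 Ga the estimate lands a little above 2.1 Ga, and with 1.0 Ga it gives 3.0 Ga, in line with the values quoted in the text.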
Example 3: Gene exchange among closely related organisms,
including via viruses. Is it possible that D1 G4 (and assoc genes) evolved in
one sp of cyanos, was better, and was transferred to a bunch of others post GOE
with those who didn't get the transfer dying out?
What I found out in my study of
the evolution of D1 is that G4 is found in all Cyanobacteria, see Figure 1 of our paper. And when you focus on G4 only, it appears to follow a
species tree of cyanobacteria. Bear in mind that even D1 G4 has duplicated
several times (e.g. low-light vs high-light forms, the one in the far-red light
gene cluster). Nevertheless, it seems that at least G4 has mostly been
inherited vertically. That is not to say that horizontal gene transfer has not
occurred, it certainly has, but I don’t think to such an extent that
it would dominate the topology of the tree.
Because of that, we also
concluded that the atypical D1 forms branched out before the most recent common
ancestor of Cyanobacteria, including the so-called microaerobic forms.
I do think that a post-GOE
ancestor of cyanobacteria is likely an artefact resulting from an overestimation
of the rates of evolution, and I think there are a number of reasons for this. It
turns out however that D1 and D2 are very susceptible to that because they are
so slowly evolving. That is why we focused on the concept of delta-T instead.
We did not focus on trying to
figure out if cyanobacteria occurred after or before the GOE, but on the span
of time between the duplication leading to D1 and D2, and standard PSII. We
concluded therefore that regardless of the exact timing for the MRCA of
cyanobacteria, delta-T will always be very large (about 1.0 billion years). We also found out that if
delta-T is made smaller, the rates of evolution will increase beyond what
is likely for this type of protein, and quickly enough beyond what is
possible for any kind of protein.
So if the MRCA of cyanobacteria is found to be 2.5 Ga old, I think it would be reasonable to assume that the duplication leading to D1 and D2 occurred about 3.5 Ga... see what I mean?
In any case, I think that most
of the diversity of oxygenic phototrophs that have ever existed actually
predated the MRCA of cyanobacteria. That does not mean that such diversity had
to be abundant or globally distributed though.
Or being present only in environments where they can compete
with relatively ineffective D1s?
I'm not saying I think these necessarily happened. It just
leaves me with the feeling that we are missing something really big and
important in our assumptions.
I agree. Think about this:
There are three gene duplication
events that are exclusive to oxygenic photosynthesis. D1 and D2, the core of
PSII. CP43 and CP47, the core antenna of PSII. And PsaA and PsaB, the core of Photosystem
I.
All cyanobacteria today have a
form of oxygenic photosynthesis that has remained basically unchanged from Gloeobacter to avocados. In fact, most
of the sequence change in the evolution of Photosystem II and Photosystem I
that has ever occurred in the history of life, happened before the MRCA of
cyanobacteria. From the moment those key duplications occurred countless forms
of oxygenic phototrophic bacteria should have appeared spanning all of those
changes that are not accounted for in the known diversity. And given that these enzymes are some of the slowest evolving enzymes
we know of, the roots of oxygenic photosynthesis are likely placed deep in
time... early Archean deep. We are oblivious to such huge diversity. By the time cyanobacteria enter
the scene, when Gloeobacter split
from the rest, oxygenic photosynthesis had already reached a pretty
sophisticated stage.
So yeah, we are missing so much,
in fact, we’re probably missing most of it.