Thursday, November 15, 2018

Answer to Dawn Sumner's comments and questions regarding the evolution of oxygenic photosynthesis

Regarding our paper published recently in Geobiology, titled "Early Archean origin of Photosystem II"

I wrote "undescribed assumptions" because usually the papers read really well and describe many of their assumptions in ways that are convincing, but results vary significantly. I've identified a couple of things that aren't justified, but I don't know if they are reasonable.
Example: It doesn't make sense to me that molecular evolution rates in chloroplasts should be the same as in free-living cyanobacteria given the significantly different "environmental" contexts, including pigments to absorb damaging radiation. Has anyone looked at this?

You are absolutely right. There are differences in the rates of evolution between chloroplasts and cyanobacteria, and overall plastid proteins evolve at a faster rate than those in cyanobacteria. But that is not true for every protein. For example, proteins involved in information processing (e.g. ribosomal proteins, RNA polymerase) are evolving significantly faster in plastids. On the other hand, proteins of bioenergetics and photosynthetic metabolism, like ATP synthase, the Rubisco large subunit, and the core subunits of the photosystems, are evolving at similar rates in cyanobacteria and plastids.

It has to do with the different evolutionary pressures. The proteins of bioenergetics are under strong purifying selection (slow rates), but those of information processing have undergone periods of positive selection (accelerations of the rates) because they had to be put under the control of the eukaryotic replication/gene expression/translation systems. I don’t know much about it, but I have now been systematically comparing the rates of evolution between a bunch of these proteins. I am trying to establish what is a reasonable time for the emergence of the most recent common ancestor of Cyanobacteria... but of course, it is not so straightforward.

In our analysis, we used D1, one of the slowest evolving proteins in all life. We found that there is hardly any difference in the overall rates of evolution between D1 in all photosynthetic eukaryotes and in Cyanobacteria. In fact, the G4-D1 that cyanobacteria use to do oxygenic photosynthesis with chlorophyll f has experienced faster rates of evolution than those in the chloroplast.

That is why we presented Figure 2 in our paper: to try to show that the rates of evolution of D1 and D2 are quite slow, both in plants and cyanobacteria, and that if it just happens that cyanobacteria are much older than we anticipate, that would imply even slower rates, which would then push the duplication that led to D1 and D2 to even older times.

To give you an idea of how slowly D1 and D2 are evolving... They are evolving slower than the alpha and beta subunits of ATP synthase. Alpha and beta originated from a gene duplication event that occurred before the LUCA. D1 and D2 are under tremendous evolutionary pressure, because they bind so many cofactors that have to be maintained at the right orientations, plus they also interact with a bunch of other subunits, and in addition they have to incorporate protection mechanisms. Therefore, when primary endosymbiosis occurred, it had virtually no effect on the rates of evolution of D1 and D2, unlike the ribosome, for example.

If they do evolve at different rates on average, almost none of the fossil record calibrations will be effective without a deep dive into these variations.

I agree 100%! That is something I am exploring at the moment. In the case of cyanobacteria/chloroplast trees, calibrations have to be placed on either side of the node you are most interested in. That is why timing the most recent common ancestor of cyanobacteria is so difficult. If we only put calibrations on fast-evolving branches, then the dates on the slowest evolving uncalibrated clades will be overestimated. On the other hand, if we place calibrations on slower evolving clades, then the rates in those clades that are fast evolving will be underestimated, resulting in older calculated ages.

Therefore, when performing a molecular clock analysis it is important to maximize calibrations and to place them strategically. However, the changes in the rates between clades should not be a big problem. The molecular clock algorithms can cope with differences in the rates orders of magnitude apart; believe me, I have tested this. But the only way the software can infer accurate dates is with the appropriate use of calibrations.

There is no perfect dataset, and there is no perfect molecular clock, but we tried to do the best we could. We tried to model every possible scenario. The point of the paper is not to find out when cyanobacteria originated, but to find out what is the span of time between the duplication leading to D1 and D2, and standard Photosystem II (inherited by all cyanobacteria). And we find that this span of time is likely to be pretty substantial…

Think about this: the origin of ATP synthase (the duplication leading to the alpha and beta subunits) does not depend on the age of any particular group of bacteria. The same goes for Photosystem II: its origin does not depend on the age of the most recent common ancestor of cyanobacteria, but on when the duplication that led to D1 and D2 occurred. And that photosystem, before the duplication, even if it didn’t oxidize water, was already a pretty special photosystem unlike any of the known anoxygenic ones.

Example 2: Atm O2 was lower pre-late Ediacaran, so there was less O3 & more UV. Even more pre-GOE. And w/ more Fe2+ in seawater, more free radicals are produced from light. How do environmental conditions such as these affect mutation rates? Different in cyanos vs chloroplasts?
Different for organisms living in different environments? E.g. Nostoc in super high light vs new cyanos found living in subsurface? Phormidium living at light limit w/HS-? How do ecological variations feed into long term mutation accumulation?

From the patterns that I have seen, it appears that overall, chloroplast proteins (eukaryotes in general) are evolving faster than those of cyanobacteria. But as I was mentioning above, the rates of evolution vary a lot between proteins. What scientists have tried to do is to measure the background rates of evolution in non-coding regions of the genome, and compare them to the coding regions. The change in the ratio of these rates reflects different evolutionary pressures.

There are no systematic studies of the changes of the rates of evolution across geological time. Your questions are super interesting, and it is something that needs to be explored in more detail.

Have a look at the figure below. That is a comparison of the level of sequence divergence between pairs of cyanobacteria (a measurement of phylogenetic distance). What you see is a total of 703 comparisons. And I am plotting that for RpoB (RNA polymerase subunit B) and for the beta subunit of the ATP synthase. For example, if I compare the sequence of beta from Nostoc punctiforme with that of Chroococcidiopsis thermalis, they’ll be about 10% different. If I compare against Gloeobacter violaceus, it would be about 30% different.
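As a rough illustration of how such pairwise comparisons can be computed, here is a minimal Python sketch. It assumes the sequences are already aligned and measures divergence as the percentage of differing positions, ignoring gaps; the strain names and sequences are made-up toy data, not the ones used for the figure. Note that 703 is the number of unordered pairs among 38 sequences, so the figure is consistent with 38 strains being compared, although that number is an inference and not something stated above.

```python
from itertools import combinations
from math import comb


def percent_difference(seq_a, seq_b):
    """Percent of aligned positions that differ, skipping gapped positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # a gap in either sequence is not counted
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def all_pairwise(alignment):
    """Percent difference for every unordered pair of named sequences."""
    return {
        (x, y): percent_difference(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "strain_A": "MKTLLVAGDE",
    "strain_B": "MKTLIVAGDE",
    "strain_C": "MRTLIVSG-E",
}
distances = all_pairwise(toy)
for pair, value in distances.items():
    print(pair, round(value, 1))

# Number of unordered pairs among 38 sequences.
print(comb(38, 2))
```

Running the same function on two proteins from the same set of strains gives two distances per pair, which is what each dot in a plot of this kind represents.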


The dots in blue are comparisons between heterocystous cyanobacteria, and the orange dots are every comparison against Gloeobacter, the earliest branching cyano. There is a big scatter, but it follows an overall linear trend, and the slope of the trend line is 1.06. It means that RpoB and beta are evolving at pretty much the same rate across the core diversity of cyanobacteria.
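To make the idea of the trend line concrete, here is a small sketch of an ordinary least-squares fit in Python. It is only an assumption that the trend line was obtained this way, and the numbers below are invented toy values, not the data in the figure. A slope close to 1 means that, pair by pair, the two proteins have accumulated about the same amount of change.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy percent differences for the same strain pairs in two proteins
# (hypothetical values, for illustration only).
beta_divergence = [5.0, 10.0, 20.0, 30.0]
rpob_divergence = [6.0, 10.0, 22.0, 31.0]

slope, intercept = least_squares_line(beta_divergence, rpob_divergence)
print(round(slope, 2), round(intercept, 2))
```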

The figure also shows that the distance between Gloeobacter and the rest of cyanobacteria is about three times as great as that among heterocystous cyanobacteria. If it can be established that the rates of evolution across most cyanobacteria follow approximately uniform patterns, we can then be more confident of a time for their most recent common ancestor. We will only need a good fossil to calibrate it all.

Let us assume that we have identified a number of proteins that have evolved at a constant rate across cyanobacteria (say those in the figure). Now, there was a recent paper showing fossil heterocystous cyanobacteria in the Tonian period, did you see it? The lower age is 720 Ma. That would imply that the branch leading to Gloeobacter split off at about 2.1 Ga. If instead we think that heterocystous cyanobacteria appeared about 1.0 Ga, then that would make the branching of Gloeobacter about 3.0 Ga. Molecular clocks also behave in a similar way, depending of course on the calibration choices.
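The arithmetic behind those two numbers can be sketched as follows. This is a back-of-the-envelope calculation, not a molecular clock: it assumes a strictly constant rate and takes the roughly threefold distance ratio read off the figure as given, so the age of the Gloeobacter split is simply the heterocystous age multiplied by that ratio. The function name and the constant are my own labels for illustration.

```python
# Distance to Gloeobacter relative to the distance among heterocystous
# cyanobacteria, as read off the figure ("about three times as great").
DISTANCE_RATIO = 3.0


def gloeobacter_split_age(heterocystous_age_ga, distance_ratio=DISTANCE_RATIO):
    """Age (Ga) of the Gloeobacter split under a strictly constant rate.

    With a constant rate, divergence is proportional to time, so a
    threefold larger distance implies a threefold older split.
    """
    if heterocystous_age_ga <= 0:
        raise ValueError("age must be positive")
    return heterocystous_age_ga * distance_ratio


for age in (0.72, 1.0):
    print(age, "Ga ->", round(gloeobacter_split_age(age), 2), "Ga")
```

With 0.72 Ga this gives 2.16 Ga, in line with the "about 2.1 Ga" quoted above, and with 1.0 Ga it gives 3.0 Ga.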

Example 3: Gene exchange among closely related organisms, including via viruses. Is it possible that D1 G4 (and assoc genes) evolved in one sp of cyanos, was better, and was transferred to a bunch of others post GOE with those who didn't get the transfer dying out?

What I found out in my study of the evolution of D1 is that G4 is found in all Cyanobacteria, see Figure 1 of our paper. And when you focus on G4 only, it appears to follow a species tree of cyanobacteria. Bear in mind that even D1 G4 has duplicated several times (e.g. low-light vs high-light forms, the one in the far-red light gene cluster). Nevertheless, it seems as if at least G4 has mostly been inherited vertically. That is not to say that horizontal gene transfer has not occurred, it certainly has, but I don’t think to such an extent that it would dominate the topology of the tree.

Because of that, we also concluded that the atypical D1 forms branched out before the most recent common ancestor of Cyanobacteria, including the so-called microaerobic forms.

I do think that a post-GOE ancestor of cyanobacteria is likely an artefact resulting from an overestimation of the rates of evolution, and I think there are a number of reasons for this. It turns out however that D1 and D2 are very susceptible to that because they are so slowly evolving. That is why we focused on the concept of delta-T instead.

We did not focus on trying to figure out if cyanobacteria appeared after or before the GOE, but on the span of time between the duplication leading to D1 and D2, and standard PSII. We concluded therefore that regardless of the exact timing for the MRCA of cyanobacteria, delta-T will always be very large (1.0 billion years). We also found out that if delta-T is made to be smaller, the rates of evolution will increase beyond what is likely for these types of proteins, and quickly enough beyond what is possible for any kind of protein.

So if the MRCA of cyanobacteria is found to be 2.5 Ga old, I think it would be reasonable to assume that the duplication leading to D1 and D2 occurred about 3.5 Ga... see what I mean?
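Here is a minimal sketch of that reasoning in Python. It rests on two simplifying assumptions of mine: the duplication age is just the MRCA age plus delta-T, and the amount of sequence change between the duplication and standard PSII is treated as fixed, so the average rate needed to produce it scales as 1/delta-T. The function names are hypothetical, and the analysis in the paper is of course far more involved than this.

```python
def duplication_age(mrca_age_ga, delta_t_gyr):
    """Age (Ga) of the D1/D2 duplication given the MRCA age and delta-T."""
    return mrca_age_ga + delta_t_gyr


def relative_rate(delta_t_gyr, reference_delta_t_gyr=1.0):
    """Average rate needed, relative to the rate implied by the reference delta-T.

    Assumes the same amount of sequence change has to fit into delta-T,
    so halving delta-T doubles the required rate.
    """
    if delta_t_gyr <= 0:
        raise ValueError("delta-T must be positive")
    return reference_delta_t_gyr / delta_t_gyr


print(duplication_age(2.5, 1.0))  # MRCA at 2.5 Ga with delta-T of 1.0 Gyr

for delta_t in (1.0, 0.5, 0.1):
    print(delta_t, "Gyr ->", round(relative_rate(delta_t), 1), "x the reference rate")
```

Shrinking delta-T from 1.0 to 0.1 billion years asks for a tenfold faster average rate, which is the kind of acceleration that becomes hard to reconcile with proteins this slow.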

In any case, I think that most of the diversity of oxygenic phototrophs that have ever existed actually predated the MRCA of cyanobacteria. That does not mean that such diversity had to be abundant or globally distributed though.

Or being present only in environments where they can compete with relatively ineffective D1s?
I'm not saying I think these necessarily happened. It just leaves me with the feeling that we are missing something really big and important in our assumptions.

I agree. Think about this:

There are three gene duplication events that are exclusive to oxygenic photosynthesis. D1 and D2, the core of PSII. CP43 and CP47, the core antenna of PSII. And PsaA and PsaB, the core of Photosystem I.

All cyanobacteria today have a form of oxygenic photosynthesis that has remained basically unchanged from Gloeobacter to avocados. In fact, most of the sequence change in the evolution of Photosystem II and Photosystem I that has ever occurred in the history of life happened before the MRCA of cyanobacteria. From the moment those key duplications occurred, countless forms of oxygenic phototrophic bacteria should have appeared, spanning all of those changes that are not accounted for in the known diversity. And given that these enzymes are some of the slowest evolving enzymes we know of, the roots of oxygenic photosynthesis are likely placed deep in time... early Archean deep. We are oblivious to such huge diversity. By the time cyanobacteria enter the scene, when Gloeobacter split from the rest, oxygenic photosynthesis had already reached a pretty sophisticated stage.

So yeah, we are missing so much, in fact, we’re probably missing most of it.
