I want to understand when and how oxygenic photosynthesis originated. Some time ago I posted an evolutionary analysis of the core proteins of Photosystem II, which is still undergoing peer review.
Check it out here if you have not seen it already:
Basically, all reaction centres are made of a dimer of homologous core proteins. All Type II reaction centres have a heterodimeric core, meaning that each monomer is different. Photosystem II, the water-oxidising enzyme, has a core made of D1 and D2. D1 and D2 share slightly under 30% sequence identity, and it is in D1 that the manganese cluster that oxidises water to oxygen is located.
When the sequence and structure of D1 and D2 are compared, it is pretty evident that the homodimeric Photosystem II (before the divergence of D1 and D2) was able to do some highly oxidising photochemistry on both sides. Indeed, it looks like there was also some kind of manganese cluster on the D2 side at some point in time. It has been suggested before that this homodimeric Photosystem II was able to oxidise water. So if I can time the gene duplication event that led to D1 and D2, then I can have a pretty good idea of when water oxidation appeared for the first time. That is the subject of the above-mentioned paper.
Type I reaction centres come in two versions: with homodimeric cores, meaning that the core is made of two copies of the same subunit, or with heterodimeric cores. The homodimeric Type I reaction centres are found only in anoxygenic phototrophs, and the heterodimeric Type I reaction centres are found only in oxygenic phototrophs: Cyanobacteria and photosynthetic eukaryotes. This heterodimeric Type I reaction centre is also known as Photosystem I.
It is hypothesised that the reason why Photosystem I is heterodimeric in oxygenic photosynthesis has something to do with oxygen… and well, there is no alternative hypothesis that I know of.
The core of Photosystem I is made of two subunits, PsaA and PsaB. They share about 45% sequence identity. So, if oxygen is responsible for the heterodimerisation of Photosystem I, that means that the gene duplication event that led to PsaA and PsaB had to occur AFTER the evolution of water oxidation to oxygen.
That is the subject of the new manuscript:
I found out, as was the case for Photosystem II, that the divergence of PsaA and PsaB is a lot older than anyone could have imagined… from my calculations, I would say that this gene duplication event occurred at least 3.4 billion years ago, but it is likely much older than that.
Because PsaA and PsaB are more similar to each other than D1 and D2 are, it would seem as if the gene duplication event that led to PsaA and PsaB occurred long after the gene duplication that led to D1 and D2… but that is not necessarily the case: it may be only an illusion. The two duplications could have followed each other almost immediately. From this perspective, it may even be possible that heterodimeric Photosystem I predates heterodimeric Photosystem II, since the latter was oxidising water in a homodimeric form!
One thing that needs to be taken into account is that PsaA and PsaB are each about 730 to 750 amino acids long, while D1 and D2 are only about 360 amino acids long.
So in the case of D1 and D2: 70% sequence divergence is equivalent to about 250 amino acid differences along the entire sequence.
In the case of PsaA and PsaB, 55% sequence divergence is equivalent to about 400 to 410 amino acid differences along the entire sequence.
So consider the following example. Assume that D1, D2, PsaA, and PsaB evolve at exactly the same rate, measured as amino acid substitutions PER POSITION per unit of time. A change of amino acids at 100 positions would take each protein the same amount of time, but in the case of D1 and D2 that would represent a change in sequence identity of about 27%, while in PsaA and PsaB it would represent a change of only about 13%!
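The percentages in this example are simple to check. Below is a minimal Python sketch, not taken from the manuscript: it assumes a length of 360 amino acids for D1 and D2 and takes 740 as the midpoint of the 730 to 750 range quoted for PsaA and PsaB. The function name `percent_of_positions_changed` is made up for illustration.

```python
def percent_of_positions_changed(changed_positions: int, length: int) -> float:
    """Return the share of positions (in %) that differ after
    `changed_positions` positions change in a protein of `length` residues."""
    if length <= 0:
        raise ValueError("length must be positive")
    return 100.0 * changed_positions / length


# Approximate lengths quoted in the text; 740 is an assumed midpoint of 730-750.
D1_D2_LENGTH = 360
PSAA_PSAB_LENGTH = 740

CHANGED_POSITIONS = 100  # the number of changed positions used in the example

d1_d2_change = percent_of_positions_changed(CHANGED_POSITIONS, D1_D2_LENGTH)
psaa_psab_change = percent_of_positions_changed(CHANGED_POSITIONS, PSAA_PSAB_LENGTH)

print(f"D1/D2:     {d1_d2_change:.1f}% of positions changed")
print(f"PsaA/PsaB: {psaa_psab_change:.1f}% of positions changed")
```

With these assumed lengths, the same 100 changed positions amount to a bit under 28% of a D1 or D2 sequence, but only about 13.5% of a PsaA or PsaB sequence.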
In real life, however, PsaA and PsaB are evolving at a different pace than D1 and D2... but the difference is not huge.
So the fact that the level of sequence identity between PsaA and PsaB is significantly higher than that between D1 and D2 does not mean that the gene duplication event had to occur much later in Photosystem I than in Photosystem II. It is only an illusion caused by the fact that PsaA and PsaB are much longer than D1 and D2, and by the fact that sequence similarity expressed as a percentage is not necessarily the best measurement of sequence divergence.
Therefore, when the rates of evolution are taken into account, together with the significant level of sequence divergence between PsaA and PsaB, it turns out that this gene duplication event happened a lot deeper in time than anyone could have guessed. Nevertheless, it is consistent with my hypothesis that water oxidation originated rapidly after (or at) the origin of photosynthesis, in the very early Archean.