Friday, July 6, 2018

The atypical D1 sequence of Gloeobacter kilaueensis: looking for another one in metagenomes

The evolution of D1 proteins is complicated. It is characterized by many gene duplication events occurring at every taxonomic level. Some of these duplications could potentially predate the most recent common ancestor of all described cyanobacteria.
See our previous work on this:
We suggested that some of the earliest duplications gave rise to the atypical D1 forms, of which we have described three: what I have called Group 0, Group 1, and Group 2 D1.
Group 0 is made of a single sequence, found exclusively in the genome of Gloeobacter kilaueensis. G. kilaueensis additionally has 5 standard D1 forms. There may be a D1 fragment encoded in the genome of the early-branching Synechococcus sp. PCC 7336; have a look at this:
Group 1 is the super-rogue D1 also known as chlorophyll f synthase (or PsbA4).
Group 2 is the rogue D1: function unknown/unconfirmed.
A recent preprint by Grettenberger et al. described a new type of early-branching cyanobacterium, which was named Aurora. The genome of this cyanobacterium was assembled from a metagenome of a microbial mat found in Lake Vanda in Antarctica. It is more than 90% complete. This strain seems to be distantly related to Gloeobacter. As far as I understand, however, it is not clear whether this strain is an early-branching cyanobacterium sister to Gloeobacter, or whether it predates Gloeobacter and is therefore a sister branch to all described cyanobacteria.
This is the preprint:
Aurora vandensis has a PSII with a subunit composition similar to that of Gloeobacter. Only one D1 was reported in the preprint, and this is a standard form of D1, a Group 4.
Excited by this, I wondered if I could find another Group 0 sequence in the available metagenomes. Another G0, similar to that from G. kilaueensis.
So I ran a BLAST search against all JGI environmental metagenomes, 12361 in total. I left out metagenomes categorized as “engineered” or “host-associated”.
To BLAST against so many metagenomes directly on the JGI site, it is necessary to split the data into sets of at most 500 metagenomes. That gives 25 sets of metagenomes that needed to be BLASTed.
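For anyone who wants to script the bookkeeping, here is a minimal sketch of the splitting step. The identifiers are placeholders I made up for illustration (the real ones would be the JGI metagenome IDs), and `split_into_sets` is just a hypothetical helper name; nothing here talks to the JGI site itself.

```python
def split_into_sets(ids, max_per_set=500):
    """Split a list of identifiers into consecutive sets of at most max_per_set."""
    return [ids[i:i + max_per_set] for i in range(0, len(ids), max_per_set)]

# Placeholder identifiers standing in for the 12361 environmental metagenomes.
metagenome_ids = [f"metagenome_{n}" for n in range(12361)]

sets = split_into_sets(metagenome_ids)
print(len(sets))      # 25 sets in total
print(len(sets[-1]))  # the last set holds the remaining 361 metagenomes
```

Since 12361 = 24 × 500 + 361, the first 24 sets are full and the 25th is partial.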
My query sequence was the very atypical G0 sequence from G. kilaueensis.
In the first set I obtained more than 30000 hits, which must include D1, D2, L, and M subunits; both complete and partial sequences. The cut-off E-value was 1e-5.
None of the 25 sets produced a sequence similar to the G0 sequence. Nothing close to it. The closest identity was 54%, usually to other standard forms of D1. No sequence alignment included the C-terminus, which is kind of special in the G0 sequence. Some of the metagenome sets gave a top hit to super-rogue D1 sequences, but the level of sequence identity between G0 and the other atypical forms is also just over 50%. This makes sense if the phylogenetic tree that we published in the paper above is correct, as it would imply that the G0 sequence is as close to the other atypical sequences as it is to the standard forms of D1.
This is because we suggested, based on the phylogeny of D1, that Group 1 to Group 4 form a monophyletic group to the exclusion of the G0 sequence. However, phylogenetic trees are susceptible to artifacts, so having more G0 sequences could potentially improve the D1 phylogeny.
Each search for each of the metagenome sets produced more than 30k hits: that means that I could have obtained more than 750k hits in these 12361 metagenomes! But not a second G0 sequence?
I have to say that I did not examine every sequence in detail (of course)… waaay too many. So there may have been a partial sequence close to G0 that did not score high due to its very short length. If there was another G. kilaueensis somewhere else I would have expected at least some identical sequences, but nothing at all!
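One way to avoid eyeballing hundreds of thousands of hits would be to filter them by percent identity and by whether the alignment reaches the C-terminus of the query, instead of relying on the score ranking. The sketch below is not what I did; it assumes the hits are available in the standard 12-column BLAST tabular layout (the JGI export may well differ), and the three rows, the query length of 350, and the thresholds are all invented for illustration.

```python
import csv
import io

# Assumed column order: the standard BLAST tabular layout.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def candidate_hits(handle, query_len, min_identity=70.0, cterm_window=10):
    """Yield hits that are highly identical to the query, or whose
    alignment ends within cterm_window residues of the query C-terminus."""
    for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
        pident = float(row["pident"])
        qend = int(row["qend"])
        reaches_cterm = qend >= query_len - cterm_window
        if pident >= min_identity or reaches_cterm:
            yield row["sseqid"], pident, qend, reaches_cterm

# Toy rows, invented for illustration only (not real search results).
toy = (
    "G0\tcontig_1\t54.0\t310\t140\t2\t5\t314\t1\t308\t1e-90\t300\n"
    "G0\tcontig_2\t98.5\t60\t1\t0\t120\t179\t1\t60\t1e-30\t120\n"
    "G0\tcontig_3\t52.0\t340\t160\t3\t10\t349\t1\t338\t1e-85\t280\n"
)

# query_len=350 is a placeholder, not the actual length of the G0 sequence.
results = list(candidate_hits(io.StringIO(toy), query_len=350))
for hit in results:
    print(hit)
```

With these toy rows, the short but nearly identical fragment (contig_2) and the hit that runs to the end of the query (contig_3) are kept, while the ordinary 54% hit is dropped. A short fragment like contig_2 is exactly the kind of sequence that could hide far down a list sorted by score.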
I thought that Gloeobacter was not that uncommon after all:
Would anyone be interested in repeating this search? :)
This is the link to the G0 sequence: https://www.ncbi.nlm.nih.gov/protein/AGY58976.1
Now, with the recent eruption of Kilauea this unique strain of Gloeobacter may have just gone extinct.
