Sunday, April 20, 2014

Bidirectionally Overlapping Genes

The occurrence of bidirectionally overlapping genes in bacteria is rare, and most such examples are dismissed as chimeric or representative of simple genome mis-annotation. After all, how can a gene make sense in one direction, but also make sense on the reverse-reading complementary strand of DNA? Such a situation is more than a mere palindrome. It's akin to the phrase:
Warsaw won, eh?
He now was raw.
The phrase carries a sensible message in each direction, yet is not a mere bidi-symmetry of the "A man, a plan, a canal, Panama" kind. It defies credulity to believe a stretch of DNA spanning several hundred bases (several hundred "letters") could evolve to give a useful message in both directions. And yet, what is life itself, if not credulity-defying? Somehow, life began from primordial chemistry and evolved toward DNA genes coding for proteins. Is it so hard to believe that early replicating molecules (probably RNA) were read and translated in both directions, and that some of the happy accidents survived? Is it so hard to believe that some proteins began life as translations of the opposite strand ("nonsense" proteins) that then evolved toward specialized functionality?
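
To see what "reading the other strand" means mechanically, here is a minimal Python sketch. It takes a DNA string, builds the reverse complement, and translates all three reading frames on each strand with the standard genetic code. The function names and the toy sequence are my own, chosen purely for illustration; real gene finders do far more than this.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate codon by codon; '*' marks a stop, a trailing partial codon is dropped."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def six_frames(seq: str):
    """Return (strand, frame, protein) for the three frames on each strand."""
    frames = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            frames.append((strand, frame, translate(s[frame:])))
    return frames


if __name__ == "__main__":
    toy = "ATGGCCTAA"  # hypothetical toy sequence, not a real gene
    print(reverse_complement(toy))  # TTAGGCCAT
    for strand, frame, protein in six_frames(toy):
        print(strand, frame, protein)
```

For a bidirectionally overlapping pair, one of the "+" frames and one of the "-" frames would both have to run for hundreds of codons without hitting a stop, and both products would have to be useful.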

A bona fide example of a bidirectionally transcribed and translated gene was verified experimentally in 2008 by Silby and Levy, who were investigating the soil bacterium Pseudomonas fluorescens Pf0-1. They found that the hitherto unknown cosA gene, which overlaps (on the opposite DNA strand) a gene for a fusaric acid resistance protein, is not only expressed as a protein but is required for soil colonization.

A section of the P. fluorescens Pf0-1 genome showing overlapping genes (note the yellow segment, representing the cosA gene; the larger green gene above it, on the opposite strand, encodes a fusaric acid resistance protein). Both overlapping genes have been shown experimentally to be expressed as protein.
Ironically, a month after Silby and Levy published their results, BMC Genomics published a study by Pallejà et al. looking at large gene overlaps in bacterial genomes. The Pallejà study concluded:
Among the 968 overlaps larger than 60 bps which we analysed, we did not find a single real one among the co-directional and divergent orientations and concluded that there had been an excessive number of misannotations. Only convergent orientation seems to permit some long overlaps, although convergent overlaps are also hampered by misannotations. We propose a simple rule to flag these erroneous gene length predictions to facilitate automatic annotation.
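
To make the three orientations in that passage concrete, here is a rough Python sketch that measures the overlap between two annotated genes and labels the pair as co-directional, convergent, or divergent. The `Gene` record, the function names, and the coordinates are my own assumptions for illustration. The 60 bp cutoff is taken from the quoted passage, but this is not the flagging rule Pallejà et al. propose (the excerpt doesn't spell that rule out); it is only the bookkeeping needed to find candidate pairs.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    # Hypothetical annotation record: forward-strand coordinates, 1-based, inclusive.
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def overlap_length(a: Gene, b: Gene) -> int:
    """Number of base pairs shared by the two coordinate ranges (0 if none)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def orientation(a: Gene, b: Gene) -> str:
    """Label a pair by strand, reading left to right along the forward strand.

    Same strand: co-directional. Left gene on "+" and right gene on "-":
    convergent (3' ends meet). Left on "-" and right on "+": divergent.
    A gene nested inside another gets the same left-to-right label,
    which is a simplification.
    """
    left, right = sorted((a, b), key=lambda g: (g.start, g.end))
    if left.strand == right.strand:
        return "co-directional"
    return "convergent" if left.strand == "+" else "divergent"


def describe_overlap(a: Gene, b: Gene, threshold: int = 60):
    """Return (is_long, overlap_bp, orientation) for a gene pair."""
    n = overlap_length(a, b)
    return n > threshold, n, orientation(a, b)


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration.
    g1 = Gene("geneA", 1, 300, "+")
    g2 = Gene("geneB", 200, 500, "-")
    print(describe_overlap(g1, g2))  # (True, 101, 'convergent')
```

Whether a long overlap found this way is a misannotation or a real pair of genes is exactly the question at issue; the code can only point at candidates.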
Silby and Levy argue that, to the contrary, current genome annotations are obscuring potentially important discoveries:
[Our] findings suggest that current genome annotations provide an incomplete view of the genetic potential of a given organism . . . In eukaryotes, the concept that genomes include numerous sense/antisense gene pairs is becoming increasingly obvious with genome-wide transcriptional studies in yeast [8] and Arabidopsis [10]. Antisense transcripts have been implicated in eye development [20] and control of entry into meiosis in yeast [21]. However, discussion of antisense transcription is limited to possible regulatory roles for antisense RNA [e.g. 8], without consideration of the possibility that they may specify proteins. Genome annotations do not routinely predict the existence of two protein-coding genes on opposite DNA strands, and in fact normally deliberately eliminate predicted overlaps. Moreover, small protein-coding genes can be missed by predictive algorithms. For example, the blr gene in E. coli specifies a 41 residue protein, and was discovered in a sequence believed to be intergenic [22]. The fact that antisense genes have been implicated in important biological functions indicates that more attention should be given to this emerging class of genes.
I happen to agree with Silby and Levy. It would be a shame if bidirectional overlaps in genomes were not investigated. The notion (furthered by Pallejà et al.) that annotation software should suppress such findings automatically is repulsive. It's the kind of intolerant, rigid, dogmatic thinking that science, quite frankly, doesn't need more of.