Both ends of the alphavirus genomic RNA are potentially important in its replication. Within the genomic RNA, the region preceding and including the 5'-end of the subgenomic 26S RNA might also be involved in 26S RNA transcription. Sequences of these regions from up to 10 alphaviruses were determined using strategies that included enzymatic, chain-termination and cDNA sequencing methods.
Comparison of the nucleotide sequences reveals three highly conserved sequences. The first is 19 nucleotides long and is located at the extreme 3'-end, adjacent to the poly(A) tail. The second is 21 nucleotides long; it precedes the 5'-end of 26S RNA and includes the first two nucleotides of that RNA. The third is 51 nucleotides long and begins about 130 to 150 nucleotides from the 5'-end, depending on the virus. In all alphaviruses examined, this third conserved sequence is capable of forming two stable hairpin structures and could also base-pair stably with the 3'-terminal sequences to cyclize genomic RNAs. Besides these three conserved sequences, a highly conserved stem-and-loop structure could also be formed at the extreme 5'-end of genomic RNA.
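As an illustration of the kind of comparison described above, the following minimal sketch locates the longest block of fully identical columns in a set of pre-aligned RNA sequences. It is not the procedure used in this work: the function name `longest_conserved_run`, the requirement of strict identity (the conserved sequences reported here are highly conserved, not necessarily identical), and the toy sequences are all assumptions made for illustration.

```python
def longest_conserved_run(aligned):
    """Return (start, length) of the longest run of columns that are
    identical in every sequence. Gap columns ('-') never count as conserved.

    `aligned` is a list of equal-length strings (already aligned).
    """
    if not aligned:
        return (0, 0)
    width = len(aligned[0])
    if any(len(s) != width for s in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")

    best_start, best_len = 0, 0
    run_start = None
    # Iterate one past the end so a run touching the last column is closed.
    for i in range(width + 1):
        conserved = (
            i < width
            and aligned[0][i] != "-"
            and len({s[i] for s in aligned}) == 1
        )
        if conserved:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return (best_start, best_len)


# Invented toy sequences, not real alphavirus data.
toy_alignment = [
    "ACGUUUGCAUGG",
    "UCGUUUGCAUCA",
    "GCGUUUGCAUAC",
]
start, length = longest_conserved_run(toy_alignment)
conserved_block = toy_alignment[0][start:start + length]
```

A real comparison would tolerate occasional substitutions and would need the sequences aligned first; the strict-identity rule is used here only to keep the sketch short.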
Defective interfering (DI) RNAs of alphaviruses are mutated genomic RNAs that often contain deleted, repeated and translocated sequences, yet retain all elements essential for their replication. The importance of these conserved sequences and secondary structures in alphavirus replication is discussed in light of the sequence organization of alphavirus DI RNAs, and of the 3'-terminal sequences and replication of the genomic RNAs of two alphavirus variants.
Both the 3'- and 5'-terminal sequences of several alphavirus 26S RNAs were also determined. Results show that 26S and genomic RNAs are coterminal. Combined with previously published results, these data give total lengths of 4102 and 4074 nucleotides for the 26S RNAs of Sindbis virus and Semliki Forest virus, respectively.
The NH2- and COOH-terminal sequences of the precursors of nonstructural proteins (translated from genomic RNA) and structural proteins (translated from 26S RNA) of several alphaviruses were deduced from the nucleotide sequences determined. The initiation codons of most alphavirus genomic and 26S RNAs are preceded by the sequence CANN. To assess the importance of the tetranucleotide preceding the initiation codon, the corresponding sequences in 65 eucaryotic mRNAs were surveyed. Results show that the sequence distribution of these tetranucleotides is non-random, suggesting that they might be involved in initiation of translation.
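A survey of this kind can be sketched as follows: extract the four nucleotides immediately upstream of each initiation codon, tally the bases at each position, and compare the tallies with an even distribution. The statistical test actually used is not stated here, so the Pearson chi-square against equal base frequencies is an assumption, as are the function names and the toy sequences, which are invented and are not the 65 mRNAs surveyed.

```python
from collections import Counter


def upstream_tetranucleotides(mrnas):
    """Return the 4 nt immediately preceding each initiation codon.

    `mrnas` is an iterable of (sequence, start_index) pairs, where
    start_index is the 0-based position of the A in the initiating AUG.
    Entries with fewer than 4 leader nucleotides, or whose start_index
    does not point at AUG, are skipped.
    """
    tetras = []
    for seq, start in mrnas:
        seq = seq.upper().replace("T", "U")
        if start < 4 or seq[start:start + 3] != "AUG":
            continue
        tetras.append(seq[start - 4:start])
    return tetras


def position_counts(tetras):
    """Count each base at each of the four positions (-4 to -1)."""
    return [Counter(t[i] for t in tetras) for i in range(4)]


def chi_square_uniform(counter, alphabet="ACGU"):
    """Pearson chi-square statistic against equal base frequencies
    (assumed null model; 3 degrees of freedom)."""
    n = sum(counter[b] for b in alphabet)
    expected = n / len(alphabet)
    return sum((counter[b] - expected) ** 2 / expected for b in alphabet)


# Invented toy mRNA leaders, for illustration only.
toy_mrnas = [
    ("GGCACCAUGGCU", 6),
    ("UUCAGGAUGAAA", 6),
    ("ACCAUUAUGCCC", 6),
    ("GACAAAAUGUUU", 6),
]
tetras = upstream_tetranucleotides(toy_mrnas)
counts = position_counts(tetras)
stats = [chi_square_uniform(c) for c in counts]
```

In the toy data, positions -4 and -3 are invariant (C and A) while positions -2 and -1 vary, mirroring the CANN pattern. With a sample this small the chi-square value is not statistically meaningful, and a realistic null model would use the overall base composition of the leaders instead of equal frequencies.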
The 3'-noncoding regions of alphavirus genomic RNAs contain AU-rich sequences. Sequence organization in the 3'-noncoding regions is similar to that in alphavirus DI RNAs. Mechanisms for the generation of these sequence rearrangements are discussed.