RNA, or ribonucleic acid, is one of two nucleic acids found in nature. The other, deoxyribonucleic acid (DNA), is certainly more fixed in the imagination. Even people with little interest in science have an inkling that DNA is vital in the passing on of traits from one generation to the next, and that every human being's DNA is unique (and therefore is a bad idea to leave at a crime scene). But for all of DNA's notoriety, RNA is a more versatile molecule, coming in three major forms: messenger RNA (mRNA), ribosomal RNA (rRNA) and transfer RNA (tRNA).
The job of mRNA relies heavily on the other two types, and mRNA lies squarely at the center of the so-called central dogma of molecular biology (DNA begets RNA, which in turn begets proteins).
Nucleic Acids: An Overview
DNA and RNA are nucleic acids, which means that they are polymer macromolecules, the monomeric constituents of which are called nucleotides. Nucleotides consist of three distinct portions: a pentose sugar, a phosphate group and a nitrogenous base, selected from among four choices. A pentose sugar is a five-carbon sugar; in nucleotides, it takes the form of a five-atom ring.
Three major differences distinguish DNA from RNA. First, in RNA, the sugar portion of the nucleotide is ribose, while in DNA it is deoxyribose, which is simply ribose with a hydroxyl (-OH) group removed from one of the carbons in the five-atom ring and replaced by a hydrogen atom (-H). Thus the sugar portion of DNA is just one oxygen atom less massive than that of RNA, but RNA is a far more chemically reactive molecule than DNA because of its one extra -OH group. Second, DNA is, rather famously, double-stranded and wound into a helical shape in its most stable form. RNA, on the other hand, is single-stranded. And third, while DNA and RNA both feature the nitrogenous bases adenine (A), cytosine (C) and guanine (G), the fourth such base in DNA is thymine (T) while in RNA it is uracil (U).
Scientists have known since the mid-1900s that in double-stranded DNA, each nitrogenous base pairs with one, and only one, other kind of base: A pairs with T, and C pairs with G. Furthermore, A and G are chemically classified as purines, while C and T are called pyrimidines. Because purines are substantially larger than pyrimidines, an A-G pairing would be overly bulky, whereas a C-T pairing would be unusually undersized; either situation would prevent the two strands in double-stranded DNA from staying the same distance apart at all points along their length.
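The size argument above can be sketched in a few lines of Python. This is only an illustration: the function name pair_width and its return labels are made up for this example, and a "uniform" result is a necessary condition for pairing, not a sufficient one (A and C are a purine and a pyrimidine, yet they do not pair).

```python
# Bases grouped by chemical class, as described in the text.
PURINES = {"A", "G"}      # the larger bases
PYRIMIDINES = {"C", "T"}  # the smaller bases

def pair_width(base1, base2):
    """Classify a hypothetical DNA base pairing by its overall size."""
    is_purine = [base in PURINES for base in (base1, base2)]
    if all(is_purine):
        return "too wide"    # purine + purine, e.g. A-G
    if not any(is_purine):
        return "too narrow"  # pyrimidine + pyrimidine, e.g. C-T
    return "uniform"         # one purine and one pyrimidine

for pair in [("A", "T"), ("C", "G"), ("A", "G"), ("C", "T")]:
    print(pair, pair_width(*pair))
```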
Because of this pairing scheme, the two strands of DNA are called "complementary," and the sequence of one can be predicted if the other is known. For example, if a string of ten nucleotides in a strand of DNA has the base sequence AAGCGTATTG, the complementary DNA strand will have the base sequence TTCGCATAAC. Because RNA is synthesized from a DNA template, this has implications for transcription as well.
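As a minimal sketch of how predictable the complementary strand is, the pairing rules can be written as a lookup table. The names DNA_PAIRS and complement_dna are chosen for this example only.

```python
# Pairing rules for DNA: A pairs with T, and C pairs with G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement_dna(strand):
    """Return the base-by-base complement of a DNA strand."""
    return "".join(DNA_PAIRS[base] for base in strand)

print(complement_dna("AAGCGTATTG"))  # TTCGCATAAC
```

Applying the function twice returns the original sequence, which is another way of saying that each strand fully determines the other.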
Basic RNA Structure
mRNA is the most "DNA-like" form of ribonucleic acid because its job is largely the same: to transmit the information encoded in genes, in the form of carefully ordered nitrogenous bases, to the cellular machinery that assembles proteins. But other vital types of RNA exist as well.
The three-dimensional structure of DNA was elucidated in 1953, earning James Watson and Francis Crick a Nobel Prize. But for years afterward, the structure of RNA remained elusive despite efforts by some of the same DNA experts to describe it. In the 1960s, it became clear that although RNA is single-stranded, its secondary structure – that is, the relationship of the sequence of nucleotides to each other as the RNA winds its way through space – implies that lengths of RNA can fold back in on themselves, with bases in the same strand thus linking to one another in the same way a length of duct tape might stick to itself if you allow it to kink. This is the basis for the cross-like structure of tRNA, which includes three 180-degree bends that create the molecular equivalent of cul-de-sacs in the molecule.
rRNA is somewhat different. In eukaryotes, most rRNA is derived from one monster of a precursor strand some 13,000 nucleotides long. After a number of chemical modifications, this strand is cleaved into unequal pieces, the two largest of which are called 18S and 28S. ("S" stands for "Svedberg unit," a measure biologists use to indirectly estimate the mass of macromolecules.) The 18S portion is incorporated into what is called the small ribosomal subunit (which when complete is actually 40S) and the 28S part contributes to the large subunit (which in total has a size of 60S); all ribosomes contain one of each subunit along with a number of proteins (not nucleic acids, which make proteins themselves possible) to provide ribosomes with structural integrity.
DNA and RNA strands both have what are called 3' and 5' ("three-prime" and "five-prime") ends based on the positions of the chemical groups attached to the sugar portion of the strand. In each nucleotide, the phosphate group is attached to the carbon atom labeled 5' in its ring, whereas the 3' carbon features a hydroxyl (-OH) group. When a nucleotide is added to a growing nucleic acid chain, this always occurs at the 3' end of the existing chain. That is, the phosphate group at the 5' end of the new nucleotide is joined to the 3' carbon that carried the hydroxyl group before the linking occurred. The -OH is replaced by the nucleotide, which loses a proton (H) from its phosphate group; thus a molecule of H2O, or water, is lost to the environment in this process, making RNA synthesis an example of a dehydration synthesis.
Transcription: Encoding the Message Into mRNA
Transcription is the process in which mRNA is synthesized from a DNA template. In principle, given what you now know, you can easily envision how this happens. DNA is double-stranded, so each strand can serve as a template for single-stranded RNA; these two new RNA strands, owing to the vagaries of specific base-pairing, will be complementary to each other, although they do not bond together. The transcription of RNA is very similar to the replication of DNA in that the same base-pairing rules apply, with U taking the place of T in RNA. Note that this replacement is a one-directional phenomenon: T in DNA still codes for A in RNA, but A in DNA codes for U in RNA.
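The one-directional swap of U for T is easy to see in a small lookup table. This is a rough sketch of the base-pairing rule only, not of the enzymatic process; the names DNA_TO_RNA and transcribe are invented for this example, and strand direction is ignored here.

```python
# DNA template base -> RNA base laid down opposite it.
# T in DNA still codes for A, but A in DNA codes for U rather than T.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template):
    """Return the RNA bases complementary to a DNA template, position by position."""
    return "".join(DNA_TO_RNA[base] for base in template)

print(transcribe("AAGCGTATTG"))  # UUCGCAUAAC
```

Compare the result with the complementary DNA strand TTCGCATAAC: the two differ only in that every T has become a U.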
For transcription to occur, the DNA double helix must become uncoiled, which it does under the direction of specific enzymes. (It later re-assumes its proper helical conformation.) After this happens, a specific sequence aptly called the promoter sequence signals where transcription is to begin along the molecule. This summons to the molecular scene an enzyme called RNA polymerase, which by this time is part of a promoter complex. All of this occurs as a sort of biochemical fail-safe mechanism to keep RNA synthesis from beginning in the wrong spot on DNA and thereby producing an RNA strand that contains an illegitimate code. The RNA polymerase "reads" the DNA strand starting at the promoter sequence and moves along the DNA strand, adding nucleotides to the 3' end of the RNA. Be aware that the RNA and DNA strands, by virtue of being complementary, are also antiparallel. This means that as the RNA grows in the 3' direction, the polymerase moves along the DNA template toward the DNA's 5' end. This is a minor but often confusing point for students, so you may wish to consult a diagram to assure yourself that you understand the mechanics of mRNA synthesis.
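In place of a diagram, the antiparallel arrangement can be sketched in code. The assumption here is the usual convention that sequences are written from the 5' end to the 3' end; the function name transcribe_5_to_3 is hypothetical. Because the polymerase starts near the template's 3' end, the template string is walked backward.

```python
# DNA template base -> RNA base laid down opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe_5_to_3(template_5_to_3):
    """Return the RNA, written 5'->3', made from a DNA template written 5'->3'."""
    # The first RNA base (its 5' end) pairs with the last template base (its 3' end),
    # so the template is read in reverse.
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5_to_3))

template = "AAGCGTATTG"             # 5'-AAGCGTATTG-3'
rna = transcribe_5_to_3(template)
print(rna)                          # CAAUACGCUU, i.e. 5'-CAAUACGCUU-3'

# Lining the two strands up base against base requires flipping one of them:
print("DNA 3'-" + template[::-1] + "-5'")
print("RNA 5'-" + rna + "-3'")
```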
The bonds created between the phosphate group of one nucleotide and the sugar group on the next are called phosphodiester linkages (pronounced "phos-pho-die-es-ter," not "phos-pho-dee-ster" as it may be tempting to assume).
The enzyme RNA polymerase comes in many forms, although bacteria include only a single type. It is a large enzyme, consisting of four protein subunits: alpha (α), beta (β), beta-prime (β′) and sigma (σ). Combined, these have a molecular weight of around 420,000 daltons. (For reference, a single carbon atom has a molecular weight of 12; a single water molecule, 18; and a whole glucose molecule, 180.) The enzyme, called a holoenzyme when all four subunits are present, is responsible for recognizing the promoter sequences on DNA and pulling apart the two DNA strands. RNA polymerase moves along the gene to be transcribed as it adds nucleotides to the growing RNA segment, a process called elongation. This process, like so many within cells, requires adenosine triphosphate (ATP) as an energy source. ATP is really nothing more than an adenine-containing nucleotide that has three phosphates instead of one.
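To get a feel for the scale of those reference numbers, here is a quick back-of-the-envelope calculation. It assumes rounded atomic masses (carbon 12, hydrogen 1, oxygen 16) and the standard formulas for water (H2O) and glucose (C6H12O6); the helper molecular_weight is invented for this sketch.

```python
# Rounded atomic masses in daltons (an approximation for illustration).
ATOMIC_MASS = {"C": 12, "H": 1, "O": 16}

def molecular_weight(formula):
    """Sum atomic masses for a formula given as {element: atom count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

water = molecular_weight({"H": 2, "O": 1})
glucose = molecular_weight({"C": 6, "H": 12, "O": 6})
print(water, glucose)  # 18 180

HOLOENZYME_DALTONS = 420_000  # approximate figure quoted in the text
print(round(HOLOENZYME_DALTONS / glucose))  # 2333
```

In other words, the holoenzyme weighs roughly as much as a couple of thousand glucose molecules.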
Transcription ceases when the moving RNA polymerase encounters a termination sequence in DNA. Just as the promoter sequence may be viewed as the equivalent of a green light on a traffic light, the termination sequence is the analog of a red light or stop sign.
Translation: Decoding the Message From mRNA
When an mRNA molecule carrying the information for a particular protein – that is, a piece of mRNA corresponding to a gene – is complete, it still needs to be processed before it is ready to do its job of delivering a chemical blueprint to the ribosomes, where protein synthesis takes place. In eukaryotic organisms, it also migrates out of the nucleus (prokaryotes do not have a nucleus).
Critically, nitrogenous bases carry genetic information in groups of three, called triplet codons. Each codon carries instructions to add a particular amino acid to a growing protein. Just as nucleotides are the monomer units of nucleic acids, amino acids are the monomers of proteins. Because RNA contains four different nucleotides (owing to the four different bases available) and a codon consists of three consecutive nucleotides, there are 64 total triplet codons available (4³ = 64). That is, starting with AAA, AAC, AAG, AAU and working all the way to UUU, there are 64 combinations. Humans, however, make use of only 20 amino acids. As a result, the triplet code is said to be redundant: In most cases, multiple triplets code for the same amino acid. The inverse is not true – that is, the same triplet cannot code for more than one amino acid. You can probably envision the biochemical chaos that would ensue otherwise. In fact, the amino acids leucine, arginine and serine each have six triplets corresponding to them. Three different codons are STOP codons, similar to the transcription termination sequences in DNA.
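The counting argument can be checked by simply listing every triplet. The sketch below uses Python's itertools.product; the identities of the three STOP codons and the six-codon sets for leucine, arginine and serine are not given in the text above and are taken from the standard genetic code.

```python
from itertools import product

BASES = "ACGU"

# Every ordered triplet of the four RNA bases: 4 * 4 * 4 of them.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))             # 64
print(codons[:4], codons[-1])  # ['AAA', 'AAC', 'AAG', 'AAU'] UUU

# From the standard genetic code (not stated in the article itself).
STOP_CODONS = {"UAA", "UAG", "UGA"}
SIX_CODON_AMINO_ACIDS = {
    "leucine": {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"},
    "arginine": {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"},
    "serine": {"UCU", "UCC", "UCA", "UCG", "AGU", "AGC"},
}

# Codons left over to specify the 20 amino acids.
sense_codons = [codon for codon in codons if codon not in STOP_CODONS]
print(len(sense_codons))       # 61
```

With 61 codons shared among 20 amino acids, redundancy is unavoidable: on average, each amino acid is specified by about three different triplets.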
Translation itself is a highly cooperative process, bringing together all of the members of the extended RNA family. Because it occurs on ribosomes, it obviously involves the use of rRNA. The tRNA molecules, described earlier as tiny crosses, are responsible for carrying individual amino acids to the translation site on the ribosome, with each amino acid carted about by its own specific brand of tRNA escort. Like transcription, translation has initiation, elongation and termination phases. At the end of the synthesis of a protein molecule, the protein is released from the ribosome, in many cases to be processed and packaged by Golgi bodies for use elsewhere, and the ribosome itself dissociates into its component subunits.
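The three phases can be mimicked with a toy decoder. This is only a sketch under several assumptions: the codon table is deliberately partial (a handful of entries from the standard genetic code, so any other codon raises a KeyError), AUG is treated as the start signal as in the standard code, and the function name translate is made up for this example. Real translation is carried out by ribosomes and tRNA, not by table lookup.

```python
# A deliberately partial codon table drawn from the standard genetic code.
CODON_TABLE = {
    "AUG": "Met",  # also serves as the usual start signal
    "UUU": "Phe",
    "GGC": "Gly",
    "CUG": "Leu",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}

def translate(mrna):
    """Decode an mRNA string into a list of amino acid abbreviations."""
    start = mrna.find("AUG")                    # initiation: locate the first AUG
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):    # elongation: one triplet at a time
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":                # termination: release the chain
            break
        protein.append(amino_acid)
    return protein

print(translate("AUGUUUGGCUAA"))  # ['Met', 'Phe', 'Gly']
```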
About the Author
Kevin Beck holds a bachelor's degree in physics with minors in math and chemistry from the University of Vermont. Formerly with ScienceBlogs.com and the editor of "Run Strong," he has written for Runner's World, Men's Fitness, Competitor, and a variety of other publications. More about Kevin and links to his professional work can be found at www.kemibe.com.