Regardless of whether you're a newcomer to biology or a longtime aficionado, chances are excellent that by default, you view deoxyribonucleic acid (DNA) as perhaps the single most indispensable concept in all of life science. At a minimum, you're likely aware that DNA is what makes you unique among the billions of people on the planet, giving it a role in the criminal justice world as well as center stage in molecular biology lectures. You've almost surely learned that DNA is responsible for endowing you with whatever traits you inherited from your parents, and that your own DNA is your direct legacy to future generations should you have children.
What you may not know a lot about is the path that connects the DNA in your cells to the physical traits you manifest, both overt and concealed, and the series of steps along that path. Molecular biologists have produced the concept of a "central dogma" in their field, which can be summarized simply as "DNA to RNA to protein." The first part of this process – generating RNA, or ribonucleic acid, from DNA – is known as transcription, and this well-studied and coordinated series of biochemical gymnastics is as elegant as it is scientifically profound.
Overview of Nucleic Acids
DNA and RNA are nucleic acids. Both are fundamental to all of life; these macromolecules are very closely related, but their functions, while exquisitely intertwined, are highly divergent and specialized.
DNA is a polymer, which means that it consists of a large number of repeating subunits. These subunits are not precisely identical, but they are identical in form. Consider a long string of beads consisting of cubes that come in four colors and vary ever so slightly in size, and you gain a basic sense of how DNA and RNA are arranged.
The monomers (subunits) of nucleic acids are known as nucleotides. Each nucleotide consists of three distinct components: a phosphate group (or groups), a five-carbon sugar and a nitrogen-rich base ("base" not in the sense of "foundation," but meaning "hydrogen-ion acceptor"). The nucleotides incorporated into nucleic acids carry one phosphate group, but free nucleotides can carry two or even three phosphates attached in a row. The molecules adenosine diphosphate (ADP) and adenosine triphosphate (ATP) are nucleotides of extraordinary importance in cellular energy metabolism.
DNA and RNA differ in several important ways. One, while each of these molecules includes four different nitrogenous bases, DNA includes adenine (A), cytosine (C), guanine (G) and thymine (T), whereas RNA includes the first three of these, but substitutes uracil (U) for T. Two, the sugar in DNA is deoxyribose, while that in RNA is ribose. And three, DNA is double-stranded in its most energetically stable form, whereas RNA is single-stranded. These differences are of major importance in both transcription specifically and the function of these respective nucleic acids generally.
The bases A and G are called purines, while C, T and U are classified as pyrimidines. Critically, A chemically binds to, and only to, T (if DNA) or U (if RNA); C binds to and only to G. The two strands of a DNA molecule are complementary, meaning that the bases in each strand match at every point to the unique "partner" base in the opposite strand. Thus AACTGCGTATG is complementary to TTGACGCATAC (or UUGACGCAUAC).
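The pairing rules above are simple enough to express as a lookup table. What follows is a minimal Python sketch, not a standard library routine; the names `complement`, `DNA_PARTNER` and `RNA_PARTNER` are made up for illustration, and the function ignores strand direction and handles only the four DNA letters.

```python
# Pairing rules from the text: A pairs with T (in DNA) or U (in RNA); C pairs with G.
# Each table maps a base on a DNA strand to its partner on the opposite strand.
DNA_PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PARTNER = {"A": "U", "T": "A", "C": "G", "G": "C"}


def complement(dna_strand: str, rna: bool = False) -> str:
    """Return the partner strand of a DNA strand, position for position."""
    table = RNA_PARTNER if rna else DNA_PARTNER
    return "".join(table[base] for base in dna_strand.upper())


print(complement("AACTGCGTATG"))            # TTGACGCATAC
print(complement("AACTGCGTATG", rna=True))  # UUGACGCAUAC
```

Both printed strands match the example sequences given in the paragraph above.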
DNA Transcription vs. Translation
Before delving into the mechanics of DNA transcription, it's worth taking a moment to review the terminology associated with DNA and RNA, because with so many similar-sounding words in the mix, it can be easy to confuse them.
Replication is the act of making an identical copy of something. When you make a photocopy of a written document (old school) or use the copy-and-paste function on a computer (new school), you are replicating the content in both cases.
DNA undergoes replication, but cellular RNA, insofar as modern science can ascertain, does not; it arises only from transcription. From a Latin root that means "a writing across," transcription is the encoding of a particular message in a copy of an original source. You may have heard of medical transcriptionists, whose job is to type into written form the medical notes made as an audio recording. Ideally, the words, and thus the message, will be precisely the same despite the change in medium. In cells, transcription involves the copying of a genetic DNA message, written in the language of nitrogenous base sequences, into RNA form – specifically, messenger RNA (mRNA). This RNA synthesis occurs in the nucleus of eukaryotic cells, after which the mRNA leaves the nucleus and heads for a structure called a ribosome to undergo translation.
Whereas transcription is the simple physical encoding of a message in a different medium, translation, in biological terms, is the conversion of that message into purposeful action. A length of DNA or single DNA message, called a gene, ultimately results in cells manufacturing a unique protein product. The DNA ships this message along in the form of mRNA, which carries it to a ribosome, where it is translated into a protein. In this view, mRNA is like a blueprint or a set of instructions for assembling a piece of furniture.
That hopefully clears up any mysteries you have about what nucleic acids do. But what about transcription in particular?
The Steps of Transcription
DNA, rather famously, is woven into a double-stranded helix. But in this form, it would be physically difficult to build anything from it. Therefore, in the initiation phase (or step) of transcription, the DNA molecule is unwound by enzymes called helicases. Only one of the two resulting DNA strands is used for RNA synthesis at a time. This strand, which serves as the template, is referred to as the noncoding strand; thanks to the rules of DNA and RNA base-pairing, the other DNA strand has the same sequence of nitrogenous bases as the mRNA to be synthesized (with T in place of U), which makes that other strand the coding strand. Based on points made previously, you can conclude that the template strand of DNA and the mRNA it is responsible for manufacturing are complementary.
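To see the relationship between the two DNA strands and the mRNA, it can help to work through a tiny case. The Python sketch below is illustrative only: the function name `transcribe` and the sample sequences are assumptions, and the strands are lined up position by position with no attention paid to 5' and 3' ends.

```python
# Base the RNA polymerase would add opposite each base of the DNA template.
PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand: str) -> str:
    """Build an mRNA against the template (noncoding) DNA strand."""
    return "".join(PAIR[base] for base in template_strand.upper())


coding_strand = "AACTGCGTATG"    # hypothetical coding strand
template_strand = "TTGACGCATAC"  # its complementary (noncoding) partner

mrna = transcribe(template_strand)
print(mrna)                                     # AACUGCGUAUG
print(mrna == coding_strand.replace("T", "U"))  # True
```

The mRNA is complementary to the template strand and, apart from U standing in for T, reads the same as the coding strand.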
With the strand now ready for action, a section of DNA called the promoter sequence indicates where transcription is to start along the strand. The enzyme RNA polymerase arrives at this location and becomes part of a promoter complex. All of this is to ensure that mRNA synthesis begins exactly where it is supposed to on the DNA molecule, and this generates an RNA strand that holds the desired coded message.
Next, in the elongation phase, RNA polymerase "reads" the DNA strand, starting at the promoter sequence and moving along the DNA strand, like a teacher walking up a row of students and distributing tests, adding nucleotides to the growing end of the newly forming RNA molecule.
The bonds created between the phosphate group of one nucleotide and the ribose or deoxyribose group on the next nucleotide are called phosphodiester linkages. Note that a strand of DNA or RNA has what is called a 3' ("three-prime") terminus at one end and a 5' ("five-prime") terminus at the other, with these numbers coming from the numbered carbon-atom positions in the sugar "rings" at the respective ends. The RNA molecule grows at its 3' end (that is, it is built in the 5'-to-3' direction), while RNA polymerase moves along the template DNA strand toward that strand's 5' end (reading it in the 3'-to-5' direction). You should examine a diagram to assure yourself that you fully understand the mechanics of mRNA synthesis.
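If no diagram is at hand, the opposing directions can be traced in a few lines of Python. This is a sketch under one stated assumption: sequences are written left to right from 5' to 3', which is the usual convention. The function name `transcribe_5_to_3` and the sample template are hypothetical.

```python
# Base added to the RNA opposite each base of the DNA template.
PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_5_to_3(template_5_to_3: str) -> str:
    """Return the mRNA, written 5'->3', for a template strand written 5'->3'."""
    rna = []
    # The polymerase begins at the template's 3' end (the right-hand end of
    # the string) and walks toward the template's 5' end.
    for base in reversed(template_5_to_3.upper()):
        # Each new nucleotide is appended to the 3' end of the growing RNA.
        rna.append(PAIR[base])
    return "".join(rna)


print(transcribe_5_to_3("CATACGCAGTT"))  # AACUGCGUAUG
```

Reading the template backward while writing the RNA forward is why the two strands are described as antiparallel.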
The addition of nucleotides – specifically, nucleoside triphosphates (ATP, CTP, GTP and UTP; ATP is adenosine triphosphate, CTP is cytidine triphosphate and so on) – to the elongating mRNA strand requires energy. This energy, as in so many biological processes, is provided by the phosphate bonds in the nucleoside triphosphates themselves. When the high-energy phosphate-phosphate bond is broken, the resulting nucleotide (AMP, CMP, GMP or UMP; in these nucleotides, the "MP" stands for "monophosphate") is added to the mRNA, and a pair of linked phosphate groups (pyrophosphate, usually written PPi) falls away.
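The bookkeeping of this step can be tallied in a short Python sketch. The function name `elongation_ledger` and the sample sequence are invented for illustration. One simplification is worth flagging: the very first nucleotide has no predecessor to be joined to, so a transcript of n nucleotides is counted here as having n - 1 linkages and n - 1 released PPi.

```python
from collections import Counter

# Nucleoside triphosphate consumed for each base that appears in the mRNA.
NTP_FOR_BASE = {"A": "ATP", "C": "CTP", "G": "GTP", "U": "UTP"}


def elongation_ledger(mrna: str) -> dict:
    """Tally the NTPs used, linkages formed and PPi released for one mRNA."""
    mrna = mrna.upper()
    ntps_used = Counter(NTP_FOR_BASE[base] for base in mrna)
    # One phosphodiester linkage (and one PPi) per nucleotide joined to the
    # chain; the first nucleotide is not joined to anything before it.
    linkages = max(len(mrna) - 1, 0)
    return {
        "ntps_used": dict(ntps_used),
        "phosphodiester_linkages": linkages,
        "ppi_released": linkages,
    }


ledger = elongation_ledger("AACUGCGUAUG")
print(ledger["ntps_used"])
print(ledger["ppi_released"])  # 10
```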
As transcription occurs, it does so, as stated, along a single strand of DNA. Be aware, however, that the entire DNA molecule does not uncoil and separate into complementary strands; this only happens in the direct vicinity of transcription. As a result, you can visualize a "transcription bubble" moving along the DNA molecule. This is like an object that moves along a zipper that is being unzipped just ahead of the object by one mechanism while a different mechanism re-zips the zipper in the object's wake.
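The zipper picture can be turned into a toy animation. In the Python sketch below, bases inside the bubble are shown in lowercase (unzipped) and everything else in uppercase (zipped). The function name `bubble_frames` and the bubble size of four bases are arbitrary choices for illustration, not measured values.

```python
def bubble_frames(dna: str, bubble_size: int = 4):
    """Yield one snapshot per step as the bubble slides along the DNA."""
    dna = dna.upper()
    for start in range(len(dna) - bubble_size + 1):
        end = start + bubble_size
        # Ahead of and behind the bubble the helix stays zipped (uppercase);
        # only the window in between is open (lowercase).
        yield dna[:start] + dna[start:end].lower() + dna[end:]


frames = list(bubble_frames("AACTGCGTATG"))
for frame in frames:
    print(frame)
```

The first line printed is `aactGCGTATG` and the last is `AACTGCGtatg`, with the open window advancing one base per step.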
Finally, when the mRNA has reached its required length and form, the termination phase gets underway. Like initiation, this phase is enabled by specific DNA sequences that function as stop signs for RNA polymerase.
In bacteria, this can happen in two general ways. In one of these, the termination sequence is transcribed, generating a length of mRNA that folds back in on itself and thereby "bunches up" as the RNA polymerase continues to do its job. These folded sections of mRNA are often referred to as hairpin strands, and they involve complementary base pairing within the single-stranded but contorted mRNA molecule. Downstream from this hairpin section is a prolonged stretch of U bases, or residues. These events compel the RNA polymerase to stop adding nucleotides and detach from the DNA, ending transcription. This is referred to as rho-independent termination because it does not rely on a protein known as a rho factor.
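The two ingredients of rho-independent termination, a self-complementary stretch followed by a run of U residues, can be searched for in a sequence. The Python sketch below is a deliberately crude pattern match and not a real terminator-prediction tool: the function names, the stem length, the loop range and the U-run length are all assumed values chosen for illustration, and only perfect A-U and C-G pairing is considered.

```python
# Within-RNA pairing used to form the hairpin stem.
PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(rna: str) -> str:
    """Sequence that would pair with `rna` when the strand folds back on itself."""
    return "".join(PAIR[base] for base in reversed(rna))


def find_terminators(rna, stem=5, min_loop=3, max_loop=8, u_run=4):
    """Return (stem_start, loop_length) for each hairpin followed by a U run."""
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - stem + 1):
        left = rna[i:i + stem]
        needed = reverse_complement(left)  # what the other half of the stem must be
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop            # where the second half of the stem begins
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + u_run]
            if right == needed and tail == "U" * u_run:
                hits.append((i, loop))
    return hits


# Hypothetical transcript: stem GCCGC, loop GAAA, stem GCGGC, then six U's.
transcript = "AUCGCCGCGAAAGCGGCUUUUUUA"
print(find_terminators(transcript))  # [(3, 4)]
```

The single hit says that a stem begins at index 3 and is separated from its partner by a four-base loop, with the U run immediately downstream.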
In rho-dependent termination, the situation is simpler, and no hairpin mRNA segments or U residues are needed. Instead, the rho factor binds to the required spot on mRNA and physically pulls the mRNA away from RNA polymerase. Whether rho-independent or rho-dependent termination occurs depends on the termination sequence involved as well as the proteins and other factors in the immediate cellular environment.
Both cascades of events ultimately lead to the mRNA breaking free of the DNA at the transcription bubble.
Prokaryotes vs. Eukaryotes
Numerous differences exist between transcription in prokaryotes (bacteria and archaea) and eukaryotes (organisms whose cells have a nucleus, such as animals, plants and fungi). For example, initiation in bacteria usually involves a DNA base arrangement known as the Pribnow box, with the base sequence TATAAT located roughly 10 base pairs upstream of where transcription initiation itself occurs. Eukaryotes, however, have enhancer sequences positioned at a considerable distance from the initiation site, as well as activator proteins that help deform the DNA molecule in a way that renders it more accessible to RNA polymerase.
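Looking for a Pribnow box is, at its crudest, a text search. The Python sketch below scans a made-up sequence for the exact motif TATAAT and guesses a start site about 10 bases along. The function name `find_pribnow_boxes`, the sample sequence, and the decision to count the 10 bases from the first base of the box are all assumptions for illustration; real promoters tolerate variations on the motif, which this sketch ignores.

```python
def find_pribnow_boxes(coding_strand: str, box: str = "TATAAT", offset: int = 10):
    """Return (box_index, rough_start_index) for every exact match of the box."""
    sequence = coding_strand.upper()
    hits = []
    index = sequence.find(box)
    while index != -1:
        # The start site is only "roughly" 10 bases away, so treat this as a guess.
        hits.append((index, index + offset))
        index = sequence.find(box, index + 1)
    return hits


# Hypothetical stretch of DNA containing one Pribnow box.
dna = "GGCTATAATGCATCGAACTGCGTATG"
print(find_pribnow_boxes(dna))  # [(3, 13)]
```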
In addition, elongation occurs about twice as fast in bacteria (around 42 to 54 nucleotides per second) as in eukaryotes (about 22 to 25 nucleotides per second). Finally, while bacterial mechanisms of termination are described above, in eukaryotes, this phase involves specific termination factors, as well as a strand of RNA called a poly-A (as in, many adenine bases in a row) "tail." It is not yet clear whether cessation of elongation triggers cleavage of the mRNA from the bubble or whether cleavage itself abruptly ends the elongation process.
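The "about twice as fast" comparison is easy to check with a little arithmetic. The Python sketch below takes the midpoint of each quoted range and a hypothetical gene length of 1,200 nucleotides. The function name `elongation_time` is invented for illustration, the rate is assumed to stay constant, and the time comes back in whatever unit the rate is quoted in.

```python
def elongation_time(length_nt: float, rate: float) -> float:
    """Time needed to add `length_nt` nucleotides at a constant rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return length_nt / rate


bacterial_rate = (42 + 54) / 2   # midpoint of the quoted bacterial range
eukaryotic_rate = (22 + 25) / 2  # midpoint of the quoted eukaryotic range
gene_length = 1200               # hypothetical gene length in nucleotides

bacterial_time = elongation_time(gene_length, bacterial_rate)
eukaryotic_time = elongation_time(gene_length, eukaryotic_rate)

print(bacterial_time)                               # 25.0
print(round(eukaryotic_time / bacterial_time, 2))   # 2.04
```

With these midpoints, the eukaryotic transcript takes a little over twice as long, in line with the comparison in the text.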