Each gene on a chromosome can be thought of as the instructions for making a particular protein in a cell. However, the genes themselves cannot direct the synthesis of the proteins they encode but must first be converted into a form that can be recognized by the cellular protein-making machine, the ribosome. This conversion process is called transcription. During transcription the instructions held in the genes (in the form of deoxyribonucleic acid [DNA]) are transcribed into a chemical form called ribonucleic acid (RNA). Because this RNA carries the message, or instructions, from the genes to the ribosomes, where it is ultimately converted into a protein molecule, it is called messenger RNA (mRNA).
Most genes in a cell code for protein and hence are transcribed into mRNA. However, a few genes code for different types of RNA that are not used as templates for protein synthesis but instead are ends in themselves and carry out a variety of functions in the cell. Examples of these other kinds of RNA are transfer RNA (tRNA) and ribosomal RNA (rRNA), which are both critical to the process of protein synthesis.
Transcription in Prokaryotes
Much of the pioneering work on transcription was carried out in prokaryotes, most notably in the bacterium E. coli. These studies laid the foundation for work that was later carried out in the more complex eukaryotes. The enzyme that carries out transcription is called RNA polymerase, and it consists of four kinds of polypeptides, designated α, β, β′, and σ, which are bound together into a complex called a holoenzyme.
Transcription can be divided into three phases: initiation, elongation, and termination. Initiation occurs when the polymerase, sliding along the chromosome, encounters a promoter, a sequence of DNA that identifies the beginning of a gene. The promoter contains two sequence elements, six base pairs apiece, called the -10 and -35 elements, which are located ten and thirty-five base pairs respectively upstream of the transcription start site. DNA is double-stranded, but only one strand serves as a template from which RNA is made. Because the two strands are bound to one another, the template strand must be made accessible if it is to serve as a template for RNA synthesis. To begin, the polymerase unwinds a region of approximately seventeen base pairs, setting the stage for the formation of the first phosphodiester bond. Unlike the synthesis of DNA, the synthesis of RNA can be initiated without the need for a primer.
RNA is made by linking together ribonucleotides in an order dictated by the DNA template strand. The essence of transcription is to use the sequence of nucleotides already on the DNA strand to dictate the sequence of RNA nucleotides that will be formed into the new RNA strand. The four DNA nucleotides are adenine, guanine, cytosine, and thymine (A, G, C, and T). RNA nucleotides are A, G, C, and uracil (U). RNA polymerase pairs up an RNA nucleotide with each DNA nucleotide. However, rather than copying the DNA sequence directly, the polymerase pairs each DNA base with its complementary base. G is always paired with C, so that if the DNA sequence is GGCC, the resulting RNA sequence is CCGG. The A of DNA is paired with the U of RNA, and the T of DNA is paired with the A of RNA, so that if the DNA sequence is AATT, the RNA sequence is UUAA. Thus, as in DNA replication, the rules of Watson-Crick base pairing apply during transcription.
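The pairing rules just described can be sketched in a few lines of Python. This is only an illustration: the function name transcribe_template and the lookup table PAIRING are made up for the example, and the sketch pairs bases position by position, as the examples in the text do, without modeling strand direction.

```python
# Pairing rules for transcription: DNA template base -> RNA base.
PAIRING = {"G": "C", "C": "G", "A": "U", "T": "A"}


def transcribe_template(template_dna: str) -> str:
    """Return the RNA bases paired, position by position, with a DNA template."""
    return "".join(PAIRING[base] for base in template_dna.upper())


print(transcribe_template("GGCC"))  # CCGG
print(transcribe_template("AATT"))  # UUAA
```

The two calls reproduce the GGCC and AATT examples given in the text.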
The nucleotides used in RNA synthesis are triphosphates, meaning they have three phosphate groups attached. This energizes them, and hydrolysis of these phosphates powers the transcription process. Once the chain has reached a length of approximately ten ribonucleotides, the σ subunit dissociates, leaving the core enzyme (α2ββ′) to continue transcribing until the signal for termination is reached. Termination signals on DNA vary, but the most common is a GC-rich region followed by an AT-rich region.
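As a rough illustration of what "a GC-rich region followed by an AT-rich region" means in sequence terms, the following Python sketch scans a DNA string for such a pattern. The window size of eight bases and the 75 percent threshold are arbitrary assumptions chosen for the example, not values from the text, and the function names are hypothetical. Real terminators are recognized by the polymerase in more subtle ways.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def find_terminator_like_sites(dna: str, window: int = 8, threshold: float = 0.75):
    """Return start indices where a GC-rich window is immediately
    followed by an AT-rich window (window and threshold are assumptions)."""
    sites = []
    for i in range(len(dna) - 2 * window + 1):
        gc_part = dna[i:i + window]
        at_part = dna[i + window:i + 2 * window]
        at_fraction = 1 - gc_fraction(at_part)
        if gc_fraction(gc_part) >= threshold and at_fraction >= threshold:
            sites.append(i)
    return sites


# Hypothetical sequence: a GC-rich stretch starts at index 8,
# followed directly by an AT-rich stretch.
dna = "ATGCATGC" + "GCGCGGCC" + "TTTTATTT" + "ACGTACGT"
print(find_terminator_like_sites(dna))
```

Because neighboring windows overlap, a single terminator-like region is usually reported at several adjacent positions.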
OCHOA, SEVERO (1905–1993)
Spanish molecular biologist who received, with Arthur Kornberg, the 1959 Nobel Prize in physiology or medicine for discovering an enzyme that can be used to make ribonucleic acid (RNA). His work was fundamental to modern biotechnology.
Transcription in Eukaryotes
The basic features of RNA synthesis are shared between prokaryotes and eukaryotes; however, transcription in eukaryotes differs in that it is significantly more complex. First, rather than having a single RNA polymerase, eukaryotes have three different RNA polymerases, each of which transcribes a different set of genes. RNA polymerase I transcribes three types of rRNA (the 18S, 5.8S, and 28S species), RNA polymerase II transcribes mRNA, and RNA polymerase III transcribes tRNA and the smallest rRNA (the 5S species). The eukaryotic RNA polymerases consist of between eight and fourteen subunits, with two of them corresponding to the β and β′ subunits of prokaryotic RNA polymerases.
Unlike the bacterial RNA polymerase, eukaryotic RNA polymerases cannot initiate transcription by themselves but need the help of a set of proteins called the basic transcription factors. The basic transcription factors perform a number of functions, including binding to gene promoter regions and attracting the appropriate RNA polymerase to the initiation site, as well as unwinding the DNA double helix so that the incoming ribonucleotides of the growing RNA chain can reach the template strand.
RNA Polymerase II Transcription. The promoters of eukaryotic genes have been intensely studied, especially those transcribed by RNA polymerase II. Most genes transcribed by RNA polymerase II contain a sequence called a TATA-box located twenty-five to thirty-five nucleotides upstream of the transcription start site. The TATA-box contains the sequence TATAA, which is recognized by a multicomponent transcription factor called TFIID. One of the components of TFIID is the TATA-binding protein (TBP), which binds directly to the TATAA sequence. RNA polymerase II genes contain additional binding sites in their promoters for transcriptional regulators and can even be affected by elements, called enhancers, that are located at large distances from the gene.
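The position rule for the TATA-box can be illustrated with a short Python sketch that looks for the sequence TATAA beginning twenty-five to thirty-five nucleotides upstream of a known start site. The function name find_tata_box, the 0-based start_site index, and the example sequence are all assumptions made for the illustration.

```python
def find_tata_box(dna: str, start_site: int, motif: str = "TATAA"):
    """Return offsets (negative means upstream) at which the motif begins
    25 to 35 nucleotides upstream of start_site (a 0-based index)."""
    hits = []
    for offset in range(-35, -24):  # offsets -35 through -25
        pos = start_site + offset
        if pos >= 0 and dna[pos:pos + len(motif)] == motif:
            hits.append(offset)
    return hits


# Hypothetical promoter: TATAA begins 30 nucleotides upstream of the start site.
sequence = "G" * 20 + "TATAA" + "C" * 25 + "ATGGCC"
start = 50  # index of the first transcribed nucleotide in this example
print(find_tata_box(sequence, start))  # [-30]
```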
In contrast to prokaryotes, transcription termination by RNA polymerase II does not occur simply by release of the RNA molecule. Rather, transcription continues well beyond the termination point, and the transcript is later cleaved to the appropriate length. Following cleavage an enzyme called poly-A polymerase adds approximately 250 adenine residues to the tail end of the transcript.
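The two steps described here, cleavage followed by addition of a tail of adenines, can be sketched as simple string operations in Python. The function name, the cleavage_index parameter, and the example transcript are hypothetical. In the cell the cleavage point is chosen by the processing machinery, and the tail length of 250 is only the approximate figure given in the text.

```python
def cleave_and_polyadenylate(primary_transcript: str, cleavage_index: int,
                             tail_length: int = 250) -> str:
    """Cut the transcript at cleavage_index, then append a poly-A tail."""
    if not 0 <= cleavage_index <= len(primary_transcript):
        raise ValueError("cleavage_index lies outside the transcript")
    return primary_transcript[:cleavage_index] + "A" * tail_length


# Hypothetical transcript that runs 8 nucleotides past the cleavage point.
primary = "AUGGCCUAA" + "GUGUGUGU"
mature = cleave_and_polyadenylate(primary, cleavage_index=9)
print(len(mature))  # 259
```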
Each polymerase has its own set of basic transcription factors, designated by a number and a letter. For example, TFIIA is transcription factor A, which functions with RNA polymerase II.
RNA Polymerase I and III Transcription. RNA polymerase I is exclusively devoted to transcribing the ribosomal RNA genes, which are present in many copies as tandem arrays (multiple copies, existing side by side). RNA polymerase I synthesizes one long RNA molecule containing the 28S, 18S, and 5.8S rRNAs, which is subsequently cleaved into separate parts. In contrast to the promoters for RNA polymerases I and II, the promoters of RNA polymerase III genes typically lie downstream of the transcription start site. Interestingly, although most of the basic transcription factors are not shared between the three polymerases, TBP, which was first discovered as a protein involved in RNA polymerase II transcription, has now been found to be required for transcription by all three polymerases. Thus, despite the differences between the polymerases, they have all incorporated TBP into their mechanism of transcription initiation.
Transcription is the process in which genetic information stored in a strand of DNA is copied into a strand of RNA. The sequence of the four bases in DNA, which are adenine (A), cytosine (C), guanine (G), and thymine (T), is preserved in the sequence of the four bases in RNA, which are A, C, G, and uracil (U).
Functions of RNA Transcripts
RNA molecules have various functions in the cell. Many of the functions are associated with translation, in which the genetic code of messenger RNA molecules is used to help the ribosomes synthesize a specific protein. In addition, ribosomal RNA is the main component of the ribosome, and transfer RNA does the actual translating from nucleotide sequence into amino acid sequence.
RNA molecules may also function as enzymes. They do so either alone or in association with proteins. RNA molecules associate with proteins, for example, when they serve as components of machinery that helps make other, newly formed RNA molecules functional.
RNA is chemically better suited to carry out certain tasks than is DNA. There are also other reasons RNA, not DNA, is used for these tasks. First, it is desirable to keep DNA available for replication and not tied up with other functions. Second, the small number of DNA molecules in the cell is often insufficient. Creating many identical RNA molecules that are copies of a single segment of DNA provides the necessary numbers. Third, RNA can be differentially degraded when it is no longer needed, providing an important regulatory mechanism that would be unavailable if there were only one type of nucleic acid.
Transcription is initiated at regions of DNA called promoters, which are typically 20 to 150 base pairs long, depending on the organism. The sequence of bases at a promoter is recognized by RNA polymerase, the enzyme that synthesizes RNA.
The RNA polymerases in bacteria, as well as in viruses in bacteria, are able to recognize particular promoter sequences without the help of any other cellular proteins. However, in eukaryotes and Archaea, other proteins, called initiation factors, recognize the promoter sequence, "recruit" RNA polymerase and other proteins, help the RNA polymerase bind to the DNA, and regulate the enzyme's activity.
RNA polymerase is assembled on promoters in a particular orientation (Figure 1A). This allows RNA synthesis to start at a precise location and proceed in only one direction, "downstream" toward the gene (Figure 1B).
RNA, like DNA, is a polymer of nucleotides. Each nucleotide consists of a sugar that is attached to a phosphate group and any one of four bases. The RNA polymerase, as it builds the chain of nucleotides, processes only one of the two complementary strands of DNA. This DNA strand is referred to as the template strand. The least confusing name for the other DNA strand is "the nontemplate strand."
The bases in the newly synthesized RNA are complementary to the bases in the template DNA strand and, therefore, identical in sequence to the bases in the nontemplate strand, except that the RNA contains U where the nontemplate strand of DNA contains T.
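The relationship between the two DNA strands and the RNA can be checked with a small Python sketch. It assumes both DNA strands are written in the conventional 5′ to 3′ direction, so the template strand is the reverse complement of the nontemplate strand. The function names and the example sequence are invented for the illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def rna_from_nontemplate(nontemplate: str) -> str:
    """The RNA matches the nontemplate strand, with U in place of T."""
    return nontemplate.replace("T", "U")


def rna_from_template(template: str) -> str:
    """The RNA is complementary to the template strand (both written 5' to 3')."""
    return reverse_complement(template).replace("T", "U")


nontemplate = "ATGGCATTC"
template = reverse_complement(nontemplate)
print(template)                           # GAATGCCAT
print(rna_from_template(template))        # AUGGCAUUC
print(rna_from_nontemplate(nontemplate))  # AUGGCAUUC
```

Both routes give the same RNA sequence, which is the point made in the paragraph above.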
Before the nucleotides are linked together, they exist separately as ribonucleoside triphosphates (NTPs). Each NTP contains one of the four common RNA bases, A, C, G, or U, linked to a five-carbon ribose sugar, which is linked, in turn, to a chain of three phosphate groups. During RNA synthesis, a covalent, "phosphodiester" bond is formed between one of the three phosphate groups on one NTP and a hydroxyl group on another. The two other phosphate groups that were part of the original NTP are released.
RNA synthesis is said to proceed in the 5′ to 3′ direction, reflecting the fact that the attachment of new nucleotides always occurs at the 3′ hydroxyl group of the growing RNA chain. RNA synthesis goes through phases that are typical of polymerization processes: initiation, elongation, and termination, yielding an RNA product of defined size and sequence.
The first phase of RNA synthesis is initiation (Figure 1B). Initiation starts when the first phosphodiester bond is formed. At precise locations, determined by the promoter DNA sequence, the first and second RNA bases bind to the complex, and RNA polymerase catalyzes the formation of a covalent bond between them.
When the growing RNA chain reaches a length of about ten nucleotides, the complex loses contact with the promoter and starts moving along the DNA. This is referred to as promoter "clearance" or "escape."
Only a fraction of initiation events lead to promoter clearance. In many instances, an "abortive" RNA molecule, shorter than ten nucleotides, is released from the RNA polymerase, and RNA synthesis begins all over again. Such an abortive molecule is shown in the figure as a thick line.
Once the growing RNA chain has reached the critical length of about ten nucleotides, the initiation stage is considered to have ended, and elongation begins. In eukaryotes, the transition from initiation to elongation can be triggered by enzymes called kinases, which attach phosphate groups to RNA polymerase, facilitating promoter clearance.
Genes range in length from about 80 base pairs of DNA, as is the case for those transcribed into transfer RNA, to more than 1 million base pairs, as is the case for those encoding very long proteins. An RNA polymerase molecule that has disengaged from DNA during elongation would be unable to finish synthesizing the RNA molecule. Thus the enzyme has to traverse even the longest genes (Figure 1C), without falling off.
Along the way, there are DNA sequences that the RNA polymerase traverses considerably more slowly than at its usual rate of about 50 nucleotides per second. At regions called pause sites, it may take longer than 1 second for a single nucleotide to be added to the growing polymer.
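A short calculation in Python shows what these figures imply for the shortest and longest genes mentioned above. The function name and the idea of adding a fixed delay per pause site are simplifying assumptions; the rate of 50 nucleotides per second and the gene lengths come from the text.

```python
ELONGATION_RATE_NT_PER_S = 50  # usual rate quoted in the text


def transcription_time_seconds(gene_length_nt: int, pauses: int = 0,
                               pause_seconds: float = 1.0) -> float:
    """Estimate elongation time, adding a fixed delay for each pause site."""
    return gene_length_nt / ELONGATION_RATE_NT_PER_S + pauses * pause_seconds


short_gene = transcription_time_seconds(80)        # a transfer RNA gene
long_gene = transcription_time_seconds(1_000_000)  # a very long gene
print(short_gene)        # 1.6 seconds
print(long_gene / 3600)  # roughly 5.6 hours
```

On this estimate the polymerase must stay attached to the DNA for several hours to finish the longest genes, even before any pauses are counted.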
In eukaryotes, many genes contain blocks of DNA called introns, which disrupt the coding information of the gene. Introns are removed from the newly made RNA by a process called splicing. It is thought that the proteins which carry out the splicing are carried by the RNA polymerase as it is transcribing the gene, allowing the processing of the RNA to occur at the same time as the RNA molecule is synthesized.
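The effect of splicing on the RNA sequence can be sketched in Python as the removal of marked intervals from a string. The function name splice, the representation of introns as (start, end) index pairs, and the example transcript are assumptions for the illustration. The sketch says nothing about how the cell locates intron boundaries.

```python
def splice(primary_transcript: str, introns) -> str:
    """Remove introns, given as (start, end) index pairs with end exclusive,
    and join the remaining pieces in their original order."""
    pieces = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end < start or end > len(primary_transcript):
            raise ValueError("introns must be in range and must not overlap")
        pieces.append(primary_transcript[position:start])
        position = end
    pieces.append(primary_transcript[position:])
    return "".join(pieces)


# Hypothetical transcript: two coding blocks interrupted by one 12-base intron.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUCUAA"
print(splice(pre_mrna, [(6, 18)]))  # AUGGCCUUCUAA
```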
When the RNA polymerase reaches a specific DNA sequence known as a terminator, it slows down and the transcription complex dissociates from the DNA, as shown in Figure 1D. The released RNA polymerase is then free to participate in a new initiation event.
At some terminators, primarily in bacteria, the RNA polymerase is able to respond to the release signal without being helped by any other proteins. Such sites are called intrinsic terminators. At other sites, termination is accomplished only with the aid of additional proteins. These proteins, called termination factors, are also instrumental in causing RNA to be released from the transcribing complex.
"Factor-dependent" terminators have been found in organisms from each of the three domains of life: the eukaryotes, bacteria, and Archaea. In eukaryotes, but usually not in bacteria, transcription of most genes proceeds past the end of the gene, as shown in Figure 1D.
The initial RNA molecules are often referred to as "primary" transcripts. In many instances, the primary transcripts must be processed to yield functional, or "mature," RNA. The processing can involve shortening them by removing their terminal or internal regions, or modifying specific nucleotides in other ways.
Regulation of Transcription
Only a few of an organism's genes are active or "expressed" at any particular time. Which genes are expressed in a particular cell depends on such factors as the nutrients available, the cell's state of differentiation, and the cell's age. There are intricate mechanisms that let the cell regulate the expression of many of its genes. Transcription, the first step in the expression of the genetic information, is an important point at which gene expression can be regulated.
There are two types of regulation: positive control, in which transcription is enhanced in response to a certain set of conditions; and negative control, in which transcription is repressed. Usually, positive control is used at promoters that are otherwise engaged in the initiation of few RNA molecules. Negative control is used at promoters where many molecules of RNA are initiated.
Activator proteins enable positive control by binding to the promoter to recruit RNA polymerase or other required initiation proteins. Such activator proteins usually bind upstream of the promoter (Figure 1). Increased recruitment then leads to an increased rate of synthesis of RNA for a particular gene. The more regulatory sites that are bound, the greater the increase in the rate of RNA synthesis. Repressor proteins can inhibit initiation of transcription by binding to the promoter and preventing RNA polymerase or a required initiation protein from binding.
In eukaryotes, DNA is "packaged" into nucleosomes by being wrapped around histone proteins. This can dramatically reduce the ability of genes to be transcribed, because the packaging may hide promoter sequences that are recognized by initiation factors.
Two mechanisms are used to alter the DNA packaging in order to regulate transcription. First, enzymes called chromatin remodeling factors can move histone proteins around on the DNA, so that promoter sequences are more accessible or less accessible to the transcription initiation machinery. Second, enzymes can attach small chemical groups, including acetyl, phosphate, methyl or other groups, to the histone proteins. This modification of histone proteins may alter the interaction between the DNA and the histones, or between histones and other proteins, either facilitating or blocking the ability of initiation factors to bind DNA.
Transcription also is regulated by proteins that influence how quickly RNA polymerase moves along the DNA. These proteins, called regulatory elongation factors, may help the polymerase traverse pause sites, and they may facilitate elongation through packaged DNA. On the other hand, they may also facilitate the termination of transcription at specific sites.
see also Archaea; Gene Expression: Overview of Control; Nucleotide; Operon; RNA Polymerases; RNA Processing; Transcription Factors; Translation.
David T. Auble and Pieter L. de Haseth
Transcription is defined as the transfer of genetic information from deoxyribonucleic acid (DNA) to ribonucleic acid (RNA). The process of transcription in prokaryotic cells (e.g., bacteria) differs from the process in eukaryotic cells (cells with a true nucleus), but the underlying result of both transcription processes is the same, which is to provide a template for the formation of proteins.
The use of DNA as a blueprint to manufacture RNA begins with an enzyme called RNA polymerase. The enzyme is guided to a certain region on the DNA, called the promoter, by association with molecules known as sigma factors. There are many promoters on DNA, located just before a region of DNA that codes for a protein. The promoter serves to position the RNA polymerase so that transcription of the full coding region is accomplished.
Once the polymerase has bound to a promoter, the sigma factors detach and can serve another polymerase. The attached polymerase then begins to move along the DNA, unwinding the two strands of DNA that are linked together and using the sequence on one of the strands as the blueprint for RNA manufacture. The strand from which RNA is made is known as the template or the antisense strand, while the other strand, to which it is complementary, is called the sense or the coding strand.
As the polymerase moves along the DNA, the strands link back together behind the polymerase. The effect is somewhat similar to a zipper with a bulge, where the two links of the zipper have come apart. The bulging region can move along the zipper, with separation and reannealing of the strands occurring continuously with time. The promoter can accommodate the binding of another polymerase as soon as the region is free. Thus, the same stretch of DNA can be undergoing several rounds of transcription at the same time, with polymerase molecules positioned all along the DNA.
The RNA that is produced is known as messenger RNA (or mRNA). The species derives its name from its function. It is the tangible form of the message that is encoded in the DNA. The mRNA in turn functions as a template for the next step in the genetic process, that of translation. In translation the mRNA information is used to manufacture protein.
Termination of transcription occurs when the RNA polymerase reaches a sequence on the DNA template strand that signals the polymerase to stop and to end its association with the DNA.
Some microorganisms have variations on the basic transcription mechanism. For example, in yeast cells the mRNA is modified at both ends after it is made: a "cap," consisting of a modified guanine nucleotide, is added to the beginning (the 5′ end) of the transcribed molecule, and a tail of adenine nucleotides is added to the other (3′) end. These modifications extend the life of the mRNA and help mark it for translation.
The intricate and coordinated transcription process in bacteria is also a rapid process. For example, measurements in Escherichia coli have established that the RNA polymerase moves along the DNA at a speed of 50 nucleotides per second.
See also Bacterial artificial chromosome; Genetic regulation of prokaryotic cells