Transcription factors are protein complexes that help RNA polymerase bind to DNA. RNA polymerase is the enzyme that transcribes genes to make messenger RNA, which is then used to make protein. By controlling RNA polymerase's access to the gene, transcription factors control the rate at which a gene is transcribed. Without transcription factors, cells would not be able to effectively regulate the rate at which genes are expressed.
Basal Transcription Factors Bind to the Gene Promoter Region
Every gene has a region known as the promoter. This is a DNA sequence "upstream" from the coding region, to which RNA polymerase must bind before it begins transcribing the coding region of the gene. In eukaryotes, the promoters of many (but not all) genes contain the sequence TATAA twenty-five to thirty nucleotides upstream from the transcription start site (T is the nucleotide thymine; A is adenine). Called the "TATA box," this sequence binds the TATA-binding protein (TBP), one of the most ancient and most important transcription factors.
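The positional rule for the TATA box can be illustrated with a short sketch. It assumes the promoter is available as a plain string of nucleotides and that the index of the transcription start site is already known. The function name, its parameters, and the choice to measure distance to the first base of the motif are illustrative assumptions, not an established tool.

```python
def find_tata_boxes(sequence, tss_index, motif="TATAA", window=(25, 30)):
    """Return the upstream distances at which `motif` begins.

    Hypothetical helper: `tss_index` is the 0-based position of the
    transcription start site, and distance is counted from that site
    back to the first base of the motif (an assumption of this sketch).
    """
    sequence = sequence.upper()
    nearest, farthest = window
    hits = []
    for distance in range(nearest, farthest + 1):
        start = tss_index - distance
        if start < 0:
            break  # ran off the upstream end of the sequence
        if sequence[start:start + len(motif)] == motif:
            hits.append(distance)
    return hits


# Toy promoter: the motif starts 30 nucleotides upstream of the start site.
toy_promoter = "G" * 20 + "TATAA" + "C" * 25 + "ATGCCC"
toy_tss = 50
print(find_tata_boxes(toy_promoter, toy_tss))
```

Because many promoters lack a TATA box, an empty result from a scan like this says nothing about whether the gene is transcribed.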
The DNA-binding region of TBP has changed very little in millions of years of evolution, indicating how central this portion of the protein is in gene transcription. Transcription factors in archaeans are closely akin to those in eukaryotes, though simpler, and they reveal a deep evolutionary relation between the two groups. (Transcription factors are also used by eubacteria, but the details differ significantly and will not be discussed here.)
When TBP contacts DNA, the DNA bends. This distortion in shape allows the two sides of the double helix to come apart more easily. TATA-box DNA is especially easy to separate, because successive adenine-thymine pairs are somewhat less stable than series of other nucleotide pairs. The separation of the two strands makes the coding region of the gene more accessible to the RNA polymerase. (There are in fact three eukaryotic RNA polymerases, known as pol I, pol II, and pol III. Each uses a different set of transcription factors; we will discuss those for pol II.)
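One rough way to see why a run of adenine-thymine pairs separates more easily is to count hydrogen bonds: an A-T pair forms two, while a G-C pair forms three. The sketch below uses that count as a crude proxy only. Real duplex stability also depends on base stacking and sequence context, and the function names are illustrative.

```python
# Hydrogen bonds formed by each base with its Watson-Crick partner:
# A-T pairs form two, G-C pairs form three.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bond_count(strand):
    """Total hydrogen bonds holding `strand` to its complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


def at_fraction(strand):
    """Fraction of positions in `strand` that are A or T."""
    strand = strand.upper()
    return sum(base in "AT" for base in strand) / len(strand)


# Compare a TATA-box sequence with a G-C-rich stretch of equal length.
for name, strand in [("TATA box", "TATAA"), ("GC-rich", "GCGCC")]:
    print(name, hydrogen_bond_count(strand), at_fraction(strand))
```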
TBP plays a central role in initiating transcription, but it does not act alone. In archaeans, it works with another protein, transcription factor B. In eukaryotes, TBP is part of a larger complex, TFII-D (this rather colorless name is derived from "transcription factor D for RNA polymerase II"). TFII-D includes several other proteins besides TBP that interact with other factors and help stabilize the assembly on the DNA.
By itself, TFII-D cannot efficiently promote DNA-binding and transcription. Four other factors, TFII-B, -F, -E, and -H (binding in the order listed), allow pol II to bind to the promoter. Together, these are known as the basal transcription factors. Because each of these is composed of numerous individual polypeptides, the entire complex is thought to comprise at least twenty-five interacting polypeptides whose multiple interactions are critical for successful transcription.
The basal transcription factors assembled at the promoter are effective because they bind pol II. To finish their job, though, they must release it to begin transcription. This occurs when TFII-E and -H cooperate to phosphorylate (add a phosphate group to) RNA polymerase. This changes the polymerase's shape sufficiently to allow it to escape the complex of transcription factors and begin transcription.
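The ordered assembly and release described above can be summarized as a toy state machine. This is a deliberately simplified sketch: it takes the binding order (TFII-D, then -B, -F, -E, and -H) at face value, treats phosphorylation as a single switch, and uses class and method names invented for illustration. It is not a model of the real kinetics.

```python
ASSEMBLY_ORDER = ["TFII-D", "TFII-B", "TFII-F", "TFII-E", "TFII-H"]


class BasalComplex:
    """Toy model of basal factor assembly at a pol II promoter."""

    def __init__(self):
        self.bound = []
        self.polymerase_phosphorylated = False

    def bind(self, factor):
        """Bind the next factor; reject anything out of order."""
        if len(self.bound) == len(ASSEMBLY_ORDER):
            raise ValueError("complex is already fully assembled")
        expected = ASSEMBLY_ORDER[len(self.bound)]
        if factor != expected:
            raise ValueError(f"expected {expected}, got {factor}")
        self.bound.append(factor)

    def is_complete(self):
        return self.bound == ASSEMBLY_ORDER

    def phosphorylate_polymerase(self):
        """TFII-E and -H must both be present before pol II is released."""
        if not self.is_complete():
            raise ValueError("TFII-E and TFII-H are not yet bound")
        self.polymerase_phosphorylated = True

    def can_begin_transcription(self):
        return self.is_complete() and self.polymerase_phosphorylated


complex_ = BasalComplex()
for factor in ASSEMBLY_ORDER:
    complex_.bind(factor)
complex_.phosphorylate_polymerase()
print(complex_.can_begin_transcription())
```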
Gene-Specific Factors Differentially Enhance Transcription Rates
The basal transcription factors increase the rate of transcription for all genes; indeed, RNA polymerase cannot bind to the promoter without them. However, not all genes should be transcribed at an equal rate all the time. Red blood cells should make lots of hemoglobin but not the digestive enzyme pepsin, while stomach lining cells should do the opposite. Differential control of gene transcription is facilitated by gene-specific transcription factors.
Hormones are an important class of molecules that regulate gene expression. A hormone is not a transcription factor itself but binds to a receptor to form a gene-specific factor. Once bound together, the hormone-receptor complex binds to DNA. Growth factors and homeotic proteins also act as gene-specific factors or form complexes that do.
The number of known gene-specific factors is currently in the low thousands and inevitably will grow as the genome becomes better known. An average gene may have several dozen specific factors involved in its regulation, giving the potential for very precise control of its expression.
Gene-specific factors are known as activators or repressors, depending on whether they increase or decrease the rate of transcription. The DNA sequences that activators bind to are called enhancer sites; repressors bind to silencer sites. Since enhancer and silencer sites are on the same DNA molecule as the gene they control, they are called "cis" regulatory elements (from the Latin for "on this side"). The factors that bind to them are encoded elsewhere in the genome and are called "trans" acting factors (from the Latin for "across").
Many gene-specific factors bind to the promoter outside of the TATA box, especially near the transcription initiation site, the beginning of the DNA sequence that is actually read by RNA polymerase. Others bind to sequences within the coding region of the gene, or downstream from it at the termination region. Some bind to DNA sequences hundreds or thousands of nucleotides away from the promoter. Because of the looped structure of DNA, these sequences are physically close to the promoter, despite being far away along the double helix. The binding sites of transcription factors can be determined by "DNA footprinting."
Gene-specific factors work in a variety of ways. Some interact with the basal factors, altering the rate at which they bind to the promoter. Some influence RNA polymerase's rate of escape from the promoter, or its return to it for another round of transcription.
Some factors physically alter the local structure of the DNA, making it more or less accessible. In eukaryotic organisms, DNA is wound around protein complexes called histones and is further looped, coiled, and condensed to allow efficient packing in the cell nucleus. This arrangement keeps the DNA well ordered but also decreases its accessibility for transcription. By interacting directly with DNA, transcription factors can open up otherwise inaccessible regions.
Gene-Specific Factors Share Several Common DNA-Binding Motifs
Gene-specific factors must position themselves on specific DNA sequences to exert their effects, and the relatively simple structure of DNA offers only a few ways for proteins to grab on. Therefore, despite the wealth of different individual transcription factors, each employs one of only a handful of structural "motifs" to bind to the DNA double helix. Each motif is a small portion of a much larger protein, whose other portions confer DNA-sequence specificity and control its interaction with the basal factors or other proteins.
The helix-turn-helix motif is composed of a short section of alpha-helix, linked to a loop of amino acids that changes the direction of the chain, followed by another alpha-helix. The first helix fits into the so-called major groove of the DNA double helix. The side chains of the protein's amino acids make contact with the exposed portions of the nucleotides. The shape and charges of the one complement those of the other, allowing them to bind; this provides the sequence-specificity needed for effective gene regulation. The homeotic proteins are a special class of proteins employing a modified helix-turn-helix motif. These proteins play critical roles in regulating development in organisms as diverse as fruit flies and humans.
The zinc-finger motif is constructed around an atom of zinc, which binds four amino acids to hold the amino acid chain in proper orientation. While many of the other amino acids vary among different types of zinc-finger proteins, the four key amino acids (either four cysteines or two cysteines and two histidines) are invariant in this class of transcription factors. This group of factors includes the steroid receptors. Steroids are a class of hormones, including testosterone and the estrogens, that exert profound effects on development. Steroids must bind to a receptor to form the transcription factor complex. Mutations in steroid receptors are responsible for a large variety of inherited disorders, including androgen insensitivity syndrome, thyroid hormone resistance syndrome, and some forms of prostate cancer, breast cancer, and osteoporosis.
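Because the four zinc-binding residues are invariant, candidate zinc fingers can be flagged in a protein sequence by a simple pattern search. The sketch below covers only the two-cysteine, two-histidine case and assumes a commonly cited spacing (two to four residues between the cysteines, twelve between the second cysteine and the first histidine, three to five between the histidines). That spacing is an assumption of this example rather than something stated in the text, and a match is only a hint, not evidence of a folded zinc finger.

```python
import re

# Simplified two-cysteine, two-histidine pattern in one-letter amino acid
# codes. The spacings are an assumed consensus, not a definitive rule.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_fingers(protein):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein.upper())]


# Synthetic example: alanines fill the variable positions.
synthetic_finger = "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H"
print(find_c2h2_fingers("MK" + synthetic_finger + "GG"))
```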
Regulation of Transcription Factors
All cells need to be responsive to their environments, whether that environment is the pond-water habitat of a Paramecium or the thousands of other cells that a single neuron communicates with every second. Transcription factors are a central feature of this responsiveness. The hormone-receptor complex mentioned above provides a model for understanding how a cell can coordinate its gene expression with external events. The hormone acts as a signal that a change has occurred in the outside world that requires action by the cell, whether it be to grow or divide, or to release its own hormone.
A key feature governing a cell's repertoire of responses is the set of receptors it makes. Cells that should not respond to testosterone need only ensure that they do not make the testosterone receptor—a decision itself governed by the presence or absence of other transcription factors. Hormones are not the only type of signal possible. Cells have complex networks of signaling pathways that help to regulate their actions.
The exquisite coordination of cellular processes needed to maintain life might be likened to a symphony, in which the many different instruments must play their parts in time with all the others. The timing of gene expression is one of the great puzzles of understanding life: How does each gene get turned on and off at the right time? Although transcription factors are clearly an important part of the answer, they themselves are proteins—the products of genes that must be regulated by yet other transcription factors.
The way out of this paradox is to remember that each organism does not arise from nothing, nor does it spring forth fully formed, with all of its parts fully functioning. Rather, it develops from a preexisting cell, with a specific set of transcription factors in place to turn on a specific set of developmental genes, many of which are themselves transcription factors that turn on other genes. Although this is only the barest outline of an explanation that is still being worked out, it is clear that the pulsing interplay of transcription factors is a central feature of life's coordinated complexity.
see also Development, Genetic Control of; DNA Footprinting; Gene Expression: Overview of Control; Hormonal Regulation; Proteins; RNA Polymerases; Signal Transduction; Transcription.
Alberts, Bruce, et al. Molecular Biology of the Cell, 4th ed. New York: Garland Science, 2002.
Semenza, Gregg L. Transcription Factors and Human Disease. Oxford, U.K.: Oxford University Press, 1998.
Weinzierl, Robert O. J. Mechanisms of Gene Expression: Structure, Function and Evolution of the Basal Transcriptional Machinery. London: World Scientific, 1999.
Dutnall, Robert N., David N. Neuhaus, and Daniela Rhodes. "The Solution Structure of the First Zinc Finger Domain of SWI5: A Novel Extension to a Common Fold." Structure 4 (1996): 599-611. <http://baldrick.ucsd.edu/~dutnall/StructureDisplay.html>.
TRANSFAC: The Transcription Factor Database. GBF-Braunschweig. <http://transfac.gbf.de/TRANSFAC/>.