Gene Expression: Overview of Control
The chromosomes of an organism contain genes that encode all of the RNA and protein molecules required to construct that organism. Gene expression is the process through which information in a gene is used to produce the final gene product: an RNA molecule or a protein.
Each cell in a multicellular organism such as a human contains the same genes as every other cell. Nonetheless, there are hundreds of distinct types of cells in the human body, each expressing a unique set of genes. Indeed, it is this unique constellation of expressed genes that makes each cell type distinct.
Cells may also change the genes they express over time, and they are constantly adjusting the amount of protein made in response to changing conditions. How does a cell express some, but not all, of the genes in its genome? How does it react to environmental changes to adjust the level of gene expression? These are the problems of control of gene expression. While the genes whose final products are RNA molecules are also regulated, this entry will focus on genes that encode proteins.
The Flow of Genetic Information from Genes to Proteins
Cells can regulate gene expression at every step along the way, from DNA to the final protein, as shown in Figure 1. Genetic information in DNA is first copied to form an RNA molecule, in a process known as transcription. The RNA used to make proteins is called messenger RNA (mRNA) because it carries information from the DNA to the ribosome, where protein synthesis occurs.
The mRNA serves as a template to guide protein synthesis. Scientists refer to protein synthesis as "translation" because ribosomes translate an mRNA sequence into a protein sequence. Prokaryotic cells use the mRNA directly as a template for protein synthesis.
Eukaryotic cells, however, must modify the precursor mRNA in several ways before it can be used to guide protein synthesis. The two ends are chemically altered, and sections of the RNA that do not encode protein sequences, called introns, are spliced out. Together, these modifications are called "mRNA processing." After processing, the mature mRNA moves from the nucleus into the cytoplasm, where it binds to the ribosome and serves as a template for synthesis of a protein.
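The flow from DNA to protein described above can be pictured with a small Python sketch. This is only a toy illustration: the DNA sequence, the intron coordinates, and the function names are invented for the example, the codon table lists only the four codons the example uses, and the chemical modification of the two mRNA ends is not modeled.

```python
# Toy model of the eukaryotic flow: DNA -> precursor mRNA -> mature mRNA -> protein.
# The sequence and intron position below are made up for illustration.

# Partial codon table (standard genetic code); only codons used in this example.
CODON_TABLE = {"AUG": "Met", "GCU": "Ala", "AAA": "Lys", "UAA": "Stop"}


def transcribe(coding_strand: str) -> str:
    """Copy the DNA coding strand into RNA by replacing T with U."""
    return coding_strand.replace("T", "U")


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns, each given as a (start, end) pair of half-open indices."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


# Exon 1, a hypothetical 12-base intron, then exon 2.
dna = "ATGGCT" + "GTAAGTTTTCAG" + "AAATAA"

pre_mrna = transcribe(dna)
mature_mrna = splice(pre_mrna, introns=[(6, 18)])
protein = translate(mature_mrna)

print(pre_mrna)     # AUGGCUGUAAGUUUUCAGAAAUAA
print(mature_mrna)  # AUGGCUAAAUAA
print(protein)      # ['Met', 'Ala', 'Lys']
```

In a prokaryotic cell the splicing step would simply be skipped, since the mRNA is used directly as the template.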
The most important stage for the regulation of most genes is when transcription begins. This is because it costs the cell less energy to regulate transcription than to regulate the steps after transcription. The second point where regulation occurs is during RNA processing. Cells can regulate the rate of processing. In addition, the final mRNA product can be altered through alternative splicing, as shown in Figure 2. Alternative splicing can regulate the types of proteins produced from a single gene.
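As a rough illustration of alternative splicing, the Python sketch below joins the exons of one hypothetical gene in two different ways. The exon names and sequences are invented, and real splicing decisions depend on regulatory factors that are not modeled here.

```python
# Hypothetical exons of a single gene, written as RNA sequences.
exons = {"E1": "AUGGCU", "E2": "GAAGAA", "E3": "AAAUAA"}


def splice_isoform(exon_order):
    """Join the chosen exons, in order, into one mature mRNA."""
    return "".join(exons[name] for name in exon_order)


# Two mature mRNAs produced from the same gene.
full_length = splice_isoform(["E1", "E2", "E3"])
exon_skipped = splice_isoform(["E1", "E3"])  # exon E2 is left out

print(full_length)   # AUGGCUGAAGAAAAAUAA
print(exon_skipped)  # AUGGCUAAAUAA
```

Because the two mRNAs differ in sequence, translating them would give two different proteins from the same gene.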
Cells can also regulate mRNA transport out of the nucleus. Once the mRNA has moved into the cytoplasm, the abundance of mRNA can be regulated by RNA degradation. Cells can regulate translation, controlling the number of proteins each mRNA produces. Finally, even after a cell has generated a protein, it can regulate the abundance and activity of that protein. For instance, cells regulate the activity of many proteins by post-translational modifications such as phosphorylation. Cells can also regulate the abundance of most proteins by degrading them.
Gene Control Occurs at Several Levels
For a gene to be transcribed, RNA polymerase must first find the gene. This is made more difficult by the tight packing required to fit the entire genome within the nucleus. The cell uses this packing to its advantage, though, to prevent access to and expression of genes in some chromosomal regions.
Regions that are tightly condensed are called heterochromatin and can be distinguished from more open regions (euchromatin) by their dense staining, when viewed under a microscope. In females, an entire X chromosome in each cell is kept condensed throughout life, to avoid a "double dose" of these genes (recall that females have two X chromosomes, while males have only one). This random X inactivation leads to mosaicism, in which some female cells express genes from one chromosome, while others express genes from the other.
Even within an active chromosome, some regions may be temporarily inactivated. Inactive heterochromatin can be converted to active euchromatin, and vice versa, by chemical modification of the histone proteins to which the DNA is attached in the chromosome. Negatively charged DNA is chemically attracted to the positively charged histones. By adding chemical groups to the histones or removing them, the cell can modulate this attraction. A weaker attraction, as would occur by adding negatively charged groups to the histones, tends to open up the chromatin, favoring gene expression. A stronger attraction keeps it more condensed.
How Do Cells Regulate Transcription?
To understand transcriptional regulation, consider the structure of a typical eukaryotic gene, shown in Figure 3. The promoter of a gene is the binding site for a group of general transcription factors and for RNA polymerase. Transcription begins when a complex of proteins called TFIID binds to a promoter. The sequential binding of other general transcription factors and RNA polymerase follows. A protein tail tethers the RNA polymerase to the general transcription complex. When the general transcription factor TFIIH phosphorylates this tail, the RNA polymerase is released and moves along the DNA to begin transcription.
These steps are identical for essentially all genes. However, the rate at which each of the steps occurs can be influenced by the presence or absence of other, gene-specific transcription factors. It is these other factors, often called gene regulatory proteins, that give the cell the ability to turn some genes on and others off. In contrast to the few general transcription factors, which assemble on the promoters of all genes, there are thousands of gene regulatory proteins. The regulatory proteins bound to a gene vary from gene to gene, and each is usually present at low levels in the cell.
Binding sites for these additional regulatory proteins are usually located upstream from the promoter. Surprisingly, these binding sites can be located some distance from the promoter and still regulate transcription. It is thought that this action at a distance can occur because the DNA between the regulatory sequence and the promoter can loop out, to allow the regulatory protein to contact the promoter, as shown in Figure 4.
Regulatory proteins are classified as either activators or repressors. Activators increase the rate of transcription, whereas repressors decrease it. The DNA sequences that bind activator proteins are called "enhancer elements," and those that bind repressor proteins are "repressor elements" or "silencer elements."
Many regulatory proteins have at least two distinct regions or domains, as shown in Figure 5. One domain binds to a specific DNA sequence. The other domain typically contacts the general transcription machinery assembled at the promoter. In one class of activating proteins, the activation domain contains a cluster of negatively charged (acidic) amino acids. Scientists believe acidic transcriptional activators accelerate the assembly of general transcription factors on the promoter. This is just one way a transcriptional activator can work. For instance, other regulatory proteins affect how tightly the gene is packaged within chromatin. Opening up the chromatin allows the transcription machinery to more quickly gain access to the promoter.
Gene regulatory proteins bind to the DNA when it is in a double-helical state. They recognize a specific DNA sequence by forming hydrogen bonds to chemical groups on the outside of the DNA. These proteins often contain common identifiable structural "motifs" that directly contact specific DNA sequences.
Combinatorial Regulation of Gene Expression
Some eukaryotic gene regulatory proteins work individually, but most act within a complex of proteins. Furthermore, a single gene regulatory protein may participate in multiple types of regulatory complexes. For example, a protein might function in a complex that activates the transcription of one gene and in a complex that represses transcription of another gene.
A gene that must be turned on at different times and in different tissues during development might have gene regulatory proteins clustered at multiple sites along its regulatory region. These complexes can then regulate the expression of the gene in a variety of developmental processes. The rate of RNA synthesis initiation will depend on the combination of regulatory proteins bound to the control regions of the gene. Thus, we can think of the regulatory region of the gene as an information processor, like a computer, that integrates input from all the regulatory proteins present and determines an appropriate level of RNA synthesis.
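The "information processor" analogy can be made concrete with a deliberately simplified Python sketch. Everything here is an assumption made for illustration: the protein names, the fold-change values, and above all the rule that the effects simply multiply. Real regulatory regions integrate their inputs in ways far more complex than this.

```python
from math import prod

# Hypothetical effect of each bound regulatory protein on the basal rate of
# transcription initiation (all values invented for illustration).
EFFECTS = {
    "activator_A": 4.0,   # activators raise the rate
    "activator_B": 2.0,
    "repressor_R": 0.25,  # repressors lower it
}


def initiation_rate(bound_proteins, basal_rate=1.0):
    """Combine the effects of every bound protein into a single output rate.

    Assumes, purely for simplicity, that the effects multiply.
    """
    return basal_rate * prod(EFFECTS[name] for name in bound_proteins)


# The same gene gives different outputs depending on which proteins are bound.
print(initiation_rate(set()))                             # 1.0
print(initiation_rate({"activator_A", "activator_B"}))    # 8.0
print(initiation_rate({"activator_A", "repressor_R"}))    # 1.0
```

The point of the sketch is only that the output depends on the combination of inputs, not on any single regulatory protein.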
Regulation of Gene Expression during Development
During development, cells become different from one another because they synthesize and accumulate different proteins. Most of these differences come from changes in gene expression. Specialized cell types result from different genes being turned on or off in a coordinated manner. When a cell becomes a specific cell type, it continues in this role through many subsequent cell generations. This implies that cells remember the changes in gene expression involved in the choice of cell type. How is this achieved? One way is through the euchromatin-heterochromatin conversion, which can be faithfully inherited through cell division. Another way is for an important regulatory protein to activate its own expression as well as the expression of other genes. This gene, once expressed, will maintain its own expression.
As described above, it is usually a combination of gene regulatory proteins, rather than a single protein, that determines when and where gene expression occurs. Certain proteins can be more important than others, though. If all the other factors required for expression of a group of genes are present, a single gene regulatory protein can switch a cell from one developmental pathway to another.
For example, forced expression of the MyoD protein in fibroblasts will cause these cells to form into muscle fibers. In an extreme example, homeotic genes specify large regions of an animal's body plan. Mutations in these genes can transform one body part into another. For instance, a mutation in the Antennapedia gene of fruit flies can convert antennae into legs. Thus, expression of a single gene can trigger the expression of a whole battery of genes.
An advantage of multiple gene regulatory proteins over single ones is that many different genes can be controlled with a handful of proteins. Consider the opposite situation, in which every gene would need a unique regulatory protein. The gene for each of those proteins would also need its own protein, and so on. Instead, a smaller set of proteins combines in different ways to make a large set of regulatory possibilities. Imagine that any regulatory element must be composed of two proteins. Four proteins (a, b, c, and d) can combine in pairs in ten different ways (aa, ab, ac, ad, bb, bc, etc.), and could thus, theoretically, control ten different genes. In reality, the situation is more complex, with literally thousands of regulatory proteins combining in ways researchers have not even begun to calculate.
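The count of ten pairs can be checked directly. The short Python sketch below lists every unordered pair of the four proteins, allowing a protein to pair with itself, and then uses the equivalent formula n(n + 1)/2 to show how quickly the number grows. The figure of 1,000 proteins is an arbitrary illustrative value, not a measurement.

```python
from itertools import combinations_with_replacement
from math import comb

proteins = ["a", "b", "c", "d"]

# Unordered pairs in which a protein may also pair with itself (aa, bb, ...).
pairs = ["".join(pair) for pair in combinations_with_replacement(proteins, 2)]

print(pairs)       # ['aa', 'ab', 'ac', 'ad', 'bb', 'bc', 'bd', 'cc', 'cd', 'dd']
print(len(pairs))  # 10


def pair_count(n: int) -> int:
    """Number of unordered pairs, with repeats, from n proteins: n(n + 1)/2."""
    return comb(n + 1, 2)


print(pair_count(4))     # 10
print(pair_count(1000))  # 500500
```

Under the same two-protein assumption, a thousand regulatory proteins would already allow about half a million distinct pairs.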
With combinatorial control, therefore, a regulatory protein does not necessarily regulate a particular battery of genes or specify a particular cell type. Instead it might serve many purposes, and those purposes might overlap with those of other regulatory proteins. A regulatory protein might be switched on in many cell types, at different locations in the animal, and several times during development. Thus, combinatorial gene control makes it possible to generate a great deal of biological complexity with relatively few gene regulatory proteins.
Hormones and Growth Factors
During development, cells can change the expression of their genes when influenced by both external and internal signals. Signals from outside the cell that influence gene expression include contact with other cells, growth factors, and hormones.
Growth factors are extracellular molecules that stimulate a cell to grow or proliferate. Examples include epidermal growth factor and fibroblast growth factor. Growth factors regulate gene expression indirectly through a network of intracellular signaling cascades.
Hormones are signaling molecules that endocrine cells secrete into the bloodstream. Some hormones, such as insulin, bind to cell surface receptors and affect gene expression through a network of intracellular signaling cascades. Other hormones, such as testosterone, pass through the cell membrane and bind to regulatory proteins in the cell that directly regulate transcription.
When the Regulation of Gene Expression Fails
When the control of gene expression fails, there can be serious consequences, such as death, birth defects, and cancer. Birth defects can result when the regulation of one or more genes important for development is lost. This often occurs because of a mutation, but it can also occur if the embryo or fetus is exposed to certain chemicals, such as alcohol. Mutations in the receptor for fibroblast growth factor, for instance, cause dwarfism. Cancer occurs when the regulation of genes that control growth and cell division, programmed cell death (apoptosis), and cell migration is lost.
See also Alternative Splicing; Birth Defects; Development, Genetic Control of; Gene; Hormonal Regulation; Post-translational Control; Proteins; RNA Processing; Signal Transduction; Transcription; Transcription Factors.
Alberts, Bruce, et al. Molecular Biology of the Cell, 4th ed. New York: Garland Science, 2002.
Lodish, Harvey, et al. Molecular Cell Biology, 4th ed. New York: W. H. Freeman, 2000.
Struhl, K. "Gene Regulation. A Paradigm for Precision." Science 293 (2001): 1054-1055.
Tjian, R. "Molecular Machines That Control Genes." Scientific American 272, no. 2 (1995): 54-61.