Do complex organisms have more genes than simpler organisms? Now that researchers can sequence whole genomes and have done so for a number of organisms, they know that many vertebrates have only about twice as many genes as invertebrates, and many of these are the result of duplication of existing genes rather than development of new ones. But if there are not that many new genes, what is responsible for the incredible diversity in plant and animal species?
The simple answer to this question is that eukaryotes have developed a more complex way of controlling expression of their existing genes than prokaryotes. This system of expression control relies on a group of proteins known as transcription factors (TFs), and it allows eukaryotes to alter their cell types and growth patterns in a variety of ways. TFs are not solely responsible for gene regulation; eukaryotes also rely on cell signaling, RNA splicing, siRNA control mechanisms, and chromatin modifications. However, TFs that bind to cis-regulatory DNA sequences are responsible for either positively or negatively influencing the transcription of specific genes, essentially determining whether a particular gene will be turned "on" or "off" in an organism.
Transcription Factors Recognize Specific DNA Sequences
TFs act by recognizing and binding to a segment of DNA in the promoter and/or enhancer region. Often, a change in the conformation, or three-dimensional structure, of a TF will accompany DNA binding. For example, the two loops in NFATC1 that interact with DNA are found in different conformations, depending on whether NFATC1 is complexed with DNA or not (Figure 1). Moreover, the structure of different TF families, described later in this article, results in specific areas in these protein complexes that interact with the DNA recognition motif. The recognition motif is usually only about 6 to 10 base pairs long.
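To get a feel for how short a 6 to 10 base-pair recognition motif is, the sketch below estimates how often an exact match would turn up by chance. It assumes a random sequence with equal base frequencies, which real genomes do not have, and the genome length is an assumed round figure chosen only for illustration. The function name `expected_chance_matches` is hypothetical.

```python
# Rough estimate of chance matches to a short recognition motif.
# Assumption: random sequence with equal A/C/G/T frequencies (a simplification).

def expected_chance_matches(motif_length: int, genome_length: int) -> float:
    """Expected number of exact matches on one strand of a random sequence."""
    positions = genome_length - motif_length + 1  # possible start positions
    if positions <= 0:
        return 0.0
    # Each position matches a fixed motif with probability (1/4) ** motif_length.
    return positions / 4 ** motif_length


if __name__ == "__main__":
    genome_length = 3_000_000_000  # assumed round figure, for illustration only
    for k in (6, 8, 10):
        estimate = expected_chance_matches(k, genome_length)
        print(f"{k}-bp motif: about {estimate:,.0f} chance matches per strand")
```

Under these assumptions even a 10 base-pair motif is expected to occur many times by chance, which suggests that sequence recognition alone is unlikely to account for target specificity.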
Experiments have shown that TFs can bind tightly to their DNA targets, both within cells and in vitro. After TFs bind to promoter or enhancer regions of the DNA, they interact with other bound TFs and recruit RNA polymerase II. Their influence, however, can be either positive or negative, depending on the presence of other functional domains on the protein and the overall impact of the entire TF complex. A typical TF has multiple functional domains, not only for recognizing and binding to the appropriate DNA strand, but also for interactions with other TFs, with proteins called coactivators, with RNA polymerase II, with chromatin remodeling complexes, and with small noncoding RNAs.
TFs control many important parts of development; therefore, organisms with a deletion of a TF gene exhibit profound irregularities in organization and development (Table 1). For example, in Drosophila, deletion of the TF antennapedia gene results in the development of the antennal imaginal disc into legs rather than antennae.
Table 1: Effects of Some Transcription Factor (TF) Gene Deletions in Drosophila
TF Gene Deleted | Gene Group | Type of TF | Phenotypic Effects Observed |
Buttonhead | Gap | Zinc finger | Lack of mandibular, intercalary, and antennal head segments |
Hairy | Pair rule | bHLH | Ectopic expression of bristles on legs and wings |
Antennapedia | Homeotic | Homeobox | Legs on the head where antennae should be |
Transcription Factors Exert Combinatorial Control
Many TFs are known to facilitate transcription at hundreds of different promoters, while some are only active at a select few. Laboratory techniques such as chromatin immunoprecipitation (ChIP) and DNA microarrays are commonly used to study the target DNA motifs recognized by individual TFs (Iyer et al., 2001). Signal molecules can influence activation by TFs by covalently binding or modifying their functional domains. It is even possible for a TF to respond to a physical signal, such as red or far-red light, but the signal must be transduced to a chemically modified activator that interacts with the TF.
The complexity and fine gradations of DNA expression in eukaryotes result from combinatorics, in that the combination of chromatin and TF signals, rather than the individual TF signal, is read out. Thus, transcriptional control is dependent on the interactions of all the TFs and whether they attract RNA polymerase or block it from initiating transcription. Multiple TFs can accumulate, creating a bulk the size of a ribosome. Once bound together, changes to the functional domains of a TF and/or covalent interactions with other factors can turn transcription on or off, depending on whether they allow or prohibit the recruitment of RNA polymerase.
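The idea that the combination of bound factors, rather than any single TF, decides the outcome can be sketched as a toy model. Everything here is an illustrative assumption: the factor names (`TF_A`, `TF_B`, `REP_X`), the +1/-1 effect values, and the threshold rule are invented for the example and do not describe a measured mechanism.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BoundFactor:
    """A TF bound at a regulatory region (hypothetical toy representation)."""
    name: str
    effect: int  # +1 for an activator, -1 for a repressor (toy values)


def polymerase_recruited(bound: Iterable[BoundFactor], threshold: int = 2) -> bool:
    """Toy rule: transcription is 'on' when the summed effect reaches the threshold."""
    return sum(factor.effect for factor in bound) >= threshold


if __name__ == "__main__":
    tf_a = BoundFactor("TF_A", +1)
    tf_b = BoundFactor("TF_B", +1)
    rep_x = BoundFactor("REP_X", -1)

    print(polymerase_recruited([tf_a]))               # one activator alone
    print(polymerase_recruited([tf_a, tf_b]))         # two activators together
    print(polymerase_recruited([tf_a, tf_b, rep_x]))  # a repressor joins the complex
```

In this sketch the same activator gives a different result depending on which other factors are bound with it, which is the point of combinatorial control.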
A typical enhancer can be up to 500 base pairs in length and contain multiple binding sites for at least two or three different TFs (Levine & Tjian, 2003). Two TFs bound at sites near one another on the DNA strand can combine to form a dimer and bend the DNA in what is believed to be part of the activation process. Chromatin structure allows activators to associate with one another, even when they are bound to DNA sequences many hundreds of base pairs apart. Some TFs are believed to act as tethering elements between distant enhancers and promoters by forming connections with other proteins.
The Evolution of Transcription Factor Families
In many animals, including humans, a prominent group of genes involved in cell development, including many that encode TFs, contain a 180 base-pair sequence called the homeobox. The homeobox encodes a 60-amino acid protein segment called the homeodomain, which recognizes and binds to promoters in the DNA of its target genes. Complete control over transcription, and sometimes binding, is dependent on interactions between TFs, so activation often depends upon the presence of another TF. A similar system of gene recognition is found in plants, where the DNA-binding domain is called the MADS box.
TFs often have certain specific DNA-binding motifs, a common one being the basic helix-loop-helix (bHLH) structure that recognizes a specific sequence of DNA and sits on the DNA like a train car on a track. One such example is the TF MyoD (myoblast determination). Expression of the MyoD gene results in production of MyoD protein, which binds to the promoters of muscle-determining genes, causing the differentiation of muscle precursor cells (myoblasts) into muscle fibers. MyoD also binds to its own promoter, thus maintaining its own levels in differentiated muscle cells and their progeny.
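Recognition of a specific sequence can be illustrated by scanning a DNA string for a consensus motif. The sketch below uses the E-box consensus CANNTG, which is not named in the text above but is the site commonly reported for bHLH proteins such as MyoD. The function name `find_sites` and the example sequence are made up for illustration.

```python
import re

# Minimal consensus alphabet: the four bases plus N for "any base".
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_sites(sequence: str, consensus: str) -> list[int]:
    """Return 0-based start positions where the consensus matches (one strand)."""
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead reports overlapping matches as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]


if __name__ == "__main__":
    e_box = "CANNTG"                 # consensus commonly reported for bHLH proteins
    promoter = "TTCAGCTGAAGGCACGTGTT"  # made-up sequence, for illustration only
    print(find_sites(promoter, e_box))
```

Because CANNTG reads the same on the reverse-complement strand, scanning one strand is enough for this particular consensus. A motif without that symmetry would also need a scan of the reverse complement.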
In addition to bHLH, there are some other common structural motifs for recognition and binding of DNA, and these are found in most regulatory proteins. These are the helix-turn-helix, zinc finger, and leucine zipper (Figure 2). The NFATC1 example shown in Figure 1 is known as a β-barrel. Proteins having each of these motifs are effective because they fit neatly into the major or minor grooves of the DNA strand, and also because they expose specific amino acids at the appropriate places to form hydrogen bonds with the nucleotide bases. Molecular genetic techniques can be used to change any amino acids to test whether this affects the binding affinity of the TF for the target.
Complexity and Transcription Factors
Complexity of transcriptional control can be illustrated by comparing the number and locations of cis-control elements in higher and lower eukaryotes. For instance, a single Drosophila gene of 2 to 3 kilobases typically has several enhancers scattered over a large (10 kilobase) region of DNA, whereas yeast have no enhancers and instead use one upstream activating sequence (UAS) per gene, located upstream of the gene. Long-range regulation is thought to be indicative of the need for a higher level of control over genes involved in cell development and differentiation.
The yeast genome encodes around 300 TFs, or one for every 20 genes, while humans express approximately 3,000 TFs, or one for every 10 genes. With combinatorial control, the twofold increase in TFs per gene actually translates into many more possible combinations of interactions, allowing for the dramatic increase in diversity among organisms. When we consider the additional complexities of chromatin remodeling, regulated mRNA stability, and translational control, it is easier to understand how the cells of higher organisms can produce such an enormous variety of genetic responses to environmental signals.
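A quick calculation shows why more TFs means disproportionately more combinations. The sketch below counts unordered groups of two or three TFs drawn from the approximate totals quoted above (300 for yeast, 3,000 for humans). It assumes any TFs can be grouped freely, which overstates the real number because only some TFs interact, so the figures are upper bounds for illustration. The function name `tf_combinations` is hypothetical.

```python
from math import comb


def tf_combinations(n_tfs: int, group_size: int) -> int:
    """Number of unordered groups of `group_size` TFs chosen from `n_tfs`."""
    return comb(n_tfs, group_size)


if __name__ == "__main__":
    yeast_tfs, human_tfs = 300, 3000  # approximate totals quoted in the text
    for size in (2, 3):
        yeast = tf_combinations(yeast_tfs, size)
        human = tf_combinations(human_tfs, size)
        print(f"groups of {size}: yeast {yeast:,}, human {human:,}, "
              f"ratio about {human / yeast:,.0f}x")
```

The number of possible groups grows much faster than the number of TFs itself, and the gap widens as the group size increases.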
References and Recommended Reading
Chen, K., & Rajewsky, N. The evolution of gene regulation by transcription factors and microRNAs. Nature Reviews Genetics 8, 93–103 (2007) doi:10.1038/nrg1990
Hochschild, A., et al. Repressor structure and the mechanism of positive control. Cell 32, 319–325 (1983) doi:10.1016/0092-8674(83)90451-8
Iyer, V., et al. Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature 409, 533–538 (2001) doi:10.1038/35054095
Levine, M., & Tjian, R. Transcription regulation and animal diversity. Nature 424, 147–151 (2003) doi:10.1038/nature01763
Sadava, D., et al. Life: The Science of Biology (Gordonsville, VA, W. H. Freeman, 2006)