9.1: Transposable Elements (Transposons) - Biology

Transposable elements (both active and inactive) occupy approximately half the human genome and a substantially greater fraction of some plant genomes! These movable elements are ubiquitous in the biosphere and are highly successful in propagating themselves. We now realize that some transposable elements are also viruses; for instance, some retroviruses can integrate into a host genome to form endogenous retroviruses. Indeed, some viruses may be derived from natural transposable elements and vice versa. Since viruses move between individuals, at least some transposable elements can move between genomes (between individuals) as well as within an individual’s genome. Given their prevalence in genomes, the function (if any) of transposable elements has been much discussed but remains poorly understood. It is not even clear whether transposable elements should be considered an integral part of a species’ genome or successful parasites of it. They do have important effects on genes and their phenotypes, and they are the subject of intense investigation.

Transposition is related to replication, recombination and repair. The process of moving from one place to another involves a type of recombination, insertions of transposable elements can cause mutations, and some transpositions are replicative, generating a new copy while leaving the old copy intact. However, this ability to move is a unique property of transposable elements, and warrants treatment by itself.

Properties and effects of transposable elements

The defining property of transposable elements is their mobility; i.e. they are genetic elements that can move from one position to another in the genome. Beyond the common property of mobility, transposable elements show considerable diversity. Some move by DNA intermediates, and others move by RNA intermediates. Much of the mechanism of transposition is distinctive for these two classes, but all transposable elements effectively insert at staggered breaks in chromosomes. Some transposable elements move in a replicative manner, whereas others are nonreplicative, i.e. they move without making a copy of themselves.
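The insertion step described above can be illustrated with a small Python sketch. It assumes a toy model that is not part of the original text: DNA is a plain string, the element is written in lowercase so it stands out, and the staggered break is repaired by filling in the single-stranded gaps, which leaves a short duplication of the target site on both sides of the inserted element. The function names `insert_at_staggered_break` and `transpose` and the example sequences are hypothetical, and excision is modeled as a clean removal, which is a simplification.

```python
from typing import Tuple


def insert_at_staggered_break(target: str, element: str, cut: int, stagger: int) -> str:
    """Insert `element` into `target` at a staggered break (toy model).

    One strand is cut at `cut` and the other `stagger` bases further along.
    Filling in the resulting gaps copies the `stagger` bases between the two
    cuts, so they appear on both sides of the inserted element.
    """
    if stagger < 0 or not 0 <= cut <= len(target) - stagger:
        raise ValueError("cut site and stagger must fit inside the target")
    left = target[:cut + stagger]   # ends with the staggered bases
    right = target[cut:]            # starts with the same staggered bases
    return left + element + right


def transpose(donor: str, element: str, target: str, cut: int,
              stagger: int, replicative: bool) -> Tuple[str, str]:
    """Move `element` from `donor` to `target`; return (donor_after, target_after).

    Replicative transposition leaves the donor copy intact, while
    nonreplicative transposition removes it from the donor site.
    """
    if element not in donor:
        raise ValueError("element is not present in the donor sequence")
    donor_after = donor if replicative else donor.replace(element, "", 1)
    return donor_after, insert_at_staggered_break(target, element, cut, stagger)


# Hypothetical sequences, chosen only for illustration.
ELEMENT = "ggtatacc"
DONOR = "TT" + ELEMENT + "TT"
TARGET = "AAACGTTTT"

for mode in (True, False):
    donor_after, target_after = transpose(DONOR, ELEMENT, TARGET, cut=3,
                                          stagger=3, replicative=mode)
    label = "replicative" if mode else "nonreplicative"
    print(label, donor_after, target_after)
```

In both modes the target ends up with the same three bases repeated on either side of the element; the two modes differ only in whether the donor site keeps its copy.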

Transposable elements are major forces in the evolution and rearrangement of genomes (Figure 9.1). Some transposition events inactivate genes, since the coding potential or expression of a gene is disrupted by insertion of the transposable element. A classic example is the r allele (rugosus) of the pea gene encoding a starch branching enzyme, which is nonfunctional due to the insertion of a transposable element. In homozygotes, this allele causes the wrinkled pea phenotype originally studied by Mendel. In other cases, transposition can activate nearby genes by bringing an enhancer of transcription (within the transposable element) close enough to a gene to stimulate its expression. If the target gene is not usually expressed in a certain cell type, this activation can lead to pathology, such as activation of a proto-oncogene causing a cell to become cancerous. In still other cases, no obvious phenotype results from the transposition. A particular type of transposable element can activate, inactivate or have no effect on nearby genes, depending on exactly where it inserts, its orientation and other factors.

Figure 9.1. Possible effects of movement of a transposable element in the function and expression of the target gene. The transposable element is shown as a red rectangle, and the target gene (X) is composed of multiple exons. Protein coding regions of exons are green and untranslated regions are gold. The angled arrow indicates the start site for transcription.

Transposable elements can cause deletions or inversions of DNA. When transposition generates two copies of the same sequence in the same orientation, recombination can delete the DNA between them. If the two copies are in the opposite orientations, recombination will invert the DNA between them.
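The two outcomes can be sketched in Python on a toy sequence. This is a minimal illustration under stated assumptions, not a model of any real recombination machinery: DNA is an uppercase string, the element is not its own reverse complement, and recombination between two copies in the same orientation is taken to remove the intervening DNA along with one of the two copies. The function names and sequences are hypothetical.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def recombine_between_copies(seq: str, element: str) -> Tuple[str, str]:
    """Recombine the first copy of `element` with the next copy downstream.

    Same orientation: the DNA between the copies (and one copy) is deleted.
    Opposite orientation: the DNA between the copies is inverted.
    Returns (outcome, resulting_sequence).
    """
    first = seq.find(element)
    if first < 0:
        raise ValueError("element not found")
    after_first = first + len(element)

    direct = seq.find(element, after_first)
    if direct >= 0:
        return "deletion", seq[:after_first] + seq[direct + len(element):]

    inverted = seq.find(reverse_complement(element), after_first)
    if inverted >= 0:
        between = seq[after_first:inverted]
        return "inversion", seq[:after_first] + reverse_complement(between) + seq[inverted:]

    raise ValueError("no second copy of the element found")


# Hypothetical sequences: the same element flanking the segment "CCCT".
ELEMENT = "GATTC"
same_orientation = "AA" + ELEMENT + "CCCT" + ELEMENT + "TT"
opposite_orientation = "AA" + ELEMENT + "CCCT" + reverse_complement(ELEMENT) + "TT"

print(recombine_between_copies(same_orientation, ELEMENT))
print(recombine_between_copies(opposite_orientation, ELEMENT))
```

In the first call the segment between the copies is lost; in the second it is retained but replaced by its reverse complement, while both copies of the element stay in place.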

As part of the mechanism of transposition, additional DNA sequences can be mobilized. DNA located between two copies of a transposable element can be moved together with them when they move. In this manner, transposition can move DNA sequences that are not normally part of a transposable element to new locations. Indeed, "host" sequences can be acquired by viruses and propagated by infection of other individuals. This may be a natural means for evolving new strains of viruses. One of the most striking examples is the acquisition and modification of a proto-oncogene, such as cellular c-src, by a retrovirus to generate a modified, transforming form of the gene, called v-src. These and related observations provided insights into the progression of events that turn a normal cell into a cancerous one. They also point to the continual acquisition (and possibly deletion) of information from host genomes as a natural part of the evolution of viruses.

Mutator and MULE Transposons

The Mutator system of transposable elements (TEs) is a highly mutagenic family of transposons in maize. Because they transpose at high rates and target genic regions, these transposons can rapidly generate large numbers of new mutants, which has made the Mutator system a favored tool for both forward and reverse mutagenesis in maize. Low copy number versions of this system have also proved to be excellent models for understanding the regulation and behavior of Class II transposons in plants. Notably, the availability of a naturally occurring locus that can heritably silence autonomous Mutator elements has provided insights into the means by which otherwise active transposons are recognized and silenced. This chapter will provide a review of the biology, regulation, evolution and uses of this remarkable transposon system, with an emphasis on recent developments in our understanding of the ways in which this TE system is recognized and epigenetically silenced as well as recent evidence that Mu-like elements (MULEs) have had a significant impact on the evolution of plant genomes.


Change in gene regulation is an important mechanism underlying the emergence of new biological traits [1,2,3,4,5]. There is a substantial body of empirical studies illustrating how the addition, modification, or disappearance of cis-regulatory elements, such as enhancers, has driven the emergence of profound phenotypic changes throughout evolution [6,7,8]. Thus, there has been an intensifying effort over the past decade to better understand mechanisms underlying the evolution of enhancers and other cis-regulatory elements [4, 9,10,11,12].

In the broadest definition, enhancers are short (100 bp–1 kb) DNA sequences that modulate transcription of target genes regardless of genomic orientation or distance, and are often bound by transcription factors (TFs) [13, 14]. Recent advances in functional genomics enabled nearly unbiased mapping of enhancers and their associated TF binding sites (TFBSs) on a genome-wide scale and facilitated systematic studies of enhancer evolution across and within species [11, 15,16,17]. Seminal comparative studies in mammals revealed a low level of conservation in the genomic location of enhancers relative to genes and their promoters [18,19,20,21,22,23,24,25,26,27]. For instance, Villar and colleagues found that nearly half of 20,000–25,000 active liver enhancers mapped in each of 20 mammalian species are lineage- or even species-specific, while almost all promoters active in the liver are conserved across most or all the species examined [28]. However, recent analyses demonstrated that deeply conserved enhancers often coordinate robust and essential gene expression programs while less conserved enhancers contribute plasticity and redundancy to gene regulatory networks [29,30,31]. While these studies point to the rapid turnover of enhancers during mammalian evolution, the mechanisms underlying the birth and death of enhancers are only beginning to be understood [10, 11, 26, 32,33,34,35].

Transposable elements (TEs) represent an important source of new cis-regulatory elements, including enhancers. TEs account for a substantial amount of nuclear DNA and genetic variation in virtually all metazoans [36]. For example, between one-third and two-thirds of every mammalian genome examined thus far is recognizable as being derived from TE sequences [36,37,38,39]. These elements inserted at various times during mammalian evolution, ranging from highly decayed copies integrated > 100 million years ago to recently integrated copies that may be species-specific or still polymorphic in the population [37,38,39,40,41,42]. Several studies have systematically examined the contribution of TEs to TF binding and the birth of cis-regulatory elements, and some general principles have emerged [43,44,45,46,47]. First, TEs contribute a substantial but widely variable fraction (2–40%) of the TFBSs mapped for a given TF throughout the genome [48,49,50,51,52]. Second, TFBSs and cis-regulatory elements derived from TEs tend to be evolutionarily recent and are restricted to specific species or lineages [34, 50, 53]. For example, ∼20% of OCT4 and NANOG binding sites were derived from lineage-specific TEs in humans and mice [20]. This may be explained by the fact that the majority of TEs in any mammalian genome are themselves lineage-specific: for example, 85% of mouse TEs are not shared with the human genome [40] and 35% are not even shared with the rat genome [54]. Third, not all TEs contribute equally: for any given TF, there is generally one or a few TE families that account for a disproportionate fraction of binding sites relative to their frequency in the genome [20, 44, 46, 48, 50, 51].
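One simple way to express "a disproportionate fraction of binding sites relative to their frequency in the genome" is a fold enrichment: the share of a TF's binding sites that fall in a TE family, divided by the share of the genome that family occupies. The Python sketch below is only an illustration of that ratio. The family names, counts, and genome size are hypothetical placeholders rather than values from the cited studies, and the studies themselves may use different statistics.

```python
from typing import Dict, List, Tuple


def fold_enrichment(sites_in_family: int, total_sites: int,
                    family_bp: int, genome_bp: int) -> float:
    """Observed share of binding sites divided by the family's share of the genome.

    Written as a single cross-multiplied ratio:
    (sites_in_family / total_sites) / (family_bp / genome_bp).
    """
    if min(total_sites, family_bp, genome_bp) <= 0:
        raise ValueError("totals and lengths must be positive")
    return (sites_in_family * genome_bp) / (total_sites * family_bp)


def rank_families(site_counts: Dict[str, int], family_bp: Dict[str, int],
                  total_sites: int, genome_bp: int) -> List[Tuple[str, float]]:
    """Return (family, fold enrichment) pairs, most enriched first."""
    scores = [
        (family, fold_enrichment(count, total_sites, family_bp[family], genome_bp))
        for family, count in site_counts.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs for one TF: binding sites per TE family and the
# number of base pairs each family occupies in the genome.
GENOME_BP = 2_000_000_000
TOTAL_SITES = 1_000
SITE_COUNTS = {"family_A": 300, "family_B": 50, "family_C": 10}
FAMILY_BP = {"family_A": 20_000_000, "family_B": 100_000_000, "family_C": 200_000_000}

for family, score in rank_families(SITE_COUNTS, FAMILY_BP, TOTAL_SITES, GENOME_BP):
    print(f"{family}: {score:.1f}x")
```

A value well above 1 marks a family that contributes more binding sites than its genomic footprint would predict, a value near 1 marks a family that contributes in proportion to its abundance, and a value below 1 marks a depleted family.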

Multiple studies have now confirmed that different TE classes and families contribute TFBSs for different TFs in different mammalian species and that these TE-derived TFBSs occasionally undergo exaptation to give rise to new host regulatory elements (reviewed in [44, 47]). However, the mechanisms by which complex enhancers emerge from TEs remain poorly understood. For instance, it is unclear why specific TE families or copies are bound by a particular TF while closely related elements in the same genome are not [45, 51]. The path by which individual TE copies are co-opted for regulatory purposes has scarcely been characterized [55], and the relative contributions of combinatorial sequence motifs pre-existing within TEs or in the vicinity of their insertion sites have not been examined in detail. To address these and other poorly understood aspects of TE co-option in regulatory evolution, we chose to examine the contribution of TEs to the cis-regulatory network underlying circadian gene expression. The machinery responsible for the transcriptional control of circadian gene expression is deeply conserved and has been extensively characterized in the mouse liver, which provides a solid experimental framework against which the impact of TEs can be queried. The circadian clock also offers a particularly robust system in which to examine the binding of TEs by regulatory proteins, as circadian rhythms are maintained by a series of interconnecting feedback loops of paralogous TFs [56,57,58].

The primary feedback loop consists of six circadian regulators (CRs), of which two are transcriptional activators (BMAL1 and CLOCK) and four are transcriptional repressors (PER1, PER2, CRY1, and CRY2). During the day, BMAL1 and CLOCK form a heterodimer, which binds to a tandem pair of E-box motifs in distal and promoter regions of clock-controlled genes [59]. Among the direct targets of the BMAL1:CLOCK complex are the repressors PER1/2 and CRY1/2. Following translation, PER and CRY enter the nucleus and inhibit BMAL1:CLOCK mediated transcription, thereby decreasing their own transcription and generating a feedback loop essential to the maintenance of the clock period [60, 61]. This model has recently been revised to reflect reports that BMAL1 acts as a pioneer factor and promotes rhythmic nucleosome removal and that transcription promoted by CLOCK:BMAL1 is not homogeneously oscillatory [62]. It is proposed that CLOCK:BMAL1 binding rhythmically maintains a chromatin landscape which facilitates binding and transcriptional activation by other ubiquitous or tissue-specific transcription factors, including members of the nuclear receptor (NR) family [63]. Interactions between CRs and liver-specific NRs are thought to underlie liver-specific circadian regulation of metabolic processes such as glucose, cholesterol, and lipid metabolism [64,65,66]. The vast amount of data and knowledge available for circadian regulation in the mouse liver provides a solid paradigm to dissect the mechanisms underlying the contribution of particular TEs to this cis-regulatory network.


As potent insertional mutagens, TEs can have both positive and negative effects on host fitness, but it is likely that the majority of TE copies in any given species—and especially those such as humans with small effective population size—have reached fixation through genetic drift alone and are now largely neutral to their host. When can we say that TEs have been co-opted for cellular function? The publication of the initial ENCODE paper [195], which asserted ‘function for 80% of the genome’, was the subject of much debate and controversy. Technically speaking, ENCODE assigned only ‘biochemical’ activity to this large fraction of the genome. Yet critics objected to the grand proclamations in the popular press (The Washington Post Headline: “Junk DNA concept debunked by new analysis of the human genome”) and to the ENCODE consortium’s failure to prevent this misinterpretation [196,197,198]. To these critics, ignoring evolutionary definitions of function was a major misstep.

This debate can easily be extended to include TEs. TEs make up the vast majority of what is often referred to as ‘junk DNA’. Today, the term is mostly used (and abused) by the media, but it in fact has deep roots in evolutionary biology [199]. Regardless of the semantics, what evidence is needed to assign a function to a TE? Many TEs encode a wide range of biochemical activities that normally benefit their own propagation. For example, TEs often contain promoter or enhancer elements that hijack cellular RNA polymerases for transcription, and autonomous elements encode proteins with various biochemical and enzymatic activities, all of which are necessary for the transposon to replicate. Do these activities make them functional?

The vast differences in TEs between species make standard approaches to establish their regulatory roles particularly challenging [200]. For example, intriguing studies on the impact of HERVs, in particular HERV-H, in stem cells and pluripotency [150,151,152] must be interpreted using novel paradigms that do not invoke deep evolutionary conservation to imply function, as these particular ERVs are absent outside of great apes. Evolutionary constraint can be measured at shorter time scales, including the population level, but this remains a statistically challenging task especially for non-coding sequences. Natural loss-of-function alleles may exist in the human population and their effect on fitness can be studied if their impact is apparent, but these are quite rare and do not allow systematic studies. It is possible to engineer genetic knockouts of a particular human TE locus to test its regulatory role but those are restricted to in-vitro systems, especially when the orthologous TE does not exist in the model species. In this context, studying the impact of TEs in model species with powerful genome engineering tools and vast collections of mutants and other genetic resources, such as plants, fungi, and insects, will also continue to be extremely valuable.

Finally, a growing consensus is urging more rigor when assigning cellular function to TEs, particularly when a fitness benefit to the host is claimed [178]. Indeed, a TE displaying biochemical activity (such as those bound by transcription factors or lying within open chromatin regions) cannot be equated to a TE that shows evidence of purifying selection at the sequence level or that, when genetically altered, results in a deleterious or dysfunctional phenotype. Recent advances in editing and manipulating the genome and the epigenome en masse yet with precision, including repetitive elements [153, 154, 189,190,191], offer the promise of a systematic assessment of the functional significance of TEs.