The transposable elements of Drosophila melanogaster – a genomics perspective.

version 16.0: 2002-08-26: ma

Joshua S. Kaminker1'5, Casey M. Bergman-'8, Brent Kronmiller2, Joseph Carlson2, Robert Svirskas2, Sandeep Patel2, Erwin Frise2, David A. Wheeler5, Suzanna Lewis1, Gerald M. Rubin1*'4, Michael Ashburner6,7 and Susan E. Celniker.

1Department of Molecular and Cellular Biology, University of California, Berkeley, CA 94720, 2Drosophila Genome Project, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, 3Amersham Biosciences, 2100 East Elliot Rd., Tempe, AZ 85284, 4Howard Hughes Medical Institute, 5Human Genome Sequencing Center and Department of Molecular and Cell Biology, Baylor College of Medicine, Houston, TX 77030, 6Department of Genetics, University of Cambridge, Cambridge, England CB2 3EH, 7European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, England CB10 1SD.

Correspondence to Michael Ashburner, Department of Genetics, Downing Street, Cambridge, England CB2 3EH. phone: +44 1223-333969 fax: +44

Running Title: Transposable elements in the Drosophila euchromatin.

Abstract

Transposable elements are found in the genomes of nearly all eukaryotes. We have analyzed the Release 3 genomic sequence of Drosophila melanogaster to describe the euchromatic transposable elements in the sequenced strain of this species.
We identified 85 known and 8 novel families of transposable element in the Release 3 sequences; these vary in copy number between 1 and 146. Although the abundance of transposable elements varies among chromosomes, the density of transposable elements is relatively constant. There is a 2-fold enrichment of transposable elements on chromosome 4 relative to the major chromosome arms, and transposable element abundance on the X chromosome is similar to the major autosome arms.
The abundance of the three major classes of transposable elements (LTR, LINE-like, and TIR) is markedly higher in the proximal 2 Mb of each chromosome arm, reflecting the transition from euchromatin to heterochromatin, whereas the high abundance on chromosome 4 is due only to LINE-like and TIR elements. More than two-thirds of the transposable elements identified in Release 3 are partial. Analysis of structural variation of elements from diffe[...]nts of the same or different classes. Transposable elements are preferentially found outside genes; only 436 of 1,573 transposable elements are contained within the 61.4 Mb of sequence that is annotated as being transcribed. The high abundance, high proportion of complete elements and low levels
of sequence diversity in LTR families suggest that individual LTR elements are more likely to be recent insertions into the D. melanogaster genome, relative to LINE-like or TIR elements. This work provides a starting point for future genomic analysis of transposable elements in Drosophila.

Introduction

Transposable elements are found in the genomes of nearly all eukaryotes. As a result, many biologists have an interest in the description of transposable elements in completely sequenced eukaryotic genomes. The evolutionary biologist wants to understand the origin of transposable elements, how they are lost and gained by a species and the role they play in the processes of genome evolution; the population geneticist wants to know the factors that determine the frequency and distribution of elements within and between populations; the developmental geneticist wants to know what roles these elements may play in either normal developmental processes or in the response to environmental stress; the molecular biologist wants to know how these elements replicate and transpose, what proteins they encode, what their targets are and how they interact with the cellular machinery of the host. It is for all of these reasons and more that a description of the transposable elements in the recently completed Release 3 genomic sequence of D. melanogaster is desirable.

The contribution of Drosophila to our understanding of transposable elements is long and glorious. Over 75 years ago, Milislav Demerec discovered highly mutable alleles of two genes in D. virilis, miniature and
magenta (Demerec 1926; 1927; reviewed in Demerec 1935; Green 1976). Both genes were mutable in soma and germ-line and, for the miniature-3alpha allele, the mutation rate was enhanced by dominant enhancers (Demerec 1929). We now know that these mutable alleles were caused by the transposition of mobile elements; the dominant enhancers may have been particularly active elements or mutations in host genes affecting transposability
(see below). There matters essentially stood until McClintock's remarkable discovery of mutable alleles in maize and their basis - transposition of the Ac and Ds factors (McClintock 1950), and the discovery, some 20 years later, of insertion elements in the gal operon of Escherichia coli (see Starlinger 1977).

Green (1977) synthesized the evidence then at hand to make a strong case for insertion as a mechanism of mutagenesis in Drosophila. Within a year, the first transposable elements in Drosophila had been molecularly characterized (Ilyin et al. 1978) and evidence that they were transposable was soon available (Ilyin et al. 1978; Strobel et al. 1979; Young 1979). In fact, the Hogness group had
already, but unknowingly, molecularly characterized the first eukaryotic transposable element, the insertion sequences of 28S rRNA encoding genes (see Glover 1977). The discovery of male recombination (Hiraizumi 1971), and two systems of hybrid dysgenesis in D. melanogaster (see Kidwell 1979), allowed the gap, then wide, between genetic and molecular analyses to be bridged. The discovery of the causal transposable elements, the P-element (Bingham et al. 1982; Rubin et al. 1982) and the I-element (Bucheton et al. 1984), and the subsequent development of P-element mediated transformation (Rubin and Spradling 1982; Spradling and Rubin 1982), revolutionized Drosophila genetics.

The publication of the Release 1 genomic sequence in March 2000 (Adams et al. 2000) and the Release 2 genomic sequence in October 2000 encouraged several studies on the genomic distribution and abundance of transposable elements in D. melanogaster (Berezikov, Bucheton and Busseau 2000; Jurka 2000; Bowen and McDonald 2001; Rizzon et al. 2002; Bartolomé, Maside and Charlesworth 2002). Unfortunately, neither release was suitable for rigorous analysis
of its transposable elements since sequences corresponding to known transposable elements, along with other sequences known to be repetitive in the genome, were deliberately excluded from the assembly (Myers et al. 2000). In the Release 2
genome assembly, an attempt was made to fill these gaps. However, comparisons of small regions sequenced by the clone-by-clone approach versus the whole genome shotgun method show that this was not a very accurate process (Myers et al. 2000; Benos et al. 2001). It was clear that any rigorous analysis of the transposable elements, or any other repeat, required a sequence of higher quality. This has now been achieved by the finishing efforts of the Berkeley Drosophila Genome Project. This sequence, Release 3, is now publicly available (Celniker et al. 2002). For the first time, a reliable analysis of the transposable elements in the euchromatic portion of the D. melanogaster genome is possible.

Results and Discussion

Identification of known and novel transposable elements

Eukaryotic transposable elements are divided between those that transpose via an RNA intermediate (class I), retrotransposons, and those that transpose by DNA excision and repair (class II), non-retrotransposons (Craig et al. 2002). Within the retrotransposons, the major division is between those that possess long terminal repeats (LTR elements) and those that do not (LINE-like elements and SINE
elements (Deininger 1989)). Among the non-retrotransposons, the majority transpose via a DNA intermediate, encode their own transposase and are flanked by terminal inverted repeats (TIR elements). The foldback (FB) elements of Drosophila, which reanneal rapidly after denaturation with zero-order kinetics, are quite distinct from prototypical class I or II elements, and have been included in our analyses (Truett et al. 1981). In addition, other classes of repetitive elements, such as INE-1 (Locke et al. 1999a; Locke et al. 1999b; Wilder and Hollocher 2001), which are structurally distinct from all other classes of elements, have not been included in this study.

While the classification of transposable elements by structural class is relatively easy, the taxonomy of transposable element families is somewhat arbitrary (Table 1). We used a crit

