More than 45% of the human genome has been annotated as transposable elements (TEs). Mobilization of these TEs expands the human genome and may increase its plasticity and variation. Long terminal repeat (LTR) retrotransposons are an important class of TEs. LTRs contain regulatory sites, which the authors believe could be conserved in evolution. The significant motifs in LTR sequences are therefore identified and used to train hidden Markov models (HMMs). These models serve as fingerprints that detect most of the known LTRs found by RepeatMasker, and they are also used as predictive models to classify LTR instances into families. The resulting LTRs can support evolutionary analysis. In summary, a new method of detecting LTRs is proposed: analysis of LTR sequences reveals specific motifs that act as LTR fingerprints and can be built into HMM profiles. Experimental results show that the proposed approach not only recovers most of the LTRs found by RepeatMasker but also detects some novel LTRs, which may be structurally incomplete or degenerate. © 2008 Published by Elsevier Ltd.
- Genome; Hidden Markov model; LTR; Repeats; Transposable elements