TY - GEN
T1 - A Human DNA methylation site predictor based on SVM
AU - Sun, Yi-Ming
AU - Liu, Baw-Jhiune
AU - Liao, Wei-Li
AU - Chang, Cheng-Wei
AU - Huang, Hsien-Da
AU - Horng, Jorng-Tzong
AU - Wu, Li-Ching
PY - 2009/6
Y1 - 2009/6
N2 - During gene expression, transcription factors are unable to bind to a transcription factor binding site (TFBS) involved in regulation if DNA methylation has occurred at the TFBS. Methyl-CpG-binding proteins may also occupy the TFBS and prevent the functioning of a transcription factor. Thus, the methylation status of CpG sites is an important issue when trying to understand gene regulation and shows strong correlation with the TFBS involved. In addition, CpG islands seem to undergo cell-specific and tissue-specific methylation. Such differential methylation is present at numerous genetic loci that are essential for development. Current DNA methylation site prediction tools need to be improved so that they include TFBS features and have greater accuracy in terms of the DNA region that is involved in methylation. We developed models that compare the differences across these regions and tissues. The TFBSs, DNA properties and DNA distribution were used as features for this classification. From the results, we found some TFBSs that were able to discriminate whether a sequence was methylated or not. The sensitivity, specificity and accuracy estimated using 10-fold cross-validation were 90.8%, 80.54%, and 86.07%, respectively. Thus, for these four regions and twelve tissues, the performance levels (ACC) were all greater than 80%. We propose that the differential features or methylations vary between the different regions because the features common to each DNA region made up only 50% of the top 70 features. An online predictor based on EpiMeP is available at http://140.115.51.41/EpiMeP/. A supplementary file is available at http://140.115.51.41/EpiMeP/supplementary.doc.
AB - During gene expression, transcription factors are unable to bind to a transcription factor binding site (TFBS) involved in regulation if DNA methylation has occurred at the TFBS. Methyl-CpG-binding proteins may also occupy the TFBS and prevent the functioning of a transcription factor. Thus, the methylation status of CpG sites is an important issue when trying to understand gene regulation and shows strong correlation with the TFBS involved. In addition, CpG islands seem to undergo cell-specific and tissue-specific methylation. Such differential methylation is present at numerous genetic loci that are essential for development. Current DNA methylation site prediction tools need to be improved so that they include TFBS features and have greater accuracy in terms of the DNA region that is involved in methylation. We developed models that compare the differences across these regions and tissues. The TFBSs, DNA properties and DNA distribution were used as features for this classification. From the results, we found some TFBSs that were able to discriminate whether a sequence was methylated or not. The sensitivity, specificity and accuracy estimated using 10-fold cross-validation were 90.8%, 80.54%, and 86.07%, respectively. Thus, for these four regions and twelve tissues, the performance levels (ACC) were all greater than 80%. We propose that the differential features or methylations vary between the different regions because the features common to each DNA region made up only 50% of the top 70 features. An online predictor based on EpiMeP is available at http://140.115.51.41/EpiMeP/. A supplementary file is available at http://140.115.51.41/EpiMeP/supplementary.doc.
U2 - 10.1109/BIBE.2009.22
DO - 10.1109/BIBE.2009.22
M3 - Conference contribution
SN - 9780769536569
T3 - 2009 9TH IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING
SP - 22
EP - 29
BT - 2009 Ninth IEEE International Conference on Bioinformatics and BioEngineering
ER -