Abstract

Epigenetics is the study of heritable changes in gene function that occur without a change in DNA sequence. It has quickly emerged as an essential area for understanding inheritance and variation that cannot be explained by the DNA sequence alone. Epigenetic modifications have the potential to regulate gene expression and may play a role in diseases such as cancer. DNA methylation is a type of epigenetic modification that occurs when a methyl group attaches to a cytosine base on the DNA molecule. To better understand this epigenetic mechanism, DNA methylation profiles can be constructed by identifying all locations of DNA methylation in a genomic region (e.g., a chromosome or the whole genome). Large-scale studies of DNA methylation are supported by a microarray technology known as tiling arrays. These arrays provide high-density coverage of genomic regions through the unbiased, systematic selection of probes that are tiled across the regions. Statistical methods are employed to estimate each probe’s DNA methylation status. Previous studies indicate that DNA methylation patterns of some organisms differ by genomic element (e.g., gene, transposon), suggesting that genomic annotation information may be useful in statistical analysis. In this work, a novel statistical model is proposed that takes advantage of genomic annotation information, which to date has not been effectively utilized in statistical analysis. Specifically, a hidden Markov model that incorporates genomic annotation is introduced and investigated through a simulation study and an analysis of an Arabidopsis thaliana DNA methylation tiling array experiment.

Keywords

METHYLATION, DNA

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Apr 25th, 10:00 AM

MODELING DNA METHYLATION TILING ARRAY DATA
