Computational discovery of DNA methylation patterns as biomarkers of ageing, cancer, and mental disorders: Algorithms and Tools
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology. Uppsala University, Science for Life Laboratory, SciLifeLab. (Computational Biology and Bioinformatics)
2017 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Epigenetics refers to the mitotically heritable modifications in gene expression without a change in the genetic code. A combination of molecular, chemical and environmental factors constituting the epigenome is involved, together with the genome, in setting up the unique functionality of each cell type.

DNA methylation is the most studied epigenetic mark in mammals, where a methyl group is added to the cytosine in a cytosine-phosphate-guanine dinucleotide, or CpG site. It has been shown to have a major role in various biological phenomena such as chromosome X inactivation, regulation of gene expression, cell differentiation, and genomic imprinting. Furthermore, aberrant patterns of DNA methylation have been observed in various diseases, including cancer.

In this thesis, we have utilized machine learning methods and developed new methods and tools to analyze DNA methylation patterns as a biomarker of ageing, cancer subtyping and mental disorders.

In Paper I, we introduced a pipeline of Monte Carlo Feature Selection and rule-based modeling using ROSETTA in order to identify combinations of CpG sites that classify samples into different age intervals based on their DNA methylation levels. The combinations of genes that appeared to act together motivated us to develop, in Paper II, an interactive pathway browser named PiiL for checking the methylation status of multiple genes in a pathway. The tool makes it easier to detect differential patterns of DNA methylation and/or gene expression by quickly assessing large data sets.

In Paper III, we developed a novel unsupervised clustering method, methylSaguaro, for analyzing various types of cancers, to detect cancer subtypes based on their DNA methylation patterns. Using this method we confirmed the previously reported findings that challenge the histological grouping of the patients, and proposed new subtypes based on DNA methylation patterns. In Paper IV, we investigated the DNA methylation patterns in a cohort of schizophrenic and healthy samples, using all the methods that were introduced and developed in the first three papers.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2017, 55 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1520
Keyword [en]
DNA methylation, machine learning, biomarker, cancer, ageing, classification
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:uu:diva-320720; ISBN: 978-91-554-9924-2 (print); OAI: oai:DiVA.org:uu-320720; DiVA: diva2:1090492
Public defence
2017-06-12, A1:111a, BMC Building, Husargatan 3, Uppsala, 09:00 (English)
Available from: 2017-05-22 Created: 2017-04-24 Last updated: 2017-06-07
List of papers
1. Combinatorial identification of DNA methylation patterns over age in the human brain
2016 (English). In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 17, 393. Article in journal (Refereed), Published
Abstract [en]

Background: DNA methylation plays a key role in developmental processes, which is reflected in changing methylation patterns at specific CpG sites over the lifetime of an individual. The underlying mechanisms are complex and possibly affect multiple genes or entire pathways.

Results: We applied a multivariate approach to identify combinations of CpG sites that undergo modifications when transitioning between developmental stages. Monte Carlo feature selection produced a list of ranked and statistically significant CpG sites, while rule-based models allowed for identifying particular methylation changes in these sites. Our rule-based classifier reports combinations of CpG sites, together with changes in their methylation status in the form of easy-to-read IF-THEN rules, which allows for identification of the genes associated with the underlying sites.

Conclusion: We utilized machine learning and statistical methods to discretize decision class (age) values to get a general pattern of methylation changes over the lifespan. The CpG sites present in the significant rules were annotated to genes involved in brain formation, general development, as well as genes linked to cancer and Alzheimer's disease.
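The IF-THEN rule format described in this abstract can be illustrated with a minimal Python sketch. It is not the ROSETTA implementation or the paper's pipeline: the cut points, the site names `cg_A` and `cg_B`, the age interval labels, and the toy values are all assumptions made up for illustration. It shows how methylation levels and ages might be discretized and how a single rule's support and accuracy could be computed.

```python
from bisect import bisect_right

# Hypothetical cut points; the paper's actual discretization is not given here.
BETA_CUTS = (0.3, 0.7)  # beta < 0.3 -> low, 0.3 <= beta < 0.7 -> medium, else high
BETA_LABELS = ("low", "medium", "high")
AGE_CUTS = (20, 60)     # age < 20 -> young, 20 <= age < 60 -> adult, else old
AGE_LABELS = ("young", "adult", "old")


def discretize(value, cuts, labels):
    """Map a numeric value to the label of the interval it falls in."""
    return labels[bisect_right(cuts, value)]


def rule_matches(rule, sample):
    """True if every (CpG site -> level) condition of the rule holds for the sample."""
    return all(
        discretize(sample["beta"][site], BETA_CUTS, BETA_LABELS) == level
        for site, level in rule["if"].items()
    )


def evaluate(rule, samples):
    """Return (support, accuracy) of a rule over a list of samples."""
    matched = [s for s in samples if rule_matches(rule, s)]
    if not matched:
        return 0, 0.0
    correct = sum(
        discretize(s["age"], AGE_CUTS, AGE_LABELS) == rule["then"] for s in matched
    )
    return len(matched), correct / len(matched)


# Toy data with made-up site IDs and values, for illustration only.
samples = [
    {"age": 5, "beta": {"cg_A": 0.10, "cg_B": 0.85}},
    {"age": 12, "beta": {"cg_A": 0.20, "cg_B": 0.90}},
    {"age": 45, "beta": {"cg_A": 0.15, "cg_B": 0.80}},
    {"age": 70, "beta": {"cg_A": 0.75, "cg_B": 0.40}},
]
rule = {"if": {"cg_A": "low", "cg_B": "high"}, "then": "young"}

support, accuracy = evaluate(rule, samples)
print(f"IF cg_A=low AND cg_B=high THEN age=young (support={support}, accuracy={accuracy:.2f})")
```

In this toy setting the rule matches three samples, two of which fall in the predicted age interval. A real rule-based model would derive many such rules from the data rather than take one as given.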

Keyword
DNA methylation, Aging, Rule-based classification, Feature selection
National Category
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)
Identifiers
urn:nbn:se:uu:diva-305330 (URN); 10.1186/s12859-016-1259-3 (DOI); 000383750700001 ()
Funder
Swedish Research Council Formas; eSSENCE - An eScience Collaboration
Available from: 2016-10-14 Created: 2016-10-14 Last updated: 2017-05-03. Bibliographically approved
2. PiiL: visualization of DNA methylation and gene expression data in gene pathways
2017 (English). In: BMC Genomics, ISSN 1471-2164, E-ISSN 1471-2164, Vol. 18, 571. Article in journal (Refereed), Published
Abstract [en]

DNA methylation is a major mechanism involved in the epigenetic state of a cell. It has been observed that the methylation status of certain CpG sites close to or within a gene can directly affect its expression, either by silencing or, in some cases, up-regulating transcription. However, a vertebrate genome contains millions of CpG sites, all of which are potential targets for methylation modification, and the specific effects of most sites have not been characterized to date. To study the complex interplay between methylation status, cellular programs, and the resulting phenotypes, we present PiiL, an interactive gene expression pathway browser that facilitates the analysis through an integrated view of methylation and expression on multiple levels.

PiiL allows for specific hypothesis testing by quickly assessing pathways or gene networks, with the data projected onto pathways that can be downloaded directly from the online KEGG database. PiiL provides a comprehensive set of analysis features for quickly searching for specific patterns and for examining individual CpG sites and their impact on the expression of the host gene and other genes in regulatory networks. To exemplify the power of this approach, we analyzed two types of brain tumors, glioblastoma multiforme and lower-grade gliomas.

At a glance, we could confirm earlier findings that the predominant methylation and expression patterns separate perfectly by mutations in the IDH genes, rather than by histology. We could also infer the IDH mutation status for samples for which the genotype was not known. By applying different filtering methods, we show that a subset of CpG sites exhibits consistent methylation patterns, and that the status of these sites affects the expression of key regulator genes, as well as other genes located downstream in the same pathways.
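As a rough illustration of what filtering for CpG sites with consistent methylation patterns could look like, the following Python sketch keeps sites whose values are stable within each of two sample groups but differ between them. It is not PiiL's filter (PiiL is a Java application, and its filtering options are not described here). The function name `consistent_sites`, the thresholds `min_diff` and `max_sd`, the group labels, and the toy values are all assumptions.

```python
from statistics import mean, pstdev


def consistent_sites(beta, groups, min_diff=0.3, max_sd=0.1):
    """Return CpG sites that are stable within each group but differ between groups.

    beta:   {site: {sample: beta value}}
    groups: {sample: group label}; exactly two distinct labels are expected
    """
    labels = sorted(set(groups.values()))
    if len(labels) != 2:
        raise ValueError("expected exactly two groups")
    selected = []
    for site, values in beta.items():
        # Split this site's beta values by sample group.
        by_group = {g: [v for s, v in values.items() if groups[s] == g] for g in labels}
        means = [mean(by_group[g]) for g in labels]
        sds = [pstdev(by_group[g]) for g in labels]
        if abs(means[0] - means[1]) >= min_diff and max(sds) <= max_sd:
            selected.append(site)
    return selected


# Toy data with made-up sample names, site IDs, and values, for illustration only.
groups = {"s1": "IDH-mut", "s2": "IDH-mut", "s3": "IDH-wt", "s4": "IDH-wt"}
beta = {
    "cg_X": {"s1": 0.85, "s2": 0.90, "s3": 0.20, "s4": 0.25},  # stable and different
    "cg_Y": {"s1": 0.50, "s2": 0.55, "s3": 0.50, "s4": 0.45},  # no group difference
    "cg_Z": {"s1": 0.90, "s2": 0.30, "s3": 0.10, "s4": 0.15},  # noisy within a group
}

print(consistent_sites(beta, groups))
```

Only `cg_X` passes both conditions in this toy example. Other criteria, such as rank-based tests, would be equally reasonable choices.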

PiiL is implemented in Java with focus on a user-friendly graphical interface. The source code is available under the GPL license from https://github.com/behroozt/PiiL.git.

Keyword
DNA methylation, gene expression, KEGG pathways, visualization
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-320675 (URN); 10.1186/s12864-017-3950-9 (DOI)
Available from: 2017-04-23 Created: 2017-04-23 Last updated: 2017-08-03. Bibliographically approved
3. An unsupervised approach subgroups cancer types by distinct local DNA methylation patterns
(English). Article in journal (Other academic), Submitted
Abstract [en]

Cancer is one of the most common causes of death in humans. It can arise from many different cell types, and even cancers originating from the same tissue can constitute a heterogeneous group of diseases. While cytogenetics, the analysis of mutations and karyotypic alterations, has greatly improved the accuracy of diagnosis, it is likely that there are more categories into which cancers can be divided than are known today. Moreover, new biomarkers confirming existing classification schemes are desirable. Here, we interrogated the DNA methylation (DNAm) landscape as a novel indicator for discerning cancer subtypes.

We developed and applied an unsupervised method, methylSaguaro, which is based on the combination of a Hidden Markov Model and a Neural Net. We first compared the concept of hypothesizing patterns and grouping to statistical methods that require a priori hypotheses to perform enrichment tests. We then analyzed samples from four cancer groups, Gliomas, Chronic Lymphocytic Leukemia (CLL), Renal Cell Carcinomas (RCC), and Acute Myeloid Leukemia (AML). On gliomas and CLL, we confirmed known cancer groupings in DNAm that perfectly correspond to known mutations. On Renal Cell Carcinomas, our method disagrees with the histological classification on 4% of the samples, and finds a novel cluster, suggesting that there might be a novel subtype that was hitherto unknown. On AML, methylSaguaro spreads the samples out on a continuous spectrum, enriching one end with patients assessed as having “poor” risk based on cytogenetics, but indicating that DNAm patterns would suggest a different risk assessment. Since methylSaguaro reports both the patterns and the specific sites behind the signals, we analyzed regions and genes indicative of subtypes across the cancers, revealing 41 genes affected by alterations in more than one cancer. In summary, we expect that DNAm, coupled with a hypothesis-free analysis method, will add to the set of clinical instruments to diagnose, assess, and treat cancer.
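A figure such as the share of samples on which an unsupervised grouping disagrees with an existing classification can be computed in several ways. The Python sketch below shows one plausible convention and is not necessarily the one used in the paper: each cluster is assigned its majority label, and every sample carrying a different label counts as a disagreement. The function name `disagreement_rate` and the toy assignments are assumptions.

```python
from collections import Counter, defaultdict


def disagreement_rate(clusters, labels):
    """Fraction of samples whose label differs from the majority label of their cluster.

    clusters: {sample: cluster id} from an unsupervised method
    labels:   {sample: reference label}, e.g. a histological class
    """
    by_cluster = defaultdict(list)
    for sample, cluster in clusters.items():
        by_cluster[cluster].append(labels[sample])
    disagreeing = 0
    for members in by_cluster.values():
        # Size of the largest label group inside this cluster.
        majority_count = Counter(members).most_common(1)[0][1]
        disagreeing += len(members) - majority_count
    return disagreeing / len(clusters)


# Toy assignments with made-up sample names, for illustration only.
clusters = {"s1": 0, "s2": 0, "s3": 0, "s4": 1, "s5": 1}
labels = {"s1": "A", "s2": "A", "s3": "B", "s4": "B", "s5": "B"}

rate = disagreement_rate(clusters, labels)
print(f"disagreement: {rate:.0%}")
```

Under this convention a cluster with no clear majority label, such as a candidate novel subtype, would contribute many disagreements, so such clusters are better inspected separately.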

Keyword
unsupervised learning, DNA methylation, cancer subtyping
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-320676 (URN)
Available from: 2017-04-23 Created: 2017-04-23 Last updated: 2017-04-24
4. Analyzing DNA methylation patterns in Schizophrenic patients using machine learning methods
(English). Article in journal (Other academic), Submitted
Abstract [en]

Schizophrenia is a common mental disorder with a known genetic component. Since an association between environmental factors and schizophrenia has been reported, we analyzed a cohort of 75 schizophrenic and 50 control samples to investigate DNA methylation patterns, as DNA methylation is one of the key players in epigenetic gene regulation.

Here we applied machine learning and visualization methods that were previously shown to be successful in detecting and highlighting differentially methylated patterns between cases and controls. On this data set, however, these methods did not uncover any signal distinguishing schizophrenia patients from healthy controls, suggesting that if a link exists, it is heterogeneous and complex.

National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-320678 (URN)
Available from: 2017-04-23 Created: 2017-04-23 Last updated: 2017-05-02

Open Access in DiVA

fulltext (1236 kB), 175 downloads
File information
File name: FULLTEXT01.pdf
File size: 1236 kB
Checksum (SHA-512): 05e1bd345bfb27f12b2c7ace05f9e5cfc5a88d7d0f3ebb4bea9cc278a6be72accb6cb7a3fd46f15acad126196e5b75f2e54e744b5adb68adc7cbc1e72f15deab
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Torabi Moghadam, Behrooz
By organisation
Department of Cell and Molecular Biology
Science for Life Laboratory, SciLifeLab
Bioinformatics (Computational Biology)
