Computational Modelling of Gene Regulation in Cancer: Coding the noncoding genome
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational Biology and Bioinformatics.
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Technological advancements have enabled quantification of processes within and around us. The information stored within our body converts into petabytes of data. Processing and learning from such data requires comprehensive computational programs and software systems. We developed software programs to systematically investigate the process of gene regulation in the human genome. Gene regulation is a complex process where several genomic elements control expression of a gene through recruiting many transcription factor (TF) proteins. The TFs recognize specific DNA sequences known as motifs. DNA mutations in regulatory elements and particularly in TF motifs may cause gene deregulation. Therefore, defining the landscape of regulatory elements and their roles in cancer and complex diseases is of major importance.

We developed an algorithm (tfNet) to identify regulatory elements based on transcription factor binding sites. tfNet identified nearly 144,000 regulatory elements in five human cell lines. By investigating these elements, we identified TF interaction networks and an enrichment of many GWAS SNPs. We also defined the regulatory landscape for other conditions and species. Next, we investigated the role of regulatory elements in cancer. Cancer is initiated and developed by genetic aberrations in the genome. The genetic changes present in a cancer genome are obtained through whole-genome sequencing technologies. We analyzed somatic mutations that had been detected in 326 whole genomes of liver cancer patients. Our results indicated 907 candidate mutations affecting TF motifs. Genome-wide alignment of the mutated motifs revealed a significant enrichment of mutations in a highly conserved position of the CTCF motif. Gene expression analysis showed disruption of topologically associating domains in the mutated samples. We also confirmed the mutational pattern in pancreatic, gastric and esophageal cancers. Finally, enrichment of cancer-associated gene sets and pathways suggested a major role for noncoding mutations in cancer.
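The abstract does not say which statistical test was used to detect the enrichment at one motif position. As a minimal sketch of how such an enrichment could be checked, the example below applies a one-sided binomial test per position, assuming that mutations would otherwise fall uniformly across the motif. The function names and the `toy_counts` values are invented for illustration and are not data from the thesis.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def position_enrichment(counts: list[int]) -> list[float]:
    """One-sided binomial p-value for each motif position.

    Assumption: under the null, each mutation lands on any of the
    len(counts) positions with equal probability.
    """
    n = sum(counts)
    p = 1.0 / len(counts)
    return [binomial_tail(k, n, p) for k in counts]


# Invented mutation counts for an 8-position motif (illustration only).
toy_counts = [2, 1, 3, 2, 14, 1, 2, 3]
p_values = position_enrichment(toy_counts)
most_enriched = min(range(len(p_values)), key=p_values.__getitem__)
```

A real analysis would also need a sequence-aware background mutation rate and a correction for testing many positions and motifs; neither is modelled here.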

To systematically analyze DNA mutations in TF motifs, we developed an online database system (funMotifs). Publicly available datasets were collected for thousands of experiments. The datasets were integrated using a logistic regression model. Functionality annotations and scores for motifs of 519 TFs were derived. The database allows for identification of variants affecting functional motifs in a selected tissue type. Finally, a comprehensive analysis was performed to identify mutations overlapping functional TF motifs in 37 cancer types. Somatic mutations from a pan-cancer cohort of 2,515 cancer whole genomes were investigated. A significant enrichment of mutations in the CpG site of the CEBPB motif was identified. Overall, 10,806 mutated regulatory elements were identified, including 406 highly recurrent ones. Genes associated with the mutated elements were highly enriched for cancer-related pathways. Our analyses provide further insights into the role of regulatory elements and their impact on cancer development.
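To illustrate how a logistic regression model can turn several annotation tracks into a single functionality score, the sketch below evaluates the logistic function on a weighted sum of features. The feature names, weights and intercept are hypothetical placeholders chosen for the example; the abstract does not list the features or coefficients used in funMotifs.

```python
import math


def logistic_score(features: dict[str, float],
                   weights: dict[str, float],
                   intercept: float) -> float:
    """Logistic model: sigmoid(intercept + sum of weight * feature value)."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (illustration only, not fitted to any data).
WEIGHTS = {"tf_chip_peak": 2.0, "open_chromatin": 1.5, "tf_expressed": 1.0}
INTERCEPT = -3.0

# A motif instance supported by all three hypothetical annotations.
strong = logistic_score(
    {"tf_chip_peak": 1.0, "open_chromatin": 1.0, "tf_expressed": 1.0},
    WEIGHTS, INTERCEPT)

# A motif instance where only the TF expression annotation is present.
weak = logistic_score(
    {"tf_chip_peak": 0.0, "open_chromatin": 0.0, "tf_expressed": 1.0},
    WEIGHTS, INTERCEPT)
```

In practice the coefficients would be learned from labelled training data rather than set by hand, and the resulting score lies between 0 and 1, so a threshold is still needed to call a motif functional in a given tissue.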

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2018, p. 54
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1627
Keywords [en]
Regulatory elements, gene regulation, cancer, motif, integrative database, software solutions for cancer data
National Category
Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
URN: urn:nbn:se:uu:diva-339937
ISBN: 978-91-513-0220-1 (print)
OAI: oai:DiVA.org:uu-339937
DiVA, id: diva2:1177092
Public defence
2018-03-14, A1:111a, BMC, Husargatan 3, 09:00 (English)
Available from: 2018-02-21 Created: 2018-01-24 Last updated: 2018-03-07
List of papers
1. Maps of context-dependent putative regulatory regions and genomic signal interactions
2016 (English). In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 44, no 19, p. 9110-9120. Article in journal (Refereed). Published
Abstract [en]

Gene transcription is regulated mainly by transcription factors (TFs). ENCODE and Roadmap Epigenomics provide global binding profiles of TFs, which can be used to identify regulatory regions. To this end we implemented a method to systematically construct cell-type and species-specific maps of regulatory regions and TF-TF interactions. We illustrated the approach by developing maps for five human cell-lines and two other species. We detected ~144k putative regulatory regions among the human cell-lines, with the majority of them being ~300 bp. We found ~20k putative regulatory elements in the ENCODE heterochromatic domains, suggesting a large regulatory potential in the regions presumed transcriptionally silent. Among the most significant TF interactions identified in the heterochromatic regions were CTCF and the cohesin complex, which is in agreement with previous reports. Finally, we investigated the enrichment of the obtained putative regulatory regions in the 3D chromatin domains. More than 90% of the regions were discovered in the 3D contacting domains. We found a significant enrichment of GWAS SNPs in the putative regulatory regions. These significant enrichments provide evidence that the regulatory regions play a crucial role in the genomic structural stability. Additionally, we generated maps of putative regulatory regions for prostate and colorectal cancer human cell-lines.
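The abstract describes building regulatory regions from TF binding profiles but does not spell out the procedure. One simple way to picture the idea is to merge overlapping binding intervals and record which TFs occur together in each merged region. The sketch below does exactly that; it is an assumption-laden illustration, not the published method, and the function name, the `max_gap` parameter and the example peaks are all hypothetical.

```python
from typing import FrozenSet, Iterable, List, Tuple

# (chromosome, start, end, TF name); half-open coordinates as in BED files.
Peak = Tuple[str, int, int, str]
Region = Tuple[str, int, int, FrozenSet[str]]


def merge_peaks(peaks: Iterable[Peak], max_gap: int = 0) -> List[Region]:
    """Merge peaks that overlap (or lie within max_gap bp) on one chromosome.

    Each merged region keeps the set of TFs whose peaks contributed to it,
    which gives a crude view of TFs that bind together.
    """
    merged: list = []
    for chrom, start, end, tf in sorted(peaks, key=lambda p: (p[0], p[1], p[2])):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2] + max_gap:
            last[2] = max(last[2], end)   # extend the current region
            last[3].add(tf)
        else:
            merged.append([chrom, start, end, {tf}])  # start a new region
    return [(c, s, e, frozenset(tfs)) for c, s, e, tfs in merged]


# Invented peaks (illustration only).
example_peaks = [
    ("chr1", 100, 200, "CTCF"),
    ("chr1", 150, 300, "RAD21"),
    ("chr1", 500, 600, "CEBPB"),
    ("chr2", 100, 200, "CTCF"),
]
regions = merge_peaks(example_peaks)
```

Counting how often two TFs share a merged region, and comparing that count with a shuffled background, would be one possible next step towards a TF-TF interaction map.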

National Category
Biochemistry and Molecular Biology
Identifiers
urn:nbn:se:uu:diva-310761 (URN)
10.1093/nar/gkw800 (DOI)
000388016900012 ()
27625394 (PubMedID)
Funder
AstraZeneca; Swedish Research Council; Swedish Diabetes Association; eSSENCE - An eScience Collaboration
Note

The first two authors share first authorship.

Available from: 2016-12-19 Created: 2016-12-19 Last updated: 2018-01-25. Bibliographically approved
2. A Significant Regulatory Mutation Burden at a High-Affinity Position of the CTCF Motif in Gastrointestinal Cancers
2016 (English). In: Human Mutation, ISSN 1059-7794, E-ISSN 1098-1004, Vol. 37, no 9, p. 904-913. Article in journal (Refereed). Published
Abstract [en]

Somatic mutations drive cancer, and there are established ways to study them in coding sequences. It has been shown that some regulatory mutations are over-represented in cancer. We develop a new strategy to find putative regulatory mutations based on experimentally established motifs for transcription factors (TFs). In total, we find 1,552 candidate regulatory mutations predicted to significantly reduce the binding affinity of many TFs in hepatocellular carcinoma, with mutations affecting CTCF binding also found in esophageal, gastric, and pancreatic cancers. Near mutated motifs, there is a significant enrichment of (1) genes mutated in cancer, (2) tumor-suppressor genes, (3) genes in KEGG cancer pathways, and (4) sets of genes previously associated with cancer. Experimental and functional validations support the findings. The strategy can be applied to identify regulatory mutations in any cell type with established TF motifs and will aid the identification of genes contributing to cancer.
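A common way to estimate whether a mutation weakens a TF motif is to score the reference and mutated sequences against a position weight matrix and compare the log-odds scores. The sketch below shows that calculation under simple assumptions (uniform background, a fixed pseudocount). The abstract does not give the scoring details used in the paper, and the three-position `TOY_PFM` is invented for illustration, not a real TF motif.

```python
import math

BASES = "ACGT"


def log_odds_score(seq: str, pfm: list[dict[str, int]],
                   background: float = 0.25, pseudocount: float = 1.0) -> float:
    """Sum of per-position log2(probability / background) for seq."""
    score = 0.0
    for pos, base in enumerate(seq):
        counts = pfm[pos]
        total = sum(counts.values()) + pseudocount * len(BASES)
        prob = (counts[base] + pseudocount) / total
        score += math.log2(prob / background)
    return score


def score_change(ref_seq: str, offset: int, alt_base: str,
                 pfm: list[dict[str, int]]) -> float:
    """Mutant score minus reference score; negative means a weaker match."""
    alt_seq = ref_seq[:offset] + alt_base + ref_seq[offset + 1:]
    return log_odds_score(alt_seq, pfm) - log_odds_score(ref_seq, pfm)


# Invented position frequency matrix: two informative positions and one
# uninformative position (illustration only).
TOY_PFM = [
    {"A": 18, "C": 0, "G": 1, "T": 1},
    {"A": 1, "C": 17, "G": 1, "T": 1},
    {"A": 5, "C": 5, "G": 5, "T": 5},
]

delta_conserved = score_change("ACG", 0, "G", TOY_PFM)  # hits a conserved base
delta_neutral = score_change("ACG", 2, "T", TOY_PFM)    # hits a neutral base
```

A mutation at a highly conserved position produces a large negative change, while a mutation at an uninformative position leaves the score unchanged, which is why the position within the motif matters when prioritising candidates.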

Keywords
mutated binding sites, motifs, noncoding regulatory regions, CTCF, driver mutations, whole-genome sequencing, WGS
National Category
Medical and Health Sciences
Identifiers
urn:nbn:se:uu:diva-305547 (URN)
10.1002/humu.23014 (DOI)
000382777100009 ()
27174533 (PubMedID)
Funder
Swedish Cancer Society, 15 0878; Swedish Research Council; eSSENCE - An eScience Collaboration, DEC 2015/16/W/NZ2/00314
Available from: 2016-10-20 Created: 2016-10-19 Last updated: 2018-01-25. Bibliographically approved
3. funMotifs: Tissue-specific transcription factor motifs
(English). Manuscript (preprint) (Other academic)
Keywords
noncoding genome, transcription factor motifs, database, annotation, genetic variants
National Category
Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-339915 (URN)
Available from: 2018-01-24 Created: 2018-01-24 Last updated: 2018-01-25
4. Functional annotation of noncoding mutations identifies candidate regulatory aberrations in cancer
(English). Manuscript (preprint) (Other academic)
Keywords
noncoding genome, transcription factor motifs, regulatory elements, cancer
National Category
Bioinformatics and Systems Biology; Medical Genetics
Research subject
Bioinformatics; Medical Genetics
Identifiers
urn:nbn:se:uu:diva-339913 (URN)
Available from: 2018-01-24 Created: 2018-01-24 Last updated: 2018-01-25

Open Access in DiVA

fulltext (1080 kB), 68 downloads
File information
File name: FULLTEXT01.pdf
File size: 1080 kB
Checksum (SHA-512): 1e1b1c41ccc8a613be3c651d82596975b7840276e623d9c524557b35cc74d29ec9afa24eb6da14c469e7b34dbe0de2a01d7068915370d97fb311fa3b62702daa
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Umer, Husen Muhammad
By organisation
Computational Biology and Bioinformatics
Bioinformatics (Computational Biology)

Total: 68 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 396 hits