Predictive Healthcare: Cervical Cancer Screening Risk Stratification and Genetic Disease Markers
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, Computational Biology and Bioinformatics.
ORCID iD: 0000-0001-8505-403x
2019 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The use of Machine Learning is rapidly expanding into previously uncharted waters. In the medical field, vast troves of data from hospitals, biobanks and registries are now being explored thanks to tremendous advances in computer science and its related hardware. Progress in genomic extraction and analysis has made it possible for any individual to know their own genetic code. Genetic testing has become affordable and can be used as a tool in the treatment, discovery, and prognosis of individuals in a wide variety of healthcare settings. This thesis addresses three different approaches towards predictive healthcare and disease exploration. First, the exploitation of diagnostic data in Nordic screening programmes to identify individuals at high risk of developing cervical cancer, so that their screening schedules can be intensified in search of new disease developments. Second, the search for genomic markers that can be used either as additions to diagnostic data for risk predictions or as candidates for further functional analysis. Third, the development of a Machine Learning pipeline called ||-ROSETTA that can effectively process large datasets in the search for common patterns. Together, these provide a functional approach to predictive healthcare that allows intervention at early stages of disease development, resulting in treatments with reduced health consequences and a lower financial burden.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2019, p. 62
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1862
Keywords [en]
Bioinformatics, Cervical Cancer, Screening, Computer Science, Algorithmics, Machine Learning, Genetics, SNPs, Rough Sets
National Category
Bioinformatics and Systems Biology
Research subject
Bioinformatics
Identifiers
URN: urn:nbn:se:uu:diva-394293
ISBN: 978-91-513-0768-8 (print)
OAI: oai:DiVA.org:uu-394293
DiVA, id: diva2:1358393
Public defence
2019-11-28, Room A1:111, BMC, Husargatan 3, Uppsala, 09:15 (English)
Available from: 2019-11-06 Created: 2019-10-07 Last updated: 2019-11-27
List of papers
1. Risk stratification in cervical cancer screening by complete screening history: Applying bioinformatics to a general screening population
2017 (English). In: International Journal of Cancer, ISSN 0020-7136, E-ISSN 1097-0215, Vol. 141, no 1, p. 200-209. Article in journal (Refereed). Published
Abstract [en]

Women screened for cervical cancer in Sweden are currently treated under a one-size-fits-all programme, which has been successful in reducing the incidence of cervical cancer but does not use all of the participants' available medical information. This study aimed to use women's complete cervical screening histories to identify diagnostic patterns that may indicate an increased risk of developing cervical cancer. A nationwide case-control study was performed where cervical cancer screening data from 125,476 women with a maximum follow-up of 10 years were evaluated for patterns of SNOMED diagnoses. The cancer development risk was estimated for a number of different screening history patterns and expressed as Odds Ratios (OR), with a history of 4 benign cervical tests as reference, using logistic regression. The overall performance of the model was moderate (64% accuracy, 71% area under curve) with 61-62% of the study population showing no specific patterns associated with risk. However, predictions for high-risk groups as defined by screening history patterns were highly discriminatory with ORs ranging from 8 to 36. The model for computing risk performed consistently across different screening history lengths, and several patterns predicted cancer outcomes. The results show the presence of risk-increasing and risk-decreasing factors in the screening history. Thus it is feasible to identify subgroups based on their complete screening histories. Several high-risk subgroups identified might benefit from an increased screening density. Some low-risk subgroups identified could likely have a moderately reduced screening density without additional risk.
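
As a rough illustration of how an odds ratio against a reference group is read, the sketch below computes an unadjusted OR and a Wald confidence interval from a 2x2 table of cases and controls. This is a simplification: the study fitted a logistic regression, whereas a 2x2 table only matches a model with a single categorical predictor. The function names and all counts are hypothetical and are not taken from the paper.

```python
import math


def odds_ratio(cases_pattern, controls_pattern, cases_ref, controls_ref):
    """Unadjusted odds ratio of a screening-history pattern vs. the reference group."""
    odds_pattern = cases_pattern / controls_pattern
    odds_ref = cases_ref / controls_ref
    return odds_pattern / odds_ref


def wald_ci(cases_pattern, controls_pattern, cases_ref, controls_ref, z=1.96):
    """Approximate 95% confidence interval for the odds ratio (Wald, log scale)."""
    log_or = math.log(
        odds_ratio(cases_pattern, controls_pattern, cases_ref, controls_ref)
    )
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(
        1 / cases_pattern + 1 / controls_pattern + 1 / cases_ref + 1 / controls_ref
    )
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Hypothetical counts: 40 cases / 10 controls with the pattern,
    # 100 cases / 200 controls in the reference group (e.g. 4 benign tests).
    print(odds_ratio(40, 10, 100, 200))
    print(wald_ci(40, 10, 100, 200))
```

An OR well above 1 with an interval that excludes 1 marks a pattern group whose odds of being a case are higher than in the reference group.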

Keywords
bioinformatics, cervical cancer, screening, personalized medicine, machine learning
National Category
Cancer and Oncology; Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-323754 (URN); 10.1002/ijc.30725 (DOI); 000400766500021 (); 28383102 (PubMedID)
Available from: 2017-06-12 Created: 2017-06-12 Last updated: 2019-10-07. Bibliographically approved
2. Allele specific chromatin signals, 3D interactions, and motif predictions for immune and B cell related diseases
2019 (English). In: Scientific Reports, ISSN 2045-2322, E-ISSN 2045-2322, Vol. 9, article id 2695. Article in journal (Refereed). Published
Abstract [en]

Several Genome Wide Association Studies (GWAS) have reported variants associated to immune diseases. However, the identified variants are rarely the drivers of the associations and the molecular mechanisms behind the genetic contributions remain poorly understood. ChIP-seq data for TFs and histone modifications provide snapshots of protein-DNA interactions allowing the identification of heterozygous SNPs showing significant allele specific signals (AS-SNPs). AS-SNPs can change a TF binding site resulting in altered gene regulation and are primary candidates to explain associations observed in GWAS and expression studies. We identified 17,293 unique AS-SNPs across 7 lymphoblastoid cell lines. In this set of cell lines we interrogated 85% of common genetic variants in the population for potential regulatory effect and we identified 237 AS-SNPs associated to immune GWAS traits and 714 to gene expression in B cells. To elucidate possible regulatory mechanisms we integrated long-range 3D interactions data to identify putative target genes and motif predictions to identify TFs whose binding may be affected by AS-SNPs yielding a collection of 173 AS-SNPs associated to gene expression and 60 to B cell related traits. We present a systems strategy to find functional gene regulatory variants, the TFs that bind differentially between alleles and novel strategies to detect the regulated genes.
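
One common way to flag an allele-specific signal at a heterozygous SNP is to test whether the reads covering the site split evenly between the two alleles. The sketch below uses an exact two-sided binomial test with a null probability of 0.5. The abstract does not state which test the authors used, so this is an assumed stand-in, and the function name and read counts are hypothetical.

```python
from math import comb


def allele_binomial_p(ref_reads, alt_reads):
    """Exact two-sided binomial p-value for a 50/50 split of reads between alleles."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no coverage, no evidence of imbalance
    k = max(ref_reads, alt_reads)
    # Upper tail P(X >= k) under Binomial(n, 0.5); the null is symmetric,
    # so the two-sided p-value is twice the tail, capped at 1.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical read counts at three heterozygous sites.
    for ref, alt in [(5, 5), (18, 4), (10, 0)]:
        print(ref, alt, allele_binomial_p(ref, alt))
```

In a genome-wide scan the resulting p-values would still need multiple-testing correction before a site is called significant.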

Place, publisher, year, edition, pages
NATURE PUBLISHING GROUP, 2019
National Category
Medical Genetics
Identifiers
urn:nbn:se:uu:diva-379258 (URN); 10.1038/s41598-019-39633-0 (DOI); 000459571100059 (); 30804403 (PubMedID)
Funder
Swedish Research Council, 78081; Swedish National Infrastructure for Computing (SNIC); EXODIAB - Excellence of Diabetes Research in Sweden; Swedish Diabetes Association; Ernfors Foundation; Swedish Cancer Society, 160518; German Research Foundation (DFG), GR-3526/1; German Research Foundation (DFG), GR-3526/2
Available from: 2019-03-15 Created: 2019-03-15 Last updated: 2019-10-07. Bibliographically approved
3. Risk Stratification in Cervical Cancer Screening – Validation and Generalization of a Data-driven Screening Recall Model
(English). Manuscript (preprint) (Other academic)
Keywords
Cervical Cancer, Screening, Classification, Bioinformatics, Rough Sets
National Category
Bioinformatics and Systems Biology
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-394291 (URN)
Available from: 2019-10-07 Created: 2019-10-07 Last updated: 2019-10-07
4. Studies of liver tissue identify functional gene regulatory elements associated to gene expression, type 2 diabetes, and other metabolic diseases
2019 (English). In: Human Genomics, ISSN 1473-9542, Vol. 13, article id 20. Article in journal (Refereed). Published
Abstract [en]

Background:

Genome-wide association studies (GWAS) of diseases and traits have found associations to gene regions but not the functional SNP or the gene mediating the effect. Difference in gene regulatory signals can be detected using chromatin immunoprecipitation and next-gen sequencing (ChIP-seq) of transcription factors or histone modifications by aligning reads to known polymorphisms in individual genomes. The aim was to identify such regulatory elements in the human liver to understand the genetics behind type 2 diabetes and metabolic diseases.

Methods:

The genome of liver tissue was sequenced using 10X Genomics technology to call polymorphic positions. Using ChIP-seq for two histone modifications, H3K4me3 and H3K27ac, and the transcription factor CTCF, and our established bioinformatics pipeline, we detected sites with significant difference in signal between the alleles.

Results:

We detected 2329 allele-specific SNPs (AS-SNPs) including 25 associated to GWAS SNPs linked to liver biology, e.g., 4 AS-SNPs at two type 2 diabetes loci. Two hundred ninety-two AS-SNPs were associated to liver gene expression in GTEx, and 134 AS-SNPs were located on 166 candidate functional motifs and most of them in EGR1-binding sites.

Conclusions:

This study provides a valuable collection of candidate liver regulatory elements for further experimental validation.

Keywords
ChIP-seq, T2D, Regulatory SNPs
National Category
Medical Genetics; Bioinformatics and Systems Biology
Identifiers
urn:nbn:se:uu:diva-383513 (URN); 10.1186/s40246-019-0204-8 (DOI); 000466335200001 (); 31036066 (PubMedID)
Available from: 2019-05-16 Created: 2019-05-16 Last updated: 2019-10-07. Bibliographically approved
5. ||-ROSETTA
(English). Manuscript (preprint) (Other academic)
Keywords
bioinformatics, Rough Sets
National Category
Computer Sciences; Bioinformatics (Computational Biology)
Research subject
Bioinformatics; Computer Science
Identifiers
urn:nbn:se:uu:diva-393477 (URN)
Available from: 2019-10-07 Created: 2019-10-07 Last updated: 2019-10-07

Author
Baltzer, Nicholas