Digitala Vetenskapliga Arkivet

Machine Learning-Based Approach to Identify Genetic Interactions
University of Skövde, School of Bioscience.
2024 (English)
Independent thesis, Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

It is well established that certain phenotypic outcomes result from interactions between multiple genes rather than from individual genes. Mapping all possible gene combinations to study genetic interactions is challenging with low-throughput experimental assays. While CRISPR screening provides a high-throughput approach for mapping genetic interactions, CRISPR screen data are limited by data quality, technical challenges, and the limitations of current analysis tools. Furthermore, CRISPR screens primarily focus on phenotypic outcomes, and additional methods based on transcriptomics and gene dependency failed to identify previously validated genetic interactions.

The first aim was to enhance the identification of genetic interactions. Rather than relying solely on expected and observed phenotypes, a random forest model was used to integrate multiple data types. The model was trained on experimentally validated gene pairs and outperformed the CRISPR-based, dependency-based, and RNA expression-based approaches in identifying validated genetic interactions. This underscores the value of integrating diverse datasets and using random forests to improve the identification of genetic interactions.

The second aim was to predict genetic interactions using predictive features from available databases. Analysis showed that the a-score (co-expression) and p-score (co-occurrence across organisms) from the STRING database were the most predictive. These features were used to train a random forest classifier on experimentally validated genetic interaction pairs. The classifier's predictions on the test set showed a high overlap with validated pairs, providing a reference set of gene pairs likely to be genetic interactions, which can be further validated through low-throughput experiments or CRISPR screens.
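For readers unfamiliar with the "expected and observed phenotypes" baseline that the abstract contrasts with the random forest approach, the following is a minimal sketch of one common way such a score is computed. It assumes an additive no-interaction model on log fold changes; the abstract does not state which model the thesis or the compared tools use. The function names, gene names, values, and threshold are all hypothetical and chosen only for illustration.

```python
# Minimal sketch (assumption: additive no-interaction model on log fold changes).
# All names, numbers, and the threshold below are hypothetical illustrations,
# not values or methods taken from the thesis.

def expected_additive(lfc_a, lfc_b):
    """Expected double-perturbation log fold change if genes A and B do not interact."""
    return lfc_a + lfc_b


def interaction_score(lfc_a, lfc_b, lfc_ab):
    """Observed minus expected phenotype; negative means worse than expected."""
    return lfc_ab - expected_additive(lfc_a, lfc_b)


def classify(score, threshold=0.5):
    """Label a score as a negative, positive, or absent interaction (arbitrary cutoff)."""
    if score <= -threshold:
        return "negative"
    if score >= threshold:
        return "positive"
    return "none"


# Hypothetical input: (single A, single B, double AB) log fold changes per gene pair.
pairs = {
    ("GENE_A", "GENE_B"): (-0.2, -0.3, -2.0),  # double knockout far worse than expected
    ("GENE_C", "GENE_D"): (-0.4, -0.1, -0.5),  # double knockout matches expectation
}

results = {
    pair: classify(interaction_score(*values)) for pair, values in pairs.items()
}

for pair, label in results.items():
    print(pair, label)
```

A score of this kind uses only the screen phenotype, which is the limitation the abstract points to when it motivates integrating additional data types in a random forest model.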

Place, publisher, year, edition, pages
2024, p. 57
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:his:diva-24937
OAI: oai:DiVA.org:his-24937
DiVA, id: diva2:1942281
External cooperation
Manuel Kaulich Group, Goethe University, Frankfurt
Subject / course
Bioinformatics
Supervisors
Examiners
Available from: 2025-03-04 Created: 2025-03-04 Last updated: 2025-03-04 Bibliographically approved

Open Access in DiVA

fulltext (2582 kB), 39 downloads
File information
File name: FULLTEXT01.pdf
File size: 2582 kB
Checksum (SHA-512): 146346c742498997aa2e55bd2ea85581ddfa8f543f6bf886803043c22f1c50dc7a4c531e073b5f4ad4a59f1836a0f71ead1497ef785810f80cc7e4c2b8effd76
Type: fulltext
Mimetype: application/pdf

