Identifying Mitochondrial Genomes in Draft Whole-Genome Shotgun Assemblies of Six Gymnosperm Species
Stockholm University, Faculty of Science, Department of Mathematics.
2018 (English)
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Alternative title
Identifiering av mitokondriers arvsmassa från preliminära versioner av arvsmassan för sex gymnospermer (Swedish)
Abstract [en]

Sequencing efforts for gymnosperm genomes typically focus on nuclear and chloroplast DNA, and only three complete mitochondrial genomes had been published as of 2017. Additional mitochondrial genomes would aid the biological and evolutionary understanding of gymnosperms. Identifying mitochondrial DNA (mtDNA) in existing whole-genome sequencing (WGS) data (i.e. contigs) removes the need for additional experimental work, but previous classification methods show limitations in sensitivity or accuracy, particularly in difficult cases. In this thesis I present a classification pipeline based on (1) k-mer probability scoring and (2) SVM classification applied to the available contigs. Using this pipeline, the mitochondrial genomes of six gymnosperm species were obtained: Abies sibirica, Gnetum gnemon, Juniperus communis, Picea abies, Pinus sylvestris and Taxus baccata. Cross-validation experiments showed a satisfactory and, for some species, excellent degree of accuracy.
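The first step named in the abstract, k-mer probability scoring, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the choice of k, the add-one smoothing, and the two-model log-odds formulation are all assumptions made for illustration. The idea shown is to estimate k-mer probabilities from known mitochondrial and nuclear sequences and to score each contig by its mean log-odds under the two models. Such scores could then serve as input features for a classifier such as an SVM, which is not shown here.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_log_probs(seqs, k, pseudocount=1.0):
    """Estimate smoothed log-probabilities for all 4**k k-mers from training sequences."""
    counts = Counter()
    for s in seqs:
        counts.update(kmer_counts(s, k))
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts.values()) + pseudocount * len(all_kmers)
    return {km: math.log((counts[km] + pseudocount) / total) for km in all_kmers}


def log_odds_score(contig, mito_model, nuclear_model, k):
    """Mean per-k-mer log-odds; positive values favour the mitochondrial model."""
    counts = kmer_counts(contig, k)
    n = sum(counts.values())
    if n == 0:
        return 0.0  # no usable k-mers, so no evidence either way
    total = sum(c * (mito_model[km] - nuclear_model[km]) for km, c in counts.items())
    return total / n


if __name__ == "__main__":
    # Toy training data (hypothetical, not real genomic sequence).
    k = 3
    mito = kmer_log_probs(["ATATATATATATATAT"], k)
    nuclear = kmer_log_probs(["GCGCGCGCGCGCGCGC"], k)
    for contig in ["ATATAT", "GCGCGC"]:
        print(contig, log_odds_score(contig, mito, nuclear, k))
```

In practice the training sequences would be known organellar and nuclear sequences from related species, and k would be chosen by validation. The toy data above only demonstrates that the sign of the score follows the model the contig resembles.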

Abstract [sv]

Vid sekvensering av gymnospermers arvsmassa har fokus oftast lagts på kärn- och kloroplast-DNA. Bara tre fullständiga mitokondriegenom har publicerats hittills (2017). Fler mitokondriegenom skulle kunna leda till nya kunskaper om gymnospermers biologi och evolution. Då mitokondriernas arvsmassa identifieras från tillgängliga sekvenser för hela organismen (så kallade “contiger”) behövs inget ytterligare laboratoriearbete, men detta förfarande har visat sig leda till bristfällig känslighet och korrekthet, särskilt i svåra fall. I denna avhandling presenterar jag en metod baserad på (1) k-mer-sannolikheter och (2) SVM-klassificering applicerad på de tillgängliga contigerna. Med denna metod togs arvsmassan för mitokondrien hos sex gymnospermer fram: Abies sibirica, Gnetum gnemon, Juniperus communis, Picea abies, Pinus sylvestris och Taxus baccata. Korsvalideringsexperiment visade en tillfredsställande och för vissa arter utmärkt precision.

Place, publisher, year, edition, pages
2018. p. 138
Keywords [en]
machine-learning, classification, genome, plant, mitochondria, svm, contigs, gymnosperm
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:su:diva-175410
OAI: oai:DiVA.org:su-175410
DiVA, id: diva2:1365513
Available from: 2019-10-28
Created: 2019-10-25
Last updated: 2019-10-28
Bibliographically approved

Open Access in DiVA

fulltext (15978 kB), 19 downloads
File information
File name: FULLTEXT01.pdf
File size: 15978 kB
Checksum (SHA-512): 0c28d43644229bdf62847a42866e55e513d23123e48d81f8e58e799bfcbc802caa75cb604cc4dde35e78a97d307a1fe308d116f85b3abb30fba9fab8f5e49a19
Type: fulltext
Mimetype: application/pdf
