Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
A universal genomic coordinate translator for comparative genomics
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Biochemistry and Microbiology. Uppsala University, Science for Life Laboratory, SciLifeLab.
Show others and affiliations
2014 (English)In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 15, 227- p.Article in journal (Refereed) Published
Abstract [en]

Background: Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. Unambiguously identifying the orthology relationship between copies across multiple genomes can be resolved by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, which increases at N-2 with the number of available genomes, N. Results: Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across species. Conclusions: Kraken is a computational genome coordinate translator that facilitates cross-species comparisons, distinguishes orthologs from paralogs, and does not require costly all-to-all whole genome mappings. Kraken is freely available under LPGL from http://github.com/nedaz/kraken.

Place, publisher, year, edition, pages
2014. Vol. 15, 227- p.
Keyword [en]
Comparative genomics, Genomic coordinate translation, Genomic duplication, Cross-species gene expression analysis
National Category
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy) Bioinformatics (Computational Biology) Clinical Medicine
Identifiers
URN: urn:nbn:se:uu:diva-229443DOI: 10.1186/1471-2105-15-227ISI: 000338579100001OAI: oai:DiVA.org:uu-229443DiVA: diva2:736714
Available from: 2014-08-08 Created: 2014-08-07 Last updated: 2017-12-05Bibliographically approved

Open Access in DiVA

fulltext(1508 kB)162 downloads
File information
File name FULLTEXT01.pdfFile size 1508 kBChecksum SHA-512
0646fe77e3d565edbceb2fa95e14dbb23f68eb31aa20f428a71f82da23cefdec161a562adf36c1fc91c205a7fc7425917b70211e0b3c21259f2ec63ccdd643ad
Type fulltextMimetype application/pdf

Other links

Publisher's full text

Search in DiVA

By author/editor
Zamani, NedaSundström, GörelMeadows, Jennifer R. S.Höppner, Marc P.Dainat, JacquesLantz, HenrikGrabherr, Manfred G.
By organisation
Department of Medical Biochemistry and MicrobiologyScience for Life Laboratory, SciLifeLab
In the same journal
BMC Bioinformatics
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)Bioinformatics (Computational Biology)Clinical Medicine

Search outside of DiVA

GoogleGoogle Scholar
Total: 162 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 729 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf