Identifying Clusters of High Confidence Homologies in Multiple Sequence Alignments
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Ecology and Genetics, Evolutionary Biology. Ghulam Ishaq Khan Inst Engn Sci & Technol, Fac Comp Sci & Engn, Topi, Pakistan.
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Ecology and Genetics, Evolutionary Biology. ORCID iD: 0000-0003-3056-3173
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Ecology and Genetics, Evolutionary Biology.
2019 (English). In: Molecular biology and evolution, ISSN 0737-4038, E-ISSN 1537-1719, Vol. 36, no 10, p. 2340-2351. Article in journal (Refereed), Published
Abstract [en]

Multiple sequence alignment (MSA) is ubiquitous in evolution and bioinformatics. MSAs are usually taken to be a known and fixed quantity on which to perform downstream analysis, despite extensive evidence that MSA accuracy and uncertainty affect results. Alignment errors are known to cause a wide range of problems for downstream evolutionary inference, ranging from false inference of positive selection to long branch attraction artifacts. The most popular approach to dealing with this problem is to remove (filter) specific columns in the MSA that are thought to be prone to error. Although popular, this approach has had mixed success, and several studies have even suggested that filtering might be detrimental to phylogenetic studies. We present a graph-based clustering method to address MSA uncertainty and error in the software Divvier (available at https://github.com/simonwhelan/Divvier), which uses a probabilistic model to identify clusters of characters that have strong statistical evidence of shared homology. These clusters can then be used to either filter characters from the MSA (partial filtering) or represent each of the clusters in a new column (divvying). We validate Divvier through its performance on real and simulated benchmarks, finding that Divvier substantially outperforms existing filtering software by retaining more true pairwise homology calls and removing more false positive pairwise homologies. We also find that Divvier, in contrast to other filtering tools, can alleviate long branch attraction artifacts induced by MSA and reduces the variation in tree estimates caused by MSA uncertainty.
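
The distinction the abstract draws between partial filtering and divvying can be illustrated with a minimal Python sketch for a single alignment column. This is not Divvier's code or algorithm: the function names, the `support` dictionary of pairwise homology scores, and the 0.9 threshold are all hypothetical, simple connected-components grouping is swapped in for the clustering procedure (which the abstract does not specify), and partial filtering is assumed here to mean keeping only the largest cluster.

```python
GAP = "-"

def cluster_column(column, support, threshold=0.9):
    """Group row indices of one column into clusters.

    `support` is a hypothetical dict mapping (row_a, row_b), row_a < row_b,
    to a pairwise homology score. Rows are linked when the score reaches
    `threshold`; clusters are the connected components (union-find).
    """
    rows = [i for i, c in enumerate(column) if c != GAP]
    parent = {i: i for i in rows}

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in rows:
        for b in rows:
            if a < b and support.get((a, b), 0.0) >= threshold:
                parent[find(b)] = find(a)

    clusters = {}
    for i in rows:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

def divvy(column, clusters):
    """Emit one new column per cluster; rows outside the cluster become gaps."""
    n = len(column)
    return ["".join(column[i] if i in cl else GAP for i in range(n))
            for cl in clusters]

def partial_filter(column, clusters):
    """Keep only the largest cluster (an assumption); gap out everything else."""
    if not clusters:
        return GAP * len(column)
    keep = max(clusters, key=len)
    return "".join(column[i] if i in keep else GAP for i in range(len(column)))

# Toy column: rows 0-2 are well supported as homologous, row 3 is not.
column = "AAAG-"
support = {(0, 1): 0.99, (1, 2): 0.95, (0, 3): 0.10}
clusters = cluster_column(column, support)
divvied = divvy(column, clusters)
filtered = partial_filter(column, clusters)
```

In this toy case divvying keeps every character but splits the column in two, whereas partial filtering discards the character in row 3.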

Place, publisher, year, edition, pages
Oxford University Press, 2019. Vol. 36, no 10, p. 2340-2351
Keywords [en]
multiple sequence alignment, filtering, homology, phylogenetic inference
National Category
Evolutionary Biology
Identifiers
URN: urn:nbn:se:uu:diva-400724
DOI: 10.1093/molbev/msz142
ISI: 000501734200020
PubMedID: 31209473
OAI: oai:DiVA.org:uu-400724
DiVA, id: diva2:1389040
Funder
Carl Tryggers foundation
Available from: 2020-01-28 Created: 2020-01-28 Last updated: 2020-01-28 Bibliographically approved

Open Access in DiVA

fulltext (555 kB), 14 downloads
File information
File name: FULLTEXT01.pdf
File size: 555 kB
Checksum (SHA-512): 5e8db6afc4e433d11fc325eeb0b82d9241bb0c414fec498abdfc23199e4ae52e33757ac9b75c23bc6936b653f2f0a28cf42c2020f12e964ad7983fb66ba7dfc4
Type: fulltext
Mimetype: application/pdf
By author/editor
Bogusz, Marcin; Whelan, Simon