Digitala Vetenskapliga Arkivet

Domainoid: domain-oriented orthology inference
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab). ORCID iD: 0000-0003-0532-8251
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Number of Authors: 4
2019 (English)
In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 20, no 1, article id 523
Article in journal (Refereed) Published
Abstract [en]

Background: Orthology inference is normally based on full-length protein sequences. However, most proteins contain independently folding and recurring regions, domains. The domain architecture of a protein is vital for its function, and recombination events mean individual domains can have different evolutionary histories. It has previously been shown that orthologous proteins may differ in domain architecture, creating challenges for orthology inference methods operating on full-length sequences. We have developed Domainoid, a new tool aiming to overcome these challenges faced by full-length orthology methods by inferring orthology on the domain level. It employs the InParanoid algorithm on single domains separately, to infer groups of orthologous domains.
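The domain-level step described above can be illustrated with a minimal sketch, assuming domain annotations are already available as (protein, family, start, end) records with 1-based inclusive coordinates. The `DomainHit` record, the `extract_domains` helper, and the example sequences are hypothetical and are not taken from Domainoid itself; the sketch only shows how domain subsequences might be cut out and grouped so that each domain family can be passed separately to an orthology method such as InParanoid.

```python
from collections import defaultdict
from typing import NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical domain annotation: 1-based, inclusive coordinates."""
    protein_id: str
    family: str
    start: int
    end: int


def extract_domains(sequences, hits):
    """Group domain subsequences by family.

    Returns {family: {"<protein_id>/<start>-<end>": subsequence}}.
    """
    by_family = defaultdict(dict)
    for hit in hits:
        seq = sequences[hit.protein_id]
        if not (1 <= hit.start <= hit.end <= len(seq)):
            raise ValueError(f"domain out of range: {hit}")
        key = f"{hit.protein_id}/{hit.start}-{hit.end}"
        # Convert 1-based inclusive coordinates to a Python slice.
        by_family[hit.family][key] = seq[hit.start - 1:hit.end]
    return dict(by_family)


# Toy input (made-up sequences and domain boundaries).
sequences = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQ",
    "protB": "MKTAYIAKQRGGGSHFSRQLEE",
}
hits = [
    DomainHit("protA", "DomX", 1, 10),
    DomainHit("protA", "DomY", 15, 22),
    DomainHit("protB", "DomX", 1, 10),
]
domains = extract_domains(sequences, hits)
```

Each entry of `domains` is then an independent set of sequences for one domain family, which is the unit on which orthology would be inferred.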

Results: This domain-oriented approach allows detection of discordant domain orthologs, cases where different domains on the same protein have different evolutionary histories. In addition to domain level analysis, protein level orthology based on the fraction of domains that are orthologous can be inferred. Domainoid orthology assignments were compared to those yielded by the conventional full-length approach InParanoid, and were validated in a standard benchmark.
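As a rough illustration of the two ideas in this paragraph, the sketch below derives a protein-level score from domain-level ortholog pairs and flags proteins whose domains point to different partner proteins. The function names and the input layout are hypothetical. The choice to divide by the larger of the two domain counts is an assumption made for this example, not the definition used by Domainoid, and "maps to more than one partner protein" is only a simple proxy for discordance.

```python
from collections import defaultdict


def protein_level_support(domain_orthologs, domain_counts):
    """Fraction of domains that are orthologous, per protein pair.

    domain_orthologs: iterable of ((protein_a, domain_a), (protein_b, domain_b))
    domain_counts: {protein_id: number of annotated domains}
    Assumption: the fraction is relative to the protein with more domains.
    """
    shared = defaultdict(set)
    for (prot_a, dom_a), (prot_b, dom_b) in domain_orthologs:
        shared[(prot_a, prot_b)].add((dom_a, dom_b))
    return {
        pair: len(doms) / max(domain_counts[pair[0]], domain_counts[pair[1]])
        for pair, doms in shared.items()
    }


def discordant_proteins(domain_orthologs):
    """Proteins whose domains have orthologs on more than one partner protein."""
    partners = defaultdict(set)
    for (prot_a, _), (prot_b, _) in domain_orthologs:
        partners[prot_a].add(prot_b)
    return {p: sorted(bs) for p, bs in partners.items() if len(bs) > 1}


# Toy data: a1's two domains map to two different proteins (b1 and b2),
# while both domains of a2 map to the same protein b3.
pairs = [
    (("a1", "d1"), ("b1", "d1")),
    (("a1", "d2"), ("b2", "d1")),
    (("a2", "d1"), ("b3", "d1")),
    (("a2", "d2"), ("b3", "d2")),
]
counts = {"a1": 2, "a2": 2, "b1": 1, "b2": 1, "b3": 2}

support = protein_level_support(pairs, counts)
discordant = discordant_proteins(pairs)
```

A cutoff on `support` (for example, requiring all domains to agree) would then decide which protein pairs are reported as orthologs; any such cutoff is a design choice outside what the abstract specifies.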

Conclusions: Our results show that domain-based orthology inference can reveal many orthologous relationships that are not found by full-length sequence approaches.

Place, publisher, year, edition, pages
2019. Vol. 20, no 1, article id 523
Keywords [en]
Orthology, Domain ortholog, Protein domain
National Category
Biological Sciences
Research subject
Biochemistry towards Bioinformatics
Identifiers
URN: urn:nbn:se:su:diva-177520
DOI: 10.1186/s12859-019-3137-2
ISI: 000502350400001
PubMedID: 31660857
OAI: oai:DiVA.org:su-177520
DiVA, id: diva2:1383783
Available from: 2020-01-08 Created: 2020-01-08 Last updated: 2020-01-08
Bibliographically approved
In thesis
1. Functional Inference from Orthology and Domain Architecture
2018 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Proteins are the basic building blocks of all living organisms. They play a central role in determining the structure of living beings and are required for essential chemical reactions. One of the main challenges in bioinformatics is to characterize the function of all proteins. The problem of understanding protein function can be approached by understanding their evolutionary history. Orthology analysis plays an important role in studying the evolutionary relation of proteins. Proteins are termed orthologs if they derive from a single gene in the species' last common ancestor, i.e. if they were separated by a speciation event. Orthologs are useful because they retain their function more often than other homologs. 

Inference of a complete set of orthologs for many species is computationally intensive. Currently, the fastest algorithms rely on graph-based approaches, which compare all-vs-all sequences and then cluster top hits into groups of orthologs. The initial step of performing all-vs-all comparisons is usually the primary computational challenge as it scales quadratically with the number of species. 
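One simple instance of the graph-based idea is reciprocal best hits between two species, sketched below. This is a generic textbook technique named here for illustration, not the clustering procedure of any specific tool mentioned in this record. The score dictionaries stand in for the output of an all-vs-all similarity search, and all identifiers and scores are made up.

```python
def reciprocal_best_hits(scores_ab, scores_ba):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a.

    scores_ab: {protein_in_A: {protein_in_B: similarity_score}}
    scores_ba: {protein_in_B: {protein_in_A: similarity_score}}
    """
    def best(scores):
        # Highest-scoring hit for every query that has at least one hit.
        return {q: max(hits, key=hits.get) for q, hits in scores.items() if hits}

    best_ab, best_ba = best(scores_ab), best(scores_ba)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy similarity scores between species A (a1..a3) and species B (b1, b2).
scores_ab = {
    "a1": {"b1": 250.0, "b2": 90.0},
    "a2": {"b2": 120.0, "b1": 60.0},
    "a3": {"b1": 200.0},
}
scores_ba = {
    "b1": {"a1": 245.0, "a3": 198.0},
    "b2": {"a2": 118.0, "a1": 88.0},
}
rbh_pairs = reciprocal_best_hits(scores_ab, scores_ba)
```

Here `a3` hits `b1`, but `b1` prefers `a1`, so only the pairs (a1, b1) and (a2, b2) are kept. Running this for every species pair is what makes the number of searches grow quadratically with the number of species.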

A new, more scalable and less computationally demanding method was developed to solve this problem without sacrificing accuracy. The Hieranoid 2 algorithm reduces computational complexity to almost linear by overcoming the necessity to perform all-vs-all similarity searches. The algorithm progresses along a known species tree, from leaves to root. Starting at the leaves, ortholog groups are predicted conventionally and then summarized at internal nodes to form pseudo-species. These pseudo-species are then re-used to search against other (pseudo-)species higher in the tree. This way the algorithm aggregates new ortholog groups hierarchically. The hierarchy is a natural structure to store and view large multi-species ortholog groups, and provides a complete picture of inferred evolutionary events. 
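The leaf-to-root progression can be sketched as a post-order walk over a binary species tree. The code below is only a toy under stated assumptions: the tree, the `traverse` function, and the `record_and_merge` placeholder are hypothetical, and the merge step merely records which (pseudo-)species would be compared instead of predicting orthologs. It shows why the number of comparison steps is n - 1 for n species, against n(n - 1)/2 species pairs in an all-vs-all design.

```python
def traverse(tree, compare):
    """Post-order walk returning the (pseudo-)species at the root of `tree`.

    tree: a species name (leaf) or a tuple of two subtrees (internal node).
    compare: callable merging two (pseudo-)species into one pseudo-species.
    """
    if isinstance(tree, str):
        return tree
    left, right = tree
    return compare(traverse(left, compare), traverse(right, compare))


comparisons = []


def record_and_merge(a, b):
    # Placeholder for ortholog prediction between two (pseudo-)species.
    comparisons.append((a, b))
    return f"({a}+{b})"


# Hypothetical species tree with five leaves.
species_tree = ((("human", "mouse"), "chicken"), ("fly", "worm"))
root = traverse(species_tree, record_and_merge)

n_species = 5
tree_steps = len(comparisons)                     # one step per internal node
all_vs_all = n_species * (n_species - 1) // 2     # species pairs in all-vs-all
```

For these five species the walk performs 4 comparison steps, while an all-vs-all design covers 10 species pairs. This counts steps only and says nothing about the cost of each individual search.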

To facilitate explorative analysis of hierarchical groups of orthologs, a new online tool was created. The HieranoiDB website provides precomputed hierarchical groups of orthologs for a set of 66 species. It allows the user to search for orthology assignments using protein description, protein sequence, or species. Evolutionary events and meta information are added to the hierarchical groups of orthologs, which are shown graphically as interactive trees. This representation allows exploring, searching, and easier visual inspection of multi-species ortholog groups.

The majority of orthology prediction methods focus on treating the whole protein sequence as a single evolutionary unit. However, proteins are often composed of individual units, called protein domains, that can have different evolutionary histories. To extend the full sequence based methodology to a domain-aware method, a new approach called Domainoid is proposed. Here, domains are extracted from full-length sequences and subjected to orthology inference. This allows Domainoid to find orthology that would be missed by a full sequence approach.

Networks are a convenient graphical representation for showing a large number of functional associations between genes or proteins. They allow various analyses of graph properties, and can help visualize complex relationships. A framework for inferring comprehensive functional association networks was developed, called FunCoup. A major difference compared to other networks is FunCoup's extensive use of orthology relationships between species, which significantly boosts its coverage. Using naïve Bayesian classifiers to integrate 10 different evidence types and orthology transfer, FunCoup captures functional associations of many types, and provides comprehensive networks for 17 species across five gold-standards.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2018. p. 38
Keywords
Orthology, Functional coupling networks, Association networks, Hierarchical groups of orthologs
National Category
Bioinformatics (Computational Biology)
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-155096 (URN)
978-91-7797-252-5 (ISBN)
978-91-7797-253-2 (ISBN)
Public defence
2018-06-12, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 14:00 (English)
Opponent
Supervisors
Note

At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript.

Available from: 2018-05-18 Created: 2018-04-24 Last updated: 2020-05-08
Bibliographically approved

By author/editor
Persson, Emma
Kaduk, Mateusz
Sonnhammer, Erik L. L.