BESST - Efficient scaffolding of large fragmented assemblies
Stockholm University, Science for Life Laboratory (SciLifeLab). KTH.
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics. Stockholm University, Science for Life Laboratory (SciLifeLab).
KTH. (Science for Life Laboratory, School of Biotechnology, Division of Gene Technology)
2014 (English). In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 15, no 1, 281- p. Article in journal (Refereed). Epub ahead of print
Abstract [en]

Background

The use of short reads from High Throughput Sequencing (HTS) techniques is now commonplace in de novo assembly. Yet obtaining contiguous assemblies from short reads is challenging, which makes scaffolding an important step in the assembly pipeline. Different algorithms have been proposed, but many of them use the number of read pairs supporting a link between two contigs as an indicator of reliability. This reasoning is intuitive but fails to account for variation in link count due to contig features.
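To illustrate why raw link counts are hard to compare across contig pairs, here is a minimal sketch under a deliberately simplified model. It is not the model used by BESST: it assumes a single fixed insert size, read pairs placed uniformly along the genome, and negligible read length. All function and parameter names are hypothetical.

```python
def spanning_window(len_a, len_b, gap, insert_size):
    """Number of start offsets in contig A (measured from A's right end)
    for which the mate lands inside contig B.

    Assumes a fixed insert size and ignores read length.
    """
    # The mate lands at distance (insert_size - u) past A's right end,
    # where u is the distance of the first read from A's right end.
    # It falls in B when gap <= insert_size - u < gap + len_b.
    hi = min(len_a, insert_size - gap)
    lo = max(0, insert_size - gap - len_b)
    return max(0, hi - lo)


def expected_links(len_a, len_b, gap, insert_size, pairs_per_bp):
    """Expected number of linking pairs if pairs start uniformly at
    pairs_per_bp per base (an assumed sampling density)."""
    return pairs_per_bp * spanning_window(len_a, len_b, gap, insert_size)


if __name__ == "__main__":
    # Same gap, same library, same sampling density; only contig lengths differ.
    for len_a, len_b in [(5000, 5000), (500, 5000), (500, 500)]:
        n = expected_links(len_a, len_b, gap=100, insert_size=3000,
                           pairs_per_bp=0.01)
        print(f"contigs {len_a} bp and {len_b} bp: expected links = {n}")
```

Under these assumptions, two truly adjacent contigs can be expected to share many links, few links, or none at all depending only on their lengths, so a fixed threshold on link count treats correct joins unequally.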

We have also noted that published scaffolders are only evaluated on small datasets using output from only one assembler. Two issues arise from this. Firstly, some of the available tools are not well suited for complex genomes. Secondly, these evaluations provide little support for inferring a software’s general performance. 

Results

We propose a new algorithm, implemented in a tool called BESST, which can scaffold genomes of all sizes and complexities and was used to scaffold the genome of P. abies (20 Gbp). We performed a comprehensive comparison of BESST against the most popular stand-alone scaffolders on a large variety of datasets. Our results confirm that some of the popular scaffolders are not practical to run on complex datasets. Furthermore, no single stand-alone scaffolder outperforms the others on all datasets. However, BESST compares favorably with the other tested scaffolders on GAGE datasets and, moreover, outperforms the other methods when the library insert size distribution is wide.

Conclusion

We conclude from our results that information sources other than the quantity of links, as is commonly used, can provide useful information about genome structure when scaffolding. 

Place, publisher, year, edition, pages
BioMed Central, 2014. Vol. 15, no 1, 281- p.
Keyword [en]
Genome assembly, Scaffolding, Genome analysis, Mate pair next-generation sequencing
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:su:diva-106778
DOI: 10.1186/1471-2105-15-281
ISI: 000341198900001
OAI: oai:DiVA.org:su-106778
DiVA: diva2:738943
Funder
Swedish e‐Science Research Center
Swedish Research Council, 2010-4634
Knut and Alice Wallenberg Foundation
Available from: 2014-08-19 Created: 2014-08-19 Last updated: 2017-12-05 Bibliographically approved

Open Access in DiVA

Provisional PDF (208 kB), 112 downloads
File information
File name: FULLTEXT01.pdf
File size: 208 kB
Checksum (SHA-512): a841a1d1e9577f55ac82407df881551fbe4caf4c304530c845e47bf8578046437521009b663abe5cb20d2151bb206908b5c4313c08addef2e5e7c6ccbc6f2ce4
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Arvestad, Lars
By organisation
Science for Life Laboratory (SciLifeLab)
Department of Biochemistry and Biophysics
Numerical Analysis and Computer Science (NADA)
In the same journal
BMC Bioinformatics
Bioinformatics (Computational Biology)