Analyzing large-scale DNA Sequences on Multi-core Architectures
2015 (English) In: Proceedings: IEEE 18th International Conference on Computational Science and Engineering, CSE 2015 / [ed] Plessl, C; ElBaz, D; Cong, G; Cardoso, JMP; Veiga, L; Rauber, T, IEEE Press, 2015, 208-215 p. Conference paper (Refereed)
Rapid analysis of DNA sequences is important in preventing the evolution of different viruses and bacteria during an early phase, in the early diagnosis of genetic predispositions to certain diseases (cancer, cardiovascular diseases), and in DNA forensics. However, real-world DNA sequences may comprise several gigabytes, and DNA analysis demands adequate computational resources to complete within a reasonable time. In this paper we present a scalable approach for parallel DNA analysis that is based on finite automata and is suitable for analysing very large DNA segments. We evaluate our approach on real-world DNA segments of mouse (2.7 GB), cat (2.4 GB), dog (2.4 GB), chicken (1 GB), human (3.2 GB) and turkey (0.2 GB). Experimental results on a dual-socket shared-memory system with 24 physical cores show speedups of up to 17.6x. Our approach is up to 3x faster than a pattern-based parallel approach that uses the RE2 library.
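The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of finite-automaton matching over independently scanned chunks, not the authors' implementation. The names `build_dfa`, `count_chunk` and `count_parallel` are hypothetical, the alphabet is assumed to be A, C, G, T, and the chunk-overlap scheme is an assumption chosen for simplicity. Threads are used only to show the decomposition; in standard CPython they will not give a real speedup for this kind of CPU-bound loop.

```python
from concurrent.futures import ThreadPoolExecutor


def build_dfa(pattern, alphabet="ACGT"):
    """Build a KMP-style DFA; state len(pattern) is the accepting state."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    dfa = [{c: 0 for c in alphabet} for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    x = 0  # restart state: where the DFA would be after a mismatch
    for j in range(1, m + 1):
        for c in alphabet:
            dfa[j][c] = dfa[x][c]          # mismatch transitions
        if j < m:
            dfa[j][pattern[j]] = j + 1     # match transition
            x = dfa[x][pattern[j]]
    return dfa


def count_chunk(text, start, end, dfa, m):
    """Count matches that END inside text[start:end].

    Scanning starts m - 1 symbols before `start` (an assumed overlap) so
    that matches straddling the chunk boundary are still seen.
    """
    state, count = 0, 0
    for i in range(max(0, start - (m - 1)), end):
        state = dfa[state].get(text[i], 0)  # unknown symbols (e.g. 'N') reset
        if state == m and i >= start:
            count += 1
    return count


def count_parallel(text, pattern, workers=4):
    """Split the text into chunks and scan each chunk with the same DFA."""
    m = len(pattern)
    dfa = build_dfa(pattern)
    size = max(1, -(-len(text) // workers))  # ceiling division
    bounds = [(s, min(s + size, len(text))) for s in range(0, len(text), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda b: count_chunk(text, b[0], b[1], dfa, m), bounds)
        return sum(counts)
```

Because each chunk counts only matches that end within its own range, no match is counted twice, and the chunked result equals a single sequential scan. Overlapping occurrences are counted separately.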
Place, publisher, year, edition, pages
IEEE Press, 2015. 208-215 p.
Keywords: parallel DNA analysis, multi-core architectures, finite automata
Research subject: Computer and Information Sciences Computer Science, Computer Science
Identifiers
URN: urn:nbn:se:lnu:diva-46199
DOI: 10.1109/CSE.2015.25
ISI: 000380496700028
ISBN: 978-1-4673-8297-7
OAI: oai:DiVA.org:lnu-46199
DiVA: diva2:852877
The 18th IEEE International Conference on Computational Science and Engineering (CSE 2015), Porto, Portugal, 20 - 23 October 2015