HaploPOP: a software that improves population assignment by combining markers into haplotypes
2015 (English). In: BMC Bioinformatics, ISSN 1471-2105, Vol. 16, 242. Article in journal (Refereed), Published
Background: In ecology and forensics, some population assignment techniques use molecular markers to assign individuals to known groups. However, assigning individuals to known populations can be difficult if the level of genetic differentiation among populations is small. Most assignment studies handle independent markers, often by pruning markers in linkage disequilibrium (LD), and thereby ignore the information contained in the correlation among markers due to LD.

Results: To improve the accuracy of population assignment, we present an algorithm, implemented in the HaploPOP software, that combines markers into haplotypes without requiring independence. The algorithm is based on the Gain of Informativeness for Assignment, a measure used to decide whether a pair of markers should be combined into a haplotype in order to improve assignment. Because complete exploration of all possible solutions for constructing haplotypes is computationally prohibitive, our approach uses a greedy algorithm based on windows of fixed sizes. We evaluate the performance of HaploPOP in assigning individuals to populations using a split-validation approach. We investigate both simulated SNP data and dense genotype data from individuals from Spain and Portugal.

Conclusions: Our results show that constructing haplotypes with HaploPOP can substantially reduce assignment error. The HaploPOP software is freely available as a command-line program at www.ieg.uu.se/Jakobsson/software/HaploPOP/.
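The idea behind the Gain of Informativeness for Assignment can be illustrated with a small Python sketch. It is not taken from HaploPOP. It assumes that informativeness is measured as the mutual information between population label and allele (equal population weights), and that the gain is the informativeness of the two-marker haplotype minus the sum of the informativeness of the two markers taken separately. The function names and the haplotype ordering (00, 01, 10, 11) are hypothetical choices made for this example.

```python
import math

def informativeness(freqs):
    """Mutual information between population and allele.

    freqs: one list of allele (or haplotype) frequencies per population,
    each summing to 1. Populations are weighted equally (assumption).
    """
    k = len(freqs)
    total = 0.0
    for j in range(len(freqs[0])):
        mean_p = sum(pop[j] for pop in freqs) / k
        if mean_p > 0:
            total -= mean_p * math.log(mean_p)
        for pop in freqs:
            if pop[j] > 0:
                total += pop[j] / k * math.log(pop[j])
    return total

def marginals(hap_freqs):
    """Per-marker allele frequencies from haplotype frequencies.

    Haplotypes are ordered 00, 01, 10, 11 (hypothetical convention).
    """
    m1 = [[h[0] + h[1], h[2] + h[3]] for h in hap_freqs]
    m2 = [[h[0] + h[2], h[1] + h[3]] for h in hap_freqs]
    return m1, m2

def gain(hap_freqs):
    """Haplotype informativeness minus the sum over the two markers."""
    m1, m2 = marginals(hap_freqs)
    return informativeness(hap_freqs) - informativeness(m1) - informativeness(m2)

# Two populations that differ only in their LD pattern: each marker alone
# has allele frequency 0.5 everywhere, but the haplotypes separate them.
ld_only = [[0.5, 0.0, 0.0, 0.5], [0.0, 0.5, 0.5, 0.0]]
print(gain(ld_only))
```

Under these assumptions a positive gain suggests combining the pair, while markers that are independent within every population give a gain that is zero or negative, so they would be left separate.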
Identifiers
URN: urn:nbn:se:uu:diva-260829
DOI: 10.1186/s12859-015-0661-6
ISI: 000358766000002
PubMedID: 26227424
OAI: oai:DiVA.org:uu-260829
DiVA: diva2:848853
Funder
Swedish Research Council
The Swedish Foundation for International Cooperation in Research and Higher Education (STINT)