A Second-Order Distributed Trotter-Suzuki Solver with a Hybrid CPU-GPU Kernel
2013 (English). In: Computer Physics Communications, ISSN 0010-4655, E-ISSN 1879-2944, Vol. 184, p. 1165-1171. Article in journal (Refereed)
The Trotter-Suzuki approximation leads to an efficient algorithm for solving the time-dependent Schrödinger equation.
Using existing highly optimized CPU and GPU kernels, we developed a distributed version of the algorithm that runs efficiently on a cluster. Our implementation also improves single-node performance and can use multiple GPUs within a node. Scaling is close to linear with the CPU kernels, whereas the efficiency of the GPU kernels improves with larger matrices. We also introduce a hybrid kernel that simultaneously uses multicore CPUs and GPUs in a distributed system. This kernel is shown to be efficient when the matrix does not fit in GPU memory. Larger quantum systems scale especially well with a high number of nodes. The code is available under an open source license.
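As a rough illustration of the second-order Trotter-Suzuki idea named in the abstract, the sketch below splits a Hamiltonian H = A + B and approximates exp(-i dt H) by exp(-i dt A/2) exp(-i dt B) exp(-i dt A/2). It is a dense toy example in NumPy, not the distributed CPU/GPU kernels of the paper; the function names (`expm_hermitian`, `strang_step`, `evolve`, `splitting_error`) and the random 4x4 test matrices are assumptions made for this illustration only.

```python
import numpy as np


def expm_hermitian(h, dt):
    """Return exp(-i * dt * h) for a Hermitian matrix h via eigendecomposition."""
    w, v = np.linalg.eigh(h)
    # Scale each eigenvector column by its phase, then rotate back.
    return (v * np.exp(-1j * dt * w)) @ v.conj().T


def strang_step(a, b, dt):
    """One second-order (symmetric) Trotter-Suzuki step for H = a + b."""
    half_a = expm_hermitian(a, dt / 2)
    return half_a @ expm_hermitian(b, dt) @ half_a


def evolve(a, b, psi0, t, n_steps):
    """Propagate psi0 to time t using n_steps second-order steps."""
    step = strang_step(a, b, t / n_steps)
    psi = psi0.astype(complex)
    for _ in range(n_steps):
        psi = step @ psi
    return psi


def splitting_error(a, b, psi0, t, n_steps):
    """Norm of the difference between the split and the exact evolution."""
    exact = expm_hermitian(a + b, t) @ psi0
    return np.linalg.norm(evolve(a, b, psi0, t, n_steps) - exact)


# Hypothetical toy problem: two random 4x4 Hermitian terms.
rng = np.random.default_rng(0)


def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2


A = random_hermitian(4)
B = random_hermitian(4)
psi0 = np.zeros(4, dtype=complex)
psi0[0] = 1.0

err_coarse = splitting_error(A, B, psi0, t=1.0, n_steps=100)
err_fine = splitting_error(A, B, psi0, t=1.0, n_steps=200)
print(err_coarse, err_fine, err_coarse / err_fine)
```

Because each factor is unitary, the norm of the state is preserved at every step, and halving the time step should reduce the accumulated error by roughly a factor of four, which is what "second-order" refers to here.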
Place, publisher, year, edition, pages
Elsevier, 2013. Vol. 184, p. 1165-1171
GPU Acceleration, Distributed Computing
Computational Mathematics; Computer and Information Science
Identifiers: URN: urn:nbn:se:hb:diva-1814; DOI: 10.1016/j.cpc.2012.12.008; ISI: 000315974100010; Local ID: 2320/13345; OAI: oai:DiVA.org:hb-1814; DiVA: diva2:869892
This work was carried out while P. W. was visiting the Department of Computer Applications in Science & Engineering at the Barcelona Supercomputing Center, funded by the "Access to BSC Facilities" project of the HPC-Europe2 programme (contract no. 228398).
2015-11-13 2015-11-13