Advancements towards non-speculative concurrent execution of critical sections
2025 (English)
Doctoral thesis, comprehensive summary (Other academic)
Alternative title
Avances hacia la ejecución concurrente y no especulativa de secciones críticas (Spanish)
Description
Abstract [en]
Parallel programs require, besides cache orchestration, another mechanism that guarantees synchronization among the threads of the same program. These synchronization mechanisms induce overheads, for example by slowing down certain operations and stalling threads, in order to comply with the requirements established by the programmer.
The objective of this thesis is the efficient execution of critical sections, that is, regions of code that must be executed atomically. The most efficient method is the concurrent and non-speculative execution of these sections. To achieve this, we present the three steps we have taken: 1) single-address atomic instructions can be used to implement non-speculative critical sections; therefore, we develop an updated version of the well-known Splash benchmark suite that uses single-address atomic instructions to implement most of its critical sections (Splash-4); 2) a new set of multi-address atomic instructions, together with a methodology for implementing them efficiently, that can be used for small critical sections (MADs); 3) a more generic method that, without the direct intervention of the programmer, limits the retries required to execute contended critical regions (CLEAR).
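As a rough illustration of the first step, the sketch below contrasts a lock-protected counter update with the same update expressed as one single-address atomic instruction. It is a minimal example under stated assumptions, not code from Splash-4: the helper `run_workers` and the functions `count_with_lock` and `count_with_atomic` are hypothetical names introduced here, and a shared counter stands in for the kind of small critical section the abstract describes.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: start `threads` workers, each calling `body` `iters` times,
// and wait for all of them to finish.
template <typename F>
void run_workers(int threads, int iters, F body) {
    assert(threads > 0 && iters >= 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&body, iters] {
            for (int i = 0; i < iters; ++i) body();
        });
    }
    for (auto& th : pool) th.join();
}

// Classical version: the increment is a critical section guarded by a lock,
// so contending threads serialize on the mutex.
std::uint64_t count_with_lock(int threads, int iters) {
    std::uint64_t counter = 0;
    std::mutex m;
    run_workers(threads, iters, [&] {
        std::lock_guard<std::mutex> guard(m);
        ++counter;
    });
    return counter;
}

// Atomic version: the whole critical section collapses into one
// single-address atomic read-modify-write, with no lock and no speculation.
std::uint64_t count_with_atomic(int threads, int iters) {
    std::atomic<std::uint64_t> counter{0};
    run_workers(threads, iters, [&] {
        counter.fetch_add(1, std::memory_order_relaxed);
    });
    return counter.load();
}
```

Both functions return `threads * iters`. The difference lies in how the update is made atomic: the second version needs no lock acquisition or release, which is only possible when the critical section touches a single address.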
For an efficient evaluation of the results, we have used the most up-to-date tools available in each case and, when possible, real machines instead of simulations. For the simulations, we have used the gem5 simulator, always performing multiple runs. The simulator has been configured to emulate, as faithfully as possible, processors based on the latest Intel generations.
In our first step, Splash-4, we reduce execution time on 64 cores by 50%, while maintaining the original structure and algorithms. In the second step (MADs), the new atomic instructions reduce execution time by 80% compared to the classical lock mechanism, and by 60% compared to a transactional memory technique similar to Intel TSX, while adding only 68 bytes per core. Finally, CLEAR limits the number of re-executions of critical sections executed under speculative methods, increasing by 35% the number of sections that complete on the first retry, and reducing from 37% to 15% the number of sections that need to reach the fallback. Altogether, this improves execution time by 35% against an Intel TSX implementation and by 23% against PowerTM.
Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2025. p. 74
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2520
Keywords [en]
Computer Architecture, microarchitecture, atomic instructions, benchmark suite, non-speculative execution.
National Category
Computer and Information Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:uu:diva-552947
ISBN: 978-91-513-2437-1 (print)
OAI: oai:DiVA.org:uu-552947
DiVA, id: diva2:1945956
Public defence
2025-06-03, Salón de Grados, Facultad de Informatica (Building 32), University of Murcia, Murcia (Spain), 16:00 (English)
Opponent
Supervisors
2025-04-29 2025-03-19 2025-04-29
List of papers