This is the author's version of an article that has been published in this journal. Changes were made to this version by the publisher prior to publication. The final version of record is available at http://dx.doi.org/10.1109/TC.2020.2970706 © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes,creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Analysis of Threading Libraries for High Performance Computing Adrian´ Castello´ Rafael Mayo Gual Sangmin Seo Pavan Balaji Universitat Jaume I Universitat Jaume I Ground X Argonne National Laboratory Castellon´ de la Plana, Spain Castellon´ de la Plana, Spain Seoul, Korea Lemmont, USA Email:
[email protected] Email:
[email protected] Email:
[email protected] Email:
[email protected] Enrique S. Quintana-Ort´ı Antonio J. Pena˜ Universitat Politecnica` de Valencia` Barcelona Supercomputing Center (BSC) Valencia,` Spain Barcelona, Spain Email:
[email protected] Email:
[email protected] Abstract—With the appearance of multi-/many core machines, of hardware parallelism may be inefficient. In response to applications and runtime systems evolved in order to exploit the this problem, dynamic scheduling and lightweight threads new on-node concurrency brought by new software paradigms. (LWTs) (also known as user-level threads, or ULTs) models POSIX threads (Pthreads) was widely-adopted for that purpose were first proposed in [5] in order to deal with the required and it remains as the most used threading solution in current levels of parallelism, offering more efficient management, hardware.