2010 IEEE International Conference on Cluster Computing

Designing OS for HPC Applications: Scheduling

Roberto Gioiosa∗, Sally A. McKee†, Mateo Valero∗‡
∗Computer Science Division, Barcelona Supercomputing Center, ES
†Computer Science and Engineering, Chalmers University of Technology, SE
‡Computer Architecture Department, Universitat Politècnica de Catalunya, ES
Abstract—Operating systems have historically been implemented as independent layers between hardware and applications. User programs communicate with the OS through a set of well defined system calls, and do not have direct access to the hardware. The OS, in turn, communicates with the underlying architecture via control registers. Except for these interfaces, the three layers are practically oblivious to each other. While this structure improves portability and transparency, it may not deliver optimal performance. This is especially true for High Performance Computing (HPC) systems, where modern parallel applications and multi-core architectures pose new challenges in terms of performance, power consumption, and system utilization. The hardware, the OS, and the applications can no longer remain isolated, and instead should cooperate to deliver high performance with minimal power consumption.

provides portability and transparency, but may not provide optimal performance. Modern Chip Multi-Processors (CMPs) share different levels of internal hardware resources among cores and hardware threads, from the internal pipeline, to the caches, down to the memory controllers. Single thread performance is no longer improving, and applications must exploit more parallelism for overall system throughput. This is especially true for High Performance Computing (HPC) systems where modern parallel applications and CMP architectures pose new challenges for performance, power consumption, and system utilization. HPC applications have long been parallel (usually using MPI