Optimizing Performance Through Parallelism-Aware Compilation

Giorgis Georgakoudis | 20-FS-015

Project Overview

Compilers generate sub-optimal executable code for parallel high-performance computing (HPC) applications because, lacking awareness of parallelism, they forgo optimizations they would otherwise apply. This project investigated the feasibility of parallelism-aware compiler optimization by quantifying the performance lost to missed standard compiler optimizations and by assessing the possibility of developing parallelism-specific optimizations. We developed FAROS, a Framework to Analyze OpenMP Compilation through Benchmarking and Compiler Optimization Analysis, to quantify and pinpoint missed standard compiler optimizations in parallel programs, and showed that sub-optimal compilation of parallel code results in slowdowns of up to 2.35 times. Further, we found that the most viable high-impact approach to parallelism-aware optimization is to extend the compiler with semantic information conveyed through the existing interface to the OpenMP parallel runtime, and we developed a first iteration of a parallel region merging optimization in the LLVM compiler. FAROS has had immediate impact by identifying the reasons for sub-optimal compiler optimization of HPC applications of interest to Lawrence Livermore National Laboratory. Extending the compiler with parallelism-specific optimizations is a promising avenue for future impact because it would speed up the execution of parallel HPC applications.
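As a source-level illustration of what parallel region merging aims to achieve, the following minimal C/OpenMP sketch contrasts two adjacent parallel regions with a hand-merged equivalent. It is not the project's implementation, which works inside the LLVM compiler rather than on source code; the function names, array sizes, and the self-check helper are hypothetical and chosen only for this example. The assumed benefit of the merged form is that the thread team is started and joined once instead of twice, while the implicit barrier at the end of the first worksharing loop keeps the two loops ordered.

```c
#include <assert.h>
#include <stddef.h>

/* Before merging (hypothetical example): two back-to-back parallel regions.
   Each `parallel` construct starts a thread team and joins it at its end. */
void scale_then_offset_separate(double *a, int n, double s, double c)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; ++i)
            a[i] *= s;
    }

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; ++i)
            a[i] += c;
    }
}

/* After merging (written by hand here): a single parallel region holds both
   worksharing loops. The implicit barrier at the end of the first `omp for`
   ensures every element is scaled before any element is offset. */
void scale_then_offset_merged(double *a, int n, double s, double c)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; ++i)
            a[i] *= s;

        #pragma omp for
        for (int i = 0; i < n; ++i)
            a[i] += c;
    }
}

/* Self-check: both variants must produce identical results. */
void check_merge_preserves_result(void)
{
    double x[8], y[8];
    for (int i = 0; i < 8; ++i)
        x[i] = y[i] = (double)i;

    scale_then_offset_separate(x, 8, 2.0, 1.0);
    scale_then_offset_merged(y, 8, 2.0, 1.0);

    for (int i = 0; i < 8; ++i)
        assert(x[i] == y[i]);
}
```

The sketch compiles with or without OpenMP enabled; without it, the pragmas are ignored and both functions run serially with the same results. A compiler that performs this transformation automatically must first establish that merging is legal, for example that no code between the two regions depends on running single-threaded.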

Mission Impact

This research supports the Laboratory's core competency in high-performance computing, simulation, and data science. Our findings show that parallelism-aware compiler optimization has the potential to significantly reduce the execution time of mission HPC applications, including NNSA stockpile stewardship simulations, on current and future-generation HPC systems. Importantly, the results of our parallelism-aware compiler optimization study benefit multiple HPC applications without requiring the development effort of manually optimizing each one.

Publications, Presentations, and Patents

Doerfert, J., et al. 2020. "(OpenMP) Parallelism Aware Optimizations." LLVM Developers' Meeting (invited presentation). LLNL-VIDEO-814577

Georgakoudis, G., et al. 2020a. "Benchmarking Compiler Optimizations on OpenMP Performance." LLVM-CTH: First Workshop on LLVM Compiler and Tools for HPC (ISC 2020, online), June 2020. LLNL-PRES-811934

——— 2020b. "FAROS: A Framework to Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis." In: Milfeld, K., de Supinski, B., Koesterke, L., and Klinkenberg, J. (eds) OpenMP: Portable Multi-Level Parallelism on Modern Systems. IWOMP 2020. Lecture Notes in Computer Science 12295. Springer, Cham. doi:10.1007/978-3-030-58144-2_1. LLNL-CONF-810797