Reconciling High-level Optimizations and Low-level Code in LLVM

Authors

Juneyoung Lee, Seoul National University, Korea
Chung-Kil Hur, Seoul National University, Korea
Ralf Jung, MPI-SWS, Germany
Zhengyang Liu, University of Utah, USA
John Regehr, University of Utah, USA
Nuno P. Lopes, Microsoft Research, UK

Abstract

LLVM miscompiles certain programs in C, C++, and Rust that use low-level language features such as raw pointers in Rust or conversion between integers and pointers in C or C++. The problem is that it is difficult for the compiler to implement aggressive, high-level memory optimizations while also respecting the guarantees made by the programming languages to low-level programs. A deeper problem is that the memory model for LLVM’s intermediate representation (IR) is informal and the semantics of corner cases are not always clear to all compiler developers.

We developed a novel memory model for LLVM IR and formalized it. The new model requires a handful of problematic IR-level optimizations to be removed, but it also supports the addition of new optimizations that were not previously legal. We have implemented the new model and shown that it fixes known memory-model-related miscompilations without impacting the quality of generated code.
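
As a rough illustration of the integer-pointer conversions mentioned above, the following C sketch writes to a local variable through a pointer rebuilt from an integer. It is not taken from the paper and is not claimed to trigger any particular LLVM bug; the function name `roundtrip_store` is hypothetical, and the code assumes a platform that provides `uintptr_t`.

```c
#include <stdint.h>

/* Hypothetical helper, not from the paper: stores to a local variable
   through a pointer that has been round-tripped via an integer. */
int roundtrip_store(void) {
    int x = 1;

    /* Pointer -> integer. Going through void * keeps the conversion
       within what the C standard specifies for uintptr_t. */
    uintptr_t addr = (uintptr_t)(void *)&x;

    /* Integer -> pointer. The compiler can no longer see syntactically
       that p is derived from &x. */
    int *p = (int *)(void *)addr;

    /* This store must be treated as a write to x. */
    *p = 42;

    return x;
}
```

A call to `roundtrip_store()` should return 42. An optimizer that concluded `p` cannot alias `x` and folded the return value to 1 would be miscompiling the function, which is the kind of tension between high-level memory optimizations and low-level code that the abstract describes.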

Downloads

Reconciling High-level Optimizations and Low-level Code in LLVM.
Juneyoung Lee, Chung-Kil Hur, Ralf Jung, Zhengyang Liu, John Regehr, Nuno P. Lopes.
The 2018 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2018).
[paper] [slide]

Development

[LLVM, Clang, Coq Proof]