Machine-Learning-Based Optimization Heuristics in Dynamic Compilers
Language of the title:
English
Original abstract:
Modern optimizing compilers rely on hundreds of heuristics to decide whether and how
to apply optimizations during compilation. These heuristics are typically hand-crafted and
designed to achieve good performance on a set of benchmark programs based on
real-world applications. This manual process has multiple shortcomings: it requires
experienced compiler engineers as well as continuous re-evaluation and tweaking of
heuristics whenever compiler internals change. In addition, maintaining multiple domain-specific
heuristics is often considered infeasible, which results in the deployment of one-size-fits-all
heuristics.
Data-driven approaches based on machine learning have been shown to outperform
hand-crafted compiler heuristics in that they can suggest optimization decisions which
yield better-performing programs after compilation. Nevertheless, only a single recent
machine learning approach has made it into a production-level compiler. Existing research
primarily focuses on using machine learning in static compilers, where the compilation
process is stable and reproducible. In contrast, dynamic compilers run in parallel with the
executed program, compiling frequently executed code. On top of that, the use
of profiling-based speculative optimizations adds a further level of non-determinism
to dynamic compilation. This lack of consistency complicates the use of data-driven
approaches in such systems.
In this thesis, we make multiple contributions to the use of machine learning in dynamic
compilers. We address the problems of data stability and consistency by proposing
compilation forking, a technique for extracting performance data from method-local
optimization decisions in arbitrary programs. Based on this data, we train several machine
learning models to replace optimization heuristics for loop optimizations such as peeling,
unrolling, or vectorization. Furthermore, we present self-optimizing compiler heuristics
based on machine learning models that are continuously refined with new data at the
user site. This approach also enables the creation of domain-specific heuristics, which
we show to significantly outperform one-size-fits-all heuristics. Apart from deploying
machine learning in production systems, we also outline how models can assist compiler
engineers during heuristic design and evaluation.
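The following minimal sketch illustrates the general idea of swapping a hand-crafted loop-unrolling heuristic for a learned one. It is illustrative only: the names (LoopFeatures, UnrollingPolicy, HandCraftedPolicy, LearnedPolicy) and all thresholds are hypothetical, it does not use GraalVM APIs, and the "learned" policy is a hand-written stand-in for a decision tree that would, in a real system, be trained on performance data gathered via compilation forking.

/**
 * Illustrative sketch: a loop-unrolling decision expressed once as a
 * hand-crafted heuristic and once as a learned model. All names and
 * thresholds are hypothetical and do not reflect GraalVM internals.
 */
public class UnrollingPolicySketch {

    /** Features a dynamic compiler could extract for a single loop. */
    record LoopFeatures(int bodyNodeCount, long profiledTripCount, boolean hasCall) {}

    /** Common interface: decide an unrolling factor (1 = do not unroll). */
    interface UnrollingPolicy {
        int unrollFactor(LoopFeatures f);
    }

    /** Hand-crafted heuristic: fixed, one-size-fits-all thresholds. */
    static final class HandCraftedPolicy implements UnrollingPolicy {
        @Override
        public int unrollFactor(LoopFeatures f) {
            if (f.hasCall() || f.bodyNodeCount() > 64) {
                return 1; // too large or too risky to unroll
            }
            return f.profiledTripCount() > 1000 ? 4 : 2;
        }
    }

    /**
     * Learned policy: a tiny decision tree standing in for a model trained
     * on data from forked compilations; in a real system the tree would be
     * generated from training data, not written by hand.
     */
    static final class LearnedPolicy implements UnrollingPolicy {
        @Override
        public int unrollFactor(LoopFeatures f) {
            if (f.bodyNodeCount() <= 32) {
                return f.profiledTripCount() > 512 ? 8 : 4;
            }
            return f.hasCall() ? 1 : 2;
        }
    }

    public static void main(String[] args) {
        LoopFeatures hotSmallLoop = new LoopFeatures(24, 10_000, false);
        System.out.println("hand-crafted: " + new HandCraftedPolicy().unrollFactor(hotSmallLoop));
        System.out.println("learned:      " + new LearnedPolicy().unrollFactor(hotSmallLoop));
    }
}

Because both policies share one interface, a compiler could select between them per compilation unit, which is also how domain-specific models could replace a one-size-fits-all heuristic.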
We implemented our approaches in the GraalVM compiler, which is among the most
highly optimizing Java compilers on the market. Our learned models are either on par
with the well-tuned heuristics in the GraalVM compiler or outperform them significantly, with only
minor regressions. In addition, our assistive approach enabled improvements
to existing heuristics without actually deploying machine learning models in
production systems.