Performance vs. Correctness When Writing Low-Level HPC/Tensor/Array Code
Most applications benefiting from accelerators (especially ML accelerators) rely on hand-optimized high-performance kernel libraries to get access to new hardware, and ensure a high level of performance (e.g. BLAS, CuDNN, etc.). However, these kernel libraries are still written and optimized by hand, at great expense using low-level C and assembly code. This is because the performance engineers who write this code, (like the hardware designers on the other side of the ISA from them) require control over the design. What if we designed programming languages specially tailored to the needs of these programmers?
First, I will discuss performance and correctness. Should we think of this as a tradeoff (as the talk title implies) or two halves of the same whole? Then, I will discuss two different “user-scheduled” languages we’ve built to achieve both performance and correctness in HPC kernel programming. (1) Exo is an imperative language which turns the compiler “inside out” by externalizing control of code optimization directly to the user, and by replacing hardware-specific backends (the compiler writers’ responsibility) with user-level libraries (the performance engineers’ responsibility). (2) ATL is a simple functional tensor language, which we have embedded in Coq. Rewrites of ATL programs thereby become lemmas, and user-scheduling directives become proof tactics. These languages match the performance of highly tuned linear algebra, neural net and image processing kernels by using formal verification machinery to expedite the existing optimization process of low-level software performance engineers.
Sun 18 JunDisplayed time zone: Eastern Time (US & Canada) change
09:00 - 11:00 | |||
09:00 60mKeynote | Performance vs. Correctness When Writing Low-Level HPC/Tensor/Array Code ARRAY Gilbert Bernstein University of Washington, Seattle | ||
10:00 30mTalk | Accurate Array Program Mapping with Neural Program Translation and Synthesis (cancelled) ARRAY Hui Shi University of California, San Diego, Sicun Gao University of California San Diego, Jishen Zhao UCSD | ||
10:30 30mTalk | Array Programming via Multi-Dimensional Homomorphisms ARRAY Ari Rasch University of Muenster, Richard Schulze University of Muenster, Sergei Gorlatch University of Muenster File Attached |