Challenges with Hardware-Software Co-design for Sparse Machine Learning on Streaming Dataflow
This paper details the problem landscape that arises from using a general tensor algebra accelerator framework to compute real-world end-to-end machine learning applications. We identify three key challenges for correctness and performance: support for tensor reshaping and nonlinear operations, dataflow optimization (kernel fusion and optimal dataflow order), and leveraging sparsity structure. To identify these challenges, we extended a general tensor algebra compiler and architectural model, the Sparse Abstract Machine, to real-world sparse machine learning models. This paper motivates the need to address these problems in the domain-specific language, compiler framework, and architectural design for sparse machine learning.
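The abstract above names "leveraging sparsity structure" as one of the key challenges but, being a talk summary, gives no code. As a hypothetical illustration of what exploiting sparsity structure means in tensor algebra kernels (not the paper's actual method or the Sparse Abstract Machine's representation), here is a minimal sketch of a sparse matrix-vector multiply over the CSR (compressed sparse row) format, which skips zero entries entirely:

```python
# Illustrative sketch only: CSR SpMV, showing how a compressed format
# lets the kernel iterate over nonzeros instead of the dense iteration space.
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x where A is stored in CSR format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # row_ptr[i] .. row_ptr[i+1] bounds the nonzeros of row i
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]  (5 nonzeros stored, 4 zeros never touched)
values  = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 2, 0, 1]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]
```

The compiler and accelerator challenges the talk raises go beyond this single kernel (e.g., fusing chains of such operations and choosing the dataflow order across them), but the example shows the basic payoff of a sparsity-aware loop structure.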
Sat 17 Jun. Displayed time zone: Eastern Time (US & Canada).
11:20 - 12:30

11:20 (10m talk): A Framework for Distributed Event Ordering. PLARCH.
Paul Mure (Stanford University), Nathan Zhang (Stanford University), Caroline Trippel (Stanford University), Kunle Olukotun (Stanford University)

11:30 (15m talk): Stellar: A DSL to Build and Explore Sparse Accelerators. PLARCH.
Hasan Genc (UC Berkeley), Hansung Kim (University of California, Berkeley), Prashanth Ganesh (University of California, Berkeley), Yakun Sophia Shao (University of California, Berkeley)

11:45 (15m talk): PEak: A Single Source of Truth for Hardware Design and Verification. PLARCH.
Caleb Donovick (Stanford University), Ross Daly (Stanford University), Jackson Melchert (Stanford University), Leonard Truong (Stanford University), Priyanka Raina (Stanford University), Pat Hanrahan (Stanford University), Clark Barrett (Stanford University)

12:00 (10m talk): Challenges with Hardware-Software Co-design for Sparse Machine Learning on Streaming Dataflow. PLARCH.
Rubens Lacouture (Stanford University), Olivia Hsu (Stanford University), Kunle Olukotun (Stanford University), Fredrik Kjolstad (Stanford University)