Wed 21 Jun 2023 17:00 - 17:20 at Cypress 2 - PLDI: Parsing & Formal Languages Chair(s): Eric Eide

Context-free language reachability (CFL-reachability) is a fundamental framework for program analysis. A large variety of static analyses can be formulated as CFL-reachability problems, which determines whether specific source-sink pairs in an edge-labeled graph are connected by a reachable path, i.e., a path whose edge labels form a string accepted by the given CFL. Computing CFL-reachability is expensive. The fastest algorithm exhibits a slightly subcubic time complexity with respect to the input graph size. Improving the scalability of CFL-reachability is of practical interest, but reducing the time complexity is inherently difficult.

In this paper, we focus on improving the scalability of CFL-reachability from a more practical perspective—reducing the input graph size. Our idea arises from the existence of trivial edges, i.e., edges that do not affect any reachable path in CFL-reachability. We observe that two nodes joined by trivial edges can be folded—by merging the two nodes with all the edges joining them removed—without affecting the CFL-reachability result. By studying the characteristic of the recursive state machines (RSMs), an alternative form of CFLs, we propose an approach to identify foldable node pairs without the need to verify the underlying reachable paths (which is equivalent to solving the CFL-reachability problem). In particular, given a CFL-reachability problem instance with an input graph G and an RSM, based on the correspondence between paths in G and state transitions in RSM, we propose a graph folding principle, which can determine whether two adjacent nodes are foldable by examining only their incoming and outgoing edges.

On top of the graph folding principle, we propose an efficient graph folding algorithm GF. The time complexity of GF is linear with respect to the number of nodes in the input graph. Our evaluations on two clients (alias analysis and value-flow analysis) show that GF significantly accelerates RSM/CFL-reachability by reducing the input graph size. On average, for value-flow analysis, GF reduces 60.96% of nodes and 42.67% of edges of the input graphs, obtaining a speedup of 4.65× and a memory usage reduction of 57.35%. For alias analysis, GF reduces 38.93% of nodes and 35.61% of edges of the input graphs, obtaining a speedup of 3.21× and a memory usage reduction of 65.19%.

Wed 21 Jun

Displayed time zone: Eastern Time (US & Canada) change

16:00 - 18:00
PLDI: Parsing & Formal LanguagesPLDI Research Papers at Cypress 2
Chair(s): Eric Eide University of Utah

#pldi-wed-1600-parsing-cypress Discord icon small YouTube icon small

16:00
20m
Talk
Search-Based Regular Expression Inference on a GPU
PLDI Research Papers
Mojtaba Valizadeh University of Sussex, Martin Berger
DOI Pre-print
16:20
20m
Talk
Derivative Based Nonbacktracking Real-World Regex Matching with Backtracking Semantics
PLDI Research Papers
Dan Moseley Microsoft DevDiv, Mario Nishio Microsoft Azure, Jose Perez Rodriguez Microsoft DevDiv, Olli Saarikivi Microsoft Research, Redmond, Stephen Toub Microsoft DevDiv, Margus Veanes Microsoft, Tiki Wan Microsoft Azure, Eric Xu Microsoft, USA
DOI
16:40
20m
Talk
Repairing Regular Expressions for Extraction
PLDI Research Papers
Nariyoshi Chida NTT Social Informatics Laboratories, Tachio Terauchi Waseda University
DOI
17:00
20m
Talk
Recursive State Machine Guided Graph Folding for Context-Free Language Reachability
PLDI Research Papers
Yuxiang Lei University of New South Wales, Yulei Sui University of New South Wales, Sydney, Shin Hwei Tan Concordia University, Qirun Zhang Georgia Institute of Technology
DOI
17:20
20m
Talk
Interval Parsing Grammars for File Format Parsing
PLDI Research Papers
Jialun Zhang Pennsylvania State University, Greg Morrisett Cornell University, Gang Tan Pennsylvania State University
DOI
17:40
20m
Talk
flap: A Deterministic Parser with Fused Lexing
PLDI Research Papers
Jeremy Yallop University of Cambridge, Ningning Xie Google DeepMind / University of Toronto, Neel Krishnaswami University of Cambridge
DOI Pre-print