Verifiable Parse Table Composition for Deterministic Parsing

Abstract

One obstacle to the implementation of modular extensions to programming languages lies in the problem of parsing extended languages. Specifically, the parse tables at the heart of traditional LALR(1) parsers are so monolithic and tightly constructed that, in the general case, it is impossible to extend them without regenerating them from the source grammar. Current extensible frameworks employ a variety of solutions, ranging from full regeneration to pluggable binary modules for each extension. But regeneration is time-consuming, while the pluggable modules in many cases either cannot support the addition of more than one extension or rely on backtracking or non-deterministic parsing techniques. We present here a middle-ground approach that allows an extension, if it meets certain restrictions, to be compiled into a parse table fragment. The host language parse table and fragments from multiple extensions can then always be efficiently composed to produce a conflict-free parse table for the extended language. This allows deterministic parsers for extensible languages to be distributed in a pre-compiled format, eliminating the need to distribute the "source code" of the grammar. In practice, we have found these restrictions to be reasonable and to admit many useful language extensions.

Authors
Eric Van Wyk
Year of Publication
2009
Source
2nd International Conference on Software Language Engineering