Verifiable Composition of Deterministic Grammars

Abstract

Interest is growing in domain-specific and extensible languages and in frameworks for developing extensions to them. One challenge is to develop tools that allow non-expert programmers to add an eclectic set of language extensions to a host language. In this paper we describe mechanisms for composing and analyzing syntactic specifications of a host language and its extensions. These specifications consist of context-free grammars with each terminal symbol mapped to a regular expression, from which a slightly modified LR parser and a context-aware scanner are generated. Conflicts in a composed grammar are detected when its parser is generated, but this comes too late, since the non-expert programmer performs the compilation after composing the independently developed extensions with the host language. The primary contribution of this paper is a modular analysis that each extension designer performs independently on her extension (composed alone with the host language). If each extension passes this modular analysis, then the language composed later by the programmer will compile with no conflicts or lexical ambiguities. Thus, extension writers can verify that their extension will safely compose with others and, if it will not, fix the specification so that it does. This is possible due to the context-aware scanner’s lexical disambiguation and a set of reasonable restrictions limiting the sort of constructs that a language extension can introduce. The restrictions ensure that the states in the parse table can be partitioned so that each state can be attributed to the host language or to a single language extension.
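
To make the idea of context-aware scanning concrete, here is a minimal sketch and not the paper's implementation. It assumes the parser passes the scanner the set of terminals that are valid in its current LR state, and the scanner matches only those. The terminal names (`ID`, `TABLE_KW`, `NUM`), their regular expressions, and the `scan` function are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical terminals: each terminal symbol is mapped to a regular expression.
# ID stands in for a host-language identifier, TABLE_KW for an extension keyword.
TERMINALS = {
    "ID": re.compile(r"[A-Za-z_]\w*"),
    "TABLE_KW": re.compile(r"table"),
    "NUM": re.compile(r"\d+"),
}


def scan(text, pos, valid):
    """Return (terminal, lexeme) for the longest match at pos,
    considering only the terminals in `valid` (assumed to be the
    valid lookahead of the parser's current state)."""
    matches = []
    for name in valid:
        m = TERMINALS[name].match(text, pos)
        if m and m.end() > pos:
            matches.append((m.end() - pos, name))
    if not matches:
        raise SyntaxError(f"no valid terminal matches at position {pos}")
    longest = max(length for length, _ in matches)
    winners = sorted(name for length, name in matches if length == longest)
    if len(winners) > 1:
        # Two valid terminals match the same lexeme: a lexical ambiguity.
        raise ValueError(f"lexical ambiguity between {winners}")
    return winners[0], text[pos:pos + longest]
```

In this sketch the same input, `table`, scans as an identifier in a state where only `ID` is valid and as a keyword in a state where only `TABLE_KW` is valid. An ambiguity arises only when both terminals are valid in the same state, which is the kind of situation the modular analysis described in the abstract is meant to rule out ahead of composition.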

Authors
Eric R. Van Wyk
Year of Publication
2009
Source
30th ACM SIGPLAN Conference on Programming Language Design and Implementation