Disambiguation Filters For Scannerless Generalized LRParsers

SDF: Modular Syntax Definition Formalism
by MarkVanDenBrand?, JeroenScheerder?, JurgenVinju? and EelcoVisser?. Disambiguation filters for scannerless generalized LR parsers. In N. Horspool, editor, Compiler Construction (CC'02), volume 2304 of Lecture Notes in Computer Science, pages 143--158, Grenoble, France, April 2002. Springer-Verlag.


In this paper we present the fusion of generalized LR parsing and scannerless parsing. This combination supports syntax definitions in which all aspects (lexical and context-free) of the syntax of a language are defined explicitly in one formalism. Furthermore, there are no restrictions on the class of grammars, thus allowing a natural syntax tree structure. Ambiguities that arise through the use of unrestricted grammars are handled by explicit disambiguation constructs, instead of implicit defaults that are taken by traditional scanner and parser generators. Hence, a syntax definition becomes a full declarative description of a language. Disambiguation constructs can be interpreted as filters on parse forests. Scannerless generalized LR parsing is a viable technique that has been applied in various industrial and academic projects.

Technical report


The following is an excerpt from a anonymous referee report:

This paper, "Disambiguation Filters for Scannerless Generalized LR Parsers," is about a systematic way of dealing with the problem of parsing with ambiguous grammars.

I thought this paper was wonderful. Sure it has some flaws, but I think it and all the SGLR papers should be required reading by all faculty who teach compilers and all graduate students who will someday teach compilers. Why? Because SGLR represents so many advantages over the lex+yacc or hand-coded lexer+parser models that the only compelling reason that I can think of to not teach SGLR is, as Homer Simpson would say, "It's a good idea, but it's a new idea, and, therefore, I fear it and must reject it." (OK, it does have one disadvantage that is serious, speed, but that's not enough of a concern to eliminate it from all compiler jobs.)

So, what's SGLR? It's a system of handling all lexical and parsing concerns within a unified Generalized LR parser. This unified system has advantages galore, which are well-described in the paper. Because it uses GLR parsing technology it allows ambiguous grammars. This paper describes 4 mechanisms that allow for declarative disambiguation of the resulting parses. All four are well described and carefully analyzed.