A few months ago I proposed introducing character literals in Stratego as syntactic sugar for the integer ASCII value of the character. I would like to raise this issue again ;-) .

My proposal was to introduce the widely used 'c' syntax to represent a character in Stratego. The Stratego compiler can desugar this to the integer ASCII representation of the character, which is what we use to work with characters right now. The implementation is trivial: it requires no changes to the backend and no special ATerm types.

Advantages:

* You don't need an ASCII table when you're programming string manipulations in Stratego.
* String processing code like that in string.r and char.r in the ssl will become much clearer.
* It is tempting to explode string literals (even of length 1) at runtime so you don't have to look up the character codes. Character literals will improve performance in this case.

Disadvantages:

I'm currently writing strategies to rewrite XML entities and character references. I'm using overlays right now. This is already an improvement, but quite verbose.

------------------------
rules

  unescape-amp :
    [c_amp(), c_a(), c_m(), c_p(), c_semicolon() | cs] -> [c_amp() | cs]

  unescape-lt :
    [c_amp(), c_l(), c_t(), c_semicolon() | cs] -> [c_lt() | cs]

  unescape-gt :
    [c_amp(), c_g(), c_t(), c_semicolon() | cs] -> [c_gt() | cs]

overlays

  c_space()      = 32
  c_quote()      = 34
  c_amp()        = 38
  c_apos()       = 39
  c_0()          = 48
  c_9()          = 57
  c_semicolon()  = 59
  c_numbersign() = 35
------------------------

This would be possible with characters in Stratego:

-----------------------------------
  unescape-amp :
    ['&', 'a', 'm', 'p', ';' | cs] -> ['&' | cs]

  unescape-lt :
    ['&', 'l', 't', ';' | cs] -> ['<' | cs]

  unescape-gt :
    ['&', 'g', 't', ';' | cs] -> ['>' | cs]
-----------------------------------

Of course, (un)escaping is an example where character literals are hugely useful. In general you won't use character literals a lot in Stratego.
Because of the simplicity of the implementation, I think it is still worth the effort. I would like to hear your opinion :-) .

-- Martin Bravenboer - 07 Dec 2002

Ok. I have added the following to the SDF definition of Stratego:

----------------------------------------------------------------------
  lexical syntax
    "\'" CharChar "\'" -> Char
    ~[\']              -> CharChar
    [\\] [\'ntr\ ]     -> CharChar
    Char               -> Id {reject}

  context-free syntax
    Char -> Term {cons("Char")}
----------------------------------------------------------------------

and the following desugaring rules to stratego-desugar:

----------------------------------------------------------------------
  Desugar :
    Char(c) -> Int(i)
    where c => i

  DesugarCharGeneric :
    [39, i, 39] -> i

  DesugarChar :
    "'\\''" -> 39

  DesugarChar :
    "'\\n'" -> 10

  DesugarChar :
    "'\\t'" -> 9

  DesugarChar : // carriage return
    "'\\r'" -> 13

  DesugarChar : // space
    "'\\ '" -> 32
----------------------------------------------------------------------

Note that the desugaring is done at the syntactic level as part of parsing. This means that characters are pretty-printed as integers. This can be improved later by postponing the desugaring until later in the process, though that requires a deeper embedding of this notion in Stratego.

Are any other escapes needed?

Note that this will break existing specifications with identifiers of the form 'c' (which I have never seen).

These changes are available in StrategoRelease09 (beta7).

-- Main.EelcoVisser - 21 Dec 2002
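
For readers who want to check what the desugaring rules above produce, here is a minimal sketch of the same mapping in Python (not Stratego). It is only an illustration under stated assumptions: the names ESCAPES and desugar_char are hypothetical and do not exist in the Stratego compiler, and the function takes the source text of a literal, quotes included, as the DesugarChar rules do.

```python
# Hypothetical sketch, not part of Stratego: mirrors the DesugarChar and
# DesugarCharGeneric rules quoted above.

# Escaped literals, one entry per DesugarChar rule.
ESCAPES = {
    r"'\''": 39,  # single quote
    r"'\n'": 10,  # newline
    r"'\t'": 9,   # tab
    r"'\r'": 13,  # carriage return
    r"'\ '": 32,  # space
}


def desugar_char(literal: str) -> int:
    """Return the integer code for the source text of a character literal."""
    # Escapes are tried first, like the specific DesugarChar rules.
    if literal in ESCAPES:
        return ESCAPES[literal]
    # Generic case, like [39, i, 39] -> i: quote, one character, quote.
    codes = [ord(ch) for ch in literal]
    if len(codes) == 3 and codes[0] == 39 and codes[2] == 39 and codes[1] != 39:
        return codes[1]
    raise ValueError(f"not a character literal: {literal!r}")


# The pattern of unescape-amp, written with character literals.
print([desugar_char(c) for c in ["'&'", "'a'", "'m'", "'p'", "';'"]])
# → [38, 97, 109, 112, 59]
```

The values for '&' and ';' agree with the c_amp() and c_semicolon() overlays, which is the point of the proposal: the literal and the overlay denote the same integer.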