Preserving comments in `Text.Parsec.Token` tokenizers
I'm writing a source-to-source transformation using Parsec, so I have a `LanguageDef` for my language and I build a `TokenParser` for it using `Text.Parsec.Token.makeTokenParser`:

    myLanguage = LanguageDef { ...
        commentStart = "/*"
      , commentEnd = "*/"
      ...
      }

    -- defines 'stringLiteral', 'identifier', etc...
    TokenParser {..} = makeTokenParser myLanguage

Unfortunately, since I defined `commentStart` and `commentEnd`, each of the parser combinators in the `TokenParser` is a lexeme parser implemented in terms of `whiteSpace`, and `whiteSpace` eats spaces as well as comments.
What is the right way to preserve comments in this situation?

Approaches I can think of:

1. Don't define `commentStart` and `commentEnd`. Wrap each of the lexeme parsers in another combinator that grabs comments before parsing each token.
2. Implement my own version of `makeTokenParser` (or perhaps use a library that generalizes `Text.Parsec.Token`; if so, which library?)

What's the done thing in this situation?
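For illustration, the first approach might be sketched roughly as follows. This is only a sketch under assumptions: `blockComment`, `withComments`, and the wrapped `identifier` are hypothetical names that do not come from Parsec, the language definition is reduced to `emptyDef`, and only `/* ... */` comments without nesting are handled.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as T

-- Comment delimiters are left empty, so the generated whiteSpace
-- parser skips plain whitespace only and never swallows comments.
myLanguage :: T.LanguageDef ()
myLanguage = emptyDef
  { T.commentStart = ""
  , T.commentEnd   = ""
  , T.commentLine  = ""
  }

lexer :: T.TokenParser ()
lexer = T.makeTokenParser myLanguage

-- Hypothetical helper: parse one /* ... */ comment, return its text,
-- and skip the whitespace that follows it. Nesting is not handled.
blockComment :: Parser String
blockComment = T.lexeme lexer $
  try (string "/*") *> manyTill anyChar (try (string "*/"))

-- Hypothetical wrapper: collect the comments that precede a token
-- and return them together with the token's own result.
withComments :: Parser a -> Parser ([String], a)
withComments p = do
  cs <- many blockComment
  x  <- p
  return (cs, x)

-- Example of a wrapped lexeme parser.
identifier :: Parser ([String], String)
identifier = withComments (T.identifier lexer)
```

With these definitions, `parse identifier "" "/* note */ foo"` should return the comment text alongside the identifier. Comments that trail the last token would still need separate handling, for example by running `many blockComment` once more before `eof`.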
In principle, defining `commentStart` and `commentEnd` doesn't fit with preserving comments, because you need to consider comments as valid parts of both the source and the target language, including them in your grammar and your AST/ADT.

In this way, you'd be able to keep the text of the comment as the payload data of a `Comment` constructor, and output it appropriately in the target language, something like

    data Statement = Comment String | Return Expression | ......

The fact that neither the source nor the target language treats the comment text as relevant is irrelevant to your translation code.
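As a rough illustration of how a `Comment` constructor could carry its text through to the output, here is a minimal sketch. The cut-down `Expression` type, the `render*` functions, and the target syntax (`/* */` comments, `return ...;`) are all assumptions made for the example, not part of the original code.

```haskell
-- Hypothetical, minimal AST; a real one would have many more cases.
newtype Expression = Var String
  deriving (Show, Eq)

data Statement
  = Comment String       -- comment text kept as payload
  | Return Expression
  deriving (Show, Eq)

renderExpression :: Expression -> String
renderExpression (Var name) = name

-- Emit each statement in the assumed target syntax; comments are
-- written back out with their original text untouched.
renderStatement :: Statement -> String
renderStatement (Comment text) = "/*" ++ text ++ "*/"
renderStatement (Return e)     = "return " ++ renderExpression e ++ ";"

renderProgram :: [Statement] -> String
renderProgram = unlines . map renderStatement
```

The translation passes can then treat `Comment` like any other statement, passing it through unchanged while rewriting the constructors they care about.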
Major problem with this approach: it doesn't really fit with `makeTokenParser`, and fits better with implementing your source language's parser from the ground up.

I guess I'm veering towards editing `makeTokenParser` so that the comment parsers return the `String` instead of `()`.
parsing haskell comments parsec code-translation