Sunday, 15 April 2012

c++ - How to read the identifier 'class' in Flex?




I am trying to write a compiler for the COOL language, and right now I am at lexical analysis. Concretely, Flex matches the longest pattern, as I understand it.

Thus, if I have the following input for Flex:

class A inherits B

Now, if the token CLASS is returned by the following pattern:

^"class" return CLASS;

And for the INHERITS token:

^"class"[ ]+[a-zA-Z]+[0-9]?[ ]+"inherits"[ ]+ return INHERITS;

Now, since Flex matches the longest pattern, it will always return INHERITS and never CLASS. Is there a workaround for this problem?

I can return the token CLASS alone here. But how do I return the token INHERITS, since it must be preceded by a CLASS token and a name, followed by a string token?

But if I try to impose constraints on INHERITS, Flex will match the longest pattern and not CLASS alone.

Then should I return an enum/number for the class identifier individually? And if I do that, how do I identify the 'inherits' identifier?

Edit:

class A inherits B { main(): SELF_TYPE{...} }

How does Flex match against main? The reference lexer differentiates between the TYPEID A and main, which it declares an OBJECTID. It can do that by looking ahead at the parenthesis, and if it finds (, it declares an OBJECTID. But if I do that, I run into the same problem as above: Flex will never match ( alone, only main(.

You are trying to do too much in Flex, and perhaps misunderstand the role and boundaries of the lexical phase. You shouldn't be attempting to parse a whole sentence with a Flex regex alone. Flex's job is to consume a stream of text and convert it into a stream of integer tokens. The sentence you've provided:

class A inherits B

represents multiple tokens in a language that requires parsing. Flex is not a parser; it is a lexical scanner/tokenizer. (Technically it is a parser of bytes or characters, but you want to "parse" the atomic units that represent the words of your language, not characters.)

So there are 4 distinct tokens (atomic units), known as terminals, in the above sentence: [class, A, inherits, B]. You need an identifier rule for Flex, so that any word that doesn't match a keyword token falls through to IDENTIFIER. The tokens returned by Flex to the parser are then:

CLASS IDENTIFIER INHERITS IDENTIFIER

The job of Flex is to scan each word/token and convert the text into distinct integer values to be consumed by Bison or another parser.
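To make the text-to-integers step concrete, here is a minimal hand-written sketch in C++ of what the generated scanner does for this input. It is not Flex output: the function name `tokenize` and the numeric token codes are assumptions made for illustration (in a real project the codes come from the Bison-generated header).

```cpp
#include <cctype>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical token codes; a real project would take these from the
// Bison-generated header instead of defining them by hand.
enum Token { CLASS = 258, INHERITS = 259, IDENTIFIER = 260 };

// Split the input into words and map each word to an integer token.
// Any word that is not a keyword falls through to IDENTIFIER.
std::vector<int> tokenize(const std::string& input) {
    static const std::unordered_map<std::string, Token> keywords = {
        {"class", CLASS}, {"inherits", INHERITS}};
    std::vector<int> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isalpha(c) || c == '_') {
            // Consume the whole word: a letter or '_' followed by
            // letters, digits, or '_'.
            std::size_t start = i;
            while (i < input.size() &&
                   (std::isalnum(static_cast<unsigned char>(input[i])) ||
                    input[i] == '_')) {
                ++i;
            }
            std::string word = input.substr(start, i - start);
            auto it = keywords.find(word);
            tokens.push_back(it != keywords.end() ? it->second : IDENTIFIER);
        } else {
            ++i;  // skip whitespace and any character this sketch ignores
        }
    }
    return tokens;
}
```

Calling `tokenize("class A inherits B")` yields the four codes for CLASS, IDENTIFIER, INHERITS, IDENTIFIER in that order. Note that the scanner never needs to know that `inherits` follows a class name; that relationship is left to the parser.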

You would typically have a yacc/Bison BNF grammar to handle this:

class_decl: CLASS IDENTIFIER
          | CLASS IDENTIFIER INHERITS IDENTIFIER
          ;
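To show that the ordering constraint on `inherits` lives in the grammar rather than in the scanner, the rule above can be sketched as a small hand-written recognizer in C++. This is only an illustration of what the Bison-generated parser checks; the function name `parse_class_decl` and the token codes are assumptions, not part of the original answer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token codes, standing in for the Bison-generated ones.
enum Token { CLASS = 258, INHERITS = 259, IDENTIFIER = 260 };

// Returns true if the whole token sequence matches
//   class_decl: CLASS IDENTIFIER
//             | CLASS IDENTIFIER INHERITS IDENTIFIER ;
bool parse_class_decl(const std::vector<int>& tokens) {
    std::size_t pos = 0;
    // Consume the next token only if it is the expected one.
    auto accept = [&](int expected) {
        if (pos < tokens.size() && tokens[pos] == expected) {
            ++pos;
            return true;
        }
        return false;
    };
    if (!accept(CLASS) || !accept(IDENTIFIER)) return false;
    // The INHERITS part is optional, but once it appears it must be
    // followed by an IDENTIFIER.
    if (accept(INHERITS) && !accept(IDENTIFIER)) return false;
    return pos == tokens.size();  // no trailing tokens allowed
}
```

With this split, `{CLASS, IDENTIFIER, INHERITS, IDENTIFIER}` is accepted, while a sequence such as `{CLASS, IDENTIFIER, INHERITS}` is rejected by the parser, without any lookahead tricks in the lexical rules.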

So the lex rules look like this. You need to return the IDENTIFIER token to the parser while attaching the actual symbol (A, B), whose text is available in the yytext variable:

letter       [a-zA-Z_]
digit        [0-9]
letterdigit  [a-zA-Z0-9_]
%%
"class"      return(CLASS);
"inherits"   return(INHERITS);
{letter}{letterdigit}* {
    /* Attach the matched text to the token as its semantic value. */
    yylval.sym = new symbol(yytext);
    yylval.sym->line = line;
    fprintf(stderr, "token IDENTIFIER(%s)\n", yytext);
    return(IDENTIFIER);
}
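The keyword rules and the identifier rule can coexist because of how Flex chooses between rules: the rule matching the most text wins, and when two rules match the same length, the one listed first wins. The following C++ sketch imitates that selection with `std::regex` for the three rules above. The function name `pick_rule` is an assumption for illustration, and `std::regex` is only a stand-in that behaves the same for these simple patterns; it is not how Flex is implemented.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Returns the name of the rule that Flex-style disambiguation would pick
// at the start of `input`: the longest match wins, and on a tie the rule
// listed first wins. Returns "" if no rule matches.
std::string pick_rule(const std::string& input) {
    // Rule order mirrors the scanner rules in the answer; the names are
    // labels used only by this sketch.
    static const std::vector<std::pair<std::string, std::regex>> rules = {
        {"CLASS", std::regex("class")},
        {"INHERITS", std::regex("inherits")},
        {"IDENTIFIER", std::regex("[a-zA-Z_][a-zA-Z0-9_]*")},
    };
    std::string best;
    std::size_t best_len = 0;
    for (const auto& rule : rules) {
        std::smatch m;
        // match_continuous anchors the match at the first character.
        if (std::regex_search(input, m, rule.second,
                              std::regex_constants::match_continuous)) {
            std::size_t len = static_cast<std::size_t>(m.length(0));
            if (len > best_len) {  // strict '>' keeps the earlier rule on ties
                best = rule.first;
                best_len = len;
            }
        }
    }
    return best;
}
```

For the input `class`, both the keyword rule and the identifier rule match five characters, so the earlier CLASS rule is chosen. For `classy`, the identifier rule matches six characters and wins, which is why keywords must be listed before the generic identifier rule and need no extra constraints of their own.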

If you try to do all of this within Flex, it is possible, but you will end up with a mess, just as if you tried to parse HTML with regex... :)

c++ regex compiler-construction flex-lexer lexical-analysis
