java - Why does Terminals.tokenizer() tokenize unregistered operators/keywords?
I've discovered the root cause of some confusing behavior I was observing. Here is a test:
    @Test
    public void test2() {
        // No operators; "true" and "false" are the only registered keywords.
        Terminals terminals = Terminals.caseInsensitive(new String[] {}, new String[] { "true", "false" });
        Object result = terminals.tokenizer().parse("d");
        System.out.println("result: " + result);
    }
This outputs:
result: d
I was expecting the parser returned by Terminals.tokenizer() not to return anything, because "d" is not a valid keyword or operator.
The reason I care is that I wanted my own parser at a lower priority than the one returned by Terminals.tokenizer():
    public static final Parser<?> INSTANCE = Parsers.or(
        STRING_TOKENIZER,
        NUMBER_TOKENIZER,
        WHITESPACE_TOKENIZER,
        (Parser<Token>) terminals.tokenizer(),
        IDENTIFIER_TOKENIZER);
The IDENTIFIER_TOKENIZER above is never used, because the parser returned by Terminals.tokenizer() always matches first.
Why does Terminals.tokenizer() tokenize unregistered operators/keywords? And how might I get around this?
From the documentation of Terminals#caseInsensitive:
    org.codehaus.jparsec.Terminals
    public static Terminals caseInsensitive(String[] ops, String[] keywords)
Returns a Terminals object for lexing and parsing the operators with names specified in ops, and for lexing and parsing the keywords case insensitively. Keywords and operators are lexed as Tokens.Fragment with Tokens.Tag.RESERVED tag. Words that are not among keywords are lexed as Fragment with Tokens.Tag.IDENTIFIER tag. A word is defined as an alphanumeric string that starts with [_a-zA-Z], with 0 or more [0-9_a-zA-Z] following.
So the result returned by your parser is a Fragment object that is tagged according to its type. In your case, d is tagged as an IDENTIFIER, which is expected.
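To make the quoted rule concrete, here is a minimal, library-free sketch of the classification the Javadoc describes. It is not jparsec's implementation: the class WordTagSketch, its Tag enum, and the classify method are hypothetical names made up for illustration, and the sketch only models words, not operators.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

public class WordTagSketch {
    // Hypothetical stand-in for the two tags named in the quoted Javadoc.
    enum Tag { RESERVED, IDENTIFIER }

    // A word per the quoted Javadoc: starts with [_a-zA-Z], then 0 or more [0-9_a-zA-Z].
    private static final Pattern WORD = Pattern.compile("[_a-zA-Z][0-9_a-zA-Z]*");

    // Keywords are compared case insensitively, as caseInsensitive() documents.
    private final Set<String> keywords = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);

    WordTagSketch(String... keywords) {
        this.keywords.addAll(Arrays.asList(keywords));
    }

    /** Returns the tag for a word, or null when the input is not a word at all. */
    Tag classify(String input) {
        if (!WORD.matcher(input).matches()) {
            return null;
        }
        return keywords.contains(input) ? Tag.RESERVED : Tag.IDENTIFIER;
    }

    public static void main(String[] args) {
        WordTagSketch sketch = new WordTagSketch("true", "false");
        System.out.println(sketch.classify("TRUE")); // RESERVED
        System.out.println(sketch.classify("d"));    // IDENTIFIER
        System.out.println(sketch.classify("1d"));   // null
    }
}
```

Under this rule every word-shaped input gets a tag, so in an ordered choice any alternative listed after such a tokenizer is never reached for input like "d".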
It is not clear to me what you want to accomplish, though. Could you please provide a test case?
java parsing jparsec