This question is about a remaining problem with the solution proposed in another Stack Overflow question about beginning-of-line keywords.
I am writing an ANTLR4 lexer and parser for a programming language in which a word is a keyword only if it is the first non-whitespace token on a line. Let me explain this with an example. Suppose "bla" is a keyword. Then in the following example:
foo bla
bla foo foo
foo bla bla
only the second "bla" (the one that starts the second line) should be recognized as a keyword; the others should not.
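To make the intended rule concrete, here is a minimal Python sketch of the classification I am after. It is not ANTLR and not part of my grammar; the function name `classify` and the `KEYWORDS` set are made up for illustration, and it assumes that only "bla" is a keyword, as in the example.

```python
KEYWORDS = {"bla"}  # assumption: only "bla" is a keyword, as in the example


def classify(source):
    """Tag each whitespace-separated word as KEYWORD or TEXT."""
    tagged = []
    for line in source.splitlines():
        # line.split() drops leading whitespace, so position 0 is the
        # first non-whitespace token of the line.
        for position, word in enumerate(line.split()):
            is_keyword = position == 0 and word in KEYWORDS
            tagged.append((word, "KEYWORD" if is_keyword else "TEXT"))
    return tagged


for word, kind in classify("foo bla\nbla foo foo\nfoo bla bla"):
    print(word, kind)
```

Only the third word overall, the "bla" that opens the second line, is tagged KEYWORD; every other word is TEXT.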
To achieve this, I have defined the following simple ANTLR4 grammar:
grammar foobla;
// PARSER
main
: line* EOF
;
line
: indicator text*
;
indicator
: foo
| bla
;
foo: FOO ;
bla: BLA ;
text: TEXT ;
// LEXER
WHITESPACE: [ \t] -> skip ;
fragment NL: [\n\r\f]+[ \t]* ;
fragment NONNL: ~[\n\r\f] ;
// Indicators
FOO: NL 'foo' ;
BLA: NL 'bla' ;
TEXT: NONNL+ ;
This is similar to the answer given in How to detect beginning of line, or: "The name 'getCharPositionInLine' does not exist in the current context".
Now my question: this works fine, except when the "bla" or "foo" keyword appears on the first line of the input program, because no newline precedes it there and the NL prefix cannot match. I can think of two ways to solve this, but I don't know how to implement either of them:
- Use something like a BOF (beginning of file) token. However, I can't find such a concept in the manual.
- Use a hook to dynamically add a newline at the beginning of the input before parsing starts, preferably by specifying something in the g4 file itself. I couldn't find this in the manual either.
I don't want to write an extra application/wrapper to add a new line to the input file just because of this.
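The first-line failure can be reproduced outside ANTLR with a rough regex stand-in for the `BLA: NL 'bla'` rule. This is only a sketch: it assumes NL is `[\n\r\f]+[ \t]*`, it ignores how the real lexer arbitrates between FOO, BLA and TEXT, and the helper name `keyword_lines` is invented for this illustration.

```python
import re

# Rough stand-in for the lexer rule  BLA: NL 'bla' ;  (this is not ANTLR).
# Assumption: NL is [\n\r\f]+[ \t]*.
BLA = re.compile(r"[\n\r\f]+[ \t]*bla")


def keyword_lines(source):
    """Return the 1-based line numbers on which the emulated BLA rule matches."""
    lines = []
    for match in BLA.finditer(source):
        # The match ends inside the line that holds the keyword, so counting
        # the newlines before that point gives the line number.
        lines.append(source.count("\n", 0, match.end()) + 1)
    return lines


program = "bla foo\nfoo bla\nbla foo"
print(keyword_lines(program))         # [3]
print(keyword_lines("\n" + program))  # [2, 4]
```

Without the prepended newline, the "bla" on the first line is missed. With it, both line-initial keywords are found, but every reported line number is shifted by one, which is another reason I would rather not patch the input.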
question from:
https://stackoverflow.com/questions/65900992/how-to-specify-beginning-of-line-keywords-in-antlr-grammars-which-also-works-fo