Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
665 views
in Technique[技术] by (71.8m points)

antlr - ANTLR4 mutual left recursion grammar

I have read many questions here on StackOverflow about mutual left-recursion issues in LL(k) parsers. I found the general algorithm for removing left-recursion:

A : Aa | b ;

becomes

A : bR ;
R : (aA)? ;

However, I cannot figure out how to apply it to my situation. I have

left_exp: IDENT | exp DOT IDENT ;
exp     : handful
        | of
        | other rules
        | left_exp ;  

The "handful of other rules" all contain regular recursion, such as exp : exp PLUS exp, etc. and have no issues. The issue is with left_exp and exp being mutually recursive.

I thought about just adding IDENT and exp DOT IDENT to the exp rules, but there are some situations where the other valid exp rules do not apply, where left_exp would be valid.

EDIT

I also have the following rule, which calls for a left expression followed by assignment.

assign_statement: left_exp ( COLON IDENT )? EQUAL exp SEMI ;

Since a regular expression is only a left expression if it is followed by DOT IDENT, it seems that I can't just add

| IDENT
| exp DOT IDENT

to my expression definition, because then assignment would accept any other valid expression on the left side, rather than only one of those two.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The approach I apply usually goes like this:

A: Aa | b;

becomes:

A: b (a)*;

Or in general: all alts without left recursion followed by all alts with a (removed) left recursion with unlimited occurences (expressed via the kleene operator). Example:

A: Aa | Ab | c | d | Ae;

becomes:

A: (c | d) (a | b | e)*;

You can check this easily by continuosly replacing A:

A: Aa | b;
A: (Aa | b)a | b;
A: Aaa | ba | b;
A: (Aa | b)aa | ba | b;
A: Aaaa | baa | ba | b;

etc.

In your example however you have an indirect left recursion (via 2 rules). This is not accepted by ANTLR. A solution is to move the alts from left_exp to the exp rule and then apply the algorithm I described above.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...