Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

regex - Why/how is an additional variable needed when matching a repeated arbitrary character with capture groups?

I'm matching a sequence of a repeating arbitrary character, with a minimum length, using a perl6 regex.

After reading through https://docs.perl6.org/language/regexes#Capture_numbers and tweaking the example given, I've come up with this code using an 'external variable':

#uses an additional variable $c
perl6 -e '$_="bbaaaaawer"; /((.){} :my $c=$0; ($c)**2..*)/ && print $0';

#Output:  aaaaa

Purely to illustrate my question, here is a similar regex in Perl 5:

#No additional variable needed
perl -e ' $_="bbaaaaawer"; /((.)\2{2,})/ && print $1';

Could someone enlighten me on the need/benefit of 'saving' $0 into $c and the requirement of the empty {}? Is there an alternative (better/golfed) perl6 regex that will match?

Thanks in advance.


1 Answer

Perl 6 regexes scale up to full grammars, which produce parse trees. Those parse trees are a tree of Match objects. Each capture - named or positional - is either a Match object or, if quantified, an array of Match objects.

This is in general good, but does involve making the trade-off you have observed: once you are on the inside of a nested capturing element, then you are populating a new Match object, with its own set of positional and named captures. For example, if we do:

say "abab" ~~ /((a)(b))+/

Then the result is:

｢abab｣
 0 => ｢ab｣
  0 => ｢a｣
  1 => ｢b｣
 0 => ｢ab｣
  0 => ｢a｣
  1 => ｢b｣

And we can then index:

say $0;        # The array of the top-level capture, which was quantified
say $0[1];     # The second Match
say $0[1][0];  # The first Match within that Match object (the (a))

It is a departure from regex tradition, but also an important part of scaling up to larger parsing challenges.

