Difficult concept so I'll try my best... Someone feel free to edit and explain better if it is a bit confusing.
The regex engine scans the string from left to right and returns the first match it finds, that is, the one that starts at the leftmost position. Yes, all of the strings aaaab, aaab, aab, and ab match your pattern, but aaaab is the one that starts furthest to the left, so it is the one that is returned.
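A quick way to see this, as a minimal sketch: I am assuming the pattern from the question was a.*?b (it is not quoted in this answer), and comparing it with its greedy counterpart a.*b. Both start at the first a, and since there is only one b after it, both end at the same place.

```r
library(stringr)

x <- "xxx aaaab yyy"

# Assumed pattern from the question: lazy quantifier between a and b.
lazy <- str_match(x, "a.*?b")[, 1]

# Greedy version of the same pattern, for comparison.
greedy <- str_match(x, "a.*b")[, 1]

lazy    # "aaaab"
greedy  # "aaaab"
```

The quantifier only decides where the match ends, never where it starts.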
So here the non-greedy quantifier makes no difference. Maybe this other example will help you understand when a non-greedy pattern actually kicks in:
str_match('xxx aaaab yyy', "a.*?y")
#      [,1]
# [1,] "aaaab y"
Here all of the strings "aaaab y", "aaaab yy", and "aaaab yyy" match the pattern and start at the same position, but the shortest one is returned because of the non-greedy pattern.
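To make the contrast concrete, here is a small sketch running the same string through the lazy pattern and its greedy counterpart a.*y (the greedy variant is my addition for comparison). Both matches start at the first a; only the end position differs.

```r
library(stringr)

x <- "xxx aaaab yyy"

# Lazy: stop at the first y that completes the match.
lazy_y <- str_match(x, "a.*?y")[, 1]

# Greedy: extend to the last y that completes the match.
greedy_y <- str_match(x, "a.*y")[, 1]

lazy_y    # "aaaab y"
greedy_y  # "aaaab yyy"
```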
So what can you do to catch that last ab? Use this:
str_match('xxx aaaab yyy', ".*(a.*b)")
#      [,1]        [,2]
# [1,] "xxx aaaab" "ab"
How does it work? The greedy .* in front first consumes as much of the string as it can, then backtracks only as far as needed for a.*b to match. That forces the captured group to start at the last possible a.
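In practice you probably only want the captured group, which is the second column of the matrix returned by str_match. A minimal sketch follows. The second pattern, a[^a]*b, is an alternative I am adding that is not part of the approach above: it refuses to let another a appear between the a and the b, so it should only be taken as a suggestion for inputs shaped like this one.

```r
library(stringr)

x <- "xxx aaaab yyy"

# Column 1 is the full match, column 2 is the first capture group.
m <- str_match(x, ".*(a.*b)")
last_ab <- m[, 2]

# Alternative (my assumption, not from the approach above):
# forbid any further 'a' between the 'a' and the 'b'.
alt <- str_match(x, "a[^a]*b")[, 1]

last_ab  # "ab"
alt      # "ab"
```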