Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Can't use '\1' backreference to capture-group in a function call in re.sub() repl expression

I have a string S = '02143' and a list A = ['a','b','c','d','e']. I want to replace all those digits in 'S' with their corresponding element in list A.

For example, replace 0 with A[0], 2 with A[2] and so on. Final output should be S = 'acbed'.

I tried:

S = re.sub(r'([0-9])', A[int(r'\g<1>')], S)

However, this gives the error ValueError: invalid literal for int() with base 10: '\g<1>'. I guess it is treating the backreference '\g<1>' as a plain string. How can I solve this, preferably using re.sub and capture groups, or otherwise in some alternative way?

question from:https://stackoverflow.com/questions/65943956/replace-specific-strings-using-regex

1 Answer

The reason re.sub(r'([0-9])', A[int(r'\g<1>')], S) does not work is that \g<1> (an unambiguous way of writing the first backreference, otherwise written \1) only acts as a backreference when it is used in the replacement string pattern. If you pass it to another function, that function just "sees" the literal string \g<1>, since the re module has no chance to evaluate it at that point. The re engine only evaluates it during a match, but the A[int(r'\g<1>')] part is evaluated before the re engine even attempts to find a match.
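The evaluation order described above can be checked directly. This is a minimal sketch reusing S and A from the question; the names error and wrapped are only illustrative. It shows that int() fails before re.sub is ever called, while the same backreference in a plain replacement string is resolved by re at match time.

```python
import re

S = '02143'
A = ['a', 'b', 'c', 'd', 'e']

# Python evaluates the arguments first, so int() receives the literal
# text \g<1> and raises before re.sub gets a chance to run.
try:
    re.sub(r'([0-9])', A[int(r'\g<1>')], S)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)

# Inside a plain replacement string, re itself expands the backreference
# for every match.
wrapped = re.sub(r'([0-9])', r'<\g<1>>', S)
print(wrapped)  # <0><2><1><4><3>
```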

That is why it is made possible to use callback methods inside re.sub as the replacement argument: you may pass the matched group values to any external methods for advanced manipulation.

See the re documentation:

re.sub(pattern, repl, string, count=0, flags=0)

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

Use

import re

S = '02143'
A = ['a', 'b', 'c', 'd', 'e']
# The lambda receives each match object and looks up the matched digit in A.
print(re.sub(r'[0-9]', lambda x: A[int(x.group())], S))  # acbed

Note that you do not need to capture the whole pattern with parentheses: you can access the whole match with x.group().
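To illustrate the note, here is a small sketch comparing the two forms, again with S and A from the question. The helper names lookup and lookup_group1 are hypothetical, chosen for this example. Since the question also asked for an alternative, the last line shows one possible approach without re, which assumes every character of S is a digit that is a valid index into A.

```python
import re

S = '02143'
A = ['a', 'b', 'c', 'd', 'e']

def lookup(match):
    # group() with no argument is group(0): the whole match.
    return A[int(match.group())]

def lookup_group1(match):
    # With a capturing group in the pattern, group(1) holds the same digit.
    return A[int(match.group(1))]

without_group = re.sub(r'[0-9]', lookup, S)
with_group = re.sub(r'([0-9])', lookup_group1, S)

# Alternative without re: index A with each character of S directly.
joined = ''.join(A[int(ch)] for ch in S)

print(without_group, with_group, joined)  # acbed acbed acbed
```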

