Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to target multiple strings with single regex pattern

I have multiple strings such as

POST /incentivize HTTP/1.1
DELETE /interactive/transparent/niches/revolutionize HTTP/1.1
DELETE /virtual/solutions/target/web+services HTTP/2.0
PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1

I want to target all these strings with regex.

I tried the following pattern

pattern = r"([A-Z]* /([A-Za-z0-9])\D+ [A-Z]*/\d.\d)"

Here is the full code

import re

string = """
POST /incentivize HTTP/1.1
DELETE /interactive/transparent/niches/revolutionize HTTP/1.1
DELETE /virtual/solutions/target/web+services HTTP/2.0
PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1
"""

pattern = r"(?P<url>[A-Z]* /([A-Za-z0-9])\D+ [A-Z]*/\d.\d)"

result = [item.groupdict() for item in re.finditer(pattern, string)]

result

This outputs the following

[{'url': 'POST /incentivize HTTP/1.1'},
 {'url': 'DELETE /interactive/transparent/niches/revolutionize HTTP/1.1'},
 {'url': 'DELETE /virtual/solutions/target/web+services HTTP/2.0'}]

With this pattern, I am able to match the first three strings, but I cannot figure out how to match the last one. This is just a sample of many more strings in the list, so the pattern needs to be general enough to capture any string of this form.

I am a rookie in Python and have just started learning regex.

Any help will be appreciated.

question from:https://stackoverflow.com/questions/65649989/how-to-target-multiple-strings-with-single-regex-pattern


1 Answer


I would use re.findall here with the following regex pattern:

(?:POST|GET|PUT|PATCH|DELETE) /[^/\s]+(?:/[^/\s]+)* HTTP/\d+(?:\.\d+)?

Script:

import re

string = """
POST /incentivize HTTP/1.1
DELETE /interactive/transparent/niches/revolutionize HTTP/1.1
DELETE /virtual/solutions/target/web+services HTTP/2.0
PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1
"""
# findall returns whole matches here because the pattern has no capturing groups
matches = re.findall(r'(?:POST|GET|PUT|PATCH|DELETE) /[^/\s]+(?:/[^/\s]+)* HTTP/\d+(?:\.\d+)?', string)
print(matches)

This prints:

['POST /incentivize HTTP/1.1',
 'DELETE /interactive/transparent/niches/revolutionize HTTP/1.1',
 'DELETE /virtual/solutions/target/web+services HTTP/2.0',
 'PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1']

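The regex pattern first matches one of several HTTP methods in an alternation, to which you may add more methods (for example HEAD or OPTIONS) if necessary. It then matches a path made of one or more /segment parts, where each segment is a run of characters other than / and whitespace, so digits and characters such as % and + are allowed. Finally, it matches HTTP/ followed by a version number with an optional minor part.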
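For what it's worth, the original attempt misses the last line because \D+ cannot match the digits in 24%2f7. If you want the dictionary output from your question rather than plain strings, here is a minimal sketch of the same pattern with named groups and re.VERBOSE. The name REQUEST_LINE and the group names method, path and version are my own choices for illustration, not anything from your code.

```python
import re

# Hypothetical name: the same pattern as above, spread out with named groups.
REQUEST_LINE = re.compile(
    r"""
    (?P<method>POST|GET|PUT|PATCH|DELETE)   # HTTP method (extend as needed)
    [ ]                                     # literal space (verbose mode ignores bare spaces)
    (?P<path>/[^/\s]+(?:/[^/\s]+)*)         # one or more /segment parts
    [ ]
    HTTP/(?P<version>\d+(?:\.\d+)?)         # version, minor part optional
    """,
    re.VERBOSE,
)

string = """
POST /incentivize HTTP/1.1
DELETE /interactive/transparent/niches/revolutionize HTTP/1.1
DELETE /virtual/solutions/target/web+services HTTP/2.0
PATCH /interactive/architect/innovative/24%2f7 HTTP/1.1
"""

# One dict per request line, keyed by group name.
result = [m.groupdict() for m in REQUEST_LINE.finditer(string)]
for row in result:
    print(row)
```

Each entry in result is a dict with the keys method, path and version, and the PATCH line is included. If you only need the whole line, the findall version above is simpler.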

