Introduction

Regular expressions are very powerful tools for matching, searching, and replacing text. Unfortunately, they are also very obtuse. It does not help that most explanations of regular expressions start from the specification, which is like learning to love Friends reruns by reading a VCR manual. This page provides some simple examples for reference. You should know a little programming and how to run basic Perl scripts before reading this article.

Section 1: Basic matching and substitution

Declare a local variable called $mystring.

    my $mystring;

Assign a value (string literal) to the variable.

    $mystring = "Hello world!";

Does the string contain the word "World"?

    if($mystring =~ m/World/) { print "Yes"; }

No, it doesn't. The binding operator =~ tests the variable on its left against the pattern on its right, and the match is case sensitive, so "World" does not match "world".

Does the string contain the word "World", ignoring case?

    if($mystring =~ m/World/i) { print "Yes"; }

Yes, it does. The pattern modifier i, placed after the closing delimiter, makes the match case insensitive.

I want "Hello world!" to be changed to "Hello mom!" instead.

    $mystring =~ s/world/mom/;
    print $mystring;

Prints "Hello mom!". The substitution operator s/// replaces the text matched by the pattern with the replacement string.

Now change "Hello mom!" to say "Goodbye mom!".

    $mystring =~ s/hello/Goodbye/;
    print $mystring;

This does not substitute, and prints "Hello mom!" as before. By default, the search is case sensitive. As before, use the pattern modifier i to ignore case.

Okay, ignoring case, change "Hello mom!" to say "Goodbye mom!".

    $mystring =~ s/hello/Goodbye/i;
    print $mystring;

Prints "Goodbye mom!".

Section 2: Extracting substrings

I want to see if my string contains a digit.

    $mystring = "[2004/04/13] The date of this article.";
    if($mystring =~ m/\d/) { print "Yes"; }

Prints "Yes". The pattern \d matches any single digit.

Huh? Why doesn't "\d" match the exact characters '\' and 'd'? This is because Perl uses characters from the alphabet to also match things with special meaning, like digits. To differentiate between matching a regular character and something else, the character is immediately preceded by a backslash.
Therefore, whenever you read '\' followed by any character, you treat the two together as one symbol. For example, '\d' means digit, '\w' means alphanumeric characters including '_', '\/' means forward slash, and '\\' means match a single backslash. Preceding a character with a '\' is called escaping, and the '\' together with its character is called an escape sequence.

Okay, how do I return the first matching digit from my string?

    $mystring = "[2004/04/13] The date of this article.";
    if($mystring =~ m/(\d)/) { print "The first digit is $1."; }

Prints "The first digit is 2." To designate a pattern for extraction, place parentheses around it. If the pattern is matched, the matched text is returned in the Perl special variable called $1. If there are multiple parenthesized expressions, then they will be in variables $1, $2, $3, etc.

Huh? Why don't '(' and ')' match the parenthesis symbols exactly? This is because the designers of regular expressions felt that some constructs are so common that they should use unescaped characters to represent them. Besides parentheses, there are a number of other characters that have special meanings when unescaped, and these are called metacharacters. To match parenthesis characters or other metacharacters, you have to escape them like '\(' and '\)'. They designed it for their convenience, not to make it easy to learn.

Okay, how do I extract a complete number, like the year?

    $mystring = "[2004/04/13] The date of this article.";
    if($mystring =~ m/(\d+)/) { print "The first number is $1."; }

Prints "The first number is 2004." First, when one says "complete number", one really means a grouping of one or more digits. The pattern quantifier + matches one or more of the preceding element, so \d+ matches a run of one or more digits.

How do I print all the numbers from the string?

    $mystring = "[2004/04/13] The date of this article.";
    while($mystring =~ m/(\d+)/g) { print "Found number $1. "; }

Prints "Found number 2004. Found number 04. Found number 13. ".
This introduces another pattern modifier, g, which tells Perl to search globally. In a while loop, each test resumes where the previous match ended, so every number in the string is found in turn.

How do I get all the numbers from the string into an array instead?

    $mystring = "[2004/04/13] The date of this article.";
    my @myarray = ($mystring =~ m/(\d+)/g);
    print join(",", @myarray);

Prints "2004,04,13". This does the same thing as before, except that it assigns the values returned from the pattern search to @myarray.

Section 3: Common tasks

How do I extract everything between the words "start" and "end"?

    $mystring = "The start text always precedes the end of the end text.";
    if($mystring =~ m/start(.*)end/) { print $1; }

Prints " text always precedes the end of the ".

That isn't exactly what I expected. How do I extract everything between "start" and the first "end" encountered?

    $mystring = "The start text always precedes the end of the end text.";
    if($mystring =~ m/start(.*?)end/) { print $1; }

Prints " text always precedes the ".

Conclusion

Regular expressions in Perl are very powerful, and there are many ways to do the same thing. I hope you find this page useful for getting started with regular expressions. Hopefully, you can now read the specifications and get more out of them.

Perl Book Recommendations
As an experienced, non-Perl programmer, I have been able to get by with the above two books, the comp.lang.perl newsgroup, and the perldoc documentation. I use the first book when I need some example code to get something working quickly, and the second book for reference when I need to look up some regular expression syntax or a specific function call. The Nutshell book is easier to use on my desk as a reference, because it is lightweight. However, if I were to own one book, I would own the Perl Black Book. Neither of these books is for novice programmers who don't understand things like control structures and functions.

Quick (Incomplete) Reference

Metacharacters

These need to be escaped to be matched.

    \ . ^ $ * + ? { } [ ] ( ) |

Escape sequences for pre-defined character classes
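As an illustration of the pre-defined character classes, here is a minimal sketch that is not part of the original reference. The sequences \d and \w are the ones described earlier in this article; \s (whitespace) and the uppercase negation \D are standard Perl. The helper name count_matches is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper for this sketch: count how many times $pattern
# matches inside $text, using a global match in list context.
sub count_matches {
    my ($text, $pattern) = @_;
    my $count = () = $text =~ /$pattern/g;
    return $count;
}

my $sample = "[2004/04/13] The date";

print "digits:     ", count_matches($sample, '\d'), "\n";  # \d = any digit
print "word chars: ", count_matches($sample, '\w'), "\n";  # \w = letter, digit, or _
print "whitespace: ", count_matches($sample, '\s'), "\n";  # \s = space, tab, newline
print "non-digits: ", count_matches($sample, '\D'), "\n";  # \D = anything but a digit
```

For this 21-character sample string the counts are 8 digits, 15 word characters, 2 whitespace characters, and 13 non-digits.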
Assertions

Assertions have zero width.
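To make "zero width" concrete, the sketch below uses three standard Perl assertions: ^ (start of string), $ (end of string), and \b (word boundary). Which assertions the original table listed is not known, so treat this selection as an assumption. An assertion matches a position rather than a character, which is why substituting on \b inserts text without removing any.

```perl
use strict;
use warnings;

my $text = "Hello world!";

# ^ asserts the start of the string and consumes no characters.
my $starts_with_hello = ($text =~ m/^Hello/) ? 1 : 0;

# \b asserts a boundary between a \w character and a non-\w character.
# "wor" is only part of the word "world", so the first test fails.
my $whole_word_wor   = ($text =~ m/\bwor\b/)   ? 1 : 0;
my $whole_word_world = ($text =~ m/\bworld\b/) ? 1 : 0;

# $ asserts the end of the string.
my $ends_with_bang = ($text =~ m/!$/) ? 1 : 0;

# Substituting on a zero-width match inserts a marker at every word
# boundary and leaves all original characters in place.
(my $marked = $text) =~ s/\b/|/g;

print "starts with Hello: $starts_with_hello\n";
print "whole word wor:    $whole_word_wor\n";
print "whole word world:  $whole_word_world\n";
print "ends with !:       $ends_with_bang\n";
print "boundaries:        $marked\n";
```

The last line prints "|Hello| |world|!", showing the four word boundaries in the string.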
Minimal Matching Quantifiers

The quantifiers below match their preceding element in a non-greedy way.
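As a rough sketch of the difference, the script below compares the greedy quantifiers * and + with their minimal forms *? and +?, reusing the "start"/"end" string from Section 3. Only these two pairs are shown; this is an illustration, not a reconstruction of the full list.

```perl
use strict;
use warnings;

my $mystring = "The start text always precedes the end of the end text.";

# Greedy: .* takes as much as it can, so the match runs to the LAST "end".
my ($greedy) = $mystring =~ m/start(.*)end/;

# Minimal: .*? takes as little as it can, so the match stops at the FIRST "end".
my ($minimal) = $mystring =~ m/start(.*?)end/;

# The same contrast with + and +? on a run of digits.
my ($all_digits) = "2004" =~ m/(\d+)/;
my ($one_digit)  = "2004" =~ m/(\d+?)/;

print "greedy:     [$greedy]\n";
print "minimal:    [$minimal]\n";
print "all digits: [$all_digits]\n";
print "one digit:  [$one_digit]\n";
```

The brackets in the output make the leading and trailing spaces of each capture visible.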