7 Pattern Matching with Regular Expressions




The Wildcard Character
The . (or dot) character in a regular expression is called a wildcard and will
match any character except for a newline. For example, enter the following
into the interactive shell:
>>> atRegex = re.compile(r'.at')
>>> atRegex.findall('The cat in the hat sat on the flat mat.')
['cat', 'hat', 'sat', 'lat', 'mat']
Remember that the dot character will match just one character, which
is why the match for the text 'flat' in the previous example matched only
'lat'. To match an actual dot, escape the dot with a backslash: \.
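As a small illustration of the difference between the wildcard dot and an escaped dot, the sketch below checks two patterns against a few sample strings. The pattern names and sample strings are invented for this example and do not come from the text above.

```python
import re

# Unescaped dot: matches any single character except a newline.
any_char = re.compile(r'3.14')
# Escaped dot: matches only a literal period.
literal_dot = re.compile(r'3\.14')

# Hypothetical sample strings, chosen only to show the contrast.
samples = ['3.14', '3x14', '3\n14']

wildcard_hits = [s for s in samples if any_char.fullmatch(s)]
literal_hits = [s for s in samples if literal_dot.fullmatch(s)]

print(wildcard_hits)  # ['3.14', '3x14']
print(literal_hits)   # ['3.14']
```

The string containing a newline is rejected by both patterns, since the dot does not match a newline and the escaped dot matches only a period.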
Matching Everything with Dot-Star
Sometimes you will want to match everything and anything. For example,
say you want to match the string 'First Name:', followed by any and all text,
followed by 'Last Name:', and then followed by anything again. You can
use the dot-star (.*) to stand in for that “anything.” Remember that the
dot character means “any single character except the newline,” and the
star character means “zero or more of the preceding character.”
Enter the following into the interactive shell:
>>> nameRegex = re.compile(r'First Name: (.*) Last Name: (.*)')
>>> mo = nameRegex.search('First Name: Al Last Name: Sweigart')
>>> mo.group(1)
'Al'
>>> mo.group(2)
'Sweigart'
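Because the star means "zero or more," each dot-star group can also match an empty string or text that contains spaces. The following is a minimal sketch of that behavior; the helper parse_name and the extra input strings are hypothetical and only serve the illustration.

```python
import re

name_regex = re.compile(r'First Name: (.*) Last Name: (.*)')

def parse_name(text):
    """Return (first, last) if the pattern is found, otherwise None."""
    mo = name_regex.search(text)
    return mo.groups() if mo else None

# Each group captures whatever text sits between the fixed labels.
print(parse_name('First Name: Al Last Name: Sweigart'))        # ('Al', 'Sweigart')
print(parse_name('First Name: Mary Ann Last Name: Van Dyke'))  # ('Mary Ann', 'Van Dyke')
# The star allows zero characters, so empty fields still match.
print(parse_name('First Name:  Last Name: '))                  # ('', '')
# Without the 'First Name: ' label there is no match at all.
print(parse_name('Last Name: Sweigart'))                       # None
```

Checking the result of search() before calling groups() avoids an AttributeError when the text does not contain the expected labels.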


The dot-star uses greedy mode: It will always try to match as much text as
possible. To match any and all text in a non-greedy fashion, use the dot, star,
and question mark (.*?). As with braces, the question mark tells Python
to match in a non-greedy way.
Enter the following into the interactive shell to see the difference 
between the greedy and non-greedy versions:
>>> nongreedyRegex = re.compile(r'<.*?>')
>>> mo = nongreedyRegex.search('<To serve man> for dinner.>')
>>> mo.group()
'<To serve man>'
>>> greedyRegex = re.compile(r'<.*>')
>>> mo = greedyRegex.search('<To serve man> for dinner.>')
>>> mo.group()
'<To serve man> for dinner.>'
Both regexes roughly translate to “Match an opening angle bracket,
followed by anything, followed by a closing angle bracket.” But the string
'<To serve man> for dinner.>' has two possible matches for the closing angle
bracket. In the non-greedy version of the regex, Python matches the shortest
possible string: '<To serve man>'. In the greedy version, Python matches
the longest possible string: '<To serve man> for dinner.>'.
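The same contrast shows up when collecting every match with findall(). The sketch below uses a made-up string with several bracketed pieces; the string and variable names are assumptions for illustration only.

```python
import re

# Hypothetical sample text with four bracketed segments.
text = '<b>bold</b> and <i>italic</i>'

greedy = re.compile(r'<.*>')
nongreedy = re.compile(r'<.*?>')

# Greedy: one match running from the first '<' to the last '>'.
greedy_matches = greedy.findall(text)
# Non-greedy: each match stops at the nearest '>'.
nongreedy_matches = nongreedy.findall(text)

print(greedy_matches)     # ['<b>bold</b> and <i>italic</i>']
print(nongreedy_matches)  # ['<b>', '</b>', '<i>', '</i>']
```

The greedy pattern returns a single long match, while the non-greedy pattern returns each bracketed segment separately.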
