7 Pattern Matching with Regular Expressions


Optional Matching with the Question Mark



Sometimes there is a pattern that you want to match only optionally. That
is, the regex should find a match regardless of whether that bit of text is
there. The ? character flags the group that precedes it as an optional part
of the pattern. For example, enter the following into the interactive shell:
>>> import re
>>> batRegex = re.compile(r'Bat(wo)?man')
>>> mo1 = batRegex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'
>>> mo2 = batRegex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'
The (wo)? part of the regular expression means that the pattern wo is an
optional group. The regex will match text that has zero instances or one
instance of wo in it. This is why the regex matches both 'Batwoman' and
'Batman'.
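
The "zero or one" rule can be checked directly. The following is a minimal sketch, not taken from the text, and its variable names are only illustrative. It uses group(1) to inspect the optional group, which is None when the group took no part in the match, and it shows that search() returns None once wo appears twice:

```python
import re

bat_regex = re.compile(r'Bat(wo)?man')

# Zero instances of 'wo': the overall match succeeds,
# but the optional group itself captured nothing.
mo_zero = bat_regex.search('The Adventures of Batman')
zero_whole = mo_zero.group()       # the full matched text
zero_optional = mo_zero.group(1)   # None, because (wo) was skipped

# One instance of 'wo': the optional group captured 'wo'.
mo_one = bat_regex.search('The Adventures of Batwoman')
one_whole = mo_one.group()
one_optional = mo_one.group(1)

# Two instances of 'wo' is more than "zero or one", so there is no match.
mo_two = bat_regex.search('The Adventures of Batwowoman')

print(zero_whole, zero_optional)
print(one_whole, one_optional)
print(mo_two)
```

Because search() returns None when nothing matches, calling mo_two.group() here would raise an AttributeError, so it is worth checking the result before using it.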

Using the earlier phone number example, you can make the regex look 
for phone numbers that do or do not have an area code. Enter the following 
into the interactive shell:
>>> phoneRegex = re.compile(r'(\d\d\d-)?\d\d\d-\d\d\d\d')
>>> mo1 = phoneRegex.search('My number is 415-555-4242')
>>> mo1.group()
'415-555-4242'
>>> mo2 = phoneRegex.search('My number is 555-4242')
>>> mo2.group()
'555-4242'
You can think of the ? as saying, “Match zero or one of the group
preceding this question mark.”
If you need to match an actual question mark character, escape it with \?.
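
To illustrate the difference escaping makes, here is a short sketch with made-up sample strings and variable names. It relies on the fact that when ? follows a single character rather than a parenthesized group, it makes just that character optional:

```python
import re

# Escaped: \? matches a literal question mark character.
literal_regex = re.compile(r'Who\?')

# Unescaped: ? makes the preceding 'o' optional, so this pattern
# matches 'Wh' or 'Who' and never consumes a question mark.
optional_regex = re.compile(r'Who?')

text = 'Who? Me?'
literal_match = literal_regex.search(text).group()
optional_match = optional_regex.search(text).group()
short_match = optional_regex.search('Why').group()

print(literal_match)
print(optional_match)
print(short_match)
```

The escaped pattern returns the text including the question mark, while the unescaped pattern stops at 'Who' and even matches the 'Wh' in 'Why'.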
Matching Zero or More with the Star
The * (called the star or asterisk) means “match zero or more”: the group
that precedes the star can occur any number of times in the text. It can be
completely absent or repeated over and over again. Let’s look at the Batman
example again.
>>> batRegex = re.compile(r'Bat(wo)*man')
>>> mo1 = batRegex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'
>>> mo2 = batRegex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'
>>> mo3 = batRegex.search('The Adventures of Batwowowowoman')
>>> mo3.group()
'Batwowowowoman'
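
As a side-by-side sketch (the loop and variable names are illustrative, not from the text), the same three strings can be run through both the star pattern and the earlier question mark pattern to see where "zero or more" and "zero or one" differ:

```python
import re

star_regex = re.compile(r'Bat(wo)*man')      # zero or more 'wo'
question_regex = re.compile(r'Bat(wo)?man')  # zero or one 'wo'

titles = ['Batman', 'Batwoman', 'Batwowowowoman']

results = []
for title in titles:
    # search() returns a match object on success and None on failure.
    star_found = star_regex.search(title) is not None
    question_found = question_regex.search(title) is not None
    results.append((title, star_found, question_found))
    print(title, star_found, question_found)
```

Both patterns accept 'Batman' and 'Batwoman'. Only the star pattern accepts 'Batwowowowoman', since four repetitions of wo exceed the single occurrence that ? allows.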
For 'Batman', the (wo)*