part of the regex matches zero instances of wo in the string; for 'Batwoman', the (wo)* matches one instance of wo; and for 'Batwowowowoman', (wo)* matches four instances of wo.
If you need to match an actual star character, prefix the star in the regular expression with a backslash, \*.
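As a rough sketch of the star behavior described above, the snippet below assumes the pattern under discussion is Bat(wo)*man and runs it against three titles, then matches a literal asterisk with \*. The variable names and the 'Compute 5 * 3 by hand' sample string are illustrative assumptions rather than part of the original example.

```python
import re

# (wo)* lets the group appear zero or more times.
bat_star = re.compile(r'Bat(wo)*man')
star_matches = [
    bat_star.search(text).group()
    for text in (
        'The Adventures of Batman',
        'The Adventures of Batwoman',
        'The Adventures of Batwowowowoman',
    )
]
print(star_matches)  # ['Batman', 'Batwoman', 'Batwowowowoman']

# \* matches a literal asterisk instead of acting as a quantifier.
literal_star = re.compile(r'5 \* 3')
literal_match = literal_star.search('Compute 5 * 3 by hand')
print(literal_match.group())  # 5 * 3
```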
Matching One or More with the Plus
While * means “match zero or more,” the + (or plus) means “match one or more.” Unlike the star, which does not require its group to appear in the matched string, the group preceding a plus must appear at least once. It is not optional. Enter the following into the interactive shell, and compare it with the star regexes in the previous section:
>>> import re
>>> batRegex = re.compile(r'Bat(wo)+man')
>>> mo1 = batRegex.search('The Adventures of Batwoman')
>>> mo1.group()
'Batwoman'
>>> mo2 = batRegex.search('The Adventures of Batwowowowoman')
>>> mo2.group()
'Batwowowowoman'
>>> mo3 = batRegex.search('The Adventures of Batman')
>>> mo3 is None
True
The regex Bat(wo)+man will not match the string 'The Adventures of Batman', because at least one wo is required by the plus sign.
If you need to match an actual plus sign character, prefix the plus sign with a backslash to escape it: \+.
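To make the difference between an escaped and an unescaped plus concrete, here is a small sketch. The patterns 2\+2 and 2+2, the sample strings, and the variable names are assumptions chosen for illustration only.

```python
import re

# Escaped: \+ matches a literal plus sign.
literal_plus = re.compile(r'2\+2')
escaped_result = literal_plus.search('What is 2+2?').group()
print(escaped_result)  # 2+2

# Unescaped: 2+ means "one or more 2s", so this pattern looks for a run
# of 2s followed by another 2 and never looks for a '+' character.
quantifier_plus = re.compile(r'2+2')
unescaped_on_sum = quantifier_plus.search('What is 2+2?')
unescaped_on_digits = quantifier_plus.search('Room 2222').group()
print(unescaped_on_sum)     # None
print(unescaped_on_digits)  # 2222
```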
Matching Specific Repetitions with Braces
If you have a group that you want to repeat a specific number of times, follow the group in your regex with a number in braces. For example, the regex (Ha){3} will match the string 'HaHaHa', but it will not match 'HaHa', since the latter has only two repeats of the (Ha) group.
Instead of one number, you can specify a range by writing a minimum, a comma, and a maximum in between the braces. For example, the regex (Ha){3,5} will match 'HaHaHa', 'HaHaHaHa', and 'HaHaHaHaHa'.
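The brace examples above can be checked with a short sketch like the following. It uses fullmatch() rather than search() so that each test string must be matched in its entirety. That choice, along with the variable names, is an assumption made for this illustration.

```python
import re

ha_exactly_three = re.compile(r'(Ha){3}')
ha_three_to_five = re.compile(r'(Ha){3,5}')

# Does (Ha){3} match each whole string?
exact = {
    text: ha_exactly_three.fullmatch(text) is not None
    for text in ('HaHa', 'HaHaHa')
}
print(exact)  # {'HaHa': False, 'HaHaHa': True}

# Does (Ha){3,5} match a string made of n repeats of 'Ha'?
ranged = {
    n: ha_three_to_five.fullmatch('Ha' * n) is not None
    for n in range(2, 7)
}
print(ranged)  # {2: False, 3: True, 4: True, 5: True, 6: False}
```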
You can also leave out the first or second number in the braces to leave the minimum or maximum unbounded. For example, (Ha){3,} will match three or more instances of the (Ha) group, while (Ha){,5} will match zero to five instances. Braces can help make your regular expressions shorter. These two regular expressions match identical patterns:
(Ha){3}
(Ha)(Ha)(Ha)
And these two regular expressions also match identical patterns:
(Ha){3,5}
((Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha)(Ha))
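The open-ended ranges and the short and long forms above can be compared with a sketch along these lines. The accepts() helper and the variable names are hypothetical, and fullmatch() is used here as an assumption so that each test string must be matched as a whole.

```python
import re

at_least_three = re.compile(r'(Ha){3,}')  # no upper bound
at_most_five = re.compile(r'(Ha){,5}')    # lower bound defaults to zero
short_form = re.compile(r'(Ha){3,5}')
long_form = re.compile(
    r'((Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha)(Ha))'
)

def accepts(pattern, text):
    """Return True when the pattern matches the whole of text."""
    return pattern.fullmatch(text) is not None

counts = range(8)
open_top = [n for n in counts if accepts(at_least_three, 'Ha' * n)]
open_bottom = [n for n in counts if accepts(at_most_five, 'Ha' * n)]
same_strings = all(
    accepts(short_form, 'Ha' * n) == accepts(long_form, 'Ha' * n)
    for n in counts
)
print(open_top)      # [3, 4, 5, 6, 7]
print(open_bottom)   # [0, 1, 2, 3, 4, 5]
print(same_strings)  # True

# search() can return different text for the two forms.
short_found = short_form.search('HaHaHaHaHa').group()
long_found = long_form.search('HaHaHaHaHa').group()
print(short_found)   # HaHaHaHaHa
print(long_found)    # HaHaHa
```

Both forms accept the same whole strings, but search() does not always return the same text for them: alternation tries its leftmost alternative first, so the long form stops at 'HaHaHa', while the brace form takes as many repeats as it can.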