Chapter 7
Matching One or More with the Plus
While * means “match zero or more,” the + (or plus) means “match one or more.” Unlike the star, which does not require its group to appear in the matched string, the group preceding a plus must appear at least once. It is not optional. Enter the following into the interactive shell, and compare it with the star regexes in the previous section:
>>> import re
>>> batRegex = re.compile(r'Bat(wo)+man')
>>> mo1 = batRegex.search('The Adventures of Batwoman')
>>> mo1.group()
'Batwoman'
>>> mo2 = batRegex.search('The Adventures of Batwowowowoman')
>>> mo2.group()
'Batwowowowoman'
>>> mo3 = batRegex.search('The Adventures of Batman')
>>> mo3 == None
True
The regex Bat(wo)+man will not match the string 'The Adventures of Batman', because at least one wo is required by the plus sign.
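To make the contrast with the star concrete, here is a minimal sketch that runs both quantifiers against the same string. The variable names (star_regex, plus_regex, and so on) are illustrative and do not come from the text above.

```python
import re

# Same group, two quantifiers: * allows zero repeats, + requires at least one.
star_regex = re.compile(r'Bat(wo)*man')
plus_regex = re.compile(r'Bat(wo)+man')

text = 'The Adventures of Batman'

star_match = star_regex.search(text)  # zero 'wo' groups is acceptable to *
plus_match = plus_regex.search(text)  # + finds nothing, so this is None

print(star_match.group())  # Batman
print(plus_match is None)  # True
```

Both regexes behave identically on 'Batwoman' and 'Batwowowowoman'; they differ only when the group is absent.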
If you need to match an actual plus sign character, prefix the plus sign with a backslash to escape it: \+.
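As a small illustration of the escaped plus, the sketch below matches literal plus signs in two sample strings. The sample strings and variable names are made up for this example.

```python
import re

# \+ is a literal plus sign; the unescaped + after \d is still a quantifier.
sum_regex = re.compile(r'\d+\+\d+')  # digits, a literal plus sign, digits
sum_match = sum_regex.search('The answer to 12+30 is 42')
print(sum_match.group())  # 12+30

# Two escaped plus signs in a row match the text 'C++'.
cpp_regex = re.compile(r'C\+\+')
cpp_match = cpp_regex.search('I like C++ and Python')
print(cpp_match.group())  # C++
```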
Matching Specific Repetitions with Braces
If you have a group that you want to repeat a specific number of times, follow the group in your regex with a number in braces. For example, the regex (Ha){3} will match the string 'HaHaHa', but it will not match 'HaHa', since the latter has only two repeats of the (Ha) group.
Instead of one number, you can specify a range by writing a minimum, a comma, and a maximum in between the braces. For example, the regex (Ha){3,5} will match 'HaHaHa', 'HaHaHaHa', and 'HaHaHaHaHa'.
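The exact count and the bounded range can be checked with a short sketch. It uses fullmatch(), which is not introduced in this section, as an assumption-level convenience: it succeeds only when the entire string fits the pattern, which makes the counting easy to see. All variable names are illustrative.

```python
import re

exactly_three = re.compile(r'(Ha){3}')
three_to_five = re.compile(r'(Ha){3,5}')

# {3} needs three repeats of the group; two are not enough.
three_match = exactly_three.search('HaHaHa')
two_match = exactly_three.search('HaHa')
print(three_match.group())  # HaHaHa
print(two_match is None)    # True

# Test strings with 2 through 6 repeats against the {3,5} range.
results = {}
for count in range(2, 7):
    laugh = 'Ha' * count
    results[count] = three_to_five.fullmatch(laugh) is not None
print(results)  # {2: False, 3: True, 4: True, 5: True, 6: False}
```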
You can also leave out the first or second number in the braces to leave the minimum or maximum unbounded. For example, (Ha){3,} will match three or more instances of the (Ha) group, while (Ha){,5} will match zero to five instances. Braces can help make your regular expressions shorter. These two regular expressions match identical patterns:
(Ha){3}
(Ha)(Ha)(Ha)
And these two regular expressions also match identical patterns:
(Ha){3,5}
((Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha)(Ha))
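The open-ended ranges and the short-versus-long equivalence can be sketched as follows. As an assumption for illustration, the comparison uses fullmatch() so that each pattern has to account for the whole test string; the variable names are hypothetical.

```python
import re

at_least_three = re.compile(r'(Ha){3,}')  # no upper bound
up_to_five = re.compile(r'(Ha){,5}')      # lower bound defaults to zero

print(at_least_three.fullmatch('HaHa') is None)         # True: only two repeats
print(at_least_three.fullmatch('Ha' * 10) is not None)  # True: ten repeats
print(up_to_five.fullmatch('') is not None)             # True: zero repeats
print(up_to_five.fullmatch('Ha' * 6) is None)           # True: six is too many

# The braces form and the spelled-out alternation accept the same strings.
short_form = re.compile(r'(Ha){3,5}')
long_form = re.compile(
    r'((Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha)(Ha))'
)
same_verdicts = all(
    (short_form.fullmatch('Ha' * count) is None)
    == (long_form.fullmatch('Ha' * count) is None)
    for count in range(8)
)
print(same_verdicts)  # True
```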
Pattern Matching with Regular Expressions