Matching Multiple Groups with the Pipe

The | character is called a pipe. You can use it anywhere you want to match one of many expressions. For example, the regular expression r'Batman|Tina Fey' will match either 'Batman' or 'Tina Fey'.

When both Batman and Tina Fey occur in the searched string, the first occurrence of matching text will be returned as the Match object. Enter the following into the interactive shell:
>>> heroRegex = re.compile(r'Batman|Tina Fey')
>>> mo1 = heroRegex.search('Batman and Tina Fey')
>>> mo1.group()
'Batman'

>>> mo2 = heroRegex.search('Tina Fey and Batman')
>>> mo2.group()
'Tina Fey'
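
The same behavior can be sketched as a small standalone script. The helper name first_hero and the third sample string are illustrative assumptions, not part of the original example. The sketch shows that the position in the searched string, not the order of the alternatives in the pattern, decides which text is returned, and that search() returns None when neither alternative is present.

```python
import re

# Same pattern as in the interactive shell example.
hero_regex = re.compile(r'Batman|Tina Fey')

def first_hero(text):
    """Return the first matching text in `text`, or None if nothing matches."""
    mo = hero_regex.search(text)
    return mo.group() if mo else None

print(first_hero('Batman and Tina Fey'))  # Batman
print(first_hero('Tina Fey and Batman'))  # Tina Fey
print(first_hero('Wonder Woman'))         # None
```

Checking for None before calling group() avoids an AttributeError when the searched string contains neither alternative.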
NOTE You can find all matching occurrences with the findall() method that’s discussed in “The findall() Method” on page 171.

You can also use the pipe to match one of several patterns as part of your regex. For example, say you wanted to match any of the strings 'Batman', 'Batmobile', 'Batcopter', and 'Batbat'. Since all these strings start with Bat, it would be nice if you could specify that prefix only once. This can be done with parentheses. Enter the following into the interactive shell:
>>> batRegex = re.compile(r'Bat(man|mobile|copter|bat)')
>>> mo = batRegex.search('Batmobile lost a wheel')
>>> mo.group()
'Batmobile'
>>> mo.group(1)
'mobile'
The method call mo.group() returns the full matched text 'Batmobile', while mo.group(1) returns just the part of the matched text inside the first parentheses group, 'mobile'. By using the pipe character and grouping parentheses, you can specify several alternative patterns you would like your regex to match.

If you need to match an actual pipe character, escape it with a backslash, like \|.
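
As a minimal sketch of the two ideas above, the script below contrasts a grouped alternation with an escaped pipe. The helper name bat_suffix, the 'yes|no' pattern, and the sample strings are assumptions made for illustration only.

```python
import re

# Grouped alternation: the prefix 'Bat' is written only once.
bat_regex = re.compile(r'Bat(man|mobile|copter|bat)')

def bat_suffix(text):
    """Return (full match, group 1) for the first match, or None."""
    mo = bat_regex.search(text)
    return (mo.group(), mo.group(1)) if mo else None

print(bat_suffix('Batcopter is refueling'))  # ('Batcopter', 'copter')
print(bat_suffix('Batwoman arrived'))        # None

# Escaped pipe: \| matches a literal '|' instead of acting as alternation.
literal_regex = re.compile(r'yes\|no')
print(literal_regex.search('choose yes|no').group())  # yes|no
print(literal_regex.search('answer: no'))             # None
```

With the pipe escaped, the pattern no longer matches 'no' on its own, because the whole text 'yes|no' must now appear in the searched string.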