Pattern Matching with Regular Expressions
    for i in range(8, 12):
        if not text[i].isdecimal():
            return False
    return True
print('Is 415-555-4242 a phone number?')
print(isPhoneNumber('415-555-4242'))
print('Is Moshi moshi a phone number?')
print(isPhoneNumber('Moshi moshi'))
When this program is run, the output looks like this:
Is 415-555-4242 a phone number?
True
Is Moshi moshi a phone number?
False
The isPhoneNumber() function has code that does several checks to see whether the string in text is a valid phone number. If any of these checks fails, the function returns False. First the code checks that the string is exactly 12 characters long. Then it checks that the area code (that is, the first three characters in text) consists of only numeric characters. The rest of the function checks that the string follows the pattern of a phone number: the number must have the first hyphen after the area code, three more numeric characters, then another hyphen, and finally four more numeric characters. If the program execution manages to get past all the checks, it returns True.
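The listing above shows only the last lines of the function. As a rough sketch, the complete function could look like the following, reconstructed from the checks just described; the comments and the exact layout of the earlier checks are assumptions, since only the final loop appears in the listing.

```python
def isPhoneNumber(text):
    # Check 1: the string must be exactly 12 characters long.
    if len(text) != 12:
        return False
    # Check 2: the area code (first three characters) must be numeric.
    for i in range(0, 3):
        if not text[i].isdecimal():
            return False
    # Check 3: the first hyphen must follow the area code.
    if text[3] != '-':
        return False
    # Check 4: three more numeric characters.
    for i in range(4, 7):
        if not text[i].isdecimal():
            return False
    # Check 5: the second hyphen.
    if text[7] != '-':
        return False
    # Check 6: four more numeric characters.
    for i in range(8, 12):
        if not text[i].isdecimal():
            return False
    # Every check passed, so the string matches the pattern.
    return True
```

Because the length is tested first, the later index lookups such as text[3] can never run past the end of a short string.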
Calling isPhoneNumber() with the argument '415-555-4242' will return True. Calling isPhoneNumber() with 'Moshi moshi' will return False; the first test fails because 'Moshi moshi' is not 12 characters long.
If you wanted to find a phone number within a larger string, you would have to add even more code to find the phone number pattern. Replace the last four print() function calls in isPhoneNumber.py with the following:
message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'
for i in range(len(message)):
    chunk = message[i:i+12]
    if isPhoneNumber(chunk):
        print('Phone number found: ' + chunk)
print('Done')
When this program is run, the output will look like this:
Phone number found: 415-555-1011
Phone number found: 415-555-9999
Done
On each iteration of the for loop, a new chunk of 12 characters from message is assigned to the variable chunk. For example, on the first iteration, i is 0, and chunk is assigned message[0:12] (that is, the string 'Call me at 4'). On the next iteration, i is 1, and chunk is assigned message[1:13] (the string 'all me at 41').
In other words, on each iteration of the for loop, chunk takes on the following values:
• 'Call me at 4'
• 'all me at 41'
• 'll me at 415'
• 'l me at 415-'
• . . . and so on.
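To see these values directly, the slicing can be tried on its own. This is a small sketch that only builds the first four chunks; the chunks list is an illustrative name, not part of isPhoneNumber.py.

```python
message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'

# Collect the 12-character slices that start at indexes 0 through 3.
chunks = []
for i in range(4):
    chunks.append(message[i:i+12])

for i, chunk in enumerate(chunks):
    print(i, repr(chunk))

# Near the end of the string a slice simply comes out shorter than
# 12 characters instead of raising an error.
lastChunk = message[len(message) - 1:len(message) + 11]
print(repr(lastChunk))
```

The shorter slices at the end of message are harmless here, because isPhoneNumber() rejects any string that is not exactly 12 characters long.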
You pass chunk to isPhoneNumber() to see whether it matches the phone number pattern, and if so, you print the chunk.
Continue to loop through message, and eventually the 12 characters in chunk will be a phone number. The loop goes through the entire string, testing each 12-character piece and printing any chunk it finds that satisfies isPhoneNumber(). Once we’re done going through message, we print Done.
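The same scan can be wrapped in a function that returns the matches instead of printing them. The sketch below is one possible arrangement: findPhoneNumbers() is a hypothetical helper name, and the isPhoneNumber() shown is a condensed stand-in that performs the same checks so the example runs on its own.

```python
def isPhoneNumber(text):
    # Condensed stand-in: same length, hyphen, and digit checks.
    if len(text) != 12 or text[3] != '-' or text[7] != '-':
        return False
    digits = text[0:3] + text[4:7] + text[8:12]
    return all(ch.isdecimal() for ch in digits)

def findPhoneNumbers(message):
    # Hypothetical helper: collect every 12-character chunk that
    # passes isPhoneNumber().
    found = []
    # Stop at the last index where a full 12-character chunk fits;
    # shorter trailing slices could never match anyway.
    for i in range(len(message) - 11):
        chunk = message[i:i+12]
        if isPhoneNumber(chunk):
            found.append(chunk)
    return found

message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'
print(findPhoneNumbers(message))
```

Returning a list keeps the searching separate from the printing, so the caller can decide what to do with the numbers that were found.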
While the string in message is short in this example, it could be millions of characters long and the program would still run in less than a second. A similar program that finds phone numbers using regular expressions would also run in less than a second, but regular expressions make it quicker to write these programs.
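As a preview of that difference, here is a minimal sketch of the same search written with Python's re module. The variable name phoneNumRegex is chosen for illustration, and the pattern is one reasonable way to express "three digits, hyphen, three digits, hyphen, four digits".

```python
import re

# \d{3} means three digits and \d{4} means four digits, so this
# pattern describes the same shape that isPhoneNumber() tests for.
phoneNumRegex = re.compile(r'\d{3}-\d{3}-\d{4}')

message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'
numbers = phoneNumRegex.findall(message)

for number in numbers:
    print('Phone number found: ' + number)
print('Done')
```

For this message the regular expression finds the same two phone numbers as the chunk-by-chunk loop, with the whole pattern described in a single line.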