7 Pattern Matching with Regular Expressions




Character Classes
In the earlier phone number regex example, you learned that \d could stand for any numeric digit. That is, \d is shorthand for the regular expression (0|1|2|3|4|5|6|7|8|9). There are many such shorthand character classes, as shown in Table 7-1.


Table 7-1: Shorthand Codes for Common Character Classes

Shorthand character class    Represents
\d                           Any numeric digit from 0 to 9.
\D                           Any character that is not a numeric digit from 0 to 9.
\w                           Any letter, numeric digit, or the underscore character. (Think of this as matching "word" characters.)
\W                           Any character that is not a letter, numeric digit, or the underscore character.
\s                           Any space, tab, or newline character. (Think of this as matching "space" characters.)
\S                           Any character that is not a space, tab, or newline.

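As a rough illustration of Table 7-1, the sketch below runs each shorthand class against one short string. The sample string and the variable names are made up for this example; only the re module calls come from the standard library.

```python
import re

# Hypothetical sample text: letters, an underscore, digits, spaces, and a tab.
sample = 'Room_42 opens\tat 9'

# Each lowercase class is paired with its negated uppercase counterpart.
digits = re.findall(r'\d', sample)       # numeric digits
non_digits = re.findall(r'\D', sample)   # everything except digits
word_chars = re.findall(r'\w', sample)   # letters, digits, underscore
non_word = re.findall(r'\W', sample)     # everything except word characters
spaces = re.findall(r'\s', sample)       # space, tab, newline
non_spaces = re.findall(r'\S', sample)   # everything except whitespace

print(digits)     # ['4', '2', '9']
print(non_word)   # [' ', '\t', ' ']
print(spaces)     # [' ', '\t', ' ']
```

Each character lands in exactly one list of every lowercase/uppercase pair, so the lengths of digits and non_digits add up to the length of the sample. Note that for Python 3 string patterns these classes are Unicode-aware by default, so \d and \w can also match digits and letters outside the ASCII range; passing the re.ASCII flag restricts them to the ranges described in the table.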
Character classes are nice for shortening regular expressions. The character class [0-5] will match only the digits 0 to 5; this is much shorter than typing (0|1|2|3|4|5). Note that while \d matches digits and \w matches digits, letters, and the underscore, there is no shorthand character class that matches only letters. (You can, however, use the [a-zA-Z] character class, as explained next.)
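The comparison in the paragraph above can be checked with a minimal sketch like the following. The test strings and variable names are invented for illustration.

```python
import re

# A bracketed range and the equivalent alternation accept the same characters.
range_regex = re.compile(r'[0-5]')
alternation_regex = re.compile(r'(0|1|2|3|4|5)')

text = 'Score: 4 of 9, level 27'
range_hits = range_regex.findall(text)
alt_hits = alternation_regex.findall(text)
print(range_hits)   # ['4', '2']
print(alt_hits)     # ['4', '2']

# There is no letters-only shorthand, but [a-zA-Z] fills the gap.
letters_regex = re.compile(r'[a-zA-Z]+')
words = letters_regex.findall('3 hens, 2 doves')
print(words)        # ['hens', 'doves']
```

The digits 9 and 7 are skipped because they fall outside the 0 to 5 range, and [a-zA-Z]+ ignores the digits, commas, and spaces that \w+ or \S+ would partly pick up.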
To see the shorthand classes in action, enter the following into the interactive shell:
>>> import re
>>> xmasRegex = re.compile(r'\d+\s\w+')
>>> xmasRegex.findall('12 drummers, 11 pipers, 10 lords, 9 ladies, 8 maids, 7 swans, 6 geese, 5 rings, 4 birds, 3 hens, 2 doves, 1 partridge')
['12 drummers', '11 pipers', '10 lords', '9 ladies', '8 maids', '7 swans', '6 geese', '5 rings', '4 birds', '3 hens', '2 doves', '1 partridge']
The regular expression \d+\s\w+ will match text that has one or more numeric digits (\d+), followed by a whitespace character (\s), followed by one or more letter/digit/underscore characters (\w+). The findall() method returns all matching strings of the regex pattern in a list.
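To probe where this pattern stops matching, the sketch below feeds it a few edge cases. The input strings are hypothetical and chosen only to exercise each part of \d+\s\w+.

```python
import re

xmas_regex = re.compile(r'\d+\s\w+')

# \w+ stops at the comma, because ',' is not a letter, digit, or underscore.
with_commas = xmas_regex.findall('12 drummers, 11 pipers')
print(with_commas)    # ['12 drummers', '11 pipers']

# \s also accepts a tab, not just a space.
with_tab = xmas_regex.findall('7\tswans')
print(with_tab)       # ['7\tswans']

# No whitespace between the digits and the word, so nothing matches.
no_space = xmas_regex.findall('12drummers')
print(no_space)       # []

# \s without a + matches exactly one whitespace character, so two spaces fail.
two_spaces = xmas_regex.findall('12  drummers')
print(two_spaces)     # []
```

If runs of whitespace should be accepted, one option is to write \s+ instead of \s; that is a change to the pattern, not something the original example does.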
