7 Pattern Matching with Regular Expressions




Case-Insensitive Matching
Normally, regular expressions match text with the exact casing you specify. 
For example, the following regexes match completely different strings:
>>> regex1 = re.compile('RoboCop')
>>> regex2 = re.compile('ROBOCOP')
>>> regex3 = re.compile('robOcop')
>>> regex4 = re.compile('RobocOp')
But sometimes you care only about matching the letters without worrying whether they’re uppercase or lowercase. To make your regex case-insensitive, you can pass re.IGNORECASE or re.I as a second argument to re.compile(). Enter the following into the interactive shell:
>>> robocop = re.compile(r'robocop', re.I)
>>> robocop.search('RoboCop is part man, part machine, all cop.').group()
'RoboCop'
>>> robocop.search('ROBOCOP protects the innocent.').group()
'ROBOCOP'
>>> robocop.search('Al, why does your programming book talk about robocop so much?').group()
'robocop'
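As a minimal sketch of the same idea outside the interactive shell, the script below runs one case-insensitive regex against the four casings listed earlier and compares it with a regex compiled without the flag. The variable names (samples, strict, and so on) are illustrative and not part of the original example.

```python
import re

# re.I is simply a shorter alias for re.IGNORECASE.
robocop = re.compile(r'robocop', re.IGNORECASE)

# The four casings used by regex1 through regex4 above.
samples = ['RoboCop', 'ROBOCOP', 'robOcop', 'RobocOp']

# With the flag, every casing matches, and group() returns the text
# exactly as it appears in the searched string.
matches = [robocop.search(s).group() for s in samples]
print(matches)

# Without the flag, the all-lowercase pattern matches none of them.
strict = re.compile(r'robocop')
strict_hits = [s for s in samples if strict.search(s)]
print(strict_hits)
```

Note that the flag changes only how the pattern is compared; the matched text keeps its original casing.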


Substituting Strings with the sub() Method
Regular expressions can not only find text patterns but can also substitute new text in place of those patterns. The sub() method for Regex objects is passed two arguments. The first argument is a string to replace any matches. The second is the string to search, that is, the text in which the regular expression looks for matches. The sub() method returns a new string with the substitutions applied; the original string is left unchanged.
For example, enter the following into the interactive shell:
>>> namesRegex = re.compile(r'Agent \w+')
>>> namesRegex.sub('CENSORED', 'Agent Alice gave the secret documents to Agent Bob.')
'CENSORED gave the secret documents to CENSORED.'
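The following sketch shows a few details of sub() that the shell session does not: the original string is left untouched, a string with no matches comes back unchanged, and the optional count argument limits how many matches are replaced. The variable names are illustrative, and the second test string is made up for this example.

```python
import re

names_regex = re.compile(r'Agent \w+')
message = 'Agent Alice gave the secret documents to Agent Bob.'

# sub() returns a new string; message itself is not modified.
censored = names_regex.sub('CENSORED', message)
print(censored)

# The optional count argument limits the number of replacements.
first_only = names_regex.sub('CENSORED', message, count=1)
print(first_only)

# When nothing matches, the input string is returned as is.
untouched = names_regex.sub('CENSORED', 'No agents here.')
print(untouched)
```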
Sometimes you may need to use the matched text itself as part of the substitution. In the first argument to sub(), you can type \1, \2, \3, and so on, to mean “Enter the text of group 1, 2, 3, and so on, in the substitution.”
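To see more than one group at work, here is a small sketch that uses \1 and \2 together to reorder a two-word name. The regex and the sample name are hypothetical and chosen only for illustration. The \g<0> form, which stands for the entire match, is also shown.

```python
import re

# Group 1 captures the first word, group 2 the second.
name_regex = re.compile(r'(\w+) (\w+)')

# \2 and \1 in the replacement are filled in with the captured text.
swapped = name_regex.sub(r'\2, \1', 'Alice Smith')
print(swapped)

# \g<0> refers to the whole match rather than to a single group.
tagged = name_regex.sub(r'[\g<0>]', 'Alice Smith')
print(tagged)
```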
For example, say you want to censor the names of the secret agents by showing just the first letters of their names. To do this, you could use the regex Agent (\w)\w* and pass r'\1****' as the first argument to sub(). The \1 in that string will be replaced by whatever text was matched by group 1, that is, the (\w) group of the regular expression.
>>> agentNamesRegex = re.compile(r'Agent (\w)\w*')
>>> agentNamesRegex.sub(r'\1****', 'Agent Alice told Agent Carol that Agent 
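As a self-contained sketch of this first-letter technique, the script below applies the same regex to a short, made-up sentence (the sentence is an assumption for illustration, not the book's text). It also shows why the replacement is written as a raw string: without the r prefix, Python reads '\1' as the control character '\x01' before sub() ever sees it.

```python
import re

agent_names_regex = re.compile(r'Agent (\w)\w*')

# Hypothetical sample sentence, used only for this sketch.
sentence = 'Agent Alice met Agent Bob.'

# Raw string: sub() receives a backslash followed by 1 and treats it
# as a reference to group 1 (the first letter of the name).
censored = agent_names_regex.sub(r'\1****', sentence)
print(censored)

# Non-raw string: Python turns '\1' into the character '\x01', so no
# group reference is left and that character is inserted literally.
broken = agent_names_regex.sub('\1****', sentence)
print(repr(broken))
```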
