Python正则表达式re模块.docx

Uploaded by: b****8  Document ID: 10655471  Upload date: 2023-02-22  Format: DOCX  Pages: 11  Size: 19.59 KB
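Python Regular Expressions: The re Module

Python Regular Expressions

Overview

An earlier article covered the theory of regular expressions in Python. This one covers Python's regular expression module, re.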

Import the re module:

import re

View the module's help text:

print(re.__doc__)

The help text it prints is:

Support for regular expressions (RE).

This module provides regular expression matching operations similar to
those found in Perl.  It supports both 8-bit and Unicode strings; both
the pattern and the strings being processed can contain null bytes and
characters outside the US ASCII range.

Regular expressions can contain both special and ordinary characters.
Most ordinary characters, like "A", "a", or "0", are the simplest
regular expressions; they simply match themselves.  You can
concatenate ordinary characters, so last matches the string 'last'.

The special characters are:
    "."      Matches any character except a newline.
    "^"      Matches the start of the string.
    "$"      Matches the end of the string or just before the newline at
             the end of the string.
    "*"      Matches 0 or more (greedy) repetitions of the preceding RE.
             Greedy means that it will match as many repetitions as possible.
    "+"      Matches 1 or more (greedy) repetitions of the preceding RE.
    "?"      Matches 0 or 1 (greedy) of the preceding RE.
    *?,+?,?? Non-greedy versions of the previous three special characters.
    {m,n}    Matches from m to n repetitions of the preceding RE.
    {m,n}?   Non-greedy version of the above.
    "\\"     Either escapes special characters or signals a special sequence.
    []       Indicates a set of characters.
             A "^" as the first character indicates a complementing set.
    "|"      A|B, creates an RE that will match either A or B.
    (...)    Matches the RE inside the parentheses.
             The contents can be retrieved or matched later in the string.
    (?iLmsux) Set the I, L, M, S, U, or X flag for the RE (see below).
    (?:...)  Non-grouping version of regular parentheses.
    (?P<name>...) The substring matched by the group is accessible by name.
    (?P=name)     Matches the text matched earlier by the group named name.
    (?#...)  A comment; ignored.
    (?=...)  Matches if ... matches next, but doesn't consume the string.
    (?!...)  Matches if ... doesn't match next.
    (?<=...) Matches if preceded by ... (must be fixed length).
    (?<!...) Matches if not preceded by ... (must be fixed length).
    (?(id/name)yes|no) Matches yes pattern if the group with id/name matched,
                       the (optional) no pattern otherwise.

The special sequences consist of "\\" and a character from the list
below.  If the ordinary character is not on the list, then the
resulting RE will match the second character.
    \number  Matches the contents of the group of the same number.
    \A       Matches only at the start of the string.
    \Z       Matches only at the end of the string.
    \b       Matches the empty string, but only at the start or end of a word.
    \B       Matches the empty string, but not at the start or end of a word.
    \d       Matches any decimal digit; equivalent to the set [0-9].
    \D       Matches any non-digit character; equivalent to the set [^0-9].
    \s       Matches any whitespace character; equivalent to [ \t\n\r\f\v].
    \S       Matches any non-whitespace character; equiv. to [^ \t\n\r\f\v].
    \w       Matches any alphanumeric character; equivalent to [a-zA-Z0-9_].
             With LOCALE, it will match the set [0-9_] plus characters defined
             as letters for the current locale.
    \W       Matches the complement of \w.
    \\       Matches a literal backslash.

This module exports the following functions:
    match    Match a regular expression pattern to the beginning of a string.
    search   Search a string for the presence of a pattern.
    sub      Substitute occurrences of a pattern found in a string.
    subn     Same as sub, but also return the number of substitutions made.
    split    Split a string by the occurrences of a pattern.
    findall  Find all occurrences of a pattern in a string.
    finditer Return an iterator yielding a match object for each match.
    compile  Compile a pattern into a RegexObject.
    purge    Clear the regular expression cache.
    escape   Backslash all non-alphanumerics in a string.

Some of the functions in this module takes flags as optional parameters:
    I  IGNORECASE  Perform case-insensitive matching.
    L  LOCALE      Make \w, \W, \b, \B, dependent on the current locale.
    M  MULTILINE   "^" matches the beginning of lines (after a newline)
                   as well as the string.
                   "$" matches the end of lines (before a newline) as well
                   as the end of the string.
    S  DOTALL      "." matches any character at all, including the newline.
    X  VERBOSE     Ignore whitespace and comments for nicer looking RE's.
    U  UNICODE     Make \w, \W, \b, \B, dependent on the Unicode locale.

This module also defines an exception 'error'.
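A few of the constructs listed in the help text can be tried directly. The following is only an illustrative sketch written for Python 3; the sample strings and the group names year and month are made up for this example.

```python
import re

html = "<b>bold</b><i>italic</i>"  # made-up sample text

# Greedy: "*" takes as much as it can, so the match runs to the last ">".
greedy = re.search(r"<.*>", html).group()

# Non-greedy: "*?" stops at the first ">" that lets the pattern succeed.
lazy = re.search(r"<.*?>", html).group()

# Named groups: (?P<name>...) makes each part retrievable by name.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2023-02")
year, month = m.group("year"), m.group("month")

# Lookahead: (?=...) checks what follows without consuming it.
amounts = re.findall(r"\d+(?=px)", "10px 20em 30px")

print(greedy)       # <b>bold</b><i>italic</i>
print(lazy)         # <b>
print(year, month)  # 2023 02
print(amounts)      # ['10', '30']
```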

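The help text above covers the basic syntax and lists the module's functions.

The basic syntax was already explained in the article linked above.

The rest of this article shows how to use the main functions.

re functions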

match

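View the help:

help(re.match)

Help on function match in module re:

match(pattern, string, flags=0)
    Try to apply the pattern at the start of the string, returning a match object, or None if no match was found.

re.match(pattern, string, flags=0)

Purpose:

Tries to match pattern starting at the first position of string and returns a match object, from which the matched text is read with group(). If nothing matches, it returns None. flags is an optional argument that controls how the regular expression is matched.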

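Example:

import re

# Raw string; the dots are escaped so that they match only a literal "."
pattern = r'[w]{3}\.[a-z]+\.(com)'
str1 = ""        # URL missing from the source
str2 = "http:"   # rest of the URL missing from the source
re1 = re.match(pattern, str1)
if re1 is not None:
    print(re1.group(0))
re2 = re.match(pattern, str2)
if re2 is not None:   # re.match returns None when there is no match
    print(re2.group(0))

The pattern matches a URL at the start of the string. For the first string the original run printed the matched URL; for the second it raised an error. re.match did return None for the second string, as the documentation says, because the start of that string does not match. The error came from calling group(0) on that None (an AttributeError), which is why each result is checked above before group is called.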
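Because re.match returns None when the start of the string does not match, the result needs a check before group is called. A minimal sketch for Python 3, assuming placeholder URLs (www.example.com and http://www.example.com are stand-ins, not the strings from the example above) and a hypothetical helper named match_prefix:

```python
import re

pattern = r"w{3}\.[a-z]+\.com"

def match_prefix(text):
    """Return the text matched at the start of `text`, or None (hypothetical helper)."""
    m = re.match(pattern, text)
    return m.group(0) if m is not None else None

print(match_prefix("www.example.com"))         # www.example.com
print(match_prefix("http://www.example.com"))  # None
```

The second call returns None because re.match only looks at the start of the string, and that string starts with http.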

search

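View the help:

help(re.search)

Help on function search in module re:

search(pattern, string, flags=0)
    Scan through string looking for a match to the pattern, returning
    a match object, or None if no match was found.

re.search(pattern, string, flags=0)

Purpose:

Scans string for the first substring that matches pattern and returns a match object for it; returns None if there is no such substring.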

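Example:

import re

pattern = r'[w]{3}\.[a-z]+\.(com)'
str1 = ""        # URL missing from the source
str2 = "http:"   # rest of the URL missing from the source
re1 = re.search(pattern, str1)
if re1 is not None:
    print(re1.group())
re2 = re.search(pattern, str2)
if re2 is not None:   # re.search returns None when nothing matches
    print(re2.group())

In the original run the first call printed the matched URL. The second call failed to match: re.search returned None, and calling group() on None is what raised the error, so each result is checked above before group is called.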
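The difference between match and search shows up when the text of interest is not at the start of the string. The sketch below is an illustration for Python 3 with a placeholder URL of my own choosing, not the string from the example above:

```python
import re

pattern = r"w{3}\.[a-z]+\.com"
url = "http://www.example.com/index.html"  # placeholder URL

at_start = re.match(pattern, url)    # None: the string starts with "http"
anywhere = re.search(pattern, url)   # first match anywhere in the string

print(at_start)          # None
print(anywhere.group())  # www.example.com
print(anywhere.span())   # (7, 22)
```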

sub

View the help:

help(re.sub)

Help on function sub in module re:

sub(pattern, repl, string, count=0, flags=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    if a string, backslash escapes in it are processed.  If it is
    a callable, it's passed the match object and must return
    a replacement string to be used.

re.sub(pattern, repl, string, count=0, flags=0)

Purpose:

Replaces the substrings of string that match pattern with repl. count defaults to 0, which replaces every match; a count of 2 replaces only the first two.

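Example:

import re

pattern = r'[w]{3}\.[a-z]+\.(com)'
repl = ''               # replacement URL missing from the source
str3 = "ilove,tomlove"  # the URLs in this string are missing from the source
re3 = re.sub(pattern, repl, str3, 1)  # count=1: replace only the first match
print(re3)

Output:

ilove,tomlove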
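The help text mentions three things that are easy to miss: the count limit, backslash escapes in a string repl, and a callable repl. A small sketch for Python 3 follows; the sample text and the helper name next_year are made up for illustration.

```python
import re

text = "2023-02-22 and 2024-01-05"  # made-up sample text

# count=1 replaces only the first match; the default count=0 replaces all.
first_only = re.sub(r"\d{4}", "YYYY", text, count=1)

# Backslash escapes in repl refer to captured groups: year-month -> month/year.
swapped = re.sub(r"(\d{4})-(\d{2})", r"\2/\1", text)

# A callable repl receives the match object and returns the replacement string.
def next_year(m):
    return str(int(m.group()) + 1)

bumped = re.sub(r"\d{4}", next_year, text)

print(first_only)  # YYYY-02-22 and 2024-01-05
print(swapped)     # 02/2023-22 and 01/2024-05
print(bumped)      # 2024-02-22 and 2025-01-05
```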

subn

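Works like re.sub, but returns a tuple: the new string together with the number of substitutions made.

Example:

import re

pattern = r'[w]{3}\.[a-z]+\.(com)'
repl = ''               # replacement URL missing from the source
str3 = "ilove,tomlove"  # the URLs in this string are missing from the source
re3 = re.subn(pattern, repl, str3, 2)  # count=2: replace at most two matches
print(re3)

Output:

('ilove,tomlove', 2)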
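The returned tuple is usually unpacked into the new string and the count. A short sketch for Python 3, with a made-up sample string:

```python
import re

text = "one, two, three, four"  # made-up sample text

# subn returns (new_string, number_of_substitutions).
new_text, n = re.subn(r",\s*", " | ", text)

# With count=2 at most two substitutions are made, and the count reports that.
limited_text, limited_n = re.subn(r",\s*", " | ", text, count=2)

print(new_text, n)              # one | two | three | four 3
print(limited_text, limited_n)  # one | two | three, four 2
```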

split

View the help:

help(re.split)

Help on function split in module re:

split(pattern, string, maxsplit=0, flags=0)
    Split the source string by the occurrences of the pattern,
    returning a list containing the resulting substrings.

re.split(pattern, string, maxsplit=0, flags=0)

Purpose:

Splits string wherever pattern matches and returns the pieces in a list.

maxsplit is the maximum number of splits to make; the default of 0 means no limit.

Example:

import re

names = "xiaoming,xiaohua,xiaoli,xiaoqiang,xiaozhang"
pattern = ","
print(re.split(pattern, names))

Output:

['xiaoming', 'xiaohua', 'xiaoli', 'xiaoqiang', 'xiaozhang']
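For a single fixed separator, str.split would do the same job; re.split becomes useful when the separator varies. A sketch for Python 3, where the mixed separators in the sample string are made up to show this along with maxsplit:

```python
import re

names = "xiaoming,xiaohua;xiaoli  xiaoqiang"  # separators varied on purpose

# A character class splits on commas, semicolons, or runs of whitespace.
parts = re.split(r"[,;\s]+", names)

# maxsplit=1 stops after the first split; the remainder stays in one piece.
head_tail = re.split(r"[,;\s]+", names, maxsplit=1)

print(parts)      # ['xiaoming', 'xiaohua', 'xiaoli', 'xiaoqiang']
print(head_tail)  # ['xiaoming', 'xiaohua;xiaoli  xiaoqiang']
```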

findall

View the help:

help(re.findall)

Help on function findall in module re:

findall(pattern, string, flags=0)
    Return a list of all non-overlapping matches in the string.

    If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern
    has more than one group.

    Empty matches are included in the result.

re.findall(pattern, string, flags=0)

Purpose:

Finds every substring of string that matches the regular expression and returns them in a list; the list is empty if nothing matches.

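Example:

import re

names = "xiaoming,xiaohua,xiaoli,xiaoqiang,xiaozhang"
pattern = r"\w+"
print(re.findall(pattern, names))

The result is the same as in the split example, but it is reached differently: split cuts the string at each comma, while findall collects each run of word characters.

['xiaoming', 'xiaohua', 'xiaoli', 'xiaoqiang', 'xiaozhang']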
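The help text notes that the shape of the result depends on how many groups the pattern has. The sketch below, for Python 3 and with a made-up sample string, shows the cases side by side:

```python
import re

text = "xiaoming=18, xiaohua=20"  # made-up sample text

# No groups: a list of the whole matches.
whole = re.findall(r"\w+=\d+", text)

# One group: a list of that group's text only.
names = re.findall(r"(\w+)=\d+", text)

# Two groups: a list of tuples, one element per group.
pairs = re.findall(r"(\w+)=(\d+)", text)

# No match at all: an empty list.
none_found = re.findall(r"#\d+", text)

print(whole)       # ['xiaoming=18', 'xiaohua=20']
print(names)       # ['xiaoming', 'xiaohua']
print(pairs)       # [('xiaoming', '18'), ('xiaohua', '20')]
print(none_found)  # []
```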

finditer

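Similar to findall: it finds every substring that the regular expression matches, but returns an iterator that yields a match object for each one.

Example:

import re

names = "xiaoming,xiaohua,xiaoli,xiaoqiang,xiaozhang"
pattern = r"\w+"
re4 = re.finditer(pattern, names)
for i in re4:
    print(i.group())

re4 is an iterator, so the matches are printed with a for loop:

>>> for i in re4:
...     print(i.group())
...
xiaoming
xiaohua
xiaoli
xiaoqiang
xiaozhang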
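Since finditer yields match objects rather than strings, each result also carries its position. The sketch below, for Python 3 with a shortened sample string, shows that and the fact that the iterator can be walked only once:

```python
import re

names = "xiaoming,xiaohua,xiaoli"

# Each item is a match object, so start and end offsets are available.
found = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", names)]

# The iterator is used up by one pass; a second pass yields nothing.
it = re.finditer(r"\w+", names)
first_pass = [m.group() for m in it]
second_pass = [m.group() for m in it]

print(found)        # [('xiaoming', 0, 8), ('xiaohua', 9, 16), ('xiaoli', 17, 23)]
print(first_pass)   # ['xiaoming', 'xiaohua', 'xiaoli']
print(second_pass)  # []
```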

compile

View the help:

help(re.compile)

Help on function compile in module re:

compile(pattern, flags=0)
    Compile a regular expression pattern, returning a pattern object.

re.compile(pattern, flags=0)

Purpose:

Converts the regular expression pattern into a pattern object.

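Example:

import re

names = "xiaoming,xiaohua,xiaoli,xiaoqiang,xiaozhang"
pattern = r"\w+"
patternobj = re.compile(pattern)
# Call finditer on the compiled object rather than on the re module
re4 = patternobj.finditer(names)
for i in re4:
    print(i.group())

The result is the same as in the previous example. compile turns the pattern into an object, and the other operations are then carried out through that object's methods.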
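A compiled pattern object offers the same operations as the module-level functions, which is convenient when one pattern is used many times. A sketch for Python 3; the variable names word and ci and the sample strings are made up for illustration:

```python
import re

# Compile once, then reuse the pattern object's own methods.
word = re.compile(r"\w+")

names = "xiaoming,xiaohua,xiaoli"
all_words = word.findall(names)    # like re.findall(r"\w+", names)
first = word.match(names).group()  # like re.match(r"\w+", names)
masked = word.sub("X", names)      # like re.sub(r"\w+", "X", names)

# Flags are given once, at compile time, instead of on every call.
ci = re.compile(r"xiao", re.IGNORECASE)
count = len(ci.findall("Xiaoming XIAOhua xiaoli"))

print(all_words)  # ['xiaoming', 'xiaohua', 'xiaoli']
print(first)      # xiaoming
print(masked)     # X,X,X
print(count)      # 3
```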

purge

View the help:

help(re.purge)

Help on function purge in module re:

purge()
    Clear the regular expression cache

Purpose:

Clears the cache of compiled regular expressions.

escape

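View the help:

help(re.escape)

Help on function escape in module re:

escape(pattern)
    Escape all non-alphanumeric characters in pattern.

Purpose:

Puts a backslash in front of the non-alphanumeric characters in a string, so that the result can be used inside a pattern to match the original text literally.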

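Example:

>>> pattern
'\\w+'
>>> re.escape(pattern)
'\\\\w\\+'

The result differs from the input: the backslash and the + have each gained a leading backslash. The escaped pattern therefore matches the literal text \w+ instead of one or more word characters.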
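The practical use of re.escape is to search for text that happens to contain regex metacharacters. A sketch for Python 3, with a made-up sample string:

```python
import re

literal = "1+1=2 (really?)"  # made-up text full of regex metacharacters

# Used directly as a pattern, "+", "(", ")" and "?" are read as regex syntax,
# so the pattern does not match the very text it was copied from.
unescaped = re.search(literal, literal)

# re.escape backslash-escapes those characters, so the text matches itself.
escaped = re.search(re.escape(literal), literal)

print(unescaped)        # None
print(escaped.group())  # 1+1=2 (really?)
```

The exact set of characters that re.escape escapes has changed between Python versions, so the sketch checks only the matching behaviour, not the escaped string itself.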

flags

I  IGNORECASE  Perform case-insensitive matching.
L  LOCALE      Make \w, \W, \b, \B, dependent on the current locale.
M  MULTILINE   "^" matches the beginning of lines (after a newline)
               as well as the string.
               "$" matches the end of lines (before a newline) as well
               as the end of the string.
S  DOTALL      "." matches any character at all, including the newline.
X  VERBOSE     Ignore whitespace and comments for nicer looking RE's.
U  UNICODE     Make \w, \W, \b, \B, dependent on the Unicode locale.
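The effect of the most commonly used flags can be seen on a short multi-line string. The sketch below is written for Python 3; the sample text and variable names are made up, and flags are combined with the | operator.

```python
import re

text = "First line\nsecond LINE\nthird line"  # made-up sample text

# IGNORECASE: "line" also matches "LINE".
ci = re.findall(r"line", text, re.IGNORECASE)

# MULTILINE: "^" matches at the start of every line, not only of the string.
starts = re.findall(r"^\w+", text, re.MULTILINE)

# DOTALL: "." also matches the newline, so a match can span lines.
spans_lines = re.search(r"First.*third", text, re.DOTALL) is not None
without_dotall = re.search(r"First.*third", text) is not None

# VERBOSE: whitespace and comments inside the pattern are ignored.
date = re.compile(r"""
    (\d{4})  # year
    -
    (\d{2})  # month
    """, re.VERBOSE | re.IGNORECASE)
year_month = date.search("uploaded 2023-02-22").groups()

print(ci)              # ['line', 'LINE', 'line']
print(starts)          # ['First', 'second', 'third']
print(spans_lines)     # True
print(without_dotall)  # False
print(year_month)      # ('2023', '02')
```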
