7.2. re: Regular expression operations (Python v2.7.5 documentation)

This module provides regular expression matching operations similar to those found in Perl. Both patterns and strings to be searched can be Unicode strings as well as 8-bit strings.

Regular expressions use the backslash character ('\') to indicate special forms or to allow special characters to be used without invoking their special meaning. This collides with Python’s usage of the same character for the same purpose in string literals; for example, to match a literal backslash, one might have to write '\\\\' as the pattern string, because the regular expression must be \\, and each backslash must be expressed as \\ inside a regular Python string literal.

The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation.
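As a small illustration of the raw string rule, the sketch below compares the two spellings of a pattern that matches one literal backslash. The sample string and variable names are illustrative assumptions only.

```python
import re

# r"\n" keeps the backslash: two characters, '\' and 'n'.
raw_len = len(r"\n")
# "\n" is interpreted by Python: one newline character.
cooked_len = len("\n")

# Illustrative subject string: the three characters a, \, b.
subject = "a\\b"

# To match one literal backslash the regular expression must be \\ .
plain_match = re.search("\\\\", subject)   # ordinary literal: four backslashes
raw_match = re.search(r"\\", subject)      # raw literal: two backslashes

# Both literals denote the same two-character pattern string.
same_pattern = ("\\\\" == r"\\")
```

Both searches find the backslash at index 1; the raw form is simply easier to read.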

It is important to note that most regular expression operations are available both as module-level functions and as methods on compiled regular expression objects. The functions are shortcuts that don’t require you to compile a regex object first, but miss some fine-tuning parameters.
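A minimal sketch of the difference, assuming an arbitrary sample text: the module-level function is convenient, while the compiled object's search() method additionally accepts a starting position, one of the fine-tuning parameters the shortcut lacks.

```python
import re

text = "spam and eggs"   # illustrative sample text

# Module-level shortcut: no explicit compile step.
shortcut = re.search(r"[a-z]+", text)

# Compiled object: search() also accepts a start position (pos).
word = re.compile(r"[a-z]+")
from_start = word.search(text)
from_offset = word.search(text, 5)   # begin scanning at index 5
```

Here from_offset skips the first word because scanning starts at index 5.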

See also

∙ Mastering Regular Expressions

∙ Book on regular expressions by Jeffrey Friedl, published by O’Reilly. The second edition of the book no longer covers Python at all, but the first edition covered writing good regular expression patterns in great detail.

7.2.1. Regular Expression Syntax

A regular expression (or RE) specifies a set of strings that matches it; the functions in this module let you check if a particular string matches a given regular expression (or if a given regular expression matches a particular string, which comes down to the same thing).

Regular expressions can be concatenated to form new regular expressions; if A and B are both regular expressions, then AB is also a regular expression. In general, if a string p matches A and another string q matches B, the string pq will match AB. This holds unless A or B contain low precedence operations; boundary conditions between A and B; or have numbered group references. Thus, complex expressions can easily be constructed from simpler primitive expressions like the ones described here. For details of the theory and implementation of regular expressions, consult the Friedl book referenced above, or almost any textbook about compiler construction.
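The concatenation rule and one of its exceptions can be sketched as follows. The helper fully_matches and the sample patterns are hypothetical names introduced only for this illustration; the helper wraps the pattern in a non-capturing group so that the trailing $ applies to the whole expression.

```python
import re

def fully_matches(pattern, string):
    """Return True if the pattern matches the entire string (hypothetical helper)."""
    return re.match("(?:" + pattern + ")$", string) is not None

A = r"[0-9]+"          # illustrative pattern A
B = r"[a-z]+"          # illustrative pattern B
p, q = "2013", "re"    # p matches A, q matches B

concatenation_holds = fully_matches(A + B, p + q)

# Exception: '|' has low precedence, so "a|b" + "c" means "a" or "bc".
low_precedence_breaks = fully_matches("a|b" + "c", "a" + "c")

# Grouping the alternation restores the expected behaviour.
grouped_holds = fully_matches("(?:a|b)" + "c", "a" + "c")
```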

A brief explanation of the format of regular expressions follows. For further information and a gentler presentation, consult the Regular Expression HOWTO.

Regular expressions can contain both special and ordinary characters. Most ordinary characters, like 'A', 'a', or '0', are the simplest regular expressions; they simply match themselves. You can concatenate ordinary characters, so last matches the string 'last'. (In the rest of this section, we’ll write RE’s in this special style, usually without quotes, and strings to be matched 'in single quotes'.)

Some characters, like '|' or '(', are special. Special characters either stand for classes of ordinary characters, or affect how the regular expressions around them are interpreted. Regular expression pattern strings may not contain null bytes, but can specify the null byte using the \number notation, e.g., '\x00'.

The special characters are:

∙ '.'

∙ (Dot.) In the default mode, this matches any character except a newline. If the DOTALL flag has been specified, this matches any character including a newline.
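A short sketch of the dot's two behaviours, assuming a three-character sample string; re.DOTALL is the module flag that makes the dot match newlines.

```python
import re

text = "a\nb"   # illustrative sample: 'a', newline, 'b'

# Default mode: '.' does not match the newline, so the search fails.
default_hit = re.search(r"a.b", text)

# With re.DOTALL the same pattern matches across the newline.
dotall_hit = re.search(r"a.b", text, re.DOTALL)
```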

∙ '^'

∙ (Caret.) Matches the start of the string, and in MULTILINE mode also matches immediately after each newline.

∙ '$'

∙ Matches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline. foo matches both ‘foo’ and ‘foobar’, while the regular expression foo$ matches only ‘foo’. More interestingly, searching for foo.$ in 'foo1\nfoo2\n' matches ‘foo2’ normally, but ‘foo1’ in MULTILINE mode; searching for a single $ in 'foo\n' will find two (empty) matches: one just before the newline, and one at the end of the string.
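The anchor behaviour described for '^' and '$' can be checked with a sketch like the one below, which reuses the sample strings from the text; the variable names are illustrative.

```python
import re

text = "foo1\nfoo2\n"

# Without MULTILINE, '$' only matches at the very end (or before a final newline).
normal = re.search(r"foo.$", text).group()

# With MULTILINE, '$' also matches before every newline.
multi = re.search(r"foo.$", text, re.MULTILINE).group()

# '^' matches only at the start by default, and after each newline in MULTILINE mode.
starts_default = re.findall(r"^foo.", text)
starts_multi = re.findall(r"^foo.", text, re.MULTILINE)

# A lone '$' in 'foo\n' yields two empty matches: before the newline and at the end.
dollar_positions = [m.start() for m in re.finditer(r"$", "foo\n")]
```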

∙ '*'

∙ Causes the resulting RE to match 0 or more repetitions of the preceding RE, as many repetitions as are possible. ab* will match ‘a’, ‘ab’, or ‘a’ followed by any number of ‘b’s.

∙ '+'

∙ Causes the resulting RE to match 1 or more repetitions of the preceding RE. ab+ will match ‘a’ followed by any non-zero number of ‘b’s; it will not match just ‘a’.

∙ '?'

∙ Causes the resulting RE to match 0 or 1 repetitions of the preceding RE. ab? will match either ‘a’ or ‘ab’.
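The three qualifiers above can be compared side by side. This is only a sketch with made-up subject strings; re.match anchors each attempt at the start of the string.

```python
import re

# '*': zero or more 'b's, as many as possible.
star_many = re.match(r"ab*", "abbbc").group()
star_zero = re.match(r"ab*", "ac").group()

# '+': at least one 'b' is required.
plus_many = re.match(r"ab+", "abbbc").group()
plus_none = re.match(r"ab+", "ac")

# '?': at most one 'b' is consumed.
opt_one = re.match(r"ab?", "abbbc").group()
opt_zero = re.match(r"ab?", "ac").group()
```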

∙ *?, +?, ??

∙ The '*', '+', and '?' qualifiers are all greedy; they match as much text as possible. Sometimes this behaviour isn’t desired; if the RE <.*> is matched against '<H1>title</H1>', it will match the entire string, and not just '<H1>'. Adding '?' after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using .*? in the previous expression will match only '<H1>'.
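A minimal sketch of greedy versus non-greedy matching, assuming the sample string '<H1>title</H1>' (a tagged title chosen for illustration):

```python
import re

html = "<H1>title</H1>"   # assumed sample input

# Greedy: '.*' runs to the last '>' in the string.
greedy = re.match(r"<.*>", html).group()

# Non-greedy: '.*?' stops at the first '>' that lets the match succeed.
lazy = re.match(r"<.*?>", html).group()
```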

∙ {m}

∙ Specifies that exactly m copies of the previous RE should be matched; fewer matches cause the entire RE not to match. For example, a{6} will match exactly six 'a' characters, but not five.

∙ {m,n}

∙ Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as many repetitions as possible. For example, a{3,5} will match from 3 to 5 'a' characters. Omitting m specifies a lower bound of zero, and omitting n specifies an infinite upper bound. As an example, a{4,}b will match aaaab or a thousand 'a' characters followed by a b, but not aaab. The comma may not be omitted or the modifier would be confused with the previously described form.

∙ {m,n}?

∙ Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as few repetitions as possible. This is the non-greedy version of the previous qualifier. For example, on the 6-character string 'aaaaaa', a{3,5} will match 5 'a' characters, while a{3,5}? will only match 3 characters.
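The counted repetition forms can be exercised with the examples from the text; the variable names are illustrative.

```python
import re

# {m}: exactly six 'a' characters are required.
six_ok = re.match(r"a{6}", "aaaaaa")
five_fails = re.match(r"a{6}", "aaaaa")

# {m,n} is greedy, {m,n}? is non-greedy.
greedy = re.match(r"a{3,5}", "aaaaaa").group()
lazy = re.match(r"a{3,5}?", "aaaaaa").group()

# {m,}: no upper bound.
open_long = re.match(r"a{4,}b", "a" * 1000 + "b")
open_short = re.match(r"a{4,}b", "aaab")
```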

∙ '\'

∙ Either escapes special characters (permitting you to match characters like '*', '?', and so forth), or signals a special sequence; special sequences are discussed below. If you’re not using a raw string to express the pattern, remember that Python also uses the backslash as an escape sequence in string literals; if the escape sequence isn’t recognized by Python’s parser, the backslash and subsequent character are included in the resulting string. However, if Python would recognize the resulting sequence, the backslash should be repeated twice. This is complicated and hard to understand, so it’s highly recommended that you use raw strings for all but the simplest expressions.
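The following sketch shows both roles of the backslash, and what goes wrong when Python itself recognizes the escape. It assumes the \b word-boundary special sequence as the example, with made-up subject strings.

```python
import re

# Escaping special characters so they match literally.
star = re.search(r"\*", "2*3").group()
question = re.search(r"\?", "why?").group()

sentence = "a cat sat"   # illustrative subject

# Raw string: the regex engine receives \b (word boundary).
word_raw = re.search(r"\bcat\b", sentence)

# Ordinary string: the backslash must be doubled to get the same pattern.
word_doubled = re.search("\\bcat\\b", sentence)

# Ordinary string, not doubled: Python turns "\b" into a backspace character,
# so the pattern looks for backspace + 'cat' + backspace and finds nothing.
word_wrong = re.search("\bcat\b", sentence)
```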

∙ []

∙ Used to indicate a set of characters. In a set:

∙ Characters can be listed individually, e.g. [amk] will match 'a', 'm', or 'k'.

∙ Ranges of characters can be indicated by giving two characters and separating them by a '-', for example [a-z] will match any lowercase ASCII letter, [0-5][0-9] will match all the two-digit numbers from 00 to 59, and [0-9A-Fa-f] will match any hexadecimal digit. If - is escaped (e.g. [a\-z]) or if it’s placed as the first or last character (e.g. [a-]), it will match a literal '-'.

∙ Special characters lose their special meaning inside sets. For example, [(+*)] will match any of the literal characters '(', '+', '*', or ')'.

∙ Character classes such as \w or \S (defined below) are also accepted inside a set, although the characters they match depend on whether LOCALE or UNICODE mode is in force.

∙ Characters that are not within a range can be matched by complementing the set. If the first character of the set is '^', all the characters that are not in the set will be matched. For example, [^5] will match any character except '5', and [^^] will match any character except '^'. ^ has no special meaning if it’s not the first character in the set.

∙ To match a literal ']' inside a set, precede it with a backslash, or place it at the beginning of the set. For example, [()[\]{}] and []()[{}] will both match a parenthesis.
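The set rules above can be sketched with re.findall, which returns every non-overlapping match. The subject strings are made up for illustration.

```python
import re

# Individually listed characters.
listed = re.findall(r"[amk]", "make")

# Ranges: two-digit numbers from 00 to 59, and hexadecimal digits.
minute_ok = re.match(r"[0-5][0-9]$", "59")
minute_bad = re.match(r"[0-5][0-9]$", "60")
hex_digits = re.findall(r"[0-9A-Fa-f]", "0xBEEF")

# A literal '-' when escaped or placed last.
dash_escaped = re.findall(r"[a\-z]", "a-z m")
dash_last = re.findall(r"[a-]", "a-b")

# Special characters are literal inside a set.
literals = re.findall(r"[(+*)]", "(1+2)*3")

# Complemented sets.
not_five = re.findall(r"[^5]", "1525")
not_caret = re.findall(r"[^^]", "a^b")

# A literal ']' either escaped or placed first.
brackets_escaped = re.findall(r"[()[\]{}]", "f(x)[0]{}")
brackets_first = re.findall(r"[]()[{}]", "f(x)[0]{}")
```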

∙ '|'

∙ A|B, where A and B can be arbitrary REs, creates a regular expression that will match either A or B. An arbitrary number of REs can be separated by the '|' in this way. This can be used inside groups (see below) as well. As the target string is scanned, REs separated by '|' are tried from left to right. When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy. To match a literal '|', use \|, or enclose it inside a character class, as in [|].
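A brief sketch of the left-to-right rule for '|', using made-up two-branch patterns:

```python
import re

# The left branch 'a' matches first, so the longer 'ab' is never tried.
first_wins = re.match(r"a|ab", "ab").group()

# Reordering the branches yields the longer match.
longer_first = re.match(r"ab|a", "ab").group()

# Two ways to match a literal '|'.
pipe_escaped = re.findall(r"\|", "a|b")
pipe_in_class = re.findall(r"[|]", "a|b")
```

When a longer alternative should be preferred, listing it first is one simple way to get that result.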

∙ (...)

∙ Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(] [)].
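Group retrieval and the \number back-reference can be sketched as follows; the sample strings and variable names are illustrative assumptions.

```python
import re

# Retrieve group contents after a successful match.
m = re.match(r"(\w+) (\w+)", "Isaac Newton")
first, second = m.group(1), m.group(2)

# \1 matches the same text that group 1 matched earlier: a doubled word.
doubled = re.search(r"(\w+) \1", "say hello hello world").group()

# Literal parentheses, escaped or inside character classes.
escaped = re.search(r"\(\w+\)", "call(me)").group()
in_class = re.findall(r"[(][)]", "f() g()")
```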

∙ (?...)

∙ This is an extension notation (a '?' following a '(' is not meaningful otherwise). The first character after the '?' determines what the meaning and further syntax of the construct is. Extensions usually do not create a new group; (?P<name>...) is the only exception to this rule. Following are the currently supported extensions.

∙ (?iLmsux)

∙ (One or more letters from the set 'i', 'L', 'm', 's', 'u', 'x'.) The group matches the empty string; the letters set the corresponding flags: re.I (ignore case), re.L (locale dependent), re.M (multi-line), re.S (dot matches all), re.U (Unicode dependent), and re.X (verbose), for the entire regular expression. (The flags are described in Module Contents.) This is useful if you wish to inclu
