GRETA Regular Expressions

GRETA: The GRETA Regular Expression Template Archive

Copyright Eric Niebler, 2002

 

 

 

The purpose of this document is to describe how to use the GRETA Regular Expression Template Archive. It describes the objects in the library, the methods defined on the objects, and the ways to use the objects and methods to perform regular expression pattern matching on strings in C++. It does not describe regular expression syntax. It is enough to say that the full Perl 5 syntax is supported. If you are not familiar with Perl's regular expression syntax, I recommend reading Chapter 2 of Programming Perl, 2nd Ed. (a.k.a. The Camel Book), one of the many fine books put out by O'Reilly publishers.

GRETA: The GRETA Regular Expression Template Archive
    Overview
    A Word about Speed
    Notice to Users of Version 1.x
    The rpattern Object
        rpattern::string_type
        rpattern::rpattern
        rpattern::match
        rpattern::substitute
        rpattern::count
        rpattern::split
        rpattern::set_substitution
        rpattern::cgroups
    match_results, subst_results and split_results
        match_results::cbackrefs
        match_results::backref
        match_results::rstart
        match_results::rlength
        match_results::all_backrefs
        subst_results::backref_str
        split_results::strings
    The Syntax Module
        register_intrinsic_charset
    Customizing Your Search
        NOCASE
        GLOBAL
        MULTILINE
        SINGLELINE
        EXTENDED
        RIGHTMOST
        NOBACKREFS
        ALLBACKREFS
        FIRSTBACKREFS
        NORMALIZE
    Matching Modes
        MODE_FAST
        MODE_SAFE
        MODE_MIXED
    Known Issues and Perl Incompatibilities
        Embedded Code in a Regular Expression
        Pattern Modifier Scope
        Comment Blocks Before Quantifiers
        Variable Width Look-Behind Assertions
        Recursive Patterns
    Compile-Time Switches
        REGEX_WIDE_AND_NARROW
        REGEX_POSIX
        REGEX_NO_PERL
        REGEX_DEBUG
        REGEX_DEBUG_HEAP
        REGEX_STACK_ALIGNMENT
        REGEX_FOLD_INSTANTIATIONS
        REGEX_TO_INSTANTIATE
    Miscellaneous
        Static Const Patterns
        Thread-safety
        Stack Usage
        DBCS
        STL
        VC7 and Managed Code
        Template Instantiation
    Contact Information
    Appendix 1: History
    Appendix 2: Implementation Details

 

Overview

The regular expression template library contains objects and functions that make it possible to perform pattern matching and substitution on strings in C++. They are:

∙ rpattern: the pattern to use during the search.

∙ match_results/subst_results: container for the results of a match/substitution.

To perform a search or replace operation, you will typically first initialize an rpattern object by giving it a string representing the pattern against which to match. You will then call a method on the rpattern object (match() or substitute(), for instance), passing it a string to match against and a match_results object to receive the results of the match. If the match()/substitute() fails, the method returns false. If it succeeds, it returns true, and the match_results object stores the resulting array of backreferences internally. (Here, the term backreference has the same meaning as it does in Perl. Backreferences provide extra information about what parts of the pattern matched which parts of the string.) There are methods on the match_results object to make the backreference information available. For example:

 

#include <iostream>
#include <string>
#include "regexpr2.h"
using namespace std;
using namespace regex;

int main() {
    match_results results;
    string str("The book cost $12.34");
    rpattern pat("\\$(\\d+)(\\.(\\d\\d))?");
    // Match a dollar sign followed by one or more digits,
    // optionally followed by a period and two more digits.
    // The double-escapes are necessary to satisfy the compiler.

    match_results::backref_type br = pat.match(str, results);
    if (br.matched) {
        cout << "match success!" << endl;
        cout << "price: " << br << endl;
    } else {
        cout << "match failed!" << endl;
    }
    return 0;
}

 

The above program would print out the following:

match success!
price: $12.34
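For readers who do not have GRETA at hand, the contents of the backreferences for this input can be explored with the C++11 standard library's std::regex. This is a swapped-in stand-in, not GRETA, and its API differs from the one described in this document; the helper name price_groups is invented for this sketch.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of GRETA): returns the text of every
// backreference for the price pattern, with group 0 being the whole match.
// Returns an empty vector if the pattern does not match.
std::vector<std::string> price_groups(const std::string& text) {
    // Same pattern as in the GRETA example above; std::regex's default
    // ECMAScript syntax accepts it unchanged.
    static const std::regex pat("\\$(\\d+)(\\.(\\d\\d))?");
    std::smatch m;
    std::vector<std::string> groups;
    if (std::regex_search(text, m, pat)) {
        for (std::size_t i = 0; i < m.size(); ++i) {
            groups.push_back(m[i].str());
        }
    }
    return groups;
}
```

For the string "The book cost $12.34", this yields four entries: the whole match "$12.34", then "12", ".34" and "34" for the three parenthesized groups.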

 

The following sections discuss the rpattern object in detail and how to customize your searches to be faster and more efficient.

 

Note: all declarations in the header file (regexpr2.h) are contained in the regex namespace. To use any of the objects, methods or enumerations described in this document, you must prepend all declarations with "regex::" or you must have the "using namespace regex;" directive somewhere within the enclosing scope of your declarations. For simplicity, I've left off the "regex::" prefixes in the rest of my code snippets.

 

A Word about Speed

Different regex engines are good on different types of patterns. That said, I have found my regex engine to be pretty quick. For a benchmark, I matched the pattern "^([0-9]+)(\-| |$)(.*)$" against the string "100- this is a line of ftp response which contains a message string". GRETA is about 7 times faster than the regex library in boost (http://www.boost.org), and about 10 times faster than the regular expression classes in ATL7. For this input, GRETA is even faster than Perl, although Perl is faster for some other patterns. Most regex engines I have seen build up an NFA (non-deterministic finite state automaton) and execute it iteratively, often with a big, slow switch statement. I have a different approach: patterns are compiled into a directed, possibly cyclic graph, and matching happens by traversing this graph recursively. In addition, the code makes heavy use of templates to freeze the state of the flags into the compiled pattern so that they don't need to be checked at match time. The result is a pretty lean blob of code that can match your pattern quickly.
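The idea of freezing a flag into the compiled pattern can be sketched as follows. This is a hypothetical illustration of the general technique, not GRETA's actual internals; the names literal_matcher, match_fn and compile_literal are invented here. The case-sensitivity flag becomes a template parameter, the run-time flag is inspected once when the "pattern" is built, and the matching code itself never tests it.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical illustration, not GRETA code. Each instantiation contains
// only the comparison it needs.
template <bool NoCase>
struct literal_matcher {
    static bool equal(char a, char b) {
        if (NoCase) {  // compile-time constant, typically folded away by the optimizer
            return std::tolower(static_cast<unsigned char>(a)) ==
                   std::tolower(static_cast<unsigned char>(b));
        }
        return a == b;
    }

    // Returns true if `lit` occurs in `text` starting exactly at `pos`.
    static bool match_at(const std::string& text, std::size_t pos,
                         const std::string& lit) {
        if (pos > text.size() || lit.size() > text.size() - pos) return false;
        for (std::size_t i = 0; i < lit.size(); ++i) {
            if (!equal(text[pos + i], lit[i])) return false;
        }
        return true;
    }
};

typedef bool (*match_fn)(const std::string&, std::size_t, const std::string&);

// The run-time flag is examined once, here; afterwards only the chosen
// instantiation is called.
match_fn compile_literal(bool nocase) {
    return nocase ? &literal_matcher<true>::match_at
                  : &literal_matcher<false>::match_at;
}
```

A caller would obtain a function once, for example `match_fn m = compile_literal(true);`, and reuse it for every match attempt.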

 

Even the best algorithms have their weaknesses, though. Matching regular expressions with backreferences is an NP-complete problem. There are patterns that will make any backtracking regex engine take exponential time to finish. (These usually involve nested quantifiers.) If you have a performance-critical app, you would be smart to test your patterns for speed, or profile your app to make sure you are not spending too much time thrashing around in the regex code. You've been warned!
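To see why nested quantifiers are dangerous, consider a deliberately naive backtracking matcher for the single fixed pattern (a+)+b. The sketch below is not how GRETA works internally; it only counts how many choices a simple backtracker has to explore when the input is a run of 'a' characters with no 'b' at the end. The function names are invented for this example.

```cpp
#include <cstddef>
#include <string>

// Naive backtracking matcher for the fixed pattern (a+)+b, starting at `pos`.
// `steps` counts how often the matcher completes one (a+) iteration and has
// to decide what to try next.
bool match_nested(const std::string& s, std::size_t pos,
                  unsigned long long& steps) {
    // Try every possible length for the current (a+) iteration.
    for (std::size_t end = pos; end < s.size() && s[end] == 'a'; ++end) {
        const std::size_t next = end + 1;
        ++steps;
        if (next < s.size() && s[next] == 'b') return true;  // final 'b' found
        if (match_nested(s, next, steps)) return true;       // another (a+) iteration
    }
    return false;  // every split failed, so backtrack
}

// Number of steps spent before giving up on `n` letters 'a' with no 'b'.
unsigned long long steps_for_failure(std::size_t n) {
    unsigned long long steps = 0;
    match_nested(std::string(n, 'a'), 0, steps);
    return steps;
}
```

In this sketch the step count for a failing input of n letters is 2^n - 1, so each additional character doubles the work: 10 characters cost 1023 steps and 20 characters cost 1048575.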

 

Also, see the section VC7 and Managed Code for some advice for compiling GRETA under VC7.

 

Notice to Users of Version 1.x

Many things have changed since version 1.x of the Regular Expression Template Archive. If you have code which uses version 1.x, you will not be able to use version 2 without making changes to your code. Sorry! There were a number of unsafe, unintuitive interface features of version 1 that I felt were worth fixing for version 2. If you need version 1, I have a copy and I'd be happy to give it to you.

 

Most notably, the regexpr object has gone away. It was a subclass of std::string, with match() and substitute() methods, and it stored the results of the match/substitute internally. Subclassing std::string is dangerous because std::string doesn't have a virtual destructor. Also, matching is conceptually a const operation, and it seemed wrong that it should change internal state.

The match/count/substitute methods have moved to the rpattern object. The state that used to be stored in the regexpr object is now put in a match_results/subst_results container, which is passed as an out parameter to the match/substitute methods.
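The shape of this design, a const pattern object plus a separate results container passed as an out parameter, can be sketched with std::regex as a stand-in. The classes below are hypothetical and are not GRETA's; results_holder plays the role of match_results and const_pattern plays the role of rpattern.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-ins built on std::regex, not GRETA's classes.
struct results_holder {
    std::smatch backrefs;  // per-match state lives here, not in the pattern
};

class const_pattern {
public:
    explicit const_pattern(const std::string& pat) : re_(pat) {}

    // Matching is const: nothing inside the pattern changes, and all
    // results go into `out`. Note that `text` must outlive `out`, because
    // std::smatch stores iterators into the searched string.
    bool match(const std::string& text, results_holder& out) const {
        return std::regex_search(text, out.backrefs, re_);
    }

private:
    std::regex re_;
};
```

Because match() is const and keeps no state in the pattern, one pattern object can serve several searches whose results are held side by side in different results_holder objects.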

 

Also, the CSTRINGS flag has gone away. It is no longer necessary to optimize a pattern for use with C-style NULL-terminated strings. When you pass a C-style string to the rpattern::match method, the same optimization is used automatically. (In early 2.X versions of the library, there was a basic_rpattern_c object for performing this optimization, but it is no longer necessary and has been deprecated.)

Another minor change involves the register_intrinsic_charset() method. It used to be a part of rpattern's interface, but it has moved to the syntax module.

Despite the sweeping interface changes, the majority of the back-end code is unchanged. You should expect patterns that worked in version 1.x to continue to work in version 2.

The rpattern Object

The rpattern object contains the regular expression pattern against which to match. It also exposes the match(), substitute(), and count() methods you will use to perform regular expression matches. When you instantiate an rpattern object, the pattern is "compiled" into a structure that speeds up pattern matching. Once compiled, you may reuse the same pattern for multiple match operations.

Here is how rpattern is declared:

template< typename CI,
          typename SY = perl_syntax<std::iterator_traits<CI>::value_type> >
class basic_rpattern {
    ...
};
typedef basic_rpattern<std::basic_string<...>::const_iterator> rpattern;
typedef basic_rpattern<...> rpattern_c;

The rpattern class is a template on iterator type. It is also a template on the syntax module. By default, the Perl syntax module is used, but you are free to write your own syntax and specify it as a template parameter. See the section on the Syntax Module.
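The general pattern of a class that is parameterized on both an iterator type and a replaceable syntax policy, with the default policy derived from the iterator's character type, can be sketched as below. Everything here is hypothetical: GRETA's real syntax-module interface is described in its own section and differs from this toy, in which a policy only decides which character counts as the escape character.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical syntax policies, invented for this sketch.
template <typename CH>
struct backslash_syntax {
    static bool is_escape(CH c) { return c == CH('\\'); }
};

template <typename CH>
struct percent_syntax {
    static bool is_escape(CH c) { return c == CH('%'); }
};

// Template on iterator type CI and syntax policy SY. The default policy is
// chosen from the character type that CI dereferences to.
template <typename CI,
          typename SY = backslash_syntax<
              typename std::iterator_traits<CI>::value_type> >
class basic_pattern_sketch {
public:
    typedef typename std::iterator_traits<CI>::value_type char_type;

    // Counts the escape characters in [first, last) as defined by SY.
    static std::size_t count_escapes(CI first, CI last) {
        std::size_t n = 0;
        for (; first != last; ++first) {
            if (SY::is_escape(*first)) ++n;
        }
        return n;
    }
};
```

Swapping the second template argument, for example `basic_pattern_sketch<std::string::const_iterator, percent_syntax<char> >`, changes the behavior without touching the class itself.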

 

The following sections describe the methods available on the rpattern object.

rpattern::string_type

rpattern::string_type is a typedef that is used in many of the following function prototypes. It is defined as follows:

typedef CI const_iterator;
typedef std::iterator_traits<const_iterator>::value_type char_type;
typedef std::basic_string<char_type> string_type;

The typedef is a little complicated, but its effect is what you would expect. If the result of dereferencing a const_iterator is a char, then string_type is the same as std::string. If dereferencing a const_iterator results in a wchar_t, then string_type is the same as std::wstring.
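The effect of these typedefs can be checked at compile time with a small stand-alone trait. The name string_type_of is invented for this sketch and is not part of GRETA; it simply repeats the three typedefs above outside of the rpattern class.

```cpp
#include <iterator>
#include <string>
#include <type_traits>

// Hypothetical trait mirroring the typedefs above (not a GRETA name).
template <typename CI>
struct string_type_of {
    typedef CI const_iterator;
    typedef typename std::iterator_traits<const_iterator>::value_type char_type;
    typedef std::basic_string<char_type> string_type;
};

// Iterators over char produce std::string ...
static_assert(std::is_same<string_type_of<std::string::const_iterator>::string_type,
                           std::string>::value,
              "char iterators should map to std::string");
static_assert(std::is_same<string_type_of<const char*>::string_type,
                           std::string>::value,
              "const char* should map to std::string");

// ... and iterators over wchar_t produce std::wstring.
static_assert(std::is_same<string_type_of<const wchar_t*>::string_type,
                           std::wstring>::value,
              "const wchar_t* should map to std::wstring");
```

The typename keyword in the second typedef is required by standard C++ because the type depends on the template parameter.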

rpattern::rpattern

There are two constructors for instantiating an rpattern object. Here are their prototypes:

rpattern::rpattern(
    const string_type &pat,
    REGEX_FLAGS flags = NOFLAGS,
    REGEX_MODE mode = MODE_DEFAULT ); // throw(bad_alloc, bad
