Fuzzy XQuery
Marlene Goncalves
Universidad Simón Bolívar, Caracas, Venezuela
mgoncalves@usb.ve
Leonid Tineo
Universidad Simón Bolívar, Caracas, Venezuela
leonid@usb.ve
Abstract We present here a fuzzy-set-based extension to XQuery that allows users to express preferences when querying XML documents and to obtain discriminated answers. The extension comprises the new xs:truth built-in data type, intended to represent gradual truth degrees, as well as the xml:truth attribute for handling satisfaction degrees in nodes produced by fuzzy XQuery expressions. The language is extended to declare fuzzy terms and to use them in query expressions. An evaluation mechanism is also presented in order to avoid superfluous calculation of truth degrees.
Introduction
The Web plays an essential role for many online companies and has made an enormous amount of data available from many websites. Many websites offer online services for booking flights, hotels, rail passes, cruises, etc. Thus, the Web has become an indispensable tool for this kind of business.
Suppose some researchers want to attend a conference and therefore decide to search for the best flight using a travel company website. For one researcher, a flight may be selected if and only if it is very economical and makes few stops. A second researcher, however, prefers a flight whose destination is a nearby city, from which he may take a train to the city where the conference will be held. Thus, researchers may define several criteria to reach a final decision. Since many flights may be returned by websites, it is necessary to discriminate query answers and sort them by user criteria. A possible solution is a fuzzy database management system, which allows defining user criteria and ranking query answers by a membership function; a membership function quantifies the degree of satisfaction of each answer with respect to the user criteria and induces a total order on the set of objects.
Additionally, a travel website may have incomplete data on flights, e.g., a lack of information about some taxes. Consequently, several online travel companies may share data and communicate among themselves in order to answer a researcher's query. Usually, most of these websites use the XML (Extensible Markup Language) format [6] to interchange data. XML is a format recommended by the World Wide Web Consortium (W3C) and it has been extensively used to transfer data among websites. XML data may be queried through declarative query languages such as XPath [10] and XQuery [12]. Both languages are XML-centric, i.e., their data model and type system are based on XML.
XQuery is an extension of XPath and is the W3C standard query language for XML data. Most major database engines (IBM, Oracle, and Microsoft) support XQuery; it is a native XML query language conceived to integrate multiple XML sources.
On the other hand, XML structure may be defined using a Document Type Definition (DTD) [6] or XML Schema [11]. XML Schema is the W3C successor of DTD and it is the type system of XQuery. Furthermore, XML Schema is more extensible than DTD and, thanks to namespace support, makes it easier to share user-defined extensions.
In our example, a researcher's query may retrieve either a possibly large subset of XML data from several websites, much of which is useless to the researcher, or an empty answer, because the database engines of the websites are based on Boolean logic. Thus, the researcher has to discard irrelevant answers when the query result is huge, interesting answers may be lost when the query result is empty, and the query result is not discriminated in terms of all user criteria. Fuzzy-logic-based languages have been successfully employed to express user criteria and discriminate query answers. Several fuzzy query languages for relational databases have been introduced: SQLf [3], FSQL [7,8] and SoftSQL [2]. In this paper, we propose to incorporate fuzzy logic into the XQuery language as a step towards providing a more flexible native XML language. We assume the reader is familiar with XQuery syntax; details of XQuery are given in [12]. We extend almost all valid XQuery expressions in order to allow fuzzy conditions.
Background
In [13], Zadeh introduced fuzzy sets as an extension of classical ones. A fuzzy set is a set whose elements possess membership degrees determined by a membership function. The membership function of a fuzzy set F, denoted by μ_F, is a function into the range [0,1] whose values induce a total order.
Fuzzy sets are the basis of fuzzy logic. Fuzzy logic is a multi-valued logic whose truth values are represented by real numbers in the closed interval [0,1], where completely false corresponds to the value zero and completely true to the value one. The truth value of a proposition s is denoted by μ(s).
Linguistic terms may be specified using fuzzy logic, and they express the user's criteria. These terms help us to identify fuzzy terms such as predicates, modifiers, connectives, quantifiers and comparators. A fuzzy predicate is the basic term in fuzzy logic and it is expressed in natural language by positive adjectives, such as good, bad, cheap, and expensive. A fuzzy modifier transforms the membership function of a fuzzy predicate by increase, decrease, translation or reversal. In most cases, the adverbs very and extremely indicate the presence of a fuzzy modifier, e.g., very cheap and extremely expensive modify the membership functions of the predicates cheap and expensive. A fuzzy connective is an operator that connects fuzzy conditions by negation, conjunction and/or disjunction. A fuzzy comparator is a predicate defined on pairs of elements. Comparative adjectives, expressed in English by the ending -er or the word more, and pure adjectives such as better and worse, are examples of linguistic terms represented as fuzzy comparators. Finally, fuzzy quantifiers are extensions of the universal and existential ones. Quantitative adjectives such as few and many, and relative superlatives such as most of, are instances of fuzzy quantifiers. They may be absolute or proportional [REF]. Absolute quantifiers are defined on real numbers and describe fuzzy quantities such as about 5 or more than 20. Proportional ones correspond to numbers in the closed interval [0,1] and define relative quantities such as at least half or at most half.
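As a rough illustration of how a predicate, a modifier and connectives fit together, the following Python sketch may help. Everything in it is an assumption made for illustration: the breakpoints of cheap (100 and 300) are hypothetical, very is modeled by squaring the membership degree, and the connectives use the common min, max and complement operators. The text above does not fix any of these choices.

```python
def cheap(price):
    """Hypothetical fuzzy predicate: fully cheap up to 100, not cheap from 300 on."""
    if price <= 100:
        return 1.0
    if price >= 300:
        return 0.0
    return (300 - price) / 200  # linear decrease between the two breakpoints


def very(mu):
    """Assumed modifier: concentrates a predicate by squaring its degree."""
    return lambda x: mu(x) ** 2


def fuzzy_and(a, b):
    """Assumed conjunction: minimum of the two degrees."""
    return min(a, b)


def fuzzy_or(a, b):
    """Assumed disjunction: maximum of the two degrees."""
    return max(a, b)


def fuzzy_not(a):
    """Assumed negation: complement to one."""
    return 1.0 - a


very_cheap = very(cheap)
degree_cheap = cheap(200)            # degree of "cheap" for a price of 200
degree_very_cheap = very_cheap(200)  # the modifier lowers intermediate degrees
```

With these assumed definitions, a price that is only partly cheap is even less very cheap, while prices that are fully cheap or not cheap at all keep their degrees of 1 and 0.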
Some important concepts of fuzzy set theory are the support, the core, and the α-cut. They allow establishing relationships between fuzzy sets and classical ones. The support operator identifies those elements that are not fully excluded from the fuzzy set, i.e., the set of elements whose membership degree is greater than zero. The core operator identifies those elements that are fully included in the set, that is, the set of elements whose membership degree is 1. Finally, the α-cut operator defines the set of elements whose membership degree is greater than or equal to α ∈ [0,1].
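The three operators can be sketched in Python for a finite fuzzy set stored as a mapping from elements to membership degrees. The flight identifiers and degrees below are invented purely for illustration.

```python
# Hypothetical fuzzy set: element -> membership degree in [0, 1].
economical_flights = {"F1": 1.0, "F2": 0.7, "F3": 0.5, "F4": 0.2, "F5": 0.0}


def support(fuzzy_set):
    """Elements not fully excluded: membership degree greater than zero."""
    return {x for x, degree in fuzzy_set.items() if degree > 0}


def core(fuzzy_set):
    """Elements fully included: membership degree equal to one."""
    return {x for x, degree in fuzzy_set.items() if degree == 1}


def alpha_cut(fuzzy_set, alpha):
    """Elements whose membership degree is greater than or equal to alpha."""
    return {x for x, degree in fuzzy_set.items() if degree >= alpha}
```

Each operator returns an ordinary (crisp) set, which is what lets a fuzzy condition be related to a Boolean one.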
Because membership degrees must be computed and fuzzy sets contain more elements than classical ones, the time complexity of fuzzy query evaluation may be higher than that of crisp query evaluation. For this reason, Bosc et al. [REF] have conceived an evaluation mechanism based on the distribution of α-cuts over the conditions involved in a fuzzy query, in order to translate a fuzzy query into a regular one. This mechanism is known as the Derivation Principle. More recently, Ma [REF] has proposed the use of this principle as a unified approach for fuzzy query evaluation on relational databases. This principle has been widely studied by Tineo et al. [REF], showing the benefits of its application.
The derived query selects the desired subset of data without computing the satisfaction degree of the fuzzy condition for the complete input data. If the selected subset coincides exactly with the α-cut, the derivation is said to be strong. If this subset is a superset of the α-cut, the derivation is said to be weak [REF].
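To illustrate this principle, we denote by DS(q) the regular sentence derived from a fuzzy sentence q using a qualitative calibration, and by DNC(c, θ, α) the derived necessary condition that specifies the α-cut of a fuzzy condition c, where θ is a comparison operator and α is the threshold.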
Fig. 1. Representation of the fuzzy predicate economical
The user requirement "Find economical flights with a threshold of 0.5" may be specified in SQLf [3] as:
select * from flights where price = economical with calibration 0.5;
Here, with calibration is a clause that specifies the threshold. Let us assume the definition of the fuzzy predicate economical given in Fig. 1. Applying the Derivation Principle, we obtain a classic SQL query that retrieves the same rows as the fuzzy one.
DS(select * from flights where price = economical with calibration 0.5) = select * from flights where price ≤ 1400
This query derivation follows from the fact that, according to the definition of the α-cut of a fuzzy set, we may derive the Boolean condition:
DNC(price = economical, ≥, 0.5) = price ≤ 1400.
The fuzzy query is then processed on the result of the derived Boolean query; in this case, superfluous accesses and calculations for rows with price > 1400 are avoided. This strategy has been reported to reduce query processing time for SQLf.
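The effect of the derivation can be sketched in Python. Since Fig. 1 is not reproduced here, the shape of economical is an assumption: degree 1 up to a price of 1300, decreasing linearly to 0 at 1500, chosen only so that the 0.5-cut ends at 1400 as in the example. The flight rows and function names are likewise hypothetical.

```python
def economical(price):
    """Assumed membership function: 1 up to 1300, linear down to 0 at 1500."""
    if price <= 1300:
        return 1.0
    if price >= 1500:
        return 0.0
    return (1500 - price) / 200


# Hypothetical rows: (flight id, price).
flights = [("F1", 900), ("F2", 1350), ("F3", 1400), ("F4", 1450), ("F5", 2000)]


def naive_query(rows, alpha):
    """Compute the degree for every row, then keep those reaching alpha."""
    evaluated = 0
    answers = []
    for flight_id, price in rows:
        evaluated += 1
        degree = economical(price)
        if degree >= alpha:
            answers.append((flight_id, degree))
    return answers, evaluated


def derived_query(rows, alpha):
    """Filter with the derived Boolean condition, then compute degrees.

    For the assumed shape and 0 < alpha <= 1, economical(price) >= alpha
    holds exactly when price <= 1500 - 200 * alpha.
    """
    bound = 1500 - 200 * alpha
    evaluated = 0
    answers = []
    for flight_id, price in rows:
        if price <= bound:          # crisp condition, no degree needed
            evaluated += 1
            answers.append((flight_id, economical(price)))
    return answers, evaluated


naive_answers, naive_evaluated = naive_query(flights, 0.5)
derived_answers, derived_evaluated = derived_query(flights, 0.5)
```

Both functions return the same answers with the same degrees, but the derived version evaluates the membership function only for rows that pass the crisp filter. Under the assumed shape the filter selects exactly the 0.5-cut, so this corresponds to a strong derivation.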
Related Works
FSQL [7,8] and SQLf [3] are two fuzzy extensions of SQL. FSQL allows processing inexact information, and most of its fuzzy terms are predefined or defined during database modeling. SQLf supports fuzzy term definitions and SQL:2003 [5] features. Both languages are relational-database-centric.
Analogously, XML query languages have been extended with fuzzy logic. Some ideas for extending XPath with fuzzy terms were introduced in [4]: ranked lists through annotations (or comments) in the result, and the use of fuzzy predicates and fuzzy quantifiers. Nevertheless, they do not allow the user to define fuzzy terms; they offer only a small set of built-in predicates, and modifiers, comparators and connectives are not considered in [4]. Combination of fuzzy predicates is made by means of arithmetic operations on ranking variables instead of