Mechanical English Translation (Word download).docx
How can we avoid being submerged by information, discover useful knowledge in it in time, and improve the utilization of information?
Facing this challenge, data mining came into being and has developed rapidly, showing strong vitality.
Data mining is the process of extracting, from large amounts of incomplete, noisy, fuzzy, and random raw data, the information and knowledge that is implicit in the data, not known in advance, but potentially useful. The birth of data mining technology is the result of long-term research and development on databases, and the development of data mining has in turn led database technology into a more advanced stage.
The traditional data environment is basically operational. A traditional information system is only responsible for adding, deleting, and modifying data, and the work realized on top of the database is OLTP (On-Line Transaction Processing). Now that data keeps accumulating, people need an analytical data environment, and so the data warehouse was derived from the database; it serves as the basis for OLAP (On-Line Analytical Processing).
With massive data collection, enhanced computer processing power, and advanced data mining algorithms, data mining technology can not only query and traverse past data, but also identify potentially valuable links among those data and present them in certain forms, which goes a long way toward meeting people's urgent need for knowledge.
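The contrast between OLTP and OLAP described above can be illustrated with a small sketch. It assumes a hypothetical `orders` table held in an in-memory SQLite database; the table name, columns, and rows are invented for illustration and do not come from the article.

```python
import sqlite3

# Hypothetical "orders" table; the schema and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# OLTP-style work: small transactions that add, modify, and delete rows.
with conn:
    conn.executemany(
        "INSERT INTO orders (id, region, amount) VALUES (?, ?, ?)",
        [(1, "north", 10.0), (2, "north", 30.0), (3, "south", 5.0), (4, "south", 99.0)],
    )
with conn:
    conn.execute("UPDATE orders SET amount = 15.0 WHERE id = 3")
    conn.execute("DELETE FROM orders WHERE id = 4")

# OLAP-style work: an analytical query that aggregates the accumulated data.
totals = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The first half touches individual rows, while the final query summarizes all rows at once. Data mining goes one step further than such fixed summaries by searching for links that were not specified in advance.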
Data mining forms knowledge from original data sources. The source can be structured data, such as a relational database; it can be semi-structured data, such as text, graphics, and images; or it can even be heterogeneous data distributed over a network. This article focuses on one kind of semi-structured data mining, Web-based data mining. It introduces the basic concepts and commonly used techniques, and finally gives a brief explanation of how XML is applied in it.
1. Basic concepts of Web-based data mining
1. What is Web-based data mining
With the rapid development of the network, Web sites abound. But in an increasingly competitive Internet economy, only a site that wins customers can ultimately gain a competitive advantage. A site administrator or owner should know what users do on the Web site: which parts of the site most users like, which parts tire them, where security vulnerabilities appear, which changes significantly improved customer satisfaction and, conversely, which changes drove users away. As the saying goes, "know thyself."
Web-based data mining technology is able to meet those needs.
So far there is no clear and authoritative statement of the exact definition of Web-based data mining. A broad view holds that Web-based data mining is the process of using data mining techniques to automatically discover and extract information from network documents and services. In Taiwan, opinions differ: some regard it as the process of extracting information from the Web for a given purpose, based on the inherent characteristics of data objects learned from a large number of known data samples. Some scholars also include network information retrieval and Web content development in the network environment under data mining. In short, Web-based data mining (Web Mining) is the process of tapping hidden, potentially useful knowledge from raw data accessed on the World Wide Web (WWW) and ultimately using it in commercial operations to meet the needs of managers.
2. Classification of Web-based data mining
According to the object being mined, Web-based data mining can be divided into three categories:
Web content mining (Web Content Mining)
Web structure mining (Web Structure Mining)
Web usage mining (Web Usage Mining)
(1) Web content mining
Web content mining is the process of acquiring knowledge from Web documents and their descriptions. Mining of Web documents, and concept-based indexing or agent-based search of resources, also belong to this category. There are many types of Web information resources. WWW information resources have become the main body of network information resources, but apart from the large portion that people can reach directly through Web crawling, indexing, and query services, a considerable part of the information is hidden (for example, results generated dynamically in response to user queries, data held in database systems, or private data). Such data cannot be indexed, so no effective retrieval method is available for it, which forces us to dig it out. From the perspective of the form of information resources, Web content is composed of text, images, audio, video, metadata, and other forms of data, so the Web content mining referred to here is also a kind of multimedia data mining.
(2) Web structure mining
This type of mining is the process of discovering knowledge from the overall structure of the World Wide Web and from the links between Web pages; it mainly mines the latent link-structure patterns of the Web. The idea comes from citation analysis: by analyzing the links of a Web page, the number of links, and the objects linked to, a link-structure model of the Web itself is established. This model can be used to classify Web pages, and it yields information about the degree of similarity and association between different pages. Web structure mining helps users find authoritative sites on related topics, and it is significant for ranking search results over network resources.
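The citation-analysis idea above can be sketched in its simplest form: treat every incoming link as a citation and rank pages by how many distinct pages link to them. The link graph below is hypothetical, and counting in-links is only a minimal stand-in for the link-structure models the text alludes to, not the article's own algorithm.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html", "b.html"],
}

def inlink_counts(graph):
    """Count how many distinct pages link to each target page."""
    counts = Counter()
    for source, targets in graph.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts

counts = inlink_counts(links)
# Most-linked pages first; ties are broken alphabetically.
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
print(ranked)
```

Pages that nobody links to (here `d.html`) do not appear in the ranking at all, which matches the intuition that authority is conferred by other pages.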
(3) Web usage mining
Web usage mining is also known as Web log mining (Web Log Mining). The first two approaches mine the original online data, whereas Web usage mining deals with second-hand data extracted from the interaction between users and the network. These data include Web server access logs, proxy server logs, user registration information, the behavior and actions of users when they visit the Web site, and so on. Web usage mining records these data one by one in log files, and then mines the accumulated log files to understand users' Web behavior and give the data meaning. The example given earlier falls into this type.
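As a minimal sketch of the log mining described above, the following code parses Web server access-log lines and counts successful requests and distinct visitors per page. It assumes the logs are in Common Log Format; the sample lines, the `LOG_PATTERN` expression, and the `page_hits` helper are all illustrative and not taken from the article.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Common Log Format.
LOG_LINES = [
    '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2024:10:00:05 +0000] "GET /products.html HTTP/1.1" 200 2048',
    '10.0.0.1 - - [01/Jan/2024:10:00:09 +0000] "GET /products.html HTTP/1.1" 200 2048',
    '10.0.0.3 - - [01/Jan/2024:10:01:00 +0000] "GET /missing.html HTTP/1.1" 404 128',
    'not a log line',
]

LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)$'
)

def page_hits(lines):
    """Return (hits per page, set of visiting hosts per page) for status-200 requests."""
    hits = Counter()
    visitors = {}
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match["status"] != "200":
            continue  # count only successful requests
        hits[match["path"]] += 1
        visitors.setdefault(match["path"], set()).add(match["host"])
    return hits, visitors

hits, visitors = page_hits(LOG_LINES)
print(hits.most_common())
```

Counts of this kind are the raw material for the questions raised earlier, such as which parts of a site most users like.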
The three forms of mining are compared below on five aspects; their specific content is described further later.
Web content mining:
unstructured, semi-structured \ text documents, hypertext documents \ bag of words, n-grams, words or phrases, concepts or entities, relational data \ TF-IDF and variants, statistical machine learning (including natural language processing) \ classification, clustering, discovering extraction rules, discovering patterns in text, building models.
Web structure mining:
semi-structured, Web site as a database, link structure \ hypertext documents, links \ edge-labeled graph (OEM), relational data, graph \ proprietary algorithms, ILP, (modified) association rules \ discovering frequent substructures, discovering Web site schemas, classification, clustering.
Web usage mining:
interactivity \ server logs, browser logs \ relational tables, graphs \ proprietary algorithms, statistical machine learning, (modified) association rules \ site construction and management, improving sales, building user models.
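The bag-of-words representation and TF-IDF weighting listed for Web content mining can be sketched as follows. This is one common variant, assumed here for illustration: term frequency is the relative frequency within a document, and inverse document frequency is the natural logarithm of N/df. The three sample documents and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical page texts; real input would be text extracted from Web documents.
docs = {
    "page1": "data mining finds patterns in data",
    "page2": "web mining applies data mining to the web",
    "page3": "xml describes semi structured documents",
}

def bag_of_words(text):
    """Represent a text as word counts, ignoring word order."""
    return Counter(text.lower().split())

def tf_idf(documents):
    """Weight each term by relative frequency times log(N / document frequency)."""
    bags = {name: bag_of_words(text) for name, text in documents.items()}
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags.values():
        doc_freq.update(bag.keys())  # each document counts once per term
    weights = {}
    for name, bag in bags.items():
        total = sum(bag.values())
        weights[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bag.items()
        }
    return weights

weights = tf_idf(docs)
for name, terms in weights.items():
    best = max(terms, key=terms.get)
    print(name, best, round(terms[best], 3))
```

A term that occurs in every document receives weight zero under this variant, so the weighting favors terms that distinguish one page from the others.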
3. Characteristics of Web-based data mining
(1) What is semi-structured data
Semi-structured data is defined relative to structured and unstructured data. We call the data in traditional databases fully structured data, while some data, such as a book or a picture, has no structure at all and is unstructured data. Semi-structured data lies somewhere in between: it has an implicit schema, an irregular information structure, and no strict type constraints. The semi-structured data model has the following characteristics:
Data first, schema afterwards;
The semi-structured data model is used to describe the structure of the data, rather than to impose constraints on that structure;
The semi-structured data model is imprecise: it may describe only part of the structure of the data, and it may vary with the perspective of each stage of data processing;
The semi-structured data model may be very large, even larger than the source data, and it is updated continuously because the data is changing dynamically.
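The "data first, schema afterwards" characteristic can be made concrete with a small sketch. Assuming some hypothetical records whose fields and types vary from one record to the next, the schema below is derived from the data after the fact and only describes what was observed; it does not constrain new records. The record contents and the `infer_schema` helper are illustrative.

```python
# Hypothetical semi-structured records: fields and types vary between records.
records = [
    {"title": "Data Mining", "author": "Li", "year": 2001},
    {"title": "Web Mining", "year": "2003", "tags": ["web", "logs"]},
    {"title": "XML Basics", "author": {"first": "A", "last": "B"}},
]

def infer_schema(items):
    """Derive a descriptive schema: field -> observed type names and occurrence count."""
    schema = {}
    for item in items:
        for field, value in item.items():
            entry = schema.setdefault(field, {"types": set(), "count": 0})
            entry["types"].add(type(value).__name__)
            entry["count"] += 1
    return schema

schema = infer_schema(records)
for field, entry in sorted(schema.items()):
    optional = entry["count"] < len(records)
    print(field, sorted(entry["types"]), "optional" if optional else "always present")
```

Adding one more record with a new field would change the inferred schema, which is the sense in which the model is updated continuously as the data changes.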
(2) Characteristics of Web data
The most important feature of data on the Web is that it is semi-structured. Data on the Web differs from data in traditional databases. A traditional database has a data model by which specific data can be described, and the data is stored, centralized or distributed, according to definite rules of a specific organization, so it is strongly structured;
on the Web, the data is very complex, there is no specific model to describe it, each site is designed independently, and the data itself is self-describing and dynamically variable, so data on the Web is not strongly structured. At the same time, Web pages have descriptive levels, and a single site is built according to its own architecture, which gives it some structure. Therefore, we believe that