外文文献及翻译.docx

Uploaded by: b****4  Document ID: 2830325  Upload date: 2022-11-15  Format: DOCX  Pages: 11  Size: 27.67 KB


Foreign-Language Literature and Translation

 

What is Data Mining?

Many people treat data mining as a synonym for another popularly used term, “Knowledge Discovery in Databases”, or KDD. Alternatively, others view data mining as simply an essential step in the process of knowledge discovery in databases. Knowledge discovery consists of an iterative sequence of the following steps:

· data cleaning: to remove noise or irrelevant data,

· data integration: where multiple data sources may be combined,

· data selection: where data relevant to the analysis task are retrieved from the database,

· data transformation: where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations, for instance,

· data mining: an essential process where intelligent methods are applied in order to extract data patterns,

· pattern evaluation: to identify the truly interesting patterns representing knowledge based on some interestingness measures, and

· knowledge presentation: where visualization and knowledge representation techniques are used to present the mined knowledge to the user.
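
The sequence of steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not a method taken from the text: the two data sources (store_a, store_b), every function name, and the 0.5 support threshold are hypothetical, and simple pair counting stands in for whatever "intelligent method" a real system would apply in the mining step.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw data from two sources; None marks a noisy record.
store_a = [("t1", ["milk", "bread"]), ("t2", None), ("t3", ["milk", "bread", "eggs"])]
store_b = [("t4", ["bread", "eggs"]), ("t5", ["milk", "bread"])]

def clean(records):
    """Data cleaning: drop records whose item list is missing or empty."""
    return [(tid, items) for tid, items in records if items]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [rec for src in sources for rec in src]

def select(records, relevant):
    """Data selection: keep only the items relevant to the analysis task."""
    return [(tid, [i for i in items if i in relevant]) for tid, items in records]

def transform(records):
    """Data transformation: consolidate each transaction into a sorted tuple."""
    return [tuple(sorted(set(items))) for _, items in records]

def mine(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(t, 2))
    return counts

def evaluate(counts, n_transactions, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {pair: c / n_transactions for pair, c in counts.items()
            if c / n_transactions >= min_support}

def present(patterns):
    """Knowledge presentation: render the patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]

def run_pipeline():
    records = integrate(clean(store_a), clean(store_b))
    records = select(records, relevant={"milk", "bread", "eggs"})
    transactions = transform(records)
    patterns = evaluate(mine(transactions), len(transactions), min_support=0.5)
    return present(patterns)

if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

Each function corresponds to one step, so the iterative nature of the process shows up as the freedom to rerun any stage (for example, with a different relevant set or threshold) without touching the others.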

The data mining step may interact with the user or a knowledge base. The interesting patterns are presented to the user, and may be stored as new knowledge in the knowledge base. Note that according to this view, data mining is only one step in the entire process, albeit an essential one since it uncovers hidden patterns for evaluation.

We agree that data mining is a knowledge discovery process. However, in industry, in media, and in the database research milieu, the term “data mining” is becoming more popular than the longer term of “knowledge discovery in databases”. Therefore, in this book, we choose to use the term “data mining”. We adopt a broad view of data mining functionality: data mining is the process of discovering interesting knowledge from large amounts of data stored either in databases, data warehouses, or other information repositories.

Based on this view, the architecture of a typical data mining system may have the following major components:

1. Database, data warehouse, or other information repository. This is one or a set of databases, data warehouses, spreadsheets, or other kinds of information repositories. Data cleaning and data integration techniques may be performed on the data.

2. Database or data warehouse server. The database or data warehouse server is responsible for fetching the relevant data, based on the user’s data mining request.

3. Knowledge base. This is the domain knowledge that is used to guide the search, or evaluate the interestingness of resulting patterns. Such knowledge can include concept hierarchies, used to organize attributes or attribute values into different levels of abstraction. Knowledge such as user beliefs, which can be used to assess a pattern’s interestingness based on its unexpectedness, may also be included. Other examples of domain knowledge are additional interestingness constraints or thresholds, and metadata (e.g., describing data from multiple heterogeneous sources).

4. Data mining engine. This is essential to the data mining system and ideally consists of a set of functional modules for tasks such as characterization, association analysis, classification, evolution and deviation analysis.

5. Pattern evaluation module. This component typically employs interestingness measures and interacts with the data mining modules so as to focus the search towards interesting patterns. It may access interestingness thresholds stored in the knowledge base. Alternatively, the pattern evaluation module may be integrated with the mining module, depending on the implementation of the data mining method used. For efficient data mining, it is highly recommended to push the evaluation of pattern interestingness as deep as possible into the mining process so as to confine the search to only the interesting patterns.

6. Graphical user interface. This module communicates between users and the data mining system, allowing the user to interact with the system by specifying a data mining query or task, providing information to help focus the search, and performing exploratory data mining based on the intermediate data mining results. In addition, this component allows the user to browse database and data warehouse schemas or data structures, evaluate mined patterns, and visualize the patterns in different forms.
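
Two of the components above, the knowledge base and the pattern evaluation module, can be illustrated together in a short sketch. Everything named here is an assumption made for illustration: the concept hierarchy entries, the min_support_count threshold, and the sample transactions are hypothetical. The mining routine uses an Apriori-style level-wise search, which is one well-known way to push a support threshold into the mining process; the text itself does not prescribe a particular algorithm.

```python
from collections import Counter
from itertools import combinations

# Knowledge base (all values are illustrative assumptions).
KNOWLEDGE_BASE = {
    # Concept hierarchy: maps a low-level item to a higher-level concept.
    "concept_hierarchy": {
        "skim milk": "dairy", "cheddar": "dairy",
        "baguette": "bakery", "rye loaf": "bakery",
        "apple": "fruit",
    },
    # Interestingness threshold: minimum number of supporting transactions.
    "min_support_count": 2,
}

# Hypothetical transactions at the lowest level of abstraction.
RAW = [
    ["skim milk", "baguette"],
    ["cheddar", "rye loaf", "apple"],
    ["skim milk", "apple"],
    ["baguette"],
]

def generalize(transaction, hierarchy):
    """Replace each item by its parent concept (one level of abstraction up)."""
    return frozenset(hierarchy.get(item, item) for item in transaction)

def frequent_itemsets(transactions, min_count, max_size=3):
    """Level-wise search that prunes infrequent itemsets while mining.

    A k-itemset becomes a candidate only if every (k-1)-subset was frequent,
    so the threshold confines the search instead of filtering afterwards.
    """
    result = {}
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in counts.items() if c >= min_count}
    k = 1
    while current and k <= max_size:
        result.update(current)
        k += 1
        items = sorted(set().union(*current))
        candidates = [
            frozenset(c) for c in combinations(items, k)
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        ]
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {c: n for c, n in counts.items() if n >= min_count}
    return result

def mine(raw_transactions, kb=KNOWLEDGE_BASE):
    """Mining engine: generalize via the knowledge base, then search."""
    generalized = [generalize(t, kb["concept_hierarchy"]) for t in raw_transactions]
    return frequent_itemsets(generalized, kb["min_support_count"])

if __name__ == "__main__":
    for itemset, count in mine(RAW).items():
        print(sorted(itemset), count)
```

In this sample, no pair of raw items occurs in two transactions, but after generalization the pairs {dairy, bakery} and {dairy, fruit} each do. That is the role the text assigns to concept hierarchies: moving to a higher level of abstraction can expose patterns that are too sparse at the lowest level.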

From a data warehouse perspective, data mining can be viewed as an advanced stage of on-line analytical processing (OLAP). However, data mining goes far beyond the narrow scope of summarization-style analytical processing of data warehouse systems by incorporating more advanced techniques for data understanding.
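
To make "summarization-style analytical processing" concrete, the sketch below performs a roll-up, the kind of aggregation an OLAP tool offers. The SALES fact table and the roll_up function are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Hypothetical fact table: (region, quarter, sales amount).
SALES = [
    ("north", "Q1", 120.0), ("north", "Q2", 150.0),
    ("south", "Q1", 80.0), ("south", "Q2", 95.0),
]

def roll_up(facts, dim_index):
    """Sum the measure (last field) grouped by one dimension."""
    totals = defaultdict(float)
    for row in facts:
        totals[row[dim_index]] += row[-1]
    return dict(totals)

by_region = roll_up(SALES, 0)   # totals per region
by_quarter = roll_up(SALES, 1)  # totals per quarter

if __name__ == "__main__":
    print(by_region)
    print(by_quarter)
```

Aggregates like these answer questions the analyst already knows to ask. Data mining, as described above, goes further by searching for patterns that were not specified in advance.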

While there may be many “data mining systems” on the market, not all of them can perform true data mining. A data analysis system that does not handle large amounts of data can at most be categorized as a machine learning system, a statis
