Data Mining: Concepts and Techniques
October 6, 2022

Chapter 3: Data Warehousing and OLAP Technology: An Overview

- What is a data warehouse?
- A multi-dimensional data model
- Data warehouse architecture
- Data warehouse implementation
- From data warehousing to data mining

What Is a Data Warehouse?

- Defined in many different ways, but not rigorously.
  - A decision support database that is maintained separately from the organization's operational database
  - Supports information processing by providing a solid platform of consolidated, historical data for analysis
- "A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process." (W. H. Inmon)
- Data warehousing: the process of constructing and using data warehouses

Data Warehouse: Subject-Oriented

- Organized around major subjects, such as customer, product, sales
- Focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing
- Provides a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process

Data Warehouse: Integrated

- Constructed by integrating multiple, heterogeneous data sources
  - Relational databases, flat files, on-line transaction records
- Data cleaning and data integration techniques are applied.
  - Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources
    - E.g., hotel price: currency, tax, breakfast covered, etc.
  - When data is moved to the warehouse, it is converted.
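
The hotel-price example above can be sketched in code. The following is a minimal illustration, not a prescribed method. The record layout, the exchange rates, and the target convention (USD, tax included, breakfast included) are assumptions made for this example, as is the rule that a breakfast fee is quoted on the same tax basis as the room price.

```python
from dataclasses import dataclass

# Hypothetical exchange rates to the common currency (USD); illustrative values only.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.10, "JPY": 0.007}


@dataclass
class SourcePrice:
    """A nightly price as one (hypothetical) source system reports it."""
    hotel: str
    amount: float
    currency: str
    tax_included: bool
    tax_rate: float           # e.g. 0.10 for a 10 percent tax
    breakfast_included: bool
    breakfast_fee: float      # per night, in the source currency


def to_warehouse_price(p: SourcePrice) -> dict:
    """Convert a source record to one convention: USD, tax and breakfast included."""
    amount = p.amount
    if not p.breakfast_included:
        amount += p.breakfast_fee       # assumed to be on the same tax basis as the room
    if not p.tax_included:
        amount *= 1 + p.tax_rate
    return {"hotel": p.hotel, "price_usd": round(amount * RATES_TO_USD[p.currency], 2)}


rows = [
    SourcePrice("A", 100.0, "EUR", False, 0.10, False, 10.0),
    SourcePrice("B", 120.0, "USD", True, 0.0, True, 0.0),
]
converted = [to_warehouse_price(r) for r in rows]
```

After conversion, the two prices are directly comparable even though the sources quoted them under different conventions.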

Data Warehouse: Time-Variant

- The time horizon for the data warehouse is significantly longer than that of operational systems
  - Operational database: current value data
  - Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years)
- Every key structure in the data warehouse
  - Contains an element of time, explicitly or implicitly
  - But the key of operational data may or may not contain a "time element"
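
The difference in key structure can be illustrated with a small sketch. The dictionaries and the `record_balance` helper below are hypothetical. They only show that a key carrying an explicit time element lets history accumulate, while a key without one holds just the current value.

```python
from datetime import date

# Operational store: one current value per key; each update overwrites the last.
operational = {}
# Warehouse store: the key includes an explicit time element, so history is kept.
warehouse_history = {}


def record_balance(customer_id: str, balance: float, as_of: date) -> None:
    """Write the same fact to both stores to contrast their key structures."""
    operational[customer_id] = balance
    warehouse_history[(customer_id, as_of)] = balance


record_balance("c1", 100.0, date(2021, 1, 31))
record_balance("c1", 250.0, date(2022, 1, 31))

# All snapshots for one customer, ordered by date.
history = sorted((d, b) for (cid, d), b in warehouse_history.items() if cid == "c1")
```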

Data Warehouse: Nonvolatile

- A physically separate store of data transformed from the operational environment
- Operational update of data does not occur in the data warehouse environment
  - Does not require transaction processing, recovery, and concurrency control mechanisms
  - Requires only two operations in data accessing: initial loading of data and access of data
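
As a rough sketch of the "two operations" idea, the hypothetical `NonvolatileStore` class below exposes only a load and a read path. It is an illustration under simplifying assumptions (in-memory rows, no persistence), not a description of how any warehouse product is built.

```python
from types import MappingProxyType


class NonvolatileStore:
    """Hypothetical store that supports loading and reading, but no in-place update."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Copy each row and wrap it read-only, so neither later changes at the
        # source nor callers of query() can modify what was loaded.
        self._rows.extend(MappingProxyType(dict(r)) for r in rows)

    def query(self, predicate=lambda r: True):
        # Access of data: return the rows that satisfy the predicate.
        return [r for r in self._rows if predicate(r)]


store = NonvolatileStore()
store.load([{"id": 1, "amount": 10}, {"id": 2, "amount": 25}])
large = store.query(lambda r: r["amount"] > 20)

try:
    large[0]["amount"] = 0        # attempted update through the read path
    update_rejected = False
except TypeError:
    update_rejected = True
```

Because nothing is updated in place, the sketch needs no locking or rollback logic, which mirrors the point about not requiring concurrency control and recovery mechanisms.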

Data Warehouse vs. Heterogeneous DBMS

- Traditional heterogeneous DB integration: a query-driven approach
  - Build wrappers/mediators on top of heterogeneous databases
  - When a query is posed to a client site, a meta-dictionary is used to translate the query into queries appropriate for the individual heterogeneous sites involved, and the results are integrated into a global answer set
  - Complex information filtering, compete for resources
- Data warehouse: update-driven, high performance
  - Information from heterogeneous sources is integrated in advance and stored in warehouses for direct query and analysis
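
The two approaches can be contrasted in a few lines. Everything below (the two sources, their schemas, and the wrapper functions) is hypothetical and stands in for real heterogeneous databases. The sketch only shows when the integration work happens: at query time, or in advance.

```python
# Two hypothetical sources with different schemas and units.
source_a = [{"cust": "ann", "total": 30}, {"cust": "bob", "total": 20}]
source_b = [{"customer_name": "ann", "amount_cents": 1500}]


# Wrappers translate each source into a common (customer, amount) form.
def wrap_a():
    return [(r["cust"], float(r["total"])) for r in source_a]


def wrap_b():
    return [(r["customer_name"], r["amount_cents"] / 100) for r in source_b]


def query_driven_total(customer):
    """Mediator style: contact every source at query time and merge the answers."""
    return sum(amt for wrapper in (wrap_a, wrap_b)
               for name, amt in wrapper() if name == customer)


def build_integrated_store():
    """Update-driven style: integrate in advance and keep the consolidated result."""
    totals = {}
    for wrapper in (wrap_a, wrap_b):
        for name, amt in wrapper():
            totals[name] = totals.get(name, 0.0) + amt
    return totals


integrated = build_integrated_store()      # done once, ahead of any query
ann_live = query_driven_total("ann")       # touches both sources on every call
ann_stored = integrated["ann"]             # a direct lookup, no source access
```

Both paths give the same answer here. The difference is that the query-driven path repeats the translation and competes with the sources' own workloads on each query.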

Data Warehouse vs. Operational DBMS

- OLTP (on-line transaction processing)
  - Major task of traditional relational DBMS
  - Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
- OLAP (on-line analytical processing)
  - Major task of data warehouse system
  - Data analysis and decision making
- Distinct features (OLTP vs. OLAP):
  - User and system orientation: customer vs. market
  - Data contents: current, detailed vs. historical, consolidated
  - Database design: ER + application vs. star + subject
  - View: current, local vs. evolutionary, integrated
  - Access patterns: update vs. read-only but complex queries
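
The difference in access patterns can be seen with Python's standard `sqlite3` module. The `sales` table and its rows are made up for illustration. A single in-memory database is used only to keep the sketch short, whereas the slides argue for keeping the two workloads on separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
    [("east", 2021, 100.0), ("east", 2022, 150.0),
     ("west", 2021, 80.0), ("west", 2022, 120.0)],
)

# OLTP-style access: a short transaction that updates one current record.
with conn:
    conn.execute("UPDATE sales SET amount = amount + 10 WHERE id = ?", (1,))

# OLAP-style access: a read-only query that consolidates many historical rows.
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```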

OLTP vs. OLAP

Why Separate Data Warehouse?

- High performance for both systems
  - DBMS tuned for OLTP: access methods, indexing, concurrency control, recovery
  - Warehouse tuned for OLAP: complex OLAP queries, multidimensional view, consolidation
- Different functions and different data:
  - Missing data: decision support requires historical data which operational DBs do not typically maintain
  - Data consolidation: DS requires consolidation (aggregation, summarization) of data from heterogeneous sources
  - Data quality: different sources typically use inconsistent data representations, codes and formats which have to be reconciled
- Note: There are more and more systems which perform OLAP analysis directly on relational databases