Part 9 Review Questions and Exercises.docx
《Part 9 Review Questions and Exercises.docx》由会员分享,可在线阅读,更多相关《Part 9 Review Questions and Exercises.docx(13页珍藏版)》请在冰豆网上搜索。
Part9ReviewQuestionsandExercises
SOLUTIONSTOREVIEWQUESTIONS
ANDEXERCISES
FORPART9–BUSINESSINTELLIGENCE(CHAPTERS32–35)
SolutionstoReviewQuestionsandExercises
Chapter32DataWarehousingConcepts3
Chapter33DataWarehousing-Design6
Chapter34OLAP8
Chapter35DataMining13
Chapter32DataWarehousingConcepts
ReviewQuestions
32.1Describewhatismeantbythefollowingterms,whendescribingthecharacteristicsofthedatainadatawarehouse:
(a)subject-oriented;
(b)integrated;
(c)time-variant;
(d)non-volatile.
SeeSection32.1.2.
32.2DiscusshowOnlineTransactionProcessing(OLTP)systemsdifferfromdatawarehousingsystems.
SeeSection32.1.4
32.3Discussthemainbenefitsandproblemsassociatedwithdatawarehousing.
ForthemainbenefitsofdatawarehousingseeSection32.1.3andforthemainproblemsassociatedwithdatawarehousingseeSection32.1.5.
32.4Presentadiagrammaticrepresentationofthetypicalarchitectureandmaincomponentsofadatawarehouse.
ForadiagramofthetypicalarchitectureofadatawarehouseseeFigure32.1.
32.5Describethecharacteristicsandmainfunctionsofthefollowingcomponentsofadatawarehouse.
(a)loadmanagerSeeSection32.2.3
(b)warehousemanagerSeeSection32.2.4
(c)querymanagerSeeSection32.2.5
(d)metadataSeeSection32.2.9
(e)end-useraccesstools.SeeSection32.2.10
32.6Describetheprocessesassociatedwithdataextraction,cleansing,andtransformationtools.
TheextractionsteptargetsoneormoredatasourcesfortheEDW;thesesourcestypicallyincludeOLTPdatabasesbutcanalsoincludesourcessuchaspersonaldatabasesandspreadsheets,enterpriseresourceplanning(ERP)files,andwebusagelogfiles.Thedatasourcesarenormallyinternalbutcanalsoincludeexternalsources,suchasthesystemsusedbysuppliersand/orcustomers.
Thetransformationstepappliesaseriesofrulesorfunctionstotheextracteddata,whichdetermineshowthedatawillbeusedforanalysisandcaninvolvetransformationssuchasdatasummations,dataencoding,datamerging,datasplitting,datacalculations,andcreationofsurrogatekeys.Theoutputfromthetransformationsisdatathatiscleanandconsistentwiththedataalreadyheldinthewarehouse,andfurthermore,isinaformthatisreadyforanalysisbyusersofthewarehouse.
Theloadingofthedataintothewarehousecanoccurafteralltransformationshavetakenplaceoraspartofthetransformationprocessing.Asthedataloadsintothewarehouse,additionalconstraintsdefinedinthedatabaseschemaaswellasintriggersactivatedupondataloadingwillbeapplied(suchasuniqueness,referentialintegrity,andmandatoryfields),whichalsocontributetotheoveralldataqualityperformanceoftheETLprocess.
32.7Describethespecializedrequirementsofarelationaldatabasemanagementsystem(RDBMS)suitableforuseinadatawarehouseenvironment.
SeeSection32.4.2
32.8Discusshowparalleltechnologiescansupporttherequirementsofthedatawarehouse.
SeelasttopicdiscussedinSection32.4.2undertheheadingParalleldatabasetechnologies.
32.9Discusstheimportanceofmanagingmeta-dataandhowthisrelatestotheintegrationofthedatawarehouse.
SeeSection32.4.3.
32.10Discussthemaintasksassociatedwiththeadministrationandmanagementofadatawarehouse.
Thedatawarehouseadministrationandmanagementtoolsmustbecapableofsupportingthefollowingtasks:
∙monitoringdataloadingfrommultiplesources;
∙dataqualityandintegritychecks;
∙managingandupdatingmeta-data;
∙monitoringdatabaseperformancetoensureefficientqueryresponsetimesandresourceutilization;
∙auditingdatawarehouseusagetoprovideuserchargebackinformation;
∙replicating,subsetting,anddistributingdata;
∙maintainingefficientdatastoragemanagement;
∙purgingdata;
∙archivingandbacking-updata;
∙implementingrecoveryfollowingfailure;
∙securitymanagement.
SeeSection32.4.4.
32.11Discusshowdatamartsdifferfromdatawarehousesanddiscussthemainreasonsforimplementingadatamart.
ForadiscussiononhowdatamartsdifferfromdatawarehousesseeintroductoryparagraphsofSection321.5andforreasonsforimplementingadatamartseeSection32.5.1.
32.12DescribethefeaturesofOraclethatsupportthecorerequirementsofdatawarehousing.
SeeSection32.6.
Exercises
32.13YouareaskedbytheManagingDirectorofDreamHometoinvestigateandreportontheapplicabilityofdatawarehousingfortheorganization.ThereportshouldcomparedatawarehousetechnologywithOLTPsystemsandshouldidentifytheadvantagesanddisadvantages,andanyproblemareasassociatedwithimplementingadatawarehouse.ThereportshouldreachafullyjustifiedsetofconclusionsontheapplicabilityofadatawarehouseforDreamHome.
Theformatandtheappropriatecontenttobecoveredinansweringthisquestionisdescribedinthequestionset.
Chapter33DataWarehousing-Design
ReviewQuestions
33.1Discusstheactivitiesassociatedwithinitiatinganenterprisedatawarehouse(EDW)project.
Tobeginadatawarehouseproject,weneedanswersforquestionssuchas:
whichuserrequirementsaremostimportantandwhichdatashouldbeconsideredfirst?
Also,shouldtheprojectbescaleddownintosomethingmoremanageable,yetatthesametimeprovideaninfrastructurecapableofultimatelydeliveringafull-scaleenterprise-widedatawarehouse?
TherequirementscollectionandanalysisstageofanEDWprojectinvolvesinterviewingappropriatemembersofstaffsuchasmarketingusers,financeusers,salesusers,operationalusers,andmanagementtoenabletheidentificationofaprioritizedsetofrequirementsfortheenterprisethatthedatawarehousemustmeet.Atthesametime,interviewsareconductedwithmembersofstaffresponsibleforOLTPsystemstoidentify,whichdatasourcescanprovideclean,valid,andconsistentdatathatwillremainsupportedoverthenextfewyears.
Theinterviewsprovidethenecessaryinformationforthetop-downview(userrequirements)andthebottom-upview(whichdatasourcesareavailable)oftheEDW.
33.2CompareandcontrasttheapproachestakeninthedevelopmentofanEDWbyInmon’sCorporateInformationFactory(CIF)andKimball’sBusinessDimensionalLifecycle.
Inmon’sapproachistostartbycreatingadatamodelofalltheenterprise’sdata;oncecomplete,itisusedtoimplementanEDW.TheEDWisthenusedtofeeddepartmentaldatabases(datamarts),whichexisttomeettheparticularinformationrequirementsofeachdepartment.TheEDWcanalsoprovidedatatootherspecializeddecisionsupportapplicationssuchasCustomerRelationshipManagement(CRM).Inmon’smethodologyusestraditionaldatabasemethodsandtechniquestodeveloptheEDW.Forexample,entity–relationship(ER)modeling(Chapter12)isusedtodescribetheEDWdatabase,whichholdstablesthatareinthirdnormalform(Chapter14).InmonbelievesthatafullynormalizedEDWisrequiredtoprovidethenecessaryflexibilitytosupportthevariousoverlappinganddistinctinformationrequirementsofallpartsoftheenterprise.
Kimball’sapproachusesnewmethodsandtechniquesinthedevelopmentofanEDW.Kimballstartsbyidentifyingtheinformationrequirements(referredtoasanalyticalthemes)andassociatedbusinessprocessesoftheenterprise.ThisactivityresultsinthecreationofacriticaldocumentcalledaDataWarehouseBusMatrix.Thematrixlistsallofthekeybusinessprocessesofanenterprisetogetherwithanindicationofhowtheseprocessesaretobeanalyzed.Thematrixisusedtofacilitatetheselectionanddevelopmentofthefirstdatabase(datamart)tomeettheinformationrequirementsofaparticulargroupofusersoftheenterprise.Thisfirstdatamartiscriticalinsettingthesceneforthelaterintegrationofotherdatamartsastheycomeonline.TheintegrationofdatamartsultimatelyleadstothedevelopmentofanEDW.Kimballusesanewtechniquecalleddimensionalitymodelingtoestablishthedatamodel(referredtoasadimensionalmodel(DM)foreachdatamart.Dimensionalitymodelingresultsinthecreationofadimensionalmodel(commonlycalledastarschema)foreachdatamartthatishighlydenormalized.Kimballbelievesthattheuseofstarschemasisamoreintuitivewaytomodeldecisionsupportdataandfurthermorecanenhanceperformanceforcomplexanalyticalqueries.
33.3DiscussthemainprinciplesandstagesassociatedwithKimball’sBusinessDimensionalLifecycle.
ThemainstagesaresummarizedinFigure33.1.
33.4Discusstheconceptsassociatedwithdimensionalitymodeling.
Everydimensionalmodel(DM)iscomposedofonetablewithacompositeprimarykey,calledthefacttable,andasetofsmallertables,calleddimensiontables.Eachdimensiontablehasasimple(noncomposite)primarykeythatcorrespondsexactlytooneofthecomponentsofthecompositekeyinthefacttable.Inotherwords,theprimarykeyofthefacttableismadeupoftwoormoreforeignkeys.Thischaracteristic“star-like”structureiscalledastarschemaorstarjoin.AnotherimportantfeatureofaDMisthatallnaturalkeysarereplacedwithsurrogatekeys.Thismeansthateveryjoinbetweenfactanddimensiontablesisbasedonsurrogatekeys,notnaturalkeys.Eachsurrogatekeyshouldhaveageneralizedstructurebasedonsimpleintegers.TheuseofsurrogatekeysallowsthedatainthewarehousetohavesomeindependencefromthedatausedandproducedbytheOLTPsystems.
33.5Describehowstar,snowflake,andstarflakeschemasdiffer.
Starschemaisalogicalstructurethathasafacttablecontainingfactualdatainthecenter,surroundedbydim