DTData Integration Architecture Elvis Zhang.docx
Data Technology
Data Integration Architecture
The purpose of this document is to provide data standards for the use and maintenance of enterprise data. It is a White Paper containing information on the architecture and processes required to integrate disparate data sources. It includes information on:
❑ Comparison between Data Integration and Application Integration
❑ Components & Processes of Data Integration
❑ XML’s use in Data Integration
Table of Contents
I. Introduction
A. Differentiating Data Integration from Application Integration (EAI)
1. Each Technology solves different business problems
2. Each Technology requires different tools
B. Factors In Determining Which Technology To Use
1. Limitations of Application Integration (EAI)
2. Strengths of Data Integration
C. Guidelines for Selection
II. “Actual” Data Integration (ODS Based)
A. Data Integration using an Operational Data Store (ODS)
B. ODS Components & Processes
1. Components
2. Processes
III. “Virtual” Data Integration (XML-Based)
I. Introduction
Integrating applications is simple from a communication perspective, but it can be extremely challenging from a data perspective. Multiple applications can be made to send and receive messages and transactions among each other, but if they do not share a common understanding of the context and meaning of the data involved, the result will be incomplete or inaccurate information in one or more of the applications. Therefore, application integration (EAI) should not be implemented as an isolated technology but rather as part of a broader integration strategy that evaluates which type of integration (data or application) is appropriate for which task.
Even though application integration is getting more attention today, the need for data integration is also growing rapidly, driven primarily by e-business and portal requirements. This paper explores the critical data integration issues that need to be evaluated as part of a middleware-based application integration strategy. It also offers guidelines for when migrating key elements of enterprise information to a centralized data repository is preferable to an EAI middleware solution, along with the tools and standards for doing so.
A. Differentiating Data Integration from Application Integration (EAI)
The key difference between application integration and data integration is that application integration enables real-time sharing of data between different systems and/or applications, whereas data integration combines data from disparate sources into a new, consolidated data resource.
❑ Application integration, which is the creation of new strategic business solutions by reusing the functionality of existing applications, involves the use of EAI middleware to connect disparate systems and/or applications. EAI middleware enables one application, or database, to "communicate" with another application, or database, but does not require any change to the existing data in the underlying databases.
❑ Data integration involves the use of ETL middleware to reduce data redundancy by collecting and reorganizing disparate data into one physical or logical place. Integrated data repositories are the alternative to EAI; they present a centralized, logical approach to integrating information. The physical implementation of data integration could be a centralized enterprise data warehouse or several logical data repositories (ODS).
1. Each Technology solves different business problems
There are many requirements for EAI, from business-to-business (B2B) to internal application integration scenarios; however, different integration scenarios require different integration technologies.
❑ Application Integration (EAI)
The business driver behind application integration (the creation of new strategic business solutions by integrating the functionality of an enterprise's existing applications) is the recognition that stove-piped applications, typically transaction processing (OLTP) systems, automate individual steps in a larger business process. By capturing output from one system and routing it into a receiving application, a broader business process is automated. Three critical attributes of application integration are:
1) participating systems are tightly dependent,
2) integration is done at the application level to preserve transactional integrity, and
3) both systems are process centric in their design.
❑ Data Integration
The goal of data integration is to get redundant data, stored in multiple independent systems, to "agree on the facts," since there is no practical way to re-engineer the applications and eliminate the redundancy. In today’s environment, data consistency is important for both OLTP and DSS architectures. When data consistency is the sole motivation for integrating previously independent applications, data integration technology, rather than EAI technology, should be used. Three critical concepts are:
1) participating systems are loosely coupled,
2) integration is done at the database level, and
3) the receiving system is data centric in its design.
2. Each Technology requires different tools
EAI tools are the most appropriate integration technology for application integration. Within EAI tools, logic and business rules are developed that transform the syntax and semantics of the sending application's data, messages or transactions into inputs that are semantically and syntactically consistent with those of the receiving systems.
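As a minimal sketch of the kind of transformation rule an EAI tool hosts, the following Python maps a message from a hypothetical order-entry application into the shape a hypothetical billing application expects. The field names, the status codes, and the `to_billing_message` function are illustrative assumptions, not part of any particular EAI product.

```python
from datetime import datetime

# Hypothetical semantic mapping: the sender's status codes vs. the receiver's vocabulary.
STATUS_MAP = {"O": "OPEN", "S": "SHIPPED", "X": "CANCELLED"}


def to_billing_message(order_msg: dict) -> dict:
    """Translate one sender message into the receiver's syntax and semantics."""
    return {
        # Syntactic change: rename the key and normalize the identifier to a string.
        "customer_id": str(order_msg["CustNo"]),
        # Syntactic change: an MM/DD/YYYY string becomes an ISO 8601 date string.
        "order_date": datetime.strptime(order_msg["OrdDt"], "%m/%d/%Y").date().isoformat(),
        # Semantic change: the sender stores cents, the receiver expects currency units.
        "amount": order_msg["AmtCents"] / 100,
        # Semantic change: single-letter codes become the receiver's status names.
        "status": STATUS_MAP[order_msg["Stat"]],
    }


# Illustrative message only; the values are made up for this sketch.
incoming = {"CustNo": 1042, "OrdDt": "03/15/2001", "AmtCents": 129950, "Stat": "S"}
outgoing = to_billing_message(incoming)
print(outgoing)
```

The sending application's data is left untouched; only the message in flight is reshaped, which matches the earlier point that EAI does not change the data in the underlying databases.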
ETL tools are more appropriate for data integration. Data must be complete, consistent, timely and relevant to create information quality. Therefore, data originating in two or more applications must be reconciled into a semantically consistent format. ETL tools integrate data from multiple systems, reconciling redundant and overlapping data into a single, consistent integrated data structure and enabling syntax and semantic transformations to be applied to the data.
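The reconciliation step can be illustrated with a small batch sketch in Python, assuming two hypothetical source extracts (`crm_rows` and `billing_rows`) that hold overlapping customer records in different layouts. The source names, the field names, and the "most recently updated record wins" rule are assumptions chosen for illustration; real ETL tools express such rules through their own mapping facilities.

```python
from datetime import date

# Extract: two hypothetical sources with overlapping customers and different layouts.
crm_rows = [
    {"cust_id": "0042", "name": "ACME CORP ", "updated": date(2001, 3, 1)},
    {"cust_id": "0077", "name": "Globex", "updated": date(2001, 1, 9)},
]
billing_rows = [
    {"CustomerNo": 42, "CustomerName": "Acme Corporation", "LastChange": date(2001, 4, 2)},
]


def from_crm(row: dict) -> dict:
    """Transform a CRM row into the common integrated layout."""
    return {"customer_id": int(row["cust_id"]), "name": row["name"].strip().title(),
            "updated": row["updated"], "source": "crm"}


def from_billing(row: dict) -> dict:
    """Transform a billing row into the common integrated layout."""
    return {"customer_id": row["CustomerNo"], "name": row["CustomerName"].strip(),
            "updated": row["LastChange"], "source": "billing"}


def reconcile(records: list[dict]) -> dict[int, dict]:
    """Load step: keep one record per customer, preferring the latest update."""
    integrated: dict[int, dict] = {}
    for rec in records:
        current = integrated.get(rec["customer_id"])
        if current is None or rec["updated"] > current["updated"]:
            integrated[rec["customer_id"]] = rec
    return integrated


staged = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
integrated = reconcile(staged)
for customer_id in sorted(integrated):
    print(customer_id, integrated[customer_id]["name"], integrated[customer_id]["source"])
```

Customer 42 appears in both sources under different key formats and spellings; after the transform step both rows share one key, so the load step can collapse them into a single record.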
Although ETL tools and EAI tools have legitimate functional overlap, they feature different technical strengths specific to the problems they solve. The key differentiating features between them are as follows.
1. The first is the time requirement. If real-time, or near-real-time, update is required, then an EAI tool will satisfy the requirement better than an ETL one.
2. The second is the volume of data to be moved. An ETL batch process will generally be more efficient at moving a large volume of data than a stream of individual messages.
3. The third is the level at which integration needs to occur. Messaging brokers tend to have a programming API, which results in system integration at the application programming level.
4. The fourth is the need for intelligent routing. ETL vendors do not currently provide intelligent, or dynamic, routing of transactions.
5. The fifth is the level of metadata support. Both EAI and ETL make use of metadata; however, EAI vendors have not offered as rich an approach to metadata as the ETL vendors.
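The five differentiators can be read as a rough checklist. The sketch below encodes them as a simple tally in Python; the `IntegrationNeed` fields, the equal weighting, and the `suggest_tool` function are illustrative assumptions rather than a selection method given in this paper.

```python
from dataclasses import dataclass


@dataclass
class IntegrationNeed:
    """Hypothetical description of one integration requirement."""
    needs_real_time: bool          # 1. time requirement
    large_batch_volume: bool       # 2. volume of data to be moved
    application_level: bool        # 3. integration at the application programming level
    needs_dynamic_routing: bool    # 4. intelligent routing of transactions
    needs_rich_metadata: bool      # 5. level of metadata support


def suggest_tool(need: IntegrationNeed) -> str:
    """Tally which tool class each differentiator favors (equal weights assumed)."""
    eai_votes = sum([need.needs_real_time, need.application_level, need.needs_dynamic_routing])
    etl_votes = sum([need.large_batch_volume, need.needs_rich_metadata])
    if eai_votes > etl_votes:
        return "EAI"
    if etl_votes > eai_votes:
        return "ETL"
    return "evaluate both"


# Two made-up requirements: a nightly bulk load and a real-time order feed.
nightly_load = IntegrationNeed(False, True, False, False, True)
order_feed = IntegrationNeed(True, False, True, True, False)
print(suggest_tool(nightly_load), suggest_tool(order_feed))
```

The tally is deliberately naive. In practice a single factor, such as a hard real-time requirement, may outweigh all the others, so the result is a starting point for discussion rather than a decision.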
B. Factors In Determining Which Technology To Use
Using EAI middleware to link disparate information sources for certain types of applications should not be viewed as a “quick fix” alternative to building and maintaining an integrated data repository. In many cases, adding new applications to installed applications introduces inefficiencies or disconnected "islands" of information.
1. Limitations of Application Integration (EAI)
Using EAI middleware to link disparate information sources can affect data consistency and/or quality, which can create the following problems:
❑ Data Redundancy. Separate databases often maintain duplicate or near-duplicate data, incurring increased storage and management costs.
❑ Process Redundancy. The middleware approach requires that data be transformed into the requesting application's format each time it is accessed.
❑ Synchronization and Ownership Issues. These problems may result from varying source system availability, as well as from the logical repositories receiving new and updated data from a broad range of sources.
❑ Data Inconsistency. Serious inconsistencies may result from the source systems employing different semantics, formats, periods of applicability and cycles of update and refreshment.
❑ Limited Repeatability and Auditability. Changes in the source systems' operational data may cause the same action to deliver different results at different times.
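The repeatability problem in the last item can be demonstrated with a short Python sketch, assuming a hypothetical in-memory `source_system` that stands in for an operational database and a hypothetical `repository` of dated snapshots. A report that reads the live source returns different answers as the operational data changes, while a report that reads a dated snapshot returns the same answer each time.

```python
import copy
from datetime import date

# Hypothetical operational data that is updated in place.
source_system = {"open_orders": [120.0, 80.0, 45.5]}

# Hypothetical repository holding dated copies of the source data.
repository: dict[date, dict] = {}


def take_snapshot(as_of: date) -> None:
    """Copy the current operational data into the repository under a date key."""
    repository[as_of] = copy.deepcopy(source_system)


def live_total() -> float:
    """Report computed directly against the changing source."""
    return sum(source_system["open_orders"])


def snapshot_total(as_of: date) -> float:
    """Report computed against a fixed snapshot, so it can be repeated and audited."""
    return sum(repository[as_of]["open_orders"])


take_snapshot(date(2001, 6, 30))
before = live_total()
source_system["open_orders"].append(300.0)   # operational data changes afterwards
after = live_total()

print(before, after)                          # the same action, two different results
print(snapshot_total(date(2001, 6, 30)))      # unaffected by the later update
```

The deep copy matters here: storing a reference to the live data instead would let later updates leak into the snapshot and defeat the purpose.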
2. Strengths of Data Integration
There are several types of data integration scenarios where building an integrated data resource may be preferred over an EAI solution:
❑ To Solve Data Integrity Issues Between Systems
EAI enables integration in a broad business context and can serve as a foundation for integrating legacy as well as new applications. However, it is unlikely that an application integration project will succeed, no matter how sophisticated the EAI tools, if the underlying systems are a mess (i.e. applications with eccentric data, unique business rules, and convoluted processes); such systems may never be usefully integrated.
Not all application integration projects involve data sources so disparate and heterogeneous that they require a data integration solution. However, when integrating portals, commerce servers or custom applications with two or more relatively complex, disparate and heterogeneous data sources, a data repository should be considered as a way of not only reducing initial effort, but also creating reusable data access components.
❑ To Transform & Cleanse Disparate Data
Most organizations are faced with managing large quantities of disparate data. This disparate data severely impacts both the organization's ability to perform its business activities