DT Data Integration Architecture
Elvis Zhang
Data Technology

The purpose of this document is to provide data standards for the use and maintenance of enterprise data. It is a white paper describing the architecture and processes required to integrate disparate data sources. It includes information on:

 - A comparison between Data Integration and Application Integration
 - Components & Processes of Data Integration
 - XML's use in Data Integration

Table of Contents

I. Introduction
   A. Differentiating Data Integration from Application Integration (EAI)
      1. Each Technology Solves Different Business Problems
      2. Each Technology Requires Different Tools
   B. Factors in Determining Which Technology to Use
      1. Limitations of Application Integration (EAI)
      2. Strengths of Data Integration
   C. Guidelines for Selection
II. “Actual” Data Integration (ODS-Based)
   A. Data Integration Using an Operational Data Store (ODS)
   B. ODS Components & Processes
      1. Components
      2. Processes
III. “Virtual” Data Integration (XML-Based)

I. Introduction

Integrating applications is simple from a communication perspective, but it can be extremely challenging from a data perspective. Multiple applications can be made to send and receive messages and transactions among each other, but if they do not share a common understanding of the context and meaning of the data involved, the result will be incomplete or inaccurate information in one or more of the applications. Therefore, application integration (EAI) should not be implemented as an isolated technology but rather as part of a broader integration strategy that evaluates which type of integration (data or application) is appropriate for which task. Even though application integration is
getting more attention today, the need for data integration is also growing rapidly, driven primarily by e-business and portal requirements. This paper explores the critical data integration issues that need to be evaluated as part of a middleware-based application integration strategy. It also offers guidelines for when migrating key elements of enterprise information to a centralized data repository is preferable to an EAI middleware solution, along with the tools and standards for doing so.

A. Differentiating Data Integration from Application Integration (EAI)

The key difference between application integration and data integration is that application integration enables real-time sharing of data between different systems and/or applications, whereas data integration requires combining data from disparate sources into a new, consolidated data resource.

Application integration, which is the creation of new strategic business solutions by reusing the functionality of existing applications, involves the use of EAI middleware to connect disparate systems and/or applications. EAI middleware enables one application or database to communicate with another application or database, but does not require any change to the existing data in the underlying databases.

Data integration involves the use of ETL (extract, transform, load) middleware to reduce data redundancy by collecting and reorganizing disparate data into one physical or logical place. Integrated data repositories are the alternative to EAI; they present a centralized, logical approach to integrating information. The physical implementation of data integration could be within a centralized enterprise data warehouse or several logical data repositories (ODS).

1. Each Technology Solves Different Business Problems

There are many requirements for EAI, from business-to-business (B2B) to internal application integration scenarios; however, different integration scenarios require different integration technologies.

Application Integration (EAI)

The business driver behind application integration (the creation of new strategic business solutions by integrating the functionality of existing enterprise applications) is the recognition that stove-piped applications, typically transaction processing (OLTP) systems, automate individual steps in a larger business process. By capturing output from one system and routing it into a receiving
application, a broader business process is automated. Three critical attributes of application integration are:

1) participating systems are tightly dependent;
2) integration is done at the application level to preserve transactional integrity; and
3) both systems are process-centric in their design.

Data Integration

The goal of data integration is to get redundant data, stored in multiple independent systems, to agree on the facts, since there is no practical way to re-engineer the applications and eliminate the redundancy. In today's environment, data consistency is important for both OLTP and DSS
architectures. When data consistency is the sole motivation for integrating previously independent applications, data integration technology, rather than EAI technology, should be used. Three critical concepts are:

1) participating systems are loosely coupled;
2) integration is done at the database level; and
3) the receiving system is data-centric in its design.

2. Each Technology Requires Different Tools

EAI tools are the most appropriate integration technology for application integration. Within EAI tools, logic and business rules are developed that transform the syntax and semantics of the sending application's data, messages or transactions into inputs that are semantically and syntactically consistent with those of the receiving systems.

ETL tools are more appropriate for data integration. To deliver quality information, data must be complete, consistent, timely and relevant. Therefore, data originating in two or more applications must be reconciled into a semantically consistent format. ETL tools integrate data from multiple systems, reconciling redundant and overlapping data into a single, consistent, integrated data structure and enabling syntactic and semantic transformations
to be applied to the data.

Although ETL tools and EAI tools have legitimate functional overlap, they feature different technical strengths specific to the problems they solve. The key differentiating features between them are as follows.

1. The first is the time requirement.
   - If real-time, or near-time, update is required, then an EAI tool will satisfy the requirement better than an ETL one.
2. The second is the volume of data to be moved.
   - An ETL batch process will always be more efficient at moving a large volume of data than individual messages.
3. The third is the level at which integration needs to occur.
   - Messaging brokers tend to have a programming API, which results in system integration at the application programming level.
4. The fourth is the need for intelligent routing.
   - ETL vendors do not currently provide intelligent, or dynamic, routing of transactions.
5. The fifth is the level of metadata support.
   - Both EAI and ETL make use of metadata; however, EAI vendors have not offered as rich an approach to metadata as the ETL vendors.

B. Factors in Determining Which Technology to Use

Using EAI middleware to link disparate information sources for certain types of applications should not be viewed as a “quick fix” alternative to building and maintaining an integrated data repository. In many cases, adding new applications to installed applications introduces inefficiencies or disconnected islands of information.

1. Limitations of Application Integration (EAI)

Using EAI
middleware to link disparate information sources can have an impact on data consistency and/or quality, which can create the following problems:

 - Data Redundancy. Separate databases often maintain duplicate or near-duplicate data, incurring increased storage and management costs.
 - Process Redundancy. The middleware approach requires that data be transformed into the requesting application's format each time it is accessed.
 - Synchronization and Ownership Issues. These problems may result from varying source system availability, as well as from the logical repositories receiving new and updated data from a broad range of sources.
 - Data Inconsistency. Serious inconsistencies may result from the source systems employing different semantics, formats, periods of applicability and cycles of update and refreshment.
 - Limited Repeatability and Auditability. Changes in the source systems' operational data may cause the same action to deliver different results at different times.

2. Strengths of Data Integration

There are several types of data integration scenarios in which building an integrated data resource may be preferred over an EAI solution:

To Solve Data Integrity Issues Between Systems

EAI enables integration in a broad business context and can serve as a foundation for integrating legacy as well as new applications. However, it is unlikely that an application integration project will succeed, no matter how sophisticated the EAI tools, if the underlying systems are a mess (i.e., applications with eccentric data, unique business rules, and convoluted processes); such systems may never be usefully integrated.

Not all application integration projects involve data sources so disparate and heterogeneous that a data integration solution is required. However, for integrating portals, commerce servers or custom applications with two or more relatively complex, disparate, heterogeneous data sources, a data repository should be considered as a way of not only reducing initial effort, but also creating leveraged, reusable data access components.

To Transform & Cleanse Disparate Data

Most organizations are faced with managing large quantities of disparate data. This disparate data severely impacts both the organization's ability to perform its business activities