Five Essential Components of a Data Integration Framework

Published in Information Management

Last month, I introduced you to the concept of a data integration framework (DIF): a combination of processes, standards, people and tools that ultimately helps establish information as a corporate asset. Using a DIF as a blueprint, you can transform data into consistent, quality, timely information for your businesspeople to use in measuring, monitoring and managing the enterprise.

A DIF is useful in many situations, but it is especially important when an organization is first developing or revitalizing its BI approach, or finds itself with competing projects. Many healthcare customers, for example, are just beginning to realize the need for a coordinated BI strategy, not just for financial information, but to reap the rewards of volumes of clinical data. Retailers find that using a DIF when they are revitalizing their BI approach gives them a clearer picture of the critical points and provides better value to their users. It's a way to see through vendor hype to what is truly going to help the business. Organizations such as financial institutions that blossomed during the boom years have found that competing data stovepipes in different subsidiaries have multiplied as well. A DIF gives them a better overall view of their customers and helps consolidate critical data where it can help them, not slow them down.

DIF Information Architecture

A good information architecture can turn scattered data into the information that your business uses to operate and plan for the future. Data is gathered, transformed using business rules and technical conversions, staged in databases and made available to business users to report and analyze. Data flows from creation through transformation to information, just as materials flow from the supplier (ERP and transaction systems) to the factory (data integration) to the warehouses (data warehouses) and finally to retail stores (data marts and business analytic applications).

The DIF information architecture comprises five parts:

  • Data sources: Data can come from numerous places, including legacy systems, ERP systems, and front-office forecasting and budgeting systems.
  • Data preparation: Gathering, reformatting, consolidating, transforming, cleansing and storing data.
  • Data franchising: The aggregation, summarization and formularization of data for use with business intelligence (BI) tools.
  • Meta data management: Meta data is defined as “data about data.” For example, a library catalog contains information (meta data) about publications (data).
  • Data management: The behind-the-scenes processes that pass data and its definitions (meta data) between the other processes.
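
To make the flow from sources through preparation to franchising concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the DIF itself: the sample records, the field names, the cleansing rule and the function names `prepare` and `franchise` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical source extracts, standing in for an ERP and a legacy system.
erp_rows = [
    {"region": " east ", "amount": "100.50"},
    {"region": "West", "amount": "200.00"},
]
legacy_rows = [
    {"region": "EAST", "amount": "49.50"},
    {"region": "west", "amount": None},  # incomplete record
]


def prepare(rows, source_name):
    """Data preparation: reformat, cleanse and consolidate raw rows."""
    prepared = []
    for row in rows:
        if row["amount"] is None:
            continue  # assumed cleansing rule: drop rows with no amount
        prepared.append({
            "region": row["region"].strip().title(),  # consistent region names
            "amount": float(row["amount"]),           # technical conversion
            "source": source_name,                    # meta data: where the row came from
        })
    return prepared


def franchise(rows):
    """Data franchising: aggregate prepared data for BI tools."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


# The "warehouse" here is just a list; the "data mart" is a summary by region.
warehouse = prepare(erp_rows, "erp") + prepare(legacy_rows, "legacy")
sales_by_region = franchise(warehouse)
print(sales_by_region)
```

In a real framework each stage would read from and write to databases, but the division of labor is the same: preparation makes the data consistent, and franchising shapes it for a specific business use.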

DIF Processes

The DIF encompasses two kinds of processes: the processes to determine your data requirements and solution, and the processes used to physically gather data from its sources and transform it into information that businesspeople can use to analyze and make decisions.

The biggest mistake most people make is to assume that an ETL tool is their silver bullet. It’s a critical part, but it’s just one part of the whole process.

DIF Standards

Standards help ensure successful construction, implementation and deployment of the DIF. There are many areas that need to be addressed:

  • Project management: Methodology, baselines, status, controls, scope, people management and team management.
  • Software development: Tools, techniques, documentation, version control and release management.
  • Technology and products: Company and project technology infrastructure and tools.
  • Architecture: The blueprint of what and how you are going to build.
  • Data: Standards help ensure that data is consistent, accurate and valid.
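
Data standards are easiest to enforce when they are written down as checks that can be run against records. The sketch below is one hypothetical way to do that in Python; the three rules, the field names and the `violations` helper are assumptions made for illustration, not standards prescribed by the DIF.

```python
import re

# Hypothetical data standards, each expressed as a named check on one record.
STANDARDS = {
    "customer_id is a positive integer":
        lambda r: isinstance(r.get("customer_id"), int) and r["customer_id"] > 0,
    "state is a two-letter uppercase code":
        lambda r: bool(re.fullmatch(r"[A-Z]{2}", r.get("state") or "")),
    "balance is not negative":
        lambda r: isinstance(r.get("balance"), (int, float)) and r["balance"] >= 0,
}


def violations(record):
    """Return the names of the standards that a record fails."""
    return [name for name, check in STANDARDS.items() if not check(record)]


good = {"customer_id": 42, "state": "MA", "balance": 10.0}
bad = {"customer_id": 0, "state": "ma", "balance": 10.0}

print(violations(good))
print(violations(bad))
```

Naming each rule in business terms means the same list can serve as documentation for businesspeople and as a test for developers.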

DIF Tools

You’ll use software tools to create, deploy and maintain the DIF. There are five tool categories:

  • Data modeling: Use to create and document the logical data model and the initial draft of the physical data model.
  • Data profiling: Use to help you understand the source system data, including its definition, condition and technical aspects (e.g., storage and format).
  • Data preparation: Use ETL tools to increase developer productivity, decrease overall development time, simplify maintenance and provide built-in documentation. You may also need tools for data cleansing to ensure data consistency and error detection.
  • Data franchising: These tools help with data aggregation, summarization and formularization. There’s some overlap with data preparation, but it’s a much simpler process.
  • Meta data management: While often the forgotten aspect of the architecture, technical meta data management can improve development and maintenance productivity. Business meta data use can improve business users’ understanding of the information offered.
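
As an illustration of what data profiling looks for, the following Python sketch summarizes the condition of each column in a small source extract. It is a simplified stand-in for a profiling tool, not a description of any particular product; the `profile` function and the sample customer rows are hypothetical.

```python
def profile(rows):
    """Summarize each column: row count, empty values, distinct values, observed types."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            stats = columns.setdefault(
                name, {"count": 0, "nulls": 0, "values": set(), "types": set()}
            )
            stats["count"] += 1
            if value is None or value == "":
                stats["nulls"] += 1  # treat None and empty strings as missing
            else:
                stats["values"].add(value)
                stats["types"].add(type(value).__name__)
    return {
        name: {
            "count": s["count"],
            "nulls": s["nulls"],
            "distinct": len(s["values"]),
            "types": sorted(s["types"]),
        }
        for name, s in columns.items()
    }


# Hypothetical source extract with typical problems: mixed types and a blank value.
customers = [
    {"customer_id": 1, "state": "MA"},
    {"customer_id": "2", "state": ""},
    {"customer_id": 3, "state": "MA"},
]

report = profile(customers)
for column, stats in report.items():
    print(column, stats)
```

Even a summary this simple surfaces the kinds of surprises that derail data preparation later, such as an identifier stored as both a number and text, or a field that is blank more often than anyone expected.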

DIF Resources and Skills

Don’t overlook the right resources and skills. You need experienced people who understand data warehousing and business intelligence. This doesn’t just mean understanding the tools and technology. It means understanding the complete business process. Too many DW/BI projects fail because they are treated like ERP or transaction processing systems, yet are being judged on their business return on investment (ROI). If the goal is ROI, the project needs to encompass a complete data integration framework, not just a piece of it.

Next month’s column will cover the DIF information architecture in depth. In the meantime, please feel free to e-mail me with your questions and feedback on these columns at rsherman@athena-solutions.com.
