In part 1 of this article we discussed the ugly duckling called an operational data store (ODS). Rather than discussing the conventional wisdom and, quite frankly, the confusion surrounding an ODS, we decided take a different approach. We followed the approach someone would take when introducing a product to market – first find out what the customer’s needs are and then shape a product to meet that need (and maybe even anticipate future needs). We discussed four application areas where an ODS is used today:
Conventional wisdom doesn’t always age like a fine wine. When it starts smelling like dirty socks, it’s time to challenge it. Sometimes the conventional wisdom published in books, written in articles and presented in seminars is accepted as gospel, even when it starts to develop a distinctive odor. Examine conventional wisdom periodically to determine if it is current and relevant. This is particularly true when it comes to an ODS. The purposes and methods used for building the ODS may have changed dramatically over the years.
Why would the reasons for building an ODS have changed? Sometimes conventional wisdom is used to justify or rationalize an ODS without critically examining if that logic still applies to your current situation. Technology doesn’t stand still; the conditions that applied years ago, when many ODSs were built, may not apply anymore.
How have the methods and technology used to create ODS systems changed? ODS systems are quite often very distinct and separate from your DW and business intelligence (BI) architectures. Because they were often built from the operational applications and created by the IT groups handling DW/BI applications, they were usually extensions of DW/BI applications, not designed along DW and BI architectural principles. The tools used for development were often those used with the ERP systems, and the database design was often similar to the operational application’s schema. With different tools, database design principles and groups developing the ODS and DW, it is no wonder they were not engineered to work in the same architecture … and became silos. However, the world has changed both with the business climate demanding transparency in data (or the government mandating it!) and technology enabling data integration and access that was not possible before.
There are several technological advances that impact the “why” and “how” of building an ODS.
First, networks, storage and servers have all significantly advanced to the point where much larger data volumes and data throughput are possible than when many ODSs were initially built. Real-time connectivity to operational data, with many pre-built access paths having been built for the most widely used ERP systems, is readily available today. You can no longer get away with stuffing data into something called an ODS simply because the DW can’t handle the volumes, frequency of updates or real-time connectivity. There are many valid reasons for an ODS (see my June column), but making up for deficiencies in your DW is not one of them.
Second, far more data-integration technologies are available today. We are no longer restricted to the extract, transform and load (ETL) technology with which data warehouses were built. Today, enterprise application integration (EAI) and enterprise information integration (EII) tools are available and viable for data integration within a DW environment. In fact, several leading ETL vendors have incorporated EAI and EII in their product offerings, either through their own products or through software partnerships. These ETL vendors have also tried to create market awareness by rebranding themselves data integration companies rather than solely ETL tool vendors. In addition, service oriented architectures (SOAs) and Web services are creating ever-expanding options to integrate data and applications across an enterprise. Whereas in the past the underlying applications had to be integrated to effectively work together, today SOAs and Web services allow these disparate applications to “talk” to each other. This encourages data exchange, further enhancing data access and integration. Consider these technologies for your data integration framework (DIF) and ODS arsenal.
Third, in ancient times (i.e., a few years ago!) you needed to have your ODS, data warehouse, data marts and OLAP cubes on separate databases. These databases needed to be self-contained, meaning that they all had to have dimensions, detailed facts and summarized tables. Some companies had consistent dimensions and facts in these databases, but many did not. Every separate database meant duplication (at best) and often inconsistency (at worst).
You can still build everything in self-contained, independent databases, but now you also have the option of sharing data between databases without physically moving that data or physically storing the data in one database. If everything is in one database or enterprise data warehouse (EDW), you still should implement logical schemas representing ODS, DW, data mart and cube data. It may be more cost-effective for you to implement all the data in one database, but logically separating the tables in that EDW still is the best practice. (We will discuss this topic in a future column.) The alternative to having one EDW is to have shared data objects across your ODS, DW, data marts and cubes. The data that should be shared is the enterprise’s conformed dimensions and the detailed transactional data you initially brought into your architecture from the operational systems. This alternative will use a federated approach to integrating the data, incorporating data access versus data storage to share data. Of course, whether you have a single EDW or databases representing ODSs, DWs, data marts and cubes, you should use data integration tools such as ETL, EAI and EAI to integrate the data.
Regardless of what physical configuration is best for you, this data should be integrated in an overall DIF architecture. Just as independent data marts resulted in data integrity and consistency problems while draining business productivity throughout an enterprise, separate databases without a DIF will cause even greater long-term problems.
Finally, as the technical barriers dissolve, the worlds of operational and management reporting are coming together. This is good news for you silo-haters. Using the same tools, you can integrate or access data from operational applications such as ERP, customer relationship management (CRM) and supply chain management (SCM), as well as from management reporting applications such as BI and DW.
BI, ETL, EAI and EII tools can often access and integrate data from both worlds. Additionally, SOAs and Web services further extend the integration across disparate applications. A great example of these worlds coming together is business performance management (BPM). Interest in BPM is being driven by business pressures to improve the bottom line along with governmental regulations. BPM implementations require both operation and management reporting to measure the performance of the company in both real-time and historical perspectives. BPM needs data from your source applications and DW environments. The business user does not care where the data is stored, only that it is available, consistent, timely and has the integrity needed to make solid business analysis and decisions. It is only through a holistic approach to data integration that this can be achieved.
An ODS does not have to be IT’s wayward offspring – kept out of sight because IT is embarrassed about how it looks and why it was built. The old days of building an ODS simply because your DW was insufficient are over. There are solid business and technical reasons for building an ODS. An ODS can help enable data consistency, integrity and currency – but only if it is part of an overall data integration framework.
We’ll explore how an ODS fits into that DIF in future columns