Are you ready for how long it takes to do data preparation?
Data preparation accounts for the lion’s share of the work in any DW or BI project – estimated at 60 to 75 percent of the project time. Project delays and cost overruns are frequently tied to underestimating the time and resources needed to complete data preparation. Even more frequently, they are tied to rework: the project initially skimps on these activities, and data consistency, accuracy and quality issues then arise.
Almost everyone – IT, vendors, consultants and industry analysts – associates data preparation solely with ETL development work. ETL tools are indispensable in a BI project because of the many benefits they provide (see chapter 5 of my Business Intelligence Guidebook for further details), but they do not significantly reduce or speed up the majority of the data preparation work. The reason is that the bulk of the time is spent on definitional work: defining the sources (getting data from source systems), profiling the data, defining the targets (putting data in DWs, ODSs and data marts), and specifying the source-to-target mappings and business transformations. This definitional work is time-consuming because it involves meeting, discussing and obtaining consensus on the definitions and transformations with the source systems’ subject matter experts and business people. As the number of systems and people grows, these activities expand disproportionately.
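To make the point concrete, here is a minimal Python sketch of a source-to-target mapping written down as data. Everything in it is hypothetical and invented for illustration: the field names (cust_nm, rev_cents, status_cd), the business rules and the FieldMapping helper do not come from any real system or ETL tool. The code that applies the mapping is a single line; the mapping table itself is what takes the meetings to agree on.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class FieldMapping:
    """One row of a source-to-target mapping document (hypothetical structure)."""
    source: str                        # column name in the source system
    target: str                        # column name in the DW, ODS or data mart
    transform: Callable[[Any], Any]    # the agreed business transformation
    rule: str                          # plain-language rule signed off by the business


# Hypothetical mapping for an invented customer record.
CUSTOMER_MAPPINGS = [
    FieldMapping("cust_nm", "customer_name",
                 lambda v: v.strip().title(), "Trim whitespace, title-case the name"),
    FieldMapping("rev_cents", "revenue_usd",
                 lambda v: int(v) / 100, "Source stores cents; target stores dollars"),
    FieldMapping("status_cd", "is_active",
                 lambda v: v == "A", "Status code 'A' means an active customer"),
]


def apply_mappings(row: dict, mappings: list) -> dict:
    """Build a target row by applying each mapping to its source field."""
    return {m.target: m.transform(row[m.source]) for m in mappings}


source_row = {"cust_nm": "  acme corp ", "rev_cents": "1250000", "status_cd": "A"}
target_row = apply_mappings(source_row, CUSTOMER_MAPPINGS)
print(target_row)
```

Each entry in the mapping list stands for a decision that someone had to define and that the source system's experts and the business had to accept, which is the part of the effort that a tool cannot shorten.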
(Learn more about data preparation and all things BI in my Business Intelligence Guidebook – From Data Integration to Analytics. Chapter 5 is on information architecture and discusses data preparation.)