Data preparation is the core set of processes for data integration that gather data from diverse source systems, transform it according to business and technical rules, and stage it for later steps in its life cycle when it becomes information used by business consumers.
Data profiling, which requires an understanding of data modeling, is usually done by data scientists or power users who use tools to get data from different sources and integrate it for data discovery.
There are no shortcuts with data preparation. Don't be lulled by the “silver bullet” a vendor or a consultant may try to sell you, saying that all you need to do is point a BI tool at the source system and your business group will have what it needs. That oversimplifies data integration into a connectivity issue alone. It’s a lot more than that.
Data preparation is not just pointing-and-clicking with a BI tool; in fact, it involves many steps. The complexity of these steps depends on how your business operates, as well as how your business systems are implemented within your enterprise. Physically accessing the data in source systems is the easy part; transforming it into information is where the hard work is.
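To make the gather, transform and stage steps concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the two sample extracts, the column names, the cleanup rules and the `stg_customer` table are hypothetical, and real rules depend on how your business and source systems work.

```python
import csv
import io
import sqlite3

# Hypothetical extracts from two source systems (sample data, assumed).
CRM_CSV = "customer_id,name,revenue\n1, Acme Corp ,1200.50\n2,globex,\n"
BILLING_CSV = "cust,amount_usd\n1,300\n3,75.25\n"


def gather(text):
    """Gather: read one CSV extract into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(crm_rows, billing_rows):
    """Transform: apply example business and technical rules.

    Assumed rules: trim and title-case names, treat missing revenue
    as 0, and total the billed amounts per customer.
    """
    billed = {}
    for row in billing_rows:
        cid = int(row["cust"])
        billed[cid] = billed.get(cid, 0.0) + float(row["amount_usd"])

    prepared = []
    for row in crm_rows:
        cid = int(row["customer_id"])
        revenue = float(row["revenue"]) if row["revenue"].strip() else 0.0
        name = row["name"].strip().title()
        prepared.append((cid, name, revenue, billed.get(cid, 0.0)))
    return prepared


def stage(rows):
    """Stage: load prepared rows into a staging table for later steps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stg_customer ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, revenue REAL, billed REAL)"
    )
    conn.executemany("INSERT INTO stg_customer VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def prepare():
    """Run the three steps end to end and return the staging connection."""
    rows = transform(gather(CRM_CSV), gather(BILLING_CSV))
    return stage(rows)
```

Even in this toy case, reading the files takes a few lines, while the transform step has to decide what a blank revenue means and what to do with a billing record (customer 3) that has no matching customer. This sketch simply drops that record. A real process would need an explicit rule for it.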
How we've helped others with their data preparation needs:
- An information analysis firm was using manual extracts between statistical packages and databases to pull data from surveys, behavioral analysis, their competition, suppliers, customers and planning activities. We automated the import and export of the data, increasing their productivity and improving the scope of their data.
- Data scientists at an insurance company were manually pulling extracts from external files and importing data into their statistical database. We automated the entire data preparation process to streamline their imports and exports.