It was a wake-up call. When I first moved from a software firm to the information technology (IT) department of a Fortune 100 company, I was in for a big surprise. The IT group’s mission was to create a massive software application – it was actually a data warehouse (DW) before the term became mainstream. I had no reason to predict that developing this application would be so different from developing a software product. As it turned out, there were radical differences in the approaches.
As I write this column, I am faced with the choice of being politically correct and glossing over the issues I encountered, or being honest and potentially implying that IT people are not as good as software engineers. The latter is not the case – there were simply fundamental differences in the way they applied standards to their projects. In hindsight, I realize that the reason for the differences was the vastly different business purpose for the code.
How was creating an application in a software company so different from an application in an IT department? First, the software product was sold and installed in many companies, but the DW application was built and installed in one company. The software product had many versions and releases that were supported simultaneously, but the DW application had one version supported in one company. A company may have DW users spread across many business units, subsidiaries and geographies, but it can still keep them all on a single production version of its application.
Secondly, the software product had documentation and training for both technical and business users. This was not the case for the DW application. The technical users were the IT group itself and the business users were often the people they saw every day.
Finally, the software firm had people dedicated to building, maintaining, supporting and selling its product. The IT people who created and maintained the application also had other IT duties. Especially in today’s economy, it is common for the IT staff to serve many masters.
The software we created at the software firm was the firm’s product – its asset. For the Fortune 100 company, the software was an application used to measure company performance; it was not what the company sold to produce revenue. With this background in mind, it is easier to understand why the software applications in these two cases were treated quite differently. The software company’s product created revenue, and the DW application created costs (disregarding ROI).
To be a viable product in the marketplace, the software product needed release management, version control, business and technical documentation, programming standards and adherence to industry standards so that it could work in diverse environments. The DW project, on the other hand, did not need those standards and procedures – or the staff felt that their users did not want to pay for them.
Why can DW applications take so long to modify and maintain? Why can they be full of nasty surprises? The short answer is that IT doesn’t adhere to standards during their creation and maintenance. Sure, you can cut corners by not following standards and still create the report that the business users want; however, the IT group will pay for it in the long run with higher IT costs and a whopping loss in responsiveness when they struggle to rediscover the application every time they touch it.
Regardless of whether you are creating a software product, a data warehouse or a business intelligence (BI) application, there are basic standards that help verify the data and validate your work. The key standards are as follows:
Use and enforce source code and version controls. You can’t fix a bug if you can’t find the correct version of the source code or reproduce the problem. Software firms use source code control because software is their asset, but IT groups can be much more lax. Lack of code control slows application development. DW applications should be managed like valuable company assets. In more than two decades in the IT industry, I’ve noticed that it is very rare for IT groups to implement and enforce code management controls.
Document your business requirements. Document all the business and technical requirements and keep them up to date as you change the application. The same documentation can be used for change management and as the basis for the business and technical documentation.
Document your application. Whether you’re hand-coding or using a tool, you need to document the code with comments and descriptions. In addition, the data fields accessed by the application need to have data definitions in both business and technical terms – BI; extract, transform and load (ETL); and database tools make it easy. Document all of your data – database schema, file layouts, reference data – along with the corresponding code.
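To make the idea of defining each data field in both business and technical terms concrete, here is a minimal sketch in Python. It is only an illustration: the FieldDefinition structure, the field names and the sample entries are all hypothetical, and in practice these definitions would more likely live in your BI, ETL or database tool than in hand-written code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FieldDefinition:
    """One data dictionary entry (hypothetical structure for illustration)."""
    name: str              # physical column name
    data_type: str         # technical definition: storage type
    source: str            # technical definition: where the value comes from
    business_meaning: str  # business definition in plain language


# Invented sample entries; not taken from any real schema.
DATA_DICTIONARY = [
    FieldDefinition(
        name="net_amt",
        data_type="DECIMAL(12,2)",
        source="orders.amount minus orders.discount",
        business_meaning="Order value after discounts, before tax.",
    ),
    FieldDefinition(
        name="cust_id",
        data_type="INTEGER",
        source="crm.customer.id",
        business_meaning="Unique identifier of the buying customer.",
    ),
]


def undocumented_fields(columns: List[str],
                        dictionary: List[FieldDefinition]) -> List[str]:
    """Return the columns that have no entry in the data dictionary."""
    documented = {entry.name for entry in dictionary}
    return [column for column in columns if column not in documented]
```

A check like undocumented_fields(["net_amt", "cust_id", "region_cd"], DATA_DICTIONARY) would report region_cd as missing a definition, which is the kind of gap that is cheap to catch during development and expensive to rediscover later.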
Develop and implement testing plans. Test code with sample data not only in the individual units, but also in the complete application. Allow business users to test the analytical and reporting functionality with samples of real data. If possible, test with full data volumes. Test the ETL functionality with actual data from each of the source systems that will feed the data warehouse or data marts. Too often, the development team is surprised when they learn at the end of the project that the source systems had dirty data. Source systems always contain dirty data, and your ETL code must be able to handle it somehow. Before the system goes into production, IT and the business need to agree on how to handle dirty data: pass it on, flag it, suspend it or delete it.
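The four options for dirty data (pass it on, flag it, suspend it or delete it) can be written down as an explicit policy that the ETL code applies, rather than a decision buried in individual scripts. The following Python sketch shows one way this might look. Everything in it is an assumption made for illustration: the DirtyDataPolicy names, the dq_flag column, the sample rows and the missing_amount rule are hypothetical, and a real ETL tool would have its own mechanism for the same idea.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class DirtyDataPolicy(Enum):
    PASS = "pass"        # load the row unchanged
    FLAG = "flag"        # load the row, but mark it as suspect
    SUSPEND = "suspend"  # hold the row back for review
    DELETE = "delete"    # drop the row, keeping only a count


@dataclass
class LoadResult:
    loaded: List[Dict] = field(default_factory=list)
    suspended: List[Dict] = field(default_factory=list)
    deleted_count: int = 0


def apply_policy(rows: List[Dict],
                 is_dirty: Callable[[Dict], bool],
                 policy: DirtyDataPolicy) -> LoadResult:
    """Route each row according to the agreed dirty-data policy."""
    result = LoadResult()
    for row in rows:
        if not is_dirty(row) or policy is DirtyDataPolicy.PASS:
            result.loaded.append(dict(row))
        elif policy is DirtyDataPolicy.FLAG:
            # dq_flag is a hypothetical column name for the suspect marker.
            result.loaded.append({**row, "dq_flag": True})
        elif policy is DirtyDataPolicy.SUSPEND:
            result.suspended.append(dict(row))
        else:  # DirtyDataPolicy.DELETE
            result.deleted_count += 1
    return result


# Invented sample data: the second row is "dirty" because amount is missing.
rows = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": None},
]


def missing_amount(row: Dict) -> bool:
    """Hypothetical data quality rule: a row without an amount is dirty."""
    return row["amount"] is None


result = apply_policy(rows, missing_amount, DirtyDataPolicy.SUSPEND)
```

With the SUSPEND policy, the first row is loaded and the second is held back in result.suspended for someone to review. The point of the sketch is less the code than the fact that the policy is a single named choice that IT and the business can agree on before production.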
Document how to use your application. Document the use and operation of your application from both the technical and business perspectives. The IT staff may not need to develop the formal user guides, installation notes and tutorials of a software product, but they should create a simple application guide. Too often, I see application developers trying to figure out which disks and directories contain their programs, scripts, log files and data. This should all be documented for easy maintenance.
Enforcing standards for building and maintaining DW/BI applications from the start delivers long-term benefits. It takes an investment to implement the standards retroactively – it’s always more expensive to retrofit your applications, but the ROI justifies it. Some practical guidelines for using these development standards include:
Standards take time and money to implement, but their ROI justifies the cost.