In my previous post on iterative BI, I talked about the need to adapt your BI systems to the storm of change that iterative development thrives on. With that storm coming, the first thing on my mind is taking control of how my teams package and build their environments.
For most of my career I’ve worked with teams whose code management and change controls varied widely in rigor. Controls for databases were especially deficient. The test databases were always in some indeterminate state following the application of various patches, the loading of various test data, and the execution of various tests. At one recent client, the QA team estimated that nearly 25% of their time went to running down issues related to missing tables, incorrectly defined columns, or out-of-date test data. This was pretty high on my recommended fix list!
So, on my team, we’ve started to take control of our BI environment by following these principles:
- Environments can be built by running a single command from the command line (assuming some environment prerequisites are in place).
- Everything is in source control, and nothing can be moved between systems except via the source control repository.
- Multiple environments can coexist on the same system and same database instance.
- Multiple developers can “attach” to an environment by pulling all the relevant files and connecting to existing databases, cubes, etc.
- Environments are “version aware” and support “upgrading” from an older version to a newer version.
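To make the "version aware" and "coexisting environments" ideas a bit more concrete, here is a minimal sketch in Python. It is not our actual tooling: it assumes SQLite as a stand-in for the real database, and the `MIGRATIONS` list, the `env_version` table, and the table names are all hypothetical. Each environment gets its own table prefix, and a small bookkeeping table records which version each environment has reached, so the same function both builds a fresh environment and upgrades an old one.

```python
import sqlite3

# Hypothetical ordered migrations: (version, SQL template).
# In practice these would be script files kept in source control.
MIGRATIONS = [
    (1, "CREATE TABLE {env}_dim_customer (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE {env}_dim_customer ADD COLUMN region TEXT"),
    (3, "CREATE TABLE {env}_fact_sales (customer_id INTEGER, amount REAL)"),
]


def current_version(conn, env):
    """Return the version recorded for `env`, or 0 if it has never been built."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS env_version "
        "(env TEXT PRIMARY KEY, version INTEGER NOT NULL)"
    )
    row = conn.execute(
        "SELECT version FROM env_version WHERE env = ?", (env,)
    ).fetchone()
    return row[0] if row else 0


def build_environment(conn, env, target=None):
    """Create or upgrade environment `env`; return the versions applied."""
    # The name is formatted into SQL, so only accept plain identifiers.
    if not env.isidentifier():
        raise ValueError(f"invalid environment name: {env!r}")
    if target is None:
        target = MIGRATIONS[-1][0]  # default to the latest version
    version = current_version(conn, env)
    applied = []
    for step, sql in MIGRATIONS:
        # Skip steps this environment already has or that exceed the target.
        if version < step <= target:
            conn.execute(sql.format(env=env))
            conn.execute(
                "INSERT OR REPLACE INTO env_version (env, version) VALUES (?, ?)",
                (env, step),
            )
            applied.append(step)
    conn.commit()
    return applied
```

Calling `build_environment(conn, "dev_alice")` on an empty database applies every step; calling it again applies nothing, and a second environment such as `"qa"` can live in the same database instance without touching the first. Wiring this function to a command-line entry point is what would give you the "single command" build.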
Following these principles means:
- Developers and testers can create a new environment quickly, easily, and reliably. This leads to fearlessness because they can try things without harming each other or the production system.
- Deployments are testable, repeatable, and reliable. Because every aspect of the environment build is scripted and deployed entirely from source control, system administrators can execute “dry runs” of a deployment on a test environment, and a deployment that succeeds there is (nearly) guaranteed to work on the production system.
- Continuous integration is possible in the BI world. Changes to ETL code or models can be integrated and tested immediately, and feedback is quick!
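As an illustration of the kind of check a continuous integration job could run after every build, here is a rough sketch that looks for the missing tables and incorrectly defined columns that cost that QA team so much time. Again, this assumes SQLite as a stand-in database, and the `EXPECTED` schema and the `schema_problems` helper are hypothetical names invented for the example.

```python
import sqlite3

# Hypothetical expected schema; in practice this would live in source control
# next to the scripts that build the environment.
EXPECTED = {
    "dim_customer": {"id": "INTEGER", "name": "TEXT", "region": "TEXT"},
    "fact_sales": {"customer_id": "INTEGER", "amount": "REAL"},
}


def schema_problems(conn, expected):
    """Compare the live schema to `expected` and return a list of problems."""
    problems = []
    for table, columns in expected.items():
        # PRAGMA statements do not take bound parameters, so table names
        # must come from trusted configuration, never from user input.
        rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        actual = {row[1]: row[2].upper() for row in rows}
        if not actual:
            problems.append(f"missing table: {table}")
            continue
        for column, col_type in columns.items():
            if column not in actual:
                problems.append(f"missing column: {table}.{column}")
            elif actual[column] != col_type:
                problems.append(
                    f"wrong type: {table}.{column} is {actual[column]}, "
                    f"expected {col_type}"
                )
    return problems
```

A CI job could fail the build whenever `schema_problems(conn, EXPECTED)` returns a non-empty list, so a broken environment is reported minutes after the change that broke it rather than discovered by a tester days later.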
To realize these benefits, teams must embrace and deploy:
- Source Control
- Build Automation
- Continuous Integration
I’ll follow up with some specific pointers on the gotchas when taking control of your BI environment.