Iterative BI – What’s the Difference?

Recently I was in a conversation where a PM declared “Agile’s just waterfall really fast – we can do that no problem!”  Uh oh.

Like (most) everything, delivery methodologies are subject to fashion and trend, and Agile/Scrum/Kanban and the like are en vogue.  Collectively, I’ll refer to these highly cyclic methodologies as “iterative” or (little a) agile development.  My interest being BI, I’ll take a little time to discuss how these iterative delivery methods impact your BI delivery processes.

Generally, iterative development does a number of things to your teams.  When operating effectively, it (among other things):

  • Brings your users much closer to the development process.
  • Multiplies the number of builds/deployments you do by a factor of LOTS (probably 10-20).
  • Multiplies the number of tests (esp. regressions) required.
  • Makes juggling project tasks more complex by putting many more “balls” in the air.
  • Eliminates the formality (and safety) of predefined scope and quality gates.

A successful move to iterative development means planning for each of these, often in ways radically different from your current processes.  If your plan is just “do it more often/faster”, look out!  For instance, the cost of regression testing for quarterly releases is manageable using manually scripted test cases.  QA has weeks to execute the tests, report defects, and retest.  On a two-week cycle, this is either impossible or at least impossibly expensive.  So, you either a) stop/severely limit regression testing or b) automate.
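To make option b) a little more concrete, here is a minimal sketch of what an automated regression check against a warehouse table might look like.  Everything in it is an assumption for illustration: the `fact_sales` table, its columns, the expected values, and the `run_regression_checks` helper are hypothetical, and an in-memory SQLite database stands in for whatever DBMS you actually run.

```python
import sqlite3


def run_regression_checks(conn, checks):
    """Run each (name, sql, expected) check; return a list of failures.

    Each check is a query that returns a single scalar value, which is
    compared against a known-good expected value.
    """
    failures = []
    for name, sql, expected in checks:
        actual = conn.execute(sql).fetchone()[0]
        if actual != expected:
            failures.append((name, expected, actual))
    return failures


# Hypothetical warehouse table, built in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_sales (order_id INTEGER, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 10, 100.0), (2, 11, 250.0), (3, 10, 50.0)],
)

# Hypothetical checks: a row count, a control total, and a data quality rule.
checks = [
    ("row_count", "SELECT COUNT(*) FROM fact_sales", 3),
    ("total_amount", "SELECT SUM(amount) FROM fact_sales", 400.0),
    (
        "no_null_customers",
        "SELECT COUNT(*) FROM fact_sales WHERE customer_id IS NULL",
        0,
    ),
]

failures = run_regression_checks(conn, checks)
print(failures)  # an empty list means every check passed
```

The point of the design is that the checks are data, not hand-executed scripts: once they exist, running them every two weeks (or every build) costs about the same as running them once a quarter.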

In BI, iterative development gets tricky if you try to bite off the whole BI sandwich at once.  It helps to step back and realize that an end-to-end BI process actually delivers at least 3 complete, testable, user-acceptable components:

  • The Information Model – Some combination of business requirements, business process modeling (BPM), conceptual, logical, and physical data models, metadata, business rules, data quality rules, etc.  On its own, the information model describes the information-driven processes of the organization.  While the primary (or only) purpose of these artifacts may initially be BI, there is defensible value here to a much wider audience.
  • Data Integration – Getting data into the warehouse, ODS, or other data repository.  Includes data analysis and profiling, source-to-target mapping, ETL development, and, more and more often, message-based real-time data integration development.
  • BI Delivery – Getting information to its business consumers including report/scorecard/dashboard design, development, and delivery.

More complex environments may add Master Data Management (MDM), Metadata Management, operational integration, or others to this list.  The point is, the path from requirements to report needn’t be viewed as a single “project” so much as a set of interdependent projects.  This frees iterative teams to split the work into relatively independent “sprints” or cycles that can be scheduled and managed as such.

In practice, the overall delivery begins to look like a “cascade” of activities segmented by the above components and executed by relatively independent teams.  This looks very different from a waterfall Gantt chart.

Technically, relevant topics include:

  • How to design for continual change in the data model, ETL environment, and BI environment?
  • How to provision the many independent development and test environments needed to support these teams?
  • How to automate testing across the wide variety of technologies deployed in a BI stack (DBMS, ETL, BI delivery, metadata, etc.) including generating appropriate test data to load into a continually changing model?
  • How to package releases in such a way that loads and operations, issues, defects, and new requirements can be effectively traced to a particular release level of the environment?
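As a taste of the last question, one possible approach (a sketch, not a recommendation) is to stamp every load with the release level of the environment that ran it, so that a defect found later can be traced back.  The `etl_load_audit` table, the `record_load` and `loads_for_release` helpers, and the version labels below are all hypothetical, and SQLite again stands in for a real warehouse.

```python
import sqlite3
from datetime import datetime, timezone

RELEASE_VERSION = "2024.03.1"  # hypothetical release label for this deployment


def record_load(conn, table_name, rows_loaded, release=RELEASE_VERSION):
    """Write one audit row tying a load to the release that executed it."""
    conn.execute(
        "INSERT INTO etl_load_audit "
        "(table_name, rows_loaded, release_version, loaded_at) "
        "VALUES (?, ?, ?, ?)",
        (table_name, rows_loaded, release, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def loads_for_release(conn, release):
    """Return (table_name, rows_loaded) for every load run under a release."""
    cur = conn.execute(
        "SELECT table_name, rows_loaded FROM etl_load_audit "
        "WHERE release_version = ? ORDER BY load_id",
        (release,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE etl_load_audit (
        load_id         INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name      TEXT NOT NULL,
        rows_loaded     INTEGER NOT NULL,
        release_version TEXT NOT NULL,
        loaded_at       TEXT NOT NULL
    )
    """
)

record_load(conn, "dim_customer", 120)
record_load(conn, "fact_sales", 4500)
record_load(conn, "fact_sales", 4700, release="2024.03.2")

print(loads_for_release(conn, "2024.03.1"))
```

With something like this in place, “which release loaded this data?” becomes a query rather than an archaeology project.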

I realize I’m only raising questions at this point.  The first step is realizing you have a problem!  Going forward I’ll attempt to answer some (hopefully most!) of these questions by including the unique challenges of iterative development in my discussions of BI tools and technologies as well as development processes.

Chris Grenz
