Measuring Performance of Delivery Teams – ‘starter’ metrics

Recently I was asked about ‘starter metrics’ for projects (both multi-shore and single-shore) looking to transition to a much more objective measure of delivery team performance.

Here are the first tier metrics that I would recommend as a good starting point. There is a lot more detail in the webinar and associated white-paper on (March 2009) (March 2010)

The framework defined in the above divides metrics into three inter-dependent categories:

Predictability – A measure of how close estimates come to actuals with regard to both delivery costs and deadlines. A key variable in measuring predictability is lead time: specifically, what is the measured level of predictability at various points in the project lifecycle? Predictability should increase as quickly as possible as lead time shortens.

Quality – A collection of measures that confirm the overall integrity of the delivered code is tracking to a baselined level of production acceptance, from both an operational support perspective and a business user perspective.

Productivity – Measurements that assess the amount of work completed as a function of cost. These metrics can be used to compare two different teams (such as a pure onshore vs a multi-sourced team) to assess the efficiency of delivery using either approach – assuming predictability and quality are the same.

Of the three categories, Productivity requires the most rigorous measurement, both to assess accurately and to compare different delivery teams with one another. However, there are some thoughts below on how to get started.

For PREDICTABILITY, I would start with ensuring there are multiple measurement points in the delivery lifecycle. At a minimum, you should capture the following:

  • Budgetary Estimates (at the use case level, or better yet at the feature level) – made prior to any detailed requirements or decomposition (bottom-up, task-based estimation). This estimate is usually used for budgetary purposes and precedes final project prioritization / ordering in the project portfolio.
  • Development Estimates – these are the bottom-up / decomposition estimates that development produces once requirements are fairly complete (waterfall) or at the start of each 2-3 week iteration (iteration planning). They are done at a task level (functional and engineering tasks) and are then rolled up to compare to Budgetary Estimates at a use case level. Framework costs should be spread across the use cases in proportion to the weight of each use case (the relative size of its budgetary estimate).
  • Completion Actuals – these are the final actuals captured (hopefully by task, but at the very least at the use case level).
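
As a rough illustration of spreading framework costs by use case weight, here is a minimal Python sketch. The function name `spread_framework_cost`, the use case names, and the hour figures are all hypothetical; the only idea taken from the text above is that each use case receives a share proportional to its budgetary estimate.

```python
def spread_framework_cost(budgetary_estimates, framework_hours):
    """Allocate shared framework hours in proportion to each budgetary estimate.

    budgetary_estimates: dict of use case name -> budgetary estimate (hours).
    framework_hours: total framework effort to distribute (hours).
    """
    total = sum(budgetary_estimates.values())
    if total <= 0:
        raise ValueError("budgetary estimates must sum to a positive number")
    return {
        use_case: framework_hours * hours / total
        for use_case, hours in budgetary_estimates.items()
    }


# Hypothetical figures, in hours.
budgetary = {"login": 100, "search": 300, "checkout": 600}
allocation = spread_framework_cost(budgetary, framework_hours=200)
for use_case, hours in allocation.items():
    print(f"{use_case}: {hours:.1f} framework hours")
```

The allocated hours would then be added to each use case's rolled-up Development Estimate before comparing it to the Budgetary Estimate.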

You can then compare the variances between these three tap points during a project / release level retrospective. During that retrospective, the variances should be explained in terms of accepted change requests and missed dependencies as well as ‘white-space’ issues that arose during the project (those things that were not anticipated such as defects in vendor libraries or a key team member not being fully available to the project).
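
To make the comparison concrete, the sketch below computes percentage variances between the three tap points for each use case. It assumes all three figures are captured in the same unit (hours here); the `UseCaseEffort` record, the field names, and the sample numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class UseCaseEffort:
    name: str
    budgetary: float    # Budgetary Estimate (hours)
    development: float  # rolled-up Development Estimate (hours)
    actual: float       # Completion Actuals (hours)


def pct_variance(baseline, later):
    """Percentage by which the later figure differs from the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (later - baseline) / baseline


def variance_report(use_cases):
    """Variance between each pair of tap points, per use case."""
    return [
        {
            "use_case": uc.name,
            "budgetary_vs_development": pct_variance(uc.budgetary, uc.development),
            "development_vs_actual": pct_variance(uc.development, uc.actual),
            "budgetary_vs_actual": pct_variance(uc.budgetary, uc.actual),
        }
        for uc in use_cases
    ]


# Hypothetical use cases for a retrospective.
report = variance_report([
    UseCaseEffort("search", budgetary=100, development=120, actual=150),
    UseCaseEffort("checkout", budgetary=200, development=180, actual=198),
])
for row in report:
    print(row)
```

A positive number means the later figure came in higher than the earlier one. The retrospective discussion then attaches an explanation (change request, missed dependency, white-space issue) to each large variance.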

For QUALITY, I would look simply at the number of defects (at each severity level) in delivered code, over time, normalized to the total project weight (Completion Actuals). Project actuals can be used as a crude indicator of project complexity and weight of development. Units for this statistic could be ‘Defects per 1000 development hours’ (or whatever works to normalize across multiple projects). This alone will give you tremendous insight into delivered quality. Notice too that code that has been ‘short-cutted’ with regard to maintainability / scalability considerations will drive a higher ratio of defects to project actuals in subsequent releases.
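
A minimal sketch of the ‘Defects per 1000 development hours’ statistic, broken out by severity, might look like the following. The severity labels and the sample figures are assumptions for illustration; any consistent severity scheme would work the same way.

```python
from collections import Counter


def defects_per_1000_hours(defect_severities, actual_hours):
    """Defect density per severity level, normalized to 1000 development hours.

    defect_severities: one severity label per defect found in delivered code.
    actual_hours: Completion Actuals for the project or release (hours).
    """
    if actual_hours <= 0:
        raise ValueError("actual_hours must be positive")
    counts = Counter(defect_severities)
    return {
        severity: 1000.0 * count / actual_hours
        for severity, count in counts.items()
    }


# Hypothetical release: five defects logged against 2500 actual hours.
severities = ["high", "low", "low", "medium", "low"]
density = defects_per_1000_hours(severities, actual_hours=2500)
for severity, value in sorted(density.items()):
    print(f"{severity}: {value:.2f} defects per 1000 hours")
```

Tracking the same figure release over release is what exposes the effect of short-cuts on maintainability.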

Finally, for PRODUCTIVITY, you may have to do some additional analysis since you won’t have normalized requirements (see the explanation in the whitepaper for how to normalize requirements). What you could do here is to pick particular use cases from each team that result in similar task breakdowns (for example, you may focus on use cases that require integrations, database / web service access, ETL or front-end development). The measure of productivity will be at the task level, taken from Development Estimates that had small variances to Completion Actuals, or from the actuals themselves if they were tied back to tasks (such as within an Agile iteration plan). You can then compare and contrast, for example, the variances in similar technical tasks (accounting for complexity in both tasks). Granted, some conversation will need to occur to normalize the tasks, but at least you’ll be comparing like-for-like development activities against measured actuals (rather than anecdotal statements from one developer claiming that a task would only take them ½ the time). You also want to make sure to account for any differences in delivered quality.
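
One hedged way to sketch this comparison is to group task-level actuals by team and task category and compare the averages. The code below assumes the tasks have already been judged comparable in complexity (the ‘conversation’ mentioned above) and that quality is similar; the team names, categories, and hours are invented for illustration, and the ratio is a crude indicator rather than a definitive productivity measure.

```python
from collections import defaultdict
from statistics import mean


def mean_actuals_by_category(tasks):
    """Average actual hours per (team, category) pair.

    tasks: iterable of (team, category, actual_hours) tuples.
    """
    grouped = defaultdict(list)
    for team, category, actual_hours in tasks:
        grouped[(team, category)].append(actual_hours)
    return {key: mean(hours) for key, hours in grouped.items()}


def compare_teams(tasks, team_a, team_b):
    """Ratio of team_b's mean actuals to team_a's, per shared category.

    A ratio above 1.0 means team_b spent more hours on comparable tasks.
    Categories that only one team worked on are skipped.
    """
    means = mean_actuals_by_category(tasks)
    categories = sorted({category for (_, category) in means})
    ratios = {}
    for category in categories:
        a = means.get((team_a, category))
        b = means.get((team_b, category))
        if a is not None and b is not None:
            ratios[category] = b / a
    return ratios


# Hypothetical task-level actuals, in hours.
tasks = [
    ("onshore", "etl", 40), ("onshore", "etl", 60),
    ("multi-sourced", "etl", 45), ("multi-sourced", "etl", 55),
    ("onshore", "web_service", 20),
    ("multi-sourced", "web_service", 30),
    ("multi-sourced", "front_end", 25),
]
ratios = compare_teams(tasks, "onshore", "multi-sourced")
print(ratios)
```

Because cost rates usually differ between teams, the hour ratios would still need to be multiplied by each team's rate to express productivity as a function of cost.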

The above are obviously just a start, but they would go a long way to starting down a road of more rigorous project delivery metrics without a lot of time investment or changes to existing artifacts or process. There are next levels of sophistication described in the white-paper.


Kevin Sheen, Vice President, Global Delivery

Kevin is responsible for Perficient's Global Delivery strategy and execution with teams distributed across the globe in the US, India, China and Mexico. With a background rooted in software development, he has been an Agile evangelist and practitioner for more than 20 years and has been advocating Agile as a way to make global teams successful since Perficient launched its first global delivery center more than 13 years ago. Scrum Certifications: CSP, CSM, CSPO
