Sunday, August 15, 2010

The Pragmatic Capability Model Of Software Delivery (PCM-OSD)

The “Agile” vs “Formal” delivery discussion is well underway – with many interesting posts on the subject (such as The Agile Flywheel).
I was interested in this discussion because I needed to answer the question:
“How good are we at delivering software products?”
The reason for asking the question was a “platform” that we had created, a platform in the sense that other teams were expected to build applications on top of it. If we were expecting other teams to build on top of it we needed to ensure that it was “good quality”, that it would provide a firm foundation for their work. Also as it was a platform with many teams using a shared infrastructure, applications built on top of the platform had to be of “good quality” as well, otherwise they would degrade the platform for others.

Of course it’s hard to quantitatively define “good quality”; we can all recognise a good product when we see one, but it’s harder to quantify all the attributes and processes that go into making a good product. It’s even harder to definitively show that a particular product is good enough against the nebulous label of “good quality”.

Unfortunately that’s exactly what we needed to do.

Creating a model

The approach taken was to create a model that captured the definition of “good quality”, which turns out to be a measure of how well a particular team delivers software products.

There are many metrics to measure individual activities, but what was needed was a holistic cross-functional measure – something that could cover the following areas:
  • Development: how we work with code
  • Building: how we turn code into deployable artefacts
  • Deploying: how we deploy into environments
  • Testing: how do we ensure that the deployment is good
  • Reporting: how we drive transparency (for quality)
  • Support: how we manage incidents and requests
  • Product Management: how we understand & communicate the product needs
  • Architecture: how we manage our technical direction
  • Operations: how we manage the run-time environments

Ultimately the model could have been created using different functional areas (for example, we did not include Operations in this version), and the expectations are just our opinion. The model presented here is easy to change and should be adaptable by others, for example by tweaking it to match local concerns.

The model is a mash-up of Agile & Formal product delivery views:
  • Agile: how to do software development; Scrum focuses within that on project management and handling requirements, while DevOps looks at the deployment and infrastructure aspects.
  • Formal: CMMI (Capability Maturity Model Integration) and ITIL (Information Technology Infrastructure Library) look at process improvement and IT Management.

Our model takes the “Maturity” concept from the formal stream, and a “Pragmatic Capability” from the agile stream. We did not want a model that was about a team’s ability to improve (e.g. continuous improvement); we wanted a snapshot of the capability of the team in each functional area. It was also heavily informed by the work we were doing with ThoughtWorks on testing and CI, and by the highly recommended anthillpro whitepaper from urban{code}.

Combining these concepts and streams together gives us the “Pragmatic Capability Model Of Software Delivery” (PCM-OSD).

The implementation

The model was implemented in Excel and consists of a spreadsheet with three tabs:
Model Definition & Product Preview
This first worksheet, “Model”, defines the model (in terms of functional areas and attributes) and lets you select a particular project and see the maturity results for each attribute in the functional areas. The Project Rating ranges from “F” to “A++”; this stops us trying to present precise percentages, which would probably cause more arguments (an idea taken from ThoughtWorks).

Assessment data collection
The second worksheet “Projects” is used to collect the data from the maturity assessment. Each project has a column against which the data is collected; the list of projects at the top drives the drop-down on the “Model” worksheet.
The calculation is very simple: each attribute is given a score of 0–5, and the % maturity for a functional area is calculated by adding up all the attribute scores and taking that as a percentage of the maximum score possible for that area. Note that this calculation is therefore very dependent on how many attributes are in each area, and on how you group attributes into functional areas. Finally these percentages are mapped to a rating (F to A++).
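As a sketch, the worksheet’s arithmetic looks like this in Python (the % thresholds for the letter ratings are illustrative assumptions – the workbook defines its own banding):

```python
MAX_SCORE = 5  # each attribute is scored 0-5

def area_maturity(scores):
    """Percentage maturity for one functional area."""
    return 100.0 * sum(scores) / (len(scores) * MAX_SCORE)

def rating(percent):
    """Map a % maturity onto the coarse F..A++ scale.
    The band boundaries here are assumptions for illustration."""
    bands = [(95, "A++"), (90, "A+"), (80, "A"), (70, "B"),
             (60, "C"), (50, "D"), (40, "E")]
    for threshold, label in bands:
        if percent >= threshold:
            return label
    return "F"

dev_scores = [3, 4, 2, 3, 4, 3, 3]  # seven Development attributes, illustrative
print(rating(area_maturity(dev_scores)))  # -> C (22/35 = 62.9%)
```

Because the denominator is the area’s maximum possible score, an area with few attributes swings much more per point than one with many – the grouping sensitivity noted above.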
Reporting – benchmark projects
The final worksheet “Reports” is used to visualize the results. We used the Radar chart (a type of polar chart) to visualize the data.
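The plotting itself was done in Excel, but as a sketch, the geometry behind a radar chart is simple: one evenly spaced spoke per functional area, with the first point repeated so the plotted polygon closes (area names and percentages below are illustrative, not real assessment data):

```python
import math

# Illustrative per-area maturity percentages (not real assessment data).
areas = ["Development", "Building", "Deploying", "Testing",
         "Reporting", "Support", "Product Management", "Architecture"]
maturity = [63, 70, 55, 60, 48, 52, 66, 58]

# One spoke per area, evenly spaced around the circle; repeat the first
# point so the polygon closes when plotted on polar axes.
angles = [2 * math.pi * i / len(areas) for i in range(len(areas))]
angles.append(angles[0])
values = maturity + maturity[:1]
```

The same `angles`/`values` lists could feed any polar plotting library if you wanted the charts outside Excel.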

Applying the model

In order to use it for our projects we created two benchmarks:

  • A “baseline” project, that assumes the lowest level of maturity that is acceptable
  • An “Agile Platform” project, that assumes good practice, but not to the “insane level” (see anthillpro’s whitepaper)
By creating these two benchmarks we can show our projects relatively by overlaying them:

Visualization of Maturity: Our platform before any improvement work
This first diagram shows how our platform looked before we tried to improve it – the red peeking through shows areas of concern that need addressing.
Visualization of Maturity: the platform after targeted improvement
And this diagram shows how our project looked after we improved our practices: no more red shows, so we have reached a base level of maturity and made substantial steps towards becoming an “agile platform”.
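As a rough sketch of the overlay logic (all percentages here are illustrative, not our real assessment data), flagging the areas where a project falls below the baseline gives exactly the “red peeking through”:

```python
# Two benchmarks plus a project, as per-area maturity percentages.
# All numbers are made up for illustration.
baseline = {"Development": 50, "Building": 40, "Deploying": 40, "Testing": 50}
agile_platform = {"Development": 80, "Building": 85, "Deploying": 80, "Testing": 85}
project = {"Development": 63, "Building": 35, "Deploying": 55, "Testing": 45}

# Areas where the project sits below the acceptable baseline.
concerns = [area for area, pct in project.items() if pct < baseline[area]]

# How far each area still has to go to reach the "Agile Platform" target.
gap_to_target = {area: agile_platform[area] - pct for area, pct in project.items()}

print(concerns)  # -> ['Building', 'Testing']
```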

Assessment process

We carried out this assessment on 10 projects, and the diagrams produced coincided with our intuition of the teams’ maturity. The assessments were carried out by interviewing the technical leads on the projects (about one hour each), and the information can be used to identify future training needs and capital expenditure.
During the assessment process the focus was on ensuring the teams understood the purpose of the process, and on our preference for giving an attribute a lower score rather than overestimating the maturity of a team. For attributes that were not applicable to a particular team (for example internal-facing-only teams) the approach was to score the maturity as 5 rather than penalise the team for non-applicability. For the teams, the main benefit was the opportunity to identify training, mentoring or improved ways of working. In the future a quarterly review might allow teams to see where they should target their continuous improvement efforts.
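The not-applicable rule can be sketched as follows (scores are illustrative):

```python
MAX_SCORE = 5

def attribute_score(raw):
    """raw is 0-5, or None when the attribute is not applicable.
    N/A scores the full 5 rather than penalising the team."""
    return MAX_SCORE if raw is None else raw

# e.g. an internal-facing team assessed on four attributes, one of them N/A
scores = [attribute_score(s) for s in [3, None, 4, 2]]
percent = 100.0 * sum(scores) / (len(scores) * MAX_SCORE)
print(percent)  # -> 70.0
```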

Our opinionated model

The model is based on our views of what’s important, and the maturity levels are defined by those views. Applying the model unchanged is debatable, so it’s worth understanding our reasoning behind the attributes, so they can be changed to match your own views.

The model is documented in the same structure as the Excel workbook, in terms of areas with attributes, each of which has a description/benefit statement (what benefit you see from increasing this attribute):
  • Development: how we work with code
    • Check-in frequency: as you move to a CI process, the frequency of check-in drives how quickly problems are found.
    • Behaviour: how responsibly developers act about ensuring the code is good; follows the broken-window theory
    • Int. Documentation: what kind of documentation is available internally – not quantity, more quality. Preferably automated and easy to maintain.
    • Ext. Documentation: what kind of documentation is available externally – not quantity, more quality
    • Feedback loop: how long before a developer gets feedback on the quality of the implementation of a user story. This reflects the efficiency of tools, process and project; for example, having too many dependencies may make build times too long. The feedback must have value, so meaningful tests and live-like environments.
    • Code Management: coping with the need for multiple developers to work on the same code at the same time, using source control effectively allows parallelism and root cause tracking.
    • Source control: how do we use source control, making source control the start of a quality process rather than just a repository.
  • Building: how we turn code into deployable artefacts
    • Build: how easy is it to create a new build, making this quick and painless reduces friction to introducing CI and other automated tools
    • Binaries: how to manage the output of a build, is it versioned and tracked – and do we test the same artefact through all environments up to production
    • Configuration: often ignored it can turn a well designed system into an un-deployable mess, ultimately should be automated and by convention over config as much as possible
  • Deploying: how we deploy into environments
    • Deployment: how we make a release live, making this painless and automated reduces friction to testing and rapid evolution
    • Resource: what skills are needed to deploy, having special skills constraints becomes a blocker, ultimately it should be the final output of CI
    • Database: repositories and databases need to be deployable as easily as the application
    • Environment: having well defined standard environments with automation allows for creation of test environments and additional live capacity automatically
  • Testing: how do we ensure that the deployment is good
    • Code Quality: how do we ensure that code is good, pair programming and automated tools help, and toxicity reports can drive technical debt reduction and help in adjusting estimates for changing code in toxic areas.
    • Manual test: how is manual testing carried out, having some targeted smoke/sanity tests can quickly discover defects, user confirmations for stories allows dev to tests themselves.
    • Unit test: how do we do unit testing; using TDD improves the design, and increased coverage improves the ability to detect regressions earlier. Whilst there is no defined lower limit for coverage, 60% seems a reasonable target for most teams, and it should be higher for critical systems (and the platform).
    • Acceptance test: how are acceptance tests carried out, these normally start out manual, but moving to automation allows cheaper tests cycles, and frees up testers to do targeted manual tests
    • Performance: often left until after a release, when performance issues cause a loss in revenue or increased costs; doing it earlier is cheaper than fixing afterwards and can be ROI-based.
    • Security: testing the security of the application can be automated to detect the main problem areas (OWASP top 10), and ultimately to avoid creating weaknesses in the first place.
    • Automation: moving from highly skilled resources running low-value manual processes to automating as much as possible, targeting manual work where it leverages skill sets most
  • Reporting: how we drive transparency (for quality)
    • Visibility: increasing transparency drives the team to higher quality, and allows cross project comparison & learning
    • Traceability: being able to track why changes occur, from end-user feature through to code, config and environment changes, allows quicker fixing and trend detection
    • Integration: how well integrated is information sharing within the project team, is there one information repository or unconnected datasets (bugs not related to builds for example)
    • Defects: how defects are tracked and related to changes in the system enables better investigation and reporting
  • Support: how we manage incidents and requests
    • Support process definition: do we have a defined process for managing an incident, and is it reviewed for effectiveness?
    • Support structure: what resource is allocated to providing support, having designated support reduces interrupting people unnecessarily & increases velocity.
    • Triage: how do we decide the priority to assign to problems, and the resolution time. Ensuring there is agreement on priorities across the whole team (especially with the business) ensures focus on what matters.
    • Self-service support: how do external teams/users get support for themselves (forums/wiki), can the community improve the knowledge base
    • Support SLA: do we have publicised goals, do we have processes for managing exceptions and ensuring we monitor compliance
  • Product Management: how we understand & communicate the product needs
    • Vision: understanding the direction of the product and aligning with technology changes
    • Requirements: gathering new features and change requests, communicating stories in an actionable and testable way, prioritising by value. Better quality stories reduce confusion and increase velocity.
    • Documentation: gather and report in a consistent way, understand all the documents other roles create (dev & arch esp.)
    • Communications: communicating to internal & external teams and customers. Is everyone aware of the product vision and changes, etc.?
    • Co-Ordination: how does product management integrate with dev, architecture and project management office – is it aligned, reducing confusion and arguments between the different functions.
    • Delivery: moving from ad hoc delivery to repeatable delivery with selection of stories into releases rather than holding up releases for stories
  • Architecture: how we manage our technical direction
    • Documentation (A): ensuring we have light-weight documentation and diagrams that aid continual product evolution
    • Technical Debt management: keeping track of decisions that reduce velocity or options for future delivery, proactive planning to remove over time
    • Technical Risk management: keeping track of technical risks to the project in the existing solution, understand and communicate the impact of risks and costs to fix them.
    • Vision (A): understanding the technology vision for the product, aligning it with the product road map
  • Operations: how we manage the run-time environments
    • this area was excluded from our model as we are still working out the details of the attributes that would best capture capability levels
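For anyone re-implementing the model outside Excel, the structure is just areas mapped to attributes. A fragment as plain data (attribute names taken from the list above; only three areas shown):

```python
# A fragment of the model as plain data: functional areas mapped to their
# attributes. Names follow the list above; the remaining areas would be
# added the same way.
model = {
    "Development": ["Check-in frequency", "Behaviour", "Int. Documentation",
                    "Ext. Documentation", "Feedback loop", "Code Management",
                    "Source control"],
    "Building": ["Build", "Binaries", "Configuration"],
    "Deploying": ["Deployment", "Resource", "Database", "Environment"],
}

# Maximum possible score per area, given each attribute scores 0-5.
max_scores = {area: 5 * len(attrs) for area, attrs in model.items()}
print(max_scores)
```

This makes the grouping sensitivity explicit: Development’s maximum is 35 while Building’s is only 15, so one point moves Building’s percentage more than twice as far.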
Next steps

The next phase is to complete the assessments for all the teams, and see what proactive action we can take to improve each team. Reports that are worth considering:
  • tracking the change in a team over time,
  • understanding the gaps in external supplier teams,
  • and looking at a set of teams in aggregate to find repeating patterns.
One of the unsurprising findings from the process is that it is a lot easier to introduce best practice at the start of a project, and this model allows the make-up and processes of a team to be tuned before starting. It would be interesting to see what others think of this approach, where they think the weaknesses are, and, if they try to apply it, any issues they hit or changes they made.

The latest version of the Pragmatic Capability Model Of Software Delivery spreadsheet uses generated data (RANDBETWEEN function), just replace this with your values.
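A minimal sketch of generating equivalent placeholder data in Python (the attribute count of 40 matches the scored areas listed above):

```python
import random

# The published workbook fills the assessment columns with RANDBETWEEN(0, 5);
# this generates equivalent placeholder data, to be replaced with real scores.
def sample_assessment(num_attributes, seed=None):
    rng = random.Random(seed)
    return [rng.randint(0, 5) for _ in range(num_attributes)]

scores = sample_assessment(40)  # 40 attributes across the eight scored areas
```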
