Chapter 7 - Doneness Model
Determining whether the product is “done” is the ultimate goal of software acceptance. Knowing how close to done we are is a key function of the project management discipline. This section introduces a model for determining how complete a product is relative to its goals from both functional and non-functional perspectives.
A key part of making the acceptance decision revolves around deciding whether the product is “done”. Coming to consensus on “doneness” requires agreement on what is being assessed – a question of target granularity – and what it is being assessed against – the doneness criteria. In its most basic form, the acceptance process defines how we decide whether a particular release of functionality is done. In its more advanced forms, we can determine doneness at the level of individual features or work items. Each of these different targets has a set of doneness criteria that must be agreed upon. The following are some example questions to consider:
- Is a software-intensive system (for example, a software product) ready for an alpha test with a friendly user community?
- Is an individual feature or user story ready for acceptance testing by a business tester?
- Is a software-intensive system ready for the design close milestone?
The definition of "done" is different for each of these examples.
Release Criteria – Doneness of Entire Product
When deciding the acceptance of an entire software-intensive system for release to users, doneness is a binary state. Either the product is accepted by the Product Owner and ripe for release to its intended audience or for the usage decision, or it’s not. If it’s not, then we can talk about how far the product is from the state of being done for the underlying decision. The same principles apply to the readiness decision that precedes the acceptance decision: the one about whether to release the product to the Product Owner for the acceptance decision. Thus there are two main criteria for determining whether a product is done:
- Are there enough high-value, Product Owner-defined features included to make the release worthwhile?
- Is the quality of the feature implementations high enough for the features to be usable in their expected usage context?
The first criterion, known in this guide as Minimum Marketable Functionality (MMF), is typically determined while planning the release. The criterion may be revisited as the project is being executed and more is learned about the system context (such as business requirements) and the technical capabilities of the Product Development Team.
When reviewing acceptance test results for each feature, it is fairly simple to determine the percentage of features that are done. This is the number of features that the Product Owner has decided pass their critical acceptance tests divided by the total number of features for the release. It is also possible to express this ratio in story points to account for features of different sizes.
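The two ratios described above can be sketched as follows. This is a minimal illustration, not part of the guide's method: the `Feature` record, the feature names, and the story-point values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str            # hypothetical feature name
    story_points: int    # relative size estimate
    accepted: bool       # True once the Product Owner decides it passes its critical acceptance tests


def percent_done_by_count(features):
    """Accepted features divided by all features planned for the release."""
    if not features:
        return 0.0
    accepted = sum(1 for f in features if f.accepted)
    return 100.0 * accepted / len(features)


def percent_done_by_points(features):
    """Same ratio, weighted by story points to account for feature size."""
    total = sum(f.story_points for f in features)
    if total == 0:
        return 0.0
    accepted = sum(f.story_points for f in features if f.accepted)
    return 100.0 * accepted / total


# Hypothetical release content.
release = [
    Feature("login", 3, True),
    Feature("search", 8, True),
    Feature("checkout", 13, False),
    Feature("reports", 8, False),
]

print(percent_done_by_count(release))   # 50.0
print(percent_done_by_points(release))  # 34.375
```

In this made-up release, half of the features are accepted, but they account for only about a third of the estimated size, which is why the two measures can tell different stories.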
The second criterion, known in this guidance as Minimum Quality Requirement (MQR), is what we constantly test against while building and assessing the software. To be able to say whether a feature has met the MQR, we need to have the acceptance tests defined for that feature; in this guidance, this is the per-feature definition of "What done looks like." Figure 1 illustrates graphically the degree of doneness using these two criteria, with each chart representing the doneness on one of two separate dates in the same project.
Figure 1 Doneness relative to MMF and MQR at two points in time
In Figure 1, the chart on the left shows the completeness of each feature at point x in time; the chart on the right shows the completeness two weeks later. Each column represents a feature, with the width of each column indicating the estimated effort to build the feature. The line labeled RAT (Ready for Acceptance) marks the level at which the feature is deemed ready for acceptance testing by virtue of having conducted the readiness assessment. It is the per-feature equivalent of the readiness decision that is made at the system level. The space between the RAT line and the line labeled MQR is where acceptance testing takes place. The line labeled MMF is the demarcation between the features that must be present (left of the line) and those that are optional (right of the line; omitted in these diagrams) for this release.
We can see progress over time by comparing the left and right charts in Figure 1. Features 1 through 5 (numbering from the left starting from 1) were already completed at the point in time represented by the left chart. Features 5 through 7 were already started at time x and completed (deemed “done”) in the period ending at time x + 2 weeks. Features 8 through 10 were already in progress at time x and were not completed at time x + 2 weeks. Features 11 and 12 were started after the date depicted by the left chart and not finished at the time depicted by the right chart.
The product is deemed acceptable when all features pass all their acceptance tests. This is the upper-right corner of the chart where the lines labeled MQR and MMF intersect. When the rectangle to the lower-left of this point is entirely colored in, the product is accepted. To simplify the discussion, the non-functional requirements are deliberately ignored, but each set of non-functional tests could be treated as another "feature bar" from the perspective of measuring doneness.
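The acceptance rule above (every MMF feature passes all of its acceptance tests) can be expressed as a small check. This is a sketch under assumptions: the data layout, feature names, and test names are hypothetical, and treating a feature with no acceptance tests as "not done" is a choice made here, not something the guide prescribes.

```python
def feature_meets_mqr(test_results):
    """test_results maps acceptance-test name -> True (passed) / False (failed).

    Assumption: a feature with no acceptance tests defined cannot be
    shown to meet the MQR, so it counts as not done.
    """
    return len(test_results) > 0 and all(test_results.values())


def features_not_done(mmf_features):
    """Names of MMF features that still have failing or missing tests."""
    return sorted(
        name for name, results in mmf_features.items()
        if not feature_meets_mqr(results)
    )


def release_is_acceptable(mmf_features):
    """True only when the whole MMF/MQR rectangle is 'colored in'."""
    return not features_not_done(mmf_features)


# Hypothetical MMF content with per-feature acceptance-test results.
mmf = {
    "login": {"valid password": True, "lockout after 3 failures": True},
    "search": {"exact match": True, "no results message": False},
    "checkout": {},
}

print(release_is_acceptable(mmf))  # False
print(features_not_done(mmf))      # ['checkout', 'search']
```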
Good Enough Software
It is very easy to say “It all has to be there!” when asked what functionality must be present to satisfy the MMF. Likewise, it is easy to reply “There can be no bugs!” when asked what the MQR needs to be. But these answers are taking the easy way out. The product owner has to make a conscious decision about the value of quality and features. The dogmatic answer (“all the features and absolutely no bugs”) may delay delivery of the product by months or even years. Meanwhile, a competitor’s product may be earning money and market share because that competitor had a more realistic goal for MMF and MQR. The choice of MMF and MQR should be a conscious business decision made in full knowledge of the cost of increased quality and functionality in terms of actual costs (additional development, testing) and opportunity costs (delayed income and other products).
For additional treatment of the topic of Good Enough Software, see [BachOnGoodEnoughSoftware] and [YourdonOnGoodEnoughSoftware].
Defining What “Done” Means
For each chunk of functionality we have decided to deliver (a "feature"), we need to define the minimum quality requirement (MQR) in the form of a set of acceptance tests that must pass before the Product Owner will accept the feature. The set of acceptance tests for a release is merely the aggregate of the acceptance tests for all the features ("functional tests") plus the acceptance tests for each of the non-functional requirements (the "non-functional tests") that are deemed mandatory.
"Readiness" is when the Product Development Team believes the product is "done enough" to ask the Product Owner to consider accepting the product. This implies that the Product Development Team has a reasonably accurate understanding of how the Product Owner will conduct the acceptance testing. (In some cases, the Product Development Team's "readiness tests" may be much more stringent than the acceptance tests the Product Owner will run.) This understanding is known as the "acceptance criteria" and is usually captured in the form of acceptance tests. Ideally, the acceptance tests are jointly developed and agreed upon between the Product Development Team and the Product Owner before the software is built to avoid playing "battleship" or "blind man's bluff" and the consequent rework when the Product Development Team guesses wrong. It is also ideal for readiness tests and test results to be supplied to the Product Owner by the Product Development Team. This can assist in auditing the readiness testing, streamlining acceptance testing, and performing a more informed style of testing, such as exploratory acceptance testing. This kind of testing can be thought of as “black hat” testing, referring to one of the six viewpoints for thinking about a problem described by Edward de Bono [DeBonoOnSixThinkingHats]. A “black hat” approach looks at a problem from a cautious and protective perspective, concentrating on what can go wrong.
Doneness of Individual Features
For an individual feature we can describe the degree to which it is done in several ways. One is how far along it is in the software development lifecycle. For example, “The design is done and coding is 80% finished. We expect the feature to be ready for testing in two weeks as part of the testing phase.” This form of progress reporting is often used in sequential projects because the design and coding can take many weeks or even months. Unfortunately, it is not very useful because of its subjectivity, low transparency, and coarse granularity. In projects that schedule and clump work at too coarse a level, coding may remain 80% finished for many weeks or months even if progress is being made. It can therefore be very difficult to tell when progress is not being made.
Once the coding and debugging is finished, it is more useful to define doneness in terms of which test cases are passing and which are not. This alternative way of measuring progress is objective, transparent, and fine grained. In the sequential approach, the percentage of test cases passing stays at zero for most of the life cycle and then rapidly rises to somewhere near 100%. In iterative and incremental projects, features are sometimes implemented one test case at a time thereby allowing the percentage of test cases passing to be used as an objective progress metric at the feature level instead of at the product level.
The number of test cases for a particular feature may change over time as the feature is understood better. This may also be a sign of work creep (more effort to implement due to more complexity) or scope creep (additional functionality that wasn’t originally intended). The additional tests may result from increased clarity caused by attempting to write tests, or they may come as a result of feedback from other forms of testing, including usability testing, exploratory testing, or acceptance testing of an Alpha or Beta release. Regardless of the source of the additional tests, the percent done may take a step backwards if the number of tests increases by more than the number of additional tests passing.
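The step backwards described above is easy to see with a few numbers. The sketch below is illustrative only; the weekly snapshots (tests passing, total tests) are invented.

```python
def percent_tests_passing(passing, total):
    """Feature-level percent done: passing acceptance tests / all acceptance tests."""
    if total == 0:
        return 0.0
    return 100.0 * passing / total


# Hypothetical snapshots for one feature: (label, tests passing, total tests).
# Between week 2 and week 3, six tests are added but only one more passes.
snapshots = [
    ("week 1", 6, 10),
    ("week 2", 8, 10),
    ("week 3", 9, 16),
]

history = [percent_tests_passing(p, t) for _, p, t in snapshots]
print(history)  # [60.0, 80.0, 56.25]
```

More tests pass in week 3 than in week 2, yet the percentage drops from 80% to 56.25% because the denominator grew faster than the numerator.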
Doneness of Non-functional Attributes
Non-functional attributes of the system are a bit different from individual features in that they tend to apply across all features; this is why some people prefer to call them para-functional. Non-functional attributes, when quantified and objectively measurable, may be binary in that the product either satisfies the requirement or not. An example is scalability: either the product can support a million transactions per second or not. We might, however, be able to break down the functionality of the system into functional areas and determine the usability of each area independently. For example, we may require a high degree of usability for customer self-service functions while administrative functions can have a much lower ease of use. Other non-functional requirements are quantifiable. For example, how many users can we support before response time exceeds our response time threshold? It could be 1 user, 10 users, or 1000 users.
We can choose whether we want to visualize progress on non-functional completeness as either additional features (support 10 users, support 100 users, support 1000 users) as illustrated in Figure Z or as a single feature (support many simultaneous users) with a minimum quality requirement (e.g. 1000 users) as illustrated in Figure Y. Our choice may be influenced by how we organize the work to develop the functionality. Agile projects often opt for the feature per threshold approach because support for additional users can be rolled out over a series of iterations. Sequential projects may prefer to treat them as different levels of quality for a single feature as illustrated in Figure Y or as a different level of quality across all features as illustrated in Figure X. (See the discussion of varying MMF for Alpha/Beta releases in section The Acceptance Process for Alpha & Beta Releases in Chapter 1 for a counter argument.)
Figure Z Incremental Non-functional Features
In Figure Z the requirement for supporting multiple users is divided into two “features”. The feature to support at least 10 users is represented by column 6 (with highlighted border) and is ready for acceptance testing while the feature to support 100 users, represented by column 14 (with highlighted border), has not yet been started.
Figure Y Single Non-functionality Feature
In Figure Y the requirement for supporting multiple users is treated as a single “feature” represented by column 10. It currently supports at least 10 users but is not ready for acceptance testing at the full requirement of 100 users.
Figure X Big Bang Non-functionality
In Figure X verification of the requirement for supporting multiple users is treated as an additional level of quality labeled NF across all features. It has not yet been started.
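The difference between the "feature per threshold" view and the "single feature with a minimum quality requirement" view can be sketched with one measured number. Everything here is hypothetical: the measured capacity (the largest user count at which response time stayed within its limit) is assumed to come from a load test that is not shown.

```python
def thresholds_met(measured_capacity, thresholds):
    """Feature-per-threshold view: each user-count threshold is its own feature."""
    return {t: measured_capacity >= t for t in sorted(thresholds)}


def single_feature_met(measured_capacity, required_capacity):
    """Single-feature view: one feature, done only at the full requirement."""
    return measured_capacity >= required_capacity


# Hypothetical load-test result: response time stays acceptable up to 40 users.
measured = 40

print(thresholds_met(measured, [10, 100, 1000]))  # {10: True, 100: False, 1000: False}
print(single_feature_met(measured, 1000))         # False
```

With the same measurement, the first view reports one of three features as done, while the second reports a single feature that is not yet done.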
Communicating "Percent Doneness"
It can be said you are either "done" or "not done (yet)." But in practice, it is important to be able to clearly communicate "how close to done" you are. More specifically, it is important to be able to communicate "what remains to do before we can say we are 'done'". This is the amount of work left for each feature that has not yet passed all its acceptance tests summed over all the features that are part of the MMF. When looking at either diagram in Figure 1, we can ask "What percentage of the rectangle below/left of MQR/MMF is colored in?" This gives us a sense of how much work is remaining. A lot of white implies a lot of work; very little white means we are nearly done. Where the white is located tells us what kind of work is left to do. White across the top means we have lots of partially finished features; white primarily on the right implies that entire features haven’t been built yet but most features are either not started or fully finished. These patterns of progression depend primarily on the project management methodology we are using. The diagram in Figure 2 shows snapshots of completeness for three different project styles:
Figure 2
In Figure 2, the first row of charts represents a classic sequential, phased, or document-driven style of project management, with white across the top indicating that all features are at similar stages of completion. The bottom row represents a classic Extreme Programming project, with white primarily on the right indicating that there are very few features partially done since most are either done or not started. The middle row represents a project using an incremental style of development with longer feature cycles than the Extreme Programming project. Notice the difference in how the coloring in of the chart advances toward the upper-right corner where the project would be considered fully done based on 100% of MMF and 100% of MQR. The next few sections delve more deeply into how we calculate and communicate doneness on these different styles of projects.
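The "percentage of the rectangle colored in" and the location of the remaining white can be approximated numerically. In this sketch each feature is a hypothetical pair of (estimated effort, fraction done), corresponding to a column's width and colored height; the two sample projects are invented to show the same overall percentage arising from different patterns.

```python
def colored_fraction(features):
    """Effort-weighted completeness: colored area / total area of the MMF rectangle."""
    total = sum(effort for effort, _ in features)
    done = sum(effort * fraction for effort, fraction in features)
    return done / total


def remaining_work_split(features):
    """Remaining effort in partially finished features vs. features not started."""
    partial = sum(e * (1 - f) for e, f in features if 0 < f < 1)
    not_started = sum(e for e, f in features if f == 0)
    return partial, not_started


# Hypothetical snapshots: (estimated effort, fraction done) per feature.
sequential_style = [(4, 0.5), (2, 0.5), (6, 0.5), (4, 0.5)]  # white across the top
agile_style = [(4, 1.0), (4, 1.0), (2, 0.0), (6, 0.0)]       # white on the right

print(colored_fraction(sequential_style), remaining_work_split(sequential_style))
print(colored_fraction(agile_style), remaining_work_split(agile_style))
```

Both snapshots are 50% colored in, but in the first all remaining effort sits in partially finished features, while in the second it sits entirely in features that have not been started.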
Communicating Percent Done on Agile Projects
An agile project can simply divide the number of features that are accepted by the product owner as “done” by the total number of features scheduled for the release. This provides the percentage of features that are done. We can express this measure in monetary terms by weighting each feature’s percent completeness by the feature’s estimated cost. The estimated cost is represented by the width of each feature column in the progress charts of Figure 2. However, the monetary equivalent of a feature calculated this way is not the same as the feature’s business value. It’s essentially an expression of the feature’s completeness in cost terms. The monetary equivalent is what is referred to as earned value in Earned Value Management (EVM). See the sidebar titled Earned Value Management and Agile Projects.
Sidebar: Earned Value Management and Agile Projects
Earned Value Management (EVM) is a budget and schedule tracking technique that originated in U.S. government programs in the 1960s and is described in standards published by the Project Management Institute. Earned value expresses the progress of a work unit as the portion of its allocated budget that corresponds to the work completed. If a work unit is 80% complete and was allocated a $100 budget, then its current earned value is $80. The total value earned by the project is the sum of the earned values of all the work units scheduled and budgeted for the project. When the project is complete, its total earned value equals the project’s total budget. Therefore, contrary to what the name suggests, earned value is essentially an effort tracking metric.
In EVM, actual costs are also tracked. If at any point in the project the actual accumulated cost exceeds the earned value of the project at that time, the project is considered over budget, with the difference representing the budget overrun (in EVM terms, cost variance). If the accumulated earned value at that point in time is below the estimated total expenditures according to the planned budget, then the project is considered behind schedule, with the difference representing the monetary expression of the project’s schedule shortfall (in EVM terms, schedule variance) at that point. Schedule variance can also be calculated in calendar time.
Figure W illustrates schedule variance (in both monetary and temporal terms) and cost variance for a project with an estimated total budget of $5 million and an estimated schedule of 1 year. In July 2010, at an earned value of $2.5 million, the project has a total earned value that is only 50% of its planned total cost, but it should have accumulated 75% of it ($3.75 million) according to the project plan. Therefore it is behind schedule by $1.25 million, or 50% of its earned value, in monetary terms. Since the project should have accumulated that much earned value back in April 2010, in calendar-time terms it is 3 months behind.
The project is also 120% over budget in July 2010 because it has accumulated an actual cost of $5.5 million at that date compared to the $2.5 million it has earned.
Figure W Calculation of schedule and cost variance in EVM
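The arithmetic behind this example can be reproduced from the inputs stated in the sidebar: a $5 million budget, 75% of it planned to be earned by July 2010, $2.5 million actually earned, and $5.5 million actually spent. The sketch below uses the usual EVM sign convention (earned value minus planned value, earned value minus actual cost). Expressing both variances as a percentage of earned value mirrors how the sidebar arrives at its 120% figure; it is an assumption of this sketch rather than the only convention in use.

```python
def evm_variances(budget, planned_fraction, earned_value, actual_cost):
    """Schedule and cost variance for one reporting date (amounts in $ millions)."""
    planned_value = budget * planned_fraction
    schedule_variance = earned_value - planned_value  # negative: behind schedule
    cost_variance = earned_value - actual_cost        # negative: over budget
    return {
        "planned_value": planned_value,
        "schedule_variance": schedule_variance,
        "cost_variance": cost_variance,
        # Assumption: percentages are taken relative to earned value.
        "schedule_variance_pct_of_ev": 100 * schedule_variance / earned_value,
        "cost_variance_pct_of_ev": 100 * cost_variance / earned_value,
    }


july_2010 = evm_variances(
    budget=5.0, planned_fraction=0.75, earned_value=2.5, actual_cost=5.5
)
for name, value in july_2010.items():
    print(f"{name}: {value}")
```

For these inputs the planned value is $3.75 million, the schedule variance is -$1.25 million, and the cost variance is -$3.0 million, which is 120% of the $2.5 million earned.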
An approach to porting EVM to an agile context is discussed by Griffiths and Cabri [CabriGriffithsOnAgileEVM]. Griffiths and Cabri advocate applying EVM on an incremental basis (iteration by iteration or release by release). In an agile project, a grand plan for the project is not meaningful for progress tracking purposes since the grand plan would be subject to constant change during the course of the project. However, mini-plans in terms of work scheduled in the current scope (features or stories) and estimated costs (resources allocated to the scope) are usually available and reliable enough to serve as a reference point on a per-iteration or per-release basis. Budget and schedule overruns can then be calculated within each increment, whether for a single iteration or release, in the same way as in EVM. Effort or a proxy for effort such as story/feature points can be substituted for cost.
In agile projects, customer or business value is more important than earned value. Therefore earned value in itself may not be very meaningful. The customer-value or business-value equivalent of earned value can also be easily tracked provided that each feature has a value or utility assigned by the product owner. The business value or customer value can be expressed in absolute, monetary terms (in currency) or in relative, utility terms (in artificial utils). The business value is an estimate based on the product owner’s assessment. Therefore it is subjective, but it can still be informed and backed up by serious economic analysis. Actual business value is still often hard to measure, especially at the feature level. Unlike in plain earned value calculation, only delivered features should earn their respective business value: each feature earns its full estimated business value once delivered.
However, if a feature is only useful in the field when delivered together with another feature, it shouldn’t earn business value until all the features on which it depends are also delivered. Percent weightings to adjust for a feature’s status of completeness don’t make much sense for tracking business value. Consequently, individual features don’t earn partial business value. Partial or percentage business value earned is only meaningful at the project level.
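The rule described above (a feature earns its full business value only once it and every feature it depends on have been delivered) can be sketched as follows. The feature names, values in artificial utils, and dependency lists are hypothetical, and only direct dependencies are checked in this simplified version.

```python
def earned_business_value(features):
    """Sum the value of delivered features whose dependencies are all delivered.

    features maps name -> {"value": number, "delivered": bool, "depends_on": [names]}.
    No partial credit is given for partially complete features.
    """
    earned = 0
    for feature in features.values():
        deps_delivered = all(features[d]["delivered"] for d in feature["depends_on"])
        if feature["delivered"] and deps_delivered:
            earned += feature["value"]
    return earned


# Hypothetical backlog with product-owner-assigned values in utils.
backlog = {
    "search": {"value": 50, "delivered": True, "depends_on": []},
    "checkout": {"value": 100, "delivered": True, "depends_on": ["payment"]},
    "payment": {"value": 30, "delivered": False, "depends_on": []},
}

total_value = sum(f["value"] for f in backlog.values())
print(earned_business_value(backlog), "of", total_value)  # 50 of 180

# Once payment is delivered, checkout becomes useful in the field as well.
backlog["payment"]["delivered"] = True
print(earned_business_value(backlog), "of", total_value)  # 180 of 180
```

The percentage of business value earned is then meaningful only at the project level, as the ratio of the two printed numbers.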
The diagram in Figure 3 illustrates snapshots of how "done" each feature is at various points in time. Each mini-chart represents a point in time. The height of the colored-in portion of each feature bar represents the degree to which that feature is done. A simple way to calculate this is to divide the number of acceptance tests passing by the total number of acceptance tests for that feature. Note that the number of tests could increase over time as the product owner’s understanding of the feature improves.
Figure 3 Doneness charts for an agile project
Note how agile projects focus both on making the features small enough that they can be done in a cycle and on not taking on too many features at the same time, thereby minimizing the length of time that a particular feature is in development. (The goal is to complete each feature in the same iteration it was started in or, at worst, the very next iteration.) This allows the Product Owner to do incremental acceptance testing as each feature is delivered. Any bugs found can be scheduled for fixing at the appropriate time (which may be right away or in subsequent iterations).
Figure 4 illustrates a "burn-down chart" based on plotting the number of features left to be "done" against time.
Figure 4
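The simplicity of the burn-down chart helps the reader quickly grasp the progress of the release. However, it may mask some other dynamics that take place on the project. For example, when features are added to or removed from the scope of the release, the chart cannot distinguish the change in scope from a change in the rate of progress.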
For such situations, Mike Cohn introduced an alternative burn-down chart (Figure 4B), in which progress, as usual, is shown above the baseline, and changes in scope are reflected below the baseline. [CohnOnBurndownChartAlternative]
Figure 4B Alternative burn-down chart (image to be provided)
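The difference between the two charts can be sketched with a few numbers. This is a simplified reading of the description above, not a reproduction of the cited article: completed features lower the top of each bar, added scope pushes the bottom of the bar below the baseline, and the iteration data is invented.

```python
def alternative_burn_down(initial_scope, iterations):
    """Return (top, bottom) bar positions after each iteration.

    iterations is a list of (features_completed, features_added) pairs.
    The baseline is 0; added scope moves the bottom of the bar below it.
    """
    top, bottom = initial_scope, 0
    bars = [(top, bottom)]
    for completed, added in iterations:
        top -= completed   # progress is shown above the baseline
        bottom -= added    # scope growth is shown below the baseline
        bars.append((top, bottom))
    return bars


def simple_burn_down(bars):
    """Features left to be done, as a plain burn-down chart would plot them."""
    return [top - bottom for top, bottom in bars]


# Hypothetical release: 20 features planned, 4 completed per iteration,
# with 3 and then 4 features added in the second and third iterations.
bars = alternative_burn_down(20, [(4, 0), (4, 3), (4, 4)])

print(bars)                    # [(20, 0), (16, 0), (12, -3), (8, -7)]
print(simple_burn_down(bars))  # [20, 16, 15, 15]
```

The plain burn-down series flattens out at 15 even though four features are completed in every iteration; the two-sided bars show that the team's pace is steady and that scope growth accounts for the flat line.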
Instead of having 100 percent of the features 50 percent done at the halfway point of the project, agile projects strive to have 50 percent of the features 100 percent done. This gives the Product Owner options if specification and development of the functionality take longer than expected, which is not uncommon. The Product Owner can decide whether to adjust (reduce) the product scope to deliver on time or to adjust (delay) the delivery date to include all functionality. It also means that the work of readiness assessment and acceptance testing is spread out more or less evenly across the project. This is discussed in Chapter 17 – Planning Acceptance.
Figure 5 illustrates charts for a less agile project.
Figure 5 Doneness charts for a less agile project
On this example project, most features take several iterations to complete, and acceptance testing starts only after all features are deemed ready. Because deficiencies (such as missed requirements) are found very late in the project, there is less time to fix them, so they need to be fixed much more quickly or they tend to stay unfixed for this release, with workarounds, restrictions, and special cases.
Communicating Percent Done on Sequential Projects
Sequential (a.k.a. waterfall) projects have more of a challenge because the phases/milestones synchronize development in such a way as to ensure that all functionality is available for testing at roughly the same time. This prevents using "percent of functionality accepted" as a meaningful predictive measure of progress. Instead, sequential projects usually ask someone to declare what percentage of each feature is done. For example, the developer may say they are 80 percent done coding and debugging (though this number is often stuck at 80 for many weeks in a row!). Considering the subjective nature of estimation techniques, sequential projects often choose to use techniques such as "earned value" to determine a "degree of doneness" metric. Unfortunately, these techniques are prone to error and fudging, and they are both difficult and time-consuming to produce and maintain.
Figure 6 contains a typical sequence of doneness charts for this approach.
Figure 6 Doneness charts for a sequential project
In this sequential version of the charts, we can see how phased/sequential development encourages us to work in parallel on many features because each feature is synchronized by gating mechanisms such as the milestones Requirements Frozen, Design Complete, and Coding Complete. This means that all the features are available for acceptance testing at roughly the same time, and acceptance testing must be finished in a very short period of time because it is on the critical path of the project. This has implications for the staffing levels required for the readiness assessment and acceptance testing roles. When development is late, the period for readiness assessment and acceptance testing is further shortened and the resources further stressed. It also increases the impact of finding bugs during testing because the fixes are on the critical path to delivery.
Figure 7 illustrates a "burn-down chart" based on plotting the number of features left to be "done" against time.
Figure 7 Burn-down chart for a sequential project
Sidebar: What it takes to do Incremental Acceptance
In projects that routinely apply Incremental Acceptance Testing, the Product Owner (or the onsite customer proxy) is shown the working feature as soon as it is deemed ready by the developers, often just a few days into a development iteration. Projects that do not employ this practice, in particular those employing a sequential process, encounter many obstacles when they try to overlap testing with development. The problem is more severe when the overlap is introduced to recover the schedule when development is running late. Why the difference?
There are a number of factors at play here, each of which may contribute to the situation. The problems include difficulty communicating what is testable and what is not; this results in frustration amongst testers when their testing turns up many bugs to which the developer responds “that feature isn’t done yet; you should not be testing it yet.” Cross-functional teams advocated by agile processes address the communication problem. The product owner, requirements analysts, usability experts, developers, and testers all touch base daily. They participate in the same iteration planning meetings and end-of-iteration reviews. This means that there is much better awareness amongst testers, whether they be readiness assessors or acceptance testers, of what can be tested and what should be left for later.
A disciplined source code management approach also helps. Applying test-driven development practices and using a continuous integration server that rebuilds the entire code base after every check-in are practices that support disciplined source code management. In an environment using these practices, the build process involves running all the automated unit tests. Any test failures are treated as a “stop the line” event where the top priority becomes getting the build working again. This ensures a stable code base at all times; the team simply won’t tolerate members dumping half-finished code into the build.
Another form of discipline is the decomposition of large features into smaller features and even smaller tasks which can be completed in under a day. This allows working, tested code to be checked in several times a day, avoiding the overhead of branching and merging. Controlling the number of features in progress is yet another measure that supports incremental acceptance testing. This practice, borrowed from the world of lean manufacturing and lean product design, minimizes the amount of multi-tasking people need to do, thereby improving focus, flow and productivity. A very useful side effect is that the length of time from deciding on what is required to when the functionality is ready to be acceptance tested is kept to a minimum; often just a few days. This reduces the likelihood of the requirements changing while development is in progress or the Product Owner forgetting what it was they meant by that requirement.
Highly incremental acceptance testing implies that the product will be tested many times. The cost of testing would rise with every iteration as more functionality is added. Agile teams avoid this by having extensive automated regression test suites (at least unit tests but usually functional tests as well) that they run before declaring that the code is ready for testing. These regression tests act as a large “change detector” which informs the team whenever the behaviour of the product changes unexpectedly. For example, it may be an unexpected side effect of a feature being introduced. Acceptance testers are typically involved in preparation of the automated functional tests so they know what is already tested and focus on areas not covered by the automation such as user interface behaviour.
A team that is encountering difficulty in doing incremental acceptance testing may be exhibiting symptoms of missing one or more of these practices. This is especially prevalent on teams new to agile and teams that have to deal with large legacy code bases.
It may seem like a simple concept to determine if a feature or product is done, but in practice it can prove to be somewhat slippery. We must know how to determine if we are done, what criteria to use to establish whether we are done, and how to test those criteria. This involves determining whether the proper features or functionality are present, whether they work as expected or required, and whether they meet the expectations of the Product Owner. Having a process in place will increase our chance of successfully delivering a product we are happy with.
This finishes our introduction to the models we can use as tools for thinking about acceptance. The next few chapters describe perspectives on acceptance from a number of different players.
[BachOnGoodEnoughSoftware] Bach, J. The Challenge of “Good Enough Software”. http://www.satisfice.com/articles/gooden2.pdf
[YourdonOnGoodEnoughSoftware] Yourdon, E. "When Good Enough Software Is Best," IEEE Software, vol. 12, no. 3, pp. 79-81, May 1995.
[CabriGriffithsOnAgileEVM] Cabri, A. and Griffiths, M. “Earned value and agile reporting”, Proc. of the Agile 2006 Conference, 23-28 July 2006, pp. 17-22.
[DeBonoOnSixThinkingHats] de Bono, E. Six Thinking Hats, Back Bay Books, 1999.
[CohnOnBurndownChartAlternative] Cohn, M. “An Alternative Release Burndown Chart”, http://www.mountaingoatsoftware.com/alt-releaseburndown, Oct 19, 2009.