Estimation

CL1 (Qualified)
Scope Concept

Estimates, Targets, and Commitments

Overestimate vs Underestimate

Decomposition and Recomposition

Analogy-based estimation

Story-based estimation

CL2 (Competent)
The Cone of Uncertainty

Software development consists of making literally thousands of decisions about all the feature-related issues described in the previous section. Uncertainty in a software estimate results from uncertainty in how the decisions will be resolved. As you make a greater percentage of those decisions, you reduce the estimation uncertainty.

As a result of this process of resolving decisions, researchers have found that project estimates are subject to predictable amounts of uncertainty at various stages. The Cone of Uncertainty in Figure 4-1 shows how estimates become more accurate as a project progresses. (The following discussion initially describes a sequential development approach for ease of explanation. The end of this section will explain how to apply the concepts to iterative projects.)

The horizontal axis contains common project milestones, such as Initial Concept, Approved Product Definition, Requirements Complete, and so on. Because of its origins, this terminology sounds somewhat product-oriented. "Product Definition" just refers to the agreed-upon vision for the software, or the software concept, and applies equally to Web services, internal business systems, and most other kinds of software projects.



Figure 4-1: The Cone of Uncertainty based on common project milestones.

The vertical axis contains the degree of error that has been found in estimates created by skilled estimators at various points in the project. The estimates could be for how much a particular feature set will cost and how much effort will be required to deliver that feature set, or they could be for how many features can be delivered for a particular amount of effort or schedule. This book uses the generic term scope to refer to project size in effort, cost, features, or some combination thereof.

As you can see from the graph, estimates created very early in the project are subject to a high degree of error. Estimates created at Initial Concept time can be inaccurate by a factor of 4x on the high side or 4x on the low side (also expressed as 0.25x, which is just 1 divided by 4). The total range from high estimate to low estimate is 4x divided by 0.25x, or 16x!
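This arithmetic can be sketched as a small lookup, assuming a hypothetical 100-staff-month point estimate. Only the Initial Concept factors (0.25x to 4x) are stated above; the factors for later milestones are commonly cited values and should be treated as assumptions:

```python
# Error factors (low multiplier, high multiplier) at each milestone.
# Initial Concept values come from the text; later milestone values
# are commonly quoted figures, included here as assumptions.
CONE_FACTORS = {
    "Initial Concept": (0.25, 4.0),
    "Approved Product Definition": (0.50, 2.0),
    "Requirements Complete": (0.67, 1.50),
    "User Interface Design Complete": (0.80, 1.25),
    "Detailed Design Complete": (0.90, 1.10),
}

def estimate_range(point_estimate, milestone):
    """Return the (low, high) range the Cone implies at a given milestone."""
    low_factor, high_factor = CONE_FACTORS[milestone]
    return point_estimate * low_factor, point_estimate * high_factor

low, high = estimate_range(100, "Initial Concept")  # 100 staff months, invented
print(low, high)  # 25.0 400.0 -- a 16x total spread from low to high
```

Note that the 16x spread comes from dividing the high factor by the low factor, not from either factor alone.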

One question that managers and customers ask is, "If I give you another week to work on your estimate, can you refine it so that it contains less uncertainty?" That's a reasonable request, but unfortunately it's not possible to deliver on that request. Research by Luiz Laranjeira suggests that the accuracy of the software estimate depends on the level of refinement of the software's definition (Laranjeira 1990). The more refined the definition, the more accurate the estimate. The reason the estimate contains variability is that the software project itself contains variability. The only way to reduce the variability in the estimate is to reduce the variability in the project.

One misleading implication of this common depiction of the Cone of Uncertainty is that it looks like the Cone takes forever to narrow—as if you can't have very good estimation accuracy until you're nearly done with the project. Fortunately, that impression is created because the milestones on the horizontal axis are equally spaced, and we naturally assume that the horizontal axis is calendar time.

In reality, the milestones listed tend to be front-loaded in the project's schedule. When the Cone is redrawn on a calendar-time basis, it looks like Figure 4-2.



Figure 4-2: The Cone of Uncertainty based on calendar time.

The Cone narrows much more quickly than would appear from the previous depiction in Figure 4-1. As you can see from this version of the Cone, estimation accuracy improves rapidly for the first 30% of the project, improving from ±4x to ±1.25x.

Can You Beat the Cone?

An important—and difficult—concept is that the Cone of Uncertainty represents the best-case accuracy that is possible to have in software estimates at different points in a project. The Cone represents the error in estimates created by skilled estimators. It's easily possible to do worse. It isn't possible to be more accurate; it's only possible to be more lucky.

Source of Estimation Errors

Diseconomies of Scale
 * 1) Chaotic Development Processes (Requirements that weren't investigated very well in the first place, Lack of end-user involvement in requirements validation, Poor designs that lead to numerous errors in the code, Poor coding practices that give rise to extensive bug fixing, Inexperienced personnel, Incomplete or unskilled project planning, Prima donna team members, Abandoning planning under pressure, Developer gold-plating, Lack of automated source code control)
 * 2) Unstable Requirements
 * 3) Omitted Activities
    - Development: Ramp-up time for new team members, Mentoring of new team members, Management coordination/manager meetings, Cutover/deployment, Data conversion, Installation, Customization, Requirements clarifications, Maintaining the revision control system, Supporting the build, Maintaining the scripts required to run the daily build, Maintaining the automated smoke test used in conjunction with the daily build, Installation of test builds at user location(s), Creation of test data, Management of beta test program, Participation in technical reviews, Integration work, Processing change requests, Attendance at change-control/triage meetings, Coordinating with subcontractors, Technical support of existing systems during the project, Maintenance work on previous systems during the project, Defect-correction work, Performance tuning, Learning new development tools, Administrative work related to defect tracking, Coordination with test (for developers), Coordination with developers (for test), Answering questions from quality assurance, Input to user documentation and review of user documentation, Review of technical documentation, Demonstrating software to customers or users, Demonstrating software at trade shows, Demonstrating the software or prototypes of the software to upper management, clients, and end users, Interacting with clients or end users, Supporting beta installations at client locations, Reviewing plans, estimates, architecture, detailed designs, stage plans, code, test cases, and so on
    - Non-development: Vacations, Holidays, Sick days, Training, Weekends, Company meetings, Department meetings, Setting up new workstations, Installing new versions of tools on workstations, Troubleshooting hardware and software problems
 * 4) Unfounded Optimism (We'll be more productive on this project than we were on the last project.; A lot of things went wrong on the last project. Not so many things will go wrong on this project.; We started the project slowly and were climbing a steep learning curve. We learned a lot of lessons the hard way, but all the lessons we learned will allow us to finish the project much faster than we started it.)
 * 5) Subjectivity and Bias
 * 6) Off-The-Cuff Estimates
 * 7) Unwarranted Precision
 * 8)  Other Sources of Error  (Unfamiliar business area, Unfamiliar technology area, Incorrect conversion from estimated time to project time (for example, assuming the project team will focus on the project eight hours per day, five days per week), Misunderstanding of statistical concepts (especially adding together a set of "best case" estimates or a set of "worst case" estimates), Budgeting processes that undermine effective estimation (especially those that require final budget approval in the wide part of the Cone of Uncertainty), Having an accurate size estimate, but introducing errors when converting the size estimate to an effort estimate, Having accurate size and effort estimates, but introducing errors when converting those to a schedule estimate, Overstated savings from new development tools or methods, Simplification of the estimate as it's reported up layers of management, fed into the budgeting process, and so on)

People naturally assume that a system that is 10 times as large as another system will require something like 10 times as much effort to build. But the effort for a 1,000,000-LOC system is more than 10 times as large as the effort for a 100,000-LOC system, as is the effort for a 100,000-LOC system compared to the effort for a 10,000-LOC system.

The basic issue is that, in software, larger projects require coordination among larger groups of people, which requires more communication (Brooks 1995). As project size increases, the number of communication paths among different people increases as a squared function of the number of people on the project.
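The "squared function" growth in communication paths can be sketched directly; the team sizes below are invented examples:

```python
# Pairwise communication paths among n people: n * (n - 1) / 2,
# which grows roughly with the square of team size.

def communication_paths(people: int) -> int:
    return people * (people - 1) // 2

for n in (5, 10, 50):
    print(n, communication_paths(n))
# 5 people -> 10 paths; 10 -> 45; 50 -> 1225.
# Doubling the team from 5 to 10 more than quadruples the paths.
```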

The consequence of this rapid increase in communication paths (along with some other factors) is that project effort also grows disproportionately, faster than linearly, as project size increases. This is known as a diseconomy of scale.

Outside software, we usually discuss economies of scale rather than diseconomies of scale. An economy of scale is something like, "If we build a larger manufacturing plant, we'll be able to reduce the cost per unit we produce." An economy of scale implies that the bigger you get, the smaller the unit cost becomes.

A diseconomy of scale is the opposite. In software, the larger the system becomes, the greater the cost of each unit. If software exhibited economies of scale, a 100,000-LOC system would be less than 10 times as costly as a 10,000-LOC system. But the opposite is almost always the case.



Figure 5-3 illustrates a typical diseconomy of scale in software compared with the increase of effort that would be associated with linear growth.

As you can see from the graph, in this example, the 10,000-LOC system would require 13.5 staff months. If effort increased linearly, a 100,000-LOC system would require 135 staff months, but it actually requires 170 staff months.
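Assuming effort follows a power law of the form Effort = a × Size^b (the form used by estimation models such as COCOMO, though this section doesn't name a specific model), the two data points above pin down the implied diseconomy:

```python
import math

# Fit Effort = a * size**b to the two data points from Figure 5-3.
# The power-law form is an assumption; the 13.5 and 170 staff-month
# figures come from the text.

size1, effort1 = 10.0, 13.5    # KLOC, staff months
size2, effort2 = 100.0, 170.0

b = math.log(effort2 / effort1) / math.log(size2 / size1)
a = effort1 / size1 ** b

print(round(b, 2))           # 1.1 -- a linear relationship would give b = 1.0
print(round(a * 1000 ** b))  # extrapolated effort at 1,000 KLOC (~2,100 SM,
                             # vs 1,350 SM if effort scaled linearly)
```

An exponent only slightly above 1.0 is enough to produce the 160x (rather than 100x) effort growth described below for the nominal curve.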

As Figure 5-3 is drawn, the effect of the diseconomy of scale doesn't look very dramatic. Indeed, within the 10,000-LOC to 100,000-LOC range, the effect is usually not all that dramatic. But two factors make it more dramatic: a greater difference in project size, and project conditions that degrade productivity more quickly than average as project size increases. Figure 5-4 shows the range of outcomes for projects ranging from 10,000 LOC to 1,000,000 LOC. In addition to the nominal diseconomy, the graph also shows the worst-case diseconomy.



In this graph, you can see that the worst-case effort growth increases much faster than the nominal effort growth, and that the effect becomes much more pronounced at larger project sizes. Along the nominal effort growth curve, effort at 100,000 lines of code is 13 times what it is at 10,000 lines of code, rather than 10 times. At 1,000,000 LOC, effort is 160 times the 10,000-LOC effort, rather than 100 times.

The worst-case growth is much worse. Effort on the worst-case curve at 100,000 LOC is 17 times what it is at 10,000 LOC, and at 1,000,000 LOC it isn't 100 times as large—it's 300 times as large!

Table 5-1 illustrates the general relationship between project size and productivity.

The numbers in this table are valid only for purposes of comparison between size ranges. But the general trend the numbers show is significant. Productivity on small projects can be 2 to 3 times as high as productivity on large projects, and productivity can vary by a factor of 5 to 10 from the smallest projects to the largest.

For software estimation, the implications of diseconomies of scale are a case of good news, bad news. The bad news is that if you have large variations in the sizes of projects you estimate, you can't just estimate a new project by applying a simple effort ratio based on the effort from previous projects. If your effort for a previous 100,000-LOC project was 170 staff months, you might figure that your productivity rate is 100,000/170, which equals 588 LOC per staff month. That might be a reasonable assumption for another project of about the same size as the old project, but if the new project is 10 times bigger, the estimate you create that way could be off by 30% to 200%.
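The pitfall can be illustrated with a hypothetical power-law model; the exponent b = 1.10 is an assumed nominal diseconomy, and a higher-diseconomy environment would widen the gap further:

```python
# Ratio-based vs power-law estimate for a project 10x the old size.
# The 100,000-LOC / 170-staff-month baseline comes from the text;
# the exponent b = 1.10 is an assumed nominal diseconomy.

old_size, old_effort = 100_000, 170
new_size = 1_000_000

ratio_estimate = new_size / (old_size / old_effort)  # assumes 588 LOC/SM holds
b = 1.10
power_estimate = old_effort * (new_size / old_size) ** b

print(round(ratio_estimate))  # 1700 staff months
print(round(power_estimate))  # ~2140 -- the simple ratio comes in ~20% low
```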

There's more bad news: There isn't a simple technique in the art of estimation that will account for a significant difference in the size of two projects. If you're estimating a project of a significantly different size than your organization has done before, you'll need to use estimation software that applies the science of estimation to compute the estimate for the new project based on the results of past projects. My company provides a free software tool called Construx® Estimate™ that will do this kind of estimate. You can download a copy at www.construx.com/estimate.

When You Can Safely Ignore Diseconomies of Scale

After all that bad news, there is actually some good news. The majority of projects in an organization are often similar in size. If the new project you're estimating will be similar in size to your past projects, it is usually safe to use a simple effort ratio, such as lines of code per staff month, to estimate a new project. Figure 5-5 illustrates the relatively minor difference in linear versus exponential estimates that occurs within a specific size range.



If you use a ratio-based estimation approach within a restricted range of sizes, your estimates will not be subject to much error. If you used an average ratio from projects in the middle of the size range, the estimation error introduced by diseconomies of scale would be no more than about 10%. If you work in an environment that experiences higher-than-average diseconomies of scale, the differences could be higher.

Importance of Diseconomy of Scale in Software Estimation

Much of the software-estimating world's focus has been on determining the exact significance of diseconomies of scale. Although that is a significant factor, remember that the raw size is the largest contributor to the estimate. The effect of diseconomy of scale on the estimate is a second-order consideration, so put the majority of your effort into developing a good size estimate. We'll discuss how to create software size estimates more specifically in Chapter 18, "Special Issues in Estimating Size."

Count, Compute, Judge techniques

Delphi method

Challenges with Estimating Size

Challenges with Estimating Effort

Challenges with Estimating Schedule

Story-based scope definition: scoping the project, release planning

Documenting and presenting estimation results

PERT analysis
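For the PERT analysis item, the classic three-point formula combines optimistic, most likely, and pessimistic estimates into an expected value; the task numbers below are invented examples:

```python
# PERT expected value E = (O + 4M + P) / 6, with the standard
# deviation commonly approximated as (P - O) / 6.

def pert(optimistic, most_likely, pessimistic):
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

expected, std_dev = pert(10, 15, 32)  # staff days for a hypothetical task
print(expected)  # 17.0 -- pulled above the most-likely value of 15
                 # by the long pessimistic tail
```

Weighting the most likely case 4x keeps single outliers from dominating, while the (P − O)/6 spread gives a rough confidence band around the expected value.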