TRANSCRIPT
Estimating Cost

size (LoC, fp)
difficulty
effort (ideal mm)
productivity
work (mm)
rate ($/mm)
cost ($)
time
Modeling
• Simple functional shape
  – e.g. effort = c × size^exponent (scaling)
  – Based on general observations
• Very many parameters
• Calibration based on past experience
• Speculative due to high uncertainty
• Not very accurate
So how should we measure (and estimate) the size of a software project?
LoC
• What is a line? What is code?
• Typically count statements
• Exclude comments
• Exclude blank lines
• What about macros?
• What about temporary code, e.g. for testing?
• Depends on language
  – handled in another part of the model
• How do you estimate LoC from requirements?
Function Points (fp)
• Estimate the functional content of the project
  – Based on requirements
  – Mainly external aspects of the system (black box)
  – Can be used early in the life cycle
• Invented at IBM in the 1970s
• Geared towards information systems
• Several variants, e.g. the one used by SPR (Capers Jones's company)
• Translation of fp to LoC depends on language
  – fp itself is independent of language
Function Points (fp)
How is it done?
• Based on counting 4-7 parameters
• Multiplying them by weighting factors
• Summing up the weighted counts
• Multiplying by a complexity adjustment factor
fp Spreadsheet
Parameter             Count   Weight    Result
Number of inputs      _____   x 4     = _____
Number of outputs     _____   x 5     = _____
Number of queries     _____   x 4     = _____
Number of data files  _____   x 10    = _____
Number of interfaces  _____   x 7     = _____
Unadjusted total                        _____
Complexity adjustment factor            _____
Adjusted fp total                       _____
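The spreadsheet arithmetic can be sketched directly in code. A minimal Python version, using the weights from the table above; the counts below are invented for illustration:

```python
# Weights from the fp spreadsheet above; counts are made-up illustrations.
WEIGHTS = {"inputs": 4, "outputs": 5, "queries": 4, "data_files": 10, "interfaces": 7}

def adjusted_fp(counts, complexity_adjustment=1.0):
    """Sum the weighted counts, then apply the complexity adjustment factor (0.65-1.35)."""
    unadjusted = sum(WEIGHTS[p] * n for p, n in counts.items())
    return unadjusted * complexity_adjustment

counts = {"inputs": 10, "outputs": 6, "queries": 4, "data_files": 3, "interfaces": 2}
print(adjusted_fp(counts))        # unadjusted total: 10*4 + 6*5 + 4*4 + 3*10 + 2*7 = 130
print(adjusted_fp(counts, 1.2))   # adjusted: 130 * 1.2 = 156
```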
Callouts on the slide annotate the parameters with examples: controls (what to do), screens, records/fields, internal files (e.g. an index), and interfaces with other programs.
Can have different weights for "simple", "average", or "complex"
Complexity Adjustment
Data communications
Distributed functions
Performance objectives
Heavily used configuration
Transaction rate
On-line data entry
End-user efficiency
On-line update
Complex processing
Reusability
Installation ease
Operational ease
Multiple sites
Facilitate change
• Rate each on a scale of 0 to 5
• Sum them up
• Divide by 100
• Add 0.65
• This gives a factor in the range 0.65-1.35 (35%)
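The four steps above amount to a one-liner. A minimal sketch in Python (the 14 ratings are made up):

```python
def complexity_adjustment(ratings):
    """ratings: the 14 complexity adjustment factors, each rated 0-5."""
    assert len(ratings) == 14 and all(0 <= r <= 5 for r in ratings)
    return 0.65 + sum(ratings) / 100

print(complexity_adjustment([0] * 14))  # 0.65: no influence anywhere
print(complexity_adjustment([5] * 14))  # 1.35: strong influence everywhere
print(complexity_adjustment([3] * 14))  # ~1.07: average influence everywhere
```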
0 = no influence
1 = insignificant influence
2 = moderate influence
3 = average influence
4 = significant influence
5 = strong influence
COCOMO
• Stands for COnstructive COst MOdel
• Published by Barry Boehm in 1981 for waterfall
• COCOMO II update for modern methodologies published in 2000
• Actually three models, with many parameters
  – Early prototype: checking high-risk issues stage
  – Early design: architecture development stage
  – Post-architecture: code development to delivery stage, very detailed
Size in COCOMO II
• Start with function points
• Translate to KLoC based on language
Language     LoC/fp
Assembly     320
C            128
Fortran 77   105
Lisp         64
Java         53
Visual C++   34
Perl         27
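The table translates directly into a lookup. A sketch (the fp count of 500 is made up):

```python
# LoC per function point, from the table above.
LOC_PER_FP = {"Assembly": 320, "C": 128, "Fortran 77": 105, "Lisp": 64,
              "Java": 53, "Visual C++": 34, "Perl": 27}

def kloc_estimate(fp, language):
    """Translate a function-point count into thousands of lines of code."""
    return fp * LOC_PER_FP[language] / 1000

print(kloc_estimate(500, "C"))     # 64.0 KLoC
print(kloc_estimate(500, "Java"))  # 26.5 KLoC -- same functionality, much less code
```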
The Model
• MM = man-months of effort
• KLoC = lines of code ('000)
  – Includes model for taking reuse into account
  – Also estimates a factor of increase due to changes
• SFj = set of scaling factors
• EMi = set of effort multipliers
MM = 2.94 × KLoC^E × Π(i=1..16) EMi
where E = 0.91 + Σ(j=1..5) SFj
time = 3.67 × MM^(0.28 + 0.2 × Σ(j=1..5) SFj)

staff = MM / time

Note: not the other way around!!!
Calibration
• The model includes two "top-level" constants
  – Average productivity of 2.94 and exponent of 0.91
• Also dozens of parameters in scaling factors and effort multipliers
• These are all derived by calibrating the model to data about 161 specific projects from the late 1990s using a Bayesian approach
• Users should calibrate to their own data
The Exponent
LoC^(0.91 + Σ(j=1..5) SFj)
• <1 reflects economy of scale
  – Uncommon
  – Possible due to fixed startup costs in small projects
• >1 reflects diseconomy of scale
  – Due to growth of inter-person communication needs
  – Due to growth of integration overhead
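A quick numeric illustration of diseconomy of scale, using the nominal productivity constant 2.94 and an exponent of 1.15 (the value used in the example later), with all effort multipliers at 1.00:

```python
def effort(kloc, exponent=1.15):
    """Nominal COCOMO II effort in man-months, all effort multipliers = 1.00."""
    return 2.94 * kloc ** exponent

one_big = effort(100)       # one 100-KLoC project
two_small = 2 * effort(50)  # two independent 50-KLoC projects of the same total size
print(round(one_big), round(two_small))  # 587 vs 529: the big project costs ~11% more
```

With an exponent below 1 the comparison would flip, which is what "economy of scale" means here.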
Scaling Factors
• The project is similar to previous ones
• Flexibility in achieving goals
• Risks have been resolved
• Team is cohesive
• Process is mature (based on 18-item questionnaire)
• Each factor has several levels
• Each level has a score
  – Scores go down from ~0.07 to 0
Example:
Thoroughly unprecedented = 0.0620
Largely unprecedented = 0.0496
Somewhat unprecedented = 0.0372
Generally familiar = 0.0248
Largely familiar = 0.0124
Thoroughly familiar = 0.0000
Effort Multipliers
• Each has a scale of possible values
• All scales include 1.00 as the default
• Some have only higher values
• Others have both higher and lower
Required reliability
Database size
Product complexity
Intended reusability
Suitability of documentation
Execution time constraints
Main storage constraint
Platform volatility
Analyst capabilities
Programmer capabilities
Personnel continuity
Applications experience
Platform experience
Language/tools experience
Use of tools
Multi-site development
Required schedule
Example:
Very low = 0.73
Low = 0.87
Nominal = 1.00
High = 1.17
Very high = 1.34
Extra high = 1.74
Selection based on a table with examples for control, data, computation, devices, and user interface.
Example:
Available storage used   Rating       Multiplier
50%                      Nominal      1.00
70%                      High         1.05
85%                      Very high    1.17
95%                      Extra high   1.46
Highest impact on productivity, as assessed by max/min factor:
• Staff capabilities: 3.53
• Project complexity: 2.38
• Time constraint: 1.63
• All others: range of 1.26 to 1.54
Example
• Assume an estimated size of 100 KLoC
• Average large project: exponent = 1.15
• Average project: all effort multipliers = 1.00
MM = 2.94 × 100^1.15 ≈ 587
time = 3.67 × 587^0.33 ≈ 29.7
staff = 587 / 29.7 ≈ 19.75
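The worked example can be checked in code. A sketch of the model equations as given above; the five scale-factor values are chosen (arbitrarily) so that they sum to 0.24, which yields the exponent 1.15:

```python
from math import prod

def cocomo(kloc, scale_factors, effort_multipliers):
    """COCOMO II effort, schedule, and staffing (scale factors already /100, as on the slides)."""
    sf = sum(scale_factors)
    mm = 2.94 * kloc ** (0.91 + sf) * prod(effort_multipliers)
    time = 3.67 * mm ** (0.28 + 0.2 * sf)
    return mm, time, mm / time   # staff = MM / time, not the other way around

mm, time, staff = cocomo(100, [0.048] * 5, [1.00])  # sum of SFs = 0.24 -> exponent 1.15
print(round(mm), round(time, 1), round(staff, 1))   # ~587, ~29.7, ~19.8
```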
Use-Case Points
• Function points are based on information system concepts like queries and transactions
• Modern systems are not characterized by the same attributes
• But they can be characterized in terms of use-cases
• Which are also known in an early phase of the lifecycle
Use-Case Points

UCP = (UC + A) × TECH × ENV

UC = sum of weights for simple, average, and complex use cases

Type      Steps   Classes   Weight
Simple    ≤3      ≤5        5
Average   4-7     5-10      10
Complex   >7      >10       15
A = sum of weights for simple, average, and complex actors

Type      Characteristics               Weight
Simple    Programmatic using API        1
Average   Programmatic using protocol   2
Complex   Human using GUI               3
TECH = technical complexity factor

TECH = 0.6 + Σ(i) Wi × Fi

Each factor Fi is scored from 0 to 5 and multiplied by its weight Wi from the table.
Range: 0.6-1.3

Distributed system          0.02
Response time requirement   0.01
End-user efficiency         0.01
Complex processing          0.01
Reusable code               0.01
Easy to install             0.005
Easy to use                 0.005
Portability                 0.02
Maintenance                 0.01
Concurrent/parallel         0.01
Security features           0.01
Access by third party       0.01
End-user training           0.01
ENV = environmental complexity factor

ENV = 1.4 − Σ(i) Wi × Fi

Each factor Fi is scored from 0 to 5 and multiplied by its weight Wi from the table.
Range: 0.42-1.70

Familiarity with process        0.045
Application experience          0.015
OO experience                   0.03
Lead analyst capability         0.015
Team motivation                 0.03
Requirements stability          0.06
Part-time staff                 -0.03
Difficult programming language  -0.03
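Putting the four pieces together: a sketch with the weight tables from the slides; the use-case counts, actor counts, and factor scores are invented for illustration:

```python
# Weight tables from the slides (13 technical factors, 8 environmental factors).
TECH_WEIGHTS = [0.02, 0.01, 0.01, 0.01, 0.01, 0.005, 0.005,
                0.02, 0.01, 0.01, 0.01, 0.01, 0.01]
ENV_WEIGHTS = [0.045, 0.015, 0.03, 0.015, 0.03, 0.06, -0.03, -0.03]

def use_case_points(uc, a, tech_scores, env_scores):
    """UCP = (UC + A) * TECH * ENV; each score is 0-5 per factor."""
    tech = 0.6 + sum(s * w for s, w in zip(tech_scores, TECH_WEIGHTS))
    env = 1.4 - sum(s * w for s, w in zip(env_scores, ENV_WEIGHTS))
    return (uc + a) * tech * env

uc = 2 * 5 + 4 * 10 + 1 * 15   # 2 simple, 4 average, 1 complex use case = 65
a = 1 * 1 + 0 * 2 + 3 * 3      # 1 simple, 0 average, 3 complex actors = 10
print(round(use_case_points(uc, a, [3] * 13, [3] * 8), 1))  # ~76.1
```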
Agile
• Approach is to guarantee quality and schedule at the expense of features
• User stories broken into tasks of 1-3 “ideal days”
• Measure velocity = how many ideal days correspond to a real day
• Plan user stories for next iteration taking velocity into account
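A minimal sketch of velocity-based iteration planning (all numbers are invented):

```python
# Last iteration: 24 ideal days of tasks completed in 40 person-days of real time.
velocity = 24 / 40            # 0.6 ideal days per real day
capacity = 40 * velocity      # ideal days that fit in the next iteration: 24.0

# Hypothetical backlog of user stories, estimates in ideal days, in priority order.
stories = [("story A", 8), ("story B", 7), ("story C", 6), ("story D", 5)]
planned, used = [], 0
for name, estimate in stories:   # greedily fill the iteration
    if used + estimate <= capacity:
        planned.append(name)
        used += estimate
print(planned, used)  # ['story A', 'story B', 'story C'] 21 -- story D waits
```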
Schedule
Common approach:
• Manager decides on schedule
• Subordinates tell him it will be OK
  – SNAFU principle: accurate communication is only possible between equals
• Engineers don't have a say
• Schedule slippage is discovered too late
• Overwork, hysteria, reduced quality, and late delivery
Brooks’s Law
From “The Mythical Man-Month”
Adding manpower to a late software project makes it later
Men and months are not interchangeable
• Attributed to
  – The need to train new personnel
  – Communication overhead within a larger team
  – Serial tasks like design, debugging, and integration
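The communication-overhead point is often illustrated by counting pairwise communication paths, which grow quadratically with team size. A tiny sketch (an illustration of the growth, not an exact cost model):

```python
def communication_paths(n):
    """Number of pairwise communication paths in a team of n people: n*(n-1)/2."""
    return n * (n - 1) // 2

for n in (2, 5, 10, 20):
    print(n, communication_paths(n))  # 2->1, 5->10, 10->45, 20->190
```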
Schedule
Alternative approach:
• Manager decides on schedule
• Subordinates provide estimates
  – Based on best available input from engineers
  – Managers may not change this
• Manager creates feedback loop: disappointing estimates lead to revised resources or tasks for engineers
• Schedule affects the manager's performance rating but not the engineers'
Technical Debt
Oftentimes you have two options of how to do something:
A clean design OR quick and dirty
The difference between them is the technical debt:
• It burdens future development (harder to make progress)
• It accrues interest (you will pay with harder work)
• You can (and should?) pay off the principal (refactor to achieve a clean design)
Examples
• Not using classes to separate concerns
• Partial/no unit testing
• Using an inefficient algorithm because it's simpler
• Not checking inputs
• Using bad identifier names or not naming constants
• Using constants instead of settable parameters
• Skipping documentation
• Cryptic error messages, if any
Technical debt is a form of risk (albeit risk introduced intentionally).
Managing technical debt is risk management.
Considerations
• Take on debt to exploit a business opportunity
  – e.g. make a release to gain market share
• Pay off debt to avoid paying interest
  – Will make future progress more efficient, but...
  – Incurs "wasted" time in which we don't make progress
• Continue paying interest if it is low enough
  – e.g. if dirty code is peripheral
Refactoring
• High level
  – Create new abstractions (+ information hiding)
  – Move methods from/to super/sub class
• Low level
  – Changing names
  – Extracting methods
  – Supported by tools
• Need to include in schedule
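A low-level refactoring example in Python (the names and the validity rule are invented): renaming and extracting a method without changing behavior.

```python
# Before: cryptic name, inlined condition.
def proc(vals):
    total = 0
    for v in vals:
        if v >= 0:
            total += v
    return total

# After: an intention-revealing name and an extracted, reusable helper.
def is_valid(value):
    """Hypothetical validity rule: only non-negative values count."""
    return value >= 0

def sum_valid_values(values):
    return sum(v for v in values if is_valid(v))

# Behavior is unchanged -- the property a tool-supported refactoring guarantees.
print(proc([3, -1, 4]) == sum_valid_values([3, -1, 4]))  # True
```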