predicting programming group productivity a communications model

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-1, NO. 4, DECEMBER 1975

K. H. Kim (S'73-M'75) was born in Korea in 1947. He received the B.S. degree in electrical engineering from Seoul National University, Seoul, Korea, the M.A. degree in computer science from the University of Texas, Austin, and the Ph.D. degree in electrical engineering and computer science from the University of California, Berkeley, in 1969, 1972, and 1974, respectively.

From 1969 to 1971 he was an officer in the Korean Army. He worked as a Research Assistant in the Electronics Research Center of the University of Texas at Austin from 1971 to 1972, and in the Electronics Research Laboratory of the University of California, Berkeley, from 1972 to 1974. During the summer of 1974, he worked as an Acting Instructor in the Computer Science Division of the University of California, Berkeley. Since January 1975, he has been an Assistant Professor of Electrical Engineering and Computer Science at the University of Southern California, Los Angeles. His research interests include computing system architecture, software engineering, and theory of parallel computing.

Dr. Kim is a member of the Association for Computing Machinery and the Korean Institute of Electrical Engineers.

W. T. Chen was born in Taiwan, China, on May 27, 1948. He received the B.S. degree in nuclear engineering from National Tsing-Hua University, Taiwan, China, and the M.S. degree in electrical engineering and computer sciences from the University of California, Berkeley, in 1970 and 1973, respectively.

He is currently a Research Assistant with the Electronics Research Laboratory, College of Engineering, University of California, Berkeley.

Predicting Programming Group Productivity

A Communications Model

RANDALL F. SCOTT AND DICK B. SIMMONS, MEMBER, IEEE

Abstract-Methods of studying programmer productivity are difficult to find. The classical methods of observation and statistical analysis are in many cases inappropriate. This paper describes a simulation approach in which programmers are considered to be individual processors. Relative group productivity is then measured based upon the productivity levels and communications relationships of the processors.

Index Terms-Communications structure, programmer productivity, programming group size, simulation, small group model.

INTRODUCTION

THE productivity of computer programmers is the most important and yet the most neglected aspect of a computer application. The major components of any application, hardware and software, are growing further apart in relative costs. With continuing developments in hardware technology, the cost of computing power has been reduced greatly. At the same time, soaring personnel costs and expanded applications have meant greatly increased software costs. In 1972, over 10 billion dollars were spent in the U.S. on software [1]; further, software now accounts for more than 50 percent of the costs of almost every computer application.

Manuscript received August 5, 1975.

R. F. Scott is with the Command Control and Communications Directorate, U.S. Air Force Headquarters, Washington, D.C.

D. B. Simmons is with the Data Processing Center, Texas A&M University, College Station, Tex. 77843.

This shift in the software/hardware cost ratio has not spurred efforts to improve programmer productivity. The reason is not a failure to recognize inflated software costs, but the difficulty of applying conventional human performance research techniques and solutions to this problem area. Statistical analysis is hampered by the lack of data collected during programming projects, although Delphi surveys can help indicate important factors in determining productivity [2]. Project observation is too expensive and time consuming for most needs, since a proper sample of programming projects would range from day-long problems to others requiring years to complete.

Simulation offers some hope as a research tool for the analysis of programmer productivity. Simulation has proven useful for examining small groups from the social psychological viewpoint [3]. The programming shop has also been viewed as a feedback network and then simulated [4]. Another simulation approach to the programming group, as detailed in this paper, is comparison to a communications network.


One of the most readily available analogies of a communications model is a multiprocessor computer network. Each processor has its own rate of productivity, communications requirements with other processors, and overhead. By extending this analogy, each processor can be considered a programmer with a productivity rate, needs for technical communications with other programmers, and personal time requirements.

The general type of multiprocessing system from which this concept was derived was referred to by Flynn [5] as single instruction multiple data stream (SIMD) organization. In this category fall array, pipeline, and associative processors. Each processor can be unique with its own processing ability and communications requirements. Communications times for SIMD programs vary from fifteen to forty percent. Based on empirical evaluations of program performances with SIMD organized processors, the actual performance was found to be proportional to log2 of the total number of processing elements.

An activity profile description and a productivity level are the functions that must be described for the programmer-to-processor analogy. The activity profile is the percentage of time a programmer spends in various activities. It is divided into three parts: personal, communications, and productive time. The empirical studies of Shell [6] and Mayer and Stalnaker [7] were used for general ideas of actual profiles. Communications times varied from 7.5 to 40 percent, personal time from 7 to 13.5 percent, and productive time from 51 to 79 percent. Because of the wide variation, this empirical evidence is used only for providing a reasonable range of values.

Methods of describing productivity levels are not as readily available as profiles. By the very nature of a programming project, productivity (defined here as implemented object instructions produced per unit of time) rises sharply early in the project, then drops considerably as the project approaches completion. (In fact, most people agree that large programs are never 100 percent debugged.) There is evidence that it takes proportionally more man-months to complete longer projects, which indicates that individual productivity is reduced after some time has passed. The representative distribution chosen for this type of decreasing productivity rate in the following experiments is the cumulative exponential

$$1 - \exp(-\alpha x).$$
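The shape of this curve can be made concrete with a short sketch. It assumes, as the model description suggests, that alpha is a programmer's productivity level and x is relative elapsed time; the function names `cumulative_output` and `output_increments` are illustrative and do not come from the original APL program.

```python
import math


def cumulative_output(alpha: float, x: float) -> float:
    """Cumulative exponential 1 - exp(-alpha * x)."""
    return 1.0 - math.exp(-alpha * x)


def output_increments(alpha: float, elapsed: float, samples: int) -> list:
    """Output gained in each of `samples` equal slices of [0, elapsed]."""
    step = elapsed / samples
    return [
        cumulative_output(alpha, (k + 1) * step) - cumulative_output(alpha, k * step)
        for k in range(samples)
    ]


if __name__ == "__main__":
    # Each later slice contributes less than the one before it,
    # which is the decreasing productivity rate described in the text.
    for k, gain in enumerate(output_increments(alpha=1.0, elapsed=1.0, samples=5), 1):
        print(f"slice {k}: {gain:.4f}")
```

The increments shrink from one slice to the next, so the same amount of working time yields less output late in a project than early in it.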

MODEL LOGIC

The model requires five items of input information. First is the communications probability matrix. This is a matrix with a row and a column for each programmer on the project. At each row and column intersection is the probability that the programmer represented by that row will initiate communications with the programmer represented by that column. A second input item is the personal time probability vector. This vector contains the probability that each programmer will be nonproductive because he is doing something personal. A vector of productivity levels (alphas in the cumulative exponential distribution) provides each programmer a productivity rate, while the relative elapsed time provides an input scalar to determine, relative to the exponential productivity curves, the elapsed time to complete the project (i.e., the maximum value of x). Finally, the number of sample time increments must be specified. This is the number of equally time spaced samples that will be taken during the experiment.
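The five inputs can be gathered into one structure and driven by a simple Monte Carlo loop. The following is a minimal sketch, not the authors' APL program: the paper does not spell out the sampling rules, so the rules here are assumptions. In particular, it assumes that a communication makes both the initiator and the target nonproductive for that sample, and that each programmer's output is the cumulative exponential evaluated at his accumulated productive time. The names `ModelInputs` and `simulate` are hypothetical.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class ModelInputs:
    comm_matrix: List[List[float]]  # [i][j]: P(i initiates communication with j)
    personal_prob: List[float]      # P(programmer i is on personal time)
    alphas: List[float]             # productivity levels
    elapsed: float                  # relative elapsed time (maximum x)
    samples: int                    # number of equally spaced time samples


def simulate(inputs: ModelInputs, rng: random.Random) -> float:
    """Return total group output for one simulated project (assumed rules)."""
    n = len(inputs.alphas)
    step = inputs.elapsed / inputs.samples
    productive_time = [0.0] * n
    for _ in range(inputs.samples):
        # Assumption: personal time is drawn first for every programmer.
        busy = [rng.random() < inputs.personal_prob[i] for i in range(n)]
        for i in range(n):
            if busy[i]:
                continue
            for j in range(n):
                # Assumption: a conversation occupies initiator and target.
                if j != i and rng.random() < inputs.comm_matrix[i][j]:
                    busy[i] = busy[j] = True
                    break
        for i in range(n):
            if not busy[i]:
                productive_time[i] += step
    return sum(1.0 - math.exp(-a * t) for a, t in zip(inputs.alphas, productive_time))


if __name__ == "__main__":
    n = 5
    group = ModelInputs(
        comm_matrix=[[0.0 if i == j else 0.05 for j in range(n)] for i in range(n)],
        personal_prob=[0.13] * n,
        alphas=[1.0] * n,
        elapsed=1.0,
        samples=20,
    )
    print(simulate(group, random.Random(1975)))
```

Passing a seeded `random.Random` keeps runs repeatable, which helps when the result of one experiment is used to shape the next.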

To easily conduct multiple experiments in which the results of the previous experiment determine the form of the next, the model was programmed in A Programming Language (APL). The APL program consists of six function routines plus the system library plot routines.

EXPERIMENTS

A simulated programming group of five people was used to demonstrate the performance of the model under extreme conditions. All were assigned productivity alphas of one and a zero for personal time. Optimal total productivity would be 3.16 (i.e., 5 multiplied by $1 - e^{-1}$).

Communications percentages of 0 to 100 were tested. Table I contains the average productivity results for the range of profiles.
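The optimum quoted above, and the share of it that survives at each communications percentage, can be checked with a few lines. This is a sketch only: the productivity values are copied from Table I, while the variable names and the "fraction retained" measure are illustrative additions rather than part of the original analysis.

```python
import math

GROUP_SIZE = 5
ALPHA = 1.0     # productivity level assigned to every programmer
ELAPSED = 1.0   # one relative time period

# Ceiling reached when no time is lost to communications or personal time.
optimal = GROUP_SIZE * (1.0 - math.exp(-ALPHA * ELAPSED))

# Average productivity by communications percentage, as reported in Table I.
table_1 = {0: 3.16, 20: 2.34, 40: 1.42, 60: 0.64, 80: 0.27, 100: 0.00}

# Illustrative measure: fraction of the optimum retained at each percentage.
retained = {pct: value / optimal for pct, value in table_1.items()}

if __name__ == "__main__":
    print(f"optimal total productivity: {optimal:.2f}")  # 3.16
    for pct, share in retained.items():
        print(f"{pct:3d} percent communications: {share:.2f} of optimum")
```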

Because of the flexible structure of the model, an unlimited number of experiments could be conducted. Any interactive communications structure would be possible as well as various productivity levels and profiles. Three types of experiments were conducted to illustrate possible approaches to using the model.

Structure Experiment

The first experiment was an attempt to determine if the communications structure, for a given group of programmers, would affect the total productivity of the group. The simulated programming group consisted of seven people. The supervisor was assigned a low productivity level (alpha = 0.5) and a high communications profile (60 percent). Two programmers each had productivity levels of 3 and communications profiles of 25 percent. The remaining four people each had productivity levels of 1 and communications profiles of 20 percent. All personnel had personal time probabilities of 13 percent.

Three structures were examined (Fig. 1). Twenty experiments were run on each structure to determine the total productivity after one relative time period. For each experiment, 20 equally time spaced samples were taken. An analysis of variance performed on the results showed a very significant statistical difference in the productivity between the structures. Structures one and three were both superior to two; there was no significant difference between one and three.

TABLE I
PRODUCTIVITY RANGE VALUES

Communications Percentage    Productivity
            0                    3.16
           20                    2.34
           40                    1.42
           60                    0.64
           80                    0.27
          100                    0.00

Fig. 1. Experimental structure. (Panels: Structure 1, Structure 2, Structure 3.)

One explanation for structures one and three being better than two is an information "bottleneck" problem. The better structures have a more balanced distribution of needed information. Structure two's centralization of all communications in one member caused information delays and therefore more nonproductive periods. This indicates that multiple sources of needed information would improve overall project communications.
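A centralized structure of this kind can be written down as a communications probability matrix. The sketch below is a hypothetical reading, since the figure itself does not define the matrices: it assumes that a communications profile is the total probability in a programmer's row, that the central member spreads his profile evenly over the others, and that every other member initiates communications only with the central member. The function name `centralized_matrix` is illustrative.

```python
def centralized_matrix(hub_profile, member_profiles):
    """Build a matrix in which all communications pass through member 0.

    Assumptions (not from the paper): a profile is a row total, the hub
    divides its profile evenly, and members talk only to the hub.
    """
    n = len(member_profiles) + 1
    matrix = [[0.0] * n for _ in range(n)]
    for j in range(1, n):
        matrix[0][j] = hub_profile / (n - 1)
    for i, profile in enumerate(member_profiles, start=1):
        matrix[i][0] = profile
    return matrix


# Seven-person group from the structure experiment: a supervisor at
# 60 percent, two programmers at 25 percent, and four at 20 percent.
structure_two = centralized_matrix(0.60, [0.25, 0.25, 0.20, 0.20, 0.20, 0.20])

if __name__ == "__main__":
    for row in structure_two:
        print(" ".join(f"{p:.2f}" for p in row))
```

Every nonzero entry sits in the first row or first column, which is the bottleneck the paragraph describes: any exchange of information occupies the same member.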

Group Size Experiment

In the second experiment the effect on total project production resulting from adding programmers to the project using structure two was studied. One programmer had a communications profile of 60 percent and productivity alpha of 0.50. The other two had communications profiles of 20 percent and productivity alphas of 1. All personal time profiles were 13 percent. The added programmers all had productivity alphas of 1 and communications profiles of 20 percent. Group sizes of 3, 6, 9, 12, 15, and 18 were tested. Fig. 2 is a plot of the means of each set of experiments. Also plotted is one-half log2 of the number of programmers for each experiment. The log2 plot is used to compare these results with those obtained from actual multiprocessing systems [8].
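The reference curve used for comparison is easy to tabulate. The sketch below computes one-half log2 of each tested group size; the names `GROUP_SIZES` and `half_log2` are illustrative, and the simulated means themselves are not reproduced here because the source gives them only as a plot.

```python
import math

GROUP_SIZES = [3, 6, 9, 12, 15, 18]


def half_log2(people: int) -> float:
    """One-half of log base 2 of the number of programmers."""
    return 0.5 * math.log2(people)


reference_curve = {n: half_log2(n) for n in GROUP_SIZES}

if __name__ == "__main__":
    for n, value in reference_curve.items():
        print(f"{n:2d} people: {value:.3f}")
```

Doubling the group adds a constant 0.5 to this curve, so each added programmer contributes less than the one before.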

There was an obvious increase in total productivity for group sizes up to 12 people. However, an analysis of variance used to determine the effect on productivity of additional programmers failed to show any relative productivity benefit through expansion to the last three group sizes (12, 15, and 18).

The use of other structures, productivity levels, and profiles would undoubtedly produce different results. It does appear, however, that in all cases there would be a point beyond which the assignment of additional personnel would provide no improvement to group productivity.

Productive Individual Experiment

This experiment involved the positioning of a highly productive individual within an organization. The two alternatives examined were whether he should be in a supervisory or working programmer position. Structure two, with a total of seven programmers, was selected.

For test one the supervisor was assigned a productivity level of 0.5 and communications profile of 30 percent. The other six members of the group each had productivity levels of 1 and communications profiles of 25 percent. All programmers were assigned personal time profiles of 13 percent. In the second test the same organization was used but with a highly productive individual (productivity level of 8) assigned as supervisor. In the third test, the highly productive person replaced a working programmer.

Fig. 2. Average total productivity plot. (Curves: model results and one-half log2 of the number of people; horizontal axis: number of people.)

The test productivity means were 1.476, 1.347, and 1.974. A highly productive person's potential productivity is evidently neutralized by assigning him to a supervisory position requiring significant amounts of communications; further, his assignment as a working programmer resulted in increased group productivity. This highlights the old problem of promoting highly skilled people without endangering their contribution to project accomplishment.
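The size of these effects is easier to judge as a change relative to the first test. The sketch below uses only the three reported means; the labels and the `percent_change` helper are illustrative, and it says nothing about statistical significance, which the source does not report for this experiment.

```python
# Mean group productivity reported for the three tests.
test_means = {
    "test 1: ordinary supervisor (level 0.5)": 1.476,
    "test 2: high performer as supervisor": 1.347,
    "test 3: high performer as working programmer": 1.974,
}

BASELINE = test_means["test 1: ordinary supervisor (level 0.5)"]


def percent_change(value: float, baseline: float) -> float:
    """Change of `value` relative to `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline


if __name__ == "__main__":
    for label, mean in test_means.items():
        print(f"{label}: {mean:.3f} ({percent_change(mean, BASELINE):+.1f}%)")
```

On these figures the group does slightly worse than the baseline with the highly productive person as supervisor, and roughly a third better with him as a working programmer.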

CONCLUSIONS

Many project related variables are indirectly included in this model. The number of programmers, elapsed time for the project, and the productivity of individuals are all directly reflected in the model. Programmer experience and programming language can be reflected in the individual productivity levels. The effect of important variables related directly to communications, such as documentation, project communications, and assignment of independent modules for task assignment, are included in the communications profiles.

In general, the conclusions of these experiments were: for a given number of programmers with fixed activity profiles and productivity levels, productivity is affected by the organizational structure; for a given set of productivity levels, activity profiles, and organizational structure, there is an upper limit to the number of programmers that can effectively add to the total group productivity; and, lastly, the potential of highly productive people can be neutralized by assigning them to positions with high communications requirements.

REFERENCES

[1] B. W. Boehm, "Software and its impact: A quantitative assessment," Datamation, pp. 48-59, May 1973.

[2] R. F. Scott and D. B. Simmons, "Programmer productivity and the Delphi technique," Datamation, pp. 71-73, May 1974.

[3] H. J. Brightman, "Individual behavior and the small work group: A simulation study," Ph.D. dissertation, Univ. Massachusetts, Amherst, 1970.

[4] O. A. Thomas, "A computer programming shop simulation model: A managerial decision tool," in Conf. Rec., 3rd Int. Conf. Computer Management. Amsterdam, The Netherlands: North-Holland, 1972, pp. 235-244.

[5] M. J. Flynn, "Toward more efficient computer organizations," in 1972 Spring Joint Comput. Conf., AFIPS Conf. Proc., vol. 40. Montvale, N. J.: AFIPS Press, 1972, pp. 1211-1217.

[6] R. L. Shell, "Work measurement for computer programming operation," Ind. Eng., vol. 4, pp. 32-36, Oct. 1972.

[7] D. B. Mayer and A. W. Stalnaker, "Selection and evaluation of computer personnel," in On the Management of Computer Programming, G. F. Weinwurm, Ed. New York: Auerbach, 1970, pp. 133-157.

[8] M. J. Flynn, "Some computer organizations and their effectiveness," IEEE Trans. Comput., vol. C-21, pp. 948-960, Sept. 1972.

Randall F. Scott was born in Fayetteville, Tenn., on January 9, 1939. He received the B.S. degree in mathematics from Middle Tennessee State University, Murfreesboro, Tenn., in 1962 and the M.C.S. and Ph.D. degrees from Texas A&M University, College Station, in 1968 and 1973, respectively.

Since 1962 he has been on active duty in the U.S. Air Force. He is a Major and is currently assigned to the Command Control and Communications Directorate, U.S. Air Force Headquarters, Washington, D.C.

Dr. Scott is a member of the Association for Computing Machinery, Phi Kappa Phi, and Upsilon Pi Epsilon.

Dick B. Simmons (S'59-M'62), for a photograph and biography, please see p. 5 of the March 1975 issue of this TRANSACTIONS.

Software Performance Modeling Using

Computation Structures

HOWARD A. SHOLL, MEMBER, IEEE, AND TAYLOR L. BOOTH, FELLOW, IEEE

Abstract-An engineering-oriented performance model of a computation is developed by extending the concept of a computation structure to cover the performance costs appropriate to software modeling. The model allows both serial and parallel (multiprocessor) configurations, and the evaluation of both time and space parameters for alternate realizations.

A brief discussion on the use of the model as a mechanism to guide the performance optimization of programs is included.

Index Terms-Computation structures, parallel processing, software design, software performance.

I. INTRODUCTION

THIS paper is concerned with the introduction of engineering design methods applied to the design of software. A major difficulty in current day software design is the lack of applied modeling techniques that can be used by a programmer to evaluate the dynamics of his programs or the use of system resources. As a result, programs are created based upon the skill of a programmer who has mastered to some unknown degree a set of programming techniques on a specific machine or in a specific language. Little effort or interest is generally shown in investigating the performance properties of the created program in order to either document or improve its execution performance. This lack of effort can be traced to the programmer's training, which usually lacks (even for engineers) an understanding of the modeling concepts necessary for software design.

Manuscript received August 5, 1975. This work was supported in part by the National Science Foundation under Grants GJ 32-300 and DCR 75-00084.

The authors are with the Department of Electrical Engineering and Computer Science, University of Connecticut, Storrs, Conn.

With the current trends in system level organization toward more parallel types of computing bases (i.e., horizontal microprogramming and distributed microprocessor systems), and the increase in real-time applications of mini- and microprocessors, an understanding of the execution-time properties of programs and software systems is essential, in order for one to assess accurately the influence of individual software components on overall program performance. What is needed is a modeling format that will provide a basis for evaluating a program's execution time behavior. Such a model must represent both the
