PRESENTATION: International Conference On Software Testing, Analysis & Review, Nov 8-12, 1999, Barcelona, Spain. Session T12, Thursday, Nov 11, 1999: Focused Integration Testing in a Large System: A Case Study. Paul French & Deirdre Donovan.



  • 1

    Focused Integration Testing

    Paul French, Deirdre Donovan

  • 2

    Overview

    • Process Improvement Initiatives to find defects earlier in the Software Lifecycle in Large Systems

    • Focused Feature Testing

  • 3

    Product Background

    • GSM Operation and Maintenance Centre (OMC) launched in 1992

    • Graphical User Interface (GUI)

  • 4

    Large System Feature Development

    Feature A

    Feature C

    Feature D

    Feature B

    Feature C

    Feature E

    Subsystem 1 Subsystem 2

    PRODUCT

    • Multiple Subsystems, Multiple Features, Developed Separately

    • Feature/Subsystem Integration test performed after code freeze

    • Features merged prior to Subsystem Test

  • 5

    Software Lifecycle

    [V-model diagram: System Requirements, Feature Requirements, Functional Requirements/High Level Design, Low Level Design and Code on the descending arm; Unit Test, Development Feature/Integration Test, Subsystem Test and System Test on the ascending arm]

  • 6

    Phase Containment Effectiveness

    [Bar chart: Planned vs Actual defect containment (0%-70%) per phase: Inspections, Development, Subsystem, System, Customer]

  • 7

    Problem Analysis

    • Escaped Defect Analysis

    • Post Mortem

  • 8

    Escaped Defect Analysis

    [Bar chart: escaped defects by category (Integration, Coverage, Usability, Others) and priority (P1-P4)]

    • Defects missed due to:

    – Late subsystem integration

    – Insufficient test coverage

    – Usability issues

    – Others

    • e.g. require a large configuration, etc.

  • 9

    Proposal

    • Process Improvements to enable early detection of defects per feature

    • Indication mechanism for remaining defects and a more focused approach to testing

  • 10

    Process Improvements

    • Early Feature Integration

    • Improved Development Feature Testing

  • 11

    Early Feature Integration

    • Subsystem Integration by Feature

    – Develop feature on each subsystem

    – Perform Development Feature Integration testing with both subsystems instead of stubs

    – Captures functional defects of the feature as a standalone, including

    • interface defects

    • remote subsystem behaviour defects

  • 12

    Improved Development Feature Testing

    • Use expertise of independent test groups to write the Development Feature Test Plan

    – Captures development perspective by input to inspections

    – More extensive

    – Test expertise

    – Avoids cognitive dissonance

  • 13

    Remaining Defects

    • When all features are combined and tested, remaining defects are due to:

    – Feature interaction

    • Feature Testing based on code merge

    – Independent test perspective

    • Regression Testing

    • Ad-hoc/Customer Focused Testing

  • 14

    Feature Interaction

    [Venn diagram: Test Plan A covers Feature A, Test Plan B covers Feature B; the overlap is the merged code]

  • 15

    Code Merge

    For each Feature

    – Compute % code merge

    % Code Merge = (# files modified more than once) / (total # files modified)
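The slide's metric can be sketched in a few lines of Python. The file names and modification counts below are hypothetical examples, not data from the case study:

```python
# Sketch of the slide's % Code Merge metric: the fraction of a feature's
# modified files that were modified more than once in the release.
# File names and counts are hypothetical.

def percent_code_merge(modification_counts):
    """modification_counts maps each modified file to the number of times
    it was modified; files touched more than once count as merged."""
    total = len(modification_counts)
    if total == 0:
        return 0.0
    merged = sum(1 for n in modification_counts.values() if n > 1)
    return 100.0 * merged / total

counts = {"mib.c": 3, "gui_main.c": 1, "proto.c": 2, "alarms.c": 1}
print(percent_code_merge(counts))  # 2 of 4 files merged -> 50.0
```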

  • 16

    Results of Process Initiatives

    • Applied to two similar features

    • Compared to the original feature, where the improvements had not been applied

  • 17

    Defects Vs Phase

    [Line chart: % defects found (0%-70%) per phase (Inspections, Development, Subsystem Test, System Test, Field) for Rel 0, Rel 1, Rel 2 and Planned]

  • 18

    Feature Interaction Defects

    • % code overlap of the features analysed was 2% and 5% respectively

    • Re-running the Feature Test Plan found few functional defects

    • Remaining defects were found due to the independent test perspective

  • 19

    Future Work

    [Scatter plot: % Code Merge vs Total Defects (0-60) for Rel 0, Rel 1, Rel 2 and three other features]

  • 20

    Conclusions

    • Early defect discovery facilitated by:

    – Early Feature Integration

    – Early use of Independent Test Plan

    • Assuming process improvements have been applied:

    – Low % code merge may suggest

    • Regression test plus alternative test methods

    – Knowledge of the program files modified most often as a result of merge can assist in prioritising testing on the most problematic areas


    Focused Integration Testing in a Large System: A Case Study

    by Paul French, Deirdre Donovan

    OMC Group, GSM Systems Division

    Motorola, Cork

    [email protected]

    [email protected]

    ABSTRACT - Large software system releases usually consist of a number of individual features. In a system that consists of multiple distinct hardware components, the functionality of these features is very often spread over more than one component. GSM is such a system. The Operations and Maintenance Centre - Radio (OMC-R) is a network management product for the GSM mobile market. It has been in the field since 1992. It manages radio Base Station Systems (BSSs) and Remote Transcoders (RXCDRs).

    Until now, software releases were developed with both sides (OMC-R and BSS) developing all the features of a release individually. The features were then integrated on each hardware component. Finally the two components were put together and system integration testing was performed on all features using the same software load.

    This process was changed when it was found that up to 80% of certain problems would have been found if integration testing had been performed on a per-feature basis with early input from independent test teams. Furthermore, an associated mechanism was developed for measuring feature interaction within a release, with the resulting data used to prioritise testing.

    This paper describes these process improvements and details the results as seen in recent releases.

    [email protected]@cork.cig.mot.com


    1. Introduction

    Figure 1 below shows a typical GSM architecture for a mobile phone network, which consists of an Operations and Maintenance Centre (OMC) and a number of network elements (NEs).

    [Figure diagram: OMC connected to network elements HLR, EIR, SMS, RXCDR, MSC/VLR, BSS 1, BSS 3]

    Figure 1 GSM architecture

    The Motorola OMC (OMC-R) provides network management functionality for many subsystems such as Motorola radio Base Station Subsystems (BSSs) and Remote Transcoders (RXCDR - Motorola only), in addition to the ability to interwork with other vendors' MSCs.

    Each software product release consists of multiple features, some of which require software to be written in more than one subsystem. For example, Feature C in Figure 2.

    Each feature/subsystem is developed separately by different groups or subgroups, which in some cases reside in geographically separate locations. This places a higher priority on the management of the interfaces between subsystems.

    At one of a number of pre-defined points during the project, code is frozen and the subsystem features are merged into a single software product. At this point, integration testing occurs on each feature, using all relevant subsystems. This is also generally the stage when this version of the product is handed over to the Independent Test Group for subsystem-level testing of the contained features.

    [Figure diagram: Software Product comprising Subsystem 1 (e.g. OMC) with Features A, C, D and Subsystem 2 (e.g. BSS) with Features B, C, E]

    Figure 2 System Software Product

    In this paper the current development process in place in our organisation is described, along with some of the quality metrics that are recorded. It goes on to examine the problems encountered and the analysis that was performed to locate their root causes. Some of the process improvement initiatives that were put in place to resolve some of the issues arising out of the problem analysis are then discussed. The results to date of these initiatives are described, along with conclusions and recommendations arising out of this exercise and future work that is to be carried out.

    2. Current Development Process

    At Motorola the software development follows the IEEE V-model [1] as shown in Figure 3, defining system and then lower level feature requirements, followed by a high level design and then low level design phase, and finally coding.

    [Figure diagram: V-model with System Reqs, Feature Reqs, HLD, LLD and Code on the descending arm; Unit Test, Dev Integ Test, Subsystem Test and System Test on the ascending arm]

    Figure 3 Software Development Process

    All products undergo various levels of testing before being finally released to a customer. This approach is designed to direct each testing effort to different points in the defect spectrum.

    Development Unit Testing is the first phase of testing which is performed. This is generally approached from a white box perspective, with test cases developed to exercise the internal logic of the code.

    Development Feature and Integration Testing involves grey box testing, with test cases written using both the functional requirements and the High Level Design. Testing at this level is designed to use stubs for external subsystems.

    Subsystem Feature Testing involves testing of features as part of released code containing many features and bug fixes. While focusing on the testing of the relevant OMC subsystem, any other subsystems which have an impact on this are used in the testing. This subsystem testing is performed by an Independent Verification and Validation group within the OMC group and is based on the Feature Requirements specification.

    System Level Testing occurs when all subsystems are brought together and the entire system is tested as a whole. This System Testing is performed by an Independent System Test group and is based on the System Requirement specifications.

    3 Problem Definition

    Phase Screening Effectiveness (PSE) is a metric that is used to determine the efficiency of the defect detection process in each software phase. It involves the collection of data on the numbers of defects of various severities that are found during each phase. As it typically costs significantly more to fix a defect found in a test phase than it does to find and fix it in development, it is considered desirable to find faults as early in the software development cycle as possible.
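The per-phase calculation behind this metric can be sketched as follows; the phase names match the paper's lifecycle, but the defect counts are illustrative, not the case study's data:

```python
# Illustrative phase containment computation: the share of all recorded
# defects that was caught in each lifecycle phase. Counts are made up.

def phase_containment(defects_by_phase):
    """Return each phase's percentage of the total defects found."""
    total = sum(defects_by_phase.values())
    return {phase: 100.0 * n / total for phase, n in defects_by_phase.items()}

found = {"Inspections": 40, "Development": 25, "Subsystem": 20,
         "System": 10, "Customer": 5}
for phase, pct in phase_containment(found).items():
    print(f"{phase}: {pct:.0f}%")
```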

    A planned Phase Containment Effectiveness target is defined for each project. This target takes into account previous data for similar features and shows that the majority of defects were found during the inspection and development phases of project development.

    In the feature which was examined (which was common to the OMC and BSS), there was a significant drop in the number of defects found in development testing, while the total found in subsystem testing of the feature increased. This can be seen in Figure 4.

    Figure 4 Planned Vs Actual PCE

    Our task was to move the primary defect detection phasesas far left in the graph as possible.

    4 Problem Analysis

    In order to determine why there was an increase in defect detection in subsystem test, two parallel courses of action were pursued.

    Firstly, each defect found by the test organisations subsequent to Development Testing was analysed with a view to ascertaining why these defects were not found earlier.

    In parallel, a Post Mortem was conducted on the feature with all the relevant organisations involved. A number of reasons were identified for the failure of the Development phase to contain these defects. These are illustrated in Figure 5.

    The first category into which many of the higher priority defects fell was the area of Integration. As was described earlier, at one of a number of pre-defined points during the project, code is frozen and a number of the subsystem features are merged into a single software product. As this is the first complete build of all these features, Integration Testing of features common to both the BSS and OMC subsystems, using actual subsystems instead of stubs, was not performed by the development organisations until the


    code had been frozen. This testing then proceeded in parallel with Independent Subsystem Testing of the features.

    [Figure chart: escaped defects by category (Integration, Coverage, Usability, Others) and priority (P1-P4)]

    Figure 5 Escaped Defect Analysis

    However, one of the disadvantages of such an approach was that the Independent Verification and Validation Group encountered some of the problems which this integration highlighted before they could be found and fixed by Development. On further investigation and analysis of each defect, it was believed that just testing the feature standalone using both subsystems could have caught many of the escaped defects. By making a minor alteration to the project lifecycle, this testing could be performed prior to the merging of this feature code with all other code changes for the code freeze.

    A second problem that came out of this analysis was the issue of Coverage. Test plans written by Development, although testing from much the same perspective, were not as complete as those developed by the test organisations. Also, developers running tests on code they have written are not always as objective or independent as they could be. Furthermore, they lacked the test expertise that the test groups had built up.

    Quite a number of the problems found by the test organisations, albeit the lower priority ones, fell under the general heading of Usability. These related to the look and feel of the product and how easily our customers could use it. Many problems were identified with functionality which, although correct, was not easy to use.

    5 Proposals

    On completion of the analysis, a solution was proposed to improve both the early detection of defects and to provide a more focused approach to Subsystem Testing.

    In the first instance, two process improvements were proposed to enable early detection of defects at a per-feature level. Secondly, an enabling mechanism was proposed which would give an indication of where the remaining defects in the release were, together with a more focused testing approach to find them. Focused Testing in this instance refers to guidelines for directing testing towards potentially more problematic areas of code.

    5.1 Process Improvements

    Two main process improvements arose out of this analysis.

    The first tackled the issue of subsystems being integrated too late to find defects prior to handover to the test organisations.

    The purpose of the second was to tackle the inadequacies of the Development Feature Test Plan.

    These process improvements will now be dealt with in more detail.

    5.1.1 Early Feature Integration

    The analysis had shown that using pre-code freeze versions of subsystems was more productive (in finding defects) than using stubs. For features common to more than one subsystem, by defining and freezing the subsystem interfaces at an early phase, these features could be integration tested with pre-code freeze versions of their subsystems. Earlier findings had indicated that feature testing in this environment would be more productive than performing development feature integration testing when all code on each subsystem has been frozen.

    It would involve two main steps:

    • Development groups would continue to develop the feature on each subsystem.

    • They would then perform Development Feature Testing with early versions of both subsystems instead of using stubs.

    If the functional defects of the feature as a standalone were eliminated or cut down significantly prior to system code freeze, the Independent Test Group could concentrate on other parts of the defect spectrum.


    5.1.2 Improved Development Feature Testing

    The second process improvement involved using a subset of a test plan developed by the independent subsystem test group for Development Feature Testing. Due to the nature of the features in question (GUI based, with very little interaction with lower level system processes), Development Feature Testing has taken on the form of grey/black box testing, with test cases written using both the functional requirements and the High Level Design. In some cases the form these test cases have taken is very similar in perspective to that of subsystem-level feature testing.

    By including the development team in the inspection of the Feature Test Plan, the small number of grey-box type tests which would have been unique to a development test plan were not lost. These may include specific tests for a piece of code which the developer may consider particularly high risk.

    The advantages of adopting such an approach werenumerous.

    • The test plans developed by the Independent Test Team were typically more extensive, including areas such as usability testing, and drew on test expertise built up over the years by the team. By working more closely with development to prepare and review these test plans, the spread of this knowledge into the development organisation was facilitated.

    • The independent perspective in writing the documents helps overcome a tendency which some developers have to test that their product works, as opposed to trying to break it.

    • The focus of the Independent Test Team could be redirected to alternative testing mechanisms. Some initiatives in this area are detailed in [2].

    5.2 Post Code Freeze Feature Interaction

    The process initiatives described above help detect defects earlier in the software cycle on a per-feature basis. At system code freeze, it was believed that feature interaction, i.e. the effect of one feature on another, was an area in which further defects could be found.

    Code merge was used as a measure of feature interaction. Code merging means that a software module (program file) may have been modified by more than one feature. Figure 6 illustrates a software subsystem release with two features, namely A and B. Test Plan A tests Feature A. Test Plan B tests Feature B. When the software product is released, a single code base is produced from the files modified in Features A and B. The intersection of the two circles represents program files modified by both features, i.e. the merged code.

    The merged program file(s) will not have been tested until after the software product is released to the independent test group.

    [Figure diagram: two overlapping circles, Feature A under Test Plan A and Feature B under Test Plan B; the overlap is the merged code]

    Figure 6 Subsystem Feature Interaction

    The % code merge was computed by dividing the number of program files modified more than once by the total number of files modified by the feature.

    % Code Merge = (# files modified more than once) / (total # files modified)

    Such an approach is based on the assumption that any program file modified more than once has an increased likelihood of defects arising from the changes.
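One way to read the definition above is as a set intersection over per-feature file sets, as in Figure 6. The feature names and file sets in this sketch are hypothetical, not taken from the case study:

```python
# Sketch: code merge as the overlap between the file sets modified by
# each feature in a release. Feature names and file sets are hypothetical.

feature_files = {
    "A": {"mib.c", "gui_main.c", "proto.c"},
    "B": {"proto.c", "alarms.c"},
}

def pct_code_merge(feature, feature_files):
    """% of this feature's modified files that were also modified by
    another feature (i.e. modified more than once in the release)."""
    files = feature_files[feature]
    merged = {f for f in files
              if sum(f in other for other in feature_files.values()) > 1}
    return 100.0 * len(merged) / len(files)

print(round(pct_code_merge("A", feature_files)))  # proto.c shared: 1/3 -> 33
print(round(pct_code_merge("B", feature_files)))  # proto.c shared: 1/2 -> 50
```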

    Using this mechanism may assist in determining the most productive type of post-development testing.

    A low % code merge computation for a particular feature may indicate that the feature has a low impact on the overall subsystem. Therefore, it may be assumed that the system integration testing performed prior to code freeze was sufficient to test the functionality of the feature.

    A high % code merge computation for a particular feature may indicate that the feature has a high impact on the overall subsystem. This would indicate that re-running the full feature integration test plan may be appropriate, because the merged code would not have been tested. Furthermore, knowing which program files were modified more than once can help prioritise feature tests and allow the test organisation to focus on the tests relating to the areas of high overlap.
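The prioritisation idea can be sketched as a simple ranking of files by how often they were modified, so the highest-churn files are tested first. File names and counts here are hypothetical:

```python
# Sketch of test prioritisation by merge count: files modified most often
# in the release are tested first. File names and counts are hypothetical.

modification_counts = {"mib.c": 4, "proto.c": 2, "gui_main.c": 1, "alarms.c": 3}

# Highest-churn files first; single-change files fall to the back of the queue.
priority = sorted(modification_counts, key=modification_counts.get, reverse=True)
print(priority)  # ['mib.c', 'alarms.c', 'proto.c', 'gui_main.c']
```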


    To put it another way, if a feature has low code merge, then the likelihood of finding functional defects by re-running the feature test plan will be low. In this case, it may prove more efficient to run a subset of feature tests (i.e. regression) and concentrate on other non-requirement-based tests addressing other areas of the defect spectrum, where the independent perspective of the test engineer may yield greater results.

    Currently, the lack of available data points means that this is only a crude indicator of defect density. Work is continuing to examine this link between code merge and defects in the software product in more detail, and it may be the subject of a future presentation.

    Although there are a number of useful off-the-shelf tools available to perform similar code analysis, the advantage of this method is that it is performed after code freeze and takes into account the way internal projects are structured and how features are divided for development [3].

    6 Results

    These process initiatives have been applied to two features that are common to each of the OMC and BSS subsystems and have been released to customers.

    These features are of similar size, complexity and functionality. The results obtained from these two features will be compared to the data obtained from the original feature (the analysis of which led to these improvements).

    6.1 Defects Vs Phase

    The Rel 0 line in Figure 7 illustrates the feature in which the process improvements were not applied; as was described earlier, a large number of defects were found after handover to the Independent Test Group.

    The other two lines (Rel 1 and Rel 2) illustrate the two features in which the process initiatives were applied, and show clearly how a significant portion of the defect discovery phase has been moved back to before handover to the Independent Test Group.

    It is worth noting that the total number of defects for all three features was similar. Thus, without losing out on the overall effectiveness of the test process, the planned Phase Containment Effectiveness within development has been exceeded.

    6.2 Feature Interaction Defects

    The % code merge was calculated for the two features analysed, and in both cases it was found to be very low, namely 2% and 5% respectively.

    Figure 7 Defects Vs Phase

    After handover, the Independent Test Group re-ran the test plans that were run in the Development Integration Test. As was expected, this did not produce many defects in the two features analysed. In both these cases, the computation of % code merge would have indicated that Regression Testing (instead of full Feature Testing) would have sufficed to test the functionality of the feature. Independent Testing could then have been redirected to other areas of the defect spectrum.

    6.3 Future Work

    Work is currently underway to determine the full extent of the linkage between multiple file modifications and defect density. The exact threshold figure for "low" code merge, which would signify an opportunity to replace some requirements-based tests with alternative testing methods, cannot yet be accurately predicted. Work is continuing to attempt to quantify the impact of multiple changes to a single file, as initial findings seem to indicate a linkage between multiple file changes and the number of defects that later need to be fixed in these files.



    Work has also begun on looking at these relationships in some other features in which our process improvement initiatives were not applied. The data so far from these investigations are illustrated in Figure 8. More data points are, however, required to prove the effectiveness of using % code merge as a measure of feature interaction.

    Figure 8. %Code Merge Vs Total Defects

    Also, code merge in this paper has been defined as the number of times a program file has been modified during system code merge and, as such, is a crude technique for estimating the quality of a feature in a release. The calculation of code merge will be enhanced in future work to take into account other factors such as:

    (a) the number of lines merged

    (b) a weighting factor for the merged code, e.g. code that is used by all parts of the system will have a higher weighting than localised functionality code.

    7 Conclusions

    A number of conclusions can be made as a result of this study. These conclusions are detailed below.

    In large systems where features are common to multiple subsystems and developed independently, performing integration testing on a per-feature basis with all relevant subsystems can find defects earlier in the development cycle.

    In such situations, where Development and Independent Feature Testing both test this common feature from a black box perspective, using a Test Plan developed by the independent test team can enhance development test coverage and find defects in a phase where they are relatively inexpensive to fix.

    Determining the extent of code merging prior to commencement of the subsystem test phase can provide a useful indication of how much of the test plan should be re-run, thus possibly freeing up testers' time to focus on alternative test methods. Knowledge of the areas of highest code merge can help to focus testing on the areas of highest risk. The full extent of this linkage is the subject of further investigation.

    References

    [1] IEEE Standard for Developing Software Lifecycle Processes, IEEE 1074-1997.

    [2] "Customer Focused Testing: Linking Customers with Testing", N. Barrett, S. Martin, P. Harrington. EuroSTAR '98 Symposium.

    [3] http://www.macabe.com


  • Paul French & Deirdre Donovan

    Paul French
    B.E (1981) and M.Eng.Sc (1984) from University College Cork.

    Senior Staff Engineer with the development group in Motorola Cork.

    Initially worked ('84 to '87) in the mobile switch development group, concentrating on call processing and man-machine interface areas.

    From 1988-1998, worked on the development of the Motorola GSM Network Management product, specifically on the Management Information Base (MIB) and communications protocols.

    From 1998 to present, working on the specification and development of the Motorola GPRS network management product.

    Deirdre Donovan
    BSc in Computer Science & Maths from University College Cork (1995).

    Senior Software Test Engineer with Motorola Cork. From 1995-1998, worked on Sub-system Testing of the Motorola GSM Network Management product. From 1998 to early 1999, involved in sub-system testing of the Motorola GPRS Operations & Maintenance Centre. Currently co-ordinating part of the System Test effort for the GSM Operations & Maintenance Centre and associated products.
