Useful Software Engineering Research: Leading a Double-Agent Life

Lionel Briand IEEE Fellow FNR PEARL Chair Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT) University of Luxembourg APSEC, Bangkok, December 3rd, 2013

SnT Centre

•  SnT centre, Est. 2009: Interdisciplinary, ICT security, reliability, and trust (SnT)

•  Luxembourg city

•  220 scientists and Ph.D. candidates, 20 industry partners

•  SVV Lab: Established January 2012, www.svv.lu

•  25 scientists (Research scientists, associates, and PhD candidates)

•  Industry-relevant basic and applied research on system dependability: security, safety, reliability

•  Six partners: Cetrel, CTIE, Delphi, SES, IEE, Hitec, …

Software Everywhere

World Software Market:

•  $296 Billion in 2013

•  $360 Billion in 2016

•  Annual growth of 6%

Failures Everywhere …

McKinsey, U. Oxford, 2012:

•  5400 large IT projects (> $15M)

•  17% threatened the very existence of companies

•  On average: 45% over budget, 56% less value

Software Engineering Research Funding & Relevance

•  Software Engineering (SE) research should be a top priority given its importance in society.

•  But that is not the case anymore (with a few exceptions).

•  Symptoms

–  Listed priorities by research councils, relative funding

–  University hiring, few software engineering departments

–  Large centers or institutes being established or closed down

•  May be partly related to (perceived) lack of relevance?

–  Industry participation in leading SE conferences

–  Application/Experience tracks not first class citizens

–  A very small percentage of research work is ever used or even assessed on real industrial software

–  Impact project (ACM SIGSOFT)

Basili’s and Meyer’s Take

•  Many (most?) of the advances in software engineering have come out of non-university sources

•  “Academic research has had its part, honorable but limited.” (Meyer)

•  Large scale labs don’t get funded, like they do in other engineering and scientific disciplines (Basili, Meyer)

•  One significant difference though is that we cannot entirely recreate the phenomena we study within four walls

•  Question: What is our responsibility in all this?

Setting the Stage

Engineering Research

•  “Engineering: The application of scientific and mathematical principles to practical ends such as the design, manufacture, and operation of efficient and economical structures, machines, processes, and systems.” (American Heritage Dictionary)

•  Engineering research: innovative engineering solutions

–  Problem driven

–  Real world requirements

–  Scalability

–  Human factors, where it matters

–  Economic tradeoffs and cost-benefit analysis

Pasteur’s Quadrant

Quest for Fundamental Understanding? | Consideration of Use? No | Consideration of Use? Yes
Yes | Pure Basic Research (Bohr) | Use-inspired Basic Research (Pasteur)
No | - | Pure Applied Research (Edison)

Donald Stokes, Pasteur’s Quadrant, 1997

Software Engineering Research

•  Pure basic research in software engineering?

•  As an engineering discipline, all research work should contribute to:

–  Knowledge discovery but also …

–  Innovative (software) engineering solutions

•  Pasteur’s quadrant: Research should be driven by utility and its results be put to use …

•  This requires a knowledge of engineering practice

•  Are the majority of projects and papers doing so?

•  If not, why is that? What can we do about it?

Research Example

A Representative Example

•  Parnin and Orso (ISSTA, 2011) looked at automated debugging techniques

•  50 years of automated debugging research

•  Only 5 papers have evaluated automated debugging techniques with actual programmers

•  Focus since ~2001: dozens of papers ranking program statements according to their likelihood of containing a fault

•  Experiment

–  How do programmers use the ranking?

–  Do they see the bugs?

–  Is the ranking important?

Results from Parnin and Orso’s Study

•  Only low performers strictly followed the ranking

•  Only one out of 10 programmers who checked a buggy statement stopped the investigation

•  Automated support did not speed up debugging

•  Developers wanted explanations rather than recommendations

•  We cannot abstract the human away in our research

•  “… we must steer research towards more promising directions that take into account the way programmers actually debug in real scenarios.”

What Happened?

•  How people debug and what information they need is poorly understood

–  Probably varies a great deal according to context and skills

•  Researchers focused on providing a solution that was a mismatch for the actual problem

•  That line of research became fashionable: a lot of (cool) ideas could be easily applied and compared, without involving human participants

•  Resulted in many, many papers …

•  Pasteur’s quadrant: That idea was never really put to use …

•  Similar example: ISSTA 2012 paper on learning program invariants (Daikon) by Matt Staats et al.

•  Many other examples

Reconciling Relevance and Science in Software Engineering

-- Implementing Pasteur’s Quadrant

Objectives

•  How to implement Pasteur’s quadrant in software engineering?

•  Go through recent and successful projects with industry partners

–  Simula Research Laboratory, Oslo, Norway

–  SnT centre, U. of Luxembourg

•  Summarize what happened, our experience

•  Two projects, in the automotive domain, described in more detail

•  Draw conclusions and lessons learned

–  Patterns for successful research

–  Challenges and possible solutions

Mode of Collaboration

Adapted from Gorschek et al., IEEE Software 2006

Projects Overview (< 5 years)

Company | Domain | Objective | Notation | Automation
ABB | Robot controller | Func. Safety Testing | UML | Model analysis for coverage criteria
Cisco | Video conference | Testing (robustness) | UML profile | Metaheuristic search
Kongsberg Maritime | Fire and gas safety control system | Safety certification | SysML + traceability | Model slicing algorithm
Kongsberg Maritime | Oil & gas, safety critical drivers | CPU usage analysis | UML + MARTE | Constraint solver
FMC | Subsea system | Automated configuration | UML profile | Constraint solver
WesternGeco | Marine seismic acquisition | Func. Testing | UML profile + MARTE | Metaheuristic search
DNV | Marine and Energy, certification body | Compliance with safety standards | UML profile | Constraint verification
SES | Satellite operator | Func. Testing | UML profile | Metaheuristic search
Delphi | Automotive systems | Testing (safety + performance) | Matlab/Simulink | Metaheuristic search
Lux. Tax department | Legal & financial | Legal Req. QA & testing | UML profile | Under investigation

Project Example 1: Cisco

•  Context: Video conferencing systems

•  Original scientific problem: Modeling and model-based test case generation, oracle, coverage strategy

•  Practical observation: Access to test network infrastructure limited (emulate network traffic, etc.). Models get too large and complex.

•  Modified research objectives: (1) How to select an optimal subset of test cases matching the time budget, (2) Modeling cross-cutting concerns

•  References: Hemmati et al. (2010), Ali et al. (2011)

Project Example 2: Siemens

•  Context: 3-D image segmentation algorithms for medical applications (Siemens)

•  Original scientific problem: Define specific test strategies for (constantly evolving) segmentation algorithms

•  Practical observations: Test results are validated by using highly specialized medical experts. Expensive and slow. No obvious test oracle: non-testable systems

•  Modified research objective: Learning oracles for image segmentation algorithms in medical applications. Machine learning.

•  Reference: Frounchi et al. (2011)

Project Example 3: FMC

•  Context: Subsea integrated control systems

•  Original scientific problem: integration in systems of systems

•  Practical observations: Each subsea installation is unique (variant), the software configuration is extremely complex (thousands of configuration parameters), > 50% configuration faults

•  Modified research objective: Real-time, automated support to the configuration process using domain specific product line modeling (focus on HW-SW dependencies) and constraint solving: instant validation, guidance, inferred values

•  Reference: Behjati et al. (2012)

Project Example 4: Kongsberg Maritime

•  Context: safety-critical embedded systems in the energy and maritime sectors, e.g., fire and gas monitoring, process shutdown, dynamic positioning

•  Original scientific problem: Model-driven engineering for failure-mode and effect analysis

•  Practical observations: Certification meetings with third-party certifiers: Certification is lengthy, expensive, etc. Requirements-Design decisions traceability in large complex systems is an issue.

•  Modified research objective: Traceability between safety requirements and system design decisions. Solution based on SysML and a simple traceability language along with model slicing.

•  Reference: Sabetzadeh et al. (2011), Nejati et al. (2012)

Detailed Example 1

Testing Closed Loop Controllers

R. Matinnejad et al., 2013

Complexity and amount of software used on vehicles’ Electronic Control Units (ECUs) grow rapidly

More functions

Comfort and variety

Safety and reliability

Faster time-to-market

Less fuel consumption

Greenhouse gas emission laws

Model-based control system development has three major stages

•  Model-in-the-Loop Stage: Simulink Modeling, Generic Functional Model, MiL Testing

•  Software-in-the-Loop Stage: Code Generation and Integration, Software Running on ECU, SiL Testing

•  Hardware-in-the-Loop Stage: Software Release, HiL Testing

Major Challenges in MiL-SiL-HiL Testing

•  Manual test case generation

•  Complex functions at MiL, and large and integrated software/embedded systems at HiL

•  Lack of precise requirements and testing objectives

•  Test selection at HiL

MiL Testing

Requirements

The ultimate goal of MiL testing is to ensure that individual functions behave correctly and in a timely manner on any hardware configuration

Individual Functions

A Taxonomy of Automotive Functions

•  Controlling

–  State-Based: State machine controllers

–  Continuous: Closed-loop controllers (PID)

•  Computation

–  Transforming: unit converters

–  Calculating: calculating positions, duty cycles, etc.

Different testing strategies are required for different types of functions

Dynamic Continuous Controllers are present in many embedded systems including ECUs

Controller Plant Model and its Requirements

[Figure: closed-loop block diagram. The error between the desired value and the actual value (the fed-back system output) is the input to the Controller (SUT), which drives the Plant Model.]

[Figure: desired value and actual value plotted over time for three requirements: (a) Liveness, (b) Smoothness, (c) Responsiveness.]

Search Elements

•  Search:

–  Inputs: Initial and desired values, configuration parameters

–  (1+1) EA

•  Search Objective:

–  Example requirement that we want to test: liveness, i.e., |Desired - Actual(final)| ~= 0

–  For each set of inputs, we evaluate the objective function over the resulting simulation graphs

•  Result:

–  worst case scenarios or values to the input variables that are more likely to break the requirement at MiL level

–  stress test cases based on actual hardware (HiL)
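The search elements above can be illustrated with a minimal sketch of a (1+1) EA that looks for inputs maximizing the liveness deviation |Desired - Actual(final)|. Everything concrete here is an assumption made for illustration and not the controller or tooling used in the study: the toy PI controller and first-order plant in `simulate_final_value`, the gains, the input range [0, 1], and the Gaussian mutation step `sigma`.

```python
import random


def simulate_final_value(initial, desired, kp=0.8, ki=0.3, dt=0.05, steps=200):
    """Toy closed loop (hypothetical): a PI controller driving a first-order plant."""
    actual, integral = initial, 0.0
    for _ in range(steps):
        error = desired - actual              # desired value minus actual value
        integral += error * dt
        control = kp * error + ki * integral  # controller output
        actual += dt * (control - 0.5 * actual)  # simple plant dynamics (Euler step)
    return actual


def liveness_deviation(inputs):
    """Objective to maximize: |Desired - Actual(final)| for one set of inputs."""
    initial, desired = inputs
    return abs(desired - simulate_final_value(initial, desired))


def one_plus_one_ea(objective, start, sigma=0.01, iterations=300, seed=0):
    """(1+1) EA: keep one candidate, mutate it, keep the child if it is not worse."""
    rng = random.Random(seed)
    best, best_score = start, objective(start)
    for _ in range(iterations):
        # Gaussian mutation of every input, clamped to the assumed range [0, 1].
        child = tuple(min(1.0, max(0.0, x + rng.gauss(0.0, sigma))) for x in best)
        score = objective(child)
        if score >= best_score:
            best, best_score = child, score
    return best, best_score


if __name__ == "__main__":
    worst_inputs, deviation = one_plus_one_ea(liveness_deviation, (0.2, 0.8))
    print("worst-case inputs found:", worst_inputs, "deviation:", deviation)
```

A larger `sigma` makes the search more explorative and a smaller one more exploitative; which setting works better depends on the landscape of the region being searched.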

MiL-Testing of Continuous Controllers

[Figure: overview diagram with the elements Controller-plant model, Objective Functions, Exploration, Overview Diagram, List of Regions, Domain Expert, Local Search, and Test Scenarios; alongside it, a plot of desired value and actual value over time and a grid with axes Initial Desired and Final Desired, each ranging from 0.0 to 1.0.]

Conclusions

•  Addressed a testing problem that has been largely ignored

•  We found much worse scenarios during MiL testing than our partner had found so far

•  Much worse than random search

•  They are running them at the HiL level, where testing is much more expensive: MiL results -> test selection for HiL

•  But further research is needed:

–  To deal with the many configuration parameters

–  To dynamically adjust search algorithms in different subregions with different landscapes

i.e., 31s. Hence, the horizontal axis of the diagrams in Figure 8 shows the number of iterations instead of the computation time. In addition, we start both random search and (1+1) EA from the same initial point, i.e., the worst case from the exploration step.

Overall in all the regions, (1+1) EA eventually reaches its plateau at a value higher than the random search plateau value. Further, (1+1) EA is more deterministic than random, i.e., the distribution of (1+1) EA has a smaller variance than that of random search, especially when reaching the plateau (see Figure 8). In some regions (e.g., Figure 8(d)), however, random reaches its plateau slightly faster than (1+1) EA, while in some other regions (e.g. Figure 8(a)), (1+1) EA is faster. We will discuss the relationship between the region landscape and the performance of (1+1) EA in RQ3.

RQ3. We drew the landscape for the 11 regions in our experiment. For example, Figure 9 shows the landscape for two selected regions in Figures 7(a) and 7(b). Specifically, Figure 9(a) shows the landscape for the region in Figure 7(b) where (1+1) EA is faster than random, and Figure 9(b) shows the landscape for the region in Figure 7(a) where (1+1) EA is slower than random search.

Fig. 9. Diagrams representing the landscape for two representative HeatMap regions: (a) Landscape for the region in Figure 7(b). (b) Landscape for the region in Figure 7(a).

Our observations show that the regions surrounded mostly by dark shaded regions typically have a clear gradient between the initial point of the search and the worst case point (see e.g., Figure 9(a)). However, dark regions located in a generally light shaded area have a noisier shape with several local optima (see e.g., Figure 9(b)). It is known that for regions like Figure 9(a), exploitative search works best, while for those like Figure 9(b), explorative search is most suitable [10]. This is confirmed in our work where for Figure 9(a), our exploitative search, i.e., (1+1) EA with σ = 0.01, is faster and more effective than random search, whereas for Figure 9(b), our search is slower than random search. We applied a more explorative version of (1+1) EA where we let σ = 0.03 to the region in Figure 9(b). The result (Figure 10) shows that the more explorative (1+1) EA is now both faster and more effective than random search. We conjecture that, from the HeatMap diagrams, we can predict which search algorithm to use for the single-state search step. Specifically, for dark regions surrounded by dark shaded areas, we suggest an exploitative (1+1) EA (e.g., σ = 0.01), while for dark regions located in light shaded areas, we recommend a more explorative (1+1) EA (e.g., σ = 0.03).

6 Related Work

Testing continuous control systems presents a number of challenges, and is not yet supported by existing tools and techniques [4, 1, 3]. The modeling languages that have been

Detailed Example 2

Minimizing CPU Time Shortage Risks in Integrated Embedded Software

S. Nejati et al., 2013

Today’s cars rely on integrated systems

•  Modular and independent development

•  Many opportunities for division of labor and outsourcing

•  Need for reliable and effective integration processes

Integration process in the automotive domain

[Figure: AUTOSAR models and software (sw) runnables from separate sources, integrated through glue code.]

CPU Time Shortage in Integrated Embedded Software

•  Static cyclic scheduling: predictable, analyzable

•  Challenge

–  Many OS tasks and their many runnables run within a limited available CPU time

•  The execution time of the runnables may exceed their time slot

•  Our goal

–  Reducing the maximum CPU time used per time slot to be able to

•  Minimize the hardware cost

•  Reduce the probability of overloading the CPU in practice

•  Enable addition of new functions incrementally

Fig. 4. Two possible CPU time usage simulations for an OS task with a 5ms cycle, shown over time slots from 5ms to 40ms: (a) Usage with bursts, and (b) Desirable usage.

its corresponding glue code starts by a set of declarations and definitions for components, runnables, ports, etc. It then includes the initialization part followed by the execution part. In the execution part, there is one routine for each OS task. These routines are called by the scheduler of the underlying OS in every cycle of their corresponding task. Inside each OS task routine, the runnables related to that OS task are called based on their period. For example, in Figure 3, we assume that the cycle of the task o1 is 5ms, and the period of the runnables r1, r2, and r3 are 10ms, 20ms and 100ms, respectively. The value of timer is the global system time. Since the cycle of o1 is 5, the value of timer in the Task o1() routine is always a multiple of 5. Runnables r1, r2 and r3 are then called whenever the value of timer is zero, or is divisible by the period of r1, r2 and r3, respectively.
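The dispatching rule described for the Task o1() routine can be sketched as follows. This is only an illustration of the rule in the excerpt, not the actual glue code (which is not shown here); the function name `task_o1` and the dictionary layout are hypothetical, while the cycle and periods are the ones quoted from the Figure 3 example.

```python
TASK_CYCLE_MS = 5                                # cycle of OS task o1
PERIODS_MS = {"r1": 10, "r2": 20, "r3": 100}     # periods of its runnables


def task_o1(timer_ms, periods=PERIODS_MS):
    """Return the runnables called at this timer value.

    A runnable is called when the timer is zero or divisible by its period
    (zero is divisible by every period, so one modulo test covers both cases).
    """
    return [name for name, period in periods.items() if timer_ms % period == 0]


def trace(horizon_ms, cycle_ms=TASK_CYCLE_MS):
    """Simulate the scheduler invoking the task routine once per cycle."""
    return {t: task_o1(t) for t in range(0, horizon_ms, cycle_ms)}


if __name__ == "__main__":
    for timer, called in trace(45).items():
        print(f"timer={timer:3d}ms ->", called)
```

At timer value 0 all three runnables are called in the same time slot, which is exactly the kind of CPU usage burst that motivates shifting runnables by offsets.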

Although AUTOSAR provides a standard means for OEMs and suppliers to exchange their software, and essentially enables the process in Figure 1, the automotive integration process still remains complex and erroneous. A major integration challenge is to minimize the risk of CPU shortage while running the integrated system in Figure 1. Specifically, consider an OS task with a 5ms cycle. Figure 4 shows two possible CPU time usage simulations of this task over eight time slots between 0 to 40ms. In Figure 4(a), there are bursts of high CPU usage at two time slots at 0ms and 35ms, while the CPU usage simulation in Figure 4(b) is more stable and does not include any bursts. In both simulations, the total CPU usage is the same, but the distribution of the CPU usage over time slots is different. The simulation in Figure 4(b) is more desirable because: (1) It minimizes the hardware costs by lowering the maximum required CPU time. (2) It facilitates the assignment of new runnables to an OS task, and hence, enables the addition of new functions as it is typically done in the incremental design of car manufacturers. (3) It reduces the possibility of overloading CPU as the CPU time usage is less likely to exceed the OS task cycle (i.e., 5ms) in any time slot. Ideally, a CPU usage simulation is desirable if in each time slot, there is a sufficiently large safety margin of unused CPU time. Due to inaccuracies in estimating runnables’ execution times, it is expected that the unused margin shrinks when the system runs in a real car. Hence, the larger is this margin, the lower is the probability of exceeding the limit in practice.

In this paper, we study the problem of minimizing bursts of CPU time usage for a software system composed of a large number of concurrent runnables. A known strategy to eliminate high CPU usage bursts is to shift the start time (offset) of runnables, i.e., to insert a delay prior to the start of the execution of runnables [5]. Offsets of the runnables must satisfy three constraints: C1. The offset values should not lead to deadline misses, i.e., they should not cause the runnables to run past their periods. C2. Since the runnables are invoked by OS tasks, the offset values of each runnable should be divisible by the OS task cycle related to that runnable. C3. The offset values should not interfere with data dependency and synchronization relations between runnables. For example, suppose runnables r1 and r2 have to execute in the same time slot because they need to synchronize. The offset values of r1 and r2 should be chosen such that they still run in the same time slot after being shifted by their offsets.
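A minimal sketch of how constraints C1 to C3 and the per-slot CPU usage could be checked for a given offset assignment is shown below. It rests on several assumptions that are not taken from the paper: the execution times are made-up values, each execution is assumed to fit inside the time slot in which it starts, C1 is simplified to "offset plus execution time stays within the period", and C3 is simplified to "synchronized runnables get equal offsets". All function and parameter names are hypothetical.

```python
CYCLE_MS = 5
PERIODS_MS = {"r1": 10, "r2": 20, "r3": 100}
EXEC_MS = {"r1": 0.2, "r2": 0.1, "r3": 0.15}   # hypothetical execution times


def satisfies_constraints(offsets, periods, exec_times, cycle, same_slot_pairs=()):
    """Check simplified versions of C1, C2 and C3 for an offset assignment."""
    for name, offset in offsets.items():
        # C1 (simplified): the shifted runnable must finish within its period.
        if offset < 0 or offset + exec_times[name] > periods[name]:
            return False
        # C2: the offset must be divisible by the OS task cycle.
        if offset % cycle != 0:
            return False
    # C3 (simplified, deliberately strict): synchronized runnables share an offset.
    return all(offsets[a] == offsets[b] for a, b in same_slot_pairs)


def slot_usage(offsets, periods, exec_times, cycle, horizon):
    """CPU time used in each time slot of length `cycle` up to `horizon` (ms)."""
    usage = [0.0] * (horizon // cycle)
    for name, offset in offsets.items():
        for start in range(offset, horizon, periods[name]):
            usage[start // cycle] += exec_times[name]
    return usage


if __name__ == "__main__":
    no_offsets = {"r1": 0, "r2": 0, "r3": 0}
    shifted = {"r1": 0, "r2": 5, "r3": 15}
    for label, offsets in (("no offsets", no_offsets), ("shifted", shifted)):
        usage = slot_usage(offsets, PERIODS_MS, EXEC_MS, CYCLE_MS, 100)
        print(label, "max per slot:", max(usage), "total:", round(sum(usage), 6))
```

In this toy setting the shifted assignment lowers the maximum usage per slot while the total CPU usage stays the same, which mirrors the difference between the bursty and the desirable simulations discussed above.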

There are four important context factors that are in line with AUTOSAR [13], and have influenced our work:

CF1. The runnables are not memory-bound, i.e., the CPU time is not significantly affected by the low-bound memory allocation activities such as transferring data in and out of the disk and garbage collection. Hence, our analysis of CPU time usage is not affected by constraints related to memory resources (see Section III-B).

CF2. The runnables are Offset-free [4], that is the offset of a runnable can be freely chosen as long as it does not violate the timing constraints C1-C3 (see Section III-B).

CF3. The runnables assigned to different OS tasks are independent in the sense that they do not communicate with one another and do not share memory. Hence, the CPU time used by an OS task during each cycle is not affected by other OS tasks running concurrently. Our analysis in this paper, therefore, focuses on individual OS tasks.

CF4. The execution times of the runnables are remarkably smaller than the runnables’ periods and the OS task cycles. Typical OS task cycles are around 1ms to 5ms. The runnables’ periods are typically between 10ms to 1s, while the runnables’ execution times are between 10ns = 10^-5 ms to 0.2ms.

Our goal is to compute offsets for runnables such that the CPU usage is minimized, and further, the timing constraints, C1-C3, discussed earlier above hold. This requires solving a constraint-based optimization problem, and can be done in three ways: (1) Attempting to predict optimal offsets in a deterministic way, e.g., algorithms based on real-time scheduling theory [6]. In general, these algorithms explore a very small part of the search space, i.e., worst/best case situations only (see Section V for a discussion). (2) Formulating the problem as a (symbolic) constraint model and applying a systematic constraint solver [14], [15]. Due to assumption CF4 above, the search space in our problem is too large, resulting in a huge constraint model that does not fit in memory (see Section V for more details). (3) Using metaheuristic search-based techniques [9]. These techniques are part of the general class of stochastic optimization algorithms which employ some degree of randomness to find optimal (or as optimal as possible) solutions to hard problems. These approaches are applied to a wide range of problems, and are used in this paper.

III. SEARCH-BASED CPU USAGE MINIMIZATION

In this section, we describe our search-based technique for CPU usage minimization. We first define a notation for our problem in Section III-A. We formalize the timing constraints,

Using runnable offsets (delay times)

[Figure: CPU time usage per time slot from 5ms to 40ms, before (✗) and after (✔) inserting runnables’ offsets.]

Offsets have to be chosen such that the maximum CPU usage per time slot is minimized, and further,

•  the runnables respect their period

•  the runnables respect their time slot

•  the runnables satisfy their synchronization constraints

Metaheuristic search algorithms

Case Study: an automotive software system with 430 runnables

•  Running the system without offsets: 5.34 ms

•  Optimized offset assignment (simulation for the runnables in our case study, corresponding to the lowest max CPU usage found by HC): 2.13 ms

-  The objective function is the max CPU usage of a 2s-simulation of runnables

-  The search modifies one offset at a time, and updates other offsets only if timing constraints are violated

-  Single-state search algorithms for discrete spaces (HC, Tabu)
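The three points above can be sketched as a simple random-mutation hill climber over offsets. This is an illustrative simplification, not the HC or Tabu variants used in the case study: candidate offsets are generated so that they are already multiples of the task cycle and smaller than the period, synchronization constraints are ignored, and the runnables, periods and execution times are invented for the example. Only the objective (maximum CPU usage per time slot over a 2s simulation) and the one-offset-at-a-time move follow the slide.

```python
import random

CYCLE_MS = 5
HORIZON_MS = 2000                               # 2s simulation, as on the slide
PERIODS_MS = {"r1": 10, "r2": 20, "r3": 100}    # hypothetical runnables
EXEC_MS = {"r1": 0.2, "r2": 0.1, "r3": 0.15}    # hypothetical execution times


def max_slot_usage(offsets, periods, exec_times, cycle, horizon):
    """Objective: maximum CPU time used in any time slot of the simulation."""
    usage = [0.0] * (horizon // cycle)
    for name, offset in offsets.items():
        for start in range(offset, horizon, periods[name]):
            usage[start // cycle] += exec_times[name]
    return max(usage)


def hill_climb_offsets(periods, exec_times, cycle, horizon, iterations=500, seed=0):
    """Single-state search: change one offset at a time, keep strict improvements."""
    rng = random.Random(seed)
    names = sorted(periods)
    offsets = {name: 0 for name in names}       # start from "no offsets"
    best = max_slot_usage(offsets, periods, exec_times, cycle, horizon)
    for _ in range(iterations):
        name = rng.choice(names)
        candidate = dict(offsets)
        # New offset: a multiple of the cycle that is smaller than the period.
        candidate[name] = rng.randrange(0, periods[name], cycle)
        score = max_slot_usage(candidate, periods, exec_times, cycle, horizon)
        if score < best:
            offsets, best = candidate, score
    return offsets, best


if __name__ == "__main__":
    found, usage = hill_climb_offsets(PERIODS_MS, EXEC_MS, CYCLE_MS, HORIZON_MS)
    print("offsets:", found, "max CPU usage per slot (ms):", usage)
```

A Tabu variant would additionally remember recently visited assignments and refuse to revisit them, which helps the search leave plateaus.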

Comparing different search algorithms

[Figure: comparison of the search algorithms by best CPU usage (ms) and time to find the best CPU usage (s).]

Conclusions

-  Though schedulability analysis is a well developed field, the problem we addressed was never defined in those terms (context)

-  Search algorithms to compute offset values that reduce the max CPU time needed

-  Positive evaluation results: Quick and significant differences

-  Huge search space with constraints

-  Current: Accounting for task time coupling constraints with multi-objective search -> trade-off between relaxing coupling constraints and maximum CPU time

What Have I Learned?

Successful Research Patterns

•  Successful: Innovative and high impact

•  Inductive research: Working from specific observations in real settings to broader generalizations and theories

–  Software development is highly diverse

–  Problems and solutions are diverse too

–  Context factors matter a great deal

–  Generalization: Field studies and replications, analyze commonalities

•  Scalability and practicality considerations must be part of the initial research problem definition

•  Researching by doing: Hands-on research. Apply what exists in a well-defined, realistic context, with clear objectives. The observed limitations become the research objectives. Put new technology to actual use.

•  Interdisciplinary: CS, Mathematics, Engineering, or non-technical domains

So What?

•  Making a conscious effort to understand the problem first

–  A careful investigation often leads to surprises and a very different understanding of the issues at stake

–  Precisely identify the requirements for an applicable solution

–  More papers focused on understanding the problems

–  Making experience/application tracks first class citizens in SE conferences

So What? - cont’d

•  Better relationships between academia and industry

–  Different models

•  Fraunhofer centres, Germany

•  Research-based innovation centers in Norway

•  SnT centre in Luxembourg

•  Targeted grant programs

•  Joint industry-academia labs (e.g., NASA SEL Lab)

–  Mutually beneficial setting where software development is an object of study

–  Exposing PhD students to industry practice, management and leadership: Ethical considerations (“Fix the PhD”, Nature, 2011)

–  Incentives for SE academics

So What? - Cont’d

•  Work on end-to-end solutions: Pieces of solutions are interdependent. Necessary for impact, e.g., requirements and testing.

•  Beyond professors and students

–  Labs with interdisciplinary teams of professional scientists and engineers within or collaborating with universities

–  Used to be the case with corporate research labs: Bell Labs, Xerox PARC, HP labs, NASA SEL, etc.

–  Now: Fraunhofer (Germany), Simula (Norway), Microsoft Research (US), SEI (US), SnT (Luxembourg), FBK (Italy)

–  Corporate labs versus publicly supported ones?

–  Key point: The level of basic funding must allow high risk and rigorous research, performed by professional scientists, focused on impact in society (Use-inspired basic research)

The “Classical” Model of Research and Innovation

Basic Research -> Applied Research -> Innovation and Development

An Effective, Collaborative Model of Research and Innovation

[Figure: Basic Research, Applied Research, and Innovation & Development shown as interconnected activities.]

•  Basic and applied research take place in a rich context

•  Basic Research is also driven by problems raised by applied research

•  Main motivation for SnT’s partnership program

B. Shneiderman, The Atlantic, Toward an Ecological Model of Research and Development, 2013

Challenges

Academic Challenges

•  Our CS legacy … emancipating ourselves as an engineering discipline

–  Electrical engineering: Subfield of Physics until late 19th century

–  Software engineering departments?

•  How cool is it? SE research is more driven by “fashion” than needs, a quest for silver bullets

–  We can only blame ourselves

•  Counting papers and how rankings do not help

–  We are pressuring ourselves into irrelevance

•  Taking academic tenure and promotion seriously

–  What about rewarding impact?

•  One’s research must cover a broader ground and be somewhat opportunistic – this pushes us out of our comfort zone

•  Resources to support industry collaborations

–  Large lab infrastructure, research engineers, time

Industrial Challenges

•  Distinguishing the manifestations of a problem from its causes is not easy in practice

•  Short term versus longer term goals (next quarter’s forecast is sometimes the priority)

•  Industrial research groups are often disconnected from their own business units and external researchers may be perceived as competitors

•  Company’s intellectual property regulations may conflict with those of the research institution

•  Complexity of industrial systems and technology

–  Cannot be transplanted in artificial settings for research - Need studies in real settings

–  Substantial domain knowledge is required

A Double-Agent Life

Scientist, inquisitive but discreet

Warning: No research here

A new idea (as initially perceived by our partners)

Practitioner (anonymous)

Conclusions

•  Software engineering is obviously important in all aspects of society, but academic software engineering research is not always perceived the same way

•  The academic community, at various levels and in particular its leadership, is partly responsible for this

•  How we take up the challenge of increasing our impact will determine the future of the profession

•  There is some limited progress, but far too slow

•  There are solutions, but no silver bullet

•  We all have a role to play in this, as deans, department chairs, professors, scientists, reviewers, conference organizers, journal editors, etc. We can all be double-agents …

96 IEEE SOFTWARE | PUBLISHED BY THE IEEE COMPUTER SOCIETY | 0740-7459/12/$31.00 © 2012 IEEE

SOUNDING BOARD

Editor: Philippe Kruchten, University of British Columbia

Embracing the Engineering Side of Software Engineering

Lionel Briand

I HAVE NOW been a professional researcher in software engineering for roughly 20 years. Throughout that time, I’ve worked at universities and in research institutes and collaborated on research projects with 30-odd private companies and public institutions. Over the years, I have increasingly questioned and reflected on the impact and usefulness of my research work and, as a result, made it a priority to combine my research with a genuine involvement in actual engineering problems. This short piece aims to reflect on my experiences in performing industry-relevant software engineering research across several countries and institutions.

Not So Hot Anymore

I suppose a logical start for this article is to assess, albeit concisely, the current state of software engineering research. As software engineering is widely taught in many universities, due in large part to a strong demand for software engineers in industry, the number of software engineering academics is substantial. The Journal of Systems and Software ranks researchers every year, usually accounting for roughly 4,000 individuals actively publishing in major journals.

When I started my career, software engineering was definitely a hot topic in academia: funding was plentiful, and universities and research institutes were hiring in record numbers. This clearly isn’t the case anymore. Public funding for software engineering research has at best stagnated, and in many countries, declined significantly.

Hiring for research positions is limited and falls far below the number of software engineering graduates seeking research careers. Industry attendance at scientific software engineering conferences is roughly 10 percent, including the scientists from corporate research centers. Adding insult to injury, in many academic and industry circles, software engineering research isn’t even considered to be a real scientific discipline. I’ll spare you the numerous unpleasant comments about the credibility and scientific underpinning of software engineering research that I’ve heard over the years.

This situation isn’t due to the subject matter’s lack of relevance. Software systems are pervasive in all industry sectors and have become increasingly complex and critical. The software engineering profession repeatedly tops job-ranking surveys. In many cases, most of a product’s innovation lies in its software components; for example, think of the automotive industry. In all my recent industry collaborations, I’ve observed that all the issues and challenges traditionally faced in software development are becoming more acute.

So how can we explain the paradox of being both highly relevant and increasingly underfunded and discredited?

Looking for Some Answers

Like other disciplines before us, because we’re a young and still-maturing engineering field, we lack the credibility of more

Empirical Software Engineering

•  Springer, 6 issues a year

•  Both research papers and industry experience reports

•  High impact factor among SE research journals

•  “Applied software engineering research with a significant empirical component”

61

Personal References

•  Hemmati, Briand, Arcuri, Ali “An Enhanced Test Case Selection Approach for Model-Based Testing: An Industrial Case Study”, ACM FSE, 2010

•  Frounchi, Briand, Labiche, Grady, and Subramanyan, “Automating Image Segmentation Verification and Validation by Learning Test Oracles”, Information and Software Technology (Elsevier), 2011.

•  Sabetzadeh, Nejati, Briand, Evensen Mills “Using SysML for Modeling of Safety-Critical Software–Hardware Interfaces: Guidelines and Industry Experience”, HASE, 2011

•  Sabetzadeh et al., “Combining Goal Models, Expert Elicitation, and Probabilistic Simulation for Qualification of New Technology”, HASE, 2011

•  Ali, Briand, Hemmati, “Modeling Robustness Behavior Using Aspect-Oriented Modeling to Support Robustness Testing of Industrial Systems”, Journal of Software and Systems Modeling (Springer), 2011.

•  Behjati, Nejati, Yue, Gotlieb, Briand, “Model-based Automated and Guided Configuration of Embedded Software Systems”, ECMFA, 2012.

•  Behjati, Yue, Briand, Selic, “SimPL: A Product-Line Modeling Methodology for Families of Integrated Control Systems”, Information and Software Technology (Elsevier), 2012.

•  Iqbal, Arcuri, Briand, “Empirical Investigation of Search Algorithms for Environment Model-Based Testing of Real-Time Embedded Software”, ACM ISSTA, 2012

62

Personal References II

•  Nejati, Di Alesio, Sabetzadeh, Briand, “Modeling and Analysis of CPU Usage in Safety-Critical Embedded Systems to Support Stress Testing”, ACM/IEEE MODELS 2012

•  Nejati, Sabetzadeh, Falessi, Briand, Coq, “A SysML-based approach to traceability management and design slicing in support of safety certification: Framework, tool support, and case studies”, Information & Software Technology, (Elsevier), 2012

•  Hemmati, Arcuri, Briand, “Achieving Scalable Model-Based Testing Through Test Case Diversity”, ACM Transactions on Software Engineering and Methodology, 2013.

•  Panesar-Walawege, Sabetzadeh, Briand, “Supporting the verification of compliance to safety standards via model-driven engineering: Approach, tool-support and empirical validation”, Information & Software Technology (Elsevier), 2013

•  S. Nejati et al., “Minimizing CPU Time Shortage Risks in Integrated Embedded Software”, 28th IEEE/ACM International Conference on Automated Software Engineering, 2013

•  Sabetzadeh, Falessi, Briand, Di Alesio, “A goal-based approach for qualification of new technologies: Foundations, tool support, and industrial validation”, Reliability Engineering & System Safety, 2013

•  Briand et al., “Traceability and SysML Design Slices to Support Safety Inspections: A Controlled Experiment”, forthcoming in ACM Transactions on Software Engineering and Methodology, 2013

63

Other References

•  Bertrand Meyer’s blog: http://bertrandmeyer.com/2010/04/25/the-other-impediment-to-software-engineering-research/

•  Basili, “Learning Through Application: The Maturing of the QIP in the SEL”, in Making Software: What Really Works, and Why We Believe It, edited by Andy Oram and Greg Wilson, O’Reilly, 2011, pp. 65-78.

•  T. Gorschek, P. Garre, S. Larsson, C. Wohlin. A Model for Technology Transfer in Practice. IEEE Software 23(6), 2006.

•  Parnin and Orso, “Are Automated Debugging Techniques Actually Helping Programmers?”, ISSTA, 2011

64
