

A Novel Industry Grade Dataset for Fault Prediction based on Model-Driven Developed Automotive Embedded Software

Harald Altinger
Audi Electronics Venture GmbH
Sachsstrasse 20, 85080 Gaimersheim
[email protected]

Sebastian Siegl
Audi Electronics Venture GmbH
Sachsstrasse 20, 85080 Gaimersheim
[email protected]

Yanja Dajsuren
Software Engineering and Technology Group
Eindhoven University of Technology
[email protected]

Franz Wotawa
Institute for Software Technology
Graz University of Technology
[email protected]

Fig 1: Development Workflow

                     Project A                          Project L                          Project K
#Requirements        304                                600                                900
#Sub-Projects        13                                 8                                  24
LOC                  12465                              10113                              36526
#Testcases           185                                680                                695
#Authors             4                                  3                                  5
#src. files          45                                 20                                 53
#mdl files           26                                 47                                 48
#committed files     1782                               2892                               2481
#error prone files   78                                 73                                 329
Software type        logic, timing dependent behaviour  logic, timing dependent behaviour  mainly logic operations and branching
AUTOSAR              yes                                yes                                yes
Safety function      no                                 yes                                yes

Project Overview

            author  sloc   McCab  Hv     Hd     He     loc_add  loc_remove  nfunctions  bug
author      1.0     0.005  -0.0   0.015  0.001  0.01   0.06     0.043       -0.004      0.045
sloc                1.0    0.783  0.909  0.852  0.922  0.389    0.383       0.712       0.232
McCab                      1.0    0.766  0.739  0.775  0.407    0.41        0.805       0.262
Hv                                1.0    0.838  0.94   0.366    0.359       0.7         0.241
Hd                                       1.0    0.898  0.372    0.366       0.701       0.23
He                                              1.0    0.375    0.368       0.702       0.242
loc_add                                                1.0      0.807       0.434       0.132
loc_remove                                                      1.0         0.435       0.13
nfunctions                                                                  1.0         0.213
bug                                                                                     1.0

Kendall's τ correlation analysis (Project K, all revisions)
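
A rank correlation matrix of this kind can be computed pairwise over per-file metric columns. The following is a minimal sketch in pure Python, assuming the tau-b variant (which corrects for ties); the poster does not state which variant was used. The helper names and the sample values are invented for illustration and are not taken from the dataset.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b of two equally long numeric sequences (O(n^2) pair scan)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: contributes to neither term
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif (dx > 0) == (dy > 0):
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + only_x_tied) * (untied + only_y_tied))
    # A constant column has no untied pairs, so tau-b is undefined for it.
    return (concordant - discordant) / denom if denom else float("nan")


def correlation_matrix(columns):
    """Upper-triangular tau-b matrix as {(row, col): tau} for a dict of columns."""
    names = list(columns)
    # The diagonal is set to 1.0 directly (assumes no column is constant).
    matrix = {(name, name): 1.0 for name in names}
    for a, b in combinations(names, 2):
        matrix[(a, b)] = kendall_tau_b(columns[a], columns[b])
    return matrix


# Hypothetical per-file metrics; these values are invented, not from the dataset.
metrics = {
    "sloc": [120, 45, 300, 80, 210],
    "McCab": [14, 5, 41, 9, 22],
    "bug": [0, 0, 1, 0, 1],
}

for (row, col), tau in correlation_matrix(metrics).items():
    print(f"{row:>6} vs {col:<6} {tau:.3f}")
```

The pair scan is quadratic in the number of rows, which is fine for a sketch; for the full revision history a library routine such as `scipy.stats.kendalltau` would likely be the more practical choice.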

Abstract

In this paper, we present a novel industry dataset of static software and change metrics for Matlab/Simulink models and their corresponding auto-generated C source code. The dataset comprises data from three automotive projects developed and tested according to industry standards and restrictive software development guidelines. We present background information on the projects, the development process, and the issue tracking, as well as the steps used to create the dataset and the tools used during development. A specific highlight of the dataset is its low measurement error on change metrics, which results from the issue tracking and commit policies in use.


Data Quality

As visualized in Fig. 1, the models were developed using Matlab/Simulink and committed to our repository system “PTC Integrity”. The C source code was generated using “dSpace TargetLink” and committed to the repository as well. Bugs were filed at every development and testing stage. Restrictive commit policies ensure the link between every issue ticket and the corresponding bug fix commit.
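
The link between issue tickets and commits is what makes file-level bug labels derivable: if every commit carries the ID of its ticket, the files touched by commits linked to bug tickets can be flagged. A minimal sketch of that lookup follows. The record layout, field names, ticket IDs, and file names are hypothetical and do not reflect the actual PTC Integrity export or the dataset's schema, and flagging every touched file is only one plausible labeling rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    revision: str   # hypothetical revision identifier
    issue_id: str   # ticket ID the commit policy requires on every commit
    files: tuple    # paths touched by the commit


def error_prone_files(commits, issue_types):
    """Return the set of files touched by any commit linked to a bug ticket."""
    flagged = set()
    for commit in commits:
        if issue_types.get(commit.issue_id) == "bug":
            flagged.update(commit.files)
    return flagged


# Invented example data, for illustration only.
issue_types = {"T-1": "feature", "T-2": "bug", "T-3": "bug"}
commits = [
    Commit("r1", "T-1", ("ctrl.mdl", "ctrl.c")),
    Commit("r2", "T-2", ("ctrl.c",)),
    Commit("r3", "T-3", ("filter.mdl", "filter.c")),
]

print(sorted(error_prone_files(commits, issue_types)))
```

Without an enforced ticket ID on every commit, this lookup would have to fall back on heuristics such as keyword matching in commit messages, which is a common source of labeling noise in change metrics.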

Get the Audi Dataset:

http://www.ist.tugraz.at/_attach/Publish/AltingerHarald/MSR_2015_dataset_automotive.zip

Dataset Creation Workflow