developing a tester that simulates the user and can be used...

Most of today’s application softwareprovides some kind of graphicaluser interface (GUI). A GUI shouldbe easy to use by following logical

and intuitive steps to reach the desired result. Itmakes it easy to visualize the progress of dataprocessing and to give the necessary backgroundinformation. The actions (if appropriately con-structed) can be accessed by simple mouse clicks ongraphical objects. GUIs may also provide textual in-formation where necessary and give the possibility ofentering arbitrary alphanumerical values into editboxes. Graphical visualization is not a constraint butrather an added benefit [1], [2].

Motivation: Improving theReliability of Software with GUIs

Application software with a GUI is usually a much more complex pro-gram than conventional alphanumerical control. The possibility of unde-tected mistakes in the program increases with the many parallel programbranches. Error messages should provide pathways to correct the error. Reliabilityof the program is essential. There is an increasing demand for testing of a GUI and fortesting the whole system governed by a GUI to ensure reliability.

September 2003 IEEE Instrumentation & Measurement Magazine 27

Tamás Dabóczi,

István Kollár,

Gyula Simon, and

Tamás Megyeri

1094-6969/03/$17.00©2003IEEE

Developing a tester that simulates the

user and can be used to automatically

test GUIs in the MATLAB environment.

©2001 ARTVILLE LLC. AND COMSTOCK INC

Currently, embedded systemssuch as mobile phones and auto-matic teller machines or mea-surement instruments such as adigital oscilloscopes and car di-agnostic analyzers have GUIsthat play a crucial role in the sys-tem. The performance of such asystem depends strongly on thesoftware running on the dedicated hardware. The quality ofservice requirements cannot be met without ensuring a certainlevel of software reliability. In the world of virtual instru-ments, the role of testing the software is even more important,because the application software is embedded in a large oper-ating system [3]-[5]. The environment in which the applicationsoftware runs is often not well documented (think of a PC as-sembled from components available from various vendors) oris released in certain cases with bugs and awkward features.Moreover, exhaustive testing is not a practical solution, since itis very difficult, if not impossible, to produce and test all thepossible configurations and situations.

It is also necessary to ensure that error messages are under-standable to the GUI user. The GUI must perform graceful re-covery from errors. The user should not be frightened awayby technical messages or by catastrophic errors (e.g., freezing);instead, a short hint should appear, with the informationabout what to do.

Syntactic Versus Semantic Errors

There is a difference between syntactic and semantic errors.The IEEE Standard Glossary of Software Engineering Terminology[6] defines these terms as follows:

◗ Syntax: The structural or grammatical rules that definehow the symbols in a language are to be combined toform words, phrases, expressions, and other allowableconstructs.

◗ Semantics: The relationships of symbols or groups ofsymbols to their meanings in a given language.

A syntactic error is a violation of the structural or grammaticalrules defined for a language, e.g., “for i=1 to 10” in C languageinstead of “for (i=1; i<10; i++),” or a begin without the corre-sponding end in Pascal. A syntactic error shows up usually atcompilation time. The syntax can also be checked withoutrunning the program.

A semantic error results from misunderstanding the rela-tionship between symbols or groups of symbols and theirmeaning in a given language, e.g., calculation of sin(x) insteadof cos(x). There are certain expressions that may produce anerror depending on the current conditions; e.g., dividing anumber with a variable (usually) produces an error if the vari-able is zero. Similarly, calculating the square root of a negativenumber produces an error message if complex numbers arenot interpreted. Some other examples from this group includeindexing an array out of range and allocating a large block ofmemory beyond the available limit.

It is important to mention thatnot all semantic errors generateerror messages. Thus, it is possi-ble to run a program without get-ting error messages and yetobtain a wrong result. Automatictesting, to our knowledge, aimsto catch errors that produce anerror message. Catching seman-

tic errors without an error message is based mostly on humaninteraction. For example, a specialist can see if the computedresults are incorrect by looking at some informative plots likethe pole/zero pattern or the transfer function, but the com-puter cannot recognize such errors.

“Natural intelligence” sometimes outperforms the com-puter. Therefore, having a good engineer before the screen,which provides easy-to-capture information for the user, canbe more effective for GUI testing than any kind of automatictesting. A method that tirelessly allows “nonsubjective” trialsin the software, combined with a human observer, is very ef-fective. Testing for all types of semantic errors is expensiveand time consuming for people. Exhaustive test, however,cannot be implemented in practice as there are too many pos-sible combinations.

Testing Approaches

Random Human TestingSome software companies tests GUIs by letting some users playwith the software for a couple of weeks, with the goal of catch-ing errors. Although this approach might be useful, it is ratherarbitrary, time consuming, and expensive and does not providereliable coverage of the frequently occurring cases. People aredifferent; some try features that are never touched by others.The repeatability of the testing is also hard to ensure.

Predefined Test SequencesAnother usual approach is to write a program that containscalls to the GUI to be tested. Theoretically this might also work;in many cases, however, such an approach is as arbitrary as therandom human testing; moreover, writing such a program iscumbersome, and it is difficult to achieve good coverage of thecases. An advantage is that repeatability is straightforward, ifthe test is always started from the same initial state.

Formal Description of the SystemConsidering the possibilities of testing, the systematic one isbased on a formal description of the software system, whichallows automatic generation of a test program. This workswell in theory, but in practice, it is rare to have a good formaldescription of the system. Therefore, this is a theoretical op-tion rather than a practical one.

Simulation of the UserWhy not let the computer act as an ordinary user would,providing mouse clicks, entering numeric or alphanu-

28 IEEE Instrumentation & Measurement Magazine September 2003

A graphical user interface

should be easy to use by

following logical and

intuitive steps to reach

the desired result.

meric inputs, activating usercontrols, and so on? By simu-lating the user, we test thesoftware for errors that pro-duce error messages.

Exhaustive testing of the ap-plication software by a person isusually not possible because thenumber of possible inputs fromthe user is virtually unlimited.Random testing can find manybugs. Completely random test-ing is not always a good approach because it does not describereality well (i.e., the random inputs may be far from the actualusage of the system). Thus, we have found a heuristic ap-proach with guided randomness appropriate.

The Proposed Approachfor Random Testing

Our goal was to develop a tester that simulates the user andcan be used to automatically test GUIs in the MATLAB envi-ronment [7]. The tester is a software program that providesgraphical interaction with the GUI in a heuristic way. We in-tended to find software failures that would cause an abrupttermination of the application software. This abrupt termina-tion could be due to an unexpected input from the user, forwhich the application software is not prepared. (Users arevery talented in this. They will enter the most extraordinarystrings and numeric values into textual edit boxes. So the ap-plication software needs to be tested for a very wide set ofpossible inputs.)

Although the automatic tester software was originally de-veloped for improving the reliability of the “Frequency Do-main System Identification Toolbox (FDIdent)” of MATLAB(Editor’s note: see the previous article), the principles are very

general and can be applied toother MATLAB GUIs or to otherenvironments [8]. The GUI ofthe tester software can be seenin Figure 1. The GUI of thetester can be easily adapted toany application software.

Testing ProcedureThe testing procedure takes thefollowing steps:

◗ 1) bring the software envi-ronment into a defined initial state

◗ 2) select a GUI object◗ 3) store the selection before execution◗ 4) activate the GUI object◗ 5) if error then stop, else repeat from step 2.

Initial StateTo be able to repeat the sequence of testing actions, the soft-ware environment needs to be brought into some predefinedstate before starting the testing. This may be a clean or awell-defined workspace. MATLAB provides both possibili-ties, which we used for the tester software. After an error hasbeen caught, the environment needs to be brought into thesame condition as before the beginning of the test. The actionhistory can then be repeated step by step, and the workspacecan be investigated after every step.

MATLAB Graphical User InterfaceMATLAB has many graphical objects, like figures, lines, texts,surfaces, user interface controls, and menus. All of the graphi-cal objects are uniquely identified with handles. The graphicalobjects have different properties, depending on their type. Themost important properties are as follows:


Starts Testing

MouseEmulation

User Levelof FDTool

Probabilities

Status Bar

Loads the Settingof Previous Run

Constraints Windows to Be Tested

Number ofControl Actions

in Primary Windows

Location ofProbability Table

Actions AreSaved to:

Starts FDTool byLoading Session

or Session History

Fig. 1. GUI of the tester.

Let the computer act as an

ordinary user would.

By simulating the user, we

test the software for errors

that are producing error

messages.

◗ action performed if thegraphical object is acti-vated (e.g., click on a pushbutton with left, right, ordouble mouse click)

◗ visibility of the object◗ whether it is active or dis-

abled (grayed out).

Selecting a Graphical Object to Be TestedThere are many graphical objects in a GUI, but not all of themare active at any one time. Some of them might be inactive,disabled, or invisible. There might be several windows con-taining different graphical objects. In MATLAB, a window canbe made invisible instead of deleting it, which allows it to beavailable quickly if it’s required again (e.g., a help window).

There are two possibilities to select a visible and activegraphical object. The first approach is to collect the active ob-jects from all visible windows and to select one from this set.The second approach is to select first a window and to collectthe active objects only on that one. We chose this later ap-proach because it describes better how a user interacts withthe GUI. Moreover, this approach is faster.

We operate on a selected window called “window in fo-cus.” We associated a certain increased weight to the windowin focus and unity weight to all other windows. The windowin focus can be changed on a random basis, making use of theweights. This emphasizes that users usually activate controlson the foreground window.

The tester explores the visible and active graphicalobjects in the selected window. Selecting callback func-

tion through the action recorderactivates one of the objects.

We also provided the abilityto test only a selected window(the window in focus). This isuseful if the testing is done aftersome software modifications thataffected just one particular win-dow and the others don’t need tobe tested again.

Activating the Graphical ObjectsThrough the Action RecorderWe chose to interact with the graphical user interface througha so-called action recorder, which will be described later. Wefeed actions into the recorder as if the user actions had been re-corded and replay them (Figure 2).

Catching ErrorsThe tester stores the initial condition of the software and theactions on the hard drive before executing them. This ensuresthat the history of actions and the last action causing the erroror an abrupt termination is saved, even if the application soft-ware freezes or becomes unstable.

If an error is caught, the testing procedure stops. The wholetesting can be replayed step by step, starting from the same ini-tial state as the tester did. The workspace can be investigated af-ter every step.

There is also the possibility to continue the testing after anerror has been caught (certainly not from the erroneous state).We can bring the system to a predefined initial state, and werestart the heuristic random test. If another error is caught, theaction history is automatically stored on the hard drive with a

different name. This possibilityis advantageous at the firstcouple of test runs of a GUI testbecause it collects many errorswithout the need of user inter-action. It might be useful alsoafter a major change in the GUIsoftware.

Guided RandomSearchDEFINING PROBABILITIES

TO UICONTROLS/

UIMENUS

To speed up the testing proce-dure, it is worthwhile to influ-ence the selection of thegraphical objects to be acti-vated. Some functions areknown to be critical, whereasothers do not perform compli-cated computations or do notcontain many branches. It is


Fig. 2. Storing a mouse click with the action recorder.

The basic operation for an

action recorder is the

mechanism to capture

and replay each action

performed by the GUI.

worth investigating the criticalparts more thoroughly. That iswhy we introduced weights tographical objects. The largerthe weight, the greater is theprobability that the tester acti-vates it. Assigning zero weightexcludes the object from test-ing. The weights are stored in alookup table, which can be ed-ited with a standard text pro-cessor. This requires a unique identification of all graphicalobjects. MATLAB assigns a unique handle to each of them;however, this happens at runtime. Fortunately, it is possibleto assign a tag (i.e., a string) to every object statically. We as-sign a unique tag to every object and identify them based onthis. If the programmer of the graphical user interface did notpay attention to that or the identification is not unique, thenthe program assigns weights. Selecting the object based on auniform distribution is still possible.

One graphical object can be activated several times. An-other heuristic, which we introduced, is to decrease dynami-cally the weight of the activated object. The factor fordecreasing the weight can be adjusted. Assigning a zero factorgives the possibility to test an object only once.

DEFINING SETS OF POSSIBLE INPUTS

TO EDIT BOXES

A complicated question is what numeric or textual inputneeds to be entered into the edit boxes. As the behavior of theapplication software depends on the data to be entered, an ex-haustive test theoretically is not possible.

We define a set of possible inputs to edit boxes. These setscan be assigned either to one particular object or to a group ofobjects. The sets can be conveniently edited with text proces-sors and can be either additions to completely random inputsor exclusive sets.

Defining the sets of possible inputs is a hard task. This isthe point where an expert on the application software needs to

cooperate with somebody whohas never used the software be-fore. An expert is bound to pro-pose for the possible input setdata that might be valid or inthe range of the required input.It is obvious that such datashould be used for the test.However, there is a muchgreater chance that the soft-ware to be tested is not pre-

pared for an unusual input; e.g., a negative number fordistance, a string for numeric input, or funny characters infilenames, etc. These inputs also need to be introduced intothe possible set. A colleague who does not know much aboutthe software might be much more useful for proposing unex-pected values. (A child can be also of great help to extend theset of possible inputs.)

CAPTURING/REPLAYING ACTIONS:

THE ACTION RECORDER

Let us summarize again the features we would like for the tester:

◗ use the GUI like a user

◗ if an error occurs, store enough state to be able to repro-duce the error.

Both requirements tell us that we want to act as the user would.We need to produce and collect computer-simulated user ac-tions and replay them when necessary. That is, we need some-thing that records and replays actions. What is more intuitivefor a person than to have an action recorder, similar to tape re-corders, dealing with actions instead of voice and sound? Wedeveloped such an action recorder. The action recorder is notonly a handy tool for testing, but its capabilities go much fur-ther. Before going into details, we need to consider the systemrequirements for the action recorder. We need to be able to:

◗ bring the GUI into a well-defined initial state

◗ act on any controls (pushbuttons, edit boxes, menus,etc.) as the user does; that is, we need to be able to emu-late all user actions


Fig. 3. Display of an action recorder.

Our goal was to develop a

tester that simulates the

user and can be used to

automatically test GUIs in

the MATLAB environment.

◗ record a sequence user ac-tions

◗ replay the sequence.For testing, it is enough to be ableto record programmed actionsonly, but if properly programmed,the action recorder can record bothprogrammed test actions and hu-man actions. This allows additionaltesting. The most typical user actions (typical according to the de-signers) can be recorded and stored for later replay; this allows aquick test of the most common actions when using that GUI. Sucha possibility can also be used to store demonstration sequences,which can provide an introduction to the GUI, and at the sametime can be used as test sequences, assuring that all introductorysteps work indeed. The examination of the results of such se-quences by the human tester enables an approximate check of thesemantics (proper calculations) of the software behind the GUI.

The basic operation for an action recorder is the mecha-nism to capture and replay each action performed by the GUI.There are two possibilities. One is when the operational sys-tem or the application program under which the GUI is real-ized can return the information of each action. In many cases,however, this functionality is not provided. In such cases, weare referred to the second one: the routines, called after eachuser action (callbacks), contain a program sequence that storesthe corresponding action when executed.

Replaying previously recorded actions can also be done intwo ways. If the operational system or the application pro-gram offers the possibility to program the mouse to moveabove an object and perform the push or writing, the user ac-tion can be precisely emulated. If this is not the case, thecallbacks need to be activated.

The MATLAB environment we used in this project allowedthe callback-based operation only, so we implemented the re-

corder in this way. The function-ality of each control is imple-mented also as a callable function;therefore, the recorder actions canalso be programmed, allowingthe implementation of the ran-dom tester (see below).

Functionalities of a Recorder

Figure 3 illustrates the graphical interface of the recorder.Meanings of the controls of the recorder are as follows:

◗ The buttons « ⟨ PLAY STOP ⟩ » provide the usual eventrecorder functionalities using a traditional tape-recorderfront-end: fast backward (home), one step backward,play, stop, one step forward, fast forward (end).

◗ The text box on the right-bottom side is a message display,providing information about the state of the recorder.

◗ RECORD is the start button for recording; this starts therecording and steps forward after each recorded action.

◗ The tickbox “Continuous” selects between nonstop orstep-by-step record and replay.

◗ The tickbox “Emulate mouse motion” makes the re-corder move the mouse cursor above the active controlduring replay.

◗ The tickbox “Pause” allows one to set a breakpoint: con-tinuous replay can be stopped if this tickbox is on for anaction (this can be edited manually, after recording).

◗ The tickbox “Discard Pause” allows continuous execu-tion even if the Pause tickbox is set for certain actions, toallow continuous execution of a demonstration that usu-ally contains pauses.

◗ The tickbox “Lecture mode” allows special stopping:when the recorder stops, the active GUI window, not the


Fig. 4. Action recorder in development mode.

Automatic testing, to our

knowledge, aims to catch

errors that produce an

error message.

recorder display, will appear in the foreground to allowexplanations during a lecture.

◗ The large text box on the right-center side is for longer ex-planations for a user who is observing demonstrations.

There are additional controls in the recorder display; these areusually filled in by the action-recording step, but they can alsobe programmed. They are as follows:

◗ “Index”—serial number of the current action◗ “Window”—name of the window where the action takes

place◗ “Cmd”—name of the control to be activated◗ “Param”—value of the action if necessary (tick/untick

in tickboxes, string in edit boxes, etc.)◗ “SelType”—selection type of the user action (e.g., single

or double mouse-click).The recorder also contains menu items; these allow easy modi-fication of action records: save to a file, load from a file,clear/cut/paste action, and insert a MATLAB command (aspecial possibility, e.g., to quickly set environment variablesby calling a MATLAB command).

The recorder also has built-in error handling and checkingcapabilities. This feature allows the playback of actions thatnormally would cause warnings or errors during execution;thus, the error handling of the program under test can be com-pared with the expected behavior. This is more than originallyexpected; thus, we can test proper error handling. Many usersjust stop experimenting with the program when an error theycannot understand occurs. This situation is what a softwaredeveloper tries to avoid.

Effective use of the recorder for tests can be helped byadditional features not available in demonstration mode.The controls not present in Figure 3 but visible in Figure 4are available only in development mode for the advancedusers or programmers.

The recorder can be used with any MATLAB-based GUI; see [9].

ConclusionsAn approach for testing GUIs has been proposed. We devel-oped a software tool that tests GUIs by simulating the userthrough an action recorder. We proposed a heuristic test proce-dure: providing random input to GUI, but guiding the random-ness with predefined weights assigned to the user controls. Theweights change during the testing process, as the controls areactivated. The errors are collected for later investigation.

AcknowledgmentsThis work was supported by the Hungarian Scientific Re-search Fund (OTKA F034900) and the Research and Develop-ment Fund for Higher Education (FKFP 0098/2001 and FKFP0074/2001).

References

[1] IEEE Instrum. Meas. Mag. (Special Section on Virtual Systems), vol.

2, pp. 13-37, Sept. 1999.

[2] User Interface Guidelines. Available: http://www.dcc.unicamp.

br/~hans/mc750/guidelines/newfrontmatter.html

[3] G Programming Reference Manual. National Instruments Corpora-

tion, Austin, TX, 1988, part no. 321296B-01. Available:

http://www.ni.com/pdf/manuals/321296b.pdf

[4] E. Baroth, C. Hartsough, and G. Wells, “A review of HP VEE 4.0,”

Evaluation Eng., pp. 57-62, Oct. 1997.

[5] IVI Foundation Home Page. Available:

http://www.ivifoundation.org/

[6] IEEE Standard Glossary of Software Engineering Terminology, IEEE

Standard 610.12, 1990.

[7] MATLAB home page. Available: http://www.mathworks.com/

[8] Frequency Domain System Identification Toolbox for MATLAB

home page. Available: http://elecwww.vub.ac.be/fdident/

[9] Action Recorder for MATLAB home page. Available:

http://www.mit.bme.hu/services/recorder/

Tamás Dabóczi received the M.Sc. and Ph.D. degrees in elec-trical engineering from the Budapest University of Technol-ogy, Hungary, in 1990 and 1994, respectively. Currently, he isan associate professor in the Department of Measurement andInstrument Engineering, Budapest University of Technologyand Economics. His research area includes embedded systemsand digital signal processing, particularly inverse filtering.

István Kollár received the M.S. degree in electrical engineer-ing in 1977, the Ph.D. degree in 1985, and the D.Sc. degree in1998, respectively. From 1989 to 1990, he was a visiting scien-tist at the Vrije Universiteit Brussel, Belgium. From 1993 to1995, he was a Fulbright scholar and visiting associate profes-sor in the Department of Electrical Engineering, Stanford Uni-versity, California. Currently, he is a professor of electricalengineering at the Budapest University of Technology andEconomics, Hungary. His main interest is signal processing,with emphasis on system identification, signal quantization,and roundoff noise. He is a member of the IEEE Instrumenta-tion and Measurement Society AdCom and EUPAS (EuropeanProject for ADC-based devices Standardisation).

Gyula Simon received the M.Sc. and Ph.D. degrees in electri-cal engineering from the Budapest University of Technology,Budapest, Hungary, in 1991 and 1998, respectively. Since1991, he has been with the Department of Measurement andInformation Systems, Budapest University of Technology andEconomics, Budapest. His research interest includes digitalsignal processing, embedded systems, adaptive systems, andsystem identification.

Tamás Megyeri is a gradate student at the Budapest Uni-versity of Technology and Economics, Faculty of ElectricalEngineering and Informatics. He began his studies in 1996.His specializations are embedded systems and digital sig-nal processing.


developing a tester that simulates the user and can be used...

Documents