Using statistical software and Web Technologies in analyzing information on detection and monitoring of somatic and psycho-behavioural deficiencies in children and adolescents
Marin Vlada (1), Adriana Sarah Nica (2)
(1) University of Bucharest, Department of Mathematics and Computer Science, Romania, E-Mail: vlada[at]fmi.unibuc.ro
(2) University of Medicine and Pharmacy "Carol Davila", Bucharest, Romania, E-Mail: adisarahnica[at]yahoo.com
Abstract
This paper addresses the objectives of DEMODEF (detection and monitoring of somatic and psycho-behavioural deficiencies in children and adolescents), a CEEX Research and Development project financed by NASR (National Authority for Scientific Research). The research and pilot study it envisages are expected to provide a clear, concrete picture of the physical development and of the physical and psycho-behavioural deficiencies of young people. The means of investigation used in the project generate information grouped into several categories of parameters. This information is stored in databases and analyzed statistically with Epi Info (a trademark of the Centers for Disease Control and Prevention, CDC) and SPSS.
Keywords: Statistical Software, Web Technologies, Analyzing Information
1 Introduction and Motivation
DEMODEF is a CEEX Research and Development project financed by NASR (National Authority for Scientific Research of Romania), which has as its main objective the “Detection and monitoring of psycho-behavioural deficiencies in children and adolescents (aged 10-18 years) to realize a pilot study to change perceptions and establish accountability for all involved in the complex process of spring growth and development of children and adolescents” (Vlada and Nica [1, 8]). The proposed study is a dynamic study conducted over two years. It covers the growth period, which is the optimal time for early detection and swift implementation of remedial treatment, so that notable improvements can be obtained. This is also the reason for choosing the ages 10-14-18 years, the period in which physical deficiencies appear and worsen. After the initial investigation, corrective programs will be proposed, whose efficiency can be assessed in the subsequent investigations.
1.1 Research and investigations
The project will investigate, dynamically, the school population during the growth period, in which an individual's anthropometric values change in close relation to age and to the deficiencies found. Significant elements of growth and of locomotor and psycho-behavioral development will be studied in order to identify functional disorders and deficiencies. For the locomotor system, the most important weak points will be tracked: the spine, chest, hip, knee and foot. From the psycho-medical point of view, the groups will be investigated for the following parameters: cognitive, emotional, behavioral, pathological, and social climate, namely the level of cognitive processes (attention, memory, intelligence), affective disorders (anxiety, depression, emotional lability), personality
University of Bucharest and University of Medicine and Pharmacy Târgu-Mureş
changes, and behavioral changes. The data obtained from this study will allow the elaboration of programs, both prophylactic and therapeutic in character, applicable on a large scale to this population group, for the optimization of health status and to support the harmonious growth of children and adolescents. The project is carried out by six partners (universities, institutes, and research centers) under the scientific coordination of Prof. Dr. A. Nica. The novelty of the project lies in grouping a series of psycho-behavioral parameters of child development in a single study, correlated with the children's relationships with the main representatives of family and school (teachers, peers). The data obtained from these assessments are organized in a database that allows a detailed statistical analysis, which will determine the variation of each parameter studied and the possible correlations between the deficiencies analyzed. In line with previous research on this subject, the trial is conducted on a sample of approximately 1000 subjects. In Romania, an investigation was carried out in 2001 (INMF and CCEFS) on a sample of 600 students (14-18 years), and internationally an investigation was carried out in 2005 (Zwaannswjk et al.) on a sample of 2449 children and adolescents (14-17 years). More recently, an investigation was conducted in the U.S. on a sample of 9878 students; it is incomplete, because it uses only the parents' answers to the SDQ questionnaire for some statistical data [2, 3].
2 Creating questionnaires and data entry
In accordance with the research plan and the implementation stages of the project, the following models are defined:
• functional model of clinical investigation, anthropometric and psychosocial behavior in children and adolescents (aged 10-18 years)
• sample clinical assessment of functional, psycho-behavioral and anthropometrical deficiencies in children and adolescents (aged 10-18 years)
The psycho-behavioral investigations use the following reference: Goodman R, Renfrew D, Mullick M (2000) Predicting type of psychiatric disorder from Strengths and Difficulties Questionnaire (SDQ) scores in child mental health clinics in London and Dhaka. European Child and Adolescent Psychiatry, 9, 129-134. The algorithm calculates the following SDQ questionnaire scores: emotional symptoms, conduct problems, hyperactivity, peer problems (relationship problems), prosocial behavior, the total difficulties score, and the impact score, which assesses the impact of social malfunctioning and of the difficulties experienced by the child in general (Goodman, Renfrew, and Mullick [2, 3]).
The SDQ analyzes the responses of the student, the teacher and the parent to the same set of questions. The answers to these questions yield 25 variables per questionnaire. The SDQ predictive algorithm calculates scores from the student, teacher and parent responses. Difficulties are analyzed in the following categories: behavioral problems, emotional problems, hyperactivity problems, psychiatric difficulties.
2.1 The utilization of statistical software Epi Info
Responses from the questionnaires were collected through a questionnaire file (view) designed to aid data entry in Epi Info. This is a program for statistical processing, initially used for epidemiology by the CDC. The files created with this program are compatible with Microsoft Access, SQL and ODBC databases, and HTML. “With Epi Info™ and a personal computer, epidemiologists and other public health and medical professionals can rapidly develop a questionnaire or form, customize the data entry process, and enter and analyze data. Epidemiologic statistics, tables, graphs, and maps are produced with simple commands such as READ, FREQ,
The 5th International Conference on Virtual Learning ICVL 2010
LIST, TABLES, GRAPH, and MAP. Epi Map displays geographic maps with data from Epi Info™.” [4, 7].
Epi Info is a software package for processing data collected systematically in questionnaire form, so that the results of studies can be included in communications and reports. Designed primarily for applications in epidemiology, Epi Info can also be used successfully for medical data processing outside that field; its package of features includes data management and statistical programs comparable to those offered by SAS and SPSS [5, 6]. Its main advantage is that free copying and distribution are permitted [7]. Details and instructions for use can be found in [7]. The main components of the Epi Info program are:
• Make View, which is an editor used to define the fields for entering data on one or more pages of a questionnaire (View);
• Enter Data, which displays the questionnaires built with Make View, controls data entry using the check codes specified in Make View, and provides record navigation and search capabilities;
• Analyze Data, which is used to analyze data files created not only with Epi Info but also with dBase, FoxPro, Excel, etc., and produces lists, frequencies, tables, charts, and specific epidemiological statistics;
• Create Maps, which is an epidemiological tool used to create maps;
• Create Reports, which is used to generate reports.
To create a questionnaire file, Make View is used, with the commands: File → New → File name (database name: nume_EPI) → Name → Open the view ("Chest1" is the name given to the questionnaire). The left-hand side of the page offers three options for managing the pages of the questionnaire (Add Page adds a new page at the end of the existing ones; Insert Page adds a new page between two existing ones; Delete Page removes the current page), as well as a Program control, which enables certain checks to be programmed, thereby avoiding errors that can occur during data input [7].
Fields are placed on the current page of the questionnaire by right-clicking at the position where the field should appear (the grid is useful for determining the position). The Field Definition dialog box then opens, in which the field characteristics are set: name, type, size, value limits, codes, and legal values.
Figure 1. A file-questionnaire and Field Definition dialog box
Data entry can be started directly from the File menu with the Enter Data command. Alternatively, after leaving the Make View module, choose the Enter Data module from the Epi Info main page, or use the Enter Data command in the Programs menu. In this case the questionnaire is opened by choosing the project and the corresponding view. Initially, at least four records should be entered.
Figure 2. A file-questionnaire for Enter data
Statistical analysis of the primary data is performed with the Analyze Data module. It provides several commands, which can be chosen in the command window on the left. The results of command execution are shown in the top right window (called Analysis Output). The lower right window (called Program Editor) shows the commands or sets of commands that were previously executed; new commands may also be typed there in command-line mode. The commands in the left window are organized into several groups.
Figure 3. Excel spreadsheets corresponding to the SDQ algorithm
We distinguish commands for working with data (grouped under "Data"), commands operating on variables (grouped under "Variables"), selection commands (grouped under "Select/If"), and primary statistical analysis commands (grouped under "Statistics"). Read (Import) is the command used to start any work session in the Analysis module. It retrieves data from a file; these data are then used for further processing (until a new Read command is issued). The default data format is Epi 2000, but it can be changed so
that it is possible to take data from other file types (e.g. various versions of Excel, various versions of FoxPro, Paradox, or hypertext documents). Epi Info comes with several sample "projects" for examples and self-learning, of which the simplest is Sample.mdb. Analysis and statistical calculations on the data in files use the "Statistics" group, which provides the commands List, Frequencies, Means, and Graph. The Graph command in the "Statistics" group is used to make graphical representations of variables from a data file. In the DEMODEF project, three questionnaire files were developed for the SDQ questionnaire, for the student, teacher, and parent responses respectively (915 records were created for each questionnaire). These files were used for statistical calculations performed with Epi Info, SAS, SPSS and Excel. For the SDQ questionnaire it was necessary to convert the database files to Excel. The responses from students, teachers, and parents were stored in separate databases and in the corresponding spreadsheets.
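As an illustration of what a frequency command computes on such a file, the following minimal Python sketch tabulates one variable from a CSV export. It is not Epi Info code: the CSV layout, the sample values, and the function name `freq` are assumptions made for the example; only the variable-naming scheme (prefix `s` for self-report) follows the paper.

```python
import csv
import io
from collections import Counter


def freq(rows, variable):
    """Return {value: (count, percent)} for one variable, skipping blank answers."""
    counts = Counter(row[variable] for row in rows if row[variable] != "")
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: (n, round(100 * n / total, 1)) for value, n in sorted(counts.items())}


# Hypothetical export of four self-report records (values are invented).
sample_csv = """id,sworries,sunhappy
1,0,1
2,1,0
3,2,0
4,0,
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
worries_table = freq(rows, "sworries")
unhappy_table = freq(rows, "sunhappy")
print(worries_table)
print(unhappy_table)
```

Blank answers are excluded from the denominator here; whether missing values should be counted as a separate category is a design choice that depends on the analysis.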
3 The Utilization of SDQ Algorithm
The "Strengths and Difficulties Questionnaire" (SDQ) scoring algorithm was used. The following categories of difficulties were analyzed:
• behavioural difficulties;
• emotional difficulties;
• hyperactivity difficulties;
• psychiatric difficulties.
The algorithm was applied to 101 subjects in England and 89 in Bangladesh. The correlation between the SDQ prediction and an independent clinical diagnosis in these cases was significant: Kendall's tau between 0.49 and 0.73, with p < 0.0001. The probability that a prediction made by this method is correct is 81-91%. Errors are more often false positives than false negatives [3].
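For readers unfamiliar with the agreement measure quoted above, the following sketch computes Kendall's tau-b, the variant that corrects for ties, between two lists of ordinal ratings. It is a minimal illustration assuming that predictions and diagnoses are both coded 0, 1, 2; the sample ratings are invented and are not data from the cited study.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of ordinal values."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1          # pair tied on x (possibly on y as well)
        if dy == 0:
            ties_y += 1          # pair tied on y (possibly on x as well)
        if dx != 0 and dy != 0:
            if dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denominator = sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Invented ratings: 0 = unlikely, 1 = possible, 2 = probable.
predicted = [0, 1, 2, 2, 0, 1]
diagnosed = [0, 1, 1, 2, 0, 2]
tau = kendall_tau_b(predicted, diagnosed)
print(round(tau, 2))
```

A value of 1 means the two ratings order every pair of subjects identically, and -1 means they order every pair in reverse.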
The algorithm is sufficiently robust to be used in practice to determine the mental health of children.
3.1 SDQ: Generating scores in SAS
The scoring algorithm is based on the 25 variables plus impact items for each questionnaire. The algorithm expects to find these variables with specific names: the first letter of each variable name is 'p' for the parent SDQ, 's' for the self-report SDQ and 't' for the teacher SDQ. After this first letter, the variable names are as follows:
consider   Item 1: considerate
restless   Item 2: restless
somatic    Item 3: somatic symptoms
shares     Item 4: shares readily
tantrum    Item 5: tempers
loner      Item 6: solitary
obeys      Item 7: obedient
worries    Item 8: worries
caring     Item 9: helpful if someone hurt
fidgety    Item 10: fidgety
friend     Item 11: has good friend
fights     Item 12: fights or bullies
unhappy    Item 13: unhappy
popular    Item 14: generally liked
distract   Item 15: easily distracted
clingy     Item 16: nervous in new situations
kind       Item 17: kind to younger children
lies       Item 18: lies or cheats
bullied    Item 19: picked on or bullied
helpout    Item 20: often volunteers
reflect    Item 21: thinks before acting
steals     Item 22: steals
oldbest    Item 23: better with adults than with children
afraid     Item 24: many fears
attends    Item 25: good attention
ebddiff    Impact question: overall difficulties in at least one area
distress   Impact question: upset or distressed
imphome    Impact question: interferes with home life
impfrie    Impact question: interferes with friendships
impclas    Impact question: interferes with learning
impleis    Impact question: interferes with leisure
For each of these items, if the first response category (not true, no, not at all) has been selected, this is coded as zero, the next response category (somewhat true, yes-minor, just a little) is coded as one and so on. For each informant, the algorithm generates seven scores. The first letter of each derived variable is 'p' for parent-based scores, 's' for self-report-based scores and 't' for teacher-based scores. After this first letter, the names of the scores are as follows:
emotion    emotional symptoms
conduct    conduct problems
hyper      hyperactivity/inattention
peer       peer problems
prosoc     prosocial
ebdtot     total difficulties
impact     impact
Figure 4. Excel spreadsheets corresponding to the SDQ algorithm
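The scoring step described in this section can be sketched in Python as follows. This is an illustration, not the SAS program used in the project. The assignment of items to scales and the five reverse-scored items are taken from the published SDQ scoring instructions rather than from this paper, so they should be read as an assumption here; item names follow the list above (written without spaces), and the impact score is left out. A scale is returned as None if any of its items is missing, which is a simplification.

```python
# Item-to-scale mapping: assumption based on the published SDQ scoring instructions.
SCALES = {
    "emotion": ["somatic", "worries", "unhappy", "clingy", "afraid"],
    "conduct": ["tantrum", "obeys", "fights", "lies", "steals"],
    "hyper": ["restless", "fidgety", "distract", "reflect", "attends"],
    "peer": ["loner", "friend", "popular", "bullied", "oldbest"],
    "prosoc": ["consider", "shares", "caring", "kind", "helpout"],
}
# Positively worded items inside difficulty scales are scored 2, 1, 0 (assumption, as above).
REVERSED = {"obeys", "friend", "popular", "reflect", "attends"}


def score_sdq(answers, prefix):
    """Compute the five scale scores and the total difficulties score.

    answers: dict such as {"sworries": 1, ...} with values coded 0, 1, 2.
    prefix:  'p' (parent), 's' (self-report) or 't' (teacher).
    """
    scores = {}
    for scale, items in SCALES.items():
        values = [answers.get(prefix + item) for item in items]
        if any(v is None for v in values):
            scores[prefix + scale] = None  # simplification: no pro-rating of missing items
            continue
        scores[prefix + scale] = sum(
            2 - v if item in REVERSED else v for item, v in zip(items, values)
        )
    parts = [scores[prefix + s] for s in ("emotion", "conduct", "hyper", "peer")]
    scores[prefix + "ebdtot"] = None if any(p is None for p in parts) else sum(parts)
    return scores


# Hypothetical respondent who answered "not true" (0) to every item.
all_items = [item for items in SCALES.values() for item in items]
all_zero = {"s" + item: 0 for item in all_items}
result = score_sdq(all_zero, "s")
print(result)
```

The total difficulties score adds the four difficulty scales and excludes the prosocial scale, so it ranges from 0 to 40.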
3.2 SDQ: Predictive Algorithm in SAS
A computerized algorithm predicts child psychiatric diagnoses from the symptom and impact scores derived from Strengths and Difficulties Questionnaires (SDQs) completed by parents, teachers and young people. The predictive algorithm generates "unlikely", "possible" or "probable" ratings for four broad categories of disorder, namely conduct disorders, emotional disorders, hyperactivity disorders, and any psychiatric disorder. The predictive algorithm is based on up to twelve input variables:
phyper     SDQ hyperactivity score from parent SDQ
thyper     SDQ hyperactivity score from teacher SDQ
shyper     SDQ hyperactivity score from self-report SDQ
pconduct   SDQ conduct problems score from parent SDQ
tconduct   SDQ conduct problems score from teacher SDQ
sconduct   SDQ conduct problems score from self-report SDQ
pemotion   SDQ emotional symptoms score from parent SDQ
temotion   SDQ emotional symptoms score from teacher SDQ
semotion   SDQ emotional symptoms score from self-report SDQ
pimpact    SDQ impact score from parent SDQ
timpact    SDQ impact score from teacher SDQ
simpact    SDQ impact score from self-report SDQ
"." = value for relevant score missing
The algorithm generates four output variables:
sdqed      prediction of an emotional disorder (0 = unlikely, 1 = possible, 2 = probable)
sdqcd      prediction of a conduct disorder (0 = unlikely, 1 = possible, 2 = probable)
sdqhk      prediction of a hyperactivity disorder (0 = unlikely, 1 = possible, 2 = probable)
anydiag    prediction of any psychiatric disorder (0 = unlikely, 1 = possible, 2 = probable)
Figure 5. Excel spreadsheets corresponding to the predictive algorithm
4 Results and Analysis of Investigations
Based on the response values stored in the databases, the SDQ algorithm was used to generate scores for the seven parameters investigated: Total difficulties; Emotional symptoms; Conduct problems; Hyperactivity-inattention; Peer problems; Prosocial behaviour; Impact. The results of these calculations are presented in the following tables.
Table 1. Answers of Children
            Total         Emotional   Conduct    Hyperactivity-   Peer       Prosocial   Impact
            difficulties  symptoms    problems   inattention      problems   behavior
Normal      758           795         723        788              761        719         503
To limit    108           55          97         67               121        61          55
Abnormal    49            65          95         60               33         135         357
Table 2. Answers of Parents
            Total         Emotional   Conduct    Hyperactivity-   Peer       Prosocial   Impact
            difficulties  symptoms    problems   inattention      problems   behavior
Normal      716           680         724        798              656        504         713
To limit    75            79          57         68               82         53          64
Abnormal    124           156         134        49               177        358         138
Children    915           915         915        915              915        915         915
Table 3. Answers of Teachers
            Total         Emotional   Conduct    Hyperactivity-   Peer       Prosocial   Impact
            difficulties  symptoms    problems   inattention      problems   behavior
Normal      652           784         718        793              704        407         771
To limit    140           70          56         60               95         142         39
Abnormal    123           61          141        62               116        366         105
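The Normal, To limit (borderline) and Abnormal rows in the tables above result from banding each score against cut-off values. The paper does not list the cut-offs it used; the sketch below assumes the published three-band SDQ cut-offs for the total difficulties score, which differ by informant, and the function and constant names are introduced here for illustration only.

```python
from collections import Counter

# Assumed cut-offs (published three-band SDQ norms, not stated in this paper):
# informant prefix -> (lowest "To limit" score, lowest "Abnormal" score).
TOTAL_CUTOFFS = {
    "s": (16, 20),  # self-report
    "p": (14, 17),  # parent
    "t": (12, 16),  # teacher
}


def band_total(score, informant):
    """Classify a total difficulties score (0-40) for one informant."""
    borderline_from, abnormal_from = TOTAL_CUTOFFS[informant]
    if score >= abnormal_from:
        return "Abnormal"
    if score >= borderline_from:
        return "To limit"
    return "Normal"


# Hypothetical self-report totals for six subjects.
totals = [3, 15, 16, 19, 20, 27]
band_counts = Counter(band_total(score, "s") for score in totals)
print(band_counts)
```

Counting the bands over all 915 records of one informant would give one column of the corresponding table.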
[Pie chart: Total difficulties, answers of children. Normal: 758 (83%); To limit: 108 (12%); Abnormal: 49 (5%).]
Figure 6. The scores for “Total difficulties”
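The percentages in Figure 6 follow directly from the "Total difficulties" column of Table 1. The short check below reproduces them; the counts come from the paper, while the variable names are chosen for the example.

```python
# "Total difficulties" column of Table 1 (answers of children).
counts = {"Normal": 758, "To limit": 108, "Abnormal": 49}

total = sum(counts.values())  # should equal the 915 records per questionnaire
shares = {band: round(100 * n / total) for band, n in counts.items()}

print(total)
print(shares)
```

The same calculation applies to any other column of Tables 1-3, since every column sums to 915.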
[Bar chart: Total difficulties. Number of subjects (axis 0-800) in the Normal, To limit and Abnormal categories, for children, parents and teachers.]
Figure 7. Comparative results: Children - Parents - Teachers
4.1 Predictive calculation
As described in Section 3.2, a computerised algorithm predicts child psychiatric diagnoses from the symptom and impact scores of the SDQs completed by parents, teachers and young people, generating "unlikely", "possible" or "probable" ratings for conduct disorders, emotional disorders, hyperactivity disorders, and any psychiatric disorder.
Based on the scores obtained from the student, teacher and parent responses, the SDQ predictive algorithm was implemented in Excel and used to calculate values for the four categories of difficulties: behavior difficulties, emotional difficulties, hyperactivity difficulties, and psychiatric difficulties.
Formulas to calculate these values are:
• behaviour difficulties
=IF(OR(AND(F4=2,E4=2),AND(F4=2,E4=1)),2,IF(OR(AND(F4=0,E4=1),AND(F4=1,E4=0),AND(F4=1,E4=1),AND(F4=2,E4=2)),1,IF(OR(C4>=0,D4>=0,E4>=0),0,3)))
• emotional difficulties
=IF(OR(AND(elevi!AT4>=6,elevi!BF4>=2),AND(profesori!AI4>=4,profesori!AU4>=2),AND(parinti!AI4>=5,parinti!AU4>=2)),2,IF(OR(parinti!AI4>=4,profesori!AI4>=3,elevi!AT4>=5),1,IF(OR(parinti!AI4>=0,profesori!AI4>=0,elevi!AT4>=0),0,3)))
• hyperactivity difficulties
=IF(OR(AND(parinti!AH4>=5,parinti!AU4>=1),AND(profesori!AH4>=5,profesori!AU4>=1),AND(elevi!AS4>=6,elevi!BF4>=1)),1,IF(K4>=1,2,IF(OR(AND(K4=1,H4=2),AND(K4=1,G4=2)),1,IF(OR(parinti!AH4>=0,profesori!AH4>=0,elevi!AS4>=0),0,3))))
• psychiatric difficulties
=IF(OR(I4>=2,H4>=2,G4>=2),2,IF(OR(I4>=1,H4>=1,G4>=1),1,IF(OR(I4>=0,H4>=0,G4>=0),0,3)))
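To make the logic of two of these formulas easier to follow, the sketch below restates the emotional and psychiatric formulas in Python. It rests on assumptions that the paper does not state: that the sheets elevi, profesori and parinti hold the student, teacher and parent data, that columns AT/AI hold the emotional symptoms scores and BF/AU the impact scores, and that cells G4, H4 and I4 hold the three category ratings. Missing scores are represented as None, and the spreadsheet's fallback value 3 (no data) is also returned as None.

```python
def at_least(value, threshold):
    """True if a score is present and reaches the threshold."""
    return value is not None and value >= threshold


def emotional_difficulties(s_emotion, s_impact, t_emotion, t_impact, p_emotion, p_impact):
    """Rating 0 (unlikely), 1 (possible), 2 (probable), or None if no score is available."""
    probable = (
        (at_least(s_emotion, 6) and at_least(s_impact, 2))
        or (at_least(t_emotion, 4) and at_least(t_impact, 2))
        or (at_least(p_emotion, 5) and at_least(p_impact, 2))
    )
    if probable:
        return 2
    if at_least(p_emotion, 4) or at_least(t_emotion, 3) or at_least(s_emotion, 5):
        return 1
    if any(v is not None for v in (p_emotion, t_emotion, s_emotion)):
        return 0
    return None


def psychiatric_difficulties(*category_ratings):
    """Highest available rating among the category ratings, or None if all are missing."""
    available = [r for r in category_ratings if r is not None]
    return max(available) if available else None


# Hypothetical subject: high self-reported symptoms, but low impact everywhere.
emotional = emotional_difficulties(6, 1, 2, 0, 3, 0)
overall = psychiatric_difficulties(0, 1, emotional)
print(emotional, overall)
```

As in the spreadsheet, a "probable" rating requires both a high symptom score and an impact score of at least 2 from the same informant, whereas a high symptom score alone yields "possible".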
[Bar chart: Comparison of results. Number of subjects (axis 0-900) rated Unlikely, Possible and Probable for hyperactivity (Hyp. D.), conduct (Conduct D.), emotional (Emot. D.) and psychiatric (Psy. D.) difficulties.]
Figure 8. Comparison of results
[Pie chart: Behaviour difficulties. Unlikely: 67%; Possible: 25%; Probable: 8%.]
Figure 9. Behaviour difficulties
[Pie chart: Emotional difficulties. Unlikely: 83%; Possible: 17%; Probable: 0%.]
Figure 10. Emotional difficulties
[Pie chart: Hyperactivity difficulties. Unlikely: 92%; Possible: 8%; Probable: 0%.]
Figure 11. Hyperactivity difficulties
[Pie chart: Psychiatric difficulties. Unlikely: 55%; Possible: 37%; Probable: 8%.]
Figure 12. Psychiatric difficulties
5 Conclusions
The issue addressed is of interest for better understanding the changes in lifestyle and diet that individuals must make in order to eliminate or improve somatic deficiencies. The research also responds to the need of parents for health information about their children. There is as yet no record of the number of affected subjects or of the degree of impairment, and no better identification of the high-risk population in this respect. The results of this first detection and monitoring study of children and adolescents in Bucharest would be a prerequisite for socio-economic, education and health policies that promote and encourage the healthy and harmonious development of future adults. The results of this research can also be the starting point for similar investigations across the country.
6 References
[1] DEMODEF Project: http://demodef.googlepages.com, accessed July 2010.
[2] Goodman R., Renfrew D., Mullick M. (2000): Predicting type of psychiatric disorder from Strengths and Difficulties Questionnaire (SDQ) scores in child mental health clinics in London and Dhaka. European Child and Adolescent Psychiatry, 9, 129-134.
[3] SDQ (Strengths and Difficulties Questionnaire): http://www.sdqinfo.org, accessed July 2010.
[4] Epi Info - CDC, www.cdc.gov/epiinfo, accessed July 2010.
[5] SAS - Predictive analytics software, http://www.sas.com, accessed July 2010.
[6] SPSS - Statistical Package for the Social Sciences, http://www.spss.com, accessed July 2010.
[7] Tiberiu Spircu (2006): Medical Informatics and Biostatistics, Publishing House "Carol Davila", Bucharest.
[8] M. Vlada and A. Sarah Nica (2007): Analysis and results of project DEMODEF, CNIV-2007, National Virtual Learning Conference, Educational Software, 5th Edition, 27-29 October 2007, pp. 149-156, ISSN 1842-4708 (in Romanian), Publishing House of the University of Bucharest.