Demonstration Projects Final Evaluation Dataset

July 28, 2011

SPECIAL DIABETES PROGRAM FOR INDIANS Diabetes Prevention Program Initiative: Year 1 Meeting 2

Final Evaluation Dataset

- Contents of the CD
- Database design and documentation
- Using Excel for basic functions
- Cautions when using Excel
- More advanced statistics
- Importing Excel data into SPSS/SAS
- Where to find help

Contents of CD: Data

All data that were collected using the SDPI Demonstration Projects original data collection forms for all eligible participants

Up to 7/31/2009 for original participants

Up to 7/31/2010 for transition participants

Contents of CD: Instruction Manual

Located on the CD and in the meeting binder

Contents:
- Subdirectory Structure
- Participant-Level Datasets and Documentation
- Grantee-Level Datasets and Documentation
- Analysis Conventions for Using Final Evaluation Datasets

Database Design and Documentation

Raw datasets

Constructed datasets

Data dictionaries and Variable indexes

Scale variable documentation

Subdirectory Structure

Raw Datasets
- Separate Excel dataset for each participant-level and grantee-level data collection form
- Some forms contain multiple tabs in the Excel file
- Contain responses to the original questions
- Variables were usually named using a combination of form name and question number

Examples:
- Participant’s gender on B1 form = B1DQ1
- Provider’s professional background on AD1 form = AD1DQ1
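The naming convention can be illustrated with a small Python sketch that splits a raw variable name into its form name and question number. The pattern is an assumption inferred only from the names shown in this presentation (form name, the letters "DQ", question number, and an optional "_n" suffix as in B1DQ3_1); the Instruction Manual remains the authority on how variables are actually named.

```python
import re

# Assumed pattern (inferred from the example names only, not from the manual):
# form name = letters + digits, then the literal "DQ", then the question
# number, then an optional "_n" sub-item suffix.
RAW_NAME = re.compile(r"^([A-Z]+\d+)DQ(\d+)(?:_(\d+))?$")


def parse_raw_name(name):
    """Split a raw variable name into (form, question, sub_item), or return None."""
    match = RAW_NAME.match(name)
    if match is None:
        return None  # name does not follow the assumed convention
    form, question, sub_item = match.groups()
    return form, int(question), int(sub_item) if sub_item else None


for name in ["B1DQ1", "AD1DQ1", "B1DQ3_1"]:
    print(name, parse_raw_name(name))
```

Names that do not fit the assumed pattern return None, which is a reminder that variables were only "usually" named this way.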

Raw Datasets

Variables went through a thorough cleaning and quality assurance process:
- Checked for missing data
- Checked for outliers (out-of-range data)
- Checked for inconsistencies among variables
- Checked for inconsistencies in skip patterns

See the Instruction Manual for more specific information about the cleaning procedures used for each of the raw datasets.

Raw Datasets

See the Instruction Manual for more specific information about the contents of each of the raw datasets.

Constructed Datasets

Only exist for participant-level data

Constructed datasets should be sufficient for most data analysis on participant-level data

Contain frequently used variables

Contain renamed, derived, and scale variables created from raw datasets

Constructed Datasets – Renamed Variables

Meaningful variable names indicate variable content

Examples:
- B1DQ3_1, F1DQ3_1, and A1DQ3_1 have been renamed “Weight”
- M1DQ1_1 has been renamed “FBG”

Constructed Datasets – Derived Variables

Re-categorized and re-coded variables, as well as some other variables created based on variables from the raw datasets

Example: Emp4Cat is a derived variable for participant employment status that was re-classified into 4 categories based upon each participant’s response to B2DQ62, F2DQ41, and A2DQ54

Constructed Datasets – Scale Variables

Scale variables were created based upon several items from the raw datasets

Examples:
- The Alcohol Use Disorders Identification Test (AUDIT) Scales: Audit, Audit1, Audit2
- Healthy and Unhealthy Diet Scales: HealDiet, UnHeDiet

Unique Identifiers

Variable Indexes

- List of all the variable names and labels
- Found in the Data Dictionary subdirectory
- Available for only the following constructed participant-level datasets: dpbaseline.xls, dpfollowup.xls, dpannual.xls, dpmidyear.xls

Data Dictionaries

Excel files corresponding to all datasets

Found in the Data Dictionary subdirectories for both the participant-level and grantee-level data

Some data dictionaries contain multiple tabs: I2.D, I3.D, AD1.D, I1.D, R3.D

Differ for raw versus constructed datasets

Columns in Data Dictionaries for Both Raw and Constructed Datasets

Variable Name
- Raw: combination of form name and question number
- Constructed: meaningful, indicating content

Label (provides meaning of each variable)
- Raw: actual question wording
- Constructed: includes “baseline only,” etc.

Columns in Data Dictionaries for Both Raw and Constructed Datasets

Options (value labels: meaning of each numeric value for categorical variables)

Example:

1 = Strongly Agree

2 = Agree

3 = Neither Agree nor Disagree

4 = Disagree

5 = Strongly Disagree

Type (data type of each variable)

Examples: Numeric, Text, Date

Additional Columns in Data Dictionaries for Constructed Datasets

Input Variable
- Variable(s) that was/were used to create each variable in the constructed dataset

Comments
- Algorithm for how the variable was created
- Cautions to note when using a variable in data analysis

Additional Columns in Data Dictionaries for Constructed Datasets

*Citation
- Reference(s) for each scale variable

*Cronbach’s α
- Cronbach’s alpha coefficient indicating the internal consistency of the items used to create each scale variable

* Columns are only applicable for baseline datasets that have scale variables. Because follow-up and annual datasets have the same sets of scale variables as the baseline dataset, users should refer to the baseline data dictionary for this information.
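For readers who want to see what the Cronbach’s α column summarizes, the coefficient is conventionally computed as α = k/(k−1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The Python sketch below applies that standard formula to made-up responses; it is not necessarily the exact procedure used to produce the values in the data dictionaries, and it assumes every respondent answered every item.

```python
from statistics import variance


def cronbach_alpha(items):
    """Standard Cronbach's alpha.

    items: one list of scores per item, with the same respondents in the
    same order in every list and no missing values (an assumption).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 3 items answered by 5 respondents.
example_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(example_items), 3))
```

Values closer to 1 indicate that the items move together more consistently.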

Scale Variable Documentation

Only applies to participant-level constructed datasets that include scale variables

Brief summaries of how each scale variable was created and the psychometric properties of each scale

Recommendations for Using the Final Evaluation Datasets in Data Analysis

Use constructed datasets when available, before going to the raw datasets folder

Use existing scale and derived variables in the constructed datasets, if possible (especially socio-demographic variables)

Recommendations for Using the Final Evaluation Datasets in Data Analysis

Check for outliers that are out of normal ranges (especially with regard to major outcome variables)

Check for timeline compliance of the measurements (especially at baseline)

Check for potential inconsistencies among variables
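A range check of this kind can be sketched in a few lines of Python. The record layout, the "id" field, and the 15 to 70 limits for BMI are illustrative assumptions, not values taken from the Instruction Manual; choose limits that suit each outcome variable.

```python
def flag_out_of_range(records, variable, low, high):
    """Return (participant id, value) pairs whose value falls outside [low, high]."""
    flagged = []
    for record in records:
        value = record.get(variable)
        if value is None:  # missing values are skipped here, not flagged
            continue
        if value < low or value > high:
            flagged.append((record["id"], value))
    return flagged


# Hypothetical rows; the ids, values, and limits are made up for illustration.
rows = [
    {"id": "P001", "BMI": 31.2},
    {"id": "P002", "BMI": 3.1},
    {"id": "P003", "BMI": None},
    {"id": "P004", "BMI": 88.0},
]
print(flag_out_of_range(rows, "BMI", 15, 70))
```

Flagged values are candidates for review against the original forms, not automatic deletions.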

Recommendations for Using the Final Evaluation Datasets in Data Analysis

When using an ordered categorical variable as a continuous variable, check the linearity assumption

When using a dichotomous variable that was created from more than two levels originally, make sure that you are not losing important information by doing so

Using Excel for Basic Functions
- Merging Data from Two Excel Files
- Computing Descriptive Statistics: Count (N), Minimum, Maximum, Mean (average)
- Using Selection Criteria
- Creating Bar Graphs

Importing a Worksheet from another Excel file

Let’s say that you would like to compare data from baseline to data from follow-up. In order to do this in Excel, you would need to merge the data from two Excel files into one Excel file.

Open first Excel file; Right-click on the worksheet tab; Choose “Insert”

Choose Worksheet; Click OK

Right-click on “Sheet1” and choose Rename to change the name of the worksheet

Copy the data from the other Excel file and paste it into the newly renamed worksheet

Computing Descriptive Statistics

First, add one more worksheet to the newly combined Excel file using the procedures above

Calculate count (n), minimum, maximum, and mean (average) for a variable of interest (e.g., BMI) at both baseline and follow-up using all of your participants

In the new worksheet, add appropriate row and column headers

Add in the appropriate formulas to compute the descriptive statistics

Count (n)

=COUNT(dpbaseline!AJ2:dpbaseline!AJ76)

=COUNT(dpfollowup!U2:dpfollowup!U38)

Minimum

=MIN(dpbaseline!AJ2:dpbaseline!AJ76)

=MIN(dpfollowup!U2:dpfollowup!U38)

Maximum

=MAX(dpbaseline!AJ2:dpbaseline!AJ76)

=MAX(dpfollowup!U2:dpfollowup!U38)

Mean (average)

=AVERAGE(dpbaseline!AJ2:dpbaseline!AJ76)

=AVERAGE(dpfollowup!U2:dpfollowup!U38)
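For comparison, the same four statistics can be sketched in Python. Like Excel's COUNT, MIN, MAX, and AVERAGE, the sketch uses only numeric entries and skips blanks; the BMI values are made up for illustration.

```python
def describe(values):
    """Count, minimum, maximum, and mean of the numeric entries, skipping blanks (None)."""
    numbers = [v for v in values if isinstance(v, (int, float))]
    if not numbers:
        return {"n": 0, "min": None, "max": None, "mean": None}
    return {
        "n": len(numbers),
        "min": min(numbers),
        "max": max(numbers),
        "mean": sum(numbers) / len(numbers),
    }


# Hypothetical BMI columns; None stands for an empty cell.
baseline_bmi = [31.0, 28.5, None, 35.5]
followup_bmi = [30.0, None, 34.0]

print("baseline:", describe(baseline_bmi))
print("follow-up:", describe(followup_bmi))
```

Note that the two counts differ, so the baseline and follow-up means describe different groups of participants.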

Check that resulting values make sense

Using Selection Criteria to Compute Means by Gender

Merge Cells for Gender Categories

After merging gender category cells…

AVERAGEIF function

=AVERAGEIF(Range, Criteria, Average_range)

– Range - the group of cells the function is to search.

– Criteria - this value is compared with the data in the Range. If a match is found, then the corresponding data in the Average_range is averaged. Actual data or the cell reference to the data can be entered for this argument.

– Average_range (optional) - the data in this range of cells is averaged when matches are found between the Range and Criteria arguments. If the Average_range argument is omitted, the data matched in the Range argument is averaged instead.

Add in the appropriate formulas to compute the mean by gender and timepoint

=AVERAGEIF(dpbaseline!K2:dpbaseline!K76,1,dpbaseline!AJ2:dpbaseline!AJ76)

=AVERAGEIF(dpfollowup!F2:dpfollowup!F38,1,dpfollowup!U2:dpfollowup!U38)

=AVERAGEIF(dpbaseline!K2:dpbaseline!K76,0,dpbaseline!AJ2:dpbaseline!AJ76)

=AVERAGEIF(dpfollowup!F2:dpfollowup!F38,0,dpfollowup!U2:dpfollowup!U38)
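The logic of AVERAGEIF can be sketched in Python as a filter followed by a mean. The gender codes 1 and 0 follow the formulas above (the data dictionary defines which code is which gender), and the data values are made up for illustration.

```python
def average_if(criteria_range, criterion, average_range):
    """Mean of the average_range entries whose matching criteria_range entry equals criterion."""
    selected = [
        value
        for key, value in zip(criteria_range, average_range)
        if key == criterion and isinstance(value, (int, float))  # skip blanks
    ]
    if not selected:
        return None  # Excel would display #DIV/0! in this situation
    return sum(selected) / len(selected)


# Hypothetical columns: gender codes and BMI for the same five rows.
gender = [1, 0, 1, 0, 1]
bmi = [30.0, 28.0, 34.0, None, 32.0]

print(average_if(gender, 1, bmi))  # 32.0
print(average_if(gender, 0, bmi))  # 28.0
```

As in Excel, the two ranges must line up row by row, or the wrong values will be averaged.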

Check that resulting values make sense

Create Bar Graph: Highlight data cells; Use “Insert” menu

Review Graph

Adjust Y-Axis Range

Double-click on y-axis to get the dialogue box to appear

Delete Legend: Click on legend and press delete key

Delete Grid Lines

Double-click on grid lines to get the dialogue box to appear

Change Fill Style

Double-click on one bar to get the dialogue box to appear

Label Y-Axis: Click on graph border; Use “Layout” menu

Add Graph Title: Click on graph border; Use “Layout” menu

Review Final Graph; Can copy and paste graph into Word document

Cautions when using Excel

Data from the same participants are not really merged – comparisons between baseline and follow-up are based on data with different sample sizes

- Missing data issues
- Filtering issues
- Copying and pasting formulas
- Graphing options
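The first caution can be made concrete with a short Python sketch that matches participants by their unique identifier, so that baseline and follow-up values come from the same people. The ids and BMI values are hypothetical; the actual identifier variable is documented in the Instruction Manual.

```python
def pair_by_id(baseline, followup):
    """Keep only participants with a non-missing value at both time points."""
    pairs = {}
    for participant_id, before in baseline.items():
        after = followup.get(participant_id)
        if before is not None and after is not None:
            pairs[participant_id] = (before, after)
    return pairs


# Hypothetical BMI values keyed by participant id; None is a missing value.
baseline_bmi = {"P001": 31.0, "P002": 28.5, "P003": 35.5, "P004": None}
followup_bmi = {"P001": 30.0, "P003": 34.0, "P005": 29.0}

paired = pair_by_id(baseline_bmi, followup_bmi)
print(len(baseline_bmi), len(followup_bmi), len(paired))  # 4 3 2
print(paired)
```

Only the matched pairs support a true before-and-after comparison; the unmatched rows explain why the simple Excel summaries rest on different sample sizes.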

More Advanced Statistics

Inferential Statistics

Example question: Is a change in a particular measurement (e.g., BMI) from baseline to follow-up in a particular sample statistically significant?

Techniques: Paired t-test, ANOVA, Regression, etc.

Specialized statistical software: SPSS, SAS, Stata, etc.
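To show what a paired t-test computes, here is a minimal Python sketch of the test statistic: the mean of the within-participant differences divided by its standard error. The values are made up, and the sketch stops at the t statistic and degrees of freedom; statistical software such as the packages named above would also report the p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    # Difference for each participant: follow-up minus baseline.
    diffs = [a - b for b, a in zip(before, after)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical BMI values for the same four participants at both time points.
before = [31.0, 28.5, 35.5, 30.0]
after = [30.0, 28.0, 34.0, 30.5]

t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

The two lists must be in the same participant order, which is why the data need to be matched by identifier first.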

Importing Excel into SPSS

Two methods

Menu-driven

Syntax

Click File → Open → Data

Choose “Excel” from Files of type; Browse to Excel file you want to import; Click Open

Check Options; Check Worksheet name; Check Worksheet range; Click OK

Check that all variables were imported; Check variable formats

Check that all cases were imported

SPSS Import Syntax

GET DATA
  /TYPE=XLS
  /FILE='C:\Users\dilled\Desktop\dpbaseline.xls'
  /SHEET=name 'dpbaseline'
  /CELLRANGE=full
  /READNAMES=on
  /ASSUMEDSTRWIDTH=32767.
EXECUTE.
DATASET NAME DataSet1 WINDOW=FRONT.

Importing Excel into SAS

Multiple methods

Import wizard

Proc import

Click File → Import Data

Choose “Microsoft Excel Workbook (*.xls, *.xlsx)”; Click Next

Browse to the desired Excel file

Choose worksheet from Excel file to import; Click Options

Check options

Click Next

Choose library and SAS database name; Click Next

Choose name of syntax file to save (optional); Click Finish

Check that all cases & all variables were imported; Check variable formats

PROC IMPORT

proc import OUT=work.dp_baseline
    DATAFILE="C:\Users\dilled\Desktop\dpbaseline.xls"
    DBMS=EXCEL REPLACE;
    SHEET="dpbaseline";
    GETNAMES=YES;
    MIXED=YES;
    USEDATE=YES;
    SCANTIME=YES;
run;

proc contents data=dp_baseline;
run;

Where to find help…

For help with advanced statistical analyses, consider contacting:

Local university

IHS service unit

Tribal planning office

Questions?
