National Health and Nutrition Examination Survey: A Very General Overview
Taken from various NHANES sources and Lein’s comments.
NHANES Objective
To measure and assess the health
and nutritional status of adults and
children in the United States
When did NHANES start?
• The Health Examination Survey was the forerunner, in the 1960s.
• The first three National Health Examination Surveys (NHES) were conducted between 1960 and 1970. These surveys were known as NHES I, II, and III.
• Between 1971 and 1975, a large nutrition component was added and the name was changed to NHANES.
Sample
• Civilian, non-institutionalized household population:
• Residents of all states and the District of Columbia
• All ages
• Unique in combining a home interview with health examinations conducted in a Mobile Examination Center (MEC)
• New Survey Available: 2009-2010
Six Principal Data Collection Methods
• Household interview
• Personal interviews
• Physical examination (MEC)
• Anthropometry (MEC)
• Diagnostic screening (MEC)
• Laboratory analysis (MEC)
NHANES Mobile Exam Center (MEC)
MEC examination components
• Dietary interviews/MEC interviews
• Phlebotomy
• Urine collection
• Blood pressure
• Physician’s exam
• Hearing
• Eye exam
• Dental exam
• DXA
• Muscle strength
• Balance
• Anthropometry
• Skin disease/Melanoma
• TB skin test
• Cognitive testing
• Cardiorespiratory fitness
• Peripheral vascular disease
• Peripheral neuropathy
The major categories of NHANES data files
• Demographics files: survey design and demographic variables
• Examination files: information collected through physical exams, dental exams, and dietary interview components
• Laboratory files: results from specimens such as blood, urine, hair, air, tuberculosis skin test, and household dust and water specimens
• Questionnaire files: household interview and mobile examination center (MEC) interview
Examples of NHANES Findings and Uses
Landmark findings and public health results
• High blood lead levels → Lead out of gasoline
• Low folate levels → Mandatory food fortification
• Rising levels of obesity → Public health action plan
• Racial and ethnic disparities in Hepatitis B → Universal vaccination of all infants & children
[Figure: Trends in Child and Adolescent Overweight. Y-axis: percent (0 to 20); x-axis: survey period (1963-5, 1966-70, 1971-4, 1976-80, 1988-94, 1999-00, 2001-2, 2003-4); series: ages 6-11 y and 12-19 y.]
Note: Overweight is defined as BMI >= gender- and age-specific 95th percentile from the 2000 CDC Growth Charts.
Source: National Health Examination Surveys II (ages 6-11) and III (ages 12-17), National Health and Nutrition Examination Surveys I, II, III and 1999-2004, NCHS, CDC.
NHANES Complex Survey Design
(sometimes called “multi-stage” survey design)
NHANES data are NOT obtained using a simple random sample. Rather, a “complex”, multistage, probability sampling design is used to select participants representative of the civilian, non-institutionalized US population.
In Brief
• The entire US is broken into about 40 strata.
• Each stratum is divided into many primary sampling units (PSUs) – mostly single counties.
• Within each stratum two PSUs are selected.
• Within each of the selected PSUs, individual households are selected and then individual subjects are selected.
Household/Individual Oversampling
In some geographic areas, certain age, ethnic, or income groups are oversampled so that estimates for those subgroups can be reported accurately.
E.g.: Native American subjects
More on Individual Weights
The sample weight is assigned to each individual subject. It is a measure of the number of people in the population represented by that sample person in NHANES.
This design produces three kinds of survey design variables:
• STRATA (names end with the letters “STRA”)
• PSUs (names end with “PSU”)
• Individual WEIGHTs (names begin with the prefix “WT”)
Do I have to use sample weights and other survey design variables?
• Yes. For NHANES datasets, the use of sampling weights and sample design variables is recommended for all analyses.
• If you fail to account for the sampling parameters, you may obtain biased estimates and overstate significance levels.
Selecting the Correct Weight
To produce estimates appropriately adjusted for survey non-response, it is important to check all of the variables in your analysis and select the weight of the smallest analysis subpopulation.
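As a rough illustration of the rule above, the sketch below estimates mean body mass index, which is measured in the MEC, so the MEC exam weight is the one used rather than the interview weight. The data values are invented for illustration only; the variable names (SDMVSTRA, SDMVPSU, WTINT2YR, WTMEC2YR, BMXBMI) follow NHANES naming, but check the documentation for your survey cycle before relying on them.

```sas
/* Toy data, invented for illustration; real values come from the NHANES files. */
data toy;
  input SEQN SDMVSTRA SDMVPSU WTINT2YR WTMEC2YR BMXBMI;
  datalines;
1 1 1 1000 1100 22.5
2 1 1 1200 1300 27.1
3 1 2  900  950 30.2
4 1 2 1100 1250 24.8
5 2 1 1500 1600 26.0
6 2 1 1300 1400 21.7
7 2 2 1000 1050 29.3
8 2 2 1400 1500 25.5
;
run;

/* BMXBMI comes from the MEC exam, the smallest subsample involved here, */
/* so the exam weight WTMEC2YR is used instead of WTINT2YR.              */
proc surveymeans data=toy mean stderr;
  strata  SDMVSTRA;
  cluster SDMVPSU;
  weight  WTMEC2YR;
  var     BMXBMI;
run;
```

If the analysis also included a dietary recall variable, the dietary weight (for example WTDRD1) would presumably take its place, since the dietary subsample is smaller still.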
Using Weighting Variables in SAS
SAS survey procedures take three statements for the design variables:
• STRATA Statement (for Strata Vars)
• CLUSTER Statement (for PSU Vars)
• WEIGHT Statement (for Weight Vars)
An example of a SAS Survey Procedure with NHANES data
(always begins with the prefix “survey”)
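proc surveymeans;      /* no DATA= option: uses the most recently created data set */
  var kcal;            /* analysis variable */
  cluster SDMVPSU;     /* PSU variable */
  strata SDMVSTRA;     /* stratum variable */
  weight WTDRD1;       /* sample weight */
run;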
What kinds of NHANES documents are available online?
• Codebook -- The codebook portion lists all the variables in the data file.
• Data file documentation – Provides a brief description of the file.
• Frequency Tables -- Contains the frequency count for each item in the data file and can be used to verify the sample size.
Where and how can I access NHANES data files?
• NHANES data can be downloaded from the NHANES website.
http://www.cdc.gov/nchs/nhanes.htm
• NHANES files are in SAS transport file format (.xpt).
• .xpt files can be read directly by SAS or converted to a .sas7bdat file with StatTransfer.
How do you read .xpt files directly with SAS?
libname demog xport 'c:\nhanes\demo_e.xpt';
data demo;
  set demog.demo_e;
run;

SAS log:
1    libname demog xport 'c:\nhanes\demo_e.xpt';
NOTE: Libref DEMOG was successfully assigned as follows:
      Engine:        XPORT
      Physical Name: c:\nhanes\demo_e.xpt
2    data demo; set demog.demo_e;
3    run;
NOTE: There were 10149 observations read from the data set DEMOG.DEMO_E.
NOTE: The data set WORK.DEMO has 10149 observations and 43 variables.
StatTransfer: another option for using SAS .xpt files
You can also use StatTransfer to convert SAS .xpt files into .sas7bdat files.
Do I have to format and label all variables?
• NHANES provides variable labels built into their data sets.
• Formats are not included, so you must create your own formats by using PROC FORMAT and the FORMAT statement.
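A minimal sketch of that two-step pattern follows: define a format with PROC FORMAT, then attach it with a FORMAT statement. The format name sexfmt and the two-row data set are made up for illustration; RIAGENDR (1 = male, 2 = female) is the NHANES gender variable, but confirm the codes against the codebook for your cycle.

```sas
/* Step 1: define a value format. The name sexfmt is hypothetical. */
proc format;
  value sexfmt 1 = 'Male'
               2 = 'Female';
run;

/* Toy data, invented for illustration. */
data demo;
  input SEQN RIAGENDR;
  /* Step 2: attach the format to the variable. */
  format RIAGENDR sexfmt.;
  datalines;
1 1
2 2
;
run;

/* The frequency table now shows the labels instead of the raw codes. */
proc freq data=demo;
  tables RIAGENDR;
run;
```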
Lots of Merging with NHANES
The data files remain separate by type of measurement. This requires that you merge files together for analysis.
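One way such a merge might look is sketched below, assuming both files carry the NHANES respondent sequence number SEQN as the key. The data set names (demo, bmx) and the values are invented for illustration.

```sas
/* Toy demographics file, invented for illustration. */
data demo;
  input SEQN RIAGENDR RIDAGEYR;
  datalines;
1 1 34
2 2 51
3 2 8
;
run;

/* Toy body-measures file; respondent 2 has no exam record. */
data bmx;
  input SEQN BMXBMI;
  datalines;
1 24.3
3 17.9
;
run;

/* MERGE with BY requires both data sets to be sorted by the key. */
proc sort data=demo; by SEQN; run;
proc sort data=bmx;  by SEQN; run;

data merged;
  merge demo(in=in_demo) bmx(in=in_bmx);
  by SEQN;
  if in_demo;   /* keep every respondent in the demographics file */
run;
```

Keeping all respondents from the demographics file, with missing values where an exam record is absent, is a design choice here; an inner join (`if in_demo and in_bmx;`) would drop respondent 2.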
Things to know about Survey Analysis
• Not all software packages are equipped to analyze complex survey data.
• See “Summary of Survey Analysis Software” at Harvard Med’s site for list and limitations: www.hcp.med.harvard.edu/statistics/survey-soft/
• All have limitations
More…
• Stata allows for an interactive “survey mode”.
• SPSS provides limited menu-driven procedures.
• SAS provides limited procedures.
• SUDAAN is a program that can work with SAS to expand its survey procedures. Not widely available at UCB.
Analyzing Sub-Populations in Survey Analysis
When analyzing sub-populations in complex survey design, it’s important NOT to subset your data. Instead, create a sub-population indicator variable. Correctly written statistical software will allow for the sub-population variable to be included in the model specification.
Quote from AJE Article (Graubard and Korn, 1996)
“One frequently analyzes a subset of the data collected in a survey when interest focuses on individuals in a certain subpopulation of the sampled population. Although it may seem natural to eliminate from the data set all data from individuals outside the subpopulation before analysis, this procedure may yield incorrect standard errors and confidence intervals.”
Why is this? It seems counterintuitive.
In complex (multi-stage) survey designs, the weighting variables for all subjects are used to compute the standard errors for sub-populations. The mathematics is complex, but in general terms the relative weight of each subject can only be fully accounted for by analyzing all of the subjects.
How does SAS deal with sub-populations?
Most SAS survey procedures use a DOMAIN statement for sub-population analysis. This statement identifies a sub-population indicator variable.
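A small sketch of the DOMAIN approach follows, using PROC SURVEYMEANS. The data values and the indicator name female are invented for illustration; kcal stands in for an energy-intake variable as in the earlier example. Note that no rows are deleted: the full sample stays in the data set and the indicator picks out the sub-population.

```sas
/* Toy data, invented for illustration: 2 strata x 2 PSUs x 2 respondents. */
data toy;
  input SDMVSTRA SDMVPSU WTDRD1 RIAGENDR kcal;
  /* Sub-population indicator: 1 = female, 0 = male (hypothetical name). */
  female = (RIAGENDR = 2);
  datalines;
1 1 1000 1 2400
1 1 1200 2 1900
1 2  900 1 2600
1 2 1100 2 2000
2 1 1500 1 2300
2 1 1300 2 1800
2 2 1000 1 2500
2 2 1400 2 2100
;
run;

/* All respondents are analyzed; DOMAIN requests estimates within each  */
/* level of the indicator, with variances based on the full design.     */
proc surveymeans data=toy mean stderr;
  strata  SDMVSTRA;
  cluster SDMVPSU;
  weight  WTDRD1;
  domain  female;
  var     kcal;
run;
```

The contrast to avoid is a WHERE statement or a subsetting IF that drops the male rows before the procedure runs.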
Your SAS Review Assignment will provide some practice in using NHANES data with SAS.