ARIC
Why Large Databases as a Topic?
• Publicly available.
• Many relevant questions unanswered.
• Information about health disparities.
• Good for testing/validating new methods.
• Relevant for health disparities.
Health care, not econometrics; cohort, not claims
ARIC
Outline
ARIC Study
• Description
• Science
• Results
• Health disparities
Genomic data
Other large databases
ARIC
The Atherosclerosis Risk in Communities (ARIC) Study is an NHLBI-sponsored study of cardiovascular disease in four communities in the United States.
Includes a Community Surveillance and a Cohort Component.
ARIC
CHD and Atherosclerosis
ARIC
Cohort Component
Probability samples of 4 communities
15,792 men and women, 45-64 yrs at baseline examination (1987-1989)
Re-examined every three years
• 1987-1989, 1990-1992, 1993-1995, 1996-1998
Extensive examinations include medical, social and demographic data
Annual follow-ups by telephone to maintain contact and assess health status
ARIC
Community Surveillance Component
CVD endpoint surveillance of all residents of the 4 communities, ages 35-74 years
Ascertainment and classification of coronary and cerebral clinical events, trends over time
ARIC
Characteristics of the Four ARIC Communities
Study Community            Population, ages 35-74   Population, total   % Black   % >12 yrs education
Forsyth County, NC         95,863                   243,683             24        63
Jackson, MS                68,303                   202,895             48        71
Minneapolis suburbs, MN    69,338                   192,004             1         85
Washington County, MD      45,539                   113,068             4         60
Total (four communities)   279,043                  751,650
ARIC
Measure Variation in Cardiovascular Risk Factors, Medical Care & Disease by Race, Sex, Place & Time
ARIC communities differ in their reported cardiovascular mortality rates; atherosclerosis prevalence rates may also differ
Ecologic comparison of community rates with factors that may influence these rates
Age-adjusted mortality rates* for men & women aged 35-74 years in ARIC study communities, 1980
Study Community            All-Cause Mortality      Heart Disease Mortality
                           Men       Women          Men       Women
Forsyth County, NC         16.3      8.7            6.7       2.7
Jackson, MS                20.8      10.0           6.6       2.9
Minneapolis suburbs, MN    9.4       6.3            4.2       1.3
Washington County, MD      16.1      8.2            7.8       2.8
US Total                   14.4      8.0            5.7       2.6
*Indirect age adjustment; annual rate per 1,000 population
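
The footnote says the rates were indirectly age-adjusted. A minimal sketch of how indirect adjustment is usually done follows: apply the standard population's age-specific rates to the community's age structure to get expected deaths, take the ratio of observed to expected deaths, and scale the standard population's crude rate by that ratio. The function name, the age bands and every number below are hypothetical; the slide does not give the underlying counts.

```python
def indirect_adjusted_rate(observed_deaths, community_pop_by_age,
                           standard_rate_by_age, standard_crude_rate):
    """Indirectly age-adjusted rate (same units as standard_crude_rate).

    community_pop_by_age: {age band: persons in the community}
    standard_rate_by_age: {age band: deaths per person in the standard population}
    """
    # Deaths expected if the community experienced the standard age-specific rates
    expected = sum(community_pop_by_age[band] * standard_rate_by_age[band]
                   for band in community_pop_by_age)
    smr = observed_deaths / expected  # standardized mortality ratio
    return smr * standard_crude_rate


# Hypothetical inputs, for illustration only
pop = {"35-54": 10_000, "55-74": 5_000}
std_rates = {"35-54": 0.004, "55-74": 0.020}
rate = indirect_adjusted_rate(168, pop, std_rates, standard_crude_rate=0.010)
print(round(rate * 1000, 1), "per 1,000")
```

With these made-up inputs the expected count is 140 deaths, so 168 observed deaths give a ratio of 1.2 and an adjusted rate of about 12 per 1,000.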
ARIC
Sampling Framework
Probability sample from the previous census, except for Jackson, MS, which is an all black sample
In Forsyth, the original sampling unit was a household.
In the other three locations, the sampling unit was an individual
• Jackson, MS – driver’s license database
• Minneapolis, MN – eligible for jury duty (driver’s license and voters)
• Washington County, MD – driver’s license database
ARIC
Achilles Heel
Given what I just said, what is ARIC’s Achilles heel?
Confounding between race and geography!
Terrible decision!
ARIC
Elements of Baseline Examination
Sitting blood pressure – 3 measurements w/ random zero sphygmomanometer
Anthropometry – weight, standing & sitting height, triceps & subscapular skinfolds, waist, hip, arm & calf girths, wrist breadth
Venipuncture – fasting blood samples for lipids, hemostasis, hematology & chemistry
Electrocardiogram – digitally recorded 12-lead electrocardiogram & 2-minute rhythm strip
ARIC
Lipid Determinations
8 or 12 hour (overnight) fast information
Central laboratory
CDC certification
Cholesterol measured enzymatically
HDL measured by precipitation
LDL estimated by Friedewald formula: LDL = Total – HDL – (Trigs/5)
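
As a quick illustration of the Friedewald estimate named on this slide, here is a small sketch with all values in mg/dL. The function name is made up for the example, and the 400 mg/dL triglyceride cutoff is not on the slide; it is the commonly cited limit above which the estimate is considered unreliable.

```python
def friedewald_ldl(total_chol, hdl, triglycerides):
    """Estimate LDL cholesterol (mg/dL) as Total - HDL - (Trigs / 5)."""
    # Assumption (not from the slide): refuse to estimate when triglycerides
    # exceed 400 mg/dL, where the formula is generally considered invalid.
    if triglycerides > 400:
        raise ValueError("Friedewald estimate not valid for triglycerides > 400 mg/dL")
    return total_chol - hdl - triglycerides / 5


print(friedewald_ldl(200, 50, 150))  # 200 - 50 - 30
```

The formula assumes a fasting sample, which is why the slide pairs it with the fasting information.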
ARIC
ATP III Classification (mg/dL)
LDL Cholesterol      Description
<100                 Optimal
100-129              Near optimal
130-159              Borderline high
160-189              High
≥190                 Very high
Total Cholesterol    Description
<200                 Desirable
200-239              Borderline high
≥240                 High
HDL Cholesterol      Description
<40                  Low
≥60                  High
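
The cut points on this slide translate directly into lookup functions. The sketch below is one way to write them, with values in mg/dL; the function names are invented for the example, and it treats the boundary values 190, 240 and 60 as belonging to the higher category, as in the ATP III report. HDL values from 40 to 59 carry no label on the slide, so the sketch returns None for them.

```python
def classify_ldl(ldl):
    """ATP III description for an LDL cholesterol value (mg/dL)."""
    if ldl < 100:
        return "Optimal"
    if ldl < 130:
        return "Near optimal"
    if ldl < 160:
        return "Borderline high"
    if ldl < 190:
        return "High"
    return "Very high"


def classify_total_cholesterol(tc):
    """ATP III description for a total cholesterol value (mg/dL)."""
    if tc < 200:
        return "Desirable"
    if tc < 240:
        return "Borderline high"
    return "High"


def classify_hdl(hdl):
    """ATP III description for an HDL value (mg/dL); None if unlabeled."""
    if hdl < 40:
        return "Low"
    if hdl >= 60:
        return "High"
    return None  # 40-59 mg/dL has no label on the slide


print(classify_ldl(138.0), "|", classify_total_cholesterol(215), "|", classify_hdl(35))
```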
ARIC
Elements of Baseline Examination (cont’d)
Ultrasound, postural change – B-mode scan for wall & lumen measurements in both carotid arteries & 1 popliteal artery; supine brachial & ankle blood pressures, heart rate & blood pressures as participant rises
Interview – medical history, physical activity, TIA & respiratory symptoms, reproductive history, medication use, food frequency
Pulmonary function – digitally recorded forced vital capacity & timed expiratory volumes
ARIC
Elements of Baseline Examination (cont’d)
Physical exam – brief exam including heart, lungs & extremities; neurologic & breast exam
Medical data review – verify selected positive findings, report selected results to participants, refer for diagnosis or treatment
Reporting of results (deferred) – mail results from routine medical tests to participants & their physicians
ARIC
Definition of Hypertension
Systolic BP > 140 mmHg
Diastolic BP > 90 mmHg
Regular use of medications for high blood pressure or hypertension (participants brought all medications with them to the examination)
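
The three criteria above can be sketched as a single check. This assumes that meeting any one criterion is enough, which the slide does not state explicitly, and it keeps the strict "greater than" comparisons exactly as written on the slide. The function and argument names are hypothetical.

```python
SYSTOLIC_CUTOFF_MMHG = 140   # from the slide
DIASTOLIC_CUTOFF_MMHG = 90   # from the slide


def is_hypertensive(systolic, diastolic, on_bp_medication):
    """True if any criterion is met (the 'any of' rule is an assumption)."""
    return bool(
        systolic > SYSTOLIC_CUTOFF_MMHG
        or diastolic > DIASTOLIC_CUTOFF_MMHG
        or on_bp_medication
    )


print(is_hypertensive(150, 80, False))  # elevated systolic pressure
print(is_hypertensive(120, 80, True))   # normal reading but on medication
```

Including medication use matters because treated participants may have normal readings at the examination.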
ARIC
Measurements of the Environment
Smoking
Alcohol
Diet
Exercise
Education and income
Psychosocial
Employment
GIS
Some previous exposures
Medications
Biomarkers
ARIC
Data Collection & Quality Control
Immediate entry of data from interviews & exams into computer-assisted data collection system; data monitoring
Trained & certified staff; monitored performance; implement recertification & retraining as needed
Selected measures repeated during exams by same & different technicians
Duplicate blood samples drawn & shipped to labs with separate IDs; duplicate electrocardiograms transmitted blindly to ECG center
ARIC
Study Questions
“It is better to know some of the questions than all of the answers.” James Thurber
ARIC
Study Questions
Diversity of measurements included in ARIC permits many important questions to be addressed
3 primary objectives
• Investigate the etiology and natural history of atherosclerosis
• Investigate the etiology of clinical atherosclerotic diseases (especially incident diseases)
• Measure variation in cardiovascular risk factors, medical care and disease by race, sex, place and time
ARIC
Investigate the Etiology & Natural History of Atherosclerosis
Ultrasound used to identify signs of early arterial disease
• Arterial wall dimensions; arterial distensibility
Expect atherosclerosis to be associated with the following lipid parameters
• Elevated levels of total cholesterol, LDL-C, apoB, Lp(a), TGs
• Reduced levels of HDL-C, apoA-I
• Predominance of small LDL
• DNA variations in specific genes (apolipoprotein E)
ARIC
Investigate the Etiology & Natural History of Atherosclerosis (cont’d)
Evaluate associations of atherosclerosis with factors that are less directly related to lipid and thrombosis theories
• Established risk factors (hypertension, smoking)
• Fasting insulin and glucose levels
• Routine hematologic measures (WBC, RBC and platelet counts, hematocrit)
• Lifestyle factors (diet, physical activity)
ARIC
Investigate the Etiology of Clinical Atherosclerotic Diseases
Study both risk factors and indicators of pre-clinical disease in relation to subsequent incident CHD and stroke
Risk factors measured in ARIC permit testing of new hypotheses
Indications of preclinical disease include not only ultrasound measurements but also
• Ankle-arm index of peripheral vascular disease
• Subtle changes in digitized electrocardiogram
ARIC
Processed CCA and Plaque Images
[Figure panels: fibrous cap segmentation, plaque segmentation, CCA segmentation]
ARIC
Measurement/ascertainment of incident disease
Study both risk factors and indicators of pre-clinical disease in relation to subsequent incident CHD and stroke
Ascertainment of incident disease
• Limited to CHD, CVD, Stroke (hospitalized)
• Goal is 100% ascertainment
• Annual telephone contact
• Hospital record abstraction
• Death certificates, death indices
• Adjudication
ARIC
Effects of Study Design
ARIC’s ability to meet its objectives is enhanced by several design features
Consistency is evaluated by studying associations in four geographic locations among men, women, blacks and whites
Generalizability is examined by nesting cohorts into communities covered by broad surveillance
• Permits interpretation of study results in terms of representativeness of cohort participants & their CHD events in their communities & the characteristics of those communities
ARIC
Effects of Study Design (cont’d)
Surveillance rates are monitored and validated by each community cohort in two ways
• Replication of event identification, investigation and diagnosis activity
• Greater effort for accuracy that is afforded each potential cohort event
Cohorts also provide information on risk factors, preclinical disease and medical care which are used to interpret the rates of clinical disease found in surveillance
ARIC
Effects of Study Design (cont’d)
ARIC cohort study is prospective
• Design of choice for identifying precursors of disease
• Important for studying any potential risk factor that may be influenced by disease or by changes in medications, diet or habits resulting from disease
ARIC observes directly the early signs of atherosclerosis, assessing the association of factors with atherosclerosis in particular
• Attempts to unravel some complexity by investigating risk factor associations with both atherosclerosis and its clinical sequelae
ARIC
Effects of Study Design (cont’d)
Statistical power in ARIC permits
• Subgroup analyses: “Is fibrinogen associated with atherosclerosis in subjects who do or do not smoke?”
• Comparisons of the strength of correlated variables: “Which has the stronger association with atherosclerosis, central or peripheral obesity?”
• Comparison of risk factor effects: “Are there CHD risk factors that are not associated with atherosclerosis?”
ARIC benefits from progress in modern biochemistry
• Storage of multiple aliquots of blood allows for continual utilization of new biochemical technology
ARIC
A Sampling of ARIC Cohort Publications
Risk factors / predictors of prevalent and incident:
• Coronary heart disease
• Stroke
• Diabetes
• Obesity
• Hypertension
• Venous thromboembolism
• Renal dysfunction
ARIC
A Sampling of ARIC Cohort Publications
Risk factors / predictors of subclinical vascular diseases:
• Carotid atherosclerosis
• Cerebral infarcts, white matter disease
• Peripheral arterial disease
• Microvascular retinal disease
• Arterial stiffness
• Cardiac autonomic tone
ARIC
ARIC Ancillary Studies
To enhance the value of ARIC and to promote the advancement of science, ARIC welcomes proposals from individual investigators to carry out ancillary studies
An ancillary study is one based on information from ARIC participants in an investigation that is not described in the ARIC protocol
• Involves data collection or data analyses under additional funding that are not included as part of the routine ARIC data set or data analyses
ARIC
Active ARIC Ancillary Studies
Intimately tied to ARIC, with new data collection and external funding
• Periodontal disease, subclinical atherosclerosis and CVD
• Chronic inflammation of endodontic origin
• Longitudinal investigation of venous thromboembolism
• Life course SES and CVD
• Using historical records to reconstruct SES exposures in decedents
• Physical activity in context of the environment
• Cardiovascular responses to particulate air pollution
ARIC
Active ARIC Ancillary Studies
Lab-based ancillary studies
• Gene-environment interactions and CVD
• Genetic determinants of diabetes
• Novel biomarkers of atherosclerosis
Studies conducted independently
• Jackson Heart Study
• Family Heart Study
• Sleep Heart Health Study
Meta-analyses (data contributed)
ARIC
Cohort Baseline Characteristics
Percentage with risk factor
                   Black Women   Black Men   White Women   White Men
Hypertension       57%           55%         26%           29%
Diabetes           21%           19%         8%            10%
Current Smoker     25%           38%         25%           25%
ARIC
Cohort Baseline Characteristics
Mean risk factor level
                   Black Women   Black Men   White Women   White Men
HDL (mg/dl)        57.8          50.4        57.4          42.6
LDL (mg/dl)        138.0         137.3       135.6         140.0
BMI (kg/m²)        30.8          27.6        26.6          27.4
ARIC
Summary of Incident Events, 1987-2002
                   Black Women   Black Men   White Women   White Men
Stroke             153 (6%)      111 (8%)    135 (2%)      178 (4%)
CHD                184 (8%)      190 (13%)   389 (7%)      857 (18%)
*Prevalent cases excluded
ARIC
ARIC Baseline Characteristics: Gender/Racial Differences in Drinking, Smoking and BMI
[Bar chart: percentage (axis 0-80%) of White males, White females, Black males and Black females who are current drinkers, current smokers, or have BMI≥30]
ARIC
Gender/Racial Differences in HDL Levels, by Drinking Status
[Bar chart: HDL (axis 0-70) for White males, White females, Black males and Black females, by drinking status (never, low-moderate, heavy); asterisks mark significant differences]
Low-Mod Drinker = ≤2 drinks/day
Heavy Drinker = >2 drinks/day
*P<0.001
ARIC
Gender/Racial Differences in TG Levels, by Drinking Status
[Bar chart: triglycerides (axis 70-150) for White males, White females, Black males and Black females, by drinking status (never, low-moderate, heavy); asterisks mark significant differences]
Low-Mod Drinker = ≤2 drinks/day
Heavy Drinker = >2 drinks/day
*P<0.05
ARIC
CHD Risk is Influenced by Interaction between Drinking Status & Genotype
[Figure, three panels:]
• CHD Risk in Whites for Alcohol-Related SNP (no alcohol): genotype TT vs CT+CC
• CHD Risk in Whites (no genotype): never, low-mod and heavy drinkers; P=0.01
• CHD Risk in Whites by Alcohol Intake and Genotype: TT vs CT+CC among never, low-mod and heavy drinkers; P<0.001
ARIC
Stroke Risk is Influenced by Interaction between Drinking Status & Genotype
[Figure, three panels:]
• Stroke Risk in Whites for Alcohol-Related SNP (no alcohol): genotype AA vs AG+GG
• Stroke Risk in Whites (no genotype): never, low-mod and heavy drinkers
• Stroke Risk in Whites by Alcohol Intake & Genotype: AA vs AG+GG among never, low-mod and heavy drinkers; P=0.008
ARIC
ARIC: Sustainable Philosophy
Role of epidemiologic research in the investigation of etiologic hypotheses is one of active interchange with other disciplines
Basic discoveries often come first in epidemiology
• Importance of specific lipoprotein fractions was found first in population studies, leading to specific investigations of cholesterol transport
Multidisciplinary team of ARIC investigators hopes to promote such scientific interchange
ARIC
ARIC: Future Goals
Another examination of the entire cohort.
• Healthy aging
• Cognitive decline
Imaging
Some day we will all know our DNA sequence?
• First population-based cohort with the complete DNA sequence?
• Analysis
ARIC
Genome-wide Association
300,000 – 1,000,000 markers
[Diagram: cases and controls compared at each marker, SNP1, SNP2, SNP3, ..., SNPn]
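
To make the case-control comparison at each marker concrete, here is a minimal sketch of one common per-SNP test: an allelic chi-square test on a 2x2 table of allele counts, with a Bonferroni-corrected threshold for the number of markers. This is not necessarily the analysis used in ARIC; the function names and all allele counts are hypothetical.

```python
import math


def allelic_chi_square(case_alleles, control_alleles):
    """Pearson chi-square (1 df) on a 2x2 allele-count table.

    case_alleles, control_alleles: (count of allele A, count of allele a)
    Returns (chi-square statistic, p-value).
    """
    a, b = case_alleles
    c, d = control_alleles
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_threshold(n_markers, alpha=0.05):
    """Per-marker significance level when n_markers tests are run."""
    return alpha / n_markers


# Hypothetical allele counts for a single SNP
chi2, p = allelic_chi_square(case_alleles=(60, 40), control_alleles=(40, 60))
print(chi2, p, p < bonferroni_threshold(1_000_000))
```

With up to a million markers the per-marker threshold becomes very small, which is why a single scan is usually followed by replication in independent samples.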
ARIC
Genome-wide Association Scan for CHD
Genome-wide Scan: Ottawa Heart Institute #1
• Cases (n=323): CABG, MI < 60 yrs, no FH, no DM
• Controls (n=312): asymptomatic, > 65 yr
Replicate 1: Ottawa Heart Institute #2 (304 cases/326 controls)
Replicate 2: Atherosclerosis Risk in Communities (ARIC) (n=15,782)
SNP counts shown at successive stages: 2,586 SNPs; 50 SNPs; 2 SNPs
ARIC
SNP 107 and CHD risk
[Two bar charts by genotype (AA, AG, GG): relative risk (axis 0-1.5) and absolute risk (axis 0-16)]
ARIC
Predictive Ability of 9p21
Model                   AUC     Change in AUC   P value
CHD Risk Score only     0.776
Add 9p21                0.780   0.004           Significant
Add CRP                 0.778   0.002           Not significant
Individual risk factors do not cause large changes in the area under the CHD Risk Score ROC curve.
ROC curves plot sensitivity against one minus specificity; the area under the curve (AUC) is used by regulatory agencies to evaluate new diagnostics.
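
For readers unfamiliar with the AUC, the sketch below computes it directly from risk scores using its rank interpretation: the probability that a randomly chosen case has a higher score than a randomly chosen control, counting ties as one half. The function name and the scores are hypothetical and are not ARIC data.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0      # case correctly ranked above control
            elif case == control:
                wins += 0.5      # ties count as half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
print(auc(cases, controls))  # 8 of 9 case-control pairs are ranked correctly
```

Because the AUC depends only on how cases rank against controls, a marker that shifts a few participants' scores moves it very little, which is consistent with the small changes in the table.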
ARIC
ATP III Guidelines: classification using ACRS alone vs ACRS + 9p21 allele
Risk categories
• High: CHD and CHD risk equivalents; 10-year risk >20%; LDL-C goal <100 mg/dL
• Mid-high: Multiple (2+) risk factors; 10-year risk 10–20%; LDL-C goal <130 mg/dL
• Mid: Multiple (2+) risk factors; 10-year risk <10%; LDL-C goal <130 mg/dL
• Low: 0–1 risk factor; 10-year risk <10%; LDL-C goal <160 mg/dL
Rows: ATP III classification using ACRS alone. Columns: ATP III classification using ACRS + 9p21 allele.
ACRS alone                           High                 Mid-high             Mid                  Low
High       1,870 (372)   18.69%      1,760 (360)          109 (12) 3.95%*      0                    0
Mid-high   2,049 (219)   20.48%      217 (27) 10.59%*     1,701 (179)          131 (13) 6.39%*      0
Mid        1,737 (80)    17.36%      0                    179 (17) 10.31%*     1,558 (63)           0
Low        4,349 (107)   43.47%      0                    0                    0                    4,349 (107)
Total      10,004 (778)  (100%)      1,977 (19.76%)       1,989 (19.88%)       1,689 (16.88%)       4,349 (43.47%)
* Percentage of people reclassified. Numbers in parentheses are events over 10 years of follow-up.
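
The starred percentages are the share of each original category that moves to a different category once the 9p21 allele is added. As a small worked check, the sketch below recomputes them for the Mid-high row using the counts on the slide; the function name and the dictionary layout are just for illustration.

```python
def percent_reclassified(row, original_category):
    """Percent of a row's participants that land in each other category."""
    row_total = sum(row.values())
    return {
        new_category: round(100.0 * count / row_total, 2)
        for new_category, count in row.items()
        if new_category != original_category and count > 0
    }


# Mid-high row from the slide: counts by category after adding 9p21
mid_high_row = {"High": 217, "Mid-high": 1701, "Mid": 131, "Low": 0}
print(percent_reclassified(mid_high_row, "Mid-high"))
```

The two values it returns, 10.59 for the move up to High and 6.39 for the move down to Mid, match the starred figures in that row.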
ARIC
The Future is Here!
ARIC
The field of human genetics: the amount of data is growing
[Chart: number of variants (10s, 100s, 1000s, 1x10^5, 1x10^6, 10x10^6) by year (1980s, 1990s, 2000, 2007, 2010), spanning candidate genes, linkage, GWAS, and exome and whole-genome sequencing]
ARIC
Genome-wide association and whole-genome sequencing
• Potential to survey all genetic variation in the genome (or at least ~2.5 M variants!)
• Individual researchers can access this data
ARIC
NIH Genome-Wide Association Studies Policy
As a part of funding and generating GWAS data, public repositories have been developed
[Data-flow diagram. Labels: Research Participants (informed consent); Submitting Investigators (Data Collection; Data Submission); GWAS Data Repository (Submission & Management of Data; De-identified, Coded Data); Recipient Investigators (Data Access Request; Distribution & Secondary Use of Data)]
ARIC
NIH Genome-Wide Association Studies Policy
dbGaP is one of the central repositories
ARIC
NIH Genome-Wide Association Studies Policy
Open Access (summary level)
• Search for studies, review protocols and questionnaires
• View summary phenotype and genotype data
• View pre-computed or published genetic associations (after embargo)
• Identify studies of interest, view their consent conditions, and review terms for data access
• Locate potential collaborators for follow-up studies
No individual data!
ARIC
NIH Genome-Wide Association Studies Policy
Controlled Access (individual level)
• Request data for specific research use
• Agreement by PI and institution to terms of access in the Data Use Certification
[Diagram labels: dbGaP Database; Genotype & Phenotype Data; Public Access; Study Protocol; Descriptive Information; Coded Genotypes; Phenotypes; Pre-computes; Controlled Access; Specific Research Use; Data Access Committee; Specific access rights]
ARIC
Data Release
http://www.ncbi.nlm.nih.gov/gap
ARIC
Framingham Heart Study
In 1948, the Framingham Heart Study embarked on an ambitious project in health research. At the time, little was known about the general causes of heart disease and stroke, but the death rates for CVD had been increasing steadily since the beginning of the century and had become an American epidemic. Since 1971, the Framingham Heart Study has been conducted in collaboration with Boston University.
Objective - to identify the common factors or characteristics that contribute to CVD by following its development over a long period of time in a large group of participants who had not yet developed overt symptoms of CVD or suffered a heart attack or stroke.
The study recruited 5,209 men and women between the ages of 30 and 62 from the town of Framingham, Massachusetts.
ARIC
Framingham SHARe
ARIC
Women’s Health Initiative (WHI)
WHI is a long-term national health study (1993-2005)
Objective: strategies for preventing heart disease, breast and colorectal cancer and osteoporotic fractures in postmenopausal women.
161,000 women ages 50-79
Two major parts: a randomized Clinical Trial and an Observational Study
Clinical Trial (CT) enrolled 68,132 postmenopausal women between the ages of 50-79 into trials testing three prevention strategies. If eligible, women could choose to enroll in one, two, or all three of the trial components. The components are:
• Hormone Replacement Trials
• Dietary Modification Trial
• Calcium / Vitamin D Trial
The Observational Study (OS) examines the relationship between lifestyle, health and risk factors and specific disease outcomes. This component involves tracking the medical history and health habits of 93,676 women. Recruitment for the observational study was completed in 1998 and participants were followed for 8 to 12 years.
ARIC
WHI SHARe
ARIC
ARIC CARe
ARIC
Large datasets are not limited to genetic datasets
http://www.ehdp.com/vitalnet/datasets.htm