Post on 15-Jan-2016


Genomics Workshop

Demography of Aging Centers Biomarker Network Meeting in Conjunction with the Annual Meeting of the PAA

April 14, 9:00 AM to 3:30 PM – Hyatt Regency, Dallas, Texas

Sponsored by USC/UCLA Center on Biodemography and Population Health

Organized by Teresa Seeman, Steven Cole, Eileen Crimmins

Tactical aspects of study administration and sample capture/storage

Biological overview of genetics & functional genomics

Strategic aspects of study design and data analysis

Lunch

Technical aspects of study design and data analysis

Perspectives on the State of the Field

Application clinic

Tactical aspects of study administration and sample capture/storage

DNA

1. New sample capture
• Methods: e.g., Oragene, leukocytes
• Consent & administrative issues

2. Retrospective analyses
• Sources: blood spots, cheek swabs, etc.
• Consent & administrative issues

3. Epigenetics
• DNA methylation
• Histone acetylation & chromatin dynamics
• Tissue specificity (vs DNA)

4. Tactical issues – Reports from the Field
• I wish I’d known then…

RNA

1. Identifying appropriate target tissues
• Whole blood, PBMC, saliva, hair, pathology specimens

2. Sample capture/storage

3. Consent & administrative issues


[Diagram (built up across several slides): Gene (IL6: DNA, RNA), Health]


Biological overview of genetics & functional genomics

Theoretical framework: Genes, Environments, transcription, and health

1. “Genetic” influences (missing h², penetrance R-square, etc.)

2. Functional genomics
• Transcription factors
• Epigenetics

3. Gene-Environment interactions
• Regulatory polymorphism
• Coding polymorphism

System dynamics

1. Feedback, network pleiotropy

2. Recursive developmental trajectories


[Diagram (built up across several slides): Social Environment, Gene (IL6: DNA, RNA), Health]

[Diagram (built up across several slides): IL6 gene transcription. IL6 promoter sequence TCT TGCGATGCTA AAG, with labels NE, PKA, GATA1P]

[Figure: IL6 promoter activity (fold-change, 0 to 10) by norepinephrine (M)]

Socio-environmental regulation of IL6

[Figure: Survival (0.0 to 1.0) by age (70 to 90), non-depressed vs. depressed; p = .008]


[Diagram (built up across several slides): Social Environment, Gene (IL6: DNA with [G/C] polymorphism, RNA), Health]

Gene x Environment Interaction

In silico: IL6 promoter sequence TCT TGCGATGCTA AAG; V$GATA1_01 = .943; with the C substitution, V$GATA1_01 = .619

In vitro: [Figure: Transcriptional activity (fold-change, 0 to 10) of the IL6 promoter, WT vs. -174C, by norepinephrine (M); difference: p < .0001]

Gene x Environment Interaction

[Figure: Survival (0.0 to 1.0) by age (70 to 90), non-depressed vs. depressed; overall p = .008. Separate panels for IL6 -174 GG and IL6 -174 CC/GC, with panel p-values of .439 and .008]


[Diagram (built up across several slides): Social Environment, Gene (IL6: DNA with [G/C] polymorphism, RNA, RNA2), Health, Health2]


Gene-Environment Correlation

[Diagram (built up across several slides): Social Environment, Gene (IL6: DNA, RNA), Health, Behavior]

Recursive Molecular Remodeling

Recursive developmental remodeling

[Diagram (built up across several slides):
Time 1: Environment1, Body1, RNA1, Behavior1
Time 2: Environment2, Body2, RNA2, Behavior2
Time 3: Environment3, Body3, RNA3, Behavior3]

RNA = intra-organismic adaptation

Cole (2009) Current Directions in Psychological Science


Strategic aspects of study design and data analysis

Basic substantive objectives & study designs

1. “Gene discovery” (e.g., genetic epidemiology)

2. Environmental regulation of health (via transcription)

3. Gene-Environment interaction

[Diagrams accompanying the outline above (built up across several slides): Gene (IL6 DNA) and Health; Gene (IL6: DNA, RNA) and Health; Gene (IL6: DNA with [G/C] polymorphism, RNA) and Health]

Antagonistic pleiotropy

[Figure: CRP mg/L / Adversity SD (-3.0 to 3.0) by IL6 -174 genotype (CC, GC, GG), in Older Adult and Adolescent panels; p = .007 and p = .032]

Evolution deletes disadvantage, particularly to the young

Fisher’s regression:

[Figure: Outcome by genotype (GG, GC, CC)]

y = a + b(#G) + e

[Figure: Outcome by genotype (GG, GC, CC) in Environment A and Environment B; marginal: 0]

y = a + b(#G) + c(Env) + d(#G x Env) + e

If the environment terms are left out, they are absorbed into the error term:

y = a + b(#G) + e’, where e’ = c(Env) + d(#G x Env) + e

↓ power
↑ parameter estimate bias
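The point about a zero marginal effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: the sample size, the effect size d, the equal genotype frequencies, and the -1/+1 coding of Environment A and B are all hypothetical choices, not values from the workshop.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a simple least-squares line y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def simulate(n=20000, d=1.0, seed=1):
    """Simulate a crossover interaction: y = (d/2) * #G * Env + noise.

    Hypothetical setup: #G is 0, 1, or 2 with equal probability, and
    Env is -1 (Environment A) or +1 (Environment B) with equal probability.
    """
    rng = random.Random(seed)
    g, env, y = [], [], []
    for _ in range(n):
        gi = rng.choice([0, 1, 2])
        ei = rng.choice([-1, 1])
        g.append(gi)
        env.append(ei)
        y.append(0.5 * d * gi * ei + rng.gauss(0.0, 1.0))
    return g, env, y

def slope_within(g, env, y, level):
    """Slope of y on #G restricted to one environment."""
    xs = [gi for gi, ei in zip(g, env) if ei == level]
    ys = [yi for yi, ei in zip(y, env) if ei == level]
    return ols(xs, ys)[1]

g, env, y = simulate()
marginal_slope = ols(g, y)[1]          # model that omits Env and #G x Env
slope_a = slope_within(g, env, y, -1)  # Environment A
slope_b = slope_within(g, env, y, +1)  # Environment B
print(f"marginal b = {marginal_slope:.3f}, A: {slope_a:.3f}, B: {slope_b:.3f}")
```

In this setup the genotype slope is negative in one environment and positive in the other, so the model that ignores the environment estimates a slope near zero even though the gene matters in both.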


Valid statistical models are one major reason that substantive interests (environments) matter.


OK, then, let’s have lunch.

Technical aspects of study design and data analysis

Study designs, assay technologies, and statistical methods

1. “Gene discovery” (e.g., genetic epidemiology)
• Candidate gene studies
• Genome-wide association studies
• The bioinformatic middle road

2. Environmental regulation of health (via transcription)
• Candidate transcript studies
• Genome-wide approaches

3. Gene-Environment interaction
• Statistical issues
• Revisiting the bioinformatic middle road

1. “Gene discovery” (e.g., genetic epidemiology)
• Candidate gene studies
  - Candidate identification
  - Targeted genotyping
    a. PCR
    b. High-throughput approaches
  - Statistical models
    a. Fisher’s basic regression model
    b. Multivariate mapping / association / recombination
      i. Recombination
      ii. Haplotype blocks
    c. Confounding
      i. Linkage disequilibrium & haplotype analyses
      ii. Ethnic stratification
        Phenotypic ascertainment
        Genetic ancestry
      iii. Mendelian randomization


Well  ID1  ID2  RFU1     RFU2     Ct1    Ct2    Call
A01   053  053  1094.39  956.90   42.53  41.36  Heterozygote
A02   065  065  -43.33   1519.25  60.00  40.39  Allele2
A03   075  075  1126.77  890.96   42.82  42.02  Heterozygote
A04   079  079  2095.09  25.36    42.84  60.00  Allele1
A05   087  087  2187.80  18.09    41.27  60.00  Allele1
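One way to read the genotyping output above is that a Ct of 60.00 marks a probe that never amplified. The sketch below encodes that reading as a simple rule. The 60-cycle ceiling and the function name call_genotype are assumptions made for illustration; the calling algorithm actually used by the instrument is not described in the slides and may rely on fluorescence clustering instead.

```python
CT_CEILING = 60.0  # assumption: a Ct at the cycle ceiling means "no amplification"

# (well, Ct1, Ct2) copied from the five example rows
ROWS = [
    ("A01", 42.53, 41.36),
    ("A02", 60.00, 40.39),
    ("A03", 42.82, 42.02),
    ("A04", 42.84, 60.00),
    ("A05", 41.27, 60.00),
]

def call_genotype(ct1, ct2, ceiling=CT_CEILING):
    """Call a genotype from which allele-specific probes amplified (hypothetical rule)."""
    allele1 = ct1 < ceiling
    allele2 = ct2 < ceiling
    if allele1 and allele2:
        return "Heterozygote"
    if allele1:
        return "Allele1"
    if allele2:
        return "Allele2"
    return "Undetermined"

calls = {well: call_genotype(ct1, ct2) for well, ct1, ct2 in ROWS}
for well, call in calls.items():
    print(well, call)
```

Under this assumed rule the five example wells come out the same as the Call column, which is a consistency check on the reading rather than evidence that the instrument works this way.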


Fisher’s regression:

[Figure: Outcome by genotype (GG, GC, CC)]

y = a + b(#G)

y = a + b(GG) + c(GC) + d(CC)
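The two models differ only in how genotype is coded. A minimal sketch of both codings follows, with hypothetical function names. One caveat is an addition to the slide: when the model includes an intercept a, the three genotype indicators sum to one, so in practice one genotype is dropped and serves as the reference category.

```python
GENOTYPES = ("GG", "GC", "CC")

def additive_code(genotype):
    """#G coding: the number of G alleles (0, 1, or 2)."""
    return genotype.count("G")

def indicator_code(genotype, reference="CC"):
    """Indicator (dummy) coding: one 0/1 column per genotype except the reference."""
    levels = [g for g in GENOTYPES if g != reference]
    return [1 if genotype == g else 0 for g in levels]

sample = ["GG", "GC", "CC", "GC"]               # hypothetical genotypes
additive = [additive_code(g) for g in sample]   # one column: linear dose of G
indicators = [indicator_code(g) for g in sample]  # two columns: GG, GC vs. CC
print(additive)
print(indicators)
```

The additive coding spends one degree of freedom and assumes GC falls midway between the homozygotes; the indicator coding spends two and lets each genotype have its own mean.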


Fisher’s regression:

[Figure: Outcome by genotype (GG, GC, CC)]

y = a + b(#G rs1800795)

y = a + b(#G rs1800795) + c(#T rs20937) + …

y = a + b(Haplotype containing rs1800795)

y = a + b(ATTCGTAC)

HapMap Tag SNP
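A tag SNP can stand in for its neighbors because of linkage disequilibrium. As a rough illustration, the sketch below computes the standard r² statistic between two sites from phased haplotypes. The toy haplotypes are hypothetical and are not HapMap data.

```python
def ld_r2(haplotypes, i, j, allele_i, allele_j):
    """r^2 between allele_i at site i and allele_j at site j, from phased haplotypes."""
    n = len(haplotypes)
    p_a = sum(h[i] == allele_i for h in haplotypes) / n
    p_b = sum(h[j] == allele_j for h in haplotypes) / n
    p_ab = sum(h[i] == allele_i and h[j] == allele_j for h in haplotypes) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom

# Hypothetical two-site haplotypes
linked = ["GT", "GT", "CA", "CA"]      # G always travels with T
unlinked = ["GT", "GA", "CT", "CA"]    # all four combinations equally common

r2_linked = ld_r2(linked, 0, 1, "G", "T")
r2_unlinked = ld_r2(unlinked, 0, 1, "G", "T")
print(r2_linked, r2_unlinked)
```

When r² is near 1, genotyping one site effectively measures the other, which is what makes a tag SNP a usable proxy for the untyped sites it travels with.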


Linkage-driven indirect association gradients


Culture/behavior/exposure (“Environment”)

Ancestry classification via mitochondrial haplogroups (also Y haplogroups for paternal lineage)


[Diagram (built up across several slides), Mendelian randomization: CRP, CVD, IL-6]

1. “Gene discovery” (e.g., genetic epidemiology)
• Candidate gene studies
• Genome-wide association studies
  - Marker selection for blind search: tag SNPs
  - Massively parallel genotyping
    a. Array-based strategies
    b. Deep resequencing
  - Statistical models
    a. Main effect models
    b. Interaction models
    c. Managing Type I error
      - Bonferroni & FDR
      - Internal cross-validation
      - External replication


Fisher’s regression:

[Figure: Outcome by genotype (GG, GC, CC), overall and in Environment A and Environment B]

Main effect models:

y = a + b(#G)

y = a + b(GG) + c(GC) + d(CC)

Interaction models:

y = a + b(#G) + c(Env) + d(#G x Env)

y = a + b(GG) + c(GC) + d(CC) + e(Env) + f(Env x GG) + g(Env x GC) + h(Env x CC)


Type 1 / false positive error:

Confirmatory hypothesis testing (candidate genes)

1 hypothesis = 1 t-test = 1 p-value = no problem: p < .05 = p < .05

Gene mapping (exploratory association testing)

Gene expression: 22,000 p-values = 1,100 false positives (p < .05)
p(false discovery > 0) = .999999999999999999999999+

Gene polymorphism: 10,000,000 p-values = 500,000 false positives (p < .05)
p(false discovery > 0) = .999999999999999999999999+
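The arithmetic behind these counts can be sketched in a few lines. The calculation assumes every test is a true null and that the tests are independent, which real expression and SNP data will not satisfy exactly; the function names are illustrative only.

```python
import math

def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of p-values below alpha when every null is true."""
    return n_tests * alpha

def log10_prob_no_false_positive(n_tests, alpha=0.05):
    """log10 of (1 - alpha)**n_tests, computed in logs to avoid underflow."""
    return n_tests * math.log1p(-alpha) / math.log(10)

expr_fp = expected_false_positives(22_000)
snp_fp = expected_false_positives(10_000_000)
expr_log10_none = log10_prob_no_false_positive(22_000)

print(f"gene expression: about {expr_fp:.0f} expected false positives")
print(f"gene polymorphism: about {snp_fp:.0f} expected false positives")
print(f"log10 P(no false positive among 22,000 tests) = {expr_log10_none:.1f}")
```

The probability of at least one false discovery is one minus a number that is vanishingly small, which is why the slide writes it as a long string of nines.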

What to do?

1. Increase stringency (intra-study)

Bonferroni correct ( p = .05/22,000 = .00000227 )
Choice: huge samples or massive Type 2 “false negative” error

Model/simulate error
Randomization test or FDR modeling = less conservative bias

Unimpressive yield: p = .00000300 if you’re lucky. Still too conservative, and biased ( omitted true effects in error term )

Use a better sampling design

[Figure: Power (0.0 to 1.0) by sample size for a population prevalence design (sample size 0 to 20,000) and an outcome-stratified design (sample size 0 to 2,000)]

2. Replicate (inter-study or intra-study cross-validation)

.05 x .05 x .05 = .000125 x 22,000 = 2.75 false positives ( vs. 1,100 )
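The options above can be put side by side in a short sketch. The Bonferroni threshold and the replication arithmetic follow the slide directly. The FDR step uses the Benjamini-Hochberg step-up procedure, which is one common FDR method and an assumption here, since the slides do not name a specific one. The list of p-values is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that holds the family-wise error rate at alpha."""
    return alpha / n_tests

def benjamini_hochberg(p_values, q=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])

def expected_false_positives_after_replication(n_tests, alpha, n_studies):
    """Expected null hits that reach p < alpha in every one of n_studies."""
    return n_tests * alpha ** n_studies

threshold = bonferroni_threshold(0.05, 22_000)
replicated = expected_false_positives_after_replication(22_000, 0.05, 3)

toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]  # hypothetical
bonf_hits = [i for i, p in enumerate(toy_p) if p <= bonferroni_threshold(0.05, len(toy_p))]
bh_hits = benjamini_hochberg(toy_p, q=0.05)

print(f"Bonferroni threshold for 22,000 tests: {threshold:.8f}")
print(f"Expected false positives after 3 studies: {replicated:.2f}")
print("Bonferroni rejects:", bonf_hits, "BH rejects:", bh_hits)
```

On the toy p-values, Bonferroni keeps one hit and Benjamini-Hochberg keeps two, which gives a small-scale picture of the less conservative behavior the slide attributes to FDR modeling.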

1. “Gene discovery” (e.g., genetic epidemiology)
• Candidate gene studies
• Genome-wide association studies
• The bioinformatic “middle road” – biological hypotheses buy power
  - Candidate set selection
    a. Regulatory polymorphism
    b. Coding polymorphism
  - Statistical considerations
    a. Power
    b. Differential enrichment

In silico prediction of Gene x Environment Interaction

[Figure, in silico: IL6 promoter sequence TCT TGCGATGCTA AAG shown with a C variant; V$GATA1_01 = .943 and V$GATA1_01 = .619]

[Figure, in vitro: transcriptional activity (fold-change, 0 to 10) of the IL6 promoter, WT vs. -174C, by norepinephrine (M); difference: p < .0001]

[Figure, in vivo: survival (0.0 to 1.0) by age (70 to 90), non-depressed vs. depressed, in panels IL6 -174 GG and IL6 -174 CC/GC; p = .439 and p = .008]
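The V$GATA1_01 values on the in silico slide are motif match scores. As a loose sketch of why a single-base change can lower such a score, the example below scores two sites against a small count matrix. The matrix, the site sequences, and the scoring rule (a simple min-max relative score) are all invented for illustration; they are not the TRANSFAC V$GATA1_01 matrix or the scoring algorithm used for the slide.

```python
# Hypothetical position count matrix for a 6-bp motif (one list entry per position).
# These counts are made up for illustration only.
COUNTS = {
    "A": [2, 18, 1, 17, 16, 4],
    "C": [3, 0, 1, 1, 1, 5],
    "G": [12, 1, 1, 1, 2, 9],
    "T": [3, 1, 17, 1, 1, 2],
}

def relative_score(site):
    """Return (total - worst) / (best - worst), a 0-to-1 match score for the site."""
    positions = range(len(site))
    total = sum(COUNTS[base][i] for i, base in enumerate(site))
    best = sum(max(COUNTS[b][i] for b in "ACGT") for i in positions)
    worst = sum(min(COUNTS[b][i] for b in "ACGT") for i in positions)
    return (total - worst) / (best - worst)

reference_site = "GATAAG"   # hypothetical reference allele
variant_site = "GATACG"     # same site with one base substituted

print(relative_score(reference_site), relative_score(variant_site))
```

In this toy case the substitution replaces a strongly preferred base with a rarely observed one, so the variant site scores lower than the reference site.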

[Slide: full-page list of gene symbols paired with positions, e.g., FLJ20719 -734, LOC148490 -929, AKR7A2 -678, RHCE -292, LOC440576 -934, ..., ZNF431 -967, CLECSF12 -885]

1205 GRE-modifying SNPs

Gene set enrichment analysis
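One simple way to ask whether a list of genes is enriched for a functional category is a hypergeometric over-representation test. The sketch below shows that test only; it is not the rank-based GSEA procedure and is not taken from the slides. All counts in the example call are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the category of interest."""
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)

# Hypothetical example: 20,000 genes in total, 200 in the category,
# 100 genes in the hit list, 5 of which fall in the category
# (about 1 would be expected by chance).
p_value = hypergeom_upper_tail(k=5, K=200, n=100, N=20_000)
print(p_value)
```

A small p-value here would suggest that the category is over-represented in the hit list relative to chance, subject to correction for the number of categories tested.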

[Figure: power (0.0 to 1.0) vs. sample size in two panels: population prevalence design (sample size 0 to 20,000) and outcome-stratified design (sample size 0 to 2,000). A second build of the figure adds the label GEscan to both panels.]

Technical take-home points:

Strengths & weaknesses of alternative approaches

1. Candidate gene studies: focus on 1 candidate
   Advantages
   - Scientifically tractable: incremental & cross-validatable
   - Maximal statistical power (focused hypothesis)
   Disadvantages
   - Can only “discover” what we already know (i.e., biased)

2. Genome-wide association studies: focus on all candidates
   Advantages
   - Unbiased de novo discovery
   Disadvantages
   - Minimal statistical power, particularly for interactions

3. The bioinformatic “middle road”: focus on a small set of causally plausible candidates (unbiased search of regulatory and coding SNPs)
   Advantages
   - Scientifically tractable: “short leap of inference” & cross-validatable
   - Relatively high statistical power (focus on 1-10% of plausible SNPs)
   Disadvantages
   - Likely missing some true causal genetic influences
   - Bioinformatically intensive – thought (and programming) required
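The power trade-off among the three approaches can be sketched with a two-sided z-test under a Bonferroni-corrected threshold. Everything numeric below is an assumption made for illustration: the genome-wide count of 500,000 SNPs, the 5% candidate fraction (chosen from within the 1-10% range on the slide), and the standardized effect of 4.0.

```python
from statistics import NormalDist

def power_two_sided_z(effect_z, alpha):
    """Power of a two-sided z-test whose statistic has mean effect_z and SD 1."""
    nd = NormalDist()
    crit = nd.inv_cdf(1 - alpha / 2)
    return (1 - nd.cdf(crit - effect_z)) + nd.cdf(-crit - effect_z)

# Hypothetical numbers of tests for each approach.
n_genome_wide = 500_000
scenarios = {
    "candidate gene (1 test)": 1,
    "middle road (5% of SNPs)": int(0.05 * n_genome_wide),
    "genome-wide (all SNPs)": n_genome_wide,
}

effect_z = 4.0  # assumed expected value of the test statistic for a true effect
for name, n_tests in scenarios.items():
    alpha_per_test = 0.05 / n_tests  # Bonferroni correction
    print(name, round(power_two_sided_z(effect_z, alpha_per_test), 3))
```

Under these assumptions, power falls as the number of tests grows, and the restricted candidate set sits between the single-candidate and genome-wide extremes.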

Take-home points for this group:

1. Gene-Environment interactions are likely far more…
   - ubiquitous
   - large in effect size
   - clinically/socially meaningful
   …than current genetic analyses presume.

   There is plenty left for you to find.

2. If you have the study you have (i.e., can’t alter sampling design), your major opportunities for increasing power/discovery involve:
   - focusing on substantive effects that are true/big (e.g., GxE, not G, given antagonistic pleiotropy; E, ExE, GxG, etc.)
   - modeling biological mechanisms to focus power/impose constraints (e.g., candidate systems, functional themes, regulatory themes)
   - combinatorial data-mining (e.g., machine learning in discovery sample)
   - sequential testing designs (low-stringency discovery, medium-stringency test, high-stringency confirmation)

   Your advantage is smart data analysis.

Follow-up references

Overview of genetics / biology
Attia, J., et al. (2009) How to use an article about genetic association: A: Background concepts. JAMA, 301, 74-81.

Genetic association studies
Hirschhorn, J., & Daly, M. (2005) Genome-wide association studies for common diseases and complex traits. Nature Reviews Genetics, 6, 95-108.
Attia, J., et al. (2009) How to use an article about genetic association: B: Are the results of the study valid? JAMA, 301, 191-197.
Cordell, H., & Clayton, D. (2005) Genetic epidemiology 3: Genetic association studies. Lancet, 366, 1121-1131.

Basic statistical modeling for genetics
Siegmund, D., & Yakir, B. (2007) The statistics of gene mapping. New York, Springer.

Sampling & statistical approaches for GxE discovery
Thomas, D. (2010) Gene-environment-wide association studies: emerging approaches. Nature Reviews Genetics, 11, 259-272.

Statistical strategies for combinatorial discovery
Hastie, T., Tibshirani, R., & Friedman, J. (2001) The elements of statistical learning. New York, Springer.

Perspectives on the State of the Field

How can we best promote the integration of genetic and demographic approaches?

Application clinic

Open microphone

1. What do you want to accomplish?

2. At what stage are you now?
   i. Study design?
   ii. Data collection?
   iii. Analysis and reporting?

3. How can we be of help?

Technical aspects of study design and data analysis

Study designs, assay technologies, and statistical methods

1. “Gene discovery” (e.g., genetic epidemiology)
   • Candidate gene studies
   • Genome-wide association studies
   • The bioinformatic “middle road” – biological hypotheses buy power

2. Environmental regulation of health (via transcription)
   • Candidate transcript studies
     - RT-PCR
     - Statistical analyses incorporating temporal & spatial heterogeneity
   • Genome-wide approaches
     - Microarrays
     - Theme discovery
       a. Functional (Gene Ontology)
       b. Regulatory (TELiS)
       c. Spatial (SpAnGEL)
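RT-PCR results for a candidate transcript are often reported as fold-induction over baseline. One common way to compute this is the 2^(-ΔΔCt) calculation, sketched below. The slides do not say which quantification method was used, so this is an assumption; it also assumes roughly 100% amplification efficiency, and the Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_baseline, ct_ref_baseline):
    """Fold change of a target transcript relative to baseline, normalized to a
    reference gene, using 2 ** -(delta-delta Ct). Assumes ~100% PCR efficiency."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_baseline = ct_target_baseline - ct_ref_baseline
    return 2 ** -(delta_treated - delta_baseline)

# Hypothetical cycle-threshold values: the target crosses threshold 5 cycles
# earlier after exposure, while the reference gene is unchanged.
fold = fold_change_ddct(ct_target_treated=22.0, ct_ref_treated=18.0,
                        ct_target_baseline=27.0, ct_ref_baseline=18.0)
print(fold)
```

Because each PCR cycle roughly doubles the product, a 5-cycle shift in the normalized Ct corresponds to a 32-fold difference in starting transcript abundance.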

[Figure: antiviral cytokine mRNA (diagram labels: RNA, DNA, RT). Two panels, IFN- consensus mRNA and IFN- mRNA, each shown as fold-induction over baseline (0 to 900) by exposure (0, 6, 12 hrs.), comparing CpG with CpG + NE]

Collado-Hidalgo et al (2006) Brain, Behavior and Immunity

[Figure: SIV RNA (in situ hybridization). SIV replication (sites / spatial quadrat) by social stress (- vs. +), p < .0001, and by SNS neurons (- vs. +), p < .0001]

Sloan et al. (2006) Journal of Virology
Sloan et al. (2007) Journal of Neuroscience

[Figure: social isolation, lonely vs. integrated (J. Cacioppo; Genome Biology, 2007); gene counts 78 and 131]

Palmer et al. BMC Genomics (2006)

[Diagram: Social Environment; Gene (IL6; DNA, RNA); Biological function]

[Figure: social isolation, lonely vs. integrated (J. Cacioppo; Genome Biology, 2007); gene counts 78 and 131. Functional themes shown across builds: inflammation, cell growth/differentiation, transcription control; immunoglobulin production, Type I interferon antiviral response]

http://www.gostat.wehi.edu.au

TRIM54, ACSBG2, HIST4H4, KLHL32, FLJ35773, GPC4, TRPV4, LBP, C20ORF200, ASB15, OCLM

[Diagram, built up over successive slides: promoter sequences with binding sites for the transcription factors Sp1, CREB, and NF-κB; labels: Environment, Promoter Sequence, Expression; one build adds a “?”]

Cole et al (2005) Bioinformatics, 21, 803

http://www.telis.ucla.edu

[Figure: social isolation, lonely vs. integrated (J. Cacioppo; Genome Biology, 2007); gene counts 78 and 131; annotated across builds with NF-κB and GRE]

NaB de-repression - fibroblast

[Diagram, built up over successive slides: gene 1 through gene 41, with overlays added in turn for TF1, TF2, TF3; then miRNA1, miRNA2, miRNA3; then DNMT1, DNMT2, DNMT3]

Technical aspects of study design and data analysis

Study designs, assay technologies, and statistical methods

1. “Gene discovery” (e.g., genetic epidemiology)
• Candidate gene studies
• Genome-wide association studies
• The bioinformatic “middle road” – biological hypotheses buy power

2. Environmental regulation of health (via transcription)
• Candidate transcript studies
- RT-PCR
- Statistical analyses incorporating temporal & spatial heterogeneity
• Genome-wide approaches
- Microarrays
- Theme discovery
a. Functional (Gene Ontology)
b. Regulatory (TELiS)
c. Spatial (SpAnGEL)

Technical aspects of study design and data analysis

Study designs, assay technologies, and statistical methods

1. “Gene discovery” (e.g., genetic epidemiology)
• Candidate gene studies
• Genome-wide association studies
• The bioinformatic “middle road” – biological hypotheses buy power

2. Environmental regulation of health (via transcription)
• Candidate transcript studies
• Genome-wide approaches

3. Gene-Environment interaction
• Statistical considerations
- Main effects and antagonistic pleiotropy
- Interaction models
- Combinatorial discovery
• Revisiting the “bioinformatic” middle road
- Candidate set selection
a. Regulatory polymorphism
b. Coding polymorphism

Fisher’s regression:

[Figure: Outcome (y-axis) by genotype (x-axis: GG, GC, CC)]

y = a + b(#G)

y = a + b(GG) + c(GC) + d(CC)

Fisher’s regression:

[Figure: Outcome (y-axis) by genotype (x-axis: GG, GC, CC), shown in two panels: Environment A and Environment B]

y = a + b(#G) + c(Env) + d(#G x Env)

y = a + b(GG) + c(GC) + d(CC) + e(Env) + f(Env x GG) + g(Env x GC) + h(Env x CC)
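The interaction model above can be sketched in a few lines. This is a minimal illustration, not the presenter's analysis: it assumes a binary environment coded 0/1 and #G coded as the count of G alleles (0 = CC, 1 = GC, 2 = GG). The function names and the noise-free toy data are hypothetical. With a binary environment, fitting y = a + b(#G) + c(Env) + d(#G x Env) is equivalent to fitting one line per environment, so d is simply the difference between the two slopes.

```python
from statistics import mean


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_gxe(n_g, env, y):
    """Fit y = a + b(#G) + c(Env) + d(#G x Env) for a binary Env (0/1).

    One line is fitted per environment; c is the difference in
    intercepts and d is the difference in slopes.
    """
    g0 = [g for g, e in zip(n_g, env) if e == 0]
    y0 = [v for v, e in zip(y, env) if e == 0]
    g1 = [g for g, e in zip(n_g, env) if e == 1]
    y1 = [v for v, e in zip(y, env) if e == 1]
    a0, b0 = ols_line(g0, y0)
    a1, b1 = ols_line(g1, y1)
    return {"a": a0, "b": b0, "c": a1 - a0, "d": b1 - b0}


# Hypothetical toy data: the per-allele effect is 0.5 in environment 0
# and 2.0 in environment 1.
n_g = [0, 1, 2, 0, 1, 2]
env = [0, 0, 0, 1, 1, 1]
y = [1.0, 1.5, 2.0, 2.0, 4.0, 6.0]

coef = fit_gxe(n_g, env, y)
print(coef)
```

In practice the genotype-indicator form of the model is usually fitted with one genotype as the reference category, because an intercept plus all three genotype indicators cannot be estimated separately.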

Combinatorial explosion

10^7 SNPs x 10^1 to 10^2 environments = 10^8 to 10^9 interaction terms

N = 2,000-20,000 for current main effect studies

Given that power and effect size, an interaction sweep would need 2 million subjects.
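The arithmetic behind the explosion can be written out directly. This is a sketch under stated assumptions: the family-wise alpha of 0.05 and the variable names are illustrative and do not come from the slides.

```python
# Counts taken from the slide
snps = 10**7
env_low, env_high = 10**1, 10**2

# Every SNP paired with every environment gives one interaction term
terms_low = snps * env_low
terms_high = snps * env_high

# Assumed family-wise alpha; dividing by the number of tests gives the
# per-test threshold a full sweep would have to clear
alpha = 0.05
per_test_main = alpha / snps
per_test_intx_low = alpha / terms_low
per_test_intx_high = alpha / terms_high

print(f"interaction terms: {terms_low:.0e} to {terms_high:.0e}")
print(f"per-test alpha, main effects: {per_test_main:.1e}")
print(f"per-test alpha, interactions: {per_test_intx_high:.1e} to {per_test_intx_low:.1e}")
```

The per-test threshold shrinks by one to two orders of magnitude relative to a main-effect scan, which is why a sample sized for main effects is underpowered for an unconstrained interaction sweep.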

What to do?

1. Increase stringency (intra-study)
- Bonferroni correct / FDR correct
- Model/simulate error
- Use a better sampling design

2. Replicate (inter-study or intra-study cross-validation)

3. Get a hypothesis
- Biological
- Empirical
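The two stringency corrections named in item 1 can be compared on a small example. This is a minimal sketch: the p-values are invented for illustration, and the FDR procedure shown is the standard Benjamini-Hochberg step-up rule, which is an assumption about which FDR method the slide has in mind.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject each test whose p-value is at most alpha / (number of tests)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max smallest p-values
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected


# Hypothetical p-values for six tests
pvals = [0.001, 0.008, 0.012, 0.041, 0.20, 0.74]

bonf = bonferroni(pvals)
bh = benjamini_hochberg(pvals)
print("Bonferroni rejects:", sum(bonf))
print("Benjamini-Hochberg rejects:", sum(bh))
```

On these values Bonferroni keeps two findings and Benjamini-Hochberg keeps three, which illustrates the usual trade-off: FDR control is less conservative at the cost of tolerating a controlled proportion of false discoveries.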

Combinatorial discovery strategies

Smart study design + smart statistics + biological constraint

[Figure: power (0.0 to 1.0) vs. sample size in two panels. Population prevalence design: sample size axis 0 to 20,000. Outcome-stratified design: sample size axis 0 to 2,000.]

Combinatorial discovery strategies

Smart study design + smart statistics + biological constraint

• Stratified sampling
• Multi-stage testing
• Cross-validation

• Data-mining / Machine learning
- CART/forests
- MARS
- PRIM

• Functional pathways
• Regulatory pathways
• Chromosomal units

In silico prediction of Gene x Environment Interaction

In silico: IL6 promoter sequence TCT TGCGATGCTA AAG, with alternative base C. V$GATA1_01 = .943; V$GATA1_01 = .619.

In vitro: [Figure: transcriptional activity (fold-change, axis 0 to 10) for the IL6 promoter, WT vs. -174C, at Norepinephrine (M) 0 and 10. Difference: p < .0001]

FLJ20719 -734LOC148490 -929AKR7A2 -678RHCE -292RHCE -292RHCE -292RHCE -292LOC440576 -934SOC -39SOC -49SOC -26UNQ6122 -877LAPTM5 -728PHC2 -168PHC2 -16ITGB3BP -311FLJ20331 -994ZNF265 -663ZNF265 -663FUBP1 -778LOC388650 -392LOC388654 -957PDE4DIP -175COAS2 -435LOC199882 -474LOC440689 -692LOC440689 -16LOC441906 -496FLG -17LEP3 -631RAB13 -310LOC91181 -956LOC91181 -956LOC126669 -407LOC440693 -399PKLR -118PKLR -597FCRH1 -580SPTA1 -163SLAMF9 -256KCNJ10 -383ITLN1 -760ITLN1 -760F11R -798F11R -798LMX1A -85SELP -144LOC400796 -263F13B -881F13B -881MYOG -951LOC440712 -956LGTN -331FLJ10874 -676GPATC2 -556LOC440721 -625AGT 1FLJ10359 -367LOC441927 -406LOC440741 -564MGC12466 -863KIAA1720 -894LOC388578 -522LOC391205 -430MIG-6 -618MIG-6 -638MIG-6 -678LOC441870 -731LOC440561 -255LOC401940 -500LOC401940 -564LOC401940 -606LOC339553 -400LOC440753 -695LOC388789 -593FLJ38374 -686LOC391241 -81LOC388794 -28C20orf70 -431STK4 -122PIGT -910DNTTIP1 -479C20orf67 -1MMP9 -875CEBPB -978RNPC1 -370RNPC1 -370TH1L -26TH1L -26LOC400849 -714LOC400849 -382CGI-09 -309FKHL18 -608C20orf172 -118TGM2 -220TGM2 -220LOC388798 -828Kua-UEV -465Kua-UEV -561Kua -465BTBD4 -590C21orf99 -772C21orf99 -13KRTAP15-1 -566B3GALT5 -889B3GALT5 -889B3GALT5 -889B3GALT5 -889B3GALT5 -889LOC441955 -824LOC441955 -824LOC400858 -624CLDN8 -17KRTAP19-7 -127DSCR1 -620C21orf84 -232KRTAP12-4 -899FTCD -410FTCD -410LOC440842 -24PEX26 -129PEX26 -89ZNF74 -726ZNF74 -726LOC440804 -940SMARCB1 -290CABIN1 -797KIAA1671 -26ARP10 -612ADSL -602ARHGAP8 -260NUP50 -629PPARA -184PPARA -184

BID -126DGCR14 -951TXNRD2 -882LOC391303 -881LOC150221 -939LOC91219 -352LOC150236 -666GSTT1 -141SEC14L4 -746SSTR3 -705FLJ22582 -372DIA1 -749ATP5L2 -328A4GALT -825SULT4A1 -729SULT4A1 -729C2orf15 -882LOC129521 -477LOC440892 -918IL1RL1 -332MRPS9 -970LOC442037 -839IL1F7 -978IL1F7 -978IL1F7 -978IL1F7 -978MGC52000 -273MGC52000 -466MGC52057 -404MAP1D -120COL3A1 -310SLC39A10 -921LOC200726 -220IL8RB -447TUBA4 -643FLJ25955 -24ALPPL2 -296UGT1A9 -651UGT1A7 -351UGT1A6 -224UGT1A6 -402TRPM8 -170ASB1 -723GCKR -204LOC388938 -212FLJ38348 -606MSH2 -376MSH2 -976MSH2 -376MSH2 -376SBLF -59LOC151443 -85LOC391387 -134SEMA4F -751RBM29 -1LOC339562 -621LOC339562 -641LOC200493 -245TXNDC9 -714FLJ40629 -946LOC401005 -12LOC389050 -170ORC4L -16ORC4L -16ORC4L -16ORC4L -16ARL5 -895ARL5 -895NR4A2 -527NR4A2 -527NR4A2 -527NR4A2 -527ATP5G3 -55ZNF533 -598ZSWIM2 -772PGAP1 -821PGAP1 -827SF3B1 -138ORC2L -786LOC391475 -413CRYGC -765PECR -942SLC23A3 -412LOC442070 -877LOC129607 -488LOC339789 -268LOC130502 -558ALK -710BCL11A -615BCL11A -615BCL11A -615BCL11A -615PAP -438PAP -438PAP -531CNTN4 -809PPARG -584PPARG -914LOC401054 -926GALNTL2 -427FBXL2 -107APRG1 -269APRG1 -347LOC440951 -20LOC389123 -140LOC285194 -808NR1I2 -769STXBP5L -480LOC442092 -880MRPS22 -897KCNAB1 -793LOC402146 -134LOC90133 -2NLGN1 -541FLJ20522 -803ATP2B2 -593LOC440946 -917ANKRD28 -437LOC152024 -365FLJ32685 -953SLC4A7 -509MST1 -895LOC377064 -623LOC200959 -572CPOX -150LOC401079 -259CBLB -250LOC344807 -514GPR156 -497IQCB1 -412MGC34728 -553LOC256374 -248KIAA0861 -435MGC15397 -397LOC254808 -484

LRRC15 -288KIAA0226 -776LOC255324 -380IBSP -319MGC48628 -101NDST3 -902LOC401149 -733LOC441038 -837FLJ35630 -291CYP4V2 -117LOC401164 -978LOC391727 -934LOC399917 -840ZAR1 -106LOC401132 -18PF4 -819EIF4E -716ADH7 -557TACR3 -957AGXT2L1 -631PLA2G12A -795PITX2 -411PITX2 -411LOC401155 -72CDHJ -652FGA -110FGA -110PPID -384LOC441049 -368GPM6A -203LOC389833 -878LOC389833 -288LOC389833 -288LOC389833 -878LOC442102 -418FGFBP1 -290LOC441013 -188FLJ00310 -289FLJ00310 -881FLJ00310 -289FLJ00310 -289FLJ00310 -289FLJ00310 -289FLJ00310 -289LOC442127 -287SRD5A1 -631LOC345711 -877LOC389281 -225MGC42105 -669PELO -938BDP1 -918DKFZp564C0469 -378LOC134505 -63TSLP -331LOC340069 -755SNCAIP -671LOC441106 -646SLC27A6 -484CDC42SE2 -384PHF15 -52LOC389331 -27PCDHA4 -26PCDHA4 -26PCDHB3 -623PCDHB6 -212PCDHB16 -609ABLIM3 -474LARP -716LOC134541 -868FGFR4 -472FGFR4 -472FGFR4 -745FGFR4 -745LOC442145 -7LOC442146 -856LOC345462 -604LOC345462 -609LOC442148 -595OR2V2 -340OR2V2 -901TPPP -454MYO10 -583LOC441066 -463GDNF -36LOC345643 -568FOXD1 -990ARSB -493DHFR -473SPATA9 -748CHD1 -581STK22D -863LOC389316 -227CDO1 -360FLJ33977 -166LOC391824 -129ALDH7A1 -920CAMK2A -429CAMK2A -429C5orf4 -657LOC345430 -332DUSP1 -361LOC285770 -132NQO2 -705MRS2L -22HIST1H2BA -960HIST1H2BD -597HIST1H2BD -597HIST1H2BH -618HIST1H4I -283HLA-H -477MRPS18B -207LOC401250 -26LOC401250 -497NFKBIL1 -305LY6G5B -359C6orf25 -413C6orf25 -413C6orf25 -413C6orf25 -413C6orf25 -413C6orf25 -413C6orf25 -413HSPA1B -942C2 -687HLA-DRA -774HLA-DQA1 -265ZBTB9 -229LOC389386 -725TLT4 -607C6orf139 -788KIAA1411 -549C6orf57 -986C6orf165 -728POU3F2 -997LOC340148 -581

C6orf55 -149LOC345829 -202LOC442278 -732LOC442279 -858LOC401289 -82LOC285766 -472SERPINB6 -657OFCC1 -367LOC441129 -714SMA3 -762LOC222699 -719LOC441138 -870OR12D3 -872LOC346171 -389HCG4P6 -80HCG4P6 -501PSORS1C2 -78PSORS1C2 -78HLA-C -512HLA-B -594HLA-DRB1 -469HLA-DRB1 -821HLA-DQB2 0HLA-DQB2 -333HLA-DQB2 0HLA-DOB -500MLN -740LRFN2 -452C6orf108 -907C6orf108 -907PLA2G7 -227CRISP1 -236CRISP1 -236IL17F -733HMGCLL1 -759LOC442226 -67C6orf66 -832DJ467N11.1 -34RTN4IP1 -207SLC22A16 -869LOC442254 -307DEADC1 -509FLJ44955 -391SYNE1 -484SYNE1 -126LOC389435 -451LOC389435 -565PIP3-E -457T -9T -3LOC442280 -112DKFZP434J154 -615LOC401303 -632LOC441198 -739GHRHR -646ADCYAP1R1 -60C7orf16 -842LOC441209 -41GPR154 -435GPR154 -435C7orf36 -707BLVRA -400BLVRA -400LOC51619 -311WBSCR19 -38LOC136288 -523LOC392030 -632FZD9 -485LOC85865 -255LOC442341 -390AKR1D1 -159LOC93432 -126OR2F1 -160OR2A5 -927LOC441184 -336LOC441186 -584LOC441187 -654LOC389831 -914LOC389831 -914LOC222967 -338LOC222967 -338LOC340267 -244ICA1 -699AGR2 -65LOC389472 -184LOC401316 -837CRHR2 -610PDE1C -20LOC441210 -361LOC222052 -77LOC441224 -287LOC441230 -143LOC441245 -127LOC441259 -954CCL26 -441SEMA3C -385C7orf23 -761PON1 -785GATS -36ACHE -715ACHE -224ACHE -715ACHE -224ORC5L -990ORC5L -990CHCHD3 -793MGC5242 -861LOC392997 -596LOC392997 -596FLJ44186 -168HIPK2 -70ZC3HDC1 -407LOC402301 -14BAGE4 -100BAGE4 -648MCPH1 -520MCPH1 -203AMAC -766NEIL2 -956NEF3 -789PNOC -756LOC441344 -308FKSG2 -72DKFZp586M1819 -469SNTG1 -463LOC389657 -899ADHFE1 -54SULF1 -522WWP1 -695LOC401471 -326FLJ45248 -290LOC441309 -343LOC392169 -927ANGPT2 -895SPAG11 -971

SPAG11 -971SPAG11 -971SPAG11 -971SPAG11 -622SPAG11 -622SPAG11 -971DEFB104 -132LOC389633 -370ASAH1 -702ASAH1 -882FLJ22494 -242FLJ22494 -781SNAI2 -728CPA6 -613FSBP -393MFTC -905MRPL13 -525LOC442399 -126TOP1MT -477LOC286126 -887LOC340393 -922DOCK8 -109LOC441386 -327C9orf93 -708SH3GL2 -702C9orf94 -376LOC340501 -32LOC441417 -394DKFZP434M131 -944SECISBP2 -404LOC441453 -821PHF2 -646PHF2 -648LOC441457 -742LOC441457 -802PRG-3 -971RAD23B -998SLC31A2 -380OR1N2 -646C9orf54 -2C9orf54 -2LAMC3 -895LOC441473 -825LOC441473 -825LOC441473 -825DBH -768OBP2A -732EGFL7 -330EGFL7 -335TRAF2 -32LOC441408 -394LOC389702 -288C9orf46 -353SLC24A2 -265IFNA10 -138IFNA14 -85C9orf11 -311C9orf24 -905C9orf24 -905UNQ470 -31STOML2 -420LOC392334 -904LOC286327 -215HNRPK -86LOC441452 -955DIRAS2 -896LOC286359 -774TXNDC4 -690TXN -239OR1L8 -459DYT1 -561ABO -790ABO -789ABO -790XPMC2H -374LOC441474 -921LOC389734 -489LOC389734 -223FCN1 -673FCN1 -709LOC441410 -990GAGE1 -21RRAGB -788RRAGB -788LOC340527 -194SH3BGRL -944DIAPH2 -921DIAPH2 -921HSU24186 -145NXF2 -89PLP1 -918PLP1 -918LOC286436 -713SLC6A14 -962LOC392529 -73FLJ25735 -992MAGEB4 -834MAGEB4 -834LOC389844 -822LOC389844 -814UBE1 -964LOC203604 -16LOC441481 -796DMD -923RPGR 3ZNF21 -828PRKY -308LOC441537 -223LOC441539 -222LOC441535 -225LOC441536 -223LOC338588 -51UCN3 -368NET1 -14MAPK8 -856MAPK8 -856MAPK8 -856MAPK8 -856LOC399768 -100CDC2 -415SLC29A3 -596LOC143244 -131LOC439994 -962LIPL3 -68LIPL3 -704LOC439996 -302LOC387701 -717LOC387701 -817FRAT1 -287FRAT1 -287ABCC2 -3ABCC2 -3HPS6 -237NFKB2 -790PNLIPRP2 -442

DMBT1 -462DMBT1 -462DMBT1 -462FANK1 3TAF3 -544LOC441547 9LOC220998 -941TPRT -277C10orf68 -817C10orf9 -269ZNF33A -477ZNF33A -477LOC399744 -202LOC399744 -202PPYR1 -81PPYR1 -81LOC439946 -71AKR1C2 -641AKR1C2 -641LOC441560 -504LOC439975 -618NEUROG3 6AMID -452PPP3CB -854LOC439983 -240LOC389988 -68MMS19L -221C10orf69 -121GPR10 -555C10orf93 -42ASB13 -506IL15RA -222IL15RA -827USP6NL -573C10orf45 -181NMT2 -912SIAT8F -676NEBL -727C10orf52 -163LOC439953 -879LOC399737 -608CTGLF1 -504LOC439963 -500KCNQ1 -40LOC387746 -61OR51F2 -640TRIM34 -105OR10A2 -851SAA1 -721SAA1 -722LOC441593 -126PDHX -845TRIM44 -24LOC90139 -660NDUFS3 -929LOC196346 -885OR5T3 -97CTNND1 -133CTNND1 -116CNTF -149ROM1 -515MARK2 -375MARK2 -375RAB1B -75GSTP1 -841GSTP1 -841LOC440056 -824USP35 -148LOC390231 -471OR4D5 -465OR8G5 -809MGC39545 -867LOC399969 -328LOC219797 -216NUP98 -651NUP98 -651NUP98 -651NUP98 -651KIAA0409 -533LOC283299 -427LOC440026 -69LOC440030 -675LOC387754 -159LOC144100 -631HPS5 -917HPS5 -917HPS5 -917LOC387764 -149LOC440041 -221FLJ31393 -362OR8H1 -161AGTRL1 -809PRG2 -899TCN1 -716RAB3IL1 -976KIAA0404 -771CHRDL2 -754KCTD14 -94MRE11A -879MRE11A -982MMP7 -853CRYAB -175ZNF202 -527LOC387820 -553LOC387823 -178CCND2 -350NDUFA9 -485KCNA5 -805FLJ10665 -245FLJ10665 -576LOC285407 -743LOC390299 -771FLJ10652 -491LOC144245 -455PFKM -838DKFZp686O1689 -733C12orf10 -110DGKA -806DGKA -800SUOX -384ZNFN1A4 -874LYZ -944GAS41 -166VEZATIN -34LOC387876 -110C12orf8 -840COX6A1 -124LOC390364 -971LOC390364 -971LOC144678 -418LOC338797 -31SLC6A13 -445NRIP2 -107NOL1 -122LOC387701 -817FRAT1 -287FRAT1 -287ABCC2 -3ABCC2 -3HPS6 -237NFKB2 -790PNLIPRP2 -442

CLECSF12 -885CLECSF12 -885CLECSF12 -885CLECSF12 -885CLECSF12 -885CLECSF12 -885CLECSF12 -885CLECSF12 -885CLECSF12 -885KLRK1 -349PRB1 -589PRB1 -589PRB1 -589ADAMTS20 -965ADAMTS20 -965SLC38A2 -638K-ALPHA-1 -27KIAA1602 -262RACGAP1 -620K6IRS3 -708KRT4 -83NPFF -777STAT2 -94FLJ32949 -500IFNG -795MGC26598 -498HAL -358DKFZp434M0331 -920LOC400070 -223TSC -785GPR109B -392EPIM -568EPIM -568GALNT9 -798LOC440122 -169LOC221140 -342LOC440128 -877LOC387912 -279LOC341784 -327NURIT -947RB1 -525DKFZP434K1172 -595DKFZP434K1172 -595LOC144983 -906LOC144983 -892LOC144983 -896LOC400144 -807PROZ -865PROZ -865CRYL1 -768POSTN -32LOC440134 -367EBPL -973GUCY1B2 -832LOC338862 -918LOC404785 -818OR11H6 -269C14orf92 -234PSMA6 -219KTN1 -222C14orf166B -786EVL -28CCNB1IP1 -868CCNB1IP1 -868NEDD8 -143BAZ1A -508BAZ1A -508NFKBIA -963LOC283551 -302CDKL1 -902LOC400214 -138RTN1 -974LOC390488 -457PLEK2 -465PIGH -153RDH11 -251FLJ39779 -161KIAA1509 -179SERPINA2 -559SERPINA2 -559SERPINA2 -559SERPINA9 -856LOC390529 -204LOC388073 -112LOC400307 -332LOC283694 -71LOC400320 -443FLJ35785 -414LOC440249 -92HH114 -991PLA2G4B -483CAPN3 -318CAPN3 -318CAPN3 -318LOC400368 -320SLC28A2 -275DUT -32SCG3 -739LIPC -853OSTbeta -781LOC440289 -446COMMD4 -790LOC400433 -496LOC390637 -55FLJ11175 -113LOC440224 -815LOC283804 -112CHSY1 -876LOC440315 -303LOC440315 -303LOC400470 -62LOC388076 -715LOC440250 -206LOC440255 -981FLJ20313 -236AVEN -767KIAA0377 -896FBN1 -191SPPL2A -4BCL2L10 -653LOC145780 -610BNIP2 -842BNIP2 -421RASL12 -878SNAPC5 -540BG1 -364LOC400411 -62LOC440293 -718FLJ40113 -951

IP -207TBL3 0KIAA1171 -70TNFRSF12A -968DNAJA3 -24ALG1 -464ALG1 -464FLJ12363 -773LOC92017 -711TMC7 -412MGC16824 -271RBBP6 -795RBBP6 -795RBBP6 -795ITGAX -504ERAF -510LOC388248 -649FLJ38101 -981CES4 -221MT1H -280GAN -839PLCG2 -534CDH13 -906HSBP1 -425MLYCD -917FLJ45121 -772DPEP1 -765FLJ32252 -288FLJ32252 -346MGC35212 -360FLJ25410 -280LOC400506 -715LOC94431 -77DOC2A -265LOC441761 -889LOC57019 -375ZNF319 -360DNCLI2 -857DNCLI2 -857DKFZP434A1319 -236LOC439920 -70CHST5 -601CHST5 -756LOC390748 -242DPH2L1 -42LOC388323 -892MAP2K4 -128MAP2K4 -128KRTAP4-12 -78JJAZ1 -789CCL2 -912PSMB3 -889LOC440440 -1FLJ25168 -244SP2 -57LOC388406 -800TBX4 -465DDX42 -212DDX42 -212LOC90799 -734DKFZP586L0724 -829SSTR2 -874MRPS7 -822MRPS7 -719LOC388429 -804NARF -669NARF -669GEMIN4 -911OR1D2 -376ALOX15 -267SLC16A11 -346CLECSF14 -596CLECSF14 -640FLJ40217 -393RCV1 -761CDRT1 -618NOS2A -287NOS2A -287KRT25D -828KRT12 -585HUMGT198A -797HUMGT198A -690FLJ31222 -769LOC284058 -524GIP -957LOC400619 -823UNC13D -695LOC339162 -685LOC388462 -43SEH1L -801LOC284232 -988LOC284232 -845CABLES1 -281CABYR -908CABYR -908CABYR -908CABYR -908CABYR -908DSG3 -367SLC14A1 -333DCC -386RAB27B -713ZCCHC2 -249LOC342808 -306LOC284276 -397MYOM1 -232MC2R -113LOC441817 -600KIAA1632 -405FBXO15 -123FBXO15 -192LOC390865 -489TXNL4 -33CDC34 -270GZMM -678C19orf21 -573ARID3A -913LOC126295 -456MGC39581 -37TRAPPC5 -352LOC51257 9OR7C2 -399OR10H3 -953OR10H4 -323LOC284434 -560HSPC142 -632PGLS -935LOC148206 -288ZNF431 -967CLECSF12 -885

PSMC4 -215PSMC4 -215EGLN2 -452LOC388549 -412SYNGR4 -825RPL13A -816LOC402665 -925FLJ46385 -176LOC91661 -13LAIR2 -705LAIR2 -705KIR2DL1 -763KIR3DL2 3ZNF583 -867ZNF71 -861MGC4728 -490ZNF211 -76ZNF211 -76LOC401895 -957APBA3 -13FUT5 -174TNFSF7 8SH2D3A -2738D6A -950EIF3S4 -547RAB3D -852MGC20983 -338MGC20983 -338MGC20983 -338NDUFB7 -741LOC339377 -660IL12RB1 -56IL12RB1 -56IL12RB1 -56IL12RB1 -56LOC148198 -361CEBPA -564UNQ467 -521FLJ22573 -941CLC -823DYRK1B -849DYRK1B -849DYRK1B -849PSG11 -297PSG11 -297PSG4 -299PSG4 -299PSG9 -435FLJ34222 -415ERCC2 -123DMPK -988PGLYRP1 -212LIG1 -806FLJ32926 -288CGB8 -202TEAD2 -546FLJ20643 -895LOC400712 -236SIGLEC6 -972SIGLEC6 -972SIGLEC6 -972ZNF577 -582ZNF611 -148ZNF600 -716ZNF600 -37NALP9 -489PRDM2 -762PRDM2 -762LOC400743 -400PADI1 -598FLJ44952 -494DJ462O23.2 -973PPP1R8 5PPP1R8 5PPP1R8 5ATPIF1 -766ATPIF1 -766ATPIF1 -766LOC440581 -793CGI-94 -384FLJ14351 -753UROD -715LOC441885 -810DKFZp761D221 -478DKFZp761D221 -221IL23R -322CTH -6CTH -6AK5 -966DNAJB4 -987CDC7 -604LOC388649 -426DCLRE1B -406LOC440610 -739LOC440610 -584LOC440610 -652LOC441903 -538LOC440673 -482BNIPL -420BNIPL -419SPRR1B -826SPRR1B -826IL6R -110IL6R -110CKS1B -983SYT11 -785PMF1 -223LOC164118 -75FY -397NCSTN -809HSPA6 -839HSPA6 -611CGI-01 7CGI-01 7DKFZP564J047 -208HFL1 -551HFL3 -563NEK7 -714MGC14801 -276OR2AK2 -528LOC441873 -501LOC441873 -565LOC441873 -607LOC343068 -256ARID3A -913LOC126295 -456MGC39581 -37TRAPPC5 -352LOC51257 9OR7C2 -399OR10H3 -953OR10H4 -323LOC284434 -560HSPC142 -632PGLS -935LOC148206 -288ZNF431 -967CLECSF12 -885

1205 GRE-modifying SNPs

[Figure: power (0.0 to 1.0) vs. sample size in two panels, each with a curve labeled GEscan. Population prevalence design: sample size axis 0 to 20,000. Outcome-stratified design: sample size axis 0 to 2,000.]

Coding sequence polymorphisms

[Figure: schematic of genes 1 to 41 with transcription factors TF1 to TF3]

Combinatorial discovery strategies

Smart study design + smart statistics + biological constraint

• Stratified sampling
• Multi-stage testing
• Cross-validation

• Data-mining / Machine learning
- CART/forests
- MARS
- PRIM

• Functional pathways
• Regulatory pathways
• Chromosomal units

Why is this critical?

Antagonistic pleiotropy is the norm → GxE
Epistatic interaction is the norm → GxG
High-order interactions are likely normal → GxGxExE
Low power, “replication failure”, and epistemological slop
- the missing “h”, and the missing “E”
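The first point, that antagonistic pleiotropy pushes the signal into GxE, can be illustrated with a toy crossover pattern. This is a sketch with invented cell means and hypothetical names. It assumes two equally sized environments, so the marginal effect is the simple average of the environment-specific effects.

```python
from statistics import mean

# Hypothetical mean outcomes by (environment, number of G alleles):
# the G allele raises the outcome in environment A and lowers it in B.
cell_means = {
    ("A", 0): 1.0, ("A", 1): 2.0, ("A", 2): 3.0,
    ("B", 0): 3.0, ("B", 1): 2.0, ("B", 2): 1.0,
}


def per_allele_effect(env):
    """Change in mean outcome per additional G allele within one environment."""
    return (cell_means[(env, 2)] - cell_means[(env, 0)]) / 2


effect_a = per_allele_effect("A")
effect_b = per_allele_effect("B")

# Averaging over equally sized environments gives the main effect of G
main_effect = mean([effect_a, effect_b])

# The interaction is the difference between environment-specific effects
interaction = effect_b - effect_a

print("effect in A:", effect_a)
print("effect in B:", effect_b)
print("main effect of G:", main_effect)
print("G x E interaction:", interaction)
```

A main-effect scan sees nothing here because the opposing effects cancel, while the interaction term is as large as the two effects combined.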

Technical aspects of study design and data analysis

Study designs, assay technologies, and statistical methods

1. “Gene discovery” (e.g., genetic epidemiology)
• Candidate gene studies
• Genome-wide association studies
• The bioinformatic “middle road” – biological hypotheses buy power

2. Environmental regulation of health (via transcription)
• Candidate transcript studies
• Genome-wide approaches

3. Gene-Environment interaction
• Statistical considerations
• Revisiting the “bioinformatic” middle road

Take-home points for this group:

1. Gene-Environment interactions are likely far more…
- ubiquitous
- large in effect size
- clinically/socially meaningful
…than current genetic analyses presume.

There is plenty left for you to find.

2. If you are limited to the study you already have (i.e., can’t alter the sampling design), your major opportunities for increasing power/discovery involve:
- focusing on substantive effects that are true/big (e.g., GxE, not G, given antagonistic pleiotropy; E, ExE, GxG, etc.)
- modeling biological mechanisms to focus power/impose constraints (e.g., candidate systems, functional themes, regulatory themes)
- combinatorial data-mining (e.g., machine learning in discovery sample)
- sequential testing designs (low-stringency discovery, medium-stringency test, high-stringency confirmation)

Your advantage is smart data analysis.
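The sequential testing design in point 2 can be sketched as a filter applied across independent samples. This is an illustration only: the candidate names, p-values, and the three thresholds (0.10, 0.05, 0.01) are hypothetical and are not values recommended in the slides.

```python
def sequential_screen(stage_pvals, thresholds):
    """Keep candidates that pass every stage at that stage's threshold.

    stage_pvals: list of dicts mapping candidate name -> p-value, one
                 dict per stage, each from an independent sample.
    thresholds:  list of p-value cutoffs, one per stage, loosest first.
    """
    survivors = set(stage_pvals[0])
    for pvals, cutoff in zip(stage_pvals, thresholds):
        # A candidate must be tested in this stage and pass its cutoff
        survivors = {c for c in survivors if c in pvals and pvals[c] <= cutoff}
    return sorted(survivors)


# Hypothetical interaction terms and p-values
discovery = {"snp1xE": 0.04, "snp2xE": 0.20, "snp3xE": 0.01, "snp4xE": 0.09}
test = {"snp1xE": 0.03, "snp3xE": 0.002, "snp4xE": 0.30}
confirm = {"snp1xE": 0.04, "snp3xE": 0.0005}

confirmed = sequential_screen([discovery, test, confirm], [0.10, 0.05, 0.01])
print(confirmed)
```

Because only survivors are carried forward, the later and stricter stages test far fewer hypotheses than a single-pass sweep would.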

Follow-up references

Overview of genetics / biology
• Attia, J., et al. (2009) How to use an article about genetic association: A: Background concepts. JAMA, 301, 74-81.

Genetic association studies
• Hirschhorn, J., & Daly, M. (2005) Genome-wide association studies for common diseases and complex traits. Nature Reviews Genetics, 6, 95-108.
• Attia, J., et al. (2009) How to use an article about genetic association: B: Are the results of the study valid? JAMA, 301, 191-197.
• Cordell, H., & Clayton, D. (2005) Genetic epidemiology 3: Genetic association studies. Lancet, 366, 1121-1131.

Basic statistical modeling for genetics
• Siegmund, D., & Yakir, B. (2007) The statistics of gene mapping. New York, Springer.

Sampling & statistical approaches for GxE discovery
• Thomas, D. (2010) Gene-environment-wide association studies: emerging approaches. Nature Reviews Genetics, 11, 259-272.

Statistical strategies for combinatorial discovery
• Hastie, T., Tibshirani, R., & Friedman, J. (2001) The elements of statistical learning. New York, Springer.

Perspectives on the State of the Field

How can we best promote the integration of genetic and demographic approaches?

Application clinic

Open microphone

1. What do you want to accomplish?

2. At what stage are you now?
i. Study design?
ii. Data collection?
iii. Analysis and reporting?

3. How can we be of help?

Richlin et al. Brain, Behavior & Immunity (2004)
