Pedigree Analysis in a Genome- wide World: Of Mice and Moms Janet Sinsheimer PhD Prof. Human Genetics Center Applied Statistics Seminar 11/15/2011

Page 1: Pedigree Analysis in a Genome- wide World: Of Mice and …cas.stat.ucla.edu/page_attachments/0000/0030/Sinsheimer_nov15_201… · Pedigree Analysis in a Genome-wide World: Of Mice

Pedigree Analysis in a Genome-wide World: Of Mice and Moms

Janet Sinsheimer PhD Prof. Human Genetics

Center Applied Statistics Seminar 11/15/2011

Gene Mapping with Pedigrees

•  When markers were scarce, pedigrees provided the optimal study design for mapping the location of trait genes.

•  Linkage analysis = use the patterns of transmission of trait phenotypes from parent to child and compare them with the transmission patterns of genes whose locations are known.

•  Marker genes = genes of known location.

Example of Linkage Analysis

Disease susceptibility gene between markers 4 and 5

Linkage Analysis’ Resolution is Poor

Most likely region ~ 3 million bases, could be dozens of genes

What about Linkage Analysis using Model Organism Pedigrees?

•  Model organisms can have traits similar to those of humans.

•  Find genes by using planned matings of inbred founders under a controlled environment (which increases power), then look for analogous genes in humans.

•  Mice have been used extensively:
    •  Highly inbred stocks lead to identical founders.
    •  Cheap to keep, with rapid generation times.
    •  Used often in medical research; can be genetically engineered to carry human genes.
    •  An extensive number of mutations have been carefully characterized.

Classic Inter-Cross Design

•  Start with highly inbred strains as the founders.

•  Mate to create F1 generation and then mate brothers and sisters to create the F2s.
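
The expected genotype proportions in this design can be checked with a minimal sketch. It assumes a single locus and two founder strains labeled A and B; the function name and the labels are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def offspring_distribution(mother, father):
    """Mendelian genotype distribution at one locus.

    Genotypes are 2-character strings such as "AA" or "AB"; each parent
    transmits one of its two alleles with probability 1/2.
    """
    dist = {}
    for a, b in product(mother, father):
        genotype = "".join(sorted(a + b))
        dist[genotype] = dist.get(genotype, Fraction(0)) + Fraction(1, 4)
    return dist

# Inbred founders are homozygous, so every F1 is heterozygous.
f1 = offspring_distribution("AA", "BB")
# Brother-sister mating of F1s gives the 1:2:1 ratio in the F2s.
f2 = offspring_distribution("AB", "AB")
```

Because all F1 animals are genetically identical, every recombination that distinguishes the F2s happened in a single generation, which is one way to see why resolution stays low in this design.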

Classical Inbred Crosses

•  Classical crosses, such as intercrosses, use just two strains and exploit recent recombination events.

•  Sparse marker maps.

•  Statistical analysis is typically analysis of variance.

•  Power to map is high but the resolution is still low. The design does not take advantage of the inbred lines’ common histories.

Genome Projects have made Association Testing in Unrelated Individuals Practical

•  Simple idea: A marker M is associated with trait T if trait values differ by marker genotype – ANOVA.

•  To be powerful, association testing in unrelated individuals requires very closely spaced markers and a high percentage of affected individuals carrying the same variant.

•  Common variant, common disease hypothesis: common diseases are caused in part by genetic variants that are also common but have small to moderate effect sizes.

•  Easy and quick to implement - do millions of tests in a couple of hours using an average desktop computer.
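
The "trait values differ by marker genotype" idea can be sketched as a one-way ANOVA F statistic. The trait values below are made-up toy numbers, and `anova_f` is a hypothetical helper written for this illustration rather than any package's API.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of trait values.

    Returns (F, df_between, df_within).
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of the genotype-group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individuals around their own genotype-group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within

# Toy data (invented for illustration): trait values grouped by marker genotype.
trait_by_genotype = {
    "AA": [10.1, 9.8, 10.4],
    "AB": [11.0, 11.3, 10.9],
    "BB": [12.2, 11.8, 12.1],
}
f_stat, df1, df2 = anova_f(list(trait_by_genotype.values()))
```

A genome-wide scan repeats this test once per marker, which is why the number of tests runs into the millions.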

Genome-wide Association Study for Height

•  Weedon et al. (2008) used ~30,000 individuals and more than ¼ million loci (SNPs).

Success or Failure? Statistical but no Clinical Significance.

•  Strong statistical support that 20 genes are associated with height (even after accounting for multiple testing).

•  P-values are very small, 2×10^-24 to 3×10^-7, so we reject the null hypothesis of no association.

•  These 20 polymorphisms explain very little of the variation in height: less than the amount the average person shrinks from morning to evening.

What Should We Try Next?

•  Return to pedigrees but don’t return to linkage analysis.

•  Develop association methods that exploit the information available in pedigrees.

•  First project: association using local strain origins for inbred strains.

•  Second project: use maternal genetic effects to improve power.

Project 1 Exploits Recent Inbred Cross Design Innovations

•  Pedigrees are deeper and multiple founder strains are used providing more contrast and more recombination events for better resolution.

•  Gene chips with dense marker maps exist.

•  Strain phylogenies are better known.

•  Quasi-random mating is sometimes used.

•  We want to take the common histories of the different founder strains into account. This is straightforward if mating is nearly random; otherwise a new method is needed.

Collaborative Cross Design

But the method is quite general and allows for a variety of other crossing schemes.

Overview of our New Method

•  For mice progeny, determine the strain origins for small sections all over the chromosomes (local strain origins).

•  Use these local strain origins as predictors of trait values in a regression analysis – more informative than using SNP genotypes.

•  Also take into account the common genetic history of the progeny by modeling the polygenic background, which is a function of the global strain fractions.
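
One way to turn local strain origins into regression predictors is an additive coding that counts copies of each strain at a locus. The sketch below is an assumption made for illustration, not a description of how the method itself codes the predictors; the strain labels and the function name are hypothetical.

```python
def strain_dosage_row(maternal_origin, paternal_origin, strains):
    """Additive coding of local strain origin at one locus.

    The first strain is treated as the reference; the row holds the number
    of copies (0, 1 or 2) of each remaining strain.
    """
    _reference, *others = strains
    return [int(maternal_origin == s) + int(paternal_origin == s) for s in others]

strains = ["A", "B", "C", "D"]  # hypothetical founder strain labels
row_bd = strain_dosage_row("B", "D", strains)  # one copy each of B and D
row_cc = strain_dosage_row("C", "C", strains)  # two copies of C
row_aa = strain_dosage_row("A", "A", strains)  # reference strain only
```

Under this coding, K founder strains give K - 1 predictors per locus, whereas a single biallelic SNP gives one.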

The Global Strain Fractions for the Progeny Affect the Trait Values

•  Both the mean and the covariance of a trait depend on the fraction of the genome contributed by a particular strain, the global strain fractions. Think of them as providing a genetic history for the mouse progeny.

•  Global strain fractions are calculated recursively starting with the founders.
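
The recursion can be sketched as follows, assuming that a founder is 100% its own strain and that a child's expected fraction for each strain is the average of its parents' fractions. The pedigree layout, identifiers and function name are hypothetical.

```python
def global_strain_fractions(pedigree, founder_strain, strains):
    """Expected genome fraction from each founder strain, per individual.

    pedigree: dict id -> (mother_id, father_id), or None for founders.
    founder_strain: dict founder id -> strain label.
    """
    fractions = {}

    def fraction_of(ind):
        if ind in fractions:
            return fractions[ind]
        parents = pedigree[ind]
        if parents is None:
            # A founder's genome comes entirely from its own strain.
            result = {s: 1.0 if s == founder_strain[ind] else 0.0 for s in strains}
        else:
            # Assumption: each parent contributes half of the genome in expectation.
            mom, dad = (fraction_of(p) for p in parents)
            result = {s: 0.5 * (mom[s] + dad[s]) for s in strains}
        fractions[ind] = result
        return result

    for ind in pedigree:
        fraction_of(ind)
    return fractions

pedigree = {"fA": None, "fB": None, "fC": None, "x": ("fA", "fB"), "y": ("x", "fC")}
founder_strain = {"fA": "A", "fB": "B", "fC": "C"}
fractions = global_strain_fractions(pedigree, founder_strain, ["A", "B", "C"])
```

Here individual `y` ends up one quarter A, one quarter B and one half C, and the fractions for every individual sum to one.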

Local Strain Origins are Mean Predictors in the Association Analysis

•  Instead of using the SNPs in a region as predictors, use the best guess of the strain origin of the maternally derived and paternally derived sections of the chromosome.

•  Need to have a dense set of markers.

•  Imputation of local strain origins is done in the Mendel software by minimizing a penalized likelihood one individual at a time and assigning the individual the most likely strain.

•  The penalty reduces the number of switches between founder strains.

•  We found the accuracy of the algorithm to be very high (>98%).
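
The effect of a switch penalty can be illustrated with a small dynamic program over one haplotype. This is a simplified stand-in and not the algorithm implemented in Mendel: it assumes a unit mismatch cost in place of a real likelihood, a fixed cost per switch between founder strains, and hypothetical strain labels and allele strings.

```python
def impute_strain_origins(observed, founder_alleles, switch_penalty=2.0, mismatch_cost=1.0):
    """Assign a founder strain to every marker of one haplotype.

    Minimizes (mismatch_cost x mismatches) + (switch_penalty x strain switches)
    by dynamic programming along the chromosome.
    """
    strains = list(founder_alleles)
    # cost[s]: best penalized cost of any path that ends in strain s at the current marker
    cost = {s: 0.0 if founder_alleles[s][0] == observed[0] else mismatch_cost for s in strains}
    back = []
    for m in range(1, len(observed)):
        new_cost, pointers = {}, {}
        for s in strains:
            # Staying in the same strain is free; switching pays the penalty.
            prev = min(strains, key=lambda t: cost[t] + (0.0 if t == s else switch_penalty))
            emit = 0.0 if founder_alleles[s][m] == observed[m] else mismatch_cost
            new_cost[s] = cost[prev] + (0.0 if prev == s else switch_penalty) + emit
            pointers[s] = prev
        cost = new_cost
        back.append(pointers)
    # Trace the best path back from the last marker.
    path = [min(strains, key=lambda s: cost[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

founders = {"A": "0000000000", "B": "1111111111"}  # toy founder haplotypes
observed = "0000011011"                            # marker 8 looks like a stray A allele
smooth = impute_strain_origins(observed, founders, switch_penalty=2.0)
jumpy = impute_strain_origins(observed, founders, switch_penalty=0.1)
```

With the larger penalty the single discordant marker is absorbed as a mismatch and the path switches strain once; with a very small penalty the path follows every marker and switches three times.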

Strain Association versus SNP Association

•  Strain association can be far more informative and therefore more powerful.

•  Note: working with two traits together is better than each alone.

Traits analyzed            Trait 1 alone     Trait 2 alone     Traits 1 and 2
SNP LRT                    6.984             0.494             11.914
SNP DF                     1                 1                 2
SNP p-value                0.0082            0.493             0.0026
Strain LRT (CI interval)   27.55 (0.90 Mb)   31.57 (6.46 Mb)   46.57 (0.73 Mb)
Strain DF                  3                 3                 6
Strain p-value             4.51×10^-6        6.44×10^-7        2.34×10^-7

Project 1 Future Work

•  Map genes for multivariate traits using actual Collaborative Cross data.

•  Collaborative Cross status:
    •  Both genotype and phenotype data are now available.
    •  http://csbio.unc.edu/CCstatus/index.py

•  The project could make an excellent master’s thesis and would provide experience with very large genetic data sets and with genetic analyses, as well as method development opportunities.

Project 2: Determining Maternal Influences on Offspring Traits

•  Human data project.

•  Disturbances that affect fetal development may lead to adult diseases.

•  The prenatal environment has been postulated to have a role in common diseases such as:
    -  Cardiovascular disease
    -  Anxiety and depression
    -  Diabetes
    -  Schizophrenia
    -  Obesity
    -  ADHD

•  Maternal effects are difficult to detect in GWAS.

Prenatal Effects can have Genetic Origins

•  Examples of genetically induced prenatal effects include maternal-fetal genotype incompatibilities.

•  Maternal-fetal genotype (MFG) incompatibility = combinations of maternal and fetal genes that create an adverse prenatal environment and lead to disease in the offspring.

•  Phenotypes induced by MFG incompatibility cluster in families and are heritable.

Hypothetical Example of Maternal-Fetal Genotype Incompatibility

§  Simple case: one locus, two alleles.

§  One allele codes for an antigen.

§  The other allele codes for nothing (null).

§  Mother is homozygous null, fetus is heterozygous.

§  The mother produces an immune response to the fetus’s antigen that is detrimental to the fetus.

§  Does this really occur?

RHD Incompatibility and Hemolytic Disease of the Newborn

[Figure: diagram of cells marked + and -, with IgG antibodies. Drawing courtesy of C. Palmer.]

Mom’s genotype = dd; baby’s genotype = Dd.

Mom forms IgG antibodies against the baby’s expressed antigen, and these destroy the baby’s RBCs.

RhHDN: jaundice, kernicterus, hypoxia.
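
How often such a mother-child combination arises can be worked out in a few lines. This is a sketch under stated assumptions: Hardy-Weinberg proportions, random mating, and an antigen allele frequency chosen purely for illustration (it is not an estimate for RHD in any population).

```python
from fractions import Fraction

def incompatible_pair_probability(p_antigen):
    """P(mother is null/null and child is heterozygous) at one biallelic locus.

    Assumes Hardy-Weinberg proportions and random mating; p_antigen is the
    population frequency of the antigen-coding allele (D in the RHD example).
    """
    q_null = 1 - p_antigen
    p_mother_null = q_null * q_null  # mother is dd
    # A dd mother always transmits d, so the child is Dd exactly when the
    # father transmits D, which happens with probability p_antigen.
    return p_mother_null * p_antigen

prob = incompatible_pair_probability(Fraction(2, 5))  # illustrative frequency only
```

With an antigen allele frequency of 2/5 the incompatible combination has probability (3/5)^2 × 2/5 = 18/125, or about 14% of mother-child pairs.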

Example 2: HLA Matching and Immunological Intolerance

Immunological intolerance = failure to stimulate an immune response that is needed to protect the baby from the mother or from exogenous agents. Immunological intolerance can increase risk of disease.

Mom’s genotype = i/j; baby’s genotype = i/i (matched from mom’s view).

[Figure: diagram of cells labeled i/i and i/j.]
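
"Matched from mom's view" can be written as a small predicate. The definition used here is an assumption drawn from the example above: the child counts as matched when it carries no allele that the mother lacks. The function name and the allele labels are hypothetical.

```python
def matched_from_mothers_view(mother, child):
    """True when every allele the child carries is also present in the mother."""
    return set(child) <= set(mother)

mother = ("i", "j")
matched_child = matched_from_mothers_view(mother, ("i", "i"))    # the example on this slide
unmatched_child = matched_from_mothers_view(mother, ("i", "k"))  # k is foreign to the mother
```

The relation is not symmetric: a mother with genotype i/j sees an i/i child as matched, but an i/i mother would not see an i/j child as matched.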

Find Study Designs and Analysis Approaches that can Answer the Following Questions

§  Is there a high risk allele that acts through the offspring’s genotype alone to increase risk of disease?

§  Is there a high risk allele that acts through the mother’s genotype alone to increase risk of disease?

§  Are there combinations of maternal and offspring’s genotypes that increase risk of disease in the offspring?

Possible Study Designs

•  Standard GWAS designs that use unrelated individuals are poorly powered to detect these effects (Sinsheimer 2003), and so MFG incompatibilities can account for a portion of the missing heritability.

•  Effective designs:
    •  Case-mother, control-mother, e.g. Chen J, Zheng H, Wilson ML. 2009.
    •  Nuclear family based “affected only” tests, e.g. Hsieh HJ et al., 2006, 2007.

Our Motivation: Finnish Schizophrenia Family Study

•  230 families (161 nuclear and 69 extended) from Finland, comprising affected individuals and their available relatives.
    •  The largest family has 73 individuals, 32 of them genotyped.
    •  553 affected individuals, 1-6 per pedigree, 60% male.

•  1090 individuals genotyped at HLA B.

•  Our original analysis used nuclear families and found a significant effect of HLA B matching (Palmer et al. 2006).

Using Data from Complex Pedigrees in Nuclear Family MFG test

Which Family to choose?

[Figure: extended pedigree diagram with individuals labeled 1 and 437 to 449.]

Using all of the Nuclear Families could introduce Bias, Inflate Significance

[Figure: the same extended pedigree, individuals labeled 1 and 437 to 449.]

How to Test for MFG Incompatibility using Extended Pedigrees?

•  Want an approach that can use varied pedigree structures and incomplete data.

•  Use the likelihood of the genotype patterns conditional on the affecteds in the pedigree.

•  Requires more assumptions than nuclear family or case-mother, control-mother tests.

So what do we get in Return?

•  More accurate and precise estimates when we have extended pedigrees with more than one affected.

•  Illustrate with Simulated Data

Example Simulation

•  Extended pedigrees with 4 affecteds

•  300 extended pedigrees

•  1000 data sets

•  Variable relative risks due to matching

•  Compare results of the nuclear family test (3 families per pedigree) with the extended pedigree test

Selected Simulation Results

         Extended Pedigrees                        Nuclear Families
µ        Est µ   95% Coverage   Rejection Rate     Est µ   95% Coverage   Rejection Rate
1.00     0.987   0.965          0.046              0.986   0.969          0.042
1.50     1.490   0.950          0.856              1.438   0.937          0.807
2.50     2.504   0.953          0.999              2.316   0.908          0.991

Treating the data as nuclear families leads to a slight loss of power and to underestimates of MFG incompatibility.

What about the Finnish Schizophrenia Example?

Model         Male µ (95% CI)         Female µ (95% CI)       Log Likelihood
Null          = 1.0                   = 1.0                   -2868.179
Full          0.890 (0.687, 1.153)    1.449 (1.109, 1.892)    -2864.638
Only Female   = 1.0                   1.417 (1.089, 1.843)    -2865.042

•  Reject null of no MFG matching in favor of full model (p-value = 0.029)

•  Reject null of no MFG matching in favor of female effect (p-value = 0.012)
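
The two p-values can be recovered from the log likelihoods with a likelihood ratio test. The sketch assumes 2 degrees of freedom for the full model (male and female µ both free) and 1 for the female-only model; those degrees of freedom are an inference from the models listed, not something stated on the slide. `chi2_sf` is a hypothetical helper that covers only these two cases.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch handles only df = 1 or df = 2")

def lrt_p_value(loglik_null, loglik_alt, df):
    """Likelihood ratio test: the statistic is twice the gain in log likelihood."""
    return chi2_sf(2.0 * (loglik_alt - loglik_null), df)

p_full = lrt_p_value(-2868.179, -2864.638, df=2)    # about 0.029
p_female = lrt_p_value(-2868.179, -2865.042, df=1)  # about 0.012
```

The female-only model gives up very little log likelihood relative to the full model while using one fewer parameter, which is why its p-value is the smaller of the two.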

Project 2 Future Work

•  Need to make the analysis more efficient. Better algorithms to speed up.

•  How can we extend the method to handle quantitative traits?

•  Again, this could make an excellent master’s project or part of a PhD dissertation.

References and Software

•  Mouse references:
    •  “QTL Association Mapping by Imputation of Strain Origins in Multifounder Crosses”, J.J. Zhou, A. Ghazalpour, E.M. Sobel, J.S. Sinsheimer, K. Lange, under review.
    •  Bauman LE, Sinsheimer JS, Sobel EM, Lange K. (2008) Genetics. 180:1743-61.

•  MFG references:
    •  Childs EJ, Sobel EM, Palmer CG, Sinsheimer JS. (2011) Hum Hered. 72:160-171.
    •  Childs EJ, Palmer CG, Lange K, Sinsheimer JS. (2010) Genet Epidemiol. 34:512-21.

•  Both methods implemented in the “Inbred Strains Analysis” Option, Mendel version 11.0 and higher www.genetics.ucla.edu/software

Acknowledgements

•  Collaborators: E. Childs, A. Ghazalpour, K. Lange, C. Palmer, E.M. Sobel, J.J. Zhou

•  Funding: NIH GM53275 and MH59490