
Dog introgression patterns in a South

European wolf population

MSc in Bioinformatics

Master’s Thesis

Daniel Gómez-Sánchez

Barcelona, 2014



Approval of the tutors

Signed,

Dr. Antonio Barbadilla Dr. Carles Lalueza-Fox


ACKNOWLEDGMENTS

First of all, it has been a pleasure working with the people of the Institut de Biologia Evolutiva (CSIC-UPF). In particular, I’m very grateful to the Paleogenomics group for the opportunity to work with a very professional team: to Dr. Carles Lalueza-Fox for accepting me into his group, to Dr. Óscar Ramírez because this work could not have been done without his help and ideas, to Iñigo Olalde and Federico Sánchez-Quinto for sharing their bioinformatic skills with me, and to Federica Pierini.

Second, I’m thankful to the faculty of the Universitat Autònoma de Barcelona’s MSc in Bioinformatics, and in particular to my academic tutor Antonio Barbadilla, for the bioinformatic knowledge and skills they taught me.

Third, I would like to express my gratitude to Dr. Carles Vilà, Dr. Robert K. Wayne, Dr. Tomas Marques-Bonet and Dr. Jeffrey M. Kidd for the unpublished data used in this Master’s Thesis. I would also like to acknowledge Raphael Carrasco and Conrad Enseñat for the donation of the Sierra Morena and Wolf EEP samples; Dr. Natalia Sastre for the microsatellite analysis; and Dr. Adam Boyko, Dr. Bridgett vonHoldt and Dr. Malgorzata Pilot for the information about the 48K dataset.

I’m very thankful to Dr. Carles Lalueza-Fox, Dr. Óscar Ramírez, Iñigo Olalde and Dr.

Antonio Barbadilla for the review and comments on the manuscript; and Patricia

Rodríguez and Jordi Antonio Pinzón García for their help in the linguistic revision.

Last but not least, I’m very grateful to all the people who have believed, and continue to believe, in my early scientific career: to my parents Antonio and María Luisa, my sisters Alba and Alicia, Patricia Rodríguez and the rest of my family for their patience and affection; to my friends Alberto Segovia Sanz and Jordi Antonio Pinzón García for their aid in the battle to conquer informatics; and to Dr. Juan Luis Santos and all the people in the Cytogenetics lab at the Universidad Complutense de Madrid for my initiation into the scientific world.

Thank you all, because without you this Master’s Thesis would never have been written.


INDEX

Contents

1. INTRODUCTION
1.1. Extinction risk factors
1.2. Molecular markers
2. OBJECTIVES
3. MATERIAL AND METHODS
3.1. Sampling and sequencing
3.2. Mapping
3.3. SNP calling
3.4. Diversity analysis and inbreeding
3.5. Ancestry analysis
3.6. Hybridization analysis
4. RESULTS
4.1. Heterozygosity and inbreeding
4.2. Hybridization patterns
5. DISCUSSION
6. CONCLUSION
7. REFERENCES

Appendixes

1. Bioinformatics’ discussion
2. Results for non-Iberian samples
3. Heterozygosity by chromosome
4. Heterozygosity distribution for non-Iberian samples
5. Principal components’ boxplots and PCA with component 4
6. Cross-validation error of the ADMIXTURE analysis
7. Linear model details of heterozygosity-percentage block analysis


List of Tables

Table 1. Results for Iberian samples

List of Figures

Figure 1. Iberian wolf distribution
Figure 2. Diversity distribution
Figure 3. Diversity analysis
Figure 4. Runs of homozygosity
Figure 5. Principal component analysis
Figure 6. ADMIXTURE analysis of the present-work's dataset
Figure 7. ADMIXTURE analysis of the 48K-merged dataset
Figure 8. STRUCTURE analysis of introgressed wolves
Figure 9. Ancestry across the chromosome
Figure 10. Analysis of haplotype blocks
Figure 11. Iberian shared alleles


1. INTRODUCTION

Grey wolves (Canis lupus) were historically distributed across Europe, Asia and North America, but hunting by humans, deforestation and the loss of wild prey reduced their populations during the past centuries (Boitani 2003). The species became fragmented and confined to the southern European peninsulas (Iberia, Italy and the Balkans), Canada and the northern USA (Mech 1970; Breitenmoser 1998). In the 1960s, legal protection led to a population expansion in the USA and Western Europe (Mech 1995); in Eastern Europe and Northern Asia, however, there was neither protection nor extinction risk (Bibikov 1994). These differences in protection and threat between populations make the species a good model for conservation genetics and genomics.

In Europe, wolves have a discontinuous range in which three main populations can be distinguished, corresponding spatially to different glacial refugia and demographic histories (Pilot et al. 2014): large, interconnected populations under constant hunting pressure in Eastern Europe, and two relatively smaller, isolated and bottlenecked populations in Western Europe. Eastern European wolves are also connected with the Asian populations; nevertheless, hunting causes multiple local demographic fluctuations (for example, Ozolins & Andersone 2001; Sidorovich et al. 2003; Gomerčić et al. 2010). Currently, wolves in Western Europe are expanding from the partially protected populations in Italy (including the Apennine Peninsula and the Western Italian Alps) and the Iberian Peninsula to other regions such as Catalonia or France (Sastre 2011).

The Iberian Peninsula contains the largest wolf population in Western Europe (Boitani 2003; Silva et al. 2013). It has been isolated at least since the wolves of France and Central Europe went extinct at the end of the 19th century, and it was reduced by human eradication campaigns (Valverde 1971). With new conservation policies, the population subsequently expanded in range and size (Figure 1; Sastre 2011). Although the figure is controversial, the wolf population in the Iberian Peninsula is estimated at 2,200-2,500 individuals, concentrated in the Northwestern region (Silva et al. 2013). South of the Duero river, however, only very fragmented and isolated populations with a high extinction risk are present (Silva et al. 2013). In 1994, the European Breeding of Endangered Species Programme (EEP) started a breeding program for the Iberian wolf derived from 15 founders according to the studbook. The relatively high number of independent founders and the subsequent genetic management have led to the conservation of the original variability in the EEP population (Ramírez et al. 2006).

1.1. Extinction risk factors

Small and isolated populations have an increased risk of extinction due to genetic drift and inbreeding, because the probability of mating between relatives increases (Frankham 2005; Wright et al. 2007). Mating between close relatives increases the number of homozygous loci in the offspring, which may reduce their fitness through inbreeding depression (Wright 1977; Falconer & Mackay 1996). Furthermore, small isolated populations are also known to have a higher risk of hybridizing (Godinho et al. 2011; Randi et al. 2014).

Figure 1. Iberian wolf distribution. Range in the Iberian Peninsula between the 19th and 20th centuries. Taken from Sastre 2011.


Wolf populations are therefore sensitive to both processes because of the biogeographical history described above.

Population bottlenecks

Bottlenecks are demographic processes that consist of a severe reduction in effective population size. Their consequences can include a loss of genetic diversity and increased consanguinity, deleterious mutations and genetic drift (Bouzat 2010), which lead to reduced adaptability (Frankham et al. 1999) and an increased extinction risk through genetic and demographic processes (Keller 2002). Founder effects, over-exploitation by humans, diseases, starvation and other natural and biological catastrophes cause population bottlenecks, and their genetic effects depend on the strength and duration of the bottleneck and on the level of isolation (Carmichael et al. 2001; Busch et al. 2007). In the Italian and Iberian Peninsulas, isolated populations suffered a severe bottleneck, and many studies have described these effects on the genetic landscape (Sastre et al. 2010; Sastre 2011; Pilot et al. 2014).

Population fragmentation

Fragmentation of populations due to habitat loss and modification is an increasingly important threat in the conservation of endangered species, because the diversity of a population can only increase through mutation or the exchange of genes with neighbouring populations (Vilà et al. 2003a). Isolated and fragmented populations have an increased extinction risk due to the lack of migration and their lower effective population size (Frankham 2005). Under these conditions, migration events may be important for the rescue of small and inbred populations (Tallmon et al. 2004). Detecting population fragmentation is therefore crucial for conservation management. European wolf populations have been more fragmented than American ones; thus, Old World grey wolves are strongly differentiated from one another because of genetic drift and the lack of gene flow between populations (Pilot et al. 2010).


Inbreeding depression

Founder effects and isolation might reduce allelic and genotypic diversity within populations, increasing inbreeding and the probability of extinction. Inbreeding increases the frequency of homozygotes for deleterious recessive alleles, which in turn reduces individual fitness. This phenomenon is known as inbreeding depression and might decrease the short-term viability of a population due to the loss of adaptive potential (Ouborg 2010; Ouborg et al. 2010). Wolves are prone to inbreeding depression (Liberg et al. 2005; Räikkönen et al. 2006, 2009), but large or fast-growing populations seem to avoid it thanks to selection for heterozygotes (Randi 2011). However, the decline in genetic variability is correlated with the effective population size, which is very small in wolves even in the largest populations (Randi 2011), probably due to increased levels of inbreeding together with decreased dispersal and immigration (Aspi et al. 2006). Confirmed negative effects of inbreeding depression in wild wolves include decreased overwinter survival of pups (Liberg et al. 2005) and congenital bone deformities in isolated wolf populations (Räikkönen et al. 2006, 2009).

Hybridization

Hybridization between wild species and their domestic counterparts represents a threat to natural populations, although it can also introduce genetic variation into isolated populations. Its consequences can include the disruption of local adaptation, increased genetic homogenization between populations and extinction through introgressive hybridization (Rhymer & Simberloff 1996). Grey wolves and domestic dogs possess identical karyotypes and can produce fertile hybrids despite their physiological and morphological differences (Wayne et al. 1989; Vilà & Wayne 1999). Until now, population genetic studies have not revealed large-scale introgression of dog genes into wolves with the markers used. Nevertheless, several studies have started to detect hybrids in natural populations (Randi et al. 2000, 2014; Randi & Lucchini 2002; Andersone et al. 2002; Verardi et al. 2006; Sundqvist 2008; Godinho et al. 2011; Hindrikson et al. 2012). Analyses based on only a few molecular markers cannot detect backcrosses from past generations (Randi 2008), and cryptic introgression is likely to go undetected (Currat et al. 2008). Moreover, introgressed variants may be indistinguishable from intraspecific variation (Caniglia et al. 2013). If introgression is sufficiently frequent, small and fragmented wolf populations can lose specific adaptations and subsequently become extinct. Wolf re-expansion waves are also at risk of contamination by hybridization because of the number of free-ranging or feral dogs (Randi 2011).

1.2. Molecular markers

Evolutionary processes in natural populations have been studied extensively with classical population genetics (Ouborg et al. 2010). In grey wolves, many such studies have been conducted using microsatellite loci (Verardi et al. 2006; vonHoldt et al. 2010; Randi et al. 2014), MHC genes (Galaverni et al. 2013; Niskanen et al. 2014), mtDNA (Thalmann et al. 2013) and combinations of them (Ramírez et al. 2006; Sastre et al. 2010; Godinho et al. 2011). Advances in Next Generation Sequencing (NGS) technologies allow thousands of genetic markers to be examined, including indel polymorphisms and single nucleotide polymorphisms (SNPs). Access to a large number of loci permits researchers to overcome the analytical limitations associated with a small number of genetic markers (Allendorf et al. 2010), including in hybridization analysis (Twyford & Ennos 2012).

Many studies have analysed the population processes of wolves and other canids using NGS technologies (Boyko et al. 2010; vonHoldt et al. 2010, 2011; Pilot et al. 2014), based on genotyping microarrays derived from the complete sequencing of the dog genome (Lindblad-Toh et al. 2005). Although this kind of data enhances the understanding of wolf populations (vonHoldt et al. 2011), microarrays designed from a closely related species could introduce a bias, because they include only the variation of that species. Until now, only a few studies (Lindblad-Toh et al. 2005; Wang et al. 2013; Axelsson et al. 2013; Freedman et al. 2014) have included whole-genome sequencing of wolves; however, all of them focus on the study of domestication.


2. OBJECTIVES

The objective of this Master’s Thesis is to analyse a wolf population using genome-wide sequencing for the first time, including three Northwestern Iberian samples, one of them from the EEP, and the first individual from South of the Duero in the literature. These samples cover the current diversity of the whole Iberian Peninsula. Only one previous study (Godinho et al. 2011) has analysed the dynamics of wolf-dog hybridization in this Northwestern population, where wolves use agricultural habitats close to human settlements (Cuesta et al. 1991; Llaneza et al. 1996; Vos 2000; Blanco & Cortés 2007), which is likely to favour contact with feral and free-ranging dogs and may result in extensive hybridization (Petrucci-Fonseca 1982; Blanco et al. 1992).

The present work has two specific objectives:

To analyse the degree of inbreeding in two small wolf populations, one of them on the edge of extinction.

To analyse the level of dog hybridization and the introgression patterns in wolves.


3. MATERIAL AND METHODS

The bioinformatic methodology is discussed in Appendix 1.

3.1. Sampling and sequencing

We generated the whole-genome sequences of 4 Iberian wolves: one captive (Wolf EEP), two from the Northwestern population (Wolf Spain and Wolf Portugal) and one from Southern Spain (Sierra Morena). The Illumina libraries were constructed following the manufacturer's instructions and sequenced at the CNAG (Centre Nacional d’Anàlisi Genòmica) and the BGI (Beijing Genomics Institute). In addition, we included the genomes of 11 dogs (different breeds), 6 American wolves and 6 Eurasian wolves (unpublished data; Freedman et al. 2014) for comparison purposes. All wild samples derive from animals killed or found dead for reasons other than this research and deposited in scientific collections. The captive wolf sample, whose origin is the Northwestern Iberian population, comes from the Parc Zoologic of Barcelona.

3.2. Mapping

All the sequences were mapped to the dog reference genome (canFam3.1) using BWA version 0.6.1 (Li & Durbin 2009), with the quality trimming parameter set to a Sanger quality score of 15 and default values for the remaining parameters. Next, I used Picard tools version 1.70 (http://picard.sourceforge.net/) to remove PCR duplicates and GATK version 2.5 (McKenna et al. 2010) to perform indel realignment. The resulting files were used for the SNP calling.

Then, the DepthOfCoverage tool implemented in GATK was used on the autosomal data to compute the average depth of coverage of this final set.

3.3. SNP calling

I produced a preliminary set of autosomal variants for wolves and dogs (19,640,837 SNPs) using the GATK UnifiedGenotyper and VariantFiltration with the filtering parameters recommended for the case in which Variant Quality Score Recalibration (VQSR) is not available (Auwera et al. 2013; more details in Appendix 1).

To avoid low-complexity regions and gaps, the mappable region was obtained using the GEM mappability program (Derrien et al. 2012) version 1.315 and custom Perl scripts. Keeping only the variants that fell into these regions, I obtained a final dataset for all the samples containing 18,956,547 confident SNPs.
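The restriction of variants to the mappable region can be illustrated with a minimal Python sketch. It is not the custom Perl code used in this work: the function names, the half-open interval convention and the toy coordinates are assumptions made only for illustration, and real input would come from the GEM mappability output and the variant files.

```python
from bisect import bisect_right

def merge_intervals(intervals):
    """Merge overlapping or touching half-open [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]

def filter_snps(snp_positions, mappable):
    """Keep the SNP positions that fall inside any mappable interval."""
    merged = merge_intervals(mappable)
    starts = [s for s, _ in merged]
    kept = []
    for pos in snp_positions:
        # Index of the last interval starting at or before pos, if any.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            kept.append(pos)
    return kept

# Hypothetical toy data for a single chromosome.
mappable_chr1 = [(100, 200), (150, 300), (500, 600)]
snps_chr1 = [50, 100, 250, 300, 550, 700]
kept_chr1 = filter_snps(snps_chr1, mappable_chr1)
print(kept_chr1)
```

Merging the intervals first keeps the lookup to one binary search per SNP, which matters when millions of variants are tested per chromosome.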

3.4. Diversity analysis and inbreeding

To explore the genome-wide distribution of genetic variability in the Iberian samples, I examined the distribution of heterozygosity across the genome in overlapping 1 Mb windows with a 200 kb sliding step, using in-house Perl and R scripts. For each window, the number of heterozygous positions was computed and divided by the number of callable positions. Only windows with at least 100 kb of callable sequence were considered. This approach has been used in other studies (for example, Prado-Martinez et al. 2013) and was preferred to the expected heterozygosity (π, Tajima 1983) because the number of samples for each population is small.
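As an illustration of the windowed calculation, the following Python sketch computes heterozygosity per sliding window. It is a stand-in for the in-house Perl and R scripts, not a copy of them: the function names are hypothetical, callable regions are assumed to be non-overlapping half-open intervals, and heterozygous positions are assumed to lie inside callable regions.

```python
from bisect import bisect_left

def callable_overlap(intervals, start, end):
    """Callable bases of half-open intervals that fall inside [start, end)."""
    return sum(max(0, min(e, end) - max(s, start)) for s, e in intervals)

def window_heterozygosity(het_positions, callable_intervals, chrom_length,
                          window=1_000_000, step=200_000, min_callable=100_000):
    """Return (window_start, het_per_bp) for windows with enough callable bases."""
    hets = sorted(het_positions)
    results = []
    for start in range(0, max(chrom_length - window, 0) + 1, step):
        end = start + window
        n_callable = callable_overlap(callable_intervals, start, end)
        if n_callable < min_callable:
            continue  # too little callable sequence: window is skipped
        n_het = bisect_left(hets, end) - bisect_left(hets, start)
        results.append((start, n_het / n_callable))
    return results

# Toy example with scaled-down window parameters (hypothetical values).
toy = window_heterozygosity(
    het_positions=[1, 3, 13],
    callable_intervals=[(0, 8), (12, 20)],
    chrom_length=20, window=10, step=2, min_callable=5,
)
print(toy[0], toy[-1])
```

Dividing by the callable bases of each window, rather than by the nominal window size, keeps windows with gaps or filtered regions comparable to fully covered ones.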

To avoid coverage differences between samples, I removed variants in sample-specific non-callable regions, obtained with the GATK CallableLoci tool using a minimum base quality of 20 and minimum and maximum depth thresholds based on each sample's coverage distribution (taking the mean±5 autosomal read depth).

Runs of homozygosity (ROHs) are regions with a lower heterozygosity rate. For each sample, ROHs were computed with a non-overlapping window size of 1 Mb. Depending on their length, ROHs may be indicative of historical population demographics and homozygosity by descent (Li et al. 2006; Hamzić 2011). Long ROHs (> 1 Mb) are indicative of autozygosity, inbreeding or admixture (Boyko et al. 2010; Pilot et al. 2014). Because of this association with recent demography, I conservatively called a ROH only when at least two consecutive windows (≥ 2 Mb) fell under a heterozygosity cutoff of 0.0005 (based on the heterozygosity distribution; Figure 2, Appendix 2).
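The ROH criterion described above (at least two consecutive 1 Mb windows under the 0.0005 cutoff) can be sketched in Python as follows. The function name and the use of None for windows without enough callable sequence are assumptions for illustration; in particular, treating such a window as a break in the run is a choice of this sketch, not something stated in the methods.

```python
def call_rohs(window_het, window_size=1_000_000, cutoff=0.0005, min_windows=2):
    """Return (start, end) coordinates of runs of consecutive low-heterozygosity windows.

    window_het holds one heterozygosity value per consecutive non-overlapping
    window; None marks a window without data and breaks any open run.
    """
    rohs = []
    run_start = None
    # A trailing None acts as a sentinel that closes a run ending at the last window.
    for i, het in enumerate(list(window_het) + [None]):
        low = het is not None and het < cutoff
        if low and run_start is None:
            run_start = i
        elif not low and run_start is not None:
            if i - run_start >= min_windows:
                rohs.append((run_start * window_size, i * window_size))
            run_start = None
    return rohs

# Hypothetical per-window heterozygosity for one chromosome.
example_het = [0.0001, 0.0002, 0.002, 0.0003, 0.001, 0.0001, 0.0001, 0.0004]
example_rohs = call_rohs(example_het)
print(example_rohs)
```

In this toy input the isolated low window at index 3 is not reported, because a single window does not reach the 2 Mb minimum.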


To calculate the inbreeding coefficient based on runs of homozygosity, the following definition of FROH (Keller et al. 2011) was applied:

$$F_{ROH,j} = \frac{\sum_{k} \mathrm{length}(ROH_k)}{L_j}$$

where ROHk and Lj are the kth ROH and the genome length of individual j, respectively. The genome length for each sample was computed using the callable bases in the considered windows.
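A worked instance of this ratio, as a minimal Python sketch with hypothetical ROH coordinates and genome length (the function name is an assumption, and the numbers are not taken from any sample in this work):

```python
def f_roh(rohs, genome_length):
    """Inbreeding coefficient: summed ROH length divided by genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(end - start for start, end in rohs) / genome_length

# Two ROHs of 2 Mb and 3 Mb in a hypothetical 20 Mb callable genome.
example_f = f_roh([(0, 2_000_000), (5_000_000, 8_000_000)], 20_000_000)
print(example_f)
```

Here 5 Mb of the 20 Mb callable genome lie in ROHs, so the coefficient is 5/20.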

3.5. Ancestry analysis

For the ancestry analysis, non-biallelic and missing markers were removed with a custom Python script. The remaining markers were filtered by MAF (removing markers with MAF<0.01) and LD-pruned using PLINK version 1.07 (Purcell et al. 2007), with a sliding window of 50 SNPs (10 overlap) and r2=0.5. With this pruned dataset (4,558,774 SNPs), I performed an ADMIXTURE (Alexander et al. 2009) analysis, which uses the same statistical model as STRUCTURE (Pritchard et al. 2000). To assess the error, the program was run 5 times with K between 2 and 10 and 5-fold cross-validation (Alexander & Lange 2011). To visualize the relationships in these genotype data, a Principal Component Analysis (PCA) was performed using the smartPCA program implemented in the EIGENSOFT package version 5.0.1 (Price et al. 2006).
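The marker filters applied before pruning can be sketched as below. This is not the custom script used in this work: the function names and the genotype encoding (alternative-allele counts 0, 1 or 2 per sample, None for a missing call) are assumptions for illustration, and LD pruning is left to PLINK.

```python
def minor_allele_frequency(genotypes):
    """MAF from alternative-allele counts (0, 1 or 2) of diploid samples."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)

def keep_marker(alleles, genotypes, min_maf=0.01):
    """True if the marker is biallelic, fully genotyped and reaches min_maf."""
    if len(alleles) != 2:
        return False  # non-biallelic site
    if not genotypes or any(g is None for g in genotypes):
        return False  # at least one missing genotype
    return minor_allele_frequency(genotypes) >= min_maf

# Hypothetical markers: (alleles, genotypes across four samples).
print(keep_marker(("A", "G"), [0, 1, 2, 0]))       # polymorphic, complete
print(keep_marker(("A", "G", "T"), [0, 1, 2, 0]))  # triallelic
print(keep_marker(("A", "G"), [0, None, 1, 0]))    # missing call
print(keep_marker(("A", "G"), [0, 0, 0, 0]))       # monomorphic in this set
```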

To check and improve the ancestry results for the Iberian wolves, I combined the data from the present work with a 48K dataset from previous studies (Boyko et al. 2010; vonHoldt et al. 2010, 2011) to increase the number of samples. These data come from the Affymetrix Canine version 2 genome-wide SNP mapping array, which uses CanFam2 assembly coordinates. For this reason, each sample was mapped again to this assembly and SNPs were called as previously described. After joining both datasets, filtering by MAF and LD-pruning with PLINK using the same parameters as above, I obtained a set of 43,497 SNPs. This dataset was used to repeat the ADMIXTURE (only 3 runs) and PCA analyses as described above.


3.6. Hybridization analysis

To confirm the high level of admixture between the Sierra Morena wolf and dogs, the alleles shared between each Iberian wolf and the other samples were estimated. The percentage of shared alleles for each sample pair was computed by dividing by the total number of alleles present in both samples. For this analysis, I included all confident SNPs called (unpruned dataset, 15,807,997 SNPs, without non-biallelic and missing markers). For the Iberian wolf population, I also computed the alleles shared among all samples and drew a four-set Venn diagram with the VennDiagram R package version 1.6.5 (Chen & Boutros 2011).
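One possible reading of the shared-allele percentage is sketched below in Python. It rests on an assumption: the denominator ("total number of alleles present in both samples") is interpreted here as the number of distinct (site, allele) pairs carried by either sample, which the text does not state explicitly. Function names and genotypes are hypothetical.

```python
def allele_set(genotypes):
    """Set of (site, allele) pairs carried by a sample.

    genotypes maps a site identifier to a tuple with the two alleles.
    """
    return {(site, allele) for site, pair in genotypes.items() for allele in pair}

def shared_allele_percentage(sample_a, sample_b):
    """Percentage of alleles found in both samples over alleles found in either."""
    a, b = allele_set(sample_a), allele_set(sample_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)

# Hypothetical genotypes at three sites.
wolf = {1: ("A", "A"), 2: ("C", "T"), 3: ("G", "G")}
dog = {1: ("A", "G"), 2: ("T", "T"), 3: ("G", "G")}
shared = shared_allele_percentage(wolf, dog)
print(shared)
```

In this toy case three alleles are common to both samples out of five distinct alleles overall. A different denominator (for example, the sum of both samples' allele counts) would rescale the values but not change the ranking of sample pairs for a fixed reference sample.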

To determine dog-introgressed regions in the highly admixed Iberian wolf samples (Sierra Morena and Wolf Spain), PCAdmix version 1.0 (Brisbin et al. 2012) was used with a window size of 50 SNPs. Because this program needs phased genotypes, the complete pruned dataset was phased using SHAPEIT version 2.644 (Delaneau et al. 2013). To detect blocks of ancestry (haplotypes assigned to ancestral populations), PCAdmix was run with the 11 dogs as one ancestral population and the 6 Eurasian wolves plus Wolf Portugal and Wolf EEP as the second. To assess the result for both samples, the same analysis was subsequently performed for Wolf EEP and Wolf Portugal, excluding from the ancestral population only the sample being treated as admixed. From the 50-SNP block assignments of the four samples, the overall percentages of Dog/Dog, Wolf/Dog and Wolf/Wolf haplotypes were computed. To analyse the relationship of each kind of block with diversity, I fitted a linear model of the incidence of each class against the mean heterozygosity of each chromosome, using the lm function implemented in R (version 3.1.0).
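The block summary and the per-chromosome fit can be illustrated with a short Python sketch. It does not reproduce PCAdmix output parsing or R's lm: the one-letter labels ('D' for dog, 'W' for wolf), the function names and the toy numbers are assumptions, and the regression is a plain ordinary least squares fit with a single predictor.

```python
from collections import Counter

def block_class_percentages(hap1, hap2):
    """Percentage of Dog/Dog, Wolf/Dog and Wolf/Wolf blocks.

    hap1 and hap2 hold the per-block ancestry label ('D' or 'W') of the two
    phased haplotypes of one sample, in the same block order.
    """
    if len(hap1) != len(hap2) or not hap1:
        raise ValueError("haplotypes must be non-empty and of equal length")
    # Sorting each pair makes ('W', 'D') and ('D', 'W') the same class.
    counts = Counter("".join(sorted(pair)) for pair in zip(hap1, hap2))
    names = {"DD": "Dog/Dog", "DW": "Wolf/Dog", "WW": "Wolf/Wolf"}
    return {names[k]: 100.0 * counts.get(k, 0) / len(hap1) for k in names}

def least_squares(x, y):
    """Slope and intercept of the ordinary least squares line y = a*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical block labels for one sample.
percentages = block_class_percentages(list("DDWW"), list("DWDW"))
print(percentages)

# Hypothetical per-chromosome values: block incidence versus mean heterozygosity.
slope, intercept = least_squares([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(slope, intercept)
```

For real data, x would hold the incidence of one block class per chromosome and y the mean heterozygosity of that chromosome, one fit per class.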

For comparison purposes, a microsatellite analysis of Sierra Morena and Wolf Spain was carried out, genotyping 10 autosomal markers following the protocol of Sastre et al. (2010). Using a dataset that includes 31 Iberian wolves and 32 dog samples (Sastre 2011), a Bayesian model-based clustering approach implemented in STRUCTURE version 2.0 (Falush et al. 2007) was applied, running 100,000 Markov chain Monte Carlo repetitions with a burn-in period of 10,000 iterations for K=2.


Figure 2. Diversity distribution. Density (a) and box plots (b) of heterozygosity in Iberian samples using overlapping 1 Mb windows with a 200 kb sliding step. Dotted lines mark the cutoff used to define inbred windows.

4. RESULTS

4.1. Heterozygosity and inbreeding

Considering only Iberian wolves, Sierra Morena has the lowest mean heterozygosity rate (0.00109 het/bp, Table 1), with 41.85% of sliding windows falling into inbred regions (Figures 2 and 3, Appendix 3); Wolf Portugal seems to share the same pattern (0.00118 het/bp). The mean heterozygosity rate observed in the genome sequences of the Eurasian wolves is 0.0016 het/bp (except Wolf Italy, where it is lowest; Appendixes 2, 3 and 4), consistent with other genome-wide studies (Lindblad-Toh et al. 2005; Freedman et al. 2014). Dog samples have a reduced heterozygosity (0.00088 het/bp; Appendixes 2, 3 and 4), which varies across breeds as previously described (Lindblad-Toh et al. 2005; Freedman et al. 2014).

Table 1. Results for Iberian samples.

Sample          Population     Cov     Het          FROH   %Dog blocks
Sierra Morena   South Spain    43.94   0.00109271   0.42   31.88
Wolf Spain      Northwestern   22.68   0.00154275   0.15   14.30
Wolf Portugal   Northwestern   24.30   0.00118270   0.30    2.94
Wolf EEP        Northwestern   22.74   0.00146647   0.15    3.20

Cov: autosomal coverage; Het: heterozygosity (het/bp); FROH: inbreeding coefficient; %Dog blocks: percentage of dog ancestry blocks


Figure 3. Diversity analysis. Heterozygosity in Iberian samples using 1Mb 200kb-overlapping windows (blue lines). Dotted lines point out the median of each sample and red blocks are runs of homozygosity (ROHs) using 1Mb non-overlapping windows. Details for each sample in Appendix 3.


ROHs appear in all Iberian wolves (Figure 4), but Sierra Morena has chromosomes that are almost entirely homozygous (Figure 3, more details in Appendix 3). This sample shows the longest ROHs, at 40-60 Mbp, and its cumulative curve is the highest among the Iberian samples. Although Wolf Portugal also has runs longer than 40 Mbp (Figure 4b), its distribution is quite similar to that of the other Iberian wolves. Wolf Spain and Wolf EEP show similar cumulative curves of ROH length (Figure 4a), and almost all of their runs of homozygosity are shorter than 30 Mbp.

Figure 4. Runs of homozygosity. Cumulative (a) and total (b) counts of runs of homozygosity (ROHs) in Iberian samples, computed using 1 Mb non-overlapping windows. Note that inbred regions shorter than 2 Mb are shown in the plot but not considered as ROHs.


The inbreeding coefficient, estimated as FROH, leads to the same result: Sierra Morena is the most inbred Iberian wolf (FROH = 0.42), followed by Wolf Portugal (FROH = 0.30). Wolf Spain and Wolf EEP have an inbreeding coefficient (FROH = 0.15) that is half that of the most inbred Northwestern Iberian sample (Table 1). The FROH of Eurasian and American wolves (Appendix 2) ranges between 0.01 and 0.09, much lower than that of Wolf Spain and Wolf EEP, except for Wolf Italy (FROH = 0.51), Wolf China (FROH = 0.23) and both Wolf Mexico samples (A and B, FROH = 0.70). Dogs, on the other hand, have a FROH between 0.20 and 0.44 (depending on the breed), higher than Northwestern Iberian wolves and around the Sierra Morena value.
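The FROH values above can be read as the fraction of the autosomal genome covered by ROHs. The following minimal sketch, which is not the pipeline used in this work, calls ROHs from per-window heterozygosity in 1 Mb non-overlapping windows and discards runs shorter than 2 Mb, as in Figure 4. The heterozygosity threshold, the function names and the toy values are assumptions made only for illustration.

```python
WINDOW_BP = 1_000_000   # 1 Mb non-overlapping windows, as in Figure 4
MIN_ROH_BP = 2_000_000  # runs shorter than 2 Mb are not counted as ROH


def call_rohs(het_per_window, het_threshold,
              window_bp=WINDOW_BP, min_roh_bp=MIN_ROH_BP):
    """Return (start_bp, end_bp) for runs of consecutive low-heterozygosity windows.

    het_threshold is a hypothetical cut-off (het/bp) below which a window
    is treated as homozygous; it is not a value taken from the thesis.
    """
    rohs = []
    run_start = None
    # A sentinel window with infinite heterozygosity closes any open run.
    for i, het in enumerate(list(het_per_window) + [float("inf")]):
        is_low = het < het_threshold
        if is_low and run_start is None:
            run_start = i
        elif not is_low and run_start is not None:
            if (i - run_start) * window_bp >= min_roh_bp:
                rohs.append((run_start * window_bp, i * window_bp))
            run_start = None
    return rohs


def froh(rohs, genome_bp):
    """Fraction of the analysed genome that lies inside ROHs."""
    return sum(end - start for start, end in rohs) / genome_bp


# Toy chromosome of 10 windows (values are invented).
het = [0.002, 0.0001, 0.0001, 0.0001, 0.002,
       0.0001, 0.002, 0.0001, 0.0001, 0.002]
rohs = call_rohs(het, het_threshold=0.0005)
print(rohs, froh(rohs, genome_bp=len(het) * WINDOW_BP))
```

In this toy example the isolated 1 Mb low-heterozygosity window is ignored, so only the 3 Mb and 2 Mb runs contribute to the coefficient.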

4.2. Hybridization patterns

The 48K-merged dataset and the present work's dataset give similar results in the PCA. Wolf Portugal and Wolf EEP cluster with Iberian wolves and near other Eurasian populations (Figure 5). Wolf Spain and Sierra Morena, however, are shifted from this cluster towards dogs along PC1, which clearly separates American wolves, Eurasian wolves and dogs (Appendix 5). In the present work's dataset, PC4 better distinguishes the Eurasian populations, but shows the same pattern in the Iberian wolves (Appendix 5).
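To illustrate why an admixed sample is displaced between clusters along a principal component, the sketch below runs a plain PCA (via singular value decomposition) on a toy genotype matrix coded as 0, 1 or 2 copies of an allele. It is a generic illustration, not the software or the data used in this work; the genotypes and the function name are invented.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Project samples (rows) onto the first principal components.

    genotypes: samples x SNPs matrix with allele counts (0, 1 or 2).
    """
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)          # centre each SNP
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Invented genotypes: two "wolves", two "dogs" and one intermediate sample.
toy = [
    [0, 0, 0, 2, 2, 1],  # wolf 1
    [0, 0, 1, 2, 2, 2],  # wolf 2
    [2, 2, 2, 0, 0, 1],  # dog 1
    [2, 2, 1, 0, 0, 0],  # dog 2
    [1, 1, 1, 1, 1, 1],  # admixed sample
]
pcs = genotype_pca(toy)
for label, pc1 in zip(["wolf1", "wolf2", "dog1", "dog2", "admixed"], pcs[:, 0]):
    print(label, round(float(pc1), 3))
```

In this toy case PC1 separates the two groups and the intermediate sample falls between them, which is the kind of displacement described for Wolf Spain and Sierra Morena. The sign of a principal component is arbitrary, so only relative positions are meaningful.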

The ADMIXTURE analysis reveals the same pattern of hybridization with dogs (Figure 6). Because the 48K dataset (Boyko et al. 2010; vonHoldt et al. 2010, 2011) comes from a microarray that maximizes dog variability and includes more samples, dog ancestry in Wolf Spain is detected better in the 48K-merged dataset (Figure 7) than in the present work's dataset (Figure 6), although this component does appear in the latter at K=2. The cross-validation error for both datasets (Appendix 6) reflects these differences, with the best-supported number of clusters being K=9 and K=2, respectively.

In the ADMIXTURE analysis at K=2, our samples show the following percentages for the dog component (this study and 48K-merged datasets, respectively): Sierra Morena 31.51% and 36.94%, Wolf Spain 10.43% and 17.69%, Wolf Portugal 0.00% and 4.47%, Wolf EEP 0.00% and 3.28%. Importantly, the 48K dataset detects a higher percentage of introgression in every sample. It is likely that this bias is caused by the


Figure 5. Principal component analysis. Principal component analysis (PCA) of Sierra Morena (red), Wolf Portugal (blue), Wolf Spain (green) and Wolf EEP (orange) with the 48K-merged dataset samples (a) and samples from this work (b, c). In c), SNPs from dog blocks in Sierra Morena and Wolf Spain (Figure 9) are removed (note that in this case the Iberian samples cluster closer together).


microarray design, which takes into account only the variability from the dog genome. In contrast, for the hybrid samples (Sierra Morena and Wolf Spain), the STRUCTURE analysis using microsatellites gives dog components of 42.4% and 0.6%, respectively (Figure 8). These results are very different from those obtained with genomic data, suggesting that the microsatellite estimates are inaccurate because of the low number of markers.

Given this displacement towards dogs, I analyse haplotype blocks of dog and Eurasian wolf ancestry in the admixed samples (Figure 9). The result shows that almost a third of Sierra Morena's genome (31.88% of the 50-SNP blocks) comes from dogs, more than double Wolf Spain's dog ancestry (14.30%; Table 1, Figure 9). Moreover, the ancestry pattern differs between the two samples: Sierra Morena has long dog haplotypes present at the same region in both chromosomes, whereas in Wolf Spain they are shorter and the two chromosomes show different distributions (Figure 10a). Wolf Portugal and Wolf EEP have only 3% dog ancestry (Table 1), a result that validates the method. These values are close to the percentage of the ADMIXTURE dog component and indicate an accurate

Figure 6. ADMIXTURE analysis of the present-work's dataset. Cross-validation error (Appendix 6) shows that the best-supported clustering is K=2, which differentiates dogs and wolves. Nevertheless, this analysis also differentiates North American (K=3), South American (K=4), Asian and European (K=5, K=7) and Iberian (K=8) wolves. Moreover, relationships between dog breeds are reflected at K=6-9.


Figure 7. ADMIXTURE analysis of the 48K-merged dataset. Cross-validation error (Appendix 6) shows that the better cluster is K=9, which differentiates dogs and different wolf populations: Asian (ASW), Central European (CEW), Italian (ITW), Iberian (IBW) and 3 different American (AMW) populations (left panel). On the right, zoom of Iberian samples analysed in this work.


estimation of the hybridization patterns with our genomic data in contrast with

microsatellites.
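The two summaries used above, the percentage of dog ancestry and the frequency of each haplotype class (Dog/Dog, Wolf/Dog and Wolf/Wolf), can be sketched as follows. The example assumes that each phased haplotype has already been labelled block by block as dog ("D") or wolf ("W"), and that the dog percentage is the fraction of haplotype blocks assigned to dog; both the input format and the function name are hypothetical, not the output of the software used in this work.

```python
from collections import Counter

CLASS_NAMES = ("Wolf/Wolf", "Wolf/Dog", "Dog/Dog")  # indexed by number of dog copies


def summarize_ancestry(hap1, hap2):
    """Return (dog ancestry in %, frequency of each haplotype class).

    hap1, hap2: per-block ancestry labels ("D" or "W") for the two
    haplotypes of one sample, in the same block order.
    """
    if len(hap1) != len(hap2) or not hap1:
        raise ValueError("haplotypes must be non-empty and of equal length")
    classes = Counter()
    for a, b in zip(hap1, hap2):
        n_dog = (a == "D") + (b == "D")
        classes[CLASS_NAMES[n_dog]] += 1
    n_blocks = len(hap1)
    dog_copies = classes["Wolf/Dog"] + 2 * classes["Dog/Dog"]
    dog_pct = 100.0 * dog_copies / (2 * n_blocks)
    freqs = {name: classes[name] / n_blocks for name in CLASS_NAMES}
    return dog_pct, freqs


# Invented labels for one chromosome with eight blocks.
dog_pct, freqs = summarize_ancestry(list("DDWWWWWW"), list("DWWWWWWW"))
print(dog_pct, freqs)
```

With labels of this kind, a sample such as Sierra Morena would show a substantial Dog/Dog frequency, whereas in Wolf Spain the dog blocks would fall mostly in the Wolf/Dog class.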

The linear models relating heterozygosity to the percentage of each haplotype class (Dog/Dog, Wolf/Dog and Wolf/Wolf) indicate that, in the hybrid samples, chromosomes with a higher percentage of blocks carrying both ancestries tend to be more genetically variable (Figure 10b, Appendix 7). This result suggests that the hybrid regions in Wolf Portugal increase the heterozygosity by introgression.
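A per-chromosome linear model of this kind can be sketched with an ordinary least-squares fit, where each point is one chromosome. This is only an illustration under assumed inputs: the values are invented, and the models actually fitted in this work are those detailed in Appendix 7.

```python
import numpy as np


def fit_het_vs_ancestry(pct_blocks, heterozygosity):
    """Least-squares line het = slope * pct + intercept, plus R squared.

    pct_blocks: percentage of blocks of one haplotype class per chromosome.
    heterozygosity: heterozygosity (het/bp) of the same chromosomes.
    """
    x = np.asarray(pct_blocks, dtype=float)
    y = np.asarray(heterozygosity, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - np.sum(residuals ** 2) / ss_tot
    return float(slope), float(intercept), float(r_squared)


# Invented values for five chromosomes of a hypothetical hybrid sample.
pct_wolf_dog = [2.0, 8.0, 15.0, 21.0, 30.0]
het = [0.0010, 0.0012, 0.0013, 0.0016, 0.0018]
print(fit_het_vs_ancestry(pct_wolf_dog, het))
```

A positive slope corresponds to the pattern described above, with chromosomes richer in mixed-ancestry blocks being more variable.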

After removing the SNP windows that present a dog haplotype in at least one of the admixed samples, the PCA shows the same clusters as before (Figure 5c). In this case, Wolf Spain joins the Iberian cluster (Wolf Portugal and Wolf EEP), and Sierra Morena lies close to them. However, Sierra Morena remains displaced towards dogs, including along PC4 (Appendix 5), which better explains the Eurasian variation.

Furthermore, using all confident markers, Sierra Morena shares around 70% of alleles with dogs, almost 1% more than Wolf Spain and 3% more than the non-admixed samples (Wolf Portugal and Wolf EEP) (Figure 11a). On the other hand, of all the alleles present in the Iberian population (22,959,835 out of 31,616,032 in the dataset), Sierra Morena has 5% singletons, compared with 3% in the rest (Figure 11b). Moreover, exclusive alleles shared between Sierra Morena and Wolf Spain are a

Figure 8. STRUCTURE analysis of introgressed wolves. Probabilistic assignment to the

genetic clusters inferred by Bayesian analysis with K=2 of dog, Iberian wolves (IBW) and the

hybrid samples Wolf Spain (WS) and Sierra Morena (SM).


little higher (around 0.5% more) than between the other Iberian wolves and Sierra Morena. Both results are consistent with the high level of hybridization in Sierra Morena and a slight introgression of the dog genome in Wolf Spain.
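The shared-allele percentages can be sketched with set operations, following the definition given in Figure 11: the number of shared alleles divided by the total number of alleles in the considered samples. The allele identifiers, sample names and helper functions below are invented for illustration and do not reproduce the code used in this work.

```python
from collections import Counter


def shared_allele_pct(alleles_a, alleles_b):
    """Percentage of alleles shared by two samples over all alleles in both."""
    union = alleles_a | alleles_b
    if not union:
        return 0.0
    return 100.0 * len(alleles_a & alleles_b) / len(union)


def singleton_pct(samples):
    """Percentage of the group's alleles found in only one given sample.

    samples: dict mapping sample name -> set of allele identifiers.
    """
    counts = Counter(allele for alleles in samples.values() for allele in alleles)
    total = len(counts)
    return {
        name: 100.0 * sum(1 for allele in alleles if counts[allele] == 1) / total
        for name, alleles in samples.items()
    }


# Invented allele sets, written as "chromosome:position:allele".
samples = {
    "sample_A": {"chr1:100:A", "chr1:200:T", "chr1:300:G", "chr1:400:C"},
    "sample_B": {"chr1:200:T", "chr1:300:G", "chr1:400:C", "chr1:500:A"},
    "sample_C": {"chr1:300:G", "chr1:400:C", "chr1:600:T", "chr1:700:G"},
}
print(shared_allele_pct(samples["sample_A"], samples["sample_B"]))
print(singleton_pct(samples))
```

Because the denominator is the union of alleles, a sample carrying many private alleles lowers its shared percentage with every other sample, which is consistent with the reduced affinity reported for Sierra Morena.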

Figure 9. Ancestry across the chromosome. Ancestry blocks from dogs (red) and Eurasian

wolves (blue) in Sierra Morena (a), Wolf Spain (b), Wolf Portugal (c) and Wolf EEP (d). In

the legend, N represents the number of individuals used as ancestral population.


Figure 10. Analysis of haplotype blocks. Haplotype class block frequency (a) and heterozygosity-percentage linear model (b, details in Appendix 7) by sample and class. Each point represents a chromosome.


Figure 11. Iberian shared alleles. Shared alleles between samples in the present-work’s dataset by pairs (a) and between the four Iberian samples (b). Percentage is calculated as the number of shared alleles divided by the total number of alleles in the considered samples.


5. DISCUSSION

The wolf population of Northwestern Iberia has been extensively studied in many respects (Ramírez et al. 2006; Sastre et al. 2010; Godinho et al. 2011; Sastre 2011; vonHoldt et al. 2011; Pilot et al. 2014), but this work includes for the first time the variability present in the South of the Iberian Peninsula. Although only one sample from this population was analysed, it could be the last individual, given the high extinction risk of an isolated one-pack group (Padial et al. 2000; Silva et al. 2013) composed of a single breeding pair, their offspring of the year and occasional older offspring (Randi 2011). Because of this small size and the controversy about the existence of the South Iberian wolf, I use “population” to refer to the data from the Sierra Morena individual. Comparing the genetic patterns of both populations helps to understand the dynamics of the Iberian wolf and its current conservation status. This study analyses the first NGS data from wolves in the context of conservation and population genetics; thus we can investigate heterozygosity, inbreeding and hybridization patterns in depth for each individual.

A major concern in wolf conservation genetics is the extensive hybridization between wolves and wild or domestic canids (Rhymer & Simberloff 1996; Randi 2011). Hybridization is a documented threat to canids, including the Ethiopian wolf with dogs (Gottelli et al. 1994), and the red wolf (Adams et al. 2003), the Great Lakes wolf (Leonard & Wayne 2008) and other North American wolves with coyotes (Roy et al. 1994). In Europe, hybridization between declining or expanding wolf populations and their domestic counterparts is an important threat (Randi 2011), and many hybrids have been reported using a small number of genetic markers (Randi et al. 2000, 2014; Randi & Lucchini 2002; Andersone et al. 2002; Verardi et al. 2006; Sundqvist 2008; Godinho et al. 2011; Hindrikson et al. 2012). Only one previous study (Godinho et al. 2011) provides information about hybridization between wolves and dogs in the Iberian Peninsula, reporting a 4% occurrence of hybridization (8 individuals) in the Northwestern population using 42 autosomal markers. In Godinho et al. (2011), some of the introgressed samples were selected because they presented dog phenotypic traits. Most of the hybrid individuals show a 50% dog component and only two samples have a component lower than 20% (specifically, 15.2% and 18.1%). Both individuals carry a dog-like Y chromosome, which indicates the direction of the cross, suggesting that the


introgression is recent and thus more detectable. In the present work, one of the two wild samples from the Northwestern population shows a minor level of hybridization (around 15%), whereas Sierra Morena has a third of its genome introgressed by dog (Table 1, Figure 9). Moreover, the mtDNA of both individuals (data not shown) presents the w1 wolf haplotype (from Vilà et al. 1999), which supports the predominant direction of wolf-dog hybridization detected in previous works (Vilà et al. 2003b; Godinho et al. 2011). Nevertheless, the 15% hybridization level obtained for Wolf Spain with whole-genome data cannot be detected here using 10 microsatellite markers (Figure 8). Conversely, Sierra Morena’s microsatellite-based dog component is almost 50%, far from the proportion of around 30% obtained with ADMIXTURE and PCAdmix (Figures 6, 7 and 9). In addition, Wolf Spain and Sierra Morena had no hybrid phenotypic characteristics, contrary to the observations of Godinho et al. (2011). The results of this work suggest that microsatellite data might underestimate the incidence of hybridization in samples with introgression levels of around 15%; on the other hand, they might overestimate this parameter in the most introgressed samples. Further whole-genome information from the Iberian Peninsula will help to understand the proportion of hybridization in the Northwestern population and to estimate the occurrence of hybridization accurately.

Because of their coexistence with feral and domestic dogs, hybridization has an important effect on small wolf populations such as those of the Italian and Iberian Peninsulas (Verardi et al. 2006; Randi 2008; Godinho et al. 2011; Randi et al. 2014) and also on expanding populations in other European regions (Andersone et al. 2002; Sundqvist 2008; Hindrikson et al. 2012). Feral organisms might have an impact on the structure of local communities, leading to loss of genetic diversity (Allendorf et al. 2001). Moreover, introgression of dog genes can decrease the adaptive potential of the hybrids and lead to extinction (Rhymer & Simberloff 1996). Introgressive hybridization would enhance genetic homogenization, breaking down local genetic adaptation. Habitat modification is increasing through anthropogenic action, and this leads to fragmentation and isolation of many populations. Individuals in these small and isolated populations that are in contact with their domestic counterparts are more likely to hybridize because of the difficulty of finding mates of the same species. This is very important in the South Spain population, where the effective size is very small (Silva et al. 2013) and the habitat is close to human settlements (Cuesta et al. 1991; Llaneza et al. 1996; Vos 2000; Blanco & Cortés 2007). When introgression occurs, a relatively greater fraction


of a small population would hybridize each generation, increasing the introgression rate even further (Rhymer & Simberloff 1996).

On the other hand, genetic diversity is important in conservation genetics because of its effects on inbreeding depression and the disruption of local adaptation (Allendorf et al. 2010). Heterozygosity varies considerably across Iberian samples (Table 1, Figures 2 and 3): in Wolf Spain it is similar to that of other European wolf populations (Appendixes 2, 3 and 4), whereas Wolf Portugal has a lower rate; Wolf EEP, the captive sample, is as variable as Wolf Spain, and the diversity of Sierra Morena is the lowest, but very close to that of Wolf Portugal. Nevertheless, the four Iberian samples have a higher inbreeding coefficient than the Eurasian populations, whose coefficients range between 0.01 and 0.09, except for Wolf China (0.23) and Wolf Italy (0.51) (Appendix 2). The Wolf China sample was previously described (Freedman et al. 2014), showing the same diversity and ROHs; the Italian wolf is known to have passed through a severe bottleneck with genetic effects (Lucchini et al. 2004; Fabbri et al. 2007; Randi 2008) that explains its inbreeding pattern. Genetic evidence of a bottleneck in the Iberian population has been reported (Sastre et al. 2010; Sastre 2011), and explains the inbreeding coefficients obtained for the Iberian samples in the present work.

Although the EEP studbook indicates that Wolf EEP should have an inbreeding coefficient near 0, because only a few generations of crosses have taken place, Wolf EEP has the same inbreeding coefficient as Wolf Spain (FROH = 0.15). This value can be explained by the past bottleneck, which reduced the diversity of Iberian individuals (Sastre et al. 2010; Sastre 2011). An inbreeding coefficient of 0.125, close to the value of Wolf Spain, is produced by matings between grandparent and grandchild, half-siblings, or uncle and niece (assuming non-inbred parents). Wolf Portugal, in line with its lower heterozygosity, has a higher inbreeding coefficient (0.30) that is likely to involve matings between closely related wolves with the same inbreeding coefficient as Wolf Spain and Wolf EEP. In a population with a small effective size, inbreeding cannot be avoided (Randi 2011), as the very high FROH (0.42) of Sierra Morena demonstrates. This value is near that expected under continuous mating between very close relatives (parent/offspring, full siblings and double first cousins in the first degree), so the number of individuals must be very small. The autozygosity of Sierra Morena leads to a loss of


genetic variation and inbreeding depression, which can make the population disappear, a threat that is likely to materialize in the very inbred Iberian population.
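The reference value of 0.125 quoted above follows from Wright's path formula, in which the inbreeding coefficient of an individual is the sum, over every path connecting its two parents through a common ancestor, of (1/2)^n (1 + F_A), where n is the number of individuals in the path and F_A is the inbreeding coefficient of the common ancestor. The sketch below applies it to single-generation matings between non-inbred relatives; the function name and the way paths are listed are choices made here for illustration only.

```python
def inbreeding_from_paths(path_lengths, ancestor_f=0.0):
    """Wright's path formula for the offspring of two related parents.

    path_lengths: number of individuals in each path linking the two
    parents through a common ancestor (parents and ancestor included).
    ancestor_f: inbreeding coefficient of the common ancestors
    (0.0 means non-inbred ancestors, as assumed in the text).
    """
    return sum(0.5 ** n * (1.0 + ancestor_f) for n in path_lengths)


matings = {
    "parent/offspring": [2],         # the parent is itself the common ancestor
    "full siblings": [3, 3],         # one path through each shared parent
    "half-siblings": [3],            # one shared parent
    "grandparent/grandchild": [3],   # grandparent, parent, grandchild
    "uncle/niece": [4, 4],           # one path through each shared grandparent
}
for name, paths in matings.items():
    print(name, inbreeding_from_paths(paths))
```

Under these assumptions, half-sibling, grandparent/grandchild and uncle/niece matings give 0.125, whereas parent/offspring and full-sibling matings give 0.25 in a single generation; higher values require inbred ancestors or repeated close matings.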

In wolves, inbreeding affects the health of the population (Liberg et al. 2005; Räikkönen et al. 2013). Inbreeding depression, associated with the loss of genetic diversity, reduces reproduction and survival, increasing extinction risk (Frankham 2005). It also affects the ability of the population to adapt to environmental change. Although inbreeding depression can be reduced when selection removes deleterious alleles (purging), this effect is weak in small populations, where deleterious alleles of small effect can drift to fixation (Frankham 2005). Consequently, these alleles increase in frequency and reduce reproductive fitness (Wright et al. 2007). The South Spain population has a very small effective size (only one pack; Silva et al. 2013), in which deleterious alleles are likely to reach high frequency or fixation due to inbreeding. In Northwestern Iberia, although the estimated population size does not seem to endanger the diversity of the population (Silva et al. 2013) because of its recent growth, the inbreeding coefficient is between 0.15 and 0.30 (Table 1), slightly higher than in other well-conserved Eurasian populations (Appendix 2). This increase in inbreeding suggests that the effective size values for the Iberian Peninsula might be overestimated (for example, by including juveniles; Vilà 2010), as the mating structure related to the inbreeding coefficients suggests. More samples will be necessary to verify this hypothesis.

Despite the differences in genetic variation, the three Northwestern wolves and Sierra Morena cluster together when the dog component is removed (Figure 5). Moreover, among the wild Northwestern individuals, the dog component only appears in the less inbred one, leading to the conclusion that the increase in heterozygosity and the decrease in inbreeding coefficient in this sample are due to hybridization (Figure 10). The genetic integrity of the Iberian population is at risk due to hybridization with dogs, as shown by the Sierra Morena and Wolf Spain samples. Nevertheless, the introgression patterns of Sierra Morena and Wolf Spain are different (Figures 9 and 10): the introgressed regions of Wolf Spain are always Dog/Wolf, whereas in Sierra Morena there are also Dog/Dog haplotypes. This suggests that hybridization in the South population is frequent, and that almost all remaining individuals have introgression signals in their genomes. Wolf Spain, on the other hand, shows a pattern more likely related to a sporadic hybridization event.


The regional and continental genetic patterns of wolves detected by vonHoldt et al. (2011) are confirmed here using genome-wide sequences without the bias introduced by microarray design. Shared alleles reveal a geographical pattern (Figure 11), even when only Iberian samples are considered: the shared percentage decreases with geographical distance, being highest with the Central European sample. However, Sierra Morena shows reduced affinity with the other Iberian samples due to the introgression of dog alleles. Wolf Spain shows an increase in alleles shared with dogs, although it retains its affinities with the other Iberian wolves. The current Iberian population can be considered different from other European populations. The Iberian wolf has been described as a distinct subspecies (Cabrera 1907). Morphometric (Vilà 1993) and genetic (Vilà et al. 1999; Lucchini et al. 2004) studies describe a notable differentiation between Iberian and Eurasian wolves, which suggests that they have been separated from all other European wolves for a long time. In recognition of its evolutionary potential (Crandall et al. 2000), the Northwestern Iberian population demands separate management. Although previous studies based on a few molecular markers (Lucchini et al. 2004; Ramírez et al. 2006; Sastre et al. 2010; Sastre 2011) conclude that there is no severe reduction in genetic variability, here it is shown that the Wolf Portugal sample has an important reduction in diversity (Table 1, Figures 2, 3 and 4), and that in Wolf Spain hybridization reduces the inbreeding coefficient (Figure 10). These two lines of evidence indicate that the Northwestern Iberian population is at risk, because of either inbreeding or introgression. Moreover, the inbreeding coefficient is higher than in other European samples (Table 1, Appendix 2). If high levels of inbreeding or hybridization are a widespread pattern in the Iberian Peninsula, these results suggest that the real effective population size is lower than previous estimates.


6. CONCLUSION

In summary, the similarities between the Sierra Morena individual and the other Iberian samples included in this Master’s Thesis show that the Northwestern population is at risk for the same reasons as Sierra Morena. High inbreeding coefficients and introgression are well-known conservation threats (Rhymer & Simberloff 1996; Frankham 2005; Ouborg 2010; Allendorf et al. 2010; Randi 2011). Both factors are detected in the Northwestern samples, indicating that the population is not as well conserved as previously described. Nevertheless, two different patterns have been detected in the two wild Northwestern individuals: Wolf Spain has a heterozygosity rate approximately equal to that of other Eurasian populations, but dog introgression is present; Wolf Portugal, on the other hand, has an increased inbreeding coefficient and no hybridization. Analysis of more samples could reveal which pattern predominates in the Northwestern Iberian population.

The conclusions derived from the present work are the following:

The South Iberian wolf shows a loss of genomic diversity and extensive hybridization with dogs, which indicates an important extinction risk.

The Northwestern Iberian wolf has higher diversity and less introgression than the South population, but the levels observed still represent a threat to the population.

Patterns of hybridization differ between the two populations: in the South, introgression is frequent and widespread; in the Northwest, it is an occasional event.

The Northwestern population has an inbreeding coefficient slightly higher than that of other healthy grey wolves as a consequence of the bottleneck suffered in the Iberian Peninsula.

Although more samples are needed, wolf population size seems to be overestimated in the Northwestern Iberian Peninsula.


7. REFERENCES

Adams JR, Kelly BT, Waits LP (2003) Using faecal DNA sampling and GIS to monitor

hybridization between red wolves (Canis rufus) and coyotes (Canis latrans).

Molecular ecology, 12, 2175–2186.

Alexander DH, Lange K (2011) Enhancements to the ADMIXTURE algorithm for

individual ancestry estimation. BMC bioinformatics, 12, 246.

Alexander DH, Novembre J, Lange K (2009) Fast model-based estimation of ancestry

in unrelated individuals. Genome Research, 19, 1655–1664.

Allendorf FW, Hohenlohe PA, Luikart G (2010) Genomics and the future of conservation genetics. Nature Reviews Genetics, 11, 697–709.

Allendorf FW, Leary RF, Spruell P, Wenburg JK (2001) The problems with hybrids:

setting conservation guidelines. Trends in Ecology & Evolution, 16, 613–622.

Andersone Ž, Lucchini V, Ozoliņš J (2002) Hybridisation between wolves and dogs in

Latvia as documented using mitochondrial and microsatellite DNA markers.

Mammalian Biology - Zeitschrift für Säugetierkunde, 67, 79–90.

Aspi J, Roininen E, Ruokonen M, Kojola I, Vilà C (2006) Genetic diversity, population

structure, effective population size and demographic history of the Finnish wolf

population. Molecular ecology, 15, 1561–76.

Auwera GA Van Der, Carneiro MO, Hartl C et al. (2013) From FastQ Data to High-

Confidence Variant Calls : The Genome Analysis Toolkit Best Practices Pipeline.

In: Current Protocols in Bioinformatics (eds Bateman A, Pearson WR, Stein LD,

Stormo GD, Yates JR), pp. 11.10.1–11.10.33. Hoboken, NJ, USA.

Axelsson E, Ratnakumar A, Arendt M-L et al. (2013) The genomic signature of dog

domestication reveals adaptation to a starch-rich diet. Nature, 495, 360–364.

Bibikov DI (1994) Wolf problem in Russia. Lutreola, 3, 10–14.

Blanco JC, Cortés Y (2007) Dispersal patterns, social structure and mortality of wolves

living in agricultural habitats in Spain. Journal of Zoology, 273, 114–124.

Blanco JC, Reig S, de la Cuesta L (1992) Distribution, status and conservation problems

of the wolf Canis lupus in Spain. Biological Conservation, 60, 73–80.

Boitani L (2003) Wolf conservation and recovery. In: Wolves. Behavior, Ecology, and

Conservation (eds Mech LD, Boitani L), pp. 317–344. The University of Chicago

Press, Chicago.

Bouzat JL (2010) Conservation genetics of population bottlenecks: the role of chance,

selection, and history. Conservation Genetics, 11, 463–478.


Boyko AR, Quignon P, Li L et al. (2010) A simple genetic architecture underlies

morphological variation in dogs. PLoS biology, 8, e1000451.

Breitenmoser U (1998) Large predators in the Alps: The fall and rise of man’s

competitors. Biological Conservation, 83, 279–289.

Brisbin A, Bryc K, Byrnes J et al. (2012) PCAdmix: principal components-based

assignment of ancestry along each chromosome in individuals with admixed

ancestry from two or more populations. Human biology, 84, 343–364.

Busch JD, Waser PM, Dewoody JA (2007) Recent demographic bottlenecks are not

accompanied by a genetic signature in banner-tailed kangaroo rats (Dipodomys

spectabilis). Molecular ecology, 16, 2450–62.

Cabrera A (1907) Los lobos de España. Boletín de la Real Sociedad Española de

Historia Natural, 7, 193–198.

Caniglia R, Fabbri E, Greco C et al. (2013) Black coats in an admixed wolf × dog pack: is melanism an indicator of hybridization in wolves? European Journal of Wildlife Research, 59, 543–555.

Carmichael LE, Nagy JA, Larter NC, Strobeck C (2001) Prey specialization may

influence patterns of gene flow in wolves of the Canadian Northwest. Molecular

Ecology, 10, 2787–2798.

Chen H, Boutros PC (2011) VennDiagram: a package for the generation of highly-

customizable Venn and Euler diagrams in R. BMC bioinformatics, 12, 35.

Crandall KA, Bininda-Emonds ORP, Mace GM, Wayne RK (2000) Considering

evolutionary processes in conservation biology. Trends in Ecology & Evolution,

15, 290–295.

Cuesta L, Barcena F, Palacios F, Reig S (1991) The trophic ecology of the Iberian Wolf

(Canis lupus signatus Cabrera, 1907). A new analysis of stomach’s data.

Mammalia, 55, 239–254.

Currat M, Ruedi M, Petit RJ, Excoffier L (2008) The hidden side of invasions: massive

introgression by local genes. Evolution, 62, 1908–1920.

Delaneau O, Zagury J-F, Marchini J (2013) Improved whole-chromosome phasing for

disease and population genetic studies. Nature methods, 10, 5–6.

Derrien T, Estellé J, Marco Sola S et al. (2012) Fast computation and applications of

genome mappability. PloS one, 7, e30377.

Fabbri E, Miquel C, Lucchini V et al. (2007) From the Apennines to the Alps:

colonization genetics of the naturally expanding Italian wolf (Canis lupus)

population. Molecular ecology, 16, 1661–1671.

Falconer DS, Mackay TFC (1996) Quantitative genetics. Pearson Education Limited.


Falush D, Stephens M, Pritchard JK (2007) Inference of population structure using

multilocus genotype data: dominant markers and null alleles. Molecular ecology

notes, 7, 574–578.

Frankham R (2005) Genetics and extinction. Biological Conservation, 126, 131–140.

Frankham R, Lees K, Montgomery ME et al. (1999) Do population size bottlenecks

reduce evolutionary potential? Animal Conservation, 2, 255–260.

Freedman AH, Gronau I, Schweizer RM et al. (2014) Genome sequencing highlights

the dynamic early history of dogs. PLoS genetics, 10, e1004016.

Galaverni M, Caniglia R, Fabbri E, Lapalombella S, Randi E (2013) MHC variability in

an isolated wolf population in Italy. The Journal of heredity, 104, 601–612.

Godinho R, Llaneza L, Blanco JC et al. (2011) Genetic evidence for multiple events of

hybridization between wolves and domestic dogs in the Iberian Peninsula.

Molecular ecology, 20, 5154–5166.

Gomerčić T, Sindičić M, Galov A et al. (2010) High genetic variability of the grey wolf

(Canis lupus L.) population from Croatia as revealed by mitochondrial DNA

control region sequences. Zoological Studies, 49, 816–823.

Gottelli D, Sillero-Zubiri C, Applebaum GD et al. (1994) Molecular genetics of the

most endangered canid: the Ethiopian wolf Canis simensis. Molecular ecology, 3,

301–312.

Hamzić E (2011) Division of Livestock Sciences Levels of Inbreeding Derived from

Runs of Homozygosity : A Comparison of Austrian and Norwegian Cattle Breeds.

PhD thesis, University of Natural Resources and Life Sciences: Vienna.

Hindrikson M, Männil P, Ozolins J, Krzywinski A, Saarma U (2012) Bucking the trend in wolf-dog hybridization: first evidence from Europe of hybridization between female dogs and male wolves. PloS one, 7, e46465.

Keller L (2002) Inbreeding effects in wild populations. Trends in Ecology & Evolution,

17, 230–241.

Keller MC, Visscher PM, Goddard ME (2011) Quantification of inbreeding due to

distant ancestors and its detection using dense single nucleotide polymorphism

data. Genetics, 189, 237–249.

Leonard JA, Wayne RK (2008) Native Great Lakes wolves were not restored. Biology

letters, 4, 95–98.

Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler

transform. Bioinformatics (Oxford, England), 25, 1754–1760.

Li L, Ho S, Chen C et al. (2006) Long Contiguous Stretches of Homozygosity in the

Human Genome. Human mutation, 27, 1115–1121.

Page 36: Dog introgression patterns in a South European wolf populationbioinformatica.uab.cat/.../MThesis_Daniel_Gomez2015_4_18D19_48.… · Dog introgression patterns in a South European

33

Liberg O, Andrén H, Pedersen H-C et al. (2005) Severe inbreeding depression in a wild

wolf (Canis lupus) population. Biology letters, 1, 17–20.

Lindblad-Toh K, Wade CM, Mikkelsen TS et al. (2005) Genome sequence,

comparative analysis and haplotype structure of the domestic dog. Nature, 438,

803–819.

Llaneza L, Fernández A, Nores C (1996) Dieta del lobo en dos zonas de Asturias

(España) que difieren en carga ganadera. Doñana Acta Vertebrata, 23, 201–214.

Lucchini V, Galov A, Randi E (2004) Evidence of genetic distinction and long-term

population decline in wolves (Canis lupus) in the Italian Apennines. Molecular

Ecology, 13, 523–536.

McKenna A, Hanna M, Banks E et al. (2010) The Genome Analysis Toolkit: a

MapReduce framework for analyzing next-generation DNA sequencing data.

Genome research, 20, 1297–1303.

Mech LD (1970) The wolf: the ecology and behavior of an endangerted species

(Natural History Press, Ed,). Doubleday Publishing Co., N.Y.

Mech LD (1995) The challenge and opportunity of recovering wolf populations.

Conservation Biology, 9, 270–278.

Niskanen AK, Kennedy LJ, Ruokonen M et al. (2014) Balancing selection and

heterozygote advantage in major histocompatibility complex loci of the

bottlenecked Finnish wolf population. Molecular ecology, 23, 875–889.

Ouborg NJ (2010) Integrating population genetics and conservation biology in the era

of genomics. Biology letters, 6, 3–6.

Ouborg NJ, Pertoldi C, Loeschcke V, Bijlsma RK, Hedrick PW (2010) Conservation

genetics in transition to conservation genomics. Trends in genetics, 26, 177–187.

Ozolins J, Andersone Z (2001) Status of large carnivore conservation in the Baltic

States. Action plan for the conservation of wolf (Canis lupus) in Latvia. European

Commission: Strasbourg, T-PVS, 73, 1–32.

Padial JM, Contreras FJ, Pérez J, Ávila E, Barea JM (2000) Análisis de la situación y

problemática del lobo (Canis lupus signatus) en Sierra Morena Oriental (Sur de

España). Galemys, 12, 37–44.

Petrucci-Fonseca F (1982) Wolves and stray-feral dogs in Portugal. In: III International

Theriological Congress . Helsinky.

Pilot M, Branicki W, Jedrzejewski W et al. (2010) Phylogeographic history of grey

wolves in Europe. BMC evolutionary biology, 10, 104.

Page 37: Dog introgression patterns in a South European wolf populationbioinformatica.uab.cat/.../MThesis_Daniel_Gomez2015_4_18D19_48.… · Dog introgression patterns in a South European

34

Pilot M, Greco C, vonHoldt BM et al. (2014) Genome-wide signatures of population

bottlenecks and diversifying selection in European wolves. Heredity, 112, 428–

442.

Prado-Martinez J, Hernando-Herraez I, Lorente-Galdos B et al. (2013) The genome

sequencing of an albino Western lowland gorilla reveals inbreeding in the wild.

BMC genomics, 14, 363.

Price AL, Patterson NJ, Plenge RM et al. (2006) Principal components analysis corrects

for stratification in genome-wide association studies. Nature genetics, 38, 904–

909.

Pritchard JK, Stephens M, Donnelly P (2000) Inference of Population Structure Using

Multilocus Genotype Data. Genetics, 155, 945–959.

Purcell S, Neale B, Todd-brown K et al. (2007) PLINK : A Tool Set for Whole-Genome

Association and Population-Based Linkage Analyses. American Journal of Human

Genetics, 81, 559–575.

Räikkönen J, Bignert A, Mortensen P, Fernholm B (2006) Congenital defects in a

highly inbred wild wolf population (Canis lupus). Mammalian Biology - Zeitschrift

für Säugetierkunde, 71, 65–73.

Räikkönen J, Vucetich JA, Peterson RO, Nelson MP (2009) Congenital bone

deformities and the inbred wolves (Canis lupus) of Isle Royale. Biological

Conservation, 142, 1025–1031.

Räikkönen J, Vucetich J a, Vucetich LM, Peterson RO, Nelson MP (2013) What the

Inbred Scandinavian Wolf Population Tells Us about the Nature of Conservation.

PloS one, 8, e67218.

Ramírez O, Altet L, Enseñat C et al. (2006) Genetic assessment of the Iberian wolf

Canis lupus signatus captive breeding program. Conservation Genetics, 7, 861–

878.

Randi E (2008) Detecting hybridization between wild species and their domesticated

relatives. Molecular ecology, 17, 285–293.

Randi E (2011) Genetics and conservation of wolves Canis lupus in Europe. Mammal

Review, 41, 99–111.

Randi E, Hulva P, Fabbri E et al. (2014) Multilocus detection of wolf x dog

hybridization in italy, and guidelines for marker selection. PloS one, 9, e86409.

Randi E, Lucchini V (2002) Detecting rare introgression of domestic dog genes into

wild wolf (Canis lupus) populations by Bayesian admixture analyses of

microsatellite variation. Conservation Biology, 3, 31–45.

Page 38: Dog introgression patterns in a South European wolf populationbioinformatica.uab.cat/.../MThesis_Daniel_Gomez2015_4_18D19_48.… · Dog introgression patterns in a South European

35

Randi E, Lucchini V, Christensen MF et al. (2000) Mitochondrial DNA Variability in

Italian and East European Wolves: Detecting the Consequences of Small

Population Size and Hybridization. Conservation Biology, 14, 464–473.

Rhymer JM, Simberloff D (1996) Extinction by hybridization and introgression. Annual

Review of Ecology and Systematics, 27, 83–109.

Roy M, Geffen E, Smith D, Ostrander E, Wayne R (1994) Patterns of differentiation

and hybridization in North American wolflike canids, revealed by analysis of

microsatellite loci. Molecular biology and evolution, 11, 553–570.

Sastre N (2011) Genética de la conservación: el lobo gris (Canis lupus). PhD thesis,

Universidad Autónoma de Barcelona: Spain.

Sastre N, Vilà C, Salinas M et al. (2010) Signatures of demographic bottlenecks in

European wolf populations. Conservation Genetics, 12, 701–712.

Sidorovich VE, Tikhomirova LL, Jedrzejewska B (2003) Wolf Canis lupus numbers,

diet and damage to livestock in relation to hunting and ungulate abundance in

northeastern Belarus during 1990–2000. Wildlife Biol, 9, 103–111.

Silva JP, Toland J, Hudson T et al. (2013) LIFE and human coexistence with large

carnivores (The EU LIFE Programme - European Commision, Ed,). DG

Environment.

Sundqvist A (2008) Conservation Genetics of Wolves and their Relationship with Dogs.

PhD thesis, Uppsala University: Sweeden.

Tajima F (1983) Evolutionary relationship of DNA sequences in finite populations.

Genetics, 105, 437–460.

Tallmon DA, Luikart G, Waples RS (2004) The alluring simplicity and complex reality

of genetic rescue. Trends in ecology & evolution, 19, 489–96.

Thalmann O, Shapiro B, Cui P et al. (2013) Complete mitochondrial genomes of

ancient canids suggest a European origin of domestic dogs. Science, 342, 871–874.

Twyford AD, Ennos RA (2012) Next-generation hybridization and introgression.

Heredity, 108, 179–189.

Valverde JA (1971) El lobo español. Montes, 159, 228–241.

Verardi a, Lucchini V, Randi E (2006) Detecting introgressive hybridization between

free-ranging domestic dogs and wild wolves (Canis lupus) by admixture linkage

disequilibrium analysis. Molecular ecology, 15, 2845–2855.

Vilà C (1993) Aspectos morfológicos y ecológicos del lobo ibérico Canis lupus. PhD

thesis, Universidad de Barcelona: Spain.

Page 39: Dog introgression patterns in a South European wolf populationbioinformatica.uab.cat/.../MThesis_Daniel_Gomez2015_4_18D19_48.… · Dog introgression patterns in a South European

36

Vilà C (2010) Viabilidad de las poblaciones ibéricas de lobos. Enseñanzas de la

genética para la conservación. In: Los lobos de la Península Ibérica. Propuestas

para el diagnóstico de sus poblaciones. (eds Fernández-Gil A, Álvares F, Vilà C,

Ordiz A), pp. 157–171. ASCEL, Palencia, Spain.

Vilà C, Amorim IR, Leonard JA et al. (1999) Mitochondrial DNA phylogeography and

population history of the grey wolf Canis lupus. Molecular Ecology, 8, 2089–2103.

Vilà C, Sundqvist A-K, Flagstad Ø et al. (2003a) Rescue of a severely bottlenecked

wolf (Canis lupus) population by a single immigrant. Proceedings. Biological

sciences / The Royal Society, 270, 91–97.

Vilà C, Walker C, Sundqvist A-K et al. (2003b) Combined use of maternal, paternal

and bi-parental genetic markers for the identification of wolf-dog hybrids.

Heredity, 90, 17–24.

Vilà C, Wayne RK (1999) Hybridization between Wolves and Dogs. Conservation

Biology, 13, 195–198.

vonHoldt BM, Pollinger JP, Earl D a et al. (2011) A genome-wide perspective on the

evolutionary history of enigmatic wolf-like canids. Genome Research, 21, 1294–

1305.

vonHoldt BM, Pollinger JP, Lohmueller KE et al. (2010) Genome-wide SNP and

haplotype analyses reveal a rich history underlying dog domestication. Nature,

464, 898–902.

Vos J (2000) Food habits and livestock depredation of two Iberian wolf packs (Canis

lupus signatus) in the north of Portugal. Journal of Zoology, 251, 457–462.

Wang G, Zhai W, Yang H et al. (2013) The genomics of selection in dogs and the

parallel evolution between dogs and humans. Nature communications, 4, 1860.

Wayne RK, Van Valkenburgh B, Kat PW et al. (1989) Genetic and Morphological

Divergence among Sympatric Canids. J. Hered., 80, 447–454.

Wright S (1977) Evolution and the genetics of populations. Vol. 3. Evolution and the

genetics of populations. Univ. of Chicago Press, Chicago, IL.

Wright LI, Tregenza T, Hosken DJ (2007) Inbreeding, inbreeding depression and

extinction. Conservation Genetics, 9, 833–843.

Page 40: Dog introgression patterns in a South European wolf populationbioinformatica.uab.cat/.../MThesis_Daniel_Gomez2015_4_18D19_48.… · Dog introgression patterns in a South European

37

APPENDIX 1

Bioinformatics’ discussion

In this appendix, I present the pipeline diagrams for the analyses and discuss specific bioinformatics issues.

Pipelines

Diagrams for the three main analyses done: mapping and variant calling,

diversity analysis and hybridization analysis. The symbol meanings in the

pipeline are the following:

Mapping and variant calling


Diversity analysis

Hybridization analysis


Bioinformatics’ discussion

SNP validation

Because the dataset is small, I validated the calls by hard filtering, following the GATK Best Practices recommendations [1]. The variants that pass the filters have:

- Quality by depth higher than 2.
- Root Mean Square (RMS) of the mapping quality higher than 40.
- Phred-scaled p-value using Fisher’s Exact Test to detect strand bias lower than 60.
- Consistency of the site with two segregating haplotypes (haplotype score) lower than 13.
- u-based z-approximation from the Mann-Whitney Rank Sum Test for mapping qualities higher than -12.5.
- u-based z-approximation from the Mann-Whitney Rank Sum Test for the distance from the end of the read for reads with the alternate allele higher than -8.
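The pass criteria listed above can be sketched as a small Python predicate over the INFO column of a VCF record. This is only an illustration, not the GATK implementation or the script used in this work. It assumes the usual GATK annotation keys (QD, MQ, FS, HaplotypeScore, MQRankSum, ReadPosRankSum), takes the two rank-sum cutoffs as the negative z-scores (-12.5 and -8.0) of the generic GATK hard-filtering recommendations, and lets a variant that lacks an annotation pass that particular test. The helper names are hypothetical.

```python
# Conditions a variant must satisfy to pass, keyed by GATK INFO annotation.
# Assumption: the rank-sum cutoffs are negative z-scores (-12.5 and -8.0).
PASS_RULES = {
    "QD": lambda v: v > 2.0,               # quality by depth
    "MQ": lambda v: v > 40.0,              # RMS mapping quality
    "FS": lambda v: v < 60.0,              # Fisher strand bias (Phred-scaled)
    "HaplotypeScore": lambda v: v < 13.0,  # consistency with two haplotypes
    "MQRankSum": lambda v: v > -12.5,      # mapping-quality rank sum test
    "ReadPosRankSum": lambda v: v > -8.0,  # read-position rank sum test
}


def parse_info(info_field):
    """Turn a VCF INFO string such as "QD=12.3;MQ=59.8;DB" into a dict of floats."""
    values = {}
    for entry in info_field.split(";"):
        key, sep, raw = entry.partition("=")
        if not sep:
            continue  # flag without a value, e.g. "DB"
        try:
            values[key] = float(raw)
        except ValueError:
            pass  # non-numeric or multi-valued annotation: ignored here
    return values


def passes_hard_filter(info_field):
    """Return True when no annotation that is present violates its threshold."""
    values = parse_info(info_field)
    return all(rule(values[key]) for key, rule in PASS_RULES.items() if key in values)
```

For example, `passes_hard_filter("QD=1.5;MQ=59.8")` is rejected on quality by depth alone, whatever the other annotations are.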

Format conversion

In the analysis of the dataset, I used several published programs that work with different file formats. Although some open programs and scripts deal with this problem (for example, VCFtools [2]), I wrote some scripts to better know the characteristics of the final dataset (https://github.com/magicDGS/bioConvert). For instance, a VCF to TPED/TMAP converter written in Python (vcf2tplink.py) removes SNPs that are non-biallelic and/or lack GT information.
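To illustrate the kind of filtering such a converter performs, the following Python sketch turns one VCF data line into one line of PLINK's transposed (TPED) format. It is not the vcf2tplink.py script from the repository. The function name is hypothetical, and it rests on these assumptions: sites are kept only when REF and ALT are single bases, sites without a GT key in the FORMAT column are skipped, and per-sample missing calls are written with PLINK's missing-allele code 0.

```python
def vcf_line_to_tped(line):
    """Convert one tab-separated VCF data line to a TPED line, or None to skip it."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, snp_id, ref, alt = fields[0], fields[1], fields[2], fields[3], fields[4]
    fmt = fields[8]

    # Keep only biallelic SNPs: one reference base and one alternate base.
    if len(ref) != 1 or len(alt) != 1 or ref not in "ACGT" or alt not in "ACGT":
        return None

    # Skip sites that carry no genotype (GT) information at all.
    keys = fmt.split(":")
    if "GT" not in keys:
        return None
    gt_index = keys.index("GT")

    alleles = {"0": ref, "1": alt}  # anything else is written as missing ("0")
    calls = []
    for sample in fields[9:]:
        parts = sample.split(":")[gt_index].replace("|", "/").split("/")
        if len(parts) != 2:
            parts = [".", "."]  # non-diploid call treated as missing
        calls.append(" ".join(alleles.get(a, "0") for a in parts))

    if snp_id == ".":
        snp_id = f"{chrom}:{pos}"  # build an identifier when the VCF has none

    # TPED columns: chromosome, SNP id, genetic distance, position, allele pairs.
    return " ".join([chrom, snp_id, "0", pos] + calls)
```

The genetic distance column is filled with 0 because a VCF file carries no genetic map.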

Analytical scripts

The custom Perl and Python scripts used in the analyses of this work are not available in any repository because they are still being optimized. In addition, the R analyses were done using the plyr [3] and reshape2 [4] packages to manage the data and ggplot2 [5] to visualize the results. Because I explored the data interactively through the command-line interface, no R scripts were written.


References

1. Van der Auwera, G. A. et al. in Curr. Protoc. Bioinforma. (Bateman, A., Pearson, W. R., Stein, L. D., Stormo, G. D. & Yates, J. R.) 11.10.1–11.10.33 (2013).

2. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–8 (2011).

3. Wickham, H. The Split-Apply-Combine Strategy for Data Analysis. J. Stat. Softw. 40, 1–29 (2011).

4. Wickham, H. Reshaping Data with the reshape Package. J. Stat. Softw. 21, 1–20 (2007).

5. Wickham, H. ggplot2: elegant graphics for data analysis. (Springer New York, 2009).


APPENDIX 2

Results for non-Iberian samples

Table A2.1 shows the individual results and Table A2.2 the means for each population. Means are computed for the dog population and for the Eurasian (excluding Wolf Italy), North American and South American wolf populations.

Table A2.1. Individual results

Sample | Species/Region | Population | Cov | Het | FROH
Wolf Croatia | Eurasian Wolf | Central/Eastern Europe | 6.98 | 0.00147319 | 0.09
Wolf China | Eurasian Wolf | Middle Eastern Europe/Asia | 26.36 | 0.00148438 | 0.23
Wolf India | Eurasian Wolf | Middle Eastern Europe/Asia | 24.90 | 0.00181422 | 0.01
Wolf Iran | Eurasian Wolf | Middle Eastern Europe/Asia | 26.27 | 0.00178093 | 0.03
Wolf Israel | Eurasian Wolf | Middle Eastern Europe/Asia | 6.01 | 0.00150744 | 0.05
Wolf Italy | Eurasian Wolf | Italy | 5.81 | 0.00032140 | 0.51
Airedale Terrier | Dog | Modern breed | 7.33 | 0.00064358 | 0.44
Basenji | Dog | Modern breed | 1.35 | 0.00068557 | 0.34
Boxer | Dog | Modern breed | 29.33 | 0.00066418 | 0.41
Chinese Crested | Dog | Modern breed | 19.17 | 0.00076284 | 0.41
Chinook | Dog | Modern breed | 7.84 | 0.00080670 | 0.39
English Cocker Spaniel | Dog | Modern breed | 9.66 | 0.00104400 | 0.25
Kerry Blue Terrier | Dog | Modern breed | 15.83 | 0.00068793 | 0.44
Labrador Retriever | Dog | Modern breed | 10.80 | 0.00110546 | 0.20
Miniature Schnauzer | Dog | Modern breed | 5.47 | 0.00076737 | 0.32
Soft Coated Wheaten Terrier | Dog | Modern breed | 17.18 | 0.00070319 | 0.41
Standard Poodle | Dog | Modern breed | 12.63 | 0.00101575 | 0.28
Wolf Great Lakes | American Wolf | North America | 24.34 | 0.00183124 | 0.08
Wolf Yellowstone A | American Wolf | North America | 25.73 | 0.00154630 | 0.18
Wolf Yellowstone B | American Wolf | North America | 24.07 | 0.00158641 | 0.13
Wolf Yellowstone C | American Wolf | North America | 5.41 | 0.00148466 | 0.09
Wolf Mexico A | American Wolf | South America | 23.59 | 0.00003753 | 0.70
Wolf Mexico B | American Wolf | South America | 5.23 | 0.00012047 | 0.70

Cov: autosomal coverage; Het: heterozygosity (het/bp); FROH: inbreeding coefficient


Table A2.2. Population results

Population | N | Mean Het | Mean FROH
Eurasian Wolf (excluding Wolf Italy) | 5 | 0.00161203 | 0.08
Dogs | 11 | 0.00080787 | 0.35
North American Wolf | 4 | 0.00161215 | 0.12
South American Wolf | 2 | 0.00007900 | 0.70

Het: mean heterozygosity (het/bp); FROH: inbreeding coefficient
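As a check, the population means can be recomputed from the individual values in Table A2.1. The Python sketch below does this with the standard library only. The group labels and the function name are chosen here for illustration, and Wolf Italy is left out, as stated in the text.

```python
from collections import defaultdict
from statistics import mean

# (population, heterozygosity in het/bp, FROH), copied from Table A2.1.
# Wolf Italy is excluded from the Eurasian group.
SAMPLES = [
    ("Eurasian wolf", 0.00147319, 0.09), ("Eurasian wolf", 0.00148438, 0.23),
    ("Eurasian wolf", 0.00181422, 0.01), ("Eurasian wolf", 0.00178093, 0.03),
    ("Eurasian wolf", 0.00150744, 0.05),
    ("Dogs", 0.00064358, 0.44), ("Dogs", 0.00068557, 0.34),
    ("Dogs", 0.00066418, 0.41), ("Dogs", 0.00076284, 0.41),
    ("Dogs", 0.00080670, 0.39), ("Dogs", 0.00104400, 0.25),
    ("Dogs", 0.00068793, 0.44), ("Dogs", 0.00110546, 0.20),
    ("Dogs", 0.00076737, 0.32), ("Dogs", 0.00070319, 0.41),
    ("Dogs", 0.00101575, 0.28),
    ("North American wolf", 0.00183124, 0.08), ("North American wolf", 0.00154630, 0.18),
    ("North American wolf", 0.00158641, 0.13), ("North American wolf", 0.00148466, 0.09),
    ("South American wolf", 0.00003753, 0.70), ("South American wolf", 0.00012047, 0.70),
]


def population_means(samples):
    """Return {population: (n, mean heterozygosity, mean FROH)}."""
    groups = defaultdict(list)
    for population, het, froh in samples:
        groups[population].append((het, froh))
    return {
        population: (
            len(values),
            mean(het for het, _ in values),
            mean(froh for _, froh in values),
        )
        for population, values in groups.items()
    }


if __name__ == "__main__":
    for population, (n, het, froh) in population_means(SAMPLES).items():
        print(f"{population}\t{n}\t{het:.8f}\t{froh:.2f}")
```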


APPENDIX 3

Heterozygosity by chromosome

Heterozygosity in each sample, computed in 1 Mb windows overlapping by 200 kb. Red dots mark windows with fewer than 0.0005 heterozygous sites per base pair (inbred windows).
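The window scan behind these plots can be sketched in Python as follows. This is a minimal illustration, not the script used in this work. It assumes that "200 kb overlapping" means consecutive windows share 200 kb (a step of 800 kb), that every base in a window is callable, and that heterozygous sites are given as 0-based positions. The function name and parameters are hypothetical.

```python
from bisect import bisect_left


def window_heterozygosity(het_positions, chrom_length,
                          window=1_000_000, overlap=200_000, cutoff=0.0005):
    """Return (start, end, het_per_bp, is_inbred) for each window of a chromosome.

    Windows are half-open intervals [start, end). The last windows may be
    shorter than `window` when they reach the end of the chromosome.
    """
    step = window - overlap  # assumption: windows share `overlap` bases
    positions = sorted(het_positions)
    results = []
    for start in range(0, chrom_length, step):
        end = min(start + window, chrom_length)
        # Count heterozygous sites with start <= position < end.
        n_het = bisect_left(positions, end) - bisect_left(positions, start)
        het = n_het / (end - start)
        results.append((start, end, het, het < cutoff))
    return results
```

With real data the denominator would normally be the number of callable bases in the window rather than its full length, which matters for low-coverage samples.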


APPENDIX 4

Heterozygosity distribution for non-Iberian samples

Density and box plots of heterozygosity in dogs, Eurasian wolves and American wolves, using 1 Mb windows overlapping by 200 kb. Dotted lines mark the cutoff used to define inbred windows.

Dogs


Eurasian wolves


American wolves


APPENDIX 5

Principal components’ boxplots and PCA with component 4

In the 48K-merged dataset, PC1 shows the differentiation between wolves and dogs, whereas PC2 represents the geographical variation of wolves. In the dataset from this work, PC1 also shows the differentiation between wolves and dogs, whereas PC2 clusters American wolves together. The geographical differentiation of Eurasian wolves is explained by PC4. The PCA plots below (using PC1 and PC4) show the samples from this work with and without the dog blocks from Sierra Morena and Wolf Spain.


48K-merged dataset


Dataset from this work


PCA with dog blocks

PCA without dog blocks


APPENDIX 6

Cross-validation error of the ADMIXTURE analysis

Cross-validation mean and standard deviation for the 5-run ADMIXTURE analysis of

the present-work’s dataset and the 3-run 48K-merged dataset. Note that the variation in

the cross-validation error is larger in the present-work’s dataset.
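The mean and standard deviation across runs can be obtained with a few lines of Python, sketched here under stated assumptions: each run was made with cross-validation enabled, its log contains lines of the form "CV error (K=3): 0.52305", and the log texts have already been read into strings. The function name is hypothetical and this is not the script used in this work.

```python
import re
from collections import defaultdict
from statistics import mean, stdev

# Assumed format of the cross-validation line in an ADMIXTURE log.
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def summarize_cv(log_texts):
    """Return {K: (mean CV error, standard deviation)} across several runs."""
    errors = defaultdict(list)
    for text in log_texts:
        for k, err in CV_LINE.findall(text):
            errors[int(k)].append(float(err))
    return {
        k: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for k, values in sorted(errors.items())
    }
```

The value of K with the lowest mean error is usually preferred, and the standard deviation shows how stable that choice is across runs.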


APPENDIX 7

Linear model details of heterozygosity-percentage block analysis

Summary statistics (Table A7.1) for each linear model shown in Figure 10. The residual, Q-Q and leverage plots follow.

Table A7.1. Summary statistics table.

Sample | Haplotype | Slope | p-value | Adj. R²
Sierra Morena | Dog/Dog | -112.40 | 0.0657 | 0.0657
Sierra Morena | Wolf/Dog | 332.32 | 0.0000 | 0.7331
Sierra Morena | Wolf/Wolf | -219.92 | 0.0049 | 0.1774
Wolf Spain | Dog/Dog | 16.21 | 0.0252 | 0.1074
Wolf Spain | Wolf/Dog | 443.56 | 0.0002 | 0.3029
Wolf Spain | Wolf/Wolf | -459.77 | 0.0002 | 0.2968
Wolf Portugal | Dog/Dog | -17.37 | 0.0000 | 0.4639
Wolf Portugal | Wolf/Dog | 29.95 | 0.0000 | 0.3534
Wolf Portugal | Wolf/Wolf | -12.58 | 0.0942 | 0.0502
Wolf EEP | Dog/Dog | -7.19 | 0.2841 | 0.0049
Wolf EEP | Wolf/Dog | 18.03 | 0.1870 | 0.0213
Wolf EEP | Wolf/Wolf | -10.84 | 0.5100 | -0.0153

Haplotype: haplotype class in both chromosomes; Slope: estimation for the slope (heterozygosity in het/bp); Adj. R²: adjusted R² for the complete model
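For reference, the slope and adjusted R² reported for a one-predictor linear model can be computed as in the Python sketch below. It assumes, for illustration only, that heterozygosity is the predictor x and the block percentage is the response y. It does not compute the p-value, which needs the t distribution, and it is not the R code used for the table.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r2, adjusted_r2). Needs at least three points
    and some variation in both x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares of the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - ss_res / syy

    # Adjustment for one predictor: penalizes the fit by the degrees of freedom.
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return slope, intercept, r2, adjusted_r2
```

Because of the degrees-of-freedom penalty, the adjusted R² falls below zero when the predictor explains almost none of the variance, which is how a negative value can appear in a summary table.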

Dog/Dog haplotypes Sierra Morena


Wolf/Dog haplotypes Sierra Morena

Wolf/Wolf haplotypes Sierra Morena


Dog/Dog haplotypes Wolf Spain

Wolf/Dog haplotypes Wolf Spain


Wolf/Wolf haplotypes Wolf Spain

Dog/Dog haplotypes Wolf Portugal


Wolf/Dog haplotypes Wolf Portugal

Wolf/Wolf haplotypes Wolf Portugal


Dog/Dog haplotypes Wolf EEP

Wolf/Dog haplotypes Wolf EEP


Wolf/Wolf haplotypes Wolf EEP