Analysis of short tandem repeat allele frequencies in the United Arab Emirates population Jones, R. J. (2016). Analysis of short tandem repeat allele frequencies in the United Arab Emirates population DOI: 10.4225/23/593f5758257be DOI: 10.4225/23/593f5758257be Link to publication in the UWA Research Repository Rights statement This work is protected by Copyright. You may print or download ONE copy of this document for the purpose of your own non-commercial research or study. Any other use requires permission from the copyright owner. The Copyright Act requires you to attribute any copyright works you quote or paraphrase. General rights Copyright owners retain the copyright for their material stored in the UWA Research Repository. The University grants no end-user rights beyond those which are provided by the Australian Copyright Act 1968. Users may make use of the material in the Repository providing due attribution is given and the use is in accordance with the Copyright Act 1968. Take down policy If you believe this document infringes copyright, raise a complaint by contacting [email protected]. The document will be immediately withdrawn from public access while the complaint is being investigated. Download date: 22. Jun. 2018


Analysis of Short Tandem Repeat Allele Frequencies in the United Arab Emirates population

By Rebecca Jayne Jones (21524658)

Bachelor of Forensic Investigation (ECU), Graduate Diploma in Forensic Science (UWA)

This thesis is presented for the degree of MASTER OF FORENSIC SCIENCE - RESEARCH at The University of Western Australia

School of Anatomy, Physiology and Human Biology

2016

Thesis Declaration

I, Rebecca Jones, certify that:

This thesis has been substantially accomplished during enrolment in the degree.

This thesis does not contain material that has been accepted for the award of any other

degree or diploma in my name, in any university or other tertiary institution.

No part of this work will, in the future, be used in a submission in my name, for any

other degree or diploma in any university or other tertiary institute without the prior

approval of the University of Western Australia and where applicable, any partner

institution responsible for the joint-award of this degree.

This thesis does not contain any material previously published or written by another

person, except where due reference has been made in the text.

The work(s) are not in any way a violation or infringement of any copyright,

trademark, patent, or other rights whatsoever of any person. The research involving

human data reported in this thesis was assessed and approved by The University of

Western Australia Human Research Ethics Committee (RA/4/1/7778).

This thesis contains published work and/or work prepared for publication, some of

which has been co-authored.

Signature,

Rebecca Jones

03-11-2016

Abstract

Population-specific genetic research continues to progress, with modern advances in technology expanding its applications in forensic identification, paternity testing and the mapping of disease susceptibility genes. Middle Eastern populations have been poorly studied, yet the region is significant as a destination and route in early human migration out of Africa. The objective of the research described in this thesis is to add to the existing knowledge of the region by describing the genetic diversity that exists between ethnic groups of the United Arab Emirates (UAE).

The UAE was chosen for study because it is situated in the Arabian Peninsula, at the crossroads of human migration out of the Horn of Africa as well as across the land bridge that is now Egypt into Asia and Europe. The aims of the research were to characterise allele frequencies and calculate forensic parameters for the UAE populations, and to improve our understanding of the genetic relationships between different populations from the Middle East, North Africa and South Asia. To address these aims, analyses of variable autosomal Short Tandem Repeats (STRs) and Y-chromosome STRs (Y-STRs) were performed. A total of 996 UAE individuals were analysed at 21 autosomal STR loci with the GlobalFiler® PCR Amplification Kit (Life Technologies). Allele frequencies and statistical parameters were calculated, and the results highlighted the genetic diversity of the UAE population. The combined power of discrimination, combined power of exclusion and combined match probability were 0.999999999, 0.999999996 and 4.38 × 10⁻²⁷, respectively. The autosomal STR allele frequencies of the UAE population were then compared locus by locus with published autosomal data from the surrounding regions of North Africa, the Middle East and South Asia. The UAE population showed close genetic relationships with other Middle Eastern populations and with South Asian populations. Increasing the number of loci from six to 15 was found to improve accuracy and to better delineate the observed genetic relationships.
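As a hedged illustration of how forensic parameters of this kind are commonly derived, the Python sketch below computes allele frequencies at one locus, a per-locus match probability (taken here as the sum of squared observed genotype frequencies), and combined values across loci under the assumption that the loci are independent. The genotypes, the two-locus panel and all function names are hypothetical; they are not data or software from this thesis, and the exact formulas used in the thesis are not stated in this section.

```python
from collections import Counter
from math import prod


def allele_frequencies(genotypes):
    """Relative frequency of each allele among the 2N alleles at one locus."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def match_probability(genotypes):
    """Per-locus match probability: sum of squared observed genotype frequencies."""
    # Sort each genotype so that (15, 16) and (16, 15) count as the same genotype.
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    n = len(genotypes)
    return sum((c / n) ** 2 for c in counts.values())


def combined_parameters(loci):
    """Combined match probability and power of discrimination.

    Assumes independence between loci, so per-locus values are multiplied.
    """
    pm_per_locus = [match_probability(g) for g in loci.values()]
    combined_pm = prod(pm_per_locus)
    combined_pd = 1 - combined_pm  # equals 1 - prod(1 - PD_i) with PD_i = 1 - PM_i
    return combined_pm, combined_pd


# Hypothetical toy genotypes (repeat counts) for four individuals at two loci.
toy_loci = {
    "D3S1358": [(15, 16), (15, 15), (16, 17), (15, 17)],
    "TH01": [(6, 9.3), (6, 6), (9.3, 9.3), (6, 9.3)],
}

toy_pm, toy_pd = combined_parameters(toy_loci)
print(allele_frequencies(toy_loci["D3S1358"]))
print(toy_pm, toy_pd)
```

With only two toy loci the combined match probability remains large; multiplying per-locus values across 21 loci is what drives a combined match probability down to very small magnitudes such as the one reported above.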

The typing of Y-STR haplotypes in the UAE population was subsequently carried out with the Y-Filer PLUS Amplification Kit (Life Technologies). Twenty-seven Y-STR loci were analysed in 217 UAE individuals. Statistical parameters were calculated using haplotype frequencies. Haplotype data were compared with populations in the public Y-chromosome Haplotype Reference Database (YHRD; www.yhrd.org) using multidimensional scaling (MDS) and analysis of molecular variance (AMOVA). This study is the first to highlight the use of data from 27 Y-STR loci in a UAE population. A high degree of genetic diversity was observed in the UAE population based on typing of the 27 Y-STR loci, with a Discrimination Capacity of 95.40%. However, the number of populations with haplotypes comprising 27 Y-STR loci was limited, and as such only haplotypes based on 17 Y-STR loci were available for comparison from other populations in the regions immediately surrounding the UAE. Reducing the number of Y-STR loci used for haplotype construction decreased the discrimination capacity to 83.40%. Based on the Y-haplotype distribution, the UAE population clustered with other Middle Eastern populations. South Asian populations clustered closer to the UAE than the North African populations did. Some population genetic relationships differed between the autosomal STR and Y-STR analyses.
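The effect of the number of Y-STR loci on discrimination capacity can be sketched as follows. This is a minimal illustration that assumes the common definition of discrimination capacity (number of distinct haplotypes divided by the number of individuals typed) and Nei's unbiased estimator of haplotype diversity; the haplotypes and function names are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def discrimination_capacity(haplotypes):
    """Number of distinct haplotypes divided by the number of individuals typed."""
    return len(set(haplotypes)) / len(haplotypes)


def haplotype_diversity(haplotypes):
    """Nei's unbiased estimator: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def truncate(haplotypes, n_loci):
    """Keep only the first n_loci markers of each haplotype."""
    return [h[:n_loci] for h in haplotypes]


# Hypothetical haplotypes: repeat counts at four Y-STR loci in five individuals.
toy_haplotypes = [
    (14, 13, 30, 24),
    (14, 13, 30, 25),
    (14, 13, 29, 24),
    (15, 12, 30, 24),
    (15, 12, 30, 24),
]

print(discrimination_capacity(toy_haplotypes))               # four loci
print(discrimination_capacity(truncate(toy_haplotypes, 2)))  # two loci
print(haplotype_diversity(toy_haplotypes))
```

Haplotypes that differ only at the dropped markers collapse into a single haplotype once those markers are removed, which is why a smaller marker set can only keep or lower the discrimination capacity.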

The analyses described in this thesis provide insight into the relationship between the Arab population of the UAE and others in the region surrounding the Middle East, which partly explains human migration, historical events, trade and socio-cultural relationships. The present research also establishes the importance of ongoing research into Middle Eastern populations such as the UAE and the utility of increasing sample size and number of sampled STR loci in population-based genetic applications.

Table of Contents

Acknowledgements
Statement of Candidate Contribution

Chapter One
1. General Introduction
1.1 Introduction
1.2 Aims and Hypothesis
1.3 Thesis Structure
1.4 References

Chapter Two
2. Literature Review: Genetic Analysis of People of the Middle East and the Use of Short Tandem Repeats for Population Genetic Analyses
2.1 Introduction
2.2 Human Migration
2.3 Impact of Bidirectional Migration through the Middle East
2.4 Historical and Trade Migration in the Middle East Region
2.5 Geographical Differentiation Determined by Different Genetic Components
2.6 Distinguishing Closely Related Groups using Autosomal STRs
2.7 Middle East Region and Worldwide Research
2.8 Conclusion

2.9 References

Chapter Three
3. Population Genetics Data for 21 Autosomal STR loci for United Arab Emirates (UAE) Population using a Next Generation Multiplex STR Kit
3.1 Introduction
3.2 Materials and Methods
3.2.1 Sample Description
3.2.2 DNA Extraction
3.2.3 PCR Multiplex Amplification
3.2.4 STR Typing
3.2.5 Statistical Analysis
3.3 Results and Discussion
3.4 Conclusion
3.5 References

Chapter Four
4. Allele Frequencies of Short Tandem Repeat markers used for Forensic Applications in the Arab Population of the United Arab Emirates
4.1 Introduction
4.2 Materials and Methods
4.2.1 Sample Description
4.2.2 DNA Extraction

4.2.3 PCR Multiplex Amplifications
4.2.4 STR Typing
4.2.5 Statistical Analysis
4.3 Results and Discussion
4.4 Conclusion
4.5 References

Chapter Five
5. A Comparative Analysis of Autosomal Short Tandem Repeat (STR) Allele Frequencies of Populations in the United Arab Emirates and Surrounding Regions
5.1 Introduction
5.2 Methods
5.3 Results
5.3.1 Within-population Genetic Variability Measures
5.3.2 Regional population Genetic Comparisons of the Middle East
5.3.3 Inter-population Genetic Comparison
5.3.4 Effect of Increased Number of STR Markers
5.4 Discussion
5.4.1 Significance of Inter-population Genetic Comparisons
5.4.2 Heterozygosity Analysis
5.4.3 Inter-population Genetic Comparison using Six Autosomal STR Loci

5.4.3.1 Regional Genetic Comparisons with the Present UAE Study
5.4.3.2 Broader Population Genetic Comparison with the Present UAE Study
5.4.4 Significance of Increasing Number of Autosomal STR Loci in Analyses
5.5 Conclusion
5.6 References

Chapter Six
6. Y-Chromosome STR haplotypes can be used to differentiate lineages in the United Arab Emirates Population
6.1 Introduction
6.2 Materials and Methods
6.2.1 Study Population
6.2.2 Genotyping
6.2.3 Statistical Analysis
6.3 Results and Discussion
6.4 Conclusion
6.5 References

Chapter Seven
7. General Discussion and Conclusion
7.1 Discussion

7.2 Conclusion
7.3 References

Bibliography

Appendices
Appendix 1
Appendix 2
Appendix 3
Appendix 4

Acknowledgements

I acknowledge the support and assistance from Khalifa University Biotechnology

Center for providing de-identified DNA samples from the Emirates Family Registry

for the following studies. I further acknowledge the travel grant from the Graduate

Research School at the University of Western Australia, which facilitated the

collaborative link with colleagues at Khalifa University. I would like to also

acknowledge the Abu Dhabi Police General Head Quarter for sponsoring the studies

of collaborators Osamah Ali Alhmoudi and Wafa Al Tayyare.

I would like to thank my supervisor Dr Guan Tay for providing me with great

opportunities, dedication and support towards the completion of the thesis.

Furthermore, I would like to thank Dr Habiba Alsafar for her efforts whilst travelling

to Khalifa University, providing a wonderful experience. I would also like to thank my supervisors Dr Silvana Gaudieri and Dr Daniel Franklin, and the School of Anatomy, Physiology and Human Biology at the University of Western Australia, for their ongoing assistance.

I would like to dedicate this thesis to my parents Kylie and Brett Jones, my partner

Anthony Carameli and my family for their ongoing support throughout my many years

of study. I would not have achieved this without their positive influences impacting

my determination in the completion of this thesis.

Statement of Candidate Contribution

In accordance with the University of Western Australia’s regulations regarding

Research Higher Degrees, this thesis includes published and formatted journal papers.

The contributions of the candidate and co-author(s) to the relevant chapters are set out below:

Chapter Three: Population genetics data for 21 autosomal STR loci for the United

Arab Emirates (UAE) population using a next generation multiplex STR kit.

The manuscript presented is first authored by Osamah Ali Alhmoudi and published

with co-authors. The publication details are as follows:

Ali Alhmoudi O, Jones R, Tay G, Alsafar H, Sibte H. Population genetics data

for 21 autosomal STR loci for United Arab Emirates (UAE) population using

next generation multiplex STR kit. Forensic Science International: Genetics.

2015;19:190-1.

The candidate completed a thorough interpretation of the autosomal STR data under

the supervision of Drs Tay and Alsafar. The laboratory processes were carried out by

first author Osamah Ali Alhmoudi under the supervision of Drs Sibte and Alsafar.

The project was designed by Mr Alhmoudi in collaboration with the co-authors of the

paper. Mr Alhmoudi prepared the first draft of the paper and the candidate contributed

substantially to the editing and proof-reading processes of the manuscript. The

candidate was responsible for formatting and responding to comments from the editors

of the Journal.

Chapter Four: Allele frequencies of Short Tandem Repeat markers used for forensic

applications in the Arab population of the United Arab Emirates.

The manuscript has been accepted and is in press in the journal Forensic Science International: Genetics. The details of the submission are:

Jones R, Al Tayyare W, Tay G, Alsafar H, Goodwin W. Description of Short

Tandem Repeat markers used for Forensic applications in the Arab population

of the United Arab Emirates. Forensic Science International: Genetics (In Press).

The candidate developed the project and outlined the justification for the study under the supervision of Drs Tay and Alsafar. The laboratory processes were carried out by Wafa Al Tayyare under the supervision of Drs Alsafar and Goodwin, with technical assistance from the candidate. The writing, planning and preparation of the manuscript, statistical calculations and formatting for the journal were carried out by the candidate, with guidance from Drs Tay, Alsafar and Goodwin.

Chapter Six: Y-Chromosome STR haplotypes can be used to differentiate lineages in

the United Arab Emirates population.

The manuscript has been presented in the appropriate format and submitted for

publication to the Annals of Human Biology. The details of the manuscript are:

Jones R, Tay G, Mawart A, Alsafar H. Y-Chromosome haplotypes can be used

to differentiate lineages in the population of the United Arab Emirates. Annals

of Human Biology (submitted).

The candidate developed the project outline and objectives of the work described in this manuscript. The laboratory tasks were carried out by Dr Mawart, with technical assistance from the candidate, under the supervision of Dr Alsafar. The writing, statistical calculations and formatting of the manuscript were carried out by the candidate under the guidance of Dr Tay.

Student signature:

Date: 03-11-2016

I, Dr Daniel Franklin certify that the student statements regarding their contribution to

each of the works listed above are correct.

Coordinating Supervisor signature:

Date:

Chapter 1

GENERAL INTRODUCTION

1.1 Introduction

Increasing and improving research into population-specific genetic data allows for the

establishment of quality standards within the field of Forensic Science (1). Short

Tandem Repeats (STRs) are commonly used in the generation of genetic data (1-4)

allowing the development of databases to advance knowledge in worldwide patterns

of genetic diversity (5). Importantly, the variability of autosomal STRs allows closely situated and related groups to be distinguished. Additionally, analyses of Y-chromosome STRs (Y-STRs) and mitochondrial DNA (mtDNA) establish the paternal and maternal components, respectively, of the genetic history and geographic dispersal of populations.

The importance of genetic research on the populations in the Middle East is

increasingly being highlighted in the literature (6,7) with a number of recent

population-specific genetic studies (8-10). The United Arab Emirates (UAE) is

situated within the Arabian Peninsula, located within the ancient human migration

route out from the Horn of Africa and into South Asia (11). In contrast to other genetic

studies on Middle East populations within the literature, there is a paucity of data on

UAE populations and minimal genetic comparisons to surrounding populations (2, 6,

12). Given the importance of genetic research in applications such as human identity,

paternity testing and in identifying disease susceptibility genes there is a need to

expand genetic research into this important region within the Middle East.

1.2 Aims and Hypothesis

The overall objective of this thesis was to improve the understanding of human genetic

diversity within the Middle East. The specific aims of this project are listed below:

1. Characterise allele frequencies and calculate forensic parameters for UAE populations.

2. Examine genetic relationships and distances between different populations from the Middle East, North Africa and South Asia.

3. Establish findings provided from two different STR analyses (autosomal and Y-chromosome).

4. Recognize factors such as human migration that impact population genetic diversity.

The hypothesis of this research is that autosomal STR and Y-STR analyses of UAE

populations will provide a better understanding of the region. This study will advance

the existing literature on the topic (2, 8, 13, 14) by increasing sample size, number of

loci tested and comparing data from the UAE population to other populations in the

Middle East and surrounding regions relevant to known human migration patterns.

1.3 Thesis Structure

Each chapter within the thesis presents a new study that extends the results established in the preceding chapters and builds on knowledge reported in the literature. Research collaborations were initially established through the first study of autosomal STR data, from which the UAE allele frequencies and associated statistics were calculated (Chapter 3). To maintain high standards in genetic research, the autosomal STR report on UAE allele frequencies was then validated and replicated with additional samples, as described in the subsequent chapter (Chapter 4). This study improved the confidence level of the results and increased the population sample size available for addition to genetic databases. The autosomal STR data of the UAE population were then compared with populations from the Middle East, North Africa and South Asia (Chapter 5). The combination of populations used for comparison within this thesis is novel, as previous publications have omitted important populations from their comparisons (2, 6, 12). To further enhance knowledge in population genetics, Y-STR analyses using up-to-date technologies that type an increased number of STR loci were carried out (Chapter 6). A meta-analysis of the UAE population using Y-STR haplogroups was then carried out and compared with the findings from the autosomal STR analyses, to understand how factors such as initial human migration and historic and trade relationships affect genetic diversity.

1.4 References

1. Schneider PM. Scientific standards for studies in forensic genetics. Forensic Sci Int. 2007;165(2-3):238-43.

2. Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab Emirates: phylogenetic relationships and ancestral populations. Gene. 2014;533(1):411-9.

3. Carracedo A, Butler JM, Gusmao L, Linacre A, Parson W, Roewer L, et al. Update of the guidelines for the publication of genetic population data. Forensic Sci Int Genet. 2014;10:1-2.

4. Osman AE, Alsafar H, Tay GK, Theyab J, Mubasher M, Sheikh N, et al. Autosomal short tandem repeat (STR) variation based on 15 loci in a population from the Central Region (Riyadh Province) of Saudi Arabia. J Forensic Res. 2015;6(1):1-5.

5. Silva NM, Pereira L, Poloni ES, Currat M. Human neutral genetic variation and forensic STR data. PLOS One. 2012;7(11).

6. Petraglia MD, Rose JI. The evolution of human populations in Arabia: Paleoenvironments, prehistory and genetics. Netherlands: Springer; 2009.

7. Shetty P. Lihadh Al-Gazali: a leading clinical geneticist in the Middle East. The Lancet. 2006;367(9515):979.

8. Alshamali F, Alkhayat AQ, Budowle B, Watson ND. STR population diversity in nine ethnic populations living in Dubai. Forensic Sci Int. 2005;152(2-3):267-79.

9. Barni F, Berti A, Pianese A, Boccellino A, Miller MP, Caperna A, et al. Allele frequencies of 15 autosomal STR loci in the Iraq population with comparisons to other populations from the middle-eastern region. Forensic Sci Int. 2007;167(1):87-92.

10. Perez-Miranda AM, Alfonso-Sanchez MA, Pena JA, Herrera RJ. Qatari DNA variation at a crossroad of human migrations. Hum Hered. 2006;61(2):67-79.

11. Kundu S, Ghosh SK. Trend of different molecular markers in the last decades for studying human migrations. Gene. 2015;556(2):81-90.

12. Cadenas AM, Zhivotovsky LA, Cavalli-Sforza LL, Underhill PA, Herrera RJ. Y-chromosome diversity characterizes the Gulf of Oman. Eur J Hum Genet. 2008;16(3):374-86.

13. Alshamali F, Pereira L, Budowle B, Poloni ES, et al. Local population structure in Arabian Peninsula revealed by Y-STR diversity. Hum Hered. 2009;68(1):45-54.

14. Nazir M, Alhaddad H, Alenizi M, Alenizi H, Taqi Z, Sanqoor S, et al. A genetic overview of 23Y-STR markers in UAE population. Forensic Sci Int Genet. 2016;23:150-2.

Chapter 2

LITERATURE REVIEW: GENETIC ANALYSIS OF PEOPLE OF THE

MIDDLE EAST AND THE USE OF SHORT TANDEM REPEATS FOR

POPULATION GENETIC ANALYSES

2.1 Introduction

Highly variable regions within the DNA termed Short Tandem Repeats (STRs) are

widely used for characterising population structure and estimating human genetic

diversity (1-4). Such DNA-based data also provide leads in disease susceptibility

studies, paternity and individual identification. Population genetic analyses utilising

such variable markers have identified bidirectional human migration through the

Middle East, linking movement through Africa, Asia and Europe (2, 5, 6).

Accordingly, the Middle East is at the crossroads of human migration and the degree

of genetic variation within the region is of interest particularly in improving the

understanding of the impact of human migration on genetic diversity. However,

making sense of how human dispersal affects the patterns of genetic diversity remains

a complicated task (7, 8). Furthermore, there is minimal research on the genetic

diversity of Middle Eastern populations compared to other populations around the

world (9). Importantly, inconsistencies in genetic relationships have been noted due

to the geographical location of the Middle East in addition to the variety of cultural

and religious relationships throughout the region (8).

The Middle East region includes the Arabian Peninsula, which comprises Saudi

Arabia, Yemen, Oman, the United Arab Emirates (UAE), Bahrain, Qatar, Kuwait and

parts of Iraq and Jordan. The Levant region is an extension of the Middle East and

comprises Iraq, Jordan, Syria and Lebanon. Interactions between populations or

groups within the region and with surrounding areas are likely given known historical

events and the continuous trade throughout the Middle East, North Africa and South

Asia since the initial migration in early human history. Furthermore, cultural and religious practices have spread across the region; today, for example, the Muslim faith is prevalent within the Middle East and throughout North Africa, the Horn of Africa and parts of South Asia.

The numerous dispersal routes have influenced the degree of genetic variation

throughout the region, which requires greater understanding to provide significant

insights into genetic variations associated with disease susceptibility and in individual

identification applications.

2.2 Human Migration

Changes in the frequency of genetic variations within and between populations over time can be caused by demographic events such as human dispersal, population expansion, and admixture or gene flow arising from the establishment of trade routes or the spread of a particular technology. These events leave genetic imprints, altering allele frequencies that are passed down through generations (10, 11). It has been suggested that human dispersal since the out of Africa migration and agricultural transitions have exerted some of the strongest evolutionary pressures on human populations (12). Additionally, selection events can affect allelic diversity, including disease-associated genetic variation (13).

DNA analyses have provided insight into human migration routes (Figure 2.1). The

specific nature of human genetic variations for a particular population has been

associated with certain migration routes and the resulting ethnic admixtures (14). It is

known that migration routes were bidirectional with humans migrating back through

Asia, the Middle East and back into Africa (5). Factors causing migration such as war,

food and climatic conditions likely resulted in the occurrence of bidirectional


migration. Furthermore, trade has become a more recent factor for many bidirectional

migration routes seen today (15, 16).

Figure 2.1: Estimated human migration routes using mitochondrial DNA

analyses. Map generated based on published data (2, 5, 6, 17-19).

Early human migration out of Africa has been predominantly studied using methods

inferring the evolutionary history of mitochondrial DNA (mtDNA) haplogroups

(Figure 2.1). One of the earliest discovered mtDNA lineages (L1 type) has been

suggested to have originated within East Africa approximately 130,000 years ago and

has been found to be restricted to African populations. This restriction, together with

the discovery of the L2 and L3 lineages deriving from the L1 type also within Africa,

suggests that ancient migration first began with a spread across Africa (17). The first

waves of the out of Africa migration are suggested to

have occurred approximately 85,000 years ago with two different proposed routes into

the Middle East (2, 3, 17, 18). These waves out of Africa from the L3 lineage gave

rise to the M and N type lineages observed within the Middle East. The proposed

routes of migration out of Africa are through the Horn of Africa (M type) and through

the Levantine region (N type) into the Middle East (Figure 2.1). The literature agrees


that both of these routes would have occurred; however, it is unclear which

migration route occurred first (17, 18).

Kundu & Ghosh (2015) used mtDNA analyses to suggest that following the initial

migration from Africa, humans travelled from the Middle East through Southern Asia

and later on through Sri Lanka into Indonesia and then Australia approximately 60,000

years ago (2). Furthermore, the analyses of mtDNA haplogroups led to the discovery

of the N1e splitting from haplogroup I into three variable sequences approximately

40,000-30,000 years ago (17, 19). Haplogroup I is frequent overall in most of Europe,

followed by the Gulf region. The three N1e sequences were further observed in

Arabia and Russia (17, 19). This analysis indicates that humans migrating from the

Middle East travelled north into Russia and expanded west into Europe within the

estimated timeframe of 40,000-30,000 years ago. Genetic analysis of mtDNA has dated

the beginning of the migration across America to approximately 25,000 years ago.

During that timeframe, the ‘Bering Land Bridge’ is proposed to have formed from the

freezing climate between Asia and Alaska, allowing humans to cross into America (2).

Analysis of mtDNA haplogroup variations suggests that the route into South America

was first taken approximately 20,000-15,000 years ago (17).

2.3 Impact of Bidirectional Migration through the Middle East

Evidence of bidirectional migration between the Middle East and Africa is provided

by mtDNA analysis that identifies the presence of the U6 lineage (of Middle East

origin) within North Africa. The presence of the U6 lineage in these regions suggests

migration from the Middle East back into Africa occurring approximately 50,000-

40,000 years ago (5). Such instances of bidirectional migrations are not unique as


bidirectional routes within the Middle East are common (6). Accordingly, as seen in

Figure 2.1, the Middle East represents a multidirectional crossroads between Africa,

Asia and Europe.

As mentioned earlier, the initial human migration out of Africa comprised the M and

N type mtDNA haplogroups, signifying separate genetic lineages between the Levant

and coast of the Arabian Peninsula. With the use of mtDNA analyses, Iraqi

populations (Levant Region) were found to not only have the N type lineage, but

additional haplogroups sharing relationships with populations within the M type

dispersal routes (e.g. India) (20). This example using mtDNA analysis highlights the

importance of considering the historical dispersal of humans after the initial out of

Africa migration to describe genetic relationships. Although there may have been two

different mtDNA haplogroups dispersed from the L3 type out of Africa, the dispersals

of humans within and through the Middle East region have involved multiple waves.

Furthermore, more recent human dispersals have resulted in genetic inferences that

have in some cases contradicted the assumed genetic relationships from the initial out

of Africa migration (18), signifying the importance of continuous genetic research.

Using mtDNA analyses to understand the dispersal of humans out of Africa indicates

how there is still much more to learn from population-specific analyses (18). For

example, in contrast to the observed heterogeneity within the Middle East, the Arabian

Gulf populations such as Saudi Arabia and Yemen have shown distinct, genetically

homogeneous structures, possibly due to the occurrence of isolated desert groups (18).

Further analyses of human dispersal throughout the Middle East and surrounding

regions will provide a better understanding of how the Middle East can exhibit a large

degree of heterogeneity with some areas of homogeneity.


The analysis of the Y-chromosome will further add towards understanding the degree

of heterozygosity within the Middle East and the likely impact of sociocultural

factors such as polygyny, which affects the male contribution towards the

Y-chromosome genetic pool. Historically polygyny was common, which would give rise

to homogeneous populations (identified via Y-chromosome haplotypes) that can be traced

through genetic analyses, with the shift to monogamy increasing genetic diversity (21).

As polygynous marriages occur throughout the Middle East, North Africa and South

Asia (22), it is important to address this factor when examining the degree of observed

homozygosity and heterozygosity at genetic traits amongst populations in the region.

2.4 Historical and Trade Migration in the Middle East Region

As indicated by archaeological data, mtDNA and Y-chromosome analyses, trade

between the Horn of Africa and the Arabian Peninsula has been occurring since 6000

BCE (Figure 2.2) (23). This migration would have been a key beginning to simple

maritime dispersals around the Middle East (9). The bidirectional trade involved

exchanging plants and animals (e.g. cattle), resulting in international migration (23).

The earliest archaeological data indicating trade routes between South Asia,

Mesopotamia and the Arabian Peninsula date to the Mature Harappan Period (2600

BCE). This period involved maritime trade and transportation.

Supported by DNA analyses, this trade period is indicative of migration from South

Asia (India) to areas along the coasts of the UAE, Iran and within the Levant (24).

More recently (20th Century), maritime trade across the Gulf between Iran, Qatar and

the UAE has increased, resulting in bidirectional human migration across the sea

(25).


Figure 2.2: Estimated historical and trade migration routes through the Middle

East. Map generated based on published data (9, 16, 19, 23-27).

Since the 6th Century, during suitable climatic conditions, the Red Sea coasts of Africa

and the Arabian Peninsula have been a region of trade migrations (e.g. for cotton) and

successive population expansions (16). The spread of Islam in the 7th and 11th Centuries

included the migration of populations throughout the Middle East and North Africa

(16). During the 7th Century, Islamic culture extended from Mecca and Jeddah into

Pakistan and west into the Iberian Peninsula. Islamic culture then extended within

North Africa during the 11th Century, with evidence that it spread through to

Morocco (16, 26). Furthermore, the Ottoman Empire was the longest-lasting dynasty,

spanning the 13th Century to the 20th Century (27). The migration routes of particular ethnic

groups are reflective of the spread of the dynasty, involving refugee routes and trade

(27). Furthermore, bidirectional migration between Oman and Zanzibar occurred

from the 19th Century (27). Cultural influences and genetic diversity can be seen to


have resulted from the trade and various historical events such as the Zanzibar

colonization.

Overall, human dispersal, together with subsequent sub-population endogamy and

consanguinity throughout the Middle East, has resulted in ethnic population diversity

(Figure 2.3). Furthermore, in more recent generations, the increase in

population sizes in addition to the large-scale urbanisation throughout the region has

impacted genetic structures. This rise in ethnic diversity has been observed from the

variety of social and cultural influences throughout the Middle East. There are at least

19 different ethnic groups within the Middle East with the largest pan-ethnic group

within and around the Arabian Peninsula referred to as ‘Arabs’. The next largest

ethnic groups within the Middle East region (and extending into Iran) are Persians and

then the Baluch. In addition to the diversity of different populations within the Middle

East, different Arab populations are geographically distributed throughout the Middle

East and North Africa (Bedouin, Egyptian, Jordanian, Yemenite, etc.). It is important

to establish the influence of human migration on the different Arab and other ethnic

populations within the Middle East in order to determine its effect on genetic diversity

and the extent of ethnic geographical distributions.


Figure 2.3: Qualitative analysis of different ethnic and cultural groups residing

within and surrounding the Middle East. Map generated using published data (15,

26, 28-43).

Many populations within the Middle East practise consanguineous marriage (21).

Higher rates of consanguineous marriages have been seen within Arab populations

(e.g. Bedouin), to some extent increasing homozygosity rates (44, 45).

Homozygosity rates observed within populations such as the Bedouins are also

affected by the isolation of these cultures in the desert as reflected in the mtDNA

haplogroup analyses. Accordingly, it is important to consider consanguinity and

endogamy in population genetic research when examining the effect of historical

human migration factors on levels of genetic diversity observed in consanguineous

families (21, 22). However, recent bidirectional migration and cultural influences

from surrounding regions have been observed to increase gene flow from surrounding


Asian, African and some European areas, resulting in observable genetic diversity

within the region (45). This highlights the importance of studying specific populations

within the Middle East region to understand the influences of observable

consanguinity, human migration and socio-cultural factors on genetic variation.
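As a rough illustration of how consanguinity can raise homozygosity, the standard population-genetics relationship between an inbreeding coefficient F and the expected proportion of homozygotes at a single locus can be sketched in Python. This is an illustrative sketch only: the function name and the allele frequencies are hypothetical and are not taken from any population discussed in this review. The value F = 1/16 is the textbook inbreeding coefficient for the offspring of first cousins.

```python
def expected_homozygosity(allele_freqs, inbreeding_f=0.0):
    """Expected proportion of homozygotes at one locus.

    Uses the classical relationship
        Hom = sum(p_i^2) + F * (1 - sum(p_i^2)),
    where F is the inbreeding coefficient (F = 0 means random mating).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    if not 0.0 <= inbreeding_f <= 1.0:
        raise ValueError("inbreeding coefficient must lie in [0, 1]")
    random_mating_hom = sum(p * p for p in allele_freqs)
    return random_mating_hom + inbreeding_f * (1.0 - random_mating_hom)


# Hypothetical allele frequencies for a four-allele STR locus.
freqs = [0.4, 0.3, 0.2, 0.1]

random_mating = expected_homozygosity(freqs)            # F = 0
first_cousins = expected_homozygosity(freqs, 1.0 / 16)  # offspring of first cousins

print(f"random mating:  {random_mating:.4f}")
print(f"first cousins:  {first_cousins:.4f}")
```

With these assumed frequencies the expected homozygosity rises from 0.30 under random mating to roughly 0.34 for the offspring of first cousins, which is the direction of effect described above.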

2.5 Geographical Differentiation Determined by Different Genetic Components

Analysis of genetic variation across the Y-chromosome (resulting in haplotypes)

allows for the investigation of male lineage contributions to gene flow. The genetic

variation along the Y-chromosome has been shown to be affected by rapid genetic

drift resulting in high levels of geographical differentiation of Y-haplotypes (46).

Furthermore, the Y chromosome contains the largest non-recombining section within

the human genome, providing significant informative haplotypes for application in

population genetic analyses (7). Worldwide population data have been collected for

Y-chromosome STR (Y-STR) haplotypes. Results have shown Y-STR haplotypes are

region-specific, increasing their utility in population genetic studies, forensics and

paternity testing (47).

Due to the geographical location of the Middle East and bidirectional dispersals, the

degree of ethnic diversity in the region has complicated population genetic analyses.

Discrepancies between genetic analyses of the Middle East have been observed. For

example, Triki-Fendri et al (2010) used Y-STR data to support the relationship

between the degree of genetic diversity and geographical location (47). Genetic

relatedness was seen between the Kuwaiti population and Arabian Peninsula

populations (Yemen, Saudi Arabia and the UAE) reflecting the close geographical

location of the groups (27). However, Roewer et al (2009) highlighted the importance


of studying individual populations or minority groups in close proximity within the

same region and country (48), as this study found genetic differences between linguistic

groups residing within Iran. This latter study highlights the importance in genetic

analyses of considering not only the geographical location of countries but also factors

such as cultural and religious isolation of populations within the same region.

Human dispersal throughout the Middle East has been diverse, originating from

various locations, but the region also contains isolated populations. Not surprisingly,

Y-STR analysis of the genetic structure of the Middle East has been described as

reflecting a “mosaic pattern” (18). Accordingly, it is important to increase the number

of analyses between different identifiable populations and subpopulations within the

Middle East to attempt to understand the “mosaic pattern” of the genetic diversity.

Additional comparison to other regions in North Africa and South Asia due to

dispersal patterns needs to be considered to further understand genetic diversity in

close-proximity subpopulations.

The use of mtDNA analyses is thought to have advantages over Y-STR analyses in

genetic studies due to the abundance of mtDNA in the cell, faster mutation rate and

increased number of polymorphisms (2). However, the use of Y-chromosome

analyses supports mtDNA analyses of human migration such as the ancient African

origin for all modern humans and bidirectional dispersals (2, 5, 49-50). Nevertheless,

differences in observed genetic outcomes can arise between mtDNA and

Y-chromosome analyses due to the different gene flow patterns of males and females (51).

As may be expected, males and females may not always accompany each other

through dispersal routes (51, 52) and the continued analysis of Y-chromosome

variation in different populations is necessary to fill in the gaps in human history not

reflected in mtDNA analyses.


2.6 Distinguishing Closely Related Groups using Autosomal STRs

The use of autosomal DNA for population genetic analyses has advanced studies on

migration and parentage, and the ability to infer genetic diversity (53). Both mtDNA and

Y-chromosome results only represent single loci reflecting maternal or paternal history,

respectively. By analysing autosomal STRs, the overall population genetic structure

can be represented. Studies in the literature have shown autosomal STR markers

provide useful information on our evolutionary and migratory history (1, 53-55). The

enhanced understanding of the degree of genetic variation within and between

populations can then be associated with improving analyses of disease patterns and

susceptibility.
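To make the notion of describing genetic variation with autosomal STRs more concrete, the following minimal Python sketch estimates allele frequencies at one locus from diploid genotypes and derives the observed and expected heterozygosity. It is an assumption-laden illustration rather than the procedure used in any cited study: the function names are hypothetical, the five genotypes (repeat numbers) are invented, and no small-sample correction is applied.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies at one locus by simple allele counting.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """Expected heterozygosity (gene diversity): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


# Hypothetical genotypes (repeat numbers) for five individuals at one STR locus.
sample = [(12, 13), (12, 12), (13, 14), (12, 14), (13, 13)]

freqs = allele_frequencies(sample)
h_exp = expected_heterozygosity(freqs)
h_obs = observed_heterozygosity(sample)

print(freqs)
print(f"expected heterozygosity: {h_exp:.2f}")
print(f"observed heterozygosity: {h_obs:.2f}")
```

In practice such estimates would be computed per locus over many loci and far larger samples; the toy data here only show the arithmetic.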

Due to the hyper-variability and ubiquity of autosomal STRs throughout the genome,

extensive sets of autosomal STRs can be used for population genetic applications (1).

Single nucleotide polymorphisms (SNPs) have been highlighted in the literature to be

superior to STRs for genetic analyses, especially with the advent of next generation

sequencing technologies (2, 4). There are a large number of SNPs along the genome

and these loci exhibit lower mutation rates than STRs (56). The mutation rate of

these markers is important to consider as the high mutation rate of STRs has been

associated with a decrease in reliability in allele frequency estimations (1, 55, 57).

However, Gaiber et al (2012) found, using 51 worldwide populations, that 88.9% of the

genetic variation could be assessed by using 650,000 SNPs, whereas typing only 783

autosomal STRs captured a considerably higher proportion of population genetic

variation, at 94%, with the same samples (55). Furthermore, due to the number of STR markers available

for large population sample sizes, the advantages of using STRs over SNPs include

the higher analytical power, higher allelic abundance and lower ascertainment bias


(58-60). A worthy suggestion within the literature is to combine the analysis of STRs

and SNPs to improve population genetic inferences (53, 61).

As mtDNA and Y-chromosome analyses have identified the importance of studying

closely located and related groups, the hyper-variability of autosomal STRs has the

ability to advance knowledge on this topic. Shepard et al (2006) found a strong

correlation between the degree of genetic variation and both geography and language

using autosomal STRs (6). For example, Bentayebi et al (2014) found minimal

genetic variance with shared ancestry between North African populations (Morocco

and Egypt) and populations within the Middle East (Iraq and Oman respectively) (60).

Even though there is geographical distance between these populations, autosomal

STRs can indicate which cultural and religious similarities may have impacted genetic

relationships. The use of autosomal STRs has also shown that close geographical

populations have genetic relationships that reflect known historical events, trade and

sociocultural relationships (54, 62, 63). The fact that autosomal STRs reveal both

geographically close and geographically distant relationships highlights the need for

continued research to understand the degree of genetic influences.

Variable degrees of genetic relationships were observed within the literature between

Saudi Arabia and other Arabian Peninsula populations including Qatar (62). Perez-

Miranda et al (2006) found significant genetic distance between Qatar and Saudi

Arabia even though they share common borders (62). Additionally, the genetic

isolation of Saudi Arabian populations has been highlighted when compared with

other neighbouring populations such as the UAE (64). Furthermore, the UAE shares

greater genetic similarities with South Asian populations such as Iran, reflecting trade

and cultural relationships (54).


The degree of genetic variation observed in the Middle East fluctuates from low levels

of diversity in isolated populations (Saudi Arabia) to high levels of heterogeneity

(UAE) within the region, complicating the task of understanding the degree of genetic

diversity in the Middle East. Clearly, the degree of genetic variation within the Middle

East and in North African and South Asian populations does not show a smooth pattern

of genetic diversity affected only by geographical distance. Additional factors

impacting genetic diversity, as seen using autosomal STRs, consist of historic and

trade relationships throughout the years that require consideration in addition to

geographical distance and initial human migration.
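One simple way to quantify how strongly two populations differ at a locus is Nei's G_ST, which compares the gene diversity of the pooled allele frequencies (H_T) with the mean within-population gene diversity (H_S). The Python sketch below is a hedged illustration only: it treats a single locus, weights the two populations equally, omits sample-size corrections, and uses invented allele frequencies and hypothetical function names. It is not the distance measure reported in the studies cited above.

```python
def gene_diversity(freqs):
    """Expected heterozygosity for one population: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def nei_gst(pop_a, pop_b):
    """Nei's G_ST for two equally weighted populations at one locus.

    pop_a, pop_b: dicts mapping allele -> frequency (each summing to 1).
    G_ST = (H_T - H_S) / H_T
    """
    alleles = set(pop_a) | set(pop_b)
    mean_freqs = {
        allele: (pop_a.get(allele, 0.0) + pop_b.get(allele, 0.0)) / 2.0
        for allele in alleles
    }
    h_s = (gene_diversity(pop_a) + gene_diversity(pop_b)) / 2.0
    h_t = gene_diversity(mean_freqs)
    if h_t == 0.0:
        # Both populations are fixed for the same allele: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at one STR locus.
identical = nei_gst({12: 0.5, 13: 0.5}, {12: 0.5, 13: 0.5})
diverged = nei_gst({12: 0.8, 13: 0.2}, {12: 0.2, 13: 0.8})
fixed_apart = nei_gst({12: 1.0}, {13: 1.0})

print(f"identical populations:    {identical:.2f}")
print(f"partly diverged:          {diverged:.2f}")
print(f"fixed for different allele: {fixed_apart:.2f}")
```

Values near 0 indicate populations with similar allele frequencies and values near 1 indicate strong differentiation. Geographic proximity alone does not determine where a pair of populations falls on this scale, which is the point made in the preceding paragraphs.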

2.7 Middle East Region and Worldwide Research

There have been numerous global projects analysing the degree of genetic variations

in populations. The International HapMap Project was developed to characterise

DNA sequences, allele frequencies and genetic relationships across different

populations (65). The population samples were collected over three phases (Figure

2.4). Despite the widespread nature of this genetic research, no population data were

collected from the Middle East region. Furthermore, the 1000 Genomes Project only

characterised allele frequency variants from West Africa, Europe, North America and

South and East Asia (66). Although the Middle East region has been established as

the crossroads of migration, showing fluctuating degrees of genetic variation, there

remains a lack of information from this region, especially within global genetic

projects. As mentioned previously, the Middle East occupies a significant

geographical location, contributing to the flow of genes through bidirectional human

migration.


Figure 2.4: Populations chosen for the International HapMap Project. Note there

is no described ethnic population from the Middle East region. Map generated using

published data (65, 67, 68). Note that the map shows the locations from which the

ethnic groups originated; the locations where participants were recruited are also

described.

Furthermore, the diverse resident ethnic populations and different cultural lifestyles

have impacted gene flow and in turn the genetic diversity in the Middle East and

surrounding North African and South Asian populations. These facts demonstrate why

it is important to include the Middle East region in global genetic projects, so that

its diverse genetic imprint can be compared from a global perspective (69).

However, there is minimal genetic research in the literature on the Middle East in

comparison to other global regions (66). Although there are publications describing

genetic data from Middle Eastern populations, as mentioned above with the use of

mtDNA, Y-chromosome and autosomal STRs (70-73), not every country within the

region has been analysed and compared with the others. Furthermore, due to the

advancements in DNA analyses, more STR markers and larger sample sizes are


required than what is provided within previous publications (74). Such research would

also impact on our understanding of the genetics of disease susceptibility in the region

(44).

2.8 Conclusion

Strategically located, the Middle East region is important in population genetic

analyses to understand the waves of migration and back-migrations intersecting

Europe, Asia and Africa (6). The advancements in genetic analyses have proven the

importance of migratory studies of the Middle East. However, the complexity of

genetic relationships amongst Middle Eastern populations may be better understood with

further genetic studies into the significance of the major dispersal routes and the

impact of other factors on genetic diversity.

The literature has shown how STRs are still powerful tools for population genetic

analyses. From the amount of information STR analyses can provide, population allele

frequency standards can be obtained and population structures can be inferred. This

review highlights how genetic analyses using mtDNA, Y-chromosome and autosomal

STRs can improve understanding of the genetic diversity of the Middle East.

However, due to the Middle East showing isolated population structures but overall

significant heterogeneity amongst the region and surrounding areas, further genetic

comparisons using modern advancements in technology and more populations for

comparisons in meta-analyses need to take place. This information will allow for a

greater understanding of the Middle East region and the impact of historic and recent

human dispersal on genetic variation.


2.9 References

1. Silva NM, Pereira L, Poloni ES, Currat M. Human neutral genetic variation

and forensic STR data. PLoS One. 2012;7(11).

2. Kundu S, Ghosh SK. Trend of different molecular markers in the last decades

for studying human migrations. Gene. 2015;556(2):81-90.

3. Ermini L, Der Sarkissian C, Willerslev E, Orlando L. Major transitions in

human evolution revisited: A tribute to ancient DNA. J Hum Evol. 2015;79:4-

20.

4. Novembre J, Peter BM. Recent advances in the study of fine-scale population

structure in humans. Curr Opin Genet Dev. 2016;41:98-105.

5. Maca-Meyer N, González AM, Pestano J, Flores C, Larruga JM, Cabrera VM.

Mitochondrial DNA transit between West Asia and North Africa inferred from

U6 phylogeography. BMC Genet. 2003;4:1-11.

6. Shepard EM, Herrera RJ. Genetic encapsulation among Near Eastern

populations. J Hum Genet. 2006;51(5):467-76.

7. Underhill PA, Kivisild T. Use of Y chromosome and mitochondrial DNA

population structure in tracing human migrations. Annu Rev Genet.

2007;41:539-64.

8. Barbujani G, Ghirotto S, Tassi F. Nine things to remember about human

genome diversity. Tissue Antigens. 2013;82(3):155-64.

9. Petraglia MD, Rose JI. The evolution of human populations in Arabia:

Paleoenvironments, prehistory and genetics. Netherlands: Springer 2009.

10. Thangaraj K, Ramana GV, Singh L. Y-chromosome and mitochondrial DNA

polymorphisms in Indian populations. Electrophoresis. 1999;20:1743-7.


11. Hellenthal G, Busby G, Band G, Wilson JF, Capelli C, Falush D, et al. A

genetic atlas of human admixture history. Science. 2014;343:747-51.

12. Shendure J. Human genomics: A deep dive into genetic variation. Nature.

2016;536:277-78.

13. Nakagome S, Alkorta-Aranburu G, Amato R, Peter B, Hudson R, Di Rienzo

A. Estimating the ages of selection signals from different epochs in human

history. Mol Biol Evol. 2015;33(3):657-669.

14. Der Sarkissian C, Brotherton P, Balanovsky O, Templeton JE, Llamas B,

Soubrier J, et al. Mitochondrial genome sequencing in Mesolithic North East

Europe unearths a new sub-clade within the broadly distributed human

haplogroup C1. PLoS One. 2014;9(2):e87612.

15. Valeri M. Nation-building and communities in Oman since 1970: The Swahili-

Speaking Omani in search of identity. African Affairs. 2007;106(424):479-96.

16. Abu-Amero KK, Gonzalez AM, Larruga JM, Bosley TM, Cabrera VM.

Eurasian and African mitochondrial DNA influences in the Saudi Arabian

population. BMC Evol Biol. 2007;7(32):1-15.

17. Forster P. Ice ages and the mitochondrial DNA chronology of human

dispersals: A review. Philos Trans R Soc Lond B Biol Sci.

2004;359(1442):255-64.

18. Petraglia MD, Haslam M, Fuller DQ, Boivin N, Clarkson C. Out of Africa:

New hypotheses and evidence for the dispersal of Homo sapiens along the

Indian Ocean rim. Ann Hum Biol. 2010;37(3):288-311.

19. Fernandes V, Alshamali F, Alves M, Costa MD, Pereira JB, Silva NM, et al.

The Arabian cradle: Mitochondrial relicts of the first steps along the southern

route out of Africa. Am J Hum Genet. 2012;90(2):347-55.


20. Al-Zahery N, Semino O, Benuzzi G, Magri C, Passarino G, Torroni A, et al.

Y-chromosome and mtDNA polymorphisms in Iraq, a crossroad of the early

human dispersal and of post-Neolithic migrations. Mol Phylogenet Evol.

2003;28(3):458-72.

21. Dupanloup I, Pereira L, Bertorelle G, Calafell F, João Prata M, Amorim A,

Barbujani G. A recent shift from polygyny to monogamy in humans is

suggested by the analysis of worldwide Y-chromosome diversity. J Mol Evol.

2003;57:85-97.

22. Al-Krenawi A, Slonim-Nevo V, Graham J. Polygyny and its impact on the

psychosocial well-being of husbands. J Comp Fam Stud. 2006;37(2):173-189.

23. Hodgson J, Mulligan C, Al-Meeri A, Raaum R. Early back-to-Africa migration

into the Horn of Africa. PLoS Genet. 2014;10(6):1-18.

24. Martin L. DNA tribes. DNA Tribes 2013:1-17.

25. Potter LG. The Persian Gulf in History. 1st ed. New York: Palgrave Macmillan

2009.

26. Shoup J. Ethnic Groups of Africa and the Middle East: An Encyclopedia.

California: ABC-CLIO 2011.

27. Tadmouri GO, Sastry KS, Chouchane L. Arab gene geography: From

population diversities to personalized medical genomics. Glob Cardiol Sci

Pract. 2014;2014(4):394-408.

28. Rashidvash V. Iranian people and the origin of the Turkish-speaking

population of the North-western Iran. Canadian Social Sci. 2012;8(2):132-9.

29. Tachjian V. Gender, nationalism, exclusion: The reintegration process of

female survivors of the Armenian genocide. Nations and Nationalism.

2009;15(1):60-80.


30. Hovannisian R. The ebb and flow of the Armenian minority in the Arab Middle

East. The Middle East Journal. 1974;28(1):19-32.

31. Hovannisyan A, Khachatryan Z, Haber M, Hrechdakian P, Karafet T, Zalloua

P, et al. Different waves and directions of Neolithic migrations in the

Armenian highland. Invest Genet. 2014;5(15):1-11.

32. Lewis JE. Iraqi Assyrians: Barometer of pluralism. Middle East Quarterly.

Summer 2003;10(3):49-57.

33. His Beatitude the Patriarch Mar E. Assyrians in the Middle East. J Roy Cent

Asian Soc. 1953;40(2):151-60.

34. Dalby A. Dictionary of languages: Bakhtiari. London, United Kingdom: A&C

Black 2004.

35. Dalby A. Dictionary of languages: Beja. London, United Kingdom: A&C

Black 2004.

36. Bernhard P. Behind the Battle Lines: Italian Atrocities and the Persecution of

Arabs, Berbers, and Jews in North Africa during World War II. Holocaust

Genocide Stud. 2012;26(3):425-46.

37. Mogib M. Copts in Egypt and their demands: Between inclusion and

exclusion. Contemp Arab Affairs. 2012;5(4):535-55.

38. Dalby A. Dictionary of languages: Gilaki. London, United Kingdom: A&C

Black 2004.

39. Gottesman L. Jews in the Middle East. Am Jew Yr Book. 1985;85:304-23.

40. Hisyar O. Introduction: The Kurds' ordeal with Turkey in a transforming

Middle East. Dialect Anthropol. 2013;37(1):103-11.

41. Dalby A. Dictionary of languages: Kurdish. London, United Kingdom: A&C

Black 2004.

42. Dalby A. Dictionary of languages: Luri. London, United Kingdom: A&C

Black 2004.

43. Abbas F. Egypt, Arab nationalism, and Nubian diasporic identity in Idris Ali's

Dongola: A novel of Nubia. Res African Lit. 2014;45(3):147-66.

44. Reilly B. Revisiting consanguineous marriage in the greater Middle East:

Milk, blood, and Bedouins. Am Anthropol. 2013;115(3):374-87.

45. Qamar R, Ayub Q, Mohyuddin A, Helgason A, Mazhar K, Mansoor A, et al.

Y-Chromosome DNA variation in Pakistan. Am J Hum Genet. 2002;70:1107-

24.

46. Immel UD, Kleiber M, Klintschar M. Y-chromosomal STR haplotypes in an

Arab population from Yemen. Int Congress Series 2004;1261:340-3.

47. Triki-Fendri S, Alfadhli S, Ayadi I, Kharrat N, Ayadi H, Rebai A. Genetic

structure of Kuwaiti population revealed by Y-STR diversity. Ann Hum Biol.

2010;37(6):827-35.

48. Roewer L, Willuweit S, Stoneking M, Nasidze I. A Y-STR database of Iranian

and Azerbaijanian minority populations. Forensic Sci Int Genet. 2009;4:53-5.

49. Seielstad M, Bekele E, Ibrahim M, Toure A, Traore M. A view of modern

human origins from Y-chromosome microsatellite variation. Genome Res.

1999;9:558-67.

50. Cordaux R, Aunger R, Bentley G, Nasidze I, Sirajuddin SM, Stoneking M.

Independent origins of Indian caste and tribal paternal lineages. Curr Biol.

2004;14:231-5.

51. Triki-Fendri S, Sánchez-Diz P, Rey-González D, Ayadi I, Carracedo Á, Rebai

A. Paternal lineages in Libya inferred from Y-chromosome haplogroups. Am

J Phys Anthropol. 2015;157:242-51.

52. Badro DA, Douaihy B, Haber M, Youhanna SC, Salloum A, Ghassibe-Sabagh

M, Johnsrud B, et al. Y-chromosome and mtDNA genetics reveal significant

contrasts in affinities of modern Middle Eastern populations with European

and African populations. PLoS One. 2013;8(1):e54616.

53. Putman AI, Carbone I. Challenges in analysis and interpretation of

microsatellite data for population genetic studies. Ecol Evol. 2014.

54. Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab

Emirates: phylogenetic relationships and ancestral populations. Gene.

2014;533(1):411-9.

55. Gaibar M, Esteban ME, Via M, Harich N, Kandil M, Fernandez-Santander A.

Usefulness of autosomal STR polymorphisms beyond forensic purposes: Data

on Arabic- and Berber-speaking populations from central Morocco. Ann Hum

Biol. 2012;39(4):297-304.

56. Payseur BA, Jing P. A genomewide comparison of population structure at STRs

and nearby SNPs in humans. Mol Biol Evol. 2009;26:1369-77.

57. Zhang D, Hewitt G. Nuclear DNA analyses in genetic studies of populations:

Practice, problems and prospects. Mol Ecol. 2003;12:563-84.

58. Sun JX, Mullikin JC, Patterson N, Reich D. Microsatellites are molecular

clocks that support accurate inferences about history. Mol Biol Evol.

2009;26:1017-27.

59. Haasl RJ, Payseur BA. Multi-locus inference of population structure: A

comparison between single nucleotide polymorphisms and microsatellites.

Heredity. 2011;106:591-7.

60. Bentayebi K, Abada F, Ihzmad H, Amzazi S. Genetic ancestry of a Moroccan

population as inferred from autosomal STRs. Meta gene. 2014;2:427-38.

61. Middleton D. HLA typing from serology to sequencing era. Iran J Allergy

Asthma Immunol. 2005;4(2):53-66.

62. Perez-Miranda AM, Alfonso-Sanchez MA, Pena JA, Herrera RJ. Qatari DNA

variation at a crossroad of human migrations. Hum Hered. 2006;61(2):67-79.

63. Cassar M, Farrugia C, Vidal C. Allele frequencies of 14 STR loci in the

population of Malta. Leg Med. 2008;10(3):153-6.

64. Osman AE, Alsafar H, Tay GK, Theyab J, Mubasher M, Sheikh N, et al.

Autosomal short tandem repeat (STR) variation based on 15 loci in a

population from the Central Region (Riyadh Province) of Saudi Arabia. J

Forensic Res. 2015;6(1):1-5.

65. Garte S. Human population genetic diversity as a function of SNP type from

HapMap data. Am J Hum Biol. 2010;22(3):297-300.

66. Kumar S, Kingsley C, DiStefano JK. The human genome project: Where are

we now and where are we going? 2015:7-31.

67. Gibbs RA, Belmont JW, Hardenbol P, Willis TD, Yu F, Yang H, et al. The

international HapMap project. Nature. 2003;426:789-96.

68. Zeggini E, Rayner W, Morris AP, Hattersley AT, Walker M, Hitman GA et al.

An evaluation on HapMap sample size and tagging SNP performance in large-

scale empirical and simulated data sets. Nature Genet. 2005;37(12):1320-2.

69. Zayed H. The Arab genome: Health and wealth. Gene. 2016;592:239-243.

70. Barni F, Berti A, Pianese A, Boccellino A, Miller MP, Caperna A, et al. Allele

frequencies of 15 autosomal STR loci in the Iraq population with comparisons

to other populations from the middle-eastern region. Forensic Sci Int.

2007;167(1):87-92.

71. Alshamali F, Alkhayat AQ, Budowle B, Watson ND. STR population diversity

in nine ethnic populations living in Dubai. Forensic Sci Int. 2005;152(2-

3):267-79.

72. El Andari A, Othman H, Taroni F, Mansour I. Population genetic data for 23

STR markers from Lebanon. Forensic Sci Int Genet. 2013;7(4):e108-13.

73. Alenizi M, Goodwin W, Ismael S, Hadi S. STR data for the AmpFlSTR

Identifiler loci in Kuwaiti population. Legal Med. 2008;10(6):321-5.

74. Carracedo Á, Butler JM, Gusmão L, Linacre A, Parson W, Roewer L, et al.

New guidelines for the publication of genetic population data. Forensic Sci Int

Genet. 2013;7(2):217-20.

Chapter 3

POPULATION GENETICS DATA FOR 21 AUTOSOMAL STR LOCI FOR THE UNITED ARAB EMIRATES
POPULATION USING A NEXT GENERATION MULTIPLEX STR KIT

Following the literature review in the preceding chapters, the remainder of this thesis
describes the data collected and analysed. This chapter is the first of four data chapters.
It contains the text of a manuscript published in 2015 in Forensic Science International
(FSI): Genetics as a Letter to the Editor; the published version is included in the
appendices (Appendix 1) of the thesis. The candidate contributed to the interpretation of
data, statistical calculations and analyses, proof-reading and editing, and formatting of
the manuscript for FSI Genetics.

Forensic Science International: Genetics. 2015;19:190-1

Population genetics data for 21 autosomal STR loci for the United Arab Emirates

(UAE) population using a next generation multiplex STR kit.

Osamah Ali Alhmoudi1,2, Rebecca J Jones3, Guan K Tay3, Habiba Alsafar4, Sibte

Hadi1.

1 University of Central Lancashire, School of Forensic and Investigative

Sciences, Preston, UK.

2 Forensic Evidence Department, Abu Dhabi Police General Head Quarter, Abu

Dhabi, The United Arab Emirates.

3 Centre for Forensic Sciences, University of Western Australia, Crawley,

Western Australia.

4 Faculty of Biomedical Engineering, Khalifa University of Science,

Technology and Research, Abu Dhabi, The United Arab Emirates.

3.1 Introduction

The United Arab Emirates (UAE) is a Middle Eastern country located on the Arabian Gulf. It
borders Saudi Arabia and Oman and lies across the Gulf from Iran. The UAE was founded in
1971 and consists of seven Emirates: Abu Dhabi, Dubai, Sharjah, Ajman, Ra’s Al-Khaymah,
Al-Fujairah and Umm Al-Quwain (1). According to 2015 census data, the total UAE population
was approximately 9.6 million. Native Arabs made up 11.3% of the total population, with the
majority of residents being of Indian and Pakistani ethnicity. In the early part of the
twentieth century, the different tribes started migrating in different directions in search
of a better life. Some moved into coastal regions, while others inhabited the desert.
Despite modernization throughout the union, the basic family structure of the native UAE
Arab population has remained unchanged. Culturally, the preference for consanguineous
marriage remains embedded in the society (2). However, as awareness of the social and
medical impact of consanguinity increases and the society diversifies, non-consanguineous
marriages appear to be on the increase, which has possibly resulted in greater genetic
diversity throughout the population (3). This increase in genetic diversity makes it
worthwhile to assess whether Short Tandem Repeat (STR) loci can be used for forensic and
paternity purposes in this population. This study expands on a previous publication on UAE
populations by genotyping additional STR loci in a larger population sample (4).

3.2 Materials and Methods

3.2.1 Sample Description

DNA samples from 519 randomly chosen, healthy, unrelated individuals residing in Abu Dhabi,
UAE were used in this study. The samples had previously been obtained from individuals at
Khalifa University in Abu Dhabi, in accordance with approval from the Ethics Committee of
the Ministry of Health in the UAE (2011). Informed consent was received from every
volunteer during collection, and only de-identified data are presented.

3.2.2 DNA Extraction

The DNA samples provided for this study were collected and extracted from buccal swabs
using the Oragene-DNA kit (Genotek, Ottawa, Canada) in accordance with the manufacturer’s
guidelines. The quantity and quality of the extracted DNA were determined using a NanoDrop
spectrophotometer (Thermo Scientific, Wilmington DE, USA).

3.2.3 PCR Multiplex Amplification

Samples were amplified in half-volume (7.5 µl) reactions using the GlobalFiler® PCR
amplification kit (Life Technologies, Foster City CA, USA), alongside the allelic ladder
supplied with the kit, in accordance with the manufacturer’s guidelines. The PCR was
performed in a GeneAmp® PCR System 9700 (Life Technologies). The GlobalFiler® kit amplifies
24 loci, of which the 21 autosomal STR loci are of interest in this study of the UAE
population: D3S1358, vWA, D16S539, CSF1PO, TPOX, D8S1179, D21S11, D18S51, D2S441, D19S433,
TH01, FGA, D22S1045, D5S818, D13S317, D7S820, SE33, D10S1248, D1S1656, D12S391 and D2S1338.

3.2.4 STR Typing

The PCR products, with the LIZ internal size standard (Life Technologies) added, were
separated on an ABI 3500 Genetic Analyser with POP-4™ polymer (Life Technologies) and
analysed using GeneMapper® Software version 4.0 (Life Technologies). Alleles at all loci
reported here were designated according to the published nomenclature and the guidelines of
the International Society for Forensic Genetics (ISFG) for performing STR analyses (5).

3.2.5 Statistical Analysis

STR allele frequencies and the following population genetic parameters were estimated using
PowerStats version 1.2 (Promega, Madison, USA): Observed and Expected Heterozygosity (Ho and
He, respectively), Power of Discrimination (PD), Power of Exclusion (PE) and Polymorphic
Information Content (PIC). Arlequin version 3.11 was used to perform an exact test for
departures from Hardy-Weinberg equilibrium (HWE) (6). To adjust for the multiple STR loci
being tested, a Bonferroni correction was applied by dividing the critical P value (α) by
the number of comparisons (n): α ÷ n = 0.05 ÷ 21 ≈ 0.0024. The theoretical profile
frequency range was estimated to identify the rarest and most common genotypes. Genotype
proportions were also calculated with the Balding-Nichols formulae at FST values of 0.03
and 0.05 (Table 3.2) to allow for possible population substructure.
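
As a rough illustration of two of the parameters listed above, expected heterozygosity and
PIC can be computed directly from the allele frequencies at a locus. The sketch below uses
the standard textbook formulas (He = 1 − Σp², and the Botstein form of PIC). Whether
PowerStats applies the same conventions, for example small-sample corrections, is an
assumption, and the allele frequencies are hypothetical rather than taken from Table 3.1.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), with no small-sample correction applied."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over allele pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    pair_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - pair_term


# Hypothetical allele frequencies for one locus (illustrative only; must sum to 1).
example_freqs = [0.5, 0.3, 0.2]

he = expected_heterozygosity(example_freqs)
pic = polymorphic_information_content(example_freqs)
print(f"He = {he:.4f}, PIC = {pic:.4f}")
```

PIC is always a little lower than He for the same locus, which is consistent with the
pattern of the He and PIC rows in Table 3.1.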

The data generated in this study were compared with five published population data sets for
the 15 overlapping STR loci (7). Locus-by-locus exact tests (30,000 Markov steps) were
performed between the UAE population of this study and populations from Kuwait, India,
Saudi Arabia, Egypt and Iran using Arlequin v3.11 (6).

3.3 Results and Discussion

Allele frequency data are given in Table 3.1. Allele 8 of TPOX had the highest allele
frequency, at 49.4% of the total sample. Two off-ladder allelic variants were observed at
locus SE33: allele 7.3 (in three samples) and allele 17.3 (in one sample). Both variants
have previously been reported on STRBase (8). A tri-allelic pattern (alleles 6, 8 and 10)
was observed at TPOX, which has also been reported on STRBase (8). SE33 showed the largest
number of alleles (50), while D3S1358, D16S539 and CSF1PO showed the smallest (8 each). Ho
across the 21 autosomal STR loci ranged from 65.7% (TPOX) to 92.8% (SE33). PD ranged from
84.6% (TPOX) to 99.3% (SE33). The combined PE (CPE), combined PD (CPD) and combined matching
probability (CMP) for all 21 STR loci were 0.9999992, 0.9999999 and 6.2468 × 10⁻²⁷,
respectively. In the HWE exact test, 19 of the 21 autosomal STR loci showed no significant
deviation (P > 0.05). The two loci that did, D8S1179 (P = 0.037) and D22S1045 (P = 0.003),
were no longer significant once the Bonferroni-corrected threshold (0.05 ÷ 21 ≈ 0.002) was
applied.
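
The HWE screen and the combined match probability can be checked with a few lines of
arithmetic. The sketch below takes the HWE p-values and per-locus match probabilities (MP)
as printed in Table 3.1, applies the nominal and Bonferroni-corrected thresholds, and
multiplies the MP values under the usual assumption of independence between loci. Because
the tabulated MP values are rounded to three decimals, the product should only be expected
to approximate the reported CMP.

```python
import math

ALPHA = 0.05

# HWE exact-test p-values as printed in Table 3.1 (21 autosomal loci).
HWE_P = {
    "D3S1358": 0.110, "vWA": 0.577, "D16S539": 0.156, "CSF1PO": 0.641,
    "TPOX": 0.274, "D21S11": 0.060, "D8S1179": 0.037, "D18S51": 0.347,
    "D2S441": 0.340, "D19S433": 0.367, "TH01": 0.069, "FGA": 0.062,
    "D22S1045": 0.003, "D5S818": 0.643, "D13S317": 0.856, "D7S820": 0.051,
    "SE33": 0.183, "D10S1248": 0.232, "D1S1656": 0.165, "D12S391": 0.129,
    "D2S1338": 0.133,
}

# Per-locus match probabilities (MP row of Table 3.1), in the same locus order.
MP = dict(zip(HWE_P, [
    0.093, 0.075, 0.085, 0.125, 0.154, 0.043, 0.045, 0.029, 0.098, 0.039,
    0.082, 0.034, 0.107, 0.104, 0.080, 0.081, 0.007, 0.093, 0.026, 0.025,
    0.030,
]))

# Bonferroni: divide the critical value by the number of loci tested.
bonferroni_alpha = ALPHA / len(HWE_P)

nominal = [locus for locus, p in HWE_P.items() if p < ALPHA]
corrected = [locus for locus, p in HWE_P.items() if p < bonferroni_alpha]

# Product rule across loci (assumes the loci are independent).
cmp_estimate = math.prod(MP.values())

print(f"Bonferroni threshold: {bonferroni_alpha:.4f}")
print("Below 0.05:", nominal)
print("Below corrected threshold:", corrected)
print(f"Combined match probability from rounded MP values: {cmp_estimate:.3e}")
```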

The data for the most common STR profile in the UAE population (Table 3.2) showed that even
a conservative FST value of 0.05 gives a discrimination power in the order of 10¹⁵, that
is, a profile frequency far rarer than 1 in a billion. These estimates indicate that match
probabilities reported by laboratories for UAE populations (in the case of a full match)
can be based on statistics generated from the population's own allele frequency data. As
more STRs are added to the standard panel, further work is required to develop guidelines
and standards for the use of UAE population genetic data in the form of databases.
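
The Hardy-Weinberg column of Table 3.2 can be reproduced from the allele frequencies
alone. The sketch below computes the heterozygote proportion 2pq from the two most common
alleles at each locus and multiplies the 21 values together, assuming HWE within loci and
independence between loci. It deliberately covers only the uncorrected Hardy-Weinberg
case; the FST-adjusted columns are not reproduced here because the exact form of the
Balding-Nichols calculation used for the table is not stated.

```python
import math

# Frequencies of the two most common alleles at each locus, from Table 3.2.
TOP_TWO = {
    "D3S1358": (0.255, 0.301), "vWA": (0.247, 0.272), "D16S539": (0.357, 0.233),
    "CSF1PO": (0.303, 0.316), "TPOX": (0.494, 0.136), "D21S11": (0.219, 0.234),
    "D8S1179": (0.241, 0.168), "D18S51": (0.183, 0.158), "D2S441": (0.402, 0.232),
    "D19S433": (0.215, 0.245), "TH01": (0.299, 0.238), "FGA": (0.162, 0.208),
    "D22S1045": (0.406, 0.248), "D5S818": (0.264, 0.342), "D13S317": (0.255, 0.329),
    "D7S820": (0.283, 0.264), "SE33": (0.094, 0.086), "D10S1248": (0.328, 0.276),
    "D1S1656": (0.151, 0.191), "D12S391": (0.174, 0.133), "D2S1338": (0.205, 0.149),
}


def heterozygote_proportion(p, q):
    """Expected proportion of the heterozygous genotype under HWE: 2pq."""
    return 2.0 * p * q


genotype_props = {
    locus: heterozygote_proportion(p, q) for locus, (p, q) in TOP_TWO.items()
}

# Product rule across the 21 loci gives the most common profile frequency.
profile_frequency = math.prod(genotype_props.values())

print(f"Most common profile frequency: {profile_frequency:.2e}")
print(f"Equivalent to about 1 in {1.0 / profile_frequency:.2e}")
```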

Some significant differences were identified between the UAE population data obtained here
and other published data (Table 3.3). The populations from Iran and Saudi Arabia differed
significantly from the UAE population (P < 0.05) at fewer loci than the populations from
Kuwait, Egypt and India. This is also supported by low FST values for the Iranian and Saudi
Arabian populations. These results support the development of population- and/or
location-specific databases, even for populations that are geographically close, such as
those within the Middle East.
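
The comparison above can be tallied directly from Table 3.3. The sketch below counts, for
each reference population, the loci whose exact-test p-value falls strictly below 0.05.
The strict inequality is an assumption: the Egyptian value of exactly 0.05 at D5S818 is
therefore not counted, and the original bold formatting that would settle this is lost.

```python
ALPHA = 0.05

POPULATIONS = ("Kuwait", "India", "Saudi Arabia", "Egypt", "Iran")

# Locus-by-locus exact-test p-values from Table 3.3, in POPULATIONS order.
P_VALUES = {
    "D8S1179": (0.00043, 0.0, 0.06934, 0.00577, 0.0),
    "D21S11": (0.18961, 0.00021, 0.03273, 0.01611, 0.3034),
    "D7S820": (0.0, 0.00105, 0.0, 0.14422, 0.11488),
    "CSF1PO": (0.00771, 0.0, 0.00199, 0.28927, 0.19687),
    "D3S1358": (0.13193, 0.01863, 0.17805, 0.10996, 0.32399),
    "TH01": (0.64225, 0.00374, 0.93339, 0.0, 0.38819),
    "D13S317": (0.45192, 0.0, 0.00781, 0.01139, 0.19152),
    "D16S539": (0.47292, 0.0, 0.68441, 0.09367, 0.58645),
    "D2S1338": (0.01664, 0.0, 0.00405, 0.40457, 0.01023),
    "D19S433": (0.0, 0.0, 0.0, 0.0, 0.0),
    "vWA": (0.0187, 0.0, 0.16488, 0.22881, 0.39942),
    "TPOX": (0.023, 0.0, 0.53425, 0.03548, 0.04426),
    "D18S51": (0.02238, 0.0, 0.11677, 0.0, 0.20286),
    "D5S818": (0.19789, 0.00021, 0.32408, 0.05, 0.03438),
    "FGA": (0.00218, 0.14595, 0.00435, 0.0, 0.00512),
}

# For each population, list the loci that differ significantly from the UAE sample.
significant_loci = {
    pop: [locus for locus, row in P_VALUES.items() if row[i] < ALPHA]
    for i, pop in enumerate(POPULATIONS)
}
counts = {pop: len(loci) for pop, loci in significant_loci.items()}

for pop in sorted(counts, key=counts.get):
    print(f"{pop}: {counts[pop]} of {len(P_VALUES)} loci differ (P < {ALPHA})")
```

Under these assumptions Iran and Saudi Arabia show the fewest significantly different loci
and India the most, which matches the ordering described in the text.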

3.4 Conclusion

This dataset establishes the characteristics of the 21 autosomal STR loci in the
GlobalFiler® amplification kit for the identification of individuals in paternity testing
and crime scene analysis in the UAE.

Table 3.1: Allele frequency data for 519 individuals from UAE Population for 21 autosomal STR loci.

Allele D3S1358 vWA D16S539 CSF1PO TPOX D21S11 D8S1179 D18S51 D2S441 D19S433 TH01

4 0.001

5 0.001

6 0.004 0.299

7 0.005 0.217

8 0.03 0.003 0.494 0.01 0.113

9 0.16 0.036 0.136 0.007 0.002 0.238

9.3 0.116

10 0.002 0.096 0.278 0.107 0.073 0.008 0.122 0.003 0.013

11 0.002 0.357 0.303 0.23 0.102 0.021 0.402 0.015 0.003

11.3 0.076

12 0.233 0.316 0.023 0.154 0.114 0.106 0.1

12.2 0.006

12.3 0.008

13 0.004 0.002 0.111 0.049 0.004 0.241 0.183 0.021 0.215

13.2 0.038

14 0.063 0.066 0.01 0.011 0.001 0.164 0.158 0.232 0.245

14.2 0.059

15 0.255 0.13 0.001 0.168 0.136 0.03 0.143

15.2 0.001 0.073

16 0.301 0.247 0.069 0.126 0.001 0.05

16.1 0.049

16.3 0.001

17 0.234 0.272 0.012 0.104 0.003

17.2 0.001

18 0.131 0.211 0.001 0.076 0.001

19 0.011 0.058 0.003 0.041

20 0.012 0.017

21 0.002 0.006

21.2 0.003

22 0.005

26 0.001

27 0.03

28 0.155

29 0.219

30 0.234

30.2 0.031

31 0.039

31.2 0.102

32 0.007

32.2 0.119

33 0.001

33.2 0.045

34 0.002

34.2 0.005

35 0.007

35.2 0.001

36 0.002

N 519 519 519 519 518 519 519 519 519 519 519

Ho 0.768 0.799 0.732 0.716 0.657 0.797 0.811 0.838 0.743 0.793 0.764

He 0.769 0.796 0.77 0.727 0.673 0.843 0.843 0.875 0.752 0.848 0.782

HWE 0.110 0.577 0.156 0.641 0.274 0.060 0.037 0.347 0.340 0.367 0.069

PD 0.907 0.925 0.915 0.875 0.846 0.957 0.955 0.971 0.902 0.961 0.918

PE 0.542 0.598 0.48 0.455 0.367 0.62 0.595 0.672 0.499 0.588 0.536

MP 0.093 0.075 0.085 0.125 0.154 0.043 0.045 0.029 0.098 0.039 0.082

PIC 0.73 0.77 0.74 0.68 0.63 0.82 0.82 0.86 0.72 0.83 0.75

TPI 2.12 2.5 1.87 1.77 1.46 2.65 2.47 3.09 1.95 2.43 2.13

N: Number of Samples, Ho: Observed Heterozygosity, He: Expected Heterozygosity, HWE: Hardy-Weinberg p-value,
PD: Power of Discrimination, PE: Power of Exclusion, MP: Match Probability, PIC: Polymorphic Information
Content, TPI: Typical Paternity Index. Alleles that were not observed at any locus are omitted from the table.

Continuation of Table 3.1

Allele FGA D22S1045 D5S818 D13S317 D7S820 SE33 D10S1248 D1S1656 D12S391 D2S1338

6 0.001

6.3 0.003

7 0.001 0.001 0.025

8 0.001 0.013 0.145 0.176 0.006 0.012

9 0.046 0.042 0.096 0.003 0.011

10 0.016 0.121 0.068 0.283 0.007

11 0.169 0.264 0.255 0.264 0.002 0.011 0.093

12 0.01 0.342 0.329 0.137 0.002 0.03 0.127

12.2 0.002

13 0.004 0.198 0.121 0.016 0.019 0.187 0.107

13.2 0.003

14 0.066 0.014 0.037 0.001 0.025 0.328 0.118 0.001 0.002

14.2 0.001

14.3 0.003 0.007

15 0.406 0.001 0.032 0.276 0.151 0.023

15.2 0.002

15.3 0.005 0.03

16 0.248 0.001 0.067 0.117 0.191 0.018 0.042

16.1 0.003

16.2 0.001

16.3 0.079 0.002 0.045

17 0.004 0.084 0.039 0.047 0.132 0.205

17.3 0.039 0.009

18 0.003 0.003 0.094 0.002 0.004 0.174 0.107

18.3 0.015 0.015

19 0.061 0.086 0.001 0.122 0.148

19.1 0.001

19.2 0.002 0.001

19.3 0.006 0.003

20 0.095 0.037 0.114 0.149

20.2 0.003

20.3 0.001

21 0.115 0.018 0.133 0.052

21.1 0.002

21.2 0.005 0.015

22 0.158 0.007 0.086 0.042

22.2 0.008 0.017

23 0.162 0.003 0.086 0.104

23.2 0.001 0.027

24 0.208 0.001 0.054 0.078

24.2 0.004 0.034

25 0.115 0.022 0.057

25.2 0.042

26 0.041 0.006 0.011

26.2 0.049

27 0.01 0.002

27.2 0.061

28 0.003

28.2 0.059

29 0.002

29.2 0.053

30.2 0.039

31.2 0.002 0.034

32 0.002

32.2 0.021

33 0.002

33.2 0.004

34 0.01

34.2 0.004

35 0.005

35.2 0.001

36 0.004

36.2 0.001

37 0.002

N 519 519 519 519 519 515 519 519 519 519

Ho 0.834 0.666 0.764 0.788 0.789 0.928 0.737 0.872 0.882 0.865

He 0.865 0.735 0.757 0.783 0.89 0.949 0.765 0.884 0.888 0.876

HWE 0.062 0.003 0.643 0.856 0.051 0.183 0.232 0.165 0.129 0.133

PD 0.966 0.893 0.896 0.92 0.919 0.993 0.907 0.974 0.975 0.97

PE 0.664 0.379 0.536 0.577 0.581 0.853 0.489 0.74 0.76 0.725

MP 0.034 0.107 0.104 0.08 0.081 0.007 0.093 0.026 0.025 0.03

PIC 0.85 0.7 0.72 0.75 0.76 0.95 0.73 0.87 0.88 0.86

TPI 3.02 1.5 2.13 2.36 2.38 6.96 1.91 3.93 4.25 3.71

CPD 0.9999999

CPE 0.9999992

CMP 6.2468 × 10⁻²⁷

N: Number of Samples, Ho: Observed Heterozygosity, He: Expected Heterozygosity, HWE: Hardy-Weinberg p-value,
PD: Power of Discrimination, PE: Power of Exclusion, MP: Match Probability, PIC: Polymorphic Information
Content, TPI: Typical Paternity Index, CPD: Combined Power of Discrimination, CPE: Combined Power of
Exclusion, CMP: Combined Match Probability. Alleles that were not observed at any locus are omitted from the
table.

Table 3.2: Calculations for theoretical most common and rarest genotype frequencies.

Locus      Alleles¹   Allele frequencies   HW² genotype   BN³ heterozygote      BN³ heterozygote
                                           proportion     genotype proportion   genotype proportion
                                                          at 0.03 FST           at 0.05 FST
D3S1358    15, 16     0.255, 0.301         0.15351        0.203679472           0.240908145
vWA        16, 17     0.247, 0.272         0.134368       0.163047194           0.216241269
D16S539    11, 12     0.357, 0.233         0.166362       0.198279695           0.249675596
CSF1PO     11, 12     0.303, 0.316         0.191496       0.224354029           0.284585861
TPOX       8, 9       0.494, 0.136         0.134368       0.169695556           0.213753173
D21S11     29, 30     0.219, 0.234         0.102492       0.128228429           0.175752793
D8S1179    13, 15     0.241, 0.168         0.080976       0.104758996           0.144462309
D18S51     13, 14     0.183, 0.158         0.057828       0.078271805           0.114813569
D2S441     11, 14     0.402, 0.232         0.186528       0.220441917           0.273021592
D19S433    13, 14     0.215, 0.245         0.10535        0.131415111           0.180246393
TH01       6, 9       0.299, 0.238         0.142324       0.17182914            0.222171096
FGA        23, 24     0.162, 0.208         0.067392       0.089269632           0.131718484
D22S1045   15, 16     0.406, 0.248         0.201376       0.236017211           0.29064288
D5S818     11, 12     0.264, 0.342         0.180576       0.213001503           0.27555507
D13S317    11, 12     0.255, 0.329         0.16779        0.199303064           0.259982879
D7S820     10, 11     0.283, 0.264         0.149424       0.179303056           0.232466834
SE33       18, 19     0.094, 0.086         0.016168       0.028289106           0.052235983
D10S1248   14, 15     0.328, 0.276         0.181056       0.213346499           0.26924128
D1S1656    15, 16     0.151, 0.191         0.057682       0.078190328           0.118059339
D12S391    18, 21     0.174, 0.133         0.046284       0.065057259           0.098027087
D2S1338    17, 20     0.205, 0.149         0.06109        0.082209052           0.118295807

Most common profile frequency                   3.48E-21       5.31E-19              3.10E-16
Discrimination power for most common profile    2.88E+20       1.88E+18              3.23E+15
Rarest profile frequency, (2pq)^21              9.955 x 10E-92

¹ Calculations based on two heterozygote alleles found in the UAE allele frequency database
² Hardy-Weinberg calculations
³ Balding-Nichols formulae at two different FST values

Table 3.3: Population differentiation (locus-by-locus exact test p-value) between UAE and five regional populations for 15

Identifiler STR Loci.

Locus      Kuwait (8)   India (8)   Saudi (8)   Egypt (8)   Iran (8)
D8S1179    0.00043      0           0.06934     0.00577     0
D21S11     0.18961      0.00021     0.03273     0.01611     0.3034
D7S820     0            0.00105     0           0.14422     0.11488
CSF1PO     0.00771      0           0.00199     0.28927     0.19687
D3S1358    0.13193      0.01863     0.17805     0.10996     0.32399
TH01       0.64225      0.00374     0.93339     0           0.38819
D13S317    0.45192      0           0.00781     0.01139     0.19152
D16S539    0.47292      0           0.68441     0.09367     0.58645
D2S1338    0.01664      0           0.00405     0.40457     0.01023
D19S433    0            0           0           0           0
vWA        0.0187       0           0.16488     0.22881     0.39942
TPOX       0.023        0           0.53425     0.03548     0.04426
D18S51     0.02238      0           0.11677     0           0.20286
D5S818     0.19789      0.00021     0.32408     0.05        0.03438
FGA        0.00218      0.14595     0.00435     0           0.00512

Note: P-values below 0.05 are statistically significant (indicated in bold in the original
table). 30,000 Markov steps were used.

3.5 References

1. Miles SB. The countries and tribes of the Persian Gulf. London: Harrison and

Sons 1919.

2. Tadmouri GO, Nair P, Obeid Y, Al-Ali MT, Al-Khaja N, Hamamy HA.

Consanguinity and reproductive health among Arabs. Reprod Health.

2009;6:17.

3. Abed I, Hellyer P. United Arab Emirates: A new perspective. London: Trident

Press 2001.

4. Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab

Emirates: Phylogenetic relationships and ancestral populations. Gene.

2014;533:411-419.

5. Schneider PM. Scientific standards for studies in forensic genetics. Forensic

Sci Int. 2007;165:238-243.

6. Excoffier L, Laval G, Schneider S. Arlequin ver. 3.0: An integrated software

package for population genetics data analysis. Evol Bioinform. 2005;1:47-50.

7. Al-Enizi M, Ge J, Ismael I, Al-Enezi H, Al-Awadhi A, Al-Duaij W et al.

Population genetic analyses of 15 STR loci from seven forensically-relevant

populations residing in the state of Kuwait. Forensic Sci Int Genet.

2013;7(4):e106-e107.

8. Ruitberg CM, Reeder DJ, Butler JM. STRBase: A short tandem repeat DNA

database for the human identity testing community. Nucl Acids Res.

2001;29(1):320-322.

Chapter 4

ALLELE FREQUENCIES OF SHORT TANDEM REPEAT MARKERS USED FOR FORENSIC APPLICATIONS IN THE
ARAB POPULATION OF THE UNITED ARAB EMIRATES

The data in this chapter augment the study in the previous chapter by including ethnic
groups not studied previously. The manuscript presented here has been accepted by Forensic
Science International (FSI): Genetics and is currently in press (Appendix 2). The candidate
was involved in developing the research question, providing technical assistance in the
laboratory, compiling the data, drafting the manuscript and undertaking the statistical
calculations. She prepared the manuscript in the format required by FSI Genetics and
assisted with its submission.

Forensic Science International: Genetics. 2017 (In Press)

Allele frequencies of Short Tandem Repeat markers used for Forensic

applications in the Arab population of the United Arab Emirates.

Rebecca J Jones1, Wafa Al Tayyare2,3, Guan K Tay1,4,5, Habiba Alsafar6,7, William

H Goodwin2.

1 School of Anatomy, Physiology and Human Biology, University of Western

Australia, Crawley, Western Australia.

2 School of Forensic and Applied Sciences, University of Central Lancashire,

Lancashire, United Kingdom.

3 Forensic Evidence Department, Abu Dhabi Police General Head Quarter, Abu

Dhabi, United Arab Emirates.

4 School of Psychiatry and Clinical Neurosciences, University of Western Australia,

Crawley, Western Australia

5 School of Medical and Health Sciences, Edith Cowan University, Joondalup,

Western Australia.

6 Center for Biotechnology, Khalifa University, Abu Dhabi, United Arab Emirates.

7 Faculty of Biomedical Engineering, Khalifa University of Science, Technology

and Research, Abu Dhabi, United Arab Emirates

4.1 Introduction

The United Arab Emirates (UAE) lies on the east coast of the Arabian Peninsula, with a
coastline on the Arabian Gulf, and neighbours Saudi Arabia to the west and Oman to the
south. The residents of the country are predominantly expatriates; according to 2015 census
data, only about 11.3% of the approximately 9.6 million residents are UAE nationals. This
subset is predominantly of Arab descent (1).

Throughout history, nomadic Arabian tribes have traversed this region of the Middle East,
which is also located at a junction of constant and regular human migration between the
African, European and Asian continents (2, 3) through the land bridge around modern Egypt.
Human dispersal from Africa across this land bridge into the Middle East was one of the two
initial out-of-Africa migration routes. The general region around the contemporary UAE,
Qatar, Bahrain and Oman, whose borders were only defined in the early 1970s, lay on the
alternative route of the initial out-of-Africa dispersal, which involved migration from the
Horn of Africa across the Red Sea and into Yemen (4).

Consanguineous marriages are common in societies throughout this region, potentially
limiting the gene pool. However, a 2014 report cited an increase in genetic diversity
within the Middle East arising from an increase in non-consanguineous marriages (5). To
appreciate the impact of cultural norms in a region that represents the hub through which
humans dispersed from Africa, it is essential that studies of genetic markers continue to
be undertaken.

The genome era has revealed, and continues to reveal, a rich vein of knowledge in

disease-susceptibility genes and in overall genetic diversity of different populations. Given

the paradox that exists in a population shaped by extensive migration patterns and

constrained by cultural practices, this study was conceived in an attempt to replicate

the previous data on a set of highly polymorphic Short Tandem Repeat (STR) loci in

the UAE population. The focus of this study, in part, was to make observations

relating to the genetic diversity within the UAE population. It adds to the knowledge

base arising from previous studies, which have also presented autosomal STR allelic

data on UAE populations (2, 6, 7). With the addition of new studies, such as this

present study, the confidence in the information for applications in forensics and

medicine improves with the iterative process and with increases in the population

sample size being interrogated.

4.2 Materials and Methods

4.2.1 Sample Description

This study involved STR analysis of DNA samples from 477 unrelated individuals

from the UAE. These individuals were Emirati Arabs of mixed ethnic origin

predominantly from Abu Dhabi. The results from these samples were then combined

with the previous STR data described in Chapter 3, which used 519 DNA samples from

unrelated Emirati subjects who were largely Emirati Bedouins (6). DNA samples of

individuals who provided consent to have their de-identified DNA samples stored for

research purposes were obtained from the Emirates Family Registry (EFR). The EFR

was established as a resource to collect DNA for genetic association studies (8).

Assistance from EFR staff for this study involved sample collection and storage, and

provision of de-identified samples such that the researchers working on the study did

not have access to any linked personal information. Prior to commencement of this

study, approval to undertake the work was obtained from the Ethics committee of the

Ministry of Health of the UAE (2011). The study was also submitted to and approved

by the Human Research Ethics Committee of the University of Western Australia

(RA/4/1/7778).

4.2.2 DNA Extraction

Buccal cells were collected from saliva using the Oragene-DNA kit (Genotek, Ottawa,

Canada) and DNA extracted from buccal cells using the prepIT-L2P system (Genotek)

in accordance with manufacturer’s instructions. The NanoDrop spectrophotometer

(Thermo Scientific, Wilmington DE, USA) was used to determine the quantity and

quality of the extracted DNA.

4.2.3 PCR Multiplex Amplification

The GlobalFiler® PCR Express Amplification Kit (Life Technologies, Carlsbad, CA,

USA) was used to amplify 24 STR markers using half volume (total 7.5 µl) reactions.

Data for 21 autosomal STRs are presented in this analysis: D3S1358, vWA, D16S539,

CSF1PO, TPOX, D8S1179, D21S11, D18S51, D2S441, D19S433, TH01, FGA,

D22S1045, D5S818, D13S317, D7S820, SE33, D10S1248, D1S1656, D12S391 and

D2S1338. The PCR was performed on the GeneAmp® PCR System 9700 following

the manufacturer’s instructions for the PCR cycle program (Life Technologies).

4.2.4 STR Typing

The PCR products were added to a 500 LIZ-Internal Standard (Life Technologies) and

analysed using an ABI 3500 DNA Genetic Analyser with POP-4™ polymer (Life

Technologies). GeneMapper® Software ID-X version 4.0 (Life Technologies) was

used for analysis.

The alleles from all 21 autosomal loci reported within this present study were labelled

according to the published nomenclatures and the guidelines for performing STR

analyses of the International Society for Forensic Genetics (ISFG) (9).

4.2.5 Statistical Analysis

The data from the present study were first compared with the previously published

STR population data (6), using locus-by-locus exact tests (20,000 Markov steps) in

Arlequin v3.5.2.1 (10), to identify any differences between the two datasets. Allele

frequencies for the 21 autosomal STR markers were calculated using GenAlEx v6.5

(11, 12). Genetic parameters, namely the Power of Discrimination (PD), Power of

Exclusion (PE), Match Probability (MP), Typical Paternity Index (TPI) and

Polymorphism Information Content (PIC), were also calculated using GenAlEx v6.5

(11, 12) with the appropriate formulas for each parameter (13). The observed and

expected heterozygosity values (Ho and He, respectively) were calculated, and

Hardy-Weinberg equilibrium (HWE) was assessed with an exact test, using Arlequin

v3.5.2.1 (10). The Bonferroni correction was carried out by dividing the critical P

value (α) by the number of comparisons being made (n) to adjust for the multiple

STR loci being tested: α ÷ n = 0.05 ÷ 21 ≈ 0.0024.
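As an illustration of how two of the allele-frequency-based parameters named above can be derived, the following is a minimal sketch using the textbook definitions of expected heterozygosity and PIC. The allele frequencies are hypothetical, the function names are illustrative, and the formulas are assumed rather than taken from the thesis; GenAlEx and Arlequin may additionally apply small-sample corrections that are not shown here.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for one locus (no small-sample correction)."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


# Hypothetical allele frequencies for a single three-allele locus
# (not taken from Table 4.2).
example_freqs = [0.5, 0.3, 0.2]
he = expected_heterozygosity(example_freqs)
pic = polymorphism_information_content(example_freqs)
print(f"He = {he:.4f}, PIC = {pic:.4f}")
```

PIC is never larger than He under these definitions, which is consistent with the PIC values in Table 4.2 sitting below the corresponding He values.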

4.3 Results and Discussion

Before the two separate datasets from the UAE population were combined for allele

frequency analysis, a locus-by-locus exact test was carried out to determine whether

there were any significant variations between them (Table 4.1). Three out of the 21

autosomal STRs (TPOX, FGA and D22S1045) showed significant differences

between the two UAE population datasets (p-value < 0.05). However, after applying

the Bonferroni correction (adjusted threshold of 0.002), no significant differences were observed.

Furthermore, when the overall locus-by-locus AMOVA test was carried out using

Arlequin v3.5.2.1, the resulting fixation index (FST) of 0.00005 showed no significant

variation between the two separate datasets. Consequently, the two datasets, totalling

996 samples, were combined for subsequent analyses.

Table 4.1: Population differentiation locus-by-locus exact test between the two

UAE population datasets for each locus

Locus       P-value
D3S1358     0.27590 ± 0.07084
vWA         0.51635 ± 0.05404
D16S539     0.46340 ± 0.05499
CSF1PO      0.27370 ± 0.6450
TPOX        0.02595 ± 0.01628*
D8S1179     0.31850 ± 0.04799
D21S11      0.50080 ± 0.04294
D18S51      0.05430 ± 0.01989
D2S441      0.09855 ± 0.04905
D19S433     0.44905 ± 0.06866
TH01        0.99075 ± 0.00272
FGA         0.00945 ± 0.00366*
D22S1045    0.0560 ± 0.0057*
D5S818      0.10040 ± 0.03256
D13S317     0.52585 ± 0.08496
D7S820      0.20525 ± 0.05905
SE33        0.44045 ± 0.08680
D10S1248    0.77245 ± 0.04410
D1S1656     0.22980 ± 0.05511
D12S391     0.36060 ± 0.06192
D2S1338     0.14675 ± 0.04035

Note: * Statistically significant p-value (P < 0.05) before the Bonferroni correction (20,000 Markov steps done).

The calculated allele frequencies and genetic parameters for the 21 autosomal STR

markers are presented in Table 4.2. The STR locus with the largest number of alleles

was SE33 (59 alleles), as seen within the first UAE STR dataset described in Chapter

3. The fewest alleles were observed for the locus D16S539 (8 alleles). Eight

of the STR markers deviated from HWE at P < 0.05 (D21S11, D18S51, D19S433, TH01,

D22S1045, SE33, D10S1248 and D2S1338). After applying the Bonferroni

correction, only three markers still deviated from HWE (D19S433, SE33 and

D2S1338). The Ho values ranged from 0.687 (TPOX) to 0.917 (SE33). High Ho

values for the UAE population have been reported in previous studies (2, 6, 7).
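The HWE counts reported above can be checked directly against the P-value rows of Table 4.2. The sketch below is illustrative only: the variable names are assumptions, and the p-values are copied from the table as printed (three decimal places), so a value shown as 0.000 simply means the reported p-value rounds to zero.

```python
ALPHA = 0.05

# HWE exact-test p-values as listed in Table 4.2 (21 autosomal loci).
hwe_p = {
    "D3S1358": 0.399, "vWA": 0.537, "D16S539": 0.383, "CSF1PO": 0.196,
    "TPOX": 0.195, "D21S11": 0.020, "D8S1179": 0.202, "D18S51": 0.012,
    "D2S441": 0.056, "D19S433": 0.000, "TH01": 0.031, "FGA": 0.294,
    "D22S1045": 0.024, "D5S818": 0.629, "D13S317": 0.648, "D7S820": 0.230,
    "SE33": 0.000, "D10S1248": 0.009, "D1S1656": 0.305, "D12S391": 0.376,
    "D2S1338": 0.000,
}

# Bonferroni-adjusted threshold: alpha divided by the number of loci tested.
threshold = ALPHA / len(hwe_p)

nominal = [locus for locus, p in hwe_p.items() if p < ALPHA]
corrected = [locus for locus, p in hwe_p.items() if p < threshold]

print(f"Bonferroni threshold: {threshold:.5f}")
print(f"Deviating before correction ({len(nominal)}): {nominal}")
print(f"Deviating after correction ({len(corrected)}): {corrected}")
```

With the tabulated values, eight loci fall below 0.05 and three remain below the adjusted threshold, in agreement with the text.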

The PD range was 0.832 (TPOX) to 0.994 (SE33) with a combined PD value of

0.9999999999. The present study has shown a larger range of PD values than seen in

the previous STR dataset from Chapter 3 with only 519 samples (6), which could be

due to the increase in sample size (7). The PD in correlation with MP supports the

high degree of polymorphism between UAE individuals. The combined MP of

2.62515 × 10⁻²⁶ is greatly reduced with the increase of sample size compared to when

only 519 samples were tested as described in Chapter 3 (6). The PE range was 0.408

(TPOX) to 0.830 (SE33) with a combined power of exclusion of 0.9999999964.

The TPOX locus has previously been observed to have the lowest PD, PE and

heterozygosity values among the loci in the previous STR dataset in Chapter 3 and in

the literature (2, 6, 7). The wide range for the PE can be expected as these values do

vary across individual cases (14). The high value for the combined PE indicates that

there is a higher fraction of the individuals with allele variations, highlighting genetic

diversity of UAE individuals. The PIC range was 0.653 (TPOX) to 0.945 (SE33).

These high informative values support the heterozygosity values indicating the high

degree of genetic polymorphism. The typical paternity index (TPI) for every marker was

larger than 1.0, making each marker useful for paternity testing applications (14).
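The PE and TPI values discussed above can be related to observed heterozygosity with a short sketch. It assumes the commonly used formulas PE = h²(1 − 2hH²) and TPI = 1 / (2H), where h is the observed heterozygosity and H = 1 − h is the observed homozygosity; the thesis cites reference 13 for its formulas but does not print them, so these are assumptions, and the function names are illustrative.

```python
def power_of_exclusion(h_obs):
    """PE = h^2 * (1 - 2 * h * H^2), with H = 1 - h (assumed formula)."""
    homozygosity = 1.0 - h_obs
    return h_obs ** 2 * (1.0 - 2.0 * h_obs * homozygosity ** 2)


def typical_paternity_index(h_obs):
    """TPI = (H + h) / (2 * H) = 1 / (2 * H), with H = 1 - h (assumed formula)."""
    homozygosity = 1.0 - h_obs
    return 1.0 / (2.0 * homozygosity)


# Observed heterozygosity values taken from the Ho rows of Table 4.2.
for locus, ho in [("D3S1358", 0.750), ("TPOX", 0.687), ("SE33", 0.917)]:
    pe = power_of_exclusion(ho)
    tpi = typical_paternity_index(ho)
    print(f"{locus}: PE = {pe:.3f}, TPI = {tpi:.3f}")
```

Under these assumptions, TPI exceeds 1.0 whenever the observed heterozygosity is above 0.5, which holds for every locus in Table 4.2.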

4.4 Conclusion

The data presented here indicate that the 21 autosomal STR markers from the

GlobalFiler® Express amplification kits have forensic applications for individual

identification and paternity testing in the local population of the UAE. The similarity

between the two datasets provides a degree of reassurance that potential errors and

biases have been reduced or eliminated. Furthermore, consolidation of the data

provides representation of the distribution of various ethnic groups that make up

nationals in the UAE. As further studies of Arab populations in the region become

available, it may be possible to develop a greater understanding of the relationships

between the different jurisdictions on the Arabian Peninsula.

Table 4.2: Allele frequency data for 996 individuals from UAE Population for 21 autosomal STR loci.

Allele D3S1358 vWA D16S539 CSF1PO TPOX D21S11 D8S1179 D18S51 D2S441 D19S433 TH01

4 0.001

5 0.001

5.3 0.001

6 0.001 0.006 0.298

6.2 0.001

7 0.003 0.004 0.211

8 0.031 0.006 0.461 0.007 0.109

8.3 0.001

9 0.151 0.030 0.142 0.006 0.001 0.002 0.242

9.3 0.119

10 0.001 0.096 0.282 0.117 0.070 0.005 0.133 0.002 0.014

10.2 0.001

11 0.001 0.359 0.310 0.238 0.096 0.019 0.401 0.018 0.003

11.2 0.001

11.3 0.082

12 0.001 0.229 0.311 0.029 0.148 0.119 0.089 0.093

12.2 0.006

12.3 0.001 0.005

13 0.004 0.001 0.123 0.048 0.003 0.240 0.179 0.023 0.215

13.2 0.001 0.036

14 0.052 0.065 0.010 0.010 0.001 0.170 0.148 0.233 0.238 0.001

14.2 0.057

15 0.257 0.132 0.001 0.001 0.186 0.137 0.031 0.140

15.2 0.001 0.085 0.001

16 0.297 0.244 0.063 0.116 0.002 0.051 0.001

16.2 0.002 0.049

17 0.246 0.268 0.011 0.117 0.003

17.2 0.001 0.005

18 0.130 0.210 0.002 0.077 0.001

18.2 0.001

19 0.014 0.068 0.002 0.049

20 0.009 0.019

21 0.002 0.005

21.2 0.002

22 0.005

24 0.001

26 0.001

27 0.024

28 0.165

29 0.219

29.2 0.001

30 0.239

30.2 0.027

31 0.044

31.2 0.096

31.3 0.001

32 0.007

32.2 0.123

33 0.002

33.2 0.038

34 0.002

34.2 0.006

35 0.006

35.2 0.001

36 0.002

n 996 996 996 996 995 996 996 996 996 996 996

Ho 0.750 0.801 0.738 0.715 0.687 0.827 0.798 0.858 0.732 0.819 0.766

He 0.766 0.799 0.771 0.725 0.697 0.839 0.839 0.878 0.752 0.852 0.782

P-value 0.399 0.537 0.383 0.196 0.195 0.020 0.202 0.012 0.056 0.000 0.031

PD 0.908 0.928 0.916 0.872 0.863 0.953 0.955 0.972 0.900 0.962 0.920

MP 0.092 0.072 0.084 0.129 0.137 0.047 0.045 0.028 0.100 0.038 0.081

PE 0.510 0.601 0.489 0.452 0.408 0.650 0.601 0.711 0.479 0.635 0.538

TPI 2.000 2.513 1.908 1.754 1.597 2.890 2.475 3.521 1.866 2.762 2.137

PIC 0.726 0.768 0.738 0.673 0.653 0.819 0.819 0.864 0.718 0.836 0.748

n: Number of Samples, Ho: Observed Heterozygosity, He: Expected Heterozygosity, P-value: Hardy-Weinberg equilibrium (HWE) exact test p-value, PD: Power of Discrimination,

PE: Power of Exclusion, MP: Match Probability, PIC: Polymorphism Information Content, TPI: Typical Paternity Index. Where no alleles were

identified, this allele number was deleted from the table.

Table 4.2 continued

Allele FGA D22S1045 D5S818 D13S317 D7S820 SE33 D10S1248 D1S1656 D12S391 D2S1338

4

4.2 0.001

5

5.3 0.001

6 0.001

6.2

6.3 0.005

7 0.001 0.001 0.021

7.3 0.001 0.001

8 0.001 0.012 0.136 0.174 0.005 0.009

8.3

9 0.041 0.044 0.088 0.002 0.010 0.002

9.3 0.001

10 0.009 0.110 0.078 0.297 0.001 0.002 0.007

10.2

11 0.165 0.287 0.251 0.255 0.002 0.010 0.100

11.2

11.3

12 0.015 0.355 0.338 0.144 0.004 0.033 0.114

12.2 0.002

12.3

13 0.003 0.180 0.117 0.021 0.016 0.186 0.109

13.2 0.003

13.3 0.001

14 0.071 0.014 0.035 0.001 0.033 0.328 0.117 0.001 0.002

14.2 0.001

14.3 0.002 0.005

15 0.413 0.001 0.001 0.029 0.284 0.145 0.023 0.001

15.2 0.001 0.001

15.3 0.005 0.030

16 0.243 0.001 0.067 0.112 0.200 0.024 0.047

16.1 0.002

16.2 0.001

16.3 0.003 0.050

17 0.004 0.077 0.001 0.087 0.034 0.055 0.127 0.197

17.2

17.3 0.001 0.032 0.007

18 0.005 0.004 0.101 0.002 0.006 0.177 0.099

18.2 0.001

18.3 0.016 0.012

19 0.063 0.082 0.001 0.123 0.140

19.1 0.001

19.2 0.005 0.001 0.001

19.3 0.005 0.002

20 0.087 0.042 0.128 0.164

20.2 0.003 0.004

20.3 0.001

21 0.117 0.018 0.125 0.064

21.1 0.001

21.2 0.004 0.015

22 0.152 0.006 0.088 0.043

22.2 0.006 0.020

23 0.167 0.003 0.089 0.109

23.2 0.002 0.021

24 0.217 0.002 0.049 0.076

24.2 0.002 0.030

25 0.109 0.001 0.019 0.048

25.2 0.001 0.034

25.3 0.001

26 0.040 0.005 0.009

26.2 0.043

27 0.007 0.002

27.2 0.054

28 0.005

28.2 0.066

28.3 0.001

29 0.002 0.002

29.2 0.059

30 0.001

30.2 0.038

31 0.001 0.001

31.2 0.001 0.036

31.3

32 0.001

32.2 0.022

33 0.002

33.2 0.006

34 0.010

34.2 0.004

35 0.004

35.2 0.001

36 0.002

36.2 0.001

37 0.002

n 991 996 996 996 996 991 996 996 996 996

Ho 0.850 0.689 0.749 0.785 0.783 0.917 0.730 0.859 0.878 0.850

He 0.863 0.732 0.746 0.782 0.788 0.948 0.763 0.883 0.886 0.877

P-value 0.294 0.024 0.629 0.648 0.230 0.000 0.009 0.305 0.376 0.000

PD 0.967 0.832 0.888 0.922 0.922 0.994 0.907 0.975 0.975 0.972

MP 0.034 0.108 0.112 0.078 0.078 0.006 0.093 0.025 0.025 0.028

PE 0.695 0.411 0.508 0.572 0.568 0.830 0.476 0.713 0.751 0.695

TPI 3.333 1.608 1.992 2.326 2.304 6.024 1.852 3.546 4.098 3.333

PIC 0.848 0.693 0.704 0.751 0.755 0.945 0.725 0.871 0.875 0.864

CPD 0.999999999999999

CMP 4.38E-27

CPE 0.999999996

n: Number of Samples, Ho: Observed Heterozygosity, He: Expected Heterozygosity, P-value: Hardy-Weinberg equilibrium (HWE) exact test p-value, PD: Power of Discrimination,

PE: Power of Exclusion, MP: Match Probability, PIC: Polymorphism Information Content, TPI: Typical Paternity Index, CPD: Combined Power of

Discrimination, CPE: Combined Power of Exclusion, CMP: Combined Match Probability. Where no alleles were identified, this allele number was deleted

from the table.

4.5 References

1. Osman AE, Alsafar H, Tay GK, Theyab J, Mubasher M, Sheikh N, et al.

Autosomal short tandem repeat (STR) variation based on 15 loci in a

population from the Central Region (Riyadh Province) of Saudi Arabia. J

Forensic Res. 2015;6(1):1-5.

2. Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab

Emirates: phylogenetic relationships and ancestral populations. Gene.

2014;533(1):411-9.

3. Kundu S, Ghosh SK. Trend of different molecular markers in the last decades

for studying human migrations. Gene. 2015;556(2):81-90.

4. Beyin A. The Bab al Mandab vs the Nile-Levant: An appraisal of the two

dispersal routes for early modern humans out of Africa. African Archaeol Rev.

2006;23(1-2):5-30.

5. Tadmouri GO, Sastry KS, Chouchane L. Arab gene geography: From

population diversities to personalized medical genomics. Glob Cardiol Sci

Pract. 2014;2014(4):394-408.

6. Ali Alhmoudi O, Jones RJ, Tay GK, Alsafar H, Hadi S. Population genetics

data for 21 autosomal STR loci for United Arab Emirates (UAE) population

using next generation multiplex STR kit. Forensic Sci Int. 2015;19:190-1.

7. Alshamali F, Alkhayat AQ, Budowle B, Watson ND. STR population diversity

in nine ethnic populations living in Dubai. Forensic Sci Int. 2005;152(2-3):267-79.

8. Alsafar H, Jama-Alol KA, Hassoun AAK, Tay GK. The prevalence of Type 2

Diabetes Mellitus in the United Arab Emirates: Justification for the

establishment of the Emirates Family Registry. International Journal of

Diabetes in Developing Countries. 2012;32(1):25-32.

9. Schneider PM. Scientific standards for studies in forensic genetics. Forensic

Sci Int. 2007;165(2-3):238-43.

10. Excoffier L, Laval G, Schneider S. Arlequin (version 3.0): An integrated

software package for population genetics data analysis. Evol Bioinform

Online. 2005;1:47-50.

11. Peakall R, Smouse PE. GenAlEx 6.5: Genetic analysis in Excel. Population

genetic software for teaching and research-an update. Bioinformatics.

2012;28(19):2537-9.

12. Peakall R, Smouse PE. GenAlEx 6: Genetic analysis in Excel. Population

genetic software for teaching and research. Mol Ecol Notes. 2006;6(1):288-95.

13. Huston KA. Statistical Analysis of STR Data. Profiles in DNA. 1998;1(3):14-5.

14. Bentayebi K, Abada F, Ihzmad H, Amzazi S. Genetic ancestry of a Moroccan

population as inferred from autosomal STRs. Meta gene. 2014;2:427-38.

Chapter 5

A COMPARATIVE ANALYSIS OF AUTOSOMAL SHORT TANDEM

REPEAT (STR) ALLELE FREQUENCIES OF POPULATIONS IN THE

UNITED ARAB EMIRATES AND SURROUNDING REGIONS

The previous chapters describe autosomal STR data from 996 unrelated UAE

individuals. This chapter compares the data collected in this project with the

autosomal STR data published for populations of the Middle East, North Africa

and South Asia to examine the relationship between these populations. The

analysis further increases our knowledge of diversity in these regions.

5.1 Introduction

Population genetic analyses provide important advancements in studying human

genetic diversity for applications in forensic studies, paternity identification and

medical research. Such DNA-based analyses have also provided the foundation to

examine the impact of human migration routes on the degree of genetic variation

amongst populations. Specifically, the use of the highly variable autosomal Short

Tandem Repeats (STRs) allows for population-specific DNA analyses that can provide

an understanding of the extent of human genetic variation amongst neighbouring and

geographically distant populations.

The Middle East is located amongst some of the earliest known bidirectional human

migration routes, linking dispersal through Africa, Asia and Europe (1-3). However,

there is a paucity of genetic analyses that have included Middle Eastern populations

relative to populations from other parts of the world (4). The migration out of Africa

is postulated to have involved at least two routes, one through the Horn of Africa

(into the Arabian Peninsula) and one through the Levant region, both into the Middle East

(see Figure 5.1). It is important to understand the impact of the two separate migration

routes and the ongoing human dispersal that is continuously taking place throughout

the Middle East region on the genetic diversity of the populations in the region.

The United Arab Emirates (UAE) plays a significant role in these ancient migration

routes as the proposed route from the Horn of Africa through the UAE and into South

Asia has been highlighted in the literature with the use of DNA analyses (5).

Archaeological discoveries have also indicated the climatic conditions in the UAE

(and the Hajar Mountains of Oman) at this time would have been conducive to

settlement and migration (4).

Figure 5.1: Ancient human migration routes involving the Middle East. Map

generated using published data (1-8).

The UAE consists of seven Emirates: Abu Dhabi, Dubai, Sharjah, Ajman, Ra’s Al-Khaymah,

Al-Fujairah and Umm Al-Quwain. The country is known for its trade

expansions over several decades with countries in surrounding regions such as Iran

and other South Asian populations (9) and Dubai has been identified as the major

maritime trading centre of the lower Arabian Peninsula. Additionally, during the

oil-revenue-financed development boom of the 1960s and 1970s, Western relationships (such as

with the United Kingdom) were also established (9). In more recent times, the

residents of the UAE have been observed to be predominantly expatriates, with the

remainder being UAE nationals of mixed ethnic backgrounds. The UAE nationals are

primarily of Arab descent, with additional native populations and tribal

groups such as the Bedouins. Even with the increase in expatriates due to trading

relationships, the Arab populations have retained their traditional culture.

The UAE is an important population to consider in the region at the genetic-level due

to the ethnically diverse residents of the country resulting from historic tribal group

migrations and advancements in trade relationships. However, the UAE population

has been minimally examined at the genetic level and compared to other populations

from surrounding regions. Accordingly, the aim of this study was to advance

knowledge from previous genetic-based studies (10-12) about the UAE by increasing

sample size and the range of selected populations for comparison to examine the

impact of human dispersal through the Middle East, North Africa and South Asia and

the resulting genetic relationships.

5.2 Methods

The allele frequency data used for this study were previously presented in Chapter 4

and describe the autosomal STR loci polymorphisms for 996 Arabic nationals from

the UAE. To complete the analysis, published population data from the Middle East

and North Africa (MENA region) and select South Asian countries were collected.

The Middle East populations included the UAE (11), Saudi Arabia (13), Yemen (10),

Oman (10), Qatar (14), Kuwait (15), Iraq (16), Jordan (17), Syria (18), and Lebanon

(19). North African and South Asian population data were from Egypt (20), Libya

(21), Malta (22), Tunisia (23), Algeria (24), Morocco (25), Iran (7), Pakistan (26),

India (27), and Bangladesh (28) (Figure 5.2).

The statistical analyses for the inter-population comparisons involved the six common

loci used in all the published data (namely vWA, CSF1PO, TPOX, TH01, D13S317

and D7S820). Initially, the combined statistical parameters and heterozygosity values

for the six common loci for each population were calculated using the data from

publications. The combined statistical parameters calculated for each population were

the Combined Power of Discrimination (CPD), Combined Power of Exclusion (CPE)

and Combined Match Probability (CMP) using the appropriate statistical equations to

determine the relevance of the six loci in the analysis (29).
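The combination step described above can be sketched as follows. The thesis does not print the equations it used, so the product-rule forms here are assumptions: CMP as the product of per-locus MP values, CPD as one minus the product of (1 − PD), and CPE as one minus the product of (1 − PE). The per-locus inputs are the present-study values for the six common loci as listed in Table 4.2; the function names are illustrative.

```python
from math import prod

# Per-locus values for the six common loci from Table 4.2
# (order: vWA, CSF1PO, TPOX, TH01, D13S317, D7S820).
pd_values = [0.928, 0.872, 0.863, 0.920, 0.922, 0.922]
mp_values = [0.072, 0.129, 0.137, 0.081, 0.078, 0.078]
pe_values = [0.601, 0.452, 0.408, 0.538, 0.572, 0.568]


def combined_match_probability(mps):
    """CMP: product of the per-locus match probabilities (assumed formula)."""
    return prod(mps)


def combined_power_of_discrimination(pds):
    """CPD: 1 minus the product of (1 - PD) over loci (assumed formula)."""
    return 1.0 - prod(1.0 - p for p in pds)


def combined_power_of_exclusion(pes):
    """CPE: 1 minus the product of (1 - PE) over loci (assumed formula)."""
    return 1.0 - prod(1.0 - p for p in pes)


cmp_value = combined_match_probability(mp_values)
cpd_value = combined_power_of_discrimination(pd_values)
cpe_value = combined_power_of_exclusion(pe_values)
print(f"CPD = {cpd_value:.9f}, CPE = {cpe_value:.6f}, CMP = {cmp_value:.2e}")
```

Because the inputs are rounded to three decimal places, the results should be read as approximations; under these assumptions they come out close to the values listed for the present UAE study in Table 5.1.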

[Map labels: Morocco, Algeria, Tunisia, Libya, Egypt and Malta (North Africa); Lebanon, Syria, Jordan, Iraq, Kuwait, Saudi Arabia, Qatar, U.A.E., Oman and Yemen (Middle East); Iran, Pakistan, India and Bangladesh (South Asia).]

Figure 5.2: Geographical location of the countries with published population data

used for the meta-analysis in this study. Map generated using published data (7, 10,

11, 13-28).

The FST values of the shared six loci were calculated with Fisher’s locus-by-locus

Exact Test (20,000 Markov steps) in the Arlequin v3.5.2.1 software (30).

For each of the FST values from the locus-by-locus exact test calculated, the number

of significant differences (P-value < 0.05) between any two populations was counted

for all six loci. Additionally, an extended set of 15 of the most commonly published

STR loci (namely D3S1358, vWA, D16S539, CSF1PO, TPOX, D21S11, D8S1179,

D18S51, D19S433, TH01, FGA, D5S818, D13S317, D7S820 and D2S1338) was

compared and the percentage of significant differences using the FST values was

calculated where data were available between any two populations.
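The counting step described above, for one pair of populations, can be sketched as follows. The p-values are hypothetical placeholders rather than results from this study, the function name is illustrative, and a missing locus is represented by None on the assumption that it is simply excluded from the denominator.

```python
def summarise_pair(p_values, alpha=0.05):
    """Return (n_significant, n_tested, percent_significant) for one pair.

    p_values maps locus name to an exact-test p-value, or to None where
    the locus was not available for one of the two populations.
    """
    tested = {locus: p for locus, p in p_values.items() if p is not None}
    n_significant = sum(1 for p in tested.values() if p < alpha)
    n_tested = len(tested)
    percent = 100.0 * n_significant / n_tested if n_tested else float("nan")
    return n_significant, n_tested, percent


# Hypothetical locus-by-locus p-values for a single population pair.
example_pair = {
    "vWA": 0.012, "CSF1PO": 0.430, "TPOX": 0.049,
    "TH01": 0.200, "D13S317": None, "D7S820": 0.003,
}
n_sig, n_tested, pct = summarise_pair(example_pair)
print(f"{n_sig}/{n_tested} loci significantly different ({pct:.1f}%)")
```

Reporting a percentage rather than a raw count keeps pairs comparable when the 15-locus set is only partly available for some populations.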

5.3 Results

5.3.1 Within-population Genetic Variability Measures

By using the published statistical parameters of the six common loci for the

populations (Tunisia and Bangladesh parameters not available), the combined

parameters were calculated. The CPD was similar in all populations with the range of

0.99999913 (Libya) to 0.999999871 (Yemen) and the CPE was high overall, with a

range of 0.950134 (Qatar) to 0.999432 (Jordan). The CMP varied between

populations, but all probabilities were low, below 5.00 × 10⁻⁵ (Table 5.1).

Table 5.1: The calculated combined parameters for the six common loci for each

population.

POPULATION             COMBINED POWER OF    COMBINED POWER    COMBINED MATCH
                       DISCRIMINATION       OF EXCLUSION      PROBABILITY
UAE (PRESENT STUDY)    0.999999386          0.988943          6.27E-07
SAUDI ARABIA (13)      0.999998839          0.975964          1.16052E-06
QATAR (14)             0.999999381          0.950134          6.19446E-07
UAE (11)               0.999999076          0.979605          9.24449E-07
OMAN (10)              0.999998055          0.990594          1.94513E-06
YEMEN (10)             0.999999871          0.988433          1.40457E-06
LEBANON (19)           0.99999922           0.983311          7.79662E-07
SYRIA (18)             0.999999139          0.988335          8.6078E-07
JORDAN (17)            0.999999446          0.999432          5.5442E-07
IRAQ (16)              0.999998422          0.962225          1.57809E-06
KUWAIT (15)            0.999998194          0.978309          1.80627E-06
EGYPT (20)             0.999998697          0.982859          1.30333E-06
LIBYA (21)             0.99999913           0.986345          8.70071E-07
ALGERIA (24)           0.999998533          0.994072          1.46701E-06
MOROCCO (25)           0.999999545          0.987986          4.59474E-06
MALTA (22)             0.999999472          0.9894            5.2786E-07
IRAN (7)               0.999999034          0.986423          9.65997E-07
PAKISTAN (26)          0.999999398          0.981284          6.01986E-07
INDIA (27)             0.99999616           0.998577          3.8403E-06

The observed heterozygosity values for the common six loci are shown in Figure 5.3.

The heterozygosity values for the majority of populations, except for Jordan, were

high for the loci, representing substantial genetic diversity. The heterozygosity values

for the six loci for the Jordanian population ranged from a low of 0.220 (vWA) to

0.390 (TPOX), reflecting a more homogeneous population. The remaining

populations, including the present UAE study, varied in the range of values with vWA

heterozygosity values between 0.542 (Qatar) and 0.889 (Tunisia), CSF1PO between

0.568 (Tunisia) and 0.889 (India), TPOX between 0.563 (Kuwait) and 0.790

(Morocco), TH01 between 0.713 (Yemen) and 0.825 (Syria and Iraq), D13S317

between 0.618 (Qatar) and 0.812 (Yemen), and D7S820 between 0.713 (Yemen) and

0.907 (India).

5.3.2 Regional Population Genetic Comparison of the Middle East

The locus-by-locus exact test was carried out between each of the 21 populations and

compared for the common six loci. Table 5.2 summarises the number of significant

differences (p-value < 0.05) found between the FST values compared. Initially, the

Middle East regional populations were compared to each other. The highest number

of significant differences between populations (6/6 loci) was observed for Saudi

Arabia, Jordan and Lebanon with other Middle East countries.

When data from the present UAE study was compared to the other Middle East

populations, Saudi Arabia was found to have the greatest number of significant

differences (6/6 loci) with Kuwait, Jordan and Lebanon also having a high number of

significant differences with the UAE (5/6 loci). The fewest significant

differences with the present UAE study were found with Syria and the previously

published UAE dataset (2/6 loci), then Iraq, Qatar and Yemen (1/6 loci), with Oman

and the present UAE study showing no significant differences between the loci.

Figure 5.3: Observed heterozygosity values for the common six loci for each of the 21 populations in the analysis.

[Bar chart "Observed Heterozygosity". x-axis: Populations (Present Study, Morocco, Algeria, Tunisia, Malta, Libya, Egypt, Lebanon, Jordan, Syria, Iraq, Kuwait, Saudi, Qatar, UAE, Oman, Yemen, Iran, Pakistan, India, Bangladesh). y-axis: Heterozygosity Frequency (0 to 1). Legend, Autosomal Loci: vWA, CSF1PO, TPOX, TH01, D13S317, D7S820.]

Table 5.2: Number of significant differences (p-value<0.05) between two populations for the common six loci in the analysis.

5.3.3 Inter-population Genetic Comparison

When comparing the North African populations, the fewest significant differences were

observed between Libya, Algeria and Egypt (2/6 loci), but the other populations had a high number

of significantly different loci. The South Asian countries shared fewer significant

differences with each other, with the greatest number of significant differences

between Pakistan and India (4/6 loci).

When the present UAE study was compared to the data from the South Asian

populations the number of significant differences ranged from 6/6 loci with India to

1/6 loci with Iran. Furthermore, when the present UAE study was compared to the

North African populations, the greatest number of significant differences was seen

with Morocco (6/6 loci), Tunisia (4/6 loci), then with Malta, Algeria and Egypt (3/6

loci), with the fewest significant differences with Libya (2/6 loci).

5.3.4 Effect of Increased Number of STR Markers

When 15 of the most commonly used STR markers in publications were compared

between the present UAE study and the other 20 populations, a better delineation of

the genetic relationships was obtained than observed with the examination of the

common six STR loci (see Appendix 3). Table 5.3 summarises the number of

significant differences between any two populations. It shows that the present

UAE study exhibited numerous significant differences (p-value ≤ 0.05) with

populations from Morocco, Saudi Arabia, Lebanon, Kuwait, Jordan and India (more

than 80% significant differences overall). The populations with the fewest

significant differences from the present UAE study when a greater number of markers

was used were the published UAE dataset, Oman, Yemen and Iraq (less

than 20% significant differences overall).

Between the present UAE study and Oman, only 1 out of 13 loci (8%) was significantly

different, indicating a genetic relationship with more confidence than with the use of

only six loci. The improved confidence in genetic relationships was also observed

between populations with more significant differences. When the common six STR

loci were compared, India and Morocco both exhibited 100% significant differences

with the present UAE study. However, the extent of genetic difference between

each of these populations and the present UAE study was refined using 15 STR loci:

India showed only 80% significant differences with the present UAE study,

whereas Morocco still showed 100% significant differences,

providing a greater understanding of the degree of genetic relationship

between the populations.

Table 5.3: Number of significant differences (p-value<0.05) between two populations for the common loci in the analysis (up to 15 loci).

5.4 Discussion

5.4.1 Significance of Inter-population Genetic Comparisons

By comparing genetic relationships between the populations from the MENA and

South Asian regions, we can gain a greater understanding of the extent of ethnic

admixture with respect to human migration using autosomal STR analysis.

Unfortunately, the publications used in this analysis did not report allele frequencies

for the same set of loci, which limited the number of loci that could be

compared between populations. The reduced number of STR loci analysed still

produced high discriminatory and exclusionary values in all populations from North

Africa, the Middle East and South Asia, and highlighted how polymorphic the shared

six loci were for human identification, paternity testing and disease susceptibility

applications. However, when compared to the original datasets for all populations the

resulting PD, PE and MP values improved with the increase of STR loci studied in the

individual populations (7, 11, 15, 18, 20, 21, 25). An increase in autosomal STR loci

in the literature will provide access to an increased number of STR loci to be compared

between different populations in the future.

5.4.2 Heterozygosity Analysis

Analysing observed heterozygosity values in populations provides information on the

genetic structure and potential genetic history of the particular population. Across the

common six loci tested in the different populations, observed heterozygosity was

high, ranging from 0.542 (Qatar; vWA) to 0.907 (India; D7S820), which is

consistent with North Africa, the Middle East and South Asia having observable ethnic

admixture and genetic diversity. The Jordanian population, however, has the lowest

heterozygosity values. The known historical background and significant geographical

location of Jordan as a “major transit zone” would provide the expectation of larger

observed heterozygosity values from the residing populations. However, the literature

describes outlier populations residing in Jordan amongst groups such as the Bedouin

(17, 31). Furthermore, the observed low heterozygosity values seen in this analysis

could also, in part, be due to common consanguineous marriages amongst Jordanian

populations and the data being retrieved from a relatively isolated population within

the region (17).
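Observed heterozygosity at a locus is the proportion of sampled individuals that carry two different alleles at that locus. The sketch below illustrates the calculation under stated assumptions: the function name and the genotypes are hypothetical, and the values do not come from any dataset in this analysis.

```python
from typing import Iterable, Tuple

Genotype = Tuple[float, float]  # the two allele calls (repeat numbers) at one locus


def observed_heterozygosity(genotypes: Iterable[Genotype]) -> float:
    """Proportion of individuals whose two alleles differ at the locus."""
    genotypes = list(genotypes)
    if not genotypes:
        raise ValueError("at least one genotype is required")
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)


# Hypothetical genotypes for eight individuals at a single locus (illustration only).
sample = [(14, 16), (15, 15), (16, 18), (14, 17), (17, 17), (15, 19), (16, 16), (14, 18)]

print(observed_heterozygosity(sample))  # prints 0.625 (5 of 8 individuals are heterozygous)
```

A population with many homozygous individuals, for example as a result of consanguinity or isolation, would yield a lower value from the same calculation.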

5.4.3 Inter-population Genetic Comparison using Six Autosomal STR Loci

5.4.3.1 Regional genetic comparisons with the present UAE study

The locus-by-locus test for the common six loci showed the dataset from the Saudi

Arabia population was significantly different at all of the common loci with the present

UAE study. Saudi Arabia is a large country, which includes Bedouin and isolated

tribes that have not dispersed from the region and have inhabited the region for

thousands of years (13). This is supported by the degree of significant differences

observed in this study between Saudi Arabia and the present UAE study, Lebanon and

Jordan. Fewer significant differences between Saudi Arabia and Arabian Peninsula

populations such as Oman are likely to reflect the historical migration of Omani

individuals to Saudi Arabia (and additional peninsula countries) prior to the 1970s for

work and education (9, 13). Previous analysis of historical human dispersal between

Yemen and Saudi Arabia (4) indicates that these two populations are similar.

However, the analysis here shows that additional STR loci are required to improve the

resolution of this study.

Minimal significant differences were seen between the present UAE study and the

previously published UAE dataset (11). This supports the present

study representing the genetic information of the UAE. The present UAE study also

showed minimal significant differences at the common six loci with Arabian Peninsula

populations (excluding Saudi Arabia and Kuwait). The lack of significant differences

between Oman and the UAE is indicative of the neighbouring countries sharing

historical events relating to dispersal and ethnic admixture (4, 9). Furthermore, the

genetic relationship between the UAE, Yemen and Qatar likely reflects the shared

human dispersal route from the initial migration out of Africa from the Horn of Africa

and into the Arabian Peninsula. The greater distance between Kuwait and additional

Levant populations of Jordan and Lebanon is indicative of how the Levant region

shared a different initial migration route out of Africa through Egypt and into Europe

(1). Lebanon has been highlighted within the literature as being part of an expansion

of ethnicities during historical times such as during the Ottoman Empire and involving

a greater variety of cultural and religious practices within the ethnicities in the country

(32). This is supported in this analysis as the Lebanese population shared a number

of close relationships (indicated by fewer significant differences) with North African

populations such as Malta, Middle Eastern populations such as Syria and Iraq and

South Asian populations such as Iran.

5.4.3.2 Broader population genetic comparison with the present UAE study

The Moroccan and Indian populations were seen to have significant differences with

the present UAE study for all of the common loci. Additionally, Morocco appeared

to be one of the more genetically distant populations compared to all the populations

in this analysis. This observation supports the literature, which indicates no

relationship clustering in published Multidimensional Scaling plots between Morocco

and Middle East populations such as Syria (18, 25), and North African populations

such as Libya (21) and Egypt (18). The degree of variation amongst the Moroccan

population and the North African and Middle East populations can be explained by

the additional migration between Morocco and European countries such as Spain

(25).

The Indian population also showed a large number of significant differences when

compared to the populations of the Middle East and North Africa. There are many

endogamous groups across the large country that is India (33). The genetic literature

on ethnic history suggests that the Indian population falls into two broad

groups, commonly known as the Dravidians and the Aryans. The tribal groups

in contemporary India arose historically from these two groups, and inhabit different

areas throughout India. This presumably increases the observed heterozygosity and

the presence of genetic diversity, with cultural and linguistic differences seen between

closely proximate groups (33, 34). Due to the high degree of genetic diversity of the

Indian population, it is not surprising that genetic differences are seen between the

Indian population, the Middle East and North Africa.

Fewer significant differences were seen between the present UAE study and Iran. The

Iranian population was also seen to show fewer significant differences with other Middle

East populations such as Iraq, the UAE (published data) and Lebanon. These genetic

relationships are supported in the literature as Iran plays a significant role in human

dispersal from South Asia and Europe, such as maritime trade across the Persian Gulf

to the UAE and historical events shared between the Levant populations (7, 35). A

greater degree of significant differences is seen with further geographical distance

between the Middle Eastern and South Asian countries. This is supported in the

literature (36) and the analysis shows the other South Asian countries (Pakistan,

Bangladesh and India) compared to the present UAE study and other Middle Eastern

countries exhibit an increase in significant differences with greater geographical

distance between them. The impact of geographical distance on the number of

significant genetic differences was also observed between the North African

populations with closer proximity to the Middle East when compared to the present

UAE study. Libya and Egypt had fewer significant genetic differences with the present

UAE study than Malta and Morocco did.

5.4.4 Significance of Increasing Number of Autosomal STR Loci in Analyses

Even though the calculated parameters for the common six loci showed strong CPD

and CPE and low CMP results, with high heterozygosity values (excluding

Jordan), the number of loci used in such analyses must be increased. Interpretation

of the results improved when the number of autosomal STRs compared

between the present UAE study and the published population data was increased

to up to 15 loci. For example, when the present UAE study was compared to the UAE publication

for the common six loci (33.33% significant differences), the outcome did not reflect

the likely “true” relationship as well as the comparison of 15 loci did

(13.33% significant differences), which indicates a closer genetic

relationship between the two datasets. Furthermore, the use of 15 loci gives a better

understanding of the degree of difference between populations with more significant

differences. The Indian population showed 100% significant differences when the

common six loci were compared to the present UAE study; however, this was

reduced to 80% when 15 loci were compared,

indicating less genetic difference between India and the present UAE study

than between the Moroccan population and the present UAE study

(100% significant differences). Furthermore, inconsistent relationships, such as that seen

between Yemen and Saudi Arabia, have to be analysed further with additional STR

markers.
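The percentages quoted in this section are the number of significantly different loci divided by the number of loci compared. The sketch below reproduces that arithmetic. The function name is hypothetical; the counts 2/6 and 1/13 are reported in the Results, while 2/15 is inferred here from the reported 13.33% and is an assumption.

```python
def percent_significant(n_significant: int, n_loci: int) -> float:
    """Percentage of compared loci that differ significantly, to two decimals."""
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    if not 0 <= n_significant <= n_loci:
        raise ValueError("n_significant must lie between 0 and n_loci")
    return round(100 * n_significant / n_loci, 2)


# (significant loci, loci compared) for the present UAE study versus each dataset.
comparisons = {
    "published UAE dataset, six loci": (2, 6),
    "published UAE dataset, 15 loci": (2, 15),  # inferred from 13.33%, not stated directly
    "Oman, 13 loci": (1, 13),
}

for label, (k, n) in comparisons.items():
    print(f"{label}: {k}/{n} = {percent_significant(k, n):.2f}%")
# published UAE dataset, six loci: 2/6 = 33.33%
# published UAE dataset, 15 loci: 2/15 = 13.33%
# Oman, 13 loci: 1/13 = 7.69%
```

The same two significant loci therefore carry less weight as the panel grows, which is why the 15-locus comparison suggests a closer relationship between the two UAE datasets than the six-locus comparison does.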

5.5 Conclusion

This analysis has provided the ability to understand the variety of significant

differences amongst populations from the Middle East, North Africa and South Asia.

The results have indicated the relationship between historical events, migration routes

and likely impacts upon genetic diversity. Future research on populations using an

increased number of autosomal STR loci and greater sample sizes would improve

understanding of the factors that impact genetic diversity.

The UAE population shows wide variation of relationships throughout the MENA

region and South Asia. The significant relationships coincide with migratory factors,

cultural and linguistic relationships and past and present trade across the region. This

analysis has provided greater understanding of multi-ethnic countries within the

Middle East. Additionally, the analysis has highlighted the importance of increasing

the number of autosomal STRs in the analyses to provide a greater understanding of

the degree of genetic variation amongst close and distant populations.

5.6 References

1. Kundu S, Ghosh SK. Trend of different molecular markers in the last decades

for studying human migrations. Gene. 2015;556(2):81-90.

2. Forster P. Ice Ages and the mitochondrial DNA chronology of human

dispersals: A review. Philos Trans R Soc Lond B Biol Sci.

2004;359(1442):255-64.

3. Maca-Meyer N, González AM, Pestano J, Flores C, Larruga JM, Cabrera VM.

Mitochondrial DNA transit between West Asia and North Africa inferred from

U6 phylogeography. BMC Genet. 2003;4:1-11.

4. Petraglia MD, Rose JI. The evolution of human populations in Arabia:

Paleoenvironments, prehistory and genetics. Netherlands: Springer 2009.

5. Beyin A. The Bab al Mandab vs the Nile-Levant: An Appraisal of the two

dispersal routes for early modern humans out of Africa. Afr Archaeol Rev.

2006;23(1-2):5-30.

6. Ermini L, Der Sarkissian C, Willerslev E, Orlando L. Major transitions in

human evolution revisited: A tribute to ancient DNA. J Hum Evol. 2015;79:4-20.

7. Shepard EM, Herrera RJ. Iranian STR variation at the fringes of

biogeographical demarcation. Forensic Sci Int. 2006;158(2-3):140-8.

8. Fernandes V, Alshamali F, Alves M, Costa MD, Pereira JB, Silva NM, et al.

The Arabian cradle: Mitochondrial relicts of the first steps along the southern

route out of Africa. Am J Hum Genet. 2012;90:347-55.

9. Anthony JD, Hearty JA. Eastern Arabian States: Kuwait, Bahrain, Qatar, the

United Arab Emirates, and Oman. The Government and Politics of the Middle

East and North Africa. Colorado: Westview Press 1980.

10. Alshamali F, Alkhayat AQ, Budowle B, Watson ND. STR population diversity

in nine ethnic populations living in Dubai. Forensic Sci Int. 2005;152(2-3):267-79.

11. Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab

Emirates: phylogenetic relationships and ancestral populations. Gene.

2014;533(1):411-9.

12. Ali Alhmoudi O, Jones RJ, Tay GK, Alsafar H, Hadi S. Population genetics

data for 21 autosomal STR loci for United Arab Emirates (UAE) population

using next generation multiplex STR kit. Forensic Sci Int. 2015;19:190-1.

13. Osman AE, Alsafar H, Tay GK, Theyab J, Mubasher M, Sheikh N, et al.

Autosomal short tandem repeat (STR) variation based on 15 loci in a

population from the Central Region (Riyadh Province) of Saudi Arabia. J

Forensic Res. 2015;6(1):1-5.

14. Perez-Miranda AM, Alfonso-Sanchez MA, Pena JA, Herrera RJ. Qatari DNA

variation at a crossroad of human migrations. Hum Hered. 2006;61(2):67-79.

15. Alenizi M, Goodwin W, Ismael S, Hadi S. STR data for the AmpFlSTR

Identifiler loci in Kuwaiti population. Leg Med. 2008;10(6):321-5.

16. Barni F, Berti A, Pianese A, Boccellino A, Miller MP, Caperna A, et al. Allele

frequencies of 15 autosomal STR loci in the Iraq population with comparisons

to other populations from the middle-eastern region. Forensic Sci Int.

2007;167(1):87-92.

17. Azab M, Al-Bashir N, Momani SN, Al-Nasser A, Alkaraki AK, Khabour O.

Comparison between frequencies of several STRs loci in Jordan with

neighboring countries. Jordan Med J. 2010;44(1):55-60.

18. Abdin L, Shimada I, Brinkmann B, Hohoff C. Analysis of 15 short tandem

repeats reveals significant differences between the Arabian populations from

Morocco and Syria. Leg Med. 2003;5:S150-S5.

19. El Andari A, Othman H, Taroni F, Mansour I. Population genetic data for 23

STR markers from Lebanon. Forensic Sci Int Genet. 2013;7(4):e108-13.

20. Coudray C, Guitard E, el-Chennawi F, Larrouy G, Dugoujon JM. Allele

frequencies of 15 short tandem repeats (STRs) in three Egyptian populations

of different ethnic groups. Forensic Sci Int. 2007;169(2-3):260-5.

21. Khodjet-el-Khil H, Fadhlaoui-Zid K, Gusmao L, Alves C, Benammar-Elgaaied A,

Amorim A. Allele frequencies for 15 autosomal STR markers in

the Libyan population. Ann Hum Biol. 2012;39(1):80-3.

22. Cassar M, Farrugia C, Vidal C. Allele frequencies of 14 STR loci in the

population of Malta. Leg Med. 2008;10(3):153-6.

23. Cherni L, Loueslati Yaacoubi B, Pereira L, Alves C, Khodjet-El-Khil H, Ben

Ammar El Gaaied A, et al. Data for 15 autosomal STR markers (Powerplex 16

System) from two Tunisian populations: Kesra (Berber) and Zriba (Arab).

Forensic Sci Int. 2005;147(1):101-6.

24. Bosch E, Clarimon J, Perez-Lezaun A, Calafell F. STR data for 21 loci in

northwestern Africa. Forensic Sci Int. 2001;116:41-51.

25. Bentayebi K, Abada F, Ihzmad H, Amzazi S. Genetic ancestry of a Moroccan

population as inferred from autosomal STRs. Meta Gene. 2014;2:427-38.

26. Rakha A, Yu B, Hadi S, Sheng-bin L. Population genetic data on 15 autosomal

STRs in a Pakistani population sample. Leg Med. 2009;11(6):305-7.

27. Singh A, Trivedi R, Kashyap VK. Polymorphisms at fifteen tetrameric short

tandem repeat loci in three ethnic populations of Bengal, India. Leg Med.

2006;8(3):191-3.

28. Ferdous A, Ali ME, Alam S, Hasan M, Hossain T, Akhteruzzaman S. Forensic

evaluation of STR data for the PowerPlex 16 System loci in a Bangladeshi

population. Leg Med. 2009;11(4):198-9.

29. Huston KA. Statistical analysis of STR data. Profiles in DNA. 1998;1(3):14-5.

30. Excoffier L, Laval G, Schneider S. Arlequin (version 3.0): An integrated

software package for population genetics data analysis. Evol Bioinformatics

Online. 2005;1:47-50.

31. Zanetti D, Sadiq M, Carreras-Torres R, Khabour O, Alkaraki A, Esteban E, et

al. Human diversity in Jordan: Polymorphic Alu insertions in general

Jordanian and Bedouin groups. Hum Biol. 2014;86(2):131-8.

32. Chouery E, Coble MD, Strouss KM, Saunier JL, Jalkh N, Medlej-Hashim M,

et al. Population genetic data for 17 STR markers from Lebanon. Leg Med.

2010;12(6):324-6.

33. Ashma R, Kashyap VK. Genetic profile based upon 15 microsatellites of four

caste groups of the eastern Indian state, Bihar. Ann Hum Biol. 2003;30(5):570-8.

34. Fareed M, Afzal M. Genetic structure of human populations based on 5 gene

loci: A preliminary report. Northern India Gene Reports. 2016;4:244-8.

35. Grugni V, Battaglia V, Kashani B, Parolo S, Al-Zahery N, et al. Ancient

migratory events in the Middle East: New clues from the Y-chromosome

variation of modern Iranians. PLoS One. 2012;7(7):1-14.

36. Silva NM, Pereira L, Poloni ES, Currat M. Human neutral genetic variation

and forensic STR data. PLoS One. 2012;7(11):1-11.

Chapter 6

Y-CHROMOSOME STR HAPLOTYPES CAN BE USED TO

DIFFERENTIATE LINEAGES IN THE UNITED ARAB EMIRATES

POPULATION

The following chapter has been submitted and presented in the format for the

journal Annals of Human Biology. The study was developed to address a gap in

the knowledge related to genetic diversity of populations of the Middle East. Y-

STR markers were analysed for this purpose. The research questions were

developed by the candidate in collaboration with her supervisors. She also

provided technical assistance in the laboratory, collated the results and

prepared the multiple drafts of the manuscript.

Annals of Human Biology (submitted)

Y-Chromosome STR haplotypes can be used to differentiate lineages in the

United Arab Emirates population

Rebecca J Jones1, Guan K Tay1,2,3, Aurélie Mawart4 and Habiba Alsafar 4,5.

1 School of Anatomy, Physiology and Human Biology, University of Western

Australia, Crawley, Western Australia.

2 School of Psychiatry and Clinical Neurosciences, University of Western Australia,

Crawley, Western Australia.

3 School of Medical and Health Sciences, Edith Cowan University, Joondalup,

Western Australia.

4 Center for Biotechnology, Khalifa University of Science, Technology and

Research, Abu Dhabi, United Arab Emirates.

5 Faculty of Biomedical Engineering, Khalifa University of Science, Technology and

Research, Abu Dhabi, United Arab Emirates.

6.1 Introduction

The Y-chromosome haplotype is commonly constructed using Short Tandem Repeat

(STR) markers. As the Y-chromosome is subject to rapid genetic drift, haplotypes can

be used to study the geographical distribution of ethnic groups (1). The Y-

chromosome contains the largest non-recombining section within the human genome,

providing informative haplotypes for genetic analyses of populations (2). The

investigation into how the male lineage contributes to migration and population

distribution requires an understanding of the global distribution of these haplotypes.

Worldwide population data on the Y-chromosome indicate that these haplotypes are

region-specific, providing applications in genetic studies, human identification,

forensic investigation and paternity testing (3).

This study describes the first published information for the expanded 27 Y-

chromosome STR (Y-STR) panel for an Arab population within the United Arab

Emirates (UAE), designed to explore the impact of increasing the number of Y-STR

loci on population genetic analyses. The population studied comprised nationals of

the UAE, a country located on the south-eastern tip of the Arabian Peninsula,

comprising seven Emirates or principalities. The geographical location of the Arabian

Peninsula, at the crossroads of the African, Asian and European continents, means that

populations of this region have been shaped by the initial human migration out of

Africa and subsequently by the bidirectional flow of people between the three

continents. Factors that influence migration, such as conflict, availability of food and

water, trade, and the pursuit of knowledge have contributed to the bidirectional

migration patterns across this part of the world (4, 5). The Middle East region in the

western reaches of Asia, including the Arabian Peninsula, is central to two proposed

routes out of Africa with the southern part of the peninsula around what now includes

the UAE being a staging point on passage across the Red Sea, through this region and

onwards through Asia (6). Human dispersal through the Middle East has resulted in

the region containing numerous ethnic groups. The ethnic diversity has arisen from

the variety of social and cultural influences throughout the Middle East (7). Studying

the genomic distribution of genetic markers, such as Y-chromosome haplotypes, will

result in an understanding of the influence of human migration, which has given rise

to different ethnic groups (e.g. the Bedouin) as well as different nationalities (Egyptian,

Jordanian, Yemenite, etc.) in the region.

The residents of the UAE are predominantly immigrants. The UAE population comprises a

mixture of locals (nationals) and expatriates, with the 2015 census estimating that nationals

make up 11.3% of the approximately 9.6 million residents throughout the country. The national

population of the UAE consists predominately of people of Arab descent. A recent

genetic study using Y-STR haplotypes in 2015 showed that UAE nationals are

genetically similar to populations in close geographical proximity, such as Kuwait

(8). Others have also confirmed the relationship between genetic relatedness and

geographic proximity through phylogenetic analyses of populations from the UAE,

Oman, Qatar and Yemen (9, 10). Additionally, significant genetic relationships

between the populations of the UAE with some in North Africa and South Asia have

been observed (9-11), highlighting the importance of understanding the impact of

bidirectional migration in this region of the world.

In contrast to genetic relationships observed through chromosomal analyses between

populations of the Middle East and surrounding regions, Y-STR analyses have shown

that genetic diversity is more complicated than expected (4, 5, 7, 9, 11). The genetic

structure of the Middle East and surrounding regions has been described as a mosaic

pattern due to the fluctuating degree of genetic diversity (11). The term “mosaic

pattern” describes a flow of genetic diversity throughout the Middle East that cannot be

accounted for by factors such as geographical distance alone. Throughout the region,

the degree of genetic diversity has been found to be highly variable between both

populations residing in close proximity and those at distant locations. These complex

relationships highlight the need to continue to characterise the genetic make-up of

people of the Middle East region as well as those from the surrounding regions such

as Northern Africa and Southern Asia.

Y-STR analyses have been successfully used to develop an understanding of human

migration patterns and how movement has influenced the gene pool of specific

populations (6). Studies of this uni-parental transfer of genetic material have provided

insights into human movement patterns within Eurasian lineages (8). Although

previous Y-STR studies have provided information on individuals of Arabian descent

from multiple regions of the peninsula – Saudi Arabia, Yemen (10), Oman (9, 10) and

the UAE (10, 12), this present study focuses exclusively on UAE nationals. This study

also uses an additional 10 STR markers for haplotype construction, increasing the

number of markers to 27 as compared to the 17 Y-STR markers used in previous

efforts.

6.2 Materials and Methods

6.2.1 Study Population

DNA samples from 217 consenting, healthy, unrelated males from the UAE were used

in this study. These samples were available as de-identified DNA samples stored for

research purposes within the Emirates Family Registry (EFR), a biobank resource

available for genetic association studies (13). Prior to commencement of this present

study, approval to undertake the work was obtained from the Ethics Committee of the

Ministry of Health of the UAE (2011) and the Human Research Ethics Committee of

the University of Western Australia (RA/4/1/7778).

6.2.2 Genotyping

Extraction of DNA was performed using the Oragene-DNA kit (Genotek, Ottawa,

Canada) in accordance with manufacturer’s guidelines using buccal cells from saliva.

A NanoDrop Spectrophotometer (Thermo Scientific, Wilmington DE, USA) was used

to determine the quality and quantity of the individual DNA samples. The 27 Y-STR loci

used in this study were amplified using half reactions (7.5 µl) with the Y-Filer PLUS

Amplification kit (Life Technologies, Foster City CA, USA). The amplification was

performed on a Veriti Thermal Cycler (Life Technologies, Foster City CA, USA). The

27 Y-STR loci were DYS576, DYS389I, DYS635, DYS389II, DYS627, DYS460,

DYS458, DYS19, YGATA H4, DYS448, DYS391, DYS456, DYS390, DYS438,

DYS392, DYS518, DYS570, DYS437, DYS385a/ DYS385b, DYS449, DYS393,

DYS439, DYS481, DYF387SI/DYF387S1II and DYS533. Size determination of the

alleles of each Y-STR locus was carried-out by capillary electrophoresis using the

Applied Biosystems Genetic Analyzer 3500 (Life Technologies, Foster City CA,

USA) with the POP-4 polymer and using the 600LIZ Internal Lane Standard. The

computer software GeneMapper ID-X v1.2 (Life Technologies, Foster City CA, USA)

was used for fragment analysis for allele sizing.

Allelic designations were determined by comparing the size of PCR products separated by capillary gel electrophoresis with the reference allelic ladders provided with the Y-Filer PLUS Amplification kit (Life Technologies, Foster City CA, USA). The alleles from all loci reported here were designated according to the published nomenclatures and the guidelines of the International Society for Forensic Genetics (ISFG) for performing STR analyses (14). Positive controls (provided in the Y-Filer PLUS kit) and negative controls (deionized water) were used with each batch of reactions. Additionally, a specific set of samples (approximately 10% of total samples) was selected for replication and showed that the assay was reproducible.
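The ladder comparison described above can be sketched as a nearest-allele lookup. This is an illustration only and is not the GeneMapper ID-X algorithm: the function name, the ladder sizes and the 0.5 bp tolerance are all assumptions made for the example.

```python
def call_allele(fragment_bp, ladder, tolerance_bp=0.5):
    """Return the ladder allele nearest to the fragment size, or None if off-ladder.

    `ladder` maps allele names to ladder fragment sizes in base pairs.
    The tolerance is an assumed bin width, not a value taken from the kit.
    """
    allele, size = min(ladder.items(), key=lambda item: abs(item[1] - fragment_bp))
    return allele if abs(size - fragment_bp) <= tolerance_bp else None


# Hypothetical ladder for a tetranucleotide locus (sizes are made up).
ladder = {"13": 180.0, "14": 184.0, "15": 188.0}

on_ladder = call_allele(184.2, ladder)   # within 0.5 bp of allele 14
off_ladder = call_allele(186.0, ladder)  # 2 bp from both neighbours
print(on_ladder, off_ladder)
```

A fragment that falls outside every bin is returned as None here, which stands in for an off-ladder call that would need manual review.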

6.2.3 Statistical Analysis

The frequencies of the Y-STR haplotypes were determined by counting. Haplotype diversity in the population studied was determined using Nei’s formula (15), and the match probability was taken as 1 minus the haplotype diversity. The discriminatory capacity was calculated as the number of unique (individual-specific) haplotypes divided by the total number of individuals in the population (16).
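The three statistics defined above can be sketched in a few lines. The example assumes Nei's unbiased estimator, n/(n-1) multiplied by (1 minus the sum of squared haplotype frequencies), and uses the counts reported in Section 6.3 (217 individuals, 212 haplotypes, 207 of them unique), which imply five haplotypes each shared by two individuals. The function name and haplotype labels are illustrative assumptions.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (haplotype diversity, match probability, discrimination capacity)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1 - sum_p2)   # Nei's unbiased estimator
    match_probability = 1 - diversity        # definition used in this chapter
    unique = sum(1 for c in counts.values() if c == 1)
    discrimination_capacity = unique / n     # singleton haplotypes / individuals
    return diversity, match_probability, discrimination_capacity


# 207 singleton haplotypes plus 5 haplotypes seen twice each = 217 individuals.
sample = [f"H{i}" for i in range(207)] + [f"S{i}" for i in range(5) for _ in range(2)]
hd, mp, dc = haplotype_stats(sample)
print(round(hd, 4), round(mp, 4), round(dc * 100, 1))
```

Under these assumptions the sketch gives a diversity of about 0.9998, a match probability of about 0.0002 and a discrimination capacity of about 95.4%, in line with the values reported for the 27-locus haplotypes.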

Populations from the Y-chromosome Haplotype Reference Database (YHRD; www.yhrd.org) were chosen and compared with this study (referred to as the “present study”). YHRD provides high-resolution and large reference sample collections of world populations for genetic analyses (17). RST values were calculated by Analysis of Molecular Variance (AMOVA) with 10,000 permutations, using information on comparative populations from YHRD, and Multidimensional Scaling (MDS) plots were constructed based on Kruskal’s non-metric MDS algorithm (18) using the YHRD online program (www.yhrd.org).
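For readers unfamiliar with the statistic, a pairwise RST can be sketched as an AMOVA variance ratio in which the distance between two haplotypes is the sum of squared repeat-number differences across loci. The code below is a minimal sketch under that assumption. It is not the YHRD implementation, it omits the permutation test, and it ignores the special handling that multi-copy loci such as DYS385a/b require; the function names and toy haplotypes are hypothetical.

```python
from itertools import combinations


def sq_dist(h1, h2):
    """Sum of squared repeat-number differences across loci."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))


def pairwise_rst(pop1, pop2):
    """AMOVA-style variance ratio for two populations of repeat-count haplotypes."""
    pops = [pop1, pop2]
    everyone = pop1 + pop2
    n_total = len(everyone)
    # Sums of squared deviations, total and within populations.
    ssd_total = sum(sq_dist(a, b) for a, b in combinations(everyone, 2)) / n_total
    ssd_within = sum(
        sum(sq_dist(a, b) for a, b in combinations(p, 2)) / len(p) for p in pops
    )
    msd_among = (ssd_total - ssd_within) / (len(pops) - 1)
    msd_within = ssd_within / (n_total - len(pops))
    n0 = (n_total - sum(len(p) ** 2 for p in pops) / n_total) / (len(pops) - 1)
    var_among = (msd_among - msd_within) / n0
    var_within = msd_within
    total = var_among + var_within
    return var_among / total if total else 0.0


# Toy haplotypes with two loci each (repeat counts are made up).
fixed = pairwise_rst([(14, 30), (14, 30)], [(16, 31), (16, 31)])
mixed = pairwise_rst([(14, 30), (16, 31)], [(14, 30), (16, 31)])
print(fixed, mixed)
```

Two populations with no shared haplotypes and no internal variation give a ratio of 1, whereas populations with identical composition give a value at or below zero, which is conventionally read as no differentiation.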

6.3 Results and Discussion

A total of 217 individual samples from the UAE were analysed in the present study. The total number of haplotypes identified was 212. The number of unique haplotypes, those observed in only one individual, was 207 (see Table 6.1 and Appendix 4). The haplotype diversity was 0.9998 with a match probability of 0.0002. The discriminatory capacity was 95.4% (Table 6.1). Replication studies for quality control purposes showed complete concordance between the two sets of data from the same individual.

At the time of this study, there were 15 populations in YHRD with data generated using the commercial Y-Filer PLUS kit, which comprises 27 Y-STRs. The haplotype data generated in the present study were compared with the haplotype distributions from these 15 populations. The resultant MDS plot (Figure 6.1) shows the relationship between the Arabian population of the present study and populations from Lebanon, Poland, Hungary, Germany, Denmark, Austria, Switzerland, Spain, Italy, the United States of America, Greenland, China, Somalia, Peru and the Russian Federation. The UAE population of the present study clustered with the European populations in the MDS plot (see Figure 6.1). The relatively tight clustering of the populations of Germany, Austria, Switzerland and Denmark is reassuring, and presumably reflects the close geographical proximity of the four countries. These central European countries share some common elements: Germany shares borders with the other three, and German is an official language of three of the four nations and a minority language in Denmark, reflecting elements of a common ancestral and cultural origin.

A Lebanese population of unknown description in YHRD was found to be distant from all other populations, including that of the present study (AMOVA RST = 0.1951). Although Arabic is the national language and just over half the population is of the Muslim faith, the culture of Lebanon reflects the range of ethnic and religious groups that have arisen through the numerous civilizations that have resided in this region over the centuries (19).

Figure 6.1: MDS plot comparing the Y-chromosome haplotypes of 15 populations constructed with 27 Y-STRs in the YHRD database with haplotypes in Arabs from the UAE population typed in this present study.

Figure 6.1 plot details: axes Dimension 1 and Dimension 2; stress = 0.03487; populations plotted: Lebanon, Peru, US (European), Poland, Hungary, Germany, Denmark, Austria, Switzerland, Spain, Italy, Greenland, China, Somalia, Russian Federation and the present study.

The populations from the Far East (China), South America (Peru), Africa (Somalia) and the Russian Federation were separate from the European cluster, relative to the Arabian population of this study. Previously, a close relationship between the Arabs and Europeans (20) has been suggested, which is supported in this study.

To allow a comparison with similar populations from the region, the number of Y-STR loci analysed was reduced to 17 loci (DYS389I, DYS635, DYS389II, DYS458, DYS19, YGATA H4, DYS448, DYS391, DYS456, DYS390, DYS438, DYS392, DYS437, DYS385a/b, DYS393, DYS439). This combination is used in the Y-Filer Amplification kit. Y-STR haplotypes from population studies in countries from the Middle East and North Africa (MENA) region and South Asian countries (India, Pakistan and Bangladesh) were extracted from the YHRD for comparison. Additionally, a distant population (Peru) from the original analysis shown in Figure 6.1 was used to provide context.

The resultant MDS plot in Figure 6.2 shows that the present UAE population clusters with other countries of the Arabian Peninsula. The AMOVA RST values between the present study and the Arabian Peninsula populations were 0.0097 (UAE), 0.0059 (Kuwait) and 0.0241 (Yemen). Additionally, the Levant population from Iraq was found to cluster closely with the present UAE study (RST = 0.0321).

The closer genetic relationship with the Lebanese population observed in this comparison (Figure 6.2) appears to be at odds with the distances shown in Figure 6.1, suggesting that the additional 10 Y-STR loci provide greater discrimination and are separating the lineage haplotypes in the two populations. The distance between the Lebanese population and the present study was AMOVA RST = 0.1951 using 27 Y-STR loci, compared with AMOVA RST = 0.0111 using 17 Y-STR loci. This could be due to the Lebanese population within YHRD for the 27 Y-STR loci including numerous ethnic backgrounds, which would increase its distinctiveness and distance it further from the UAE. Within the literature, Lebanon has been described as being influenced by numerous ethnic backgrounds that affect its genetic diversity (21). For this matter to be understood, more Middle Eastern populations should be typed using 27 Y-STR loci so that the relationship between the UAE and Lebanon can be analysed in a suitable context.

Figure 6.2: MDS plot comparing the Y-chromosome haplotypes (using 17 Y-STR loci) of the UAE population described in this study with populations from North Africa, the Arabian Peninsula, the Levant region and South Asia. Each region is marked with its own symbol in the original figure. Data from Peru was also available.

Figure 6.2 plot details: axes Dimension 1 and Dimension 2; stress = 0.03701; populations plotted: Lebanon, Iraq, Iran, India, Bangladesh, United Arab Emirates, Kuwait, Present Study, Egypt, Yemen, Pakistan, Jordan, Libya, Algeria, Morocco, Tunisia and Peru.

To address the suggestion of improved discrimination with an increased number of loci, the Y-chromosome haplotypes of the 217 samples from the present UAE study were constructed using both 27 and 17 Y-STR loci and compared (see Table 6.1). The number of unique Y-chromosome haplotypes using the 27 Y-STR loci was 207, which was 26 more than the number of unique haplotypes constructed using 17 Y-STR loci. The haplotype diversity increased and the match probability decreased when more Y-STR loci were used to construct the haplotypes. Furthermore, the discrimination capacity increased by over 10 percentage points (from 83.4% to 95.4%) when the 27 Y-STR loci were analysed. Other studies (22-24) have also shown improvements in haplotype discrimination with the use of 27 Y-STR loci. These results show that the 27 Y-STR loci provide greater discrimination, and a lower chance of two different haplotypes appearing to match, than smaller sets of STR loci.
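The effect of locus number on discrimination can be illustrated by reducing each profile to a subset of loci and recounting the haplotypes seen only once. The sketch below uses three made-up profiles and three loci purely for illustration; the function name and the data are hypothetical and are not drawn from the study samples.

```python
from collections import Counter


def unique_haplotypes(profiles, loci):
    """Count haplotypes observed exactly once when only `loci` are considered."""
    reduced = Counter(tuple(p[locus] for locus in loci) for p in profiles)
    return sum(1 for c in reduced.values() if c == 1)


# Hypothetical profiles: the first two differ only at DYS576.
profiles = [
    {"DYS19": 14, "DYS390": 23, "DYS576": 17},
    {"DYS19": 14, "DYS390": 23, "DYS576": 18},
    {"DYS19": 15, "DYS390": 24, "DYS576": 17},
]
core = ["DYS19", "DYS390"]      # loci shared by both kits
extended = core + ["DYS576"]    # adds a locus found only in the larger panel

unique_core = unique_haplotypes(profiles, core)
unique_extended = unique_haplotypes(profiles, extended)
print(unique_core / len(profiles), unique_extended / len(profiles))
```

With the smaller panel two of the three profiles collapse into one shared haplotype, so only one remains unique; adding the extra locus separates all three, which is the same mechanism behind the rise from 181 to 207 unique haplotypes in Table 6.1.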

The other unexpected result highlighted in Figure 6.2 was the distance of the population of Jordan, a country in the Levant, from the Middle East group (AMOVA RST = 0.2763). Jordan is located close to Lebanon and shares a border with Iraq, yet its population does not cluster with the populations of these two neighbouring countries based on the Y-STR haplotype data.

Upon further interrogation, the Jordanian population in YHRD was found to comprise two separate subpopulations, Arab Adnanit and Arab Qahtanit. According to historical scriptures, the two groups of Arabs, the Qahtanite and the Adnanite, are distinct populations, with the former referring to Arabs who originated from the southern region of the Arabian Peninsula around Yemen. The Qahtanit group comprises the family of Qahtan and his 24 sons, who inhabited the southern parts of the Arabian Peninsula. Between the 7th and 14th centuries, the Arabic empire expanded into most of Spain, southern France and western China. During this expansion period, Arabic populations including the Qahtanit tribes spread and intermingled through these lands (25). Adnanites are Arabized Arabs (meaning those who travelled into Arabia before the Mesopotamia period) descended from Adnan, the father of the Ishmaelite Arabs, who occupied the Hijaz and Yamama regions as well as most parts of northern Arabia (26).

Table 6.1. The improvement in discrimination capacity following an increase in reportable Y-STRs from 17 (Y-Filer) to 27 loci (Y-Filer PLUS).

                             Present study               Austria (22)                Italy (23)                  Spain (24)
                             Y-Filer       Y-Filer PLUS  Y-Filer       Y-Filer PLUS  Y-Filer       Y-Filer PLUS  Y-Filer       Y-Filer PLUS
Population Size              217           217           425           425           203           203           203           203
Unique Haplotype Count       181           207           407           423           191           197           171           203
Haplotype Diversity          0.9989        0.9998        0.9999        0.999989      0.9997        0.9999        0.9972        0.9999
Haplotype Match Probability  0.0011        0.0002        0.0024        0.0025        0.0052        0.0051        0.0028        0.000004
Discrimination Capacity      83.40%        95.40%        97.88%        99.77%        97.04%        98.52%        84.24%        99.99%

Distance relationships were recalculated treating the Jordanian groups as two distinct populations and the MDS plot was redrawn (see Figure 6.3). It shows the Arab Adnanit group of Jordan clustering with the Middle East group and in close proximity to the present UAE study (AMOVA RST = 0.0717), with the Arab Qahtanit group remaining some distance away (RST = 0.4425). Previous studies have also suggested great diversity between different ethnic populations within Jordan, which is consistent with the distance between the Qahtanit and Adnanit groups in Figure 6.3 (27). The distance of the Jordan-Qahtanit population in Figure 6.3 is consistent with different waves of people migrating to lands outside the Arabian Peninsula towards the north-east during the expansionism period. Additionally, the Peninsula’s “wide array of variables impacting the movement of human groups” (7), such as climate change and trade, is likely to have affected genetic diversity. However, further studies that classify individuals into the Adnanit and Qahtanit groups would be required to understand the relationship between the two subpopulations.

Since the populations in North Africa used in this analysis are potentially a mixture of Arabian and African populations, 17 Y-STR haplotype data on Arabian populations, including a linguistic Arabic group within Iran (28), were extracted from YHRD. These were compared with the 17 Y-STR haplotypes of the present UAE study to establish the relationship between the Arabian populations (Figure 6.4). The North African defined Arabian populations (Algeria, Morocco and Libya) cluster together within the MDS plot, an unsurprising observation given their close geographic proximity to one another and their distance from the peninsula.

Figure 6.3: MDS plot comparing the Y-STR haplotypes (using 17 Y-STR loci) of populations from North Africa, the Arabian Peninsula, the Levant region and South Asia, with the Jordanian population separated into two genealogical groups. Each region is marked with its own symbol in the original figure. Data from Peru was also available.

Figure 6.3 plot details: axes Dimension 1 and Dimension 2; stress = 0.04885; populations plotted: Present Study, Jordan [Adnanit], United Arab Emirates, Egypt, Yemen, Kuwait, Lebanon, Bangladesh, India, Iran, Iraq, Pakistan, Jordan [Qahtanit], Peru, Libya, Morocco, Algeria and Tunisia.

The populations of Iraq, Kuwait, UAE, Jordan Adnanit, an Iranian linguistic Arabic group (Khuzestani Arabic) from Ahvaz, Iran and the present study form a second cluster in Figure 6.4. Quite distinct from the North African defined Arabian populations and the Middle Eastern cluster is the Yemen population. The distance of Yemen from the North African populations coincides with the two separate migratory routes into the Middle East out of Africa, one across the land bridge that is now Egypt and the second across the Red Sea (6, 7, 29, 30). The Jordan Qahtanit population still remained significantly distant from the Middle Eastern cluster.

Figure 6.4: MDS plot of present study and defined Arab populations from YHRD using 17 Y-STR loci. Countries are differentiated according to region (North Africa, Arabian Peninsula, the Levant region and South Asia) by symbol in the original figure. Data from Peru was also available.

Figure 6.4 plot details: axes Dimension 1 and Dimension 2; stress = 0.04405; populations plotted: Jordan [Arab-Qahtanit]; Jordan [Arab-Adnanit]; Present Study; Abu Dhabi, UAE [Arab]; Kuwait City, Kuwait [Arab]; Ahvaz, Iran [Arab]; Iraq [Iraqi]; Sanaa, Yemen [Yemeni]; Peru; Banghazi, Libya [Arab]; Rabat, Morocco [Arabs]; Casablanca, Morocco [Arab]; Oran, Algeria [Arab].

In the analyses shown in Figures 6.2-6.4, the location of the present study within the MDS plot is consistently close to that of a UAE population studied previously (12), with AMOVA RST values of 0.0097, 0.0097 and 0.0003, respectively. Additionally, the present study clusters closely with the Kuwaiti population in Figures 6.2-6.4 (AMOVA RST = 0.0059, 0.0059 and 0.0067, respectively). This relationship is indicative of the historically regular interaction between the Arabian coastal populations, such as during the ancient civilisation of Dilmun (centred in Bahrain) from around 4000 to 2000 BCE. Furthermore, the Phoenician ancestors were believed to have developed their maritime skills within the gulf, linking all the coastal countries known today (31).

Both Figure 6.2 and Figure 6.3 show clustering of North African populations, which is consistent with close links in this region. However, the Egyptian population appears to be part of the Middle Eastern cluster, which includes the present study (AMOVA RST = 0.0191), and lies further from the Arab populations in North Africa. Additionally, the North African defined Arabian populations in Figure 6.4 clustered close to the Middle Eastern grouping, which included the present study. Furthermore, Figures 6.2 and 6.3 show that the Southern Asian populations (Iran, India and Bangladesh) cluster together, in close proximity to the present study and the Middle Eastern populations. These relationships appear to coincide with migratory factors and the geographical closeness of the UAE to Southern Asian countries such as Iran (AMOVA RST = 0.0308), India (RST = 0.0413) and Bangladesh (RST = 0.0497). Trade between these regions has been documented for centuries.

The fact that the North African populations cluster opposite those from Southern Asian countries, with Middle Eastern populations in the middle (Figure 6.2), suggests that migration and trade flowed back and forth between east and west through the Middle East region, including the UAE. This region is at the crossroads through which humans dispersed and migrated between the continents of Africa and Asia. Additionally, the fact that Egypt, unlike the other North African populations, clusters with the Middle Eastern populations supports the idea that migration between Africa and the Middle East flowed through the land bridge that is Egypt (7, 21).

6.4 Conclusion

The present study has shown the UAE population to be diverse, with 207 unique haplotypes observed in 217 individuals. The UAE population studied has commonalities with other Middle Eastern populations, and appears to be particularly close to the Kuwaiti population.

The genetic distance between Jordan and the UAE population can be explained by the geographical distance between the two countries, with further research required into the significance of subpopulation genetic analyses.

The array of genetic relationships within the MDS plots supports the view that the Middle East region reflects a ‘mosaic pattern’ of genetic structure in Y-chromosome analyses. In addition, the discrepancy in the distance to the Lebanese population when 27 or 17 Y-STR loci are analysed supports the need for further research to determine whether the difference is due to the discriminatory power of the number of loci used. The present study has highlighted the importance of further research on populations within the region. Nevertheless, the diversity of the haplotypes identified in the UAE population using 27 Y-STR loci provides an effective tool for human identification, paternity testing and population genetic studies.


6.5 References

1. Qamar R, Ayub Q, Mohyuddin A, Helgason A, Mazhar K, Mansoor A, et al.

Y-Chromosome DNA variation in Pakistan. Am J Hum Genet. 2002;70:1107-

24.

2. Underhill PA, Kivisild T. Use of y chromosome and mitochondrial DNA

population structure in tracing human migrations. Ann Rev Genet.

2007;41:539-64.

3. Immel UD, Kleiber M, Klintschar M. Y-chromosomal STR haplotypes in an

Arab population from Yemen. Int Congress Series. 2004;1261:340-3.

4. Valeri M. Nation-building and communities in Oman since 1970: The Swahili-

Speaking Omani in search of identity. Oxford Journals 2007;106(424):479-96.

5. Abu-Amero KK, Gonzalez AM, Larruga JM, Bosley TM, Cabrera VM.

Eurasian and African mitochondrial DNA influences in the Saudi Arabian

population. BMC Evol Biol. 2007;7(32):1-15.

6. Kundu S, Ghosh SK. Trend of different molecular markers in the last decades

for studying human migrations. Gene. 2015;556(2):81-90.

7. Petraglia MD, Rose JI. The evolution of human populations in Arabia:

Paleoenvironments, prehistory and genetics. Netherlands: Springer 2009.

8. Triki-Fendri S, Sanchez-Diz P, Rey-Gonzalez D, Ayadi I, Carracedo A, Rebai

A. Paternal lineages in Libya inferred from Y-chromosome haplogroups. Am

J Phys Anthropol. 2015;157(2):242-51.

9. Cadenas AM, Zhivotovsky LA, Cavalli-Sforza LL, Underhill PA, Herrera RJ.

Y-chromosome diversity characterizes the Gulf of Oman. Euro J Hum Genet.

2008;16(3):374-86.


10. Alshamali F, Pereira L, Iacute SA, Budowle B, Poloni ES, et al. Local

population structure in Arabian Peninsula revealed by Y-STR diversity. Hum

Hered. 2009;68(1):45-54.

11. Petraglia MD, Haslam M, Fuller DQ, Boivin N, Clarkson C. Out of Africa:

New hypotheses and evidence for the dispersal of Homo sapiens along the

Indian Ocean rim. Ann Human Biol. 2010;37(3):288-311.

12. Nazir M, Alhaddad H, Alenizi M, Alenizi H, Taqi Z, Sanqoor S, et al. A

genetic overview of 23 Y-STR markers in UAE population. Forensic Sci Int

Genet. 2016;23:150-2.

13. Alsafar H, Jama-Alol KA, Hassoun AAK, Tay GK. The prevalence of Type 2

Diabetes Mellitus in the United Arab Emirates: Justification for the

establishment of the Emirates Family Registry. Int J Diabetes Dev Ctries.

2012;32(1):25-32.

14. Schneider PM. Scientific standards for studies in forensic genetics. Forensic

Sci Int. 2007;165(2-3):238-43.

15. Nei M. Molecular Evolutionary Genetics New York, USA: Columbia

University Press 1987.

16. Chang YM, Swaran Y, Phoon YK, Sothirasan K, Sim HT, Lim KB, et al.

Haplotype diversity of 17 Y-chromosomal STRs in three native Sarawak

populations (Iban, Bidayuh and Melanau) in East Malaysia. Forensic Sci Int

Genet. 2009;3(3):e77-80.

17. Willuweit S, Roewer L. The new Y chromosome haplotype reference database.

Forensic Sci Int Genet. 2015;15:43-8.

18. Kruskal JB. Nonmetric Multidimensional Scaling: A numerical method.

Psychometrika. 1964;29(2):115-29.


19. Chouery E, Coble MD, Strouss KM, Saunier JL, Jalkh N, Medlej-Hashim M,

et al. Population genetic data for 17 STR markers from Lebanon. Leg Med.

2010;12(6):324-6.

20. Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab

Emirates: phylogenetic relationships and ancestral populations. Gene.

2014;533(1):411-9.

21. El Andari A, Othman H, Taroni F, Mansour I. Population genetic data for 23

STR markers from Lebanon. Forensic Sci Int Genet. 2013;7(4):e108-13.

22. Pickrahn I, Muller E, Zahrer W, Dunkelmann B, Cemper-Kiesslich J, Kreindl G, et al. Yfiler® Plus amplification kit validation and calculation of forensic parameters for two Austrian populations. Forensic Sci Int Genet. 2016;21:90-4.

23. Rapone C, D'Atanasio E, Agostino A, Mariano M, Papaluca MT, Cruciani F, et al. Forensic genetic value of a 27 Y-STR loci multiplex (Yfiler® Plus kit) in an Italian population sample. Forensic Sci Int Genet. 2016;21:e1-5.

24. Garcia O, Yurrebaso I, Mancisidor ID, Lopez S, Alonso S, Gusmao L. Data

for 27 Y-chromosome STR loci in the Basque Country autochthonous

population. Forensic Sci Int Genet. 2016;20:e10-2.

25. Ram P. Yemen History and Culture: A Book by AnVi OpenSource Knowledge

Trust: GBO 2015.

26. Thesiger W. Desert borderlands of Oman. Geograph J. 1950;116 (4/6):137-68.

27. Gonzalez AM, Karadsheh N, Maca-Meyer N, Flores C, Cabrera VM, Larruga

JM. Mitochondrial DNA variation in Jordanians and their genetic relationship

to other Middle East populations. Ann Hum Biol. 2008;35(2):212-31.


28. Roewer L, Willuweit S, Stoneking M, Nasidze I. A Y-STR database of Iranian

and Azerbaijanian minority populations. Forensic Sci Int Genet.

2009;4(1):e53-5.

29. Tadmouri GO, Sastry KS, Chouchane L. Arab gene geography: From

population diversities to personalized medical genomics. Glob Cardiol Sci

Pract. 2014;2014(4):394-408.

30. Shepard EM, Herrera RJ. Genetic encapsulation among near Eastern

populations. J Hum Genet. 2006;51(5):467-76.

31. Anthony JD, Hearty JA. Eastern Arabian States: Kuwait, Bahrain, Qatar, the

United Arab Emirates, and Oman: The Government and Politics of the Middle

East and North Africa. Colorado: Westview Press 1980.

32. Manni F, Leonardi P, Barakat A, Rouba H, Heyer E, Klintschar M, et al. Y-

chromosome analysis in Egypt suggests a genetic regional continuity in

Northeastern Africa. Hum Biol. 2002;74(5):645-58.


Chapter 7

GENERAL DISCUSSION AND CONCLUSION


7.1 Discussion

Due to the paucity of literature on genetic analyses of Middle Eastern populations such as the United Arab Emirates (UAE) (1, 2), this project was designed to add to the knowledge base of human genetic diversity within the Middle East. It contributes to the literature by increasing sample sizes and the number of short tandem repeat (STR) loci tested, showing improved measures of genetic diversity within the UAE and greater resolution of the genetic relationships between populations in the Middle East and surrounding regions. The outcomes from this project provide an increased understanding of the factors that affect genetic diversity in the Middle East, such as human dispersal following the initial migration out of Africa, geographic location, historic trade relationships, socio-cultural influences and overall increases in population size from large-scale urbanization.

Although the recent introduction of next generation sequencing allows the comparison of whole genomes, the vast majority of population-based genetic data still focuses on autosomal STRs and Y-chromosome STRs (Y-STRs). Furthermore, the minimal research on the Middle East region, and particularly on the UAE given its geographical location at the crossroads of human dispersal from Africa, means that STR analysis is still an important research tool. The data from the variable STR loci provide for population databases with high analytical power and allelic abundance.

The aim of characterizing allele frequencies and statistical forensic parameters for the UAE population was carried out with the use of both autosomal STRs and Y-STRs. With regard to the autosomal STR analyses, the initial publication of UAE population data allowed for the establishment of research quality assessments within the literature (3). The same amplification kit and manufacturer’s instructions were utilised for both data sets, improving the quality of the results and providing reassurance that biases and potential errors were reduced. Furthermore, the statistical analyses of the two datasets were important for understanding the extent of any significant differences between the two UAE subpopulations; these showed minimal and nonsignificant variation, allowing the two datasets to be combined for further analyses. The minimal variation observed between the Emirati Bedouins (Chapter Three) and Emirati Arabs of mixed ethnic origin (Chapter Four) is supported in the literature. Even though the Arabian Peninsula and the UAE include remote Bedouin communities (4), agricultural settlements during the 1920s within the UAE saw some Bedouin join the socio-economic progress of the region, reducing the isolation between the various Emirati nationals (4). It is important to analyse subpopulations to understand the degree of relationship and the impact of historical socio-cultural factors, as seen within the UAE from the 1920s agricultural settlements. Overall, the consolidation of the two datasets highlighted the high degree of genetic heterozygosity in the various ethnic groups making up the UAE nationals, as supported in the literature (5), and improved the meta-analysis of the genetic relationships between the UAE populations and surrounding Middle Eastern, North African and South Asian populations. Furthermore, increasing the number of autosomal STR loci in the meta-analysis to 13-15 STR loci (from six STR loci) better delineated the genetic relationships in the region.

The Y-STR analyses provided insight into the genetic diversity of the UAE

male lineages. The genetic diversity and high heterozygosity within the UAE

population were confirmed through the use of the Y-chromosome Haplotype

Reference Database (YHRD), which provides online access for haplotype interpretation

(6). As with the autosomal STR analyses, the Y-STR analyses highlighted the

importance of an increased number of loci for haplogroup characterisation. For example,

increasing the number of Y-STR loci compared between the UAE and Lebanon better

reflected the genetic distance between the Lebanese and the UAE populations as

supported by the autosomal STR meta-analysis. Furthermore, the literature highlights

the effectiveness of the increased number of STR loci in improving genetic analyses

(7, 8, 9). As Lebanon was the only Middle Eastern population that could be used for

the 27 Y-STR loci comparison as opposed to the 17 Y-STR loci comparison, further

Middle Eastern populations require genetic analyses using 27 Y-STR loci to identify

any potential discrepancies and/or better delineate the genetic relationship.
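
The effect of increasing the number of Y-STR loci can be sketched with a small, purely illustrative calculation. It assumes each haplotype is a tuple of repeat numbers and compares the number of distinct haplotypes, the discrimination capacity (distinct haplotypes divided by sample size) and the haplotype diversity when only the first few loci are used. The haplotypes below are invented for illustration and do not come from YHRD or from the thesis data.

```python
from collections import Counter


def haplotype_stats(haplotypes, n_loci):
    """Return (distinct haplotypes, discrimination capacity, haplotype diversity)
    using only the first n_loci markers of each haplotype."""
    truncated = [tuple(h[:n_loci]) for h in haplotypes]
    counts = Counter(truncated)
    n = len(truncated)
    discrimination_capacity = len(counts) / n
    # Haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies).
    diversity = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
    return len(counts), discrimination_capacity, diversity


# Hypothetical four-locus Y-STR haplotypes (repeat numbers per locus).
toy_haplotypes = [
    (14, 13, 30, 24),
    (14, 13, 30, 23),
    (14, 13, 29, 24),
    (15, 12, 30, 24),
    (14, 13, 30, 24),
]

if __name__ == "__main__":
    for n_loci in (2, 4):
        print(n_loci, haplotype_stats(toy_haplotypes, n_loci))
```

In this toy sample, haplotypes that are indistinguishable at two loci separate once all four loci are considered, which mirrors the improved resolution reported for the 27-locus comparison relative to the 17-locus comparison.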

An introductory meta-analysis was undertaken in Chapter Three with the first UAE

population autosomal STR dataset, highlighting the significance of involving the UAE

in comparative analyses. Inconsistencies were observed between the introductory

meta-analysis and the main meta-analysis using autosomal STR analyses (Chapter

Five). The Saudi and Egyptian population data sets compared in the introductory meta-

analysis showed fewer significant differences from the UAE population.

However, more significant differences were seen for these two populations in

the main meta-analysis. This can be explained by the use of two separate

data sets of Egyptian and Saudi populations in the two meta-analyses.

The introductory meta-analysis used data from Saudi and Egyptian populations

residing within Kuwait (10), whereas the main meta-analysis used data

from a Saudi population residing within the central region of Saudi Arabia (11) and

an Egyptian population with at least three generations of residence within Egypt

(12). The contrasting relationships with the UAE

population highlight the importance of population- and location-specific data

collections for a better understanding of genetic diversity. Furthermore, this situation

highlights how the use of autosomal STRs can identify the impact on genetic diversity

of potential admixture and isolation, as shown by the example of the Saudi and

Egyptian populations residing in Kuwait. Phylogenetic analysis provides valuable insights in many

research areas. However, obtaining the

appropriate phylogenetic perspective can be difficult, as methods of

phylogenetic analysis do not provide trees of equal quality and rigorous analysis takes a considerable

amount of time (13). Future research applying phylogenetic analysis to the

UAE population should be considered, with emphasis on rigorous analysis of the data.

The significance of population and subpopulation specific genetic analyses was also

highlighted through the Y-chromosome analysis using YHRD. The discrepancies

noted between the two separate Jordanian populations (Adnanit and Qahtanit)

highlight the importance of initially testing the relationship between subpopulations

before combining data for comparisons, as was done

during the autosomal STR analysis (Chapter Four). Analysis of the relationship between

Jordan and the UAE population using autosomal STRs also highlighted how

populations within Jordan can show genetic distance from other Middle Eastern

populations and minimal heterozygosity values due to the impact of isolation, even

though Jordan is a “major transit zone” (14). Once the two subpopulations of Jordan

were separated within YHRD, the relationship between Jordanians

and the UAE population was better understood. The importance of comparing

subpopulations is supported within the literature (5, 7, 14-17). The fact that

subpopulations have been observed to show genetic distance and diversity within close

proximity supports the description of the Middle East portraying a “mosaic pattern”

of genetic relationships impacted by the particular subpopulations analysed (18).
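
The step of testing subpopulations before pooling them can be sketched as a permutation test of allele-count homogeneity at a single locus. This is a simplified illustration under stated assumptions: alleles are treated as independent gene copies, the test statistic is a chi-square over a 2 x k table of allele counts, and the function names, the seed and the number of permutations are arbitrary choices. It is not presented as the procedure or software used in the thesis.

```python
import random
from collections import Counter


def chi_square_stat(sample_a, sample_b):
    """Chi-square statistic for a 2 x k table of allele counts."""
    counts_a, counts_b = Counter(sample_a), Counter(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    total = n_a + n_b
    stat = 0.0
    for allele in set(counts_a) | set(counts_b):
        column_total = counts_a[allele] + counts_b[allele]
        for observed, row_total in ((counts_a[allele], n_a), (counts_b[allele], n_b)):
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


def permutation_p_value(sample_a, sample_b, n_perm=999, seed=0):
    """Share of label permutations whose statistic is at least the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = chi_square_stat(sample_a, sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if chi_square_stat(pooled[:n_a], pooled[n_a:]) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical allele lists (repeat numbers) for two subpopulations.
    similar_a, similar_b = [10, 12] * 10, [10, 12] * 10
    distinct_a, distinct_b = [10] * 20, [12] * 20
    print(permutation_p_value(similar_a, similar_b))
    print(permutation_p_value(distinct_a, distinct_b))
```

A large p-value at most loci would be consistent with pooling two samples, as was done for the two Emirati datasets, whereas small p-values would argue for keeping them separate, as with the two Jordanian populations.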

Similar genetic relationships were observed using the two different STR analyses

(autosomal and Y-chromosomal). Close genetic relationships between the UAE and

South Asian populations such as Iran were seen in both autosomal and Y-

chromosomal STR analyses. This highlights how both autosomal and Y-chromosome

STR analyses provide supporting evidence for human dispersal from the

Arabian Peninsula through South Asia, with genetic diversity shaped by trade, history

and socio-cultural relationships (18). Additionally, significant genetic distance

between North African populations such as Morocco and the UAE was highlighted

through both autosomal and Y-chromosome STR analyses. Within the literature,

shared ancestry has been described between the Arab populations in Morocco and the

Arab populations in the Middle East (18, 19). However, significant genetic distance

between Morocco and the Middle East has been observed in these analyses. This is

not surprising given the genetic influences from Spain and African populations in close

geographical proximity to Morocco and the larger contribution of Berber genes to the

Moroccan gene pool, which have altered the genetic diversity relative to the original shared ancestry

(19).

Different genetic relationships between the UAE population and others can be

observed when using autosomal and Y-chromosome STRs. Differences between the

UAE populations and Kuwait as well as Egypt were seen when 15 autosomal STRs

were compared. However, Y-STR analysis showed close clustering between the UAE

and these two populations. In the literature, Y-STR analyses have shown genetic

relationships between the UAE and Kuwait (20). This close genetic relationship

between the UAE and Kuwait can be explained by the shared historical relationships

between countries of the east coast of the Arabian Peninsula and the maritime trade

between the countries bordering the Arabian Gulf (21). In contrast, the genetic

distance observed using autosomal markers between Kuwait and the UAE may be

explainable by the geographic location of Kuwait relative to the main migration routes

out of Africa. Kundu and Ghosh (2015) have suggested that there are differences in

human dispersal patterns passing through both the UAE and Kuwait (22). The

contributions of the maternal lineage could not be observed, since the study did not

consider mitochondrial DNA lineages. Furthermore, more detailed analyses of the

populations in Kuwait may be able to shed light on autosomal variations that are

inherited maternally. In the early 1900s, Kuwait comprised various cultures and

ethnicities due to extensive trade with Iran, Yemen, India and East Africa.

Furthermore, the favourable political and commercial environment and cultural

tolerance of Kuwait at this time attracted travellers from mixed ethnic backgrounds

(e.g. Armenians, Baluchi and Jews), and immigration continued to increase after the

oil boom due to work opportunities (21). Marriage between different ethnic families

is uncommon in Kuwait, although there have been exceptions. Kuwaiti nationals followed

traditional family values common in Arab cultures where a newlywed couple

(including intermarriages) would customarily reside at the husband’s geographical

location (21). This would impact the genetic diversity. Richards et al. (2003)

used mitochondrial DNA analyses to report predominantly female-mediated migration through the Middle East

(23). Furthermore, women and men

did not necessarily accompany each other equally during dispersal (24, 25). However,

it was common for women to move to the home of the husband after marriage (26).

The contribution of the maternal lineages would not be observed through Y-

chromosome analyses, which explains why the results differ from those of the autosomal STR

analysis. At present, there is a lack of mitochondrial DNA studies involving UAE

populations within the literature (26). Furthermore, the practice of polygyny

throughout Middle Eastern, North African and South Asian populations would affect

the Y-chromosome gene pool more than is observed through autosomal STR analysis

(27, 28). The difference in the genetic relationship observed when comparing Egypt

to the UAE supports two main migration routes out of Africa. Dispersal through Egypt

and dispersal through the UAE were parts of two separate routes, resulting in the genetic

differences observed in the autosomal analysis. The close genetic relationships seen

through Y-chromosome analyses may be a result of trade and

employment, which would increase the dispersal of male lineages as men looked for work

(1, 21, 26). The variable relationships between male and female lineages highlight the

importance of using different forms of genetic analyses on the same populations.

Through a combination of autosomal STR and lineage STR analyses, it is possible to

elucidate sociocultural impacts upon genetic diversity.

7.2 Conclusion

The overall analysis of the UAE in this thesis has contributed to

understanding the relationship between the population of the UAE and others. It

supports the view that human dispersal has shaped genetic diversity.

Furthermore, the importance of comparing two subpopulations from the one country

(Emirati Bedouin and Emirati Arab of mixed ethnic origin) can be seen, as it allows

the identification of potential discrepancies such as those observed between the two

Jordanian populations in the Y-chromosome analyses. The thesis aims were carried

out through the calculation of allele frequencies and the meta-analyses, providing

insight into the diversity observed in the UAE, the Middle East and surrounding

regions. The thesis also highlights how using different types of STR analyses allows

for a more in-depth understanding of the history of populations and its impact on the

genetic diversity of contemporary populations. The study of the UAE population has

further highlighted the relationship between human migration patterns and additional

impact factors of historical events and socio-cultural relationships towards genetic

diversity. The analysis of the UAE population has also highlighted the importance of

population-specific databases throughout the Middle East. Extending these UAE

genetic analyses to include mitochondrial DNA would make it possible to assess

the impact of female migration, which is suggested by the discrepancies observed between

the autosomal and Y-chromosome STR analyses. Finally, the overall analysis of the

UAE populations emphasises the significance of increasing both sample sizes and

the number of STR markers used to improve efficacy in forensic human identification, paternity

testing and disease susceptibility studies.
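
Why additional STR markers improve forensic efficacy can be shown with a short worked sketch. It assumes Hardy-Weinberg proportions within each locus and independence between loci (the product rule), so that the probability of two unrelated individuals sharing a full profile is the product of per-locus match probabilities. The allele frequencies are hypothetical and the function names are illustrative only.

```python
from itertools import combinations
from math import prod


def match_probability(freqs):
    """Probability that two random individuals share a genotype at one locus,
    assuming Hardy-Weinberg proportions: sum of squared genotype frequencies."""
    alleles = list(freqs)
    homozygotes = sum((freqs[a] ** 2) ** 2 for a in alleles)
    heterozygotes = sum(
        (2 * freqs[a] * freqs[b]) ** 2 for a, b in combinations(alleles, 2)
    )
    return homozygotes + heterozygotes


def combined_match_probability(loci):
    """Product of per-locus match probabilities, assuming independent loci."""
    return prod(match_probability(freqs) for freqs in loci)


# Hypothetical allele frequencies at three STR loci.
toy_loci = [
    {10: 0.5, 11: 0.5},
    {7: 0.2, 8: 0.3, 9: 0.5},
    {14: 0.25, 15: 0.25, 16: 0.25, 17: 0.25},
]

if __name__ == "__main__":
    for k in range(1, len(toy_loci) + 1):
        print(k, combined_match_probability(toy_loci[:k]))
```

Each added locus multiplies the combined match probability by a value below one, so profiles become more discriminating as markers are added, provided the allele frequencies come from an appropriate population-specific database.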

7.3 References

1. Petraglia MD, Rose JI. The evolution of human populations in Arabia:

Paleoenvironments, prehistory and genetics. Netherlands: Springer 2009.

2. Cadenas AM, Zhivotovsky LA, Cavalli-Sforza LL, Underhill PA, Herrera RJ.

Y-chromosome diversity characterizes the Gulf of Oman. Eur J Hum Genet.

2008;16(3):374-86.

3. Schneider PM. Scientific standards for studies in forensic genetics. Forensic

Sci Int. 2007;165(2-3):238-43.

4. Al-Semmari F. A history of the Arabian Peninsula. London: I.B. Tauris 2009.

5. Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab

Emirates: phylogenetic relationships and ancestral populations. Gene.

2014;533(1):411-9.

6. Willuweit S, Roewer L. The new Y Chromosome Haplotype Reference

Database. Forensic Sci Int Genet. 2015;15:43-8.

7. Pickrahn I, Muller E, Zahrer W, Dunkelmann B, Cemper-Kiesslich J, Kreindl

G, et al. Yfiler(R) Plus amplification kit validation and calculation of forensic

parameters for two Austrian populations. Forensic Sci Int Genet. 2016;21:90-

4.

8. Rapone C, D'Atanasio E, Agostino A, Mariano M, Papaluca MT, Cruciani F,

et al. Forensic genetic value of a 27 Y-STR loci multiplex (Yfiler(R) Plus kit)

in an Italian population sample. Forensic Sci Int Genet. 2016;21:e1-5.

9. Garcia O, Yurrebaso I, Mancisidor ID, Lopez S, Alonso S, Gusmao L. Data

for 27 Y-chromosome STR loci in the Basque Country autochthonous

population. Forensic Sci Int Genet. 2016;20:e10-2.

10. Al-Enizi M, Ge J, Ismael S, Al-Enezi H, Al-Awadhi A, Al-Duaij W et al.

Population genetic analyses of 15 STR loci from seven forensically-relevant

populations residing in the state of Kuwait. Forensic Sci Int Genet.

2013;7(4):e106-7.

11. Osman AE, Alsafar H, Tay GK, Theyab J, Mubasher M, Sheikh N, et al.

Autosomal short tandem repeat (STR) variation based on 15 loci in a

population from the Central Region (Riyadh Province) of Saudi Arabia. J

Forensic Res. 2015;6(1):1-5.

12. Coudray C, Guitard E, el-Chennawi F, Larrouy G, Dugoujon JM. Allele

frequencies of 15 short tandem repeats (STRs) in three Egyptian populations

of different ethnic groups. Forensic Sci Int. 2007;169(2-3):260-5.

13. Soltis DE, Soltis PS. The role of phylogenetics in comparative genetics. Plant

Physiol. 2003;132:1791-1800.

14. Zanetti D, Sadiq M, Carreras-Torres R, Khabour O, Alkaraki A, Esteban E, et

al. Human diversity in Jordan: Polymorphic Alu insertions in general

Jordanian and Bedouin groups. Hum Biol. 2014;86 (2):131-8.

15. Singh A, Trivedi R, Kashyap VK. Polymorphisms at fifteen tetrameric short

tandem repeat loci in three ethnic populations of Bengal, India. Leg Med.

2006;8(3):191-3.

16. Chang YM, Swaran Y, Phoon YK, Sothirasan K, Sim HT, Lim KB, et al.

Haplotype diversity of 17 Y-chromosomal STRs in three native Sarawak

populations (Iban, Bidayuh and Melanau) in East Malaysia. Forensic Sci Int

Genet. 2009;3(3):e77-80.

17. Qamar R, Ayub Q, Mohyuddin A, Helgason A, Mazhar K, Mansoor A, et al.

Y-Chromosome DNA variation in Pakistan. Am J Hum Genet. 2002;70:1107-

24.

18. Petraglia MD, Haslam M, Fuller DQ, Boivin N, Clarkson C. Out of Africa:

New hypotheses and evidence for the dispersal of Homo sapiens along the

Indian Ocean rim. Ann Hum Biol. 2010;37(3):288-311.

19. Bentayebi K, Abada F, Ihzmad H, Amzazi S. Genetic ancestry of a Moroccan

population as inferred from autosomal STRs. Meta Gene. 2014;2:427-38.

20. Triki-Fendri S, Alfadhli S, Ayadi I, Kharrat N, Ayadi H, Rebai A. Genetic

structure of Kuwaiti population revealed by Y-STR diversity. Ann Hum Biol.

2010;37(6):827-35.

21. Anthony JD, Hearty JA. Eastern Arabian States: Kuwait, Bahrain, Qatar, the

United Arab Emirates, and Oman: The government and politics of the Middle

East and North Africa. Colorado: Westview Press 1980.

22. Kundu S, Ghosh SK. Trend of different molecular markers in the last decades

for studying human migrations. Gene. 2015;556(2):81-90.

23. Richards M, Rengo C, Cruciani F, Gratrix F, Wilson JF, Scozzari R, et al.

Extensive female-mediated gene flow from sub-Saharan Africa into near

eastern Arab populations. Am J Hum Genet. 2003;72(4):1058-64.

24. Triki-Fendri S, Sánchez-Diz P, Rey-González D, Ayadi I, Carracedo Á, Rebai

A. Paternal lineages in Libya inferred from Y-chromosome haplogroups. Am

J Phys Anthropol. 2015;157:242-51.

25. Badro D, Douaihy B, Haber M, Youhanna S, Salloum A, Ghassibe-Sabbagh

M et al. Y-chromosome and mtDNA genetics reveal significant contrasts in

affinities of modern Middle Eastern populations with European and African

populations. PLoS One. 2013;8(1):e54616.

26. Al-Nakib F. Kuwait transformed: A history of oil and urban life. Redwood City:

Stanford University Press 2016.

27. Dupanloup I, Pereira L, Bertorelle G, Calafell F, Prata MJ, Amorim A et al. A

recent shift from polygyny to monogamy in humans is suggested by the

analysis of worldwide Y-chromosome diversity. J Mol Evol. 2003;57:85-97.

28. Al-Krenawi A, Slonim-Nevo V, Graham JR. Polygyny and its impact on the

psychosocial well-being of husbands. J Comp Fam Stud. 2006;37(2):173-189.

Bibliography

Abbas F. Egypt, Arab nationalism, and Nubian diasporic identity in Idris Ali's Dongola:

A novel of Nubia. Res African Lit. 2014;45(3):147-66.

Abed I, Hellyer P. United Arab Emirates: A new perspective. London: Trident Press 2001.

Abdin L, Shimada I, Brinkmann B, Hohoff C. Analysis of 15 short tandem repeats reveals

significant differences between the Arabian populations from Morocco and Syria. Leg

Med. 2003;5:150-S5.

Abu-Amero KK, Gonzalez AM, Larruga JM, Bosley TM, Cabrera VM. Eurasian and

African mitochondrial DNA influences in the Saudi Arabian population. BMC Evol

Biol. 2007;7(32):1-15.

Al-Enizi M, Ge J, Ismael I, Al-Enezi H, Al-Awadhi A, Al-Duaij W et al. Population

genetic analyses of 15 STR loci from seven forensically-relevant populations residing

in the state of Kuwait. Forensic Sci Int Gen. 2013;7(4):e106-e107.

Alenizi M, Goodwin W, Ismael S, Hadi S. STR data for the AmpFlSTR Identifiler loci in

Kuwaiti population. Legal Med. 2008;10(6):321-5.

Ali Alhmoudi O, Jones RJ, Tay GK, Alsafar H, Hadi S. Population genetics data for 21

autosomal STR loci for United Arab Emirates (UAE) population using next generation

multiplex STR kit. Forensic Sci Int. 2015;19:190-1.

Al-Krenawi A, Slonim-Nevo V, Graham J. Polygyny and its impact on the psychosocial

well-being of husbands. J Comp Fam Stud. 2006;37(2):173-189.

Al-Nakib F. Kuwait transformed: A history of oil and urban life. Redwood City: Stanford

University Press 2016.

Alsafar H, Jama-Alol KA, Hassoun AAK, Tay GK. The prevalence of Type 2 Diabetes

Mellitus in the United Arab Emirates: justification for the establishment of the

Emirates Family Registry. Int J Diabetes Dev Ctries. 2012;32(1):25-32.

Al-Semmari F. A history of the Arabian Peninsula. London: I.B. Tauris 2009.

Alshamali F, Alkhayat AQ, Budowle B, Watson ND. STR population diversity in nine

ethnic populations living in Dubai. Forensic Sci Int. 2005;152(2-3):267-79.

Alshamali F, Pereira L, Iacute SA, Budowle B, Poloni ES, et al. Local population structure

in Arabian Peninsula revealed by Y-STR diversity. Hum Hered. 2009;68(1):45-54.

Al-Zahery N, Semino O, Benuzzi G, Magri C, Passarino G, Torroni A, et al. Y-

chromosome and mtDNA polymorphisms in Iraq, a crossroad of the early human

dispersal and of post-Neolithic migrations. Mol Phylogenet Evol. 2003;28(3):458-72.

Anthony JD, Hearty JA. Eastern Arabian States: Kuwait, Bahrain, Qatar, the United Arab

Emirates, and Oman. The Government and Politics of the Middle East and North

Africa. Colorado: Westview Press 1980.

Ashma R, Kashyap VK. Genetic profile based upon 15 microsatellites of four caste groups

of the eastern Indian state, Bihar. Ann Hum Biol. 2003;30(5):570-8.

Azab M, Al-Bashir N, Momani SN, Al-Nasser A, Alkaraki AK, Khabour O. Comparison

between frequencies of several STRs loci in Jordan with neighboring countries. Jordan

Med J. 2010;44(1):55-60.

Badro DA, Douaihy B, Haber M, Youhanna SC, Salloum A, Ghassibe-Sabagh M,

Johnsrud B, et al. Y-chromosome and mtDNA genetics reveal significant contrasts in

affinities of modern Middle Eastern populations with European and African

populations. PLoS One. 2013;8(1):e54616.

Barbujani G, Ghirotto S, Tassi F. Nine things to remember about human genome diversity.

Tissue Antigens. 2013;82(3):155-64.

Barni F, Berti A, Pianese A, Boccellino A, Miller MP, Caperna A, et al. Allele frequencies

of 15 autosomal STR loci in the Iraq population with comparisons to other populations

from the middle-eastern region. Forensic Sci Int. 2007;167(1):87-92.

Bernhard P. Behind the Battle Lines: Italian Atrocities and the Persecution of Arabs,

Berbers, and Jews in North Africa during World War II. Holocaust Genocide Stud.

2012;26(3):425-46.

Bentayebi K, Abada F, Ihzmad H, Amzazi S. Genetic ancestry of a Moroccan population

as inferred from autosomal STRs. Meta Gene. 2014;2:427-38.

Beyin A. The Bab al Mandab vs the Nile-Levant: An appraisal of the two dispersal routes

for early modern humans out of Africa. Africa Archaeol Rev. 2006;23(1-2):5-30.

Bosch E, Clarimon J, Perez-Lezaun A, Calafell F. STR data for 21 loci in northwestern

Africa. Forensic Sci Int. 2001;116:41-51.

Cadenas AM, Zhivotovsky LA, Cavalli-Sforza LL, Underhill PA, Herrera RJ. Y-

chromosome diversity characterizes the Gulf of Oman. Eur J Hum Genet.

2008;16(3):374-86.

Carracedo Á, Butler JM, Gusmão L, Linacre A, Parson W, Roewer L, et al. New

guidelines for the publication of genetic population data. Forensic Sci Int Genet.

2013;7(2):217-20.

Carracedo A, Butler JM, Gusmao L, Linacre A, Parson W, Roewer L, et al. Update of the

guidelines for the publication of genetic population data. Forensic Sci Int Genet.

2014;10:1-2.

Cassar M, Farrugia C, Vidal C. Allele frequencies of 14 STR loci in the population of

Malta. Leg Med. 2008;10(3):153-6.

Chang YM, Swaran Y, Phoon YK, Sothirasan K, Sim HT, Lim KB, et al. Haplotype

diversity of 17 Y-chromosomal STRs in three native Sarawak populations (Iban,

Bidayuh and Melanau) in East Malaysia. Forensic Sci Int Genet. 2009;3(3):e77-80.

Cherni L, Loueslati Yaacoubi B, Pereira L, Alves C, Khodjet-El-Khil H, Ben Ammar El

Gaaied A, et al. Data for 15 autosomal STR markers (Powerplex 16 System) from two

Tunisian populations: Kesra (Berber) and Zriba (Arab). Forensic Sci Int.

2005;147(1):101-6.

Chouery E, Coble MD, Strouss KM, Saunier JL, Jalkh N, Medlej-Hashim M, et al.

Population genetic data for 17 STR markers from Lebanon. Leg Med. 2010;12(6):324-

6.

Cordaux R, Aunger R, Bentley G, Nasidze I, Sirajuddin SM, Stoneking M. Independent

origins of Indian caste and tribal paternal lineages. Curr Biol. 2004;14:231-5.

Coudray C, Guitard E, el-Chennawi F, Larrouy G, Dugoujon JM. Allele frequencies of 15

short tandem repeats (STRs) in three Egyptian populations of different ethnic groups.

Forensic Sci Int. 2007;169(2-3):260-5.

Dalby A. Dictionary of languages: Bakhtiari. London, United Kingdom: A&C Black

2004.

Dalby A. Dictionary of languages: Beja. London, United Kingdom: A&C Black 2004.

Dalby A. Dictionary of languages: Gilaki. London, United Kingdom: A&C Black 2004.

Dalby A. Dictionary of languages: Kurdish. London, United Kingdom: A&C Black 2004.

Dalby A. Dictionary of languages: Luri. London, United Kingdom: A&C Black 2004.

Der Sarkissian C, Brotherton P, Balanovsky O, Templeton JE, Llamas B, Soubrier J, et

al. Mitochondrial genome sequencing in Mesolithic North East Europe Unearths a

new sub-clade within the broadly distributed human haplogroup C1. PLoS One.

2014;9(2):e87612.

Dupanloup I, Pereira L, Bertorelle G, Calafell F, Joāo Prata M, Amorim A, Barbujani G.

A recent shift from polygyny to monogamy in humans is suggested by the analysis of

worldwide Y-chromosome diversity. J Mol Evol. 2003;57:85-97.

El Andari A, Othman H, Taroni F, Mansour I. Population genetic data for 23 STR markers

from Lebanon. Forensic Sci Int Genet. 2013;7(4):e108-13.

Ermini L, Der Sarkissian C, Willerslev E, Orlando L. Major transitions in human

evolution revisited: A tribute to ancient DNA. J Hum Evol. 2015;79:4-20.

Excoffier L, Laval G, Schneider S. Arlequin ver. 3.0: An integrated software package for

population genetics data analysis. Evol Bioinform. 2005;1:47-50.

Fareed M, Afzal M. Genetic structure of human populations based on 5 gene loci: A

preliminary report. Northern India Gene Reports. 2016;4:244-8.

Ferdous A, Ali ME, Alam S, Hasan M, Hossain T, Akhteruzzaman S. Forensic evaluation

of STR data for the PowerPlex 16 System loci in a Bangladeshi population. Leg Med.

2009;11(4):198-9.

Fernandes V, Alshamali F, Alves M, Costa MD, Pereira JB, Silva NM, et al. The Arabian

cradle: mitochondrial relicts of the first steps along the southern route out of Africa.

Am J Hum Genet. 2012;90(2):347-55.

Forster P. Ice ages and the mitochondrial DNA chronology of human dispersals: A review.

Philos Trans R Soc Lond B Biol Sci. 2004;359(1442):255-64.

Gaibar M, Esteban ME, Via M, Harich N, Kandil M, Fernandez-Santander A. Usefulness

of autosomal STR polymorphisms beyond forensic purposes: Data on Arabic- and

Berber-speaking populations from central Morocco. Ann Hum Biol. 2012;39(4):297-

304.

Garcia-Bertrand R, Simms TM, Cadenas AM, Herrera RJ. United Arab Emirates:

phylogenetic relationships and ancestral populations. Gene. 2014;533(1):411-9.

Garcia O, Yurrebaso I, Mancisidor ID, Lopez S, Alonso S, Gusmao L. Data for 27 Y-

chromosome STR loci in the Basque Country autochthonous population. Forensic Sci

Int Genet. 2016;20:e10-2.

Garte S. Human population genetic diversity as a function of SNP type from HapMap

data. Am J Hum Biol. 2010;22(3):297-300.

Gibbs RA, Belmont JW, Hardenbol P, Willis TD, Yu F, Yang H, et al. The international

HapMap project. Nature. 2003;426:789-96.

Gonzalez AM, Karadsheh N, Maca-Meyer N, Flores C, Cabrera VM, Larruga JM.

Mitochondrial DNA variation in Jordanians and their genetic relationship to other

Middle East populations. Ann Hum Biol. 2008;35(2):212-31.

Gottesman L. Jews in the Middle East. Am Jew Yr Book. 1985;85:304-23.

Grugni V, Battaglia V, Kashani B, Parolo S, Al-Zahery N, et al. Ancient migratory

events in the Middle East: New clues from the Y-chromosome variation of modern

Iranians. PLoS One. 2012;7(7):1-14.

Haasl RJ, Payseur BA. Multi-locus inference of population structure: A comparison

between single nucleotide polymorphisms and microsatellites. Heredity.

2011;106:591-7.

Hellenthal G, Busby G, Band G, Wilson JF, Capelli C, Falush D, et al. A genetic atlas of

human admixture history. Science. 2014;343:747-51.

His Beatitude the Patriarch Mar E. Assyrians in the Middle East. J Royl Cent Asian Soc.

1953;40(2):151-60.

Hisyar O. Introduction: The Kurds' ordeal with Turkey in a transforming Middle East.

Dialect Anthropol. 2013;37(1):103-11.

Hodgson J, Mulligan C, Al-Meeri A, Raaum R. Early back-to-Africa migration into the

horn of Africa. PLoS Genet 2014;10(6):1-18.

Hovannisian R. The ebb and flow of the Armenian minority in the Arab Middle East. The

Middle East Journal. 1974;28 (1):19-32.

Hovannisyan A, Khachatryan Z, Haber M, Hrechdakian P, Karafet T, Zalloua P, et al.

Different waves and directions of Neolithic migrations in the Armenian highland.

Invest Genet. 2014;5(15):1-11.

Huston KA. Statistical Analysis of STR Data. Profiles in DNA. 1998;1(3):14-5.

Immel UD, Kleiber M, Klintschar M. Y-chromosomal STR haplotypes in an Arab

population from Yemen. Int Congress Series 2004;1261:340-3.

Khodjet-el-Khil H, Fadhlaoui-Zid K, Gusmao L, Alves C, Benammar-Elgaaied A,

Amorim A. Allele frequencies for 15 autosomal STR markers in the Libyan

population. Ann Hum Biol. 2012;39(1):80-3.

Kruskal JB. Nonmetric Multidimensional Scaling: A numerical method. Psychometrika.

1964;29(2):115-29.

Kumar S, Kingsley C, DiStefano JK. The human genome project: Where are we now and

where are we going? 2015:7-31.

Kundu S, Ghosh SK. Trend of different molecular markers in the last decades for studying

human migrations. Gene. 2015;556(2):81-90.

Lewis JE. Iraqi Assyrians: Barometer of pluralism. Middle East Quarterly.

2003;10(3):49-57.

Maca-Meyer N, González AM, Pestano J, Flores C, Larruga JM, Cabrera VM.

Mitochondrial DNA transit between West Asia and North Africa inferred from U6

phylogeography. BMC Genet. 2003;4:1-11.

Manni F, Leonardi P, Barakat A, Rouba H, Heyer E, Klintschar M, et al. Y-chromosome

analysis in Egypt suggests a genetic regional continuity in Northeastern Africa. Hum

Biol. 2002;74(5):645-58.

Martin L. DNA tribes. DNA Tribes 2013:1-17.

Middleton D. HLA typing from serology to sequencing era. Iran J Allergy Asthma

Immunol. 2005;4(2):53-66.

Miles SB. The countries and tribes of the Persian Gulf. London: Harrison and Sons 1919.

Mogib M. Copts in Egypt and their demands: Between inclusion and exclusion. Contemp

Arab Affairs. 2012;5(4):535-55.

Mohammad T, Xue Y, Evison YM, Tyler-Smith C. Genetic structure of nomadic Bedouin

from Kuwait. Heredity. 2009;103(5):425-33.

Nakagome S, Alkorta-Aranburu G, Amato R, Peter B, Hudson R, Di Rienzo A. Estimating

the ages of selection signals from different epochs in human history. Mol Biol Evol.

2015;33(3):657-669.

Nazir M, Alhaddad H, Alenizi M, Alenizi H, Taqi Z, Sanqoor S, et al. A genetic overview

of 23Y-STR markers in UAE population. Forensic Sci Int Genet. 2016;23:150-2.

Nei M. Molecular Evolutionary Genetics. New York, USA: Columbia University Press

1987.

Novembre J, Peter BM. Recent advances in the study of fine-scale population structure in

humans. Curr Opin Genet Dev. 2016;41:98-105.

Osman AE, Alsafar H, Tay GK, Theyab J, Mubasher M, Sheikh N, et al. Autosomal short

tandem repeat (STR) variation based on 15 loci in a population from the Central

Region (Riyadh Province) of Saudi Arabia. J Forensic Res. 2015;6(1):1-5.

Payseur BA, Jing P. A genomewide comparison of population structure at STRs and nearby

SNPs in humans. Mol Biol Evol. 2009;26:1369-77.

Perez-Miranda AM, Alfonso-Sanchez MA, Pena JA, Herrera RJ. Qatari DNA variation at

a crossroad of human migrations. Hum Hered. 2006;61(2):67-79.

Peakall R, Smouse PE. GenAlEx 6.5: Genetic analysis in Excel. Population genetic

software for teaching and research--an update. Bioinformatics. 2012;28(19):2537-9.

Peakall R, Smouse PE. GenAlEx 6: Genetic analysis in Excel. Population genetic

software for teaching and research. Mol Ecol Notes. 2006;6(1):288-95.

Petraglia MD, Haslam M, Fuller DQ, Boivin N, Clarkson C. Out of Africa: New

hypotheses and evidence for the dispersal of Homo sapiens along the Indian Ocean

rim. Ann Hum Biol. 2010;37(3):288-311.

Petraglia MD, Rose JI. The evolution of human populations in Arabia:

Paleoenvironments, prehistory and genetics. Netherlands: Springer 2009.

Pickrahn I, Muller E, Zahrer W, Dunkelmann B, Cemper-Kiesslich J, Kreindl G, et al.

Yfiler® Plus amplification kit validation and calculation of forensic parameters for

two Austrian populations. Forensic Sci Int Genet. 2016;21:90-4.

Potter LG. The Persian Gulf in History. 1st ed. New York: Palgrave Macmillan 2009.

Putman AI, Carbone I. Challenges in analysis and interpretation of microsatellite data for

population genetic studies. Ecol Evol. 2014.

Qamar R, Ayub Q, Mohyuddin A, Helgason A, Mazhar K, Mansoor A, et al. Y-

Chromosome DNA variation in Pakistan. Am J Hum Genet. 2002;70:1107-24.

Rakha A, Yu B, Hadi S, Sheng-bin L. Population genetic data on 15 autosomal STRs in a

Pakistani population sample. Leg Med. 2009;11(6):305-7.

Ram P. Yemen History and Culture. AnVi OpenSource Knowledge Trust: GBO 2015.

Rapone C, D'Atanasio E, Agostino A, Mariano M, Papaluca MT, Cruciani F, et al.

Forensic genetic value of a 27 Y-STR loci multiplex (Yfiler® Plus kit) in an Italian

population sample. Forensic Sci Int Genet. 2016;21:e1-5.

Rashidvash V. Iranian People and the Origin of the Turkish-speaking Population of the

North-western Iran. Canadian Social Sci. 2012;8(2):132-9.

Reilly B. Revisiting Consanguineous Marriage in the Greater Middle East: Milk, Blood,

and Bedouins. Am Anthropol. 2013;115(3):374-87.

Richards M, Rengo C, Cruciani F, Gratrix F, Wilson JF, Scozzari R, et al. Extensive

female-mediated gene flow from sub-Saharan Africa into near eastern Arab

populations. Am J Hum Genet. 2003;72(4):1058-64.

Roewer L, Willuweit S, Stoneking M, Nasidze I. A Y-STR database of Iranian and

Azerbaijanian minority populations. Forensic Sci Int Genet. 2009;4:53-5.

Ruitberg CM, Reeder DJ, Butler JM. STRBase: A short tandem repeat DNA database for

the human identity testing community. Nucl Acids Res. 2001;29(1):320-322.

Schneider PM. Scientific standards for studies in forensic genetics. Forensic Sci Int.

2007;165(2-3):238-43.

Seielstad M, Bekele E, Ibrahim M, Toure A, Traore M. A view of modern human origins

from Y-chromosome microsatellite variation. Genome Res. 1999;9:558-67.

Shendure J. Human genomics: A deep dive into genetic variation. Nature. 2016;536:277-78.

Shepard EM, Herrera RJ. Genetic encapsulation among Near Eastern populations. J Hum

Genet. 2006;51(5):467-76.

Shepard EM, Herrera RJ. Iranian STR variation at the fringes of biogeographical

demarcation. Forensic Sci Int. 2006;158(2-3):140-8.

Shetty P. Lihadh Al-Gazali: a leading clinical geneticist in the Middle East. The Lancet.

2006;367(9515):979.

Shoup J. Ethnic Groups of Africa and the Middle East: An Encyclopedia. California:

ABC-CLIO 2011.

Silva NM, Pereira L, Poloni ES, Currat M. Human neutral genetic variation and forensic

STR data. PLoS One. 2012;7(11).

Singh A, Trivedi R, Kashyap VK. Polymorphisms at fifteen tetrameric short tandem

repeat loci in three ethnic populations of Bengal, India. Leg Med. 2006;8(3):191-3.

Soltis DE, Soltis PS. The role of phylogenetics in comparative genetics. Plant Physiol.

2003;132:1791-1800.

Sun JX, Mullikin JC, Patterson N, Reich D. Microsatellites are molecular clocks that

support accurate inferences about history. Mol Biol Evol. 2009;26:1017-27.

Tachjian V. Gender, nationalism, exclusion: The reintegration process of female survivors

of the Armenian genocide. Nations and Nationalism. 2009;15(1):60-80.

Tadmouri GO, Nair P, Obeid Y, Al-Ali MT, Al-Khaja N, Hamamy HA. Consanguinity

and reproductive health among Arabs. Reprod Health. 2009;6:17.

Tadmouri GO, Sastry KS, Chouchane L. Arab gene geography: From population

diversities to personalized medical genomics. Glob Cardiol Sci Pract.

2014;2014(4):394-408.

Thangaraj K, Ramana GV, Singh L. Y-chromosome and mitochondrial DNA

polymorphisms in Indian populations. Electrophoresis. 1999;20:1743-7.

Thesiger W. Desert borderlands of Oman. Geograph J. 1950;116(4/6):137-68.

Triki-Fendri S, Alfadhli S, Ayadi I, Kharrat N, Ayadi H, Rebai A. Genetic structure of

Kuwaiti population revealed by Y-STR diversity. Ann Hum Biol. 2010;37(6):827-35.

Triki-Fendri S, Sánchez-Diz P, Rey-González D, Ayadi I, Carracedo Á, Rebai A. Paternal

lineages in Libya inferred from Y-chromosome haplogroups. Am J Phys Anthropol.

2015;157:242-51.

Underhill PA, Kivisild T. Use of Y chromosome and mitochondrial DNA population

structure in tracing human migrations. Annu Rev Genet. 2007;41:539-64.

Valeri M. Nation-building and communities in Oman since 1970: The Swahili-Speaking

Omani in search of identity. African Affairs. 2007;106(424):479-96.

Willuweit S, Roewer L. The new Y Chromosome Haplotype Reference Database.

Forensic Sci Int Genet. 2015;15:43-8.

Zanetti D, Sadiq M, Carreras-Torres R, Khabour O, Alkaraki A, Esteban E, et al. Human

diversity in Jordan: Polymorphic Alu insertions in general Jordanian and Bedouin

groups. Hum Biol. 2014;86(2):131-8.

Zayed H. The Arab genome: Health and wealth. Gene. 2016;592:239-243.

Zeggini E, Rayner W, Morris AP, Hattersley AT, Walker M, Hitman GA, et al. An

evaluation on HapMap sample size and tagging SNP performance in large-scale

empirical and simulated data sets. Nature Genet. 2005;37(12):1320-2.

Zhang D, Hewitt G. Nuclear DNA analyses in genetic studies of populations: Practice,

problems and prospects. Mol Ecol. 2003;12:563-84.

APPENDICES

Appendix 1: Published manuscript in the form of a Letter to the Editor for

Forensic Science International: Genetics (Chapter 3).

Appendix 2: Manuscript in Press in the form of a Letter to the Editor for Forensic

Science International: Genetics (Chapter 4).

Appendix 3: Exact test (FST) between 15 loci of 20 populations compared to the present GlobalFiler UAE population study

(Chapter 5).

POPULATION/LOCI | LEBANON | JORDAN | SYRIA | IRAQ | KUWAIT | SAUDI ARABIA | QATAR | UAE | OMAN | YEMEN

D3S1358 | 0.00015±0.0001 | n/a | 0.41405±0.0434 | 0.04510±0.0084 | 0.00000±0.0000 | 0.00020±0.0002 | 0.01755±0.053 | 0.89195±0.0225 | 0.38170±0.0363 | 0.80405±0.0255

VWA | 0.00000±0.0000 | 0.00000±0.0000 | 0.03510±0.0189 | 0.08580±0.0149 | 0.00000±0.0000 | 0.01070±0.0095 | 0.33510±0.0712 | 0.83715±0.0297 | 0.24680±0.0469 | 0.00005±0.0001

D16S539 | 0.00000±0.0000 | n/a | 0.00330±0.0037 | 0.52110±0.0436 | 0.00000±0.0000 | 0.00280±0.0016 | 0.00000±0.0000 | 0.26055±0.0395 | 0.87985±0.0278 | 0.22065±0.0291

CSF1PO | 0.00645±0.0031 | 0.00055±0.0002 | 0.34240±0.0492 | 0.71080±0.0522 | 0.00000±0.0000 | 0.00095±0.0007 | 0.40180±0.0368 | 0.60405±0.0319 | 0.17475±0.0302 | 0.30315±0.0504

TPOX | 0.00225±0.0024 | 0.45285±0.0293 | 0.08390±0.0144 | 0.32490±0.0387 | 0.00125±0.0010 | 0.00005±0.0001 | 0.37420±0.0446 | 0.89205±0.0119 | 0.18705±0.0415 | 0.28760±0.0346

D21S11 | 0.00000±0.0000 | n/a | 0.00000±0.0000 | 0.58890±0.0428 | 0.05385±0.0151 | 0.00000±0.0000 | 0.07435±0.0244 | 0.81435±0.0249 | 0.02800±0.0123 | 0.15185±0.0476

D8S1179 | 0.00000±0.0000 | n/a | 0.07835±0.0186 | 0.63605±0.0239 | 0.00160±0.0016 | 0.05120±0.0310 | 0.25580±0.0566 | 0.96235±0.0073 | 0.61750±0.0518 | 0.03670±0.0119

D18S51 | 0.00105±0.0010 | n/a | 0.32245±0.0478 | 0.85870±0.0250 | 0.00195±0.0012 | 0.00000±0.0000 | 0.89375±0.0320 | 0.56330±0.0520 | 0.68685±0.0561 | 0.64090±0.0478

D19S433 | 0.00055±0.0006 | n/a | n/a | 0.14285±0.0343 | 0.00055±0.0004 | 0.00025±0.0003 | 0.00000±0.0000 | 0.89345±0.0240 | n/a | n/a

TH01 | 0.05120±0.0254 | 0.00000±0.0000 | 0.08825±0.0379 | 0.85055±0.0282 | 0.86470±0.0423 | 0.00000±0.0000 | 0.38910±0.0460 | 0.75295±0.0318 | 0.05590±0.0091 | 0.40555±0.0551

FGA | 0.00270±0.0012 | n/a | 0.04710±0.0256 | 0.83170±0.0371 | 0.00855±0.0062 | 0.00120±0.0012 | 0.02235±0.0071 | 0.05855±0.0146 | 0.68955±0.0338 | 0.45740±0.0664

D5S818 | 0.11850±0.0534 | n/a | 0.24125±0.0526 | 0.56710±0.0416 | 0.00010±0.0001 | 0.00775±0.0037 | 0.00000±0.0000 | 0.82240±0.0298 | 0.59420±0.0425 | 0.08425±0.0229

D13S317 | 0.00310±0.0034 | 0.00000±0.0000 | 0.01950±0.0145 | 0.00010±0.0000 | 0.00000±0.0000 | 0.00000±0.0000 | 0.0000±0.0000 | 0.01110±0.0069 | 0.05195±0.0084 | 0.15140±0.0219

D7S820 | 0.00020±0.0002 | 0.00000±0.0000 | 0.41285±0.0529 | 0.57535±0.0377 | 0.00000±0.0000 | 0.00230±0.0016 | 0.11290±0.0207 | 0.03695±0.0202 | 0.22060±0.0250 | 0.06100±0.0127

D2S1338 | 0.00000±0.0000 | n/a | n/a | 0.00985±0.0048 | 0.00660±0.0032 | 0.00000±0.0000 | 0.49100±0.0380 | 0.05815±0.0077 | n/a | n/a

Appendix 3 continued

POPULATION/LOCI | MOROCCO | ALGERIA | TUNISIA | MALTA | LIBYA | EGYPT | IRAN | PAKISTAN | INDIA | BANGLADESH

D3S1358 | 0.00785±0.0055 | 0.68355±0.0353 | 0.00120±0.0006 | 0.00235±0.0012 | 0.04350±0.0135 | 0.00000±0.0000 | 0.66735±0.0345 | 0.09405±0.0404 | 0.60160±0.0267 | 0.01935±0.0076

VWA | 0.00000±0.0000 | 0.01205±0.0059 | 0.09510±0.0215 | 0.12370±0.0290 | 0.16250±0.0301 | 0.04930±0.0131 | 0.12090±0.0461 | 0.00000±0.0000 | 0.00000±0.0000 | 0.00000±0.0000

D16S539 | 0.00000±0.0000 | n/a | 0.16350±0.0188 | 0.00000±0.0000 | 0.00000±0.0000 | 0.02850±0.0083 | 0.00635±0.0029 | 0.04595±0.0192 | 0.00000±0.0000 | 0.03105±0.0074

CSF1PO | 0.00000±0.0000 | 0.00060±0.0003 | 0.00000±0.0000 | 0.00010±0.0001 | 0.03450±0.0105 | 0.06200±0.0154 | 0.75345±0.0345 | 0.85235±0.0381 | 0.04465±0.0340 | 0.04920±0.0214

TPOX | 0.00000±0.0000 | 0.96720±0.0084 | 0.00000±0.0000 | 0.71595±0.0339 | 0.10095±0.0113 | 0.02200±0.0072 | 0.09820±0.0195 | 0.00000±0.0000 | 0.00000±0.0000 | 0.00020±0.0002

D21S11 | 0.00000±0.0000 | 0.56260±0.0290 | 0.05110±0.0110 | 0.00000±0.0000 | 0.01495±0.0078 | 0.00000±0.0000 | 0.29100±0.0709 | 0.00355±0.0015 | 0.00325±0.0034 | 0.12340±0.0298

D8S1179 | 0.00000±0.0000 | 0.00000±0.0000 | 0.24440±0.0252 | 0.00000±0.0000 | 0.52045±0.0447 | 0.06915±0.0180 | 0.02765±0.0118 | 0.00000±0.0000 | 0.00000±0.0000 | 0.00000±0.0000

D18S51 | 0.00000±0.0000 | 0.28960±0.0313 | 0.00000±0.0000 | 0.22510±0.0381 | 0.00000±0.0000 | 0.00000±0.0000 | 0.11170±0.0315 | 0.01610±0.0179 | 0.00040±0.0004 | 0.01735±0.0141

D19S433 | 0.00000±0.0000 | n/a | n/a | n/a | 0.80720±0.0325 | 0.21340±0.0589 | 0.76840±0.0409 | n/a | 0.00000±0.0000 | n/a

TH01 | 0.00000±0.0000 | 0.05420±0.0170 | 0.04410±0.0090 | 0.00000±0.0000 | 0.13570±0.0446 | 0.03265±0.0116 | 0.93710±0.0185 | 0.02390±0.0112 | 0.00495±0.0029 | 0.00170±0.0014

FGA | 0.00000±0.0000 | 0.02580±0.0147 | 0.00000±0.0000 | 0.07365±0.0213 | 0.65040±0.0548 | 0.00000±0.0000 | 0.02470±0.0192 | 0.00655±0.0044 | 0.06725±0.0261 | 0.49345±0.0757

D5S818 | 0.00000±0.0000 | 0.03230±0.0105 | 0.00000±0.0000 | 0.00000±0.0000 | 0.00000±0.0000 | 0.00000±0.0000 | 0.91200±0.0196 | 0.00020±0.0002 | 0.00000±0.0000 | 0.00925±0.0061

D13S317 | 0.00000±0.0000 | 0.37445±0.0406 | 0.01230±0.0026 | 0.00075±0.0005 | 0.72170±0.0383 | 0.25175±0.0486 | 0.00015±0.0002 | 0.00010±0.0001 | 0.00000±0.0000 | 0.00000±0.0000

D7S820 | 0.00000±0.0000 | 0.00115±0.0005 | 0.05460±0.0191 | 0.05200±0.0273 | 0.00200±0.0013 | 0.06475±0.0138 | 0.12705±0.0131 | 0.04045±0.0101 | 0.00000±0.0000 | 0.00015±0.0002

D2S1338 | 0.00000±0.0000 | n/a | n/a | 0.20115±0.0429 | 0.00375±0.0019 | 0.00000±0.0000 | 0.00275±0.0023 | n/a | 0.00000±0.0000 | n/a

“n/a” indicates that the locus was absent from the publication. Significant differences (P-value ≤ 0.05) are shown in bold. Exact tests used 20,000 Markov steps.
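The significance rule in the note above (P-value ≤ 0.05) can be applied directly to the table cells, which is useful where bold formatting is not available. The following is a minimal sketch, not the procedure used in the thesis: the function names `parse_cell` and `is_significant` are hypothetical, and it assumes each cell is either "n/a" or a P-value and its standard error joined by "+-" or "±".

```python
import re

# One table cell: P-value, a "+-" or "±" separator, then the standard error.
CELL = re.compile(r"^(\d+\.\d+)\s*(?:\+-|±)\s*(\d+\.\d+)$")


def parse_cell(cell):
    """Return (p_value, std_error), or None for an 'n/a' cell (hypothetical helper)."""
    cell = cell.strip()
    if cell.lower() == "n/a":
        return None
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def is_significant(cell, alpha=0.05):
    """True when the cell holds a P-value at or below alpha; 'n/a' is never significant."""
    parsed = parse_cell(cell)
    return parsed is not None and parsed[0] <= alpha


# Three D19S433 cells from the continued table (Morocco, Algeria, Libya).
row = {
    "MOROCCO": "0.00000+-0.0000",
    "ALGERIA": "n/a",
    "LIBYA": "0.80720+-0.0325",
}
flags = {pop: is_significant(cell) for pop, cell in row.items()}
print(flags)
```

Under these assumptions, Morocco is flagged as significantly different from the present UAE sample at D19S433, Libya is not, and Algeria is skipped because the locus was not reported.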

Appendix 4: Haplotype frequencies for 217 male individuals from UAE population using 27 Y-STR loci from Y-Filer PLUS

Amplification kit (Chapter 7).

Haplotype DYS576 DYS389I DYS635 DYS389II DYS627 DYS460 DYS458 DYS19 YGATAH4 DYS448 DYS391 DYS456 DYS390 DYS438 DYS392 DYS518 DYS570 DYS437 DYS385a DYS385b DYS449 DYS393 DYS439 DYS481 DYF387S1I DYF387S1II DYS533 n

1 12 13 22 30 19 10 15 13 11 19 9 14 23 10 11 39 18 14 16 17 29 14 12 23 37 38 11 1

2 15 12 21 28 20 10 17 16 11 22 10 15 21 10 11 35 18 16 13 15 27 14 11 21 38 38 10 1

3 15 12 21 29 20 10 15 15 11 21 10 15 21 11 11 36 19 14 17 18 29 13 11 27 38 39 11 1

4 15 12 24 28 19 11 15 14 12 19 10 15 22 11 14 39 15 14 13 17 25 11 12 23 35 40 11 1

5 15 12 24 28 23 11 17 14 11 18 11 15 25 11 11 38 19 14 14 19 29 13 11 25 38 38 12 1

6 15 12 25 28 22 11 18 14 11 20 11 15 25 11 11 39 20 14 14 19 28 13 11 25 38 39 11 1

7 15 13 20 28 19 10 17 16 11 19 10 16 23 7 13 36 16 14 13 16 33 13 11 22 37 38 11 1

8 15 13 21 29 19 11 17 15 11 22 9 18 21 11 11 43 16 14 17 17 30 15 11 27 38 39 12 1

9 15 13 21 29 19 11 18.2 14 11 20 10 15 26 10 11 40 19 14 12 18 25 12 11 25 37 40 11 1

10 15 13 21 29 23 10 16 14 10 22 10 17 21 10 11 38 18 16 13 15 29 15 11 21 39 39 10 1

11 15 13 21 31 18 10 17 15 12 22 10 15 21 11 11 39 19 14 16 18 29 13 12 27 36 39 11 1

12 15 13 22 29 20 10 16 14 11 21 10 15 23 9 11 39 17 15 13 16 31 12 11 24 36 39 12 1

13 15 13 22 29 20 10 16 15 11 21 10 15 23 9 11 39 17 15 13 16 31 12 13 25 36 41 12 1

14 15 13 22 29 21 10 16 13 11 21 10 15 23 9 11 37 16 15 12 16 30 12 11 24 36 40 12 2

15 15 13 23 29 23 10 17 14 10 22 10 17 21 10 11 38 18 16 13 15 30 15 12 21 38 38 11 1

16 15 14 20 31 21 11 15 13 12 19 11 15 24 9 13 38 13 14 16 16 33 13 11 24 37 38 10 1

17 15 14 21 31 22 10 16 14 11 19 10 15 23 9 13 35 18 14 14 16 34 13 11 22 37 37 11 1

18 15 14 21 31 22 10 16 14 11 19 10 15 23 9 13 35 17 14 14 16 34 13 11 22 37 37 11 1

19 15 14 21 31 22 10 16 14 11 19 10 15 23 9 13 35 17 14 14 16 34 13 11 22 37 38 11 1

20 15 14 21 32 21 10 15 15 12 21 10 15 21 11 11 37 19 14 17 18 30 13 11 30 37 39 11 1

21 15 14 23 30 20 10 16 14 11 20 10 15 23 9 11 39 14 15 13 16 30 12 11 24 36 40 12 2

22 16 11 21 28 20 11 17 14 10 22 11 15 24 10 12 37 18 16 14 16 27 13 12 24 38 39 10 1

23 16 12 21 29 19 12 17 15 11 19 11 13 24 9 11 37 15 15 14 18 29 13 11 23 36 38 9 1

24 16 12 21 30 18 9 16 14 11 21 10 15 23 10 13 37 19 16 14 16 29 13 13 27 39 39 10 1

25 16 12 23 28 20 10 15 15 12 16 10 16 23 9 14 37 20 14 13 13 29 13 11 27 37 38 11 1

26 16 13 20 31 21 11 17.2 14 11 19 10 14 23 10 12 38 18 14 13 17 25 12 11 27 36 39 11 1

27 16 13 21 29 20 10 15 15 12 21 10 15 24 9 11 38 17 14 16 16 30 12 12 25 39 40 12 1

28 16 13 21 29 20 10 15 15 12 21 9 15 23 9 11 39 16 15 13 16 30 12 11 22 39 39 12 1

29 16 13 21 30 18 11 16 15 11 21 10 16 21 11 11 41 17 15 16 16 31 14 12 27 37 39 11 1

30 16 13 21 30 18 11 16 16 11 21 10 16 21 11 11 42 17 14 17 18 31 14 13 26 38 40 11 1

31 16 13 21 31 19 11 16 15 12 21 10 15 21 11 11 38 19 14 16 17 28 13 13 27 39 39 11 1

32 16 13 21 31 20 10 17 16 12 20 10 15 21 11 11 39 19 14 16 17 28 13 11 28 37 39 11 1

33 16 13 22 29 19 10 15 14 11 22 10 17 23 9 12 38 17 15 13 17 32 12 12 21 40 41 12 1

34 16 13 22 30 18 9 15 13 11 20 10 17 24 10 11 38 20 14 15 17 32 13 11 22 35 38 12 1

35 16 13 22 30 19 9 15 13 11 20 10 17 24 10 11 40 21 14 15 17 32 13 11 22 35 38 12 1

36 16 13 22 30 16.3 11 18.2 14 11 19 11 14 23 10 11 38 18 14 13 18 25 12 11 25 37 37 11 1

37 16 13 22 30 23 10 20.2 14 11 20 11 14 23 10 11 39 18 14 13 19 25 12 11 26 37 37 11 1

38 16 13 22 30 23 11 16 14 11 21 10 15 22 9 11 38 17 15 13 13 30 12 11 23 39 40 11 1

39 16 13 23 30 21.2 11 18.2 14 11 19 10 14 23 10 11 38 18 14 13 18 25 12 12 25 37 37 11 1

40 16 14 19 32 20 10 17 15 11 21 9 15 21 10 11 40 17 16 12 12 34 13 11 25 35 37 11 1

41 16 14 21 30 22 10 17 14 11 19 10 15 23 9 13 35 18 14 14 16 34 13 11 22 37 37 11 1

42 16 14 21 30 22 10 18 14 11 19 10 16 24 9 13 35 17 14 14 15 33 13 10 22 36 37 12 1

43 16 14 21 31 21 10 17 14 11 19 10 16 23 9 13 35 16 14 14 16 35 13 10 21 37 38 12 1

44 16 14 21 31 22 10 17 14 11 20 10 15 23 9 13 35 17 14 14 16 34 13 11 22 37 37 11 1

45 16 14 21 31 22 11 17 14 11 19 10 15 23 9 13 35 17 14 12 16 34 13 11 22 37 38 11 1

46 16 14 22 32 20 11 16 14 11 20 10 15 23 11 11 36 19 14 16 17 33 14 11 25 36 36 11 1

47 16 14 23 30 20 10 15 14 12 21 10 16 23 9 11 38 20 15 14 18 33 13 11 23 38 41 12 1

48 16 14 23 30 21 10 16 14 12 20 10 15 24 9 11 38 16 15 13 15 31 12 11 24 35 40 12 1

49 16 14 23 31 18 11 16 15 11 20 10 17 21 11 11 41 17 14 16 18 32 15 11 26 37 38 11 1

50 16 14 24 31 19 10 16 15 11 21 10 16 21 11 11 39 20 14 16 17 28 13 12 27 36 38 11 1

51 16 15 21 32 22 10 17 14 11 19 10 15 23 10 13 35 17 14 14 15 32 13 11 22 37 37 11 1

52 17 12 21 28 22 10 14 14 11 18 10 15 22 10 16 37 17 16 15 16 27 11 13 25 39 39 12 1

53 17 12 22 28 25 11 15 14 11 21 10 14 23 9 11 35 16 15 14 15 28 12 12 22 39 39 11 1

54 17 12 23 28 20 11 15 14 12 20 10 15 22 11 16 38 15 15 13 17 26 11 13 24 39 41 12 1

55 17 12 23 28 23 11 14 14 12 19 10 15 22 10 14 38 15 15 14 16 25 11 12 25 34 41 12 1

56 17 12 23 29 20 11 15 15 12 18 10 15 24 11 14 38 18 14 13 16 28 15 11 26 36 38 13 1

57 17 13 20 30 20 11 17 14 12 20 10 16 24 10 11 38 21 14 17 18 32 13 12 21 35 36 12 1

58 17 13 20 30 21 10 17.2 15 11 20 10 15 23 10 11 38 18 14 13 18 26 12 11 25 35 38 11 2

59 17 13 20 30 21 11 18 14 12 20 10 17 24 10 11 36 20 14 17 18 30 13 12 21 35 38 12 1

60 17 13 20 30 21 11 18 14 12 20 10 18 25 10 11 37 21 14 17 18 32 13 12 21 35 37 12 1

61 17 13 20 30 22 11 17 14 11 20 10 16 24 10 11 38 20 14 17 18 31 13 12 21 35 37 12 1

62 17 13 20 30 22 11 17 14 12 20 10 16 24 10 11 38 21 14 18 18 31 13 12 21 35 37 12 1

63 17 13 20 31 21 11 17.2 14 11 20 11 15 23 10 11 36 18 14 14 19 26 12 11 26 35 39 10 1

64 17 13 21 29 17 10 15 15 11 20 10 15 24 9 11 34 18 15 13 18 28 12 13 24 39 39 11 1

65 17 13 21 29 19 10 18 14 10 19 10 15 22 9 13 34 19 14 15 20 34 13 11 24 36 39 11 1

66 17 13 21 29 20 11 17.2 14 11 20 10 15 23 10 11 40 17 14 15 19 25 12 11 26 36 38 11 2

67 17 13 21 29 23 10 18.2 14 11 21 10 14 23 10 11 38 18 14 13 17 25 12 11 25 37 38 11 1

68 17 13 21 30 20 10 18.2 14 11 20 11 14 23 10 11 38 17 14 13 19 26 12 11 26 37 37 11 1

69 17 13 21 30 20 10 18.2 14 11 20 11 14 23 10 11 38 19 14 13 19 27 12 11 26 37 37 11 1

70 17 13 21 30 21 10 18.2 14 10 20 11 14 23 10 11 39 18 14 13 19 25 12 11 25 36 36 11 1

71 17 13 21 30 21 10 18.2 14 11 20 11 14 23 10 11 40 18 14 13 18 25 12 11 26 37 37 11 1

72 17 13 21 30 21 10 19.2 14 11 20 10 14 23 10 11 37 17 14 13 17 25 12 12 24 35 39 11 1

73 17 13 21 30 21 11 14 13 10 20 10 15 25 10 11 37 19 14 16 17 32 13 11 24 37 39 12 1

74 17 13 21 30 21 11 15 13 10 20 10 15 25 10 11 36 19 14 16 17 33 13 11 25 38 38 12 1

75 17 13 21 30 21 11 19.2 14 11 20 10 14 23 10 11 37 17 14 13 17 25 12 12 24 35 40 11 1

76 17 13 21 30 21 12 16.2 15 11 20 11 15 23 10 11 39 18 14 13 17 25 12 11 25 36 37 11 1

77 17 13 21 30 22 10 19.2 14 11 20 11 14 23 10 11 41 17 14 12 18 25 12 13 26 37 38 11 1

78 17 13 21 31 18 10 17 13 11 20 10 15 24 10 11 40 17 14 16 17 31 12 13 25 38 39 11 1

79 17 13 21 31 18 10 18 14 10 18 10 15 24 9 13 35 18 14 15 16 34 13 12 23 38 39 11 1

80 17 13 21 31 19 9 18 13 11 20 10 15 24 10 11 41 18 14 14 15 32 14 10 27 39 39 10 1

81 17 13 21 32 18 10 18 13 11 20 10 14 24 10 11 38 18 14 16 18 31 12 12 24 38 38 11 1

82 17 13 21 32 19 9 18 13 11 20 10 15 24 10 11 41 18 14 14 15 32 14 10 27 39 39 10 1

83 17 13 22 29 20 10 18 13 10 19 10 15 22 11 15 38 16 14 14 16 30 13 12 24 35 36 11 1

84 17 13 22 30 18 10 16 13 11 20 9 15 25 10 11 40 19 14 15 18 34 14 11 22 36 38 11 1

85 17 13 23 28 20 11 17 14 12 19 11 15 25 12 14 40 17 15 11 14 28 12 12 22 35 36 12 1

86 17 13 23 29 18 11 15 15 13 20 10 17 25 11 11 43 18 14 11 14 32 13 10 23 37 39 12 1

87 17 13 23 29 23 11 15 14 13 18 11 14 24 12 13 38 17 15 11 14 28 12 12 22 35 35 11 1

88 17 13 23 30 17 11 15 16 12 20 11 15 25 11 11 39 18 14 11 14 34 13 10 21 37 39 12 1

89 17 13 23 30 17 11 17 16 11 20 11 16 24 11 11 40 18 14 11 14 32 13 10 23 37 38 12 1

90 17 13 23 30 21 9 15 13 12 20 10 16 24 10 11 43 17 14 16 18 33 13 12 22 35 38 12 1

91 17 13 23 30 21 9 15 13 12 20 10 16 24 10 11 43 17 14 16 18 33 13 12 22 35 37 12 1

92 17 13 26 30 21 11 16 14 12 20 10 17 23 11 13 43 16 15 12 12 30 13 11 23 37 38 12 1

93 17 14 17 32 23 11 17 15 11 20 10 12 24 11 11 39 15 14 13 14 31 13 11 22 35 42.2 12 1

94 17 14 18 31 16 11 17 15 11 21 10 15 21 11 11 42 18 14 16 16 29 13 11 25 37 37 11 1

95 17 14 18 31 16 11 17 16 11 21 10 15 21 11 11 42 18 14 16 16 29 13 11 25 37 37 11 1

96 17 14 20 31 19 10 17 13 12 20 10 16 23 9 13 34 18 15 15 15 33 13 11 23 37 37 11 1

97 17 14 21 30 19 10 15 15 11 18 10 14 23 9 11 40 18 14 14 20 32 12 14 20 37 41 12 1

98 17 14 21 30 20 10 15 14 11 20 10 16 22 9 11 40 21 15 13 18 33 12 11 23 37 39 11 1

99 17 14 21 30 22 10 14 13 12 19 10 17 22 9 13 39 18 15 14 15 34 13 11 26 38 40 12 1

100 17 14 21 31 18 11 16 13 12 20 9 15 24 10 11 39 19 14 15 17 29 14 12 25 36 37 12 1

101 17 14 21 31 19 10 16 13 12 19 9 15 23 9 13 35 19 15 14 17 32 13 13 27 36 38 13 1

102 17 14 21 31 22 10 19.2 14 11 20 10 13 23 10 11 38 18 14 13 18 25 12 11 25 37 37 11 1

103 17 14 22 30 21 11 16 14 11 20 10 16 23 9 11 37 19 14 16 20 32 12 11 23 35 41 11 1

104 17 14 24 32 19 10 16 14 11 19 11 14 23 9 12 33 19 14 15 16 35 13 11 23 37 37 12 1

105 17.3 13 23 30 17 11 16 16 11 20 11 16 24 11 11 39 19 14 11 14 30 13 10 24 38 38 12 1

106 18 12 21 28 21 11 15 17 10 20 10 14 24 10 12 39 18 15 14 14 29 13 12 21 37 39 11 1

107 18 12 21 29 20 10 19.2 15 11 20 11 14 23 11 11 39 18 14 13 19 27 12 11 26 37 37 11 1

108 18 12 21 29 22 10 17.2 14 11 20 11 14 23 10 11 39 18 14 13 18 25 12 12 25 37 37 11 1

109 18 12 22 29 21 10 14 13 11 20 10 15 23 10 11 40 20 14 16 18 33 12 10 23 35 38 12 1

110 18 12 23 29 19 10 18.2 14 11 20 11 14 23 10 11 38 18 14 13 19 26 12 12 26 36 37 11 1

111 18 12 23 31 17 11 15 15 13 20 10 16 25 11 11 42 20 15 11 15 32 13 11 23 37 38 13 1

112 18 12 24 27 21 11 18 14 13 19 11 16 23 12 14 36 17 16 11 15 30 12 13 22 35 36 12 1

113 18 13 20 30 20 10 19.2 14 11 20 11 15 23 10 12 39 18 14 13 18 26 12 11 25 36 37 11 1

114 18 13 20 30 21 11 19 14 12 20 10 16 23 10 11 37 20 14 17 17 30 13 12 21 35 37 12 1

115 18 13 20 30 22 10 17.2 14 11 20 10 15 23 10 11 37 18 14 13 16 26 12 11 25 35 38 11 1

116 18 13 20 30 22 11 17 14 12 20 10 16 23 10 11 38 18 14 19 20 34 13 11 21 37 37 12 1

117 18 13 20 31 20 10 17.2 14 11 20 10 16 24 10 11 37 18 14 14 17 26 12 11 26 36 38 11 1

118 18 13 21 29 21 11 18.2 14 11 20 11 14 23 10 11 40 20 14 13 17 25 12 12 26 36 37 11 1

119 18 13 21 29 21 11 18.2 14 11 20 11 14 23 10 11 39 18 14 13 17 26 12 11 24 36 36 11 1

120 18 13 21 29 21 11 19.2 14 11 20 11 14 23 10 11 41 17 14 13 19 25 12 11 26 37 37 11 1

121 18 13 21 29 21 12 18.2 14 11 20 12 14 23 10 11 39 18 14 13 19 25 12 11 24 37 38 11 1

122 18 13 21 29 22 11 17.2 14 11 20 10 15 23 10 11 39 17 14 13 19 25 12 11 26 39 40 11 1

123 18 13 21 29 22 11 18.2 14 11 20 11 14 23 10 11 39 18 14 13 17 25 12 11 25 36 37 11 1

124 18 13 21 29 23 10 19.2 15 11 20 10 14 24 10 11 35 17 14 17 19 27 12 12 25 37 42 11 1

125 18 13 21 30 18 10 16 17 11 21 10 15 21 11 11 42 18 14 18 18 26 14 12 24 39 40 11 1

126 18 13 21 30 18 10 17 13 11 20 11 16 23 10 11 36 21 14 16 17 33 12 12 27 39 40 10 1

127 18 13 21 30 19 10 17 13 11 20 11 16 24 10 11 40 18 14 14 16 30 14 13 24 37 37 12 1

128 18 13 21 30 19 11 16 16 11 21 10 17 21 11 9 10 17 14 17 17 30 14 12 23 38 39 11 1

129 18 13 21 30 20 10 16 17 11 21 10 17 21 11 11 41 17 14 16 17 26 13 12 24 39 39 10 1

130 18 13 21 30 20 10 16 17 11 21 10 17 21 11 11 40 17 14 17 17 26 14 11 24 38 39 11 1

131 18 13 21 30 20 10 18.2 14 11 20 11 14 23 10 11 38 18 14 13 19 26 12 12 28 35 37 11 1

132 18 13 21 30 20 10 18.2 14 11 20 11 14 23 10 11 39 18 14 13 18 26 12 11 27 37 37 11 1

133 18 13 21 30 20 10 18.2 15 11 20 11 14 23 10 11 38 17 14 14 20 26 12 11 26 37 38 11 1

134 18 13 21 30 20 11 15 13 10 20 10 16 25 10 11 36 19 14 16 17 33 13 12 26 38 38 12 1

135 18 13 21 30 20 11 16 16 12 21 10 15 21 11 11 40 17 14 17 18 30 14 12 25 38 38 11 1

136 18 13 21 30 20 12 15 13 10 20 10 15 25 10 11 37 19 14 16 17 34 13 11 25 38 38 12 1

137 18 13 21 30 21 10 18.2 14 11 20 10 14 23 10 11 38 17 14 13 18 26 12 13 25 36 38 11 1

138 18 13 21 30 21 10 18.2 14 11 20 10 14 23 10 11 39 17 14 13 18 25 13 11 26 36 36 11 1

139 18 13 21 30 21 10 18.2 14 11 20 11 14 23 10 11 38 18 14 13 18 25 12 11 26 37 37 11 1

140 18 13 21 30 21 10 19.2 14 11 20 11 14 23 10 11 38 18 14 13 18 25 12 11 26 37 37 11 1

141 18 13 21 30 21 10 20.2 14 11 20 11 14 23 10 11 40 17 14 13 18 25 12 12 26 37 37 11 1

142 18 13 21 30 21 11 15 13 10 20 10 15 25 10 11 37 19 14 16 17 33 13 12 25 38 38 12 1

143 18 13 21 30 21 11 18.2 14 11 20 10 15 23 10 11 40 18 14 13 20 25 12 11 25 36 38 11 1

144 18 13 21 30 23 10 18.2 14 11 20 10 14 23 10 11 38 18 14 13 18 25 12 11 25 37 38 11 1

145 18 13 21 30 23 10 18.2 14 11 20 10 14 23 10 11 38 18 14 13 19 24 12 12 24 37 38 11 1

146 18 13 21 30 23 11 18.2 14 11 20 11 14 23 10 11 37 19 14 13 18 25 12 11 24 36 37 11 1

147 18 13 21 30 24 11 18.2 14 11 20 11 14 21 10 11 38 18 14 13 17 26 13 11 25 37 37 11 1

148 18 13 21 30 24 11 19.2 14 11 20 10 14 23 10 11 38 18 14 13 18 26 12 11 26 37 38 11 1

149 18 13 21 31 21 11 18.2 14 11 20 10 15 24 10 11 37 17 14 12 18 28 12 12 25 37 38 11 1

150 18 13 22 28 22 10 18 14 12 19 11 14 23 11 11 40 18 14 10 14 34 13 12 28 38.3 39.2 13 1

151 18 13 22 29 18 10 16 14 11 21 10 15 23 9 11 42 16 14 13 17 32 12 12 24 36 36 12 1

152 18 13 22 29 21 10 18 14 11 19 10 16 23 11 14 40 17 14 14 16 28 13 11 26 36 40 11 1

153 18 13 22 30 20 9 15 14 12 20 10 16 24 10 11 40 22 14 15 17 33 13 12 22 35 36 13 1

154 18 13 22 30 20 10 18.2 14 11 20 10 14 23 10 11 38 17 14 13 18 26 12 12 25 36 37 11 1

155 18 13 22 30 20 10 18.2 14 11 20 11 14 23 10 11 39 19 14 13 18 26 12 11 26 36 36 11 1

156 18 13 22 30 21 10 18.2 14 11 20 11 15 23 10 11 39 18 14 13 18 26 12 12 24 36 37 11 1

157 18 13 22 30 23 9 15 13 12 20 10 17 24 11 11 40 21 14 16 17 33 13 13 22 34 36 12 1

158 18 13 22 30 23 10 19.2 14 11 20 10 14 23 10 11 39 17 14 13 18 25 12 11 25 37 38 11 1

159 18 13 22 31 20 11 20 14 11 19 10 15 24 9 11 39 20 15 15 16 29 12 12 24 36 41 13 1

160 18 13 23 28 22 11 16 13 11 19 11 15 25 12 13 38 16 15 11 14 31 12 12 22 35 36 12 1

161 18 13 23 29 15 11 15 15 12 20 11 16 25 11 11 40 18 14 11 14 31 13 10 23 38 38 13 1

162 18 13 23 29 17 11 16 16 12 20 11 15 25 11 12 41 18 15 11 14 32 13 10 23 38 39 12 1

163 18 13 23 29 17 13 15 16 12 20 10 15 26 11 11 41 19 14 11 14 32 14 10 22 37 38 12 1

164 18 13 23 29 19 11 15 16 13 20 11 16 25 11 11 41 19 14 11 14 33 13 10 21 37 37 12 1

165 18 13 23 30 18 12 16 16 13 20 11 14 25 11 11 42 19 14 11 14 33 13 12 23 37 39 13 1

166 18 13 23 31 17 11 16 16 12 20 10 16 25 11 11 41 19 14 12 13 29 13 10 23 37 38 14 1

167 18 13 23 31 17 11 15 16 13 20 10 15 23 11 11 40 22 14 11 15 31 13 10 24 36.2 39 12 1

168 18 13 23 31 17 11 15 16 13 21 10 16 27 11 11 41 19 14 11 14 31 13 10 24 38 38 12 1

169 18 13 23 31 22 9 15 13 12 20 10 17 24 10 11 41 22 14 16 17 33 13 12 22 35 36 13 1

170 18 14 22 30 19 11 17.2 14 11 21 10 15 24 10 11 39 19 14 12 20 27 12 11 24 39 39 11 1

171 18 14 22 30 19 11 18.2 13 11 20 12 13 22 10 11 39 18 14 13 17 26 12 11 25 36 37 11 1

172 18 14 22 30 20 10 17 13 11 18 10 15 24 9 11 37 16 15 10 13 31 12 11 27 37 37 11 1

173 18 14 23 29 18 11 15 13 12 20 11 15 25 11 11 42 19 14 11 14 32 13 10 23 37 38 12 1

174 18 14 23 32 17 11 16 15 12 20 10 15 25 11 11 42 19 14 11 14 32 13 10 23 37 40 11 1

175 18 14 24 30 22 11 17 15 12 18 10 15 22 12 10 40 18 15 12 19 31 14 12 25 36 40 11 1

176 19 12 20 29 21 10 17.2 14 11 20 11 17 23 10 11 39 18 14 13 17 27 12 11 25 36 38 11 1

177 19 13 20 30 21 10 18.2 14 11 20 10 14 23 10 11 38 17 14 13 17 26 12 12 25 36 38 11 1

178 19 13 20 31 20 10 17.2 15 12 20 10 15 23 10 11 38 20 14 13 17 27 12 11 25 37 37 11 1

179 19 13 20 32 20 11 16 15 12 20 10 16 25 9 11 42 19 14 18 18 34 13 12 22 36 36 12 1

180 19 13 20 32 20 11 17.2 15 12 20 10 15 23 10 11 38 20 14 13 17 27 12 11 25 37 37 11 1

181 19 13 21 29 19 12 17.2 14 11 20 11 16 24 10 11 40 18 14 13 20 28 12 12 26 36 40 11 1

182 19 13 21 29 20 11 18.2 14 11 20 10 13 23 10 11 37 18 14 13 17 26 12 11 25 36 37 11 1

183 19 13 21 29 22 11 18.2 14 11 19 10 15 24 10 11 40 18 14 18 18 28 12 12 25 39 39 11 1

184 19 13 21 29 23 10 17.2 15 11 19 10 15 25 7 11 41 17 14 17 18 27 12 12 25 39 39 12 1

185 19 13 21 30 19 10 18.2 14 11 20 11 14 23 10 11 37 18 14 13 20 26 12 11 26 36 37 10 1

186 19 13 21 30 20 10 18.2 14 11 20 11 14 23 10 11 40 18 14 13 19 27 12 12 26 37 37 11 1

187 19 13 21 30 20 10 18.2 14 11 20 11 14 23 10 11 42 18 14 13 19 27 12 11 26 37 37 11 1

188 19 13 21 30 20 10 18.2 14 11 20 11 14 23 10 11 42 18 15 13 19 27 12 11 26 37 37 11 2

189 19 13 21 30 20 10 18.2 14 11 20 11 14 24 10 11 39 18 14 13 19 25 12 11 26 37 37 11 1

190 19 13 21 30 21 10 18.2 14 11 20 11 14 23 10 11 38 17 14 13 20 26 12 11 25 37 37 11 1

191 19 13 21 30 21 11 18.2 14 11 20 12 14 23 10 11 37 18 14 13 19 25 13 11 26 36 37 11 1

192 19 13 21 30 21 11 19.2 14 12 20 9 13 23 10 11 41 20 14 13 18 25 12 11 25 34 38 12 1

193 19 13 21 31 20 10 18.2 14 11 20 10 13 23 10 11 39 18 14 13 19 26 12 11 26 37 37 11 1

194 19 13 21 31 21 12 18.2 14 12 21 10 14 23 10 11 39 18 14 13 19 25 12 11 25 36 39 12 1

195 19 13 22 30 19 11 15 14 12 20 9 15 24 9 10 40 16 15 14 22 28 13 11 23 37 38 12 1

196 19 13 23 29 17 11 16 16 12 20 11 15 25 11 12 41 18 15 11 14 33 13 10 23 38 39 12 1

197 19 13 23 30 16 11 16 16 11 20 11 16 25 11 11 41 19 14 9 11 33 14 12 26 37 38 12 1

198 19 13 23 30 17 10 16 15 12 20 11 15 24 11 11 41 19 14 11 14 33 13 11 23 37 39 12 1

199 19 13 23 30 18 11 15 17 12 20 11 15 25 11 11 38 19 14 11 14 32 13 10 24 37 38 13 1

200 19 13 23 30 20 11 14 14 11 21 9 16 22 9 11 41 16 14 13 14 33 12 11 23 39 39 11 1

201 19 13 23 31 17 12 15 15 12 20 10 17 24 11 11 41 21 14 11 15 33 13 10 24 37 37 13 1

202 19 13 23 31 17 12 17 16 13 20 10 15 25 11 11 40 19 14 11 14 33 13 10 23 37 40 12 1

203 19 13 24 29 21 11 17 15 13 19 11 15 24 12 13 7 17 15 11 14 30 13 12 22 35 37 13 1

204 19 14 21 30 19 11 17.2 14 11 20 10 17 24 10 11 39 19 14 13 22 27 12 12 27 39 40 11 1

205 19 14 21 31 20 9 16 13 12 20 10 15 24 11 11 39 21 15 15 16 32 14 12 25 37 37 12 1

206 19 14 21 31 22 11 19.2 14 11 20 10 14 24 10 11 38 19 14 13 17 26 12 11 25 35 38 11 1

207 19 14 23 30 17 11 16 17 13 20 11 14 24 11 11 40 19 14 11 14 31 13 10 23 37 38 12 1

208 19 15 21 31 19 11 17.2 14 11 20 10 17 24 10 11 39 19 14 13 20 27 12 12 25 37 39 11 1

209 19 16 21 32 19 11 17.2 14 11 20 10 17 24 11 11 39 19 14 13 22 27 12 12 27 36 39 11 1

210 20 13 23 30 23 11 17 15 13 19 11 15 23 12 15 39 19 14 12 15 29 13 13 22 36 38 13 1

211 20 14 21 30 19 10 17.2 14 11 20 10 16 24 10 11 38 19 14 13 21 27 12 12 26 36 40 11 1

212 20 14 22 31 21 10 18.2 14 11 20 11 14 23 10 11 38 18 14 13 18 25 12 14 25 36 37 11 1

Total of Unique Haplotypes 207

Total of Different Haplotypes 212

Overall Total 217
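
The three totals above are linked by simple arithmetic on the final column of the table (the number of individuals carrying each haplotype). The sketch below is illustrative only: the function name `summarise_haplotype_counts` and the example count list are assumptions, not part of the original analysis. The example list is reconstructed from the totals themselves: 212 different haplotypes, of which 207 occur once, leaves 5 shared haplotypes accounting for the remaining 10 individuals, so each of those 5 must occur exactly twice.

```python
def summarise_haplotype_counts(counts):
    """Summarise a list of per-haplotype observation counts.

    counts: one positive integer per distinct haplotype, giving the
    number of individuals in which that haplotype was observed.
    """
    if any(c < 1 for c in counts):
        raise ValueError("every listed haplotype must be observed at least once")
    different = len(counts)                    # distinct haplotypes
    unique = sum(1 for c in counts if c == 1)  # haplotypes seen in one individual only
    total = sum(counts)                        # individuals typed
    return {"unique": unique, "different": different, "total": total}


# Hypothetical count list reconstructed from the reported totals:
# 207 haplotypes observed once and 5 haplotypes observed twice.
example_counts = [1] * 207 + [2] * 5
summary = summarise_haplotype_counts(example_counts)
print(summary)
```

Run on the full count column of the table, the same function should reproduce the reported figures, which gives a quick consistency check on the transcribed data.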
