Post on 09-Jun-2020

DNA Short Tandem Repeats

Organism

Organ

Cell

Weights

• 1kg – a bag of sugar

• 1g – paper clip

• 1mg (milligram), 0.001g – brain of a bee

• 1µg (microgram), 0.000001g – weight of a bacterium

• 1ng (nanogram), 0.000000001g – a millionth of a grain of salt; recommended input to profiling

• 1pg (picogram), 0.000000000001g – 6pg of DNA from each cell

Cells

• We lose about 30,000-40,000 skin cells an hour

• In a year, you lose about 8lbs of cells

• “Where do they all go? The dust that collects on your tables, TV, windowsills and on those picture frames that are so hard to get clean is made mostly from dead human skin cells. In other words, your house is filled with former bits of yourself.”

• About 10,000 will fit on the head of a pin

• Current DNA technology can profile one cell

Nucleus

Chromosomes

DNA

Locus

STR

Allele

[Figure: two alleles, labelled 5 and 3]

Locus is important

[Figure: allele 3 at FGA and allele 3 at D3]

DNA profile

[Figure: example DNA profile with a locus, an allele, a heterozygote and a homozygote labelled. Loci: A D3 vWA D16 D2 D8 D21 D18 D19 THO1. Alleles: X Y 17 18 18 11 12 18 24 12 14 29 13 17 14 9 9.3]

The process

• Extraction

• Quantitation

• Amplification

• Separation

• Interpretation

• Evaluation

Amplification = Multiplication

Raw data

Single source profile

One DNA component from mother, another from father

Area of DNA tested

Names of DNA components

Why statistics?

• DNA is NOT unique

• We look at only a few areas

• Need to know what the probability of finding the profile by chance is (i.e. to give an idea of how many other people may have been the source of the profile)

Statistical estimates

1 in 10 = 0.1

1 in 10 x 1 in 111 x 1 in 20 = 1 in 22,200

1 in 100 x 1 in 14 x 1 in 81 = 1 in 113,400

1 in 116 x 1 in 17 x 1 in 16 = 1 in 31,552

1 in a billion
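The three products on this slide can be checked with a few lines of Python. This is only a sketch of the multiplication step, assuming the areas are independent; the variable names are illustrative and the grouping into three rows simply follows the slide.

```python
from math import prod

# "1 in N" figures for the three groups of areas listed on the slide.
groups = [
    [10, 111, 20],
    [100, 14, 81],
    [116, 17, 16],
]

# Multiplying "1 in N" probabilities is the same as multiplying the N values.
group_totals = [prod(group) for group in groups]

for group, total in zip(groups, group_totals):
    terms = " x ".join(f"1 in {n}" for n in group)
    print(f"{terms} = 1 in {total:,}")
```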

Probability

• Black hair 0.6

• Blue eyes 0.25

• Beard 0.01

• Gold tooth 0.001

Probability = 0.6 x 0.25 x 0.01 x 0.001

= 0.0000015

= about 1 in 666,667
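The same multiplication can be written out as a short script. It is a sketch only: it assumes the four traits are independent of each other, which is what multiplying them implies, and the dictionary name is made up for the example.

```python
from math import prod

# Trait frequencies from the slide (assumed independent of each other).
trait_frequency = {
    "black hair": 0.6,
    "blue eyes": 0.25,
    "beard": 0.01,
    "gold tooth": 0.001,
}

combined = prod(trait_frequency.values())  # chance of all four together
one_in = round(1 / combined)               # the same figure as "1 in N"

print(f"Probability = {combined:.7f}, about 1 in {one_in:,}")
```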

Random Match Probability

Allele   R     B

f        0.1   0.1

RB = 2 x 0.1 x 0.1 = 0.02 = 2 in 100 = 1 in 50
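The factor of 2 appears because a person can inherit R from the mother and B from the father, or the other way round. A minimal sketch of that calculation follows, assuming the usual Hardy-Weinberg proportions (an assumption; the slide does not name the model) and using a hypothetical helper function.

```python
def genotype_frequency(p, q=None):
    """Expected genotype frequency under assumed Hardy-Weinberg proportions.

    One allele frequency means a homozygote (p * p);
    two allele frequencies mean a heterozygote (2 * p * q).
    """
    if q is None:
        return p * p
    return 2 * p * q

rb = genotype_frequency(0.1, 0.1)  # heterozygote RB from the slide
print(f"RB = {rb:.2f} = 1 in {round(1 / rb)}")
```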

Mixtures


RB RY RG BY BG GY

= 6 ‘suspect’ profiles that ‘cannot be excluded’ as contributors

How many suspects?

• With 6 possibilities at each of 15 areas

• There are 6x6x6x6x6x6x6x6x6x6x6x6x6x6x6 = 470,184,984,576

• About 470 billion ‘suspect’ profiles (even 10 areas would give more than 60 million)
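A quick check of this arithmetic, as a sketch with illustrative variable names. It shows that fifteen sixes multiply to roughly 470 billion, and that the figure of just over 60 million corresponds to ten sixes.

```python
possibilities_per_area = 6

fifteen_areas = possibilities_per_area ** 15  # 6 multiplied by itself 15 times
ten_areas = possibilities_per_area ** 10      # for comparison

print(f"15 areas: {fifteen_areas:,}")
print(f"10 areas: {ten_areas:,}")
```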

Alleles observed on ‘outside’

Locus   First set         Second set
D8      13                13
D21     31.2              29, 31.2, 32.2
D7      8, 10, 11         8, 10, 11, 12
CSF     10, 11, 12        11, 12
D3      16, 17, 18        16, 18
THO1    6, 9, 9.3         6, 7, 8, 9.3
D13     11, 12            11, 12, 13
D16     11, 12, 13, 14    9, 12, 13, 14
D2      17, 19, 25        17, 25
D19     13, 14            13, 14
vWA     14, 15, 16        14, 16, 18
TPOX    8, 11, 12         8, 11
D18     14, 15, 16        14, 16
D5      12, 13            12, 13
FGA     21, 22, 24, 25    20, 21, 24

Alleles observed on ‘outside’

Locus   Alleles
D8      13
D21     29, 31.2, 32.2
D7      8, 10, 11, 12
CSF     10, 11, 12
D3      16, 17, 18
THO1    6, 7, 8, 9, 9.3
D13     11, 12, 13
D16     9, 11, 12, 13, 14
D2      17, 19, 25
D19     13, 14
vWA     14, 15, 16, 18
TPOX    8, 11, 12
D18     14, 15, 16
D5      12, 13
FGA     20, 21, 22, 24, 25

No. of alleles at each locus

Locus            D8   D21  D7   CSF  D3   THO1 D13  D16  D2   D19  vWA  TPOX D18  D5   FGA
No. of alleles   1    3    4    3    3    5    3    5    3    2    4    3    3    2    5

No of ‘suspect’ profiles

Locus            D8   D21  D7   CSF  D3   THO1 D13  D16  D2   D19  vWA  TPOX D18  D5   FGA
No. of alleles   1    3    4    3    3    5    3    5    3    2    4    3    3    2    5
Possible pairs   1    3    6    3    3    10   3    10   3    1    6    3    3    1    10

1 x 3 x 6 x 3 x 3 x 10 x 3 x 10 x 3 x 1 x 6 x 3 x 3 x 1 x 10 = 78,732,000 ‘suspect’ profiles

Adding ‘new’ alleles at D8

D8 alleles: 9, 11, 13, 14 (other loci unchanged)

Locus            D8   D21  D7   CSF  D3   THO1 D13  D16  D2   D19  vWA  TPOX D18  D5   FGA
No. of alleles   4    3    4    3    3    5    3    5    3    2    4    3    3    2    5
Possible pairs   6    3    6    3    3    10   3    10   3    1    6    3    3    1    10

= 472,392,000 (470m) ‘suspect’ profiles

Adding ‘new’ alleles at D21

D8 alleles: 9, 11, 13, 14

D21 alleles: 28, 29, 30, 31.2, 32.2 (other loci unchanged)

Locus            D8   D21  D7   CSF  D3   THO1 D13  D16  D2   D19  vWA  TPOX D18  D5   FGA
No. of alleles   4    5    4    3    3    5    3    5    3    2    4    3    3    2    5
Possible pairs   6    10   6    3    3    10   3    10   3    1    6    3    3    1    10

= 1,574,640,000 (1.5 billion) ‘suspect’ profiles
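The three totals (78,732,000, then 472,392,000, then 1,574,640,000) can be reproduced from the allele counts alone. The sketch below assumes the counting rule that the slide figures appear to follow: a locus with one allele allows one profile, and a locus with n alleles allows every pair of two different alleles. That rule is inferred from the numbers rather than stated in the slides, and the function names are illustrative.

```python
from math import comb, prod

LOCI = "D8 D21 D7 CSF D3 THO1 D13 D16 D2 D19 vWA TPOX D18 D5 FGA".split()

# Number of alleles observed at each locus, in the order of LOCI.
allele_counts = [1, 3, 4, 3, 3, 5, 3, 5, 3, 2, 4, 3, 3, 2, 5]

def possible_pairs(n):
    # Rule inferred from the slide figures: 1 allele -> 1 profile,
    # otherwise the number of ways to choose 2 different alleles.
    return 1 if n == 1 else comb(n, 2)

def suspect_profiles(counts):
    # Multiply the per-locus possibilities across all loci.
    return prod(possible_pairs(n) for n in counts)

base = suspect_profiles(allele_counts)
with_d8 = suspect_profiles([4] + allele_counts[1:])      # D8: 9, 11, 13, 14
with_d21 = suspect_profiles([4, 5] + allele_counts[2:])  # D21: 28 and 30 added

for label, total in [("original", base), ("new D8", with_d8), ("new D21", with_d21)]:
    print(f"{label}: {total:,} 'suspect' profiles")
```

Two extra alleles at D21 raise that locus from 3 pairs to 10, which is why the total more than triples.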

Alleles on inside & outside

Locus   IN        OUT
D8      13, 14    13
D21     31.2      29, 31.2, 32.2
CSF     10        10, 11, 12
D3      16        16, 17, 18
THO1    6         6, 7, 8, 9, 9.3
D13     12        11, 12, 13
D19     13, 14    13, 14
TPOX    11        8, 11, 12
D18     13        14, 15, 16
D5      20        20, 21, 22, 24, 25

The Likelihood Ratio = LR

LR = (Probability of this evidence if the DNA came from Mr X + unknown) / (Probability of this evidence if it came from 2 unknowns)

LR = (Probability of E given Hpros) / (Probability of E given Hdef)

“… times more likely”

e.g. LR = (1/10) / (1/100) = 0.1 / 0.01 = 10

For single source profiles:

LR = 1 / frequency

e.g. 1/(1/10) = 10
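Both worked examples can be reproduced with exact fractions. This is a small sketch; the function and argument names are illustrative and not taken from any profiling software.

```python
from fractions import Fraction

def likelihood_ratio(p_e_given_hpros, p_e_given_hdef):
    """Probability of the evidence under Hpros divided by that under Hdef."""
    return p_e_given_hpros / p_e_given_hdef

# First example: (1/10) / (1/100)
lr_example = likelihood_ratio(Fraction(1, 10), Fraction(1, 100))

# Single source profile: the evidence is certain if the suspect is the source,
# so the LR is 1 divided by the profile frequency (here 1 in 10).
lr_single_source = likelihood_ratio(Fraction(1), Fraction(1, 10))

print(lr_example, lr_single_source)
```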

Mixtures

Allele   R      B      Y      G
f        0.25   0.25   0.25   0.25

Mr X   p(Hp)   p(Hd)    LR
RB     0.125   0.0469   2.67
RY     0.125   0.0469   2.67
RG     0.125   0.0469   2.67
BY     0.125   0.0469   2.67
BG     0.125   0.0469   2.67
YG     0.125   0.0469   2.67

“Mr X + unknown rather than two unknowns”

Allele   R     B     Y      G
f        0.1   0.1   0.25   0.25

Mr X   p(Hp)   p(Hd)    LR
RB     0.125   0.0075   16.67
RY     0.05    0.0075   6.67
RG     0.05    0.0075   6.67
BY     0.05    0.0075   6.67
BG     0.05    0.0075   6.67
YG     0.02    0.0075   2.67

“Mr X + unknown rather than two unknowns”

Allele   R      B     Y     G
f        0.01   0.1   0.2   0.5

Mr X   p(Hp)   p(Hd)    LR
RB     0.2     0.0012   166.67
RY     0.1     0.0012   83.33
RG     0.04    0.0012   33.33
BY     0.01    0.0012   8.33
BG     0.004   0.0012   3.33
YG     0.002   0.0012   1.67

“Mr X + unknown rather than two unknowns”

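The tabulated values can be regenerated from the allele frequencies. The sketch below uses formulas inferred from the numbers in the tables, not stated in the slides: p(Hp) is taken as 2 x the frequencies of the two alleles Mr X does not carry (the unknown must supply both), and p(Hd) as 12 x the product of all four frequencies. The `hd_factor` parameter and the function name are hypothetical; other treatments of two unknown contributors may use a different factor.

```python
from itertools import combinations
from math import prod

def mixture_lrs(freq, hd_factor=12):
    """Return {genotype: (p_hp, p_hd, lr)} for a four-allele, two-person mixture.

    hd_factor=12 is an assumption inferred from the slide's p(Hd) column.
    """
    alleles = list(freq)
    p_hd = hd_factor * prod(freq.values())
    table = {}
    for pair in combinations(alleles, 2):
        # The unknown contributor must carry the two alleles Mr X lacks.
        rest = [a for a in alleles if a not in pair]
        p_hp = 2 * freq[rest[0]] * freq[rest[1]]
        table["".join(pair)] = (p_hp, p_hd, p_hp / p_hd)
    return table

table = mixture_lrs({"R": 0.01, "B": 0.1, "Y": 0.2, "G": 0.5})
for genotype, (p_hp, p_hd, lr) in table.items():
    print(f"{genotype}  {p_hp:.3f}  {p_hd:.4f}  {lr:.2f}")
```

The rarer the alleles Mr X carries, the larger the LR: with these frequencies RB gives about 167 and YG less than 2.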

More complicated mixture

Second area (locus)

Alleles: A B C D

AB AC AD BC BD CD

= 6 ‘suspect’ profiles that ‘cannot be excluded’ as contributors

First area (rows) and second area (columns)

       AB       AC       AD       BC       BD       CD
RB     88,889   44,444   88,889   22,222   44,444   22,222
RY     3,556    1,778    3,556    889      1,778    889
RG     8,889    4,444    8,889    2,222    4,444    2,222
BY     17,778   8,889    17,778   4,444    8,889    4,444
BG     44,444   22,222   44,444   11,111   22,222   11,111
YG     1,778    889      1,778    444      889      444

“X + unknown rather than two unknowns”

Stochastic variation

Examples so far assume allele calls are certain, but low template samples cause new problems because of stochastic variation.

• Stochastic variation is random variation

• Failure to reproduce results

• Leads to uncertainty

The crimestain

Standard technique

Enough sample so that no dropout is expected and peak height represents amount of DNA present (i.e. not variable)

Low Template Sample

• Stochastic variation is random variation

• Failure to reproduce results

• Leads to uncertainty

Dropout or dropin?

[Table: the two sets of alleles observed on ‘outside’ at D8, D21, D7, CSF, D3, THO1, D13, D16, D2, D19, vWA, TPOX, D18, D5 and FGA, repeated from the earlier slide]
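One way to see where the question of dropout or dropin arises is to list the alleles that appear in only one of the two sets. The sketch below does that with the allele values from the slide; the names FIRST and SECOND are labels chosen for the example, and the script only reports differences, it does not decide which explanation applies.

```python
LOCI = "D8 D21 D7 CSF D3 THO1 D13 D16 D2 D19 vWA TPOX D18 D5 FGA".split()

# Alleles in each set, one string per locus, in the order of LOCI.
FIRST = ["13", "31.2", "8 10 11", "10 11 12", "16 17 18", "6 9 9.3",
         "11 12", "11 12 13 14", "17 19 25", "13 14", "14 15 16",
         "8 11 12", "14 15 16", "12 13", "21 22 24 25"]
SECOND = ["13", "29 31.2 32.2", "8 10 11 12", "11 12", "16 18", "6 7 8 9.3",
          "11 12 13", "9 12 13 14", "17 25", "13 14", "14 16 18",
          "8 11", "14 16", "12 13", "20 21 24"]

def compare(first, second):
    """Return {locus: (only_in_first, only_in_second)} for loci that differ."""
    differences = {}
    for locus, a, b in zip(LOCI, first, second):
        set_a, set_b = set(a.split()), set(b.split())
        if set_a != set_b:
            differences[locus] = (
                sorted(set_a - set_b, key=float),
                sorted(set_b - set_a, key=float),
            )
    return differences

differences = compare(FIRST, SECOND)
for locus, (only_first, only_second) in differences.items():
    print(f"{locus}: only in first {only_first}, only in second {only_second}")
```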

Probability of dropout and dropin

p(D) is the probability that an allele is really there but you have not detected it.

p(C) is the probability that an allele you have detected is not from the crimestain – it is contamination.

FST statistic

• FST is the programme used to calculate the LR in this case

• Statistic depends on

  – Probability of dropout, which is

    • Usually dependent on the weight of DNA

    • Which is unknown for the minor contributors

  – And the validation data do not support any p(D) for any weight of DNA

  – The LR being correct

Low Template Sample

• Identified by variable results, NOT the amount of DNA

• Causes problems in:

– Identifying ‘true’ sample alleles

– Using peak height information

• Inclusion/exclusion of people

• Number of contributors
