


A new theory of gene regulation based on relationships of DNA sequences flanking genes

Richard J. Feldmann

Global Determinants, Inc.

Derwood, Maryland


The intellectual property presented in this talk/document is protected by US and PCT Patent Applications dated May 30, 2001.


Finding the right question to ask is the hard part

Answering the question is just a matter of hard work.


Have you ever wondered how gene expression is controlled?

The TATA box of a gene lies 5’ of the start of the coding region

Small dimeric proteins bind in and near this area

The polymerase assembles around these proteins

Enhancer and/or repressor elements distal to this area can loop back


Have you ever wondered how cellular differentiation and development are accomplished?

How is gene expression controlled so that cells within a tissue are relatively the same?

How, in a 1,000-cell creature like C. elegans, can all the cells have different functions?

How is cellular development orchestrated?


Simplified Gene Model

|<-------------------Promoter----------------->|
|<-----Enhancer/Repressor------>|<--TATA Box-->|
                                               |<-Beginning of Translation
                                               |<--------------Translation Region-------------->|
                                                                        End of Translation----->|
+ strand ----------------------------------------------------------------------------------------------------
- strand ----------------------------------------------------------------------------------------------------
                                               |<-Exon->|<-Intron->|<-Exon->|<-Intron->|<-Exon->|
                                                                                                |<-----3'UTR------>|
|<--------------------------------------------Gene----------------------------------------------|


Specificity Region

The palindromic specificity area around the TATA box is only 6 to 8 bases in length

4^8 = 65,536 is a relatively small number

Not every combination can be used

My sense is that the enhancer/repressor elements only modulate the level of expression
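The count on this slide can be checked directly. The following is a minimal sketch, assuming four possible bases at each position and ignoring the palindrome constraint; the function name `sequence_count` is illustrative and does not come from the talk.

```python
def sequence_count(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of the given length over A, C, G, T."""
    return alphabet_size ** length


# Count the possible sequences for each length in the 6-to-8 base range.
for length in range(6, 9):
    print(length, sequence_count(length))
```

For the 8-base case this gives 65,536 combinations, fewer still once the unusable combinations are excluded.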


Promoter Action


Range of Gene Numbers

Bacteria have 1,000 to 2,500 genes

S. cerevisiae has 6,000 genes

C. elegans has 19,000 genes

A. thaliana has 25,000 genes

H. sapiens has 40,000 genes


How many genes are exposed for promotion at a given time?

If the whole complement of genes is exposed, then the quantitative regulatory elements bear the whole burden of deciding whether a gene is to be expressed or not


Is there a binary mechanism that could sequestrate genes from promotion?

The promoter regions of sequestrated genes would be hidden from the dimeric initiation proteins

The quantitative regulatory elements would have to deal only with the exposed set of genes


Six Levels of DNA Structure

[Figure: DNA structure shown at six levels, labeled Level 1 through Level 6]


30 nm Chromatin Structure


Are the level-4 loops random or specific in length?

Is there a sequence specificity to the lengths of these loops?

Could a zinc-finger DNA Binding Protein (DBP) be used to give the loops specific lengths?

Could RNA be used to latch the loops shut?


There are sequence-specific loops!

A simple Fortran program run on the yeast genome showed that there are specific sequences on the left and right sides of the level-4 loops

In bacteria, S. cerevisiae and C. elegans there are not enough DBPs to make a whole-genome mechanism

There are two sequence elements that could be expressed as RNA


Figure 4

[Figure panels: (a) Transcription and Editing: Gene a with C1 and C2 on the double-strand DNA of chromosome L, and the single-strand RNA carrying C1 and C2. (b) Movement of the RNA through the Nucleus. (c) Connectron Formation, 0 to 100 bp between C1 and C2: two triple-strand Hoogsteen helices pair C1 with T1 and C2 with T2 on the double-strand DNA of chromosome K (30 nm particles); T1 and T2 are 5 kb to 100 kb apart, with Gene b, Gene c and Gene d between them.]


Connectron

A left flanking sequence element (T1) at least 15 bases in length

A right flanking sequence element (T2) at least 15 bases in length

A pair of sequence elements (C1 and C2), each at least 15 bases in length, in the 3’UTR of some gene


Sequence Properties of Connectrons

T1 and T2 have a separation of 0.5 kb to 100 kb

C1=T1 and C2=T2

The separation of C1 from C2 is less than 100 bases

The separation of C1/C2 from the end of the gene is less than 1,000 bases
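The properties listed above can be read as a simple candidate test. The sketch below is only an illustration under stated assumptions: coordinates are 0-based base positions on one strand, a "separation" is the gap between the end of one element and the start of the next, and C1=T1 is treated as plain string equality. The names `Element`, `gap` and `is_connectron_candidate` are hypothetical, not taken from the author's program.

```python
from dataclasses import dataclass

MIN_ELEMENT_LEN = 15        # minimum length of C1, C2, T1, T2 (bases)
MIN_T_SEPARATION = 500      # 0.5 kb lower limit between T1 and T2
MAX_T_SEPARATION = 100_000  # 100 kb upper limit between T1 and T2
MAX_C_SEPARATION = 100      # C1 to C2 must be closer than this
MAX_C_TO_GENE_END = 1_000   # C1/C2 must start closer than this to the gene end


@dataclass(frozen=True)
class Element:
    start: int  # 0-based start position (assumed convention)
    seq: str

    @property
    def end(self) -> int:
        return self.start + len(self.seq)


def gap(left: Element, right: Element) -> int:
    """Bases between the end of `left` and the start of `right`."""
    return right.start - left.end


def is_connectron_candidate(c1, c2, t1, t2, gene_end: int) -> bool:
    """Apply the slide's constraints under the assumptions stated above."""
    if any(len(e.seq) < MIN_ELEMENT_LEN for e in (c1, c2, t1, t2)):
        return False
    if c1.seq != t1.seq or c2.seq != t2.seq:
        return False
    if not MIN_T_SEPARATION <= gap(t1, t2) <= MAX_T_SEPARATION:
        return False
    if not 0 <= gap(c1, c2) < MAX_C_SEPARATION:
        return False
    return 0 <= c1.start - gene_end < MAX_C_TO_GENE_END


# Made-up example: a gene ending at position 900 with C1/C2 just downstream.
c1 = Element(1000, "ACGTACGTACGTACG")
c2 = Element(1040, "TTGACCATGGACTTA")
t1 = Element(50_000, c1.seq)
t2 = Element(60_000, c2.seq)
print(is_connectron_candidate(c1, c2, t1, t2, gene_end=900))
```

The check for "non-trivial" sequences is left out because the slides do not define it.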


What constraints are placed on the sequences?

Only that C1=T1 and C2=T2

Otherwise any tetrad of non-trivial sequences of at least 15 bases can be used


Connectron Convergence and Divergence

Connectrons form many-to-one relationships

Connectrons form one-to-many relationships

[Diagram: four gene units, each labeled P, G, C1/C2]


Transient Connectrons

Gene “A” causes some connectron “B”

Some other gene “C” causes a connectron “D” that turns off gene “A”

When gene “C” is expressed, connectron “B” eventually expires

[Diagram: Gene "A" (P, G, C1/C2) with Connectron "B" (T1, T2); Gene "C" (P, G, C1/C2) with Connectron "D" (T1', T2'), which contains Gene "A"]


Permanent Connectrons

Gene “A” causes some connectron “B” but no other connectron ever turns off gene “A”

[Diagram: Gene "A" (P, G, C1/C2) with Connectron "B" (T1, T2)]


Hierarchy of Connectrons

Gene “A” causes connectron “B”

Gene “C” causes connectron “D”

[Diagram: Connectron "B" (T1, T2) with Gene "A"; Connectron "D" (T3, T4) with Gene "C"]


Hierarchy of Connectrons

Gene “E” causes connectrons “F” and “G”

Connectron “F” turns off gene “A”, which eventually causes connectron “B” to disappear

Connectron “G” turns off gene “C”, which eventually causes connectron “D” to disappear

[Diagram: Gene "E" with Connectron "F" (T5, T6), which contains Gene "A", and Connectron "G" (T5, T6), which contains Gene "C"]
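The hierarchy on this slide can be written as a small on/off model. This is a toy sketch, not the author's method: it assumes a gene is expressed whenever it is promoted and not enclosed by an active connectron, and it ignores the "eventually" timing by jumping straight to the steady state. The function `settle` and the dictionaries `drives` and `silences` are hypothetical names introduced for illustration.

```python
def settle(promoted, drives, silences, max_rounds=10):
    """Return (expressed genes, active connectrons) at steady state.

    promoted: genes whose promoters are being activated
    drives:   gene -> connectrons its C1/C2 RNA creates
    silences: connectron -> genes it turns off
    """
    promoted = set(promoted)
    expressed = set(promoted)
    for _ in range(max_rounds):
        connectrons = {c for g in expressed for c in drives.get(g, ())}
        silenced = {g for c in connectrons for g in silences.get(c, ())}
        new_expressed = promoted - silenced
        if new_expressed == expressed:
            return expressed, connectrons
        expressed = new_expressed
    raise RuntimeError("no steady state reached")


# Relationships taken from the two hierarchy slides.
drives = {"A": ["B"], "C": ["D"], "E": ["F", "G"]}
silences = {"F": ["A"], "G": ["C"]}

print(settle({"A", "C"}, drives, silences))       # before gene E is promoted
print(settle({"A", "C", "E"}, drives, silences))  # after gene E is promoted
```

With gene "E" promoted, only connectrons "F" and "G" remain, matching the slide: "A" and "C" are turned off and their connectrons "B" and "D" disappear.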


Alternating Layers of a Hierarchy

[Diagram: layers of the hierarchy labeled Repressive, Repressive, Expressive, Expressive, Repressive]


Full Gene Data for Connectron

GN 1361 1 1 1191.213 1191.854 .642 ycfc COG2915
GN 1362 1 1 1191.890 1193.041 1.152 ycfb COG0482
GN 1363 1 1 1193.050 1193.511 .462 b1134 COG0494
GN 1364 1 1 1193.521 1194.144 .624 ymfc COG1187
GP 1365 1 1 1194.346 1195.596 1.251 icda COG0538
TN 1366 1 1 1195.576 1195.597 .022 GC *-*
GN 1367 1 1 1196.090 1197.460 1.371 ymfd COG0500 |
GP 1368 1 1 1197.918 1198.811 .894 lit - |
GN 1369 1 1 1198.902 1200.255 1.354 inte - |
GN 1370 1 1 1200.292 1200.603 .312 ymfh - |
GP 1371 1 1 1200.675 1201.061 .387 ymfi - |
GN 1372 1 1 1200.999 1201.283 .285 ymfj - |
GN 1373 1 1 1201.482 1202.156 .675 b1145 COG1974 |
GP 1374 1 1 1201.944 1202.447 .504 b1146 - |
GP 1375 1 1 1202.479 1203.383 .905 ymfl - |
GP 1376 1 1 1203.393 1204.760 1.368 ymfn - |
GP 1377 1 1 1204.772 1206.720 1.949 ymfr - |
GP 1378 1 1 1206.724 1207.353 .630 ycfk - |
GP 1379 1 1 1207.355 1207.768 .414 b1155 - |
GN 1380 1 1 1207.740 1208.881 1.142 ycfa - |
GP 1381 1 1 1208.908 1209.462 .555 pin COG1961 |
GP 1382 1 1 1209.569 1210.402 .834 mcra COG1403 |
CN 1383 1 1 1210.756 1210.778 .023 .125 GC * |
TN 1384 1 1 1210.756 1210.778 .023 GC *-*
CN 1385 1 1 1210.780 1210.801 .022 .102 GC *
GN 1386 1 1 1210.903 1211.226 .324 ycgw -
GN 1387 1 1 1211.926 1212.330 .405 ycgx -
GN 1388 1 1 1212.551 1213.282 .732 ycge COG0789
GN 1389 1 1 1213.487 1214.698 1.212 b1163 COG2200
GP 1390 1 1 1215.012 1215.248 .237 ycgz -
GP 1391 1 1 1215.291 1215.563 .273 ymga -
GP 1392 1 1 1215.592 1215.858 .267 ymgb -
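The listing can be read mechanically. The sketch below assumes the first seven fields are Type, Num, Jobno, Chr, Start, Stop and Length (the column names that appear on the later listings), with positions in kilobases, and checks that Length equals Stop minus Start plus one base (0.001 kb). The names `Record`, `parse_record` and `inclusive_length_kb` are illustrative only; the reading of the type codes is not confirmed by the slides.

```python
from decimal import Decimal
from typing import List, NamedTuple

ONE_BASE_KB = Decimal("0.001")  # one base expressed in kilobases


class Record(NamedTuple):
    kind: str          # e.g. GN, GP, TN, CN (meaning assumed, not documented)
    num: int
    jobno: int
    chromosome: int
    start_kb: Decimal
    stop_kb: Decimal
    length_kb: Decimal
    rest: List[str]    # gene name, COG id, or nesting graphics


def parse_record(line: str) -> Record:
    """Split one whitespace-separated listing line into typed fields."""
    f = line.split()
    return Record(f[0], int(f[1]), int(f[2]), int(f[3]),
                  Decimal(f[4]), Decimal(f[5]), Decimal(f[6]), f[7:])


def inclusive_length_kb(rec: Record) -> Decimal:
    """Length counting both end positions, in kilobases."""
    return rec.stop_kb - rec.start_kb + ONE_BASE_KB


rows = [
    "GN 1361 1 1 1191.213 1191.854 .642 ycfc COG2915",
    "GP 1365 1 1 1194.346 1195.596 1.251 icda COG0538",
    "TN 1366 1 1 1195.576 1195.597 .022 GC *-*",
]
for row in rows:
    rec = parse_record(row)
    print(rec.kind, rec.num, inclusive_length_kb(rec) == rec.length_kb)
```

`Decimal` is used instead of `float` so that the three-decimal positions compare exactly.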


Gene Abstraction for One-Shot Connectron

Genes to be abstracted into Group0069

Final abstraction

Driving C1/C2

NC 483 1 1 1133.952 1195.596 61.644 Non-Controlled-Gene(s)
TN 484 1 1 1195.576 1195.597 .022 *-*
GG 485 1 1 1196.090 1210.402 14.312 Group0069 |
CNT 486 1 1 1210.756 1210.778 .023 OS-> |
TN 487 1 1 1210.756 1210.778 .023 *-*
CNP 488 1 1 1210.780 1210.801 .022 -->
NC 489 1 1 1210.903 1286.207 75.304 Non-Controlled-Gene(s)

Group0069

Gene_Name COG_Id Chromosome Direction Start Stop Length
ymfd COG0500 1 negative 1196.090 1197.460 1.371
lit - 1 positive 1197.918 1198.811 .894
inte - 1 negative 1198.902 1200.255 1.354
ymfh - 1 negative 1200.292 1200.603 .312
ymfi - 1 positive 1200.675 1201.061 .387
ymfj - 1 negative 1200.999 1201.283 .285
b1145 COG1974 1 negative 1201.482 1202.156 .675
b1146 - 1 positive 1201.944 1202.447 .504
ymfl - 1 positive 1202.479 1203.383 .905
ymfn - 1 positive 1203.393 1204.760 1.368
ymfr - 1 positive 1204.772 1206.720 1.949
ycfk - 1 positive 1206.724 1207.353 .630
b1155 - 1 positive 1207.355 1207.768 .414
ycfa - 1 negative 1207.740 1208.881 1.142
pin COG1961 1 positive 1208.908 1209.462 .555
mcra COG1403 1 positive 1209.569 1210.402 .834

CNT 486 1 1 1210.756 1210.778 .023 OS-> |


Transient Connectron

Driving C1/C2

Transient Connectron

Abstracted Groups

C1/C2 T1-T2
Global_Id Chromosome C1_Id C2_Id Chromosome T1_Id T2_Id Connectron_Type
1 1 11 11 1 98 107 transient

Type Num Jobno Chr Start Stop Length GeneName
TN 98 1 1 278.386 279.148 .763 *-+++++++++++++++++*
TP 99 1 1 278.387 278.450 .064 *-++++++++++++++++++**
GG 100 1 1 278.402 279.099 .698 Group0014 ||||||||||||||||||||
TP 101 1 1 278.452 278.892 .441 *-++++++++++++++++++++**
CNT 102 1 1 279.155 279.332 .178 OS-> ||||||||||||||||||||||
TP 103 1 1 279.155 279.332 .178 *-++*+*|||||||||||||||||
TN 104 1 1 279.155 279.336 .182 *-+*-* |||||||||||||||||
GG 105 1 1 279.609 289.529 9.920 Group0015 | |||||||||||||||||
CNT 106 1 1 289.834 290.619 .786 OS-> | |||||||||||||||||
TN 107 1 1 289.834 290.619 .786 *-+----++++++++++++*||||

Group0014

Gene_Name COG_Id Chromosome Direction Start Stop Length
insb_2 COG1662 1 negative 278.402 279.099 .698

-------------------------------------------------------------------------

Group0015

Gene_Name COG_Id Chromosome Direction Start Stop Length
yagb - 1 negative 279.609 279.986 .378
yaga COG1425 1 negative 280.053 281.207 1.155
yage COG0329 1 positive 281.481 282.410 .930
yagf COG0129 1 positive 282.425 284.392 1.968
yagg COG2211 1 positive 284.619 286.001 1.383
yagh - 1 positive 286.013 287.623 1.611
yagi COG1414 1 negative 287.628 288.386 .759
argf COG0078 1 negative 288.525 289.529 1.005


Verbose Description of Transient Connectron

The verbose description of the transient connectron 1 is:

In the Escherichia coli K-12 MG1655 complete genome the transient connectron number 1 is generated by the control sequence (C1/C2) whose identifier number is 11. This control sequence is on the negative strand of the genomic DNA of chromosome 1. The genomic start and stop positions of this control sequence are 19.859 KB and 19.796 KB with a length of 0.064 KB. Expression of the RNA for this connectron is triggered by the promotion of the gene whose name is insb_1 and whose COG (Cluster of Orthologous Genes) identifier is COG1662. The genomic start and stop positions of this gene are 20.508 KB and 19.811 KB and with a length of 0.698 KB. This connectron causes stabilization of a loop of DNA. The target sequences (T1-T2) are on the negative strand of the genomic DNA on chromosome 1. The identifier number of the initiating target sequence (T1) is 98. The genomic start and stop positions of this initiating target sequence are 279.148 KB and 278.386 KB with a length of 0.763 KB. The identifier number of the terminating target sequence (T2) is 107. The genomic start and stop positions of this terminating target sequence are 290.619 KB and 289.834 KB with a length of 0.786 KB.


Permanent Connectron

Driving C1/C2

Permanent Connectron

Abstracted Groups

C1/C2 T1-T2
Global_Id Chromosome C1_Id C2_Id Chromosome T1_Id T2_Id Connectron_Type
397 1 527 527 1 599 608 permanent

Type Num Jobno Chr Start Stop Length GeneName
TP 599 1 1 2064.198 2064.486 .289 *-+*
GG 600 1 1 2064.327 2065.343 1.017 Group0074 ||
TN 601 1 1 2064.488 2065.321 .834 *-++*
TP 602 1 1 2064.488 2065.321 .834 *-+++*
TN 603 1 1 2065.323 2065.375 .053 *-++++*
TP 604 1 1 2065.323 2065.375 .053 *-+++++*
GG 605 1 1 2066.630 2099.290 32.660 Group0075 ||||||
CNT 606 1 1 2099.768 2100.965 1.198 OS-> ||||||
TN 607 1 1 2099.768 2100.970 1.203 *-*+*+*|
TP 608 1 1 2099.771 2100.968 1.198 *--*-*-*

Group0074

Gene_Name COG_Id Chromosome Direction Start Stop Length
trs5_6 COG3039 1 negative 2064.327 2065.343 1.017

-------------------------------------------------------------------------

Group0075

Gene_Name COG_Id Chromosome Direction Start Stop Length
b1995 COG0110 1 positive 2066.630 2067.049 .420
yi22_3 - 1 negative 2066.974 2068.247 1.274
b1998 COG1629 1 positive 2068.266 2069.233 .968
flu - 1 positive 2069.405 2072.680 3.276
b2001 - 1 positive 2072.795 2074.776 1.982
yeet - 1 positive 2074.839 2075.060 .222
yeeu - 1 positive 2075.134 2075.502 .369
yeev - 1 positive 2075.591 2076.156 .566
yeex COG2926 1 negative 2077.054 2077.449 .396
yeea - 1 negative 2077.555 2078.613 1.059
sbmc - 1 negative 2078.811 2079.284 .474
dacd COG1686 1 negative 2079.403 2080.575 1.173
sbcb COG2925 1 positive 2080.778 2082.205 1.428
yeed COG0425 1 negative 2082.248 2082.475 .228


Virtual Connectron - Example 1

Driving C1/C2

Virtual Connectron

Type Num Jobno Chr Start Stop Length GeneName
CPT 921 1 1 3619.493 3619.710 .218 --> ||| | | ||

Type Num Jobno Chr Start Stop Length GeneName
TP 281 1 1 731.510 731.562 .053 *-@* || | ||||||||
TP 282 1 1 731.564 731.672 .109 *-@@*++*| ||||||||
TP 283 1 1 731.698 731.734 .037 *-@@@++++*||||||||
TP 284 1 1 731.757 731.804 .048 *-@@@+++++++++++++*
TP 285 1 1 731.821 731.903 .083 *-@@@++++++++++++++**
TP 286 1 1 731.907 731.978 .072 *-@@@++++++++++++++++*
TP 287 1 1 731.980 732.033 .054 *-@@@+++++++++++++++++**
TP 288 1 1 732.052 732.179 .128 *-@@@+++++++++++++++++++**
CPT 289 1 1 732.066 732.179 .114 --> @@@|||||||||||||||||||||
CPT 290 1 1 732.235 732.283 .049 --> @@@|||||||||||||||||||||
CPT 291 1 1 732.306 732.326 .021 --> @@@|||||||||||||||||||||
TP 292 1 1 732.306 732.326 .021 *-@@@+++++++++++++++++++++*
CPT 293 1 1 732.328 732.452 .125 --> @@@||||||||||||||||||||||
TP 294 1 1 732.328 732.482 .155 *-@@@++++++++++++++++++++++**
CPT 295 1 1 732.454 732.482 .029 --> @@@||||||||||||||||||||||||
CPT 296 1 1 732.484 732.506 .023 --> @@@||||||||||||||||||||||||
TP 297 1 1 733.371 733.513 .143 *-***||||||||||||||||||||||||


Virtual Connectron - Example 2

Driving C1/C2

Virtual Connectron

Type Num Jobno Chr Start Stop Length GeneName
CPT 962 1 1 3697.812 3697.832 .021 --> |||||||| |||
CPT 963 1 1 3697.834 3697.888 .055 --> |||||||| |||

Type Num Jobno Chr Start Stop Length GeneName
TP 494 1 1 1268.586 1268.677 .092 *-@@@@**
CPT 495 1 1 1268.586 1268.696 .111 --> @@@@@@
TN 496 1 1 1268.586 1268.696 .111 *-@@@@@@**
TN 497 1 1 1268.836 1268.948 .113 *-*@@@@@@@*
TP 498 1 1 1268.851 1268.948 .098 *--@@@@@@@@*
TN 499 1 1 1268.950 1268.988 .039 *-**@@@@@@@@
TP 500 1 1 1268.950 1268.988 .039 *-@-@@@@@@@@*
TP 501 1 1 1269.009 1269.119 .111 *-@-@@@@@@@@@**
TN 502 1 1 1269.009 1269.207 .199 *-@**@@@*@@@@@@*
TP 503 1 1 1269.121 1269.207 .087 *-@@-@*@-@@@@@@@*
TN 504 1 1 1269.210 1269.233 .024 *-@@*@ @ @@@@@@@@
TN 505 1 1 1269.273 1269.308 .036 *-@@@@*@ @@@@@@@@
CNT 506 1 1 1269.371 1269.641 .271 OS-> @@@@@@ @@@@@@@@
TN 507 1 1 1269.371 1269.641 .271 *-**@*@@-@*@@@@@@
TP 508 1 1 1269.407 1269.547 .141 *---@-@@-@-**@@@@
TP 509 1 1 1269.565 1269.637 .073 *---@-@@-@---*@@@
TN 510 1 1 1269.643 1269.746 .104 *---*-@@-*----@*@
CNT 511 1 1 1269.643 1269.767 .125 --> @@ @ @
TP 512 1 1 1269.644 1269.765 .122 *-----@*------*-*


Deeply Nested Connectrons

GP 115 1 1 335.149 338.967 3.819 yahe ||||||||||||
TP 116 1 1 338.974 339.040 .067 *-******+++++++++++***
CPT 117 1 1 338.974 339.154 .181 OS-> ||||| ||||||||||||||
TN 118 1 1 338.974 339.154 .181 *-+++++-++++*+++++++++*******
GG 119 1 1 338.993 339.313 .321 Group0019 ||||| |||| ||||||||||||||||
TP 120 1 1 339.049 339.133 .085 *-+++++**+++-++++++++++++++++*******
TP 121 1 1 339.193 339.215 .023 *-++++++-+++*+++++++++++++++++++++++*
CPT 122 1 1 339.194 339.215 .022 --> |||||| ||||||||||||||||||||||||||||
TN 123 1 1 339.218 339.294 .077 *-++++++*++++*+++++++++++++++++++++++******
CPT 124 1 1 339.218 339.308 .091 OS-> ||||||||||| |||||||||||||||||||||||||||||
TP 125 1 1 339.235 339.308 .074 *-+++++++*+++-+++++++++++++++++++++++++++++********
CPT 126 1 1 339.310 339.332 .023 --> ||||||| ||| |||||||||||||||||||||||||||||||||||||
TN 127 1 1 339.310 339.339 .030 *-+++++++-+++*+++++++++++++++++++++++++++++++++++++*
GG 128 1 1 339.389 353.816 14.427 Group0020 ||||||| ||||||||||||||||||||||||||||||||||||||||||
TP 129 1 1 353.859 353.879 .021 *-+++++++*||||||||||||||||||||||||||||||||||||||||||
TP 130 1 1 353.973 353.993 .021 *-++++++++++++++++++++++++++++++++++++++++++++++++++*
GG 131 1 1 354.146 356.678 2.533 Group0021 |||||||||||||||||||||||||||||||||||||||||||||||||||
CNT 132 1 1 356.697 356.787 .091 OS-> |||||||||||||||||||||||||||||||||||||||||||||||||||
TN 133 1 1 356.697 356.787 .091 *-++++++*++++**+++++++*++++++++++++++++++++++++++++++****
TP 134 1 1 356.697 356.787 .091 *-*++++*-+*+*--+++++++-++++++++++++++++++++*+++++++++++++*******
CPT 135 1 1 356.700 356.780 .081 OS-> |||| | | ||||||| |||||||||||||||||||| ||||||||||||||||||||
CPT 136 1 1 356.900 356.980 .081 OS-> |||| | | ||||||| |||||||||||||||||||| ||||||||||||||||||||
CNT 137 1 1 356.900 356.987 .088 OS-> |||| | | ||||||| |||||||||||||||||||| ||||||||||||||||||||
TN 138 1 1 356.900 356.987 .088 *-*++++**+*+---*++++++-*+++++++++++++*+++++-+++++++*||||||||||||
TP 139 1 1 356.900 356.987 .088 *-+*+++++++****-++++++*-+++++*++++++*-+++++**++++++-++++++++++++**
GG 140 1 1 357.015 360.370 3.355 Group0022 | ||||||| ||| ||||||| ||||| |||||| |||||| |||||| ||||||||||||||
TN 141 1 1 360.386 360.411 .026 *-**+++++++*+++*+++++++-*++++-++++++--*+++++-++++++-+*||||||||||||
TP 142 1 1 360.386 360.411 .026 *--+*+++++++*++++++++++*-++++**+++++*--+++++-*+++++-+-+++*||||||||
GG 143 1 1 360.473 374.105 13.632 Group0023 | ||||||| ||||||||||| ||||| |||||| ||||| ||||| | ||| ||||||||
GP 144 1 1 371.333 374.105 2.773 mhpd | ||||||| ||||||||||| ||||| |||||| ||||| ||||| | ||| ||||||||
TN 145 1 1 374.145 374.189 .045 *--*-++*++++-+++*+++++++-*++++-++++++--*++++--+++++-+-*|| ||||||||
TP 146 1 1 374.145 374.195 .051 *----*+-++++-*++-++++++*--++++-*+++++---++++--*++++-+--++-*|||||||
CPT 147 1 1 374.153 374.195 .043 OS-> | |||| || |||||| |||| ||||| |||| |||| | || |||||||
TN 148 1 1 374.230 374.290 .061 *-----+-*++*--++-*+++++---*+++--+++++---*+++---++++-+--*| |||||||
TP 149 1 1 374.247 374.296 .050 *-----*--++---*+--+++++----++*--*++++----+++---*+++-+---+--*||||||
CPT 150 1 1 374.254 374.324 .071 OS-> || | ||||| || |||| ||| ||| | | ||||||
CPT 151 1 1 374.326 374.348 .023 --> || | ||||| || |||| ||| ||| | | ||||||
TN 152 1 1 374.331 374.391 .061 *--------+*----*--*++++----*+----++++----*++----+++-+---* ||||||
TP 153 1 1 374.348 374.391 .044 *--------+---------*++*-----+----*++*-----++----*++-+-------*|||||
CPT 154 1 1 374.404 374.425 .022 --> | || | || || || | |||||
CPT 155 1 1 374.427 374.452 .026 --> | || | || || || | |||||
TP 156 1 1 374.449 374.469 .021 *--------+----------*+------+-----*+------+*-----*+-+--------*||||
CPT 157 1 1 374.454 374.476 .023 --> | | | | | | | ||||
TP 158 1 1 374.471 374.515 .045 *--------+-----------*------+------*------+-------*-+---------*+*|
TN 159 1 1 374.472 374.492 .021 *--------+------------------*-------------* | | |
CPT 160 1 1 374.505 374.526 .022 --> | | | |


Geneless Connectrons

There is a class of connectrons that are not associated with any gene - the so-called “geneless connectrons” or more properly “orf-less connectrons”

The geneless connectrons occur in the non-genic portion of a genome.

There are most probably many hierarchies of geneless connectrons for each cell type.


Orf-less Gene Model

|<-------------------Promoter----------------->|
|<-----Enhancer/Repressor------>|<--TATA Box-->|
                                               |<-Beginning of Translation
                                               | End of Translation----->|
+ strand -------------------------------------------------------------
- strand -------------------------------------------------------------
                                               |<-----3'UTR--------->|
                                                  |<-C1->|--|<-C2->|


Levels of Connectron Structure

Level 4  T1 C1/C2a C1/C2b ... T2  C1/C2

Where C1/C2a and C1/C2b are Level 3

Level 3  T1 C1/C2a C1/C2b ... T2  C1/C2

Where C1/C2a and C1/C2b are Level 1 or Level 2

Level 2  T1 T T T T ... T2  G-C1/C2

Level 1  T1 G G G G ... T2  G-C1/C2


SNPs

Connectrons are resistant to single base mutations.

The RNA forming the two Hoogsteen triple-strand helices is often longer than the minimum 15-base length

Any distribution of the C1/C2 length over the minimum is usable.

Mutations just make a weaker X-shaped structure.
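One way to picture this tolerance is to look at the longest stretch that still matches after a single base change. The sketch below is one possible reading rather than the author's criterion: it assumes that binding strength tracks the longest exactly matching run and that a run of at least 15 bases is still usable. The function `longest_match_run` and the example sequences are made up for illustration.

```python
MIN_ELEMENT_LEN = 15  # minimum element length given in the talk


def longest_match_run(rna: str, target: str) -> int:
    """Length of the longest run of identical bases at the same positions."""
    best = run = 0
    for x, y in zip(rna, target):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


def mutate(seq: str, pos: int) -> str:
    """Return seq with the base at pos swapped for a different base."""
    new_base = "C" if seq[pos] != "C" else "A"
    return seq[:pos] + new_base + seq[pos + 1:]


long_element = "ACGT" * 10          # 40 bases, well over the minimum
short_element = "ACGTACGTACGTACG"   # exactly 15 bases

# A single change in the middle of each element.
print(longest_match_run(long_element, mutate(long_element, 20)))
print(longest_match_run(short_element, mutate(short_element, 7)))
```

Under this reading the 40-base element keeps a matching run of at least 15 bases after the change, while the minimum-length element does not.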


Loose X Structure

Tight X Structure


Connectrons versus Genome Size

The number of genes in a genome is not particularly correlated with the size of the genome.

The size of the genome is linearly correlated with the number of connectrons.


Genome Size vs Connectron Number


Connectrons occur across chromosomes

In a multi-chromosomal genome, C1/C2 sources on one chromosome create connectrons on the same and other chromosomes.

S. cerevisiae is a wonderful example.


S. cerevisiae cross-chromosome connectron table

Saccharomyces cerevisiae complete genome.

Chromosome of T1-T2

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
------------------------------------------------------------------------------------------------------
1 | 251 67 102 105 10 12 54 3 1 80 58 27 7 34 43 |
2 | 9 64 29 97 157 90 24 4 2 13 59 119 18 13 99 |
3 | 9 232 161 153 127 28 3 11 52 44 40 15 66 |
4 | 27 119 339 1095 318 33 451 44 13 207 19 261 240 66 133 270 |
5 | 2 40 84 77 68 10 68 4 7 51 4 104 45 6 21 46 |
6 | 1 4 28 24 37 3 41 2 7 2 11 |
7 | 13 168 439 481 214 48 571 37 7 245 21 333 209 48 129 193 |
8 | 24 54 124 114 12 17 81 30 6 82 94 27 6 38 51 |
9 | 2 11 8 49 36 1 34 13 48 5 28 12 10 9 18 |
10 | 7 7 63 125 172 5 106 18 2 95 8 82 146 21 20 106 |
11 | 5 45 48 107 4 7 43 55 52 16 25 2 29 24 |
12 | 13 142 262 348 171 34 260 17 1 174 3 446 131 45 92 163 |
13 | 10 80 126 247 143 14 133 16 6 107 14 81 207 21 53 123 |
14 | 3 57 165 144 27 25 103 4 1 108 82 50 31 49 69 |
15 | 4 19 46 86 120 4 94 17 2 16 8 54 80 12 32 86 |
16 | 14 54 130 260 240 11 170 43 4 80 18 127 152 41 49 246 |

500-base lower limit - 7-finger - before pruning
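A table of this shape is a tally of chromosome pairs. The sketch below shows how such a tally could be built, assuming each connectron is recorded as a pair (chromosome of the C1/C2 source, chromosome of T1-T2) and that rows are source chromosomes; the slide labels only the columns, so the row meaning is an assumption. The function name `cross_chromosome_table` and the sample pairs are made up.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def cross_chromosome_table(pairs: Iterable[Tuple[int, int]],
                           n_chromosomes: int) -> List[List[int]]:
    """Count connectrons per (source chromosome, target chromosome) pair.

    Rows are source chromosomes and columns are target chromosomes,
    both numbered from 1; pairs that never occur are counted as 0.
    """
    counts = Counter(pairs)
    return [[counts[(src, tgt)] for tgt in range(1, n_chromosomes + 1)]
            for src in range(1, n_chromosomes + 1)]


# Made-up connectrons for a three-chromosome genome.
pairs = [(1, 1), (1, 2), (1, 2), (2, 3), (3, 1), (3, 3), (3, 3)]
for row_number, row in enumerate(cross_chromosome_table(pairs, 3), start=1):
    print(row_number, "|", *row, "|")
```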


Duplicated Fragments

Connectrons are based on the fact that there are duplicated sequences in a genome.

Many fragments have only a few instances

A few fragments have many instances.
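The instance counts described on this slide can be approximated by counting repeated fixed-length fragments. This is a minimal sketch, assuming exact 15-base fragments on a single strand; the author's program may define fragments differently. The names `fragment_counts` and `instance_histogram` are illustrative, and the toy example uses 3-base fragments so the result is easy to follow.

```python
from collections import Counter


def fragment_counts(genome: str, k: int = 15) -> Counter:
    """Count every overlapping k-base fragment in the sequence."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def instance_histogram(genome: str, k: int = 15) -> Counter:
    """Map 'number of instances' -> 'number of duplicated fragments'."""
    counts = fragment_counts(genome, k)
    return Counter(n for n in counts.values() if n > 1)


toy_genome = "ACGTACGTTTACG"
print(fragment_counts(toy_genome, k=3).most_common(3))
print(sorted(instance_histogram(toy_genome, k=3).items()))
```

In the toy sequence, two fragments occur twice and one occurs three times, the same "many with few instances, few with many" shape on a tiny scale.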


S. cerevisiae - Fragment Distribution

[Chart: y-axis "Number of Fragments", logarithmic scale from 1 to 100,000; x-axis ticks from 1 to 82]


Genes per Group

Many groups of genes controlled by connectrons contain only one gene.

In S. cerevisiae in particular these one-gene groups are the LTRs (Long Terminal Repeats)

A few groups have many genes

The distribution follows an exponential curve


S. cerevisiae - Genes per Group

[Chart: x-axis "Number of Genes in a Group", 1 to 20; y-axis "Number of Groups", 0 to 1000]


Distribution of C1/C2 distance from last exon

Many C1/C2 connectron sources occur immediately following the last exon

In S. cerevisiae some of the C1/C2s are at extreme distances (i.e., 10 kb) from the last exon with no intervening genes


S. cerevisiae - Distribution of C1/C2 Lag from Last Exon

[Chart: x-axis "C1/C2 Instance", 1 to 7050; y-axis "Lag from Last Exon", 0 to 10000]


Distribution of C1/C2 lengths

Many of the C1/C2 fragments are of the minimum length of 15 bases

A few C1/C2s are very long (i.e., over 100 bases in length)

The distribution follows an exponential pattern


S. cerevisiae - Distribution of C1/C2 Lengths

[Chart: x-axis "C1/C2 Instance", 1 to 6898; y-axis "Length of C1/C2 Fragment", 0 to 100]


Distribution of T1/T2 lengths

Many of the T1/T2 fragments are of the minimum length of 15 bases

A few T1/T2s are very long (over 100 bases in length)

The distribution follows an exponential pattern

Because of the many-to-one and one-to-many relationships, the C1/C2 distribution and the T1/T2 distribution can differ.


52

S. cerevisiae - Distribution of T1/T2 Lengths

[Chart: length of T1-T2 fragment (y-axis, 0 to 100) for each T1-T2 instance (x-axis, 1 to about 6,670)]


53

Do connectrons occur on both strands?

In S. cerevisiae the positive strand is favored when the gap between the last exon and the C1/C2 is short.

As this gap gets longer the positive and negative strands have equivalent numbers of connectrons
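The strand comparison above amounts to binning connectrons by gap and taking a ratio per bin. The following is a minimal sketch of that bookkeeping, assuming each connectron is given as a (gap in bases, strand) pair with strand "+" or "-". The record format, the function name and the 1,000-base bin size are assumptions made here, not details from the talk.

```python
from collections import defaultdict


def strand_ratio_by_gap(records, bin_size=1000):
    """Return {bin_start: positive/negative count ratio} for gap bins.

    A bin with no negative-strand connectrons maps to None, because the
    ratio is undefined there.
    """
    counts = defaultdict(lambda: {"+": 0, "-": 0})
    for gap, strand in records:
        if strand not in ("+", "-"):
            raise ValueError(f"unknown strand: {strand!r}")
        bin_start = (gap // bin_size) * bin_size
        counts[bin_start][strand] += 1

    ratios = {}
    for bin_start in sorted(counts):
        pos = counts[bin_start]["+"]
        neg = counts[bin_start]["-"]
        ratios[bin_start] = pos / neg if neg else None
    return ratios


# Made-up records: short gaps favor "+", longer gaps are balanced.
example_records = [(10, "+"), (20, "+"), (30, "+"), (40, "-"),
                   (1500, "+"), (1600, "-")]
example_ratios = strand_ratio_by_gap(example_records)
```

A ratio near 1 in the long-gap bins corresponds to the equivalent strand counts described above.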


54

C. elegans - Ratio of Positive to Negative Strand Connectrons

[Chart: ratio (y-axis, 0 to 4) versus number of bases between last exon and C1/C2 (x-axis, 0 to 10,000)]


55

Clusters of Orthologous Genes

The COGs, as defined by David Lipman and Eugene Koonin at the NCBI, specify the relationships of genes across (bacterial) genomes.

Genes that are co-linear in one genome are distributed in another genome.

There seems to be no conservation of flanking T1 and T2 sequences across any two (bacterial) genomes.


56

Connectrons occur across chromosomes and plasmids

In single- and multi-chromosome genomes, connectrons occur in both directions between the chromosomes and the associated plasmids.

In D. radiodurans connectrons occur between the two chromosomes and the two plasmids.

In S. meliloti the chromosome is vestigial, with most of the connectrons originating in the associated mega-plasmid.


57

Emergent Property of a Genome

Connectrons are one of the first properties to emerge as the result of whole-genome sequencing.

The connectron paradigm replaces the “one-gene - one-effect” paradigm with a rich gene expression control mechanism.

Connectrons can be computed (meaningfully) for any complete, stable genome.


58

Connectrons, iRNA and stRNA

The 3’UTR RNA produced by the expression of a gene is used to form connectrons and interference RNAs (iRNAs).

The iRNA forms Hoogsteen triple-helices around the cognate double-strand DNAs.

The lifetime of these triple-helices is determined by their length.

Small temporal RNAs (stRNAs) are distinguished from iRNAs only by their lifetimes.

iRNAs and stRNAs block the expression of related RNAs in the 3’UTR of other genes.


59

iRNAs and stRNAs

Interference RNAs (iRNAs) and Small Temporal RNAs (stRNAs) are now included in connectron determinations and calculations.

stRNAs have short lifetimes

iRNAs have longer lifetimes.

Connectrons are the same sequences that bind to two widely separated (e.g. 100 kb apart) targets.


60

Simulation of Connectron Control of Gene Expression

Connectrons have lifetimes

A C1/C2 connectron source may originate from a gene that is already in a connectron

The collection of all the connectrons for a genome forms an abstract state machine
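The state-machine view above can be illustrated with a toy update rule. The following is a minimal sketch under simplifying assumptions made here, not a reconstruction of the author's simulation program: time advances in discrete cycles, a gene that is not turned off is treated as expressed, and an expressed source gene turns its target genes off for a fixed number of cycles. The `Connectron` class, the `step` function and the gene names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connectron:
    source: str      # gene whose expression produces the C1/C2 source
    targets: tuple   # genes turned off while the connectron exists
    lifetime: int    # number of cycles the targets stay off (assumed unit)


def step(off_timers, connectrons, genes):
    """Advance one cycle. off_timers maps gene -> remaining cycles off."""
    # Genes with no running timer are treated as expressed this cycle.
    active = {g for g in genes if off_timers.get(g, 0) == 0}
    # Age the existing connectrons; expired ones drop out.
    next_timers = {g: t - 1 for g, t in off_timers.items() if t > 1}
    # Expressed source genes form (or renew) their connectrons.
    for c in connectrons:
        if c.source in active:
            for target in c.targets:
                next_timers[target] = max(next_timers.get(target, 0), c.lifetime)
    return next_timers


def percent_off(off_timers, genes):
    """Percentage of genes currently turned off."""
    return 100.0 * len(off_timers) / len(genes)


genes = ["a", "b", "c"]
connectrons = [Connectron("a", ("b",), 2), Connectron("b", ("c",), 1)]
state1 = step({}, connectrons, genes)      # a turns b off, b turns c off
state2 = step(state1, connectrons, genes)  # b is off, so c is released
state3 = step(state2, connectrons, genes)  # the state repeats
```

Because a source gene can itself be a target, turning one gene off can release others, which is what makes the whole collection behave like a state machine rather than a static list of switches.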


61

[Simulation trace, left-hand part of the program output. Each row begins with an "Avail" count, followed by a fixed-width position map printed as runs of 1s and, after the dashed separator, 2s. The column alignment of the maps was lost in extraction, so only the Avail column is reproduced.]

Avail, rows before the separator: 453, 247, 191, 186, 166, 160, 157, 156, 137, 135, 135, 133, 132, 107, 105, 101, 97, 94, 90, 82, 81, 80, 77, 77, 77, 76, 75, 73, 70, 152

Avail, rows after the separator: 147, 144, 142, 102, 102, 96, 94, 82, 79, 79


62

[Simulation trace, right-hand part of the program output. Each row ends a fixed-width position map (runs of 1s and 2s) with the numeric columns below. The column alignment of the maps was lost in extraction, so only the numeric columns are reproduced.]

%Perm  Trial  Cycle  Count   Ones   %Off  1Shot
   .0      1      1      0     70   10.9      0
   .0      1      2      0    247   38.3      0
   .0      1      3      0    309   47.9      0
   .0      1      8      0    312   48.4      0
   .0      1     17      0    331   51.3      0
   .0      1     21      0    338   52.4      0
   .0      1     25      0    353   54.7      0
   .0      1     43      0    360   55.8      0
   .0      1     51      0    371   57.5      0
   .0      1     54      0    372   57.7      1
   .0      1     56      0    372   57.7      1
   .0      1     59      0    377   58.4      1
   .0      1     69      0    390   60.5      1
   .0      1     90      0    411   63.7      1
   .0      1     98      0    411   63.7      1
   .0      1    107      0    415   64.3      1
   .0      1    111      0    422   65.4      1
   .0      1    128      0    426   66.0      2
   .0      1    131      0    434   67.3      2
   .0      1    132      0    440   68.2      3
   .0      1    236      0    441   68.4      3
   .0      1    283      0    446   69.1      3
   .0      1    290      0    447   69.3      3
   .0      1    362      0    448   69.5      3
   .0      1    368      0    448   69.5      3
   .0      1    449      0    449   69.6      4
   .0      1    462      0    453   70.2      4
   .0      1    474      0    455   70.5      5
   .0      1    483      0    461   71.5      5
   .0      1   1314      0    341   52.9      5
--- State = 120 Changes = 2 Time = ---
   .0      2     13      0    348   54.0      0
   .0      2     17      0    358   55.5      0
   .0      2     27      0    365   56.6      0
   .0      2     30      0    413   64.0      1
   .0      2     42      0    414   64.2      1
   .0      2     46      0    418   64.8      1
   .0      2     49      0    422   65.4      1
   .0      2     51      0    449   69.6      1
   .0      2     55      0    450   69.8      1
   .0      2    100      0    451   69.9      1


63

Simulation of Cellular Behavior

The program for the simulation of cellular control of gene expression by connectrons is now at mid-stage of development.

First results in the E. coli genome indicate that 60% to 80% of the genes are turned off at any given time.

Any gene that is not turned off by connectron control is open to promotion and transcription.


64

An informatic view of the biological world

David States (now at Washington University in St. Louis) argued that

“All biological systems are essentially informatic systems that happen to be implemented in molecules.”


65

Connectrons do it!

In the last two years, I have found

A purely informatic system for the high-level control of gene expression exists above the level of promotional control of gene expression.

I call these control elements “Connectrons”


66

Connectrons Exist in all three Kingdoms

The four-sequence relationship that prevents sets of genes from being expressed has now been found in all public genomes.

In most genomes the percentage of genes controlled by connectrons ranges between 95% and 97%


67

Genomes Covered

“The Bad Bug List”

Pseudomonas aeruginosa PA01
Deinococcus radiodurans
Streptococcus pneumoniae
Saccharomyces cerevisiae
Sinorhizobium meliloti
Escherichia coli K-12 MG1655
Escherichia coli K-12, Plasmid F & Bacteriophage
Caulobacter crescentus
Halobacterium sp. NRC-1
Rickettsia conorii Malish 7
Mycobacterium tuberculosis
Lactococcus lactis
Haemophilus influenzae
Helicobacter pylori 26695
Methanococcus jannaschii
Synechocystis
Aquifex aeolicus
Bacillus subtilis
Aeropyrum pernix
Streptococcus pneumoniae - TIGR4
Streptococcus pneumoniae R6
Ureaplasma urealyticum
Helicobacter pylori J99
Methanobacterium thermoautotrophicum
Mycobacterium leprae
Escherichia coli O157:H7
Pasteurella multocida
Yersinia pestis
Bacillus halodurans
Escherichia coli O157:H7:EDL933
Agrobacterium tumefaciens strain C58
Xylella fastidiosa
Vibrio cholerae
Sulfolobus tokodaii
Chlamydia pneumoniae CWL029


68

Genomes Covered (cont.)

“The Bad Bug List”

Mycoplasma genitalium G37
Thermoplasma acidophilum
Chlamydophila pneumoniae J138
Mycoplasma pneumoniae
Thermotoga maritima
Chlamydophila pneumoniae AR39
Campylobacter jejuni
Staphylococcus aureus strain N315
Archaeoglobus fulgidus
Listeria monocytogenes strain EGD
Staphylococcus aureus strain Mu50
Borrelia burgdorferi
Pyrococcus horikoshii
Listeria innocua Clip11262
Buchnera sp. APS
Salmonella typhimurium LT2
Pyrococcus abyssi
Salmonella enterica serovar Typhi
Rickettsia prowazekii
Chlamydia trachomatis
Treponema pallidum


69

Percentage of genes controlled by connectrons

There are three parameters that determine the percentage of the genes controlled by connectrons

(1) Minimum fragment length (set to 15 bases)

(2) Maximum gap between C1 and C2 (set to 100 bases maximum)

(3) Maximum distance from last exon to C1/C2 (determined for each genome)
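The three parameters above act as filters on candidate connectrons, and the reported percentage is then a simple ratio. The following is a minimal sketch of that idea only. The class, field and function names are hypothetical, the per-genome lag value of 1,000 is a placeholder, and how candidates are actually found is not shown because the slides do not describe it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectronParams:
    min_fragment_length: int = 15        # parameter (1), in bases
    max_c1_c2_gap: int = 100             # parameter (2), in bases
    max_lag_from_last_exon: int = 1000   # parameter (3), placeholder; set per genome


def passes_filters(params, fragment_length, c1_c2_gap, lag_from_last_exon):
    """True if a candidate satisfies all three parameter limits."""
    return (fragment_length >= params.min_fragment_length
            and c1_c2_gap <= params.max_c1_c2_gap
            and lag_from_last_exon <= params.max_lag_from_last_exon)


def percent_controlled(controlled_genes, all_genes):
    """Percentage of all_genes that appear in controlled_genes."""
    all_genes = set(all_genes)
    if not all_genes:
        raise ValueError("all_genes must not be empty")
    return 100.0 * len(set(controlled_genes) & all_genes) / len(all_genes)


params = ConnectronParams(max_lag_from_last_exon=1000)
example_percent = percent_controlled({"g1", "g2", "g3"}, ["g1", "g2", "g3", "g4"])
```

Loosening any of the three limits admits more candidates, so the percentage of controlled genes can only stay the same or rise.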


70

The Bad Bug List

10-gap, 15-base

Name                                                % Ctron Control  Lag Length  Kingdom
Aeropyrum pernix                                    99.04            1,000       Archaea
Agrobacterium tumefaciens strain C58                96.84            10          Bacteria
Aquifex aeolicus                                    99.29            8,500       Bacteria
Arabidopsis thaliana                                23.74            600         Eukaryota
Archaeoglobus fulgidus                              99.38            5,000       Archaea
Bacillus halodurans                                 97.04            1,000       Bacteria
Bacillus subtilis                                   99.27            10,000      Eubacteria
Borrelia burgdorferi                                98.64            1,000       Eubacteria
Buchnera sp. APS                                    97.29            10          Bacteria
Campylobacter jejuni                                99.73            10          Bacteria
Caulobacter crescentus                              99.97            10          Bacteria
Chlamydia pneumoniae CWL029                         90.63            12,000      Bacteria
Chlamydia trachomatis                               78.81            5,000       Bacteria
Chlamydophila pneumoniae AR39                       78.79            1,500       Bacteria
Chlamydophila pneumoniae J138                       88.23            8,500       Bacteria
Deinococcus radiodurans                             99.18            10          Bacteria
Escherichia coli K-12 MG1655                        96.81            10          Bacteria
Escherichia coli K-12, Plasmid F & Bacteriophage    96.81            10          Bacteria
Escherichia coli O157:H7                            97.97            10          Bacteria
Escherichia coli O157:H7:EDL933                     96.86            10          Bacteria
Haemophilus influenzae                              99.67            10          Bacteria
Halobacterium sp. NRC-1                             99.95            10          Archaea
Helicobacter pylori 26695                           99.61            10          Bacteria
Helicobacter pylori J99                             98.28            10          Bacteria
Lactococcus lactis                                  99.90            10          Bacteria
Listeria innocua Clip11262                          98.18            1,000       Bacteria
Listeria monocytogenes strain EGD                   99.25            1,000       Bacteria
Methanobacterium thermoautotrophicum                98.21            10,000      Archaea
Methanococcus jannaschii                            99.61            10          Archaea
Mycobacterium leprae                                98.04            3,000       Bacteria


71

10-gap, 15-base

Name                                                % Ctron Control  Lag Length  Kingdom
Mycobacterium tuberculosis                          99.91            1,000       Bacteria
Mycoplasma genitalium G37                           89.30            10          Bacteria
Mycoplasma pneumoniae                               84.48            10          Bacteria
Neisseria meningitidis MC58                         99.79            600         Bacteria
Neisseria meningitidis Z2491                        99.90            600         Bacteria
Pasteurella multocida                               97.53            10          Bacteria
Pseudomonas aeruginosa PA01                         99.96            10          Bacteria
Pyrococcus abyssi                                   93.38            3,500       Archaea
Pyrococcus horikoshii                               98.37            3,500       Archaea
Rickettsia conorii Malish 7                         99.92            10          Bacteria
Rickettsia prowazekii                               93.30            750         Bacteria
Saccharomyces cerevisiae                            98.91            10,000      Eukaryota
Saccharomyces cerevisiae, mitochondrion & plasmid   98.85            10,000      Eukaryota
Salmonella enterica serovar Typhi                   93.33            10          Bacteria
Salmonella typhimurium LT2                          95.59            10          Bacteria
Sinorhizobium meliloti                              98.05            1,000       Bacteria
Staphylococcus aureus strain Mu50                   99.16            10          Bacteria
Staphylococcus aureus strain N315                   99.65            10          Bacteria
Streptococcus pneumoniae                            99.00            10          Bacteria
Streptococcus pneumoniae - TIGR4                    99.00            10          Bacteria
Streptococcus pneumoniae R6                         98.84            10          Bacteria
Sulfolobus tokodaii                                 94.45            10          Archaea
Synechocystis                                       99.43            10          Eubacteria
Thermoplasma acidophilum                            88.49            5,000       Archaea
Thermotoga maritima                                 81.65            10          Bacteria
Treponema pallidum                                  69.56            5,000       Bacteria
Ureaplasma urealyticum                              98.73            10          Bacteria
Vibrio cholerae                                     94.66            1,000       Bacteria
Xylella fastidiosa                                  95.03            1,500       Bacteria
Yersinia pestis                                     97.34            10          Bacteria


72

Collaboration to show the Physical Existence of Connectrons

Drs. Sankar Adhya and Susan Garges in the NCI have designed and implemented physical experiments in E. coli

First results show that the deletion of a “one-shot” connectron of 50kb with about 60 flagella genes causes changes in gene expression

Paper to be published in PNAS by mid year.


73

Need to broaden the range of physical experimentation

Since all genomes have connectrons of the same form, the initial proof of the existence of connectrons in E. coli has great importance.

The density of connectrons controlling a particular set of genes is very much genome-dependent

Physical experiments should be carried out on a whole range of genomes


74

Basic vs Applied Research

Most of the conceptual developments are really basic research.

The need for patent priority has hampered broader dissemination of the work.

When the physical proofs are ready for publication the balance will change.

Most commercial investment is concerned with end-use of connectron developments which is still years away.


75

Processing the Human Genome

Processing the human genome to determine the connectron structure will make it possible to investigate many human diseases

There are “connectron defect” diseases, which are different from “gene defect” diseases


76

Processing the Human Genome

Connectrons are determined from a pair of chromosomes.

The half-diagonal of 24*24 jobs is 300 jobs

Each pair of chromosomes has to be broken up into 50mb chunks.

There are 700 such chunks

The total number of jobs is 300 * 700 * 700 / 2 = 73.5 * 10^6
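The job count above can be reproduced directly. The following short sketch just re-runs the slide's arithmetic. It assumes the 300 comes from counting unordered chromosome pairs with each chromosome also paired with itself (24 * 25 / 2), which matches the stated figure. The formula for the total is taken from the slide as given, and the variable names are hypothetical.

```python
def half_diagonal(n: int) -> int:
    """Unordered pairs of n items, counting each item paired with itself."""
    return n * (n + 1) // 2


chromosomes = 24   # from the slide
chunks = 700       # from the slide

chromosome_pairs = half_diagonal(chromosomes)
total_jobs = chromosome_pairs * chunks * chunks // 2

print(chromosome_pairs)  # 300
print(total_jobs)        # 73500000
```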


77

Zinc-finger DNA Binding Proteins (DBPs) as therapeutic agents

DBPs can block C1/C2, T1 or T2 sites

DBPs can bind across T1 and T2 sites forming a DBP connectron

[Diagram labels: P, G and C1/C2; T1 and T2 sites; DBP]


78

Where is the competition?

There are lots of papers appearing on iRNA and stRNA

None of these people have understood the nature of the tetradic connectron relationship

Thomas Werner, who is at Genomatix in Munich, is studying matrix attachment regions

Matrix attachment regions are responsible for bringing the T1 and T2 proximal to each other so connectrons can be formed


79

Genomatix View


80

Patent Status of the Connectron Technology

A basic methods US and PCT patent filed May 30th, 2001

USPTO analysis shows that there are 19 inventions

41 Bacterial, Archaeal and Eukaryotic genomes covered by US Provisional Patent Applications


81

Patenting whole genomes

People get all bent out of shape when they hear that I have been patenting the connectron structure of many whole genomes

My view is that if I don’t do it then someone else will reverse-engineer the connectron determination algorithm and do it themselves

The connectrons are both an observation and an invention

The utility which is the key to patentability is that a particular C1/C2 when expressed forms a T1-T2 connectron that turns off a particular set of genes


82

Where do we go from here

Simulation of E. coli to relate Affymetrix-type gene expression measurement to modeled cell behavior

Processing, analysis and simulation of C. elegans as the model for differentiation and development

Processing of the human genome

Modification of genomic properties using zinc-finger DBPs


83

The High Ground of the 21st Century

A patented concept of total, systematic gene expression control

Ability to compute all the gene expression control structures from genomic information

Ability to patent all computed instances of these control structures based on known content-of-matter and function


84

The High Ground of the 21st Century

Ability to validate all gene expression events through existing measurement techniques

Ability to simulate the gene expression control behavior of the complete organism

Ability to set biological engineering standards


85

My responsibility as inventor

Modification of genomic behavior by changing connectron interactions will be a very powerful force in our global society in a few years

I feel a very deep responsibility for the future history of this invention

My intention is that everyone should and will have access to this invention

But everyone will pay - a small bit here and there


86

Contact Information

Richard J. Feldmann (v) 301-926-0921

Global Determinants, Inc. (f) 301-926-7954

17800 Mill Creek Dr. (c) 301-526-8524

Derwood, Maryland 20855-1019 [email protected]