Introduction to DNA Computing
Russell Deaton, Comp. Sci. & Engr., The University of Arkansas, Fayetteville, AR [email protected]
Junghuei Chen, Department of Chem & Biochem, University of Delaware, Newark, DE [email protected]
What is DNA Computing (DNAC)?
The use of biological molecules, primarily DNA, DNA analogs, and RNA, for computational purposes.
Why Nucleic Acids?
• Density (Adleman, Baum):
  • DNA: 1 bit per nm^3, 10^20 molecules
  • Video: 1 bit per 10^12 nm^3
• Efficiency (Adleman):
  • DNA: 10^19 ops / J
  • Supercomputer: 10^9 ops / J
• Speed (Adleman):
  • DNA: 10^14 ops per s
  • Supercomputer: 10^12 ops per s
What makes DNAC possible?
• Great advances in molecular biology
  • PCR (Polymerase Chain Reaction)
  • DNA Microarrays
  • New enzymes and proteins
  • Better understanding of biological molecules
• Ability to produce massive numbers of DNA molecules with specified sequence and size
• DNA molecules interact through template matching reactions
What are the basics from molecular biology that I need to know to understand DNA computing?
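PHYSICAL STRUCTURE OF DNA
[Figure: DNA double helix. Labels: nitrogenous base, sugar-phosphate backbone, major groove, minor groove, central axis; dimensions 34 Å and 20 Å; the two strands run in opposite directions, each with a 5' end and a 3' OH end.]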
INTER-STRAND HYDROGEN BONDING
[Figure: adenine-thymine and guanine-cytosine base pairs, each base attached to the sugar-phosphate backbone. Hydrogen bonds form between (+) and (-) sites: two for adenine-thymine, three for guanine-cytosine.]
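STRAND HYBRIDIZATION
[Figure: strands with regions A, B and complementary regions a, b. HEAT (100° C) separates the strands; COOL lets complementary regions anneal in one arrangement OR another.]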
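The pairing rule behind hybridization can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the function names `reverse_complement` and `can_hybridize` are hypothetical, and the model treats hybridization as exact Watson-Crick matching, ignoring temperature and partial matches.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs with `strand`, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(strand))


def can_hybridize(s1: str, s2: str) -> bool:
    """Idealized test: True only for perfect antiparallel complements."""
    return s2 == reverse_complement(s1)


if __name__ == "__main__":
    print(reverse_complement("GCATGGCC"))
    print(can_hybridize("CACCATGTGAC", "GTCACATGGTG"))
```

Both strands are written 5'->3' here, so a partner strand reads as the reverse of the letter-by-letter complement.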
DNA LIGATION
[Figure: strands labeled α and δ shown before and after ligation.]
Ligase joins 5' phosphate to 3' hydroxyl
RESTRICTION ENDONUCLEASES
[Figure: cut sites of EcoRI, HindIII, AluI, and HaeIII; each cut leaves a 3' OH end and a 5' P end.]
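A restriction digest can be sketched as a string search followed by a cut. The recognition sequences and cut positions below are the standard ones for these four enzymes (they are not given on the slide). The `digest` function and `SITES` table are hypothetical names, and only the top strand is modeled, so sticky-end overhangs are not represented.

```python
# Recognition site and cut offset (bases from the start of the site, top strand).
SITES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC, sticky ends
    "HindIII": ("AAGCTT", 1),  # A^AGCTT, sticky ends
    "AluI": ("AGCT", 2),       # AG^CT, blunt ends
    "HaeIII": ("GGCC", 2),     # GG^CC, blunt ends
}


def digest(strand: str, enzyme: str) -> list[str]:
    """Cut `strand` at every occurrence of the enzyme's recognition site."""
    site, offset = SITES[enzyme]
    fragments = []
    start = 0
    pos = strand.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(strand[start:cut])
        start = cut
        pos = strand.find(site, pos + 1)
    fragments.append(strand[start:])
    return fragments


if __name__ == "__main__":
    print(digest("AAGAATTCGG", "EcoRI"))
    print(digest("GCATGGCCAGCTTAGG", "HaeIII"))
```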
DNA Polymerase
DNA Sequencing
GEL ELECTROPHORESIS - SIZE SORTING
[Figure: gel in buffer between two electrodes; samples are loaded at one end and separate by size into faster and slower bands.]
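In a simulation, size sorting reduces to grouping strands by length. The sketch below assumes an idealized gel in which shorter strands always run faster and bands are perfectly resolved; `run_gel` and `cut_band` are hypothetical helper names.

```python
from collections import defaultdict


def run_gel(samples: list[str]) -> list[tuple[int, list[str]]]:
    """Group strands into bands by length, fastest (shortest) band first."""
    bands = defaultdict(list)
    for strand in samples:
        bands[len(strand)].append(strand)
    return [(length, bands[length]) for length in sorted(bands)]


def cut_band(samples: list[str], length: int) -> list[str]:
    """Recover only the strands of one length, as if excising a band."""
    return [s for s in samples if len(s) == length]


if __name__ == "__main__":
    tube = ["GCATGGCCAGCTTAGG", "GCATGGCC", "AGCTTAGG"]
    for length, strands in run_gel(tube):
        print(length, strands)
```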
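ANTIBODY AFFINITY
[Figure: target strand CACCATGTGAC anneals to the complementary oligo GTGGTACACTG carrying a biotin label (B); the biotin binds a paramagnetic-streptavidin particle (PMP), and a magnet (N/S) isolates the complex.]
• Add oligo with Biotin label
• Heat and cool
• Add Paramagnetic-Streptavidin Particles
• Isolate with Magnet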
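The extraction step can be modeled as a filter that keeps any strand containing the sequence complementary to the probe. This is a rough sketch under the assumption of perfect, error-free binding; `affinity_extract` is a hypothetical name. The probe is written 5'->3' here, so the slide's aligned oligo GTGGTACACTG appears reversed as GTCACATGGTG.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the Watson-Crick partner of `strand`, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(strand))


def affinity_extract(tube: list[str], probe: str) -> tuple[list[str], list[str]]:
    """Split the tube into strands the probe captures and strands left behind."""
    target = reverse_complement(probe)
    captured = [s for s in tube if target in s]
    remainder = [s for s in tube if target not in s]
    return captured, remainder


if __name__ == "__main__":
    tube = ["TTCACCATGTGACAA", "GGGGGGGG"]
    print(affinity_extract(tube, "GTCACATGGTG"))
```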
POLYMERASE CHAIN REACTION
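The amplification that PCR provides can be estimated with a one-line growth model. The sketch assumes each cycle multiplies the template count by (1 + efficiency), with efficiency 1.0 meaning perfect doubling; `pcr_copies` and its parameters are hypothetical, and real reactions plateau as reagents run out.

```python
def pcr_copies(initial: float, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after `cycles` rounds of idealized PCR."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One template, 30 perfect cycles: about a billion copies.
    print(pcr_copies(1, 30))
```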
What is the typical methodology?
• Encoding: Map problem instance onto set of biological molecules and molecular biology protocols
• Molecular Operations: Let molecules react to form potential solutions
• Extraction/Detection: Use protocols to extract result in molecular form
What is an example?
• "Molecular Computation of Solutions to Combinatorial Problems"
  • Adleman, Science, v. 266, p. 1021.
Algorithm
• Generate random paths through the graph.
• Keep only those paths that begin with v_in and end with v_out.
• If graph has n vertices, then keep only those paths that enter exactly n vertices.
• Keep only those paths that enter all the vertices at least once.
• If any paths remain, say "Yes"; otherwise, say "No".
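The five steps above map directly onto a generate-and-filter program. The sketch below is an in-silico analogue, not the laboratory protocol: lists of vertex indices stand in for DNA strands, and `random_path`, `has_hamiltonian_path`, `samples`, and `seed` are hypothetical names. Because paths are sampled at random, a "No" answer means only that no solution was found among the samples.

```python
import random


def random_path(edges, n, rng):
    """Random walk from a random start vertex, stopping early at a dead end."""
    path = [rng.randrange(n)]
    for _ in range(rng.randrange(2 * n)):
        successors = [v for (u, v) in edges if u == path[-1]]
        if not successors:
            break
        path.append(rng.choice(successors))
    return path


def has_hamiltonian_path(edges, n, v_in, v_out, samples=5000, seed=0):
    rng = random.Random(seed)
    # Step 1: generate random paths through the graph.
    tube = [random_path(edges, n, rng) for _ in range(samples)]
    # Step 2: keep paths that begin with v_in and end with v_out.
    tube = [p for p in tube if p[0] == v_in and p[-1] == v_out]
    # Step 3: keep paths that enter exactly n vertices.
    tube = [p for p in tube if len(p) == n]
    # Step 4: keep paths that enter every vertex at least once.
    tube = [p for p in tube if set(p) == set(range(n))]
    # Step 5: "Yes" if any path remains, otherwise "No".
    return len(tube) > 0


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
    print("Yes" if has_hamiltonian_path(edges, 4, 0, 3) else "No")
```

The DNA version performs step 1 for all strands at once in the test tube, which is where the massive parallelism comes from; this sketch samples paths one at a time.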
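Encoding
[Figure: graph with vertices 0, 1, 2 and their oligonucleotides.]
• Vertex 0: GCATGGCC
• Vertex 1: AGCTTAGG
• Vertex 2: ATGGCATG
• Complementary strands: CCGGTCGA, CCGGTACC
• Path 0, 1: GCATGGCCAGCTTAGG, joined across CCGGTCGA
• Path 0, 2: GCATGGCCATGGCATG, joined across CCGGTACC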
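The sequences on this slide are consistent with a simple rule: the strand that bridges two vertex oligonucleotides is the base-by-base complement of the last half of the first vertex followed by the first half of the second. The sketch below reproduces the slide's two bridging strands under that reading. The names `complement`, `linker`, and `VERTEX` are hypothetical, and the assignment of sequences to vertices 0, 1, 2 is inferred from the slide layout.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Vertex oligonucleotides as read from the slide (assignment is an assumption).
VERTEX = {0: "GCATGGCC", 1: "AGCTTAGG", 2: "ATGGCATG"}


def complement(strand: str) -> str:
    """Base-by-base complement, written aligned under the strand (3'->5')."""
    return "".join(PAIR[base] for base in strand)


def linker(v_from: str, v_to: str) -> str:
    """Strand that holds v_from and v_to end to end so they can be ligated."""
    junction = v_from[len(v_from) // 2:] + v_to[:len(v_to) // 2]
    return complement(junction)


if __name__ == "__main__":
    print(linker(VERTEX[0], VERTEX[1]))
    print(linker(VERTEX[0], VERTEX[2]))
```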
What are the success stories?
• Self-Assembling Demonstrated (Winfree, Seeman, Reif et al.)
• New Approaches and Protocols Developed
  • Surface-based (Wisconsin-Madison, DIMACS II)
  • Evolutionary Approaches (Wood and Chen, Gecco-99, DNA-5)
  • Reversible Logic Gates (Rubin et al., DNA-5)
  • Whiplash PCR (Hagiya et al., DNA-3)
  • RNA Computing (Landweber et al., PNAS)
• How do cells and nature compute? (Kari and Landweber, DNA-4)
Source: http://seemanlab4.chem.nyu.edu/
Self-Assembly
Source: Winfree, DIMACS IV
Source: http://corninfo.chem.wisc.edu/writings/dnatalk/dna01.html
In Vitro Evolution
Source: http://www.princeton.edu/~lfl/washpost.html
DNA Computing in Cells
DNA Fredkin Gates
Whiplash PCR
Source: John Rose, Institute of Physics, University of Tokyo
What are the challenges?
• Error: Molecular operations are not perfect.
• Reversible and Irreversible Error
• Efficiency: How many molecules contribute?
• Encoding problem in molecules is difficult.
• Scaling to larger problems
• Applications
Mismatches
DNA Word Design
• Design of DNA Sequences that hybridize as planned (that is, minimize mismatches)
• Reliability: False Positives and Negatives
• Efficiency: Hybridizations that Contribute to Solution
• Hybridizations are Templates for Subsequent Enzymatic Steps
DNA Word Design
• Minimum Distance Codes to Prevent Hybridization Error
• Distance Measure
  • Combinatoric (Hamming)
  • Energetic (Base Stacking Energy)
• Design DNA Words with Evolutionary Algorithms
• Good Codes Achievable
[Figure: two panels of code word hybridization, with base stacking.]
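The combinatoric distance measure and an evolutionary search over words can be sketched together. This is a toy version under stated assumptions: it scores a code only by minimum pairwise Hamming distance (not base stacking energy), and the search is a simple mutate-and-keep loop rather than any specific published algorithm. `hamming`, `min_distance`, and `evolve` are hypothetical names.

```python
import random
from itertools import combinations

BASES = "ACGT"


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(words: list[str]) -> int:
    """Smallest pairwise Hamming distance in the code (needs >= 2 words)."""
    return min(hamming(a, b) for a, b in combinations(words, 2))


def evolve(words: list[str], generations: int = 200, seed: int = 0) -> list[str]:
    """Mutate one base at a time; keep a mutant if the code gets no worse."""
    rng = random.Random(seed)
    best = list(words)
    for _ in range(generations):
        candidate = list(best)
        i = rng.randrange(len(candidate))
        j = rng.randrange(len(candidate[i]))
        word = candidate[i]
        candidate[i] = word[:j] + rng.choice(BASES) + word[j + 1:]
        if min_distance(candidate) >= min_distance(best):
            best = candidate
    return best


if __name__ == "__main__":
    code = ["GCATGGCC", "AGCTTAGG", "ATGGCATG"]
    print(min_distance(code), min_distance(evolve(code)))
```

A fuller design would also compare each word against the reverse complements of the others, since those are the strands actually present during hybridization.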
What are the possible applications?
• DNAC and Conventional Computers
• DNAC and Evolutionary Computation
• DNAC and Biotechnology
• DNAC and Nanotechnology
DNAC and Electronic Computing
• Solution versus solid state
• Individual molecules versus ensembles of charge carriers
• The importance of shape in biological molecules
• Programmability/Evolvability Trade-off (Conrad)
Edna
• Electronic DNA
• Virtual Test Tube for Design and Simulation of DNA Computations
• Molecules as Cellular Automata
• Solve Adleman and Other Problems
• Distributed Edna to Solve Large Problems
• New Paradigm
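To give a feel for what a virtual test tube does, here is a toy simulation. It is not Edna's actual design, which the slides do not describe. Strands collide at random, and a collision forms a duplex only when the two strands are perfect Watson-Crick complements. All names (`simulate`, `collisions`, `seed`) are hypothetical.

```python
import random

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    return "".join(PAIR[base] for base in reversed(strand))


def simulate(tube, collisions, seed=0):
    """Return (free strands, duplexes) after a number of random collisions."""
    rng = random.Random(seed)
    free = list(tube)
    duplexes = []
    for _ in range(collisions):
        if len(free) < 2:
            break
        i, j = rng.sample(range(len(free)), 2)
        if free[i] == reverse_complement(free[j]):
            duplexes.append((free[i], free[j]))
            # Remove the higher index first so the lower index stays valid.
            for k in sorted((i, j), reverse=True):
                free.pop(k)
    return free, duplexes


if __name__ == "__main__":
    tube = ["GCATGGCC", "GGCCATGC", "AAAA"]
    print(simulate(tube, collisions=1000))
```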
In Vitro Evolutionary Computation
• Randomness and Uncertainty Inherent in Biomolecular Reactions
• Never the Level of Control that EE has over Solid State Devices
• Use Nature's Toolbox: Enzymes, Reaction/Diffusion, Adaptability, and Robustness
• Evolved, Not Designed
DNAC and Biotechnology
• "Computationally Inspired Biotechnology"
• DNA2DNA "killer app"
• Automation of protocols
• DNA Word Design (Gene Expression Chips, Universal DNA chips)
• Exquisite Detection of Biomaterials
• Bio-engineered Materials
Universal DNA Chips
• Universal chip has short oligonucleotide sequences (antitags) attached to solid support.
• W-C complement of antitag is tag.
• Tag/Antitag pair designed to hybridize strongly with each other, but not any other Tag/Antitag pair
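The tag/antitag relationship can be sketched as a lookup table keyed by antitag sequence. This is an illustration only: the tag names and sequences are made up for the example (the sequences reuse the vertex words from the encoding slide), binding is modeled as exact complementarity, and `build_chip` and `read_chip` are hypothetical names.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    return "".join(PAIR[base] for base in reversed(strand))


def build_chip(tags: dict[str, str]) -> dict[str, str]:
    """Map each antitag (the W-C complement of a tag) to its tag's name."""
    return {reverse_complement(seq): name for name, seq in tags.items()}


def read_chip(chip: dict[str, str], sample: set[str]) -> list[str]:
    """Names of the chip spots whose antitag captures a strand in the sample."""
    return sorted(
        name for antitag, name in chip.items()
        if reverse_complement(antitag) in sample
    )


if __name__ == "__main__":
    tags = {"t0": "GCATGGCC", "t1": "AGCTTAGG", "t2": "ATGGCATG"}
    chip = build_chip(tags)
    print(read_chip(chip, {"AGCTTAGG", "TTTTTTTT"}))
```

Because the chip depends only on the tag set and not on the targets, the same chip can be reused across assays.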
[Figure: Single Nucleotide Polymorphisms. Labels: target, reporter, tag, target specific, polymorphism sites, extension and enzyme, antitag, fluorescence.]
Universal Chip Advantages
• Reusable Components
• Repeatable Manufacture
• Avoid problems with chip validation and errors
• Target Specific in Solution Phase
What developments can we expect in the near-term (2002-...)?
• Large libraries of evolved sequences
• Iterative improvements in self-assembly, WPCR, etc.
• Increased emphasis on simulation and modeling
• Some impact on molecular biology by DNA computation
What are the long-term prospects?
• Cross-fertilization among evolutionary computing, DNA computing, molecular biology, and computational biology
• Niche uses of DNA computers for problems that are difficult for electronic computers
• In Vitro Translation and Transcription
Where can I learn more?
• Web Sites:
  • http://www.wi.leidenuniv.nl/~jdassen/dna.html
  • http://dope.caltech.edu/winfree/DNA.html
  • http://www.msci.memphis.edu/~garzonm/bmc.html
  • (Conrad) http://www.cs.wayne.edu/biolab/index.html
• DIMACS Proceedings: DNA Based Computers I (#27), II (#44), III (#48), IV (Special Issue of Biosystems), V (MIT, June 1999), VI (Leiden, June 2000)
• Other: Genetic Programming 1 (Stanford, 1997), Genetic Programming 2 (Wisconsin-Madison, 1998), GECCO-1999, IEEE International Conference on Evolutionary Computation (Indianapolis, 1997)
• G. Paun (ed.), Computing with Biomolecules: Theory and Experiment, Springer-Verlag, Singapore 1998.
• "DNA Computing: A Review," Fundamenta Informaticae, vol. 35, pp. 231-245.
• M. H. Garzon and R. J. Deaton, "Biomolecular Computing and Programming," IEEE Transactions on Evolutionary Computation, vol. 3, pp. 236-250, 1999.