Introduction to DNA Computing

Russell Deaton
Comp. Sci. & Engr.
The University of Arkansas
Fayetteville, AR

Junghuei Chen
Department of Chem & Biochem
University of Delaware
Newark, DE

What is DNA Computing (DNAC)?

The use of biological molecules, primarily DNA, DNA analogs, and RNA, for computational purposes.

Why Nucleic Acids?

• Density (Adleman, Baum):
  • DNA: 1 bit per nm^3, 10^20 molecules
  • Video: 1 bit per 10^12 nm^3
• Efficiency (Adleman):
  • DNA: 10^19 ops / J
  • Supercomputer: 10^9 ops / J
• Speed (Adleman):
  • DNA: 10^14 ops per s
  • Supercomputer: 10^12 ops per s
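To make the comparison concrete, the figures quoted above can be reduced to orders of magnitude. The following is a minimal sketch; the dictionary layout and the names `CLAIMS` and `advantage_in_orders` are illustrative assumptions, and only the exponents come from the slide.

```python
# Powers of ten quoted on the slide: (DNA, conventional technology).
# Density is expressed in bits per nm^3, so "1 bit per 10^12 nm^3" is 10^-12.
CLAIMS = {
    "density (bits per nm^3)": (0, -12),
    "efficiency (ops per J)": (19, 9),
    "speed (ops per s)": (14, 12),
}

def advantage_in_orders(dna_exp: int, conventional_exp: int) -> int:
    """Return how many orders of magnitude the DNA figure exceeds the other."""
    return dna_exp - conventional_exp

orders = {name: advantage_in_orders(d, c) for name, (d, c) in CLAIMS.items()}

for name, k in orders.items():
    print(f"{name}: DNA is ahead by a factor of 10^{k}")
```

On these numbers the claimed advantage is largest for storage density and energy efficiency (12 and 10 orders of magnitude) and smallest for raw speed (2 orders of magnitude).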

What makes DNAC possible?

• Great advances in molecular biology
  • PCR (Polymerase Chain Reaction)
  • DNA Microarrays
  • New enzymes and proteins
  • Better understanding of biological molecules
• Ability to produce massive numbers of DNA molecules with specified sequence and size
• DNA molecules interact through template-matching reactions

What are the basics from molecular biology that I need to know to understand DNA computing?

PHYSICAL STRUCTURE OF DNA

[Figure: DNA double helix. Labels: nitrogenous base, sugar-phosphate backbone, central axis, major groove, minor groove; dimensions 34 Å and 20 Å; the two strands run antiparallel, each with a 5' end and a 3' OH end.]

INTER-STRAND HYDROGEN BONDING

[Figure: Adenine-Thymine and Guanine-Cytosine base pairs, each base attached to the sugar-phosphate backbone. Partial charges (+) and (-) mark the hydrogen bonds: two for the A-T pair, three for the G-C pair.]
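The pairing rules in this figure (A with T, G with C) are the basis of every DNA computing operation, so it helps to see them as code. This is a minimal sketch; the function names are illustrative assumptions, and the hydrogen-bond count uses the standard values of two bonds per A-T pair and three per G-C pair.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)

def reverse_complement(strand: str) -> str:
    """Complementary strand read 5' to 3' (strands in a duplex are antiparallel)."""
    return complement(strand)[::-1]

def hydrogen_bonds(strand: str) -> int:
    """Hydrogen bonds in a perfectly paired duplex: 2 per A-T, 3 per G-C."""
    return sum(2 if base in "AT" else 3 for base in strand)

print(complement("CACCATGTGAC"))          # GTGGTACACTG
print(reverse_complement("CACCATGTGAC"))  # GTCACATGGTG
print(hydrogen_bonds("CACCATGTGAC"))      # 28
```

The example sequence is the one used on the affinity-separation slide, where the two strands are drawn aligned base by base.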

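STRAND HYBRIDIZATION

[Figure: strands A and B with their complements a and b. HEAT (100° C) separates the duplexes into single strands; COOL lets complementary strands re-anneal, in one of the alternative pairings shown.]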

DNA LIGATION

[Figure: strands α and δ held side by side on a complementary template strand. Ligase joins 5' phosphate to 3' hydroxyl, producing a single continuous strand αδ.]

RESTRICTION ENDONUCLEASES

• EcoRI
• HindIII
• AluI
• HaeIII

[Figure: cleavage of a double strand by each enzyme, leaving a 3' OH and a 5' P at each cut.]
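A restriction digest can be sketched as a string operation on one strand. The recognition sequences and cut positions below are the standard ones for the four enzymes named on the slide (EcoRI G^AATTC, HindIII A^AGCTT, AluI AG^CT, HaeIII GG^CC); the function `digest`, the table layout, and the test sequence are illustrative assumptions, and the sketch considers only the top strand of a linear molecule.

```python
# Recognition site and cut offset (bases from the start of the site) per enzyme.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # sticky ends
    "HindIII": ("AAGCTT", 1),  # sticky ends
    "AluI": ("AGCT", 2),       # blunt ends
    "HaeIII": ("GGCC", 2),     # blunt ends
}

def digest(seq: str, enzyme: str) -> list[str]:
    """Cut the top strand of a linear molecule at every recognition site."""
    site, offset = ENZYMES[enzyme]
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments

sample = "TTGAATTCAAGGCCTT"  # hypothetical sequence, for illustration only
print(digest(sample, "EcoRI"))   # ['TTG', 'AATTCAAGGCCTT']
print(digest(sample, "HaeIII"))  # ['TTGAATTCAAGG', 'CCTT']
```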

DNA Polymerase

DNA Sequencing

GEL ELECTROPHORESIS - SIZE SORTING

[Figure: gel in buffer between two electrodes, with samples loaded at one end. Shorter molecules migrate faster, longer molecules slower.]

ANTIBODY AFFINITY

[Figure: target strand CACCATGTGAC is captured by the complementary oligo GTGGTACACTG carrying a biotin label (B), which then binds a paramagnetic-streptavidin particle (PMP).]

• Add oligo with biotin label
• Heat and cool (anneal)
• Add paramagnetic-streptavidin particles (bind)
• Isolate with magnet

POLYMERASE CHAIN REACTION
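PCR matters to DNA computing because it amplifies the strands that survive a selection step. In the idealized case each cycle doubles the number of copies. The short sketch below assumes that model; the `efficiency` parameter (fraction of templates copied per cycle) is an illustrative assumption, not something stated on the slide.

```python
def pcr_copies(initial: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR yield: each cycle multiplies the copy count by (1 + efficiency)."""
    return initial * (1 + efficiency) ** cycles

print(pcr_copies(1, 10))  # 1024.0
print(pcr_copies(1, 30))  # 1073741824.0, about 10^9 copies from one template
```

Real reactions fall short of perfect doubling, which is why the efficiency is left as a parameter.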

What is the typical methodology?

• Encoding: Map problem instance onto set of biological molecules and molecular biology protocols
• Molecular Operations: Let molecules react to form potential solutions
• Extraction/Detection: Use protocols to extract result in molecular form

What is an example?

• "Molecular Computation of Solutions to Combinatorial Problems"
• Adleman, Science, v. 266, p. 1021.

Algorithm

• Generate random paths through the graph.
• Keep only those paths that begin with v_in and end with v_out.
• If the graph has n vertices, then keep only those paths that enter exactly n vertices.
• Keep only those paths that enter all the vertices at least once.
• If any paths remain, say "Yes"; otherwise, say "No".
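The five steps above can be mimicked in software to see how the filtering works. This is a minimal sketch under stated assumptions: random walks along edges stand in for random ligation, list filters stand in for the laboratory separation steps, and the names `random_path` and `hamiltonian_path_exists`, the sample count, and the 0.9 extension probability are all hypothetical choices. Like the molecular version, a "No" answer only means that no sampled path survived.

```python
import random

def random_path(succ, vertices, max_len, rng):
    """Step 1: grow one random path by following edges from a random start."""
    path = [rng.choice(vertices)]
    while len(path) < max_len and succ[path[-1]] and rng.random() < 0.9:
        path.append(rng.choice(succ[path[-1]]))
    return path

def hamiltonian_path_exists(n, edges, v_in, v_out, samples=5000, seed=0):
    """Return (answer, surviving paths) for a directed graph on vertices 0..n-1."""
    rng = random.Random(seed)
    vertices = list(range(n))
    succ = {v: [b for a, b in edges if a == v] for v in vertices}
    # Step 1: generate random paths through the graph.
    pool = [random_path(succ, vertices, 2 * n, rng) for _ in range(samples)]
    # Step 2: keep paths that begin with v_in and end with v_out.
    pool = [p for p in pool if p[0] == v_in and p[-1] == v_out]
    # Step 3: keep paths that enter exactly n vertices.
    pool = [p for p in pool if len(p) == n]
    # Step 4: keep paths that enter every vertex at least once.
    pool = [p for p in pool if set(p) == set(vertices)]
    # Step 5: "Yes" if any paths remain, otherwise "No".
    return bool(pool), pool

# Hypothetical 4-vertex example graph.
EDGES = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3), (2, 1)]
found, survivors = hamiltonian_path_exists(4, EDGES, v_in=0, v_out=3)
print("Yes" if found else "No")
```

In the test tube all candidate paths form in parallel in a single ligation reaction, whereas this sketch has to draw them one at a time.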

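Encoding

[Figure: encoding of a three-vertex example (vertices labeled 0, 1, 2).]

• Vertex sequences: GCATGGCC, AGCTTAGG, ATGGCATG
• Linking sequences: CCGGTCGA, CCGGTACC
• GCATGGCCAGCTTAGG held together by CCGGTCGA
• GCATGGCCATGGCATG held together by CCGGTACC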
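The relationship between the vertex sequences and the linking sequences on this slide can be checked directly: each linking sequence is the base-by-base complement of the second half of one vertex sequence followed by the first half of the next. The sketch below reproduces the slide's strings. The assignment of labels 0, 1, 2 to the three sequences follows their order on the slide and is an assumption, as are the names `splint` and `path_strand`.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Vertex sequences from the slide; the label-to-sequence mapping is assumed.
VERTEX = {0: "GCATGGCC", 1: "AGCTTAGG", 2: "ATGGCATG"}

def splint(src: int, dst: int) -> str:
    """Linking strand for edge src -> dst, written aligned under the path strand.

    It is the complement of (second half of src) + (first half of dst).
    """
    half = len(VERTEX[src]) // 2
    junction = VERTEX[src][half:] + VERTEX[dst][:half]
    return "".join(COMPLEMENT[base] for base in junction)

def path_strand(path: list[int]) -> str:
    """Concatenated vertex sequences for a path, as produced by ligation."""
    return "".join(VERTEX[v] for v in path)

print(path_strand([0, 1]), splint(0, 1))  # GCATGGCCAGCTTAGG CCGGTCGA
print(path_strand([0, 2]), splint(0, 2))  # GCATGGCCATGGCATG CCGGTACC
```

Because a linking strand only hybridizes where both halves match, two vertex strands are brought together for ligation only when the corresponding edge was encoded.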

What are the success stories?

• Self-Assembly Demonstrated (Winfree, Seeman, Reif et al.)
• New Approaches and Protocols Developed
  • Surface-based (Wisconsin-Madison, DIMACS II)
  • Evolutionary Approaches (Wood and Chen, GECCO-99, DNA-5)
  • Reversible Logic Gates (Rubin et al., DNA-5)
  • Whiplash PCR (Hagiya et al., DNA-3)
  • RNA Computing (Landweber et al., PNAS)
• How do cells and nature compute? (Kari and Landweber, DNA-4)

Source: http://seemanlab4.chem.nyu.edu/

Self-Assembly

Source: Winfree, DIMACS IV

Source: http://corninfo.chem.wisc.edu/writings/dnatalk/dna01.html

In Vitro Evolution

Source: http://www.princeton.edu/~lfl/washpost.html

DNA Computing in Cells

DNA Fredkin Gates

Whiplash PCR

Source: John Rose, Institute of Physics, University of Tokyo

What are the challenges?

• Error: Molecular operations are not perfect.
• Reversible and Irreversible Error
• Efficiency: How many molecules contribute?
• Encoding problem in molecules is difficult.
• Scaling to larger problems
• Applications

Mismatches

DNA Word Design

• Design of DNA Sequences that hybridize as planned (that is, minimize mismatches)
• Reliability: False Positives and Negatives
• Efficiency: Hybridizations that Contribute to Solution
• Hybridizations are Templates for Subsequent Enzymatic Steps

DNA Word Design

• Minimum Distance Codes to Prevent Hybridization Error
• Distance Measure
  • Combinatoric (Hamming)
  • Energetic (Base Stacking Energy)
• Design DNA Words with Evolutionary Algorithms
• Good Codes Achievable
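A minimum-distance code under the combinatoric (Hamming) measure can be sketched in a few lines. Note the substitution: the slide proposes evolutionary algorithms, while the sketch below uses a simpler greedy random search, and it ignores the energetic (base stacking) measure entirely. The function names, parameters, and the choice to also compare against reverse complements are illustrative assumptions.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def reverse_complement(word: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(word))

def design_words(count, length, min_dist, seed=0, max_tries=100_000):
    """Greedily collect random words that stay at least min_dist apart.

    A candidate is compared with every accepted word and with every accepted
    word's reverse complement, and must not be close to its own reverse
    complement, to discourage unintended hybridization.
    """
    rng = random.Random(seed)
    words = []
    for _ in range(max_tries):
        if len(words) == count:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        if hamming(cand, reverse_complement(cand)) < min_dist:
            continue
        if all(hamming(cand, w) >= min_dist
               and hamming(cand, reverse_complement(w)) >= min_dist
               for w in words):
            words.append(cand)
    return words

code = design_words(count=6, length=8, min_dist=3)
print(code)
```

An evolutionary algorithm would replace the accept-or-discard loop with mutation and selection on a population of candidate codes, scored by the same distance checks.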

[Figure: hybridization between code words, with base stacking indicated.]

What are the possible applications?

• DNAC and Conventional Computers
• DNAC and Evolutionary Computation
• DNAC and Biotechnology
• DNAC and Nanotechnology

DNAC and Electronic Computing

• Solution versus solid state
• Individual molecules versus ensembles of charge carriers
• The importance of shape in biological molecules
• Programmability/Evolvability Trade-off (Conrad)

Edna

• Electronic DNA
• Virtual Test Tube for Design and Simulation of DNA Computations
• Molecules as Cellular Automata
• Solve Adleman and Other Problems
• Distributed Edna to Solve Large Problems
• New Paradigm

In Vitro Evolutionary Computation

• Randomness and Uncertainty Inherent in Biomolecular Reactions
• Never the Level of Control that EE Has over Solid-State Devices
• Use Nature's Toolbox: Enzymes, Reaction/Diffusion, Adaptability, and Robustness
• Evolved, Not Designed

DNAC and Biotechnology

• "Computationally Inspired Biotechnology"
• DNA2DNA "killer app"
• Automation of protocols
• DNA Word Design (Gene Expression Chips, Universal DNA chips)
• Exquisite Detection of Biomaterials
• Bio-engineered Materials

Universal DNA Chips

• Universal chip has short oligonucleotide sequences (antitags) attached to a solid support.
• The Watson-Crick (W-C) complement of an antitag is its tag.
• Each tag/antitag pair is designed to hybridize strongly with each other, but not with any other tag/antitag pair.

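Single Nucleotide Polymorphisms

[Figure: a target-specific reporter carrying a tag binds the target at the polymorphism sites, is extended by an enzyme, and is then captured by its antitag on the chip and read out by fluorescence.]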

Universal Chip Advantages

• Reusable Components
• Repeatable Manufacture
• Avoid problems with chip validation and errors
• Target Specific in Solution Phase

What developments can we expect in the near term (2002-...)?

• Large libraries of evolved sequences
• Iterative improvements in self-assembly, WPCR, etc.
• Increased emphasis on simulation and modeling
• Some impact on molecular biology by DNA computation

What are the long-term prospects?

• Cross-fertilization among evolutionary computing, DNA computing, molecular biology, and computational biology
• Niche uses of DNA computers for problems that are difficult for electronic computers
• In Vitro Translation and Transcription

Where can I learn more?

• Web Sites:
  • http://www.wi.leidenuniv.nl/~jdassen/dna.html
  • http://dope.caltech.edu/winfree/DNA.html
  • http://www.msci.memphis.edu/~garzonm/bmc.html
  • (Conrad) http://www.cs.wayne.edu/biolab/index.html
• DIMACS Proceedings: DNA Based Computers I (#27), II (#44), III (#48), IV (Special Issue of Biosystems), V (MIT, June 1999), VI (Leiden, June 2000)
• Other: Genetic Programming 1 (Stanford, 1997), Genetic Programming 2 (Wisconsin-Madison, 1998), GECCO-1999, IEEE International Conference on Evolutionary Computation (Indianapolis, 1997)
• G. Paun (ed.), Computing with Biomolecules: Theory and Experiment, Springer-Verlag, Singapore, 1998.
• "DNA Computing: A Review," Fundamenta Informaticae, vol. 35, pp. 231-245.
• M. H. Garzon and R. J. Deaton, "Biomolecular Computing and Programming," IEEE Transactions on Evolutionary Computation, vol. 3, pp. 236-250, 1999.
