joost n. kok artificial intelligence: from computer science to molecular informatics
TRANSCRIPT
Joost N. Kok
Artificial Intelligence: from Computer Science to Molecular Informatics
Artificial Intelligence
Movie Artificial Intelligence by Steven Spielberg
Five year studies at universities of Utrecht, Amsterdam, Groningen and Maastricht
Artificial Intelligence
The concept that machines can be improved to assume some capabilities normally thought to be like human intelligence such as learning, adapting, self-correction, etc.
The extension of human intelligence through the use of computers, as in times past physical power was extended through the use of mechanical tools.
Artificial Intelligence
On May 11, 1997, an IBM computer named Deep Blue whipped world chess champion Garry Kasparov in the deciding game of a six-game match
Artificial Intelligence
First Robot World Cup Soccer Games held in Nagoya, Japan in 1997
Goal: team of robots beats the FIFA World Cup champion in 2050
Artificial Intelligence
Natural language processing: it needs to be able to communicate in a natural language like English
Knowledge representation: it needs to be able to have knowledge and to store it somewhere
Automated reasoning: it needs to be able to do reasoning based on the stored knowledge
Machine learning: it needs to be able to learn from its environment
Time Complexity
Turing machine gives notion of computability
Time complexity: how many steps does it take to find an answer?
Combinatorial Explosion
Problems that are computable in polynomial time (class P)
Problems that are verifiable in polynomial time (class NP)
P equals NP?
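The gap between "verifiable" and "computable" can be made concrete with a small sketch. Subset sum is used here purely as an illustrative NP problem (it does not appear on the slides), and the function names are hypothetical. Checking a proposed answer takes a number of steps polynomial in the input size, while the naive search below tries up to 2^n subsets, which is the combinatorial explosion.

```python
from itertools import combinations

def verify(numbers, target, certificate):
    """Polynomial-time check of a proposed solution (a list of distinct indices)."""
    indices_ok = (len(set(certificate)) == len(certificate)
                  and all(0 <= i < len(numbers) for i in certificate))
    return indices_ok and sum(numbers[i] for i in certificate) == target

def search(numbers, target):
    """Exhaustive search: tries every subset, up to 2**len(numbers) of them."""
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return list(indices)
    return None  # no subset sums to the target

numbers = [3, 34, 4, 12, 5, 2]
certificate = search(numbers, 9)
print(certificate, verify(numbers, 9, certificate))
```

No polynomial-time search is known for this problem; whether one exists is exactly the P versus NP question.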
Natural Computing
Computers are to Computer Science as Comic Books to Literature (Joosen)
Natural Computing
Natural Computing
– Evolutionary Computing
– Molecular Computing
– Gene Assembly in Ciliates
Evolutionary Computing
[Figure: evolutionary cycle]
Initialize population, evaluate
Loop: select mating partners, recombine, mutate, evaluate, select survivors
(terminate)
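The evolutionary cycle (initialize and evaluate, then repeatedly select mating partners, recombine, mutate, evaluate and select survivors) can be sketched as a generic loop. This is a minimal sketch, not the speaker's implementation: the function names, the fixed generation budget used as termination criterion, and the toy "count the ones" fitness in `demo` are all assumptions made for illustration.

```python
import random

def evolve(init, fitness, select_parents, recombine, mutate, select_survivors, generations):
    """Generic evolutionary cycle; every operator is passed in as a function."""
    population = init()
    scores = [fitness(ind) for ind in population]            # initialize population, evaluate
    for _ in range(generations):                              # (terminate) after a fixed budget
        parents = select_parents(population, scores)          # select mating partners
        offspring = [mutate(c) for c in recombine(parents)]   # recombine, mutate
        offspring_scores = [fitness(c) for c in offspring]    # evaluate
        population, scores = select_survivors(                # select survivors
            population + offspring, scores + offspring_scores, len(population))
    best = max(range(len(population)), key=lambda i: scores[i])
    return population[best], scores[best]

def demo(seed=0):
    """Toy run: maximize the number of 1-bits in a 16-bit string (hypothetical problem)."""
    rng = random.Random(seed)
    n_bits, pop_size = 16, 20

    def init():
        return [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def select_parents(population, scores):
        return [rng.choice(population) for _ in range(pop_size)]  # uniform choice, for brevity

    def recombine(parents):
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = rng.randrange(1, n_bits)
            children += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        return children

    def mutate(ind):
        return [1 - g if rng.random() < 1 / n_bits else g for g in ind]

    def select_survivors(pool, pool_scores, k):
        ranked = sorted(zip(pool_scores, pool), key=lambda t: t[0], reverse=True)[:k]
        return [ind for _, ind in ranked], [s for s, _ in ranked]

    return evolve(init, sum, select_parents, recombine, mutate, select_survivors, generations=40)

best, score = demo()
print(best, score)
```

Passing the operators in as functions keeps the loop itself identical for any representation; only the operators change from problem to problem.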
Example: Discrete Representation
Genotype: 8 bits
Phenotype:
– integer: $1\cdot 2^7 + 0\cdot 2^6 + 1\cdot 2^5 + 0\cdot 2^4 + 0\cdot 2^3 + 0\cdot 2^2 + 1\cdot 2^1 + 1\cdot 2^0 = 163$
– a real number between 2.5 and 20.5: $2.5 + \frac{163}{256}\,(20.5 - 2.5) = 13.9609$
– schedule
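The two decodings of the same genotype can be checked with a short sketch. The bit string 1 0 1 0 0 0 1 1 is read off from the binary expansion on the slide; the function names are hypothetical.

```python
def bits_to_int(bits):
    """Interpret a list of bits (most significant first) as an unsigned integer."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value

def bits_to_real(bits, low, high):
    """Map the integer value linearly onto the interval starting at low."""
    return low + bits_to_int(bits) / 2 ** len(bits) * (high - low)

genotype = [1, 0, 1, 0, 0, 0, 1, 1]
print(bits_to_int(genotype))               # 163
print(bits_to_real(genotype, 2.5, 20.5))   # 13.9609375
```

Dividing by 256 rather than 255 follows the slide's formula, so the upper bound 20.5 itself is never reached.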
Example: Mutation
before: 1 1 1 1 1 1 1
after: 1 1 1 0 1 1 1 (the fourth bit is the mutated bit)
Mutation happens with probability $p_m$ for each bit
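A minimal sketch of bit-flip mutation, assuming each bit is flipped independently with probability p_m as stated on the slide. The function name and the optional `rng` parameter are illustrative choices.

```python
import random

def mutate(bits, p_m, rng=random):
    """Flip each bit independently with probability p_m."""
    return [1 - bit if rng.random() < p_m else bit for bit in bits]

before = [1, 1, 1, 1, 1, 1, 1]
after = mutate(before, p_m=1 / len(before))  # on average one bit flips
print(before, after)
```

A rate of 1/length is a common rule of thumb, not a value given in the talk.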
Example: Recombination
Each chromosome is cut into 2 pieces which are recombined
parents: 1 1 1 1 1 1 1 and 0 0 0 0 0 0 0 (both cut after the third bit)
offspring: 1 1 1 0 0 0 0 and 0 0 0 1 1 1 1
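The recombination example corresponds to one-point crossover, which can be sketched as follows. The function name is hypothetical, and the cut position is passed explicitly so that the slide's example can be reproduced.

```python
def one_point_crossover(parent_a, parent_b, cut):
    """Cut both parents at the same position and swap the tails."""
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2

children = one_point_crossover([1] * 7, [0] * 7, cut=3)
print(children)  # ([1, 1, 1, 0, 0, 0, 0], [0, 0, 0, 1, 1, 1, 1])
```

In a running algorithm the cut position would normally be drawn at random for each pair of parents.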
Example: Fitness proportionate selection
Expected number of times individual $i$ is selected equals $f_i / \bar{f}$ (its fitness divided by the average fitness)
Better (fitter) individuals have:
– more space
– more chance to be selected
[Figure: segments labelled Best and Worst]
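Fitness proportionate selection can be sketched with the standard library's weighted sampling. This assumes non-negative fitness values; the function names are hypothetical.

```python
import random

def expected_copies(fitnesses):
    """Expected number of selections per individual: f_i divided by the average fitness."""
    average = sum(fitnesses) / len(fitnesses)
    return [f / average for f in fitnesses]

def select(population, fitnesses, k, rng=random):
    """Draw k individuals with replacement, with probability proportional to fitness."""
    return rng.choices(population, weights=fitnesses, k=k)

population = ["worst", "poor", "good", "best"]
fitnesses = [1, 2, 3, 4]
print(expected_copies(fitnesses))  # approximately [0.4, 0.8, 1.2, 1.6]
print(select(population, fitnesses, k=len(population)))
```

The expected counts sum to the population size, so drawing as many individuals as the population holds gives, on average, exactly those numbers of copies.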
Molecular Computing
Implementation of algorithms in biological hardware, e.g. using DNA molecules and enzymes
Power lies in massive parallel search
Test tube may easily contain $10^{15}$ strands of DNA
Compared to computers very efficient in energy consumption, storage density and number of operations per second
Molecular Computing
DNA: sequence of nucleotides linked together by strong backbone
Nucleotides have attached bases A, T, C, G:
– Adenine
– Thymine
– Cytosine
– Guanine
Watson-Crick complementarity: A-T, C-G
Molecular Computing
Algorithm
– generate random paths through graph
– keep only paths from the initial to the final node
– keep only paths that enter exactly n nodes
– keep only paths that enter all nodes
– if any paths remain, the graph contains a Hamiltonian path
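The generate-and-filter algorithm can be simulated in software at the level of graph paths. This is a sketch under stated assumptions: the example graph, the sample count, the 10% stopping probability and all function names are hypothetical, and the random paths here are generated one at a time rather than in parallel as in a test tube.

```python
import random

def random_path(nodes, edges, max_len, rng):
    """Random walk along directed edges from a random start node."""
    path = [rng.choice(nodes)]
    while len(path) < max_len:
        successors = [b for a, b in edges if a == path[-1]]
        if not successors or rng.random() < 0.1:  # assumed stopping probability
            break
        path.append(rng.choice(successors))
    return path

def has_hamiltonian_path(nodes, edges, start, end, samples=2000, seed=0):
    rng = random.Random(seed)
    n = len(nodes)
    paths = [random_path(nodes, edges, 2 * n, rng) for _ in range(samples)]  # generate
    paths = [p for p in paths if p[0] == start and p[-1] == end]  # initial to final node
    paths = [p for p in paths if len(p) == n]                     # exactly n nodes
    paths = [p for p in paths if set(p) == set(nodes)]            # all nodes
    return bool(paths)                                            # any paths remain?

nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
print(has_hamiltonian_path(nodes, edges, start=0, end=3))
```

Because the paths are sampled, a positive answer is certain but a negative answer only means that no Hamiltonian path was found among the samples.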
Molecular Computing
For each node, take a unique random sequence over A, C, T, G
The sequences of all nodes have the same length
Molecular Computing
For every connection, construct a sequence from the sequences of the two nodes
– Node 1: TATCGGATCGGTATATCCGA
– Node 2: GCTATTCGAGCTTAAAGCTA
– Inverse: GTATATCCGAGCTATTCGAG (second half of Node 1 followed by first half of Node 2)
– Sequence: CATATAGGCTCGATAAGCTC (the Watson-Crick complement of the line above)
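The construction of the connection sequence can be reproduced from the two node sequences: take the second half of the first node, append the first half of the second node, and replace every base by its Watson-Crick complement. The function names below are hypothetical; the sequences are the ones on the slide.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Base-by-base Watson-Crick complement (the order of bases is kept)."""
    return "".join(COMPLEMENT[base] for base in strand)

def connection_strand(node_a, node_b):
    """Strand that can bind the end of node_a and the start of node_b side by side."""
    half = len(node_a) // 2
    return complement(node_a[half:] + node_b[:half])

node_1 = "TATCGGATCGGTATATCCGA"
node_2 = "GCTATTCGAGCTTAAAGCTA"
print(connection_strand(node_1, node_2))  # CATATAGGCTCGATAAGCTC
```

Because the connection strand overlaps half of each node strand, it acts as a splint that holds the two node strands next to each other.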
Molecular Computing
Generate random paths through graph
– Mix strings for all nodes with strings for all arrows, together with Ligase enzyme
Molecular Computing
Apply PCR (Polymerase Chain Reaction) amplification, using as primers the string for the initial node (in) and the complement of the string for the final node (out)
Molecular Computing
Select molecules that encode paths that enter exactly n nodes by running the contents of the test tube through agarose gel and saving the DNA strands of the right length
Molecular Computing
Create single strands by melting
For each node, select those sequences that anneal to the string of that node
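The three laboratory filters (PCR on the end points, gel separation by length, annealing per node) can be mimicked on strings. This is a simplified sketch: a path is encoded as the plain concatenation of its node sequences, complementary strands and laboratory errors are ignored, and the 20-base sequence length and all function names are assumptions.

```python
import random

SEQ_LEN = 20  # assumed length of every node sequence

def random_node_sequences(nodes, rng):
    """Assign each node a unique random sequence over A, C, G, T."""
    sequences = {}
    for node in nodes:
        candidate = "".join(rng.choice("ACGT") for _ in range(SEQ_LEN))
        while candidate in sequences.values():
            candidate = "".join(rng.choice("ACGT") for _ in range(SEQ_LEN))
        sequences[node] = candidate
    return sequences

def encode(path, sequences):
    return "".join(sequences[node] for node in path)

def pcr_filter(strands, sequences, start, end):
    """Keep strands that begin with the start node and finish with the end node."""
    return [s for s in strands if s.startswith(sequences[start]) and s.endswith(sequences[end])]

def gel_filter(strands, n_nodes):
    """Keep strands whose length corresponds to exactly n_nodes nodes."""
    return [s for s in strands if len(s) == n_nodes * SEQ_LEN]

def anneal_filter(strands, sequences):
    """For each node in turn, keep strands that contain that node's sequence."""
    for seq in sequences.values():
        strands = [s for s in strands
                   if seq in [s[i:i + SEQ_LEN] for i in range(0, len(s), SEQ_LEN)]]
    return strands

rng = random.Random(0)
seqs = random_node_sequences([0, 1, 2, 3], rng)
tube = [encode(p, seqs) for p in ([0, 1, 2, 3], [0, 2, 3], [1, 2, 3, 0], [0, 1, 1, 3])]
tube = anneal_filter(gel_filter(pcr_filter(tube, seqs, 0, 3), 4), seqs)
print(tube == [encode([0, 1, 2, 3], seqs)])  # True
```

Each filter removes a different kind of wrong path in the example: the wrong end points, the wrong length, and a repeated node.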
Molecular Computing
Result: implementation of algorithm in DNA
– First experiment took seven days
– Now possible in seven seconds
Molecular Computing
Operations: denaturing, annealing, separation, selection, multiplying
Simulation of Turing Machine is possible
Problems:
– PCR and separation procedures are error prone
– DNA may form non-existing pseudo-paths
– DNA may form hairpin loops
– Scalability
Molecular Computing
Combine Evolutionary Computing with Molecular Computing (EDNA project)
– Use potential errors as feature
– Huge population sizes
– Automation of DNA processing necessary
Many more techniques from molecular biology can be used
– Plasmids
– Restriction Enzymes
– Fluorescence
Ciliates
Very ancient (~ $2 \cdot 10^9$ years ago)
Very rich group (~ 10000 genetically different organisms)
Very important from the evolutionary point of view
Ciliates
DNA molecules in micronucleus are very long (hundreds of kbp)
DNA molecules in macronucleus are gene-size, short (average ~ 2000 bp)
Gene Assembly in Ciliates
Micronucleus: cell mating
Macronucleus: RNA transcripts (expression)
Micro: $I_0 M_1 I_1 M_2 I_2 M_3 \ldots I_k M_k I_{k+1}$
$M = P_1 N P_2$
Macro: permutation of (possibly rotated) $M_1, \ldots, M_k$, and $I_0, \ldots, I_{k+1}$ are removed