molekularbiologische datenbanken grundlagen der ... · molekularbiologische datenbanken grundlagen...
TRANSCRIPT
Ulf Leser
Wissensmanagement in der Bioinformatik (Knowledge Management in Bioinformatics)
Molekularbiologische Datenbanken (Molecular Biology Databases)
Foundations of Molecular Biology
Ulf Leser: Molekularbiologische Datenbanken (Molecular Biology Databases), lecture, summer semester (SoSe) 2003
50th anniversary
• 25 April 1953
  - James Watson, Francis Crick
  - "Molecular Structure of Nucleic Acids"
  - Nature, 1 page
  - "This structure has two helical chains each coiled around the same axis"
• Based on work by Wilkins & Franklin
Cells
Heritable Information
• Cell nucleus
• Chromosomes
• DNA
• The blueprint of life
Deoxyribonucleic Acid
• Chromosome: string of DNA
• Only 4 different nucleotides (bases): A, C, G, T
• Fixed pairs: A-T, G-C
• Same mechanism in all species
Structure of DNA
[Figure: DNA structure with the labels 5', 3', 3', 5', Nucleotide, Base. Source: http://www.nhgri.nih.gov/]
DNA Strands
• DNA single or double stranded
• DNA has fixed orientation
  - Read from 5' to 3'
  - Strands are antiparallel
5'…TACTGAA…3'
3'…ATGACTT…5'
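The pairing and orientation rules on this slide can be sketched in a few lines of Python. This is only an illustration: the names `COMPLEMENT` and `reverse_complement` are assumptions of this sketch and do not come from the lecture.

```python
# Complement table for the four DNA bases (fixed pairs A-T and G-C).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, written in its own 5' to 3' direction."""
    return strand.upper().translate(COMPLEMENT)[::-1]


top = "TACTGAA"                        # 5'...TACTGAA...3' from the slide
bottom_3_to_5 = top.translate(COMPLEMENT)
print(bottom_3_to_5)                   # ATGACTT (read 3' to 5', as drawn on the slide)
print(reverse_complement(top))         # TTCAGTA (the same strand read 5' to 3')
```

Because the strands are antiparallel, the lower strand has to be reversed before it can be read in the usual 5' to 3' direction.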
DNA Replication
• Mitosis: cell replication
• Meiosis: species replication
Species
• Prokaryotes vs. eukaryotes: cell nucleus, differential splicing, gene regulation, complexity
• Model organism
Genome
• Set of all genes of a species is its genome
• Humans
  - Approx. 3,300,000,000 bp
  - 22 autosomes + 2 sex chromosomes (X, Y)
  - Chromosome length 50–250 MB (megabases)
Human Genome Project
• DNA: ~3,300,000,000 base pairs
• Human Genome Project: sequencing the complete human genome (scheduled for 2005)
• World-wide effort (Germany: since 1996)
• "Finished" in 2000
  - Rapid improvement in technology
  - Commercial companies jumped on it (Celera)
• "Almost" finished in 2001
• "Really" finished in 2003
Sequencing a Genome
gatcaattatagttgacttcagtcctgcctgattcatctcca
aaaatgtagtctgcctgattcatctcccaaaaatgtagctc
cgcttaaaggagctttcaagttgggggtggtgggccattc
agtgttgtcactaacagatgcatcttgtgggggtaaaatgt
cccaaagtatcttttcttgcttatgttcataagggcgctggtc
tggaatgtgccacatctgttctcactctgccatggactcctg
gaccctctgtgtgtccctttgtatcctggtagcgagtgagtc
ctcatgatttatcatcctcatgctgggcctctgtatagatga
• Break it up into very small pieces
• Read the sequence of the pieces one by one
• Assemble the pieces to contigs
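The assembly step can be illustrated with a toy greedy overlap merge. This is a minimal sketch under strong assumptions (error-free reads, one strand only, no repeats) and is not how the Human Genome Project assembled its data. The names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments with the largest overlap until none overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping pieces cut from the start of the sequence shown on the slide.
reads = ["gatcaatta", "attatagttg", "agttgacttc"]
print(greedy_assemble(reads))  # ['gatcaattatagttgacttc']
```

Real genomes contain many repeats, so a purely greedy merge like this can join the wrong pieces; the sketch only shows the principle of building contigs from overlaps.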
What does DNA tell us?
actttcctcg gcagcggtag gcgagagcac gcggaggagc gtgcgcgggg gccccgggag acggcggcgg tggcggcgcg
ggcagagcaa ggacgcggcg gatcccactc gcacagcagc gcactcggtg ccccgcgcag ggtcgcgatg ctgcccggtt
tggcactgct cctgctggcc gcctggacgg ctcgggcgct ggaggtaccc actgatggta
atgctggcct gctggctgaa ccccagattg ccatgttctg tggcagactg aacatgcaca
tgaatgtcca gaatgggaag tgggattcag atccatcagg gaccaaaacc tgcattgata
ccaaggaagg catcctgcag tattgccaag aagtctaccc tgaactgcag atcaccaatg
tggtagaagc caaccaacca gtgaccatcc agaactggtg caagcggggc cgcaagcagt
gcaagaccca tccccacttt gtgattccct accgctgctt agttggtgag tttgtaagtg
atgcccttct cgttcctgac aagtgcaaat tcttacacca ggagaggatg gatgtttgcg
aaactcatct tcactggcac accgtcgcca aagagacatg cagtgagaag agtaccaact
tgcatgacta cggcatgttg ctgccctgcg gaattgacaa gttccgaggg gtagagtttg
tgtgttgccc actggctgaa gaaagtgaca atgtggattc tgctgatgcg gaggaggatg
actcggatgt ctggtggggc ggagcagaca cagactatgc agatgggagt gaagacaaag
tagtagaagt agcagaggag gaagaagtgg ctgaggtgga agaagaagaa gccgatgatg
acgaggacga tgaggatggt gatgaggtag aggaagaggc tgaggaaccc tacgaagaag
ccacagagag aaccaccagc gccaacgaga gacagcagct ggtggagaca cacatggcca
gagtggaagc catgctcaat gaccgccgcc gcctggccct ggagaactac atcaccgctc
tgcaggctgt tcctcctcgg
• DNA is just a string: [ACTG]*
• In humans, most DNA is considered junk
• All DNA is replicated, but ...
• ... only DNA that is transformed into "something" matters
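The view of DNA as a string over a four-letter alphabet maps directly onto a regular expression. The following sketch assumes upper- and lower-case letters are equivalent; the helper names `is_dna` and `base_counts` are made up for illustration.

```python
import re
from collections import Counter

# The slide's language: any string over the alphabet A, C, T, G.
DNA_PATTERN = re.compile(r"[ACTG]*")


def is_dna(sequence: str) -> bool:
    """True if the whole sequence consists only of A, C, T and G."""
    return DNA_PATTERN.fullmatch(sequence.upper()) is not None


def base_counts(sequence: str) -> Counter:
    """Count how often each base occurs."""
    return Counter(sequence.upper())


fragment = "actttcctcggcagcggtag"  # first 20 bases of the sequence on the slide
print(is_dna(fragment))            # True
print(is_dna("actnnct"))           # False: 'n' is not one of the four bases
print(base_counts(fragment))
```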
Central Dogma
The Central Dogma of Molecular Genetics
Gene / protein functions
• Structure
  - Cell wall, membranes, organelles, ...
• Signal transduction
  - Signal recognition (outside cell), transduction, and reaction
• Enzymatic catalysis
  - Support for chemical reactions: respiration, metabolism, protein construction/deconstruction, ...
• Transport
  - mRNA from nucleus to ribosomes, proteins from ribosomes to cell wall, ...
Interaction of genes / proteins
[Figure source: KEGG, 1998]
Pathways
• Metabolic pathways
• Regulatory pathways
• Signal transduction pathways
DNA and Genes
• Genes are coding regions: DNA -> RNA (-> Protein)
• Subject to evolutionary principles: mutation, selection
• Mutation might change phenotype
  - Genetic disease
• Complex regulation mechanisms
No. of genes doesn't mean ...
Species       No. of genes   Genome size
Human         30,000         3300 MB
Fly           13,000         180 MB
Wheat                        16000 MB
C. elegans    18,000         100 MB
Mouse         30,000         3000 MB
• Genomic rearrangements & duplications
• Inactivation
• Gene transfer
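The table's point becomes clearer when the figures are turned into a gene density. The sketch below uses only the numbers from the slide. Wheat is left out because the slide gives no gene count for it, and the helper name `genes_per_mb` is an assumption of this example.

```python
# (species, number of genes, genome size in MB) as listed on the slide.
GENOMES = [
    ("Human", 30_000, 3300),
    ("Fly", 13_000, 180),
    ("C. elegans", 18_000, 100),
    ("Mouse", 30_000, 3000),
]


def genes_per_mb(genes: int, size_mb: float) -> float:
    """Average number of genes per megabase of genome."""
    return genes / size_mb


# Sort by genome size to show that gene count does not grow with it.
for species, genes, size_mb in sorted(GENOMES, key=lambda row: row[2]):
    density = genes_per_mb(genes, size_mb)
    print(f"{species:<12}{size_mb:>6} MB {genes:>7} genes {density:>7.1f} genes/MB")
```

In these figures the human genome is 33 times larger than that of C. elegans but carries fewer than twice as many genes.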
Genotype - Phenotype
• Genes are different between individuals
  - Alleles
• Genotype: set of genes of an individual
• Phenotype: appearance of an individual
  - Visual features, behavioral features, susceptibility to diseases, ...
• Genotype determines phenotype
  - To what extent?
From Genes to Proteins
• Genes are transcribed into mRNA
• Post-processing (splicing)
• mRNA is translated into proteins
• Post-processing
• Code not universal
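Transcription and translation can be sketched as two string operations. The example assumes the standard genetic code (which, as the slide notes, is not universal), takes the coding strand as input, and ignores splicing and all post-processing. The function names are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the sequence."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# A short stretch of the DNA sequence shown earlier in the lecture, starting at an ATG.
gene = "ATGCTGCCCGGTTTGGCACTG"
print(translate(transcribe(gene)))  # MLPGLAL
```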
Alternative Splicing
"One gene – one protein" is mostly wrong
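Why one gene can yield several proteins can be shown with a deliberately simplified exon-skipping model. The sketch assumes the first and last exon are always kept and that any subset of the internal exons may be skipped. Real splicing is far more constrained, and the toy exons are invented for the example.

```python
from itertools import combinations


def isoforms(exons: list[str]) -> list[str]:
    """All transcripts that keep the first and last exon plus any subset of
    the internal exons, in their original order (simple exon skipping)."""
    first, *internal, last = exons
    transcripts = []
    for kept_count in range(len(internal), -1, -1):
        for kept in combinations(internal, kept_count):
            transcripts.append(first + "".join(kept) + last)
    return transcripts


toy_gene = ["AUGGCU", "GAA", "CCU", "UGA"]  # hypothetical exons
for transcript in isoforms(toy_gene):
    print(transcript)
# AUGGCUGAACCUUGA
# AUGGCUGAAUGA
# AUGGCUCCUUGA
# AUGGCUUGA
```

Under this model a gene with n exons has up to 2^(n-2) transcripts, so the count grows quickly with the number of exons.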
Human Genome
• Approx. 30,000 genes
• Length from 100 bp to 2 MB (introns + exons)
• Average length 1400 bp (exons only)
  - Average protein length 447 amino acids
• Average gene has 9 exons
• Only 3% of human genome is coding
  - Rest "junk"?
  - Many repeats, transposons, low-complexity sequence
  - Regulatory elements
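The figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only the approximate numbers quoted in the lecture (the genome size comes from the earlier Genome slide); the variable names are chosen for this example.

```python
GENOME_BP = 3_300_000_000   # approx. size of the human genome
GENES = 30_000              # approx. number of genes
AVG_EXON_BP = 1_400         # average gene length counting exons only
AVG_PROTEIN_AA = 447        # average protein length in amino acids

exonic_bp = GENES * AVG_EXON_BP            # total exon sequence
exonic_fraction = exonic_bp / GENOME_BP    # share of the genome covered by exons
translated_bp = AVG_PROTEIN_AA * 3         # three bases per amino acid

print(f"exon sequence:    {exonic_bp:,} bp")
print(f"share of genome:  {exonic_fraction:.1%}")
print(f"translated bases per average gene: {translated_bp} of {AVG_EXON_BP}")
```

With these rounded inputs the estimate comes out at about 1.3%, the same order of magnitude as the 3% quoted on the slide. Either way only a few percent of the genome is coding.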
Gene structure
Proteins
• Proteins make (almost) everything
  - Metabolism
  - Signal generation, transportation, recognition, and reaction
  - Cellular structures
  - Regulation of gene expression
• Protein sequence
  - Amino acids
  - 20-letter alphabet
  - 300-500 AA long
Protein Folding
• Proteins fold into 3D structure
• Complex and dynamic process
• Main problem in bioinformatics
• "Function follows structure"
Protein structure
• Primary: amino acid sequence
• Secondary: helices & sheets
• Tertiary: 3D structure
• Quaternary: protein complexes
From Genotype to Phenotype
[Diagram with the labels: Genotype, Molecular Structure, Biochemical Interaction, Phenotype]
The "-omics" in Life Science
• Genome
  - All DNA sequence in a cell
  - Constant
• Transcriptome
  - All mRNA in a cell at a given time
  - Varies greatly with cell type, environmental conditions, developmental stage, sex, ... of cells
• Proteome
  - All proteins in a cell at a given time
  - Even more complex than transcriptome: life span, interaction, post-translational modifications, ...
• Metabolome
  - All non-protein things in a cell at a given time