Structural Bioinformatics I619 (B680), Spring 2008
Indiana University



Basic Information

Class meets:
Time: MW 9:30am – 10:45am
Place: PY 109

Instructor: Predrag Radivojac
Office: Eigenmann 1005 (will move to I219 during this semester)
Email: [email protected]
Web: www.informatics.indiana.edu/predrag

Office Hours:
Time: By appointment
Place: Eigenmann 1005 or some other place

Course Web Site:
http://www.informatics.indiana.edu/predrag/classes/2008springi619/2008springi619.htm

Goal of the Course

• to introduce informatics approaches that are based on sequence and structure of biological macromolecules

• the goal of these approaches:
– to support structural biology and structural genomics
– to elucidate new biological knowledge

• along with the core concepts, the course will provide structural basics of biological phenomena and information on types and sources of biological data

Overview of the Course

Textbook information:

Required:
• Structural Bioinformatics - by Philip Bourne and Helge Weissig (editors)

Recommended:
• Introduction to Protein Structure - by Carl-Ivar Branden and John Tooze
• Proteins: Structure and Function – by David Whitford
• Understanding DNA – by Chris Calladine et al.

Supplementary material will be provided in class!

Overview of the Course

See online syllabus…

• introduction to structural bioinformatics
• fundamental principles of protein, DNA, and RNA structure
• structure-based biological databases, overview and the importance
• experimental methods for structure determination (guest lectures)
• molecular visualization (guest lecture(s))
• structure assignment, comparison and alignment
• prediction of structure of biopolymers
• prediction of protein function from structure and other types of data
• principles of molecular recognition and docking
• intrinsically disordered proteins
• and more...

Grading Policy and Announcements

• Midterm exam: 20% (in class)
• Final project: 30% (individual)
• Homework assignments: 40% (readings)
• Class participation: 10% (+ 1 quiz)

• Midterm exam – week 9 (tentative; before Spring Break)
• Final presentation – April 28 (10:15am)
• Spring break – March 10-14
• MLK Jr. day – January 21 (no classes!)

More details ...

• Midterm exam
– in class; material from lectures and readings

• Final project
– individual; group projects only with instructor’s permission and good justification (e.g. a chemistry and an informatics student working together)
– not necessarily programming tasks (can write a summary report on some topic involving structural bioinformatics)
– project proposal due in week 8; presentation on April 28
– some topics will be provided to choose from (can combine with some other class or one’s research)

• Homework assignments
– about 10 papers; one page summary report for each (standard format)
– small presentation + in class discussion of readings

• Class participation
– discussion
– one quiz, maybe more

Guest Lectures

• Yuzhen Ye, Assist. Prof., School of Informatics
• Vladimir Uversky, Senior Research Prof., School of Medicine
• Sean Mooney, Assist. Prof., School of Medicine
• someone from IU Department of Chemistry

(guest lecture material counts toward midterm and will be covered by homework assignments)

Late Assignment Policy and Academic Honesty

• The homework assignments are due in class, on the specified due date (hardcopy only!)

• No late assignments will be accepted unless there are legitimate circumstances

• All assignments are individual!!!

• All the sources used for problem solution must be acknowledged (people, web sites, books, etc.)

OK, Let’s Start!

Central “Dogma” of Molecular Biology

http://cats.med.uvm.edu/cats_teachingmod/microbiology/courses/genomics/introduction/1_2_bgd_gen_pro_dog.html

Bioinformatics

• The science of information and information flow in biological systems, esp. of the use of computational methods in genetics and genomics. (Oxford English Dictionary)

• First documented use of the word bioinformatics (journal Simulation):
– in 1978 Paulien Hogeweg wrote about her field of study: “Since 1970 she has been a staff member at the Subfaculty of Biology of the University of Utrecht, with her main field of research in bioinformatics”

• A simplified view:
– bioinformatics involves the use of technology to support biology and to ask biological or medical questions. We may say “biology”, but we primarily focus on the molecular level (e.g. “molecular biology”, “cell biology”).

• The ultimate goal of the field is to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned (NCBI web site)

Promises of Structural Bioinformatics...

• creating an infrastructure for building up structural models from component parts

• understanding structural basis of biological phenomena (or differently put: understanding structure-to-function principles)

• gaining the ability to understand the design principles of proteins so that new functionalities can be created

• learning how to design drugs efficiently based on structural knowledge of their target

• catalyzing the development of simulation models that can give insight into function based on structural simulations

... by using computers!

Structural Bioinformatics Today

• methods to support structural biology and structural genomics
– target selection
– tracking experimental crystallization trials
– analysis of experimental data
– storing molecular structures in databases

• understanding structural basis of biological phenomena
– visualization
– characterization and classification
– prediction
– simulation

• integrating structural data with other data types

Structural Genomics

[Figure: MCSG Structure Determination Pipeline. Recoverable labels: GENOMIC SEQUENCES, Target, Refinement, Domain Parsing, Domain Definition, New Tags and Expression Systems, Cleavage On Column, Reductive Methylation, Automated Crystal Mounting and ..., Structure Refinement, Structural and Functional Analysis, FUNCTIONAL STUDIES, Web Site, Workshops, Publications. From A. Joachimiak.]

Structures Solved by SG

[Figure: gallery of solved structures, each labeled with a target ID (duplicates omitted): TM0064, TM0542, TM0015, TM0089, TM0262, TM0723, TM0790, TM1097, TM0720, TM0761, TM0686, TM1766, TM0600, TM0812, TM0559, TM0578, TM216, TM0716, TM0808, TM0814, TM0820, TM0813, TM0866, TM0892, TM0920, TM0959, TM0831, TM0875, TM0885, TM0991, TM1119, TM1056, TM1083, TM1419, TM1560, TM1521. From A. Godzik.]

OK, Let’s Really Start!

The Logic of Biological Phenomena

Facts:
• molecules are lifeless
• in appropriate complexity and number, molecules compose living things
• the organisms consist of cells
• the cells consist of (bio)molecules that must conform to the physical and chemical principles that govern all matter

What makes living things distinct?
• they can grow
• they can move
• they can replicate themselves
• they can respond to stimuli
• they can perform metabolism

http://www.enchantedlearning.com/subjects/animals/cell/anatomy.GIF

More Formally

• living organisms are complex and highly organized
– organism → cell → subcellular structures (organelles) → polymeric molecules (macromolecules) → building blocks (e.g. sugars, amino acids)
– 3-D structure of a molecule aka its conformation

• biological structures serve functional purposes
– biological purpose can be given for each component

• living systems are actively involved in energy transformations
– extract energy from the environment (ultimate source is the Sun)
– organisms capture sun’s energy (from photosynthesis to metabolism of food)
– the living state is characterized by the flow of energy through the organism
– energy is spent to maintain a steady-state (which, btw, is very dynamic!)

• living systems can self-replicate
– e.g. simple division by bacteria, sexual reproduction by plants or animals
– it is high-fidelity reproduction, but imperfect (good: evolution; bad: disease)

Biomolecules

• Some facts
– hydrogen (H), oxygen (O), carbon (C), and nitrogen (N) constitute >99% of the atoms in the human body
– most of the H and O occur as H2O

• Why are H, O, C, N so suitable to chemistry of life?
– they can form covalent bonds by electron-pair sharing!
– also, H, C, N, and O are among the lightest elements in the periodic table capable of forming covalent bonds; because covalent bond strength tends to decrease as atomic weight increases, they form the strongest bonds
– phosphorus (P) and sulfur (S) are also very important in biomolecules (e.g., energy molecule ATP)

• So, what are biomolecules? No formal definition.
– all biomolecules contain carbon (a very versatile atom)
– necessary for the existence of all known forms of life
– C can share 4 electrons (can form 4 bonds); N has 3, O has 2 and H has 1 unpaired electron(s)

How are Biomolecules Produced?

• major precursors for the formation of biomolecules are water, carbon dioxide (CO2), and 3 inorganic nitrogen compounds (ammonium NH4+; nitrate NO3-; dinitrogen N2).

• metabolic processes transform these inorganic precursors into biomolecules.

Precursors → Cell:
precursors → metabolites → building blocks → macromolecules → supramolecules

Biochemistry by Reginald H. Garrett and Charles M. Grisham

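Covalent Bond and Non-covalent Bond

• Covalent bond (sharing pairs of electrons)
– Proteins: polymers of amino acids (peptide bonds)
– DNA/RNA: polymers of nucleotides (phosphodiester bonds)

• Non-covalent bond (weak force)

[Figure: peptide backbone (N, Cα, C atoms) of adjacent residues with side chains R1 and R2, a carbonyl O and an amide H, and the backbone torsion angles Φ and ψ marked]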
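The backbone angles Φ and ψ marked in the figure are torsion (dihedral) angles, each defined by four consecutive backbone atoms: Φ by C(i-1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). The following is a minimal sketch of how such an angle can be computed from 3-D coordinates. The function name and the toy coordinates are illustrative assumptions, not course material; the sign is chosen so that cis is 0°, trans is ±180°, and a clockwise rotation viewed along the central bond is positive.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3-D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3-D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p1, p2, p3, p4):
    """Torsion angle (degrees, in (-180, 180]) about the p2-p3 bond.

    For a protein backbone (assumed atom order):
      phi: p1..p4 = C(i-1), N(i), CA(i), C(i)
      psi: p1..p4 = N(i), CA(i), C(i), N(i+1)
    """
    u1 = _sub(p2, p1)
    u2 = _sub(p3, p2)
    u3 = _sub(p4, p3)
    n1 = _cross(u1, u2)  # normal of the plane through p1, p2, p3
    n2 = _cross(u2, u3)  # normal of the plane through p2, p3, p4
    y = math.sqrt(_dot(u2, u2)) * _dot(u1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Toy coordinates (hypothetical, not taken from a real structure).
    print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Using `atan2` rather than `acos` keeps the sign of the angle, which matters because Φ and ψ are usually reported on a signed scale.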

Weak Forces and Interactions

• Van der Waals forces
– result of induced electrical interactions between closely approaching atoms or molecules as their negatively charged electron clouds fluctuate instantaneously in time; may be attractive (between positively charged nuclei and the electrons of nearby atoms: dipole-dipole interactions, dipole-induced dipole interactions) or repulsive when two atoms approach too close to one another

• hydrogen bonds
– form between a hydrogen atom covalently bonded to an electronegative atom (O, N) and a second electronegative atom that serves as the hydrogen bond acceptor; stronger than Van der Waals forces; highly directional

• ionic interactions
– result of attractive forces between oppositely charged polar functions, such as negative carboxyl groups and positive amino groups

• hydrophobic interactions
– exist due to the strong tendency of water to exclude non-polar groups or molecules; water molecules prefer the stronger interactions that they share with one another, compared with their interaction with non-polar groups

Weak Forces and Interactions

Type of bonding                                     Strength (kJ/mol)
Van der Waals interactions                          0.4-4.0
Hydrogen bonds                                      12-30
Ionic interactions                                  20
Hydrophobic interactions                            <40
The average kinetic energy of molecules at 25°C     2.5
Covalent bond (O2)                                  402
Covalent bond (N2)                                  946

A commonly-used example of a polar compound is water (H2O). The electrons of water's hydrogen atoms are strongly attracted to the oxygen atom, and are actually closer to oxygen's nucleus than to the hydrogen nuclei; thus, water has a relatively strong negative charge in the middle (red shade), and a positive charge at the ends (blue shade). (source: Wikipedia)

1. weak forces are non-covalent “bonds”
2. they are 1-3 orders of magnitude weaker than covalent bonds
3. they are several times greater than dissociating tendency due to thermal motion of molecules (at 25°C)

Biochemistry by Reginald H. Garrett and Charles M. Grisham
www.biology.arizona.edu
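The comparisons in the table can be checked with a few lines of arithmetic. The sketch below assumes that the 2.5 kJ/mol thermal entry corresponds to RT at 25°C (the slide does not say how it was derived) and expresses each interaction as a multiple of that value and as orders of magnitude below a covalent bond. All names are illustrative; the strengths are copied from the table, and the hydrophobic entry is left out because only an upper bound is given.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)
TEMPERATURE_K = 298.15      # 25 degrees C

# RT in kJ/mol; assumed here to be the origin of the table's 2.5 kJ/mol entry.
THERMAL_KJ_PER_MOL = GAS_CONSTANT * TEMPERATURE_K / 1000.0

# (low, high) strengths in kJ/mol, copied from the table above.
WEAK_FORCES = {
    "Van der Waals interactions": (0.4, 4.0),
    "Hydrogen bonds": (12.0, 30.0),
    "Ionic interactions": (20.0, 20.0),
}
COVALENT_BONDS = {"O2": 402.0, "N2": 946.0}


def multiples_of_thermal(low, high):
    """Express a strength range as multiples of RT at 25 degrees C."""
    return low / THERMAL_KJ_PER_MOL, high / THERMAL_KJ_PER_MOL


def orders_of_magnitude_weaker(weak, covalent):
    """log10 of the ratio between a covalent and a weak interaction energy."""
    return math.log10(covalent / weak)


if __name__ == "__main__":
    print(f"RT at 25 C = {THERMAL_KJ_PER_MOL:.2f} kJ/mol")
    for name, (low, high) in WEAK_FORCES.items():
        lo_rt, hi_rt = multiples_of_thermal(low, high)
        print(f"{name}: {lo_rt:.1f} to {hi_rt:.1f} x RT")
        for bond, energy in COVALENT_BONDS.items():
            gap = orders_of_magnitude_weaker(high, energy)
            print(f"  {gap:.1f} orders of magnitude below {bond} (upper end)")
```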

Why are Weak Forces Important?

1. They are reversible! Rigid molecules would not facilitate cellular activities.

2. Biomolecular recognition is performed via interplay of complementary molecules. So, biological function is achieved through mechanisms based on structural complementarity and weak chemical interactions.

3. If a sufficient number of weak bonds can be formed (as in macromolecules complementary in structure to one another) larger structures assemble “spontaneously”.

4. A consequence: weak forces restrict organisms to a narrow range of environmental conditions (temperature, ionic strength, relative acidity).

5. The loss of structural order is called denaturation. It is accompanied by loss of function.

Weak Forces and 3D Structures of Large Biomolecules

http://www.accessexcellence.org/RC/VL/GG/protein.html

Biomolecules are Dynamic

• Dynamic cellular processes, e.g., proteins associate and dissociate

• Dynamic structures
– Allosteric protein: a protein whose conformation is changed when it binds a particular molecule
– Disordered protein

Image from www.chem.umd.edu/groups/munoz/proteinfunction2.pdf

Energy Calculations for Structures

• Quantum mechanics (QM) vs molecular mechanics (MM)
– Quantum mechanics based on the solution of the Schrödinger equation; mainly applied to small systems; requires no experimental information as input.
– Molecular mechanics (force-field for potential energy method); potential function must be evaluated empirically

• Molecular mechanics
– Cumulative physical forces can be used to describe molecular geometries and energies; the spatial conformation is a natural adjustment of geometry to minimize the total internal energy
– Mechanical molecular representation: a molecule is described as a collection of masses centered at the nuclei (atoms) connected by springs (bonds); the molecule stretches, bends and rotates in response to inter and intramolecular forces
– Works generally well for describing molecular structures and processes (but not the bond-breaking events).
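The "masses connected by springs" picture can be illustrated with a single bond. In this minimal sketch a bond is a harmonic spring with energy E = ½·k·(r − r0)², and the bond length is relaxed by steepest descent until the energy is minimized. The functional form with the ½ factor, the parameter values, and all function names are assumptions for illustration; actual force fields define their own terms and empirically fitted parameters.

```python
def bond_energy(r, r0, k):
    """Harmonic bond-stretch energy for bond length r (spring model)."""
    return 0.5 * k * (r - r0) ** 2


def bond_force(r, r0, k):
    """Restoring force along the bond: the negative derivative of the energy."""
    return -k * (r - r0)


def relax_bond(r, r0, k, step=0.001, iterations=200):
    """Steepest-descent relaxation of one bond length toward the energy minimum."""
    for _ in range(iterations):
        r += step * bond_force(r, r0, k)
    return r


if __name__ == "__main__":
    # Hypothetical parameters: equilibrium length in angstroms,
    # force constant in energy units per square angstrom.
    r0, k = 1.53, 300.0
    stretched = 1.80
    relaxed = relax_bond(stretched, r0, k)
    print(f"start:   r = {stretched:.3f}, E = {bond_energy(stretched, r0, k):.3f}")
    print(f"relaxed: r = {relaxed:.3f}, E = {bond_energy(relaxed, r0, k):.3e}")
```

A full molecular mechanics energy sums terms like this one over all bonds, plus angle-bending, torsion, and nonbonded terms. The same spring model also shows why bond breaking is out of scope: the harmonic energy grows without bound as the bond is stretched.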

Nonbonded Computations

• Examples:
– 6/12 Lennard-Jones potential for van der Waals interactions
– The Coulomb potential

• Force fields
– energy function form & parameters
– CHARMM, AMBER, etc
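The formulas for these two potentials do not appear in the transcript, so the sketch below uses their common textbook forms as an assumption: the 6/12 Lennard-Jones energy 4ε[(σ/r)^12 − (σ/r)^6] and the Coulomb energy C·q1·q2/(ε_r·r). The parameter values in the example are hypothetical, and the Coulomb constant is expressed in kJ·Å/(mol·e²) so that distances are in ångströms and charges in units of the elementary charge.

```python
# Coulomb constant e^2/(4*pi*eps0) per mole, in kJ*angstrom/(mol*e^2).
COULOMB_CONSTANT = 1389.35458


def lennard_jones(r, epsilon, sigma):
    """6/12 Lennard-Jones energy: 4*epsilon*((sigma/r)**12 - (sigma/r)**6).

    epsilon is the well depth and sigma the distance at which the energy is zero.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q1, q2, relative_permittivity=1.0):
    """Coulomb energy (kJ/mol) of two point charges separated by r angstroms."""
    return COULOMB_CONSTANT * q1 * q2 / (relative_permittivity * r)


if __name__ == "__main__":
    # Hypothetical Lennard-Jones parameters, for illustration only.
    epsilon, sigma = 1.0, 3.4
    r_min = 2.0 ** (1.0 / 6.0) * sigma  # position of the energy minimum
    for r in (3.0, sigma, r_min, 6.0):
        print(f"LJ(r = {r:.2f}) = {lennard_jones(r, epsilon, sigma):+.3f}")
    # Opposite unit charges 3 angstroms apart, in vacuum and with eps_r = 80.
    print(coulomb(3.0, +1.0, -1.0))
    print(coulomb(3.0, +1.0, -1.0, relative_permittivity=80.0))
```

The r^-12 term models the steep repulsion when atoms approach too closely and the r^-6 term the attractive part. A force field in the sense of the last bullet is the combination of such functional forms with a parameter set (ε, σ, partial charges, and the bonded terms).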

More to come

“Structural bioinformatics is the subdiscipline of bioinformatics that focuses on the representation, storage, retrieval, analysis, and display of structural information at the atomic and subcellular spatial scales”

(Structural Bioinformatics. Philip Bourne & Helge Weissig)