bioinformatics - link.springer.com

Bioinformatics

460. Essential Concepts in Toxicogenomics, edited by Donna L. Mendrick and William B. Mattes, 2008

459. Prion Protein Protocols, edited by Andrew F. Hill, 2008

458. Artificial Neural Networks: Methods and Applications, edited by David S. Livingstone, 2008

457. Membrane Trafficking, edited by Ales Vancura, 2008

456. Adipose Tissue Protocols, Second Edition, edited by Kaiping Yang, 2008

455. Osteoporosis, edited by Jennifer J. Westendorf, 2008

454. SARS- and Other Coronaviruses: Laboratory Protocols, edited by Dave Cavanagh, 2008

453. Bioinformatics, Volume II: Structure, Function and Applications, edited by Jonathan M. Keith, 2008

452. Bioinformatics, Volume I: Data, Sequence Analysis and Evolution, edited by Jonathan M. Keith, 2008

451. Plant Virology Protocols: From Viral Sequence to Protein Function, edited by Gary Foster, Elisabeth Johansen, Yiguo Hong, and Peter Nagy, 2008

450. Germline Stem Cells, edited by Steven X. Hou and Shree Ram Singh, 2008

449. Mesenchymal Stem Cells: Methods and Protocols, edited by Darwin J. Prockop, Douglas G. Phinney, and Bruce A. Brunnell, 2008

448. Pharmacogenomics in Drug Discovery and Development, edited by Qing Yan, 2008

447. Alcohol: Methods and Protocols, edited by Laura E. Nagy, 2008

446. Post-translational Modification of Proteins: Tools for Functional Proteomics, Second Edition, edited by Christoph Kannicht, 2008

445. Autophagosome and Phagosome, edited by Vojo Deretic, 2008

444. Prenatal Diagnosis, edited by Sinhue Hahn and Laird G. Jackson, 2008

443. Molecular Modeling of Proteins, edited by Andreas Kukol, 2008

442. RNAi: Design and Application, edited by Sailen Barik, 2008

441. Tissue Proteomics: Pathways, Biomarkers, and Drug Discovery, edited by Brian Liu, 2008

440. Exocytosis and Endocytosis, edited by Andrei I. Ivanov, 2008

439. Genomics Protocols, Second Edition, edited by Mike Starkey and Ramnanth Elaswarapu, 2008

438. Neural Stem Cells: Methods and Protocols, Second Edition, edited by Leslie P. Weiner, 2008

437. Drug Delivery Systems, edited by Kewal K. Jain, 2008

436. Avian Influenza Virus, edited by Erica Spackman, 2008

435. Chromosomal Mutagenesis, edited by Greg Davis and Kevin J. Kayser, 2008

434. Gene Therapy Protocols: Volume 2: Design and Characterization of Gene Transfer Vectors, edited by Joseph M. LeDoux, 2008

433. Gene Therapy Protocols: Volume 1: Production and In Vivo Applications of Gene Transfer Vectors, edited by Joseph M. LeDoux, 2008

432. Organelle Proteomics, edited by Delphine Pflieger and Jean Rossier, 2008

431. Bacterial Pathogenesis: Methods and Protocols, edited by Frank DeLeo and Michael Otto, 2008

430. Hematopoietic Stem Cell Protocols, edited by Kevin D. Bunting, 2008

429. Molecular Beacons: Signalling Nucleic Acid Probes, Methods and Protocols, edited by Andreas Marx and Oliver Seitz, 2008

428. Clinical Proteomics: Methods and Protocols, edited by Antonia Vlahou, 2008

427. Plant Embryogenesis, edited by Maria Fernanda Suarez and Peter Bozhkov, 2008

426. Structural Proteomics: High-Throughput Methods, edited by Bostjan Kobe, Mitchell Guss, and Huber Thomas, 2008

425. 2D PAGE: Sample Preparation and Fractionation, Volume 2, edited by Anton Posch, 2008

424. 2D PAGE: Sample Preparation and Fractionation, Volume 1, edited by Anton Posch, 2008

423. Electroporation Protocols: Preclinical and Clinical Gene Medicine, edited by Shulin Li, 2008

422. Phylogenomics, edited by William J. Murphy, 2008

421. Affinity Chromatography: Methods and Protocols, Second Edition, edited by Michael Zachariou, 2008

420. Drosophila: Methods and Protocols, edited by Christian Dahmann, 2008

419. Post-Transcriptional Gene Regulation, edited by Jeffrey Wilusz, 2008

418. Avidin–Biotin Interactions: Methods and Applications, edited by Robert J. McMahon, 2008

417. Tissue Engineering, Second Edition, edited by Hannsjörg Hauser and Martin Fussenegger, 2007

416. Gene Essentiality: Protocols and Bioinformatics, edited by Svetlana Gerdes and Andrei L. Osterman, 2008

415. Innate Immunity, edited by Jonathan Ewbank and Eric Vivier, 2007

414. Apoptosis in Cancer: Methods and Protocols, edited by Gil Mor and Ayesha Alvero, 2008

413. Protein Structure Prediction, Second Edition, edited by Mohammed Zaki and Chris Bystroff, 2008

412. Neutrophil Methods and Protocols, edited by Mark T. Quinn, Frank R. DeLeo, and Gary M. Bokoch, 2007

411. Reporter Genes: A Practical Guide, edited by Don Anson, 2007

410. Environmental Genomics, edited by Cristofre C. Martin, 2007

409. Immunoinformatics: Predicting Immunogenicity In Silico, edited by Darren R. Flower, 2007

408. Gene Function Analysis, edited by Michael Ochs, 2007

407. Stem Cell Assays, edited by Vemuri C. Mohan, 2007

406. Plant Bioinformatics: Methods and Protocols, edited by David Edwards, 2007

405. Telomerase Inhibition: Strategies and Protocols, edited by Lucy Andrews and Trygve O. Tollefsbol, 2007

METHODS IN MOLECULAR BIOLOGY™

John M. Walker, SERIES EDITOR

METHODS IN MOLECULAR BIOLOGY™

Bioinformatics

Volume II

Structure, Function and Applications

Edited by

Jonathan M. Keith, PhD

School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia

ISBN: 978-1-60327-428-9
e-ISBN: 978-1-60327-429-6
ISSN: 1064-3745
e-ISSN: 1940-6029
DOI: 10.1007/978-1-60327-429-6

Library of Congress Control Number: 20082922946

© 2008 Humana Press, a part of Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Humana Press, 999 Riverview Drive, Suite 208, Totowa, NJ 07512 USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of going to press, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Cover illustration: Fig. 1A, Chapter 23, “Visualization,” by Falk Schreiber (main image); and Fig. 4, Chapter 5, “The Classification of Protein Domains,” by Russell L. Marsden and Christine A. Orengo (surrounding images)

Printed on acid-free paper

springer.com

Editor
Jonathan M. Keith
School of Mathematical Sciences
Queensland University of Technology
Brisbane, Queensland, Australia

Series Editor
John Walker
Hatfield, Hertfordshire AL10 9NP
UK

Preface

Bioinformatics is the management and analysis of data for the life sciences. As such, it is inherently interdisciplinary, drawing on techniques from Computer Science, Statistics, and Mathematics and bringing them to bear on problems in Biology. Moreover, its subject matter is as broad as Biology itself. Users and developers of bioinformatics methods come from all of these fields. Molecular biologists are some of the major users of Bioinformatics, but its techniques are applicable across a range of life sciences. Other users include geneticists, microbiologists, biochemists, plant and agricultural scientists, medical researchers, and evolution researchers.

The ongoing exponential expansion of data for the life sciences is both the major challenge and the raison d’être for twenty-first century Bioinformatics. To give one example among many, the completion and success of the human genome sequencing project, far from being the end of the sequencing era, motivated a proliferation of new sequencing projects. And it is not only the quantity of data that is expanding; new types of biological data continue to be introduced as a result of technological development and a growing understanding of biological systems.

Bioinformatics describes a selection of methods from across this vast and expanding discipline. The methods are some of the most useful and widely applicable in the field. Most users and developers of bioinformatics methods will find something of value to their own specialties here, and will benefit from the knowledge and experience of its 86 contributing authors. Developers will find them useful as components of larger methods, and as sources of inspiration for new methods. Volume II, Section IV in particular is aimed at developers; it describes some of the “meta-methods”—widely applicable mathematical and computational methods that inform and lie behind other more specialized methods—that have been successfully used by bioinformaticians. For users of bioinformatics, this book provides methods that can be applied as is, or with minor variations, to many specific problems. The Notes section in each chapter provides valuable insights into important variations and when to use them. It also discusses problems that can arise and how to fix them. This work is also intended to serve as an entry point for those who are just beginning to discover and use methods in bioinformatics. As such, this book is also intended for students and early career researchers.

As with other volumes in the Methods in Molecular Biology™ series, the intention of this book is to provide the kind of detailed description and implementation advice that is crucial for getting optimal results out of any given method, yet which often is not incorporated into journal publications. Thus, this series provides a forum for the communication of accumulated practical experience.

The work is divided into two volumes, with data, sequence analysis, and evolution the subjects of the first volume, and structure, function, and application the subjects of the second. The second volume also presents a number of “meta-methods”: techniques that will be of particular interest to developers of bioinformatic methods and tools.

Within Volume I, Section I deals with data and databases. It contains chapters on a selection of methods involving the generation and organization of data, including sequence data, RNA and protein structures, microarray expression data, and functional annotations.

Section II presents a selection of methods in sequence analysis, beginning with multiple sequence alignment. Most of the chapters in this section deal with methods for discovering the functional components of genomes, whether genes, alternative splice sites, non-coding RNAs, or regulatory motifs.

Section III presents several of the most useful and interesting methods in phylogenetics and evolution. The wide variety of topics treated in this section is indicative of the breadth of evolution research. It includes chapters on some of the most basic issues in phylogenetics: modelling of evolution and inferring trees. It also includes chapters on drawing inferences about various kinds of ancestral states, systems, and events, including gene order, recombination events and genome rearrangements, ancestral interaction networks, lateral gene transfers, and patterns of migration. It concludes with a chapter discussing some of the achievements and challenges of algorithm development in phylogenetics.

In Volume II, Section I, some methods pertinent to the prediction of protein and RNA structures are presented. Methods for the analysis and classification of structures are also discussed.

Methods for inferring the function of previously identified genomic elements (chiefly protein-coding genes) are presented in Volume II, Section II. This is another very diverse subject area, and the variety of methods presented reflects this. Some well-known techniques for identifying function, based on homology, “Rosetta stone” genes, gene neighbors, phylogenetic profiling, and phylogenetic shadowing are discussed, alongside methods for identifying regulatory sequences, patterns of expression, and participation in complexes. The section concludes with a discussion of a technique for integrating multiple data types to increase the confidence with which functional predictions can be made. This section, taken as a whole, highlights the opportunities for development in the area of functional inference.

Some medical applications, chiefly diagnostics and drug discovery, are described in Volume II, Section III. The importance of microarray expression data as a diagnostic tool is a theme of this section, as is the danger of over-interpreting such data. The case study presented in the final chapter highlights the need for computational diagnostics to be biologically informed.

The final section presents just a few of the “meta-methods” that developers of bioinformatics methods have found useful. For the purpose of designing algorithms, it is as important for bioinformaticians to be aware of the concept of fixed parameter tractability as it is for them to understand NP-completeness, since these concepts often determine the types of algorithms appropriate to a particular problem. Clustering is a ubiquitous problem in Bioinformatics, as is the need to visualize data. The need to interact with massive databases and multiple software entities makes the development of computational pipelines an important issue for many bioinformaticians. Finally, the chapter on text mining discusses techniques for addressing the special problems of interacting with and extracting information from the vast biological literature.

Jonathan M. Keith

Contents

Preface

Contributors

Contents of Volume I

SECTION I: STRUCTURES

1. UNAFold: Software for Nucleic Acid Folding and Hybridization
Nicholas R. Markham and Michael Zuker

2. Protein Structure Prediction
Bissan Al-Lazikani, Emma E. Hill, and Veronica Morea

3. An Introduction to Protein Contact Prediction
Nicholas Hamilton and Thomas Huber

4. Analysis of Mass Spectrometry Data in Proteomics
Rune Matthiesen and Ole N. Jensen

5. The Classification of Protein Domains
Russell L. Marsden and Christine A. Orengo

SECTION II: INFERRING FUNCTION

6. Inferring Function from Homology
Richard D. Emes

7. The Rosetta Stone Method
Shailesh V. Date

8. Inferring Functional Relationships from Conservation of Gene Order
Gabriel Moreno-Hagelsieb

9. Phylogenetic Profiling
Shailesh V. Date and José M. Peregrín-Alvarez

10. Phylogenetic Shadowing: Sequence Comparisons of Multiple Primate Species
Dario Boffelli

11. Prediction of Regulatory Elements
Albin Sandelin

12. Expression and Microarrays
Joaquín Dopazo and Fátima Al-Shahrour

13. Identifying Components of Complexes
Nicolas Goffard and Georg Weiller

14. Integrating Functional Genomics Data
Insuk Lee and Edward M. Marcotte

SECTION III: APPLICATIONS AND DISEASE

15. Computational Diagnostics with Gene Expression Profiles
Claudio Lottaz, Dennis Kostka, Florian Markowetz, and Rainer Spang

16. Analysis of Quantitative Trait Loci
Mario Falchi

17. Molecular Similarity Concepts and Search Calculations
Jens Auer and Jürgen Bajorath

18. Optimization of the MAD Algorithm for Virtual Screening
Hanna Eckert and Jürgen Bajorath

19. Combinatorial Optimization Models for Finding Genetic Signatures from Gene Expression Datasets
Regina Berretta, Wagner Costa, and Pablo Moscato

20. Genetic Signatures for a Rodent Model of Parkinson’s Disease Using Combinatorial Optimization Methods
Mou’ath Hourani, Regina Berretta, Alexandre Mendes, and Pablo Moscato

SECTION IV: ANALYTICAL AND COMPUTATIONAL METHODS

21. Developing Fixed-Parameter Algorithms to Solve Combinatorially Explosive Biological Problems
Falk Hüffner, Rolf Niedermeier, and Sebastian Wernicke

22. Clustering
Geoffrey J. McLachlan, Richard W. Bean, and Shu-Kay Ng

23. Visualization
Falk Schreiber

24. Constructing Computational Pipelines
Mark Halling-Brown and Adrian J. Shepherd

25. Text Mining
Andrew B. Clegg and Adrian J. Shepherd

Index

Contributors

BISSAN AL-LAZIKANI • Biofocus DPI, London, United Kingdom

FÁTIMA AL-SHAHROUR • Department of Bioinformatics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain

JENS AUER • Department of Life Science Informatics, Bonn-Aachen International Center for Information Technology (B-IT), Rheinische Friedrich-Wilhelms-University Bonn, Bonn, Germany

JÜRGEN BAJORATH • Professor and Chair of Life Science Informatics, Department of Life Science Informatics, Bonn-Aachen International Center for Information Technology (B-IT), Rheinische Friedrich-Wilhelms-University Bonn, Bonn, Germany

RICHARD W. BEAN • ARC Centre of Excellence in Bioinformatics, and Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia

REGINA BERRETTA • Centre of Bioinformatics, Biomarker Discovery and Information-Based Medicine, The University of Newcastle, Callaghan, New South Wales, Australia

DARIO BOFFELLI • Children’s Hospital Oakland Research Institute, Oakland, CA

ANDREW B. CLEGG • Institute of Structural Molecular Biology, School of Crystallography, Birkbeck College, University of London, London, United Kingdom

WAGNER COSTA • School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, New South Wales, Australia

SHAILESH V. DATE • PENN Center for Bioinformatics, Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, PA

JOAQUÍN DOPAZO • Department of Bioinformatics, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain

HANNA ECKERT • Department of Life Science Informatics, Bonn-Aachen International Center for Information Technology (B-IT), Rheinische Friedrich-Wilhelms-University Bonn, Bonn, Germany

RICHARD D. EMES • Department of Biology, University College London, London, United Kingdom

MARIO FALCHI • Twin Research and Genetic Epidemiology Unit, King’s College London School of Medicine, London, United Kingdom

NICOLAS GOFFARD • Research School of Biological Sciences and ARC Centre of Excellence for Integrative Legume Research, The Australian National University, Canberra, Australian Capital Territory, Australia

MARK HALLING-BROWN • Institute of Structural Molecular Biology, School of Crystallography, Birkbeck College, University of London, London, United Kingdom

NICHOLAS HAMILTON • ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience and Advanced Computational Modelling Centre, The University of Queensland, Brisbane, Queensland, Australia

EMMA E. HILL • The Journal of Cell Biology, Rockefeller University Press, New York, NY

MOU’ATH HOURANI • Newcastle Bioinformatics Initiative, School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, New South Wales, Australia

THOMAS HUBER • School of Molecular and Microbial Sciences and Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Queensland, Australia

FALK HÜFFNER • Institut für Informatik, Friedrich-Schiller-Universität Jena, Jena, Germany

OLE N. JENSEN • Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark

JONATHAN M. KEITH • School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia

DENNIS KOSTKA • Max Planck Institute for Molecular Genetics and Berlin Center for Genome-Based Bioinformatics, Berlin, Germany

INSUK LEE • Center for Systems and Synthetic Biology, Institute for Molecular Biology, University of Texas at Austin, Austin, TX

CLAUDIO LOTTAZ • Max Planck Institute for Molecular Genetics and Berlin Center for Genome-Based Bioinformatics, Berlin, Germany

EDWARD M. MARCOTTE • Center for Systems and Synthetic Biology, and Department of Chemistry and Biochemistry, Institute for Molecular Biology, University of Texas at Austin, Austin, TX

NICHOLAS R. MARKHAM • Xerox Litigation Services, Albany, NY

FLORIAN MARKOWETZ • Max Planck Institute for Molecular Genetics and Berlin Center for Genome-Based Bioinformatics, Berlin, Germany

RUSSELL L. MARSDEN • Biochemistry and Molecular Biology Department, University College London, London, United Kingdom

RUNE MATTHIESEN • CIC bioGUNE, Bilbao, Spain

GEOFFREY J. MCLACHLAN • ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, and Department of Mathematics, The University of Queensland, Brisbane, Queensland, Australia

ALEXANDRE MENDES • Centre of Bioinformatics, Biomarker Discovery and Information-Based Medicine, The University of Newcastle, Callaghan, New South Wales, Australia

VERONICA MOREA • National Research Council (CNR), Institute of Molecular Biology and Pathology (IBPN), Rome, Italy

GABRIEL MORENO-HAGELSIEB • Department of Biology, Wilfrid Laurier University, Waterloo, Ontario, Canada

PABLO MOSCATO • ARC Centre of Excellence in Bioinformatics, and Centre of Bioinformatics, Biomarker Discovery and Information-Based Medicine, The University of Newcastle, Callaghan, New South Wales, Australia

SHU-KAY NG • Department of Mathematics, The University of Queensland, Brisbane, Queensland, Australia

ROLF NIEDERMEIER • Institut für Informatik, Friedrich-Schiller-Universität Jena, Jena, Germany

CHRISTINE A. ORENGO • Biochemistry and Molecular Biology Department, University College London, London, United Kingdom

JOSÉ M. PEREGRÍN-ALVAREZ • Hospital for Sick Children, Toronto, Ontario, Canada

ALBIN SANDELIN • The Bioinformatics Centre, Department of Molecular Biology and Biotech Research and Innovation Centre, University of Copenhagen, Copenhagen, Denmark

FALK SCHREIBER • Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Germany and Institute for Computer Science, Martin-Luther University Halle-Wittenberg, Germany

ADRIAN J. SHEPHERD • Institute of Structural Molecular Biology, School of Crystallography, Birkbeck College, University of London, London, United Kingdom

RAINER SPANG • Max Planck Institute for Molecular Genetics and Berlin Center for Genome-Based Bioinformatics, Berlin, Germany

GEORG F. WEILLER • Research School of Biological Sciences and ARC Centre of Excellence for Integrative Legume Research, The Australian National University, Canberra, Australian Capital Territory, Australia

SEBASTIAN WERNICKE • Institut für Informatik, Friedrich-Schiller-Universität Jena, Jena, Germany

MICHAEL ZUKER • Mathematical Sciences and Biology Department, Rensselaer Polytechnic Institute, Troy, NY

Contents of Volume I

SECTION I: DATA AND DATABASES

1. Managing Sequence Data
Ilene Karsch Mizrachi

2. RNA Structure Determination by NMR
Lincoln G. Scott and Mirko Hennig

3. Protein Structure Determination by X-Ray Crystallography
Andrea Ilari and Carmelinda Savino

4. Pre-Processing of Microarray Data and Analysis of Differential Expression
Steffen Durinck

5. Developing an Ontology
Midori A. Harris

6. Genome Annotation
Hideya Kawaji and Yoshihide Hayashizaki

SECTION II: SEQUENCE ANALYSIS

7. Multiple Sequence Alignment
Walter Pirovano and Jaap Heringa

8. Finding Genes in Genome Sequence
Alice Carolyn McHardy

9. Bioinformatics Detection of Alternative Splicing
Namshin Kim and Christopher Lee

10. Reconstruction of Full-Length Isoforms from Splice Graphs
Yi Xing and Christopher Lee

11. Sequence Segmentation
Jonathan M. Keith

12. Discovering Sequence Motifs
Timothy L. Bailey

SECTION III: PHYLOGENETICS AND EVOLUTION

13. Modeling Sequence Evolution
Pietro Liò and Martin Bishop

14. Inferring Trees
Simon Whelan

15. Detecting the Presence and Location of Selection in Proteins
Tim Massingham

16. Phylogenetic Model Evaluation
Lars Sommer Jermiin, Vivek Jayaswal, Faisal Ababneh, and John Robinson

17. Inferring Ancestral Gene Order
Julian M. Catchen, John S. Conery, and John H. Postlethwait

18. Genome Rearrangement by the Double Cut and Join Operation
Richard Friedberg, Aaron E. Darling, and Sophia Yancopoulos

19. Inferring Ancestral Protein Interaction Networks
José M. Peregrín-Alvarez

20. Computational Tools for the Analysis of Rearrangements in Mammalian Genomes
Guillaume Bourque and Glenn Tesler

21. Detecting Lateral Genetic Transfer: A Phylogenetic Approach
Robert G. Beiko and Mark A. Ragan

22. Detecting Genetic Recombination
Georg F. Weiller

23. Inferring Patterns of Migration
Paul M.E. Bunje and Thierry Wirth

24. Fixed-Parameter Algorithms in Phylogenetics
Jens Gramm, Arfst Nickelsen, and Till Tantau

Index
