Pharmaceutical Data Mining. Edited by Konstantin V. Balakin

aimed at the experienced practitioner: to get the most out of this book, the reader will need to have substantial expertise in the area of drug discovery and the development and application of bioassays.

Dr. Gerd Wagner
University of East Anglia (UK)

Pharmaceutical Data Mining
Edited by Konstantin V. Balakin.
Wiley, Hoboken 2009. 565 pp., hardcover $125.00. ISBN 978-0-470-19608-3

Drug discovery has become a highly technology-intensive process generating large amounts of data. Throughout the discovery and development pipeline, the key question is how to convert this data to information and knowledge. This book deals with techniques to support this process. On the back cover, to show “how sophisticated computational data mining techniques can impact contemporary drug discovery and development” is stated as the main objective of the book.

In 17 chapters by different author groups, the book gives an overview of methods and application areas of data mining. Many of the chapters are written as introductory texts to algorithmic principles and are rich in formulae: for example, chapters on statistics and information theory (Chapter 1), dimensionality reduction (Chapter 15) and self-organizing maps (Chapter 16). Some of these texts hide behind chapter titles promising more pragmatic content: for example, Chapter 4 on compound selection and iterative screening is more of an introduction to Bayesian models and support vector machines; Chapter 5 entitled “Prediction of toxic effects of pharmaceutical agents” is a rather abstract description of QSAR modeling. Most of these texts are too terse for the novice, but at the same time too superficial for the expert. To serve as useful introductory material, they could have been made more accessible and bundled in a dedicated section of the book.

The editor has attempted to cover a wide range of application domains, ranging from very early stage (e.g., high-throughput screening or microarray data analysis) to very late stage research (e.g., pharmacovigilance). There are gaps along the development pipeline; for example, the prediction of human pharmacokinetics and the general principles of clinical study evaluation are not covered. Still, with such a broad selection of topics, any individual reader will be a nonexpert in most areas and will therefore need to rely on worked examples and expert opinions wherever possible to answer questions such as: Which methods are of proven value, which are not? What databases are of particularly high quality and well annotated? Where are areas of improvement? Unfortunately, most chapters do not provide clear guidance but are written as collections of tools and databases. This is particularly true for Chapters 6, 9 and 17, which logically belong together as they all touch on aspects of medicinal chemistry. In this field, there are a growing number of repositories containing increasingly reliable data: crystal structures, thermodynamics of protein–ligand complexes, relationships between drugs and target families. The book lacks an in-depth account on how data mining in these repositories has led to greater understanding and thus to a more rational research process.

Overall, it has to be said that the book does not meet its goal of showing the impact of data mining in drug discovery. Its strength is that it gives beginners a good impression of our contemporary data jungle; its weakness is that it does not lead a way through it. In this sense, it highlights one of the major issues of this industry: an overemphasis on technology-driven instead of concept-driven projects. In the end, all data mining tools need to serve the same purpose by providing appropriate support for decision making by experts. Technology should follow purpose, not the other way round. The availability of a multitude of toys inevitably leads to distractions. Perhaps this caveat is best expressed by a thoughtful concluding remark in the chapter on pharmacovigilance (Chapter 12): “We have observed…a tendency to be overawed by more complex methods that may desensitize users to the limitations and complexities of the data that are not necessarily overcome by more elaborate mathematical frameworks”.

Dr. Martin Stahl
F. Hoffmann–La Roche Ltd. (Switzerland)
DOI: 10.1002/cmdc.201000052

ChemMedChem 2010, 5, 958–962. © 2010 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. www.chemmedchem.org

