Pharmaceutical Data Mining. Edited by Konstantin V. Balakin

aimed at the experienced practitioner: to get the most out of this book, the reader will need to have substantial expertise in the area of drug discovery and the development and application of bioassays.

Dr. Gerd Wagner
University of East Anglia (UK)


Pharmaceutical Data Mining
Edited by Konstantin V. Balakin.

Wiley, Hoboken 2009. 565 pp., hardcover, $125.00. ISBN 978-0-470-19608-3

Drug discovery has become a highly technology-intensive process generating large amounts of data. Throughout the discovery and development pipeline, the key question is how to convert these data into information and knowledge. This book deals with techniques to support this process. The back cover states the main objective of the book: to show “how sophisticated computational data mining techniques can impact contemporary drug discovery and development”.

In 17 chapters by different author groups, the book gives an overview of methods and application areas of data mining. Many of the chapters are written as introductory texts to algorithmic principles and are rich in formulae: for example, chapters on statistics and information theory (Chapter 1), dimensionality reduction (Chapter 15) and self-organizing maps (Chapter 16). Some of these texts hide behind chapter titles promising more pragmatic content: for example, Chapter 4 on compound selection and iterative screening is more of an introduction to Bayesian models and support vector machines; Chapter 5, entitled “Prediction of toxic effects of pharmaceutical agents”, is a rather abstract description of QSAR modeling. Most of these texts are too terse for the novice, but at the same time too superficial for the expert. To serve as useful introductory material, they could have been made more accessible and bundled in a dedicated section of the book.

The editor has attempted to cover a wide range of application domains, ranging from very early stage (e.g., high-throughput screening or microarray data analysis) to very late stage research (e.g., pharmacovigilance). There are gaps along the development pipeline; for example, the prediction of human pharmacokinetics and the general principles of clinical study evaluation are not covered. Still, with such a broad selection of topics, any individual reader will be a nonexpert in most areas and will therefore need to rely on worked examples and expert opinions wherever possible to answer questions such as: Which methods are of proven value, which are not? What databases are of particularly high quality and well annotated? Where are the areas for improvement? Unfortunately, most chapters do not provide clear guidance but are written as collections of tools and databases. This is particularly true for Chapters 6, 9 and 17, which logically belong together as they all touch on aspects of medicinal chemistry. In this field, there is a growing number of repositories containing increasingly reliable data: crystal structures, thermodynamics of protein–ligand complexes, relationships between drugs and target families. The book lacks an in-depth account of how data mining in these repositories has led to greater understanding and thus to a more rational research process.

Overall, it has to be said that the book does not meet its goal of showing the impact of data mining in drug discovery. Its strength is that it gives beginners a good impression of our contemporary data jungle; its weakness is that it does not lead the way through it. In this sense, it highlights one of the major issues of this industry: an overemphasis on technology-driven instead of concept-driven projects. In the end, all data mining tools need to serve the same purpose by providing appropriate support for decision making by experts. Technology should follow purpose, not the other way round. The availability of a multitude of toys inevitably leads to distractions. Perhaps this caveat is best expressed by a thoughtful concluding remark in the chapter on pharmacovigilance (Chapter 12): “We have observed…a tendency to be overawed by more complex methods that may desensitize users to the limitations and complexities of the data that are not necessarily overcome by more elaborate mathematical frameworks”.

Dr. Martin Stahl
F. Hoffmann–La Roche Ltd. (Switzerland)
DOI: 10.1002/cmdc.201000052

www.chemmedchem.org © 2010 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. ChemMedChem 2010, 5, 958–962

