Deuterated drugs in PubChem

[1] The Unforeseen Consequences of Opportunistic Deuterated Drug Claims, Patent Extraction Feeds and the PubChem Chemistry Rules: When data integration gets fuzzy. Chris Southan, ChrisDS Consulting, Göteborg, Sweden. BioIT World, Hannover, Oct 2010

Upload: chris-southan

Post on 26-Jun-2015

DESCRIPTION

Presentation at BioIT World, Hannover, October 2010

TRANSCRIPT

Page 1: Deuterated drugs in PubChem

[1]

The Unforeseen Consequences of Opportunistic Deuterated Drug Claims,

Patent Extraction Feeds and the PubChem Chemistry Rules

When data integration gets fuzzy

Chris Southan

ChrisDS Consulting, Göteborg, Sweden

BioIT World, Hannover, Oct 2010

[2]

Background

The scale of chemistry-biology-bioinformatics connectivity has made PubChem a de facto global data integration hub.

However, the fidelity of this integration is compromised by factors including:

Inherent complexities of chemical structure representation

Chemistry rules that, while rigorous, tend to split CIDs

Submitter primacy and a low proliferation bar

Vendor dilution of bioannotated compounds with no-data compounds

Increasingly complex BioAssay relationships

Patent-extraction feeds from commercial sources

[3]

Discordance of Drug Collections

[4]

Will the real Rosuvastatin stand up?

[5]

But 15 CIDs?

[6]

“Heavy” Rosuvastatin (+28) CID 25241235
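
As a rough check on the "+28" label, the mass shift from deuteration can be worked out directly. The sketch below assumes that "+28" means all 28 hydrogens of rosuvastatin (C22H28FN3O6S) are replaced by deuterium; that reading is an assumption, not something stated on the slide. The function name is illustrative only.

```python
# Monoisotopic masses in unified atomic mass units (u).
MASS_H = 1.007825  # protium (1H)
MASS_D = 2.014102  # deuterium (2H)


def deuteration_shift(n_deuterium):
    """Mass gained when n_deuterium hydrogens are swapped for deuterium."""
    return n_deuterium * (MASS_D - MASS_H)


# Assumption: "+28" corresponds to 28 H -> D substitutions.
shift = deuteration_shift(28)
print(f"Mass shift for 28 deuteriums: {shift:.2f} u")
```

Each substitution adds roughly 1 u, so a fully deuterated analogue sits about 28 u above the parent drug while keeping the same atom connectivity.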

[7]

Patent Filings on Deuterated Drugs

“Protia's patents appear to be a blunderbuss approach, with a mass of US filings 237 published to date but only 11 PCT applications. None of these provide exemplification or any biological description. Concert has 39 PCTs and 26 US applications published, Auspex 57 PCTs and 39 US applications”

(Comment from “In the Pipeline” blog, June 2009)

[8]

A Pipeline With Unintended Consequences

[9]

Some of them Might Just Work?

But we don’t know which one of the 25!

[10]

Picking Off the Best-sellers

       Connectivity   Deuteros
 1)    38             32
 2)    4 88 0 85
 3)    15             12
 7)    7              4
 8)    8              7
 9)    11             9
10)    68             66
13)    29             26
14)    51             48
15)    16             11

[11]

Sorting

[12]

Extent of the Problem

[13]

Does this Matter?

In the grand scheme of things, maybe not, but ...

Not just the deuteros but also other patent-extracted compounds are causing CID “multiplexing” in PubChem (and ChemSpider)

The Pharma “crown jewels” of marketed drugs are badly hit

Less experienced PubChem users could be confused

Some types of search results get messed up

They “gum up” company internal integrated systems

Do we want prophetic or virtual structures in PubChem?

[14]

Solutions?

On an individual basis you can sort drug CIDs by relative molecular mass (Mr) as described

However, no filters can cleanly discriminate between data-supported and prophetic chemistry published in patents (i.e. the authentic wheat from the spurious chaff)

Patent-only flags on CIDs would be useful but as a filter they would remove many useful non-deuterateds

Filtering all isotopically labeled compounds removes valuable experimental tools that are “in pots”

A (tiny?) fraction of the deuterated drugs may get data links

Drug crowdsourcing is already happening (e.g. Wikipedia) but we need the set to be flagged in PubChem

Some kind of “canonical clustering” in PubChem could help
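
One way to picture both the Mr sorting and a "canonical clustering" is to group records by the first block of their InChIKey, which hashes the connectivity and so is shared by a drug and its isotopically labeled or stereoisomeric variants, then sort each group by molecular weight. The sketch below is only an illustration under stated assumptions: the records, CIDs, InChIKeys and weights are made-up placeholders, the function names and the 0.5 tolerance are hypothetical, and this is not how PubChem itself clusters compounds.

```python
from collections import defaultdict

# Hypothetical records: CIDs, InChIKeys and weights are placeholders,
# not real PubChem entries.
records = [
    {"cid": 101, "inchikey": "AAAAAAAAAAAAAA-BBBBBBBBSA-N", "mol_weight": 481.5},
    {"cid": 102, "inchikey": "AAAAAAAAAAAAAA-CCCCCCCCSA-N", "mol_weight": 509.7},
    {"cid": 103, "inchikey": "AAAAAAAAAAAAAA-DDDDDDDDSA-N", "mol_weight": 484.6},
    {"cid": 201, "inchikey": "ZZZZZZZZZZZZZZ-BBBBBBBBSA-N", "mol_weight": 300.1},
]


def cluster_by_connectivity(records):
    """Group records by InChIKey first block, lightest member first."""
    clusters = defaultdict(list)
    for rec in records:
        skeleton = rec["inchikey"].split("-")[0]  # connectivity hash
        clusters[skeleton].append(rec)
    for members in clusters.values():
        members.sort(key=lambda r: r["mol_weight"])
    return dict(clusters)


def heavier_than_lightest(members, tolerance=0.5):
    """CIDs whose weight exceeds the lightest cluster member by more than
    `tolerance`; with identical connectivity these are candidate
    isotopically labeled forms."""
    lightest = members[0]["mol_weight"]
    return [r["cid"] for r in members if r["mol_weight"] - lightest > tolerance]


clusters = cluster_by_connectivity(records)
for skeleton, members in clusters.items():
    cids = [r["cid"] for r in members]
    print(skeleton, cids, "heavier:", heavier_than_lightest(members))
```

A weight-based flag like this cannot tell a prophetic deuterated claim from a genuine labeled reagent, which is the limitation the slide points out; it only brings the variants of one drug together so they can be reviewed side by side.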

[15]

Questions?