PSB2014 A Vision for Biomedical Research


Upload: philip-bourne

Post on 06-May-2015


DESCRIPTION

Some preliminary thoughts about my role as Associate Director for Data Science at the NIH so as to have a discussion with attendees at the Pacific Symposium on Biocomputing on Jan 4, 2014, The Big Island of Hawaii.

TRANSCRIPT

Page 1: PSB2014 A Vision for Biomedical Research

An Informal Discussion About Big Data

Better Stated as

A Vision for Biomedical Research

Digitally enabling the length and quality of life

Philip E. Bourne

[email protected]

http://pebourne.wordpress.com/2013/12/21/taking-on-the-role-of-associate-director-for-data-science-at-the-nih-my-original-vision-statement/

Page 2: PSB2014 A Vision for Biomedical Research

The Context for This Discussion

• On March 3, 2014 I will begin as the first Associate Director of the NIH devoted to data science

• I am giving up tenure and the sun because I believe this is the right time for change

• The change that I will try to instill at NIH and beyond is that of a Digital Enterprise

http://www.nih.gov/news/health/dec2013/od-09.htm

Page 3: PSB2014 A Vision for Biomedical Research

What Do I Mean By the Digital Enterprise?

An organization that succeeds by maximizing the use of its digital assets to achieve its goals

Page 4: PSB2014 A Vision for Biomedical Research

Why the Digital Enterprise Now?

• Biomedical research is increasingly digital – the talk of “Big Data” is one manifestation

• Fulfillment of the NIH mission (among others) will increasingly be tied to actions taken on digital data across boundaries

• History already has lessons to teach us to make the job easier

Page 5: PSB2014 A Vision for Biomedical Research

Actions on Data Imply:

• Ensuring data quality and hence trust
• Making data sustainable
• Making data open and accessible
• Making data findable
• Providing suitable metadata and annotation
• Making data queryable
• Making data analyzable
• Presenting data so as to maximize its value
• Rewarding good data practices

Page 6: PSB2014 A Vision for Biomedical Research

Boundaries on Data Imply:

• Working across biological scales
• Working across biomedical disciplines
• Working across basic and clinical research and practice
• Working across institutional boundaries
• Working across public and private sectors
• Working across national and international borders
• Working across funding agencies

Page 7: PSB2014 A Vision for Biomedical Research

Where to Start?

An external advisory group provided a valuable blueprint for what should be done

http://acd.od.nih.gov/Data%20and%20Informatics%20Working%20Group%20Report.pdf

Page 8: PSB2014 A Vision for Biomedical Research

Blueprint Recommendations

• Promote central and federated catalogs
– Establish minimal metadata framework
– Tools to facilitate data sharing
– Elaborate on existing data sharing policies
• Support methods and applications
– Fund all phases of software development
– Leverage lessons from National Centers
• Training
– More funding
– Enhance review of training apps
– Quantitative component to all awards
• On campus IT strategic plan
– Catalog of existing tools
– Informatics laboratory
– Ditto big data
• Sustainable funding commitment

Page 9: PSB2014 A Vision for Biomedical Research

What is Under Way?

• Now:
– Data centers (under review)
– Data science training grants (call Q1 14)
– Pilot data catalog consortium (call out)
– Genomic Research Data Alliance (being finalized)
– Piloting “NIH-drive”
• In Year One:
– Extended public-private programs specifically for data science activities
– Interagency activities
– International exchange programs
– Programs for better data descriptions
– Reward institutions/communities
– Policies to get clinical trial data into the public domain

Page 10: PSB2014 A Vision for Biomedical Research

Longer Term Strategy: Support for The Research Lifecycle

IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION

[Diagram labels: Authoring Tools; Lab Notebooks; Data Capture; Software Repositories; Analysis Tools; Visualization; Scholarly Communication; Commercial & Public Tools; Git-like Resources By Discipline; Data Journals; Discipline-Based Metadata Standards; Community Portals; Institutional Repositories; New Reward Systems; Commercial Repositories; Training]


Page 14: PSB2014 A Vision for Biomedical Research

Back Pocket Slides

Page 15: PSB2014 A Vision for Biomedical Research

The Role of Associate Director for Data Science

1. provide broad trans-NIH programmatic leadership in the area of data science;
2. lead long-term NIH strategic planning in areas of data science;
3. provide oversight of the BD2K Initiative;
4. establish and nurture a trans-NIH intellectual and programmatic ‘hub’ for coordinating and enhancing data science activities;
5. coordinate with data science activities beyond NIH (e.g., other government agencies, other funding agencies, and the private sector);
6. play a major role in data sharing policy development and oversight at NIH; and
7. interact with the Chief Information Officer, NIH to generate synergy between BD2K and the Infrastructure Plus program.

Page 16: PSB2014 A Vision for Biomedical Research

Strategy

• Use the Blueprint as a starting point
• Work with ICs to determine science drivers
• Define developments needed for these drivers
• Look for commonalities across ICs – make those a priority
• Manage and enable emergent developments
– Data catalog – used to define the minimal data description and a home for domain definitions
– Centers of excellence – test beds and exemplars for best practices

Page 17: PSB2014 A Vision for Biomedical Research

Ways to Sell the NIH Data Science Vision

• Developed in response to well recognized scientific needs
• Support for the complete research lifecycle – this is more than just data
• Simple and well understood by all stakeholders (i.e., branded)
• A shared vision
• As ubiquitous as TCP/IP is to the Internet – a backbone for the digital enterprise
• To data what PLOS is to knowledge – a movement that people believe in and get behind
• An app store for the research enterprise

Page 18: PSB2014 A Vision for Biomedical Research

General Features of NIH Data Science

• Lightweight metadata standards
• Data & software registries
• Expanded policies on data sharing, open source software
• Training programs & reward systems
• Institutional incentives
• Private sector incentives
• Data centers serving community needs
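
To make "lightweight metadata standards" and "data & software registries" a little more concrete, here is a minimal Python sketch of one way such a registry could look. It is illustrative only: the `DatasetRecord` fields, the `Registry` class, and the sample identifiers are all hypothetical and are not drawn from any NIH specification or existing catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical minimal description of one dataset (field names are assumptions)."""
    identifier: str
    title: str
    description: str
    keywords: tuple = ()
    license: str = "unspecified"


class Registry:
    """Toy in-memory registry; a real catalog would be persistent and likely federated."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # Reject duplicates so every identifier resolves to exactly one record.
        if record.identifier in self._records:
            raise ValueError(f"duplicate identifier: {record.identifier}")
        self._records[record.identifier] = record

    def find(self, term):
        """Return records whose title, description, or keywords contain the term."""
        needle = term.lower()
        return [
            r for r in self._records.values()
            if needle in r.title.lower()
            or needle in r.description.lower()
            or any(needle in k.lower() for k in r.keywords)
        ]


# Placeholder records, invented purely for illustration.
registry = Registry()
registry.register(DatasetRecord("ds-001", "Example expression profiles",
                                "Placeholder record for illustration", ("genomics",)))
registry.register(DatasetRecord("ds-002", "Example imaging set",
                                "Placeholder record for illustration", ("imaging",)))

hits = [r.identifier for r in registry.find("genomics")]
print(hits)
```

The point of the sketch is only that a handful of required fields is enough to make a dataset findable by a simple query; which fields a real minimal description would require is left open in the slides.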