BioSense Program: Scientific Collaboration
DESCRIPTION
BioSense is an all-hazards surveillance program for achieving near real-time national public health situation awareness and early detection. Prospective anomaly detection methods such as the Modified EARS C2 are commonly adapted and used in BioSense and other public health syndromic surveillance systems. These methods, however, can produce an excessive false alert rate. Analysis results will be presented on the combined use of retrospective (e.g., Change Point Analysis, or CPA) and prospective (e.g., C2) anomaly detection methods. This combined approach will help detect sudden aberrations in addition to subtle changes in local trends, help rule out alarm investigations, and assist with retrospective follow-ups. Examples of the utility of this combined approach, developed in collaboration with the scientific community, are applied to BioSense emergency department visits due to influenza-like illness (ILI). Methods, limitations, future work, and an invitation to the scientific community to collaborate with us will be discussed in this talk.
TRANSCRIPT
BioSense Program: Scientific Collaboration
Division of Healthcare Information (DHI)
Public Health Surveillance Program Office (PHSPO)
Office of Surveillance, Epidemiology, and Laboratory Services (OSELS)
Centers for Disease Control & Prevention (CDC)
Zhiheng (Roy) Xu, MS (PhD Candidate), Senior Research Scientist
Soyoun Park, MS (PhD Candidate), Statistician
Paul C. McMurray, MDS, Senior Statistician
Taha A. Kass-Hout, MD, MS, Deputy Director for Information Science and BioSense Program Manager
Any views or opinions expressed here do not necessarily represent the views of the CDC, HHS, or any other entity of the United States government. Furthermore, the use of any product names, trade names, images, or commercial sources is for identification purposes only, and does not imply endorsement or government sanction by the U.S. Department of Health and Human Services.
The 2010 Joint Statistical Meetings (JSM)
Defense and National Security: Disease Surveillance
Monday, August 2nd, 2010: 10:30 AM-12:20 PM, Room: CC-10 (East)
Vancouver, British Columbia (Canada)
BioSense Updated Vision
… provide multi-purpose value in timely data for national public health situation awareness, routine public health practice, improving health outcomes and public health, and monitoring healthcare quality
Data Sources
Civilian Hospitals
• ~640 facilities [~12% ED coverage in US, patchy geographic coverage] [Chief complaints: median 24-hour latency; diagnoses: median 6-day latency]
• 8 health departments sending data from 482 hospitals
• 165 facilities reporting ED data directly to CDC or a health department
Veterans Affairs and Department of Defense
• ~1400 facilities in 50 states, District of Columbia, and Puerto Rico [final diagnosis ~2->5 days latency]
National Labs [LabCorp and Quest]
• 47 states, the District of Columbia, and Puerto Rico [24-hour latency]
Hospital Labs
• 49 hospital labs in 17 states/jurisdictions [24-hour latency]
Pharmacies
• 50,000 (27,000 active) in 50 states [24-hour latency]
The Problem
Early Event Detection
Monitoring Health-Related Events and Maintaining Situation Awareness
Level 1: Perception of Elements in Current Situation
Level 2: Comprehension of Current Situation
Level 3: Projection of Future Status
Decision
Performance of Actions
Situation Awareness
“Raw” Data
Clinical and ER Feeds, OTC Data, Animal Data, Absentee Data, News Feeds, Resource Status (need CDC examples)
Interpretation
Detection Algorithms and alerts
Visualization
Collection sources (WHO, OIE)
SME Interpretation and Collaboration (Epi-X, ProMed, etc.)
Planning and Simulation
Knowledge of Interventions
Disease outbreak Modeling
Historical Data Analysis
Capacity and Resource Planning
Effects from Actions
Current State
Monitoring
Detection
Outbreak Management
Remediation
Retrospective Analysis
Outbreak Cycle
Biosurveillance: Methods and Case Studies, eds. Kass-Hout, T. and Zhang, X., CRC Press, Taylor & Francis LLC. September 2010.
Complementary Analytic Methods
The data
  Available data from the most recent day(s) may be unstable due to incomplete reporting and delays
  Instability of daily data: 2-3 day trends not consistently borne out by subsequent observations
  Reporting latency of 1-3+ days
The analytic methods [complementary approach]
  Detect major changes using the Modified Early Aberration Reporting System (EARS) C2 method
  • Find abnormalities in daily data
  Detect more subtle changes using the Change Point Analysis (CPA) method
  • Detect the series mean-shifts in historical data
    o Alternatives to the mean-shift model are currently being explored with the community
  Fill in the incomplete data with forecasting
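The slides do not spell out the Modified EARS C2 used in BioSense, so the following is only a minimal sketch of the commonly described C2 form: each day's count is compared with the mean and standard deviation of a 7-day baseline that ends 2 days earlier (the guard band), and an alert is raised when the standardized difference exceeds 3. The function name, the parameter defaults, and the min_sd floor are assumptions made for illustration, not details taken from the talk.

```python
import numpy as np

def c2_alerts(counts, baseline=7, guard=2, threshold=3.0, min_sd=0.5):
    """C2-style detector (illustrative assumptions, not the BioSense variant).

    Returns (z, alerts): the standardized statistic per day (NaN where no
    full baseline is available) and a boolean alert flag per day.
    """
    x = np.asarray(counts, dtype=float)
    z = np.full(x.size, np.nan)
    alerts = np.zeros(x.size, dtype=bool)
    for t in range(baseline + guard, x.size):
        # Baseline window ends `guard` days before day t.
        window = x[t - guard - baseline : t - guard]
        mu = window.mean()
        # Floor the standard deviation so a flat baseline does not divide by zero.
        sd = max(window.std(ddof=1), min_sd)
        z[t] = (x[t] - mu) / sd
        alerts[t] = z[t] > threshold
    return z, alerts

# Hypothetical daily ED visit counts with a spike on the last day.
daily_counts = [10, 11, 9, 10, 12, 10, 11, 10, 9, 30]
z, alerts = c2_alerts(daily_counts)
print(np.flatnonzero(alerts))
```

Because the baseline is short and assumed stable, a rule like this reacts to single-day spikes but says little about gradual shifts in the mean, which is the gap CPA is meant to cover.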
Open-Access Scientific Collaboration
https://sites.google.com/site/changepointanalysis
58 Collaborators, > 100 users from 46 cities
Change Point Analysis (CPA)
Purpose
  CPA aims at detecting any change in the mean of a process (e.g., time series)
Benefits
  Detect change in historical data
  Investigate what might have caused the change
  Real-time trend analysis
Example
  Did a change in % influenza-like illness (ILI) occur?
  Did more than one change occur?
  When did the changes occur?
  • Since the last change, is influenza activity going up, down, or stable?
  How confident are we that the change is a real one?
Change Point Analysis
A change point indicates that the series mean shifts from its previous value to another. The green piecewise-constant lines represent the mean shifts.
Change Point Analysis
Determine the series mean
Accumulate the running sum of differences between the mean and individual values [residuals]
Plot the cumulative sum of the residuals [CUSUM] for the time series
The point farthest from 0 denotes a change point (CP)
Break into two sections at the CP: analyze each subseries for additional significant CPs, and repeat the process
Bootstrapping provides us with a measure of the CP's significance
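$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
$$S_0 = 0, \qquad S_i = S_{i-1} + (X_i - \bar{X}), \quad i = 1, \dots, n$$
$$CP = \arg\max_{1 \le i \le n} |S_i|$$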
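The CUSUM steps above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the implementation used in BioSense: the function names are hypothetical, and the significance measure follows the reordering bootstrap described in the Taylor reference (the observed range of the cumulative sums is compared with the range obtained after randomly shuffling the series).

```python
import numpy as np

def cusum(x):
    """Cumulative sums of residuals, S_0..S_n, with S_0 = 0."""
    x = np.asarray(x, dtype=float)
    return np.concatenate(([0.0], np.cumsum(x - x.mean())))

def single_change_point(x, n_boot=1000, seed=0):
    """Return (cp, confidence) for the strongest mean shift in x.

    cp is the index i (1-based) that maximizes |S_i|, i.e. the last
    observation before the estimated shift. confidence is the share of
    shuffled series whose CUSUM range is smaller than the observed one.
    """
    s = cusum(x)
    cp = int(np.argmax(np.abs(s[1:]))) + 1
    observed_range = s.max() - s.min()
    rng = np.random.default_rng(seed)
    smaller = 0
    for _ in range(n_boot):
        sb = cusum(rng.permutation(x))
        smaller += (sb.max() - sb.min()) < observed_range
    return cp, smaller / n_boot

# Hypothetical series: the mean steps from 10 to 20 after the 20th value.
series = [10] * 20 + [20] * 20
cp, confidence = single_change_point(series)
print(cp, confidence)
```

Shuffling destroys any ordering in the data, so a shuffled series rarely produces a CUSUM excursion as large as a genuine mean shift does; that is what makes the shuffled ranges a usable reference distribution.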
Initial Time Series
Level 1: Apply CPA to find a change point maximizing |S|
Level 2: Apply CPA to find a change point on each sub-series
Repeat the algorithm until no more change points are detected
Level n: Final result
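The level-by-level procedure amounts to recursive binary segmentation. The sketch below is one way to write it, assuming a shuffling bootstrap as the stopping rule; the names find_change_points, min_conf, and min_len, and their default values, are illustrative assumptions rather than settings reported in the talk.

```python
import numpy as np

def _cusum_and_range(x):
    """CUSUM of residuals (with S_0 = 0) and its range max(S) - min(S)."""
    s = np.concatenate(([0.0], np.cumsum(x - x.mean())))
    return s, s.max() - s.min()

def find_change_points(x, min_conf=0.95, min_len=5, n_boot=500, seed=0):
    """Return sorted 0-based indices at which a new mean level starts."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    found = []

    def recurse(lo, hi):
        seg = x[lo:hi]
        if seg.size < 2 * min_len:
            return  # too short to split further
        s, observed = _cusum_and_range(seg)
        k = int(np.argmax(np.abs(s[1:]))) + 1  # split into seg[:k] and seg[k:]
        boot = np.array([_cusum_and_range(rng.permutation(seg))[1]
                         for _ in range(n_boot)])
        confidence = np.mean(boot < observed)
        if confidence < min_conf:
            return  # no significant change in this sub-series
        found.append(lo + k)
        recurse(lo, lo + k)   # next level: left sub-series
        recurse(lo + k, hi)   # next level: right sub-series

    recurse(0, x.size)
    return sorted(found)

# Hypothetical series with three mean levels.
series = [10] * 30 + [20] * 30 + [5] * 30
print(find_change_points(series))
```

Each recursion level corresponds to one "Level" in the slide: the first call finds the strongest shift in the full series, and the recursion stops on a branch once the bootstrap no longer supports a change.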
Complementary Methods
Aberration detection methods are generally better at detecting isolated or grouped abnormalities (assumption: the mean is stable), while CPA is better at detecting subtle changes that aberration methods may miss (assumption: the mean is unstable). We use both methods in a complementary fashion so that sudden aberrations and subtle shifts in trend are both detected.
Open Access Scientific Collaboration: Explore Alternative Methods & Address Limitations
Bayesian CPA
  Weak prior
  Posterior distributions of the change points
  Example: R package bcp
Structural change model
  Minimize the sum of squared residuals
  Advantage:
  • Allows for auto-correlated time-series data
  Disadvantage:
  • Assumes a stationary process
  Asymptotic distribution for change points
  Example: R package strucchange
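To make "minimize the sum of squared residuals" concrete, here is a deliberately simplified sketch: a single break in the mean, found by trying every admissible split and keeping the one with the smallest total squared error. It is not the strucchange interface and it omits regression terms, multiple breaks, and the asymptotic confidence intervals; the function name and the min_len parameter are assumptions for illustration.

```python
import numpy as np

def least_squares_break(y, min_len=5):
    """Return (k, ssr): the split y[:k] | y[k:] with the smallest total
    sum of squared residuals around the two segment means."""
    y = np.asarray(y, dtype=float)
    best_k, best_ssr = None, np.inf
    for k in range(min_len, y.size - min_len + 1):
        left, right = y[:k], y[k:]
        ssr = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if ssr < best_ssr:
            best_k, best_ssr = k, ssr
    return best_k, best_ssr

# Hypothetical series: the mean steps from 1 to 4 at index 10.
k, ssr = least_squares_break([1.0] * 10 + [4.0] * 10)
print(k, ssr)
```

The exhaustive search is quadratic in the series length, which is acceptable for daily surveillance series of a few hundred points.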
Alternative methods to mean-shift model
Autocorrelation in biosurveillance data
References
Bai, J. Estimation of a change point in multiple regression models. Review of Economics and Statistics, 79: 551-563, 1997.
Bai, J. and Perron, P. Computation and analysis of multiple structural change models. Journal of Applied Econometrics, 18: 1-22, 2003.
Erdman, C. and Emerson, J. W. bcp: An R package for performing a Bayesian analysis of change point problems. Journal of Statistical Software, 23 (3): 1-13, 2007.
Tokars, J., et al. Enhancing time-series detection algorithms for automated biosurveillance. Emerging Infectious Diseases, 15 (4): 533-539, 2009.
Taylor, W. A. Change-Point Analysis: A Powerful New Tool for Detecting Changes. Retrieved from http://www.variation.com/anonftp/pub/changepoint.pdf
Taha A. Kass-Hout, MD, MS
Deputy Director for Information Science and BioSense Program Manager
Division of Healthcare Information (DHI)
Public Health Surveillance Program Office (PHSPO)
Office of Surveillance, Epidemiology, & Laboratory Services (OSELS)
Centers for Disease Control & Prevention (CDC)
1600 Clifton Road, NE, MS E-51, Atlanta, GA 30329
Thank YOU!
Follow BioSense on Twitter
Join BioSense on Facebook
Data Sources
As of May 2010
Hospital Data                  N     Direct Reporting Hospitals   Health Departments
Civilian Hospitals             640   162                          478
Outpatient reason for visit    93    91                           2
Outpatient final diagnosis     70    68                           2
ED chief complaint             640   162                          478
ED final diagnosis             207   76                           131
Inpatient reason for admit     120   118                          2
Inpatient final diagnosis      79    77                           2
Census                         100   99                           1
ED clinical                    119   50                           69
Laboratory results             66    64                           2
Radiology results              33    31                           2
Pharmacy orders                35    34                           1
VA final diagnosis             887
DoD final diagnosis            368