TRANSCRIPT
October 29th, 2014 University of Pennsylvania
TIES Cancer Research Network Y2 Face to Face Meeting U24 CA 180921 Session II Project Status Updates
Morning
Welcome
8:00-8:15 Goals for the Meeting, Overall Summary of Progress in Y1; Y2 Project Plan (Crowley-Jacobson)
Project Status
8:15-8:30 University of Pittsburgh Update (Crowley-Jacobson)
8:30-9:00 University of Pennsylvania Update (Feldman)
9:00-9:30 Georgia Regents University (Bollag)
9:30-10:00 Roswell Park Cancer Institute (Gaudioso)
10:00-10:30 Demonstration of Multi-Site Search; Discussion (Chavan and Crowley-Jacobson)
10:30-10:45 Break
10:45-11:15 Policies and Processes Subcommittee: Completed Work and Next Steps (Crowley-Jacobson, Bollag, Weaver, Murphy)
Pilot Projects
11:15-12:00 National Mesothelioma Virtual Bank (Amin)
12:00-12:15 Other emerging pilot projects (Feldman, Crowley-Jacobson)
12:15-12:45 LUNCH
Release of TIES 5.0
New coding and indexing infrastructure
• Replaced MMTX with our own NobleCoder
• Redesigned database/coding interface with JMS to support hundreds of coders and millions of documents
• Support for multiple document types, including customized dictionaries for improved coding performance
• Bundled with Pathology and Radiology dictionaries, but easy to build custom dictionaries with the NobleTools package included with NobleCoder
• Switched from the NegEx JAPE implementation to ConText (a more sophisticated algorithm)
• Migration to a later version of Lucene (4.5)
Easier installation
• All prerequisite software bundled with the TIES software in one easy-to-use installer
• New Linux VM snapshot of a fully configured TIES node
• Installer rewritten to run as a Java Web Start app or as a standalone application. IzPack installer retired. New installer allows flexibility for asymmetric key signing at the point where new provider nodes join the network
Release of TIES 5.0
More search features
• Search for patients having specific characteristics across report types. Easily do Quality Improvement studies using the Temporal Querying feature of TIES.
• Specify different sections for different document types across different institutions, all in the same query.
Improved User Interface
• Dashboard query mode has spell check and auto lookup for best matching concepts for your search terms.
• New Filters palette on Diagram mode to easily add new filters to your query.
Better Administration
• Supports highly granular user and study authorization management. Approve or reject at study level or at the individual user level for any research study.
• Can now manage hundreds of users and studies.
• Administrators can restrict specific research studies to only certain report types.
Support for TIES Cancer Research Network
• Customized installer to automatically register with TCRN
• TCRN Hub hosted at Pitt
NobleCoder Performance
Release of TIES 5.2 (November, 2014)
• Upgraded to latest Java, Tomcat and MySQL
• Support for regulatory workflow, including file attachments to studies, email notifications, etc.
• Manual de-identification support
• Speed improvements for Administrator UI
• Numerous bug fixes
Next Release (Summer 2015)
• Investigating other de-identification options
• Tooling to upgrade local VM installation to TCRN node
• Other features to incorporate your priorities
http://ties.pitt.edu
• Live Demo
• Feature descriptions
• Extensive and new Documentation
• Policies and Processes
• Security Documentation
• Manual and videos
• Resources for local dissemination efforts
History
Thank you for using SourceForge!
University of Pennsylvania Abramson Cancer Center TCRN Update
Michael Feldman, MD, PhD
Database
• Up and running latest code set
• Records coded: 750K
  – Active coding via periodic data from interface engine, not direct HL7 feed
  – Catching all surg path cases from 3 hospitals (60K/yr)
Penn TIES Website
• User enrollment and authorization
• Rules of the road
• Documents: AUA, MTAs…
• Link to instructional videos
• Information related to TIES and our instance
Planned Roll out
• Start rolling out within Pathology
  – 30-40 users in translational medicine
  – Surgical pathologists, basic scientists, translational scientists
• Expand to cancer center
• Incorporate into SOM Biobank
Started using queries
• Query 1: Find all cases of patients with “indefinite for dysplasia in the esophagus” who have another esophagus biopsy at any time
  – Could not query for cases which did not have dysplasia prior to or at the time of the index lesion
  – Suggests
    • User community to share experience, maybe post queries
    • Further UI enhancements?
Queries
• Find all non-transplant liver biopsies (“medical liver biopsies”) diagnosed with steatosis or steatohepatitis
  – Straightforward query
  – Wanted to get out list of cases and link to other medical records data
    • Current tool allows export of data in honest broker view, but not easy to add cases to found set (must be done a few at a time)
Research study: Rosai Dorfman Disease
• Rosai Dorfman disease is an idiopathic reactive condition characterized by exuberant macrophage reaction in lymph nodes or soft tissue
• Etiology is unknown, but some studies have implicated a virus, Herpes virus 6, in some cases
• Pathochip is a microarray technology
  – All known pathogenic virus and bacteria and fungi arrayed
  – Allows FFPE to be probed for infectious signature in lesional tissue compared to normal controls
Future query: “Breast Papilloma study”
• Find all breast needle cores with diagnosis of a papilloma but nothing worse at the time of or before the core biopsy who then went on to a subsequent resection
  – In the resection after the papilloma core biopsy, what is the frequency of finding either in situ or invasive carcinoma?
  – Compare the carcinoma rate to carcinoma in a random core biopsy population with BI-RADS 4
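The papilloma query combines three conditions: an index finding, an exclusion at or before the index date, and a later follow-up event. The sketch below illustrates that logic in plain Python, outside TIES. The `Report` layout, the string labels, and the choice of which diagnoses count as "worse" are all hypothetical; real TIES queries work on coded concepts rather than strings.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical labels chosen for illustration only.
WORSE = {"atypia", "in situ carcinoma", "invasive carcinoma"}
CARCINOMA = {"in situ carcinoma", "invasive carcinoma"}

@dataclass(frozen=True)
class Report:
    patient: str
    when: date
    procedure: str          # e.g. "core biopsy" or "resection"
    diagnoses: frozenset

def papilloma_cohort(reports):
    """Return {patient: carcinoma_found_at_first_later_resection}."""
    by_patient = {}
    for r in sorted(reports, key=lambda r: r.when):
        by_patient.setdefault(r.patient, []).append(r)

    outcome = {}
    for patient, history in by_patient.items():
        # Index event: earliest core biopsy that reports a papilloma.
        index = next((r for r in history
                      if r.procedure == "core biopsy"
                      and "papilloma" in r.diagnoses), None)
        if index is None:
            continue
        # Exclude patients with anything worse at or before the index biopsy.
        if any(r.diagnoses & WORSE for r in history if r.when <= index.when):
            continue
        # Require a resection strictly after the index biopsy.
        later = [r for r in history
                 if r.procedure == "resection" and r.when > index.when]
        if later:
            outcome[patient] = bool(later[0].diagnoses & CARCINOMA)
    return outcome

reports = [
    Report("A", date(2010, 1, 5), "core biopsy", frozenset({"papilloma"})),
    Report("A", date(2010, 3, 1), "resection", frozenset({"invasive carcinoma"})),
    Report("B", date(2011, 2, 1), "core biopsy",
           frozenset({"papilloma", "in situ carcinoma"})),
    Report("B", date(2011, 4, 1), "resection", frozenset({"in situ carcinoma"})),
    Report("C", date(2012, 6, 1), "core biopsy", frozenset({"papilloma"})),
]
result = papilloma_cohort(reports)
upgrade_rate = sum(result.values()) / len(result)
```

In this toy data only patient A qualifies: B has carcinoma at the index biopsy and C never had a resection. The exclusion step is the part the first Penn query could not express.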
How can TIES be expanded (more corpuses)
• Coding other corpuses
  – Cytopathology: logical extension, starting to have folks be interested in examining cytology material for NGS
• Radiology – Oncology
  – Diagnosis
  – Response Criteria (RECIST): lesion changes, lymph node size, bone lesion, PET…
  – Heterogeneity: structural (CT and MRI) and genomic (PET)
• Endoscopy
• Cardiology
• Clinical Notes
Structured + NLP
• How can we link structured data with TIES?
  – Pathology
    • Synoptic reports in AP
    • Lab Med data
    • Genomics
    • Biobank
  – Oncology: tumor registry
  – Clinical trials database
  – Other disease registries
TIES Node Implementation at Georgia Regents University
Roni Bollag, MD, PhD Nita Maihle, PhD
Jennifer Irons Carrick, BS Sameera Qureshi, BS
Rahil Khan, BS
Who is GRU?
• MCG (1833–2011)
• GHSU (2011–2013)
• Now: Georgia Regents University = GHSU + ASU
One of only four public comprehensive research institutions in the state of Georgia
• Founded in 1828
  – 13th-oldest medical school
  – 8th-largest medical school
• Nine Colleges
• 9000 students (across campuses; 240 medical students/year)
• Supporting GR Health System
  – 478-bed Georgia Regents Medical Center
  – 154-bed Children's Hospital of Georgia
Our Team
Repository Side:
• Jinni Carrick – Biorepository Registrar
• Sameera Qureshi – Biorepository Lab Associate
• Rahil Khan – Biorepository Lab Assistant
• Denise Harper – Biorepository Admin. Assistant
• Roni Bollag – Interim Biorepository Director
• Samir Khleif – Cancer Center Director
• Nita Maihle – Cancer Center Liaison for TIES
Our Team (cont'd)
IT Side:
• Mia Jolly – Business Analyst/Project Coordinator
• Pankhil Patel – DBA
• Latoya Butler and Lina Patel – Systems Analyst/Interface Team
• Kimberley Hardy – Network Engineer
• Craig Huff – System Engineer
• Jason Rote and Angela Long – IT Security Analysts
• Colleen Cain – Director Enterprise Application Systems
• Charles Busbee – Manager Database & Application Administration
• Michael Casdorph – Associate VP Academic & Research Technology
GRU Cancer Center
• Mission – To reduce the burden of cancer in the State of Georgia and across the globe through superior care, innovation, and education.
• Vision – To be a global leader in cancer clinical care, discovery innovation, translational research and professional education and public awareness.
GRU and BRAG-Onc Biorepository
• Established in 2005 to provide a centralized service for biospecimen procurement and distribution to support basic and translational research.
• GRU also serves as a central repository for the statewide network, the Bio-Repository Alliance of Georgia for Oncology (BRAG-Onc).
• Collects and stores specimens under standardized conditions, with accompanying clinical and demographic information.
• Supported by a web-accessible database for inventory management and annotation
• Supported by a long term storage facility with back-ups for cryo-preservation of biospecimens.
Total accessions: ~11,000 (2,560 BRAG-Onc)
Specimens: ~40,000
We’re thrilled to be part of TCRN!
• EHR = Cerner Millennium / PathNet (switched from CoPath in 2005)
• Biorepository Database: TissueMetrix
  – Possibly soon to convert to Encore (in implementation for Clinical Trials)
• Mostly non-functional for translational research applications
GRU Pathology
• 12,000-14,000 surgical specimens/year
• Since 2000 (target for retrospective data capture): 255,000 reports
• MCG surgical emphases: endocrine oncology, urologic oncology, gynecological oncology
IRB Approval
Our Progress
• Alliance with GRU IT
• GRU IRB completed
• Subcontract execution
• Software installed
  – de-identification software (De-ID)
  – TIES on Administrators’ CPUs
• Server built and configured in secure setting
• Port 80 configured and access open for external GRU communications
• HL7 Interface written and tested
  – 2500+ reports have been transmitted to TIES via HL7
• Outstanding Tasks:
  – Historical Data Load
• GRU Biorepository Team added to caTIES
• Added to TIES-TCRN study
Our Progress (cont'd)
• Reports Downloaded (est.)
Looking Forward to the Future!
Thank you from GRU!
TIES Cancer Research Network U24 CA 180921
Year 2 Face to Face Meeting Roswell Park Cancer Institute
Site Update Carmelo Gaudioso October 29, 2014
RPCI Progress Report Summary
• RPCI TIES node setup
• Data, loading of pathology reports
• De-Id validation
• TIES performance testing
• Policy & Procedures
• Feature Requests
• Tasks for Year Two
RPCI TIES Node Setup
• Completed RPCI node setup
• Established process for data loading
• Set up TIES Public and Private servers
  – Data with PHI stored only in private server
• Established Internet access to RPCI node on port 80
• Validate De-Id
• Created a post processing script to identify and quarantine any reports with patient names that were missed by De-Id
Data
• Loaded a total of 156,555 reports
• Time Span: 1997-2013
De-Id Validation
• Validated De-Id performance by following UPMC procedures
  – Tested a sample of 725 reports from years 1997 and 2002
  – Results
    • 11 reports (1.5% of the 725 sampled) with missed PHI were identified and quarantined
    • 27 reports were over-scrubbed
• Wrote a post processing script to address missed PHI
TIES Query Performance Testing
• Low/Moderate Complexity Queries
  – Patients with medullary carcinoma in thyroid gland.
  – Patients with adenocarcinoma in brain.
  – Men with invasive ductal carcinoma of the breast.
  – Patients >60 with Hodgkin's disease.
  – Patients, 40-60, with tubulovillous adenoma and adenocarcinoma in colon or rectum.
(Rebecca Crowley et al., 2009)
TIES Query Performance Testing
• High Complexity Queries
  – Patients with both schwannomas and meningiomas.
  – Patients with colonic adenocarcinoma who also have had invasive ductal carcinoma of breast.
  – Patients with renal carcinoma in kidney tissue who also have lung tissue with metastatic renal cell carcinoma.
(Rebecca Crowley et al., 2009)
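The high complexity queries differ from the simpler ones in that each patient must satisfy several criteria, usually in different reports. As a rough illustration of that "all criteria, any report" logic, here is a small Python sketch. The `(patient, organ, diagnosis)` tuple layout and the function name are assumptions made for the example and do not reflect how TIES stores or searches coded reports.

```python
from collections import defaultdict

def patients_matching_all(reports, criteria):
    """Return patients with at least one report satisfying each criterion.

    reports:  iterable of (patient_id, organ, diagnosis) tuples (hypothetical layout)
    criteria: list of (organ, diagnosis) pairs; organ=None means any site
    """
    matched = defaultdict(set)  # patient -> indexes of criteria matched so far
    for patient, organ, diagnosis in reports:
        for i, (want_organ, want_dx) in enumerate(criteria):
            if diagnosis == want_dx and want_organ in (None, organ):
                matched[patient].add(i)
    return sorted(p for p, hits in matched.items() if len(hits) == len(criteria))

reports = [
    ("P1", "kidney", "renal cell carcinoma"),
    ("P1", "lung", "metastatic renal cell carcinoma"),
    ("P2", "kidney", "renal cell carcinoma"),
    ("P3", "nerve", "schwannoma"),
    ("P3", "meninges", "meningioma"),
]

renal_with_lung_mets = patients_matching_all(
    reports,
    [("kidney", "renal cell carcinoma"),
     ("lung", "metastatic renal cell carcinoma")],
)
both_tumors = patients_matching_all(
    reports, [(None, "schwannoma"), (None, "meningioma")]
)
```

Patient P2 has the kidney primary but no lung report, so only P1 satisfies the renal query.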
Policy & Procedures
• TCRN
  – Validation of De-Identification of Pathology Reports
  – Auditing of TCRN Users
  – Recommendation of Member Institutions Establishing Approval Committees for external Users
• RPCI
  – Validation of De-Id of RPCI pathology reports
  – RPCI user access to TIES
  – Honest Broker service
  – Data and biospecimen request approval
Feature Requests
• Detailed report when data loading is completed
  – Number of reports attempted to load
  – Number of reports successfully loaded
  – List of reports not successfully loaded, with reason for not being loaded
• Account lockout when incorrect credentials are entered more than a few times
• Facilitate the distinction of quarantined reports from viewable reports
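To make the first feature request concrete, the sketch below shows one possible shape for such a load report: counts of attempted and loaded reports plus a list of failures with reasons. It is an illustration only. `LoadSummary`, `load_reports`, and the `load_one` callable are hypothetical names, not part of the TIES loader.

```python
from dataclasses import dataclass, field

@dataclass
class LoadSummary:
    attempted: int = 0
    loaded: int = 0
    failures: list = field(default_factory=list)  # (report_id, reason) pairs

def load_reports(report_ids, load_one):
    """Try to load each report and collect the requested counts.

    load_one is a hypothetical callable that raises an exception on failure.
    """
    summary = LoadSummary()
    for report_id in report_ids:
        summary.attempted += 1
        try:
            load_one(report_id)
            summary.loaded += 1
        except Exception as exc:
            # Record the reason and keep going instead of aborting the batch.
            summary.failures.append((report_id, str(exc)))
    return summary

def fake_loader(report_id):
    # Stand-in loader for the example: rejects one report.
    if report_id == "R2":
        raise ValueError("missing accession date")

summary = load_reports(["R1", "R2", "R3"], fake_loader)
```

Here `summary` records three attempts, two successes, and one failure for `R2` with its reason.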
Tasks for Year Two
• Complete De-Id validation across all data
• Enhance the post processing script to check for accession number and dates
• Electronically mark the reports identified through the post processing script as quarantined
• Review, scrub and release all quarantined reports
Tasks for Year Two
• Test TIES query performance within RPCI environment
• Finalize RPCI internal policy and procedures
• Develop and implement TIES user training
• Establish an RPCI TCRN Request Approval Committee
• Build RPCI TIES website
• Initiate pilot TCRN research project
• Automate prospective pathology report loading into TIES
Acknowledgements • Dr. Carl Morrison • Monica Murphy • Mayur Sakthivel • Amanda Rundell • Karin Hojczyk • Thomas Bertucci • IT Security & Server Teams • IRB and Legal Departments
• Rebecca Jacobson • Girish Chavan • Eugene Tseytlin • Kevin Mitchell • Elizabeth Legowski
Live Demonstration of search across all our institutions
What did we do?
- Protocol created at Pittsburgh
- Rebecca and Girish are users
- Each of the three partner sites signed on as a ‘data provider’
Searches
Aggregate Level – Breast Cancer
Record Level – MPNST
Regulatory Processes and Policies Group
• Roni, Jinni, Monica, JoEllen, Rebecca, Liz and Girish (representing development group)
• Meeting every other week for past 9 months.
• Develop relevant recommendations, processes and policies, which are forwarded to Steering Committee for voting
• Create functional requirements based on these policies that are implemented in the TIES software and TCRN forms and supporting materials.
• Share results of significant regulatory related activities at sites (e.g. de-identification QA)
• Act as communication channel for information coming from IRBs, OORs and other institutional officials
Work to date - 1
• De-identification quality assurance policy (Murphy, Lead)
  – A set of policies surrounding the validation of the de-identification of the pathology reports using the De-ID software program.
  – Once system is validated, defines continuous QC process, reporting and Network oversight and monitoring
• Recommendation for Approval Bodies (Bollag, Lead)
  – Recommends process that each site must establish for approving external users.
  – Final determination will likely involve institutional officials and stakeholders
• Auditing (Weaver, Lead)
  – Defines responsibilities of sites in auditing user accounts to ensure all users are valid
  – Defines responsibilities of sites in auditing user queries to identify risks for reidentification
  – Defines responsibilities of sites in auditing usage of system
Work to date - 2
• Template IRB language (Legowski, Lead)
  – Provides example IRB language developed in collaboration with Pitt IRB for projects that will use the TCRN
  – Included agreement of Pitt IRB that protocols using the agreed upon forms and language will be ‘administratively reviewed’.
• Step Up requests (Legowski, Lead)
  – Created initial conceptual flow diagram for regulatory requirements needed at each step of process from ‘data preliminary to research’ to tissue transfer
  – Will
• Forms for TCRN requests
  – Implemented in new set of forms that can be used with our portal or you can link to these forms from your own
  – http://ties.dbmi.pitt.edu/request-an-account
Administrator adding institutional requests to protocol
TIES Admin receives alert for pending request
Administrator processes request
De-Identification Validation
Performed on 2 yrs of reports: 1997 & 2002
• Online calculator used to determine number of reports to be QC’d
• 95% confidence level with 5% error rate was approved by IRB
• Built-in randomization schema utilized when selecting reports

Report Year | Total # Rpts Loaded | # Rpts QC | Pass | Fail no PHI (#/%) | Fail with PHI (#/%) | Over scrubbing (#/%)
1997 | 5728 | 360 | 323 | 2 / 0.5% | 9 / 2.5% | 26 / 7.2%
2002 | 7423 | 365 | 360 | 2 / 0.5% | 2 / 0.5% | 1 / 0.3%

Too many PHI failures
Further removal of PHI is required above and beyond the standard functionality of De-ID software.
Post Processing Scripts being written and performed
• Patient Names
• Pathology Accession Numbers
• Dates
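The slide does not say which online calculator or formula was used to pick the number of reports to QC. As an illustration only, the sketch below assumes Cochran's sample size formula with a finite population correction, a 95% confidence level (z = 1.96), a 5% margin of error, and the most conservative proportion p = 0.5. The function name and the rounding to the nearest whole report are also assumptions.

```python
def qc_sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Sample size for a proportion, corrected for a finite population.

    Assumed formula (not confirmed by the slides):
      n0 = z^2 * p * (1 - p) / margin^2
      n  = n0 / (1 + (n0 - 1) / population)
    """
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    return round(n0 / (1 + (n0 - 1) / population))

sample_1997 = qc_sample_size(5728)
sample_2002 = qc_sample_size(7423)
```

Under these assumptions the formula gives 360 reports for 1997 and 365 for 2002, the same counts as the "# Rpts QC" column, which suggests the calculator used something similar.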
De-Identification Validation
PPS – Check Names
The post processing script takes the patient first name, last name and de-identified pathology reports from the TIES Public database and does a case-insensitive search to check for the patterns below:
• Does the path report contain the first name or the last name as a whole word?
• Does the path report contain names in the format first name + last name, for example johndoe?
• Does the path report contain the first name concatenated with the word “name”, for example namejohn?
Results
• Positive reports quarantined, manually scrubbed and released

Report Year | Total # Rpts Loaded | # Rpts Identified | True Positive (#/%)
1997 | 5728 | 59 | 25 / 0.4%
2002 | 7423 | 153 | 40 / 0.5%
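The three name checks described above can be approximated with regular expressions. This is a minimal sketch and not the RPCI script: the function names are hypothetical, and it assumes the first and last name are available as plain strings.

```python
import re

def name_patterns(first, last):
    """Build the three case-insensitive checks described for the name PPS."""
    first, last = re.escape(first), re.escape(last)
    return [
        re.compile(rf"\b(?:{first}|{last})\b", re.IGNORECASE),  # whole word
        re.compile(rf"{first}{last}", re.IGNORECASE),            # e.g. johndoe
        re.compile(rf"name{first}", re.IGNORECASE),              # e.g. namejohn
    ]

def has_missed_name(report_text, first, last):
    """True if any pattern matches, meaning the report should be quarantined."""
    return any(p.search(report_text) for p in name_patterns(first, last))

flag_whole_word = has_missed_name("Specimen received from John.", "John", "Doe")
flag_concatenated = has_missed_name("Label reads JOHNDOE breast core", "John", "Doe")
flag_name_prefix = has_missed_name("PatientNameJohn left breast", "John", "Doe")
flag_substring = has_missed_name("Johnson criteria not met", "John", "Doe")
```

A whole-word match will still flag names that are also ordinary words, which is consistent with the table above, where fewer than half of the flagged reports were true positives. That is why the flagged reports go to manual review rather than being scrubbed automatically.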
De-Identification Validation
PPS – Check Accession Numbers
Accession numbers have the format S-11-11111. They are sometimes referred to in the pathology reports as 11-11111, and these instances are missed by the De-ID software. The plan is to use the post processing script to take the accession numbers from the TIES Public database and check if they are contained in the pathology reports after ignoring the leading ‘S’.
• Script to be written
PPS – Check Dates
When the reports have dates of the format month/year, for example 12/2014, they are missed. This will be handled after handling the accession numbers.
• Script to be written
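Both scripts were still to be written at the time of the meeting, so the following is only a sketch of what the plan describes. It assumes the accession number is available as a string such as `S-11-11111` and that month/year dates use a four-digit year. The function names and regular expressions are illustrative choices, not the eventual RPCI implementation.

```python
import re

# month/year such as 12/2014 or 3/2015, not part of a longer date or number
MONTH_YEAR = re.compile(r"(?<![\d/])(0?[1-9]|1[0-2])/(\d{4})(?![\d/])")

def accession_leak(report_text, accession):
    """True if the accession number appears without its leading letter prefix."""
    short_form = re.sub(r"^[A-Za-z]+-", "", accession)  # "S-11-11111" -> "11-11111"
    pattern = rf"(?<!\d){re.escape(short_form)}(?!\d)"
    return re.search(pattern, report_text) is not None

def month_year_dates(report_text):
    """Return month/year strings that look like missed dates."""
    return [m.group(0) for m in MONTH_YEAR.finditer(report_text)]

leak = accession_leak("Compare with prior case 11-11111.", "S-11-11111")
no_leak = accession_leak("Block 211-111115 submitted.", "S-11-11111")
dates = month_year_dates("Biopsy 12/2014, follow-up 3/2015, ratio 1/20, seen 01/02/2014")
```

The digit lookarounds keep the accession check from firing inside longer numbers, and the date pattern skips full day/month/year dates, which the slides do not list among the missed formats.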
Year 2 work
• Work with IRBs to get Data Prep to Research excluded from requirement of a NHSR designation or exempt IRB
• Complete Step Up procedure and check implementation in software
• Incident Reporting Policy and Procedure
• Policy and Procedure for entry of a new institution into TCRN
• Shepherd initial pilot projects and further refine policies
Discussion Questions
• Can we remove the requirement for IRB approval for data preparatory to research across TCRN, and under what conditions?
• Can we address the current requirement for separate IRB protocols at each institution, and under what conditions?
• Can we streamline the process for approval of the Authorized User Agreement, which needs to be signed by an institutional official?
• How do we communicate with and educate users about the unique requirements of TCRN? How do we help them complete required steps quickly and efficiently?