Charting a New Course for EPA’s Information Management
Larry Fitzwater
United States Environmental Protection Agency
September 21, 2004
Overview
1. Current EPA regulatory/electronic structure
2. New directions in technology and policy
3. US EPA’s Central Data Exchange (CDX)
4. State/EPA Information Exchange Network
• CDX’s role
• Brief Technical Overview
5. Lessons/Conclusions
EPA’s IT and Regulatory Structure
US EPA
– Founded in 1970
– Now 17,500 employees, 6,000 in Washington – remainder in 10 Regional offices across the country
– $1.2 billion operating budget, additional $7 billion trust funds
– $375-$400 million annually on information technology
– National computing center in North Carolina
Most environmental statutes by “media” – e.g. air, water
– Organizational chart mimics statute-specific regulations
Many programs “delegated” to the States and Tribes
– State environmental responsibilities spread among State agencies
EPA Information Collection
EPA information collections:
•350 active “information collections” at EPA
•120 million estimated burden hours for reporting and recordkeeping
•1 million entities report to EPA, only 10% directly – rest through States, Tribes or localities
•Hundreds of people managing data systems at EPA
•About 0.5 million-1 million transactions annually
Issues – Politics and Policy
•Devolution of power to the States
•States resent EPA/Washington control/mandates
•States resent having to use old “national systems”
•States resent slow EPA system modernization
•“Come and get” the data, EPA
•States as “partners”
Issues - Technical
• Architecture Snarl:
– Different systems for different statutes and none “talk” to each other
– Different architectures
– Report one way/place here, and another there
– Redundant data systems/entry
• Integrating data
• Portraying state of environment or program effectiveness
• Portraying environmental quality across jurisdictions
• Some collections still in paper or diskette
• Little or no security, authentication, validation
• State/EPA system modernization efforts asynchronous
EPA (and State) Response
•New Enterprise-wide architecture at EPA
•Central Data Exchange (CDX)
•National Environmental Information Exchange Network (Network) with the States, Tribes, localities and eventually the private sector
EPA Enterprise Target Architecture
[Diagram. Labels recovered from the slide: Public, Industry, Non-government Partners, Government Partners and EPA Users; CONNECT and EXCHANGE (CDX Services, Exchange Networks, Intranet, Extranet, Security and Access Controls); PROCESS and STAGE (Operational Databases & Applications); STORE for USE (Enterprise Repository: Metadata Holdings Catalog, Shared Geospatial Data, Central Registries, Data Warehouse); USE (System of Access: Program Support, Public Access, Decision Support); Management Practices (Architecture, Policies, Standards, Security).]
Central Data Exchange (CDX)
Submit or Receive all EPA data via the Web
• States and others Register to be Members or “Trading Partners”
• Members Receive Access to Technical Support to use CDX
• Using the web, Members can use CDX to send all their environmental data (incoming)
• Members, especially EPA Programs, can also access CDX (outgoing) to get data sent by any other program (state or other)
• CDX Can Accept Different Data Formats from Members, e.g.
– Flat files
– XML
• Confirms Authenticity of Submitters
• Provides Documentation to Submitters that data was successfully received
• CDX Not Only Translates Data from any format, it can make sure data is of good quality
• CDX Distributes incoming data to US EPA Data Systems
Members who submit data to CDX
• Receive official copy of certification
• Documentation of adherence to Agency and Federal Data Management Requirements
• Get Access to other Members’ data
[Diagram labels: Send or Receive; Validate; Translate or Check Quality; Distribute; Archive Data; EPA Data Systems]
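The intake steps listed for CDX (accept flat files or XML, confirm the submitter, check quality, distribute, and return documentation of receipt) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how CDX is implemented: the member registry, the `facility_id` and `amount` fields, the `<record>` element and the hash-based receipt are all hypothetical.

```python
import csv
import hashlib
import hmac
import io
import xml.etree.ElementTree as ET

# Hypothetical registry of trading partners and their shared secrets.
REGISTERED_MEMBERS = {"state-agency-01": "s3cret"}


def parse_submission(payload: str, fmt: str) -> list[dict]:
    """Translate a flat file (CSV) or an XML payload into a list of records."""
    if fmt == "flat":
        return list(csv.DictReader(io.StringIO(payload)))
    if fmt == "xml":
        root = ET.fromstring(payload)
        return [{field.tag: field.text for field in rec}
                for rec in root.findall("record")]
    raise ValueError(f"unsupported format: {fmt}")


def validate(records: list[dict]) -> list[str]:
    """Return a list of quality errors; an empty list means the data passed."""
    errors = []
    for i, rec in enumerate(records):
        if not rec.get("facility_id"):
            errors.append(f"record {i}: missing facility_id")
        try:
            float(rec.get("amount", ""))
        except (TypeError, ValueError):
            errors.append(f"record {i}: amount is not numeric")
    return errors


def submit(member_id: str, secret: str, payload: str, fmt: str,
           target_system: list) -> dict:
    """Authenticate, translate, validate, distribute, and issue a receipt."""
    expected = REGISTERED_MEMBERS.get(member_id, "")
    if not expected or not hmac.compare_digest(expected, secret):
        return {"status": "rejected", "reason": "unknown member or bad credentials"}
    records = parse_submission(payload, fmt)
    errors = validate(records)
    if errors:
        return {"status": "failed_validation", "errors": errors}
    target_system.extend(records)  # "distribute" to a stand-in data system
    receipt = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"status": "received", "records": len(records), "receipt": receipt}
```

A flat-file and an XML submission with the same fields both end up as identical record dictionaries in the target list, which is the point of the translation step.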
CDX Services
Basic Data Exchange
• Node-Node Data Exchange
- Transaction Logging
- Error Handling
- Naming & Directory Services
- Security/Access Controls
- Data Translation
- Registration/authentication/authorization
- Backup/recovery
• Web User Data Exchange
- Portal
- Transaction Logging
- Error Handling
- Naming & Directory Services
- Security/Access Controls
- Registration/authentication/authorization
- Backup/recovery
• Legacy Application Integration
- Transaction Logging
- Error Handling
- Security/Access Controls
- Registration/authentication/authorization
- Data Translation
- Backup/recovery
• Non-repudiation
- PKI
- Encryption
- Archiving
• Auditing
- Archiving
Enhanced Data Exchange
• Data Reconciliation & Validation
• Notification/Alert
• Messaging
• Reporting Capabilities
• Workflow
• Chat
• PDA/Wireless
• Interfaces to Legacy Systems (EAI Middleware)
• Single Sign-on
• CBI
• CROMERR Compliance
Exchange Support Services
• Development Support
- System specifications and requirements
- XML Schema
- Standards Development
- Test plans and test results
- Data flow evaluation
- System HW/SW enhancement
- Registry/repository
• Transition Planning & Management
• Implementation, Operations & Maintenance
• Disaster Recovery Services
• Security Planning (General support systems, major applications)
Document Services
• Document Collection
• Data Entry/Data Capture
• Paper & Diskette Processing
• Data Validation, Error Check and Reconciliation
• Data Filing/Storage
Client Support Services
• Hotline technical support
• Customer service tracking and reporting
• User guides, manuals, and handbooks
• Training and Outreach on the CDX System
• Periodic customer surveys
• Client support metrics
8 Flows Below Are In Production & More Are In Development/Testing
• AQS: Air Quality System
• eBeaches
• NEI: National Emission Inventory
• PCS IDEF (Pass-Through): Permit Compliance System-Interim Data Exchange Format
• SDWARS: Safe Drinking Water Access and Review System
• RCRAInfo: Resource Conservation and Recovery Act Information System
• TRI: Toxics Release Inventory
• TSCA HaSD: Toxic Substances Control Act Health and Safety Data
Case Study – Toxics Release Inventory: Industry Reporting Directly to EPA Using CDX
For most recent reporting year, 2002
23,939 Facilities Reporting
15,003 Paper Submissions
57,867 Disk Submissions
22,325 CDX Submissions with electronic signature
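To put the three submission counts in proportion, the share of each channel can be worked out directly from the figures on this slide. The short Python calculation below uses only those numbers; the variable names are illustrative. Note that the submission total is larger than the number of reporting facilities, presumably because a facility can file more than one submission.

```python
# Submission counts for reporting year 2002, as given on the slide.
submissions = {"paper": 15_003, "disk": 57_867, "cdx": 22_325}

total = sum(submissions.values())
shares = {channel: count / total for channel, count in submissions.items()}

print(f"total submissions: {total:,}")
for channel, share in shares.items():
    print(f"{channel}: {share:.1%}")
```

On these figures CDX carried a little under a quarter of all submissions, with disk still the dominant channel.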
Results with CDX
•Reduced time to process TRI data by 32% over last year
•TRI submissions through CDX were up 164% over last year
•Savings of nearly $10/facility submission or over $57,000 this year alone
•Users love paperless submission and electronic receipt
EPA Target Architecture
[Diagram repeated from the “EPA Enterprise Target Architecture” slide: CONNECT and EXCHANGE, PROCESS and STAGE, STORE for USE, USE, with Management Practices (Architecture, Policies, Standards, Security).]
What is the Exchange Network?
A widely distributed web services network that offers a platform-independent, database-neutral, operating system-agnostic programming environment where trading partners can exchange information securely over the Internet
The Exchange Network Vision
A voluntary standards-based, secure exchange environment across the Internet that
– Improves data quality and reproducibility
– Lowers burden for all partners
– Offers the public and regulators better access to data
– Ensures data stewardship
– Improves timeliness
– Replaces multiple reporting points at EPA
– Replaces non-electronic reporting
– Improves data integration – regional portrayal
Components of the Exchange Network
• Data Standards
• XML Schema Standards, Review Process and Registry
• Network Specifications
• Tools
• Support
• Coordination
• Trading Partner Agreements – what data, when, etc.
• One connection to the Network per partner (node)
• Grant Program for States
What is a Network Node?
• A Network Node (Node) is a simple web service that initiates and responds to requests for environmental information. CDX is EPA’s “node” on the Network
– Client – Requests
– Server – Responds
• Nodes can exchange/publish any type of content
• The requests and responses use common formats expressed in eXtensible Markup Language (XML)
Network Supports Four Basic Operations
1. Administering: authentication and authorization and identity management
2. Querying: Querying a partner for data.
3. Sending: Send a set of data to a partner.
4. Retrieving: Retrieving from a partner a standard set of data.
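The four operations above can be pictured as four methods on a node object. The Python sketch below is an in-memory illustration only: the class, the method names (`authenticate`, `query`, `send`, `retrieve`) and the token scheme are assumptions made for this example and are not taken from the Network specifications.

```python
import secrets


class Node:
    """Toy in-memory node exposing the four basic operations (hypothetical API)."""

    def __init__(self, name: str, users: dict[str, str]):
        self.name = name
        self._users = dict(users)   # user -> password (placeholder credential store)
        self._tokens = set()        # tokens issued by authenticate()
        self._datasets = {}         # dataset name -> list of record dicts

    # 1. Administering: authentication and authorization
    def authenticate(self, user: str, password: str) -> str:
        if user not in self._users or self._users[user] != password:
            raise PermissionError("authentication failed")
        token = secrets.token_hex(16)
        self._tokens.add(token)
        return token

    def _authorize(self, token: str) -> None:
        if token not in self._tokens:
            raise PermissionError("invalid or missing token")

    # 2. Querying: ask a partner for records matching some criteria
    def query(self, token: str, dataset: str, **criteria) -> list[dict]:
        self._authorize(token)
        rows = self._datasets.get(dataset, [])
        return [r for r in rows
                if all(r.get(k) == v for k, v in criteria.items())]

    # 3. Sending: push a set of data to a partner
    def send(self, token: str, dataset: str, records: list[dict]) -> int:
        self._authorize(token)
        self._datasets.setdefault(dataset, []).extend(records)
        return len(records)

    # 4. Retrieving: fetch a standard (whole) set of data from a partner
    def retrieve(self, token: str, dataset: str) -> list[dict]:
        self._authorize(token)
        return list(self._datasets.get(dataset, []))
```

Every data operation takes the token returned by the administering step, so a partner must authenticate once before it can query, send or retrieve.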
Basic Node Architecture
[Diagram. Labels recovered from the slide: Internet; Firewall; Node (Web Server) containing SOAP Listener, SOAP Processor, XML Processor, Data Mapping, Object Handler and Database Connectivity; Existing Information System.]
Network Exchange Protocol
Message Structure
• Transport - HTTP
• XML Message - SOAP
• Message Payload – XML
– SOAP BODY
– DIME Attachments
• Security – HTTPS/SSL
[Diagram labels: Security (HTTPS/SSL, PKI, etc.); Transport Protocol (HTTP); XML Messaging (SOAP) Envelope; SOAP Header; SOAP Body; XML Schema]
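The message structure above (a SOAP envelope with a header and a body that carries the XML payload) can be illustrated with Python's standard library. This sketch assumes the SOAP 1.1 envelope namespace; the `SecurityToken` header entry and the `FacilityQuery` payload are hypothetical, and the HTTPS transport and DIME attachments are not shown.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (assumed here; the slide does not name a version).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)


def build_message(payload: ET.Element, token: str) -> bytes:
    """Wrap an XML payload in a SOAP envelope with a header and a body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, "SecurityToken").text = token  # hypothetical header entry
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def read_payload(message: bytes) -> ET.Element:
    """Extract the first child of the SOAP body from a received message."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("message has no SOAP body payload")
    return body[0]
```

In a real exchange the bytes returned by `build_message` would be sent as the body of an HTTP POST over HTTPS, which is what supplies the transport and security layers on the slide.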
Network Protocols and Specifications
• Network Node Functional Specifications describe
– Actions performed by the node
– How node functions are invoked
– Expected node output
• Network Node Exchange Protocol
– Defines types of valid messages a Node should receive
– Describes format for sending messages among nodes
• Expected shelf life of Network Specifications V1.0 is approximately 18-24 months
Lessons/Conclusions
Cutting edge technology with evolving standards can be risky and expensive
• States: budget woes and varying technical ability
– If State grant program disappears…
• Long-term investment – not just about ROI
– Data quality and data improvement
– Burden reduction for industry and government reporters
• Coordination, politics and control issues
– “Open” Network requires rigorous standards and “central control”
– EPA programs gradually coming around – hedging bets
– Contractor competition and capacity with web services
– Difficult to drive out costs so far – no national RFP, no “node-in-a-box”
Lessons/Conclusions
•Think about architecture and capacity planning at the beginning – we were a classic startup
•Consider web services instead of individual PKI or other security/authentication approaches
•Single sign-on, one registration database for all data flows with different views
•Take advantage of emerging portal technology
•Opportunity for business process improvement – don’t pave cowpaths
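The "one registration database for all data flows with different views" lesson can be sketched with a single table and one database view per flow. The Python example below uses SQLite purely for illustration; the table, column and view names and the sample users are hypothetical, and only the flow names TRI and AQS come from the slides.

```python
import sqlite3


def build_registry() -> sqlite3.Connection:
    """Create one registration table plus a view per data flow."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE registration (
            user_id      TEXT NOT NULL,
            organization TEXT NOT NULL,
            data_flow    TEXT NOT NULL,
            role         TEXT NOT NULL,
            PRIMARY KEY (user_id, data_flow)
        );
        CREATE VIEW tri_users AS
            SELECT user_id, organization, role
            FROM registration WHERE data_flow = 'TRI';
        CREATE VIEW aqs_users AS
            SELECT user_id, organization, role
            FROM registration WHERE data_flow = 'AQS';
        """
    )
    conn.executemany(
        "INSERT INTO registration VALUES (?, ?, ?, ?)",
        [
            ("jdoe", "Example Facility", "TRI", "submitter"),
            ("jdoe", "Example Facility", "AQS", "submitter"),
            ("asmith", "Example State Agency", "AQS", "reviewer"),
        ],
    )
    return conn


def flows_for(conn: sqlite3.Connection, user_id: str) -> list[str]:
    """List every data flow one sign-on identity is registered for."""
    rows = conn.execute(
        "SELECT data_flow FROM registration WHERE user_id = ? ORDER BY data_flow",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Each program sees only its own view, while a user keeps a single identity across flows, which is what makes single sign-on straightforward.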
For More Information
Exchange Network: www.exchangenetwork.net
CDX: www.epa.gov/cdx
Chris Clark, Lead [email protected] (202) 566-1693
Jeff Wells, Business [email protected] (202) 566-1706