Running Multiple ETL Workflow Loads into a Data Mart and Dimensional Data Loading
UTOUG Training Days 2019
Scott Heffron
Senior Business Systems Analyst
© 2018 Cotiviti, Inc. CONFIDENTIAL AND PROPRIETARY 2
What you will learn today
Today you will take away three key points from this presentation on running multiple ETL workflow loads into a data mart:
1. Payment Accuracy Division (PAD) BI/Reporting Environment
2. Differences between the processes that we run
3. How we monitor our process
Agenda
Use Case
Data Infrastructure
Facts & Fact Data Flow Processing
Dimension Data Loading
Monitoring
Q & A
Use Case
Understanding the business problem
What do we do:
• We work with insurance providers and receive millions of medical claims every day. We evaluate these claims against a series of clinical, contractual, and business rules to prevent fraud, waste, and abuse (FWA).
What is the problem:
• Client Data – The volume and velocity of claims being loaded is increasing
• Nurse Analyst – Needs to compare claim reviews against team members in near real time
What is our solution:
• Create an ETL framework that allows multiple selected processes to run in parallel without interfering with each other
Data Infrastructure
Data sources and destinations
Facts & Fact Data Flow Processing
Fact Area(s)
Granularity
LINE FACT
FLAG FACT
CLAIM ACTION FACT
APPEAL FACT
FRAUD TRIGGER FACT
INVOICE FACT
QA ANALYST FACT
Claim Line (PAD_LINE_FACT) Counts – One Day's Worth of Records
Client A - Count: 4,487,259
Client B - Count: 2,391,428
Client C - Count: 1,173,785
Client D - Count: 365,230
Client E - Count: 262,997
Client F - Count: 226,885
Client G - Count: 225,619
Client H - Count: 121,157
Client I - Count: 118,372
Data Loading Use Cases
Large volumes of data are being loaded
Data is received more frequently
KPI data needs to be visible throughout the day with current information
Need to be able to backfill historical data while daily processes are running
Need to be able to run quality assurance on the normal data flow without interfering with any other running process
Architectural Requirements
Each process runs independently of any other process
Only one process can load data into a given fact at a time
A process cannot overlap itself
Need the ability to specify which facts are processed during a process run
Each process's indicators need to be data driven
Some processing types have higher resource precedence than others:
High: Daily and Near Real Time
Low: Quality and Back Fill
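The requirements above can be sketched as a small coordinator. This is only an illustration under stated assumptions, not the framework described in the talk: the class `LoadCoordinator`, its method names, and the in-memory bookkeeping are hypothetical. Since the slides say the indicators are data driven, the real flags presumably live in database tables rather than in process memory.

```python
import threading

# Precedence as listed on the slide: Daily (PAD) and Near Real Time (NRT)
# are high; Quality (QA) and Back Fill (BF) are low.
PRECEDENCE = {"PAD": "High", "NRT": "High", "QA": "Low", "BF": "Low"}


def by_precedence(processing_types):
    """Order waiting processing types so high precedence ones come first."""
    return sorted(processing_types, key=lambda t: PRECEDENCE[t] != "High")


class LoadCoordinator:
    """Hypothetical in-memory stand-in for data-driven process indicators."""

    def __init__(self):
        self._guard = threading.Lock()
        self._running = set()     # processing types currently running
        self._fact_owner = {}     # fact name -> processing type loading it

    def start(self, processing_type):
        """Start a process; refuse if the same type is already running."""
        with self._guard:
            if processing_type in self._running:
                return False      # a process cannot overlap itself
            self._running.add(processing_type)
            return True

    def acquire_fact(self, processing_type, fact):
        """Claim a fact for loading; only one process may load it at a time."""
        with self._guard:
            if processing_type not in self._running:
                raise RuntimeError(f"{processing_type} has not been started")
            if fact in self._fact_owner:
                return False
            self._fact_owner[fact] = processing_type
            return True

    def release_fact(self, processing_type, fact):
        """Release a fact, but only if this processing type holds it."""
        with self._guard:
            if self._fact_owner.get(fact) == processing_type:
                del self._fact_owner[fact]

    def finish(self, processing_type):
        """Mark a process finished and release any facts it still holds."""
        with self._guard:
            self._running.discard(processing_type)
            held = [f for f, o in self._fact_owner.items() if o == processing_type]
            for fact in held:
                del self._fact_owner[fact]
```

In this sketch a back fill that cannot acquire a fact simply gets `False` back and can retry later, so it never blocks the daily load that holds the fact.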
Process Parameters
Run Type:
1. Records processed based on a system-generated date range. This uses the record created and last_updated dates
2. Records processed based on a user-defined date range. This uses the record created date
3. Target records processing
Begin Date: Start date of given processing range
End Date: End date of given processing range
Processing Type:
1. PAD: Daily Processing
2. NRT: Near Real Time Processing
3. BF: Back Fill Processing
4. QA: Quality Assurance Processing
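The parameters listed above could be modeled as follows. This is a sketch only: the class and member names (`RunType`, `ProcessingType`, `ProcessParameters`) are hypothetical, and the validation rules are assumptions about how the date range might be checked, not something stated on the slide.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class RunType(Enum):
    SYSTEM_DATE_RANGE = 1   # system-generated range (created and last_updated)
    USER_DATE_RANGE = 2     # user-defined range (created date)
    TARGET_RECORDS = 3      # target records processing


class ProcessingType(Enum):
    PAD = "Daily Processing"
    NRT = "Near Real Time Processing"
    BF = "Back Fill Processing"
    QA = "Quality Assurance Processing"


@dataclass(frozen=True)
class ProcessParameters:
    run_type: RunType
    processing_type: ProcessingType
    begin_date: Optional[date] = None   # start of the processing range
    end_date: Optional[date] = None     # end of the processing range

    def __post_init__(self):
        # Assumption: a user-defined range must supply both dates.
        if self.run_type is RunType.USER_DATE_RANGE:
            if self.begin_date is None or self.end_date is None:
                raise ValueError("user-defined range needs begin and end dates")
        # Assumption: when both dates are present they must be in order.
        if self.begin_date and self.end_date and self.begin_date > self.end_date:
            raise ValueError("begin_date must not be after end_date")
```

For example, a January back fill would be `ProcessParameters(RunType.USER_DATE_RANGE, ProcessingType.BF, date(2019, 1, 1), date(2019, 1, 31))`.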
Staging and insert/update tables by processing type:
PAD: Staging Table PAD, Insert Update Table PAD
NRT: Staging Table NRT, Insert Update Table NRT
BF: Staging Table BF, Insert Update Table BF
QA: Staging Table QA, Insert Update Table QA
Each processing type will have its own set of staging tables to be able to process the records before loading into the target fact table.
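One way to picture the per-type staging tables is a helper that derives table names from the fact and the processing type. The naming convention here (`STG_` and `IU_` prefixes with the processing type as a suffix) is purely an assumption for illustration; the slides do not show the actual table names.

```python
PROCESSING_TYPES = ("PAD", "NRT", "BF", "QA")


def tables_for(fact: str, processing_type: str) -> dict:
    """Return hypothetical staging and insert/update table names for a run."""
    if processing_type not in PROCESSING_TYPES:
        raise ValueError(f"unknown processing type: {processing_type}")
    return {
        "staging": f"STG_{fact}_{processing_type}",        # private work area
        "insert_update": f"IU_{fact}_{processing_type}",   # rows ready to apply
        "target": fact,                                    # shared fact table
    }
```

Because every processing type resolves to different staging names, a back fill and a daily run can prepare rows for the same fact at the same time and only contend at the final load into the target.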
High Level Work Flow Process
[Diagram: four numbered workflow steps]
High Level Fact Work Flow Process
[Diagram: three numbered workflow steps]
System Setting Data Elements
Manual Start of PAD Process
Manual Start of Near Real Time Process
Manual Start of Back Fill Process
Manual Start of QA Process
Dimension Data Loading
We need to find a way to allow daily loading of dimension data from multiple clients into a single dimension table, so that the data can be shared while maintaining integrity.
Issues with Dimension Data Loading
General
Which dimension gets loaded first?
Multiple dimensions need to be loaded
Client Side
Is another client loading data into the dimension table?
Has the data already been loaded?
Dimension Side
Never knowing when data will be loaded into the dimension table
Did the client ETL process finish?
Dimension Type(s)
System
This data is not generated by the client
Client
This data is generated by the client
Hybrid
This is a combination of System and Client data
Design areas of the processes
Allow only one client to load a dimension table at a time
Make sure only new records are added to the dimension table
What happens if a failure occurs
Notification of failure
Data Flow Diagram
Brain Trust – Data Model Dimension Data Process
Example of Dimension ETL Configuration Data
Example of Dimension ETL Table Priority Data
Example of Dimension ETL Table Usable Activity
Monitoring
Example of Processes Running
Detail Look At Client Processing
High Level Look At Client Processing
Look At Client Back Fill Processing
Look At QA Check
6am Data Mart Email Status
THANK YOU
Q & A