SECURITY IMPLICATIONS OF CROSS-AGENCY BIG DATA APPROACHES FOR TAX COMPLIANCE
Les McMonagle (CISSP, CISA, ITIL), Director & Principal Consultant, Teradata – InfoSec COE
July 2013



2 Confidential – Do Not Distribute Without Permission

Agenda

• Defining The Problem

•  Defining The Solution

•  Leveraging Information

•  Avoid Common Mistakes

•  Wrap-Up / Q & A


Dilbert and Big Data


Big Data: Exponential Growth in Data

Increasing data variety and complexity

[Figure: BIG DATA source cloud. Labels: User Generated Content, Mobile Web, SMS/MMS, Geo-location data, External Reference Sources, HD Video, VOIP, Speech to Text, Sensor data, NetFlow / IPFIX Data, Business Data Feeds, User Click Stream, CDR (phone call records), SIEM Logs, Web logs, DLP Logs, Internet A/B testing, Dynamic Routing, Affiliate Networks, Search marketing, Behavioral Targeting, Firewall Logs, IDS/IPS Logs, Dynamic Routing Tables, ARP Data, DNS Logs, DHCP Logs, Network, Access Logs, LogOnOff Logs, Static Routing Tables, Host]


Closing The Tax Gap Is Crucial

•  The IRS estimates that at the Federal level, the tax gap is 15% to 17%

•  Electronic filings introduce new fraud opportunity

•  Fraud Is Very Easy & Widespread Today

•  Even incarcerated felons are in on it!

• Not just a US Federal or State issue


Fraud Is Very Easy & Widespread Today

•  “Electronic filing, which was introduced to speed up delivery of refunds, has made the system more vulnerable to fraud”

•  Delays in comparing W-2s to 1040s

•  "We will not be prosecuting our way out of this …


Big Data Analytics involves many data sources

• Data from multiple disparate sources needs to be combined to provide required insight and machine intelligence

• Additional data sources may contain sensitive or restricted data

• Getting approvals for access to data sources from other agencies can be a challenge

•  Poor Data Governance programs impede data sharing

•  Intelligent Security can ENABLE these Analytics Opportunities


Trends impacting Data Privacy

Three trends in Big Data Analytics and Enterprise Data Warehousing today are raising privacy concerns and increasing business risk

Only one can be controlled and leveraged to reduce risk

1. Proliferation of Personally Identifiable Information (PII)

2. Persistence/Pervasiveness of PII in Gov/Corp data

3. Consolidating data sources into a single, central repository


Last year’s Historical data

Active data warehouse

PII – Personally Identifiable Information, PHI – Protected Health Information, IP – Intellectual Property

Applying protection at the data layer becomes more critical

Privacy, not technology, becomes the limiting factor

Aligning Data Governance Strategy with emerging technology trends


Access to Alternative Data Sources: Structured and Unstructured Data

Generate Audit and Fraud Investigation Leads

•  Dept of Justice
•  Dept of Labor
•  Dept of Health
•  Professional Licenses
•  DMV – Vehicle Registrations/value
•  Dept of Human Services
•  Child and Spousal Support Payments
•  Dept of Revenue
•  Alignment with W-2 Data


Data Security Issues With Cross-Agency Data Sharing

•  Defining The Problem

• Defining The Solution

•  Leveraging Information

•  Avoid Common Mistakes

•  Wrap-Up / Q & A


Getting Access — Without Getting Access

Leverage native database Semantic Layer Security Controls to provide only required access to other sensitive data sources

The Security of Inclusion

versus

The Security of Exclusion

Grease the “Data Sharing” Wheels


Leverage Semantic Layer Security Controls

[Figure: semantic layer security architecture. Diagram labels:]

•  Views, Macros
•  Applications and users: Routine Application, Marketing Application, Disclosure Application, Analytic User/Application, DBA/System Administrator, Data Protection Security Admin Officer
•  Views and macros shown: Standard View, Opt-out View, Anonymized View, Opt-out/Anonymized View, Consumer Access Macro, Single-Row Access
•  Customer Base Tables
•  Privacy Infrastructure
•  Database Infrastructure
•  Databases/Tables, Views, Macros, User Profiles, Logs, Audit Reports


Perform Complex Analytics on Multiple Data Sources

Common Precursors to fraudulent activity

Clickstream led to a fraudulent filing

Path Analysis

Data Visualization Dashboards
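
One simple way to approximate the path analysis named on this slide is to count short page sequences per filing session and see which ones show up mostly in sessions later confirmed as fraudulent. The sketch below is illustrative only: the page names, the session format, and the function `fraud_rate_by_subpath` are assumptions, not part of the original presentation or of any particular analytics product.

```python
from collections import Counter

def ngrams(path, n):
    """All contiguous subpaths of length n in one session's page sequence."""
    return [tuple(path[i:i + n]) for i in range(len(path) - n + 1)]

def fraud_rate_by_subpath(sessions, n=3, min_fraud_count=2):
    """sessions: list of (pages, is_fraud) pairs.

    Returns {subpath: share of sessions containing it that were fraudulent},
    limited to subpaths seen in at least min_fraud_count fraudulent sessions.
    """
    total = Counter()
    fraud = Counter()
    for pages, is_fraud in sessions:
        # Count each subpath once per session so long sessions do not dominate.
        for sub in set(ngrams(pages, n)):
            total[sub] += 1
            if is_fraud:
                fraud[sub] += 1
    return {sub: fraud[sub] / total[sub]
            for sub in fraud if fraud[sub] >= min_fraud_count}

# Hypothetical labelled sessions from an online filing site.
sessions = [
    (["login", "w2_entry", "bank_change", "submit"], True),
    (["login", "w2_entry", "bank_change", "submit"], True),
    (["login", "w2_entry", "review", "submit"], False),
    (["login", "w2_entry", "bank_change", "review", "submit"], False),
]

rates = fraud_rate_by_subpath(sessions)
for subpath, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(" > ".join(subpath), round(rate, 2))
```

Subpaths with a high fraud share are candidates for the "common precursors" a dashboard could surface. A production system would need far more data and a proper significance test before acting on them.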


Semantic Layer Security Controls

Different types or combinations of Views can be applied to limit access to only required data

Anonymized View(s)

Fraud Investigation Team View(s)
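
The two view types above can be sketched with Python's built-in `sqlite3` module standing in for the warehouse. All table, column, and view names here are hypothetical. SQLite has no per-user grants, so the step that matters in practice (granting each team access to its view only, never to the base table) is noted in comments rather than enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxpayer (
    taxpayer_id INTEGER PRIMARY KEY,
    ssn TEXT, full_name TEXT, zip_code TEXT,
    reported_income REAL, fraud_flag INTEGER
);
-- Obviously fake sample rows.
INSERT INTO taxpayer VALUES
    (1, '000-00-0001', 'Pat Example', '78701', 52000.0, 0),
    (2, '000-00-0002', 'Sam Sample',  '78702', 18000.0, 1);

-- Anonymized view: no direct identifiers, only a coarse region and amounts.
-- In a real warehouse, analysts would be granted SELECT on this view only.
CREATE VIEW v_taxpayer_anonymized AS
SELECT substr(zip_code, 1, 3) AS zip3, reported_income, fraud_flag
FROM taxpayer;

-- Fraud investigation view: identifiers are visible, but only for flagged rows.
CREATE VIEW v_taxpayer_fraud_team AS
SELECT taxpayer_id, full_name, ssn, reported_income
FROM taxpayer
WHERE fraud_flag = 1;
""")

def column_names(view_name):
    """Column names exposed by a view (view_name is trusted, not user input)."""
    cursor = conn.execute(f"SELECT * FROM {view_name}")
    return [description[0] for description in cursor.description]

anonymized_columns = column_names("v_taxpayer_anonymized")
fraud_team_rows = conn.execute("SELECT * FROM v_taxpayer_fraud_team").fetchall()
print(anonymized_columns)
print(fraud_team_rows)
```

The design point is that both audiences query the same base table, so there is one copy of the data and one place to audit, while each audience sees only what its view exposes.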


Use sensitive data source without direct access

[Figure: flow diagram] INTEGRATED DATA WAREHOUSE (Labor, Health, Human Services, Revenue, Vehicle Registrations) → Macro / Stored Procedure → Standard Reports Output → Suspected Fraud? Yes: Initiate Audit or Investigation. No: Drop.
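
The idea on this slide, that a macro or stored procedure reads the other agency's data and hands back only a verdict, can be sketched as a single function. This is a minimal illustration using `sqlite3`; the tables, the function `suspected_fraud`, and the rule "registered vehicle value exceeds a multiple of reported income" are all assumptions made for the example, not the presenter's method.

```python
import sqlite3

def build_warehouse():
    """In-memory stand-in for the integrated data warehouse (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE revenue_returns (taxpayer_id INTEGER, reported_income REAL);
    CREATE TABLE vehicle_registrations (taxpayer_id INTEGER, vehicle_value REAL);
    INSERT INTO revenue_returns VALUES (1, 60000.0), (2, 15000.0);
    INSERT INTO vehicle_registrations VALUES
        (1, 20000.0), (2, 90000.0), (2, 45000.0);
    """)
    return conn

def suspected_fraud(conn, taxpayer_id, ratio_threshold=2.0):
    """Return only True or False; registration rows never leave this function.

    In a warehouse this logic would live in a macro or stored procedure, and
    the tax analyst would be granted EXECUTE on it but no SELECT on the
    vehicle registration table.
    """
    row = conn.execute(
        """
        SELECT r.reported_income, COALESCE(SUM(v.vehicle_value), 0)
        FROM revenue_returns r
        LEFT JOIN vehicle_registrations v ON v.taxpayer_id = r.taxpayer_id
        WHERE r.taxpayer_id = ?
        GROUP BY r.taxpayer_id, r.reported_income
        """,
        (taxpayer_id,),
    ).fetchone()
    if row is None:
        return False  # unknown taxpayer: nothing to flag
    income, vehicle_total = row
    return vehicle_total > ratio_threshold * income

conn = build_warehouse()
leads = [tid for tid in (1, 2) if suspected_fraud(conn, tid)]
print(leads)  # taxpayer IDs to route to audit or investigation
```

The caller learns that a taxpayer is worth a closer look without ever seeing which vehicles are registered or what they are worth, which is often easier for the owning agency to approve than direct table access.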


Row Level Security (RLS) Controls
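
The slide names row-level security without detail. One common way to model it is to label every row with a sensitivity level plus a set of compartments and to compare that label with the user's clearance at query time. The sketch below assumes that model; the `Label` type, the level numbers, and the compartment names are hypothetical and do not reproduce any specific database's RLS syntax.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Label:
    level: int             # hierarchical sensitivity; higher is more sensitive
    categories: frozenset  # non-hierarchical compartments, e.g. source agency

def can_read(user: Label, row: Label) -> bool:
    """A user sees a row only if cleared for its level AND all its compartments."""
    return user.level >= row.level and row.categories <= user.categories

def visible_rows(user, rows):
    """IDs of the rows this user may read."""
    return [r["id"] for r in rows if can_read(user, r["label"])]

# Hypothetical rows from different source agencies.
rows = [
    {"id": 1, "label": Label(1, frozenset({"REVENUE"}))},
    {"id": 2, "label": Label(3, frozenset({"REVENUE", "IRS"}))},
    {"id": 3, "label": Label(2, frozenset({"LABOR"}))},
]

analyst = Label(2, frozenset({"REVENUE", "LABOR"}))
investigator = Label(3, frozenset({"REVENUE", "IRS"}))

print(visible_rows(analyst, rows))
print(visible_rows(investigator, rows))
```

Because the check runs per row, both users can query the same table and each simply gets fewer rows back, with no separate copies of the data to maintain.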


Improved Understanding from Internal Network Traffic

•  Most network conversations (malicious and benign) have their origins in the intent of a human actor

•  The analyst’s job is really to infer the intent of the human actor by looking at the packets they generate

[Figure: Actor True Intent → Network Conversations → Sessions → Packets]

What we really care about: Actor True Intent

What we have to work with: Packets

Monitoring/Detecting Internal Misuse of Data


Analytics Helps with ALL Compliance Issues

•  Lack of understanding of requirements
•  Intended Fraud (External hackers)
•  Innocent Mistakes
•  Employee misuse or abuse of data access

Different paths, but same revenue impact!


Agenda

•  Defining The Problem

•  Defining The Solution

•  Leverage Information

•  Avoid Common Mistakes

•  Wrap-Up / Q & A


Advanced Analytic Capabilities

Advanced Analytics (Predictive)

Traditional Analytics (Reactive)


Leverage what private industry is already doing

•  Intelligent Credit Card Authorization Checks

• Retailers immediately detect fraudulent product return patterns by comparing and analyzing more data sources prior to providing a refund (has this product, person, card been used recently for a similar refund?)

•  Financial institutions, for example, have highly sophisticated fraud-detection processes built on a wide range of data and tools

No need for Tax to reinvent the wheel


Leverage Cross-Agency Data Sharing Sources

State tax and revenue agencies today utilize any or all of the following:

•  All internal tax systems data

•  Federal IRS data

•  Department of Labor Unemployment data

•  Workforce Commission data

•  Department of Motor Vehicles (DMV) Driver’s License, Vehicle Registrations

•  Professional Licenses

•  Customs data

•  Secretary of State

•  USCIS (immigration, work permits and visas)

•  HHSC data; all agency data from DOL (not just the subset shared today); etc.


Other potential external/reference data sources

External reference data sources include the following:

•  Source IP Address – Physical Address or neighborhood matching

•  Multiple returns from the same source IP Address that is not equal to the taxpayer address or location

•  Credit Score Data?

•  Clickstream data from on-line submissions subjected to path analysis to detect consistent fraudulent submission patterns

•  Death Notifications

•  Fish and Game Licenses

•  FAA

•  Others?

Some reference data may be sensitive or regulated
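
The source-IP checks in the list above can be sketched as a simple grouping step. This example assumes that each e-filed return already carries a region resolved from its source IP by some upstream geolocation step; the field names, the region codes, and the `min_returns` threshold are hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
from collections import defaultdict

def flag_shared_ips(returns, min_returns=3):
    """Flag IPs that filed many returns, some for filers located elsewhere.

    returns: iterable of dicts with keys 'return_id', 'source_ip',
    'ip_region' (where the IP resolves to) and 'filer_region'
    (where the taxpayer says they live).
    Result: {source_ip: [return_ids whose filer region differs from the IP's]}.
    """
    by_ip = defaultdict(list)
    for r in returns:
        by_ip[r["source_ip"]].append(r)

    flagged = {}
    for ip, group in by_ip.items():
        mismatched = [r["return_id"] for r in group
                      if r["ip_region"] != r["filer_region"]]
        if len(group) >= min_returns and mismatched:
            flagged[ip] = mismatched
    return flagged

# Hypothetical e-filing records.
returns = [
    {"return_id": "R1", "source_ip": "203.0.113.7",
     "ip_region": "TX-787", "filer_region": "FL-331"},
    {"return_id": "R2", "source_ip": "203.0.113.7",
     "ip_region": "TX-787", "filer_region": "GA-303"},
    {"return_id": "R3", "source_ip": "203.0.113.7",
     "ip_region": "TX-787", "filer_region": "TX-787"},
    {"return_id": "R4", "source_ip": "198.51.100.4",
     "ip_region": "TX-750", "filer_region": "TX-750"},
]

flagged = flag_shared_ips(returns)
print(flagged)
```

Legitimate tax preparers also file many returns from one address, so a flag like this is a lead for review rather than proof of fraud; a whitelist of registered preparers would be a natural refinement.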


Leverage what private industry is already doing

•  Utilize many common data analytic tools and algorithms with minimal adaptation or modification

•  Detect employees looking at neighbors’, family members’, VIPs’ or other acquaintances’ tax records or data

•  Tagging IRS-provided data to ensure compliance with IRS Publication 1075 (Data Classification follows the data)

•  Monitoring data access for anomalous or inappropriate access patterns or usage
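
Monitoring for anomalous access volume can start with something as simple as an outlier test on how many distinct records each user opens. The sketch below uses a modified z-score based on the median and the median absolute deviation, with the commonly used 0.6745 constant and 3.5 cutoff. The log format, the user names, and the choice of this statistic are assumptions for illustration, not something the presentation prescribes.

```python
from statistics import median

def anomalous_users(access_log, threshold=3.5):
    """Users whose distinct-record count is an outlier versus their peers.

    access_log: iterable of (user, record_id) pairs.
    Uses the modified z-score 0.6745 * (count - median) / MAD, which is less
    distorted by the outlier itself than a mean/standard-deviation z-score.
    """
    distinct = {}
    for user, record_id in access_log:
        distinct.setdefault(user, set()).add(record_id)
    counts = {user: len(records) for user, records in distinct.items()}

    med = median(counts.values())
    mad = median(abs(c - med) for c in counts.values())
    if mad == 0:
        return []  # no spread among peers; this test cannot say anything
    return sorted(user for user, c in counts.items()
                  if 0.6745 * (c - med) / mad > threshold)

# Hypothetical one-day access log: four typical users and one heavy browser.
volumes = {"emp_a": 10, "emp_b": 12, "emp_c": 11, "emp_d": 9, "emp_e": 80}
access_log = [(user, f"rec-{user}-{i}")
              for user, n in volumes.items() for i in range(n)]

suspects = anomalous_users(access_log)
print(suspects)
```

Comparing users only against peers in the same role would reduce false alarms, since an auditor and a call-center agent have very different normal volumes.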


Agenda

•  Defining The Problem

•  Defining The Solution

•  Leverage Information

• Avoid Common Mistakes

•  Wrap-Up / Q & A


Avoid Common Mistakes

•  Collecting an enormous amount of activity log and other security log data and never using it
   The data is then reduced to basic forensic value only, without proactive reporting and alerting on anomalous activity

•  Mixing together different data sensitivities (Data Classification follows most sensitive data)

•  Not leveraging activity log data to monitor data access and detect anomalous or inappropriate access patterns or usage
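
The rule in the second bullet, that a mixed data set takes the classification of its most sensitive source, can be made mechanical by tagging every row and taking the maximum whenever rows are combined. The sketch below is a minimal illustration; the four classification names and the helper functions are hypothetical, and `FEDERAL_TAX_INFO` is only a stand-in label for IRS-provided data.

```python
from enum import IntEnum

class Classification(IntEnum):
    """Hypothetical sensitivity scale; a larger value is more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    STATE_CONFIDENTIAL = 2
    FEDERAL_TAX_INFO = 3  # stand-in label for IRS-provided data

def combined_classification(*sources):
    """A combined data set is as sensitive as its most sensitive input."""
    return max(sources)

def tag_rows(rows, classification):
    """Attach a classification tag to every row of a source."""
    return [dict(row, classification=classification) for row in rows]

def join_tagged(left, right, key):
    """Inner join on key; each output row inherits the higher classification."""
    right_by_key = {r[key]: r for r in right}
    joined = []
    for l in left:
        r = right_by_key.get(l[key])
        if r is not None:
            merged = {**l, **r}
            merged["classification"] = combined_classification(
                l["classification"], r["classification"])
            joined.append(merged)
    return joined

state_rows = tag_rows([{"taxpayer_id": 1, "state_refund": 300.0}],
                      Classification.STATE_CONFIDENTIAL)
irs_rows = tag_rows([{"taxpayer_id": 1, "agi": 52000.0}],
                    Classification.FEDERAL_TAX_INFO)

joined = join_tagged(state_rows, irs_rows, "taxpayer_id")
print(joined[0]["classification"].name)
```

Carrying the tag on the row is what lets the classification follow the data: any downstream extract or report can check the tag instead of relying on someone remembering where a column came from.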


Leading Misuse of Data or Data Access

•  Random curiosity browsing of data: looking at neighbors’, family members’, VIPs’ or other acquaintances’ data

•  Mixing or commingling of IRS data with other sources (Data Classification follows the data)

•  Poor application of standard information security best practices, such as Least Privilege and Need-to-Know as the basis for granting access

Monitor user activity to ensure correct or appropriate use
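
Curiosity browsing of the kind listed on this slide can be screened for by comparing each access with what is known about the employee and the taxpayer. The sketch below flags accesses to VIP records and to taxpayers who share the employee's surname or ZIP code, unless the access belongs to an assigned case. Every field name, the matching rules, and the sample data are assumptions for illustration.

```python
def curiosity_browsing_alerts(accesses, employees, taxpayers,
                              vip_ids, assigned_cases):
    """Return (employee_id, taxpayer_id, reasons) for accesses worth reviewing.

    accesses: iterable of (employee_id, taxpayer_id) pairs from the access log.
    assigned_cases: set of (employee_id, taxpayer_id) pairs that have a
    documented business reason and are therefore skipped.
    """
    alerts = []
    for emp_id, taxpayer_id in accesses:
        if (emp_id, taxpayer_id) in assigned_cases:
            continue
        emp, tp = employees[emp_id], taxpayers[taxpayer_id]
        reasons = []
        if taxpayer_id in vip_ids:
            reasons.append("vip")
        if emp["surname"].lower() == tp["surname"].lower():
            reasons.append("same_surname")  # possible family member
        if emp["zip_code"] == tp["zip_code"]:
            reasons.append("same_zip")      # possible neighbor
        if reasons:
            alerts.append((emp_id, taxpayer_id, reasons))
    return alerts

# Hypothetical reference data and access log.
employees = {"E1": {"surname": "Rivera", "zip_code": "78701"}}
taxpayers = {
    "T1": {"surname": "Rivera", "zip_code": "78745"},
    "T2": {"surname": "Chen", "zip_code": "78701"},
    "T3": {"surname": "Okafor", "zip_code": "73301"},
    "T4": {"surname": "Rivera", "zip_code": "78701"},
}
vip_ids = {"T3"}
assigned_cases = {("E1", "T4")}
accesses = [("E1", "T1"), ("E1", "T2"), ("E1", "T3"), ("E1", "T4")]

alerts = curiosity_browsing_alerts(accesses, employees, taxpayers,
                                   vip_ids, assigned_cases)
for alert in alerts:
    print(alert)
```

Matches on surname or ZIP code are weak signals on their own, so the output is best treated as a review queue for a supervisor rather than as evidence of misuse.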


Privacy Principles (1/2)

•  Accountability – requires that the entity define, document, communicate, and assign accountability for its privacy policies and procedures and be accountable for PII under its control.

•  Notice – requires that the entity provide notice about its privacy policies and procedures and identify the purpose for which personal information is collected, used, retained, and disclosed.

•  Choice and Consent – requires that the entity describe the choices available to the individual and obtain implicit or explicit consent with respect to the collection, use, and disclosure of personal information.

•  Collection Limitation – requires that the entity collect personal information only for the purposes identified in the notice.

•  Use Limitation – requires that the entity limit the use of personal information to the purpose identified in the notice and for which the individual has provided implicit or explicit consent.

Comparable lists from: International Security, Trust and Privacy Alliance (ISTPA); Association of Insurance Compliance Professionals (AICP)


Privacy Principles (2/2)

•  Access – requires that the entity provide individuals with access to their personal information for review and update.

•  Disclosure – requires that the entity disclose personal information to third parties only for the purposes identified in the notice and only with the implicit or explicit consent of the individual.

•  Security – requires that the entity protect personal information against unauthorized access or alteration (both physical and logical).

•  Data Quality – requires that the entity maintain accurate, complete, and relevant personal information for the purposes identified in the notice.

•  Enforcement – requires that the entity monitor compliance with its privacy policies and procedures and have procedures to address privacy-related inquiries and disputes.

These must be captured in business/technical requirements


Proven Data Privacy Methodology

• Convergence of existing Data Privacy Principles

• Centralized EDWs processing/protecting broadly acquired PII

•  Experienced data privacy consultants to advise & assist (International experience, ISTPA, CHP, CISA, CISSP certifications)

• Reduce costs by protecting data in a single, secure repository

• Standardize processes to meet common requirements

Solicit help from external Subject Matter Experts (SME) where appropriate


Conclusions

• State tax authorities are behind the curve in efforts to apply big data analytics to the tax fraud/tax gap problem

• Sharing data in a controlled and consistent way while applying consistent, policy- and regulation-compliant security controls is easier within a single, centralized data repository or EDW

• Reduce data hosting, data sharing, security controls and other operational costs by consolidating data from multiple DataMarts

•  Provide only the minimum access to sensitive information assets required to support each specific business process (Least-Privilege, Need-to-Know basis)

•  Ensure original data classification follows the data


Q & A


Les McMonagle Director & Principal Consultant - Information Security COE

•  Les McMonagle is an information security consultant leading the Teradata InfoSec COE

•  He has over 20 years of experience in the development and implementation of information security architectures

•  During his career he has specialized in computer training, E-Commerce applications, IT Operations, information security architecture, processes, audits and Corporate Risk Management

•  Les holds CISSP, CISA, ITIL and other relevant industry certifications

•  He has participated in the development of the BITS Financial Institution Shared Assessment Program and delivered executive level presentations on Data Privacy and Security

•  Les is also playing a lead role in developing Teradata’s Cyber Security solution strategy and how to leverage Teradata’s Unified Data Architecture (UDA) for CyberSecurity solutions

Les McMonagle (CISSP, CISA, ITIL) Mobile: (617) 501-7144 Email:[email protected]


Contact Information

•  If you have further questions or comments:

Les McMonagle (CISSP, CISA, ITIL)

Teradata Information Security, Data Privacy and Regulatory Compliance COE [email protected]

(617) 501-7144 Cell

Les Arnold [email protected]

(512) 930-0135 Office