Transcript
Page 1: Big Data in The Cloud: Architecting a Better Platform

AWS Government, Education, and Nonprofit Symposium, Washington, DC | June 25-26, 2015

Welcome!

Big Data in The Cloud: Architecting a Better Platform

Brian Kinlaw, Principal Solution Architect, CSC

Page 2: Big Data in The Cloud: Architecting a Better Platform

Today’s Presenters

Brian Kinlaw
Principal Solution Architect
CSC Emerging Business Group
Leads the initiation, development and execution of Big Data, Analytics, Social Media, Mobile, Cloud, Cyber Security, and Internet of Things (IoT) solutions for the Office of the CTO.

Page 3: Big Data in The Cloud: Architecting a Better Platform

Agenda

I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers

Page 4: Big Data in The Cloud: Architecting a Better Platform

Rapidly Evolving Analytics Landscape

BIG DATA 1.0 (EDW/BI)

KEY CHARACTERISTICS
• Relatively Small, Structured Data Sets
• Proprietary RDBMS
• Internally Sourced / Small Teams
• Reactive Reporting Mechanisms
REPRESENTATIVE TECHNOLOGIES: IBM DB2, Oracle DB, IBM Cognos, SAP Business Objects, Oracle BI, Informatica
POTENTIAL BUSINESS ROI: Low-Medium
CUSTOMER SKILLS/TALENT: Bulk of Talent Today

BIG DATA 2.0 (ANALYTIC APPLIANCES)

KEY CHARACTERISTICS
• Introduction of Unstructured Data Sources
• New In-Memory Analytic Capabilities
• “Data Scientists” Emerge
• Ad-hoc Reporting Becoming Pervasive
REPRESENTATIVE TECHNOLOGIES: IBM Netezza, HP Vertica, Oracle Exadata & Exalytics, Teradata, Pivotal Greenplum
POTENTIAL BUSINESS ROI: Medium
CUSTOMER SKILLS/TALENT: Talent Investments Required

BIG DATA 3.0 (OPEN SOURCE / NEXT GEN)

KEY CHARACTERISTICS
• Seamless Blend of Traditional Analytics and Big Data
• Heavily Open Sourced
• Reporting Becomes Predictive & Influences Business Process Change
REPRESENTATIVE TECHNOLOGIES: Cloudera Hadoop, Hortonworks Hadoop, Spark, Storm, Kafka, Tableau, Pentaho
POTENTIAL BUSINESS ROI: Very High
CUSTOMER SKILLS/TALENT: High Demand Talent

DETERMINING VALUE, SECURITY & COMPLIANCE, SKILLS & CAPABILITIES

32%, 30%, 65%

The Market is Here Today

Yet Challenges Remain…

Page 5: Big Data in The Cloud: Architecting a Better Platform

CSC BIG DATA & ANALYTICS: WE ARE UNIQUELY POSITIONED TO ADD VALUE

Technology Expertise Working with Hadoop since its Creation

Faster Time to Value Deliver a Big Data Platform in 30 Days

Enterprise Security Data, Application, Platform Security and Compliance

SHAPE, TRANSFORM, MANAGEMENT, AS A SERVICE

DIFFERENTIATION: OUR UNIQUE STRENGTHS

FIVE CORE OFFERINGS

• Analytics aaS
• Big Data Analytic Insights
• Big Data Strategy
• Big Data Platform Innovation
• Big Data Platform aaS

STRATEGY, ANALYTICS, PLATFORMS

INDUSTRY ACCELERATORS

Product Innovation: Optimize product mix & feature set to improve revenue by 25-30%

Customer Intelligence: Identify innovative new revenue channels – up to 2x revenue increase

Smart Operations: Improve operating margins ~60% thru efficiency and quality improvements

Risk Insights: Reduce fraudulent activity by up to 75%, avoid millions in cost & exposure

Revenue Enhancers

Profit Enhancers

Page 6: Big Data in The Cloud: Architecting a Better Platform

Client Value Achieved

• Prioritized Roadmap of Initiatives to Achieve Growth Vision within 2-3 years: BU Growth from $200M to $1B Through Analytic Insights

Client Value Achieved

• 331% ROI
• Payback Period of 2.1 Months
• 2% Yield Improvement = $300M

Client Value Achieved

• Reduced time to onboard customers by 80%
• Improved visibility on service levels
• Increased customer satisfaction

Client Value Achieved

• BSL Met Strategic Objective (ITaaS)
• Reduced Costs by 20%
• Improved Analytic Cycle Time by 50%

Client Value Achieved

• Access to Information in Minutes versus Weeks
• Speed: Solution Deployed within Days
• Access to Key Next Gen Talent

Client Value Achieved

• Speed to Market: 30 Days to Platform, 60 Days to Full Working Mobile Telematics Application
• Flexible Deployment Options

Achieving Real Business Value With Our Clients

Integrated data for ~100M people from 40 member companies

Healthcare

Maximized diamond company profitability through BI and analytics

Wholesale

Railway punctuality improved from 92% to a world-leading 96%

Transportation

Reduced tax evasion and litigation through DW and predictive modeling

Government

16% increase in claims fraud investigations for significant ROI in 6 months

Insurance

Performance optimization and analytical insights into POS and sales trends

Retail/CPG

$10M reduction in annual operating expenses

Printing

Customer intelligence lifetime value model driving marketing and customer service

Travel & Leisure

Use of sensor data for real-time management of mining and mfg. ops and maintenance

Natural Resources

Comprehensive global view of exposure in near real time

Banking

Global Insurance Company

Page 7: Big Data in The Cloud: Architecting a Better Platform

Risk to Traditional Data Model: the Status Quo

RISK
• Structuring all data at the point of ingestion
• Schema on Write vs Schema on Read
RESULT
• Significant upfront expense ( and $$) for planning
• Significant expense ( and $$) to adapt to changes/needs of the business

RISK
• Data silos
RESULT
• Disparate information streams
• Reduced ability to obtain requirements from entire business
• Does not allow for holistic decisions to be made
• No golden source of truth

RISK
• Proprietary/custom data warehousing/infrastructure
RESULT
• Expensive
• Non standard to environment

RISK
• Scale
RESULT
• Not economically feasible
• Not technically possible
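The contrast between Schema on Write and Schema on Read named on this slide can be sketched in a few lines. This is a minimal illustration only: the event fields (`device`, `temp_c`, `humidity`) and both function names are hypothetical and do not come from the presentation.

```python
import json

# Hypothetical raw events as they might land in a file store.
RAW_EVENTS = [
    '{"device": "a1", "temp_c": 21.5}',
    '{"device": "b2", "temp_c": 19.0, "humidity": 0.41}',  # a new field shows up later
]

# Schema on write: the schema is fixed before loading.
WRITE_SCHEMA = {"device", "temp_c"}


def load_schema_on_write(lines):
    """Accept only records whose fields exactly match the predefined schema."""
    accepted, rejected = [], []
    for line in lines:
        record = json.loads(line)
        if set(record) == WRITE_SCHEMA:
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected


def query_schema_on_read(lines, fields):
    """Keep raw lines untouched and project the requested fields at query time."""
    return [{f: json.loads(line).get(f) for f in fields} for line in lines]
```

With schema on write, the second record is rejected until the schema is changed; with schema on read, the same raw data can be queried for `humidity` without reloading, and records that lack the field simply yield `None`.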

Page 8: Big Data in The Cloud: Architecting a Better Platform

Risk of Transforming to a Big Data Business

RISK
• Numerous different technologies
RESULT
• Hard to select the best tool without specific experience with these technologies

RISK
• Lack of Big Data specific expertise
RESULT
• Unreasonable expectations without having done it before
• R&D in Big Data is lost or as time permits
• Scope creep is common
• Learning as you go

RISK
• Immature Big Data Technologies
RESULT
• Compliance risk
• Security risk
• Complex deployments
• Complex integrations between technologies
• High operational costs

RISK
• Large CapEx expenditure
RESULT
• Buying upfront growth
• More complex to scale

Big Data & Analytic systems should be a tool to enable companies with better information and insights, not a roadblock

Page 9: Big Data in The Cloud: Architecting a Better Platform

Operational Big Data Risks

1. Implementation
• Complexity
• Integration
• Speed

2. Operation
• Robust & Scalable
• Monitoring & Automated Alerts

3. Data Science
• Business Relevance
• Feedback loop

4. Talent
• The right talent at the right time

5. Infrastructure
• Upfront CapEx investment
• Iterative Flexibility
• Matching Hardware to Software

Page 10: Big Data in The Cloud: Architecting a Better Platform

A New Mitigation Strategy: Big Data Platform-as-a-Service

Operation
• Managed to your SLA needs
• Global delivery teams and support
• Integrated testing

Implementation
• DevOps infrastructure-as-code deployment
• Pre-defined orchestration scripts
• Flexible deployment locations

Talent
• Data engineers
• Solution Architects
• ETL expertise
• Support Team
• R&D Team
• BI/Viz/Reporting expertise

Data Science
• Subject matter expertise as needed
• Global Data Science team
• Applying analysis at the right point

Infrastructure
• as-a-Service model
• Pay-as-you-go structure
• Pre-configured hardware designs

Page 11: Big Data in The Cloud: Architecting a Better Platform

Agenda

I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers

Page 12: Big Data in The Cloud: Architecting a Better Platform

Descriptive Analytics I: What happened?
• Reporting: Query, Reporting, and Search Tools

Diagnostic Analytics: Why did it happen?
• Analysis: OLAP and Visualization Tools

Descriptive Analytics II: What’s happening now?
• Monitoring: Dashboards and Scorecards

Predictive Analytics: What might happen?
• Predictive Analysis: Big Data

Prescriptive Analytics: How can we make it happen?
• Recommendations, Risk Avoidance

[Chart labels: OPERATIONAL, ANALYTICAL; axes: Complexity, Business Value; Hindsight, Insight, Foresight; Operations Triggers; Low Impact, High Impact; Process Improvement via Applied Intelligence]

The Analytics Journey

Page 13: Big Data in The Cloud: Architecting a Better Platform

MAJOR ACTIVITIES

Solution

Discovery, Technical Design, Platform Rollout, Iterative App Development

SHAPE, TRANSFORM, MANAGE

• Interview Key Business Stakeholders
• Interview Key Technical Stakeholders
• Define Objectives & Challenges
• Define Target Use Case
• Identify Data Sources
• Define Business Benefits
• Define Architecture
• Develop High-Level Approach & Costs
• Agree to Project Plan/Rollout
• Standup / Connect Environment
• Design Data Flows
• Architecture Validation
• Build Data Flows
• Historical Data
• Real-Time Data Flow

Iterate

• 2-4 hour Design Thinking Workshop
• Review current state metrics
• Review business pain points & opportunities
• Review application & infrastructure environment
• Define target use case

• Identify data sources for target use case
• Develop high level tech approach and costs
• Define high level benefits
• Develop initial case for action
• Develop go forward plan

• Develop Data Model
• Technical architecture & integration design
• Stand up environment
• Dashboard design workshops
• Data mapping

• Build dashboard
• Configure application
• Data load
• Run solution iterations
• Analytical modeling

Customer Engagement Framework

Page 14: Big Data in The Cloud: Architecting a Better Platform

Data Exploration & Transformation

Data Modeling & Algorithm Development

Data Visualization & Reporting

Business Discovery

InsightLab: Rapid Analytics Development

Insight Operationalization

Change Management

Use Case Prioritization & Roadmap

Data Inventory Identification & Coordination

8 – 12 Week Sprint

Agile Scientific Approach to Measurable Business Improvement

Inputs

Outputs

InsightLab

Page 15: Big Data in The Cloud: Architecting a Better Platform

How to Build a Business Outcome for anything

1. Define the desired insights by stage
Business Insights: Descriptive (1.0), Diagnostics (2.0), Predictive (3.0), Prescriptive (4.0)

2. Define data needed & format for analysis
Ingestion / Munging: Discovery, Integration, Normalization, Dimensionality Reduction, Feature Extraction, Transformation & Enrichment, Data Fusion

3. Define the types of Analysis
Data Science: Decision Trees, Regression Analysis, Classification, Clustering, Anomaly Detection, Natural Language Processing (NLP), Correlation Analysis

4. Define consumption and interaction
Visualization: Time Series Charts, Spatial Mapping, Histogram, Graphs, Line Charts, Scatter Plots, Decision Trees, Data Exploration

5. Define the right tools for the task at hand
Tools & Technologies:
• R / Python / Java / Javascript
• Tableau / Pentaho / Qlik
• Cognos / BobJ / OBIEE
• SAS / SPSS / MatLab / Rapid Miner
• Relational DB
• Columnar DB
• Graph DB
• Hadoop
• In-Memory / Streaming

Page 16: Big Data in The Cloud: Architecting a Better Platform

Case Study

HGST, a Western Digital company, develops innovative, advanced hard disk drives, enterprise-class solid state drives, and external storage solutions and services. CSC improved customer support and product quality.

Challenge
• Decrease warranty inquiry response times
• Increase operational efficiency
• Enable the business to extract new insights

Solution
• Conducted 5-week big data strategy assessment
• Established cloud-based big data platform
• Built the apps and analytics to capitalize on the data

Results
• Over 10,000 queries/day
• 30+ data connections
• 1,000+TB of data
• Response times of 2-3 months now done with a single query
• Improved customer satisfaction
• Reduced churn
• Reduced support costs
• New product management capabilities, fixes
• Better supply chain coordination
• Increased security
• New data and analytics products
• Increased cross-sales and up-sales
• Increased renewals
• Better license compliance

Page 17: Big Data in The Cloud: Architecting a Better Platform

Case Study

Network Rail manages most of the rail infrastructure across Great Britain and is responsible for control and maintenance of over 2,500 railway stations, 20,000 miles of track, and 40,000 bridges and tunnels. CSC provides a data and analytics hub for massive amounts of imagery and analog track monitoring data.

CHALLENGE
• Network Rail needed a platform that could not only store, but also analyze petabytes of data over the long-term:
  – Track imagery and video data captured via drones and cameras
  – Vibration data captured via maintenance trains
  – Other forms of large file size analog data crossed with operational, structured data sets
• Network Rail wanted to implement the solution quickly, and ramp up data volumes at a fast pace
• Goal of leveraging combined services to assist with loading data, managing the underlying infrastructure, and working with and analyzing the data

SOLUTION
• CSC designed and configured the solution, built and deployed it in the cloud, and developed ETL flows to import massive amounts of bulk data on an ongoing basis
  – Core platform (BDPaaS) leveraging Hortonworks Data Platform, including Hive with Tez
• CSC’s platform integrated with ESRI ArcGIS for Big Data geolocation analysis features including geotagging and geo tiles
• CSC managed the infrastructure, platform components, and data flows, in addition to providing continued support/consultation services to the client

RESULTS
• Network Rail is generating insights on how to prioritize in near real-time the improvement and maintenance of the massive railway track and infrastructure footprint
  – Advanced analytics of analog data, including geolocation capabilities
  – Ability to handle the scale required by the massive amount of data under management and data growth
  – Complete transformation of a business unit’s analytics capability on track for success in less than 12 months

[Architecture diagram labels: Image Files, Analog Data, Geo Info; AWS S3 Object Storage; HDFS, YARN, Hive, Hue; Hadoop-ArcGIS Connector; ESRI ArcGIS; PostgreSQL, PostGIS; ArcGIS; Geocortex]

Page 18: Big Data in The Cloud: Architecting a Better Platform

Case Study: Food & Hospitality Retailer

This Food & Hospitality Retailer has a footprint of over 650 regional hotels, 2,800 coffee shops, and a number of restaurant chains. CSC provides the infrastructure, data platform, and analytics that uncover revenue opportunities in customer web interactions.

CHALLENGE
• The client wanted to quickly evaluate the use of big data and the value that it brings as it relates to identifying new business opportunities
• Ease of use was a key need in making insights and reporting more accessible to analysts… and increasing the speed with which they could analyze
• Time to market was a key factor in the decision to implement a comprehensive big data platform. The client realized:
  – A bare platform would not be easy to manage
  – Their staff does not possess the skills to operate a bare platform
  – They needed to focus on the big data applications, rather than the platform

SOLUTION
• CSC designed and configured the solution, built and deployed it in the cloud, and developed ETL flows to transport web activity data within 90 days:
  – Core platform (BDPaaS) leveraging Hortonworks Data Platform, including Hive with Tez
  – Aggregating various different data sources to create one massive web log data set
  – Adding data science algorithms to clean up data for better insights
  – Providing Pentaho Business Analytics as a comprehensive reporting and dashboard suite for insight presentation
• CSC managed the infrastructure, platform components, and data flows, in addition to providing continued support/consultation services to the client

RESULTS
• The client is generating insights on how customers interact with their website, and improving their services for happier customers and more streamlined business:
  – Faster path to ROI with both tech and services
  – Creating a real-time customer insights dashboard and set of reports
  – Ability to prove the value of big data internally through the mining of data and generation of insights and reports for various teams
  – Scalability to more data sources and use cases, including plans for mobile application analytics and operational metrics, as well as operational business analytics combining internal and external data sources

[Architecture diagram labels: Logs; Pentaho Data Integration (PDI); Distcp; HDFS, YARN, Hive, Hue; PostgreSQL (onboard); Hive > PostgreSQL; Hive-Pentaho Connector (ODBC); Pentaho Business Analytics]

Page 19: Big Data in The Cloud: Architecting a Better Platform

Agenda

I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers

Page 20: Big Data in The Cloud: Architecting a Better Platform

Big Data Platform Enables Insights in 30 Days

CSC Big Data Platform as a Service

• Cloud-Enabled, Scalable, Distributed
• Powerful Integration
• Any Data Source, Real-Time to Batch
• World-Class Managed Operations and Expert Services
• Most Trusted Security Capabilities

APP 1, APP 2, APP 3

REAL TIME, BATCH, AD-HOC

Flexible Deployment Options: Public Cloud, Virtual Private Cloud, Dedicated Cluster, Enterprise Private Cloud

Agile Application Development Environment that is Scalable, Sustaining, Self Healing

Page 21: Big Data in The Cloud: Architecting a Better Platform

Big Data Platform as a Service

AD HOC, BATCH, REAL-TIME

ETL, Data Transformation, Business Intelligence, Data Mining, Advanced Analytics, Geolocation

INTERACTIVE: Hive w/ Tez, Impala
RELATIONAL: PostgreSQL
DOCUMENT: Elasticsearch, MongoDB
GRAPH: TitanDB
STREAM: Storm / Kafka
COLUMNAR: HBase, Accumulo, DataStax

HDFS, YARN, MapReduce, Spark

CSC Command and Control: Deployment Center, Operations Center, Support Center, Application Center, Knowledge Center

Enterprise Grade Security: Access Control, Compliance Support, Perimeter Security, Activity Monitoring, Audit Logging, Encryption, Malware Protection, Hardened OS

Flexible Deployment Options: Amazon Web Services, CSC Hybrid Cloud Services, CSC WebScale, Dedicated Hardware

Page 22: Big Data in The Cloud: Architecting a Better Platform

Events: HTTP(S) / TCP / UDP

Files: Direct Upload / FTP / FTPS / SFTP

Streams

Queries

Web Listener, File Store / Landing Zone, Kafka Queue, Storm

Hadoop: HDFS; MapReduce, Hive, Spark; HBase or Accumulo; Tez or Impala; Queries, Jobs

DataStax / TitanDB

Elasticsearch or MongoDB

Command & Control
• Splunk: Monitoring & Log File Analysis
• FreeIPA + LDAP: ID Access & Management
• Git: Versioning Control
• Jenkins: Continuous Integration
• Agility Server: IT Policy & Governance
• Puppet: Infrastructure as Code

Big Data PaaS – Standard Reference Architecture
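The two inbound paths on this reference architecture (events over HTTP(S) / TCP / UDP, files over Direct Upload / FTP / FTPS / SFTP) can be pictured as a simple routing rule. The sketch below is illustrative only: the transport groupings come from the slide, while the function name `entry_point` and the assumption that events enter through the Web Listener and files through the File Store / Landing Zone are a reading of the diagram, not something the presentation states.

```python
# Transport groupings as listed on the slide.
EVENT_TRANSPORTS = {"http", "https", "tcp", "udp"}
FILE_TRANSPORTS = {"direct upload", "ftp", "ftps", "sftp"}


def entry_point(transport: str) -> str:
    """Return the platform component assumed to receive data on this transport."""
    key = transport.strip().lower()
    if key in EVENT_TRANSPORTS:
        return "Web Listener"  # assumed entry point for events
    if key in FILE_TRANSPORTS:
        return "File Store / Landing Zone"  # assumed entry point for files
    raise ValueError(f"unsupported transport: {transport!r}")
```

For example, `entry_point("SFTP")` returns `"File Store / Landing Zone"`, and a transport not listed on the slide raises `ValueError`.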

Page 23: Big Data in The Cloud: Architecting a Better Platform

Command & Control
• $100M+ R&D investment
• 8+ years of R&D
• 25+ distinguished big data engineers
• 125+ related technology engineers (cloud, cybersecurity, etc.)
• Core committers to all major Big Data open source projects

Puppet
• Fully Automated Deployment
• Pre-built service orchestration scripts
• Pre-built integration connector scripts
• Comprehensive Configuration Management

Jenkins
• Automated, pre-built platform integration tests
• Framework for app-level integration testing

Splunk
• Detailed Log Monitoring & Troubleshooting
• Complete activity monitoring & audit trail
• Comprehensive system monitoring and alerting suite

FreeIPA + LDAP
• User Account and Permissions Management
• LDAP Integration

Git
• Platform and Application Version Control
• DevOps Push-Pull Application Code Delivery

Agility Server
• IT Policy & Governance Engine
• Hybrid Cloud Workload Interoperability

Page 24: Big Data in The Cloud: Architecting a Better Platform

Production Data Flows

LOCAL: VM/Sandbox or “local node” environment or “direct-dev” on BDPaaS

Push Code w/ Git

DEV / DR: Sample Data, Partial/Full Flows, or DR Replication
Storm, Kafka, HDFS, Hive, Impala, Elasticsearch, C&C

Test w/ Jenkins

Maintenance Window Push to Production

PRODUCTION
Storm, Kafka, HDFS, Hive, Impala, Elasticsearch, C&C

User Queries

Page 25: Big Data in The Cloud: Architecting a Better Platform

Production Data Flows

PRODUCTION
Storm, Kafka, HDFS, Hive, Impala, Elasticsearch, C&C
• ADD OR REMOVE NODES
• RECONFIGURE NODES
• RECONFIGURE OVERALL CLUSTER
• ADD OR REMOVE CLUSTERS
• SCALE UP OR SCALE DOWN CPU, RAM, DISK
• ADD OR REMOVE ENVIRONMENTS

DEV / DR: Sample Data, Partial/Full Flows, or DR Replication
Storm, Kafka, HDFS, Hive, Impala, Elasticsearch, C&C
• ADD OR REMOVE NODES
• RECONFIGURE NODES
• RECONFIGURE OVERALL CLUSTER
• ADD OR REMOVE CLUSTERS
• SCALE UP OR SCALE DOWN CPU, RAM, DISK
• ADD OR REMOVE ENVIRONMENTS

Page 26: Big Data in The Cloud: Architecting a Better Platform

[Architecture diagram labels]

Data sources: Teradata, Oracle RDBMS, Twitter, Logs, Video Files, IBM MQ

Connectors: Teradata Connector for Hadoop, Sqoop, Distcp, HTTP (GNIP), HTTP, Custom Connector, Kfk-Hdp Bulk Writer, Kfk-ES Record Writer

Platform: Storm, Kafka, HDFS, Hive, Impala, Elasticsearch, Hue, Sqoop, Command & Control

DR: BDR, Kfk Replicate

Consumers and interfaces: Tableau, ODBC, Kibana, API, Revolution R, SAS, Bulk Export, RHadoop / ScaleR

AWS services: EBS Volumes, VPN, Amazon S3, Amazon IAM, Amazon Storage Gateway, Direct Connect, Amazon CloudFront, Amazon CloudFormation, AMI Service, Glacier, Ephemeral Local Drives, Amazon RDS

Instance types: D2-Instances, R3-Instances, I2-Instances, C4-Instances, C3-Instances, M3-Instances

Page 27: Big Data in The Cloud: Architecting a Better Platform

Why Public Cloud

• Higher Resource Efficiency for Increased Savings

• Significantly Greater Workload and Resource Flexibility

• More compatible with software-defined-everything approach

• Shared Services (image service, identity management, object storage, block storage, telemetry, etc.)

• High Scale Cost Efficiency

• Hybrid Cloud Compatibility

Amazon Cloud Management

Page 28: Big Data in The Cloud: Architecting a Better Platform

Agenda

I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers

Page 29: Big Data in The Cloud: Architecting a Better Platform

Guarded with Enterprise Grade Security

Change Management, Physical Security, Backups, Disaster Recovery, and more…

Disk & Network Encryption
• Data at Rest Disk Encryption
• Encrypted Node-to-Node in Flight Communication
• S3 Encryption & EBS Volume Encryption -- AWS
• Secure transmission from encrypted customer facility to BDPaaS deployment

Std Operating Environment
• Deployment of a complete platform stack
• Virtualization ready & Anti-Virus ready
• Vulnerability Scanning, Penetration Testing and Security Patches

Malicious Code Protection
• CSC Endpoint Security (TrendMicro) -- Hardware security
• ClamAV -- Virtual Machine security
• Tripwire -- File Integrity monitoring

Compliance Support
• Audit support -- HIPAA, PCI, FISMA, ITAR etc
• Documentation support
• Compliance oversight
• Security enforcement / issue resolution*

Activity Monitoring
• Splunk -- activity monitoring and detailed system logging
• Cloudera Manager and Ambari -- Hadoop configuration information
• Puppet -- all non-Hadoop component configuration information

Access Control
• FreeIPA -- centralized user management and policy controls
• LDAP/AD integration
• Kerberos option
• Apache Knox option
• Apache Sentry option

Perimeter Security
• Secure VPN connections
• Isolated subnets
• Secure port management and fine-grained port monitoring
• IP whitelisting & blacklisting

Audit Logging
• ArcSight SIEM -- Security Event Management
• Managed audit operations personnel
• ArcSight via connector -- Splunk

Ensuring Data, Application, Platform Security, and Meeting Regulatory Requirements
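The "IP whitelisting & blacklisting" item under Perimeter Security can be illustrated with Python's standard `ipaddress` module. This is a minimal sketch, not the platform's implementation: the function name `is_allowed`, the deny-takes-precedence rule, and the CIDR ranges in the usage note are all assumptions made for the example.

```python
import ipaddress


def is_allowed(client_ip, allowlist, denylist):
    """Return True if client_ip is inside an allowed network and no denied one.

    Assumption for this sketch: a deny entry always overrides an allow entry.
    """
    addr = ipaddress.ip_address(client_ip)
    if any(addr in ipaddress.ip_network(net) for net in denylist):
        return False
    return any(addr in ipaddress.ip_network(net) for net in allowlist)
```

With a hypothetical allowlist of `["10.0.0.0/8"]` and denylist of `["10.0.5.0/24"]`, the address `10.0.1.7` is admitted, `10.0.5.9` is refused by the deny entry, and `192.168.1.1` is refused because it matches no allowed network.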

Page 30: Big Data in The Cloud: Architecting a Better Platform

Agenda

I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers

Page 31: Big Data in The Cloud: Architecting a Better Platform

Questions and Answers:

CSC Website http://www.csc.com/big_data/offerings/82345/105621-csc_big_data_platform_as_a_service_powered_by_infochimps

TheSource https://thesource.csc.com/Pages/Offerings/CSC-Big-Data-Platform-as-a-Service.aspx

Page 32: Big Data in The Cloud: Architecting a Better Platform

Thank You.

This presentation will be loaded to SlideShare the week following the Symposium.

http://www.slideshare.net/AmazonWebServices
