Meetup: Case Study - HPCC Systems implementation for an Aviation company

Upload: hpcc-systems

Post on 07-Apr-2017

TRANSCRIPT

Page 1: Meetup: Case Study - HPCC Systems implementation for an Aviation company

WHT/082311

1 | ©2017, Cognizant

Hammer and beyond – An ensembling journey

Page 2: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Who am I?

Sunil Babu Peethambaram

Architect, Cognizant Technology Solutions, CTSH (NASDAQ)

Total IT experience – 13+ years

Consulting with LexisNexis since 2013 (Chennai, Dayton, Buford, Alpharetta)

Experience in HPCC Systems – more than 3 years

Domains worked on:

• Supply Chain Management

• Logistics

• Retail –

  • Merchandise and Store operations

  • Order Management

  • Warehouse Management Systems

• Insurance

• Healthcare

• Aviation

Page 3: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Problem statement - How did it all start

Build valid flight connections (VFC) based on direct flight schedules (DFS)

DFS come in a proprietary encoded format

DFS span 1000 carriers and over 4 million records

DFS are for a year or more into the future

DFS change every day, and VFC need to be versioned (potentially) for every day

Building VFC requires evaluating feasibility of over 16 trillion potential connections

Valid connections to be identified by applying:

• Circuitry

• Cabotage

• BIETA and LCC

• Schedule conflicts

• Over 100,000 MCT rules, applied in sequence
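To make the scale and the rule pipeline above concrete, here is a minimal Python sketch. It first checks the 16 trillion figure (every one of roughly 4 million records paired with every other), then pairs a handful of direct flights and applies rules one after another. The `Flight` fields, the three toy rules, and the 45-minute minimum are hypothetical stand-ins; the deck does not describe the real Circuitry, Cabotage, BIETA/LCC, or MCT logic.

```python
from dataclasses import dataclass
from itertools import product

# Scale check: pairing ~4 million records with each other.
RECORDS = 4_000_000
CANDIDATE_PAIRS = RECORDS * RECORDS  # 16 trillion ordered pairs


@dataclass(frozen=True)
class Flight:
    carrier: str
    origin: str
    destination: str
    departure: int  # minutes since midnight (hypothetical encoding)
    arrival: int


# Hypothetical stand-in rules; each returns True if the pair is still feasible.
def same_airport(first, second):
    return first.destination == second.origin


def not_round_trip(first, second):
    return second.destination != first.origin


def meets_min_connection_time(first, second, minimum=45):
    return second.departure - first.arrival >= minimum


# Rules run in list order; all() stops at the first rule that fails.
RULES = [same_airport, not_round_trip, meets_min_connection_time]


def valid_connections(flights):
    """Yield every ordered pair of flights that passes all rules."""
    for first, second in product(flights, repeat=2):
        if all(rule(first, second) for rule in RULES):
            yield first, second


flights = [
    Flight("AA", "JFK", "LHR", 600, 1020),
    Flight("BA", "LHR", "DEL", 1080, 1560),
    Flight("BA", "LHR", "JFK", 1100, 1500),  # rejected: returns to origin
    Flight("BA", "LHR", "CDG", 1030, 1100),  # rejected: only 10 minutes to connect
]
connections = list(valid_connections(flights))
```

Only the JFK to LHR to DEL pair survives here. Putting the cheapest rules first keeps most of the candidate pairs away from the expensive ones, which matters when the candidate space is this large.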

Page 4: Meetup: Case Study - HPCC Systems implementation for an Aviation company

The Legacy Setup

• Complex Business Logic

• Data intensive

• .NET/SQL Server

• Local datacenter

• Scaled-up architecture

• Ageing hardware

• Sequential processing

• Low fault tolerance

• Stale data delivery

• 24 X 7 life support

Page 5: Meetup: Case Study - HPCC Systems implementation for an Aviation company

The ask

SOS!

Page 6: Meetup: Case Study - HPCC Systems implementation for an Aviation company

The ask

Not really!

Page 7: Meetup: Case Study - HPCC Systems implementation for an Aviation company

The ask

Relevant data delivery – faster processing, parallelize independent tasks

Don’t marry the hardware (just friends with benefits)

Performance as a configuration (take your time, hurry up, choice is yours, don't be late)

Fail fast, recover faster

Onboard new customers quickly

Automated data delivery pipeline

Better maintainability – support and enhance the complex business logic

Page 8: Meetup: Case Study - HPCC Systems implementation for an Aviation company

So What?

Page 9: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Every project has complex business logic

Page 10: Meetup: Case Study - HPCC Systems implementation for an Aviation company

But we have to generate hundreds of millions of records…

Page 11: Meetup: Case Study - HPCC Systems implementation for an Aviation company

…which means we have a “big data” problem

Page 12: Meetup: Case Study - HPCC Systems implementation for an Aviation company

And we are going to do whatever it takes…

Page 13: Meetup: Case Study - HPCC Systems implementation for an Aviation company

OK Google.. What is big data?

Page 14: Meetup: Case Study - HPCC Systems implementation for an Aviation company

This is what we got!

Page 15: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Our problem was different

Page 16: Meetup: Case Study - HPCC Systems implementation for an Aviation company

We have a big “data problem”

and the answers are a whole lot bigger!!!

Page 17: Meetup: Case Study - HPCC Systems implementation for an Aviation company

So, why HPCC Systems?

Why not?

Page 18: Meetup: Case Study - HPCC Systems implementation for an Aviation company

So, why HPCC Systems?

Our use case was data intensive and batch oriented

Embarrassingly parallel

ECL was built specifically for distributed data processing and gave us the fine control we needed

Been there, done that: a lot of real-world experience to tap into

Access to the HPCC Systems development team

It’s performant and maintainable

We did a proof of concept and validated the fit anyway:

• A 45-minute job ran in 1 second

• A 4-hour job ran in 90 seconds

• A proof of concept planned for 4 weeks was completed in 4 days
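The "embarrassingly parallel" point can be illustrated with a short Python sketch. It assumes (this is not stated in the deck) that work is partitioned by the airport where a connection would happen, so each partition can be processed without data from any other. The schedule rows and function names are hypothetical, and a thread pool stands in for the cluster nodes that would do this work on HPCC Systems.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# (flight_id, origin, destination): hypothetical, simplified schedule rows.
SCHEDULE = [
    ("F1", "JFK", "LHR"),
    ("F2", "LHR", "DEL"),
    ("F3", "LHR", "CDG"),
    ("F4", "CDG", "DEL"),
]


def partition_by_hub(schedule):
    """Group flights by the airport where a connection could happen."""
    inbound, outbound = defaultdict(list), defaultdict(list)
    for flight_id, origin, destination in schedule:
        inbound[destination].append(flight_id)
        outbound[origin].append(flight_id)
    # Only airports with both arrivals and departures can host a connection.
    hubs = inbound.keys() & outbound.keys()
    return {hub: (inbound[hub], outbound[hub]) for hub in hubs}


def connect_at_hub(item):
    """Build candidate pairs for one hub; needs no data from other hubs."""
    hub, (arrivals, departures) = item
    return [(hub, a, d) for a in arrivals for d in departures]


def build_candidates(schedule, workers=4):
    """Process every hub partition independently and merge the results."""
    partitions = sorted(partition_by_hub(schedule).items())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(connect_at_hub, partitions))
    return [pair for hub_pairs in results for pair in hub_pairs]


candidates = build_candidates(SCHEDULE)
```

Because no partition reads another partition's data, adding workers (or nodes) changes only how long the job takes, not its result.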

Page 19: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What did Bill have to say about it?

Page 20: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Why AWS?

Bring a multi-node HPCC Systems cluster up or down at a click of a button

Scale up or down with zero upfront cost

Validate multiple configurations for performance and choose the best

And…

No need for Data Centers

Pay as you USE

Go Global

Speed of computing

Page 21: Meetup: Case Study - HPCC Systems implementation for an Aviation company

High level flow

Page 22: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Inside HPCC Systems

Data warehouse as Source of Truth

Data warehouse is the base on which our solution was built.

Follows a push-pull architecture

Raw data from different sources is cleansed and transformed into data cubes (push).

The cubes act as views that are used by downstream applications (pull), e.g. the Connection builder.

The data warehouse is the only way data enters the distributed data processing system

All views follow a common interface through which data can be accessed
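The push-pull idea and the common view interface can be sketched in Python as follows. Every name here (`View`, `CubeView`, `Warehouse`, `push`, `pull`, `cleanse`) is hypothetical and chosen for illustration; the actual solution was written in ECL and its interfaces are not shown in the deck.

```python
from typing import Callable, Dict, Iterable, List, Optional, Protocol

Record = Dict[str, str]


class View(Protocol):
    """Common interface that every warehouse view exposes (hypothetical)."""

    name: str

    def read(self) -> List[Record]: ...


class CubeView:
    """A cleansed, transformed data set that downstream jobs can read."""

    def __init__(self, name: str, rows: Iterable[Record]) -> None:
        self.name = name
        self._rows = list(rows)

    def read(self) -> List[Record]:
        return list(self._rows)  # copy, so readers cannot alter the cube


class Warehouse:
    """Single entry point for data: push raw data in, pull views out."""

    def __init__(self) -> None:
        self._views: Dict[str, View] = {}

    def push(self, name: str, raw: Iterable[Record],
             transform: Callable[[Record], Optional[Record]]) -> None:
        # transform returns a cleansed record, or None to drop the record.
        cleaned = (transform(record) for record in raw)
        self._views[name] = CubeView(name, [r for r in cleaned if r is not None])

    def pull(self, name: str) -> View:
        return self._views[name]


def cleanse(record: Record) -> Optional[Record]:
    """Toy cleansing step: trim and upper-case, drop rows with no carrier."""
    cleaned = {key: value.strip().upper() for key, value in record.items()}
    return cleaned if cleaned.get("carrier") else None


warehouse = Warehouse()
warehouse.push("schedules", [{"carrier": " aa ", "origin": "jfk"},
                             {"carrier": "", "origin": "lhr"}], cleanse)
rows = warehouse.pull("schedules").read()
```

A downstream job such as the connection builder would only ever call `pull(...).read()`, so a new data source can be added behind `push` without touching the consumers.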

Page 23: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Lifecycle of a view in DW

Page 24: Meetup: Case Study - HPCC Systems implementation for an Aviation company

How did we fare?

Metrics | Measure (Legacy – UTG) | Measure (HPCC Systems)

Building connections (Singles) | 40 hours | 1 hour

Lines of Code | 26535 (not including SQL) | 3973

Delivery Frequency | Weekly | Daily (possible)

Hardware | 24 GB and 12 cores for Batch Server; 384 GB and 24 cores for SQL Server | Thor Master + Middleware – 16 GB; Thor Slaves 64 GB – 16 cores across 4 nodes

AWS

4.4 million

100 million

13.5 million

Page 25: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Happy Side Effects

Data Warehouse as a framework for new data sources

Data Warehouse as an interface for downstream applications

Plug and play by design

File builder template – Blueprint for all data delivery jobs

Unit testing framework for HPCC Systems

Regression testing suite – Can run all tests in the code base and provide a report

We integrated the comparison testing tool from LNR into Hammer

An HPCC Systems cluster can now be built in AWS at the click of a button (Puppet)

Seamless sync between external FTP location and landing zone through S3

Page 26: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 27: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 28: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 29: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 30: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 31: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 32: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 33: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 34: Meetup: Case Study - HPCC Systems implementation for an Aviation company

What next?

Page 35: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Questions?

Page 36: Meetup: Case Study - HPCC Systems implementation for an Aviation company

Thank you

Reach out to me: [email protected]

Useful links

Cognizant: http://www.cognizant.com

FlightGlobal: http://www.flightglobal.com

HPCC Systems Portal: http://hpccsystems.com

Machine Learning: http://hpccsystems.com/ml

Online Training: http://learn.lexisnexis.com/hpcc

HPCC Systems Wiki & Red Book: https://wiki.hpccsystems.com

Our GitHub portal: https://github.com/hpcc-systems

Community Forums: http://hpccsystems.com/bb

Documentation: https://hpccsystems.com/download/documentation