GridGain 6.0: Open Source In-Memory Computing Platform - Nikita Ivanov


Upload: jaxlondon2014

Post on 17-Jul-2015


Page 1

© 2014 GridGain Systems, Inc.

NIKITA IVANOV, Founder & CTO, @c64hacker

Apache Ignite: Real-Time Big Data with In-Memory Data Fabric

www.gridgain.com #gridgain

Page 2

Agenda

• History of GridGain/Apache Ignite
• Evolution of In-Memory Computing
• In-Memory Data Fabric
• Distributed Cluster & Compute
  – Coding Example
• Distributed Data Grid
  – Coding Examples
• Distributed Streaming & CEP
• Plug-n-Play Hadoop Accelerator

Page 3

What is In-Memory Computing

• High Performance & Low Latencies
• Faster than Disk and Flash
• Cost Effective
• Distributed or Not
• Caching, Streaming, Computations
• Data Querying – SQL or Unstructured
• Volatile and Persistent
• OLAP and OLTP Use Cases

Page 4

Evolution of In-Memory Computing

[Diagram labels] Caching, Distributed Caching, In-Memory Data Grids, IMDBs, Database IM options, Hadoop accelerators, Streaming, BI accelerators

[Diagram labels] Clustering & Compute Grid, Data Grid, Streaming, Hadoop Acceleration

Page 5

Existing Market is Fragmented

Company      Product                               Proprietary/Open Source  Characterization
Oracle       In-Memory Option for Oracle Database  Proprietary              Cost Option
Oracle       TimesTen                              Proprietary              Point Solution - IMDB
Oracle       Coherence                             Proprietary              Point Solution - IMDG
SAP          HANA                                  Proprietary              Point Solution - IMDB
Microsoft    SQL Server 2014                       Proprietary              Feature Upgrade
Databricks   Apache Spark                          Open Source              Point Solution - Hadoop
VoltDB       VoltDB                                Open Source              Point Solution - IMDB
Aerospike    Aerospike                             Open Source              Point Solution - NoSQL DB
IBM          DB2 with BLU Acceleration             Proprietary              Feature Upgrade
Software AG  Terracotta                            Open Source              Point Solution - IMDG
Hazelcast    Hazelcast                             Open Source              Point Solution - IMDG

Page 6

In-Memory Data Fabric: Strategic Approach to IMC

• Supports all Apps
• Open Source - Apache 2.0
• Simple Java APIs
• 1 JAR Dependency
• High Performance & Scale
• Automatic Fault Tolerance
• Management/Monitoring
• Runs on Commodity Hardware
• Supports existing & new data sources
• No need to rip & replace

[Diagram labels] Clustering & Compute Grid, Data Grid, Streaming, Hadoop Acceleration

Page 7

Clustering & Compute

• Zero Deployment
• Pluggable SPI Design
• Full Cluster Management
• Direct API for MapReduce
• Direct API for Fork/Join
• Cron-like Task Scheduling
• State Checkpoints
• Early and Late Load Balancing
• Automatic Failover
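The MapReduce and Fork/Join bullets above describe a split, execute, aggregate pattern. The sketch below illustrates that pattern in plain Java only; it does not use the GridGain API. A local thread pool stands in for cluster nodes, and the class and method names (`MapReduceSketch`, `countNonSpaceChars`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MapReduceSketch {
    // Map step: one job per word. Reduce step: sum of the job results.
    // Assumption: pool threads stand in for remote cluster nodes.
    static int countNonSpaceChars(String text, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (String word : text.trim().split("\\s+")) {
                jobs.add(word::length);          // each job computes one partial result
            }
            int total = 0;
            for (Future<Integer> f : pool.invokeAll(jobs)) {
                total += f.get();                // aggregate the partial results
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countNonSpaceChars("Hello In-Memory World", 3));
    }
}
```

On a real grid the jobs would also be subject to load balancing and failover, which this local sketch does not model.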

Page 8

Automatic Cluster Discovery

Page 9

Closure Execution

Page 10

Closure Execution

Page 11

In-Memory Caching and Data Grid

• Distributed In-Memory Key-Value Store
• Replicated and Partitioned
• TBs of data, of any type
• On-Heap and Off-Heap Storage
• Backup Replicas / Automatic Failover
• Distributed ACID Transactions
• SQL queries and JDBC driver
• Collocation of Compute and Data
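To make "Partitioned" and "Backup Replicas" concrete, here is a minimal sketch of mapping a key to a partition and then to a primary node plus backups. It is plain Java under simplifying assumptions: the modulo-based assignment and the names (`PartitionSketch`, `owners`) are hypothetical, and a production data grid would typically use a more elaborate affinity function.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionSketch {
    // Maps a key to one of a fixed number of partitions.
    static int partition(Object key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    // Returns the nodes holding the key: the first entry is the primary,
    // the remaining entries are backups. Assumption: simple round-robin placement.
    static List<String> owners(Object key, List<String> nodes, int partitions, int backups) {
        int part = partition(key, partitions);
        int copies = Math.min(backups + 1, nodes.size()); // never more copies than nodes
        List<String> result = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            result.add(nodes.get((part + i) % nodes.size()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("A", "B", "C");
        System.out.println(owners(42, nodes, 8, 1));
    }
}
```

If the primary node fails, a backup already holds the data and can take over, which is the idea behind the automatic failover bullet.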

Page 12

Cache Operations

Find a Bug?

Page 13

Cache Transaction

Page 14

Distributed Java Data Structures

• Distributed Map (cache)
• Distributed Set
• Distributed Queue
• CountDownLatch
• AtomicLong
• AtomicSequence
• AtomicReference
• Distributed ExecutorService

Page 15

Client-Server vs. Affinity Colocation

[Diagram labels] Client-Server, Affinity Colocation
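The contrast between the two models can be sketched as follows: in client-server the data travels to the caller, while with affinity colocation the computation travels to the node that owns the key and only the result comes back. This is a plain-Java illustration, not the GridGain API; the in-process maps stand in for nodes, and all names (`ColocationSketch`, `affinityCall`, `valuesMoved`) are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ColocationSketch {
    // Each "node" is modeled as a local map; a real cluster spreads these across machines.
    final List<Map<String, int[]>> nodes = List.of(new HashMap<>(), new HashMap<>());
    long valuesMoved = 0; // counts values crossing the simulated network

    Map<String, int[]> ownerOf(String key) {
        return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
    }

    void put(String key, int[] data) {
        ownerOf(key).put(key, data);
    }

    // Client-server: pull the whole value to the caller, then compute locally.
    long sumClientServer(String key) {
        int[] data = ownerOf(key).get(key);
        valuesMoved += data.length;
        long sum = 0;
        for (int v : data) sum += v;
        return sum;
    }

    // Affinity colocation: run the job where the data lives, move only the result.
    <R> R affinityCall(String key, Function<int[], R> job) {
        R result = job.apply(ownerOf(key).get(key));
        valuesMoved += 1;
        return result;
    }

    public static void main(String[] args) {
        ColocationSketch grid = new ColocationSketch();
        grid.put("k", new int[] {1, 2, 3, 4});
        Function<int[], Long> sum = data -> {
            long s = 0;
            for (int v : data) s += v;
            return s;
        };
        System.out.println(grid.sumClientServer("k") + " moved=" + grid.valuesMoved);
        System.out.println(grid.affinityCall("k", sum) + " moved=" + grid.valuesMoved);
    }
}
```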

Page 16

In-Memory Streaming & CEP

• Streaming Data Never Ends
• Branching Pipelines
• CEP Sliding Windows
• Pluggable Routing
• Real Time Analysis
• At Least Once Guarantee
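Because a stream never ends, analysis runs over a bounded window of recent events. The sketch below shows a count-based sliding window that keeps the last N values and reports their running average. It is plain Java for illustration only; the class name `SlidingWindowSketch` is hypothetical, and the choice of a count-based (rather than time-based) window is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SlidingWindowSketch {
    private final Deque<Double> window = new ArrayDeque<>();
    private final int capacity;
    private double sum;

    SlidingWindowSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds an event, evicts the oldest one if the window is full,
    // and returns the average over the events currently in the window.
    double add(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > capacity) {
            sum -= window.removeFirst();
        }
        return sum / window.size();
    }

    public static void main(String[] args) {
        SlidingWindowSketch w = new SlidingWindowSketch(3);
        for (double v : new double[] {1, 2, 3, 4}) {
            System.out.println(w.add(v));
        }
    }
}
```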

Page 17

Plug-n-Play Hadoop Accelerator

• Up to 100x Acceleration
• In-Memory Native MapReduce
  – In-Process Data Colocation
  – Eager Push Scheduling
• GGFS In-Memory File System
  – Pure In-Memory
  – Write-Through to HDFS
  – Read-Through from HDFS
• Sync and Async Persistence
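The write-through and read-through modes listed above can be sketched with an in-memory layer in front of a slower backing store. This is a plain-Java illustration, not GGFS code: a `Map` stands in for HDFS, the names (`ReadWriteThroughSketch`, `storeReads`) are hypothetical, and only synchronous persistence is shown.

```java
import java.util.HashMap;
import java.util.Map;

class ReadWriteThroughSketch {
    private final Map<String, byte[]> memory = new HashMap<>();
    private final Map<String, byte[]> store; // stand-in for the backing file system
    int storeReads = 0;                      // counts reads that reached the store

    ReadWriteThroughSketch(Map<String, byte[]> store) {
        this.store = store;
    }

    // Write-through: update memory and the backing store in the same call.
    void write(String path, byte[] data) {
        memory.put(path, data);
        store.put(path, data);
    }

    // Read-through: serve from memory, falling back to the store on a miss.
    byte[] read(String path) {
        byte[] data = memory.get(path);
        if (data == null) {
            storeReads++;
            data = store.get(path);
            if (data != null) {
                memory.put(path, data); // keep it in memory for later reads
            }
        }
        return data;
    }

    public static void main(String[] args) {
        Map<String, byte[]> store = new HashMap<>();
        store.put("/a", new byte[] {1});
        ReadWriteThroughSketch fs = new ReadWriteThroughSketch(store);
        fs.read("/a");
        fs.read("/a");
        System.out.println("store reads: " + fs.storeReads);
    }
}
```

An asynchronous variant would queue the store update instead of performing it inside `write`, trading durability at the moment of the call for lower write latency.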

Page 18

In-Memory Native MapReduce

• In-Memory Native MapReduce
  – Zero Code Change
  – Use existing MR code
  – Use existing Hive queries
• No Name Node
• No Network Noise
• In-Process Data Colocation
• Eager Push Scheduling

Page 19

DevOps Management and Monitoring

Page 20

THANK YOU

www.gridgain.com #gridgain