#geodesummit: democratizing fast analytics with ampool (powered by apache geode)
TRANSCRIPT
Democratizing Fast Analytics with Ampool (Powered by Apache Geode-incubating)
Avinash Dongre and Robert Geiger, Ampool Inc.
Analytics needs to work in CLOSED LOOP with Apps
[Figure: loop between Analytics and Apps; labels: Analytics, Apps, Multi-Device Testing]
Analytics needs to be faster!
What are the CHALLENGES?
⚠ Many data users/stakeholders
⚠ Disparate tools & processing needs
⚠ Long time to insights
… FAST OBJECT ACCESS across the pipeline
[Figure: pipeline stages labeled Ingest, ETL, Analytics, App Use; stakeholders: Data Architect, Data Developers, Business Analysts, Data Scientists]
What ENABLERS can help here?
In-Memory Fabric Technology
• Apache Geode!
• Flexible, stable, and proven distributed in-memory technology
New memory technologies and fast network fabrics
• Storage Class Memory
  • Low latency, high throughput, persistent
  • Initially exposed via file system interface
  • Regular or memory mapped
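The "regular or memory mapped" distinction above can be illustrated with a minimal Python sketch. It is not Ampool, Geode, or Mnemonic code: an ordinary temporary file stands in for a file on an SCM-backed file system, and the helper names (write_regular, write_mapped, read_back, demo) are hypothetical.

```python
import mmap
import os
import tempfile


def write_regular(path, offset, payload):
    """Update bytes through ordinary file I/O (explicit seek and write calls)."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)


def write_mapped(path, offset, payload):
    """Update bytes through a memory mapping (slice assignment, then flush)."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(payload)] = payload
            m.flush()  # push the modified pages back to the file


def read_back(path, offset, length):
    """Read bytes back with ordinary file I/O to check both write paths."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo():
    """Write via both paths into a 4 KiB stand-in file and read the results."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\x00" * 4096)
        os.close(fd)
        write_regular(path, 0, b"regular")
        write_mapped(path, 64, b"mapped")
        return read_back(path, 0, 7), read_back(path, 64, 6)
    finally:
        os.remove(path)
```

Both paths end up with the same bytes in the file. The mapped path treats the file as a byte array in the process address space, which is the access style that byte-addressable memory favors.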
Emerging Storage Class Memory (SCM) is DISRUPTIVE
Challenges the value proposition of in-memory solutions
Near-DRAM latency and throughput at lower cost
Based on one of several types of memory technology
• MRAM (magnetic)
• ReRAM (resistive)
• FRAM (ferroelectric), PCM (phase change)
• 3D XPoint™ (Intel/Micron)
Accessible via Java and C/C++ libraries
• Mnemonic (Java)
• Pmem.io (C++)
In-Memory Technology CHALLENGES
Line between memory and storage is blurring
File systems getting really fast, so the speed gap is closing
• SCM file systems will also be low latency
• File system overhead still limits latency improvements
• Before: disk based vs. in-memory
• After: file system vs. byte-addressable object store
Managing multiple layers and types of memory
Fast Closed Loop Analytics, Powered by a Smart, Distributed In-Memory Fabric…
High throughput and large data handling matters
• Throughput, latency, and capacity: each pipeline stage values these differently
Common interfaces, multiple region types
• Meet the needs of many types of best-of-breed engines
Managing multiple layers of memory and storage
• Speed (latency, throughput) differentiator will diminish
More classifications for data now
• Hot, cold => hot, warm, lukewarm, cold
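The finer-grained hot/warm/lukewarm/cold classification can be sketched as a rule based on time since last access. This is only an illustration under stated assumptions: the thresholds, the tier-to-medium mapping, and the function names are hypothetical and do not describe how Ampool or Apache Geode tier data.

```python
# Hypothetical age limits in seconds since last access, checked in order.
TIER_LIMITS = [
    ("hot", 60),          # touched within the last minute
    ("warm", 3600),       # touched within the last hour
    ("lukewarm", 86400),  # touched within the last day
]

# Hypothetical mapping from tier to storage medium, for illustration only.
PLACEMENT = {"hot": "DRAM", "warm": "SCM", "lukewarm": "SSD", "cold": "disk"}


def classify(last_access, now):
    """Return the tier name for an item last accessed at `last_access`."""
    age = now - last_access
    for tier, limit in TIER_LIMITS:
        if age < limit:
            return tier
    return "cold"  # older than every limit


def place(last_access, now):
    """Return the (assumed) storage medium for an item's current tier."""
    return PLACEMENT[classify(last_access, now)]
```

For example, `place(0, 30)` selects the hot tier, while an item untouched for several days falls through every limit and is treated as cold.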
…must handle MULTIPLE needs in one fabric
Need for High Throughput: early stages (ingest, ETL)
Need for Low Latency: later stages (data-driven insights & actions)
What Matters for App, DB, and Compute?
The flexibility, suitability, and ease of use of the interfaces
Memory & storage are managed transparently to provide QoS
The service guarantee abstractions are provided
Conflicts are managed and prevented
Freeing developers from re-inventing the wheel
Smart Distributed In-Memory Object Store
…for MANAGED FLEXIBILITY...
+ 3D XPoint™ ......
✅ Flexible regions and interfaces for ‘Best of breed’ engines
✅ Extensible Core
✅ Pluggable stores
…and FAST OBJECT ACCESS across the pipeline
[Figure: pipeline stages labeled Ingest, ETL, Analytics, App Use; stakeholders: Data Architect, Data Developers, Business Analysts, Data Scientists]
In-Memory Distributed Sys
Low-latency Comms.
Key-Value Store
Function Pushdown
+
High Throughput
Table Store
Native Interface
Pluggable Store Manager
Java API
MASH (CLI Ext)
Java API
Building on PROVEN In-memory Technology
Smart Data Tiering
Mature Event Model
Tunable Consistency
Metadata/Catalog
Security AuthZ
No change in data application code
Config. changes only
No change in user experience
Performance benefits
No added hassles
Current mgmt. tools
…and deliver VALUE to all Analytics stakeholders
Data Architect, Data Developers
Business Analysts, Data Scientists
Data Admins, Infra/Sys Admins
Contributing Back
Plan for contributions back to Apache Geode:
• Storage plug-ability layer
• Off-heap memory plug-ability
• SCM plugin (Mnemonic)
• Impersonation support for security
• Region type plug-ability
Thank You!
Avinash Dongre, Architect, Ampool [email protected]
Robert Geiger, Chief Architect & VP Engineering, Ampool [email protected]