storage @ the edge - snia · 2019-12-21 · set each pg as a edge storage placement rules -...

23
Storage @ the Edge Vinod Eswaraprasad Wipro Technologies

Upload: others

Post on 21-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 1Sensitivity: Internal & Restricted

Storage @ the Edge

Vinod EswaraprasadWipro Technologies

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 2Sensitivity: Internal & Restricted

What is Edge Computing? Certain applications need Autonomy

Self Driving car Industry 4.0 – smart plant Smart Home

Importance of latency Certain decisions are important – cant go all

the way to cloud Latency can be a reason for failure.

Application would need Significant bandwidth

Significance of Edge comes from IoT - autonomous systems

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 3Sensitivity: Internal & Restricted

Typical Edge Scenario

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 4Sensitivity: Internal & Restricted

Key Characteristics of Edge

Centralized - to de-centralized Real Time processing Real time control Local filtering and caching. At source data visualization....

Geographically distributed Scalable Autonomous and distributed Contextual and low latency Realtime interaction. Heterogenous

Data + Analysis + Learning

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 5Sensitivity: Internal & Restricted

Typical Edge Topology

Edge Micro DC

Edge Micro DC

Edge Micro DC

5-10ms

5-10ms 5-10ms

10-50 ms

100-xms

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 6Sensitivity: Internal & Restricted

Edge design Challenges

Complex Open Environment Environmental, Supportability

Remotely Located Autonomous, Robust

De-centralized / Distributed Millions of nodes

Price Sensitive.

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 7Sensitivity: Internal & Restricted

Edge workloads IoT data

Time series Historian

Video Un-structured Content search

Machine Learning/ Online learning

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 8Sensitivity: Internal & Restricted

Characterizing edge workloadCharacteristics Edge CloudLatency Low High

Bandwidth Very Low High

Response Time Low High

Storage Capacity Low High

Server Overhead Very Low high

Energy Consumption Low High

Scale High Medium

QoS ad QoE High Medium

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 9Sensitivity: Internal & Restricted

Timeseries in current times

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 10Sensitivity: Internal & Restricted

Time Series workload characteristics

Write-mostly is common; perhaps 95% to 99% Writes are almost always sequential appends Updates are rare Deletes are in bulk

=> immutable storage format is better?

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 11Sensitivity: Internal & Restricted

The edge topology

Centralized De-Centralized Distributed

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 12Sensitivity: Internal & Restricted

Storage requirements at Edge.

Lowest latency Less than 10ms

Distributed view, isolated actions One network operation should not affect others

Local store and forward capability Ability to reduce intra node traffic

Ability to aggregate and send to core Avoid less data transfer over network.

Data Mobility Enable device movement across edges

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 13Sensitivity: Internal & Restricted

What storage is best for Edge?

Block based? ( compatibility? Cant have local formats )

Distributed File? ( HDFS, PVFS, Lustre - centralized )

Object store? ( Decentralized ? )

Anything Better?

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 14Sensitivity: Internal & Restricted

Our Analysis

Analyze edge storage specific parameters

Well known object storage implementation De-centralized

Choices were:

RADOS - CRUSH IPFS (InterPlanetary File System ) - DHT SWARM - DTH ( updated )

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 15Sensitivity: Internal & Restricted

RADOS (Reliable Autonomic distributed Object Store)

RADOS CRUSH (Controlled Replication Under

Scalable Hashing) Set each PG as a Edge Storage Placement rules - restrict data to a local

site Less intra Edge communication CRUSH Maps equates to Local DC

Allows striping for higher performance Maps well to the decentralization of

Edge

Placement group -1

Placement group -2

Placement group -3

Clients

CRUSH

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 16Sensitivity: Internal & Restricted

IPFS

IPFS (InterPlanetary File System ) Distributed Hash Table ( DHT) approach Content based Hashes - in a Merkle DAG DHT - stores the pointer to data - one level

indirection Always makes a local copy Built-in security and data reliability Client

request

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 17Sensitivity: Internal & Restricted

SWARM

Decentralized - Peer to Peer Bzz protocol - slightly different than DHT

Distributed content-addressed chunk store (4K) Better for small chunks of data

Content Addressed

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 18Sensitivity: Internal & Restricted

Feature comparison

Data proximity

Network isolation

Store n Forward

Data Migration Remarks

RADOS Good Good Bad Moderate Algorithmic, less chatty, paxos

IPFS Good Bad Moderate Good Data management, O (logN) lookup

SWARM Good Bad Good Good Smaller sized IO,

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 19Sensitivity: Internal & Restricted

Parameters

Latency Impact on number of nodes: 10,50,100 Latency for 4K, 1M, 10M object size

Network traffic Network traffic as the node scale (10,50,100)

Remote/Local Reads The time taken for reading data on remote node

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 20Sensitivity: Internal & Restricted

Findings

IPFS has lowest latency and scale IPFS has the lowest remote reads SWARM has lowest network traffic RADOS has lowest data traffic for co-ordination

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 21Sensitivity: Internal & Restricted

Future of Edge storage? Tailored for edge workload

Time series - writes, append, no updates ML/DL/Online Learning

Bring intelligence to storage Intelligent comes from data To process data you need intelligence Edge storage will have built-in AI capability

Smart data - hierarchical Data payload Metadata Detachable code with data.

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 22Sensitivity: Internal & Restricted

Conclusion

Edge requirement is here to stay Need for an Edge specific storage requirement

exists No new architectural changes are proposed

today An opportunity to innovate.

2018 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved. 23Sensitivity: Internal & Restricted

Thank You