GRAU DataSpace Architecture

GRAU Data Space 2.0 – The Secure Communication Platform for Businesses and Organizations. YOUR DATA. YOUR CONTROL. 7 Dec 2013

Upload: thomas-uhl

Post on 14-Dec-2014

Category: Technology


DESCRIPTION

GRAU DataSpace provides FileShare & Sync for enterprises and managed service providers

TRANSCRIPT

Page 1: GRAU DataSpace Architecture

GRAU Data Space 2.0 – The Secure Communication Platform for Businesses and Organizations

YOUR DATA. YOUR CONTROL

7 Dec 2013

Page 2: GRAU DataSpace Architecture

Architectural Overview

● The GDS is built on a robust core that has been available for years

● The architecture scales from SMB (<100 users) to large enterprises and service providers (>100,000 users)

● The key features for scalability are:

– Separation between data and metadata (optional)

– Transactional, scalable storage backend

– Versioning of all file objects (UUID)

– Chunking of large objects (the chunk size can differ for each object)

– Hashing of chunked objects (offloading to the object store is possible)

– Chunk-level deduplication based on hash (under development)

– Bidirectional master/master replication of all data and metadata at folder level

– Session director allows redirection of sessions to another node

– RESTful APIs

– CMIS (getContentChanges)

– Distributable in-memory cache for metadata
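The chunking, hashing and deduplication features listed above can be illustrated with a minimal Python sketch. It is not GDS code: `ChunkStore`, `ObjectVersion`, `store_version` and `read_version` are hypothetical names, SHA-256 is simply one possible hash choice, and the deduplication step models a feature the slides describe as still under development.

```python
import hashlib
import uuid
from dataclasses import dataclass, field


@dataclass
class ChunkStore:
    """Hypothetical content-addressed store: chunk hash -> chunk bytes."""
    chunks: dict = field(default_factory=dict)

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical chunks map to the same key, so they are stored only once.
        self.chunks.setdefault(digest, data)
        return digest


@dataclass
class ObjectVersion:
    """One immutable version of a file object, identified by a UUID."""
    version_id: str
    chunk_size: int
    chunk_hashes: list


def store_version(store: ChunkStore, payload: bytes,
                  chunk_size: int = 512 * 1024) -> ObjectVersion:
    """Split the payload into chunks of a per-object size and store each one."""
    hashes = [
        store.put(payload[start:start + chunk_size])
        for start in range(0, len(payload), chunk_size)
    ]
    return ObjectVersion(str(uuid.uuid4()), chunk_size, hashes)


def read_version(store: ChunkStore, version: ObjectVersion) -> bytes:
    """Reassemble a version from its ordered chunk hashes."""
    return b"".join(store.chunks[h] for h in version.chunk_hashes)
```

Storing the same payload twice yields two versions with different UUIDs but adds no new chunks to the store, which is the effect chunk-level deduplication aims for.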

Page 3: GRAU DataSpace Architecture

Open interfaces

● Open standard interfaces

– WebDAV

– JSON/SOAP core API

– CIFS

● Gateways

– ownCloud

– CMIS 1.1 (SOAP, AtomPub, JSON)

● Identity Management

– Provisioning Gateway (LDAP, AD, SQL)

– Authentication Gateway (LDAP, AD, RADIUS)

Page 4: GRAU DataSpace Architecture

Architecture

[Diagram. Components shown: GDS core with the GDS2 API (JSON); front ends WebGUI, WebDAV, CIFS, CMIS GW, ownCloud GW and Adm GW with Admin GUI; storage backend as object store (Caringo, S3, SWIFT), FS/CIFS (NAS, GAM) or SQL (DB/2, Oracle, MySQL, Postgres); metadata in SQL (DB/2, Oracle, MySQL, Postgres).]

Page 5: GRAU DataSpace Architecture

Storage Backend (1)

● Storage backends:

– Filesystem (ext4, XFS)

– NAS / CIFS

– RDBMS (MySQL, Oracle, Postgres, MSSQL, DB2)

– Object stores (Caringo, S3, SWIFT)

● Plugins:

– Object chunking (size definable per object, 512 kB default)

– Hashing (MD5, SHA-1, SHA-256)

– Dedup on chunk-level [under development]

– Mirroring (one or many backends) [planned]

– Crypto (symmetrical) [planned]

– HSM [planned]
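One way to picture the plugin list above is as wrappers stacked around a storage backend. The following Python sketch is an assumption about how such a stack could look, not the GDS plugin API: `Backend`, `MemoryBackend`, `MirrorPlugin` and `HashingPlugin` are hypothetical names, and the in-memory backend stands in for a filesystem, RDBMS or object store.

```python
import hashlib
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical minimal backend interface (not the GDS plugin API)."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...


class MemoryBackend(Backend):
    """In-memory stand-in for a filesystem, RDBMS or object store."""

    def __init__(self):
        self.blobs = {}

    def write(self, key, data):
        self.blobs[key] = data

    def read(self, key):
        return self.blobs[key]


class MirrorPlugin(Backend):
    """Writes every object to all targets; reads from the first that has it."""

    def __init__(self, *targets: Backend):
        if not targets:
            raise ValueError("at least one target backend is required")
        self.targets = targets

    def write(self, key, data):
        for target in self.targets:
            target.write(key, data)

    def read(self, key):
        for target in self.targets:
            try:
                return target.read(key)
            except KeyError:
                continue  # try the next mirror
        raise KeyError(key)


class HashingPlugin(Backend):
    """Records a digest on write and verifies it on read."""

    ALGORITHMS = {"md5", "sha1", "sha256"}

    def __init__(self, inner: Backend, algorithm: str = "sha256"):
        if algorithm not in self.ALGORITHMS:
            raise ValueError(f"unsupported algorithm: {algorithm}")
        self.inner = inner
        self.algorithm = algorithm
        self.digests = {}

    def _digest(self, data: bytes) -> str:
        return hashlib.new(self.algorithm, data).hexdigest()

    def write(self, key, data):
        self.digests[key] = self._digest(data)
        self.inner.write(key, data)

    def read(self, key):
        data = self.inner.read(key)
        if self._digest(data) != self.digests[key]:
            raise IOError(f"digest mismatch for {key}")
        return data
```

A stack such as `HashingPlugin(MirrorPlugin(primary, replica))` writes each chunk to both backends and still returns it if one copy is lost, while a corrupted copy is rejected on read.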

Page 6: GRAU DataSpace Architecture

Storage Backend (2)

[Diagram. Components shown: GDS core above the storage backend; plugins Chunking (512 kB), Hashing (optional), Mirroring and Crypto (sym.); backend types object store (Caringo, RADOS, SWIFT/S3), filesystem (ext4, XFS), CIFS (NAS, GAM/Archive) and SQL (DB/2, Oracle, MySQL, Postgres).]

Page 7: GRAU DataSpace Architecture

Storage Backend (3)

[Diagram. Components shown: two GDS cores, each with metadata, an object store and the GDS2 API (JSON); metadata replication between them; RADOS OSDs, librados, RADOS GW and SWIFT.]

Page 8: GRAU DataSpace Architecture

Scalability / High availability

● Master/master replication at folder level

– Data, metadata

– Users, groups

– Access lists

● Shared-nothing architecture

– Horizontal scalability

– High availability

– Users that share a lot of folders can be relocated to the same node

– Adding or removing nodes dynamically

– Software updates on deactivated nodes

● Distributed metadata cache

– CMIS gateway allows session and metadata caching

● Session redirector (reverse proxy)

– Redirects the session to the home node of the user

– If the home node is down, one of the backup nodes will be used
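The home-node rule with fallback to a backup node can be sketched as follows. This is a minimal illustration under stated assumptions: `Account`, `pick_node` and the node names are hypothetical, and the set of healthy nodes is assumed to come from some health check that the slides do not describe.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical account record holding the routing information."""
    user: str
    home_node: str
    backup_nodes: list = field(default_factory=list)


def pick_node(account: Account, healthy_nodes: set) -> str:
    """Return the home node if it is up, otherwise the first healthy backup."""
    for node in [account.home_node, *account.backup_nodes]:
        if node in healthy_nodes:
            return node
    raise RuntimeError(f"no node available for user {account.user}")


# Usage: the home node is preferred; a backup is used only when it is down.
alice = Account("alice", "node-1", ["node-2", "node-3"])
print(pick_node(alice, {"node-1", "node-2"}))
print(pick_node(alice, {"node-2", "node-3"}))
```

Trying the backups in a fixed order is a design choice of this sketch; the slides only state that one of the backup nodes is used.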

Page 9: GRAU DataSpace Architecture

High availability

[Diagram. Components shown: two GDS cores, each with storage, metadata and the GDS2 API (JSON); replication of data and metadata between them; load balancers and GDS (Session) Directors in front.]

Page 10: GRAU DataSpace Architecture

Scalability (1)

[Diagram. Components shown: two GDS cores, each with metadata, data and the GDS2 API (JSON); master/master replication of metadata; an object store / cluster filesystem; load balancers and GDS (Session) Directors in front.]

Page 11: GRAU DataSpace Architecture

Scalability (2)

[Diagram. Components shown: several GDS cores, each with metadata (MD), data and the GDS2 API (JSON); metadata replication; a CMIS cache per node; an object store / cluster filesystem; load balancers and GDS (Session) Directors in front.]

Page 12: GRAU DataSpace Architecture

Multiple Sites - Roaming (1)

● Every user has a home node, which is stored in the account data

● Redundancy of file objects is provided by the object store at each site

● Users, groups and ACLs are synchronized between all sites

● File objects are not synchronized between sites

● Synchronization takes place asynchronously

● The load balancer directs client requests to the session director

● The session director redirects each request, based on the user account, to

– Home node of the user [my]

– Node which hosts shared data room [shared]

– Any node [global]

● Session director analyzes the request and forwards to

– CMIS caching layer

– JSON API layer
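The two routing decisions above (which node, then which layer) can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the lookup tables, the node names, the `/cmis/` path prefix used to recognise CMIS traffic, and the CRC32-based pick for the [global] case are not taken from the slides.

```python
import zlib

# Hypothetical lookup tables; a real deployment would read these from account data.
HOME_NODES = {"alice": "site-a-node-1", "bob": "site-b-node-1"}
ROOM_NODES = {"project-x": "site-b-node-2"}
ALL_NODES = ["site-a-node-1", "site-a-node-2", "site-b-node-1", "site-b-node-2"]


def target_node(scope: str, user: str, room: str = "") -> str:
    """Pick the node for a request according to its scope."""
    if scope == "my":
        return HOME_NODES[user]      # home node of the user
    if scope == "shared":
        return ROOM_NODES[room]      # node hosting the shared data room
    if scope == "global":
        # Any node will do; a stable hash keeps one user on one node.
        return ALL_NODES[zlib.crc32(user.encode()) % len(ALL_NODES)]
    raise ValueError(f"unknown scope: {scope}")


def target_layer(path: str) -> str:
    """Forward CMIS traffic to the caching layer, everything else to the JSON API."""
    return "cmis-cache" if path.startswith("/cmis/") else "json-api"


# Usage
print(target_node("my", "alice"), target_layer("/cmis/browser"))
print(target_node("shared", "bob", room="project-x"), target_layer("/api/files"))
```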

Page 13: GRAU DataSpace Architecture

Multiple Sites - Roaming (2)

[Diagram. Components shown: Site A and Site B, each with two GDS cores (MD, Data, GDS2 API, CMIS Cache), GDS Directors and load balancers (LB); client access via JSON and CMIS.]

Page 14: GRAU DataSpace Architecture

Identity Management (1)

● Separation between user provisioning and authentication

● Multiple instances of gateways are possible

● Multiple directories can be connected in parallel

● Provisioning gateway

– LDAP/AD/SQL crawler

– Users that match a regular expression are created in the GDS

– Users that are deleted from the directory are deactivated in the GDS

– SCIM/SAML module [planned]
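The crawler rules above (create users that match a regular expression, deactivate users that disappeared from the directory) can be sketched as a pure function. This is an illustration only: `sync_accounts`, the status strings and the example pattern are hypothetical, and the directory is reduced to a plain list of user names instead of an LDAP, AD or SQL query.

```python
import re


def sync_accounts(directory_users, gds_accounts, pattern):
    """Return the GDS account states after one provisioning pass.

    directory_users: user names currently present in the directory
    gds_accounts:    dict of user name -> "active" or "deactivated"
    pattern:         regular expression a user name must match to be created
    """
    matcher = re.compile(pattern)
    present = set(directory_users)
    result = dict(gds_accounts)

    # Create accounts for matching directory users that are not yet known.
    for name in present:
        if matcher.fullmatch(name) and name not in result:
            result[name] = "active"

    # Deactivate (rather than delete) accounts whose directory entry is gone.
    for name in result:
        if name not in present:
            result[name] = "deactivated"

    return result


# Usage: only users from the sales and it domains are provisioned.
directory = ["alice@sales.example", "bob@it.example", "svc-backup"]
existing = {"carol@sales.example": "active"}
print(sync_accounts(directory, existing, r".+@(sales|it)\.example"))
```

Deactivating instead of deleting keeps the user's data in place, which matches the wording of the slide.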

Page 15: GRAU DataSpace Architecture

Identity Management (2)

● Authentication gateway

– LDAP/AD/SQL module

– Multilevel authentication

– Google Authenticator [planned]

– RADIUS module [planned]

– MTAN/OTP module [planned]

● Single Sign-On [planned]

– Kerberos module

– OAuth2 module
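Multilevel authentication can be pictured as a password check followed by a one-time code. The sketch below is not the GDS gateway: `authenticate` and its parameters are hypothetical, the password check is a caller-supplied function standing in for an LDAP/AD/SQL module, and the second level uses the standard TOTP algorithm (RFC 6238, SHA-1, 30-second steps, 6 digits) that Google Authenticator apps implement.

```python
import base64
import hashlib
import hmac
import struct


def totp(secret_b32: str, at: float, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password as defined in RFC 6238."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = struct.pack(">Q", int(at) // step)
    mac = hmac.new(key, counter, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def authenticate(user, password, otp, now, check_password, otp_secrets) -> bool:
    """Two levels: directory password first, then the one-time code."""
    if not check_password(user, password):
        return False
    secret = otp_secrets.get(user)
    if secret is None:
        return False  # this sketch requires an enrolled second factor
    return hmac.compare_digest(totp(secret, now), otp)
```

In use, `now` would be `time.time()` and `check_password` would delegate to the directory; both are passed in so the function stays easy to test.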

Page 16: GRAU DataSpace Architecture

Identity Management (3)

[Diagram. Components shown: Provisioning Gateway and Authentication Gateway attached to the GDS core via the GDS2 API (JSON); directory sources labelled LDAP/AD, SAML, RADIUS and SQL; storage backend and metadata; WebGUI, Admin GW and Admin GUI.]

Page 17: GRAU DataSpace Architecture

Multi Tenancy

● Dedicated hardware

– Highest level of separation and security

– No performance impact of virtualization layer

● Full virtualization (KVM, Hyper-V, VMware, Xen)

– Highest level of separation and security in a virtualized environment

– Similar static memory pages can be shared between instances

– GDS version can be different for each tenant

● Linux Containers (LXC)

– Lightweight virtualization

– Memory and program files on disk can be shared between instances

● Single instance

– Same GDS version for all tenants

– Everything gets shared

– Software bugs or operational problems affect all tenants

Page 18: GRAU DataSpace Architecture

Distributed Data Space

[Diagram. Components shown: Sites A, B, C and D, each with a GDS node behind a firewall (FW) that serves CIFS on the local LAN; the sites are connected with JSON over HTTPS across the Internet.]

Page 19: GRAU DataSpace Architecture

Corporate CDN

[Diagram. Components shown: GDS nodes at Sites A, B, B1, B2 and C connected over HTTPS; labels CMIS Cache, SD and OS; client access via WebDAV, CIFS and CMIS.]

Page 20: GRAU DataSpace Architecture

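Cloud attached Data Space

[Diagram. Components shown: Site A and Site B, each with a GDS node behind a firewall (FW) that serves CIFS on the local LAN; JSON over HTTPS across the Internet to a group of GDS nodes behind a firewall and load balancers (LB).]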

Page 21: GRAU DataSpace Architecture

WWW: HTTP://WWW.GRAUDATA.COM/DATASPACE

E-MAIL: [email protected]

CEL: +49 151 54354373

TWITTER: @graudataspace

YOUR DATA. YOUR CONTROL.