data storage in clouds

Data storage in clouds. Athens OpenStack User Group #OSATH, 12th Meetup, 22nd April 2015. Thanassis Parathyras, [email protected], @parathyras

Upload: thanassis-parathyras

Post on 17-Jul-2015


Data storage in clouds

Athens OpenStack User Group #OSATH

12th Meetup, 22nd April 2015

Thanassis Parathyras,

[email protected], @parathyras

Announcements

#OSATH has 200 members!

Greek Mailing List

[email protected]

• Join at http://lists.openstack.org

OpenStack Summit (https://www.openstack.org)

• 18-22 May, Vancouver

OpenStack CEE Day (http://openstackceeday.com)

• 8 June, Budapest

Outline

• Storage types

• OpenStack Storage Services

– Glance

– Cinder

– Swift

• Higher abstractions

– Data Processing aaS (Sahara)

– DBaaS (Trove)

• Other OSS projects

– GlusterFS

– Ceph

It’s all about DATA

“[Computer data storage,] often called storage or memory, is a technology consisting of computer components and recording media used to retain digital data. It is a core function and fundamental component of computers.”

Source: Computer data storage, Wikipedia

Storage types

File         Disk drive    CD-ROM

Object       Volume        Image

Swift        Cinder        Glance

Amazon S3    Amazon EBS    AWS Marketplace – Software Infrastructure

Glance

OpenStack Image service

• REST API

• CRUD and Search features

• Caching and prefetching

• Supports several formats:

– raw, qcow2, vmdk, vhd, ami (aki, ari), iso, vdi

• Containers:

– bare, ovf, ami (aki, ari)

• Pluggable storage back-end (Swift is a common choice; a stock install defaults to the local filesystem)

• Able to aggregate multiple back-ends

– Can also increase availability
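As a rough illustration of the REST API and the format lists above, the sketch below builds the JSON body that an image-create call (POST /v2/images in the Image service v2 API) expects. The helper name `build_image_request` is hypothetical, and the format sets are copied from the slide rather than from a live Glance deployment, so treat them as illustrative.

```python
import json

# Formats as listed on the slide; a real deployment may accept more or fewer.
DISK_FORMATS = {"raw", "qcow2", "vmdk", "vhd", "ami", "aki", "ari", "iso", "vdi"}
CONTAINER_FORMATS = {"bare", "ovf", "ami", "aki", "ari"}


def build_image_request(name, disk_format, container_format="bare"):
    """Return a JSON body for creating an image record (hypothetical helper)."""
    if disk_format not in DISK_FORMATS:
        raise ValueError(f"unsupported disk format: {disk_format}")
    if container_format not in CONTAINER_FORMATS:
        raise ValueError(f"unsupported container format: {container_format}")
    body = {
        "name": name,
        "disk_format": disk_format,
        "container_format": container_format,
    }
    return json.dumps(body, sort_keys=True)
```

The image data itself is uploaded in a separate call once the record exists; that step is left out here.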

Cinder

OpenStack Block Storage service

• Volumes

• Snapshots

• Backups

• Modular architecture to support 50+ back-ends

• LVM is the default (iSCSI)

• Diverse storage types

– iSCSI, Fibre Channel, RBD (Ceph), NFS, GPFS, …

• Manages storage resources separately

Ephemeral vs Block Storage

• Nova manages ephemeral storage coupled with VM state (non-persistent)

• Cinder manages block storage decoupled from the VM lifecycle (persistent)
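The difference between the two lifecycles can be shown with a toy model. Nothing here talks to Nova or Cinder; the `Instance` and `Volume` classes are hypothetical stand-ins that only mirror the behaviour described above.

```python
class Volume:
    """Block storage with its own lifecycle (the Cinder side)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.attached_to = None


class Instance:
    """A VM whose ephemeral disk lives and dies with it (the Nova side)."""

    def __init__(self, name):
        self.name = name
        self.ephemeral = {}
        self.volumes = []

    def attach(self, volume):
        volume.attached_to = self.name
        self.volumes.append(volume)

    def delete(self):
        # Ephemeral data is discarded together with the instance.
        self.ephemeral = None
        # Volumes are only detached; their contents are untouched.
        for volume in self.volumes:
            volume.attached_to = None
        self.volumes = []
```

Deleting the instance drops its ephemeral disk, while an attached volume keeps its data and can be attached to another instance.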

Swift

OpenStack Object Storage service

• REST API

• Data redundancy (3x or more)

• Drive auditing built-in

• RAID not required

• Commodity hardware (not low-end)

• High availability

• Distributed

• Eventual consistency

CAP theorem

• Choose 2 out of 3 (Consistency, Availability, Partition Tolerance)

• Swift implements an eventual consistency model

Swift in action

Access tier

• Proxy server

Storage tier

• Account server

• Container server

• Object server

• Consistency servers (Auditors, Updaters, Replicators)

Zones

• Selected per deployment

• Determine replica isolation (disk, server, rack, room)

Swift in detail

Upload

• PUT http://<swift_url>/<acc>/<cont>/fileA

Internals

• Consistent hashing – DHT

• Hash function: MD5

Ring

• Static mapping to direct data location

• Zones, disks, partitions and replicas

• Across every node in the cluster

[Figure: the ring drawn as a circle divided into eight partitions, numbered 1 to 8]
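A minimal sketch of how an object path can be mapped to a partition with MD5: hash the path, take the first four bytes of the digest as an unsigned integer, and keep only the top `part_power` bits. The function name `partition_for` is hypothetical, and the per-cluster hash prefix and suffix that a real Swift deployment mixes into the path are left out.

```python
import hashlib
import struct


def partition_for(account, container, obj, part_power):
    """Map an object path to a partition number in [0, 2**part_power)."""
    if not 1 <= part_power <= 32:
        raise ValueError("part_power must be between 1 and 32")
    path = f"/{account}/{container}/{obj}".encode("utf-8")
    digest = hashlib.md5(path).digest()
    # First 4 bytes of the digest as a big-endian unsigned 32-bit integer.
    top32 = struct.unpack_from(">I", digest)[0]
    # Keep only the highest part_power bits.
    return top32 >> (32 - part_power)
```

With `part_power=3` every path falls into one of eight partitions (numbered from 0 here, from 1 in the drawing). The same path always lands in the same partition, which is what lets every node locate data without a central lookup.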

Swift in detail

Partition power

• Estimate about 100 partitions per disk at the cluster's maximum capacity

• Round the total up to the closest power of 2

• 2^partition_power = number of partitions
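The rule of thumb above can be worked through in a few lines. This is a sketch under the slide's assumption of 100 partitions per disk; the function name `partition_power` and its parameters are illustrative, not part of any Swift tool.

```python
def partition_power(max_disks, partitions_per_disk=100):
    """Smallest power p such that 2**p >= max_disks * partitions_per_disk."""
    if max_disks < 1 or partitions_per_disk < 1:
        raise ValueError("both arguments must be positive")
    target = max_disks * partitions_per_disk
    # (target - 1).bit_length() is the integer form of ceil(log2(target)).
    return (target - 1).bit_length()
```

For example, a cluster expected to grow to 4 disks needs at least 400 partitions; the next power of 2 is 512, so the partition power is 9.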

Disks and partitions

• A Swift partition is not a filesystem partition

• Presume 4 disks named A, B, C, D

[Figure: the ring's eight partitions mapped onto disks A, B, C and D, two partitions per disk]
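A toy version of the static partition-to-disk mapping for the four disks above: each partition gets its replicas placed on consecutive disks, so no two replicas of a partition share a disk. This simple rotation is an assumption made for illustration; Swift's own ring builder also accounts for device weights and zones.

```python
def build_ring(disks, part_power, replicas=3):
    """Return {partition: [disk for each replica]} using simple rotation."""
    if replicas > len(disks):
        raise ValueError("need at least as many disks as replicas")
    ring = {}
    for part in range(2 ** part_power):
        ring[part] = [disks[(part + r) % len(disks)] for r in range(replicas)]
    return ring


ring = build_ring(["A", "B", "C", "D"], part_power=3)
for part, placement in ring.items():
    print(part, placement)
```

With a partition power of 3 this yields eight partitions, and each disk is the first replica for exactly two of them.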

Putting it all together

A. Spawn a new VM from an Image

B. Attach volume(s) to VM

C. Store application objects

D. Retrieve stored objects
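Steps C and D boil down to a PUT and a GET against the object's URL. The sketch below only builds the requests with the standard library and does not send them. The helper names are hypothetical, `storage_url` is assumed to be the account URL returned by authentication, and `X-Auth-Token` is the header Swift uses to carry the token.

```python
import urllib.parse
import urllib.request


def _object_url(storage_url, container, name):
    # Percent-encode the container and object names before joining them.
    parts = (urllib.parse.quote(container), urllib.parse.quote(name))
    return f"{storage_url.rstrip('/')}/{parts[0]}/{parts[1]}"


def swift_put_request(storage_url, container, name, data, token):
    """Build (but do not send) the request that stores an object."""
    return urllib.request.Request(
        _object_url(storage_url, container, name),
        data=data,
        method="PUT",
        headers={"X-Auth-Token": token},
    )


def swift_get_request(storage_url, container, name, token):
    """Build (but do not send) the request that retrieves an object."""
    return urllib.request.Request(
        _object_url(storage_url, container, name),
        method="GET",
        headers={"X-Auth-Token": token},
    )
```

Passing either request to `urllib.request.urlopen` would perform the call against a reachable cluster.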

Lifting more … data

OpenStack services moving up the stack

• Infrastructure (image, volume, snapshot, backup)

• Platform (database, analytics)

Trove – DBaaS

• MySQL, Percona, MariaDB, MongoDB, Couchbase, Cassandra, Redis, PostgreSQL, Oracle

• Equivalent to Amazon {RDS, DynamoDB}

• Backup/Restore, Resize, Replication, User/DB management, etc.

Sahara – Big data analytics

• Hadoop from Hortonworks or Cloudera, Spark

• Equivalent to Amazon Elastic MapReduce (EMR)

• Manage and configure cluster, HDFS, MapReduce

OSS storage projects

There are many of them, sharing a common goal: supporting scalable, large, software-defined storage systems

Ceph

• Based on RADOS and CRUSH; provides object, block and file-system storage

• 10+ years of development effort

• http://www.meetup.com/Ceph-Athens/

GlusterFS

• Simple-to-use scale-out storage that provides unified access to files and objects

• Data stored in native format; no metadata server, placement is completely algorithmic

Get involved

• Documentation

– http://docs.openstack.org

• Join the community

– http://www.openstack.org/community

• Greek mailing list

– http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-el

• Contribute

– code (development, blueprints, reviews, bugs)

– docs writing, translations, infrastructure support

Thank you!

Athens OpenStack User Group #OSATH

http://www.meetup.com/Athens-OpenStack-User-Group

Thanassis Parathyras

[email protected], @parathyras