
Upload: nilesh-bhosale

Post on 21-Jan-2018


© 2016 IBM Corporation

IBM Spectrum Scale Object Metadata Search

An Overview


Agenda

What is Object Metadata?

What is Metadata Search?

Use Cases

Implementation Details

Availability


What is Metadata?

User-defined metadata

Unique feature of object storage compared to other storage systems

Swift and S3 metadata are compatible through Swift3 middleware

Metadata is the structured data about the unstructured object

Who, what, when, where and why of account, container, object

Perfect for indexing and searching


Metadata Examples

Biomedical

Age, Biomarkers, Developmental Stage, Cell Surface Markers, Cell Type/Cell Line, Disease State, Extract Molecule, Genetic Characteristics, Immunoprecipitation Antibody, Organism, Platform, Sex, Strain, Time Point, Tissue Type, Treatment Compound

Astronomy & Astrophysics

Geospatial

Image

Music


What Swift Metadata Exists and How Do I Use It?

User metadata can be added to or removed from accounts, containers, and objects

E.g., X-Container-Meta-{name}, X-Remove-Container-Meta-{name}

System metadata also exists; some of it can even be set by the user

E.g., Content-Type, Last-Modified
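
The header naming pattern above can be sketched as a small helper. This is only an illustration: the function name `metadata_headers` and the sample keys are hypothetical, and the sketch assumes removal headers apply to accounts and containers only, matching the slide's `X-Remove-Container-Meta-{name}` example.

```python
def metadata_headers(resource, set_meta=None, remove=None):
    """Build request headers that set or remove user metadata.

    resource: "account", "container" or "object".
    set_meta: dict of metadata name -> value to set.
    remove:   list of metadata names to remove (account/container only
              in this sketch; an assumption, not a statement of the API).
    """
    if resource not in ("account", "container", "object"):
        raise ValueError("unknown resource type: %r" % resource)
    kind = resource.capitalize()
    headers = {}
    for name, value in (set_meta or {}).items():
        headers["X-%s-Meta-%s" % (kind, name)] = str(value)
    for name in (remove or []):
        if resource == "object":
            raise ValueError("this sketch only removes account/container metadata")
        # The header value is an arbitrary placeholder; the header name
        # carries the intent.
        headers["X-Remove-%s-Meta-%s" % (kind, name)] = "x"
    return headers


container_headers = metadata_headers("container", {"City": "Rome"}, ["Time"])
```

The resulting dictionary could be passed as the headers of a POST request to the container URL by any HTTP client.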

Semantics

PUT and POST Metadata Semantics

• Account/Container – New user metadata added to existing list of metadata

• Object – New user metadata overwrites all existing user metadata

COPY retains existing metadata unless new metadata is specified

HEAD returns metadata only
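
The difference between account/container and object semantics can be modelled in a few lines. This is a sketch of the POST behaviour listed above, not Swift's code; the function name `post_user_metadata` and the sample values are hypothetical.

```python
def post_user_metadata(resource, existing, new):
    """Return the user metadata that results from a POST.

    Accounts and containers: new items are added to the existing ones.
    Objects: the new items replace all existing user metadata.
    """
    if resource in ("account", "container"):
        merged = dict(existing)
        merged.update(new)
        return merged
    if resource == "object":
        return dict(new)
    raise ValueError("unknown resource type: %r" % resource)


existing = {"city": "Rome"}
update = {"time": "Day"}
container_result = post_user_metadata("container", existing, update)
object_result = post_user_metadata("object", existing, update)
```

In this model the container ends up with both `city` and `time`, while the object keeps only `time`. An object update therefore has to resend every user metadata item it wants to keep.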


What is Metadata Search?

Automatically index and catalog Swift user and system metadata

Provide a REST API for searching for objects based on their metadata


Why is Metadata Search Valuable?

Imagine the internet without Google

Swiftly find needles in the OpenStack

Help users and administrators perform Data Analytics

Metadata can be on highest tier (SSD) while data resides on lower tier (Disk/Tape)

General Use Cases

Data Mining

Data Warehousing

Selective data retrieval, data backup, data archival, data migration

Management/Reporting


Sample Use Cases: Advanced Photo Album

photo1.jpg: City: Rome, Time: Day

photo2.jpg: City: Rome, Time: Night

photo3.jpg: City: Haifa, Time: Day

photo4.jpg: City: Tokyo, Time: Night

GET /MyPhotoSpace?query=city='Rome' AND time='Day'

GET /MyPhotoSpace?query=time='Night'

* Schematic, not complete syntax
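
To make the two schematic queries concrete, the sketch below evaluates them against an in-memory copy of the four photos from the slide. The `search` function and the dictionary layout are hypothetical stand-ins for the real search service, which the slides do not specify.

```python
# Metadata of the four photos shown on the slide.
PHOTOS = {
    "photo1.jpg": {"city": "Rome", "time": "Day"},
    "photo2.jpg": {"city": "Rome", "time": "Night"},
    "photo3.jpg": {"city": "Haifa", "time": "Day"},
    "photo4.jpg": {"city": "Tokyo", "time": "Night"},
}


def search(catalog, **criteria):
    """Return the names of objects whose metadata matches ALL criteria."""
    return sorted(
        name
        for name, meta in catalog.items()
        if all(meta.get(key) == value for key, value in criteria.items())
    )


rome_by_day = search(PHOTOS, city="Rome", time="Day")
night_shots = search(PHOTOS, time="Night")
```

The first call corresponds to `city='Rome' AND time='Day'` and selects only photo1.jpg; the second corresponds to `time='Night'` and selects photo2.jpg and photo4.jpg.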


Metadata Search – Media use case

Search Query

GET /MyPhotoSpace?query=tags ~ 'John' AND date > 2/12/2012 AND date < 3/12/2013 AND num_views > 10000

What did we search for?

Date range search

Free Text matching

Integer comparison


* Schematic, not complete syntax
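
The three kinds of comparison in the media query can be illustrated with plain Python predicates. Everything in this sketch is assumed for illustration: the sample objects are invented, `~` is read as a case-insensitive substring match, and the slide's dates are read as month/day/year.

```python
from datetime import date

# Hypothetical sample objects; not taken from the slides.
OBJECTS = {
    "clip1.mp4": {"tags": "John birthday", "date": date(2012, 6, 1), "num_views": 25000},
    "clip2.mp4": {"tags": "John wedding", "date": date(2014, 1, 5), "num_views": 90000},
    "clip3.mp4": {"tags": "Mary holiday", "date": date(2012, 8, 20), "num_views": 50000},
    "clip4.mp4": {"tags": "John picnic", "date": date(2012, 9, 9), "num_views": 300},
}


def matches(meta):
    """tags ~ 'John' AND date in range AND num_views > 10000."""
    free_text = "john" in meta["tags"].lower()                         # free-text match
    in_range = date(2012, 2, 12) < meta["date"] < date(2013, 3, 12)    # date range
    popular = meta["num_views"] > 10000                                # integer comparison
    return free_text and in_range and popular


results = sorted(name for name, meta in OBJECTS.items() if matches(meta))
```

Only `clip1.mp4` satisfies all three conditions; each of the other objects fails exactly one of them.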


What happens behind the scenes?

[Diagram: storage system input data path, Indexer, Queue, Index/Search, Index DB]


Indexing objects' Metadata

[Diagram: Swift Proxy Pipeline with MD Indexer Middleware, RabbitMQ, Index/Search, Elasticsearch]
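
The indexing path in this diagram can be sketched with standard-library stand-ins. This is not the actual middleware: `queue.Queue` stands in for RabbitMQ, a dictionary stands in for Elasticsearch, and the function names and the choice of indexed system headers are assumptions made for illustration.

```python
import queue

index_queue = queue.Queue()   # stand-in for the RabbitMQ queue
search_index = {}             # stand-in for the Elasticsearch index

# Assumed subset of system metadata worth indexing in this sketch.
INDEXED_SYSTEM_HEADERS = {"content-type"}


def wants_indexing(header_name):
    """User metadata (…-Meta-…) plus a few selected system headers."""
    name = header_name.lower()
    return "-meta-" in name or name in INDEXED_SYSTEM_HEADERS


def indexer_middleware(method, path, headers):
    """On the data path: enqueue metadata for writes, ignore reads."""
    if method not in ("PUT", "POST"):
        return
    metadata = {k: v for k, v in headers.items() if wants_indexing(k)}
    index_queue.put({"path": path, "metadata": metadata})


def index_worker():
    """Off the data path: drain the queue into the search index."""
    while not index_queue.empty():
        message = index_queue.get()
        search_index[message["path"]] = message["metadata"]


indexer_middleware(
    "PUT",
    "/MyPhotoSpace/photo1.jpg",
    {"X-Object-Meta-City": "Rome", "Content-Type": "image/jpeg", "Content-Length": "1024"},
)
index_worker()
```

Placing a queue between the middleware and the index keeps indexing work off the request path, which is presumably why a message broker appears in the diagram.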


Serving Search Requests

[Diagram: Swift Proxy Pipeline with MD Search Middleware, Index/Search, Elasticsearch DB]
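
The slides do not describe how the search middleware translates a request into an Elasticsearch query. As one plausible sketch, the media query shown earlier could map onto an Elasticsearch Query DSL body as follows. The function name `build_query`, the field names, and the ISO date strings are assumptions; only the `bool`, `must`, `match` and `range` constructs are standard Query DSL.

```python
import json


def build_query(tag_text, date_after, date_before, min_views):
    """Build a Query DSL body: free text AND date range AND integer comparison."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"tags": tag_text}},
                    {"range": {"date": {"gt": date_after, "lt": date_before}}},
                    {"range": {"num_views": {"gt": min_views}}},
                ]
            }
        }
    }


body = build_query("John", "2012-02-12", "2013-03-12", 10000)
payload = json.dumps(body)  # JSON text that a client would send as the search request body
```

Each clause in `must` corresponds to one `AND`-ed condition of the schematic query.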


Spectrum Scale Object Store

Spectrum Scale Object Store w/ Metadata Search


Availability

Available via the IBM Spectrum Scale Metadata Search Open Beta (link), which contains:

Roll-your-own solution

• White paper, to be released, describing how to set up and configure

• A source tarball with an easy install tool

Also available at: IBM Spectrum Scale Beta website (link)
