IBM Spectrum Scale Object Metadata Search: An Overview
© 2016 IBM Corporation
Agenda
What is Object Metadata?
What is Metadata Search?
Use Cases
Implementation Details
Availability
What is Metadata?
User-defined metadata
Unique feature of object storage compared to other storage systems
Swift and S3 metadata are compatible through Swift3 middleware
Metadata is the structured data about the unstructured object
Who, what, when, where, and why of account, container, object
Perfect for indexing and searching
Metadata Examples
Biomedical: Age, Biomarkers, Developmental Stage, Cell Surface Markers, Cell Type/Cell Line, Disease State, Extract Molecule, Genetic Characteristics, Immunoprecipitation Antibody, Organism, Platform, Sex, Strain, Time Point, Tissue Type, Treatment Compound
Astronomy & Astrophysics
Geospatial
Image
Music
What Swift Metadata Exists and How do I use it?
User metadata can be added to or removed from accounts, containers, and objects
E.g., X-Container-Meta-{name}, X-Remove-Container-Meta-{name}
System metadata also exists; some of it can even be set by the user
E.g., Content-Type, Last-Modified
Semantics
PUT and POST Metadata Semantics
• Account/Container – New user metadata added to existing list of metadata
• Object – New user metadata overwrites all existing user metadata
COPY retains existing metadata unless new metadata is specified
HEAD returns metadata only
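The difference between the two POST semantics is easy to get wrong, so here is a minimal sketch of it. This is a plain-Python model, not Swift code: the function names are hypothetical, and metadata is held in ordinary dicts keyed by the lowercase `{name}` part of each header.

```python
def post_container_metadata(existing, headers):
    """Model a container POST: new user metadata is merged into the old set."""
    set_prefix = "x-container-meta-"
    remove_prefix = "x-remove-container-meta-"
    updated = dict(existing)
    for name, value in headers.items():
        lower = name.lower()
        if lower.startswith(remove_prefix):
            # X-Remove-Container-Meta-{name} deletes a single key.
            updated.pop(lower[len(remove_prefix):], None)
        elif lower.startswith(set_prefix):
            # X-Container-Meta-{name} adds or updates a single key.
            updated[lower[len(set_prefix):]] = value
    return updated


def post_object_metadata(existing, headers):
    """Model an object POST: new user metadata replaces the old set entirely."""
    prefix = "x-object-meta-"
    # 'existing' is deliberately ignored: nothing from the old set survives.
    return {
        name.lower()[len(prefix):]: value
        for name, value in headers.items()
        if name.lower().startswith(prefix)
    }


container = post_container_metadata(
    {"project": "apollo"}, {"X-Container-Meta-Owner": "alice"}
)
obj = post_object_metadata(
    {"city": "Rome", "time": "Day"}, {"X-Object-Meta-Time": "Night"}
)
print(container)  # both 'project' and 'owner' are present
print(obj)        # only 'time' remains; 'city' was dropped
```

In practice this means a client that wants to change one object key has to resend every user metadata key it wants to keep.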
What is Metadata Search?
Automatically index and catalog Swift user and system metadata
Provide a REST API for searching for objects based on their metadata
Why is Metadata Search Valuable?
Imagine the internet without Google
Swiftly find needles in the OpenStack
Help users and administrators perform Data Analytics
Metadata can be on highest tier (SSD) while data resides on lower tier (Disk/Tape)
General Use Cases
Data Mining
Data Warehousing
Selective data retrieval, data backup, data archival, data migration
Management/Reporting
Sample Use-Cases: Advanced Photo Album
photo1.jpg  City: Rome   Time: Day
photo2.jpg  City: Rome   Time: Night
photo3.jpg  City: Haifa  Time: Day
photo4.jpg  City: Tokyo  Time: Night
GET /MyPhotoSpace?query=city='Rome' AND time='Day'
GET /MyPhotoSpace?query=time='Night'
* Schematic, not complete syntax
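To make the two schematic queries concrete, the sketch below evaluates them against an in-memory dict. It assumes the four photos carry the tags photo1.jpg = Rome/Day, photo2.jpg = Rome/Night, photo3.jpg = Haifa/Day, photo4.jpg = Tokyo/Night. The `search` helper is hypothetical and only models exact-match AND conditions; it is not the product's query engine.

```python
# Assumed tagging of the four photos in the album.
photos = {
    "photo1.jpg": {"city": "Rome", "time": "Day"},
    "photo2.jpg": {"city": "Rome", "time": "Night"},
    "photo3.jpg": {"city": "Haifa", "time": "Day"},
    "photo4.jpg": {"city": "Tokyo", "time": "Night"},
}


def search(index, **criteria):
    """Return names of objects whose metadata matches every criterion (AND)."""
    return sorted(
        name
        for name, metadata in index.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    )


# query=city='Rome' AND time='Day'
print(search(photos, city="Rome", time="Day"))  # ['photo1.jpg']

# query=time='Night'
print(search(photos, time="Night"))  # ['photo2.jpg', 'photo4.jpg']
```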
Metadata Search – Media use case
Search Query
GET /MyPhotoSpace?query=tags ~ 'John' AND date > 2/12/2012 AND date < 3/12/2013 AND num_views > 10000
What did we search for?
Date range search
Free Text matching
Integer comparison
* Schematic, not complete syntax
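Since the index behind this feature is Elasticsearch, one way to picture the three predicate types is as an Elasticsearch Query DSL body. The sketch below is an assumption about how such a query could be expressed, not the translation the search middleware actually performs. The function name is hypothetical, and the field names `tags`, `date`, and `num_views` are taken from the schematic query.

```python
import json


def build_media_query(tag_text, date_after, date_before, min_views):
    """Build a bool query: free-text match plus two range filters."""
    return {
        "query": {
            "bool": {
                # Free text matching: tags ~ 'John'
                "must": [{"match": {"tags": tag_text}}],
                "filter": [
                    # Date range search: date > ... AND date < ...
                    {"range": {"date": {"gt": date_after, "lt": date_before}}},
                    # Integer comparison: num_views > ...
                    {"range": {"num_views": {"gt": min_views}}},
                ],
            }
        }
    }


# The date strings are passed through as written on the slide; the index
# mapping for 'date' would have to declare a matching date format.
body = build_media_query("John", "2/12/2012", "3/12/2013", 10000)
print(json.dumps(body, indent=2))
```

Putting the range conditions under `filter` keeps them out of relevance scoring, which fits yes/no predicates like dates and view counts.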
What happens behind the scenes?
Storage system input data path -> Indexer -> Queue -> Index/Search -> Index DB
Indexing Objects' Metadata
Swift Proxy Pipeline (MD Indexer Middleware) -> RabbitMQ -> Index/Search (Elasticsearch)
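The indexing path can be sketched as a WSGI filter, since Swift proxy middleware is WSGI-based. Everything named here is hypothetical: the class, the message layout, and the account path are illustrative, and a standard-library `queue.Queue` stands in for the RabbitMQ publisher. The sketch also assumes the wrapped app calls `start_response` before it returns.

```python
import json
import queue


class MetadataIndexerMiddleware:
    """Hypothetical WSGI filter: queue object metadata after successful writes."""

    def __init__(self, app, outbox):
        self.app = app
        self.outbox = outbox  # stand-in for a RabbitMQ publisher

    def __call__(self, environ, start_response):
        captured = {}

        def capturing_start_response(status, headers, exc_info=None):
            # Remember the status so we only index writes that succeeded.
            captured["status"] = status
            return start_response(status, headers, exc_info)

        response = self.app(environ, capturing_start_response)
        is_write = environ.get("REQUEST_METHOD") in ("PUT", "POST")
        if is_write and captured.get("status", "").startswith("2"):
            self.outbox.put(json.dumps({
                "path": environ.get("PATH_INFO", ""),
                "metadata": self._user_metadata(environ),
            }))
        return response

    @staticmethod
    def _user_metadata(environ):
        # WSGI exposes X-Object-Meta-City as HTTP_X_OBJECT_META_CITY.
        prefix = "HTTP_X_OBJECT_META_"
        return {
            key[len(prefix):].lower(): value
            for key, value in environ.items()
            if key.startswith(prefix)
        }


def fake_swift_app(environ, start_response):
    """Stand-in for the rest of the proxy pipeline."""
    start_response("201 Created", [])
    return [b""]


outbox = queue.Queue()
app = MetadataIndexerMiddleware(fake_swift_app, outbox)
put_request = {
    "REQUEST_METHOD": "PUT",
    "PATH_INFO": "/v1/AUTH_demo/MyPhotoSpace/photo1.jpg",  # illustrative path
    "HTTP_X_OBJECT_META_CITY": "Rome",
    "HTTP_X_OBJECT_META_TIME": "Day",
}
app(put_request, lambda status, headers, exc_info=None: None)
message = json.loads(outbox.get_nowait())
print(message)
```

Queuing the update instead of writing to the index inline keeps the object write path independent of the index's availability and latency.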
Serving Search Requests
Swift Proxy Pipeline (MD Search Middleware) -> Index/Search (Elasticsearch) -> DB
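The search path can be sketched the same way: a WSGI filter that answers requests carrying a `query` parameter and passes everything else down the pipeline. The class name, the JSON response shape, and the `search_backend` callable are all assumptions for illustration; a real deployment would query Elasticsearch at that point.

```python
import json
from urllib.parse import parse_qs, urlencode


class MetadataSearchMiddleware:
    """Hypothetical WSGI filter: serve GET ...?query=... from a search backend."""

    def __init__(self, app, search_backend):
        self.app = app
        # Assumed interface: callable(path, query_text) -> list of object names.
        self.search_backend = search_backend

    def __call__(self, environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))
        if environ.get("REQUEST_METHOD") == "GET" and "query" in params:
            hits = self.search_backend(
                environ.get("PATH_INFO", ""), params["query"][0]
            )
            body = json.dumps(hits).encode("utf-8")
            start_response("200 OK", [
                ("Content-Type", "application/json"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Not a search request: hand it to the rest of the pipeline untouched.
        return self.app(environ, start_response)


def fake_backend(path, query_text):
    """Stand-in for the index lookup."""
    return ["photo2.jpg", "photo4.jpg"] if query_text == "time='Night'" else []


def passthrough_app(environ, start_response):
    start_response("200 OK", [])
    return [b"container listing"]


app = MetadataSearchMiddleware(passthrough_app, fake_backend)
search_request = {
    "REQUEST_METHOD": "GET",
    "PATH_INFO": "/MyPhotoSpace",
    "QUERY_STRING": urlencode({"query": "time='Night'"}),
}
result = app(search_request, lambda status, headers: None)
print(json.loads(result[0]))
```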
Spectrum Scale Object Store
Spectrum Scale Object Store w/ Metadata Search
Availability
Available via the IBM Spectrum Scale Metadata Search Open Beta (link), which contains:
Roll-your-own solution
White paper (to be released) describing how to set up and configure it
A source tarball with an easy install tool
Also available at: IBM Spectrum Scale Beta website (link)
IBM Confidential