Transcript
Page 1: BP-8 Global Federation and Search

Global Federation and Search

Robin Bramley, Ixxus

Page 2: BP-8 Global Federation and Search

Agenda

•  Who I am
•  Setting the scene
•  The business challenge
•  Alfresco
•  Solr
•  Big Content
•  Global considerations
•  Scaling strategies
•  Alfresco 4
•  Federation approaches
•  'Intelligent' storage
•  Challenges

Page 3: BP-8 Global Federation and Search

My Background

•  Senior Architect @ Ixxus
•  The UK Alfresco Platinum Partner
•  Lucid Imagination partner
•  Worked at consultancies for 13 years
•  Developing solutions with Alfresco since 0.6
•  First UK Alfresco Gold partner
•  Around the edges I also write
•  GroovyMag author – including 4 hands-on Grails articles
•  DZone Most Valuable Blogger
•  Re-published posts include Event Driven indexing with Solr
•  Open source contributions include
   •  OpenID support for Acegi / Spring Security
   •  Codenarc support for Hudson / Jenkins CI Violations plugin

Page 4: BP-8 Global Federation and Search

The challenge

•  Many global organisations face similar challenges around sharing information in a timely fashion between regions.

•  For publishers this is often exacerbated by the size of some of their assets, such as print-quality images or video.

Page 5: BP-8 Global Federation and Search

Alfresco

Hopefully this needs little introduction.
•  Clue: it's an ECM

Page 6: BP-8 Global Federation and Search

Apache Solr

RESTful Search Service
•  POST it documents
•  GET query results

•  Built on top of Lucene
•  Originated from CNET (created by Yonik Seeley)
•  Features
   •  Schema
   •  Request handlers
   •  Query types
   •  Response Writers
   •  Admin pages
   •  Replication
   •  Sharding
•  Professional support available from Lucid Imagination
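The "POST it documents, GET query results" pattern can be sketched as follows. This is a minimal illustration, not code from the talk: the base URL is an assumed local Solr instance, and the helper names are hypothetical. It only builds the requests; nothing is sent, and a commit would still be needed before an added document becomes searchable.

```python
from urllib.parse import urlencode
from urllib.request import Request
from xml.sax.saxutils import escape

SOLR_BASE = "http://localhost:8983/solr"  # assumption: a local Solr instance


def _field(name, value):
    """Render one field of a document in Solr's XML update format."""
    safe_name = escape(str(name), {'"': "&quot;"})
    return f'<field name="{safe_name}">{escape(str(value))}</field>'


def build_update_request(doc):
    """Build (but do not send) a POST that adds one document."""
    fields = "".join(_field(name, value) for name, value in doc.items())
    body = f"<add><doc>{fields}</doc></add>".encode("utf-8")
    return Request(
        f"{SOLR_BASE}/update",
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def build_query_url(query, rows=10):
    """Build the GET URL for a query against the select handler."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{SOLR_BASE}/select?" + urlencode(params)
```

Passing the `Request` to `urllib.request.urlopen` would perform the update against a running server.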

Page 7: BP-8 Global Federation and Search

Big Content

Page 8: BP-8 Global Federation and Search

Going global

Page 9: BP-8 Global Federation and Search

Going global

Global systems can pose additional challenges

•  Infrastructure
•  Network
   •  Bandwidth
   •  Latency
   •  Reliability
•  Languages
•  Timezones
•  Collaboration
•  Workflow
•  Security permissions

Page 10: BP-8 Global Federation and Search

Scaling strategies

You can scale / divide & conquer systems in a number of ways:

•  Scale up (vertical)

Page 11: BP-8 Global Federation and Search

Scaling strategies

•  Scale out (horizontal)
   •  Typically clustering
   •  But could also be
      •  Replication
      •  Separation of responsibilities

Page 12: BP-8 Global Federation and Search

Scaling strategies

•  Partitioning
   •  Data Sharding
   •  Silos
      •  Divisional / departmental
      •  Regional

Page 13: BP-8 Global Federation and Search

Alfresco 4

What's new in Alfresco 4.0?
•  Won't repeat the full press release here…
•  'Cloud-scale performance'
   •  Alfresco Index Server based on Apache Solr
   •  Enhanced clustering

Page 14: BP-8 Global Federation and Search

Alfresco 4 Solr

•  Based on Solr 1.4.1
•  Uses a custom alfrescoDataType fieldType
•  Leverages dynamic schema fields heavily
   •  Only statically defined field is 'id'
   •  Everything else (*) is a multi-valued dynamic field
•  Though it uses the Alfresco model dictionary under the hood
•  Analysis chain (same for index/query)
   •  Whitespace tokenized
   •  Word Delimited
      •  Breaks up camelCase etc.
   •  Converted to lower case
•  Adds a cmis request handler
•  Uses SSL client certificate authentication
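The analysis chain described on this slide (whitespace tokenizing, word delimiting, lower-casing) can be approximated in a few lines. This is a rough sketch only: the function name is hypothetical, it handles ASCII letters and digits only, and Solr's real word delimiter filter has many more options than are modelled here.

```python
import re

# Sub-word pattern: runs of capitals (acronyms), capitalised or
# lower-case words, and runs of digits. Anything else is a delimiter.
_WORD_PARTS = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def analyse(text):
    """Return the tokens a simplified version of the chain would emit."""
    tokens = []
    for chunk in text.split():                    # 1. whitespace tokenizing
        for part in _WORD_PARTS.findall(chunk):   # 2. word delimiting
            tokens.append(part.lower())           # 3. lower-casing
    return tokens
```

Because the same chain runs at index and query time, a query for `camelcase` or `Camel` would be reduced to the same tokens as an indexed `camelCase`.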

Page 15: BP-8 Global Federation and Search

Federating

Page 16: BP-8 Global Federation and Search

Federation Approaches

Build an index with a crawler

Pros
•  Can index many different data sources
   •  File systems
   •  Databases

Cons
•  Timeliness
•  Pull model not suitable for all scenarios
•  Additional storage requirements
•  Indexing can be inefficient in a global scenario
•  Permissions

Page 17: BP-8 Global Federation and Search

Federation Approaches

Federated Search using OpenSearch
•  A collection of simple formats for sharing search results
•  Can use an Atom response format
•  Elements such as totalResults used in CMIS Atom binding
•  Was a big deal in Alfresco 2.0 (2007)
•  Alfresco Explorer has an OpenSearch client
•  Alfresco has an OpenSearch server
   •  Provided keyword search
   •  Wiki stated: 'Note: Advanced Web Client Search and Query Language searches will be OpenSearch enabled some time in the future, probably in line with up-and-coming CM standards.'
•  Client not in Share
•  CMIS a better bet for complex queries
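To make the Atom response format concrete, here is a small sketch that reads `totalResults` and the entry titles from an OpenSearch Atom feed. The sample feed is invented for illustration and is not an actual Alfresco response; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the Atom and OpenSearch 1.1 specifications.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "os": "http://a9.com/-/spec/opensearch/1.1/",
}

# Invented sample feed, for illustration only.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <title>Search results for: report</title>
  <opensearch:totalResults>42</opensearch:totalResults>
  <opensearch:startIndex>1</opensearch:startIndex>
  <opensearch:itemsPerPage>2</opensearch:itemsPerPage>
  <entry><title>Annual report</title></entry>
  <entry><title>Quarterly report</title></entry>
</feed>"""


def summarise(feed_xml):
    """Extract paging metadata and entry titles from an OpenSearch Atom feed."""
    root = ET.fromstring(feed_xml)
    return {
        "total": int(root.findtext("os:totalResults", default="0", namespaces=NS)),
        "start": int(root.findtext("os:startIndex", default="1", namespaces=NS)),
        "titles": [e.findtext("atom:title", default="", namespaces=NS)
                   for e in root.findall("atom:entry", NS)],
    }
```

A federating client would call something like `summarise` once per repository and combine the totals and entries itself.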

Page 18: BP-8 Global Federation and Search

Federation Approaches

Build a meta-search service

Pros
•  Can work across heterogeneous search engines
•  Can implement asynchronous results

Cons
•  Rebuilding the wheel?
•  Authentication is a challenge (without SAML or OAuth)
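One way to picture a meta-search service with asynchronous results is a fan-out over several back ends that hands back each result set as soon as it arrives. The sketch below assumes each back end is wrapped in a plain callable taking a query string; the function and parameter names are hypothetical, and authentication (the main "con" above) is left out entirely.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def meta_search(query, backends):
    """Yield (backend_name, hits) pairs in the order the back ends respond.

    `backends` maps a name to a callable taking the query and returning a
    list of hits. A failing back end yields an empty list so that one slow
    or broken region does not block the others.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        futures = {pool.submit(search, query): name
                   for name, search in backends.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                yield name, future.result()
            except Exception:
                yield name, []
```

A user interface can render each pair as it is yielded, which is the asynchronous behaviour users may need to get used to.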

Page 19: BP-8 Global Federation and Search

Federation Approaches

Solr shards
•  Treat separate Alfresco repositories' Solr cores as separate shards

Pros
•  Distributed queries are a standard Solr feature

Cons
•  The repositories need to be backed by a single authentication source
   •  E.g. LDAP
•  Asynchronous results aren't supported OOTB
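Solr's distributed search is driven by a `shards` request parameter listing the cores to query, each as `host:port/path` without a scheme. The sketch below builds such a URL; the host names and core paths are invented, the function name is hypothetical, and it does not address the SSL client certificates or permission handling an Alfresco deployment would need.

```python
from urllib.parse import urlencode


def build_sharded_query(coordinator, shards, query, rows=10):
    """Build a select URL that asks one Solr core to query several shards.

    `coordinator` is the base URL of the core receiving the request;
    `shards` is a list of host:port/path entries (no scheme).
    """
    params = {"q": query, "rows": rows, "shards": ",".join(shards)}
    return f"{coordinator}/select?" + urlencode(params)


# Hypothetical regional repositories, each exposing its own Solr core.
EXAMPLE_SHARDS = [
    "london.example.com:8983/solr/alfresco",
    "newyork.example.com:8983/solr/alfresco",
]
```

The receiving core fans the query out to each shard and merges the results into a single response, which is why results arrive together rather than asynchronously.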

Page 20: BP-8 Global Federation and Search

'Intelligent' storage

Storage Cloud Technology
•  Underpinning for the repository is a storage cloud technology
•  Uses a Content Store Selector
•  Base layer built on commodity hardware
•  Keeps multiple replicas of the content
•  Management layer
   •  Cost-based routing
   •  Knows where content resides
•  On-demand content migration between repositories
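The slides do not say how the management layer's cost-based routing works, so the following is purely an illustrative guess at the idea: given the replicas of a piece of content and a table of relative transfer costs, pick the cheapest replica for the requesting region. All names, regions and cost figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    store: str   # hypothetical store identifier
    region: str  # region where this copy resides


# Hypothetical relative cost of serving content from one region to another,
# keyed by (replica_region, client_region).
COST = {
    ("eu", "eu"): 1, ("eu", "us"): 10, ("eu", "apac"): 8,
    ("us", "eu"): 10, ("us", "us"): 1, ("us", "apac"): 6,
}


def cheapest_replica(replicas, client_region, cost=COST):
    """Return the replica with the lowest cost for the requesting region."""
    return min(replicas, key=lambda r: cost[(r.region, client_region)])
```

When even the cheapest replica is expensive, that is a signal that migrating a copy nearer to the user may be worthwhile.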

Page 21: BP-8 Global Federation and Search

Challenges

•  Large file size
   •  Has to work with streaming
   •  Beware of anything that attempts to buffer a full file into memory
      •  E.g. to POST it
   •  Watch out for processes that need to copy a file
•  User expectations
   •  Need training on asynchronous behaviour
   •  Search results and their appearance
      •  Grouping / sort
      •  Pagination (of distinct result sets)
•  Time to migrate large content
   •  Can be lengthy if there isn't a 'near' copy
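The streaming point can be illustrated with a copy routine that never holds more than one chunk in memory, in contrast to reading a whole file before writing or POSTing it. This is a generic sketch rather than anything Alfresco-specific; the chunk size and function names are arbitrary choices.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def iter_chunks(source, chunk_size=CHUNK_SIZE):
    """Yield successive chunks from a binary file-like object."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stream_copy(source, target, chunk_size=CHUNK_SIZE):
    """Copy source to target chunk by chunk.

    Returns the number of bytes copied and a SHA-256 digest, so the
    copy can be verified without re-reading the file.
    """
    digest = hashlib.sha256()
    total = 0
    for chunk in iter_chunks(source, chunk_size):
        target.write(chunk)
        digest.update(chunk)
        total += len(chunk)
    return total, digest.hexdigest()
```

Memory use stays bounded by the chunk size regardless of whether the asset is a few kilobytes or a multi-gigabyte video.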

Page 22: BP-8 Global Federation and Search

Twitter: @rbramley
Blog: http://leanjavaengineering.com/

Web: http://www.ixxus.com

