Making Your Domain Objects Searchable with Hibernate Search Gustavo Fernandes Sunday, 23 May 2010

Upload: gustavo-fernandes

Post on 15-May-2015


DESCRIPTION

Presentation about Hibernate Search given at Apache Lucene EuroCon in Prague, Czech Republic, on 20 May 2010

TRANSCRIPT

Page 1: Making your domain objects searchable with Hibernate Search

Making Your Domain Objects Searchable with Hibernate Search

Gustavo Fernandes

Sunday, 23 May 2010

Page 2: Making your domain objects searchable with Hibernate Search

Apache Lucene EuroCon 20 May 2010

Agenda

Motivations and Goals

Indexing

Retrieval

Scalability

Page 3: Making your domain objects searchable with Hibernate Search

Hibernate in a nutshell

Introduction ◆ Indexing ◆ Retrieval ◆ Scaling

Page 4: Making your domain objects searchable with Hibernate Search

Hibernate in a nutshell

@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;
    private String name;
    @OneToMany
    private Set<Book> books;
}

@Entity
public class Book {
    @Id @GeneratedValue
    private Integer id;
    private String title;
}

Page 5: Making your domain objects searchable with Hibernate Search

Hibernate in a nutshell

Author author = new Author("Stephen King");
Book aBook = new Book("Blaze");
Set<Book> books = new HashSet<Book>();
books.add(aBook);
author.setBooks(books);

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
// Author.books declares no cascade, so the book is saved explicitly
session.save(aBook);
session.save(author);
tx.commit();

Page 6: Making your domain objects searchable with Hibernate Search

Hibernate in a nutshell

Select * from Author;
+----+--------------+
| id | name         |
+----+--------------+
|  1 | Stephen King |
+----+--------------+

Select * from Book;
+----+-------+
| id | title |
+----+-------+
|  1 | Blaze |
+----+-------+

Select * from Book_Author;
+---------+------------+
| Book_id | authors_id |
+---------+------------+
|       1 |          1 |
+---------+------------+

Page 7: Making your domain objects searchable with Hibernate Search

Meet Hibernate Search

Hibernate extension which uses Lucene internally

Brings full-text search capabilities to Hibernate

Object-Document mapping

Takes care of the plumbing

Keeps database and index in sync

Convention over configuration

Flexible

Page 8: Making your domain objects searchable with Hibernate Search

Meet Hibernate Search

Current version: 3.2.0-Final (May 2010)

LGPL License

Lucene version supported: 2.9.2

Solr version supported: 1.4

Page 9: Making your domain objects searchable with Hibernate Search

Meet Hibernate Search

Dependencies:

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search</artifactId>
    <version>3.2.0.Final</version>
</dependency>

Page 10: Making your domain objects searchable with Hibernate Search

Indexing

Mapping Objects <-> Documents

Support for types

Analyzers/Boost

Transparent/Manual Indexing

Page 11: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Entities

@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;

    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 12: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Entities

@Indexed
@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;

    private String name;

    @OneToMany
    private Set<Book> books;
}


Page 14: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Entities

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;

    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 15: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Entities

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 16: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Fields

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field
    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 17: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Fields

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field(name = "name_field", store = Store.YES, index = Index.TOKENIZED)
    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 18: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Fields

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Fields({
        @Field(index = Index.TOKENIZED),
        @Field(name = "nameForSort", index = Index.UN_TOKENIZED)
    })
    private String name;

    @OneToMany
    private Set<Book> books;
}
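The mapping above indexes the same property twice, once tokenized and once un-tokenized under the name nameForSort. The following is a minimal sketch of why that can be useful, assuming a simple whitespace tokenizer. It uses only the JDK; the class and method names are hypothetical and it does not call Lucene or Hibernate Search.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model only: it imitates the effect of Index.TOKENIZED versus
// Index.UN_TOKENIZED and is not the library's implementation.
class FieldIndexingSketch {

    // Tokenized: the value is split into lower-cased terms for full-text matching.
    static List<String> tokenized(String value) {
        List<String> terms = new ArrayList<>();
        for (String term : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    // Un-tokenized: the whole value is kept as a single term.
    static List<String> untokenized(String value) {
        return Collections.singletonList(value);
    }

    // Sorting on the single un-tokenized term orders authors by full name.
    static List<String> sortByFullName(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(tokenized("Stephen King"));
        System.out.println(untokenized("Stephen King"));
        System.out.println(sortByFullName(Arrays.asList("Stephen King", "Anne Rice")));
    }
}
```

A search for a single word can match one of the tokenized terms, while a sort needs exactly one value per document, which is what the un-tokenized copy provides.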

Page 19: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Relationships

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field(index = Index.TOKENIZED)
    private String name;

    @OneToMany @IndexEmbedded
    private Set<Book> books;
}

Page 20: Making your domain objects searchable with Hibernate Search

Indexing - Types

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field(index = Index.TOKENIZED)
    private String name;

    @OneToMany @IndexEmbedded
    private Set<Book> books;

    @Field(bridge = @FieldBridge(impl = AddressBridge.class))
    private Address address;
}

Page 21: Making your domain objects searchable with Hibernate Search

Indexing - Boost

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field(index = Index.TOKENIZED)
    @Boost(1.5f)
    private String name;

    @OneToMany @IndexEmbedded
    private Set<Book> books;

    @Field(bridge = @FieldBridge(impl = AddressBridge.class))
    @Boost(0.75f)
    private Address address;
}

Page 22: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String bio;
    ...
}

Page 23: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
@AnalyzerDef(name = "combinedAnalyzers",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class))
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String bio;
    ...
}

Page 24: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
@AnalyzerDef(name = "combinedAnalyzers",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = { @TokenFilterDef(factory = LowerCaseFilterFactory.class) })
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String bio;
    ...
}

Page 25: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
@AnalyzerDef(name = "combinedAnalyzers",
    charFilters = { @CharFilterDef(factory = MappingCharFilterFactory.class) },
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = { @TokenFilterDef(factory = LowerCaseFilterFactory.class) })
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String bio;
    ...
}

Page 26: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
@AnalyzerDef(name = "combinedAnalyzers",
    charFilters = { @CharFilterDef(factory = MappingCharFilterFactory.class) },
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = { @TokenFilterDef(factory = LowerCaseFilterFactory.class) })
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field
    @Analyzer(definition = "combinedAnalyzers")
    private String bio;
    ...
}
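The analyzer definition above chains three stages: char filters, a tokenizer, and token filters. The sketch below imitates that order with plain JDK code so the data flow is easier to follow. The class name, the method names, and the two accent mappings are assumptions made for illustration; the real work is done by the Solr factories named in the annotation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Toy model of an analysis chain. It does not call Lucene or Solr.
class AnalyzerChainSketch {

    // Stage 1, char filter: rewrite characters before tokenizing.
    // The two mappings here are sample values, not a real mapping file.
    static String charFilter(String text) {
        return text.replace('\u00e9', 'e').replace('\u00fc', 'u');
    }

    // Stage 2, tokenizer: split the character stream into tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Stage 3, token filter: transform each token, here by lower-casing it.
    static List<String> lowerCase(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            result.add(token.toLowerCase(Locale.ROOT));
        }
        return result;
    }

    // The stages always run in this order: char filter, tokenizer, token filter.
    static List<String> analyze(String text) {
        return lowerCase(tokenize(charFilter(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("Stephen King wrote Blaze"));
        System.out.println(Arrays.asList("input tokens are lower-cased last"));
    }
}
```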

Page 27: Making your domain objects searchable with Hibernate Search

Indexing - Fluent API

SearchMapping mapping = new SearchMapping();

mapping
    .analyzerDef("customAnalyzer", StandardTokenizerFactory.class)
        .filter(LowerCaseFilterFactory.class)
        .filter(SnowballPorterFilterFactory.class)
            .param("language", "English")
    .entity(Author.class)
        .indexed()
        .property("id", ElementType.FIELD).documentId()
        .property("address", ElementType.FIELD)
            .field().bridge(AddressBridge.class).store(Store.YES)
        .property("books", ElementType.FIELD).indexEmbedded()
        .property("name", ElementType.METHOD).field().store(Store.YES)
    .entity(Book.class)
        .indexed()
        .property("id", ElementType.METHOD).documentId()
        .property("title", ElementType.METHOD)
            .field().analyzer("customAnalyzer");

Page 28: Making your domain objects searchable with Hibernate Search

Indexing - Backend

Source: Hibernate Search in Action

Page 29: Making your domain objects searchable with Hibernate Search

Indexing - Backend

hibernate.search.worker.execution    async

hibernate.search.worker.thread_pool.size    10

Source: Hibernate Search in Action
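With asynchronous execution, index updates are handed to a pool of worker threads so the caller does not wait for them. The sketch below shows that idea with a JDK thread pool of a configurable size. It is an illustration under stated assumptions, not the library's backend: the class and method names are hypothetical and the "index" is just an in-memory queue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model of asynchronous index work using only the JDK.
class AsyncIndexingSketch {
    private final ExecutorService pool;
    // Stand-in for the index: records every document that was processed.
    private final Queue<String> indexed = new ConcurrentLinkedQueue<>();

    AsyncIndexingSketch(int threadPoolSize) {
        this.pool = Executors.newFixedThreadPool(threadPoolSize);
    }

    // Queues the index update and returns at once, so the caller does not wait.
    void submitIndexWork(String document) {
        pool.execute(() -> indexed.add(document));
    }

    // Stops accepting work, waits for queued updates, and returns how many ran.
    int drainAndCount() {
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return indexed.size();
    }

    public static void main(String[] args) {
        AsyncIndexingSketch backend = new AsyncIndexingSketch(10);
        for (int i = 0; i < 25; i++) {
            backend.submitIndexWork("author-" + i);
        }
        System.out.println(backend.drainAndCount() + " documents indexed");
    }
}
```

The trade-off is the usual one for asynchronous work: the transaction returns sooner, but a document may not be searchable the instant the commit completes.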

Page 30: Making your domain objects searchable with Hibernate Search

Indexing - JMS backend

hibernate.search.worker.backend    jms

hibernate.search.worker.jms.connection_factory    /ConnectionFactory

hibernate.search.worker.jms.queue    queue/hsearch

Source: Hibernate Search in Action

Page 31: Making your domain objects searchable with Hibernate Search

Manual Indexing

Use case
    Non-exclusive database

Manual indexing types
    Single entity
    Mass indexer

Page 32: Making your domain objects searchable with Hibernate Search

Manual Indexing - Single Entity

FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
Object author = fullTextSession.load(Author.class, 1);
fullTextSession.index(author);
tx.commit();

Page 33: Making your domain objects searchable with Hibernate Search

Mass Indexing

// Blocks until the index has been rebuilt
fullTextSession.createIndexer().startAndWait();

// Starts the rebuild in the background and returns immediately
fullTextSession.createIndexer().start();

Page 34: Making your domain objects searchable with Hibernate Search

Retrieval - Lucene Queries + Hibernate API

// Wraps the Hibernate Session object
org.hibernate.search.FullTextSession fullTextSession =
    org.hibernate.search.Search.getFullTextSession(session);

// Lucene query
Version v = Version.LUCENE_29;
org.apache.lucene.queryParser.QueryParser queryParser =
    new org.apache.lucene.queryParser.QueryParser(v, "name", new StandardAnalyzer(v));
org.apache.lucene.search.Query query = queryParser.parse("+King");

// Hibernate Search query
org.hibernate.Query textQuery = fullTextSession.createFullTextQuery(query, Author.class);

// list() returns every matching entity, so the result is a list of authors
List<Author> loadedAuthors = textQuery.list();

Page 35: Making your domain objects searchable with Hibernate Search

Retrieval - Hibernate Search

1. Executes the Lucene query and gets the results

2. Retrieves the document ids from the index

3. Loads the objects from the database

4. Returns the domain objects
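The four steps above can be sketched with two in-memory maps, one standing in for the Lucene index and one for the database. This is a toy model under stated assumptions: the class and method names are hypothetical, the "domain object" is just the author's name, and nothing here reflects how Hibernate Search is implemented internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of "query the index, then load from the database".
class RetrievalFlowSketch {
    // Stand-in for the index: term -> ids of the documents containing it.
    private final Map<String, Set<Integer>> index = new HashMap<>();
    // Stand-in for the database: id -> author name.
    private final Map<Integer, String> database = new HashMap<>();

    // Stores the row and indexes each word of the name under the row id.
    void save(int id, String name) {
        database.put(id, name);
        for (String term : name.toLowerCase(Locale.ROOT).split("\\s+")) {
            index.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
        }
    }

    List<String> search(String term) {
        // Steps 1 and 2: run the query against the index and collect the ids.
        Set<Integer> ids = index.getOrDefault(
                term.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
        // Step 3: load the matching rows from the database by id.
        List<String> authors = new ArrayList<>();
        for (Integer id : ids) {
            String author = database.get(id);
            if (author != null) {
                authors.add(author);
            }
        }
        // Step 4: return the loaded objects, never the raw index documents.
        return authors;
    }

    public static void main(String[] args) {
        RetrievalFlowSketch store = new RetrievalFlowSketch();
        store.save(1, "Stephen King");
        store.save(2, "Anne Rice");
        System.out.println(store.search("King"));
    }
}
```

The point of the split is that the index only needs to hold what is searched on, while the returned objects always come from the database.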

Page 36: Making your domain objects searchable with Hibernate Search

Retrieval - Results Manipulation

Pagination

Type restriction

Projection

Result mapping
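Pagination is the simplest of these to picture: a query returns a window of the hits, defined by an offset and a maximum size. The helper below is a small JDK-only sketch of that windowing. Its parameter names echo the firstResult and maxResults settings of the Hibernate Query API, but the class itself is hypothetical and works on a plain list rather than on a real query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of result pagination over an already ordered list of hits.
class PaginationSketch {

    // Returns at most maxResults hits, starting at the zero-based firstResult.
    // Out-of-range values are clamped instead of raising an exception.
    static <T> List<T> page(List<T> hits, int firstResult, int maxResults) {
        int from = Math.min(Math.max(firstResult, 0), hits.size());
        long wanted = (long) from + Math.max(maxResults, 0);
        int to = (int) Math.min(wanted, (long) hits.size());
        return new ArrayList<>(hits.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(page(hits, 2, 2));
    }
}
```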

Page 37: Making your domain objects searchable with Hibernate Search

Retrieval - IndexReader

shared strategy: shared IndexReader (default)
    hibernate.search.reader.strategy = shared

not-shared strategy: opens an IndexReader for every query
    hibernate.search.reader.strategy = not-shared

Extensible through the ReaderProvider interface
    hibernate.search.reader.strategy = com.mycompany.CoolReaderProvider

Page 38: Making your domain objects searchable with Hibernate Search

Scalability

Sharding

Clustering

Page 39: Making your domain objects searchable with Hibernate Search

Scalability - Sharding

• Default: one index per entity type

• Shard: two or more indexes per entity type

• Use cases
    • Performance
    • Maintenance

[Diagram: without sharding, the application queries and indexes a single index covering A - Z; with sharding, the application works against Shard A (A - H), Shard B (I - N) and Shard C (O - Z).]

Page 40: Making your domain objects searchable with Hibernate Search

Scalability - Sharding

Indexes separated physically

Virtual Index

[Diagram: the application queries and indexes through a virtual index that spans Shard A (A - H), Shard B (I - N) and Shard C (O - Z).]

Page 41: Making your domain objects searchable with Hibernate Search

Scalability - Sharding

Configuration

hibernate.search.com.sourcesense.Author.sharding_strategy.nbr_of_shards    2

Page 42: Making your domain objects searchable with Hibernate Search

Scalability - Shard Strategy

Default algorithm: ID Hash

f(x) = x % N

[Diagram: ids 1 to 5 are distributed between Shard 1 and Shard 2 by f(x).]
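The formula f(x) = x % N on this slide can be tried out directly. The sketch below applies it to integer ids and groups them by shard number. It is only an illustration of the slide's formula: the class and method names are hypothetical, shards are numbered from 0 here, and the library's own strategy may derive the hash from the id in a different way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of id-based shard selection using the slide's f(x) = x % N.
class IdHashShardingSketch {

    // Picks a shard in the range 0..N-1. floorMod keeps negative ids in range.
    static int shardFor(int id, int numberOfShards) {
        if (numberOfShards <= 0) {
            throw new IllegalArgumentException("numberOfShards must be positive");
        }
        return Math.floorMod(id, numberOfShards);
    }

    // Groups the given ids by the shard they would be written to.
    static Map<Integer, List<Integer>> distribute(int[] ids, int numberOfShards) {
        Map<Integer, List<Integer>> shards = new TreeMap<>();
        for (int id : ids) {
            int shard = shardFor(id, numberOfShards);
            shards.computeIfAbsent(shard, s -> new ArrayList<>()).add(id);
        }
        return shards;
    }

    public static void main(String[] args) {
        System.out.println(distribute(new int[] {1, 2, 3, 4, 5}, 2));
    }
}
```

Because the shard depends only on the id, the same entity is always routed to the same shard, which is what makes later updates and deletions find the right index.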

Page 43: Making your domain objects searchable with Hibernate Search

Custom Sharding Strategy

Implement IndexShardingStrategy

hibernate.search.com.sourcesense.Author.sharding_strategy    BookTitleStrategy
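A custom strategy ultimately has to answer one question: which shard does this entity belong to? The sketch below shows only that routing decision, using the alphabetical ranges from the earlier sharding diagram (A - H, I - N, O - Z). It is a hedged illustration: the class is hypothetical, it does not implement the IndexShardingStrategy interface, and the handling of empty or non-alphabetic titles is an assumption.

```java
import java.util.Locale;

// Toy routing rule for a title-based sharding strategy.
class TitleRangeRoutingSketch {

    // Returns 0 for titles starting with A - H, 1 for I - N, and 2 for O - Z.
    static int shardForTitle(String title) {
        if (title == null || title.trim().isEmpty()) {
            return 0; // assumption: untitled entries go to the first shard
        }
        char first = title.trim().toUpperCase(Locale.ROOT).charAt(0);
        if (first <= 'H') {
            return 0; // also catches digits and punctuation, an assumption
        }
        if (first <= 'N') {
            return 1;
        }
        return 2;
    }

    public static void main(String[] args) {
        System.out.println(shardForTitle("Blaze"));
        System.out.println(shardForTitle("Misery"));
        System.out.println(shardForTitle("The Shining"));
    }
}
```

A real strategy would wrap a rule like this in the methods the interface requires and would also need to say which shards to search when deleting, since a title may not be known at that point.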

Page 44: Making your domain objects searchable with Hibernate Search

Synchronous Clustering

Every node can read and write to the index

Pessimistic locking prevents corruption

Single index shared among all nodes

Choose your flavour: NFS, database, distributed caches

Page 45: Making your domain objects searchable with Hibernate Search

Clustering

Read-Write Synchronous cluster

[Diagram: Node 1, Node 2, Node 3 and Node 4 each hold an IndexWriter and write to one shared index.]

Page 46: Making your domain objects searchable with Hibernate Search

Asynchronous Clustering

Page 47: Making your domain objects searchable with Hibernate Search

Asynchronous Cluster

Advantages
    Only the master writes
    No indexing in slaves -> no waiting for locks

Downside
    Data is not immediately visible to the slaves
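The trade-off above can be shown with a very small model: writes go only to the master's index, and a slave sees them only after it refreshes its local copy. The sketch is JDK-only and rests on assumptions. The class and method names are hypothetical, and the explicit copy step stands in for whatever replication mechanism a real deployment uses.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a master index with one read-only slave copy.
class MasterSlaveSketch {
    private final List<String> masterIndex = new ArrayList<>();
    private List<String> slaveCopy = new ArrayList<>();

    // Only the master writes, so slaves never compete for the index lock.
    void writeOnMaster(String document) {
        masterIndex.add(document);
    }

    // Slaves answer queries from their own copy of the index.
    boolean slaveCanSee(String document) {
        return slaveCopy.contains(document);
    }

    // Replication step: the slave replaces its copy with the master's state.
    void copyMasterToSlave() {
        slaveCopy = new ArrayList<>(masterIndex);
    }

    public static void main(String[] args) {
        MasterSlaveSketch cluster = new MasterSlaveSketch();
        cluster.writeOnMaster("Blaze");
        System.out.println(cluster.slaveCanSee("Blaze"));
        cluster.copyMasterToSlave();
        System.out.println(cluster.slaveCanSee("Blaze"));
    }
}
```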

Page 48: Making your domain objects searchable with Hibernate Search

To learn more...

hibernate.org/subprojects/search.html

anonsvn.jboss.org/repos/hibernate/search/

Page 49: Making your domain objects searchable with Hibernate Search

Thank you

[email protected]

twitter: @gustavonalle