Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Ingo Renner, TYPO3 Core Developer, Release Manager TYPO3 4.2

Saturday, 22 May 2010

Upload: ingo-renner

Post on 05-Dec-2014


DESCRIPTION

An introduction to Apache Solr, what it is and why we use it with TYPO3. Covers Solr, the old Indexed Search, and the new Solr extension.

TRANSCRIPT

Page 1: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Apache Solr for TYPO3

Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2

Saturday, 22 May 2010

Page 2: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

mail: ingo@typo3.org
twitter: @ingorenner

Page 3: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Current Status


Page 4: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Current Status

• First prototype: summer 2008
• Development kickoff: February 2009
• Public release v1.0: T3CON09
• v1.1: soon
• v2.0: later this year

Page 5: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Development Model

• Initial development by dkd
• Development Partnerships
• Early Access, Trunk Access
• Setup Support
• Development Support
• Development Priorities

Page 6: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Development Partners

• d.k.d Internet Service GmbH
• e-netconsulting KG
• Cross Content Media
• Marketing Factory Consulting GmbH
• University of Hohenheim
• Andreae-Noris Zahn AG
• Deutsche Lufthansa AG
• Eichborn AG
• SEB Assetmanagement AG
• MÜPRO GmbH
• AOE media GmbH
• Netcreators BV
• marit AG
• internezzo AG
• Eventex

Page 7: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Indexed Search


Page 8: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Indexed Search

• Indexing through Frontend / Crawler
• Respects access rights
• Respects languages
• Index stored in the database
• Totally OK for smaller websites

Slooooooooooooowww

Page 9: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Apache Solr


Page 10: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

So what is Apache Solr?

• Enterprise Search Server
• Based on the Lucene index
• Apache Software Foundation Project
• Many powerful features
• Used by CNet, Netflix, ilocal.nl, Zappos.com

Page 11: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Solr Concepts

• Index = Collection of Documents
• Document = Data stored in Fields
• Field Type defines processing through Analyzers, Tokenizers, Filters
• Dynamic Fields
• Copy Fields

Flexibility
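The concepts on the Solr Concepts slide can be illustrated with a minimal Python sketch. It is not Solr's implementation: the stopword list, the field names `id` and `title`, and all function names are assumptions made up for illustration. It only shows the idea that a field type's analyzer is a tokenizer followed by a chain of filters, and that an index is a collection of documents made of fields.

```python
import re

# Illustrative stopword list (an assumption, not Solr's default list).
STOPWORDS = {"the", "a", "an", "and", "of", "for"}


def tokenize(text):
    """Tokenizer: split raw field text into word tokens."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens):
    """Filter: normalize every token to lower case."""
    return [token.lower() for token in tokens]


def stopword_filter(tokens):
    """Filter: drop tokens that appear in the stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


def analyze(text):
    """Analyzer: a tokenizer followed by a chain of filters."""
    tokens = tokenize(text)
    for token_filter in (lowercase_filter, stopword_filter):
        tokens = token_filter(tokens)
    return tokens


# A document is data stored in named fields; an index is a collection of documents.
document = {"id": "page-1", "title": "An Introduction to Apache Solr for TYPO3"}
index = [document]

analyzed_title = analyze(document["title"])
print(analyzed_title)
```

In Solr itself this chain is declared per field type in the schema rather than written as code; the sketch only mirrors the order of processing.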

Page 12: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Why Apache Solr?

• Speed: Many times faster than Indexed Search (IS)
• Better search results
• Faceted search
• Spellchecker: Did you mean ... ?
• Similarity search: More like this ...
• Editorial Content / paid search results
• Synonyms, Stopwords, Protected Words
• Boosting of specific index fields
• Replication, distributed search

Speed & Power
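As a rough illustration of the faceted search mentioned on this slide: a facet reports, for one field, how many matching documents carry each value, so a result page can offer filters such as "page (2)". The sketch below computes such counts in plain Python. The documents, the `type` field, and the function name are hypothetical; Solr computes facets inside the search server, not in client code like this.

```python
from collections import Counter


def facet_counts(documents, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in documents if field in doc)


# Hypothetical search results; field names are assumptions for illustration.
results = [
    {"title": "Solr setup", "type": "page"},
    {"title": "Release notes", "type": "news"},
    {"title": "Solr FAQ", "type": "page"},
]

type_facet = facet_counts(results, "type")
print(type_facet.most_common())
```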

Page 13: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

How it works

• REST-like interface
• Indexing of XML documents through HTTP POST
• Querying through HTTP GET
• Results as XML, JSON, PHP

Easy API
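To make the "How it works" slide concrete, here is a hedged Python sketch that builds the two pieces of such an exchange without sending anything over the network: the XML body that would be POSTed to add a document, and the URL that would be requested with GET to query. The base URL `http://localhost:8983/solr`, the field names `id` and `title`, and the helper names are assumptions for illustration; a real installation may use a different host, port, path, and schema.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance; adjust to the actual installation.
SOLR_URL = "http://localhost:8983/solr"


def build_add_xml(doc):
    """Build the XML update message that adds one document to the index."""
    add = ET.Element("add")
    doc_element = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_element, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, writer="json"):
    """Build the GET URL for a query; `wt` selects the response format."""
    return SOLR_URL + "/select?" + urlencode({"q": query, "wt": writer})


# Hypothetical document; the field names depend on the schema in use.
add_xml = build_add_xml({"id": "page-1", "title": "Apache Solr for TYPO3"})
select_url = build_select_url("title:solr")

print(add_xml)      # body for an HTTP POST to the update handler
print(select_url)   # URL for an HTTP GET against the select handler
```

Changing `writer` to `xml` or `php` would request the other result formats listed on the slide.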

Page 14: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Disadvantages

• Needs Java
• We don't want to deal with Java
• Solr shields us from Java once set up

Developers stay with PHP

Page 15: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Advantages

• Many times faster than Indexed Search (IS)
• NO database queries
• Easy Installation / Configuration
• Respects access restrictions
• Respects languages
• Customizability

Fast, Easy to use, Powerful

Page 16: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Inner Workings

• Indexing of XML Documents
• Inverted index
• Access through GET and POST (REST-like)
• Results as XML, JSON, PHP
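The index structure named on this slide is commonly called an inverted index: instead of mapping each document to its words, it maps each term to the documents that contain it, which is why a lookup needs no scan over all content. The following is a minimal sketch under simplifying assumptions (whitespace tokenization, lower-casing only, made-up documents and function names); Lucene's actual data structures are far more elaborate.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each term to the set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Made-up documents for illustration.
documents = {
    1: "Apache Solr for TYPO3",
    2: "Indexed Search for TYPO3",
    3: "Apache Lucene",
}

inverted_index = build_inverted_index(documents)
print(sorted(search(inverted_index, "typo3")))
print(sorted(search(inverted_index, "apache typo3")))
```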

Page 17: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Inner Workings

[Diagram: a Solr Index holds Documents; each Document is made up of Fields]

Page 18: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Inner Workings

[Diagram: Solr architecture. Labeled components: HTTP Request Servlet, Update Servlet, Admin Interface, Standard Request Handler, DisMaxRequestHandler, Custom Request Handler, XML Response Writer, XML Update Interface, Solr Core, Config, Schema, Analysis, Caching, Concurrency, UpdateHandler, Replication, Lucene]

Page 19: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Apache Solr for TYPO3

+

EXT:solr


Page 20: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Features!

• FE Indexing
• Search
• Search Box
• More Like This
• Boosting
• Common Searches
• Faceted Search
• Hierarchical Facets
• Simple Form
• Last Searches
• Sorting
• Spellchecker / Did you mean ...
• Auto Suggest
• Index Queue
• Hooks, Interfaces
• Template Engine
• View Helper
• TYPO3 4.2
• TYPO3 4.3
• Scheduler
• Reports
• Access Rights
• Install Script
• Filter
• Page Browser
• Hit Highlighting
• Multi Language
• Backend Module
• Statistics
• File Indexing
• Backend Search
• Extbase / Fluid
• Score Analyzer
• Logging
• Content Elevation

Page 21: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Features!

[Diagram: the features below arranged along a release axis labeled 1.0 and 2.0]

• FE Indexing
• Search
• Search Box
• More Like This
• Boosting
• Common Searches
• Faceted Search
• Hierarchical Facets
• Simple Form
• Last Searches
• Sorting
• Spellchecker / Did you mean ...
• Auto Suggest
• Index Queue
• Hooks, Interfaces
• Template Engine
• View Helper
• TYPO3 4.3
• Scheduler
• Reports
• Access Rights
• Install Script
• Filter
• Page Browser
• Hit Highlighting
• Multi Language
• Backend Module
• Statistics
• File Indexing
• Backend Search
• Extbase / Fluid
• Score Analyzer
• Logging
• Content Elevation

Page 22: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Current Status

• "Acts like Indexed Search"
• Indexing through Frontend / Crawler
• Search
• Search Word Highlighting
• Sorting
• Last and Common Searches

Page 23: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Current Status

• Spellchecker: Did you mean ... ?
• Similarity Search: More like this ...
• Faceted Search, Hierarchical Facets
• Suggest / Autocompletion
• Index Queue
• File Indexing

Page 24: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Outlook

• Backend Module
• Related Searches
• Editorial / Paid Search Results
• Editing of Stopwords, Synonyms
• Statistics
• Transition to Extbase / Fluid

Page 25: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 26: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 27: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 28: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 29: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 30: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 31: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 32: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 33: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Showcases


Page 34: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Making the sun shine on your search


Page 35: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Requirements, Setup

• Requires any J2EE container: Tomcat, Jetty, Resin, ...
• Run setup scripts provided with EXT:solr
• Copy provided configuration files to Solr
• Install EXT:solr, TypoScript
• config.index_enable = 1

Page 36: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Customization

• Indexing of additional data through hooks, interfaces, TS configuration
• Individual index schema
• Enable / disable features through TS
• Individual, flexible rendering of results

Page 37: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

Thank you for listening.


Page 38: Apache Solr for TYPO3 (@ T3CON10 Dallas, TX)

mail: ingo@typo3.org
twitter: @ingorenner