thinking sphinx talk at boston.rb

Searching With Thinking Sphinx Dan Pickett

Upload: dan-pickett

Post on 15-Jan-2015

DESCRIPTION

Dan Pickett talks about Thinking Sphinx, the Ruby plugin/gem that interfaces with the Sphinx Full Text Search Engine

TRANSCRIPT

Page 1: Thinking Sphinx Talk at Boston.rb

Searching With

Thinking Sphinx

Dan Pickett

Page 2: Thinking Sphinx Talk at Boston.rb

I Know What You’re Thinking...

But, No

Page 3: Thinking Sphinx Talk at Boston.rb

The Sphinx We’re Talking About

Yes, the Eye is looking at you

Page 4: Thinking Sphinx Talk at Boston.rb

What is Full Text, Indexed Search?

• Searches for keyword matches

• Think of the DB “like” operator on steroids

• File based index (reduces DB load)

• Relevance Ranking / Phrase Proximity

• Two step process

• Query the DB and create indices (indexer)

• Search against created indices (searchd)
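To make the two-step process concrete, here is a minimal sketch in plain Ruby. It is not how Sphinx is implemented; it only illustrates the idea of building a keyword index once (the indexer's job) and then answering queries from that index without going back to the rows (the searchd daemon's job). The sample rows and the `tokenize`, `build_index`, and `search` helpers are hypothetical names invented for this illustration.

```ruby
# Step 1 ("indexer"): read the rows once and build a keyword -> ids map.
# Step 2 ("searchd"): answer queries from that map without touching the rows.

# Hypothetical sample rows standing in for a database table.
ROWS = {
  1 => "Carmen Sandiego was last seen in Boston",
  2 => "The detective searched Boston for clues",
  3 => "No suspects were seen in Cairo"
}

# Split text into lowercase word tokens.
def tokenize(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Build the inverted index: each word maps to the ids of the rows containing it.
def build_index(rows)
  index = Hash.new { |hash, word| hash[word] = [] }
  rows.each do |id, text|
    tokenize(text).uniq.each { |word| index[word] << id }
  end
  index
end

# Return the ids of rows that contain every word in the query.
def search(index, query)
  words = tokenize(query)
  return [] if words.empty?
  words.map { |word| index.fetch(word, []) }.reduce(:&).sort
end

INDEX = build_index(ROWS)
```

With this in place, `search(INDEX, "boston")` looks up a single key instead of scanning every row the way a SQL "like" would. A real engine adds relevance ranking and phrase proximity on top of this basic structure.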

Page 5: Thinking Sphinx Talk at Boston.rb

Can Haz Search?

What’s Out There

• Direct SQL

• Ferret

• SOLR

• Lucene

• Sphinx

Every time you integrate Ferret, an angel weeps for you

Page 6: Thinking Sphinx Talk at Boston.rb

‘Nuff Said

Although angels are known to be emotional characters

Courtesy of: Evan Weaver, “Rails Search Benchmarks” 03/17/08

Page 7: Thinking Sphinx Talk at Boston.rb

UltraSphinx

Also, Evan Weaver likes Thinking Sphinx

Page 8: Thinking Sphinx Talk at Boston.rb

Why Sphinx Rocks

• Relevance Ratings and Phrase Proximity

• Active Development

• searchd Daemon doesn’t hog memory

• Delta Indexing

• Fast Indexing + Querying

• Distributed Capability

You rock too, but Sphinx is cooler

Page 9: Thinking Sphinx Talk at Boston.rb

Why TS Rocks

• Maximizes use of the Riddle Client

• Sort modes

• Match modes

• Great support and active community

• Available as a gem and a plug-in

• Beautiful Code

• Pat Allan is the man

That was mean - I apologize for the burn in the last slide. You are just as cool as Sphinx

Page 10: Thinking Sphinx Talk at Boston.rb

Let’s Play A Game...

Where the F*ck is Carmen Sandiego?©

Courtesy: Bob-Rz @ Deviant Art 02/19/07

“Where the F*ck is Carmen Sandiego?” is a registered trademark of Enlight Solutions, Inc. Well, not really but it sounds cool. Honestly, though, does anyone ever read the fine print? You should be paying attention to the presentation. On we go...seriously, focus people.

Page 11: Thinking Sphinx Talk at Boston.rb

Define your Index of Suspects

Page 12: Thinking Sphinx Talk at Boston.rb

InstallShield FTL

Let’s Use Rake

• rake ts:config

• rake ts:in

• rake ts:start

• rake ts:stop

• rake ts:restart

Page 13: Thinking Sphinx Talk at Boston.rb

Get to Work, Detective

Page 14: Thinking Sphinx Talk at Boston.rb
Page 15: Thinking Sphinx Talk at Boston.rb
Page 16: Thinking Sphinx Talk at Boston.rb

Make Your Arrest

That was easy...

Page 17: Thinking Sphinx Talk at Boston.rb

Additional Features

• Match Modes

• Sort Modes

• Polymorphism

• Field Weighting

• Integration with will_paginate

Page 18: Thinking Sphinx Talk at Boston.rb

What I Wish I Knew

Protip: Despite its misleading name, Rockapella does not rock

Serious Mullet

Page 19: Thinking Sphinx Talk at Boston.rb

What I Wish I Knew

About Integrating TS

• Sometimes the indexer silently fails

• Watch your output

• Disregard the Distributed Index warning

• Use delta indexing

• Run regular index tasks

• Use delayed_job or another queue manager to handle delta indexing
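The reasoning behind the delta indexing advice can be sketched in plain Ruby. This is a conceptual model only, not Thinking Sphinx's or Sphinx's actual implementation: a large main index is rebuilt rarely, a small delta index is rebuilt whenever a record changes, and searches consult both. The `DeltaIndexedStore` class and all of its method names are hypothetical, invented for this illustration.

```ruby
# Conceptual model of a main index plus a small delta index.
class DeltaIndexedStore
  def initialize(records)
    @records = records.dup   # id => text, standing in for the database
    @changed = []            # ids flagged as changed since the last full index
    full_reindex
  end

  # Expensive step: rebuild the main index from every record.
  def full_reindex
    @main = build(@records)
    @delta = {}
    @changed.clear
  end

  # Cheap step: save the record, flag it, and rebuild only the delta index.
  def save(id, text)
    @records[id] = text
    @changed << id unless @changed.include?(id)
    @delta = build(@records.select { |rid, _| @changed.include?(rid) })
  end

  # Search both indices, ignoring stale main-index entries for changed records.
  def search(word)
    key = word.downcase
    fresh_main = @main.fetch(key, []) - @changed
    (fresh_main + @delta.fetch(key, [])).uniq.sort
  end

  private

  # Map each lowercase word to the ids of the records containing it.
  def build(records)
    index = {}
    records.each do |id, text|
      text.downcase.scan(/[a-z0-9]+/).uniq.each { |w| (index[w] ||= []) << id }
    end
    index
  end
end
```

In this model the delta index grows with every change until `full_reindex` folds it back into the main index, which is one way to see why regular index tasks still matter. The rebuild inside `save` is also the kind of work that could be handed to a queue such as delayed_job rather than done during the request.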

What time is it? Beer o’clock

Page 20: Thinking Sphinx Talk at Boston.rb

What I Wish I Knew

About Deploying TS

• Store PID files in a shared folder

• Ensure you’ve set proper permissions

• Set memory limits on indexing

• mem_limit option in sphinx.yml

• For large data sets, indices can be extremely large

• Ensure you have a surplus of storage capacity

Are we done yet? It’s about that time for a beer...

Page 21: Thinking Sphinx Talk at Boston.rb

What’s Missing?

• Excerpting

• Strong Facet Support

• ASpell Integration/Spell Check support

Blah, blah, blah - You must be getting thirsty by now

Page 22: Thinking Sphinx Talk at Boston.rb

It’s a Young but Awesome Utility

• Clone the source and see for yourself

• freelancing-god/thinking-sphinx

• Cucumber test-suite

• Extremely well architected

• Join the mailing list (Google Groups)

Did he mention Pat Allan is the man, yet?

Page 23: Thinking Sphinx Talk at Boston.rb

Thanks

• Follow me on Twitter

• www.twitter.com/dpickett

• Check out my blog

• www.enlightsolutions.com

• Recommend me

Page 24: Thinking Sphinx Talk at Boston.rb

Questions?