Thinking Sphinx talk at boston.rb
DESCRIPTION
Dan Pickett talks about Thinking Sphinx, the Ruby plugin/gem that interfaces with the Sphinx full-text search engine.
TRANSCRIPT
Searching With
Thinking Sphinx
Dan Pickett
I Know What You’re Thinking...
But, No
The Sphinx We’re Talking About
Yes, the Eye is looking at you
What is Full Text, Indexed Search?
• Searches for keyword matches
• Think of the DB “like” operator on steroids
• File based index (reduces DB load)
• Relevance Ranking / Phrase Proximity
• Two step process
• Query the DB and create indices (indexer)
• Search against created indices (searchd)
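The two-step process above can be sketched in plain Ruby. This is a toy illustration only, not how Sphinx is implemented: `build_index` plays the role of the indexer, `search` plays the role of searchd, and both names, along with the sample documents, are hypothetical.

```ruby
require "set"

# Step 1 (the "indexer" role): read every row once and build an inverted
# index mapping each lowercase keyword to the set of ids that contain it.
def build_index(documents)
  index = {}
  documents.each do |id, text|
    text.downcase.scan(/\w+/).each { |word| (index[word] ||= Set.new) << id }
  end
  index
end

# Step 2 (the "searchd" role): answer queries from the index alone,
# returning the ids of documents that contain every query keyword.
def search(index, query)
  words = query.downcase.scan(/\w+/)
  return [] if words.empty?
  words.map { |word| index.fetch(word, Set.new) }.reduce(:&).to_a.sort
end

# Hypothetical stand-in for rows that would normally come from the database.
documents = {
  1 => "Carmen Sandiego stole the Sphinx",
  2 => "The detective searched Boston",
  3 => "Carmen was last seen in Boston"
}

index = build_index(documents)   # done once, up front
p search(index, "carmen boston") # queries never touch `documents` again
```

Because the search step only reads the prebuilt index, the database is queried at indexing time rather than on every search, which is where the reduced DB load comes from.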
Can Haz Search?
What’s Out There
• Direct SQL
• Ferret
• SOLR
• Lucene
• Sphinx
Every time you integrate Ferret, an angel weeps for you
‘Nuff Said
Although angels are known to be emotional characters
Courtesy of: Evan Weaver, “Rails Search Benchmarks” 03/17/08
UltraSphinx
Also, Evan Weaver likes Thinking Sphinx
Why Sphinx Rocks
• Relevance Ratings and Phrase Proximity
• Active Development
• searchd Daemon doesn’t hog memory
• Delta Indexing
• Fast Indexing + Querying
• Distributed Capability
You rock too, but Sphinx is cooler
Why TS Rocks
• Maximizes use of the Riddle Client
• Sort modes
• Match modes
• Great support and active community
• Available as a gem and a plug-in
• Beautiful Code
• Pat Allan is the man
That was mean - I apologize for the burn in the last slide.
You are just as cool as Sphinx
Let’s Play A Game...
Where the F*ck is Carmen Sandiego?©
Courtesy: Bob-Rz @ Deviant Art 02/19/07
“Where the F*ck is Carmen Sandiego?” is a registered trademark of Enlight Solutions, Inc. Well, not really but it sounds cool. Honestly, though, does anyone ever read the fine print? You should be paying attention to the presentation. On we go... seriously, focus people.
Define your Index of Suspects
InstallShield FTL
Let’s Use Rake
• rake ts:config
• rake ts:in
• rake ts:start
• rake ts:stop
• rake ts:restart
Get to Work, Detective
Make Your Arrest
That was easy...
Additional Features
• Match Modes
• Sort Modes
• Polymorphism
• Field Weighting
• Integration with will_paginate
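To give a feel for what field weighting means, here is a minimal plain-Ruby sketch. It is not the Thinking Sphinx API or Sphinx's ranking formula; `weighted_score`, `rank`, and the sample records are hypothetical names made up for illustration. A match in a heavily weighted field counts for more than a match in a lightly weighted one.

```ruby
# Score one record: for each field, count the query words it contains
# and multiply that count by the field's weight.
def weighted_score(record, query, weights)
  words = query.downcase.scan(/\w+/)
  weights.sum do |field, weight|
    field_words = record.fetch(field, "").downcase.scan(/\w+/)
    weight * words.count { |word| field_words.include?(word) }
  end
end

# Return the ids of matching records, highest score first (ties broken by id).
def rank(records, query, weights)
  scored = records.map { |record| [record, weighted_score(record, query, weights)] }
  scored.select { |_, score| score > 0 }
        .sort_by { |record, score| [-score, record[:id]] }
        .map { |record, _| record[:id] }
end

# Hypothetical suspects table.
suspects = [
  { id: 1, name: "Carmen Sandiego", bio: "International thief" },
  { id: 2, name: "Bob", bio: "Once chased Carmen across Boston" }
]

# Weighting the name field more heavily puts the suspect named Carmen first.
p rank(suspects, "carmen", { name: 10, bio: 1 })
```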
What I Wish I Knew
Protip: Despite its misleading name, Rockapella does not rock
Serious Mullet
What I Wish I Knew
About Integrating TS
• Sometimes the indexer silently fails
• Watch your output
• Disregard the Distributed Index warning
• Use delta indexing
• Run regular index tasks
• Use delayed_job or another queue manager to handle delta indexing
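The idea behind delta indexing, and why regular full index runs still matter, can be sketched as a toy model in plain Ruby. `ToyDeltaSearch` and its methods are hypothetical and greatly simplified; they are not Thinking Sphinx internals. Saves rebuild only a small delta index, while a periodic full reindex folds everything back into the main index.

```ruby
require "set"

# Toy model of delta indexing; all names are made up for illustration.
class ToyDeltaSearch
  def initialize
    @records = {}       # id => text, standing in for database rows
    @changed = Set.new  # ids saved since the last full index (the "delta" flag)
    @main_index = {}    # keyword => Set of ids, rebuilt only by full_reindex
    @delta_index = {}   # keyword => Set of ids, rebuilt on every save
  end

  # Saving flags the record as changed and rebuilds only the small delta index.
  def save(id, text)
    @records[id] = text
    @changed << id
    @delta_index = build_index(@records.select { |record_id, _| @changed.include?(record_id) })
  end

  # The regular, expensive index task: reindex everything and clear the flags.
  def full_reindex
    @main_index = build_index(@records)
    @changed.clear
    @delta_index = {}
  end

  # Search both indices; main-index entries for changed records are stale, so skip them.
  def search(word)
    key = word.downcase
    fresh_main = @main_index.fetch(key, Set.new) - @changed
    (fresh_main | @delta_index.fetch(key, Set.new)).to_a.sort
  end

  private

  def build_index(records)
    index = {}
    records.each do |id, text|
      text.downcase.scan(/\w+/).each { |word| (index[word] ||= Set.new) << id }
    end
    index
  end
end

suspects = ToyDeltaSearch.new
suspects.save(1, "Carmen spotted in Boston")
suspects.full_reindex
suspects.save(1, "Carmen spotted in Cairo") # searchable right away via the delta index
p suspects.search("cairo")
```

In this model the delta index grows with every save until the next full reindex, which is one reason to schedule regular index tasks, and the per-save rebuild is the kind of work that could be pushed to a background queue.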
What time is it? Beer o’clock
What I Wish I Knew
About Deploying TS
• Store PID files in a shared folder
• Ensure you’ve set proper permissions
• Set memory limits on indexing
• mem_limit option in sphinx.yml
• For large data sets, indices can be extremely large
• Ensure you have a surplus of storage capacity
Are we done yet? It’s about that time for a beer...
What’s Missing?
• Excerpting
• Strong Facet Support
• ASpell Integration/Spell Check support
Blah, blah, blah - You must be getting thirsty by now
It’s a Young but Awesome Utility
• Clone the source and see for yourself
• freelancing-god/thinking-sphinx
• Cucumber test-suite
• Extremely well architected
• Join the mailing list (Google Groups)
Did he mention Pat Allan is the man, yet?
Thanks
• Follow me on Twitter
• www.twitter.com/dpickett
• Check out my blog
• www.enlightsolutions.com
• Recommend me
Questions?