Enterprise Search: How Do We Get There From Here?
DESCRIPTION
Enterprise Search: How Do We Get There From Here?
by Daniel Tunkelang (Head of Query Understanding, LinkedIn)
Keynote at 2013 Enterprise Search Summit

We've been tackling the challenges of enterprise and site search for at least three decades. We've succeeded to the point that search is the gateway to many of our information repositories. Nonetheless, users of enterprise search systems are frustrated with these systems' shortcomings. We see this frustration in surveys, but, more importantly, most of us experience it personally in our daily work life.

We all dream of a world where searching any information repository is as effective as searching the web, perhaps even more so. A world where we find what we're looking for, or quickly determine that it doesn't exist. Is this Utopia possible? If so, how do we get there from here? Or at least somewhere close?

In this talk, Tunkelang reviews the track record of enterprise search. He talks about what's worked and what hasn't, especially as compared to web search. Finally, he proposes some paths to bring us closer to our dream.

Daniel Tunkelang is Head of Query Understanding at LinkedIn. Educated at MIT and CMU, he has spent his career working on big data, addressing key challenges in search, data mining, user interfaces, and network analysis. He co-founded enterprise search and business intelligence pioneer Endeca, where he spent a decade as its Chief Scientist. In 2011, Endeca was acquired by Oracle for over $1B. Previous to LinkedIn, he led a team at Google working on local search quality. Daniel has authored fifteen patents, written a textbook on faceted search, and created the annual symposium on human-computer interaction and information retrieval.

TRANSCRIPT
Enterprise Search: How do we get there from here?
Daniel Tunkelang Head of Query Understanding, LinkedIn
THERE The Dream (Franz Marc, 1912)
“Computer, what is the nature of the universe?”
"a web of data that can be processed by machines"
Mind reading is now possible!
HERE Office Space (1999)
Google VP Udi Manber on their in-house search: “It’s not that good.”
Beyond 10 blue links? Not so much.
Meta-utopia or Metacrap?
Cory Doctorow’s seven straw-men of meta-utopia:
1. People lie.
2. People are lazy.
3. People are stupid.
4. Mission: Impossible -- know thyself.
5. Schemas aren't neutral.
6. Metrics influence results.
7. There's more than one way to describe something.
So how do we get there from here?
Three Baby Steps on the Path to Utopia
1. Exercise common sense.
2. Show some humility.
3. If all else fails, cheat.
Remember what the Dormouse said: Feed your head.
From 2012 Google Zeitgeist
Monitor your top queries. Nail them.
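A minimal sketch of what monitoring top queries could look like, assuming the search log is available as a plain list of query strings. The function names and the sample log are hypothetical, not taken from the talk.

```python
from collections import Counter

def normalize(query):
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(query.lower().split())

def top_queries(log, n=10):
    """Return (query, count, share of traffic) for the n most frequent queries."""
    counts = Counter(normalize(q) for q in log if q.strip())
    total = sum(counts.values())
    return [(q, c, c / total) for q, c in counts.most_common(n)]

# Hypothetical query log; real data would come from the search system's logs.
sample_log = [
    "Expense Report", "expense  report", "vacation policy",
    "expense report", "org chart", "vacation policy",
]

for query, count, share in top_queries(sample_log, n=2):
    print(f"{query!r}: {count} searches ({share:.0%} of traffic)")
```

The head of this list is the short set of queries worth reviewing by hand, since a small number of queries often accounts for a large share of traffic.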
for i in [1..n]
    s ← w1 w2 … wi
    if Pc(s) > 0
        a ← new Segment()
        a.segs ← {s}
        a.prob ← Pc(s)
        B[i] ← {a}
    for j in [1..i-1]
        for b in B[j]
            s ← wj+1 wj+2 … wi
            if Pc(s) > 0
                a ← new Segment()
                a.segs ← b.segs U {s}
                a.prob ← b.prob * Pc(s)
                B[i] ← B[i] U {a}
    sort B[i] by prob
    truncate B[i] to size k
Long tail? Structure and segment your queries.
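The segmentation pseudocode above can be sketched in Python roughly as follows. This is an illustration under assumptions: a plain dict stands in for the concept probability Pc, the probabilities in the example are invented, and the whole-prefix case is folded into the main loop by seeding the table with an empty segmentation.

```python
def segment_query(words, concept_prob, k=3):
    """Return up to k (probability, segments) pairs for the query, best first.

    words: list of query tokens.
    concept_prob: dict mapping a phrase to its probability (Pc in the
        pseudocode); phrases missing from the dict count as probability 0.
    """
    n = len(words)
    # best[i] holds the top-k segmentations of the first i words.
    best = [[] for _ in range(n + 1)]
    best[0] = [(1.0, [])]  # the empty prefix has one trivial segmentation
    for i in range(1, n + 1):
        candidates = []
        for j in range(i):
            phrase = " ".join(words[j:i])  # candidate last segment
            p = concept_prob.get(phrase, 0.0)
            if p <= 0:
                continue
            # Extend every kept segmentation of the first j words.
            for prob, segs in best[j]:
                candidates.append((prob * p, segs + [phrase]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        best[i] = candidates[:k]  # beam truncation to size k
    return best[n]

# Hypothetical concept probabilities, for illustration only.
pc = {
    "new": 0.02, "york": 0.01, "times": 0.02, "subscription": 0.03,
    "new york": 0.05, "york times": 0.001, "new york times": 0.04,
}
results = segment_query("new york times subscription".split(), pc, k=3)
for prob, segs in results:
    print(segs, prob)
```

With these made-up numbers the top segmentation keeps "new york times" together as one segment, which is the behaviour the slide is after for long-tail queries.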
Entities and categories are your friends.
Even the entities for which you have no results.
Identify unsuccessful searches.
Use analytics to drive triage.
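As a rough sketch of the triage input, assuming each log record pairs a query with the number of results it returned, the following computes the zero-result rate and the most frequent failing queries. The record format, function name, and sample data are assumptions made for illustration.

```python
from collections import Counter

def zero_result_report(log, n=5):
    """Return (zero-result rate, n most frequent zero-result queries).

    log: iterable of (query, result_count) pairs.
    """
    total = 0
    failures = Counter()
    for query, result_count in log:
        total += 1
        if result_count == 0:
            failures[query.strip().lower()] += 1
    rate = sum(failures.values()) / total if total else 0.0
    return rate, failures.most_common(n)

# Hypothetical log records: (query, number of results returned).
sample_records = [
    ("jon smith", 0),
    ("john smith", 12),
    ("Jon Smith", 0),
    ("expense report", 40),
    ("pat example", 0),
]

rate, worst = zero_result_report(sample_records)
print(f"zero-result rate: {rate:.0%}")
for query, count in worst:
    print(f"  {query!r} failed {count} times")
```

The failing queries at the top of this list are the natural candidates for manual review of a sample, which is where the human judgment comes in.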
“Sorry, no results containing all your search terms were found.”
Analyzed representative random sample of name searches. Leading causes: 1) Misspelled names. 2) Correctly spelled name of someone not on site.
Combine automated analysis with human judgment.
Triage drives and validates agile development.
Misspelled name?
Correctly spelled name of someone not on site?
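One way to tell these two cases apart, sketched with Python's standard `difflib` module: if the query is close to a known name, offer a suggestion; otherwise say plainly that the person is not on the site. The function name, the similarity cutoff, and the sample names are assumptions for illustration, not a description of how any particular site does it.

```python
import difflib

def explain_name_search(query, names_on_site, cutoff=0.8):
    """Classify a name search as found, likely misspelled, or not on site."""
    q = " ".join(query.lower().split())
    index = {name.lower(): name for name in names_on_site}
    if q in index:
        return ("found", [index[q]])
    # Fuzzy match against known names; cutoff is an assumed threshold.
    close = difflib.get_close_matches(q, list(index), n=3, cutoff=cutoff)
    if close:
        return ("did_you_mean", [index[c] for c in close])
    return ("not_on_site", [])

# Hypothetical member list.
members = ["Daniel Tunkelang", "Claudia Hauff"]

print(explain_name_search("daniel tunkelnag", members))
print(explain_name_search("Zebulon Quixote", members))
```

Distinguishing the two outcomes matters because the useful response differs: a spelling suggestion in one case, an honest "not here" in the other.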
You just ask them?
vs.
Recognize ambiguity and ask for clarification.
Clarify, then refine.
Computers Books
It’s 2013. Please use faceted search.
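For readers unfamiliar with the mechanics, here is a minimal sketch of faceted search over an in-memory result set: count the values of each facet across the results, then narrow the results when the user picks a value. The document fields and sample data are hypothetical.

```python
from collections import Counter, defaultdict

def facet_counts(docs, facets):
    """Count how many documents carry each value of each facet."""
    counts = defaultdict(Counter)
    for doc in docs:
        for facet in facets:
            value = doc.get(facet)
            if value is not None:
                counts[facet][value] += 1
    return {facet: dict(counts[facet]) for facet in facets}

def refine(docs, **selected):
    """Keep only documents matching every selected facet value."""
    return [d for d in docs if all(d.get(f) == v for f, v in selected.items())]

# Hypothetical results for an ambiguous query such as "python".
hits = [
    {"title": "Learning Python", "category": "Books", "format": "Paperback"},
    {"title": "Python Cookbook", "category": "Books", "format": "E-book"},
    {"title": "Python IDE license", "category": "Computers", "format": "Download"},
]

print(facet_counts(hits, ["category", "format"]))
print([d["title"] for d in refine(hits, category="Books")])
```

The counts give the user a summary of what the result set contains, and each selection is a refinement rather than a new search.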
Make your best guess, but hedge your bets.
Claudia Hauff, Query Difficulty for Digital Libraries [2009]
Not all queries are created equal in difficulty.
“It's ok to cheat, as long as you cheat your way to the top.”
Design an experience that doesn’t require search.
Crowd-source curation.
Unstructured data? Beg, borrow, or steal.
Solve an easier problem: re-finding.
Invest in type-ahead, especially instant results.
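A small sketch of the lookup behind type-ahead, assuming a table of past queries with their frequencies (keys already lowercased). It uses binary search over a sorted list to find prefix matches and ranks them by popularity. The class name and sample data are hypothetical, and an instant-results variant would return documents rather than query suggestions.

```python
import bisect

class TypeAhead:
    """Prefix suggestions over known queries, most popular first."""

    def __init__(self, query_counts):
        # query_counts: dict of lowercase query -> how often it was searched.
        self._counts = dict(query_counts)
        self._queries = sorted(self._counts)

    def suggest(self, prefix, limit=5):
        prefix = prefix.lower()
        if not prefix:
            return []
        # All strings sharing a prefix are contiguous in sorted order.
        start = bisect.bisect_left(self._queries, prefix)
        matches = []
        for query in self._queries[start:]:
            if not query.startswith(prefix):
                break
            matches.append(query)
        matches.sort(key=lambda q: (-self._counts[q], q))
        return matches[:limit]

# Hypothetical query frequencies.
ta = TypeAhead({
    "expense report": 50,
    "expense policy": 20,
    "expo schedule": 5,
    "vacation policy": 30,
})
print(ta.suggest("exp"))
```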
“Good artists copy. Great artists steal.” -- Picasso / Jobs
Three Baby Steps on the Path to Utopia
1. Exercise common sense.
2. Show some humility.
3. If all else fails, cheat.
It’s the economy, stupid!
Warning: technology alone is not a solution.
The future is on the way.
But the present doesn’t have to be so bad.