INTRODUCTION TO ENTERPRISE SEARCH Kristian Norling

Upload: findwise

Post on 20-Aug-2015

Category:

Technology


TRANSCRIPT

Page 2: Introduction to Enterprise Search

• Who is here?

• Your expectations?

• Kristian?

• 2 hours, one break

• Lifetime answer Guarantee on this class

Introduction

Page 4: Introduction to Enterprise Search

• Problem

• History of (web) search

• How we search and find?

• Current state of Enterprise Search + stats

• Technical concept

• Information quality

• Feedback cycle

• Five dimensions of Findability

Agenda

Page 8: Introduction to Enterprise Search

• Growing amounts of Information

• Changing patterns of information consumption

• Information silos

• Web-like behaviour > Information filters

• Internal information use is still in the Digital Stone Age

The Problems

Page 9: Introduction to Enterprise Search

In Academia search is called Information Retrieval.

It is an old discipline, dating back thousands of years...

Basic concepts in Information Retrieval:

Recall and Precision, more later...

History of Search

Page 10: Introduction to Enterprise Search

• Directories are manually compiled taxonomies of websites

• Directories are far more costly and time-intensive to maintain

• Directories lack coverage, although they provide an important alternative, especially for novice surfers

• Search engines rely mainly on automated search algorithms

• Search engines rank pages by popularity on the web: the more referrals (links), the more relevant

Directories vs. Search Engines
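The "more referrals, more relevant" idea can be sketched as a plain inbound-link count. This is a minimal illustration and not PageRank or any engine's actual algorithm; the function name and the example link data are assumptions made up for the sketch.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by how many distinct pages link to them.

    `links` is a list of (source, target) pairs (hypothetical data).
    """
    # Count each (source, target) pair once and ignore self-links.
    unique_links = {(src, dst) for src, dst in links if src != dst}
    inbound = Counter(dst for _, dst in unique_links)
    # Sort by descending link count, then by name for a stable order.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical link graph: three sites refer to c.example, two to a.example.
links = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("c.example", "a.example"),
    ("d.example", "c.example"),
    ("d.example", "a.example"),
]
ranking = rank_by_inbound_links(links)
print(ranking)
```

Pages with no inbound links do not appear in the ranking at all, which is one reason real engines combine link data with other signals.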

Page 11: Introduction to Enterprise Search

Yahoo – searchable directory (1994, ~10000 websites)

• Integrates search over its directory. Organized by subject matter. Sites can be suggested, but human editors control the quality of the directory (~100 dedicated editors)

Ask – natural language search engine (1998)

• Used human editors to match popular queries. Tried different algorithms to rank pages by popularity

Google – searchable index (1998)

• Developed PageRank, a popularity algorithm that hides bad content. Set standards (spellchecking, query suggestion, search results page design)

Early days of Web Search

Page 12: Introduction to Enterprise Search

First generation (1995-97) – AltaVista, Excite, WebCrawler

Uses mostly on-page data (text and formatting).

Informational queries.

Second generation (1998-2010) – Google, Yahoo

Uses off-page, web-specific data: link analysis, anchor text, click-through data. Informational and navigational queries.

Third generation (2010-present) – Google, Wolfram-Alpha, Bing

Blends data from many sources and tries to answer “the need behind the query”: semantic analysis, context determination, dynamic database selection, etc. Informational, navigational, and transactional queries.

Web Search - evolution

Page 13: Introduction to Enterprise Search

Find information assumed to be available on the web in a static form.

Seeking information modes:

Informational

Page 14: Introduction to Enterprise Search

Reach a particular site that the user has in mind, either because they visited it in the past or because they assume that such a site exists. These queries usually have only one "right" result.

Seeking information modes:

Navigational

Page 15: Introduction to Enterprise Search

Reach a site where further interaction will happen. This interaction constitutes the transaction defining these queries. The main categories for such queries are shopping, finding various web-mediated services, downloading various types of files (images, songs, etc.), accessing certain databases (e.g. Yellow Pages type data), finding servers (e.g. for gaming), etc.

Seeking information modes:

Transactional

Page 16: Introduction to Enterprise Search

Finding something when I know what I want and have words to describe it.

Four modes of seeking information

Page 17: Introduction to Enterprise Search

Exploring when I only have some idea of what I want and may lack the words to articulate it.

Four modes of seeking information

Page 18: Introduction to Enterprise Search

Finding relevant items when I don’t know what I need.

Four modes of seeking information

Page 19: Introduction to Enterprise Search

Finding something I have seen before, but can’t remember where.

Four modes of seeking information

Page 20: Introduction to Enterprise Search

•The amount of information is growing every day

•What to Search for?

•Where to Search?

•How to Search?

•Search is simple, complex and powerful

•Findability Dimensions

The State of Enterprise Search

Page 22: Introduction to Enterprise Search

HOW CRITICAL IS FINDING THE RIGHT INFORMATION TO BUSINESS GOALS AND SUCCESS?

Page 23: Introduction to Enterprise Search

EUROPE: 76.5%

IMPERATIVE/SIGNIFICANT

Page 24: Introduction to Enterprise Search

Zoom Zoom

Page 25: Introduction to Enterprise Search

IS IT EASY TO FIND THE RIGHT INFORMATION WITHIN YOUR ORGANISATION TODAY?

Page 26: Introduction to Enterprise Search

EUROPE: 77%

MODERATELY/VERY HARD

Page 27: Introduction to Enterprise Search

LEVEL OF SATISFACTION?

Page 29: Introduction to Enterprise Search

EUROPE: 18.5%

MOSTLY/VERY SATISFIED

Page 30: Introduction to Enterprise Search

WHAT ARE THE OBSTACLES TO FINDING THE RIGHT INFORMATION?

Page 31: Introduction to Enterprise Search

63.4% POOR SEARCH FUNCTIONALITY

52.1% DON'T KNOW WHERE TO LOOK

51.4% INCONSISTENCY IN HOW WE TAG CONTENT

50.0% LACK OF ADEQUATE TAGS

33.1% DON’T KNOW WHAT TO LOOK FOR

Globally

Page 32: Introduction to Enterprise Search

“Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience.”

Source: http://en.wikipedia.org/wiki/Enterprise_search

Wikipedia Definition

Page 33: Introduction to Enterprise Search

In the field of information retrieval, precision is the fraction of retrieved documents that are relevant to the search.

Precision takes all retrieved documents into account, but it can also be evaluated at a given cut-off rank, considering only the topmost results returned by the system. This measure is called precision at n or P@n.

Source: Wikipedia

The Concept of Enterprise Search: Precision
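Precision at n can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name, the document IDs, and the choice to divide by n even when fewer than n results come back are all choices made for the example, not part of the original slides.

```python
def precision_at_n(ranked_results, relevant, n):
    """Fraction of the top-n ranked results that are relevant (P@n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_results[:n]
    hits = sum(1 for doc in top if doc in relevant)
    # Assumption: divide by n, so a short result list is penalised.
    return hits / n


# Hypothetical ranked result list and relevance judgements.
ranked = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d6"}

p_at_3 = precision_at_n(ranked, relevant, 3)  # d1 and d3 are hits
p_at_5 = precision_at_n(ranked, relevant, 5)
print(p_at_3, p_at_5)
```

P@n is often more useful than overall precision for enterprise search, since users rarely look beyond the first page of results.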

Page 34: Introduction to Enterprise Search

Recall in information retrieval is the fraction of the documents that are relevant to the query that are successfully retrieved.

For example, for a text search on a set of documents, recall is the number of correct results divided by the number of results that should have been returned.

Source: Wikipedia

The Concept of Enterprise Search: Recall

Page 35: Introduction to Enterprise Search

M = number of relevant documents

N = number of retrieved documents

R = number of retrieved documents that are also relevant

Precision and Recall

Page 36: Introduction to Enterprise Search

Recall = R / M = (number of retrieved documents that are also relevant) / (total number of relevant documents)

Precision = R / N = (number of retrieved documents that are also relevant) / (total number of retrieved documents)

Precision and Recall
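The two ratios, using the symbols M, N and R from the slides, can be computed directly from sets of document IDs. This is a small sketch; the function name, the example IDs, and the choice to return 0.0 for an empty set are assumptions for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    m = len(relevant)              # M: relevant documents
    n = len(retrieved)             # N: retrieved documents
    r = len(retrieved & relevant)  # R: retrieved and relevant
    # Assumption: define the ratio as 0.0 when the denominator is zero.
    precision = r / n if n else 0.0
    recall = r / m if m else 0.0
    return precision, recall


# Hypothetical query: 4 documents retrieved, 5 relevant, 2 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)  # R/N = 2/4, R/M = 2/5
```

Returning more documents can only keep recall equal or raise it, but it usually lowers precision, which is why the two measures are reported together.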

Page 37: Introduction to Enterprise Search

...enterprises typically have to use other query-independent factors, such as a document's recency or popularity, along with query-dependent factors traditionally associated with information retrieval algorithms. Also, the rich functionality of enterprise search UIs, such as clustering and faceting, diminish reliance on ranking as the means to direct the user's attention.

Relevance

Source: Wikipedia
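One way to picture the mix of query-dependent and query-independent factors is a weighted score. The sketch below is purely illustrative: the weights, the 180-day half-life, the view cap, and every name in it are assumptions, not values from the slides or from any search product.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text_score: float  # query-dependent score in [0, 1], e.g. from the engine
    age_days: float    # days since the document was last updated
    views: int         # simple popularity signal


def combined_score(doc, w_text=0.6, w_recency=0.25, w_popularity=0.15,
                   half_life_days=180.0, views_cap=10_000):
    """Blend text relevance with recency and popularity (hypothetical weights)."""
    # Recency halves every `half_life_days`.
    recency = 0.5 ** (doc.age_days / half_life_days)
    # Log-scaled popularity, capped at 1.0.
    popularity = min(math.log1p(doc.views) / math.log1p(views_cap), 1.0)
    return (w_text * doc.text_score
            + w_recency * recency
            + w_popularity * popularity)


docs = [
    Doc("old-policy", text_score=0.80, age_days=720, views=100),
    Doc("new-policy", text_score=0.80, age_days=0, views=100),
    Doc("off-topic", text_score=0.10, age_days=0, views=9000),
]
ranked = sorted(docs, key=combined_score, reverse=True)
print([d.doc_id for d in ranked])
```

With equal text scores the fresher document wins, while a popular but off-topic document still ranks last because the text score carries the largest weight.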

Page 38: Introduction to Enterprise Search

PageRank

Page 39: Introduction to Enterprise Search

We do not have PageRank...

...but we have social!

Social Reconnects Enterprise Search

Emails, People Catalogues, Connections, Tagging, Sharing etc.

Relevance

Page 40: Introduction to Enterprise Search

The Concept of Enterprise Search

Page 41: Introduction to Enterprise Search

Examples of implementations:

- People Search

- Product Search

- Document Search

- Intranet and Website Search

- E-commerce

- Dashboard / Search as a Service

Search based Solutions

Page 42: Introduction to Enterprise Search

• Good Data/Information hygiene

• Crap in = Crap out

• Metadata is very important!

• Taxonomy and Metadata demystified

• TetraPak example (video)

• SimCorp example

• VGR example (video)

Information / Content

Page 47: Introduction to Enterprise Search

Author: Douglas Coupland

Title: Hej Nostradamus!

Publisher: Norstedts

Printed by: Smedjebacken

Year: 2003

Printed: 2004

Kristian Norling

Page 49: Introduction to Enterprise Search

Example: Ernst & Young

• Metadata

• Titles

• Content Quality

• Information Life Cycle Management

ESEO: Actionable activities

Page 50: Introduction to Enterprise Search

But, an average Search budget is 100K Euro

• TCO

• ROI

• KPI

Search Analytics is key

Show me the Money

Page 51: Introduction to Enterprise Search

Important, delivers actionable to-dos quickly

• 0-results

• Top Terms Searched for

Video: Search Analytics in Practice

Search Analytics
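Both reports named on this slide, zero-result queries and top search terms, can be derived from a simple query log. This is a minimal sketch: the log format of (query, result count) pairs and the function name are assumptions, and real search platforms expose this data in their own formats.

```python
from collections import Counter


def analyse_query_log(log, top_k=2):
    """Return (top search terms, zero-result queries) from (query, result_count) pairs."""
    # Normalise case and whitespace so variants of a query are counted together.
    normalised = [(query.strip().lower(), count) for query, count in log]
    top_terms = Counter(q for q, _ in normalised).most_common(top_k)
    zero_results = Counter(q for q, c in normalised if c == 0).most_common()
    return top_terms, zero_results


# Hypothetical log entries.
log = [
    ("vacation policy", 12),
    ("Vacation Policy", 12),
    ("vacation policy ", 12),
    ("expense report", 0),
    ("expense report", 0),
    ("vpn", 5),
    ("canteen menu", 0),
]
top_terms, zero_results = analyse_query_log(log)
print(top_terms)
print(zero_results)
```

Frequent zero-result queries are the quickest to-dos: each one points at either missing content or missing synonyms and metadata.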

Page 52: Introduction to Enterprise Search

• Feedback form

• KPI from Search Analytics

• Session time × number of sessions = time spent on search; time spent on search × hourly price = cost per “answer”

• Add search refinements + exit page (= is the right answer)

User Satisfaction
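The cost calculation on this slide can be worked through with example figures. The numbers below (3-minute sessions, 10,000 sessions, an hourly price of 50) are invented for illustration, and treating each session as one "answer" is an assumption about how the slide's formula is meant to be read.

```python
def search_cost(avg_session_minutes, sessions, hourly_price):
    """Return (hours spent on search, total cost, cost per session)."""
    hours_on_search = avg_session_minutes * sessions / 60
    total_cost = hours_on_search * hourly_price
    # Assumption: one session corresponds to one "answer".
    cost_per_session = total_cost / sessions if sessions else 0.0
    return hours_on_search, total_cost, cost_per_session


# Hypothetical figures taken from a search analytics report.
hours, total, per_session = search_cost(
    avg_session_minutes=3, sessions=10_000, hourly_price=50
)
print(hours, total, per_session)
```

Tracking this figure over time gives a KPI that shows whether search improvements actually reduce the time people spend looking for answers.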

Page 53: Introduction to Enterprise Search

Findability by Findwise

1. BUSINESS

Build solutions to support your business processes and goals

2. INFORMATION

Prepare information to make it findable

3. USERS

Build usable solutions based on user needs

4. ORGANISATION

Govern and improve your solution over time

5. SEARCH TECHNOLOGY

Build solutions based on state-of-the-art search technology

Page 54: Introduction to Enterprise Search

• Analyze how your business goals and strategies can be met by improved information access

• Set Findability goals. Examples: increase sales revenue, raise productivity, improve knowledge sharing, enable better collaboration

• Specify your requirements

• Define KPIs and measure the success of your investments

Business

Page 55: Introduction to Enterprise Search

• Clean up and archive or delete outdated/irrelevant information

• Ensure good quality of information by adding structured and suitable metadata

• Create and use information models and taxonomies

• Tagging?

Information

Page 56: Introduction to Enterprise Search

• Get to know your users and their needs

• Make sure your solution is easy to use

• Perform continuous usability evaluations, like usage tests and expert evaluations

• Make sure users find what they are looking for

• Enable feedback loops for complaints, feedback and praise

Users

Page 57: Introduction to Enterprise Search

• Resources!

• Define processes, roles and routines to govern the solution

• Perform Search Analytics

• Create easy to use administration interfaces

• Provide training, both technical and editorial

• Help publishers get started with processes for better findability

Organisation

Page 58: Introduction to Enterprise Search

• Select a suitable search platform or make the most of your current solution

• Design your architecture with search-as-a-service in mind

• Utilise the full potential of the selected technology

Search Technology