BOSS Open Hack Day, Bangalore

Open Hack Day 2009 - Bangalore Chris Heilmann Saurabh Sahni Build your Own Search Service http://www.slideshare.net/saurabhsahni/

Upload: saurabh-sahni

Post on 22-Nov-2014

DESCRIPTION

An introduction to BOSS API

TRANSCRIPT

Page 1: BOSS Open Hack Day, Bangalore

Open Hack Day 2009 - Bangalore

Chris Heilmann Saurabh Sahni

Build your Own Search Service

http://www.slideshare.net/saurabhsahni/

Page 2: BOSS Open Hack Day, Bangalore

Outline

•  Search engines using BOSS
•  About BOSS API
   –  What?
   –  Why?
   –  Features
•  How to use it
   –  BOSS API
   –  Code example
   –  BOSS Mashup framework

Page 3: BOSS Open Hack Day, Bangalore

Search engines using BOSS

Page 4: BOSS Open Hack Day, Bangalore

hakia: http://hakia.com/

Page 7: BOSS Open Hack Day, Bangalore

Cluuz: http://cluuz.com

Page 10: BOSS Open Hack Day, Bangalore

Keyword finder - http://keywordfinder.org/

Page 11: BOSS Open Hack Day, Bangalore

askBOSS: http://ask-boss.appspot.com/

Page 16: BOSS Open Hack Day, Bangalore

About BOSS API

Page 17: BOSS Open Hack Day, Bangalore

What?

•  Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search

http://developer.yahoo.com/search/boss

Page 18: BOSS Open Hack Day, Bangalore

Usage

Opening the search technology stack

50B pages * 20ms page download = 31 years

[Diagram: the search technology stack, with stages Crawl, Extract, Spam <-> Gold, Analyze, Index, Rank Assist, Web Map, Retrieve]
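
The "31 years" figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes a single crawler downloading pages strictly one after another and a 365.25-day year; both are assumptions made for illustration, not details given on the slide.

```python
# Back-of-the-envelope check of "50B pages * 20ms page download = 31 years".
# Assumption: one crawler, fetching pages sequentially with no parallelism.
PAGES = 50_000_000_000             # 50 billion pages
SECONDS_PER_PAGE = 0.020           # 20 ms per page download
SECONDS_PER_YEAR = 365.25 * 24 * 3600

total_seconds = PAGES * SECONDS_PER_PAGE
total_years = total_seconds / SECONDS_PER_YEAR

print(f"{total_seconds:.0f} seconds is roughly {total_years:.1f} years")
```

The result lands between 31 and 32 years, which matches the rounded number on the slide.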

Page 19: BOSS Open Hack Day, Bangalore

Usage

Opening the search technology stack

50B pages * 20ms page download = 31 years

[Diagram: the search technology stack (Crawl, Extract, Spam <-> Gold, Analyze, Index, Rank Assist, Web Map, Retrieve), now with "WEB API" and "Your App here" added]

Page 20: BOSS Open Hack Day, Bangalore

Why?

•  Removes entry barriers
•  Asset to Innovate
   –  Develop new relevance models
   –  Change presentation style
•  Search anywhere
   –  Improve Vertical Quality w/ Web comprehensiveness

Page 21: BOSS Open Hack Day, Bangalore

BOSS API features

•  No branding or attribution
•  Ability to change presentation style
•  Ability to re-order results and blend in additional content
•  Access to multiple verticals (web search, image, news)
•  Keyword suggestions, spell checks
•  Semantic data, in-links, abstracts
•  Ability to monetize

Page 22: BOSS Open Hack Day, Bangalore

How to use it?

Page 23: BOSS Open Hack Day, Bangalore

Get Started

•  Register for an application id http://developer.yahoo.com/wsregapp/

•  Documentation http://developer.yahoo.com/search/boss/boss_guide/

•  Code samples: JavaScript, PHP and Python http://www.saurabhsahni.com/boss-examples.zip

Page 24: BOSS Open Hack Day, Bangalore

BOSS API

Searching Slumdog Millionaire

(Source: http://en.wikipedia.org/wiki/File:Slumdog_Millionaire_poster.jpg)

Page 25: BOSS Open Hack Day, Bangalore

BOSS API

•  Search for slumdog millionaire:
   –  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml
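
The request URL above can be assembled programmatically instead of by hand. This is a minimal sketch using only Python's standard library; the function name `web_search_url` is made up for illustration, and `xyz` stands in for a real application id, as on the slide.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"

def web_search_url(query, appid, fmt="xml"):
    """Build a BOSS web search URL (hypothetical helper, not part of any SDK)."""
    # quote_plus turns spaces into "+", matching the slide's "slumdog+millionaire".
    path = quote_plus(query)
    params = urlencode({"appid": appid, "format": fmt})
    return f"{BOSS_BASE}/web/v1/{path}?{params}"

print(web_search_url("slumdog millionaire", "xyz"))
```

Encoding the query through `quote_plus` also keeps multi-word or punctuated queries from breaking the URL path.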

Page 26: BOSS Open Hack Day, Bangalore

BOSS API: XML response

http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml

Page 27: BOSS Open Hack Day, Bangalore

Site Restrict Search

•  Search for slumdog millionaire on selected movie sites
   –  Add param sites=indiatimes.com,movies.yahoo.com,imdb.com
   –  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml
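
The site restriction is just a comma-separated `sites` value that gets percent-encoded into the query string. A possible sketch follows; `site_restricted_url` is a hypothetical helper name and `xyz` is a placeholder application id.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"

def site_restricted_url(query, appid, sites, fmt="xml"):
    """Build a web search URL limited to the given sites (illustrative helper)."""
    params = urlencode({
        "appid": appid,
        "sites": ",".join(sites),   # urlencode escapes each comma as %2C
        "format": fmt,
    })
    return f"{BOSS_BASE}/web/v1/{quote_plus(query)}?{params}"

print(site_restricted_url("slumdog millionaire", "xyz",
                          ["indiatimes.com", "movies.yahoo.com"]))
```

Passing the sites as a list avoids hand-encoding the commas, which is where the `%2C` in the slide's URL comes from.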

Page 28: BOSS Open Hack Day, Bangalore

http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml

Page 29: BOSS Open Hack Day, Bangalore

Search images

•  http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire?dimensions=large
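
An image search differs from a web search only in the vertical (`images`) and the extra `dimensions` parameter. The sketch below adds `appid`, which the slide's URL leaves out but which the REST interface slide lists as required; the helper name and the `xyz` id are placeholders.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"

def image_search_url(query, appid, dimensions="large"):
    """Build an image search URL (illustrative helper, not an official SDK call)."""
    # "large" is the only dimensions value shown on the slide; others are not assumed here.
    params = urlencode({"appid": appid, "dimensions": dimensions})
    return f"{BOSS_BASE}/images/v1/{quote_plus(query)}?{params}"

print(image_search_url("slumdog millionaire", "xyz"))
```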

Page 30: BOSS Open Hack Day, Bangalore

http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire

Page 31: BOSS Open Hack Day, Bangalore

Search News

•  http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d
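
For news, the `age` parameter limits how old the returned stories may be. The following sketch only covers the day-based form shown on the slide (`15d`); whether other units exist is not assumed. The helper name, the `max_age_days` argument, and the `xyz` id are illustrative.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"

def news_search_url(query, appid, max_age_days=15):
    """Build a news search URL restricted to recent stories (illustrative helper)."""
    if max_age_days <= 0:
        raise ValueError("max_age_days must be positive")
    # The slide writes the age as a number of days followed by "d", e.g. "15d".
    params = urlencode({"appid": appid, "age": f"{max_age_days}d"})
    return f"{BOSS_BASE}/news/v1/{quote_plus(query)}?{params}"

print(news_search_url("slumdog millionaire", "xyz"))
```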

Page 32: BOSS Open Hack Day, Bangalore

http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d

Page 33: BOSS Open Hack Day, Bangalore

Movie Search Code Example
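
The code shown on the original slides did not survive in this transcript, so the following is an independent sketch of what a movie search could look like, not the presenters' example. It assumes that `format=json` is accepted and that results arrive under `ysearchresponse` / `resultset_web` with `title` and `url` fields; that layout, the helper names, and the sample payload are all assumptions to be checked against the BOSS documentation.

```python
import json
from urllib.parse import quote_plus, urlencode
from urllib.request import urlopen

BOSS_BASE = "http://boss.yahooapis.com/ysearch"
MOVIE_SITES = ["indiatimes.com", "movies.yahoo.com", "imdb.com"]

def movie_search_url(title, appid):
    """Web search for a movie title, restricted to a few movie sites."""
    params = urlencode({"appid": appid, "sites": ",".join(MOVIE_SITES),
                        "format": "json"})
    return f"{BOSS_BASE}/web/v1/{quote_plus(title)}?{params}"

def extract_results(payload):
    """Pull (title, url) pairs out of a decoded response.

    ASSUMED layout: {"ysearchresponse": {"resultset_web": [{...}, ...]}}
    """
    results = payload.get("ysearchresponse", {}).get("resultset_web", [])
    return [(r.get("title", ""), r.get("url", "")) for r in results]

def search_movies(title, appid):
    """Fetch and parse results; needs network access, so it is not called here."""
    with urlopen(movie_search_url(title, appid)) as response:
        return extract_results(json.load(response))

# Made-up payload in the assumed shape, used to exercise the parser offline.
SAMPLE = {"ysearchresponse": {"resultset_web": [
    {"title": "Slumdog Millionaire", "url": "http://example.com/slumdog"},
]}}

for title, url in extract_results(SAMPLE):
    print(f"{title}: {url}")
```

Keeping URL construction, fetching, and parsing in separate functions makes it possible to test the parsing step without an application id.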

Page 36: BOSS Open Hack Day, Bangalore

http://www.saurabhsahni.com/boss-examples.zip

Page 37: BOSS Open Hack Day, Bangalore

More with BOSS API

Page 38: BOSS Open Hack Day, Bangalore

Related keywords

•  Add parameter view=keyterms
   –  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml
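
Requesting related keywords means adding `view=keyterms` to an ordinary web search. A small sketch, with a hypothetical helper name and the slide's placeholder id `xyz`:

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"

def keyterms_url(query, appid, fmt="xml"):
    """Build a web search URL that also asks for key terms (illustrative helper)."""
    params = urlencode({"appid": appid, "view": "keyterms", "format": fmt})
    return f"{BOSS_BASE}/web/v1/{quote_plus(query)}?{params}"

print(keyterms_url("slumdog millionaire", "xyz"))
```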

Page 39: BOSS Open Hack Day, Bangalore

http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml

Page 40: BOSS Open Hack Day, Bangalore

Semantic Data

•  Access structured data acquired through SearchMonkey

Page 41: BOSS Open Hack Day, Bangalore

Semantic Data

view=searchmonkey_feed view=searchmonkey_rdf

http://developer.yahoo.com/search/boss/stuctureddata.html
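
The two `view` values above select which form of SearchMonkey data is returned. One way to guard against typos in the view name is sketched below; the helper name is hypothetical, and only the two values shown on the slide are accepted.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"
SEARCHMONKEY_VIEWS = {"searchmonkey_feed", "searchmonkey_rdf"}

def semantic_search_url(query, appid, view="searchmonkey_feed", fmt="xml"):
    """Build a web search URL requesting SearchMonkey data (illustrative helper)."""
    if view not in SEARCHMONKEY_VIEWS:
        raise ValueError(f"view must be one of {sorted(SEARCHMONKEY_VIEWS)}")
    params = urlencode({"appid": appid, "view": view, "format": fmt})
    return f"{BOSS_BASE}/web/v1/{quote_plus(query)}?{params}"

print(semantic_search_url("slumdog millionaire", "xyz"))
print(semantic_search_url("slumdog millionaire", "xyz", view="searchmonkey_rdf"))
```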

Page 42: BOSS Open Hack Day, Bangalore

http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=searchmonkey_feed&format=xml

Page 43: BOSS Open Hack Day, Bangalore

Long abstracts

•  Add parameter abstract=long
   –  get up to 300 characters instead of 130
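
Long abstracts can be switched on with a simple flag. In this sketch the `long_abstract` argument and the helper name are made up for illustration; only the `abstract=long` parameter itself comes from the slide.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"

def web_search_url(query, appid, long_abstract=False):
    """Build a web search URL, optionally asking for long abstracts."""
    params = {"appid": appid}
    if long_abstract:
        # Per the slide: up to 300 characters instead of 130.
        params["abstract"] = "long"
    return f"{BOSS_BASE}/web/v1/{quote_plus(query)}?{urlencode(params)}"

print(web_search_url("slumdog millionaire", "xyz", long_abstract=True))
```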

Page 44: BOSS Open Hack Day, Bangalore

Spell Check

http://boss.yahooapis.com/ysearch/spelling/v1/milionare?format=xml

Response
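
The spelling vertical takes the possibly misspelled term in the URL path. A minimal sketch follows; it adds `appid`, which the slide's URL omits but the REST interface slide marks as required, and uses a hypothetical helper name.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"

def spelling_url(term, appid, fmt="xml"):
    """Build a spelling suggestion URL for a single term (illustrative helper)."""
    params = urlencode({"appid": appid, "format": fmt})
    return f"{BOSS_BASE}/spelling/v1/{quote_plus(term)}?{params}"

print(spelling_url("milionare", "xyz"))
```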

Page 45: BOSS Open Hack Day, Bangalore

BOSS Search API REST Interface

•  {query}: term to look for (url-encoded)
•  {vert} := {web, news, images, spelling}
•  @ required
   –  appid
•  @ optional
   –  start, count, lang, region, format, callback, sites, view

http://boss.yahooapis.com/ysearch/{vert}/v1/{query}
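
The template above generalizes into one URL builder that checks the vertical and the optional parameter names before encoding them. This is a sketch under the assumption that the slide's parameter list is the full set to allow; vertical-specific parameters seen earlier (`age`, `dimensions`, `abstract`) are deliberately left out. The names `boss_url` and `page_urls` are hypothetical.

```python
from urllib.parse import quote_plus, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"
VERTICALS = {"web", "news", "images", "spelling"}
OPTIONAL_PARAMS = {"start", "count", "lang", "region", "format",
                   "callback", "sites", "view"}

def boss_url(vert, query, appid, **optional):
    """Fill in http://boss.yahooapis.com/ysearch/{vert}/v1/{query}."""
    if vert not in VERTICALS:
        raise ValueError(f"unknown vertical: {vert!r}")
    unknown = set(optional) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    params = {"appid": appid, **optional}   # appid is the one required parameter
    return f"{BOSS_BASE}/{vert}/v1/{quote_plus(query)}?{urlencode(params)}"

def page_urls(vert, query, appid, pages, count=10):
    """Use start/count to produce one URL per page of results."""
    return [boss_url(vert, query, appid, start=i * count, count=count)
            for i in range(pages)]

print(boss_url("web", "slumdog millionaire", "xyz", format="xml"))
for url in page_urls("news", "slumdog millionaire", "xyz", pages=2):
    print(url)
```

Rejecting unknown names early is a design choice for catching typos on the client side; a looser builder could simply pass every keyword through.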

Page 46: BOSS Open Hack Day, Bangalore

Site Explorer

•  Get page inlinks
   –  http://boss.yahooapis.com/ysearch/se_inlink/v1/{URL}?appid={APPID}
•  Page data: collection of subpages in a domain
   –  http://boss.yahooapis.com/ysearch/se_pagedata/v1/{URL}?appid={APPID}
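
Both Site Explorer calls place a target URL inside the request path. The sketch below percent-encodes that target in full so its own `:` and `/` characters cannot be confused with the request path; whether the service expects the target encoded this way is an assumption, and the helper name is hypothetical.

```python
from urllib.parse import quote, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"
SITE_EXPLORER_SERVICES = {"se_inlink", "se_pagedata"}

def site_explorer_url(service, target_url, appid):
    """Build an inlink or page-data URL for target_url (illustrative helper)."""
    if service not in SITE_EXPLORER_SERVICES:
        raise ValueError(f"service must be one of {sorted(SITE_EXPLORER_SERVICES)}")
    # ASSUMPTION: the target URL is fully percent-encoded (safe="" encodes "/" too).
    encoded_target = quote(target_url, safe="")
    return f"{BOSS_BASE}/{service}/v1/{encoded_target}?{urlencode({'appid': appid})}"

print(site_explorer_url("se_inlink", "http://www.yahoo.com", "xyz"))
print(site_explorer_url("se_pagedata", "http://www.yahoo.com", "xyz"))
```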

Page 47: BOSS Open Hack Day, Bangalore

BOSS Mashup Framework

•  Python (v2.5+) library

•  BOSS Search SDK plus …

•  SQL for remixing arbitrary XML/JSON sources

http://developer.yahoo.com/search/boss/mashup.html

Page 48: BOSS Open Hack Day, Bangalore

BMF + Google App Engine

•  Enhanced version of BMF, ported to the Google App Engine (GAE) platform

•  http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/

•  Enables quick deployment of BOSS applications online

Page 49: BOSS Open Hack Day, Bangalore

More BOSS Implementations

•  http://mashable.com/boss/
•  http://delicious.com/tag/bossmashup
•  Add yours by tagging it with “bossmashup” on Del.icio.us!

Page 50: BOSS Open Hack Day, Bangalore

One more thing…

Page 51: BOSS Open Hack Day, Bangalore

BOSS Custom

Usage

50B pages * 20ms page download = 31 years

[Diagram: the search technology stack (Crawl, Extract, Spam <-> Gold, Analyze, Index, Retrieve, Rank Assist, Web Map) with "WEB API" and "Your App here"]

Page 52: BOSS Open Hack Day, Bangalore

Questions?

Thank You

More: http://developer.yahoo.com/search/boss/

Slides: http://www.slideshare.net/saurabhsahni/

Page 53: BOSS Open Hack Day, Bangalore

Appendix

Page 54: BOSS Open Hack Day, Bangalore

http://www.yahoo.com

Search UI Templates are Included in the BOSS Mashup Framework

BOSS Mashup Framework simplifies aggregating and presenting multiple data sources

Page 55: BOSS Open Hack Day, Bangalore

BMF Features

•  select, group, sort, union, joins, udfs, where
•  Text normalization and duplicate removal
•  Auto-transformation of resource-oriented API results into tables w/o parsing
•  All-in-memory storage and retrieval operations
•  Ability to join lists of tables via an arbitrary predicate function (map-like)
•  Search UI template framework
•  Single search function provides total access to BOSS REST API
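
To make a few of these features concrete, here is a plain-Python sketch of text normalization, duplicate removal, and a join driven by an arbitrary predicate function. It does not use the BOSS Mashup Framework or reproduce its API; the function names and the sample rows are invented for illustration.

```python
import re

def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())

def dedupe(rows, key):
    """Keep the first row for each normalized value of rows[key]."""
    seen, unique = set(), []
    for row in rows:
        fingerprint = normalize(row[key])
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique

def join(left, right, predicate):
    """Merge every (l, r) pair of rows for which predicate(l, r) is true."""
    return [{**l, **r} for l in left for r in right if predicate(l, r)]

# Invented in-memory "tables" standing in for two result sources.
web = [
    {"title": "Slumdog Millionaire", "url": "http://example.com/a"},
    {"title": "slumdog  millionaire!", "url": "http://example.com/b"},
    {"title": "Danny Boyle", "url": "http://example.com/c"},
]
news = [{"headline": "Slumdog Millionaire wins awards", "source": "Example News"}]

unique_web = dedupe(web, "title")
joined = join(unique_web, news,
              lambda w, n: normalize(w["title"]) in normalize(n["headline"]))

for row in joined:
    print(row["title"], "|", row["headline"], "|", row["url"])
```

Everything here stays in memory as lists of dictionaries, which mirrors the "all-in-memory" point in the list above at a very small scale.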

Page 56: BOSS Open Hack Day, Bangalore

BOSS in Academic Research

•  The biggest dataset available on the web
•  Very useful for Web-mining research experiments
   –  Natural language processing
   –  Semantic extraction
   –  Related keywords
   –  Similarity detection
   –  Clustering algorithms
   –  Spelling corrections