DZone Search Patterns Webinar - Amazon CloudSearch

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified or distributed in whole or in part without the express consent of Amazon.com, Inc. Search Patterns Jon Handler, Amazon CloudSearch Solution Architect

Upload: mbohlig

Post on 28-Nov-2015


DESCRIPTION

This webinar provides patterns for integrating cloud-based search with a variety of applications. The examples use Amazon CloudSearch, which abstracts away the complexity of deploying and administering your own search servers, but the principles apply to other search systems as well.

TRANSCRIPT


Search Patterns

Jon Handler, Amazon CloudSearch Solution Architect


Agenda

• Amazon CloudSearch Basics
• Searching in the Cloud
• Ranking
• Location-Based Search
• Faceting
• Mixed Data Sources
• Performance


Patterns

• Title-Body Search
• Social Search Patterns
• Mobile Search Patterns
• eCommerce Patterns


AMAZON CLOUDSEARCH


Search, In The Cloud


The Cloud is Elastic

[Diagram: a grid of search instances, each holding one index partition (1 to n) and one copy (1 to n). The axes are labeled DATA (document quantity and size) and TRAFFIC (search request volume and complexity).]


SEARCHING IN THE CLOUD


CloudSearch Batches

    { "type": "add",
      "id": "tt0076759",
      "fields": {
        "title": "Star Wars",
        "director": "Lucas, George",
        "year": 1977,
        "genre": ["Action", "Adventure", "Fantasy", "Sci-Fi"],
        "actor": ["Ford, Harrison", "Fisher, Carrie", "Hamill, Mark",
                  "Jones, James Earl", "Guinness, Alec", ...]
      }
    },
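A minimal sketch of assembling such a batch in Python. The helper names `make_add_op` and `make_batch` are hypothetical, and each operation carries only the keys shown on the slide; the API version you target may require additional ones.

```python
import json

def make_add_op(doc_id, fields):
    """Return one "add" operation shaped like the slide's example."""
    return {"type": "add", "id": doc_id, "fields": fields}

def make_batch(records):
    """Serialize (id, fields) pairs into a JSON array of add operations."""
    return json.dumps([make_add_op(doc_id, fields) for doc_id, fields in records])

batch = make_batch([
    ("tt0076759", {"title": "Star Wars", "director": "Lucas, George", "year": 1977}),
])
```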


Bootstrapping Data

[Diagram components: Source System, Processing Script, Queuing, Batching, Amazon EC2 (twice), Amazon SQS, Amazon CloudSearch.]
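The batching stage can be sketched as a function that groups queued operations so each batch stays under a size limit. The function name and the `max_bytes` parameter are assumptions for illustration; take the real limit from the service documentation.

```python
import json

def batch_operations(operations, max_bytes):
    """Group operations so each batch's JSON encoding stays within max_bytes.

    A single operation larger than max_bytes still gets a batch of its own.
    """
    batches, current, size = [], [], 2  # 2 bytes for the enclosing "[]"
    for op in operations:
        # +2 covers the ", " separator json.dumps puts between items.
        op_size = len(json.dumps(op).encode("utf-8")) + 2
        if current and size + op_size > max_bytes:
            batches.append(current)
            current, size = [], 2
        current.append(op)
        size += op_size
    if current:
        batches.append(current)
    return batches
```

The size estimate is deliberately conservative, so a batch may close slightly earlier than strictly necessary.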


Configuring for Search

• Text fields for individual word search
   • User-generated and external text: titles, descriptions
• Literal fields for exact matches
   • Application-generated text like facets
• Integer fields for range searching and ranking


Sending Queries

http(s)://<endpoint>/2011-02-01/search?

• Simple searches
   • q=<text>
• Filtering
   • bq=(and title:'iron man' genre:'Action')
• Filtering with integer ranges
   • bq=(and 'iron man' year:..2010)
• Geo filtering
   • bq=(and 'iron man' latitude:12700..12900 longitude:5700..5800)
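Because these expressions contain spaces, quotes, and parentheses, they need URL encoding before being sent. A small sketch using the standard library; the endpoint host is a placeholder and `search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

API_VERSION = "2011-02-01"  # version string shown on the slide

def search_url(endpoint, **params):
    """Build a search URL; endpoint is a placeholder host name."""
    return f"https://{endpoint}/{API_VERSION}/search?{urlencode(params)}"

url = search_url("search-example.invalid", bq="(and 'iron man' year:..2010)")
```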


Search Results

    { "rank": "-text_relevance",
      "match-expr": "(label 'star wars')",
      "hits": {
        "found": 7,
        "start": 0,
        "hit": [
          {"id": "tt1185834"},
          {"id": "tt0076759"},
          {"id": "tt0086190"},
          {"id": "tt0120915"},
          {"id": "tt0121765"},
          {"id": "tt0080684"},
          {"id": "tt0121766"}
        ]
      },
      ...
    }
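The response carries document ids, which the application then resolves against its own data store. A minimal sketch of pulling the ids out in ranked order, assuming the response shape shown on the slide; `hit_ids` is a hypothetical helper.

```python
import json

def hit_ids(response_text):
    """Return the document ids from a search response, in ranked order."""
    response = json.loads(response_text)
    return [hit["id"] for hit in response["hits"]["hit"]]

sample = '{"hits": {"found": 2, "start": 0, "hit": [{"id": "tt1185834"}, {"id": "tt0076759"}]}}'
ids = hit_ids(sample)
```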


Updating CloudSearch

[Diagram components: Users, Web Server, Update Processor, Amazon EC2 (twice), Amazon SQS, Amazon CloudSearch, and the data stores Amazon S3, DynamoDB, and Amazon RDS.]


BASIC RANKING


Customizing Ranking

• text_relevance and cs.text_relevance
• Rank expressions
   • Compute a score for each document
   • &rank=<function>
• Defined in the console
• Defined at query-time
   • &q='iron-man'&rank-recency=text_relevance + year &rank=recency
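A query-time rank expression is just two extra parameters: one that defines the expression and one that selects it. A sketch of encoding them, with `ranked_query` as a hypothetical helper name:

```python
from urllib.parse import urlencode

def ranked_query(q, name, expression):
    """Define a rank expression at query time and rank results by it."""
    return urlencode({"q": q, f"rank-{name}": expression, "rank": name})

qs = ranked_query("iron man", "recency", "text_relevance + year")
```

Note that the `+` inside the expression must be percent-encoded, otherwise it would be decoded as a space.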


Document Structure

Movie: title, description, user_rating, likes, release_date, latitude, longitude


Field Weighting

• Adjust relative importance of fields
• &rank-title_boost=cs.text_relevance({"weights":{"title":4.0}, "default_weight":1})


Popularity

• Convert floating point to integer
• Weight by the number of ranks
• rank-pop=(user-rating - 2) * log10(number-user-ranks) * 10 + metascore * 3
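To get a feel for how this expression behaves, it can be evaluated locally. The sketch below mirrors the slide's formula; the function and parameter names are illustrative, and rounding the result is only one reading of "convert floating point to integer".

```python
import math

def popularity(user_rating, number_user_ranks, metascore):
    """Evaluate the slide's popularity expression as an integer score."""
    score = (user_rating - 2) * math.log10(number_user_ranks) * 10 + metascore * 3
    return round(score)
```

The logarithm dampens the vote count, so a film with 1,000 ratings gets only three times the weight of one with 10 ratings, not a hundred times.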


Freshness

• Exponential decay function: $r = c\,e^{-\lambda t}$
• &rank-decay=200*Math.exp(-0.1*days_ago)
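Reading the slide's expression against the decay formula gives c = 200, λ = 0.1, and t = days_ago. A local sketch of the same curve, with those values as defaults:

```python
import math

def freshness(days_ago, c=200.0, lam=0.1):
    """Exponential decay r = c * e^(-lam * t), with t measured in days."""
    return c * math.exp(-lam * days_ago)
```

With these constants a document loses about 63% of its freshness score after 10 days, since e^-1 ≈ 0.37.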


Rank Expressions: Combined

• &rank-combined=1.0 * title + 0.5 * popularity + 0.3 * freshness
• &rank=combined
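A quick way to sanity-check the weights is to apply the same weighted sum to a few hand-picked documents. The component scores below are made up for illustration.

```python
def combined_score(title, popularity, freshness):
    """Weighted sum from the slide: title counts most, freshness least."""
    return 1.0 * title + 0.5 * popularity + 0.3 * freshness

# Hypothetical component scores for two documents.
docs = {
    "old_hit": combined_score(title=100, popularity=400, freshness=0),
    "new_release": combined_score(title=100, popularity=100, freshness=200),
}
best_first = sorted(docs, key=docs.get, reverse=True)
```

With these weights a popular older title still outranks a fresh but less popular one; raising the freshness weight would shift that balance.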


LOCATION-BASED SEARCH


Mobile Experience

[Screenshot: a mobile app with tabs Movies, Search, Social, Account, and Nearby. A search for "Iron Man" returns, each with a short synopsis: Iron Man (2008), Iron Man 2 (2010), Iron Man 3 (2013), and The Man With The Iron Fists (2012).]


Encoding Location

• Latitude and longitude expressed as integers

Movie: title, description, user_rating, likes, release_date, latitude, longitude
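The slides do not say how coordinates are converted to integers. One possible scheme, shown here purely as an assumption, shifts degrees into a non-negative range and scales by 100, which yields values of the same magnitude as the ranges used in the example queries.

```python
def encode_latitude(lat, scale=100):
    """Map [-90, 90] degrees to a non-negative integer (assumed scheme)."""
    return round((lat + 90.0) * scale)

def encode_longitude(lon, scale=100):
    """Map [-180, 180] degrees to a non-negative integer (assumed scheme)."""
    return round((lon + 180.0) * scale)
```

A larger `scale` gives finer resolution; at 100, one integer step is a hundredth of a degree.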


Bounding Box Search

• Latitude min/max
• Longitude min/max
• bq=(and 'theater' latitude:12700..12900 longitude:5700..5800)
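The bounding box can be derived from the user's position and a search half-width on each axis. A sketch with a hypothetical helper; all values are in the same integer units as the indexed fields, and the search term is assumed to contain no single quotes.

```python
def bounding_box_bq(term, lat, lon, lat_delta, lon_delta):
    """Build a bq clause matching term inside a box centered on (lat, lon)."""
    # Clamp at zero on the assumption that the fields hold non-negative integers.
    lat_min, lat_max = max(0, lat - lat_delta), lat + lat_delta
    lon_min, lon_max = max(0, lon - lon_delta), lon + lon_delta
    return (f"(and '{term}' latitude:{lat_min}..{lat_max} "
            f"longitude:{lon_min}..{lon_max})")

bq = bounding_box_bq("theater", 12800, 5750, 100, 50)
```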


Location Sort

• Cartesian distance function: $\sqrt{(lat - lat_{user})^2 + (lon - lon_{user})^2}$
• &rank-geo=sqrt(pow(latitude - lat, 2) + pow(longitude - lon, 2))
• &rank=-geo
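In practice the user's coordinates are substituted into the expression before the query is sent. The sketch below generates that expression and also computes the same distance locally, which is handy for checking result order. Both function names are hypothetical.

```python
import math

def geo_rank_expression(user_lat, user_lon):
    """Substitute the user's encoded position into the distance expression."""
    return (f"sqrt(pow(latitude - {user_lat}, 2) + "
            f"pow(longitude - {user_lon}, 2))")

def cartesian_distance(lat, lon, user_lat, user_lon):
    """The same distance computed locally, in encoded units."""
    return math.sqrt((lat - user_lat) ** 2 + (lon - user_lon) ** 2)
```

Cartesian distance treats a degree of longitude as equal to a degree of latitude, so it is an approximation that works best over short distances.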


FACETING


Facets


Simple Faceting: Document

Movie: title, description, genre


Simple Faceting: Configuration


Simple Faceting: Query

q=iron+man&facet=genre

    { "rank": "-text_relevance",
      "match-expr": "(label 'star wars')",
      "hits": {"found": 7, "start": 0, "hit": []},
      "facets": {
        "genre": {
          "constraints": [
            {"value": "Family", "count": 62},
            {"value": "Action/Adventure", "count": 21},
            {"value": "Drama", "count": 5},
            ...
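A minimal sketch of reading facet counts out of a parsed response, assuming the shape shown on the slide. `facet_counts` is a hypothetical helper that returns an empty list when the field has no facet data.

```python
def facet_counts(response, field):
    """Return (value, count) pairs for one facet field of a parsed response."""
    constraints = response.get("facets", {}).get(field, {}).get("constraints", [])
    return [(c["value"], c["count"]) for c in constraints]

sample = {"facets": {"genre": {"constraints": [
    {"value": "Family", "count": 62},
    {"value": "Drama", "count": 5},
]}}}
genres = facet_counts(sample, "genre")
```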


Simple Faceting: UI

    <div class='facet'>
      <ul class='facet_list'>
        <?php
          $genres = $resultsObj->facets->genre->constraints;
          for ($i = 0; $i < count($genres); $i++) {
            $curGenre = $genres[$i]->value;  // facet value, e.g. "Drama"
            $curCount = $genres[$i]->count;  // number of matching documents
        ?>
        <li class='facet_item'>
          <div class='facet_name'><?=$curGenre?></div>
          <div class='facet_count'><?=$curCount?></div>
        </li>
        <?php } ?>
      </ul>
    </div>


Document

• title: Lincoln
• description: ...
• oscar1: Awards
• oscar2: Awards/Best Actor
• oscar3: Awards/Best Actor/Daniel Day Lewis

Movie: title, description, oscar1, oscar2, oscar3
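Each level of the hierarchy gets its own field holding the full path down to that level. A sketch of generating those fields from a path at indexing time; `hierarchy_fields` is a hypothetical helper.

```python
def hierarchy_fields(prefix, path):
    """Expand a path such as ["Awards", "Best Actor"] into prefix1, prefix2, ..."""
    return {f"{prefix}{depth}": "/".join(path[:depth])
            for depth in range(1, len(path) + 1)}

fields = hierarchy_fields("oscar", ["Awards", "Best Actor", "Daniel Day Lewis"])
```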


Query

&q=lincoln&facet=oscar1,oscar2,oscar3

    { "rank": "-text_relevance",
      "hits": {...},
      "facets": {
        "oscar1": {
          "constraints": [
            {"value": "Awards", "count": 23},
            {"value": "Nominations", "count": 124}]},
        "oscar2": {
          "constraints": [
            {"value": "Awards/Best Actor", "count": 6},
            {"value": "Awards/Best Actress", "count": 3}...]},
        "oscar3": {
          "constraints": [
            {"value": "Awards/Best Actor/Daniel Day Lewis", "count": 1},
            {"value": "Awards/Best Actor/Denzel Washington", "count": 2}...]},


Drilldown

• bq=oscar1:'Awards'
• bq=oscar2:'Awards/Best Actor'
• bq=oscar3:'Awards/Best Actor/Daniel Day Lewis'
• bq=(and 'star' oscar2:'Awards/Best Actor')
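When a user clicks a facet value, the application filters on the field that matches the depth of the selected path. A sketch, with `drilldown_bq` as a hypothetical helper and values assumed to contain no single quotes:

```python
def drilldown_bq(prefix, path, term=None):
    """Build the bq filter for a selected facet path, optionally with a term."""
    value = "/".join(path)
    clause = f"{prefix}{len(path)}:'{value}'"
    return f"(and '{term}' {clause})" if term else clause
```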


MIXED DATA SOURCES


Document

Showtime: type, title, theater_name, city, latitude, longitude
Movie: type, title, description, user_rating, likes, release_date
Review: type, title, movie_name, author, url, body


Heterogeneous Data


Multi Domain


Trade-offs

• Multiple domain
   • Independent configuration
   • Independent scale
• Single domain
   • Simpler
   • Lower cost
   • bq=(and 'iron man' type:'movie')


TUNING


What to Track

• User queries
• Responses
• Response times
• Click positions


Tuning Relevance

• Return relevance values
• Check no-result queries
• Check most common results
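Checking no-result queries can start from the tracked query log. The sketch below assumes a hypothetical log of (query, found) pairs, where found is the hit count from the response, and tallies the queries that returned nothing.

```python
from collections import Counter

def no_result_queries(log):
    """Count queries whose response found zero hits."""
    return Counter(query for query, found in log if found == 0)

log = [("iron man", 4), ("irn man", 0), ("irn man", 0), ("stark", 0)]
worst = no_result_queries(log).most_common(1)
```

Frequent no-result queries often point to misspellings or missing synonyms.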


Tuning Performance

• Identify consistently slow queries
• Tend towards text matching
• Cache slow queries when possible
• Benchmark with JMeter or Siege
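Caching slow queries can be as simple as an in-process cache with a time-to-live. The sketch below is one possible shape, not a recommendation from the webinar; the class name, the `fetch` callback, and the injectable clock are all illustrative.

```python
import time

class QueryCache:
    """Cache search results per query string for ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # query -> (stored_at, result)

    def get_or_fetch(self, query, fetch):
        """Return a fresh cached result, or call fetch(query) and store it."""
        now = self.clock()
        cached = self.entries.get(query)
        if cached and now - cached[0] < self.ttl:
            return cached[1]
        result = fetch(query)
        self.entries[query] = (now, result)
        return result
```

The time-to-live bounds how stale a cached result can be, so it should be chosen with the index update frequency in mind.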


Q&A


Resources

• Amazon CloudSearch Overview Page: http://aws.amazon.com/cloudsearch/
   • Developer Guide
   • FAQs, Articles
   • Community Forum
   • Tutorial
• Free 30-day trial
• Contact: [email protected]