keyword generation for search engine advertising

18 December 2006 Amruta Joshi and Rajeev Motwani, Stanford University

Keyword Generation for Search Engine Advertising

Amruta Joshi*, Yahoo! Research

Rajeev Motwani, Stanford University

* This work was done at Stanford

Search Results
Sponsored Search Results

Long Tail

[Figure: long-tail curve; axes labeled Queries and Frequency in query-logs]

Expensive, high frequency keywords

Target inexpensive, low frequency keywords instead

Keyword Pricing

Pick the right keywords

Advantages:
- more focused audience
- less competition, easier to get the #1 position
- cost-effective alternative

Keywords should be:
- highly relevant to the base query
- non-obvious to guess from the base query

E.g.: hawaii vacation ($3) vs. kona holidays ($0.11)

Objective

To generate, with good precision and recall, a large number of keywords that are relevant to the input word, yet non-obvious in nature.

Who’s doing all this?

- Large advertisers
- SEO companies and small start-ups that manage advertising profiles,
  e.g. www.adchemy.com, www.wordtracker.com, http://www.globalpromoter.com

Eventually every advertiser is interested in optimizing his portfolio

Other Techniques …

Meta-tag Spidering: extract Keyword and Description tags from the top search hits.

Example of meta-tags for query ‘hawaii travel’:
- Relevant: hawaii travel, hawaii vacation, hawaiian islands, hawaii tourism
- Off-topic: hawaii homes, moving to hawaii, hawaii living, hawaii news, living in hawaii, hawaii products
- Irrelevant: sovereignty, volcanoes, sports, music
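A minimal sketch of the meta-tag extraction step, assuming the page source of a search hit has already been fetched. The class name `MetaTagExtractor`, the helper `extract_meta`, and the sample page are hypothetical; only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collects the content of <meta name="keywords"> and <meta name="description">."""

    def __init__(self):
        super().__init__()
        self.keywords = []
        self.description = ""

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        name = attr.get("name", "").lower()
        content = attr.get("content", "")
        if name == "keywords":
            # Keyword meta-tags are conventionally comma-separated.
            self.keywords.extend(
                kw.strip().lower() for kw in content.split(",") if kw.strip()
            )
        elif name == "description":
            self.description = content.strip()


def extract_meta(html_text):
    """Return (keywords, description) found in the given HTML text."""
    parser = MetaTagExtractor()
    parser.feed(html_text)
    return parser.keywords, parser.description


# Hypothetical page source standing in for one of the top search hits.
sample_page = """
<html><head>
<meta name="keywords" content="hawaii travel, hawaii vacation, Hawaii News">
<meta name="description" content="Plan your trip to the Hawaiian islands.">
</head><body></body></html>
"""
keywords, description = extract_meta(sample_page)
print(keywords)
print(description)
```

As the example list on the slide shows, raw meta-tags mix relevant, off-topic, and irrelevant terms, so the extracted keywords would still need filtering.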

Other Techniques …

Proximity-based tools: pick phrases in the proximity of the given word, e.g. family hawaii vacations, discount hawaii vacations.

Query-log mining: suggest popular queries containing the seed keywords.
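Query-log mining as described here can be sketched in a few lines. The function name `suggest_from_query_log`, the `top_n` parameter, and the sample log are illustrative assumptions, not part of the original work.

```python
from collections import Counter


def suggest_from_query_log(seed, query_log, top_n=10):
    """Return the most frequent logged queries that contain every seed word."""
    counts = Counter(q.strip().lower() for q in query_log)
    seed_words = seed.lower().split()
    matches = [
        (query, count)
        for query, count in counts.items()
        if all(word in query.split() for word in seed_words)
    ]
    # Most popular first; ties broken alphabetically for a stable order.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:top_n]]


# Hypothetical query log.
log = [
    "cheap flights",
    "flights to europe",
    "cheap flights",
    "hotel deals",
    "Flights to Europe",
    "cheap flights",
]
suggestions = suggest_from_query_log("flights", log)
print(suggestions)
```

Every suggestion produced this way contains the seed word, so the results are relevant but obvious.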

Other Techniques

Advertiser log mining or query co-occurrence based mining:
- exploits co-occurrence in advertiser keyword search logs
- increases competition!

Directed Relevance Relationships

Word A strongly suggests word B, but the reverse may not hold true.

A → B with weight x
B → A with weight y
x ≠ y

Example: the two directed edges between ‘railways’ and ‘eurail’ carry different weights, 25 and 2.

Building Context

Characteristic Document: build the context of a term using the terms found in the proximity of the seed term in the top 50 hits from a search engine for that term.

[Diagram: the term ‘europe’ goes to a search engine, producing its characteristic document C]
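One way to picture a characteristic document is as a bag of term counts gathered around each occurrence of the seed term. The sketch below assumes the texts of the top hits are already available as strings, handles single-word seeds only, and uses a hypothetical `window` parameter, since the slides only say "in the proximity". No stop-word removal is attempted.

```python
from collections import Counter
import re


def characteristic_document(seed, hit_texts, window=5):
    """Count terms that occur within `window` tokens of `seed` in the hit texts."""
    seed = seed.lower()
    counts = Counter()
    for text in hit_texts:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        for i, token in enumerate(tokens):
            if token != seed:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                # Skip the seed itself; count every other nearby term.
                if tokens[j] != seed:
                    counts[tokens[j]] += 1
    return counts


# Hypothetical snippets standing in for the top search hits.
hits = [
    "Travel across Europe by rail with a Eurail pass",
    "Europe maps and a road atlas of Europe",
]
doc = characteristic_document("europe", hits, window=3)
print(doc.most_common())
```

With `window=3`, "eurail" in the first snippet falls outside the window and is not counted, which shows how sensitive the context is to this assumed parameter.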

Building the Graph

TermsNet:
- Nodes = terms
- Edges = directed relevance relationships
- Weights = strength of the directed relationship, i.e., the frequency of the destination term in the characteristic document of the source term
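The graph definition above can be sketched as a nested dictionary. This assumes each term's characteristic document is already available as a term-frequency mapping; the function name `build_termsnet` and the frequencies in the example are invented for illustration.

```python
def build_termsnet(characteristic_docs):
    """Build {source: {destination: weight}} from characteristic documents.

    characteristic_docs maps each source term to a mapping of
    term -> frequency in that source term's characteristic document.
    """
    graph = {}
    for source, doc in characteristic_docs.items():
        graph[source] = {
            dest: freq
            for dest, freq in doc.items()
            if dest != source and freq > 0  # no self-loops, no empty edges
        }
    return graph


# Invented frequencies, for illustration only.
docs = {
    "eurail": {"railways": 12, "europe": 8},
    "railways": {"eurail": 3, "train": 9},
    "europe": {"eurail": 5, "railways": 4, "europe": 20},
}
termsnet = build_termsnet(docs)
print(termsnet["eurail"]["railways"], termsnet["railways"]["eurail"])
```

The two printed weights differ, which is the directed-relevance property: the edge in one direction need not match the edge in the other.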

TermsNet

[Figure: example TermsNet graph with nodes europe, railways, euro, eurail, maps, atlas and schengen, each with a characteristic document C, connected by weighted directed edges]

Ranking Suggestions

Quality Score:
- incorporates edge weights
- normalization for common words

$Q(x, q) = \frac{w_{x,q}}{1 + \log\left(1 + \sum_{i} w_{x,i}\right)}$

where each i is an out-neighbor of x, and w_{x,q} is the weight of the edge x → q.
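A small sketch of the quality score, under two assumptions the slide does not settle: the logarithm is taken as the natural log, and the candidate suggestions for q are the terms x that have an edge into q. The graph layout `{source: {destination: weight}}` and the weights are illustrative.

```python
import math


def quality(graph, x, q):
    """Q(x, q) = w_xq / (1 + log(1 + sum of x's outgoing edge weights))."""
    out_edges = graph.get(x, {})
    w_xq = out_edges.get(q, 0)
    total_out = sum(out_edges.values())
    return w_xq / (1 + math.log(1 + total_out))  # natural log is an assumption


def rank_suggestions(graph, q):
    """Rank every term x that has an edge x -> q by decreasing Q(x, q)."""
    scores = {x: quality(graph, x, q) for x, edges in graph.items() if q in edges}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative weights: both terms point to "railways" with weight 10,
# but "train" also points strongly to many other terms.
graph = {
    "eurail": {"railways": 10},
    "train": {"railways": 10, "station": 40, "ticket": 50},
}
ranked = rank_suggestions(graph, "railways")
print(ranked)
```

Both candidates have the same edge weight into "railways", but "train" is penalized for its large total outgoing weight, which is the normalization for common words at work.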

Ratings

Relevance
- Indicates the relevance of the suggested keyword to the seed word
- Given by human editors
- E.g., for query ‘flights’:
    Relevance(‘flights’, ‘cathay pacific’) = 1
    Relevance(‘flights’, ‘cheap flight’) = 1
    Relevance(‘flights’, ‘magazines’) = 0

Nonobviousness
- Indicates the nonobviousness of the suggested keyword relative to the seed word
- Calculated as: if no base query word/stem is present in the suggested keyword, Nonobviousness = 1, else 0
- E.g., for query ‘flights’:
    Nonobviousness(‘flights’, ‘cathay pacific’) = 1
    Nonobviousness(‘flights’, ‘cheap flight’) = 0
    Nonobviousness(‘flights’, ‘magazines’) = 1
- A standard Porter stemmer was used to automate this rating
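The nonobviousness rule can be sketched as a stem-overlap check. To stay self-contained, the code below uses `crude_stem`, a deliberately simple stand-in that only strips a trailing "s"; it is not the Porter stemmer, and a real implementation would swap one in.

```python
def crude_stem(word):
    """Rough stand-in for a Porter stemmer: strips a trailing plural 's' only."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def nonobviousness(base_query, suggestion):
    """Return 1 if no stem of the base query appears in the suggestion, else 0."""
    base_stems = {crude_stem(w) for w in base_query.split()}
    suggestion_stems = {crude_stem(w) for w in suggestion.split()}
    return 0 if base_stems & suggestion_stems else 1


for candidate in ["cathay pacific", "cheap flight", "magazines"]:
    print(candidate, nonobviousness("flights", candidate))
```

Note that ‘magazines’ scores 1 on nonobviousness even though its relevance is 0, so the two ratings have to be read together.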

Evaluation

Evaluation Measures

Average Precision: ratio of the number of relevant keywords retrieved to the number of keywords retrieved. Indicates the quality of results.

Average Recall: the proportion of relevant keywords that are retrieved, out of all relevant keywords available. For our experiments:

  Recall(Ti) = (# retrieved by Ti) / (# retrieved by T1 ∪ T2 ∪ … ∪ Tn)

Average Nonobviousness: average of all nonobviousness ratings of the suggested keywords.
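A sketch of precision and the pooled recall for a single query, under one reading of the recall formula: the counts are restricted to relevant keywords, so the denominator is the set of relevant keywords retrieved by any technique. The function names, technique labels, and editor judgments below are hypothetical.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved keywords that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


def pooled_recall(technique, results, relevant):
    """Relevant keywords found by one technique over those found by any technique."""
    pool = set().union(*results.values()) & set(relevant)
    if not pool:
        return 0.0
    return len(set(results[technique]) & pool) / len(pool)


# Hypothetical outputs of two techniques and hypothetical editor judgments.
results = {
    "T1": ["cheap flights", "airfare", "magazines"],
    "T2": ["airfare", "cathay pacific"],
}
relevant = {"cheap flights", "airfare", "cathay pacific"}

for name in results:
    print(name, precision(results[name], relevant), pooled_recall(name, results, relevant))
```

Averaging these per-query values over all test queries would give the average precision and average recall.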

Output for query ‘flights’

Co-occurrence Based:
Airfare, airfares, airlines, Cyprus, goa, flys, holidays, trains, aer, aeroflot, aeromexico, aircanada, alicante, bwia, heathrow, icelandair, bookings, Consolidator

Query Log:
Flights, cheap flights, airline flights, cheap airline flights, cheap international flights, flights to europe, business class flights, flights new york, australia flights, cheap flights to europe, cheap flights to orlando, cheap flights las vegas, track flights, flights florida, flights europe, las flights, cheap flights to australia

Meta-Tag Spidering:
real time flight arrivals, airfare, flights, flight map, delays, cruises, us flight arrivals, flight arrivals, state map, flight arrival, flight cancellations, arrival times, arrival delays, flight departure, vacation packages, street map

Meta-Crawler Lists:
air travel, airline discount tickets, airline fares, airline tickets, airline tickets under 100, american airlines, bargain flights, bmibaby, british airways, british airways flights, british airways home page, british airways timetable, british midland, budget airline

Query-log Mining:
flight, cheap flight, las vegas flight, flight tracker, flight to orlando, flight to london, flight to new york, airline flight, flight to los angeles, flight 93, flight to fort lauderdale, flight of the phoenix, flight to honolulu, flight to chicago, flight to miami

TermsNet:
cheap flights, airline flights, air newzealand, flight prices, bmibaby, globespan, low cost airlines, united airlines, airline-consolidators, charter flights, airfare, flight reservations, cathay pacific, british midland airways, discount airfare, flight tickets, jet2, travelocity

Avg. Precision, Recall, Nonobviousness

Method                    Avg. Precision   Avg. Recall   Avg. Nonobviousness
Query Co-occurrence       0.636364         0.196         1
Query-Log Mining          1                0.254         0
Meta-Tag Spidering        0.479675         0.118         0.559322
MetaCrawler Lists         0.94             0.094         0.744681
Query Logs with recency   1                0.201         0
TermsNet                  0.788043         0.58          0.913793

Evaluation Measures

F-measures: a measure of overall performance, computed as the harmonic mean of:
- F(PR): Avg. Precision and Avg. Recall
- F(RN): Avg. Recall and Avg. Nonobviousness
- F(PN): Avg. Precision and Avg. Nonobviousness
- F(PRN): Avg. Precision, Avg. Recall and Avg. Nonobviousness
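The four F-measures are plain harmonic means, sketched below. The input values are illustrative and not taken from the results; returning 0 when any component is 0 is an assumed convention, since the harmonic mean is otherwise undefined there.

```python
def harmonic_mean(values):
    """Harmonic mean of positive numbers; 0.0 if empty or any value is <= 0."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        return 0.0
    return len(values) / sum(1 / v for v in values)


def f_measures(precision, recall, nonobviousness):
    """Return F(PR), F(RN), F(PN) and F(PRN) for one technique."""
    return {
        "F(PR)": harmonic_mean([precision, recall]),
        "F(RN)": harmonic_mean([recall, nonobviousness]),
        "F(PN)": harmonic_mean([precision, nonobviousness]),
        "F(PRN)": harmonic_mean([precision, recall, nonobviousness]),
    }


# Illustrative averages for a hypothetical technique.
scores = f_measures(precision=0.8, recall=0.5, nonobviousness=0.9)
for name, value in scores.items():
    print(name, round(value, 3))
```

Because the harmonic mean is dominated by its smallest component, a technique with perfect precision but zero nonobviousness scores 0 on F(PN) and F(PRN).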

F-Measures

[Bar chart: F(PR), F(RN), F(PN) and F(PRN) for Query Co-occurrence, Query-Log Mining, Meta-Tag Spidering, MetaCrawler Lists, Query Logs with recency and TermsNet; vertical axis from 0 to 0.45]

Quality of Suggestions over different intervals of ranked results

[Line chart: Avg. Precision and Avg. Nonobviousness over the number of top suggestions; horizontal axis: top n keyword suggestions, 0 to 600; vertical axis from 0 to 1]

Figure 2: Quality of keywords over different ranked intervals

Future Directions

Incorporate keyword frequency in ranking suggestions

Incorporate keyword pricing information in ranking suggestions

Applications to other domains: find related movies, papers, people

Thank You!

Questions?
