Neural Networks for Information Retrieval: Semantic Matching (nn4ir.com/ecir2018/slides/03_SemanticMatching.pdf)



37

Outline

Morning program
- Preliminaries
- Semantic matching
- Learning to rank
- Entities

Afternoon program
- Modeling user behavior
- Generating responses
- Recommender systems
- Industry insights
- Q&A


38

Semantic matching

Definition: ”... conduct query/document analysis to represent the meanings of query/document with richer representations and then perform matching with the representations.” - Li et al. [2014]

A promising area within neural IR, due to the success of semantic representations in NLP and computer vision.


39

Outline

Morning program
- Preliminaries
- Semantic matching
  - Using pre-trained unsupervised representations for semantic matching
  - Learning unsupervised representations for semantic matching
  - Learning to match models
  - Learning to match using pseudo relevance
  - Toolkits
- Learning to rank
- Entities

Afternoon program
- Modeling user behavior
- Generating responses
- Recommender systems
- Industry insights
- Q&A


40

Semantic matching
Unsupervised semantic matching with pre-trained representations

Word embeddings have recently gained popularity for their ability to encode semantic and syntactic relations amongst words.

How can we use word embeddings for information retrieval tasks?


41

Semantic matching
Word embedding

Distributional Semantic Model (DSM): a model for associating words with vectors that can capture their meaning. DSMs rely on the distributional hypothesis.

Distributional Hypothesis: words that occur in the same contexts tend to have similar meanings [Harris, 1954].

Statistics on the observed contexts of words in a corpus are quantified to derive word vectors.

- The most common choice of context: the set of words that co-occur within a context window.
- Context-counting vs. context-predicting [Baroni et al., 2014].
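As a small illustration of the context-predicting flavour, the sketch below loads pre-trained word2vec-style vectors with gensim and queries them for distributionally similar words; the embedding file name is a placeholder, not part of the tutorial material.

```python
# Minimal sketch: querying a context-predicting DSM (word2vec-style vectors) with gensim.
# "embeddings.txt" is a placeholder for any model stored in word2vec text format.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("embeddings.txt", binary=False)

# Words that occur in similar contexts end up close in the embedding space.
print(vectors.most_similar("retrieval", topn=5))
print(vectors.similarity("query", "document"))
```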


42

Semantic matching

From word embeddings to query/document embeddings

Creating representations for compound units of text (e.g., documents) from representations of lexical units (e.g., words).


43

Semantic matching
From word embeddings to query/document embeddings

Obtaining representations of compound units of text (in comparison to the atomic words).

Bag of embedded words: sum or average of word vectors (a minimal sketch follows below).

- Averaging the word representations of query terms has been extensively explored in different settings [Vulic and Moens, 2015, Zamani and Croft, 2016b].
- Effective, but only for small units of text, e.g., queries [Mitra, 2015].
- Word embeddings can also be trained directly for the purpose of being averaged [Kenter et al., 2016].
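A minimal sketch of the bag-of-embedded-words idea, assuming `vectors` is any token-to-vector mapping (e.g., a gensim KeyedVectors object); names and tokenization are illustrative.

```python
# Sketch: represent a query or document by the average of its word vectors and
# match in the embedding space. The embedding source and tokenization are assumed.
import numpy as np

def average_embedding(tokens, vectors, dim=300):
    vecs = [vectors[t] for t in tokens if t in vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# score = cosine(average_embedding(query_tokens, vectors),
#                average_embedding(doc_tokens, vectors))
```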


44

Semantic matching
From word embeddings to query/document embeddings

- Skip-Thought Vectors
  - Conceptually similar to distributional semantics: a unit's representation is a function of its neighbouring units, except that units are sentences instead of words.
  - Similar to an auto-encoding objective: encode a sentence, but decode the neighbouring sentences.
  - Pair of LSTM-based seq2seq models with a shared encoder.

- Doc2vec (Paragraph2vec) [Le and Mikolov, 2014].

- You’ll hear more about it later in “Learning unsupervised representations from scratch”. (You might also want to take a look at Deep Learning for Semantic Composition.)


45

Semantic matching

Using similarity amongst documents, queries and terms.

Given low-dimensional representations, integrate their similarity signal within IR.


46

Semantic matching
Dual Embedding Space Model (DESM) [Nalisnick et al., 2016]

Word2vec optimizes the IN-OUT dot product, which captures the co-occurrence statistics of words in the training corpus. We can gain by using these two sets of embeddings differently:

- IN-IN and OUT-OUT cosine similarities are high for words that are similar by function or type (typical), while
- IN-OUT cosine similarities are high for words that often co-occur in the same query or document (topical).


47

Semantic matching
Pre-trained word embeddings for document retrieval and ranking

DESM [Nalisnick et al., 2016]: Using IN-OUT similarity to model document aboutness.

- A document is represented by the centroid of its word OUT vectors:

$$\vec{v}_{d,\text{OUT}} = \frac{1}{|d|} \sum_{t_d \in d} \frac{\vec{v}_{t_d,\text{OUT}}}{\|\vec{v}_{t_d,\text{OUT}}\|}$$

- Query-document similarity is the average cosine similarity over query words:

$$\text{DESM}_{\text{IN-OUT}}(q, d) = \frac{1}{|q|} \sum_{t_q \in q} \frac{\vec{v}_{t_q,\text{IN}}^{\top}\,\vec{v}_{d,\text{OUT}}}{\|\vec{v}_{t_q,\text{IN}}\|\;\|\vec{v}_{d,\text{OUT}}\|}$$

- IN-OUT captures a more topical notion of similarity than IN-IN and OUT-OUT.
- DESM is effective at, but only at, ranking documents that are already at least somewhat relevant.
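A rough sketch of IN-OUT scoring under these definitions is shown below; `in_vecs` and `out_vecs` are hypothetical term-to-vector mappings for the two word2vec embedding matrices, which most toolkits can export after training.

```python
# Sketch of DESM IN-OUT scoring: cosine between each query term's IN vector and the
# centroid of the document's normalized OUT vectors, averaged over query terms.
import numpy as np

def doc_centroid_out(doc_terms, out_vecs):
    vecs = [out_vecs[t] / np.linalg.norm(out_vecs[t]) for t in doc_terms if t in out_vecs]
    return np.mean(vecs, axis=0)

def desm_in_out(query_terms, doc_terms, in_vecs, out_vecs):
    d = doc_centroid_out(doc_terms, out_vecs)
    sims = [in_vecs[t] @ d / (np.linalg.norm(in_vecs[t]) * np.linalg.norm(d))
            for t in query_terms if t in in_vecs]
    return float(np.mean(sims)) if sims else 0.0
```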


48

Semantic matching
Pre-trained word embeddings for document retrieval and ranking

- NTLM [Zuccon et al., 2015]: Neural Translation Language Model
  - Translation Language Model: extending query likelihood:

$$p(d|q) \propto p(q|d)\,p(d)$$

$$p(q|d) = \prod_{t_q \in q} p(t_q|d)$$

$$p(t_q|d) = \sum_{t_d \in d} p(t_q|t_d)\,p(t_d|d)$$

  - Uses the similarity between term embeddings as a measure of the term-term translation probability $p(t_q|t_d)$:

$$p(t_q|t_d) = \frac{\cos(\vec{v}_{t_q}, \vec{v}_{t_d})}{\sum_{t \in V} \cos(\vec{v}_t, \vec{v}_{t_d})}$$
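Below is a small sketch of this embedding-based translation probability; `vectors` is a term-to-vector mapping and `vocab` the normalization vocabulary, both placeholders (in practice the normalization is often restricted to a smaller vocabulary for efficiency).

```python
# Sketch of NTLM's term-term translation probability from embedding cosines.
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def translation_prob(t_q, t_d, vectors, vocab):
    denom = sum(cos(vectors[t], vectors[t_d]) for t in vocab if t in vectors)
    return cos(vectors[t_q], vectors[t_d]) / denom

# p(t_q | d) then combines translation probabilities with the document language model:
# p_tq_d = sum(translation_prob(t_q, t_d, vectors, vocab) * p_lm(t_d, doc) for t_d in set(doc))
```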


49

Semantic matching
Pre-trained word embeddings for document retrieval and ranking

GLM [Ganguly et al., 2015]: Generalized Language Model
- Terms in a query are generated by sampling them independently from either the document or the collection.
- The noisy channel may transform (mutate) a term t into a term t'.

$$p(t_q|d) = \lambda\,p(t_q|d) + \alpha \sum_{t_d \in d} p(t_q, t_d|d)\,p(t_d) + \beta \sum_{t' \in N_t} p(t_q, t'|C)\,p(t') + (1 - \lambda - \alpha - \beta)\,p(t_q|C)$$

N_t is the set of nearest neighbours of term t.

$$p(t', t|d) = \frac{\operatorname{sim}(\vec{v}_{t'}, \vec{v}_{t}) \cdot \operatorname{tf}(t', d)}{\sum_{t_1 \in d} \sum_{t_2 \in d} \operatorname{sim}(\vec{v}_{t_1}, \vec{v}_{t_2}) \cdot |d|}$$


50

Semantic matching
Pre-trained word embeddings for query term weighting

Term re-weighting using word embeddings [Zheng and Callan, 2015]: learning to map query terms to query term weights.

- Construct the feature vector $\vec{x}_{t_q}$ for term $t_q$ using its embedding and the embeddings of the other terms in the same query q:

$$\vec{x}_{t_q} = \vec{v}_{t_q} - \frac{1}{|q|} \sum_{t'_q \in q} \vec{v}_{t'_q}$$

- $\vec{x}_{t_q}$ measures the semantic difference of a term to the whole query.
- Learn a model that maps the feature vectors to the defined target term weights.
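A minimal sketch of computing these feature vectors is given below; `vectors` is an assumed term-to-vector mapping, and the regression model that maps features to weights is left out.

```python
# Sketch: per-term feature vector = term embedding minus the query centroid.
import numpy as np

def term_features(query_terms, vectors):
    known = [t for t in query_terms if t in vectors]
    centroid = np.mean([vectors[t] for t in known], axis=0)
    return {t: vectors[t] - centroid for t in known}

# A separate regression model would then map each feature vector to a term weight.
```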


51

Semantic matching
Pre-trained word embeddings for query expansion

- Identify expansion terms using word2vec cosine similarity [Roy et al., 2016].
  - Pre-retrieval: take the nearest neighbours of the query terms as expansion terms (sketched after this list).
  - Post-retrieval: use a set of pseudo-relevant documents to restrict the search domain for the candidate expansion terms.
  - Pre-retrieval incremental: use an iterative process of reordering and pruning terms from the nearest-neighbour list; reorder the terms in decreasing order of similarity with the previously selected term.

- Works better than having no query expansion, but does not beat non-neural query expansion methods.
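The pre-retrieval variant can be sketched as below; `vectors` is a gensim KeyedVectors object and the weighting of the expansion terms is omitted.

```python
# Sketch of pre-retrieval expansion: add the word2vec nearest neighbours of each query term.
def expand_query(query_terms, vectors, k=5):
    expansion = set()
    for t in query_terms:
        if t in vectors:
            expansion.update(w for w, _ in vectors.most_similar(t, topn=k))
    return list(query_terms) + sorted(expansion - set(query_terms))

# expanded = expand_query(["neural", "retrieval"], vectors)
```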


52

Semantic matching
Pre-trained word embeddings for query expansion

- Embedding-based Query Expansion [Zamani and Croft, 2016a]. Main goal: estimating a better language model for the query using embeddings.
- Embedding-based Relevance Model. Main goal: semantic similarity in addition to term matching for PRF.


53

Semantic matching
Pre-trained word embeddings for query expansion

Query expansion with locally-trained word embeddings [Diaz et al., 2016].

- Main idea: embeddings should be learned on topically-constrained corpora, instead of large topically-unconstrained corpora.
- Train word2vec on documents from a first round of retrieval.
- Provides fine-grained word sense disambiguation.
- A large number of embedding spaces can be cached in practice.


54

Outline

Morning program
- Preliminaries
- Semantic matching
  - Using pre-trained unsupervised representations for semantic matching
  - Learning unsupervised representations for semantic matching
  - Learning to match models
  - Learning to match using pseudo relevance
  - Toolkits
- Learning to rank
- Entities

Afternoon program
- Modeling user behavior
- Generating responses
- Recommender systems
- Industry insights
- Q&A


55

Semantic matching
Learning unsupervised representations for semantic matching

Pre-trained word embeddings can be used to obtain

- a query/document representation through compositionality, or
- a similarity signal to integrate within IR frameworks.

Can we learn unsupervised query/document representations directly for IR tasks?


56

Semantic matching
LSI, pLSI and LDA

History of latent document representations: latent representations of documents that are learned from scratch have been around since the early 1990s.

- Latent Semantic Indexing [Deerwester et al., 1990],
- Probabilistic Latent Semantic Indexing [Hofmann, 1999], and
- Latent Dirichlet Allocation [Blei et al., 2003].

These representations provide a semantic matching signal that is complementary to a lexical matching signal.


57

Semantic matching
Semantic Hashing

Salakhutdinov and Hinton [2009] propose Semantic Hashing for document similarity.

- Auto-encoder trained on frequency vectors.
- Documents are mapped to memory addresses in such a way that semantically similar documents are located at nearby bit addresses.
- Documents similar to a query document can then be found by accessing addresses that differ by only a few bits from the query document's address.

Schematic representation of Semantic Hashing. Taken from Salakhutdinov and Hinton [2009].


58

Semantic matching
Distributed Representations of Documents [Le and Mikolov, 2014]

- Learn document representations based on the words contained within each document.
- Reported to work well on a document similarity task.
- Attempts have been made to integrate the learned representations into standard retrieval models [Ai et al., 2016a,b].

Overview of the Distributed Memory document vector model. Taken from Le and Mikolov [2014].
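As a concrete illustration, the sketch below trains a doc2vec model with gensim on a toy corpus and infers a vector for an unseen query; the corpus and hyperparameters are placeholders (in older gensim versions the document vectors live under `model.docvecs` instead of `model.dv`).

```python
# Sketch: doc2vec (Distributed Memory) with gensim, plus query inference.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [
    TaggedDocument(words=["neural", "networks", "for", "ir"], tags=["d1"]),
    TaggedDocument(words=["semantic", "matching", "of", "queries"], tags=["d2"]),
]
model = Doc2Vec(corpus, vector_size=100, window=5, min_count=1, dm=1, epochs=40)

# Infer a vector for an unseen "query" and rank documents by similarity to it.
query_vec = model.infer_vector(["semantic", "matching"])
print(model.dv.most_similar([query_vec], topn=2))
```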


59

Semantic matching
Two Doc2Vec Architectures [Le and Mikolov, 2014]

Overview of the Distributed Memory document vector model. Taken from Le and Mikolov [2014].

Overview of the Distributed Bag of Words document vector model. Taken from Le and Mikolov [2014].


60

Semantic matching
Neural Vector Spaces for Unsupervised IR [Van Gysel et al., 2018]

- Learns query (term) and document representations directly from the document collection.
- Outperforms existing latent vector space models and provides a semantic matching signal complementary to lexical retrieval models.
- Learns a notion of term specificity.
- Luhn significance: mid-frequency words are more important for retrieval than infrequent and frequent words.

Relation between the L2-norm of a query term representation within NVSM and its collection frequency. Taken from [Van Gysel et al., 2018].


61

Outline

Morning program
- Preliminaries
- Semantic matching
  - Using pre-trained unsupervised representations for semantic matching
  - Learning unsupervised representations for semantic matching
  - Learning to match models
  - Learning to match using pseudo relevance
  - Toolkits
- Learning to rank
- Entities

Afternoon program
- Modeling user behavior
- Generating responses
- Recommender systems
- Industry insights
- Q&A


62

Semantic matching
Text matching as a supervised objective

Text matching is often formulated as a supervised objective where pairs of relevant or paraphrased texts are given.

In the next few slides, we’ll go over different architectures introduced for supervised text matching. Note that this is a mix of models originally introduced for (i) relevance ranking, (ii) paraphrase identification, and (iii) question answering, among others.


63

Semantic matching

Representation-based models

Representation-based models construct a fixed-dimensional vector representation for each text separately and then perform matching within the latent space.


64

Semantic matching
(C)DSSM [Huang et al., 2013, Shen et al., 2014]

- Siamese network between query and document, operating on character trigrams (the trigram hashing step is sketched below).
- Originally introduced for learning from implicit feedback.
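The character-trigram ("word hashing") input layer can be sketched as follows; the trigram vocabulary construction shown here is purely illustrative, and the neural towers on top are omitted.

```python
# Sketch of DSSM-style word hashing: map tokens to bag-of-character-trigram vectors.
from collections import Counter

def char_trigrams(token):
    padded = f"#{token}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def build_trigram_index(vocab_tokens):
    tris = sorted({tri for tok in vocab_tokens for tri in char_trigrams(tok)})
    return {tri: i for i, tri in enumerate(tris)}

def hash_text(tokens, trigram_index):
    counts = Counter(tri for tok in tokens for tri in char_trigrams(tok))
    vec = [0] * len(trigram_index)
    for tri, c in counts.items():
        if tri in trigram_index:
            vec[trigram_index[tri]] = c
    return vec  # sparse input to the query/document towers of the Siamese network
```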


65

Semantic matching
ARC-I [Hu et al., 2014]

- Similar to DSSM; performs 1D convolutions on the text representations separately.
- Originally introduced for a paraphrasing task.


66

Semantic matching

Interaction-based models

Interaction-based models compute the interaction between each individual term of both texts. An interaction can be identity or syntactic/semantic similarity.

The interaction matrix is subsequently summarized into a matching score.


67

Semantic matching
DRMM [Guo et al., 2016]

- Compute term/document interactions and matching histograms using different strategies (count, relative count, log-count); a histogram sketch follows below.
- Pass the histograms through a feed-forward network for every query term.
- A gating network produces an attention weight for every query term; the per-term scores are then aggregated into a relevance score using the attention weights.
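A rough sketch of the log-count matching histogram for a single query term is shown below; the number of bins and the similarity range are illustrative choices rather than the paper's exact configuration.

```python
# Sketch of a DRMM-style log-count matching histogram for one query term.
import numpy as np

def matching_histogram(q_vec, doc_vecs, n_bins=30):
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    q = q_vec / np.linalg.norm(q_vec)
    sims = d @ q                                   # cosine similarities in [-1, 1]
    counts, _ = np.histogram(sims, bins=n_bins, range=(-1.0, 1.0))
    return np.log1p(counts)                        # log-count variant

# One histogram per query term is fed into the per-term feed-forward network.
```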


68

Semantic matching
MatchPyramid [Pang et al., 2016]

- Interaction matrix between query/document terms, followed by convolutional layers.
- After the convolutions, feed-forward layers determine the matching score.
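The input to the convolutional stack can be sketched as a cosine-interaction matrix between the two term sequences; `q_vecs` and `d_vecs` are assumed embedding matrices and the CNN itself is omitted.

```python
# Sketch: MatchPyramid-style interaction "image" built from term embeddings.
import numpy as np

def interaction_matrix(q_vecs, d_vecs):
    q = q_vecs / np.linalg.norm(q_vecs, axis=1, keepdims=True)
    d = d_vecs / np.linalg.norm(d_vecs, axis=1, keepdims=True)
    return q @ d.T  # |q| x |d| matrix treated like an image by the conv layers
```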


69

Semantic matching
aNMM [Yang et al., 2016]

- Compute the word interaction matrix.
- Aggregate similarities by running multiple kernels.
- Every kernel assigns a different weight to a particular similarity range.
- Similarities are aggregated into the kernel output by weighting them according to which bin they fall in.


70

Semantic matching
Match-SRNN [Wan et al., 2016b]

- Word interaction layer, followed by a spatial recurrent NN.
- The RNN hidden state is updated using the current interaction coefficient and the hidden state of the prefix.


71

Semantic matching
K-NRM [Xiong et al., 2017b]

- Compute the word-interaction matrix and apply k kernels to every query term row in the interaction matrix (kernel pooling is sketched below).
- This results in a k-dimensional vector per query term.
- Aggregate the query term vectors into a fixed-dimensional query representation.
- Later extended to convolutional networks [Dai et al., 2018] (hybrid).
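Kernel pooling itself is compact enough to sketch; the kernel means and width below are illustrative rather than the paper's exact settings, and log1p is used only to avoid log(0).

```python
# Sketch of K-NRM kernel pooling over a |q| x |d| cosine-similarity matrix.
import numpy as np

def kernel_pooling(sim_matrix, mus=np.linspace(-0.9, 1.0, 11), sigma=0.1):
    feats = []
    for row in sim_matrix:                                            # one query term
        k = np.exp(-((row[:, None] - mus) ** 2) / (2 * sigma ** 2))   # |d| x K kernels
        feats.append(np.log1p(k.sum(axis=0)))                         # soft-TF per kernel
    return np.sum(feats, axis=0)                                      # K-dim query feature

# A learned linear layer maps this K-dimensional vector to the final ranking score.
```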


72

Semantic matching

Hybrid models

Hybrid models consist of (i) a representation component that combines a sequence of words (e.g., a whole text, a window of words) into a fixed-dimensional representation, and (ii) an interaction component.

These two components can occur (1) in serial or (2) in parallel.


73

Semantic matching
ARC-II [Hu et al., 2014]

- Cascade approach where word representations are generated from context.
- Interaction matrix between sliding windows, where the interaction activation is computed using a non-linear mapping.
- Originally introduced for a paraphrasing task.


74

Semantic matching
MV-LSTM [Wan et al., 2016a]

- Cascade approach where the input representations for the interaction matrix are generated using a bi-directional LSTM.
- Differs from pure interaction-based approaches in that the LSTM builds a representation of the context, rather than using the representation of a word.
- Obtains a fixed-dimensional representation by max-pooling over query/document, followed by a feed-forward network.


75

Semantic matching
Duet [Mitra et al., 2017]

- The model has an interaction-based and a representation-based component.
- The interaction-based component consists of an indicator matrix showing where query terms occur in the document, followed by convolution layers.
- The representation-based component is similar to DSSM/ARC-I, but uses a feed-forward network to compute the similarity signal rather than cosine similarity.
- Both are combined at the end using a linear combination of the scores.


76

Semantic matching
DeepRank [Pang et al., 2017]

- Focuses only on exact term occurrences in the document.
- Computes the interaction between the query and a window surrounding each term occurrence.
- An RNN or CNN then combines the per-window features (query representation, context representations and the interaction between query/document terms) into a matching score.


77

Outline

Morning program
- Preliminaries
- Semantic matching
  - Using pre-trained unsupervised representations for semantic matching
  - Learning unsupervised representations for semantic matching
  - Learning to match models
  - Learning to match using pseudo relevance
  - Toolkits
- Learning to rank
- Entities

Afternoon program
- Modeling user behavior
- Generating responses
- Recommender systems
- Industry insights
- Q&A


78

Semantic matching
Beyond supervised signals: semi-supervised learning

The architectures we presented for learning to match all require labels. Typically, these labels are obtained from domain experts.

However, in information retrieval there is the concept of pseudo relevance, which gives us a supervision signal derived from unlabeled data collections.


79

Semantic matching
Pseudo test/training collections

Given a source of pseudo relevance, we can build pseudo collections for training retrieval models [Asadi et al., 2011, Berendsen et al., 2013].

Sources of pseudo relevance: typically given by external knowledge about the retrieval domain, such as hyperlinks, query logs, social tags, ...


80

Semantic matching
Training neural networks using pseudo relevance

Training a neural ranker using weak supervision [Dehghani et al., 2017].

Main idea: annotate a large amount of unlabeled data using a weak annotator (pseudo-labeling) and design a model that can be trained on this weak supervision signal.

- Function approximation (re-inventing BM25?).
- Beating BM25 using BM25!
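A minimal sketch of the weak-annotation step is shown below, using BM25 scores from the rank_bm25 package as pseudo-labels; the corpus, query, and cutoff are placeholders, and the neural ranker trained on these targets is omitted.

```python
# Sketch: generate weak supervision targets for a neural ranker from BM25.
from rank_bm25 import BM25Okapi

corpus = ["neural networks for ir", "semantic matching of queries"]  # placeholder docs
bm25 = BM25Okapi([doc.split() for doc in corpus])

def weak_labels(query, top_k=10):
    scores = bm25.get_scores(query.split())
    # (doc_id, bm25_score) pairs act as pseudo-labels / target scores for training.
    return sorted(enumerate(scores), key=lambda x: -x[1])[:top_k]

print(weak_labels("semantic matching"))
```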


81

Semantic matching
Training neural networks using pseudo relevance

Generating weak supervision training data for training a neural IR model [MacAvaney et al., 2017].

- Use a news corpus with article headlines acting as pseudo-queries and article content as pseudo-documents.
- Problems:
  - Hard negatives.
  - Mismatched interactions (example: “When Bird Flies In”, a sports article about basketball player Larry Bird).
- Solutions:
  - Ranking filter: top-ranked pseudo-documents are considered as negative samples; only pseudo-queries that are able to retrieve their pseudo-relevant documents are used as positive samples.
  - Interaction filter: build interaction embeddings for each pair and filter out pairs based on their similarity to the template query-document pairs.


82

Semantic matching
Query expansion using neural word embeddings based on pseudo relevance

Locally trained word embeddings [Diaz et al., 2016]

- Perform topic-specific training on a set of topic-specific documents that are collected based on their relevance to a query.

Relevance-based Word Embedding [Zamani and Croft, 2017].

- Relevance is not necessarily the same as semantic or syntactic similarity:
  - e.g., “united state” as expansion terms for “Indian American museum”.
- Main idea: defining the “context”. Use the relevance model distribution for the given query to define the context, so the objective is to predict the words observed in the documents relevant to a particular information need.
- The neural network is constrained by the weights given by RM3 when learning the word embeddings.


83

Outline

Morning program
- Preliminaries
- Semantic matching
  - Using pre-trained unsupervised representations for semantic matching
  - Learning unsupervised representations for semantic matching
  - Learning to match models
  - Learning to match using pseudo relevance
  - Toolkits
- Learning to rank
- Entities

Afternoon program
- Modeling user behavior
- Generating responses
- Recommender systems
- Industry insights
- Q&A


84

Semantic matching
Document & entity representation learning toolkits

gensim : https://github.com/RaRe-Technologies/gensim [Rehurek and Sojka, 2010]

SERT : http://www.github.com/cvangysel/SERT [Van Gysel et al., 2017a]

cuNVSM : http://www.github.com/cvangysel/cuNVSM [Van Gysel et al., 2018]

HEM : https://ciir.cs.umass.edu/downloads/HEM [Ai et al., 2017]

MatchZoo : https://github.com/faneshion/MatchZoo [Fan et al., 2017]

K-NRM : https://github.com/AdeDZY/K-NRM [Xiong et al., 2017b]