
MASTER THESIS INFORMATION SCIENCES

Delivering Content: The Right Content At the Right Time and at the Right Place to the Right Person

Author: Rikko Filiano

Supervisor: Prof. dr. ir. Arjen P. de Vries

Second Assessor: Prof. dr. ir. Theo P. van der Weide

March 16, 2017


Abstract

The number of User Generated Content providers and media contenders in Indonesia has been growing fast. Consequently, Keepo.me, as one of Indonesia's User Generated Content providers, needs to overcome several challenges in order to, at the very least, hold its ground in this high-risk competition. One of these challenges is to improve User Engagement. Therefore, this master thesis was carried out to investigate whether incorporating Recommender Systems, and contextual information within those Recommender Systems, as organic circulation can improve User Engagement on Keepo.me.

This research has analysed the performance of two different Recommender Systems: a Content-based Filtering Recommender System and a Contextual User Modeling Recommender System. In the Content-based Filtering approach, recommendations are made by calculating the similarity between the user's interest profile and the items (in our case, the content articles). In Contextual User Modeling, recommendations are made to the user based on their contextual information. Topic Modeling is used in the Contextual User Modeling technique to generate topics or keywords from the content article collections in the Keepo.me datasets.

We carried out an empirical evaluation using Multileaving, in combination with a discussion of observed User Engagement metrics from Google Analytics. Based on this discussion, we argue that implementing a Recommender System as organic circulation could increase the User Engagement metrics on Keepo.me. Although the results are not statistically significant, they show a promising improvement in User Engagement. Finally, considering the results for the Recommender Systems and User Engagement, we identify several challenges that need to be taken into account in further research.


Acknowledgements

First and foremost, I would like to express my earnest gratitude to my supervisor, Arjen P. de Vries, for the countless valuable guidance, knowledge, improvements, and remarks throughout this semester. I would also like to thank Theo P. van der Weide as my second assessor for the support along the way.

Many thanks also to Michael Rendy, Willianto Tobagus, Refa Dewangga, and Juliarto Wongosari from Keepo.me for supplying me with all of the resources and information needed for the completion of this master thesis. Without them this research would not have been possible.

This thesis is dedicated to my parents, who have always provided me with moral and emotional support in my life. My sincere gratitude to Shanti, Fefe, Tzu-Ling, Amin, Angga, and Hussam, who have supported me along the way. I will be forever grateful for their love and support.

Finally, I gratefully acknowledge the funding received towards my master degree programme from the EP-Nuffic Netherlands Fellowship Programme (NFP).


Contents

1 Introduction
  1.1 Role of Keepo.me
  1.2 Research Questions

2 Background Information
  2.1 Recommender System
    2.1.1 User Modeling
      2.1.1.1 Explicit Feedback
      2.1.1.2 Implicit Feedback
    2.1.2 Information Filtering
      2.1.2.1 Content-based Filtering
  2.2 Context-Aware Recommender System
    2.2.1 Definition of Context
    2.2.2 Contextual Information Modeling
    2.2.3 Context-Aware approaches in Recommender Systems
      2.2.3.1 Contextual preference elicitation and estimation
  2.3 Topic Modeling
    2.3.1 Latent Dirichlet Allocation
  2.4 Interleaving Evaluation
    2.4.1 Team-Draft Interleaving
    2.4.2 Team-Draft Multileaving

3 Problem Definition

4 Experiment
  4.1 Data models
  4.2 Experiment strategy
    4.2.1 Baseline
    4.2.2 Content-based filtering strategy
    4.2.3 Contextual user modeling strategy
      4.2.3.1 Topic Modeling using Mallet
    4.2.4 Team-draft multileaving strategy

5 Results
  5.1 Evaluation of recommender systems
  5.2 Evaluation of user engagement

6 Conclusion
  6.1 Discussion
    6.1.1 Recommender systems
    6.1.2 User engagement
  6.2 Lessons learned
  6.3 Final Remark

7 Future Work

A Content-based filtering strategy script
B Contextual user modeling strategy script
C Team-draft Multileaving script
D Team-draft Multileaving evaluation script
E Stopword list
F Team-draft Multileaving evaluation result


List of Figures

2.1 Methods for incorporating context in Recommender System
2.2 Documents exhibit multiple topics
2.3 Latent Dirichlet Allocation in graphical model
2.4 Example of combined documents retrieved
3.1 Overall user engagement comparison of 4 content providers
3.2 Monthly visit comparison of 4 content providers
3.3 Avg. session duration comparison of 4 content providers
3.4 Pages per session comparison of 4 content providers
3.5 Bounce rate comparison of 4 content providers
3.6 Page Depth of visited page(s)
3.7 Mobile vs Desktop vs Tablet
4.1 Overview of the experiment strategy
4.2 General view of the Contextual user modeling strategy
4.3 Example of the usage of the Interleaving toolkit
4.4 Screenshot of a combined list shown to the users
5.1 Click-through Rate (CTR) on the recommended items list
5.2 Team-draft Multileaving result of Baseline, Content-based Filtering, and Contextual User Modeling
5.3 F-Measure performance of Content-based Filtering and Contextual User Modeling
5.4 Computing time (sec) of Recommender Systems
5.5 Bounce Rate during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.6 Pages/Session during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.7 A/B Significance Test for Pages/Session. The calculation was computed with 384,332 visitors and 1.65 conversion data for Test Non-Experiment (control), and 341,808 visitors and 1.70 conversion data for Test Experiment (variant). Relative improvement of the conversion rate of variant (B) over control (A): 15.8%. The test result was not significant at the 95% significance level (1-sided), where p > 0.05 (p-value = 45%).
5.8 Avg. Session Duration during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.9 Page Depth during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.10 Conversion Rate during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.11 A/B Significance Test for Goal (read 3 pages) Completion. The calculation was computed with 384,332 visitors and 33,270 conversion data for Test Non-Experiment (control), and 341,808 visitors and 31,653 conversion data for Test Experiment (variant). Relative improvement of the conversion rate of variant (B) over control (A): 7%. The test result was significant at the 95% level (1-sided), where p < 0.05 (p-value = 0%).
5.12 Avg. Server Response Time during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)


List of Tables

4.1 Example of mixed recently published items and trending items for the baseline strategy
4.2 Top-5 popular items example extracted for Content-based Filtering
4.3 Cosine Similarity score example between the item "Selama 17 Tahun Fotografer Ini Melakukan Trik Keren Dengan Bermodal Suvenir Murah" and the top-5 popular items from Table 4.2
4.4 User A's history during the last 7 days
4.5 Filtered user A's history items with noon as time context and office as location context
4.6 Topics associated with filtered user A's history items
4.7 Sorted topics associated with filtered user A's history items
4.8 Recommended items from Content-based Filtering with topics
4.9 Post-filtering result shown to user A
4.10 Contextual attribute: Period of Day
4.11 Top-10 items extracted given the "Night" condition of contextual time and -6.999 latitude and 110.385 longitude as contextual location
4.12 List of topics of the retrieved items from Table 4.11
4.13 Filtered Content-based Filtering items with the correlated topics from Table 4.12
4.14 Example of top keywords of each topic from Mallet
4.15 Example of top topics (by percentage) of each document from Mallet. A higher percentage indicates the principal topics of the document.
4.16 Example of recorded users' clicks through the proxy
4.17 Example of the result of the evaluation using Team-draft multileaving, file "74869dd1c25aaef7bd6b61d661ee2a8d91c8cdf7"
4.18 Example of the counted result
5.1 One-way ANOVA sample data summary
5.2 One-way ANOVA result details
5.3 Tukey HSD test. Significance level at 0.05 = 3.861; significance level at 0.01 = 4.830. M1 is the mean of Baseline > CB = CUM, M2 is the mean of CB > Baseline = CUM, M3 is the mean of CUM > Baseline = CB, and M4 is the mean of Tie.
5.4 Detail of Bounce Rate during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.5 Detail of Pages/Session during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.6 Detail of Avg. Session Duration during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.7 Detail of Conversion Rate during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
5.8 Detail of Avg. Server Response Time during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)
F.1 Team-draft Multileaving evaluation result
F.2 Team-draft Multileaving evaluation result


Chapter 1

Introduction

The internet's unstoppable growth and its rapid movement nowadays push businesses of all sizes to become media or content providers. Businesses that want to stay in front of their online audiences need to develop content that will then be distributed over the internet, to keep the search engines recommending their pages and their consumers aware of their products. However, not all content is the same. Every piece of content should be delivered to specific audiences with a specific purpose. News content providers, the social components, and the advertising must be able to answer a crucial question: What is the ideal version of this story for this individual consumer, given what she is doing, what she is thinking, what she has been reading or watching recently, and how much time she has at this very moment?

Many content providers have been struggling to provide the content which consumers are actually looking for. Amy Webb wrote in her blog, "it is critical for content providers to reframe their thinking, as the device ecosystem grows more disparate and the volume of content continues to explode. Consumers will start to lose their appetite for the articles, listicles, quizzes that delivered so much traffic" [13].

To produce the content which consumers actually need, a content provider such as Keepo.me (http://keepo.me) needs to start asking questions about its content: How was it read? Under what circumstances? Where and when do you think it was read? Is the consumer at her home? Is the consumer at a new location? Is she at work? Or is she at the gym? Is she commuting? Is she most likely to read a long in-depth story? Or would she be happier just to get a few bullet-point articles, or videos in the articles, or a text-only article? Is this a story that her friends are probably talking about? Smart content providers know that getting consumers' attention is more valuable, and can make a greater impact, than depending on consumers' clicks. The effective way to harness consumers' awareness in content delivery environments is to provide a better service for consumers based on what they actually need.


The answer to this crucial question above is Context. Context is what differentiates content and makes it resonate with different people in the audience. This context is the key to deciding what to say and how to say it. For instance, how a consumer uses the product changes the way one would talk about that product with them. There are other contextual considerations to take into account: What other products does the group use? Are the individuals existing or new customers? These are just a few examples of questions that need to be answered in order to properly create and deliver content with the right context that will connect with the target audiences. Furthermore, all information activities take place within a context that affects the way people access information, interact with a retrieval system, and evaluate and make decisions about the retrieved documents [15].

Consider the process of recommending content to a certain consumer who is working late at night at her office. It can be considerably difficult to find the best content for this consumer even when we already know the context (working, late night, at the office). Choosing the right topics over the content in the dataset collections would be the best practice to recommend interesting content which she might like to read. Here, Topic Modeling comes in useful, as this technique can extract topics from the collections, and from these topic models we can choose the appropriate topics to recommend. For example, consider the same consumer: she usually reads content on politics and sports topics at night at her office. We can then recommend interesting content to her that is based on her interest profile and related to the sports and politics topics.

Knowing and understanding how to provide the right content to consumers within their context alone is insufficient. Many media contenders and/or content providers are encountering difficulties in how to actually deliver such content to their consumers. Many of them suffer from low conversion rates. Stuffing their users with an overload of information can push them to leave their services. If this continues to happen on their services, they might not be able to sustain themselves in the overgrown competition among content providing services. Keepo.me, for example, has been at a disadvantage, at a low level in the competition. While Keepo.me's business model and their target audience are right on track, they are not able to deliver the right content that their consumers actually need. This makes Keepo.me suffer from low conversion rates on their engaged users.

One of many ways to help media contenders and/or content providers improve their conversion rate, in particular for Keepo.me, is to provide a system that can recommend the right content to their consumers within their context. This is called "organic recirculation", which motivates and helps the users to stick around and stay on the site [20]. Recommending and delivering the right content, personalized based on the preferences and context of the users, could help to increase the number of engaged users. Along with an increase in User Engagement, this can then boost the Return On Investment (ROI).


1.1 Role of Keepo.me

Keepo.me is a digital news media provider in Indonesia, with a vision to provide content that can be enjoyed as a break from the weary daily routine. Keepo.me's main target audience is Generation Z, who exhibit different news consumption patterns than previous generations: Generation Z is most likely to obtain information from social media; they mostly like to enjoy reading content and/or information in their own language; and lastly, they prefer content and/or information that is less tied to current affairs, light, and easy to understand. Therefore, Keepo.me, as a digital content and media provider, provides information that can entertain the youth in Indonesia with viral and exciting yet informative content and/or information. As reported by TechInAsia, "Keepo.me is a mix between BuzzFeed, WordPress, and Twitter in Indonesia" [12].

Keepo.me is a User Generated Content (UGC) service. Keepo.me allows any independent writers and/or bloggers to publish their content on Keepo.me's platform. To date, Keepo.me has made partnerships with more than 600 bloggers and writers throughout Indonesia.

The research for this thesis project requires collecting and processing Keepo.me's user data and Keepo.me's internal analytics data. Therefore, Keepo.me is willing to share their system resources in order to proceed with this research project. The system resources include servers, system source code, and databases. Keepo.me is also willing to share their internal analytics data to be analyzed and reported for the sake of this thesis project.

1.2 Research Questions

Having introduced the challenges that every content provider faces, and in particular Keepo.me, this thesis project has been carried out to find solutions for those challenges. We have constructed research questions whose answers can serve as solutions to tackle these challenges. Our research questions are as follows:

1. To what extent can Recommender Systems improve user engagement on Keepo.me?

2. How can we incorporate contextual information to provide better content recommendations on Keepo.me?

3. How does Topic Modeling contribute to the Recommender Systems to provide relevant documents based on the context?


Chapter 2

Background Information

2.1 Recommender System

Many content provider websites are struggling to increase the number of engaged users. User engagement itself depends on how easily users can find the information they are looking for. Many of these websites use personalization based on user preferences. However, techniques based on user preferences may have shortcomings, including situations in which users do not really know what they are interested in, or change their preferences over time. These shortcomings can make the experience time-consuming for the users. Hence the need for a system that can provide related information to users even when user preference information is partially absent, which motivated the development of Recommender Systems [1].

Isinkaye, et al., explain in [16] that Recommender Systems have the ability to predict whether a particular user would prefer an item or not based on the user's profile. Recommender Systems can reduce the transaction costs of finding and selecting items in a content delivery environment, and have also been shown to improve the decision-making process and quality. In other words, Recommender Systems can offer personalized information to a certain user by learning from her interaction behaviours (e.g. clicks, feedback, dwell time, duration, et cetera).

2.1.1 User Modeling

For a Recommender System to be able to predict relevant and personalized information for a user, the Recommender System has to learn to generate a user model. As explained in [16], Recommender Systems cannot functionally represent the relevant information without a user model. A user model may contain information such as the user's attributes, behaviours, and histories, which will later be used by the Recommender System to predict and retrieve the relevant information for the user herself. Two major methods exist for Recommender Systems to generate a user model: explicit feedback and implicit feedback.


2.1.1.1 Explicit Feedback

This method directly approaches the relevant users by asking direct questions. For example, the system may request the user to choose some topics which she is interested in, or ask her to rate a certain item, in order to construct her data model. However, this method requires effort from the user to obtain the data model, and this approach appears to be difficult to implement [16], a finding confirmed by a Keepo.me pilot study¹. Users tend to ignore such direct questions because they consider answering them time-consuming, even though the user data model obtained from this method could be reliable.

2.1.1.2 Implicit Feedback

In this method, the system automatically constructs a model of the user based on her interactions with the system, such as click-through links, history, behaviours, time spent on a single content item, et cetera. This method might be the solution for the problem with explicit feedback, where users do not need to answer a direct question but rather let the system learn by itself. However, Isinkaye, et al., describe that this method is less accurate than explicit feedback, although it has also been argued that the data model obtained from this method might be more objective [16].
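To make the implicit feedback idea concrete, the following is a minimal Python sketch (illustrative only; the thesis implementation, described in Chapter 4, is written in PHP). It aggregates a per-user topic profile from a click log, assuming hypothetical log fields `user_id`, `topics`, and `dwell_seconds`.

```python
from collections import defaultdict
from typing import Dict, Iterable

def build_implicit_profile(click_log: Iterable[dict]) -> Dict[str, Dict[str, float]]:
    """Aggregate a per-user interest profile from implicit signals (clicks, dwell time)."""
    profiles: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for entry in click_log:
        # A click counts as 1.0; longer dwell time adds extra weight.
        weight = 1.0 + entry.get("dwell_seconds", 0) / 60.0
        for topic in entry.get("topics", []):
            profiles[entry["user_id"]][topic] += weight
    # Normalize each profile so the topic weights sum to 1.
    return {user: {t: w / sum(topics.values()) for t, w in topics.items()}
            for user, topics in profiles.items()}

log = [{"user_id": "u1", "topics": ["sport"], "dwell_seconds": 90},
       {"user_id": "u1", "topics": ["politics", "sport"], "dwell_seconds": 30}]
print(build_implicit_profile(log))
```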

2.1.2 Information Filtering

2.1.2.1 Content-based Filtering

Content-based Filtering is one of the approaches to rank the items in a recommendation system. The rating R(u,i) of item i for user u is estimated based on the ratings R(u,i′) assigned by the same user u to other items i′ ∈ Items that are similar to item i in terms of their content [7]. For example, in the case of a news provider, the recommender system will recommend news to user u by analysing the similarity in content with the news which user u has already evaluated in the past. Content-based Filtering does not need profiles from other users, and the items recommended by this technique are more likely to accurately match user u's interests. The model generated in this technique uses one of several approaches, such as Term Frequency Inverse Document Frequency (TF-IDF), a Naïve Bayes Classifier, a Decision Tree, or Cosine Similarity.

Adomavicius, et al., in [7] describe the rating function R(u,i) as

R(u, i) = score(ContentBasedProfile(u), Content(i))

¹ Keepo.me's pilot study on the explicit feedback method showed that less than 5% of active users were willing to answer direct questions.


Here ContentBasedProfile(u) is defined as a vector of weights (w_u1, · · · , w_uk), where each weight w_ui denotes the importance of keyword k_i to user u. Content(i) is the set of attributes characterizing item i. The score of ContentBasedProfile(u) and Content(i), R(u, i), can be measured with the cosine similarity, which is defined as follows:

$$\mathrm{sim}(u, i) = \cos(\vec{u}, \vec{i}) = \frac{\vec{u} \cdot \vec{i}}{\|\vec{u}\|_2 \times \|\vec{i}\|_2} = \frac{\sum_{s \in S_{ui}} w_{u,s}\, w_{i,s}}{\sqrt{\sum_{s \in S_u} w_{u,s}^2}\; \sqrt{\sum_{s \in S_i} w_{i,s}^2}}$$
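As a minimal illustration of the formula above (not the thesis code), the sketch below computes the cosine similarity between two sparse keyword-weight vectors; in practice the weights w_{u,s} and w_{i,s} would come from TF-IDF.

```python
import math
from typing import Dict

def cosine_similarity(profile: Dict[str, float], content: Dict[str, float]) -> float:
    """Cosine similarity between a user profile vector and an item content vector."""
    shared = set(profile) & set(content)                      # keywords s in S_ui
    dot = sum(profile[s] * content[s] for s in shared)
    norm_u = math.sqrt(sum(w * w for w in profile.values()))  # ||u||_2
    norm_i = math.sqrt(sum(w * w for w in content.values()))  # ||i||_2
    return dot / (norm_u * norm_i) if norm_u and norm_i else 0.0

# Hypothetical TF-IDF weights for a user profile and one article.
user_profile = {"football": 0.8, "election": 0.3, "travel": 0.1}
article = {"football": 0.7, "stadium": 0.5, "final": 0.4}
print(cosine_similarity(user_profile, article))   # estimate of R(u, i) for this article
```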

Even though Content-based Filtering is commonly used to determine the rank of a textual document, the authors of [7] identify several problems with this technique. Content-based Filtering is dependent on the metadata of the item, where the content information of the item is either extracted automatically or provided manually. The so-called limited content analysis capabilities of Content-based Filtering reduce the ability of this technique to recommend multimedia items. For example, Content-based Filtering will not be able to analyse content with video unless information about the video is provided. Also, Content-based Filtering suffers from over-specialization, where the user receives recommendations based only on the items she has evaluated in the past.

Despite the problems that Content-based Filtering has, this technique does not suffer from the cold-start problem for new items: new items can still be recommended even though they have not been rated by the users. However, a cold start for new users can occur with this technique, where the recommender system cannot accurately recommend items since the new user has not evaluated any items in the collection, or has evaluated only a few items in the past. This shortcoming is also called the new user problem.

2.2 Context-Aware Recommender System

2.2.1 Definition of Context

Several works in the literature define context as the situations or circumstances in which something happens. Some argue that context refers to location, environment, people, identity, time, et cetera. These variables are mostly difficult to implement when a developer wants to apply them in an application, since there are too many variables to include as context. Therefore, Abowd, et al., in [6] define context as:


Context is any information that can be used to characterize the situation of an entity. An entity is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves.

This definition tends to be easy to understand when a developer wants to build a context-based application. As Abowd, et al., have explained in [6], information can be called context if it can be used as an attribute of the situation of a participant in an interaction. For example, in the case of news content recommendation, a user usually reads news about the economy when she is having her breakfast in the morning, while in the evening, after having her dinner, she prefers to read politics. Here, the Recommender System may try to understand her preferences at a given moment. The Recommender System can then recommend business news in the morning, while in the evening it gives political news recommendations. With respect to this example, such a Recommender System would be context-aware, since it follows the definition of context-aware given in [6]:

A system is context-aware if it uses context to provide relevant information and/or services to the users, where relevancy depends on the user's task.

2.2.2 Contextual Information Modeling

As explained in Section 2.1, Recommender Systems predict the rating of an item based on user u and item i. Once the Recommender Systems have estimated the initial set of ratings, they can predict the rating of items that have not yet been rated by users. The rating function R can be defined as follows

R : User × Item → Rating

This rating function is called a two-dimensional (2D) or traditional recommender system, since the only dimensions used are the User and Item dimensions [8]. Context-aware recommender systems, on the other hand, use not only the User and Item dimensions but also the Context dimension, which specifies the contextual information. Adomavicius, et al., define the rating function R in context-aware recommender systems as

R : User × Item × Context → Rating

Here User and Item denote the domains of users and items respectively, and Context denotes the contextual information assigned within the application [8, 7]. The Context dimension can be specified as multiple dimensions such as Time, Location, Device, etc. Therefore, the dimensions in the rating function R can be defined as


R : D1 × · · · × Dn → Rating

where two of D1, · · · , Dn are the User and Item dimensions, and the rest are contextual dimensions such as Time, Location, Device, etc. For example, let S = (User, Item, Time, Location) be the dimension space for the rating function R; then the rating function R : User × Item × Time × Location → Rating specifies how much user u ∈ User liked item i ∈ Item at time t ∈ Time and in location l ∈ Location, written R(u, i, t, l) [8, 7]. Adomavicius, et al., explain in [7] that not all contextual dimensions should be taken into account in the multidimensional recommender system. For example, if the Time dimension and the Location dimension have the same rating distribution for item i, then the Location dimension can be removed from the multidimensional recommender system.
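As a small illustration of the multidimensional rating function (an assumption about representation, not the thesis implementation), R can be stored as a sparse mapping keyed by (user, item, time, location):

```python
from typing import Dict, Optional, Tuple

# R : User x Item x Time x Location -> Rating, stored as a sparse mapping.
RatingKey = Tuple[str, str, str, str]           # (user, item, time, location)
ratings: Dict[RatingKey, float] = {
    ("alice", "article-42", "morning", "home"): 4.0,
    ("alice", "article-42", "night", "office"): 2.0,
}

def rating(user: str, item: str, time: str, location: str) -> Optional[float]:
    """Look up R(u, i, t, l); None means the rating is unknown and must be predicted."""
    return ratings.get((user, item, time, location))

print(rating("alice", "article-42", "morning", "home"))   # 4.0
```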

2.2.3 Context-Aware approaches in Recommender Systems

2.2.3.1 Contextual preference elicitation and estimation

Contextual preference elicitation and estimation, also called contextual user modeling, is one of the most recent approaches used in context-aware recommender systems. Contextual preference elicitation and estimation predicts and estimates which items should be recommended to the user by modeling and learning user preferences using various data analysis techniques (e.g., data mining or machine learning) [8]. As proposed in [8] and as shown in Figure 2.1 (this figure is taken from [8]), contextual preference elicitation and estimation can use one of three methods to compute the ranking estimation of items in which the contextual information is used.

Figure 2.1: Methods for incorporating context in Recommender System


Contextual Pre-Filtering

In this method, contextual information is used to filter the relevant items from the data before computing recommendation estimations. The benefit of this method is that it can be used with any traditional (2D) recommender system technique. For example, in the case of news content recommendation, when a user wants to read news in the morning, only the news that has relevancy or rating data in the morning will be used. Then a traditional (2D) recommender system, such as Content-based Filtering, estimates the score of that morning news.
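A minimal sketch of contextual pre-filtering (illustrative only): the rating records are first reduced to those matching the current context, and the reduced set is then passed to any traditional (2D) recommender. The record fields and the `recommender_2d` callable are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]   # e.g. {"user": "u1", "item": "i3", "time": "morning", "rating": "4"}

def contextual_pre_filter(records: List[Record], context: Dict[str, str]) -> List[Record]:
    """Keep only the rating records whose attributes match the current context."""
    return [r for r in records
            if all(r.get(key) == value for key, value in context.items())]

def recommend(records: List[Record], user: str, context: Dict[str, str],
              recommender_2d: Callable[[List[Record], str], List[str]]) -> List[str]:
    # Step 1: pre-filter the data with the context.
    relevant = contextual_pre_filter(records, context)
    # Step 2: hand the reduced data to any traditional (2D) recommender.
    return recommender_2d(relevant, user)
```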

Contextual Post-Filtering

In contrast to Contextual Pre-Filtering, Contextual Post-Filtering uses contextual information to adjust the recommended items after the traditional (2D) recommender systems are applied. The adjustment in Contextual Post-Filtering can be classified into heuristic and model-based techniques. The heuristic technique depends on the item characteristics of a given user in the given context. The heuristic technique may use one of the following methods to adjust the recommendations:

1. Filtering out the recommended items that are irrelevant to the item characteristics.

2. Ranking the recommended items based on how many of the relevant item characteristics they have.

The model-based technique, on the other hand, uses a probabilistic computation to construct predictive models of the items which the user has chosen in the given context. The recommendations are adjusted using one of the following methods:

1. Filtering out the recommended items whose probability of relevance is smaller than a predefined threshold.

2. Ranking the recommended items by weighting the predicted rating with the probability of relevance.

Similar to Contextual Pre-Filtering, Contextual Post-Filtering also has the benefit of working with any traditional (2D) recommender system.
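A minimal sketch of the post-filtering adjustments described above (illustrative only): the output of a 2D recommender is either filtered against a relevance threshold or re-ranked by weighting predicted ratings with the contextual probability of relevance.

```python
from typing import Dict, List, Tuple

Scored = List[Tuple[str, float]]   # (item_id, predicted rating) from a 2D recommender

def filter_by_relevance(recommended: Scored, relevance: Dict[str, float],
                        threshold: float) -> Scored:
    """Drop items whose contextual probability of relevance is below a threshold."""
    return [(item, score) for item, score in recommended
            if relevance.get(item, 0.0) >= threshold]

def rerank_by_relevance(recommended: Scored, relevance: Dict[str, float]) -> Scored:
    """Re-rank by weighting the predicted rating with the probability of relevance."""
    weighted = [(item, score * relevance.get(item, 0.0)) for item, score in recommended]
    return sorted(weighted, key=lambda pair: pair[1], reverse=True)
```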


Contextual Modeling

While Contextual Pre-Filtering and Contextual Post-Filtering use contextual information before or after the recommendation algorithm, Contextual Modeling uses the contextual information within the recommendation algorithm itself. The major advantage of this technique is that it yields a truly multidimensional (MD) recommendation algorithm.

A heuristic-based approach and a model-based approach are commonly used in Contextual Modeling. The heuristic-based approach extends the two-dimensional recommendation function to n dimensions that include the contextual information. An example of a model-based contextual recommender system uses a hierarchical regression-based Bayesian preference model to combine the user and item information using Markov Chain Monte Carlo [9].

2.3 Topic Modeling

Topic Modeling, also known as Probabilistic Topic Models, refers to algorithms that learn and discover a set of latent variables as main themes or topics from a large and unstructured collection of documents [10]. As Blei mentions in [10], these algorithms do not require labeling or annotation of the collection of documents beforehand; instead, Topic Modeling algorithms try to infer the latent structures of the collection of documents by themselves.

Let us assume that documents are generated from an underlying latent topic distribution, and that each document is generated from a mixture of these topics, each with a different proportion in the document. The topics, then, are defined as distributions over words. A Topic Modeling method such as LDA uses an iterative process to estimate and construct a model of this underlying distribution based on the observed words in the text. This model reflects the intuition that documents contain multiple topics, and that each document exhibits the topics in different proportions, as shown in Figure 2.2.


Figure 2.2: Documents exhibit multiple topics

2.3.1 Latent Dirichlet Allocation

One of the most common methods in Topic Modeling is Latent Dirichlet Allocation (LDA). Latent Dirichlet Allocation is a statistical model that computes the probability distribution over topics for a particular document and the probability distribution over words for a particular topic. Figure 2.3 depicts the graphical model of the joint distribution used in Latent Dirichlet Allocation [10].

Figure 2.3: Latent Dirichlet Allocation in graphical model

Variables N, D, and K are defined as the number of words in a document, the number of documents, and the number of topics, respectively. α denotes a Dirichlet prior on the per-document topic distribution. θ_d denotes the topic distribution for document d. Z_{d,n} and W_{d,n} denote the topic for the n-th word in document d, and the n-th word in document d, respectively. β_k denotes the word distribution for topic k. Parameter η denotes a Dirichlet prior on the per-topic word distribution. The formula of this joint distribution is defined as follows:

$$p(\beta_{1:K}, \theta_{1:D}, Z_{1:D}, W_{1:D}) = \prod_{i=1}^{K} p(\beta_i) \prod_{d=1}^{D} p(\theta_d) \prod_{n=1}^{N} p(Z_{d,n} \mid \theta_d)\, p(W_{d,n} \mid \beta_{1:K}, Z_{d,n})$$
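The thesis generates its topic models with Mallet (Subsection 4.2.3.1); purely as an illustration, the sketch below fits an LDA model with the gensim library on a toy corpus and reads off the per-topic word distributions (β_k) and the per-document topic mixtures (θ_d).

```python
from gensim import corpora, models

# Toy corpus: each document is already tokenized and stop-word filtered.
documents = [
    ["football", "match", "goal", "league"],
    ["election", "vote", "parliament", "policy"],
    ["football", "coach", "league", "transfer"],
]

dictionary = corpora.Dictionary(documents)               # word <-> id mapping
corpus = [dictionary.doc2bow(doc) for doc in documents]  # bag-of-words vectors

# K = 2 topics for this toy corpus; the thesis uses K = 50 over 20,000 documents.
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=20, random_state=1)

print(lda.show_topic(0, topn=4))            # top words of topic 0 (beta_0)
print(lda.get_document_topics(corpus[0]))   # topic mixture of document 0 (theta_0)
```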

2.4 Interleaving Evaluation

For many years, application developers have used A/B Testing to evaluate their retrieval systems with their users. A/B Testing is an online evaluation technique which compares two systems by splitting the users into two groups. One user group is shown system A while the other user group is shown system B. A/B Testing then tries to infer the differences between the two systems from the observed user behaviour. This technique, however, has its limitations. A/B Testing tends to require a large number of observations when it has to evaluate substantial numbers of documents from systems in the retrieval domain [25]; consequently, it can be expensive in terms of time and resources.

The interleaving technique, proposed by Joachims in [17], reduces the number of observations needed by combining the documents retrieved from 2 (or more²) systems into a single result list. The combined list of retrieved documents of both systems is shown to the users without giving any information about which document comes from which retrieval system. The number of clicks from users is recorded; the system that receives the higher number of clicks can then be considered the better system.

Figure 2.4 depicts an example of the combined documents retrieved from the results of 2 search engines for the query "support vector machine" (this figure is taken from [17]). The metadata from the retrieved documents of each search engine has been removed in order to fulfill the "blind test" criterion, where users should not know which search engine is being used for a particular document. In this example, if the user is more likely to click the retrieved document links from Google, it can be assumed that the documents retrieved from Google are more relevant than the ones from MSNSearch. The interleaving technique has shown greater data efficiency than A/B Testing [11], because fewer observations are necessary to conclude whether A or B should be preferred.

² To evaluate more than 2 information retrieval systems, the technique called Multileaving [24, 23] should be used.


Figure 2.4: Example of combined documents retrieved.

2.4.1 Team-Draft Interleaving

One of the most commonly used interleaving techniques is Team-Draft Interleaving, proposed by Radlinski, et al. in [22]. The basic idea of this technique is team selection for a high school team-sport match. In the initial phase, two team captains are selected; each captain has a preference order over the players and takes turns picking the most preferred player (highest ranked document) which is still available, which is then appended to the interleaved ranking. In each round, the order in which the captains pick their team player is randomized. The following is the algorithm for Team-Draft Interleaving, taken from [22, 23], to produce a combined ranking:


Algorithm 1 Team-Draft Interleaving
Input: Rankings A = (a_1, a_2, · · · ) and B = (b_1, b_2, · · · )
Init: I ← (); TeamA ← ∅; TeamB ← ∅
while (∃i : A[i] ∉ I) ∧ (∃j : B[j] ∉ I) do
    if (|TeamA| < |TeamB|) ∨ ((|TeamA| = |TeamB|) ∧ (RandBit() = 1)) then
        k ← min_i {i : A[i] ∉ I}
        I ← I + A[k]
        TeamA ← TeamA ∪ {A[k]}
    else
        k ← min_i {i : B[i] ∉ I}
        I ← I + B[k]
        TeamB ← TeamB ∪ {B[k]}
    end if
end while
Output: Interleaved ranking I, TeamA, TeamB

To infer a preference between ranking A and ranking B from the combined ranking I, denote by c_1, c_2, · · · the ranks of the clicks in the interleaved ranking I = (i_1, i_2, · · · ), by h_a the number of clicks on links in the top k of ranking A, and by h_b the number of clicks on links in the top k of ranking B:

h_a = |{c_j : i_{c_j} ∈ TeamA}|,   h_b = |{c_j : i_{c_j} ∈ TeamB}|

The preference can then be inferred as follows: h_a > h_b infers a preference for ranking A, h_a < h_b infers a preference for ranking B, and h_a = h_b infers no preference.
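A compact Python sketch of Algorithm 1 and of the click-credit rule above (illustrative; the evaluation in this thesis uses the multileaving setup described in Chapter 4):

```python
import random
from typing import List, Sequence, Set, Tuple

def team_draft_interleave(a: Sequence[str], b: Sequence[str]) -> Tuple[List[str], Set[str], Set[str]]:
    """Combine two rankings into one interleaved list (Algorithm 1)."""
    interleaved: List[str] = []
    team_a: Set[str] = set()
    team_b: Set[str] = set()
    # Continue while both rankings still contain documents not yet in the interleaved list.
    while any(d not in interleaved for d in a) and any(d not in interleaved for d in b):
        # The smaller team picks first; ties are broken by a coin flip.
        if len(team_a) < len(team_b) or (len(team_a) == len(team_b) and random.random() < 0.5):
            doc = next(d for d in a if d not in interleaved)
            interleaved.append(doc)
            team_a.add(doc)
        else:
            doc = next(d for d in b if d not in interleaved)
            interleaved.append(doc)
            team_b.add(doc)
    return interleaved, team_a, team_b

def infer_preference(clicked: Sequence[str], team_a: Set[str], team_b: Set[str]) -> str:
    """Credit each click to the team that contributed the clicked document (h_a vs h_b)."""
    h_a = sum(1 for d in clicked if d in team_a)
    h_b = sum(1 for d in clicked if d in team_b)
    return "A" if h_a > h_b else "B" if h_b > h_a else "tie"
```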

2.4.2 Team-Draft Multileaving

The major disadvantage of the Interleaving technique is that it only accepts 2 retrieved document lists (or rankings) to be interleaved. Therefore, Schuth in [23] proposed Team-Draft Multileaving as an extension of Team-Draft Interleaving; this approach allows Team-Draft to receive more than 2 lists of retrieved documents (or rankings) and combine them into a single interleaved list. The architecture of Team-Draft Multileaving is straightforward. It uses the Team-Draft Interleaving algorithm, with one modification: the number of teams and the way the team captains are selected to pick the team players. The algorithm for Team-Draft Multileaving is as follows (the algorithm is taken from [23]):


Algorithm 2 Team-Draft Multileaving
Require: set of rankings R, multileaving length k
L ← []
∀R_x ∈ R : T_x ← ∅
while |L| < k do
    select R_x randomly s.t. |T_x| is minimized
    p ← 0
    while R_x[p] ∈ L and p < k − 1 do
        p ← p + 1
    end while
    if R_x[p] ∉ L then
        L ← L + [R_x[p]]
        T_x ← T_x ∪ {R_x[p]}
    end if
end while
return L, T

Let us assume that the multileaved ranking L is produced from 3 ranked lists, denoted R_x. Following the method in Subsection 2.4.1, the preference can be inferred as: h_a > h_b = h_c infers a preference for ranking A, h_b > h_a = h_c infers a preference for ranking B, h_c > h_a = h_b infers a preference for ranking C, and h_a = h_b = h_c infers no preference.
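A corresponding sketch of Algorithm 2 (illustrative only): any number of rankings is combined into a list of length k, always letting the currently smallest team pick next. A small guard is added so that a ranker with no unused documents left is skipped.

```python
import random
from typing import Dict, List, Sequence, Set, Tuple

def team_draft_multileave(rankings: Dict[str, Sequence[str]],
                          k: int) -> Tuple[List[str], Dict[str, List[str]]]:
    """Combine several rankings into one list of length k (Algorithm 2)."""
    combined: List[str] = []
    teams: Dict[str, List[str]] = {name: [] for name in rankings}
    exhausted: Set[str] = set()
    pool = {doc for ranking in rankings.values() for doc in ranking}
    while len(combined) < k and len(combined) < len(pool):
        # Select a ranker whose team is currently smallest (ties broken at random).
        candidates = {n: t for n, t in teams.items() if n not in exhausted}
        if not candidates:
            break
        smallest = min(len(team) for team in candidates.values())
        name = random.choice([n for n, team in candidates.items() if len(team) == smallest])
        ranking = rankings[name]
        # Walk down to that ranker's highest-ranked document not yet shown.
        p = 0
        while p < len(ranking) - 1 and ranking[p] in combined:
            p += 1
        if ranking and ranking[p] not in combined:
            combined.append(ranking[p])
            teams[name].append(ranking[p])
        else:
            exhausted.add(name)   # nothing left to contribute from this ranker
    return combined, teams

# Example with three rankers, mirroring the thesis setup (document ids are hypothetical).
lists = {"baseline": ["a", "b", "c"], "cb": ["b", "d", "e"], "cum": ["f", "a", "g"]}
multileaved, team_assignment = team_draft_multileave(lists, k=5)
```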


Chapter 3

Problem Definition

The number of User Generated Content providers and media contenders in Indonesia has been growing fast recently. Consequently, Keepo.me faces more and more competition. In this chapter, we describe the challenges which Keepo.me needs to overcome in order to, at least, hold their base in this high-risk competition. We compare Keepo.me with 4 other content providers which have a similar target audience: IDNtimes.com, Hipwee.com, MalesBanget.com, and Pulsk.com. The comparison data is aggregated from SimilarWeb.com¹ over the time range of August 2016 to October 2016. We also use data from Google Analytics for the internal data analytics, over the same time range of August 2016 to October 2016.

Compared with the 4 other content providers, the main challenge that Keepo.me has is that their User Engagement metrics are lower than average. As can be seen from Figure 3.1, with 953,069 visitors per month Keepo.me suffers from a low average session duration and pages per session, and a high bounce rate. On the other hand, Pulsk.com, with 817,436 monthly visits, tends to have a slightly better average session duration, pages per session, and bounce rate. Compared to IDNtimes.com, Hipwee.com, and MalesBanget.com, Keepo.me is left far behind.

¹ SimilarWeb.com is a competitive intelligence (CI) service which provides slightly more accurate information than Alexa.com [14]. However, CI data is not 100% accurate, but it suffices to compare and/or cross-reference websites.


Figure 3.1: Overall user engagement comparison of 4 content providers

Figure 3.2: Monthly visit comparison of 4 content providers

Figure 3.3: Avg. session duration comparison of 4 content providers


Figure 3.4: Pages per session comparison of 4 content providers

Figure 3.5: Bounce rate comparison of 4 content providers

Keepo.me has attempted many ways to improve User Engagement in the past year, such as adjusting the design by re-layouting the User Interface and User Experience, improving Search Engine Optimization, producing better content and attracting the right visitors, et cetera. Unfortunately, these attempts were not able to give a significant improvement in User Engagement. For this reason, we want to evaluate whether the implementation of Recommender Systems could improve Keepo.me's User Engagement (RQ1). Therefore, our main focus is to reduce the Bounce Rate by integrating a Recommender System to improve the user experience of visitors to the site.


Figure 3.6: Page Depth of visited page(s)

Google Analytics defines Bounce Rate as the percentage of single-page sessions. When the Bounce Rate is high, it means that users most likely leave the page once they have found the information they need, and do not continue to other pages [3]. For example, when the Bounce Rate metric is 75% on average, it means that 75% of the users who come to the site leave after viewing only the page they entered. Simply put, if a site has a high Bounce Rate, that site cannot retain its users. Considering the Bounce Rate metric shown in Figure 3.5, Keepo.me has a Bounce Rate of 63%. According to RocketFuel's report on average Bounce Rates in [21], a Bounce Rate in the range of 56-70% can be considered higher than average. If Keepo.me can reduce its Bounce Rate to the range of 25-40% or lower, then Keepo.me will have an excellent User Experience in terms of User Engagement.

Figure 3.6 depicts that the number of sessions with 2 or more visited pages is relatively low: less than 5.5% of the user sessions visited more than 2 pages. Our target is to let users visit 2 or more pages. We assume that a Recommender System could provide related items that might attract users to click and continue to other pages, which could then reduce the Bounce Rate. Furthermore, if the Bounce Rate can be reduced, the Session Duration and the number of Pages per Session will increase.


Figure 3.7: Mobile vs Desktop vs Tablet

Since the number of mobile users, as shown in Figure 3.7, is far greater than the number of desktop users, we can assume that the mobility of Keepo.me's users is high. Thus, our assumption is that incorporating contextual information into the Recommender Systems could provide more accurately related items to the users given their context, which could then attract them to browse more pages (RQ2 and RQ3).


Chapter 4

Experiment

In order to evaluate a possible solution for the challenges described in Chapter 3, we conducted an experiment by implementing two Recommender Systems and integrating them into the live system of Keepo.me. The implemented systems are examples of Content-based Filtering and Contextual preference elicitation and estimation (or Contextual User Modeling), respectively. This chapter explains the details of our implementation strategy. Since Keepo.me's system uses PHP scripts, we produced our implementation in PHP as well.

4.1 Data models

As the first step in implementing our strategy, we constructed models of our log data. The data models were constructed by following the definitions given in [26] as our baseline.

Definition 1. User set U = {u_1, · · · , u_n} is a set of users. Let u_i be the i-th user in U; then u_i = <uc_i1, · · · , uc_il>, where uc_ik is the correlation between user i and context k.

User set U contains n users. Each user in the user set correlates with a 2-dimensional context vector, denoted as <time, location>. uc_ik denotes the correlation between user i and context vector k.

Definition 2. Item set I = {i_1, · · · , i_m} is a set of items. Let i_j be the j-th item in I; then i_j = (<ic_j1, · · · , ic_jl>, <it_j1, · · · , it_jk>), where ic_jn is the correlation between item j and context n, and it_jl is the correlation between item j and topic l.

Item set I contains m items. Similar to the user set, each item in the item set correlates with a 2-dimensional context vector, denoted as <time, location>. ic_jn is the correlation between item j and context vector n. Moreover, it_jl is the correlation between item j and topic l, where topic l is aggregated by Topic Modeling from item j.
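A minimal sketch of these data models (illustrative; the field names are hypothetical and the actual models live in Keepo.me's PHP system):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Context = Tuple[str, str]   # (time, location), e.g. ("noon", "office")

@dataclass
class User:
    """Definition 1: a user and her correlations uc_ik with context vectors."""
    user_id: str
    context_correlation: Dict[Context, float] = field(default_factory=dict)

@dataclass
class Item:
    """Definition 2: an item, its context correlations ic_jn, and its topic correlations it_jl."""
    item_id: str
    context_correlation: Dict[Context, float] = field(default_factory=dict)
    topic_correlation: Dict[str, float] = field(default_factory=dict)

users: List[User] = [User("u1", {("noon", "office"): 0.7})]
items: List[Item] = [Item("i1", {("noon", "office"): 0.4}, {"travel": 0.6, "food": 0.2})]
```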


4.2 Experiment strategy

Our experiment for this research is straightforward. We conducted an experiment to compare the baseline and two Recommender Systems: Content-based Filtering and Contextual User Modeling. Within this experiment, we want to find out which of these systems is most likely to be preferred by users. From this experiment, we can also determine whether implementing Recommender Systems could improve the User Engagement on Keepo.me, by analysing the data analytics from Google Analytics with and without the implementation of the Recommender Systems.

As we have explained in Chapter 3, Keepo.me's users are more likely to use mobile devices than a personal computer. The second objective of this experiment is to test our hypothesis that contextual information can help recommend relevant items to Keepo.me's users accurately. To compare and evaluate the implementation of the baseline and the 2 Recommender Systems, we implemented Team-Draft Multileaving. Figure 4.1 depicts the general overview of our experiment strategy.

Figure 4.1: Overview of the experiment strategy

4.2.1 Baseline

For the baseline strategy, we used both Keepo.me's recently published items and "Trending" items generated by the "Trending" procedure. The idea behind this strategy is to mix the recently published items and the trending items. The set of recent items consists of items which were just published on Keepo.me, while the trending items are items which have a high number of views in the last 2 months. Table 4.1 shows an example of the mixed items from this strategy, and a minimal sketch of the mixing is given after the table.

ID     Title  Views
55051  Ternyata Makan Sedikit Nasi Itu Bikin Kita Sehat. Ini Dia Manfaat Jika Kita Sedikit Makan Nasi. Kamu Harus Tau Fakta Ini, Guys!!  62
40748  Dengan Merasakan 8 Tanda Ini, Maka Kamu Diduga Kuat Memiliki Indera ke-6  110882
55020  Kena Sensor, Seluruh Badan Agnez Mo Nyaris Diblur di Acara TV dan Cuma Kelihatan Kepala Sama Kakinya Doang! Gagal Paham Deh~  58
36613  Tips Keramat Agar Gebetan Tergila-gila Sama Kamu  95003
43940  5 Lokasi Paling Angker di Jawa Timur Yang Sebaiknya Jangan Kamu Kunjungi  99875

Table 4.1: Example of mixed recently published items and trending items for the baseline strategy
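As an illustration only, the mixing step could look like the sketch below; the thesis does not prescribe the exact mixing order, so the alternation used here (and the helper name mix_baseline) is our own assumption:

from itertools import zip_longest

def mix_baseline(recent_items, trending_items, k=10):
    # Alternate recently published and trending items, skipping duplicates by id.
    mixed, seen = [], set()
    for recent, trending in zip_longest(recent_items, trending_items):
        for item in (recent, trending):
            if item is not None and item["id"] not in seen:
                seen.add(item["id"])
                mixed.append(item)
    return mixed[:k]

# Hypothetical usage with ids from Table 4.1:
recent = [{"id": 55051}, {"id": 55020}]
trending = [{"id": 40748}, {"id": 43940}]
print([i["id"] for i in mix_baseline(recent, trending)])  # e.g. [55051, 40748, 55020, 43940]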

4.2.2 Content-based filtering strategy

In this strategy, we use Cosine Similarity to find popular recent items that are similar to the item the user previously read. Several steps occur within this strategy. First, we extract from our log data a collection of 100 items with the highest number of views in the last 7 days. Then, we compute the similarity between the content of those items and the content of the item the user previously read, using the Cosine Similarity approach. Items with a high similarity score are ranked high, as their content is similar to the item the user previously read (see Appendix A for the program listing).
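A minimal sketch of this similarity step, using TF-IDF vectors and scikit-learn's cosine_similarity (the production system uses its own PHP implementation, see Appendix A; the variable names here are illustrative):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_by_similarity(previous_article, popular_articles):
    # Rank the popular items by cosine similarity to the previously read article.
    ids = list(popular_articles.keys())
    corpus = [previous_article] + [popular_articles[i] for i in ids]
    tfidf = TfidfVectorizer().fit_transform(corpus)            # row 0 = previously read item
    scores = cosine_similarity(tfidf[0:1], tfidf[1:]).ravel()  # similarity of row 0 to every other row
    return sorted(zip(ids, scores), key=lambda pair: pair[1], reverse=True)

# Hypothetical usage: article texts keyed by item id (cf. Tables 4.2 and 4.3)
popular = {54149: "teks artikel pertama ...", 53754: "teks artikel kedua ..."}
print(rank_by_similarity("teks artikel yang baru saja dibaca ...", popular))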

Table 4.2 shows an example of the top-5 popular items from the last 7 days (27/11/2016 - 04/12/2016), extracted from the “LOL” category page. Table 4.3 shows the sorted similarity scores between the top-5 popular items from Table 4.2 and the item “Selama 17 Tahun Fotografer Ini Melakukan Trik Keren Dengan Bermodal Suvenir Murah” as the previously read item.


ID     Title  Views
50704  Cowok Normal Pasti Ngiler Saat Pertama Kali Lihat 10 Foto Ini, Asal Jangan Diteliti Lebih Lanjut  64511
54149  5 Aktivitas Anti Mainstream yang Cuma Bisa Kamu Dapetin Kalo Traveling di Dubai, Tanpa Harus Bayar Tiket Pesawat!  50774
53754  Putus dengan Pacar Emang Menyedihkan Bagi Cowok. Tapi kalo Mantanmu Ngelakuin 8 Perubahan ini, Kamu yang Bakal Nyesek Dobel  44207
54293  Foto-foto Tingkah Gila Manusia di Jaman Modern ini Makin Parah Banget! Serem, Tanda-Tanda Mau Kiamat Nih!  22029
54284  [Fail Moment] Dari Tersandung Sampai Muntah, 12 Artis Papan atas ini Pernah Ngelakuin Hal yang Memalukan di atas Panggung!  18623

Table 4.2: Top-5 popular items example extracted for Content-based Filtering

ID     Score
54149  0.34167682286087
53754  0.23871388072908
50704  0.1483137661058
54284  0.14432818101437
54293  0.12029881517193

Table 4.3: Cosine Similarity score example between the item “Selama 17 Tahun Fotografer Ini Melakukan Trik Keren Dengan Bermodal Suvenir Murah” and the top-5 popular items from Table 4.2

The list of recommended items from this strategy is stored and sent to the Team-Draft Multileaving function, which combines this list of recommended items with the lists from the other strategies.

4.2.3 Contextual user modeling strategy

In the Contextual user modeling strategy, we adopted the Post-Filtering strategy explained in Subsection 2.2.3.1 and used the Filtering method for the adjustment. Figure 4.2 depicts the general view of this strategy.


Figure 4.2: General view of the Contextual user modeling strategy

Our general idea for this strategy is to generate 50 topic models with Topic Modeling¹ from 20,000 documents in our dataset (details are given in Subsection 4.2.3.1), and to use the top-10 most read topics extracted from the items the user read during the last 7 days. We then filter out the items aggregated from Content-based Filtering that are irrelevant to those topics given the user's contextual information. The user's contextual information is separated into 2 attributes: contextual time and contextual location. The location context is not necessarily available, because users do not always give permission to share their location; in those cases the contextual location attribute is dropped and only the temporal context is taken into account.

To illustrate the concept, let us consider the following example:

Example 3. Consider the strategy for recommending items to user A at noon in her office. From the “User Profiler” we extract the history of user A during the last 7 days. Table 4.4 shows an example of this history.

Now, we filter out the history entries that are not relevant to “noon” as time context and “office” as location context of user A, as shown in Table 4.5.

Since we need to find out the topics of the filtered items, we apply Topic Modeling to extract 50 topic models from the 20,000 documents in our dataset. We then associate those 50 topic models with the filtered items, so we can determine the topics of the filtered items. Table 4.6 shows the topics of the filtered items from Table 4.5.

¹ The topics are trained using the Mallet toolkit. All Mallet code is open source and available at http://mallet.cs.umass.edu/ [19].


Item    Time     Location
Item A  Noon     Office
Item B  Noon     Office
Item C  Evening  Home
Item D  Morning  Office
Item E  Night    Home
Item F  Noon     Office
Item G  Noon     Office
Item H  Morning  Office
Item I  Night    Home
Item J  Noon     Office

Table 4.4: User A's history during the last 7 days

Item    Time  Location
Item A  Noon  Office
Item B  Noon  Office
Item F  Noon  Office
Item G  Noon  Office
Item J  Noon  Office

Table 4.5: Filtered user A's history items with noon as time context and office as location context

Item    Topic
Item A  anime
Item B  anime
Item F  japan
Item G  superhero
Item J  japan

Table 4.6: Topics associated with filtered user A's history items

Next, as shown in Table 4.7, we count and sort the number of occurrences of the topics, since we only need the top-10 most read topics.

Topic      Number of occurrences
anime      2
japan      2
superhero  1

Table 4.7: Sorted topics associated with filtered user A's history items

Lastly, given the recommended items from Content-based filtering (Table 4.8), we apply the Post-filtering technique by filtering out the items that are irrelevant to the topics shown in Table 4.7. Table 4.9 shows the result of the Post-filtering, which will be shown to user A.


Item    Topic
Item K  photoshop
Item L  superhero
Item M  music
Item N  superhero
Item O  movie
Item P  japan
Item Q  superhero
Item R  photography
Item S  anime

Table 4.8: Recommended items from Content-based filtering with topics

Item
Item L
Item N
Item P
Item Q
Item S

Table 4.9: Post-filtering result shown to user A

From this example, we can see that the recommended items from this strategy can differ significantly depending on where user A is and at what time.
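The example above can be condensed into a small post-filtering sketch. It assumes the user history, the item-to-topic assignments, and the Content-based Filtering candidates are already available in memory (in the real system they come from the log database and the Mallet output):

from collections import Counter

def post_filter(history, item_topics, candidates, time_ctx, location_ctx, top_n=10):
    # history     : list of (item, time, location) tuples read in the last 7 days
    # item_topics : dict mapping an item to its topic
    # candidates  : items recommended by Content-based Filtering
    # 1. Keep only history entries matching the current context (Tables 4.4 -> 4.5).
    in_context = [item for item, t, loc in history if t == time_ctx and loc == location_ctx]
    # 2. Count topic occurrences and keep the top-N most read topics (Tables 4.6 -> 4.7).
    top_topics = {topic for topic, _ in Counter(item_topics[i] for i in in_context).most_common(top_n)}
    # 3. Drop candidates whose topic is not among the top topics (Tables 4.8 -> 4.9).
    return [c for c in candidates if item_topics.get(c) in top_topics]

history = [("Item A", "Noon", "Office"), ("Item C", "Evening", "Home"), ("Item F", "Noon", "Office")]
topics = {"Item A": "anime", "Item C": "music", "Item F": "japan", "Item K": "photoshop", "Item P": "japan"}
print(post_filter(history, topics, ["Item K", "Item P"], "Noon", "Office"))  # -> ['Item P']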

With respect to contextual time for this strategy, we used a time-related characteristic, the period of day, with the contextual conditions morning, noon, evening, and night (as shown in Table 4.10). For example, if a user visits the page at 17.30, the contextual time strategy will retrieve the items which are popular or most viewed in the time range of 15.00 to 20.59.

Condition  Time range
Morning    07:00 - 11:59
Noon       12:00 - 14:59
Evening    15:00 - 20:59
Night      21:00 - 06:59

Table 4.10: Contextual attribute: Period of Day
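The mapping from a visit time to one of these conditions can be written as a small helper function; a sketch of the lookup in Table 4.10 (the function name is ours):

from datetime import time

def period_of_day(t):
    # Map a clock time to the contextual condition of Table 4.10.
    if time(7, 0) <= t < time(12, 0):
        return "Morning"
    if time(12, 0) <= t < time(15, 0):
        return "Noon"
    if time(15, 0) <= t < time(21, 0):
        return "Evening"
    return "Night"  # 21:00 - 06:59 wraps around midnight

print(period_of_day(time(17, 30)))  # -> "Evening", so items popular between 15:00 and 20:59 are retrieved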

For the contextual location, we followed the “Selecting points within a bounding circle” technique [27] to obtain location points within a certain distance from the given latitude and longitude, using the “First-cut bounding box” technique to improve runtime efficiency [27]:


ϕ_bounds = ϕ ± d/R

λ_bounds = λ ± asin(d/R) / cos ϕ

Here, ϕ and λ denote the latitude and longitude of a certain location, respectively. d denotes the radius of the bounding circle (in kilometers); in our strategy we set the radius of the bounding circle to 5 kilometers. Lastly, R denotes the radius of the earth, which is 6,371 kilometers. For example, suppose user A is at latitude -6.999 and longitude 110.385. We can then calculate the minimum and maximum latitude and the minimum and maximum longitude with the “First-cut bounding box” to find the location points within the square map. Therefore, we retrieve the items whose location lies in the latitude range -7.044 to -6.955 and the longitude range 110.340 to 110.430 (see Appendix B for the program listing).
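A sketch of this first-cut bounding box in Python (the production version is the listing in Appendix B); note that d/R is an angle in radians and has to be converted to degrees before it is added to the latitude and longitude:

import math

EARTH_RADIUS_KM = 6371.0

def bounding_box(lat, lon, d_km=5.0):
    # First-cut bounding box [27] around (lat, lon) for a circle of radius d_km.
    ang = d_km / EARTH_RADIUS_KM                  # angular radius d/R in radians
    lat_delta = math.degrees(ang)                 # phi_bounds = phi +/- d/R
    lon_delta = math.degrees(math.asin(ang) / math.cos(math.radians(lat)))  # lambda_bounds
    return (lat - lat_delta, lat + lat_delta, lon - lon_delta, lon + lon_delta)

# Example from the text: user A at latitude -6.999, longitude 110.385
print(bounding_box(-6.999, 110.385))
# -> roughly (-7.044, -6.954, 110.340, 110.430), matching the ranges above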

Table 4.11 shows an example of the top-10 items for a certain user from the last 7 days (14/12/2016 - 21/12/2016), extracted given the “Night” condition of contextual time (23:55) and the contextual location of latitude -6.999 and longitude 110.385. Table 4.12 shows the topics correlated with the items from Table 4.11.


ID     Title
53824  Beberapa Sejarah Akan Tetap Jadi Misteri, tapi Dengan Hadirnya 10 Foto Langka Ini, Sebagian Misteri Dunia pun Terungkap
50409  Seketat-Ketatnya Pemeriksaan Bandara, Ya Gak Harus Kayak Gini Juga Keles. Keterlaluan Banget
54863  Dimanapun Berada, Orang Kalimantan Pasti Merinding kalau Mendengar Nama 5 Hantu ini
54293  Foto-foto Tingkah Gila Manusia di Jaman Modern ini Makin Parah Banget! Serem, Tanda-Tanda Mau Kiamat Nih!
54588  Walau GTA San Andreas Udah Jadul, tapi Pasti Kamu Belum Tahu Rahasia Ini. Ada yang Bisa Mecahin Misteri Mengerikan Ini?
53982  Pesta-pesta Paling Bejat Sejagat yang Pernah Terukir Abadi dalam Sejarah Peradaban Manusia! No. 5 Parah Banget, Cuy!
55069  5 Foto Cewek “Berbulu” ini Dijamin Bakal Bikin Kamu Keringetan, Keringet Dingin Maksudnya...
42549  13 Kelakuan Konyol Para Fans Yang Justru Membuat Sang Idola Terlihat Bodoh
47328  Lagi, Inilah 15 Kelakuan Konyol Orang-Orang di Dalam Kendaraan Umum
54518  Pria ini Ditantang Fansnya untuk Membuat dan Maenin Skateboard Terberat di Dunia. Salah Atraksi Bisa Bikin Kaki Patah!

Table 4.11: Top-10 items extracted given the “Night” condition of contextual time and the contextual location of latitude -6.999 and longitude 110.385.


Key  Topics' keywords  Number of occurrences
16   bikin foto orang dunia berikut aneh yuk kayak jaman lho baca konyol pengen manusia lihat keren gila issue cewek ngakak  4
21   game pokemon permainan bermain gamer atas playstation versi karakter xbox terbaru mario dirilis nintendo tahun baru secara video main satu  1
43   hantu manusia makhluk hewan aneh memiliki satu orang penampakan boneka melihat setan mengerikan mitos ikan malam sosok salah burung percaya  1
11   membuat karya gambar seni lukisan unik keren dibuat menggunakan desain seniman menjadi bentuk robot hasil buku bunga melihat tato dunia  1
41   tahun dunia pertama menjadi nama sejarah masa satu abad memiliki kuno ditemukan manusia salah orang raja buku inggris dikenal terkenal  1
4    bikin kayak orang udah klik kocak emang loh punya pengen tau dunia biar lucu ternyata ngeliat indonesia tuh  1
48   foto selfie gambar kamera foto-foto terlihat fotografer diambil pose hasil mengambil keren fotonya lihat wajah photoshop fotografi tampak melihat asli  1

Table 4.12: List of topics of the retrieved items from Table 4.11

The retrieved items are filtered by the topics shown in Table 4.12. Table 4.13 shows example results of filtering the items from Table 4.3 with the correlated topics from Table 4.12.

ID     Title
50704  Cowok Normal Pasti Ngiler Saat Pertama Kali Lihat 10 Foto Ini, Asal Jangan Diteliti Lebih Lanjut
54284  [Fail Moment] Dari Tersandung Sampai Muntah, 12 Artis Papan atas ini Pernah Ngelakuin Hal yang Memalukan di atas Panggung!
54293  Foto-foto Tingkah Gila Manusia di Jaman Modern ini Makin Parah Banget! Serem, Tanda-Tanda Mau Kiamat Nih!

Table 4.13: Filtered Content-based Filtering items with the correlated topics from Table 4.12


The list of recommended items from this strategy is stored and sent to the Team-Draft Multileaving function, which then combines this list with the lists from the other strategies into a single combined list.

4.2.3.1 Topic Modeling using Mallet

Mallet is a powerful Java-based natural language processing toolkit, developed by McCallum and contributors at the University of Massachusetts Amherst and the University of Pennsylvania [19]. One of the features Mallet offers is Topic Modeling, which includes a sampling-based implementation of Latent Dirichlet Allocation. We therefore used the Mallet Topic Modeling toolkit to generate the topic models for the Contextual user modeling strategy.

We produced 50 topics² from 20,000 documents (from the newest to the oldest) in our dataset by applying the following Mallet command:

mallet import-dir --input /var/www/malletInput/ --output /var/www/malletOutput/keepo.mallet --keep-sequence --remove-stopwords --extra-stopwords /var/www/malletOutput/stopword.txt

Here, the 20,000 documents, which have already been cleaned from HTML structures, are saved in the /var/www/malletInput directory, and the output of Mallet's preprocessing is saved in the /var/www/malletOutput directory. We also use our own additional stopwords to remove the stopwords within the content (see Appendix E for the stopword list). Then, to train on the dataset, we run a Mallet command whose results are saved in /var/www/malletOutput/keepo_keys.txt and /var/www/malletOutput/keepo_composition.txt. The command is as follows:

mallet train-topics --input /var/www/malletOutput/keepo.mallet --num-topics 50 --optimize-interval 20 --output-state /var/www/malletOutput/topic-state.gz --output-topic-keys /var/www/malletOutput/keepo_keys.txt --output-doc-topics /var/www/malletOutput/keepo_composition.txt

² In our pilot experiment, we produced 20 topics out of 20,000 documents. However, since the number of topics was very limited, the Contextual User Modeling could not predict the recommended items accurately, and the results were insufficient. Therefore, in this experiment we increased the number of topics to 50.


The keepo_keys.txt file contains the top keywords of each topic, while keepo_composition.txt contains the percentage breakdown of each topic for every document in our dataset. Examples of keepo_keys.txt and keepo_composition.txt are shown in Table 4.14 and Table 4.15, respectively.

Key  Keywords  LDA
0    jepang bahasa orang tokyo yen kata punya tahun inggris budaya asing musim japan menggunakan negara satu salah digunakan terkenal kyoto  0.03655
1    video media akun sosial netizen facebook foto twitter dunia instagram youtube langsung komentar pengguna indonesia kali salah berita maya bernama  0.08132
2    air kulit penyakit cara menggunakan mata gigi obat wajah bakteri digunakan secara bahan sakit minyak bau putih menyebabkan dokter alami  0.04486
3    smartphone iphone teknologi apple perangkat ponsel kamera layar menggunakan memiliki baterai produk samsung fitur galaxy baru digunakan sistem android kecepatan  0.04772
4    bikin kayak orang udah klik kocak emang loh punya pengen tau dunia biar lucu ternyata ngeliat indonesia tuh  0.07536

Table 4.14: Example of top keywords of each topic from Mallet


File  Topic 0  Topic 1  Topic 2  ...  Topic 49
perusahaan-fastfood-kfc-ciptain-krim-wajah-beraroma-ayam-goreng-ini-kreatif-ato-kurang-kerjaan-ya.txt  2.05E-4  4.56E-4  0.129  ...  3.89E-4
dimana-ayahnya-lewat-wawancara-eksklusif-ini-rasa-penasaranmu-tentang-gambar-khong-guan-akhirnya-terjawab.txt  1.06E-4  2.361E-4  1.30E-4  ...  0.038
quiz-bahkan-ahli-geografi-menyerah-dalam-quiz-ini–apakah-pengetahuanmu-tentang-dunia-ini-lebih-hebat-dari-ahlinya.txt  0.003  0.008  0.004  ...  0.007
woman-always-right-apalagi-ngadepin-polwan-hati-hati-kamu-bisa-senasib-dengan-remaja-ini.txt  2.25E-4  0.210  0.062  ...  0.007
tanda-tanda-kehamilan-palsu.txt  3.43E-4  0.038  4.21E-4  ...  6.51E-4

Table 4.15: Example of top topics (by percentage) of each document from Mallet. A higher percentage indicates a principal topic of the document.
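A small sketch of how the per-document proportions in keepo_composition.txt can be turned into the dominant topics per document. It assumes each row contains a document index, the file name, and one proportion per topic in topic order (as laid out in Table 4.15); the exact column layout depends on the Mallet version, so this parsing is an assumption:

def dominant_topics(doc_topics_path, top_n=3):
    # Return the top-N topic ids (by proportion) for every document in the Mallet output.
    result = {}
    with open(doc_topics_path, encoding="utf-8") as f:
        for line in f:
            if line.startswith("#") or not line.strip():   # skip header and blank lines
                continue
            parts = line.split()
            doc_name, proportions = parts[1], [float(p) for p in parts[2:]]
            ranked = sorted(range(len(proportions)), key=lambda t: proportions[t], reverse=True)
            result[doc_name] = ranked[:top_n]
    return result

# Hypothetical usage:
# topics = dominant_topics("/var/www/malletOutput/keepo_composition.txt")
# print(topics["tanda-tanda-kehamilan-palsu.txt"])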

Since new documents are produced on Keepo.me every day, we need to re-apply the Topic Modeling to the new documents. Therefore, we produced a shell script which is run every day at 01:00 by the crontab system. The shell script is as follows:

#!/bin/sh

php /var/www/thesis/artisan content:extractor
mallet import-dir --input /var/www/malletInput/ --output /var/www/malletOutput/keepo.mallet --keep-sequence --remove-stopwords --extra-stopwords /var/www/malletOutput/stopword.txt
mallet train-topics --input /var/www/malletOutput/keepo.mallet --num-topics 50 --optimize-interval 20 --output-state /var/www/malletOutput/topic-state.gz --output-topic-keys /var/www/malletOutput/keepo_keys.txt --output-doc-topics /var/www/malletOutput/keepo_composition.txt

4.2.4 Team-draft multileaving strategy

In the Team-draft multileaving strategy we used the Interleaving toolkit developed by Kato³ to produce a combined item list from the different Recommender System strategies. Figure 4.3 shows an example of using the Interleaving toolkit with, as input, the retrieved items from the Baseline as shown in Table 4.1, Content-based filtering as shown in Table 4.3, and Contextual user modeling as shown in Table 4.13.

Figure 4.3: Example of the usage of Interleaving toolkit

In order for the main system of Keepo.me to use this toolkit asynchronously, we developed a Python script to generate the combined item lists from the inputs. The general idea of this strategy is to send the lists from the baseline and the 2 recommender systems on Keepo.me to a Python script, and to let the script produce a combined list with the Team-draft approach using Kato's Interleaving toolkit.

³ All code is open source and available at https://github.com/mpkato/interleaving [18].


Finally, the generated combined list is sent back to Keepo.me, which shows the combined list to the users (see Appendix C for the program listing).
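To illustrate what the combination step does, the sketch below implements the core team-draft idea itself; it is not the Interleaving toolkit we actually used (the toolkit additionally records the team assignments in exactly the form needed for the later click evaluation):

import random

def team_draft_multileave(rankings, length=5):
    # Combine several ranked lists into one list, remembering which "team" supplied each slot.
    combined, teams, used = [], [], set()
    while len(combined) < length:
        order = list(range(len(rankings)))
        random.shuffle(order)                     # random team order in every round
        for team in order:
            candidate = next((doc for doc in rankings[team] if doc not in used), None)
            if candidate is not None and len(combined) < length:
                combined.append(candidate)
                teams.append(team)
                used.add(candidate)
        if all(doc in used for ranking in rankings for doc in ranking):
            break                                 # every input item has been placed
    return combined, teams

baseline   = [55051, 40748, 55020, 36613, 43940]  # Table 4.1
content    = [54149, 53754, 50704, 54284, 54293]  # Table 4.3
contextual = [50704, 54284, 54293]                # Table 4.13
print(team_draft_multileave([baseline, content, contextual]))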

With respect to the combined list, the users do not know which recommender system produced which items. Each item is routed through a proxy where the click on the item is recorded, after which the user is redirected to the original item page. Figure 4.4 shows a screenshot of the combined list shown to the users, and Table 4.16 shows an example of the recorded users' clicks.

Figure 4.4: Screenshot of a combined list shown to the users

docID  multileavingID
0      74869dd1c25aaef7bd6b61d661ee2a8d91c8cdf7
1      f1d46a5fe0b31a6fc370d73f5e0c918021b0466c
1      f226d95a761de1c12e3c4d88a7f1f2217cb3dd97
3      12d517aedf71c2734696bd4b479fd982f9a414d5
2      864650c6a432b75f54c21d28bf2f8083b81dacb6

Table 4.16: Example of recorded users' clicks through the proxy

For the evaluation of this clickthrough method, we developed another Python script that evaluates the users' clicks using Kato's Interleaving toolkit. The result of the evaluation from this script is saved into a file; every file containing an evaluation result is then counted to find out which of the systems is preferred by the users (see Appendix D for the program listing).

Table 4.17 illustrates an example of the evaluation result, and Table 4.18 demonstrates an example of the counted number of occurrences of each result.

Result
[2, 0], [2, 1]
[1, 0], [1, 2]
[0, 1], [0, 2]
[2, 0], [2, 1]
[0, 1], [0, 2]

Table 4.17: Example of the result of the evaluation using Team-draft multileaving, file “74869dd1c25aaef7bd6b61d661ee2a8d91c8cdf7”

Result          Number of occurrences
[0, 1], [0, 2]  1029
[1, 0], [1, 2]  961
[2, 0], [2, 1]  967
[]              24

Table 4.18: Example of the counted result

Here, [0,1],[0,2] means Ranker 0 (Baseline) won against Ranker 1 (Content-based Filtering) and Ranker 2 (Contextual user modeling); [1,0],[1,2] means Ranker 1 (Content-based Filtering) won against Ranker 0 (Baseline) and Ranker 2 (Contextual user modeling); [2,0],[2,1] means Ranker 2 (Contextual user modeling) won against Ranker 0 (Baseline) and Ranker 1 (Content-based Filtering); and [] indicates a tie. We test the statistical significance of the counted result with a one-way analysis of variance (ANOVA).


Chapter 5

Results

This chapter summarizes the results obtained during the experiment period of 25/12/2016 to 01/01/2017. The results are divided into two parts: the evaluation of the recommender systems, and the conversion rate and user engagement on Keepo.me. The results in this chapter give insights that are closely related to answering the research questions.

5.1 Evaluation of recommender systems

To assess which Recommender System users preferred on Keepo.me during the experiment period, we analyse the result of the Team-draft Multileaving method as well as the performance of the Recommender Systems. During the experiment period, we gathered 15,169 clicks on the recommended items list (the combined list from Team-draft Multileaving) shown to the users, out of a total of 218,265 impressions.

As can be seen in Figure 5.1, at the start the Click-through Rate (CTR) on the recommended items list is rather poor, but the users then seemed to be attracted by the recommended items, which increased the rate.


Figure 5.1: Click-through Rate (CTR) on the recommended items list

From the 15,169 clicks mentioned above, we analyse which Recommender System the users preferred. In Figure 5.2, we can see the result of Team-draft Multileaving, which shows that the Baseline won with 34.5% of the total number of clicks against Content-based Filtering and Contextual User Modeling. The Content-based Filtering recommender system won with 27.5% of the total number of clicks against Baseline and Contextual User Modeling, and the Contextual User Modeling won with 26.9% of the total number of clicks against Baseline and Content-based Filtering.


Figure 5.2: Team-draft Multileaving result of Baseline, Content-based Filtering, and Contextual User Modeling

We test the statistical significance with a one-way ANOVA, where N is the number of days of the experiment period and the treatments are the counted multileaving results: Baseline won against Content-based Filtering and Contextual User Modeling (Baseline > CB = CUM) as Treatment 1, Content-based Filtering won against Baseline and Contextual User Modeling (CB > Baseline = CUM) as Treatment 2, Contextual User Modeling won against Baseline and Content-based Filtering (CUM > Baseline = CB) as Treatment 3, and Tie as Treatment 4. Table 5.1 shows the summary of the sample data and Table 5.2 shows the result details.

            Treatment 1   Treatment 2   Treatment 3   Treatment 4   Total
N           8             8             8             8             32
ΣX          5008          3985          3917          91            13001
Mean        626           498.125       489.625       11.375        406.281
ΣX²         3901068       2728667       2595605       1489          9226829
Variance    109437.143    106234.125    96820.554     64.839        127250.531
Std. Dev.   330.813       325.936       311.160       8.052         356.722
Std. Err.   116.960       115.236       110.012       2.847         63.060

Table 5.1: One-way ANOVA sample data summary


Source               SS            df   MS           f       p
Between-treatments   1756869.844   3    585623.281   7.495   0.0008
Within-treatments    2187896.625   28   78139.165
Total                3944766.469   31

Table 5.2: One-way ANOVA result details
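The between- and within-treatment decomposition in Table 5.2 can be reproduced from the eight daily win counts per treatment; a minimal sketch using SciPy (the daily counts below are placeholders, since only the per-treatment totals in Table 5.1 are reported here):

from scipy import stats

# Eight daily win counts per treatment (placeholder values, not the real log data;
# treatment 1 = Baseline wins, 2 = CB wins, 3 = CUM wins, 4 = ties).
treatment_1 = [610, 640, 590, 700, 655, 620, 580, 613]
treatment_2 = [480, 510, 470, 530, 505, 490, 495, 505]
treatment_3 = [470, 500, 460, 520, 495, 480, 485, 507]
treatment_4 = [10, 12, 9, 14, 11, 13, 10, 12]

f_statistic, p_value = stats.f_oneway(treatment_1, treatment_2, treatment_3, treatment_4)
print(f_statistic, p_value)  # on the real counts the thesis reports f = 7.495 and p = 0.0008 (Table 5.2)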

The significance test shows that one or more treatments differ significantly at p < 0.05, where p = 0.0008 and the f-ratio = 7.495. However, the post-hoc Tukey HSD test, as shown in Table 5.3, reports that significant differences (at significance level 0.01) between the treatments only occur for the pairs Baseline > CB = CUM vs. Tie, CB > Baseline = CUM vs. Tie, and CUM > Baseline = CB vs. Tie. This means there is no significant difference between Baseline > CB = CUM, CB > Baseline = CUM, and CUM > Baseline = CB.

Treatments pair   Tukey HSD Q statistic   Tukey HSD p-value   Tukey HSD inference
M1 vs M2          1.294                   0.776               insignificant
M1 vs M3          1.380                   0.742               insignificant
M1 vs M4          6.219                   0.001               p < 0.01
M2 vs M3          0.086                   0.900               insignificant
M2 vs M4          4.925                   0.008               p < 0.01
M3 vs M4          4.839                   0.010               p < 0.01

Table 5.3: Tukey HSD test. Significance level at 0.05 = 3.861; significance level at 0.01 = 4.830. M1 is the mean of Baseline > CB = CUM, M2 is the mean of CB > Baseline = CUM, M3 is the mean of CUM > Baseline = CB, and M4 is the mean of Tie.

Since the users preferred the Baseline over the results from Content-based Filtering or Contextual User Modeling, and there are no significant differences between Baseline > CB = CUM, CB > Baseline = CUM, and CUM > Baseline = CB, it is interesting to analyse the reason behind this. We examined the F-measure to investigate the accuracy of the retrieved items from Content-based Filtering and Contextual User Modeling. The data for the F-measure were collected from 100 items which had not been processed by any Recommender System; we then let Keepo.me's editor evaluate which items were related to the item the user previously read. Then, we picked the top-10 items that had been processed by Content-based Filtering and Contextual User Modeling to compute the F-measure. In Figure 5.3, we can see that the performance of both Content-based Filtering and Contextual User Modeling is low.


Figure 5.3: F-Measure performance of Content-based Filtering and Contextual User Modeling

Along with the low F-measure scores, the Recommender Systems also suffered from notably high computing times. As can be seen in Figure 5.4, while Content-based Filtering had a considerably low computing time with a mean of 0.316s, Contextual User Modeling yielded a high computing time with an average of 1.680s. Together with the Multileaving computing time, all the Recommender System processes took 2.759s on average.


Figure 5.4: Computing time (sec) of Recommender Systems

5.2 Evaluation of user engagement

In this evaluation, we analyse the metric data from Google Analytics over two periods: (1) the experiment period (25/12/2016 to 01/01/2017) with 341,808 unique users and (2) the non-experiment period (02/01/2017 to 09/01/2017) with 384,332 unique users. From the performance of these two sets of metric data, we can find out whether implementing Recommender Systems as organic recirculation could improve the conversion rate and/or user engagement on Keepo.me.

Firstly, we analyse the performance difference in the Bounce Rate metric, comparing the experiment period to the non-experiment period. As we have explained in Chapter 3, we want to reduce the Bounce Rate on Keepo.me to a value in the range of 25-40% or lower. As can be seen in Figure 5.5 and Table 5.4, the Bounce Rate during the non-experiment period is worse than during the experiment period. Even though on 25/12/2016 the Bounce Rate is at its highest average rate of 27.36%, the Bounce Rate during the experiment period gradually drops and has a lower value (on average) than the Bounce Rate during the non-experiment period.


Figure 5.5: Bounce Rate during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)

Experiment date  Bounce Rate   Non-experiment date  Bounce Rate   Change
25/12/2016       27.36%        02/01/2017           11.15%        -145.27%
26/12/2016       15.58%        03/01/2017           10.94%        -42.44%
27/12/2016       11.49%        04/01/2017           12.10%        5.00%
28/12/2016       13.44%        05/01/2017           13.52%        0.59%
29/12/2016       12.95%        06/01/2017           13.96%        7.29%
30/12/2016       12.06%        07/01/2017           15.10%        20.10%
31/12/2016       12.27%        08/01/2017           17.91%        31.49%
01/01/2017       12.97%        09/01/2017           16.04%        19.17%

Table 5.4: Detail of Bounce Rate during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)

However, analysing only the Bounce Rate is not sufficient to deduce an improvement in user engagement. Therefore, we also analyse the Pages/Session and Average Session Duration metrics, as these two are essential metrics for evaluating user engagement performance.

Figure 5.6: Pages/Session during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)


Experiment date  Pages/Session   Non-experiment date  Pages/Session   Change
25/12/2016       1.58            02/01/2017           1.63            -3.25%
26/12/2016       1.58            03/01/2017           1.74            -8.98%
27/12/2016       1.68            04/01/2017           1.72            -2.35%
28/12/2016       1.77            05/01/2017           1.63            8.35%
29/12/2016       1.76            06/01/2017           1.62            8.73%
30/12/2016       1.80            07/01/2017           1.67            7.69%
31/12/2016       1.71            08/01/2017           1.59            7.69%
01/01/2017       1.73            09/01/2017           1.57            10.77%

Table 5.5: Detail of Pages/Session during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)

On the Pages/Session performance, we can see that users were more likely to browse and continue loading pages during the experiment period than during the non-experiment period, with a 3.22% higher value (1.70 Pages/Session during the experiment period versus 1.65 Pages/Session during the non-experiment period). Although the test result during the experiment period converted 15.8% better than during the non-experiment period, as we can see in Figure 5.7, the 3.22% improvement was not significant at p > 0.05¹. The evaluation of Average Session Duration (Figure 5.8 and Table 5.6) indicates that the experiment period improves over the non-experiment period by 9.73% on average (00:02:13 during the experiment period versus 00:02:01 during the non-experiment period).

¹ Significance was calculated using the tool for A/B significance tests provided at [4, 5].


Figure 5.7: A/B Significance Test for Pages/Session. The calculation was computed with 384,332 visitors and 1.65 as conversion data for Test Non-Experiment (control), and 341,808 visitors and 1.70 as conversion data for Test Experiment (variant). Relative improvement of the conversion rate of the variant (B) over the control (A): 15.8%. The test result was not significant at the 95% significance level (1-sided), where p > 0.05 (p-value = 45%).

Figure 5.8: Avg. Session Duration during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)

Lastly, to support an improvement in user engagement on Keepo.me, Figure 5.9 shows the number of pages users continued to load. It indicates that, for the goal of letting users load 3 or more pages, the experiment period dominates the non-experiment period.


Experiment date  Avg. Session Duration   Non-experiment date  Avg. Session Duration   Change
25/12/2016       00:01:50                02/01/2017           00:02:09                -14.85%
26/12/2016       00:01:52                03/01/2017           00:02:17                -18.03%
27/12/2016       00:02:21                04/01/2017           00:02:08                10.10%
28/12/2016       00:02:20                05/01/2017           00:01:56                20.91%
29/12/2016       00:02:21                06/01/2017           00:01:53                24.95%
30/12/2016       00:02:28                07/01/2017           00:02:07                16.62%
31/12/2016       00:02:15                08/01/2017           00:01:54                17.90%
01/01/2017       00:02:19                09/01/2017           00:01:48                28.23%

Table 5.6: Detail of Avg. Session Duration during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)

Figure 5.9: Page Depth during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)


Finally, during the experiment period, the goal (read 3 pages) conversion rate increased by 9.20% on average (6.37% during the experiment period versus 5.83% during the non-experiment period). The significance test on the goal completions between the experiment period and the non-experiment period shows that the result was significant at p < 0.05, with an improvement of 7% (Figure 5.11)². This conversion rate result implies that implementing Recommender Systems as organic recirculation could improve user engagement on Keepo.me.

Figure 5.10: Conversion Rate during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)

Experiment date  Conversion Rate   Non-experiment date  Conversion Rate   Change
25/12/2016       5.09%             02/01/2017           6.05%             -15.83%
26/12/2016       5.87%             03/01/2017           6.36%             -7.70%
27/12/2016       6.40%             04/01/2017           6.14%             4.33%
28/12/2016       6.71%             05/01/2017           5.53%             21.40%
29/12/2016       6.69%             06/01/2017           5.72%             16.96%
30/12/2016       7.17%             07/01/2017           6.31%             13.66%
31/12/2016       6.33%             08/01/2017           5.46%             15.98%
01/01/2017       6.82%             09/01/2017           5.34%             27.64%

Table 5.7: Detail of Conversion Rate during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)

² Significance was calculated using the tool for A/B significance tests provided at [4, 5].


Figure 5.11: A/B Significance Test for Goal (read 3 pages) Completion. The calculation was computed with 384,332 visitors and 33,270 conversions for Test Non-Experiment (control), and 341,808 visitors and 31,653 conversions for Test Experiment (variant). Relative improvement of the conversion rate of the variant (B) over the control (A): 7%. The test result was significant at the 95% level (1-sided), where p < 0.05 (p-value = 0%).
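The kind of calculation these A/B tools [4, 5] perform can be sketched as a one-sided two-proportion z-test on the visitor and conversion counts from Figure 5.11; this is our own re-computation, not the tools' exact implementation:

from math import sqrt
from statistics import NormalDist

def ab_significance(n_control, conv_control, n_variant, conv_variant):
    # One-sided two-proportion z-test: is the variant's conversion rate higher than the control's?
    p_c, p_v = conv_control / n_control, conv_variant / n_variant
    p_pool = (conv_control + conv_variant) / (n_control + n_variant)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_variant))
    z = (p_v - p_c) / se
    return p_c, p_v, z, 1 - NormalDist().cdf(z)   # last value is the one-sided p-value

# Counts from Figure 5.11: non-experiment period (control) vs. experiment period (variant)
print(ab_significance(384332, 33270, 341808, 31653))
# the relative improvement p_v / p_c - 1 is about 7%, and the one-sided p-value is effectively 0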

As can be seen in Figure 5.12 and Table 5.8, during the experiment period Keepo.me suffered from notably slow server response times. The server response time during the experiment period was 88.37% slower than during the non-experiment period (2.64s during the experiment period versus 1.40s during the non-experiment period). This drawback is a consequence of implementing Recommender Systems, whose processes require additional computational time.

Figure 5.12: Avg. Server Response Time during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)


Experiment date  Avg. Server Response Time (sec)   Non-experiment date  Avg. Server Response Time (sec)   Change
25/12/2016       1.48                              02/01/2017           3.62                              -59.20%
26/12/2016       3.05                              03/01/2017           0.89                              241.48%
27/12/2016       2.58                              04/01/2017           0.37                              589.92%
28/12/2016       1.59                              05/01/2017           1.38                              15.37%
29/12/2016       2.12                              06/01/2017           1.67                              27.12%
30/12/2016       3.38                              07/01/2017           1.12                              201.69%
31/12/2016       2.26                              08/01/2017           1.76                              28.06%
01/01/2017       4.33                              09/01/2017           1.60                              170.50%

Table 5.8: Detail of Avg. Server Response Time during the experiment period (25/12/2016 to 01/01/2017) vs. non-experiment period (02/01/2017 to 09/01/2017)


Chapter 6

Conclusion

6.1 Discussion

This thesis set out to support Keepo.me in increasing their revenues and optimizing content campaigns by making better use of their data. Since we implemented Recommender Systems as organic recirculation, the insights obtained from the experimental results provide user engagement improvement guidelines for Keepo.me. We adopted the Content-based Filtering technique as well as the Contextual User Modeling technique for the Recommender Systems, where the Contextual User Modeling used a Topic Modeling approach for its Post-filtering method. For the evaluation of the Recommender Systems, we used Team-Draft Multileaving.

6.1.1 Recommender systems

The insight provided in Section 5.1 shows that during the experiment period, the users preferred the items generated by the Baseline procedure, with 34.5% of the Multileaving evaluation result, over the recommended items originating from either the Content-based Filtering or the Contextual User Modeling process. The significance test results indicate no significant differences between the winners (Baseline, Content-based Filtering, and Contextual User Modeling). Thus, although the users most likely preferred the Baseline system, it does not necessarily mean that they did not appreciate Content-based Filtering or Contextual User Modeling. One conclusion that could be drawn from this result is that the combined item lists generated by Team-draft multileaving could be biased. As mentioned in [23], a best practice for Team-draft Multileaving is that the list should be long enough to represent each team's items. In the case of Keepo.me, since users tend to visit the article page, which only shows 5 items of the combined items list, the recommended items from the Baseline system could dominate the list.


Let us now consider the context-aware Recommender System, which adopted Contextual User Modeling with Topic Modeling using Post-filtering. Even though 74.1% of Keepo.me's users use a mobile device, where contextual information could be considered a useful aspect for providing relevant content, the Contextual User Modeling method was preferred the least in our experiment. This does not necessarily indicate that implementing contextual information as context awareness in a Recommender System is useless. Many challenges should be taken into account when incorporating contextual information into a Recommender System, and further investigation is advised.

The low F-measure scores observed in Section 5.1 might indicate that the retrieved items from Content-based Filtering and Contextual User Modeling were not sufficiently appreciated. However, the way this F-measure was measured might be biased, since the retrieved items were evaluated by only one judge. Every user has their own preferences regarding the relevancy of the content; therefore, judging the relevancy of the items with only one judge could produce a subjective result. In further research it would be better to integrate a user survey as explicit feedback, to find out whether the recommended items are indeed relevant to the users.

Regarding the cost of producing recommendations, Figure 5.4 reports that the processing time was notably high. The mean processing time of 2.759s indicates that the Recommender Systems were inefficient and induced high response times; together with the other processes in Keepo.me's system, the data sent back to the user took approximately more than 3s. The required computing time ate into the users' time to load a single page, which might lead to abandoning the page before it is completely loaded, as users are typically not willing to wait more than 2s for a page to load. Furthermore, Google PageSpeed recommends that all processing on the server should take less than 200ms [2]. One major factor affecting the computing time is the high computational cost of querying a large amount of users' contextual data. To estimate the valid topics (already generated by Topic Modeling) given the user's contextual information, we need to collect a larger sample of contextual information data. As the contextual data grows bigger and bigger, it becomes more difficult and more computationally expensive to retrieve the relevant results.

6.1.2 User engagement

From the insight into user engagement provided in Section 5.2, the results indicate that during the experiment period the user engagement on Keepo.me increased (although not significantly). The Bounce Rate was reduced by 5.21%, which contributed to the improvement of the Pages per Session and Average Session Duration metrics by 3.22% and 9.73%, respectively. Moreover, as shown in Figure 5.10, the Conversion Rate increased by 9.20% and the number of users who visited more than 3 pages increased as well. Unfortunately, this improvement cannot yet serve as evidence that implementing Recommender Systems as organic recirculation gives a greater impact on user engagement, since we conducted the experiment for only 1 week. To find out whether Keepo.me's user engagement could compete with its competitors, the experiment should be run for at least 2 to 3 months. Still, given the insights from this evaluation, implementing Recommender Systems could improve the user engagement that Keepo.me needs in order to increase their business revenues.

As shown in Figure 5.12, during the experiment period the users suffered from slow server response times. This means that User Experience is sacrificed to accommodate the Recommender System processes. A fast response time is one of the crucial aspects of User Experience, which in turn is an essential element of user engagement improvement. No matter how good the Bounce Rate, Pages per Session, and Average Session Duration metrics are, if the system has a slow response time the user engagement metrics could drop: users could lose their intention to load another page when pages take too long to load. It is interesting for further research to investigate how to lower the high computing time of the Recommender System processes.

6.2 Lessons learned

Personalisation is not easy; one cannot simply implement some personalisation approaches and just run the systems. Personalisation needs a deep understanding of, and a comprehensive strategy for, user behaviour, content marketing and social media monitoring, and integration and data mining. Understanding the users' mind-sets should be taken into account as well: it helps to know how users navigate the site, make decisions, and respond. However, if personalisation is done correctly and well configured, it could improve user satisfaction.

With respect to context awareness, some challenges should be considered when developing a Context-Aware Recommender System (CARS). As explained in [28], these challenges include:

• The process of CARS. Here, Yujie et al. explain that the first step in the CARS process is to make sure the gathered data is valuable and sufficient. However, collecting enough useful data to predict the user preference accurately increases the amount of collected data; in other words, it is resource expensive. Consequently, the processing time to make the prediction will also be higher, i.e. it is expensive in computing time.

• Valid context discovering and computing. The context types to be implemented in a Recommender System should be discovered as validly as possible. When the correct context types related to the application have been identified, they can improve the accuracy of the CARS as well as reduce the computing time. However, estimating the valid context types needs complex algorithms and data mining.

• Understanding user behaviour with context history and eliciting more accurate contextual user preference. Yujie et al. explain that several aspects need to be taken into account to understand user behaviour. The context history is one of the most important aspects to be addressed in this challenge. However, it is hard to estimate the main factors that affect contextual user preference elicitation, as this needs large amounts of data and high computational complexity. Another aspect is that it is challenging to adapt to contextual user preference when the contextual information changes frequently.

• Taking account of interdisciplinary research. Incorporating contextual information is not enough. The best practice for estimating valid context types accurately is to take several factors into consideration, including social network relationships, human decision behaviour, user psychology, emotions, the user interface, and other user-oriented issues.

• Privacy and security. Users' personal data is a serious issue in context-aware applications. Users tend to reject the permissions that allow the system to aggregate their contextual information. Consequently, a Recommender System that incorporates contextual information may not be able to estimate the ranking of the items accurately.

These challenges should be investigated and understood before building Recommender Systems that incorporate contextual information. The data collection should be well organised, structured, and optimised in order to reduce its complexity.

6.3 Final Remark

This thesis is intended to give the necessary background information for improving the user engagement on Keepo.me. The results of the evaluation of both the Recommender Systems and the User Engagement showed that integrating Recommender Systems along with context-aware methods gives positive promise for improving user engagement. However, some challenges should be taken into consideration to achieve a greater impact on user engagement and to produce more accurate, relevant information with the Recommender Systems. Chapter 7 summarizes possible enhancements of the Recommender Systems applied and the most promising topics for future research.


Chapter 7

Future Work

While working on this thesis, we have studied a variety of knowledge sources regarding Recommender Systems and contextual awareness. Unfortunately, not all topics that require further study could be covered in the available amount of time. From the results we have gathered, we identified the most promising topics for further research, which might enhance and improve the Recommender Systems' methodology used in this research.

It is interesting to investigate the performance of the Recommender System further by incorporating Collaborative Filtering rather than just Content-based Filtering. Incorporating Collaborative Filtering as well as Content-based Filtering does not necessarily mean comparing their performance against each other; Collaborative Filtering and Content-based Filtering each have their own advantages and disadvantages. Content-based Filtering is expected to perform well because a content provider like Keepo.me does not have a rating system for their content. On the other hand, modeling the user-to-user or item-to-item relationships as in Collaborative Filtering approaches might result in better recommendations. Therefore, it would be best to understand users' behaviour (how they navigate the site and how they respond) beforehand, and to investigate Hybrid Filtering. However, adopting Hybrid Filtering could require more sophisticated computational resources, so there should be a detailed strategy for data management and data structures before incorporating Hybrid Filtering.

With respect to contextual awareness, as we have mentioned in Chapter 6 following [28], the challenges should be taken into consideration as the main factors for further research on contextual awareness: collecting useful and sufficient contextual information, finding the correct and valid context types, and putting more work into investigating users' behaviour, including social network relationships, human decision behaviour, user psychology, emotions, the user interface, and other user-oriented issues.


As a final idea, it is interesting to follow up with research on reducing the time consumption of the Recommender System processes. Time is one of the most important elements of User Experience; hence, improving the computing time for producing recommended items will increase user satisfaction.


Bibliography

[1] Recommender Systems, 2012. URL http://recommender-systems.org/. Accessed on: 14-02-2017.

[2] Improve Server Response Time | PageSpeed Insights, 2015. URL https://developers.google.com/speed/docs/insights/Server. Accessed on: 14-02-2017.

[3] Bounce Rate - Analytics Help, 2016. URL https://support.google.com/analytics/answer/1009409?hl=en. Accessed on: 26-11-2016.

[4] A/B-Test Calculator - Power & Significance, 2017. URL https://abtestguide.com/calc/. Accessed on: 15-02-2017.

[5] A/B Significance Test, 2017. URL http://getdatadriven.com. Accessed on: 15-02-2017.

[6] G. D. Abowd, A. K. Dey, P. J. Brown, N. Davies, M. Smith, and P. Steggles. Towards a Better Understanding of Context and Context-Awareness. In Proceedings of the 1st International Symposium on Handheld and Ubiquitous Computing, HUC '99, pages 304–307, London, UK, 1999. Springer-Verlag. ISBN 978-3-540-66550-2. URL http://dl.acm.org/citation.cfm?id=647985.743843.

[7] G. Adomavicius, R. Sankaranarayanan, S. Sen, and A. Tuzhilin. Incorporating Contextual Information in Recommender Systems Using a Multidimensional Approach. ACM Trans. Inf. Syst., 23(1):103–145, Jan. 2005. ISSN 1046-8188. doi: 10.1145/1055709.1055714. URL http://doi.acm.org/10.1145/1055709.1055714.

[8] G. Adomavicius, B. Mobasher, F. Ricci, and A. Tuzhilin. Context-Aware Recommender Systems. AI Magazine, 32(3):67–80, 2011. ISSN 0738-4602. doi: 10.1609/aimag.v32i3.2364. URL http://www.aaai.org/ojs/index.php/aimagazine/article/view/2364.

[9] A. Ansari, S. Essegaier, and R. Kohli. Internet Recommendation Systems. Journal of Marketing Research, 37(3):363–375, Aug. 2000. ISSN 0022-2437. doi: 10.1509/jmkr.37.3.363.18779. URL http://journals.ama.org/doi/abs/10.1509/jmkr.37.3.363.18779.


[10] D. M. Blei. Probabilistic Topic Models. Commun. ACM, 55(4):77–84, Apr. 2012. ISSN 0001-0782. doi: 10.1145/2133806.2133826. URL http://doi.acm.org/10.1145/2133806.2133826.

[11] O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue. Large-scale Validation and Analysis of Interleaved Search Evaluation. ACM Trans. Inf. Syst., 30(1):6:1–6:41, Mar. 2012. ISSN 1046-8188. doi: 10.1145/2094072.2094078. URL http://doi.acm.org/10.1145/2094072.2094078.

[12] L. Cosseboom. Keepo is a mix between BuzzFeed, WordPress, and Twitter in Indonesia, 2015. URL https://www.techinasia.com/indonesia-keepo-indie-publishing-social-network. Accessed on: 27-11-2016.

[13] W. Emy. Consumer-aware, context-aware, 2014. URL http://www.niemanlab.org/2014/12/consumer-aware-context-aware/. Accessed on: 13-10-2016.

[14] R. Fishkin. The Traffic Prediction Accuracy of 12 Metrics from Compete, Alexa, SimilarWeb, & More, June 2015. URL https://moz.com/rand/traffic-prediction-accuracy-12-metrics-compete-alexa-similarweb/. Accessed on: 25-11-2016.

[15] P. Ingwersen and K. Järvelin. The Turn: Integration of Information Seeking and Retrieval in Context, volume 18 of The Information Retrieval Series. Springer-Verlag, Berlin/Heidelberg, 2005. ISBN 978-1-4020-3850-1. URL http://link.springer.com/10.1007/1-4020-3851-8.

[16] F. O. Isinkaye, Y. O. Folajimi, and B. A. Ojokoh. Recommendation systems: Principles, methods and evaluation. Egyptian Informatics Journal, 16(3):261–273, Nov. 2015. ISSN 1110-8665. doi: 10.1016/j.eij.2015.06.005. URL http://www.sciencedirect.com/science/article/pii/S1110866515000341.

[17] T. Joachims. Evaluating Retrieval Performance using Clickthrough Data. In J. Franke, G. Nakhaeizadeh, and I. Renz, editors, Text Mining, pages 79–96. Physica/Springer Verlag, 2003.

[18] M. P. Kato. Interleaving, 2016. URL https://github.com/mpkato/interleaving. Accessed on: 23-11-2016.

[19] A. K. McCallum. MALLET: A Machine Learning for Language Toolkit, 2002. URL http://mallet.cs.umass.edu. Accessed on: 13-10-2016.

[20] C. Muller. Everything You Need To Know About Pages-Per-Session, Feb. 2016. URL http://blog.taboola.com/everything-you-need-to-know-about-pages-per-session/. Accessed on: 13-02-2017.


[21] J. Peyton. What's the Average Bounce Rate for a Website?, Feb. 2014. URL http://www.gorocketfuel.com/the-rocket-blog/whats-the-average-bounce-rate-in-google-analytics/. Accessed on: 14-02-2017.

[22] F. Radlinski, M. Kurup, and T. Joachims. How Does Clickthrough Data Reflect Retrieval Quality? In Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM '08, pages 43–52, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-991-3. doi: 10.1145/1458082.1458092. URL http://doi.acm.org/10.1145/1458082.1458092.

[23] A. Schuth. Search Engines that Learn from Their Users. PhD thesis, Informatics Institute, University of Amsterdam, May 2016. URL http://www.anneschuth.nl/thesis.

[24] A. Schuth, F. Sietsma, S. Whiteson, D. Lefortier, and M. de Rijke. Multileaved Comparisons for Fast Online Evaluation. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, CIKM '14, pages 71–80, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2598-1. doi: 10.1145/2661829.2661952. URL http://doi.acm.org/10.1145/2661829.2661952.

[25] A. Schuth, K. Hofmann, and F. Radlinski. Predicting Search Satisfaction Metrics with Interleaved Comparisons. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '15, pages 463–472, New York, NY, USA, 2015. ACM. ISBN 978-1-4503-3621-5. doi: 10.1145/2766462.2767695. URL http://doi.acm.org/10.1145/2766462.2767695.

[26] D. Shin, J.-w. Lee, J. Yeon, and S.-g. Lee. Context-Aware Recommendation by Aggregating User Context. Pages 423–430. IEEE, July 2009. ISBN 978-0-7695-3755-9. doi: 10.1109/CEC.2009.38. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5210763.

[27] C. G. Veness. Selecting points from a database by latitude/longitude within a bounding circle, 2016. URL http://www.movable-type.co.uk/scripts/latlong-db.html. Accessed on: 13-11-2016.

[28] Z. Yujie and W. Licai. Some challenges for context-aware recommender systems. In 2010 5th International Conference on Computer Science Education, pages 362–365, Aug. 2010. doi: 10.1109/ICCSE.2010.5593612.


Appendix A

Content-based filtering strategy script

1  // -------------------
2  // Extract top-100 popular items from the last 7 days
3  // -------------------
4
5  $date7days = date('Y-m-d', strtotime('-7 days'));
6  $dateNow = date('Y-m-d');
7
8  $posts = \DB::select(\DB::raw("SELECT `posts`.`id`, `posts`.`title`, `posts`.`slug`, `posts`.`content`, `posts`.`post_type`, SUM(`loggers`.`views`) as `total` FROM `loggers` LEFT JOIN `posts` ON `posts`.`id` = `loggers`.`post_id` WHERE `loggers`.`created_on` >= '" . $date7days . "' AND `loggers`.`created_on` <= '" . $dateNow . "' AND `loggers`.`post_id` != 0 AND `posts`.`id` != " . $this->lastPost[0]->id . " AND `posts`.`status` = 1 GROUP BY `loggers`.`post_id` ORDER BY `total` DESC LIMIT 100"));
9
10 /*
11  *
12  * ... other code ...
13  *
14  */
15
16 // -------------------
17 // Filter the items according to the accessed page (category page, channel page, or all page)
18 // -------------------
19
20 if (! empty($this->category))
21 { $postData = \DB::select("SELECT `posts`.* FROM `posts` LEFT JOIN `channels` ON `channels`.`id` = `posts`.`channel_id` LEFT JOIN `categories` ON `categories`.`id` = `channels`.`category_id` WHERE `categories`.`slug` = ? AND `posts`.`slug` IN (" . implode(',', $postURL) . ")", [$this->category]); }
22 elseif (! empty($this->channel))
23 { $postData = \DB::select("SELECT `posts`.* FROM `posts` LEFT JOIN `channels` ON `channels`.`id` = `posts`.`channel_id` WHERE `channels`.`slug` = ? AND `posts`.`slug` IN (" . implode(',', $postURL) . ")", [$this->channel]); }
24 else
25 { $postData = \DB::select("SELECT * FROM `posts` WHERE `slug` IN (" . implode(',', $postURL) . ")"); }
26
27 /*
28  *
29  * ... other code ...
30  *
31  */
32
33 // -------------------
34 // How similar are these posts to the last post that the user has read previously?
35 // -------------------
36
37 $tokenizer = new WhitespaceTokenizer();
38 $cosine = new CosineSimilarity();
39 $originalContent = $tokenizer->tokenize(strip_tags($this->cleanUpContent($this->lastPost[0])));
40
41 $data = [];
42 foreach ($postData as $postRetrieved)
43 {
44     // Clean up content
45     $postContent = strip_tags($this->cleanUpContent($postRetrieved));
46     if (empty($postContent)) { continue; }
47
48     // Calculate the similarity
49     $postContent = $tokenizer->tokenize($postContent);
50     $cos = $cosine->similarity($originalContent, $postContent);
51
52     $data[$postRetrieved->id] = $cos;
53 }
54
55 // Sort for top ranking
56 arsort($data);

We need to filter out items that are unrelated to the page the user is visiting (e.g., when the user visits the category page “Fun & Humor”, the recommended items should belong to the category “Fun & Humor” as well). Lines 20 to 25 of the listing above therefore filter out the items that do not match the current category or channel.
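To make the similarity step easier to follow outside the PHP code base, the listing below is a minimal Python sketch of the same idea as lines 37 to 56 above: whitespace tokenization, cosine similarity between the term-frequency vectors of the last read article and each candidate, and sorting by score. The function and variable names are ours and purely illustrative; they do not occur in the Keepo.me code base, and raw term frequencies are assumed for simplicity.

from collections import Counter
from math import sqrt

def cosine_similarity(tokens_a, tokens_b):
    # Cosine similarity between two token lists, using raw term frequencies
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(tf_a[t] * tf_b[t] for t in set(tf_a) & set(tf_b))
    norm_a = sqrt(sum(v * v for v in tf_a.values()))
    norm_b = sqrt(sum(v * v for v in tf_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_candidates(last_article_text, candidates):
    # `candidates` maps an article id to its plain-text content (tags stripped);
    # returns (id, score) pairs, most similar first, like arsort($data) above
    original = last_article_text.split()  # whitespace tokenization
    scores = {}
    for article_id, text in candidates.items():
        if not text:
            continue
        scores[article_id] = cosine_similarity(original, text.split())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)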


Appendix B

Contextual user modeling strategy script

1  // -------------------
2  // Determine current contextual time
3  // -------------------
4
5  /*
6  ---------------------------------------------
7  Context | Condition | Range of Values |
8  ---------------------------------------------
9  Pd      | Morning   | 07:00 - 11:59   |
10         | Noon      | 12:00 - 14:59   |
11         | Evening   | 15:00 - 20:59   |
12         | Night     | 21:00 - 06:59   |
13 ---------------------------------------------
14 */
15
16 $timeNow = date('H', time());
17 switch ($timeNow)
18 {
19     // Morning
20     case ($timeNow >= 7 AND $timeNow < 12):
21         $contextualTimeFrom = '07:00:00';
22         $contextualTimeTo = '11:59:59';
23         break;
24     // Noon
25     case ($timeNow >= 12 AND $timeNow < 15):
26         $contextualTimeFrom = '12:00:00';
27         $contextualTimeTo = '14:59:59';
28         break;
29     // Evening
30     case ($timeNow >= 15 AND $timeNow < 21):
31         $contextualTimeFrom = '15:00:00';
32         $contextualTimeTo = '20:59:59';
33         break;
34     // Night
35     case ($timeNow >= 21 AND $timeNow <= 23):
36         $contextualTimeFrom = '21:00:00';
37         $contextualTimeTo = '23:59:59';
38         break;
39     // Night
40     case ($timeNow >= 0 AND $timeNow <= 6):
41     default:
42         $contextualTimeFrom = '00:00:00';
43         $contextualTimeTo = '06:59:59';
44         break;
45 }
46 $bind = [$contextualTimeFrom, $contextualTimeTo];
47
48 // -------------------
49 // Determine current contextual location
50 // -------------------
51
52 // If location is known
53 if (! empty($this->latitude) AND ! empty($this->longitude))
54 {
55     // First-cut bounding box (in degrees)
56     // See: http://www.movable-type.co.uk/scripts/latlong-db.html
57     $rad = 5;   // 5km bounding
58     $R = 6371;  // radius of the earth (6371km)
59     $maxLat = $this->latitude + rad2deg($rad/$R);
60     $minLat = $this->latitude - rad2deg($rad/$R);
61     $maxLon = $this->longitude + rad2deg(asin($rad/$R) / cos(deg2rad($this->latitude)));
62     $minLon = $this->longitude - rad2deg(asin($rad/$R) / cos(deg2rad($this->latitude)));
63
64     $sqlLoc = " AND ((`latitude` BETWEEN ? AND ?) AND (`longitude` BETWEEN ? AND ?))";
65     $bind = array_merge($bind, [$minLat, $maxLat, $minLon, $maxLon]);
66 }
67
68 // -------------------
69 // Apply SQL
70 // -------------------
71
72 $bind[] = $this->cookieID;
73 $sql = "SELECT count(`id`) as `total`, `url` FROM `cars_log` WHERE (TIME(`time`) BETWEEN TIME(?) AND TIME(?)) " . @$sqlLoc . " AND `cookie_id` = ? GROUP BY `url` ORDER BY `total` DESC LIMIT 100";
74 $posts = \DB::select($sql, $bind);
75
76 /*
77  *
78  * ... other code ...
79  *
80  */
81
82 // -------------------
83 // Filter the items according to the accessed page (category page, channel page, or all page)
84 // -------------------
85
86 if (! empty($this->category))
87 { $postData = \DB::select("SELECT `posts`.* FROM `posts` LEFT JOIN `channels` ON `channels`.`id` = `posts`.`channel_id` LEFT JOIN `categories` ON `categories`.`id` = `channels`.`category_id` WHERE `categories`.`slug` = ? AND `posts`.`slug` IN (" . implode(',', $postURL) . ")", [$this->category]); }
88 elseif (! empty($this->channel))
89 { $postData = \DB::select("SELECT `posts`.* FROM `posts` LEFT JOIN `channels` ON `channels`.`id` = `posts`.`channel_id` WHERE `channels`.`slug` = ? AND `posts`.`slug` IN (" . implode(',', $postURL) . ")", [$this->channel]); }
90 else
91 { $postData = \DB::select("SELECT * FROM `posts` WHERE `slug` IN (" . implode(',', $postURL) . ")"); }
92
93
94 // -------------------
95 // Determine topics
96 // -------------------
97
98 $topicArr = [];
99 foreach ($postData as $post)
100 {
101     $topic = \DB::table('compositions')->where('key', $post->slug)->get();
102     if (empty($topic)) { continue; }
103
104     $topics = json_decode($topic[0]->topics);
105     $topics = (array) $topics->topics;
106     reset($topics);
107
108     $topicArr[] = key($topics);
109 }
110
111 $topicArr = array_count_values($topicArr);
112 arsort($topicArr);
113
114 // Take top-10 most read topics
115 $temp = []; $counter = 0;
116 foreach ($topicArr as $key => $value)
117 {
118     if ($counter >= 10) { break; }
119     $temp[] = $key;
120     $counter++;
121 }
122 $topicArr = $temp;
123
124
125 // -------------------
126 // Post-Filtering
127 // Filtering items from Content-Based with the obtained topics
128 // -------------------
129
130 $data = [];
131 foreach ($this->CB as $key => $post)
132 {
133     $topic = \DB::table('compositions')->where('key', $post['slug'])->get();
134     if (empty($topic)) { continue; }
135
136     $topics = json_decode($topic[0]->topics);
137     $topics = (array) $topics->topics;
138     reset($topics);
139     $topicKey = key($topics);
140
141     // -------------------
142
143     if (in_array($topicKey, $topicArr) AND !in_array($key, $data))
144     { $data[] = $key; }
145 }

Lines 1 to 66 describe how candidate items are extracted given the user's contextual time and location. Similar to the Content-based filtering strategy (Appendix A), lines 86 to 91 filter the extracted items according to the accessed page. Lines 98 to 122 then extract the topics of the items retrieved on lines 1 to 66 and keep the ten most frequently occurring topics. Finally, lines 130 to 145 filter out the Content-based candidates whose dominant topic is not among the topics extracted on lines 98 to 122.
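As with Appendix A, a compact Python sketch may help to see the contextual logic at a glance. It mirrors two steps of the listing: mapping the current hour to one of the four contextual time slots (lines 16 to 45) and post-filtering the Content-based candidates with the ten most frequently read topics in the current context (lines 98 to 145). All names here are illustrative assumptions; the actual implementation is the PHP code above.

from collections import Counter
from datetime import datetime

# The four contextual time slots used in the listing above
DAYPARTS = {
    'morning': range(7, 12),
    'noon':    range(12, 15),
    'evening': range(15, 21),
    'night':   list(range(21, 24)) + list(range(0, 7)),
}

def current_daypart(now=None):
    # Map the current hour to one of the four contextual time slots
    hour = (now or datetime.now()).hour
    for name, hours in DAYPARTS.items():
        if hour in hours:
            return name
    return 'night'

def contextual_post_filter(contextual_reads, candidate_topics, top_n=10):
    # `contextual_reads` maps articles read in the current context (same time
    # slot and, if known, nearby location) to their dominant topic;
    # `candidate_topics` maps Content-based candidate ids to their dominant topic.
    # Keep only candidates whose topic is among the top_n most read topics.
    top_topics = {t for t, _ in Counter(contextual_reads.values()).most_common(top_n)}
    return [cid for cid, topic in candidate_topics.items() if topic in top_topics]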


Appendix C

Team-draft Multileaving script

#!/usr/bin/python

#
# Other code
#

try:
    opts, args = getopt.getopt(argv, ":A:B:C:o:", ["rankA=", "rankB=", "rankC=", "output="])

    for opt, arg in opts:
        if opt in ("-A", "--rankA"):
            rankA = ast.literal_eval(arg)
        elif opt in ("-B", "--rankB"):
            rankB = ast.literal_eval(arg)
        elif opt in ("-C", "--rankC"):
            rankC = ast.literal_eval(arg)
        elif opt in ("-o", "--output"):
            output = str(arg)

except Exception:
    print 'invalid'
    sys.exit()

# Set interleaving
method = interleaving.TeamDraft([rankA, rankB, rankC])
ranking = method.interleave()

# Store ranking to file
with open('/var/www/interleaving/data3/' + output + '.rank', 'wb') as f:
    pickle.dump(ranking, f, pickle.HIGHEST_PROTOCOL)

print ranking

#
# Other code
#
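The TeamDraft class used above comes from the interleaving package [18]. To illustrate what the multileave step produces, the sketch below is a from-scratch Python rendering of the team-draft drafting procedure described by Schuth et al. [24]; it is not the package's implementation, and the function name is ours.

import random

def team_draft_multileave(rankings, length=10):
    # Rankers take turns, in a random order per round, contributing their
    # highest-ranked item that is not yet in the multileaved list.
    # Returns the multileaved list and, per ranker, the items it contributed.
    multileaved = []
    teams = [set() for _ in rankings]
    while len(multileaved) < length:
        order = list(range(len(rankings)))
        random.shuffle(order)
        progressed = False
        for r in order:
            candidate = next((d for d in rankings[r] if d not in multileaved), None)
            if candidate is None:
                continue
            multileaved.append(candidate)
            teams[r].add(candidate)
            progressed = True
            if len(multileaved) >= length:
                break
        if not progressed:  # all rankings exhausted
            break
    return multileaved, teams

# Example with three rankers over the same pool of article ids
rankA = [1, 2, 3, 4, 5]
rankB = [2, 1, 4, 3, 5]
rankC = [5, 4, 3, 2, 1]
combined, teams = team_draft_multileave([rankA, rankB, rankC], length=5)

In team-draft scoring, a click on an item in the combined list is credited to the ranker whose team contributed that item, which is the basis for the preference pairs produced by the evaluation script in Appendix D.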


Appendix D

Team-draft Multileaving evaluation script

#!/usr/bin/python

#
# Other code
#

def ev(input, file):
    # Check if file exists
    if (os.path.isfile('/var/www/interleaving/data3/' + input[1] + '.rank') == False):
        return

    try:
        # Read stored rank
        with open('/var/www/interleaving/data3/' + input[1] + '.rank', 'rb') as f:
            ranking = pickle.load(f)

        # Evaluate multileaving
        result = interleaving.TeamDraft.evaluate(ranking, input[0])

        # Write to file
        with open('/var/www/interleaving/data_eval3/eval_' + file, 'ab') as f:
            f.write(json.dumps(result) + "\n")

    except Exception, e:
        print 'failed :' + input[1] + ' ; ' + str(e)
        result = None

#
# Other code
#
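Each line written by the evaluation script above is the JSON-serialised result of TeamDraft.evaluate for one impression: a list of ranker-index pairs, which we read as (winner, loser), with 0, 1, 2 corresponding to the rankings A, B, C passed to the multileaving script in Appendix C. The sketch below shows one assumed way to aggregate such a file into per-outcome counts of the kind reported in Appendix F; it is our post-processing sketch, not code from the thesis pipeline.

import json
from collections import Counter

def count_outcomes(eval_path):
    # Count how often each evaluation outcome occurs in a file of JSON lines,
    # where every line is a list of preference pairs such as [[0, 1], [0, 2]]
    # (read here as: ranker 0 beat rankers 1 and 2 for that impression).
    counts = Counter()
    with open(eval_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            pairs = json.loads(line)
            key = tuple(sorted(tuple(p) for p in pairs))  # order-independent outcome key
            counts[key] += 1
    return counts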


Appendix E

Stopword list

ada antum begini boleh dia

adanya apa beginian bolehkah dialah

adalah apaan beginikah bolehlah diantara

adapun apabila beginilah buat diantaranya

agak apakah begitu bukan dikarenakan

agaknya apalagi begitukah bukankah dini

agan apatah begitulah bukanlah diri

agan-agan atau begitupun bukannya dirinya

agar ataukah belum cuma disini

aja ataupun belumlah dahulu disinilah

akan bagai berapa dalam dong

akankah bagaikan berapakah dan dulu

akhirnya bagaimana berapalah dapat enggak

aku bagaimanakah berapapun dari enggaknya

akulah bagaimanapun bermacam daripada entah

amat bagi bersama deh entahlah

amatlah bahkan betulkah dekat ente

anda bahwa biasa demi gak

andalah bahwasanya biasanya demikian gan

ane bakal bila demikianlah gimana

antar banget bilakah dengan gini

antara banyak bisa depan gitu

antaranya beberapa bisakah di gue


gw kalau lah meski sama

hal kalaulah lain meskipun sambil

hampir kalaupun lainnya mungkin sampai

hanya kalian lalu mungkinkah sana

hanyalah kalo lama nah sangat

harus kami lamanya namun sangatlah

haruslah kamilah lebih nanti saya

harusnya kamu loe nantinya sayalah

hendak kamulah lu neh se

hendaklah kan macam nggak sebab

hendaknya kapan macem nih sebabnya

hingga kapankah maka nyaris sebagai

ia kapanpun makanya oleh sebagaimana

ialah karena makin olehnya sebagainya

ibarat karenanya malah pada sebaliknya

ingin kau malahan padahal sebanyak

inginkah kaulah mampu padanya sebegini

inginkan ke mampukah paling sebegitu

ini kecil mana pantas sebelum

inikah kemudian manakala para sebelumnya

inilah kenapa manalagi pasti sebenarnya

itu kepada masih pastilah seberapa

itukah kepadanya masihkah per sebetulnya

itulah ketika masing percuma sebisanya

jadi khususnya mau pernah sebuah

jangan kini maupun pula sedang

jangankan kinilah melainkan pun sedangkan

janganlah kiranya melalui rupanya sedemikian

jika kita memang saat sedikit

jikalau kitalah mengapa saatnya sedikitnya

juga kok mereka saja segala

justru lagi merekalah sajalah segalanya

kala lagian merupakan saling segera


seharusnya seolah sudahkah wong

sehingga seorang sudahlah yaitu

sejak sepanjang supaya yakni

sejenak sepantasnya tadi yang

sekali sepantasnyalah tadinya

sekalian seperti tak

sekaligus sepertinya tanpa

sekalipun sering tapi

sekarang seringnya telah

seketika serta tentang

sekiranya serupa tentu

sekitar sesaat tentulah

sekitarnya sesama tentunya

sela sesegera terdiri

selagi sesekali terhadap

selain seseorang terhadapnya

selaku sesuatu terlalu

selalu sesuatunya terlebih

selama sesudah tersebut

selamanya sesudahnya tersebutlah

seluruh setelah tertentu

seluruhnya seterusnya tetapi

semacam setiap tiap

semakin setidaknya tidak

semasih sewaktu tidakkah

semaunya siapa toh

sementara siapakah tokh

sempat siapapun untuk

semua sih waduh

semuanya sini wah

semula sinilah wahai

sendiri suatu walau

sendirinya sudah walaupun


Appendix F

Team-draft Multileaving evaluation result


Table F.1: Team-draft Multileaving evaluation result (rows: dates from 25/12/2016 to 01/01/2017; columns: observed combinations of ranker preference pairs, e.g. [0,1],[0,2])


Table F.2: Team-draft Multileaving evaluation result (rows: dates from 25/12/2016 to 01/01/2017; columns: further combinations of ranker preference pairs, e.g. [1,0],[2,0])