


Web Search as a Linguistic Tool

Adam Fourney
Microsoft Research
Redmond, Washington
[email protected]

Meredith Ringel Morris
Microsoft Research
Redmond, Washington
[email protected]

Ryen W. White
Microsoft Research
Redmond, Washington
[email protected]

ABSTRACT

Many people rely on web search engines to check the spelling or grammatical correctness of input phrases. For example, one might search [recurring or reoccurring] to decide between these similar words. While language-related queries are common, they have low click-through rates, lack a strong intent signal, and are generally challenging to study. Perhaps for these reasons, they have yet to be characterized in the literature. In this paper we report the results of two surveys that investigate how, when, and why people use web search to support low-level, language-related tasks. The first survey was distributed by email, and asked participants to reflect on a recent search task. The second survey was embedded directly in search result pages, and captured information about searchers' intents in-situ. Our analysis confirms that language-related search tasks are indeed common, accounting for at least 2.7% of all queries posed by our respondents. Survey responses also reveal: (1) the range of language-related tasks people perform with search, (2) the contexts in which these tasks arise, and (3) the reasons why people elect to use web search rather than relying on traditional proofing tools (e.g., spelling and grammar checkers).

Categories and Subject Descriptors

H.3.3 [Information storage and retrieval]: Information search and retrieval—Query formulation

Keywords

language-related queries; web search; spelling; grammar

1. INTRODUCTION

People often rely on web search engines to perform a variety of low-level, language-related tasks. For example, searchers issue queries to verify the correct spelling of words, to disambiguate between homonyms, or to perform any number of similar tasks (Table 1). These linguistic uses of web search are so common and well-accepted that they have become part of our shared public consciousness. As an example, the Merriam-Webster Online Dictionary tracked the themes of the 2016 US presidential election by examining the words web searchers looked up throughout the campaign [16]. These efforts and findings were widely reported in the media [23]. Likewise, to celebrate the Scripps National Spelling Bee, Google recently revealed which words web searchers most frequently asked [how to spell], in each of the 50 US states [18].

© 2017 International World Wide Web Conference Committee (IW3C2), published under Creative Commons CC BY 4.0 License. WWW 2017, April 3–7, 2017, Perth, Australia. ACM 978-1-4503-4913-0/17/04. http://dx.doi.org/10.1145/3038912.3052651

Hyphenation: Deciding about hyphenation, or about joining words (e.g., follow up/follow-up).
Homonyms: Deciding between similar sounding words (e.g., affect vs effect).
Grammar: Checking if a phrase is grammatically correct (e.g., "in regard to" or "in regards to").
Spelling: Checking the spelling of a word or proper noun.
Definition: Learning the definition of an unfamiliar word or phrase.
Pronunciation: Learning how to pronounce a word or proper noun.
Thesaurus: Finding similar or opposite words (i.e., synonyms or antonyms).
Etymology: Learning the history or origin of a word or phrase.

Table 1: Eight linguistic tasks that people perform with web search engines.

In contrast, there is a surprising lack of research characterizing the phenomenon of using web search to support low-level linguistic tasks. This scarcity of literature persists despite the fact that language-related search strategies have motivated several automatic grammar-checking systems [5, 8, 15, 21, 22, 28], and frequently serve as illustrative examples in the study of positive web search abandonment¹ [1, 13, 24]. As we will show, this situation can perhaps be explained by the fact that language-related web searches have low click-through rates, lack a strong intent signal, and are generally challenging to study.

¹ Here, positive abandonment refers to the scenario where a searcher's information need is met by the contents of the search engine results page (SERP), thus eliminating the need to click on a result [4].



In this paper we report the findings of two surveys designed to directly study how, when, and why people rely on web search engines to support linguistic tasks. The first survey collected responses from 149 people, and asked respondents to reflect on a recent language-related search task. The second survey was embedded directly in search result pages, and captured information about the language-related queries issued by 142 individuals over a six-week period. Taken together, responses to these surveys directly address the following research questions:

RQ1: How often do people rely on web search engines to support linguistic tasks? How common is this behavior?

RQ2: What specific language-related tasks do people use web search to perform? Which of these tasks can be directly addressed by the contents of SERPs?

RQ3: In what contexts or scenarios do people use web search to support language-related tasks?

RQ4: Why do people use generic web search engines for these tasks, rather than relying on the specialized proofing tools now embedded in most software applications and services?

In the remainder of this paper we review related work, then detail the methods and data used in our analyses. We address the four research questions in turn, then discuss the limitations and broad implications of this research.

2. BACKGROUND

The observation that people rely on web search engines to support language-related activities is not new. In 2008, while developing a classification scheme for searcher intents, Jansen et al. remarked [12]:

Web search engines are spell checkers, thesauruses, and dictionaries. (...) People are employing search engines in new, novel, and increasingly diverse ways.

At around the same time, Jacquemont et al. [11], Yi et al. [28], and Park et al. [21] independently observed that writers often performed web searches to decide if particular phrasings were common in English documents. For example, a searcher might query ["take a seat"] followed by ["take the seat"] to determine, from document counts, which phrasing is correct. This class of errors is especially common among non-native English writers [21, 28]. These anecdotal observations inspired several automated proofing systems [5, 8, 11, 15, 17, 21, 22, 28]. These systems treat the web as a normative corpus of text in a given language [19, 22], and rely on web search engines to quickly compute document frequency statistics (e.g., [5]) and to collect sample phrases (e.g., [28]). In some cases, systems also consider the query suggestions returned by the search engine (e.g., the text following "Did you mean", on Google) [17]. The success of these efforts demonstrates that web search engines can be leveraged to support low-level linguistic tasks. However, these papers do not directly study the day-to-day search practices that originally inspired these innovations.
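The document-frequency heuristic described above can be sketched in a few lines. This is a toy reconstruction over a small local corpus rather than a live search engine; the function name and sample documents are illustrative and not drawn from any of the cited systems.

```python
from collections import Counter

def pick_phrasing(candidates, corpus_docs):
    """Return the candidate phrase found in the most documents.

    A toy version of the heuristic: treat the corpus as normative,
    and prefer the phrasing with the higher document frequency.
    """
    doc_freq = Counter()
    for doc in corpus_docs:
        text = doc.lower()
        for phrase in candidates:
            if phrase.lower() in text:
                doc_freq[phrase] += 1
    return max(candidates, key=lambda p: doc_freq[p])

docs = [
    "Please take a seat while you wait.",
    "Everyone should take a seat before we begin.",
    "He asked her to take the seat by the window.",
]
print(pick_phrasing(["take a seat", "take the seat"], docs))  # take a seat
```

Production systems in this line of work replace the toy corpus with search-engine result counts, but the comparison step is the same.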

In the same vein as the abovementioned work, prior research has explored using query logs to develop spelling and grammar checkers [3, 9, 14]. Again, the key property of these corpora is that correct strings generally occur more frequently than incorrect strings [3]. Moreover, modern search engines often suggest spelling corrections, and query logs maintain records of which suggestions are accepted by searchers [9]. This feedback is an important signal used to continuously improve the spell checker. Continual refinement may help explain why some people prefer to use web search engines over more traditional proofing tools, but these papers did not explicitly study searcher motivations or practices.

Much of what is known about language-related searches comes from research studying positive abandonment in web search [1, 13, 24]. Positive abandonment describes situations where a searcher's information needs can be directly satisfied by the contents of a search result page, therefore negating the need for users to click through to view individual search results [4]. We describe the relevant research below.

In what is perhaps the most detailed investigation of language-related search activities, Stamou and Efthimiadis asked six graduate students to label the intents and outcomes of every web search they performed over the course of one week [24]. The authors reported that one intent, "look(ing) for linguistic information about query terms (e.g., spelling, translation, definition)", accounted for 4.76% of all web search queries that were recorded. Moreover, 73.9% of these searches resulted in positive abandonment. While limited in scale, the research conducted by Stamou and Efthimiadis presents the first empirical evidence of how frequently language-related queries occur, and how infrequently users engage with the results of these searches. In this paper, we replicate these findings with a study that is roughly thirty times larger. Furthermore, we extend our understanding of this phenomenon by exploring language-related sub-tasks, contexts and motivations.

Additional details about language-related queries are provided by work conducted by Li et al. in [13]. An interesting dimension of their work is that it considers queries posed in three distinct locales: the United States, China and Japan. Li et al. sampled 400 abandoned web search queries from each locale, and, after manual labeling, identified dictionary and spelling-related searches as having tremendous potential for positive abandonment. Intriguingly, this potential was greatest for Japanese queries originating from mobile devices. Unfortunately, Li et al. do not provide base rates for abandonment, and do not reveal how frequently people rely on search engines to perform language-related tasks.

Chilton and Teevan also briefly discuss language-related search queries in their work investigating the impact of presenting answer cards directly in line with search results [1]. In particular, the authors studied 15 common answer categories (e.g., weather reports, phone numbers, etc.) and found that dictionary answers were among the least likely to receive user interactions. These findings reinforce the theme that language-related queries can often be addressed with no further user interaction beyond simply issuing a search query. Again, language-related queries were not the focus of that work, and the authors provided few details about the prevalence of those queries.



Finally, the topic of language-related searches also arises in Ido Guy's comparison of spoken and typed queries [10]. In this work, Guy reports that dictionary answers are triggered 1.43 times as often when people issue voice queries, as compared to when queries are entered with the keyboard (on mobile devices). In this same vein, Ong and Suplizio conducted market research on Amazon Echo users, and found that 17.6% of early adopters had issued voice queries asking the Echo for assistance in "spelling something" [20]. Together, these findings suggest that language-related queries are also common in voice search scenarios. Voice search is an interesting dimension to consider, but is ultimately out of the scope of this paper.

In summary, prior research provides several compelling clues about how people use web search to support low-level linguistic tasks. However, these details are disparate, and are buried in papers describing a range of other phenomena. To the best of our knowledge, there has yet to be any research directly characterizing how, when or why people use web search to support these linguistic tasks.

In this paper, we deliberately and methodically investigate the common linguistic uses of web search engines. Where possible, we replicate prior results. We also extend prior work by characterizing the specific tasks, scenarios, and motivations that underlie this phenomenon.

3. METHODS AND DATA

As we will show later in this paper, many language-related search queries consist only of a single term, and often fail to receive any clicks. These factors render language-related searches extremely difficult to study using traces or web logs alone. To this end, our work engages directly with users by means of a complementary pair of short surveys: the first survey is retrospective, and the second gathers searcher responses in-situ (Figure 1). We describe each survey below, then present the results of our analyses in the next section.

3.1 Retrospective Survey

3.1.1 Design and Distribution

An invitation to complete an anonymous online survey was emailed to a random sample of 2000 employees within Microsoft Corporation. The survey began by collecting demographic information. It then asked respondents to recall a specific example of a recent English-language search query that they issued to support reading, writing or another similar language-related activity. To help respondents ground their responses in a recent experience, the questionnaire included links to pages listing their recent Bing.com² and Google³ searches. Provided that respondents could recall a specific example, the survey asked respondents to describe the task they were performing (Table 1), and to answer a series of follow-up questions. Each question was displayed in a multiple-choice format, and provided an option allowing respondents to write in their own answers if they preferred.

3.1.2 Data and Demographics

In total, 149 respondents completed the questionnaire (response rate = 7.45%). Within this group, 110 respondents were male (73.8%), and 134 reported having a bachelor's degree or greater college education (89.9%). Respondent ages were distributed as follows: 43 participants were between the ages of 25-34 (28.9%); 53 respondents were between 35-44 (35.6%); and 39 participants were between 45-54 (26.2%).

² https://www.bing.com/profile/history
³ https://history.google.com/history

Figure 1: A screen shot of the first question posed by the in-situ survey. The in-situ survey was presented inline with search results upon detection of a potential language-related search query.

The demographics questionnaire also asked respondents to list the language in which they were most comfortable reading and writing. Here, 145 participants reported that they preferred reading in English (97.3%), while 148 participants reported they preferred to write in English (99.3%). As such, in contrast to prior research that focused on English-language learners (e.g., [21, 28]), our work characterizes the language-related searches of those who consider themselves fluent in the English language.

3.2 In-situ Survey

The in-situ survey (Figure 1) was implemented as a browser extension, and largely follows the design principles outlined by Diriye et al. in [4]. Given recent efforts to standardize browser extension development⁴, the in-situ survey extension could be installed on a range of modern browsers. We now describe: (1) the conditions under which the survey was triggered, (2) which questions were asked, and (3) what additional instrumentation data was collected.

3.2.1 Triggering the In-situ Survey

The extension monitored web searches performed on both the Bing and Google search engines. The in-situ survey was then triggered whenever a dictionary answer was present on a SERP. The survey was also displayed when any of the top three organic search results linked to one of the top 100 most frequently visited English dictionaries, thesauri, grammar guides, style guides, or other linguistic references known to the Bing.com search engine (as measured in January, 2016).
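The trigger logic above amounts to a two-clause rule: fire when a dictionary answer is present, or when a top-3 organic result points at a known linguistic reference. A minimal sketch follows; the site list and function signature are placeholders (the actual top-100 list and SERP representation are internal to the study's extension).

```python
from urllib.parse import urlparse

# Placeholder for the study's list of top-100 linguistic reference sites.
LINGUISTIC_SITES = {"merriam-webster.com", "dictionary.com", "thesaurus.com"}

def should_trigger_survey(has_dictionary_answer, top_results):
    """Return True when a SERP looks like a language-related search.

    has_dictionary_answer: whether the SERP shows a dictionary answer box.
    top_results: organic result URLs, best-ranked first.
    """
    if has_dictionary_answer:
        return True
    for url in top_results[:3]:  # only the top three organic results count
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host in LINGUISTIC_SITES:
            return True
    return False
```

For example, a SERP whose top result is https://www.merriam-webster.com/dictionary/dreamt would trigger the survey even without a dictionary answer box.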

3.2.2 In-situ Survey Design

Upon detecting a potential language-related query, the browser extension injected a short survey directly in line with the search results (Figure 1). The first survey question asked participants to report the language-related task they were in the process of performing. The survey presented the same set of options as was available in the email survey, with two important distinctions: First, the survey included a not applicable option allowing users to flag false positive classifications. Second, the dictionary definition task was divided into the two sub-tasks listed below:

• Confirm definition: Double-checking that one is using the correct word or phrase.

• Learn definition: Learning the definition of an unfamiliar word or phrase.

The second in-situ survey question asked participants to explain why they elected to use web search rather than the spell-checker, grammar-checker, or dictionary that is built into desktop software (e.g., Outlook, Office, Chrome, Firefox, etc.).

The third and final question asked participants if the search results page contained enough information to directly satisfy their information need.

Provided that participants answered at least one of the three questions, the survey was snoozed, and would not reappear for 30 minutes (regardless of which queries a participant issued).

3.2.3 Instrumentation

In addition to recording survey responses, the extension gathered basic information about every query the user issued, and every mouse or keyboard interaction the user performed on search results pages (similar to the instrumentation described in [4]). In instances where queries failed to satisfy the conditions for potential language-related searches, the extension logged only cryptographic hashes of the text input by searchers or displayed on the results page. This design afforded users a degree of privacy, while allowing us to compute aggregate statistics and meet our research objectives.
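A minimal sketch of this logging scheme follows, assuming SHA-256 as the hash function (the paper does not name one) and illustrative record fields:

```python
import hashlib

def make_log_record(query, is_language_related):
    """Log flagged queries in plaintext; hash everything else.

    One-way hashes still support aggregate statistics (e.g., counting
    distinct or repeated queries) without exposing the raw text.
    """
    if is_language_related:
        return {"kind": "plaintext", "value": query}
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return {"kind": "hashed", "value": digest}

record = make_log_record("project codename roadmap", is_language_related=False)
print(record["kind"])  # hashed
```

Because the same input always hashes to the same digest, repeated queries remain countable in aggregate even though their text is unrecoverable.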

⁴ https://developer.mozilla.org/en-US/Add-ons/WebExtensions

3.2.4 Deployment

As with the email survey, the in-situ survey was deployed within Microsoft Corporation. Participants were recruited via an email invitation sent to a random sample of 5000 employees occupying a variety of job roles within the company. Recipients were asked to install the browser extension for six weeks. To incentivize participation, we randomly selected one participant per week to receive a $50 gift card for an online retailer. To avoid the possibility of biasing participants with monetary rewards, we made it clear that the selection process did not depend on how frequently participants engaged with the browser extension.

3.2.5 Data and Demographics

In total, the browser extension was installed by 143 distinct individuals (response rate = 2.9%). Participants shared similar demographics to those responding to the email survey (no person participated in both studies). As noted above, participants occupied a diverse set of roles including software development, sales and marketing, legal services, supply chain engineering, and technical writing, to name a few. As before, most respondents (93%) listed English as their preferred language for both reading and writing.

4. RESULTS

We now present the results of our two surveys. The discussion is organized so as to address our four research questions.

4.1 Frequency of Language-Related Searches

To determine how frequently people leverage web search to support low-level language-related tasks, and to answer our first research question, we analyzed the instrumentation data collected by the in-situ survey. In total, 29,211 queries were observed, and 891 (3.05%) were classified as potentially language-related (i.e., triggered a dictionary answer, or linked to a linguistic web site, as defined earlier). There was no statistically significant difference between the proportions of queries identified as language-related on Bing and Google (z = 1.088, p = 0.276). These figures represent a median of 5 weeks of search activity collected for each participant (IQR = 3). As a sanity check, we compared these rates to the rates at which the same conditions arise in the production Bing search engine, and we found a similar low single-digit value.

In response to the 891 positive classifications, the browser extension triggered the in-situ survey 690 times (the survey was suppressed 201 times as a result of the 30-minute timeout). Users responded to the survey on 381 occasions (55.2% engagement). Of those 381 responses, 48 queries were labeled as false positives. This suggests a classifier precision of 87.4%. Under the assumption that this precision applies to the 510 unlabeled positive classifications, we estimate that the extension recorded 776 ± 30 language-related searches (confidence interval of 95%). This represents between 2.56% and 2.76% of all queries observed. We note that this figure should be considered a lower bound on the proportion of language-related search activity, as our in-situ data does not allow us to estimate the false negative rate.
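The precision and volume arithmetic above can be reproduced directly from the reported counts. This is a back-of-envelope reconstruction: the point estimate lands near the paper's 776 ± 30, though the exact interval may come from a slightly different confidence-interval calculation.

```python
labeled = 381        # in-situ survey responses
false_pos = 48       # responses flagged "not applicable"
unlabeled = 510      # positive classifications with no survey response
total_queries = 29211

true_pos = labeled - false_pos     # 333 confirmed language-related queries
precision = true_pos / labeled     # classifier precision

# Apply the same precision to the unlabeled positives to estimate the
# total number of language-related searches recorded.
estimate = true_pos + precision * unlabeled
share = estimate / total_queries

print(round(precision * 100, 1))   # 87.4
print(round(estimate))             # ≈ 779, near the paper's 776 ± 30
print(round(share * 100, 2))       # ≈ 2.67 (%), within the 2.56–2.76 range
```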

For one point of comparison, recall that Stamou and Efthimiadis exhaustively labeled all queries posed by six computer-science graduate students over a single week. They found that 4.76% of search queries were related to language [24]. Our lower bound is compatible with this figure, and adds roughly thirty times as much evidence to support the conclusion that language-related queries account for a small but meaningful proportion of all search traffic (about 1 in every 40 searches, in our data set).

Figure 2: Bars indicate the proportions of language-related web searches ascribed to eight tasks, as labeled by respondents of the in-situ and email surveys. In the in-situ survey, the definition task was subdivided. For comparison purposes, these sub-tasks are shown together as a stacked bar graph.

4.2 Linguistic Tasks

Our second research question asks: What specific language-related tasks do people perform with web search? A natural follow-up question is: Which of these information needs can be directly addressed from the search results pages (i.e., without requiring users to click a result)? We address both questions in turn.

4.2.1 Task Variety

Respondents from both surveys reported using web search to perform a broad range of language-related tasks. Though not directly comparable⁵, we present responses together in Figure 2 to emphasize their shared trend. Notably, retrieving definitions and checking spelling are the two most common tasks, followed by checking grammar/punctuation and looking up synonyms or antonyms. In both cases, word pronunciation was the task least frequently reported by users. Finally, we note that the homonym disambiguation task is anomalous: this task was reported nearly three times more often in the retrospective survey than was observed in the in-situ survey. We have several hypotheses that may explain this discrepancy. For example, it is possible that the in-situ survey was not sensitive to these tasks, and failed to trigger. Alternatively, it is possible that, although rare, these tasks are more memorable, and are more likely to be reported in the retrospective survey. We leave further analyses of homonym disambiguation queries to future research.

In addition to considering in-situ survey responses, we also inspected the corresponding search queries. 172 (51.7%) of language-related queries consisted of only a single query term (e.g., [dreamt]). Considered in isolation, these queries lack a clear intent signal. Paired with survey responses, the queries can be broken down by task. We found that spelling queries were 2.5 times as frequent in this setting. This difference is highly statistically significant (two-tailed, z = 4.298, p << 0.05). However, we were also able to identify single-term queries for each and every linguistic task we considered in the survey. This highlights why it was important that we engage with users directly rather than relying only on log data.

⁵ In the retrospective survey, respondents could only describe one search query.

Double-check a definition: "double-checking the meaning"; "confirm the definition of a word"; "check if a word was used correctly"; "check that the spelling matched my definition of the word".
Translation: "I wanted a translation to include in a presentation"; "Reading a site in French"; "Kanji (translation) to go into a presentation"; "Translation".
Acronyms / Abbreviations: "acronym meaning"; "I wanted to know what the abbreviation stood for"; "check acronym".

Table 2: The retrospective survey allowed respondents to write in their own language-related tasks in cases where they felt the eight tasks listed in the survey failed to apply. This table lists three common tasks reported by respondents.
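The significance tests quoted in this section are standard pooled two-proportion z-tests. The sketch below shows the shape of the computation; the example call uses made-up counts, since the per-cell tallies behind the reported z = 4.298 are not published.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic.

    x1/n1 and x2/n2 are successes over trials in each group.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Illustrative counts only; not the study's actual per-task tallies.
z = two_proportion_z(60, 172, 45, 333)
print(round(z, 3))
```

A positive z indicates the first group's proportion is larger; the two-tailed p-value follows from the standard normal distribution.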

Finally, the retrospective survey allowed users to write in their own tasks if they so desired (appearing under the other category in Figure 2). Inspection of the 21 write-in responses reveals a number of common themes, as depicted in Table 2. The first theme, double-checking definitions, motivated our decision to subdivide the definition task into two sub-tasks when deploying the in-situ survey.

4.2.2 Task Support

In the previous section, we reported that respondents leveraged web search to support a broad range of linguistic tasks. In this section, we report which tasks could be accomplished directly from the SERPs. To investigate this, both surveys sought to identify instances of positive abandonment. Here, the in-situ survey simply asked respondents if they were satisfied with the search results, then monitored the SERPs for mouse clicks. Conversely, the retrospective survey asked respondents to reissue their queries and comment on the results.

71.0% of in-situ survey responses, and 72.5% of retrospective survey respondents, indicated instances of positive abandonment. These rates compare favorably with the 73.9% figure reported by Stamou and Efthimiadis in [24].

We are also able to extend prior work by leveraging the in-situ responses to break abandonment rates down across the range of linguistic tasks (Figure 3). Notably, retrieving definitions, verifying spelling, deciding about hyphenation, and distinguishing between homonyms each resulted in positive abandonment rates above 70%. Conversely, the etymology and thesaurus tasks were positively abandoned less than 30% of the time. This low rate is statistically indistinguishable from the proportion of all 29,211 queries that experienced some form of abandonment (p = 0.67, and p = 0.50, respectively).

Figure 3: Bars indicate the positive abandonment rates for various language-related tasks. In these cases, respondents indicated that they were satisfied with the search results, yet did not click on any links. Error bars indicate the standard error (SE) of the sample proportions.

Writing: I wanted to include the word or phrase in something I was writing.
Proofreading: I came across the word or phrase while proofreading a document (reading with intent to edit).
Reading: I came across the word or phrase while reading (without intent to edit).
Overheard: I heard this word or phrase used in a conversation (including in a movie, on the radio, etc.).
Brainstorming: I was brainstorming (e.g., names for a product).

Table 3: Respondents to the retrospective survey were asked to indicate the scenario in which their language-related information needs arose. This table lists the options respondents could choose when answering this question. Additionally, respondents could write in their own scenario.

In summary, respondents leveraged web search to perform a wide variety of linguistic tasks. However, the queries were often under-specified and contained only a single term. Moreover, the corresponding search results were often dismissed without user interaction.

4.3 Scenarios

One advantage of the retrospective survey is that it afforded participants the opportunity to describe the scenarios in which their information needs arose (Table 3), including reporting web searches performed on mobile devices (rather than being restricted to observing only queries originating from a single desktop or laptop computer). The resultant analysis answers our third research question.

Figure 4: Respondents to the retrospective survey indicated that they performed language-related searches in a number of different scenarios, and on a variety of devices. Bars indicate the proportion of responses ascribed to each. Error bars indicate the standard error (SE) of the sample proportions.

Figure 4 summarizes the results of our analysis. Overall, 45% of respondents reported that they had issued their queries because they wanted to use a word or phrase in a document that they were in the process of writing. However, writing was much more common on computers than on mobile devices (48.4% vs. 29.6%). This difference is statistically significant (two-tailed, z = −2.370, p = 0.018), and appeals to our intuitions. Apart from writing, language-related queries arose when users were reading documents (26.2%), or after having overheard a word used in conversation (12.8%). We also found that respondents were more likely to write in their own scenarios when describing mobile searches (vs. desktop searches). These responses appear as other in Figure 4. Some of the mobile write-in scenarios include: "having a friendly disagreement between friends", helping children with homework, and playing puzzle games such as Scrabble or Words With Friends.
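The significance test above is a standard pooled two-proportion z-test. A minimal sketch of the computation follows; since the underlying group sizes behind the 48.4% vs. 29.6% comparison are not restated in this section, the counts below are purely illustrative and do not reproduce z = −2.370:

```python
from math import erf, sqrt

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test (two-tailed).

    x1, x2 are success counts; n1, n2 are sample sizes.
    Returns the z statistic and the two-tailed p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 == p2
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Standard normal CDF via erf; two-tailed p-value.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Illustrative (hypothetical) counts, not the paper's actual data:
z, p = two_proportion_z_test(50, 100, 30, 100)
```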

Again, we are able to break these responses down by task type (Figure 5). As expected, the writing scenario gave rise to the most heterogeneous set of language-related information needs. Here, the most common task was checking the spelling of words, but this task accounted for only 28.4% of the responses. In contrast, the most common task performed in reading scenarios was looking up the definitions of words (comprising 69.2% of all tasks). Again, these findings appeal to our intuition, serve as a useful sanity check, and raise our confidence in the quality of the survey data.

4.4 Reasons for Using Web Search

Finally, upon detecting a language-related search query, the in-situ survey asked participants to answer the following question: Why did you just use web search for this task? (vs. relying on the spell-checker, grammar-checker, or dictionary built into your browser, operating system, and other desktop software). Possible responses are listed in Table 4. Figure 6 summarizes the results, and directly answers our fourth research question.

Overall, the most common explanation for relying on web search was that it was convenient: the browser was already open (42.6%). This was followed by task support (18.4%), richness (13.2%), trust (10.3%), and familiarity (8.7%). Figure 6 breaks down responses by task type. We find there are interesting differences across tasks. For example, convenience dominates the responses in cases where queries were issued to check a word's spelling (56.3% of responses), while task support was the most common explanation offered for the thesaurus task (50.0% of responses). Perhaps more surprising is that richness and trust together account for nearly a quarter (23.5%) of the linguistic uses of web search.

Figure 5: Bars indicate the proportions of language-related search queries ascribed to each task type across four broad scenarios. The brainstorming and proofreading scenarios are omitted due to their low number of responses.

Task support: The word or phrase is not in my spell-checker's dictionary, or the task is not supported by such tools.
Richness: Web search provides richer information (e.g., examples, images).
Trust: I am more confident that web search will provide the correct answer.
Familiarity: I am more familiar with web search than with the aforementioned tools.
Convenience: E.g., the browser was already open.
Other: A reason not shown here.

Table 4: Respondents to the in-situ survey were asked to indicate why they elected to use web search rather than rely on more traditional proofing tools (e.g., the spell checkers built into software applications). This table lists the set of possible responses.

5. IMPLICATIONS

We now consider both the immediate and the broader implications of our findings. In the former category, we feel that our findings can serve as a useful critique of the design of the dictionary answers returned by contemporary search engines. Specifically, we note that the first two lines (and largest font sizes) of dictionary answers are dedicated to the presentation of syllabic and phonetic information (Figure 7). However, the need to learn pronunciations was very rare among our survey respondents. More common was the need to verify the correct spelling of words. We hypothesize that the syllabic and phonetic representations may interfere with this more common task (e.g., by rendering the correct spelling more difficult to read [25], copy & paste, or recognize by sight). We recommend exploring ways to tailor these answers to a searcher's stated or implied linguistic tasks.

Figure 6: Colors indicate the various reasons why respondents of the in-situ survey reported relying on web search to perform language-related tasks.

Similarly, our findings have implications for the design of writing tools. Richness and trust were two of the more frequent responses people offered when asked about their use of web search in this context. Perhaps next-generation spelling and grammar checkers should consider enriching their suggestions with information retrieved from the web. As Fallman noted [5], it can often be tremendously helpful to see and compare the number of search results returned for various spellings or phrasings of a word or sentence. Moreover, search engines and online content have evolved considerably in the fourteen years since Fallman's work was published. As such, it is worth considering other SERP features that might add richness to, and build trust in, spell checkers (e.g., photographs, Wikipedia summaries, and structured information about entities).

More generally, we feel that language-related search queries present an opportunity for search engines to work together with content creation software in service of better supporting users and their tasks [6]. For example, one could imagine a scenario where clicking on a dictionary card in a SERP results in the corresponding word being inserted into whatever document the user was currently editing. This interaction would be especially useful on mobile devices, where multitasking is difficult and copy & paste gestures are cumbersome. Notably, smartphone on-screen keyboards are already beginning to include web search capabilities6, but do not yet gracefully handle language-related queries.

6https://googleblog.blogspot.com/2016/05/gboard-search-gifs-emojis-keyboard.html

Figure 7: A screen shot of the dictionary answer returned for the query [newtonian] on Bing.com. The Google search engine returns a similar dictionary card for this query. This presentation is optimized to facilitate pronunciation, but may make it harder to verify spelling. Our data suggest the latter is a much more common task.

Synchronizing search engines with content creation software may also benefit the underlying search services. For example, while searchers may not click on results when issuing language-related queries, they are likely to use the information to refine text they are composing in another application [7]. Detecting these explicit or implicit transfers of information (e.g., copy & paste events, or re-typing text found on SERPs) could serve as useful implicit relevance signals, especially in cases of positive abandonment. Likewise, if a search engine were aware that a document was being edited concurrently, it might be better able to determine which queries should trigger dictionary answers, or, more generally, when language-related tasks were being performed.
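One way such a transfer-of-information signal might be detected is sketched below. This is a simplified illustration under stated assumptions (access to recently viewed SERP text and to the text the user enters elsewhere); the function name and matching rule are hypothetical, and a real system would use fuzzier matching and privacy safeguards:

```python
def detect_serp_transfer(serp_text, typed_or_pasted, min_len=6):
    """Flag a likely information transfer from a SERP to another app.

    Returns True when the user-entered text (pasted or re-typed)
    appears verbatim in recently viewed SERP content, after
    normalizing case and whitespace. Very short fragments are
    ignored to avoid spurious matches.
    """
    fragment = " ".join(typed_or_pasted.split()).lower()
    haystack = " ".join(serp_text.split()).lower()
    return len(fragment) >= min_len and fragment in haystack

# Hypothetical SERP snippet for a homonym-disambiguation query:
serp = "Recurring: occurring repeatedly. Reoccurring: happening again."
hit = detect_serp_transfer(serp, "occurring repeatedly")  # True
miss = detect_serp_transfer(serp, "the")                  # too short
```

Fired in cases of positive abandonment, such a match could serve as the implicit relevance signal described above.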

To conclude, pairing search engines with content creation software, and recognizing the roles of language-related queries, affords opportunities to: help users efficiently compose text, help search engines decide when to return linguistic answers, and help evaluators measure the success of search interactions.

6. DISCUSSION AND FUTURE WORK

The survey results analyzed in this paper indicate that language-related searches are common among a group of highly educated people who reported having good fluency in both spoken and written English, and who worked for a technology company in the United States. We caution readers against overly generalizing the results reported here. For example, it is known from prior research that both domain expertise [26] and web search expertise [27] can impact how people use search engines.

Nevertheless, it seems likely that certain demographic groups, such as people who are learning English as a non-primary language, children, and people with reading or writing disabilities (e.g., dyslexia), may benefit even more from language-related search than the group we studied. For example, English-language learners were the target audiences for many of the prior search-inspired grammar checkers mentioned earlier (e.g., [21, 28]). We also find that certain classes of language-related queries closely align with the American academic calendar (Figure 8). This raises the possibility that students or educators are among those who frequently leverage web search for linguistic purposes.

Future work studying the frequency and nature of language-related queries for these user groups may reveal differences from the group we studied, and may suggest specialized ranking schemes or user interfaces tailored to the needs of these audiences. For example, both Bing and Google currently return definitions from the Oxford English Dictionary, but younger students may benefit from a dictionary written at a lower reading level [2]. Likewise, English language learners may benefit from a bilingual dictionary. Finally, we note that language-related search frequency and task types may differ for searches conducted in languages besides English.

Figure 8: Relative search volume for the homonym disambiguation query [affect effect], from January 2012 to October 2016 (as reported by Google Trends7). Shaded regions indicate the months June-July. Local minima occur during the summer, over American Thanksgiving, and over Christmas break. These patterns align with the academic year, and may indicate that students or educators are among those using web search to support linguistic tasks.

7. CONCLUSION

In this paper we reported the results of two surveys that investigate how, when, and why people use web search to support low-level, language-related tasks. Our findings confirm that language-related searches are indeed common (accounting for at least 2.7% of queries in our data set), and often result in positive abandonment. This latter property renders language-related queries difficult to study using trace logs alone. Fortunately, our survey-based methods allowed us to characterize this phenomenon in detail. We found that people perform a wide variety of linguistic tasks, but retrieving definitions and verifying spelling dominate. Likewise, we found that, for users of desktop computers, linguistic queries were most often associated with writing; however, queries were also issued while users were reading text or listening to spoken conversations. Finally, we found that convenience drove a plurality of all language-related web searches (roughly 40%), but that task support, richness, and trust (in the results) also gave rise to many linguistic queries. Taken together, our findings paint a rich picture of this common but rarely studied use of web search engines. Future work includes expanding our results to other demographics and device contexts, as well as exploring opportunities to more tightly integrate web search engines with content creation applications.

7https://www.google.com/trends/



8. REFERENCES

[1] L. B. Chilton and J. Teevan. Addressing people's information needs directly in a web search result page. In Proceedings of WWW '11, pages 27–36, New York, NY, USA, 2011. ACM.

[2] K. Collins-Thompson, P. N. Bennett, R. W. White, S. de la Chica, and D. Sontag. Personalizing web search results by reading level. In Proceedings of CIKM '11, pages 403–412, New York, NY, USA, 2011. ACM.

[3] S. Cucerzan and E. Brill. Spelling correction as an iterative process that exploits the collective knowledge of web users. In Proceedings of EMNLP 2004, pages 293–300, Barcelona, Spain, July 2004. Association for Computational Linguistics.

[4] A. Diriye, R. White, G. Buscher, and S. Dumais. Leaving so soon?: Understanding and predicting web search abandonment rationales. In Proceedings of CIKM '12, pages 1025–1034, New York, NY, USA, 2012. ACM.

[5] D. Fallman. The penguin: Using the web as a database for descriptive and dynamic grammar and spell checking. In Proceedings of CHI EA '02, pages 616–617, New York, NY, USA, 2002. ACM.

[6] A. Fourney, B. Lafreniere, P. Chilana, and M. Terry. Intertwine: Creating interapplication information scent to support coordinated use of software. In Proceedings of UIST '14, pages 429–438, New York, NY, USA, 2014. ACM.

[7] A. Fourney and M. R. Morris. Enhancing technical Q&A forums with CiteHistory. In Proceedings of ICWSM '13, 2013.

[8] M. Gamon and C. Leacock. Search right and thou shalt find...: Using web queries for learner error detection. In Proceedings of IUNLPBEA '10, pages 37–44, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.

[9] J. Gao, X. Li, D. Micol, C. Quirk, and X. Sun. A large scale ranker-based system for search query spelling correction. In Proceedings of COLING '10, pages 358–366, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.

[10] I. Guy. Searching by talking: Analysis of voice queries on mobile web search. In Proceedings of SIGIR '16, pages 35–44, New York, NY, USA, 2016. ACM.

[11] S. Jacquemont, F. Jacquenet, and M. Sebban. Correct your text with Google. In Web Intelligence, IEEE/WIC/ACM International Conference on, pages 170–176, Nov. 2007.

[12] B. J. Jansen, D. L. Booth, and A. Spink. Determining the informational, navigational, and transactional intent of web queries. Information Processing & Management, 44(3):1251–1266, 2008.

[13] J. Li, S. Huffman, and A. Tokuda. Good abandonment in mobile and PC internet search. In Proceedings of SIGIR '09, pages 43–50, New York, NY, USA, 2009.

[14] M. Li, Y. Zhang, M. Zhu, and M. Zhou. Exploring distributional similarity based models for query spelling correction. In Proceedings of ACL-44, pages 1025–1032, Stroudsburg, PA, USA, 2006. Association for Computational Linguistics.

[15] M. Hermet, A. Désilets, and S. Szpakowicz. Using the web as a linguistic resource to automatically correct lexico-syntactic errors. In Proceedings of LREC '08, Marrakech, Morocco, May 2008.

[16] Merriam-Webster Incorporated. Tracing the presidential election through the words Americans looked up most. http://www.merriam-webster.com/words-at-play/trending-words-from-election-2016/, 2016. [Online; retrieved 20-October-2016].

[17] J. More. A grammar checker based on web searching. Digithum, 0(8), 2006.

[18] J. W. Moyer. Google uncovered 'America's top spelling mistakes,' and 'diarrhea' is on the list. https://www.washingtonpost.com/news/local/wp/2016/05/26/google-uncovered-americas-top-spelling-mistakes-and-diarrhea-is-on-the-list, May 2016. [Online; retrieved 20-October-2016].

[19] K. A. Olsen and J. G. Williams. Spelling and grammar checking using the web as a text repository. J. Am. Soc. Inf. Sci. Technol., 55(11):1020–1023, Sept. 2004.

[20] S. Ong and A. Suplizio. Unpacking the breakout success of the Amazon Echo. https://www.experian.com/innovation/thought-leadership/amazon-echo-consumer-survey.jsp, 2016. [Online; retrieved 20-October-2016].

[21] T. Park, E. Lank, P. Poupart, and M. Terry. 'Is the sky pure today?' AwkChecker: An assistive tool for detecting and correcting collocation errors. In Proceedings of UIST '08, pages 121–130, New York, NY, USA, 2008. ACM.

[22] J. Sjobergh. The internet as a normative corpus: Grammar checking with a search engine. Technical report, KTH Nada, Stockholm, Sweden, 2006.

[23] L. Stack. 'Braggadocious?' In Trump and Clinton debate, words that sent viewers to the dictionary. http://www.nytimes.com/2016/09/28/us/braggadocio-in-trump-and-clinton-debate-words-that-sent-viewers-to-the-dictionary.html, September 2016. [Online; retrieved 20-October-2016].

[24] S. Stamou and E. N. Efthimiadis. Interpreting user inactivity on search results. In Proceedings of ECIR '10, pages 100–113, Berlin, Heidelberg, 2010. Springer-Verlag.

[25] M. Tinker. Legibility of Print. Iowa State University Press, 1963.

[26] R. W. White, S. T. Dumais, and J. Teevan. Characterizing the influence of domain expertise on web search behavior. In Proceedings of WSDM '09, pages 132–141, New York, NY, USA, 2009. ACM.

[27] R. W. White and D. Morris. Investigating the querying and browsing behavior of advanced search engine users. In Proceedings of SIGIR '07, pages 255–262, New York, NY, USA, 2007. ACM.

[28] X. Yi, J. Gao, and W. B. Dolan. A web-based English proofing system for English as a second language users. In Proceedings of IJCNLP '08, pages 619–624, 2008.
