wikipedia tools for google spreadsheets

4
Wikipedia Tools for Google Spreadsheets Thomas Steiner Google Germany GmbH ABC Str. 19, 20354 Hamburg, Germany [email protected] ABSTRACT In this paper, we introduce the Wikipedia Tools for Google Spreadsheets. Google Spreadsheets is part of a free, Web- based software office suite offered by Google within its Google Docs service. It allows users to create and edit spread- sheets online, while collaborating with other users in real- time. Wikipedia is a free-access, free-content Internet ency- clopedia, whose content and data is available, among other means, through an API. With the Wikipedia Tools for Google Spreadsheets, we have created a toolkit that facilitates work- ing with Wikipedia data from within a spreadsheet context. We make these tools available as open-source on GitHub, 1 released under the permissive Apache 2.0 license. Categories and Subject Descriptors H.3.5 [Online Information Services]: Web-based services Keywords Wikipedia, Wikidata, Google Spreadsheets, Google Sheets 1. INTRODUCTION In the world of Computer Science, spreadsheet applica- tions serve for the organization, analysis, and storage of data in tabular form. Spreadsheets are the computerized simulation of paper accounting worksheets, and operate on data represented as cells of an array, organized in rows and columns. Cells can contain numeric or textual data, or the results of formulas that automatically calculate and display a value based on the contents of other cells. With the Wiki- pedia Tools for Google Spreadsheets, we introduce a toolkit of such formulas, tailored to the universe of Wikipedia, that enables a wide range of potential use cases starting from marketing, to search engine optimization, to business anal- ysis. Especially through the chaining of formulas, the true power and ease of spreadsheet applications can be unleashed. 1 Wikipedia Tools for Google Spreadsheets : https://github. com/tomayac/wikipedia-tools-for-google-spreadsheets Copyright is held by the International World Wide Web Conference Committee (IW3C2). IW3C2 reserves the right to provide a hyperlink to the author’s site if the Material is used in electronic media. ACM 978-1-4503-4144-8/16/04. http://dx.doi.org/10.1145/2872518.2891112 1.1 Wikipedia and Wikidata Wikipedia’s content and data is available through the Wikipedia API (https://{language}.wikipedia.org/w/api.php), where {language} represents one of the currently 291 sup- ported Wikipedia languages, 2 for example, en for English, de for German, or zu for Zulu. Wikidata is a collaboratively edited knowledge base and intended to provide a common source of structured data which can be used by projects such as Wikipedia. Its content and data is available through the Wikidata API (https://www.wikidata.org/w/api.php). Both the Wikipedia and the Wikidata APIs’ data is available as XML or JSON, among other formats. Wikipedia pageviews data, i.e., the number of times within a given period of time that a given Wikipedia article has been viewed can be ob- tained using the Pageviews API (https://wikimedia.org/api/ rest v1/?doc). The data is available in JSON format. 1.2 Google Spreadsheets and Apps Scripts Google Spreadsheets can be extended with custom func- tions (or formulas) using Google Apps Scripts 3 that are writ- ten in standard JavaScript. 4 To illustrate this, a trivial func- tion is defined in Listing 1 that can then be used from within a spreadsheet as outlined in Listing 2. Custom functions can access external resources on the Web by fetching URLs with the UrlFetchApp, one of the scripting services available in Google Apps Script. Fetched data can either be in XML or JSON format and parsed with convenience functions. function DOUBLE(input){ return input * 2; } Listing 1: Custom Google Sheets function called DOUBLE. =DOUBLE(A1) Listing 2: Usage of the custom DOUBLE function from List- ing 1 in a cell with the value of cell A1 as a parameter. 2. LIST OF DEVELOPED FUNCTIONS In our Wikipedia Tools for Google Spreadsheets, we provide eleven functions that—in traditional spreadsheets style— follow an all-uppercase naming convention and start with 2 List of Wikipedias: https://meta.wikimedia.org/wiki/List of Wikipedias 3 Google Apps Script: https://developers.google.com/ apps-script/ 4 Custom functions in Google Sheets: https://developers. google.com/apps-script/guides/sheets/functions

Upload: truongquynh

Post on 10-Feb-2017

256 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Wikipedia Tools for Google Spreadsheets

Wikipedia Tools for Google Spreadsheets

Thomas SteinerGoogle Germany GmbH

ABC Str. 19, 20354 Hamburg, [email protected]

ABSTRACTIn this paper, we introduce the Wikipedia Tools for GoogleSpreadsheets. Google Spreadsheets is part of a free, Web-based software office suite offered by Google within its GoogleDocs service. It allows users to create and edit spread-sheets online, while collaborating with other users in real-time. Wikipedia is a free-access, free-content Internet ency-clopedia, whose content and data is available, among othermeans, through an API. With the Wikipedia Tools for GoogleSpreadsheets, we have created a toolkit that facilitates work-ing with Wikipedia data from within a spreadsheet context.We make these tools available as open-source on GitHub,1

released under the permissive Apache 2.0 license.

Categories and Subject DescriptorsH.3.5 [Online Information Services]: Web-based services

KeywordsWikipedia, Wikidata, Google Spreadsheets, Google Sheets

1. INTRODUCTIONIn the world of Computer Science, spreadsheet applica-

tions serve for the organization, analysis, and storage ofdata in tabular form. Spreadsheets are the computerizedsimulation of paper accounting worksheets, and operate ondata represented as cells of an array, organized in rows andcolumns. Cells can contain numeric or textual data, or theresults of formulas that automatically calculate and displaya value based on the contents of other cells. With the Wiki-pedia Tools for Google Spreadsheets, we introduce a toolkitof such formulas, tailored to the universe of Wikipedia, thatenables a wide range of potential use cases starting frommarketing, to search engine optimization, to business anal-ysis. Especially through the chaining of formulas, the truepower and ease of spreadsheet applications can be unleashed.

1Wikipedia Tools for Google Spreadsheets: https://github.com/tomayac/wikipedia-tools-for-google-spreadsheets

Copyright is held by the International World Wide Web Conference Committee(IW3C2). IW3C2 reserves the right to provide a hyperlink to the author’s site if theMaterial is used in electronic media.

ACM 978-1-4503-4144-8/16/04.http://dx.doi.org/10.1145/2872518.2891112

1.1 Wikipedia and WikidataWikipedia’s content and data is available through the

Wikipedia API (https://{language}.wikipedia.org/w/api.php),where {language} represents one of the currently 291 sup-ported Wikipedia languages,2 for example, en for English,de for German, or zu for Zulu. Wikidata is a collaborativelyedited knowledge base and intended to provide a commonsource of structured data which can be used by projects suchas Wikipedia. Its content and data is available through theWikidata API (https://www.wikidata.org/w/api.php). Boththe Wikipedia and the Wikidata APIs’ data is available asXML or JSON, among other formats. Wikipedia pageviewsdata, i.e., the number of times within a given period of timethat a given Wikipedia article has been viewed can be ob-tained using the Pageviews API (https://wikimedia.org/api/

rest v1/?doc). The data is available in JSON format.

1.2 Google Spreadsheets and Apps ScriptsGoogle Spreadsheets can be extended with custom func-

tions (or formulas) using Google Apps Scripts3 that are writ-ten in standard JavaScript.4 To illustrate this, a trivial func-tion is defined in Listing 1 that can then be used from withina spreadsheet as outlined in Listing 2. Custom functions canaccess external resources on the Web by fetching URLs withthe UrlFetchApp, one of the scripting services available inGoogle Apps Script. Fetched data can either be in XML orJSON format and parsed with convenience functions.

function DOUBLE(input) {return input * 2;

}

Listing 1: Custom Google Sheets function called DOUBLE.

=DOUBLE(A1)

Listing 2: Usage of the custom DOUBLE function from List-ing 1 in a cell with the value of cell A1 as a parameter.

2. LIST OF DEVELOPED FUNCTIONSIn our Wikipedia Tools for Google Spreadsheets, we provide

eleven functions that—in traditional spreadsheets style—follow an all-uppercase naming convention and start with2List of Wikipedias: https://meta.wikimedia.org/wiki/List ofWikipedias3Google Apps Script: https://developers.google.com/apps-script/4Custom functions in Google Sheets: https://developers.google.com/apps-script/guides/sheets/functions

Page 2: Wikipedia Tools for Google Spreadsheets

a WIKI prefix. These functions are wrappers around the par-ticular Wikipedia or Wikidata API calls, or the PageviewsAPI respectively. Figure 1 shows exemplary output for theEnglish Wikipedia article https://en.wikipedia.org/wiki/Berlin

and the English Wikipedia category https://en.wikipedia.org/

wiki/Category:Berlin. The functions are listed below.

WIKITRANSLATE Returns Wikipedia translations (languagelinks) for a Wikipedia article.

WIKISYNONYMS Returns Wikipedia synonyms (redirects) fora Wikipedia article.

WIKIEXPAND Returns Wikipedia translations (language links)and synonyms (redirects) for a Wikipedia article.

WIKICATEGORYMEMBERS Returns Wikipedia category mem-bers for a Wikipedia category.

WIKISUBCATEGORIES Returns Wikipedia subcategories fora Wikipedia category.

WIKIINBOUNDLINKS Returns Wikipedia inbound links fora Wikipedia article.

WIKIOUTBOUNDLINKS Returns Wikipedia outbound links fora Wikipedia article.

WIKIMUTUALLINKS Returns Wikipedia mutual links, i.e, theintersection of inbound and outbound links for a Wiki-pedia article.

WIKIGEOCOORDINATES Returns Wikipedia geocoordinates fora Wikipedia article.

WIKIDATAFACTS Returns Wikidata facts for a Wikipediaarticle.

WIKIPAGEVIEWS Returns Wikipedia pageviews statistics fora Wikipedia article.

WIKIPAGEEDITS Returns Wikipedia pageedits statistics fora Wikipedia article.

Most functions directly wrap native API calls, with threeexceptions: (i) the functionality of the WIKISYNONYMS andthe WIKITRANSLATE functions is combined in the WIKIEXPANDfunction, both the WIKITRANSLATE and the WIKIEXPAND func-tion accept an optional target languages parameter that al-lows for limiting the output to just a subset of all availableWikipedia languages; (ii) the function WIKIMUTUALLINKS isthe intersection of the two functions WIKIINBOUNDLINKS andWIKIOUTBOUNDLINKS; and (iii) the function WIKIDATAFACTS

provides a list of claims [11] (or facts), enriched with en-tity and property labels for improved readability, limited tosingle-value objects, and simplified using an adapted versionof Maxime Lathuiliere’s simplifyClaims function5 from hisWikidata SDK [6]. This allows us to return two columns—in RDF [2] terms “predicate” and “object” pairs—with oneunique object, for example, the predicate ISO 3166-2 code

with the object DE-BE, and deliberately discarding multi-value claims, for example, predicate head of government

with objects Michael Müller and Klaus Wowereit, amongmany others. While in the concrete example the orderingis clear (temporal), this is not true in the general case,for example, with predicate instance of. As a result, inWIKIDATAFACTS, we prefer indisputability of claims over theircompleteness. Listing 3 exemplarily shows the complete im-plementation of the WIKISYNONYMS function.

5Wikidata SDK simplifyClaims function: https://github.com/maxlath/wikidata-sdk#simplify-claims-results

/*** Returns Wikipedia synonyms

* @param {string} article The Wikipedia article

* @return {Array<string>} The list of synonyms

*/function WIKISYNONYMS(article) {’use strict’;if (!article) {return ’’;

}var results = [];try {var language = article.split(/:(.+)?/)[0];var title = article.split(/:(.+)?/)[1];if (!title) {

return ’’;}title = title.replace(/\s/g, ’_’);var url = ’https://’ + language +

’.wikipedia.org/w/api.php’ +’?action=query’ +’&blnamespace=0’ +’&list=backlinks’ +’&blfilterredir=redirects’ +’&bllimit=max’ +’&format=xml’ +’&bltitle=’ +encodeURIComponent(title);

var xml = UrlFetchApp.fetch(url).getContentText();

var document = XmlService.parse(xml);var entries = document.getRootElement()

.getChild(’query’).getChild(’backlinks’)

.getChildren(’bl’);for (var i = 0; i < entries.length; i++) {

var text = entries[i].getAttribute(’title’).getValue();

results[i] = text;}

} catch (e) {// no-op

}return results.length > 0 ? results : ’’;

}

Listing 3: Implementation of WIKISYNONYMS.

3. USAGE SCENARIOSWe have tested the Wikipedia Tools for Google Spreadsheets

with different usage scenarios in mind. These include, butare not limited to, the ones listed in the following.

3.1 Usage Scenario I: Ordered Category PanelWikipedia holds an enormous amount of categories, for

example, visitor attractions in Montreal.6 Category membersobtained through a call of WIKICATEGORYMEMBERS are listedin alphabetical order, however, if we additionally requestpageviews data for each category member through a seriesof WIKIPAGEVIEWS calls and then sort by pageviews in de-scending order, we get a representative list of top-10 visitorattractions—enriched with photos retrieved through calls ofWIKIDATAFACTS filtered on “image”—as shown in Figure 2.A similar feature (based on non-disclosed metrics) in form

6Visitor attractions in Montreal: https://en.wikipedia.org/wiki/Category:Visitor attractions in Montreal

Page 3: Wikipedia Tools for Google Spreadsheets

WIK

ISYN

ON

YMS(

"en:

Ber

lin"

WIK

ISU

BC

ATE

GO

RIE

S("e

n:C

ateg

ory:

Ber

lin"

WIK

ICA

TEG

OR

YMEM

BER

S("e

n:C

ateg

ory:

Ber

linWIKIGEOCOORDINATES("en:Berlin")

WIK

IINB

OU

ND

LIN

KS(

"en:

Ber

lin")

WIK

IOU

TBO

UN

DLI

NK

S("e

n:B

erlin

")W

IKIM

UTU

ALL

INK

S("e

n:B

erlin

")W

IKIP

AG

EVIE

WS(

"en:

Ber

lin",

TO

DA

Y() -

7, T

OD

AY(

))

abБе

рлин

City

Ber

linde

fren

Cat

egor

y:B

erlin

eP

rixB

erlin

52.51666667

13.3

8333

333

Alb

ert E

inst

ein

1. F

C U

nion

Ber

linA

lber

t Ein

stei

nJa

nuar

y 24

, 201

648

96Ja

nuar

y 23

, 201

6-3

05hi

ghes

t poi

ntA

rken

berg

e

ace

Ber

linB

erlin

, Ger

man

yB

erlin

Ber

linB

erlin

Cat

egor

y:B

uild

ings

and

stru

ctur

es in

Ber

linFr

ee S

eces

sion

Ank

ara

1896

Sum

mer

Oly

mpi

csA

nkar

aJa

nuar

y 23

, 201

643

20Ja

nuar

y 23

, 201

6-6

topi

c's

mai

n W

ikim

edia

por

tal

Por

tal:B

erlin

afB

erly

nC

apita

l of E

ast G

erm

any

Frei

e B

ühne

(Zei

tsch

rift)

DE

3C

ityB

erlin

Cat

egor

y:C

rime

in B

erlin

Inte

lexi

tA

mst

erda

m19

00 S

umm

er O

lym

pics

Am

ster

dam

Janu

ary

22, 2

016

4685

Janu

ary

23, 2

016

6le

gisl

ativ

e bo

dyA

bgeo

rdne

tenh

aus

of B

erlin

akB

erlin

DE

BE

RLa

nd B

erlin

Ber

lin, G

erm

any

Cat

egor

y:C

ultu

re in

Ber

linN

ew S

eces

sion

Aud

i19

04 S

umm

er O

lym

pics

Aar

hus

Janu

ary

21, 2

016

4954

Janu

ary

22, 2

016

-1hi

ghes

t jud

icia

l aut

horit

yC

onst

itutio

nal C

ourt

of th

e S

tate

of B

erlin

als

Ber

linU

N/L

OC

OD

E:D

EB

ER

DE

-BE

Cap

ital o

f Eas

t Ger

man

Cat

egor

y:E

cono

my

of B

erlin

Alfo

ns M

aria

Jak

ob19

08 S

umm

er O

lym

pics

Ath

ens

Janu

ary

20, 2

016

5086

Janu

ary

22, 2

016

1co

untry

Ger

man

y

amበርሊን

Ber

lin-Z

entru

mW

illi F

reita

gD

EB

ER

Cat

egor

y:E

duca

tion

in B

erlin

Apr

il 12

1912

Sum

mer

Oly

mpi

csA

ache

nJa

nuar

y 19

, 201

652

11Ja

nuar

y 19

, 201

6-2

0D

ewey

Dec

imal

Cla

ssifi

catio

n2-

-431

55

anB

erlín

Ber

libR

oger

Ros

smei

slU

N/L

OC

OD

E:D

EB

ER

Cat

egor

y:G

eogr

aphy

of B

erlin

Aar

hus

1916

Sum

mer

Oly

mpi

csA

lexa

nder

plat

zJa

nuar

y 18

, 201

647

79IS

NI

0000

000

1 23

41 9

654

ang

Ber

linLa

nd B

erlin

Elis

abet

h C

onco

rdia

Cro

laB

erlin

-Zen

trum

Cat

egor

y:H

ealth

care

in B

erlin

Ant

isem

itism

1920

Sum

mer

Oly

mpi

csA

rcha

eopt

eryx

flag

flag

of B

erlin

arين

رلب

Ber

lin.d

eFe

licie

Ber

nste

inB

erlib

Cat

egor

y:H

isto

ry o

f Ber

linFo

reig

n re

latio

ns o

f Aze

rbai

jan

1920

s B

erlin

Atla

nta

flag

imag

eFl

ag o

f Ber

lin.s

vg

arc

ܝܢܪܠ

ܒB

erlin

(Ger

man

y)C

arl B

erns

tein

(Kun

stsa

mm

ler)

Land

Ber

linC

ateg

ory:

Ber

lin-r

elat

ed li

sts

Fore

ign

rela

tions

of A

rmen

ia19

24 S

umm

er O

lym

pics

Bon

nco

at o

f arm

s im

age

Coa

t of a

rms

of B

erlin

.svg

arz

ينرل

بيFe

dera

l Sta

te o

f Ber

linC

arl L

ange

nsch

eidt

Ber

lin.d

eC

ateg

ory:

Org

anis

atio

ns b

ased

in B

erlin

Acc

ordi

on19

28 S

umm

er O

lym

pics

Ber

linIS

O 3

166-

2 co

deD

E-B

E

ast

Ber

línC

ity o

f Ber

linJö

rg H

artm

ann

(Mau

erop

fer)

Ber

lin (G

erm

any)

Cat

egor

y:P

eopl

e fro

m B

erlin

Ale

iste

r Cro

wle

y19

32 S

umm

er O

lym

pics

Bru

ssel

sO

penS

treet

Map

Rel

atio

n id

entif

i624

22

ayB

erlin

His

toric

al s

ites

in b

erlin

Loth

ar S

chle

usen

erFe

dera

l Sta

te o

f Ber

linC

ateg

ory:

Pol

itics

of B

erlin

Alte

rnat

e hi

stor

y19

36 S

umm

er O

lym

pics

Bav

aria

Com

mon

s ca

tego

ryB

erlin

azB

erlin

Max

Jaf

feM

ark

Lehm

sted

tC

ity o

f Ber

linC

ateg

ory:

Rel

igio

n in

Ber

linA

then

s19

40 S

umm

er O

lym

pics

Bra

nden

burg

coat

of a

rms

Coa

t of a

rms

of B

erlin

azb

ينرل

بS

ilico

n A

llee

Eck

hard

Weh

age

His

toric

al s

ites

in b

erlin

Cat

egor

y:S

port

in B

erlin

Ala

n K

ay19

44 S

umm

er O

lym

pics

Bun

dest

agpo

stal

cod

e10

115–

1419

9

baБе

рлин

Spr

eeat

hen

Chr

iste

l Weh

age

Max

Jaf

feC

ateg

ory:

Tour

ism

in B

erlin

Alb

ert t

he B

ear

1948

Sum

mer

Oly

mpi

csB

arce

lona

loca

l dia

ling

code

030

bar

Ber

linA

then

s on

the

Spr

eeB

rigitt

e M

atsc

hins

ky-D

enni

ngho

ffS

ilico

n A

llee

Cat

egor

y:Tr

ansp

ort i

n B

erlin

Aac

hen

1952

Sum

mer

Oly

mpi

csB

aku

VIA

F id

entif

ier

1225

3098

0

bat-s

mg

Ber

līns

Cui

sine

of B

erlin

Luci

e B

erlin

Spr

eeat

hen

Cat

egor

y:V

isito

r attr

actio

ns in

Ber

linA

lexa

nder

III o

f Rus

sia

1956

Sum

mer

Oly

mpi

csB

asel

GN

D id

entif

ier

4005

728-

8

bcl

Ber

linO

tto F

reita

g (F

ußba

llspi

eler

)A

then

s on

the

Spr

eeC

ateg

ory:

Wor

ks s

et in

Ber

linA

lexa

nder

of A

phro

disi

as19

60 S

umm

er O

lym

pics

Cop

enha

gen

licen

ce p

late

cod

eB

beГо

рад

Берл

інJe

an-R

odriq

ue F

unke

Cui

sine

of B

erlin

Cat

egor

y:W

ikip

edia

boo

ks o

n B

erlin

Alg

iers

1964

Sum

mer

Oly

mpi

csC

entra

l Eur

ope

loca

ted

in th

e ad

min

istra

tive

terr

Ger

man

y

be-x

-old

Бэрл

інW

olfg

ang

Pre

ußC

ateg

ory:

Imag

es o

f Ber

linA

rt D

eco

1968

Sum

mer

Oly

mpi

csC

olog

neen

clav

e w

ithin

Bra

nden

burg

bgБе

рлин

Sve

n G

iller

tC

ateg

ory:

Ber

lin s

tubs

Ach

aean

s (H

omer

)19

72 S

umm

er O

lym

pics

Dub

linto

pic'

s m

ain

cate

gory

Cat

egor

y:B

erlin

biB

erlin

Frei

e H

anse

stad

t Ber

linC

ateg

ory:

Ber

lin te

mpl

ates

Ale

xand

er G

roth

endi

eck

1976

Sum

mer

Oly

mpi

csE

urop

eW

ikiv

oyag

e ba

nner

Ber

lin b

anne

r 2.jp

g

bmB

erlin

Aug

ustin

Ter

wes

ten

Aug

uste

and

Lou

is L

umiè

re19

80 S

umm

er O

lym

pics

Eur

opea

n U

nion

offic

ial w

ebsi

teht

tp://

ww

w.b

erlin

.de/

bnবারল

িনLu

cius

Rei

chlin

gA

nton

Dre

xler

1983

Wor

ld C

ham

pion

ship

s in

Ath

letic

sE

rfurt

Mus

icB

rain

z ar

ea ID

c9ac

1239

-e83

2-41

bc-9

930-

e252

a1fd

110

boཔ

ར་ལ

ན།S

ayed

Kam

alA

lban

Ber

g19

84 S

umm

er O

lym

pics

Eas

t Ber

linG

erm

an m

unic

ipal

ity k

ey11

0000

00

brB

erlin

Iwan

Kut

iske

rH

ouse

of A

scan

ia19

87 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

Elb

ląg

LAU

1100

0000

bsB

erlin

Sta

dt B

erlin

Aur

ochs

1988

Sum

mer

Oly

mpi

csC

inem

a of

Ger

man

yN

DL

iden

tifie

r00

6291

94

bxr

Берл

инW

ilhel

m O

lsch

ewsk

i jun

ior

Ale

xand

erpl

atz

1991

Wor

ld C

ham

pion

ship

s in

Ath

letic

sFr

ankf

urt

Free

base

iden

tifie

r/m

/015

6q

caB

erlín

Ant

on H

errn

feld

Abb

ahu

1992

Sum

mer

Oly

mpi

csFl

oren

ceLC

Aut

h id

entif

ier

n790

3497

2

cbk-

zam

Ber

línD

onat

Her

rnfe

ldA

ndre

as S

chlü

ter

1993

Wor

ld C

ham

pion

ship

s in

Ath

letic

sG

othe

nbur

gFI

PS

10-

4 (c

ount

ries

and

regi

onG

M16

cdo

Bái

k-lìn

gTo

uris

mus

in B

erlin

Arc

haeo

pter

yx19

95 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

Ger

man

yB

nF id

entif

ier

1529

8132

w

ceБе

рлин

Elli

e Ti

chau

erA

tlant

a19

96 S

umm

er O

lym

pics

Gda

ńsk

ince

ptio

n+1

237-

01-0

1T00

:00:

00Z

ceb

Ber

linE

nerg

ieve

rsor

gung

von

Ber

linA

Dol

l's H

ouse

1997

Wor

ld C

ham

pion

ship

s in

Ath

letic

sG

erm

an U

nity

Day

cate

gory

for p

eopl

e bo

rn h

ere

Cat

egor

y:P

eopl

e bo

rn in

Ber

lin

ckb

ينرل

بەS

tefa

n La

mpr

echt

Bon

n19

99 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

Ger

man

cui

sine

cate

gory

for p

eopl

e w

ho d

ied

hC

ateg

ory:

Dea

ths

in B

erlin

coB

erlin

uC

harlo

tte M

arqu

ardt

Ber

lin2.

Fuß

ball-

Bun

desl

iga

Ger

man

Em

pire

Geo

Nam

es ID

2950

157

crh

Ber

linB

erm

uda

2000

Sum

mer

Oly

mpi

csH

ambu

rgfo

llow

sA

lt-B

erlin

csB

erlín

Fore

ign

rela

tions

of B

rune

i20

01 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

Han

seat

ic L

eagu

eca

tego

ry fo

r film

s sh

ot a

t thi

s lo

Cat

egor

y:Fi

lms

shot

in B

erlin

csb

Ber

lëno

Bru

ssel

s20

03 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

Han

over

cate

gory

of a

ssoc

iate

d pe

ople

Cat

egor

y:P

eopl

e fro

m B

erlin

cuБє

рлин

ъB

avar

ia20

04 S

umm

er O

lym

pics

Bud

apes

tca

tego

ry o

f peo

ple

burie

d he

reC

ateg

ory:

Bur

ials

in B

erlin

by

plac

e

cvБе

рлин

Bra

nden

burg

2005

Wor

ld C

ham

pion

ship

s in

Ath

letic

sB

ucha

rest

SE

LIB

R16

1170

cyB

erlin

Bun

dest

ag20

06 F

IFA

Wor

ld C

upE

aste

rn E

urop

eG

erm

an re

gion

al k

ey11

daB

erlin

Bau

haus

2006

FIF

A W

orld

Cup

Fin

alD

resd

enaw

ard

rece

ived

Prin

cess

of A

stur

ias

Aw

ard

- con

cord

deB

erlin

Bat

tle o

f Ram

illie

s20

07 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

Aug

sbur

gde

scrib

ed b

y so

urce

Cat

holic

Enc

yclo

pedi

a

diq

Ber

linB

arry

Lyn

don

2008

Sum

mer

Oly

mpi

csA

vign

onof

fice

held

by

head

of g

over

nmG

over

ning

May

or o

f Ber

lin

dsb

Bar

lińB

ear

2009

Wor

ld C

ham

pion

ship

s in

Ath

letic

sFr

eder

ick

I, E

lect

or o

f Bra

nden

burg

LAC

iden

tifie

r00

53C

1712

eeB

erlin

Bar

celo

na20

11 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

Fred

eric

k II,

Ele

ctor

of B

rand

enbu

rgN

LA (A

ustra

lia) i

dent

ifier

3655

9094

elΒ

ερολ

ίνο

Bric

k20

12 S

umm

er O

lym

pics

1936

Sum

mer

Oly

mpi

csN

LI (I

srae

l) id

entif

ier

0009

7494

7

eml

Ber

lîṅB

alts

2013

Wor

ld C

ham

pion

ship

s in

Ath

letic

sD

aim

ler A

GB

NE

iden

tifie

rX

X45

1163

eoB

erlin

oB

DS

M20

15 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

Hei

delb

erg

area

+891

.68

esB

erlín

Bak

u20

16 S

umm

er O

lym

pics

hist

ory

of to

pic

His

tory

of B

erlin

etB

erlii

nB

ritis

h M

useu

m20

17 W

orld

Cha

mpi

onsh

ips

in A

thle

tics

FAS

T-ID

1204

829

euB

erlin

Bam

berg

2019

Wor

ld C

ham

pion

ship

s in

Ath

letic

sC

omm

ons

galle

ryB

erlin

ext

Ber

línB

asel

2020

Sum

mer

Oly

mpi

csFa

cebo

ok P

lace

s ID

1111

7511

8906

315

WIK

ITR

AN

SLA

TE("

en:B

erlin

")W

IKIE

XPA

ND

("en

:Ber

lin",

{"de

"; "

fr"}

)W

IKIP

AG

EED

ITS(

"en:

Ber

lin",

TO

DA

Y() -

7, T

OD

AY(

))W

IKID

ATA

FAC

TS("

en:B

erlin

")

Fig

ure

1:

Exam

ple

outp

ut

for

each

funct

ion

inth

eW

ikipedia

Tools

forGoogleSpreadsheets

(cro

pp

ed).

Liv

esp

readsh

eet:

https://goo.gl/yvbmex

.

of an image carousel can be seen in Google’s KnowledgeGraph [10] Web search results pages when searching for“vis-itor attractions in montreal” (demo https://goo.gl/Ugt0je).

3.2 Usage Scenario II: Search AdsSearch advertisers can greatly profit from the information

that is contained in Wikipedia and Wikidata. For exam-ple, if we imagine a hotel booking site, it may be desir-able to advertise based on points of interest (POIs) and cre-ate advertisements automatically featuring known facts ofsuch POIs. Figure 3 shows an example where skyscraperslisted in the category skyscrapers over 350 meter7 are first ob-tained via WIKICATEGORYMEMBERS and then checked for their“height” fact via WIKIDATAFACTS, which is then used in twotemplates to create ads. Search keywords are generated bycalling WIKISYNONYMS and combined with terms like “hotel”.

3.3 Usage Scenario III: Marketing CampaignsOn January 13, 2016, Google Maps added Street View

imagery for the model railway Miniatur Wunderland.8 Tak-ing global Wikipedia pageviews as a popularity indicator,we can examine if the marketing campaign has had anyimpact on the attraction, assuming that more pageviewstranslate to increased visitor interest. Therefore, we firstobtain the Miniatur Wunderland article in all available lan-guages via WIKITRANSLATE and then retrieve pageviews viaWIKIPAGEVIEWS. Figure 4 shows indeed an international up-take of pageviews starting January 13 after an earlier linearcurve progression (except for the German article, which hada peak on January 8, a long weekend after a public holiday).

4. RELATED WORKIn his book Google Apps Script for Beginners [4], Gabet

gives an introduction to extending Google Spreadsheets withcustom functions. A similar introduction is given in Fer-reira’s Google Apps Script: Web Application Development Es-

sentials [3]. In [5], Han et al. describe their approach RDF123

to translate spreadsheets data to RDF, the inverse of whatwe do in WIKIDATAFACTS. Olsen and Moser show in [8] howWeb APIs can be taught with spreadsheets. The process ofcalling Web APIs via spreadsheets is further described in [9].Further, in [1], Abramson et al. describe how they enabledspreadsheets to have“super-computing”powers through par-allelized custom functions. An open-source toolkit for min-ing Wikipedia—not bound to spreadsheets, but designed forgeneral use with the Java programming language—is de-scribed by Milne et al. in [7].

5. CONCLUSIONS AND FUTURE WORKIn this paper, we have introduced the Wikipedia Tools for

Google Spreadsheets. First, we have introduced the data sour-ces Wikipedia and Wikidata and their different APIs. Sec-ond, we have shown how Google Spreadsheets can be ex-tended through custom functions that can then be used fromwithin a cell context as if they were native functions. In thefollowing, we have listed the implemented functions, and ex-plained where they extend the functionality of the underly-

7Skyscrapers over 350 meter: https://en.wikipedia.org/wiki/Category:Skyscrapers over 350 meters8Miniatur Wunderland on Google Street View:https://www.google.com/maps/about/behind-the-scenes/streetview/treks/miniatur-wunderland/

Page 4: Wikipedia Tools for Google Spreadsheets

DIY Knowledge GraphFile Edit View Insert Format Data Tools Add­ons Help All changes saved in Drive

$ % 123 

Arial   10                 

=IFERROR(SUM(QUERY(WIKIPAGEVIEWS("en:"&A2, TODAY() - 30, TODAY()), "SELECT Col2")), "")

 

 

DIY Knowledge GraphComments   Share

[email protected]

     Sheet1

Figure 2: Usage scenario I: Wikipedia Tools for GoogleSpreadsheets used to create an ordered category panel basedon Wikipedia category memberships and accumulated Wiki-pedia pageviews for popularity ranking (here: the top-10visitor attractions in Montreal). Live spreadsheet: https:

//goo.gl/Njvt1T.

AdWords AdsFile Edit View Insert Format Data Tools Add­ons Help All changes saved in Drive

$ % 123 

Arial   10                 

=ARRAYFORMULA(LOWER(WIKISYNONYMS("en:"&C$2)&" hotel"))

 

 

AdWords AdsComments   Share

[email protected]

     Sheet1

Figure 3: Usage scenario II: Wikipedia Tools for GoogleSpreadsheets used to create textual search ads based onWikidata facts (here: skyscraper heights) and Wikipediasynonyms as keywords combined with the term“hotel”. Livespreadsheet: https://goo.gl/np1Is8.

Miniatur WunderlandFile Edit View Insert Format Data Tools Add­ons Help All changes saved in Drive

$ % 123 

Arial   10                 

=IF(ISBLANK(C$2), "", QUERY(WIKIPAGEVIEWS(C$2, TODAY() - $B$1, TODAY()), "SELECT Col2"))

 

da:Miniatur W…de:Miniatur W…en:Miniatur W…es:Miniatur­W…fa:سرزمين عجای…fi:Miniatur Wu…fr:Miniatur­Wu…hu:Miniatur W…

1/2Dec 29, 2015 Jan 5, 2016 Jan 12, 2016 Jan 19, 2016

0

500

1000

1500

2000

Date

Jan 13, 2016en:Miniatur Wunderland: 1487

 

Miniatur WunderlandComments   Share

[email protected]

     ExploreSheet1

Figure 4: Usage scenario III: Wikipedia Tools for GoogleSpreadsheets used to evaluate the impact of a marketingcampaign (here: model railway Miniatur Wunderland beingfeatured on Google Street View since January 13, 2016).Live spreadsheet: https://goo.gl/q1yhuV.

ing wrapped API functions. We have then focused on threedifferent usage scenarios that illustrate how to work withthe Wikipedia Tools for Google Spreadsheets and finally haveprovided an overlook on related work in the area.

Future work will focus on adding more functions as needbe and potentially making the functions more parameteri-zable. In the current iteration, we have favored simplicityand ease of use over customizability, essentially making themost common use case the only option. Possibly, in up-coming releases, we will add an advanced mode that allowsexperienced users to fine-tune the functions’ results, for ex-ample, to implicitly include bot traffic in WIKIPAGEVIEWS

that we have currently excluded on purpose.Concluding, we were positively surprised by the increased

productivity and short turnaround time enabled by the Wiki-

pedia Tools for Google Spreadsheets for the rapid prototypingof ideas, especially in combination with the fill-down andfill-right features in spreadsheets and the charting capabili-ties. We look forward to making the tools even more pow-erful and hope to attract collaborators for the open sourceproject available on GitHub at https://github.com/tomayac/

wikipedia-tools-for-google-spreadsheets. As a positive side ef-fect, the tools can even help improve Wikipedia and Wiki-data when authors add missing data, for example, we addedan image to one of the visitor attractions of Montreal, as thisfact was initially missing in Wikidata (and thus in Figure 2).

6. REFERENCES[1] D. Abramson, L. Kotler, D. Mather, and P. Roe.

ActiveSheets: Super-Computing with Spreadsheets. InU. Seattle, editor, Proceedings of the HighPerformance Computing Symposium – HPC 2001,pages 110–115, San Diego, USA, 2001.

[2] R. Cyganiak, D. Wood, and M. Lanthaler. RDF 1.1Concepts and Abstract Syntax. Recommendation,W3C, Feb. 2014.

[3] J. Ferreira. Google Apps Script: Web ApplicationDevelopment Essentials. O’Reilly Media, 2014.

[4] S. Gabet. Google Apps Script for Beginners. PacktPublishing, 2014.

[5] L. Han, T. Finin, C. Parr, J. Sachs, and A. Joshi.RDF123: From Spreadsheets to RDF. In TheSemantic Web – ISWC 2008, volume 5318 of LNCS,pages 451–466. Springer, 2008.

[6] M. Lathuiliere. Wikidata SDK, 2016.https://github.com/maxlath/wikidata-sdk (2016-02-08).

[7] D. Milne and I. H. Witten. An Open-Source Toolkitfor Mining Wikipedia. Artificial Intelligence,194:222–239, Jan. 2013.

[8] T. Olsen and K. Moser. Teaching Web APIs inIntroductory and Programming Classes: Why andHow. Paper 16, SIGED: IAIM Conference, Feb. 2013.

[9] K. Patel, S. Prish, S. Sadhu, L. Bizek, and X. Pan.Spreadsheet Functions to Call REST API Sources,May 15 2014. US Patent App. 13/672,704.

[10] A. Singhal. “Introducing the Knowledge Graph:things, not strings”, Official Google Blog, May 2012.http://googleblog.blogspot.com/2012/05/

introducing-knowledge-graph-things-not.html.

[11] D. Vrandecic and M. Krotzsch. Wikidata: A FreeCollaborative Knowledgebase. Commun. ACM,57(10):78–85, Sept. 2014.