Google Query Language -- a DSL for Advanced Google Searching
Xiaoqing Wu. Advisor: Dr. Barrett R. Bryant. Department of Computer and Information Science. 03/04/2005


Page 1

Google Query Language -- a DSL for Advanced Google Searching

Xiaoqing Wu
Advisor: Dr. Barrett R. Bryant

Department of Computer and Information Science
03/04/2005

Page 2

Background

• PhD research: Compiler Development Environment (CDE)
  – Automatic generation of compilers, interpreters, and integrated development environments
  – Several domain-specific languages have been developed on top of CDE
• GQL: an application based on CDE, built on an analogy
  – Internet -- Database
  – Google -- Database Management System (DBMS)
  – GQL -- Structured Query Language (SQL)

Page 3

Google: more than keyword searching

• Language preference
• File format, date, occurrences, domain
• Image, forum, shopping search

Page 4

Query customization in Google

• Filling forms
• Writing meta-tokens directly
  – allintext: Xiaoqing Wu filetype:pdf
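Writing meta-tokens directly amounts to concatenating operator-prefixed terms into one query string. The following is a minimal sketch of that assembly; `allintext:` and `filetype:` come from the slide, while the helper name `meta_token_query` and its parameters are hypothetical.

```python
def meta_token_query(keywords, occurrence=None, filetype=None):
    """Assemble a raw Google query string from meta-tokens (illustrative helper)."""
    parts = []
    if occurrence:
        # e.g. "allintext:" asks for matches in the page body
        parts.append(f"{occurrence}:")
    parts.extend(keywords)
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


print(meta_token_query(["Xiaoqing", "Wu"], occurrence="allintext", filetype="pdf"))
```

The call reproduces the string shown on the slide. Nothing in such a string is checked before it reaches Google, which is the gap the following slides address.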

Page 5

Why GQL (I)?

• Forms are not flexible
  – Fixed
  – Can’t be saved and reused
  – Filling multiple forms is time-consuming
  – Mouse operation is slower than keyboard operation

Page 6

Why GQL (II)?

• Meta-tokens are not designed for end-users
  – Not user friendly
  – No syntax provided
  – No type-checking
  – Ambiguous

keyword1 keyword3 OR keyword4 "keyword2"
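One way to see the ambiguity in the query above: without a published grammar, a reader cannot tell how far `OR` reaches. The sketch below encodes two plausible readings as predicates over the set of words on a page and shows that they disagree. The function names are hypothetical, and the sketch does not claim which reading Google applies.

```python
def or_binds_neighbours(words):
    # Reading 1: keyword1 AND (keyword3 OR keyword4) AND "keyword2"
    return (
        "keyword1" in words
        and ("keyword3" in words or "keyword4" in words)
        and "keyword2" in words
    )


def or_splits_query(words):
    # Reading 2: (keyword1 AND keyword3) OR (keyword4 AND "keyword2")
    return ("keyword1" in words and "keyword3" in words) or (
        "keyword4" in words and "keyword2" in words
    )


page = {"keyword1", "keyword3"}
print(or_binds_neighbours(page), or_splits_query(page))  # False True
```

The same page is rejected under the first reading and accepted under the second, so the written query alone does not determine the result set.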

Page 7

GQL: A well-formed DSL

• User-friendly grammar
  – Natural, SQL-like syntax rules, easy to follow
  – No ambiguity
• IDE support
  – Automatic syntax and type checking
• Program-based query
  – Queries can be saved and reused
  – Search from old queries
• Flexible: numerous forms!

Page 8

No more forms!

search {key}* from file where {constraint}*

Page 9

Demo

Page 10

GQL Syntax Grammar

[1] query ::= SEARCH|IMAGE o_keylist occurrence constraints withinstmt
[2] o_keylist ::= keylist |
[3] keylist ::= key | keylist COMMA key
[4] key ::= word | noword | orwordlist | exactword
[5] word ::= STRING
[6] noword ::= NOT word
[7] orwordlist ::= orword OR orword | orwordlist OR orword
[8] orword ::= word | exactword
[9] exactword ::= QSTRING
[10] occurrence ::= FROM OCCVALUE |
[11] constraints ::= WHERE constraintlist |
[12] constraintlist ::= constraint | constraintlist constraint
[13] constraint ::= domain | filetype
[14] domain ::= indomain | outdomain
[15] indomain ::= DOMAIN EQ url
[16] outdomain ::= DOMAIN NE url
[17] url ::= QSTRING
[18] filetype ::= acceptfiletype | rejectfiletype
[19] acceptfiletype ::= TYPE EQ TYPEVALUE
[20] rejectfiletype ::= TYPE NE TYPEVALUE
[21] withinstmt ::= WITHIN QSTRING |
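To make the grammar concrete, here is a hedged sketch of a recursive-descent translator for rules 1-20 (the `WITHIN` clause is left out). It is not the CDE-generated compiler. The slides name token classes but not their spelling, so the lexemes (`search`, `not`, `or`, `=`, `!=`, `,`), the accepted `OCCVALUE` words, and the output mapping to `allintitle:`, `site:`, `filetype:` and a leading `-` are all assumptions.

```python
import re

# Assumed lexemes: quoted string, operator, or bare word.
TOKEN_RE = re.compile(r'\s*(?:"([^"]*)"|(!=|=|,)|([^\s",=!]+))')
KEYWORDS = {"search", "image", "from", "where", "not", "or", "domain", "type"}
OCCURRENCE = {"title": "allintitle:", "text": "allintext:", "url": "allinurl:"}


def tokenize(src):
    """Split a query into (kind, text) pairs."""
    tokens, pos, src = [], 0, src.strip()
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        quoted, op, word = m.groups()
        if quoted is not None:
            tokens.append(("QSTRING", quoted))
        elif op:
            tokens.append(("OP", op))
        elif word.lower() in KEYWORDS:
            tokens.append((word.upper(), word))
        else:
            tokens.append(("STRING", word))
        pos = m.end()
    return tokens


def compile_gql(src):
    """Translate a GQL query into a Google meta-token string."""
    toks, i = tokenize(src), 0

    def peek():
        return toks[i][0] if i < len(toks) else None

    def take(kind):
        nonlocal i
        if peek() != kind:
            raise SyntaxError(f"expected {kind}, found {peek()}")
        i += 1
        return toks[i - 1][1]

    def orword():  # rule 8
        if peek() == "QSTRING":
            return f'"{take("QSTRING")}"'
        return take("STRING")

    def key():  # rules 4-9
        if peek() == "NOT":
            take("NOT")
            return "-" + take("STRING")
        parts = [orword()]
        while peek() == "OR":
            take("OR")
            parts.append(orword())
        return " OR ".join(parts)

    # Rule 1; this sketch accepts IMAGE but does not record the choice.
    take("IMAGE" if peek() == "IMAGE" else "SEARCH")
    out = []
    if peek() in ("STRING", "QSTRING", "NOT"):  # rules 2-3
        out.append(key())
        while peek() == "OP" and toks[i][1] == ",":
            take("OP")
            out.append(key())
    if peek() == "FROM":  # rule 10
        take("FROM")
        out.insert(0, OCCURRENCE[take("STRING").lower()])
    if peek() == "WHERE":  # rules 11-20
        take("WHERE")
        if peek() not in ("DOMAIN", "TYPE"):
            raise SyntaxError("WHERE needs at least one constraint")
        while peek() in ("DOMAIN", "TYPE"):
            field = peek()
            take(field)
            op = take("OP")
            if op == ",":
                raise SyntaxError("expected = or !=")
            negate = "-" if op == "!=" else ""
            if field == "DOMAIN":
                out.append(f'{negate}site:{take("QSTRING")}')
            else:
                out.append(f'{negate}filetype:{take("STRING")}')
    if i != len(toks):
        raise SyntaxError(f"unexpected input: {toks[i][1]!r}")
    return " ".join(out)


print(compile_gql('search compiler, "domain specific" or DSL, not java '
                  'from title where domain = "example.com" type != pdf'))
```

For this input the sketch emits `allintitle: compiler "domain specific" OR DSL -java site:example.com -filetype:pdf`. Because a comma separates keys and `or` only joins adjacent words within one key, the grouping is fixed by the grammar rather than left to the reader.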

Page 11

GQL IDE structure

[Figure: inside the GQL IDE, a query program is passed to the GQL compiler, which emits Google-recognizable tokens. The tokens are sent to the Google search engine, and the query result is returned to the IDE.]
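The step from Google-recognizable tokens to the search engine can be sketched as URL construction. This is an assumption about one possible transport, not a description of how the GQL IDE talks to Google: the helper name `search_url` is hypothetical, and the `tbm=isch` image-search parameter reflects present-day Google URLs rather than anything in the slides.

```python
from urllib.parse import urlencode


def search_url(tokens, image=False):
    """Build a Google search URL for an already-compiled token string."""
    params = {"q": tokens}
    if image:
        params["tbm"] = "isch"  # assumed image-search switch
    return "https://www.google.com/search?" + urlencode(params)


print(search_url("allintext: Xiaoqing Wu filetype:pdf"))
```

`urlencode` percent-encodes the colons and turns spaces into `+`, so the compiler output can stay human-readable until the last step.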

Page 12

Compiler implementation in CDE

[Figure: the GQL specification is processed by the TLG compiler, which produces a JLex specification, a CUP specification, AST nodes, and type-checking and code-generation aspects in AspectJ. JLex generates the lexer in Java, CUP generates the parser in Java, and aspect weaving combines these parts into the GQL compiler.]
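The separation shown in the diagram, with type checking and code generation kept apart from the AST node classes, can be loosely imitated without AspectJ. The sketch below uses Python's `functools.singledispatch` as a rough analogue of that separation; it is not aspect weaving. The node classes, the `KNOWN_TYPES` set standing in for `TYPEVALUE`, and the domain check are all assumptions.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Domain:  # rules 14-17
    url: str
    accept: bool


@dataclass
class FileType:  # rules 18-20
    value: str
    accept: bool


KNOWN_TYPES = {"pdf", "ps", "doc", "xls", "ppt", "rtf"}  # assumed TYPEVALUE set


@singledispatch
def typecheck(node):
    raise TypeError(f"no type rule for {type(node).__name__}")


@typecheck.register
def _(node: Domain):
    if "." not in node.url or " " in node.url:
        raise ValueError(f"not a domain: {node.url!r}")


@typecheck.register
def _(node: FileType):
    if node.value.lower() not in KNOWN_TYPES:
        raise ValueError(f"unknown file type: {node.value!r}")


@singledispatch
def generate(node):
    raise TypeError(f"no code generator for {type(node).__name__}")


@generate.register
def _(node: Domain):
    return ("" if node.accept else "-") + f"site:{node.url}"


@generate.register
def _(node: FileType):
    return ("" if node.accept else "-") + f"filetype:{node.value}"


def compile_constraints(nodes):
    """Run the type-checking pass over every node, then the code-generation pass."""
    for node in nodes:
        typecheck(node)
    return " ".join(generate(node) for node in nodes)


print(compile_constraints([Domain("example.com", True), FileType("pdf", False)]))
```

Each pass lives in its own group of functions, so a new check or a new output format can be added without editing the node classes, which is the property the aspect-based design aims for.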

Page 13

Current status

• Basic GQL compiler
• IDE supporting multiple document management
  – Program storage
  – Editing
  – Compiling, type-checking and execution
• Functionality including all features of Google web & image search
• Search within old queries

Page 14

Future work

• Extending the grammar to implement all the functionality provided by Google
• Adding stricter type-checking for source programs written in GQL
• Search result integration

Page 15

Conclusion

• To provide more flexibility in online search, a SQL-like query language has been developed for the Google query domain.
• GQL programs take the place of the query forms provided by Google, analogous to SQL and query forms in a DBMS, e.g. MS Access.
• The idea could be generalized to other domains, especially in online searching, e.g. airfare searching.