Information Retrieval Lab DiSCo – University of Milan Bicocca Viale Sarca 336 U14 Head: Prof. Gabriella Pasi

Upload: cora-cook

Post on 11-Jan-2016


Page 1:

Information Retrieval Lab

DiSCo – University of Milan Bicocca Viale Sarca 336 U14

Head: Prof. Gabriella Pasi

Page 2:

IR Lab people

• Gabriella Pasi, Associate Professor and Head of the Laboratory

• Silvia Calegari, Post-doc DISCO

• Stefania Marrara, Post-doc UNIMI

• Célia Cristina Pereira, Post-doc UNIMI

Page 3:

IR Lab numbers

• Small but active!
▫ Two people (since January 2009)
▫ Two external collaborators (since 2008)
▫ Three workplaces for students and collaborators
▫ About 60 articles in proceedings of international conferences and in international journals in the last three years
▫ 4-5 master students per year

Page 4:

The IR Lab in brief

The Information Retrieval Group (IRG) was established in 2005 at DiSCo, University of Milan Bicocca.

FOCUS: as the amount of information available on the Web has increased enormously in recent years, there is a need for effective systems that allow easy and flexible access to information relevant to a specific user’s needs. Flexibility here means the capability of the system both to manage imperfect (vague and/or uncertain) information and to personalise its behaviour to the user context.

AIM: the research activity undertaken by the IRG is aimed at defining models and techniques that overcome the limitations of current information access systems, with the main aim of offering personalised and flexible solutions to the problem of locating information relevant to a specific user’s needs.

Page 5:

Research in IR: some main issues

• Improving indexing: text representation is usually based on keyword extraction and weighting
▫ How to improve document representations?
  - Conceptual indexing based on the use of conceptual structures
  - Latent semantic indexing
  - Metadata and the Semantic Web

• Modeling user preferences in query formulation: usually based on selection criteria specified by terms
▫ How to formulate queries that capture real users’ needs?
  - Modeling the user’s context
  - Accounting for vagueness
  - Defining mechanisms for query reformulation, relevance feedback
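As an illustration of the keyword extraction and weighting baseline mentioned above, here is a minimal sketch of TF-IDF weighting. The slides do not say which weighting scheme is meant, so TF-IDF is an assumption, and the function name `tfidf_index` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_index(docs):
    """Return one {term: weight} dict per document, with weight = tf * idf."""
    # Naive tokenisation (lowercase, split on whitespace); an assumption.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    index = []
    for tokens in tokenized:
        tf = Counter(tokens)
        index.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return index

# Hypothetical toy collection.
docs = [
    "fuzzy query languages for xml retrieval",
    "personalized retrieval with user ontologies",
    "xml indexing strategies",
]
index = tfidf_index(docs)
```

Terms that occur in fewer documents receive higher weights, which is the keyword-based behaviour that conceptual and latent semantic indexing aim to go beyond.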

Page 6:

Research in IR: some main issues

• Improving relevance estimate: usually based on a measure of topicality and, more recently, popularity (in search engines)
▫ It should be based on additional criteria:
  - Novelty
  - Trust in information sources
  - Timeliness
  - Contextual information (geographic location, date, author, etc.)
▫ It should be learnt on the basis of users’ needs/behavior:
  - Application of machine learning techniques
  - Query reformulation

• Text classification

• Text summarization
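One simple way to combine several relevance criteria is a weighted average of per-criterion scores. The sketch below is only an illustration under that assumption: the slides do not specify an aggregation scheme, and the weights, scores, and the name `aggregate_relevance` are hypothetical.

```python
def aggregate_relevance(scores, weights):
    """Weighted average of per-criterion scores, each assumed to lie in [0, 1]."""
    total = sum(weights.values())
    # A criterion missing from `scores` counts as 0.0.
    return sum(w * scores.get(c, 0.0) for c, w in weights.items()) / total

# Hypothetical weights; in practice they could be learnt from user behaviour.
weights = {"topicality": 0.5, "novelty": 0.2, "trust": 0.2, "timeliness": 0.1}

# Hypothetical per-criterion scores for two documents.
candidates = {
    "doc_a": {"topicality": 0.9, "novelty": 0.2, "trust": 0.5, "timeliness": 0.3},
    "doc_b": {"topicality": 0.7, "novelty": 0.8, "trust": 0.9, "timeliness": 0.9},
}

ranking = sorted(
    candidates,
    key=lambda d: aggregate_relevance(candidates[d], weights),
    reverse=True,
)
```

Here `doc_b` ranks above `doc_a` even though it is less topical, because it scores better on the additional criteria.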

Page 7:

IR Lab Activity

• Research areas:
▫ Information Retrieval
▫ Information Filtering
▫ Document Clustering
▫ Personalization
▫ XML Retrieval

• Application Domains:
▫ Large document repositories
▫ World Wide Web

Page 8:

Ongoing and future research

• Definition of conceptual approaches to IR
• Definition of flexible query languages for semi-structured documents (XML)
• Definition of models for multi-dimensional relevance assessment
• Definition of text clustering techniques
• Definition of techniques for assessment of text quality and their use for relevance assessment
• Web Service Retrieval

Page 9:

Personalized Information Access

• Personalization is the process of customizing search results according to the user’s interests and context.

• Approach: generation of user-tailored ontologies

• Aim: to model and learn the user context to personalize the search process at distinct levels:

▫ Document indexing
▫ Query formulation
▫ Relevance assessment

Page 10:

XML Retrieval

• In XML collections it is important to retrieve documents based on users’ constraints on both documents’ content and structure.

• Approach: 1) application of fuzzy set theory to define flexible extensions of existing XML query languages. 2) definition of ad hoc indexing strategies.

• Aim: to propose advanced solutions for storing, managing and retrieving semi-structured documents.
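To illustrate how fuzzy set theory can relax a structural constraint, here is a minimal sketch that scores XML elements by a vague condition ("near the document root") combined with a content match. It is not the lab's query language: the membership function `near_top`, its thresholds, the scoring helpers, and the sample document are all hypothetical, and fuzzy AND is modelled with the standard `min` operator.

```python
import xml.etree.ElementTree as ET

def near_top(depth, full=1, zero=4):
    """Membership degree of the vague constraint 'near the document root'."""
    if depth <= full:
        return 1.0
    if depth >= zero:
        return 0.0
    # Linear decrease between the two thresholds.
    return (zero - depth) / (zero - full)

def content_match(text, terms):
    """Fraction of query terms that occur in the element text."""
    words = set((text or "").lower().split())
    return sum(t in words for t in terms) / len(terms)

def fuzzy_search(root, tag, terms):
    """Score every <tag> element; returns (score, text) pairs, best first."""
    results = []

    def walk(node, depth):
        if node.tag == tag:
            # Fuzzy AND of the structural and content constraints.
            score = min(near_top(depth), content_match(node.text, terms))
            results.append((score, (node.text or "").strip()))
        for child in node:
            walk(child, depth + 1)

    walk(root, 0)
    return sorted(results, reverse=True)

# Hypothetical sample document.
xml_doc = """
<article>
  <title>fuzzy xml retrieval</title>
  <body><section><title>xml indexing notes</title></section></body>
</article>
""".strip()

hits = fuzzy_search(ET.fromstring(xml_doc), "title", ["fuzzy", "xml"])
```

Instead of rejecting the deeply nested title outright, the query returns it with a lower degree of satisfaction, which is the kind of flexibility a crisp XML query language does not offer.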

Page 11:

Projects

• Past
▫ STREP Project: PENG (Personalised News Content Programming) (Gabriella Pasi, Project Coordinator) (2004 – 2006)

• Submitted
▫ PRIN. Title: What, Where, When? (W3?): Recommendation of Information concerning specific topics and spatio-temporal contexts characterized by dynamicity and imprecision.
▫ FIRB. Title: A Cloud Service Stack for Personalized Semantic Information Retrieval.
▫ Spanish Project: High Performance processing for large data sets represented as Graphs (HIPERGRAPH) (Principal Investigator: Ricardo Baeza Yates – Yahoo! Research)
▫ COST Action "Combining Soft Computing Techniques and Statistical Methods to Improve Data Analysis Solutions", coordinated by ECSC (ONGOING)

Page 12:

Collaborations

At D.I.S.Co:

• Davide Ciucci
• Fabio Farina
• ITIS – SEQUOIAS (Information Quality; Web Service Retrieval)

External Collaborations:

• CNR – IDPA, Italy
• European Center for Soft Computing (ECSC), Spain
• IRIT – Toulouse, France
• Iona College, NY, USA
• Università La Coruna, Spain

Page 13:

Conferences and Events

• Organization of:

▫ The 2009 IEEE / WIC / ACM International Conferences on Web Intelligence (WI'09) and Intelligent Agent Technology (IAT'09), Milano, Italy, 15-18 September 2009

▫ International Workshop on “Managing Vagueness and Uncertainty in the Semantic Web (VUSW’09)”, Milano, Italy, 15 September 2009

▫ Program Chair of the International conference RIAO 2010, Paris

▫ Poster co-chair of ACM SIGIR 2010

-----------------------

• Some Past Events (since 2005)

▫ “Special Track on Information Access and Retrieval Systems”, within the “ACM Symposium on Applied Computing” (Fortaleza, Ceará, Brazil, 16-20 March 2008; Dijon, France, March 2006; Santa Fe, New Mexico, 13-17 March 2005; Cyprus, 14-17 March 2004; Melbourne, Florida, 9-12 March 2003; Madrid, 10-14 March 2002). IAR2008

▫ International Workshop on Fuzzy Logic and Applications (WILF 2007), Hotel Portofino Kulm, Portofino Vetta - Ruta di Camogli, Genova (Italy) - July 7-10, 2007

▫ PhD School on Web Information Retrieval, WebBar 2007 Varenna, Italy, 26th August-1st September 2007.

▫ Seventh International Conference on Flexible Query Answering Systems (FQAS 2006), Milano, 2-10 June 2006.

▫ “3rd International Summer School on Aggregation Operators”, Università della Svizzera Italiana (USI-Lugano), Lugano, 10-15 July 2005

Page 14:

Publications from 2005 …some numbers

• Papers in International Journals: 22

• Special Issues in International Journals: 4

• Edited Volumes: 3

• Chapters of International Books: 10

• Proceedings for International Conferences: 40