Summary of Recovering from a decade: a systematic mapping of information retrieval approaches to...
Recovering from a decade: a systematic mapping of information
retrieval approaches to software traceability
Avelino Ferreira Gomes Filho

Software traceability through Information Retrieval

Why talk about this?

“Software is a place where dreams are planted and nightmares harvested, an abstract, mystical swamp where terrible demons compete with magical panaceas, a world of werewolves and silver bullets.”
Brad J. Cox, as cited in Roger S. Pressman (2010), Software Engineering: A Practitioner’s Approach, 7th Ed.

Expectation
Business Process
Business Rules
Tests
Source code
Binary

Reality

Traceability
• One way to increase software quality is to maintain the links between
– Source code
– Business rules
– Business processes
– Requirements
– Change requests
– Etc.

Traceability
“The ability to interrelate any uniquely identifiable software engineering artifact to any other, maintain the links between them over time, and use the resulting network to answer questions about both the software product and its development process.”
CoEST, as cited in Cleland-Huang et al. (2014), Software Traceability: Trends and Future Directions
Proc. of the 36th International Conference on Software Engineering (ICSE)

Traceability
It is not a trivial task

Doing traceability manually
Spreadsheets
Software
http://www.chambers.com.au/glossary/traceability_matrix.php
http://www.ibm.com/developerworks/rational/library/5347.html

Doing traceability manually
Boring
Error-prone

Example

THE PAPER

The Paper
• Written by:
– Borg, Markus
– Runeson, Per
– Ardö, Anders
• In 2013
• Published in Empirical Software Engineering (Springer)
• DOI: 10.1007/s10664-013-9255-y

The Paper
• You should read this paper because...
– you are interested in Traceability and Information Retrieval.
– the introduction is an excellent glossary, with references, on Information Retrieval.
– the paper describes very well how to carry out a systematic mapping.

Introduction (Glossary with References)
Dataset
Bag-of-Words
Natural Language and NL Processing
Algebraic-IR
Vector Space Model
Binary and Frequency Terms
TF-IDF
Latent Semantic Indexing
Rocchio Method
Binary Independence Retrieval
Probabilistic IR
Probabilistic Inference Network
Statistical Language Models
Thesaurus
Precision – Recall
Recovery Effort Index
Mean Average Precision
Discounted Cumulative Gain
ETC…
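
Several of the glossary entries above are evaluation measures for ranked lists of candidate trace links. As a rough illustration only, the sketch below computes precision, recall, average precision (the per-query value that Mean Average Precision averages over queries) and one common formulation of Discounted Cumulative Gain. The link identifiers and the ground truth are hypothetical and do not come from the paper.

```python
import math

def precision_recall(retrieved, relevant):
    """Precision and recall of a set of retrieved links against the true links."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def average_precision(ranked, relevant):
    """Mean of precision@k over the ranks k at which a relevant link appears."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

def dcg(gains):
    """Discounted cumulative gain: each graded gain divided by log2(rank + 1)."""
    return sum(g / math.log2(k + 1) for k, g in enumerate(gains, start=1))

# Hypothetical data: a ranked list of candidate links and the true links.
ranked = ["T1", "T7", "T3", "T9"]
relevant = {"T1", "T3", "T5"}

print(precision_recall(ranked, relevant))   # precision 0.5, recall 2/3
print(average_precision(ranked, relevant))  # (1/1 + 2/3) / 3
print(dcg([1, 0, 1, 0]))                    # 1 + 0.5 = 1.5
```

Mean Average Precision would then be the mean of `average_precision` over all queries (for example, over all requirements being traced).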

RESEARCH OBJECTIVES

The Paper
Objective: to carry out a systematic mapping of the Information Retrieval models used for Software Traceability.

The Paper
They do not propose a new model.
The paper's contribution is a broad study, covering more than 1,000 publications, of which IR models are used to perform traceability.

Research Questions
RQ1 Which Information Retrieval models and enhancement strategies are most frequently used to perform traceability on natural-language software artifacts?

Research Questions
RQ2 Which types of natural-language artifacts are most frequently linked in studies of traceability with Information Retrieval?

Research Questions
RQ3 How strong is the evidence, with respect to the degree of realism of the evaluations, for traceability systems built with Information Retrieval?

RELATED WORK

Related Work
• IR-Based Trace Recovery
– Borillo et al. 1992
• The first work to use Natural Language Processing and Artificial Intelligence techniques for traceability.
The most relevant works

Related Work
• IR-Based Trace Recovery
– De Lucia et al. (2002 – 2014)
• Creation of several IR-based traceability tools.
• SCOTCH: Slicing and COupling based Test to Code trace Hunter (2014)
– Traceability between system classes and test classes
– Use of Stop Class
– Conceptual Coupling Between Classes (CCBC)

Related Work
• IR-Based Trace Recovery
– Baeza-Yates R, Ribeiro-Neto B. (2011)
• Pre-processing
• Handling of camelCase, under_score convention, etc.

Related Work
• Previous Overviews on IR-Based Trace Recovery
– Systematic mappings on Traceability and IR
– Cleland-Huang et al. (2012)
– De Lucia (2009 – 2012)
– “Our analysis is more structured and goes deeper, with a narrower scope.”

RESEARCH METHOD

Research Method
Creation of the Research Protocol
Selection of Publications
Data extraction and mapping of publications

Research Method
• Inclusion criteria
– In English, peer-reviewed, with empirical results on the topic.
• Exclusion criteria
– Publications that discussed other forms of traceability more than IR-based traceability.
– Publications that discussed IR but said little or nothing about traceability.

Research Method
• Definition of the search databases
• Definition of the search terms
• Removal of duplicate publications
• Refinement
– From: 1,241 publications
– To: 76 publications

Research Method
• Extraction of the contributions of the relevant publications
• Mapping

RESULTS
Information Retrieval models applied to Traceability (RQ1)

State of the Art
Document Parsing, Extraction and Pre-Processing
Corpus indexing with an IR method
Ranked list generation
Enhancement and analysis of candidate links

Document Parsing, Extraction and Pre-Processing

Document Parsing, Extraction and Pre-Processing
• Stop words: a, an, to, it...
– Stop Class: java.lang.*, org.junit.*
• Stemming: produce, producing, produced, producer
• ID Splitting: handling of camelCase and other coding conventions
– Baeza-Yates R, Ribeiro-Neto B (2011)
• Google Translate
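
The pre-processing steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the pipeline of any study covered by the paper: the stop-word list is a tiny sample, the stemmer is a naive suffix stripper standing in for a real algorithm such as Porter's, and all function names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

STOP_WORDS = {"a", "an", "to", "it", "the", "of"}   # tiny illustrative list
STOP_CLASSES = ("java.lang.*", "org.junit.*")       # patterns taken from the slide
SUFFIXES = ("ing", "ed", "er", "es", "e", "s")      # naive stemmer, not Porter

def is_stop_class(qualified_name):
    """True if a class name matches one of the stop-class patterns."""
    return any(fnmatchcase(qualified_name, p) for p in STOP_CLASSES)

def split_identifier(identifier):
    """Split camelCase and under_score identifiers into lowercase words."""
    words = []
    for part in re.split(r"[_\W]+", identifier):
        # Acronym runs, capitalised or lowercase words, and digit runs.
        words += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [w.lower() for w in words]

def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Tokenise, split identifiers, drop stop words, then stem."""
    tokens = []
    for raw in text.split():
        tokens += split_identifier(raw)
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]

print(split_identifier("getUserName"))      # ['get', 'user', 'name']
print([naive_stem(w) for w in ["produce", "producing", "produced", "producer"]])
print(is_stop_class("java.lang.String"))    # True
```

With this toy stemmer the four example words from the slide all reduce to the same stem, which is the point of stemming: different surface forms of a word end up as one index term.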

Indexing, Filtering and Retrieval

Indexing, Filtering and Retrieval
• Algebraic models
– A document's relevance to a query depends on its similarity to the query terms
– Similarity is represented in algebraic form.
– E.g.: cosine similarity in the VSM
• Probabilistic models
– What is the probability that this document is relevant to this query?
– Given a search term, the document may or may not be relevant
– The system cannot be certain about the document's true relevance status.
Zhai C (2007) A brief review of information retrieval models. Technical report, University of Illinois at Urbana-Champaign
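
As a rough illustration of the algebraic family, the sketch below ranks documents against a query using TF-IDF weights and cosine similarity in a Vector Space Model. It assumes one common TF-IDF variant (raw term frequency times log(N/df)); the artifact names, the tokens and the cut-off of 10 results are hypothetical choices made for this example.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Map each tokenised document to a sparse {term: tf * idf} vector."""
    n = len(docs)
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = {name: {t: tf * idf[t] for t, tf in Counter(tokens).items()}
               for name, tokens in docs.items()}
    return vectors, idf

def cosine(u, v):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

def rank(query_tokens, docs, top_k=10):
    """Return up to top_k (score, name) pairs, most similar first."""
    vectors, idf = tf_idf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query_tokens).items()}
    scored = [(cosine(q, vec), name) for name, vec in vectors.items()]
    return sorted(scored, reverse=True)[:top_k]

# Hypothetical, already pre-processed source artifacts.
docs = {
    "LoginController": ["login", "user", "password", "session"],
    "ReportPrinter": ["report", "print", "page"],
    "UserProfile": ["user", "profile", "update"],
}
# Hypothetical requirement: "the user logs in with a password".
for score, name in rank(["user", "login", "password"], docs):
    print(f"{score:.3f}  {name}")
```

The ranked list puts `LoginController` first because it shares the rarest query terms; `ReportPrinter` shares none and scores zero. A probabilistic model would replace `cosine` with an estimate of the probability of relevance, but the surrounding ranking loop would look much the same.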

Ranking

Enhance and Analyze

Enhance and Analyze
• Relevance Feedback
– There is evidence that humans rarely consider more than 10 candidate links.
Borg M, Pfahl D (2011) Do better IR tools improve the accuracy of engineers’ traceability recovery? In: Proceedings of the international workshop on machine learning technologies in software engineering, pp 27–34
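
The slide does not say which relevance-feedback technique the surveyed studies used. As one classic example, the Rocchio method (listed in the glossary slide) moves the query vector toward links the engineer accepted and away from links the engineer rejected. The sketch below is an assumption-laden illustration: the weights alpha, beta and gamma are commonly quoted textbook defaults, and the vectors are hypothetical.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated sparse query vector after one round of feedback."""
    terms = set(query)
    for vec in relevant + non_relevant:
        terms |= set(vec)

    def centroid(vectors, term):
        # Average weight of the term over a group of judged documents.
        return sum(v.get(term, 0.0) for v in vectors) / len(vectors) if vectors else 0.0

    updated = {}
    for term in terms:
        weight = (alpha * query.get(term, 0.0)
                  + beta * centroid(relevant, term)
                  - gamma * centroid(non_relevant, term))
        if weight > 0:  # negative weights are commonly clipped to zero
            updated[term] = weight
    return updated

# Hypothetical feedback: one accepted and one rejected candidate link.
query = {"login": 1.0}
accepted = [{"login": 1.0, "password": 1.0}]
rejected = [{"report": 1.0, "login": 0.2}]

new_query = rocchio(query, accepted, rejected)
print(sorted(new_query.items()))
```

After the update the query gains the term `password` from the accepted link, while `report`, which only occurs in the rejected link, gets a negative weight and is dropped.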

RESULTS
Types of linked artifacts (RQ2)

Linked Artifacts

RESULTS
Level of Evidence (RQ3)

MY ANALYSIS

My Analysis
• It is a very comprehensive study of IR-based Traceability.
• It serves as a rich source of IR references.
• It does not go deep into any specific model
– Which was to be expected, since it is a systematic mapping (SM).
– Even so, it presents the state of the art.

Questions?
Feedback!