ALFRED: Crowd Assisted Data Extraction
Valter Crescenzi, Paolo Merialdo, Disheng Qiu
Dipartimento di Ingegneria, Università degli Studi Roma Tre, Via della Vasca Navale, 79, Rome
Extracting data
We have 2M pages from IMDB, and we want to extract titles, directors, etc.
[Diagram labels: Inference algorithm, Wrapper, DB]
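To make the notion of a wrapper concrete, here is a minimal sketch in Python. It assumes a wrapper can be modelled as a mapping from attribute names to XPath-like extraction rules. The page markup, the rule expressions, and the names `PAGE`, `WRAPPER`, and `apply_wrapper` are all hypothetical illustrations; they are not taken from ALFRED or from real IMDB pages.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a movie page (not real IMDB markup).
PAGE = """
<html><body>
  <h1 class="title">Inception</h1>
  <div class="credits"><span class="director">Christopher Nolan</span></div>
</body></html>
"""

# Assumption: a wrapper is a set of extraction rules, one per attribute.
WRAPPER = {
    "title": ".//h1[@class='title']",
    "director": ".//span[@class='director']",
}


def apply_wrapper(page_source, wrapper):
    """Apply every rule to one page; a rule that matches nothing yields None."""
    root = ET.fromstring(page_source.strip())
    record = {}
    for attribute, rule in wrapper.items():
        node = root.find(rule)
        has_text = node is not None and node.text
        record[attribute] = node.text.strip() if has_text else None
    return record


record = apply_wrapper(PAGE, WRAPPER)
print(record)
```

Running the same rules over every page of the site would fill the database, one record per page. The hard part, which the inference algorithm addresses, is finding rules that stay correct across all pages.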
Scaling Wrapper Inference
Scaling up the number of workers through crowdsourcing platforms raises new challenges:
Issues and contributions:
Non-expert workers
• Simple interactions to reduce the worker error rate
• Membership queries (yes/no answers)
Costs
• Active learning to carefully select queries
Quality
• Bayesian model to evaluate the expected wrapper quality
• Sampling algorithms
• Tolerance to inaccurate workers
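The combination of yes/no membership queries and a Bayesian model that tolerates inaccurate workers can be illustrated with a small sketch. This is a plain Bayes-rule update under simplifying assumptions, not the model used by ALFRED: each candidate rule has a prior probability of being correct, a worker answers one yes/no query about a single value, and the worker is assumed to answer wrongly with a fixed, known `error_rate`. The function name, the rule names `r1` to `r3`, and the numbers are hypothetical.

```python
def update_rule_beliefs(prior, predictions, answer, error_rate):
    """Bayes update of rule probabilities after one noisy yes/no answer.

    prior:       dict rule -> probability that the rule is the correct one
    predictions: dict rule -> True if the rule extracts the queried value
    answer:      the worker's reply (True for "yes", False for "no")
    error_rate:  assumed probability that the worker's reply is wrong
    """
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    unnormalized = {}
    for rule, probability in prior.items():
        # A rule is consistent with the answer if it predicted that reply.
        consistent = predictions[rule] == answer
        likelihood = (1.0 - error_rate) if consistent else error_rate
        unnormalized[rule] = probability * likelihood
    total = sum(unnormalized.values())
    return {rule: weight / total for rule, weight in unnormalized.items()}


# Query: "Is 'Oblivion' the title on this page?" Rules r1 and r3 extract it,
# r2 does not. The worker says yes and is assumed to err 10% of the time.
prior = {"r1": 1 / 3, "r2": 1 / 3, "r3": 1 / 3}
predictions = {"r1": True, "r2": False, "r3": True}
posterior = update_rule_beliefs(prior, predictions, answer=True, error_rate=0.1)
for rule, probability in posterior.items():
    print(f"{rule}: {probability:.3f}")
```

Because the worker may be wrong, the rule that disagrees with the answer is not eliminated; its probability only drops (here from about 0.33 to about 0.05), so a later contradicting answer can still restore it.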
Architecture
ALFRED is a wrapper inference system supervised by workers from a crowdsourcing platform.
*Research Track: A Framework for Learning Web Wrappers from the Crowd, WWW 2013
Input and Rules Generation
Sample Set and Extracted Values
| rule | page0 | page1 | page2 |
| r1 | Inception | City of God | Oblivion |
| r2 | Inception | City of God | null |
| r3 | Inception | null | Oblivion |
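The extracted values above show where the candidate rules agree (page0) and where they differ (page1 and page2). A minimal sketch of how such a matrix could be used to find useful yes/no queries follows. It assumes that only values on which the rules disagree are worth asking about; this is a simple heuristic for illustration, not necessarily ALFRED's selection strategy, and the function names are hypothetical.

```python
# Extracted values per rule, one entry per sample page (None stands for null).
EXTRACTED = {
    "r1": ["Inception", "City of God", "Oblivion"],
    "r2": ["Inception", "City of God", None],
    "r3": ["Inception", None, "Oblivion"],
}


def disagreement_pages(extracted):
    """Return the indexes of pages where the rules do not all agree."""
    columns = zip(*extracted.values())
    return [page for page, values in enumerate(columns) if len(set(values)) > 1]


def candidate_queries(extracted):
    """List (page index, value, rules extracting that value) for each
    non-null value found on a page where the rules disagree."""
    queries = []
    for page in disagreement_pages(extracted):
        rules_by_value = {}
        for rule, values in extracted.items():
            if values[page] is not None:
                rules_by_value.setdefault(values[page], []).append(rule)
        for value, rules in rules_by_value.items():
            queries.append((page, value, rules))
    return queries


for page, value, rules in candidate_queries(EXTRACTED):
    print(f"Is '{value}' correct on page{page}? (extracted by {', '.join(rules)})")
```

Asking about page0 would be wasted effort, since every rule extracts the same value there and no answer could separate them.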
Probability and Noise