CSE 574 Extracting, Managing & Personalizing Web Information
04/20/23 23:56 1
CSE 574 Extracting, Managing & Personalizing Web Information
• Staffing
  – Dan Weld
  – Raphael Hoffmann
• Content
  – Intersection of AI, ML, DB & HCI
• Student Responsibilities
  – Reading, Reports, Discussion
  – Project (for those taking 3 credits)
Class Focus
Extracting, Managing & Personalizing Web Information
Why Information Extraction
• Next-Generation Search
  – Citeseer, Google Scholar, MSRA Libra
  – Google product search
  – Flipdog
  – Zvents
  – Zoominfo
• Question Answering
People
…Continued
…Continued Some More
Making Structured Content
• Information Extraction
  – E.g. Google Scholar
  – Cons: Noisy
• Communal Content Creation
  – E.g. Wikipedia
  – Cons: Bootstrapping & Incentives
Why Managing?
• Select
• Store, Index, Aggregate
• Search, Query, Explore
• Share, Collaborate, “Publish”
Example: Personalized Portals
cf. DBlife, Rexa, Dontcheva UIST-07
DBlife
Summaries - 1
Summaries - 2
Summaries - 3
Summaries - 4
Summaries - 5
Summaries - 6
Why Personalize?
• Because we can.
Preliminary Schedule
• Information Extraction
  – Traditional Machine Learning Approaches
  – Self-Supervised Methods
  – Other Issues: Coreference & Ontology
• Collaborative Content Creation & UI Issues
  – Applying Constraints from Interaction to Learning
  – Decision Theoretic Interaction
  – Faceted Interfaces
• Community Information Management
  – Extraction over Evolving Text
  – Data Provenance
  – Mashups & Personalized Web
• Next-Generation Search
  – Inference, Textual Entailment, Machine Reading
  – Entity Search
For next time
• Read
  – Agichtein, Gravano. Snowball: Extracting Relations from Large Plain-Text Collections.
• Add yourself to mailing list
• Look at papers on website wiki
  – Add new ones
  – Add summary (different from report)
  – Note if you wish to present one
• Think about project / (form a group?)