![Page 1: CSC 466: Knowledge Discovery From Data Alex Dekhtyar Department of Computer Science Cal Poly New Computer Science Elective](https://reader030.vdocuments.net/reader030/viewer/2022032704/56649d615503460f94a4231c/html5/thumbnails/1.jpg)
CSC 466: Knowledge Discovery From Data
Alex Dekhtyar
Department of Computer Science, Cal Poly
New Computer Science Elective
Outline
Why?
What?
How?
Discussion
Why?
Information Retrieval
Why?
Text Classification? Link Analysis?
Why?
Recommender Systems
Why?
Market Basket Analysis. Purchasing trends analysis.
Why?
Data Warehouse… and so much more…
Why?
Link Analysis
Why?
Cluster Analysis
Buzzwords
Data warehousing
Data mining
Information filtering
Recommender Systems
Information retrieval
Text classification
Web mining
OLAP
Cluster Analysis
Market basket analysis
Why?
As professionals, hobbyists, and consumers, students constantly interact with intelligent information management technologies
This is moving into the realm of undergraduate-level knowledge
@Calstate.edu
CSU Fullerton: CPSC 483 Data Mining and Pattern Recognition
CSU LA: CS 461 Machine Learning; CS 560 Advanced Topics in Artificial Intelligence
CSU Northridge: 595DM Data Mining
CSU Sacramento: CSC 177. Data Warehousing and Data Mining
CSU SF: CSC 869 - Data Mining
CSU San Marcos: CS475 Machine Learning; CS574 Intelligent Information Retrieval
What?
Undergraduate course
Informed consumers
Professionals
OLAP/Data Warehousing
Data Mining
Collaborative Filtering
Information Retrieval
1 quarter = 10 weeks
Knowledge Discovery from Data
What? (goals)
Understand KDD technologies @ consumer level
Understand basic types of data mining, information filtering, and information retrieval techniques
Use KDD to analyze information
Implement KDD algorithms
Understand/appreciate societal impacts
What? (syllabus in a nutshell)
Intro (data collections, measurement): 2 lectures
Data Warehousing/OLAP: 2 lectures
Data Mining:
  Association Rule Mining: 3 lectures
  Classification: 3 lectures
  Clustering: 3 lectures
Collaborative Filtering/Recommendations: 2 lectures
Information Retrieval: 4 lectures
19 lectures (= spring quarter)
CSC 466, Spring 2009 quarter
How? (Alex’s ideas)
Learn-by-doing...
Labs: work with existing software, analyze data, interpret
Labs: small groups, implement simple KDD techniques
Project: groups, find interesting data, analyze it…
Need to incorporate “societal issues”: privacy vs. data access, etc…
Students to make informed choices
Lectures
Breadth over depth
Do a follow-up CSC 560 (grad. DB topics class)
How?
TODO List:
Find data for labs and projects
Investigate open source mining/retrieval software
Figure out the textbook (Web Data Mining by Bing Liu is promising)
How?
This slide intentionally left blank