CSCE 5073 Section 001: Data Mining
Spring 2016
Overview
Class hours: 12:30 – 1:45pm, Tuesday & Thursday, JBHT 239
Office hours: 2:00 – 4:00pm, Tuesday & Thursday, JBHT 516
Instructor: Dr. Xintao Wu
Email: [email protected]
Office: JBHT 516
Webpage: http://csce.uark.edu/~xintaowu/5073/5073.htm
Textbook
Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, 3rd edition, Morgan Kaufmann, 2011. ISBN: 978-0-12-381479-1
Topic Description
Introduction to data mining
Know your data
Data preprocessing
Data warehousing and OLAP
Frequent pattern mining, association and correlation
Classification
Cluster analysis
Outlier detection
Advanced topics
  Deep learning
  Big data analysis, including MapReduce and Spark
  Social-aware data mining
Course Prerequisite
Data structures and algorithms
Familiarity with programming in Java or C++ is assumed; Matlab/R/Python/Scala is preferred.
Basic concepts of probability and statistics
Knowledge of linear algebra is a big plus
Grading Composition
Homework and quiz: 10%
Project: 30%
Midterm: 20%
Final: 40%
Homework and Project Reports
Late policy: not acceptable.
Hard copy is preferred
Electronic submission (Word or PDF) accepted
Project
Data Analysis Project
  Each group consists of 2-3 students
  Develop/implement/apply data mining techniques on real, challenging data mining problems
Individual Research Project
More information: http://csce.uark.edu/~xintaowu/5073/proj.htm
Midterm & Final
Open books/notes/internet
No discussion
No help from any entity, e.g., by posting/uploading your questions on the Web
Cumulative
No makeup
Class attendance is not required
Bonus is expected
Textbook & Recommended Reference Books
Textbook
  Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann, 2011.
Recommended reference books
  C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
  S. Chakrabarti, Mining the Web: Statistical Analysis of Hypertext and Semi-Structured Data, Morgan Kaufmann, 2002.
  T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer-Verlag, 2009.
  B. Liu, Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer, 2006.
  D. Easley and J. Kleinberg, Networks, Crowds, and Markets: Reasoning About a Highly Connected World, Cambridge Univ. Press, 2010.
  M. Newman, Networks: An Introduction, Oxford Univ. Press, 2010.
Reference Papers
Course research papers: check Reading_List
Major conference proceedings that will be used
  DM conferences: ACM SIGKDD (KDD), ICDM (IEEE Int. Conf. Data Mining), SDM (SIAM Data Mining), PKDD (Principles KDD)/ECML, PAKDD (Pacific-Asia)
  DB conferences: ACM SIGMOD, VLDB, ICDE
  ML conferences: NIPS, ICML
  IR conferences: SIGIR, CIKM
  Web conferences: WWW, WSDM
Other related conferences and journals
  IEEE TKDE, ACM TKDD, DMKD, ML
Use course Web page, DBLP, Google Scholar, Citeseer
CS591Han: Advanced Seminar on Data Mining
Research Frontiers in Data Mining
Mining social and information networks
Mining spatiotemporal data, moving object data & cyber-physical systems
Mining multimedia, social media, text and Web data
Software engineering and computer system data
Multidimensional online analytical analysis
Pattern mining, pattern usage, and pattern understanding
Biological data mining
Stream data mining