|Data Mining and Data Warehouses|
|Title: ||Data Mining and Data Warehouses|
|Lesson Code: ||321-9253|
|Theory Hours: ||3|
|Lab Hours: ||2|
|Faculty: ||Maragkoudakis Emmanouil|
1. Introduction to Data Mining Techniques: (a) data, (b) problems, (c) applications, (d) general analysis and processing techniques.
2. Data pre-processing: (a) data cleansing, (b) data transformations, (c) dimension reduction techniques.
3. Clustering, Part I: (a) introduction to clustering, (b) proximity measures, (c) k-means and its variations, (d) hierarchical clustering.
4. Clustering, Part II: (a) DBSCAN, (b) cluster validity, (c) BIRCH.
5. Association Rules I: (a) problem definition, (b) Apriori algorithm, (c) frequent itemsets.
6. Association Rules II: (a) advanced methods for finding frequent itemsets, (b) FP-Growth, (c) association rules validation.
7. Classification I: (a) introduction, (b) Decision Trees (entropy, Gini Index, classification error).
8. Classification II: (a) Bayesian classifiers, (b) Support Vector Machines, (c) KNN, (d) rule-based classifiers, (e) overfitting.
9. Data Warehouses and OLAP: (a) definitions (ROLAP, MOLAP, HOLAP), (b) cuboid, (c) cuboid implementation.
Critical awareness of current problems and research issues in data mining.
Comprehensive understanding of current advanced scholarship and research in data mining, and of how it can contribute to the effective design and implementation of data mining applications.
Ability to apply knowledge of current data mining research issues consistently and in an original manner, producing work at the forefront of current developments in the field.
Proficiency with leading data mining software, including RapidMiner, Weka and the Business Intelligence tools of MS SQL Server.
Understanding of how to apply a wide range of clustering, estimation, prediction and classification algorithms, including k-means, BIRCH and DBSCAN clustering, classification and regression trees, the C4.5 algorithm, logistic regression, k-nearest neighbor, multiple regression, neural networks and support vector machines.
Understanding of how to apply current data mining techniques and applications, such as text mining, mining genomics data, and other current topics.
Understanding of the mathematical and statistical foundations of the algorithms outlined above.
1. Data Mining: Introductory and Advanced Topics, Margaret H. Dunham, Pearson Education, ISBN: 9780130888921, 2002.
2. Data Mining: A Knowledge Discovery Approach, Krzysztof J. Cios et al., Springer-Verlag, ISBN: 9780387333335, 2007.
1. Data Mining: Foundations and Practice, Lin, Xie, Wasilewska and Liau, Springer-Verlag, ISBN-10: 354078487X, 2008.
|Learning Activities and Teaching Methods|
Online material in Electronic Learning Platforms, Classroom teaching, group activities, etc.
Work in classroom. Final exams.
|Assessment/Grading Methods|
||125 hours (5 ECTS)
|Language of Instruction|
|Greek, English (for Erasmus students)|
|Mode of delivery|