Department of Information & Communication Systems Engineering
University of the Aegean

Data Mining and Data Warehouses

Title: Data Mining and Data Warehouses
Lesson Code: 321-9253
Semester: 8
Theory Hours: 3
Lab Hours: 2

Content outline

1. Introduction to Data Mining Techniques: (a) data, (b) problems, (c) applications, (d) general analysis and processing techniques.
2. Data pre-processing: (a) data cleansing, (b) data transformations, (c) dimension reduction techniques.
3. Clustering, Part I: (a) introduction to clustering, (b) proximity measures, (c) k-means and its variations, (d) hierarchical clustering.
4. Clustering, Part II: (a) DBSCAN, (b) cluster validity, (c) BIRCH.
5. Association Rules I: (a) problem definition, (b) Apriori algorithm, (c) frequent itemsets.
6. Association Rules II: (a) advanced methods for finding frequent itemsets, (b) FP-Growth, (c) association rules validation.
7. Classification I: (a) introduction, (b) Decision Trees (entropy, Gini Index, classification error).
8. Classification II: (a) Bayesian classifiers, (b) Support Vector Machines, (c) KNN, (d) rule-based classifiers, (e) overfitting.
9. Data Warehouses and OLAP: (a) definitions, ROLAP, MOLAP, HOLAP, (b) cuboid, (c) cuboid implementation.
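
As an illustration of the three node impurity measures named in item 7 (entropy, Gini index, classification error), the following is a minimal Python sketch. It is not taken from the course material: the function names and the two sample nodes are hypothetical, chosen only to show how each measure is computed from the class labels that reach a decision tree node.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Return the relative frequency of each class label in a node."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over the classes."""
    return sum(p * math.log2(1 / p) for p in class_probabilities(labels))


def gini_index(labels):
    """Gini index: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


def classification_error(labels):
    """Classification error: 1 minus the largest class probability."""
    return 1.0 - max(class_probabilities(labels))


# Hypothetical node with four records, evenly split between two classes.
mixed_node = ["yes", "yes", "no", "no"]
# Hypothetical pure node: every record belongs to the same class.
pure_node = ["yes", "yes", "yes", "yes"]

for name, node in [("mixed", mixed_node), ("pure", pure_node)]:
    print(name, entropy(node), gini_index(node), classification_error(node))
```

All three measures are zero for the pure node and reach their two-class maximum for the evenly split node (1 bit of entropy, 0.5 for the Gini index and 0.5 for the classification error), which is why a tree-growing algorithm can use any of them to rank candidate splits.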

Learning outcomes

1. Critical awareness of current problems and research issues in data mining.
2. Comprehensive understanding of current advanced scholarship and research in data mining, and of how it may contribute to the effective design and implementation of data mining applications.
3. Ability to consistently apply knowledge of current data mining research issues in an original manner and to produce work at the forefront of current developments in the sub-discipline of data mining.
4. Proficiency with leading data mining software, including RapidMiner, Weka and MS SQL Server Business Intelligence.
5. Understanding of how to apply a wide range of clustering, estimation, prediction and classification algorithms, including k-means clustering, BIRCH clustering, DBSCAN clustering, classification and regression trees, the C4.5 algorithm, logistic regression, k-nearest neighbor, multiple regression, neural networks and support vector machines.
6. Understanding of how to apply the most current data mining techniques and applications, such as text mining, mining genomics data, and other current issues.
7. Understanding of the mathematical and statistical foundations of the algorithms outlined above.

Prerequisites

Not required.

Basic Textbooks

1. Data Mining: Introductory and Advanced Topics, Margaret H. Dunham, Pearson Education, ISBN: 9780130888921, 2002.
2. Data Mining: A Knowledge Discovery Approach, Krzysztof J. Cios et al., Springer-Verlag, ISBN: 9780387333335, 2007.

Additional References

1. Data Mining: Foundations and Practice, Lin, Xie, Wasilewska and Liau, Springer-Verlag Berlin and Heidelberg GmbH & Co. KG, ISBN-10: 354078487X, 2008.

Learning Activities and Teaching Methods

Online material in Electronic Learning Platforms, classroom teaching, group activities, etc.

Activity            Semester workload
Lectures            39 hours
Laboratory hours    26 hours
Personal study      57 hours
Final exams         3 hours
Course total        125 hours (5 ECTS)

Assessment/Grading Methods

Work in classroom. Final exams.

Language of Instruction

Greek, English (for Erasmus students)

Mode of delivery


© Copyright ICSD :: 2008 - 2017