Data Mining II

Building on the Data Mining fundamentals course, this course covers the theory and practice of advanced data mining topics in greater depth, such as:

  • Data Preprocessing
  • Regression and Forecasting
  • Dimensionality Reduction
  • Anomaly Detection
  • Time Series Analysis
  • Parameter Tuning
  • Ensemble Learning
  • Online Learning

The course consists of a lecture with accompanying practical exercises and student team projects. In the exercises, participants gain initial experience in applying state-of-the-art data mining tools to realistic data sets. The team projects take place in the last third of the term. In the projects, students carry out more sophisticated data mining projects of their own choice and present their results in a written report and an oral presentation.

Time and Location

  • Lecture: Monday, 10:15 - 11:45, B6 A104
  • Exercise: Tuesday, 13:45 - 15:15, B6 A104

 Instructors

Final exam

  • 50 % written exam
  • 50 % project work

Slides and Exercises

 

 Participation FSS 2014

  • The course is open to students of the Master Business Informatics and Lehramt Informatik.
  • The course is restricted to 30 participants.
  • Registration is done via the ILIAS group 
  • Registration will be opened Wednesday, February 4th, 8:00 am 
  • Places are allocated on a first-come, first-served basis (limit 30 students)

Outline (preliminary)

 

| Week  | Session 1                                 | Session 2                                 | Important Dates                                |
|-------|-------------------------------------------|-------------------------------------------|------------------------------------------------|
| 9.2.  | Lecture: Preprocessing                    | Exercise: Preprocessing                   |                                                |
| 16.2. | Lecture: Regression                       | Exercise: Regression                      |                                                |
| 23.2. | Lecture: Online Learning                  | Exercise: Online Learning                 |                                                |
| 2.3.  | Lecture: Anomaly Detection                | Exercise: Anomaly Detection               |                                                |
| 9.3.  | Lecture: Ensembles                        | Exercise: Ensembles                       | Tuesday 10.3. DMC Registration Opens           |
| 16.3. | Lecture: Time Series                      | Exercise: Time Series                     |                                                |
| 23.3. | Lecture: Parameter Tuning                 | Exercise: Parameter Tuning                |                                                |
| 30.3. | Easter                                    | Break                                     |                                                |
| 6.4.  | Easter                                    | Break                                     | Tuesday 7.4. DMC Task Publication              |
| 13.4. | Task discussion, team building            | Work on DMC tasks                         |                                                |
| 20.4. | Intermediate Presentation                 | Work on DMC tasks                         |                                                |
| 27.4. | Work on DMC tasks                         | Intermediate Presentation                 |                                                |
| 4.5.  | Work on DMC tasks                         | Intermediate Presentation                 |                                                |
| 11.5. | Work on DMC tasks                         | Intermediate Presentation                 |                                                |
| 18.5. | Work on Final Submission and Presentation | Work on Final Submission and Presentation | Tuesday 19.5. DMC Task Submission Deadline     |
| 25.5. | -                                         | Final presentation                        | Monday 25.5. Final Report Submission Deadline  |

 

Literature 

  1. Pang-Ning Tan, Michael Steinbach, Vipin Kumar: Introduction to Data Mining, Pearson.
  2. Ian H. Witten, Eibe Frank, Mark A. Hall: Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Morgan Kaufmann.
  3. Bing Liu: Web Data Mining, 2nd Edition, Springer.

Further literature on specific topics will be announced in the lecture.

Software

  • This year, we will be able to work with the newest version of RapidMiner. Licence key handling will be discussed in the first sessions of this course.

Lecture Videos

  • Video recordings of the Data Mining II lectures are available here.