Data Mining (FSS2015)

The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:

  • Goals and Principles of Data Mining
  • Data Representation and Preprocessing
  • Clustering
  • Classification
  • Association Analysis
  • Text Mining
  • Systems and Applications (e.g. Retail, Finance, Web Analysis)

The course consists of a lecture with accompanying practical exercises and student team projects. In the exercises, participants gain initial experience in applying state-of-the-art data mining tools to realistic data sets. The team projects take place in the last third of the term. In these projects, students carry out a more sophisticated data mining project of their own choice and present the results in a written report and an oral presentation.

Exam Review (Klausureinsicht)

  • The exam review will take place on 22.09.2015 at 16:00 (updated time) in building B6, room A2.07.

Time and Location

  • Lecture: Wednesday, 10.15 - 11.45, Room A5, B144, Start: 11.02.2015
  • Exercise: Thursday, 10.15 - 11.45, Room B6 A104, Start: 12.02.2015
  • Alternative exercise: Thursday, 12.00 - 13.30, Room B6 A104, Start: 12.02.2015 

Final exam

  • 50 % written exam
  • 50 % project work

Slides and Exercises

The lecture slides and exercises are published on this web page.
The solutions to the exercises are provided in ILIAS.

  1. Slideset: Organization and Course Outline
  2. Slideset: Introduction to Data Mining
  3. Exercise 1: Introduction & Visualization (Slides | Exercise | Dataset)
  4. Slideset: Cluster Analysis
  5. Exercise 2: Clustering (Exercise | Dataset)
  6. Slideset: Classification - Part 1
  7. Exercise 3: Classification (Exercise | Slides 1, Slides 2)
  8. Slideset: Classification - Part 2
  9. Exercise 4: Classification (Exercise | Dataset)
  10. Slideset: Classification - Part 3
  11. Exercise 5: Classification (Exercise | Dataset | Slides)
  12. Slideset: Association Analysis
  13. Exercise 6: Association Analysis (Exercise | Dataset)
  14. Slideset: Text Mining
  15. Exercise 7: Text Mining (Exercise | Dataset)
  16. Slideset: Student Projects

 Contact Person

  • If you have any questions, please contact Oliver Lehmberg (oli(at)informatik.uni-mannheim.de) 

Outline

Week | Wednesday | Thursday
11.02.2015 | Introduction to Data Mining | Introduction to RapidMiner
18.02.2015 | Lecture Clustering | Exercise Clustering
25.02.2015 | Lecture Classification 1 | Exercise Classification
04.03.2015 | Lecture Classification 2 | Exercise Classification
11.03.2015 | Lecture Classification 3 | Exercise Classification
18.03.2015 | Lecture Association Analysis | Exercise Association Analysis
25.03.2015 | Lecture Text Mining | Exercise Text Mining
Easter break
15.04.2015 | Introduction to Student Projects | Preparation of Project Outline
22.04.2015 | Feedback Student Projects | Feedback on demand
29.04.2015 | Project Work | Feedback on demand
06.05.2015 | Project Work | Feedback on demand
13.05.2015 | Project Work | Public holiday
20.05.2015 | Project Work | Submission of project summaries by Friday, 22.05.2015, 23:59
27.05.2015 | Presentation of project results | Presentation of project results

Registration

  • The course is open to students of the Master Business Informatics and Lehramt Informatik.
  • The course is restricted to 60 participants.
  • Registration is done via ILIAS.
  • Registration opens on Wednesday, February 4th, 2015, at 10:00 am.
  • Places are allocated on a first-come, first-served (FCFS) basis, up to the limit of 60 students.
  • We offer two alternative times for the exercise session (Thursdays at 10.15 or Thursdays at 12.00). After you have registered for the course, sign up for one of the two groups in ILIAS. Each group is limited to 30 students.

Literature 

  1. Pang-Ning Tan, Michael Steinbach, Vipin Kumar: Introduction to Data Mining, Pearson.
  2. Ian H. Witten, Eibe Frank, Mark A. Hall: Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Morgan Kaufmann.
  3. Bing Liu: Web Data Mining, 2nd Edition, Springer.

Videos and Screen Casts

  • Video recordings of the Data Mining I lectures and screen casts of the exercises are available here.