Data Mining (FSS 2016)

The course provides an introduction to advanced data analysis techniques as a basis for analyzing business data and providing input for decision support systems. The course will cover the following topics:

  • Goals and Principles of Data Mining
  • Data Representation and Preprocessing
  • Clustering
  • Classification
  • Association Analysis
  • Text Mining
  • Systems and Applications (e.g. Retail, Finance, Web Analysis)

The course consists of a lecture with accompanying practical exercises and student team projects. In the exercises, participants gain initial experience in applying state-of-the-art data mining tools to realistic data sets. The team projects take place in the last third of the term. Within the projects, students carry out more sophisticated data mining projects of their own choice and report on the results in a written report and an oral presentation.

Exam Review

  • The exam review (Klausureinsicht) for both exams will take place on Friday, September 23rd at 14:00 in room B6, B1.21.

Time and Location

  • Lecture: Wednesday, 10.15 - 11.45, Room A5 B144, Start date 17.2.2016
  • Exercise 1: Thursday, 10.15 - 11.45, Room B6 23-25, A104, Start date 18.2.2016
  • Exercise 2: Thursday, 12.00 - 13.30, Room B6 23-25, A104

Note: There are two parallel exercise groups; you should attend only one of them.

Instructors

Final exam

  • 50 % written exam
  • 50 % project work

Slides and Exercises

The lecture slides and exercise material will be published on this web page every week. 

  1. Lecture: Introduction and Organisation
  2. Exercise 1: Introduction to RapidMiner (Task | Dataset)
  3. Lecture: Cluster Analysis
  4. Exercise 2: Clustering (Task | Dataset)
  5. Lecture: Classification - Part 1
  6. Exercise 3: Classification - Part 1 (Slides | Task)
  7. Lecture: Classification - Part 2
  8. Exercise 4: Classification - Part 2 (Task | Dataset)
  9. Lecture: Classification - Part 3
  10. Exercise 5: Classification - Part 3 (Task | Dataset)
  11. Lecture: Association Analysis
  12. Exercise 6: Association Analysis (Task | Dataset)
  13. Lecture: Text Mining
  14. Exercise 7: Text Mining (Task | Dataset)
  15. Lecture: Introduction to Student Projects

Contact Person

  • If you have any questions, please contact Oliver Lehmberg (oli(at)informatik.uni-mannheim.de) 

Registration

  • The course is open to students of the Master Business Informatics and Lehramt Informatik.
  • The course is restricted to 60 participants.
  • Registration opens on Monday, February 8th 2016, at 10:15 am.
  • Registration is done via ILIAS using this link (once the registration is open)
  • Places are allocated on a first-come, first-served basis (limit: 60 students).
  • We offer two alternative times (Thursdays 12.00 and 13.45) for the exercise session. Sign in to one of the two groups within ILIAS after you have registered for the course. Each group is restricted to 30 students.

Outline

Week | Wednesday | Thursday
17.02.2016 | Introduction to Data Mining | Introduction to RapidMiner
24.02.2016 | Lecture Clustering | Exercise Clustering
02.03.2016 | Lecture Classification 1 | Exercise Classification
09.03.2016 | Lecture Classification 2 | Exercise Classification
16.03.2016 | Lecture Classification 3 | Exercise Classification
Easter break
06.04.2016 | Lecture Association Analysis | Exercise Association Analysis
13.04.2016 | Lecture Text Mining | Exercise Text Mining
20.04.2016 | Introduction to Student Projects | Preparation of Project Outline
27.04.2016 | Feedback Student Projects | Feedback on demand
04.05.2016 | Project Work | Feedback on demand
11.05.2016 | Project Work | Feedback on demand
18.05.2016 | Project Work | Feedback on demand
25.05.2016 | Project Work | Submission of project summaries
01.06.2016 | Presentation of project results | Presentation of project results

Literature 

  1. Pang-Ning Tan, Michael Steinbach, Vipin Kumar: Introduction to Data Mining, Pearson.
  2. Vijay Kotu, Bala Deshpande: Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner. Morgan Kaufmann.
  3. Bing Liu: Web Data Mining, 2nd Edition, Springer.

Software

Videos and Screen Casts

  • Video recordings of the Data Mining I lectures and screen casts of the exercises are available here.

Course Evaluations