Web Mining (FSS2015)

The textual content and structured data accessible on the Web have enormous potential for being mined to derive knowledge about nearly any aspect of human life. The course covers advanced data mining techniques for extracting knowledge from Web content, the Web link structure, and usage data gathered by Web applications. It covers the following topics:

  • Web Usage Mining
  • Recommender Systems
  • Web Structure Mining
  • Social Network Analysis
  • Web Content Mining
  • Information Extraction
  • Sentiment Analysis

The course consists of lectures with accompanying practical exercises and student team projects.

In the exercises, participants gain initial experience in applying state-of-the-art web mining tools and libraries to realistic data sets.

The team projects take place in the last third of the term. In these projects, students carry out more sophisticated web mining projects of their own choice and present the results in a written report and an oral presentation.

Time and Location

  • Thursday, 15:30 to 17:00, Room: B 6, A104, Start: 12.2.2015
  • Friday, 12:00 to 13:30, Room: B 6, A104, Start: 13.2.2015

Instructor

Final mark

  • 50 % written exam
  • 50 % project work

Slides and Exercises

The slides will be published on this webpage before each lecture.

  1. Slideset: Introduction and Course Outline
  2. Slideset: Web Usage Mining
  3. Exercise 1: Recommender Systems (Source | DataSet)
  4. Slideset: Web Structure Mining
  5. Exercise 2: Introduction to Pajek ( DataSet ) | Part 2 | Part 3 ( Additional DataSet )
  6. Slideset: Web Content Mining Part I (Sentiment Analysis)
  7. Exercise 3: Exercise sheet, Slides, RapidMiner process, Eclipse project (in ILIAS only)
  8. Slideset: Web Content Mining Part II (Information Extraction)
  9. Exercise 4: Exercise Sheet
  10. Slideset: Introduction to Student Projects

Participation 

  • The course is open to students of the Master Business Informatics 
  • The course is restricted to 30 participants
  • Students can register by joining the ILIAS group.

Requirements

  • Prior attendance of the BI 600 Data Mining lecture is not required.
  • Basic programming skills in Java are required for the exercises.

 Course Evaluations

 Outline

Week       | Topic Thursday                                      | Topic Friday
12.02.2015 | Lecture: Introduction to Web Mining                 | Lecture: Web Usage Mining
19.02.2015 | Lecture: Recommender Systems                        | Exercise: Recommender Systems
26.02.2015 | Exercise: Recommender Systems                       | Lecture: Web Structure Mining
05.03.2015 | Lecture: Social Network Analysis                    | Lecture: Introduction to Pajek
12.03.2015 | Exercise: Social Network Analysis                   | Exercise: Social Network Analysis
19.03.2015 | Lecture: Web Content Mining: Sentiment Analysis     | Exercise: Sentiment Analysis
26.03.2015 | Lecture: Web Content Mining: Information Extraction | Exercise: Information Extraction
Easter break
16.04.2015 | Introduction to Student Projects                    | Preparation of Project Outlines
23.04.2015 | Feedback about Project Outlines                     | Project work
30.04.2015 | Coaching                                            | - Holiday -
07.05.2015 | Project work                                        | Coaching
14.05.2015 | - Holiday -                                         | Coaching
21.05.2015 | Project work                                        | Coaching
28.05.2015 | Presentation of project results                     | Presentation of project results

 Literature 

  1. Bing Liu: Web Data Mining, 2nd Edition, Springer.
  2. Wouter de Nooy, Andrej Mrvar, Vladimir Batagelj: Exploratory Social Network Analysis with Pajek, Cambridge University Press.
  3. Dietmar Jannach et al.: Recommender Systems: An Introduction, Cambridge University Press.
  4. Pang-Ning Tan, Michael Steinbach, Vipin Kumar: Introduction to Data Mining, Pearson.
  5. Ian H. Witten, Eibe Frank, Mark A. Hall: Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Morgan Kaufmann.

Software