CS 560: Large-Scale Data Management (HWS17)


  • Lecturer: Prof. Dr. Rainer Gemulla
  • Tutors: Daniel Ruffinelli
  • Type of course: Lecture, exercises (6 ECTS points)
  • Lecture: Tuesday, 8:30-10:00, A5 6, B144
  • Tutorial: Thursday, 13:45-15:15, A5 6, B244, and Friday, 10:15-11:45, A5 6, C013
  • Evaluation: Written exam
  • Prerequisites: Database Systems I or equivalent, programming experience
  • Registration: Enroll in ILIAS



This course introduces the fundamental concepts and computational paradigms of large-scale data management and Big Data. This includes methods for storing, updating, querying, and analyzing large datasets as well as for data-intensive computing. The course covers concepts, algorithms, and system issues; accompanying exercises provide hands-on experience. Topics include:

  • Parallel and distributed databases
  • MapReduce and its ecosystem
  • Spark and dataflows
  • NoSQL databases
  • Stream processing (tentative)
  • Graph databases (tentative)

Lecture Notes

Lecture notes, exercises, and supplementary material can be found in ILIAS.

Literature

  • H. Garcia-Molina, J. D. Ullman, J. Widom
    Database Systems: The Complete Book
    Prentice Hall, 2nd ed., 2008
  • M. T. Özsu, P. Valduriez
    Principles of Distributed Database Systems
    Springer, 3rd ed., 2011
  • T. White
    Hadoop – The Definitive Guide
    O’Reilly, 3rd ed., 2012
  • J. Lin, C. Dyer
    Data-Intensive Text Processing with MapReduce
    Morgan and Claypool, 1st ed., 2010
  • C. Strauch
    NoSQL Databases
    Stuttgart Media University, 2011
  • E. Redmond, J. R. Wilson
    Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement
    Pragmatic Bookshelf, 1st ed., 2012
  • P. J. Sadalage, M. Fowler
    NoSQL Distilled
    Addison-Wesley, 2012
  • More in lecture notes