Focus Group: Data Analytics (Prof. Gemulla)

Our group's research focuses on systems and methods for analyzing and mining large datasets as well as their application in practice. Our research interests include:
  • Data analysis and data mining
  • Text mining and information extraction
  • Optimization
  • Approximation techniques
  • Algorithms for modern hardware

Data and Software

  • CORE: Context-aware open relation extraction with factorization machines
  • FINET: Context-aware fine-grained named entity typing
  • Werdy: Recognition and Disambiguation of Verbs and Verb Phrases with Syntactic and Semantic Pruning
  • ClausIE: Clause-Based Open Information Extraction
  • LEMP: Fast Retrieval of Large Entries in a Matrix Product
  • LASH: Large-Scale Sequence Mining with Hierarchies
  • MG-FSM: Large-Scale Frequent Sequence Mining


Current semester (HWS 2016)

Prof. Gemulla is on sabbatical leave; no lectures will be offered.

Previous semesters

Master and Bachelor theses

If you are interested in writing a seminar paper or a Bachelor or Master thesis with us, please contact Rainer Gemulla directly. The list below covers only a small part of the topics currently offered by our group; talk to us for more topics and additional information. Your own ideas and interests are welcome as well.

The Bachelor thesis is jointly supported by Prof. Rainer Gemulla (Lehrstuhl für Praktische Informatik I) and Stefan Weil (UB Mannheim).


The Bachelor thesis is jointly supported by Prof. Rainer Gemulla (Lehrstuhl für Praktische Informatik I) and Dr. Philipp Zumstein (UB Mannheim).


Frequent sequence mining (FSM) is a key task in data mining. The goal of FSM is to discover sequential patterns in a given dataset, i.e., in a...


A language model assigns a probability to a sequence of words. Language models are used in a wide range of applications, including information retrieval, speech...


ClausIE is an Open Information Extraction (OIE) system that recognizes facts or propositions in raw text in a domain-independent way. For instance, given...


Roche Diagnostics Mannheim conducts a quarterly online survey to record employee satisfaction with various...


DBpedia is a community effort to extract structured information from Wikipedia and to make this information publicly available on the Web in the form...