Focus Group: Data Analytics (Prof. Gemulla)

Our group's research focuses on systems and methods for analyzing and mining large datasets as well as their application in practice. Our research interests include:
  • Data analysis and data mining
  • Text mining and information extraction
  • Optimization
  • Approximation techniques
  • Algorithms for modern hardware

Data and Software

  • CORE: Context-aware open relation extraction with factorization machines
  • FINET: Context-aware fine-grained named entity typing
  • Werdy: Recognition and Disambiguation of Verbs and Verb Phrases with Syntactic and Semantic Pruning
  • ClausIE: Clause-Based Open Information Extraction
  • LEMP: Fast Retrieval of Large Entries in a Matrix Product
  • LASH: Large-Scale Sequence Mining with Hierarchies
  • MG-FSM: Large-Scale Frequent Sequence Mining

Courses

Current semester (FSS 2016)

Previous semesters

Master and Bachelor theses

If you are interested in writing a seminar paper or a Bachelor or Master thesis with us, please contact Rainer Gemulla directly. The list below covers only a small part of the topics currently offered by our group; talk to us for more topics and additional information. Your own ideas and interests are welcome as well.

Recently, there has been much interest in exploiting Web-scale resources such as the CommonCrawl for intelligent text processing and information extraction...

Recently, we started investigating methods and frameworks to automatically extract high-quality hypernym relations from Web-scale amounts of data,...

The School of Business Informatics and Mathematics and the School of Social Sciences are starting to offer the new degree program Mannheim Master...

The Bachelor thesis is jointly supported by Prof. Rainer Gemulla (Lehrstuhl für Praktische Informatik I) and Stefan Weil (UB Mannheim).

The Bachelor thesis is jointly supported by Prof. Rainer Gemulla (Lehrstuhl für Praktische Informatik I) and Dr. Philipp Zumstein (UB Mannheim).

Data integration problems arise whenever data from separate sources needs to be combined as the basis for new applications. Within the context of the...

A large number of e-shops have started to mark up structured data about products, offers and reviews in their HTML pages using the markup standard Micr...

Object detection in images from news articles is a very challenging task. On the one hand, available training data for object detectors is only...

Frequent sequence mining (FSM) is a key task in data mining. The goal of FSM is to discover sequential patterns in a given dataset, i.e., in a...
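
As a rough illustration of the task, the sketch below counts sequential patterns by brute force. It rests on assumptions that the teaser above does not state: the dataset is taken to be a list of input sequences, a pattern is any subsequence (gaps allowed), and a pattern counts as frequent if it occurs in at least `min_support` input sequences. The function name and parameters are hypothetical, and this naive enumeration is not the algorithm used by MG-FSM or LASH.

```python
from collections import Counter
from itertools import combinations


def frequent_sequences(database, min_support, max_length):
    """Return {pattern: support} for all frequent subsequences.

    Assumptions (illustrative only): patterns may skip items (gaps allowed),
    and support is the number of input sequences containing the pattern.
    """
    support = Counter()
    for sequence in database:
        # Collect each distinct pattern once per input sequence.
        seen = set()
        for length in range(1, max_length + 1):
            # combinations() keeps the original item order, so every
            # tuple it yields is a subsequence of `sequence`.
            for pattern in combinations(sequence, length):
                seen.add(pattern)
        support.update(seen)
    return {p: c for p, c in support.items() if c >= min_support}


if __name__ == "__main__":
    toy_database = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
    for pattern, count in sorted(frequent_sequences(toy_database, 2, 2).items()):
        print(pattern, count)
```

The enumeration grows exponentially with `max_length`, which is one reason scalable methods are needed for large datasets.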

A language model assigns a probability to a sequence of words. Language models are used in a wide range of applications, including information retrieval, speech...
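
To make the first sentence concrete, here is a minimal sketch of one simple kind of language model: a bigram model with add-one smoothing. The class name, the toy corpus, and the choice of bigrams and smoothing are assumptions made for illustration, not the models studied in the thesis topic. The sketch also assumes that every queried word occurs in the training data.

```python
import math
from collections import Counter


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often a word is followed by something
        self.bigram_counts = Counter()   # how often (previous, current) occurs
        self.vocab = {"</s>"}            # words that can be predicted
        for words in sentences:
            tokens = ["<s>"] + list(words) + ["</s>"]
            self.vocab.update(words)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def log_prob(self, words):
        """Log-probability of a whole word sequence under the model."""
        tokens = ["<s>"] + list(words) + ["</s>"]
        v = len(self.vocab)
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            # Chain rule with a one-word history, smoothed by adding one.
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + v
            total += math.log(numerator / denominator)
        return total


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
    model = BigramModel(corpus)
    print(math.exp(model.log_prob(["the", "cat", "sat"])))
    print(math.exp(model.log_prob(["cat", "the", "sat"])))
```

Under this toy model, a word order seen in training receives a higher probability than a shuffled one, which is the property applications such as retrieval and speech recognition rely on.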
