Focus Group: Natural Language Processing and Information Retrieval (Prof. Ponzetto and Prof. Glavaš)

The NLP and IR group at DWS conducts research on integrating knowledge from heterogeneous Web sources – ranging from large raw text collections to collaboratively constructed resources (e.g., Wikipedia) and knowledge bases (DBpedia, Freebase, etc.) – and on applying this knowledge to Natural Language Processing (NLP), Information Analysis and Retrieval tasks. Areas of interest include “deep” NLP techniques for text understanding, ranging from lexical and computational semantics (Word Sense Disambiguation, ontology-based and distributional meaning representations) through information extraction (entities and events) to document understanding and structuring (entity linking, ranking and search, automatic summarization). The group also applies NLP methods to support empirical research in the Social Sciences and Humanities.

People

Faculty:

Staff:

Alumni:

 *  joint project with the AI group

**  joint work with the Web-based Information Systems and Services @ HDM Stuttgart

Projects

Publications

Conference Item

  • Alexander Diete, Timo Sztyler, Lydia Weiland and Heiner Stuckenschmidt. Improving motion-based activity recognition with ego-centric vision. In: 2018 IEEE International Conference on Pervasive Computing and Communications : PerCom 2018, Athens, Greece, March 19-23, 2018 : PerCom Workshops proceedings; tba. IEEE Computer Society, Piscataway, NJ, 2018.
  • Thorsten Keiper, Zhonghao Lyu, Sara Pooladzadeh, Yuan Xu, Jingyi Zhang, Anne Lauscher and Simone Paolo Ponzetto. UniMa at SemEval-2018 Task 7 : semantic relation extraction and classification from scientific publications. In: The International Workshop on Semantic Evaluation - proceedings of the twelfth workshop : June 5-June 6, 2018, New Orleans, Louisiana : NAACL HLT 2018; 826-830. Association for Computational Linguistics, Stroudsburg, PA, 2018.
  • Federico Nanni, Goran Glavaš, Simone Paolo Ponzetto, Sara Tonelli, Nicolò Conti, Ahmet Aker, Alessio Palmero Aprosio, Arnim Bleier, Benedetta Carlotti, Theresa Gessler, Tim Henrichsen, Dirk Hovy, Christian Kahmann, Mladen Karan, Akitaka Matsuo, Stefano Menini and Don Nguyen. Findings from the hackathon on understanding euroscepticism through the lens of textual data. In: Proceedings of the LREC 2018 Workshop ParlaCLARIN : Miyazaki, Japan, Monday 7th May 2018; 1-8. LREC, Miyazaki, Japan, 2018.
  • Federico Nanni, Mahmoud Osman, Yi-Ru Cheng, Simone Paolo Ponzetto and Laura Dietz. UKParl: A Data Set for Topic Detection with Semantically Annotated Text. In: Proceedings of the LREC 2018 Workshop ParlaCLARIN : Miyazaki, Japan, Monday 7th May 2018; 1-4. LREC, Miyazaki, Japan, 2018.
  • Federico Nanni, Simone Paolo Ponzetto and Laura Dietz. Entity-aspect linking : providing fine-grained semantics of entities in context. In: Joint Conference on Digital Libraries 2018 : June 3-6, 2018 in Fort Worth, Texas : Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL); 1-10. ACM, New York, NY, 2018.
  • Alexander Panchenko, Dmitry Ustalov, Stefano Faralli, Simone Paolo Ponzetto and Chris Biemann. Improving hypernymy extraction with distributional semantic classes. In: LREC 2018, 11th International Conference on Language Resources and Evaluation : 7-12 May 2018, Miyazaki (Japan); 1541-1551. European Language Resources Association, ELRA-ELDA, Paris, 2018.
  • Christoph Kilian Theil, Sanja Štajner and Heiner Stuckenschmidt. Word embeddings-based uncertainty detection in financial disclosures. In: ACL 2018 Workshop on Economics and Natural Language Processing (ECONLP); tba. Association for Computational Linguistics, Stroudsburg, PA, 2018.
  • Christoph Kilian Theil, Sanja Štajner, Heiner Stuckenschmidt and Simone Paolo Ponzetto. Automatic detection of uncertain statements in the financial domain. In: Lecture notes in computer science ; 10761 + 10762. Computational Linguistics and Intelligent Text Processing : 18th International Conference, CICLing 2017, Budapest, Hungary, April 17–23, 2017, Revised Selected Papers, Part I / II; tba. Springer International Publishing, Cham, 2018.
  • Dmitry Ustalov, Alexander Panchenko, Andrei Kutuzov, Chris Biemann and Simone Paolo Ponzetto. Unsupervised semantic frame induction using triclustering. In: The 56th Annual Meeting of the Association for Computational Linguistics : ACL 2018 : proceedings of the conference, vol. 2 (short papers) : July 15 - 20, 2018 Melbourne, Australia; 55-62. Association for Computational Linguistics, Stroudsburg, PA, 2018.
  • Dmitry Ustalov, Denis Teslenko, Alexander Panchenko, Mikhail Chernoskutov, Chris Biemann and Simone Paolo Ponzetto. An unsupervised word sense disambiguation system for under-resourced languages. In: LREC 2018, 11th International Conference on Language Resources and Evaluation : 7-12 May 2018, Miyazaki (Japan); 1018-1022. European Language Resources Association, ELRA-ELDA, Paris, 2018.

Master and Bachelor Theses

This thesis should provide an in-depth overview of the various recurrent neural network models (fully recurrent networks, recursive networks, long...

This thesis should provide an in-depth overview of the state-of-the-art methods for representing knowledge graphs and knowledge bases in the (i.e.,...

Social networks are of high interest for many applications, ranging from simple user profiling to user-customized advertisement. In this thesis, we...

Continuous emotion detection is a core aspect of many real-world applications. In this work we will experiment with an existing interactive installation...

The goal of this thesis is to organize news from German news outlets in such a way as to detect events and salient topics in the news. The...

Convolutional neural networks have been shown to be very successful in various text classification tasks. The main shortcoming of CNNs used for text...

Recently, the DWS group released a huge repository of hypernymy relations from the Web, the WebIsADb (http://webdatacommons.org/isadb/), containing a large...

In this thesis we will build upon and extend an annotation tool to conduct a user study and better understand the requirements towards image...

Object detection in images from news articles is a very challenging task. On the one hand, available training data for object detectors is only...

Introduction/problem: Speculation/hedging/vagueness identification plays a significant role in many applications, e.g. information extraction, machine...
