Dr. Dmitry Ustalov

Post-Doctoral Researcher

B6, 26, Room B 1.19
D-68159 Mannheim

Email: dmitry (at) informatik.uni-mannheim.de

Research Group: Natural Language Processing and Information Retrieval

About

I joined the Data and Web Science group in November 2017. My research focuses mainly on computational semantics, especially word sense induction and disambiguation, as well as automatic thesaurus construction and evaluation using unstructured data and crowdsourcing. In February 2018, I defended my Kandidat Nauk (PhD) thesis.

I am working on the JOIN-T project funded by the Deutsche Forschungsgemeinschaft (DFG).

You can find my publications listed on Google Scholar, Scopus, dblp, and arXiv.

My ORCID iD is https://orcid.org/0000-0002-9979-2188 and my ResearcherID is P-6307-2014.

Research Interests

  • Natural Language Processing
  • Computational Semantics
  • Crowdsourcing

Selected Results

  • Watset, an efficient meta-algorithm for fuzzy graph clustering. This algorithm creates an intermediate representation of the input graph that naturally reflects the “ambiguity” of its nodes. Then, it uses hard clustering to discover clusters in this intermediate graph. Watset shows excellent results on the synset induction task for multiple languages as reported in our ACL 2017 paper: doi:10.18653/v1/P17-1145.
  • Hyperstar, a regularized projection learning approach that transforms the word embeddings of hyponyms into the word embeddings of the corresponding hypernyms. A regularization term is added to the loss function to explicitly enforce the asymmetry of the “is-a” semantic relation. As a result, it is possible to generate more accurate hypernyms from the same data, as reported in our EACL 2017 paper: doi:10.18653/v1/E17-2087.
  • Mechanical Tsar, an open source engine for microtask-based crowdsourcing. This engine makes it possible to host a highly customizable crowdsourcing platform and annotate your data with the help of volunteers from either the Internet or a private community. It supports automatic task allocation, worker ranking, answer aggregation, and agreement assessment. More information can be found on its website, mtsar.nlpub.org, or in the paper: doi:10.15514/ISPRAS-2015-27(3)-25.
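
The Watset idea described above (build an intermediate graph that reflects node ambiguity, then cluster it with a hard clustering algorithm) can be illustrated with a minimal sketch. This is not the reference implementation: the function names are hypothetical, the input is assumed to be an unweighted, undirected edge list without self-loops, and connected components stand in for the hard clustering algorithm purely to keep the example short.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Stand-in hard clustering (an assumption of this sketch):
    connected components of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for other in adj[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        clusters.append(component)
    return clusters


def watset_sketch(edges):
    """Return possibly overlapping (fuzzy) clusters of the input graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Step 1: induce the "senses" of each node by hard-clustering its
    # neighborhood (the node itself is excluded from that neighborhood).
    senses = {}
    for u, nbrs in adj.items():
        ego_edges = [(a, b) for a in nbrs for b in adj[a] if b in nbrs]
        senses[u] = connected_components(sorted(nbrs), ego_edges)

    def sense_of(v, u):
        # The sense of v whose context contains u; one always exists
        # because u is a neighbor of v and the senses partition neighbors.
        for index, context in enumerate(senses[v]):
            if u in context:
                return (v, index)

    # Step 2: build the intermediate graph whose nodes are (node, sense) pairs.
    sense_nodes = [(u, i) for u in senses for i in range(len(senses[u]))]
    sense_edges = [((u, i), sense_of(v, u))
                   for u in senses
                   for i, context in enumerate(senses[u])
                   for v in context]

    # Step 3: hard-cluster the intermediate graph and drop the sense labels,
    # so an ambiguous node may end up in several clusters.
    return [{node for node, _ in component}
            for component in connected_components(sense_nodes, sense_edges)]


# Toy synonymy graph in which "bank" is ambiguous.
edges = [
    ("bank", "shore"), ("bank", "riverbank"), ("shore", "riverbank"),
    ("bank", "lender"), ("bank", "depository"), ("lender", "depository"),
]
clusters = watset_sketch(edges)
```

On this toy graph the node "bank" receives two senses, so it appears in two of the resulting clusters, which is the kind of overlap a hard clustering of the original graph cannot produce.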

Recent Publications

2018

  • Alexander Panchenko, Anastasiya Lopukhina, Dmitry Ustalov, Konstantin Lopukhin, Nikolay Arefyev, Alexey Leontyev and Natalia Loukachevitch. RUSSE'2018 : a shared task on word sense induction for the Russian language. In: Dialogue : Computational Linguistics and Intellectual Technologies : Papers from the Annual conference "Dialogue" 2018 : 24th International Conference on Computational Linguistics and Intellectual Technologies, May 30 - June 2, 2018, Moscow; 547-564. RSUH, Moscow, Russia, 2018.
  • Alexander Panchenko, Dmitry Ustalov, Stefano Faralli, Simone Paolo Ponzetto and Chris Biemann. Improving hypernymy extraction with distributional semantic classes. In: LREC 2018, 11th International Conference on Language Resources and Evaluation : 7-12 May 2018, Miyazaki (Japan); 1541-1551. European Language Resources Association, ELRA-ELDA, Paris, 2018.
  • Dmitry Ustalov, Mikhail Chernoskutov, Alexander Panchenko and Chris Biemann. Fighting with the sparsity of the synonymy dictionaries for automatic synset induction. In: Lecture notes in computer science : Analysis of Images, Social Networks and Texts : 6th International Conference, AIST 2017, Moscow, Russia, July 27-29, 2017, Revised Selected Papers; 94-105. Springer International Publishing, Cham, 2018.
  • Dmitry Ustalov, Alexander Panchenko, Andrei Kutuzov, Chris Biemann and Simone Paolo Ponzetto. Unsupervised semantic frame induction using triclustering. In: The 56th Annual Meeting of the Association for Computational Linguistics : ACL 2018 : proceedings of the conference, vol. 2 (short papers) : July 15 - 20, 2018, Melbourne, Australia; 55-62. Association for Computational Linguistics, Stroudsburg, PA, 2018.
  • Dmitry Ustalov, Denis Teslenko, Alexander Panchenko, Mikhail Chernoskutov, Chris Biemann and Simone Paolo Ponzetto. An unsupervised word sense disambiguation system for under-resourced languages. In: LREC 2018, 11th International Conference on Language Resources and Evaluation : 7-12 May 2018, Miyazaki (Japan); 1018-1022. European Language Resources Association, ELRA-ELDA, Paris, 2018.
