Dmitry Ustalov

Post-Doctoral Researcher

B6, 26, Room B 1.19
D-68159 Mannheim

Email: dmitry (at) informatik.uni-mannheim.de

Research Group: Natural Language Processing and Information Retrieval

About

I joined the Data and Web Science group in November 2017. My research focuses on computational semantics, especially word sense induction and disambiguation, as well as automatic thesaurus construction and evaluation using unstructured data and crowdsourcing. In February 2018, I defended my Kandidat Nauk (PhD) thesis.

I am working on the JOIN-T project funded by the Deutsche Forschungsgemeinschaft (DFG).

You can find my publications listed on Google Scholar, Scopus, dblp, and arXiv.

My ORCID iD is https://orcid.org/0000-0002-9979-2188 and my ResearcherID is P-6307-2014.

Research Interests

  • Natural Language Processing
  • Computational Semantics
  • Crowdsourcing

Selected Results

  • Watset, an efficient meta-algorithm for fuzzy graph clustering. This algorithm creates an intermediate representation of the input graph that naturally reflects the “ambiguity” of its nodes. Then, it uses hard clustering to discover clusters in this intermediate graph. Watset shows excellent results on the synset induction task for multiple languages as reported in our ACL 2017 paper: doi:10.18653/v1/P17-1145.
  • Hyperstar, a regularized projection learning approach that transforms the word embeddings of hyponyms into the word embeddings of the corresponding hypernyms. A regularization term is added to the loss function to explicitly enforce the asymmetry of the “is-a” semantic relation. As a result, it is possible to generate more accurate hypernyms from the same data, as reported in our EACL 2017 paper: doi:10.18653/v1/E17-2087.
  • Mechanical Tsar, an open source engine for microtask-based crowdsourcing. This engine makes it possible to host a highly customizable crowdsourcing platform and annotate your data using volunteers from either the Internet or a private community. It supports automatic task allocation, worker ranking, answer aggregation, and agreement assessment. More information can be found on its website, mtsar.nlpub.org, or in the paper: doi:10.15514/ISPRAS-2015-27(3)-25.
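
The two-stage idea behind Watset described above can be illustrated with a minimal sketch. This is not the published implementation: the function names are hypothetical, the graph is unweighted, and plain connected components stand in for the hard clustering algorithm at both stages, purely to keep the example short.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Stand-in hard clustering (an assumption): connected components."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            n = stack.pop()
            component.add(n)
            for m in adj[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        components.append(component)
    return components


def fuzzy_clusters_sketch(edges):
    """Hypothetical helper: fuzzy clustering via an intermediate sense graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Stage 1: induce "senses" of each node by hard-clustering its
    # neighborhood (the node itself is excluded from the neighborhood).
    senses = {}
    for node, nbrs in adj.items():
        ego_edges = [(a, b) for a in nbrs for b in adj[a] if b in nbrs and a < b]
        senses[node] = connected_components(sorted(nbrs), ego_edges)

    def sense_of(neighbor, node):
        # The sense of `neighbor` whose context contains `node`.
        for j, context in enumerate(senses[neighbor]):
            if node in context:
                return (neighbor, j)

    # Stage 2: build the intermediate graph whose nodes are senses.
    sense_nodes = [(n, i) for n in senses for i in range(len(senses[n]))]
    sense_edges = [
        ((n, i), sense_of(m, n))
        for n, contexts in senses.items()
        for i, context in enumerate(contexts)
        for m in context
    ]

    # Stage 3: hard-cluster the sense graph, then drop the sense indices,
    # so an ambiguous node may end up in several clusters.
    return [
        {n for n, _ in component}
        for component in connected_components(sense_nodes, sense_edges)
    ]


# Toy synonymy graph in which "bank" is ambiguous.
edges = [
    ("bank", "streambank"), ("bank", "riverbank"), ("streambank", "riverbank"),
    ("bank", "building"), ("bank", "institution"), ("building", "institution"),
]
clusters = sorted(sorted(c) for c in fuzzy_clusters_sketch(edges))
print(clusters)
```

In this toy graph the word "bank" receives two senses in the intermediate graph, so it appears in two of the resulting clusters even though each stage uses only hard clustering.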

Recent Publications

2018

  • Alexander Panchenko, Dmitry Ustalov, Stefano Faralli, Simone Paolo Ponzetto and Chris Biemann. Improving hypernymy extraction with distributional semantic classes. In: LREC 2018, 11th International Conference on Language Resources and Evaluation: 7-12 May 2018, Miyazaki (Japan); tba. European Language Resources Association, ELRA-ELDA, Paris, 2018.
  • Dmitry Ustalov, Mikhail Chernoskutov, Alexander Panchenko and Chris Biemann. Fighting with the sparsity of the synonymy dictionaries for automatic synset induction. In: Lecture Notes in Computer Science. Analysis of Images, Social Networks and Texts: 6th International Conference, AIST 2017, Moscow, Russia, July 27-29, 2017, Revised Selected Papers; 94-105. Springer International Publishing, Cham, 2018.
  • Dmitry Ustalov, Denis Teslenko, Alexander Panchenko, Mikhail Chernoskutov and Chris Biemann. Word sense disambiguation based on automatically induced synsets. In: LREC 2018, 11th International Conference on Language Resources and Evaluation: 7-12 May 2018, Miyazaki (Japan); tba. European Language Resources Association, ELRA-ELDA, Paris, 2018.
