Master Thesis: Speculation detection in political speeches (Štajner, Ponzetto)

Introduction/problem: Identifying speculation, hedging, and vagueness plays a significant role in many applications, e.g. information extraction, machine translation, and text simplification.

Goal: Automatic identification of sentences that contain speculation or hedging, or that are vague. Such sentences need special care in translation and in information extraction systems, which should usually retain only information that is certain, not speculations and hypotheses.

Approach: Knowledge-rich speculation detection approach on political speeches.
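
Illustration: One way to read "knowledge-rich" is that hand-curated linguistic knowledge, such as a lexicon of hedge cues, is turned into features for a supervised classifier. The sketch below is a minimal example under that assumption and is not the method prescribed for the thesis. The cue list `HEDGE_CUES`, the function names, and the threshold `min_cues` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny cue lexicon (an assumption for illustration).
# A real system would rely on a curated resource and on lemmatised input.
HEDGE_CUES = {
    "may", "might", "could", "possibly", "perhaps", "probably",
    "suggest", "suggests", "believe", "seems", "likely",
}


def tokenise(sentence):
    """Lowercase the sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", sentence.lower())


def cue_features(sentence):
    """Return simple lexicon-based features for one sentence."""
    tokens = tokenise(sentence)
    cues = [t for t in tokens if t in HEDGE_CUES]
    return {
        "n_tokens": len(tokens),
        "n_cues": len(cues),
        # Share of tokens that are hedge cues; 0.0 for an empty sentence.
        "cue_ratio": len(cues) / len(tokens) if tokens else 0.0,
        "cues": cues,
    }


def is_speculative(sentence, min_cues=1):
    """Rule-based baseline: flag a sentence with at least min_cues hedge cues."""
    return cue_features(sentence)["n_cues"] >= min_cues


if __name__ == "__main__":
    for s in [
        "The reform might possibly reduce unemployment.",
        "The parliament passed the budget yesterday.",
    ]:
        print(is_speculative(s), cue_features(s)["cues"], s)
```

In a supervised setting, the dictionary returned by `cue_features` would serve as part of a feature vector rather than as a decision rule. The `is_speculative` baseline is only a rough point of comparison.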

Additional goals: Direct comparison of systems built for different domains (political speeches vs. Wikipedia).

Requirements: Basic knowledge of supervised classification algorithms and text processing (tokenisation, lemmatisation, etc.).