Two papers from the DWS group have been accepted at ESWC 2017:
Despite the growing amount of research in link and type prediction in knowledge graphs, systematic benchmark datasets are still scarce. In this paper, we propose a synthesis model for the generation of benchmark datasets for those tasks. Synthesizing data is a way of having control over important characteristics of the data, and allows the study of the impact of such characteristics on the performance of different methods. The proposed model uses existing knowledge graphs to create synthetic graphs with similar characteristics, such as distributions of classes, relations, and instances, selection of instances, and Horn rules. As a first step, we replicate already existing knowledge graphs in order to validate the synthesis model. To do so, we perform extensive experiments with different link and type prediction methods. We show that we can systematically create knowledge graph benchmarks which allow for quantitative measurements of the result quality and scalability of link and type prediction methods.
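To give a rough feel for what replicating a graph's characteristics can mean, here is a minimal sketch in Python. It is not the model from the paper: it only preserves the relation frequencies of a source graph and ignores class distributions and Horn rules. The function name `synthesize_triples` and the toy triples are hypothetical and chosen purely for illustration.

```python
import random
from collections import Counter


def synthesize_triples(triples, n, seed=0):
    """Sample n synthetic (subject, relation, object) triples.

    Assumption for illustration: relations are drawn in proportion to their
    frequency in the source graph, and subjects/objects are drawn uniformly
    from the entities observed with that relation.
    """
    rng = random.Random(seed)

    # Empirical relation distribution of the source graph.
    relation_counts = Counter(p for _, p, _ in triples)
    relations = sorted(relation_counts)
    weights = [relation_counts[r] for r in relations]

    # Entities observed as subject or object of each relation.
    subjects = {r: sorted({s for s, p, _ in triples if p == r}) for r in relations}
    objects = {r: sorted({o for _, p, o in triples if p == r}) for r in relations}

    synthetic = []
    for _ in range(n):
        r = rng.choices(relations, weights=weights, k=1)[0]
        synthetic.append((rng.choice(subjects[r]), r, rng.choice(objects[r])))
    return synthetic


# Hypothetical toy graph, not taken from any real dataset.
toy_graph = [
    ("Mannheim", "locatedIn", "Germany"),
    ("Berlin", "locatedIn", "Germany"),
    ("Heiko", "worksAt", "Mannheim"),
]
sample = synthesize_triples(toy_graph, 10)
```

Because the sampler is seeded, repeated calls with the same arguments yield the same synthetic graph, which is the kind of reproducibility a benchmark needs.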
"Data-driven Joint Debugging of the DBpedia Mappings and Ontology: Towards Addressing the Causes instead of the Symptoms of Data Quality in DBpedia" by Heiko Paulheim
DBpedia is a large-scale, cross-domain knowledge graph extracted from Wikipedia. For the extraction, crowd-sourced mappings from Wikipedia infoboxes to the DBpedia ontology are utilized. In this process, different problems may arise: users may create wrong and/or inconsistent mappings, use the ontology in an unforeseen way, or change the ontology without considering all possible consequences. In this paper, we present a data-driven approach to discover problems in mappings as well as in the ontology and its usage in a joint, data-driven process. We show both quantitative and qualitative results about the problems identified, and derive proposals for altering mappings and refactoring the DBpedia ontology.