Data trustworthiness is an important factor in selecting datasets for question answering. In an environment where information is published independently by many different actors, data veracity and quality are usually uncertain, and there is always a risk of consuming misleading data. How to assess the trustworthiness of given data and find authoritative data, in order to reduce this risk, remains a challenge.

We are conducting a survey of trust management and have identified three main areas of research: data attribution and provenance, trust representation, and trust computation; the results of this work will be submitted to a relevant journal. While different solutions are usually proposed for provenance and for trust representation, we believe that a generic representation is possible. To that end, we have developed an ontology to represent contextual information about statements; this work has been submitted to FOIS 2016. On the topic of trust computation, we are working on assessing the trust of datasets based on reuse metrics. We have already performed an initial assessment, computing the PageRank values of datasets by treating the use of external resources as links between datasets. This experiment was performed at web scale and achieved promising results. This work has been submitted to the PROFILES workshop at ESWC 2016. In addition, we are working on trust as a quality metric in collaboration with UBO. This collaboration led to a paper submitted to the WIMS 2016 conference, currently under review, and continues as joint research on data reuse and quality metrics.
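
The idea of ranking datasets by reuse can be illustrated with a small sketch. It assumes that each use of an external resource has already been reduced to a pair (dataset that uses the resource, dataset that owns it); how ownership is determined is not described here, so the input pairs, the function names `dataset_links` and `pagerank`, the damping factor, and the iteration count are all illustrative assumptions rather than the setup used in the experiment.

```python
from collections import defaultdict


def dataset_links(usages):
    """Build a dataset-level link graph from (user, owner) pairs.

    Each pair means: dataset `user` refers to a resource that belongs
    to dataset `owner`. Self-references are ignored (an assumption).
    """
    links = defaultdict(set)
    for user, owner in usages:
        if user != owner:
            links[user].add(owner)
    return dict(links)


def pagerank(links, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over a dict {node: set(targets)}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in links.items():
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy input: datasets A, B and D reuse resources from C,
# and C reuses a resource from A.
usages = [("A", "C"), ("B", "C"), ("C", "A"), ("D", "C")]
ranks = pagerank(dataset_links(usages))
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 4))
```

In this toy graph, the dataset whose resources are reused most often (C) receives the highest score, which is the intuition behind using reuse as a trust signal. A web-scale computation would need a distributed or sparse-matrix implementation rather than in-memory dictionaries.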

Started:
2015-10-05
Supervisors:

Publications

Ranking the Web of Data. Jose M. Garica, Harsh Thakkar, and Antoine Zimmermann. PROFILES 2016 Workshop. (2016).