The use of structured and semi-structured data as a source of information and as a basis for decision-making, alongside other sources, is not yet fully realised. Discovering, filtering, and ranking content within the web of data requires technologies and principles different from those of the traditional web. With increasing efforts to use web data for information retrieval, challenges connected to human interaction with data arise. Published data is heterogeneous in format, structure, licence, portal (storage), metadata, quality, and size, and is therefore not always easy to find. Discovering data and datasets can be difficult for non-technical users, and can also present challenges to experts. Across the whole process of a user’s interaction with data, numerous factors potentially contribute to the success of the user’s task. The user can be a person involved in constructing or designing adequate tools, as well as a person trying to “get an answer to a question”. As a first step, a literature survey looks at the process of Question Answering and the discovery of data as a whole, considering the user at three distinct stages: before the query is asked, during query processing, and after the query. Building on this higher-level picture, a gap analysis will determine the importance of the user’s perspective at each of these stages. Subsequently, we will use an experimental mixed-methods design consisting of semi-structured interviews and a search log analysis for dataset search, and possibly other observational methods, depending on the focus of the resulting research questions. This will create a better understanding of the challenges connected to data search and use, particularly regarding a user’s assessment of data or of an answer to a question. These findings can inform the development of data discovery tools, such as question answering systems, especially in determining the presentation of results or answers.

Started:
2015-11-02
Supervisors:

Publications

Learning when searching for web data. Laura Koesten, Emilia Kacprzak, Jenifer Tennison. Search as Learning (SAL) workshop @SIGIR 2016.

Position Paper: Dataset profiling for un-Linked Data. Emilia Kacprzak, Laura Koesten, Tom Heath, Jeni Tennison. PROFILES Workshop. ESWC 2016 (Satellite Events).

The Trials and Tribulations of Working with Structured Data - a Study on Information Seeking Behaviour. Laura Koesten, Emilia Kacprzak, Jeni Tennison, Elena Simperl. CHI 2017: 1277-1289.

Searching Data Portals - More Complex Than We Thought? Laura M Koesten, Jaspreet Singh. CHIIR 2017. Workshop on Supporting Complex Search Tasks, SCST@CHIIR 2017: 25-28.

A Query Log Analysis of Dataset Search. Emilia Kacprzak, Laura M Koesten, Luis-Daniel Ibáñez, Elena Simperl, Jeni Tennison. ICWE 2017: 429-436.