A huge amount of information is posted on the Internet every day in the form of free, unstructured text. This information can introduce new facts or modify existing ones. We address the dynamic, evolving nature of Web data from this perspective by extracting new relations from the Web and merging them with existing knowledge bases. Open Information Extraction (Open IE) is a subproblem of the information extraction field that aims to extract relations in the open domain without relying on a predefined set of relations. After a thorough literature review, we found drawbacks in the way most existing Open IE systems represent relations. In collaboration with the NLP & Semantic Computing Group at the University of Passau, we defined a new semantic representation for open relations that can represent complex relation structures while remaining easy to process by downstream applications. Moreover, we presented two Open IE systems that rely on this representation. One of these systems outperformed the state-of-the-art systems on the task of n-ary relation extraction. The outcomes of this collaboration include a submission to the ACL 2016 conference and a plan for future work. In collaboration with SOTON, we target the problem of multilingual fact extraction and alignment. During this collaboration we represented the WDAqua project at a hackathon organized by BBC News Labs in London. The outcome of this hackathon was a prototype system for fact extraction and alignment between news documents, along with a plan for a future research paper.

Started:
2015-07-08
Supervisors: