Open-domain question answering framework using Wikipedia

Saleem Ameen, Hyunsuk Chung, Soyeon Caren Han, Byeong Ho Kang

Research output: Chapter in Book/Conference paper › Conference paper › peer-review

1 Citation (Scopus)

Abstract

This paper explores the feasibility of implementing a model for an open-domain, automated question answering framework that leverages Wikipedia’s knowledge base. While Wikipedia implicitly comprises answers to common questions, the disambiguation of natural language and the difficulty of developing an information retrieval process that produces answers with specificity present pertinent challenges. However, observational analysis suggests that it is possible to discount the syntactical and lexical structure of a sentence in contexts where questions contain a specific target entity (words that identify a person, location or organisation) and correspondingly query a property related to it. To investigate this, we implemented an algorithmic process that extracted the target entity from the question using CRF-based named entity recognition (NER) and utilised all remaining words as potential properties. Using DBpedia, an ontological database of Wikipedia’s knowledge, we searched for the closest matching property that would produce an answer by applying standardised string matching algorithms including the Levenshtein distance, similar text and Dice’s coefficient. Our experimental results illustrate that using Wikipedia as a knowledge base produces high precision for questions that contain a singular unambiguous entity as the subject, but lower accuracy for questions where the entity exists as part of the object.
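The property-matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the target entity has already been removed from the question by an NER step, it uses a hypothetical list of property names rather than a live DBpedia lookup, and the way the two scores are normalised and averaged is an assumption made here for illustration. Only the Levenshtein distance and Dice's coefficient (over character bigrams) are shown.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming, keeping one previous row."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; this normalisation is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def bigrams(s: str) -> list:
    """Character bigrams of a string, in order."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice(a: str, b: str) -> float:
    """Dice's coefficient over character bigrams (multiset overlap)."""
    ca, cb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ca.values()) + sum(cb.values())
    if total == 0:
        return 1.0 if a == b else 0.0
    overlap = sum((ca & cb).values())
    return 2 * overlap / total


def best_property(candidate_words, properties):
    """Return (property, score) for the closest match to any candidate word.

    Averaging the two measures is an illustrative choice, not taken
    from the paper.
    """
    best = (None, 0.0)
    for word in candidate_words:
        for prop in properties:
            w, p = word.lower(), prop.lower()
            score = (levenshtein_similarity(w, p) + dice(w, p)) / 2
            if score > best[1]:
                best = (prop, score)
    return best


if __name__ == "__main__":
    # Words left over from "What is the population of Hobart?" once the
    # entity "Hobart" has been removed (the NER step is assumed done).
    remaining = ["what", "is", "the", "population", "of"]
    # Hypothetical property names standing in for a DBpedia lookup.
    properties = ["populationTotal", "areaTotal", "birthPlace"]
    print(best_property(remaining, properties))
```

Because every remaining word is scored against every property, no parsing of the question is needed, which mirrors the abstract's point that syntactic structure can be discounted when a single target entity is present.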

Original language: English
Title of host publication: AI 2016
Subtitle of host publication: Advances in Artificial Intelligence - 29th Australasian Joint Conference, Proceedings
Editors: Byeong Ho Kang, Quan Bai
Publisher: Springer-Verlag Italia Srl
Pages: 623-635
Number of pages: 13
ISBN (Print): 9783319501260
Publication status: Published - 2016
Externally published: Yes
Event: 29th Australasian Joint Conference on Artificial Intelligence, AI 2016 - Hobart, Australia
Duration: 5 Dec 2016 to 8 Dec 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9992 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th Australasian Joint Conference on Artificial Intelligence, AI 2016
Country/Territory: Australia
City: Hobart
Period: 5/12/16 to 8/12/16
