COQA

A question answering system built on a canonicalized open knowledge base.

COQA is an open source research project providing a state-of-the-art canonicalizer for building fast, efficient, and accurate question answering systems. It aims to leverage various canonicalization techniques to build a less redundant and better-linked knowledge base, and to integrate that knowledge base with current open question answering systems.

Objectives

COQA investigates how entity resolution can be done by canonicalization and how canonicalization enables more effective question answering.

Canonicalization
Our team will focus on designing more effective similarity functions to make canonicalization more accurate. Several approximation methods will also be examined to simplify the computation.

Automatic Population
Given the open knowledge base, COQA will automatically mine rules and add new assertions to the knowledge base. This enables the system to answer a wider range of questions.
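
As an illustration only, rule mining and population can be sketched as follows. The sketch assumes assertions are (subject, relation, object) triples and considers only simple rules of the form "body relation implies head relation", scored by the share of body pairs that also appear under the head relation. The function names, the thresholds, and the rule form are assumptions made for this example, not COQA's actual design.

```python
from collections import defaultdict


def mine_rules(triples, min_support=2, min_confidence=0.6):
    """Find rules body(x, y) => head(x, y) from co-occurring relation pairs."""
    pairs = defaultdict(set)
    for subj, rel, obj in triples:
        pairs[rel].add((subj, obj))

    rules = []
    for body, body_pairs in pairs.items():
        for head, head_pairs in pairs.items():
            if body == head:
                continue
            # Support: argument pairs asserted under both relations.
            support = len(body_pairs & head_pairs)
            # Confidence: fraction of body pairs also seen under head.
            confidence = support / len(body_pairs)
            if support >= min_support and confidence >= min_confidence:
                rules.append((body, head, confidence))
    return rules


def apply_rules(triples, rules):
    """Return assertions implied by the rules that are not yet in the KB."""
    known = set(triples)
    inferred = set()
    for subj, rel, obj in known:
        for body, head, _confidence in rules:
            if rel == body and (subj, head, obj) not in known:
                inferred.add((subj, head, obj))
    return inferred


# Hypothetical toy knowledge base.
kb = [
    ("a", "born in", "x"), ("a", "is from", "x"),
    ("b", "born in", "y"), ("b", "is from", "y"),
    ("c", "born in", "z"),
    ("d", "is from", "w"), ("e", "is from", "v"),
]
rules = mine_rules(kb)
new_assertions = apply_rules(kb, rules)
```

On this toy data only "born in" implies "is from" passes the thresholds, so the single new assertion concerns subject "c". A real system would need richer rule shapes and a stricter confidence measure.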

Open QA system
COQA is one of the few question answering systems based on a canonicalized knowledge base. Its performance will be compared with that of state-of-the-art QA systems.


Learn more

Methodologies

COQA builds on the canonicalization techniques proposed in the paper “Towards Practical Open Knowledge Base Canonicalization” and on existing open source question answering systems, and investigates how their implementations can be improved for better performance.

Knowledge Bases
A medium-size open knowledge base with around one million assertions will be used in this project. Curated knowledge bases such as Freebase will also serve as a calibration tool for canonicalization.

Synonym Identification
A similarity function combining attribute overlap, string similarity, string identity, and IDF token overlap will be implemented for the hierarchical agglomerative clustering (HAC) method.
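
To make the idea concrete, here is a minimal sketch of one of these signals, IDF token overlap, plugged into a naive average-linkage HAC loop. It assumes one common formulation in which each token is weighted by 1 / log(1 + document frequency), so that rare tokens count more than frequent ones. The function names, the threshold, and the choice of average linkage are assumptions for illustration; the project's actual similarity function combines several signals and may differ.

```python
import math
from collections import Counter


def document_frequencies(phrases):
    """Count in how many phrases each lowercase token appears."""
    df = Counter()
    for phrase in phrases:
        df.update(set(phrase.lower().split()))
    return df


def idf_token_overlap(a, b, df):
    """Weighted token overlap; rare tokens contribute more than common ones."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0

    def weight(token):
        # Unseen tokens are treated as having document frequency 1.
        return 1.0 / math.log(1 + df.get(token, 1))

    shared = sum(weight(t) for t in tokens_a & tokens_b)
    return shared / sum(weight(t) for t in union)


def hac(phrases, sim, threshold):
    """Naive average-linkage agglomerative clustering with a stop threshold."""
    clusters = [[p] for p in phrases]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(sim(a, b) for a in clusters[i] for b in clusters[j])
                score = total / (len(clusters[i]) * len(clusters[j]))
                if score > best:
                    best, pair = score, (i, j)
        if best < threshold:
            break  # No remaining pair of clusters is similar enough.
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical noun phrases from an open knowledge base.
phrases = ["barack obama", "obama", "new york", "new york city"]
df = document_frequencies(phrases)
clusters = hac(phrases, lambda a, b: idf_token_overlap(a, b, df), threshold=0.3)
```

This loop recomputes every pairwise score on each pass, which is only workable for small inputs; at the scale of a million assertions, blocking or another approximation would be needed.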

Evaluation Metrics
COQA will be evaluated mainly by recall, precision, and F1 score. Detailed definitions and equations are presented on the project details page.
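
For readers who want a quick feel for these metrics, the sketch below computes a pairwise variant for clusterings: a pair of phrases counts as a hit when both the predicted and the gold clustering place the two phrases together. The pairwise variant is an assumption chosen for illustration, and the definitions on the project details page take precedence.

```python
from itertools import combinations


def same_cluster_pairs(clusters):
    """Return every unordered pair of items that share a cluster."""
    pairs = set()
    for cluster in clusters:
        pairs.update(combinations(sorted(cluster), 2))
    return pairs


def pairwise_prf(predicted, gold):
    """Pairwise precision, recall, and F1 of a predicted clustering."""
    pred_pairs = same_cluster_pairs(predicted)
    gold_pairs = same_cluster_pairs(gold)
    hits = len(pred_pairs & gold_pairs)
    precision = hits / len(pred_pairs) if pred_pairs else 0.0
    recall = hits / len(gold_pairs) if gold_pairs else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: one of three predicted pairs is correct,
# and one of two gold pairs is recovered.
precision, recall, f1 = pairwise_prf(
    predicted=[["a", "b", "c"], ["d"]],
    gold=[["a", "b"], ["c", "d"]],
)
```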


Learn more

Schedule

The project schedule strictly follows the FYP timetable presented on the course homepage.

1st Phase
From the beginning of the first semester to the end of September, our team will study the necessary background and submit a project plan by September 30.

2nd Phase
From October to the end of the first semester, the project team will complete about half of the project development and submit an interim report.

Final Phase
During the second semester, the development and implementation will be completed, tested, and delivered to the supervisor.


Learn more