Learning Automatic Question Answering from Community Data
Although traditional search engines can retrieve thousands or even millions of web links related to input keywords, users must still manually locate answers to their information needs across multiple returned documents or issue further searches. Question Answering (QA) is an effective paradigm for addressing this problem: it automatically finds one or more accurate and concise answers to natural language questions. Existing QA systems often rely on off-the-shelf Natural Language Processing (NLP) resources and tools that are not optimized for the QA task. Additionally, they tend to require hand-crafted rules to extract properties from input questions, which makes building comprehensive QA systems time- and labor-intensive. In this thesis, we study the potential of using Community Question Answering (cQA) archives as a central building block of QA systems. To that end, this thesis proposes two cQA-based query expansion and structured query generation approaches, one employed in Text-based QA and the other in Ontology-based QA. In addition, building on the above structured query generation method, an end-to-end open-domain Ontology-based QA system is developed and evaluated on a standard factoid QA benchmark.