Automatic question answering system; Taxation; Ontology knowledge base; Lucene Analyzer; Chinese Segmentation; Cosine similarity algorithm

Automatic question answering has become a popular area of computer applications. Such a system lets users pose questions in natural language and returns answers in natural language, so users can obtain the information they need quickly and accurately.

Question answering systems fall into two categories: open-domain and closed-domain. The former is not restricted to any field, so users may ask about anything that interests them and receive an answer, while the latter answers questions in a specific domain. Because there are many categories of tax, governed by numerous laws and regulations, taxation is complex for individuals. For a company, answering such questions purely by hand consumes considerable manpower and material resources. A system that can answer questions in the tax domain is therefore desirable.

Based on the characteristics of taxation, this paper designs and implements an automatic question answering system built on Lucene. The system consists of three main modules: question processing, information retrieval, and answer extraction. Common questions and answers in the tax field are taken from the book "Concise Tax Knowledge Q&A" (《简明税收知识问答》) and then expanded manually and dynamically to build a tax-domain ontology knowledge base. The Lucene framework is used to build the system because of its powerful functionality. Chinese word segmentation is a key technology in question answering, since it determines how accurately the system understands the user's question. This paper therefore tests the segmentation quality of Lucene's Chinese analyzers experimentally and selects the analyzer with the best results. The cosine similarity algorithm is used to rank the candidate answers and return the best one to the user. If the system cannot answer a question, the question is passed to an expert.

By combining automatic question answering with expert answers, the system provides users with more accurate answers, reduces the company's labor costs, and improves the efficiency of answering.

Keywords: automatic question answering system; taxation; ontology knowledge base; Lucene analyzer; Chinese word segmentation; cosine similarity algorithm
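
The answer-ranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that questions have already been segmented into tokens (the job of the Chinese analyzer), that raw term frequencies serve as vector weights, and that a hypothetical `threshold` decides when a question is handed to an expert. The names `cosine_similarity` and `best_answer` and the sample data are illustrative only.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-frequency vectors."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty token list is similar to nothing
    return dot / (norm_a * norm_b)


def best_answer(question_tokens, candidates, threshold=0.5):
    """Return the answer of the most similar stored question.

    candidates: list of (stored_question_tokens, answer) pairs.
    Returns None when no candidate reaches the (assumed) threshold,
    which signals that the question should go to an expert.
    """
    if not candidates:
        return None
    scored = [(cosine_similarity(question_tokens, q), a) for q, a in candidates]
    score, answer = max(scored, key=lambda pair: pair[0])
    return answer if score >= threshold else None


# Hypothetical, pre-segmented knowledge-base entries.
candidates = [
    (["personal", "income", "tax", "threshold"], "Answer A"),
    (["value", "added", "tax", "rate"], "Answer B"),
]

print(best_answer(["personal", "income", "tax"], candidates))  # Answer A
print(best_answer(["property", "deed"], candidates))           # None
```

Returning `None` rather than a low-scoring answer mirrors the fallback described in the abstract: a question the system cannot answer is passed to an expert instead of receiving a weak match.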



