 Fraud detection in Crowdfunding based on text analysis and cost-sensitive SVM    


 Crowdfunding ; fraud detection ; persuasion theory ; latent Dirichlet allocation ; cost-sensitive support vector machine


 Crowdfunding ; fraud detection ; Theory of Persuasion ; LDA ; Cost-Sensitive SVM    



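In recent years, a growing number of investors and entrepreneurs have turned their attention to crowdfunding. In China, small, medium and micro enterprises have long struggled to obtain financing, while private capital has lacked effective investment channels. Crowdfunding, as a new approach to financial innovation, addresses the urgent needs of both investors and entrepreneurs. As crowdfunding has grown in popularity, many researchers have taken a strong interest in the field and have studied crowdfunding platforms, projects and participants from multiple angles. In particular, development models, driving-factor analysis, project success-rate prediction and project recommendation have been studied in depth, with some results achieved.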

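As crowdfunding has attracted wide attention, project fraud has also emerged. Multiple fraud cases have occurred both in China and abroad, involving various forms of crowdfunding and many well-known platforms. They have caused considerable losses to investors and sounded an alarm for regulation of the field. However, research on fraud in crowdfunding remains extremely limited. Most studies in China and abroad discuss only legal regulation and do not start from the crowdfunding projects themselves to offer an effective way to control fraud. Drawing on the Veblen effect and persuasion theory, this paper uses an LDA model to extract the topic distribution of project descriptions, uses a sentiment analysis model to extract sentiment values before the project end date, and combines these with the projects' numerical and statistical information to build a feature set. A cost-sensitive support vector machine is then trained on this feature set to obtain a crowdfunding fraud detection model. A crawler built with the Python modules urllib, re and beautifulsoup collected detailed project data from Zhongchou (http://www.zhongchou.cn), the largest crowdfunding platform in China. An empirical analysis on these data demonstrates the feasibility and practicality of the model and verifies the effectiveness of the features.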
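
The cost-sensitive training step mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this paper: it assumes a linear SVM fitted by subgradient descent, in which hinge-loss violations on the fraud class (label +1) are penalised more heavily than violations on the legitimate class (label -1). The function names, the cost values and the two toy feature columns are hypothetical and chosen only for illustration.

```python
import random


def train_cost_sensitive_svm(X, y, cost_pos=5.0, cost_neg=1.0,
                             lam=0.01, lr=0.1, epochs=200, seed=0):
    """Fit a linear SVM whose hinge penalty depends on the class.

    y holds +1 (fraud) or -1 (legitimate). cost_pos and cost_neg are
    assumed misclassification costs, not values taken from the paper.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi, yi = X[i], y[i]
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            cost = cost_pos if yi == 1 else cost_neg
            # Subgradient step on the L2 regulariser (shrinks the weights).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Subgradient step on the class-weighted hinge loss.
            if margin < 1.0:
                w = [wj + lr * cost * yi * xj for wj, xj in zip(w, xi)]
                b += lr * cost * yi
    return w, b


def predict(w, b, x):
    """Return +1 (fraud) or -1 (legitimate) for one feature vector."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy data: two hypothetical standardized features per project.
X = [[2.0, 0.0], [3.0, 0.5], [-2.0, 0.0], [-3.0, -0.5]]
y = [1, 1, -1, -1]

w, b = train_cost_sensitive_svm(X, y)
print([predict(w, b, x) for x in X])
```

In a full pipeline the rows of X would be built from the topic distribution, the sentiment values and the numerical project information, and the ratio of the two costs would be chosen to reflect how much more costly a missed fraud is than a false alarm.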





In recent years, an increasing number of entrepreneurs, investors and aspirants have turned their attention to crowdfunding. Many researchers have contributed to a profound and multi-dimensional study of crowdfunding, covering development patterns, driving-factor analysis, success rate prediction and project recommendation.

Overly rapid development breeds fraud, which brings heavy losses to investors and platforms. Several fraud cases have occurred in China and in America, involving various categories and platforms. However, studies on fraud detection in crowdfunding are currently limited in both countries. This paper studies fraudulent projects on online crowdfunding platforms and proposes an automatic fraud detection model using text mining and machine learning, based on persuasion theory and social influence theory. A web spider written in Python crawls project details from Zhongchou.com (http://www.zhongchou.com), and the established model is assessed on these data.

To a degree, the model can capture fraudulent projects and give risk warnings. In particular, this paper proposes features based on psychological and sociological theory and verifies their effect through comparative experiments.


Key Words: Crowdfunding; fraud detection; Theory of Persuasion; LDA; Cost-Sensitive SVM





