Title: Data-driven Approaches for Large Scale Knowledge Graph Construction
Date and time: January 5, 2017, 10:00
Venue: B403
Speaker: Dr. Yanghua Xiao
Affiliation: Fudan University
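Speaker bio: Dr. Yanghua Xiao is the founder of Knowledge Works (知识工场), an associate professor and doctoral supervisor at the School of Computer Science, Fudan University, a Young Scientist of the national 973 Program, executive deputy director of the Shanghai Internet Big Data Engineering Center, an expert committee member of several provincial- and ministerial-level key laboratories and engineering centers, a senior technical advisor or chief scientist for listed companies and other sizable enterprises, and head of the Knowledge Works Laboratory at Fudan University. His main research interests include big data management and mining, graph databases, and knowledge graphs. He has visited Baylor College of Medicine (USA), Microsoft Research Asia, and the Chinese University of Hong Kong. He has received a Second Prize of the Ministry of Education's University Scientific Research Achievement Award, the 2014 CCF Natural Science Second Prize, and a nomination award for the ACM (CCF) Shanghai Outstanding Young Scientist Award. As principal investigator, he has led nearly 30 national-level and provincial- or ministerial-level research projects as well as industry projects. To date he has published more than 70 papers in journals and conferences ranked A or B by the China Computer Federation (CCF), including SIGMOD, VLDB, ICDE, IJCAI, AAAI, and WWW. He is a young associate editor of the SCI-indexed journal Frontiers of Computer Science, has served more than 50 times as a program committee member of top-tier and well-known international conferences, regularly serves as a review expert for projects and awards administered by more than 10 national, provincial, and municipal ministries and commissions, and regularly reviews for more than 20 well-known international journals. He is a member of ACM and IEEE and a senior member of CCF. He led the development of China's first knowledge base cloud service platform (the Knowledge Works platform, kw.fudan.edu.cn), which has served companies in artificial intelligence and big data through nearly 200 million API calls.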
Abstract:
Building large scale knowledge graphs has attracted wide research interest because knowledge graphs are broadly applied in search, recommendation, QA, and many other tasks related to text understanding. Despite great progress in this direction, current approaches still have many weaknesses in terms of effectiveness and efficiency. First, knowledge extraction still requires substantial human effort in both feature engineering and data labeling. Second, the knowledge base obtained after extraction usually suffers from low quality, especially inconsistency and incompleteness. Third, domain-specific knowledge base construction suffers from the sparsity of labeled data. In this talk, I will present the progress of my team (KW@FUDAN) in attacking these problems. Specifically, I will talk about (1) how we use deep learning models to build an end-to-end knowledge extraction system so that knowledge graph construction is fully automated without any human effort; (2) how we transfer extraction models from a source domain with rich labeled data to a target domain with sparse labeled data under the deep learning framework; (3) how we develop automatic inference mechanisms to complete and clean a knowledge base. Based on these solutions, our knowledge bases published online, including CN-DBpedia and ProbasePlus, have served industry with more than 200 million API calls.
Host: Associate Professor Jinshuo Liu