Zhicheng Dou


Zhicheng Dou has been a professor at the School of Information, Renmin University of China, since September 2014. He received his Ph.D. and B.S. degrees in computer science and technology from Nankai University in 2008 and 2003, respectively. After receiving his Ph.D., he worked at Microsoft Research as a researcher for more than six years (from July 2008 to September 2014). His research interests include information retrieval, search engines, web data mining, and big data analysis. Zhicheng Dou is not purely a researcher: besides writing papers, he also enjoys writing code to turn cool ideas into real systems.

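Dr. Zhicheng Dou received his B.S. and Ph.D. degrees in computer science from Nankai University in 2003 and 2008, respectively. After completing his Ph.D. in 2008, he joined Microsoft Research Asia as a researcher. In September 2014 he joined Renmin University of China as a special research fellow. His main research interests include information retrieval, web search, data mining, and big data. He has published more than 30 papers in well-known international conferences and journals (such as SIGIR, WWW, CIKM, WSDM, EMNLP, and IEEE TKDE). He received a Best Paper nomination at the international information retrieval conference SIGIR 2013 and the Best Paper Award at the Asia information retrieval conference AIRS 2012. He serves as a program committee member for several international conferences (such as SIGIR, WWW, KDD, WSDM, CIKM, and EMNLP). He is chair of the AIRS Steering Committee, an executive committee member of the Information Retrieval Technical Committee of the Chinese Information Processing Society of China, and a member of the Intelligent Services Technical Committee of the Chinese Association for Artificial Intelligence. He is also a member of the Youth Working Committee of the Chinese Information Processing Society of China (YSSNLP) and a corresponding member of the Big Data Expert Committee of the China Computer Federation. He served as general co-chair of AIRS 2016 (the Asia information retrieval conference), as program committee chair of AIRS 2017, and as one of the organizers of the Intent-2 and IMINE tasks at NTCIR, the information retrieval evaluation conference of Japan's National Institute of Informatics (NII). He is a member of ACM, IEEE, and the China Computer Federation.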

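Beyond his research work, Dr. Dou enjoys turning research ideas into working systems. During his time at Microsoft Research Asia, he took part in the development of several projects, such as 时事探针WebStudioProjectQ and WebSensor. He holds multiple patents, and several technologies he helped develop have been successfully transferred into Microsoft products (such as Bing search and Office).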

Information Retrieval, Web Search, Data Mining, Big Data


Search Result Diversification

Studies show that the vast majority of queries submitted to search engines are short and vague about the user's intent. Different users may have completely different information needs and goals when issuing exactly the same query. For example, User A issues the query "apple" to find information about the company Apple, while User B issues the same query to find information about the fruit. For such a query, a search engine returns a list of documents that mixes different topics, and users must spend time picking out the information they want. Search result diversification is an effective way to address this problem: it provides a list of results that covers as many aspects of the query as possible, so that most users can be satisfied by the top results.
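The idea of covering as many aspects as possible near the top of the list can be illustrated with a small sketch. This is a simple greedy coverage heuristic written for illustration, not the specific method used in this research; the function name `diversify`, the document ids, and the aspect labels are all hypothetical, and it assumes each document has already been labeled with the aspects it covers.

```python
def diversify(ranked_docs, doc_aspects, k):
    """Greedy re-ranking: repeatedly pick the document that covers the most
    aspects not yet covered by the documents already selected."""
    remaining = list(ranked_docs)
    covered = set()
    result = []
    while remaining and len(result) < k:
        # max() returns the first maximal element, so ties keep the original rank order
        best = max(remaining, key=lambda d: len(doc_aspects.get(d, set()) - covered))
        result.append(best)
        covered |= doc_aspects.get(best, set())
        remaining.remove(best)
    return result


# Hypothetical results for the query "apple", in their original ranked order
ranked = ["d1", "d2", "d3", "d4"]
aspects = {
    "d1": {"company"},
    "d2": {"company"},
    "d3": {"fruit"},
    "d4": {"company"},
}
top = diversify(ranked, aspects, 3)
print(top)
```

In this toy input the only document about the fruit moves from rank 3 to rank 2, so both interpretations of the query appear within the top two results. A practical system would also weigh relevance scores and the likelihood of each intent rather than counting aspects alone.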

Query Facet/Dimension Mining

We address the problem of finding multiple groups of words or phrases that explain the underlying aspects of a query, which we refer to as query dimensions or facets. We assume that the important aspects of a query are usually presented and repeated in the query's top retrieved documents in the form of lists, so query facets can be mined by aggregating these significant lists.
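The aggregation step can be sketched roughly as follows. This is a deliberately simplified illustration under stated assumptions, not the actual mining algorithm: it assumes the lists have already been extracted from each top document, merges any lists that share an item into one candidate facet, and keeps only items found in at least `min_support` distinct documents. The function name `mine_facets`, the `min_support` parameter, and the sample data are hypothetical.

```python
def mine_facets(doc_lists, min_support=2):
    """doc_lists maps a document id to the lists of items extracted from it.
    Returns one list of items per facet, most widely supported items first."""
    facets = []  # each facet maps item -> set of documents that contain it
    for doc_id, lists in doc_lists.items():
        for raw_items in lists:
            items = {item.strip().lower() for item in raw_items}
            merged = {item: {doc_id} for item in items}
            untouched = []
            for facet in facets:
                if items.isdisjoint(facet):
                    untouched.append(facet)
                else:
                    # The new list shares an item with this facet: fold it in
                    for item, docs in facet.items():
                        merged.setdefault(item, set()).update(docs)
            facets = untouched + [merged]

    result = []
    for facet in facets:
        kept = [item for item, docs in facet.items() if len(docs) >= min_support]
        kept.sort(key=lambda item: (-len(facet[item]), item))
        if kept:
            result.append(kept)
    return result


# Hypothetical lists extracted from the top documents for the query "watches"
doc_lists = {
    "doc1": [["Gold", "Silver", "Steel"]],
    "doc2": [["gold", "silver", "leather"], ["Men", "Women"]],
    "doc3": [["men", "women", "kids"]],
}
facets = mine_facets(doc_lists)
print(facets)
```

Here the lists repeated across documents yield two facets, one about materials and one about the intended wearer, while items seen in only one document are dropped as unsupported.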