The Knowledge Computing Lab, Peking University
at The National Engineering Research Center for Software Engineering

Introduction

Welcome to The Knowledge Computing Lab (KCL) of Peking University, a center of excellence for knowledge computing research and practice established in 2017.

The Knowledge Computing Lab at Peking University is part of the National Engineering Research Center for Software Engineering, where faculty members, students and software engineers work together on advanced algorithms for natural language understanding and programming language comprehension. Our research topics include, but are not limited to, relation extraction, sequence modeling, sentiment analysis, code knowledge graphs, code summarization and code retrieval. We have won more than 10 awards in national and international AI competitions. We distinguish ourselves by being practice-oriented: our research has been applied in large-scale real-world projects in areas such as enterprise intelligence, medical AI, petition handling, the legal system, and human resources.

The National Engineering Research Center for Software Engineering was established in July 1996. Professor YANG Fuqing, Academician of the Chinese Academy of Sciences, served as the founding director. As a national research institution, the center focuses on technological innovation and has successfully built SDKs and supporting tools for industrial software development and production. It has established itself as a pioneer in many emerging and strategic areas such as government big data, software and system security, human-machine integration, intelligent sensing computing, smart cities, and blockchain.

We are looking for highly self-motivated students to join us as PhD candidates (several positions open for Fall 2024) or Master's students. If you share our vision and are interested in working with us, please send your resume to wye#pku.edu.cn.

Papers

No. Title
01 Bo Li, Dingyao Yu, Wei Ye*, Jinglei Zhang, Shikun Zhang*. Sequence Generation with Label Augmentation for Relation Extraction. AAAI'23 (CCF Rank A, Full Paper)
02 Bo Li, Wei Ye*, Jinglei Zhang, Shikun Zhang*. Reviewing Labels: Label Graph Network with Top-k Prediction Set for Relation Extraction. AAAI'23 (CCF Rank A, Full Paper)
03 Xi Xiangyu, Jianwei Lv, Shuaipeng Liu, Wei Ye*, Fan Yang and Guanglu Wan. MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts. EMNLP'22 (CCF Rank B, Full Paper)
04 Chaoya Jiang, Haiyang Xu, Chenliang Li, Ming Yan, Wei Ye*, Shikun Zhang*, Bin Bi and Songfang Huang. TRIPS: Efficient Vision-and-Language Pre-training with Text-relevant Patch Selection. EMNLP'22 (CCF Rank B, Full Paper)
05 Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, Renjie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang. USB: A Unified Semi-supervised Learning Benchmark. NeurIPS'22 (CCF Rank A, Datasets and Benchmarks Track)
06 Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu. Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation. NeurIPS'22 (CCF Rank A, Full Paper)
07 Peiyang Liu, Xi Xiangyu, Wei Ye* and Shikun Zhang. Label Smoothing for Text Mining. COLING'22 (CCF Rank B, Full Paper)
08 Zile Qiao, Wei Ye*, Tong Zhang, Tong Mo, Weiping Li and Shikun Zhang. Exploiting Hybrid Semantics of Relation Paths for Multi-hop Question Answering Over Knowledge Graphs. COLING'22 (CCF Rank B, Full Paper)
09 Yinyi Wei, Shuaipeng Liu, Jianwei Lv, Xi Xiangyu, Wei Ye*, Tong Mo, Fan Yang and Guanglu Wan. DESED: Dialogue-based Explanation for Sentence-level Event Detection. COLING'22 (CCF Rank B, Full Paper)
10 Rui Xie, Tianxiang Hu, Wei Ye*, Shikun Zhang*. Low-Resources Project-Specific Code Summarization. ASE'22 (CCF Rank A, Full Paper)
11 Xiangyu Xi, Chenxu Lv, Yuncheng Hua, Wei Ye*, Chaobo Sun, Shuaipeng Liu, Fan Yang, Guanglu Wan. A Low-Cost, Controllable and Interpretable Task-Oriented Chatbot: With Real-World After-Sale Services as Example. SIGIR'22 (CCF Rank A, Industry Track)
12 Zheyu Ying, Xueyang Liu*, Jinglei Zhang, Rui Xie, Guochang Wen, Xiongfeng Xiao, Shikun Zhang. 3Rs: Data Augmentation Techniques Using Document Contexts For Low-Resource Chinese Named Entity Recognition. IJCNN'22 (CCF Rank C, Full Paper)
13 Tong Zhang, Wei Ye*, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun*, Shikun Zhang, Haibo Zhang, Wen Zhao. Frequency-Aware Contrastive Learning for Neural Machine Translation. AAAI'22 (CCF Rank A, Full Paper)
14 Peiyang Liu, Xi Wang, Sen Wang, Wei Ye*, Xiangyu Xi and Shikun Zhang. Improving Embedding-based Large-scale Retrieval via Label Enhancement. In EMNLP'21 (CCF Rank B, Findings)
15 Peiyang Liu, Xi Wang, Lin Wang, Wei Ye*, Xiangyu Xi and Shikun Zhang. Distilling Knowledge from BERT into Simple Fully Connected Neural Networks for Efficient Vertical Retrieval. In CIKM'21 (CCF Rank B, Applied Paper)
16 Xi Xiangyu, Wei Ye*, Shikun Zhang*, Quanxiu Wang, Huixing Jiang and Wei Wu. Capturing Event Argument Interaction via A Bi-Directional Entity-Level Recurrent Decoder. In ACL'21 (CCF Rank A, Full Paper)
17 Tong Zhang, Long Zhang, Wei Ye*, Bo Li, Jinan Sun, Shikun Zhang*, Xiaoyu Zhu and Wen Zhao. Point, Disambiguate and Copy: Incorporating Bilingual Dictionaries for Neural Machine Translation. In ACL'21 (CCF Rank A, Full Paper)
18 Luyao Ma, Yating Zhang, Tianyi Wang, Xiaozhong Liu, Wei Ye*, Changlong Sun and Shikun Zhang. Legal Judgment Prediction with Multi-Stage Case Representation Learning in the Real Court Setting. In SIGIR'21 (CCF Rank A, Full Paper)
19 Long Zhang, Tong Zhang, Haibo Zhang, Baosong Yang, Wei Ye* and Shikun Zhang. Multi-Hop Transformer for Document-Level Machine Translation. In NAACL'21 (CCF Rank C, Full Paper)
20 Peiyang Liu, Sen Wang, Xi Wang, Wei Ye*, Shikun Zhang. QuadrupletBERT: An Efficient Model For Embedding-Based Large-Scale Retrieval. In NAACL'21 (CCF Rank C, Short Paper)
21 Rui Xie, Wei Ye*, Jinan Sun, Shikun Zhang. Exploiting Method Names to Improve Code Summarization: A Deliberation Multi-Task Learning Approach. ICPC'21 (CCF Rank B, Full Paper)
22 Xiangyu Xi, Wei Ye*, Tong Zhang, Quanxiu Wang, Shikun Zhang, Huixing Jiang, Wei Wu. Improve Event Detection by Exploiting Label Hierarchy. In ICASSP'21 (CCF Rank B, Full Paper)
23 Tianxiang Hu, Jingxi Liang, Wei Ye*, Shikun Zhang. Keyword-Aware Encoder for Abstractive Text Summarization. In DASFAA'21 (CCF Rank B, Full Paper)
24 Bo Li, Wei Ye*, Canming Huang and Shikun Zhang. Multi-view Inference for Relation Extraction with Uncertain Knowledge. In AAAI'21 (CCF Rank A, Full Paper)
25 Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye*, Shikun Zhang and Tao Qin. Automatic Song Writing with Pre-training and Alignment Constraint. In AAAI'21 (CCF Rank A, Full Paper)
26 Shikun Zhang, Rui Xie, Wei Ye*, Long Chen. Keyword-Based Code Auto-Summarization. Journal of Computer Research and Development, 2020 (CCF Chinese Rank A)
27 Haixin Wang, Tianhao Zhang, Muzhi Yu, Jinan Sun, Wei Ye, Chen Wang, Shikun Zhang. Stacking Networks Dynamically for Image Restoration Based on the Plug-and-Play Framework. In ECCV'20 (CCF Rank B, Full Paper)
28 Bo Li, Wei Ye*, Zhonghao Sheng, Rui Xie, Xiangyu Xi and Shikun Zhang. Graph Enhanced Dual Attention Network for Document-Level Relation Extraction. In COLING'20 (CCF Rank B, Full Paper)
29 Wei Ye, Rui Xie, Jinglei Zhang, Tianxiang Hu, Xiaoying Wang, Shikun Zhang. Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning. In WWW'20 (CCF Rank A, Full Paper)
30 Jinglei Zhang, Rui Xie, Wei Ye*, Yuhan Zhang, Shikun Zhang. Exploiting Code Knowledge Graph for Bug Localization via Bi-directional Attention. In ICPC'20 (CCF Rank B, Full Paper)
31 Tong Zhang, Wei Ye*, Xiangyu Xi, Long Zhang, Shikun Zhang, Wen Zhao. Leveraging Human Prior Knowledge to Learn Sense Representations. In ECAI'20 (CCF Rank B, Full Paper)
32 Haixin Wang, Xingzhang Ren, Jinan Sun, Wei Ye, Long Chen, Muzhi Yu, Shikun Zhang. Deep Dynamic Boosted Forest. In ACML'20 (CCF Rank C, Full Paper)
33 Bo Li, Zhonghao Sheng, Wei Ye*, Jinglei Zhang, Kai Liu and Shikun Zhang. Sliding Hierarchical Recurrent Neural Networks for Sequence Classification. In IJCNN'20 (CCF Rank C, Full Paper)
34 Peiyang Liu, Wei Ye*, Xiangyu Xi, Tong Wang and Shikun Zhang. Not All Synonyms Are Created Equal: Incorporating Similarity of Synonyms to Enhance Word Embeddings. In IJCNN'20 (CCF Rank C, Full Paper)
35 Wei Ye, Bo Li, Rui Xie, Zhonghao Sheng, Long Chen and Shikun Zhang. Exploiting Entity BIO Tag Embeddings and Multi-task Learning for Relation Extraction with Imbalanced Data. In ACL'19 (CCF Rank A, Full Paper)
36 Bo Li, Zehua Cheng, Zhenghua Xu, Wei Ye*, Thomas Lukasiewicz and Shikun Zhang. Long Text Analysis Using Sliced Recurrent Neural Network with Breaking Point Information Enrichment. In ICASSP'19 (CCF Rank B, Full Paper)
37 Rui Xie, Long Chen, Wei Ye*, Zhiyu Li, Tianxiang Hu, Dongdong Du and Shikun Zhang. DeepLink: A Code Knowledge Graph Based Deep Learning Approach for Issue-Commit Link Recovery. In SANER'19 (CCF Rank B, Full Paper)
38 Long Chen, Wei Ye* and Shikun Zhang. Capturing Source Code Semantics via Tree-based Convolution over API-enhanced AST. ACM International Conference on Computing Frontiers, 2019 (CCF Rank C, Full Paper)
39 Luyao Ma, Long Zhang, Wei Ye and Wenhui Hu. PKUSE at SemEval-2019 Task 3: Emotion Detection with Emotion-Oriented Neural Attention Network. In Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval), 2019
40 Xiangyu Xi, Tong Zhang, Wei Ye*, Jinglei Zhang, Rui Xie and Shikun Zhang. A Hybrid Character Representation for Chinese Event Detection. In IJCNN'19 (CCF Rank C, Full Paper)

Awards

No. Rank Award Name
01 2/1029 Tianchi Big Data Competition Platform, National Metabolic Management Center Medical Knowledge Graph Challenge, 2019.
02 1/1124 Tianchi Big Data Competition Platform: Alibaba Security Algorithm Challenge, 2017.
03 3/2754 China Computer Federation (CCF) Big Data & Computing Intelligence Contest 2019 (BDCI), Sentiment Analysis of Internet News, 2019.
04 2/335 Beijing Campus University Big Data Competition 2018, Campus Traffic Forecast, 2018.
05 3/622 Tianchi Big Data Competition Platform, The 3rd Alibaba Cloud Security Algorithm Challenge, 2018.
06 2/576 Ppdai Mirror Cup: Text Matching Challenge, 2018.
07 3/4878 Daguan Technology Co., Ltd Daguan Cup, Long Sentence Classification Challenge, 2018.
08 3/1027 CIKM AnalytiCup 2018, Cross-Language Short Text Matching, 2018.
09 3/959 Tianchi Big Data Competition Platform, The 2nd Alibaba Cloud Security Algorithm Challenge, 2017.

Projects

No. Time Project Name
01 2021.12 - 2025.11 BIGO Co., Ltd., PKU-BIGO Joint Laboratory of Artificial Intelligence
02 2019.12 - 2022.11 National Key Research and Development Program of China, Research on the Sensitive Text and Sensitive Image Monitoring Methods of Internet Cultural Services (No. 2019YFB1405802)
03 2019.04 - 2024.03 Guoxin Health Insurance Service Group Co., Ltd., Joint Laboratory of Medical Artificial Intelligence
04 2018.10 - 2021.12 BeidaSoft Co., Ltd., Research and Development of Technologies on Domain Knowledge Graph Construction and Intelligent Information Retrieval

In addition to academic research projects, our technology supports a large number of commercial projects in government big data governance and software quality assurance, with a total value of over 1 billion yuan. Please refer to www.beidasoft.com for more details.

People

Professors
Shikun Zhang
Researcher, Director of National Engineering Research Center for Software Engineering
Knowledge Computing
Software Engineering
Software Security
Wei Ye
Associate Researcher, Director of Knowledge Computing Lab
Natural Language Processing
Programming Language Comprehension
Knowledge Graph
Post-Doc
Rui Xie
Software Engineering & Knowledge Computing
PhD Students
Tianxiang Hu
Software Engineering & Knowledge Computing
Tong Zhang
Knowledge Computing
Fuyao Duan
Software Engineering
Xiao Deng
Software Engineering
Zile Qiao
Natural Language Processing
Knowledge Computing
Jinglei Zhang
Knowledge Computing
Chaoya Jiang
Natural Language Processing
Master's Students
Yongle Wei
Natural Language Processing
Ce Liu
Deep Learning
Zheyu Ying
Software Engineering
Dingyao Yu
Natural Language Processing
Guochang Wen
Natural Language Processing
Hongxiang Chen
Natural Language Processing
Yang An
Natural Language Processing
Ziyue Zhang
Natural Language Processing
Wenjing Yang
Source Code Analysis
Yang Yang
Knowledge Computing
Gexiang Fang
Knowledge Computing
Ruimin Lin
Knowledge Computing
Tianfan Xu
Knowledge Computing
Jinghao Wei
Knowledge Computing
Yue Lu
Knowledge Computing
Jingxiang Ma
Knowledge Computing
Alumni
ZHANG YUHAN
2021 360
XIE RUI
2021 Peking University
FAN YUANHAO
2021 Meituan
LIU PEIYANG
2020 360
SHENG ZHONGHAO
2020 Baidu
XI XIANGYU
2020 Meituan
ZHANG LONG
2020 Sogou
MA LUYAO
2020 Alibaba
CHEN LONG
2019 Ding Xiang
REN XINGZHANG
2019 Alibaba
LIAO NINGLIN
2018 CASIC
LUO RUICI
2018 ByteDance
XING LIANG
2017 Tencent
TANG XIAOQING
2017 Run a startup
XU CHEN
2016 ByteDance
YANG JUN
2016 Baidu
GOU RUI
2014 Bank System
HUANG SHUZHI
2014 Google
LV YIQIANG
2013 ChinaClear
YI FEI
2013 ChinaClear
LI RUNDONG
2012 Google
LIU QINGZHOU
2012 Netease