no code implementations • COLING 2022 • Zhichao Geng, Ming Zhong, Zhangyue Yin, Xipeng Qiu, Xuanjing Huang
For dialogue summarization, a subdomain of text summarization, utterances are concatenated into flat text before being processed.
no code implementations • Findings (ACL) 2022 • Yong Dai, Linyang Li, Cong Zhou, Zhangyin Feng, Enbo Zhao, Xipeng Qiu, Piji Li, Duyu Tang
The meaning of a word in Chinese is different in that a word is a compositional unit consisting of multiple characters.
1 code implementation • EMNLP 2021 • Tuo ji, Hang Yan, Xipeng Qiu
Chinese Spelling Check (CSC) is the task of detecting and correcting Chinese spelling errors.
1 code implementation • Findings (EMNLP) 2021 • Yiran Chen, PengFei Liu, Xipeng Qiu
In this paper, we present an adversarial meta-evaluation methodology that allows us to (i) diagnose the fine-grained strengths and weaknesses of 6 existing top-performing metrics over 24 diagnostic test datasets, and (ii) search for directions for further improvement by data augmentation.
no code implementations • ACL 2022 • Yunhua Zhou, Peiju Liu, Xipeng Qiu
The Out-of-Domain (OOD) intent classification is a basic and challenging task for dialogue systems.
no code implementations • ACL (WebNLG, INLG) 2020 • Qipeng Guo, Zhijing Jin, Ning Dai, Xipeng Qiu, xiangyang xue, David Wipf, Zheng Zhang
Text verbalization of knowledge graphs is an important problem with wide application to natural language generation (NLG) systems.
1 code implementation • 6 Jun 2024 • Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, wei he, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
Building generalist agents that can handle diverse tasks and evolve themselves across different environments is a long-term goal in the AI community.
no code implementations • 22 May 2024 • Xuyang Ge, Fukang Zhu, Wentao Shu, Junxuan Wang, Zhengfu He, Xipeng Qiu
Circuit analysis of a given model behavior is a central task in mechanistic interpretability.
1 code implementation • 21 May 2024 • Zhangyue Yin, Qiushi Sun, Qipeng Guo, Zhiyuan Zeng, Xiaonan Li, Tianxiang Sun, Cheng Chang, Qinyuan Cheng, Ding Wang, Xiaofeng Mou, Xipeng Qiu, Xuanjing Huang
Recent advancements in Chain-of-Thought prompting have facilitated significant breakthroughs for Large Language Models (LLMs) in complex reasoning tasks.
2 code implementations • 8 Apr 2024 • Dong Zhang, Zhaowei Li, ShiMin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu
However, the integration of human feedback to align speech outputs to human preferences is often neglected.
no code implementations • 3 Apr 2024 • Mozhi Zhang, Mianqiu Huang, Rundong Shi, Linsen Guo, Chong Peng, Peng Yan, Yaqian Zhou, Xipeng Qiu
Large language models optimized with techniques like RLHF have achieved good alignment in being helpful and harmless.
1 code implementation • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin
The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).
Ranked #5 on Long-Context Understanding on Ada-LEval (BestAnswer)
1 code implementation • 25 Mar 2024 • Jiasheng Ye, Peiju Liu, Tianxiang Sun, Yunhua Zhou, Jun Zhan, Xipeng Qiu
The pretraining data of large language models comprises multiple domains (e.g., web text, academic papers, code), whose mixture proportions crucially affect the competence of the resulting models.
1 code implementation • 21 Mar 2024 • Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, XiaoLi Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu
Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains.
1 code implementation • 6 Mar 2024 • Yuhong Sun, Zhangyue Yin, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Hui Zhao
This paper presents a new method for evaluating LLM hallucination in Question Answering (QA) based on the unanswerable math word problem (MWP).
no code implementations • 5 Mar 2024 • Bo wang, Tianxiang Sun, Hang Yan, Siyin Wang, Qingyuan Cheng, Xipeng Qiu
The exploration of whether agents can align with their environment without relying on human-labeled data presents an intriguing research topic.
1 code implementation • 27 Feb 2024 • Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong
The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length.
no code implementations • 27 Feb 2024 • Jiaqi Wang, Zhenxi Song, Zhengyu Ma, Xipeng Qiu, Min Zhang, Zhiguo Zhang
Reconstructing natural language from non-invasive electroencephalography (EEG) holds great promise as a language decoding technology for brain-computer interfaces (BCIs).
no code implementations • 26 Feb 2024 • Runyu Peng, Yunhua Zhou, Qipeng Guo, Yang Gao, Hang Yan, Xipeng Qiu, Dahua Lin
Significantly, our method requires no additional corpus and remains orthogonal to pruning and quantization methods.
1 code implementation • 24 Feb 2024 • Yi Zong, Xipeng Qiu
Large Vision-Language Models (LVLMs) have demonstrated great abilities in image perception and language understanding.
1 code implementation • 22 Feb 2024 • Jinlan Fu, Shenzhen Huangfu, Hang Yan, See-Kiong Ng, Xipeng Qiu
Large Language Models (LLMs) have recently showcased remarkable generalizability in various domains.
1 code implementation • 22 Feb 2024 • Yunfan Shao, Linyang Li, Zhaoye Fei, Hang Yan, Dahua Lin, Xipeng Qiu
Data plays a fundamental role in the training of Large Language Models (LLMs).
1 code implementation • 21 Feb 2024 • Kai Lv, Xiaoran Liu, Qipeng Guo, Hang Yan, Conghui He, Xipeng Qiu, Dahua Lin
The quality of training data is crucial for enhancing the long-text capabilities of foundation models.
no code implementations • 20 Feb 2024 • Jie Ren, Qipeng Guo, Hang Yan, Dongrui Liu, Xipeng Qiu, Dahua Lin
Although large language models (LLMs) have demonstrated remarkable performance, the lack of transparency in their inference logic raises concerns about their trustworthiness.
no code implementations • 20 Feb 2024 • Demin Song, Honglin Guo, Yunhua Zhou, Shuhao Xing, Yudong Wang, Zifan Song, Wenwei Zhang, Qipeng Guo, Hang Yan, Xipeng Qiu, Dahua Lin
Programming skill is a crucial ability for Large Language Models (LLMs), necessitating a deep understanding of programming languages (PLs) and their correlation with natural languages (NLs).
no code implementations • 19 Feb 2024 • Zhengfu He, Xuyang Ge, Qiong Tang, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu
Sparse dictionary learning has been a rapidly growing technique in mechanistic interpretability to attack superposition and extract more human-understandable features from model activations.
1 code implementation • 19 Feb 2024 • Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu
We introduce AnyGPT, an any-to-any multimodal language model that utilizes discrete representations for the unified processing of various modalities, including speech, text, images, and music.
no code implementations • 17 Feb 2024 • Siyin Wang, ShiMin Li, Tianxiang Sun, Jinlan Fu, Qinyuan Cheng, Jiasheng Ye, Junjie Ye, Xipeng Qiu, Xuanjing Huang
HAG extends the current paradigm in the text generation process, highlighting the feasibility of endowing LLMs with self-regulated decoding strategies.
no code implementations • 17 Feb 2024 • Zhiyuan Zeng, Qipeng Guo, Zhaoye Fei, Zhangyue Yin, Yunhua Zhou, Linyang Li, Tianxiang Sun, Hang Yan, Dahua Lin, Xipeng Qiu
To address the dropped tokens and padding, we propose the Rectify-Router, comprising the Intra-GPU Rectification and the Fill-in Rectification.
1 code implementation • 9 Feb 2024 • Huaiyuan Ying, Shuo Zhang, Linyang Li, Zhejian Zhou, Yunfan Shao, Zhaoye Fei, Yichuan Ma, Jiawei Hong, Kuikun Liu, Ziyi Wang, Yudong Wang, Zijian Wu, Shuaibin Li, Fengzhe Zhou, Hongwei Liu, Songyang Zhang, Wenwei Zhang, Hang Yan, Xipeng Qiu, Jiayu Wang, Kai Chen, Dahua Lin
We further explore how to use LEAN to solve math problems and study its performance in a multi-task learning setting, which shows the possibility of using LEAN as a unified platform for solving and proving in math.
1 code implementation • 30 Jan 2024 • Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
This technique introduces a fusion network to unify the processing of outputs from different visual experts, while bridging the gap between image encoders and pre-trained LLMs.
Ranked #54 on Visual Question Answering on MM-Vet
no code implementations • 26 Jan 2024 • Zhaoye Fei, Yunfan Shao, Linyang Li, Zhiyuan Zeng, Conghui He, Hang Yan, Dahua Lin, Xipeng Qiu
Large language models have demonstrated remarkable potential in various tasks; however, there remains a significant scarcity of open-source models and data for specific domains.
1 code implementation • 26 Jan 2024 • Yu Sun, Keyu Chen, Shujie Wang, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin
However, these evaluation benchmarks are limited to assessing the instruction-following capabilities, overlooking the fundamental abilities that emerge during the pre-training stage.
1 code implementation • 24 Jan 2024 • Xinghao Wang, Junliang He, Pengyu Wang, Yunhua Zhou, Tianxiang Sun, Xipeng Qiu
These methods regularize the representation space by pulling similar sentence representations closer and pushing away the dissimilar ones, and have been proven effective in various NLP tasks, e.g., semantic textual similarity (STS) tasks.
1 code implementation • 24 Jan 2024 • Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, ShiMin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu
To answer this question, we construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions, based on existing open-domain question answering datasets.
1 code implementation • 24 Jan 2024 • Dong Zhang, Xin Zhang, Jun Zhan, ShiMin Li, Yaqian Zhou, Xipeng Qiu
It comprises an autoregressive model based on LLM for semantic information modeling and a non-autoregressive model employing flow matching for perceptual information modeling.
1 code implementation • 20 Jan 2024 • Pengyu Wang, Dong Zhang, Linyang Li, Chenkun Tan, Xinghao Wang, Ke Ren, Botian Jiang, Xipeng Qiu
With the rapid development of large language models (LLMs), they are not only used as general-purpose AI assistants but are also customized through further fine-tuning to meet the requirements of different applications.
1 code implementation • 11 Jan 2024 • Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and fully leverage high-quality preference data.
no code implementations • 9 Jan 2024 • ShiMin Li, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu
Agents based on Large Language Models (LLMs) are increasingly permeating various domains of human production and life, highlighting the importance of aligning them with human values.
1 code implementation • 8 Jan 2024 • Dong Zhang, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu
In this paper, we propose SpeechAgents, a multi-modal LLM based multi-agent system designed for simulating human communication.
1 code implementation • 17 Dec 2023 • Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, Hongyang Li, Mengzhe Geng, Yue Wu, Wenhai Wang, Junsong Chen, Zhangyue Yin, Xiaozhe Ren, Jie Fu, Junxian He, Wu Yuan, Qi Liu, Xihui Liu, Yu Li, Hao Dong, Yu Cheng, Ming Zhang, Pheng Ann Heng, Jifeng Dai, Ping Luo, Jingdong Wang, Ji-Rong Wen, Xipeng Qiu, Yike Guo, Hui Xiong, Qun Liu, Zhenguo Li
Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation.
1 code implementation • 12 Dec 2023 • Yuqing Yang, Ethan Chern, Xipeng Qiu, Graham Neubig, PengFei Liu
Recent research has made significant strides in applying alignment techniques to enhance the helpfulness and harmlessness of large language models (LLMs) in accordance with human intentions.
1 code implementation • 4 Dec 2023 • Zhangyue Yin, Qiushi Sun, Cheng Chang, Qipeng Guo, Junqi Dai, Xuanjing Huang, Xipeng Qiu
Large Language Models (LLMs) have recently made significant strides in complex reasoning tasks through the Chain-of-Thought technique.
1 code implementation • 1 Dec 2023 • Kai Lv, Shuo Zhang, Tianle Gu, Shuhao Xing, Jiawei Hong, Keyu Chen, Xiaoran Liu, Yuqing Yang, Honglin Guo, Tengxiao Liu, Yu Sun, Qipeng Guo, Hang Yan, Xipeng Qiu
This paper introduces CoLLiE, an efficient library that facilitates collaborative training of large language models using 3D parallelism, parameter-efficient fine-tuning (PEFT) methods, and optimizers such as Lion, Adan, Sophia, LOMO and AdaLomo.
1 code implementation • 14 Nov 2023 • Xiaonan Li, Changtai Zhu, Linyang Li, Zhangyue Yin, Tianxiang Sun, Xipeng Qiu
Thus, the LLM can iteratively provide feedback to retrieval and help the retrieval results fully support verifiable generation.
1 code implementation • 12 Nov 2023 • Kexin Huang, Xiangyang Liu, Qianyu Guo, Tianxiang Sun, Jiawei Sun, Yaru Wang, Zeyang Zhou, Yixu Wang, Yan Teng, Xipeng Qiu, Yingchun Wang, Dahua Lin
The widespread adoption of large language models (LLMs) across various regions underscores the urgent need to evaluate their alignment with human values.
1 code implementation • 23 Oct 2023 • Tengxiao Liu, Qipeng Guo, Yuqing Yang, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang
As large language models (LLMs) have shown effectiveness with different prompting methods, such as Chain of Thought and Program of Thought, we find that these methods complement each other well on math reasoning tasks.
1 code implementation • 17 Oct 2023 • Linyang Li, Botian Jiang, Pengyu Wang, Ke Ren, Hang Yan, Xipeng Qiu
Abuse of large language models reveals high risks as large language models are being deployed at an astonishing speed.
1 code implementation • 16 Oct 2023 • Kai Lv, Hang Yan, Qipeng Guo, Haijun Lv, Xipeng Qiu
Our experiments with instruction-tuning and further pre-training demonstrate that AdaLomo achieves results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models.
1 code implementation • 16 Oct 2023 • Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
Large language models (LLMs) can serve as agents to simulate human behaviors, given their powerful ability to understand human instructions and generate high-quality text.
no code implementations • 14 Oct 2023 • Yuxin Wang, Xiannian Hu, Quan Gan, Xuanjing Huang, Xipeng Qiu, David Wipf
Graph neural networks (GNNs) for link prediction can loosely be divided into two broad categories.
1 code implementation • 13 Oct 2023 • Pengyu Wang, Linyang Li, Ke Ren, Botian Jiang, Dong Zhang, Xipeng Qiu
Therefore, it is important to build strong AI-generated text (AIGT) detectors.
1 code implementation • 13 Oct 2023 • Linyang Li, Ke Ren, Yunfan Shao, Pengyu Wang, Xipeng Qiu
Through experimental results, we find that we can build a connection between discrete and continuous perturbations and use the proposed PerturbScore to learn such correlation, surpassing previous methods used in discrete perturbation measuring.
1 code implementation • 8 Oct 2023 • Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin
The extrapolation capability of Large Language Models (LLMs) based on Rotary Position Embedding is currently a topic of considerable interest.
2 code implementations • 5 Oct 2023 • Qinyuan Cheng, Tianxiang Sun, Wenwei Zhang, Siyin Wang, Xiangyang Liu, Mozhi Zhang, Junliang He, Mianqiu Huang, Zhangyue Yin, Kai Chen, Xipeng Qiu
We analyze the primary types of hallucinations in different types of models and their causes.
1 code implementation • 30 Sep 2023 • Qiushi Sun, Zhangyue Yin, Xiang Li, Zhiyong Wu, Xipeng Qiu, Lingpeng Kong
Large Language Models (LLMs) are evolving at an unprecedented pace and have exhibited considerable capability in the realm of natural language processing (NLP) with world knowledge.
1 code implementation • 14 Sep 2023 • Zhiheng Xi, Wenxiang Chen, Xin Guo, wei he, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Tao Gui
Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks.
3 code implementations • 31 Aug 2023 • Xin Zhang, Dong Zhang, ShiMin Li, Yaqian Zhou, Xipeng Qiu
Therefore, we propose SpeechTokenizer, a unified speech tokenizer for speech large language models.
1 code implementation • 5 Aug 2023 • Yuhao Dan, Zhikai Lei, Yiyang Gu, Yong Li, Jianghao Yin, Jiaju Lin, Linhao Ye, Zhiyan Tie, Yougen Zhou, Yilei Wang, Aimin Zhou, Ze Zhou, Qin Chen, Jie zhou, Liang He, Xipeng Qiu
Currently, EduChat is available online as an open-source project, with its code, data, and model parameters available on platforms (e.g., GitHub https://github.com/icalk-nlp/EduChat, Hugging Face https://huggingface.co/ecnu-icalk).
no code implementations • 3 Aug 2023 • Xiaowu Zhang, Xiaotian Zhang, Cheng Yang, Hang Yan, Xipeng Qiu
As large language models, such as GPT, continue to advance the capabilities of natural language processing (NLP), the question arises: does the problem of correction still persist?
3 code implementations • 20 Jul 2023 • Chenxin An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu
Recently, there has been growing interest in extending the context length of large language models (LLMs), aiming to effectively process long inputs of one turn or conversations with more extensive histories.
1 code implementation • 11 Jul 2023 • Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang
Therefore, we explore PPO-max, an advanced version of the PPO algorithm, to efficiently improve the training stability of the policy model.
no code implementations • 19 Jun 2023 • Dongyu Ru, Lin Qiu, Xipeng Qiu, Yue Zhang, Zheng Zhang
Discourse analysis is an important task because it models intrinsic semantic structures between sentences in a document.
1 code implementation • 16 Jun 2023 • Kai Lv, Yuqing Yang, Tengxiao Liu, Qinghui Gao, Qipeng Guo, Xipeng Qiu
Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) but demand massive GPU resources for training.
1 code implementation • 16 Jun 2023 • Yuxin Wang, Quan Gan, Xipeng Qiu, Xuanjing Huang, David Wipf
Hypergraphs are a powerful abstraction for representing higher-order interactions between entities of interest.
1 code implementation • 30 May 2023 • Yuqing Yang, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang
Motivated by the fact that all event structures can be inferred from AMR, this work reformulates EAE as a link prediction problem on AMR graphs.
1 code implementation • 29 May 2023 • Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang
Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.
no code implementations • 25 May 2023 • ShiMin Li, Xiaotian Zhang, Yanjun Zheng, Linyang Li, Xipeng Qiu
Dialogue data in real scenarios tend to be sparsely available, leaving data-starved end-to-end dialogue systems inadequately trained.
no code implementations • 23 May 2023 • Chenxin An, Jiangtao Feng, Fei Huang, Xipeng Qiu, Lingpeng Kong
In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.
1 code implementation • 21 May 2023 • Xiaotian Zhang, Chunyang Li, Yi Zong, Zhengyu Ying, Liang He, Xipeng Qiu
Large Language Models (LLMs) have demonstrated remarkable performance across various natural language processing tasks; however, how to assess their performance comprehensively and accurately has become an urgent issue.
1 code implementation • 20 May 2023 • Mozhi Zhang, Hang Yan, Yaqian Zhou, Xipeng Qiu
We use prompts that contain entity category information to construct label prototypes, which enables our model to fine-tune with only the support set.
1 code implementation • 18 May 2023 • Dong Zhang, ShiMin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu
Multi-modal large language models are regarded as a crucial step towards Artificial General Intelligence (AGI) and have garnered significant interest with the emergence of ChatGPT.
1 code implementation • 9 May 2023 • Xiaonan Li, Xipeng Qiu
Specifically, MoT is divided into two stages: 1. before the test stage, the LLM pre-thinks on the unlabeled dataset and saves the high-confidence thoughts as external memory; 2.
1 code implementation • 9 May 2023 • Peng Li, Tianxiang Sun, Qiong Tang, Hang Yan, Yuanbin Wu, Xuanjing Huang, Xipeng Qiu
A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it.
1 code implementation • 7 May 2023 • Xiaonan Li, Kai Lv, Hang Yan, Tianyang Lin, Wei Zhu, Yuan Ni, Guotong Xie, Xiaoling Wang, Xipeng Qiu
To train UDR, we cast various tasks' training signals into a unified list-wise ranking formulation using the language model's feedback.
1 code implementation • 3 May 2023 • Qinyuan Cheng, Xiaogui Yang, Tianxiang Sun, Linyang Li, Xipeng Qiu
Our method utilizes AI feedback from large pre-trained language models (LLMs) to construct sample pairs with fine-grained sample similarity scores to improve contrastive learning.
no code implementations • 27 Apr 2023 • Linyang Li, Pengyu Wang, Ke Ren, Tianxiang Sun, Xipeng Qiu
The extraordinary performance of large language models (LLMs) heightens the importance of detecting whether the context is generated by an AI system.
no code implementations • 27 Feb 2023 • Xiaonan Li, Xipeng Qiu
Additionally, the strong dependency among in-context examples makes it an NP-hard combinatorial optimization problem and enumerating all permutations is infeasible.
2 code implementations • 19 Dec 2022 • Zhangyue Yin, Yuxin Wang, Xiannian Hu, Yiguang Wu, Hang Yan, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
Multi-Hop Question Answering (MHQA) is a significant area in question answering, requiring multiple reasoning components, including document retrieval, supporting sentence prediction, and answer span extraction.
1 code implementation • 14 Dec 2022 • ShiMin Li, Qinyuan Cheng, Linyang Li, Xipeng Qiu
As the functionality of dialogue systems evolves, hybrid dialogue systems that accomplish user-specific goals and participate in open-topic chitchat with users are attracting growing attention.
no code implementations • 8 Dec 2022 • Xiaotian Zhang, Yanjun Zheng, Hang Yan, Xipeng Qiu
While pre-trained Chinese language models have demonstrated impressive performance on a wide range of NLP tasks, the Chinese Spell Checking (CSC) task remains a challenge.
1 code implementation • 28 Nov 2022 • Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu
We present DiffusionBERT, a new generative masked language model based on discrete diffusion models.
no code implementations • 23 Nov 2022 • Chu-Tak Lee, Qipeng Guo, Xipeng Qiu
Based on this observation, we rethink the existing character-aware method that takes character-level inputs but performs word-level sequence modeling and prediction.
1 code implementation • 31 Oct 2022 • Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang
RLET iteratively performs single-step reasoning with sentence selection and deduction generation modules, from which the training signal is accumulated across the tree with an elaborately designed aligned reward function that is consistent with the evaluation.
no code implementations • 31 Oct 2022 • Xiaotian Zhang, Hang Yan, Yu Sun, Xipeng Qiu
To adapt BERT to the CSC task, we propose a token-level self-distillation contrastive learning method.
1 code implementation • 28 Oct 2022 • Qipeng Guo, Yuqing Yang, Hang Yan, Xipeng Qiu, Zheng Zhang
In this paper, we investigate the root cause of the underwhelming performance of the existing generative DocRE models and discover that the culprit is the inadequacy of the training paradigm, instead of the capacities of the models.
1 code implementation • 26 Oct 2022 • Qinyuan Cheng, Linyang Li, Guofeng Quan, Feng Gao, Xiaofeng Mou, Xipeng Qiu
Besides, we introduce a sentence-level and a session-level score to measure the sentence fluency and session coherence in the interactive evaluation.
no code implementations • 21 Oct 2022 • Yunhua Zhou, Peiju Liu, Yuxin Wang, Xipeng Qiu
In this paper, starting from the intuition that discovering intents could be beneficial to the identification of the known intents, we propose a probabilistic framework for discovering intents where intent assignments are treated as latent variables.
1 code implementation • 20 Oct 2022 • Xiangyang Liu, Tianxiang Sun, Xuanjing Huang, Xipeng Qiu
Through extensive experimental results across various tasks and PTMs, we show that LPT can achieve performance competitive with full model tuning and other PETuning methods under both full-data and few-shot scenarios while offering faster training speed and lower memory cost.
1 code implementation • 18 Oct 2022 • Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan
In this paper, we present SCodeR, a Soft-labeled contrastive pre-training framework with two positive sample construction methods to learn functional-level Code Representation.
1 code implementation • 14 Oct 2022 • Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang
MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks.
1 code implementation • 14 Oct 2022 • Tianxiang Sun, Junliang He, Xipeng Qiu, Xuanjing Huang
Automatic evaluation metrics are crucial to the development of generative systems.
1 code implementation • 13 Oct 2022 • Yunhua Zhou, Pengyu Wang, Peiju Liu, Yuxin Wang, Xipeng Qiu
Most existing methods of Out-of-Domain (OOD) intent classification rely on extensive auxiliary OOD corpora or specific training paradigms.
1 code implementation • COLING 2022 • Chenxin An, Ming Zhong, Zhiyong Wu, Qin Zhu, Xuanjing Huang, Xipeng Qiu
Traditional training paradigms for extractive and abstractive summarization systems always use only token-level or sentence-level training objectives.
no code implementations • 23 Sep 2022 • Zhigang Kan, Linhui Feng, Zhangyue Yin, Linbo Qiao, Xipeng Qiu, Dongsheng Li
In this paper, we propose a novel composable prompt-based generative framework, which could be applied to a wide range of tasks in the field of Information Extraction.
no code implementations • COLING 2022 • Zhaoye Fei, Yu Tian, Yongkang Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Jiawen Wu, Dejiang Kong, Ruofei Lai, Zhao Cao, Zhicheng Dou, Xipeng Qiu
Our experiments on 13 benchmark datasets across five natural language understanding tasks demonstrate the superiority of our method.
1 code implementation • 9 Aug 2022 • Hang Yan, Yu Sun, Xiaonan Li, Xipeng Qiu
In this paper, we propose using a Convolutional Neural Network (CNN) to model these spatial relations in the score matrix.
Ranked #3 on Nested Named Entity Recognition on ACE 2005
2 code implementations • 29 May 2022 • Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang
We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.
1 code implementation • 27 May 2022 • Yuxin Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
Transformers have made progress in miscellaneous tasks, but suffer from quadratic computational and memory complexities.
1 code implementation • 23 May 2022 • Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment.
1 code implementation • 23 Apr 2022 • Xiangkun Hu, Junqi Dai, Hang Yan, Yi Zhang, Qipeng Guo, Xipeng Qiu, Zheng Zhang
We propose Dialogue Meaning Representation (DMR), a pliable and easily extendable representation for task-oriented dialogue.
no code implementations • 27 Mar 2022 • Linyang Li, Demin Song, Xipeng Qiu
Adversarial purification is a successful defense mechanism against adversarial attacks without requiring knowledge of the form of the incoming attack.
1 code implementation • 12 Mar 2022 • Linyang Li, Yong Dai, Duyu Tang, Xipeng Qiu, Zenglin Xu, Shuming Shi
We present a Chinese BERT model dubbed MarkBERT that uses word information in this work.
1 code implementation • Findings (ACL) 2022 • Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu
Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffer from poor generalization and require threshold tuning.
no code implementations • 1 Mar 2022 • Yong Dai, Linyang Li, Cong Zhou, Zhangyin Feng, Enbo Zhao, Xipeng Qiu, Piji Li, Duyu Tang
The meaning of a word in Chinese is different in that a word is a compositional unit consisting of multiple characters.
no code implementations • 20 Feb 2022 • Yitao Liu, Chenxin An, Xipeng Qiu
With the success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters.
no code implementations • 18 Feb 2022 • Zhichao Geng, Hang Yan, Zhangyue Yin, Chenxin An, Xipeng Qiu
Chinese NER is a difficult undertaking due to the ambiguity of Chinese characters and the absence of word boundaries.
1 code implementation • 26 Jan 2022 • Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan
For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build code-text pairs.
no code implementations • 24 Jan 2022 • Xiangkun Hu, Hang Yan, Qipeng Guo, Xipeng Qiu, Weinan Zhang, Zheng Zhang
Knowledge and expertise in the real world can be disjointedly owned.
2 code implementations • 10 Jan 2022 • Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable.
1 code implementation • 21 Dec 2021 • ShiMin Li, Hang Yan, Xipeng Qiu
Meanwhile, we utilize an auxiliary response generation task to enhance the model's ability to handle context information, thereby forcing the model to recognize emotions with similar semantics in diverse contexts.
Ranked #11 on Emotion Recognition in Conversation on EmoryNLP
no code implementations • 14 Oct 2021 • Hao Jiang, Ke Zhan, Jianwei Qu, Yongkang Wu, Zhaoye Fei, Xinyu Zhang, Lei Chen, Zhicheng Dou, Xipeng Qiu, Zikai Guo, Ruofei Lai, Jiawen Wu, Enrui Hu, Yinxia Zhang, Yantao Jia, Fan Yu, Zhao Cao
To increase the number of activated experts without an increase in computational cost, we propose SAM (Switch and Mixture) routing, an efficient hierarchical routing mechanism that activates multiple experts on the same device (GPU).
1 code implementation • NAACL 2022 • Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
ELUE is dedicated to depicting the Pareto frontier for various language understanding tasks, such that it can tell whether and by how much a method achieves a Pareto improvement.
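To make the notion of a Pareto frontier concrete, here is a minimal sketch that filters a set of (score, cost) pairs down to the non-dominated ones. It is not ELUE's implementation; the function name `pareto_frontier` and the example numbers are hypothetical.

```python
def pareto_frontier(points):
    """Return the points not dominated by any other point.

    Each point is (score, cost); higher score and lower cost are better.
    A point is dominated if another point is at least as good on both
    axes and strictly better on at least one.
    """
    frontier = []
    for i, (s, c) in enumerate(points):
        dominated = any(
            (s2 >= s and c2 <= c) and (s2 > s or c2 < c)
            for j, (s2, c2) in enumerate(points)
            if j != i
        )
        if not dominated:
            frontier.append((s, c))
    return frontier

# Hypothetical (accuracy, relative FLOPs) pairs, made up for illustration.
methods = [(0.90, 1.00), (0.88, 0.50), (0.85, 0.60), (0.80, 0.25)]
front = pareto_frontier(methods)
```

Under these definitions, a new method achieves a Pareto improvement when it dominates at least one point on the current frontier.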
1 code implementation • 6 Oct 2021 • Linyang Li, Demin Song, Ruotian Ma, Xipeng Qiu, Xuanjing Huang
Pre-trained models are widely used in fine-tuning downstream tasks with linear classifiers optimized by the cross-entropy loss, which might face robustness and stability problems.
1 code implementation • 26 Sep 2021 • Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang
In this paper, we review this phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.
no code implementations • 16 Sep 2021 • Chenxin An, Ming Zhong, Zhichao Geng, Jianqiang Yang, Xipeng Qiu
Existing summarization systems mostly generate summaries purely relying on the content of the source document.
1 code implementation • 13 Sep 2021 • Yunfan Shao, Zhichao Geng, Yitao Liu, Junqi Dai, Hang Yan, Fei Yang, Li Zhe, Hujun Bao, Xipeng Qiu
In this paper, we take advantage of previous pre-trained models (PTMs) and propose a novel Chinese Pre-trained Unbalanced Transformer (CPT).
no code implementations • 10 Sep 2021 • Yitao Liu, Tianxiang Sun, Xipeng Qiu, Xuanjing Huang
This one-way interaction leads to the teacher's inability to perceive the characteristics of the student and its training progress.
no code implementations • EMNLP 2021 • Linyang Li, Demin Song, Xiaonan Li, Jiehang Zeng, Ruotian Ma, Xipeng Qiu
Pre-Trained Models (PTMs) have been widely applied and recently proved vulnerable under backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers.
no code implementations • 14 Jun 2021 • Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan YAO, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI).
3 code implementations • ACL 2021 • Hang Yan, Junqi Dai, Tuo ji, Xipeng Qiu, Zheng Zhang
Aspect-based Sentiment Analysis (ABSA) aims to identify the aspect terms, their corresponding sentiment polarities, and the opinion terms.
Ranked #1 on Aspect Sentiment Triplet Extraction on SemEval
1 code implementation • 8 Jun 2021 • Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu
X-formers) have been proposed, however, a systematic and comprehensive literature review on these Transformer variants is still missing.
1 code implementation • ACL 2021 • Hang Yan, Tao Gui, Junqi Dai, Qipeng Guo, Zheng Zhang, Xipeng Qiu
To that end, we propose to formulate the NER subtasks as an entity span sequence generation task, which can be solved by a unified sequence-to-sequence (Seq2Seq) framework.
Ranked #10 on Nested Named Entity Recognition on GENIA
no code implementations • 28 May 2021 • Tianxiang Sun, Yunhua Zhou, Xiangyang Liu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
In this paper, we show that a novel objective function for the training of the ensemble internal classifiers can be naturally induced from the perspective of ensemble learning and information theory.
1 code implementation • ACL 2021 • Xiaonan Li, Yunfan Shao, Tianxiang Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang
To alleviate this problem, we extend the recent successful early-exit mechanism to accelerate the inference of PTMs for sequence labeling tasks.
1 code implementation • Findings (EMNLP) 2021 • Yichao Luo, Yige Xu, Jiacheng Ye, Xipeng Qiu, Qi Zhang
In response to this problem, we propose a new fine-grained evaluation metric to improve the RL framework, which considers different granularities: token-level $F_1$ score, edit distance, duplication, and prediction quantities.
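Two of the ingredients named above, token-level $F_1$ and edit distance, can be sketched with standard definitions. This is an illustrative sketch only: how the paper combines these quantities into its metric is not stated here, and the function names are assumptions.

```python
from collections import Counter

def token_f1(predicted, reference):
    """Token-level F1 between two token lists (multiset overlap)."""
    overlap = sum((Counter(predicted) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

def edit_distance(a, b):
    """Levenshtein distance between two sequences
    (unit-cost insertion, deletion, and substitution)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]
```

Unlike exact matching, both measures give partial credit to a prediction that overlaps with the reference phrase without reproducing it exactly.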
1 code implementation • NAACL 2021 • Ming Zhong, Da Yin, Tao Yu, Ahmad Zaidi, Mutethia Mutuma, Rahul Jha, Ahmed Hassan Awadallah, Asli Celikyilmaz, Yang Liu, Xipeng Qiu, Dragomir Radev
As increasing numbers of meetings are recorded and transcribed, meeting summaries have become essential to remind those who may or may not have attended the meetings about the key decisions made and the tasks to be completed.
1 code implementation • NAACL 2021 • Junqi Dai, Hang Yan, Tianxiang Sun, PengFei Liu, Xipeng Qiu
In this paper, we first compare the induced trees from PTMs and the dependency parsing trees on several popular models for the ABSA task, showing that the induced tree from fine-tuned RoBERTa (FT-RoBERTa) outperforms the parser-provided tree.
1 code implementation • 7 Apr 2021 • Chenxin An, Ming Zhong, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang
Previous work on text summarization in the scientific domain has mainly focused on the content of the input document, seldom considering its citation network.
1 code implementation • ACL 2021 • Tao Gui, Xiao Wang, Qi Zhang, Qin Liu, Yicheng Zou, Xin Zhou, Rui Zheng, Chong Zhang, Qinzhuo Wu, Jiacheng Ye, Zexiong Pang, Yongxin Zhang, Zhengyan Li, Ruotian Ma, Zichu Fei, Ruijian Cai, Jun Zhao, Xingwu Hu, Zhiheng Yan, Yiding Tan, Yuan Hu, Qiyuan Bian, Zhihua Liu, Bolin Zhu, Shan Qin, Xiaoyu Xing, Jinlan Fu, Yue Zhang, Minlong Peng, Xiaoqing Zheng, Yaqian Zhou, Zhongyu Wei, Xipeng Qiu, Xuanjing Huang
To guarantee user acceptability, all the text transformations are linguistically based, and we provide a human evaluation for each one.
no code implementations • 29 Dec 2020 • Linyang Li, Yunfan Shao, Demin Song, Xipeng Qiu, Xuanjing Huang
The substitutions in the generated adversarial examples are not characters or words but 'pieces', which are more natural to Chinese readers.
2 code implementations • 19 Dec 2020 • Jianze Liang, Chengqi Zhao, Mingxuan Wang, Xipeng Qiu, Lei LI
Neural machine translation often adopts the fine-tuning approach to adapt to specific domains.
1 code implementation • 14 Dec 2020 • Qipeng Guo, Zhijing Jin, Ziyu Wang, Xipeng Qiu, Weinan Zhang, Jun Zhu, Zheng Zhang, David Wipf
Cycle-consistent training is widely used for jointly learning a forward and inverse mapping between two domains of interest without the cumbersome requirement of collecting matched pairs within each domain.
1 code implementation • COLING 2020 • Zhijing Jin, Qipeng Guo, Xipeng Qiu, Zheng Zhang
With a human-annotated test set, we provide this new benchmark dataset for future research on unsupervised text generation from knowledge graphs.
Ranked #1 on Unsupervised KG-to-Text Generation on GenWiki (Fine)
no code implementations • 16 Nov 2020 • Jingjing Gong, Hang Yan, Yining Zheng, Xipeng Qiu, Xuanjing Huang
Many natural language processing problems need to encode a text sequence as a fixed-length vector, which usually involves an aggregation process that combines the representations of all the words, such as pooling or self-attention.
no code implementations • NAACL 2021 • Zhen Ke, Liang Shi, Songtao Sun, Erli Meng, Bin Wang, Xipeng Qiu
Recent research shows that pre-trained models (PTMs) are beneficial to Chinese Word Segmentation (CWS).
2 code implementations • Findings of the Association for Computational Linguistics 2020 • Yiran Chen, PengFei Liu, Ming Zhong, Zi-Yi Dou, Danqing Wang, Xipeng Qiu, Xuanjing Huang
In this paper, we perform an in-depth analysis of characteristics of different datasets and investigate the performance of different summarization models under a cross-dataset setting, in which a summarizer trained on one corpus will be evaluated on a range of out-of-domain corpora.
1 code implementation • EMNLP 2020 • Zehui Lin, Xiao Pan, Mingxuan Wang, Xipeng Qiu, Jiangtao Feng, Hao Zhou, Lei LI
We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as the common seed and obtain derivative and improved models on arbitrary language pairs?
Ranked #3 on Machine Translation on WMT2014 English-French (using extra training data)
1 code implementation • COLING 2020 • Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, Zheng Zhang
With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Hang Yan, Xiaonan Li, Xipeng Qiu
The reverse dictionary task is to find the proper target word given a description of the word.
no code implementations • ACL 2021 • Wei Zhu, Xipeng Qiu, Yuan Ni, Guotong Xie
Ablation study demonstrates the necessity of our search space design and the effectiveness of our search method.
1 code implementation • ACL 2021 • Zhichao Geng, Hang Yan, Xipeng Qiu, Xuanjing Huang
The joint-model is trained and evaluated on 13 corpora of four tasks, yielding near state-of-the-art (SOTA) performance in dependency parsing and NER, achieving SOTA performance in CWS and POS.
3 code implementations • 4 Sep 2020 • Wei Zhu, Xiaoling Wang, Xipeng Qiu, Yuan Ni, Guotong Xie
Although transformer architectures dominate many natural language understanding tasks, issues in training transformer models remain unsolved, especially the need for a principled warm-up method, which has proven important for stable training, and the question of whether the task at hand prefers scaling the attention product.
no code implementations • ACL 2020 • Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu
Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision communities.
1 code implementation • 21 Jun 2020 • Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu
Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision communities.
2 code implementations • ACL (WebNLG, INLG) 2020 • Qipeng Guo, Zhijing Jin, Xipeng Qiu, Wei-Nan Zhang, David Wipf, Zheng Zhang
Due to the difficulty and high cost of data collection, the supervised data available in the two fields are usually on the order of tens of thousands of examples, for example, 18K in the WebNLG 2017 dataset after preprocessing, which is far fewer than the millions available for other tasks such as machine translation.
1 code implementation • 5 Jun 2020 • Zhijing Jin, Yongyi Yang, Xipeng Qiu, Zheng Zhang
In natural language, often multiple entities appear in the same text.
1 code implementation • 30 Apr 2020 • Linyang Li, Xipeng Qiu
Gradient-based adversarial training is widely used in improving the robustness of neural networks, while it cannot be easily adapted to natural language processing tasks since the embedding space is discrete.
1 code implementation • ACL 2020 • Danqing Wang, PengFei Liu, Yining Zheng, Xipeng Qiu, Xuanjing Huang
An intuitive way is to put them in the graph-based neural network, which has a more complex structure for capturing inter-sentence relationships.
1 code implementation • ACL 2020 • Xiaonan Li, Hang Yan, Xipeng Qiu, Xuanjing Huang
Recently, the character-word lattice structure has been proved to be effective for Chinese named entity recognition (NER) by incorporating the word information.
Ranked #5 on Chinese Named Entity Recognition on MSRA
4 code implementations • EMNLP 2020 • Linyang Li, Ruotian Ma, Qipeng Guo, xiangyang xue, Xipeng Qiu
Adversarial attacks on discrete data (such as text) have proved significantly more challenging than attacks on continuous data (such as images), since it is difficult to generate adversarial samples with gradient-based methods.
2 code implementations • ACL 2020 • Ming Zhong, PengFei Liu, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
Ranked #1 on Text Summarization on BBC XSum
no code implementations • 13 Apr 2020 • Zhen Ke, Liang Shi, Erli Meng, Bin Wang, Xipeng Qiu, Xuanjing Huang
Besides, the pre-trained BERT language model has been also introduced into the MCCWS task in a multi-task learning framework.
3 code implementations • 18 Mar 2020 • Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang
Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era.
1 code implementation • 24 Feb 2020 • Yige Xu, Xipeng Qiu, Ligao Zhou, Xuanjing Huang
Fine-tuning pre-trained language models like BERT has become an effective way in NLP and yields state-of-the-art results on many downstream tasks.
no code implementations • 2 Dec 2019 • Qipeng Guo, Xipeng Qiu, PengFei Liu, xiangyang xue, Zheng Zhang
In this paper, we introduce prior knowledge, in the form of multi-scale structure, into self-attention modules.
2 code implementations • 23 Nov 2019 • Kaiqiang Song, Logan Lebanoff, Qipeng Guo, Xipeng Qiu, xiangyang xue, Chen Li, Dong Yu, Fei Liu
If generating a word can introduce an erroneous relation to the summary, the behavior must be discouraged.
Ranked #27 on Text Summarization on GigaWord
1 code implementation • 12 Nov 2019 • Tianxiang Sun, Yunfan Shao, Xiaonan Li, PengFei Liu, Hang Yan, Xipeng Qiu, Xuanjing Huang
Most existing deep multi-task learning models are based on parameter sharing, such as hard sharing, hierarchical sharing, and soft sharing.
2 code implementations • 11 Nov 2019 • Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
The Transformer model is widely successful on many natural language processing tasks.
Ranked #1 on Machine Translation on IWSLT2015 Chinese-English
7 code implementations • 10 Nov 2019 • Hang Yan, Bocao Deng, Xiaonan Li, Xipeng Qiu
Bidirectional long short-term memory networks (BiLSTMs) have been widely used as encoders in models solving the named entity recognition (NER) task.
Ranked #11 on Chinese Named Entity Recognition on Resume NER
no code implementations • WS 2019 • Ming Zhong, Danqing Wang, PengFei Liu, Xipeng Qiu, Xuanjing Huang
In this paper, we take stock of the current state of summarization datasets and explore how different factors of datasets influence the generalization behaviour of neural extractive summarization models.
no code implementations • 30 Aug 2019 • Danqing Wang, PengFei Liu, Ming Zhong, Jie Fu, Xipeng Qiu, Xuanjing Huang
Although domain shift has been well explored in many NLP applications, it has still received little attention in the domain of extractive text summarization.
3 code implementations • IJCNLP 2019 • Luyao Huang, Chi Sun, Xipeng Qiu, Xuanjing Huang
Word Sense Disambiguation (WSD) aims to find the exact sense of an ambiguous word in a particular context.
Ranked #3 on Word Sense Disambiguation on WiC-TSV
no code implementations • 25 Jul 2019 • Lin Zehui, PengFei Liu, Luyao Huang, Junkun Chen, Xipeng Qiu, Xuanjing Huang
Variant dropout methods have been designed for the fully-connected, convolutional, and recurrent layers in neural networks, and have been shown to be effective in avoiding overfitting.
2 code implementations • ACL 2019 • Ming Zhong, PengFei Liu, Danqing Wang, Xipeng Qiu, Xuanjing Huang
The recent years have seen remarkable success in the use of deep neural networks on text summarization.
Ranked #6 on Extractive Text Summarization on CNN / Daily Mail
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Xipeng Qiu, Hengzhi Pei, Hang Yan, Xuanjing Huang
Multi-criteria Chinese word segmentation (MCCWS) aims to exploit the relations among the multiple heterogeneous segmentation criteria and further improve the performance of each single criterion.
16 code implementations • 14 May 2019 • Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang
Language model pre-training has proven to be useful in learning universal language representations.
Ranked #1 on Text Classification on Yahoo! Answers
4 code implementations • ACL 2019 • Ning Dai, Jianze Liang, Xipeng Qiu, Xuanjing Huang
Disentangling the content and style in the latent space is prevalent in unpaired text style transfer.
1 code implementation • TACL 2020 • Hang Yan, Xipeng Qiu, Xuanjing Huang
Our graph-based joint model achieves better performance than previous joint models and state-of-the-art results in both Chinese word segmentation and dependency parsing.
8 code implementations • NAACL 2019 • Chi Sun, Luyao Huang, Xipeng Qiu
Aspect-based sentiment analysis (ABSA), which aims to identify fine-grained opinion polarity towards a specific aspect, is a challenging subtask of sentiment analysis (SA).
2 code implementations • NAACL 2019 • Qipeng Guo, Xipeng Qiu, PengFei Liu, Yunfan Shao, xiangyang xue, Zheng Zhang
Although the Transformer has achieved great success on many NLP tasks, its heavy structure with fully-connected attention connections leads to a dependence on large amounts of training data.
Ranked #13 on Sentiment Analysis on SST-5 Fine-grained classification
1 code implementation • NAACL 2019 • Chi Sun, Xipeng Qiu, Xuanjing Huang
Chinese uses a logographic writing system, and the shapes of Chinese characters contain rich syntactic and semantic information.
no code implementations • 19 Dec 2018 • Jingjing Gong, Xinchi Chen, Tao Gui, Xipeng Qiu
With these auto-switched LSTMs, our model provides a more flexible solution for multi-criteria CWS and makes it easy to transfer the learned knowledge to new criteria.
no code implementations • 26 Nov 2018 • Pengfei Liu, Jie Fu, Yue Dong, Xipeng Qiu, Jackie Chi Kit Cheung
We present two architectures for multi-task learning with neural sequence models.
1 code implementation • 12 Oct 2018 • Fu Sun, Linyang Li, Xipeng Qiu, Yang Liu
A key subtask is to reliably predict whether the question is unanswerable.
Ranked #12 on Question Answering on SQuAD2.0 dev
no code implementations • EMNLP 2018 • Jingjing Gong, Xipeng Qiu, Xinchi Chen, Dong Liang, Xuanjing Huang
Attention-based neural models have achieved great success in natural language inference (NLI).
no code implementations • CONLL 2018 • Danlu Chen, Mengxiao Lin, Zhifeng Hu, Xipeng Qiu
This paper describes Fudan's submission to CoNLL 2018's shared task Universal Dependency Parsing.
no code implementations • 24 Sep 2018 • Shuyang Cao, Xipeng Qiu, Xuanjing Huang
Neural architecture for named entity recognition has achieved great success in the field of natural language processing.
no code implementations • 23 Sep 2018 • Kaiyu Chen, Yihan Dong, Xipeng Qiu, Zitian Chen
With curriculum learning, our model can deal with a complex arithmetic expression calculation with the deep hierarchical structure of skill models.
no code implementations • 23 Aug 2018 • Junkun Chen, Kaiyu Chen, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
Designing shared neural architecture plays an important role in multi-task learning.
no code implementations • 21 Aug 2018 • Chi Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang
Therefore, with the aim of representing words in a highly efficient way, we propose to operate a Gaussian word embedding model with a loss function based on the Wasserstein distance.
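For intuition, the Wasserstein distance between two Gaussians has a simple closed form when the covariances are diagonal. The sketch below assumes diagonal covariances and uses the squared 2-Wasserstein distance; both are assumptions for illustration, and the paper's actual loss function is not given in this listing. The function name is hypothetical.

```python
def w2_squared_diag(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between N(mu1, diag(sigma1^2))
    and N(mu2, diag(sigma2^2)).

    `sigma1` and `sigma2` hold per-dimension standard deviations.
    For diagonal Gaussians the distance reduces to
    ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    std_term = sum((s - t) ** 2 for s, t in zip(sigma1, sigma2))
    return mean_term + std_term

# Two hypothetical 2-dimensional word distributions (illustrative values).
d = w2_squared_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0])
```

Unlike the KL divergence, this quantity is symmetric and remains finite even as a variance approaches zero.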
no code implementations • 14 Aug 2018 • Qipeng Guo, Xipeng Qiu, xiangyang xue, Zheng Zhang
Text generation is a fundamental building block in natural language processing tasks.
2 code implementations • COLING 2018 • Jingjing Gong, Xipeng Qiu, Shaojing Wang, Xuanjing Huang
The dynamic routing policy dynamically decides what and how much information needs to be transferred from each word to the final encoding of the text sequence.
Ranked #44 on Sentiment Analysis on IMDb
3 code implementations • 30 Apr 2018 • Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
Similar to the adversarial models, the reward and policy function in IRL are optimized alternately.
no code implementations • 22 Apr 2018 • Renjie Zheng, Junkun Chen, Xipeng Qiu
More specifically, all tasks share the same sentence representation, and each task can select task-specific information from the shared sentence representation with an attention mechanism.
no code implementations • 25 Feb 2018 • Junkun Chen, Xipeng Qiu, Pengfei Liu, Xuanjing Huang
Specifically, we use a shared meta-network to capture the meta-knowledge of semantic composition and generate the parameters of the task-specific semantic composition models.
no code implementations • 25 Feb 2018 • Jinyue Su, Jiacheng Xu, Xipeng Qiu, Xuanjing Huang
Generating plausible and fluent sentences with desired properties has long been a challenge.
no code implementations • EMNLP 2017 • Pengfei Liu, Kaiyu Qian, Xipeng Qiu, Xuanjing Huang
Idioms are peculiar linguistic constructions that impose great challenges for representing the semantics of language, especially in current prevailing end-to-end neural models, which assume that the semantics of a phrase or sentence can be literally composed from its constitutive words.
no code implementations • 2 Jul 2017 • Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang
In this paper, we propose a new neural model to incorporate the word-level information for Chinese word segmentation.
1 code implementation • 9 Jun 2017 • Xipeng Qiu, Jingjing Gong, Xuanjing Huang
In this paper, we give an overview of the shared task at the CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2017): Chinese News Headline Categorization.
no code implementations • 11 May 2017 • Pengfei Liu, Xipeng Qiu, Xuanjing Huang
Tree-structured neural networks have proven to be effective in learning semantic representations by exploiting syntactic information.
3 code implementations • 8 May 2017 • Minghao Hu, Yuxing Peng, Zhen Huang, Xipeng Qiu, Furu Wei, Ming Zhou
In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects.
Ranked #17 on Question Answering on SQuAD1.1 dev
no code implementations • ACL 2017 • Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang
Different linguistic perspectives cause many diverse segmentation criteria for Chinese word segmentation (CWS).
no code implementations • ACL 2017 • Pengfei Liu, Xipeng Qiu, Xuanjing Huang
Neural network models have shown promise for multi-task learning, focusing on learning shared layers to extract common and task-invariant features.
no code implementations • 26 Nov 2016 • Jiacheng Xu, Kan Chen, Xipeng Qiu, Xuanjing Huang
In this paper, we propose a novel deep architecture to utilize both structural and textual information of entities.
no code implementations • 16 Nov 2016 • Xinchi Chen, Xipeng Qiu, Xuanjing Huang
Recently, neural network models for natural language processing tasks have received increasing attention for their ability to alleviate the burden of manual feature engineering.
no code implementations • 15 Nov 2016 • Jingjing Gong, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
However, it is nontrivial for pair-wise models to incorporate the contextual sentence information.