Search Results for author: Xipeng Qiu

Found 226 papers, 128 papers with code

Contrastive Aligned Joint Learning for Multilingual Summarization

1 code implementation • Findings (ACL) 2021 • Danqing Wang, Jiaze Chen, Hao Zhou, Xipeng Qiu, Lei Li

Paper
Code

Improving Abstractive Dialogue Summarization with Speaker-Aware Supervised Contrastive Learning

no code implementations • COLING 2022 • Zhichao Geng, Ming Zhong, Zhangyue Yin, Xipeng Qiu, Xuanjing Huang

For dialogue summarization, a subdomain of text summarization, utterances are concatenated into flat text before being processed.

Abstractive Dialogue Summarization Contrastive Learning +1

Paper
Add Code

“Is Whole Word Masking Always Better for Chinese BERT?”: Probing on Chinese Grammatical Error Correction

no code implementations • Findings (ACL) 2022 • Yong Dai, Linyang Li, Cong Zhou, Zhangyin Feng, Enbo Zhao, Xipeng Qiu, Piji Li, Duyu Tang

The meaning of a word in Chinese is different in that a word is a compositional unit consisting of multiple characters.

Grammatical Error Correction Language Modelling +2

Paper
Add Code

SpellBERT: A Lightweight Pretrained Model for Chinese Spelling Check

1 code implementation • EMNLP 2021 • Tuo Ji, Hang Yan, Xipeng Qiu

Chinese Spelling Check (CSC) is the task of detecting and correcting Chinese spelling errors.

Graph Neural Network Language Modelling +1

Paper
Code

Are Factuality Checkers Reliable? Adversarial Meta-evaluation of Factuality in Summarization

1 code implementation • Findings (EMNLP) 2021 • Yiran Chen, PengFei Liu, Xipeng Qiu

In this paper, we present an adversarial meta-evaluation methodology that allows us to (i) diagnose the fine-grained strengths and weaknesses of 6 existing top-performing metrics over 24 diagnostic test datasets, and (ii) search for directions for further improvement by data augmentation.

Data Augmentation

Paper
Code

KNN-Contrastive Learning for Out-of-Domain Intent Classification

no code implementations • ACL 2022 • Yunhua Zhou, Peiju Liu, Xipeng Qiu

Out-of-Domain (OOD) intent classification is a basic and challenging task for dialogue systems.

Classification Contrastive Learning +4

Paper
Add Code

$\mathcal{P}^2$: A Plan-and-Pretrain Approach for Knowledge Graph-to-Text Generation

no code implementations • ACL (WebNLG, INLG) 2020 • Qipeng Guo, Zhijing Jin, Ning Dai, Xipeng Qiu, Xiangyang Xue, David Wipf, Zheng Zhang

Text verbalization of knowledge graphs is an important problem with wide application to natural language generation (NLG) systems.

Knowledge Graphs Text Generation

Paper
Add Code

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

1 code implementation • 6 Jun 2024 • Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

Building generalist agents that can handle diverse tasks and evolve themselves across different environments is a long-term goal in the AI community.

Language Modelling Large Language Model

Paper
Code

Automatically Identifying Local and Global Circuits with Linear Computation Graphs

no code implementations • 22 May 2024 • Xuyang Ge, Fukang Zhu, Wentao Shu, Junxuan Wang, Zhengfu He, Xipeng Qiu

Circuit analysis of any given model behavior is a central task in mechanistic interpretability.

Paper
Add Code

Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

1 code implementation • 21 May 2024 • Zhangyue Yin, Qiushi Sun, Qipeng Guo, Zhiyuan Zeng, Xiaonan Li, Tianxiang Sun, Cheng Chang, Qinyuan Cheng, Ding Wang, Xiaofeng Mou, Xipeng Qiu, Xuanjing Huang

Recent advancements in Chain-of-Thought prompting have facilitated significant breakthroughs for Large Language Models (LLMs) in complex reasoning tasks.

Answer Selection

Paper
Code

SpeechAlign: Aligning Speech Generation to Human Preferences

2 code implementations • 8 Apr 2024 • Dong Zhang, Zhaowei Li, ShiMin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu

However, the integration of human feedback to align speech outputs to human preferences is often neglected.

Language Modelling

978

Paper
Code

Calibrating the Confidence of Large Language Models by Eliciting Fidelity

no code implementations • 3 Apr 2024 • Mozhi Zhang, Mianqiu Huang, Rundong Shi, Linsen Guo, Chong Peng, Peng Yan, Yaqian Zhou, Xipeng Qiu

Large language models optimized with techniques like RLHF have achieved good alignment in being helpful and harmless.

Language Modelling

Paper
Add Code

InternLM2 Technical Report

1 code implementation • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

Ranked #5 on Long-Context Understanding on Ada-LEval (BestAnswer)

4k Long-Context Understanding

5,435

Paper
Code

Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance

1 code implementation • 25 Mar 2024 • Jiasheng Ye, Peiju Liu, Tianxiang Sun, Yunhua Zhou, Jun Zhan, Xipeng Qiu

The pretraining data of large language models comprises multiple domains (e.g., web texts, academic papers, code), whose mixture proportions crucially impact the competence of the resulting models.

Language Modelling

Paper
Code

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

1 code implementation • 21 Mar 2024 • Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, XiaoLi Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains.

174

Paper
Code

Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem

1 code implementation • 6 Mar 2024 • Yuhong Sun, Zhangyue Yin, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Hui Zhao

This paper presents a new method for evaluating LLM hallucination in Question Answering (QA) based on the unanswerable math word problem (MWP).

Benchmarking Hallucination +4

Paper
Code

In-Memory Learning: A Declarative Learning Framework for Large Language Models

no code implementations • 5 Mar 2024 • Bo Wang, Tianxiang Sun, Hang Yan, Siyin Wang, Qingyuan Cheng, Xipeng Qiu

The exploration of whether agents can align with their environment without relying on human-labeled data presents an intriguing research topic.

Paper
Add Code

Training-Free Long-Context Scaling of Large Language Models

1 code implementation • 27 Feb 2024 • Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong

The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length.

16k

242

Paper
Code

Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder

no code implementations • 27 Feb 2024 • Jiaqi Wang, Zhenxi Song, Zhengyu Ma, Xipeng Qiu, Min Zhang, Zhiguo Zhang

Reconstructing natural language from non-invasive electroencephalography (EEG) holds great promise as a language decoding technology for brain-computer interfaces (BCIs).

Brain Decoding EEG +2

Paper
Add Code

Data-free Weight Compress and Denoise for Large Language Models

no code implementations • 26 Feb 2024 • Runyu Peng, Yunhua Zhou, Qipeng Guo, Yang Gao, Hang Yan, Xipeng Qiu, Dahua Lin

Notably, our method requires no additional corpus, and it remains orthogonal to pruning and quantization methods, so it can be used in conjunction with them.

Quantization

Paper
Add Code

GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation

1 code implementation • 24 Feb 2024 • Yi Zong, Xipeng Qiu

Large Vision-Language Models (LVLMs) have demonstrated great abilities in image perception and language understanding.

Paper
Code

Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge

1 code implementation • 22 Feb 2024 • Jinlan Fu, Shenzhen Huangfu, Hang Yan, See-Kiong Ng, Xipeng Qiu

Large Language Models (LLMs) have recently showcased remarkable generalizability in various domains.

Logical Reasoning

Paper
Code

Balanced Data Sampling for Language Model Training with Clustering

1 code implementation • 22 Feb 2024 • Yunfan Shao, Linyang Li, Zhaoye Fei, Hang Yan, Dahua Lin, Xipeng Qiu

Data plays a fundamental role in the training of Large Language Models (LLMs).

Clustering Language Modelling

Paper
Code

LongWanjuan: Towards Systematic Measurement for Long Text Quality

1 code implementation • 21 Feb 2024 • Kai Lv, Xiaoran Liu, Qipeng Guo, Hang Yan, Conghui He, Xipeng Qiu, Dahua Lin

The quality of training data is crucial for enhancing the long-text capabilities of foundation models.

Language Modelling

Paper
Code

Identifying Semantic Induction Heads to Understand In-Context Learning

no code implementations • 20 Feb 2024 • Jie Ren, Qipeng Guo, Hang Yan, Dongrui Liu, Xipeng Qiu, Dahua Lin

Although large language models (LLMs) have demonstrated remarkable performance, the lack of transparency in their inference logic raises concerns about their trustworthiness.

In-Context Learning Knowledge Graphs

Paper
Add Code

Code Needs Comments: Enhancing Code LLMs with Comment Augmentation

no code implementations • 20 Feb 2024 • Demin Song, Honglin Guo, Yunhua Zhou, Shuhao Xing, Yudong Wang, Zifan Song, Wenwei Zhang, Qipeng Guo, Hang Yan, Xipeng Qiu, Dahua Lin

Programming skill is a crucial ability for Large Language Models (LLMs), necessitating a deep understanding of programming languages (PLs) and their correlation with natural languages (NLs).

Data Augmentation

Paper
Add Code

Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT

no code implementations • 19 Feb 2024 • Zhengfu He, Xuyang Ge, Qiong Tang, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu

Sparse dictionary learning has been a rapidly growing technique in mechanistic interpretability to attack superposition and extract more human-understandable features from model activations.

Dictionary Learning

Paper
Add Code

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

1 code implementation • 19 Feb 2024 • Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yugang Jiang, Xipeng Qiu

We introduce AnyGPT, an any-to-any multimodal language model that utilizes discrete representations for the unified processing of various modalities, including speech, text, images, and music.

Language Modelling Large Language Model

596

Paper
Code

LLM can Achieve Self-Regulation via Hyperparameter Aware Generation

no code implementations • 17 Feb 2024 • Siyin Wang, ShiMin Li, Tianxiang Sun, Jinlan Fu, Qinyuan Cheng, Jiasheng Ye, Junjie Ye, Xipeng Qiu, Xuanjing Huang

HAG extends the current paradigm in the text generation process, highlighting the feasibility of endowing LLMs with self-regulated decoding strategies.

Text Generation

Paper
Add Code

Turn Waste into Worth: Rectifying Top-$k$ Router of MoE

no code implementations • 17 Feb 2024 • Zhiyuan Zeng, Qipeng Guo, Zhaoye Fei, Zhangyue Yin, Yunhua Zhou, Linyang Li, Tianxiang Sun, Hang Yan, Dahua Lin, Xipeng Qiu

To address the dropped tokens and padding, we propose the Rectify-Router, comprising the Intra-GPU Rectification and the Fill-in Rectification.

Computational Efficiency

Paper
Add Code

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

1 code implementation • 9 Feb 2024 • Huaiyuan Ying, Shuo Zhang, Linyang Li, Zhejian Zhou, Yunfan Shao, Zhaoye Fei, Yichuan Ma, Jiawei Hong, Kuikun Liu, Ziyi Wang, Yudong Wang, Zijian Wu, Shuaibin Li, Fengzhe Zhou, Hongwei Liu, Songyang Zhang, Wenwei Zhang, Hang Yan, Xipeng Qiu, Jiayu Wang, Kai Chen, Dahua Lin

We further explore how to use LEAN to solve math problems and study its performance under the setting of multi-task learning which shows the possibility of using LEAN as a unified platform for solving and proving in math.

Data Augmentation GSM8K +3

326

Paper
Code

MouSi: Poly-Visual-Expert Vision-Language Models

1 code implementation • 30 Jan 2024 • Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

This technique introduces a fusion network to unify the processing of outputs from different visual experts, while bridging the gap between image encoders and pre-trained LLMs.

Ranked #54 on Visual Question Answering on MM-Vet

Image Segmentation Image-text matching +4

Paper
Code

Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora

no code implementations • 26 Jan 2024 • Zhaoye Fei, Yunfan Shao, Linyang Li, Zhiyuan Zeng, Conghui He, Hang Yan, Dahua Lin, Xipeng Qiu

Large language models have demonstrated remarkable potential in various tasks; however, there remains a significant scarcity of open-source models and data for specific domains.

Language Modelling Large Language Model

Paper
Add Code

F-Eval: Assessing Fundamental Abilities with Refined Evaluation Methods

1 code implementation • 26 Jan 2024 • Yu Sun, Keyu Chen, Shujie Wang, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin

However, these evaluation benchmarks are limited to assessing the instruction-following capabilities, overlooking the fundamental abilities that emerge during the pre-training stage.

Instruction Following

Paper
Code

DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning

1 code implementation • 24 Jan 2024 • Xinghao Wang, Junliang He, Pengyu Wang, Yunhua Zhou, Tianxiang Sun, Xipeng Qiu

These methods regularize the representation space by pulling similar sentence representations closer and pushing dissimilar ones away, and have been proven effective in various NLP tasks, e.g., semantic textual similarity (STS) tasks.

Contrastive Learning Denoising +4

Paper
Code

Can AI Assistants Know What They Don't Know?

1 code implementation • 24 Jan 2024 • Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, ShiMin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu

To answer this question, we construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions, based on existing open-domain question answering datasets.

Math Open-Domain Question Answering +1

Paper
Code

SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation

1 code implementation • 24 Jan 2024 • Dong Zhang, Xin Zhang, Jun Zhan, ShiMin Li, Yaqian Zhou, Xipeng Qiu

It comprises an autoregressive model based on LLM for semantic information modeling and a non-autoregressive model employing flow matching for perceptual information modeling.

Voice Conversion

978

Paper
Code

InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance

1 code implementation • 20 Jan 2024 • Pengyu Wang, Dong Zhang, Linyang Li, Chenkun Tan, Xinghao Wang, Ke Ren, Botian Jiang, Xipeng Qiu

With the rapid development of large language models (LLMs), they are not only used as general-purpose AI assistants but are also customized through further fine-tuning to meet the requirements of different applications.

Paper
Code

Secrets of RLHF in Large Language Models Part II: Reward Modeling

1 code implementation • 11 Jan 2024 • Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and fully leverage high-quality preference data.

Contrastive Learning Meta-Learning +1

1,194

Paper
Code

Agent Alignment in Evolving Social Norms

no code implementations • 9 Jan 2024 • ShiMin Li, Tianxiang Sun, Qinyuan Cheng, Xipeng Qiu

Agents based on Large Language Models (LLMs) are increasingly permeating various domains of human production and life, highlighting the importance of aligning them with human values.

Paper
Add Code

SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

1 code implementation • 8 Jan 2024 • Dong Zhang, Zhaowei Li, Pengyu Wang, Xin Zhang, Yaqian Zhou, Xipeng Qiu

In this paper, we propose SpeechAgents, a multi-modal LLM based multi-agent system designed for simulating human communication.

Language Modelling Large Language Model

Paper
Code

A Survey of Reasoning with Foundation Models

1 code implementation • 17 Dec 2023 • Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, Hongyang Li, Mengzhe Geng, Yue Wu, Wenhai Wang, Junsong Chen, Zhangyue Yin, Xiaozhe Ren, Jie Fu, Junxian He, Wu Yuan, Qi Liu, Xihui Liu, Yu Li, Hao Dong, Yu Cheng, Ming Zhang, Pheng Ann Heng, Jifeng Dai, Ping Luo, Jingdong Wang, Ji-Rong Wen, Xipeng Qiu, Yike Guo, Hui Xiong, Qun Liu, Zhenguo Li

Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation.

Medical Diagnosis

374

Paper
Code

Alignment for Honesty

1 code implementation • 12 Dec 2023 • Yuqing Yang, Ethan Chern, Xipeng Qiu, Graham Neubig, PengFei Liu

Recent research has made significant strides in applying alignment techniques to enhance the helpfulness and harmlessness of large language models (LLMs) in accordance with human intentions.

Paper
Code

Exchange-of-Thought: Enhancing Large Language Model Capabilities through Cross-Model Communication

1 code implementation • 4 Dec 2023 • Zhangyue Yin, Qiushi Sun, Cheng Chang, Qipeng Guo, Junqi Dai, Xuanjing Huang, Xipeng Qiu

Large Language Models (LLMs) have recently made significant strides in complex reasoning tasks through the Chain-of-Thought technique.

Language Modelling Large Language Model

Paper
Code

CoLLiE: Collaborative Training of Large Language Models in an Efficient Way

1 code implementation • 1 Dec 2023 • Kai Lv, Shuo Zhang, Tianle Gu, Shuhao Xing, Jiawei Hong, Keyu Chen, Xiaoran Liu, Yuqing Yang, Honglin Guo, Tengxiao Liu, Yu Sun, Qipeng Guo, Hang Yan, Xipeng Qiu

This paper introduces CoLLiE, an efficient library that facilitates collaborative training of large language models using 3D parallelism, parameter-efficient fine-tuning (PEFT) methods, and optimizers such as Lion, Adan, Sophia, LOMO and AdaLomo.

391

Paper
Code

LLatrieval: LLM-Verified Retrieval for Verifiable Generation

1 code implementation • 14 Nov 2023 • Xiaonan Li, Changtai Zhu, Linyang Li, Zhangyue Yin, Tianxiang Sun, Xipeng Qiu

Thus, the LLM can iteratively provide feedback to retrieval and facilitate the retrieval result to fully support verifiable generation.

Language Modelling Large Language Model +1

Paper
Code

Flames: Benchmarking Value Alignment of LLMs in Chinese

1 code implementation • 12 Nov 2023 • Kexin Huang, Xiangyang Liu, Qianyu Guo, Tianxiang Sun, Jiawei Sun, Yaru Wang, Zeyang Zhou, Yixu Wang, Yan Teng, Xipeng Qiu, Yingchun Wang, Dahua Lin

The widespread adoption of large language models (LLMs) across various regions underscores the urgent need to evaluate their alignment with human values.

Benchmarking Fairness

Paper
Code

Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts

1 code implementation • 23 Oct 2023 • Tengxiao Liu, Qipeng Guo, Yuqing Yang, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

As large language models (LLMs) have shown effectiveness with different prompting methods, such as Chain of Thought and Program of Thought, we find that these methods complement each other well on math reasoning tasks.

Logical Reasoning Math

Paper
Code

Watermarking LLMs with Weight Quantization

1 code implementation • 17 Oct 2023 • Linyang Li, Botian Jiang, Pengyu Wang, Ke Ren, Hang Yan, Xipeng Qiu

Abuse of large language models reveals high risks as large language models are being deployed at an astonishing speed.

Language Modelling Large Language Model +1

Paper
Code

AdaLomo: Low-memory Optimization with Adaptive Learning Rate

1 code implementation • 16 Oct 2023 • Kai Lv, Hang Yan, Qipeng Guo, Haijun Lv, Xipeng Qiu

Our experiments with instruction-tuning and further pre-training demonstrate that AdaLomo achieves results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models.

948

Paper
Code

Character-LLM: A Trainable Agent for Role-Playing

1 code implementation • 16 Oct 2023 • Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu

Large language models (LLMs) can be used to serve as agents to simulate human behaviors, given the powerful ability to understand human instructions and provide high-quality generated texts.

369

Paper
Code

Efficient Link Prediction via GNN Layers Induced by Negative Sampling

no code implementations • 14 Oct 2023 • Yuxin Wang, Xiannian Hu, Quan Gan, Xuanjing Huang, Xipeng Qiu, David Wipf

Graph neural networks (GNNs) for link prediction can loosely be divided into two broad categories.

Decoder Link Prediction +1

Paper
Add Code

SeqXGPT: Sentence-Level AI-Generated Text Detection

1 code implementation • 13 Oct 2023 • Pengyu Wang, Linyang Li, Ke Ren, Botian Jiang, Dong Zhang, Xipeng Qiu

Therefore, it is important to build strong AI-generated text (AIGT) detectors.

Sentence Text Detection

Paper
Code

PerturbScore: Connecting Discrete and Continuous Perturbations in NLP

1 code implementation • 13 Oct 2023 • Linyang Li, Ke Ren, Yunfan Shao, Pengyu Wang, Xipeng Qiu

Through experimental results, we find that we can build a connection between discrete and continuous perturbations and use the proposed PerturbScore to learn such correlation, surpassing previous methods used in discrete perturbation measuring.

Paper
Code

Scaling Laws of RoPE-based Extrapolation

1 code implementation • 8 Oct 2023 • Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin

The extrapolation capability of Large Language Models (LLMs) based on Rotary Position Embedding is currently a topic of considerable interest.

16k

Paper
Code

Evaluating Hallucinations in Chinese Large Language Models

2 code implementations • 5 Oct 2023 • Qinyuan Cheng, Tianxiang Sun, Wenwei Zhang, Siyin Wang, Xiangyang Liu, Mozhi Zhang, Junliang He, Mianqiu Huang, Zhangyue Yin, Kai Chen, Xipeng Qiu

We analyze the primary types of hallucinations in different types of models and their causes.

Hallucination Question Answering

587

Paper
Code

Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration

1 code implementation • 30 Sep 2023 • Qiushi Sun, Zhangyue Yin, Xiang Li, Zhiyong Wu, Xipeng Qiu, Lingpeng Kong

Large Language Models (LLMs) are evolving at an unprecedented pace and have exhibited considerable capability in the realm of natural language processing (NLP) with world knowledge.

World Knowledge

Paper
Code

The Rise and Potential of Large Language Model Based Agents: A Survey

1 code implementation • 14 Sep 2023 • Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Tao Gui

Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks.

Language Modelling Large Language Model

5,561

Paper
Code

SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models

3 code implementations • 31 Aug 2023 • Xin Zhang, Dong Zhang, ShiMin Li, Yaqian Zhou, Xipeng Qiu

Therefore, we propose SpeechTokenizer, a unified speech tokenizer for speech large language models.

Decoder Language Modelling +1

320

Paper
Code

EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education

1 code implementation • 5 Aug 2023 • Yuhao Dan, Zhikai Lei, Yiyang Gu, Yong Li, Jianghao Yin, Jiaju Lin, Linhao Ye, Zhiyan Tie, Yougen Zhou, Yilei Wang, Aimin Zhou, Ze Zhou, Qin Chen, Jie Zhou, Liang He, Xipeng Qiu

Currently, EduChat is available online as an open-source project, with its code, data, and model parameters available on platforms (e.g., GitHub https://github.com/icalk-nlp/EduChat, Hugging Face https://huggingface.co/ecnu-icalk).

Chatbot Language Modelling +1

626

Paper
Code

Does Correction Remain A Problem For Large Language Models?

no code implementations • 3 Aug 2023 • Xiaowu Zhang, Xiaotian Zhang, Cheng Yang, Hang Yan, Xipeng Qiu

As large language models, such as GPT, continue to advance the capabilities of natural language processing (NLP), the question arises: does the problem of correction still persist?

Few-Shot Learning

Paper
Add Code

L-Eval: Instituting Standardized Evaluation for Long Context Language Models

3 code implementations • 20 Jul 2023 • Chenxin An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu

Recently, there has been growing interest in extending the context length of large language models (LLMs), aiming to effectively process long inputs of one turn or conversations with more extensive histories.

Instruction Following

11,956

Paper
Code

Secrets of RLHF in Large Language Models Part I: PPO

1 code implementation • 11 Jul 2023 • Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang

Therefore, we explore PPO-max, an advanced version of the PPO algorithm, to efficiently improve the training stability of the policy model.

1,194

Paper
Code

Distributed Marker Representation for Ambiguous Discourse Markers and Entangled Relations

no code implementations • 19 Jun 2023 • Dongyu Ru, Lin Qiu, Xipeng Qiu, Yue Zhang, Zheng Zhang

Discourse analysis is an important task because it models intrinsic semantic structures between sentences in a document.

Sentence

Paper
Add Code

Full Parameter Fine-tuning for Large Language Models with Limited Resources

1 code implementation • 16 Jun 2023 • Kai Lv, Yuqing Yang, Tengxiao Liu, Qinghui Gao, Qipeng Guo, Xipeng Qiu

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) but demand massive GPU resources for training.

948

Paper
Code

From Hypergraph Energy Functions to Hypergraph Neural Networks

1 code implementation • 16 Jun 2023 • Yuxin Wang, Quan Gan, Xipeng Qiu, Xuanjing Huang, David Wipf

Hypergraphs are a powerful abstraction for representing higher-order interactions between entities of interest.

Bilevel Optimization Graph Neural Network +1

Paper
Code

An AMR-based Link Prediction Approach for Document-level Event Argument Extraction

1 code implementation • 30 May 2023 • Yuqing Yang, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

Motivated by the fact that all event structures can be inferred from AMR, this work reformulates EAE as a link prediction problem on AMR graphs.

Event Argument Extraction Link Prediction +1

Paper
Code

Do Large Language Models Know What They Don't Know?

1 code implementation • 29 May 2023 • Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang

Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.

In-Context Learning

Paper
Code

Multijugate Dual Learning for Low-Resource Task-Oriented Dialogue System

no code implementations • 25 May 2023 • ShiMin Li, Xiaotian Zhang, Yanjun Zheng, Linyang Li, Xipeng Qiu

Dialogue data in real scenarios tend to be sparsely available, leaving data-hungry end-to-end dialogue systems inadequately trained.

Task-Oriented Dialogue Systems

Paper
Add Code

Optimizing Non-Autoregressive Transformers with Contrastive Learning

no code implementations • 23 May 2023 • Chenxin An, Jiangtao Feng, Fei Huang, Xipeng Qiu, Lingpeng Kong

In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.

Contrastive Learning Machine Translation +2

Paper
Add Code

Evaluating the Performance of Large Language Models on GAOKAO Benchmark

1 code implementation • 21 May 2023 • Xiaotian Zhang, Chunyang Li, Yi Zong, Zhengyu Ying, Liang He, Xipeng Qiu

Large Language Models (LLMs) have demonstrated remarkable performance across various natural language processing tasks; however, how to comprehensively and accurately assess their performance has become an urgent issue.

478

Paper
Code

PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search

1 code implementation • 20 May 2023 • Mozhi Zhang, Hang Yan, Yaqian Zhou, Xipeng Qiu

We use prompts that contain entity category information to construct label prototypes, which enables our model to fine-tune with only the support set.

few-shot-ner Few-shot NER +4

Paper
Code

SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities

1 code implementation • 18 May 2023 • Dong Zhang, ShiMin Li, Xin Zhang, Jun Zhan, Pengyu Wang, Yaqian Zhou, Xipeng Qiu

Multi-modal large language models are regarded as a crucial step towards Artificial General Intelligence (AGI) and have garnered significant interest with the emergence of ChatGPT.

Language Modelling Large Language Model +2

978

Paper
Code

MoT: Memory-of-Thought Enables ChatGPT to Self-Improve

1 code implementation • 9 May 2023 • Xiaonan Li, Xipeng Qiu

Specifically, MoT is divided into two stages: 1. before the test stage, the LLM pre-thinks on the unlabeled dataset and saves the high-confidence thoughts as external memory; 2.

Arithmetic Reasoning Natural Language Inference

CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors

1 code implementation • 9 May 2023 • Peng Li, Tianxiang Sun, Qiong Tang, Hang Yan, Yuanbin Wu, Xuanjing Huang, Xipeng Qiu

A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it.

Code Generation Few-Shot Learning +4

Unified Demonstration Retriever for In-Context Learning

1 code implementation • 7 May 2023 • Xiaonan Li, Kai Lv, Hang Yan, Tianyang Lin, Wei Zhu, Yuan Ni, Guotong Xie, Xiaoling Wang, Xipeng Qiu

To train UDR, we cast the training signals of various tasks into a unified list-wise ranking formulation using a language model's feedback.

In-Context Learning Language Modelling +1

Improving Contrastive Learning of Sentence Embeddings from AI Feedback

1 code implementation • 3 May 2023 • Qinyuan Cheng, Xiaogui Yang, Tianxiang Sun, Linyang Li, Xipeng Qiu

Our method utilizes AI feedback from large pre-trained language models (LLMs) to construct sample pairs with fine-grained sample similarity scores to improve contrastive learning.

Contrastive Learning Data Augmentation +5

Origin Tracing and Detecting of LLMs

no code implementations • 27 Apr 2023 • Linyang Li, Pengyu Wang, Ke Ren, Tianxiang Sun, Xipeng Qiu

The extraordinary performance of large language models (LLMs) heightens the importance of detecting whether the context is generated by an AI system.

Finding Support Examples for In-Context Learning

no code implementations • 27 Feb 2023 • Xiaonan Li, Xipeng Qiu

Additionally, the strong dependency among in-context examples makes it an NP-hard combinatorial optimization problem, and enumerating all permutations is infeasible.

Combinatorial Optimization In-Context Learning +2

Rethinking Label Smoothing on Multi-hop Question Answering

2 code implementations • 19 Dec 2022 • Zhangyue Yin, Yuxin Wang, Xiannian Hu, Yiguang Wu, Hang Yan, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

Multi-Hop Question Answering (MHQA) is a significant area in question answering, requiring multiple reasoning components, including document retrieval, supporting sentence prediction, and answer span extraction.

Image Classification Machine Reading Comprehension +6

Mitigating Negative Style Transfer in Hybrid Dialogue System

1 code implementation • 14 Dec 2022 • ShiMin Li, Qinyuan Cheng, Linyang Li, Xipeng Qiu

As the functionality of dialogue systems evolves, hybrid dialogue systems that accomplish user-specific goals and participate in open-topic chitchat with users are attracting growing attention.

Contrastive Learning Style Transfer

Investigating Glyph Phonetic Information for Chinese Spell Checking: What Works and What's Next

no code implementations • 8 Dec 2022 • Xiaotian Zhang, Yanjun Zheng, Hang Yan, Xipeng Qiu

While pre-trained Chinese language models have demonstrated impressive performance on a wide range of NLP tasks, the Chinese Spell Checking (CSC) task remains a challenge.

Chinese Spell Checking

DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models

1 code implementation • 28 Nov 2022 • Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu

We present DiffusionBERT, a new generative masked language model based on discrete diffusion models.

Denoising Language Modelling +1

Word-Level Representation From Bytes For Language Modeling

no code implementations • 23 Nov 2022 • Chu-Tak Lee, Qipeng Guo, Xipeng Qiu

Based on this observation, we rethink the existing character-aware method that takes character-level inputs but performs word-level sequence modeling and prediction.

Cross-Lingual Transfer Image Classification +4

RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees

1 code implementation • 31 Oct 2022 • Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

RLET iteratively performs single-step reasoning with sentence selection and deduction generation modules, from which the training signal is accumulated across the tree with an elaborately designed aligned reward function that is consistent with the evaluation.

Reinforcement Learning (RL) +1

SDCL: Self-Distillation Contrastive Learning for Chinese Spell Checking

no code implementations • 31 Oct 2022 • Xiaotian Zhang, Hang Yan, Yu Sun, Xipeng Qiu

To adapt BERT to the CSC task, we propose a token-level self-distillation contrastive learning method.

Chinese Spell Checking Contrastive Learning +1

DORE: Document Ordered Relation Extraction based on Generative Framework

1 code implementation • 28 Oct 2022 • Qipeng Guo, Yuqing Yang, Hang Yan, Xipeng Qiu, Zheng Zhang

In this paper, we investigate the root cause of the underwhelming performance of the existing generative DocRE models and discover that the culprit is the inadequacy of the training paradigm, instead of the capacities of the models.

Document-level Relation Extraction Relation

Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator

1 code implementation • 26 Oct 2022 • Qinyuan Cheng, Linyang Li, Guofeng Quan, Feng Gao, Xiaofeng Mou, Xipeng Qiu

Besides, we introduce a sentence-level and a session-level score to measure the sentence fluency and session coherence in the interactive evaluation.

Sentence

Discovering New Intents Using Latent Variables

no code implementations • 21 Oct 2022 • Yunhua Zhou, Peiju Liu, Yuxin Wang, Xipeng Qiu

In this paper, starting from the intuition that discovering intents could be beneficial to the identification of the known intents, we propose a probabilistic framework for discovering intents where intent assignments are treated as latent variables.

Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts

1 code implementation • 20 Oct 2022 • Xiangyang Liu, Tianxiang Sun, Xuanjing Huang, Xipeng Qiu

Through extensive experiments across various tasks and PTMs, we show that LPT can achieve performance competitive with full model tuning and other PETuning methods under both full-data and few-shot scenarios, while offering faster training and lower memory cost.

Soft-Labeled Contrastive Pre-training for Function-level Code Representation

1 code implementation • 18 Oct 2022 • Xiaonan Li, Daya Guo, Yeyun Gong, Yun Lin, Yelong Shen, Xipeng Qiu, Daxin Jiang, Weizhu Chen, Nan Duan

In this paper, we present SCodeR, a Soft-labeled contrastive pre-training framework with two positive sample construction methods to learn functional-level Code Representation.

Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning

1 code implementation • 14 Oct 2022 • Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang

MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks.

Few-Shot Learning Machine Reading Comprehension

BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation

1 code implementation • 14 Oct 2022 • Tianxiang Sun, Junliang He, Xipeng Qiu, Xuanjing Huang

Automatic evaluation metrics are crucial to the development of generative systems.

Fairness Language Modelling +1

The Open-World Lottery Ticket Hypothesis for OOD Intent Classification

1 code implementation • 13 Oct 2022 • Yunhua Zhou, Pengyu Wang, Peiju Liu, Yuxin Wang, Xipeng Qiu

Most existing methods of Out-of-Domain (OOD) intent classification rely on extensive auxiliary OOD corpora or specific training paradigms.

Intent Classification

COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization

1 code implementation • COLING 2022 • Chenxin An, Ming Zhong, Zhiyong Wu, Qin Zhu, Xuanjing Huang, Xipeng Qiu

Traditional training paradigms for extractive and abstractive summarization systems typically use only token-level or sentence-level training objectives.

Abstractive Text Summarization Contrastive Learning +2

A Unified Generative Framework based on Prompt Learning for Various Information Extraction Tasks

no code implementations • 23 Sep 2022 • Zhigang Kan, Linhui Feng, Zhangyue Yin, Linbo Qiao, Xipeng Qiu, Dongsheng Li

In this paper, we propose a novel composable prompt-based generative framework, which could be applied to a wide range of tasks in the field of Information Extraction.

Relation Extraction

Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding

no code implementations • COLING 2022 • Zhaoye Fei, Yu Tian, Yongkang Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Jiawen Wu, Dejiang Kong, Ruofei Lai, Zhao Cao, Zhicheng Dou, Xipeng Qiu

Our experiments on 13 benchmark datasets across five natural language understanding tasks demonstrate the superiority of our method.

Multi-Task Learning Natural Language Understanding

An Embarrassingly Easy but Strong Baseline for Nested Named Entity Recognition

1 code implementation • 9 Aug 2022 • Hang Yan, Yu Sun, Xiaonan Li, Xipeng Qiu

In this paper, we propose using a Convolutional Neural Network (CNN) to model these spatial relations in the score matrix.

Ranked #3 on Nested Named Entity Recognition on ACE 2005

Named Entity Recognition +3

CoNT: Contrastive Neural Text Generation

2 code implementations • 29 May 2022 • Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang

We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.

Code Comment Generation Comment Generation +4

What Dense Graph Do You Need for Self-Attention?

1 code implementation • 27 May 2022 • Yuxin Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

Transformers have made progress in miscellaneous tasks, but suffer from quadratic computational and memory complexities.

Miscellaneous

BBTv2: Towards a Gradient-Free Future with Large Language Models

1 code implementation • 23 May 2022 • Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment.

Few-Shot Learning Language Modelling

Dialogue Meaning Representation for Task-Oriented Dialogue Systems

1 code implementation • 23 Apr 2022 • Xiangkun Hu, Junqi Dai, Hang Yan, Yi Zhang, Qipeng Guo, Xipeng Qiu, Zheng Zhang

We propose Dialogue Meaning Representation (DMR), a pliable and easily extendable representation for task-oriented dialogue.

coreference-resolution Negation +1

Text Adversarial Purification as Defense against Adversarial Attacks

no code implementations • 27 Mar 2022 • Linyang Li, Demin Song, Xipeng Qiu

Adversarial purification is a successful defense mechanism against adversarial attacks without requiring knowledge of the form of the incoming attack.

Adversarial Attack Adversarial Purification

MarkBERT: Marking Word Boundaries Improves Chinese BERT

1 code implementation • 12 Mar 2022 • Linyang Li, Yong Dai, Duyu Tang, Xipeng Qiu, Zenglin Xu, Shuming Shi

In this work, we present a Chinese BERT model dubbed MarkBERT that uses word information.

Chinese Named Entity Recognition named-entity-recognition +7

A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation

1 code implementation • Findings (ACL) 2022 • Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu

Previous works usually adopt heuristic metrics, such as the entropy of internal outputs, to measure instance difficulty; these metrics suffer from generalization and threshold-tuning issues.

"Is Whole Word Masking Always Better for Chinese BERT?": Probing on Chinese Grammatical Error Correction

no code implementations • 1 Mar 2022 • Yong Dai, Linyang Li, Cong Zhou, Zhangyin Feng, Enbo Zhao, Xipeng Qiu, Piji Li, Duyu Tang

The meaning of a word in Chinese is different in that a word is a compositional unit consisting of multiple characters.

Grammatical Error Correction Language Modelling +2

$\mathcal{Y}$-Tuning: An Efficient Tuning Paradigm for Large-Scale Pre-Trained Models via Label Representation Learning

no code implementations • 20 Feb 2022 • Yitao Liu, Chenxin An, Xipeng Qiu

With the success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters.

Representation Learning

TURNER: The Uncertainty-based Retrieval Framework for Chinese NER

no code implementations • 18 Feb 2022 • Zhichao Geng, Hang Yan, Zhangyue Yin, Chenxin An, Xipeng Qiu

Chinese NER is a difficult undertaking due to the ambiguity of Chinese characters and the absence of word boundaries.

General Knowledge NER +1

CodeRetriever: Unimodal and Bimodal Contrastive Learning for Code Search

1 code implementation • 26 Jan 2022 • Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan

For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build code-text pairs.

Code Search Contrastive Learning

Towards Collaborative Question Answering: A Preliminary Study

no code implementations • 24 Jan 2022 • Xiangkun Hu, Hang Yan, Qipeng Guo, Xipeng Qiu, Weinan Zhang, Zheng Zhang

Knowledge and expertise in the real world can be disjointedly owned.

Question Answering

Black-Box Tuning for Language-Model-as-a-Service

2 code implementations • 10 Jan 2022 • Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu

In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable.

In-Context Learning Language Modelling

Contrast and Generation Make BART a Good Dialogue Emotion Recognizer

1 code implementation • 21 Dec 2021 • ShiMin Li, Hang Yan, Xipeng Qiu

Meanwhile, we utilize an auxiliary response generation task to enhance the model's ability to handle context information, thereby forcing the model to recognize emotions with similar semantics in diverse contexts.

Ranked #11 on Emotion Recognition in Conversation on EmoryNLP

Contrastive Learning Decoder +2

Towards More Effective and Economic Sparsely-Activated Model

no code implementations • 14 Oct 2021 • Hao Jiang, Ke Zhan, Jianwei Qu, Yongkang Wu, Zhaoye Fei, Xinyu Zhang, Lei Chen, Zhicheng Dou, Xipeng Qiu, Zikai Guo, Ruofei Lai, Jiawen Wu, Enrui Hu, Yinxia Zhang, Yantao Jia, Fan Yu, Zhao Cao

To increase the number of activated experts without an increase in computational cost, we propose SAM (Switch and Mixture) routing, an efficient hierarchical routing mechanism that activates multiple experts on the same device (GPU).

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

1 code implementation • NAACL 2022 • Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

ELUE is dedicated to depicting the Pareto frontier for various language understanding tasks, such that it can tell whether and by how much a method achieves a Pareto improvement.

KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier

1 code implementation • 6 Oct 2021 • Linyang Li, Demin Song, Ruotian Ma, Xipeng Qiu, Xuanjing Huang

Pre-trained models are widely used in fine-tuning downstream tasks with linear classifiers optimized by the cross-entropy loss, which might face robustness and stability problems.

Contrastive Learning text-classification +1

Paradigm Shift in Natural Language Processing

1 code implementation • 26 Sep 2021 • Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang

In this paper, we review this phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.

Chunking NER +3

RetrievalSum: A Retrieval Enhanced Framework for Abstractive Summarization

no code implementations • 16 Sep 2021 • Chenxin An, Ming Zhong, Zhichao Geng, Jianqiang Yang, Xipeng Qiu

Existing summarization systems mostly generate summaries purely relying on the content of the source document.

Abstractive Text Summarization Retrieval

CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

1 code implementation • 13 Sep 2021 • Yunfan Shao, Zhichao Geng, Yitao Liu, Junqi Dai, Hang Yan, Fei Yang, Li Zhe, Hujun Bao, Xipeng Qiu

In this paper, we take advantage of previous pre-trained models (PTMs) and propose a novel Chinese Pre-trained Unbalanced Transformer (CPT).

Decoder Denoising +4

Learning to Teach with Student Feedback

no code implementations • 10 Sep 2021 • Yitao Liu, Tianxiang Sun, Xipeng Qiu, Xuanjing Huang

This one-way interaction leads to the teacher's inability to perceive the characteristics of the student and its training progress.

Knowledge Distillation

Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning

no code implementations • EMNLP 2021 • Linyang Li, Demin Song, Xiaonan Li, Jiehang Zeng, Ruotian Ma, Xipeng Qiu

Pre-Trained Models (PTMs) have been widely applied and recently proved vulnerable under backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers.

Text Classification

Pre-Trained Models: Past, Present and Future

no code implementations • 14 Jun 2021 • Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan YAO, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu

Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI).

Computational Efficiency Self-Supervised Learning +1

A Unified Generative Framework for Aspect-Based Sentiment Analysis

3 code implementations • ACL 2021 • Hang Yan, Junqi Dai, Tuo Ji, Xipeng Qiu, Zheng Zhang

Aspect-based Sentiment Analysis (ABSA) aims to identify the aspect terms, their corresponding sentiment polarities, and the opinion terms.

Ranked #1 on Aspect Sentiment Triplet Extraction on SemEval

Aspect-Based Sentiment Analysis Aspect-oriented Opinion Extraction +2

A Survey of Transformers

1 code implementation • 8 Jun 2021 • Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu

… X-formers) have been proposed; however, a systematic and comprehensive literature review on these Transformer variants is still missing.

A Unified Generative Framework for Various NER Subtasks

1 code implementation • ACL 2021 • Hang Yan, Tao Gui, Junqi Dai, Qipeng Guo, Zheng Zhang, Xipeng Qiu

To that end, we propose to formulate the NER subtasks as an entity span sequence generation task, which can be solved by a unified sequence-to-sequence (Seq2Seq) framework.

Ranked #10 on Nested Named Entity Recognition on GENIA

Named Entity Recognition +2

Early Exiting with Ensemble Internal Classifiers

no code implementations • 28 May 2021 • Tianxiang Sun, Yunhua Zhou, Xiangyang Liu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

In this paper, we show that a novel objective function for the training of the ensemble internal classifiers can be naturally induced from the perspective of ensemble learning and information theory.

Ensemble Learning

Accelerating BERT Inference for Sequence Labeling via Early-Exit

1 code implementation • ACL 2021 • Xiaonan Li, Yunfan Shao, Tianxiang Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang

To alleviate this problem, we extend the recent successful early-exit mechanism to accelerate the inference of PTMs for sequence labeling tasks.

Sentence

Keyphrase Generation with Fine-Grained Evaluation-Guided Reinforcement Learning

1 code implementation • Findings (EMNLP) 2021 • Yichao Luo, Yige Xu, Jiacheng Ye, Xipeng Qiu, Qi Zhang

In response to this problem, we propose a new fine-grained evaluation metric to improve the RL framework, which considers different granularities: token-level $F_1$ score, edit distance, duplication, and prediction quantities.

Keyphrase Generation reinforcement-learning +1

QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization

1 code implementation • NAACL 2021 • Ming Zhong, Da Yin, Tao Yu, Ahmad Zaidi, Mutethia Mutuma, Rahul Jha, Ahmed Hassan Awadallah, Asli Celikyilmaz, Yang Liu, Xipeng Qiu, Dragomir Radev

As increasing numbers of meetings are recorded and transcribed, meeting summaries have become essential to remind those who may or may not have attended the meetings about the key decisions made and the tasks to be completed.

Meeting Summarization

Does syntax matter? A strong baseline for Aspect-based Sentiment Analysis with RoBERTa

1 code implementation • NAACL 2021 • Junqi Dai, Hang Yan, Tianxiang Sun, PengFei Liu, Xipeng Qiu

In this paper, we first compare the induced trees from PTMs and the dependency parsing trees on several popular models for the ABSA task, showing that the induced tree from fine-tuned RoBERTa (FT-RoBERTa) outperforms the parser-provided tree.

Ranked #5 on Aspect-Based Sentiment Analysis (ABSA) on SemEval-2014 Task-4

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Enhancing Scientific Papers Summarization with Citation Graph

1 code implementation • 7 Apr 2021 • Chenxin An, Ming Zhong, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang

Previous work on text summarization in the scientific domain mainly focused on the content of the input document, seldom considering its citation network.

Text Summarization

TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing

1 code implementation • ACL 2021 • Tao Gui, Xiao Wang, Qi Zhang, Qin Liu, Yicheng Zou, Xin Zhou, Rui Zheng, Chong Zhang, Qinzhuo Wu, Jiacheng Ye, Zexiong Pang, Yongxin Zhang, Zhengyan Li, Ruotian Ma, Zichu Fei, Ruijian Cai, Jun Zhao, Xingwu Hu, Zhiheng Yan, Yiding Tan, Yuan Hu, Qiyuan Bian, Zhihua Liu, Bolin Zhu, Shan Qin, Xiaoyu Xing, Jinlan Fu, Yue Zhang, Minlong Peng, Xiaoqing Zheng, Yaqian Zhou, Zhongyu Wei, Xipeng Qiu, Xuanjing Huang

To guarantee user acceptability, all the text transformations are linguistically based, and we provide a human evaluation for each one.

Adversarial Attack named-entity-recognition +5

Generating Adversarial Examples in Chinese Texts Using Sentence-Pieces

no code implementations • 29 Dec 2020 • Linyang Li, Yunfan Shao, Demin Song, Xipeng Qiu, Xuanjing Huang

The substitutions in the generated adversarial examples are not characters or words but 'pieces', which are more natural to Chinese readers.

Language Modelling Sentence

Finding Sparse Structures for Domain Specific Neural Machine Translation

2 code implementations • 19 Dec 2020 • Jianze Liang, Chengqi Zhao, Mingxuan Wang, Xipeng Qiu, Lei LI

Neural machine translation often adopts the fine-tuning approach to adapt to specific domains.

Domain Adaptation Machine Translation +1

Fork or Fail: Cycle-Consistent Training with Many-to-One Mappings

1 code implementation • 14 Dec 2020 • Qipeng Guo, Zhijing Jin, Ziyu Wang, Xipeng Qiu, Weinan Zhang, Jun Zhu, Zheng Zhang, David Wipf

Cycle-consistent training is widely used for jointly learning a forward and inverse mapping between two domains of interest without the cumbersome requirement of collecting matched pairs within each domain.

Knowledge Graphs Text Generation

GenWiki: A Dataset of 1.3 Million Content-Sharing Text and Graphs for Unsupervised Graph-to-Text Generation

1 code implementation • COLING 2020 • Zhijing Jin, Qipeng Guo, Xipeng Qiu, Zheng Zhang

With a human-annotated test set, we provide this new benchmark dataset for future research on unsupervised text generation from knowledge graphs.

Ranked #1 on Unsupervised KG-to-Text Generation on GenWiki (Fine)

Knowledge Graphs Text Generation +1

Text Information Aggregation with Centrality Attention

no code implementations • 16 Nov 2020 • Jingjing Gong, Hang Yan, Yining Zheng, Xipeng Qiu, Xuanjing Huang

Many natural language processing problems require encoding a text sequence as a fixed-length vector, which usually involves an aggregation process that combines the representations of all the words, such as pooling or self-attention.

Sentence text-classification +1

Pre-training with Meta Learning for Chinese Word Segmentation

no code implementations • NAACL 2021 • Zhen Ke, Liang Shi, Songtao Sun, Erli Meng, Bin Wang, Xipeng Qiu

Recent research shows that pre-trained models (PTMs) are beneficial to Chinese Word Segmentation (CWS).

Chinese Word Segmentation Language Modelling +2

CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems

2 code implementations • Findings of the Association for Computational Linguistics 2020 • Yiran Chen, PengFei Liu, Ming Zhong, Zi-Yi Dou, Danqing Wang, Xipeng Qiu, Xuanjing Huang

In this paper, we perform an in-depth analysis of characteristics of different datasets and investigate the performance of different summarization models under a cross-dataset setting, in which a summarizer trained on one corpus will be evaluated on a range of out-of-domain corpora.

Text Summarization

Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

1 code implementation • EMNLP 2020 • Zehui Lin, Xiao Pan, Mingxuan Wang, Xipeng Qiu, Jiangtao Feng, Hao Zhou, Lei LI

We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as the common seed and obtain derivative and improved models on arbitrary language pairs?

Ranked #3 on Machine Translation on WMT2014 English-French (using extra training data)

Machine Translation Translation

CoLAKE: Contextualized Language and Knowledge Embedding

1 code implementation • COLING 2020 • Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, Zheng Zhang

With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models.

Entity Embeddings Knowledge Graph Completion +1

BERT for Monolingual and Cross-Lingual Reverse Dictionary

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Hang Yan, Xiaonan Li, Xipeng Qiu

Reverse dictionary is the task of finding the proper target word given a description of the word.

Reverse Dictionary Word Embeddings

AutoRC: Improving BERT Based Relation Classification Models via Architecture Search

no code implementations • ACL 2021 • Wei Zhu, Xipeng Qiu, Yuan Ni, Guotong Xie

An ablation study demonstrates the necessity of our search space design and the effectiveness of our search method.

General Classification Neural Architecture Search +2

fastHan: A BERT-based Multi-Task Toolkit for Chinese NLP

1 code implementation • ACL 2021 • Zhichao Geng, Hang Yan, Xipeng Qiu, Xuanjing Huang

The joint model is trained and evaluated on 13 corpora of four tasks, yielding near state-of-the-art (SOTA) performance in dependency parsing and NER, and SOTA performance in CWS and POS.

Chinese Word Segmentation Dependency Parsing +6

AutoTrans: Automating Transformer Design via Reinforced Architecture Search

3 code implementations • 4 Sep 2020 • Wei Zhu, Xiaoling Wang, Xipeng Qiu, Yuan Ni, Guotong Xie

Though transformer architectures have shown dominance in many natural language understanding tasks, there are still unsolved issues in training transformer models, especially the need for a principled warm-up scheme, which has been shown to be important for stable training of a transformer, and the question of whether the task at hand prefers scaling the attention product or not.

Natural Language Understanding Navigate

Improving Image Captioning with Better Use of Caption

no code implementations • ACL 2020 • Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision communities.

Caption Generation Image Captioning +3

Improving Image Captioning with Better Use of Captions

1 code implementation • 21 Jun 2020 • Zhan Shi, Xu Zhou, Xipeng Qiu, Xiaodan Zhu

Image captioning is a multimodal problem that has drawn extensive attention in both the natural language processing and computer vision communities.

Caption Generation Image Captioning +3

CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training

2 code implementations • ACL (WebNLG, INLG) 2020 • Qipeng Guo, Zhijing Jin, Xipeng Qiu, Wei-Nan Zhang, David Wipf, Zheng Zhang

Due to the difficulty and high cost of data collection, the supervised data available in the two fields are usually on the order of tens of thousands, for example, 18K in the WebNLG 2017 dataset after preprocessing, which is far fewer than the millions of examples available for other tasks such as machine translation.

Graph Generation Knowledge Graphs +2

Relation of the Relations: A New Paradigm of the Relation Extraction Problem

1 code implementation • 5 Jun 2020 • Zhijing Jin, Yongyi Yang, Xipeng Qiu, Zheng Zhang

In natural language, multiple entities often appear in the same text.

Relation Relation Extraction

TAVAT: Token-Aware Virtual Adversarial Training for Language Understanding

1 code implementation • 30 Apr 2020 • Linyang Li, Xipeng Qiu

Gradient-based adversarial training is widely used to improve the robustness of neural networks, but it cannot be easily adapted to natural language processing tasks since the embedding space is discrete.

Natural Language Understanding text-classification +1

Heterogeneous Graph Neural Networks for Extractive Document Summarization

1 code implementation • ACL 2020 • Danqing Wang, PengFei Liu, Yining Zheng, Xipeng Qiu, Xuanjing Huang

An intuitive way is to put them in the graph-based neural network, which has a more complex structure for capturing inter-sentence relationships.

Document Summarization Extractive Document Summarization +3

FLAT: Chinese NER Using Flat-Lattice Transformer

1 code implementation • ACL 2020 • Xiaonan Li, Hang Yan, Xipeng Qiu, Xuanjing Huang

Recently, the character-word lattice structure has been proved to be effective for Chinese named entity recognition (NER) by incorporating the word information.

Ranked #5 on Chinese Named Entity Recognition on MSRA

Chinese Named Entity Recognition named-entity-recognition +3

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

4 code implementations • EMNLP 2020 • Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, Xipeng Qiu

Adversarial attacks on discrete data (such as text) have proven significantly more challenging than attacks on continuous data (such as images), since it is difficult to generate adversarial samples with gradient-based methods.

Adversarial Attack

2,799

Paper
Code

Extractive Summarization as Text Matching

2 code implementations • ACL 2020 • Ming Zhong, PengFei Liu, Yiran Chen, Danqing Wang, Xipeng Qiu, Xuanjing Huang

This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.

Ranked #1 on Text Summarization on BBC XSum

Document Summarization Extractive Summarization +4

518

Paper
Code

Unified Multi-Criteria Chinese Word Segmentation with BERT

no code implementations • 13 Apr 2020 • Zhen Ke, Liang Shi, Erli Meng, Bin Wang, Xipeng Qiu, Xuanjing Huang

In addition, the pre-trained BERT language model has also been introduced into the MCCWS task in a multi-task learning framework.

Chinese Word Segmentation Language Modelling +3

Paper
Add Code

Pre-trained Models for Natural Language Processing: A Survey

3 code implementations • 18 Mar 2020 • Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang

Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era.

Representation Learning

2,027

Paper
Code

Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation

1 code implementation • 24 Feb 2020 • Yige Xu, Xipeng Qiu, Ligao Zhou, Xuanjing Huang

Fine-tuning pre-trained language models like BERT has become an effective way in NLP and yields state-of-the-art results on many downstream tasks.

Natural Language Inference text-classification +1

Paper
Code

Multi-Scale Self-Attention for Text Classification

no code implementations • 2 Dec 2019 • Qipeng Guo, Xipeng Qiu, PengFei Liu, Xiangyang Xue, Zheng Zhang

In this paper, we introduce prior knowledge, namely multi-scale structure, into self-attention modules.

General Classification text-classification +1

Paper
Add Code

Joint Parsing and Generation for Abstractive Summarization

2 code implementations • 23 Nov 2019 • Kaiqiang Song, Logan Lebanoff, Qipeng Guo, Xipeng Qiu, Xiangyang Xue, Chen Li, Dong Yu, Fei Liu

If generating a word can introduce an erroneous relation to the summary, the behavior must be discouraged.

Ranked #27 on Text Summarization on GigaWord

Abstractive Text Summarization Decoder +1

Paper
Code

Learning Sparse Sharing Architectures for Multiple Tasks

1 code implementation • 12 Nov 2019 • Tianxiang Sun, Yunfan Shao, Xiaonan Li, PengFei Liu, Hang Yan, Xipeng Qiu, Xuanjing Huang

Most existing deep multi-task learning models are based on parameter sharing, such as hard sharing, hierarchical sharing, and soft sharing.

Multi-Task Learning

Paper
Code

BP-Transformer: Modelling Long-Range Context via Binary Partitioning

2 code implementations • 11 Nov 2019 • Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang

The Transformer model is widely successful on many natural language processing tasks.

Ranked #1 on Machine Translation on IWSLT2015 Chinese-English

Language Modelling Machine Translation +4

13,129

Paper
Code

TENER: Adapting Transformer Encoder for Named Entity Recognition

7 code implementations • 10 Nov 2019 • Hang Yan, Bocao Deng, Xiaonan Li, Xipeng Qiu

Bidirectional long short-term memory networks (BiLSTMs) have been widely used as encoders in models solving the named entity recognition (NER) task.

Ranked #11 on Chinese Named Entity Recognition on Resume NER

Chinese Named Entity Recognition Named Entity Recognition

4,848

Paper
Code

A Closer Look at Data Bias in Neural Extractive Summarization Models

no code implementations • WS 2019 • Ming Zhong, Danqing Wang, PengFei Liu, Xipeng Qiu, Xuanjing Huang

In this paper, we take stock of the current state of summarization datasets and explore how different factors of datasets influence the generalization behaviour of neural extractive summarization models.

Extractive Summarization

Paper
Add Code

Exploring Domain Shift in Extractive Text Summarization

no code implementations • 30 Aug 2019 • Danqing Wang, PengFei Liu, Ming Zhong, Jie Fu, Xipeng Qiu, Xuanjing Huang

Although domain shift has been well explored in many NLP applications, it has still received little attention in the domain of extractive text summarization.

Extractive Text Summarization Meta-Learning

Paper
Add Code

GlossBERT: BERT for Word Sense Disambiguation with Gloss Knowledge

3 code implementations • IJCNLP 2019 • Luyao Huang, Chi Sun, Xipeng Qiu, Xuanjing Huang

Word Sense Disambiguation (WSD) aims to find the exact sense of an ambiguous word in a particular context.

Ranked #3 on Word Sense Disambiguation on WiC-TSV

Word Sense Disambiguation

Paper
Code

DropAttention: A Regularization Method for Fully-Connected Self-Attention Networks

no code implementations • 25 Jul 2019 • Lin Zehui, PengFei Liu, Luyao Huang, Junkun Chen, Xipeng Qiu, Xuanjing Huang

Variants of dropout methods have been designed for the fully-connected layer, convolutional layer and recurrent layer in neural networks, and have been shown to be effective in avoiding overfitting.

Paper
Add Code

Searching for Effective Neural Extractive Summarization: What Works and What's Next

2 code implementations • ACL 2019 • Ming Zhong, PengFei Liu, Danqing Wang, Xipeng Qiu, Xuanjing Huang

Recent years have seen remarkable success in the use of deep neural networks for text summarization.

Ranked #6 on Extractive Text Summarization on CNN / Daily Mail

Extractive Summarization Extractive Text Summarization

Paper
Code

A Concise Model for Multi-Criteria Chinese Word Segmentation with Transformer Encoder

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Xipeng Qiu, Hengzhi Pei, Hang Yan, Xuanjing Huang

Multi-criteria Chinese word segmentation (MCCWS) aims to exploit the relations among the multiple heterogeneous segmentation criteria and further improve the performance of each single criterion.

Chinese Word Segmentation Multi-Task Learning +1

Paper
Code

How to Fine-Tune BERT for Text Classification?

16 code implementations • 14 May 2019 • Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang

Language model pre-training has proven to be useful in learning universal language representations.

Ranked #1 on Text Classification on Yahoo! Answers

General Classification Language Modelling +2

597

Paper
Code

Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation

4 code implementations • ACL 2019 • Ning Dai, Jianze Liang, Xipeng Qiu, Xuanjing Huang

Disentangling the content and style in the latent space is prevalent in unpaired text style transfer.

Decoder Sentence +2

170

Paper
Code

A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing

1 code implementation • TACL 2020 • Hang Yan, Xipeng Qiu, Xuanjing Huang

Our graph-based joint model achieves better performance than previous joint models and state-of-the-art results in both Chinese word segmentation and dependency parsing.

Chinese Word Segmentation Dependency Parsing +3

Paper
Code

Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence

8 code implementations • NAACL 2019 • Chi Sun, Luyao Huang, Xipeng Qiu

Aspect-based sentiment analysis (ABSA), which aims to identify fine-grained opinion polarity towards a specific aspect, is a challenging subtask of sentiment analysis (SA).

Ranked #1 on Aspect-Based Sentiment Analysis (ABSA) on SemEval 2014 Task 4 Subtask 4

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +5

494

Paper
Code

Star-Transformer

2 code implementations • NAACL 2019 • Qipeng Guo, Xipeng Qiu, PengFei Liu, Yunfan Shao, Xiangyang Xue, Zheng Zhang

Although Transformer has achieved great successes on many NLP tasks, its heavy structure with fully-connected attention connections leads to dependencies on large training data.

Ranked #13 on Sentiment Analysis on SST-5 Fine-grained classification

Named Entity Recognition (NER) Natural Language Inference +2

13,131

Paper
Code

VCWE: Visual Character-Enhanced Word Embeddings

1 code implementation • NAACL 2019 • Chi Sun, Xipeng Qiu, Xuanjing Huang

Chinese is a logographic writing system, and the shapes of Chinese characters contain rich syntactic and semantic information.

named-entity-recognition Named Entity Recognition +5

Paper
Code

Switch-LSTMs for Multi-Criteria Chinese Word Segmentation

no code implementations • 19 Dec 2018 • Jingjing Gong, Xinchi Chen, Tao Gui, Xipeng Qiu

With these auto-switched LSTMs, our model provides a more flexible solution for multi-criteria CWS and makes it easy to transfer the learned knowledge to new criteria.

Chinese Word Segmentation Segmentation

Paper
Add Code

Multi-task Learning over Graph Structures

no code implementations • 26 Nov 2018 • Pengfei Liu, Jie Fu, Yue Dong, Xipeng Qiu, Jackie Chi Kit Cheung

We present two architectures for multi-task learning with neural sequence models.

General Classification Multi-Task Learning +2

Paper
Add Code

U-Net: Machine Reading Comprehension with Unanswerable Questions

1 code implementation • 12 Oct 2018 • Fu Sun, Linyang Li, Xipeng Qiu, Yang Liu

A key subtask is to reliably predict whether the question is unanswerable.

Ranked #12 on Question Answering on SQuAD2.0 dev

Machine Reading Comprehension Question Answering

Paper
Code

Convolutional Interaction Network for Natural Language Inference

no code implementations • EMNLP 2018 • Jingjing Gong, Xipeng Qiu, Xinchi Chen, Dong Liang, Xuanjing Huang

Attention-based neural models have achieved great success in natural language inference (NLI).

Information Retrieval Natural Language Inference +2

Paper
Add Code

A Simple yet Effective Joint Training Method for Cross-Lingual Universal Dependency Parsing

no code implementations • CONLL 2018 • Danlu Chen, Mengxiao Lin, Zhifeng Hu, Xipeng Qiu

This paper describes Fudan's submission to CoNLL 2018's shared task Universal Dependency Parsing.

Dependency Parsing Transfer Learning

Paper
Add Code

Deformable Stacked Structure for Named Entity Recognition

no code implementations • 24 Sep 2018 • Shuyang Cao, Xipeng Qiu, Xuanjing Huang

Neural architecture for named entity recognition has achieved great success in the field of natural language processing.

Decoder named-entity-recognition +2

Paper
Add Code

Neural Arithmetic Expression Calculator

no code implementations • 23 Sep 2018 • Kaiyu Chen, Yihan Dong, Xipeng Qiu, Zitian Chen

With curriculum learning, our model can deal with a complex arithmetic expression calculation with the deep hierarchical structure of skill models.

Hierarchical Reinforcement Learning

Paper
Add Code

Exploring Shared Structures and Hierarchies for Multiple NLP Tasks

no code implementations • 23 Aug 2018 • Junkun Chen, Kaiyu Chen, Xinchi Chen, Xipeng Qiu, Xuanjing Huang

Designing shared neural architecture plays an important role in multi-task learning.

General Classification Multi-Task Learning +5

Paper
Add Code

Gaussian Word Embedding with a Wasserstein Distance Loss

no code implementations • 21 Aug 2018 • Chi Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang

Therefore, with the aim of representing words in a highly efficient way, we propose to operate a Gaussian word embedding model with a loss function based on the Wasserstein distance.

Document Classification General Classification +1

Paper
Add Code

Top-Down Tree Structured Text Generation

no code implementations • 14 Aug 2018 • Qipeng Guo, Xipeng Qiu, Xiangyang Xue, Zheng Zhang

Text generation is a fundamental building block in natural language processing tasks.

Sentence Text Generation

Paper
Add Code

Information Aggregation via Dynamic Routing for Sequence Encoding

2 code implementations • COLING 2018 • Jingjing Gong, Xipeng Qiu, Shaojing Wang, Xuanjing Huang

The dynamic routing policy dynamically decides what and how much information needs to be transferred from each word to the final encoding of the text sequence.

Ranked #44 on Sentiment Analysis on IMDb

Sentiment Analysis text-classification +1

Paper
Code

Toward Diverse Text Generation with Inverse Reinforcement Learning

3 code implementations • 30 Apr 2018 • Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang

Similar to the adversarial models, the reward and policy functions in IRL are optimized alternately.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Code

Same Representation, Different Attentions: Shareable Sentence Representation Learning from Multiple Tasks

no code implementations • 22 Apr 2018 • Renjie Zheng, Junkun Chen, Xipeng Qiu

More specifically, all tasks share the same sentence representation, and each task can select task-specific information from the shared sentence representation with an attention mechanism.

General Classification Multi-Task Learning +4

Paper
Add Code

Meta Multi-Task Learning for Sequence Modeling

no code implementations • 25 Feb 2018 • Junkun Chen, Xipeng Qiu, Pengfei Liu, Xuanjing Huang

Specifically, we use a shared meta-network to capture the meta-knowledge of semantic composition and generate the parameters of the task-specific semantic composition models.

Multi-Task Learning Representation Learning +3

Paper
Add Code

Incorporating Discriminator in Sentence Generation: a Gibbs Sampling Method

no code implementations • 25 Feb 2018 • Jinyue Su, Jiacheng Xu, Xipeng Qiu, Xuanjing Huang

Generating plausible and fluent sentences with desired properties has long been a challenge.

Sentence

Paper
Add Code

Idiom-Aware Compositional Distributed Semantics

no code implementations • EMNLP 2017 • Pengfei Liu, Kaiyu Qian, Xipeng Qiu, Xuanjing Huang

Idioms are peculiar linguistic constructions that impose great challenges for representing the semantics of language, especially in current prevailing end-to-end neural models, which assume that the semantics of a phrase or sentence can be literally composed from its constitutive words.

General Classification Machine Translation +4

Paper
Add Code

DAG-based Long Short-Term Memory for Neural Word Segmentation

no code implementations • 2 Jul 2017 • Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang

In this paper, we propose a new neural model to incorporate the word-level information for Chinese word segmentation.

Chinese Word Segmentation Feature Engineering +2

Paper
Add Code

Overview of the NLPCC 2017 Shared Task: Chinese News Headline Categorization

1 code implementation • 9 Jun 2017 • Xipeng Qiu, Jingjing Gong, Xuanjing Huang

In this paper, we give an overview of the shared task at the CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2017): Chinese News Headline Categorization.

116

Paper
Code

Dynamic Compositional Neural Networks over Tree Structure

no code implementations • 11 May 2017 • Pengfei Liu, Xipeng Qiu, Xuanjing Huang

Tree-structured neural networks have proven to be effective in learning semantic representations by exploiting syntactic information.

Learning Semantic Representations

Paper
Add Code

Reinforced Mnemonic Reader for Machine Reading Comprehension

3 code implementations • 8 May 2017 • Minghao Hu, Yuxing Peng, Zhen Huang, Xipeng Qiu, Furu Wei, Ming Zhou

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects.

Ranked #17 on Question Answering on SQuAD1.1 dev

Machine Reading Comprehension Question Answering +2

135

Paper
Code

Adversarial Multi-Criteria Learning for Chinese Word Segmentation

no code implementations • ACL 2017 • Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang

Different linguistic perspectives cause many diverse segmentation criteria for Chinese word segmentation (CWS).

Chinese Word Segmentation Segmentation

Paper
Add Code

Adversarial Multi-task Learning for Text Classification

no code implementations • ACL 2017 • Pengfei Liu, Xipeng Qiu, Xuanjing Huang

Neural network models have shown promise for multi-task learning, which focuses on learning shared layers to extract common, task-invariant features.

General Classification Multi-Task Learning +2

Paper
Add Code

Knowledge Graph Representation with Jointly Structural and Textual Encoding

no code implementations • 26 Nov 2016 • Jiacheng Xu, Kan Chen, Xipeng Qiu, Xuanjing Huang

In this paper, we propose a novel deep architecture to utilize both structural and textual information of entities.

General Classification Knowledge Graph Embedding +2

Paper
Add Code

A Feature-Enriched Neural Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

no code implementations • 16 Nov 2016 • Xinchi Chen, Xipeng Qiu, Xuanjing Huang

Recently, neural network models for natural language processing tasks have attracted increasing attention for their ability to alleviate the burden of manual feature engineering.

Chinese Word Segmentation Feature Engineering +1