no code implementations • Findings (ACL) 2022 • Hao Cheng, Zhihua Zhang
The Conditional Masked Language Model (CMLM) is a strong baseline for non-autoregressive translation (NAT).
no code implementations • 5 Jun 2024 • Keyu Chen, YuHeng Lei, Hao Cheng, Haoran Wu, Wenchao Sun, Sifa Zheng
Generating safety-critical scenarios, which are essential yet difficult to collect at scale, offers an effective method to evaluate the robustness of autonomous vehicles (AVs).
1 code implementation • 4 Jun 2024 • Jiahang Cao, Qiang Zhang, Ziqing Wang, Jiaxu Wang, Hao Cheng, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, Renjing Xu
Sequential modeling has demonstrated remarkable capabilities in offline reinforcement learning (RL), with Decision Transformer (DT) being one of the most notable representatives, achieving significant success.
no code implementations • 30 May 2024 • Hao Cheng, Erjia Xiao, Jiahang Cao, Le Yang, Kaidi Xu, Jindong Gu, Renjing Xu
Following the advent of the Artificial Intelligence (AI) era of large models, Multimodal Large Language Models (MLLMs) with the ability to understand cross-modal interactions between vision and text have attracted wide attention.
1 code implementation • 6 May 2024 • Zhizhao Duan, Hao Cheng, Duo Xu, Xi Wu, Xiangxie Zhang, Xi Ye, Zhen Xie
In the vast and dynamic landscape of urban settings, Traffic Safety Description and Analysis plays a pivotal role in applications ranging from insurance inspection to accident prevention.
1 code implementation • 19 Mar 2024 • Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, Mari Ostendorf
Transformer-based NLP models are powerful but have high computational costs that limit deployment.
1 code implementation • 15 Mar 2024 • Chong Wang, Lanqing Guo, YuFei Wang, Hao Cheng, Yi Yu, Bihan Wen
Starting from decomposing the original maximum-a-posteriori problem of accelerated MRI, we present a rigorous derivation of the proposed PDAC framework, which could be further unfolded into an end-to-end trainable network.
no code implementations • 29 Feb 2024 • Hao Cheng, Erjia Xiao, Jindong Gu, Le Yang, Jinhao Duan, Jize Zhang, Jiahang Cao, Kaidi Xu, Renjing Xu
Large Vision-Language Models (LVLMs) rely on vision encoders and Large Language Models (LLMs) to exhibit remarkable capabilities on various multi-modal tasks in the joint space of vision and language.
no code implementations • 6 Feb 2024 • Yiming Xu, Hao Cheng, Monika Sester
These issues cause existing methods to lose both predictive diversity and adherence to scene constraints.
1 code implementation • 3 Feb 2024 • Hao Cheng, Qingsong Wen, Yang Liu, Liang Sun
Time series forecasting is an important task at the forefront of many real-world applications.
no code implementations • 28 Dec 2023 • Yan Ding, Hao Cheng, Ziliang Ye, Ruyi Feng, Wei Tian, Peng Xie, Juan Zhang, Zhongze Gu
We fine-tuned our proposed pre-trained model on six molecular property prediction tasks (MoleculeNet datasets) and two generative tasks (ZINC250K datasets), achieving state-of-the-art (SOTA) results on five out of eight tasks.
no code implementations • 18 Dec 2023 • Shanli Tan, Hao Cheng, Xiaohu Wu, Han Yu, Tiantian He, Yew-Soon Ong, Chongjun Wang, Xiaofeng Tao
Federated learning (FL) provides a privacy-preserving approach for collaborative training of machine learning models.
no code implementations • 23 Nov 2023 • Fei Kong, Jinhao Duan, Lichao Sun, Hao Cheng, Renjing Xu, HengTao Shen, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu
Though diffusion models excel in image generation, their step-by-step denoising leads to slow generation speeds.
1 code implementation • 19 Nov 2023 • Zhaowei Zhu, Jialu Wang, Hao Cheng, Yang Liu
Given the cost and difficulty of cleaning these datasets by humans, we introduce a systematic framework for evaluating the credibility of datasets, identifying label errors, and evaluating the influence of noisy labels in the curated language data, specifically focusing on unsafe comments and conversation classification.
no code implementations • 18 Nov 2023 • Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Le Yang, Jize Zhang, Xue Lin, Bhavya Kailkhura, Kaidi Xu, Renjing Xu
It posits that within dense neural networks, there exist winning tickets or subnetworks that are sparser but do not compromise performance.
no code implementations • 16 Nov 2023 • Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf
Large language models (LLMs) have revolutionized the landscape of Natural Language Processing systems, but are computationally expensive.
1 code implementation • 16 Nov 2023 • Yiqing Xie, Sheng Zhang, Hao Cheng, PengFei Liu, Zelalem Gero, Cliff Wong, Tristan Naumann, Hoifung Poon, Carolyn Rose
Medical text generation aims to assist with administrative work and highlight salient information to support decision-making.
no code implementations • 11 Nov 2023 • Haoyuan Li, Hao Jiang, Tianke Zhang, Zhelun Yu, Aoxiong Yin, Hao Cheng, Siming Fu, Yuhao Zhang, Wanggui He
We anticipate that our work will contribute to the advancement of research on TrainerAgent in both academic and industry communities, potentially establishing it as a new paradigm for model development in the field of AI.
1 code implementation • 9 Nov 2023 • Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li
LLaVA-Plus is a general-purpose multimodal assistant that expands the capabilities of large multimodal models.
Ranked #1 on LMM real-life tasks on Leaderboard
no code implementations • 19 Oct 2023 • Xiaodong Yu, Hao Cheng, Xiaodong Liu, Dan Roth, Jianfeng Gao
Specifically, given the potential of data contamination (e.g., leading to memorization), good static benchmark performance does not ensure that a model can reliably use the provided evidence when responding, which is essential to avoid hallucination when the required knowledge is new or private.
no code implementations • 17 Oct 2023 • Qinrui Tang, Hao Cheng
The widespread use of smartphones has made Inertial Measurement Units broadly available, providing a wide range of sensory data that can be advantageous for detecting transportation modes.
no code implementations • 11 Oct 2023 • chengyu dong, Liyuan Liu, Hao Cheng, Jingbo Shang, Jianfeng Gao, Xiaodong Liu
Although ELECTRA offers a significant boost in efficiency, its potential is constrained by the training cost brought by the auxiliary model.
1 code implementation • 3 Oct 2023 • Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao
To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks.
no code implementations • 23 Sep 2023 • Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Renjing Xu
Deploying energy-efficient deep learning algorithms on computationally limited devices, such as robots, is still a pressing issue for real-world applications.
no code implementations • 23 Sep 2023 • Hao Cheng, Jinhao Duan, Hui Li, Lyutianyang Zhang, Jiahang Cao, Ping Wang, Jize Zhang, Kaidi Xu, Renjing Xu
Recently, there has been a surge of interest and attention in Transformer-based structures, such as Vision Transformer (ViT) and Vision Multilayer Perceptron (VMLP).
1 code implementation • 21 Sep 2023 • Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Abdulmohsen Alharthi, Bang An, Juncai He, Ziche Liu, Zhiyi Zhang, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu
This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language imbued with unique cultural characteristics inadequately addressed by current mainstream models.
1 code implementation • 10 Aug 2023 • Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li
However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations.
no code implementations • 9 Aug 2023 • Hao Cheng, Mengmeng Liu, Lin Chen
Perception, which involves multi-object detection and tracking, and trajectory prediction are two major tasks of autonomous driving.
no code implementations • 31 Jul 2023 • Mingcai Chen, Yuntao Du, Wei Tang, Baoming Zhang, Hao Cheng, Shuwei Qian, Chongjun Wang
We introduce LaplaceConfidence, a method that obtains label confidence (i.e., clean probabilities) by utilizing the Laplacian energy.
no code implementations • 17 Jul 2023 • Yan-Jie Zhou, Wei Liu, Yuan Gao, Jing Xu, Le Lu, Yuping Duan, Hao Cheng, Na Jin, Xiaoyong Man, Shuang Zhao, Yu Wang
Skin diseases are among the most prevalent health issues, and accurate computer-aided diagnosis methods are of importance for both dermatologists and patients.
no code implementations • 13 Jul 2023 • Bo-Ru Lu, Nikita Haduong, Chia-Hsuan Lee, Zeqiu Wu, Hao Cheng, Paul Koester, Jean Utke, Tao Yu, Noah A. Smith, Mari Ostendorf
The capabilities of pretrained language models have opened opportunities to explore new application areas, but applications involving human-human interaction are limited by the fact that most data is protected from public release for privacy reasons.
2 code implementations • 3 Jul 2023 • Jinhao Duan, Hao Cheng, Shiqi Wang, Alex Zavalny, Chenan Wang, Renjing Xu, Bhavya Kailkhura, Kaidi Xu
Large Language Models (LLMs) show promising results in language generation and instruction following but frequently "hallucinate", making their outputs less reliable.
1 code implementation • 29 Jun 2023 • Jiahang Cao, Ziqing Wang, Hanzhong Guo, Hao Cheng, Qiang Zhang, Renjing Xu
In our paper, we put forward Spiking Denoising Diffusion Probabilistic Models (SDDPM), a new class of SNN-based generative models that achieve high sample quality.
no code implementations • NeurIPS 2023 • Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei
Such a decoupled memory design can easily cache and update long-term past contexts for memory retrieval without suffering from memory staleness.
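The decoupled design described above can be illustrated with a minimal sketch: past contexts are cached in a memory that lives outside the model and is retrieved by nearest-neighbor lookup. The class and names below are hypothetical stand-ins; the actual method caches transformer key-value states, not strings.

```python
import numpy as np

class ContextMemory:
    """A toy external cache of (key, value) context pairs."""

    def __init__(self):
        self.keys, self.values = [], []

    def update(self, key, value):
        # Cache a past context; updating the cache never touches
        # model weights, which is what makes the memory "decoupled".
        self.keys.append(np.asarray(key, dtype=float))
        self.values.append(value)

    def retrieve(self, query, k=1):
        # Nearest-neighbor lookup over cached keys.
        q = np.asarray(query, dtype=float)
        dists = [np.linalg.norm(q - key) for key in self.keys]
        order = np.argsort(dists)[:k]
        return [self.values[i] for i in order]

mem = ContextMemory()
mem.update([0.0, 1.0], "chapter-1 summary")
mem.update([1.0, 0.0], "chapter-2 summary")
nearest = mem.retrieve([0.9, 0.1], k=1)
```

Because retrieval is a plain lookup, stale entries can simply be overwritten or evicted, which is the intuition behind avoiding memory staleness.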
1 code implementation • 3 Jun 2023 • Wenyu Jiang, Hao Cheng, Mingcai Chen, Chongjun Wang, Hongxin Wei
Modern neural networks are known to give overconfident predictions for out-of-distribution inputs when deployed in the open world.
1 code implementation • 30 May 2023 • Zelalem Gero, Chandan Singh, Hao Cheng, Tristan Naumann, Michel Galley, Jianfeng Gao, Hoifung Poon
Extracting patient information from unstructured text is a critical task in health decision-support and clinical research.
no code implementations • 23 May 2023 • Yu Zhang, Hao Cheng, Zhihong Shen, Xiaodong Liu, Ye-Yi Wang, Jianfeng Gao
Scientific literature understanding tasks have gained significant attention due to their potential to accelerate scientific discovery.
1 code implementation • 4 May 2023 • Kaixin Ma, Hao Cheng, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
Our approach outperforms recent self-supervised retrievers in zero-shot evaluations and achieves state-of-the-art fine-tuned retrieval performance on NQ, HotpotQA and OTT-QA.
Ranked #4 on Question Answering on HotpotQA
no code implementations • 3 May 2023 • Hao Cheng, Meng Zhang, Weixuan Wang, Liangyou Li, Qun Liu, Zhihua Zhang
We can use automatic summarization or machine translation evaluation metrics for length-controllable machine translation, but these metrics are not necessarily suitable or accurate.
no code implementations • 3 May 2023 • Hao Cheng, Meng Zhang, Liangyou Li, Qun Liu, Zhihua Zhang
Utilizing pivot language effectively can significantly improve low-resource machine translation.
no code implementations • 28 Apr 2023 • Jinhao Duan, Quanfu Fan, Hao Cheng, Xiaoshuang Shi, Kaidi Xu
In this paper, we introduce Temporal Adversarial Augmentation (TA), a novel video augmentation technique that utilizes temporal attention.
1 code implementation • NeurIPS 2023 • Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Jianfeng Gao
At the heart of Chameleon is an LLM-based planner that assembles a sequence of tools to execute to generate the final response.
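The plan-then-execute pattern described here can be sketched as follows. The tools and the fixed plan are illustrative stand-ins for the LLM-generated program, not Chameleon's actual API: the planner would emit the tool sequence, and each tool's output is chained into the next.

```python
# Hypothetical tool registry; real systems would wrap search engines,
# vision models, calculators, etc.
TOOLS = {
    "extract_numbers": lambda text: [int(s) for s in text.split() if s.isdigit()],
    "sum": lambda nums: sum(nums),
}

def execute_plan(plan, query):
    """Run the planner's tool sequence, chaining each output."""
    result = query
    for tool_name in plan:
        result = TOOLS[tool_name](result)
    return result

# A plan the planner might produce for a simple arithmetic question.
answer = execute_plan(["extract_numbers", "sum"], "add 3 and 4 apples")
```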
no code implementations • 28 Mar 2023 • Sanxing Chen, Hao Cheng, Xiaodong Liu, Jian Jiao, Yangfeng Ji, Jianfeng Gao
Learning transferable representation of knowledge graphs (KGs) is challenging due to the heterogeneous, multi-relational nature of graph structures.
1 code implementation • 27 Feb 2023 • Mengmeng Liu, Hao Cheng, Lin Chen, Hellward Broszio, Jiangtao Li, Runjiang Zhao, Monika Sester, Michael Ying Yang
Trajectory prediction for autonomous driving must continuously reason about the motion stochasticity of road agents and comply with scene constraints.
no code implementations • 24 Feb 2023 • Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao
Large language models (LLMs), such as ChatGPT, are able to generate human-like, fluent responses for many downstream tasks, e.g., task-oriented dialog and question answering.
no code implementations • 15 Feb 2023 • Weicheng Zhang, Hao Cheng, Fatema T. Johora, Monika Sester
Predicting trajectories of pedestrians based on goal information in highly interactive scenes is a crucial step toward Intelligent Transportation Systems and Autonomous Driving.
1 code implementation • 6 Feb 2023 • Yunshuang Yuan, Hao Cheng, Michael Ying Yang, Monika Sester
Safety is critical for autonomous driving, and one aspect of improving safety is to accurately capture the uncertainties of the perception system, especially knowing the unknown.
1 code implementation • 3 Feb 2023 • Lanqing Guo, Siyu Huang, Ding Liu, Hao Cheng, Bihan Wen
It is still challenging for deep shadow removal models to exploit the global contextual correlation between shadow and non-shadow regions.
Ranked #1 on Shadow Removal on ISTD
no code implementations • ICCV 2023 • Hao Cheng, Siyuan Yang, Joey Tianyi Zhou, Lanqing Guo, Bihan Wen
Few-shot classification aims to learn a discriminative feature representation to recognize unseen classes with few labeled support samples.
1 code implementation • 21 Dec 2022 • Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao, Furu Wei
To this end, we propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts, and create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
no code implementations • 22 Oct 2022 • Zhiying Xu, Jiafan Xu, Hongding Peng, Wei Wang, Xiaoliang Wang, Haoran Wan, Haipeng Dai, Yixu Xu, Hao Cheng, Kun Wang, Guihai Chen
Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware.
2 code implementations • 22 Oct 2022 • Kaixin Ma, Hao Cheng, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
1 code implementation • 11 Oct 2022 • Hao Cheng, Hao Fang, Xiaodong Liu, Jianfeng Gao
Given their effectiveness on knowledge-intensive natural language processing tasks, dense retrieval models have become increasingly popular.
no code implementations • 30 Sep 2022 • Wenjie Li, Qiaolin Xia, Hao Cheng, Kouyin Xue, Shu-Tao Xia
Specifically, we build an inference-efficient single-party student model applicable to the whole sample space and meanwhile maintain the advantage of the federated feature extension.
no code implementations • 26 Sep 2022 • Hao Cheng, Pu Zhao, Yize Li, Xue Lin, James Diffenderfer, Ryan Goldhahn, Bhavya Kailkhura
Recently, Diffenderfer and Kailkhura proposed a new paradigm for learning compact yet highly accurate binary neural networks simply by pruning and quantizing randomly weighted full precision neural networks.
1 code implementation • 16 Sep 2022 • Hao Cheng, Mengmeng Liu, Lin Chen, Hellward Broszio, Monika Sester, Michael Ying Yang
This paper proposes an attention-based graph model, named GATraj, which achieves a good balance of prediction accuracy and inference speed.
1 code implementation • 30 Aug 2022 • Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon
We present a bi-encoder framework for named entity recognition (NER), which applies contrastive learning to map candidate text spans and entity types into the same vector representation space.
Ranked #1 on Named Entity Recognition (NER) on BC5CDR
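A toy sketch of the bi-encoder matching step: one encoder embeds candidate spans, another embeds entity-type descriptions into the same space, and a span is labeled with its most similar type. The vectors below are hypothetical stand-ins for encoder outputs.

```python
import numpy as np

def cosine(u, v):
    # Similarity in the shared span/type embedding space.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

span_vec = np.array([0.9, 0.1, 0.2])          # e.g. the span "aspirin"
type_vecs = {                                  # type-description embeddings
    "Chemical": np.array([1.0, 0.0, 0.1]),
    "Disease":  np.array([0.0, 1.0, 0.3]),
}
best_type = max(type_vecs, key=lambda t: cosine(span_vec, type_vecs[t]))
```

Contrastive training would pull each gold (span, type) pair together and push mismatched pairs apart, so this nearest-type decision becomes reliable.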
1 code implementation • 2 Jul 2022 • Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi
In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable.
no code implementations • 15 Jun 2022 • Wenyu Jiang, Yuxin Ge, Hao Cheng, Mingcai Chen, Shuai Feng, Chongjun Wang
We propose a novel method, READ (Reconstruction Error Aggregated Detector), to unify inconsistencies from classifier and autoencoder.
no code implementations • 31 May 2022 • Wenjie Li, Qiaolin Xia, Junfeng Deng, Hao Cheng, Jiangming Liu, Kouying Xue, Yong Cheng, Shu-Tao Xia
As an emerging secure learning paradigm in leveraging cross-agency private data, vertical federated learning (VFL) is expected to improve advertising models by enabling the joint learning of complementary user attributes privately owned by the advertiser and the publisher.
1 code implementation • 24 May 2022 • Bo-Ru Lu, Yushi Hu, Hao Cheng, Noah A. Smith, Mari Ostendorf
Human conversations can evolve in many different ways, creating challenges for automatic understanding and summarization.
1 code implementation • 20 May 2022 • Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei
With the visually-augmented context, VaLM uses a visual knowledge fusion layer to enable multimodal grounded language modeling by attending to both text context and visual knowledge in images.
2 code implementations • 19 May 2022 • Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, Yixuan Li
Our method is motivated by the analysis that the norm of the logit keeps increasing during training, leading to overconfident output.
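The fix this analysis suggests, constraining the logit norm before computing cross-entropy, can be sketched as below. This is a minimal single-example version; `tau` is a hypothetical temperature hyperparameter.

```python
import numpy as np

def logit_norm_cross_entropy(logits, target, tau=0.04):
    """Cross-entropy on L2-normalized logits: dividing by the logit
    norm caps the magnitude growth that drives overconfidence."""
    z = np.asarray(logits, dtype=float)
    z = z / (np.linalg.norm(z) * tau + 1e-12)  # fix the norm at 1/tau
    z = z - z.max()                            # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]
```

Note the loss is invariant to rescaling the raw logits, so training can no longer reduce it simply by inflating logit magnitudes.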
no code implementations • 17 Feb 2022 • Da Yin, Li Dong, Hao Cheng, Xiaodong Liu, Kai-Wei Chang, Furu Wei, Jianfeng Gao
With the increase in model capacity brought by pre-trained language models, there is a growing need for more knowledgeable natural language processing (NLP) models with advanced functionalities, including providing and making flexible use of encyclopedic and commonsense knowledge.
no code implementations • 4 Feb 2022 • Yang Liu, Hao Cheng, Kun Zhang
When label noise transition depends on each instance, the problem of identifying the instance-dependent noise transition matrix becomes substantially more challenging.
no code implementations • 15 Dec 2021 • Robert Tinn, Hao Cheng, Yu Gu, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
Overall, domain-specific vocabulary and pretraining facilitate more robust models for fine-tuning.
no code implementations • 15 Dec 2021 • Sheng Zhang, Hao Cheng, Shikhar Vashishth, Cliff Wong, Jinfeng Xiao, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
Zero-shot entity linking has emerged as a promising direction for generalizing to new entities, but it still requires example gold entity mentions during training and canonical descriptions for all entities, both of which are rarely available outside of Wikipedia.
no code implementations • 6 Dec 2021 • Mingcai Chen, Hao Cheng, Yuntao Du, Ming Xu, Wenyu Jiang, Chongjun Wang
We show that our method successfully alleviates the damage of both label noise and confirmation bias.
Ranked #2 on Image Classification on mini WebVision 1.0
2 code implementations • 6 Dec 2021 • Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang
In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.
Ranked #1 on Common Sense Reasoning on CommonsenseQA (using extra training data)
1 code implementation • 4 Nov 2021 • Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Greg Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao
We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks.
1 code implementation • 1 Nov 2021 • Fanxu Meng, Hao Cheng, Jiaxin Zhuang, Ke Li, Xing Sun
In this paper, we aim to remedy this problem and propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on ResBlock.
2 code implementations • ICLR 2022 • Jiaheng Wei, Zhaowei Zhu, Hao Cheng, Tongliang Liu, Gang Niu, Yang Liu
These observations require us to rethink the treatment of noisy labels, and we hope the availability of these two datasets would facilitate the development and evaluation of future learning with noisy label solutions.
1 code implementation • 18 Oct 2021 • Hao Cheng, Zhaowei Zhu, Xing Sun, Yang Liu
Designing robust loss functions is popular in learning with noisy labels while existing designs did not explicitly consider the overfitting property of deep neural networks (DNNs).
1 code implementation • ACL 2022 • Kaixin Ma, Hao Cheng, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
The retriever-reader framework is popular for open-domain question answering (ODQA) due to its ability to use explicit knowledge.
no code implementations • 29 Sep 2021 • Zhaowei Zhu, Zihao Dong, Hao Cheng, Yang Liu
In this paper, given good representations, we propose a universally applicable and training-free solution to detect noisy labels.
1 code implementation • 26 Sep 2021 • Hao Cheng, YuFei Wang, Haoliang Li, Alex C. Kot, Bihan Wen
In this work, we propose a novel Disentangled Feature Representation framework, dubbed DFR, for few-shot learning applications.
1 code implementation • 23 Sep 2021 • Yunshuang Yuan, Hao Cheng, Monika Sester
Sharing collective perception messages (CPM) between vehicles is investigated to decrease occlusions so as to improve the perception accuracy and safety of autonomous driving.
1 code implementation • EMNLP 2021 • Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf
Task-oriented conversational systems often use dialogue state tracking to represent the user's intentions, which involves filling in values of pre-defined slots.
Ranked #1 on Dialogue State Tracking on MULTIWOZ 2.1 (MultiWOZ (Joint Goal Acc) metric)
1 code implementation • 13 Sep 2021 • YuFei Wang, Haoliang Li, Hao Cheng, Bihan Wen, Lap-Pui Chau, Alex C. Kot
Domain generalization aims to learn an invariant model that can generalize well to the unseen target domain.
no code implementations • 25 Jun 2021 • Yu Wang, Jinchao Li, Tristan Naumann, Chenyan Xiong, Hao Cheng, Robert Tinn, Cliff Wong, Naoto Usuyama, Richard Rogahn, Zhihong Shen, Yang Qin, Eric Horvitz, Paul N. Bennett, Jianfeng Gao, Hoifung Poon
A prominent case in point is the explosion of the biomedical literature on COVID-19, which swelled to hundreds of thousands of papers in a matter of months.
1 code implementation • 1 Jun 2021 • Hao Cheng, Kim-Hui Yap, Bihan Wen
Recent image classification algorithms, by learning deep features from large-scale datasets, have achieved significantly better results compared to the classic feature-based approaches.
no code implementations • 30 May 2021 • Hao Cheng, Ping Wang, Chun Qi
As the number of multimedia videos, important data carriers, grows drastically, many duplicate and near-duplicate videos appear among the top search results.
no code implementations • 9 May 2021 • Hao Cheng, Li Feng, Hailong Liu, Takatsugu Hirayama, Hiroshi Murase, Monika Sester
Intersections where vehicles are permitted to turn and interact with vulnerable road users (VRUs) like pedestrians and cyclists are among the most challenging locations for automated and accurate recognition of road users' behavior.
no code implementations • 21 Apr 2021 • Kaidi Xu, Chenan Wang, Hao Cheng, Bhavya Kailkhura, Xue Lin, Ryan Goldhahn
To tackle the susceptibility of deep neural networks to adversarial examples, adversarial training has been proposed, which provides a notion of robustness through an inner maximization problem, representing a first-order adversary, embedded within the outer minimization of the training loss.
2 code implementations • 19 Apr 2021 • Yuting Gao, Jia-Xin Zhuang, Shaohui Lin, Hao Cheng, Xing Sun, Ke Li, Chunhua Shen
Specifically, we find the final embedding obtained by the mainstream SSL methods contains the most fruitful information, and propose to distill the final embedding to maximally transmit a teacher's knowledge to a lightweight model by constraining the last embedding of the student to be consistent with that of the teacher.
1 code implementation • NAACL 2021 • Lis Pereira, Xiaodong Liu, Hao Cheng, Hoifung Poon, Jianfeng Gao, Ichiro Kobayashi
We present a simple yet effective Targeted Adversarial Training (TAT) algorithm to improve adversarial training for natural language understanding.
no code implementations • 19 Jan 2021 • Huixiang Luo, Hao Cheng, Fanxu Meng, Yuting Gao, Ke Li, Mengdan Zhang, Xing Sun
Pseudo-labeling (PL) and Data Augmentation-based Consistency Training (DACT) are two approaches widely used in Semi-Supervised Learning (SSL) methods.
no code implementations • 14 Jan 2021 • Yanjun Li, Bihan Wen, Hao Cheng, Yoram Bresler
In this paper, we propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities.
no code implementations • 1 Jan 2021 • Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih
We review the EfficientQA competition from NeurIPS 2020.
no code implementations • ACL 2021 • Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao
To date, most recent work under the retrieval-reader framework for open-domain QA focuses exclusively on either extractive or generative readers.
Ranked #1 on Open-Domain Question Answering on TriviaQA
1 code implementation • 10 Dec 2020 • Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun
Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch.
no code implementations • COLING 2020 • Chao Tian, Yifei Wang, Hao Cheng, Yijiang Lian, Zhihua Zhang
In this paper we propose a unified approach for supporting different generation manners of machine translation, including autoregressive, semi-autoregressive, and refinement-based non-autoregressive models.
2 code implementations • 30 Oct 2020 • Hao Cheng, Wentong Liao, Xuejiao Tang, Michael Ying Yang, Monika Sester, Bodo Rosenhahn
In our framework, first, the spatial context between agents is explored by using self-attention architectures.
2 code implementations • NAACL 2021 • Hao Cheng, Xiaodong Liu, Lis Pereira, YaoLiang Yu, Jianfeng Gao
Theoretically, we provide a connection of two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework.
1 code implementation • ICLR 2021 • Hao Cheng, Zhaowei Zhu, Xingyu Li, Yifei Gong, Xing Sun, Yang Liu
This high-quality sample sieve allows us to treat clean examples and the corrupted ones separately in training a DNN solution, and such a separation is shown to be advantageous in the instance-dependent noise setting.
1 code implementation • NeurIPS 2020 • Fanxu Meng, Hao Cheng, Ke Li, Huixiang Luo, Xiaowei Guo, Guangming Lu, Xing Sun
Through extensive experiments, we demonstrate that SWP is more effective compared to the previous FP-based methods and achieves the state-of-the-art pruning ratio on CIFAR-10 and ImageNet datasets without an obvious accuracy drop.
2 code implementations • CVPR 2021 • Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun
Then we force the model to pull the feature of the distracting video and the feature of the original video closer, so that the model is explicitly restricted to resist the background influence, focusing more on the motion changes.
1 code implementation • ECCV 2020 • Shizhen Zhao, Changxin Gao, Jun Zhang, Hao Cheng, Chuchu Han, Xinyang Jiang, Xiaowei Guo, Wei-Shi Zheng, Nong Sang, Xing Sun
In the conventional person Re-ID setting, it is widely assumed that each cropped person image contains a single individual.
1 code implementation • 31 Jul 2020 • Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
In this paper, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models.
Ranked #2 on Participant Intervention Comparison Outcome Extraction on EBM-NLP (using extra training data)
no code implementations • 15 Jul 2020 • Hao Cheng, Bao-Hua Sun, Li-Hua Zhu, Tian-Xiao Li, Guang-Shuai Li, Cong-Bo Li, Xiao-Guang Wu, Yun Zheng
The LaBr$_3$(Ce) detector has attracted much attention in recent years for characteristics superior to those of other scintillating materials in terms of resolution and efficiency.
no code implementations • 14 Jul 2020 • Hao Cheng, Joey Tianyi Zhou, Wee Peng Tay, Bihan Wen
Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks.
1 code implementation • 15 Jun 2020 • Hao Cheng, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn, Monika Sester
Trajectory prediction is critical for applications of planning safe future movements and remains challenging even for the next few seconds in urban mixed traffic.
1 code implementation • ACL 2020 • Hao Cheng, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
We address the problem of extractive question answering using document-level distant supervision, pairing questions and relevant documents with answer strings.
1 code implementation • 26 Apr 2020 • Hao Cheng, Fanxu Meng, Ke Li, Yuting Gao, Guangming Lu, Xing Sun, Rongrong Ji
To gain a universal improvement on both valid and invalid filters, we compensate grafting with distillation (Cultivation) to overcome the drawback of grafting.
3 code implementations • 20 Apr 2020 • Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon, Jianfeng Gao
In natural language processing (NLP), pre-training large neural language models such as BERT has demonstrated impressive gains in generalization for a variety of tasks, with further improvement from adversarial fine-tuning.
Ranked #6 on Natural Language Inference on ANLI test (using extra training data)
3 code implementations • ACL 2020 • Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao
We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models.
1 code implementation • 14 Feb 2020 • Hao Cheng, Wentong Liao, Michael Ying Yang, Monika Sester, Bodo Rosenhahn
At inference time, we combine the past context and motion information of the target agent with samples of the latent variables to predict multiple realistic future trajectories.
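The sampling step described above can be sketched as follows; `decoder`, `latent_dim`, and `n_samples` are hypothetical placeholders for illustration, not the paper's actual interface:

```python
import numpy as np

def predict_multimodal(context, decoder, latent_dim=16, n_samples=20, rng=None):
    """Sketch of CVAE-style multimodal prediction: condition a decoder on
    the encoded past context and draw several latent samples from the
    prior to produce a diverse set of future trajectories."""
    rng = np.random.default_rng(rng)
    trajectories = []
    for _ in range(n_samples):
        z = rng.standard_normal(latent_dim)   # sample latent code from N(0, I)
        trajectories.append(decoder(context, z))
    return trajectories
```

Each latent sample yields a different plausible future, which is how a single past context maps to multiple trajectory hypotheses.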
2 code implementations • CVPR 2020 • Fanxu Meng, Hao Cheng, Ke Li, Zhixin Xu, Rongrong Ji, Xing Sun, Guangming Lu
To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks.
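As a rough illustration of such an entropy-based information measure, a filter's entropy can be estimated from a histogram of its weights; the binning scheme below is an assumption for illustration, not the paper's exact criterion:

```python
import numpy as np

def filter_entropy(weights, n_bins=10):
    """Estimate the information content of a conv filter as the Shannon
    entropy (in bits) of its weight distribution, approximated by a
    fixed-width histogram."""
    counts, _ = np.histogram(weights.ravel(), bins=n_bins)
    probs = counts / counts.sum()
    probs = probs[probs > 0]          # drop empty bins to avoid log(0)
    return -np.sum(probs * np.log2(probs))
```

Under this measure a near-constant filter scores close to zero entropy (little information), making it a candidate to be grafted over by a more informative filter from another network.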
1 code implementation • 3 Dec 2019 • Fengxiang Yang, Ke Li, Zhun Zhong, Zhiming Luo, Xing Sun, Hao Cheng, Xiaowei Guo, Feiyue Huang, Rongrong Ji, Shaozi Li
This procedure encourages the selected training samples to be both clean and miscellaneous, so that the two models can promote each other iteratively.
Ranked #10 on Unsupervised Domain Adaptation on Market to Duke
no code implementations • 14 Oct 2019 • Hao Cheng, Xiaoqing Yang, Zang Li, Yanghua Xiao, Yu-Cheng Lin
Deep neural networks have been widely used in text classification.
no code implementations • 27 Jul 2019 • Yi Zhang, Cheng Zeng, Hao Cheng, Chongjun Wang, Lei Zhang
The quality of data collected from different channels is inconsistent, and some channels may not benefit prediction.
no code implementations • CVPR 2019 • Hao Cheng, Dongze Lian, Bowen Deng, Shenghua Gao, Tao Tan, Yanlin Geng
We propose a new learning paradigm, Local to Global Learning (LGL), for Deep Neural Networks (DNNs) to improve performance on classification problems.
1 code implementation • NAACL 2019 • Hao Cheng, Hao Fang, Mari Ostendorf
Characterizing these differences can be useful in human-computer interaction, as well as analysis of human-human conversations.
no code implementations • 5 Nov 2018 • Hao Cheng, Ming-Wei Chang, Kenton Lee, Ankur Parikh, Michael Collins, Kristina Toutanova
We study approaches to improve fine-grained short answer Question Answering models by integrating coarse-grained data annotated for paragraph-level relevance, and show that coarsely annotated data can bring significant performance gains.
no code implementations • ECCV 2018 • Hao Cheng, Dongze Lian, Shenghua Gao, Yanlin Geng
Inspired by pioneering work on the information bottleneck principle for analyzing Deep Neural Networks (DNNs), we design an information-plane-based framework to evaluate the capability of DNNs for image classification tasks, which not only helps understand the capability of DNNs but also helps us choose, more efficiently, a neural network that leads to higher classification accuracy.
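The kind of information-plane quantity involved can be sketched with the standard binning estimator of mutual information between labels and a layer's activations; binning is a common choice in this line of work, but is not necessarily the paper's exact estimator:

```python
import numpy as np
from collections import Counter

def entropy(symbols):
    """Shannon entropy (bits) of a sequence of discrete symbols."""
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def layer_mutual_information(labels, activations, n_bins=30):
    """Binning estimator of I(Y; T) between labels Y and a layer's
    activations T, via I(Y; T) = H(T) - H(T | Y)."""
    edges = np.linspace(activations.min(), activations.max(), n_bins)
    binned = np.digitize(activations, edges)
    keys = [tuple(row) for row in binned]       # discretized activation pattern
    h_t = entropy(keys)
    labels = np.asarray(labels)
    h_t_given_y = 0.0
    for y in np.unique(labels):
        mask = labels == y
        h_t_given_y += mask.mean() * entropy([k for k, m in zip(keys, mask) if m])
    return h_t - h_t_given_y
```

Tracking this quantity per layer over training is what produces the familiar information-plane trajectories.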
no code implementations • NAACL 2018 • Hao Fang, Hao Cheng, Maarten Sap, Elizabeth Clark, Ari Holtzman, Yejin Choi, Noah A. Smith, Mari Ostendorf
We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize.
1 code implementation • EMNLP 2017 • Hao Cheng, Hao Fang, Mari Ostendorf
We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums.
no code implementations • 16 Aug 2016 • Hao Fang, Hao Cheng, Mari Ostendorf
Many social media platforms offer a mechanism for readers to react to comments, both positively and negatively, which in aggregate can be thought of as community endorsement.
1 code implementation • EMNLP 2016 • Hao Cheng, Hao Fang, Xiaodong He, Jianfeng Gao, Li Deng
We develop a novel bi-directional attention model for dependency parsing, which learns to agree on headword predictions from the forward and backward parsing directions.
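The agreement idea can be caricatured as combining head-word scores from the two parsing directions; the simple additive combination below is an illustrative assumption, not the paper's trained agreement mechanism:

```python
import numpy as np

def agreed_heads(forward_scores, backward_scores):
    """For each token (row), pick the headword (column) that the forward
    and backward parsing directions jointly score highest, using a
    simple additive combination of the two score matrices."""
    combined = forward_scores + backward_scores
    return combined.argmax(axis=1)
```

A head strongly preferred by only one direction can thus be overruled when the other direction disagrees.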
Ranked #4 on Chinese Dependency Parsing on Chinese Pennbank
no code implementations • IJCNLP 2015 • Jacob Devlin, Hao Cheng, Hao Fang, Saurabh Gupta, Li Deng, Xiaodong He, Geoffrey Zweig, Margaret Mitchell
Two recent approaches have achieved state-of-the-art results in image captioning.
no code implementations • NeurIPS 2013 • Özlem Aslan, Hao Cheng, Xinhua Zhang, Dale Schuurmans
Latent variable prediction models, such as multi-layer networks, impose auxiliary latent variables between inputs and outputs to allow automatic inference of implicit features useful for prediction.
no code implementations • 26 Sep 2013 • Hao Cheng, Xinhua Zhang, Dale Schuurmans
Although many convex relaxations of clustering have been proposed in the past decade, current formulations remain restricted to spherical Gaussian or discriminative models and are susceptible to imbalanced clusters.