no code implementations • Findings (ACL) 2022 • Hao Cheng, Zhihua Zhang
The Conditional Masked Language Model (CMLM) is a strong baseline for non-autoregressive translation (NAT).
no code implementations • 5 Jun 2024 • Keyu Chen, YuHeng Lei, Hao Cheng, Haoran Wu, Wenchao Sun, Sifa Zheng
Generating safety-critical scenarios, which are essential yet difficult to collect at scale, offers an effective method to evaluate the robustness of autonomous vehicles (AVs).
1 code implementation • 4 Jun 2024 • Jiahang Cao, Qiang Zhang, Ziqing Wang, Jiaxu Wang, Hao Cheng, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, Renjing Xu
Sequential modeling has demonstrated remarkable capabilities in offline reinforcement learning (RL), with Decision Transformer (DT) being one of the most notable representatives, achieving significant success.
no code implementations • 30 May 2024 • Hao Cheng, Erjia Xiao, Jiahang Cao, Le Yang, Kaidi Xu, Jindong Gu, Renjing Xu
Following the advent of the Artificial Intelligence (AI) era of large models, Multimodal Large Language Models (MLLMs) with the ability to understand cross-modal interactions between vision and text have attracted wide attention.
1 code implementation • 6 May 2024 • Zhizhao Duan, Hao Cheng, Duo Xu, Xi Wu, Xiangxie Zhang, Xi Ye, Zhen Xie
In the vast and dynamic landscape of urban settings, Traffic Safety Description and Analysis plays a pivotal role in applications ranging from insurance inspection to accident prevention.
1 code implementation • 19 Mar 2024 • Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, Mari Ostendorf
Transformer-based NLP models are powerful but have high computational costs that limit deployment.
1 code implementation • 15 Mar 2024 • Chong Wang, Lanqing Guo, YuFei Wang, Hao Cheng, Yi Yu, Bihan Wen
Starting from decomposing the original maximum-a-posteriori problem of accelerated MRI, we present a rigorous derivation of the proposed PDAC framework, which could be further unfolded into an end-to-end trainable network.
no code implementations • 29 Feb 2024 • Hao Cheng, Erjia Xiao, Jindong Gu, Le Yang, Jinhao Duan, Jize Zhang, Jiahang Cao, Kaidi Xu, Renjing Xu
Large Vision-Language Models (LVLMs) rely on vision encoders and Large Language Models (LLMs) to exhibit remarkable capabilities on various multi-modal tasks in the joint space of vision and language.
no code implementations • 6 Feb 2024 • Yiming Xu, Hao Cheng, Monika Sester
These issues cause existing methods to lose both predictive diversity and adherence to scene constraints.
1 code implementation • 3 Feb 2024 • Hao Cheng, Qingsong Wen, Yang Liu, Liang Sun
Time series forecasting is an important task at the forefront of many real-world applications.
no code implementations • 28 Dec 2023 • Yan Ding, Hao Cheng, Ziliang Ye, Ruyi Feng, Wei Tian, Peng Xie, Juan Zhang, Zhongze Gu
We fine-tuned our proposed pre-trained model on six molecular property prediction tasks (MoleculeNet datasets) and two generative tasks (ZINC250K datasets), achieving state-of-the-art (SOTA) results on five out of eight tasks.
no code implementations • 18 Dec 2023 • Shanli Tan, Hao Cheng, Xiaohu Wu, Han Yu, Tiantian He, Yew-Soon Ong, Chongjun Wang, Xiaofeng Tao
Federated learning (FL) provides a privacy-preserving approach for collaborative training of machine learning models.
no code implementations • 23 Nov 2023 • Fei Kong, Jinhao Duan, Lichao Sun, Hao Cheng, Renjing Xu, HengTao Shen, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu
Though diffusion models excel in image generation, their step-by-step denoising leads to slow generation speeds.
1 code implementation • 19 Nov 2023 • Zhaowei Zhu, Jialu Wang, Hao Cheng, Yang Liu
Given the cost and difficulty of cleaning these datasets by humans, we introduce a systematic framework for evaluating the credibility of datasets, identifying label errors, and evaluating the influence of noisy labels in the curated language data, specifically focusing on unsafe comments and conversation classification.
no code implementations • 18 Nov 2023 • Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Le Yang, Jize Zhang, Xue Lin, Bhavya Kailkhura, Kaidi Xu, Renjing Xu
It posits that within dense neural networks, there exist winning tickets or subnetworks that are sparser but do not compromise performance.
no code implementations • 16 Nov 2023 • Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf
Large language models (LLMs) have revolutionized the landscape of Natural Language Processing systems, but are computationally expensive.
1 code implementation • 16 Nov 2023 • Yiqing Xie, Sheng Zhang, Hao Cheng, PengFei Liu, Zelalem Gero, Cliff Wong, Tristan Naumann, Hoifung Poon, Carolyn Rose
Medical text generation aims to assist with administrative work and highlight salient information to support decision-making.
no code implementations • 11 Nov 2023 • Haoyuan Li, Hao Jiang, Tianke Zhang, Zhelun Yu, Aoxiong Yin, Hao Cheng, Siming Fu, Yuhao Zhang, Wanggui He
We anticipate that our work will contribute to the advancement of research on TrainerAgent in both academic and industry communities, potentially establishing it as a new paradigm for model development in the field of AI.
1 code implementation • 9 Nov 2023 • Shilong Liu, Hao Cheng, Haotian Liu, Hao Zhang, Feng Li, Tianhe Ren, Xueyan Zou, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang, Jianfeng Gao, Chunyuan Li
LLaVA-Plus is a general-purpose multimodal assistant that expands the capabilities of large multimodal models.
Ranked #1 on LMM real-life tasks on Leaderboard
no code implementations • 19 Oct 2023 • Xiaodong Yu, Hao Cheng, Xiaodong Liu, Dan Roth, Jianfeng Gao
Specifically, given the potential of data contamination (e.g., leading to memorization), good static benchmark performance does not ensure that a model can reliably use the provided evidence when responding, which is essential to avoid hallucination when the required knowledge is new or private.
no code implementations • 17 Oct 2023 • Qinrui Tang, Hao Cheng
The widespread use of smartphones has made Inertial Measurement Units broadly available, providing a wide range of sensory data that can be advantageous for detecting transportation modes.
no code implementations • 11 Oct 2023 • chengyu dong, Liyuan Liu, Hao Cheng, Jingbo Shang, Jianfeng Gao, Xiaodong Liu
Although ELECTRA offers a significant boost in efficiency, its potential is constrained by the training cost brought by the auxiliary model.
1 code implementation • 3 Oct 2023 • Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao
To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks.
no code implementations • 23 Sep 2023 • Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Renjing Xu
Deploying energy-efficient deep learning algorithms on computationally limited devices, such as robots, is still a pressing issue for real-world applications.
no code implementations • 23 Sep 2023 • Hao Cheng, Jinhao Duan, Hui Li, Lyutianyang Zhang, Jiahang Cao, Ping Wang, Jize Zhang, Kaidi Xu, Renjing Xu
Recently, there has been a surge of interest and attention in Transformer-based structures, such as Vision Transformer (ViT) and Vision Multilayer Perceptron (VMLP).
1 code implementation • 21 Sep 2023 • Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Abdulmohsen Alharthi, Bang An, Juncai He, Ziche Liu, Zhiyi Zhang, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu
This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language imbued with unique cultural characteristics inadequately addressed by current mainstream models.
1 code implementation • 10 Aug 2023 • Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li
However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations.
no code implementations • 9 Aug 2023 • Hao Cheng, Mengmeng Liu, Lin Chen
Perception, which involves multi-object detection and tracking, and trajectory prediction are two major tasks of autonomous driving.
no code implementations • 31 Jul 2023 • Mingcai Chen, Yuntao Du, Wei Tang, Baoming Zhang, Hao Cheng, Shuwei Qian, Chongjun Wang
We introduce LaplaceConfidence, a method that obtains label confidence (i.e., clean probabilities) by utilizing the Laplacian energy.
no code implementations • 17 Jul 2023 • Yan-Jie Zhou, Wei Liu, Yuan Gao, Jing Xu, Le Lu, Yuping Duan, Hao Cheng, Na Jin, Xiaoyong Man, Shuang Zhao, Yu Wang
Skin diseases are among the most prevalent health issues, and accurate computer-aided diagnosis methods are of importance for both dermatologists and patients.
no code implementations • 13 Jul 2023 • Bo-Ru Lu, Nikita Haduong, Chia-Hsuan Lee, Zeqiu Wu, Hao Cheng, Paul Koester, Jean Utke, Tao Yu, Noah A. Smith, Mari Ostendorf
The capabilities of pretrained language models have opened opportunities to explore new application areas, but applications involving human-human interaction are limited by the fact that most data is protected from public release for privacy reasons.
2 code implementations • 3 Jul 2023 • Jinhao Duan, Hao Cheng, Shiqi Wang, Alex Zavalny, Chenan Wang, Renjing Xu, Bhavya Kailkhura, Kaidi Xu
Large Language Models (LLMs) show promising results in language generation and instruction following but frequently "hallucinate", making their outputs less reliable.
1 code implementation • 29 Jun 2023 • Jiahang Cao, Ziqing Wang, Hanzhong Guo, Hao Cheng, Qiang Zhang, Renjing Xu
In our paper, we put forward Spiking Denoising Diffusion Probabilistic Models (SDDPM), a new class of SNN-based generative models that achieve high sample quality.
no code implementations • NeurIPS 2023 • Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei
Such a decoupled memory design can easily cache and update long-term past contexts for memory retrieval without suffering from memory staleness.
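The decoupled design described above can be illustrated with a minimal sketch: past contexts are cached in a memory that lives outside the model and is retrieved by nearest-neighbor lookup. The class and names below are hypothetical stand-ins; the actual method caches transformer key-value states, not strings.

```python
import numpy as np

class ContextMemory:
    """A toy external cache of (key, value) context pairs."""

    def __init__(self):
        self.keys, self.values = [], []

    def update(self, key, value):
        # Cache a past context; updating the cache never touches
        # model weights, which is what makes the memory "decoupled".
        self.keys.append(np.asarray(key, dtype=float))
        self.values.append(value)

    def retrieve(self, query, k=1):
        # Nearest-neighbor lookup over cached keys.
        q = np.asarray(query, dtype=float)
        dists = [np.linalg.norm(q - key) for key in self.keys]
        order = np.argsort(dists)[:k]
        return [self.values[i] for i in order]

mem = ContextMemory()
mem.update([0.0, 1.0], "chapter-1 summary")
mem.update([1.0, 0.0], "chapter-2 summary")
nearest = mem.retrieve([0.9, 0.1], k=1)
```

Because retrieval is a plain lookup, stale entries can simply be overwritten or evicted, which is the intuition behind avoiding memory staleness.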
1 code implementation • 3 Jun 2023 • Wenyu Jiang, Hao Cheng, Mingcai Chen, Chongjun Wang, Hongxin Wei
Modern neural networks are known to give overconfident predictions for out-of-distribution inputs when deployed in the open world.
1 code implementation • 30 May 2023 • Zelalem Gero, Chandan Singh, Hao Cheng, Tristan Naumann, Michel Galley, Jianfeng Gao, Hoifung Poon
Extracting patient information from unstructured text is a critical task in health decision-support and clinical research.
no code implementations • 23 May 2023 • Yu Zhang, Hao Cheng, Zhihong Shen, Xiaodong Liu, Ye-Yi Wang, Jianfeng Gao
Scientific literature understanding tasks have gained significant attention due to their potential to accelerate scientific discovery.
1 code implementation • 4 May 2023 • Kaixin Ma, Hao Cheng, Yu Zhang, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
Our approach outperforms recent self-supervised retrievers in zero-shot evaluations and achieves state-of-the-art fine-tuned retrieval performance on NQ, HotpotQA and OTT-QA.
Ranked #4 on Question Answering on HotpotQA
no code implementations • 3 May 2023 • Hao Cheng, Meng Zhang, Weixuan Wang, Liangyou Li, Qun Liu, Zhihua Zhang
We can use automatic summarization or machine translation evaluation metrics for length-controllable machine translation, but these metrics are not necessarily suitable or accurate.
no code implementations • 3 May 2023 • Hao Cheng, Meng Zhang, Liangyou Li, Qun Liu, Zhihua Zhang
Utilizing pivot language effectively can significantly improve low-resource machine translation.
no code implementations • 28 Apr 2023 • Jinhao Duan, Quanfu Fan, Hao Cheng, Xiaoshuang Shi, Kaidi Xu
In this paper, we introduce Temporal Adversarial Augmentation (TA), a novel video augmentation technique that utilizes temporal attention.
1 code implementation • NeurIPS 2023 • Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Jianfeng Gao
At the heart of Chameleon is an LLM-based planner that assembles a sequence of tools to execute to generate the final response.
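The plan-then-execute pattern described here can be sketched as follows. The tools and the fixed plan are illustrative stand-ins for the LLM-generated program, not Chameleon's actual API: the planner would emit the tool sequence, and each tool's output is chained into the next.

```python
# Hypothetical tool registry; real systems would wrap search engines,
# vision models, calculators, etc.
TOOLS = {
    "extract_numbers": lambda text: [int(s) for s in text.split() if s.isdigit()],
    "sum": lambda nums: sum(nums),
}

def execute_plan(plan, query):
    """Run the planner's tool sequence, chaining each output."""
    result = query
    for tool_name in plan:
        result = TOOLS[tool_name](result)
    return result

# A plan the planner might produce for a simple arithmetic question.
answer = execute_plan(["extract_numbers", "sum"], "add 3 and 4 apples")
```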
no code implementations • 28 Mar 2023 • Sanxing Chen, Hao Cheng, Xiaodong Liu, Jian Jiao, Yangfeng Ji, Jianfeng Gao
Learning transferable representation of knowledge graphs (KGs) is challenging due to the heterogeneous, multi-relational nature of graph structures.
1 code implementation • 27 Feb 2023 • Mengmeng Liu, Hao Cheng, Lin Chen, Hellward Broszio, Jiangtao Li, Runjiang Zhao, Monika Sester, Michael Ying Yang
Trajectory prediction for autonomous driving must continuously reason about the motion stochasticity of road agents and comply with scene constraints.
no code implementations • 24 Feb 2023 • Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao
Large language models (LLMs), such as ChatGPT, are able to generate human-like, fluent responses for many downstream tasks, e.g., task-oriented dialog and question answering.
no code implementations • 15 Feb 2023 • Weicheng Zhang, Hao Cheng, Fatema T. Johora, Monika Sester
Predicting trajectories of pedestrians based on goal information in highly interactive scenes is a crucial step toward Intelligent Transportation Systems and Autonomous Driving.
1 code implementation • 6 Feb 2023 • Yunshuang Yuan, Hao Cheng, Michael Ying Yang, Monika Sester
Safety is critical for autonomous driving, and one aspect of improving safety is to accurately capture the uncertainties of the perception system, especially knowing the unknown.
1 code implementation • 3 Feb 2023 • Lanqing Guo, Siyu Huang, Ding Liu, Hao Cheng, Bihan Wen
It is still challenging for deep shadow removal models to exploit the global contextual correlation between shadow and non-shadow regions.
Ranked #1 on Shadow Removal on ISTD
no code implementations • ICCV 2023 • Hao Cheng, Siyuan Yang, Joey Tianyi Zhou, Lanqing Guo, Bihan Wen
Few-shot classification aims to learn a discriminative feature representation to recognize unseen classes with few labeled support samples.
1 code implementation • 21 Dec 2022 • Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao, Furu Wei
To this end, we propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts, and create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
no code implementations • 22 Oct 2022 • Zhiying Xu, Jiafan Xu, Hongding Peng, Wei Wang, Xiaoliang Wang, Haoran Wan, Haipeng Dai, Yixu Xu, Hao Cheng, Kun Wang, Guihai Chen
Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware.
2 code implementations • 22 Oct 2022 • Kaixin Ma, Hao Cheng, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
We propose a novel open-domain question answering (ODQA) framework for answering single/multi-hop questions across heterogeneous knowledge sources.
1 code implementation • 11 Oct 2022 • Hao Cheng, Hao Fang, Xiaodong Liu, Jianfeng Gao
Given their effectiveness on knowledge-intensive natural language processing tasks, dense retrieval models have become increasingly popular.
no code implementations • 30 Sep 2022 • Wenjie Li, Qiaolin Xia, Hao Cheng, Kouyin Xue, Shu-Tao Xia
Specifically, we build an inference-efficient single-party student model applicable to the whole sample space and meanwhile maintain the advantage of the federated feature extension.
no code implementations • 26 Sep 2022 • Hao Cheng, Pu Zhao, Yize Li, Xue Lin, James Diffenderfer, Ryan Goldhahn, Bhavya Kailkhura
Recently, Diffenderfer and Kailkhura proposed a new paradigm for learning compact yet highly accurate binary neural networks simply by pruning and quantizing randomly weighted full precision neural networks.
1 code implementation • 16 Sep 2022 • Hao Cheng, Mengmeng Liu, Lin Chen, Hellward Broszio, Monika Sester, Michael Ying Yang
This paper proposes an attention-based graph model, named GATraj, which achieves a good balance of prediction accuracy and inference speed.
1 code implementation • 30 Aug 2022 • Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon
We present a bi-encoder framework for named entity recognition (NER), which applies contrastive learning to map candidate text spans and entity types into the same vector representation space.
Ranked #1 on Named Entity Recognition (NER) on BC5CDR
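A toy sketch of the bi-encoder matching step: one encoder embeds candidate spans, another embeds entity-type descriptions into the same space, and a span is labeled with its most similar type. The vectors below are hypothetical stand-ins for encoder outputs.

```python
import numpy as np

def cosine(u, v):
    # Similarity in the shared span/type embedding space.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

span_vec = np.array([0.9, 0.1, 0.2])          # e.g. the span "aspirin"
type_vecs = {                                  # type-description embeddings
    "Chemical": np.array([1.0, 0.0, 0.1]),
    "Disease":  np.array([0.0, 1.0, 0.3]),
}
best_type = max(type_vecs, key=lambda t: cosine(span_vec, type_vecs[t]))
```

Contrastive training would pull each gold (span, type) pair together and push mismatched pairs apart, so this nearest-type decision becomes reliable.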
1 code implementation • 2 Jul 2022 • Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi
In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable.
no code implementations • 15 Jun 2022 • Wenyu Jiang, Yuxin Ge, Hao Cheng, Mingcai Chen, Shuai Feng, Chongjun Wang
We propose a novel method, READ (Reconstruction Error Aggregated Detector), to unify inconsistencies from classifier and autoencoder.
no code implementations • 31 May 2022 • Wenjie Li, Qiaolin Xia, Junfeng Deng, Hao Cheng, Jiangming Liu, Kouying Xue, Yong Cheng, Shu-Tao Xia
As an emerging secure learning paradigm in leveraging cross-agency private data, vertical federated learning (VFL) is expected to improve advertising models by enabling the joint learning of complementary user attributes privately owned by the advertiser and the publisher.
1 code implementation • 24 May 2022 • Bo-Ru Lu, Yushi Hu, Hao Cheng, Noah A. Smith, Mari Ostendorf
Human conversations can evolve in many different ways, creating challenges for automatic understanding and summarization.
1 code implementation • 20 May 2022 • Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei
With the visually-augmented context, VaLM uses a visual knowledge fusion layer to enable multimodal grounded language modeling by attending to both text context and visual knowledge in images.
2 code implementations • 19 May 2022 • Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, Yixuan Li
Our method is motivated by the analysis that the norm of the logit keeps increasing during training, leading to overconfident output.
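The fix this analysis suggests, constraining the logit norm before computing cross-entropy, can be sketched as below. This is a minimal single-example version; `tau` is a hypothetical temperature hyperparameter.

```python
import numpy as np

def logit_norm_cross_entropy(logits, target, tau=0.04):
    """Cross-entropy on L2-normalized logits: dividing by the logit
    norm caps the magnitude growth that drives overconfidence."""
    z = np.asarray(logits, dtype=float)
    z = z / (np.linalg.norm(z) * tau + 1e-12)  # fix the norm at 1/tau
    z = z - z.max()                            # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]
```

Note the loss is invariant to rescaling the raw logits, so training can no longer reduce it simply by inflating logit magnitudes.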
no code implementations • 17 Feb 2022 • Da Yin, Li Dong, Hao Cheng, Xiaodong Liu, Kai-Wei Chang, Furu Wei, Jianfeng Gao
With the increase in model capacity brought by pre-trained language models, there is a growing need for more knowledgeable natural language processing (NLP) models with advanced functionalities, including providing and making flexible use of encyclopedic and commonsense knowledge.
no code implementations • 4 Feb 2022 • Yang Liu, Hao Cheng, Kun Zhang
When label noise transition depends on each instance, the problem of identifying the instance-dependent noise transition matrix becomes substantially more challenging.
no code implementations • 15 Dec 2021 • Robert Tinn, Hao Cheng, Yu Gu, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
Overall, domain-specific vocabulary and pretraining facilitate more robust models for fine-tuning.
no code implementations • 15 Dec 2021 • Sheng Zhang, Hao Cheng, Shikhar Vashishth, Cliff Wong, Jinfeng Xiao, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
Zero-shot entity linking has emerged as a promising direction for generalizing to new entities, but it still requires example gold entity mentions during training and canonical descriptions for all entities, both of which are rarely available outside of Wikipedia.
no code implementations • 6 Dec 2021 • Mingcai Chen, Hao Cheng, Yuntao Du, Ming Xu, Wenyu Jiang, Chongjun Wang
We show that our method successfully alleviates the damage of both label noise and confirmation bias.
Ranked #2 on Image Classification on mini WebVision 1.0
2 code implementations • 6 Dec 2021 • Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng, Xuedong Huang
In particular, we focus on the task of Commonsense Reasoning, demonstrating that the proposed external attention mechanism can augment existing transformer models and significantly improve the model's reasoning capabilities.
Ranked #1 on Common Sense Reasoning on CommonsenseQA (using extra training data)
1 code implementation • 4 Nov 2021 • Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Greg Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao
We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks.
1 code implementation • 1 Nov 2021 • Fanxu Meng, Hao Cheng, Jiaxin Zhuang, Ke Li, Xing Sun
In this paper, we aim to remedy this problem and propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on ResBlock.
2 code implementations • ICLR 2022 • Jiaheng Wei, Zhaowei Zhu, Hao Cheng, Tongliang Liu, Gang Niu, Yang Liu
These observations require us to rethink the treatment of noisy labels, and we hope the availability of these two datasets would facilitate the development and evaluation of future learning with noisy label solutions.
1 code implementation • 18 Oct 2021 • Hao Cheng, Zhaowei Zhu, Xing Sun, Yang Liu
Designing robust loss functions is popular in learning with noisy labels while existing designs did not explicitly consider the overfitting property of deep neural networks (DNNs).
1 code implementation • ACL 2022 • Kaixin Ma, Hao Cheng, Xiaodong Liu, Eric Nyberg, Jianfeng Gao
The retriever-reader framework is popular for open-domain question answering (ODQA) due to its ability to use explicit knowledge.
no code implementations • 29 Sep 2021 • Zhaowei Zhu, Zihao Dong, Hao Cheng, Yang Liu
In this paper, given good representations, we propose a universally applicable and training-free solution to detect noisy labels.
1 code implementation • 26 Sep 2021 • Hao Cheng, YuFei Wang, Haoliang Li, Alex C. Kot, Bihan Wen
In this work, we propose a novel Disentangled Feature Representation framework, dubbed DFR, for few-shot learning applications.
1 code implementation • 23 Sep 2021 • Yunshuang Yuan, Hao Cheng, Monika Sester
Sharing collective perception messages (CPM) between vehicles is investigated to decrease occlusions so as to improve the perception accuracy and safety of autonomous driving.
1 code implementation • EMNLP 2021 • Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf
Task-oriented conversational systems often use dialogue state tracking to represent the user's intentions, which involves filling in values of pre-defined slots.
Ranked #1 on Dialogue State Tracking on MULTIWOZ 2.1 (MultiWOZ (Joint Goal Acc) metric)
1 code implementation • 13 Sep 2021 • YuFei Wang, Haoliang Li, Hao Cheng, Bihan Wen, Lap-Pui Chau, Alex C. Kot
Domain generalization aims to learn an invariant model that can generalize well to the unseen target domain.
no code implementations • 25 Jun 2021 • Yu Wang, Jinchao Li, Tristan Naumann, Chenyan Xiong, Hao Cheng, Robert Tinn, Cliff Wong, Naoto Usuyama, Richard Rogahn, Zhihong Shen, Yang Qin, Eric Horvitz, Paul N. Bennett, Jianfeng Gao, Hoifung Poon
A prominent case in point is the explosion of the biomedical literature on COVID-19, which swelled to hundreds of thousands of papers in a matter of months.
1 code implementation • 1 Jun 2021 • Hao Cheng, Kim-Hui Yap, Bihan Wen
Recent image classification algorithms, by learning deep features from large-scale datasets, have achieved significantly better results compared to the classic feature-based approaches.
no code implementations • 30 May 2021 • Hao Cheng, Ping Wang, Chun Qi
As the number of multimedia videos, important data carriers, grows drastically, many duplicate and near-duplicate videos appear among the top search results.
no code implementations • 9 May 2021 • Hao Cheng, Li Feng, Hailong Liu, Takatsugu Hirayama, Hiroshi Murase, Monika Sester
Intersections where vehicles are permitted to turn and interact with vulnerable road users (VRUs) like pedestrians and cyclists are among the most challenging locations for automated and accurate recognition of road users' behavior.
no code implementations • 21 Apr 2021 • Kaidi Xu, Chenan Wang, Hao Cheng, Bhavya Kailkhura, Xue Lin, Ryan Goldhahn
To tackle the susceptibility of deep neural networks to adversarial examples, adversarial training has been proposed, which provides a notion of robustness through an inner maximization problem, representing a first-order adversary, embedded within the outer minimization of the training loss.
2 code implementations • 19 Apr 2021 • Yuting Gao, Jia-Xin Zhuang, Shaohui Lin, Hao Cheng, Xing Sun, Ke Li, Chunhua Shen
Specifically, we find the final embedding obtained by the mainstream SSL methods contains the most fruitful information, and propose to distill the final embedding to maximally transmit a teacher's knowledge to a lightweight model by constraining the last embedding of the student to be consistent with that of the teacher.
1 code implementation • NAACL 2021 • Lis Pereira, Xiaodong Liu, Hao Cheng, Hoifung Poon, Jianfeng Gao, Ichiro Kobayashi
We present a simple yet effective Targeted Adversarial Training (TAT) algorithm to improve adversarial training for natural language understanding.
no code implementations • 19 Jan 2021 • Huixiang Luo, Hao Cheng, Fanxu Meng, Yuting Gao, Ke Li, Mengdan Zhang, Xing Sun
Pseudo-labeling (PL) and Data Augmentation-based Consistency Training (DACT) are two approaches widely used in Semi-Supervised Learning (SSL) methods.
no code implementations • 14 Jan 2021 • Yanjun Li, Bihan Wen, Hao Cheng, Yoram Bresler
In this paper, we propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities.
no code implementations • 1 Jan 2021 • Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, Eunsol Choi, Michael Collins, Kelvin Guu, Hannaneh Hajishirzi, Kenton Lee, Jennimaria Palomaki, Colin Raffel, Adam Roberts, Tom Kwiatkowski, Patrick Lewis, Yuxiang Wu, Heinrich Küttler, Linqing Liu, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel, Sohee Yang, Minjoon Seo, Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Edouard Grave, Ikuya Yamada, Sonse Shimaoka, Masatoshi Suzuki, Shumpei Miyawaki, Shun Sato, Ryo Takahashi, Jun Suzuki, Martin Fajcik, Martin Docekal, Karel Ondrej, Pavel Smrz, Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Wen-tau Yih
We review the EfficientQA competition from NeurIPS 2020.
no code implementations • ACL 2021 • Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao
To date, most recent work under the retrieval-reader framework for open-domain QA focuses exclusively on either extractive or generative readers.
Ranked #1 on Open-Domain Question Answering on TriviaQA
1 code implementation • 10 Dec 2020 • Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun
Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch.
no code implementations • COLING 2020 • Chao Tian, Yifei Wang, Hao Cheng, Yijiang Lian, Zhihua Zhang
In this paper we propose a unified approach for supporting different generation manners of machine translation, including autoregressive, semi-autoregressive, and refinement-based non-autoregressive models.
2 code implementations • 30 Oct 2020 • Hao Cheng, Wentong Liao, Xuejiao Tang, Michael Ying Yang, Monika Sester, Bodo Rosenhahn
In our framework, first, the spatial context between agents is explored by using self-attention architectures.
2 code implementations • NAACL 2021 • Hao Cheng, Xiaodong Liu, Lis Pereira, YaoLiang Yu, Jianfeng Gao
Theoretically, we provide a connection of two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework.
1 code implementation • ICLR 2021 • Hao Cheng, Zhaowei Zhu, Xingyu Li, Yifei Gong, Xing Sun, Yang Liu
This high-quality sample sieve allows us to treat clean examples and the corrupted ones separately in training a DNN solution, and such a separation is shown to be advantageous in the instance-dependent noise setting.
1 code implementation • NeurIPS 2020 • Fanxu Meng, Hao Cheng, Ke Li, Huixiang Luo, Xiaowei Guo, Guangming Lu, Xing Sun
Through extensive experiments, we demonstrate that SWP is more effective compared to the previous FP-based methods and achieves the state-of-the-art pruning ratio on CIFAR-10 and ImageNet datasets without an obvious accuracy drop.
2 code implementations • CVPR 2021 • Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun
Then we force the model to pull the feature of the distracting video and the feature of the original video closer, so that the model is explicitly restricted to resist the background influence, focusing more on the motion changes.
1 code implementation • ECCV 2020 • Shizhen Zhao, Changxin Gao, Jun Zhang, Hao Cheng, Chuchu Han, Xinyang Jiang, Xiaowei Guo, Wei-Shi Zheng, Nong Sang, Xing Sun
In the conventional person Re-ID setting, it is widely assumed that each cropped person image contains a single individual.
1 code implementation • 31 Jul 2020 • Yu Gu, Robert Tinn, Hao Cheng, Michael Lucas, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, Hoifung Poon
In this paper, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models.
Ranked #2 on Participant Intervention Comparison Outcome Extraction on EBM-NLP (using extra training data)
no code implementations • 15 Jul 2020 • Hao Cheng, Bao-Hua Sun, Li-Hua Zhu, Tian-Xiao Li, Guang-Shuai Li, Cong-Bo Li, Xiao-Guang Wu, Yun Zheng
The LaBr$_3$(Ce) detector has attracted much attention in recent years for characteristics superior to those of other scintillating materials in terms of resolution and efficiency.
no code implementations • 14 Jul 2020 • Hao Cheng, Joey Tianyi Zhou, Wee Peng Tay, Bihan Wen
Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks.
1 code implementation • 15 Jun 2020 • Hao Cheng, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn, Monika Sester
Trajectory prediction is critical for applications of planning safe future movements and remains challenging even for the next few seconds in urban mixed traffic.
1 code implementation • ACL 2020 • Hao Cheng, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
We address the problem of extractive question answering using document-level distant supervision, pairing questions and relevant documents with answer strings.
1 code implementation • 26 Apr 2020 • Hao Cheng, Fanxu Meng, Ke Li, Yuting Gao, Guangming Lu, Xing Sun, Rongrong Ji
To gain a universal improvement on both valid and invalid filters, we compensate grafting with distillation (Cultivation) to overcome the drawback of grafting.
3 code implementations • 20 Apr 2020 • Xiaodong Liu, Hao Cheng, Pengcheng He, Weizhu Chen, Yu Wang, Hoifung Poon, Jianfeng Gao
In natural language processing (NLP), pre-training large neural language models such as BERT has demonstrated impressive gains in generalization for a variety of tasks, with further improvement from adversarial fine-tuning.
Ranked #6 on Natural Language Inference on ANLI test (using extra training data)
3 code implementations • ACL 2020 • Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao
We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models.
1 code implementation • 14 Feb 2020 • Hao Cheng, Wentong Liao, Michael Ying Yang, Monika Sester, Bodo Rosenhahn
At inference time, we combine the past context and motion information of the target agent with samples of the latent variables to predict multiple realistic future trajectories.
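The sampling step described above can be sketched as follows; `decoder`, `latent_dim`, and `n_samples` are hypothetical placeholders for illustration, not the paper's actual interface:

```python
import numpy as np

def predict_multimodal(context, decoder, latent_dim=16, n_samples=20, rng=None):
    """Sketch of CVAE-style multimodal prediction: condition a decoder on
    the encoded past context and draw several latent samples from the
    prior to produce a diverse set of future trajectories."""
    rng = np.random.default_rng(rng)
    trajectories = []
    for _ in range(n_samples):
        z = rng.standard_normal(latent_dim)   # sample latent code from N(0, I)
        trajectories.append(decoder(context, z))
    return trajectories
```

Each latent sample yields a different plausible future, which is how a single past context maps to multiple trajectory hypotheses.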
2 code implementations • CVPR 2020 • Fanxu Meng, Hao Cheng, Ke Li, Zhixin Xu, Rongrong Ji, Xing Sun, Guangming Lu
To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks.
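As a rough illustration of such an entropy-based information measure, a filter's entropy can be estimated from a histogram of its weights; the binning scheme below is an assumption for illustration, not the paper's exact criterion:

```python
import numpy as np

def filter_entropy(weights, n_bins=10):
    """Estimate the information content of a conv filter as the Shannon
    entropy (in bits) of its weight distribution, approximated by a
    fixed-width histogram."""
    counts, _ = np.histogram(weights.ravel(), bins=n_bins)
    probs = counts / counts.sum()
    probs = probs[probs > 0]          # drop empty bins to avoid log(0)
    return -np.sum(probs * np.log2(probs))
```

Under this measure a near-constant filter scores close to zero entropy (little information), making it a candidate to be grafted over by a more informative filter from another network.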
1 code implementation • 3 Dec 2019 • Fengxiang Yang, Ke Li, Zhun Zhong, Zhiming Luo, Xing Sun, Hao Cheng, Xiaowei Guo, Feiyue Huang, Rongrong Ji, Shaozi Li
This procedure encourages the selected training samples to be both clean and miscellaneous, so that the two models can promote each other iteratively.
Ranked #10 on Unsupervised Domain Adaptation on Market to Duke
no code implementations • 14 Oct 2019 • Hao Cheng, Xiaoqing Yang, Zang Li, Yanghua Xiao, Yu-Cheng Lin
Deep neural networks have been widely used in text classification.
no code implementations • 27 Jul 2019 • Yi Zhang, Cheng Zeng, Hao Cheng, Chongjun Wang, Lei Zhang
The quality of data collected from different channels is inconsistent, and some channels may not benefit prediction.
no code implementations • CVPR 2019 • Hao Cheng, Dongze Lian, Bowen Deng, Shenghua Gao, Tao Tan, Yanlin Geng
We propose a new learning paradigm, Local to Global Learning (LGL), for Deep Neural Networks (DNNs) to improve performance on classification problems.
1 code implementation • NAACL 2019 • Hao Cheng, Hao Fang, Mari Ostendorf
Characterizing these differences can be useful in human-computer interaction, as well as analysis of human-human conversations.
no code implementations • 5 Nov 2018 • Hao Cheng, Ming-Wei Chang, Kenton Lee, Ankur Parikh, Michael Collins, Kristina Toutanova
We study approaches to improve fine-grained short answer Question Answering models by integrating coarse-grained data annotated for paragraph-level relevance, and show that coarsely annotated data can bring significant performance gains.
no code implementations • ECCV 2018 • Hao Cheng, Dongze Lian, Shenghua Gao, Yanlin Geng
Inspired by pioneering work on the information bottleneck principle for analyzing Deep Neural Networks (DNNs), we design an information-plane-based framework to evaluate the capability of DNNs for image classification tasks, which not only helps understand the capability of DNNs but also helps us choose, more efficiently, a neural network that leads to higher classification accuracy.
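The kind of information-plane quantity involved can be sketched with the standard binning estimator of mutual information between labels and a layer's activations; binning is a common choice in this line of work, but is not necessarily the paper's exact estimator:

```python
import numpy as np
from collections import Counter

def entropy(symbols):
    """Shannon entropy (bits) of a sequence of discrete symbols."""
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def layer_mutual_information(labels, activations, n_bins=30):
    """Binning estimator of I(Y; T) between labels Y and a layer's
    activations T, via I(Y; T) = H(T) - H(T | Y)."""
    edges = np.linspace(activations.min(), activations.max(), n_bins)
    binned = np.digitize(activations, edges)
    keys = [tuple(row) for row in binned]       # discretized activation pattern
    h_t = entropy(keys)
    labels = np.asarray(labels)
    h_t_given_y = 0.0
    for y in np.unique(labels):
        mask = labels == y
        h_t_given_y += mask.mean() * entropy([k for k, m in zip(keys, mask) if m])
    return h_t - h_t_given_y
```

Tracking this quantity per layer over training is what produces the familiar information-plane trajectories.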
no code implementations • NAACL 2018 • Hao Fang, Hao Cheng, Maarten Sap, Elizabeth Clark, Ari Holtzman, Yejin Choi, Noah A. Smith, Mari Ostendorf
We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize.
1 code implementation • EMNLP 2017 • Hao Cheng, Hao Fang, Mari Ostendorf
We develop a novel factored neural model that learns comment embeddings in an unsupervised way leveraging the structure of distributional context in online discussion forums.
no code implementations • 16 Aug 2016 • Hao Fang, Hao Cheng, Mari Ostendorf
Many social media platforms offer a mechanism for readers to react to comments, both positively and negatively, which in aggregate can be thought of as community endorsement.
1 code implementation • EMNLP 2016 • Hao Cheng, Hao Fang, Xiaodong He, Jianfeng Gao, Li Deng
We develop a novel bi-directional attention model for dependency parsing, which learns to agree on headword predictions from the forward and backward parsing directions.
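The agreement idea can be caricatured as combining head-word scores from the two parsing directions; the simple additive combination below is an illustrative assumption, not the paper's trained agreement mechanism:

```python
import numpy as np

def agreed_heads(forward_scores, backward_scores):
    """For each token (row), pick the headword (column) that the forward
    and backward parsing directions jointly score highest, using a
    simple additive combination of the two score matrices."""
    combined = forward_scores + backward_scores
    return combined.argmax(axis=1)
```

A head strongly preferred by only one direction can thus be overruled when the other direction disagrees.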
Ranked #4 on Chinese Dependency Parsing on Chinese Pennbank
no code implementations • IJCNLP 2015 • Jacob Devlin, Hao Cheng, Hao Fang, Saurabh Gupta, Li Deng, Xiaodong He, Geoffrey Zweig, Margaret Mitchell
Two recent approaches have achieved state-of-the-art results in image captioning.
no code implementations • NeurIPS 2013 • Özlem Aslan, Hao Cheng, Xinhua Zhang, Dale Schuurmans
Latent variable prediction models, such as multi-layer networks, impose auxiliary latent variables between inputs and outputs to allow automatic inference of implicit features useful for prediction.
no code implementations • 26 Sep 2013 • Hao Cheng, Xinhua Zhang, Dale Schuurmans
Although many convex relaxations of clustering have been proposed in the past decade, current formulations remain restricted to spherical Gaussian or discriminative models and are susceptible to imbalanced clusters.