no code implementations • EMNLP (ACL) 2021 • Kai-Wei Chang, He He, Robin Jia, Sameer Singh
In particular, we will review recent studies on analyzing the weaknesses of NLP systems when facing adversarial inputs and data with distribution shifts.
no code implementations • 30 Apr 2024 • Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston
Iterative preference optimization methods have recently been shown to perform well for general instruction tuning tasks, but typically make little improvement on reasoning tasks (Yuan et al., 2024; Chen et al., 2024).
1 code implementation • 24 Apr 2024 • Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew Bean, Katerina Margatina, Juan Ciro, Rafael Mosquera, Max Bartolo, Adina Williams, He He, Bertie Vidgen, Scott A. Hale
Human feedback plays a central role in the alignment of Large Language Models (LLMs).
1 code implementation • 15 Apr 2024 • Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs).
1 code implementation • 30 Mar 2024 • Guande Wu, Chen Zhao, Claudio Silva, He He
Language agents that interact with the world on their own have great potential for automating digital tasks.
no code implementations • 19 Feb 2024 • Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He
Pre-trained language models (LMs) are capable of in-context learning (ICL): they can adapt to a task with only a few examples given in the prompt without any parameter update.
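To make the ICL setup concrete, here is a minimal sketch of how a few labeled demonstrations are concatenated before an unlabeled test input, with no parameter update; the sentiment task, field names, and examples are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of an in-context learning (ICL) prompt: a few labeled
# demonstrations are concatenated before the test input, and the model
# is expected to infer the task from the prompt alone. The task and
# label names here are hypothetical, chosen for illustration.

def build_icl_prompt(demonstrations, test_input):
    """Format (input, label) pairs followed by the unlabeled test input."""
    lines = []
    for text, label in demonstrations:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {test_input}\nSentiment:")
    return "\n".join(lines)

demos = [
    ("A delightful, moving film.", "positive"),
    ("Dull and far too long.", "negative"),
]
prompt = build_icl_prompt(demos, "An instant classic.")
print(prompt)
```

The resulting string ends with an unfilled `Sentiment:` slot, which the LM completes; the number and order of demonstrations are exactly the prompt choices that ICL work studies.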
1 code implementation • 25 Jan 2024 • Yanda Chen, Chandan Singh, Xiaodong Liu, Simiao Zuo, Bin Yu, He He, Jianfeng Gao
We propose explanation-consistency finetuning (EC-finetuning), a method that adapts LLMs to generate more consistent natural-language explanations on related examples.
1 code implementation • 28 Nov 2023 • Dang Nguyen, Chacha Chen, He He, Chenhao Tan
When pneumonia is not found on a chest X-ray, should the report describe this negative observation or omit it?
1 code implementation • 16 Nov 2023 • Nicholas Lourie, Kyunghyun Cho, He He
We present the first method to construct valid confidence bands for tuning curves.
no code implementations • 27 Oct 2023 • Nitish Joshi, Javier Rando, Abulhair Saparov, Najoung Kim, He He
This allows the model to separate truth from falsehoods and controls the truthfulness of its generation.
1 code implementation • 11 Sep 2023 • Vishakh Padmakumar, He He
We develop a set of diversity metrics and find that writing with InstructGPT (but not with GPT-3) results in a statistically significant reduction in diversity.
no code implementations • 26 Jul 2023 • Richard Yuanzhe Pang, Stephen Roller, Kyunghyun Cho, He He, Jason Weston
We study improving social conversational agents by learning from natural dialogue between users and a deployed model, without extra annotations.
no code implementations • 17 Jul 2023 • Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen McKeown
To answer these questions, we propose to evaluate counterfactual simulatability of natural language explanations: whether an explanation can enable humans to precisely infer the model's outputs on diverse counterfactuals of the explained input.
1 code implementation • 31 May 2023 • Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang
In practice, Shapley Values are often estimated with a small number of stochastic model evaluations.
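A standard stochastic estimator of the kind the sentence refers to is permutation sampling: average each feature's marginal contribution over random orderings. The sketch below uses a toy additive payoff function as an assumption for self-containment; in attribution work, `value_fn` would query the model on feature subsets.

```python
import random

# Monte Carlo (permutation-sampling) estimation of Shapley values.
# `value_fn` maps a set of feature indices to a scalar payoff; here it is
# a toy additive function, so the exact Shapley value of each feature is
# simply its weight.
def shapley_monte_carlo(value_fn, n_features, n_samples=2000, seed=0):
    rng = random.Random(seed)
    phi = [0.0] * n_features
    for _ in range(n_samples):
        perm = list(range(n_features))
        rng.shuffle(perm)
        coalition = set()
        prev = value_fn(coalition)
        for i in perm:
            coalition.add(i)
            cur = value_fn(coalition)
            phi[i] += cur - prev  # marginal contribution of feature i
            prev = cur
    return [p / n_samples for p in phi]

weights = [1.0, 2.0, 3.0]
est = shapley_monte_carlo(lambda S: sum(weights[i] for i in S), 3,
                          n_samples=200)
```

With few samples and a non-additive payoff the estimate is noisy, which is precisely the small-evaluation-budget regime the paper analyzes.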
1 code implementation • NeurIPS 2023 • Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Seyed Mehran Kazemi, Najoung Kim, He He
Given the intractably large size of the space of proofs, any model that is capable of general deductive reasoning must generalize to proofs of greater complexity.
1 code implementation • 22 May 2023 • Chenglei Si, Dan Friedman, Nitish Joshi, Shi Feng, Danqi Chen, He He
We investigate the inductive biases of ICL from the perspective of feature bias: which feature ICL is more likely to use given a set of underspecified demonstrations in which two features are equally predictive of the labels.
no code implementations • 29 Mar 2023 • Saranya Venkatraman, He He, David Reitter
We find that (i) surprisingly, model-generated responses follow the UID principle to a greater extent than human responses, and (ii) decoding algorithms that promote UID do not generate higher-quality responses.
1 code implementation • 8 Mar 2023 • Vishakh Padmakumar, Richard Yuanzhe Pang, He He, Ankur P. Parikh
We study the problem of extrapolative controlled generation, i.e., generating sequences with attribute values beyond the range seen in training.
no code implementations • 16 Nov 2022 • Richard Yuanzhe Pang, Vishakh Padmakumar, Thibault Sellam, Ankur P. Parikh, He He
To align conditional text generation model outputs with desired behaviors, there has been an increasing focus on training the model using reinforcement learning (RL) with reward functions learned from human annotations.
1 code implementation • 25 Oct 2022 • Nitish Joshi, Xiang Pan, He He
In case (i), we want the model to be invariant to the feature, which is neither necessary nor sufficient for prediction.
1 code implementation • 25 Oct 2022 • Tuhin Chakrabarty, Vishakh Padmakumar, He He
The core component of our system is a language model fine-tuned on a diverse collection of instructions for poetry writing.
1 code implementation • 10 Oct 2022 • Asa Cooper Stickland, Sailik Sengupta, Jason Krone, Saab Mansour, He He
To benchmark the performance of pretrained multilingual language models, we construct noisy datasets covering five languages and four NLP tasks and observe a clear gap in the performance between clean and noisy data in the zero-shot cross-lingual setting.
no code implementations • 4 Oct 2022 • Aahlad Puli, Nitish Joshi, He He, Rajesh Ranganath
In prediction tasks, there exist features that are related to the label in the same way across different settings for that task; these are semantic features or semantics.
1 code implementation • 3 Oct 2022 • Abulhair Saparov, He He
Large language models (LLMs) have shown remarkable reasoning capabilities given chain-of-thought prompts (examples with intermediate reasoning steps).
1 code implementation • 16 Sep 2022 • Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He
In-context learning (ICL) suffers from oversensitivity to the prompt, making it unreliable in real-world scenarios.
no code implementations • NAACL 2022 • Vishakh Padmakumar, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He, George Karypis
Recent work has found that multi-task training with a large number of diverse tasks can uniformly improve downstream performance on unseen target tasks.
no code implementations • 16 Dec 2021 • Richard Yuanzhe Pang, He He, Kyunghyun Cho
For all three approaches, the generated translations fail to achieve rewards comparable to BSR, but the translation quality, as approximated by BLEU and BLEURT, is similar to that of the BSR-produced translations.
2 code implementations • NAACL 2022 • Richard Yuanzhe Pang, Alicia Parrish, Nitish Joshi, Nikita Nangia, Jason Phang, Angelica Chen, Vishakh Padmakumar, Johnny Ma, Jana Thompson, He He, Samuel R. Bowman
To enable building and testing models on long-document comprehension, we introduce QuALITY, a multiple-choice QA dataset with context passages in English that have an average length of about 5,000 tokens, much longer than typical current models can process.
no code implementations • NeurIPS 2021 • Yana Dranker, He He, Yonatan Belinkov
Invariant Risk Minimization (IRM) is a recently proposed framework for out-of-distribution (o.o.d.) generalization.
1 code implementation • NAACL 2022 • Vishakh Padmakumar, He He
Machine-in-the-loop writing aims to enable humans to collaborate with models to complete their writing tasks more effectively.
1 code implementation • ACL 2022 • Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
The goal of meta-learning is to learn to adapt to a new task with only a few labeled examples.
no code implementations • 29 Sep 2021 • Zhiliang Tian, Yingxiu Zhao, Ziyue Huang, Yu-Xiang Wang, Nevin Zhang, He He
Differentially private (DP) learning algorithms provide guarantees on identifying the existence of a training sample from model outputs.
1 code implementation • EMNLP 2021 • Udit Arora, William Huang, He He
Despite agreement on the importance of detecting out-of-distribution (OOD) examples, there is little consensus on the formal definition of OOD examples and how to best detect them.
2 code implementations • ACL 2022 • Faisal Ladhak, Esin Durmus, He He, Claire Cardie, Kathleen McKeown
Despite recent progress in abstractive summarization, systems still suffer from faithfulness errors.
1 code implementation • ACL 2022 • Nitish Joshi, He He
While pretrained language models achieve excellent performance on natural language understanding benchmarks, they tend to rely on spurious correlations and generalize poorly to out-of-distribution (OOD) data.
1 code implementation • EACL 2021 • Vishakh Padmakumar, He He
Unsupervised approaches to extractive summarization usually rely on a notion of sentence importance defined by the semantic similarity between a sentence and the document.
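The similarity-based notion of importance described here can be sketched in a few lines: score each sentence by cosine similarity between its vector and the whole document's vector. The bag-of-words representation and toy document below are assumptions for self-containment; real systems typically use learned sentence embeddings.

```python
import math
from collections import Counter

def tokenize(sentence):
    """Lowercase bag-of-words tokenizer (punctuation stripped naively)."""
    return sentence.lower().replace(".", "").split()

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_sentences(sentences):
    """Rank sentences by similarity to the full document's word counts."""
    doc_vec = Counter(w for s in sentences for w in tokenize(s))
    scored = [(cosine(Counter(tokenize(s)), doc_vec), s) for s in sentences]
    return [s for _, s in sorted(scored, reverse=True)]

doc = [
    "The model summarizes documents by sentence similarity.",
    "Cats are unrelated to this document.",
    "Sentence similarity to the document drives the summary.",
]
ranked = rank_sentences(doc)
```

An off-topic sentence scores lowest against the document vector, so a summary built from the top-ranked sentences keeps the central content.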
1 code implementation • ICLR 2021 • Richard Yuanzhe Pang, He He
Current approaches to text generation largely rely on autoregressive models and maximum likelihood estimation.
1 code implementation • 14 Jul 2020 • Lifu Tu, Garima Lalwani, Spandana Gella, He He
Recent work has shown that pre-trained language models such as BERT improve robustness to spurious correlations in the dataset.
2 code implementations • ACL 2020 • Esin Durmus, He He, Mona Diab
We tackle the problem of evaluating faithfulness of a generated summary given its source document.
1 code implementation • 3 Dec 2019 • He He, Dongrui Wu
Currently, most domain adaptation approaches require the source domains to have the same feature space and label space as the target domain, which limits their applications, as the auxiliary data may have different feature spaces and/or different label spaces.
no code implementations • WS 2019 • Yiheng Zhou, He He, Alan W. Black, Yulia Tsvetkov
We consider a bargaining scenario where a seller and a buyer negotiate the price of an item for sale through a text-based dialog.
1 code implementation • WS 2019 • He He, Sheng Zha, Haohan Wang
We first learn a biased model that only uses features that are known to relate to dataset bias.
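One common way to use such a bias-only model (a sketch of the general idea, not necessarily the paper's exact training objective) is a product-of-experts ensemble: sum the log-probabilities of the biased and main models, so the main model is pushed to explain what the bias alone cannot. The logit values below are made up for illustration.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())

def product_of_experts(main_logits, bias_logits):
    """Combine the two models by summing log-probabilities, then renormalize."""
    return log_softmax(log_softmax(main_logits) + log_softmax(bias_logits))

# Hypothetical 3-class logits: the main model favors class 0,
# the bias-only model favors class 1.
main = np.array([2.0, 0.5, 0.1])
bias = np.array([0.1, 1.5, 0.1])
ensemble = product_of_experts(main, bias)
```

At training time the loss is applied to the ensemble, while at test time only the main (debiased) model is used.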
1 code implementation • 16 Aug 2019 • Zhenhua Shi, Xiaomo Chen, Changming Zhao, He He, Veit Stuphorn, Dongrui Wu
Multi-view learning improves the learning performance by utilizing multi-view data: data collected from multiple sources, or feature sets extracted from the same data source.
3 code implementations • 9 Jul 2019 • Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu
We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).
2 code implementations • NAACL 2019 • He He, Nanyun Peng, Percy Liang
We tackle the problem of generating a pun sentence given a pair of homophones (e.g., "died" and "dyed").
no code implementations • 9 Apr 2019 • Pedro Rodriguez, Shi Feng, Mohit Iyyer, He He, Jordan Boyd-Graber
Throughout this paper, we show that collaborations with the vibrant trivia community have contributed to the quality of our dataset, spawned new research directions, and doubled as an exciting way to engage the public with research in machine learning and natural language processing.
no code implementations • EMNLP 2018 • Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, Luke Zettlemoyer
We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total).
2 code implementations • EMNLP 2018 • He He, Derek Chen, Anusha Balakrishnan, Percy Liang
We consider negotiation settings in which two agents use natural language to bargain on goods.
no code implementations • 8 Aug 2018 • He He, Dongrui Wu
The electroencephalogram (EEG) is the most popular form of input for brain computer interfaces (BCIs).
1 code implementation • 8 Aug 2018 • He He, Dongrui Wu
Our approach has three desirable properties: 1) it aligns the EEG trials directly in the Euclidean space, and any signal processing, feature extraction and machine learning algorithms can then be applied to the aligned trials; 2) its computational cost is very low; and, 3) it is unsupervised and does not need any label information from the new subject.
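The alignment described in property 1) can be sketched directly: compute the arithmetic mean of the trial covariance matrices, then whiten every trial with its inverse square root, so the aligned trials have an average covariance equal to the identity. The random trial data below are an assumption for illustration only.

```python
import numpy as np

def euclidean_align(trials):
    """Align EEG trials in Euclidean space.

    trials: array of shape (n_trials, n_channels, n_samples).
    Returns trials whitened by the inverse square root of the mean
    trial covariance, so the aligned mean covariance is the identity.
    """
    covs = np.stack([X @ X.T / X.shape[1] for X in trials])
    mean_cov = covs.mean(axis=0)
    # Inverse matrix square root via eigendecomposition (mean_cov is SPD).
    vals, vecs = np.linalg.eigh(mean_cov)
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return np.stack([inv_sqrt @ X for X in trials])

rng = np.random.default_rng(0)
trials = rng.standard_normal((10, 4, 256))  # 10 trials, 4 channels
aligned = euclidean_align(trials)
mean_cov_after = np.mean([X @ X.T / X.shape[1] for X in aligned], axis=0)
```

Because the transform is a fixed linear map per subject and needs no labels, any downstream feature extraction or classifier can be applied to the aligned trials unchanged, matching properties 1) through 3).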
no code implementations • 8 Aug 2018 • He He, Dongrui Wu
The electroencephalogram (EEG) is the most widely used input for brain computer interfaces (BCIs), and common spatial pattern (CSP) is frequently used to spatially filter it to increase its signal-to-noise ratio.
1 code implementation • ACL 2018 • Urvashi Khandelwal, He He, Peng Qi, Dan Jurafsky
We know very little about how neural language models (LM) use prior linguistic context.
6 code implementations • NAACL 2018 • Juncen Li, Robin Jia, He He, Percy Liang
We consider the task of text attribute transfer: transforming a sentence to alter a specific attribute (e.g., sentiment) while preserving its attribute-independent content (e.g., changing "screen is just the right size" to "screen is too small").
2 code implementations • ACL 2017 • He He, Anusha Balakrishnan, Mihail Eric, Percy Liang
To model both structured knowledge and unstructured language, we propose a neural model with dynamic knowledge graph embeddings that evolve as the dialogue progresses.
1 code implementation • 18 Sep 2016 • He He, Jordan Boyd-Graber, Kevin Kwok, Hal Daumé III
Opponent modeling is necessary in multi-agent settings where secondary agents with competing goals also adapt their strategies, yet it remains challenging because strategies interact with each other and change.
no code implementations • 5 Feb 2016 • He He, Paul Mineiro, Nikos Karampatziakis
We propose a general framework for sequential and dynamic acquisition of useful information in order to solve a particular task.
no code implementations • 18 Mar 2015 • Kai-Wei Chang, He He, Hal Daumé III, John Langford
We demonstrate that a dependency parser can be built using a credit assignment compiler which removes the burden of worrying about low-level machine learning details from the parser implementation.
no code implementations • NeurIPS 2014 • He He, Hal Daumé III, Jason M. Eisner
Branch-and-bound is a widely used method in combinatorial optimization, including mixed integer programming, structured prediction and MAP inference.
no code implementations • NeurIPS 2016 • Kai-Wei Chang, He He, Hal Daumé III, John Langford, Stephane Ross
Many machine learning applications involve jointly predicting multiple mutually dependent output variables.
no code implementations • NeurIPS 2012 • He He, Jason Eisner, Hal Daumé III
However, it is important to note that these guarantees depend on how well the policy we found can imitate the oracle on the training data.