1 code implementation • 6 May 2024 • John Yang, Carlos E. Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, Ofir Press
We investigate how interface design affects the performance of language model agents.
1 code implementation • 16 Apr 2024 • Quan Shi, Michael Tang, Karthik Narasimhan, Shunyu Yao
In this paper, we introduce the USACO benchmark with 307 problems from the USA Computing Olympiad, along with high-quality unit tests, reference code, and official analyses for each problem.
no code implementations • 12 Apr 2024 • Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande, Bruno Castro da Silva
A promising approach is reinforcement learning from human feedback (RLHF), which leverages human feedback to update the model in accordance with human preferences and mitigate issues like toxicity and hallucinations.
no code implementations • 24 Jan 2024 • Alex Zhang, Khanh Nguyen, Jens Tuyls, Albert Lin, Karthik Narasimhan
Installing probabilistic world models into artificial agents opens an efficient channel for humans to communicate with and control these agents.
no code implementations • 6 Nov 2023 • Vishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan
In this work, we address the shortcomings of quantitative metrics by proposing QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
no code implementations • 13 Oct 2023 • Ruijie Zheng, Khanh Nguyen, Hal Daumé III, Furong Huang, Karthik Narasimhan
By equipping a learning agent with an abstract, dynamic language and an intrinsic motivation to learn with minimal communication effort, CEIL leads to the emergence of a human-like pattern in which the learner and the teacher communicate progressively more efficiently by exchanging increasingly abstract intentions.
no code implementations • 10 Oct 2023 • Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan
We find real-world software engineering to be a rich, sustainable, and challenging testbed for evaluating the next generation of language models.
Ranked #2 on Bug fixing on SWE-bench
no code implementations • 9 Oct 2023 • Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik Narasimhan, Shunyu Yao
Recent efforts have augmented language models (LMs) with external tools or environments, leading to the development of language agents that can reason and act.
Ranked #7 on Question Answering on Bamboogle
2 code implementations • 5 Sep 2023 • Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
Recent efforts have augmented large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents.
no code implementations • 18 Jul 2023 • Jens Tuyls, Dhruv Madeka, Kari Torkkola, Dean Foster, Karthik Narasimhan, Sham Kakade
Inspired by recent work in Natural Language Processing (NLP) where "scaling up" has resulted in increasingly capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.
1 code implementation • 17 Jul 2023 • Shunyu Yao, Howard Chen, Austin W. Hanjie, Runzhe Yang, Karthik Narasimhan
Text generation under constraints has seen increasing interest in natural language processing, especially with the rapidly improving capabilities of large language models.
no code implementations • 1 Jul 2023 • Anirudh Ajith, Chris Pan, Mengzhou Xia, Ameet Deshpande, Karthik Narasimhan
In-context learning (ICL) performs tasks by prompting a large language model (LLM) using an instruction and a small set of annotated examples called demonstrations.
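Concretely, an ICL prompt is just the instruction, the labeled demonstrations, and the unlabeled query concatenated in a fixed template. A minimal sketch (the template, task, and examples here are illustrative, not the paper's exact format):

```python
def build_icl_prompt(instruction, demonstrations, query):
    """Assemble an in-context learning prompt: an instruction, a few
    labeled demonstrations, then the unlabeled query for the LLM to
    complete."""
    lines = [instruction, ""]
    for x, y in demonstrations:
        lines += [f"Input: {x}", f"Output: {y}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

prompt = build_icl_prompt(
    "Classify the sentiment as positive or negative.",
    [("great movie", "positive"), ("terrible plot", "negative")],
    "loved the acting",
)
print(prompt)
```

The model's continuation after the final "Output:" is taken as the prediction.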
2 code implementations • NeurIPS 2023 • John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao
Our framework is language and platform agnostic, uses self-contained Docker environments to provide safe and reproducible execution, and is compatible out-of-the-box with traditional seq2seq coding methods, while enabling the development of new methods for interactive code generation.
1 code implementation • 24 May 2023 • Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak Murahari, Victoria Graf, Tanmay Rajpurohit, Ashwin Kalyan, Danqi Chen, Karthik Narasimhan
Semantic textual similarity (STS), a cornerstone task in NLP, measures the degree of similarity between a pair of sentences, and has broad application in fields such as information retrieval and natural language understanding.
1 code implementation • 24 May 2023 • Michael Tang, Shunyu Yao, John Yang, Karthik Narasimhan
We propose Referral-Augmented Retrieval (RAR), a simple technique that concatenates document indices with referrals, i.e., text from other documents that cite or link to the given document, to provide significant performance gains for zero-shot information retrieval.
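The idea can be sketched in a few lines: index each document together with referral text from documents that cite it, then retrieve as usual. Everything below is a toy stand-in (the corpus, the referral strings, and the term-overlap scorer, which substitutes for BM25 or a dense retriever):

```python
def augment_with_referrals(corpus, referrals):
    """Append referral text (text from documents that cite or link to a
    document) to that document's index entry -- the core idea of RAR.
    `corpus` maps doc id -> text; `referrals` maps doc id -> a list of
    referral strings (toy stand-ins for anchor text / citing sentences)."""
    return {d: " ".join([text] + referrals.get(d, []))
            for d, text in corpus.items()}

def retrieve(query, index):
    """Return the best-matching doc id under a toy term-overlap score."""
    q = set(query.lower().split())
    return max(index, key=lambda d: len(q & set(index[d].lower().split())))

corpus = {
    "p1": "convolutional networks for image recognition",
    "p2": "sequence to sequence learning with attention",
}
# A later paper citing p2 calls it "the transformer"; that referral lets
# a query term p2 never mentions itself still retrieve p2.
aug = augment_with_referrals(corpus, {"p2": ["the transformer architecture"]})
print(retrieve("transformer", aug))   # -> p2
```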
no code implementations • 24 May 2023 • Ameet Deshpande, Tanmay Rajpurohit, Karthik Narasimhan, Ashwin Kalyan
With the widespread adoption of AI systems, and the push from stakeholders to make them human-like through alignment techniques, human voices, and pictorial avatars, users' tendency to anthropomorphize these systems increases significantly.
1 code implementation • 24 May 2023 • Yushan Su, Vishvak Murahari, Karthik Narasimhan, Kai Li
As language models increase in size by the day, methods for efficient inference are critical to leveraging their capabilities for various applications.
4 code implementations • NeurIPS 2023 • Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference.
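The alternative to left-to-right decoding is deliberate search over partial solutions ("thoughts"). A toy breadth-first sketch of that search, with scripted propose/evaluate functions standing in for the LM that plays both roles in the actual method:

```python
def tree_of_thoughts(root, propose, evaluate, beam=2, depth=3):
    """Toy deliberate search: expand each partial solution with
    `propose`, score the expansions with `evaluate`, and keep the best
    `beam` candidates at every level -- a breadth-first sketch of
    searching over thoughts rather than decoding token by token."""
    frontier = [root]
    for _ in range(depth):
        candidates = [t for s in frontier for t in propose(s)]
        if not candidates:
            break
        frontier = sorted(candidates, key=evaluate, reverse=True)[:beam]
    return max(frontier, key=evaluate)

# Toy problem: build a 3-digit string maximizing the digit sum.
propose = lambda s: [s + d for d in "123"] if len(s) < 3 else []
evaluate = lambda s: sum(int(c) for c in s)
print(tree_of_thoughts("", propose, evaluate))   # -> 333
```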
no code implementations • 11 Apr 2023 • Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
Large language models (LLMs) have shown impressive capabilities and spread well beyond the natural language processing (NLP) community, with adoption in services such as healthcare, therapy, education, and customer service.
2 code implementations • NeurIPS 2023 • Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao
Large language models (LLMs) have been increasingly used to interact with external environments (e.g., games, compilers, APIs) as goal-driven agents.
1 code implementation • 24 Feb 2023 • Vishvak Murahari, Ameet Deshpande, Carlos E. Jimenez, Izhak Shafran, Mingqiu Wang, Yuan Cao, Karthik Narasimhan
The widespread adoption of large language models such as ChatGPT and Bard has led to unprecedented demand for these technologies.
1 code implementation • 26 Jan 2023 • Pranjal Aggarwal, Ameet Deshpande, Karthik Narasimhan
In this paper, we develop SemSup-XC, a model that achieves state-of-the-art zero-shot and few-shot performance on three XC datasets derived from legal, e-commerce, and Wikipedia data.
no code implementations • 17 Jan 2023 • Aniket Agarwal, Alex Zhang, Karthik Narasimhan, Igor Gilitschenski, Vishvak Murahari, Yash Kant
Our human studies indicate that ASAP can align videos and annotations with high fidelity, precision, and speed.
no code implementations • 20 Dec 2022 • Howard Chen, Huihan Li, Danqi Chen, Karthik Narasimhan
We consider the task of text generation in language models with constraints specified in natural language.
1 code implementation • 29 Nov 2022 • Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil
Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger.
1 code implementation • 15 Nov 2022 • Henry Tang, Ameet Deshpande, Karthik Narasimhan
In particular, ALIGN-MLM outperforms XLM and MLM by 35 and 30 F1 points on POS tagging for transfer between languages that differ in both their script and word order (left-to-right vs. right-to-left).
5 code implementations • 6 Oct 2022 • Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g., chain-of-thought prompting) and acting (e.g., action plan generation) have primarily been studied as separate topics.
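The interleaving of reasoning and acting can be sketched as a simple transcript loop: the policy emits either a free-form Thought or an Action, actions are executed, and the resulting Observation is appended. The task, tool name, and scripted policy below are hypothetical stand-ins for the prompted LLM used in the actual method:

```python
def react_loop(policy, env, max_steps=6):
    """Interleave reasoning and acting: the policy reads the transcript
    so far and emits a Thought (free-form reasoning) or an Action;
    actions run in the environment and the Observation is appended."""
    transcript = []
    for _ in range(max_steps):
        step = policy(transcript)
        transcript.append(step)
        if step.startswith("Action: finish"):
            break
        if step.startswith("Action: "):
            transcript.append("Observation: " + env(step[len("Action: "):]))
    return transcript

# Scripted stand-ins, for illustration only.
def env(action):
    return "Paris" if action == "lookup[capital of France]" else "unknown"

def policy(transcript):
    script = ["Thought: I need the capital of France.",
              "Action: lookup[capital of France]",
              "Action: finish[Paris]"]
    return script[min(len(transcript), 2)]

for line in react_loop(policy, env):
    print(line)
```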
1 code implementation • 4 Jul 2022 • Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
Existing benchmarks for grounding language in interactive environments either lack real-world linguistic elements, or prove difficult to scale up due to substantial human involvement in the collection of data or feedback signals.
no code implementations • 27 Jun 2022 • Allen Z. Ren, Bharat Govil, Tsung-Yen Yang, Karthik Narasimhan, Anirudha Majumdar
Robust and generalized tool manipulation requires an understanding of the properties and affordances of different tools.
1 code implementation • 23 May 2022 • Sreejan Kumar, Carlos G. Correa, Ishita Dasgupta, Raja Marjieh, Michael Y. Hu, Robert D. Hawkins, Nathaniel D. Daw, Jonathan D. Cohen, Karthik Narasimhan, Thomas L. Griffiths
Co-training on these representations results in more human-like behavior in downstream meta-reinforcement learning agents than less abstract controls (synthetic language descriptions, program induction without learned primitives), suggesting that the abstraction supported by these representations is key.
1 code implementation • NAACL 2022 • Howard Chen, Jacqueline He, Karthik Narasimhan, Danqi Chen
Our experiments reveal that rationale models show promise for improving robustness, but they struggle in certain scenarios, such as when the rationalizer is sensitive to positional bias or to the lexical choices of attack text.
1 code implementation • ACL 2022 • Carlos E. Jimenez, Olga Russakovsky, Karthik Narasimhan
We introduce CARETS, a systematic test suite to measure consistency and robustness of modern VQA models through a series of six fine-grained capability tests.
1 code implementation • 26 Feb 2022 • Austin W. Hanjie, Ameet Deshpande, Karthik Narasimhan
Prior work in this vein has largely used expensive per-instance annotations or single class-level descriptions, but per-instance descriptions are hard to scale and single class descriptions may not be rich enough.
1 code implementation • 18 Feb 2022 • Vishvak Murahari, Carlos E. Jimenez, Runzhe Yang, Karthik Narasimhan
In this paper, we introduce data multiplexing (DataMUX), a technique that enables deep neural networks to process multiple inputs simultaneously using a single compact representation.
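The multiplexing step can be sketched as averaging position-specific transforms of the inputs into one shared vector. The fixed random transforms below are a toy stand-in (the paper learns the multiplexing and demultiplexing components end to end):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 16, 4                     # hidden size; inputs multiplexed together

# One fixed random transform per multiplexing position, so the shared
# representation stays order-sensitive and the inputs stay separable.
transforms = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n)]

def multiplex(xs):
    """Combine n input vectors into ONE d-dimensional vector by
    averaging position-specific transforms of each input."""
    assert len(xs) == n
    return sum(T @ x for T, x in zip(transforms, xs)) / n

xs = [rng.standard_normal(d) for _ in range(n)]
z = multiplex(xs)
print(z.shape)                   # one vector stands in for 4 inputs
# Order matters: swapping the inputs changes the shared representation,
# which is what lets a (learned) demultiplexer route outputs back.
print(np.allclose(z, multiplex(xs[::-1])))   # -> False
```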
1 code implementation • 10 Jan 2022 • Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky
Retrieving target videos based on text descriptions is a task of great practical value and has received increasing attention over the past few years.
1 code implementation • ICLR 2022 • Jens Tuyls, Shunyu Yao, Sham Kakade, Karthik Narasimhan
Text adventure games present unique challenges to reinforcement learning methods due to their combinatorially large action spaces and sparse rewards.
no code implementations • NeurIPS 2021 • Victor Zhong, Austin Hanjie, Sida Wang, Karthik Narasimhan, Luke Zettlemoyer
We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.
2 code implementations • NAACL 2022 • Ameet Deshpande, Partha Talukdar, Karthik Narasimhan
While recent work on multilingual language models has demonstrated their capacity for cross-lingual zero-shot transfer on downstream tasks, there is a lack of consensus in the community as to what shared properties between languages enable such transfer.
no code implementations • 28 Jun 2021 • Pradeep Dogga, Karthik Narasimhan, Anirudh Sivaraman, Shiv Kumar Saini, George Varghese, Ravi Netravali
A major difficulty in debugging distributed systems lies in manually determining which of the many available debugging tools to use and how to query its logs.
1 code implementation • ACL 2021 • Shunyu Yao, Binghui Peng, Christos Papadimitriou, Karthik Narasimhan
Despite their impressive performance in NLP, self-attention networks were recently proved to be limited for processing formal languages with hierarchical structure, such as $\mathsf{Dyck}_k$, the language consisting of well-nested parentheses of $k$ types.
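For reference, membership in $\mathsf{Dyck}_k$ is decided by a single stack, which is exactly the hierarchical state that bounded self-attention struggles to track. A minimal checker (the token encoding is ours, for illustration):

```python
def is_dyck(tokens, k):
    """Check membership in Dyck_k, the language of well-nested
    parentheses of k types. Tokens are (kind, t) pairs with kind in
    {'open', 'close'} and bracket type t in range(k); a stack records
    the currently open brackets."""
    stack = []
    for kind, t in tokens:
        if not 0 <= t < k:
            return False
        if kind == "open":
            stack.append(t)
        elif kind == "close":
            if not stack or stack.pop() != t:
                return False
        else:
            return False
    return not stack                # every bracket must be closed

# "( [ ] )" in Dyck_2, with type 0 = round and type 1 = square:
seq = [("open", 0), ("open", 1), ("close", 1), ("close", 0)]
print(is_dyck(seq, 2))                                  # -> True
print(is_dyck([("open", 0), ("close", 1)], 2))          # -> False (mismatch)
```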
no code implementations • NAACL 2021 • Shunyu Yao, Karthik Narasimhan, Matthew Hausknecht
Text-based games simulate worlds and interact with players using natural language.
1 code implementation • 19 Jan 2021 • Austin W. Hanjie, Victor Zhong, Karthik Narasimhan
We investigate the use of natural language to drive the generalization of control policies and introduce the new multi-task environment Messenger with free-form text manuals describing the environment dynamics.
no code implementations • NeurIPS 2020 • Raeid Saqur, Karthik Narasimhan
Compositional generalization is a key challenge in grounding natural language to visual perception.
no code implementations • 27 Nov 2020 • Rachit Dubey, Erin Grant, Michael Luo, Karthik Narasimhan, Thomas Griffiths
This work connects the context-sensitive nature of cognitive control to a method for meta-learning with context-conditioned adaptation.
1 code implementation • ACL 2021 • Runzhe Yang, Jingxiao Chen, Karthik Narasimhan
In this paper, we explore the ability to model and infer personality types of opponents, predict their responses, and use this information to adapt a dialog agent's high-level strategy in negotiation tasks.
no code implementations • NeurIPS 2021 • Tsung-Yen Yang, Michael Hu, Yinlam Chow, Peter J. Ramadge, Karthik Narasimhan
We then develop an agent with a modular architecture that can interpret and adhere to such textual constraints while learning new tasks.
no code implementations • ICLR 2020 • Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, Peter J. Ramadge
We consider the problem of learning control policies that optimize a reward function while satisfying constraints due to considerations of safety, fairness, or other costs.
1 code implementation • EMNLP 2020 • Shunyu Yao, Rohan Rao, Matthew Hausknecht, Karthik Narasimhan
In this paper, we propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates at each game state.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Ameet Deshpande, Karthik Narasimhan
In this paper, we propose a simple and effective technique to allow for efficient self-supervised learning with bi-directional Transformers.
1 code implementation • 30 Sep 2020 • Theodore R. Sumers, Mark K. Ho, Robert D. Hawkins, Karthik Narasimhan, Thomas L. Griffiths
The sentiment models outperform the inference network, with the "pragmatic" model approaching human performance.
1 code implementation • ECCV 2020 • Zeyu Wang, Berthy Feng, Karthik Narasimhan, Olga Russakovsky
We find that modern captioning systems return higher likelihoods for incorrect distractor sentences compared to ground truth captions, and that evaluation metrics like SPICE can be 'topped' using simple captioning systems relying on object detectors.
no code implementations • NeurIPS 2020 • Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky
The ability to perform effective planning is crucial for building an instruction-following agent.
no code implementations • 20 Jun 2020 • Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, Peter J. Ramadge
We consider the problem of reinforcement learning when provided with (1) a baseline control policy and (2) a set of constraints that the learner must satisfy.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Tsung-Yen Yang, Andrew S. Lan, Karthik Narasimhan
Learning representations of spatial references in natural language is a key challenge in tasks like autonomous navigation and robotic manipulation.
1 code implementation • NAACL 2021 • Liwei Song, Xinwei Yu, Hsuan-Tung Peng, Karthik Narasimhan
Recent work has demonstrated the vulnerability of modern text classifiers to universal adversarial attacks, which are input-agnostic sequences of words added to text processed by classifiers.
no code implementations • 31 Mar 2020 • Felix Yu, Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky
In the Vision-and-Language Navigation (VLN) task, an agent with egocentric vision navigates to a destination given natural language instructions.
4 code implementations • NeurIPS 2019 • Runzhe Yang, Xingyuan Sun, Karthik Narasimhan
We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear preferences, with the goal of enabling few-shot adaptation to new tasks.
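At the heart of MORL with linear preferences is scalarizing a vector-valued reward by a preference vector, $r_w = \mathbf{w}^\top \mathbf{r}$. A minimal sketch of just that projection (the paper's algorithm additionally learns a single policy across the whole preference space; the objective names below are illustrative):

```python
import numpy as np

def scalarize(vector_reward, preference):
    """Project a multi-objective reward onto a scalar via a linear
    preference w (non-negative weights summing to 1)."""
    w = np.asarray(preference, dtype=float)
    r = np.asarray(vector_reward, dtype=float)
    assert w.shape == r.shape and np.all(w >= 0)
    return float(r @ w)

# Two objectives, e.g. speed vs. energy use (illustrative numbers):
r = [2.0, -1.0]
print(scalarize(r, [1.0, 0.0]))   # care only about objective 0 -> 2.0
print(scalarize(r, [0.5, 0.5]))   # equal trade-off -> 0.5
```

Each preference vector induces a different optimal policy; few-shot adaptation amounts to picking the right policy for a new w.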
no code implementations • ICML 2020 • Mark Braverman, Xinyi Chen, Sham M. Kakade, Karthik Narasimhan, Cyril Zhang, Yi Zhang
Building accurate language models that capture meaningful long-term dependencies is a core challenge in natural language processing.
1 code implementation • 13 May 2019 • Yilun Du, Karthik Narasimhan
While model-based deep reinforcement learning (RL) holds great promise for sample efficiency and generalization, learning an accurate dynamics model is often challenging and requires substantial interaction with the environment.
no code implementations • 27 Sep 2018 • Yilun Du, Karthik Narasimhan
While model-based deep reinforcement learning (RL) holds great promise for sample efficiency and generalization, learning an accurate dynamics model is challenging and often requires substantial interactions with the environment.
12 code implementations • Preprint 2018 • Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.
Ranked #3 on Natural Language Inference on SciTail
1 code implementation • 1 Aug 2017 • Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola
In this paper, we explore the utilization of natural language to drive transfer for reinforcement learning (RL).
1 code implementation • TACL 2018 • Michael Janner, Karthik Narasimhan, Regina Barzilay
The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment.
no code implementations • TACL 2017 • Jiaming Luo, Karthik Narasimhan, Regina Barzilay
This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary.
2 code implementations • EMNLP 2016 • Nicholas Locascio, Karthik Narasimhan, Eduardo DeLeon, Nate Kushman, Regina Barzilay
This paper explores the task of translating natural language queries into regular expressions which embody their meaning.
no code implementations • 11 Jul 2016 • Yewen Pu, Karthik Narasimhan, Armando Solar-Lezama, Regina Barzilay
We present a novel technique for automatic program correction in MOOCs, capable of fixing both syntactic and semantic errors without manual, problem specific correction strategies.
1 code implementation • ACL 2016 • Kayhan Batmanghelich, Ardavan Saeedi, Karthik Narasimhan, Sam Gershman
In this paper, we propose to use the von Mises-Fisher distribution to model the density of words over a unit sphere.
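For the 3-dimensional sphere the vMF density has the closed form $f(x;\mu,\kappa) = \frac{\kappa}{2\pi(e^{\kappa}-e^{-\kappa})}\,e^{\kappa \mu^\top x}$, which makes a small numerical sketch easy (this is the standard distribution, not the paper's full topic model, and the naive normalizer below overflows for large kappa):

```python
import numpy as np

def vmf_logpdf_3d(x, mu, kappa):
    """Log-density of the von Mises-Fisher distribution on the unit
    sphere in R^3, using the closed-form normalizer
    C_3(kappa) = kappa / (2*pi*(exp(kappa) - exp(-kappa)))."""
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    log_c = (np.log(kappa) - np.log(2 * np.pi)
             - np.log(np.exp(kappa) - np.exp(-kappa)))
    return log_c + kappa * float(mu @ x)

mu = np.array([0.0, 0.0, 1.0])
# The density peaks at the mean direction mu and decays with the angle:
print(vmf_logpdf_3d(mu, mu, 2.0) > vmf_logpdf_3d([1, 0, 0], mu, 2.0))
```

As kappa goes to 0 the density approaches the uniform density 1/(4*pi) on the sphere, a quick sanity check on the normalizer.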
1 code implementation • EMNLP 2016 • Karthik Narasimhan, Adam Yala, Regina Barzilay
Most successful information extraction systems operate with access to a large collection of documents.
3 code implementations • EMNLP 2015 • Karthik Narasimhan, Tejas Kulkarni, Regina Barzilay
We evaluate our approach on two game worlds, comparing against baselines using bag-of-words and bag-of-bigrams for state representations.
1 code implementation • TACL 2015 • Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola
In contrast, we propose a model for unsupervised morphological analysis that integrates orthographic and semantic views of words.
no code implementations • 1 Mar 2015 • Jonathan H. Huggins, Karthik Narasimhan, Ardavan Saeedi, Vikash K. Mansinghka
We derive the small-variance asymptotics for parametric and nonparametric MJPs for both directly observed and hidden state models.