Search Results for author: Oyvind Tafjord

Found 38 papers, 18 papers with code

"You are grounded!": Latent Name Artifacts in Pre-trained Language Models

no code implementations • EMNLP 2020 • Vered Shwartz, Rachel Rudinger, Oyvind Tafjord

Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models.

“Let Your Characters Tell Their Story”: A Dataset for Character-Centric Narrative Understanding

no code implementations • Findings (EMNLP) 2021 • Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan, Snigdha Chaturvedi

When reading a literary piece, readers often make inferences about various characters’ roles, personalities, relationships, intents, actions, etc.

Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic

no code implementations • 22 Feb 2024 • Nathaniel Weir, Kate Sanders, Orion Weller, Shreya Sharma, Dongwei Jiang, Zhengping Jiang, Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Jansen, Peter Clark, Benjamin Van Durme

Contemporary language models enable new opportunities for structured reasoning with text, such as the construction and evaluation of intuitive, proof-like textual entailment trees without relying on brittle formal logic.

Formal Logic Knowledge Distillation +2

OLMo: Accelerating the Science of Language Models

2 code implementations • 1 Feb 2024 • Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi

Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs.

Language Modelling

4,114

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

1 code implementation • 31 Jan 2024 • Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo

As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training data impacts model capabilities and limitations.

Language Modelling

819

Paloma: A Benchmark for Evaluating Language Model Fit

no code implementations • 16 Dec 2023 • Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, Jesse Dodge

We invite submissions to our benchmark and organize results by comparability based on compliance with guidelines such as removal of benchmark contamination from pretraining.

Language Modelling

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

1 code implementation • 15 Dec 2023 • Dirk Groeneveld, Anas Awadalla, Iz Beltagy, Akshita Bhagia, Ian Magnusson, Hao Peng, Oyvind Tafjord, Pete Walsh, Kyle Richardson, Jesse Dodge

The success of large language models has shifted the evaluation paradigms in natural language processing (NLP).

In-Context Learning Language Modelling

137

BaRDa: A Belief and Reasoning Dataset that Separates Factual Accuracy and Reasoning Ability

no code implementations • 12 Dec 2023 • Peter Clark, Bhavana Dalvi Mishra, Oyvind Tafjord

This shows the clear progression of models towards improved factual accuracy and entailment reasoning, and the dataset provides a new benchmark that more cleanly separates and quantifies these two notions.

counterfactual valid

Digital Socrates: Evaluating LLMs through Explanation Critiques

no code implementations • 16 Nov 2023 • Yuling Gu, Oyvind Tafjord, Peter Clark

While LLMs can provide reasoned explanations along with their answers, the nature and quality of those explanations are still poorly understood.

CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization

no code implementations • 16 Oct 2023 • Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Peter Jansen, Oyvind Tafjord, Niket Tandon, Li Zhang, Chris Callison-Burch, Peter Clark

Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning.

Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy

1 code implementation • 24 May 2023 • Sarah Wiegreffe, Matthew Finlayson, Oyvind Tafjord, Peter Clark, Ashish Sabharwal

For example, both normalization and prompting methods for reducing SFC can be ineffective or even detrimental to task performance for some LMs.

In-Context Learning Multiple-choice +1

Language Models with Rationality

no code implementations • 23 May 2023 • Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schuetze, Peter Clark

To address this, our goals are to make model beliefs and their inferential relationships explicit, and to resolve inconsistencies that may exist, so that answers are supported by interpretable chains of reasoning drawn from a consistent network of beliefs.

Question Answering

Lila: A Unified Benchmark for Mathematical Reasoning

1 code implementation • 31 Oct 2022 • Swaroop Mishra, Matthew Finlayson, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord, Ashish Sabharwal, Peter Clark, Ashwin Kalyan

Mathematical reasoning skills are essential for general-purpose intelligent systems to perform tasks from grocery shopping to climate modeling.

Ranked #1 on Mathematical Reasoning on Lila (OOD)

Mathematical Reasoning Question Answering

Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning

no code implementations • 21 Oct 2022 • Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark

Our goal is a question-answering (QA) system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning.

Question Answering

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering

1 code implementation • 20 Sep 2022 • Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan

We further design language models to learn to generate lectures and explanations as the chain of thought (CoT) to mimic the multi-hop reasoning process when answering ScienceQA questions.

Ranked #6 on Science Question Answering on ScienceQA

Multimodal Deep Learning Multimodal Reasoning +5

553

Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement

no code implementations • 27 Apr 2022 • Bhavana Dalvi Mishra, Oyvind Tafjord, Peter Clark

Our goal is a teachable reasoning system for question-answering (QA), where a user can interact with faithful answer explanations, and correct its errors so that the system improves over time.

Question Answering

BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief

no code implementations • EMNLP 2021 • Nora Kassner, Oyvind Tafjord, Hinrich Schütze, Peter Clark

We show that, in a controlled experimental setting, these two mechanisms result in more consistent beliefs in the overall system, improving both the accuracy and consistency of its answers over time.

Language Modelling World Knowledge

"Let Your Characters Tell Their Story": A Dataset for Character-Centric Narrative Understanding

no code implementations • 12 Sep 2021 • Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan, Snigdha Chaturvedi

When reading a literary piece, readers often make inferences about various characters' roles, personalities, relationships, intents, actions, etc.

General-Purpose Question-Answering with Macaw

2 code implementations • 6 Sep 2021 • Oyvind Tafjord, Peter Clark

Despite the successes of pretrained language models, there are still few high-quality, general-purpose QA systems that are freely available.

Generative Question Answering Multiple-choice

459

Explaining Answers with Entailment Trees

1 code implementation • EMNLP 2021 • Bhavana Dalvi, Peter Jansen, Oyvind Tafjord, Zhengnan Xie, Hannah Smith, Leighanna Pipatanangkura, Peter Clark

Our approach is to generate explanations in the form of entailment trees, namely a tree of multipremise entailment steps from facts that are known, through intermediate conclusions, to the hypothesis of interest (namely the question + answer).

Language Modelling Question Answering +1

Enriching a Model's Notion of Belief using a Persistent Memory

no code implementations • 16 Apr 2021 • Nora Kassner, Oyvind Tafjord, Hinrich Schutze, Peter Clark

(This is an old and now obsolete draft.

Language Modelling

Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge

no code implementations • 5 Feb 2021 • Sumithra Bhakthavatsalam, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Peter Clark

We present the ARC-DA dataset, a direct-answer ("open response", "freeform") version of the ARC (AI2 Reasoning Challenge) multiple-choice dataset.

Multiple-choice Natural Questions +2

ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language

no code implementations • Findings (ACL) 2021 • Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark

In this work we show that a generative model, called ProofWriter, can reliably generate both implications of a theory and the natural language proof(s) that support them.

Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

1 code implementation • NeurIPS 2020 • Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant

In this work, we provide a first demonstration that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.

World Knowledge

UnifiedQA: Crossing Format Boundaries With a Single QA System

2 code implementations • Findings of the Association for Computational Linguistics 2020 • Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi

As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UnifiedQA, that performs surprisingly well across 17 QA datasets spanning 4 diverse formats.

Ranked #5 on Common Sense Reasoning on WinoGrande

Common Sense Reasoning Language Modelling +3

426

"You are grounded!": Latent Name Artifacts in Pre-trained Language Models

1 code implementation • 6 Apr 2020 • Vered Shwartz, Rachel Rudinger, Oyvind Tafjord

Pre-trained language models (LMs) may perpetuate biases originating in their training corpus to downstream models.

Reading Comprehension

Transformers as Soft Reasoners over Language

2 code implementations • 14 Feb 2020 • Peter Clark, Oyvind Tafjord, Kyle Richardson

However, expressing the knowledge in a formal (logical or probabilistic) representation has been a major obstacle to this research.

counterfactual Counterfactual Reasoning +2

SUPP.AI: Finding Evidence for Supplement-Drug Interactions

1 code implementation • ACL 2020 • Lucy Lu Wang, Oyvind Tafjord, Arman Cohan, Sarthak Jain, Sam Skjonsberg, Carissa Schoenick, Nick Botner, Waleed Ammar

We fine-tune the contextualized word representations of the RoBERTa language model using labeled DDI data, and apply the fine-tuned model to identify supplement interactions.

General Classification Language Modelling

QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions

no code implementations • IJCNLP 2019 • Oyvind Tafjord, Matt Gardner, Kevin Lin, Peter Clark

QuaRTz contains general qualitative statements, e.g., "A sunscreen with a higher SPF protects the skin longer.

General Knowledge

From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project

no code implementations • 4 Sep 2019 • Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz

This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90% on the exam's non-diagram, multiple choice (NDMC) questions.

Multiple-choice Question Answering

Reasoning Over Paragraph Effects in Situations

no code implementations • WS 2019 • Kevin Lin, Oyvind Tafjord, Peter Clark, Matt Gardner

A system is presented a background passage containing at least one of these relations, a novel situation that uses this background, and questions that require reasoning about effects of the relationships in the background passage in the context of the situation.

Reading Comprehension

Multi-class Hierarchical Question Classification for Multiple Choice Science Exams

1 code implementation • LREC 2020 • Dongfang Xu, Peter Jansen, Jaycie Martin, Zhengnan Xie, Vikas Yadav, Harish Tayyar Madabushi, Oyvind Tafjord, Peter Clark

Prior work has demonstrated that question classification (QC), recognizing the problem domain of a question, can help answer it more accurately.

Classification General Classification +2

Declarative Question Answering over Knowledge Bases containing Natural Language Text with Answer Set Programming

1 code implementation • 1 May 2019 • Arindam Mitra, Peter Clark, Oyvind Tafjord, Chitta Baral

While in recent years machine learning (ML) based approaches have been the popular approach in developing end-to-end question answering systems, such systems often struggle when additional knowledge is needed to correctly answer the questions.

Logical Reasoning Natural Language Inference +1

QuaRel: A Dataset and Models for Answering Questions about Qualitative Relationships

no code implementations • 20 Nov 2018 • Oyvind Tafjord, Peter Clark, Matt Gardner, Wen-tau Yih, Ashish Sabharwal

Many natural language questions require recognizing and reasoning with qualitative relationships (e.g., in science, economics, and medicine), but are challenging to answer with corpus-based methods.

Friction Semantic Parsing

AllenNLP: A Deep Semantic Natural Language Processing Platform

1 code implementation • WS 2018 • Matt Gardner, Joel Grus, Mark Neumann, Oyvind Tafjord, Pradeep Dasigi, Nelson Liu, Matthew Peters, Michael Schmitz, Luke Zettlemoyer

This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding.

Natural Language Understanding Reading Comprehension +1

11,698

Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

1 code implementation • 14 Mar 2018 • Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord

We present a new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering.

Question Answering Retrieval

Semantic Parsing to Probabilistic Programs for Situated Question Answering

no code implementations • EMNLP 2016 • Jayant Krishnamurthy, Oyvind Tafjord, Aniruddha Kembhavi

Situated question answering is the problem of answering questions about an environment such as an image or diagram.

Question Answering Semantic Parsing

Moving Beyond the Turing Test with the Allen AI Science Challenge

3 code implementations • 14 Apr 2016 • Carissa Schoenick, Peter Clark, Oyvind Tafjord, Peter Turney, Oren Etzioni

Given recent successes in AI (e.g., AlphaGo's victory against Lee Sedol in the game of GO), it's become increasingly important to assess: how close are AI systems to human-level intelligence?

Ranked #1 on Question Answering on Aristo Kaggle Allen AI 8th grade questions

Question Answering
