no code implementations • EMNLP (WNUT) 2020 • Karthik Radhakrishnan, Tushar Kanakagiri, Sharanya Chakravarthy, Vidhisha Balachandran
The rise in the usage of social media has placed it in a central position for news dissemination and consumption.
no code implementations • NAACL (DeeLIO) 2021 • Vidhisha Balachandran, Bhuwan Dhingra, Haitian Sun, Michael Collins, William Cohen
To facilitate further research into KB integration methods, we create Factual Questions (FQ), a subset of the NQ data in which the questions have evidence in the KB, in the form of paths that link question entities to answer entities, but must still be answered using text.
1 code implementation • 3 Jun 2024 • Shuyue Stella Li, Vidhisha Balachandran, Shangbin Feng, Jonathan Ilgen, Emma Pierson, Pang Wei Koh, Yulia Tsvetkov
We develop a reliable Patient system and prototype several Expert systems, first showing that directly prompting state-of-the-art LLMs to ask questions degrades the quality of clinical reasoning, indicating that adapting LLMs to interactive information-seeking settings is nontrivial.
1 code implementation • 25 Apr 2024 • Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, Yulia Tsvetkov
Transformers trained on natural language data have been shown to learn its hierarchical structure and generalize to sentences with unseen syntactic structures without explicitly encoding any structural bias.
no code implementations • 1 Feb 2024 • Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Vidhisha Balachandran, Yulia Tsvetkov
Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps (missing or outdated information in LLMs) might always persist given the evolving nature of knowledge.
no code implementations • 12 Jan 2024 • Abhika Mishra, Akari Asai, Vidhisha Balachandran, Yizhong Wang, Graham Neubig, Yulia Tsvetkov, Hannaneh Hajishirzi
On our benchmark, our automatic and human evaluations show that FAVA significantly outperforms ChatGPT and GPT-4 on fine-grained hallucination detection, and edits suggested by FAVA improve the factuality of LM-generated text.
no code implementations • 16 Nov 2023 • YuHan Liu, Shangbin Feng, Xiaochuang Han, Vidhisha Balachandran, Chan Young Park, Sachin Kumar, Yulia Tsvetkov
In this work, we take a first step towards designing summarization systems that are faithful to the author's intent, not only the semantic content of the article.
1 code implementation • 15 Oct 2023 • Yuyang Bai, Shangbin Feng, Vidhisha Balachandran, Zhaoxuan Tan, Shiqi Lou, Tianxing He, Yulia Tsvetkov
To gain a better understanding of LLMs' knowledge abilities and their generalization, we evaluate 10 open-source and black-box LLMs on the KGQuiz benchmark across the five knowledge-intensive tasks and knowledge domains.
1 code implementation • 2 Oct 2023 • Yike Wang, Shangbin Feng, Heng Wang, Weijia Shi, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov
To this end, we introduce KNOWLEDGE CONFLICT, an evaluation framework for simulating contextual knowledge conflicts and quantitatively evaluating to what extent LLMs achieve these goals.
1 code implementation • 2 Oct 2023 • Wenxuan Ding, Shangbin Feng, YuHan Liu, Zhaoxuan Tan, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov
We additionally propose two new approaches, Staged Prompting and Verify-All, to augment LLMs' ability to backtrack and verify structured constraints.
2 code implementations • 17 May 2023 • Shangbin Feng, Weijia Shi, Yuyang Bai, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov
Ultimately, the Knowledge Card framework enables dynamic synthesis and updates of knowledge from diverse domains.
1 code implementation • 14 May 2023 • Shangbin Feng, Vidhisha Balachandran, Yuyang Bai, Yulia Tsvetkov
We propose FactKB, a simple new approach to factuality evaluation that is generalizable across domains, in particular with respect to entities and relations.
2 code implementations • 31 Mar 2023 • Leon Derczynski, Hannah Rose Kirk, Vidhisha Balachandran, Sachin Kumar, Yulia Tsvetkov, M. R. Leiser, Saif Mohammad
However, there is no risk-centric framework for documenting the complexity of a landscape in which some risks are shared across models and contexts, while others are specific, and where certain conditions may be required for risks to manifest as harms.
1 code implementation • 22 Oct 2022 • Vidhisha Balachandran, Hannaneh Hajishirzi, William W. Cohen, Yulia Tsvetkov
Abstractive summarization models often generate inconsistent summaries containing factual errors or hallucinated content.
no code implementations • 14 Oct 2022 • Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
Recent advances in the capacity of large language models to generate human-like text have resulted in their increased adoption in user-facing settings.
1 code implementation • 15 Mar 2022 • Rishabh Joshi, Vidhisha Balachandran, Emily Saldanha, Maria Glenski, Svitlana Volkova, Yulia Tsvetkov
Keyphrase extraction aims at automatically extracting a list of "important" phrases representing the key concepts in a document.
2 code implementations • ICLR 2021 • Rishabh Joshi, Vidhisha Balachandran, Shikhar Vashishth, Alan Black, Yulia Tsvetkov
To successfully negotiate a deal, it is not enough to communicate fluently: pragmatic planning of persuasive negotiation strategies is essential.
2 code implementations • NAACL 2021 • Artidoro Pagnoni, Vidhisha Balachandran, Yulia Tsvetkov
Modern summarization models generate highly fluent but often factually unreliable outputs.
no code implementations • EMNLP (MRQA) 2021 • Vidhisha Balachandran, Ashish Vaswani, Yulia Tsvetkov, Niki Parmar
Dense retrieval has been shown to be effective for retrieving relevant documents for Open Domain QA, surpassing popular sparse retrieval methods like BM25.
2 code implementations • EMNLP 2021 • Dheeraj Rajagopal, Vidhisha Balachandran, Eduard Hovy, Yulia Tsvetkov
We introduce SelfExplain, a novel self-explaining model that explains a text classifier's predictions using phrase-based concepts.
1 code implementation • EACL 2021 • Vidhisha Balachandran, Artidoro Pagnoni, Jay Yoon Lee, Dheeraj Rajagopal, Jaime Carbonell, Yulia Tsvetkov
To this end, we propose incorporating latent and explicit dependencies across sentences in the source document into end-to-end single-document summarization models.
1 code implementation • ICLR 2020 • Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William W. Cohen
In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus.