Showing 1–18 of 18 results for author: Rashkin, H

  1. arXiv:2406.13632  [pdf, other]

    cs.CL

    Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

    Authors: Arie Cattan, Alon Jacovi, Alex Fabrikant, Jonathan Herzig, Roee Aharoni, Hannah Rashkin, Dror Marcus, Avinatan Hassidim, Yossi Matias, Idan Szpektor, Avi Caciularu

    Abstract: Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in this scenario; however, naively adding ICL examples with long context introduces challenges, including substantial token overhead added for each few-shot examp…

    Submitted 23 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  2. arXiv:2402.02077  [pdf, other]

    cs.CL

    Investigating Content Planning for Navigating Trade-offs in Knowledge-Grounded Dialogue

    Authors: Kushal Chawla, Hannah Rashkin, Gaurav Singh Tomar, David Reitter

    Abstract: Knowledge-grounded dialogue generation is a challenging task because it requires satisfying two fundamental yet often competing constraints: being responsive in a manner that is specific to what the conversation partner has said while also being attributable to an underlying source document. In this work, we bring this trade-off between these two objectives (specificity and attribution) to light a…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: Accepted at EACL 2024 Main Conference (Long)

  3. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  4. arXiv:2112.12870  [pdf, other]

    cs.CL

    Measuring Attribution in Natural Language Generation Models

    Authors: Hannah Rashkin, Vitaly Nikolaev, Matthew Lamm, Lora Aroyo, Michael Collins, Dipanjan Das, Slav Petrov, Gaurav Singh Tomar, Iulia Turc, David Reitter

    Abstract: With recent improvements in natural language generation (NLG) models for various applications, it has become imperative to have the means to identify and evaluate whether NLG output is only sharing verifiable information about the external world. In this work, we present a new evaluation framework entitled Attributable to Identified Sources (AIS) for assessing the output of natural language genera…

    Submitted 2 August, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

  5. arXiv:2112.08558  [pdf, other]

    cs.CL

    CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning

    Authors: Zeqiu Wu, Yi Luan, Hannah Rashkin, David Reitter, Hannaneh Hajishirzi, Mari Ostendorf, Gaurav Singh Tomar

    Abstract: Compared to standard retrieval tasks, passage retrieval for conversational question answering (CQA) poses new challenges in understanding the current user question, as each question needs to be interpreted within the dialogue context. Moreover, it can be expensive to re-train well-established retrievers such as search engines that are originally developed for non-conversational queries. To facilit…

    Submitted 28 October, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: EMNLP 2022 camera-ready

  6. arXiv:2107.06963  [pdf, other]

    cs.CL

    Increasing Faithfulness in Knowledge-Grounded Dialogue with Controllable Features

    Authors: Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das

    Abstract: Knowledge-grounded dialogue systems are intended to convey information that is based on evidence provided in a given source text. We discuss the challenges of training a generative neural dialogue model for such systems that is controlled to stay faithful to the evidence. Existing datasets contain a mix of conversational responses that are faithful to selected evidence as well as more subjective o…

    Submitted 14 July, 2021; originally announced July 2021.

    Comments: ACL 2021

  7. arXiv:2105.00071  [pdf, other]

    cs.CL

    Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark

    Authors: Nouha Dziri, Hannah Rashkin, Tal Linzen, David Reitter

    Abstract: Knowledge-grounded dialogue systems powered by large language models often generate responses that, while fluent, are not attributable to a relevant source of information. Progress towards models that do not exhibit this issue requires evaluation metrics that can quantify its prevalence. To this end, we introduce the Benchmark for Evaluation of Grounded INteraction (BEGIN), comprised of 12k dialog…

    Submitted 28 June, 2022; v1 submitted 30 April, 2021; originally announced May 2021.

    Comments: TACL, 12 pages, 9 figures, 2 tables

  8. arXiv:2010.13816  [pdf, other]

    cs.CL cs.AI

    PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction

    Authors: Xinyao Ma, Maarten Sap, Hannah Rashkin, Yejin Choi

    Abstract: Unconscious biases continue to be prevalent in modern text and media, calling for algorithms that can assist writers with bias correction. For example, a female character in a story is often portrayed as passive and powerless ("She daydreams about being a doctor") while a man is portrayed as more proactive and powerful ("He pursues his dream of being a doctor"). We formulate *Controllable Debias…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  9. arXiv:2004.14967  [pdf, other]

    cs.CL

    PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking

    Authors: Hannah Rashkin, Asli Celikyilmaz, Yejin Choi, Jianfeng Gao

    Abstract: We propose the task of outline-conditioned story generation: given an outline as a set of phrases that describe key characters and events to appear in a story, the task is to generate a coherent narrative that is consistent with the provided outline. This task is challenging as the input only provides a rough sketch of the plot, and thus, models need to generate a story by interweaving the key poi…

    Submitted 9 October, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

    Comments: EMNLP 2020

  10. arXiv:1908.05739  [pdf, other]

    cs.CL

    Abductive Commonsense Reasoning

    Authors: Chandra Bhagavatula, Ronan Le Bras, Chaitanya Malaviya, Keisuke Sakaguchi, Ari Holtzman, Hannah Rashkin, Doug Downey, Scott Wen-tau Yih, Yejin Choi

    Abstract: Abductive reasoning is inference to the most plausible explanation. For example, if Jenny finds her house in a mess when she returns from work, and remembers that she left a window open, she can hypothesize that a thief broke into her house and caused the mess, as the most plausible explanation. While abduction has long been considered to be at the core of how people interpret and read between the…

    Submitted 13 February, 2020; v1 submitted 15 August, 2019; originally announced August 2019.

    Comments: ICLR 2020 Camera Ready

  11. arXiv:1906.05317  [pdf, other]

    cs.CL cs.AI

    COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

    Authors: Antoine Bosselut, Hannah Rashkin, Maarten Sap, Chaitanya Malaviya, Asli Celikyilmaz, Yejin Choi

    Abstract: We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017). Contrary to many conventional KBs that store knowledge with canonical templates, commonsense KBs only store loosely structured open-text descriptions of knowledge. We posit that an important step toward auto…

    Submitted 14 June, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

    Comments: Accepted to ACL 2019

  12. arXiv:1905.12616  [pdf, other]

    cs.CL cs.CY

    Defending Against Neural Fake News

    Authors: Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, Yejin Choi

    Abstract: Recent progress in natural language generation has raised dual-use concerns. While applications like summarization and translation are positive, the underlying technology also might enable adversaries to generate neural fake news: targeted propaganda that closely mimics the style of real news. Modern computer security relies on careful threat modeling: identifying potential threats and vulnerabi…

    Submitted 11 December, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: NeurIPS 2019 camera ready version. Project page/code/demo at https://rowanzellers.com/grover

  13. arXiv:1904.09728  [pdf, other]

    cs.CL

    SocialIQA: Commonsense Reasoning about Social Interactions

    Authors: Maarten Sap, Hannah Rashkin, Derek Chen, Ronan LeBras, Yejin Choi

    Abstract: We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: "Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?" A: "Make sure no one else could hear"). Through crow…

    Submitted 9 September, 2019; v1 submitted 22 April, 2019; originally announced April 2019.

    Comments: the first two authors contributed equally; accepted to EMNLP 2019; camera ready version

  14. arXiv:1811.00207  [pdf, other]

    cs.CL

    Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset

    Authors: Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau

    Abstract: One challenge for dialogue agents is recognizing feelings in the conversation partner and replying accordingly, a key communicative skill. While it is straightforward for humans to recognize and acknowledge others' feelings in a conversation, this is a significant challenge for AI systems due to the paucity of suitable publicly-available datasets for training and evaluation. This work proposes a n…

    Submitted 28 August, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: accepted to ACL 2019 (long paper)

  15. arXiv:1811.00146  [pdf, other]

    cs.CL

    ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning

    Authors: Maarten Sap, Ronan LeBras, Emily Allaway, Chandra Bhagavatula, Nicholas Lourie, Hannah Rashkin, Brendan Roof, Noah A. Smith, Yejin Choi

    Abstract: We present ATOMIC, an atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge. Compared to existing resources that center around taxonomic knowledge, ATOMIC focuses on inferential knowledge organized as typed if-then relations with variables (e.g., "if X pays Y a compliment, then Y will likely return the compliment"). We propose nine if-then re…

    Submitted 7 February, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: AAAI 2019 CR

  16. arXiv:1805.06939  [pdf, other]

    cs.CL

    Event2Mind: Commonsense Inference on Events, Intents, and Reactions

    Authors: Hannah Rashkin, Maarten Sap, Emily Allaway, Noah A. Smith, Yejin Choi

    Abstract: We investigate a new commonsense inference task: given an event described in a short free-form text ("X drinks coffee in the morning"), a system reasons about the likely intents ("X wants to stay awake") and reactions ("X feels alert") of the event's participants. To support this study, we construct a new crowdsourced corpus of 25,000 event phrases covering a diverse range of everyday events and s…

    Submitted 14 June, 2019; v1 submitted 17 May, 2018; originally announced May 2018.

    Comments: Accepted to ACL 2018 (long paper). First two authors contributed equally. arXiv admin note: text overlap with arXiv:1903.06901 by other authors

  17. arXiv:1805.06533  [pdf, other]

    cs.CL

    Modeling Naive Psychology of Characters in Simple Commonsense Stories

    Authors: Hannah Rashkin, Antoine Bosselut, Maarten Sap, Kevin Knight, Yejin Choi

    Abstract: Understanding a narrative requires reading between the lines and reasoning about the unspoken but obvious implications about events and people's mental states - a capability that is trivial for humans but remarkably hard for machines. To facilitate research addressing this challenge, we introduce a new annotation framework to explain naive psychology of story characters as fully-specified chains o…

    Submitted 16 May, 2018; originally announced May 2018.

    Comments: Accepted to ACL 2018 (long paper)

  18. arXiv:1506.02739  [pdf, other]

    cs.CL

    Connotation Frames: A Data-Driven Investigation

    Authors: Hannah Rashkin, Sameer Singh, Yejin Choi

    Abstract: Through a particular choice of a predicate (e.g., "x violated y"), a writer can subtly connote a range of implied sentiments and presupposed facts about the entities x and y: (1) writer's perspective: projecting x as an "antagonist" and y as a "victim", (2) entities' perspective: y probably dislikes x, (3) effect: something bad happened to y, (4) value: y is something valuable, and (5) mental state…

    Submitted 21 August, 2016; v1 submitted 8 June, 2015; originally announced June 2015.

    Comments: 11 pages, published in Proceedings of ACL 2016