Showing 1–29 of 29 results for author: Fisch, A

  1. arXiv:2406.04291  [pdf, other]

    cs.LG stat.ML

    Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

    Authors: Adam Fisch, Joshua Maynez, R. Alex Hofer, Bhuwan Dhingra, Amir Globerson, William W. Cohen

    Abstract: Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. PPI achieves this by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate -- but potentially biased -- automatic system, in a way that results in tighter confidence intervals for certain parameters of interest (e.g., the mean…

    Submitted 6 June, 2024; originally announced June 2024.

  2. arXiv:2406.02657  [pdf, other]

    cs.CL cs.AI cs.LG

    Block Transformer: Global-to-Local Language Modeling for Fast Inference

    Authors: Namgyu Ho, Sangmin Bae, Taehyeon Kim, Hyunjik Jo, Yireun Kim, Tal Schuster, Adam Fisch, James Thorne, Se-Young Yun

    Abstract: This paper presents the Block Transformer architecture which adopts hierarchical global-to-local modeling to autoregressive transformers to mitigate the inference bottlenecks of self-attention. To apply self-attention, the key-value (KV) cache of all previous sequences must be retrieved from memory at every decoding step. Thereby, this KV cache IO becomes a significant bottleneck in batch inferenc…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 30 pages, 21 figures, 5 tables

  3. arXiv:2405.19316  [pdf, other]

    cs.LG cs.CL

    Robust Preference Optimization through Reward Model Distillation

    Authors: Adam Fisch, Jacob Eisenstein, Vicky Zayats, Alekh Agarwal, Ahmad Beirami, Chirag Nagpal, Pete Shaw, Jonathan Berant

    Abstract: Language model (LM) post-training (or alignment) involves maximizing a reward function that is derived from preference annotations. Direct Preference Optimization (DPO) is a popular offline alignment method that trains a policy directly on preference data without the need to train a reward model or apply reinforcement learning. However, typical preference datasets have only a single, or at most a…

    Submitted 29 May, 2024; originally announced May 2024.

  4. arXiv:2405.06034  [pdf, other]

    cs.LG

    Bayesian Prediction-Powered Inference

    Authors: R. Alex Hofer, Joshua Maynez, Bhuwan Dhingra, Adam Fisch, Amir Globerson, William W. Cohen

    Abstract: Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. Specifically, PPI methods provide tighter confidence intervals by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate, but potentially biased, automatic system. We propose a framework for PPI based on Bayesian inference that…

    Submitted 9 May, 2024; originally announced May 2024.

  5. arXiv:2405.01563  [pdf, other]

    cs.LG cs.AI cs.CL

    Mitigating LLM Hallucinations via Conformal Abstention

    Authors: Yasin Abbasi Yadkori, Ilja Kuzborskij, David Stutz, András György, Adam Fisch, Arnaud Doucet, Iuliya Beloshapka, Wei-Hung Weng, Yao-Yuan Yang, Csaba Szepesvári, Ali Taylan Cemgil, Nenad Tomasev

    Abstract: We develop a principled procedure for determining when a large language model (LLM) should abstain from responding (e.g., by saying "I don't know") in a general domain, instead of resorting to possibly "hallucinating" a non-sensical or incorrect answer. Building on earlier approaches that use self-consistency as a more reliable measure of model confidence, we propose using the LLM itself to self-e…

    Submitted 4 April, 2024; originally announced May 2024.

  6. arXiv:2312.09244  [pdf, other]

    cs.LG

    Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

    Authors: Jacob Eisenstein, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D'Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant

    Abstract: Reward models play a key role in aligning language model applications towards human preferences. However, this setup creates an incentive for the language model to exploit errors in the reward model to achieve high estimated reward, a phenomenon often termed "reward hacking". A natural mitigation is to train an ensemble of reward models, aggregating over model outputs to obtain a more robust…

    Submitted 20 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

  7. arXiv:2312.01692  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Risk-Controlling Model Selection via Guided Bayesian Optimization

    Authors: Bracha Laufer-Goldshtein, Adam Fisch, Regina Barzilay, Tommi Jaakkola

    Abstract: Adjustable hyperparameters of machine learning models typically impact various key trade-offs such as accuracy, fairness, robustness, or inference cost. Our goal in this paper is to find a configuration that adheres to user-specified limits on certain risks while being useful with respect to other conflicting metrics. We solve this by combining Bayesian Optimization (BO) with rigorous risk-control…

    Submitted 4 December, 2023; originally announced December 2023.

  8. arXiv:2307.05741  [pdf, other]

    cs.CL

    Towards Robust and Efficient Continual Language Learning

    Authors: Adam Fisch, Amal Rannen-Triki, Razvan Pascanu, Jörg Bornschein, Angeliki Lazaridou, Elena Gribovskaya, Marc'Aurelio Ranzato

    Abstract: As the application space of language models continues to evolve, a natural question to ask is how we can quickly adapt models to new tasks. We approach this classic question from a continual learning perspective, in which we aim to continue fine-tuning models trained on past tasks on new tasks, with the goal of "transferring" relevant knowledge. However, this strategy also runs the risk of doing m…

    Submitted 11 July, 2023; originally announced July 2023.

  9. arXiv:2306.10193  [pdf, other]

    cs.CL cs.LG

    Conformal Language Modeling

    Authors: Victor Quach, Adam Fisch, Tal Schuster, Adam Yala, Jae Ho Sohn, Tommi S. Jaakkola, Regina Barzilay

    Abstract: We propose a novel approach to conformal prediction for generative language models (LMs). Standard conformal prediction produces prediction sets -- in place of single predictions -- that have rigorous, statistical performance guarantees. LM responses are typically sampled from the model's predicted distribution over the large, combinatorial output space of natural language. Translating this proces…

    Submitted 1 June, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: ICLR 2024

  10. arXiv:2210.07913  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Efficiently Controlling Multiple Risks with Pareto Testing

    Authors: Bracha Laufer-Goldshtein, Adam Fisch, Regina Barzilay, Tommi Jaakkola

    Abstract: Machine learning applications frequently come with multiple diverse objectives and constraints that can change over time. Accordingly, trained models can be tuned with sets of hyper-parameters that affect their predictive behavior (e.g., their run-time efficiency versus error rate). As the number of constraints and hyper-parameter dimensions grow, naively selected settings may lead to sub-optimal…

    Submitted 14 October, 2022; originally announced October 2022.

  11. arXiv:2208.12084  [pdf, other]

    cs.LG

    Calibrated Selective Classification

    Authors: Adam Fisch, Tommi Jaakkola, Regina Barzilay

    Abstract: Selective classification allows models to abstain from making predictions (e.g., say "I don't know") when in doubt in order to obtain better effective accuracy. While typical selective models can be effective at producing more accurate predictions on average, they may still allow for wrong predictions that have high confidence, or skip correct predictions that have low confidence. Providing calibr…

    Submitted 20 June, 2024; v1 submitted 25 August, 2022; originally announced August 2022.

  12. arXiv:2208.02814  [pdf, other]

    stat.ME cs.AI cs.LG math.ST stat.ML

    Conformal Risk Control

    Authors: Anastasios N. Angelopoulos, Stephen Bates, Adam Fisch, Lihua Lei, Tal Schuster

    Abstract: We extend conformal prediction to control the expected value of any monotone loss function. The algorithm generalizes split conformal prediction together with its coverage guarantee. Like conformal prediction, the conformal risk control procedure is tight up to an $\mathcal{O}(1/n)$ factor. We also introduce extensions of the idea to distribution shift, quantile risk control, multiple and adversar…

    Submitted 29 April, 2023; v1 submitted 4 August, 2022; originally announced August 2022.

    Comments: Code available at https://github.com/aangelopoulos/conformal-risk

  13. arXiv:2207.07061  [pdf, other]

    cs.CL cs.LG

    Confident Adaptive Language Modeling

    Authors: Tal Schuster, Adam Fisch, Jai Gupta, Mostafa Dehghani, Dara Bahri, Vinh Q. Tran, Yi Tay, Donald Metzler

    Abstract: Recent advances in Transformer-based large language models (LLMs) have led to significant performance improvements across many tasks. These gains come with a drastic increase in the models' size, potentially leading to slow and costly use at inference time. In practice, however, the series of generations made by LLMs is composed of varying levels of difficulty. While certain predictions truly bene…

    Submitted 25 October, 2022; v1 submitted 14 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2022 (selected as Oral)

  14. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  15. arXiv:2202.07650  [pdf, other]

    cs.LG

    Conformal Prediction Sets with Limited False Positives

    Authors: Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

    Abstract: We develop a new approach to multi-label conformal prediction in which we aim to output a precise set of promising prediction candidates with a bounded number of incorrect answers. Standard conformal prediction provides the ability to adapt to model uncertainty by constructing a calibrated candidate set in place of a single prediction, with guarantees that the set contains the correct answer with…

    Submitted 15 February, 2022; originally announced February 2022.

  16. arXiv:2104.08803  [pdf, other]

    cs.CL cs.AI cs.LG

    Consistent Accelerated Inference via Confident Adaptive Transformers

    Authors: Tal Schuster, Adam Fisch, Tommi Jaakkola, Regina Barzilay

    Abstract: We develop a novel approach for confidently accelerating inference in the large and expensive multilayer Transformers that are now ubiquitous in natural language processing (NLP). Amortized or approximate computational methods increase efficiency, but can come with unpredictable performance costs. In this work, we present CATs -- Confident Adaptive Transformers -- in which we simultaneously increa…

    Submitted 9 September, 2021; v1 submitted 18 April, 2021; originally announced April 2021.

    Comments: EMNLP 2021

  17. arXiv:2103.08541  [pdf, other]

    cs.CL cs.IR cs.LG

    Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence

    Authors: Tal Schuster, Adam Fisch, Regina Barzilay

    Abstract: Typical fact verification models use retrieved written evidence to verify claims. Evidence sources, however, often change over time as more information is gathered and revised. In order to adapt, models must be sensitive to subtle differences in supporting evidence. We present VitaminC, a benchmark infused with challenging cases that require fact verification models to discern and adjust to slight…

    Submitted 15 March, 2021; originally announced March 2021.

    Comments: NAACL 2021

  18. arXiv:2102.08898  [pdf, other]

    cs.LG cs.AI cs.CL

    Few-shot Conformal Prediction with Auxiliary Tasks

    Authors: Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

    Abstract: We develop a novel approach to conformal prediction when the target task has limited data available for training. Conformal prediction identifies a small set of promising output candidates in place of a single prediction, with guarantees that the set contains the correct answer with high probability. When training data is limited, however, the predicted set can easily become unusably large. In thi…

    Submitted 20 July, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: ICML camera ready

  19. arXiv:2012.15723  [pdf, other]

    cs.CL cs.LG

    Making Pre-trained Language Models Better Few-shot Learners

    Authors: Tianyu Gao, Adam Fisch, Danqi Chen

    Abstract: The recent GPT-3 model (Brown et al., 2020) achieves remarkable few-shot performance solely by leveraging a natural-language prompt and a few task demonstrations as input context. Inspired by their findings, we study few-shot learning in a more practical scenario, where we use smaller language models for which fine-tuning is computationally efficient. We present LM-BFF--better few-shot fine-tuning…

    Submitted 2 June, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Accepted to ACL 2021. The code is publicly available at https://github.com/princeton-nlp/LM-BFF

  20. arXiv:2011.04264  [pdf, other]

    cs.CL cs.CV

    CapWAP: Captioning with a Purpose

    Authors: Adam Fisch, Kenton Lee, Ming-Wei Chang, Jonathan H. Clark, Regina Barzilay

    Abstract: The traditional image captioning task uses generic reference captions to provide textual information about images. Different user populations, however, will care about different visual aspects of images. In this paper, we propose a new task, Captioning with a Purpose (CapWAP). Our goal is to develop systems that can be tailored to be useful for the information needs of an intended population, rath…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: EMNLP 2020

  21. arXiv:2007.03114  [pdf, other]

    cs.LG stat.ML

    Efficient Conformal Prediction via Cascaded Inference with Expanded Admission

    Authors: Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

    Abstract: In this paper, we present a novel approach for conformal prediction (CP), in which we aim to identify a set of promising prediction candidates -- in place of a single prediction. This set is guaranteed to contain a correct answer with high probability, and is well-suited for many open-ended classification tasks. In the standard CP paradigm, the predicted set can often be unusably large and also co…

    Submitted 2 February, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: ICLR 2021. Revision of "Relaxed Conformal Prediction Cascades for Efficient Inference Over Many Labels"

  22. arXiv:1910.09753  [pdf, ps, other]

    cs.CL

    MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension

    Authors: Adam Fisch, Alon Talmor, Robin Jia, Minjoon Seo, Eunsol Choi, Danqi Chen

    Abstract: We present the results of the Machine Reading for Question Answering (MRQA) 2019 shared task on evaluating the generalization capabilities of reading comprehension systems. In this task, we adapted and unified 18 distinct question answering datasets into the same format. Among them, six datasets were made available for training, six datasets were made available for development, and the final six w…

    Submitted 20 December, 2019; v1 submitted 21 October, 2019; originally announced October 2019.

    Comments: EMNLP 2019 Workshop on Machine Reading for Question Answering

  23. arXiv:1909.09279  [pdf, other]

    cs.CL

    Working Hard or Hardly Working: Challenges of Integrating Typology into Neural Dependency Parsers

    Authors: Adam Fisch, Jiang Guo, Regina Barzilay

    Abstract: This paper explores the task of leveraging typology in the context of cross-lingual dependency parsing. While this linguistic information has shown great promise in pre-neural parsing, results for neural architectures have been mixed. The aim of our investigation is to better understand this state-of-the-art. Our main findings are as follows: 1) The benefit of typological information is derived fr…

    Submitted 19 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  24. arXiv:1806.01947  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    A linear time method for the detection of point and collective anomalies

    Authors: Alexander T. M. Fisch, Idris A. Eckley, Paul Fearnhead

    Abstract: The challenge of efficiently identifying anomalies in data sequences is an important statistical problem that now arises in many applications. Whilst there has been substantial work aimed at making statistical analyses robust to outliers, or point anomalies, there has been much less work on detecting anomalous segments, or collective anomalies, particularly in those settings where point anomalies…

    Submitted 11 April, 2019; v1 submitted 5 June, 2018; originally announced June 2018.

  25. arXiv:1709.03856  [pdf, ps, other]

    cs.CL

    StarSpace: Embed All The Things!

    Authors: Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston

    Abstract: We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by emb…

    Submitted 20 November, 2017; v1 submitted 12 September, 2017; originally announced September 2017.

  26. arXiv:1705.06476  [pdf, other]

    cs.CL

    ParlAI: A Dialog Research Software Platform

    Authors: Alexander H. Miller, Will Feng, Adam Fisch, Jiasen Lu, Dhruv Batra, Antoine Bordes, Devi Parikh, Jason Weston

    Abstract: We introduce ParlAI (pronounced "par-lay"), an open-source software platform for dialog research implemented in Python, available at http://parl.ai. Its goal is to provide a unified framework for sharing, training and testing of dialog models, integration of Amazon Mechanical Turk for data collection, human evaluation, and online/reinforcement learning; and a repository of machine learning models…

    Submitted 8 March, 2018; v1 submitted 18 May, 2017; originally announced May 2017.

  27. arXiv:1704.00051  [pdf, other]

    cs.CL

    Reading Wikipedia to Answer Open-Domain Questions

    Authors: Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes

    Abstract: This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. This task of machine reading at scale combines the challenges of document retrieval (finding the relevant articles) with that of machine comprehension of text (identifying the answer spans from those articles). Our a…

    Submitted 27 April, 2017; v1 submitted 31 March, 2017; originally announced April 2017.

    Comments: ACL2017, 10 pages

  28. arXiv:1703.03846  [pdf, other]

    cs.GT

    Socially Optimal Mining Pools

    Authors: Ben A. Fisch, Rafael Pass, Abhi Shelat

    Abstract: Mining for Bitcoins is a high-risk high-reward activity. Miners, seeking to reduce their variance and earn steadier rewards, collaborate in pooling strategies where they jointly mine for Bitcoins. Whenever some pool participant is successful, the earned rewards are appropriately split among all pool participants. Currently, a dozen different pooling strategies (i.e., methods for distributing the…

    Submitted 10 March, 2017; originally announced March 2017.

  29. arXiv:1606.03126  [pdf, other]

    cs.CL

    Key-Value Memory Networks for Directly Reading Documents

    Authors: Alexander Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, Jason Weston

    Abstract: Directly reading documents and being able to answer questions from them is an unsolved challenge. To avoid its inherent difficulty, question answering (QA) has been directed towards using Knowledge Bases (KBs) instead, which has proven effective. Unfortunately KBs often suffer from being too restrictive, as the schema cannot support certain types of answers, and too sparse, e.g. Wikipedia contains…

    Submitted 10 October, 2016; v1 submitted 9 June, 2016; originally announced June 2016.