Showing 1–26 of 26 results for author: Warstadt, A

  1. arXiv:2404.06214

    cs.CL

    [Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

    Authors: Leshem Choshen, Ryan Cotterell, Michael Y. Hu, Tal Linzen, Aaron Mueller, Candace Ross, Alex Warstadt, Ethan Wilcox, Adina Williams, Chengxu Zhuang

    Abstract: After last year's successful BabyLM Challenge, the competition will be hosted again in 2024/2025. The overarching goals of the challenge remain the same; however, some of the competition rules will be different. The big changes for this year's competition are as follows: First, we replace the loose track with a paper track, which allows (for example) non-model-based submissions, novel cognitively-…

    Submitted 9 April, 2024; originally announced April 2024.

  2. arXiv:2403.14208

    cs.CL

    Automatic Annotation of Grammaticality in Child-Caregiver Conversations

    Authors: Mitja Nikolaus, Abhishek Agrawal, Petros Kaklamanis, Alex Warstadt, Abdellah Fourtassi

    Abstract: The acquisition of grammar has been a central question to adjudicate between theories of language acquisition. In order to conduct faster, more reproducible, and larger-scale corpus studies on grammaticality in child-caregiver conversations, tools for automatic annotation can offer an effective alternative to tedious manual annotation. We propose a coding scheme for context-dependent grammaticalit…

    Submitted 21 March, 2024; originally announced March 2024.

    Journal ref: LREC-Coling 2024, May 2024, Turin, Italy

  3. arXiv:2402.17936

    cs.CL

    Acquiring Linguistic Knowledge from Multimodal Input

    Authors: Theodor Amariucai, Alex Warstadt

    Abstract: In contrast to children, language models (LMs) exhibit considerably inferior data efficiency when acquiring language. In this submission to the BabyLM Challenge (Warstadt et al., 2023), we test the hypothesis that this data efficiency gap is partly caused by a lack of multimodal input and grounding in the learning environment of typical language models. Although previous work looking into this que…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: in Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning

  4. arXiv:2312.02931

    cs.CL cs.AI

    WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words

    Authors: Lukas Wolf, Greta Tuckute, Klemen Kotar, Eghbal Hosseini, Tamar Regev, Ethan Wilcox, Alex Warstadt

    Abstract: Training on multiple modalities of input can augment the capabilities of a language model. Here, we ask whether such a training regime can improve the quality and efficiency of these systems as well. We focus on text-audio and introduce WhisBERT, which is inspired by the text-image approach of FLAVA (Singh et al., 2022). In accordance with BabyLM guidelines (Warstadt et al., 2023), we pretrain W…

    Submitted 6 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Published at the BabyLM Challenge, a shared task co-sponsored by CMCL 2023 and CoNLL 2023, hosted by EMNLP 2023

  5. arXiv:2311.17233

    cs.CL cs.AI cs.IT cs.LG

    Quantifying the redundancy between prosody and text

    Authors: Lukas Wolf, Tiago Pimentel, Evelina Fedorenko, Ryan Cotterell, Alex Warstadt, Ethan Wilcox, Tamar Regev

    Abstract: Prosody -- the suprasegmental component of speech, including pitch, loudness, and tempo -- carries critical aspects of meaning. However, the relationship between the information conveyed by prosody vs. by the words themselves remains poorly understood. We use large language models (LLMs) to estimate how much information is redundant between prosody and the words themselves. Using a large spoken co…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: Published at The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)

  6. arXiv:2307.15054

    cs.CL

    A Geometric Notion of Causal Probing

    Authors: Clément Guerner, Anej Svete, Tianyu Liu, Alexander Warstadt, Ryan Cotterell

    Abstract: The linear subspace hypothesis (Bolukbasi et al., 2016) states that, in a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace. Prior work has relied on auxiliary classification tasks to identify and evaluate candidate subspaces that might give support for this hypothesis. We instead give a set of intrinsic criteria which char…

    Submitted 24 February, 2024; v1 submitted 27 July, 2023; originally announced July 2023.

  7. arXiv:2307.03056

    cs.LG cs.AI cs.CL

    Generalizing Backpropagation for Gradient-Based Interpretability

    Authors: Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, Ryan Cotterell

    Abstract: Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs. While these methods can indicate which input features may be important for the model's prediction, they reveal little about the inner workings of the model itself. In this paper, we observe that the gradient computation of a model is a speci…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: Long paper accepted at ACL 2023

  8. arXiv:2301.11796

    cs.CL

    Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

    Authors: Alex Warstadt, Leshem Choshen, Aaron Mueller, Adina Williams, Ethan Wilcox, Chengxu Zhuang

    Abstract: We present the call for papers for the BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus. This shared task is intended for participants with an interest in small scale language modeling, human language acquisition, low-resource NLP, and cognitive modeling. In partnership with CoNLL and CMCL, we provide a platform for approaches to pretraining with a limited-size…

    Submitted 27 January, 2023; originally announced January 2023.

  9. arXiv:2212.10792

    cs.CL

    Reconstruction Probing

    Authors: Najoung Kim, Jatin Khilnani, Alex Warstadt, Abed Qaddoumi

    Abstract: We propose reconstruction probing, a new analysis method for contextualized representations based on reconstruction probabilities in masked language models (MLMs). This method relies on comparing the reconstruction probabilities of tokens in a given sequence when conditioned on the representation of a single token that has been fully contextualized and when conditioned on only the decontextualized…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: Preprint

  10. arXiv:2209.12407

    cs.CL

    Entailment Semantics Can Be Extracted from an Ideal Language Model

    Authors: William Merrill, Alex Warstadt, Tal Linzen

    Abstract: Language models are often trained on text alone, without additional grounding. There is debate as to how much of natural language semantics can be inferred from such a procedure. We prove that entailment judgments between sentences can be extracted from an ideal language model that has perfectly learned its target distribution, assuming the training sentences are generated by Gricean agents, i.e.,…

    Submitted 8 January, 2024; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted at CoNLL 2022. Updated Dec 4, 2023 and Jan 8, 2024 with erratum

  11. arXiv:2208.07998

    cs.CL

    What Artificial Neural Networks Can Tell Us About Human Language Acquisition

    Authors: Alex Warstadt, Samuel R. Bowman

    Abstract: Rapid progress in machine learning for natural language processing has the potential to transform debates about how humans learn language. However, the learning environments and biases of current artificial learners and humans diverge in ways that weaken the impact of the evidence obtained from learning simulations. For example, today's most effective neural language models are trained on roughly…

    Submitted 11 February, 2024; v1 submitted 16 August, 2022; originally announced August 2022.

    Comments: Please cite the published version with the following information: @incollection{warstadt2022artificial, title={What artificial neural networks can tell us about human language acquisition}, author={Warstadt, Alex and Bowman, Samuel R.}, booktitle={Algebraic Structures in Natural Language}, pages={17--60}, year={2022}, publisher={CRC Press} }

  12. arXiv:2206.04615

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  13. arXiv:2203.06342

    cs.CL cs.AI

    What Makes Reading Comprehension Questions Difficult?

    Authors: Saku Sugawara, Nikita Nangia, Alex Warstadt, Samuel R. Bowman

    Abstract: For a natural language understanding benchmark to be useful in research, it has to consist of examples that are diverse and difficult enough to discriminate among current and near-future state-of-the-art systems. However, we do not yet know how best to select text sources to collect a variety of challenging examples. In this study, we crowdsource multiple-choice reading comprehension questions for…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  14. arXiv:2109.06987

    cs.CL

    NOPE: A Corpus of Naturally-Occurring Presuppositions in English

    Authors: Alicia Parrish, Sebastian Schuster, Alex Warstadt, Omar Agha, Soo-Hwan Lee, Zhuoye Zhao, Samuel R. Bowman, Tal Linzen

    Abstract: Understanding language requires grasping not only the overtly stated content, but also making inferences about things that were left unsaid. These inferences include presuppositions, a phenomenon by which a listener learns about new information through reasoning about what a speaker takes as given. Presuppositions require complex understanding of the lexical and syntactic properties that trigger t…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: CoNLL 2021. Data and code available at https://github.com/nyu-mll/nope

  15. arXiv:2106.00794

    cs.CL cs.AI cs.HC

    What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks?

    Authors: Nikita Nangia, Saku Sugawara, Harsh Trivedi, Alex Warstadt, Clara Vania, Samuel R. Bowman

    Abstract: Crowdsourcing is widely used to create data for common natural language understanding tasks. Despite the importance of these datasets for measuring and refining model understanding of language, there has been little focus on the crowdsourcing methods used for collecting the datasets. In this paper, we compare the efficacy of interventions that have been proposed in prior work as ways of improving…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  16. arXiv:2104.07179

    cs.CL

    Does Putting a Linguist in the Loop Improve NLU Data Collection?

    Authors: Alicia Parrish, William Huang, Omar Agha, Soo-Hwan Lee, Nikita Nangia, Alex Warstadt, Karmanya Aggarwal, Emily Allaway, Tal Linzen, Samuel R. Bowman

    Abstract: Many crowdsourced NLP datasets contain systematic gaps and biases that are identified only after data collection is complete. Identifying these issues from early data samples during crowdsourcing should make mitigation more efficient, especially when done iteratively. We take natural language inference as a test case and ask whether it is beneficial to put a linguist 'in the loop' during data coll…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: 14 pages, 10 figures

  17. arXiv:2101.11131

    cs.CL

    CLiMP: A Benchmark for Chinese Language Model Evaluation

    Authors: Beilei Xiang, Changbing Yang, Yu Li, Alex Warstadt, Katharina Kann

    Abstract: Linguistically informed analyses of language models (LMs) contribute to the understanding and improvement of these models. Here, we introduce the corpus of Chinese linguistic minimal pairs (CLiMP), which can be used to investigate what knowledge Chinese LMs acquire. CLiMP consists of sets of 1,000 minimal pairs (MPs) for 16 syntactic contrasts in Mandarin, covering 9 major Mandarin linguistic phen…

    Submitted 26 January, 2021; originally announced January 2021.

  18. arXiv:2011.04946

    cs.CL

    When Do You Need Billions of Words of Pretraining Data?

    Authors: Yian Zhang, Alex Warstadt, Haau-Sing Li, Samuel R. Bowman

    Abstract: NLP is currently dominated by general-purpose pretrained language models like RoBERTa, which achieve strong performance on NLU tasks through pretraining on billions of words. But what exact knowledge or skills do Transformer LMs learn from large-scale pretraining that they cannot learn from less data? We adopt four probing methods: classifier probing, information-theoretic probing, unsupervised r…

    Submitted 10 November, 2020; originally announced November 2020.

    Comments: 10 pages, 6 figures

  19. arXiv:2010.05358

    cs.CL

    Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)

    Authors: Alex Warstadt, Yian Zhang, Haau-Sing Li, Haokun Liu, Samuel R. Bowman

    Abstract: One reason pretraining on self-supervised linguistic tasks is effective is that it teaches models features that are helpful for language understanding. However, we want pretrained models to learn not only to represent linguistic features, but also to use those features preferentially during fine-tuning. With this goal in mind, we introduce a new English-language diagnostic set called MSGS (the Mi…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: accepted at EMNLP 2020

  20. arXiv:2007.06761

    cs.CL

    Can neural networks acquire a structural bias from raw linguistic data?

    Authors: Alex Warstadt, Samuel R. Bowman

    Abstract: We evaluate whether BERT, a widely used neural network for sentence processing, acquires an inductive bias towards forming structural generalizations through pretraining on raw data. We conduct four experiments testing its preference for structural vs. linear generalizations in different structure-dependent phenomena. We find that BERT makes a structural generalization in 3 out of 4 empirical doma…

    Submitted 23 September, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: To appear in Proceedings of 42nd Annual Meeting of the Cognitive Science Society

  21. arXiv:2004.03066

    cs.CL

    Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition

    Authors: Paloma Jeretic, Alex Warstadt, Suvrat Bhooshan, Adina Williams

    Abstract: Natural language inference (NLI) is an increasingly important task for natural language understanding, which requires one to infer whether a sentence entails another. However, the ability of NLI models to make pragmatic inferences remains understudied. We create an IMPlicature and PRESupposition diagnostic dataset (IMPPRES), consisting of >25k semiautomatically generated sentence pairs illustratin…

    Submitted 13 July, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: to appear in Proceedings of ACL 2020

  22. arXiv:1912.00582

    cs.CL

    BLiMP: The Benchmark of Linguistic Minimal Pairs for English

    Authors: Alex Warstadt, Alicia Parrish, Haokun Liu, Anhad Mohananey, Wei Peng, Sheng-Fu Wang, Samuel R. Bowman

    Abstract: We introduce The Benchmark of Linguistic Minimal Pairs (shortened to BLiMP), a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each containing 1000 minimal pairs isolating specific contrasts in syntax, morphology, or semantics. The data is automatically generated according to expert-crafted grammars, and…

    Submitted 14 February, 2023; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: 2020: Published in TACL; Feb 2023: Corrected erroneous GPT-2 results

  23. arXiv:1909.02597

    cs.CL

    Investigating BERT's Knowledge of Language: Five Analysis Methods with NPIs

    Authors: Alex Warstadt, Yu Cao, Ioana Grosu, Wei Peng, Hagen Blix, Yining Nie, Anna Alsop, Shikha Bordia, Haokun Liu, Alicia Parrish, Sheng-Fu Wang, Jason Phang, Anhad Mohananey, Phu Mon Htut, Paloma Jeretič, Samuel R. Bowman

    Abstract: Though state-of-the-art sentence representation models can perform tasks requiring significant knowledge of grammar, it is an open question how best to evaluate their grammatical knowledge. We explore five experimental methods inspired by prior work evaluating pretrained sentence representation models. We use a single linguistic phenomenon, negative polarity item (NPI) licensing in English, as a c…

    Submitted 19 September, 2019; v1 submitted 5 September, 2019; originally announced September 2019.

    Comments: Accepted to EMNLP 2019; Added link to code+dataset

  24. arXiv:1901.03438

    cs.CL

    Linguistic Analysis of Pretrained Sentence Encoders with Acceptability Judgments

    Authors: Alex Warstadt, Samuel R. Bowman

    Abstract: Recent work on evaluating grammatical knowledge in pretrained sentence encoders gives a fine-grained view of a small number of phenomena. We introduce a new analysis dataset that also has broad coverage of linguistic phenomena. We annotate the development set of the Corpus of Linguistic Acceptability (CoLA; Warstadt et al., 2018) for the presence of 13 classes of syntactic phenomena including vari…

    Submitted 21 May, 2020; v1 submitted 10 January, 2019; originally announced January 2019.

  25. arXiv:1811.10773

    cs.CL

    Verb Argument Structure Alternations in Word and Sentence Embeddings

    Authors: Katharina Kann, Alex Warstadt, Adina Williams, Samuel R. Bowman

    Abstract: Verbs occur in different syntactic environments, or frames. We investigate whether artificial neural networks encode grammatical distinctions necessary for inferring the idiosyncratic frame-selectional properties of verbs. We introduce five datasets, collectively called FAVA, containing in aggregate nearly 10k sentences labeled for grammatical acceptability, illustrating different verbal argument…

    Submitted 26 November, 2018; originally announced November 2018.

    Comments: Accepted to SCiL 2019

  26. arXiv:1805.12471

    cs.CL

    Neural Network Acceptability Judgments

    Authors: Alex Warstadt, Amanpreet Singh, Samuel R. Bowman

    Abstract: This paper investigates the ability of artificial neural networks to judge the grammatical acceptability of a sentence, with the goal of testing their linguistic competence. We introduce the Corpus of Linguistic Acceptability (CoLA), a set of 10,657 English sentences labeled as grammatical or ungrammatical from published linguistics literature. As baselines, we train several recurrent neural netwo…

    Submitted 1 October, 2019; v1 submitted 31 May, 2018; originally announced May 2018.