Showing 1–50 of 63 results for author: Linzen, T

  1. arXiv:2407.04593  [pdf, other]

    cs.CL

    Testing learning hypotheses using neural networks by manipulating learning data

    Authors: Cara Su-Yi Leong, Tal Linzen

    Abstract: Although passivization is productive in English, it is not completely general -- some exceptions exist (e.g. *One hour was lasted by the meeting). How do English speakers learn these exceptions to an otherwise general pattern? Using neural network language models as theories of acquisition, we explore the sources of indirect evidence that a learner can leverage to learn whether a verb can passiviz…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Submitted to Journal of Memory and Language

  2. arXiv:2404.06214  [pdf, other]

    cs.CL

    [Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

    Authors: Leshem Choshen, Ryan Cotterell, Michael Y. Hu, Tal Linzen, Aaron Mueller, Candace Ross, Alex Warstadt, Ethan Wilcox, Adina Williams, Chengxu Zhuang

    Abstract: After last year's successful BabyLM Challenge, the competition will be hosted again in 2024/2025. The overarching goals of the challenge remain the same; however, some of the competition rules will be different. The big changes for this year's competition are as follows: First, we replace the loose track with a paper track, which allows (for example) non-model-based submissions, novel cognitively-…

    Submitted 9 April, 2024; originally announced April 2024.

  3. arXiv:2403.07202  [pdf, other]

    cs.CL

    SPAWNing Structural Priming Predictions from a Cognitively Motivated Parser

    Authors: Grusha Prasad, Tal Linzen

    Abstract: Structural priming is a widely used psycholinguistic paradigm to study human sentence representations. In this work we propose a framework for using empirical priming patterns to build a theory characterizing the structural representations humans construct when processing sentences. This framework uses a new cognitively motivated parser, SPAWN, to generate quantitative priming predictions from the…

    Submitted 11 March, 2024; originally announced March 2024.

  4. arXiv:2402.13956  [pdf, other]

    cs.CL

    Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment

    Authors: William Merrill, Zhaofeng Wu, Norihito Naka, Yoon Kim, Tal Linzen

    Abstract: Do LMs infer the semantics of text from co-occurrence patterns in their training data? Merrill et al. (2022) argue that, in theory, sentence co-occurrence probabilities predicted by an optimal LM should reflect the entailment relationship of the constituent sentences, but it is unclear whether probabilities predicted by neural LMs encode entailment in this way because of strong assumptions made by…

    Submitted 17 July, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  5. arXiv:2311.07811  [pdf, other]

    cs.CL

    In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax

    Authors: Aaron Mueller, Albert Webson, Jackson Petty, Tal Linzen

    Abstract: In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks: given labeled examples in the input context, the LLM learns to perform the task without weight updates. Do models guided via ICL infer the underlying structure of the task defined by the context, or do they rely on superficial heuristics that only generalize to identically distributed examples? We…

    Submitted 10 April, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL 2024

  6. arXiv:2311.00445  [pdf, other]

    cs.CL cs.AI cs.LG

    A Systematic Comparison of Syllogistic Reasoning in Humans and Language Models

    Authors: Tiwalayo Eisape, MH Tessler, Ishita Dasgupta, Fei Sha, Sjoerd van Steenkiste, Tal Linzen

    Abstract: A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises. Psychologists have documented several ways in which humans' inferences deviate from the rules of logic. Do language models, which are trained on text generated by humans, replicate such human biases, or are they able to overcome them? Focusing on the case of sy…

    Submitted 11 April, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  7. arXiv:2310.19956  [pdf, other]

    cs.CL

    The Impact of Depth on Compositional Generalization in Transformer Language Models

    Authors: Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Dan Garrette, Tal Linzen

    Abstract: To process novel sentences, language models (LMs) must generalize compositionally -- combine familiar elements in new ways. What aspects of a model's structure promote compositional generalization? Focusing on transformers, we test the hypothesis, motivated by theoretical and empirical work, that deeper transformers generalize more compositionally. Simply adding layers increases the total number o…

    Submitted 10 April, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted to NAACL 2024

  8. arXiv:2310.16142  [pdf, other]

    cs.CL cs.AI cs.LG

    A Language Model with Limited Memory Capacity Captures Interference in Human Sentence Processing

    Authors: William Timkey, Tal Linzen

    Abstract: Two of the central factors believed to underpin human sentence processing difficulty are expectations and retrieval from working memory. A recent attempt to create a unified cognitive model integrating these two factors relied on the parallels between the self-attention mechanism of transformer language models and cue-based retrieval theories of working memory in human sentence processing (Ryu and…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: To appear in Findings of the Association for Computational Linguistics: EMNLP 2023

  9. arXiv:2310.15151  [pdf, other]

    cs.CL cs.AI cs.LG

    Verb Conjugation in Transformers Is Determined by Linear Encodings of Subject Number

    Authors: Sophie Hao, Tal Linzen

    Abstract: Deep architectures such as Transformers are sometimes criticized for having uninterpretable "black-box" representations. We use causal intervention analysis to show that, in fact, some linguistic features are represented in a linear, interpretable format. Specifically, we show that BERT's ability to conjugate verbs relies on a linear encoding of subject number that can be manipulated with predicta…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: To appear in Findings of the Association for Computational Linguistics: EMNLP 2023

  10. arXiv:2310.15040  [pdf, other]

    cs.CL

    SLOG: A Structural Generalization Benchmark for Semantic Parsing

    Authors: Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao, Najoung Kim

    Abstract: The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions. Existing benchmarks often focus on lexical generalization, the interpretation of novel lexical items in syntactic structures familiar from training; structural generalization tasks, where a model needs to interpret syntactic structures that are themselves unfamiliar…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023

  11. arXiv:2308.05576  [pdf, other]

    cs.CL

    Do Language Models' Words Refer?

    Authors: Matthew Mandelkern, Tal Linzen

    Abstract: What do language models (LMs) do with language? Everyone agrees that they can produce sequences of (mostly) coherent strings of English. But do those sentences mean something, or are LMs simply babbling in a convincing simulacrum of language use? Here we will address one aspect of this broad question: whether LMs' words can refer, that is, achieve "word-to-world" connections. There is prima facie…

    Submitted 4 March, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

  12. Language Models Can Learn Exceptions to Syntactic Rules

    Authors: Cara Su-Yi Leong, Tal Linzen

    Abstract: Artificial neural networks can generalize productively to novel contexts. Can they also learn exceptions to those productive rules? We explore this question using the case of restrictions on English passivization (e.g., the fact that "The vacation lasted five days" is grammatical, but "*Five days was lasted by the vacation" is not). We collect human acceptability judgments for passive sentences wi…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: Accepted to SCiL 2023

  13. arXiv:2305.19905  [pdf, other]

    cs.CL

    How to Plant Trees in Language Models: Data and Architectural Effects on the Emergence of Syntactic Inductive Biases

    Authors: Aaron Mueller, Tal Linzen

    Abstract: Accurate syntactic representations are essential for robust generalization in natural language. Recent work has found that pre-training can teach language models to rely on hierarchical syntactic features - as opposed to incorrect linear features - when performing tasks after fine-tuning. We test what aspects of pre-training are important for endowing encoder-decoder Transformers with an inductive…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  14. arXiv:2301.11462  [pdf, other]

    cs.CL

    How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

    Authors: Aditya Yedetore, Tal Linzen, Robert Frank, R. Thomas McCoy

    Abstract: When acquiring syntax, children consistently choose hierarchical rules over competing non-hierarchical possibilities. Is this preference due to a learning bias for hierarchical structure, or due to more general biases that interact with hierarchical cues in children's linguistic input? We explore these possibilities by training LSTMs and Transformers - two types of neural networks without a hierar…

    Submitted 6 June, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: 10 pages plus references and appendices; accepted to ACL

    ACM Class: J.4; I.2.7

  15. arXiv:2212.10769  [pdf, other]

    cs.CL

    Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models

    Authors: Najoung Kim, Tal Linzen, Paul Smolensky

    Abstract: Human linguistic capacity is often characterized by compositionality and the generalization it enables -- human learners can produce and comprehend novel complex expressions by composing known parts. Several benchmarks exploit distributional control across training and test to gauge compositional generalization, where certain lexical items only occur in limited contexts during training. While rece…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: Preprint

  16. arXiv:2210.14328  [pdf, other]

    cs.CL

    Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models

    Authors: Aaron Mueller, Yu Xia, Tal Linzen

    Abstract: Structural probing work has found evidence for latent syntactic information in pre-trained language models. However, much of this analysis has focused on monolingual models, and analyses of multilingual models have employed correlational methods that are confounded by the choice of probing tasks. In this study, we causally probe multilingual language models (XGLM and multilingual BERT) as well as…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: Accepted to CoNLL 2022

  17. arXiv:2210.13569  [pdf, other]

    cs.CL

    Characterizing Verbatim Short-Term Memory in Neural Language Models

    Authors: Kristijan Armeni, Christopher Honey, Tal Linzen

    Abstract: When a language model is trained to predict natural language sequences, its prediction at each moment depends on a representation of prior context. What kind of information about the prior context can language models retrieve? We tested whether language models could retrieve the exact words that occurred previously in a text. In our paradigm, language models (transformers and an LSTM) processed En…

    Submitted 1 May, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: V2 corrects an issue with tokenization for one of the models (Wikitext-103 transformer). The relevant figures and the accompanying text were updated. This update does not affect conclusions which remain the same as in previous version

  18. arXiv:2210.12187  [pdf, other]

    cs.CL

    Syntactic Surprisal From Neural Models Predicts, But Underestimates, Human Processing Difficulty From Syntactic Ambiguities

    Authors: Suhas Arehalli, Brian Dillon, Tal Linzen

    Abstract: Humans exhibit garden path effects: When reading sentences that are temporarily structurally ambiguous, they slow down when the structure is disambiguated in favor of the less preferred alternative. Surprisal theory (Hale, 2001; Levy, 2008), a prominent explanation of this finding, proposes that these slowdowns are due to the unpredictability of each of the words that occur in these sentences. Cha…

    Submitted 1 August, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: 13 pages (4 references + appendix), 6 figures. To appear in the proceedings of the 2022 SIGNLL Conference on Computational Natural Language Learning. Revised after fixing errors in computing syntactic surprisal. The fix resulted in an increase in the NPZ GP effect observed and no evidence for a correlation between syntactic surprisal and word frequency. The main findings are unchanged

  19. arXiv:2209.12407  [pdf, other]

    cs.CL

    Entailment Semantics Can Be Extracted from an Ideal Language Model

    Authors: William Merrill, Alex Warstadt, Tal Linzen

    Abstract: Language models are often trained on text alone, without additional grounding. There is debate as to how much of natural language semantics can be inferred from such a procedure. We prove that entailment judgments between sentences can be extracted from an ideal language model that has perfectly learned its target distribution, assuming the training sentences are generated by Gricean agents, i.e.,…

    Submitted 8 January, 2024; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted at CONLL 2022. Updated Dec 4, 2023 and Jan 8, 2024 with erratum

  20. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  21. arXiv:2205.03472  [pdf, other]

    cs.CL

    When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it

    Authors: Sebastian Schuster, Tal Linzen

    Abstract: Understanding longer narratives or participating in conversations requires tracking of discourse entities that have been mentioned. Indefinite noun phrases (NPs), such as 'a dog', frequently introduce discourse entities but this behavior is modulated by sentential operators such as negation. For example, 'a dog' in 'Arthur doesn't own a dog' does not introduce a discourse entity due to the presenc…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: To appear at NAACL 2022

  22. arXiv:2203.09397  [pdf, other]

    cs.CL

    Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models

    Authors: Aaron Mueller, Robert Frank, Tal Linzen, Luheng Wang, Sebastian Schuster

    Abstract: Relations between words are governed by hierarchical structure rather than linear ordering. Sequence-to-sequence (seq2seq) models, despite their success in downstream NLP applications, often fail to generalize in a hierarchy-sensitive manner when performing syntactic transformations - for example, transforming declarative sentences into questions. However, syntactic evaluations of seq2seq models h…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted to Findings of ACL 2022

  23. arXiv:2112.07610  [pdf, other]

    cs.CL

    Improving Compositional Generalization with Latent Structure and Data Augmentation

    Authors: Linlu Qiu, Peter Shaw, Panupong Pasupat, Paweł Krzysztof Nowak, Tal Linzen, Fei Sha, Kristina Toutanova

    Abstract: Generic unstructured neural networks have been shown to struggle on out-of-distribution compositional generalization. Compositional data augmentation via example recombination has transferred some prior knowledge about compositionality to such black-box neural models for several semantic parsing tasks, but this often required task-specific engineering or provided limited gains. We present a more…

    Submitted 4 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: NAACL 2022

  24. arXiv:2111.09509  [pdf, other]

    cs.CL

    How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN

    Authors: R. Thomas McCoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz

    Abstract: Current language models can generate high-quality text. Are they simply copying text they have seen before, or have they learned generalizable linguistic abstractions? To tease apart these possibilities, we introduce RAVEN, a suite of analyses for assessing the novelty of generated text, focusing on sequential structure (n-grams) and syntactic structure. We apply these analyses to four neural lang…

    Submitted 17 November, 2021; originally announced November 2021.

    Comments: 10 pages, plus 39 pages of appendices

  25. arXiv:2111.05013  [pdf, other]

    cs.CL cs.LG

    Learning to Generalize Compositionally by Transferring Across Semantic Parsing Tasks

    Authors: Wang Zhu, Peter Shaw, Tal Linzen, Fei Sha

    Abstract: Neural network models often generalize poorly to mismatched domains or distributions. In NLP, this issue arises in particular when models are expected to generalize compositionally, that is, to novel combinations of familiar words and constructions. We investigate learning representations that facilitate transfer learning from one compositional task to another: the representation and the task-spec…

    Submitted 9 November, 2021; originally announced November 2021.

  26. arXiv:2109.07848  [pdf, other]

    cs.CL

    The Language Model Understood the Prompt was Ambiguous: Probing Syntactic Uncertainty Through Generation

    Authors: Laura Aina, Tal Linzen

    Abstract: Temporary syntactic ambiguities arise when the beginning of a sentence is compatible with multiple syntactic analyses. We inspect to which extent neural language models (LMs) exhibit uncertainty over such analyses when processing temporarily ambiguous inputs, and how that uncertainty is modulated by disambiguating cues. We probe the LM's expectations by generating from it: we use stochastic decodi…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: To appear in Proceedings of BlackboxNLP 2021: Analyzing and Interpreting Neural Networks for NLP

  27. arXiv:2109.07020  [pdf, other]

    cs.CL

    Frequency Effects on Syntactic Rule Learning in Transformers

    Authors: Jason Wei, Dan Garrette, Tal Linzen, Ellie Pavlick

    Abstract: Pre-trained language models perform well on a variety of linguistic tasks that require symbolic reasoning, raising the question of whether such models implicitly represent abstract symbols and rules. We investigate this question using the case study of BERT's performance on English subject-verb agreement. Unlike prior work, we train multiple instances of BERT from scratch, allowing us to perform a…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Camera ready for EMNLP 2021

  28. arXiv:2109.06987  [pdf, other]

    cs.CL

    NOPE: A Corpus of Naturally-Occurring Presuppositions in English

    Authors: Alicia Parrish, Sebastian Schuster, Alex Warstadt, Omar Agha, Soo-Hwan Lee, Zhuoye Zhao, Samuel R. Bowman, Tal Linzen

    Abstract: Understanding language requires grasping not only the overtly stated content, but also making inferences about things that were left unsaid. These inferences include presuppositions, a phenomenon by which a listener learns about new information through reasoning about what a speaker takes as given. Presuppositions require complex understanding of the lexical and syntactic properties that trigger t…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: CoNLL 2021. Data and code available at https://github.com/nyu-mll/nope

  29. arXiv:2106.16163  [pdf, other]

    cs.CL

    The MultiBERTs: BERT Reproductions for Robustness Analysis

    Authors: Thibault Sellam, Steve Yadlowsky, Jason Wei, Naomi Saphra, Alexander D'Amour, Tal Linzen, Jasmijn Bastings, Iulia Turc, Jacob Eisenstein, Dipanjan Das, Ian Tenney, Ellie Pavlick

    Abstract: Experiments with pre-trained models such as BERT are often based on a single checkpoint. While the conclusions drawn apply to the artifact tested in the experiment (i.e., the particular instance of the model), it is not always clear whether they hold for the more general procedure which includes the architecture, training data, initialization scheme, and loss function. Recent work has shown that r…

    Submitted 21 March, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

    Comments: Accepted at ICLR'22. Checkpoints and example analyses: http://goo.gle/multiberts

  30. arXiv:2106.06087  [pdf, other]

    cs.CL

    Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models

    Authors: Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart Shieber, Tal Linzen, Yonatan Belinkov

    Abstract: Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts. To elucidate the mechanisms by which the models accomplish this behavior, this study applies causal mediation analysis to pre-trained neural language models. We investigate the magnitude of models' preferences for grammatical inflections, as well as whether ne…

    Submitted 22 June, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: Accepted to ACL-IJCNLP 2021

    MSC Class: 68T50 ACM Class: I.2.7

  31. arXiv:2105.06965  [pdf, other]

    cs.CL

    Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction

    Authors: Shauli Ravfogel, Grusha Prasad, Tal Linzen, Yoav Goldberg

    Abstract: When language models process syntactically complex sentences, do they use their representations of syntax in a manner that is consistent with the grammar of the language? We propose AlterRep, an intervention-based method to address this question. For any linguistic feature of a given sentence, AlterRep generates counterfactual representations by altering how the feature is encoded, while leaving i…

    Submitted 15 September, 2021; v1 submitted 14 May, 2021; originally announced May 2021.

    Comments: Equal contribution by SR and GP. Accepted in CoNLL 2021

  32. arXiv:2105.00071  [pdf, other]

    cs.CL

    Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark

    Authors: Nouha Dziri, Hannah Rashkin, Tal Linzen, David Reitter

    Abstract: Knowledge-grounded dialogue systems powered by large language models often generate responses that, while fluent, are not attributable to a relevant source of information. Progress towards models that do not exhibit this issue requires evaluation metrics that can quantify its prevalence. To this end, we introduce the Benchmark for Evaluation of Grounded INteraction (BEGIN), comprised of 12k dialog…

    Submitted 28 June, 2022; v1 submitted 30 April, 2021; originally announced May 2021.

    Comments: TACL, 12 pages, 9 figures, 2 tables

  33. arXiv:2104.07179  [pdf, other]

    cs.CL

    Does Putting a Linguist in the Loop Improve NLU Data Collection?

    Authors: Alicia Parrish, William Huang, Omar Agha, Soo-Hwan Lee, Nikita Nangia, Alex Warstadt, Karmanya Aggarwal, Emily Allaway, Tal Linzen, Samuel R. Bowman

    Abstract: Many crowdsourced NLP datasets contain systematic gaps and biases that are identified only after data collection is complete. Identifying these issues from early data samples during crowdsourcing should make mitigation more efficient, especially when done iteratively. We take natural language inference as a test case and ask whether it is beneficial to put a linguist 'in the loop' during data coll…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: 14 pages, 10 figures

  34. arXiv:2010.05465  [pdf, other]

    cs.CL

    COGS: A Compositional Generalization Challenge Based on Semantic Interpretation

    Authors: Najoung Kim, Tal Linzen

    Abstract: Natural language is characterized by compositionality: the meaning of a complex expression is constructed from the meanings of its constituent parts. To facilitate the evaluation of the compositional abilities of language processing architectures, we introduce COGS, a semantic parsing dataset based on a fragment of English. The evaluation portion of COGS contains multiple systematic gaps that can…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: Accepted to EMNLP 2020

  35. arXiv:2006.16324  [pdf, other]

    cs.CL cs.LG

    Universal linguistic inductive biases via meta-learning

    Authors: R. Thomas McCoy, Erin Grant, Paul Smolensky, Thomas L. Griffiths, Tal Linzen

    Abstract: How do learners acquire languages from the limited data available to them? This process must involve some inductive biases - factors that affect how a learner generalizes - but it is unclear which inductive biases can explain observed patterns in language acquisition. To facilitate computational modeling aimed at addressing this question, we introduce a framework for giving particular linguistic i…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: To appear in the Proceedings of the 42nd Annual Conference of the Cognitive Science Society

  36. arXiv:2005.00955  [pdf, other]

    cs.CL

    How Can We Accelerate Progress Towards Human-like Linguistic Generalization?

    Authors: Tal Linzen

    Abstract: This position paper describes and critiques the Pretraining-Agnostic Identically Distributed (PAID) evaluation paradigm, which has become a central tool for measuring progress in natural language understanding. This paradigm consists of three stages: (1) pre-training of a word prediction model on a corpus of arbitrary size; (2) fine-tuning (transfer learning) on a training set representing a class…

    Submitted 2 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  37. arXiv:2005.00187  [pdf, other]

    cs.CL

    Cross-Linguistic Syntactic Evaluation of Word Prediction Models

    Authors: Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia Talmina, Tal Linzen

    Abstract: A range of studies have concluded that neural word prediction models can distinguish grammatical from ungrammatical sentences with high accuracy. However, these studies are based primarily on monolingual evidence from English. To investigate how these models' ability to learn syntax varies by language, we introduce CLAMS (Cross-Linguistic Assessment of Models on Syntax), a syntactic evaluation sui…

    Submitted 21 May, 2020; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: Accepted for presentation at ACL 2020

  38. arXiv:2005.00019  [pdf, other]

    cs.CL

    Representations of Syntax [MASK] Useful: Effects of Constituency and Dependency Structure in Recursive LSTMs

    Authors: Michael A. Lepori, Tal Linzen, R. Thomas McCoy

    Abstract: Sequence-based neural networks show significant sensitivity to syntactic structure, but they still perform less well on syntactic tasks than tree-based networks. Such tree-based networks can be provided with a constituency parse, a dependency parse, or both. We evaluate which of these two representational schemes more effectively introduces biases for syntactic structure that increase performance…

    Submitted 30 April, 2020; originally announced May 2020.

    Comments: To appear in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL-2020)

  39. arXiv:2004.11999  [pdf, other]

    cs.CL

    Syntactic Data Augmentation Increases Robustness to Inference Heuristics

    Authors: Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen

    Abstract: Pretrained neural models such as BERT, when fine-tuned to perform natural language inference (NLI), often show high accuracy on standard datasets, but display a surprising lack of sensitivity to word order on controlled challenge sets. We hypothesize that this issue is not primarily caused by the pretrained model's limitations, but rather by the paucity of crowdsourced NLI examples that might conv…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: ACL 2020

  40. Syntactic Structure from Deep Learning

    Authors: Tal Linzen, Marco Baroni

    Abstract: Modern deep neural networks achieve impressive performance in engineering applications that require extensive linguistic skills, such as machine translation. This success has sparked interest in probing whether these models are inducing human-like grammatical knowledge from the raw data they are exposed to, and, consequently, whether they can shed new light on long-standing debates concerning the…

    Submitted 22 April, 2020; originally announced April 2020.

    Comments: In press at Annual Reviews of Linguistics

  41. arXiv:2001.03632  [pdf, other]

    cs.CL

    Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks

    Authors: R. Thomas McCoy, Robert Frank, Tal Linzen

    Abstract: Learners that are exposed to the same training data might generalize differently due to differing inductive biases. In neural network models, inductive biases could in theory arise from any aspect of the model architecture. We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation a…

    Submitted 10 January, 2020; originally announced January 2020.

    Comments: 12 pages, 10 figures; accepted to TACL

  42. arXiv:1911.02969  [pdf, other]

    cs.CL

    BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance

    Authors: R. Thomas McCoy, Junghyun Min, Tal Linzen

    Abstract: If the same neural network architecture is trained multiple times on the same dataset, will it make similar linguistic generalizations across runs? To study this question, we fine-tuned 100 instances of BERT on the Multi-genre Natural Language Inference (MNLI) dataset and evaluated them on the HANS dataset, which evaluates syntactic generalization in natural language inference. On the MNLI develop…

    Submitted 16 November, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

    Comments: 11 pages, 7 figures; accepted to the 2020 BlackboxNLP workshop

  43. Discovering the Compositional Structure of Vector Representations with Role Learning Networks

    Authors: Paul Soulos, Tom McCoy, Tal Linzen, Paul Smolensky

    Abstract: How can neural networks perform so well on compositional tasks even though they lack explicit compositional representations? We use a novel analysis technique called ROLE to show that recurrent neural networks perform well on such tasks by converging to solutions which implicitly represent symbolic structure. This method uncovers a symbolic structure which, when properly embedded in vector space,…

    Submitted 16 November, 2020; v1 submitted 20 October, 2019; originally announced October 2019.

  44. arXiv:1909.10579  [pdf, other]

    cs.CL

    Using Priming to Uncover the Organization of Syntactic Representations in Neural Language Models

    Authors: Grusha Prasad, Marten van Schijndel, Tal Linzen

    Abstract: Neural language models (LMs) perform well on tasks that require sensitivity to syntactic structure. Drawing on the syntactic priming paradigm from psycholinguistics, we propose a novel technique to analyze the representations that enable such success. By establishing a gradient similarity metric between structures, this technique allows us to reconstruct the organization of the LMs' syntactic repr…

    Submitted 23 September, 2019; originally announced September 2019.

    Comments: 9 pages paper, 2 pages references and 3 pages supplementary materials. Code for the templates and analyses can be found here: https://github.com/grushaprasad/RNN-Priming

    Journal ref: CoNLL 2019

  45. arXiv:1909.00111  [pdf, other]

    cs.CL

    Quantity doesn't buy quality syntax with neural language models

    Authors: Marten van Schijndel, Aaron Mueller, Tal Linzen

    Abstract: Recurrent neural networks can learn to predict upcoming words remarkably well on average; in syntactically complex contexts, however, they often assign unexpectedly high probabilities to ungrammatical words. We investigate to what extent these shortcomings can be mitigated by increasing the size of the network and the corpus on which it is trained. We find that gains from increasing network size a…

    Submitted 30 August, 2019; originally announced September 2019.

    Comments: Accepted for presentation at EMNLP-IJCNLP 2019

  46. arXiv:1904.11544  [pdf, other]

    cs.CL

    Probing What Different NLP Tasks Teach Machines about Function Word Comprehension

    Authors: Najoung Kim, Roma Patel, Adam Poliak, Alex Wang, Patrick Xia, R. Thomas McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel R. Bowman, Ellie Pavlick

    Abstract: We introduce a set of nine challenge tasks that test for the understanding of function words. These tasks are created by structurally mutating sentences from existing datasets to target the comprehension of specific types of function words (e.g., prepositions, wh-words). Using these probing tasks, we explore the effects of various pretraining objectives for sentence encoders (e.g., language modeli…

    Submitted 7 August, 2019; v1 submitted 25 April, 2019; originally announced April 2019.

    Comments: Accepted to *SEM 2019 (revised submission). Corresponding authors: Najoung Kim (n.kim@jhu.edu), Ellie Pavlick (ellie_pavlick@brown.edu)

  47. arXiv:1904.04063  [pdf, ps, other]

    cs.CL stat.ML

    Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop

    Authors: Afra Alishahi, Grzegorz Chrupała, Tal Linzen

    Abstract: The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner-workings and representations acquired by neural models of language. Approaches included: systematic manipulation of input to neural networks and investigating the impact on their performance, testing whether interpretable knowledge can be decoded from inter…

    Submitted 5 April, 2019; originally announced April 2019.

  48. arXiv:1903.06400  [pdf, other]

    cs.CL

    Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages

    Authors: Shauli Ravfogel, Yoav Goldberg, Tal Linzen

    Abstract: How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language? Cross-linguistic comparisons of RNNs' syntactic performance (e.g., on subject-verb agreement prediction) are complicated by the fact that any two languages differ in multiple typological properties, as well as by differences in training c…

    Submitted 26 March, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

    Comments: Accepted as a long paper in NAACL 2019

  49. arXiv:1902.01007  [pdf, other]

    cs.CL

    Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference

    Authors: R. Thomas McCoy, Ellie Pavlick, Tal Linzen

    Abstract: A machine learning system can score well on a given test set by relying on heuristics that are effective for frequent example types but break down in more challenging cases. We study this issue within natural language inference (NLI), the task of determining whether one sentence entails another. We hypothesize that statistical NLI models may adopt three fallible syntactic heuristics: the lexical o…

    Submitted 24 June, 2019; v1 submitted 3 February, 2019; originally announced February 2019.

    Comments: Camera-ready for ACL 2019

  50. arXiv:1901.04587  [pdf, other]

    cs.CL

    Human few-shot learning of compositional instructions

    Authors: Brenden M. Lake, Tal Linzen, Marco Baroni

    Abstract: People learn in fast and flexible ways that have not been emulated by machines. Once a person learns a new verb "dax," he or she can effortlessly understand how to "dax twice," "walk and dax," or "dax vigorously." There have been striking recent improvements in machine learning for natural language processing, yet the best algorithms require vast amounts of experience and struggle to generalize ne…

    Submitted 10 May, 2019; v1 submitted 14 January, 2019; originally announced January 2019.

    Comments: Please cite as: Lake, B. M., Linzen, T., and Baroni, M. (2019). Human few-shot learning of compositional instructions. In Proceedings of the 41st Annual Conference of the Cognitive Science Society