Showing 1–12 of 12 results for author: Lakretz, Y

  1. arXiv:2406.12620  [pdf, other]

    cs.CL

    What Makes Two Language Models Think Alike?

    Authors: Jeanne Salle, Louis Jalouzot, Nur Lan, Emmanuel Chemla, Yair Lakretz

    Abstract: Do architectural differences significantly affect the way models represent and process language? We propose a new approach, based on metric-learning encoding models (MLEMs), as a first step to answer this question. The approach provides a feature-based comparison of how any two layers of any two models represent linguistic information. We apply the method to BERT, GPT-2 and Mamba. Unlike previous…

    Submitted 24 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: 7 pages, 6 figures

  2. arXiv:2402.11608  [pdf, other]

    cs.CL

    Metric-Learning Encoding Models Identify Processing Profiles of Linguistic Features in BERT's Representations

    Authors: Louis Jalouzot, Robin Sobczyk, Bastien Lhopitallier, Jeanne Salle, Nur Lan, Emmanuel Chemla, Yair Lakretz

    Abstract: We introduce Metric-Learning Encoding Models (MLEMs) as a new approach to understand how neural systems represent the theoretical features of the objects they process. As a proof-of-concept, we apply MLEMs to neural representations extracted from BERT, and track a wide variety of linguistic features (e.g., tense, subject person, clause type, clause embedding). We find that: (1) linguistic features…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 17 pages, 13 figures

  3. arXiv:2306.03586  [pdf, other]

    cs.CL cs.AI

    Language acquisition: do children and language models follow similar learning stages?

    Authors: Linnea Evanson, Yair Lakretz, Jean-Rémi King

    Abstract: During language acquisition, children follow a typical sequence of learning stages, whereby they first learn to categorize phonemes before they develop their lexicon and eventually master increasingly complex syntactic structures. However, the computational principles that lead to this learning trajectory remain largely unknown. To investigate this, we here compare the learning trajectories of dee…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023. *Equal Contribution

  4. arXiv:2305.13863  [pdf, other]

    cs.CL

    Probing Brain Context-Sensitivity with Masked-Attention Generation

    Authors: Alexandre Pasquiou, Yair Lakretz, Bertrand Thirion, Christophe Pallier

    Abstract: Two fundamental questions in neurolinguistics concern the brain regions that integrate information beyond the lexical level, and the size of their window of integration. To address these questions we introduce a new approach named masked-attention generation. It uses GPT-2 transformers to generate word embeddings that capture a fixed amount of contextual information. We then tested whether these…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: 2 pages, 2 figures, CCN 2023

    Journal ref: CCN 2023

  5. arXiv:2302.14389  [pdf, other]

    cs.CL

    Information-Restricted Neural Language Models Reveal Different Brain Regions' Sensitivity to Semantics, Syntax and Context

    Authors: Alexandre Pasquiou, Yair Lakretz, Bertrand Thirion, Christophe Pallier

    Abstract: A fundamental question in neurolinguistics concerns the brain regions involved in syntactic and semantic processing during speech comprehension, both at the lexical (word processing) and supra-lexical levels (sentence and discourse processing). To what extent are these regions separated or intertwined? To address this question, we trained a lexical language model, GloVe, and a supra-lexical langua…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: 19 pages, 8 figures, 10 pages of Appendix, 5 appendix figures

  6. arXiv:2207.03380  [pdf, other]

    cs.AI cs.CL

    Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps

    Authors: Alexandre Pasquiou, Yair Lakretz, John Hale, Bertrand Thirion, Christophe Pallier

    Abstract: Neural Language Models (NLMs) have made tremendous advances during the last years, achieving impressive performance on various linguistic tasks. Capitalizing on this, studies in neuroscience have started to use NLMs to study neural activity in the human brain during language processing. However, many questions remain unanswered regarding which factors determine the ability of a neural language mod…

    Submitted 7 July, 2022; originally announced July 2022.

    Journal ref: ICML 2022 - 39th International Conference on Machine Learning, Jul 2022, Baltimore, United States. pp.18

  7. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  8. arXiv:2110.07240  [pdf, other]

    cs.CL

    Causal Transformers Perform Below Chance on Recursive Nested Constructions, Unlike Humans

    Authors: Yair Lakretz, Théo Desbordes, Dieuwke Hupkes, Stanislas Dehaene

    Abstract: Recursive processing is considered a hallmark of human linguistic abilities. A recent study evaluated recursive processing in recurrent neural language models (RNN-LMs) and showed that such models perform below chance level on embedded dependencies within nested constructions -- a prototypical example of recursion in natural language. Here, we study if state-of-the-art Transformer LMs do any bette…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: None

  9. arXiv:2101.02258  [pdf, other]

    cs.CL

    Can RNNs learn Recursive Nested Subject-Verb Agreements?

    Authors: Yair Lakretz, Théo Desbordes, Jean-Rémi King, Benoît Crabbé, Maxime Oquab, Stanislas Dehaene

    Abstract: One of the fundamental principles of contemporary linguistics states that language processing requires the ability to extract recursively nested tree structures. However, it remains unclear whether and how this code could be implemented in neural circuits. Recent advances in Recurrent Neural Networks (RNNs), which achieve near-human performance in some language tasks, provide a compelling model to…

    Submitted 6 January, 2021; originally announced January 2021.

  10. Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans

    Authors: Yair Lakretz, Dieuwke Hupkes, Alessandra Vergallito, Marco Marelli, Marco Baroni, Stanislas Dehaene

    Abstract: Recursive processing in sentence comprehension is considered a hallmark of human linguistic abilities. However, its underlying neural mechanisms remain largely unknown. We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing, namely the storing of grammatical number and gender information in working memory and…

    Submitted 3 May, 2021; v1 submitted 19 June, 2020; originally announced June 2020.

    Journal ref: Lakretz et al. (2021), Cognition

  11. arXiv:1903.07435  [pdf, other]

    cs.CL

    The emergence of number and syntax units in LSTM language models

    Authors: Yair Lakretz, German Kruszewski, Theo Desbordes, Dieuwke Hupkes, Stanislas Dehaene, Marco Baroni

    Abstract: Recent work has shown that LSTMs trained on a generic language modeling objective capture syntax-sensitive generalizations such as long-distance number agreement. However, we have no mechanistic understanding of how they accomplish this remarkable feat. Some have conjectured it depends on heuristics that do not truly take hierarchical structure into account. We present here a detailed study of the…

    Submitted 2 April, 2019; v1 submitted 18 March, 2019; originally announced March 2019.

    Comments: To appear in Proceedings of NAACL, Minneapolis, MN, 2019

  12. arXiv:1809.07824  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Metric Learning for Phoneme Perception

    Authors: Yair Lakretz, Gal Chechik, Evan-Gary Cohen, Alessandro Treves, Naama Friedmann

    Abstract: Metric functions for phoneme perception capture the similarity structure among phonemes in a given language and therefore play a central role in phonology and psycho-linguistics. Various phenomena depend on phoneme similarity, such as spoken word recognition or serial recall from verbal working memory. This study presents a new framework for learning a metric function for perceptual distances amon…

    Submitted 20 September, 2018; originally announced September 2018.