Showing 1–34 of 34 results for author: Strube, M

  1. arXiv:2404.00999  [pdf, other]

    cs.CL

    What Causes the Failure of Explicit to Implicit Discourse Relation Recognition?

    Authors: Wei Liu, Stephen Wan, Michael Strube

    Abstract: We consider an unanswered question in the discourse processing community: why do relation classifiers trained on explicit examples (with connectives removed) perform poorly in real implicit scenarios? Prior work claimed this is due to linguistic dissimilarity between explicit and implicit examples but provided no empirical evidence. In this study, we show that one cause for such failure is a label…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by NAACL2024 (Long Paper)

  2. arXiv:2402.01025  [pdf, other]

    cs.CL

    Graph-based Clustering for Detecting Semantic Change Across Time and Languages

    Authors: Xianghe Ma, Michael Strube, Wei Zhao

    Abstract: Despite the predominance of contextualized embeddings in NLP, approaches to detect semantic change relying on these embeddings and clustering methods underperform simpler counterparts based on static word embeddings. This stems from the poor quality of the clustering methods to produce sense clusters -- which struggle to capture word senses, especially those with low frequency. This issue hinders…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: EACL2024 Camera Ready (20 pages)

  3. arXiv:2312.01502  [pdf, other]

    cs.LG cs.SI

    Normed Spaces for Graph Embedding

    Authors: Diaaeldin Taha, Wei Zhao, J. Maxwell Riestenberg, Michael Strube

    Abstract: Theoretical results from discrete geometry suggest that normed spaces can abstractly embed finite metric spaces with surprisingly low theoretical bounds on distortion in low dimensions. In this paper, inspired by this theoretical insight, we highlight normed spaces as a more flexible and computationally efficient alternative to several popular Riemannian manifolds for learning graph embeddings. No…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: 23 pages, 7 figures, 9 tables | The first two authors contributed equally

  4. arXiv:2310.17734  [pdf, other]

    cs.CL

    Investigating Multilingual Coreference Resolution by Universal Annotations

    Authors: Haixia Chai, Michael Strube

    Abstract: Multilingual coreference resolution (MCR) has been a long-standing and challenging task. With the newly proposed multilingual coreference dataset, CorefUD (Nedoluzhko et al., 2022), we conduct an investigation into the task by using its harmonized universal morphosyntactic and coreference annotations. First, we study coreference by examining the ground truth data at different linguistic levels, na…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted at Findings of EMNLP2023

  5. arXiv:2306.14064  [pdf, other]

    cs.LG

    Modeling Graphs Beyond Hyperbolic: Graph Neural Networks in Symmetric Positive Definite Matrices

    Authors: Wei Zhao, Federico Lopez, J. Maxwell Riestenberg, Michael Strube, Diaaeldin Taha, Steve Trettel

    Abstract: Recent research has shown that alignment between the structure of graph data and the geometry of an embedding space is crucial for learning high-quality representations of the data. The uniform geometry of Euclidean and hyperbolic spaces allows for representing graphs with uniform geometric and topological features, such as grids and hierarchies, with minimal distortion. However, real-world graph…

    Submitted 24 June, 2023; originally announced June 2023.

    Comments: ECML2023 Camera Ready

  6. arXiv:2306.06480  [pdf, other]

    cs.CL

    Annotation-Inspired Implicit Discourse Relation Classification with Auxiliary Discourse Connective Generation

    Authors: Wei Liu, Michael Strube

    Abstract: Implicit discourse relation classification is a challenging task due to the absence of discourse connectives. To overcome this issue, we design an end-to-end neural model to explicitly generate discourse connectives for the task, inspired by the annotation process of PDTB. Specifically, our model jointly learns to generate discourse connectives between arguments and predict discourse relations bas…

    Submitted 10 June, 2023; originally announced June 2023.

  7. arXiv:2306.06472  [pdf, other]

    cs.CL

    Modeling Structural Similarities between Documents for Coherence Assessment with Graph Convolutional Networks

    Authors: Wei Liu, Xiyan Fu, Michael Strube

    Abstract: Coherence is an important aspect of text quality, and various approaches have been applied to coherence modeling. However, existing methods solely focus on a single document's coherence patterns, ignoring the underlying correlation between documents. We investigate a GCN-based coherence model that is capable of capturing structural similarities between documents. Our model first creates a graph st…

    Submitted 10 June, 2023; originally announced June 2023.

  8. arXiv:2304.01621  [pdf, other]

    cs.CL

    SimCSum: Joint Learning of Simplification and Cross-lingual Summarization for Cross-lingual Science Journalism

    Authors: Mehwish Fatima, Tim Kolber, Katja Markert, Michael Strube

    Abstract: Cross-lingual science journalism generates popular science stories of scientific articles different from the source language for a non-expert audience. Hence, a cross-lingual popular summary must contain the salient content of the input document, and the content should be coherent, comprehensible, and in a local language for the targeted audience. We improve these aspects of cross-lingual summary…

    Submitted 4 April, 2023; originally announced April 2023.

  9. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  10. arXiv:2201.11176  [pdf, other]

    cs.CL

    DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence

    Authors: Wei Zhao, Michael Strube, Steffen Eger

    Abstract: Recently, there has been a growing interest in designing text generation systems from a discourse coherence perspective, e.g., modeling the interdependence between sentences. Still, recent BERT-based evaluation metrics are weak in recognizing coherence, and thus are not reliable in a way to spot the discourse-level improvements of those text generation systems. In this work, we introduce DiscoScor…

    Submitted 6 February, 2023; v1 submitted 26 January, 2022; originally announced January 2022.

    Comments: EACL2023 Camera Ready

  11. arXiv:2112.03256  [pdf, ps, other]

    cs.CL

    Impact of Target Word and Context on End-to-End Metonymy Detection

    Authors: Kevin Alex Mathews, Michael Strube

    Abstract: Metonymy is a figure of speech in which an entity is referred to by another related entity. The task of metonymy detection aims to distinguish metonymic tokens from literal ones. Until now, metonymy detection methods attempt to disambiguate only a single noun phrase in a sentence, typically location names or organization names. In this paper, we disambiguate every word in a sentence by reformulati…

    Submitted 6 December, 2021; originally announced December 2021.

  12. arXiv:2110.13475  [pdf, other]

    cs.LG cs.CG

    Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices

    Authors: Federico López, Beatrice Pozzetti, Steve Trettel, Michael Strube, Anna Wienhard

    Abstract: We propose the use of the vector-valued distance to compute distances and extract geometric information from the manifold of symmetric positive definite matrices (SPD), and develop gyrovector calculus, constructing analogs of vector space operations in this curved space. We implement these operations and showcase their versatility in the tasks of knowledge graph completion, item recommendation, an…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: 30 pages. Accepted at NeurIPS 2021 as spotlight presentation (top 3%)

    ACM Class: I.2

  13. arXiv:2109.09358  [pdf, other]

    cs.CL

    Augmenting the User-Item Graph with Textual Similarity Models

    Authors: Federico López, Martin Scholz, Jessica Yung, Marie Pellat, Michael Strube, Lucas Dixon

    Abstract: This paper introduces a simple and effective form of data augmentation for recommender systems. A paraphrase similarity model is applied to widely available textual data, such as reviews and product descriptions, yielding new semantic relations that are added to the user-item graph. This increases the density of the graph without needing further labeled data. The data augmentation is evaluated on…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: 12 pages, 2 figures

    ACM Class: I.2.7

  14. arXiv:2106.04941  [pdf, other]

    cs.LG cs.CG

    Symmetric Spaces for Graph Embeddings: A Finsler-Riemannian Approach

    Authors: Federico López, Beatrice Pozzetti, Steve Trettel, Michael Strube, Anna Wienhard

    Abstract: Learning faithful graph representations as sets of vertex embeddings has become a fundamental intermediary step in a wide range of machine learning applications. We propose the systematic use of symmetric spaces in representation learning, a class encompassing many of the previously used embedding targets. This enables us to introduce a new method, the use of Finsler metrics integrated in a Rieman…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: 28 pages. Accepted at ICML 2021

    ACM Class: I.2

  15. arXiv:2010.02053  [pdf, other]

    cs.CL

    A Fully Hyperbolic Neural Model for Hierarchical Multi-Class Classification

    Authors: Federico López, Michael Strube

    Abstract: Label inventories for fine-grained entity typing have grown in size and complexity. Nonetheless, they exhibit a hierarchical structure. Hyperbolic spaces offer a mathematically appealing approach for learning hierarchical representations of symbolic data. However, it is not clear how to integrate hyperbolic components into downstream tasks. This is the first work that proposes a fully hyperbolic m…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: 16 pages, accepted at Findings of EMNLP2020

    ACM Class: I.2.7

  16. Adapting Deep Learning Methods for Mental Health Prediction on Social Media

    Authors: Ivan Sekulić, Michael Strube

    Abstract: Mental health poses a significant challenge for an individual's well-being. Text analysis of rich resources, like social media, can contribute to deeper understanding of illnesses and provide means for their early detection. We tackle a challenge of detecting social media users' mental status through deep learning-based models, moving away from traditional approaches to the task. In a binary class…

    Submitted 17 March, 2020; originally announced March 2020.

    Comments: W-NUT at EMNLP 2019

    Journal ref: Proceedings of the 5th Workshop on Noisy User-generated Text, 2019, 322-327

  17. arXiv:1909.12375  [pdf, other]

    cs.CL

    On the Importance of Subword Information for Morphological Tasks in Truly Low-Resource Languages

    Authors: Yi Zhu, Benjamin Heinzerling, Ivan Vulić, Michael Strube, Roi Reichart, Anna Korhonen

    Abstract: Recent work has validated the importance of subword information for word representation learning. Since subwords increase parameter sharing ability in neural models, their value should be even more pronounced in low-data regimes. In this work, we therefore provide a comprehensive analysis focused on the usefulness of subwords for word representation learning in truly low-resource scenarios and for…

    Submitted 26 September, 2019; originally announced September 2019.

    Comments: CONLL2019

  18. arXiv:1906.06703  [pdf, other]

    cs.CL

    Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection

    Authors: Nafise Sadat Moosavi, Leo Born, Massimo Poesio, Michael Strube

    Abstract: The common practice in coreference resolution is to identify and evaluate the maximum span of mentions. The use of maximum spans tangles coreference evaluation with the challenges of mention boundary detection like prepositional phrase attachment. To address this problem, minimum spans are manually annotated in smaller corpora. However, this additional annotation is costly and therefore, this solu…

    Submitted 16 June, 2019; originally announced June 2019.

    Comments: ACL 2019

  19. arXiv:1906.02505  [pdf, other]

    cs.CL

    Fine-Grained Entity Typing in Hyperbolic Space

    Authors: Federico López, Benjamin Heinzerling, Michael Strube

    Abstract: How can we represent hierarchical information present in large type inventories for entity typing? We study the ability of hyperbolic embeddings to capture hierarchical relations between mentions in context and their target types in a shared vector space. We evaluate on two datasets and investigate two different techniques for creating a large hierarchical entity type inventory: from an expert-gen…

    Submitted 6 June, 2019; originally announced June 2019.

    Comments: 12 pages, 4 figures, final version, accepted at the 4th Workshop on Representation Learning for NLP (RepL4NLP), held in conjunction with ACL 2019

    ACM Class: I.2.7

  20. arXiv:1906.01569  [pdf, other]

    cs.CL

    Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation

    Authors: Benjamin Heinzerling, Michael Strube

    Abstract: Pretrained contextual and non-contextual subword embeddings have become available in over 250 languages, allowing massively multilingual NLP. However, while there is no dearth of pretrained embeddings, the distinct lack of systematic evaluations makes it difficult for practitioners to choose between them. In this work, we conduct an extensive evaluation comparing non-contextual subword embeddings,…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: ACL 2019

  21. arXiv:1807.00717  [pdf, other]

    cs.CL

    Transparent, Efficient, and Robust Word Embedding Access with WOMBAT

    Authors: Mark-Christoph Müller, Michael Strube

    Abstract: We present WOMBAT, a Python tool which supports NLP practitioners in accessing word embeddings from code. WOMBAT addresses common research problems, including unified access, scaling, and robust and reproducible preprocessing. Code that uses WOMBAT for accessing word embeddings is not only cleaner, more readable, and easier to reuse, but also much more efficient than code using standard in-memory…

    Submitted 2 July, 2018; originally announced July 2018.

  22. arXiv:1710.02187  [pdf, other]

    cs.CL

    BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages

    Authors: Benjamin Heinzerling, Michael Strube

    Abstract: We present BPEmb, a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE). In an evaluation using fine-grained entity typing as testbed, BPEmb performs competitively, and for some languages better than alternative subword approaches, while requiring vastly fewer resources and no tokenization. BPEmb is available at https://github.com/bheinzerling/bp…

    Submitted 5 October, 2017; originally announced October 2017.

  23. arXiv:1708.00160  [pdf, ps, other]

    cs.CL

    Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers

    Authors: Nafise Sadat Moosavi, Michael Strube

    Abstract: Coreference resolution is an intermediate step for text understanding. It is used in tasks and domains for which we do not necessarily have coreference annotated corpora. Therefore, generalization is of special importance for coreference resolution. However, while recent coreference resolvers have notable improvements on the CoNLL dataset, they struggle to generalize properly to new domains or dat…

    Submitted 12 October, 2018; v1 submitted 1 August, 2017; originally announced August 2017.

    Comments: EMNLP 2018 long paper

  24. arXiv:1707.06456  [pdf, other]

    cs.CL

    Revisiting Selectional Preferences for Coreference Resolution

    Authors: Benjamin Heinzerling, Nafise Sadat Moosavi, Michael Strube

    Abstract: Selectional preferences have long been claimed to be essential for coreference resolution. However, they are mainly modeled only implicitly by current coreference resolvers. We propose a dependency-based embedding model of selectional preferences which allows fine-grained compatibility judgments with high coverage. We show that the incorporation of our model improves coreference resolution perform…

    Submitted 20 July, 2017; originally announced July 2017.

    Comments: EMNLP 2017 - short paper

  25. arXiv:1704.06779  [pdf, other]

    cs.CL

    Lexical Features in Coreference Resolution: To be Used With Caution

    Authors: Nafise Sadat Moosavi, Michael Strube

    Abstract: Lexical features are a major source of information in state-of-the-art coreference resolvers. Lexical features implicitly model some of the linguistic phenomena at a fine granularity level. They are especially useful for representing the context of mentions. In this paper we investigate a drawback of using many lexical features in state-of-the-art coreference resolvers. We show that if coreference…

    Submitted 22 April, 2017; originally announced April 2017.

    Comments: 6 pages, ACL 2017

  26. arXiv:1702.07507  [pdf, ps, other]

    cs.CL

    Use Generalized Representations, But Do Not Forget Surface Features

    Authors: Nafise Sadat Moosavi, Michael Strube

    Abstract: Only a year ago, all state-of-the-art coreference resolvers were using an extensive amount of surface features. Recently, there was a paradigm shift towards using word embeddings and deep neural networks, where the use of surface features is very limited. In this paper, we show that a simple SVM model with surface features outperforms more complex neural models for detecting anaphoric mentions. Ou…

    Submitted 24 February, 2017; originally announced February 2017.

    Comments: CORBON workshop@EACL 2017

  27. Never Look Back: An Alternative to Centering

    Authors: Michael Strube

    Abstract: I propose a model for determining the hearer's attentional state which depends solely on a list of salient discourse entities (S-list). The ordering among the elements of the S-list covers also the function of the backward-looking center in the centering model. The ranking criteria for the S-list are based on the distinction between hearer-old and hearer-new discourse entities and incorporate pr…

    Submitted 25 June, 1998; originally announced June 1998.

    Comments: 7 pages, uses colacl.sty, epsfig.sty, times.sty, lingmacros.sty

    Journal ref: Proceedings of COLING-ACL '98

  28. Centering in-the-large: Computing referential discourse segments

    Authors: Udo Hahn, Michael Strube

    Abstract: We specify an algorithm that builds up a hierarchy of referential discourse segments from local centering data. The spatial extension and nesting of these discourse segments constrain the reachability of potential antecedents of an anaphoric expression beyond the local level of adjacent center pairs. Thus, the centering model is scaled up to the level of the global referential structure of disco…

    Submitted 30 April, 1997; originally announced April 1997.

    Comments: LaTeX, 8 pages

    Journal ref: Proceedings of ACL 97 / EACL 97

  29. arXiv:cmp-lg/9605030  [pdf, ps]

    cs.CL

    Incremental Centering and Center Ambiguity

    Authors: Udo Hahn, Michael Strube

    Abstract: In this paper, we present a model of anaphor resolution within the framework of the centering model. The consideration of an incremental processing mode introduces the need to manage structural ambiguity at the center level. Hence, the centering framework is further refined to account for local and global parsing ambiguities which propagate up to the level of center representations, yielding mod…

    Submitted 16 May, 1996; originally announced May 1996.

    Comments: 6 pages, uuencoded gzipped PS file (see also Technical Report at: http://www.coling.uni-freiburg.de/public/papers/cogsci96-center.ps.gz)

    Journal ref: CogSci '96: Proc. of 18th Annual Conference of the Cognitive Science Society. La Jolla, Ca., Jul 12-15 1996.

  30. arXiv:cmp-lg/9605025  [pdf, ps]

    cs.CL

    A Conceptual Reasoning Approach to Textual Ellipsis

    Authors: Udo Hahn, Katja Markert, Michael Strube

    Abstract: We present a hybrid text understanding methodology for the resolution of textual ellipsis. It integrates conceptual criteria (based on the well-formedness and conceptual strength of role chains in a terminological knowledge base) and functional constraints reflecting the utterances' information structure (based on the distinction between context-bound and unbound discourse elements). The methodo…

    Submitted 15 May, 1996; originally announced May 1996.

    Comments: 5 pages, uuencoded gzipped PS file (see also Technical Report at: http://www.coling.uni-freiburg.de/public/papers/ecai96.ps.gz)

    Journal ref: ECAI '96: Proc. of 12th European Conference on Artificial Intelligence. Budapest, Aug 12-16 1996, pp.572-576

  31. arXiv:cmp-lg/9605022  [pdf, ps]

    cs.CL

    Processing Complex Sentences in the Centering Framework

    Authors: Michael Strube

    Abstract: We extend the centering model for the resolution of intra-sentential anaphora and specify how to handle complex sentences. An empirical evaluation indicates that the functional information structure guides the search for an antecedent within the sentence.

    Submitted 14 May, 1996; originally announced May 1996.

    Comments: 3 pages, uuencoded gzipped PS file (see also Technical Report at: http://www.coling.uni-freiburg.de/public/papers/acl96-student.ps.gz)

    Journal ref: Proceedings of ACL '96 (Santa Cruz), Student Session

  32. arXiv:cmp-lg/9605021  [pdf, ps]

    cs.CL

    Functional Centering

    Authors: Michael Strube, Udo Hahn

    Abstract: Based on empirical evidence from a free word order language (German) we propose a fundamental revision of the principles guiding the ordering of discourse entities in the forward-looking centers within the centering model. We claim that grammatical role criteria should be replaced by indicators of the functional information structure of the utterances, i.e., the distinction between context-bound…

    Submitted 14 May, 1996; originally announced May 1996.

    Comments: 8 pages, uuencoded compressed PS file (see also Technical Report at: http://www.coling.uni-freiburg.de/public/papers/acl96.ps.gz)

    Journal ref: Proceedings of ACL '96 (Santa Cruz)

  33. arXiv:cmp-lg/9509005  [pdf, ps]

    cs.CL

    ParseTalk about Textual Ellipsis

    Authors: Michael Strube, Udo Hahn

    Abstract: A hybrid methodology for the resolution of text-level ellipsis is presented in this paper. It incorporates conceptual proximity criteria applied to ontologically well-engineered domain knowledge bases and an approach to centering based on functional topic/comment patterns. We state text grammatical predicates for ellipsis and then turn to the procedural aspects of their evaluation within the fra…

    Submitted 28 September, 1995; originally announced September 1995.

    Comments: 11 pages, uuencoded compressed PS file (see also Technical Report at: http://www.coling.uni-freiburg.de/public/papers/ranlp95.ps)

    Journal ref: RANLP 95: Proc. of the Intl. Conf. on Recent Advances in Natural Language Processing. Tzigov Chark, Bulgaria, Sep. 14-16 1995, pp.62-72.

  34. arXiv:cmp-lg/9503006  [pdf, ps]

    cs.CL

    ParseTalk about Sentence- and Text-Level Anaphora

    Authors: Michael Strube, Udo Hahn

    Abstract: We provide a unified account of sentence-level and text-level anaphora within the framework of a dependency-based grammar model. Criteria for anaphora resolution within sentence boundaries rephrase major concepts from GB's binding theory, while those for text-level anaphora incorporate an adapted version of a Grosz-Sidner-style focus model.

    Submitted 3 March, 1995; originally announced March 1995.

    Comments: in Proceedings of EACL-95, uuencoded and gzipped postscript (see also Technical Report at http://www.coling.uni-freiburg.de:80/forschung/papers/eacl95.ps)