Showing 1–49 of 49 results for author: Frank, R

  1. arXiv:2406.18501  [pdf, other]

    cs.CL

    Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming

    Authors: Zhenghao Zhou, Robert Frank, R. Thomas McCoy

    Abstract: Large language models (LLMs) have shown the emergent capability of in-context learning (ICL). One line of research has explained ICL as functionally performing gradient descent. In this paper, we introduce a new way of diagnosing whether ICL is functionally equivalent to gradient-based learning. Our approach is based on the inverse frequency effect (IFE) -- a phenomenon in which an error-driven le…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2405.03572  [pdf, other]

    cs.RO

    RoboCar: A Rapidly Deployable Open-Source Platform for Autonomous Driving Research

    Authors: Mehdi Testouri, Gamal Elghazaly, Raphael Frank

    Abstract: This paper introduces RoboCar, an open-source research platform for autonomous driving developed at the University of Luxembourg. RoboCar provides a modular, cost-effective framework for the development of experimental Autonomous Driving Systems (ADS), utilizing the 2018 KIA Soul EV. The platform integrates a robust hardware and software architecture that aligns with the vehicle's existing systems…

    Submitted 6 May, 2024; originally announced May 2024.

  3. arXiv:2404.13163  [pdf, other]

    econ.GN cs.CL

    A national longitudinal dataset of skills taught in U.S. higher education curricula

    Authors: Alireza Javadian Sabet, Sarah H. Bana, Renzhe Yu, Morgan R. Frank

    Abstract: Higher education plays a critical role in driving an innovative economy by equipping students with knowledge and skills demanded by the workforce. While researchers and practitioners have developed data systems to track detailed occupational skills, such as those established by the U.S. Department of Labor (DOL), much less effort has been made to document skill development in higher education at a…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 44 pages, 21 figures, 10 tables

  4. arXiv:2403.06301  [pdf, other]

    cs.CL

    LIEDER: Linguistically-Informed Evaluation for Discourse Entity Recognition

    Authors: Xiaomeng Zhu, Robert Frank

    Abstract: Discourse Entity (DE) recognition is the task of identifying novel and known entities introduced within a text. While previous work has found that large language models have basic, if imperfect, DE recognition abilities (Schuster and Linzen, 2022), it remains largely unassessed which of the fundamental semantic properties that govern the introduction and subsequent reference to DEs they have knowl…

    Submitted 10 March, 2024; originally announced March 2024.

  5. arXiv:2312.11029  [pdf, other]

    cs.DC cs.CR cs.NI

    Picsou: Enabling Efficient Cross-Consensus Communication

    Authors: Reginald Frank, Micah Murray, Suyash Gupta, Ethan Xu, Natacha Crooks, Manos Kapritsos

    Abstract: Replicated state machines (RSMs) cannot effectively communicate today as there is no formal framework or efficient protocol to do so. To address this issue, we introduce a new primitive, the Cross-Cluster Consistent Broadcast (C3B) and present PICSOU, a practical C3B implementation. PICSOU draws inspiration from networking and TCP to allow two RSMs to communicate with constant metadata overhead in…

    Submitted 18 December, 2023; originally announced December 2023.

  6. arXiv:2311.04900  [pdf, other]

    cs.CL

    How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure

    Authors: Michael Wilson, Jackson Petty, Robert Frank

    Abstract: Language models are typically evaluated on their success at predicting the distribution of specific words in specific contexts. Yet linguistic knowledge also encodes relationships between contexts, allowing inferences between word distributions. We investigate the degree to which pre-trained Transformer-based large language models (LLMs) represent such relationships, focusing on the domain of argu…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Accepted to TACL; Presented at EMNLP 2023

  7. arXiv:2311.03595  [pdf, other]

    econ.GN cs.AI

    Brief for the Canada House of Commons Study on the Implications of Artificial Intelligence Technologies for the Canadian Labor Force: Generative Artificial Intelligence Shatters Models of AI and Labor

    Authors: Morgan R. Frank

    Abstract: Exciting advances in generative artificial intelligence (AI) have sparked concern for jobs, education, productivity, and the future of work. As with past technologies, generative AI may not lead to mass unemployment. But, unlike past technologies, generative AI is creative, cognitive, and potentially ubiquitous which makes the usual assumptions of automation predictions ill-suited for today. Exist…

    Submitted 6 November, 2023; originally announced November 2023.

  8. arXiv:2308.05234  [pdf, other]

    cs.CV cs.AI cs.DC cs.LG cs.NI

    Leveraging the Edge and Cloud for V2X-Based Real-Time Object Detection in Autonomous Driving

    Authors: Faisal Hawlader, François Robinet, Raphaël Frank

    Abstract: Environmental perception is a key element of autonomous driving because the information received from the perception module influences core driving decisions. An outstanding challenge in real-time perception for autonomous driving lies in finding the best trade-off between detection quality and latency. Major constraints on both computation and power have to be taken into account for real-time per…

    Submitted 9 August, 2023; originally announced August 2023.

  9. arXiv:2308.01654  [pdf, other]

    cs.RO

    Towards a Safe Real-Time Motion Planning Framework for Autonomous Driving Systems: An MPPI Approach

    Authors: Mehdi Testouri, Gamal Elghazaly, Raphael Frank

    Abstract: Planning safe trajectories in Autonomous Driving Systems (ADS) is a complex problem to solve in real-time. The main challenge in solving this problem arises from the various conditions and constraints imposed by road geometry, semantics and traffic rules, as well as the presence of dynamic agents. Recently, Model Predictive Path Integral (MPPI) has been shown to be an effective framework for optimal moti…

    Submitted 6 May, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

  10. arXiv:2307.08580  [pdf, other]

    physics.soc-ph cs.CL

    The Resume Paradox: Greater Language Differences, Smaller Pay Gaps

    Authors: Joshua R. Minot, Marc Maier, Bradford Demarest, Nicholas Cheney, Christopher M. Danforth, Peter Sheridan Dodds, Morgan R. Frank

    Abstract: Over the past decade, the gender pay gap has remained steady with women earning 84 cents for every dollar earned by men on average. Many studies explain this gap through demand-side bias in the labor market represented through employers' job postings. However, few studies analyze potential bias from the worker supply-side. Here, we analyze the language in millions of US workers' resumes to investi…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 24 pages, 15 figures

  11. Art and the science of generative AI: A deeper dive

    Authors: Ziv Epstein, Aaron Hertzmann, Laura Herman, Robert Mahari, Morgan R. Frank, Matthew Groh, Hope Schroeder, Amy Smith, Memo Akten, Jessica Fjeld, Hany Farid, Neil Leach, Alex Pentland, Olga Russakovsky

    Abstract: A new class of tools, colloquially called generative AI, can produce high-quality artistic media for visual arts, concept art, music, fiction, literature, video, and animation. The generative capabilities of these tools are likely to fundamentally alter the creative processes by which creators formulate ideas and put them into production. As creativity is reimagined, so too may be many sectors of…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: This white paper is an expanded version of Epstein et al 2023 published in Science Perspectives on July 16, 2023 which you can find at the following DOI: 10.1126/science.adh4451

  12. arXiv:2302.08822  [pdf]

    cs.CL

    False perspectives on human language: why statistics needs linguistics

    Authors: Matteo Greco, Andrea Cometa, Fiorenzo Artoni, Robert Frank, Andrea Moro

    Abstract: A sharp tension exists about the nature of human language between two opposite parties: those who believe that statistical surface distributions, in particular using measures like surprisal, provide a better understanding of language processing, vs. those who believe that discrete hierarchical structures implementing linguistic information such as syntactic ones are a better tool. In this paper, w…

    Submitted 17 February, 2023; originally announced February 2023.

  13. arXiv:2301.11462  [pdf, other]

    cs.CL

    How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

    Authors: Aditya Yedetore, Tal Linzen, Robert Frank, R. Thomas McCoy

    Abstract: When acquiring syntax, children consistently choose hierarchical rules over competing non-hierarchical possibilities. Is this preference due to a learning bias for hierarchical structure, or due to more general biases that interact with hierarchical cues in children's linguistic input? We explore these possibilities by training LSTMs and Transformers - two types of neural networks without a hierar…

    Submitted 6 June, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: 10 pages plus references and appendices; accepted to ACL

    ACM Class: J.4; I.2.7

  14. arXiv:2211.15495  [pdf, other]

    cs.RO

    FastCycle: A Message Sharing Framework for Modular Automated Driving Systems

    Authors: Mehdi Testouri, Gamal Elghazaly, Raphael Frank

    Abstract: Automated Driving Systems (ADS) have rapidly evolved in recent years and their architectures have become sophisticated. Ensuring robustness, reliability and safety of performance is particularly important. The main challenge in building an ADS is the ability to meet certain stringent performance requirements in terms of both making safe operational decisions and finishing processing in real-time. Middl…

    Submitted 28 November, 2022; originally announced November 2022.

  15. arXiv:2208.04688  [pdf, other]

    cs.CY

    Connected Vehicle Platforms for Dynamic Insurance

    Authors: Christian Colot, Francois Robinet, Geoffrey Nichils, Raphael Frank

    Abstract: Following a regulatory change in Europe which mandates that car manufacturers include an eCall system in new vehicles, many car manufacturers are adding additional services on top, so that more and more cars become connected vehicles and act like IoT sensors. In the following study, we analyse the maturity level of this new technology to build insurance products that would take vehicle usage into…

    Submitted 1 August, 2022; originally announced August 2022.

    Comments: Working paper

  16. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  17. arXiv:2204.06618  [pdf, ps, other]

    cs.CC cs.AI cs.CL cs.FL cs.LG

    Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity

    Authors: Yiding Hao, Dana Angluin, Robert Frank

    Abstract: This paper analyzes three formal models of Transformer encoders that differ in the form of their self-attention mechanism: unique hard attention (UHAT); generalized unique hard attention (GUHAT), which generalizes UHAT; and averaging hard attention (AHAT). We show that UHAT and GUHAT Transformers, viewed as string acceptors, can only recognize formal languages in the complexity class AC$^0$, the c…

    Submitted 13 April, 2022; originally announced April 2022.

    Comments: To appear in Transactions of the Association for Computational Linguistics

  18. arXiv:2203.09397  [pdf, other]

    cs.CL

    Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models

    Authors: Aaron Mueller, Robert Frank, Tal Linzen, Luheng Wang, Sebastian Schuster

    Abstract: Relations between words are governed by hierarchical structure rather than linear ordering. Sequence-to-sequence (seq2seq) models, despite their success in downstream NLP applications, often fail to generalize in a hierarchy-sensitive manner when performing syntactic transformations - for example, transforming declarative sentences into questions. However, syntactic evaluations of seq2seq models h…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted to Findings of ACL 2022

  19. arXiv:2202.03611  [pdf, other]

    cs.CL

    Do Language Models Learn Position-Role Mappings?

    Authors: Jackson Petty, Michael Wilson, Robert Frank

    Abstract: How is knowledge of position-role mappings in natural language learned? We explore this question in a computational setting, testing whether a variety of well-performing pretrained language models (BERT, RoBERTa, and DistilBERT) exhibit knowledge of these mappings, and whether this knowledge persists across syntactic, structural, and lexical alternations. In Experiment 1, we show th…

    Submitted 7 February, 2022; originally announced February 2022.

    Comments: To appear in the BUCLD 46 Proceedings

  20. arXiv:2110.13317  [pdf, other]

    cs.CY cs.CL econ.GN

    Exposure of occupations to technologies of the fourth industrial revolution

    Authors: Benjamin Meindl, Morgan R. Frank, Joana Mendonça

    Abstract: The fourth industrial revolution (4IR) is likely to have a substantial impact on the economy. Companies need to build up capabilities to implement new technologies, and automation may make some occupations obsolete. However, where, when, and how the change will happen remain to be determined. Robust empirical indicators of technological progress linked to occupations can help to illuminate this ch…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: 65 pages, 18 figures

  21. arXiv:2109.12036  [pdf, other]

    cs.CL

    Transformers Generalize Linearly

    Authors: Jackson Petty, Robert Frank

    Abstract: Natural language exhibits patterns of hierarchically governed dependencies, in which relations between words are sensitive to syntactic structure rather than linear ordering. While recurrent network models often fail to generalize in a hierarchically sensitive way (McCoy et al., 2020) when trained on ambiguous data, the improvement in performance of newer Transformer language models (Vaswani et a…

    Submitted 24 September, 2021; originally announced September 2021.

  22. arXiv:2104.09586  [pdf, other]

    cs.AI

    Semantic Knowledge Discovery and Discussion Mining of Incel Online Community: Topic modeling

    Authors: Hamed Jelodar, Richard Frank

    Abstract: Online forums provide a unique opportunity for online users to share comments and exchange information on a particular topic. Understanding user behaviour is valuable to organizations and has applications for social and security strategies, for instance, identifying user opinions within a community or predicting future behaviour. Discovering the semantic aspects in Incel forums is the main goal o…

    Submitted 21 April, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

  23. A Survey on Data Plane Programming with P4: Fundamentals, Advances, and Applied Research

    Authors: Frederik Hauser, Marco Häberle, Daniel Merling, Steffen Lindner, Vladimir Gurevich, Florian Zeiger, Reinhard Frank, Michael Menth

    Abstract: Programmable data planes allow users to define their own data plane algorithms for network devices including appropriate data plane application programming interfaces (APIs) which may be leveraged by user-defined software-defined networking (SDN) control. This offers great flexibility for network customization, be it for specialized, commercial appliances, e.g., in 5G or data center networks, or f…

    Submitted 4 August, 2021; v1 submitted 26 January, 2021; originally announced January 2021.

  24. arXiv:2011.00682  [pdf, other]

    cs.CL

    Sequence-to-Sequence Networks Learn the Meaning of Reflexive Anaphora

    Authors: Robert Frank, Jackson Petty

    Abstract: Reflexive anaphora present a challenge for semantic interpretation: their meaning varies depending on context in a way that appears to require abstract variables. Past work has raised doubts about the ability of recurrent networks to meet this challenge. In this paper, we explore this question in the context of a fragment of English that incorporates the relevant sort of contextual variability. We…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: 10 pages, 4 figures, 3 tables, accepted at CRAC 2020

  25. arXiv:2009.09799  [pdf, other]

    cs.SI cs.LG econ.GN

    Industrial Topics in Urban Labor System

    Authors: Jaehyuk Park, Morgan R. Frank, Lijun Sun, Hyejin Youn

    Abstract: Categorization is an essential component for us to understand the world for ourselves and to communicate it collectively. It is therefore important to recognize that classification systems are not necessarily static, especially for economic systems, and even more so in urban areas where most innovation takes place and is implemented. Out-of-date classification systems would potentially limit furthe…

    Submitted 17 September, 2020; originally announced September 2020.

  26. arXiv:2009.03954  [pdf, other]

    cs.CL cs.NE

    Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling

    Authors: Yiding Hao, Simon Mendelsohn, Rachel Sterneck, Randi Martinez, Robert Frank

    Abstract: By positing a relationship between naturalistic reading times and information-theoretic surprisal, surprisal theory (Hale, 2001; Levy, 2008) provides a natural interface between language models and psycholinguistic models. This paper re-evaluates a claim due to Goodkind and Bicknell (2018) that a language model's ability to model reading times is a linear function of its perplexity. By extending G…

    Submitted 8 September, 2020; originally announced September 2020.

    Comments: To appear in the proceedings of the Cognitive Modeling and Computational Linguistics workshop (CMCL) at EMNLP 2020

  27. arXiv:2008.02250  [pdf, other]

    cs.CL cs.CY cs.SI physics.soc-ph

    Generalized Word Shift Graphs: A Method for Visualizing and Explaining Pairwise Comparisons Between Texts

    Authors: Ryan J. Gallagher, Morgan R. Frank, Lewis Mitchell, Aaron J. Schwartz, Andrew J. Reagan, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: A common task in computational text analyses is to quantify how two corpora differ according to a measurement like word frequency, sentiment, or information content. However, collapsing the texts' rich stories into a single number is often conceptually perilous, and it is difficult to confidently interpret interesting or unexpected textual patterns without looming concerns about data artifacts or…

    Submitted 5 August, 2020; originally announced August 2020.

    Comments: 20 pages, 7 figures, 2 tables

    Journal ref: EPJ Data Science, 10(4), 2021

  28. arXiv:2001.03632  [pdf, other]

    cs.CL

    Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks

    Authors: R. Thomas McCoy, Robert Frank, Tal Linzen

    Abstract: Learners that are exposed to the same training data might generalize differently due to differing inductive biases. In neural network models, inductive biases could in theory arise from any aspect of the model architecture. We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation a…

    Submitted 10 January, 2020; originally announced January 2020.

    Comments: 12 pages, 10 figures; accepted to TACL

  29. arXiv:1906.01698  [pdf, other]

    cs.CL

    Open Sesame: Getting Inside BERT's Linguistic Knowledge

    Authors: Yongjie Lin, Yi Chern Tan, Robert Frank

    Abstract: How and to what extent does BERT encode syntactically-sensitive hierarchical information or positionally-sensitive linear information? Recent work has shown that contextual representations like BERT perform well on tasks that require sensitivity to linguistic structure. We present here two studies which aim to provide a better understanding of the nature of BERT's representations. The first of the…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: To appear in the Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

  30. arXiv:1906.01661  [pdf, other]

    cs.CL

    Detecting Syntactic Change Using a Neural Part-of-Speech Tagger

    Authors: William Merrill, Gigi Felice Stark, Robert Frank

    Abstract: We train a diachronic long short-term memory (LSTM) part-of-speech tagger on a large corpus of American English from the 19th, 20th, and 21st centuries. We analyze the tagger's ability to implicitly learn temporal structure between years, and the extent to which this knowledge can be transferred to date new sentences. The learned year embeddings show a strong linear correlation between their first…

    Submitted 9 July, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: To appear in the proceedings of the Computational Approaches to Historical Language Change workshop at ACL 2019

  31. arXiv:1906.01594  [pdf, other]

    cs.CL cs.LG cs.NE

    Finding Syntactic Representations in Neural Stacks

    Authors: William Merrill, Lenny Khazan, Noah Amsel, Yiding Hao, Simon Mendelsohn, Robert Frank

    Abstract: Neural network architectures have been augmented with differentiable stacks in order to introduce a bias toward learning hierarchy-sensitive regularities. It has, however, proven difficult to assess the degree to which such a bias is effective, as the operation of the differentiable stack is not always interpretable. In this paper, we attempt to detect the presence of latent representations of hie…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: To appear in the Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

  32. arXiv:1903.05260  [pdf, other]

    cs.CL

    Syntax-aware Neural Semantic Role Labeling with Supertags

    Authors: Jungo Kasai, Dan Friedman, Robert Frank, Dragomir Radev, Owen Rambow

    Abstract: We introduce a new syntax-aware model for dependency-based semantic role labeling that outperforms syntax-agnostic models for English and Spanish. We use a BiLSTM to tag the text with supertags extracted from dependency parses, and we feed these supertags, along with words and parts of speech, into a deep highway BiLSTM for semantic role labeling. Our model combines the strengths of earlier models…

    Submitted 3 April, 2019; v1 submitted 12 March, 2019; originally announced March 2019.

    Comments: NAACL 2019, Added Spanish ELMo results

  33. arXiv:1809.02836  [pdf, other]

    cs.NE cs.CL cs.LG

    Context-Free Transductions with Neural Stacks

    Authors: Yiding Hao, William Merrill, Dana Angluin, Robert Frank, Noah Amsel, Andrew Benz, Simon Mendelsohn

    Abstract: This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modelling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover in…

    Submitted 8 September, 2018; originally announced September 2018.

    Comments: To appear in the proceedings of the Analyzing and Interpreting Neural Networks for NLP workshop at EMNLP 2018

  34. arXiv:1807.05139  [pdf, ps, other]

    cs.DC

    A Tight Lower Bound for Clock Synchronization in Odd-Ary M-Toroids

    Authors: Reginald Frank, Jennifer L. Welch

    Abstract: Synchronizing clocks in a distributed system in which processes communicate through messages with uncertain delays is subject to inherent errors. Prior work has shown upper and lower bounds on the best synchronization achievable in a variety of network topologies and assumptions about the uncertainty on the message delays. However, until now there has not been a tight closed-form expression for th…

    Submitted 13 July, 2018; originally announced July 2018.

    Comments: 5 pages, 4 figures, to appear as a brief announcement at 2018 International Symposium on Distributed Computing (2018)

  35. arXiv:1804.06610  [pdf, other]

    cs.CL

    End-to-end Graph-based TAG Parsing with Neural Networks

    Authors: Jungo Kasai, Robert Frank, Pauli Xu, William Merrill, Owen Rambow

    Abstract: We present a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and character-level CNNs. Our best end-to-end parser, which jointly performs supertagging, POS tagging, and parsing, outperforms the previously reported best results by more than 2.2 LAS and UAS points. The graph-based parsing architecture allows for global inference and rich feature representation…

    Submitted 27 April, 2018; v1 submitted 18 April, 2018; originally announced April 2018.

    Comments: NAACL 2018

  36. arXiv:1802.09091  [pdf, other]

    cs.CL

    Revisiting the poverty of the stimulus: hierarchical generalization without a hierarchical bias in recurrent neural networks

    Authors: R. Thomas McCoy, Robert Frank, Tal Linzen

    Abstract: Syntactic rules in natural language typically need to make reference to hierarchical sentence structure. However, the simple examples that language learners receive are often equally compatible with linear rules. Children consistently ignore these linear explanations and settle instead on the correct hierarchical one. This fact has motivated the proposal that the learner's hypothesis space is cons…

    Submitted 8 June, 2018; v1 submitted 25 February, 2018; originally announced February 2018.

    Comments: Proceedings of the 40th Annual Conference of the Cognitive Science Society; 10 pages

  37. arXiv:1706.05105  [pdf, other]

    cs.CV

    Symplectomorphic registration with phase space regularization by entropy spectrum pathways

    Authors: Vitaly L. Galinsky, Lawrence R. Frank

    Abstract: The ability to register image data to a common coordinate system is a critical feature of virtually all imaging studies that require multiple subject analysis, combining single subject data from multiple modalities, or both. However, in spite of the abundance of literature on the subject and the existence of several variants of registration algorithms, their practical utility remains problematic,…

    Submitted 15 June, 2017; originally announced June 2017.

    Comments: 26 pages, 7 figures

  38. Small cities face greater impact from automation

    Authors: Morgan R. Frank, Lijun Sun, Manuel Cebrian, Hyejin Youn, Iyad Rahwan

    Abstract: The city has proven to be the most successful form of human agglomeration and provides wide employment opportunities for its dwellers. As advances in robotics and artificial intelligence revive concerns about the impact of automation on jobs, a question looms: How will automation affect employment in cities? Here, we provide a comparative picture of the impact of automation across U.S. urban areas…

    Submitted 21 September, 2017; v1 submitted 16 May, 2017; originally announced May 2017.

  39. arXiv:1507.05098  [pdf, other]

    physics.soc-ph cs.CY cs.SI

    The Lexicocalorimeter: Gauging public health through caloric input and output on social media

    Authors: S. E. Alajajian, J. R. Williams, A. J. Reagan, S. C. Alajajian, M. R. Frank, L. Mitchell, J. Lahne, C. M. Danforth, P. S. Dodds

    Abstract: We propose and develop a Lexicocalorimeter: an online, interactive instrument for measuring the "caloric content" of social media and other large-scale texts. We do so by constructing extensive yet improvable tables of food and activity related phrases, and respectively assigning them with sourced estimates of caloric intake and expenditure. We show that for Twitter, our naive measures of "caloric…

    Submitted 10 January, 2017; v1 submitted 17 July, 2015; originally announced July 2015.

    Comments: Manuscript: 17 pages, 8 figures, 1 table, Supplementary Information: 10 pages, 7 figures, 3 tables

  40. arXiv:1505.06750  [pdf, other]

    physics.soc-ph cs.CL

    Reply to Garcia et al.: Common mistakes in measuring frequency dependent word characteristics

    Authors: P. S. Dodds, E. M. Clark, S. Desu, M. R. Frank, A. J. Reagan, J. R. Williams, L. Mitchell, K. D. Harris, I. M. Kloumann, J. P. Bagrow, K. Megerdoomian, M. T. McMahon, B. F. Tivnan, C. M. Danforth

    Abstract: We demonstrate that the concerns expressed by Garcia et al. are misplaced, due to (1) a misreading of our findings in [1]; (2) a widespread failure to examine and present words in support of asserted summary quantities based on word usage frequencies; and (3) a range of misconceptions about word usage frequency, word rank, and expert-constructed word lists. In particular, we show that the English…

    Submitted 28 May, 2015; v1 submitted 25 May, 2015; originally announced May 2015.

    Comments: 5 pages, 2 figures, 1 table. Expanded version of reply appearing in PNAS 2015

  41. arXiv:1410.1393  [pdf, other]

    physics.soc-ph cs.SI

    Constructing a taxonomy of fine-grained human movement and activity motifs through social media

    Authors: Morgan R. Frank, Jake Ryland Williams, Lewis Mitchell, James P. Bagrow, Peter Sheridan Dodds, Christopher M. Danforth

    Abstract: Profiting from the emergence of web-scale social data sets, numerous recent studies have systematically explored human mobility patterns over large populations and large time scales. Relatively little attention, however, has been paid to mobility and activity over smaller time-scales, such as a day. Here, we use Twitter to identify people's frequently visited locations along with their likely acti…

    Submitted 11 May, 2015; v1 submitted 28 September, 2014; originally announced October 2014.

  42. arXiv:1406.3855  [pdf, other]

    physics.soc-ph cs.CL cs.SI

    Human language reveals a universal positivity bias

    Authors: Peter Sheridan Dodds, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, Christopher M. Danforth

    Abstract: Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (1) the words of natural human language possess a universal positivity bias; (2) the estimated emotional content of words is consistent between languages under translation; and (3) this positivity bias i…

    Submitted 15 June, 2014; originally announced June 2014.

    Comments: Manuscript: 7 pages, 4 figures; Supplementary Material: 49 pages, 43 figures, 6 tables. Online appendices available at http://www.uvm.edu/storylab/share/papers/dodds2014a/

  43. arXiv:1312.6122  [pdf, other]

    physics.soc-ph cond-mat.dis-nn cs.SI physics.data-an

    Shadow networks: Discovering hidden nodes with models of information flow

    Authors: James P. Bagrow, Suma Desu, Morgan R. Frank, Narine Manukyan, Lewis Mitchell, Andrew Reagan, Eric E. Bloedorn, Lashon B. Booker, Luther K. Branting, Michael J. Smith, Brian F. Tivnan, Christopher M. Danforth, Peter S. Dodds, Joshua C. Bongard

    Abstract: Complex, dynamic networks underlie many systems, and understanding these networks is the concern of a great span of important scientific and engineering problems. Quantitative description is crucial for this understanding yet, due to a range of measurement problems, many real network datasets are incomplete. Here we explore how accidentally missing or deliberately hidden nodes may be detected in n…

    Submitted 20 December, 2013; originally announced December 2013.

    Comments: 12 pages, 3 figures

  44. arXiv:1306.5358  [pdf, ps, other]

    math-ph cs.IT math.FA quant-ph

    Monotonicity of a relative Rényi entropy

    Authors: Rupert L. Frank, Elliott H. Lieb

    Abstract: We show that a recent definition of relative Rényi entropy is monotone under completely positive, trace preserving maps. This proves a recent conjecture of Müller-Lennert et al.

    Submitted 1 October, 2013; v1 submitted 22 June, 2013; originally announced June 2013.

    Comments: 6 pages; minor revisions

  45. arXiv:1304.6257  [pdf, other]

    physics.soc-ph cs.SI

    An Evolutionary Algorithm Approach to Link Prediction in Dynamic Social Networks

    Authors: Catherine A. Bliss, Morgan R. Frank, Christopher M. Danforth, Peter Sheridan Dodds

    Abstract: Many real world, complex phenomena have underlying structures of evolving networks where nodes and links are added and removed over time. A central scientific challenge is the description and explanation of network dynamics, with a key test being the prediction of short and long term changes. For the problem of short-term link prediction, existing methods attempt to determine neighborhood metrics…

    Submitted 13 August, 2014; v1 submitted 23 April, 2013; originally announced April 2013.

    Comments: 17 pages, 12 figures, 4 tables, Submitted to the Journal of Computational Science

    Journal ref: Bliss, C. A., Frank, M. R., Danforth, C. M. & P. S. Dodds. (2014). An Evolutionary Algorithm Approach to Link Prediction in Dynamic Social Networks. Journal of Computational Science, 5(5):750-764

  46. arXiv:1304.1296  [pdf, other]

    physics.soc-ph cs.SI

    Happiness and the Patterns of Life: A Study of Geolocated Tweets

    Authors: Morgan R. Frank, Lewis Mitchell, Peter S. Dodds, Christopher M. Danforth

    Abstract: The patterns of life exhibited by large populations have been described and modeled both as a basic science exercise and for a range of applied goals such as reducing automotive congestion, improving disaster response, and even predicting the location of individuals. However, these studies previously had limited access to conversation content, rendering changes in expression as a function of movem…

    Submitted 12 September, 2013; v1 submitted 4 April, 2013; originally announced April 2013.

    Comments: 12 page main document, 12 page supplement, 21 figures

    Journal ref: Scientific Reports, Vol 3, No 2625, 2013

  47. arXiv:1302.3299  [pdf, other]

    physics.soc-ph cs.SI

    The Geography of Happiness: Connecting Twitter sentiment and expression, demographics, and objective characteristics of place

    Authors: Lewis Mitchell, Kameron Decker Harris, Morgan R. Frank, Peter Sheridan Dodds, Christopher M. Danforth

    Abstract: We conduct a detailed investigation of correlations between real-time expressions of individuals made across the United States and a wide range of emotional, geographic, demographic, and health characteristics. We do so by combining (1) a massive, geo-tagged data set comprising over 80 million words generated over the course of several recent years on the social network service Twitter and (2) ann…

    Submitted 18 May, 2013; v1 submitted 13 February, 2013; originally announced February 2013.

    Journal ref: PLoS ONE 8(5): e64417, 2013

  48. Using Single Layer Networks for Discrete, Sequential Data: An Example from Natural Language Processing

    Authors: Caroline Lyon, Ray Frank

    Abstract: A natural language parser which has been successfully implemented is described. This is a hybrid system, in which neural networks operate within a rule based framework. It can be accessed via telnet for users to try on their own text. (For details, contact the author.) Tested on technical manuals, the parser finds the subject and head of the subject in over 90% of declarative sentences. The ne…

    Submitted 23 September, 1997; originally announced September 1997.

    Comments: 28 pages, 9 figures, LaTeX format, uses epsfig, .sty file included

    Journal ref: Neural Computing and Applications 5(4), 1997, 196-214

  49. From Regular to Context Free to Mildly Context Sensitive Tree Rewriting Systems: The Path of Child Language Acquisition

    Authors: Robert Frank

    Abstract: Current syntactic theory limits the range of grammatical variation so severely that the logical problem of grammar learning is trivial. Yet, children exhibit characteristic stages in syntactic development at least through their sixth year. Rather than positing maturational delays, I suggest that acquisition difficulties are the result of limitations in manipulating grammatical representations. I…

    Submitted 4 November, 1994; originally announced November 1994.

    Comments: 4 pages

    Report number: TALANA-RT-94-01, TALANA, Université Paris 7, 1994

    Journal ref: Appeared in 3e Colloque International sur les grammaires d'Arbres Adjoints (TAG+3).