Showing 1–17 of 17 results for author: Parascandolo, G

  1. arXiv:2307.12617  [pdf, other]

    cs.LG

    Predicting Ordinary Differential Equations with Transformers

    Authors: Sören Becker, Michal Klein, Alexander Neitz, Giambattista Parascandolo, Niki Kilbertus

    Abstract: We develop a transformer-based sequence-to-sequence model that recovers scalar ordinary differential equations (ODEs) in symbolic form from irregularly sampled and noisy observations of a single solution trajectory. We demonstrate in extensive empirical evaluations that our model performs better or on par with existing methods in terms of accurate recovery across various settings. Moreover, our me…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: Published at ICML 2023

  2. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  3. arXiv:2211.02830  [pdf, other]

    cs.LG

    Discovering ordinary differential equations that govern time-series

    Authors: Sören Becker, Michal Klein, Alexander Neitz, Giambattista Parascandolo, Niki Kilbertus

    Abstract: Natural laws are often described through differential equations yet finding a differential equation that describes the governing law underlying observed data is a challenging and still mostly manual task. In this paper we make a step towards the automation of this process: we propose a transformer-based sequence-to-sequence model that recovers scalar autonomous ordinary differential equations (ODE…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: Workshop paper at NeurIPS 2022 workshop "AI for Science"

  4. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  5. arXiv:2201.10222  [pdf, other]

    cs.LG cs.AI cs.CL physics.hist-ph

    Explanatory Learning: Beyond Empiricism in Neural Networks

    Authors: Antonio Norelli, Giorgio Mariani, Luca Moschella, Andrea Santilli, Giambattista Parascandolo, Simone Melzi, Emanuele Rodolà

    Abstract: We introduce Explanatory Learning (EL), a framework to let machines use existing knowledge buried in symbolic sequences -- e.g. explanations written in hieroglyphic -- by autonomously learning to interpret them. In EL, the burden of interpreting symbols is not left to humans or rigid human-coded compilers, as done in Program Synthesis. Rather, EL calls for a learned interpreter, built upon a limit…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: Main paper: 10 pages, References: 3 pages, Appendix: 7 pages

  6. arXiv:2106.06427  [pdf, other]

    cs.LG

    Neural Symbolic Regression that Scales

    Authors: Luca Biggio, Tommaso Bendinelli, Alexander Neitz, Aurelien Lucchi, Giambattista Parascandolo

    Abstract: Symbolic equations are at the core of scientific discovery. The task of discovering the underlying equation from a set of input-output pairs is called symbolic regression. Traditionally, symbolic regression methods use hand-designed strategies that do not improve with experience. In this paper, we introduce the first symbolic regression method that leverages large scale pre-training. We procedural…

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: Accepted at the 38th International Conference on Machine Learning (ICML) 2021

  7. arXiv:2009.00329  [pdf, other]

    cs.LG stat.ML

    Learning explanations that are hard to vary

    Authors: Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schölkopf

    Abstract: In this paper, we investigate the principle that 'good explanations are hard to vary' in the context of deep learning. We show that averaging gradients across examples -- akin to a logical OR of patterns -- can favor memorization and 'patchwork' solutions that sew together different strategies, instead of identifying invariances. To inspect this, we first formalize a notion of consistency for mini…

    Submitted 24 October, 2020; v1 submitted 1 September, 2020; originally announced September 2020.

    Comments: From v1: extended 2.2 and 2.3, added details for reproducibility and link to codebase

  8. arXiv:2004.11410  [pdf, other]

    cs.LG cs.AI stat.ML

    Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning

    Authors: Giambattista Parascandolo, Lars Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B. Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber

    Abstract: Standard planners for sequential decision making (including Monte Carlo planning, tree search, dynamic programming, etc.) are constrained by an implicit sequential planning assumption: The order in which a plan is constructed is the same in which it is executed. We consider alternatives to this assumption for the class of goal-directed Reinforcement Learning (RL) problems. Instead of an environmen…

    Submitted 23 April, 2020; originally announced April 2020.

  9. arXiv:1812.00524  [pdf, other]

    cs.LG stat.ML

    Generalization in anti-causal learning

    Authors: Niki Kilbertus, Giambattista Parascandolo, Bernhard Schölkopf

    Abstract: The ability to learn and act in novel situations is still a prerogative of animate intelligence, as current machine learning methods mostly fail when moving beyond the standard i.i.d. setting. What is the reason for this discrepancy? Most machine learning tasks are anti-causal, i.e., we infer causes (labels) from effects (observations). Typically, in supervised learning we build systems that try t…

    Submitted 2 December, 2018; originally announced December 2018.

    Comments: A shorter version of this paper appeared at the workshop on "Critiquing and correcting trends in machine learning" at NeurIPS 2018

  10. arXiv:1808.04768  [pdf, other]

    cs.LG stat.ML

    Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models

    Authors: Alexander Neitz, Giambattista Parascandolo, Stefan Bauer, Bernhard Schölkopf

    Abstract: We introduce a method which enables a recurrent dynamics model to be temporally abstract. Our approach, which we call Adaptive Skip Intervals (ASI), is based on the observation that in many sequential prediction tasks, the exact time at which events occur is irrelevant to the underlying objective. Moreover, in many situations, there exist prediction intervals which result in particularly easy-to-p…

    Submitted 12 December, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

  11. arXiv:1802.04374  [pdf, other]

    stat.ML cs.CR cs.LG

    Tempered Adversarial Networks

    Authors: Mehdi S. M. Sajjadi, Giambattista Parascandolo, Arash Mehrjou, Bernhard Schölkopf

    Abstract: Generative adversarial networks (GANs) have been shown to produce realistic samples from high-dimensional distributions, but training them is considered hard. A possible explanation for training instabilities is the inherent imbalance between the networks: While the discriminator is trained directly on both real and fake samples, the generator only has control over the fake samples it produces sin…

    Submitted 11 July, 2018; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: accepted to ICML 2018

  12. arXiv:1712.00961  [pdf, other]

    cs.LG stat.ML

    Learning Independent Causal Mechanisms

    Authors: Giambattista Parascandolo, Niki Kilbertus, Mateo Rojas-Carulla, Bernhard Schölkopf

    Abstract: Statistical learning relies upon data sampled from a distribution, and we usually do not care what actually generated it in the first place. From the point of view of causal modeling, the structure of each distribution is induced by physical mechanisms that give rise to dependences between observables. Mechanisms, however, can be meaningful autonomous modules of generative models that make sense b…

    Submitted 8 September, 2018; v1 submitted 4 December, 2017; originally announced December 2017.

    Comments: ICML 2018

    Journal ref: Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4036-4044, 2018

  13. arXiv:1706.02744  [pdf, ps, other]

    stat.ML cs.CY cs.LG

    Avoiding Discrimination through Causal Reasoning

    Authors: Niki Kilbertus, Mateo Rojas-Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, Bernhard Schölkopf

    Abstract: Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of…

    Submitted 21 January, 2018; v1 submitted 8 June, 2017; originally announced June 2017.

    Comments: Advances in Neural Information Processing Systems 30, 2017 http://papers.nips.cc/paper/6668-avoiding-discrimination-through-causal-reasoning

    Journal ref: Advances in Neural Information Processing Systems 30, 2017, p. 656--666

  14. arXiv:1706.02293  [pdf, other]

    cs.SD cs.LG

    Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features

    Authors: Sharath Adavanne, Giambattista Parascandolo, Pasi Pertilä, Toni Heittola, Tuomas Virtanen

    Abstract: In this paper, we propose the use of spatial and harmonic features in combination with long short term memory (LSTM) recurrent neural network (RNN) for automatic sound event detection (SED) task. Real life sound recordings typically have many overlapping sound events, making it hard to recognize with just mono channel audio. Human listeners have been successfully recognizing the mixture of overlap…

    Submitted 7 June, 2017; originally announced June 2017.

  15. arXiv:1703.02317  [pdf, other]

    cs.SD cs.LG stat.ML

    Convolutional Recurrent Neural Networks for Bird Audio Detection

    Authors: Emre Çakır, Sharath Adavanne, Giambattista Parascandolo, Konstantinos Drossos, Tuomas Virtanen

    Abstract: Bird sounds possess distinctive spectral structure which may exhibit small shifts in spectrum depending on the bird species and environmental conditions. In this paper, we propose using convolutional recurrent neural networks on the task of automated bird audio detection in real-life environments. In the proposed method, convolutional layers extract high dimensional, local frequency shift invarian…

    Submitted 7 March, 2017; originally announced March 2017.

    Comments: Submitted to EUSIPCO 2017 Special Session on Bird Audio Signal Processing

  16. Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection

    Authors: Emre Çakır, Giambattista Parascandolo, Toni Heittola, Heikki Huttunen, Tuomas Virtanen

    Abstract: Sound events often occur in unstructured environments where they exhibit wide variations in their frequency content and temporal structure. Convolutional neural networks (CNN) are able to extract higher level features that are invariant to local spectral and temporal variations. Recurrent neural networks (RNNs) are powerful in learning the longer term temporal context in the audio signals. CNNs an…

    Submitted 21 February, 2017; originally announced February 2017.

    Comments: Accepted for IEEE Transactions on Audio, Speech and Language Processing, Special Issue on Sound Scene and Event Analysis

  17. Recurrent Neural Networks for Polyphonic Sound Event Detection in Real Life Recordings

    Authors: Giambattista Parascandolo, Heikki Huttunen, Tuomas Virtanen

    Abstract: In this paper we present an approach to polyphonic sound event detection in real life recordings based on bi-directional long short term memory (BLSTM) recurrent neural networks (RNNs). A single multilabel BLSTM RNN is trained to map acoustic features of a mixture signal consisting of sounds from multiple classes, to binary activity indicators of each event class. Our method is tested on a large d…

    Submitted 4 April, 2016; originally announced April 2016.

    Comments: To appear in Proceedings of IEEE ICASSP 2016