Showing 1–10 of 10 results for author: Chollet, F

  1. arXiv:2405.20247  [pdf, other]

    cs.AI cs.CV cs.LG cs.SE

    KerasCV and KerasNLP: Vision and Language Power-Ups

    Authors: Matthew Watson, Divyashree Shivakumar Sreepathihalli, Francois Chollet, Martin Gorner, Kiranbir Sodhia, Ramesh Sampath, Tirth Patel, Haifeng Jin, Neel Kovelamudi, Gabriel Rasskin, Samaneh Saadat, Luke Wood, Chen Qian, Jonathan Bischof, Ian Stenbit, Abheesht Sharma, Anshuman Mishra

    Abstract: We present the Keras domain packages KerasCV and KerasNLP, extensions of the Keras API for Computer Vision and Natural Language Processing workflows, capable of running on either JAX, TensorFlow, or PyTorch. These domain packages are designed to enable fast experimentation, with a focus on ease-of-use and performance. We adopt a modular, layered design: at the library's lowest level of abstraction…

    Submitted 5 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: Submitted to Journal of Machine Learning Open Source Software

    ACM Class: I.2.5; I.2.7; I.2.10

  2. arXiv:2207.12120  [pdf, other]

    cs.CV cs.LG

    Efficient Graph-Friendly COCO Metric Computation for Train-Time Model Evaluation

    Authors: Luke Wood, Francois Chollet

    Abstract: Evaluating the COCO mean average precision (MaP) and COCO recall metrics as part of the static computation graph of modern deep learning frameworks poses a unique set of challenges. These challenges include the need for maintaining a dynamic-sized state to compute mean average precision, reliance on global dataset-level statistics to compute the metrics, and managing differing numbers of bounding…

    Submitted 21 July, 2022; originally announced July 2022.

    Comments: 7 pages, 3 figures

  3. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  4. arXiv:1911.01547  [pdf, other]

    cs.AI

    On the Measure of Intelligence

    Authors: François Chollet

    Abstract: To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence,…

    Submitted 25 November, 2019; v1 submitted 4 November, 2019; originally announced November 2019.

  5. arXiv:1803.07416  [pdf, other]

    cs.LG cs.CL stat.ML

    Tensor2Tensor for Neural Machine Translation

    Authors: Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit

    Abstract: Tensor2Tensor is a library for deep learning models that is well-suited for neural machine translation and includes the reference implementation of the state-of-the-art Transformer model.

    Submitted 16 March, 2018; originally announced March 2018.

    Comments: arXiv admin note: text overlap with arXiv:1706.03762

  6. arXiv:1706.03059  [pdf, other]

    cs.CL cs.LG

    Depthwise Separable Convolutions for Neural Machine Translation

    Authors: Lukasz Kaiser, Aidan N. Gomez, Francois Chollet

    Abstract: Depthwise separable convolutions reduce the number of parameters and computation used in convolutional operations while increasing representational efficiency. They have been shown to be successful in image classification models, both in obtaining better models than previously possible for a given parameter count (the Xception architecture) and considerably reducing the number of parameters requir…

    Submitted 15 June, 2017; v1 submitted 9 June, 2017; originally announced June 2017.

  7. arXiv:1703.00426  [pdf, other]

    cs.AI

    HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving

    Authors: Cezary Kaliszyk, François Chollet, Christian Szegedy

    Abstract: Large computer-understandable proofs consist of millions of intermediate logical steps. The vast majority of such steps originate from manually selected and manually guided heuristics applied to intermediate goals. So far, machine learning has generally not been used to filter or generate these steps. In this paper, we introduce a new dataset based on Higher-Order Logic (HOL) proofs, for the purpo…

    Submitted 1 March, 2017; originally announced March 2017.

  8. arXiv:1610.02357  [pdf, other]

    cs.CV

    Xception: Deep Learning with Depthwise Separable Convolutions

    Authors: François Chollet

    Abstract: We present an interpretation of Inception modules in convolutional neural networks as being an intermediate step in-between regular convolution and the depthwise separable convolution operation (a depthwise convolution followed by a pointwise convolution). In this light, a depthwise separable convolution can be understood as an Inception module with a maximally large number of towers. This observa…

    Submitted 4 April, 2017; v1 submitted 7 October, 2016; originally announced October 2016.

  9. arXiv:1607.05691  [pdf, other]

    cs.CV cs.LG stat.ML

    Information-theoretical label embeddings for large-scale image classification

    Authors: François Chollet

    Abstract: We present a method for training multi-label, massively multi-class image classification models, that is faster and more accurate than supervision via a sigmoid cross-entropy loss (logistic regression). Our method consists in embedding high-dimensional sparse labels onto a lower-dimensional dense sphere of unit-normed vectors, and treating the classification problem as a cosine proximity regressio…

    Submitted 19 July, 2016; originally announced July 2016.

  10. arXiv:1606.04442  [pdf, other]

    cs.AI cs.LG cs.LO

    DeepMath - Deep Sequence Models for Premise Selection

    Authors: Alex A. Alemi, Francois Chollet, Niklas Een, Geoffrey Irving, Christian Szegedy, Josef Urban

    Abstract: We study the effectiveness of neural sequence models for premise selection in automated theorem proving, one of the main bottlenecks in the formalization of mathematics. We propose a two stage approach for this task that yields good results for the premise selection task on the Mizar corpus while avoiding the hand-engineered features of existing state-of-the-art models. To our knowledge, this is t…

    Submitted 26 January, 2017; v1 submitted 14 June, 2016; originally announced June 2016.