Showing 1–9 of 9 results for author: Bond-Taylor, S

  1. arXiv:2406.04449  [pdf, other]

    cs.CL cs.CV

    MAIRA-2: Grounded Radiology Report Generation

    Authors: Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Anton Schwaighofer, Sam Bond-Taylor, Maximilian Ilse, Fernando Pérez-García, Valentina Salvatelli, Harshita Sharma, Felix Meissen, Mercy Ranjit, Shaury Srivastav, Julia Gong, Fabian Falck, Ozan Oktay, Anja Thieme, Matthew P. Lungren, Maria Teodora Wetscherek, Javier Alvarez-Valle, Stephanie L. Hyland

    Abstract: Radiology reporting is a complex task that requires detailed image understanding, integration of multiple inputs, including comparison with prior imaging, and precise language generation. This makes it ideal for the development and use of generative multimodal models. Here, we extend report generation to include the localisation of individual findings on the image - a task we call grounded report…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 44 pages, 20 figures

  2. arXiv:2401.10815  [pdf, other]

    cs.CV

    RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision

    Authors: Fernando Pérez-García, Harshita Sharma, Sam Bond-Taylor, Kenza Bouzid, Valentina Salvatelli, Maximilian Ilse, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Matthew P. Lungren, Maria Wetscherek, Noel Codella, Stephanie L. Hyland, Javier Alvarez-Valle, Ozan Oktay

    Abstract: Language-supervised pre-training has proven to be a valuable method for extracting semantically meaningful features from images, serving as a foundational element in multimodal systems within the computer vision and medical imaging domains. However, resulting features are limited by the information contained within the text. This is particularly problematic in medical imaging, where radiologists'…

    Submitted 19 January, 2024; originally announced January 2024.

  3. arXiv:2312.12865  [pdf, other]

    cs.CV cs.AI

    RadEdit: stress-testing biomedical vision models via diffusion image editing

    Authors: Fernando Pérez-García, Sam Bond-Taylor, Pedro P. Sanchez, Boris van Breugel, Daniel C. Castro, Harshita Sharma, Valentina Salvatelli, Maria T. A. Wetscherek, Hannah Richardson, Matthew P. Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay, Maximilian Ilse

    Abstract: Biomedical imaging datasets are often small and biased, meaning that real-world performance of predictive models can be substantially lower than expected from internal testing. This work proposes using generative image editing to simulate dataset shifts and diagnose failure modes of biomedical vision models; this can be used in advance of deployment to assess readiness, potentially reducing cost a…

    Submitted 3 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  4. arXiv:2308.14152  [pdf, other]

    cs.CV

    Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers

    Authors: Abril Corona-Figueroa, Sam Bond-Taylor, Neelanjan Bhowmik, Yona Falinie A. Gaus, Toby P. Breckon, Hubert P. H. Shum, Chris G. Willcocks

    Abstract: Generating 3D images of complex objects conditionally from a few 2D views is a difficult synthesis problem, compounded by issues such as domain gap and geometric misalignment. For instance, a unified framework such as Generative Adversarial Networks cannot achieve this unless they explicitly define both a domain-invariant and geometric-invariant joint latent distribution, whereas Neural Radiance F…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: Camera-ready version for ICCV 2023

  5. arXiv:2303.18242  [pdf, other]

    cs.LG cs.CV

    $\infty$-Diff: Infinite Resolution Diffusion with Subsampled Mollified States

    Authors: Sam Bond-Taylor, Chris G. Willcocks

    Abstract: This paper introduces $\infty$-Diff, a generative diffusion model defined in an infinite-dimensional Hilbert space, which can model infinite resolution data. By training on randomly sampled subsets of coordinates and denoising content only at those locations, we learn a continuous function for arbitrary resolution sampling. Unlike prior neural field-based infinite-dimensional models, which use poi…

    Submitted 1 March, 2024; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: Accepted at ICLR 2024

  6. arXiv:2202.01020  [pdf, other]

    eess.IV cs.CV

    MedNeRF: Medical Neural Radiance Fields for Reconstructing 3D-aware CT-Projections from a Single X-ray

    Authors: Abril Corona-Figueroa, Jonathan Frawley, Sam Bond-Taylor, Sarath Bethapudi, Hubert P. H. Shum, Chris G. Willcocks

    Abstract: Computed tomography (CT) is an effective medical imaging modality, widely used in the field of clinical medicine for the diagnosis of various pathologies. Advances in Multidetector CT imaging technology have enabled additional functionalities, including generation of thin slice multiplanar cross-sectional body imaging and 3D reconstructions. However, this involves patients being exposed to a consi…

    Submitted 8 April, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: 6 pages, 4 figures, accepted at IEEE EMBC 2022

    ACM Class: I.4; J.7

  7. arXiv:2111.12701  [pdf, other]

    cs.CV cs.LG

    Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes

    Authors: Sam Bond-Taylor, Peter Hessey, Hiroshi Sasaki, Toby P. Breckon, Chris G. Willcocks

    Abstract: Whilst diffusion probabilistic models can generate high quality image content, key limitations remain in terms of both generating high-resolution imagery and their associated high computational requirements. Recent Vector-Quantized image models have overcome this limitation of image resolution but are prohibitively slow and unidirectional as they generate tokens via element-wise autoregressive sam…

    Submitted 24 November, 2021; originally announced November 2021.

    Comments: 19 pages, 14 figures

    MSC Class: 68T01 (Primary); 68T07 (Secondary)

    ACM Class: I.5.0; I.4.0; G.3

  8. arXiv:2103.04922  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models

    Authors: Sam Bond-Taylor, Adam Leach, Yang Long, Chris G. Willcocks

    Abstract: Deep generative models are a class of techniques that train deep neural networks to model the distribution of training samples. Research has fragmented into various interconnected approaches, each of which make trade-offs including run-time, diversity, and architectural restrictions. In particular, this compendium covers energy-based models, variational autoencoders, generative adversarial network…

    Submitted 28 March, 2022; v1 submitted 8 March, 2021; originally announced March 2021.

    Comments: 20 pages, 9 figures, will appear in IEEE Transactions on Pattern Analysis and Machine Intelligence

    MSC Class: 68T01 (Primary); 68T07 (Secondary)

    ACM Class: I.5.0; I.4.0; G.3

  9. arXiv:2007.02798  [pdf, other]

    cs.CV cs.LG

    Gradient Origin Networks

    Authors: Sam Bond-Taylor, Chris G. Willcocks

    Abstract: This paper proposes a new type of generative model that is able to quickly learn a latent representation without an encoder. This is achieved using empirical Bayes to calculate the expectation of the posterior, which is implemented by initialising a latent vector with zeros, then using the gradient of the log-likelihood of the data with respect to this zero vector as new latent points. The approac…

    Submitted 24 March, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: 16 pages, 17 figures, accepted at ICLR 2021, camera-ready version

    MSC Class: 68T01 (Primary); 68T07 (Secondary)

    ACM Class: I.5.0; I.4.0; G.3